Disable search engine indexing

Prevent search engines from indexing pages, folders, your entire site, or just your webflow.io subdomain.

You can control which pages search engines crawl and index on your site in 2 ways: by writing a robots.txt file or by adding a noindex tag to certain pages. With these, you can prevent search engines from crawling and indexing specific pages, folders, your entire site, or your webflow.io subdomain. This is useful for keeping pages, like your site’s 404 page, out of search results.

Important: Content from your site may still be indexed, even if it hasn’t been crawled. That happens when a search engine knows about your content either because it was published previously, or there’s a link to that content from other content online. To ensure that a previously indexed page is not indexed, don’t add it in the robots.txt. Instead, use the Sitemap indexing toggle to remove that content from Google’s index.

In this lesson: 

  1. How to disable indexing of the Webflow subdomain
  2. How to enable or disable indexing of site pages
  3. Best practices for privacy
  4. FAQ and troubleshooting tips

How to disable indexing of the Webflow subdomain 

You can prevent Google and other search engines from indexing your site’s webflow.io subdomain by disabling indexing from your Site settings.

  1. Go to Site settings > SEO tab > Indexing section
  2. Set Disable Webflow subdomain indexing to “Yes” 
  3. Click Save changes and publish your site

This will publish a unique robots.txt only on the subdomain, telling search engines to ignore this domain. 

Note: You’ll need a Site plan or paid Workspace to disable search engine indexing of the Webflow subdomain. Learn more about Site and Workspace plans.

How to enable or disable indexing of site pages

There are 2 ways to disable indexing of site pages:

  • By using the Sitemap indexing toggle in Page settings
  • By generating a robots.txt file

Note that if you disable indexing of a site page via a robots.txt file, the page will still be included in your site’s auto-generated sitemap (if you’ve enabled the sitemap). Additionally, if you’ve previously added a noindex tag to a site page via custom code, the page will still be included in your site’s auto-generated sitemap (unless you switch the Sitemap indexing toggle “off”).

How to disable indexing of site pages with the Sitemap indexing toggle

If you disable indexing of a static site page with the Sitemap indexing toggle, that page will no longer be indexed by search engines and will no longer be included in your site’s sitemap. You can only disable indexing with the toggle if you’ve enabled your site’s auto-generated sitemap.

Note: The Sitemap indexing toggle adds <meta content="noindex" name="robots"> to your site page. This tells search engines not to index the page. Search engines still need to crawl the page to see the tag, so don’t also block the page in robots.txt.
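
To confirm that a published page actually carries the tag, you can inspect its HTML. The following is a minimal sketch, assuming you already have the page source as a string. The `RobotsMetaParser` class and `has_noindex` function are hypothetical helpers written for this example, not part of Webflow.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Attribute values can be None when the attribute has no value.
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page source that uses the tag quoted in the note above.
page = '<html><head><meta content="noindex" name="robots"></head><body></body></html>'
print(has_noindex(page))
```

The check only looks at the meta tag. It doesn't tell you whether a search engine has already removed the page from its index.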

To prevent search engines from indexing certain site pages:

  1. Go to the page you want to prevent Google from indexing
  2. Go to Page settings > SEO settings
  3. Toggle Sitemap indexing off
  4. Publish your site

How to re-enable indexing of site pages with the Sitemap indexing toggle

To allow search engines to index certain site pages:

  1. Go to the page you want to allow search engines to index
  2. Go to Page settings > SEO settings
  3. Toggle Sitemap indexing on
  4. Publish your site

How to generate a robots.txt file 

The robots.txt is usually used to list the URLs on a site that you don't want search engines to crawl. You can also include the sitemap of your site in your robots.txt file to tell search engine crawlers which content they should crawl.

Just like a sitemap, the robots.txt file lives in the top-level directory of your domain. Webflow will generate the /robots.txt file for your site once you create it in your Site settings.

To create a robots.txt file:

  1. Go to Site settings > SEO tab > Indexing section
  2. Add the robots.txt rule(s) you want
  3. Click Save changes and publish your site

Robots.txt rules

You can use any of these rules to populate the robots.txt file.

  • User-agent: * means this section applies to all robots.
  • Disallow: tells the robot not to visit the site, page, or folder that follows it.

To hide your entire site

User-agent: *
Disallow: /

To hide individual pages

User-agent: *
Disallow: /page-name

To hide an entire folder of pages

User-agent: *
Disallow: /folder-name/
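
Before you publish, you can check how a crawler is likely to read the page and folder rules. This is a rough sketch that uses Python's standard `urllib.robotparser` module. The `is_crawlable` helper, the `ExampleBot` user agent, and the `your-site.com` paths are placeholders for illustration, and real search engines may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The page and folder rules from above, combined into one file.
rules = """\
User-agent: *
Disallow: /page-name
Disallow: /folder-name/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def is_crawlable(path, user_agent="ExampleBot"):
    """Return True if the rules allow user_agent to fetch the given path."""
    return parser.can_fetch(user_agent, "https://your-site.com" + path)


for path in ["/about", "/page-name", "/folder-name/child-page", "/folder-name"]:
    print(path, is_crawlable(path))
```

Rules match by path prefix, so `Disallow: /folder-name/` covers everything inside the folder but not the bare `/folder-name` URL without the trailing slash.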

To include a sitemap

Sitemap: https://your-site.com/sitemap.xml
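
A `Sitemap` line can sit alongside crawl rules in the same file. As a small sketch, assuming Python 3.8 or later (where `RobotFileParser.site_maps()` is available), you can check that a file declares the sitemap you expect. The rules and `ExampleBot` user agent are sample values for illustration.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt content combining a folder rule with a sitemap entry.
robots_txt = """\
User-agent: *
Disallow: /folder-name/

Sitemap: https://your-site.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns a list of declared sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```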

Note: Webflow adds a link to your sitemap to your robots.txt by default.

Helpful resources

Check out more useful robots.txt rules.

Note: Anyone can access your site’s robots.txt file, so listing private URLs there may help people identify and access your private content.

Best practices for privacy 

If you’d like to prevent the discovery of a particular page or URL on your site, don’t use the robots.txt to disallow the URL from being crawled. Instead, use either of the following options: 

FAQ and troubleshooting tips

Can I use a robots.txt file to prevent my Webflow site assets from being indexed? 

It’s not possible to use a robots.txt file to prevent Webflow site assets from being indexed because a robots.txt file must live on the same domain as the content it applies to (in this case, where the assets are served). Webflow serves assets from our global CDN, rather than from the custom domain where the robots.txt file lives. 
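
To see why, note that a crawler looks for robots.txt at the root of the host that serves each URL. The sketch below derives that location for a page and for an asset. Both hostnames are hypothetical stand-ins (they are not Webflow's actual domains), and `robots_txt_url` is a helper written for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(resource_url):
    """Return the robots.txt URL that governs the given resource."""
    parts = urlsplit(resource_url)
    # Keep the scheme and host; replace the path and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical URLs: a page on a custom domain and an asset on a CDN host.
page_url = "https://www.example.com/pricing"
asset_url = "https://cdn.example.net/images/hero.png"

print(robots_txt_url(page_url))
print(robots_txt_url(asset_url))
```

Because the two results point at different hosts, rules published on the custom domain never apply to the asset URL.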

I removed the robots.txt file from my Site settings, but it still shows up on my published site. How can I fix this? 

Once a robots.txt file has been created, it can’t be completely removed. However, you can replace its contents with new rules that allow the site to be crawled, e.g.:

User-agent: *
Disallow:

Make sure to save your changes and republish your site. If the issue persists and you still see the old robots.txt rules on your published site, please contact customer support.
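
An empty `Disallow:` value means nothing is blocked. If you want to double-check replacement rules before republishing, this short sketch compares them with the "hide your entire site" rules using Python's standard `urllib.robotparser`. The `allows` helper, the `ExampleBot` user agent, and the sample path are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def allows(rules, path, user_agent="ExampleBot"):
    """Return True if the given robots.txt rules allow fetching the path."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, "https://your-site.com" + path)


# An empty Disallow value allows everything; "Disallow: /" blocks everything.
allow_all = "User-agent: *\nDisallow:\n"
block_all = "User-agent: *\nDisallow: /\n"

print(allows(allow_all, "/any-page"))
print(allows(block_all, "/any-page"))
```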
