Disable search engine indexing

Prevent search engines from indexing pages, folders, your entire site, or just your webflow.io subdomain.

You can tell search engines which pages to crawl and which not to crawl on your site with a robots.txt file. You can prevent crawling of specific pages, folders, or your entire site, or just disable indexing of your webflow.io subdomain. This is useful for keeping pages like your 404 page out of search results.

Disabling Webflow subdomain indexing

You can prevent Google and other search engines from indexing the webflow.io subdomain by simply disabling indexing from your Project settings.

  1. Go to Project Settings → SEO → Indexing
  2. Set Disable Subdomain Indexing to "Yes"
  3. Save the changes and publish your site

A robots.txt file will be published on the subdomain only, telling search engines to ignore that subdomain.
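Webflow doesn't show the contents of this generated file in the UI, but a robots.txt that tells all crawlers to ignore an entire domain conventionally looks like this (the file Webflow actually generates may differ):

```
User-agent: *
Disallow: /
```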

Screenshot: the Disable Webflow subdomain indexing switch set to Yes in the Indexing section, with the Save changes button highlighted in the top right corner.

Generating a robots.txt file

A robots.txt file is typically used to list the URLs on a site that you don't want search engines to crawl. You can also include your site's sitemap in the robots.txt file to point search engine crawlers to the content they should crawl.

Just like a sitemap, the robots.txt file lives in the top-level directory of your domain. Webflow will generate the /robots.txt file for your site once you populate it in your Project settings.

  1. Go to Project Settings → SEO → Indexing
  2. Add the robots.txt rule(s) you want (see below)
  3. Save the changes and publish your site

Screenshot: the robots.txt field filled in with "User-agent: *" and "Disallow: /", highlighted along with the green Save changes button in the top right corner of the Indexing section.

Generate a robots.txt file for your site by adding robots rules, saving the changes, and publishing your site.

Robots.txt rules

You can use any of these rules to populate the robots.txt file.

  • User-agent: * means this section applies to all robots.
  • Disallow: tells the robot not to visit the specified site, page, or folder.

To hide your entire site

User-agent: *
Disallow: /

To hide individual pages

User-agent: *
Disallow: /page-name

To hide an entire folder of pages

User-agent: *
Disallow: /folder-name/

To include a sitemap

Sitemap: https://your-site.com/sitemap.xml
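You can sanity-check how crawlers interpret rules like these with Python's standard-library `urllib.robotparser`. The domain `your-site.com` and the paths below are just placeholders:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block crawling of everything under /folder-name/
rules = """\
User-agent: *
Disallow: /folder-name/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Pages inside the disallowed folder are blocked for all crawlers...
print(rp.can_fetch("*", "https://your-site.com/folder-name/page"))  # False
# ...while everything else remains crawlable.
print(rp.can_fetch("*", "https://your-site.com/about"))  # True
```

This is the same matching logic search engine crawlers apply: a `Disallow` path blocks any URL whose path starts with that prefix.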

Helpful resources

Check out more useful robots.txt rules

Must know
  • Content from your site may still be indexed even if it hasn't been crawled. This happens when a search engine knows about your content, either because it was published previously or because other content online links to it. To ensure a page isn't indexed, don't add it to the robots.txt file. Instead, use a noindex meta tag.
  • Anyone can access your site's robots.txt file, so they may be able to identify and access your private content.
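The noindex directive is a standard HTML meta tag that goes in a page's head (in Webflow, you'd typically add it through the page's custom code settings):

```html
<!-- Place inside the page's <head> to ask search engines not to index it -->
<meta name="robots" content="noindex">
```

Unlike a robots.txt rule, this keeps the page out of search results even if a crawler reaches it through an external link.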

Best practices

If you don't want anyone to find a particular page or URL on your site, don't use the robots.txt file to disallow that URL from being crawled. Instead, use an option like the noindex meta tag.
