Disable search engine indexing

You can tell search engines which pages on your site to crawl and which to skip by writing a robots.txt file. You can block crawling of individual pages, folders, or your entire site, or just disable indexing of your webflow.io subdomain. This is useful for keeping pages such as your 404 page from being indexed and listed in search results.

Disabling Webflow subdomain indexing

You can prevent Google and other search engines from indexing the webflow.io subdomain by disabling indexing in your Project settings.

  1. Go to Project Settings → SEO → Indexing
  2. Set Disable Subdomain Indexing to "Yes"
  3. Save the changes and publish your site

A unique robots.txt file is then published on the subdomain only, telling search engines to ignore that domain.

Generating a robots.txt file

The robots.txt file is typically used to list the URLs on a site that you don't want search engines to crawl. You can also include your site's sitemap in the robots.txt file to tell search engine crawlers which content they should crawl.

Just like a sitemap, the robots.txt file lives in the top-level directory of your domain. Webflow will generate the /robots.txt file for your site once you populate it in your Project settings.

  1. Go to Project Settings → SEO → Indexing
  2. Add the robots.txt rule(s) you want (see below)
  3. Save the changes and publish your site
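
Because the file always sits at the top level of the domain, its address can be derived from any page URL on the site. The following is a minimal Python sketch of that idea; the function name robots_txt_url and the example.com address are placeholders chosen for illustration, not part of Webflow.

```python
from urllib.parse import urljoin


def robots_txt_url(page_url: str) -> str:
    """Return the top-level robots.txt URL for the site that serves page_url."""
    # The leading slash makes urljoin drop the page's own path and keep
    # only the scheme and host of the original URL.
    return urljoin(page_url, "/robots.txt")


# Placeholder URL, used only to illustrate the call.
print(robots_txt_url("https://example.com/blog/post-1"))
```

After publishing, opening that address in a browser is a quick way to confirm the rules you entered were saved.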

Robots.txt rules

You can use any of these rules to populate the robots.txt file.

  • User-agent: * means this section applies to all robots.
  • Disallow: tells the robot not to visit the site, page, or folder that follows it.

To hide your entire site

User-agent: *
Disallow: /

To hide individual pages

User-agent: *
Disallow: /page-name

To hide an entire folder of pages

User-agent: *
Disallow: /folder-name/
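
To see how a crawler might read the three rule sets above, you can test them locally before publishing. This sketch uses the robots.txt parser from Python's standard library. The helper name allowed and the example.com host are assumptions made for illustration, and real search engines may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def allowed(rules: str, path: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    # example.com is a placeholder host; only the path is matched against the rules.
    return parser.can_fetch(agent, "https://example.com" + path)


hide_site = "User-agent: *\nDisallow: /"
hide_page = "User-agent: *\nDisallow: /page-name"
hide_folder = "User-agent: *\nDisallow: /folder-name/"

for label, rules, path in [
    ("entire site", hide_site, "/about"),
    ("single page", hide_page, "/page-name"),
    ("single page", hide_page, "/other-page"),
    ("folder", hide_folder, "/folder-name/child-page"),
]:
    print(f"{label}: {path} allowed? {allowed(rules, path)}")
```

This parser matches rules by path prefix, so a rule written for /page-name also covers any path that begins with /page-name. Checking a few neighbouring URLs is a reasonable way to catch rules that block more than intended.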

To include a sitemap

Sitemap: https:yo
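
As a rough check that a Sitemap line is picked up alongside other rules, the sketch below parses a small rule set with Python's standard library (the site_maps method requires Python 3.8 or later). The sitemap address shown is a placeholder on example.com, not your site's real sitemap URL, and the helper name sitemaps_in is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Placeholder rules; replace the Sitemap URL with your own site's sitemap address.
rules = """\
User-agent: *
Disallow: /folder-name/
Sitemap: https://example.com/sitemap.xml
"""


def sitemaps_in(robots_txt: str) -> list:
    """Return every sitemap URL declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


print(sitemaps_in(rules))
```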

Helpful resources
Must know
  • Content from your site may still be indexed even if it hasn't been crawled. This happens when a search engine already knows about the content, either because it was published previously or because other content online links to it. To ensure that a page isn't indexed, don't list it in the robots.txt file, because a crawler that is blocked from a page never sees the tags on it. Instead, use the noindex meta code.
  • Anyone can read your site's robots.txt file, so listing private URLs there may help people identify and access that content.
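
The noindex meta code mentioned above is normally a robots meta tag in the page's head, for example <meta name="robots" content="noindex">. To confirm that a published page actually carries it, something like the following Python sketch could be used. The class and function names are hypothetical, and the check only looks at meta tags in the HTML you pass in. It does not inspect HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when written without a value.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


sample = '<html><head><meta name="robots" content="noindex"></head></html>'
print(has_noindex(sample))
```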
Best practices

If you don't want anyone to find a particular page or URL on your site, do not use the robots.txt file to disallow the URL from being crawled. Instead, use any of the options below: