In the realm of search engine optimization (SEO), the robots.txt file plays a pivotal role in guiding search engine crawlers through your website. As outlined in the Next.js documentation, this simple text file is key to managing crawler access and steering crawlers toward the content you actually want indexed.

What is a robots.txt File?

A `robots.txt` file is a public directive placed at the root of a website. It informs search engine crawlers which pages or files they can or cannot request from your site. This is particularly important for areas of your website that you wish to keep private, such as user accounts, admin sections, or certain API routes.

How does robots.txt work?

The robots.txt file uses a straightforward syntax to communicate with crawlers. For instance, to block all crawlers from accessing the /accounts directory, you would include the following lines in your robots.txt:

# Block all crawlers for /accounts
User-agent: *
Disallow: /accounts

Conversely, to allow all crawlers to access your entire site, you would use:

# Allow all crawlers
User-agent: *
Allow: /

Implementing robots.txt in Next.js

Next.js simplifies the process of adding a robots.txt file to your project. Thanks to its static file serving, you can create a robots.txt file in the public folder at your project’s root. When you run your app, the file is served at /robots.txt, reflecting the rules you’ve set for crawler access.
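
For example, a minimal public/robots.txt might look like the sketch below; the /accounts path and the sitemap URL are only placeholders for your own values:

# public/robots.txt
# Allow all crawlers everywhere except the /accounts area
User-agent: *
Allow: /
Disallow: /accounts

Sitemap: https://fireup.pro/sitemap.xml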

The Impact of robots.txt on SEO

The robots.txt file is a powerful tool for SEO. By controlling crawler access, you can prevent search engines from indexing sensitive information or pages that are not meant for public consumption. This ensures that your site’s SEO is focused on content that truly matters to your audience and your business goals.

In conclusion, the robots.txt file is an essential component of a well-optimized website. The Next.js framework provides an easy way to implement and manage this file, giving developers and SEO specialists the control they need over how search engines interact with their sites. By following the guidelines provided in the Next.js documentation, you can effectively use robots.txt to enhance your site’s SEO performance.

This article provides a high-level overview of robots.txt as described in the Next.js documentation. For more detailed information and advanced configurations, it’s recommended to refer to the official Next.js SEO guide.

How can it be created in Next.js 14?

Creating robots.txt is very easy in Next.js 14. Add a robots.js or robots.ts file to the root of the app folder and export a default function that returns a Robots object.

Check out this example of a robots.ts file:


// app/robots.ts or app/robots.js

import { MetadataRoute } from 'next'
 
export default function robots(): MetadataRoute.Robots {
  return {
    rules: {
      userAgent: '*',
      allow: '/',
      disallow: '/private/',
    },
    sitemap: 'https://fireup.pro/sitemap.xml',
  }
}
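
With this configuration, Next.js generates and serves the robots.txt file at the root of your domain. With the example values above, the output should look roughly like this:

User-Agent: *
Allow: /
Disallow: /private/

Sitemap: https://fireup.pro/sitemap.xml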

In Next.js 14, you can also customize behavior for individual search engine bots: the rules property accepts an array of rule objects, one per user agent (or group of user agents).


import type { MetadataRoute } from 'next'

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      {
        userAgent: 'Googlebot',
        allow: ['/'],
        crawlDelay: 10,
      },
      {
        userAgent: ['Applebot', 'Bingbot'],
        allow: ['/'],
      },
      {
        userAgent: 'GPTBot',
        disallow: ['/'],
      },
    ],
    sitemap: 'https://fireup.pro/sitemap.xml',
  }
}

Inside the function, the rules array contains an object for each user agent (or group of user agents):
Googlebot: allowed to access all paths ('/'), with a crawlDelay of 10 seconds.
Applebot and Bingbot: both allowed to access all paths, with no crawl delay specified.
GPTBot: explicitly disallowed from accessing any path on the site.
Sitemap: the function also specifies the location of the website’s sitemap: https://fireup.pro/sitemap.xml.

This code generates the following static robots.txt file:


User-Agent: Googlebot
Allow: /
Crawl-delay: 10

User-Agent: Applebot
User-Agent: Bingbot
Allow: /

User-Agent: GPTBot
Disallow: /

Sitemap: https://fireup.pro/sitemap.xml

Remember that you cannot move the robots.ts or robots.js file into a different folder or route. If you do, Next.js will not pick it up, and requests to /robots.txt will return a 404 error.
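
In other words, the file must sit directly inside the app directory, next to your root layout. A typical placement looks like this (the other file names are just examples):

app/
  robots.ts      <- must live here, at the root of app/
  layout.tsx
  page.tsx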