In the realm of search engine optimization (SEO), the robots.txt file plays a pivotal role in guiding search engine crawlers through your website. As outlined in the Next.js documentation, this simple text file is key to managing crawler access and keeping search engines focused on the content you actually want indexed.
What is a robots.txt File?
A `robots.txt` file is a public directive placed at the root of a website. It informs search engine crawlers which pages or files they can or cannot request from your site. This is particularly important for areas of your website that you wish to keep private, such as user accounts, admin sections, or certain API routes.
How does robots.txt work?
The `robots.txt` file uses a straightforward syntax to communicate with crawlers. For instance, to block all crawlers from accessing the /accounts directory, you would include the following lines in your `robots.txt`:
```
# Block all crawlers for /accounts
User-agent: *
Disallow: /accounts
```
Conversely, to allow all crawlers to access your entire site, you would use:
```
# Allow all crawlers
User-agent: *
Allow: /
```
Implementing robots.txt in Next.js
Next.js simplifies the process of adding a `robots.txt` file to your project. Thanks to its static file serving feature, you can create a `robots.txt` file in the `public` folder at your project's root. When you run your app, the file is served at `/robots.txt`, reflecting the rules you've set for crawler access.
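For example, a minimal `public/robots.txt` could reuse the rules from above (the domain in the `Sitemap` line is a placeholder):

```
# public/robots.txt (served verbatim at /robots.txt)
User-agent: *
Disallow: /accounts

Sitemap: https://your-domain.com/sitemap.xml
```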
The Impact of robots.txt on SEO
The `robots.txt` file is a powerful tool for SEO. By controlling crawler access, you can keep search engines away from sensitive information or pages that are not meant for public consumption. This ensures that your site's SEO is focused on the content that truly matters to your audience and your business goals.
In conclusion, the `robots.txt` file is an essential component of a well-optimized website. Next.js provides an easy way to implement and manage this file, giving developers and SEO specialists the control they need over how search engines interact with their sites. By following the guidelines in the Next.js documentation, you can effectively use `robots.txt` to enhance your site's SEO performance.
This article provides a high-level overview of `robots.txt` as described in the Next.js documentation. For more detailed information and advanced configurations, refer to the official Next.js SEO guide.
How can it be created in Next.js 14?
Creating `robots.txt` is very easy in Next.js 14. Add a `robots.js` or `robots.ts` file to the `app` folder and export a default function that returns a Robots object. Check out the example `robots.ts` file below.
```ts
// app/robots.ts or app/robots.js
import { MetadataRoute } from 'next'

export default function robots(): MetadataRoute.Robots {
  return {
    rules: {
      userAgent: '*',
      allow: '/',
      disallow: '/private/',
    },
    sitemap: 'https://fireup.pro/sitemap.xml',
  }
}
```
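When the app is running, Next.js serves the generated file at `/robots.txt`; for this configuration the output should look roughly like:

```
User-Agent: *
Allow: /
Disallow: /private/

Sitemap: https://fireup.pro/sitemap.xml
```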
In Next.js 14, you can also customize behavior for individual search engine bots. To do so, pass an array of rule objects to the `rules` property, one entry per user agent (or group of user agents).
```ts
import type { MetadataRoute } from 'next'

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      {
        userAgent: 'Googlebot',
        allow: ['/'],
        crawlDelay: 10,
      },
      {
        userAgent: ['Applebot', 'Bingbot'],
        allow: ['/'],
      },
      {
        userAgent: 'GPTBot',
        disallow: ['/'],
      },
    ],
    sitemap: 'https://fireup.pro/sitemap.xml',
  }
}
```
Inside the function, the `rules` array contains one object per user agent:
- Googlebot: allowed to access all paths (`/`), with a `crawlDelay` of 10 seconds.
- Applebot and Bingbot: both allowed to access all paths, without any specified crawl delay.
- GPTBot: explicitly disallowed from accessing any path on the site.
- Sitemap: the function also specifies the location of the website's sitemap: https://fireup.pro/sitemap.xml.
This code generates the following static `robots.txt` file:
```
User-Agent: Googlebot
Allow: /
Crawl-delay: 10

User-Agent: Applebot
User-Agent: Bingbot
Allow: /

User-Agent: GPTBot
Disallow: /

Sitemap: https://fireup.pro/sitemap.xml
```
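You can check the generated file locally by starting the dev server and requesting it directly (assuming the default `dev` script and port 3000):

```
npm run dev
curl http://localhost:3000/robots.txt
```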
Remember, you cannot move the `robots.ts` or `robots.js` file into a different folder or route. If you do, Next.js will not pick it up, and requests to `/robots.txt` will return a 404 error.
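A quick sketch of the expected placement (the `dashboard` route is only an illustration):

```
app/
├── robots.ts        <- correct: generates /robots.txt
├── layout.tsx
└── dashboard/
    └── robots.ts    <- wrong: not picked up, /robots.txt returns 404
```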