Create a File Named robots.txt:
Use any text editor (such as Notepad, TextEdit, vi, or emacs) to create a new file.
Save the file as robots.txt.
Save the file with UTF-8 encoding; some editors prompt for an encoding during the save process.
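The file-creation step can also be done programmatically. A minimal sketch using Python's standard library (the single rule written here is a placeholder, not a recommendation):

```python
# Sketch: write a robots.txt file with explicit UTF-8 encoding.
# The rule below is a placeholder -- replace it with rules for your site.
rules = "User-agent: *\nAllow: /\n"

with open("robots.txt", "w", encoding="utf-8") as f:
    f.write(rules)
```

Passing encoding="utf-8" explicitly avoids depending on the platform's default encoding.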
Add Rules to the robots.txt File:
A robots.txt file consists of one or more rules.
Each rule specifies whether a specific crawler (user agent) is allowed or disallowed access to certain file paths on your domain.
Unless a rule disallows it, every file on the site is implicitly allowed for crawling.
Upload the robots.txt File:
Upload the robots.txt file to the root directory of your website.
For example, if your site is www.example.com, the robots.txt file should be accessible at www.example.com/robots.txt.
Remember that a site can have only one robots.txt file per host; a subdomain such as blog.example.com needs its own robots.txt.
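Because crawlers always request robots.txt from the root of the host, you can derive the expected location from any page URL on the site. A small sketch (the `robots_url` helper name is my own, not a standard API):

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(site_url: str) -> str:
    """Return the robots.txt URL for the host serving site_url.

    Crawlers ignore any path component: robots.txt must sit at the root.
    """
    parts = urlsplit(site_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("https://www.example.com/blog/post.html"))
# https://www.example.com/robots.txt
```

Note that a file placed in a subdirectory (e.g. /blog/robots.txt) is ignored by crawlers.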
Test the robots.txt File:
Verify that the file is accessible by visiting www.example.com/robots.txt in your browser.
Check if the rules are correctly defined and match your intended restrictions.
Here’s a simple example of a robots.txt file:
User-agent: Googlebot
Disallow: /nogooglebot/
User-agent: *
Allow: /
Sitemap: https://www.example.com/sitemap.xml
The user agent named Googlebot is not allowed to crawl any URL starting with /nogooglebot/.
All other user agents are allowed to crawl the entire site (which is the default behavior).
The site’s sitemap file is located at https://www.example.com/sitemap.xml.
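You can sanity-check that these rules behave as described with Python's standard urllib.robotparser module; it parses the same syntax that crawlers read (the URLs below are illustrative):

```python
from urllib.robotparser import RobotFileParser

example = """\
User-agent: Googlebot
Disallow: /nogooglebot/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(example.splitlines())

# Googlebot is blocked from paths under /nogooglebot/ ...
print(parser.can_fetch("Googlebot", "https://www.example.com/nogooglebot/page.html"))  # False
# ... but may crawl everything else.
print(parser.can_fetch("Googlebot", "https://www.example.com/index.html"))  # True
# Other crawlers fall through to the wildcard group and may crawl anything.
print(parser.can_fetch("SomeOtherBot", "https://www.example.com/nogooglebot/page.html"))  # True
```

In a live deployment you could instead call parser.set_url("https://www.example.com/robots.txt") followed by parser.read() to fetch and check the file you actually uploaded.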
Adjust the rules to match your site's specific requirements. A well-structured robots.txt file gives you better control over how search engine crawlers access your content, though keep in mind that it is a request to well-behaved crawlers, not an access-control mechanism.
