Using the WordPress Robots.txt File for Better SEO
If you manage a WordPress site, chances are you have heard of ‘robots.txt’. Yet you may still wonder what it is and whether it is an important part of your site. We have got you covered. In this post, you will get a clear picture of what the WordPress robots.txt file is and how it helps you manage the way bots interact with your website.
If you are a business owner who uses a WordPress website to interact with clients, promoting it in the search engines is crucial. Search engine optimization involves many important steps, and one of them is building a good WordPress Robots.txt file.
What Is the WordPress Robots.txt File?
Before getting into the details of WordPress Robots.txt, let us first define what a ‘robot’ means in this context, taking search engine crawlers as an example. These crawlers are ‘bots’ or ‘robots’ that visit websites across the internet. They ‘crawl’ the web and help search engines like Google index and rank pages.
To be clear, bots are a necessary part of the internet. Nonetheless, that does not mean you should let them run around your site unregulated. The robots.txt file is part of what is known as the ‘Robots Exclusion Protocol’. It was developed because site owners wanted to control how bots interact with their websites. The robots.txt file can be used to limit bots’ access to certain areas of the site or even block them completely.
Even so, this regulation is subject to certain limitations. For instance, bots cannot be forced to follow the commands in the robots.txt file, and malicious bots can simply ignore it. Even Google and other prominent organizations ignore certain directives you add to robots.txt. If you are having a lot of problems with bots, a security solution such as Cloudflare or Sucuri can be quite useful.
How Does the WordPress Robots.txt File Help Your Website?
A well-structured WordPress Robots.txt file has two basic benefits. First, it blocks bots that waste your server resources, which improves the efficiency of your site. Second, it optimizes search engines’ crawl resources by telling them which URLs on your site they are allowed to crawl. Before a search engine crawls any page on a domain it has not come across before, it opens the domain’s robots.txt file and analyzes its commands. Contrary to popular belief, robots.txt is not meant for controlling which pages get indexed in search engines.
If your main aim is to stop certain pages from appearing in search engine results, a better way of doing this is to use a noindex meta tag or another equally direct approach. The reason is that robots.txt does not tell search engines not to index content; it only tells them not to crawl it. As a result, even though Google will not crawl the specified areas of your site, those pages can still get indexed when an external site links to them.
Creating and Editing Your WordPress Robots.txt File
WordPress already creates a robots.txt file for your site. The WordPress Robots.txt file always sits at the root of your domain. So, if your domain is www.nameofwebsite.com, it should be found at http://www.nameofwebsite.com/robots.txt. This is a virtual file, so it cannot be edited directly. To be able to edit your robots.txt file, you need to create a physical file on your server. It can then be tweaked according to your requirements.
Creating and Editing a Robots.txt File with Yoast SEO
Yoast SEO is a very popular plugin, and its interface allows you to create and edit the robots.txt file. Here are the steps to follow:
- 1. First, you need to enable Yoast SEO’s advanced features. Go to SEO, click Dashboard, and choose Features from the menu that appears. Then toggle Advanced settings pages to Enabled.
- 2. Once it is activated, go to SEO, select Tools, and click File Editor. You will then get an option to create the robots.txt file.
- 3. Click the “Create robots.txt file” button. You can then use the same interface to edit the contents of your file.
We will talk about what types of commands to put in your WordPress Robots.txt file later in this article.
Creating and Editing a Robots.txt File with All in One SEO
When it comes to popularity, the All in One SEO Pack plugin is almost on par with Yoast SEO. This plugin’s interface can be used to create and edit the WordPress Robots.txt file. Just follow these simple steps:
- Go to the plugin dashboard, select Feature Manager and Activate the Robots.txt feature.
- Now, choose Robots.txt. You will be able to manage your robots.txt file from there.
Creating and Editing a Robots.txt File via FTP
If you do not use an SEO plugin that offers a robots.txt feature, there is no need to worry. A WordPress Robots.txt file can still be created and edited using SFTP. Follow these steps:
- Make a blank file named “robots.txt” using any text editor and save it.
- Connect to your site via SFTP and upload this file to the root folder of your site.
- You can now use SFTP to make changes to your robots.txt file. You can also upload new versions of the file if you wish.
Deciding What to Put in Your WordPress Robots.txt
Now that you have a physical robots.txt file, you can tweak and edit it as per your requirements. Let us look at what you can do with this file. We have already talked about the importance of WordPress Robots.txt in controlling how bots interact with your site. Now, we will discuss the two core commands required to do this.
- The User-agent command targets particular bots. Bots use user-agents to identify themselves, so this command lets you create a rule that applies to one search engine but not to another.
- The Disallow command enables you to keep robots from accessing specific areas of your site.
Furthermore, there is another command called Allow. It is useful when you disallow access to a folder and its sub-folders but still want to allow access to a specific file or folder inside it. Keep in mind that all the content on your site is treated as “Allow” by default. When adding rules, the first thing you should do is state the user-agent to which the rule will apply. Then, state the rules themselves using the Allow and Disallow commands.
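As a rough sketch of how these three commands fit together, the group below uses a made-up bot name and made-up paths purely for illustration; they are placeholders, not rules to copy as they are:

```
# Which bot the rules in this group apply to (* would mean every bot)
User-agent: ExampleBot
# Keep this bot out of a folder and everything beneath it
Disallow: /example-folder/
# ...except for this one file inside that folder
Allow: /example-folder/public-file.html
```

Each group starts with a User-agent line, followed by the Disallow and Allow lines that apply to that bot.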
Specific use cases for WordPress robots.txt
- Using robots.txt to block access to your whole site
If your site is still in the development stage, you may want to block crawler access to it. To do this, you will have to add the following code to your WordPress Robots.txt file:
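A minimal version of that rule, consistent with the explanation that follows, would look like this:

```
# Applies to every bot
User-agent: *
# Blocks every URL on the site
Disallow: /
```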
How Does This Code Work?
The * (asterisk) after User-agent signifies “all user agents”, and the / (slash) after Disallow signifies that access to every URL beginning with “www.nameofwebsite.com/” (that is, every page on your site) should be disallowed.
- Use of robots.txt to block a specific bot from accessing your site
Now let’s say you want to prevent a particular search engine from crawling your content. For example, you might want to allow Google to crawl your site but want to disallow Bing. You can do so simply by replacing the *(asterisk) in the previous example with Bingbot.
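Assuming the site-wide block from the previous example as the starting point, the Bingbot-only variant might look like this:

```
# Applies only to Bing's crawler; other bots are unaffected
User-agent: Bingbot
# Blocks the entire site for that bot
Disallow: /
```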
- Using robots.txt to block access to a particular folder/file
If you only want to block access to a particular folder or file (and consequently its sub-folders), this is the command that you should follow.
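A minimal sketch, using the wp-admin folder as the example path:

```
# Applies to every bot
User-agent: *
# Blocks the wp-admin folder and everything inside it
Disallow: /wp-admin/
```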
Here, we use the example of the wp-admin folder. It can be any folder or file as per your requirements; all you need to do is replace “wp-admin” in the above code with the name of the folder or file you want to keep search engines from crawling.
- Using robots.txt to allow access to a specific file in a folder that is otherwise completely disallowed
Let’s say you wish to block an entire folder but still allow access to a particular file within it. In the previous example, we blocked access to the WordPress admin folder completely. What if we want to block access to the entire contents of the /wp-admin/ folder EXCEPT the /wp-admin/admin-ajax.php file? All you have to do is add an Allow command to the code in the previous example.
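Building on the wp-admin example, the combined rule could be sketched as follows:

```
# Applies to every bot
User-agent: *
# Blocks the wp-admin folder as a whole
Disallow: /wp-admin/
# ...but still allows this single file inside it
Allow: /wp-admin/admin-ajax.php
```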
- Utilizing robots.txt to stop bots from crawling WordPress search results
If you wish to prevent search crawlers from crawling your search results pages, there’s a very simple command that will save the day. WordPress, by default, uses the query parameter “?s=”. Just add this command to block access. For example:
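A minimal sketch is shown here. The first Disallow line covers the default “?s=” parameter at the site root; the second line is an assumption that only matters if your site also serves search results under a /search/ path, so drop it if that does not apply:

```
# Applies to every bot
User-agent: *
# Blocks default WordPress search URLs such as /?s=keyword
Disallow: /?s=
# Assumption: search results are also served under /search/
Disallow: /search/
```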
- Using robots.txt to create different rules for different bots
In all the above cases, we worked with one rule that accomplished a single goal. However, what happens if you want to create different sets of commands for different bots? This is easy to do. All you have to do is create a separate set of rules under the User-agent command for each bot. For example, if you want one rule for all bots but a separate rule only for Bingbot, this is what you will do:
What you are doing in this case is blocking all bots from accessing the wp-admin folder, while blocking Bingbot from accessing your entire site.
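One way to write those two groups, sketched with the same wp-admin and Bingbot examples used earlier:

```
# Group 1: every bot is kept out of the wp-admin folder
User-agent: *
Disallow: /wp-admin/

# Group 2: Bingbot is kept out of the entire site
User-agent: Bingbot
Disallow: /
```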
How to Test Your WordPress Robots.txt
You can test your WordPress Robots.txt file to see whether your entire site is crawlable, whether specific URLs are blocked, and whether certain crawlers are already blocked or disallowed. You can do this in Google Search Console. Go to your site and open “Crawl”. Under it, select “robots.txt Tester” and enter any URL to check its accessibility.
Look out for the UTF-8 BOM
Your WordPress Robots.txt file may look completely fine and still have a major issue. For example, you may find that your directives are not being followed and that pages which are not supposed to be crawled are in fact being crawled. The reason behind this often comes down to an invisible character called the UTF-8 BOM.
BOM stands for byte order mark, and older text editors sometimes add it to files. If this character is present in your robots.txt file, Google might not be able to read the file correctly and may complain about “Syntax not understood”. This has a significant impact on SEO and can render your robots.txt file useless. While you are testing your robots.txt file, look out for the UTF-8 BOM by checking whether Google reports any syntax it does not understand.
Ensuring the Correct Use of the WordPress Robots.txt
Let us end this guide with a quick reminder. Although robots.txt blocks crawling, it does not necessarily stop indexing. Robots.txt lets you add guidelines that control and outline how your site interacts with search engines and bots, but it does not explicitly control whether or not your content gets indexed. Tweaking your site’s robots.txt file can be very helpful if you intend:
- To fix a site that is having trouble with a specific bot.
- To have better control over how search engines interact with certain content or plugins on your site.
If neither of the above applies to you, you probably do not need to change the default virtual robots.txt file on your site.
How to Optimize WordPress Robots.txt
WordPress Robots.txt generally resides in the root folder of your site. To view it, you will have to connect to your site using an FTP client or use the file manager in your cPanel. If you have no robots.txt file in the root directory of your site, you can create one. All you have to do is create a new text file and save it to your computer as robots.txt. Then, just upload it to the root folder of your site.
In this article, we have explained the WordPress Robots.txt file, a widely used way of telling search engine bots which parts of your site they may crawl. There are many reasons to optimize your WordPress Robots.txt file, but the main one is to stop search engines from crawling pages that are not meant to be publicly accessible. We suggest that you follow the format given above to build a WordPress Robots.txt file for your site. We hope that this guide helps you achieve better SEO.