Avoiding conflicts between spider crawling and indexing control techniques

The meta robots tag sets page-level instructions for search engine robots. It should be placed in the head of the HTML file.

<link rel="canonical" href="http://example.com/quality-wrenches.htm" />

Today I want to talk about the limitations of robot control techniques. To keep spiders from grabbing a page, webmasters sometimes combine several robot control techniques to deny search engines access to it. Unfortunately, these techniques can contradict each other, and such restrictions can also leave some links hidden.


Meta robots tag


Canonical tag

The canonical tag is a page-level tag placed in the head of the HTML page. It tells search engines which URL is the standard one. Its purpose is to keep search engines from grabbing duplicate content, and to consolidate the weight of the duplicate pages onto the canonical page.
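
To make the head-level tags concrete, here is a minimal sketch that reads the canonical URL and the meta robots directives out of a page with Python's standard html.parser module. The names HeadTagParser and read_head_tags, and the sample page, are made up for illustration and are not part of any search engine's tooling.

```python
from html.parser import HTMLParser


class HeadTagParser(HTMLParser):
    """Collects the canonical URL and meta robots directives from a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = set()

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None (bare attributes), so default to "".
        attrs = {name: (value or "") for name, value in attrs}
        if tag == "link" and attrs.get("rel", "").strip().lower() == "canonical":
            self.canonical = attrs.get("href", "").strip()
        elif tag == "meta" and attrs.get("name", "").strip().lower() == "robots":
            for directive in attrs.get("content", "").split(","):
                if directive.strip():
                    self.robots.add(directive.strip().lower())


def read_head_tags(html):
    """Hypothetical helper: return (canonical_url, set_of_robots_directives)."""
    parser = HeadTagParser()
    parser.feed(html)
    return parser.canonical, parser.robots


# Sample page invented for this sketch.
sample = """<html><head>
<meta name="robots" content="noindex, follow">
<link rel="canonical" href="http://example.com/quality-wrenches.htm" />
</head><body></body></html>"""

canonical, robots = read_head_tags(sample)
print(canonical)
print(sorted(robots))
```

A page that declares neither tag simply yields None and an empty set, which makes it easy to see which signals a spider would actually find.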

X-Robots-Tag


Quick review

The code looks like this:

Since 2007, Google and other search engines have supported the X-Robots-Tag as a way to tell spiders their crawling and indexing priorities.
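
Unlike the tags in the page head, the X-Robots-Tag travels in the HTTP response headers, so it can also cover files that have no HTML head, such as PDFs. The sketch below shows one way to collect its directives from a list of response headers. The function name x_robots_directives and the sample headers are assumptions for illustration, and the parsing is deliberately simplified: it ignores forms that name a specific user agent or carry a date.

```python
def x_robots_directives(headers):
    """Return the set of directives found in X-Robots-Tag headers.

    `headers` is a list of (name, value) pairs, because the header may
    appear more than once in a single response.
    """
    directives = set()
    for name, value in headers:
        if name.lower() != "x-robots-tag":
            continue
        # Simplification: treat the value as a plain comma-separated list.
        for part in value.split(","):
            part = part.strip().lower()
            if part:
                directives.add(part)
    return directives


# Sample headers invented for this sketch.
response_headers = [
    ("Content-Type", "application/pdf"),
    ("X-Robots-Tag", "noindex, nofollow"),
]
print(sorted(x_robots_directives(response_headers)))
```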

As you know, you can't always rely on search engine spiders to behave effectively when they access or index your site. Left entirely to their own devices, spiders will produce a lot of duplicate content, treat some important pages as junk, index entry points that should not be shown to users, and cause other problems. Several tools give us fuller control over what spiders do on a site, such as robots.txt, meta robots tags, and canonical tags.

Before we get into the subject, let us look at some of the limitations of the mainstream robots techniques:

So what happens when a page is blocked in the robots file, or when it uses a noindex tag together with a canonical tag?
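
One commonly documented consequence is that a spider that obeys robots.txt never fetches a blocked page, so it cannot read a noindex or canonical tag placed on it. The sketch below checks for that situation with Python's standard urllib.robotparser module. The function find_conflicts, its parameters, and the sample rules and URLs are hypothetical, and it assumes you already know which tags the page carries.

```python
from urllib.robotparser import RobotFileParser


def find_conflicts(robots_txt, url, directives, has_canonical, user_agent="*"):
    """Return warnings for on-page signals hidden behind a robots.txt block."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if rules.can_fetch(user_agent, url):
        return []  # the page can be fetched, so its tags can be read
    warnings = []
    if "noindex" in directives:
        warnings.append("blocked in robots.txt: the noindex tag cannot be read")
    if has_canonical:
        warnings.append("blocked in robots.txt: the canonical tag cannot be read")
    return warnings


# Sample rules and URLs invented for this sketch.
robots_txt = """User-agent: *
Disallow: /private/
"""

blocked = find_conflicts(
    robots_txt, "http://example.com/private/old-page.htm", {"noindex"}, True
)
open_page = find_conflicts(
    robots_txt, "http://example.com/quality-wrenches.htm", {"noindex"}, True
)
print(blocked)
print(open_page)
```

The check only covers the robots.txt case; whether noindex and canonical on the same page work against each other depends on how each search engine resolves the two signals.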
