Google Announces Open Sourcing Of Its Robots.txt Parser
Google has announced that it is open sourcing its own robots.txt parser in an effort to turn the Robots Exclusion Protocol (REP) into an official internet standard.
Google has open sourced the C++ library that it has been using for parsing and matching rules in robots.txt files. According to the company, the library has been in production use for 20 years and contains code written in the 1990s.
"Since then, the library evolved; we learned a lot about how webmasters write robots.txt files and corner cases that we had to cover for, and added what we learned over the years also to the internet draft when it made sense." wrote the company
Google is also including a testing tool, `robots_main`, in the open source package so that developers can check how a set of robots.txt rules applies to a given URL and user agent.
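Google's library and testing tool are written in C++, but the kind of rule check they perform can be illustrated with Python's standard-library `urllib.robotparser` (a separate, simpler implementation, not Google's parser):

```python
# Illustrative only: Python's stdlib robots.txt parser, not Google's C++ library.
from urllib.robotparser import RobotFileParser

# A small robots.txt, supplied inline as lines.
rules = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = RobotFileParser()
parser.parse(rules)

# Check whether a given user agent may fetch a given URL.
print(parser.can_fetch("FooBot", "https://example.com/private/page.html"))  # False
print(parser.can_fetch("FooBot", "https://example.com/index.html"))         # True
```

Google's `robots_main` tool answers the same question from the command line, given a robots.txt file, a user agent, and a URL.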
You can visit the GitHub repository for the robots.txt parser here.