The dispute revolves around Baidu’s robots.txt files. Nearly all websites have a robots.txt file that tells “spiders” (the web-crawling bots that search engines and other web services use to index websites) which parts of the site, if any, they are allowed to crawl. The file exists to let site owners control where their sites are listed and to protect their servers from being overrun by spiders that have run amok. Generally, if a spider sees in a website’s robots.txt file that it has been disallowed from indexing the site, it simply moves on to the next website without indexing the site in question.
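The check a well-behaved spider performs can be sketched in a few lines of Python using the standard library's `urllib.robotparser`. This is only an illustration: the robots.txt content, the spider names (`ExampleSpider`, `OtherBot`), and the `example.com` URLs are made up for the example and are not taken from Baidu's or Qihoo's actual files.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named spider is banned from the whole site,
# while every other crawler is only kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleSpider
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if ROBOTS_TXT permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is needed here.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    checks = [
        ("ExampleSpider", "https://example.com/page"),
        ("OtherBot", "https://example.com/page"),
        ("OtherBot", "https://example.com/private/data"),
    ]
    for agent, url in checks:
        verdict = "may crawl" if is_allowed(agent, url) else "should skip"
        print(f"{agent} {verdict} {url}")
```

Nothing in the file enforces these answers. A crawler that never runs such a check can still request the pages, so compliance rests on convention rather than on any technical barrier.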
But when Baidu added Qihoo’s spiders to robots.txt files on its services like Baike to stop Qihoo from making use of Baidu’s content, Qihoo apparently reprogrammed its spiders to ignore the robots.txt file and crawl Baidu’s site anyway. Baidu’s lawsuit contends that this constitutes a violation of industry practices and amounts to an illegal seizure of Baidu’s intellectual property, and the company is seeking 100 million RMB ($15 million) in damages.
Speaking with the China Business News, a Qihoo representative characterized the robots.txt file as a good-faith request, not a hard-and-fast requirement. Qihoo also stated that content from Baidu Images, Baidu’s Wikipedia-like Baike service, Baidu Music, Baidu Knows, and more should not be considered Baidu’s intellectual property under copyright law because much of that content was submitted by users. (Not coincidentally, these are the same services Baidu accuses Qihoo of copying from with its web spiders.)
It’s not clear how the lawsuit will play out in court. The robots.txt file is not explicitly addressed in the law, but Baidu may be hoping to take Qihoo down with one of China’s commerce laws that, in rather vague language, suggests that companies must compete fairly and respect publicly known industry standards. But it’s also possible that this lawsuit, which is not the first public spat in China’s web industry over a robots.txt file, could spur the government to clear things up once and for all by passing legislation that addresses the robots.txt file directly.
Baidu declined to comment on this story but did confirm to Tech in Asia that the basic details were correct and that the case will indeed be heard by the court.
Sounds like Qihoo is up to some foul play!