Before embarking on this project, I decided to look at what work had already been done on programmatically acquiring PageRanks. I had already created a Google account so that I could access their Web APIs, but to my disappointment, and that of many other developers, these APIs did not provide an interface for retrieving the PageRank of a URL.
Next, I did a Google search to see how far hackers on the Web had come in finding a workaround for the missing API. I found a Mac message board discussing how Mac users could view PageRanks, since the Google toolbar is only available to Windows IE users. The discussion revealed that someone had written a piece of software that let you check the PageRank of a site; however, "Google got upset about it [and] Put the squeeze on the author and he took it down." I searched Google for this piece of software, but had no success.
From the message board, it was not clear how this other piece of software worked. There was some discussion of using a packet sniffer to decode the query that the Google toolbar sent to Google, but no one had managed to decode it, as it appeared to use some variation of an MD5 hash.
However, one person posted the following URL, which showed the result of a Google query in an XML format:

http://www.google.com/search?client=navclient-auto&ch=12206152085&q=info:http%3A%2F%2Fwww%2Ewebmasterworld%2Ecom

By inspecting the value of the <RK> tag in the XML, you can get the PageRank of http://www.webmasterworld.com. Unfortunately, the author of this post did not reveal how he came up with the URL for this XML document, and substituting other domain names for webmasterworld.com in the query results in a "Google Forbidden Error," so this URL was not a viable general solution to the PageRank query problem.
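To make the shape of that URL concrete, here is a minimal Python sketch. It only rebuilds the one query string quoted above and pulls an <RK> value out of an XML string; it makes no request to Google. The function names build_info_query and extract_rk are hypothetical, the ch value is simply copied from the post (how it is computed is unknown), and the sample XML is invented for illustration, since the real response layout was not documented in the discussion.

```python
import xml.etree.ElementTree as ET
from typing import Optional
from urllib.parse import quote

SEARCH_ENDPOINT = "http://www.google.com/search"


def build_info_query(page_url: str, ch: str) -> str:
    """Rebuild a query URL of the shape posted on the message board."""
    # quote() leaves "." unescaped, but the posted URL also escapes it as %2E.
    encoded = quote(page_url, safe="").replace(".", "%2E")
    return f"{SEARCH_ENDPOINT}?client=navclient-auto&ch={ch}&q=info:{encoded}"


def extract_rk(xml_text: str) -> Optional[str]:
    """Return the text of the first <RK> element, or None if there is none."""
    root = ET.fromstring(xml_text)
    for element in root.iter("RK"):
        return element.text
    return None


# The only (URL, ch) pair known from the post; how ch is derived is unknown.
print(build_info_query("http://www.webmasterworld.com", "12206152085"))

# HYPOTHETICAL response body: the element layout and the value 5 are made up.
sample = "<RESPONSE><RESULT><RK>5</RK></RESULT></RESPONSE>"
print(extract_rk(sample))
```

Because the ch parameter has to match the queried URL, calling build_info_query with any other domain and the same ch would presumably produce the forbidden response described above, which is why this sketch is not a general solution either.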
Further web searches about programmatically acquiring PageRanks did not yield any helpful results, so this is where my background research came to an end. I was a little concerned that Google allegedly "Put the squeeze" on that other software author, but I decided that I would cross that bridge when I came to it. As my program does not do any packet sniffing, and it loads and renders the web page just as a real user would, I did not see how my software would violate the Google toolbar terms of service.
©2004 Michael Bolin » email@example.com | www.bolinfest.com