PullRank: a tool to enable you to log the Google PageRank of a URL
Development History
The Code
Since I had no experience with algorithms that match patterns in images, I tried a simple algorithm of my own to find the PageRank graphic (shown in Figure 1) in a bitmapped screenshot. For the algorithm to work, I needed to access a screenshot as an array of pixels, as in a bitmap. After some Google searching, I finally found the kind of navigable documentation that I had been pining for after struggling with the API in the previous step. Here is the pseudo-code for the algorithm that operates on the pixel array:
for each row y in the bitmap
  for each column x in the bitmap
    read the 28 pixels right of (x,y) into an array pixels
    if pixels[0] is different from pixels[1]
      then continue
    else if pixels 0,1,2,3,4,22,23,24,25,26 are not all equal
      then continue
    else if pixels 5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,27 are not all equal
      then continue
    else PATTERN MATCHED! return (x,y) where match occurs

The matching pixels are the ones under the blue line in Figure 1. I designed the algorithm to compare pixel colors against each other (instead of against an absolute color), in the hope that it would be more tolerant of color differences across platforms.
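
As a rough illustration, the pseudo-code above could be sketched in Python as follows. This is not the original implementation: the bitmap is assumed to be a list of rows of (R, G, B) tuples, the 28-pixel window is assumed to start at (x,y) itself, and the names find_pattern, all_equal, GROUP_A and GROUP_B are made up for this example. Like the pseudo-code as written, the sketch does not require the two pixel groups to differ from each other.

```python
from typing import Optional, Sequence, Tuple

Pixel = Tuple[int, int, int]  # assumed (R, G, B) representation

WINDOW = 28
# Indices taken from the pseudo-code; the names are hypothetical.
GROUP_A = (0, 1, 2, 3, 4, 22, 23, 24, 25, 26)
GROUP_B = tuple(i for i in range(WINDOW) if i not in GROUP_A)  # 5..21 and 27


def all_equal(pixels: Sequence[Pixel], indices: Sequence[int]) -> bool:
    """True if the pixels at the given indices all share one color."""
    first = pixels[indices[0]]
    return all(pixels[i] == first for i in indices)


def find_pattern(bitmap: Sequence[Sequence[Pixel]]) -> Optional[Tuple[int, int]]:
    """Return (x, y) of the first window matching the pattern, else None."""
    for y, row in enumerate(bitmap):
        # Only consider positions where a full 28-pixel window fits.
        for x in range(len(row) - WINDOW + 1):
            pixels = row[x:x + WINDOW]
            if pixels[0] != pixels[1]:
                continue  # cheap early rejection
            if not all_equal(pixels, GROUP_A):
                continue
            if not all_equal(pixels, GROUP_B):
                continue
            return (x, y)
    return None
```

Because the checks only compare pixels with each other, the same function would work unchanged if the toolbar were rendered with slightly different colors.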

Close-Up of Google Toolbar
Figure 1: PageRank graphic

Once a match was found, the PageRank meter bar could be found easily, as it was at a fixed point relative to the pixel returned from the algorithm above. I then wrote another simple algorithm to determine the length of the dark green bar that runs along the bottom of the meter:

translate (x,y) to (x',y') such that
  (x',y') is the lower left-hand pixel of the meter bar
let green be the color of the pixel at (x',y')
move right along the bar until the pixel color is not equal to green
  or until the end of the bar is reached, whichever comes first
if such a pixel is found before the end of the bar
  then normalize the length of the bar to a PageRank and return the PR
else (the PageRank is 0, 10, or undefined)
  if the colors of (x',y') and (x',y'-1) match
    then (the PageRank is 0 or undefined)
      if the colors of (x',y') and (x',y'-5) match
        then return -1 which represents UNDEFINED
      else return 0
  else
    return 10
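
A minimal Python sketch of the bar-measuring steps above might look like this. Several details are assumptions, since the text does not give them: the offsets dx and dy from the match point to the bar, the bar width in pixels (bar_width), and a linear mapping from bar length to PageRank are all hypothetical, as is the function name read_pagerank. The bitmap is again assumed to be a list of rows of (R, G, B) tuples, with y increasing downward.

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]  # assumed (R, G, B) representation
UNDEFINED = -1


def read_pagerank(bitmap: Sequence[Sequence[Pixel]], x: int, y: int,
                  dx: int, dy: int, bar_width: int) -> int:
    """Estimate the PageRank from the meter bar near the match point (x, y).

    dx, dy and bar_width are hypothetical parameters; the real offsets and
    bar size are not stated in the text.
    """
    bx, by = x + dx, y + dy          # lower left-hand pixel of the meter bar
    green = bitmap[by][bx]

    # Walk right while the color stays the same, up to the end of the bar.
    length = 0
    while length < bar_width and bitmap[by][bx + length] == green:
        length += 1

    if length < bar_width:
        # Assumed linear normalization of bar length to the 0..10 scale.
        return round(10 * length / bar_width)

    # The whole bottom row is one color: PageRank is 0, 10, or undefined.
    if bitmap[by - 1][bx] == green:
        if bitmap[by - 5][bx] == green:
            return UNDEFINED
        return 0
    return 10
```

As in the pseudo-code, every decision is made by comparing pixels with one another, so no absolute color value appears anywhere in the function.
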
Once again, this algorithm works without using any absolute color values. By surfing among some of my favorite web sites, I was able to find a URL with every possible PageRank value except 1. For each URL, I took a screenshot of my desktop with an IE browser open to it, and I saved these bitmaps to use as a testbed for my code. Once my code passed all of the tests, I turned to the missing link: programmatically taking a screenshot of the IE browser I had launched and passing that screenshot as a bitmap to the pixel-scanning code I had just written to determine its PageRank.


References:
  1. Documentation for Bitmap on MSDN


©2004 Michael Bolin » bolinfest@gmail.com www.bolinfest.com