Since I had no experience with algorithms that match patterns in images,
I tried out a simple algorithm to find the PageRank graphic
(shown in Figure 1) in a bitmapped screenshot.
For my algorithm to work, I needed to access a screenshot as
an array of pixels, such as a bitmap. After some Google searching,
I finally found the kind of navigable documentation that I had been
pining for while struggling with the API in the previous step.
Here is the pseudo-code for the algorithm that operated on the pixel array:
    for each row y in the bitmap
        for each column x in the bitmap
            read the 28 pixels right of (x,y) into an array pixels
            if pixels[0] is different than pixels[1]
                then continue
            else if pixels 0,1,2,3,4,22,23,24,25,26 are not equal
                then continue
            else if pixels 5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,27 are not equal
                then continue
            else PATTERN MATCHED! return (x,y) where match occurs
The matching pixels are the ones under the blue line in Figure 1. I designed this algorithm to compare pixel colors against each other (rather than against an absolute color), in the hope that it would be more tolerant of color differences across platforms.
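The pseudo-code above can be sketched in Python roughly as follows. This is only an illustration, not the code I actually ran: it assumes the bitmap is a list of rows with bitmap[y][x] holding a comparable color value (such as an RGB tuple), and that the 28-pixel window starts at (x,y) itself. The names find_pattern, all_equal, GROUP_A and GROUP_B are made up for this example.

```python
# Pixel offsets within the 28-pixel window, taken from the pseudo-code.
GROUP_A = (0, 1, 2, 3, 4, 22, 23, 24, 25, 26)
GROUP_B = tuple(range(5, 22)) + (27,)
WINDOW = 28


def all_equal(pixels, indices):
    """True if every pixel at the given offsets has the same color."""
    first = pixels[indices[0]]
    return all(pixels[i] == first for i in indices)


def find_pattern(bitmap):
    """Return (x, y) of the first match, or None if nothing matches.

    Assumption: bitmap[y][x] is the color of the pixel at column x, row y.
    """
    for y, row in enumerate(bitmap):
        # Stop early enough that a full 28-pixel window fits in the row.
        for x in range(len(row) - WINDOW + 1):
            pixels = row[x:x + WINDOW]
            if pixels[0] != pixels[1]:
                continue  # cheap early rejection
            if not all_equal(pixels, GROUP_A):
                continue
            if not all_equal(pixels, GROUP_B):
                continue
            return (x, y)
    return None
```

Note that only equality within each group is tested, so no absolute color appears anywhere. As written, the pseudo-code would also accept a run of 28 identical pixels; the sketch keeps that behaviour rather than guessing at an extra check.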
Figure 1: PageRank graphic

Once a match was found, the PageRank meter bar could be found easily, as it was at a fixed point relative to the pixel returned from the algorithm above. I then wrote another simple algorithm to determine the length of the dark green bar that runs along the bottom of the meter:
    Translate (x,y) to (x',y') such that
        (x',y') is the lower left-hand pixel of the meter bar
    let green be the color of the pixel at (x',y')
    move right along the bar until the pixel color is not equal to green
        or until the end of the bar is reached, whichever comes first
    if such a pixel is found before the end of the bar
        then normalize the length of the bar to a PageRank and return the PR
    else
        then see if the PageRank is 0, 10, or undefined
        if the colors of (x',y') and (x',y'-1) match
            then see if the PageRank is 0 or undefined
            if the colors of (x',y') and (x',y'-5) match
                then return -1 which represents UNDEFINED
            else return 0
        else
            return 10
Once again, this algorithm works without using any absolute color values.
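Here is a minimal Python sketch of the bar-reading steps above, again for illustration only. It assumes the same bitmap[y][x] layout, that y grows downward so (x',y'-1) is the pixel directly above the bar, and that the bar length maps linearly onto 0 to 10. The function name read_pagerank and the bar_width parameter are hypothetical; the real bar width and the offset from the matched pixel are not given here, so the caller has to supply (x',y') and the width.

```python
UNDEFINED = -1


def read_pagerank(bitmap, x0, y0, bar_width):
    """Estimate the PageRank from the meter bar.

    (x0, y0) is the lower left-hand pixel of the bar, i.e. (x',y').
    bar_width is the full width of the bar in pixels (an assumed input).
    """
    green = bitmap[y0][x0]

    # Walk right until the color changes or the bar ends.
    length = 0
    while length < bar_width and bitmap[y0][x0 + length] == green:
        length += 1

    if length < bar_width:
        # Color changed partway along: scale the length to 0..10.
        # Linear scaling is an assumption, not a documented rule.
        return round(10 * length / bar_width)

    # The whole bar is one color: PageRank is 0, 10, or undefined.
    if bitmap[y0 - 1][x0] == green:
        if bitmap[y0 - 5][x0] == green:
            return UNDEFINED
        return 0
    return 10
```

For example, with a 40-pixel bar whose first 12 pixels share the color of (x',y'), read_pagerank returns 3.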
By browsing some of my favorite web sites, I was able to find
a URL for every possible PageRank value except 1.
For each URL, I took a screenshot of my Desktop with an IE browser open
to it. I saved these bitmaps so that I could use them as a testbed
for my code. Once my code passed all of the tests,
I decided to focus on the missing link:
being able to programmatically take a screenshot of the IE
browser I had launched and then pass that screenshot as a bitmap
to the pixel scanning code I had just written to determine its PageRank.
| ©2004 Michael Bolin » bolinfest@gmail.com | www.bolinfest.com |