Here’s a fun tool that lets you create your own heat maps of words: WordWhere. Choose any geographic location and see where words tend to cluster. The search is made against Flickr’s gigantic collection of geotagged photos, matching words in the photos’ titles, descriptions and tags.
The program’s strength is in finding geographic locations: try searching for “Boston” or “Chicago.” Searches for larger areas, like “Canada,” will focus on populated areas where people are posting geotagged photos. You can also get reasonable responses for things like “beach,” “alligator,” or “rodeo.” Words not associated with particular regions tend to be dominated by heavily populated areas, though it is still fun to search for more abstract words.
The program uses treemapping, basically a binary search. It keeps dividing a portion of the map into halves, trying to home in on where the most results are. You can watch the map update as it is divided into smaller rectangles, and you can control how many divisions are made. The greater the density of matching photos, the pinker a rectangle will appear.
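The subdivision can be sketched roughly as follows. This is an illustration, not WordWhere’s actual code: it assumes the rectangle with the most matches is always the next one to be split, that each split runs across the rectangle’s longer side, and that a caller-supplied `count_in` function (hypothetical, standing in for a Flickr query) returns the number of matches inside a rectangle. It queries both halves on every split, with no attempt to save calls.

```python
import heapq
from typing import Callable, List, Tuple

# Hypothetical representation: (west, south, east, north) in degrees.
Rect = Tuple[float, float, float, float]


def split(rect: Rect) -> Tuple[Rect, Rect]:
    """Halve a rectangle across its longer side (an assumed rule)."""
    west, south, east, north = rect
    if (east - west) >= (north - south):
        mid = (west + east) / 2
        return (west, south, mid, north), (mid, south, east, north)
    mid = (south + north) / 2
    return (west, south, east, mid), (west, mid, east, north)


def subdivide(
    root: Rect, count_in: Callable[[Rect], int], divisions: int
) -> List[Tuple[Rect, int]]:
    """Split the busiest rectangle `divisions` times; return (rect, count) pairs."""
    # heapq is a min-heap, so counts are negated to pop the largest first.
    heap = [(-count_in(root), root)]
    for _ in range(divisions):
        neg_count, rect = heapq.heappop(heap)
        if neg_count == 0:
            # Nothing left to refine: put the rectangle back and stop.
            heapq.heappush(heap, (neg_count, rect))
            break
        for half in split(rect):
            heapq.heappush(heap, (-count_in(half), half))
    return [(rect, -neg_count) for neg_count, rect in heap]
```

Each pass depends on the counts from the previous one, which is why the requests end up being made one at a time.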
As you hover over each rectangle, you’ll see the most interesting photo for that region, determined by Flickr’s mysterious algorithm for “interestingness.”
Though AJAX is asynchronous, I make only one API request at a time, mostly because I need those results to determine which rectangle to divide next. But this can result in pauses.
A binary search isn’t necessarily the best way to create a heat map, but I thought it would be fun to watch it in progress. Another way would be to simply collect all matching photos for the region and plot each photo’s location. But as the result set grows in size, this process would become more time-consuming. (And in any case, Flickr’s API will return no more than 500 images per call.) The binary search lets me quickly get approximate results for a large area.
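To put rough numbers on that trade-off, here is a small back-of-the-envelope sketch. The 500-per-call cap is the figure quoted above. The one-call-per-split cost is taken from the description of the halving trick later in this post. The function names are made up for illustration.

```python
import math

PER_CALL_LIMIT = 500  # maximum images per call, as quoted in the text


def calls_to_fetch_all(total_matches: int, per_call: int = PER_CALL_LIMIT) -> int:
    """API calls needed to page through every matching photo."""
    return math.ceil(total_matches / per_call)


def calls_to_subdivide(divisions: int) -> int:
    """One call for the starting rectangle, then one call per split (assumed)."""
    return 1 + divisions
```

Under these assumptions, plotting 120,000 matching photos individually would take `calls_to_fetch_all(120_000)` = 240 calls, while 50 divisions would take `calls_to_subdivide(50)` = 51, however many photos match.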
I display only the top image result for each rectangle. To save time, when I divide a rectangle into two halves, I make only one additional query to Flickr. I figure out which half the parent rectangle’s top result resides in, and then query the other half. I can then work out the first half’s total by subtracting its sibling’s total from the parent’s total. The Flickr API has a limit of 3,600 queries per hour, so this makes it less likely I’ll hit the limit.
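A minimal sketch of that bookkeeping, under a few assumptions: rectangles are `(west, south, east, north)` tuples, the parent’s top photo has a known `(longitude, latitude)`, and `count_in` is a hypothetical stand-in for the single Flickr query. The half that contains the parent’s top photo already has its top image, so only the other half needs to be queried.

```python
from typing import Callable, Tuple

Rect = Tuple[float, float, float, float]  # (west, south, east, north), assumed
Point = Tuple[float, float]               # (longitude, latitude), assumed


def contains(rect: Rect, point: Point) -> bool:
    """True if the point lies inside the rectangle (east/north edges excluded)."""
    west, south, east, north = rect
    lon, lat = point
    return west <= lon < east and south <= lat < north


def split_counts(
    parent_total: int,
    top_photo_at: Point,
    halves: Tuple[Rect, Rect],
    count_in: Callable[[Rect], int],
) -> Tuple[int, int]:
    """Return the totals for both halves while making only one query."""
    half_a, half_b = halves
    if contains(half_a, top_photo_at):
        # half_a keeps the parent's top photo, so query half_b only.
        count_b = count_in(half_b)
        return parent_total - count_b, count_b
    # Otherwise the top photo is taken to be in half_b, so query half_a only.
    count_a = count_in(half_a)
    return count_a, parent_total - count_a
```

The subtraction is only as accurate as the totals Flickr reports. If the parent’s total is an estimate, the derived half inherits that error.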
The “move map to” feature uses Google’s geocoder API to translate a text description into a point on the map. This feature is also asynchronous, so there can be a pause before the map moves.
Try out the WordWhere geographic word-search program, and share your comments and suggestions below.
See also Projecting Word Frequency onto Maps.