Category: programming

You may visit Aspen, Colorado to go skiing, but how do you get around when visiting in the summer? We-Cycle has introduced bikesharing to this mountain town, giving residents and tourists a new option for getting around. They recently shared some trip history data with me, letting me create this short animation of how cyclists travel between the 13 stations.

Bikesharing at a Ski Resort

Why do so many APIs offer geographic searches based on a single point and a radius, but not based on a bounding box using two points? Does your computer or mobile device have a round screen? If so, a radius search is perfect for you. But if your screen is rectangular, your search should be too. API designers need to wake up and realize that nobody has a round screen, and thus radius-based geo searches are vastly inferior to rectangular searches!

Here’s an example from Times Square. For a search at 42nd and Broadway, if you set the radius to be half the width of the screen, your search area would look like this green circle. On a square screen, the four corners outside the circle add up to 21% of the display (given a square display of width w, (w² – π×w²/4)/w² ≈ 21%). For rectangular screens, the share is higher. For example, putting a circle in a rectangle that’s twice as long as it is high means you are missing 61% of the display (1 – π/8 ≈ 61%). This means you might not be including results that the user would expect to see.
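
The arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes the circle is inscribed in the screen, so its diameter equals the shorter side, and the function name uncovered_fraction is made up for illustration.

```python
import math


def uncovered_fraction(width, height):
    """Fraction of a width x height screen left outside an inscribed circle.

    Assumption: the circle's diameter equals the shorter side of the screen.
    """
    radius = min(width, height) / 2
    circle_area = math.pi * radius ** 2
    return 1 - circle_area / (width * height)


print(f"{uncovered_fraction(1, 1):.0%}")  # square screen: 21%
print(f"{uncovered_fraction(2, 1):.0%}")  # twice as long as it is high: 61%
```

Only the ratio of the sides matters, so the units (pixels, miles, degrees) drop out.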

The San Francisco region has joined the bikesharing movement, with the introduction of Bay Area Bikeshare in August 2013. I wanted to see if I could adapt any of my CaBi tools for the “BABS” system, but their open data is too limited to be of much use. They have a System Metrics page which offers only ridership and membership data, which is not very interesting. To analyze the system we need trip history data, like Capital Bikeshare shares every quarter.

Luckily, I discovered Eric Fischer, who has been tracking station statuses since late August. Every minute, he records the number of available bikes and docks at each station. While not as valuable as trip history data, this data does let us discover when stations are either full or empty.
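
As a rough sketch of that idea, each per-minute snapshot can be labeled by looking at its two counts. The tuple layout and station names below are invented for illustration; they are not the format of Eric’s recordings or of the BABS feed.

```python
def classify_station(bikes, docks):
    """Label one station snapshot as 'empty', 'full', or 'ok'.

    A snapshot with zero bikes AND zero docks (for example, a station
    that is offline) is reported as 'empty' by this simple rule.
    """
    if bikes == 0:
        return "empty"
    if docks == 0:
        return "full"
    return "ok"


# Hypothetical per-minute snapshots: (minute, station, bikes, docks)
snapshots = [
    ("08:00", "Station A", 0, 15),
    ("08:00", "Station B", 11, 0),
    ("08:01", "Station A", 1, 14),
]

# Keep only the minutes where a station was unusable for riders.
problems = [
    (minute, station, classify_station(bikes, docks))
    for minute, station, bikes, docks in snapshots
    if classify_station(bikes, docks) != "ok"
]
print(problems)
```

Counting how many consecutive minutes a station stays in the same state would then give the length of each outage.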

The data he records is a copy of the system’s current station data. I had to reduce the size of the file by writing a Java program to remove redundancy and unnecessary fields. Still, storing data for a single day takes a megabyte of space, even for the condensed JSON file.

If you’re a total transit nerd, this will be exciting. To prepare for a bus-themed event for the Transportation Techies meetup group, we’re making APC data sets public. APC stands for automated passenger counter, an electronic device that counts people boarding and alighting. We’re sharing the data in hopes that local programmers will use it to create visualizations of how people use the bus.

2013-09 Raw Stop Data.xlsx is from Arlington Transit. It has 12 columns and 20,460 rows (1.2MB). The data is for weekdays in September 2013. I’ve created a CSV version, 2013-09 Raw Stop Data.csv. Here’s what 3 sample rows look like: » Continue Reading…

What is the minimum information you need when planning a trip on the Metro system? If all you want to see is which stations are connected, the Force Diagram of WMATA Metro Stations is the Metro map for you.

This visualization was designed using the JavaScript library D3, which includes the Force Layout design. I was inspired to do a version for Washington, DC after seeing Muyueh Lee’s visualization of the Taipei MRT system. You can click-and-drag stations to try to reposition them. The layout pays no attention to the geographic locations of the stations. The distribution starts off as a random mess, and then coalesces into positions based on simulating physical properties of the links between stations. This is an even-more-severe rendering than my isochronal Metro Distortion Map.

The code is relatively compact, and customizing it was a good way for me to learn D3. That’s the same tool I used to create the Voronoi Diagram of CaBi Stations and the interactive bar chart I used for Looking Back at 2013 CaBi Data.

A Bare-Minimum Metro Map

The Trip Visualizer has been updated with new data for Capital Bikeshare. Instead of just posting the new quarter, I made all of 2013 a single data set. That’s over two-and-a-half million trips (2,585,010 bikeouts) using 309 stations, summarized into a big origin-destination table.

Montgomery County joined the network in late September, introducing bikesharing to four regions in Maryland: Rockville, Bethesda/Chevy Chase, Silver Spring, and Takoma Park. 73% of bikeouts from Maryland went to other stations in Maryland, with 26% headed to DC. Just under 0.5% (21 trips out of 4,675) went from Maryland to Virginia. The fastest Maryland-to-Virginia ride took 33 minutes, from Friendship Heights Metro to Rosslyn Metro, a trip that takes 27 minutes on Metro. The longest MD/VA trip was 1 hour and 48 minutes, when someone biked from Crystal City Metro to Battery Lane in Silver Spring.

The Trip Visualizer lets you select a single station to see the most-significant trips to/from that station. You can use some hidden features to select clusters of stations, to examine networks. Hitting “M” will select all stations in Maryland. You can see how isolated Rockville is, with its closest station to Bethesda still over five miles away.

When I saw Dan Macy’s aerial photo of Washington, DC, I knew I had to turn it into the background for a map. Dan’s plane was flying into National Airport from New York City, and made an unusual entry over the Anacostia River, giving him a spectacular view of East Capitol Street, the Anacostia and Potomac Rivers, and the National Mall.

I’d been looking at the 2013 data from Capital Bikeshare, and made a little app that shows you the total number of bikeouts from each station, the Birds-eye view of CaBi stations. (A bikeout means someone rented a bike from that station.) The program was mostly an excuse to play with JavaScript and SVG graphics, without using a mapping or graphics library.

I had to figure out a not-too-complex way of mapping the latitude and longitude coordinates to the oblique photo. Unlike a map, the meridians (longitudes) aren’t vertical and the parallels (latitudes) aren’t horizontal. The bird’s-eye perspective also meant the meridians and parallels would be skewed.

Want to look back at 2013 using Capital Bikeshare data? I’ve put together an interactive tool to examine the 2013 daily ridership statistics for Capital Bikeshare. The data looks at daily “bikeout” totals, that is, how many bikes were checked out each day. You can summarize the data into weeks, months, quarters, and days of the week. The weekly view ignores December 31, in order to avoid having a 53rd week with only a single day in it.
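
One way to get exactly 52 weeks is sketched below. It assumes weeks are consecutive seven-day blocks starting on January 1, which is a guess: the tool may define its weeks differently. The function names are invented for this example.

```python
from datetime import date


def week_of_2013(day):
    """Return a 1-based week number (1..52) for a 2013 date, or None for December 31.

    Assumption: week 1 starts on January 1 and every week is 7 days long.
    """
    day_index = (day - date(2013, 1, 1)).days  # 0 for January 1
    if day_index >= 364:  # December 31 would begin a one-day 53rd week
        return None
    return day_index // 7 + 1


def weekly_totals(daily):
    """Sum a {date: bikeouts} mapping into a list of 52 weekly totals."""
    totals = [0] * 52
    for day, bikeouts in daily.items():
        week = week_of_2013(day)
        if week is not None:
            totals[week - 1] += bikeouts
    return totals
```

With 365 days in 2013, the first 364 fill 52 full weeks and only December 31 is left over.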

You can compare the difference between bikeouts from subscribers (those with memberships for a month or a year) and casual riders (those with memberships for 1 or 3 days). You can also look at bikeout stats for any of the 306 individual stations.

The program lets you find the correlation between any two data sets. You can use a second data set to color the bars of the first data set. The correlation is automatically calculated. It ranges from 1, a perfect positive correlation, to -1, a perfect negative correlation. A value of 0 means there is no linear correlation.
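
For readers who want to reproduce a number like that, here is a minimal sketch of the Pearson correlation coefficient, the usual measure with this -1 to 1 range. That the tool uses Pearson’s formula is an assumption, and the function name is made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 values each")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(spread_x * spread_y)
```

Feeding it two lists of daily bikeout totals, for example subscribers and casual riders over the same days, returns a single value between -1 and 1.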

Looking Back at 2013 CaBi Data

What were the best photos of 2013? Flickr’s “interestingness” rating offers one way to judge the best, based on a special sauce made up of the number of views, faves, and comments. I wrote Flickr Calend’r for discovering the best pix of the year, breaking it down into the best of each month, and displaying the results in a calendar format. The Flickr Calend’r lets you narrow down which photos to search, filtering by user (or group), a search term, a WOE code, or others.

Oh, yeah, “WOE code” – not a commonly-known term. A Yahoo invention, which they normally call WOE IDs, for Where On Earth identifiers. Read Using “Where on Earth” Codes in Flickr for more info. This of course works only on photos that have been geo-tagged, so using this field will greatly reduce the available pool of photos.

You can leave any of these fields blank. When you hit “get calendar” it’ll fire off a request to search each month. This takes a while. I wrote a proxy script in PHP to handle the Flickr API calls, and of course this calendar requires 12 separate calls. I’m not sure if the problem is Flickr’s slow servers, or my own.
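
The twelve calls correspond to twelve date windows, one per month. The sketch below only builds those windows with the standard library; it makes no Flickr API calls, and the helper name month_ranges is hypothetical.

```python
import calendar
from datetime import date


def month_ranges(year):
    """Return 12 (first_day, last_day) pairs, one per calendar month."""
    ranges = []
    for month in range(1, 13):
        # monthrange() returns (weekday of the 1st, number of days in the month)
        last_day = calendar.monthrange(year, month)[1]
        ranges.append((date(year, month, 1), date(year, month, last_day)))
    return ranges


for start, end in month_ranges(2013):
    print(start.isoformat(), "to", end.isoformat())
```

Each pair would become the date limits of one search request.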

I did something weird with the Twitter API. A while back I used tweets to form the word puzzles for a Hangman game. You can feed the puzzles from a user’s stream by putting the user’s Twitter handle in the URL, like ?twitter=mvs202. Oddly enough, this cool game never went viral, and it sat unloved and unplayed for a long time. When I tried it myself recently, it had stopped working. Here’s what happened and how I fixed it.

The original Twitter API did not require authentication for the most basic searches. It handled API calls just like it handled web searches, so the API version of a search URL was simply a variant of the web one. This worked great back when I wrote Game-ifying Twitter with Hangman two years ago. The new API requires developers to sign up. Plus, the response format has been totally changed to improve object consistency.

I wrote the game before discovering the joys of AJAX. Back then I consumed synchronous APIs by making the whole program in PHP. That means that the web page doesn’t render until the API call is complete so the server can wrap up the HTML code. Not a problem with Twitter, which has a speedy API, but still. The bigger problem is the readability of my own code. Mingling HTML and PHP makes for a difficult read, made worse by having “echo” statements generating HTML and JavaScript on the fly.

Updating the API Approach