What is OpenTimes?

OpenTimes lets you download bulk travel data for free and with no limits. It is a database of around 20 billion point-to-point travel times and distances between United States Census geographies.

All times are calculated using open-source software from publicly available data. The OpenTimes data pipelines, infrastructure, packages, and website are all open-source and available on GitHub.

Goals

The primary goal of OpenTimes is to enable research by providing an accessible and free source of bulk travel data. The target audience is academics, urban planners, or really anyone who needs to quantify spatial access to resources (e.g. how many grocery stores someone can reach in an hour).
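To make the grocery store example concrete, here is a minimal sketch of an access count built from a travel time table. The function name, the tuple layout, and the sample rows are all hypothetical and only illustrate the idea; they are not the OpenTimes schema.

```python
from collections import defaultdict


def count_reachable(travel_times, amenities, max_minutes=60.0):
    """Count amenities reachable from each origin within max_minutes.

    travel_times: iterable of (origin_id, destination_id, minutes) tuples.
    amenities: dict mapping destination_id -> number of amenities located there.
    Both layouts are assumptions made for this sketch.
    """
    counts = defaultdict(int)
    for origin_id, destination_id, minutes in travel_times:
        if minutes <= max_minutes:
            # Destinations with no amenities contribute zero.
            counts[origin_id] += amenities.get(destination_id, 0)
    return dict(counts)


# Made-up travel times (in minutes) between four made-up geographies.
rows = [
    ("A", "B", 25.0),
    ("A", "C", 75.0),
    ("A", "D", 40.0),
    ("B", "C", 10.0),
]
# Made-up grocery store counts per destination.
stores = {"B": 2, "C": 1, "D": 3}

reachable = count_reachable(rows, stores)
print(reachable)
```

Here origin A can reach the stores in B and D but not those in C, which is more than an hour away.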

The secondary goal is to provide a free alternative to paid travel time/distance matrix products such as Google’s Distance Matrix API, Esri’s Network Analyst tool, and traveltime.com. However, note that OpenTimes is not exactly analogous to these services, which often do different and/or more sophisticated things (e.g. incorporating traffic and/or historical times, or performing live routing).

FAQs

This section focuses on the what, why, and how of the overall project. For more specific questions about the data (i.e. its coverage, construction, and limitations), see the Data section.

General questions

What is a travel time?

In this case, a travel time is just how long it takes to get from location A to location B when following a road or path network. Think Google Maps or your favorite smartphone mapping service. OpenTimes provides billions of these times, all pre-calculated from public data. It also provides the distance traveled for each time, though unlike a smartphone map, it does not provide the route itself.

What are the times between?

Times are between the population-weighted centroids of United States Census geographies. See Data for a full list of geographies. Centroids are weighted because sometimes Census geographies are huge and their unweighted centroid is in the middle of a desert or mountain range. However, most people don’t want to go to the desert; they want to go to where other people are. Weighting the centroids moves them closer to where people actually want to go (i.e. towns and cities).
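As a rough illustration of why weighting matters, the sketch below compares an unweighted and a population-weighted centroid for three made-up sub-areas of one geography. It averages longitude and latitude directly, which is a planar simplification and an assumption of this example, not necessarily how the centroids used by OpenTimes are derived.

```python
def weighted_centroid(points):
    """Return the (lon, lat) centroid of points weighted by population.

    points: list of (lon, lat, population) tuples. Averaging raw
    coordinates is a simplification that is reasonable for small areas.
    """
    total = sum(pop for _, _, pop in points)
    if total == 0:
        raise ValueError("total population must be positive")
    lon = sum(x * pop for x, _, pop in points) / total
    lat = sum(y * pop for _, y, pop in points) / total
    return lon, lat


def unweighted_centroid(points):
    """Return the plain average (lon, lat) of the points."""
    n = len(points)
    return sum(x for x, _, _ in points) / n, sum(y for _, y, _ in points) / n


# Made-up sub-areas: an empty desert block, a town, and a small settlement.
blocks = [
    (-115.0, 36.0, 0),
    (-115.2, 36.2, 900),
    (-114.8, 35.8, 100),
]

plain = unweighted_centroid(blocks)
weighted = weighted_centroid(blocks)
print(plain, weighted)
```

The unweighted centroid lands on the empty block, while the weighted one is pulled most of the way toward the town of 900 people.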

What travel modes are included?

Currently, driving, walking, and biking are included. I plan to include transit once Valhalla (the routing engine OpenTimes uses) incorporates multi-modal costing into their matrix API.

Are the travel times accurate?

Kind of. They’re accurate relative to the other times in this database (i.e. they are internally consistent), but may not align perfectly with real-world travel times. Driving times tend to be especially optimistic (faster than the real world). My hope is to continually improve the accuracy of the times through successive versions.

Why are the driving times so optimistic?

Currently, driving times do not include traffic. This has a large effect in cities, where traffic greatly influences driving times. Times there tend to be at least 10-15 minutes too fast. It has a much smaller effect on highways and in more rural areas. Traffic data isn’t included because it’s pretty expensive and adding it might limit the open-source nature of the project.

The time between A and B is wrong! How can I get it fixed?

Please file a GitHub issue. However, understand that given the scale of the project (billions of times), the priority will always be on fixing systemic issues in the data rather than fixing individual times.

Technology

For a more in-depth technical overview of the project, visit the OpenTimes GitHub page.

What input data is used?

OpenTimes currently uses three major data inputs:

OpenStreetMap data. Specifically, the yearly North America extracts from Geofabrik.
Elevation data. Automatically downloaded by Valhalla. Uses the public Amazon Terrain Tiles.
Origin and destination points. Derived from the centroids of U.S. Census TIGER/Line data.

Input and intermediate data are built and cached by DVC. The total size of all input and intermediate data is around 750 GB. In the future, OpenTimes will also use GTFS data for public transit routing.

How do you calculate the travel times?

All travel time calculations require some sort of routing engine to determine the optimal path between two locations. OpenTimes uses Valhalla because it’s fast, has decent Python bindings, can switch settings on the fly, and has a low memory/resource footprint.

U.S. states are used as the unit of work. For each state, I load all the input data (road network, elevation, etc.) for the state plus a 300km buffer around it. I then use the Valhalla Matrix API to route from each origin in the state to all destinations in the state plus the buffer area.
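To give a sense of what a matrix request looks like, here is a minimal sketch that builds a request body in the shape Valhalla's matrix (sources_to_targets) API accepts and flattens a response into rows. The helper names, the sample points, and the mock response are hypothetical, and the buffer selection step is not shown. This is not the actual OpenTimes pipeline code.

```python
import json


def build_matrix_request(origins, destinations, costing="auto"):
    """Build a Valhalla-style matrix request body.

    origins / destinations: lists of (geoid, lat, lon) tuples (layout assumed).
    costing: a Valhalla costing model such as "auto", "bicycle", or "pedestrian".
    """
    return {
        "sources": [{"lat": lat, "lon": lon} for _, lat, lon in origins],
        "targets": [{"lat": lat, "lon": lon} for _, lat, lon in destinations],
        "costing": costing,
    }


def flatten_matrix_response(response, origins, destinations):
    """Turn a matrix response into (origin, destination, minutes, distance) rows."""
    rows = []
    for source_row in response["sources_to_targets"]:
        for cell in source_row:
            if cell["time"] is None:
                continue  # no route was found for this pair
            rows.append((
                origins[cell["from_index"]][0],
                destinations[cell["to_index"]][0],
                cell["time"] / 60.0,  # Valhalla reports time in seconds
                cell["distance"],     # kilometers unless other units are requested
            ))
    return rows


# Made-up points; the third destination stands in for an unreachable location.
origins = [("17031", 41.84, -87.82)]
destinations = [("17031", 41.84, -87.82), ("17043", 41.85, -88.09), ("99999", 0.0, 0.0)]

request_body = build_matrix_request(origins, destinations)
print(json.dumps(request_body))

# Mock response in the documented shape, not real routing output.
mock_response = {"sources_to_targets": [[
    {"from_index": 0, "to_index": 0, "time": 0, "distance": 0.0},
    {"from_index": 0, "to_index": 1, "time": 1500, "distance": 23.4},
    {"from_index": 0, "to_index": 2, "time": None, "distance": None},
]]}
rows = flatten_matrix_response(mock_response, origins, destinations)
print(rows)
```

In a real run, the destinations list would hold every point in the state plus the 300km buffer, and the request would be sent to a Valhalla instance rather than answered by a mock.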

What do you use for compute?

Travel times are notoriously compute-intensive to calculate at scale, since they basically require running a shortest path algorithm many times over a very large network. However, they’re also fairly easy to parallelize since each origin can be its own job, independent from the other origins.

I use a combination of GitHub Actions and a beefy home server to calculate the times for OpenTimes. On GitHub Actions, I use a workflow-per-state model, where each state runs in a parameterized workflow that splits the work into many smaller jobs that run in parallel. This works surprisingly well and lets me calculate tract-level times for the entire U.S. in about a day.
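The splitting idea can be sketched as follows: divide one state's origins into fixed-size chunks and process each chunk independently. The chunk size, the stand-in routing function, and the use of a local thread pool are assumptions for illustration only; the real jobs run as GitHub Actions workflows rather than threads.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_origins(origin_ids, chunk_size):
    """Split a list of origin IDs into consecutive chunks of at most chunk_size."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        origin_ids[i:i + chunk_size]
        for i in range(0, len(origin_ids), chunk_size)
    ]


def route_chunk(chunk):
    """Stand-in for a routing job; returns how many origins it handled."""
    return len(chunk)


# Ten made-up origin IDs for one state.
origin_ids = [f"origin-{i}" for i in range(10)]
chunks = chunk_origins(origin_ids, chunk_size=4)

# Each chunk is independent of the others, so the jobs can run concurrently.
with ThreadPoolExecutor(max_workers=3) as pool:
    handled = list(pool.map(route_chunk, chunks))

print(handled)
```

Because no chunk depends on another, adding more workers shortens the wall-clock time without changing the results.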

I built OpenTimes during a 6-week programming retreat at the Recurse Center, which I highly recommend.

Why did you build this?

A few reasons:

Bulk travel times are really useful for quantifying access to amenities. In academia, they’re used to measure spatial access to primary care, abortion, and grocery stores. In industry, they’re used to construct indices for urban amenity access and as features for predictive models for real estate prices.
There’s a gap in the open-source spatial ecosystem. The number of open-source routing engines, spatial analysis tools, and web mapping libraries has exploded in the last decade, but bulk travel times are still difficult to get and/or expensive.
It’s a fun technical challenge to calculate and serve billions of records.
I was inspired by the OpenFreeMap project and wanted to use my own domain knowledge to do something similar.
