Uberdata: Building the Perfect Uber Party City

What up humans?! Bradley Voytek again with another round of (hopefully) interesting #uberdata, asking a few questions today:

“How is San Francisco’s Financial District like New York?” and “What neighborhood tells us the most about DC’s lifestyle?”

And hopefully providing a few data-derived answers.

Obviously the Uber nerd-collective are into maps. I mean, mapping is kind of important to what we’re building at Uber. But as a brain guy I usually don’t think in terms of space as much as I do in terms of time: temporal correlations, autoregressive models, causal relationships, time-frequency analyses, and so on. What’s happening in the brain and when.

Today, I’ll be talking about what Uber’s temporal patterns of demand tell us about the neighborhoods of the cities we service. Maybe it will be easier to just show you what I’m talking about. This is what San Francisco’s demand looks like, broken down by hour of week:

Uberdata: San Francisco demand curve

Now compare that to New York:

Uberdata: San Francisco v. New York demand curve

Right away you can see there’s something different; the differences aren’t huge, but they’re there.

Our ridership in New York is more heavily skewed toward weekdays whereas San Francisco demand jumps up on weekends.
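For the curious, here is roughly how an hour-of-week demand curve like the ones above could be built. This is a minimal sketch, not the actual Uber pipeline: the function names, the sample timestamps, and the choice to normalize each curve so it sums to 1 (so cities of different sizes can be compared) are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime

HOURS_PER_WEEK = 7 * 24  # 168 bins, Monday 00:00 through Sunday 23:00


def hour_of_week(ts: datetime) -> int:
    """Map a timestamp to its hour-of-week bin (0 = Monday midnight)."""
    return ts.weekday() * 24 + ts.hour


def demand_curve(request_times):
    """Count requests per hour-of-week bin, normalized to sum to 1.

    Normalizing is an assumption: it lets a small city and a large city
    be compared by shape rather than by raw volume.
    """
    counts = Counter(hour_of_week(ts) for ts in request_times)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * HOURS_PER_WEEK
    return [counts.get(h, 0) / total for h in range(HOURS_PER_WEEK)]


# Hypothetical request timestamps, for illustration only.
requests = [
    datetime(2012, 9, 7, 23, 15),  # a Friday night
    datetime(2012, 9, 7, 23, 40),  # same Friday night
    datetime(2012, 9, 10, 8, 5),   # a Monday morning
]
curve = demand_curve(requests)
```

With real data you would feed in every request timestamp for a city (or a neighborhood) and plot the resulting 168 values.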

Of course, now that we’ve been in business for two years and are still growing like crazy, we can get much more granular. Instead of looking at differences between cities, we can start to look at differences between neighborhoods. 271 of them across nine major US cities, to be exact:

Boston
Chicago
DC
LA
New York
Philadelphia
San Diego
San Francisco
Seattle

Now that we can get more fine-grained we can begin to observe some pretty clear neighborhood-by-neighborhood differences. Again, a nerd-picture is worth a thousand nerd-words, so have a look at two neighborhoods in San Francisco — the Mission and the Financial District:

Uberdata: San Francisco Mission v Financial District demand curves

Check out how daily demand in the Mission peaks later in the day–after work hours–whereas demand in the Financial District peaks toward the end of the work day. The big difference, of course, is that the Mission has a lot more demand on Saturdays.

Now look at how San Francisco’s Financial District compares to New York’s Financial District:

Uberdata: San Francisco v New York Financial Districts demand curves

I love this stuff!

San Francisco’s Financial District is more Manhattan-like than it is San Francisco-like!

(And, of course, by “more like” I mean in terms of Uber demand, which serves as an index of activity within that neighborhood.)

In fact, we can quantify how <city>-like or not <city>-like any given neighborhood is. That is, we can ask, “how San Francisco-like is the Mission, really?” and “how much more like New York is the Financial District than it is San Francisco?”

And we can do this for every neighborhood. What do we find?

Cities have “stereotypical” neighborhoods that very closely match the flow of their home cities, and some neighborhoods just don’t really seem to belong to their home city. They’re outliers.

“But wait a minute!” you might say, donning your +3 internet troll-hat of one-upmanship, “you’re correlating a variable with another variable that includes the first! If one neighborhood contributes more overall power to the signal of the city average, of course it will correlate with it better!”

“By Jove!” I might retort if I didn’t worry about this crap so much already. “You’re right! Thank you Dr. Needs-to-show-the-internet-how-smart-I-am.”

So yeah. I corrected for that so as not to get you all riled up. Happy now Internet Math Patrol? You made me write 3 more lines of code.

The concern here is that some neighborhoods have more demand and thus contribute more to the overall city demand. One way to address this is to correlate each neighborhood’s demand curve with the city’s demand curve computed with that neighborhood’s own contribution removed. Which, as you can tell from the imaginary argument in my head that I just subjected you all to, is what I’ve done.
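If you want to see what that correction might look like, here is a minimal sketch using NumPy. It is not my actual three lines: the function name, the choice of Pearson correlation, and the synthetic data are assumptions for illustration. The idea is simply to subtract each neighborhood's curve from the city total before correlating, so a big neighborhood can't look "typical" just by correlating with itself.

```python
import numpy as np


def leave_one_out_similarity(neighborhood_demand):
    """Correlate each neighborhood with the rest of its city.

    neighborhood_demand: array-like of shape (n_neighborhoods, n_hours).
    Returns one Pearson r per neighborhood, computed against the city
    total with that neighborhood's own demand subtracted out.
    """
    demand = np.asarray(neighborhood_demand, dtype=float)
    city_total = demand.sum(axis=0)
    scores = []
    for curve in demand:
        rest_of_city = city_total - curve  # remove this neighborhood's effect
        scores.append(np.corrcoef(curve, rest_of_city)[0, 1])
    return np.array(scores)


# Synthetic city (made-up numbers): two neighborhoods that follow the same
# daily rhythm and one small neighborhood that runs out of phase with them.
rng = np.random.default_rng(0)
hours = np.arange(168)
daily = 1 + np.sin(2 * np.pi * hours / 24)
typical_a = 50 * daily + rng.normal(0, 1, 168)
typical_b = 30 * daily + rng.normal(0, 1, 168)
outlier = 5 * (1 + np.cos(2 * np.pi * hours / 24)) + rng.normal(0, 1, 168)

scores = leave_one_out_similarity([typical_a, typical_b, outlier])
```

In this toy city the two in-phase neighborhoods score high and the out-of-phase one scores near zero, which is the "like" versus "unlike" distinction used in the lists that follow.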

The most stereotypically “like” neighborhood for each city is:
• San Francisco: North Beach
• New York: Chelsea
• Seattle: Capitol Hill
• Chicago: Near North Side
• Boston: Back Bay – Beacon Hill
• DC: Dupont Circle
• LA: Mid-City West

Now, in contrast…

The most stereotypically “unlike” neighborhood for each city is:
• San Francisco: Crocker Amazon
• New York: Washington Heights
• Seattle: South Park
• Chicago: Montclare
• Boston: West Roxbury
• DC: Deanwood
• LA: Southeast LA

We can also extract “types” of demand curves: are there neighborhoods that are more active on weekends and others that are clearly work-week hotspots? One simple mathematical technique for identifying stereotyped patterns in data is principal component analysis (PCA). The details aren’t too important, so let’s just jump to the results: there are two “types” of demand curves that account for 93% of the variance in overall demand. Here’s what they look like:

Uberdata: PCA demand curves weekend/weekday

Essentially you’ve got one rising demand curve that peaks on evenings and Friday and Saturday nights (red) and one workday/workweek curve that diminishes on weekends (blue). We can then ask, for each city, which neighborhood is the most “weekend-like” and which is the most “weekday-like” (that is, how strongly does each neighborhood correlate with each of these two curves)?
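For those who do care about the details, here is a rough sketch of the approach using NumPy's SVD. Everything about the setup is an assumption on my part for illustration: rows are neighborhoods, columns are the 168 hours of the week, the average curve is subtracted before decomposing, and the data is synthetic. The real analysis may have been set up differently. Also note that the sign of a principal component is arbitrary, so a "weekend" curve can come out flipped.

```python
import numpy as np


def principal_curves(demand, n_components=2):
    """PCA via SVD on a (n_neighborhoods, n_hours) demand matrix.

    Returns the top component curves (one row per component) and the
    fraction of variance each one explains.
    """
    X = np.asarray(demand, dtype=float)
    centered = X - X.mean(axis=0)  # remove the average demand curve
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return vt[:n_components], explained[:n_components]


def correlate_with(curves, template):
    """Pearson r between each neighborhood curve and a template curve."""
    rows = np.asarray(curves, dtype=float)
    return np.array([np.corrcoef(row, template)[0, 1] for row in rows])


# Synthetic neighborhoods: random mixes of a workday pattern and a
# Friday/Saturday-night pattern, plus a little noise (all made up).
rng = np.random.default_rng(1)
hours = np.arange(168)
hour_of_day, day = hours % 24, hours // 24  # day 0 = Monday
workday = ((day < 5) & (hour_of_day >= 8) & (hour_of_day < 18)).astype(float)
weekend_night = ((day >= 4) & (day <= 5) & (hour_of_day >= 20)).astype(float)
weights = rng.uniform(0, 1, size=(20, 2))
demand = weights @ np.vstack([workday, weekend_night])
demand += rng.normal(0, 0.01, size=demand.shape)

components, explained = principal_curves(demand)
scores_pc1 = correlate_with(demand, components[0])
most_aligned = int(np.argmax(np.abs(scores_pc1)))  # index of best match
```

Ranking neighborhoods by their correlation with each component is what produces "most weekend-like" and "most weekday-like" picks.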

So if we could build the perfect “party city” consisting only of the neighborhoods from each city that correlate most with the weekend curve, this is what it would look like:
• San Francisco: North Beach
• New York: SoHo
• Seattle: First Hill
• Chicago: Near North Side
• Boston: South Boston
• DC: Dupont Circle
• LA: Santa Monica

And now again, in contrast…

The lame all work/no play city would be:
• San Francisco: Financial District
• New York: Garment District
• Seattle: Overlake
• Chicago: O’Hare
• Boston: East Boston
• DC: Deanwood
• LA: Westchester

But this is looking at how neighborhoods relate to cities. What about how they relate to one another? Well, given that we’re working with 271 neighborhoods, we’re talking about running 36,585 pairwise correlations (271 × 270 / 2), which is messy to display. So I’ve pared the data down to just the strongest relationships, which you can play around with by clicking the image below.

interactive plot

built with d3.js
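If you want to check the arithmetic or try the pruning yourself, here is a small sketch. The cutoff of 0.9 and the function name are assumptions for illustration; the exact rule used to pick the "strongest" relationships for the plot isn't spelled out here.

```python
from itertools import combinations
from math import comb

import numpy as np

# Number of distinct neighborhood pairs: 271 choose 2.
n_pairs = comb(271, 2)  # 271 * 270 / 2 = 36585


def strongest_pairs(demand, names, threshold=0.9):
    """Return (name_a, name_b, r) for pairs with |r| >= threshold.

    demand: array-like of shape (n_neighborhoods, n_hours).
    threshold: hypothetical cutoff, not the one used for the plot.
    """
    corr = np.corrcoef(np.asarray(demand, dtype=float))
    pairs = [
        (names[i], names[j], float(corr[i, j]))
        for i, j in combinations(range(len(names)), 2)
        if abs(corr[i, j]) >= threshold
    ]
    # Strongest relationships first.
    return sorted(pairs, key=lambda p: -abs(p[2]))


# Tiny made-up example with three placeholder neighborhoods.
toy_demand = [
    [1, 2, 3, 4],
    [2, 4, 6, 8],  # same shape as the first, just busier
    [4, 1, 3, 2],
]
links = strongest_pairs(toy_demand, ["A", "B", "C"])
```

Only the A and B pair survives the cutoff in the toy example, which is the same kind of thinning that keeps a 271-neighborhood graph readable.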

Of course this is all academic. The thing that makes cities like San Francisco great is their diversity. I’m sure living in Uber Party City (UPC) would eventually have to get old. Right?
