S.Dasgupta Big Data Energy:

Women Yes Women Tech

Proposal

This project addresses a question posed by the Women Yes Women Tech organization: to generate interest in and awareness of an annual gala, at which NYC subway stations should street teams be placed to collect email addresses from potential attendees? The event will be held in early summer, and people who sign up for the gala will receive free tickets.

New York’s Metropolitan Transportation Authority (MTA) provides public transportation to millions of people every year. The network records turnstile data at its stations, which makes it possible to track station usage over the past couple of years.

Data: MTA Turnstile data

Strategy

To capture a timeframe representative of when street-team outreach should happen this year, the analysis used data from March 30 to April 19, 2019. Outreach in this window leaves ample time to recruit attendees before an early summer (~June) event.
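As a rough illustration of how the date window maps onto data files, the sketch below lists the weekly file stamps that would cover it. It assumes the turnstile data is published as weekly files stamped with a Saturday, each covering the preceding Saturday through Friday; that layout and the helper name `weekly_file_dates` are assumptions, not details stated in this project.

```python
from datetime import date, timedelta


def weekly_file_dates(start, end):
    """Return the Saturday stamps of the weekly files overlapping [start, end].

    Assumption: a file stamped with Saturday S covers S-7 days through S-1.
    """
    # Monday is 0 and Saturday is 5; find the first Saturday strictly after `start`.
    days_to_saturday = (5 - start.weekday()) % 7
    current = start + timedelta(days=days_to_saturday or 7)

    stamps = []
    # Keep a file as long as the first day it covers is not past `end`.
    while current - timedelta(days=7) <= end:
        stamps.append(current)
        current += timedelta(days=7)
    return stamps


stamps = [d.strftime("%y%m%d") for d in weekly_file_dates(date(2019, 3, 30), date(2019, 4, 19))]
print(stamps)  # ['190406', '190413', '190420']
```

Under that assumption, the March 30 to April 19 window corresponds to exactly three weekly files.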

Methodology

In this exploratory data analysis project, the Python Data Analysis Library (pandas) was used to clean and sort the MTA turnstile data. Visualizations were then created with Matplotlib.
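A minimal sketch of what the cleaning step might look like is shown below; it is not the project's actual code. It assumes the raw table has the columns `C/A`, `UNIT`, `SCP`, `STATION`, `DATE`, `TIME`, `ENTRIES`, and `EXITS` (possibly with stray whitespace in the header), and that `ENTRIES` is a running counter per turnstile, so per-interval traffic has to be recovered by differencing. The function name, the `NEW_ENTRIES` column, the `max_per_reading` cutoff, and the sample rows are all hypothetical.

```python
import pandas as pd


def clean_turnstile(df, max_per_reading=10_000):
    """Turn running turnstile counters into per-interval entry counts."""
    df = df.copy()
    # Header names may carry trailing whitespace (assumption about the raw files).
    df.columns = df.columns.str.strip()
    df["DATETIME"] = pd.to_datetime(
        df["DATE"] + " " + df["TIME"], format="%m/%d/%Y %H:%M:%S"
    )

    # One physical turnstile is identified by these four columns (assumption).
    turnstile = ["C/A", "UNIT", "SCP", "STATION"]
    df = df.drop_duplicates(subset=turnstile + ["DATETIME"])
    df = df.sort_values(turnstile + ["DATETIME"])

    # Difference consecutive readings within each turnstile.
    df["NEW_ENTRIES"] = df.groupby(turnstile)["ENTRIES"].diff()

    # Drop the first reading of each turnstile (NaN), counter resets
    # (negative values), and implausibly large jumps.
    valid = df["NEW_ENTRIES"].between(0, max_per_reading)
    return df[valid].reset_index(drop=True)


# Illustrative rows only; the values are made up.
sample = pd.DataFrame({
    "C/A": ["A002"] * 3 + ["R001"] * 2,
    "UNIT": ["R051"] * 3 + ["R101"] * 2,
    "SCP": ["02-00-00"] * 3 + ["00-00-00"] * 2,
    "STATION": ["STATION A"] * 3 + ["STATION B"] * 2,
    "DATE": ["03/30/2019"] * 5,
    "TIME": ["00:00:00", "04:00:00", "08:00:00", "00:00:00", "04:00:00"],
    "ENTRIES": [1000, 1020, 1050, 500, 5],   # second turnstile resets
    "EXITS   ": [300, 310, 325, 100, 2],
})
cleaned = clean_turnstile(sample)
print(cleaned[["STATION", "DATETIME", "NEW_ENTRIES"]])
```

In this sample, only the two valid intervals for the first turnstile survive; the counter reset on the second turnstile is discarded rather than counted as negative traffic.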

Review code here

Initial Analyses

The top twenty stations were ranked by cumulative entries into each subway station during this time period.
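The ranking described here can be sketched with a pandas group-and-sum. This assumes a cleaned table with a `STATION` column and a hypothetical `NEW_ENTRIES` column holding per-interval entry counts; the function name and the sample rows are illustrative only.

```python
import pandas as pd


def top_stations(df, n=20):
    """Rank stations by total entries over the period and keep the top n."""
    return (
        df.groupby("STATION")["NEW_ENTRIES"]
        .sum()
        .sort_values(ascending=False)
        .head(n)
    )


# Illustrative rows only; station names and counts are made up.
counts = pd.DataFrame({
    "STATION": ["A", "B", "A", "C", "B", "A"],
    "NEW_ENTRIES": [10, 5, 20, 7, 1, 3],
})
top = top_stations(counts, n=2)
print(top)
```

Summing by station name alone merges every turnstile, and every distinct station that happens to share a name, into one total, so a real analysis may need a finer key than `STATION`.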

Top 20 Stations

Some stations were then excluded for reasons such as high tourist traffic or a location in a residential area.

Excluded stations

Technology company offices are concentrated in the Financial District (FiDi), SoHo, and Midtown.

Tech areas

Combining this information, the recommended stations to place street teams are:

Recommended Stations

Stations to place street teams:

  1. 34th Street-Penn Station
  2. Grand Central-42nd Street
  3. 34th Street-Herald Square
  4. 14th Street-Union Square
  5. Fulton Street
  6. 42nd Street-Port Authority
  7. 59th Street-Columbus Circle
  8. Canal Street
  9. World Trade Center (PATH)
  10. 47th-50th Streets-Rockefeller Center
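
The selection logic described above (rank by traffic, drop excluded stations, keep stations near tech offices) can be sketched as a simple filter. The function name and all station labels below are hypothetical placeholders; the project's actual exclusion list and tech-area mapping are not reproduced here.

```python
def recommend_stations(ranked, excluded, tech_area_stations, n=10):
    """Keep the busiest stations that are not excluded and sit in a tech area.

    `ranked` is a list of station names ordered from most to least traffic.
    """
    keep = [
        station
        for station in ranked
        if station not in excluded and station in tech_area_stations
    ]
    return keep[:n]


# Placeholder inputs for illustration only.
ranked = ["STATION A", "STATION B", "STATION C", "STATION D", "STATION E"]
excluded = {"STATION B"}                                     # e.g. tourist-heavy
tech_area_stations = {"STATION A", "STATION B", "STATION D", "STATION E"}

recommended = recommend_stations(ranked, excluded, tech_area_stations, n=3)
print(recommended)  # ['STATION A', 'STATION D', 'STATION E']
```

Because the filter preserves the traffic ranking, the output order reflects priority for placing street teams.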