1.2 Metrics & Feature Selection
While the profit margin of taxi driving is generally proportional to the total fare amount and is well explained by the TLC pricing structure [10], there is currently no indicator of trip profitability (i.e., the ability to generate profit by taking a trip), which is of interest to most stakeholders and is the main focus of this report. Trip profitability comprises many factors, such as the profitability of the pick-up zone and the availability of alternative modes of transportation. Given the complexity and dynamic nature of this relative measure, the current exploratory analysis tackles the following three areas separately, using different feature combinations from the three datasets:
1.2.1 Pick-up Zone Profitability
One component of the profitability equation is where taxi drivers should circulate in order to pick up the most lucrative passengers. We define the profitability of each pick-up zone \(d\) as the logarithm of the product of the annual averages of trip duration, number of trips and rate per trip, taken over all trips within the zone. It represents the expected profit from picking up a customer in that zone, adjusted for the demand and competition level (via the number of trips) and the logistical cost (via the trip duration).
\[Z_d=\log{\left(E_d[\text{Trip Duration}] \times E_d[\text{Rate per trip}] \times E_d[\text{Number of trips}]\right)}\] where \[\text{Rate per trip (dollars/min)} = \frac{\text{Total Fare Amount}-\text{ACPM}\times\text{Distance (miles)}}{\text{Duration (min)}}\] and \(\text{ACPM}=0.58\) is the cost per mile estimated by the TLC [11].
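As a minimal sketch of how \(Z_d\) could be computed, the snippet below applies the two formulas to a handful of made-up trips. Several details are assumptions rather than part of the report: the natural logarithm is used (the base is not stated), \(E_d[\text{Number of trips}]\) is taken as the mean number of trips per day, all durations are assumed to be positive, and the function names and sample values are purely illustrative.

```python
import math
from collections import defaultdict

ACPM = 0.58  # cost per mile in dollars, as quoted in the text


def rate_per_trip(total_fare, distance_miles, duration_min):
    """Net dollars earned per minute for a single trip."""
    return (total_fare - ACPM * distance_miles) / duration_min


def zone_profitability(trips, n_days):
    """Return {zone: Z_d} from (zone, fare, miles, minutes) tuples.

    Assumption: E_d[Number of trips] is the mean number of trips per day,
    i.e. the zone's trip count divided by n_days.
    """
    by_zone = defaultdict(list)
    for zone, fare, miles, minutes in trips:
        by_zone[zone].append((rate_per_trip(fare, miles, minutes), minutes))

    scores = {}
    for zone, rows in by_zone.items():
        mean_rate = sum(rate for rate, _ in rows) / len(rows)
        mean_duration = sum(minutes for _, minutes in rows) / len(rows)
        mean_trips = len(rows) / n_days
        product = mean_duration * mean_rate * mean_trips
        # The log is undefined for non-positive products (loss-making zones).
        scores[zone] = math.log(product) if product > 0 else float("nan")
    return scores


# Hypothetical trips: (pick-up zone, total fare, distance in miles, minutes)
trips = [
    ("A", 20.0, 5.0, 15.0),
    ("A", 12.0, 2.0, 10.0),
    ("B", 8.0, 1.0, 8.0),
]
scores = zone_profitability(trips, n_days=1)
```

With these sample values, zone A scores higher than zone B because it combines more trips with a higher average rate per minute.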
1.2.2 Hourly Demands
The Zone Profitability formula accounts for neither hourly demand nor the drop-off location, meaning that a trip that ends in a low-profitability zone may not be desirable even if it begins in a highly profitable one. Trip counts are therefore visualised with respect to pick-up and drop-off zones, and hourly demand, characterised by the total trip count per hour, is analysed for the hotspot areas with the highest number of trips.
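The hourly trip count for hotspot zones could be tabulated along the following lines. This is an illustrative sketch only: the function name, the `top_n` parameter, the timestamp format and the sample records are assumptions, not details taken from the datasets.

```python
from collections import Counter
from datetime import datetime


def hotspot_hourly_demand(pickups, top_n=1):
    """Return {zone: [trip count for each hour 0-23]} for the busiest zones.

    pickups is a list of (zone, ISO-format timestamp) pairs.
    """
    zone_totals = Counter(zone for zone, _ in pickups)
    hotspots = [zone for zone, _ in zone_totals.most_common(top_n)]
    demand = {zone: [0] * 24 for zone in hotspots}
    for zone, timestamp in pickups:
        if zone in demand:
            demand[zone][datetime.fromisoformat(timestamp).hour] += 1
    return demand


# Hypothetical pick-up records
pickups = [
    ("A", "2019-01-01 08:15:00"),
    ("A", "2019-01-01 08:45:00"),
    ("A", "2019-01-02 17:05:00"),
    ("B", "2019-01-01 09:00:00"),
]
demand = hotspot_hourly_demand(pickups, top_n=1)
```

Here zone A is the only hotspot returned, with two trips in the 08:00 hour and one in the 17:00 hour.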
1.2.3 Public Transport Competition Factor
The competition factor from alternative modes of transportation (i.e., public transport) is defined in two ways: the accessibility of public transport, measured by transport access time (TAT) [12], and the mode preference ratio [13].
Transport Access Time (TAT)
TAT estimates the average time needed to access the subway from a taxi zone at a specific hour \(h\) of the day, as the walking time to the nearest subway station plus the average interval between trains: \[\text{TAT}_h \text{ (min)}=60\times \frac{\text{Distance to nearest subway (miles)}}{v_\text{walking}}+\frac{60}{\text{Average number of trains at hour } h}\] where \(v_\text{walking}=3.1\text{ miles/h}\) is the average walking speed.
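The TAT formula can be sketched as a small function. The function name and the sample inputs are illustrative, and returning infinity when no trains run in the given hour is an assumption made here to avoid dividing by zero; the report does not say how that case is handled.

```python
V_WALKING_MPH = 3.1  # average walking speed in miles per hour, from the text


def transport_access_time(distance_miles, trains_per_hour):
    """Return TAT in minutes: walking time plus the interval between trains."""
    walk_min = 60 * distance_miles / V_WALKING_MPH
    if trains_per_hour <= 0:
        # Assumption: the subway is treated as inaccessible when no trains run.
        return float("inf")
    return walk_min + 60 / trains_per_hour


# Hypothetical zone: 0.31 miles from the nearest station, 6 trains per hour,
# giving a 6-minute walk plus a 10-minute interval between trains.
tat = transport_access_time(0.31, 6)
```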
Mode preference ratio
The mode preference ratio describes the tendency of a zone \(d\) toward taxi or subway use and ranges from \(-1\) to \(1\). \(M_d < 0\) indicates a travel behavior tendency toward subway use, and \(M_d > 0\) indicates a tendency toward taxi use. \[M_d = \frac{\text{Average taxi trips picked up}_d}{\max_{\forall}(\text{Average taxi trips picked up})} - \frac{\text{Average subway entries}_d}{\max_{\forall}(\text{Average subway entries})}\] where \(\max_{\forall}(\cdot)\) denotes the highest average value across all zones.
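A minimal sketch of the mode preference ratio follows. It assumes both inputs are dictionaries keyed by the same zones and that each maximum is positive; the function name and the sample averages are made up for illustration.

```python
def mode_preference(avg_taxi_pickups, avg_subway_entries):
    """Return {zone: M_d}, where each term is normalised by its maximum over all zones.

    Assumption: both dictionaries share the same zone keys.
    """
    max_taxi = max(avg_taxi_pickups.values())
    max_subway = max(avg_subway_entries.values())
    return {
        zone: avg_taxi_pickups[zone] / max_taxi
        - avg_subway_entries[zone] / max_subway
        for zone in avg_taxi_pickups
    }


# Hypothetical average taxi pick-ups and subway entries per zone
taxi = {"A": 100, "B": 50, "C": 10}
subway = {"A": 200, "B": 1000, "C": 0}
ratios = mode_preference(taxi, subway)
```

In this example zone A leans toward taxi use, zone B leans toward subway use, and zone C, which has no subway entries, is slightly positive.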
10. Taxi Fare - TLC. (2020). Retrieved 30 August 2020, from https://www1.nyc.gov/site/tlc/passengers/taxi-fare.page
11. Utilization Rate | NYC Rules. (2018). Retrieved 30 August 2020, from https://rules.cityofnewyork.us/tags/utilization-rate
12. Correa, D., Xie, K., & Ozbay, K. (2017). Exploring the taxi and Uber demand in New York City: An empirical analysis and spatial modeling. In 96th Annual Meeting of the Transportation Research Board, Washington, DC.
13. Hochmair, H. (2016). Spatiotemporal Pattern Analysis of Taxi Trips in New York City. Retrieved 30 August 2020, from https://doi.org/10.3141/2542-06