DoorDash Data Analyst Case Interview – Marketplace Reliability and Growth Insights

This 60-minute, interviewer-led case mirrors real DoorDash product analytics and operations problems in a three-sided marketplace (consumers, merchants, dashers). It tests your ability to structure ambiguous problems, write production-grade SQL, reason about experimentation in marketplaces with interference, and synthesize insights into crisp, actionable recommendations aligned with DoorDash’s bias for action and operator-first culture.

What the case covers:

1) Problem context (5 min): You’re a Data Analyst on the Marketplace Quality team. Over the last two weeks, the On-Time Delivery Rate (OTR) in the Bay Area dipped by 3 percentage points following a consumer free-delivery promotion and a batching algorithm tweak. Leadership wants to understand the drivers, quantify the impact, and decide whether to roll back the change, tighten promo targeting, or boost dasher pay temporarily.

2) Metric definition and guardrails (5–8 min):
- Define OTR precisely (e.g., actual_delivery_ts <= estimated_delivery_ts for completed, non-scheduled orders; exclude orders with full refunds; treat partial refunds separately).
- Identify primary/secondary metrics: completion rate, average delivery time, long-tail delay rate (e.g., >20 min late), consumer CSAT proxy, cancellation reasons, dasher pickup wait time, merchant prep time, distance, marketplace reliability (supply/demand balance), and promo attribution.
- Discuss trade-offs DoorDash cares about: reliability vs. growth, pay fairness, cost per order, and avoiding unintended supply shocks.

3) SQL analytics case (15–20 min): Using a simplified schema, outline or write SQL to:
(a) compute daily city-level OTR,
(b) decompose lateness into merchant prep vs. transit vs. pickup wait,
(c) segment impact by promo exposure and batching flag, and
(d) quantify how much of the OTR decline is explained by supply/demand imbalance, distance, and weather.
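Task (a) can be sketched as follows. This is a minimal illustration rather than a reference solution: it runs the query against an in-memory SQLite table holding a handful of made-up orders, and it uses only a subset of the orders columns from the example schema in this guide. Two things are assumptions, not facts from the case: that completed orders carry status = 'delivered', and that a full refund means refund_amount >= total. Timestamps are stored as ISO-8601 strings so that string comparison matches chronological order.

```python
import sqlite3

# Hypothetical sample rows; every value is made up for illustration.
# Fields: order_id, city_id, created_at, status, scheduled, total,
#         estimated_delivery_ts, actual_delivery_ts, refund_amount
ROWS = [
    (1, 10, "2024-05-01 18:00:00", "delivered", 0, 30.0,
     "2024-05-01 18:40:00", "2024-05-01 18:35:00", 0.0),   # on time
    (2, 10, "2024-05-01 18:05:00", "delivered", 0, 25.0,
     "2024-05-01 18:45:00", "2024-05-01 19:10:00", 0.0),   # late
    (3, 10, "2024-05-01 19:00:00", "delivered", 1, 40.0,
     "2024-05-01 20:00:00", "2024-05-01 19:55:00", 0.0),   # scheduled: excluded
    (4, 10, "2024-05-01 19:30:00", "delivered", 0, 20.0,
     "2024-05-01 20:10:00", "2024-05-01 20:50:00", 20.0),  # full refund: excluded
    (5, 10, "2024-05-01 20:00:00", "canceled", 0, 15.0,
     "2024-05-01 20:40:00", None, 0.0),                    # not completed: excluded
    (6, 10, "2024-05-02 12:00:00", "delivered", 0, 18.0,
     "2024-05-02 12:40:00", "2024-05-02 12:40:00", 5.0),   # partial refund, on time
]

# Assumed definition: on time means actual <= estimated, computed over
# completed, non-scheduled orders that were not fully refunded.
OTR_SQL = """
SELECT city_id,
       DATE(created_at) AS order_date,
       COUNT(*)         AS eligible_orders,
       AVG(CASE WHEN actual_delivery_ts <= estimated_delivery_ts
                THEN 1.0 ELSE 0.0 END) AS otr
FROM orders
WHERE status = 'delivered'
  AND scheduled = 0
  AND actual_delivery_ts IS NOT NULL
  AND estimated_delivery_ts IS NOT NULL
  AND COALESCE(refund_amount, 0) < total
GROUP BY city_id, DATE(created_at)
ORDER BY city_id, order_date
"""


def build_demo_db():
    """Create an in-memory database with the sample orders loaded."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE orders (
               order_id INTEGER PRIMARY KEY, city_id INTEGER,
               created_at TEXT, status TEXT, scheduled INTEGER, total REAL,
               estimated_delivery_ts TEXT, actual_delivery_ts TEXT,
               refund_amount REAL)"""
    )
    conn.executemany("INSERT INTO orders VALUES (?,?,?,?,?,?,?,?,?)", ROWS)
    return conn


def daily_city_otr(conn):
    """Return (city_id, order_date, eligible_orders, otr) per city-day."""
    return conn.execute(OTR_SQL).fetchall()


if __name__ == "__main__":
    for row in daily_city_otr(build_demo_db()):
        print(row)
```

In an interview it is worth stating the filter choices out loud: which statuses count as completed, how partial refunds are treated, and whether the day boundary is taken in UTC or in the city's local time (this sketch simply uses the stored timestamp as-is).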
Example schema (columns representative of real interviews):
- orders(order_id, consumer_id, merchant_id, city_id, created_at, status, scheduled, subtotal, total, estimated_delivery_ts, actual_delivery_ts, canceled_at, refund_amount)
- deliveries(order_id, dasher_id, accepted_at, pickup_arrival_ts, pickup_depart_ts, dropoff_arrival_ts, distance_mi, batched_flag)
- dashers(dasher_id, active_city_id, activated_at, rating, completion_rate, is_top_dasher)
- merchant_hours(merchant_id, day_of_week, open_ts, close_ts)
- promotions(order_id, promo_id, promo_type, start_ts, end_ts)
- experiments(entity_id, unit, exp_name, variant, start_ts, end_ts)
- weather(city_id, ts_hour, precip, temp_f)
- city_supply_demand(city_id, ts_hour, online_dashers, demand_orders)

Expectations: clean joins, correct filtering (exclude scheduled orders from OTR unless stated otherwise), an appropriate time grain (city-day or city-hour), handling of late-arriving events, and performance-minded SQL (CTEs are acceptable, but put clarity and correctness first).

4) Causal and experimental reasoning (10–12 min):
- Identify confounders: seasonality, weather, merchant hour changes, expansion to long-tail merchants, and city mix.
- Explain why marketplace interference matters: order-level randomization can spill over via shared dashers. Propose a safer design (e.g., a city-level or time-sliced geo experiment, or consumer-level randomization with supply caps) and checks for sample ratio mismatch (SRM) and power.
- Outline a quick backtest or diff-in-diff using pre-post with matched control cities; specify the unit (city-day), clustering of standard errors, and guardrails.
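The diff-in-diff idea in section 4 can be sketched in a few lines. This is only the 2x2 point estimate on city-day OTR values, under the usual parallel-trends assumption. It does not compute clustered standard errors or check pre-period trends, both of which a real analysis would need. The function name and all numbers are hypothetical and made up for illustration.

```python
from statistics import mean

# Hypothetical city-day observations: (group, period, otr).
# "treated" = cities exposed to the promo and batching change,
# "control" = matched cities without the change. All values are made up.
DEMO_ROWS = [
    ("treated", "pre", 0.90), ("treated", "pre", 0.92),
    ("treated", "post", 0.87), ("treated", "post", 0.89),
    ("control", "pre", 0.89), ("control", "pre", 0.91),
    ("control", "post", 0.88), ("control", "post", 0.90),
]


def did_estimate(rows):
    """Difference-in-differences point estimate on cell means.

    Returns (treated post - treated pre) - (control post - control pre).
    """
    cells = {}
    for group, period, otr in rows:
        cells.setdefault((group, period), []).append(otr)
    means = {key: mean(values) for key, values in cells.items()}

    treated_change = means[("treated", "post")] - means[("treated", "pre")]
    control_change = means[("control", "post")] - means[("control", "pre")]
    return treated_change - control_change


if __name__ == "__main__":
    effect = did_estimate(DEMO_ROWS)
    print(f"DiD estimate: {effect * 100:.1f} pp")
```

With these made-up inputs the treated cities fall by 3pp and the controls by 1pp, so the estimate attributes about 2pp of the drop to the change rather than to shared factors such as seasonality or weather.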
5) Product sense and operations recommendations (8–10 min):
- Synthesize: “OTR down 3pp; ~1.8pp driven by increased distance from promo-induced demand shift to farther merchants; ~0.9pp from longer pickup waits due to supply tightness; batching change explains residual in high-density hours.”
- Propose actions with DoorDash-style pragmatism: tighten promo geo/radius, introduce temporary dasher pay boosts for peak hours in affected zones, cap batching size during dinner peaks, surface merchant prep-time SLAs, and prioritize nearby merchants in ranking during spikes. Include expected impact, cost, and a monitoring plan.
- Call out risks: cannibalizing growth, fairness to dashers/merchants, and overfitting to short-term anomalies.

6) Communication and stakeholder alignment (throughout):
- Start top-down, verify definitions early, quantify trade-offs, and end with a single-threaded owner plan (what, by when, owner, metric targets). Interviewers value concise, data-backed recommendations and clear next steps.

Evaluation rubric (what interviewers look for):
- SQL correctness and analytical hygiene (edge cases, filters, time windows, null handling).
- Marketplace intuition specific to DoorDash: supply/demand balance, interference, regionality, and operational constraints.
- Experimentation rigor: unit selection, SRM checks, power, guardrails; awareness of geo/time experiments.
- Business impact: a clear, measurable recommendation with costs/benefits and a fast rollout/rollback plan.
- Communication: structured, concise, and adaptable under time pressure, reflecting DoorDash’s bias for action and operator mindset.

What to bring to the conversation:
- A clear metric tree.
- A short, readable SQL plan.
- 2–3 prioritized recommendations with estimated impact.
- A monitoring dashboard outline (OTR, completion rate, pickup wait, distance, cost per order, and dasher online minutes).
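The example synthesis in section 5 splits the 3pp decline into about 1.8pp from distance and about 0.9pp from pickup waits, which leaves roughly 0.3pp as the residual attributed to batching. A small helper like the one below, a hypothetical sketch rather than any standard tool, makes that reconciliation explicit so the stated drivers always add up to the headline number. The inputs are the approximate figures from that example.

```python
def attribution_table(total_pp, drivers):
    """Return (name, pp, share_of_total) rows plus a residual row.

    total_pp: headline change in percentage points (positive = decline).
    drivers:  mapping of driver name -> estimated contribution in pp.
    """
    explained = sum(drivers.values())
    # Round to suppress floating-point noise in the subtraction.
    residual = round(total_pp - explained, 6)
    rows = [(name, pp, pp / total_pp) for name, pp in drivers.items()]
    rows.append(("residual", residual, residual / total_pp))
    return rows


if __name__ == "__main__":
    # Approximate figures from the example synthesis in section 5.
    table = attribution_table(3.0, {"distance": 1.8, "pickup_wait": 0.9})
    for name, pp, share in table:
        print(f"{name:<12} {pp:>4.1f} pp  {share:>5.0%}")
```

A large or negative residual is a signal that the driver estimates overlap or that something is missing from the metric tree.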

About This Interview

Interview Type: PRODUCT SENSE

Difficulty Level: 4/5

Interview Tips

• Research the company thoroughly

• Practice common questions

• Prepare your STAR method responses

• Dress appropriately for the role