
Nike Software Engineer Case Interview: Designing a Global SNKRS Drop Platform
Overview
This case simulates a real-world Nike digital event: a limited-edition SNKRS release that must handle huge, spiky, global traffic while protecting fairness and brand trust. Based on commonly reported Nike interview formats, the session blends collaborative system design, practical trade-offs, and culture alignment (consumer/athlete* focus, winning as a team, speed/simplicity, continuous learning).

What this case covers
- System design at scale: traffic shaping for flash sales, multi-region resiliency, data consistency under load, and graceful degradation.
- API and data modeling: reservations, inventory, orders, and idempotency.
- Reliability and performance: SLOs/SLA thinking, backpressure, caching, queueing, and failure isolation.
- Security and fairness: bot mitigation, abuse controls, and equitable allocation models.
- Observability and operations: metrics, logs, tracing, runbooks, and rollback strategies.
- Product trade-offs: fairness vs. speed, instant purchase vs. lottery, reservation TTLs, and global consumer experience.
- Collaboration and Nike culture signals: clear communication, consumer empathy, structured problem solving, and team-first behavior.

Candidate brief (given at start)
“Design the backend and high-level architecture for a global SNKRS drop. Expect a 10-minute traffic spike at launch. Ensure fairness, protect inventory, and provide a great experience across North America, EMEA, Greater China, and APLA. Assume a modern cloud stack.”

Constraints and targets (clarify/confirm)
- Scale: ~1–3M users attempting to join within 10 minutes; peak 80–120k RPS on ‘join/queue’ and 15–30k RPS on ‘reserve/checkout’.
- Inventory: 50k units total, sized by SKU/region. No oversells.
- Latency goals: p99 < 250 ms for ‘reserve’, p99 < 2 s for checkout orchestration.
- Availability: target 99.95% during the event; graceful degradation required.
- Reservation TTL: 90–180 seconds (candidate to justify).
- Global: multi-region active-active; data residency where relevant.

Expected artifacts from candidate
- High-level architecture diagram: CDN/WAF + waiting room/queue; API gateway; services for Identity, Queue, Reservation, Inventory, Orders/Payments, Anti-abuse; async bus (e.g., Kafka/Kinesis/PubSub); caches (e.g., Redis) and data stores (e.g., relational for orders, key-value for reservations/inventory); multi-region routing and failover.
- APIs: examples such as POST /queue/join, POST /reservations, POST /orders, GET /inventory/{sku}, with idempotency keys and retry semantics.
- Data model sketch: Inventory(SKU, region, available), Reservation(user, sku, size, expiresAt, status), Order(id, items, paymentStatus), Allocation events.
- Capacity plan: rough RPS, partitioning strategy (by SKU/region), cache hit assumptions, and storage/write patterns.
- Resiliency playbook: circuit breakers, rate limits, load shedding, and rollback/canary plan.

Flow and timing (approx.)
- 0–5 min: Context + candidate questions (consumer/athlete* focus, fairness definition, success metrics).
- 5–20 min: Architecture proposal (walk the diagram, data flows, global strategy).
- 20–35 min: Deep dives (inventory consistency, reservations, idempotency, failure modes, region failover).
- 35–50 min: Security/fairness (bot detection, challenge/response, queue/lottery trade-offs, abuse signals, device fingerprinting considerations).
- 50–60 min: Observability & ops (p99s, saturation, business KPIs like conversion/fairness rate, dashboards, runbooks, chaos tests).
- 60–70 min: Culture and reflection (trade-offs made, what you’d ship by launch day vs. phase 2, how you’d partner with product/ops/legal; learning mindset).

Interviewer prompts (use selectively)
- “How do you prevent oversells with eventual consistency?”
- “Design the waiting room: deterministic ordering vs. weighted lottery; pros/cons.”
- “Walk through idempotent reservations under retries/timeouts.”
- “A region fails mid-drop: what degrades and what stays up?”
- “How do you measure fairness? What’s your anti-bot posture without degrading real fans?”
- “Propose a canary and rollback plan compatible with a 10-minute spike.”

Evaluation rubric (Nike-aligned)
- Consumer/athlete* focus: defines fairness, protects experience under load, considers accessibility and mobile networks.
- Simplicity and speed: clear MVP, phased rollout, pragmatic tech choices; communicates what ships today vs. later.
- Technical depth: solid partitioning, consistency story (e.g., reservation tokens + TTL), correctness under concurrency, realistic capacity math.
- Reliability mindset: thoughtful SLOs, backpressure, failure isolation, chaos/DR readiness.
- Security/fairness: layered bot mitigations; abuse detection signals; transparent, auditable allocation.
- Collaboration: listens, structures problem, explains trade-offs, invites feedback; “win as a team” behaviors.

Common pitfalls (red flags)
- Single-region design or no plan for failover.
- No idempotency or concurrency control on reservations/orders.
- Hand-waving on bot/fairness and abuse economics.
- Ignoring observability and rollback during a live event.
- Over-optimizing niche components without an MVP path.

Optional stretch topics (if time permits)
- Event-sourced allocation ledger for post-event audits.
- Fairness metrics (Gini coefficient of access, unique-user conversion, challenge pass rates).
- Sustainability/efficiency nods (e.g., right-sizing infra, off-peak re-queues) aligned with Nike initiatives.
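
Illustrative sketch: idempotent reservations with a TTL
For the prompt “Walk through idempotent reservations under retries/timeouts,” a candidate might sketch something like the following. It is a minimal, single-process Python illustration, not a production design: the class and method names (ReservationStore, reserve, confirm) and the sample SKU are hypothetical, and the in-memory dictionaries stand in for whatever store would hold one SKU/region partition. In a real system the check-and-decrement would need to be atomic in that store; this sketch assumes a single thread. It shows three things: a retried request with the same idempotency key does not decrement stock twice, stock never goes below zero, and an expired hold returns its unit to the pool.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Reservation:
    user: str
    sku: str
    expires_at: float
    status: str  # "HELD", "CONFIRMED", or "EXPIRED"


class ReservationStore:
    """In-memory stand-in for one inventory partition (hypothetical)."""

    def __init__(self, stock: Dict[str, int], ttl_seconds: float = 120.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.available = dict(stock)   # sku -> units not held or sold
        self.ttl_seconds = ttl_seconds
        self.clock = clock             # injectable so tests can move time
        self.by_key: Dict[str, Reservation] = {}

    def _release_expired(self) -> None:
        # Return units from holds whose TTL has passed.
        now = self.clock()
        for r in self.by_key.values():
            if r.status == "HELD" and r.expires_at <= now:
                r.status = "EXPIRED"
                self.available[r.sku] += 1

    def reserve(self, idempotency_key: str, user: str,
                sku: str) -> Optional[Reservation]:
        self._release_expired()
        existing = self.by_key.get(idempotency_key)
        if existing is not None:
            return existing            # retry: same answer, no second decrement
        if self.available.get(sku, 0) <= 0:
            return None                # sold out: never go below zero
        self.available[sku] -= 1
        r = Reservation(user, sku, self.clock() + self.ttl_seconds, "HELD")
        self.by_key[idempotency_key] = r
        return r

    def confirm(self, idempotency_key: str) -> bool:
        # Called after payment succeeds; safe to call more than once.
        self._release_expired()
        r = self.by_key.get(idempotency_key)
        if r is None or r.status == "EXPIRED":
            return False
        r.status = "CONFIRMED"
        return True


# Usage with a fake clock: one unit, a client retry, then an expired hold.
now = [0.0]
store = ReservationStore({"AJ1-10": 1}, ttl_seconds=120.0, clock=lambda: now[0])
first = store.reserve("key-a", "alice", "AJ1-10")
retry = store.reserve("key-a", "alice", "AJ1-10")    # timeout + retry
assert first is retry and store.available["AJ1-10"] == 0
assert store.reserve("key-b", "bob", "AJ1-10") is None
now[0] = 121.0                                       # alice's hold lapses
assert store.reserve("key-c", "bob", "AJ1-10") is not None
assert store.confirm("key-a") is False
```

One design choice worth discussing in the interview: this sketch does not record a sold-out response under the idempotency key, so a retry after stock is released can succeed. Whether that is desirable depends on the fairness definition agreed at the start of the case.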
About This Interview
Interview Type: Product Sense
Difficulty Level: 4/5