
J.P. Morgan AI Engineer Case Interview: Designing a Risk-Aware LLM System for KYC and Banker Assistance
**Focus and fit for J.P. Morgan:** This case mirrors how J.P. Morgan evaluates AI engineers on structured thinking, risk awareness, and client-first impact. Expect deep probing on model governance, controls, and how you balance innovation with operational rigor across multiple lines of business.

**Case prompt candidates receive:** You are asked to design an LLM-powered solution that (1) extracts entities and risk signals from KYC onboarding documents and (2) serves as a banker assistant that answers client-specific questions using approved internal knowledge. The system must operate within a highly regulated environment, respect data entitlements, and produce auditable outputs for control functions.

**Scope and constraints to clarify early:**
- Data sources: KYC packages (PDFs, IDs, emails), CRM notes, trade and account metadata, approved policy docs.
- Data classification: presence of PII and sensitive client information.
- Access control: need for fine-grained entitlements and differential access by role.
- Non-functionals: target p50 response under 1.5 s, p95 under 3 s, concurrency of 2k active users.
- Reliability: SLO of 99.9% with clear fallbacks and read-only degradation.
- Regions and residency: data must not leave approved regions.
- Budget guardrails and vendor risk management for any third-party components.

**What strong answers include at J.P. Morgan:**
- Problem framing: tie model outputs to measurable business outcomes, such as reduced onboarding cycle time and fewer manual reviews, while keeping client interests first.
- Architecture: a secure LLM stack with retrieval-augmented generation (RAG) over a curated document store, a vector index, approval-based connectors, data lineage, and policy-based access.
- Controls, prompts, and guardrails: PII redaction, policy-aware retrieval, allow lists, content filters, jailbreak and prompt-injection defenses, and isolation between user prompts and retriever context.
- Governance: alignment with model risk management, including documentation, model cards, intended use, performance limits, monitoring thresholds, challenger models, periodic reviews, and human-in-the-loop for high-risk actions.
- Evaluation design: offline golden sets for KYC extraction and banker Q&A validation, covering precision, recall, factuality, hallucination rate, compliance violations per 1k responses, answerability coverage, and business KPIs.
- Deployment plan: CI/CD with approvals, canary releases, rollback plans, batch shadow runs, and red teaming before production.
- Observability and audit: tracing of inputs, retrieval context, model versions, feature flags, and decision logs, plus user feedback loops with explainability for reviewers.
- Cost and latency levers: model selection, distillation, caching, prompt optimization, retrieval scope control, and autoscaling.
- Change management: a communication plan with control partners, operations, and frontline users.

**How interviewers typically probe:**
- Why RAG vs. fine-tuning, and when to choose one.
- Specific mitigations for prompt injection, data leakage, and training on client data.
- Designing for model drift and policy changes.
- Handling regulator or audit requests to reproduce a decision from months ago.
- Trade-offs between open-source and managed models, with vendor risk considerations.
- Incident response scenario: a hallucination created an inappropriate KYC note draft; walk through your containment and remediation steps and client communication.

**Evaluation rubric aligned to J.P. Morgan:**
- Structure and clarity: crisp problem restatement, a hypothesis-driven plan, and trade-off articulation.
- Risk and control mindset: explicit mapping of risks to controls and acceptance criteria.
- Technical depth: pragmatic model and system design tuned to bank realities.
- Evidence and metrics: clear offline and online evaluation with thresholds and rollback criteria.
- Stakeholder empathy: ability to partner with compliance, operations, and bankers, and to explain decisions in plain language.
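The control layer described above (PII redaction, policy-aware retrieval, entitlement checks) can be sketched as a thin pre-retrieval filter. The role names, entitlement map, and regex patterns below are illustrative assumptions, not a real production control set or J.P. Morgan API:

```python
import re

# Hypothetical role-to-source entitlement map (assumed for illustration).
ENTITLEMENTS = {
    "kyc_analyst": {"kyc", "policy"},
    "banker": {"policy", "crm"},
}

# Minimal PII patterns; a real system would use a vetted redaction service.
PII_PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),          # US SSN format
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),  # email address
]

def redact_pii(text: str) -> str:
    """Replace PII spans with placeholder tokens before indexing or prompting."""
    for pattern, token in PII_PATTERNS:
        text = pattern.sub(token, text)
    return text

def retrieve(query: str, role: str, index: list[dict]) -> list[str]:
    """Return redacted passages that the caller's role is entitled to see."""
    allowed = ENTITLEMENTS.get(role, set())
    hits = [doc for doc in index
            if doc["source"] in allowed
            and query.lower() in doc["text"].lower()]  # stand-in for vector search
    return [redact_pii(doc["text"]) for doc in hits]
```

The key design point is that entitlement filtering happens before anything reaches the prompt, so the model never sees context the user was not cleared for.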
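The offline evaluation metrics named above (precision, recall, compliance violations per 1k responses) can be computed with a small harness over a golden set. The golden-set format below is an assumption for illustration:

```python
def extraction_metrics(golden: list[set[str]], predicted: list[set[str]]) -> dict:
    """Micro-averaged precision/recall for entity extraction against a golden set."""
    tp = sum(len(g & p) for g, p in zip(golden, predicted))  # true positives
    fp = sum(len(p - g) for g, p in zip(golden, predicted))  # false positives
    fn = sum(len(g - p) for g, p in zip(golden, predicted))  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"precision": precision, "recall": recall}

def violations_per_1k(violation_flags: list[bool]) -> float:
    """Compliance violations normalized per 1,000 responses."""
    if not violation_flags:
        return 0.0
    return 1000 * sum(violation_flags) / len(violation_flags)
```

In an interview, the point to land is that each metric carries an explicit threshold (e.g. a release gate), so "good enough to ship" is a documented, auditable decision rather than a judgment call.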
**Suggested time flow (75 minutes):**
- 5 minutes: problem restatement and scoping.
- 25 minutes: architecture and data plan.
- 15 minutes: modeling, guardrails, and evaluation.
- 15 minutes: governance, monitoring, and deployment.
- 10 minutes: ROI, trade-offs, and Q&A.

**Expected artifacts during the session:** a high-level diagram of the LLM RAG architecture, a controls checklist tied to identified risks, an evaluation table with offline and online metrics and thresholds, and a phased rollout plan with canary and human-in-the-loop review.
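The phased rollout with a canary slice can be made deterministic by hashing user IDs, so the same users stay on the canary model across sessions and results are reproducible for audit. This is a minimal sketch of the routing idea, not a prescribed rollout mechanism:

```python
import hashlib

def canary_route(user_id: str, canary_pct: int = 5) -> str:
    """Route a stable slice of users to the canary model.

    Hashing the user ID (rather than sampling randomly) keeps assignment
    deterministic, which supports audit reproducibility and clean rollback.
    """
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "canary" if bucket < canary_pct else "stable"
```

Rolling back is then just setting `canary_pct` to 0; ramping up is raising it once the canary's metrics clear the agreed thresholds.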
**About this interview:**
- Interview type: product sense
- Difficulty level: 4/5