This certification exam assesses a candidate's ability to design and implement LLM-enabled solutions on Databricks, including decomposing complex requirements into manageable tasks and selecting appropriate models, tools, and approaches from the current generative AI landscape. It evaluates working knowledge of Databricks-specific tooling such as Mosaic AI Vector Search, Model Serving, MLflow, Unity Catalog, Agent Bricks, Agent Framework, AI Gateway, and MCP server integration. The exam validates that a candidate can build, evaluate, deploy, govern, and monitor performant RAG applications, single-agent applications, and LLM chains on the Databricks platform.
Prepare for the Databricks Certified Generative AI Engineer Associate exam with structured study material, scenario-based practice questions, sample exam questions, and a realistic exam simulator.
A handful of real practice questions from our Databricks Gen AI Associate bank, to give you a true feel for the style and difficulty before you sign up.
Knowledge tools and action tools in agent design differ on which property?
Why: Tool classification by side effect drives ordering and approval design. Knowledge tools read facts without changing state, so retries are safe and they can run before any commitment is made. Action tools change external state, are not idempotent, and require confirmation or audit. The Unity Catalog distractor is the most tempting because action tools often need grants, but knowledge tools that touch governed data also need them; the discriminator is state change, not authorization.
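The side-effect split described above can be sketched in a few lines of Python. This is a minimal illustration, not Databricks Agent Framework code: the `Tool` class, the `changes_state` flag, the `plan_calls` helper, and the tool names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Tool:
    """Hypothetical tool descriptor; not a Databricks Agent Framework type."""
    name: str
    changes_state: bool  # True for action tools, False for knowledge tools


def plan_calls(tools: List[Tool], confirm: Callable[[Tool], bool]) -> List[str]:
    """Order knowledge tools first, then include only confirmed action tools."""
    knowledge = [t.name for t in tools if not t.changes_state]
    actions = [t.name for t in tools if t.changes_state and confirm(t)]
    return knowledge + actions


tools = [
    Tool("issue_refund", changes_state=True),
    Tool("lookup_order", changes_state=False),
    Tool("search_policy_docs", changes_state=False),
]

# Deny every action tool: only the read-only lookups remain in the plan.
read_only_plan = plan_calls(tools, confirm=lambda tool: False)
# Approve every action tool: the state change is scheduled last.
approved_plan = plan_calls(tools, confirm=lambda tool: True)
```

The classification key here is state change alone; an authorization check would apply to both groups and so would not separate them.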
A customer-facing assistant must reject requests that arrive with personal data already embedded in the prompt, while responses that contain hallucinated personal data should be redacted rather than dropped silently. Which AI Gateway PII configuration matches both directions?
Why: AI Gateway's PII guardrail supports BLOCK to reject the call when PII is detected and MASK to replace PII with redaction tokens before the response returns. The brief asks for rejection on input and redaction on output, which maps to BLOCK on input and MASK on output. BLOCK on output would drop the response entirely, which the brief excludes; MASK on input would not reject. The most tempting distractor is symmetric MASK, which keeps traffic flowing but never rejects.
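To make the two directions concrete, here is a toy simulation of BLOCK on input and MASK on output. It is only a sketch: it does not call AI Gateway, the `apply_guardrail` function and the `config` dictionary are hypothetical, and the regular expression detects nothing but email-like strings.

```python
import re
from typing import Tuple

# Toy detector: only matches email-like strings. Real PII detection is broader.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def apply_guardrail(text: str, behavior: str) -> Tuple[bool, str]:
    """Return (allowed, text) for a hypothetical 'BLOCK' or 'MASK' behavior."""
    if not EMAIL.search(text):
        return True, text  # nothing detected, pass through unchanged
    if behavior == "BLOCK":
        return False, ""  # reject the whole call
    if behavior == "MASK":
        return True, EMAIL.sub("[REDACTED]", text)  # keep the call, redact
    raise ValueError(f"unknown behavior: {behavior}")


# Hypothetical configuration shape: one behavior per direction.
config = {"input": "BLOCK", "output": "MASK"}

request_ok, _ = apply_guardrail(
    "Reset the password for ana@example.com", config["input"]
)
response_ok, response_text = apply_guardrail(
    "Contact bob@example.com for help.", config["output"]
)
```

With this configuration the request is rejected outright, while the response still reaches the caller with the address replaced by a redaction token.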
An agent endpoint serves latency-sensitive interactive traffic at a volume consistently above the documented break-even token throughput for a provisioned-throughput unit. Which endpoint configuration minimizes cost while preserving the latency profile?
Why: Above the break-even token throughput, a provisioned-throughput endpoint with reserved capacity, billed at a flat hourly rate, costs less than per-token billing. Scale-to-zero must stay disabled on latency-sensitive interactive traffic because the cold start on the next request after idle violates the latency profile. Provisioned-throughput with scale-to-zero enabled is the most tempting distractor since the toggle appears to save spend, but it is the wrong fit when interactive latency is part of the stated requirement.
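The break-even reasoning is simple arithmetic, sketched below. The prices are made-up placeholders rather than Databricks list prices, and the helper names are hypothetical; substitute the documented rates for your model and region.

```python
def break_even_tokens_per_hour(hourly_rate: float,
                               price_per_million_tokens: float) -> float:
    """Tokens per hour at which a flat hourly cost equals per-token cost."""
    return hourly_rate / price_per_million_tokens * 1_000_000


def cheaper_option(tokens_per_hour: float, hourly_rate: float,
                   price_per_million_tokens: float) -> str:
    """Compare one hour of per-token spend against the flat hourly rate."""
    per_token_cost = tokens_per_hour / 1_000_000 * price_per_million_tokens
    return "provisioned" if per_token_cost > hourly_rate else "pay-per-token"


# Placeholder prices, chosen only to keep the arithmetic easy to follow.
HOURLY_RATE = 6.0        # currency units per hour of reserved capacity
PRICE_PER_MILLION = 2.0  # currency units per million tokens

threshold = break_even_tokens_per_hour(HOURLY_RATE, PRICE_PER_MILLION)
busy = cheaper_option(5_000_000, HOURLY_RATE, PRICE_PER_MILLION)
quiet = cheaper_option(1_000_000, HOURLY_RATE, PRICE_PER_MILLION)
```

The comparison covers cost only. The scale-to-zero decision is a separate latency question and is not modeled here.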
Which storage surface captures every request and response from a Mosaic AI Model Serving endpoint for the monitoring phase?
Why: Inference Tables are the Delta-backed sink that Mosaic AI Model Serving writes each request and response into for monitoring, and Agent Monitoring samples from them on a schedule to run judges. MLflow runs hold offline evaluation metrics rather than live request streams. The Review App tables store SME rubric scores, not raw traffic, so they cannot serve as the monitoring source from which sampled production data flows back into the eval loop.
Regulatory news feeds update a Delta source table every few seconds, and downstream agents must surface new chunks within a tight freshness window. Cost is secondary to recency. Which Delta Sync pipeline mode matches the requirement?
Why: `CONTINUOUS` Delta Sync tails the source Change Data Feed and applies updates to the index within seconds, matching the freshness requirement when recency outweighs pipeline cost. `TRIGGERED` once per minute is the most tempting distractor since it appears nearly continuous, but each triggered run pays scheduling and startup overhead, and observed freshness still drifts on the order of minutes, missing a seconds-tight target.
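The mode choice can be expressed as a small decision helper. This is a sketch under stated assumptions: the 60-second lag estimate, the helper names, and the table and index names are hypothetical, and the dictionary keys are assumed to resemble the Vector Search client's index-creation arguments, so verify them against the SDK documentation before use.

```python
from typing import Dict


def choose_pipeline_type(freshness_target_seconds: int,
                         triggered_lag_seconds: int = 60) -> str:
    """Pick a sync mode from a freshness target.

    triggered_lag_seconds is an assumed estimate of how stale the index gets
    between triggered runs; measure it for your own pipeline.
    """
    if freshness_target_seconds < triggered_lag_seconds:
        return "CONTINUOUS"
    return "TRIGGERED"


def index_settings(source_table: str, index_name: str,
                   freshness_target_seconds: int) -> Dict[str, str]:
    """Collect index settings in a plain dict (key names are assumptions)."""
    return {
        "source_table_name": source_table,
        "index_name": index_name,
        "pipeline_type": choose_pipeline_type(freshness_target_seconds),
    }


# Placeholder table and index names for a seconds-tight regulatory feed.
settings = index_settings(
    source_table="main.news.regulatory_chunks",
    index_name="main.news.regulatory_chunks_index",
    freshness_target_seconds=10,
)
```

A target looser than the measured triggered lag falls back to `TRIGGERED`, which is the cheaper mode when recency is not the priority.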
All figures should be confirmed on the official Databricks page.
The Databricks Certified Generative AI Engineer Associate exam contains 45 questions and lasts 90 minutes. Always confirm the latest exam blueprint on the official page before scheduling.
The passing score is 70%.
You get 90 minutes to complete the exam. The MyCertStack exam simulator uses the same time budget so you can build pacing under realistic pressure.
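For pacing, the figures quoted above work out as follows. The sketch assumes every question carries equal weight and that the 70% threshold applies to the raw count of correct answers, neither of which is stated here, so treat the result as a rough target.

```python
QUESTIONS = 45
MINUTES = 90
PASS_PERCENT = 70

# Average time budget per question, in seconds.
seconds_per_question = MINUTES * 60 // QUESTIONS

# Smallest whole number of correct answers at or above the pass mark,
# using integer ceiling division to avoid floating-point rounding.
min_correct = -(-QUESTIONS * PASS_PERCENT // 100)
```

That gives two minutes per question on average and 32 correct answers to clear the threshold under the equal-weight assumption.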
No. MyCertStack provides original practice questions, sample exam questions, and a realistic exam simulator written by our team to mirror the style and difficulty of the real exam. They are not dumps and are not the actual questions used by Databricks.
Work through the structured study material chapter by chapter, then drill the practice zone for each topic until you consistently score above the passing threshold. Finish with at least two full exam simulations under timed conditions before sitting the real exam.