The fastest way to calibrate your AIGP readiness is not to read another framework summary — it is to sit with questions that force you to apply the frameworks under time pressure. These ten questions are designed to surface the precise gaps that send well-prepared candidates out of the exam room frustrated: Provider vs. Deployer confusion, lifecycle sequencing errors, and the "utopian answer" trap that punishes idealists who ignore documented risk acceptance thresholds.
Each question is tagged by domain and cognitive level (Recall, Apply, or Evaluate). The distractor analysis after every answer is where the real learning happens — study the wrong answers as carefully as the right one.
Pause on every scenario question and identify three things: (1) the lifecycle stage described in the stem, (2) whether the organisation is acting as a Provider or Deployer, and (3) which framework's obligations govern the situation. This three-variable lock is the single most reliable technique for eliminating distractors on the real exam.
Domain I — Foundations of AI
A financial institution deploys a machine learning model that assigns creditworthiness scores to loan applicants. After several months of operation, the risk team observes that the model's predictions have drifted — applicants from a particular demographic group who historically received lower scores are now receiving even lower scores than the original training data would predict, despite no documented change in the model's parameters.
- A Model overfitting, caused by insufficient regularization during training.
- B Feedback loop amplification, where prior model outputs become inputs that reinforce and compound historical bias over time.
- C Data poisoning, caused by a malicious actor injecting manipulated records into the training pipeline.
- D Distribution shift, caused by a change in the statistical properties of live applicant data relative to the training set.
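The key phrase in the stem is "no documented change in the model's parameters." This eliminates explanations that depend on retraining or external interference. The bias is worsening over time through operation alone, which is the defining signature of a feedback loop. When a model's outputs (lower credit scores) restrict the real-world opportunities available to a group, the data that group generates changes accordingly, and that data then flows back into future model inputs, so the original bias compounds. Overfitting (option A) is fixed at training time and does not worsen during operation. Data poisoning (option C) requires interference with the training pipeline, which the stem gives no evidence of. Distribution shift (option D) names the symptom without explaining why scores keep falling for the same group in the same direction.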
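The compounding effect can be illustrated with a toy simulation. This is a minimal sketch, not a model of any real credit system: the function name, the reference score, and the `feedback_strength` parameter are all hypothetical, and the update rule (each cycle's score falls in proportion to the current gap) is an assumption chosen only to show how a gap can widen with no change to model parameters.

```python
def simulate_feedback_loop(initial_score: float, reference_score: float,
                           feedback_strength: float, cycles: int) -> list[float]:
    """Return a group's average score after each scoring cycle (toy model).

    Assumption: the further the group's score sits below the reference,
    the weaker the data the group generates for the next cycle, so the
    next score drops in proportion to the current gap.
    """
    scores = [initial_score]
    for _ in range(cycles):
        gap = reference_score - scores[-1]
        # The model itself is untouched; only its inputs degrade.
        scores.append(scores[-1] - feedback_strength * gap)
    return scores


# Hypothetical numbers, for illustration only.
history = simulate_feedback_loop(initial_score=600.0, reference_score=650.0,
                                 feedback_strength=0.1, cycles=5)
gaps = [650.0 - score for score in history]
```

In this sketch the gap grows every cycle even though nothing about the model is retrained, which is the pattern the stem describes.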
An AI governance team is cataloguing the organisation's AI systems for a new enterprise inventory. One system uses a pre-trained large language model to generate first-draft responses for customer service agents, who review and edit each response before it is sent to the customer. A second system automatically approves or denies customer refund requests under $50 based on purchase history, with no human review step.
- A Both systems operate with human-in-the-loop oversight because a human agent is present within each workflow.
- B System 1 is human-in-the-loop; System 2 is fully automated. Neither requires additional governance controls because the decisions involved are low-stakes.
- C System 1 is human-in-the-loop; System 2 is human-out-of-the-loop. System 2 requires more robust governance controls due to the absence of human review at the point of decision.
- D System 1 is human-on-the-loop because agents can override the output; System 2 is human-in-the-loop because the policy parameters were set by a human.
The distinction between human-in-the-loop and human-out-of-the-loop turns on whether a human reviews and can modify the AI output before it affects the subject. In System 1, the agent reviews every draft before it reaches the customer — that is the definition of human-in-the-loop. In System 2, the refund decision is made and applied without any human review step — that is human-out-of-the-loop, regardless of who set the policy thresholds. The claim in option B that low dollar value eliminates governance concerns conflates transaction value with governance risk, which the BoK explicitly does not permit.
Domain II — AI Laws & Frameworks
A European hospital deploys an AI system purchased from a US-based vendor. The system analyses patient imaging data to flag potential anomalies for radiologist review. The vendor's technical documentation states that the system was validated for use as a "clinical decision support tool for trained medical professionals." The hospital has not conducted its own conformity assessment prior to deployment. A regulatory audit is initiated.
- A The US-based vendor bears sole accountability as the system's manufacturer; the hospital has no obligations under the EU AI Act because it did not develop the system.
- B The hospital bears sole accountability because it is operating the system within the EU, regardless of where it was developed.
- C The vendor, as Provider, holds primary accountability for conformity. The hospital, as Deployer, holds independent obligations including verifying that required documentation was received and that the system is used within its intended purpose — obligations the hospital did not meet.
- D Both parties share equal accountability because the EU AI Act does not distinguish between manufacturers and operators for high-risk medical systems.
This question tests the BoK v2.1 Provider/Deployer distinction, one of the most heavily examined concepts in the February 2026 curriculum update. Under the EU AI Act, a Provider (the entity that develops and places the system on the market) bears primary responsibility for conformity assessment, technical documentation, CE marking, and instructions for use. A Deployer (the entity that uses the system in a professional context) has its own independent obligations: confirming documentation was received, using the system within its intended purpose, monitoring it in operation, and implementing human oversight measures. The hospital's failure to verify documentation and conduct its own review constitutes a Deployer obligation breach — but this does not transfer the Provider's conformity assessment duties to the hospital.
A governance lead at a logistics company is building an AI risk management programme from scratch. She begins by mapping all AI systems in operation, documenting the data they consume, the decisions they produce, and the stakeholders they affect. She then identifies the risk categories most relevant to each system — accuracy risk, fairness risk, security risk, and operational risk. She has not yet established monitoring protocols or mitigation controls.
- A She has completed the Manage function; she should next address the Govern function to establish oversight structures.
- B She has completed the Govern function; she should next address the Map function to contextualise AI risks.
- C She has completed the Map function; she should next address the Measure function to assess the likelihood and magnitude of identified risks.
- D She has completed the Measure function; she should next address the Manage function to implement controls.
The NIST AI RMF organises risk activities into four functions: Govern, Map, Measure, and Manage. Govern establishes organisational policies, roles, and accountability structures. Map identifies and classifies the AI system's context, risks, and affected stakeholders. Measure assesses the identified risks in terms of likelihood, severity, and priority. Manage implements controls to address those risks. The governance lead's activities — inventorying systems, documenting data flows, and categorising risk types — are Map activities. She has not yet assessed those risks quantitatively or qualitatively, which is the Measure function. Measure comes before Manage because you cannot prioritise controls without first assessing severity.
A multinational corporation implements ISO/IEC 42001 as the foundation of its AI management system. During a gap assessment, the internal auditor identifies that the organisation has a robust policy for identifying AI risks, but lacks a formal process for determining which individuals and business units should be consulted when assessing the impact of a proposed AI system on external stakeholders such as customers and regulators.
- A Clause 6 — Planning, specifically the risk assessment process for identifying AI-related harms.
- B Clause 4 — Context of the Organisation, specifically the requirement to identify and understand the needs and expectations of interested parties.
- C Clause 9 — Performance Evaluation, specifically the internal audit requirements for reviewing governance processes.
- D Clause 7 — Support, specifically the competence and awareness requirements for personnel involved in AI governance.
ISO/IEC 42001 Clause 4 — Context of the Organisation — requires an organisation to determine who its interested parties are (internal and external), understand what they need and expect from the AI management system, and document which of those needs are relevant to the AIMS scope. The gap described is specifically about the process for identifying which parties to consult — customers and regulators — which is the Clause 4 requirement, not a planning or risk assessment gap. Risk assessment (Clause 6) assumes you already know your interested parties; you cannot assess impact on stakeholders you have not identified.
Domain III — Governing AI Development
A data science team is preparing a training dataset for a predictive maintenance model. The dataset combines sensor readings from industrial equipment, maintenance records, and failure logs collected over eight years. The governance team has been asked to review the dataset before model training begins.
During review, the governance team identifies that the failure logs from the first three years were collected under a different classification scheme — what was recorded as "minor fault" before 2017 would be classified as "critical fault" under current standards. The data science team proposes relabelling those records using an automated mapping algorithm before training.
- A Approve the automated relabelling, as it corrects a known inconsistency and improves data quality prior to training.
- B Reject the dataset entirely and require the team to source new failure logs that use consistent classification standards throughout.
- C Request that the relabelling methodology be documented, validated against a sample of original records by domain experts, and that the resulting dataset version be tracked separately with clear lineage notes before training proceeds.
- D Instruct the team to exclude the pre-2017 records from training to eliminate the classification inconsistency.
This question tests data governance principles at the pre-training stage. The governance team's role is not to veto technical decisions but to ensure those decisions are made responsibly, documented, and auditable. The risk in automated relabelling is that the mapping algorithm may not perfectly reflect domain-expert judgement, so the transformation must be traceable. Option C, which requires documentation, expert validation on a sample, and dataset lineage tracking, is the governance-appropriate response: it enables the project to proceed while maintaining accountability for the transformation decision. Option A is inadequate because it accepts technical authority without governance oversight. Options B and D are both forms of avoidance: B discards all eight years of data and D discards the pre-2017 records, in each case without attempting a defensible remediation.
An organisation is evaluating a completed AI model for resume screening. Post-training evaluation metrics show that the model achieves 91% accuracy overall. However, a disaggregated analysis reveals that the model's recall rate — the proportion of genuinely qualified candidates correctly identified — is 88% for male candidates and 71% for female candidates. The model's developer argues that the 91% overall accuracy demonstrates the system is fit for purpose. The organisation's documented AI risk appetite states that demographic performance differentials above 10 percentage points on any primary metric require escalation and remediation before deployment.
- A Approve deployment. The 91% overall accuracy exceeds the industry benchmark and demonstrates that the model is sufficiently accurate for resume screening.
- B Reject the model permanently. Differential performance of this magnitude in a hiring context constitutes illegal discrimination under most employment law frameworks and cannot be remediated.
- C Escalate and require remediation before deployment. The 17-point recall differential between demographic groups exceeds the organisation's documented 10-point threshold, triggering the mandatory escalation and remediation requirement regardless of overall accuracy.
- D Approve deployment with enhanced monitoring. Recall differentials are less material than precision differentials in screening contexts, and monitoring will capture any emerging bias in production.
The stem's key sentence is the risk appetite statement: a differential above 10 percentage points on any primary metric requires escalation and remediation before deployment. The observed recall differential is 17 points (88% minus 71%). This is a procedural question, not a statistical one. The organisation has pre-committed to a threshold, and that threshold has been exceeded. Option C is the only answer that respects the documented governance control. Options A and D both approve deployment in ways that contradict the stated risk appetite. Option B introduces a permanence ("Reject the model permanently") and a legal determination that the governance team is not positioned to make, and it rules out the remediation that the risk appetite statement itself calls for.
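The threshold check itself is simple arithmetic, sketched below. The function names are hypothetical. The 88% and 71% recall figures and the 10-point threshold come from the stem, and "above 10 points" is read as strictly greater than 10.

```python
def recall_differential(recall_by_group: dict[str, float]) -> float:
    """Largest gap, in percentage points, between any two groups."""
    values = recall_by_group.values()
    return max(values) - min(values)


def requires_escalation(recall_by_group: dict[str, float],
                        threshold_points: float) -> bool:
    """True when the differential is strictly above the documented threshold."""
    return recall_differential(recall_by_group) > threshold_points


# Figures from the scenario, expressed in percent.
recall = {"male": 88.0, "female": 71.0}
differential = recall_differential(recall)                     # 17.0
escalate = requires_escalation(recall, threshold_points=10.0)  # True
```

Overall accuracy is deliberately not an input: under the stated risk appetite, the 91% figure has no bearing on whether the escalation trigger fires.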
A governance lead is designing a standardised documentation framework for all AI systems the company develops internally. She wants a single-page artifact for each model that records the model's intended use, performance metrics, known limitations, evaluation datasets, and intended user population — and is designed to be readable by both technical and non-technical stakeholders.
- A A Fundamental Rights Impact Assessment (FRIA), used to evaluate the effect of an AI system on the fundamental rights of affected individuals.
- B A Model Card, a standardised documentation artifact that summarises a model's purpose, performance characteristics, limitations, and appropriate use cases for diverse audiences.
- C A System Card, a broader artifact that documents an entire AI system — including components, data pipelines, and integration architecture — rather than a single underlying model.
- D A Data Sheet, which documents the composition, provenance, and intended use of a dataset rather than the model trained on it.
A Model Card is the artifact described. Developed as a transparency mechanism, it documents a model's intended purpose, performance benchmarks across different conditions and population subgroups, evaluation datasets, known limitations, and recommendations for appropriate use — in a format accessible to non-technical readers. The description is highly specific and matches Model Card conventions precisely. The other options are real governance artifacts but serve distinct purposes: FRIAs assess rights impacts of systems (not model-level documentation), System Cards cover full system architecture, and Data Sheets document datasets rather than models.
Domain IV — Governing AI in Deployment
An insurance company's AI model for claims processing has been in production for fourteen months. The model automatically approves standard claims within seconds. During a routine audit, the monitoring team discovers that over the past six weeks, the model's approval rate for a specific claim category has dropped from 74% to 41% — a shift not explained by any documented change in claims patterns, policy rules, or model parameters. No customer complaints have been filed. The model's output logs show normal confidence scores throughout the period.
- A Take no immediate action. No customer complaints have been filed, and normal confidence scores confirm the model is operating as intended.
- B Notify affected customers immediately and issue manual reviews of all denied claims from the past six weeks.
- C Escalate to an incident investigation: suspend automated decision-making for the affected claim category, route those decisions to human reviewers, and initiate a root cause analysis before determining remediation steps.
- D Retrain the model on the most recent six weeks of claims data to recalibrate approval rates to historical norms.
The scenario describes an unexplained material performance shift: a 33-percentage-point drop in approval rate (74% to 41%) with no documented cause. The absence of customer complaints is not exculpatory; it may mean that denied customers did not know they had grounds to complain, or had not yet appealed. Normal confidence scores are not reassuring either, because a miscalibrated model can report high confidence on incorrect outputs. The governance-appropriate response is to escalate to investigation, protect affected individuals by routing decisions to human reviewers while the investigation runs, and determine root cause before any remediation. Retraining without root cause analysis (option D) risks amplifying whatever underlying problem caused the shift.
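A monitoring rule of the kind this scenario implies might look like the sketch below. The function names and the 5-point tolerance are assumptions for illustration; the source gives only the 74% and 41% approval rates. The rule takes no complaint counts or confidence scores as inputs, reflecting the point that neither rules out a failure.

```python
def approval_rate_shift(baseline_pct: float, current_pct: float) -> float:
    """Signed change in approval rate, in percentage points."""
    return current_pct - baseline_pct


def needs_investigation(baseline_pct: float, current_pct: float,
                        tolerance_points: float,
                        documented_cause: bool = False) -> bool:
    """Flag a shift that is both material and unexplained.

    tolerance_points is a hypothetical policy value, not taken from the scenario.
    """
    shift = abs(approval_rate_shift(baseline_pct, current_pct))
    return shift > tolerance_points and not documented_cause


shift = approval_rate_shift(baseline_pct=74.0, current_pct=41.0)   # -33.0
flagged = needs_investigation(74.0, 41.0, tolerance_points=5.0)    # True
```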
A technology company deploys an agentic AI system that autonomously manages supplier contract renewals. The system can access internal financial data, send binding emails on behalf of procurement officers, and approve purchase orders up to $100,000 without human sign-off. The system completes tasks across multiple sequential steps, constructing its own plan of action before executing. Three months after deployment, the system renews a supplier contract at a rate 22% above market value without escalating to a human for approval, citing its authority to approve contracts up to $100,000.
- A Insufficient model accuracy. A more capable model would have identified the above-market rate and escalated appropriately.
- B Inadequate monitoring. Post-deployment monitoring would have detected the anomalous contract value before the renewal was executed.
- C Absent escalation triggers. The system's authority definition was scoped solely by transaction value, without escalation rules for contextual anomalies such as significant deviation from market pricing — a governance design gap at the pre-deployment stage.
- D Excessive system autonomy. Agentic systems should never be authorised to send binding communications or approve financial transactions without per-action human approval.
This question tests the governance design requirements specific to agentic AI — the BoK v2.1 addition that reflects the shift from static models to autonomous, multi-step, goal-directed systems. The system acted within its technically-defined authority (the transaction was under $100,000). The governance failure was not model capability, not monitoring lag, and not the existence of autonomy itself — it was the design of the authority boundary. A well-governed agentic system requires escalation triggers not just for dollar value, but for contextual anomalies (material deviation from market price is an obvious candidate). This is a pre-deployment governance architecture failure, not a post-deployment monitoring failure. Option D is a "utopian" answer that would eliminate the operational value of agentic systems entirely — the exam does not reward blanket prohibition when targeted governance controls exist.
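The difference between a value-only authority boundary and one with a contextual trigger can be sketched as follows. This is an illustrative design, not a prescribed control. `AuthorityPolicy`, `escalation_reasons`, the 10% deviation tolerance, and the $80,000 contract value are all hypothetical; only the $100,000 ceiling and the 22% above-market rate come from the scenario.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthorityPolicy:
    max_value: float             # value ceiling (100,000 in the scenario)
    max_market_deviation: float  # hypothetical: tolerated fraction above market


def escalation_reasons(policy: AuthorityPolicy, contract_value: float,
                       market_deviation: float) -> list[str]:
    """Return every reason the action must go to a human; empty means proceed."""
    reasons = []
    if contract_value > policy.max_value:
        reasons.append("value above authority limit")
    if market_deviation > policy.max_market_deviation:
        reasons.append("price deviates from market benchmark")
    return reasons


policy = AuthorityPolicy(max_value=100_000, max_market_deviation=0.10)
# Hypothetical contract value under the ceiling, priced 22% above market.
reasons = escalation_reasons(policy, contract_value=80_000, market_deviation=0.22)
```

With only the value check, this renewal would proceed. The added trigger routes it to a human without removing the system's autonomy for routine renewals.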
How to Interpret Your Score
Use the table below to gauge your readiness. The passing scaled score on the AIGP is 300 on a 100–500 scale, which is commonly estimated to correspond to roughly 70% correct on the full 100-question exam. For practice sets, aim higher: 75% or more across multiple question sets before scheduling.
| Score | Signal | Recommended Action |
|---|---|---|
| 9–10 / 10 | Strong readiness | Expand to full-length timed mock exams. Focus on weak domains. |
| 7–8 / 10 | On track | Review distractor analyses for missed questions. Drill your weakest domain with additional scenario questions. |
| 5–6 / 10 | Foundational gaps remain | Return to the BoK sections for the domains where you missed questions. Study the governance mindset: prioritise scenario-based review over re-reading frameworks. |
| 0–4 / 10 | Material preparation needed | Do not schedule the exam yet. Start with Domain I and II foundations, build vocabulary from the IAPP Glossary, then return to scenario practice. |
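
For readers tracking several practice sets, the bands above and the 75% target can be expressed as a small helper. This is a sketch with hypothetical function names. It assumes that "75%+ across multiple question sets" means every set reaches the target, which is one reasonable reading rather than an official rule.

```python
def readiness_band(correct: int) -> str:
    """Map a score out of 10 to the signal in the table above."""
    if not 0 <= correct <= 10:
        raise ValueError("score must be between 0 and 10")
    if correct >= 9:
        return "Strong readiness"
    if correct >= 7:
        return "On track"
    if correct >= 5:
        return "Foundational gaps remain"
    return "Material preparation needed"


def meets_practice_target(results: list[tuple[int, int]],
                          target: float = 0.75) -> bool:
    """True if there is at least one set and every (correct, total) set meets the target."""
    return bool(results) and all(correct / total >= target
                                 for correct, total in results)


band = readiness_band(8)
ready = meets_practice_target([(8, 10), (16, 20)])
```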