SecAI+ Study Guide

Exam prep • concepts • attacks • defenses • governance • practice

Exam priority dashboard

My study notes, vibe coded into a high-yield SecAI+ cram.

Covering core exam domains, testable attack patterns, AI governance frameworks, RAG security, explainability, defensive controls, and exam-style practice. The biggest scoring opportunity is still understanding how attacks and controls differ across the AI lifecycle. This site is not meant to be everything you need to pass the test, but hopefully helps.

17%
Basic Concepts
40%
Securing AI Systems
24%
AI-Assisted Security
19%
Governance

Objective weights

Memorize cold
Training vs Inference

If it happens after deployment, it is usually not data poisoning.

Privacy distinction
Inversion vs Membership

“Was this record used?” is membership inference; “reconstruct the record” is inversion.

Architecture
Registry vs Vector DB

Model registry stores model versions. Vector DB stores embeddings for retrieval.

Framework core
NIST AI RMF

Govern, Map, Measure, Manage — in that order.

High-yield study sequence

Step 1
Master the lifecycle

Know where attacks occur: poisoned data, backdoored training, API abuse, adversarial inference, and drift during operations.

Step 2
Map attacks to defenses

Rate limiting for extraction, differential privacy for privacy leakage, validation for poisoned data, monitoring for prompt injection.

Step 3
Learn architecture terms

Feature store, vector database, inference API, model registry, retriever, embeddings, and model cards show up in scenario questions.

Step 4
Drill exam wording traps

CompTIA often tests “BEST” versus “FIRST,” governance versus technical controls, and subtle attack naming.

Night-before checklist

  • Know the 10 core attack types.
  • Know the 7 trustworthy AI characteristics.
  • Know NIST AI RMF and ISO 42001 / 23894 roles.
  • Know RAG architecture and its security risks.
  • Know LIME, SHAP, counterfactuals, saliency maps, attention.
  • Know defensive AI uses: UEBA, alert correlation, SOAR.

AI fundamentals and exam language

SecAI+ is security and governance focused, not heavy on advanced math. You need clean distinctions between AI, machine learning, deep learning, NLP, LLMs, and operational patterns like RAG and federated learning.

Artificial Intelligence

Broad field of systems performing tasks that usually require human intelligence.

Machine Learning

Uses data to learn patterns and make decisions or predictions.

Deep Learning

Neural network-based learning for high-dimensional data like images, audio, and language.

Supervised Learning

Learns from labeled examples such as benign versus malicious samples.

Unsupervised Learning

Finds hidden patterns in unlabeled data such as clustering network traffic.

NLP / LLMs

Language-focused AI systems. LLMs are deep learning models that power chat, summarization, and generation.

Concept pairs to keep separate

AI vs ML
AI is the umbrella. ML is one approach inside AI.
ML vs DL
Deep learning is a subset of ML using neural networks.
LLM vs RAG
The LLM generates; RAG adds retrieval from trusted documents to ground the answer.
Generative vs Predictive
Generative creates content; predictive classifies or forecasts.

Modern AI terms you must know

RAG: Retrieval-Augmented Generation. The model retrieves relevant private data before generating a response.
Embeddings: Numeric representations of content used to compare semantic similarity.
Prompt Injection: Malicious instructions intended to override system behavior or retrieval policies.
Context Poisoning: Manipulating the retrieval source so the model consumes malicious or false context.
Model Drift: Performance degradation as real-world data changes over time.
Differential Privacy: Adding noise so individual data points are harder to recover.
Federated Learning: Training locally and sharing updates instead of raw data.
Model Card: Documentation for intended use, limitations, metrics, and bias considerations.

Emerging threat examples

Automated phishing

GenAI can create highly personalized phishing lures or clone voices for social engineering.

Deepfakes

Synthetic audio or video can enable fraud, disinformation, or identity spoofing.

Polymorphic malware

Malware mutates itself to evade signature-based detection.

Model extraction

Repeated API queries can help an attacker reconstruct a proprietary model.

AI / ML lifecycle security

CompTIA repeatedly tests where in the lifecycle a problem appears. The same attack name may be wrong if the phase is wrong.

Phase What happens Primary risks Common controls
1. Data CollectionGather raw samples from logs, users, images, documents, telemetry, or labels.Data poisoning, data leakage, bias in source data, poor lineage.Dataset provenance, access control, validation, data minimization.
2. Data PreparationCleaning, normalization, feature engineering, labeling.Label flipping, hidden bias, bad feature engineering.Quality checks, dual review, representative samples, lineage tracking.
3. TrainingLearn weights or rules from prepared data.Backdoor insertion, poisoned updates, insecure dependencies, training data leakage.Secure training environment, signed dependencies, privacy controls, verification.
4. ValidationTest performance, fairness, robustness, and reliability.Undetected bias, poor robustness, hidden performance gaps.Bias testing, adversarial testing, explainability review, red teaming.
5. DeploymentRelease model to API, service, app, or device.API abuse, supply chain compromise, misconfiguration, secrets exposure.API gateways, model signing, configuration hardening, access control.
6. MonitoringObserve predictions, security events, outputs, and drift.Model drift, abuse, prompt injection, abnormal queries, unsafe outputs.Telemetry, alerts, anomaly detection, output review, retraining triggers.
7. Retraining / RetirementUpdate or retire a model when performance or risk changes.Catastrophic forgetting, stale controls, untracked model versions.Versioning, controlled retraining, rollback plans, decommissioning process.

Training vs inference attacks

Training phase
  • Data poisoning
  • Label flipping
  • Backdoor attacks
  • Poisoned federated updates
Inference phase
  • Adversarial examples
  • Model extraction
  • Prompt injection
  • Evasion attacks
Exam trap: If the question says “after deployment,” “during prediction,” or “while the model is live,” that strongly points away from data poisoning.

Lifecycle questions CompTIA likes

Q: Where is data poisoning most likely introduced?
A: During data collection or preparation before or during training.
Q: Where would adversarial examples appear?
A: During inference or live prediction.
Q: Where does model drift show up?
A: Monitoring and operations over time.
Q: What phase is most associated with API abuse?
A: Deployment and inference.

Attack Atlas: the 10 core attacks you must know cold

These show up constantly in SecAI+ material. Focus on the goal, phase, and best defense for each.

Data Poisoning

Manipulating training data so the model learns the wrong patterns.

Phase: Training
Defense: Dataset validation and provenance

Label Flipping

Deliberately mislabeling training examples, such as marking malware benign.

Phase: Data prep / training
Defense: Label QA and peer review

Backdoor Attack

A hidden trigger is inserted during training so a special input causes a malicious result.

Phase: Training
Defense: Secure training and robustness testing

Adversarial Examples

Small input changes cause misclassification, like a stop sign read as a speed limit sign.

Phase: Inference
Defense: Adversarial training and input validation

Model Extraction

Attacker queries the model repeatedly to approximate or steal it.

Phase: Inference
Defense: Rate limiting and query monitoring

Model Inversion

Attacker reconstructs sensitive training data from outputs.

Goal: Recover hidden data
Defense: Differential privacy and restricted outputs

Membership Inference

Attacker determines whether a specific record was used in training.

Goal: “Was this record used?”
Defense: Privacy controls and regularization

Prompt Injection

Malicious prompts or retrieved content override instructions in an LLM workflow.

Phase: Inference / RAG
Defense: Isolation, filtering, output monitoring

Context Poisoning

Attacker poisons retrieved documents so the model consumes malicious context.

Phase: Retrieval / RAG
Defense: Retrieval trust controls and source vetting

Training Data Leakage

Sensitive source content becomes memorized or exposed by the model.

Impact: Privacy and confidentiality breach
Defense: Data minimization and privacy safeguards

Attack distinctions that get tested

Data poisoning changes what the model learns during training.
Adversarial examples change the input at inference time.
Model extraction steals model behavior or decision boundaries.
Model inversion reconstructs hidden data from outputs.
Membership inference checks whether a single record participated in training.

Best-vs-first attack mitigation trap

Model extraction: the best mitigation is often an architectural control like API rate limiting, not only monitoring.
Bias problems: the best answer is often representative data or better governance, not encryption.
RAG abuse: isolate system prompts and validate retrieved context, not just scan outputs after the fact.

Defensive controls across the AI stack

SecAI+ is heavy on practical controls: privacy protection, integrity protection, secure deployment, and monitoring. Tie every control to a specific attack or lifecycle risk.

Layer Control Why it matters
Data LayerData minimizationReduces privacy exposure and unnecessary sensitive training content.
Data LayerDifferential privacyMakes individual training records harder to reconstruct.
Data LayerDataset validation / verificationDetects poisoning, corruption, and poor-quality sources.
Model LayerDigital signaturesProtects integrity of model artifacts during deployment.
Model LayerWeights encryptionProtects sensitive model internals from theft or tampering.
Inference LayerAPI gateways / rate limitingReduces extraction and abuse through repeated query attempts.
Inference LayerInput sanitizationCatches malformed, malicious, or manipulative inputs.
Monitoring LayerInference monitoringDetects unusual prompts, outputs, query patterns, and drift.

Security monitoring priorities

  • Abnormal query volumes
  • Prompt injection attempts
  • Unexpected output patterns
  • Adversarial input indicators
  • Model drift and performance decline
  • Changes to dependencies or models

Deployment environments

On-premises: Strong control over sensitive data, often preferred for healthcare or high-regulation environments, but less elastic.
Cloud: Scalable and fast, but depends on shared responsibility. You still secure your data, model settings, and access controls.
Hybrid: Often used when training benefits from cloud compute but inference must remain private or low-latency on-prem.

Supply chain controls

Pretrained models: verify source, signatures, and provenance.
Datasets: check integrity, licensing, lineage, and poisoning risk.
ML libraries: pin dependencies, validate packages, monitor for compromise.
Serialization risk: prefer safer data-only formats where possible and avoid unsafe deserialization.

Architecture components CompTIA loves

Component Purpose
Model RegistryStores and versions trained models.
Vector DatabaseStores embeddings used for semantic retrieval.
Feature StoreStores curated ML features for consistent training and inference.
Inference APIAccepts requests and returns model predictions or generations.
RetrieverFinds relevant documents or passages for RAG workflows.
Model CardDocuments intended use, limitations, risk, and evaluation details.

RAG architecture

User Prompt
Retriever
Vector Database / Document Store
Relevant Context
LLM
Response
Key risks: prompt injection, context poisoning, sensitive document exposure.
Key defenses: trusted retrieval sources, prompt isolation, access controls, output review.

Architecture wording traps

Registry
Stores model versions, not embeddings.
Vector DB
Stores embeddings, not the trained model binary.
Feature store
Holds features used by models.
Inference API
Runs live predictions or generations.

Governance and compliance

NIST AI RMF

Primary U.S. AI risk framework with four core functions: Govern, Map, Measure, Manage.

ISO/IEC 42001

AI management system standard, similar in spirit to how ISO 27001 structures management systems for security.

ISO/IEC 23894

AI risk management guidance focused on identifying, analyzing, and treating AI risks.

MITRE ATLAS

Adversarial threat framework for AI systems, similar in spirit to ATT&CK but for ML/AI attacks.

NIST AI RMF functions

Govern: culture, oversight, policy, accountability, acceptable use.
Map: context, stakeholders, intended use, harms, scope.
Measure: accuracy, bias, drift, robustness, resilience metrics.
Manage: treat, monitor, respond, retrain, improve.

Trustworthy AI characteristics

  1. Valid and reliable
  2. Safe
  3. Secure and resilient
  4. Accountable and transparent
  5. Explainable and interpretable
  6. Privacy-enhanced
  7. Fair with managed harmful bias

EU AI Act risk tiers

Unacceptable risk
Prohibited uses such as certain social scoring-style patterns.
High risk
Critical use cases such as healthcare or critical infrastructure face strong requirements.
Limited risk
Transparency obligations are common for user-facing systems like chatbots.

Governance concepts likely to be tested

Accountability
Fairness and bias mitigation
Transparency and explainability
Lifecycle monitoring
Model cards and documentation
Dataset integrity and lineage
Privacy-enhancing controls
Risk committees and oversight

Explainable AI (XAI)

Explainability is crucial in regulated environments like healthcare, finance, and government. Expect questions on intrinsic models versus post-hoc explanations.

Intrinsic / Interpretable models

These are explainable by design: linear regression, decision trees, and rule-based systems.

Post-hoc methods

These explain black-box models after prediction: LIME, SHAP, saliency maps, attention maps, and surrogate models.

Counterfactual explanations

Answer “What would need to change for a different outcome?” Great for fairness and user-facing explanation.

Which methods are common in regulated AI?

SHAP
Widely used for consistent feature contribution explanations.
LIME
Good for local explanations of a single prediction.
Counterfactuals
Useful for regulatory and fairness-oriented explanation.
Decision trees / rules
Naturally interpretable and easy to audit.

Comparison matrix

Method Type What it explains Best use
Linear modelsIntrinsicDirect coefficientsSimple interpretable predictions
Decision treesIntrinsicDecision pathAuditable human-readable branching
Rule-based modelsIntrinsicHuman-written rulesCompliance and SIEM style logic
LIMEPost-hocLocal explanationOne prediction at a time
SHAPPost-hocFeature contributionsConsistent explanation across models
Feature importanceModel-specificGlobal variable influenceTree ensembles and boosted models
Attention mapsModel-specificToken focusTransformer / LLM interpretation
Saliency mapsVision XAIImportant pixelsComputer vision
CounterfactualsModel analysisWhat would change the outcomeFairness and user explanation

LIME vs SHAP

LIME: perturbs the input around one prediction and fits a simple local surrogate model.
SHAP: uses Shapley-value ideas from game theory to attribute each feature’s contribution.
Exam clue: if the wording emphasizes consistency or mathematically grounded feature contribution, SHAP is a strong answer.

Practical examples

Decision tree: “IF age > 65 AND BP > 140, then high risk.”
Counterfactual: “If credit score were 720 instead of 680, the loan would be approved.”
Attention map: Shows which tokens influenced the LLM most.

AI-assisted security operations

This exam also covers how AI helps defenders: anomaly detection, alert correlation, incident enrichment, and automation through SOAR-style playbooks.

UEBA

User and Entity Behavior Analytics baselines behavior and flags anomalies, such as strange access time or impossible travel.

Alert Correlation

Combines many raw alerts into one incident storyline across email, endpoint, and network tools.

SOAR Playbooks

Automates consistent response steps such as isolating a host, disabling an account, and creating a case.

Next-gen SIEM

Uses AI/ML to reduce false positives and prioritize likely-real events.

Defensive vs adversarial AI

Generative AI
Defense: reports, patching support. Attack: phishing, deepfakes.
Anomaly detection
Defense: find intrusions. Attack: learn “normal” to hide.
Neural networks
Defense: malware classification. Attack: evasion and adversarial noise.

Sample ransomware playbook

Suspicious encryption event detected
Correlate endpoint + identity + email telemetry
Isolate host
Disable user account / revoke session
Create ticket and notify responders

Common testable use cases

  • Behavior-based insider threat detection
  • False-positive reduction in SIEM queues
  • Automated triage of phishing emails
  • Malware classification and clustering
  • Incident prioritization and summarization
  • Automated compliance evidence mapping

20-question practice exam

Answer all questions, then click grade. Explanations appear automatically so you can use this as both an assessment and a cram sheet.

Top exam tricks

  • After deployment → likely inference attack
  • “Was this record used?” → membership inference
  • “Reconstruct the record” → inversion
  • Bias problem → representative data / governance
  • “BEST” answer may be the simplest strong control

Quick recall pairs

Govern → oversight and culture
Map → context and harms
Measure → metrics and testing
Manage → mitigate and monitor
Vector DB → embeddings
Model Registry → model versions

Glossary flashcards

Click a card to flip it. Use search in the header to jump to matching terms.

Expandable cram notes