A clear, practical walkthrough of the ML ideas and AWS services that show up again and again on AIF-C01. Built for quick review, not to read like a wall of notes.
Before any topic, fix this mental model in your head:
Every exam question is testing one or more of these four layers. When you read a question, mentally walk down this ladder.
The "type" answers one question: What kind of data did the model learn from?
The intuition: Imagine teaching a child what a cat is by showing them 1,000 photos labeled "cat" and 1,000 photos labeled "not cat." The labels are the supervision. The child (model) learns to map photo → label.
Mechanics:
Learn a mapping: input → output.

Two flavors of supervised learning:
| Flavor | Output | Example |
|---|---|---|
| Classification | A category | Spam or not spam |
| Regression | A number | Predicted house price |
Real examples:
The intuition: Same child, but now you dump a pile of photos in front of them with no labels and say: "Sort these into groups however you want." The child might separate by color, by animal type, by background. They find structure that was already there but hidden.
Mechanics:
Real examples:
The intuition: Think of training a dog. The dog tries something → you give a treat (reward) or say "no" (penalty). Over many trials, the dog learns which actions lead to treats. There's no labeled dataset; there's a goal and feedback.
Mechanics:
Real examples:
The intuition: You have 1,000 medical X-rays carefully labeled by a radiologist (very expensive!) and 100,000 unlabeled X-rays sitting in a database. Throwing away the unlabeled ones is wasteful. Semi-supervised learning uses both.
Mechanics:
Real examples:
The "problem type" answers: What shape is the output?
This is different from "type of ML." A supervised problem can be classification or regression; those are different problem types.
Output is discrete (a finite set of choices).
| Input | Output |
|---|---|
| Email text | Spam / Not spam |
| Transaction | Fraud / Legitimate |
| X-ray image | Disease / No disease / Inconclusive |
Output is continuous (a number on a scale).
| Input | Output |
|---|---|
| House details | Price ($427,500) |
| Last 30 days of sales | Revenue forecast ($1.2M) |
| Weather data | Energy demand (450 MWh) |
Key distinction from classification: "Will this customer churn?" = classification. "How many days until they churn?" = regression.
No labels exist. Model finds groups.
Examples:
Find rare events that don't match the normal pattern.
Examples:
Why it's special: You usually have very few anomaly examples, so pure supervised learning struggles. It's often handled as an unsupervised or one-class problem.
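To make this concrete, here is a minimal unsupervised sketch using scikit-learn's IsolationForest (a stand-in for the same idea; SageMaker's built-in equivalent is Random Cut Forest). The data is synthetic and purely illustrative.

```python
# Sketch: unsupervised anomaly detection with IsolationForest (scikit-learn).
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(loc=50, scale=5, size=(995, 2))    # typical transactions
weird = rng.uniform(low=200, high=300, size=(5, 2))    # a few rare outliers
X = np.vstack([normal, weird])

detector = IsolationForest(contamination=0.01, random_state=0)  # expect ~1% anomalies
flags = detector.fit_predict(X)                                  # -1 = anomaly, 1 = normal
print((flags == -1).sum(), "points flagged as anomalous")
```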
Predict what a user will like based on their history and similar users' behavior.
Examples:
| Problem | ML Type | Problem Type |
|---|---|---|
| Spam filter | Supervised | Classification |
| Predict tomorrow's stock price | Supervised | Regression |
| Group products by similarity | Unsupervised | Clustering |
| Find rare network attacks | Unsupervised (usually) | Anomaly detection |
| "You may also like..." | Specialized | Recommendation |
This is the lifecycle of every ML system. Memorize the order.
Gather raw data: databases, logs, images, audio, IoT sensors, clickstream, public datasets.
Clean the raw data so the model can use it.
| Task | What it does |
|---|---|
| Remove duplicates | Same row twice will mislead training |
| Handle missing values | Fill (imputation) or drop |
| Normalize / standardize | Scale numeric values (e.g., height in cm and salary in $ are on wildly different scales, which is bad for many models) |
| Convert formats | Image → tensor, text → tokens |
| Remove noise | Drop irrelevant or corrupted rows |
| Encode categories | "Country = India" → one-hot or embedding |
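A minimal preprocessing sketch with pandas and scikit-learn covering imputation, scaling, and one-hot encoding; the column names and values are hypothetical.

```python
# Sketch: impute missing values, scale numerics, one-hot encode a category.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.DataFrame({
    "height_cm": [170, 182, None, 165],            # has a missing value
    "salary": [40_000, 85_000, 60_000, 52_000],    # wildly different scale from height
    "country": ["India", "US", "India", "DE"],     # categorical
})

numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),  # handle missing values
    ("scale", StandardScaler()),                   # normalize / standardize
])
preprocess = ColumnTransformer([
    ("num", numeric, ["height_cm", "salary"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["country"]),  # encode categories
])

X = preprocess.fit_transform(df)
print(X.shape)  # 4 rows: 2 scaled numeric columns + 3 one-hot columns
```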
Transform raw fields into useful features that help the model learn.
Why features matter: A model can't extract every pattern on its own. Smart features make patterns obvious.
Examples:
- date_of_birth = 1995-03-12 → feature: age = 30
- transaction_timestamp → features: hour_of_day, day_of_week, is_weekend
- address → features: city, pincode, is_metro
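A small pandas sketch of the same idea, deriving age and time-of-day features from raw fields; the field names mirror the examples above and the reference date is made up.

```python
# Sketch: feature engineering with pandas (field names and dates are illustrative).
import pandas as pd

df = pd.DataFrame({
    "date_of_birth": pd.to_datetime(["1995-03-12", "1988-07-30"]),
    "transaction_timestamp": pd.to_datetime(["2025-01-04 21:15", "2025-01-06 09:05"]),
})

now = pd.Timestamp("2025-06-01")
df["age"] = (now - df["date_of_birth"]).dt.days // 365        # raw DOB -> age
df["hour_of_day"] = df["transaction_timestamp"].dt.hour       # time-of-day pattern
df["day_of_week"] = df["transaction_timestamp"].dt.dayofweek  # 0 = Monday
df["is_weekend"] = df["day_of_week"] >= 5                     # weekend flag
print(df[["age", "hour_of_day", "day_of_week", "is_weekend"]])
```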
The model learns by adjusting its internal parameters to minimize prediction error.

| Component | Meaning |
|---|---|
| Algorithm | The learning method (e.g., XGBoost, neural network, k-means) |
| Training data | Examples used to teach |
| Parameters | Internal numbers the model learns (weights, biases) |
| Loss function | Measures how wrong the predictions are |
| Optimizer | Adjusts parameters to reduce loss (e.g., gradient descent) |
| Hyperparameters | Settings YOU choose before training (learning rate, batch size) |
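A toy NumPy sketch of how these pieces fit together: a loss function (MSE), an optimizer (plain gradient descent), learned parameters (w, b), and a hyperparameter you choose (the learning rate). The data is synthetic.

```python
# Toy training loop: learn w and b for y = w*x + b with gradient descent.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 5.0 + rng.normal(0, 1, size=100)  # ground truth: w=3, b=5, plus noise

w, b = 0.0, 0.0              # parameters: learned by the optimizer
learning_rate = 0.01         # hyperparameter: chosen by you before training

for _ in range(2000):        # training loop
    pred = w * x + b
    error = pred - y
    loss = np.mean(error ** 2)           # loss function: mean squared error
    grad_w = 2 * np.mean(error * x)      # gradient of the loss w.r.t. w
    grad_b = 2 * np.mean(error)          # gradient of the loss w.r.t. b
    w -= learning_rate * grad_w          # optimizer step: nudge parameters
    b -= learning_rate * grad_b          # in the direction that reduces loss

print(round(w, 2), round(b, 2))          # close to 3.0 and 5.0
```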
Test the trained model on data it has never seen. This tells you if it actually generalizes or if it just memorized.
Metrics depend on the problem type β we'll cover these in detail in Topic 10.
Put the model into production so users / apps can use it.
| Type | When to use |
|---|---|
| Real-time inference | Need an answer in <1 second (chatbot, fraud check) |
| Batch inference | Score millions of records overnight (no rush) |
| Async inference | Large input, takes minutes to process (video analysis) |
| Edge deployment | Model runs on the device itself (phone, IoT sensor) |
After deployment, watch for problems:
| Issue | What it means |
|---|---|
| Data drift | Input data has changed (e.g., a new product line the model never saw) |
| Model drift / concept drift | Relationship between input and output has changed (e.g., COVID changed shopping patterns) |
| Bias | Model treats groups unfairly |
| Latency | Predictions getting slower |
| Cost | Endpoint is too expensive |
| Errors | Failed inference requests |
When monitoring detects problems, you retrain the model with new data. This is the loop.
Adding the correct answer to each training example. Required for supervised learning.
AWS service: SageMaker Ground Truth. It coordinates human labelers (Mechanical Turk-style) and uses ML to speed up the process (active learning + auto-labeling).
You split your dataset into 3 parts:
| Split | Purpose | Typical size |
|---|---|---|
| Training set | Model learns from this | 70–80% |
| Validation set | Tune hyperparameters; pick best model | 10–15% |
| Test set | Final unbiased evaluation, used once | 10–15% |
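A minimal scikit-learn sketch of a 70/15/15 split; the dataset here is just a convenient built-in example.

```python
# Sketch: split a dataset into train / validation / test (roughly 70/15/15).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)

# First carve out 30% that the model never trains on...
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.30, random_state=42)
# ...then split that 30% in half: validation (tune) and test (final, used once).
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.50, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # roughly 70% / 15% / 15% of the rows
```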
The problem: Fraud detection has 1% fraud, 99% legit. A naive classifier that always predicts "legit" gets 99% accuracy and is useless.
Fixes:
| Technique | What it does |
|---|---|
| Oversampling minority class | Duplicate (or synthesize) fraud examples |
| Undersampling majority class | Drop some legit examples |
| SMOTE | Generates synthetic minority samples |
| Class weights | Tell the model "mistakes on fraud cost 99x more" |
| Use F1 / Recall instead of accuracy | Fix the metric, not just the data |
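A short scikit-learn sketch of the class-weights fix on synthetic ~1% "fraud" data; the numbers and separability are illustrative.

```python
# Sketch: class weights on imbalanced data; judge with recall, not accuracy.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic data: ~1% positive ("fraud") class.
X, y = make_classification(n_samples=20_000, weights=[0.99, 0.01], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

plain = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
weighted = LogisticRegression(max_iter=1000, class_weight="balanced").fit(X_tr, y_tr)

# The weighted model typically catches more of the rare positives (higher recall),
# at the cost of more false alarms.
print("recall (plain):   ", recall_score(y_te, plain.predict(X_te)))
print("recall (weighted):", recall_score(y_te, weighted.predict(X_te)))
```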
Create modified versions of existing data to expand your training set artificially.
For images: rotate, crop, flip, change brightness, add noise, zoom.
For text: synonym replacement, back-translation, random word dropout.
For audio: time-stretching, pitch-shifting, adding background noise.
Purpose: Reduces overfitting and helps generalization. The model sees more "variety" without you collecting new data.
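A minimal image-augmentation sketch, assuming torchvision is available; the exact transforms and settings are illustrative.

```python
# Sketch: common image augmentations applied on the fly during training (torchvision).
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),                # flip
    transforms.RandomRotation(degrees=15),                 # rotate
    transforms.ColorJitter(brightness=0.2, contrast=0.2),  # change brightness / contrast
    transforms.RandomResizedCrop(size=224),                # crop + zoom
    transforms.ToTensor(),
])

# Typically passed to a dataset so each epoch sees slightly different images, e.g.:
# dataset = torchvision.datasets.ImageFolder("train/", transform=augment)
```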
This distinction shows up constantly.
Numbers the model learns automatically during training.
Examples:
You don't set these. The optimizer does.
Numbers YOU choose before (or during) training.
Examples:
You tune these. This is called hyperparameter tuning: try different combinations and see which gives the best validation performance.
| Type | Set by | Example |
|---|---|---|
| Parameter | Model (during training) | Neural network weights |
| Hyperparameter | You (before training) | Learning rate |
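A tiny scikit-learn sketch of the distinction: C and max_iter are hyperparameters you pick up front; coef_ and intercept_ are parameters the optimizer learns.

```python
# Sketch: hyperparameters are set before training; parameters come out of training.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# Hyperparameters: chosen by you before training.
model = LogisticRegression(C=0.1, max_iter=5000)

model.fit(X, y)

# Parameters: learned by the optimizer during training.
print(model.coef_.shape)   # learned weights, one per feature
print(model.intercept_)    # learned bias
```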
The model learns.
| Factor | Training |
|---|---|
| Compute | Very high (GPUs) |
| Time | Long (hours to weeks) |
| Cost | High |
| Data | Large dataset |
| Output | A trained model |
| Frequency | Periodic (once, then retrained) |
The model predicts on new inputs.
| Factor | Inference |
|---|---|
| Compute | Lower (but scales with traffic) |
| Time | Should be fast |
| Cost | Depends on volume |
| Data | One input at a time (or a batch) |
| Output | A prediction |
| Frequency | Continuous / on-demand |
| Need | AWS option |
|---|---|
| Immediate (<1s) response | Real-time inference (SageMaker Real-time Endpoint) |
| Score a huge dataset offline | Batch Transform |
| Large payloads / long processing, async OK | Async Inference |
| Sporadic, unpredictable traffic | Serverless inference |
This mapping is on the exam in many forms. Memorize it.
This is the #1 ML concept. The exam will test it many ways.
The intuition: A student who memorizes the answer key but doesn't understand the subject. Aces practice tests, fails the real exam.
The model: Learns the noise in the training data, not just the pattern. Performs great on training data, poorly on new data.
Signs:
Causes:
Fixes (memorize these; an exam favorite):
| Fix | How it helps |
|---|---|
| More training data | Harder to memorize a bigger set |
| Regularization (L1, L2) | Penalizes overly complex models |
| Reduce model complexity | Smaller network, fewer features |
| Dropout (neural nets) | Randomly turns off neurons during training |
| Early stopping | Stop training when validation accuracy plateaus |
| Cross-validation | Better estimate of true performance |
| Data augmentation | Create variations of existing data |
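A minimal Keras sketch showing three of these fixes in one small model (L2 regularization, dropout, early stopping), assuming TensorFlow is installed; the layer sizes and settings are illustrative.

```python
# Sketch: L2 regularization + dropout + early stopping in a small Keras model.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(
        64, activation="relu",
        kernel_regularizer=tf.keras.regularizers.l2(0.01)),  # L2: penalize large weights
    tf.keras.layers.Dropout(0.5),                            # dropout: randomly disable neurons
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

early_stop = tf.keras.callbacks.EarlyStopping(               # early stopping: halt when
    monitor="val_loss", patience=3, restore_best_weights=True)  # validation stops improving

# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           epochs=100, callbacks=[early_stop])
```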
The intuition: A student who studied only the chapter titles. Bad at practice tests, bad at the real exam.
The model: Too simple to capture the pattern. Performs poorly on both training and test data.
Signs:
Causes:
Fixes:
Closely related to overfitting/underfitting, but a deeper view.
Error from oversimplified assumptions. The model is too rigid to capture reality.
Example: Trying to fit a straight line to data that's shaped like a curve. No matter how much data you give it, a line can't bend.
High bias = underfitting.
Error from being too sensitive to the specific training set. Tiny changes in training data → wildly different model.
Example: A super-flexible squiggly curve that passes through every training point exactly. Slightly different data → totally different squiggle.
High variance = overfitting.
You can't easily have zero bias AND zero variance. Reducing one often increases the other.
| Situation | Bias | Variance | Outcome |
|---|---|---|---|
| Underfitting | High | Low | Model too simple |
| Overfitting | Low | High | Model too complex |
| Sweet spot | Low | Low | Generalizes well ✅ |
The art of ML is finding the sweet spot.
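A small scikit-learn sketch of the tradeoff: fit polynomials of increasing degree to the same noisy curve and compare training vs test scores. The data is synthetic and the degrees are illustrative.

```python
# Sketch: bias vs variance, shown by varying polynomial degree.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(30, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.2, size=30)   # curved ground truth + noise
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression()).fit(X_tr, y_tr)
    print(degree,
          "train R2:", round(model.score(X_tr, y_tr), 2),
          "test R2:", round(model.score(X_te, y_te), 2))
# degree 1: poor on both (high bias, underfits: a line can't bend)
# degree 15: near-perfect on train, worse on test (high variance, overfits)
# a middle degree lands nearest the sweet spot for this data
```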
This is where many candidates lose points. Master this section.
For binary classification, every prediction lands in one of 4 boxes:
|  | Predicted Positive | Predicted Negative |
|---|---|---|
| Actually Positive | True Positive (TP) ✅ | False Negative (FN) ❌ (missed it) |
| Actually Negative | False Positive (FP) ❌ (false alarm) | True Negative (TN) ✅ |
Memorize this. All metrics derive from it.
Plain English: Of all predictions, how many were right?
Use when: Classes are roughly balanced (50/50 or close).
Plain English: Of everything I flagged as positive, how many actually were?
Use when: False positives are expensive. You want to be sure when you say "positive."
Classic example (spam filter):
Plain English: Of all the actual positives out there, how many did I catch?
Use when: False negatives are expensive. You can't afford to miss a positive case.
Classic example (cancer screening):
You can usually trade one for the other by adjusting the decision threshold.
Choose based on what's costlier in your domain.
Plain English: Harmonic mean of precision and recall. A single number that balances both.
Use when:
ROC curve: Plots Recall vs False Positive Rate at every possible threshold.
AUC (Area Under Curve): A single number from 0 to 1.
Use when:
Plain English: "How well does the model separate positives from negatives, regardless of where I draw the line?"
| Metric | Meaning |
|---|---|
| MAE (Mean Absolute Error) | Average of \|prediction - actual\| |
| MSE (Mean Squared Error) | Average of (prediction - actual)²; penalizes big errors more |
| RMSE | Square root of MSE; same units as the target |
| R² | How much of the variance the model explains (0 to 1) |
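The same idea for regression, on a handful of made-up house prices:

```python
# Sketch: regression metrics (scikit-learn).
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = [410_000, 250_000, 330_000, 515_000]   # actual house prices
y_pred = [427_500, 240_000, 300_000, 505_000]   # model predictions

mae = mean_absolute_error(y_true, y_pred)       # average size of the error
mse = mean_squared_error(y_true, y_pred)        # squares errors, so big misses dominate
rmse = np.sqrt(mse)                             # back in the target's units (dollars)
r2 = r2_score(y_true, y_pred)                   # share of variance explained

print(mae, rmse, round(r2, 3))
```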
| Scenario | Best Metric |
|---|---|
| Balanced classes | Accuracy |
| False positive is dangerous | Precision |
| False negative is dangerous | Recall |
| Imbalanced data, balance both | F1 |
| Compare binary classifiers, threshold-independent | AUC-ROC |
| Spam filter | Precision |
| Medical diagnosis / cancer detection | Recall |
| Fraud detection | F1 or Recall |
| Regression problem | RMSE / MAE / R² |
You don't need to derive backprop. You need to know what each piece does.
A single neuron computes a weighted sum of its inputs plus a bias, then passes the result through an activation function: output = activation(w·x + b).
| Layer | Job |
|---|---|
| Input layer | Receives raw features |
| Hidden layer(s) | Learn intermediate patterns (more layers = "deeper" = deep learning) |
| Output layer | Produces the final prediction |
Without activation functions, a neural network is just a glorified linear model. Activations let it learn curves and complex shapes.
| Activation | When used |
|---|---|
| ReLU (Rectified Linear Unit) | Default for hidden layers in most modern networks |
| Sigmoid | Output layer for binary classification (squashes to 0–1, interpretable as probability) |
| Softmax | Output layer for multi-class classification (gives probabilities that sum to 1) |
| Tanh | Older networks, some sequence models |
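A NumPy sketch of what one neuron computes and what the common activations do; the weights and inputs are made up.

```python
# Sketch: one neuron = weighted sum + bias, then an activation.
import numpy as np

def relu(z):
    return np.maximum(0, z)           # default for hidden layers

def sigmoid(z):
    return 1 / (1 + np.exp(-z))       # binary output: squashes to a 0-1 "probability"

def softmax(z):
    e = np.exp(z - np.max(z))         # multi-class output: probabilities that sum to 1
    return e / e.sum()

x = np.array([0.5, -1.2, 3.0])        # inputs (features)
w = np.array([0.8, 0.1, -0.4])        # weights (learned parameters)
b = 0.2                               # bias (learned parameter)

z = np.dot(w, x) + b                  # weighted sum of inputs plus bias
print(relu(z), sigmoid(z))            # the activation turns the sum into the neuron's output
print(softmax(np.array([2.0, 1.0, 0.1])))   # e.g. three class probabilities summing to 1
```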
The intuition: The network makes a prediction. You compare it to the truth. The error gets propagated backward through the network, and each weight is nudged in the direction that would have reduced the error.
This is how a network learns.
Three families dominate the exam.
Best for: images.
Why: A regular neural network treats each pixel independently. A CNN uses convolutional filters that detect local patterns (edges, textures, shapes) and stacks them into higher-level concepts (eye → face).
Use cases:
Best for: sequential / time-ordered data.
Why: RNNs have a "memory": output at time t depends on input at time t AND the hidden state from time t-1. This lets them process sequences.
Use cases:
Weakness: Struggles with long-range dependencies. If a sentence is 50 words long, by the time the RNN reaches word 50, it has largely forgotten word 1. Transformers fixed this.
Best for: language and generative AI. The architecture behind every modern LLM (GPT, Claude, Llama, etc.).
Why they dominate: Instead of processing tokens one at a time like RNNs, transformers use an attention mechanism that lets every token directly look at every other token. This means:
Use cases:
| Data shape | Use |
|---|---|
| Images, video | CNN |
| Time series, sensor streams | RNN (or modern variants) |
| Text, language, generative tasks | Transformer |
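To make the attention idea concrete, here is a tiny NumPy sketch of scaled dot-product attention (the core operation inside a transformer); the sequence length, dimensions, and values are illustrative.

```python
# Sketch: scaled dot-product attention. Every token scores every other token.
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

seq_len, d = 4, 8                       # 4 tokens, 8-dimensional representations
rng = np.random.default_rng(0)
Q = rng.normal(size=(seq_len, d))       # queries
K = rng.normal(size=(seq_len, d))       # keys
V = rng.normal(size=(seq_len, d))       # values

scores = Q @ K.T / np.sqrt(d)           # each token's similarity to every other token
weights = softmax(scores, axis=-1)      # attention weights; each row sums to 1
output = weights @ V                    # each token's output mixes information from all tokens
print(weights.round(2))
```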
MLOps = DevOps applied to ML. Building and shipping models reliably.
Track every version of every model.
Why:
| Stage | Meaning |
|---|---|
| CI (Continuous Integration) | Test your code, data pipelines, and model training scripts automatically on every change |
| CD (Continuous Delivery) | Deploy new models safely (canary, blue-green) |
| CT (Continuous Training) | Retrain automatically when triggers fire |
| Monitor | Why |
|---|---|
| Accuracy / business metric | Is the model still useful? |
| Latency | Is inference still fast? |
| Errors | Are requests failing? |
| Data drift | Has input distribution shifted? |
| Model drift | Has prediction quality declined? |
| Bias | Is the model fair across groups? |
You retrain when:
AWS service: SageMaker Pipelines automates this whole loop.
SageMaker AI is AWS's flagship ML platform. It covers the full ML lifecycle: prepare data, build, train, tune, deploy, monitor.
The exam will test which feature of SageMaker handles which part of the lifecycle. Memorize these.
Three levels of control:
| Option | Control | Effort | When to use |
|---|---|---|---|
| Built-in algorithms | Low | Low | Standard problems (XGBoost, K-Means, etc.) |
| Script mode | Medium | Medium | Bring your own TensorFlow / PyTorch / scikit-learn script, SageMaker manages infrastructure |
| Custom containers | High | High | Full control: bring your own Docker image |
Common built-in algorithms:
| Problem | Built-in algorithm |
|---|---|
| Classification | XGBoost |
| Regression | Linear Learner |
| Clustering | K-Means |
| Anomaly detection | Random Cut Forest |
| Recommendation | Factorization Machines |
| Time series | DeepAR |
| Image classification | ResNet (built-in) |
| Topic modeling | LDA, Neural Topic Model |
| Deployment | When | Examples |
|---|---|---|
| Real-time Endpoint | Need an answer in <1 sec | Fraud check at payment, chatbot, product recs on a website |
| Batch Transform | Score huge dataset offline, no rush | Score all customers overnight, monthly churn batch |
| Async Inference | Large payload OR long processing, async OK | Video analysis, large document processing, 1GB inputs |
| Serverless Inference | Sporadic / unpredictable traffic, want to skip server management | Internal tool used a few times a day |
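A minimal sketch with the SageMaker Python SDK: train the built-in XGBoost algorithm and deploy a real-time endpoint. The role ARN, S3 paths, instance types, and data-format details are placeholders, not a production recipe.

```python
# Sketch: built-in XGBoost training job + real-time endpoint (SageMaker Python SDK).
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"   # placeholder role ARN

xgb_image = image_uris.retrieve("xgboost", region=session.boto_region_name, version="1.7-1")

estimator = Estimator(
    image_uri=xgb_image,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/models/",                        # placeholder bucket
    hyperparameters={"objective": "binary:logistic", "num_round": 100},
)
estimator.fit({"train": "s3://my-bucket/train/"})                # training job on S3 data

# Real-time inference: a persistent endpoint for sub-second responses.
predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.m5.large")
# For offline bulk scoring instead, create a Batch Transform job via estimator.transformer(...).
```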
Exam pattern:
These are API-based services. You don't train a model. You send data → AWS returns the AI result. Think of them as "AI as a service."
| Service | Purpose |
|---|---|
| Amazon Bedrock | Managed access to foundation models (Claude, Llama, Titan, Stable Diffusion, etc.) for generative AI apps. Pair with Knowledge Bases for RAG. |
| Amazon Q | Generative AI assistant for businesses (Q Business) and developers (Q Developer). |
| Amazon Lookout for Vision | Industrial defect detection in images |
| Amazon Lookout for Equipment | Predictive maintenance from sensor data |
| Amazon CodeWhisperer / Q Developer | AI code suggestions in IDE |
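A minimal boto3 sketch of the "send data → get result" pattern with two of these services; the bucket and object names are placeholders, and AWS credentials must already be configured.

```python
# Sketch: calling pre-built AI services directly via boto3.
import boto3

# Comprehend: sentiment / entities / key phrases in text.
comprehend = boto3.client("comprehend")
sentiment = comprehend.detect_sentiment(
    Text="The delivery was late and the package was damaged.",
    LanguageCode="en",
)
print(sentiment["Sentiment"])            # e.g. NEGATIVE

# Rekognition: detect objects in an image stored in S3 (placeholder bucket/key).
rekognition = boto3.client("rekognition")
labels = rekognition.detect_labels(
    Image={"S3Object": {"Bucket": "my-bucket", "Name": "photo.jpg"}},
    MaxLabels=5,
)
print([label["Name"] for label in labels["Labels"]])   # detected objects
```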
This is the table you must know cold. Every exam has 5+ questions that map directly to this.
| Use case | AWS service |
|---|---|
| Detect objects / faces in images or videos | Rekognition |
| Moderate unsafe image / video content | Rekognition |
| Extract text from scanned documents | Textract |
| Extract tables / forms / key-value from invoices | Textract |
| Convert speech / audio to text | Transcribe |
| Convert text to natural-sounding speech | Polly |
| Translate text between languages | Translate |
| Sentiment / entities / key phrases in text | Comprehend |
| Build chatbot / voice bot | Lex |
| Forecast future demand / sales | Forecast |
| Recommend products / content to users | Personalize |
| Search across enterprise documents | Kendra |
| Detect fraud | Fraud Detector (legacy) or SageMaker |
| Detect anomalies in business metrics | Lookout for Metrics |
| Build / train / deploy custom ML model | SageMaker AI |
| Label training data | SageMaker Ground Truth |
| Auto-build ML models | SageMaker Autopilot |
| Detect bias / explain predictions | SageMaker Clarify |
| Monitor drift in production model | SageMaker Model Monitor |
| Store / reuse ML features | SageMaker Feature Store |
| Use pretrained / foundation models | SageMaker JumpStart (or Bedrock for hosted FMs) |
| Automate ML workflow | SageMaker Pipelines |
| Real-time predictions | SageMaker Real-time Endpoint |
| Offline bulk predictions | SageMaker Batch Transform |
| Long-running / large-payload inference | SageMaker Async Inference |
| Generative AI app using foundation models | Amazon Bedrock |
| Detect manufacturing defects in images | Lookout for Vision |
| Predictive maintenance from sensors | Lookout for Equipment |
Accuracy is almost never the correct answer on the exam. If the dataset is imbalanced (fraud, disease, etc.), use F1, Recall, or AUC-ROC.
Mnemonic: Transcribe writes down what was said. Polly reads things out loud.
| Situation | Choose |
|---|---|
| There's a ready API for this exact task | Pre-built AI service |
| You need a custom model on your own data | SageMaker AI |
| You want full ML lifecycle control | SageMaker AI |
| Team has no ML expertise | Pre-built AI service or Autopilot / Canvas |
A common wrong answer: "add regularization." That's the fix for overfitting. For underfitting: reduce regularization, add model complexity, or train longer.
The first answer is rarely "tune the model"; it's "collect more representative data" or "use SageMaker Clarify to detect and mitigate bias."
| Concept | Remember this |
|---|---|
| Supervised | Labeled data |
| Unsupervised | No labels, find patterns |
| Reinforcement | Agent learns by reward |
| Semi-supervised | Few labels + many unlabeled |
| Classification | Predict category |
| Regression | Predict number |
| Clustering | Group similar things |
| Anomaly detection | Find unusual behavior |
| Recommendation | Suggest relevant items |
| Overfitting | Memorized training data, fails on new |
| Underfitting | Too simple to learn |
| Bias | Too simple assumptions (underfit) |
| Variance | Too sensitive to training data (overfit) |
| Precision | Avoid false positives |
| Recall | Avoid false negatives |
| F1 | Balance precision and recall |
| AUC-ROC | Class separation ability |
| CNN | Images |
| RNN | Sequences |
| Transformer | Language / GenAI / LLMs |
| Parameter | Model learns |
| Hyperparameter | You tune |
| Training | Model learns |
| Inference | Model predicts |
| Real-time inference | <1 second response |
| Batch Transform | Offline, large dataset |
| Async Inference | Large payload, long processing |
| Ground Truth | Label data |
| Autopilot | AutoML |
| Clarify | Bias / explainability |
| Model Monitor | Drift monitoring |
| Feature Store | Reuse features across teams |
| JumpStart | Pretrained models, foundation models |
| Pipelines | ML workflow automation |
| Bedrock | Foundation models as a service |
| Rekognition | Image / video |
| Textract | Document / OCR |
| Comprehend | NLP analysis |
| Transcribe | Speech → Text |
| Polly | Text → Speech |
| Translate | Language translation |
| Lex | Chatbots |
| Forecast | Time-series forecasting |
| Personalize | Recommendations |
| Kendra | Enterprise search |
| Lookout for Metrics | Anomalies in KPIs |
You will not fail this exam on definitions. You will fail it on service selection under pressure. So focus accordingly.
Answer these in your head. If you stumble, re-read the relevant topic.
If you got 8+ correct, you're in great shape for Day 1. Don't try to do practice MCQs yet; let this material settle. Tomorrow, hit practice questions hard.
End of Day 1 guide. Re-read Topics 19 + 21 right before sleep; the last thing you read sticks best.