🎯 ML Fundamentals + AWS Services

AWS AIF-C01 — Day 1 Complete Study Guide

A clear, practical walkthrough of the ML ideas and AWS services that show up again and again on AIF-C01. Built for quick review, not for reading like a wall of notes.

Best used from start to finish, then Topic 19 for service mapping and Topic 21 for final revision.
Topic 1
The Big Picture (read this first)

Before any topic, fix this mental model in your head:

Real-world problem
   ↓ What KIND of ML is this? → Supervised / Unsupervised / Reinforcement / Semi-supervised
   ↓ What kind of OUTPUT do I need? → Classification / Regression / Clustering / Anomaly / Recommendation
   ↓ What AWS SERVICE fits? → SageMaker (custom) OR Pre-built AI service (Rekognition, Comprehend, etc.)
   ↓ How will it be USED? → Real-time / Batch / Async inference

Every exam question is testing one or more of these four layers. When you read a question, mentally walk down this ladder.

Topic 2
Types of Machine Learning

The "type" answers one question: What kind of data did the model learn from?

2.1 Supervised Learning

The intuition: Imagine teaching a child what a cat is by showing them 1,000 photos labeled "cat" and 1,000 photos labeled "not cat." The labels are the supervision. The child (model) learns to map photo → label.

Mechanics:

- Training examples are (input, label) pairs.
- The model adjusts its internal parameters to shrink the gap between its predictions and the known labels.
- Once trained, it predicts labels for new, unseen inputs.

Two flavors of supervised learning:

| Flavor | Output | Example |
| --- | --- | --- |
| Classification | A category | Spam or not spam |
| Regression | A number | Predicted house price |

Real examples:

- Spam filtering (email text → spam / not spam)
- House price prediction (property details → price)
- Credit default prediction (applicant data → default / no default)

🔑 Exam clue words: "labeled data", "historical data with known outcomes", "predict category", "predict value", "training examples include the answer"
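To make this concrete, here's a minimal sketch (illustrative only; it uses scikit-learn and made-up toy data, neither of which the exam requires):

```python
# Supervised learning in miniature: labeled examples in, a predictor out.
from sklearn.linear_model import LogisticRegression

# Each row of X is an input; y holds the known answers (the "supervision").
X = [[0.1, 0.9], [0.8, 0.2], [0.2, 0.8], [0.9, 0.1]]
y = ["cat", "not cat", "cat", "not cat"]

model = LogisticRegression().fit(X, y)  # learn the input -> label mapping
print(model.predict([[0.15, 0.85]]))    # predict the label for an unseen input
```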

2.2 Unsupervised Learning

The intuition: Same child, but now you dump a pile of photos in front of them with no labels and say: "Sort these into groups however you want." The child might separate by color, by animal type, by background. They find structure that was already there but hidden.

Mechanics:

- Training examples have inputs only; no labels exist.
- The algorithm finds structure on its own: clusters, dimensions, or outliers.

Real examples:

- Customer segmentation from shopping behavior
- Grouping news articles by topic
- Reducing many features down to a few for visualization

🔑 Exam clue words: "unlabeled data", "discover patterns", "group similar items", "segment customers", "no predefined categories"
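The same idea in code, with no labels this time (a minimal sketch; scikit-learn and the toy points are illustrative):

```python
# Unsupervised learning in miniature: the algorithm finds the groups itself.
from sklearn.cluster import KMeans

X = [[1, 2], [1, 3], [2, 2],   # one natural cluster
     [8, 8], [9, 8], [8, 9]]   # another natural cluster

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)              # e.g., [1 1 1 0 0 0] -- discovered groups, no labels given
```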

2.3 Reinforcement Learning (RL)

The intuition: Think of training a dog. The dog tries something → you give a treat (reward) or say "no" (penalty). Over many trials, the dog learns which actions lead to treats. There's no labeled dataset — there's a goal and feedback.

Mechanics:

- An agent takes actions in an environment.
- The environment returns a reward (or penalty) and a new state.
- Over many trials, the agent learns a policy: which action to take in which state to maximize total reward.

Real examples:

- Game-playing agents (chess, Go, video games)
- Robotics and autonomous navigation
- AWS DeepRacer (a model race car trained with RL)

🔑 Exam clue words: "agent", "environment", "reward", "policy", "trial and error", "learn through interaction"
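A minimal sketch of the reward-feedback loop (an epsilon-greedy bandit, a deliberately simplified stand-in for full RL):

```python
# The agent learns which of two actions pays off -- purely from reward feedback.
import random

true_payouts = [0.3, 0.7]   # hidden reward probability of each action
estimates = [0.0, 0.0]      # the agent's learned value of each action
counts = [0, 0]

for _ in range(1000):
    # Explore 10% of the time; otherwise exploit the best-known action.
    if random.random() < 0.1:
        action = random.randrange(2)
    else:
        action = estimates.index(max(estimates))
    reward = 1 if random.random() < true_payouts[action] else 0  # environment feedback
    counts[action] += 1
    estimates[action] += (reward - estimates[action]) / counts[action]  # running average

print(estimates)            # converges toward the true payouts, ~[0.3, 0.7]
```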

2.4 Semi-supervised Learning

The intuition: You have 1,000 medical X-rays carefully labeled by a radiologist (very expensive!) and 100,000 unlabeled X-rays sitting in a database. Throwing away the unlabeled ones is wasteful. Semi-supervised learning uses both.

Mechanics:

- Train an initial model on the small labeled set.
- Use it to assign provisional labels (pseudo-labels) to the unlabeled data.
- Retrain on the combined set; repeat.

Real examples:

- Medical imaging, where expert labels are scarce and expensive
- Speech recognition with limited transcribed audio
- Document classification with a small hand-labeled seed set

🔑 Exam clue words: "limited labeled data", "large unlabeled dataset", "reduce labeling cost", "labeling is expensive"
🧠 Quick check
"A bank wants to detect fraudulent transactions. They have 5 years of transactions, each marked as fraud or legitimate."
✅ Supervised. Labels exist.
"A retailer wants to discover natural groups of customers based on shopping behavior. They have no predefined groups."
✅ Unsupervised. No labels.
"A robot vacuum learns the optimal cleaning path by trying routes and being rewarded for coverage."
✅ Reinforcement. Agent + reward.
Topic 3
ML Problem Types

The "problem type" answers: What shape is the output?

This is different from "type of ML." A supervised problem can be classification or regression — those are different problem types.

3.1 Classification — predict a category

Output is discrete (a finite set of choices).

| Input | Output |
| --- | --- |
| Email text | Spam / Not spam |
| Transaction | Fraud / Legitimate |
| X-ray image | Disease / No disease / Inconclusive |

🔑 Exam clue words: "which class?", "category", "yes/no", "type of object", "label"

3.2 Regression — predict a number

Output is continuous (a number on a scale).

| Input | Output |
| --- | --- |
| House details | Price ($427,500) |
| Last 30 days of sales | Revenue forecast ($1.2M) |
| Weather data | Energy demand (450 MWh) |

Key distinction from classification: "Will this customer churn?" = classification. "How many days until they churn?" = regression.

🔑 Exam clue words: "predict price", "forecast demand", "estimate value", "numeric output", "how much", "how many"

3.3 Clustering — group similar things (unsupervised)

No labels exist. Model finds groups.

Examples:

- Customer segmentation
- Grouping documents by topic
- Organizing products into similar categories

🔑 Exam clue words: "group similar", "segment", "no labels", "natural groupings"

3.4 Anomaly Detection — find the weird stuff

Find rare events that don't match the normal pattern.

Examples:

- Credit card fraud spikes
- Unusual network traffic (intrusions)
- Faulty sensor readings on a production line
Why it's special: You usually have very few anomaly examples. So pure supervised learning struggles. Often handled as unsupervised or one-class.

🔑 Exam clue words: "unusual", "outlier", "abnormal", "rare event", "deviation from normal", "strange spike"

3.5 Recommendation — suggest relevant items

Predict what a user will like based on their history and similar users' behavior.

Examples:

- "Customers who bought X also bought Y"
- Movie and music recommendations
- Personalized news feeds

🔑 Exam clue words: "personalized", "users who bought X also bought Y", "next best item", "recommend"
🧠 Map the problem to the type

| Problem | ML Type | Problem Type |
| --- | --- | --- |
| Spam filter | Supervised | Classification |
| Predict tomorrow's stock price | Supervised | Regression |
| Group products by similarity | Unsupervised | Clustering |
| Find rare network attacks | Unsupervised (usually) | Anomaly detection |
| "You may also like..." | Specialized | Recommendation |
Topic 4
The ML Pipeline (7 stages)

This is the lifecycle of every ML system. Memorize the order.

1. Data Collection
   ↓
2. Preprocessing (cleaning)
   ↓
3. Feature Engineering
   ↓
4. Training
   ↓
5. Evaluation
   ↓
6. Deployment
   ↓
7. Monitoring
   ↓ (back to 4 when retraining is needed)

Stage 1 — Data Collection

Gather raw data: databases, logs, images, audio, IoT sensors, clickstream, public datasets.

⚠️ Garbage in, garbage out. ML is not magic. If your data is biased, incomplete, or wrong, your model will be too. The AWS exam loves to test this — if the question says "model is biased," the answer is almost always "fix the data," not "tune the model."

Stage 2 — Preprocessing

Clean the raw data so the model can use it.

| Task | What it does |
| --- | --- |
| Remove duplicates | The same row twice will mislead training |
| Handle missing values | Fill (imputation) or drop |
| Normalize / standardize | Scale numeric values (e.g., height in cm and salary in $ are on wildly different scales — bad for many models) |
| Convert formats | Image → tensor, text → tokens |
| Remove noise | Drop irrelevant or corrupted rows |
| Encode categories | "Country = India" → one-hot or embedding |
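A minimal normalization sketch (illustrative; StandardScaler is one common choice):

```python
# Put features on comparable scales: zero mean, unit variance per column.
from sklearn.preprocessing import StandardScaler

X = [[170.0, 30000.0],   # height in cm, salary in $
     [180.0, 90000.0],
     [160.0, 45000.0]]

X_scaled = StandardScaler().fit_transform(X)
print(X_scaled)          # both columns now live on the same scale
```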

Stage 3 — Feature Engineering

Transform raw fields into useful features that help the model learn.

Why features matter: A model can't extract every pattern on its own. Smart features make patterns obvious.

Examples:

- Date of birth → age
- Timestamp → hour-of-day, day-of-week, is-weekend
- Price and quantity → total order value

💡 Old-school ML (like XGBoost) leans hard on feature engineering. Deep learning can often learn features automatically from raw data. The exam tests this distinction.

Stage 4 — Training

The model learns by adjusting its internal parameters to minimize prediction error.

| Component | Meaning |
| --- | --- |
| Algorithm | The learning method (e.g., XGBoost, neural network, k-means) |
| Training data | Examples used to teach |
| Parameters | Internal numbers the model learns (weights, biases) |
| Loss function | Measures how wrong the predictions are |
| Optimizer | Adjusts parameters to reduce loss (e.g., gradient descent) |
| Hyperparameters | Settings YOU choose before training (learning rate, batch size) |

Stage 5 — Evaluation

Test the trained model on data it has never seen. This tells you if it actually generalizes or if it just memorized.

Metrics depend on the problem type — we'll cover these in detail in Topic 10.

Stage 6 — Deployment

Put the model into production so users / apps can use it.

| Type | When to use |
| --- | --- |
| Real-time inference | Need an answer in <1 second (chatbot, fraud check) |
| Batch inference | Score millions of records overnight (no rush) |
| Async inference | Large input, takes minutes to process (video analysis) |
| Edge deployment | Model runs on the device itself (phone, IoT sensor) |

Stage 7 — Monitoring

After deployment, watch for problems:

| Issue | What it means |
| --- | --- |
| Data drift | Input data has changed (e.g., a new product line the model never saw) |
| Model drift / concept drift | The relationship between input and output has changed (e.g., COVID changed shopping patterns) |
| Bias | Model treats groups unfairly |
| Latency | Predictions are getting slower |
| Cost | Endpoint is too expensive |
| Errors | Failed inference requests |

When monitoring detects problems, you retrain the model with new data. This is the loop.

Topic 5
Training Data Concepts

5.1 Labeling

Adding the correct answer to each training example. Required for supervised learning.

AWS service: SageMaker Ground Truth. It coordinates human labelers (Mechanical Turk-style) and uses ML to speed up the process (active learning + auto-labeling).

5.2 Data Splits

You split your dataset into 3 parts:

| Split | Purpose | Typical size |
| --- | --- | --- |
| Training set | Model learns from this | 70–80% |
| Validation set | Tune hyperparameters; pick the best model | 10–15% |
| Test set | Final unbiased evaluation, used once | 10–15% |

⚠️ Cardinal rule: Never train on the test set. Never tune on the test set. The test set must remain "unseen" until the very end. If you peek, your accuracy number is a lie.
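A minimal split sketch (illustrative; two chained scikit-learn calls produce 70/15/15):

```python
# 70% train / 15% validation / 15% test from one dataset.
from sklearn.model_selection import train_test_split

X, y = list(range(100)), [i % 2 for i in range(100)]   # toy data

X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.30, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.50, random_state=0)

print(len(X_train), len(X_val), len(X_test))           # 70 15 15
```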

5.3 Class Imbalance

The problem: Fraud detection has 1% fraud, 99% legit. A naive classifier predicts "legit" always → 99% accuracy → useless.

Fixes:

| Technique | What it does |
| --- | --- |
| Oversampling the minority class | Duplicate (or synthesize) fraud examples |
| Undersampling the majority class | Drop some legit examples |
| SMOTE | Generates synthetic minority samples |
| Class weights | Tell the model "mistakes on fraud cost 99x more" |
| Use F1 / Recall instead of accuracy | Fix the metric, not just the data |
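A minimal class-weights sketch (illustrative; scikit-learn's built-in option):

```python
# Weight errors on the rare class more heavily instead of resampling.
from sklearn.linear_model import LogisticRegression

X = [[i] for i in range(100)]
y = [1 if i >= 99 else 0 for i in range(100)]   # 1% "fraud", 99% "legit"

# 'balanced' scales each class's weight inversely to its frequency,
# so the lone fraud example counts roughly 99x as much as a legit one.
model = LogisticRegression(class_weight="balanced").fit(X, y)
```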

5.4 Data Augmentation

Create modified versions of existing data to expand your training set artificially.

For images: rotate, crop, flip, change brightness, add noise, zoom.
For text: synonym replacement, back-translation, random word dropout.
For audio: time-stretching, pitch-shifting, adding background noise.

Purpose: Reduces overfitting and helps generalization. The model sees more "variety" without you collecting new data.
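A minimal image-augmentation sketch (pure NumPy; real pipelines use richer transforms):

```python
# Four cheap variants of one "image" (a stand-in 4x4 array).
import numpy as np

img = np.arange(16).reshape(4, 4)

augmented = [
    np.fliplr(img),                              # horizontal flip
    np.flipud(img),                              # vertical flip
    np.rot90(img),                               # 90-degree rotation
    img + np.random.normal(0, 0.1, img.shape),   # add a little noise
]
print(len(augmented), "variants from 1 original")
```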

Topic 6
Hyperparameters vs Parameters

This distinction shows up constantly.

6.1 Parameters

Numbers the model learns automatically during training.

Examples:

- Weights and biases in a neural network
- Split points in a decision tree
- Coefficients in a linear regression
You don't set these. The optimizer does.

6.2 Hyperparameters

Numbers YOU choose before (or during) training.

Examples:

- Learning rate
- Batch size
- Number of epochs
- Number of layers / tree depth
- Regularization strength

You tune these. This is called hyperparameter tuning — try different combinations and see which gives the best validation performance.

💡 AWS: SageMaker has automatic model tuning (also called hyperparameter optimization, HPO) that searches the hyperparameter space for you using Bayesian optimization or random search.
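A local analogue of that idea (a minimal sketch with scikit-learn's grid search, not the SageMaker API):

```python
# Try hyperparameter combinations; keep the best cross-validated one.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=200, random_state=0)

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1, 10]},   # C = inverse regularization strength
    cv=5,
)
search.fit(X, y)
print(search.best_params_)                  # the winning hyperparameter setting
```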

Memory hook

| Type | Set by | Example |
| --- | --- | --- |
| Parameter | Model (during training) | Neural network weights |
| Hyperparameter | You (before training) | Learning rate |
Topic 7
Inference vs Training

7.1 Training

The model learns.

| Factor | Training |
| --- | --- |
| Compute | Very high (GPUs) |
| Time | Long (hours to weeks) |
| Cost | High |
| Data | Large dataset |
| Output | A trained model |
| Frequency | Periodic (once, then retrained) |

7.2 Inference

The model predicts on new inputs.

| Factor | Inference |
| --- | --- |
| Compute | Lower (but scales with traffic) |
| Time | Should be fast |
| Cost | Depends on volume |
| Data | One input at a time (or a batch) |
| Output | A prediction |
| Frequency | Continuous / on-demand |

7.3 Inference Modes (Critical for AWS exam)

| Need | AWS option |
| --- | --- |
| Immediate (<1s) response | Real-time inference (SageMaker Real-time Endpoint) |
| Score a huge dataset offline | Batch Transform |
| Large payloads / long processing, async OK | Async Inference |
| Sporadic, unpredictable traffic | Serverless Inference |

This mapping is on the exam in many forms. Memorize it.
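For flavor, calling a real-time endpoint looks like this (a minimal boto3 sketch; the endpoint name and payload are hypothetical, and a deployed endpoint plus AWS credentials are assumed):

```python
# Send one input to a deployed SageMaker real-time endpoint, get one prediction back.
import boto3

runtime = boto3.client("sagemaker-runtime")

response = runtime.invoke_endpoint(
    EndpointName="my-fraud-model",   # hypothetical endpoint name
    ContentType="text/csv",
    Body="42.0,3,1500.00",           # one transaction's features
)
print(response["Body"].read())       # the model's prediction, within ~1s
```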

Topic 8
Overfitting vs Underfitting

This is the #1 ML concept. The exam will test it many ways.

8.1 Overfitting

The intuition: A student who memorizes the answer key but doesn't understand the subject. Aces practice tests, fails the real exam.

The model: Learns the noise in the training data, not just the pattern. Performs great on training data, poorly on new data.

Signs:

- High training accuracy, much lower validation/test accuracy
- A large gap between training and validation error

Causes:

- Model too complex for the amount of data
- Too little training data
- Training for too long

Fixes (memorize these — exam favorite):

| Fix | How it helps |
| --- | --- |
| More training data | Harder to memorize a bigger set |
| Regularization (L1, L2) | Penalizes overly complex models |
| Reduce model complexity | Smaller network, fewer features |
| Dropout (neural nets) | Randomly turns off neurons during training |
| Early stopping | Stop training when validation accuracy plateaus |
| Cross-validation | Better estimate of true performance |
| Data augmentation | Create variations of existing data |
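To see regularization's effect, a minimal sketch (illustrative; Ridge is scikit-learn's L2-regularized linear model):

```python
# L2 regularization shrinks coefficients, taming an overfit-prone setup.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 15))                  # few samples, many features
y = X[:, 0] + rng.normal(scale=0.1, size=20)   # truth uses one feature + noise

plain = LinearRegression().fit(X, y)
l2 = Ridge(alpha=10.0).fit(X, y)               # alpha = penalty strength

print(np.abs(plain.coef_).sum())               # unregularized: coefficients spread out
print(np.abs(l2.coef_).sum())                  # regularized: noticeably smaller
```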

8.2 Underfitting

The intuition: A student who studied only the chapter titles. Bad at practice tests, bad at the real exam.

The model: Too simple to capture the pattern. Performs poorly on both training and test data.

Signs:

- Low accuracy on both training and validation data

Causes:

- Model too simple (e.g., a straight line for curved data)
- Too few features
- Too much regularization
- Not trained long enough

Fixes:

- Increase model complexity
- Add better features
- Reduce regularization
- Train longer

🧠 Quick check
"Training accuracy is 98%, validation accuracy is 70%. What's wrong?"
✅ Overfitting. Big gap. Add regularization, get more data, or simplify.
"Both training and validation accuracy are 55%."
✅ Underfitting. Model is too simple. Increase complexity.
Topic 9
Bias vs Variance Tradeoff

Closely related to overfitting/underfitting, but a deeper view.

9.1 Bias

Error from oversimplified assumptions. The model is too rigid to capture reality.

Example: Trying to fit a straight line to data that's shaped like a curve. No matter how much data you give it, a line can't bend.

High bias = underfitting.

9.2 Variance

Error from being too sensitive to the specific training set. Tiny changes in training data → wildly different model.

Example: A super-flexible squiggly curve that passes through every training point exactly. Slightly different data → totally different squiggle.

High variance = overfitting.

9.3 The Tradeoff

You can't easily have zero bias AND zero variance. Reducing one often increases the other.

| Situation | Bias | Variance | Outcome |
| --- | --- | --- | --- |
| Underfitting | High | Low | Model too simple |
| Overfitting | Low | High | Model too complex |
| Sweet spot | Low | Low | Generalizes well ✅ |

The art of ML is finding the sweet spot.

💡 Mnemonic: Bias = Bad assumptions (too simple). Variance = Volatile (too sensitive).
Topic 10
Evaluation Metrics

This is where many candidates lose points. Master this section.

10.1 Setup β€” the Confusion Matrix

For binary classification, every prediction lands in one of 4 boxes:

| | Predicted Positive | Predicted Negative |
| --- | --- | --- |
| Actually Positive | True Positive (TP) ✅ | False Negative (FN) ❌ — missed it |
| Actually Negative | False Positive (FP) ❌ — false alarm | True Negative (TN) ✅ |

Memorize this. All metrics derive from it.

10.2 Accuracy

Accuracy = (TP + TN) / Total

Plain English: Of all predictions, how many were right?

Use when: Classes are roughly balanced (50/50 or close).

⚠️ The accuracy trap (huge exam trap): If 99% of transactions are legitimate and 1% are fraud, a model that always predicts "legitimate" gets 99% accuracy — but is completely useless. For imbalanced data, never use accuracy alone.

10.3 Precision

Precision = TP / (TP + FP)

Plain English: Of everything I flagged as positive, how many actually were?

Use when: False positives are expensive. You want to be sure when you say "positive."

Classic example — Spam filter:

- False positive: a real email lands in the spam folder and the user misses it. Costly.
- False negative: one spam message reaches the inbox. Annoying, but tolerable.
- So a spam filter should prioritize precision.
10.4 Recall (Sensitivity)

Recall = TP / (TP + FN)

Plain English: Of all the actual positives out there, how many did I catch?

Use when: False negatives are expensive. You can't afford to miss a positive case.

Classic example — Cancer screening:

- False negative: a sick patient is told they're healthy. Potentially fatal.
- False positive: a healthy patient gets a follow-up test. Inconvenient, but safe.
- So screening should prioritize recall.
10.5 The Precision–Recall Tradeoff

You can usually trade one for the other by adjusting the decision threshold.

- Raise the threshold → the model flags fewer positives → precision up, recall down.
- Lower the threshold → the model flags more positives → recall up, precision down.

Choose based on what's costlier in your domain.

10.6 F1 Score

F1 = 2 × (Precision × Recall) / (Precision + Recall)

Plain English: Harmonic mean of precision and recall. A single number that balances both.

Use when:

- Classes are imbalanced
- You care about both false positives and false negatives
- You need one number to compare models

💡 F1 is your default for imbalanced classification problems. AWS loves to ask "imbalanced fraud dataset, which metric?" — the answer is usually F1 or recall.
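A worked example tying the four metrics together (the counts are made up):

```python
# 1,000 transactions, 10 actual frauds. The model flags 12:
# 8 real frauds (TP), 4 false alarms (FP); it misses 2 frauds (FN).
TP, FP, FN = 8, 4, 2
TN = 1000 - TP - FP - FN                             # 986 correctly cleared

accuracy = (TP + TN) / 1000                          # 0.994 -- looks great, means little
precision = TP / (TP + FP)                           # 8/12 = 0.667
recall = TP / (TP + FN)                              # 8/10 = 0.800
f1 = 2 * precision * recall / (precision + recall)   # ~0.727

print(accuracy, precision, recall, f1)
```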

10.7 AUC-ROC

ROC curve: Plots Recall vs False Positive Rate at every possible threshold.

AUC (Area Under Curve): A single number from 0 to 1.

Use when:

- Comparing binary classifiers independent of any specific threshold
- You want overall ranking quality (0.5 = random guessing, 1.0 = perfect separation)
Plain English: "How well does the model separate positives from negatives, regardless of where I draw the line?"

10.8 Regression Metrics (briefly)

| Metric | Meaning |
| --- | --- |
| MAE (Mean Absolute Error) | Average of \|prediction − actual\| |
| MSE (Mean Squared Error) | Average of (prediction − actual)² — penalizes big errors more |
| RMSE | Square root of MSE — same units as the target |
| R² | How much of the variance the model explains (0 to 1) |
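A minimal sketch computing these by hand (NumPy only; the numbers are made up):

```python
# MAE, MSE, and RMSE from three predictions.
import numpy as np

actual = np.array([100.0, 200.0, 300.0])
pred = np.array([110.0, 190.0, 330.0])

errors = pred - actual
mae = np.abs(errors).mean()    # (10 + 10 + 30) / 3 = 16.67
mse = (errors ** 2).mean()     # (100 + 100 + 900) / 3 = 366.67
rmse = np.sqrt(mse)            # 19.15, in the target's own units

print(mae, mse, rmse)
```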

10.9 The Master Cheat Table — memorize this cold

| Scenario | Best Metric |
| --- | --- |
| Balanced classes | Accuracy |
| False positive is dangerous | Precision |
| False negative is dangerous | Recall |
| Imbalanced data, balance both | F1 |
| Compare binary classifiers, threshold-independent | AUC-ROC |
| Spam filter | Precision |
| Medical diagnosis / cancer detection | Recall |
| Fraud detection | F1 or Recall |
| Regression problem | RMSE / MAE / R² |
Topic 11
Neural Network Basics

You don't need to derive backprop. You need to know what each piece does.

11.1 The Neuron

A single neuron does:

output = activation( Σ (input × weight) + bias )
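That one line, as runnable code (a minimal sketch with a sigmoid activation; the numbers are arbitrary):

```python
# One neuron: weighted sum of inputs, plus bias, through an activation.
import math

def neuron(inputs, weights, bias):
    z = sum(x * w for x, w in zip(inputs, weights)) + bias  # sum(input * weight) + bias
    return 1 / (1 + math.exp(-z))                           # sigmoid activation

print(neuron([0.5, 0.8], [0.4, -0.6], 0.1))  # a single output in (0, 1)
```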

11.2 Layers

| Layer | Job |
| --- | --- |
| Input layer | Receives raw features |
| Hidden layer(s) | Learn intermediate patterns (more layers = "deeper" = deep learning) |
| Output layer | Produces the final prediction |

11.3 Activation Functions

Without these, a neural network is just a glorified linear model. Activations let it learn curves and complex shapes.

| Activation | When used |
| --- | --- |
| ReLU (Rectified Linear Unit) | Default for hidden layers in most modern networks |
| Sigmoid | Output layer for binary classification (squashes to 0–1, interpretable as a probability) |
| Softmax | Output layer for multi-class classification (gives probabilities that sum to 1) |
| Tanh | Older networks, some sequence models |
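The three to recognize, as a minimal NumPy sketch:

```python
# ReLU, sigmoid, and softmax side by side.
import numpy as np

def relu(z):
    return np.maximum(0, z)        # hidden layers: negatives clipped to 0

def sigmoid(z):
    return 1 / (1 + np.exp(-z))    # binary output: squashed into (0, 1)

def softmax(z):
    e = np.exp(z - np.max(z))      # subtract max for numeric stability
    return e / e.sum()             # multi-class output: sums to 1

z = np.array([2.0, -1.0, 0.5])
print(relu(z), sigmoid(z), softmax(z), softmax(z).sum())
```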

11.4 Backpropagation (Backprop)

The intuition: The network makes a prediction. You compare it to the truth. The error gets propagated backward through the network, and each weight is nudged in the direction that would have reduced the error.

Forward pass: Input → ... → Prediction
Error: Prediction vs Actual
Backward pass: Error → ... → Adjust weights
Repeat thousands of times.

This is how a network learns.
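The core idea fits in a few lines. A minimal sketch with a single weight and squared-error loss (real backprop applies this to millions of weights via the chain rule):

```python
# Gradient descent on one weight: nudge it against the error gradient.
w = 0.0                  # the single learnable parameter
x, y_true = 2.0, 8.0     # we want w * x to output 8 (so the true w is 4)
lr = 0.05                # learning rate -- a hyperparameter

for _ in range(100):
    y_pred = w * x                    # forward pass
    grad = 2 * (y_pred - y_true) * x  # d(loss)/dw for squared error
    w -= lr * grad                    # backward step: move downhill

print(w)                 # ~4.0 -- the learned parameter
```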

Topic 12
Deep Learning Model Types

Three families dominate the exam.

12.1 CNN — Convolutional Neural Network

Best for: images.

Why: A regular neural network treats each pixel independently. A CNN uses convolutional filters that detect local patterns (edges, textures, shapes) and stack them into higher-level concepts (eye → face).

Use cases:

- Image classification
- Object detection
- Facial recognition
- Medical image analysis

🔑 Exam clue words: image, visual, pixels, object detection, computer vision

12.2 RNN — Recurrent Neural Network

Best for: sequential / time-ordered data.

Why: RNNs have a "memory" — output at time t depends on input at time t AND the hidden state from time t-1. This lets them process sequences.

Use cases:

- Time-series forecasting
- Speech recognition
- Text generation (pre-transformer era)

Weakness: Struggles with long-range dependencies. If a sentence is 50 words long, by the time the RNN reaches word 50 it has largely forgotten word 1. Transformers fixed this.

🔑 Exam clue words: sequence, time series, ordered data, temporal

12.3 Transformers

Best for: language and generative AI. The architecture behind every modern LLM (GPT, Claude, Llama, etc.).

Why they dominate: Instead of processing tokens one at a time like RNNs, transformers use an attention mechanism that lets every token directly look at every other token. This means:

- Long-range dependencies are handled naturally
- Training parallelizes across the whole sequence (no step-by-step bottleneck)
- The architecture scales to billions of parameters

Use cases:

- LLMs and chatbots
- Summarization and translation
- Code generation
- Foundation models behind Amazon Bedrock

🔑 Exam clue words: LLM, generative AI, foundation model, attention, chatbot, summarization, text generation

Quick mapping

| Data shape | Use |
| --- | --- |
| Images, video | CNN |
| Time series, sensor streams | RNN (or modern variants) |
| Text, language, generative tasks | Transformer |
Topic 13
MLOps Concepts

MLOps = DevOps applied to ML. Building and shipping models reliably.

13.1 Model Versioning

Track every version of every model.

Why:

- Roll back fast when a new model underperforms
- Reproduce results (which data + code + hyperparameters built this model?)
- Satisfy audit and compliance requirements
13.2 CI/CD/CT for ML

| Stage | Meaning |
| --- | --- |
| CI (Continuous Integration) | Test your code, data pipelines, and model training scripts automatically on every change |
| CD (Continuous Delivery) | Deploy new models safely (canary, blue-green) |
| CT (Continuous Training) | Retrain automatically when triggers fire |

13.3 Monitoring (in production)

| Monitor | Why |
| --- | --- |
| Accuracy / business metric | Is the model still useful? |
| Latency | Is inference still fast? |
| Errors | Are requests failing? |
| Data drift | Has the input distribution shifted? |
| Model drift | Has prediction quality declined? |
| Bias | Is the model fair across groups? |

13.4 Retraining Triggers

You retrain when:

- Accuracy or a business metric drops below a threshold
- Data drift is detected
- A schedule fires (e.g., monthly)
- Significant new labeled data arrives
AWS service: SageMaker Pipelines automates this whole loop.

Topic 14
Amazon SageMaker AI (THE big one)

SageMaker AI is AWS's flagship ML platform. It covers the full ML lifecycle: prepare data, build, train, tune, deploy, monitor.

The exam will test which feature of SageMaker handles which part of the lifecycle. Memorize these.

14.1 SageMaker Studio

The web-based IDE for the whole ML lifecycle: notebooks, experiments, debugging, and deployment in one place.

14.2 SageMaker Autopilot

AutoML: point it at a tabular dataset and a target column, and it builds, trains, and tunes candidate models automatically.

14.3 SageMaker Ground Truth

Data labeling: human workforces plus ML-assisted auto-labeling (see Topic 5.1).

14.4 SageMaker Pipelines

Workflow automation for ML: chain data prep, training, evaluation, and deployment into a repeatable, CI/CD-style pipeline.

14.5 SageMaker Clarify

Detects bias in data and models, and explains predictions (feature importance / explainability).

14.6 SageMaker Model Monitor

Watches deployed models for data drift, model quality degradation, and bias drift.

14.7 SageMaker JumpStart

A hub of pretrained and foundation models you can deploy or fine-tune quickly.

14.8 SageMaker Feature Store

A central store for ML features so teams can share and reuse them consistently between training and inference.

14.9 SageMaker Data Wrangler (worth knowing)

Visual data preparation: import, clean, and transform data with little or no code.

14.10 SageMaker Canvas (worth knowing)

No-code ML for business analysts: build and use models through a point-and-click interface.

Topic 15
SageMaker Training Job Types

Three levels of control:

| Option | Control | Effort | When to use |
| --- | --- | --- | --- |
| Built-in algorithms | Low | Low | Standard problems (XGBoost, K-Means, etc.) |
| Script mode | Medium | Medium | Bring your own TensorFlow / PyTorch / scikit-learn script; SageMaker manages the infrastructure |
| Custom containers | High | High | Full control — bring your own Docker image |

Common built-in algorithms:

| Problem | Built-in algorithm |
| --- | --- |
| Classification | XGBoost |
| Regression | Linear Learner |
| Clustering | K-Means |
| Anomaly detection | Random Cut Forest |
| Recommendation | Factorization Machines |
| Time series | DeepAR |
| Image classification | ResNet (built-in) |
| Topic modeling | LDA, Neural Topic Model |
Topic 16
SageMaker Deployment Types (memorize the table)
| Deployment | When | Examples |
| --- | --- | --- |
| Real-time Endpoint | Need an answer in <1 sec | Fraud check at payment, chatbot, product recs on a website |
| Batch Transform | Score a huge dataset offline, no rush | Score all customers overnight, monthly churn batch |
| Async Inference | Large payload OR long processing, async OK | Video analysis, large document processing, 1GB inputs |
| Serverless Inference | Sporadic / unpredictable traffic, want to skip server management | Internal tool used a few times a day |

Exam pattern:

"Process millions of records every night, no real-time requirement." β†’ Batch Transform.
"Large image (500MB), processing takes 5 minutes." β†’ Async Inference.
"Detect fraud during checkout in <100ms." β†’ Real-time Endpoint.
Topic 17
Pre-built AWS AI Services (Group 1)

These are API-based services. You don't train a model. You send data → AWS returns the AI result. Think of them as "AI as a service."

17.1 Amazon Rekognition — images & video

Detects objects, scenes, faces, text, and celebrities in images and video; also handles content moderation and face comparison.

17.2 Amazon Comprehend — text understanding (NLP)

Analyzes text: sentiment, entities, key phrases, language detection, PII detection, topic modeling.

17.3 Amazon Textract — extract data from documents

OCR plus structure: pulls raw text, tables, forms, and key-value pairs out of scanned documents, PDFs, and invoices.

17.4 Amazon Transcribe — speech → text

Converts audio to text; supports speaker identification and custom vocabularies.

17.5 Amazon Polly — text → speech

Converts text to natural-sounding speech in many voices and languages.

17.6 Amazon Translate — language translation

Neural machine translation between languages.

17.7 Amazon Lex — chatbots / voice bots

Builds conversational bots using intents and slots (the technology behind Alexa).

Topic 18
Pre-built AWS AI Services (Group 2)

18.1 Amazon Forecast — time-series forecasting

Predicts future values from historical time-series data (demand, sales, inventory, traffic).

18.2 Amazon Personalize — recommendations

Real-time personalized recommendations, from the same technology family as Amazon.com's own.

18.3 Amazon Kendra — intelligent enterprise search

ML-powered search across your documents; answers natural-language questions.

18.4 Amazon Fraud Detector — fraud detection

Managed fraud detection models trained on your historical transaction data.

18.5 Amazon Lookout for Metrics — anomaly detection in business metrics

Automatically detects anomalies in KPIs such as revenue, signups, or traffic.

Bonus services worth recognizing

| Service | Purpose |
| --- | --- |
| Amazon Bedrock | Managed access to foundation models (Claude, Llama, Titan, Stable Diffusion, etc.) for generative AI apps. Pair with Knowledge Bases for RAG. |
| Amazon Q | Generative AI assistant for businesses (Q Business) and developers (Q Developer). |
| Amazon Lookout for Vision | Industrial defect detection in images |
| Amazon Lookout for Equipment | Predictive maintenance from sensor data |
| Amazon CodeWhisperer / Q Developer | AI code suggestions in the IDE |
Topic 19
The Master "Which Service?" Table (memorize hardest)

This is the table you must know cold. Every exam has 5+ questions that map directly to this.

| Use case | AWS service |
| --- | --- |
| Detect objects / faces in images or videos | Rekognition |
| Moderate unsafe image / video content | Rekognition |
| Extract text from scanned documents | Textract |
| Extract tables / forms / key-value from invoices | Textract |
| Convert speech / audio to text | Transcribe |
| Convert text to natural-sounding speech | Polly |
| Translate text between languages | Translate |
| Sentiment / entities / key phrases in text | Comprehend |
| Build chatbot / voice bot | Lex |
| Forecast future demand / sales | Forecast |
| Recommend products / content to users | Personalize |
| Search across enterprise documents | Kendra |
| Detect fraud (legacy) | Fraud Detector (or SageMaker) |
| Detect anomalies in business metrics | Lookout for Metrics |
| Build / train / deploy custom ML model | SageMaker AI |
| Label training data | SageMaker Ground Truth |
| Auto-build ML models | SageMaker Autopilot |
| Detect bias / explain predictions | SageMaker Clarify |
| Monitor drift in production model | SageMaker Model Monitor |
| Store / reuse ML features | SageMaker Feature Store |
| Use pretrained / foundation models | SageMaker JumpStart (or Bedrock for hosted FMs) |
| Automate ML workflow | SageMaker Pipelines |
| Real-time predictions | SageMaker Real-time Endpoint |
| Offline bulk predictions | SageMaker Batch Transform |
| Long-running / large-payload inference | SageMaker Async Inference |
| Generative AI app using foundation models | Amazon Bedrock |
| Detect manufacturing defects in images | Lookout for Vision |
| Predictive maintenance from sensors | Lookout for Equipment |
Topic 20
Exam Traps (these get people)

Trap 1 — "Accuracy is the right metric"

Almost never on the exam. If the dataset is imbalanced (fraud, disease, etc.) → use F1, Recall, or AUC-ROC.

Trap 2 — Rekognition vs Textract

"Extract text from a scanned invoice" → Textract, not Rekognition.

Trap 3 — Transcribe vs Polly

Mnemonic: Transcribe writes down what was said. Polly reads things out loud.

Trap 4 — Comprehend vs Kendra

Comprehend analyzes a piece of text (sentiment, entities, key phrases). Kendra searches across many documents. "Understand this review" → Comprehend. "Find the answer in our wiki" → Kendra.

Trap 5 — Forecast vs Personalize

Forecast predicts a future number over time (next month's demand). Personalize predicts what a user will like next (next best item). Both use historical data; the output shape differs.
Trap 6 — When to use SageMaker vs a Pre-built AI service

| Situation | Choose |
| --- | --- |
| There's a ready API for this exact task | Pre-built AI service |
| You need a custom model on your own data | SageMaker AI |
| You want full ML lifecycle control | SageMaker AI |
| Team has no ML expertise | Pre-built AI service or Autopilot / Canvas |
Default heuristic: If a pre-built service does it, use the pre-built service. It's cheaper, faster, no training needed. Only go SageMaker when the pre-built services don't fit your specific task.

Trap 7 — Real-time vs Batch vs Async Inference

Match the latency need: interactive (<1s) → Real-time Endpoint; offline bulk scoring → Batch Transform; large payloads or minutes-long jobs → Async Inference.
Trap 8 — Underfitting fixes

A common wrong answer: "add regularization". That's the fix for overfitting. For underfitting → reduce regularization, add complexity, train longer.

Trap 9 — "Model is biased" usually means data is biased

The first answer is rarely "tune the model" — it's "collect more representative data" or "use SageMaker Clarify to detect and mitigate bias."

Trap 10 — Foundation Models

Need a hosted foundation model behind an API → Amazon Bedrock. Want to deploy or fine-tune a pretrained model yourself → SageMaker JumpStart. "Train an LLM from scratch" is almost never the right answer.
Topic 21
One-Line Memory Sheet (final revision)
| Concept | Remember this |
| --- | --- |
| Supervised | Labeled data |
| Unsupervised | No labels, find patterns |
| Reinforcement | Agent learns by reward |
| Semi-supervised | Few labels + many unlabeled |
| Classification | Predict category |
| Regression | Predict number |
| Clustering | Group similar things |
| Anomaly detection | Find unusual behavior |
| Recommendation | Suggest relevant items |
| Overfitting | Memorized training data, fails on new |
| Underfitting | Too simple to learn |
| Bias | Too-simple assumptions (underfit) |
| Variance | Too sensitive to training data (overfit) |
| Precision | Avoid false positives |
| Recall | Avoid false negatives |
| F1 | Balance precision and recall |
| AUC-ROC | Class separation ability |
| CNN | Images |
| RNN | Sequences |
| Transformer | Language / GenAI / LLMs |
| Parameter | Model learns |
| Hyperparameter | You tune |
| Training | Model learns |
| Inference | Model predicts |
| Real-time inference | <1 second response |
| Batch Transform | Offline, large dataset |
| Async Inference | Large payload, long processing |
| Ground Truth | Label data |
| Autopilot | AutoML |
| Clarify | Bias / explainability |
| Model Monitor | Drift monitoring |
| Feature Store | Reuse features across teams |
| JumpStart | Pretrained models, foundation models |
| Pipelines | ML workflow automation |
| Bedrock | Foundation models as a service |
| Rekognition | Image / video |
| Textract | Document / OCR |
| Comprehend | NLP analysis |
| Transcribe | Speech → Text |
| Polly | Text → Speech |
| Translate | Language translation |
| Lex | Chatbots |
| Forecast | Time-series forecasting |
| Personalize | Recommendations |
| Kendra | Enterprise search |
| Lookout for Metrics | Anomalies in KPIs |
Topic 22
Day 1 Study Plan (do this in order)

You will not fail this exam on definitions. You will fail it on service selection under pressure. So focus accordingly.

Recommended order for today:

  1. Read Topics 1–9 in order (foundations: ML types → problem types → pipeline → training data → hyperparameters → inference vs training → over/underfitting → bias/variance). 30–45 min.
  2. Drill Topic 10 (metrics) until you can answer "spam filter → precision," "cancer screening → recall," "imbalanced data → F1" without thinking. 20 min.
  3. Read Topics 11–13 (neural networks, deep learning model types, MLOps). 30 min.
  4. Memorize Topics 14–18 (SageMaker + AI services). This is the heaviest section. Use spaced repetition. 60+ min.
  5. Memorize Topic 19 (the master Which-Service table) cold. Cover the right column with your hand and quiz yourself. Repeat until you get 100%. 30 min.
  6. Read Topic 20 (traps) twice. Each trap is worth 1–2 exam points. 15 min.
  7. Use Topic 21 as your final 5-minute pre-exam revision.

Self-test before you stop today

Answer these in your head. If you stumble, re-read the relevant topic.

  1. A bank wants to flag suspicious transactions where 0.1% are fraud. Which evaluation metric should it prioritize?
  2. A retailer has 50 million customer records and wants to find natural shopping segments without predefined labels. ML type? Problem type?
  3. The model gets 99% training accuracy and 71% validation accuracy. Diagnosis and 3 fixes?
  4. A company wants to extract line items and totals from PDF invoices. Which AWS service?
  5. A team needs to monitor a deployed model for input distribution changes. Which SageMaker feature?
  6. A model needs to score 50 million records every Sunday night, no real-time requirement. Which SageMaker deployment?
  7. A startup wants a chatbot that handles "book a flight" with multiple slots (date, destination, passengers). Which AWS service?
  8. Difference between a parameter and a hyperparameter? One example of each.
  9. Which deep learning architecture powers modern LLMs and why?
  10. When would you choose Amazon Bedrock vs SageMaker?

Answers:

  1. Recall (or F1). Missing fraud is costly. Accuracy is misleading on imbalanced data.
  2. Unsupervised, Clustering.
  3. Overfitting. Fixes: more data, regularization, dropout, simpler model, early stopping (any 3).
  4. Amazon Textract.
  5. SageMaker Model Monitor (data drift).
  6. Batch Transform.
  7. Amazon Lex (intents + slots).
  8. Parameters are learned by the model (e.g., neural network weights). Hyperparameters are set by you (e.g., learning rate).
  9. Transformers — they use attention, scale to billions of parameters, and parallelize well.
  10. Bedrock when you want a managed API for foundation models with no infra. SageMaker when you need to train a custom model or have full ML lifecycle control.

If you got 8+ correct, you're in great shape for Day 1. Don't try to do practice MCQs yet β€” let this material settle. Tomorrow, hit practice questions hard.


End of Day 1 guide. Re-read Topics 19 + 21 right before sleep — last-thing-you-read sticks best.