Artificial Intelligence and Machine Learning

AI Fundamentals
Intelligent Agents
Search Techniques
Knowledge Representation
Expert Systems
Machine Learning Types
Key ML Algorithms
Bias-Variance Tradeoff
Overfitting and Underfitting
Cross-Validation
Evaluation Metrics
Ensemble Methods
Feature Engineering

AI Fundamentals

What is AI?

Artificial Intelligence is the simulation of human intelligence processes by computer systems. These processes include learning (acquiring information and rules), reasoning (using rules to reach conclusions), and self-correction.

AI Approaches

Approach	Description	Example
Symbolic AI (GOFAI, Good Old-Fashioned AI)	Rule-based, explicit knowledge representation	Expert systems, logic programming
Connectionist AI	Learning from data via neural networks	Deep learning
Evolutionary AI	Inspired by biological evolution	Genetic algorithms
Statistical AI	Probabilistic reasoning and learning	Bayesian networks

AI Levels

Narrow AI (Weak AI): Specialized in one task (e.g., Siri, Chess engines) — all current AI
General AI (Strong AI): Human-level intelligence across all domains — theoretical
Super AI: Surpasses human intelligence — purely hypothetical

Turing Test

Proposed by Alan Turing (1950) — a machine passes if a human evaluator cannot distinguish its responses from a human's. The Chinese Room Argument (Searle) claims that passing the Turing Test doesn't imply true understanding.

Key AI Applications

Natural Language Processing (NLP)
Computer Vision
Robotics
Speech Recognition
Recommendation Systems
Autonomous Vehicles
Healthcare Diagnostics
Fraud Detection

Intelligent Agents

An agent is anything that perceives its environment through sensors and acts upon it through actuators.

Agent Types

Simple Reflex Agent

Based on condition-action rules (if-then)
Only considers current percept
Works only in fully observable environments
Example: Thermostat (if temp > threshold → turn off)

Model-Based Reflex Agent

Maintains an internal model of the world
Can handle partially observable environments
Tracks state based on percept history

Goal-Based Agent

Uses goal information to decide actions
Can plan sequences of actions to reach goals
More flexible than reflex agents

Utility-Based Agent

Maximizes a utility function (preference measure)
Chooses actions that yield highest expected utility
Handles trade-offs between conflicting goals

Learning Agent

Has a learning element that improves performance over time
Components: learning element, performance element, critic, problem generator
Can operate in unknown environments

PEAS Framework

Describes an agent's task environment:
- Performance measure
- Environment
- Actuators
- Sensors

Example — Self-driving Car:
- Performance: Safe, fast, legal, comfortable travel
- Environment: Roads, traffic, pedestrians, weather
- Actuators: Steering, accelerator, brake, horn
- Sensors: Cameras, GPS, LIDAR, speedometer

Environment Properties

Property	Description
Fully Observable	Agent can see entire state
Deterministic	Actions have predictable outcomes
Static	Environment doesn't change while agent deliberates
Discrete	Finite set of percepts and actions
Single Agent	No other agents competing

Search Techniques

Search is a fundamental AI technique for finding sequences of actions to achieve goals.

Search Problem Formulation

State Space: All possible states
Initial State: Starting state
Goal Test: Determines if a state is the goal
Successor Function: Defines possible actions and resulting states
Path Cost: Cost function for evaluating solutions

Uninformed (Blind) Search

Breadth-First Search (BFS)

Explores all nodes at depth d before depth d+1
Uses FIFO queue
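Complete: Yes (for a finite branching factor b)
Optimal: Yes (if all actions have equal cost)
Time Complexity: O(b^d), where b = branching factor and d = depth of the shallowest solution
Space Complexity: O(b^d); the major drawback, since every generated node is kept in memory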
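
The BFS behaviour above can be sketched in a few lines. This is a minimal illustration, assuming the state space is given as an adjacency-list dictionary; the graph and the names `bfs`, `graph`, and `path` are made up for the example.

```python
from collections import deque

def bfs(graph, start, goal):
    """Return a path with the fewest edges from start to goal, or None."""
    frontier = deque([[start]])   # FIFO queue of partial paths
    visited = {start}             # every generated node is remembered
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append(path + [neighbor])
    return None

# Hypothetical example graph (adjacency list)
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"],
         "D": ["F"], "E": ["F"], "F": []}
path = bfs(graph, "A", "F")
```

The `visited` set and the queue of paths are what make the memory cost grow with the number of generated nodes.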

Depth-First Search (DFS)

Explores deepest node first
Uses LIFO stack (or recursion)
Complete: No (can get stuck in infinite loops)
Optimal: No
Time Complexity: O(b^m) where m = max depth
Space Complexity: O(bm) — much better than BFS

Depth-Limited Search (DLS)

DFS with a depth limit L
Complete: Only if L ≥ d (depth of shallowest solution)
Optimal: No

Iterative Deepening Search (IDS)

Repeatedly runs DLS with increasing depth limits (0, 1, 2, ...)
Complete: Yes
Optimal: Yes (for uniform step costs)
Time Complexity: O(b^d); the overhead of re-expanding shallow levels is small because most nodes lie at the deepest level
Space Complexity: O(bd)
Best uninformed search for large state spaces
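
As a rough sketch of IDS, the code below runs a depth-limited DFS with limits 0, 1, 2, and so on. The `max_depth` cap, the example graph, and all function names are assumptions added for illustration.

```python
def depth_limited(graph, node, goal, limit, path):
    """DFS that refuses to go deeper than `limit` edges below `node`."""
    if node == goal:
        return path
    if limit == 0:
        return None
    for neighbor in graph.get(node, []):
        if neighbor not in path:   # avoid cycles along the current path
            found = depth_limited(graph, neighbor, goal,
                                  limit - 1, path + [neighbor])
            if found is not None:
                return found
    return None

def iterative_deepening(graph, start, goal, max_depth=20):
    """Try depth limits 0, 1, 2, ... up to an assumed cap of max_depth."""
    for limit in range(max_depth + 1):
        found = depth_limited(graph, start, goal, limit, [start])
        if found is not None:
            return found
    return None

# Hypothetical graph: plain DFS would find A-B-D-G first, IDS finds A-C-G
ids_graph = {"A": ["B", "C"], "B": ["D"], "C": ["G"], "D": ["G"], "G": []}
ids_path = iterative_deepening(ids_graph, "A", "G")
```

Only the current path is stored at any time, which is where the O(bd) space bound comes from.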

Uniform Cost Search (UCS)

Expands node with lowest path cost g(n)
Uses priority queue
Complete: Yes (if every step cost is at least ε > 0)
Optimal: Yes
Time/Space: O(b^(1 + ⌊C*/ε⌋)) where C* = cost of the optimal solution, ε = minimum edge cost

Informed (Heuristic) Search

Heuristic Function h(n)

Estimates cost from node n to nearest goal. Must be admissible (never overestimates) for A* optimality.

Greedy Best-First Search

Expands node that appears closest to goal: f(n) = h(n)
Not optimal, can be misled by heuristics
Time/Space: O(b^m) but good heuristic helps

A* Search

f(n) = g(n) + h(n) where g(n) = actual cost so far, h(n) = estimated remaining cost
Complete: Yes
Optimal: Yes, if h(n) is admissible (tree search) or consistent (graph search)
Best known optimal search for pathfinding
Disadvantage: Exponential space complexity
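
A small sketch of A* on a grid may help make f(n) = g(n) + h(n) concrete. It assumes 4-connected moves of cost 1 and uses Manhattan distance as h(n), which is admissible in that setting; the grid layout and function name are hypothetical.

```python
import heapq

def a_star(grid, start, goal):
    """Return the cost of a cheapest path on a grid, or None if unreachable.
    grid[r][c] == 1 marks a wall; every move costs 1 (assumption)."""
    def h(cell):
        # Manhattan distance never overestimates with unit-cost moves
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]   # entries are (f, g, cell)
    best_g = {start: 0}
    while frontier:
        f, g, cell = heapq.heappop(frontier)
        if cell == goal:
            return g
        if g > best_g.get(cell, float("inf")):
            continue                    # stale queue entry, skip it
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if inside and grid[nr][nc] == 0:
                new_g = g + 1
                if new_g < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = new_g
                    heapq.heappush(frontier,
                                   (new_g + h((nr, nc)), new_g, (nr, nc)))
    return None

# Hypothetical 3x3 grid with a wall across most of the middle row
grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
cost = a_star(grid, (0, 0), (2, 0))
```

The `best_g` dictionary keeps an entry for every reached cell, which illustrates the memory cost noted above.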

Properties of Heuristics

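Admissible: h(n) ≤ h*(n), i.e., never overestimates the true cost; guarantees that A* tree search is optimal
Consistent (Monotonic): h(n) ≤ c(n, n') + h(n') for every successor n' of n; implies admissibility and means A* graph search never needs to re-expand a node
Dominance: if h2(n) ≥ h1(n) for all n and both are admissible, h2 dominates h1 and gives better pruning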

Local Search

Hill Climbing

Greedily moves to neighbor with best value
Problems: Local maxima, plateaus, ridges
Solutions:
Stochastic HC: Choose among uphill moves randomly
First-choice HC: Generate random successors until one is better than the current state
Random-restart HC: Multiple random starts
Memory: O(1) — very efficient

Simulated Annealing

Concept: Borrowed from metallurgy — heating and slowly cooling
Accepts worse moves with probability: P = e^(-ΔE/T), where ΔE > 0 is how much worse the move is and T = temperature
High T: Accepts almost any move (exploration)
Low T: Rarely accepts worse moves (exploitation)
Cooling Schedule: T decreases over time (e.g., T = T × 0.95 each step)
Converges to the global optimum with probability approaching 1 if T is lowered slowly enough; practical schedules give no such guarantee
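
The acceptance rule and cooling schedule can be written down directly. This sketch assumes a minimization problem where ΔE is the increase in cost caused by a move; both function names are hypothetical.

```python
import math

def acceptance_probability(delta_e, temperature):
    """Probability of accepting a move whose cost changes by delta_e.
    Moves that do not worsen the cost (delta_e <= 0) are always accepted."""
    if delta_e <= 0:
        return 1.0
    return math.exp(-delta_e / temperature)

def cooling_schedule(t0, rate, steps):
    """Geometric cooling: T is multiplied by `rate` at each step."""
    return [t0 * rate ** k for k in range(steps)]

p_hot = acceptance_probability(1.0, 10.0)    # high T: close to 1
p_cold = acceptance_probability(1.0, 0.1)    # low T: close to 0
temps = cooling_schedule(100.0, 0.95, 3)
```

Comparing `p_hot` and `p_cold` shows the shift from exploration to exploitation as T drops.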

Genetic Algorithm (GA)

Concept: Inspired by natural selection (evolution)
Components:
Population: Set of candidate solutions (chromosomes)
Fitness Function: Evaluates quality of solutions
Selection: Fitter individuals chosen for reproduction (roulette wheel, tournament)
Crossover: Combine two parents to create offspring (single-point, multi-point, uniform)
Mutation: Random small changes to maintain diversity
Replacement: New generation replaces old
Parameters: Population size, crossover rate, mutation rate, generations
Use Cases: Optimization, scheduling, feature selection

Search Algorithm Comparison

Algorithm	Complete	Optimal	Time	Space	Type
BFS	Yes	Yes (uniform)	O(b^d)	O(b^d)	Uninformed
DFS	No	No	O(b^m)	O(bm)	Uninformed
UCS	Yes	Yes	O(b^(1+⌊C*/ε⌋))	O(b^(1+⌊C*/ε⌋))	Uninformed
Greedy	No	No	O(b^m)	O(b^m)	Informed
A*	Yes	Yes (admissible h)	O(b^d)	O(b^d)	Informed
Hill Climb	No	No	O(∞) worst	O(1)	Local
GA	No	No	Varies	Varies	Local

Knowledge Representation

Approaches

Propositional Logic

Uses propositions (True/False statements) and logical connectives
Operators: AND (∧), OR (∨), NOT (¬), IMPLIES (→), BICONDITIONAL (↔)
Limitation: Cannot express relations between objects; each fact must be stated individually
Inference: Modus Ponens, resolution, truth tables

First-Order Logic (Predicate Logic)

Extends propositional logic with objects, relations, and quantifiers
∀x: For all x (universal)
∃x: There exists x (existential)
Example: ∀x (Cat(x) → Mammal(x)) — "All cats are mammals"
More expressive than propositional logic

Semantic Network

Graph-based representation
Nodes: Objects/concepts
Edges: Relationships (is-a, has, part-of)
Inheritance: Subclass inherits properties from superclass

Frames

Concept: Structured knowledge (like objects/classes)
Slots: Attributes with default values
Similar to: Object-oriented classes

Ontologies

Formal specification of concepts and relationships in a domain
Components: Classes, properties, instances, axioms
Use Case: Semantic Web, knowledge graphs

Expert Systems

Architecture

Knowledge Base: Facts and rules (IF-THEN) from domain experts
Inference Engine: Applies rules to facts to derive conclusions
Forward Chaining: Data-driven (start from facts, apply rules)
Backward Chaining: Goal-driven (start from goal, find supporting facts)
Working Memory: Current facts and intermediate results
Explanation Facility: Explains reasoning process
User Interface: Interaction with users
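
Forward chaining can be sketched as a loop that keeps firing rules until no new fact appears. The rule format, the medical-style facts, and the function name are invented for this example and are not taken from any particular expert system shell.

```python
def forward_chain(facts, rules):
    """rules: list of (premises, conclusion) pairs.
    Returns the set of all facts derivable from the initial facts."""
    known = set(facts)              # plays the role of working memory
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)   # rule fires, new fact stored
                changed = True
    return known

# Hypothetical knowledge base of IF-THEN rules
rules = [
    (["has_fever", "has_cough"], "flu_suspected"),
    (["flu_suspected"], "recommend_rest"),
]
derived = forward_chain(["has_fever", "has_cough"], rules)
```

Backward chaining would instead start from a goal such as `recommend_rest` and look for rules whose conclusion matches it.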

Advantages

Consistent decisions
Preserves expert knowledge
Available 24/7
Can handle complex domains

Limitations

Knowledge Acquisition Bottleneck: Difficult to extract expert knowledge
Cannot learn from experience
Brittle (fails outside knowledge domain)
Expensive to build and maintain

Machine Learning Types

Supervised Learning

Input: Labeled data (input-output pairs)
Goal: Learn mapping from inputs to outputs
Types:
Classification: Output is a category (e.g., spam/not spam)
Regression: Output is a continuous value (e.g., house price)
Algorithms: Linear regression, logistic regression, SVM, decision trees, k-NN, neural networks

Unsupervised Learning

Input: Unlabeled data
Goal: Discover hidden patterns/structure
Types:
Clustering: Group similar data (e.g., k-means, hierarchical)
Dimensionality Reduction: Reduce features (e.g., PCA, t-SNE)
Association: Find rules (e.g., Apriori algorithm)
Algorithms: k-means, DBSCAN, PCA, autoencoders

Reinforcement Learning

Concept: Agent learns by interacting with environment
Components:
Agent: Learner/decision maker
Environment: What agent interacts with
State: Current situation
Action: What agent can do
Reward: Feedback signal
Policy: Strategy (state → action mapping)
Key Concepts:
Exploration vs Exploitation: Try new actions vs use known good ones
Value Function: Expected cumulative reward from a state
Q-Learning: Model-free RL; learns Q(s,a) values
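Bellman Equation: V(s) = max_a [R(s,a) + γ × V(s')], where s' is the state reached by taking action a in s (deterministic case) and γ is the discount factor between 0 and 1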
Algorithms: Q-learning, SARSA, Deep Q-Network (DQN), Policy Gradient
Applications: Game playing (AlphaGo), robotics, recommendation
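
To make Q-learning concrete, here is a sketch of a single tabular update, Q(s,a) ← Q(s,a) + α [r + γ max_a' Q(s',a') − Q(s,a)]. The learning rate α = 0.5, the discount γ = 0.9, and the state and action names are assumptions chosen for the example.

```python
def q_update(q, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning step and return the new Q(s, a).
    Unseen (state, action) pairs are treated as having value 0."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]

q = {}                           # Q-table stored as a dictionary
actions = ["left", "right"]      # hypothetical action set
first = q_update(q, "s0", "right", 1.0, "s1", actions)
```

Repeating the same transition moves the estimate further toward the observed reward, which is the model-free learning described above.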

Semi-Supervised Learning

Mix of labeled and unlabeled data
Uses unlabeled data to improve learning

Self-Supervised Learning

Generates labels from data itself (e.g., predicting masked words in BERT)

Key ML Algorithms

Linear Regression

Goal: Predict continuous output using linear relationship: y = w₀ + w₁x₁ + w₂x₂ + ... + wₙxₙ
Loss Function: Mean Squared Error (MSE) = (1/n) Σ(yᵢ - ŷᵢ)²
Optimization: Gradient descent or normal equation (closed form)
Assumptions: Linearity, independence, homoscedasticity, normality of residuals
Regularization:
Ridge (L2): Adds λΣwᵢ² penalty — shrinks coefficients
Lasso (L1): Adds λΣ|wᵢ| penalty — can zero out coefficients (feature selection)
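
For a single feature, the closed-form least-squares solution is short enough to write by hand. The sketch below fits y = w₀ + w₁x without regularization; the data points are made up and generated from y = 1 + 2x.

```python
def fit_simple_linear(xs, ys):
    """Least-squares fit of y = w0 + w1 * x for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w1 = sxy / sxx
    w0 = mean_y - w1 * mean_x
    return w0, w1

def mse(ys, preds):
    """Mean squared error between targets and predictions."""
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]        # hypothetical data from y = 1 + 2x
w0, w1 = fit_simple_linear(xs, ys)
loss = mse(ys, [w0 + w1 * x for x in xs])
```

With several features the same idea is usually expressed with the normal equation or solved by gradient descent.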

Logistic Regression

Goal: Binary classification (despite "regression" in name)
Function: P(y=1|x) = 1/(1 + e^(-z)) where z = w·x + b (sigmoid function)
Decision Boundary: P ≥ 0.5 → class 1; P < 0.5 → class 0
Loss Function: Binary Cross-Entropy = -[y·log(p) + (1-y)·log(1-p)]
Optimization: Gradient descent
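
The sigmoid and the cross-entropy loss are easy to check numerically. This sketch assumes a fixed weight vector rather than a trained one, and the `eps` clipping is an added safeguard against log(0), not part of the formula above.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    """P(y=1 | x) for a linear score z = w.x + b."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)

def binary_cross_entropy(y, p, eps=1e-12):
    """Loss for one example with true label y (0 or 1) and prediction p."""
    p = min(max(p, eps), 1 - eps)   # keep p away from exactly 0 or 1
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

p = predict_proba([1.0, -1.0], 0.0, [2.0, 2.0])   # z = 0, so p = 0.5
label = 1 if p >= 0.5 else 0
```

A confident wrong prediction is penalized far more than a confident correct one, which is what drives the gradient updates.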

Decision Trees

Structure: Tree with internal nodes (feature tests), branches (outcomes), leaves (predictions)
Splitting Criteria:
Information Gain (ID3): Uses entropy — H(S) = -Σ pᵢ log₂(pᵢ)
Gain Ratio (C4.5): Normalizes information gain
Gini Index (CART): Gini = 1 - Σ pᵢ²
Pruning: Pre-pruning (stop early) or post-pruning (grow full tree, then cut)
Advantages: Interpretable, handles non-linear data, no feature scaling needed
Disadvantages: Prone to overfitting, unstable (small data change → different tree)
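
The splitting criteria can be computed directly from class labels. This is a minimal sketch; the label lists are invented, and `information_gain` assumes the children partition the parent's samples.

```python
import math
from collections import Counter

def entropy(labels):
    """H(S) = -sum p_i * log2(p_i) over the classes present in labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini = 1 - sum p_i^2."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = sum(len(ch) / n * entropy(ch) for ch in children)
    return entropy(parent) - weighted

parent = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]
useless_split = [["yes", "no"], ["yes", "no"]]
```

A split that separates the classes completely recovers all of the parent's entropy, while a split that leaves each child as mixed as the parent gains nothing.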

Random Forest

Concept: Ensemble of decision trees using bagging (bootstrap aggregating)
Process:
Create multiple bootstrap samples (random sampling with replacement)
Train a decision tree on each sample
At each split, consider only a random subset of features
Aggregate predictions (majority vote for classification, average for regression)
Advantages: Reduces overfitting, handles high dimensionality, robust to outliers
Out-of-Bag (OOB) Error: Each tree tested on samples not in its bootstrap — built-in validation

Support Vector Machine (SVM)

Goal: Find the maximum margin hyperplane separating classes
Margin: Distance between hyperplane and nearest data points (support vectors)
Optimization: Minimize ½||w||² subject to yᵢ(w·xᵢ + b) ≥ 1
Soft Margin: Allows some misclassification (C parameter controls trade-off)
Kernel Trick: Maps data to higher dimension for non-linear separation
Linear Kernel: K(x,y) = x·y
Polynomial Kernel: K(x,y) = (x·y + c)^d
RBF/Gaussian Kernel: K(x,y) = exp(-γ||x-y||²) — most popular
Sigmoid Kernel: K(x,y) = tanh(αx·y + c)
Advantages: Effective in high dimensions, memory efficient (uses support vectors only)
Disadvantages: Doesn't scale well to very large datasets, sensitive to noise

k-Nearest Neighbors (k-NN)

Concept: Classify based on majority vote of k closest training examples
Distance Metrics:
Euclidean: √(Σ(xᵢ - yᵢ)²)
Manhattan: Σ|xᵢ - yᵢ|
Minkowski: (Σ|xᵢ - yᵢ|^p)^(1/p)
Choosing k: Small k → sensitive to noise; Large k → smoother boundaries
Advantages: Simple, no training phase, naturally handles multi-class
Disadvantages: Slow prediction (lazy learner), sensitive to irrelevant features and scale
Curse of Dimensionality: Distance becomes meaningless in very high dimensions
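
A bare-bones k-NN classifier shows why there is no training phase: all the work happens at prediction time. The sketch assumes Euclidean distance and k = 3, and the labeled points are hypothetical.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, query, k=3):
    """train: list of (features, label) pairs.
    Returns the majority label among the k nearest training points."""
    nearest = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical two-cluster training set
train = [((1, 1), "red"), ((1, 2), "red"), ((2, 1), "red"),
         ((6, 6), "blue"), ((6, 7), "blue"), ((7, 6), "blue")]
prediction = knn_predict(train, (1.5, 1.5))
```

Every query sorts the whole training set, which is the slow-prediction drawback listed above.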

k-Means Clustering

Algorithm:
Initialize k centroids randomly
Assign each point to nearest centroid
Recalculate centroids as mean of assigned points
Repeat until convergence (centroids don't change)
Objective: Minimize within-cluster sum of squares (WCSS)
Choosing k: Elbow method, silhouette score
Advantages: Simple, fast: O(n×k×i×d) for n points, k clusters, i iterations, d dimensions
Disadvantages: Sensitive to initialization, assumes spherical clusters, need to specify k
k-Means++: Smart initialization — spreads initial centroids apart
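
The assign-and-recompute loop can be sketched for 2-D points as follows. To keep the result reproducible, the initial centroids are passed in by the caller instead of being chosen randomly; that choice, the data, and the `max_iters` cap are assumptions of this example.

```python
def kmeans(points, centroids, max_iters=100):
    """Basic k-means on 2-D points with caller-supplied initial centroids."""
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: each centroid moves to the mean of its cluster
        new_centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl))
            if cl else c
            for cl, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:   # converged
            break
        centroids = new_centroids
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```

A poor choice of starting centroids can lead to a different, worse partition, which is the sensitivity that k-Means++ is meant to reduce.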

Neural Networks Basics

Structure:
Input Layer: Receives features
Hidden Layers: Process information (can be multiple)
Output Layer: Produces prediction
Neuron: z = Σ(wᵢxᵢ) + b; a = activation(z)
Activation Functions:
Sigmoid: σ(z) = 1/(1+e^(-z)) — range (0,1); vanishing gradient problem
Tanh: range (-1,1); zero-centered but still vanishing gradient
ReLU: max(0, z) — most popular; fast; can "die" (always output 0)
Leaky ReLU: max(αz, z) — fixes dying ReLU
Softmax: Used in output layer for multi-class classification
Training: Forward propagation → compute loss → backpropagation → update weights
Backpropagation: Chain rule of calculus to compute gradients layer by layer
Optimizers: SGD, Momentum, Adam (adaptive learning rate), RMSprop
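
A single neuron and the activation functions listed above fit in a few lines. The weights, inputs, and the leaky slope α = 0.01 are illustrative values, and the max-subtraction inside `softmax` is a common numerical-stability step that does not change the result.

```python
import math

def relu(z):
    return max(0.0, z)

def leaky_relu(z, alpha=0.01):
    return max(alpha * z, z)

def softmax(zs):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(zs)                              # subtract max for stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def neuron(weights, bias, inputs, activation):
    """z = sum(w_i * x_i) + b, then a = activation(z)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

a = neuron([1.0, -1.0], 0.5, [2.0, 1.0], relu)   # z = 2 - 1 + 0.5
probs = softmax([2.0, 1.0, 0.1])
```

A full network stacks many such neurons per layer; backpropagation then applies the chain rule through each activation.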

Convolutional Neural Networks (CNN)

Purpose: Image processing, computer vision
Key Layers:
Convolutional Layer: Applies filters/kernels to detect features (edges, textures)
Pooling Layer: Reduces spatial dimensions (Max pooling, Average pooling)
Fully Connected Layer: Final classification
Key Concepts: Stride, padding, feature maps, receptive field
Famous Architectures: LeNet, AlexNet, VGG, ResNet, Inception

Recurrent Neural Networks (RNN)

Purpose: Sequential data (text, time series, speech)
Key Idea: Hidden state carries information from previous time steps
Problem: Vanishing/exploding gradients for long sequences
Solutions:
LSTM (Long Short-Term Memory): Uses gates (forget, input, output) to control information flow
GRU (Gated Recurrent Unit): Simplified LSTM with reset and update gates
Applications: Language modeling, machine translation, speech recognition

Bias-Variance Tradeoff

Decomposition of Error

Total Error = Bias² + Variance + Irreducible Error

Component	Description	Cause
Bias	Error from overly simplistic assumptions	Underfitting (model too simple)
Variance	Error from sensitivity to training data fluctuations	Overfitting (model too complex)
Irreducible Error	Noise inherent in data	Cannot be reduced by any model

The Tradeoff

Simple Model (High Bias, Low Variance): Underfits — misses patterns
Complex Model (Low Bias, High Variance): Overfits — captures noise
Goal: Find the sweet spot that minimizes total error

Managing the Tradeoff

Reduce Bias: More complex model, more features, longer training
Reduce Variance: More data, regularization, simpler model, ensemble methods
Regularization: L1 (Lasso), L2 (Ridge), Elastic Net — penalize complexity

Overfitting and Underfitting

Underfitting

Model too simple to capture underlying pattern
Symptoms: High training error, high test error
Solutions:
Increase model complexity
Add more features
Train longer
Reduce regularization

Overfitting

Model memorizes training data including noise
Symptoms: Low training error, high test error
Solutions:
Get more training data
Use regularization (L1, L2, dropout)
Reduce model complexity
Use cross-validation
Early stopping (for neural networks)
Pruning (for decision trees)
Data augmentation

Regularization Techniques

Technique	Description
L1 (Lasso)	Adds λΣ|wᵢ| penalty; can zero out coefficients (feature selection)
L2 (Ridge)	Adds λΣwᵢ² penalty; shrinks coefficients
Elastic Net	Combines L1 and L2
Dropout	Randomly disable neurons during training (NN)
Early Stopping	Stop training when validation error starts to increase
Data Augmentation	Artificially increase training data

Cross-Validation

Purpose

Estimate model performance on unseen data and tune hyperparameters.

Types

k-Fold Cross-Validation

Split data into k equal folds
Train on k-1 folds, test on remaining fold
Repeat k times (each fold used as test once)
Average the k performance scores
Common: k = 5 or k = 10
Advantage: Every data point used for both training and testing
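
The fold bookkeeping can be sketched with plain index lists. This version does not shuffle the data first, which is an assumption made to keep the output predictable; in practice the indices are usually shuffled or stratified.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(k_fold_indices(10, 5))
```

A model would be trained on each `train` list and scored on the matching `test` list, with the k scores averaged at the end.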

Stratified k-Fold

Preserves class distribution in each fold
Important for imbalanced datasets

Leave-One-Out Cross-Validation (LOOCV)

k = n (number of samples)
Train on n-1 samples, test on 1
Repeat n times
Advantage: Nearly unbiased estimate (though the estimate itself can have high variance)
Disadvantage: Computationally expensive

Holdout Method

Simple split: 70-80% training, 20-30% testing
Disadvantage: Performance estimate depends on specific split

Evaluation Metrics

Classification Metrics

Confusion Matrix

	Predicted Positive	Predicted Negative
Actual Positive	True Positive (TP)	False Negative (FN)
Actual Negative	False Positive (FP)	True Negative (TN)

Key Metrics

Metric	Formula	Interpretation
Accuracy	(TP+TN)/(TP+TN+FP+FN)	Overall correctness
Precision	TP/(TP+FP)	Of predicted positives, how many are correct
Recall (Sensitivity)	TP/(TP+FN)	Of actual positives, how many detected
Specificity	TN/(TN+FP)	Of actual negatives, how many correctly identified
F1 Score	2×(Precision×Recall)/(Precision+Recall)	Harmonic mean of precision and recall
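
The formulas in the table can be checked with a worked example. The confusion-matrix counts below are made up, and the sketch assumes none of the denominators is zero.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute the table's metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "specificity": specificity, "f1": f1}

# Hypothetical counts: 40 TP, 10 FP, 20 FN, 30 TN
m = classification_metrics(tp=40, fp=10, fn=20, tn=30)
```

Here precision (0.8) is higher than recall (about 0.67), and F1 (about 0.73) falls between the two.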

When to Use What?

Accuracy: Balanced classes
Precision: When FP is costly (e.g., spam detection — don't want legitimate email marked as spam)
Recall: When FN is costly (e.g., cancer detection — don't want to miss actual cases)
F1 Score: Imbalanced classes; need balance between precision and recall

ROC Curve and AUC

ROC (Receiver Operating Characteristic): Plots True Positive Rate (Recall) vs False Positive Rate (1-Specificity) at various thresholds
AUC (Area Under Curve): Single number summarizing ROC
AUC = 1.0: Perfect classifier
AUC = 0.5: Random classifier
AUC < 0.5: Worse than random
Advantage: Threshold-independent evaluation

Regression Metrics

Metric	Formula	Interpretation
MAE	(1/n)Σ|yᵢ - ŷᵢ|	Average absolute error
MSE	(1/n)Σ(yᵢ - ŷᵢ)²	Average squared error (penalizes large errors)
RMSE	√MSE	Same units as target variable
R² (R-squared)	1 - (SS_res/SS_tot)	Proportion of variance explained (typically 0 to 1; negative if the model is worse than predicting the mean)
Adjusted R²	1 - [(1-R²)(n-1)/(n-p-1)]	R² adjusted for number of predictors
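
A short worked example for the regression metrics, using invented targets and predictions. Adjusted R² is left out to keep the sketch small.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R² for paired targets and predictions."""
    n = len(y_true)
    mae = sum(abs(y - p) for y, p in zip(y_true, y_pred)) / n
    mse = sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(mse)
    mean_y = sum(y_true) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    r2 = 1 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}

# Hypothetical data: two predictions are off by 1, two are exact
r = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
```

With SS_res = 2 and SS_tot = 20, R² comes out at 0.9, meaning the predictions account for 90% of the variance in the targets.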

Ensemble Methods

Ensemble methods combine multiple models to improve performance.

Bagging (Bootstrap Aggregating)

Concept: Train multiple models on different bootstrap samples; aggregate predictions
Reduces variance without increasing bias
Example: Random Forest
Key: Models trained independently (can be parallel)

Boosting

Concept: Sequentially train models; each new model focuses on errors of previous ones
Reduces bias (and some variance)
Key: Models trained sequentially (each depends on previous)

AdaBoost (Adaptive Boosting)

Initialize equal weights for all training samples
Train weak learner (e.g., decision stump)
Increase weights of misclassified samples
Assign weight to learner based on accuracy
Repeat; final prediction = weighted vote

Gradient Boosting

Each new model fits the residual errors of the previous ensemble
Uses gradient descent to minimize loss function
Variants: XGBoost, LightGBM, CatBoost (widely used in industry)

XGBoost

Regularized gradient boosting
Handles missing values, supports parallel processing
Frequently used in winning Kaggle solutions, especially on tabular data

Stacking

Concept: Train multiple diverse models (base learners); train a meta-learner on their predictions
Base learners: Different algorithms (e.g., SVM, RF, k-NN)
Meta-learner: Combines base predictions (e.g., logistic regression)

Ensemble Comparison

Method	Strategy	Reduces	Parallel?
Bagging	Independent bootstrap samples	Variance	Yes
Boosting	Sequential error correction	Bias	No
Stacking	Meta-learner on base models	Both	Partially

Feature Engineering

Feature engineering is the process of creating/transforming features to improve model performance.

Feature Selection Methods

Method	Description
Filter	Statistical measures (correlation, chi-square, mutual information)
Wrapper	Use model performance to select features (forward selection, backward elimination)
Embedded	Selection during model training (Lasso, tree-based importance)

Feature Transformation

Technique	Description
Normalization	Scale to [0,1]: x' = (x - min)/(max - min)
Standardization	Scale to mean=0, std=1: x' = (x - μ)/σ
Log Transform	Reduces skewness in right-skewed data
Binning	Convert continuous to categorical
One-Hot Encoding	Convert categorical to binary columns
Label Encoding	Convert categories to integers (for ordinal data)
PCA	Principal Component Analysis — reduce dimensions while preserving variance
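
Normalization, standardization, and one-hot encoding from the table can be sketched as follows. The standardization here uses the population standard deviation (dividing by n), which is an assumption; some libraries divide by n − 1 instead. The sample values are made up.

```python
import math

def min_max_normalize(values):
    """x' = (x - min) / (max - min), mapping values onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """x' = (x - mean) / std, using the population standard deviation."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    return [(v - mu) / sigma for v in values]

def one_hot(values):
    """Map each category to a binary vector, columns in sorted order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

raw = [10.0, 20.0, 30.0, 40.0, 50.0]
normalized = min_max_normalize(raw)
standardized = standardize(raw)
encoded = one_hot(["red", "blue", "red"])
```

Scaling matters most for distance-based and gradient-based models such as k-NN, SVM, and neural networks; tree-based models are largely unaffected.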

Handling Missing Data

Remove: Drop rows/columns (if missing is random and small)
Impute: Mean, median, mode, or predictive imputation
Indicator: Add binary column indicating missingness

Handling Imbalanced Data

Oversampling: Duplicate minority class samples, or generate synthetic ones with SMOTE (Synthetic Minority Over-sampling Technique)
Undersampling: Remove majority class samples
Class Weights: Assign higher weight to minority class in loss function
Threshold Adjustment: Change decision threshold

Key Formulas Summary

Concept	Formula
A* Search	f(n) = g(n) + h(n)
Sigmoid	σ(z) = 1/(1 + e^(-z))
MSE	(1/n) Σ(yᵢ - ŷᵢ)²
Entropy	H(S) = -Σ pᵢ log₂(pᵢ)
Gini Index	Gini = 1 - Σ pᵢ²
Precision	TP / (TP + FP)
Recall	TP / (TP + FN)
F1 Score	2 × (P × R) / (P + R)
R²	1 - (SS_res / SS_tot)
Simulated Annealing	P(accept) = e^(-ΔE/T)
Error Decomposition	Bias² + Variance + Irreducible

Exam Tips

Search Algorithms: Know BFS vs DFS vs A* — completeness, optimality, complexity
A* Optimality: Understand admissibility and consistency conditions
ML Types: Clearly distinguish supervised, unsupervised, and reinforcement learning
SVM: Understand margin, support vectors, and kernel trick
Bias-Variance: Know the tradeoff and how to manage it
Evaluation Metrics: Know when to use precision vs recall vs F1
Ensemble Methods: Bagging vs Boosting — key differences
Neural Networks: Understand backpropagation, activation functions, CNN vs RNN
Decision Trees: Know splitting criteria (entropy, Gini, information gain)
Cross-Validation: Understand k-fold and stratified k-fold

Practice Questions

10 MCQs for Artificial Intelligence and Machine Learning with detailed explanations.

Q1. Iterative Deepening Search has time complexity O(b^d) even though it re-expands shallow nodes on every iteration. How is this re-expansion overhead best described?

A. Exponentially larger than the cost of BFS
B. Negligible
C. Equal to the O(b^d) space cost of BFS
D. Unbounded, because IDS can loop forever

✅ Correct Answer: Option B

Explanation:
The correct answer is Option B: negligible. Most nodes in a search tree lie at the deepest level, so repeating the shallower levels adds only a small factor and the total stays O(b^d).

This concept is covered under Artificial Intelligence and Machine Learning in the CBDT Assistant Director Systems syllabus.

Why other options are incorrect:
- Option A: IDS and BFS have the same O(b^d) time complexity.
- Option C: Space cost is a separate measure; IDS needs only O(bd) space.
- Option D: IDS is complete, so it finds a solution whenever one exists.

Q2. Which AI process is described as "using rules to reach conclusions"?

A. Learning
B. Self-correction
C. Clustering
D. Reasoning

✅ Correct Answer: Option D

Explanation:
The correct answer is Option D: reasoning. The definition of AI lists three processes: learning, reasoning (using rules to reach conclusions), and self-correction.

Why other options are incorrect:
- Option A: Learning is acquiring information and rules, not applying them.
- Option B: Self-correction is listed as a separate process from reasoning.
- Option C: Clustering is an unsupervised learning task, not one of the three AI processes.

Q3. Which AI process is described as "acquiring information and rules"?

A. Reasoning
B. Self-correction
C. Pruning
D. Learning

✅ Correct Answer: Option D

Explanation:
The correct answer is Option D: learning. In the definition of AI, learning means acquiring information and the rules for using it.

Why other options are incorrect:
- Option A: Reasoning is using rules to reach conclusions.
- Option B: Self-correction is listed as a separate process from learning.
- Option C: Pruning is a technique for simplifying decision trees.

Q4. Which AI approach is inspired by biological evolution, with genetic algorithms as a typical example?

A. Symbolic AI
B. Connectionist AI
C. Evolutionary AI
D. Statistical AI

✅ Correct Answer: Option C

Explanation:
The correct answer is Option C: Evolutionary AI. It borrows ideas such as selection, crossover, and mutation from natural evolution.

Why other options are incorrect:
- Option A: Symbolic AI is rule-based and relies on explicit knowledge representation.
- Option B: Connectionist AI learns from data via neural networks.
- Option D: Statistical AI is based on probabilistic reasoning, as in Bayesian networks.

Q5. In supervised learning, what is the output of a classification model?

A. A policy mapping states to actions
B. A set of clusters
C. A category (e.g., spam/not spam)
D. A continuous value (e.g., house price)

✅ Correct Answer: Option C

Explanation:
The correct answer is Option C: a category (e.g., spam/not spam).

Why other options are incorrect:
- Option A: A policy is what a reinforcement learning agent learns.
- Option B: Clusters are produced by unsupervised learning.
- Option D: A continuous value is the output of regression.

Q6. Which of the following best describes search in AI?

A. A method for grouping similar unlabeled data points
B. A formal specification of concepts and relationships in a domain
C. A technique for scaling features to the range [0,1]
D. A fundamental AI technique for finding sequences of actions to achieve goals

✅ Correct Answer: Option D

Explanation:
The correct answer is Option D: a fundamental AI technique for finding sequences of actions to achieve goals.

Why other options are incorrect:
- Option A: This describes clustering.
- Option B: This describes an ontology.
- Option C: This describes normalization.

Q7. In supervised learning, what is the output of a regression model?

A. A category (e.g., spam/not spam)
B. A continuous value (e.g., house price)
C. A group of similar data points
D. A reward signal

✅ Correct Answer: Option B

Explanation:
The correct answer is Option B: a continuous value (e.g., house price).

Q8. Which AI approach is rule-based and uses explicit knowledge representation, as in expert systems and logic programming?

A. Connectionist AI
B. Evolutionary AI
C. Symbolic AI
D. Statistical AI

✅ Correct Answer: Option C

Explanation:
The correct answer is Option C: Symbolic AI, also known as Good Old-Fashioned AI (GOFAI).

Why other options are incorrect:
- Option A: Connectionist AI learns from data via neural networks.
- Option B: Evolutionary AI is inspired by biological evolution.
- Option D: Statistical AI relies on probabilistic reasoning and learning.

Q9. Which level of AI refers to human-level intelligence across all domains and remains theoretical?

A. Narrow AI (Weak AI)
B. Super AI
C. Simple reflex agent
D. General AI (Strong AI)

✅ Correct Answer: Option D

Explanation:
The correct answer is Option D: General AI (Strong AI), defined as human-level intelligence across all domains.

Why other options are incorrect:
- Option A: Narrow AI is specialized in one task and covers all current AI.
- Option B: Super AI would surpass human intelligence and is purely hypothetical.
- Option C: A simple reflex agent is an agent type, not a level of AI.

Q10. Which of the following correctly pairs an AI approach with its example?

A. Statistical AI: Bayesian networks
B. Symbolic AI: Deep learning
C. Connectionist AI: Genetic algorithms
D. Evolutionary AI: Expert systems

✅ Correct Answer: Option A

Explanation:
The correct answer is Option A: Statistical AI is based on probabilistic reasoning and learning, and Bayesian networks are its standard example.

Why other options are incorrect:
- Option B: Deep learning is an example of Connectionist AI.
- Option C: Genetic algorithms are an example of Evolutionary AI.
- Option D: Expert systems are an example of Symbolic AI.
