Design and Analysis of Algorithms

Comprehensive Theory Notes for CBDT Assistant Director (Systems) Exam


1. Asymptotic Notation

Asymptotic notation describes the running time of algorithms as input size grows, ignoring constant factors and lower-order terms.

Big-O Notation (Upper Bound)

f(n) = O(g(n)) if ∃ constants c > 0, n₀ > 0 such that f(n) ≤ c·g(n) for all n ≥ n₀

Big-Omega Notation (Lower Bound)

f(n) = Ω(g(n)) if ∃ constants c > 0, n₀ > 0 such that f(n) ≥ c·g(n) for all n ≥ n₀

Big-Theta Notation (Tight Bound)

f(n) = Θ(g(n)) if ∃ constants c₁, c₂ > 0, n₀ > 0 such that c₁·g(n) ≤ f(n) ≤ c₂·g(n) for all n ≥ n₀

Little-o Notation (Strict Upper Bound)

f(n) = o(g(n)) if lim(n→∞) f(n)/g(n) = 0

Little-omega Notation (Strict Lower Bound)

f(n) = ω(g(n)) if lim(n→∞) f(n)/g(n) = ∞

Hierarchy of Growth Rates

O(1) < O(log n) < O(√n) < O(n) < O(n log n) < O(n²) < O(n³) < O(2ⁿ) < O(n!)

Properties


2. Divide and Conquer

Paradigm: Break problem into smaller subproblems, solve recursively, combine results.

General Form

Top-level:      [  Problem of size n   ]
                /         |         \
Level 1:    [n/b]      [n/b]      [n/b]    ... a subproblems
              ...        ...        ...
Level k:    [1][1]...[1][1]...[1][1]  ... n^log_b(a) leaves

Recurrence: T(n) = aT(n/b) + f(n)

Merge Sort

Quick Sort

Best pivot: Median (but expensive to compute). Common strategies: first, last, random, median-of-three.

Strassen's Matrix Multiplication

Standard matrix multiplication: O(n³) (three nested loops)

Strassen's method:
1. Divide each n×n matrix into four (n/2)×(n/2) submatrices
2. Compute 7 products (instead of 8) recursively
3. Combine using additions/subtractions

Binary Search (Divide and Conquer)


3. Greedy Method

Paradigm: Make the locally optimal choice at each step, hoping it leads to a global optimum.

Greedy vs Dynamic Programming

Feature Greedy Dynamic Programming
Choice Local optimum Evaluates all subproblems
Backtracking Never backtracks Considers all possibilities
Speed Usually faster Usually slower
Optimality Not always optimal Always optimal (if applicable)
Complexity Usually O(n log n) or O(n) Usually O(n²) or O(nW)

Fractional Knapsack Problem

Activity Selection Problem

Huffman Coding

Minimum Spanning Tree — Greedy Algorithms

Kruskal's Algorithm:
1. Sort edges by weight (ascending)
2. Add edges greedily (skip if forms cycle)
3. Use Union-Find for cycle detection
- Time: O(E log E)

Prim's Algorithm:
1. Start from any vertex
2. Add minimum-weight edge connecting tree to new vertex
3. Repeat with priority queue
- Time: O((V + E) log V)

Dijkstra's Algorithm (Greedy)

Job Sequencing with Deadlines


4. Dynamic Programming

Paradigm: Solve each subproblem once, store results (memoization), and reuse.

Key Properties

  1. Optimal Substructure: Optimal solution contains optimal solutions to subproblems
  2. Overlapping Subproblems: Subproblems share subsubproblems

Approaches

Approach Description Direction
Top-down (Memoization) Recursive + cache Problem → Subproblems
Bottom-up (Tabulation) Iterative + table Subproblems → Problem

Fibonacci Numbers — Example of DP

Naive recursive: T(n) = O(2^n) — exponential (many repeated computations)
DP (memoization): T(n) = O(n) — linear (each subproblem computed once)

Matrix Chain Multiplication

DP recurrence:

m[i,j] = min(m[i,k] + m[k+1,j] + p_{i-1}·p_k·p_j) for i ≤ k < j
m[i,i] = 0

Example: Matrices A(10×30), B(30×5), C(5×60)
- (AB)C = 10×30×5 + 10×5×60 = 1500 + 3000 = 4500
- A(BC) = 30×5×60 + 10×30×60 = 9000 + 18000 = 27000
- (AB)C is optimal

0/1 Knapsack Problem

DP recurrence:

K[i][w] = max(K[i-1][w], K[i-1][w-w[i]] + v[i])  if w[i]  w
         = K[i-1][w]                                otherwise
K[0][w] = 0

Longest Common Subsequence (LCS)

DP recurrence:

L[i][j] = L[i-1][j-1] + 1                   if X[i] = Y[j]
        = max(L[i-1][j], L[i][j-1])         otherwise
L[0][j] = L[i][0] = 0

Floyd-Warshall Algorithm (All-Pairs Shortest Path)

DP formulation:

dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][k])

Optimal Binary Search Tree

DP recurrence:

e[i,j] = min(e[i,r-1] + e[r+1,j] + w[i,j]) for i  r  j
w[i,j] = sum of probabilities from i to j

5. Backtracking

Paradigm: Build solution incrementally, abandon partial solutions ("backtrack") when they cannot lead to a valid solution.

General Backtracking Algorithm

function backtrack(candidate):
  if candidate is a solution: output and return
  for each next choice:
    if choice is valid:
      make choice
      backtrack(candidate + choice)
      undo choice (backtrack)

N-Queens Problem

Subset Sum Problem

Graph Coloring

Hamiltonian Cycle

Comparison: Backtracking vs Branch and Bound

Feature Backtracking Branch and Bound
Goal Find all (or one) solutions Find optimal solution
Strategy Depth-first Best-first or FIFO
Pruning Feasibility bound Optimality bound
Data structure Stack (implicit) Priority queue or queue

6. Branch and Bound

Paradigm: Systematically enumerate candidate solutions, but prune branches that cannot produce a better solution than the current best.

Strategies

Strategy Description Use Case
FIFO (Queue) Breadth-first exploration General problems
LIFO (Stack) Depth-first exploration Memory constrained
LC (Least Cost) Priority queue by bound Optimization problems (most efficient)

0/1 Knapsack — Branch and Bound

  1. Sort items by value/weight ratio
  2. Use upper bound (greedy fractional knapsack) to prune
  3. Explore nodes with highest upper bound first
  4. Prune if upper bound ≤ current best

Travelling Salesman Problem (TSP)


7. NP-Completeness

Complexity Classes

Class Description Example
P Solvable in polynomial time Sorting, shortest path
NP Verifiable in polynomial time SAT, TSP, Knapsack
NP-Complete NP and as hard as any NP problem SAT, 3-SAT, Clique, TSP
NP-Hard At least as hard as NP-Complete Halting problem, TSP (optimization)

P ⊆ NP

Key Definitions

Polynomial-time reduction (A ≤ₚ B):
- Problem A can be transformed to B in polynomial time
- If B ∈ P, then A ∈ P

NP-Complete problem requires:
1. It is in NP (verifiable in polynomial time)
2. Every problem in NP reduces to it (NP-hard)

NP-Hard problem:
1. At least as hard as the hardest problems in NP
2. Does NOT need to be in NP

Cook-Levin Theorem

SAT (Boolean Satisfiability) is NP-Complete — the first proven NP-complete problem.

Karp's 21 NP-Complete Problems

Key ones to know: SAT, 3-SAT, Clique, Vertex Cover, Hamiltonian Cycle, TSP (decision), Graph Coloring, Subset Sum, Knapsack (decision), Partition.

NP-Complete Problem Relationships

SAT → 3-SAT → Clique → Vertex Cover
                  ↓
            Hamiltonian Cycle → TSP

Coping with NP-Completeness

  1. Approximation algorithms — near-optimal solutions
  2. Heuristics — practical but no guarantees
  3. Special cases — restricted versions may be in P
  4. Randomized algorithms — probabilistic guarantees
  5. Parameterized complexity — focus on specific parameters

8. Amortized Analysis

Amortized analysis gives the average time per operation over a worst-case sequence of operations.

Methods

Aggregate Method

Accounting Method

Potential Method

Dynamic Array (Table Doubling)

Operation Actual Cost Amortized Cost
Normal insert O(1) O(1)
Resize insert O(n) O(1)

Total for n inserts: O(n) → Amortized per insert: O(1)

Binary Counter

Operation Actual Cost Amortized Cost
Increment O(k) worst (k bits flip) O(1)

Total for n increments: O(n) → Amortized per increment: O(1)


9. Master Theorem

Solves recurrences of the form: T(n) = aT(n/b) + f(n)

Where:
- a ≥ 1 (number of subproblems)
- b > 1 (factor by which problem size shrinks)
- f(n) = cost of dividing and combining

Let n^(log_b(a)) = critical exponent

Case Condition Solution
Case 1 f(n) = O(n^(log_b(a) - ε)) for ε > 0 T(n) = Θ(n^(log_b(a)))
Case 2 f(n) = Θ(n^(log_b(a)) · log^k(n)), k ≥ 0 T(n) = Θ(n^(log_b(a)) · log^(k+1)(n))
Case 3 f(n) = Ω(n^(log_b(a) + ε)) AND af(n/b) ≤ cf(n) for c < 1 T(n) = Θ(f(n))

Applications

Algorithm Recurrence Case Solution
Merge Sort T(n) = 2T(n/2) + O(n) Case 2 (k=0) O(n log n)
Binary Search T(n) = T(n/2) + O(1) Case 2 (k=0) O(log n)
Strassen's T(n) = 7T(n/2) + O(n²) Case 1 O(n^2.807)
Karatsuba T(n) = 3T(n/2) + O(n) Case 1 O(n^1.585)

Extended Master Theorem (Case 2)


10. Graph Algorithms

Depth-First Search (DFS)

DFS(u):
  visited[u] = true
  for each neighbor v of u:
    if not visited[v]:
      DFS(v)

Topological Sort using DFS

  1. Perform DFS on DAG
  2. Add vertex to front of list after exploring all descendants
  3. Time: O(V + E)
  4. Valid only for DAGs (directed acyclic graphs)

Breadth-First Search (BFS)

BFS(source):
  queue.add(source)
  visited[source] = true
  while queue not empty:
    u = queue.remove()
    for each neighbor v of u:
      if not visited[v]:
        visited[v] = true
        queue.add(v)

Strongly Connected Components (SCC)

Kosaraju's Algorithm:
1. Perform DFS, record finish times
2. Reverse the graph
3. Process vertices in decreasing finish time order
4. Each DFS tree in step 3 is an SCC
- Time: O(V + E)

Tarjan's Algorithm:
1. Single DFS with low-link values
2. Uses stack to track current SCC
- Time: O(V + E)
- More efficient in practice (single pass)

Articulation Points and Bridges

Articulation Point (Cut Vertex): Vertex whose removal disconnects the graph.
Bridge (Cut Edge): Edge whose removal disconnects the graph.

Tarjan's approach using DFS:
- Maintain disc[u] (discovery time) and low[u] (earliest reachable)
- u is articulation if: root with ≥2 children, or child v has low[v] ≥ disc[u]
- Edge (u,v) is bridge if: low[v] > disc[u]
- Time: O(V + E)


11. Algorithm Complexity Summary

Algorithm Best Average Worst Space
Merge Sort O(n log n) O(n log n) O(n log n) O(n)
Quick Sort O(n log n) O(n log n) O(n²) O(log n)
Heap Sort O(n log n) O(n log n) O(n log n) O(1)
Matrix Chain O(n³) O(n³) O(n³) O(n²)
LCS O(mn) O(mn) O(mn) O(mn)
0/1 Knapsack O(nW) O(nW) O(nW) O(nW)
Dijkstra O((V+E)log V) O((V+E)log V) O((V+E)log V) O(V)
Bellman-Ford O(VE) O(VE) O(VE) O(V)
Floyd-Warshall O(V³) O(V³) O(V³) O(V²)
BFS/DFS O(V+E) O(V+E) O(V+E) O(V)
Kruskal O(E log E) O(E log E) O(E log E) O(E)
Prim O((V+E)log V) O((V+E)log V) O((V+E)log V) O(V)
Strassen's O(n^2.807) O(n^2.807) O(n^2.807) O(n²)

12. Paradigm Comparison

Paradigm Key Idea When to Use Example
Divide & Conquer Break, solve recursively, combine Independent subproblems Merge Sort, Quick Sort
Greedy Local optimal choice Greedy choice + optimal substructure Dijkstra, Kruskal, Huffman
Dynamic Programming Solve subproblems, store results Overlapping subproblems + optimal substructure Knapsack, LCS, Matrix Chain
Backtracking Try and undo Search problems with constraints N-Queens, Subset Sum
Branch & Bound Prune with bounds Optimization problems TSP, 0/1 Knapsack (BB)

When to Use What?

Situation Method
Subproblems don't overlap Divide & Conquer
Greedy choice property holds Greedy Method
Subproblems overlap Dynamic Programming
Need to search solution space Backtracking
Optimization with pruning Branch and Bound
No known polynomial solution Approximation / Heuristic

13. Key Theorems and Formulas

Master Theorem Quick Reference

Stirling's Approximation

Number of Comparisons

Important Complexity Classes

Fast Matrix Multiplication Algorithms

Algorithm Complexity
Standard O(n³)
Strassen (1969) O(n^2.807)
Coppersmith-Winograd O(n^2.376)
Alman-Vassilevska Williams O(n^2.373)

14. Exam Tips


Practice Questions

9 MCQs for Design and Analysis of Algorithms with detailed explanations.

Q1. Which of the following best describes O(n²) — when pivot?

Dijkstra's Algorithm (Greedy)

✅ Correct Answer: Option A

Explanation:
The correct answer is Option A — always min or max.

This is a standard concept in Design and Analysis of Algorithms. The answer follows from the fundamental definitions and properties covered in this topic.

Why other options are incorrect:
- Option B (Sort by finish time; select earliest finishing compatible activity
-...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.
- Option C (O((V + E) log V)

Dijkstra's Algorithm (Greedy)


Q2. Consider the following about Sort by finish time; select earliest finishing compatible ac...

✅ Correct Answer: Option B

Explanation:
The correct answer is Option B — Sort by finish time; select earliest finishing compatible activity
-.

This is a standard concept in Design and Analysis of Algorithms. The answer follows from the fundamental definitions and properties covered in this topic.

Why other options are incorrect:
- Option A (It requires a minimum of O(n²) time complexity...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.
- Option C (It applies only to sequential processing models...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.
- Option D (It is not applicable in distributed systems...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.


Q3. Consider the following about O((V + E) log V)

Dijkstra's Algorithm (Greedy)

Dijkstra's Algorithm (Greedy)

✅ Correct Answer: Option B

Explanation:
The correct answer is Option B — O((V + E) log V)

Dijkstra's Algorithm (Greedy)

This is a standard concept in Design and Analysis of Algorithms. The answer follows from the fundamental definitions and properties covered in this topic.

Why other options are incorrect:
- Option A (It applies only to sequential processing models...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.
- Option C (It requires a minimum of O(n²) time complexity...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.
- Option D (It is not applicable in distributed systems...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.


Q4. Consider the following about Optimal solution contains optimal solutions to subproblems

2...

✅ Correct Answer: Option A

Explanation:
The correct answer is Option A — Optimal solution contains optimal solutions to subproblems
2..

This is a standard concept in Design and Analysis of Algorithms. The answer follows from the fundamental definitions and properties covered in this topic.

Why other options are incorrect:
- Option B (It applies only to sequential processing models...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.
- Option C (It is not applicable in distributed systems...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.
- Option D (It requires a minimum of O(n²) time complexity...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.


Q5. Consider the following about Subproblems share subsubproblems

Approaches

| Approach...

Approaches

Approach Description Direction
- D. It is not applicable in distributed systems

✅ Correct Answer: Option C

Explanation:
The correct answer is Option C — Subproblems share subsubproblems

Approaches

| Approach | Description | Direction |
|----------|-------------|-----.

This is a standard concept in Design and Analysis of Algorithms. The answer follows from the fundamental definitions and properties covered in this topic.

Why other options are incorrect:
- Option A (It requires a minimum of O(n²) time complexity...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.
- Option B (It applies only to sequential processing models...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.
- Option D (It is not applicable in distributed systems...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.


Q6. Consider the following about Does NOT need to be in NP

Cook-Levin Theorem

**SAT (Boo...

Cook-Levin Theorem

SAT (Boolean Satisfiability) is NP
-
C. It is not applicable in distributed systems
-
D.** It requires a minimum of O(n²) time complexity

✅ Correct Answer: Option B

Explanation:
The correct answer is Option B — Does NOT need to be in NP

Cook-Levin Theorem

**SAT (Boolean Satisfiability) is NP.

This is a standard concept in Design and Analysis of Algorithms. The answer follows from the fundamental definitions and properties covered in this topic.

Why other options are incorrect:
- Option A (It applies only to sequential processing models...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.
- Option C (It is not applicable in distributed systems...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.
- Option D (It requires a minimum of O(n²) time complexity...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.


Q7. Consider the following about Example: 3n² + 2n + 1 = O(n²)...

✅ Correct Answer: Option D

Explanation:
The correct answer is Option D — Example: 3n² + 2n + 1 = O(n²).

This is a standard concept in Design and Analysis of Algorithms. The answer follows from the fundamental definitions and properties covered in this topic.

Why other options are incorrect:
- Option A (It is not applicable in distributed systems...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.
- Option B (It applies only to sequential processing models...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.
- Option C (It requires a minimum of O(n²) time complexity...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.


Q8. Consider the following about Example: 3n² + 2n + 1 = Ω(n²)...

✅ Correct Answer: Option A

Explanation:
The correct answer is Option A — Example: 3n² + 2n + 1 = Ω(n²).

This is a standard concept in Design and Analysis of Algorithms. The answer follows from the fundamental definitions and properties covered in this topic.

Why other options are incorrect:
- Option B (It is not applicable in distributed systems...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.
- Option C (It requires a minimum of O(n²) time complexity...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.
- Option D (It applies only to sequential processing models...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.


Q9. Consider the following about Reflexive:** f(n) = Θ(f(n))...

✅ Correct Answer: Option D

Explanation:
The correct answer is Option D — Reflexive:** f(n) = Θ(f(n)).

This is a standard concept in Design and Analysis of Algorithms. The answer follows from the fundamental definitions and properties covered in this topic.

Why other options are incorrect:
- Option A (It is not applicable in distributed systems...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.
- Option B (It applies only to sequential processing models...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.
- Option C (It requires a minimum of O(n²) time complexity...) — This does not correctly answer the question as it either misstates the concept or refers to a different context.