Recursive Algorithms and Recurrence Equations
Overview
- Performance of recursive algorithms typically specified with
recurrence equations
- Recurrence equations require special techniques for solving
- We will focus on induction and the Master Method (and its variants)
- And touch on other methods
Analyzing Performance of Non-Recursive Routines is (relatively) Easy
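Single loop: $T(n) = \Theta(n)$
for i in 1 .. n loop
  ...
end loop;
Nested loops: $T(n) = \Theta(n^2)$
for i in 1 .. n loop
  for j in 1 .. n loop
    ...
  end loop;
end loop;
Triangular nested loops: $T(n) = \Theta(n^2)$, since the body runs $1 + 2 + \dots + n = n(n+1)/2$ times
for i in 1 .. n loop
  for j in 1 .. i loop
    ...
  end loop;
end loop;
(Each bound assumes a constant-time loop body.)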
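As a sanity check on loop counting, here is a minimal Python sketch that counts how often the innermost body runs for a single loop, a full double loop, and a triangular double loop. The function names are illustrative, and each loop body is assumed to take constant time.

```python
def count_single(n):
    """Body executions of: for i in 1 .. n."""
    count = 0
    for i in range(1, n + 1):
        count += 1
    return count


def count_square(n):
    """Body executions of: for i in 1 .. n, for j in 1 .. n."""
    count = 0
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            count += 1
    return count


def count_triangular(n):
    """Body executions of: for i in 1 .. n, for j in 1 .. i."""
    count = 0
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            count += 1
    return count


for n in (10, 100):
    print(n, count_single(n), count_square(n), count_triangular(n))
```

The counts come out as $n$, $n^2$, and $n(n+1)/2$, so both double loops grow quadratically.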
Performance of Recursive Routines
- Analysis of recursive routines is not as easy
- Let's look at several:
- Factorial
- Fibonacci
- Binary Search
Performance of Factorial
fac(n)
  if n = 1
    return 1
  else
    return n * fac(n-1)
For fac(n), how many times is fac called?
- $T(1) = 1$
- $T(2) = T(1) + 1 = 2$
- $T(3) = T(2) + 1 = 3$
- $T(n) = T(n-1) + 1 $ [Open form]
- $T(n) = n$ [Guess closed form. Proved by induction below]
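The guess $T(n) = n$ can be checked empirically before proving it. The sketch below is a Python transcription of fac that also returns the number of calls made; the name `fac_calls` and the tuple return value are choices made here for illustration, not part of the original routine.

```python
def fac_calls(n):
    """Return (n!, number of calls to fac), for n >= 1."""
    if n == 1:
        return 1, 1  # base case: one call, no recursion
    value, calls = fac_calls(n - 1)
    return n * value, calls + 1  # this call plus all recursive calls


for n in range(1, 8):
    value, calls = fac_calls(n)
    print(n, value, calls)
```

For every $n$ tried, the call count equals $n$, matching the closed form.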
Performance of Fibonacci
fibb(n)
  if n in 1 .. 2
    return 1
  else
    return fibb(n-1) + fibb(n-2)
For fibb(n), how many times is fibb called?
- $T(1) = 1$
- $T(2) = 1$
- $T(3) = T(1) + T(2) + 1 = 3 $
- $T(4) = T(3) + T(2) + 1 = 3 + 1 + 1 = 5 $
- $T(5) = T(4) + T(3) + 1 = 5 + 3 + 1 = 9 $
- $T(n) = T(n-1) + T(n-2) + 1 $
- $T(n) = ?$ [This one's not so easy]
- Can be solved using characteristic equation
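The call counts can be tabulated with a short Python sketch. The helper `fibb_calls` (a name chosen here) returns the standard Fibonacci number together with the number of calls made; the count depends only on the shape of the recursion, not on the returned value.

```python
def fibb_calls(n):
    """Return (n-th Fibonacci number, number of calls made), for n >= 1."""
    if n <= 2:
        return 1, 1  # base cases: a single call
    v1, c1 = fibb_calls(n - 1)
    v2, c2 = fibb_calls(n - 2)
    return v1 + v2, c1 + c2 + 1  # T(n) = T(n-1) + T(n-2) + 1


for n in range(1, 11):
    value, calls = fibb_calls(n)
    print(n, value, calls)
```

In this table the call count always equals twice the Fibonacci number minus one, which suggests that $T(n)$ grows at the same exponential rate as the Fibonacci numbers themselves.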
Performance of Recursive Binary Search
index binsearch(number n, index low, index high,
                const keytype S[], keytype x)
  if low ≤ high then
    mid = (low + high) / 2
    if x = S[mid] then
      return mid
    elsif x < S[mid] then
      return binsearch(n, low, mid-1, S, x)
    else
      return binsearch(n, mid+1, high, S, x)
    end if
  else
    return 0
  end if
end binsearch
For binsearch, how many times is binsearch called in the worst case? (Here $T(n)$ counts calls when $n$ elements lie between low and high.)
- $T(1) = 2$
- $T(2) = T(1) + 1 = 3$
- $T(4) = T(2) + 1 = 4 $
- $T(8) = T(4) + 1 = 4 + 1 = 5 $
- $T(n) = T(n/2) + 1 $
- $T(n) = \lg n + 2$ for $n=2^k$ (Guess. Prove by induction)
- $T(n) = \Theta(\lg n)$
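The worst-case count can be observed directly. The following Python sketch mirrors the pseudocode but, as assumptions made here, uses 0-based indices, returns -1 instead of 0 when the key is absent, and returns the number of calls alongside the index. Searching for a key larger than every element always recurses into the larger half, which is the worst case when $n$ is a power of 2.

```python
def binsearch(S, x, low, high):
    """Search sorted list S[low..high] (inclusive) for x.

    Returns (index of x or -1, number of calls made).
    """
    if low > high:
        return -1, 1  # empty range: one call, not found
    mid = (low + high) // 2
    if x == S[mid]:
        return mid, 1
    if x < S[mid]:
        idx, calls = binsearch(S, x, low, mid - 1)
    else:
        idx, calls = binsearch(S, x, mid + 1, high)
    return idx, calls + 1


for k in range(6):
    n = 2 ** k
    S = list(range(n))
    _, calls = binsearch(S, n, 0, n - 1)  # n is larger than every element
    print(n, calls, k + 2)
```

For each power of 2 tried, the measured count equals $\lg n + 2$.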
Recurrence Equation - Definition and Examples
- A recurrence equation defines $T(n)$ in terms of $T$ for smaller values
- Performance of many recursive algorithms is described by a Recurrence Equation
- Examples:
- Factorial (Every Case): $T(n) = T(n-1) + 1$
- Fibonacci (Every Case): $T(n) = T(n-1) + T(n-2) + 1$
- Binary Search (Worst Case): $T(n) = T(n/2) + 1$
- Quick Sort (Worst Case): $T(n) = T(n-1) + \Theta(n)$
- Quick Sort (Best Case): $T(n) = 2T(\frac{n}{2}) + \Theta(n)$
- Merge Sort (Every Case): $T(n) = 2T(\frac{n}{2}) + \Theta(n)$
- General Forms:
- $T(n) = aT(\frac{n}{b}) + f(n)$
- $T(n) = aT(n - b) + f(n)$
Technical Issues
- Floors and Ceilings
- Usually Ignore
- Or assume size is power of 2
- Exact vs asymptotic functions
- Usually asymptotic is good enough
- Finding exact solutions requires exact boundary conditions
- Boundary conditions
- Usually we use asymptotic values
Reminder: $\Theta(n)$ on the RHS
- In $T(n) = T(n-1) + \Theta(n)$, the $\Theta(n)$ term refers to some function
in $\Theta(n)$
- In most cases, the exact function does not affect the asymptotic behavior
- This allows us to simplify by ignoring the exact function
Recurrence Equations - Solution Techniques
- No simple way to solve all recurrence equations
- Recurrence equations are open forms
- Following techniques are used:
- Guess a solution and use induction to prove its correctness
- Use a general formula (ie the Master Method)
- For $T(n) = aT(\frac{n}{b}) + cn^k$
- For $T(n) = aT(\frac{n}{b}) + f(n)$
- Solve using Characteristic Equation
- Linear homogeneous equations with constant coefficients
- Non-homogeneous linear equations with constant coefficients
- Change of Variable
- Substitution
- We focus on the general formulae and touch on the others
- General formulae can be understood using recursion trees
- First we see an example of induction
Example Using Induction: Factorial
- For fac(n), how many times is fac called (every case)?
- Above we derived:
- $T(1) = 1$
- $T(k) = T(k-1) + 1 $
- $T(n) = n$
- Inductive proof:
- Base Case: $T(1) = 1$
- IH: Assume that $T(k) = k, \textrm{ for all } k < n$
- Now consider $T(n)$:
$\begin{align*}
T(n) & = T(n-1) + 1 \\
& = (n-1) + 1\textrm{, by IH} \\
& = n
\end{align*}
$
- Therefore, $T(n) = n$ for all $n ≥ 1$
Example Using Induction: Binary Search
- Worst case number of calls of binary search
- $T(n) = T(n/2) + 1$, with $T(1) = 2$ [Derived above]
- $T(n) = \lg n + 2$ [Guessed above.]
- Inductive proof:
- Base Case: $T(1) = \lg 1 + 2 = 0 + 2 = 2$
- IH: Assume that $T(k) = \lg k + 2, \textrm{ for all } k < n$
- Now consider $T(n)$:
$\begin{align*}
T(n) & = T(n/2) + 1\textrm{, by definition of } T(n) \\
& = \lg (n/2) + 2 + 1\textrm{, by IH} \\
& = \lg n - \lg 2 + 3 \\
& = \lg n - 1 + 3 \\
& = \lg n + 2
\end{align*}
$
- Therefore, $T(n) = \lg n + 2$ for all $n ≥ 1$ where $n$ is a power of 2
- Can also be proved for non-powers of 2
Recursion Tree
- A Recursion Tree is a technique for calculating the amount of work expressed by a recurrence equation
- Each level of the tree shows the non-recursive work for a given parameter
value
- Write each node with two parts:
- Upper part: $T(s)$ for some $s$
- Lower part: non-recursive part of $T(s)$
- Relation between parts: Upper part = lower part + upper parts of children
Example Recursion Tree
- Consider $T(n) = 2T(\frac{n}{3}) + 5n$
- $T(1) = 7$
- These values don't represent a specific algorithm.
They are just chosen for illustration.
- First rewrite as $T(s) = 5s + 2T(\frac{s}{3})$
- Putting non-recursive term first is (slightly) easier
- Using a different variable allows us to focus on calculating $T(n)$
- Now calculate $T(27) = T(3^3)$:
- Root is $T(27)$:
- Upper part: $T(27)$
- Lower part: $5s=5\times 27 = 135$
- Children of root:
- How many children?
- What's in each child?
- Calculating $T(27)$:
$\begin{align*}
T(27) & = 135 + 2 \times T(27/3) \\
& = 135 + 2 \times [45 + 2 \times T(9/3)] \\
& = 135 + 2 \times [45 + 2 \times [15 + 2 \times T(3/3)]] \\
& = 135 + 2 \times [45 + 2 \times [15 + 2 \times 7]] \\
& = 135 + 2 \times [45 + 2 \times 29] \\
& = 135 + 2 \times 103 \\
& = 135 + 206 \\
& = 341
\end{align*}
$
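The hand expansion can be double-checked by evaluating the recurrence directly. This Python sketch assumes the recursion bottoms out at $T(1) = 7$, which is how the expansion above treats $T(3/3)$, and that $s$ is a power of 3.

```python
def T(s):
    """T(s) = 5s + 2*T(s/3), with T(1) = 7, for s a power of 3."""
    if s == 1:
        return 7
    return 5 * s + 2 * T(s // 3)


for s in (1, 3, 9, 27):
    print(s, T(s))  # T(27) is 341
```

The intermediate values 29 and 103 for $T(3)$ and $T(9)$ match the bracketed terms in the expansion.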
Example Recursion Tree
- Consider $T(s) = 5s + 2T(\frac{s}{3}) $
- Calculate $T(27)$
Tree Properties
- Consider $T(s) = 5s + 2T(\frac{s}{3}) $
- Bottom level is level $\log_3 27$ = level 3
- Number of levels: $ 1 + \log_3 27 = 1 + 3 = 4$
- Sum of level 0: $2^0 \times (5\times 27) = 135$
- Sum of level 1: $2^1 \times (5\times 9) = 90$
- Sum of level 2: $2^2 \times (5\times 3) = 60$
- Sum of bottom level: $2^{\log_3 27} \times 7 = 2^3 \times 7 = 56$
- Sum of all levels: $135 + 90 + 60 + 56 = 341$
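The level-by-level sums can be reproduced with a short Python sketch. The function `level_sums` and its parameter names are illustrative; it assumes the non-recursive work is $5s$, that each node has 2 children of size $s/3$, and that each leaf (size 1) costs 7.

```python
def level_sums(n, a=2, b=3, leaf_cost=7):
    """Per-level totals of the recursion tree for T(s) = 5s + a*T(s/b)."""
    sums = []
    level, size = 0, n
    while size > 1:
        # a**level nodes at this level, each doing 5*size non-recursive work
        sums.append(a ** level * 5 * size)
        size //= b
        level += 1
    # bottom level: a**level leaves, each costing leaf_cost
    sums.append(a ** level * leaf_cost)
    return sums


sums = level_sums(27)
print(sums, sum(sums))  # [135, 90, 60, 56] 341
```

Summing by level gives the same total as expanding the recurrence from the root.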
Recursion Tree for $T(n) = aT(\frac{n}{b}) + f(n)$
- General Form: $T(s) = f(s) + a T(s/b)$
- Find $T(n)$
Summing the Values in the Tree
- General Form: $T(s) = f(s) + a T(s/b)$
- Bottom level is level $\log_b n$
- Number of levels: $ 1 + \log_b n$
- Sum of level 0: $a^0 \times f(\frac{n}{b^0}) $
- Sum of level 1: $a^1 \times f(\frac{n}{b^1}) $
- Sum of level 2: $a^2 \times f(\frac{n}{b^2}) $
- Sum of level i: $a^i \times f(\frac{n}{b^i}) $
- Sum of bottom level: $d a^{\log_b n} = d n^{\log_b a}$, where $d = T(1)$
- Sum of all levels:
$$
d n^{\log_b a} + \sum_{i=0}^{\log_b n - 1} a^i f(\frac{n}{b^i})
$$
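As a worked instance of summing the levels, take $a = 2$, $b = 2$, $f(n) = n$ and $T(1) = d$ (values chosen here purely for illustration). Level $i$ contributes $a^i f(n/b^i) = 2^i \cdot n/2^i = n$, so every non-leaf level does the same amount of work:

$$
\begin{align*}
T(n) & = d\,n^{\log_2 2} + \sum_{i=0}^{\lg n - 1} 2^i \cdot \frac{n}{2^i} \\
& = dn + \sum_{i=0}^{\lg n - 1} n \\
& = dn + n\lg n
\end{align*}
$$

Here the sum over the levels dominates the leaf term by a factor of $\lg n$, giving $T(n) = \Theta(n \lg n)$.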
Fine Points
- $a ≥ 1, b > 1, \text{ are constants}$
- If $n$ is not a power of $b$, then replace $n/b$ by $\lfloor n/b \rfloor$ (or
$\lceil n/b \rceil$)
- It can be proved that this does not affect the complexity
Evaluating the Complexity of the Sum of the Tree Levels
- Sum of all levels:
$\displaystyle \Theta(n^{\log_b a}) + \sum_{i=0}^{\log_b n - 1} a^i f(\frac{n}{b^i})
$
- Simplify base case of recursion using $\Theta(1)$ instead of $d$
- When this formula is a polynomial in $n$, we can simplify it by
considering the power of $n$ in each term:
- Either first or second term can dominate or they can be the same order
- Possible results are as follows:
- $\Theta(n^{\log_b a})$, if the first term dominates
- $\Theta(n^{\log_b a}\lg n )$, if neither term dominates
- $\Theta(f(n))$, if the second term (ie the sum) dominates
- Some restrictions occur
- These results can be used to develop a more general Master Method
Specifying How One Term Dominates Another
- To determine a more general Master Method, we consider the relation
between the terms of the sum of the levels.
- Sum of all levels:
$\displaystyle \Theta(n^{\log_b a}) + \sum_{i=0}^{\log_b n - 1} a^i f(\frac{n}{b^i})
$
- Formal definition of one term dominating is below (assume $0 < \epsilon$):
- $\Theta(n^{\log_b a})$, if $f(n) = O(n^{\log_b a-\epsilon})$ [ie first term dominates]
- $\Theta(n^{\log_b a}\lg n )$, if $f(n) = \Theta(n^{\log_b a})$ [ie neither term dominates]
- $\Theta(f(n))$, if $f(n) = \Omega(n^{\log_b a+\epsilon})$ and $af(n/b) ≤ cf(n)$ for some constant $c < 1$ and large $n$ [ie second term dominates]
- The first condition of this case can also be written as $n^{\log_b a+\epsilon} = O(f(n))$
- We ignore this case
- These results give the general Master Method
General Master Method
- Recurrence Equation $T(n) = aT(n/b) + f(n)$ has following solutions:
- $\Theta(n^{\log_b a})$, if $f(n) = O(n^{\log_b a-\epsilon})$
- $\Theta(n^{\log_b a}\lg n )$, if $f(n) = \Theta(n^{\log_b a})$
- $\Theta(f(n))$, if $f(n) = \Omega(n^{\log_b a+\epsilon})$ and $af(n/b) ≤ cf(n)$ for some constant $c < 1$ and large $n$
- The first condition of this case can also be written as $n^{\log_b a+\epsilon} = O(f(n))$
- We ignore this case
- Assume $0 < \epsilon$
Examples
- $T(n) = 8T(n/2) + n$
- $a=8, b=2, f(n)=n, \log_b a = \log_2 8 = 3$
- Case 1: $f(n) = n = O(n^{3-1})$, with $\epsilon=1$
- Thus, $T(n) = \Theta(n^{\log_b a}) = \Theta(n^{\lg 8}) = \Theta(n^3)$
- $T(n) = 2T(n/2) + n$
- $a=2, b=2, f(n)=n, \log_b a = \lg 2 = 1$
- Case 2: $f(n) = n = n^1 = \Theta(n^{\log_b a})$
- Thus, $T(n) = \Theta(n^{\log_b a}\lg n) = \Theta(n \lg n)$
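Both results can be checked numerically. The Python sketch below evaluates the two recurrences exactly for powers of 2, assuming a base case of $T(1) = 1$ (the examples above do not fix one), and prints the ratio of $T(n)$ to the predicted growth rate. The function names are illustrative.

```python
import math


def T8(n):
    """T(n) = 8T(n/2) + n, with assumed T(1) = 1, for n a power of 2."""
    return 1 if n == 1 else 8 * T8(n // 2) + n


def T2(n):
    """T(n) = 2T(n/2) + n, with assumed T(1) = 1, for n a power of 2."""
    return 1 if n == 1 else 2 * T2(n // 2) + n


for k in (2, 5, 10, 15):
    n = 2 ** k
    # If the predicted orders are right, both ratios approach constants.
    print(n, T8(n) / n ** 3, T2(n) / (n * math.log2(n)))
```

With this base case the first ratio approaches $4/3$ and the second approaches 1, consistent with $\Theta(n^3)$ and $\Theta(n \lg n)$.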
Master Method from Text
- If $f(n) = cn^k$, then the General Master Method reduces to the Master
Method in the text (with different constant names)
- $T(n) = aT(\frac{n}{b}) + cn^k$, for $n > 1$ and $n$ a power of $b$
- $T(1) = d$
- $b ≥ 2$ and $k ≥ 0$ are constant integers
- $a > 0, c > 0, d ≥ 0$ are constants
- $
T(n) \in
\begin{cases}
\Theta(n^k), & a < b^k \\
\Theta(n^k\lg n), & a = b^k \\
\Theta(n^{\log_b a}), & a > b^k
\end{cases}
$
- Can also replace $T(n)=$ with ≤ or ≥ and get $O$ or $\Omega$
performance
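The three cases are mechanical enough to sketch as a small Python helper. The function `master_method` and its return convention (an exponent plus a flag for the extra $\lg n$ factor) are choices made here for illustration; it assumes integer $b ≥ 2$ and $k ≥ 0$ so that $a$ and $b^k$ can be compared exactly.

```python
import math


def master_method(a, b, k):
    """Classify T(n) = a*T(n/b) + c*n^k.

    Returns (exponent, has_log_factor): the solution is
    Theta(n^exponent * lg n) if has_log_factor, else Theta(n^exponent).
    """
    if a < b ** k:
        return k, False  # Theta(n^k)
    if a == b ** k:
        return k, True  # Theta(n^k lg n)
    return math.log(a, b), False  # Theta(n^(log_b a))


print(master_method(8, 2, 1))  # 8T(n/2) + n
print(master_method(2, 2, 1))  # 2T(n/2) + n
print(master_method(1, 2, 0))  # T(n/2) + 1, binary search
print(master_method(2, 3, 1))  # 2T(n/3) + 5n
```

The constant $c$ does not appear as a parameter because it does not affect which case applies.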
Master Method from Text is a Simplification of the General MM
- The text's Master Method follows from the general Master Method, as follows:
$
\begin{align*}
a < b^k & \Leftrightarrow \lg a < \lg b^k \\
& \Leftrightarrow \lg a < k\lg b \\
& \Leftrightarrow \frac{\lg a}{\lg b} < k \\
& \Leftrightarrow \log_b a < k \\
& \Leftrightarrow n^{\log_b a} < n^k \\
& \Leftrightarrow n^{\log_b a} = O(n^{k-\epsilon})
\end{align*}
$
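- Assume values of constants are as given above.
- So $a < b^k$ means $f(n) = cn^k = \Omega(n^{\log_b a+\epsilon})$ for some $\epsilon > 0$; also $af(n/b) = (a/b^k)f(n)$ with $a/b^k < 1$. This is the third case of the general method, giving $\Theta(n^k)$
- Similarly, $a = b^k$ gives $\log_b a = k$ (second case), and $a > b^k$ gives $\log_b a > k$ (first case)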
Other Methods for Solving Recurrence Equations
- Characteristic Equation
- Linear homogeneous equations with constant coefficients
- Non-homogeneous linear equations with constant coefficients
- Change of Variable
- Substitution
- We ignore all but the first, which we look at briefly
Homogeneous Linear Recurrence Equations
- General form: $a_0t_n + a_1t_{n-1} + \dots + a_kt_{n-k} = 0$
- Slightly different from previous recurrence equations
- Notation: $t_n$ vs $T(n)$
- all $t$ terms on LHS of equation
- 0 on RHS of equation
- Assume solution is of form $t_n = r^n$, for all $n$
- Generate characteristic equation by substituting $r^n$ for $t_n$ and
dividing by $r^{n-k}$
- Characteristic equation: $a_0r^k + a_1r^{k-1} + \dots + a_kr^{k-k} = 0$
- Find the roots of the characteristic equation and use them to form the solution of the recurrence
Homogeneous Linear Recurrence Equation: Example
- Recurrence Equation: $t_n - 5t_{n-1} + 6t_{n-2} = 0$
- Assume solution is of form $t_n = r^n$, for all $n$
- Corresponding Characteristic Equation (CE): $r^2 - 5r^1 + 6r^0 = 0$
- $r^n - 5r^{n-1} + 6r^{n-2} = r^{n-2}(r^2 - 5r^1 + 6r^0) = 0$
- Find roots of CE: $r^2 - 5r^1 + 6r^0 = (r-3)(r-2) = 0$
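- Roots of the CE are 3 and 2; $r = 0$ also satisfies the full equation because of the factor $r^{n-2}$
- That is, $r^n - 5r^{n-1} + 6r^{n-2} = 0$ when r=0 or r=2 or r=3
- By assumption, $t_n=0$ (the trivial solution), $t_n=3^n$ and $t_n=2^n$ are all solutions to the RE.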
- General solution: $t_n = c_1 3^n + c_2 2^n$
- Plugging general solution into recurrence and distributing gives 0
- Choose $c_1$ and $c_2$ to meet the initial conditions: for example, $t_0 = 0$ and $t_1 = 1$ give $c_1=1$ and $c_2 = -1$
- Thus: solution is $t_n = 3^n - 2^n$
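The closed form can be checked against the recurrence itself. This Python sketch iterates $t_n = 5t_{n-1} - 6t_{n-2}$ from the assumed initial conditions $t_0 = 0$ and $t_1 = 1$ (the values implied by $c_1 = 1$ and $c_2 = -1$) and compares the result with $3^n - 2^n$. The function names are illustrative.

```python
def t_iterative(n):
    """t_n = 5*t_{n-1} - 6*t_{n-2}, with assumed t_0 = 0 and t_1 = 1."""
    prev, cur = 0, 1
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, 5 * cur - 6 * prev
    return cur


def t_closed(n):
    """Closed form obtained from the characteristic equation."""
    return 3 ** n - 2 ** n


for n in range(8):
    print(n, t_iterative(n), t_closed(n))
```

The two columns agree for every $n$ tried, as the general solution predicts.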
Non-homogeneous Linear Recurrence Equations
- General form: $a_0t_n + a_1t_{n-1} + \dots + a_kt_{n-k} = f(n) \ne 0$
- Notation: $t_n$ vs $T(n)$
- Look at later
Change of Variable
- Substitute, for example, $2^k$ for $n$
- We ignore this
Substitution
- Repeatedly substitute the recurrence for the smaller terms ($t_{n-1}$, $t_{n-2}$, ...) to help guess a closed form
- Example: $t_n=t_{n-1} + n, t_1=1$
$\begin{align*}
t_n & = t_{n-1} + n \\
& = [t_{n-2} + n-1] + n \\
& = [(t_{n-3} + n-2) + n-1] + n \\
& \dots \\
& = 1 + 2 + \dots + (n-1) + n \\
& = \sum_{i=1}^n i \\
& = \frac{n(n+1)}{2}
\end{align*}
$
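The sum obtained by substitution is the arithmetic series, which evaluates to $n(n+1)/2$. A minimal Python sketch comparing the recurrence with that closed form (the function name `t` is illustrative, and $n$ is kept small to stay within Python's default recursion limit):

```python
def t(n):
    """t_n = t_{n-1} + n, with t_1 = 1."""
    return 1 if n == 1 else t(n - 1) + n


for n in (1, 2, 3, 10, 100):
    print(n, t(n), n * (n + 1) // 2)
```

The values match, so $t_n = \Theta(n^2)$.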