Big-Oh for Recursive Functions: Recurrence Relations
It's not easy to determine the asymptotic complexity (using big-Oh) of recursive functions without an easy-to-use but underutilized tool. This web page gives an introduction to how recurrence relations can be used to help determine the big-Oh running time of recursive functions. This material is taken from what we present in our courses at Duke University and was given at a College Board AP workshop in August of 1998 at Berkeley.

Motivation
The problem below appeared as AB Problem 3 on the 1996 AP exam, for which a C++ translation has been made.

Given a binary tree, is it a search tree?
In part A students are asked to write the function ValsLess:
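A sketch of what ValsLess might look like appears below, together with an IsBST built from it and a mirror-image ValsGreater; the TreeNode struct, field names, and exact signatures are assumptions rather than the exam's actual declarations.

```cpp
// Hypothetical node type; the exam's declarations may differ.
struct TreeNode
{
    int        info;
    TreeNode * left;
    TreeNode * right;
};

// true if every value in the tree rooted at t is less than val
bool ValsLess(TreeNode * t, int val)
{
    if (t == 0) return true;                 // empty tree: vacuously true
    return t->info < val
        && ValsLess(t->left, val)
        && ValsLess(t->right, val);
}

// true if every value in the tree rooted at t is greater than val
bool ValsGreater(TreeNode * t, int val)
{
    if (t == 0) return true;
    return t->info > val
        && ValsGreater(t->left, val)
        && ValsGreater(t->right, val);
}

// true if the tree rooted at t is a binary search tree: the root is greater
// than everything in its left subtree, less than everything in its right
// subtree, and both subtrees are themselves search trees.
bool IsBST(TreeNode * t)
{
    if (t == 0) return true;
    return ValsLess(t->left, t->info)
        && ValsGreater(t->right, t->info)
        && IsBST(t->left)
        && IsBST(t->right);
}
```

Note the shape: IsBST makes two recursive calls and, through ValsLess/ValsGreater, does work proportional to the size of the subtrees. That shape is exactly what the recurrence developed below captures.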
A function with similar characteristics
What is the asymptotic complexity of the function DoStuff shown below? Why? Assume that the function Combine runs in O(n) time when |left-right| = n, i.e., when Combine is used to combine n elements in the vector a. After the sketch we'll make a mathematical definition.
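The sketch below shows a function with the shape the question describes; Combine is assumed to be an O(n) helper, and its body here is only a placeholder.

```cpp
#include <vector>
using namespace std;

// Assumed helper: combines the elements of a[left..right] in O(n) time,
// where n = |left - right|. The body shown here is only a placeholder.
void Combine(vector<int> & a, int left, int right)
{
    // ... O(n) work over a[left..right] ...
}

// Recursively processes the two halves of a[left..right], then combines them.
void DoStuff(vector<int> & a, int left, int right)
{
    if (left < right)                      // more than one element
    {
        int mid = (left + right) / 2;
        DoStuff(a, left, mid);             // first half
        DoStuff(a, mid + 1, right);        // second half
        Combine(a, left, right);           // O(n) where n = |left - right|
    }
}
```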
The Recurrence Relation
Let T(n) be the time for DoStuff to execute on an n-element vector, i.e., when |left-right| = n. Note that the time for DoStuff to execute on a one-element vector is O(1), constant time.

Then we have the following relationship:
    T(n) = 2 T(n/2) + O(n)     [the O(n) is for Combine]
    T(1) = O(1)

This relationship is called a recurrence relation because the function T(..) occurs on both sides of the = sign. This recurrence relation completely describes the function DoStuff, so if we could solve the recurrence relation we would know the complexity of DoStuff since T(n) is the time for DoStuff to execute.
Base Case
When you write a recurrence relation you must write two equations: one for the general case and one for the base case. These correspond to the recursive case and the base case of the function to which the recurrence applies. The base case is often an O(1) operation, though it can be otherwise. In the recurrence above the base case involves input of size one, so we wrote T(1) = O(1). However, there are cases when the base case has size zero. In such cases the base case would be T(0) = O(1).

How does this relate to the time for IsBST to execute? If you look carefully at the code for IsBST you'll see that it has the same form as the function DoStuff, so IsBST will have the same recurrence relation as DoStuff. This means that if you accept that DoStuff is an O(n log n) function, then IsBST is also an O(n log n) function.
Solving Recurrence Relations
You can actually solve the recurrence relation given above. We'll sketch how to do that here.

We'll write n instead of O(n) in the first line below because it makes the algebra much simpler.
    T(n) = 2 T(n/2) + n
         = 2 [2 T(n/4) + n/2] + n
         = 4 T(n/4) + 2n
         = 4 [2 T(n/8) + n/4] + 2n
         = 8 T(n/8) + 3n
         = (ask your class to fill in this line, or you fill it in)
           [you should have written: 16 T(n/16) + 4n]
         = 2^k T(n/2^k) + k n     [this is the Eureka! line]

You could ask students to fill in parts of the last line. Note that the last line is derived by seeing a pattern --- this is the Eureka/leap of faith/practice with generalizing mathematical patterns part of the problem.
We know that T(1) = 1 and this is a way to end the derivation above. In particular we want T(1) to appear on the right hand side of the = sign. This means we want:
    n/2^k = 1     OR     n = 2^k     OR     log2 n = k

Continuing with the previous derivation we get the following, since k = log2 n:
         = 2^k T(n/2^k) + k n
         = 2^(log2 n) T(1) + (log2 n) n
         = n + n log2 n     [remember that T(1) = 1]
         = O(n log n)

So we've solved the recurrence relation and its solution is what we "knew" it would be. To make this a formal proof you would need to use induction to show that O(n log n) is the solution to the given recurrence relation, but the "plug and chug" method shown above shows how to derive the solution --- the subsequent verification that this is the solution is something that can be left to a more advanced algorithms class.
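One way to sanity-check the algebra with a class is to evaluate the recurrence directly (taking T(1) = 1 and n a power of two) and compare it against n + n log2 n; a small program along these lines will show the two agree exactly:

```cpp
#include <cmath>
#include <iostream>
using namespace std;

// T(n) = 2 T(n/2) + n with T(1) = 1, evaluated directly from the recurrence.
long long T(long long n)
{
    if (n == 1) return 1;
    return 2 * T(n / 2) + n;
}

int main()
{
    for (long long n = 1; n <= 1024; n *= 2)
    {
        long long k = llround(log2((double) n));   // k = log2 n since n = 2^k
        cout << "n = " << n
             << "   recurrence: " << T(n)
             << "   n + n*log2(n): " << (n + n * k) << endl;
    }
    return 0;
}
```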
Recurrence Relations to Remember
My suggestion is to do what we do in our classes: show students the "big five" recurrence relations below and ask them to remember what algorithms these are associated with. We ask our students to solve other recurrence relations, but we really want them to reason about recursive functions using the recurrence relations below more than knowing how to solve any given recurrence relation. We also want students to be able to derive a recurrence relation from a recursive function --- more on that later (see the small example after the table below).

- T(n) = T(n/2) + O(1)
- T(n) = T(n-1) + O(1)
- T(n) = 2 T(n/2) + O(1)
- T(n) = T(n-1) + O(n)
- T(n) = 2 T(n/2) + O(n)
Recurrence | Algorithm | Big-Oh Solution |
---|---|---|
T(n) = T(n/2) + O(1) | Binary Search | O(log n) |
T(n) = T(n-1) + O(1) | Sequential Search | O(n) |
T(n) = 2 T(n/2) + O(1) | Tree Traversal | O(n) |
T(n) = T(n-1) + O(n) | Selection Sort (and other O(n^2) sorts) | O(n^2) |
T(n) = 2 T(n/2) + O(n) | Mergesort (average case Quicksort) | O(n log n) |
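As a small example of deriving a recurrence from a recursive function, consider a recursive binary search; the code below is only a sketch. Each call does O(1) work and makes at most one recursive call on half the range, giving the first recurrence in the table, T(n) = T(n/2) + O(1), with solution O(log n).

```cpp
#include <vector>
using namespace std;

// Returns the index of target in the sorted vector a, searching a[low..high],
// or -1 if target is not present.
int BinSearch(const vector<int> & a, int low, int high, int target)
{
    if (low > high) return -1;                        // base case: O(1)
    int mid = (low + high) / 2;                       // O(1) work per call
    if (a[mid] == target)     return mid;
    else if (a[mid] < target) return BinSearch(a, mid + 1, high, target);
    else                      return BinSearch(a, low, mid - 1, target);
}
```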
Practice Problem
Find the kth largest element in a vector, where the kth largest is greater than k elements, so that the 0th largest is the smallest element, the 3rd largest is greater than three elements (and will have index 3 if the vector is sorted), and so on. Assume that there are no duplicate elements in the vector, to make the solution easier.

The solution below correctly solves the problem. It makes a call to the partition function from Quicksort. Assume that the partition function runs in O(n) time for an n-element vector/vector-segment. For completeness we'll include a partition function at the end of this document.
For an n-element vector a the call FindKth(a,0,n-1,k) returns the kth element in a:
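A sketch of one way FindKth might be written follows; it assumes a Partition function that rearranges a[first..last] around a pivot and returns the pivot's final index in O(n) time (one such function is sketched at the end of this document).

```cpp
#include <vector>
using namespace std;

// Assumed: partitions a[first..last] around a pivot and returns the pivot's
// final index, in O(n) time. A sketch appears at the end of this document.
int Partition(vector<int> & a, int first, int last);

// Returns the kth element of a (the element that would have index k if a were
// sorted). Only the side of the partition containing index k is searched.
int FindKth(vector<int> & a, int first, int last, int k)
{
    int p = Partition(a, first, last);     // pivot's final (global) index
    if (k == p)     return a[p];
    else if (k < p) return FindKth(a, first, p - 1, k);
    else            return FindKth(a, p + 1, last, k);
}
```

Note that each call makes at most one recursive call, on the segment that still contains index k; that observation drives both analyses below.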
Worst-case for FindKth
In the worst case the partition function is malicious and partitions the vector poorly every time it's called. A poor partition is one where the first element in the segment being partitioned is the pivot/partition element. This means that the segment of the vector in which the kth largest element appears will decrease by one in each recursive call, so that not much progress is made in each single instantiation of the function FindKth.

If T(n) is the time for FindKth to execute for an n-element vector, the recurrence relation in the worst case is:

    T(n) = T(n-1) + O(n)

This is one of the big-five recurrences; its solution is O(n^2), so FindKth in the worst case is an O(n^2) function.
Average-case for FindKth
In the average case we're assuming good things happen. This means that the partition function will divide the vector into two roughly equal pieces each time it's called. Although this seems like a lot to ask for, it approximates what happens well enough to yield the correct complexity for FindKth in the average case. Since only one of the two pieces is searched in each recursive call, the recurrence relation for the average case is:

    T(n) = T(n/2) + O(n)
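This one is not in the big-five table, but it unrolls the same way as before (writing n for the O(n) term and taking n = 2^k):

    T(n) = T(n/2) + n
         = T(n/4) + n/2 + n
         = T(n/8) + n/4 + n/2 + n
         ...
         = T(1) + n/2^(k-1) + ... + n/4 + n/2 + n
         <= T(1) + 2n
         = O(n)

The sum n + n/2 + n/4 + ... is a geometric series bounded by 2n, so in the average case FindKth is an O(n) function.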
A Partition Function
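The sketch below is a standard Lomuto-style partition with the signature assumed in FindKth above; it uses a[first] as the pivot (matching the worst-case discussion), returns the pivot's final index, and runs in O(n) time on an n-element segment.

```cpp
#include <algorithm>   // for swap
#include <vector>
using namespace std;

// Partitions a[first..last] around the pivot a[first]. Afterwards the pivot is
// at the returned index p, with a[first..p-1] <= pivot and a[p+1..last] > pivot.
// Runs in O(n) time for n = last - first + 1.
int Partition(vector<int> & a, int first, int last)
{
    int pivot = a[first];
    int p = first;                    // boundary of the "<= pivot" region
    for (int k = first + 1; k <= last; k++)
    {
        if (a[k] <= pivot)
        {
            p++;
            swap(a[p], a[k]);         // grow the "<= pivot" region
        }
    }
    swap(a[first], a[p]);             // move the pivot into its final spot
    return p;
}
```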
Owen L. Astrachan Last modified: Thu Apr 17 10:50:06 EDT 2003