Big O Notation

In the CatMatch problem, we said that Phase II, which compared every pair of cat profiles and took n*(n-1)/2 * 1 time dominated Phase I, which took n * 1000 time. A function f(n) dominates another function g(n) if

lim_{n -> ∞} g(n)/f(n) = 0

This means that as n gets larger and larger, g(n) is smaller and smaller compared to f(n). This is exactly what happened with the time it takes to run Phase II and the time it takes to run Phase I.

We define O(f) = { g | g does not dominate f}.

Note that O(f) is a set of functions. Here is a table of examples:

`f`	`g`	`f` dominates `g`?	`g` dominates `f`?	`f` in `O(g)`	`g` in `O(f)`
1	1	false	false	true	true
1	n	false	true	true	false
n	n	false	false	true	true
n	n**2	false	true	true	false

To be hyper-accurate, we would say that the function f(n) = n is in the set O(n**2) (since n**2 dominates n). Speaking less formally, we say that f(n) = n is O(n**2).

Note that O(n**2) includes functions that are dominated by n**2. If we want to talk about sets of functions, all of which do not dominate each other, there is a second set we use sometimes called Θ(f).

Θ(f) = { g | g does not dominate f, and f does not dominate g }

This is often called "big theta". Big theta is more specific than big O: when we say g is in O(f), we are saying that g does not grow to dominate f. When we say g is in Θ(f), we are saying that g does not grow to dominate f, AND g grows at fast enough to keep up with f.

In general practice, most people just talk about big O and avoid misleading, silly statements like: "n is in O(n**3)". While accurate, this would tend to overstate the growth of n.

Why not always use big theta? The reason is that it is often easy to compute conservative over-estimates of time complexity, which will give us an upper bound on how fast the function grows. It is often more difficult to compute lower-bound estimates to establish that an algorithm does not run too fast.

Basically, the math is harder, so we don't do it.

Some Big O Theorems

Say:
f_1 is in O(f_2),
f_2 is in O(f_3),
THEN f_1 is in O(f_3).

Proof:

Since f_2 is in O(f_3), there exists an M and N such that, for all
n>N, f_2(n)/f_3(n) < M (otherwise the limit would not exist or be
infinity). So:

f_1(n)/f_3(n) = (f_1(n)/f_2(n)) * (f_2(n)/f_3(n)) < M * f_1(n)/f_2(n)

But we know that f_1 is in O(f_2), so the limit of f_1(n)/f_2(n) is
non-infinite. Multiplying this by the constant M results in a limit
which is also finite, so f_1 is in O(f_3).

An immediate corrolary is that if:

If f in O(g) and g in O(f), then O(g) = O(f).

For this reason, we never say things like O(2n**2); instead we simplify and say O(n**2).

We've seen that Phase I of CatMatch is O(n) while Phase II is O(n**2). What is the time complexity of CatMatch overall?

The time complexity of an algorithm is always the worst of the time complexities of its components.

Say:

f_1 in O(g_1),
f_2 in O(g_2),
AND g_1 is in O(g_2).

THEN, if f(n) = f_1(n) + f_2(n), f is in O(g_2).

Proof:

lim_{n -> ∞} (f_1(n) + f_2(n)) / (g_2(n))
    = (lim_{n -> ∞} f_1(n) / g_2(n)) + (lim_{n -> ∞} f_2(n) / g_2(n))

Since f_1 is in O(g_1), and g_1 is in O(g_2), then f_1 must be in
O(g_2), by the transitivity theorem above. Therefore the first limit
is equal to zero.

We know the second limit is finite, since f_2 must not dominate g_2,
or else it would not be in O(g_2).

Therefore the sum of the limits is finite, so f is in O(g_2).

In simple terms: the time complexity of the worst bottleneck in your algorithm is the timecomplexity of the whole algorithm.

Analyzing polynomials

When dealing with polynomial functions of the form:

a_p x**p + a_p-1 x ** (p-1) + ... + a_1 x + a_0

for instance,

n**3 + 2n**2 + 1

then we can just look at the highest power. Consider:

does n**3 dominate 100n**2 + 10n + 900?

We know that n**3 crushes n**2 because

lim n**2/n**3 = lim 1/n = 0

n**3 still crushes 100n**2; the constant factor does not change the rate of growth of the quadratic term. The smaller terms of 100n**2 + 10n + 900 are negligable; if the biggest term is crushed, all the smaller ones will be too.

Run-time

Just because two algorithms belong to the same complexity class doesn't mean they will take the same amount of time to solve the same size of problem. For instance, here's an algorithm to find the smallest number in an array:

def min(nums)
  min = nums.first
  nums.each { |n| min = n if n < min }

  min
end

This does one comparison per element, for n comparisons total. The algorithm is linear in the number of elements. Here's a second algorithm, which finds the two smallest numbers:

def two_min(nums)
  min1, min2 = nums[0], nums[1]

  nums.each do |n|
    if n < min1
      min1, min2 = n, min1
    elsif n < min2
      min2 = n
    end
  end

  return [min1, min2]
end

This does two comparisons for each element in the array, for 2n comparisons total. This is again linear in the number of elements, but we expect it to be about twice as slow.

Remember: time complexity is about dominance and bottlenecks, not mere speed. two_min and min are equally bad as far as being a bottleneck.

Big O Notation

Big O Notation

Some Big O Theorems

Analyzing polynomials

Run-time

results matching ""

No results matching ""