This section defines the expected-comparison recurrence for randomized quicksort
(CLRS equation (7.4)) and proves its closed-form solution, giving the
O(n log n) average-case bound for the first time in CLRS-Lean.
The expected number of comparisons expectedComparisons n = E[T(n)]
satisfies:
T(0) = 0, T(1) = 0
For n >= 1: T(n) = n-1 + (2/n) * sum_{k=0}^{n-1} T(k)
The closed form is T(n) = 2(n+1)H_n - 4n where H_n is the n-th
harmonic number. This yields T(n) <= 2n H_n and T(n) <= n^2
(quadratic fallback).
Main results:
Lemma harmonic_succ: recurrence for harmonic numbers
Lemma harmonic_le_n: H_n <= n
Lemma sum_mul_harmonic_eq: sum_{k=1}^{n} k H_k = n(n+1)/2 H_n - n(n-1)/4
Lemma sum_expectedComparisons_eq: closed form of sum_{k=0}^{n-1} T(k)
Theorem expectedComparisons_closed_form: named CLRS closed-form formula
Theorem expectedComparisons_recurrence: closed form satisfies CLRS (7.4)
Expected number of comparisons in randomized quicksort on n distinct
elements, given by the closed-form solution of CLRS recurrence (7.4):
T(n) = 2(n+1)H_n - 4n
where H_n is the n-th harmonic number. This is a computable
deterministic rational function; the expectation is folded into the recurrence
coefficients, not into a probability space.
The closed-form expectedComparisons satisfies the CLRS expected-comparison
recurrence (7.4): for n >= 1,
T(n) = n-1 + (2/n) * sum_{k=0}^{n-1} T(k).
The proof multiplies through by n and uses the closed form of the sum.
theoremexpectedComparisons_recurrence(n:Nat)(hn:n≥1):expectedComparisonsn=((n:Rat)-1)+(2/(n:Rat))*(∑k∈Finset.rangen,expectedComparisonsk):=byhavehnpos:(n:Rat)≠0:=byintrohzerohave:n=0:=byexact_mod_casthzeroomega-- Clear denominator by multiplying both sides by nfield_simp[hnpos]-- Goal: n * T(n) = n * (n-1) + 2 * S(n)rw[sum_expectedComparisons_eqn]rw[expectedComparisons]ring
Alternative form of the recurrence, clearing denominators:
(n+1) * T(n+1) = (n+2) * T(n) + 2n for all n >= 0.
This telescoping identity is the key to the closed form and is used in the
inductive proofs below.
Quadratic upper bound. On any input of length n, the expected number
of comparisons is at most n^2.
The proof uses induction with the telescope identity:
T(n+1) = ((n+2)T(n) + 2n)/(n+1). The inductive hypothesis
T(n) <= n^2 and a simple polynomial inequality n^2 + n + 1 >= 0
close the step.
theoremexpectedComparisons_quadratic(n:Nat):expectedComparisonsn≤(n:Rat)*(n:Rat):=byinductionnwith|zero=>simp|succnih=>haveht:=expectedComparisons_telescopen-- ht: (n+1)*T(n+1) = (n+2)*T(n) + 2nhavehpos:((n+1:Nat):Rat)≠0:=Nat.cast_ne_zero.mpr(Nat.succ_ne_zeron)-- From ht: T(n+1) = ((n+2)*T(n) + 2n) / (n+1)havehT_succ:expectedComparisons(n+1)=((((n:Rat)+2))*expectedComparisonsn+2*(n:Rat))/((n+1:Nat):Rat):=(eq_div_iff_mul_eqhpos).mpr(bysimpa[mul_comm]usinght)rw[hT_succ]-- Need: ((n+2)*T(n) + 2n) / (n+1) <= (n+1)^2-- First, bound the numerator using ih: T(n) <= n^2havehnum_bound:(((n:Rat)+2))*expectedComparisonsn+2*(n:Rat)≤((n:Rat)+1)*((n:Rat)+1)*((n:Rat)+1):=by-- (n+2)*T(n) + 2n <= (n+2)*n^2 + 2n = n^3 + 2n^2 + 2n-- <= n^3 + 3n^2 + 3n + 1 = (n+1)^3 (since n^2 + n + 1 >= 0)nlinarith-- Apply the division lemma: if a <= b and c > 0, then a/c <= b/crefinele_trans(div_le_div_of_nonneg_righthnum_bound(bypositivity))?_-- Now need: (n+1)^3 / (n+1) <= (n+1)^2-- Since (n+1)^3 / (n+1) = (n+1)^2 exactly, this is equalitypush_casthaveh_eq:((n:Rat)+1)*((n:Rat)+1)*((n:Rat)+1)/((n:Rat)+1)=((n:Rat)+1)*((n:Rat)+1):=byfield_simp[show((n:Rat)+1)≠0frombypositivity]exacth_eq.le
Monotonicity. The expected comparison count is non-decreasing:
T(n) <= T(n+1).
From the telescope identity, T(n+1) - T(n) = (T(n) + 2n)/(n+1) >= 0.