Imports

CLRS Section 7.3 - Randomized quicksort

This section defines the expected-comparison recurrence for randomized quicksort (CLRS equation (7.4)) and proves its closed-form solution, giving the O(n log n) average-case bound for the first time in CLRS-Lean.

The expected number of comparisons expectedComparisons n = E[T(n)] satisfies:

  • T(0) = 0, T(1) = 0

  • For n >= 1: T(n) = n-1 + (2/n) * sum_{k=0}^{n-1} T(k)

The closed form is T(n) = 2(n+1)H_n - 4n where H_n is the n-th harmonic number. This yields T(n) <= 2n H_n and T(n) <= n^2 (quadratic fallback).

Main results:

  • Lemma harmonic_succ: recurrence for harmonic numbers

  • Lemma harmonic_le_n: H_n <= n

  • Lemma sum_mul_harmonic_eq: sum_{k=1}^{n} k H_k = n(n+1)/2 H_n - n(n-1)/4

  • Lemma sum_expectedComparisons_eq: closed form of sum_{k=0}^{n-1} T(k)

  • Theorem expectedComparisons_closed_form: named CLRS closed-form formula

  • Theorem expectedComparisons_recurrence: closed form satisfies CLRS (7.4)

  • Theorem expectedComparisons_telescope: (n+1)T(n+1) = (n+2)T(n) + 2n

  • Theorem expectedComparisons_clrs_harmonic_bound: T(n) <= 2(n+1)H_n

  • Theorem expectedComparisons_harmonic_bound: T(n) <= 2n H_n

  • Theorem expectedComparisons_quadratic: T(n) <= n^2

  • Theorem expectedComparisons_monotone: T(n) <= T(n+1)

Notation conventions:

  • harmonic n : H_n, the n-th harmonic number in Q

  • expectedComparisons n : T(n), expected number of comparisons for randomized quicksort on n distinct elements

namespace CLRSnamespace Chapter07open Chapter07

Harmonic numbers

The n-th harmonic number as a rational. H_0 = 0, H_{n+1} = H_n + 1/(n+1).

def harmonic : Nat Rat | 0 => 0 | n+1 => harmonic n + 1 / ((n+1 : Nat) : Rat)
@[simp] theorem harmonic_zero : harmonic 0 = 0 := rfl@[simp] theorem harmonic_one : harmonic 1 = 1 :=

Recurrence for harmonic numbers: H_{n+1} = H_n + 1/(n+1).

theorem harmonic_succ (n : Nat) : harmonic (n+1) = harmonic n + (1 : Rat) / ((n+1 : Nat) : Rat) := rfl

Harmonic numbers are nonnegative.

The harmonic number is bounded by its index: H_n <= n for all n.

This trivial bound is enough for many estimates.

Expected comparisons: closed form

Expected number of comparisons in randomized quicksort on n distinct elements, given by the closed-form solution of CLRS recurrence (7.4):

T(n) = 2(n+1)H_n - 4n

where H_n is the n-th harmonic number. This is a computable deterministic rational function; the expectation is folded into the recurrence coefficients, not into a probability space.

def expectedComparisons (n : Nat) : Rat := 2 * ((n : Rat) + 1) * harmonic n - 4 * (n : Rat)

Named CLRS closed form for randomized-quicksort expected comparisons.

theorem expectedComparisons_closed_form (n : Nat) : expectedComparisons n = 2 * ((n : Rat) + 1) * harmonic n - 4 * (n : Rat) := rfl
@[simp] theorem expectedComparisons_zero : expectedComparisons 0 = 0 := @[simp] theorem expectedComparisons_one : expectedComparisons 1 = 0 :=

Explicit formula for expectedComparisons (n+1) in terms of harmonic (n+1).

theorem expectedComparisons_succ (n : Nat) : expectedComparisons (n+1) = 2 * ((n+1 : Rat) + 1) * harmonic (n+1) - 4 * ((n+1 : Rat)) :=

Key combinatorial identity - sum of k times harmonic k

Central combinatorial identity for the expected-quicksort closed form:

sum_{k=1}^{n} k * H_k = (n(n+1)/2) * H_n - n(n-1)/4

This is proved by induction on n using the harmonic recurrence to express H_n in terms of H_{n+1} in the inductive step.

Sum of expected comparisons

Closed form for the sum of expected comparisons up to n-1:

sum_{k=0}^{n-1} T(k) = n(n+1)*H_n - (5 n^2 - n)/2

Recurrence verification

The closed-form expectedComparisons satisfies the CLRS expected-comparison recurrence (7.4): for n >= 1,

T(n) = n-1 + (2/n) * sum_{k=0}^{n-1} T(k).

The proof multiplies through by n and uses the closed form of the sum.

Alternative form of the recurrence, clearing denominators:

(n+1) * T(n+1) = (n+2) * T(n) + 2n for all n >= 0.

This telescoping identity is the key to the closed form and is used in the inductive proofs below.

Expected comparisons: nonnegativity

Expected comparisons are nonnegative.

refine div_nonneg hnum_nonneg ( )

Bounds

Harmonic upper bound. The expected number of comparisons in randomized quicksort is at most 2 n * H_n.

Since H_n = Theta(log n), this gives T(n) = O(n log n).

CLRS-facing harmonic upper bound using the closed-form scale 2(n+1)H_n.

Quadratic upper bound. On any input of length n, the expected number of comparisons is at most n^2.

The proof uses induction with the telescope identity: T(n+1) = ((n+2)T(n) + 2n)/(n+1). The inductive hypothesis T(n) <= n^2 and a simple polynomial inequality n^2 + n + 1 >= 0 close the step.

Monotonicity. The expected comparison count is non-decreasing: T(n) <= T(n+1).

From the telescope identity, T(n+1) - T(n) = (T(n) + 2n)/(n+1) >= 0.

end Chapter07end CLRS