Imports

CLRS Section 9.3 - Deterministic selection

This file factors the Chapter 9 selection proof through a pivot-parametric interface. The key point is pure correctness rather than running time: any pivot rule that returns an element of the current input list yields a selector whose successful result satisfies the same count-based rank certificate used in Section 9.2.

Main results:

  • Theorem selectWithPivot?_correct: a pivot-parametric SELECT is rank correct whenever the pivot rule returns members of the current list.

  • Theorem medianOfFive?_certificate: the median selected from a five-element group has at least three elements below it weakly and at least three elements above it weakly.

  • Theorems medianGroupCertificates_leCount_lower_bound and medianGroupCertificates_geCount_lower_bound: a collection of certified five-element groups contributes three original elements for every group median on the corresponding side of a pivot.

  • Theorem fullGroupsOfFive_medianPivot_split_counts: the executable full-grouping wrapper constructs the certificates and obtains the split counts for a median of the group medians.

  • Theorem fullGroupsOfFive_medianPivot_fullInput_split_counts: the grouped split counts lift to the original input list because the flattened full groups are a sublist of the input.

  • Theorem fullGroupsOfFive_medianPivot_partition_size_bound: both strict recursive branches around the pivot satisfy the familiar 7n/10 + O(1) CLRS size bound.

  • Theorem selectRecurrence_linear_step: a pure Nat substitution step for closing a linear envelope from one one-fifth median subproblem, one 7n/10 + O(1) strict branch, and local-work slack.

  • Theorem medianOfMediansPivot?_recursive_branch_size_bound: the proved count bound specialized to the actual filtered recursive branch lists.

  • Theorems medianOfMediansPivot?_low_branch_linear_work_step and medianOfMediansPivot?_high_branch_linear_work_step: executable branch wrappers that connect the median-of-medians pivot bound to the linear-work recurrence step.

  • Theorem deterministicSelect?_correct: a deterministic median-pivot instance is rank correct.

  • Theorem selectRecurrence_linear_induction: a threshold-parametric strong induction that lifts selectRecurrence_linear_step to the full recursion tree for any cost function satisfying the CLRS subproblem-size bounds.

  • Theorem medianOfMedians_linear_bound: a concrete instantiation with the standard median-of-medians branch sizes n/5 and (7n+12)/10, proving linear cost whenever the per-element work coefficient is bounded relative to the overall constant.

  • Theorem clrsSelectRecurrence_linear_bound: a CLRS-facing name for the same linear-time SELECT recurrence closure.

Current gaps:

  • The recurrence induction above closes the substitution-method half, but this is still not the full CLRS linear-time median-of-medians cost theorem. The remaining step is a concrete cost semantics (operational or denotational) for medianOfMediansSelect? that maps list input to a Nat cost, together with proofs that the concrete cost satisfies the hypotheses of medianOfMedians_linear_bound.

namespace CLRSnamespace Chapter09

Pivot-parametric selection

A pivot function is membership-safe when every pivot it returns is an element of the current input list.

def PivotMembership (choosePivot? : List Nat Option Nat) : Prop := {xs : List Nat} {pivot : Nat}, choosePivot? xs = some pivot pivot xs

Fuelled SELECT with an abstract deterministic pivot rule.

The algorithm mirrors the CLRS three-way partition around the chosen pivot: recurse on elements below the pivot, return the pivot block when the requested rank falls inside it, or recurse on elements above the pivot after shifting the rank by the number of elements at most the pivot.

def selectWithPivotFuel? (choosePivot? : List Nat Option Nat) : Nat Nat List Nat Option Nat | 0, _, _ => none | fuel + 1, k, xs => match choosePivot? xs with | none => none | some pivot => if k < ltCount pivot xs then selectWithPivotFuel? choosePivot? fuel k (xs.filter fun y => decide (y < pivot)) else if k < leCount pivot xs then some pivot else selectWithPivotFuel? choosePivot? fuel (k - leCount pivot xs) (xs.filter fun y => decide (pivot < y))

Public SELECT wrapper using one unit of fuel per input element.

def selectWithPivot? (choosePivot? : List Nat Option Nat) (k : Nat) (xs : List Nat) : Option Nat := selectWithPivotFuel? choosePivot? xs.length k xs

Correctness of the fuelled pivot-parametric SELECT.

If the pivot function is membership-safe and the computation returns x, then x is a valid zero-based order statistic certificate for the original input.

Rank-correctness theorem for the public pivot-parametric SELECT wrapper.

theorem selectWithPivot?_rankCorrect (choosePivot? : List Nat Option Nat) (hpivot : PivotMembership choosePivot?) {k : Nat} {xs : List Nat} {x : Nat} (hsel : selectWithPivot? choosePivot? k xs = some x) : RankCertificate xs k x :=

Membership projection for pivot-parametric SELECT.

theorem selectWithPivot?_mem (choosePivot? : List Nat Option Nat) (hpivot : PivotMembership choosePivot?) {k : Nat} {xs : List Nat} {x : Nat} (hsel : selectWithPivot? choosePivot? k xs = some x) : x xs := (selectWithPivot?_rankCorrect choosePivot? hpivot hsel).1

Reader-facing correctness wrapper for pivot-parametric SELECT.

theorem selectWithPivot?_correct (choosePivot? : List Nat Option Nat) (hpivot : PivotMembership choosePivot?) {k : Nat} {xs : List Nat} {x : Nat} (hsel : selectWithPivot? choosePivot? k xs = some x) : RankCertificate xs k x := selectWithPivot?_rankCorrect choosePivot? hpivot hsel

Five-element median certificate

Correctness-oriented median selector for a five-element group.

def medianOfFive? (xs : List Nat) : Option Nat := selectByRank? 2 xs

Local certificate used by the CLRS median-of-medians split argument.

For a five-element group, the selected median is an input member, at least three group elements are at most it, and at least three group elements are at least it.

def MedianFiveCertificate (xs : List Nat) (median : Nat) : Prop := xs.length = 5 median xs 3 leCount median xs 3 geCount median xs

The rank-2 selector on a five-element group supplies the local 3/3 median certificate needed by the deterministic SELECT split-size proof.

The five-element median selector succeeds on any five-element input.

theorem medianOfFive?_isSome_of_length_eq_five {xs : List Nat} (hlen : xs.length = 5) : median, medianOfFive? xs = some median := exact selectByRank?_isSome_of_lt ( )

Certificates pairing each full five-element group with its selected median.

The theorem layer below intentionally does not require the groups to be computed by a particular chunking function. That keeps the split-size argument usable for both executable median-of-medians code and later lower-level array refinements.

def MedianGroupCertificates (groups : List (List Nat)) (medians : List Nat) : Prop := groups.length = medians.length {group : List Nat} {median : Nat}, (group, median) groups.zip medians MedianFiveCertificate group median

Executable five-element grouping

Fuelled grouping into full five-element blocks, dropping any trailing block of fewer than five elements.

The fuel is only a termination device; the public wrapper below uses xs.length, so the function is executable while keeping the proof obligations straightforward.

def fullGroupsOfFiveFuel : Nat List Nat List (List Nat) | 0, _ => [] | fuel + 1, a :: b :: c :: d :: e :: rest => [a, b, c, d, e] :: fullGroupsOfFiveFuel fuel rest | _ + 1, _ => []

Executable full five-element grouping used by the median-of-medians layer.

def fullGroupsOfFive (xs : List Nat) : List (List Nat) := fullGroupsOfFiveFuel xs.length xs
theorem fullGroupsOfFiveFuel_lengths {fuel : Nat} : {xs group : List Nat}, group fullGroupsOfFiveFuel fuel xs group.length = 5 := induction fuel with cases xs with cases xs with cases xs with cases xs with cases xs with

Every executable full group has length five.

theorem fullGroupsOfFive_lengths {xs group : List Nat} (hmem : group fullGroupsOfFive xs) : group.length = 5 := fullGroupsOfFiveFuel_lengths hmem
theorem fullGroupsOfFiveFuel_length_mul_five_le {fuel : Nat} : xs : List Nat, 5 * (fullGroupsOfFiveFuel fuel xs).length xs.length := induction fuel with cases xs with cases xs with cases xs with cases xs with cases xs with theorem fullGroupsOfFive_length_mul_five_le (xs : List Nat) : 5 * (fullGroupsOfFive xs).length xs.length := fullGroupsOfFiveFuel_length_mul_five_le xs theorem fullGroupsOfFive_length_near (xs : List Nat) : xs.length 5 * (fullGroupsOfFive xs).length + 4 := fullGroupsOfFiveFuel_length_near (Nat.le_refl xs.length)theorem fullGroupsOfFiveFuel_flatten_sublist {fuel : Nat} : xs : List Nat, (List.flatten (fullGroupsOfFiveFuel fuel xs)).Sublist xs := induction fuel with cases xs with cases xs with cases xs with cases xs with cases xs with theorem fullGroupsOfFive_flatten_sublist (xs : List Nat) : (List.flatten (fullGroupsOfFive xs)).Sublist xs := fullGroupsOfFiveFuel_flatten_sublist xs

Map the five-element median selector across a list of groups, failing if any group is not a valid five-element median input.

def medianOfFiveGroups? : List (List Nat) Option (List Nat) | [] => some [] | group :: groups => match medianOfFive? group, medianOfFiveGroups? groups with | some median, some medians => some (median :: medians) | _, _ => none

If every group has length five, then the executable median-map produces exactly the certificate package required by the grouped split-count theorem.

refine , ?_

Every median returned by the executable median-map comes from the flattened input groups.

The executable median-map succeeds when every group has length five.

exact median :: medians,

The executable full-grouping plus median-map automatically constructs the abstract grouped certificate layer.

theorem fullGroupsOfFive_medianGroupCertificates {xs medians : List Nat} (hsel : medianOfFiveGroups? (fullGroupsOfFive xs) = some medians) : MedianGroupCertificates (fullGroupsOfFive xs) medians := medianOfFiveGroups?_certificates (fun _ hmem => fullGroupsOfFive_lengths hmem) hsel

The executable median-map always succeeds on the executable full groups.

theorem fullGroupsOfFive_medianOfFiveGroups?_isSome (xs : List Nat) : medians, medianOfFiveGroups? (fullGroupsOfFive xs) = some medians := medianOfFiveGroups?_isSome_of_all_lengths (fun _ hmem => fullGroupsOfFive_lengths hmem)

Every certified group whose median is at most pivot contributes at least three original group elements at most pivot.

Every certified group whose median is at least pivot contributes at least three original group elements at least pivot.

If pivot has rank certificate k among the group medians, then the original grouped values have at least 3 * (k + 1) elements at most the pivot and at least 3 * (medians.length - k) elements at least the pivot.

This is the reusable counting core of the CLRS median-of-medians split-size argument; the executable wrappers below convert it to the familiar 7n/10 + O(1) branch-size bound.

Executable-grouping version of the median-of-medians split-count core.

theorem fullGroupsOfFive_selectPivot_split_counts {xs medians : List Nat} {pivot k : Nat} (hmedians : medianOfFiveGroups? (fullGroupsOfFive xs) = some medians) (hrank : RankCertificate medians k pivot) : 3 * (k + 1) leCount pivot (List.flatten (fullGroupsOfFive xs)) 3 * (medians.length - k) geCount pivot (List.flatten (fullGroupsOfFive xs)) := medianGroupCertificates_selectPivot_split_counts (fullGroupsOfFive_medianGroupCertificates hmedians) hrank

When the pivot is selected as the median of the executable group medians, the flattened full groups inherit the standard three-per-median split counts.

theorem fullGroupsOfFive_medianPivot_split_counts {xs medians : List Nat} {pivot : Nat} (hmedians : medianOfFiveGroups? (fullGroupsOfFive xs) = some medians) (hpivot : selectByRank? (medians.length / 2) medians = some pivot) : 3 * (medians.length / 2 + 1) leCount pivot (List.flatten (fullGroupsOfFive xs)) 3 * (medians.length - medians.length / 2) geCount pivot (List.flatten (fullGroupsOfFive xs)) :=

Full-input version of the executable median-of-medians split-count theorem.

The counts first proved on the flattened full groups lift to the original input because that flattening is a sublist of the input. The partition-size wrapper below packages these count lower bounds with the group-count arithmetic above.

theorem fullGroupsOfFive_medianPivot_fullInput_split_counts {xs medians : List Nat} {pivot : Nat} (hmedians : medianOfFiveGroups? (fullGroupsOfFive xs) = some medians) (hpivot : selectByRank? (medians.length / 2) medians = some pivot) : 3 * (medians.length / 2 + 1) leCount pivot xs 3 * (medians.length - medians.length / 2) geCount pivot xs :=

The strict recursive branches around a median-of-medians pivot are bounded by the input length minus the certified opposite-side mass.

CLRS-style partition-size packaging for executable median-of-medians grouping.

Both strict recursive branches have size at most 7n/10 + O(1), stated without division as 10 * branchSize ≤ 7 * n + 12.

Deterministic median-pivot instance

Deterministic pivot rule that chooses the median of the current list according to the specification selector.

This is a correctness-oriented pivot rule. It deliberately separates the rank proof from the harder CLRS median-of-medians running-time argument.

def deterministicPivot? (xs : List Nat) : Option Nat := selectByRank? (xs.length / 2) xs

The deterministic median-pivot rule returns only members of its input.

theorem deterministicPivot?_mem : PivotMembership deterministicPivot? := exact selectByRank?_mem ( )

Deterministic SELECT using the specification median as its pivot rule.

def deterministicSelect? (k : Nat) (xs : List Nat) : Option Nat := selectWithPivot? deterministicPivot? k xs

Rank-correctness theorem for deterministic median-pivot SELECT.

theorem deterministicSelect?_rankCorrect {k : Nat} {xs : List Nat} {x : Nat} (hsel : deterministicSelect? k xs = some x) : RankCertificate xs k x := exact selectWithPivot?_rankCorrect deterministicPivot? deterministicPivot?_mem ( )

Membership projection for deterministic median-pivot SELECT.

theorem deterministicSelect?_mem {k : Nat} {xs : List Nat} {x : Nat} (hsel : deterministicSelect? k xs = some x) : x xs := (deterministicSelect?_rankCorrect hsel).1

Reader-facing correctness wrapper for deterministic median-pivot SELECT.

theorem deterministicSelect?_correct {k : Nat} {xs : List Nat} {x : Nat} (hsel : deterministicSelect? k xs = some x) : RankCertificate xs k x := deterministicSelect?_rankCorrect hsel

Median-of-medians pivot instance

CLRS-style median-of-medians pivot rule.

For inputs with at least one full five-element group, this chooses the median of the executable group medians. For shorter inputs, it falls back to the specification median pivot so that the pivot-parametric SELECT wrapper remains usable on every nonempty input.

def medianOfMediansPivot? (xs : List Nat) : Option Nat := match medianOfFiveGroups? (fullGroupsOfFive xs) with | some (median :: medians) => selectByRank? ((median :: medians).length / 2) (median :: medians) | _ => deterministicPivot? xs

Every median-of-medians pivot returned by the wrapper belongs to the input.

Any pivot returned by the median-of-medians pivot rule satisfies the proved CLRS branch-size bound. The fallback branch can only occur when there are no full five-element groups, hence the input has length at most four.

exact fullGroupsOfFive_medianPivot_partition_size_bound (xs := xs) (medians := median :: medians) (pivot := pivot) hgroups ( )

Recurrence-size wrappers

Pure arithmetic substitution step for the median-of-medians recurrence.

If the median subproblem has size at most one fifth of the input, the selected strict branch satisfies the proved 7n/10 + O(1) bound, and the local work plus the additive split slack fits in the remaining tenth, then the two recursive calls plus local work are bounded by the same linear envelope.

exact Nat.le_of_mul_le_mul_left htotal ( : 0 < 10)

The actual strict recursive SELECT branch lists have the CLRS bound.

theorem medianOfMediansPivot?_recursive_branch_size_bound {xs : List Nat} {pivot : Nat} (hsel : medianOfMediansPivot? xs = some pivot) : 10 * (xs.filter fun y => decide (y < pivot)).length 7 * xs.length + 12 10 * (xs.filter fun y => decide (pivot < y)).length 7 * xs.length + 12 :=

Linear-work recurrence step for the low recursive branch of median-of-medians SELECT.

Linear-work recurrence step for the high recursive branch of median-of-medians SELECT.

Full recurrence induction

Recurrence induction for median-of-medians cost.

Given a cost function T : Nat → Nat, subproblem-size functions g (median subproblem) and h (strict recursive branch), a local-work function f, and a base threshold t, this theorem proves T n ≤ C * n for all n whenever the following hold for all n ≥ t:

  1. The subproblem sizes satisfy the CLRS partition bounds: 5 * g n ≤ n and 10 * h n ≤ 7 * n + 12;

  2. The local work is small enough: 10 * f n + 12 * C ≤ C * n;

  3. T satisfies the one-level recurrence T n ≤ T (g n) + T (h n) + f n;

  4. Base cases n < t respect the linear bound: T n ≤ C * n.

The proof chains selectRecurrence_linear_step through strong induction to lift the single-level substitution to the full recursion tree.

Corresponds to the substitution-method closure in CLRS Section 9.3.

calc T n T (g n) + T (h n) + f n := hstep _ C * (g n) + C * (h n) + f n := htotal _ C * n := hsubst

Concrete linear bound for the median-of-medians recurrence.

Corollary of selectRecurrence_linear_induction with the standard CLRS subproblem sizes: the median subproblem is ⌊n/5⌋, the strict branch is ⌊(7n+12)/10⌋, and the local work is bounded by a*n where 20*a ≤ C (so that the one-level algebraic slack closes).

The base threshold is 50, which is large enough to absorb the additive constants from the partition bound.

exact selectRecurrence_linear_induction ( ) f g h hmedian_size hstrict_size hlocal_work T hT_step_wrapped hT_base

CLRS-facing linear-time SELECT recurrence theorem.

If a cost function satisfies the standard median-of-medians recurrence with subproblem sizes ⌊n/5⌋ and ⌊(7n+12)/10⌋, and if the base cases respect the same linear envelope, then the cost is globally linear.

This theorem is intentionally still an abstract recurrence wrapper: connecting the executable medianOfMediansSelect? implementation to a concrete cost function remains a separate refinement target.

theorem clrsSelectRecurrence_linear_bound {C a : Nat} (hCpos : 0 < C) (ha_bound : 20 * a C) (T : Nat Nat) (hT_step : n, 50 n T n T (n / 5) + T ((7 * n + 12) / 10) + a * n) (hT_base : n, n < 50 T n C * n) : n, T n C * n := medianOfMedians_linear_bound hCpos ha_bound T hT_step hT_base

SELECT specialized to the executable median-of-medians pivot rule.

def medianOfMediansSelect? (k : Nat) (xs : List Nat) : Option Nat := selectWithPivot? medianOfMediansPivot? k xs

Rank-correctness theorem for median-of-medians SELECT.

theorem medianOfMediansSelect?_rankCorrect {k : Nat} {xs : List Nat} {x : Nat} (hsel : medianOfMediansSelect? k xs = some x) : RankCertificate xs k x := exact selectWithPivot?_rankCorrect medianOfMediansPivot? medianOfMediansPivot?_mem ( )

Membership projection for median-of-medians SELECT.

theorem medianOfMediansSelect?_mem {k : Nat} {xs : List Nat} {x : Nat} (hsel : medianOfMediansSelect? k xs = some x) : x xs := (medianOfMediansSelect?_rankCorrect hsel).1

Reader-facing correctness wrapper for median-of-medians SELECT.

theorem medianOfMediansSelect?_correct {k : Nat} {xs : List Nat} {x : Nat} (hsel : medianOfMediansSelect? k xs = some x) : RankCertificate xs k x := medianOfMediansSelect?_rankCorrect hsel
end Chapter09end CLRS