If Google would be looking to hire F# programmers – part 3
We may approach the solution with a variant of quickselect algorithm. Let’s begin with the signature of target function:
findKthSmallest: k:int -> ls:’a list -> ‘a when ‘a: comparison
Instead of brute forcing the solution via the full sort we’ll have recourse to somewhat lighter data partitioning approach that would be gradually shrinking the search space. Let’s pick the head element of our source list as a pivot point with an arbitrary value p and then partition the list into the two (possibly empty) buckets: lt for elements that are smaller, than p, and gt for elements that are greater, leaving out the pivot element itself and possibly other elements with the same value p. Then let’s see where our desired Kth smallest element may end up:
- it may get into bucket lt iff lt.Length >= k and, consequently can be found through applying the same process to lt only, i.e. with findKthSmallest k lt
- iff k > ls.Length – gt.Length this means that k belongs to gt; but as we already placed lt.Length elements into lt and got rid of current pivot(s) the sought element would be (k – (ls.Length – gt.Length))th smallest of gt and in order to find it we may do findKthSmallest (k – ls.Length + gt.Length) gt
- iff both options above are failed this means that the sought element in not either in lt, or in gt, so it only can be pivot p and we do not need any further partitioning.
This wordy algorithm description translates into a quite compact F# code:
let findKthSmallest k ls = let rec splitter k ls = let pivot = List.head ls let (lt,gt) = ((,),ls) ||> List.fold (fun (lt,gt) n -> match (System.Math.Sign(Operators.compare n pivot)) with | -1 -> (n::lt,gt) | 1 -> (lt,n::gt) | _ -> (lt,gt)) let leLength = ls.Length - gt.Length in if k <= lt.Length then splitter k lt elif k > leLength then splitter (k - leLength) gt else pivot splitter k ls // Tests: findKthSmallest 3 [1;2;3;4;-1] //val it : int = 2 findKthSmallest 5 [1;1;1;1;1;1;1;1;1;1] //val it : int = 1 findKthSmallest 4 ['a';'b';'c';'d'] //val it : char = 'd' findKthSmallest 3 [4.5;2.2;3.33;1.5] //val it : float = 3.33
The workhorse is splitter recursive function in line 2; code in lines 2-10 performs partitioning of ls elements into buckets lt & gt in one pass using a custom folder function defined in lines 7-10.
It’s important to understand the role of System.Math.Sign wrapping around generic comparison at line 7; without it line 10 instead of matching with the single possible zero value may bring unpleasantly surprising behavior for some ls element types.
Lines 22-29 demonstrate the solution at work, some corner cases of input, and that it is generic, indeed. And, not to forget mentioning, the worst case complexity of the solution is O(k*n), where n is the length of the original list.