
Quieting FSI output

This post further develops my suggestion on Stack Overflow of an approach to the following problem: how can the output in the F# Interactive window be limited to just the user's own?

At first I thought that such a mode doesn't make sense, as FSI outputs a gazillion useful bits and pieces, but on further consideration I found it may sometimes be of use.

So, the initial attempt was to somehow use the --quiet setting when starting the FSI session. Unfortunately this is an "all or nothing" option: once set, it suppresses FSI's output to the stdout stream altogether. Thinking along this line I recalled that the stderr stream stays active regardless of quieting stdout, hence using eprintf/eprintfn in place of printf/printfn must do the trick. And it does, indeed. This is where I stopped on Stack Overflow.
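For example, a tiny script that writes through eprintfn (just a sketch of the idea) keeps its output visible even under --quiet:

let a = 1
// stderr is not muted by --quiet, so this line still shows up in the FSI window
eprintfn "The value of a is %d" a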

Further thinking was that keeping two different code versions for quiet and normal FSI modes is an inconvenience, if not too much of a burden. This is where F# shadowing in concert with --use: comes into play. The following FSI script, Quiet.fsx, which basically redirects printf-style output into stderr, does the trick:

#if INTERACTIVE
// Shadow printf/printfn with equivalents that write to stderr, which --quiet leaves untouched
let printf fmt =
    Microsoft.FSharp.Core.Printf.eprintf fmt
let printfn fmt =
    Microsoft.FSharp.Core.Printf.eprintfn fmt
#endif

The only matter left is providing both FSI options on startup. I'll show how to do this for FSI fired up within Visual Studio. For a standalone FSI run the same options go on the command line (for example, something along the lines of fsi.exe --quiet --use:Quiet.fsx myscript.fsx).

Thinking about where to place the Quiet.fsx script, I finally chose my user directory, which has a fixed location relative to FSI's default startup directory; if you prefer another arrangement, adjust accordingly:

  • on Visual Studio menu click Tools -> Options -> F# Tools -> F# Interactive
  • on the Misc tab append --quiet --use:"../../../Quiet.fsx" to the contents of F# Interactive options
  • Now, in normal mode the following script

    let a = 1
    printfn "The value of a is %d" a
    

    produces quite verbose output:

    Microsoft (R) F# Interactive version 11.0.60610.1
    Copyright (c) Microsoft Corporation. All Rights Reserved.

    For help type #help;;

    >
    The value of a is 1

    val a : int = 1
    val it : unit = ()

    >

    In quiet mode the same script’s output will be as terse as:

    The value of a is 1

    Going Artistic with FSharp.Charting

    Since the early days of PostScript and gnuplot, people have tended to demonstrate the coolness of a plotting tool by creating miscellaneous computer art "masterpieces". So why should FSharp.Charting be an exception?

    I'll try to demonstrate the compositional abilities and the richness of styling provided by FSharp.Charting by plotting two still eye-candies, reserving a motion plot for another blog post. The initial ideas for both plots are taken from Jon Harrop's book "Visual F# 2010 for Technical Computing" and this post by Kean Walmsley.

    Running the snippets requires just installing FSharp.Charting via NuGet. The first plot demonstrates chart compositionality:

    #load @"..\packages\FSharp.Charting.0.82\Fsharp.Charting.fsx"
    open FSharp.Charting
    open System.Drawing
    
    let n = 16
    let point i =
        let t = float(i % n) / float n * 2.0 * System.Math.PI in (sin t, cos t)
    
    Chart.Combine(seq { for i in 0..n-1 do for j in i+1..n -> Chart.Line [point i; point j] })
    |> Chart.WithTitle(Text="Artistic FSharp.Charting Sample 1",Color=Color.DarkBlue,Style=ChartTypes.TextStyle.Frame)
    |> Chart.WithYAxis(Enabled=false)
    |> Chart.WithXAxis(Enabled=false)
    

    yielding the following image:
    FSArt1

    The second plot, in addition to compositionality, demonstrates the extensive styling abilities of the library (backgrounds, gradient fill, etc.):

    #load @"..\packages\FSharp.Charting.0.82\Fsharp.Charting.fsx"
    open FSharp.Charting
    open System.Drawing
    open System.Windows.Forms.DataVisualization.Charting
    
    let myBackground = ChartTypes.Background.Gradient (Color.LightSkyBlue,Color.White,GradientStyle.DiagonalRight)
    let myFrame = ChartTypes.Background.Solid Color.Blue
    
    // drawing per se...
    Chart.Combine(
        seq { for x in 0.0..16.0..1024.  do 
                let repers = [x,0.; 512.,x; 512.-x,512.; 0.,512.-x; x,0.] 
                for (x0,y0),(xl,yl) in Seq.pairwise repers -> Chart.Line [(x0,y0);(xl,yl)] })
    // ...the rest is pure styling
    |> Chart.WithTitle(Text="Artistic FSharp.Charting Sample 2",Color=Color.DarkBlue,Style=ChartTypes.TextStyle.Emboss)
    |> Chart.WithXAxis(Enabled=false)
    |> Chart.WithYAxis(Enabled=false)
    |> Chart.WithStyling(AreaBackground=myBackground,Background=myFrame)
    

    Running this script plots the following:
    FSArt2

    Enjoy!

    If Google would be looking to hire F# programmers – part 7

    In this installment of the series we turn to the string edit distance problem. Edit distance measures the number of elementary operations on string characters required to transform one string into another. Of the many possible definitions of edit distance we are going to consider the Levenshtein distance: the minimum number of single-character mutations (insert a character, delete a character, substitute a character) required to transform the source string into the target string.

    The following relations hold for Levenshtein distance d between source string s and target string t:

    • d between two empty strings is, apparently, 0
    • d between an empty and a non-empty string is the length of the non-empty string
    • for two non-empty strings, writing each as a prefix plus its last character, i.e. source = srcPrefix + sc and target = tgtPrefix + tc, d(srcPrefix + sc, tgtPrefix + tc) is the minimum of:
      • d(srcPrefix, tgtPrefix) + (if sc = tc then 0 else 1) (whatever it takes to edit srcPrefix into tgtPrefix, then substitute sc with tc if they differ)
      • d(srcPrefix + sc, tgtPrefix) + 1 (edit srcPrefix + sc into tgtPrefix, then insert tc)
      • d(srcPrefix, tgtPrefix + tc) + 1 (delete sc, then edit srcPrefix into tgtPrefix + tc)
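
    Just for illustration, here is a direct recursive transcription of this recurrence (levenshteinNaive is a throwaway name used only for this sketch):

    // Naive recursive Levenshtein distance - a literal transcription of the recurrence above.
    // Exponential in the string lengths, so suitable only for very short strings.
    let rec levenshteinNaive (s: string) (t: string) =
        match s.Length, t.Length with
        | 0, n -> n
        | m, 0 -> m
        | m, n ->
            let cost = if s.[m - 1] = t.[n - 1] then 0 else 1
            List.min [ levenshteinNaive (s.Substring(0, m - 1)) (t.Substring(0, n - 1)) + cost
                       levenshteinNaive s (t.Substring(0, n - 1)) + 1
                       levenshteinNaive (s.Substring(0, m - 1)) t + 1 ]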

    Although transcribing the recurrence via recursion like this looks tempting, it is not a good idea for anything but very short strings, as the number of recursive calls on substrings grows exponentially.

    So, since the edit distance is a minimum over edit distances of shorter strings, the problem lends itself to the dynamic programming technique. If we represent the Levenshtein distances in a matrix holding the distances between all prefixes of the source and target strings, then we can compute the matrix elements in each row from left to right, moving row by row from top to bottom, and get the distance between the full strings as the last computed value. This translates into the following straightforward code:

    let levenshteinV1 (src : string) (target : string) =
        let min3 a b c = min a (min b c)
     
        let (distance : int[,]) = Array2D.zeroCreate (src.Length + 1) (target.Length + 1)
     
        for i = 1 to src.Length do distance.[i, 0] <- i
        for j = 1 to target.Length do distance.[0, j] <- j
     
        for j = 1 to target.Length do
            for i = 1 to src.Length do
                distance.[i, j] <-
                    min3 (distance.[i-1, j] + 1)
                         (distance.[i, j-1] + 1)
                         (distance.[i-1, j-1] +
                           if src.[i - 1] = target.[j - 1] then 0 else 1)
        distance.[src.Length, target.Length]
    

    Playing with this version gives us:

    levenshteinV1 "food" "money"
    val it : int = 4
    levenshteinV1 "Saturday" "Sunday"
    val it : int = 3
    levenshteinV1 "kitty" "sitting"
    val it : int = 4
    levenshteinV1 "algorithm" "altruistic"
    val it : int = 6

    Exercising it for performance yields the following baseline:

    let longString = String.replicate 1000 "fizzbuzz"
    let longString' = String.replicate 1000 "xxzzyyzz"
    levenshteinV1 longString longString'
    Real: 00:00:01.088, CPU: 00:00:01.093, GC gen0: 0, gen1: 0, gen2: 0
    val it : int = 4000

    Can we do better? Sure! It is easy to spot in the code above that the calculation of each row of distances depends only on the previous row, so we may get rid of the matrix altogether. Throwing in this and a couple of other minor improvements, the refactored version follows:

    let levenshteinV2 (src: string) (target: string) =
        let min3 a b c = min a (min b c)
        let m,n = src.Length, target.Length
        let prev = Array.init (n+1) id
        let nxt = Array.copy prev // start as a copy of prev so the empty-source corner case falls out correctly
        for i in 1..m do
            nxt.[0] <- i
            for j in 1..n do
                if src.[i - 1] = target.[j - 1] then
                    nxt.[j] <- prev.[j - 1]
                else
                    nxt.[j] <- min3 (prev.[j] + 1)
                                    (nxt.[j - 1] + 1)
                                    (prev.[j - 1] + 1)
            Array.blit nxt 0 prev 0 (n+1)
        nxt.[n]
    

    Its time profiling yields:

    let longString = String.replicate 1000 "fizzbuzz"
    let longString' = String.replicate 1000 "xxzzyyzz"
    levenshteinV2 longString longString'
    Real: 00:00:00.283, CPU: 00:00:00.296, GC gen0: 0, gen1: 0, gen2: 0
    val it : int = 4000

    Almost 4x time improvement!

    But how about a purely functional version of the same? It turns out this is doable too:

    let levenshteinV3 (src: string) (target: string) =
        let min3 a b c = min a (min b c)
        let targetAsList = target.ToCharArray() |> Array.toList
        let rec distance run (srcTail: char list) (edits: int list) =
            match srcTail with
            | [] -> edits.Item (edits.Length - 1)
            | h::t -> List.fold2 (fun (newList,prev) c o -> 
                            ((min3 (o + 1) (newList.Head + 1) (if h = c then prev else prev + 1))::newList, o))
                            ([run], edits.Head) targetAsList edits.Tail
                            |> fst |> List.rev |> distance (run + 1) t
        distance 0 (src.ToCharArray() |> Array.toList) [ for i in 0..target.Length -> i]
    

    The workhorse of the solution is the inner recursive function distance, which takes the characters of the source string one by one and calculates a new list of distances from the previous one, starting from the obvious initial list of distances. The accumulator of List.fold2 is the new list of distances accompanied by the element prev of the old distance list, as the min3 recurrence dictates.

    Unfortunately, this functional purity comes with a tremendous performance penalty:


    let longString = String.replicate 1000 "fizzbuzz"
    let longString' = String.replicate 1000 "xxzzyyzz"
    levenshteinV3 longString longString'
    Real: 00:00:04.511, CPU: 00:00:04.500, GC gen0: 1001, gen1: 261, gen2: 0
    val it : int = 4000

    To be continued…

    If Google would be looking to hire F# programmers – part 6

    The next problem to be covered in this series revisits Part 2 with the goal of further improving the performance of search in a 2D array. The saddleback search algorithm suggested there is apparently excessive, as it does not fully exploit the given element order in the 2D search space. The suggested approach is to generalize the binary search algorithm from one dimension to two.

    Let's illustrate the idea behind generalizing 1D binary search to 2D with a drawing:

    Assume we've picked a pivotal element A.[i,j] somewhere in the middle of the search space A, which is an m x n matrix, and it turned out that A.[i,j] > key, where key is the sought value. Since all elements of A below and to the right of A.[i,j] are no smaller than it, the part of the search space colored orange can be safely excluded, and the search should continue within the part colored blue. A similar dissecting consideration can be formulated for the case A.[i,j] < key. The remaining case is A.[i,j] = key, meaning that the sought element has been found.
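
    A rough text rendering of that picture for the A.[i,j] > key case (the "orange" region is the lower-right block):

                      column j
              +------------+------------+
      rows    |    keep    |    keep    |
      < i     |            |            |
              +------------+------------+
      rows    |    keep    |  discard   |   every element here is >= A.[i,j] > key
      >= i    |            |            |
              +------------+------------+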

    OK, let's switch from musing to coding this "divide and conquer" search strategy. The only small decision left is how to represent the irregularly shaped continuation of the search space. It is easy to notice that it is always a union of two rectangles, so we can use a search function that uniformly operates on rectangular areas of A and put them into a stack for processing. Here is the algorithm implementation:

    let binarySearch2D key (a: int[,]) =
        let rec searcher (sl: int[,] list) =
            if sl.IsEmpty then false
            else
                let aa = sl.Head
                if aa.Length = 0 then searcher (sl.Tail)
                else
                    let pivoth, pivotv = (Array2D.length1 aa)/2, (Array2D.length2 aa)/2 
                    if aa.[pivoth, pivotv] > key then
                        searcher (aa.[*,..pivotv-1] :: aa.[..pivoth-1,pivotv..] :: (sl.Tail))
                    elif aa.[pivoth, pivotv] < key then
                        searcher (aa.[pivoth+1..,*] :: aa.[..pivoth,pivotv+1..] :: (sl.Tail))
                    else true
        searcher [a]
    

    Search per se is performed by the inner recursive function searcher, whose argument is a stack (a list) of rectangular Array2D pieces. An empty stack means that the search for key in the initial matrix a has failed. Otherwise we pop the next piece and, if it is not empty, dissect it into four approximately equal parts by picking the pivotal element in its middle. Depending on how the pivot compares to key, the two parts that may still contain key are pushed back onto the stack for continued search. Thanks to wonderful F# array slicing, expressing the required data dissection is a breeze, and searcher always deals with zero-based Array2D pieces.
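
    As a quick illustration of the slicing behavior relied upon here (a small sketch with made-up values), a slice of an Array2D is itself a fresh, zero-based Array2D:

    let sample = Array2D.init 4 4 (fun i j -> 10 * i + j)
    // the top-left 2x3 corner: rows 0..1, columns 0..2 of sample
    let corner = sample.[..1, ..2]
    // corner is re-indexed from zero: corner.[1, 2] = sample.[1, 2] = 12
    printfn "%d x %d, last element %d" (Array2D.length1 corner) (Array2D.length2 corner) corner.[1, 2]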

    Let's wrap up by running some tests and corner cases:


    let a = Array2D.init 10 10 (fun i j -> 3*i + 27*j + j*j)
    val a : int [,] = [[0; 28; 58; 90; 124; 160; 198; 238; 280; 324]
    [3; 31; 61; 93; 127; 163; 201; 241; 283; 327]
    [6; 34; 64; 96; 130; 166; 204; 244; 286; 330]
    [9; 37; 67; 99; 133; 169; 207; 247; 289; 333]
    [12; 40; 70; 102; 136; 172; 210; 250; 292; 336]
    [15; 43; 73; 105; 139; 175; 213; 253; 295; 339]
    [18; 46; 76; 108; 142; 178; 216; 256; 298; 342]
    [21; 49; 79; 111; 145; 181; 219; 259; 301; 345]
    [24; 52; 82; 114; 148; 184; 222; 262; 304; 348]
    [27; 55; 85; 117; 151; 187; 225; 265; 307; 351]]
    binarySearch2D 73 a
    val it : bool = true
    binarySearch2D 149 a
    val it : bool = false
    binarySearch2D -1 a
    val it : bool = false
    binarySearch2D 400 a
    val it : bool = false

    To be continued…

    If Google would be looking to hire F# programmers – part 5

    The next problem to be covered in this series is the maximum subarray problem: the task of finding the contiguous subarray within a one-dimensional array of numbers (containing at least one positive number) which has the largest sum.

    An algorithm of O(n) complexity was first found by Jay Kadane. It scans through the values, computing at each position the maximum subarray ending at this position. This subarray is either empty, with a sum of zero, or extends the maximum subarray ending at the previous position by one element. Expressing this generically in F# is almost trivial with the Array.fold combinator:

    let inline kadanesV1 (xs: 'a []) =
        (LanguagePrimitives.GenericZero, LanguagePrimitives.GenericZero)
        |> Array.fold (fun (maxSum, maxSumEndingHere) x ->
                        let maxSumEndingHere = (max LanguagePrimitives.GenericZero (maxSumEndingHere + x)) in
                        ((max maxSum maxSumEndingHere), maxSumEndingHere))
        <| xs
        |> fst
    

    The folder function maintains the maximum sum ending at the current position and updates the overall maximum sum accordingly. Here are some tests:

    [| 1; -2; 1; -3; 4; -1; 2; 1; -5; 4 |] |> kadanesV1
    val it : int = 6
    [| 1.0; -2.0; 1.0; -3.0; 4.0; -1.0; 2.0; 1.0; -5.0; 4.0 |] |> kadanesV1
    val it : float = 6.0
    [| -1;-2;-3;0 |] |> kadanesV1
    val it : int = 0
    [| -1;-2;-3; |] |> kadanesV1
    val it : int = 0
    [| -3;-2;-1; |] |> kadanesV1
    val it : int = 0

    So far so good, but the results of the last two test cases ask for a further generalization: how can we also correctly handle arrays consisting of only negative numbers? It turns out this requires only small tweaks to the fold seed and to the way maxSumEndingHere is updated:

    let inline kadanesV2 (xs: 'a []) =
        // seed maxSum with the first element; seed the running sum with generic
        // zero so that xs.[0] is not accumulated twice on the first fold step
        (xs.[0], LanguagePrimitives.GenericZero)
        |> Array.fold (fun (maxSum, maxSumEndingHere) x ->
                        let maxSumEndingHere = if maxSumEndingHere < LanguagePrimitives.GenericZero then x else maxSumEndingHere + x in
                        ((max maxSum maxSumEndingHere), maxSumEndingHere))
        <| xs
        |> fst
    

    Now we handle arrays of negative numbers too:

    [| -1;-2;-3; |] |> kadanesV2
    val it : int = -1
    [| -3;-2;-1; |] |> kadanesV2
    val it : int = -1

    But what if we are also required to accompany the found largest sum with the indices of the subarray that produces it? This is where the functional approach seemingly gets into trouble: so far we haven't dealt with any indices whatsoever, right?
    It turns out we can still get away with a folder. However, now we are forced to take care of many more moving parts, including the array element indices. This is where the Array.fold2 combinator comes to help:

    let inline kadanesV3 (xs: 'a []) =
        // as in kadanesV2, the running sum is seeded with generic zero
        // so that xs.[0] is not accumulated twice on the first fold step
        ((xs.[0], 0, 0), (LanguagePrimitives.GenericZero, 0))
        |> Array.fold2 (fun (maxSum, maxSumEndingHere) x i ->
                        let maxSumEndingHere =
                            let maxSumEndingHere, startIdxEndingHere = maxSumEndingHere
                            if maxSumEndingHere < LanguagePrimitives.GenericZero then
                                (x,i)
                            else
                                (maxSumEndingHere + x, startIdxEndingHere) in
                        let maxSum =
                            let maxSum, startIdx, endIdx = maxSum 
                            if maxSum < fst maxSumEndingHere then
                                (fst maxSumEndingHere, snd maxSumEndingHere, i)
                            else
                                (maxSum, startIdx, endIdx) in
                            (maxSum, maxSumEndingHere)
                        )
        <|| (xs,(xs |> Array.mapi (fun i _ -> i)))
        |> fst
    

    Here we fold over a second array consisting of the indices of the first one, so our Array.fold2 folder gets access to the current index through the lambda argument i. Although we are now taking care of a significantly extended state, the folder still looks very similar to the previous versions. However, maxSum is now a 3-element tuple and maxSumEndingHere a 2-element tuple, both carrying indices. Nevertheless, F# let bindings and shadowing allow us to fully localize these details within the folder, where the tuples are disassembled and reassembled as needed.

    Quick tests follow:

    [| 1; -2; 1; -3; 4; -1; 2; 1; -5; 4 |] |> kadanesV3
    val it : int * int * int = (6, 4, 7)
    [| 1.0; -2.0; 1.0; -3.0; 4.0; -1.0; 2.0; 1.0; -5.0; 4.0 |] |> kadanesV3
    val it : float * int * int = (6.0, 4, 7)
    [| -1;-2;-3;0 |] |> kadanesV3
    val it : int * int * int = (0, 3, 3)
    [| -1;-2;-3; |] |> kadanesV3
    val it : int * int * int = (-1, 0, 0)
    [| -3;-2;-1; |] |> kadanesV3
    val it : int * int * int = (-1, 2, 2)

    Folding is a workhorse of functional programming, indeed.

    To be continued…

    If Google would be looking to hire F# programmers – part 4

    As the next installment of the series let's take the following problem from the book: given an arbitrary positive number, find the next higher number consisting of the same digits. If such a number does not exist, return the original number.

    The baseline approach would be splitting the number into a list of digits, making the list of all digit permutations, assembling the digits back into numbers, sorting the list of numbers, and finally picking the element next to the given one. Apparently, the time and space complexity of this "solution" is awful, so let's improve.
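
    Before improving, here is what a direct transcription of this baseline could look like (a sketch only; nextHigherBrute is a made-up name, int input is assumed, and the cost is factorial in the number of digits):

    let nextHigherBrute (x: int) =
        let digits = x.ToString() |> Seq.toList
        // all ways to insert element e into a list
        let rec insertions e = function
            | [] -> [ [e] ]
            | h :: t as l -> (e :: l) :: (insertions e t |> List.map (fun r -> h :: r))
        // all permutations of a list (duplicates included)
        let rec perms = function
            | [] -> [ [] ]
            | h :: t -> perms t |> List.collect (insertions h)
        // numbers assembled from the permuted digits that exceed x; the smallest of
        // them is the answer (equivalent to sorting and taking the next element)
        let candidates =
            perms digits
            |> List.map (fun ds -> System.String(List.toArray ds) |> int)
            |> List.filter (fun n -> n > x)
        if candidates.IsEmpty then x else List.min candidates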

    • The first useful observation towards the optimized solution: a solution exists iff the given number contains a pair of adjacent digits where the left one is strictly less than the right one.
    • The next useful observation: if we scan the list of digits of the given number from right to left with a sliding window of width 2, then the first pair matching the first observation marks the place of change; everything to the left of it (if anything exists) must stay intact.
    • The final useful observation: taking the pair matching the second observation, the sublist to its right, including the right element of the pair, is sorted from right to left; the digit that must substitute the left element of the pair is the minimal greater digit from that sublist; the left element we just substituted should be placed back into the sublist so that its order is preserved.

    Now if we concatenate the (possibly empty) sublist to the left of the changing digit, then the substituting digit, then the reversed sublist after it has accommodated the changing digit, and convert the resulting digit list back to a number, we get the solution with a surprisingly good time complexity of O(n) and space complexity of O(n), where n is the number of digits in the original number. For example, for 1987654321 the changing digit is the leading 1, the substituting digit is 2, and the remaining digits rearranged in ascending order give 2113456789.

    As this problem is of the "puzzle" type, let's add some spice to the solution by showing how F# can be advantageous for coding it. One such feature is making the solution generic, i.e. accepting any integral type as input (byte, BigInteger, nativeint, whatever consists of just digits). Achieving this statically (i.e. relying on the compiler to constrain the right types) requires the inline F# facility. Let's see this implemented in the code below:

    let inline nextHigher number =
    
        let GenericTen = (LanguagePrimitives.GenericOne<'a> <<< 3) +
                         (LanguagePrimitives.GenericOne<'a> <<< 1)
    
        let toDigits n =
            let rec toDigitList digits n =
                if n = LanguagePrimitives.GenericZero then digits
                else toDigitList ((n % GenericTen) :: digits) (n / GenericTen)
            toDigitList [] n
    
        let fromDigits digits =
            let rec fromDigitList n = function
                | [] -> n
                | h::t -> fromDigitList (n * GenericTen + h) t
            fromDigitList LanguagePrimitives.GenericZero digits
    
        let make p ll  =
            ll |> List.rev |> List.partition ((<) p)
            |> fun (x,y) -> (x.Head::y) @ (p::(x.Tail))
    
        let rec scan (changing: 'a list) source =
            match source with
            | [] -> changing
            | h::t -> if h >= changing.Head then
                        scan (h::changing) t
                      else
                        (List.rev t) @ (make h changing)
    
        number |> toDigits |> List.rev
        |> fun x -> scan [(x.Head)] (x.Tail) |> fromDigits
    

    A reader commented on previous posts that, despite prior F# coding experience, the snippets look quite opaque. I'll try to address this by giving extended comments on the code.
    The definition of the main inlined function nextHigher yields a quite impressive list of inferred constraints:


    val inline nextHigher :
    number: ^a -> ^d
    when ( ^a or ^b) : (static member ( % ) : ^a * ^b -> ^a0) and
    ^a : (static member get_Zero : -> ^a) and
    ( ^a or ^b) : (static member ( / ) : ^a * ^b -> ^a) and
    ^a : equality and
    ( ^d or ^b) : (static member ( * ) : ^d * ^b -> ^c) and
    ^a0 : (static member get_One : -> ^a0) and
    ^a0 : (static member ( + ) : ^a0 * ^a0 -> ^b) and
    ^a0 : (static member ( <<< ) : ^a0 * int32 -> ^a0) and
    ^a0 : comparison and
    ( ^c or ^a0) : (static member ( + ) : ^c * ^a0 -> ^d) and
    ^d : (static member get_Zero : -> ^d)

    While the constraints above could be slightly simplified, it is still quite impressive how good F#'s type inference is.

    The variable GenericTen is self-describing: (1 <<< 3) + (1 <<< 1) = 8 + 2 = 10, so it resolves to ten for any primitive numeric type with a static member One and the operators <<< and +.

    A pair of auxiliary functions disassemble a numeric value into its digits (toDigits) and assemble them back (fromDigits). These functions allow the solution to work upon any integral type via static checks.

    The rest of the code implements the algorithm per se. The function make p ll converts the pivot digit p and the already scanned part of the input number into the corresponding part of the final digit order. The function scan searches for the pivot digit in the reversed list of digits, either returning the original digits when the solution does not exist, or producing the solution. Finally, the last two lines form the execution pipeline.

    I want to wrap up this post with some tests and corner cases demonstrating the implementation in action:

    nextHigher 1987654321 // Just working on int
    val it : int = 2113456789
    nextHigher 987654321L // Working on long, solution does not exist
    val it : int64 = 987654321L
    nextHigher 32154321 // Pivot digit is in the middle
    val it : int = 32211345
    nextHigher 12uy // Working on unsigned bytes
    val it : byte = 21uy
    // Working on bigint
    nextHigher 5598734987954054911111111111111I
    val it : System.Numerics.BigInteger =
    5598734987954059111111111111114 {IsEven = true;
    IsOne = false;
    IsPowerOfTwo = false;
    IsZero = false;
    Sign = 1;}
    nextHigher 136442n // It even works with nativeints!!
    val it : nativeint = 142346n

    To be continued after an approximately 2-week break…

    If Google would be looking to hire F# programmers – part 3

    The next problem to be covered in this series is the following: in a list of comparable items, find the Kth smallest item.

    We may approach the solution with a variant of the quickselect algorithm. Let's begin with the signature of the target function:

    findKthSmallest: k:int -> ls:'a list -> 'a when 'a: comparison

    Instead of brute-forcing the solution via a full sort, we'll have recourse to a somewhat lighter data partitioning approach that gradually shrinks the search space. Let's pick the head element of our source list as a pivot with an arbitrary value p and then partition the list into two (possibly empty) buckets: lt for elements that are smaller than p, and gt for elements that are greater, leaving out the pivot element itself and possibly other elements with the same value p. Then let's see where our desired Kth smallest element may end up:

    • it may get into bucket lt iff lt.Length >= k and, consequently, can be found by applying the same process to lt alone, i.e. with findKthSmallest k lt
    • iff k > ls.Length - gt.Length the sought element belongs to gt; but as we already placed lt.Length elements into lt and got rid of the current pivot(s), it would be the (k - (ls.Length - gt.Length))th smallest element of gt, and to find it we may do findKthSmallest (k - ls.Length + gt.Length) gt
    • if both options above fail, the sought element is neither in lt nor in gt, so it can only be the pivot p and no further partitioning is needed.

     
    This wordy algorithm description translates into quite compact F# code:

    let findKthSmallest k ls =
        let rec splitter k ls =
            let pivot = List.head ls
            let (lt,gt) =
                (([],[]),ls)
                ||> List.fold
                    (fun (lt,gt) n -> match (System.Math.Sign(Operators.compare n pivot)) with
                                        | -1 -> (n::lt,gt)
                                        | 1  -> (lt,n::gt)
                                        | _  -> (lt,gt))
                
            let leLength = ls.Length - gt.Length in
                if k <= lt.Length then
                    splitter k lt
                elif k > leLength then
                    splitter (k - leLength) gt
                else
                    pivot
        splitter k ls
    
    // Tests:
    findKthSmallest 3 [1;2;3;4;-1]
    //val it : int = 2
    findKthSmallest 5 [1;1;1;1;1;1;1;1;1;1]
    //val it : int = 1
    findKthSmallest 4 ['a';'b';'c';'d']
    //val it : char = 'd'
    findKthSmallest 3 [4.5;2.2;3.33;1.5]
    //val it : float = 3.33
    

    The workhorse is the recursive function splitter, which partitions the elements of ls into the buckets lt and gt in one pass using a custom folder function.
    It is important to understand the role of System.Math.Sign wrapping the generic comparison: Operators.compare is only guaranteed to return a negative, zero, or positive number, not necessarily -1, 0, or 1, so without normalizing its result the -1 and 1 match cases could be skipped for some ls element types (strings, for instance), with everything unpleasantly falling through to the wildcard case.

    The tests demonstrate the solution at work, cover some corner cases of input, and show that it is generic, indeed. And, not to forget mentioning, the worst case complexity of the solution is O(k*n), where n is the length of the original list.

    To be continued

    If Google would be looking to hire F# programmers – part 2

    Problem #2 from the book reads: "write an efficient algorithm that searches for a given element in a 2D array whose elements are sorted by rows and columns, i.e. for each pair of indices i and j, elem[i,j] <= elem[i+1,j] and elem[i,j] <= elem[i,j+1]".
    This problem is known as saddleback search. It has received a lot of attention in the literature. I am not going to dig too deep and will provide a "good enough" solution that allows for further improvement.

    My solution, similarly to the Nuts and Bolts problem, is based upon a "divide and conquer" approach.

    Let's begin the search from the lower left corner, having the whole 2D array aa of n columns and m rows as the search space. Compare the element aa.[0,m-1] with the pattern x. If they are equal, our mission is accomplished (we will disregard here that multiple occurrences of the pattern may exist). If not and aa.[0,m-1] > x, then we should exclude the last row from the search space, as elements there only grow towards the right, and move the comparison element one row up. This effectively makes the search space narrower by one row. For the scenario aa.[0,m-1] < x similar considerations exclude the leftmost column from the search space, moving the comparison one element to the right, to aa.[1,m-1]. Repeating these actions on the shrinking search space eventually yields either the matched element or the conclusion that a match does not exist.

    The F# implementation below is a quite literal reproduction of this search strategy:

    let saddlebackSearch (aa: 'a[,]) n m x =
        let up (h, v) =
            match v with
            | 0 -> None
            | _ -> Some((h, v - 1))
        let right (h, v) =
            // the first index runs over the n columns, so it is bounded by n rather than m
            match n - 1 - h with
            | 0 -> None
            | _ -> Some((h + 1, v))
        let rec searcher = function
        | None -> None
        | Some(h, v) -> match (Operators.compare aa.[h,v] x) with
                        | 0 -> Some(h, v)
                        | -1 -> searcher (right(h, v))
                        | _ -> searcher (up(h, v))
        searcher (Some((0, m - 1)))
    

    Talking of the effectiveness of the code above, it should be clear that it delivers a conclusion in O(n+m) time in the worst case. Exercising the implementation with the code below

    let aa = Array2D.init  10 10 (fun i j -> 3*i + 27*j + j*j)
    saddlebackSearch aa 10 10 22
    saddlebackSearch aa 10 10 175
    

    delivers the following results:

    val aa : int [,] = [[0; 28; 58; 90; 124; 160; 198; 238; 280; 324]
                        [3; 31; 61; 93; 127; 163; 201; 241; 283; 327]
                        [6; 34; 64; 96; 130; 166; 204; 244; 286; 330]
                        [9; 37; 67; 99; 133; 169; 207; 247; 289; 333]
                        [12; 40; 70; 102; 136; 172; 210; 250; 292; 336]
                        [15; 43; 73; 105; 139; 175; 213; 253; 295; 339]
                        [18; 46; 76; 108; 142; 178; 216; 256; 298; 342]
                        [21; 49; 79; 111; 145; 181; 219; 259; 301; 345]
                        [24; 52; 82; 114; 148; 184; 222; 262; 304; 348]
                        [27; 55; 85; 117; 151; 187; 225; 265; 307; 351]]
    
    val it : (int * int) option = None
     
    val it : (int * int) option = Some (5, 5)
    

    To be continued…

    If Google would be looking to hire F# programmers – part 1

    Recently I came across the book Top 10 coding interview problems asked in Google with solutions: Algorithmic Approach. This is perhaps a good book with solutions given in C++; I must admit I just browsed through the book contents shown by Amazon. Then I thought about an imaginary setting where these coding problems would be given in a functional programming interview. It seemed quite tempting to approach them with F# in hand. This post opens a 10-part series where each post is devoted to an F# solution of a problem from this set.

    The first problem originates from the "old, but gold" book by Gregory J. E. Rawlins, Compared to What?: An Introduction to the Analysis of Algorithms, under the name "Nuts and bolts" (p. 293): We wish to sort a bag of n nuts and n bolts by size in the dark. We can compare the sizes of a nut and a bolt by attempting to screw one into the other. This operation tells us that either the nut is bigger than the bolt, the bolt is bigger than the nut, or they are the same size (and so fit together). Because it is dark we are not allowed to compare nuts directly or bolts directly. Devise how to match all nuts and bolts while minimizing the number of fittings.

    Let’s first establish a solution baseline by using a naïve approach:

    • select a random nut;
    • fit random bolts one by one until finding the match; put the matching pair aside;
    • repeat the process for each of the remaining nuts.

     
    It is easy to notice that the required number of fittings is O(n*n). Can we do better? Sure. Let's modify the approach:

    • as before select a random nut;
    • now partition ALL bolts into three piles: fitting, bigger, and smaller than the pivot nut
    • then, taking a fitting bolt as the pivot (we always have at least one by definition), perform a similar partitioning of ALL nuts
    • at this point we end up with a (possibly empty) pair of nut and bolt piles smaller than the pivot, a pair of nut and bolt piles (having at least one of each) that fit, and (again, possibly empty) a pair of nut and bolt piles bigger than the pivot. This is the AHA! moment: it is easy to notice that fitting the piles of smaller items and/or fitting the piles of bigger items is fully equivalent to the original task, but of smaller size.

     
    Using such a "divide and conquer" approach we can decrease the required number of fittings to only O(n*log(n))! And we've just rediscovered the partitioning step that powers quickselect and the famous quicksort.

    Now let's jump to coding. Interestingly, F# allows us to express the solution very closely to the approach description given above; in other words, we can come up with a kind of micro-DSL for solving this particular task:

    type Size = int
    
    type Bolt = Bolt of Size
    type Nut = Nut of Size
    
    let nutSize = function Nut(size) -> size
    let boltSize = function Bolt(size) -> size
    
    type Bolt with
        member bolt.Size = boltSize bolt
    
    type Nut with
        member nut.Size = nutSize nut
    
    type Bolt with
        member bolt.Fit (nut: Nut) = bolt.Size.CompareTo nut.Size
        member bolt.Fit nuts =
            nuts |> List.partition (bolt.Fit >> (=) 0)
            |> fun (fit, nonfit) -> let (smaller, bigger) = nonfit |> List.partition (bolt.Fit >> (<) 0)
                                    in (fit, smaller, bigger)
    
    type Nut with
        member nut.Fit (bolt: Bolt) = nut.Size.CompareTo bolt.Size
        member nut.Fit bolts =
            bolts |> List.partition (nut.Fit >> (=) 0)
            |> fun (fit, nonfit) -> let (smaller, bigger) = nonfit |> List.partition (nut.Fit >> (<) 0)
                                    in (fit, smaller, bigger)
    
    let rec fit (nuts: Nut list) (bolts: Bolt list) =
        seq { match nuts with
              | [] -> ()
              | _  -> let (boltsFit, boltsSmaller, boltsBigger) = nuts.Head.Fit bolts
                      let (nutsFit, nutsSmaller, nutsBigger) = boltsFit.Head.Fit nuts
                      for (nut,bolt) in (List.zip nutsFit boltsFit) do yield (nut,bolt) 
                      yield! fit nutsSmaller boltsSmaller
                      yield! fit nutsBigger boltsBigger
            }
    
    

    Let’s see how this works by entertaining the following snippet in FSI:

    let amount = 10
    let variety = 8
    let rand = System.Random()
    let pieces = [ for i in 1..amount -> rand.Next(1, variety) ]
    let bagOfNuts = pieces |> List.map Nut
    let bagOfBolts = pieces |> List.map Bolt |> List.rev
    fit bagOfNuts bagOfBolts |> Seq.toList
    

    getting:

    val pieces : int list = [2; 3; 5; 6; 2; 2; 1; 1; 3; 3]
    val bagOfNuts : Nut list =
      [Nut 2; Nut 3; Nut 5; Nut 6; Nut 2; Nut 2; Nut 1; Nut 1; Nut 3; Nut 3]
    val bagOfBolts : Bolt list =
      [Bolt 3; Bolt 3; Bolt 1; Bolt 1; Bolt 2; Bolt 2; Bolt 6; Bolt 5; Bolt 3;
       Bolt 2]
    val it : (Nut * Bolt) list =
      [(Nut 2, Bolt 2); (Nut 2, Bolt 2); (Nut 2, Bolt 2); (Nut 1, Bolt 1);
       (Nut 1, Bolt 1); (Nut 3, Bolt 3); (Nut 3, Bolt 3); (Nut 3, Bolt 3);
       (Nut 5, Bolt 5); (Nut 6, Bolt 6)]
    

    To be continued…

    A Tale of Two Functions

    When covering the matter of mutually recursive functions, the first edition of Programming F# comes up with the following sample on page 55:

    let rec isOdd n = (n = 1) || isEven (n - 1)
    and isEven n = (n = 0) || isOdd (n - 1)
    

    Looks beautiful, but unfortunately does not work: for an even argument isOdd never reaches a base case and recurses past zero forever (isOdd 2 calls isEven 1, which calls isOdd 0, which calls isEven -1, and so on). The second edition, Programming F# 3.0, fixes the problem with the following code on page 60:

    let rec isOdd x =
        if x = 0 then false
        elif x = 1 then true
        else isEven(x - 1)
    and isEven x =
        if x = 0 then true
        elif x = 1 then false
        else isOdd(x - 1)
    

    This snippet works correctly, but it is four times longer than the original and clearly smells. Can it be refactored towards DRY? Sure, for example:

    let rec isOdd = function
    | 0 | 1 as n -> (n = 1)
    | n -> isEven (n - 1)
    and isEven n = isOdd (n - 1)
    

    Good, but still 100% longer than the original, so let's keep trying:

    let rec isOdd n = if n >= 2 then isEven (n - 1) else n = 1
    and isEven n = if n >= 2 then isOdd (n - 1) else n = 0
    

    Cool, this works correctly and is as succinct as the original. But wait a minute: a boolean if-then-else expression can be rewritten into an equivalent expression using the && and || operators:

    let rec isOdd n = n >= 2 && isEven (n - 1) || n = 1
    and isEven n = n >= 2 && isOdd (n - 1) || n = 0
    

    Almost perfect, but what if, out of curiosity, we try something like isOdd 10000000? Oh, no! Although equivalent to the previous snippet, this one is no longer tail-recursive and blows the stack: since && binds tighter than ||, the recursive call sits to the left of ||, so after it returns its result still has to be combined with n = 1.

    OK, let’s convert it back to a tail-recursive equivalent:

    let rec isOdd n = n = 1 || n >= 2 && isEven (n - 1) 
    and isEven n = n = 0 || n >= 2 && isOdd (n - 1)
    

    or even to just another tail-recursive equivalent:

    let rec isOdd n = n < 2 && n = 1 || isEven (n - 1) 
    and isEven n = n < 2 && n = 0 || isOdd (n - 1)
    

    The latest snippet is almost identical to the original, but works perfectly.

    The bottom line: often a boolean expression may perfectly substitute for one built with if-then-else. But watch for subtleties, like inadvertently losing tail recursion.
