Sorting a linked list using only recursion is a common interview question and something any CS graduate should be able to do.
One good approach is to start from a known sort and clearly recursive algorithm such as QuickSort. Here’s a description of QuickSort in English: “Given a list of elements, choose one element to be the pivot; partition the list in two, i.e. the lists of elements lesser and greater than the pivot respectively; perform QuickSort on these two lists and concatenate the results.”
Here’s a description in pseudo-code:
function QuickSort elements (list -> list) : if elements is empty then return elements let pivot = elements. let lesser, greater = Partition(elements, fun e -> e <= pivot) return Concat(QuickSort(lesser), [pivot], QuickSort(greater))
Note that the pivot can be chosen randomly, but since a linked list only gives us access to the first element directly, we’ll select the first element as the pivot.
Also note that we’re using a hypothetical Partition function that returns two list, one containing the elements for which the predicate applied to each element returns true and false respectively, and a hypothetical Concat function that concatenates 3 lists together.
From this the F# translation is actually shorter than the pseudocode:
let rec sort ls = match ls with |  -> ls | hd::tl -> let lesser, greater = List.partition ((>=) hd) tl List.append (sort lesser) (hd::(sort greater))
Test run in FSI:
val sort : ls:'a list -> 'a list when 'a : comparison > sort [-1;4;2;7;3];; val it : int list = [-1; 2; 3; 4; 7]
It’s a bit of cheating to be using pre-defined List module functions that do most of the grunt work for us, so let’s implement partition and append (without looking at the F# source code).
partition is a function that takes a list, iterates through it while building two lists that it finally returns. Since it doesn’t take these two lists in parameter, we’ll have to use an inner loop, which in recursive style will make our function look like this:
let partition ls = let rec loop ls ls1 ls2 = // build ls1 and ls2 loop ls  
Here in the inner loop, “ls” hides the outer “ls” and corresponds to the list that “loop” is iterating through; “ls1” and “ls2” are the two lists it is building. So now all we have to do is implement the body of the loop, which should take the first element of the list, add it to either ls1 and ls2 and call itself recursively on the rest of the list until it is empty. Very straightforward:
let partition fn ls = let rec loop ls ls1 ls2 = match ls with | hd::tl when fn hd -> loop tl (hd::ls1) ls2 | hd::tl -> loop tl ls1 (hd::ls2) |  -> ls1, ls2 loop ls  
Test run in FSI, partitioning a list of integers into even and odd lists:
val partition : fn:('a -> bool) -> ls:'a list -> 'a list * 'a list > partition (fun e -> e % 2 = 0) [1;5;3;7;2;6;4;9];; val it : int list * int list = ([4; 6; 2], [9; 7; 3; 5; 1])
It works, however it reverses the order of elements. This happens because of the recursive nature of the algorithm: as we advance through the list we are adding each element at the beginning of the list. Therefore the last element to be added becomes the first element, and the list is reversed. Does this matter to our sorting algorithm? No, because all that matters about the lists returned by partition is that they contain they elements for which the predicate returns true and false respectively, not that these elements are in any particular order. We can now swap List.partition for our version and verify that it still works:
> let rec sort ls = match ls with |  -> ls | hd::tl -> let lesser, greater = partition ((>=) hd) tl List.append (sort lesser) (hd::(sort greater));; val sort : ls:'a list -> 'a list when 'a : comparison > sort [3;2;55;4;7;6];; val it : int list = [2; 3; 4; 6; 7; 55]
Ok, now to write append. This is a function that takes two list and returns one list containing the elements of the first list followed by the elements of the second list, preserving their order. It does essentially the opposite of partition; whereas the former iterated one list to build two lists, append iterates two lists to build one list; but apart from that, the principle is similar. So the function will have a similar form:
let append ls1 ls2 = let rec loop ls ls1 ls2 = // build ls by iterating through ls1 and ls2 loop  ls1 ls2
At this point, implementing it is quite straightforward:
let append ls1 ls2 = let rec loop ls ls1 ls2 = match ls1, ls2 with | hd::tl, _ -> loop (hd::ls) tl ls2 | _, hd::tl -> loop (hd::ls) ls1 tl | _ -> ls loop  ls1 ls2
The idea is this: if there are elements in ls1 (the pattern hd::tl, _ matches), add the first element of ls1 to the list and keep iterating with the rest of ls1 (tl) and ls2. If there are no elements left in ls1 but there still are in ls2 (the first pattern doesn’t match but the second, _, hd::tl, matches), add the first element of ls2 to the list and keep iterating with the rest of ls1 (which should be empty) and ls2 (tl). If none of these pattern matches, then we’re done and we can return the constructed list.
As we test this, however, we notice a problem:
append [1;2;3] [5;6;7];; val it : int list = [7; 6; 5; 3; 2; 1]
It does what it should except the order of elements is reversed; and again, this is due to the recursive nature of the algorithm; as we advance through both lists we are adding the elements we encounter to the beginning of the list; so the last element we encounter is the last to be added, and hence ends up being the first in the constructed list.
At this point it becomes apparent we need a reverse function. F# provides one, List.reverse, but of course we won’t use it and implement it ourselves. It turns out that implementing reverse was the most tricky thing for me to understand and the reason I’m writing this post. It would be possible to write reverse by fetching the nth element of the list for n = [length – 1 to 0], but that would run in O(n^2) which is completely unacceptable. So I started writing some code without really thinking and came up with this:
let rec badReverse ls = match ls with | hd::tl -> hd::(badReverse tl) |  -> ls
But that just returns the original list! At this point I was very confused. In our implementations of partition and append, simply building a list by recursively appending each element to the beginning effectively reversed the list, because as we went through the list, each element was added to the beginning of the list, so the last element encountered became the new beginning of the list and so on. Aren’t we doing the same thing here?
And the answer is yes and no; yes it’s the same iteration mechanism, but instead of appending the element on the argument we pass to the next iteration, we’re appending it on the return value, so effectively we’ve reversed the order of operations. In the previous examples, we were starting with an empty list, and iterating through the original list, adding each element to that new list recursively.
So it appears that’s what we need to do here; start with an empty list and build it by appending the first element of the original list in the argument of the next loop iteration.
let reverse ls = let rec loop ls acc = match ls with | hd::tl -> loop tl (hd::acc) |  -> acc loop ls 
And this works. If we have access to the very general List.fold, we also could just do:
> List.fold (fun list x -> x::list)  [1;2;3];; val it : int list = [3; 2; 1]
Putting it all together, our append function can now return elements in the correct order by reversing its result:
let append ls1 ls2 = let rec loop ls ls1 ls2 = match ls1, ls2 with | hd::tl, _ -> loop (hd::ls) tl ls2 | _, hd::tl -> loop (hd::ls) ls1 tl | _ -> ls loop  ls1 ls2 |> reverse
And now we could go and replace List.append with our own, and have our fully hand-crafted recursion-only algorithm working! But something still feels very wrong about that append function. We’re appending by adding elements in reverse order and then reversing the result. Couldn’t we just, you know, add the elements in the right order in the first place?
Hey, maybe that dumb badReverse code might be useful! It preserves order, does it not? All we need to do is have it iterate two lists instead of one and call it “append” instead:
let rec append ls1 ls2 = match ls1, ls2 with | hd::tl, _ -> hd::(append tl ls2) | _, hd::tl -> hd::(append ls1 tl) | _ -> 
… aaaand this works:
val append : ls1:'a list -> ls2:'a list -> 'a list > append [1;2;3] [4;5;6];; val it : int list = [1; 2; 3; 4; 5; 6]
Aaaah much better. More efficient, and we can get rid of the reverse function. So our full listing becomes:
let partition fn ls = let rec loop ls ls1 ls2 = match ls with | hd::tl when fn hd -> loop tl (hd::ls1) ls2 | hd::tl -> loop tl ls1 (hd::ls2) |  -> ls1, ls2 loop ls   let rec append ls1 ls2 = match ls1, ls2 with | hd::tl, _ -> hd::(append tl ls2) | _, hd::tl -> hd::(append ls1 tl) | _ ->  let rec sort ls = match ls with |  -> ls | hd::tl -> let lesser, greater = partition ((>) hd) tl append (sort lesser) (hd::(sort greater))
Not a single library call in there, doesn’t this feel great? At this point we could even fix partition so that it preserves ordering, but I’ll leave that as an exercise to the reader.
In “fixing” our implementation to avoid calling reverse on a reversed list, however, we’ve lost one very useful property: tail-recursion. Indeed whereas our original append uses an inner tail-recursive loop, our new append performs an operation on the return value (appending an element) and so cannot be tail-recursive; therefore it will blow up the stack with large enough lists. Is there a way we could get both the simplicity and efficiency of our order-preserving append while keeping it tail-recursive?