Monthly Archives: July 2015

How to avoid boxing value types in F# equality comparisons

This week I realized, and promptly reported to the Visual Fsharp compiler, that F#’s operator (=) boxes user-defined value types. That’s right, any value type you might want to use from a third-party library, or any value type you define for performance reasons, well, don’t compare them using (=) in F#, currently. Unless performance is of no concern to you, but then why would you use value types in the first place?

type AnyType =
    // whatever

AnyType() = AnyType() // allocates on the GC heap!?!

I don’t want to get into the how the problem happens as that would just be guesswork, I don’t work on the F# compiler. I just know it is there, and based on the feedback so far, it’s not going to be a quick fix.

So what can we do if we want to use value types in F#? Can’t use Equals, we’d need to cast to IEquatable(T) which is in itself a boxing operation, and Object.Equals boxes and casts both operands. One working alternative (thanks latkin) is to define op_Equality and use the new F# 4.0 NonStructuralModule, but that disables structural comparisons altogether. A workaround around that problem is to alias the equality operators in NonStructuralModule themselves, but then one has to remember to use the aliased operators, and I don’t trust anyone to remember anything, starting with myself.

Based on a later suggestions I received (thanks manofstick), this is what I would personally do:

#nowarn "86"

module FastEquals =
    let inline eq<'a when 'a :> System.IEquatable<'a>> (x:'a) (y:'a) = x.Equals y    
    let inline (=) x y = eq x y
    let inline (<>) x y = not (eq x y)
    let inline (=@) x y = Microsoft.FSharp.Core.Operators.(=) x y
    let inline (<>@) x y = Microsoft.FSharp.Core.Operators.(<>) x y

First, we are redefining the (=) operator in terms of IEquatable(T).Equals. This does the correct thing, not just for value types, but for classes, records and discriminated unions. There’s no overhead, as the method compiles to a single x86 instruction for all basic numerics types, and yet we still get structural equality for most F# types. F# complains with warning #86, hence, the #nowarn. Shut up F#, we’re fixing you up.

But of course not all types implement IEquatable(T), and the beauty of this is that it won’t compile in this case! So you can confidently use (=) everywhere, and trust the compiler that it won’t let you misuse it. Idiomatic, correct and fast. That’s what we want. I feel good now.

I’m not entirely sure if it’s a good idea to expose the default F# operator, but just in case, I did it by renaming it “=@”, that should make it obvious. The obvious case for using this operator would be structural collection comparison, but be aware that, again, this boxes all elements if they’re user-defined value types! Again, not behavior we’d like to accidentally run into. I’m considering getting rid of the default (=) altogether and just redefining structural equality manually for these collections as needed, perhaps using overloaded static members.

Unfortunately, you won’t able to prevent third-party F# code from using the default operator, starting with everything in FSharp.Core: List, Array, etc. It then becomes a matter of avoiding the functions that do box value types. For example:

List.contains value // boxes value types
List.exists (fun v -> v = value) // fast if (=) is redefined as suggested above

Of course, as the compiler warning suggests and I’ve been advised on the fsharp compiler thread, this might not be a good thing to do: I’m unsure as to why but since I don’t use F# in anything large-scale yet, I may be missing some insight into unintended consequences of this. You might want to use another operator for non-boxing comparisons, but I really value being able to use (=) idiomatically and not worry about performance pitfalls.

So far, I do think redefining the operator as I’ve suggested is actually more consistent and less surprising than what the default does (which suffers from other bugs and inconsistencies). What do I mean by that?

More consistent: By default, (=) gives you structural equality on everything except classes, where the behavior is reference equality. This is inconsistent and surprising in F#. By redefining (=) as suggested, where the default would silently use reference equality, the override will cause a compilation error: and that’s probably a good thing. If you really intend to use reference equality (which is too often a bogus and lazy way to compare), you can write Object.ReferenceEquals and it becomes obvious.

I’m only now starting to use this approach and I’ll update this post with my findings as I go along. I’m open to suggestions, refutations, let me know what you think!