How to avoid boxing value types in F# equality comparisons

This week I realized, and promptly reported to the Visual Fsharp compiler, that F#’s operator (=) boxes user-defined value types. That’s right, any value type you might want to use from a third-party library, or any value type you define for performance reasons, well, don’t compare them using (=) in F#, currently. Unless performance is of no concern to you, but then why would you use value types in the first place?

[<Struct>]
type AnyType =
    // whatever

AnyType() = AnyType() // allocates on the GC heap!?!

I don’t want to get into the how the problem happens as that would just be guesswork, I don’t work on the F# compiler. I just know it is there, and based on the feedback so far, it’s not going to be a quick fix.

So what can we do if we want to use value types in F#? Can’t use Equals, we’d need to cast to IEquatable(T) which is in itself a boxing operation, and Object.Equals boxes and casts both operands. One working alternative (thanks latkin) is to define op_Equality and use the new F# 4.0 NonStructuralModule, but that disables structural comparisons altogether. A workaround around that problem is to alias the equality operators in NonStructuralModule themselves, but then one has to remember to use the aliased operators, and I don’t trust anyone to remember anything, starting with myself.

Based on a later suggestions I received (thanks manofstick), this is what I would personally do:

#nowarn "86"

[<AutoOpen>]
module FastEquals =
    let inline eq<'a when 'a :> System.IEquatable<'a>> (x:'a) (y:'a) = x.Equals y    
    let inline (=) x y = eq x y
    let inline (<>) x y = not (eq x y)
    
    let inline (=@) x y = Microsoft.FSharp.Core.Operators.(=) x y
    let inline (<>@) x y = Microsoft.FSharp.Core.Operators.(<>) x y

First, we are redefining the (=) operator in terms of IEquatable(T).Equals. This does the correct thing, not just for value types, but for classes, records and discriminated unions. There’s no overhead, as the method compiles to a single x86 instruction for all basic numerics types, and yet we still get structural equality for most F# types. F# complains with warning #86, hence, the #nowarn. Shut up F#, we’re fixing you up.

But of course not all types implement IEquatable(T), and the beauty of this is that it won’t compile in this case! So you can confidently use (=) everywhere, and trust the compiler that it won’t let you misuse it. Idiomatic, correct and fast. That’s what we want. I feel good now.

I’m not entirely sure if it’s a good idea to expose the default F# operator, but just in case, I did it by renaming it “=@”, that should make it obvious. The obvious case for using this operator would be structural collection comparison, but be aware that, again, this boxes all elements if they’re user-defined value types! Again, not behavior we’d like to accidentally run into. I’m considering getting rid of the default (=) altogether and just redefining structural equality manually for these collections as needed, perhaps using overloaded static members.

Unfortunately, you won’t able to prevent third-party F# code from using the default operator, starting with everything in FSharp.Core: List, Array, etc. It then becomes a matter of avoiding the functions that do box value types. For example:

List.contains value // boxes value types
List.exists (fun v -> v = value) // fast if (=) is redefined as suggested above

Of course, as the compiler warning suggests and I’ve been advised on the fsharp compiler thread, this might not be a good thing to do: I’m unsure as to why but since I don’t use F# in anything large-scale yet, I may be missing some insight into unintended consequences of this. You might want to use another operator for non-boxing comparisons, but I really value being able to use (=) idiomatically and not worry about performance pitfalls.

So far, I do think redefining the operator as I’ve suggested is actually more consistent and less surprising than what the default does (which suffers from other bugs and inconsistencies). What do I mean by that?

More consistent: By default, (=) gives you structural equality on everything except classes, where the behavior is reference equality. This is inconsistent and surprising in F#. By redefining (=) as suggested, where the default would silently use reference equality, the override will cause a compilation error: and that’s probably a good thing. If you really intend to use reference equality (which is too often a bogus and lazy way to compare), you can write Object.ReferenceEquals and it becomes obvious.

I’m only now starting to use this approach and I’ll update this post with my findings as I go along. I’m open to suggestions, refutations, let me know what you think!

Advertisements

4 thoughts on “How to avoid boxing value types in F# equality comparisons

  1. Pingback: F# Weekly #29, 2015 | Sergey Tihon's Blog

  2. Sami Perttu

    Thanks for your post. I tried your suggestion in my game development codebase. I think it is better than the alternatives, although for me (in VS 2015) it does not work with enums or discriminated unions. I wonder if there is any generic way of getting it working with those…?

    Reply
    1. Zeckul Post author

      Thanks for your comment. I’d be interested to know how this approach affects your garbage collections and general performance!

      I’m not sure what problem you’re having with DUs and enums. After defining the above module, this works for me:

      type DU = AnInt of int | AFloat of float | Other
      AnInt(3) = AFloat(3.0);;
      val it : bool = false
      > AnInt(3) = AnInt(4);;
      val it : bool = false
      > AnInt(3) = AnInt(3);;
      val it : bool = true

      Reply
      1. Sami Perttu

        Well, this is weird. It turns out DU equality was failing because of recursive definitions, for instance:

        type DU = AnInt of int | AFloat of float | Other
        and R =
        { x : int }
        member this.isFive = AnInt(5) = AnInt(this.x)

        Here the compiler complains that DU does not implement IEquatable but if I replace “and” with “type” it works fine.

        For me, the effect on performance was minimal because I was already using a custom operator in performance sensitive code. This way of defining equality means I can get rid of the custom operator, which is the real benefit.

        Enums still do not implement IEquatable. But I can live with that.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s