In this post I will discuss monomorphization: what it is, why it’s useful, and what problems can potentially arise from it, with a particular focus on the problem of monomorphization bloat. I will also look at options for dealing with monomorphization bloat in your codebase.
Monomorphization is a compilation strategy to allow polymorphism with static dispatch. To explain how it works, we’ll look at the following function:
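A minimal sketch of such a function follows. Only the name `do_the_thing` and the type parameter `T` come from the text; the `Display` bound, the return type, and the body are assumptions chosen to keep the example small.

```rust
use std::fmt::Display;

// Hypothetical body: the `Display` bound and the formatting are
// illustrative assumptions, not the original listing.
fn do_the_thing<T: Display>(value: T) -> String {
    // Works for any `T` that knows how to print itself.
    format!("the thing: {}", value)
}
```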
As you can see, this function takes in some type `T`. This means we can call the function with any type, like so:
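As a sketch, here is what calling a generic `do_the_thing` with three types might look like. The text only confirms `&str` as one of the types; `i32` and `f64` are assumptions, as are the `Display` bound and the wrapper function `call_with_three_types`.

```rust
use std::fmt::Display;

// Hypothetical generic function (bound and body are assumptions).
fn do_the_thing<T: Display>(value: T) -> String {
    format!("the thing: {}", value)
}

// Hypothetical call site: each call picks a different concrete `T`.
fn call_with_three_types() -> Vec<String> {
    vec![
        do_the_thing(42),      // T = i32 (assumed)
        do_the_thing(2.5),     // T = f64 (assumed)
        do_the_thing("hello"), // T = &str
    ]
}
```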
At compile time, the compiler sees that `do_the_thing` is called with three concrete types, `&str` among them. So it creates three different functions based on `do_the_thing`, replacing the general `T` type with each of the three concrete types the function is actually called with. In the end, you get something like this:
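Roughly speaking, the result looks like three hand-written functions. The sketch below assumes the three types are `i32`, `f64`, and `&str` (only `&str` is confirmed by the text) and uses made-up names; the compiler actually emits mangled symbols rather than readable names like these.

```rust
// Hypothetical names for illustration only: one copy of the body
// per concrete type the generic function was called with.
fn do_the_thing_i32(value: i32) -> String {
    format!("the thing: {}", value)
}

fn do_the_thing_f64(value: f64) -> String {
    format!("the thing: {}", value)
}

fn do_the_thing_str(value: &str) -> String {
    format!("the thing: {}", value)
}
```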
This is monomorphization! The generated code is equivalent to you having written three different functions, but you didn’t have to actually write three functions! This is really cool, as writing the general version is a lot less work than writing a version for every concrete type you’ll use, and a lot more flexible too.
The downside is that, by generating these functions, you potentially add bloat to the resulting binary. If you have a function with size n, and it's called with m concrete types, it'll take up roughly n×m in the resulting binary! In some contexts, a large binary size can be a problem.
Thankfully, there are several options to deal with this. In my last post, I talked about a small conversion trick using `Into`. This is a fairly common trick in the Rust world, and, if you're not careful, it can cause the monomorphization bloat problem we're discussing here. Imagine you have something like the following:
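A sketch of the kind of function in question, assuming an `Into<String>` bound. The name `greet` and its body are hypothetical; a `ToString` bound with `.to_string()` would behave the same way for this purpose.

```rust
// Hypothetical example: the whole body is generic over `S`,
// so all of it is copied for every concrete `S` used by callers.
fn greet<S: Into<String>>(name: S) -> String {
    // The only line that actually depends on `S`.
    let name: String = name.into();

    // Everything below works purely on `String`, yet it is still
    // duplicated in each monomorphized copy of `greet`.
    let trimmed = name.trim();
    let mut out = String::with_capacity(trimmed.len() + 8);
    out.push_str("Hello, ");
    out.push_str(trimmed);
    out.push('!');
    out
}
```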
The way monomorphization works, the entire body of the function gets copied. But in the case of conversion traits like `ToString`, you can actually separate the part that needs to be monomorphized from the rest of the function, like so:
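One way to sketch this split, assuming an `Into<String>` bound (the same shape works with `ToString`). The names `greet` and `greet_inner` are hypothetical. Note that the optimizer is still free to inline the inner function, so the savings are likely rather than guaranteed.

```rust
// Generic shim: this tiny function is the only part that gets
// monomorphized for each concrete `S`.
fn greet<S: Into<String>>(name: S) -> String {
    greet_inner(name.into())
}

// Non-generic worker: there is a single definition of the real
// logic, regardless of how many types `greet` is called with.
fn greet_inner(name: String) -> String {
    let trimmed = name.trim();
    let mut out = String::with_capacity(trimmed.len() + 8);
    out.push_str("Hello, ");
    out.push_str(trimmed);
    out.push('!');
    out
}
```

Callers still write `greet("world")` or `greet(String::from("world"))`; the split is invisible in the public API.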
By splitting out the conversion code from the function, you keep the part that the compiler will monomorphize, and thereby duplicate, quite small. This helps to keep the size of the resulting binary down.
Alternatively, you can remove the conversion trait abstraction, and instead require callers to do the conversion themselves before calling the function. This may make the API a little more tedious to use, but it avoids the potential problem of bloat by not requiring monomorphization at all.
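As a sketch of this alternative, the hypothetical `greet` below takes a concrete `String`, so there is nothing to monomorphize; the conversion moves to each call site.

```rust
// Hypothetical non-generic API: exactly one copy of this function
// exists in the binary.
fn greet(name: String) -> String {
    format!("Hello, {}!", name.trim())
}

// Hypothetical callers: each one converts before calling.
fn callers() -> (String, String) {
    let from_literal = greet("world".to_string());
    let from_owned = greet(String::from("Rust"));
    (from_literal, from_owned)
}
```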
Additionally, you can always try to shrink the sizes of your functions by refactoring. A big function is probably a sign of something gone wrong anyway.
I don’t say any of this to encourage you not to use trait bounds or not to use some of the conversion trait niceties in your API. Generics and static dispatch are an important part of Rust, and it would be silly not to take advantage of these features. Just keep in mind the potential for bloat, and keep an eye on the size of your binary. If it grows too large and becomes a problem, hopefully you will be better equipped to understand what is happening, and to correct it.