Don't Forget the Inline!
If you're writing a header file and you're at global or namespace scope, then you almost certainly do not mean to declare bare const or constexpr variables.
Background
So, C++, like C, has a concept called linkage. The linkage of a symbol (function or a variable) controls two things:
- Whether the symbol has any visibility outside of the translation unit it's declared in. That is, whether the linker (thus the name "linkage") cares about it at all.
- If the symbol is visible outside of its translation unit, it controls how the linker reacts when multiple translation units declare the same thing.
At global or namespace scope (and nowhere else), linkage is determined as follows (this is a simplification; you can go be a language lawyer on cppreference if you need the nitty gritty):
- static means the symbol has internal linkage. It can't be seen outside the current translation unit, and the linker doesn't care about it.
- Otherwise, the symbol has external linkage. The linker will look for other external symbols that have the same name and link them as follows:
  - If a variable or function is marked extern then that means that the linker must find another one with a matching name which is not also marked extern. All references to the extern symbol are then rewritten to point to the matching non-extern one (that is, they are linked together), and the extern one is thrown away.
  - If a variable or function is marked inline then the linker will take it and every other matching one which is also marked inline, pick one, and then throw away the rest (rewriting all references to the thrown away copies to refer, instead, to the one which was kept - again, this is linking and it's what a linker is for).
  - If, after ignoring the extern symbols and merging together the inline symbols, the linker finds multiple symbols that all share the same name, then that's a link error and your build fails.
After linking, all of the remaining functions and variables which aren't discarded as being unreferenced or dead code are then written to the output executable binary.
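To make the three cases a bit more concrete, here is a minimal single-file sketch. The names (internal_counter, shared_value, twice, bump_and_scale) are made up for illustration, and in a real project the extern declaration and its definition would normally live in different files.

```cpp
// Internal linkage: private to this translation unit, invisible to the linker.
static int internal_counter = 0;

// extern declaration: a promise that a definition exists somewhere,
// possibly in another translation unit.
extern int shared_value;

// The matching non-extern definition (external linkage). A second definition
// of shared_value in some other translation unit would be a link error.
int shared_value = 42;

// inline: every translation unit may carry an identical definition;
// the linker keeps one copy and discards the rest.
inline int twice(int x) { return 2 * x; }

// Uses all three kinds of symbol.
int bump_and_scale() {
    ++internal_counter;
    return twice(shared_value) + internal_counter;
}
```

Each call to bump_and_scale() bumps the file-private counter, so the result grows by one per call.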
Consequences
Note the rules above: after the linker does its thing (which may include eliminating unreferenced symbols), whatever remains goes into the final executable. If linking succeeds, then we know that the symbols with external linkage are all unique, because it's an error to have more than one copy left of any of them after the linker's done its thing. However, we know no such thing about symbols with internal linkage, which the linker left as-is.
So what happens if you declare a bunch of static things in a header file at global or namespace scope? Well, every CPP file that includes that header (even transitively) gets its own copy of that symbol. (And things get really interesting when a code generator is spitting out massive headers full of an absurd number of static symbols.) The linker will happily stuff as many redundant copies of a symbol with internal linkage as you (knowingly or otherwise) produce into your executable.
I have seen projects where tens of megabytes were wasted on nothing but this. And it's not that hard to do. You just make a few headers with a few hundred such declarations each and then include them in hundreds of CPP files (possibly by including them in a project-wide precompiled header or something like that). And hardly anyone these days looks at linker maps, so in a big project it goes totally unnoticed (until the team runs face-first into a hard memory limit on a target platform).
So, y'know, don't do that.
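For a feel of how quickly this multiplies out, here is a back-of-the-envelope sketch. Every input is a made-up illustrative number, not a measurement; in particular, the 64 bytes per copy is just an assumed average for data plus whatever startup code and tables come with it.

```cpp
#include <cstdint>

// Total bytes occupied when every translation unit keeps its own copy
// of every constant declared in the offending headers.
constexpr std::uint64_t duplicated_bytes(std::uint64_t headers,
                                         std::uint64_t constants_per_header,
                                         std::uint64_t translation_units,
                                         std::uint64_t bytes_per_copy) {
    return headers * constants_per_header * translation_units * bytes_per_copy;
}

// Assumed inputs: 3 headers, 300 constants each, 400 CPP files, 64 bytes per copy.
static_assert(duplicated_bytes(3, 300, 400, 64) == 23'040'000);  // roughly 22 MiB
```

With a single shared copy of each constant, the same inputs come to 3 * 300 * 64 = 57,600 bytes.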
Okay, but what does that have to do with const and constexpr?
This variable isn't marked static, so it has external linkage:
int foo = 8;
However, this variable, which also isn't marked static, has internal linkage - as if it was marked static:
const int foo = 8;
And so does this one:
constexpr int foo = 8;
Why do const and constexpr change the default linkage in this manner? Well, const started it and constexpr is probably just trying to be consistent with the existing convention. But why is const like this...?
I don't actually know the answer, but I'll venture a guess. It probably goes back to the early days of C++ (C itself never had this rule: there, a file-scope const variable still has external linkage unless you mark it static). See, in ages past, inline was only valid on functions (which probably contributes to some people still thinking inline refers to inlining, the optimization - it doesn't!). Variable declarations could not be inline. So if you wanted to put a bunch of constants in a header file, then you'd have to also mark them all static in order to prevent the linker from seeing duplicate definitions if that header was then included in multiple source files. That would be annoying, so my guess is that someone decided that const could just imply static and that would be good enough.
And in the past, it probably wasn't all that bad. A compiler would likely just copy the value of a const int into places where it's referenced instead of referencing the symbol itself. That makes the declaration unreferenced dead code, and it gets eliminated. And as far as I've ever seen, the original advice when promoting the use of const over macros was exactly that - to use them for basic things like integers.
But now we don't just want constant integers. We want bigger, chonkier constants: configuration blocks, binary resources, things that have constructors... A const int can get completely eliminated, but a const SomethingWithAConstructor probably can't (unless the compiler can prove in all translation units that the constructor has no side effects). And in order to run that constructor before main, as the language requires, the compiler needs to generate a function to call it, and then a pointer to that function probably needs to be put into an array somewhere where the C/C++ runtime will find and call it before calling main. And maybe the constructor can't be proven not to throw, so exception-handling tables need to be set up for the little function... And sure, that's all just a few bytes, but in a nontrivial project, those few bytes can easily be multiplied out by a large number, and they do add up.
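Here is a small sketch of the kind of constant described above. Settings, kDefaultSettings, and the constructions() counter are all hypothetical names invented for this example; the counter exists only so the startup constructor call can be observed.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical type standing in for "something with a constructor".
struct Settings {
    std::string name;

    explicit Settings(std::string n) : name(std::move(n)) { ++constructions(); }

    // Function-local static so the counter exists before any Settings is built.
    static int& constructions() {
        static int count = 0;
        return count;
    }
};

// Bare const at namespace scope: internal linkage, initialized at startup.
// If this line lived in a header, every including translation unit would get
// its own object, its own constructor call, and its own startup bookkeeping.
const Settings kDefaultSettings{"default"};

// Sanity check for this single-file sketch: with only one translation unit,
// exactly one Settings object has been constructed.
void check_single_translation_unit() {
    assert(kDefaultSettings.name == "default");
    assert(Settings::constructions() == 1);
}
```

In this one-file version the constructor runs once. Put the same definition in a header included from N source files and you would expect N constructor calls, one per copy.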
As I said, I've seen megabytes lost because of this and, in one case, those "mere" megabytes were a straw among a handful of others which nearly broke our camel's back.
Neat, how do I fix it?
To avoid this problem in modern C++ (C++17 and later, which is when variables were allowed to be inline), the solution is to just add inline to all of these declarations. Instead of const int foo it's now inline const int foo; similarly, constexpr int foo becomes inline constexpr int foo. The terrible default is terrible and it's annoying to have to override it every single time, but that's how things are and we gotta deal with it.
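As a rough sketch, a header following this advice might look like the block below. The config namespace and the names kMaxUsers, kAppName, and max_users_address are assumptions made up for illustration, and it needs a C++17 compiler.

```cpp
#include <string_view>

// Hypothetical contents of a header such as config.h.
namespace config {

// inline gives each constant external linkage with "merge the duplicates"
// semantics: one entity in the program, however many files include this.
inline constexpr int kMaxUsers = 8;
inline constexpr std::string_view kAppName = "demo";

// Because kMaxUsers is a single entity, this returns the same address
// no matter which translation unit calls it.
inline const int* max_users_address() { return &kMaxUsers; }

}  // namespace config
```

The values are still usable in constant expressions, so nothing is lost compared to the bare constexpr version.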