18 December, 2021

C++20 introduced the idea of module linkage, but compilers seem to implement it somewhat inconsistently.

Exploring module linkage in C++20

As far as I can tell, names declared in a module unit should not be visible outside of their module unless they have been exported with the export keyword. Unexported names which would otherwise have had external linkage (such as functions) should instead have a new kind of linkage, module linkage, which basically means that they can be used from any unit of their own module but not from outside it.

So far, so good. My understanding of this is that I can reuse the same name for a different function in a different module and not get a name clash.

Does this work in practice?

Single-file modules

tl;dr - it works with MSVC and gcc, but fails with clang.

Here’s a module interface unit, Module1.cpp. It exports a function Module1Greeter() which calls an unexported function SayHello().

// Module1.cpp
export module Module1;

export const char* Module1Greeter();

const char* SayHello()
{
    return "Hello from Module1";
}

const char* Module1Greeter()
{
    return SayHello();
}

Here’s another module interface unit, Module2.cpp. It exports a function Module2Greeter() which calls an unexported function SayHello(). This is clearly not the same SayHello() as the one in Module1.cpp. Each one is in its own module and neither is exported, so both should have module linkage.

// Module2.cpp
export module Module2;

export const char* Module2Greeter();

const char* SayHello()
{
    return "Hello from Module2";
}

const char* Module2Greeter()
{
    return SayHello();
}

Here’s a short program that uses both of these modules.

// main.cpp
import Module1;
import Module2;

#include <iostream>

int main()
{
    std::cout << Module1Greeter() << '\n';
    std::cout << Module2Greeter() << '\n';
}

But there’s a discrepancy here between compilers.

Here’s MSVC (cl 19.30.30706)

It compiles and runs.

C:> cl /EHsc /c /std:c++20 /interface /TP Module1.cpp
C:> cl /EHsc /c /std:c++20 /interface /TP Module2.cpp
C:> cl /EHsc /std:c++20 main.cpp Module1.obj Module2.obj
C:> main.exe
Hello from Module1
Hello from Module2

Here’s gcc (gcc 11.2.0 - mingw)

It compiles and runs.

C:> g++ -c --std=c++20 -fmodules-ts Module1.cpp
C:> g++ -c --std=c++20 -fmodules-ts Module2.cpp
C:> g++ --std=c++20 -fmodules-ts main.cpp Module1.o Module2.o -o main.exe
C:> main
Hello from Module1
Hello from Module2

Here’s clang (clang 13.0.0)

It fails to compile, stating that SayHello() has different definitions in different modules.

C:> clang -c -std=c++20 -Xclang -emit-module-interface Module1.cpp -o Module1.pcm
C:> clang -c -std=c++20 -Xclang -emit-module-interface Module2.cpp -o Module2.pcm
C:> clang -std=c++20 -fprebuilt-module-path=. main.cpp Module1.pcm Module2.pcm -o main.exe
In file included from main.cpp:2:
Module2.cpp:5:13: error: 'SayHello' has different definitions in different modules; definition in
module 'Module2' first difference is function body
const char* SayHello()
~~~~~~~~~~~~^~~~~~~~~~
Module1.cpp:5:13: note: but in 'Module1' found a different body
const char* SayHello()
~~~~~~~~~~~~^~~~~~~~~~
1 error generated.

Modules with a separate implementation unit

tl;dr - it works with gcc and clang (Linux), but fails with MSVC and clang (Windows).

Let’s take the module interface units from the previous example and split them into interface and implementation.

Here’s the module interface unit, Module1.cpp. It declares and exports a function Module1Greeter().

// Module1.cpp
export module Module1;

export const char* Module1Greeter();

Here’s its implementation unit, Module1_impl.cpp. It contains a definition for the exported function Module1Greeter(), and an unexported function SayHello().

// Module1_impl.cpp
module Module1;

const char* SayHello()
{
    return "Hello from Module1";
}

const char* Module1Greeter()
{
    return SayHello();
}

Similarly, here’s the module interface unit, Module2.cpp. It declares and exports a function Module2Greeter().

// Module2.cpp
export module Module2;

export const char* Module2Greeter();

Here’s its implementation unit, Module2_impl.cpp. It contains a definition for the exported function Module2Greeter(), and an unexported function SayHello(). As before, SayHello() is in a module unit and therefore should have module linkage.

// Module2_impl.cpp
module Module2;

const char* SayHello()
{
    return "Hello from Module2";
}

const char* Module2Greeter()
{
    return SayHello();
}

I won’t show main.cpp as it’s the same as the previous example.

Here’s MSVC (cl 19.30.30706)

It fails to link due to multiply defined symbols.

C:> cl /EHsc /c /std:c++20 /interface /TP Module1.cpp
C:> cl /EHsc /c /std:c++20 /interface /TP Module2.cpp
C:> cl /EHsc /c /std:c++20 Module1_impl.cpp
C:> cl /EHsc /c /std:c++20 Module2_impl.cpp
C:> cl /EHsc /std:c++20 main.cpp Module1.obj Module1_impl.obj Module2.obj Module2_impl.obj
...
Module2_impl.obj : error LNK2005: "char const * __cdecl SayHello(void)" (?SayHello@@YAPBDXZ)
already defined in Module1_impl.obj
main.exe : fatal error LNK1169: one or more multiply defined symbols found

Here’s gcc (gcc 11.2.0 - mingw)

It compiles and runs.

C:> g++ -c --std=c++20 -fmodules-ts Module1.cpp
C:> g++ -c --std=c++20 -fmodules-ts Module2.cpp
C:> g++ -c --std=c++20 -fmodules-ts Module1_impl.cpp
C:> g++ -c --std=c++20 -fmodules-ts Module2_impl.cpp
C:> g++ --std=c++20 -fmodules-ts main.cpp Module1.o Module1_impl.o Module2.o Module2_impl.o -o main.exe
C:> main
Hello from Module1
Hello from Module2

Here’s clang (clang 13.0.0) on Windows

It fails to link due to multiply defined symbols.

C:> clang -c -std=c++20 -Xclang -emit-module-interface Module1.cpp -o Module1.pcm
C:> clang -c -std=c++20 -Xclang -emit-module-interface Module2.cpp -o Module2.pcm
C:> clang -c -std=c++20 -fmodule-file=Module1.pcm Module1_impl.cpp
C:> clang -c -std=c++20 -fmodule-file=Module2.pcm Module2_impl.cpp
C:> clang++ -std=c++20 -fprebuilt-module-path=. main.cpp Module1.pcm Module2.pcm Module1_impl.o \
    Module2_impl.o -o main.exe
Module2_impl.o : error LNK2005: "char const * __cdecl SayHello(void)" (?SayHello@@YAPEBDXZ) already
defined in Module1_impl.o
main.exe : fatal error LNK1169: one or more multiply defined symbols found
clang: error: linker command failed with exit code 1169 (use -v to see invocation)

I also tried this with -fuse-ld=lld to use a different linker and got a similar error.

lld-link: error: duplicate symbol: char const * __cdecl SayHello(void)
>>> defined at Module1_impl.o
>>> defined at Module2_impl.o
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Here’s clang (clang 13.0.2) on Linux

It compiles and runs.

$ clang -c -std=c++20 -Xclang -emit-module-interface Module1.cpp -o Module1.pcm
$ clang -c -std=c++20 -Xclang -emit-module-interface Module2.cpp -o Module2.pcm
$ clang -c -std=c++20 -fmodule-file=Module1.pcm Module1_impl.cpp
$ clang -c -std=c++20 -fmodule-file=Module2.pcm Module2_impl.cpp
$ clang++ -std=c++20 -fprebuilt-module-path=. main.cpp Module1.pcm Module2.pcm Module1_impl.o \
  Module2_impl.o -o main.exe
$ ./main.exe
Hello from Module1
Hello from Module2

What’s happening here?

I’m new to C++ modules, so it’s entirely possible that I’ve misunderstood something about module linkage or even that I’m simply using the wrong flags to compile. If this is the case, then please let me know.

However, nearly every article that I’ve read about C++20 modules contains a section that talks about module linkage in the way that I described above, so they can’t all be wrong. Also, I’ve yet to find any sample code (other than slideware) that doesn’t also wrap its declarations in a namespace, which would hide this kind of clash.

My conclusion is that C++20 module support is still new. Compiler vendors have been concentrating on big wins such as making std importable as a module, and they may simply not have got around to implementing this particular aspect of module linkage.