C++ modules and circular class reference

Question

To learn more about C++20 modules, I'm in the process of migrating a graphics application from header files to modules. At the moment I have a problem with a circular dependency between two classes. The two classes describe nodes and edges of a graph. The edge class has pointers to two nodes and the node class has a vector of pointers to adjacent edges. I know there are other ways to describe a graph, but this architecture seems very natural to me: I have very fast access to neighboring elements, and it works seamlessly in the old world of header files and #include. The key is forward references.
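For illustration, the header-era design described above might look roughly like the following. This is a minimal sketch, not the actual application code: the names Node, Edge and connect, and the id member, are assumptions made up for the example.

```cpp
#include <vector>

// Forward declaration: enough to declare pointers to Edge
// before Edge itself is defined.
struct Edge;

struct Node {
    int id = 0;                 // hypothetical payload
    std::vector<Edge*> edges;   // non-owning pointers to adjacent edges
};

struct Edge {
    Node* from = nullptr;       // non-owning pointers to the two endpoints
    Node* to = nullptr;
};

// Hypothetical helper: links two nodes through an edge and
// registers the edge with both endpoints.
inline void connect(Edge& e, Node& a, Node& b) {
    e.from = &a;
    e.to = &b;
    a.edges.push_back(&e);
    b.edges.push_back(&e);
}
```

In a header-based build, Node and Edge can live in separate headers, each carrying only a forward declaration of the other.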

But in the new world of C++20 modules, forward references no longer work.

The topic of circular references has been discussed in many places, but I haven't yet found a solution that really convinces me.

A common statement is that circular references are an architectural problem and should be avoided. If necessary, the two classes should be packed into one module. That would clearly be a step backwards. I try to make modules small and elementary.

I could replace the pointers to nodes or edges with pointers to a common base class NetworkObject that actually already exists. But that would destroy valuable information and force me to use static_cast to artificially add the type information back.
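To make the drawback concrete, here is a minimal sketch of that base-class workaround. Only the name NetworkObject comes from the application; Node, Edge, their members and the target helper are assumptions for the example.

```cpp
#include <vector>

struct NetworkObject {
    virtual ~NetworkObject() = default;
};

// Node no longer mentions Edge, so it only depends on NetworkObject.
struct Node : NetworkObject {
    std::vector<NetworkObject*> edges;  // really Edge*, but the type is erased
};

// Edge no longer mentions Node either.
struct Edge : NetworkObject {
    NetworkObject* from = nullptr;      // really Node*
    NetworkObject* to = nullptr;        // really Node*
};

// Hypothetical accessor: every use has to restore the type by hand,
// and a wrong static_cast here would be undefined behavior.
inline Node* target(const Edge& e) {
    return static_cast<Node*>(e.to);
}
```

The cycle between the two class definitions is gone, but the compiler can no longer check that an edge endpoint is actually a node.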

My question is: Am I missing anything? Is there an easier way?

Answer 1

There are a few misconceptions I can see here. Not entirely false, but not entirely true either.

But in the new world of C++20 modules, forward references no longer work.

This is not completely true. You cannot use a forward declaration to declare something that belongs to a different module, but you can certainly do so within the same module.

For example:

export module M;

export namespace n {
    struct B;

    struct A {
        B* b;
    };

    struct B {
        A* a;
    };
}

Then you can split it up in multiple module partitions:

export module M:a;

namespace n {
    struct B;
    export struct A {
        B* b;
    };
}

export module M:b;

namespace n {
    struct A;
    export struct B {
        A* a;
    };
}

export module M;
export import :a;
export import :b;

The gist of it is that types whose definitions depend on each other are coupled tightly enough that they must reside in the same module.

Also, note that modules are not necessarily supposed to be as granular as headers. Dividing your code into too many modules can hurt compile times. For example, a whole library could be just one big module. The standard library chose this approach: it exports everything through the std modules, and this turns out to be faster than dividing the standard library into many smaller modules.

Smaller modules are not as beneficial as many may think. Related things and classes should be packed into the same module, and if further splitting is needed for code organization, it can be done with partitions within that module.

The number of modules and their names are part of your API. This means that if your modules are too fine-grained, simply moving your code around will result in a breaking change. Module partitions are not part of your API and can be moved around freely.

A common statement is that circular references are an architectural problem and should be avoided. If necessary, the two classes should be packed into one module. That would clearly be a step backwards. I try to make modules small and elementary.

Those modules would not be small and elementary, because of the cycle between them: you can't use one module without also using the other. You would also need to link against the other module if its implementation resides in another static library.

The two classes describe nodes and edges of a graph

Would there be a program that works with only the nodes module or only the edges module? Hardly. They should be part of the graph module. You could have :edge and :node partitions, but it would not make sense to use only one of them in a program or part of a program.

If this is about compile times, then with current compiler technology bigger modules have been shown to be faster than many smaller modules.

The rationale for splitting modules into smaller modules is that there would be a use case for wanting to only import certain specific things. For example, std.freestanding would only contain the freestanding part of the standard library so programmers don't accidentally use parts they are not allowed to use.

Of course, another way to do that would be to drop all the module safeguards and use Global Module Fragments (GMF). Doing so allows modules to interface with the implicit global module. And yes, it brings both the benefits and the consequences of global forward declarations. You open the way for ODR violations to become possible again, and your entities won't be part of a named module anymore. It also allows users to use your entities without importing the specific named module the declaration resides in, bypassing the API you expose to them via your module names.

You can open Pandora's box using an extern "C++" linkage specification:

export module A;

export namespace n {
    extern "C++" {
        struct B;
        struct A {
            B* b;
        };
    }
}

export module B;

export namespace n {
    extern "C++" {
        struct A;
        struct B {
            A* a;
        };
    }
}
