简体   繁体   中英

Large scale usage of Meyer's advice to prefer Non-member,non-friend functions?

For some time I've been designing my class interfaces to be minimal, preferring namespace-wrapped non-member functions over member functions. Essentially following Scott Meyer's advice in the article How Non-Member Functions Improve Encapsulation .

I've been doing this with good effect in a few small scale projects, but I'm wondering how well it works on a larger scale. Are there any large, well regarded open-source C++ projects that I can take a look at and perhaps reference where this advice is strongly followed?

Update: Thanks for all the input, but I'm not really interested in opinion so much as finding out how well it works in practice on a larger scale. Nick's answer is closest in this regard, but I'd like to be able to see the code. Any sort of detailed description of practical experiences (positives, negatives, practical considerations, etc) would be acceptable as well.

I do this quite a bit on the project I work on; the largest of which at my current company is around 2M lines, but it's not open source, so I can't provide it as a reference. However, I will say that I agree with the advice, generally speaking. The more you can separate the functionality which is not strictly contained to just one object from that object, the better your design will be.

By way of an example, consider the classic polymorphism example: a Shape base class with subclasses, and a virtual Draw() function. In the real world, Draw() would need to take some drawing context, and potentially be aware of the state of other things being drawn, or the application in general. Once you put all that into each subclass implementation of Draw(), you're likely to have some code overlap, or most of your actual Draw() logic will be in the base class, or somewhere else. Then consider that if you want to re-use some of that code, you'll need to provide more entry points into the interface, and possibly pollute the functions with other code not related to drawing shapes (eg: multi-shape drawing correlation logic). Before long, it'll be a mess, and you'll wish you had a draw function which took a Shape (and context, and other data) instead, and Shape just had functions/data which were entirely encapsulated and not using or referencing external objects.

Anyway, that's my experience/advice, for what it's worth.

I'd argue that the benefit of non-member functions increases as the size of the project increases. The standard library containers, iterators, and algorithms library are proof of this.

If you can decouple algorithms from data structures (or, to phrase it another way, if you can decouple what you do with objects from how their internal state is manipulated), you can decrease coupling between your classes and take greater advantage of generic code.

Scott Meyers isn't the only author who has argued in favor of this principle; Herb Sutter has too, especially in Monoliths Unstrung , which ends with the guideline:

Where possible, prefer writing functions as nonmember nonfriends.

I think one of the best examples of an unneccessary member function from that article is std::basic_string::find ; there is no reason for it to exist, really, as std::find provides exactly the same functionality.

OpenCV library does this. They have a cv::Mat class that presents a 3D matrix (or images). Then they have all the other functions in the cv namespace.

OpenCV library is huge and is widely regarded in its field.

One practical advantage of writing functions as nonmember nonfriends is that doing so can significantly reduce the time it takes to thoroughly test and verify the code.

Consider, for example, the sequence container member functions insert and push_back . There are at least two approaches to implementing push_back :

  1. It can simply call insert (it's behavior is defined in terms of insert anyway)
  2. It can do all the work that insert would do (possibly calling private helper functions) without actually calling insert

Obviously, when implementing a sequence container, you probably want to use the first approach. push_back is just a special form of insert and (to the best of my knowledge) you can't really get any performance benefit by implementing push_back some other way (at least not for list , deque , or vector ).

However, to thoroughly test such a container, you have to test push_back separately: since push_back is a member function, it can modify any and all of the internal state of the container. From a testing standpoint, you should (must?) assume that push_back is implemented using the second approach because it is possible that it could be implemented using the second approach. There is no guarantee that it is implemented in terms of insert .

If push_back is implemented as a nonmember nonfriend, it can't touch any of the internal state of the container; it must use the first approach. When you write tests for it, you know that it can't break the internal state of the container (assuming the actual container member functions are implemented correctly). You can use that knowledge to significantly reduce the number of tests that you need to write to fully exercise the code.

I also do this alot, where it seems to make sense, and it causes absolutely no problems with scaling. (although my current project is only 40000 LOC) In fact, I think it makes the code more scalable - it slims down classes, reduces dependencies. It sometimes requires you to refactor your functions to make them independent of members of the class - and thereby often creating a library of more general helper functions, which you can easly reuse elsewhere. I'd also mention that one of the common problems with many large projects is the bloating of classes - and I think preferring non-member, non-friend functions also helps here.

Prefer non-member non-friend functions for encapsulation UNLESS you want implicit conversions to work for class templates non-member functions (in which case you better make them friend functions):

That is, if you have a class template type<T> :

template<class T>
struct type {
  void friend foo(type<T> a) {}
};

and a type implicitly convertible to type<T> , eg:

template<class T>
struct convertible_to_type {
  operator type<T>() { }
};

The following works as expected:

auto t = convertible_to_type<int>{};
foo(t);  // t is converted to type<int>

However, if you make foo a non-friend function:

template<class T>
void foo(type<T> a) {}

then the following doesn't work:

auto t = convertible_to_type<int>{};
foo(t);  // FAILS: cannot deduce type T for type

Since you cannot deduce T then the function foo is removed from the overload resolution set, that is: no function is found, which means that the implicit conversion does not trigger.

(I don't have time to write this up nicely, the following's a 5 minute brain dump which doubtless can be ripped apart at various trival levels, but please address the concepts and general thrust.)

I have considerable sympathy for the position taken by Jonathan Grynspan, but want to say a bit more about it than can reasonably be done in comments.

First - a "well said" to Alf Steinbach, who chipped in with "It's only over-simplified caricatures of their viewpoints that might seem to be in conflict. For what it's worth I don't agree with Scott Meyers on this matter; as I see it he's over-generalizing here, or he was."

Scott, Herb etc. were making these points when few people understood the trade-offs or alternatives, and they did so with disproportionate strength. Some nagging hassles people had during evolution of code were analysed and a new design approach addressing those issues was rationally derived. Let's return to the question of whether there were downsides later, but first - worth saying that the pain in question was typically small and infrequent: non-member functions are just one small aspect of designing reusable code, and in enterprise scale systems I've worked on simply writing the same kind of code you'd have put into a member function as a non-member is rarely enough to make the non-members reusable. It's pretty rare for them to even express algorithms that are both complex enough to be worth reusing and yet not tightly bound to the specific of the class they were designed for, that being weird enough that it's practically inconceivable some other class will happen along supporting the same operations and semantics. Often, you also need to template arguments, or introduce a base class to abstract the set of operations required. Both have significant implications in terms of performance, being inline vs out-of-line, client-code recompilation.

That said, there's often less code changes and impact study required when changing implementation if operations have been implementing in terms of a public interface, and being a non-friend non-member systematically enforces that. Occasionally though, it makes the initial implementation more verbose or in some other way less desirable and maintainble.

But, as a litmus test - how many of these non-member functions sit in the same header as the only class for which they're currently applicable? How many want to abstract their arguments via templates (which means inlining, compilation dependencies) or base classes (virtual function overheads) to allow reuse? Both discourage people from seeing them as reusable, but when not the case, the operations available on a class are delocalised , which can frustrate developers perception of a system: the develop often has to work out for themselves the rather disappointing fact that - "oh - that will only work for class X".

Bottom line: most member functions aren't potentially reusable. Much corporate code isn't broken into clean algorithm versus data with potential for reuse of the former. That kind of division just isn't required or useful or conceivably useful 20 years down the road. It's much the same as get/set methods - they're needed at certain API boundaries, but can constitute needless verbosity when ownership and use of the code is localised.

Personally, I don't have an all or nothing approach to this, but decide what to make a member function or non-member based on whether there's any likely benefit to either, potential reusability versus locality of interface.

As stated in the article, STL has both member and non-member functions. This isn't because of his preference -- it's mostly because many of the free functions operate on iterators and iterators are not in a class hierarchy (because STL wants pointers on arrays to be first-class iterators).

I strongly disagree with this article for C++, because the inconsistency would be annoying, and in modern IDE's intellisense would be broken.

In C#, however, with extension methods, this is all very good advice. There, you get the best of both worlds -- you can make non-member functions that can appear to be member functions. They also can be in separate files -- getting all of the benefits of this practice without the inconsistency drawback.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM