This paper was delivered to Summer AUUG meetings in both
Sydney and Melbourne in March 1996.
Surviving C++: Tips and Tricks
Joan McGalliard * and Paul-Michael Agapow **
GPO Box 964G Melbourne, V.3001
* jem@netspace.net.au
** agapow@latcs1.oz.au
Abstract
C++ has rapidly become a dominant language. Yet it is complex
and difficult to learn and many (like the authors) have had to
teach themselves. In this paper we share some ideas, tips and
approaches for learning and using C++, drawn from our own
experiences and several useful works. A reading list and
pointers to other useful references are included.
Introduction
Whether one likes it or not, C++ is here to stay ... at least
for a while. The language's widespread and almost faddish
popularity is partly justified by some very attractive features.
Conversely, C++ is one of the most complex and obfuscated
programming languages ever to reach general use, as many who
have learnt the language will know. This paper (which could be
called "C++ for Humans") is an assortment of approaches, tips and
points to help make C++ easier. While there are certainly
arguments against some of our suggestions, we feel they are
useful for avoiding some of the worst pitfalls.
However, it should be noted that this paper is not:
- a C++ reference.
- a basic tutorial in C++.
- a guru's list of C++ tricks.
- a coverage of C or general programming issues except
where these pertain to C++.
Why Use C++?
For many this question will not arise, as they're obliged to use C++
for project or job reasons. For the rest of us, there are some good
(albeit not great) reasons as to why you should use C++:
- I/O handling streams that (compared with
stdio) are robust, typesafe, fast and extensible.
- exceptions, which allow you to code for success instead of
failure (see below).
- a workable (although not brilliant) object system that gives
modularity and reusability. (Although possible in C these
features are more accessible and sophisticated in C++.)
- cleaner memory handling with new and delete.
- stronger type checking.
- const and inline declarations which allow a
decreased reliance on the preprocessor for constants and
macros. (Note that const is available, if infrequently used,
in ANSI C and inline is present in some
implementations. See below for some caveats about their use in
C++.)
- ready availability of a wide range of high quality class
libraries.
- a large language compatibility (and "programmer
compatibility" [15]) with C.
Books and Pointers
There are a lot of very bad C++ books available that are not
only poorly written but just plain incorrect. We therefore
recommend that you rely mainly on books that have been
recommended to you. The books you might find useful can be
broken down into three classes:
- Introductory texts (for learning the language) comprise the
most crowded sector of the market and also represent some of the
worst written. Exceptions include Stevens [1], Lippman [15],
Dewhurst and Stark [16], Satir and Brown [4].
- There's less choice amongst the core reference material (for
getting precise language specs and syntax). Unfortunately for
those of us spoilt by the excellent examples available for C
(Kernighan and Ritchie et al.) Stroustrup [2] and Ellis and
Stroustrup (or ARM) [3] are abominably written. Although these
are the definitive references, we suggest consulting them only in
an emergency or when 'lesser' texts disagree. Also of use are
Cline and Lomow [5], Plauger [29], and - of course - the draft
standard [30].
- Once you know C++, the next step is to become good at it and
there are some very astute "style" guides around to provide sage
advice. McConnell [6] although not specifically about C++ is
such a wealth of information and good practices that everyone
should read it, if not own a copy. Highly recommended and
specifically about C++ are Saks and Plum [7], Horstmann [8],
Cargill [14], Meyers [10], Ellemtel [20], Taligent [31] and at
advanced levels Coplien [17].
Of course the Internet contains a wealth of C++ information, not
all of it worthwhile. Web searches for C++ will reveal an
(over)abundance of information. Good places to visit would
include the newsgroups comp.lang.c++ and
comp.lang.c++.moderated and the C++ FAQ [22,23]. To
find the right tutorial for your needs, begin looking at
"Learning C++" [25] or "Learn C/C++ today" [26]. C programmers
may want to go straight to "C++ Annotations" [27]. To try to
come around to the object oriented paradigm look at
"Understanding C++" [28]. The best general place to start
looking around the web at C++ is probably "C++ Virtual Library"
[24].
Style
Personal programming style (methods of commenting, use of
identifiers, spacing and so on) is of course a highly personal
issue and many electrons have been wasted arguing about the
naturally "best" way. Despite this history however, we'll dive
in and suggest that there are some stylistic conventions that are
useful in the case of C++. Use of commenting and a self
documenting programming style in any programming environment is
of course important for readability and maintainability. With C++, the
emphasis on reusability makes this issue paramount. Thus all the
'rules' of good commenting and layout apply doubly to C++. Of
course, any body of code should have a unifying style.
Often in C++ you'll have several functions or variables with
the same name (in different classes, or overloaded functions
etc.). The compiler 'mangles' these names to try to produce
unique identifiers, often using a combination of source file
name, class name and object name, then - if necessary -
truncating the name. If using a source code debugger or other
analysis tool, the actual names of variables and functions can
be quite different from those expected. Fortunately this is a
problem that is vanishing with improved debuggers. Name mangling
can also lead to subtle errors if mangled names "collide" (a
problem that is also disappearing). For safety ensure that file
names are unique across different directories in the same
program. Also long identifier names should be unique in the
early part of their names. For example choose
func1UsedInThisBlock and func2UsedInThisBlock
rather than funcUsedInThisBlock1 and
funcUsedInThisBlock2. Another problem along these
lines is the collision of mangled names with those in third party
libraries.
On the subject of identifiers, it's good practice to give
names for formal arguments for functions in their declaration and
to ensure these are the same as in the function definition (i.e.
don't declare functions as myCleverFunc( int, char*, long, int )).
Compilers won't complain about the lack of names but it's clearer
both for yourself and those who come after you.
There are two ways of doing comments that are supported under
all C++ compilers, the standard, C 'parenthesis' style
/*...*/ (slash-asterisk), and the C++ 'to the end of line'
style // (double-slash). We suggest the exclusive use of
double-slash comments because :
- Double-slash is the most obvious style for inline comments
and commenting out entire lines
- Double-slash is autoterminating and eliminates the risk of dangling
comments
- Double-slash eliminates errors caused by accidentally nesting comments
- When grepping or using other file searching tools, it is easy to
identify (or eliminate) lines or parts of lines that are commented out
using double-slash
- Slash-asterisk is available for temporary commenting out of large
sections of code without having to worry about nesting as above. (This
effect can also be duplicated by use of #if 0 and #endif.)
- Use of one style makes for consistency and uniformity of code appearance
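As a minimal sketch of these conventions (the function areaOfSquare is
hypothetical and exists only for illustration), double-slash comments carry
the documentation while #if 0 disables a block that itself contains comments:

```cpp
#include <cassert>

// Returns the area of a square with the given side length.
int areaOfSquare(int side)
{
    assert(side >= 0);        // a negative length is a programming error

    int area = side * side;   // a double-slash comment ends with its line

#if 0
    // Disabled by the preprocessor rather than by a comment, so this
    // block can safely contain comments of either style.
    area = area * 2;          /* an abandoned experiment */
#endif

    return area;
}
```

Re-enabling the disabled block only requires changing #if 0 to #if 1.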
Design and Planning
The whole concept of OO design itself is deep enough to fill
several shelves of books, so we'll simply list some useful
references, and emphasise that the design phase is especially
important with OO languages. There are several excellent books
on the subject of OO design including Meyer [19] and Booch [33]
and much of the new literature on patterns including Gamma [11].
(A note on the subject of patterns: although it seems fairly
complex and obfuscated, at its heart it's a very simple and
practical development that will reward study.) Try to make use
of other people's work. Use the standard libraries and common
algorithms whenever possible.
The Compiler is Not Your Friend
The C++ compiler writers have a tough job: they're dealing with a
moving target (the as yet unrealised standard), the standards
committee is writing in features that haven't been implemented
yet, and language complexity is blowing out severely. All in
all, this leads to buggy and eccentric compilers [12] although
thankfully this situation has improved significantly over recent
years. (Thus compiler related problems should hopefully be
minor, unless you are forced to deal with old software.)
- If it worked in the last compiler release, don't assume that
it will work in the next. This applies especially for sensu
stricto syntactically legal code that is tricky or
counter-intuitive.
- Don't learn anything from your compiler. Take what you've
learnt from reference material and confirm it with your compiler.
With several very distinct flavours of C++ still around, there's
a good chance your compiler has a few in-house extensions or
stop-gap features within. If your compiler forces you away from
standard C++: add comments!
- Actively distrust your compiler. Be prepared for it to take
your perfectly reasonable (and especially your perfectly
unreasonable) code and produce spaghetti. Every once in a while
do a total fresh compile of your project with all the warning
flags on and try to get rid of every problem reported - if
necessary by getting a new compiler.
- As a consequence of compilers and language specs moving,
any class libraries or toolkits you use may be strangely buggy.
Use recent versions of libraries and distrust them.
Headers
As per other languages, header files are the black box documentation
(interface) of your code, to be used by anyone calling your
code. The usual caveats for such information apply as well as the
following :
- Try and keep unnecessary information out of the header
files. The how (as opposed to the what), as well
as details on any private member functions, belong in the appropriate
source file.
- Unfortunately in C++ encapsulation is not very secure.
Along with the necessary interface details in the header file,
there is also a complete description of your class. The
temptation for a user to hack the header (change a
private to public) and use internal functions
is great. Don't do this yourself, but be aware it could
happen.
- Always explicitly define access type (private,
public etc.) of class members.
- Although many books show member functions defined inline
with their declarations in the class definitions, don't do this.
This is muddying your implementation and interface.
- To keep the size of object files down to a minimum, there
should be one header file per class and one class per header file
(ie. each header file should be a complete description of
a single class). Also you should ensure (through #ifdef,
#pragma or similar) that each header file is included at
most once.
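The points above might be sketched as follows. The file names, the guard
macro COUNTER_H and the class Counter are all hypothetical, and the three
"files" are shown in one listing only so that the example is complete:

```cpp
// ---- counter.h (hypothetical): the interface and nothing else ----
#ifndef COUNTER_H
#define COUNTER_H

class Counter
{
public:
    Counter();
    void increment();
    int value() const;

private:
    int count_;
};

#endif // COUNTER_H

// ---- counter.cpp (hypothetical): the implementation ----
// #include "counter.h"   // needed when the parts are separate files

Counter::Counter() : count_(0) {}

void Counter::increment() { ++count_; }

int Counter::value() const { return count_; }

// ---- client code (hypothetical): sees only the interface ----
#include <cassert>

int countTwice()
{
    Counter counter;
    counter.increment();
    counter.increment();
    assert(counter.value() == 2);
    return counter.value();
}
```

The guard means that a second inclusion of counter.h in the same source file
expands to nothing, so the class is never defined twice.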
Default member functions (and traps)
Every time you define a class, C++ automatically generates a
default constructor, a destructor, a copy constructor and an assignment
operator if you don't specify them [13]. It is important to be aware of
these, and to be sure that the default ones suit your purpose.
The default constructor has no parameters, and does nothing
except call the constructors of the base and member classes.
It is only generated if there are no other constructors. Remember
if you add a constructor later, and you have been relying on the
default constructor (eg you have an array of objects), you must
then supply that explicitly yourself.
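A small sketch of this trap, using a hypothetical Point class: once the
two-argument constructor exists, the array in sumOfDefaultPoints compiles
only because Point() has been written out by hand.

```cpp
#include <cassert>

class Point
{
public:
    Point();               // must now be written by hand ...
    Point(int x, int y);   // ... because this constructor was added
    int x() const;
    int y() const;

private:
    int x_;
    int y_;
};

Point::Point() : x_(0), y_(0) {}
Point::Point(int x, int y) : x_(x), y_(y) {}
int Point::x() const { return x_; }
int Point::y() const { return y_; }

// Every element of the array is built with the default constructor, so
// this function would not compile if Point() were missing.
int sumOfDefaultPoints()
{
    Point points[3];
    int sum = 0;
    for (int i = 0; i < 3; ++i)
        sum += points[i].x() + points[i].y();
    return sum;
}
```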
Another important rule is to make destructors virtual. If a
base class pointer refers to an instance of a derived class, BaseClass*
object = new DerivedClass; then the base class' destructor must be
virtual, otherwise delete object; is not guaranteed to call the
derived class' destructor. If the base destructor is virtual,
then first the derived and then the base destructor will be
called.
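The order of destruction can be observed with a sketch like the one below.
The classes Base and Derived and the global destructionLog are hypothetical
names used only for this illustration.

```cpp
#include <cassert>
#include <string>

std::string destructionLog;   // records the order in which destructors run

class Base
{
public:
    virtual ~Base();          // virtual, so delete finds the derived part
};

class Derived : public Base
{
public:
    ~Derived();
};

Base::~Base() { destructionLog += "Base "; }

Derived::~Derived() { destructionLog += "Derived "; }

// Deletes a Derived object through a Base pointer and reports which
// destructors ran, in order.
std::string destroyThroughBasePointer()
{
    destructionLog = "";
    Base* object = new Derived;
    delete object;            // runs ~Derived() first, then ~Base()
    return destructionLog;
}
```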
The copy constructor can be called explicitly, and is also
used when you call a function with a call-by-value parameter.
The assignment operator (=) behaves in a similar fashion, and as
a massively overloaded operator, is very useful. With either of
these there is a trap: if your class contains a pointer to a
dynamically allocated item, then both the default copy constructor
and the default assignment operator will copy the pointer, not the
item. Thus if the copy deletes the item (say, in its destructor), the
original is left with a dangling pointer. To avoid this, you
may have to write your own suitable copy constructor and assignment
operator.
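One way to write such a pair is sketched below for a hypothetical Label
class that owns a dynamically allocated C string. Both operations copy the
item rather than the pointer, so each object can safely delete its own text.

```cpp
#include <cassert>
#include <cstring>

class Label
{
public:
    Label(const char* text);
    Label(const Label& other);              // copy constructor
    Label& operator=(const Label& other);   // assignment operator
    ~Label();

    const char* text() const;
    void setFirstChar(char c);

private:
    static char* duplicate(const char* text);
    char* text_;   // owned by this object
};

// Allocates a fresh copy of a C string.
char* Label::duplicate(const char* text)
{
    assert(text != 0);
    char* copy = new char[std::strlen(text) + 1];
    std::strcpy(copy, text);
    return copy;
}

Label::Label(const char* text) : text_(duplicate(text)) {}

// Copy the item, not just the pointer.
Label::Label(const Label& other) : text_(duplicate(other.text_)) {}

Label& Label::operator=(const Label& other)
{
    if (this != &other)   // guard against self-assignment
    {
        char* fresh = duplicate(other.text_);
        delete [] text_;
        text_ = fresh;
    }
    return *this;
}

Label::~Label() { delete [] text_; }

const char* Label::text() const { return text_; }

void Label::setFirstChar(char c)
{
    if (text_[0] != '\0')
        text_[0] = c;
}
```

Changing a copy through setFirstChar leaves the original untouched, which
would not be the case with the compiler-generated versions.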
Exceptions
C++ has turned around one of the biggest faults in most
Algolic languages. Previously while unexpected cases
might have been only 1% (or much less) of the live data, 90%
of the code might have been for such abnormal conditions. Perhaps
more importantly, the important code is cluttered with largely
unused error-handling code. Thus one of the most
useful features in C++ is exceptions. Exceptions allow the
special cases, unexpected values and errors to be handled
separately from the rest of the code and more simply. C
programmers who find themselves compelled to develop in C++
sometimes dig in their heels and stick to straight Kernighan and
Ritchie C. We would suggest to these people that if they adopt
only one part of C++ it should be exceptions. If using objects,
the use of exceptions becomes almost obligatory.
Some general hints about using exceptions :
- Make liberal use of the catch-all, catch(...). This
will catch every exception that isn't handled by the preceding
catches.
- A throw without a catch is an error
(unhandled exception). Rather than leaving the runtime to handle
this (typically by aborting the program) you could put a try around
your main procedure, just in case.
- If you limit the number of different types of objects you
throw within your program, then you can put a catch for each kind
at every exception handler. This will catch otherwise unhandled
exceptions (ie. programming errors) at the lowest possible level.
- Be careful handling exceptions from library routines. Many
libraries do not implement exceptions yet, and those that do are
not necessarily adhering to the standard. If you port your code,
or upgrade your libraries, you may find your exception handlers
no longer work.
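The first two hints can be sketched as below. The function runGuarded stands
in for the body of a guarded main procedure, and the three worker functions
are hypothetical. The sketch assumes a library that provides the standard
exception classes in <stdexcept>, which (as noted above) not every
implementation does.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical workers: one succeeds, one throws a standard exception,
// one throws something unexpected.
int succeed() { return 0; }

int failWithStandardException() { throw std::runtime_error("disk full"); }

int failWithOddException() { throw 42; }

// Runs 'work' inside a try block so that no exception escapes.
std::string runGuarded(int (*work)())
{
    try
    {
        work();
        return "ok";
    }
    catch (const std::exception& e)   // the kinds we expect
    {
        return std::string("error: ") + e.what();
    }
    catch (...)                       // the catch-all for everything else
    {
        return "unknown exception";
    }
}
```

The specific handler comes first and the catch-all last, since handlers are
tried in order.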
Debugging, Tuning and Optimisation
Debugging and optimisation are often synonymous (e.g. poor
performance can be called a "bug") and optimisation can
introduce bugs and get in the way of debugging. A major cause of
hassles in this area for C++ is the use of inline functions :
- It's always a good idea to debug your project then
optimise it. This applies doubly when it comes to the use of
inline functions. Many compilers and debuggers have poor or
non-existent facilities for the debugging of inline functions.
- Inlining a function doesn't always make it faster. Since a
C++ compiler is not obliged to follow the inline directive,
extensive inlining may in fact make execution slower. Even if all
inline directives are followed severe code bloat may result and
speed may actually decrease in paging environments.
- What speeds up a C program may actually be slower in C++,
even if the C++ program is just strict C code rendered by a C++
compiler. The moral : always test and time your optimisations.
- While "Premature optimisation is the root of all evil"
(Donald Knuth), the best way to ensure optimal code is to design
it correctly in the first place. No amount of tweaking will save
a basically bad design.
- A memory tool/debugger (such as Purify [32]) is quite useful.
- The old assert construct is still quite useful and
can be a substitute for exceptions if for some reason they are
not suitable.
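A brief sketch of assert used to check preconditions; the helper averageOf
is hypothetical. The checks are active in debug builds and are removed when
the code is compiled with NDEBUG defined.

```cpp
#include <cassert>

// Mean of the first 'count' values, rounded towards zero.
int averageOf(const int* values, int count)
{
    // Callers that break these rules have a bug; stop at once in debug
    // builds rather than compute nonsense.
    assert(values != 0);
    assert(count > 0);

    long total = 0;
    for (int i = 0; i < count; ++i)
        total += values[i];
    return static_cast<int>(total / count);
}
```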
Approach With Caution
The following is a list of the more complex features of C++
that may lead you into a morass of debugging, cryptic compiler
warnings and convoluted code. Although the most obvious approach
with problems like this is just not to use them, in many cases this
isn't practical. The best advice therefore is that if you have to use
these features, budget your time appropriately, enlist the
help of a guru and approach with caution.
- Templates and generic programming are one of the smartest
things to come along in a long while and form the basis of some
good class libraries. Templates are however a "deep" topic and
debugging them can be difficult. Overenthusiastic use of
templates can also lead to severe code bloat as some "stupid"
compilers instantiate every variant of a template.
- The I/O classes of C++ are a "Good Thing". However due to the
peculiar "diamond shape" inheritance pattern within these classes
and the variety of iostream implementations (the exact
method not being defined in the standard), producing a subclass
can be hell.
- Multiple inheritance is a useful thing. Although resolution
of inheritance is deterministic within the standard, you can
still run into problems with old compilers or complex
inheritance trees.
- Overloading is a very powerful feature of C++. However it
should be used carefully and sparingly, especially when it comes
to overloading operators (as opposed to functions). Ensure that
the semantics are uniform and logical. For example,
compareThing( *bigObject, *bigObject) and
compareThing( *tinyObject, *tinyObject) should perform
their operations in a similar manner. If they do not, the
function shouldn't be overloaded, as it will only cause confusion
later. If you're overloading the equivalence operator ==,
remember that (x == x) should always hold. Be aware that poorly defined
operators can reduce the readability of a program and if you
define them for a base class, you may have to redefine them for all
the derived classes ...
- C++ gives us use of the ... ellipsis operator
for unspecified function arguments. This should be used
sparingly because it circumvents C++'s strong type checking.
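As an illustration of the point about overloaded operators, here is a
sketch of an equivalence operator with uniform semantics for a hypothetical
Money class. Equality is defined on the value held, and != is written in
terms of == so the two can never disagree.

```cpp
#include <cassert>

class Money
{
public:
    Money(long dollars, int cents);
    long totalCents() const;

private:
    long cents_;
};

Money::Money(long dollars, int cents) : cents_(dollars * 100 + cents) {}

long Money::totalCents() const { return cents_; }

// Two amounts are equal when they hold the same value, however they
// were written, so (x == x) is always true.
bool operator==(const Money& left, const Money& right)
{
    return left.totalCents() == right.totalCents();
}

bool operator!=(const Money& left, const Money& right)
{
    return !(left == right);
}
```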
Gotchas
Every language, of course, has a share of surprises that are
just waiting to slap you in the face with the non-obvious
implications of the code you have written. With C++, a lot of these stem
from the fact that it is not a strict superset of C, despite the
temptation to assume that it is.
- In C there are separate namespaces for structures and types.
In C++ all names within a scope share the same namespace. Moral:
make all of your identifiers unique, always.
- In C++ there is some confusion surrounding the declaration
of variables in the initialisation of a for loop.
Previously the declared initialisation variable was held to exist outside the
loop (as if it had been declared before entering the loop). The
standard now holds that this variable exists only within the
for loop. This means that the below code is fine under
recent C++ but will break under older implementations. Further
it means that if you declare the loop counter in the
initialisation of the loop and later break from that loop, you
cannot access the counter to see how many times the loop has
executed.
for (int i= ... )
{
...
}
...
for (int i= ... ) // Error under some older C++
If you are porting code from an older compiler to one
that conforms to the new standard, you may need to move the declaration of each
loop variable to outside the loop.
int i;
for (i= ... )
{
...
}
...
i=32;
- As with C, a char may be signed or unsigned
in C++, depending on the compiler. So if it matters, specify it.
- sizeof isn't polymorphic. To put it another way,
it is a compile time operator. So if you apply it to an object
through a base class pointer or reference, it
will give you the size of the base class. Obvious but bound to
cause some problems. Also sizeof('x') is equal to
sizeof(char) in C++ and sizeof(int) in C.
- When deleting an array, watch out for using delete my_arr
instead of delete [] my_arr. The compiler won't pick it up, and your
program will cheerily run right up until the point it dies horribly.
- Although in trivial examples, and older compilers, you can
mix the use of new/delete and
malloc/free, this is most definitely not
advised and forbidden by the standard.
- C++ reserves the right to do float
arithmetic in single precision. This is compiler dependent, so
be aware that you may not get the mathematics you expect.
- A real mindbender is the difference between const type*
id_1 and type* const id_2. To spell it out : id_1
is a pointer to a constant value, id_2 is a pointer that is
constant.
- When reading from a file eof isn't set when you read the
last byte from a file, it's set when you attempt to read past
the last byte.
- delete ptr does not delete the pointer ptr but
the data it points to, *ptr.
- Recent additions to the C++ standard, bool, true
and false, are sure to break much existing code.
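Three of these gotchas (sizeof, the two kinds of const pointer, and eof) can
be seen in the sketch below. All of the names are hypothetical, and a string
stream stands in for a file so that the example needs no external data.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>

struct Shape
{
    virtual ~Shape();
    int id;
};

struct Circle : public Shape
{
    double radius;
};

Shape::~Shape() {}

// sizeof is evaluated at compile time from the static type, so this is
// always sizeof(Shape), even when 'shape' refers to a Circle.
std::size_t sizeThroughBase(const Shape& shape)
{
    return sizeof(shape);
}

int constPointerDemo()
{
    int first = 1;
    int second = 2;

    const int* id_1 = &first;   // pointer to a constant value
    id_1 = &second;             // allowed: the pointer itself can change
    // *id_1 = 5;               // error: cannot change the value via id_1

    int* const id_2 = &first;   // constant pointer
    *id_2 = 5;                  // allowed: the value can change
    // id_2 = &second;          // error: the pointer itself cannot change

    return *id_1 + *id_2;       // second + first, i.e. 2 + 5
}

// Reading the last byte does not set eof ...
bool eofSetByReadingLastByte()
{
    std::istringstream in("ab");
    char c;
    in.get(c);
    in.get(c);          // reads the last byte
    return in.eof();
}

// ... only an attempt to read past it does.
bool eofSetByReadingPastEnd()
{
    std::istringstream in("ab");
    char c;
    in.get(c);
    in.get(c);
    in.get(c);          // fails: nothing left to read
    return in.eof();
}
```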
Miscellanea
There are some other points on assorted topics to take note of :
- Always explicitly type the return value of functions (by
default it is int) and make sure that all non-void
functions explicitly return a value and don't simply fall through
at the end of the function body. Although ARM [3] states that
this is illegal, it contains this mistake in its own example
code and some compilers will not pick it up.
- Don't expect someone else to clean up after you. If
your function or module allocates memory, it should dispose of
it. (This is as a general principle - there are certain cases,
say with shared data, where this rule can be broken.)
- Just because C++ allows declaration of variables to occur
anywhere within a block, doesn't mean that it has to occur
anywhere in a block. Put declarations where it's most logical and useful.
- Stroustrup is heavily committed to ridding C++ of the
preprocessor [9]. His case is probably overstated but minimal use of
the preprocessor is a good thing to aim for. It avoids
introducing some bugs (as in #define square(y) y * y and
square(a + b)) and use of const values instead of
#define will make for easier debugging, prevent
redefinition and allow stronger typechecking. Macro
functions used for optimisation can be largely replaced by inline
functions with the caveats as above.
- C++ is a big language and there is usually more than one way
to do everything (e.g. polymorphism vs. templates, and default
parameters vs. overloading, largely do the same thing). Don't
fall for featurism. Get comfortable with a subset of the
language and get good at that. Exploit the power of the language
and continue to learn about its dustier corners but if you don't
understand a feature, think of a different way to do what you
want.
- As a general rule, C++ is much more greedy than C in terms
of system resources, compile time and code size.
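The preprocessor point can be made concrete with a short sketch. The macro
is the one quoted above (renamed SQUARE_MACRO here); the constant kMaxItems
and the wrapper functions are hypothetical.

```cpp
#include <cassert>

#define SQUARE_MACRO(y) y * y   // the buggy macro from the text

// A typed replacement for a #define constant, visible to the debugger.
const int kMaxItems = 10;

// The argument is evaluated once, before the multiplication.
inline int square(int y)
{
    return y * y;
}

// Expands to a + b * a + b, which is not the square of (a + b).
int macroSquareOfSum(int a, int b)
{
    return SQUARE_MACRO(a + b);
}

int inlineSquareOfSum(int a, int b)
{
    return square(a + b);
}
```

With a = 2 and b = 3 the macro version yields 2 + 3 * 2 + 3 = 11, while the
inline function yields 25.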
Acknowledgments
Thanks for help during the preparation of this paper are
extended to: John Carey and Greg Bond (Burdett, Buckeridge and
Young), Michael Paddon (Australian Computing and Communications
Institute).
Bibliography
[1] Al Stevens (1995) "Teach Yourself ... C++" Henry Holt &
Company Inc.
[2] Bjarne Stroustrup (1991) "The C++ Programming Language,
2nd Edition", Addison-Wesley.
[3] Margaret A. Ellis and Stroustrup (1990) "The Annotated C++
Reference Manual", Addison Wesley.
[4] Gregory Satir & Doug Brown (1995), "C++ the Core Language",
O'Reilly & Associates.
[5] Marshall Cline & Henry Lomow (1994) "C++ the FAQs", Addison Wesley.
[6] Steve C. McConnell (1993), "Code Complete", Microsoft Press.
[7] Dan Saks and Thomas Plum (1991), "C++ Programming Guidelines",
Plum Hall.
[8] Cay Horstmann (1995) "Mastering Object Oriented Design in
C++", John Wiley.
[9] Bjarne Stroustrup (1994) "The Design and Evolution of C++".
[10] Scott Meyers (1992) "Effective C++", Addison-Wesley.
[11] Erich Gamma, Richard Helm, Ralph Johnson and John
Vlissides (1995) "Design Patterns", Addison-Wesley, ISBN 0-201-63361-2.
[12] P.J. Plauger (1993) "Programming Languages Guessing Games",
Dr Dobbs Oct, page 16.
[13] Steven Sinofsky (1992) "Designing C++ Classes", Dr Dobbs,
November, p52.
[14] Tom Cargill (1992) "C++ Programming Style", Addison-Wesley.
[15] Stan Lippman (1991) "C++ Primer", Addison-Wesley.
[16] Stephen Dewhurst and Kathy Stark (1995) "Programming in
C++", Prentice-Hall.
[17] James Coplien (1992) "Advanced C++ Programming Styles
and Idioms", Addison-Wesley.
[18] Steve Oualline (1995) "Practical C++ Programming",
O'Reilly.
[19] Bertrand Meyer (1988) "Object-Oriented Software Construction",
Prentice-Hall.
[20] Ellemtel Telecommunication Systems Laboratories
(1990-1992) "Programming in C++, Rules and Recommendations",
<http://www.cs.huji.ac.il/course/plab/rules.txt>
[21] Joan McGalliard, (1994) "The Genderization of Unix",
Open Systems Review, August, p41-44.
[22] comp.lang.c++ FAQ, <http://www.connobj.com/cpp/cppfaq.htm>
[23] comp.lang.c++ FAQ: ASCII version,
<ftp://rtfm.mit.edu/pub/usenet-by-hierarchy/comp/lang/c++/>
[24] "C++ Virtual Library",
<http://info.desy.de/user/projects/C++.html>
[25] "Learning C++",
<http://info.desy.de/user/projects/C++/Learning.html>
[26] "Learn C/C++ Today", <http://vinny.csd.mu.edu/learn.html>
[27] "C++ Annotations",
<http://www.icce.rug.nl/docs/cplusplus/cplusplus.html>
[28] "Understanding C++: An Accelerated Introduction",
<http://www.iftech.com/classes/cpp/cpp0.htm>
[29] PJ Plauger (1994) "The C++ Standard Library", Prentice
Hall.
[30] "C++ Draft Standard"
<http://www.cygnus.com/misc/wp/index.html>
[31] Taligent (1995) "Taligent's Guide To Designing
Programs: Well-Mannered Object-Oriented Design in C++",
Addison-Wesley.
[32] Taed Nelson (1993) "Finding Run-time Memory Errors", Dr
Dobbs, November, p34.
[33] Grady Booch, (1991) "Object-Oriented Design With Applications",
Benjamin/Cummings
Document last revised 96.2.17