C++ Pitfalls

Cay S. Horstmann

Department of Computer Science

San Jose State University

San Jose, CA 95192-0249

`cay\x40horstmann.com`

What is a pitfall?

C++ code that

compiles
links
runs
does something different than you expect

Example:

             if (-0.5 <= x <= 0.5) return 0;

Pitfall:

              if (-0.5 <= x <= 0.5) return 0;

This expression doesnot test the mathematical condition

              -1.5 <= x <= 1.5

Instead, it first computes -0.5 <= x, which is 0 or 1, and then compares the result with 0.5.

Moral: Even though C++ now has a bool type, Booleans are still freely convertible to int.

Since bool->int is allowed as a conversion, the compiler cannot check the validity of expressions. In contrast, the Java compiler would flag this statement as an error.

Constructor pitfalls

Example:

              int main()
              {  string a("Hello");
                 string b();
                 string c = string("World");
                 // ...
                 return 0;
              }

Pitfall:

              string b();

This expression does not construct an object b of type string. Instead, it is the prototype for a function b with no arguments and return type string.

Moral: Remember to omit the ( ) when invoking the default constructor.

The C feature of declaring a function in a local scope is worthless since it lies about the true scope. Most programmers place all prototypes in header files. But even a worthless feature that you never use can haunt you.

Example:

              template<typename T>
              class Array
              {
              public:
                 Array(int size);
                 T& operator[](int);
                 Array<T>& operator=(const Array<T>&);
                 // ...
              }; 

              int main()
              {  Array<double> a(10);
                 a[0] = 0; a[1] = 1; a[2] = 4; 
                 a[3] = 9; a[4] = 16;
                 a[5] = 25; a = 36; a[7] = 49; 
                 a[8] = 64; a[9] = 81;
                 // ...
                 return 0;
              }

Pitfall:

              a = 36;

Surprisingly, it compiles:

              a = Array<double>(36);

a is replaced with a new array of 36 numbers.

Moral: Constructors with one argument serve double duty as type conversions.

Avoid constructors with a single integer argument! Use the explicit keyword if you can't avoid them.

Example:

              template<typename T>
              class Array
              {
              public:
                 explicit Array(int size);
                 // ...
              private:
                 T* _data;
                 int _size;
              }; 

              template<typename T>
              Array<T>::Array(int size)
              :  _size(size), 
                 _data(new T(size))
              {}

              int main()
              {  Array<double> a(10);
                 a[1] = 64; // program crashes
                 // ...
              }

Pitfall:

              template<typename T>
              Array<T>::Array(int size)
              :  _size(size), 
                 _data(new T(size)) // should have been new T[size]
              {}

Why did it compile?

              new T(size)

returns a T* pointer to a single element of type T, constructed from the integer size.

              new T[size]

returns a T* pointer to an array of size objects of type T, constructed with the default constructor.

Moral: Array/pointer duality is dumb, but unfortunately pervasive in C and C++.

The Java compiler would catch this--Java, like most programming languages, supports genuine array types.

Example:

              template<typename T>
              class Array
              {
              public:
                 explicit Array(int size);
                 // ...
              private:
                 T* _data;
                 int _capacity;
                 int _size;
              }; 

              template<typename T>
              Array<T>::Array(int size)
              :  _size(size), 
                 _capacity(_size + 10), 
                 _data(new T[_capacity])
              {}


             int main() 
             {  Array<int> a(100);
                . . .
                // program starts acting flaky
             }

Pitfall:

              Array<T>::Array(int size)
              :  _size(size), 
                 _capacity(size + 10), 
                 _data(new T[_capacity])
              {}

Initialization follows the member declaration order, not the initializer order!

          Array<T>::Array(int size)
              :  _data(new T[_capacity])
                 _capacity(_size + 10), 
                 _size(size),

Tip: Do not use data members in initializer expressions.

          Array<T>::Array(int size)
              :  _data(new T[size + 10])
                 _capacity(size + 10), 
                 _size(size),

Example:

              class Point
              {  
              public:
                 Point(double x = 0, double y = 0);
                 // ...
              private:
                 double _x, _y;
              }; 
              int main()
              {  double a, r, x, y;
                 // ...
                 Point p = (x + r * cos(a), y + r * sin(a));
                 // ...
                 return 0;
              }

Pitfall:

              Point p = (x + r * cos(a), y + r * sin(a));

This should be either

              Point p(x + r * cos(a), y + r * sin(a));

              Point p = Point(x + r * cos(a), y + r * sin(a));

The expression

              (x + r * cos(a), y + r * sin(a))

has a legal meaning. The comma operator discards x + r * cos(a) and evaluatesy + r * sin(a). The

              Point(double x = 0, double y = 0)

constructor makes a Point(y + r * sin(a), 0).

Moral: Default arguments can lead to unintended calls. In our case, the constructionPoint(double) is not reasonable, but the constructionPoint() is. Only use defaults if all resulting call patterns are meaningful.

Example:

              class Shape
              {  
              public:
                 Shape();
              private:
                 virtual void reset();
                 Color _color;
              }; 

              class Point : public Shape
              {  
              public:
                 // ...
              private:
                 double _x, _y;
              }; 

              void Shape::reset() { _color = BLACK; }

              void Point::reset() 
              {  Shape::reset();
                 _x = 0; _y = 0;
              }  

              Shape::Shape() { reset(); }

There is noPoint constructor--we use the virtual function in the Shape constructor.

Pitfall:

              Shape::Shape() { reset(); }
              Point p;

When constructingPoint, the Shape::reset(), not the Point::reset() virtual function is called. Why?

Explanation: Virtual functions do not work in constructors.

TheShape subobject is constructed before thePoint object. Inside the Shape constructor, the partially constructed object is still a Shape.

Example:

              class Shape // an abstract class
              {  
              public:
                 Shape();
              private:
                 void init();
                 virtual void reset() = 0;
                 Color _color;
              }; 

              Shape::Shape() { init(); }
              void Shape::init() { reset(); } 

              class Point : public Shape // a concrete derived class
              {  
              public:
                 virtual void reset();
                 // ...
              private:
                 double _x, _y;
              };

              void Point::reset() { _x = _y = 0; }

Pitfall:

              int main()
              {  Point p; // program crashes
                 return 0;
              }

Explanation: You cannot create an instance of an abstract class (a class with a pure, = 0, virtual function).

         Shape s; // compile-time error; Shape is abstract

That's a good thing: if you could, what would happen if you called

         s.reset(); // reset not defined for shapes

But...I lied. You can create instances of abstract classes.

When constructing a concrete derived, for a fleeting moment, the base class exists. If you invoke a pure virtual function before the derived class constructor has executed, the program terminates.

Destructor pitfalls

Example:

              class Employee
              {
              public:
                 Employee(string name);
                 virtual void print() const;
              private:
                 string _name;
              };

              class Manager : public Employee
              {
              public:
                 Manager(string name, string dept);
                 virtual void print() const;
              private:
                 string _dept;
              };

              int main()
              {  Employee* staff[10];
                 staff[0] = new Employee("Harry Hacker");
                 staff[1] = new Manager("Joe Smith", "Sales");
                 // ...
                 for (int i = 0; i < 10; i++) 
                    staff[i]->print();
                 for (int i = 0; i < 10; i++) 
                    delete staff[i];
                 return 0;
              }

Where is the memory leak?

Pitfall:

              delete staff[i];

destroys all objects with~Employee(). The_dept strings of the Manager objects are never destroyed.

Moral: A class from which you derive must have a virtual destructor.

Example:

              class Employee
              {
              public:
                 Employee(string name);
                 virtual void print() const;
                 virtual ~Employee(); // <-----
              private:
                 string _name;
              };

              class Employee
              {
              public:
                 Employee(string name);
              private:
                 string _name;
              };

              class Manager
              {
              public:
                 Manager(string name, string sname);
                 ~Manager();
              private:
                 Employee* _secretary;
              }

              Manager::Manager(string name, string sname)
              :  Employee(name),
                 _secretary(new Employee(sname))
              {} 

              Manager::~Manager() { delete _secretary; }

What is wrong with the Manager class?

Pitfall:

              int main()
              {  Manager m1 = Manager("Sally Smith", 
                    "Joe Barnes");
                 Manager m2 = m1;
                 // ...
              }

The destructors of both m1 and m2 will delete the same Employee object.

Moral: A class with a destructor needs a copy constructor

              Manager::Manager(const Manager&)

and an assignment operator

              Manager& Manager::operator=(const Manager&).

The Big 3: It's not just a good idea--it's the law (Marshall Cline)

Inheritance pitfalls

Example:

              class Employee
              {
              public:
                 Employee(string name, string dept);
                 virtual void print() const;
                 string dept() const;
              private:
                 string _name;
                 string _dept;
              };

              class Manager : public Employee
              {
              public:
                 Manager(string name, string dept);
                 virtual void print() const;
              private:
                 // ...
              };

              void Employee::print() const
              {  cout << _name << endl;
              }  

              void Manager::print() const
              {  print(); // print base class
                 cout << dept() << endl;
              }

Pitfall:

              void Manager::print() const
              {  print(); // print base class
                 cout << dept() << endl;
              }

Despite what the comment says,print() selects the print operation of theManager class. In contrast, dept() selects the operation of theEmployee class sinceManager does not redefine it.

Moral: When calling a base class operation in a derived class operation of the same name, use scope resolution:

              void Manager::print() const
              {  Employee::print(); // print base class
                 cout << dept() << endl;
              }

Example:

              void Manager::print() const
              {  Employee:print(); // print base class
                 cout << dept() << endl;
              }

Pitfall:

              Employee:print();

It should be

              Employee::print();

But why does it compile?Employee: is a goto label!

Moral:Even language features that you never use can bite you!

Example:

              class Employee
              {
              public:
                 void raise_salary(double by_percent);
                 // ...
              };

              class Manager : public Employee
              {
              public:
                 // ...
              };

              void make_them_happy(Employee* e, int ne)
              {  for (int i = 0; i < ne; i++)
                    e[i].raise_salary(0.10);
              }     

              int main()
              {  Employee e[20];
                 Manager m[5];
                 m[0] = Manager("Joe Bush", "Sales");
                 // ...
                 make_them_happy(e, 20);
                 make_them_happy(m + 1, 4); // let's skip Joe
                 return 0;
              }

Pitfall:

              void make_them_happy(Employee* e, int ne);
              Manager m[5];
              make_them_happy(m + 1, 4);

Why does it compile?

The type of m + 1 isManager*. Because of inheritance, a Manager* is convertible to an Employee* base class pointer. make_them_happy receives anEmployee*. Everyone is happy.

What is the problem?

The array computation e[i] computes an offset of i*sizeof(Employee).

Moral: Pointers are overused in C++. Here we see two interpretations of anEmployee* e.

e points to either anEmployee or a derived class object, such as a Manager.
e points to either anEmployee or a bunch ofEmployee objects, stacked up in an array.

These two interpretations are incompatible. Mixing them leads to runtime errors. However, the intention of the programmer is hidden to the compiler since both ideas are expressed by the same construct--a pointer.

Example:

              class Employee
              {
              public:
                 Employee(char name[]);
                 Employee(const Employee& b);
                 ~Employee();
                 Employee& operator=(const Employee& b);
                 . . .
              private:
                 char* _name;
              };

              class Manager : public Employee
              {
              public:
                 Manager(char name[], char dept[]);
                 Manager(const Manager& b);
                 ~Manager();
                 Manager& operator=(const Manager& b);
                 . . .
              private:
                 char* _dept;
              };

              Manager::Manager(const Manager& b)
              : _dept(new char[strlen(b._dept) + 1])
              {  strcpy(b._dept, _dept);
              }

              Manager::~Manager()
              {  delete[] _dept;
              }

              Manager& Manager::operator=(const Manager& b)
              {  if (this == &b) return *this;
                 delete[] _dept;
                 _dept = new char[strlen(b._dept) + 1];
                 strcpy(b._dept, _dept);
                 return *this;
              }

Pitfall:

              Manager& Manager::operator=(const Manager& b)
              {  if (this == &b) return *this;
                 delete[] _dept;
                 _dept = new char[strlen(b._dept) + 1];
                 strcpy(b._dept, _dept);
                 return *this;
              }

Constructors and destructors automatically call the base constructors and destructors. But operator= does not automatically invoke the operator= of the base class.

Moral: When redefining operator= in a derived class, explicitly call operator= of the base class:

              Manager& Manager::operator=(const Manager& b)
              {  if (this == &b) return *this;
                 Employee::operator=(b);
                 delete[] _dept;
                 _dept = new char[strlen(b._dept) + 1];
                 strcpy(b._dept, _dept);
                 return *this;
              }

Thanks to Johann Deneux for pointing out another pitfall: The copy constructor for Manager is wrong. It wants to call the default constructor for Employee, but there isn't one. And, of course, it wouldn't be appropriate to call it if there was one. A corrected version is


              Manager::Manager(const Manager& b)
              : Employee(*this), _dept(new char[strlen(b._dept) + 1])
              {  strcpy(b._dept, _dept);
              }

Stream pitfalls

Example:

              list<int> a;
              while (!cin.eof())
              {  int x;
                 cin >> x;
                 if (!cin.eof()) a.push_back(x);
              }

Pitfall:

              while (!cin.eof())
              {  // ...
              }

This may be an infinite loop. If the stream state turns to fail, the end of file will never be reached.

The stream state will be set to fail if a non-digit is encountered when trying to read an integer.

Moral: eof() is only useful in combination with fail(), to find out whether EOF was the cause for failure

Example:

              while (cin.good())
              {  int x;
                 cin >> x;
                 if (cin.good()) a.push_back(x);
              }

Pitfall:

                 cin >> x; // <--- may succeed and then encounter EOF
                 if (cin.good()) a.push_back(x);

This codemay miss the last element in the input file, if it is directly followed by EOF.

Remedy: Use fail():

              while (!cin.fail())
              {  int x;
                 cin >> x;
                 if (!cin.fail()) a.push_back(x);
              }

The type conversion basic_ios ----> void* is identical to !fail():

              while (cin)
              {  int x;
                 cin >> x;
                 if (cin) a.push_back(x);
              }

Moral: There are four stream test functions:good(), bad(), eof(), andfail(). (Note thatbad() doesnot mean !good().) Only one of them is useful: fail().

Overloading pitfalls

Example:

              class Complex
              {
              public:
                 Complex(double = 0, double = 0);
                 Complex operator+(Complex b) const;
                 Complex operator-(Complex b) const;
                 Complex operator*(Complex b) const;
                 Complex operator/(Complex b) const;
                 Complex operator^(Complex b) const;
                 // ...
              private:
                 double _re, _im;
              }; 

              int main()
              {  Complex i(0, 1);
                 cout << i^2 + 1; // i*i  is -1
                 return 0;
              }

Why won't it print (0,0)?

Pitfall:

              cout << i^2 + 1;

Using the C/C++ operator precedence rules, we can add parentheses:

              cout << (i ^ (2 + 1));

The ^ operator is weaker than + (but stronger than <<).

Moral: You cannot change the operator precedence when overloading operators. Do not overload an operator if its precedence is not intuitive for the problem domain.

The precedence of ^ is fine for XOR but not for raising to a power.

Example: The stream classes support a type conversion basic_ios ----> void* for testing if a stream is happy:

              while (cin)
              {  int x;
                 cin >> x;
                 // ...
              }

Why convert to void*? A conversion to bool would seem to make more sense.

         
              template<typename C, typename T = char_traits<C> >     
              class basic_ios
              {  
              public:
                 // ...
                 operator bool() const
                 {  if (fail()) return false; 
                    else return true;
                 }  
              private:
                 // ...
              };

Pitfall:

              while (cin)
              {  int x;
                 cin << x;
                 // ...
              }

Note the typo--it should be cin >> x.

Butcin << x has an unintended meaning:cin.operator bool(), converted to an int and shifted by x bits.

Moral: Use conversion tovoid*, not conversion to int or bool, to implement objects yielding truth values. Unlike int or bool, void* have no legal operations other than == comparison.

Example: An array class with a [] operator that grows the array on demand

              class Array
              {  
              public:
                 Array();
                 ~Array();
                 Array(const Array&);
                 Array& operator=(const Array&);
                 int& operator[](int);
              private:
                 int _n; // current number of elements
                 int* _a; // points to heap array
              }; 

              int& Array::operator[](int i)
              {  if (i > _n)
                 {  int* p = new int[i + 1];
                    for (int k = 0; k < _n; k++) 
                       p[k] = _a[k];
                    for (; k < i; k++) p[k] = 0;
                    delete[] _a;
                    _a = p;
                    _n = i;
                 }
                 return _a[i];
              }  

              int main()
              {  Array a;
                 for (int s = 1; s <= 100; s++)
                    a[s] = s * s;
                 return 0;
              }

Pitfall:

              void swap(int& x, int& y)
              {  int temp = x;
                 x = y;
                 y = temp;
              }  

              int main()
              {  Array a;
                 a[3] = 9;
                 swap(a[3], a[4]);
                 return 0;
              }

The swap function gets references to a[3], then to a[4], but the second computation moves the array and invalidates the first reference!a[4] is swapped with a wild reference.

Moral: You cannot simultaneously relocate a memory block and export a reference to it.

Either make [] not grow the array, or use a data structure in which elements never move (i.e. a sequence of chunks, such as in std::deque).

Exception pitfalls

Example:

              void read_stuff(const char filename[])
              {  FILE* fp = fopen(filename, "r");
                 do_reading(fp);
                 fclose(fp);
              }

Why is that an "exception pitfall"? There aren't any exceptions in the code!

Pitfall:

FILE* fp = fopen(filename, "r"); 
do_reading(fp); 
fclose(fp); // <-- may never get here

If do_reading throws an exception, or calls a function that throws an exception, it never comes back!fp is never closed.

Moral: Exception handling drastically alters control flow. You cannot take it for granted that a function ever returns.

Remedy 1: (popular but dumb)

              void read_stuff(const char filename[])
              {  FILE* fp = fopen(filename, "r");
                 try 
                    do_reading(fp);
                 catch(...)
                 {  fclose(fp);
                    throw;
                 }
                 fclose(fp);
              }

Remedy 2: (smart)

              void read_stuff(const char filename[])
              {  fstream fp(filename, ios_base::in);
                 do_reading(fp);
              }

Even ifdo_reading throws an exception, fp is closed by the ifstream destructor.

Moral: In code with exception handling (i.e. all C++ code starting in 1994), relinquish resources only in destructors!

Example:

              double find_salary_increase(auto_ptr<Employee>);

              void do_stuff(const char name[])
              {  auto_ptr<Employee> pe = new Employee(name);
                    // can't use 
                    // Employee* pe = new Employee(name)
                    // that's not not exception safe
                 double rate = find_salary_increase(pe);
                 pe->raise_salary(rate);
              }

Pitfall:

            find_salary_increase(pe);

invokes the copy constructor of auto_ptr<Employee> which transfers ownership to the copy.

Only one auto_ptr can own a heap object. The owning auto_ptr calls the destructor when it goes out of scope.

Remedy: Don't copy an auto_ptr into a function.

              double find_salary_increase(Employee*);

              void do_stuff(const char name[])
              {  Employee* pe = new Employee(name);
                 auto_ptr<Employee> ape = pe; // use only for destruction
                 double rate = find_salary_increase(pe);
                 pe->raise_salary(rate);
              }

Container pitfalls

Example: A set of pointers

              set<Employee*> staff;
              vector<string> names;
              for (int i = 0; i < names.size(); i++)
                 staff.insert(new Employee(names[i]);

Pitfall: Ordered containers (set, map, multiset, multimap) use < for comparison. It is assumed that < is a total ordering.

[NOTE: Thanks to Liam Devine and Richard Smith for pointing out that this out of date. Of course, the containers really use less<T> for comparison, and The C++ standard Section 20.3.3.8 states: "For templates greater, less, greater_equal, and less_equal, the specializations for any pointer type yield a total order, even if the built-in operators <, >, <=, >= do not." When I wrote this, segmented memory models were still widely used, and the C++ standard was still a work in progress :-)]

In a set of pointers

              set<Employee*> staff;

the pointers are compared with <.

Given two arbitrary Employee* pointers p and q, is p < q defined? Only if they point to the same array.

In a segmented memory model, only offsets are compared. Ex. p == 0x740A0004 and q == 0x7C1B0004 compare identical.

Remedy: (risky) Only write code for a flat memory space where pointer word size == integer word size and comparison happens to be a total ordering.

Remedy: (tedious) Supply an ordering:

              bool employee_ptr_less(const Employee* a, const Employee* b)
              {  return a->salary() < b->salary();
              }

              set<Employee*, bool (*)(const Employee*, const Employee*)>
                 staff(employee_ptr_less);

Example: Runaway iterators

       list<int> a, b;
       // ...

       list<int>::iterator p
             = find(a.begin(), b.end(), 100); 
       if (p != a.end()) b.push_back(*p);

Pitfall:

             find(a.begin(), b.end(), 100); // oops, should have been a.end()

To see why this code crashes dramatically, look at the implementation of find:

       template<typename I, typename T>
       I find(I from, I to, const T& target)
       {  while (from != to && *from != target)
             ++from;
                 return from; 
       }

When from reaches a.end(), *from and ++from are undefined.

Moral: Iterators don't know their state. There is no reason why a list<int> iterator couldn't know its state, but STL was built with the objective to make iterators no more powerful than pointers into a C array. Benefit: You can call the standard algorithms with C arrays:

       int a[30];
       int* p = find(a, a + 30, 100);

Drawback: Programming with iterators is pointer-oriented, not object-oriented.

Remedy: Use Safe STL (http://www.horstmann.com/safestl.html)

Example: Muddled iterators

       list<int> a;
       list<int> b;
       list<int>::iterator p = a.begin();

       a.insert(50);
       b.insert(100);

       b.erase(p);
       cout << a.length() << endl; // length is 1
       cout << b.length() << endl; // length is 0

Pitfall: In

       b.erase(p);

the iterator p pointed inside a! The behavior is undefined, but the standard STL implementation does the following:

*p is erased from whatever list it happens to be in, just by following the forward and backwards link
the length of b is decremented

Moral: Iterators don't know their owner.

Remedy: Use Safe STL (http://www.horstmann.com/safestl.html)