Introduction to C++

This course uses C++ as the language in which we explore fundamental concepts in programming. Here, we provide a basic overview of C++ for students who are familiar with other languages such as Python or Java.

A Simple Program

Let’s start with a simple program that prints Hello world! to the screen:

#include <iostream>

int main() {
  std::cout << "Hello world!" << std::endl;
  return 0;
}

The entry point of a C++ program is the main() function, which is a top-level (global) function that has a signature of the form int main() or int main(int argc, char *argv[]). A function signature consists of a return type, the name of the function, and a list of parameters. The parameters are what the function takes as input, and the return type is the kind of value that the function returns. The main() function can either have no parameters, or it can have parameters that allow it to take command-line arguments, which are specified when the program is run. For simplicity, we start with a main() that does not have parameters.

Following the signature of the function, we have the function body, which is the code that gets executed when the function is called. The body itself is comprised of statements, which are executed in order from top to bottom. In our program above, the body of main() has two statements: one that prints to the screen, and a second that exits the function with a return value of 0.

Examining the first statement more closely, we see that it is composed of expressions and operators. An expression is a fragment of code that evaluates to some value. The simplest expressions are literals, which hardcode a specific value in the program. For instance, "Hello world!" is a string literal that denotes the sequence of characters H, e, and so on. A name is also an expression, and what it evaluates to depends on what the name is bound to in the environment in which the name is evaluated. In the program above, std::cout is bound to an object representing the standard output stream (stdout), which allows printing to the console (also referred to as shell or terminal) in which the program is run. Similarly, std::endl is bound to an object that causes a newline to be printed when inserted into an output stream. Finally, these expressions are connected via the binary << operator; in this context, we refer to this as the stream-insertion operator. The code inserts the string "Hello world!" into the standard output stream, and then inserts std::endl to obtain a newline.

Both std::cout and std::endl are defined in the iostream library header. The first line of the program (#include <iostream>) instructs the C++ compiler to include the contents of the iostream library header, which makes std::cout and std::endl available for us to use.

A name such as std::cout is called a qualified name – it consists of the std qualifier, followed by the :: scope-resolution operator, followed by the cout name. The std qualifier refers to the std namespace, which is a scope in which most standard-library entities are defined. Thus, the qualified name std::cout refers to the cout that is defined within the std namespace, as opposed to some other cout that might be defined in a different scope. Often, we place a directive in our program to be able to use an entity such as cout without qualifying it with std::. The following does so for both cout and endl:

using std::cout;   // we can now use cout without qualification
using std::endl;   // we can now use endl without qualification

The following does so for every entity defined in the std namespace:

using namespace std; // we can now use anything in std without qualification

Use this with caution, however – there are lots of names defined in the std namespace, so this makes it much more likely for our names to conflict with those from the standard library. (In particular, a using namespace directive should never be used in a header file.)

The last statement in our program is return 0;. This returns the value 0 from the main() function, which conventionally indicates that the program completed successfully. We would use a nonzero value instead to indicate an error if something went wrong.

In the specific case of returning from main(), a return statement is actually optional. (This is not the case for returning from other functions, unless they have a void return type.) If we don’t have a return statement in main(), the compiler implicitly adds a return 0; for us. Thus, the program above could be written equivalently as:

#include <iostream>

using std::cout;   // we can now use cout without qualification
using std::endl;   // we can now use endl without qualification

int main() {
  cout << "Hello world!" << endl;
}  // implicit return 0; added by the compiler

Now that we have examined all the pieces of our simple program, let’s compile and run it in a shell. Assuming the code is in the file hello.cpp, we can compile and run it as follows:

$ g++ -std=c++17 -o hello.exe hello.cpp
$ ./hello.exe
Hello world!

Here, we use the $ symbol to denote the shell prompt, which is displayed by the shell to indicate that it is waiting for our commands. (On most machines, the prompt is more complicated, but we simplify it to just a dollar sign.) We then type the compilation command g++ -std=c++17 -o hello.exe hello.cpp, which invokes the g++ compiler. The -std=c++17 tells the compiler to use the C++17 version of the language. The -o hello.exe tells the compiler to use hello.exe as the name of the resulting executable file. Finally, hello.cpp is the filename of our C++ program. When the compiler has finished, we type ./hello.exe to run the resulting executable, and we see the message Hello world! printed to the screen.

In this course, we encourage using both the shell as well as an integrated development environment (IDE). Refer to this tutorial for how to set up and become familiar with using a shell and IDE on your machine.

Static Typing

We saw above that the signature of the main() function includes the return type, which is int in the case of main(). The reason the return type is included is that C++ is a statically typed language, meaning that every value’s type must be known at compile time. To do so, the compiler generally requires us to specify the type of a variable, as well as the parameter types and return type of a function. For instance, the following introduces a local variable x with type int and initial value 3:

int x = 3;

The syntax for defining a variable begins with the type of the variable, followed by the name, followed by an optional initialization. (If no initialization is provided, the variable undergoes default initialization. For primitive types such as int, that means the initial value of the variable is undefined.) The int type is a primitive type, which is a category consisting of the simplest types provided by a language. Common primitive types in C++ include:

int: signed (positive, negative, or zero) integer values in some finite range, typically $[-2^{31}, 2^{31}-1] = [-2147483648, 2147483647]$ on most machines
std::size_t: unsigned (positive or zero only) integer values in some finite range, typically $[0, 2^{64}-1]$
double: double-precision floating-point values, typically using a 64-bit representation
bool: a boolean value, either true or false
char: a character value, generally representing at least the 128 values in the ASCII standard
void: used as the return type of a function to indicate that it does not return a value

As another example, let’s define a square() function that computes the square of a number. We’ll write it to take a double value as input and return a double value as the result:

double square(double x) {
  double result = x * x;
  return result;
}

As with main(), the signature for square() starts with the return type, which is double. Then we have the name of the function, followed by the parameter list. We take in a single double value as input, so we have a single parameter, which we are calling x. In the body of the function, we define a new variable result that is initialized with the value of x * x, and we then return the value of the variable.

Since we only use the variable once, we can simplify our function by eliminating it and using the expression x * x directly where we need it:

double square(double x) {
  return x * x;
}

Since an expression evaluates to a value, it has a type corresponding to that value. In the case of x * x, its type is double since it is the product of two double values. Thus, the return statement returns a double value, matching the declared return type of the function.

We can now invoke square() as follows:

#include <iostream>

double square(double x) {
  return x * x;
}

int main() {
  std::cout << square(3.14) << std::endl;
  double x = square(4);
  std::cout << square(x) << std::endl;
}

The square() function must be declared above where we use it in main() – in C++, the scope of a global function or variable is from the point of its declaration until the end of the file. (The scope of an entity is the region of code where it may be used.) We can then invoke the function in main().

Observe that in the definition

double x = square(4);

we invoke square() on the literal 4, which actually has type int rather than the declared double type of the parameter of the square() function. In most cases, C++ automatically converts between int and double, so that we can use a value of one of these types where the other is expected. On the other hand, a call such as square(std::cout) is erroneous, since there is no conversion between the type of std::cout (which happens to be std::ostream) and the required double type.

Compound Data

In addition to primitive types, C++ has data types that represent more complex objects. For instance, the std::ostream type (defined in the iostream header) represents an output stream, and it is the type of std::cout. The std::string type is another common data type that represents string objects, and it is defined in the string header:

#include <iostream>
#include <string>

int main() {
  std::string s = "Hello world!";
  std::cout << s << std::endl;
}

Often, we want to keep track of a collection of objects of the same type, such as an arbitrary number of integer values. We can use a std::vector to do so, which is an example of a template that is parameterized by some other type. For instance, std::vector<int> is a collection of int values, while std::vector<std::string> is a collection of std::string values. The following demonstrates how to use a vector:

#include <iostream>
#include <vector>

int main() {
  std::vector<int> scores = { 84, 91, 77, 95, 83 };
  for (std::size_t i = 0; i < scores.size(); ++i) {
    std::cout << "Score " << i << " = " << scores[i] << std::endl;
  }
}

In this code, we define a scores variable of type std::vector<int>, and we provide an initial set of values as a comma-separated list enclosed by curly braces. We then iterate over the indices of the vector, which start at 0 and end at one less than the size of the vector, which we obtain via the expression scores.size(). We access an element of the vector using square brackets, with the vector object on the left-hand side and the index between the brackets (scores[i]). Using this syntax, we have to be careful not to go past the end of the vector, which would result in undefined behavior, meaning that anything could happen – e.g. crashing the program, overwriting some other piece of data, stealing your files, or even nothing at all. Alternatively, we can do scores.at(i), which checks whether the index is in range and guarantees an error if it is not – the program crashes, but we get some useful information, and it definitely won’t steal our files.

We can also add elements to a vector after it has been created:

scores.push_back(42);

This appends the element 42 at the end of the vector. Similarly, the expression scores.pop_back() removes the element at the end of the scores vector (assuming there is at least one element in the vector; otherwise we get the dreaded undefined behavior).

Vectors are useful for ordered collections of data that all have the same type. However, sometimes we want to keep track of multiple pieces of data that have individual meanings, and that might even have different types. For instance, a complex number has a real part and an imaginary part; while both might be represented as doubles, we want to keep track of which part is which through a name rather than maintaining an ordering between them. A struct [1] gives us this ability to introduce a compound object, comprised of multiple pieces each with individual names. The following is an example of defining a new Complex type:

struct Complex {
  double real;
  double imaginary;
};

We start with the struct keyword, followed by the name of the type we are defining, followed by an open curly brace. We then specify the components of the struct, using syntax similar to variable declarations. These components are called member variables in C++ [2]. The struct definition is terminated by a closing curly brace and a semicolon. Once we have this definition, we can write a function to print out a Complex value:

void Complex_print(Complex number) {
  std::cout << number.real;
  if (number.imaginary >= 0) {
    std::cout << '+';
  }
  std::cout << number.imaginary << std::endl;
}

As shown above, we can access a member variable using the dot operator. The expression number.real accesses the real component of the number object, and number.imaginary refers to the imaginary component. The function prints the + character to separate the two components if the imaginary part is nonnegative – otherwise, the imaginary part would have a minus sign, so we wouldn’t want a + between the two components.

We can create objects of Complex type and print them as follows:

Complex c1 = { 3.14, -1.7 };
Complex c2;
c2.real = 2.72;
c2.imaginary = -4;
Complex_print(c1);
Complex_print(c2);

As the code demonstrates, we can use curly braces to initialize a Complex variable, which initializes the components in order: c1.real gets initialized to 3.14, and c1.imaginary gets initialized to -1.7. Alternatively, we can rely on default initialization as is the case for c2. However, this ends up initializing the two components to undefined values, so we need to replace their values before we can proceed to use them.

As another example, we define a struct to keep track of a person’s exam score. We need to keep track of the person’s name, as well as the score value itself:

struct Grade {
  std::string name;
  int score;
};

We can now use the Grade type as follows:

Grade g1 = { "Sofia", 99 };
Grade g2;
g2.name = "Amir";
g2.score = 23;
std::cout << g1.name << " earned a " << g1.score << std::endl;
std::cout << g2.name << " earned a " << g2.score << std::endl;

Value Semantics

One of the distinguishing features of a programming language is the relationship between variables and objects, which are pieces of data in memory. In some languages, a variable is directly associated with an object, so that using the variable always accesses the same object (as long as the variable is in scope). We say that the language has value semantics if this is the case. In other languages, a variable is an indirect reference to an object, and the variable may be modified to refer to a different object. This scheme is known as reference semantics.

C++ has value semantics, unlike other languages such as Java [3] or Python that primarily have reference semantics. We can illustrate the difference between these semantic choices using the Complex type we defined previously. Consider the following code:

#include <iostream>

struct Complex {
  double real;
  double imaginary;
};

void Complex_print(Complex number) {
  std::cout << number.real;
  if (number.imaginary >= 0) {
    std::cout << '+';
  }
  std::cout << number.imaginary << std::endl;
}

int main() {
  Complex c1 = { 3.14, -1.7 };
  Complex c2 = c1;
  c2.real = 2.72;
  Complex_print(c1);
  Complex_print(c2);
}

In this code, we define a variable c1 of type Complex and give it an initial value. We then copy it to a c2 variable. If we modify c2, the change does not affect c1, since the two variables each have their own object with which they are associated. In memory, this looks something like the picture in Figure 94.

Figure 94: Two variables corresponding to separate Complex objects in memory.

Compiling and running the program produces:

$ g++ -std=c++17 -o complex.exe complex.cpp
$ ./complex.exe
3.14-1.7
2.72-1.7

We see that as expected, the modification to c2 does not affect c1. Compare this to a similar Python program:

class Complex:
    def __init__(self, real, imaginary):
        self.real = real
        self.imaginary = imaginary

def Complex_print(number):
    print(number.real, end='')
    if number.imaginary >= 0:
        print('+', end='')
    print(number.imaginary)

c1 = Complex(3.14, -1.7)
c2 = c1
c2.real = 2.72
Complex_print(c1)
Complex_print(c2)

In memory, the variables look like the diagram in Figure 95.

Figure 95: Two variables indirectly referring to the same Complex object.

Running the program produces the following:

$ python3 complex.py
2.72-1.7
2.72-1.7

This confirms that c1 and c2 refer to the same object, since the modification to c2.real also affected c1.

We can also observe C++’s value semantics when we call a function. Suppose we want a function that modifies a Complex object to be its conjugate, flipping the sign of the imaginary component of the complex number. The following is an attempt to write this function:

void Complex_conjugate(Complex number) {
  number.imaginary = -number.imaginary;
}

Suppose we insert the following statement prior to the print statements in our program above:

Complex_conjugate(c1);

Compiling and running the code, we get:

$ g++ -std=c++17 -o complex.exe complex.cpp
$ ./complex.exe
3.14-1.7
2.72-1.7

Nothing changes! This is because the number parameter of the Complex_conjugate() function is its own object, and it receives a copy of c1’s value. The diagram in Figure 96 illustrates this in memory at the end of the Complex_conjugate() call.

Figure 96: A function that received a copy of an argument object.

The copy within the Complex_conjugate() function does have its imaginary component modified, but not the object within the main() function. This behavior is called pass by value, since the value of c1 (the argument of the function call) is copied into the number parameter object.

C++ also supports another mechanism for passing arguments: pass by reference. In this scheme, the parameter is an alias of the object passed in as an argument rather than being its own object. We specify that a parameter should use pass by reference by placing the ampersand (&) symbol to the left of the parameter name:

void Complex_conjugate(Complex &number) {
  number.imaginary = -number.imaginary;
}

Compiling and running the code with this modified definition gives us:

$ g++ -std=c++17 -o complex.exe complex.cpp
$ ./complex.exe
3.14+1.7
2.72-1.7

We see that c1 has indeed been conjugated. The memory diagram in Figure 97 depicts what the code looks like in memory at the end of the call to Complex_conjugate(). In the figure, we denote an alias with a dashed line, and we do not include a box next to the number name to reflect the fact that it is not associated with a new object.

Figure 97: A function that received an alias to an existing argument object.

Example: Stickman

Having seen the basics of C++, let us proceed to write a larger C++ program that implements a stickman game (also called hangman), which involves guessing the letters in a word or phrase. First, let’s sketch out how we want the game to run. If the puzzle is the word hello, an unsuccessful game may play out like the following:

$ ./stickman.exe hello



     _ _ _ _ _

Enter a lowercase letter to guess:
a

 o

     _ _ _ _ _

Enter a lowercase letter to guess:
b

 o
 |
     _ _ _ _ _

Enter a lowercase letter to guess:
c

 o
/|
     _ _ _ _ _

Enter a lowercase letter to guess:
d

 o
/|\
     _ _ _ _ _

Enter a lowercase letter to guess:
e

 o
/|\
     _ e _ _ _

Enter a lowercase letter to guess:
f

 o
/|\
/    _ e _ _ _

Enter a lowercase letter to guess:
g

 o
/|\
/ \  _ e _ _ _

Better luck next time!

We want to print out underscores corresponding to the letters in the puzzle, with a space between each of them. We prompt the player for a guess, reading from the standard input stream corresponding to input from the console. If the player guesses a letter that is not in the puzzle, we reveal a portion of a stick figure. If the player guesses a letter that is in the puzzle, we reveal all occurrences of that letter and do not add to the stick figure. If the player guesses incorrectly six times, the full stick figure is revealed, and the player loses.

The following is a successful run of the game (loosely following the frequency distribution of English letters):

$ ./stickman.exe hello



     _ _ _ _ _

Enter a lowercase letter to guess:
e



     _ e _ _ _

Enter a lowercase letter to guess:
t

 o

     _ e _ _ _

Enter a lowercase letter to guess:
s

 o
 |
     _ e _ _ _

Enter a lowercase letter to guess:
a

 o
/|
     _ e _ _ _

Enter a lowercase letter to guess:
i

 o
/|\
     _ e _ _ _

Enter a lowercase letter to guess:
o

 o
/|\
     _ e _ _ o

Enter a lowercase letter to guess:
h

 o
/|\
     h e _ _ o

Enter a lowercase letter to guess:
r

 o
/|\
/    h e _ _ o

Enter a lowercase letter to guess:
l

 o
/|\
/    h e l l o

Congratulations!

Program Constants

Now that we have a general sense of how the program should behave, we can go ahead and start on our design for its structure. First, let’s define some constants corresponding to the stick figure. The full figure is as follows:

 o
/|\
/ \

This spans three lines. Unfortunately, a string literal in C++ cannot directly include a newline (line break) character. However, we can use the escape sequence \n to tell the compiler we want a newline character. For instance, the following prints Hello on one line, followed by world! on the next line [4]:

std::cout << "Hello\nworld!" << std::endl;

Escape sequences in general start with a backslash in C++. If we actually want a backslash character itself, it too needs to be escaped: \\. Thus, the stick figure can be represented with the following string:

" o\n/|\\\n/ \\"

The string starts with a space to center the head of the stick figure, then the newline escape sequence \n to end the line, then the right arm and torso, followed by the \\ escape sequence for a backslash to represent the left arm, followed by another \n newline, then the right leg, a space, and finally the escaped left leg.

We can now work our way backwards to obtain partially revealed stick figures. There are two things we need to ensure so that the figures line up in the output:

Each figure has exactly two newline characters.
Each figure has exactly three characters on the third line.

We can use a std::vector<std::string> to store the figures in order:

const std::vector<std::string> STICK_FIGURES = {
  "\n\n   ",
  " o\n\n   ",
  " o\n |\n   ",
  " o\n/|\n   ",
  " o\n/|\\\n   ",
  " o\n/|\\\n/  ",
  " o\n/|\\\n/ \\"
};

We prefix the type with the const qualifier to denote that this is a constant, and the compiler will prevent us from modifying it. By convention, we name the constant using all capital letters, with underscores between words. We’ll also use a constant for the maximum number of wrong guesses:

const int MAX_MISSES = 6;

The Game Struct

Next, we consider what data we need to represent the state of a game. We have to keep track of both the answer to the puzzle, as well as what pieces have been guessed by the player and what pieces remain. We also need to track the number of missed guesses. We will additionally keep a count of the number of letters remaining – this isn’t strictly necessary, as we can recompute it when we need it, but it will make our job easier to keep track of it separately.

Before we define a struct, we also need to determine what underlying data types are appropriate to represent each member variable. The counts can be represented by the int type, and the answer by the std::string type. We can also represent what the player has and hasn’t guessed with a std::string – letters that have not been guessed will be replaced with an underscore, while those that have been revealed will have the same values as in the answer. We can now define the struct:

// A struct to represent the state of a game of stickman.
struct Game {
  // The number of wrong guesses made so far.
  int miss_count;

  // The number of remaining unguessed positions in the puzzle.
  int remaining_letters;

  // The current state of the puzzle, with underscores representing
  // unguessed letters.
  std::string puzzle;

  // The answer to the puzzle.
  std::string answer;
};

Here, we have documented each component with a comment above it that describes its purpose.

Task Decomposition

Let’s now think about the functions we need in our program. We should break down each discrete task in the program to its own function, so that each function has a small job, and so we can test each piece individually. The following are some tasks that our program needs to do:

Construct a Game object from a std::string answer, with the appropriate initial values for each member variable.
Print a Game object, showing the player the current state of the game.
Obtain a letter guess from the player, checking whether or not the guess is a valid letter.
Update the Game based on a guessed letter.
Perform the top-level game loop until the game ends.

Let’s write function signatures corresponding to each of these tasks. We will write them in the form of a declaration [5], which excludes the body of a function, replacing it with a semicolon.

The function that constructs a Game takes a std::string as an argument and returns a Game:
```
Game make_game(std::string answer);
```
The function that prints a Game takes a Game object and does not return anything:
```
void print_game(Game game);
```
We actually don’t need a copy of the Game object, so we can specify pass by reference instead to obtain an alias to the existing object:
```
void print_game(Game &game);
```
The function that obtains a guess from the player doesn’t take any arguments, and it returns a character value:
```
char get_guess();
```
The function that updates the game must take the Game object via pass by reference, so that its changes apply to the caller’s Game object rather than to a copy. It also takes in the guessed letter:
```
void update_game(Game &game, char guess);
```
The top-level function takes the answer as a string and does not return anything:
```
void play_game(std::string answer);
```

Implementation

We can now proceed to implement these functions, starting with make_game(). The function needs to define a Game object and set its member variables:

Game game = { 0, 0, answer, answer };

We’ve initialized the puzzle member variable to be a copy of the answer parameter to start, but we need to replace each letter with an underscore. Let’s assume that we only hide lowercase letters, and that other characters are shown to the player without needing to be guessed. This allows our puzzle to contain spaces and other punctuation. We can write a separate function to determine whether or not a character is a lowercase letter:

bool is_lowercase(char value) {
  return value >= 'a' && value <= 'z';
}

This logic takes advantage of the fact that in the ASCII standard (and most other character standards), the lowercase letters are adjacent and in order. Thus, a lowercase letter must be both greater than or equal to the character 'a' and no more than the character 'z'.

We can now loop over the puzzle member variable, replacing each lowercase character with an underscore:

for (std::size_t i = 0; i < game.puzzle.size(); ++i) {
  if (is_lowercase(game.puzzle[i])) {
    game.puzzle[i] = '_';
    ++game.remaining_letters;
  }
}

Iterating over a string works the same way as iterating over a vector. We use a for loop from 0 up to the size of the string, and we use square brackets to index into the string to obtain the character at that position. Unlike in some other languages, C++ strings are mutable, meaning that they can be modified, and we do so here by replacing each lowercase letter with an underscore character. We also increment the number of remaining letters each time we encounter a lowercase letter.

Putting this all together, the following defines our make_game() function:

Game make_game(std::string answer) {
  Game game = { 0, 0, answer, answer };
  for (std::size_t i = 0; i < game.puzzle.size(); ++i) {
    if (is_lowercase(game.puzzle[i])) {
      game.puzzle[i] = '_';
      ++game.remaining_letters;
    }
  }
  return game;
}

To print a game, we print out the stick figure corresponding to the current number of incorrect guesses:

std::cout << STICK_FIGURES[game.miss_count] << " ";

We add a bit of space between the figure and the puzzle. We then print out each of the characters in the puzzle, separated by spaces:

for (std::size_t i = 0; i < game.puzzle.size(); ++i) {
  std::cout << " " << game.puzzle[i];
}

In addition, we add extra newlines to visually separate the turns from each other. The full function definition is as follows:

void print_game(Game &game) {
  std::cout << std::endl;
  std::cout << STICK_FIGURES[game.miss_count] << " ";
  for (std::size_t i = 0; i < game.puzzle.size(); ++i) {
    std::cout << " " << game.puzzle[i];
  }
  std::cout << "\n" << std::endl;
}

Next, we implement the get_guess() function to obtain a guess from the player. At a high level, this function needs to do the following:

Obtain a string from the standard input stream.
Check whether the string is a valid guess – it must be a single letter, and the letter must be lowercase.
If the guess is invalid, repeat the process.
If the guess is valid, return the guessed letter.

There’s one more case we need to handle – what happens if we reach the end of the stream, when no more input is available? (On most systems, a user can manually end the standard input stream with the Ctrl-d keyboard input. We will also see later that input can be redirected from a file, in which case the input stream ends when it reaches the end of the file.) In such a case, we will return immediately with an error value that is different from any valid guess:

const char ERROR_CHAR = '\0';   // special "null" character

The following is a definition of get_guess():

char get_guess() {
  std::cout << "Enter a lowercase letter to guess:" << std::endl;
  std::string input;
  while (std::cin >> input) {
    if (input.size() != 1) {
      std::cout << "Error: guess must be exactly one letter" << std::endl;
    } else if (!is_lowercase(input[0])) {
      std::cout << "Error: guess must be between a and z" << std::endl;
    } else {
      return input[0];
    }
  }
  return ERROR_CHAR;
}

The std::cin object represents the standard input stream, similar to how std::cout is the standard output stream. We extract from an input stream using the >> operator, which is called the stream-extraction operator in this context. To extract a string, we use a std::string object as the right-hand side of the operator. It is common practice to perform this extraction in the condition of a loop. If the extraction succeeds, the condition has a true value, and the body of the loop runs. If the extraction fails (e.g. in the case of the end of the stream), the loop condition is false, and the loop exits. In this function, we return ERROR_CHAR when the extraction fails.

The function starts by prompting the player to enter a guess. It then reads input in a loop, checking whether or not the input is a valid guess. If the guess is invalid, a message describing the problem is printed, and the loop moves on to the next iteration, reading another input. If the guess is valid, the lone character in the input string is returned.
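
One way to try this validation logic without typing at a console is a variant that takes its streams as parameters. The sketch below is hypothetical: the names get_guess_from() and demo_guess() are not part of the stickman program, and is_lowercase() and ERROR_CHAR are repeated only so that the example stands alone.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

const char ERROR_CHAR = '\0';  // same sentinel value as in the text

// Repeated from the program so that this sketch is self-contained.
bool is_lowercase(char value) {
  return value >= 'a' && value <= 'z';
}

// Hypothetical variant of get_guess() that reads from and writes to
// the given streams instead of std::cin and std::cout.
char get_guess_from(std::istream &is, std::ostream &os) {
  os << "Enter a lowercase letter to guess:" << std::endl;
  std::string input;
  while (is >> input) {
    if (input.size() != 1) {
      os << "Error: guess must be exactly one letter" << std::endl;
    } else if (!is_lowercase(input[0])) {
      os << "Error: guess must be between a and z" << std::endl;
    } else {
      return input[0];
    }
  }
  return ERROR_CHAR;  // end of stream reached before a valid guess
}

// Hypothetical demonstration: the inputs "ab" and "Q" are rejected,
// and the third input, "x", is returned.
char demo_guess() {
  std::istringstream input("ab Q x");
  std::ostringstream output;  // collects the prompt and error messages
  return get_guess_from(input, output);
}
```

If the stream runs out before a valid guess appears, as with an empty std::istringstream, the function falls out of the loop and returns ERROR_CHAR.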

We now go ahead and implement the update_game() function. The function needs to iterate over the characters in the answer to check if any are the same as the guess. If so, we additionally need to check whether the letter has already been guessed. If it has not been guessed, the corresponding position in the puzzle string has an underscore. In this case, we replace the underscore with the actual letter and decrement the count of remaining letters:

for (std::size_t i = 0; i < game.answer.size(); ++i) {
  if (game.answer[i] == guess && game.puzzle[i] == '_') {
    game.puzzle[i] = guess;  // replace the _ with the actual letter
    --game.remaining_letters;
  }
}

The function also needs to update the miss_count member variable depending on whether the guess was a correct one or not. We can’t know this until we have traversed the entire puzzle, and we use a separate boolean to track this. The full function definition below demonstrates this logic:

void update_game(Game &game, char guess) {
  bool correct_guess = false;
  for (std::size_t i = 0; i < game.answer.size(); ++i) {
    if (game.answer[i] == guess && game.puzzle[i] == '_') {
      game.puzzle[i] = guess;  // replace the _ with the actual letter
      --game.remaining_letters;
      correct_guess = true;
    }
  }
  if (!correct_guess) {
    ++game.miss_count;
  }
}
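
One consequence of this logic is worth tracing by hand: a letter that has already been revealed no longer matches any underscore, so guessing it again counts as a miss. The sketch below repeats update_game() alongside a Game struct that is assumed to have the four members used in this section, and adds a hypothetical trace_hello() helper that applies three guesses to the answer hello.

```cpp
#include <cstddef>
#include <string>

// Assumed to match the Game struct used by the stickman program.
struct Game {
  int miss_count;
  int remaining_letters;
  std::string puzzle;
  std::string answer;
};

// Repeated from the text so that this sketch is self-contained.
void update_game(Game &game, char guess) {
  bool correct_guess = false;
  for (std::size_t i = 0; i < game.answer.size(); ++i) {
    if (game.answer[i] == guess && game.puzzle[i] == '_') {
      game.puzzle[i] = guess;  // replace the _ with the actual letter
      --game.remaining_letters;
      correct_guess = true;
    }
  }
  if (!correct_guess) {
    ++game.miss_count;
  }
}

// Hypothetical helper that traces three guesses on the answer "hello".
Game trace_hello() {
  Game game = { 0, 5, "_____", "hello" };
  update_game(game, 'l');  // reveals both l's: puzzle is "__ll_"
  update_game(game, 'l');  // no underscore left to reveal: first miss
  update_game(game, 'z');  // not in the answer: second miss
  return game;
}
```

After these three calls, the puzzle is __ll_, three letters remain, and the miss count is 2.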

Lastly, we can write the top-level function that plays a game. It starts by creating a Game object via make_game(). It then has a loop that:

Prints the game using print_game().
Obtains a guess from the player via get_guess(). If ERROR_CHAR is returned, the game immediately exits.
Updates the game using update_game().

Other than the ERROR_CHAR condition, the game terminates either when all letters have been guessed, or the player has made the maximum number of incorrect guesses. Thus, the following is the main loop:

while (game.remaining_letters > 0 && game.miss_count < MAX_MISSES) {
  print_game(game);
  char letter = get_guess();
  if (letter == ERROR_CHAR) {
    std::cout << "Quitting." << std::endl;
    return; // quit the game
  }
  update_game(game, letter);
}

After the game is over, we print the game once more to show its final state. We then print a message to the player depending on whether or not they won, as shown in the full function definition below:

void play_game(std::string answer) {
  Game game = make_game(answer);
  while (game.remaining_letters > 0 && game.miss_count < MAX_MISSES) {
    print_game(game);
    char letter = get_guess();
    if (letter == ERROR_CHAR) {
      std::cout << "Quitting." << std::endl;
      return;  // quit the game
    }
    update_game(game, letter);
  }
  print_game(game);
  if (game.remaining_letters == 0) {
    std::cout << "Congratulations!" << std::endl;
  } else {
    std::cout << "Better luck next time!" << std::endl;
  }
}

Testing and File Organization

Before we write a main() function that plays the game, we might want to write some code that tests individual functions in our program. For example, the following code does a basic test of the make_game() function:

#include <cassert>
#include <string>

void test_make_game() {
  std::string answer = "hello world!";
  Game game = make_game(answer);
  assert(game.miss_count == 0);
  assert(game.remaining_letters == 10);
  assert(game.puzzle == "_____ _____!");
  assert(game.answer == answer);
}

int main() {
  test_make_game();
}

The test_make_game() function creates a Game object from the "hello world!" answer string. It then asserts that each of the member variables of the Game object has the expected value. The assert() construct is defined in the cassert library header, which we have included at the top.

Observe that the test code has its own main() function, so it needs to be in a different .cpp source file from the one containing the main() function that actually plays a game of stickman. Let’s assume that the function definitions above are placed in stickman.cpp, and our test code is in test.cpp. The compiler will process these two files individually, even if they are both provided in a single compilation command. How can we make the compiler aware of the functions defined in stickman.cpp when it is compiling test.cpp?

The compiler actually only needs access to the declarations of the functions we use, not the definitions. It does need access to the definition of the Game struct, since it needs to know what member variables the struct has. Thus, common practice is to place struct definitions, function declarations, and constants in a separate header file, conventionally with a file extension such as .hpp. The following code can be placed in stickman.hpp:
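
The distinction between a declaration and a definition can be seen even within a single file. In the sketch below, whose function names are made up for illustration, sum_of_squares() calls square() after seeing only its declaration; the definition appears later.

```cpp
// Declaration: tells the compiler the name, parameter types, and
// return type of square(), but not how it is implemented.
int square(int value);

// This function can be compiled with only the declaration in view.
int sum_of_squares(int a, int b) {
  return square(a) + square(b);
}

// Definition: supplies the body. In a multi-file program, this could
// live in a different .cpp file from the code that calls square().
int square(int value) {
  return value * value;
}
```

A header file plays the role of the first declaration here: it gives every source file that includes it enough information to compile calls to the function.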

#include <string>
#include <vector>

// Stick figures corresponding to each possible number of misses.
const std::vector<std::string> STICK_FIGURES = {
  "\n\n   ",
  " o\n\n   ",
  " o\n |\n   ",
  " o\n/|\n   ",
  " o\n/|\\\n   ",
  " o\n/|\\\n/  ",
  " o\n/|\\\n/ \\"
};

// Maximum number of missed guesses.
const int MAX_MISSES = 6;

// Used to indicate an error when reading user input.
const char ERROR_CHAR = '\0';   // special "null" character

// A struct to represent the state of a game of stickman.
struct Game {
  // The number of wrong guesses made so far.
  int miss_count;

  // The number of remaining unguessed positions in the puzzle.
  int remaining_letters;

  // The current state of the puzzle, with underscores representing
  // unguessed letters.
  std::string puzzle;

  // The answer to the puzzle.
  std::string answer;
};

// EFFECTS: Returns a properly initialized Game object corresponding
//          to the given answer phrase.
Game make_game(std::string answer);

// REQUIRES: game.miss_count <= MAX_MISSES
// MODIFIES: cout
// EFFECTS: Prints the state of the game to standard output.
void print_game(Game &game);

// MODIFIES: cout, cin
// EFFECTS: Repeatedly prompts the user for a guess consisting of a
//          single lowercase letter until a valid guess is provided,
//          or the end of stream is reached. Returns the guess in
//          the first case, or ERROR_CHAR in the second.
char get_guess();

// MODIFIES: game
// EFFECTS: Updates the game's puzzle, remaining-letter count, and miss
//          count according to whether the guess is a letter in the
//          answer and has not been previously guessed.
void update_game(Game &game, char guess);

// MODIFIES: cout, cin
// EFFECTS: Plays a game of stickman with the given answer, reading
//          guesses from standard in and writing game details to
//          standard out. The game ends if the end of standard input
//          is reached, the player exhausts the maximum number of
//          incorrect guesses, or the player correctly guesses all
//          letters.
void play_game(std::string answer);

We start by including the string and vector standard-library headers, since the code in stickman.hpp uses both strings and vectors. We then have our constant definitions, followed by the definition of the Game struct. Lastly, we have function declarations for each task in the game. We have included documentation in the form of RMEs (requires, modifies, and effects):

The requires clause specifies what is required to be true prior to calling the function. These are also called preconditions. All bets are off if these conditions are violated: we get the dreaded undefined behavior. The function’s implementation is allowed to assume that these conditions are met – it is not required to check them in any way.
The modifies clause lists the objects outside the function that might be modified by a call to the function. We generally include cout if the standard output stream is written to, and cin if input is read from the standard input stream. Pass-by-reference parameters are included if their contents might be modified.
The effects clause tells us what the function actually does, i.e. what the return value means and how any objects in the modifies clause are actually modified. The effects are sometimes also called postconditions. This clause only specifies the “what”, not the “how”; we will come back to this later.
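
As an illustration of how the three clauses fit together, here is a sketch of an RME for a small function that is not part of the stickman program; the name hide_letter() and its behavior are invented for this example. The assert() on the precondition is optional, since an implementation is allowed to assume that the requires clause holds.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// REQUIRES: index < text.size()
// MODIFIES: text
// EFFECTS: Replaces the character at position index in text with an
//          underscore. Returns the character that was replaced.
char hide_letter(std::string &text, std::size_t index) {
  assert(index < text.size());  // optional check of the precondition
  char original = text[index];
  text[index] = '_';
  return original;
}
```

Calling hide_letter() on a string holding hello with index 1 returns 'e' and leaves the string as h_llo. A call with index 5 would violate the requires clause, so the RME makes no promise about what happens.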

Now that we have this code in stickman.hpp, we can include it in both test.cpp and stickman.cpp. The contents of test.cpp are as follows:

#include <cassert>
#include "stickman.hpp"

void test_make_game() {
  std::string answer = "hello world!";
  Game game = make_game(answer);
  assert(game.miss_count == 0);
  assert(game.remaining_letters == 10);
  assert(game.puzzle == "_____ _____!");
  assert(game.answer == answer);
}

int main() {
  test_make_game();
}

The #include "stickman.hpp" directive pulls in the code from stickman.hpp into the current file. We use quotes around the filename rather than the angle brackets we use with a library header. Observe that we no longer need the #include <string> directive, since stickman.hpp already contains that.

The full contents of stickman.cpp are as follows:

#include <iostream>
#include "stickman.hpp"

using std::cin;
using std::cout;
using std::endl;
using std::size_t;
using std::string;

static bool is_lowercase(char value) {
  return value >= 'a' && value <= 'z';
}

Game make_game(string answer) {
  Game game = { 0, 0, answer, answer };
  for (size_t i = 0; i < game.puzzle.size(); ++i) {
    if (is_lowercase(game.puzzle[i])) {
      game.puzzle[i] = '_';
      ++game.remaining_letters;
    }
  }
  return game;
}

void print_game(Game &game) {
  cout << endl;
  cout << STICK_FIGURES[game.miss_count] << " ";
  for (size_t i = 0; i < game.puzzle.size(); ++i) {
    cout << " " << game.puzzle[i];
  }
  cout << "\n" << endl;
}

char get_guess() {
  cout << "Enter a lowercase letter to guess:" << endl;
  string input;
  while (cin >> input) {
    if (input.size() != 1) {
      cout << "Error: guess must be exactly one letter" << endl;
    } else if (!is_lowercase(input[0])) {
      cout << "Error: guess must be between a and z" << endl;
    } else {
      return input[0];
    }
  }
  return ERROR_CHAR;
}

void update_game(Game &game, char guess) {
  bool correct_guess = false;
  for (size_t i = 0; i < game.answer.size(); ++i) {
    if (game.answer[i] == guess && game.puzzle[i] == '_') {
      game.puzzle[i] = guess;  // replace the _ with the actual letter
      --game.remaining_letters;
      correct_guess = true;
    }
  }
  if (!correct_guess) {
    ++game.miss_count;
  }
}

void play_game(string answer) {
  Game game = make_game(answer);
  while (game.remaining_letters > 0 && game.miss_count < MAX_MISSES) {
    print_game(game);
    char letter = get_guess();
    if (letter == ERROR_CHAR) {
      cout << "Quitting." << endl;
      return;  // quit the game
    }
    update_game(game, letter);
  }
  print_game(game);
  if (game.remaining_letters == 0) {
    cout << "Congratulations!" << endl;
  } else {
    cout << "Better luck next time!" << endl;
  }
}

We include iostream, since that is not included by stickman.hpp. We don’t have struct or constant definitions, since those are defined in the stickman.hpp header. Instead, we just have the definitions for each function.

We made a few minor changes from before:

We added using declarations so that we can use specific standard-library entities without the std:: qualification.
We preceded the definition of is_lowercase() with the static keyword. This is common practice for a helper function, and it prevents the function from conflicting with a function of the same name defined in some other source file.
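
The two changes can be seen together in a smaller setting. In the sketch below, is_vowel() and count_vowels() are hypothetical functions unrelated to the stickman program; they are only meant to show using declarations and a static helper side by side.

```cpp
#include <cstddef>
#include <string>

using std::size_t;  // allows writing size_t instead of std::size_t
using std::string;  // allows writing string instead of std::string

// The static keyword makes this helper visible only within this source
// file, so another source file may define its own is_vowel().
static bool is_vowel(char value) {
  return value == 'a' || value == 'e' || value == 'i' ||
         value == 'o' || value == 'u';
}

// Hypothetical function that relies on the helper above.
int count_vowels(string text) {
  int count = 0;
  for (size_t i = 0; i < text.size(); ++i) {
    if (is_vowel(text[i])) {
      ++count;
    }
  }
  return count;
}
```

Only count_vowels() would be declared in a header for other files to call; the helper stays private to its source file.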

We can now compile and run the test code. We provide both .cpp source files to the compilation command, but not the .hpp file – its contents are pulled directly into the .cpp files via the #include "stickman.hpp" directive:

$ g++ -std=c++17 -o test.exe test.cpp stickman.cpp
$ ./test.exe

The assertions all succeed, so we don’t see any output.

Lastly, we can write a separate play.cpp file that has a main() function to play the game. The following are the contents of the file:

#include <iostream>
#include "stickman.hpp"

int main(int argc, char *argv[]) {
  if (argc < 2) {
    std::cout << "Usage: " << argv[0] << " <answer>" << std::endl;
    return 1;
  }
  play_game(argv[1]);
}

We use an alternate signature for main() that gives us access to the command-line arguments given to the program. The first argument, argv[0], is conventionally the name of the program executable. We take the game answer as the second argument, argv[1], so we start by checking whether there are at least two arguments. If not, we print a usage message and return with a nonzero value to indicate an error. If an answer is provided, we invoke the top-level play_game() function with the answer.
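
To make the layout of argc and argv concrete, the sketch below pulls the argument handling into a hypothetical helper, answer_from_args(), which is not part of play.cpp. It returns the second command-line argument if there is one, and a caller-supplied fallback otherwise.

```cpp
#include <string>

// Hypothetical helper: argv[0] holds the program name, so the first
// argument supplied by the user, if any, is argv[1].
std::string answer_from_args(int argc, char *argv[],
                             std::string fallback) {
  if (argc < 2) {
    return fallback;  // no answer was given on the command line
  }
  return argv[1];  // converted from a C-style string to std::string
}
```

Running ./play.exe world! corresponds to argc being 2, with argv[1] holding world! and argv[0] typically holding the name by which the program was invoked.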

We compile and run the program as follows:

$ g++ -std=c++17 -o play.exe play.cpp stickman.cpp
$ ./play.exe world!



     _ _ _ _ _ !

Enter a lowercase letter to guess:
e

 o

     _ _ _ _ _ !

Enter a lowercase letter to guess:
t

 o
 |
     _ _ _ _ _ !

Enter a lowercase letter to guess:
s

 o
/|
     _ _ _ _ _ !

Enter a lowercase letter to guess:
o

 o
/|
     _ o _ _ _ !

Enter a lowercase letter to guess:
r

 o
/|
     _ o r _ _ !

Enter a lowercase letter to guess:
a

 o
/|\
     _ o r _ _ !

Enter a lowercase letter to guess:
l

 o
/|\
     _ o r l _ !

Enter a lowercase letter to guess:
d

 o
/|\
     _ o r l d !

Enter a lowercase letter to guess:
w

 o
/|\
     w o r l d !

Congratulations!