Arrays
As we mentioned previously, C++ has several different categories of objects, including atomic, array, and class-type objects. An array is a simple collection of objects, built into C++ and many other languages. An array has the following properties:
It has a fixed size, set when the array is created. This size never changes as long as the array is alive.
An array holds elements that are of the same type.
The elements of an array are stored in a specific order, with the index of the first element being 0.
The elements are stored contiguously in memory, one after another.
Accessing any element of an array takes constant time, regardless of whether the element is at the beginning, middle, or end of the array.
An array variable can be declared by placing square brackets to the
right of the variable name, with a compile-time constant between the
brackets, denoting the number of elements. For example, the following
declares array to be an array of four int elements:
int array[4];
The following uses a named constant to declare array2 to be an
array of four ints:
const int SIZE = 4;
int array2[SIZE];
In both cases, we did not provide an explicit initialization. Thus,
array and array2 are default initialized by default
initializing each of their elements. Since their elements are of
atomic type int, they are default initialized to undefined values.
We can explicitly initialize an array with an initializer list, a list of values in curly braces:
int array[4] = { 1, 2, 3, 4 };
This initializes the element at index 0 to 1, the element at index 1 to 2, and so on.
If the initializer list contains fewer values than the size of the
array, the remaining array elements are implicitly initialized. For
atomic elements, these remaining elements are initialized to zero
values [1]. Thus, the following initializes the first two elements of
array2 to 1 and 2, respectively, and the last two elements to 0:
int array2[4] = { 1, 2 };
The following results in every element in array3 being initialized
to 0:
int array3[4] = {};
Here, we have provided an empty initializer list, so that the first zero elements (i.e. none of them) are explicitly initialized while the remaining elements (i.e. all of them) are implicitly initialized to 0.
If the size of the array is the same as the size of the initializer list, we can elide the size of the array in its declaration:
int array[] = { 1, 2, 3, 4 };
Figure 38 illustrates the layout of array in
memory.
Figure 38 Layout of an array in memory.
This diagram assumes that an int takes up four bytes in memory,
which is the case on most modern machines.
Individual array elements can be accessed with square brackets, with
an index between the brackets. Indexing starts at 0, up through the
size of the array minus one. For example, the following increments
each element in array by one and prints out each resulting value:
for (int i = 0; i < 4; ++i) {
++array[i];
cout << array[i] << endl;
}
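As a small sketch that ties together initializer lists and indexing, the following sums an array whose initializer list supplies only two of its four values. The function name sum_partial and the values 5 and 7 are made up for this illustration.

```cpp
#include <cassert>

// Hypothetical helper for illustration: builds a four-element array from a
// two-value initializer list and returns the sum of all four elements.
int sum_partial() {
  const int SIZE = 4;
  int values[SIZE] = { 5, 7 };  // elements at indices 2 and 3 are set to 0

  // The elements not covered by the initializer list are zero.
  assert(values[2] == 0 && values[3] == 0);

  int total = 0;
  for (int i = 0; i < SIZE; ++i) {  // valid indices are 0 through SIZE - 1
    total += values[i];
  }
  return total;
}
```

Calling sum_partial() adds 5 + 7 + 0 + 0, since only the first two elements were given nonzero values.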
Arrays can be composed with other kinds of objects, such as structs.
The following is an array of three Person elements:
struct Person {
string name;
int age;
bool is_ninja;
};
Person people[3];
Figure 39 shows the layout of this array in memory.
Figure 39 An array of class-type objects.
The following is a struct that contains an array as a member, and its layout is shown in Figure 40:
struct Matrix {
int width;
int height;
int data[6];
};
int main() {
Matrix matrix;
...
}
Figure 40 A class-type object with an array member.
Arrays and Pointers
Arrays in C++ are objects. However, in most contexts, there isn’t a value associated with an array as a whole [2]. The individual elements (if they are not of array type) have values, but the array as a whole does not. Instead, when we use an array in a context where a value is required, the compiler converts the array into a pointer to the first element in the array:
The system of values in C++ is very complicated and beyond the scope of this course. In the context of this course, we use the term value to mean something called an rvalue in programming-language terms. There are a handful of ways to construct an array rvalue in C++, but none that we will encounter in this course.
int array[] = { 1, 2, 3, 4 };
cout << &array[0] << endl; // prints 0x1000 assuming the figure above
cout << array << endl; // prints 0x1000 assuming the figure above
*array = -1;
cout << array[0] << endl; // prints -1
In this example, assuming the layout in Figure 38
where the first element is at address 0x1000, printing array
to standard output just prints out the address 0x1000 – it
converts array to a pointer to its first element, and it is the
pointer’s value that is then printed. Similarly, dereferencing the
array first turns it into a pointer to the first element, followed by
the dereference that gives us the first element itself.
The tendency of arrays to decay into pointers results in significant limitations when using an array. For instance, we cannot assign one array to another – the right-hand side of an assignment requires a value, which in the case of an array will become a pointer, which is then incompatible with the left-hand side array:
int arr1[4] = { 1, 2, 3, 4 };
int arr2[4] = { 5, 6, 7, 8 };
arr2 = arr1; // error: LHS is an array, RHS is a pointer
As discussed before, by default, C++ passes parameters by value. This is also true if the parameter is an array. Since an array decays to a pointer when its value is required, this implies that an array is passed by value as a pointer to its first element. Thus, an array parameter to a function is actually equivalent to a pointer parameter, regardless of whether or not the parameter includes a size:
void func1(int arr[4]); // parameter equivalent to int *arr
void func2(int arr[5]); // parameter equivalent to int *arr
void func3(int arr[]); // parameter equivalent to int *arr
void func4(int *arr);
int main() {
int arr1[4] = { 1, 2, 3, 4 };
int arr2[5];
int x = -3;
func1(arr1); // OK: arr1 turns into pointer, as does parameter of func1
func2(arr1); // OK: arr1 turns into pointer, as does parameter of func2
// compiler ignores size in func2 parameter
func3(arr1); // OK: arr1 turns into pointer, as does parameter of func3
func4(arr1); // OK: arr1 turns into pointer, matches parameter of func4
func1(arr2); // OK: arr2 turns into pointer, as does parameter of func1
// compiler ignores size in func1 parameter
func2(arr2); // OK: arr2 turns into pointer, as does parameter of func2
func3(arr2); // OK: arr2 turns into pointer, as does parameter of func3
func4(arr2); // OK: arr2 turns into pointer, matches parameter of func4
func1(&x); // OK: parameter of func1 turns into pointer
func2(&x); // OK: parameter of func2 turns into pointer
func3(&x); // OK: parameter of func3 turns into pointer
func4(&x); // OK: matches parameter of func4
}
This means that a function that takes an array as a parameter cannot guarantee that the argument value corresponds to an array of matching size, or even that it is a pointer into an array. Instead, we need another mechanism for passing size information to a function; we will come back to this momentarily.
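Because an array cannot be assigned as a whole, and a function receives only a pointer, copying an array has to be done element by element, with the caller supplying the element count. The following is a minimal sketch of that idea; the name copy_array and its interface are assumptions made for illustration and are not part of any library.

```cpp
#include <cassert>

// REQUIRES: src and dst each point to the first of at least size elements,
//           and the two ranges do not overlap
// MODIFIES: the first size elements of dst
// EFFECTS:  Copies the first size elements of src into dst.
// (copy_array is a hypothetical helper, not a library function.)
void copy_array(int dst[], const int src[], int size) {
  assert(size >= 0);
  for (int i = 0; i < size; ++i) {
    dst[i] = src[i];  // dst and src are really pointers here
  }
}
```

With arr1 and arr2 declared as in the assignment example above, copy_array(arr2, arr1, 4) does what arr2 = arr1 could not. Nothing stops a caller from passing a size that is too large, so the REQUIRES clause is the only protection.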
Pointer Arithmetic
C++ supports certain arithmetic operations on pointers:
An integral value can be added to or subtracted from a pointer, resulting in a pointer that is offset from the original one.
Two pointers can be subtracted, resulting in an integral value that is the distance between the pointers.
Pointer arithmetic is in terms of number of elements rather than
number of bytes. For instance, if an int takes up four bytes
of memory, then adding 2 to an int * results in a pointer that is
two ints forward in memory, or a total of eight bytes:
int array[] = { 4, 3, 2, 1 };
int *ptr1 = array; // pointer to first element
int *ptr2 = &array[2]; // pointer to third element
int *ptr3 = ptr1 + 2; // pointer to third element
int *ptr4 = array + 2; // pointer to third element
++ptr1; // move pointer to second element
In initializing ptr4, array is converted to a pointer to its
first element, since the + operator requires a value, and the
result is two ints forward in memory, producing a pointer to the
third element. The last line increments ptr1 to point to the next
int in memory. The result is shown in
Figure 41.
Figure 41 Pointer arithmetic is in terms of whole objects, not bytes.
The following demonstrates subtracting pointers:
cout << ptr2 - ptr1 << endl; // prints 1
Since ptr2 is one int further in memory than ptr1, the
difference ptr2 - ptr1 is 1.
Pointer arithmetic is one reason why each C++ type has its own pointer
type – in order to be able to do pointer arithmetic, the compiler
needs to use the size of the pointed-to type, so it needs to know what
that type is. For example, implementations generally represent
double objects with eight bytes, so adding 2 to a double *
moves 16 bytes forward in memory. In general, for a pointer of type
T *, adding N to it moves N * sizeof(T) bytes forward in
memory [3].
sizeof is an operator that can be applied to a type to
obtain the number of bytes used to represent that type. When
applied to a type, the parentheses are mandatory (e.g.
sizeof(int)). The operator can also be applied to a value,
in which case it results in the size of the compile-time type
of that value. Parentheses are not required in this case (e.g.
sizeof 4 or sizeof x).
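The relationship between element offsets and byte offsets can be checked directly. The sketch below measures the same distance twice, once in double elements and once in bytes, by viewing the addresses as pointers to char (a char occupies one byte). The helper names are hypothetical, and reinterpret_cast is used here only to observe addresses, which is not something the rest of this section relies on.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: distance between two pointers, measured in elements.
// REQUIRES: first and last point into (or just past) the same array,
//           with first <= last
std::ptrdiff_t element_distance(const double *first, const double *last) {
  assert(first <= last);
  return last - first;  // pointer subtraction counts elements
}

// Hypothetical helper: the same distance, measured in bytes.
// REQUIRES: same as element_distance
std::ptrdiff_t byte_distance(const double *first, const double *last) {
  assert(first <= last);
  // Viewing the addresses as char pointers makes subtraction count bytes.
  const char *first_byte = reinterpret_cast<const char *>(first);
  const char *last_byte = reinterpret_cast<const char *>(last);
  return last_byte - first_byte;
}
```

For a double array named values, element_distance(values, values + 2) is 2, while byte_distance(values, values + 2) is 2 * sizeof(double), which is 16 on implementations where a double takes eight bytes.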
Pointers can also be compared with the comparison operators, as in the following using the pointers declared above:
cout << (ptr1 == ptr2) << endl; // false (prints as 0)
cout << (ptr2 == ptr3) << endl; // true (prints as 1)
cout << (ptr1 < ptr2) << endl; // true
cout << (*ptr1 < *ptr2) << endl; // false (compares element values)
++ptr1;
cout << (ptr1 == ptr2) << endl; // true
cout << (array == &array[0]) << endl; // true (LHS turns into pointer)
Arithmetic is generally useful only on pointers to array elements, since only array elements are guaranteed to be stored contiguously in memory. Similarly, comparisons are generally only well-defined on pointers into the same array or on pointers constructed from arithmetic operations on the same pointer.
Array Indexing
Array indexing in C++ is actually implemented using pointer
arithmetic. If one of the operands to the subscript ([]) operator
is an array and the other is integral, then the operation is
equivalent to pointer arithmetic followed by a dereference:
int arr[4] = { 1, 2, 3, 4 };
cout << *(arr + 2) << endl; // prints 3: arr+2 is pointer to 3rd element
cout << arr[2] << endl; // prints 3: equivalent to *(arr + 2)
cout << 2[arr] << endl; // prints 3: equivalent to *(2 + arr);
// but don't do this!
Thus, if arr is an array and i is integral, then arr[i] is
equivalent to *(arr + i):
The subscript operation requires the value of arr, so it turns into a pointer to its first element.
Pointer arithmetic is done to produce a pointer i elements forward in memory.
The resulting pointer is dereferenced, resulting in the element at index i.
Because the subscript operation is equivalent to pointer arithmetic, it can be applied to a pointer equally as well:
int arr[4] = { 1, 2, 3, 4 };
int *ptr = arr + 1; // pointer to second element
cout << ptr[2] << endl; // prints 4: constructs a pointer that is 2
// elements forward in memory, then
// dereferences that
There are several implications of the equivalence between array indexing and pointer arithmetic. First, it is what makes array access a constant time operation – no matter the index, accessing an element turns into a single pointer addition followed by a single dereference. The equivalence is also what makes passing arrays by value work – the result is a pointer, which we can still subscript into since it just does pointer arithmetic followed by a dereference. Finally, it allows us to work with subsets of an array. For instance, the following code prints out just the middle elements of an array:
void print_array(int array[], int size) {
for (int i = 0; i < size; ++i) {
cout << array[i] << " ";
}
}
int main() {
int array[4] = { 3, -1, 5, 2 };
print_array(array + 1, 2); // prints out just -1 5
}
Figure 42 Passing a subset of an array to a function.
The print_array() function receives a pointer to the array’s
second element as well as a size of 2, as shown in
Figure 42. Thus, it only prints out the second
and third elements; as far as the function knows, it is working with
an array of size 2 that starts at the address 0x1004.
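One consequence worth seeing in code is that any index a function computes is relative to the pointer it was given, not to the start of the original array. The following sketch uses a hypothetical helper, index_of_max, to show this.

```cpp
#include <cassert>

// REQUIRES: arr points to the first of at least size elements; size > 0
// EFFECTS:  Returns the index, relative to arr, of the largest element
//           (the first such index if there are ties).
// (index_of_max is a hypothetical helper introduced for this sketch.)
int index_of_max(const int arr[], int size) {
  assert(size > 0);
  int best = 0;
  for (int i = 1; i < size; ++i) {
    if (arr[i] > arr[best]) {
      best = i;
    }
  }
  return best;
}
```

Given int array[4] = { 3, -1, 5, 2 }, the call index_of_max(array, 4) returns 2, but index_of_max(array + 1, 2) returns 1: the function sees only the two elements -1 and 5, and 5 is at index 1 of that subset.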
More on Array Decay
An array only decays into a pointer when its value is required. When
an array object’s value is not required, it does not decay into a
pointer. For example, the address-of (&) operator requires an
object but not its value – thus, applying & to an array produces
a pointer to the whole array, not a pointer to an individual element
nor a pointer to a pointer [4].
A pointer to an array of 4 ints can be declared using the
syntax int (*ptr_to_arr)[4];. The address value stored in a
pointer to an array is generally the same address as that of
the array’s first element.
Another example is applying the sizeof operator to an array. The
operator produces the size of the whole array in bytes [5], as
opposed to applying it to a pointer, which just produces the size of a
pointer (generally eight bytes on modern systems):
int x = 42;
int arr[5] = { 1, 2, 3, 4, 5 };
int *ptr = arr;
cout << sizeof x << endl; // 4 (on most machines)
cout << sizeof arr << endl; // 20
cout << sizeof ptr << endl; // 8
Thus, the expression sizeof array / sizeof *array recovers
the number of elements, as long as array is still an array.
Once an array has turned into a pointer, the resulting pointer loses all information about the size of the array, or even that it is a pointer into an array. Thus, we need another mechanism for keeping track of the size of an array, such as when we pass the array to a function (if it is passed by value, it turns into a pointer which retains no information about the array’s size).
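A minimal sketch of one such mechanism follows: the caller computes the element count with sizeof while it still has the array itself, and passes that count alongside the decayed pointer. The names sum and sum_of_local_array are hypothetical.

```cpp
#include <cassert>

// REQUIRES: arr points to the first of at least size elements
// EFFECTS:  Returns the sum of the first size elements.
// (sum is a hypothetical helper; inside it, arr is just a pointer, so
// sizeof arr would give the size of a pointer, not of the array.)
int sum(const int arr[], int size) {
  assert(size >= 0);
  int total = 0;
  for (int i = 0; i < size; ++i) {
    total += arr[i];
  }
  return total;
}

// Hypothetical caller: computes the element count while data is still an
// array, then hands that count to sum() along with the decayed pointer.
int sum_of_local_array() {
  int data[5] = { 1, 2, 3, 4, 5 };
  const int count = static_cast<int>(sizeof data / sizeof *data);  // 5
  return sum(data, count);
}
```

The division sizeof data / sizeof *data only works inside sum_of_local_array(), where data is declared as an array. Writing the same expression inside sum() would divide the size of a pointer by the size of an int.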
The End of an Array
If a program dereferences a pointer that goes past the bounds of the array, the result is undefined behavior [6]. If we are lucky, the program will crash, indicating we did something wrong and giving us an opportunity to debug it. In the worst case, the program may compute the right result when we run it on our own machine but misbehave when run on a different platform (e.g. the autograder).
Constructing a pointer that is out of bounds is not a problem; we often construct pointers that are just past the end of an array, as we will see in a moment. It is dereferencing such a pointer that results in undefined behavior.
There are two general strategies for keeping track of where an array ends.
Keep track of the length separately from the array. This can be done with either an integer size or by constructing a pointer that is just past the end of an array (by just adding the size of the array to a pointer to the array’s first element).
Store a special sentinel value at the end of the array, which allows an algorithm to detect that it has reached the end.
The first strategy is what we used in defining the print_array()
function above. As demonstrated there, the stored size may be smaller
than the size of the array, resulting in the function operating on a
subset of the array.
The second strategy requires there to be a special value that can be
reserved to indicate the end of the array, and that we are assured
will not occur as a real element. It is how built-in (C-style) strings
(as opposed to C++ std::string) are implemented, though we do not
cover the details here. Instead, we will return to the sentinel
strategy when we implement linked data structures.
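To make the sentinel strategy concrete, here is a sketch that counts the elements of an int array terminated by a reserved value. The function name count_before_sentinel is an assumption for illustration; the approach only works if the caller can guarantee that the sentinel never occurs as a real element.

```cpp
#include <cassert>

// REQUIRES: arr points to a sequence of ints that contains sentinel
// EFFECTS:  Returns the number of elements before the first sentinel.
// (count_before_sentinel is a hypothetical helper for this sketch.)
int count_before_sentinel(const int *arr, int sentinel) {
  assert(arr != nullptr);
  int count = 0;
  while (arr[count] != sentinel) {  // stop when the end marker is found
    ++count;
  }
  return count;
}
```

For example, with int ages[] = { 31, 25, 47, -1 }, where -1 is reserved as the end marker because no real age is negative, count_before_sentinel(ages, -1) returns 3. If the sentinel is missing, the loop runs past the end of the array, which is undefined behavior.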
Array Traversal
The print_array() function above also demonstrates how to traverse
through an array using an index that starts at 0 up to the size of the
array, exclusive. The following is another example:
int const SIZE = 3; // constant to represent array size
int array[SIZE] = { -1, 7, 3 };
for (int i = 0; i < SIZE; ++i) {
cout << *(array + i) << endl; // using pointer arithmetic
cout << array[i] << endl; // using subscript (better)
}
Figure 43 Traversal by index uses an index to walk through an array.
This pattern of accessing elements is called traversal by index –
we use an integer index in the range \([0, SIZE)\) where
\(SIZE\) is the size of the array, and we use the index to obtain
the corresponding element. We can use the subscript operator or do
pointer arithmetic ourselves. (The former is generally considered
better, since it is more familiar and clearer to most programmers.
However, you will often see both arr[0] and *arr used to
access the first element.) The actual syntax we use is irrelevant to
the pattern – what makes this traversal by index is that we use an
integer index to access the array, and we traverse through the
elements by modifying the index.
Another pattern we can use is traversal by pointer, which walks a pointer across the elements of an array:
int const SIZE = 3; // constant to represent array size
int array[SIZE] = { -1, 7, 3 };
int *end = array + SIZE; // pointer just past the end of array
for (int *ptr = array; ptr < end; ++ptr) { // walk pointer across array
cout << *ptr << endl; // dereference to obtain element
}
Figure 44 Traversal by pointer uses a pointer to walk through an array.
Here, we start by constructing a pointer that is just past the end of
the array: the last element is at array + SIZE - 1, so we need to
end our traversal when the pointer we are using reaches array +
SIZE. We then use another pointer that starts at the first element,
dereference it to obtain an element, and then increment it to move on
to the next element. The syntax we use to dereference an element is
irrelevant to the pattern (it can be *ptr or ptr[0]) – what
makes this traversal by pointer is that we use a pointer to each
element to access the array, and we traverse through the elements by
modifying that pointer.
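Traversal by pointer also works well as a function interface: instead of a pointer and a size, the function can take a pointer to the first element and a pointer just past the last one. The following is a sketch under that assumption; count_equal is a hypothetical name.

```cpp
#include <cassert>

// REQUIRES: begin and end point into (or just past the end of) the same
//           array, with begin <= end
// EFFECTS:  Returns how many elements in the range [begin, end) are
//           equal to value.
// (count_equal is a hypothetical helper introduced for this sketch.)
int count_equal(const int *begin, const int *end, int value) {
  assert(begin <= end);
  int count = 0;
  for (const int *ptr = begin; ptr < end; ++ptr) {  // walk the pointer
    if (*ptr == value) {                            // dereference to read
      ++count;
    }
  }
  return count;
}
```

Given int array[5] = { 3, 7, 3, 1, 3 }, the call count_equal(array, array + 5, 3) returns 3, and count_equal(array + 1, array + 3, 3) examines only the elements 7 and 3 and returns 1. The end pointer is never dereferenced, so constructing it just past the array is safe.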
Traversal by index is the more common pattern when working with general arrays. However, traversal by pointer is a special case of traversal by iterator, which we saw previously. We will shortly see that traversal by iterator/pointer allows us to write algorithms that work on both library containers and arrays. Thus, both the traversal-by-index and traversal-by-pointer patterns are important to programming in C++.
Aside from providing us insight about memory and how objects are stored, arrays are a fundamental abstraction that can be used to build more complex abstractions. We proceed to see how to use arrays to build data structures such as vectors and sets.
Arrays and const
Since an array does not have a value of its own, it cannot be assigned to as a whole – we saw previously that a compile error would result, since we cannot obtain an array value to place on the right-hand side of the assignment. Thus, it is not meaningful for an array itself to be const.
Similar to a reference, an array may not be const itself, but its elements may be:
const double arr[4] = { 1.1, 2.3, -4.5, 8 };
arr[2] = 3.1; // ERROR -- attempt to assign to const object
The declaration const double arr[4] is read inside out as “arr
is an array of four constant doubles.” The elements can be initialized
through an initializer list, but they may not be modified later
through assignment.
If an array is a member of a class-type object, the array elements inherit the “constness” of the object itself. For example, consider the following:
struct Foo {
int num;
int *ptr;
int arr[4];
};
Like any const object, a const Foo must be initialized upon
creation:
int main() {
int x = 3;
const Foo foo = { 4, &x, { 1, 2, 3, 4 } };
...
}
The array member can be initialized using its own initializer list, which is the same syntax for initializing a local array variable.
Figure 45 Contents of a Foo object. Declaring the object as
const only prohibits modifications to the subobjects
contained within the memory for the object.
As we saw previously, attempting
to modify foo.num or foo.ptr results in a compiler error. The
same is true for the elements of foo.arr:
foo.arr[0] = 2; // ERROR
Since the array element is a subobject of a const object, it cannot be modified.
Strings
A string is a sequence of characters, and it represents text data. C++ has two string abstractions, which we refer to as C-style strings and C++ strings.
C-Style Strings
In the original C language, strings are represented as just an array
of characters, which have the type char. The following initializes
a string representing the characters in the word hello:
char str[6] = { 'h', 'e', 'l', 'l', 'o', '\0' };
Figure 46 Array representation of a string.
Character literals are enclosed in single quotes. For example 'h'
is the character literal corresponding to the lower-case letter h.
The representation of the string in memory is shown in
Figure 46.
A C-style string has a sentinel value at its end, the special null
character, denoted by '\0'. This is not the same as a null
pointer, which is denoted by nullptr, nor the character '0',
which denotes the digit 0. The null character signals the end of the
string, and algorithms on C-style strings rely on its presence to
determine where the string ends.
A character array can also be initialized with a string literal:
char str2[6] = "hello";
char str3[] = "hello";
If the size of the array is specified, it must have sufficient space
for the null terminator. In the second case above, the size of the
array is inferred as 6 from the string literal that is used to
initialize it. A string literal implicitly contains the null
terminator at its end, so both str2 and str3 are initialized
to end with a null terminator.
The char type is an atomic type that is represented by numerical
values. The ASCII standard specifies the numerical values used to
represent each character. For instance, the null character '\0' is
represented by the ASCII value 0, the digit '0' is represented by
the ASCII value 48, and the letter 'h' is represented by the ASCII
value 104. Figure 47 illustrates the ASCII values that
represent the string "hello".
Figure 47 ASCII values of the characters in a string.
An important feature of the ASCII standard is that the digits 0-9 are represented by consecutive values, the capital letters A-Z are also represented by consecutive values, and the lower-case letters a-z as well. The following function determines whether a character is a letter:
bool is_alpha(char ch) {
return (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z');
}
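The same property supports simple character arithmetic. The two helpers below are a sketch, with hypothetical names, that assume an ASCII-compatible character set as described above: subtracting '0' from a digit character gives its numeric value, and shifting by the distance between 'a' and 'A' converts a lower-case letter to upper case.

```cpp
#include <cassert>

// REQUIRES: ch is a digit character, '0' through '9'
// EFFECTS:  Returns the numeric value of the digit, 0 through 9.
// (digit_value is a hypothetical helper for this sketch.)
int digit_value(char ch) {
  assert(ch >= '0' && ch <= '9');
  return ch - '0';  // digits are consecutive, so the difference is the value
}

// EFFECTS: Returns the upper-case version of ch if ch is a lower-case
//          letter; otherwise returns ch unchanged.
// (to_upper is a hypothetical helper; it assumes ASCII letter ordering.)
char to_upper(char ch) {
  if (ch >= 'a' && ch <= 'z') {
    return static_cast<char>(ch - 'a' + 'A');  // same offset within A-Z
  }
  return ch;
}
```

For instance, digit_value('7') is 55 - 48 = 7 in ASCII, and to_upper('h') is 'H'.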
In C++, atomic objects with value 0 are considered to have false truth values, while atomic objects with nonzero values are considered to be true. Thus, the null terminator is the only character that has a false truth value. We will make use of that when implementing algorithms on C-style strings.
Since C-style strings are just arrays, the pitfalls that apply to
arrays also apply to C-style strings. For instance, a char array
turns into a pointer to char when its value is required. Thus,
comparisons and assignments on C-style strings cannot be done with the
built-in operators:
char str1[6] = "hello";
char str2[6] = "hello";
char str3[6] = "apple";
char *ptr = str1; // manually convert array into pointer;
// ptr points to first character of str1
// Test for equality?
str1 == str2; // false; tests pointer equality
// Copy strings?
str1 = str3; // does not compile; RHS turns into pointer
// Copy through pointer?
ptr = str3; // sets ptr to point to first character of str3
When initializing a variable from a string literal, the variable can be an array, in which case the individual characters are initialized from those in the string literal:
char str1[6] = "hello";
The variable can also be a pointer, in which case it just points to
the first character in the string literal itself. String literals are
stored in memory; however, the C++ standard prohibits us from
modifying the memory used to store a string literal. Thus, we must use
the const keyword when specifying the element type of the pointer:
const char *ptr = "hello";
String Traversal and Functions
The conventional pattern for iterating over a C-style string is to use traversal by pointer: walk a pointer across the elements until the end is reached. However, unlike the traversal pattern we saw previously where we already knew the length, we don’t know the end of a C-style string until we reach the null terminator. Thus, we iterate until we reach that sentinel value:
// REQUIRES: str points to a valid, null-terminated string
// EFFECTS: Returns the length of str, not including the null
// terminator.
int strlen(const char *str) {
const char *ptr = str;
while (*ptr != '\0') {
++ptr;
}
return ptr - str;
}
Here, we compute the length of a string by creating a new pointer that points to the first character. We then increment that pointer [7] until reaching the null terminator. Then the distance between that pointer and the original is equal to the number of non-null characters in the string.
The type const char * denotes a pointer that is pointing at
a constant character. This means that the pointed-to character
cannot be modified through the pointer. However, the pointer
itself can be modified to point to a different character, which
is what happens when we increment the pointer.
We can also use the truth value of the null character in the test of the while loop:
int strlen(const char *str) {
const char *ptr = str;
while (*ptr) {
++ptr;
}
return ptr - str;
}
We can also use a for loop, with an empty initialization and body:
int strlen(const char *str) {
const char *ptr = str;
for (; *ptr; ++ptr);
return ptr - str;
}
The standard <cstring> header declares a strlen() function with this behavior (its return type is size_t rather than int).
We saw previously that we cannot copy C-style strings with the assignment operator. Instead, we need to use a function:
// REQUIRES: src points to a valid, null-terminated string;
// dst points to an array with >= strlen(src) + 1 elements
// MODIFIES: *dst
// EFFECTS: Copies the characters from src into dst, including the
// null terminator.
void strcpy(char *dst, const char *src) {
while (*src) {
*dst = *src;
++src;
++dst;
}
*dst = *src; // null terminator
}
The function takes in a destination pointer; the pointed-to type must
be non-const, since the function will modify the elements. The
function does not need to modify the source string, so the
corresponding parameter is a pointer to const char. Then each
non-null character from src is copied into dst. The last line
also copies the null terminator into dst.
The strcpy() function can be written more succinctly by relying on
the behavior of the postfix increment operator. There are two versions
of the increment operator, and their evaluation process is visualized
in Figure 48:
Figure 48 Evaluation process for prefix and postfix increment.
The prefix increment operator, when applied to an atomic object, increments the object and evaluates to the object itself, which now contains the new value:
int x = 3;
cout << ++x; // prints 4
cout << x; // prints 4
The postfix increment operator, when applied to an atomic object, increments the object but evaluates to the old value:
int x = 3;
cout << x++; // prints 3
cout << x; // prints 4
There are also both prefix and postfix versions of the decrement
operator (--).
A word of caution when writing expressions that have side effects, such as increment: in C++, the order in which subexpressions are evaluated within a larger expression is for the most part unspecified. Thus, the following results in implementation-dependent behavior:
int x = 3;
cout << ++x << "," << x; // can print 4,4 or 4,3
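If the second x in the print statement is evaluated before
++x, then a 3 will be printed out for its value. On the other
hand, if the second x is evaluated after ++x, a 4 will be
printed out for its value. (C++17 and later do require the operands of
<< to be evaluated left to right, so this particular statement prints
4,4 under those standards, but most other operators still leave the
order unspecified.) Code like this, where a single statement
contains two subexpressions that use the same variable but at least
one modifies it, should be avoided.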
Another feature that our shorter version of strcpy() will rely on
is that an assignment evaluates back to the left-hand-side object:
int x = 3;
int y = -4;
++(x = y); // copies -4 into x, then increments x
cout << x; // prints -3
cout << (y = x); // prints -3
The succinct version of strcpy() is as follows:
void strcpy(char *dst, const char *src) {
while (*dst++ = *src++);
}
The test increments both pointers, but since it is using postfix
increment, the expressions themselves evaluate to the old values.
Thus, in the first iteration, dst++ and src++ evaluate to the
addresses of the first character in each string. The rest of the test
expression dereferences the pointers and copies the source value to
the destination. The assignment then evaluates to the left-hand-side
object, so the test checks the truth value of that object’s value. As
long as the character that was copied was not the null terminator, it
will be true, and the loop will continue on to the next character.
When the null terminator is reached, the assignment copies it to the
destination but then produces a false value, so the loop terminates
immediately after copying over the null terminator.
The <cstring> library also contains a version of strcpy().
Printing C-Style Strings
Previously, we saw that printing out an array prints out the address of its first element, since the array turns into a pointer. Printing out a pointer just prints out the address value contained in the pointer.
On the other hand, C++ output streams have special treatment of
pointers to char. If a pointer to char is passed to cout,
it will assume that the pointer is pointing into a C-style string and
print out every character until it reaches a null terminator:
char str[] = "hello";
char *ptr = str;
cout << ptr; // prints out hello
cout << str; // str turns into a pointer; prints out hello
This means that we must ensure that a char * is actually pointing
to a null-terminated string before passing it to cout. The
following results in undefined behavior:
char array[] = { 'h', 'e', 'l', 'l', 'o' }; // not null-terminated
char ch = 'w'; // just a character
cout << array; // undefined behavior -- dereferences past end of array
cout << &ch; // undefined behavior -- dereferences past ch
To print out the address value of a char *, we must convert it into
a void *, which is a pointer that can point to any kind of object:
cout << static_cast<void *>(&ch); // prints address of ch
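The difference between the two treatments can be observed by writing to a string stream instead of cout. The sketch below assumes std::ostringstream from <sstream>, which behaves like cout but collects its output in a string; the helper names are hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// REQUIRES: cstr points to a valid, null-terminated string
// EFFECTS:  Returns the text an output stream produces for a char pointer.
// (printed_as_string is a hypothetical helper for this sketch.)
std::string printed_as_string(const char *cstr) {
  assert(cstr != nullptr);
  std::ostringstream out;
  out << cstr;  // treated as a C-style string: prints the characters
  return out.str();
}

// EFFECTS: Returns the text an output stream produces once the pointer
//          has been converted to a pointer to void.
// (printed_as_address is a hypothetical helper for this sketch.)
std::string printed_as_address(const char *cstr) {
  std::ostringstream out;
  out << static_cast<const void *>(cstr);  // prints the address value
  return out.str();
}
```

With char str[] = "hello", printed_as_string(str) yields the characters hello, while printed_as_address(str) yields an implementation-specific rendering of the address of the first character.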
C++ Strings
C++ strings are class-type objects represented by the string type
[8]. They are not arrays, though the implementation may use arrays
under the hood. Thus, C++ strings are to C-style strings as vectors
are to built-in arrays.
Technically, string is an alias for basic_string<char>,
so you may see the latter in compiler errors.
The following table compares C-style and C++ strings:
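| | C-Style Strings | C++ Strings |
|---|---|---|
| Library Header | <cstring> | <string> |
| Declaration | char cstr[] = "hello"; or const char *cstr = "hello"; | string str = "hello"; |
| Length | strlen(cstr) | str.length() |
| Copy Value | strcpy(cstr1, cstr2) | str1 = str2 |
| Indexing | cstr[i] | str[i] |
| Concatenate | strcat(cstr1, cstr2) | str1 += str2 |
| Compare | strcmp(cstr1, cstr2) | str1 == str2, str1 < str2, etc. |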
A C++ string can be converted into a C-style string by calling
.c_str() on it:
const char *cstr = str.c_str();
A C-style string can be converted into a C++ string by explicitly or
implicitly calling the string constructor:
string str1 = string(cstr); // explicit call
string str = cstr; // implicit call
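The two conversions can be combined in one small sketch. The helper names are hypothetical, and the code spells out std::string and std::strlen with explicit qualification so that the block stands alone.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// EFFECTS: Returns the length of s as measured by strlen() on the
//          C-style string obtained from c_str().
// (cstr_length is a hypothetical helper for this sketch.)
int cstr_length(const std::string &s) {
  const char *cstr = s.c_str();  // C++ string to C-style string
  return static_cast<int>(std::strlen(cstr));
}

// REQUIRES: cstr points to a valid, null-terminated string
// EFFECTS:  Returns a C++ string holding the characters of cstr followed
//           by an exclamation mark.
// (with_exclamation is a hypothetical helper for this sketch.)
std::string with_exclamation(const char *cstr) {
  assert(cstr != nullptr);
  std::string result = cstr;  // implicit call to the string constructor
  result += "!";              // C++ strings support concatenation directly
  return result;
}
```

The pointer returned by c_str() refers to memory owned by the string object, so it should only be used while that object is alive and unmodified.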
C++ strings can be compared with the built-in comparison operators, which compare them lexicographically: the ASCII values of elements are compared one by one, and if the two strings differ in a character, then the string whose character has a lower ASCII value is considered less than the other. If one string is a prefix of the other, then the shorter one is less than the longer (which results from comparing the ASCII value of the null terminator to a non-null character).
C-style strings cannot be compared with the built-in operators – these
would just do pointer comparisons. Instead, the strcmp() function
can be used, and strcmp(str1, str2) returns:
a negative value if str1 is lexicographically less than str2
a positive value if str1 is lexicographically greater than str2
0 if the two strings have equal values
The expression !strcmp(str1, str2) is often used to check for
equality – if the two strings are equal, strcmp() returns 0,
which has truth value false.
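To show how such a comparison can be built from traversal by pointer, here is a sketch of a function with the same contract as strcmp(). It is a hypothetical stand-in named compare_cstr, not the library's actual implementation.

```cpp
#include <cassert>

// REQUIRES: str1 and str2 point to valid, null-terminated strings
// EFFECTS:  Returns a negative value if str1 is lexicographically less
//           than str2, a positive value if it is greater, and 0 if the
//           two strings are equal.
// (compare_cstr is a hypothetical stand-in for strcmp.)
int compare_cstr(const char *str1, const char *str2) {
  assert(str1 != nullptr && str2 != nullptr);
  // Walk both pointers forward while the characters match and the end of
  // str1 has not been reached.
  while (*str1 && *str1 == *str2) {
    ++str1;
    ++str2;
  }
  // Either the characters differ or both are null terminators; the
  // difference of their numeric values gives the sign of the result.
  return static_cast<unsigned char>(*str1) - static_cast<unsigned char>(*str2);
}
```

When one string is a prefix of the other, the loop stops at the shorter string's null terminator, whose value 0 is less than that of any other character, so the shorter string compares as less.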