Courses/CS 2114/Lab Manual/Vectors

Vectors

Introduction

This lab will explore the use of std::vector in C++. Elementary array processing algorithms will be discussed and the passing of vectors to/from functions will be examined. Students are encouraged to build a program using the code provided in the tutorial.

Topics Covered in this Lab:
• Standard Template Library (STL)
• vector declarations
• vector methods
• iterating through vectors
• index
• range-based for
• iterators
• passing vectors to functions
• returning vectors from functions
• What is the Standard Template Library?
• What is a vector?
• What is a vector declaration?
• What happens if an index is out-of-bounds?
Demonstrable Skills Acquired in this Lab:
• ability to declare and use vector containers
• understanding of how vectors work as function parameters

Introduction to the Standard Template Library (STL)

The Standard Template Library (STL) is a library to provide common, generic classes and functions that implement many commonly used algorithms and data structures.

The architecture of STL is largely the creation of Alexander Stepanov. In 1979 he began working out his initial ideas of generic programming and exploring their potential for revolutionizing software development. Although David Musser had developed and advocated some aspects of generic programming already by year 1971, it was limited to a rather specialized area of software development (computer algebra).

Stepanov recognized the full potential for generic programming and persuaded his then-colleagues at General Electric Research and Development (including, primarily, David Musser and Deepak Kapur) that generic programming should be pursued as a comprehensive basis for software development. At the time there was no real support in any programming language for generic programming.

(Historical information found on Wikipedia.org)

The STL contains:

• Containers - A container is an abstract data structure. Some examples of containers defined in the STL are vectors, maps and queues. We will discuss vectors in this lab.
• Iterators - iterators provide a way to traverse and access data stored in containers.
• Algorithms - The algorithms library contains functions which implement several common algorithms. These algorithms are designed to be used across a range of elements, which can be found in arrays and STL containers. There are algorithms for such actions as sorting, searching, reversing, etc. We will discuss a few of these algorithms in a future lab.

Introduction to the vector Container

Chapter 7 of the Gaddis book discusses the array, a basic data structure. While it can be useful in many ways, it also has some drawbacks. One such drawback is the inability to define an array using a size determined by the data to be used during a particular run of the program; the size of an array must be specified at compile time and so, cannot be determined at run time.

While there are ways around such a limitation, complexity can become an issue. In order to establish an array whose size is determined at run time, the programmer would need to implement dynamic arrays. The burden of reserving the correct amount of memory and releasing said memory before the end of the program then falls to the programmer. The next limitation is that the array cannot change in size once it has been defined. Once again, the use of dynamic arrays can get around this limitation, but the program is getting more complex.

The vector container is an advanced data structure that implements dynamic arrays. They are similar in design to the primitive array in that the elements of the vector are stored in contiguous memory and the data can be traversed or a single element accessed. The biggest difference is the vector's ability to change size with ease throughout the run of a program.

The following tutorial demonstrates the use of several, but not all, vector member functions. Students are encouraged to research all member functions of the vector container after completing this tutorial.

In order to use a vector, the vector library must be made accessible to the program and, in particular, the vector template class. Be sure to add the following to the beginning of any program using vectors. Code Illustration
#include<vector>
using std::vector;

vector Declarations

There are many different ways to define a vector. The following examples employ the modern Uniform Initialization Syntax. It is left to the reader to research legacy syntax.

If so desired, a vector can be defined without specifying its size using the following syntax.

vector<dataType> identifier;

Following this syntax, the code below creates the vector petsInHousehold as an empty vector. Code Illustration
vector<int> petsInHousehold;

An empty vector is one in which there are no elements in use (size is 0). There may or may not be memory reserved for data (capacity) as this is implementation dependent. If there is memory reserved, it is not initialized.

The size of the vector can be found by calling the size() member function which always indicates the number of storage locations currently in use. The capacity() member function indicates the number of storage locations currently available for use by the vector. The max_size() member function indicates the maximum number items that can be stored in the vector.

To see these values, cout the values returned by these member functions. Code Illustration
cout << "\nInformation regarding vector petsInHousehold:\n";
cout << "capacity: "     << petsInHousehold.capacity() << endl;
cout << "size: "         << petsInHousehold.size()     << endl;
cout << "maximum size: " << petsInHousehold.max_size() << endl; Try adding the following statement to the program anywhere prior to the statement that will display the maximum size: Code Illustration
cout.imbue(std::locale("en_US"));

This will make the extremely large number of maximum size easier for the human eye to read.

The following syntax options may be used to define a vector with a specified number of elements immediately available:

vector<dataType> identifier;
identifier.resize(number of elements);

The resize() member function changes the size of the vector to the number indicated by altering the amount of memory allocated to the vector.

Following this syntax produces the following code. Code Illustration
vector<int> ages;
ages.resize(10);

The first statement creates an empty vector named ages. Since ages was created as an empty vector, the call to resize() will cause storage for 10 elements to be added to the vector.

To see the size and capacity of this vector, Code Illustration
cout << "\nInformation regarding vector ages:\n";
cout << "capacity: "     << ages.capacity() << endl;
cout << "size: "         << ages.size()     << endl;
cout << "maximum size: " << ages.max_size() << endl;

Both the capacity and size of ages is currently 10 as the definition specifies that 10 elements be immediately available. The maximum size will be dependent on the hardware.

Alternatively, the size of the vector could be established at run time using the following code: Code Illustration
vector<int>::size_type numberOfPeople;

cout << "\nEnter the number of people:  ";
cin >> numberOfPeople;

vector<int> ages;
ages.resize(numberOfPeople);

where vector<int>::size_type is a data type defined for the vector container which can be used to indicate the size of the vector and allows access to the data via an index. It is typically either an unsigned int or an unsigned long.

For each of these definitions, there would be storage for the specified number of integers and each element would be initialized to 0.

Initializing with Known Values

Another way to define a vector allows for the initialization of the elements in the vector definition. The following syntax demonstrates using Uniform Initialization Syntax.

vector<dataType> identifier{value_list};

The value_list is a comma separated list of values appropriate for storage in the vector surrounded by curly braces.

As an example, the following code would create a vector named caloriesConsumed and fill it with the values listed. Code Illustration
vector<int> caloriesConsumed{360, 480, 100, 650};

What would be the size and capacity of the caloriesConsumed vector?

Adding Data to a Vector with Existing Elements (size() > 0)

If you have a vector that has a specified size, such as the ages vector declared as: Code Illustration
vector<int> ages;
ages.resize(10);
Using an Index and the [ ] Operator (Array Notation)

You can store the age of each individual in a group using the following code: Code Illustration
cout << "\nEnter the ages of " << numberOfPeople << " people:" << endl;
for (vector<int>::size_type i = 0; i < ages.size(); ++i)
{
cout << "Enter an age:  ";
cin  >> ages[i];
}

This loop begins at an index of 0 and continues looping as long as the index is less than the number of values stored in the vector as obtained by the call to the size() member function. The index of the first element is 0 as the index indicates the distance of the desired element away from the beginning of the data. Since the first element is accessed with an index of 0, the last element will be accessed with an index of size-1. Each iteration stores the value entered by the user in the element accessed by the index for that iteration.

Using the at() Member Function

However, using the [ ] operator to access memory locations within a vector requires the programmer to take responsibility for checking the bounds of the data, the first actual memory location and the last actual memory location. In other words, the integer values within the square brackets are not checked by the compiler to ensure they are accessing data within the actual beginning and ending memory locations of the vector. Integer values that cause the computer to access memory outside the data result in out-of-bounds, semantic errors. To avoid such errors and to allow the computer to do bounds checking, a vector's at member function should be utilized. Code Illustration
cout << "\nEnter the ages of " << numberOfPeople << " people:" << endl;
for (vector<int>::size_type i = 0; i < ages.size(); ++i)
{
cout << "Enter an age:  ";
cin  >> ages.at(i);
}

Adding Data to the End of a Vector

For an empty vector, or when adding more data than what has been reserved, the push_back() member function will append data to the end of the data already stored in memory. The following code demonstrates the use of push_back() using the petsInHousehold vector from above. Recall that it has a size of 0. Code Illustration
cout << "\nEnter the number of pets in each of 10 households:\n";
for (int i = 0; i < 10; ++i)
{
int petCount;
cout << "Enter the number of pets in this household:  ";
cin >> petCount;

petsInHousehold.push_back(petCount);
}

The user would be prompted for the number of pets for ten households. Each number entered by the user would be stored temporarily in petCount and then stored in the vector petsInHousehold using the push_back() member function. The first numbered entered by the user would occupy the first storage location, the second number entered by the user would occupy the second storage location and so on with the last number entered by the user being the last number in the stored data.

What is the size and capacity of the petsInHousehold vector now?

In order to add data to a vector anywhere except at the end, the built-in iterators must be employed. The begin() member function returns the location of the first storage location. With this information, it is possible to add data to a specific location in the vector. Code Illustration
ages.insert(ages.begin() + 3, 101);
petsInHousehold.emplace(petsInHousehold.begin() + petsInHousehold.size()/2,3);

Both of the above lines of code will add a value somewhere within the specified vector. The first line of code will determine the fist storage location and then move over 3 elements before inserting 101. After this statement executes, 101 would be stored in the fourth element. The value that was stored in that element along with all following values will now follow the newly inserted 101. The same type of thing happens with the second line of code.

While it seems that there are two functions that do the same identical thing, the insert() member function has additional capabilities. It is left to the reader to research these additional capabilities.

Removing Data from a Vector

The pop_back() member function will remove the last element stored in the vector. The erase() member function will erase a single element or a range of elements; these elements are indicated using built-in iterators. Code Illustration
ages.pop_back();
petsInHousehold.erase(petsInHousehold.begin());
ages.erase(ages.begin()+1,ages.begin()+4);

The first line of the above code will remove the last value that was stored in the ages vector. The second line of code will remove the first value stored in the petsInHousehold vector.

The last line specifies a range beginning with the second memory location and ending with the fifth memory location. Note that the range of the erase() member function is inclusive for the beginning location but exclusive for the ending location so the second, third and fourth values will be removed from the ages vector. (This statement is used with the understanding that the ages vector should have a size greater than 4. What happens when this is not the case?)

Navigating Through and Accessing Data in a Vector

Using Indices

As has already been seen, it is possible to navigate through the data in a vector using an index just as the data of an array can be accessed via an index.

The code Code Illustration
cout << "\nThe ages of the people in the order they are stored:\n";
for (vector<int>::size_type i = 0; i < ages.size(); ++i)
cout << ages.at(i) << endl;

would produce a listing of the ages in the order in which they are stored with no possibility of going out-of-bounds.

Data values may be altered with the use of an index as shown below. The code Code Illustration
ages.at(0) = 21;

would change the value stored in the first storage location to 21.

Using Built-In Iterators

The data of a vector can also be traversed by using the iterators built-in to the vector container. Code Illustration
cout << "\nThe contents of the petsInHousehold vector via an iterator:\n";
for (vector<int>::iterator it = petsInHousehold.begin(); it != petsInHousehold.end(); ++it)
cout << (*it) << endl;

The identifier it is similar to i; where i is an identifier commonly used to represent the index of an array or a counter in a loop, it is an identifier commonly used to represent an iterator of a container.

An iterator "points to" the value stored in the vector. The '*' dereference operator must be employed to access the actual value pointed to by the iterator.

Recall that the begin() member function "points to" the first value stored in the vector. The end() member function "points to" the location just passed the last value stored in the vector. So the vector's range when described by begin() and end() would be inclusive for begin() and exclusive for end(). *petsInHousehold.end() would actually be an out-of-bounds error; therefor the loop continues as long as the iterator is not end().

Using the range-based for

Another way to do so is to use C++'s range-based for (a.k.a ranged for).

An examination of the syntax:

for (dataType identifier : containerIdentifier)
{
//statement
}

where

• The keyword for indicates a loop.
• The dataType is that of the container that will be traversed.
• The identifier is any valid identifier.
• containerIdentifier is the name of the container to be traversed.

Using the range-based for, the programmer does not have to keep track of the position in the vector as the compiler handles the use of the built-in iterators behind the scenes.

Following this syntax would produce the code below. Code Illustration
cout << "\nThe number of pets in each household, in the order entered:\n";
for (int numberOfPets : petsInHousehold)
cout << numberOfPets << endl;

The identifier numberOfPets has a data type of int. It will temporarily store each value in the petsInHousehold vector beginning with the first value and ending with the last value.

The first iteration of the loop will cause numberOfPets to receive a copy of the first value found in the petsInHousehold vector. The second iteration of the loop, numberOfPets will contain the second value in the petsInHousehold vector. And so on, with the last iteration of the loop resulting in numberOfPets containing the value stored in the last element of the petsInHousehold vector.

Since the variable declared within the for is getting a copy of the values from the vector, in order to modify the contents of the vector using a range-based for, the declaration within the loop should be a reference. Recall that the & operator is the 'address of' operator and is used to indicate a reference variable.

The following code could be used to reduce all of the ages by 2. Code Illustration
for (int &age : ages)
age = age - 2;

Vectors and Functions

The following is an example of a function definition in which a vector is passed as an argument. Code Illustration
void displayData(vector<int> vectorData)
{
for (int elementValue : vectorData)
cout << elementValue << endl;
}

To execute this function, use the following function call. Code Illustration
displayData(ages);

The vector in the above example is passed by value which means a copy of the vector is created for the function. Recall that a vector is an advanced data structure, an object. Passing such a structure by value can be problematic if the object contains a large amount of data. Rather than copying potentially large amounts of data, pass the vector as a reference parameter. Be sure to include the keyword const when the function should not be able to alter the data within the vector.

The preferred version of the above function would be Code Illustration
void displayData(const vector<int> &vectorData)
{
for (int elementValue : vectorData)
cout << elementValue << endl;
}

The displayData() function can now read the data in the vector but cannot alter the data in the vector. (The function call remains the same.)

The same effect can be accomplished with the regular for loop using the at() member function. Code Illustration
void displayData(const vector<int> &vectorData)
{
for (vector<int>::size_type i = 0; i < vectorData.size(); ++i)
cout << vectorData.at(i) << endl;
}

Using Iterators with Constant Vectors

The for loop using an iterator, as described above, creates a way to access and modify the data inside the vector. If a function can modify the vector, this loop will still work inside the function. However, the use of const restricts the function from modifying the values so the iterated for loop from above no longer works. To use an iterator to traverse the vector, the use of a const_iterator will be needed. Code Illustration
void displayData(const vector<int> &vectorData)
{
cout << "\nIterating through a constant vector:\n";
for (vector<int>::const_iterator it = vectorData.begin(); it != vectorData.end(); ++it)
cout << (*it) << endl;
}

Using auto for Looping

Recall that auto can be used to allow the compiler to determine the necessary data type of a variable. While auto should not be used when defining simple, primitive data types, it comes in handy when more advanced data structures are encountered. It is an acceptable practice to use auto to traverse a vector. Consider the following change to the for loop discussed above. Code Illustration
void displayData(const vector<int> &vectorData)
{
cout << "\nIterating through a constant vector using auto:\n";
for (auto it = vectorData.begin(); it != vectorData.end(); ++it)
cout << (*it) << endl;
}
Use the displayData() function to display vector values after each alteration shown above to verify what actually occurs as each statement executes.