Difference between revisions of "User:Jason.wilkins/Laboratory for Sets"

From A-State Computer Science Wiki
Jump to: navigation, search
Line 56: Line 56:
 
For simplicity for the rest of this lab we are only going to only consider the letters from <math>A</math> to <math>G</math> to be the universe.  This set can be denoted as <math>\{ A, B, C, D, E, F, G \}</math>.  A <tt>Set</tt> implemented for this universe in C++ will have to keep track of whether each letter is a member of the <tt>Set</tt> or not.  This can be done by keeping an array of seven boolean values where if an array element is set to <tt>true</tt> then the corresponding set element is a member of the set.
 
For simplicity for the rest of this lab we are only going to only consider the letters from <math>A</math> to <math>G</math> to be the universe.  This set can be denoted as <math>\{ A, B, C, D, E, F, G \}</math>.  A <tt>Set</tt> implemented for this universe in C++ will have to keep track of whether each letter is a member of the <tt>Set</tt> or not.  This can be done by keeping an array of seven boolean values where if an array element is set to <tt>true</tt> then the corresponding set element is a member of the set.
  
<block>
+
 
 +
<blockquote>
 
Create the files <b>Set.h</b> and <b>Set.cpp</b> now.  Implement a constructor for <tt>Set</tt> takes no arguments and creates a set with no members (the empty set).
 
Create the files <b>Set.h</b> and <b>Set.cpp</b> now.  Implement a constructor for <tt>Set</tt> takes no arguments and creates a set with no members (the empty set).
</block>
+
</blockquote>
 +
 
  
 
It is bad programming style to use the literal number <tt>7</tt> in all the places in the code that will access the boolean values of the <tt>Set</tt> class.  If ever the number of elements is changed then it will need to be changed in many places and this introduces the chance that one of the places will be missed.  Values scattered throughout the code like this are called <i>magic numbers</i>.  It is better to make the number of elements into a constant static variable that is given a good descriptive name such as <tt>element_count</tt>.
 
It is bad programming style to use the literal number <tt>7</tt> in all the places in the code that will access the boolean values of the <tt>Set</tt> class.  If ever the number of elements is changed then it will need to be changed in many places and this introduces the chance that one of the places will be missed.  Values scattered throughout the code like this are called <i>magic numbers</i>.  It is better to make the number of elements into a constant static variable that is given a good descriptive name such as <tt>element_count</tt>.
 +
  
 
==Adding Elements==
 
==Adding Elements==
  
 
It would be nice to be able to use C++ character literals like <tt>'A'</tt> and <tt>'E'</tt> to stand for the elements <math>A</math> and <math>E</math>.  Recall that character literals in C++ are really just a convenient way to write ASCII character codes [www.asciitable.com].  For example, <tt>'A'</tt> is equal to to the integer <tt>65</tt>.  Also, the codes <tt>'A'</tt> thru <tt>'G'</tt> are sequential.  Remember that the character constants <tt>'a'</tt> thru <tt>'b'</tt> have different values.
 
It would be nice to be able to use C++ character literals like <tt>'A'</tt> and <tt>'E'</tt> to stand for the elements <math>A</math> and <math>E</math>.  Recall that character literals in C++ are really just a convenient way to write ASCII character codes [www.asciitable.com].  For example, <tt>'A'</tt> is equal to to the integer <tt>65</tt>.  Also, the codes <tt>'A'</tt> thru <tt>'G'</tt> are sequential.  Remember that the character constants <tt>'a'</tt> thru <tt>'b'</tt> have different values.
 +
  
 
However, the boolean variables that are a part of <tt>Set</tt> have indexes from <tt>0</tt> to <tt>6</tt>.  Because character literals are really numbers it is easy to map them to  indexes by using subtraction.  <tt>'A' - 65</tt> is 0 and <tt>'G' - 65</tt> is 6, but even better this can be written as <tt>'A' - 'A'</tt> and <tt>'G' - 'A'</tt> which eliminates the magic number <tt>65</tt>.
 
However, the boolean variables that are a part of <tt>Set</tt> have indexes from <tt>0</tt> to <tt>6</tt>.  Because character literals are really numbers it is easy to map them to  indexes by using subtraction.  <tt>'A' - 65</tt> is 0 and <tt>'G' - 65</tt> is 6, but even better this can be written as <tt>'A' - 'A'</tt> and <tt>'G' - 'A'</tt> which eliminates the magic number <tt>65</tt>.
  
<block>
+
 
 +
<blockquote>
 
Now, implement a member function <tt>add</tt> for <tt>Set</tt> that takes a <tt>char</tt> and sets the corresponding boolean in <tt>Set</tt> to true.  The member function does not need to return a value.  The function should check that the calculated index is between zero and six and print out a useful warning message if it is not.   
 
Now, implement a member function <tt>add</tt> for <tt>Set</tt> that takes a <tt>char</tt> and sets the corresponding boolean in <tt>Set</tt> to true.  The member function does not need to return a value.  The function should check that the calculated index is between zero and six and print out a useful warning message if it is not.   
</block>
+
</blockquote>
 +
 
  
 
There is a function declared as <tt>char toupper(char)</tt> declared in the standard header file named <tt>cctype</tt> that can be used to convert a lower-case letter to an upper-case letter.  For example <tt>toupper('a') == 'A'</tt> is <tt>true</tt>.  If a character is not a lower case letter then <tt>toupper</tt> just returns the character it was given.
 
There is a function declared as <tt>char toupper(char)</tt> declared in the standard header file named <tt>cctype</tt> that can be used to convert a lower-case letter to an upper-case letter.  For example <tt>toupper('a') == 'A'</tt> is <tt>true</tt>.  If a character is not a lower case letter then <tt>toupper</tt> just returns the character it was given.
  
<block>
+
 
 +
<blockquote>
 
Now, modify <tt>add</tt> to take both lower and upper case letters by converting all characters passed into <tt>add</tt> into upper-case letters.
 
Now, modify <tt>add</tt> to take both lower and upper case letters by converting all characters passed into <tt>add</tt> into upper-case letters.
</block>
+
</blockquote>
 +
 
  
  

Revision as of 11:42, 19 April 2010

A portion of this lab is to be done during the scheduled lab time. The take-home programming assignment is to be turned in before the next lab; see the lab website. The in-lab portion is worth 40% of the lab credit; the programming assignment is worth the other 60%. See the website for details on how the programming assignment will be graded. You are not responsible for user errors in input unless specified in the assignment. Feedback will be provided explaining your grade on each assignment.

It is important that you complete each step before going on to the next, as the exercises build upon one another. You will find it helpful to diagram the action of each method or function as you go along. If you have difficulty in some step, DO NOT proceed before resolving it; seek assistance from your lab proctor. You will not be able to fully appreciate the remaining content of the lab and you are likely to compound the problem.

Introduction

In this lab a Set class is developed using the skills learned in previous labs. The instructions given here are less specific and intended more as general guidance.

Topics Covered in this Lab:
  • Sets
  • Set operations
Questions Answered in this Lab:
  • What is a set?
  • How might a set class be designed for C++?
  • What operations can be performed with sets?
  • How might set operations be implemented as part of a class?
Demonstrable Skills Acquired in this Lab:
  • An understanding of sets and set operations
  • How one might implement a class given less specific instructions
Createproject.pngCreate a project oop14iL

Sets

A set is a collection of mathematical objects (As used in mathematics, object is just a way to say "thing", so a set is a collection of things. This is a distinct idea from C++ objects.). A mathematical object is either a member of a set or not, so it does not make sense for an object to be in a set more than once. However, an object can be a member of multiple sets at once. Sets are themselves consider to be mathematical objects, so it is possible to have sets that contain other sets.


To avoid ambiguity, from this point mathematical objects spoken of in relation to sets will be called elements from this point onward.


Sets are commonly denoted by using curly braces in one of two ways. The first is to exhaustively list every element of a set within the curly braces. For example, the set containing the integers between 1 and 10 could be written as \{ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 \}. However most sets are too large to define this way and may even contain an infinite number of elements and are denoted using what is referred to as set builder notation. An example would be all even integers which can be written as \{x|\text{where }k \in \mathbb{I}, x = 2k\}, and which is read as "x such that where k is an integer, x is two times k." Sometimes, infinite sets are more intuitively denoted by using ellipsis, for example, all integers could be written \{ \dots , -2, -1, 0, 1, 2, \dots \} and even integers as \{ \dots , -4, -2, 0, 2, 4, \dots \}.


The "\in" operator, pronounced "element of," used in the previous example is a relational operator that is not built into C++. It returns true if the element given on the left is a member of the set given on the right. Also mentioned in the previous example is the the set of all integers which is so commonly used that it has a shorthand notation "\mathbb{I}."


The set containing all the mathematical objects under consideration is called the universe. For example, if the real numbers are being considered then the universe is the infinite set \mathbb{R} containing all reals.


Summary

  • Sets are collections of elements.
  • Each given element is either is a member or not a member of a given set.
  • The relational operator "\in" and "\notin" are used to test set membership.
  • An element cannot appear in a set twice.
  • An element can be in more than one set at once.
  • A set is denoted by listing all the elements inside of curly braces.
  • Infinite sets are denoted using set builder notation or ellipsis.

More information about sets can be found on the Internet.

  • Wikipedia [1]
  • Wolfram MathWorld [2]


Implementing a Set Class

For simplicity for the rest of this lab we are only going to only consider the letters from A to G to be the universe. This set can be denoted as \{ A, B, C, D, E, F, G \}. A Set implemented for this universe in C++ will have to keep track of whether each letter is a member of the Set or not. This can be done by keeping an array of seven boolean values where if an array element is set to true then the corresponding set element is a member of the set.


Create the files Set.h and Set.cpp now. Implement a constructor for Set takes no arguments and creates a set with no members (the empty set).


It is bad programming style to use the literal number 7 in all the places in the code that will access the boolean values of the Set class. If ever the number of elements is changed then it will need to be changed in many places and this introduces the chance that one of the places will be missed. Values scattered throughout the code like this are called magic numbers. It is better to make the number of elements into a constant static variable that is given a good descriptive name such as element_count.


Adding Elements

It would be nice to be able to use C++ character literals like 'A' and 'E' to stand for the elements A and E. Recall that character literals in C++ are really just a convenient way to write ASCII character codes [www.asciitable.com]. For example, 'A' is equal to to the integer 65. Also, the codes 'A' thru 'G' are sequential. Remember that the character constants 'a' thru 'b' have different values.


However, the boolean variables that are a part of Set have indexes from 0 to 6. Because character literals are really numbers it is easy to map them to indexes by using subtraction. 'A' - 65 is 0 and 'G' - 65 is 6, but even better this can be written as 'A' - 'A' and 'G' - 'A' which eliminates the magic number 65.


Now, implement a member function add for Set that takes a char and sets the corresponding boolean in Set to true. The member function does not need to return a value. The function should check that the calculated index is between zero and six and print out a useful warning message if it is not.


There is a function declared as char toupper(char) declared in the standard header file named cctype that can be used to convert a lower-case letter to an upper-case letter. For example toupper('a') == 'A' is true. If a character is not a lower case letter then toupper just returns the character it was given.


Now, modify add to take both lower and upper case letters by converting all characters passed into add into upper-case letters.




Implementing \in

Now


Adding Elements

Verbatimcode.pngCode Illustration
   int a [100];
   int size = 0;
   cout << "Enter an integer (-1 to end): ";
   std::istream_iterator <int> input (cin);
   while (*input != -1)  // dereference
   {
      a[size++] = *input;  // dereference
      cout << "Enter an integer (-1 to end): ";
      input++;  // increment
   }  // end while


Notice how iterator input is associated with input stream cin. Notice also how input is dereferenced and incremented as a pointer variable might be in a conventional C++ program.

Add a short loop to function main to print the contents of array a to the standard output cout; print one space between each pair of values.


Labcheckpoint.png FOR IN-LAB CREDIT: Demonstrate this output for the lab instructor.


The STL Algorithm

An STL algorithm can be used to print the contents of the array as well; add an include for the header file algorithm to file master.h and add the following to function main.


Verbatimcode.pngCode Illustration
   std::ostream_iterator <int> output (cout, " ");
   std::copy (a, a+size, output);
   cout << endl;


The STL algorithm copy will move everything from the location specified in the first parameter up to, but not including, the location specified in its second parameter to its third parameter. Since a is the name of an array, it is essentially the address of the first array element; the addition of size to a results in a location one past the last element stored in a. Notice how iterator output is associated with cout; the character-string parameter specifies what should be written between values.


The STL Container

An STL container can be used to store and manipulate data. In particular, the STL vector container can be used where an array might be used in a conventional C++ program. Add an include for the header file vector to file master.h and add the following to function main.


Verbatimcode.pngCode Illustration
   std::vector <int> v;
   cout << "\n\nEnter # for max subsequence search (-999 to end): ";
   std::istream_iterator <int> input2 (cin);
   while (*input2 != -999)
   {
      v.push_back (*input2);
      cout << "Enter # for max subsequence search (-999 to end): ";
      ++input2;
   }  // end while


The STL vector object v will be filled with numbers from the keyboard via another istream_iterator object, input2. The vector member function push_back is used to add data to the end of its object.

The values stored in a vector may be manipulated using an iterator. Add the following to function main.


Verbatimcode.pngCode Illustration
   std::vector<int>::const_iterator vIndx;
   for (vIndx = v.begin(); vIndx != v.end(); vIndx++)
      cout << *vIndx << ' ';
   cout << endl;


The vector member function begin returns a reference to the first item in the container; function end returns a reference to just past the last item in the container, hence the use of the != operator.

The same effect may be achieved with the copy algorithm; add the following to function main.


Verbatimcode.pngCode Illustration
   std::copy (v.begin(), v.end(), output);
   cout << endl;


Note here again how copy uses references to the first item and one past the last item in the container. FOR IN-LAB CREDIT: Demonstrate the results of copy for the lab instructor.


Example Problem: Maximal Subsequence

Consider now the problem of finding the subsequence in a series of values whose sum is maximal. A vector will be used to implement a search for this subsequence. A brute-force algorithm is developed here; a more elegant solution is to be devised in the homework.

Consider for example a sequence of five values: a_1, a_2, a_3, a_4 and a_5. There are five subsequences starting with a_1, and each of their sums must be considered: a_1 by itself, a_1 + a_2, a_1 + a_2 + a_3, a_1 + a_2 + a_3 + a_4, and a_1 + a_2 + a_3 + a_4 + a_5. Similarly, there are four subsequences starting with a_2, three with a_3, two with a_4 and just one with a_5 (a_5 by itself). In general, there will be n + (n-1) + (n-2) + \ldots + 1 = n(n+1)/2 subsequences to consider for n values. A nested loop can be written to generate these subsequences; the outer loop will keep track of the starting location while the inner loop keeps track of the ending location. Two iterators shall be used for this purpose. Add the following to function main.


Verbatimcode.pngCode Illustration
   std::vector <int>::const_iterator beginIndx;
   std::vector <int>::const_iterator endIndx;

   for (beginIndx = v.begin(); beginIndx != v.end(); beginIndx++)
      for (endIndx = beginIndx; endIndx != v.end(); endIndx++)
         // body;


The body of this loop must use another iterator, e. g., sumIndx, to move between beginIndx and endIndx, summing the elements between them. This sum should be compared to a running maximum sum; if it is found to be greater than the current maximum, the sum should be stored as the new maximum and the current beginning and ending indices should be saved. This being done, the results can be displayed with the following addition to function main.


Verbatimcode.pngCode Illustration
   cout << "maximal subsequence is ";
   std::copy (maxBeginIndx, maxEndIndx+1, output);
   cout << endl;


The iterators maxBeginIndx and maxEndIndx in the above code fragment are defined like beginIndx and endIndx in the preceding fragment. Note that maxEndIndx+1 is passed to algorithm copy here since copy always expects an ending location one past the last value to be copied.

Execute the program and verify that it is working correctly.


Labcheckpoint.png FOR IN-LAB CREDIT: Demonstrate the algorithm for input of the lab instructor's choosing.



Labsubmitsinglefile.png FOR IN-LAB CREDIT: Zip up these files: oop13iL.zip
Name the file {{{zip}}} and upload to CSCADE.