Difference between revisions of "Courses/CS 2124/Lab Manual/Sets"

From A-State Computer Science Wiki
Jump to: navigation, search
(Union)
(Intersection)
Line 373: Line 373:
  
  
Intersection with single element is less useful than union with single elements, so there is no need to implement multiple versions of <tt>operator *</tt>/
+
Intersection with single element is less useful than union with single elements, so there is no need to implement multiple versions of <tt>operator *</tt>.
  
  
 
<blockquote>
 
<blockquote>
 
Now, implement a single <tt>operator *</tt> that calls <tt>intersection</tt>.
 
Now, implement a single <tt>operator *</tt> that calls <tt>intersection</tt>.
</blockquote>
+
</blockquote>
 
+
  
 
===Difference===
 
===Difference===

Revision as of 00:06, 21 April 2010

Under construction icon-blue.png This page is undergoing initial construction or a major update. Please be advised that material that appears here may not be correct or current, and may change before the page is finalized.

A portion of this lab is to be done during the scheduled lab time. The take-home programming assignment is to be turned in before the next lab; see the lab website. The in-lab portion is worth 40% of the lab credit; the programming assignment is worth the other 60%. See the website for details on how the programming assignment will be graded. You are not responsible for user errors in input unless specified in the assignment. Feedback will be provided explaining your grade on each assignment.

It is important that you complete each step before going on to the next, as the exercises build upon one another. You will find it helpful to diagram the action of each method or function as you go along. If you have difficulty in some step, DO NOT proceed before resolving it; seek assistance from your lab proctor. You will not be able to fully appreciate the remaining content of the lab and you are likely to compound the problem.

Introduction

In this lab a Set class is developed using the skills learned in previous labs. The instructions given here are less specific and intended more as general guidance.

Topics Covered in this Lab:
  • Sets
  • Set operations
Questions Answered in this Lab:
  • What is a set?
  • How might a set class be designed for C++?
  • What operations can be performed with sets?
  • How might set operations be implemented as part of a class?
Demonstrable Skills Acquired in this Lab:
  • An understanding of sets and set operations
  • How one might implement a class given less specific instructions
Createproject.pngCreate a project oop14iL

Sets

A set is a collection of mathematical objects (As used in mathematics, object is just a way to say "thing", so a set is a collection of things. This is a distinct idea from C++ objects.). A mathematical object is either a member of a set or not, so it does not make sense for an object to be in a set more than once. However, an object can be a member of multiple sets at once. Sets are themselves consider to be mathematical objects, so it is possible to have sets that contain other sets.


To avoid ambiguity, from this point mathematical objects spoken of in relation to sets will be called elements from this point onward.


Sets are commonly denoted by using curly braces in one of two ways. The first is to exhaustively list every element of a set within the curly braces. For example, the set containing the integers between 1 and 10 could be written as \{ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 \}. However most sets are too large to define this way and may even contain an infinite number of elements and are denoted using what is referred to as set builder notation. An example would be all even integers which can be written as \{x|\text{where }k \in \mathbb{I}, x = 2k\}, and which is read as "x such that where k is an integer, x is two times k." Sometimes, infinite sets are more intuitively denoted by using ellipsis, for example, all integers could be written \{ \dots , -2, -1, 0, 1, 2, \dots \} and even integers as \{ \dots , -4, -2, 0, 2, 4, \dots \}.


The "\in" operator, pronounced "element of," used in the previous example is a relational operator that is not built into C++. It returns true if the element given on the left is a member of the set given on the right. Also mentioned in the previous example is the the set of all integers which is so commonly used that it has a shorthand notation "\mathbb{I}."


The set containing all the mathematical objects under consideration is called the universe. For example, if the real numbers are being considered then the universe is the infinite set \mathbb{R} containing all reals.


Summary

  • Sets are collections of elements.
  • Each given element is either is a member or not a member of a given set.
  • The relational operator "\in" and "\notin" are used to test set membership.
  • An element cannot appear in a set twice.
  • An element can be in more than one set at once.
  • A set is denoted by listing all the elements inside of curly braces.
  • Infinite sets are denoted using set builder notation or ellipsis.

More information about sets can be found on the Internet.

  • Wikipedia [1]
  • Wolfram MathWorld [2]


Implementing a Set Class

For simplicity for the rest of this lab we are only going to only consider the letters from A to G to be the universe. This set can be denoted as \{ A, B, C, D, E, F, G \}. A Set implemented for this universe in C++ will have to keep track of whether each letter is a member of the Set or not. This can be done by keeping an array of seven boolean values where if an array element is set to true then the corresponding set element is a member of the set.


Create the files Set.h and Set.cpp now. Implement a constructor for Set takes no arguments and creates a set with no members (the empty set).


It is bad programming style to use the literal number 7 in all the places in the code that will access the boolean values of the Set class. If ever the number of elements is changed then it will need to be changed in many places and this introduces the chance that one of the places will be missed. Values scattered throughout the code like this are called magic numbers. It is better to make the number of elements into a constant static variable that is given a good descriptive name such as element_count.


Adding Elements

It would be nice to be able to use C++ character literals like 'A' and 'E' to stand for the elements A and E. Recall that character literals in C++ are really just a convenient way to write ASCII character codes [www.asciitable.com]. For example, 'A' is equal to to the integer 65. Also, the codes 'A' thru 'G' are sequential. Remember that the character constants 'a' thru 'b' have different values.


However, the boolean variables that are a part of Set have indexes from 0 to 6. Because character literals are really numbers it is easy to map them to indexes by using subtraction. 'A' - 65 is 0 and 'G' - 65 is 6, but even better this can be written as 'A' - 'A' and 'G' - 'A' which eliminates the magic number 65.


Now, implement a member function add for Set that takes a char and sets the corresponding boolean in Set to true. The member function does not need to return a value. The function should check that the calculated index is between zero and six and print out a useful warning message if it is not.


There is a function declared as char toupper(char) declared in the standard header file named cctype that can be used to convert a lower-case letter to an upper-case letter. For example toupper('a') == 'A' is true. If a character is not a lower case letter then toupper just returns the character it was given.


Now, modify add to take both lower and upper case letters by converting all characters passed into add into upper-case letters.


Implementing the "element of" operator \in

The "element of" operation is a relational operator like >, <, and ==. Even though it is an operator, for now, lets think of it as a member function.


Lets consider how to implement it as a member function. Because it is a relational operator it should return a boolean. Because it is a member function it has access to the internal private boolean values that indicate set membership. So, the only remaining information that it needs is which element to test for membership. The element to test for membership can be passed to the function as an argument. To summarize, we need to implement a function that looks like this bool Set::functionName(char element). A good name for this function might be hasElement because then when it is used it would be written something like:

if (mySet.hasElement('B')) { 
    //... 
}


The name hasElement is good because it reads well. In the example above it could be read out loud as "if my set has element 'B' then ...," which is understandable as English. Naming variables and functions this way aids others to understand your code.

Unfortunately, a function named "ElementOf" is not possible since it is not possible to add a method to the char type so that 'A'.ElementOf(set) can be written. Fortunately, later we will write a binary operator that implements something better.

Checking if a Set has a particular element does not change the Set itself, so make sure to declare hadElement as const.


Now, implement the method hasElement


Now that there are both add and hasElement functions it is possible to compile and test the code and see if it works.


Now, create a main.cpp and master.h and write code to test the Set class.


  • Declare a Set.
  • Test for all elements and make sure none of them are in the set.
  • Add an element to the set.
  • Test that the newly added element is now a member.
  • Attempt to add erroneous elements to the set and make sure an error message is printed.


Now that the class is tested, add a remove method that does the opposite of add and implement code in main that will test remove.



Labcheckpoint.png FOR IN-LAB CREDIT: Demonstrate the test code for the lab instructor


Implementing write

Non-empty sets are denoted inside curly braces, so any write function should begin by printing a left brace and end by printing a right brace. The empty set is denoted using the symbol \emptyset but the closest thing on the keyboard is the 0.

There are several ways to handle output depending on how fancy one wants to be. Let's start with the easiest, which is to separate elements with spaces, and to just write the empty set as a pair of empty braces. There are several things to pay attention too while implementing this version of write:


  • Writing a Set does not change the Set itself so write should be declared const.
  • Make sure to use hasElement instead of checking membership directly.
  • A user may want to write a set as part of a sentence, so do not output a newline character afterwards.
  • Spacing should be even, meaning that if there is a space before the first element there should be one after the last element.
  • The empty set may look good with either no spaces ("{}") or one ("{ }"), but two ("{ }") is not as nice.
  • The mathematical objects A thru G are not character constants so do not print out single quotes.
  • Final output should look something like { A B C D } or {A B C D}.


The first example, { A B C D } is easier to produce. If one wishes to produce the second example, {A B C D}, then the code is actually similar to that needed to produce the output with commas, {A, B, C, D} because the delimiter (the separating character) is between each element but not before the first or after the last. The trick to producing elegant code for printing commas is to realize that the comma is printed out before every element except the first. The reason the code is more elegant is that it is often easier to tell what the first element is rather than determining what the last element is (although this does not apply if one is using arrays).


Now, implement a version of the write function. The priority here is to make it print neatly following the guidelines above. Optionally, add code to print commas as a delimiter.

The write function should take an ostream as an argument. As a convenience for users, a default argument of cout should be specified so that when write is called with no arguments it assumes it needs to write to standard output.

Make sure to add code to main that tests write by adding and removing elements and printing all the intermediate results.


To print the empty set as a special case requires that the Set be checked for emptiness before even printing the first curly brace. This can be done as a loop that checks for every element in the universe, and if none are found, return true. This code could be added to the write function, however this presents an opportunity to create a new function isEmpty that can be used by other functions or the user.

Again, when implementing isEmpty, use hasElement instead of directly accessing the internal member Booleans. This allows code for accessing the internal data to be reused. If there was ever a time when the internal data needed to be implemented in a different way, the only function that would need changing would be hasElement while every function that uses hasElement will not have to be changed. If a function directly accesses the internal data then it would have to be changed to, so it is better to make sure the only functions that access the internal data directly are add, remove, and hasElement. These three functions can be used to implement all the other methods.


Now, implement a method isEmpty that returns true if a set is empty, and then use isEmpty inside of write to print a 0 if the set is empty.


Make sure to add code to main to test these modifications.

Implementing read

Reading input from the user should be the inverse of writing. That means that the Set class should read exactly what it writes. This means that one character should be read at a time. If the first character is a 0 then the set is empty, if it is a { then the characters should be added to the set until a } is read. If you chose to implement commas as delimiters than make sure to skip over them.

Now, implement a read method that accepts for input exactly the same format that was implemented for output.

Also, add another constructor to the class that takes an istream as an argument and initializes the object from a file. The constructor can merely call the read function to do most of the work.

Make sure to add code to main to test these new functions.



Labcheckpoint.png FOR IN-LAB CREDIT: Demonstrate the write and read functions to the lab instructor


Set Operations

An operation on a set takes one or more sets and creates a new set according to the rules of the operation. An operation that takes one set is called a unary operation while an operation that takes two sets is a binary operation. Set operations are closely related to relational operators. The new set created by an operation is the result of performing a relational operation between the corresponding elements of each set and using the result to initialize the new set.


When operations are implemented as class methods the function used to implement the operation will actually have one less argument because the object for which the method was invoked is considered to be the first argument. This means the unary operations will have zero arguments and binary operations will have one. However, if an operation is implemented as a non-member function then it will have the expected number of arguments because it is not a member so there is no implicit object for which the function was called. Operations can be implemented either way but class methods have access to private data. Non-member functions only have access to private data if the class declares them to be a friend.


Set operations can be implemented as either regular named methods, or by using operator overloading, or both. Here, to help see the commonalities and differences between methods and operators we will first implement named functions, and then use those to implement operators later.


All of the following operations should follow the same basic outline.


  • They all return a Set.
  • None of them change the Set they were called for, so they are all const. (In arithmetic 2+3 does not change the value of 2, it returns a new number 5).
  • For binary operations, the other Set (the on the right hand side of the operator) also does not change, so it should be passed as const.
  • The right-hand side Set should be passed by reference for performance reasons.


Set Set::unaryOperation() const
{
   Set temp;

   // ...

   return temp;
}

Set Set::binaryOperation(const Set& rhs) const
{
   Set temp;

   // ...

   return temp;
}


Complement

The complement of a set is the set of all elements that are not in the original set. The complement of \{ A, C, E \} is \{ B, D, F, G \}. The complement of \{\} is \{ A, B, C, D, E, F, G \}. The complement of a complement is the original set.


The complement operation is analogous to negation which is implemented by the ! operator, because an element is a member of the new set if it is not a member of the original set. Even though we implement complement using operator !, later when we override an operator for doing the complement we will find that operator ~ is a better fit.


Now, using the basic outline for a unary operation and the ! operator, implement a complement method in Set. Do not implement operator ! or operator ~ right now, just use the built-in ! operator to implement a method named complement. Remember to not directly use the underlying implementation of the boolean values or the magic number of elements, instead use hasElement and the special constant value that defines the number of elements.


Make sure to add code to main to test the complement function. Use the description of the complement operation above to get ideas for what to test.

Union

The union of two sets is the set of elements that are in either of the original two sets. The union of \{ A, C, E \} and \{ B, D, F, G \} is \{ A, B, C, D, E, F, G \}. The union of \{ A \} and \{ A B \} is \{ A B \}. In that last example, the result is the same as one of the arguments because one argument is a sub-set of the other. The union of the empty set with any other set results in the other set. Union is commutative, meaning that the order the two sets are given is does not matter.


The union operation is analogous to the || relational operator to say "M or N", because an element is in the new set if it is in one of the original sets or the other one. Although we implement union by using the || operator later we will see that when we overload an operator to perform a union that operator + is a better fit.


Now, using the basic outline for a binary operation and the || operator, implement a union method in Set. Do not implement operator + or operator || right now, just use the built-in || operator to implement a method named union. Remember to not directly use the underlying implementation of the boolean values or the magic number of elements, instead use hasElement and the special constant value that defines the number of elements.

Make sure to add code to main to test the union function. Use the description of the union operation above to get ideas for what to test.

Intersection

The intersection of two sets is the set of elements that are in both of the original two sets. The intersection of \{ A \} and \{ A B \} is \{ A \}. The intersection of \{ A, C, E \} and \{ B, D, F, G \} is \{ \} because they have no elements in common. The intersection of the empty set with any other set always results in the empty set. Intersection is commutative, meaning that the order the two sets are given is does not matter.


The intersection operation is analogous to the && relational operator to say "M and N", because an element is in the new set if it is in one of the original sets and the other one. Although we implement intersection by using the && operator later we will see that when we overload an operator to perform an intersection that operator * is a better fit.


Now, using the basic outline for a binary operation and the && operator, implement an intersection method in Set. Do not implement operator * or operator && right now, just use the built-in && operator to implement a method named intersection. Remember to not directly use the underlying implementation of the boolean values or the magic number of elements, instead use hasElement and the special constant value that defines the number of elements.

Make sure to add code to main to test the intersection function. Use the description of the intersection operation above to get ideas for what to test.

Difference

The difference of two sets is the set of elements that are in the first given set and not in the second. The difference of \{ A B \} from \{ A \} is \{ B \}. The difference of \{ A, C, E \} from \{ B, D, F, G \} is \{ A, C, E \} because none of the elements of the first are in the second. Taking the difference is not commutative, meaning that the order that the two sets are given matters. For example, the difference of any set from the empty set is the original set, while the difference of the empty set from any other set is the empty set.


The difference operation is analogous to using the && and ! relational operators to say "M and not N", because an element is in the new set if it is in one of the original sets and not the other one. Although we implement intersection by using the && and ! operators later we will see that when we overload an operator to perform an intersection that operator - is a better fit.


Now, using the basic outline for a binary operation, the && operator, and the ! operator, implement a difference method in Set. Do not implement operator - right now, just use the built-in && and ! operators to implement a method named difference. Remember to not directly use the underlying implementation of the boolean values or the magic number of elements, instead use hasElement and the special constant value that defines the number of elements.


Make sure to add code to main to test the difference function. Use the description of the difference operation above to get ideas for what to test.



Labcheckpoint.png FOR IN-LAB CREDIT: Demonstrate all the operations for the lab instructor.


Operator Overloading

Now that the class is functionally complete there are several things that can be done to make it easier to use. It is not always correct to add operator overloading to a class unless the class has a strong mathematical basis like the Set class.

Stream Operators

The member functions read and write can be easily written as operators the same as we have done before.


Now, implement operator << and operator >>. All they have to do is call Set::read and Set::write.


Operators for add and remove

The add and remove functions are like the union and difference operations but performed for single members. They are different from operations because they affect the object for which they were invoked. The operators to overload for changing the object for which they were invoked are the operator += and operator -=.


Now, implement operator += and operator -= so that they call add and remove respectively.


Operators for add and remove

The add and remove functions are

Set Operations

In mathematics, each set operation has a unique symbol. Unfortunately none of them are available to overload in C++. However, there are several possible substitutions.

Complement

The notation for the complement of a set N is given in several different ways depending on the preferences of the author and conventions used in a particular field of mathematics. Some of the possibilities are \sim N, \overline{N}, \urcorner N, and N\prime.

The \sim symbol should be familiar because it is used to denote destructors in C++. This is because it can be thought of as the opposite or complement of construction. The operator ~ is actually available as an operator to be overridden.

So, of all the unary operators that can be overridden in C++ there seem to be two good possibilities that make sense from a stylistic standpoint, operator ~ and operator !.

Now, overload either operator ~ or operator ! to call the complement function.


Union

The notation for union is \cup. This is somewhat easy to remember because it resembles a 'u' for union. There is no operator in C++ that resembles this, however a union is similar in behavior to addition because it is commutative and in a certain way, the elements of two sets can be said to be "added" together. However, be careful with this analogy, it doesn't hold together completely because when you add two numbers you always get a bigger number unless one of the numbers is zero, but when two sets are added the resulting set may be smaller than the size of the two sets because of common elements.


So, of all the binary operators that can be overridden in C++ there seem to be two good possibilities that make sense for the union operation, operator + and operator ||. For complement, the choice was arbitrary because both possibilities were of equal precedence, but for + and || there is a significant difference. The arithmetic operators are of higher precedence than the relational operators. Even though set operations are implemented using relational operators, they are used more like arithmetic operators. It might be surprising to the user of set operations had low precedence like relational operators.


These are the kinds of things that must be considered when overloading operators. They should behave in a way that makes sense considering their original meaning. For these reasons, operator + is better to overload for set union because of its high precedence and similarity to numerical addition.


Now, lets consider the add function that adds a single member to a set. This function is very similar to doing a union with a set that contains a single element. For this reason it makes sense to not only overload operator + to handle both sets and individual elements.


Now, create two new member functions and a non-member function to overload operator +.

The first member function should take a Set on the right-hand side and call union.

The second member function should take a char on the right-hand side, create a set with a single member, then return the union.

The non-member function should handle the case were the element is on the left-hand side.

Intersection

The symbol used for intersection is \cap.


The discussion for intersection is similar to that of union except that operator * appears to be the best choice. Multiplication has similar properties to intersection even though the analogy is not perfect. Also, since we have decided to overload arithmetic operators it is important that related operators have the same precedence so this leaves little choice except for operator / and operator %.


Intersection with single element is less useful than union with single elements, so there is no need to implement multiple versions of operator *.


Now, implement a single operator * that calls intersection.

Difference

The difference between sets is denoted using \setminus.


The set difference symbol is a giant backslash, so it might make sense to use operator / which is a forward slash. However, set difference is much like subtraction and to fit in with the other operators we have overloaded we should use operator - to represent it. It is useful to be able to remove individual elements, so like addition there should be three functions.


Now, create two new member functions and a non-member function to overload operator -.

The first member function should take a Set on the right-hand side and call difference.

The second member function should take a char on the right-hand side, create a set with a single member, then return the difference.

The non-member function should handle the case were the element is on the left-hand side.

Submit Files

Labsubmitsinglefile.png FOR IN-LAB CREDIT: Zip up these files: Makefile Set.h Set.cpp master.h main.cpp
Name the file oop14iL.zip and upload to CSCADE.