# Lecture 10: Dictionaries and Sets


In today's lecture we will introduce two mutable data structures to organize data in Python: **dictionaries** and **sets**.  

## Leftovers:  List Comprehensions


Last lecture we wrote some list comprehensions for mapping and filtering.  It is possible to do both mapping and filtering in a single list comprehension.  The example below keeps the even numbers from a sequence and creates a new list of their squares.

In [103]:
# write list comprehension below
sqEvenList = [] #todo
sqEvenList

[]

Note that our expression for mapping still comes before the "for" and our filtering with "if" still comes after our sequence.  Below is the equivalent code without list comprehensions.

In [104]:
newList = []
for x in range(10):
    if x % 2 == 0:
        newList.append(x*x)
newList

[0, 4, 16, 36, 64]
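
As a further illustration of combining mapping and filtering, here is a minimal sketch that keeps the odd numbers in `range(10)` and cubes them. The names `cubedOdds` and `loopResult` are made up for this illustration and are not part of the lecture code.

```python
# The mapping expression (x ** 3) comes before "for";
# the filter ("if ...") comes after the sequence.
cubedOdds = [x ** 3 for x in range(10) if x % 2 == 1]

# Equivalent loop, for comparison
loopResult = []
for x in range(10):
    if x % 2 == 1:
        loopResult.append(x ** 3)

print(cubedOdds)                 # [1, 27, 125, 343, 729]
print(cubedOdds == loopResult)   # True
```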

**Exercises:** Try to write the following list comprehension examples:

In [105]:
# Example 1: Write a list comprehension that filters the vowels from a word 
# such as beauteous and returns a list of its capitalized vowels.
word = "beauteous"
newList = []  # todo
newList

[]

In [106]:
# Example 2: Write a list comprehension that filters a list of proper nouns by length.
# It should extract nouns of length greater than 4 but less than 8 and return a list
# where the first letter is properly capitalized

properNouns = ["cher", "bjork", "sting", "beethoven", "prince", "madonna"]
newList = []  #todo
newList

[]

## Sequences vs. Collections

We have seen four kinds of sequences so far: strings, lists, ranges, and tuples.  Python can distinguish among them via their delimiters: quotes for strings, square brackets for lists, and parentheses for tuples.

In [107]:
word = "Spring street"      # a string
numbers = [0, 10, 20, 30, 40, 50]    # a list of numbers
person = ('Harry', 'Potter')     # a tuple of strings

### Common Operations for Sequences
Sequences share many operations: 
- subscripting with indices, 
- slicing (with colon), 
- checking for membership with `in`, 
- use of `len` to indicate length
- iteration through loops

**Examples of indexing.**  Outputs in this case are a single element of the sequence.

In [108]:
word[2] # access element at index 2

'r'

**Examples of slicing.**  Outputs in this case are subsequences.

In [109]:
numbers[2:] # slicing - get all elements starting at index 2

[20, 30, 40, 50]

**Examples of using the membership operator `in`.**  Outputs are boolean values.

In [110]:
'Harry' in person

True

### Reminder: Why do we care about tuples?

Tuples are often used in Python to do multiple assignments in one single statement:

In [111]:
a, b = 0, 1
a, b

(0, 1)

In [112]:
a, b = b,a   #swap values of a and b
a, b

(1, 0)

The function `print` accepts several comma-separated arguments and displays them separated by spaces:

In [113]:
print(a, b)

1 0


Python generates tuples whenever we use commas to separate values:

In [114]:
len(word), len(numbers), len(person)

(13, 6, 2)

Doing multiple **variable assignments** in one step is useful. Here is an example with a `for` loop that iterates over a list of tuples:

In [115]:
pairs = [('Boston', 'MA'), ('Columbus', 'OH'), ('Chicago', 'IL'), ('Trenton', 'NJ')]
for capital, state in pairs: # can iterate directly over tuples with multiple assignment
    print("{} has as capital {}.".format(state, capital))

MA has as capital Boston.
OH has as capital Columbus.
IL has as capital Chicago.
NJ has as capital Trenton.


**Note:** The `print` call above uses the string method `format` instead of string concatenation, which is a more concise way to build the string to print. The `.format` method fills "holes" in a string with the corresponding argument values, from left to right. The "holes" are written as `{}`.
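
To make the comparison with concatenation concrete, here is a small sketch. The variable names `withFormat` and `withConcat` are illustrative only.

```python
state, capital = 'MA', 'Boston'

# Each {} is filled, left to right, by the corresponding argument of format.
withFormat = "{} has as capital {}.".format(state, capital)

# The same string built with concatenation
withConcat = state + " has as capital " + capital + "."

print(withFormat)                # MA has as capital Boston.
print(withFormat == withConcat)  # True

# Unlike concatenation, format converts non-string values for us.
squareMessage = "{} squared is {}".format(4, 4 * 4)
print(squareMessage)             # 4 squared is 16
```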

## New Mutable Collection:  Sets

In Python, a **set** is a **mutable**, **unordered** collection of **immutable values** without duplicates.

**Set Structure and Notation.** Nonempty sets can be written directly as comma-separated elements delimited by curly braces:

In [None]:
nums = {42, 17, 8, 57, 23}
animals = {'duck', 'cat', 'bunny', 'ant'}
potters = {('Ron', 'Weasley'), ('Luna', 'Lovegood'), ('Hermione', 'Granger')}

Confusingly, even though set elements are **fundamentally unordered**, Canopy and Jupyter notebooks will display the elements of returned set values in sorted order: 

In [None]:
nums

In [None]:
animals

In [None]:
potters

However, when sets are `print`ed, elements are displayed in an unpredictable order: 

In [None]:
print(nums)
print(animals)
print(potters)

The empty set is written `set()` rather than `{}`, because `{}` means an empty dictionary. 

In [None]:
lettersSeen = set()

In [None]:
lettersSeen

In [None]:
print(lettersSeen)

Because sets cannot contain duplicate elements, they are an easy way to remove duplicates:

In [None]:
animals2 = {'emu', 'duck', 'bunny', 'emu', 'ferret', 'bunny', 'duck', 'emu', 'bunny', 'emu'}
animals2

Just as `list` turns a collection of elements into a list, `set` turns one into a set. 

In [None]:
listWithDups = [4, 1, 3, 2, 3, 2, 4, 1, 2]
listWithoutDups = list(set(listWithDups))
listWithoutDups

In [170]:
# create a list of words from file
wordList = []
for line in open('prideandprejudice.txt'):
    wordList.extend(line.strip().split())

In [176]:
# how many distinct words in the book?
len(set(wordList)) # set is an easy way to remove duplicates

6372

The elements in a set must be **immutable** values. Since lists, dictionaries, and sets themselves are mutable, they **cannot** be elements in a set: 

In [None]:
{[3, 2], [1, 5, 4]}

In [None]:
{ {3, 2}, {1, 5, 4} }

In [None]:
{ {'a':3, 'c':2}, {'b': 1} } 

Because sets are unordered, they are not sequences. Attempting to access an element of a set by an index fails with a `TypeError`:

In [None]:
animals[0]
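
If a positional order is needed, one option is to build a sorted list from the set first. The sketch below assumes the elements can be compared with each other (here they are all strings); the name `sortedAnimals` is made up for this illustration.

```python
animals = {'duck', 'cat', 'bunny', 'ant'}

# Indexing a set raises TypeError because sets have no positions.
try:
    animals[0]
except TypeError as err:
    print("TypeError:", err)

# sorted returns a new list, which can be indexed.
sortedAnimals = sorted(animals)
print(sortedAnimals)      # ['ant', 'bunny', 'cat', 'duck']
print(sortedAnimals[0])   # ant
```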

We will look at different methods for sets later in the lecture.

## Introducing Another Mutable Collection:  Dictionaries

Dictionaries are collections that map keys to values:

In [116]:
daysOfMonth = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30,
               'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31,
               'Sep': 30, 'Oct': 31, 'Nov': 30} # one month is missing
daysOfMonth

{'Jan': 31,
 'Feb': 28,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30}

<span style="color:red;font-size:12pt">**Important Note:**</span> Dictionaries are **not sequences**: their key:value pairs have no positional indices. Since Python 3.7, however, a dictionary is guaranteed to remember the order in which its pairs were inserted, which is why the pairs above are displayed in the order we entered them.

In Python 3.5 and earlier, the pairs were displayed in an unpredictable order determined by obscure details of the Python implementation, so code that must run on old versions should not depend on the order. If we run `python3` (version 3.7 or later) in a Terminal window, we see the same insertion order as in Jupyter: 

```
$ python3
Python 3.7.4 (default, Jul  9 2019, 18:13:23) 
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> daysOfMonth = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30,
...                'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31,
...                'Sep': 30, 'Oct': 31, 'Nov': 30} 

>>> daysOfMonth
{'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30, 'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31, 'Sep': 30, 'Oct': 31, 'Nov': 30}
```

#### Subscripting dictionaries.

We do this not with indices, but with **keys**.

In [117]:
daysOfMonth['May']  # what is the value associated with key 'May'?

31

**IMPORTANT** Note that we don't have to loop through the key/value pairs to find the value associated with the key. We just use the key as the subscript, and the dictionary "magically" returns the corresponding value. This is easy and powerful!

Trying to look for a key that doesn't exist...

In [118]:
daysOfMonth['October']

KeyError: 'October'

**Note**: Indexing with 0, 1, ... will not work, for the same reason 'October' didn't work: subscripting is based only on the **keys**.

In [119]:
daysOfMonth[0]  # error, no notion of index in dictionaries

KeyError: 0

### Check if a key exists with the operator `in`
The operator `in` can be used with dictionaries just like with sequence types.  
We can check if a key is in a dictionary by typing: `key in someDict`. Note that `in` looks only at the keys, never at the values.

In [120]:
'Oct' in daysOfMonth

True

In [121]:
'October' in daysOfMonth

False

In [122]:
if 'October' in daysOfMonth:
    print(daysOfMonth['October'])
else:
    print("Key not found")

Key not found
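
Since `in` tests keys only, one way to ask whether something appears as a value is to test membership in `.values()`, a method covered later in this lecture. The sketch below uses a shortened, hypothetical dictionary named `daysInMonth` so that it runs on its own.

```python
daysInMonth = {'Jan': 31, 'Feb': 28, 'Apr': 30}   # hypothetical, shortened

keyFound = 'Feb' in daysInMonth              # 'Feb' is a key
valueAsKey = 28 in daysInMonth               # 28 is a value, not a key
valueFound = 28 in daysInMonth.values()      # search the values explicitly

print(keyFound)     # True
print(valueAsKey)   # False
print(valueFound)   # True
```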


### More examples of dictionaries

We'll use these dictionaries in the examples we do in this notebook.

In [123]:
student = {'name': 'Eeph Williams', 
           'dorm': 'Bronfman', 
           'section': 2, 
           'year': 2021, 
           'course': 'CS134'}

phones = {5558671234: 'Gal Gadot', 
          9996541212: 'Trevor Noah', 
          4135974233: 'Maud Mandel'}

friends = {('Harry','Potter'):['harry@hogwarts.edu','Gryffindor'],
           ('Cho', 'Chang'):['cho@hogwarts.edu', 'Ravenclaw'],
           ('Cedric', 'Diggory'):['ced@hogwarts.edu','Hufflepuff']}

### Your Turn: Subscripting dictionaries

In [124]:
# write the expression to retrieve the value 2021 from student


In [125]:
# write the expression to retrieve Cho Chang's information from friends


In [126]:
# what does this return? 
friends[('Harry', 'Potter')][1]

'Gryffindor'

In [127]:
# what does this return? 
friends[('Harry', 'Potter')][1][0]

'G'

In [128]:
# what does this return? 
friends[('Harry', 'Potter')][1][0][0]

'G'

### Using subscript assignment to change the value associated with a key

When a key is already in a dictionary, assigning a value to that key in the dictionary changes the value at that key.

For example, the key `Feb` is associated with what value in the `daysOfMonth` dictionary?

In [129]:
daysOfMonth

{'Jan': 31,
 'Feb': 28,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30}

In [130]:
daysOfMonth['Feb'] #what is the value of key 'Feb'

28

If it's a leap year, we can change the value associated with `Feb` to be 29 via assignment with the subscript:

In [131]:
daysOfMonth['Feb'] = 29   # change value associated with a key

In [132]:
daysOfMonth # notice changed value of Feb

{'Jan': 31,
 'Feb': 29,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30}

### Using subscript assignment to add a new key/value pair to a dictionary

When a key is not in a dictionary, assigning a value to that key in the dictionary adds a new key/value pair to the dictionary.

For example, the key `Dec` is **not yet** in the `daysOfMonth` dictionary.

In [133]:
daysOfMonth

{'Jan': 31,
 'Feb': 29,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30}

In [134]:
'Dec' in daysOfMonth # always check if key is in dict

False

We can add the association `'Dec':31` to the dictionary by assigning `31` to `daysOfMonth['Dec']`:

In [135]:
daysOfMonth['Dec'] = 31

In [136]:
daysOfMonth

{'Jan': 31,
 'Feb': 29,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30,
 'Dec': 31}

In [137]:
'Dec' in daysOfMonth

True

In [138]:
daysOfMonth['Dec']

31

### Dictionary keys must be immutable

Although dictionaries are mutable, the **keys** of dictionaries must be **immutable**.

In [139]:
daysOfMonth[['Feb', 2015]] = 28   # try to use a key that has month and year in a list

TypeError: unhashable type: 'list'

As tuples are immutable, they can be keys of a dictionary.  The `friends` dictionary is an example of a dictionary with tuples as keys. 

In [140]:
friends

{('Harry', 'Potter'): ['harry@hogwarts.edu', 'Gryffindor'],
 ('Cho', 'Chang'): ['cho@hogwarts.edu', 'Ravenclaw'],
 ('Cedric', 'Diggory'): ['ced@hogwarts.edu', 'Hufflepuff']}
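
The contrast between tuple keys and list keys can be sketched as follows. The dictionary `points` and its values are hypothetical and exist only for this illustration.

```python
points = {}   # hypothetical dictionary for illustration

# A tuple is immutable, so it can be used as a key.
points[('Harry', 'Potter')] = 10

# A list is mutable, so using it as a key raises TypeError.
try:
    points[['Cho', 'Chang']] = 20
    listKeyWorked = True
except TypeError:
    listKeyWorked = False

print(points)          # {('Harry', 'Potter'): 10}
print(listKeyWorked)   # False

# Converting the list to a tuple gives a usable key.
points[tuple(['Cho', 'Chang'])] = 20
print(('Cho', 'Chang') in points)   # True
```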

<a id="sec3"></a>

## Different ways to create dictionaries

**Direct assignment.** We can write the entire dictionary as a literal value by wrapping braces around a comma-separated sequence of key:value pairs: 

In [141]:
student = {'name': 'Eeph Williams', 
           'dorm': 'Bronfman', 
           'section': 2, 
           'year': 2021, 
           'course': 'CS134'}
student

{'name': 'Eeph Williams',
 'dorm': 'Bronfman',
 'section': 2,
 'year': 2021,
 'course': 'CS134'}

**Accumulating in a Dict.** Just as we do with lists, we can accumulate data in a dictionary.  Start with the empty dictionary `{}` and keep growing.  This is a common way to create dictionaries in many of our problems. 

When dealing with a bigger problem with loops, we must check if a key exists in the dictionary before we update the value associated with it.  See the `frequencies` example later in the notebook.

In [142]:
cart = {} # The empty dictionary
cart['oreos'] = 3.99
cart['kiwis'] = 2.54
cart

{'oreos': 3.99, 'kiwis': 2.54}

**Note:** The order in which we enter key/value pairs does not affect lookups or equality: two dictionaries with the same pairs are equal regardless of insertion order. 

We often "grow" a dictionary by adding key/value pairs in a loop. For example:

In [143]:
firsts = ['Shikha', 'Iris', 'Lida']
lasts =  ['Singh', 'Howley', 'Doret'] # Order of these last names corresponds to firsts

names = {} 
for i in range(len(firsts)):
    names[firsts[i]] = lasts[i]
names

{'Shikha': 'Singh', 'Iris': 'Howley', 'Lida': 'Doret'}
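
The check-before-update pattern described earlier in this section can be sketched on a small example. The `purchases` list and the `totals` dictionary below are hypothetical.

```python
# Each tuple is (item, quantity); some items appear more than once.
purchases = [('oreos', 3), ('kiwis', 2), ('oreos', 1), ('kiwis', 5)]

totals = {}
for item, quantity in purchases:
    if item in totals:
        totals[item] += quantity      # key exists: add to the running total
    else:
        totals[item] = quantity       # first time we see this key: create it

print(totals)   # {'oreos': 4, 'kiwis': 7}
```

Without the `if` check, `totals[item] += quantity` would raise a `KeyError` the first time each item is seen.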

**Dict constructor function.** We can use the built-in  `dict` function, which can create a dictionary from a list of tuples, where every tuple has two elements. 

In [144]:
dict([('DEU', 49), ('ALB', 355), ('UK', 44)]) # a list of tuples for country codes

{'DEU': 49, 'ALB': 355, 'UK': 44}
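
The built-in function `zip` pairs up the elements of two sequences, so its result can be passed to `dict` directly. The sketch below redoes the first/last name example this way; it assumes both lists have the same length, since `zip` stops at the shorter one.

```python
firsts = ['Shikha', 'Iris', 'Lida']
lasts = ['Singh', 'Howley', 'Doret']

# zip yields the pairs ('Shikha', 'Singh'), ('Iris', 'Howley'), ('Lida', 'Doret')
names = dict(zip(firsts, lasts))
print(names)   # {'Shikha': 'Singh', 'Iris': 'Howley', 'Lida': 'Doret'}
```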

A tuple that is not part of a list will not work:

In [145]:
dict(('USA', 1))

ValueError: dictionary update sequence element #0 has length 3; 2 is required

**Empty Dict.** Calling `dict` with zero arguments creates an empty dictionary:

In [146]:
dict() # creates an empty dict

{}

## Dictionary methods that don't cause mutation

A dictionary object has several methods that either _query_ the state of the object, or _modify_ its state.   We will see some of these methods in a later section: `pop`, `update`, and `clear`, which _modify_ (mutate) the dictionary.  Now, we'll look at some methods that _query_ its state.

In [147]:
daysOfMonth # the entire dict

{'Jan': 31,
 'Feb': 29,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30,
 'Dec': 31}

### The `get`  method 
The method `get` lets us skip the step of checking for a key before looking up its value.   This is possible because `get` returns a "default" value when the key is not in the dictionary.   In all other cases, it returns the value associated with the given key.

In [148]:
daysOfMonth.get('Oct', 'unknown') 

31

In [149]:
daysOfMonth.get('OCT', 'unknown') 

'unknown'

Remember that if we try to access a non-existing key directly, we'll get a KeyError:

In [150]:
daysOfMonth['OCT']

KeyError: 'OCT'

Using `get` allows us to avoid that error:

In [151]:
daysOfMonth.get('OCT')

**Question**: Why don't we see anything?  
**Answer:** Because when `get` doesn't find the given key and no default value was supplied, it returns `None` instead of raising a `KeyError`, and the notebook displays nothing for `None`.
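
The different outcomes of `get` can be summarized in a short sketch. The dictionary `inventory` is hypothetical.

```python
inventory = {'apples': 0, 'pears': 4}   # hypothetical

print(inventory.get('pears'))        # 4
print(inventory.get('plums'))        # None: key missing, no default given
print(inventory.get('plums', 0))     # 0: key missing, default returned
print(inventory.get('apples', 99))   # 0: key present, so the default is ignored

# get never adds the missing key to the dictionary
print('plums' in inventory)          # False
```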

## Exercise: `scrabblePoints`

A great use for dictionaries is to store data that can simplify choosing among different values.  For example, we can create a dictionary with letters as keys and their Scrabble scores as values.

In [152]:
scrabbleDict = {'a': 1, 'b': 3, 'c': 3, 'd': 2, 'e': 1, 'f': 4, 'g': 2, 
                'h': 4, 'i': 1, 'j': 8, 'k': 5, 'l': 1, 'm': 3, 'n': 1, 
                'o': 1, 'p': 3, 'q': 10, 'r': 1, 's': 1, 't': 1, 
                'u': 1, 'v': 4, 'w': 4, 'x': 8, 'y': 4, 'z': 10}

In [153]:
# Write the code here
def scrabblePoints2(letter):
    "Return the scrabble score associated with a letter."
    # Algorithm
    # 1. If letter is in scrabbleDict, return its points
    # 2. Otherwise return 0
    if letter in scrabbleDict:
        return scrabbleDict[letter]
    return 0

In [154]:
# Test with different values
for letter in 'abdhjkq7!': 
    print("{} is worth {} points".format(letter, scrabblePoints2(letter)))

a is worth 1 points
b is worth 3 points
d is worth 2 points
h is worth 4 points
j is worth 8 points
k is worth 5 points
q is worth 10 points
7 is worth 0 points
! is worth 0 points


**Simplification:** We can replace the function body with a single statement that does the same thing using the `get` method.

In [155]:
def scrabblePoints3(letter):
    # Write a single line here
    pass 



In [156]:
# Test with different values
for letter in 'abdhjkq7!': 
    print("{} is worth {} points".format(letter, scrabblePoints3(letter)))

a is worth None points
b is worth None points
d is worth None points
h is worth None points
j is worth None points
k is worth None points
q is worth None points
7 is worth None points
! is worth None points


<a id="sec8"></a>

## Example: Word Frequencies

Write a function that takes a list of words `wordList` as input and creates a dictionary of word frequencies.  In particular, the keys of the dictionary must be the words in `wordList` and the corresponding values must be the number of times each word appears in the list.

In [157]:
def frequencies(wordList):
    """Given a list of words, returns a dictionary of word frequencies"""
    # Algorithm
    # 1. create an empty dict
    # 2. iterate through the words of the given list
    # 3. set the value or increment the value for each word
    # 4. return the dict

    pass

In [158]:
# test our function  
verse = """One fish. Two fish. 
           Red fish. Blue fish. 
           Black fish. Blue fish. 
           Old fish. New fish. 
           This one has a little star. 
           This one has a little car."""

frequencies(verse.split())

In [170]:
# create a list of words from file
wordList = []
for line in open('prideandprejudice.txt'):
    wordList.extend(line.strip().split())

In [171]:
# test our function on pride and prejudice
pridePrejDict = frequencies(wordList)
len(pridePrejDict)

TypeError: object of type 'NoneType' has no len()

In [161]:
pridePrejDict['love']

TypeError: 'NoneType' object is not subscriptable

## Exercise: Rewrite `frequencies` with `get`

We can simplify the above function with the `get` method.

In [162]:
def frequencies2(wordList):
    """Given a list of words, returns a dictionary of word frequencies"""
    # Algorithm:
    # many steps similar to `frequencies` above
    # difference: use get with the default value 0 to set initial values

    pass

In [163]:
# test    
frequencies2(verse.split())

## Dictionary and Set Methods

### Dictionary methods `keys`, `values`, and `items`

Sometimes we are interested in knowing the keys, values or items (key, value pairs) of a dictionary.  Each of these methods returns an object containing only the keys, values, and items, respectively.

In [164]:
daysOfMonth.keys() 

dict_keys(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])

Note that `.keys()` returns a `dict_keys` object rather than a list; in Python 3.7 and later its elements appear in insertion order.

In [165]:
type(daysOfMonth.keys())

dict_keys

Note that the `.keys()`, `.values()`, and `.items()` methods each return a different type of object.

In [166]:
daysOfMonth.values() 

dict_values([31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31])

In [167]:
type(daysOfMonth.values())

dict_values

The object returned by `.values()` is synchronized with the object returned by `.keys()`: corresponding months and days appear at the same position.

In [168]:
daysOfMonth.items() 

dict_items([('Jan', 31), ('Feb', 29), ('Mar', 31), ('Apr', 30), ('May', 31), ('Jun', 30), ('Jul', 31), ('Aug', 31), ('Sep', 30), ('Oct', 31), ('Nov', 30), ('Dec', 31)])

In [169]:
type(daysOfMonth.items())

dict_items

Note that the object returned by `.items()` is synchronized with the objects returned by `.keys()` and `.values()`. 
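
One way to check this synchronization is to convert the results to lists and compare positions. This is a sketch with a hypothetical dictionary `legs`; the results of `.keys()` and `.values()` cannot be indexed directly, hence the calls to `list`.

```python
legs = {'ant': 6, 'cat': 4, 'duck': 2}   # hypothetical

# Convert to lists so that we can look at individual positions.
keyList = list(legs.keys())
valueList = list(legs.values())
print(keyList[1], valueList[1])    # cat 4

# Pairing keys with values position by position reproduces the items.
pairs = list(zip(keyList, valueList))
print(pairs == list(legs.items()))   # True
```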

The objects of type `dict_keys`, `dict_values`, and `dict_items` are so-called **dictionary views** that reflect any subsequent changes to the underlying dictionary from which they were made.

In [None]:
numNames = {'one': 1, 'two': 2, 'three': 3}
ks = numNames.keys() 
vs = numNames.values()
its = numNames.items()
print('keys: {}\nvalues: {}\nitems: {}'.format(ks, vs, its))

In [None]:
numNames['four'] = 4
print('keys: {}\nvalues: {}\nitems: {}'.format(ks, vs, its))

### Iterating over a dictionary

There are many ways to iterate over a dictionary: 

1. over the keys (do **not** use `.keys()`)
2. over the values (with `.values()`)
3. over the items (with `.items()`)

In [None]:
phones

In [None]:
# iterate directly (by default Python goes over the keys, because they are unique)
for key in phones:
    print(phones[key], key)

Using `for` to iterate over a dictionary means iterating over all the **keys** in the dictionary, so there is 
<span style="color:red">no need</span> to use `.keys()`, which would create an unnecessary object. So we prefer to write `for key in phones:` rather than `for key in phones.keys():`. 

In [None]:
for val in phones.values():
    print("Call {}!".format(val))

**Question:** Can we go from values to keys, as we did from keys to values? What can we say about keys and values in a dictionary?  


In [None]:
# sometimes it is useful to iterate over the items directly
# notice the tuple assignment in the for loop
daysOfMonth = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30,
               'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31,
               'Sep': 30, 'Oct': 31, 'Nov': 30, 'Dec': 31}

In [None]:
for month, days in daysOfMonth.items():
    print("{} has {} days".format(month, days))

### Dictionary Methods that mutate it: `pop`, `update`, and `clear`

We saw that dictionaries are **mutable**: we can assign to a key slot to change its value or even add a new key/value pair. 

#### The `pop` method

Given a key, the `pop` method on a dictionary **removes** the key/value pair with that key from the dictionary and **returns** the value that was formerly associated with the key. 

In [None]:
daysOfMonth.pop('Feb')

In [None]:
daysOfMonth
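
The behavior of `pop` with present and missing keys can be sketched as follows. The dictionary `stock` is hypothetical; the optional second argument of `pop` is a default that is returned when the key is missing.

```python
stock = {'oreos': 3, 'kiwis': 5}   # hypothetical

removed = stock.pop('oreos')       # removes the pair and returns its value
print(removed)                     # 3
print(stock)                       # {'kiwis': 5}

# Popping a missing key raises KeyError...
try:
    stock.pop('plums')
except KeyError:
    print("no plums to remove")

# ...unless a default is supplied as the second argument.
fallback = stock.pop('plums', 0)
print(fallback)                    # 0
```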

**Question:** It looks like the method `pop` works similarly to the one for lists.   Do you think it will behave the same if we don't provide an argument value for it? Explain.  


#### The `update` method

_dict1_`.update(`_dict2_`)` mutates _dict1_ by assigning the key/value pairs of _dict2_ to _dict1_. 

In [None]:
# let's remind ourselves of the contents of student
student

In [None]:
newStudent = {
 'year': 2022,
 'course': 'ENG101'}

In [None]:
student.update(newStudent)

**Question:** Why didn't you see an output from running the cell above?  

In [172]:
student

{'name': 'Eeph Williams',
 'dorm': 'Bronfman',
 'section': 2,
 'year': 2022,
 'course': 'ENG101'}
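
`update` both overwrites the values of keys that already exist and adds keys that are new. A minimal sketch with the hypothetical dictionaries `profile` and `changes`:

```python
profile = {'name': 'Eeph', 'year': 2021}        # hypothetical
changes = {'year': 2022, 'dorm': 'Bronfman'}    # hypothetical

profile.update(changes)

# 'year' was overwritten, 'dorm' was added, 'name' was left alone.
print(profile)   # {'name': 'Eeph', 'year': 2022, 'dorm': 'Bronfman'}

# The argument dictionary is not modified.
print(changes)   # {'year': 2022, 'dorm': 'Bronfman'}
```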

####  The  `clear` method

We can wipe out the content of a dictionary with `clear`:

In [None]:
student.clear()
student

### Set Operations

#### Set operations that don't change the sets

`in` and `not in` determine membership/nonmembership in a set:

In [None]:
42 in nums

In [None]:
'wombat' in animals

In [None]:
'koala' not in animals

`len` returns the number of elements in a set

In [None]:
print('nums has {} elements'.format(len(nums)))
print('animals has {} elements'.format(len(animals)))
print('potters has {} elements'.format(len(potters)))

The **union** of sets _s1_ and _s2_, written _s1_`.union(`_s2_`)` or _s1_ `|` _s2_, returns a new set that has all the elements that are in **either** set: 

In [None]:
print('animals is {}'.format(animals))
print('animals2 is {}'.format(animals2))
print('animals.union(animals2) is {}'.format(animals.union(animals2)))
print('animals | animals2 is {}'.format(animals | animals2))


The **intersection** of sets _s1_ and _s2_, written _s1_`.intersection(`_s2_`)` or _s1_ `&` _s2_, returns a new set that has all the elements that are in **both** sets: 

In [None]:
print('animals is {}'.format(animals))
print('animals2 is {}'.format(animals2))
print('animals.intersection(animals2) is {}'.format(animals.intersection(animals2)))
print('animals & animals2 is {}'.format(animals & animals2))

The (asymmetric) **difference** of sets _s1_ and _s2_, written _s1_`.difference(`_s2_`)` or _s1_ `-` _s2_, returns a new set that has all the elements that are in _s1_ but not in _s2_:

In [None]:
print('animals is {}'.format(animals))
print('animals2 is {}'.format(animals2))
print('animals.difference(animals2) is {}'.format(animals.difference(animals2)))
print('animals2 - animals is {}'.format(animals2 - animals))

The **symmetric difference** of sets _s1_ and _s2_, written _s1_`.symmetric_difference(`_s2_`)` or _s1_ `^` _s2_, returns a new set that is the union of (all the elements that are in _s1_ but not in _s2_) and (all the elements that are in _s2_ but not in _s1_):

In [None]:
print('animals is {}'.format(animals))
print('animals2 is {}'.format(animals2))
print('animals.symmetric_difference(animals2) is {}'.format(animals.symmetric_difference(animals2)))
print('animals2 ^ animals is {}'.format(animals2 ^ animals))
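
The definition of symmetric difference can be checked against the other set operations. The sketch below uses two small sets with the made-up names `s1` and `s2`.

```python
s1 = {'duck', 'cat', 'bunny', 'ant'}
s2 = {'emu', 'duck', 'bunny', 'ferret'}

symDiff = s1 ^ s2

# Union of the two one-sided differences
print(symDiff == ((s1 - s2) | (s2 - s1)))   # True

# Equivalently: everything in either set, minus what is in both
print(symDiff == ((s1 | s2) - (s1 & s2)))   # True

print(sorted(symDiff))                      # ['ant', 'cat', 'emu', 'ferret']
```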

Sets are tested for equality using `==`. The order in which elements are written in a set is irrelevant for determining equality:

In [None]:
{'cat', 'bunny', 'ant', 'duck'} == {'duck', 'cat', 'bunny', 'ant'}

In [None]:
{'cat', 'bunny', 'ant', 'duck'} == {'duck', 'bunny', 'ant'}

In [None]:
{'cat', 'bunny', 'ant'} == {'bunny', 'ant', 'bunny', 'ant', 'cat', 'ant'}

#### Set operations that change the sets

_s_`.add(`_elem_`)` changes _s_ by adding _elem_ (if it is not already in the set). 

In [None]:
lettersSeen = set()
print(lettersSeen)
lettersSeen.add('f')
print(lettersSeen)
lettersSeen.add('e')
print(lettersSeen)
lettersSeen.add('r')
print(lettersSeen)
lettersSeen.add('r')
print(lettersSeen)
lettersSeen.add('e')
print(lettersSeen)
lettersSeen.add('t')
print(lettersSeen)

_s1_ ` |= ` _s2_ changes _s1_ by adding all the elements of _s2_ to it.

In [None]:
print(lettersSeen)
lettersSeen |= {'e', 'm', 'u'}
print(lettersSeen)

Similarly, _s1_ ` &= ` _s2_ changes _s1_ by keeping only the elements that are also in _s2_, and _s1_ ` -= ` _s2_ changes _s1_ by removing from it the elements that are in _s2_:

In [None]:
print(lettersSeen)
lettersSeen &= {'t', 'r', 'u', 'e', 's', 'm'}
print(lettersSeen)
lettersSeen -= {'t', 'a', 'r'}
print(lettersSeen)

_s_`.remove(`_elem_`)` changes _s_ by removing _elem_ (raising a `KeyError` if it's not in the set)

In [None]:
print(lettersSeen)
lettersSeen.remove('u')
print(lettersSeen)
lettersSeen.remove('f')
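
When a missing element should not be an error, the related set method `discard` removes an element if it is present and does nothing otherwise. `discard` is not covered elsewhere in this lecture; the sketch below uses a hypothetical set `letters`.

```python
letters = {'e', 'm', 'u'}   # hypothetical

letters.discard('u')      # 'u' is present, so it is removed
letters.discard('f')      # 'f' is absent: no error, set unchanged
print(sorted(letters))    # ['e', 'm']

# remove, by contrast, raises KeyError for a missing element
try:
    letters.remove('f')
    removeFailed = False
except KeyError:
    removeFailed = True
print(removeFailed)       # True
```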

_s_`.clear()` removes all elements from _s_.

In [None]:
print(lettersSeen)
lettersSeen.clear()
print(lettersSeen)

<a id="sec2"></a>

## Digging deeper:  Hashable

**`hash`** When Python stores the keys of a dictionary in memory, it uses their hashes, which are integers returned by the `hash` function, to decide where to store them. Among the built-in types, only immutable objects can be hashed.

In [None]:
hash("Wellesley")

In [None]:
hash(['Feb', 2015])

In [None]:
hash( ('Feb', 2015) ) # Tuples are hashable even though lists are not

In [None]:
hash(123456) # small integers are their own hash value
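
The connection between hashing and mutability can be explored with a short sketch. The flag names `listIsHashable` and `nestedIsHashable` are made up for this illustration.

```python
# Equal immutable values have equal hashes within one run of Python.
sameHash = hash(('Feb', 2015)) == hash(('Feb', 2015))
print(sameHash)           # True

# A list is mutable, so hash raises TypeError.
try:
    hash(['Feb', 2015])
    listIsHashable = True
except TypeError:
    listIsHashable = False
print(listIsHashable)     # False

# A tuple is hashable only if everything inside it is hashable.
try:
    hash(('Feb', [2015]))
    nestedIsHashable = True
except TypeError:
    nestedIsHashable = False
print(nestedIsHashable)   # False
```

This is why a tuple that contains a list cannot be used as a dictionary key or as a set element, even though the tuple itself is immutable.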

**Note.** At this point, you don't have to worry about why the keys are hashed, or how the `hash` function works!   Take more advanced CS classes to learn more.