Lecture 10: Dictionaries and Sets

In today's lecture we will introduce two mutable data structures to organize data in Python: dictionaries and sets.

Leftovers: List Comprehensions

Last lecture we wrote some list comprehensions for mapping and filtering. It is possible to do both mapping and filtering in a single list comprehension. Examine the example below, which keeps only the even numbers from a sequence and creates a new list of their squares.

In [1]:
# write list comprehension below
nums = range(10)
sqEvenList = [n*n for n in nums if n%2==0] #todo
sqEvenList
Out[1]:
[0, 4, 16, 36, 64]

Note that our expression for mapping still comes before the "for" and our filtering with "if" still comes after our sequence. Below is the equivalent code without list comprehensions.

In [2]:
newList = []
for x in range(10):
    if x % 2 == 0:
        newList.append(x*x)
newList
Out[2]:
[0, 4, 16, 36, 64]

Exercises: Try to write the following list comprehension examples:

In [3]:
# Example 1: Write a list comprehension that filters the vowels from a word 
# such as beauteous and returns a list of its capitalized vowels.
word = "beauteous"
newList = [letter.upper() for letter in word if letter in 'aeiou']  # todo
newList
Out[3]:
['E', 'A', 'U', 'E', 'O', 'U']
In [4]:
# Example 2: Write a list comprehension that filters a list of proper nouns by length.
# It should extract nouns of length greater than 4 but less than 8 and return a list
# where the first letter is properly capitalized

properNouns = ["cher", "bjork", "sting", "beethoven", "prince", "madonna"]
newList = [word[0].upper() + word[1:] for word in properNouns if len(word)>4 and len(word)<8]  #todo
newList
Out[4]:
['Bjork', 'Sting', 'Prince', 'Madonna']

Sequences vs. Collections

We have seen four kinds of sequences so far: strings, lists, ranges, and tuples. Python can distinguish among them via their delimiters: quotes for strings, square brackets for lists, and parentheses for tuples.

In [5]:
word = "Spring street"      # a string
numbers = [0, 10, 20, 30, 40, 50]    # a list of numbers
person = ('Harry', 'Potter')     # a tuple of strings

Common Operations for Sequences

Sequences share many operations:

  • subscripting with indices,
  • slicing (with colon),
  • checking for membership with in,
  • use of len to indicate length
  • iteration through loops

Examples of indexing. Outputs in this case are a single element of the sequence.

In [6]:
word[2] # access element at index 2
Out[6]:
'r'

Examples of slicing. Outputs in this case are subsequences.

In [7]:
numbers[2:] # slicing - get all elements starting at index 2
Out[7]:
[20, 30, 40, 50]

Examples of using the membership operator in. Outputs are boolean values.

In [8]:
'Harry' in person
Out[8]:
True

Reminder: Why do we care about tuples?

Tuples are often used in Python to do multiple assignments in one single statement:

In [9]:
a, b = 0, 1
a, b
Out[9]:
(0, 1)
In [10]:
a, b = b,a   #swap values of a and b
a, b
Out[10]:
(1, 0)

The print function accepts several comma-separated arguments and prints them separated by spaces:

In [11]:
print(a, b)
1 0

Python builds a tuple whenever we write comma-separated values as an expression:

In [12]:
len(word), len(numbers), len(person)
Out[12]:
(13, 6, 2)

Doing multiple variable assignments in one step is useful. Here is an example with a for loop that iterates over a list of tuples:

In [13]:
pairs = [('Boston', 'MA'), ('Columbus', 'OH'), ('Springfield', 'IL'), ('Trenton', 'NJ')]
for capital, state in pairs: # unpack each tuple into two variables as we iterate
    print("{} has as capital {}.".format(state, capital))
MA has as capital Boston.
OH has as capital Columbus.
IL has as capital Springfield.
NJ has as capital Trenton.

Note: The print call above uses the string method format instead of string concatenation, which is more concise. The format method fills the "holes" in a string, written {}, with its arguments in order.
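
As a small sketch of how format fills its holes, the cell below shows three variants: empty holes filled in order, numbered holes, and named holes. The variable names are made up for illustration and are not part of the lecture code.

```python
capital, state = 'Boston', 'MA'   # illustrative values

# Empty holes are filled left to right with the arguments.
byPosition = "{} has as capital {}.".format(state, capital)

# Numbered holes pick an argument by its index, so the order can differ.
byIndex = "{1} is the capital of {0}.".format(state, capital)

# Named holes are matched against keyword arguments.
byName = "{c} is the capital of {s}.".format(c=capital, s=state)

print(byPosition)   # MA has as capital Boston.
print(byIndex)      # Boston is the capital of MA.
print(byName)       # Boston is the capital of MA.
```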

New Mutable Collection: Sets

In Python, a set is a mutable, unordered collection of immutable values without duplicates.

Set Structure and Notation. Nonempty sets can be written directly as comma-separated elements delimited by curly braces:

In [14]:
nums = {42, 17, 8, 57, 23}
animals = {'duck', 'cat', 'bunny', 'ant'}
potters = {('Ron', 'Weasley'), ('Luna', 'Lovegood'), ('Hermione', 'Granger')}

Confusingly, even though set elements are fundamentally unordered, Canopy and Jupyter notebooks will display the elements of returned set values in sorted order:

In [15]:
nums
Out[15]:
{8, 17, 23, 42, 57}
In [16]:
animals
Out[16]:
{'ant', 'bunny', 'cat', 'duck'}
In [17]:
type(nums)
Out[17]:
set
In [18]:
potters
Out[18]:
{('Hermione', 'Granger'), ('Luna', 'Lovegood'), ('Ron', 'Weasley')}

However, when sets are printed, elements are displayed in an unpredictable order:

In [19]:
print(nums)
print(animals)
print(potters)
{8, 42, 17, 23, 57}
{'ant', 'duck', 'bunny', 'cat'}
{('Hermione', 'Granger'), ('Luna', 'Lovegood'), ('Ron', 'Weasley')}

The empty set is written set() rather than {}, because {} means an empty dictionary.

In [20]:
lettersSeen = set()
In [21]:
lettersSeen
Out[21]:
set()
In [22]:
print(lettersSeen)
set()

Because sets cannot contain duplicate elements, they are an easy way to remove duplicates:

In [23]:
animals2 = {'emu', 'duck', 'bunny', 'emu', 'ferret', 'bunny', 'duck', 'emu', 'bunny', 'emu'}
animals2
Out[23]:
{'bunny', 'duck', 'emu', 'ferret'}

Just as list turns a collection of elements into a list, set turns one into a set.

In [24]:
listWithDups = [4, 1, 3, 2, 3, 2, 4, 1, 2]
listWithoutDups = list(set(listWithDups))
listWithoutDups
Out[24]:
[1, 2, 3, 4]
In [25]:
# create a list of words from file
wordList = []
for line in open('prideandprejudice.txt'):
    wordList.extend(line.strip().split())
In [26]:
# how many distinct words in the book?
len(set(wordList)) # set is an easy way to remove duplicates
Out[26]:
6372
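
One caveat about removing duplicates with set: the order of the resulting list is not guaranteed. The sketch below shows two possible alternatives when order matters, assuming Python 3.7 or later (where dictionary keys keep their insertion order). The variable names are illustrative.

```python
listWithDups = [4, 1, 3, 2, 3, 2, 4, 1, 2]

# Dictionary keys are unique and keep insertion order (Python 3.7+),
# so dict.fromkeys drops duplicates while keeping first occurrences in order.
firstSeenOrder = list(dict.fromkeys(listWithDups))

# sorted accepts any collection, including a set, and returns a sorted list.
sortedOrder = sorted(set(listWithDups))

print(firstSeenOrder)   # [4, 1, 3, 2]
print(sortedOrder)      # [1, 2, 3, 4]
```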

The elements in a set must be immutable values. Since lists, dictionaries, and sets themselves are mutable, they cannot be elements in a set:

In [27]:
{[3, 2], [1, 5, 4]}
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-27-ec3d1f9b66da> in <module>
----> 1 {[3, 2], [1, 5, 4]}

TypeError: unhashable type: 'list'
In [28]:
{ {3, 2}, {1, 5, 4} }
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-28-4d0fa1288b80> in <module>
----> 1 { {3, 2}, {1, 5, 4} }

TypeError: unhashable type: 'set'
In [29]:
{ {'a':3, 'c':2}, {'b': 1} } 
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-29-e1d19b3ce8b5> in <module>
----> 1 { {'a':3, 'c':2}, {'b': 1} }

TypeError: unhashable type: 'dict'
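
When we do need a set whose elements are themselves groups of values, one possible workaround is to use immutable stand-ins: tuples instead of lists, and the built-in frozenset instead of set. This is only a sketch; the names below are illustrative.

```python
# Tuples of immutable values are hashable, so they can be set elements.
tupleElems = {(3, 2), (1, 5, 4)}

# frozenset is the built-in immutable counterpart of set.
frozenElems = {frozenset({3, 2}), frozenset({1, 5, 4})}

print((3, 2) in tupleElems)               # True
print((2, 3) in tupleElems)               # False: tuples are ordered
print(frozenset({2, 3}) in frozenElems)   # True: sets ignore order
```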

Because sets are unordered, they are not sequences. Attempting to access an element of a set by an index fails with a TypeError:

In [30]:
animals[0]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-30-b058c8ae0f95> in <module>
----> 1 animals[0]

TypeError: 'set' object does not support indexing

We will look at different methods for sets later in the lecture.

Introducing Another Mutable Collection: Dictionaries

Dictionaries are collections that map keys to values:

In [31]:
daysOfMonth = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30,
               'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31,
               'Sep': 30, 'Oct': 31, 'Nov': 30} # one month is missing
daysOfMonth
Out[31]:
{'Jan': 31,
 'Feb': 28,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30}

**Important Note:** Dictionaries are often described as unordered because their entries are looked up by key, never by position. Since Python 3.7, however, a dictionary does remember the order in which its key:value pairs were inserted, which is why the pairs above are displayed in the order we entered them.

Earlier versions (Python 3.5 and before) made no such promise and displayed the pairs in an unpredictable order determined by obscure details of the Python implementation. If we run python3 (version 3.7) in a Terminal window, we see the insertion order as well:

$ python3
Python 3.7.4 (default, Jul  9 2019, 18:13:23) 
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> daysOfMonth = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30,
...                'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31,
...                'Sep': 30, 'Oct': 31, 'Nov': 30} 

>>> daysOfMonth
{'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30, 'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31, 'Sep': 30, 'Oct': 31, 'Nov': 30}

Subscripting dictionaries.

We do this not with indices, but with keys.

In [32]:
daysOfMonth['May']  # what is the value associated with key 'May'?
Out[32]:
31

IMPORTANT: Note that we don't have to loop through the key/value pairs to find the value associated with a key. We just use the key as the subscript, and the dictionary "magically" returns the corresponding value. This is easy and powerful!

Trying to look for a key that doesn't exist...

In [33]:
daysOfMonth['October']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-33-66ac62b8a356> in <module>
----> 1 daysOfMonth['October']

KeyError: 'October'

Note: Subscripting with 0, 1, ... does not work either, for the same reason 'October' did not: subscripting is based only on the keys, and 0 is not a key of daysOfMonth.

In [34]:
daysOfMonth[0]  # error, no notion of index in dictionaries
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-34-4e883c3d2a0f> in <module>
----> 1 daysOfMonth[0]  # error, no notion of index in dictionaries

KeyError: 0

Check if a key exists with the operator in

The operator in can be used with dictionaries just like with sequence types.
We can check whether a key is in a dictionary by writing key in someDict. Note that in looks only at the keys, not at the values.

In [35]:
'Oct' in daysOfMonth
Out[35]:
True
In [36]:
'October' in daysOfMonth
Out[36]:
False
In [37]:
if 'October' in daysOfMonth:
    print(daysOfMonth['October'])
else:
    print("Key not found")
Key not found
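
To make the difference between keys and values concrete, here is a small sketch. It uses a shortened, illustrative copy of the dictionary and the values method, which is covered later in these notes.

```python
someDays = {'Jan': 31, 'Feb': 28, 'Apr': 30}   # illustrative subset

keyFound = 'Feb' in someDays                 # 'Feb' is a key
valueAsKey = 28 in someDays                  # 28 is a value, not a key
valueFound = 28 in someDays.values()         # search the values explicitly

print(keyFound, valueAsKey, valueFound)      # True False True
```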

More examples of dictionaries

We'll use these dictionaries in the examples we do in this notebook.

In [38]:
student = {'name': 'Eeph Williams', 
           'dorm': 'Bronfman', 
           'section': 2, 
           'year': 2021, 
           'course': 'CS134'}

phones = {5558671234: 'Gal Gadot', 
          9996541212: 'Trevor Noah', 
          4135974233: 'Maud Mandel'}

friends = {('Harry','Potter'):['harry@hogwarts.edu','Gryffindor'],
           ('Cho', 'Chang'):['cho@hogwarts.edu', 'Ravenclaw'],
           ('Cedric', 'Diggory'):['ced@hogwarts.edu','Hufflepuff']}

Your Turn: Subscripting dictionaries

In [39]:
# write the expression to retrieve the value 2021 from student
student['year']
Out[39]:
2021
In [40]:
# write the expression to retrieve Cho Chang's information from friends
friends['Cho', 'Chang']
Out[40]:
['cho@hogwarts.edu', 'Ravenclaw']
In [41]:
# what does this return? 
friends[('Harry', 'Potter')][1]
Out[41]:
'Gryffindor'
In [42]:
# what does this return? 
friends[('Harry', 'Potter')][1][0]
Out[42]:
'G'
In [43]:
# what does this return? 
friends[('Harry', 'Potter')][1][0][0]
Out[43]:
'G'

Using subscript assignment to change the value associated with a key

When a key is already in a dictionary, assigning a value to that key in the dictionary changes the value at that key.

For example, what value is the key Feb associated with in the daysOfMonth dictionary?

In [44]:
daysOfMonth
Out[44]:
{'Jan': 31,
 'Feb': 28,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30}
In [45]:
daysOfMonth['Feb'] #what is the value of key 'Feb'
Out[45]:
28

If it's a leap year, we can change the value associated with Feb to be 29 via assignment with the subscript:

In [46]:
daysOfMonth['Feb'] = 29   # change value associated with a key
In [47]:
daysOfMonth # notice changed value of Feb
Out[47]:
{'Jan': 31,
 'Feb': 29,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30}

Using subscript assignment to add a new key/value pair to a dictionary

When a key is not in a dictionary, assigning a value to that key in the dictionary adds a new key/value pair to the dictionary.

For example, the key Dec is not yet in the daysOfMonth dictionary.

In [48]:
daysOfMonth
Out[48]:
{'Jan': 31,
 'Feb': 29,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30}
In [49]:
'Dec' in daysOfMonth # always check if key is in dict
Out[49]:
False

We can add the association 'Dec':31 to the dictionary by assigning 31 to daysOfMonth['Dec']:

In [50]:
daysOfMonth['Dec'] = 31
In [51]:
daysOfMonth
Out[51]:
{'Jan': 31,
 'Feb': 29,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30,
 'Dec': 31}
In [52]:
'Dec' in daysOfMonth
Out[52]:
True
In [53]:
daysOfMonth['Dec']
Out[53]:
31

Dictionary keys must be immutable

Although dictionaries are mutable, the keys of dictionaries must be immutable.

In [54]:
daysOfMonth[['Feb']] = 28   # try to use a list as a key
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-54-0b2b8729cf23> in <module>
----> 1 daysOfMonth[['Feb']] = 28   # try to use a list as a key

TypeError: unhashable type: 'list'

As tuples are immutable, they can be keys of a dictionary. The friends dictionary is an example of a dictionary with tuples as keys.

In [55]:
friends
Out[55]:
{('Harry', 'Potter'): ['harry@hogwarts.edu', 'Gryffindor'],
 ('Cho', 'Chang'): ['cho@hogwarts.edu', 'Ravenclaw'],
 ('Cedric', 'Diggory'): ['ced@hogwarts.edu', 'Hufflepuff']}

Different ways to create dictionaries

Direct assignment. We can write the entire dictionary as a literal value by wrapping braces around a comma-separated sequence of key:value pairs:

In [56]:
student = {'name': 'Eeph Williams', 
           'dorm': 'Bronfman', 
           'section': 2, 
           'year': 2021, 
           'course': 'CS134'}
student
Out[56]:
{'name': 'Eeph Williams',
 'dorm': 'Bronfman',
 'section': 2,
 'year': 2021,
 'course': 'CS134'}

Accumulating in a Dict. Just like we do with lists, we can accumulate data in a dictionary. Start with the empty dictionary {} and keep growing it. This is a common way to create dictionaries in many of our problems.

When dealing with a bigger problem with loops, we must check if a key exists in the dictionary before we accumulate the value associated with it. See Exercise wordFrequency at the end of the notebook.

In [57]:
item = {}  #empty dict---do not confuse with empty set!
type(item)
Out[57]:
dict
In [58]:
cart = {} # The empty dictionary
cart['oreos'] = 3.99
cart['kiwis'] = 2.54
cart
Out[58]:
{'oreos': 3.99, 'kiwis': 2.54}

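Note: The order in which we enter key/value pairs does not affect lookups by key, nor whether two dictionaries are equal. Python 3.7 and later only remember the insertion order for display and iteration.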

We often "grow" a dictionary by adding key/value pairs in a loop. For example:

In [59]:
firsts = ['Shikha', 'Iris', 'Lida']
lasts =  ['Singh', 'Howley', 'Doret'] # Order of these last names corresponds to firsts

names = {} 
for i in range(len(firsts)):
    names[firsts[i]] = lasts[i]
names
Out[59]:
{'Shikha': 'Singh', 'Iris': 'Howley', 'Lida': 'Doret'}

Dict constructor function. We can use the built-in dict function, which can create a dictionary from a list of tuples, where every tuple has two elements.

In [60]:
dict([('DEU', 49), ('ALB', 355), ('UK', 44)]) # a list of tuples for country codes
Out[60]:
{'DEU': 49, 'ALB': 355, 'UK': 44}
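
Because dict accepts any collection of pairs, the earlier loop that matched first names with last names can also be sketched with the built-in zip function, which pairs up elements at the same position. This is one possible alternative, not a replacement for the loop above.

```python
firsts = ['Shikha', 'Iris', 'Lida']
lasts = ['Singh', 'Howley', 'Doret']   # same order as firsts

# zip yields ('Shikha', 'Singh'), ('Iris', 'Howley'), ('Lida', 'Doret')
pairedNames = dict(zip(firsts, lasts))

print(pairedNames)   # {'Shikha': 'Singh', 'Iris': 'Howley', 'Lida': 'Doret'}
```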

A tuple that is not part of a list will not work:

In [61]:
dict(('USA', 1))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-61-c30d6bfc189c> in <module>
----> 1 dict(('USA', 1))

ValueError: dictionary update sequence element #0 has length 3; 2 is required
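
The error message makes sense once we notice that dict walks through its argument and expects every element to be a pair. The sketch below reproduces the failure and shows one way to fix it; the variable names are illustrative.

```python
# dict iterates over ('USA', 1). Its first element is the string 'USA',
# which has three characters rather than two items, hence the ValueError.
try:
    dict(('USA', 1))
    errorName = None
except ValueError as err:
    errorName = type(err).__name__

# Wrapping the tuple in a list gives dict a collection of pairs.
oneCode = dict([('USA', 1)])

print(errorName)   # ValueError
print(oneCode)     # {'USA': 1}
```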

Empty Dict. Calling dict with zero arguments creates an empty dictionary:

In [62]:
dict() # creates an empty dict
Out[62]:
{}

Dictionary methods that don't cause mutation

A dictionary object has several methods that either query the state of the object, or modify its state. We will see some of these methods in a later section: pop, update, and clear, which modify (mutate) the dictionary. Now, we'll look at some methods that query its state.

In [63]:
daysOfMonth # the entire dict
Out[63]:
{'Jan': 31,
 'Feb': 29,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30,
 'Dec': 31}

The get method

The method get is used to avoid the step of checking for a key before updating. This is possible because this method will return a "default" value when the key is not in the dictionary. In all other cases, it will return the value associated with the given key.

In [64]:
daysOfMonth.get('Oct', 'unknown') 
Out[64]:
31
In [65]:
daysOfMonth.get('OCT', 'unknown') 
Out[65]:
'unknown'

Remember that if we try to access a non-existing key directly, we'll get a KeyError:

In [66]:
daysOfMonth['OCT']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-66-13b9f48288b7> in <module>
----> 1 daysOfMonth['OCT']

KeyError: 'OCT'

Using get allows us to avoid that error:

In [67]:
print(daysOfMonth.get('OCT'))
None

Question: Why is None printed?
Answer: Because when get doesn't find the given key and no default value is supplied, it returns None instead of raising a KeyError.

Exercise: scrabblePoints

A great use for dictionaries is to store data that can simplify choosing among different values. For example, we can create a dictionary with letters as keys and their Scrabble scores as values.

In [68]:
scrabbleDict = {'a': 1, 'b': 3, 'c': 3, 'd': 2, 'e': 1, 'f': 4, 'g': 2, 
                'h': 4, 'i': 1, 'j': 8, 'k': 5, 'l': 1, 'm': 3, 'n': 1, 
                'o': 1, 'p': 3, 'q': 10, 'r': 1, 's': 1, 't': 1, 
                'u': 1, 'v': 4, 'w': 4, 'x': 8, 'y': 4, 'z': 10}
In [69]:
# Write the code here
def scrabblePoints1(letter):
    "Return the scrabble score associated with a letter."
    # Algorithm
    # 1. If letter is in scrabbleDict, return its points
    # 2. Otherwise return 0
    if letter in scrabbleDict:
        return scrabbleDict[letter]
    return 0
In [71]:
# Test with different values
for letter in 'abdhjkq7!': 
    print("{} is worth {} points".format(letter, scrabblePoints1(letter)))
a is worth 1 points
b is worth 3 points
d is worth 2 points
h is worth 4 points
j is worth 8 points
k is worth 5 points
q is worth 10 points
7 is worth 0 points
! is worth 0 points

Simplification: We can replace the function body with a single statement that does the same thing using the get method.

In [72]:
def scrabblePoints2(letter):
    # Write a single line here
    return scrabbleDict.get(letter, 0)
In [73]:
# Test with different values
for letter in 'abdhjkq7!': 
    print("{} is worth {} points".format(letter, scrabblePoints2(letter)))
a is worth 1 points
b is worth 3 points
d is worth 2 points
h is worth 4 points
j is worth 8 points
k is worth 5 points
q is worth 10 points
7 is worth 0 points
! is worth 0 points
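
As a possible next step, the same get idiom can score a whole word. The sketch below is self-contained, so it rebuilds the letter scores from a compact table (same values as scrabbleDict above); the names pointsTable, letterScores, and wordPoints are made up for illustration.

```python
# Letters grouped by score, expanded into a letter -> points dictionary.
pointsTable = {1: 'aeilnorstu', 2: 'dg', 3: 'bcmp', 4: 'fhvwy',
               5: 'k', 8: 'jx', 10: 'qz'}
letterScores = {}
for points, letters in pointsTable.items():
    for letter in letters:
        letterScores[letter] = points

def wordPoints(word):
    """Return the total score of word; characters without a score count as 0."""
    total = 0
    for letter in word.lower():
        total += letterScores.get(letter, 0)
    return total

print(wordPoints('quiz'))    # 10 + 1 + 1 + 10 = 22
print(wordPoints('jazz!'))   # 8 + 1 + 10 + 10 + 0 = 29
```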

Example: Word Frequencies

Write a function that takes a list of words, wordList, as input and creates a dictionary of word frequencies. In particular, the keys of the dictionary must be the words in wordList, and the corresponding values must be the number of times each word appears in the list.

In [76]:
def frequencies(wordList):
    """Given a list of words, returns a dictionary of word frequencies"""
    # Algorithm
    # 1. create an empty dict
    # 2. iterate through the words of the given list
    # 3. set the value or increment the value for each word
    # 4. return the dict
    freqDict = {}
    for word in wordList:
        if word in freqDict:
            freqDict[word] += 1
        else:
            freqDict[word] = 1
    return freqDict
        
In [77]:
# test our function  
verse = """One fish. Two fish. 
           Red fish. Blue fish. 
           Black fish. Blue fish. 
           Old fish. New fish. 
           This one has a little star. 
           This one has a little car."""

frequencies(verse.split())
Out[77]:
{'One': 1,
 'fish.': 8,
 'Two': 1,
 'Red': 1,
 'Blue': 2,
 'Black': 1,
 'Old': 1,
 'New': 1,
 'This': 2,
 'one': 2,
 'has': 2,
 'a': 2,
 'little': 2,
 'star.': 1,
 'car.': 1}
In [78]:
# create a list of words from file
wordList = []
for line in open('prideandprejudice.txt'):
    wordList.extend(line.strip().split())
In [79]:
# test our function on pride and prejudice
pridePrejDict = frequencies(wordList)
len(pridePrejDict)
Out[79]:
6372
In [80]:
pridePrejDict['love']
Out[80]:
91

Exercise: Rewrite frequencies with get

We can simplify the above function with the get method.

In [81]:
def frequencies2(wordList):
    """Given a list of words, returns a dictionary of word frequencies"""
    # Algorithm:
    # many steps similar to Exercise 2
    # difference: use get with the default value 0 to set initial values
    freqDict = {}
    for word in wordList:
        freqDict[word] = freqDict.get(word, 0) + 1
    return freqDict
In [82]:
# test    
frequencies2(verse.split())
Out[82]:
{'One': 1,
 'fish.': 8,
 'Two': 1,
 'Red': 1,
 'Blue': 2,
 'Black': 1,
 'Old': 1,
 'New': 1,
 'This': 2,
 'one': 2,
 'has': 2,
 'a': 2,
 'little': 2,
 'star.': 1,
 'car.': 1}
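
For comparison, Python's standard library offers collections.Counter, a dictionary subclass designed for exactly this kind of counting. The sketch below is an aside rather than part of the exercise, and it uses a shortened verse.

```python
from collections import Counter

shortVerse = "One fish. Two fish. Red fish. Blue fish."
counts = Counter(shortVerse.split())   # counts each distinct word

print(counts['fish.'])         # 4
print(counts['shark'])         # 0: a missing key counts as zero, no KeyError
print(counts.most_common(1))   # [('fish.', 4)]
```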

Extra: Dictionary and Set Methods

Dictionary methods keys, values, and items

Sometimes we are interested in knowing the keys, values, or items (key/value pairs) of a dictionary. The methods keys, values, and items each return an object containing only the keys, values, or items, respectively.

In [83]:
daysOfMonth.keys() 
Out[83]:
dict_keys(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])

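Note that the result is not a list but a dict_keys object. In Python 3.7 and later its elements appear in the dictionary's insertion order; older versions gave no such guarantee.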

In [84]:
type(daysOfMonth.keys())
Out[84]:
dict_keys

Note that the .keys(), .values(), and .items() methods each return a different type of object.

In [85]:
daysOfMonth.values() 
Out[85]:
dict_values([31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31])
In [86]:
type(daysOfMonth.values())
Out[86]:
dict_values

The object returned by .values() lists the values in the same order in which .keys() lists the keys, so corresponding months and days appear at the same position.

In [87]:
daysOfMonth.items() 
Out[87]:
dict_items([('Jan', 31), ('Feb', 29), ('Mar', 31), ('Apr', 30), ('May', 31), ('Jun', 30), ('Jul', 31), ('Aug', 31), ('Sep', 30), ('Oct', 31), ('Nov', 30), ('Dec', 31)])
In [88]:
type(daysOfMonth.items())
Out[88]:
dict_items

Note that the object returned by .items() follows the same order as the ones returned by .keys() and .values().

The objects of type dict_keys, dict_values, and dict_items are so-called dictionary views that reflect any subsequent changes to the underlying dictionary from which they were made.

In [89]:
numNames = {'one': 1, 'two': 2, 'three': 3}
ks = numNames.keys() 
vs = numNames.values()
its = numNames.items()
print('keys: {}\nvalues: {}\nitems: {}'.format(ks, vs, its))
keys: dict_keys(['one', 'two', 'three'])
values: dict_values([1, 2, 3])
items: dict_items([('one', 1), ('two', 2), ('three', 3)])
In [90]:
numNames['four'] = 4
print('keys: {}\nvalues: {}\nitems: {}'.format(ks, vs, its))
keys: dict_keys(['one', 'two', 'three', 'four'])
values: dict_values([1, 2, 3, 4])
items: dict_items([('one', 1), ('two', 2), ('three', 3), ('four', 4)])
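
If we want a fixed copy instead of a live view, or need to index into the keys, one option is to convert the view to a list. The sketch below contrasts the two; the names are illustrative.

```python
someNames = {'one': 1, 'two': 2}

liveKeys = someNames.keys()         # a view: reflects later changes
snapshot = list(someNames.keys())   # a list: an independent copy

someNames['three'] = 3

print(len(liveKeys))   # 3: the view sees the new key
print(len(snapshot))   # 2: the list was copied before the change
print(snapshot[0])     # one (a list supports indexing, a view does not)
```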

Iterating over a dictionary

There are many ways to iterate over a dictionary:

  1. over the keys (do not use .keys())
  2. over the values (with .values())
  3. over the items (with .items())
In [91]:
phones
Out[91]:
{5558671234: 'Gal Gadot', 9996541212: 'Trevor Noah', 4135974233: 'Maud Mandel'}
In [92]:
# iterate directly (by default Python goes over the keys, because they are unique)
for key in phones:
    print(phones[key], key)
Gal Gadot 5558671234
Trevor Noah 9996541212
Maud Mandel 4135974233

Using for to iterate over a dictionary means iterating over all the keys in the dictionary, so there is no need to use .keys(), which would create an unnecessary object. So we prefer to write for key in phones: rather than for key in phones.keys():.

In [93]:
for val in phones.values():
    print("Call {}!".format(val))
Call Gal Gadot!
Call Trevor Noah!
Call Maud Mandel!

Question: Can we go from values to keys, as we did from keys to values? What can we say about keys and values in a dictionary?

In [94]:
# sometimes it is useful to iterate over the items directly
# notice the tuple assignment in the for loop
daysOfMonth = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30,
               'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31,
               'Sep': 30, 'Oct': 31, 'Nov': 30, 'Dec': 31}
In [95]:
for month, days in daysOfMonth.items():
    print("{} has {} days".format(month, days))
Jan has 31 days
Feb has 28 days
Mar has 31 days
Apr has 30 days
May has 31 days
Jun has 30 days
Jul has 31 days
Aug has 31 days
Sep has 30 days
Oct has 31 days
Nov has 30 days
Dec has 31 days
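
Iterating over items also lets us build a dictionary that goes in the other direction, from values back to keys. Since several months share the same number of days, each value has to map to a list of keys. The following is a sketch with an illustrative subset of the months; monthsByDays is a made-up name.

```python
someDays = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30}

# Keys are unique but values are not, so collect the months for each value.
monthsByDays = {}
for month, days in someDays.items():
    monthsByDays[days] = monthsByDays.get(days, []) + [month]

print(monthsByDays)   # {31: ['Jan', 'Mar'], 28: ['Feb'], 30: ['Apr']}
```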

Dictionary Methods that mutate it: pop, update, and clear

We saw that dictionaries are mutable: we can assign to a key slot to change its value or even add a new key/value pair.

The pop method

Given a key, the pop method on a dictionary removes the key/value pair with that key from the dictionary and returns the value that was formerly associated with the key.

In [99]:
daysOfMonth.pop('Feb') # remove item with key Feb, returns value
Out[99]:
28
In [101]:
daysOfMonth # no longer has Feb
Out[101]:
{'Jan': 31,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30,
 'Dec': 31}

Question: It looks like the method pop works similarly to the one for lists. Do you think it will behave the same if we don't provide an argument value for it?
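
One way to explore this question is to try the different ways of calling pop on a small dictionary. The sketch below records what happens in each case; the dictionary and variable names are illustrative.

```python
days = {'Jan': 31, 'Feb': 28}

removed = days.pop('Feb')                # returns 28 and removes 'Feb'
fallback = days.pop('Feb', 'missing')    # key is gone: returns the default

try:
    days.pop('Feb')                      # missing key and no default
    missingKeyError = None
except KeyError as err:
    missingKeyError = type(err).__name__

try:
    days.pop()                           # unlike lists, a key is required
    noArgumentError = None
except TypeError as err:
    noArgumentError = type(err).__name__

print(removed, fallback, missingKeyError, noArgumentError)
# 28 missing KeyError TypeError
```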

The update method

dict1.update(dict2) mutates dict1 by adding the key/value pairs of dict2 to it. When a key is already in dict1, its value is replaced by the value from dict2.

In [102]:
# let's remind ourselves of the contents of student
student
Out[102]:
{'name': 'Eeph Williams',
 'dorm': 'Bronfman',
 'section': 2,
 'year': 2021,
 'course': 'CS134'}
In [103]:
newStudent = {
 'year': 2022,
 'course': 'ENG101'}
In [104]:
student.update(newStudent)

Question: Why didn't you see an output from running the cell above?

In [105]:
student
Out[105]:
{'name': 'Eeph Williams',
 'dorm': 'Bronfman',
 'section': 2,
 'year': 2022,
 'course': 'ENG101'}
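
The sketch below makes two points about update explicit: the method returns None because its job is to mutate, and if we would rather keep the original dictionary intact we can build a merged copy first. The {**a, **b} syntax requires Python 3.5 or later; the names oldInfo, changes, and merged are illustrative.

```python
oldInfo = {'name': 'Eeph Williams', 'year': 2021, 'course': 'CS134'}
changes = {'year': 2022, 'course': 'ENG101'}

# Build a new dictionary; later entries win when keys repeat.
merged = {**oldInfo, **changes}
yearBefore = oldInfo['year']           # still 2021: oldInfo is untouched

# update changes oldInfo in place and returns None.
returned = oldInfo.update(changes)

print(yearBefore)          # 2021
print(returned)            # None
print(oldInfo == merged)   # True
```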

The clear method

We can wipe out the content of a dictionary with clear:

In [106]:
student.clear()
student
Out[106]:
{}

Set Operations

Set operations that don't change the sets

in and not in determine membership/nonmembership in a set:

In [107]:
42 in nums
Out[107]:
True
In [108]:
'wombat' in animals
Out[108]:
False
In [109]:
'koala' not in animals
Out[109]:
True

len returns the number of elements in a set

In [110]:
print('nums has {} elements'.format(len(nums)))
print('animals has {} elements'.format(len(animals)))
print('potters has {} elements'.format(len(potters)))
nums has 5 elements
animals has 4 elements
potters has 3 elements

The union of sets s1 and s2, written s1.union(s2) or s1 | s2, returns a new set that has all the elements that are in either set:

In [111]:
print('animals is {}'.format(animals))
print('animals2 is {}'.format(animals2))
print('animals.union(animals2) is {}'.format(animals.union(animals2)))
print('animals | animals2 is {}'.format(animals | animals2))
animals is {'ant', 'duck', 'bunny', 'cat'}
animals2 is {'duck', 'bunny', 'emu', 'ferret'}
animals.union(animals2) is {'duck', 'bunny', 'ferret', 'ant', 'emu', 'cat'}
animals | animals2 is {'duck', 'bunny', 'ferret', 'ant', 'emu', 'cat'}

The intersection of sets s1 and s2, written s1.intersection(s2) or s1 & s2, returns a new set that has all the elements that are in both sets:

In [112]:
print('animals is {}'.format(animals))
print('animals2 is {}'.format(animals2))
print('animals.intersection(animals2) is {}'.format(animals.intersection(animals2)))
print('animals & animals2 is {}'.format(animals & animals2))
animals is {'ant', 'duck', 'bunny', 'cat'}
animals2 is {'duck', 'bunny', 'emu', 'ferret'}
animals.intersection(animals2) is {'duck', 'bunny'}
animals & animals2 is {'duck', 'bunny'}

The (asymmetric) difference of sets s1 and s2, written s1.difference(s2) or s1 - s2, returns a new set that has all the elements that are in s1 but not in s2:

In [113]:
print('animals is {}'.format(animals))
print('animals2 is {}'.format(animals2))
print('animals.difference(animals2) is {}'.format(animals.difference(animals2)))
print('animals2 - animals is {}'.format(animals2 - animals))
animals is {'ant', 'duck', 'bunny', 'cat'}
animals2 is {'duck', 'bunny', 'emu', 'ferret'}
animals.difference(animals2) is {'ant', 'cat'}
animals2 - animals is {'emu', 'ferret'}

The symmetric difference of sets s1 and s2, written s1.symmetric_difference(s2) or s1 ^ s2, returns a new set that is the union of (all the elements that are in s1 but not in s2) and (all the elements that are in s2 but not in s1):

In [114]:
print('animals is {}'.format(animals))
print('animals2 is {}'.format(animals2))
print('animals.symmetric_difference(animals2) is {}'.format(animals.symmetric_difference(animals2)))
print('animals2 ^ animals is {}'.format(animals2 ^ animals))
animals is {'ant', 'duck', 'bunny', 'cat'}
animals2 is {'duck', 'bunny', 'emu', 'ferret'}
animals.symmetric_difference(animals2) is {'cat', 'ant', 'emu', 'ferret'}
animals2 ^ animals is {'ant', 'emu', 'cat', 'ferret'}

Sets are tested for equality using ==. The order in which elements are written in a set is irrelevant for determining equality:

In [115]:
{'cat', 'bunny', 'ant', 'duck'} == {'duck', 'cat', 'bunny', 'ant'}
Out[115]:
True
In [116]:
{'cat', 'bunny', 'ant', 'duck'} == {'duck', 'bunny', 'ant'}
Out[116]:
False
In [117]:
{'cat', 'bunny', 'ant'} == {'bunny', 'ant', 'bunny', 'ant', 'cat', 'ant'}
Out[117]:
True

Set operations that change the sets

s.add(elem) changes s by adding elem (if it is not already in the set).

In [118]:
lettersSeen = set()
print(lettersSeen)
lettersSeen.add('f')
print(lettersSeen)
lettersSeen.add('e')
print(lettersSeen)
lettersSeen.add('r')
print(lettersSeen)
lettersSeen.add('r')
print(lettersSeen)
lettersSeen.add('e')
print(lettersSeen)
lettersSeen.add('t')
print(lettersSeen)
set()
{'f'}
{'e', 'f'}
{'e', 'r', 'f'}
{'e', 'r', 'f'}
{'e', 'r', 'f'}
{'t', 'e', 'r', 'f'}

s1 |= s2 changes s1 by adding all the elements of s2 to it.

In [119]:
print(lettersSeen)
lettersSeen |= {'e', 'm', 'u'}
print(lettersSeen)
{'t', 'e', 'r', 'f'}
{'t', 'u', 'f', 'm', 'e', 'r'}

Similarly, s1 &= s2 changes s1 by keeping only the elements that are also in s2, and s1 -= s2 changes s1 by removing from it the elements that are in s2:

In [120]:
print(lettersSeen)
lettersSeen &= {'t', 'r', 'u', 'e', 's', 'm'}
print(lettersSeen)
lettersSeen -= {'t', 'a', 'r'}
print(lettersSeen)
{'t', 'u', 'f', 'm', 'e', 'r'}
{'t', 'u', 'm', 'e', 'r'}
{'u', 'm', 'e'}

s.remove(elem) changes s by removing elem (raising a KeyError if it's not in the set):

In [121]:
print(lettersSeen)
lettersSeen.remove('u')
print(lettersSeen)
lettersSeen.remove('f')
{'u', 'm', 'e'}
{'m', 'e'}
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-121-1f52b7a7e5a6> in <module>
      2 lettersSeen.remove('u')
      3 print(lettersSeen)
----> 4 lettersSeen.remove('f')

KeyError: 'f'
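
When a missing element should not be treated as an error, sets also provide the discard method, which removes an element if it is present and otherwise does nothing. A minimal sketch, with illustrative names:

```python
someLetters = {'m', 'e'}

someLetters.discard('f')      # 'f' is not in the set: no error, no change
afterMissing = set(someLetters)

someLetters.discard('m')      # 'm' is in the set: it is removed
afterPresent = set(someLetters)

# An alternative is to check membership before calling remove.
if 'e' in someLetters:
    someLetters.remove('e')

print(someLetters)   # set()
```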

s.clear() removes all elements from s.

In [122]:
print(lettersSeen)
lettersSeen.clear()
print(lettersSeen)
{'m', 'e'}
set()

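Digging deeper: Hashable

When Python stores the keys of a dictionary (or the elements of a set) in memory, it uses their hashes: integers returned by the built-in hash function. Only hashable objects can be used as keys. Python's built-in mutable collections (lists, sets, and dictionaries) are not hashable.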

In [123]:
hash("Williams")
Out[123]:
7866055876053967815
In [124]:
hash(['Feb', 2015])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-124-808a8ea426e9> in <module>
----> 1 hash(['Feb', 2015])

TypeError: unhashable type: 'list'
In [125]:
hash( ('Feb', 2015) ) # Tuples are hashable even though lists are not
Out[125]:
253790086339038125
In [126]:
hash(123456) # small integers are their own hash value
Out[126]:
123456
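
Two further observations, offered as a sketch rather than a full explanation: values that compare equal have equal hashes, which is why 1 and 1.0 act as the same dictionary key, and string hashes (such as the one shown for "Williams" above) normally change from one Python session to the next, so the exact number is not meaningful. The names below are illustrative.

```python
sameHash = hash(1) == hash(1.0)          # 1 == 1.0, so their hashes agree

lookup = {1: 'one'}
foundWithFloat = lookup[1.0]             # 1.0 finds the entry stored under 1

# Equal tuples hash equally within one session.
tupleHashesAgree = hash(('Feb', 2015)) == hash(('Feb', 2015))

print(sameHash, foundWithFloat, tupleHashesAgree)   # True one True
```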

Note. At this point, you don't have to worry about why the keys are hashed, or how the hash function works! Take more advanced CS classes to learn more.