Today we'll talk about how to read from a file using Python and how to use nested loops, building on what we learnt about sequences such as str and list in the last lecture.
Acknowledgement. This notebook has been adapted from the Wellesley CS111 Spring 2019 course materials (http://cs111.wellesley.edu/spring19).
In the last lecture, we wrote several helper functions together, such as isVowel:
def isVowel(char):
    """Predicate that returns true only when a letter is a vowel."""
    return char.lower() in 'aeiou'
def startsWithVowel(word):
    """Predicate that returns true only when a word starts with a vowel."""
    if len(word):  # any length > 0 evaluates to true
        return isVowel(word[0])
    return False
def countAllVowels(word):
    '''Returns number of vowels in the word'''
    count = 0  # initialize the counter
    for char in word:  # iterate over the word one character at a time
        if isVowel(char):
            count += 1
    return count
# our updated function definition for countChar
def countChar(char, word):
    '''Counts the number of times a character appears in a word'''
    count = 0
    for letter in word:
        if char.lower() == letter.lower():
            count += 1
    return count
def wordStartEnd(wordList):
    '''Takes a list of words and counts the
    number of words in it that start and end
    with the same letter'''
    count = 0  # initialize counter
    for word in wordList:
        if len(word):  # why do we need this?
            if word[0].lower() == word[-1].lower():  # will this work?
                # debugging print here perhaps
                # print(word)
                count += 1
    return count
Summary. We built various helper functions that iterate over a sequence and return a statistic about it (in our case, a count).
You can imagine wanting to iterate over a sequence and compute a collection of items from it with a certain property. For example, instead of counting the number of words in a list that start and end with the same letter, we may want to return the words themselves.
To accumulate a collection of items, we can use a list, in much the same way as we accumulate in a counter.
List updates. Lists are a mutable structure, which means we can update them (delete things from them, add things to them, etc.). Today we will only look at how to accumulate using a list, that is, how to add things to a given list. There are two ways in particular: list concatenation and the append method.
In later lectures, we will see subtle ways in which these two differ.
firstList = [1, 2, 3, 4, 5]
secondList = [8, 9, 10, 11]
newList = firstList + secondList # list concatenation
newList
List concatenation creates a new list and does not modify the original lists.
firstList
secondList
We can also append a given item to an existing list by using the append method. This modifies the list it is called on by adding the item to it.
firstList.append(6)
firstList #it has changed!
Summary. There are two ways to accumulate in a list: list concatenation or appending to a list. List concatenation uses the + operator and returns a new list which is a concatenation of the two lists. Using the append method on a list modifies it by appending the item to it. We will discuss the subtleties between the two next week.
We often use loops to iterate over a sequence and "accumulate" certain items from it. Suppose someone gave us a list of words and we want to collect all the words in that list that start with a vowel. We can use our isVowel helper function, but how should we approach this problem?
Processes like this, where we build up a result in a list, are called list accumulation. You can also accumulate items in a list using concatenation, just as with strings. For example:
myList = ['apple', 'orange', 'banana']
myList += ['papaya'] # concatenate myList and ['papaya']
myList
Back to the exercise.
Let us now define a function vowelWordList that iterates over a given list of words wordList, collects all the words in the list that begin with a vowel in a new list, and returns that list.
def vowelWordList(wordList):
    '''Returns a list of words that start with a vowel from the input list'''
    result = []  # initialize list accumulation variable
    for word in wordList:
        if startsWithVowel(word):
            result.append(word)
    return result
phrase = ['The', 'sun', 'rises', 'in', 'the',
          'east', 'and', 'sets', 'in', 'the', 'west']
vowelWordList(phrase)
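The same accumulation pattern answers the question raised earlier: returning the words that start and end with the same letter instead of counting them. The sketch below is one possible version; the function name sameStartEndWords and the sample list are made up for illustration and are not part of the course materials.

```python
def sameStartEndWords(wordList):
    '''Returns a list of the words in wordList that start
    and end with the same letter (ignoring case)'''
    result = []  # list accumulation variable (hypothetical example)
    for word in wordList:
        if len(word) > 0:  # skip empty strings, which have no first letter
            if word[0].lower() == word[-1].lower():
                result.append(word)
    return result

# 'Anna', 'sets' and 'a' qualify; 'east', 'west' and '' do not
print(sameStartEndWords(['Anna', 'sets', 'east', 'west', '', 'a']))
```

Compared with wordStartEnd, the only changes are that the counter becomes an empty list and count += 1 becomes result.append(word).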
Suppose we want to test a function we have written. There are several ways to do so. You can test interactively in Python by importing the function and checking that it returns the correct output when called on a variety of values.
Testing using doctests. Python's doctest module allows you to embed your test cases and their expected output directly in a function's docstring. To use the doctest module, we must import it. To make sure the test cases are run when the program is run as a script from the terminal, we need to insert a call to doctest.testmod(). To make sure that the tests are not run in an interactive shell or when the functions from the module are imported, we should place that call within a guarded if __name__ == "__main__": block. See slides for more explanation.
def isVowel(char):
    """Predicate that returns true only when a letter is a vowel.
    >>> isVowel('d')
    False
    >>> isVowel('e')
    True
    """
    return char.lower() in 'aeiou'
import doctest
doctest.testmod(verbose = True)
# Task: save this as a script and
# run it from the terminal
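As a rough sketch of the script version described above, the file could look like the following. The file name vowels.py is an assumption made for this example; the function is the same isVowel as before.

```python
# vowels.py (hypothetical file name, chosen only for this example)
import doctest


def isVowel(char):
    """Predicate that returns True only when a letter is a vowel.

    >>> isVowel('d')
    False
    >>> isVowel('e')
    True
    """
    return char.lower() in 'aeiou'


if __name__ == "__main__":
    # Runs only when the file is executed as a script,
    # not when isVowel is imported from another module.
    doctest.testmod(verbose=True)
```

Running python vowels.py from the terminal would then execute the two test cases in the docstring, while importing isVowel from another file would not trigger them.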
Just as we can nest if statements, we can nest loops. As with nested ifs, the body of the inner loop is determined by indentation. Let us look at an example of two nested for loops that generate a multiplication table.
for i in range(5):  # This is called the "outer loop"
    for j in range(5):  # This is called the "inner loop"
        print(i, j)
for i in range(2, 6):  # This is called the "outer loop"
    for j in range(2, 6):  # This is called the "inner loop"
        print(i, 'x', j, '=', i*j)
What happens if we add conditionals in the outer and/or inner loop? Predict the output of the following examples:
for i in range(2, 6):
    for j in range(2, 6):
        if i <= j:
            print(i, 'x', j, '=', i*j)
for i in range(2, 6):
    if i % 2 == 0:
        for j in range(2, 6):
            if i <= j:
                print(i, 'x', j, '=', i*j)
Here is another simple example of nested loops, this one printing words. Predict the words that will be printed, in order.
for letter in ['b','d','r','s']:
    for suffix in ['ad', 'ib', 'ump']:
        print(letter + suffix)
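To check your prediction, here is a small sketch that combines the nested loop with list accumulation: instead of printing each word, it collects the words in a list. The variable names letters, suffixes and words are introduced only for this example.

```python
letters = ['b', 'd', 'r', 's']
suffixes = ['ad', 'ib', 'ump']

words = []  # list accumulation variable
for letter in letters:        # outer loop: runs 4 times
    for suffix in suffixes:   # inner loop: runs 3 times for each letter
        words.append(letter + suffix)

print(len(words))  # 12, because the inner body runs 4 * 3 times
print(words)
```

The inner loop runs to completion for each value of the outer loop variable, so all three words starting with 'b' are collected before any word starting with 'd'.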
You can read from and write to a file using Python. Today we will focus only on reading from a file; next week, we will look at how to write to a file.
A file is an object that is created by the built-in function open.
myFile = open('textfiles/prideandprejudice.txt', 'r') # 'r' means open the file for reading
print(myFile)
The object returned from open() is a type of file object called io.TextIOWrapper. In short, it is a type of file object that interprets the file as a stream of text.
type(myFile)
The mode 'r' for reading the file is optional, because reading is the default mode when opening a file:
myFile2 = open('textfiles/prideandprejudice.txt')
myFile2
Aside: If you do not specify an encoding, Python 3 picks a default that depends on your platform (commonly UTF-8); you can choose one explicitly with the encoding argument of open. ASCII is one of the earliest text encodings. Recall the ASCII table: http://www.asciitable.com/. There are other encodings for text, in particular Unicode encodings, which cover most of the world's writing systems. To learn more about how the computer interprets text encoding, take CS240!
With block to open and close files. Technically, when you open a file, you must also close it. To avoid writing code that explicitly closes the file, we will use a with...as block, which keeps the file open inside the block and closes it automatically when the block ends.
Within a with...as block, we can iterate over the lines of a file the same way we would iterate over any sequence.
with open('textfiles/classNames.txt') as roster:  # roster: name of file object
    for line in roster:
        print(line)
# file is implicitly closed here
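To see what the with block saves us from writing, the sketch below reads the same file twice: once with an explicit open and close, and once with a with block. So that it runs anywhere, it first creates a small throwaway file; the file name sampleRoster.txt and its contents are made up for this example, and the setup step uses file writing, which we cover next week.

```python
import os
import tempfile

# Setup only: create a small throwaway file (hypothetical name and contents).
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'sampleRoster.txt')
with open(path, 'w') as setupFile:
    setupFile.write('Ada Lovelace\nGrace Hopper\n')

# Version 1: open and close the file ourselves.
roster = open(path)
explicitLines = []
try:
    for line in roster:
        explicitLines.append(line)
finally:
    roster.close()  # runs even if the loop raises an error

# Version 2: the with block closes the file for us.
withLines = []
with open(path) as roster2:
    for line in roster2:
        withLines.append(line)

print(explicitLines == withLines)     # both versions read the same lines
print(roster.closed, roster2.closed)  # both files are closed at this point
```

Both versions read the same lines, but the with block is shorter and makes it hard to forget the close.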
String functions helpful for file reading. Notice the blank line after every line of the file in the output. This is caused by the special newline character \n at the end of each line, which is printed in addition to the newline that print adds. We can remove it using the strip() string method. Suppose we wanted to split a name into a list containing the first name and the last name; we can do that with the split method. Let's try out these methods.
myLine = ' Trying out the strip function to see what it does. '
myLine.strip() # notice the whitespace removed
message = '\n \n Trying out the strip function to see what it does. \n \n'
print(message)
message.strip() # notice the whitespace removed
print(message.strip())
listOfWords = message.split()  # splits the string on whitespace and returns a list of words
listOfWords
# let's try the same example again with .strip()
with open('textfiles/classNames.txt') as roster:  # roster: name of file object
    for line in roster:
        print(line.strip())
# file is implicitly closed here
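Here is a minimal sketch of the split idea mentioned above, combined with list accumulation. Since the contents of classNames.txt are not shown here, the example uses io.StringIO as a stand-in that can be iterated line by line like an open text file; the three names are made up, and the sketch assumes each line holds a first name followed by a last name.

```python
import io

# Stand-in for an open roster file (hypothetical contents).
fakeRoster = io.StringIO('Ada Lovelace\nGrace Hopper\nAlan Turing\n')

firstNames = []  # list accumulation variable
for line in fakeRoster:
    parts = line.strip().split()  # e.g. ['Ada', 'Lovelace']
    if len(parts) > 0:            # skip blank lines
        firstNames.append(parts[0])

print(firstNames)  # ['Ada', 'Grace', 'Alan']
```

With a real file, the for loop would sit inside a with open(...) as roster: block, exactly as in the example above.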
Now that we know how to write nested loops, accumulate in lists and read from files, let us do some fun exercises with these concepts. We will build these examples live in class.