As you know from Java, we can use files stored on a secondary storage device for both input and output. For example, if we want to produce a report and print it, we need to write the output to a text file and then print the file. The data needed to produce that report may itself be stored in a file on disk. We would have to extract the data from the file, process it, and then produce the report.
File access in Python is much simpler than in Java. Files are represented by objects and are actually a built-in type. This means no additional modules are required in order to access a file from within your program.
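Whether we are reading from an input file or writing to an output file, we must first open the file and create a Python file object. A file is opened using the built-in file() constructor. (The file() constructor exists only in Python 2; the built-in open() function takes the same arguments and is the one to use in Python 3.)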
The function takes two string arguments. The first is the name of the file and the second is the mode, where "r" means open for reading and "w" means open for writing. The file() constructor returns a reference to the newly created file object, which is then used to access the file.
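As a rough illustration of the two modes, the following sketch opens one file for writing and then for reading. It assumes the built-in open() function, which accepts the same name and mode arguments as file() and also runs under Python 3. The file name report.txt and the temporary directory are hypothetical, chosen only so the sketch does not touch real files.

```python
import os
import tempfile

# Hypothetical file name inside a temporary directory, so the sketch
# does not overwrite any real file.
report_name = os.path.join(tempfile.mkdtemp(), "report.txt")

# Mode "w" opens the file for writing, creating it if necessary.
outfile = open(report_name, "w")
out_mode = outfile.mode
outfile.close()

# Mode "r" opens an existing file for reading.
infile = open(report_name, "r")
in_mode = infile.mode
infile.close()

print(out_mode, in_mode)
```

Each call returns a separate file object; the mode attribute simply reports the mode string that was passed in.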
Opening the same files in Java would require the following statements
When you are finished using the file, it should be closed. By closing the file, the system will flush the buffer and unlink the external file from the internal Python object.
As in Java, a file cannot be used once it has been closed. To reuse the file, you must first reopen it.
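A minimal sketch of closing and reopening, assuming open() in place of file() and a hypothetical file notes.txt in a temporary directory. It also shows what happens when a closed file object is used: Python raises a ValueError.

```python
import os
import tempfile

# Hypothetical file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

outfile = open(path, "w")
outfile.write("first version\n")
outfile.close()                 # flushes the buffer and releases the file
is_closed = outfile.closed      # True once close() has been called

# Any further I/O on the closed object raises ValueError.
try:
    outfile.write("more text\n")
    reuse_failed = False
except ValueError:
    reuse_failed = True

# To use the file again, reopen it (here for reading).
infile = open(path, "r")
contents = infile.readline()
infile.close()

print(is_closed, reuse_failed, repr(contents))
```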
Python provides several methods for outputting data to a file. In this chapter, we work only with text files, although you can also use binary files in Python. The easiest way to write text to a text file is with the write() method.
The write() method writes the given string to the output file represented by the given file object. To output other value types, you must first convert them to strings. To format the output written to the file, you can use the string format operator (%).
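The points above can be sketched as follows. The file name, variable names and values are hypothetical, and open() stands in for file(). Note that write() does not append a newline, so the line terminators are written explicitly.

```python
import os
import tempfile

# Hypothetical output file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "summary.txt")

count = 3          # an integer value
average = 85.6     # a floating-point value

outfile = open(path, "w")
outfile.write("Number of students: ")
outfile.write(str(count))            # convert the integer to a string first
outfile.write("\n")                  # write() adds no newline on its own
outfile.write("Average grade: %.2f\n" % average)   # string format operator
outfile.close()

# Read the file back to see what was written.
infile = open(path, "r")
written = infile.readlines()
infile.close()
print(written)
```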
Python also allows you to output the entire contents of a string list. Consider the following example
which writes each string in the list, one per line, to the text file and produces
Line 1
Line 2
and yet more
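A sketch of how such output might be produced, assuming the list is written with the writelines() method and that each string already ends with a newline; writelines() does not add line terminators itself. The file name is hypothetical and open() is used in place of file().

```python
import os
import tempfile

# Hypothetical output file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "lines.txt")

# Each string carries its own newline; writelines() adds none.
text_lines = ["Line 1\n", "Line 2\n", "and yet more\n"]

outfile = open(path, "w")
outfile.writelines(text_lines)
outfile.close()

# Read the file back to confirm one string per line.
infile = open(path, "r")
result = infile.readlines()
infile.close()
print(result)
```

Without the trailing "\n" in each string, the three strings would run together on a single line.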
Python provides two line-oriented methods for extracting data from a text file, readline() and readlines(), both of which return the data as strings. If you need other data types, you must explicitly convert the extracted string(s).
In the following example, the readline() method is used to extract an entire line from the text file, with the contents returned as a string
The end of file is flagged when there is no data left to be extracted. In Python, this is done by the readline() method returning an empty string ("").
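A minimal sketch of reading a file line by line until readline() returns the empty string. The sample file and its contents are hypothetical, and open() is used in place of file().

```python
import os
import tempfile

# Create a small hypothetical input file to read from.
path = os.path.join(tempfile.mkdtemp(), "input.txt")
outfile = open(path, "w")
outfile.write("alpha\nbeta\ngamma\n")
outfile.close()

infile = open(path, "r")
line_count = 0
line = infile.readline()       # extract the first line
while line != "":              # "" signals the end of the file
    line_count += 1
    line = infile.readline()   # extract the next line
infile.close()

print(line_count)
```

A blank line inside the file is returned as "\n", not "", so it does not end the loop early.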
The readline() method leaves the newline character at the end of the string when the line is extracted from the file. The rstrip() string method can be used to strip the whitespace from the end.
If there is no newline at the end, which can occur for the last line in the file, and no other trailing whitespace, then rstrip() returns the string unchanged.
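The behavior of rstrip() in both cases can be sketched as follows, using a hypothetical two-line file whose last line has no newline; open() is used in place of file().

```python
import os
import tempfile

# Hypothetical file: the last line is deliberately left without a newline.
path = os.path.join(tempfile.mkdtemp(), "names.txt")
outfile = open(path, "w")
outfile.write("Smith, John\nRoberts, Jane")
outfile.close()

infile = open(path, "r")
raw_first = infile.readline()      # still ends with the newline
first = raw_first.rstrip()         # trailing whitespace removed
raw_last = infile.readline()       # no newline to remove
last = raw_last.rstrip()           # string is unchanged
infile.close()

print(repr(raw_first), repr(first))
print(repr(raw_last), repr(last))
```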
To read individual characters from a text file, simply pass an integer value to the readline() method indicating the maximum number of characters to be extracted; extraction still stops at the end of the current line
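A short sketch of readline() with a size argument, using a hypothetical file containing the line "abcdef". It illustrates that the argument is an upper bound: the call never reads past the end of the current line. open() is used in place of file().

```python
import os
import tempfile

# Hypothetical two-line input file.
path = os.path.join(tempfile.mkdtemp(), "chars.txt")
outfile = open(path, "w")
outfile.write("abcdef\nxyz\n")
outfile.close()

infile = open(path, "r")
part1 = infile.readline(2)     # first two characters of line 1
part2 = infile.readline(2)     # next two characters of line 1
part3 = infile.readline(10)    # rest of line 1, including the newline
part4 = infile.readline(1)     # first character of line 2
infile.close()

print(repr(part1), repr(part2), repr(part3), repr(part4))
```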
Python provides a convenient method, readlines(), for extracting the entire contents of a file and storing it in a string list
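A minimal sketch of reading a whole file into a list with readlines(). The file and its contents are hypothetical, and open() is used in place of file(). Each list element keeps its trailing newline.

```python
import os
import tempfile

# Hypothetical three-line input file.
path = os.path.join(tempfile.mkdtemp(), "words.txt")
outfile = open(path, "w")
outfile.write("alpha\nbeta\ngamma\n")
outfile.close()

infile = open(path, "r")
all_lines = infile.readlines()    # one string per line of the file
infile.close()

print(len(all_lines))
print(all_lines)
```

Because the entire file is held in memory at once, this approach suits small and medium-sized files.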
Consider the following example program, which produces a double-spaced version of the text file myreport.txt by inserting a blank line after each existing line.
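One way such a program might look is sketched here; it is not the original listing. The function name double_space and the output file name are hypothetical, open() stands in for file(), and the input is created in a temporary directory so the sketch is self-contained. It reads all lines with readlines() and writes each one back followed by an empty line, which also leaves a blank line after the final line.

```python
import os
import tempfile

def double_space(in_name, out_name):
    """Copy in_name to out_name, adding a blank line after every line."""
    infile = open(in_name, "r")
    lines = infile.readlines()
    infile.close()

    outfile = open(out_name, "w")
    for line in lines:
        outfile.write(line)    # the original line, newline included
        outfile.write("\n")    # the extra blank line
    outfile.close()

# Build a small stand-in for myreport.txt and double-space it.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "myreport.txt")
target = os.path.join(workdir, "myreport_dbl.txt")   # hypothetical name

outfile = open(source, "w")
outfile.write("one\ntwo\nthree\n")
outfile.close()

double_space(source, target)

infile = open(target, "r")
doubled = infile.readlines()
infile.close()
print(doubled)
```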
An alternative approach to processing the entire contents of a text file is to use the file iterator. Python provides an iterator that can be used as part of a for loop. The following is a modified version of the dblspace.py program presented in the previous section
In this example, each iteration of the loop causes the next line in the file to be extracted and stored in the line variable.
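A sketch of a double-spacing routine built on the file iterator follows. It is an illustration rather than the original dblspace.py: the function name and file names are hypothetical, and open() is used in place of file(). The for loop asks the file object for one line at a time, so no explicit readline() call or end-of-file test is needed.

```python
import os
import tempfile

def double_space_iter(in_name, out_name):
    """Double-space in_name into out_name using the file iterator."""
    infile = open(in_name, "r")
    outfile = open(out_name, "w")
    for line in infile:        # each pass extracts the next line
        outfile.write(line)
        outfile.write("\n")
    infile.close()
    outfile.close()

# Small self-contained demonstration with hypothetical files.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "myreport.txt")
target = os.path.join(workdir, "myreport_dbl.txt")

outfile = open(source, "w")
outfile.write("first\nsecond\n")
outfile.close()

double_space_iter(source, target)

infile = open(target, "r")
iter_result = infile.readlines()
infile.close()
print(iter_result)
```

Unlike readlines(), the iterator does not need the whole file in memory at once.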
All of our previous examples dealt with the extraction of strings from a text file. But what if we need to extract numeric values? Python only provides methods for extracting strings. To extract other data types, we must handle the conversions explicitly. Consider the following sample text file pertaining to student data for a given course.
Computer Programming I
100
Smith, John
92.4
208
Roberts, Jane
88.05
334
Green, Patrick
76.35
The first line in the file is the name of the course. The remaining lines contain three student records. Each record is spread over three lines: the first line contains the student’s identification number, the second the student’s name, and the third that student’s average grade for the course. Suppose we want to extract this data and produce a report similar to the following
STUDENT REPORT
Computer Programming I
----------------------------------------
100 Smith, John 92.40
208 Roberts, Jane 88.05
334 Green, Patrick 76.35
----------------------------------------
Average Grade 85.60
We have no alternative but to extract each line of the data file as a string. But we need to treat the grades as real values in order to compute the average grade for the course. We can accomplish this the same way we did with user interaction: typecast the strings to the appropriate data type.
The following program is an implementation of a solution which extracts the student data to produce the report illustrated above.
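One possible implementation is sketched here; it is not the original listing. The function name build_report, the file name students.txt and the column widths are all assumptions, and open() is used in place of file(). The sketch also assumes the data file is well formed, with three lines per student and no trailing blank line.

```python
import os
import tempfile

def build_report(in_name):
    """Read the student data file and return the report as a list of lines."""
    infile = open(in_name, "r")
    course = infile.readline().rstrip()      # first line: course name
    report = ["STUDENT REPORT", course, "-" * 40]

    total = 0.0
    count = 0
    id_line = infile.readline()
    while id_line != "":                     # "" marks the end of the file
        student_id = int(id_line)            # string -> integer
        name = infile.readline().rstrip()
        grade = float(infile.readline())     # string -> real value
        report.append("%3d  %-25s %6.2f" % (student_id, name, grade))
        total += grade
        count += 1
        id_line = infile.readline()
    infile.close()

    report.append("-" * 40)
    if count > 0:
        report.append("Average Grade %.2f" % (total / count))
    return report

# Recreate the sample data file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "students.txt")
outfile = open(path, "w")
outfile.write("Computer Programming I\n"
              "100\nSmith, John\n92.4\n"
              "208\nRoberts, Jane\n88.05\n"
              "334\nGreen, Patrick\n76.35\n")
outfile.close()

report = build_report(path)
for row in report:
    print(row)
```

Both int() and float() ignore surrounding whitespace, so the newline left by readline() does not need to be stripped before the numeric conversions.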
To extract mixed-type data stored on the same input line, we must first split or tokenize the string into individual parts using the split() method of the string class. Consider the following code segment
which produces
['12', '45.5', 'abc', '9']
The split() method splits or tokenizes a string into substrings and stores the results in a string list. By default, the string is split using whitespace characters as the delimiter. You can also specify a delimiter string as an argument
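Both forms of split() can be sketched as follows. The input strings and variable names are made up for illustration. After splitting, each token is still a string and must be converted before it can be used as a number.

```python
# Default split: any run of whitespace separates the tokens.
data_line = "12 45.5 abc 9"
tokens = data_line.split()
print(tokens)

# Convert individual tokens to the types we need.
quantity = int(tokens[0])
price = float(tokens[1])
label = tokens[2]

# Split on an explicit delimiter string instead of whitespace.
record = "334:Green, Patrick:76.35"
fields = record.split(":")
print(fields)
```

When a delimiter is given, it is matched as a whole string, so the space inside "Green, Patrick" is left untouched.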