Beginning Python - From Novice to Professional


16.01.2014 Views

CHAPTER 10 ■ BATTERIES INCLUDED 219

fileinput

You learn a lot about reading from and writing to files in Chapter 11, but here is a sneak preview. The fileinput module enables you to easily iterate over all the lines in a series of text files. If you call your script like this (assuming a UNIX command line):

    $ python some_script.py file1.txt file2.txt file3.txt

you will be able to iterate over the lines of file1.txt through file3.txt in turn. You can also iterate over lines supplied to standard input (sys.stdin, remember?), for example, in a UNIX pipe (using the standard UNIX command cat):

    $ cat file.txt | python some_script.py

If you use fileinput, this way of calling your script (with cat in a UNIX pipe) works just as well as the previous one (supplying the file names as command-line arguments to your script). The most important functions of the fileinput module are described in Table 10-4.

Table 10-4. Some Important Functions in the fileinput Module

Function                               Description
input([files[, inplace[, backup]]])    Facilitates iteration over lines in multiple input streams
filename()                             Returns name of current file
lineno()                               Returns current (cumulative) line number
filelineno()                           Returns line number within current file
isfirstline()                          Checks whether current line is first in file
isstdin()                              Checks whether last line was from sys.stdin
nextfile()                             Closes current file and moves to the next
close()                                Closes the sequence

The function fileinput.input is the most important of the functions. It returns an object that you can iterate over in a for loop. If you don’t want the default behavior (in which fileinput finds out which files to iterate over), you can supply one or more file names to this function (as a sequence). You can also set the inplace parameter to a true value (inplace=True) to enable in-place processing. For each line you access, you’ll have to print out a replacement, which will be put back into the current input file.
The optional backup argument gives a file name extension to a backup file created from the original file when you do in-place processing.

The function fileinput.filename returns the file name of the file you are currently in (that is, the file that contains the line you are currently processing).

The function fileinput.lineno returns the number of the current line. This count is cumulative so that when you are finished with one file and begin processing the next, the line number is not reset but starts at one more than the last line number in the previous file.
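As a rough illustration of how fileinput.input, fileinput.filename, and fileinput.lineno fit together, here is a minimal sketch. It is written for Python 3 and passes the file names explicitly as a sequence; the helper numbered_lines and the throwaway files a.txt and b.txt are made up for this example.

```python
import fileinput
import os
import tempfile


def numbered_lines(paths):
    """Return (file name, cumulative line number, text) for every line in paths."""
    result = []
    for line in fileinput.input(files=paths):
        name = os.path.basename(fileinput.filename())
        result.append((name, fileinput.lineno(), line.rstrip("\n")))
    fileinput.close()  # finish the sequence so input() can be called again
    return result


# Two small throwaway files (hypothetical names) to iterate over.
workdir = tempfile.mkdtemp()
paths = []
for name, text in [("a.txt", "one\ntwo\n"), ("b.txt", "three\n")]:
    path = os.path.join(workdir, name)
    with open(path, "w") as handle:
        handle.write(text)
    paths.append(path)

rows = numbered_lines(paths)
for row in rows:
    print(row)
# The count carries over from a.txt to b.txt:
# ('a.txt', 1, 'one')
# ('a.txt', 2, 'two')
# ('b.txt', 3, 'three')
```

Note that the first line of b.txt is reported as line 3, not line 1, because lineno counts across the whole series of files.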

The function fileinput.filelineno returns the number of the current line within the current file. Each time you are finished with one file and begin processing the next, the file line number is reset, and restarts at 1.

The function fileinput.isfirstline returns a true value if the current line is the first line of the current file, and a false value otherwise.

The function fileinput.isstdin returns a true value if the current file is sys.stdin and false otherwise.

The function fileinput.nextfile closes the current file and skips to the next one. The lines you skip do not count toward the line count. This can be useful if you know that you are finished with the current file; for example, if each file contains words in sorted order, and you are looking for a specific word. If you have passed the word’s position in the sorted order, you can safely skip to the next file.

The function fileinput.close closes the entire chain of files and finishes the iteration.

Example

Numbering the lines of a Python script. Let’s say you have written a Python script and you want to number the lines. Because you want the program to keep working after you’ve done this, you have to add the line numbers in comments to the right of each line. To line them up, you can use string formatting. Let’s allow each program line to get 40 characters maximum and add the comment after that. The program in Listing 10-6 shows a simple way of doing this with fileinput and the inplace parameter.

Listing 10-6. Adding Line Numbers to a Python Script

    # numberlines.py

    import fileinput

    for line in fileinput.input(inplace=True):
        line = line.rstrip()
        num = fileinput.lineno()
        print('%-40s # %2i' % (line, num))

If you run this program on itself, like this:

    $ python numberlines.py numberlines.py

you end up with the program in Listing 10-7. Note that the program itself has been modified, and that if you run it like this several times, you end up with multiple numbers on each line.
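Because in-place processing overwrites its input, it can be worth trying the idea on a throwaway file with a backup extension first. The following is only a sketch under a few assumptions: it uses Python 3, the helper number_in_place and the file demo.py are invented for illustration, and the backup extension .bak is an arbitrary choice.

```python
import fileinput
import os
import tempfile


def number_in_place(path, backup=".bak"):
    """Append a line-number comment to each line of path, keeping a backup copy."""
    for line in fileinput.input(files=[path], inplace=True, backup=backup):
        # With inplace=True, standard output is redirected into the file,
        # so whatever is printed here replaces the line just read.
        print("%-40s # %2i" % (line.rstrip(), fileinput.lineno()))
    fileinput.close()


workdir = tempfile.mkdtemp()
script = os.path.join(workdir, "demo.py")  # hypothetical throwaway file
with open(script, "w") as handle:
    handle.write("x = 1\nprint(x)\n")

number_in_place(script)

# The rewritten file now carries the numbers ...
with open(script) as handle:
    numbered = handle.read().splitlines()
# ... and the untouched original survives next to it as demo.py.bak.
with open(script + ".bak") as handle:
    original = handle.read()
```

If the result is not what you wanted, the original text can be restored from the backup file.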
Recall that rstrip is a string method that returns a copy of a string, where all the whitespace on the right has been removed (see the section “String Methods” in Chapter 3 and Table B-6 in Appendix B).
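The sorted-word scenario mentioned in the description of fileinput.nextfile could look roughly like the sketch below. It assumes Python 3 and one word per line in each file; the function find_word and the sample files are hypothetical, made up to show fileinput.nextfile and fileinput.filelineno working together.

```python
import fileinput
import os
import tempfile


def find_word(target, paths):
    """Return (file name, line number within file) for each sorted file containing target."""
    found = []
    for line in fileinput.input(files=paths):
        word = line.strip()
        if word == target:
            found.append((os.path.basename(fileinput.filename()),
                          fileinput.filelineno()))
            fileinput.nextfile()  # nothing more to learn from this file
        elif word > target:
            # Past the target's position in sorted order: skip the rest.
            fileinput.nextfile()
    fileinput.close()
    return found


# Three small sorted word lists (hypothetical sample data).
workdir = tempfile.mkdtemp()
samples = [
    ("a.txt", "apple\nbanana\ncherry\n"),
    ("b.txt", "banana\nkiwi\n"),
    ("c.txt", "apple\ncherry\n"),
]
paths = []
for name, text in samples:
    path = os.path.join(workdir, name)
    with open(path, "w") as handle:
        handle.write(text)
    paths.append(path)

hits = find_word("banana", paths)
print(hits)  # [('a.txt', 2), ('b.txt', 1)]
```

The line numbers reported here restart at 1 for each file because they come from filelineno rather than lineno.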

