16.01.2014 Views

Beginning Python - From Novice to Professional


CHAPTER 10 ■ BATTERIES INCLUDED

User-Agent: Microsoft-Outlook-Express-Macintosh-Edition/5.02.2022
Date: Wed, 19 Dec 2004 17:22:42 -0700
Subject: Re: Spam
From: Foo Fie
To: Magnus Lie Hetland
CC:
Message-ID:
In-Reply-To:
Mime-version: 1.0
Content-type: text/plain; charset="US-ASCII"
Content-transfer-encoding: 7bit
Status: RO
Content-Length: 55
Lines: 6

So long, and thanks for all the spam!

Yours,
Foo Fie

Let’s try to find out who this e-mail is from. If you examine the text, I’m sure you can figure it out in this case (especially if you look at the message itself, at the bottom, of course). But can you see a general pattern? How do you extract the name of the sender, without the e-mail address? Or, how can you list all the e-mail addresses mentioned in the headers? Let’s handle the first task first.

The line containing the sender begins with the string 'From: ' and ends with an e-mail address enclosed in angle brackets (< and >). You want the text found between those. If you use the fileinput module, this ought to be an easy task. A program solving the problem is shown in Listing 10-10.

■Note You could solve this problem without using regular expressions if you wanted. You could also use the email module.
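The two alternatives mentioned in the note can be sketched roughly as follows. This is only an illustration, assuming a current Python 3 where parseaddr lives in email.utils; the sample message, its example.com addresses, and the function names sender_name and sender_name_plain are made up for the sketch and do not come from the message shown earlier.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical sample message; the addresses are placeholders.
SAMPLE = (
    "From: Foo Fie <foo@example.com>\n"
    "To: Magnus Lie Hetland <magnus@example.com>\n"
    "Subject: Re: Spam\n"
    "\n"
    "So long, and thanks for all the spam!\n"
)

def sender_name(raw_message):
    """Return the display name from the From header using the email module."""
    msg = message_from_string(raw_message)
    # parseaddr splits 'Name <address>' into a (name, address) pair
    name, _address = parseaddr(msg.get("From", ""))
    return name

def sender_name_plain(raw_message):
    """Same task with plain string methods, no regular expressions."""
    for line in raw_message.splitlines():
        if not line:  # a blank line ends the headers
            break
        if line.startswith("From: ") and line.endswith(">"):
            # Keep the text between 'From: ' and the last '<'
            return line[len("From: "):line.rindex("<")].strip()
    return ""

print(sender_name(SAMPLE))
print(sender_name_plain(SAMPLE))
```

Both calls print Foo Fie for this sample. The email-module version also copes with details the string version ignores, such as quoted display names, which is one reason to prefer it for real mail.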

Listing 10-10. A Program for Finding the Sender of an E-mail<br />

# find_sender.py
import fileinput, re

# Capture the text between 'From: ' and the address in angle brackets
pat = re.compile('From: (.*?) <.*?>$')

for line in fileinput.input():
    m = pat.match(line)
    if m: print(m.group(1))

You can then run the program like this (assuming that the e-mail message is in the text file message.eml):
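Because fileinput.input() with no arguments reads the files named on the command line (falling back to standard input), an invocation along the lines of python find_sender.py message.eml would presumably do. The sketch below tries the same logic without a shell: it writes a small, made-up message to a temporary message.eml and passes the path to fileinput.input() explicitly. The sample text, its example.com address, and the helper name find_senders are assumptions for illustration only.

```python
import fileinput
import os
import re
import tempfile

# Hypothetical message; the address is a placeholder.
MESSAGE = (
    "Subject: Re: Spam\n"
    "From: Foo Fie <foo@example.com>\n"
    "\n"
    "Yours,\n"
    "Foo Fie\n"
)

# Sender name, then an address in angle brackets at the end of the line
pat = re.compile(r'From: (.*?) <.*?>$')

def find_senders(paths):
    """Collect the sender name from every matching line in the given files."""
    senders = []
    for line in fileinput.input(files=paths):
        m = pat.match(line)  # '$' also matches just before the trailing newline
        if m:
            senders.append(m.group(1))
    return senders

# Write the sample to a temporary message.eml and scan it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "message.eml")
    with open(path, "w") as f:
        f.write(MESSAGE)
    result = find_senders([path])

print(result)  # ['Foo Fie']
```

Only the header line matches. The signature line Foo Fie at the bottom is ignored because it does not begin with 'From: '.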
