Beginning Python - From Novice to Professional
Beginning Python - From Novice to Professional Beginning Python - From Novice to Professional
CHAPTER 14 ■ NETWORK PROGRAMMING 301 ■Note At the time of writing, this code gave the exact results shown. However, by the time you read this, the Python page may have changed, and you may have to modify the regular expression to make the code work. Retrieving Remote Files The urlopen function gives you a file-like object you can read from. If you would rather have urllib take care of downloading the file for you, storing a copy in a local file, you can use urlretrieve instead. Rather than returning a file-like object, it returns a tuple (filename, headers), where filename is the name of the local file (this name is created automatically by urllib), and headers contains some information about the remote file. (I’ll ignore headers here. Look up urlretrieve in the standard library documentation of urllib if you want to know more about it.) If you want to specify a file name for the downloaded copy, you can supply that as a second parameter: urlretrieve('http://www.python.org', 'C:\\python_webpage.html') This retrieves the Python home page and stores it in the file C:\python_webpage.html. If you don’t specify a file name, the file is put in some temporary location, available for you to open (with the open function), but when you’re done with it, you may want to have it removed so that it doesn’t take up space on your hard drive. To clean up such temporary files, you can call the function urlcleanup without any arguments, and it takes care of things for you. SOME UTILITIES In addition to reading and downloading files through URLs, urllib also offers some functions for manipulating the URLs themselves. (The following assumes some knowledge of URLs and CGI.) These functions are • quote(string[, safe]). Returns a string in which all special characters (characters that have special significance in URLs) have been replaced by URL-friendly versions (such as %7E instead of ~). This can be useful if you have a string that might contain such special characters and you want to use it as a URL. The safe string includes characters that should not be coded like this: the default is '/'. • quote_plus(string[, safe]). Works like quote, but also replaces spaces with plus signs. • unquote(string). The reverse of quote. • unquote_plus(string). The reverse of quote_plus. • urlencode(query[, doseq]). Converts a mapping (such as a dictionary) or a sequence of twoelement tuples—of the form (key, value)—into a “URL-encoded” string, which can be used in CGI queries. (Check the Python documentation for more information.) Other Modules As mentioned, beyond the modules explicitly discussed in this chapter, there are hordes of network-related modules in the Python library and elsewhere. In Table 14-1, you find a list of
302 CHAPTER 14 ■ NETWORK PROGRAMMING some network-related modules from the Python standard library. As noted in the table, some of these modules are discussed elsewhere in the book. Table 14-1. Some Network-Related Modules in the Standard Library Module Description asynchat Additional functionality for asyncore (see Chapter 24) asyncore Asynchronous socket handler (see Chapter 24) cgi Basic CGI support (see Chapter 15) Cookie Cookie object manipulation, mainly for servers (see Chapter 15) cookielib Client-side cookie support (see Chapter 15) email Support e-mail messages (including MIME) ftplib FTP client module gopherlib gopher client module httplib HTTP client module imaplib IMAP4 client module mailbox Reads several mailbox formats mailcap Access to MIME configuration through mailcap files mhlib Access to MH mailboxes nntplib NNTP client module (see Chapter 23) poplib POP client module robotparser Support for parsing Web server robot files SimpleXMLRPCServer A simple XML-RPC server (see Chapter 27) smtpd SMTP server module smtplib SMTP client module telnetlib telnet client module urlparse Support for interpreting URLs xmlrpclib Client support for XML-RPC (see Chapter 27) SocketServer and Friends As you saw in the section about the socket module earlier, writing a simple socket server isn’t really hard. If you want to go beyond the basics, however, getting some help can be nice. The SocketServer module is the basis for a framework of several servers in the standard library, including BaseHTTPServer, SimpleHTTPServer, CGIHTTPServer, SimpleXMLRPCServer, and DocXMLRPCServer, all of which add various specific functionality to the basic server.
- Page 282 and 283: CHAPTER 10 ■ BATTERIES INCLUDED 2
- Page 284: CHAPTER 10 ■ BATTERIES INCLUDED 2
- Page 287 and 288: 256 CHAPTER 11 ■ FILES AND STUFF
- Page 289 and 290: 258 CHAPTER 11 ■ FILES AND STUFF
- Page 291 and 292: 260 CHAPTER 11 ■ FILES AND STUFF
- Page 293 and 294: 262 CHAPTER 11 ■ FILES AND STUFF
- Page 295 and 296: 264 CHAPTER 11 ■ FILES AND STUFF
- Page 297 and 298: 266 CHAPTER 11 ■ FILES AND STUFF
- Page 299 and 300: 268 CHAPTER 11 ■ FILES AND STUFF
- Page 301 and 302: 270 CHAPTER 12 ■ GRAPHICAL USER I
- Page 303 and 304: 272 CHAPTER 12 ■ GRAPHICAL USER I
- Page 305 and 306: 274 CHAPTER 12 ■ GRAPHICAL USER I
- Page 307 and 308: 276 CHAPTER 12 ■ GRAPHICAL USER I
- Page 309 and 310: 278 CHAPTER 12 ■ GRAPHICAL USER I
- Page 311 and 312: 280 CHAPTER 12 ■ GRAPHICAL USER I
- Page 313 and 314: 282 CHAPTER 12 ■ GRAPHICAL USER I
- Page 316 and 317: CHAPTER 13 ■ ■ ■ Database Sup
- Page 318 and 319: CHAPTER 13 ■ DATABASE SUPPORT 287
- Page 320 and 321: CHAPTER 13 ■ DATABASE SUPPORT 289
- Page 322 and 323: CHAPTER 13 ■ DATABASE SUPPORT 291
- Page 324 and 325: CHAPTER 13 ■ DATABASE SUPPORT 293
- Page 326: CHAPTER 13 ■ DATABASE SUPPORT 295
- Page 329 and 330: 298 CHAPTER 14 ■ NETWORK PROGRAMM
- Page 331: 300 CHAPTER 14 ■ NETWORK PROGRAMM
- Page 335 and 336: 304 CHAPTER 14 ■ NETWORK PROGRAMM
- Page 337 and 338: 306 CHAPTER 14 ■ NETWORK PROGRAMM
- Page 339 and 340: 308 CHAPTER 14 ■ NETWORK PROGRAMM
- Page 341 and 342: 310 CHAPTER 14 ■ NETWORK PROGRAMM
- Page 343 and 344: 312 CHAPTER 14 ■ NETWORK PROGRAMM
- Page 345 and 346: 314 CHAPTER 15 ■ PYTHON AND THE W
- Page 347 and 348: 316 CHAPTER 15 ■ PYTHON AND THE W
- Page 349 and 350: 318 CHAPTER 15 ■ PYTHON AND THE W
- Page 351 and 352: 320 CHAPTER 15 ■ PYTHON AND THE W
- Page 353 and 354: 322 CHAPTER 15 ■ PYTHON AND THE W
- Page 355 and 356: 324 CHAPTER 15 ■ PYTHON AND THE W
- Page 357 and 358: 326 CHAPTER 15 ■ PYTHON AND THE W
- Page 359 and 360: 328 CHAPTER 15 ■ PYTHON AND THE W
- Page 361 and 362: 330 CHAPTER 15 ■ PYTHON AND THE W
- Page 363 and 364: 332 CHAPTER 15 ■ PYTHON AND THE W
- Page 365 and 366: 334 CHAPTER 15 ■ PYTHON AND THE W
- Page 367 and 368: 336 CHAPTER 15 ■ PYTHON AND THE W
- Page 369 and 370: 338 CHAPTER 15 ■ PYTHON AND THE W
- Page 372 and 373: CHAPTER 16 ■ ■ ■ Testing, 1-2
- Page 374 and 375: CHAPTER 16 ■ TESTING, 1-2-3 343 W
- Page 376 and 377: CHAPTER 16 ■ TESTING, 1-2-3 345 d
- Page 378 and 379: CHAPTER 16 ■ TESTING, 1-2-3 347 u
- Page 380 and 381: CHAPTER 16 ■ TESTING, 1-2-3 349 F
CHAPTER 14 ■ NETWORK PROGRAMMING 301<br />
■Note At the time of writing, this code gave the exact results shown. However, by the time you read this,<br />
the <strong>Python</strong> page may have changed, and you may have <strong>to</strong> modify the regular expression <strong>to</strong> make the code work.<br />
Retrieving Remote Files<br />
The urlopen function gives you a file-like object you can read from. If you would rather have<br />
urllib take care of downloading the file for you, s<strong>to</strong>ring a copy in a local file, you can use<br />
urlretrieve instead. Rather than returning a file-like object, it returns a tuple (filename, headers),<br />
where filename is the name of the local file (this name is created au<strong>to</strong>matically by urllib), and<br />
headers contains some information about the remote file. (I’ll ignore headers here. Look up<br />
urlretrieve in the standard library documentation of urllib if you want <strong>to</strong> know more about<br />
it.) If you want <strong>to</strong> specify a file name for the downloaded copy, you can supply that as a second<br />
parameter:<br />
urlretrieve('http://www.python.org', 'C:\\python_webpage.html')<br />
This retrieves the <strong>Python</strong> home page and s<strong>to</strong>res it in the file C:\python_webpage.html. If<br />
you don’t specify a file name, the file is put in some temporary location, available for you <strong>to</strong><br />
open (with the open function), but when you’re done with it, you may want <strong>to</strong> have it removed<br />
so that it doesn’t take up space on your hard drive. To clean up such temporary files, you can<br />
call the function urlcleanup without any arguments, and it takes care of things for you.<br />
SOME UTILITIES<br />
In addition <strong>to</strong> reading and downloading files through URLs, urllib also offers some functions for manipulating<br />
the URLs themselves. (The following assumes some knowledge of URLs and CGI.) These functions are<br />
• quote(string[, safe]). Returns a string in which all special characters (characters that have<br />
special significance in URLs) have been replaced by URL-friendly versions (such as %7E instead of ~).<br />
This can be useful if you have a string that might contain such special characters and you want <strong>to</strong> use<br />
it as a URL. The safe string includes characters that should not be coded like this: the default is '/'.<br />
• quote_plus(string[, safe]). Works like quote, but also replaces spaces with plus signs.<br />
• unquote(string). The reverse of quote.<br />
• unquote_plus(string). The reverse of quote_plus.<br />
• urlencode(query[, doseq]). Converts a mapping (such as a dictionary) or a sequence of twoelement<br />
tuples—of the form (key, value)—in<strong>to</strong> a “URL-encoded” string, which can be used in CGI<br />
queries. (Check the <strong>Python</strong> documentation for more information.)<br />
Other Modules<br />
As mentioned, beyond the modules explicitly discussed in this chapter, there are hordes of<br />
network-related modules in the <strong>Python</strong> library and elsewhere. In Table 14-1, you find a list of