Resource description
A simple Python web crawler, with each functional module kept in its own file.

Code snippet and file information
import urllib2


class HtmlDownloader:
    def download(self, url):
        # Nothing to fetch without a URL.
        if url is None:
            return None
        response = urllib2.urlopen(url)
        # Treat any non-200 status as a failed download.
        if response.getcode() != 200:
            return None
        return response.read()
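The snippet above targets Python 2, where `urllib2` is available. As a rough sketch of how the same downloader might look on Python 3 (where `urllib2` was folded into `urllib.request`), something like the following could work. The `timeout` parameter and the `try`/`except` handling are assumptions added for illustration and are not part of the original file.

```python
import urllib.error
import urllib.request


class HtmlDownloader:
    def download(self, url, timeout=10):
        """Return the response body as bytes, or None on any failure.

        `timeout` is an illustrative addition, not in the original snippet.
        """
        if url is None:
            return None
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                # Treat any non-200 status as a failed download.
                if response.getcode() != 200:
                    return None
                return response.read()
        except (urllib.error.URLError, ValueError):
            # URLError covers network and HTTP errors;
            # ValueError is raised for malformed URLs.
            return None
```

Unlike the original, this version returns `None` instead of raising when the request fails, which is one possible design choice for a crawler loop that should keep going after a bad link.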
Attr             Size  Date        Time   Name
-----------  --------  ----------  -----  ----
Directory           0  2016-04-29  17:04  spider\
File              210  2016-04-29  11:43  spider\html_downloader.py
File              621  2016-04-29  11:43  spider\html_downloader.pyc
File              770  2016-04-29  13:18  spider\html_outputer.py
File             1435  2016-04-29  13:18  spider\html_outputer.pyc
File             1171  2016-04-29  17:04  spider\html_parser.py
File             1608  2016-04-29  17:04  spider\html_parser.pyc
File            70731  2016-04-29  17:05  spider\output.html
File             1027  2016-04-29  17:04  spider\spider_main.py
File               65  2016-04-29  11:16  spider\test.py
File              293  2016-04-29  11:17  spider\test.pyc
File              542  2016-04-29  10:52  spider\url_manager.py
File             1367  2016-04-29  10:52  spider\url_manager.pyc
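The listing shows a `url_manager.py` module, but its contents are not included here. Purely as a hypothetical sketch of what a URL manager in a crawler of this shape commonly does (track URLs still to visit and URLs already visited), one might write something like the following. The class name, method names, and set-based storage are all assumptions, not the author's code.

```python
class UrlManager:
    """Hypothetical sketch: tracks pending and already-crawled URLs."""

    def __init__(self):
        self.new_urls = set()  # URLs waiting to be crawled
        self.old_urls = set()  # URLs already handed out

    def add_new_url(self, url):
        # Ignore empty input and anything seen before.
        if url is None:
            return
        if url not in self.new_urls and url not in self.old_urls:
            self.new_urls.add(url)

    def add_new_urls(self, urls):
        for url in urls or []:
            self.add_new_url(url)

    def has_new_url(self):
        return len(self.new_urls) > 0

    def get_new_url(self):
        # Move one pending URL to the visited set and return it.
        url = self.new_urls.pop()
        self.old_urls.add(url)
        return url
```

Keeping both sets lets the crawler skip links it has already fetched, which is what stops a loop between two pages that link to each other.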