Resource overview
A Python crawler that scrapes Baidu Baike pages, built mainly with BeautifulSoup and urllib2.

Code snippet and file information
# coding=UTF-8
'''
Created on 2017-04-04
@author: zwl
'''
import urllib2  # Python 2 only; in Python 3 this module became urllib.request


class HtmlDownloader(object):

    def download(self, url):
        # Nothing to fetch without a URL.
        if url is None:
            return None

        response = urllib2.urlopen(url)
        # Treat any status other than 200 OK as a failed download.
        if response.getcode() != 200:
            return None
        # Return the raw page body.
        return response.read()

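The downloader above targets Python 2, where urllib2 is available. For readers on Python 3, the same logic can be sketched with urllib.request. This is a minimal sketch, not the author's code: the `opener` and `timeout` parameters and the error handling are assumptions added for illustration and easier testing.

```python
# Python 3 sketch of the downloader; urllib.request replaces urllib2.
from urllib.request import urlopen
from urllib.error import URLError


class HtmlDownloader(object):

    def __init__(self, opener=urlopen, timeout=10):
        # `opener` and `timeout` are hypothetical additions, not part of
        # the original class. `opener` lets a fake be injected in tests.
        self.opener = opener
        self.timeout = timeout

    def download(self, url):
        # Nothing to fetch without a URL.
        if url is None:
            return None
        try:
            response = self.opener(url, timeout=self.timeout)
        except (URLError, ValueError):
            # URLError covers network and HTTP errors (HTTPError is a
            # subclass); ValueError is raised for malformed URLs.
            return None
        try:
            # Treat any status other than 200 OK as a failed download.
            if response.getcode() != 200:
                return None
            return response.read()  # raw bytes; decode before parsing
        finally:
            response.close()
```

Calling `HtmlDownloader().download(url)` returns the page body as bytes, or `None` when the URL is missing or the request fails.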
Attribute        Size  Date       Time   Name
----------- ---------  ---------- -----  ----
directory           0  2017-04-04 20:50  test\
file              375  2017-03-30 20:26  test\.project
file              431  2017-03-30 20:26  test\.pydevproject
directory           0  2017-03-30 21:36  test\.settings\
file              109  2017-04-05 11:04  test\.settings\org.eclipse.core.resources.prefs
directory           0  2017-04-05 11:24  test\baike_spider\
file              380  2017-04-05 11:05  test\baike_spider\html_downloader.py
file              760  2017-04-05 11:05  test\baike_spider\html_downloader.pyc
file              876  2017-04-05 11:04  test\baike_spider\html_outputer.py
file             1470  2017-04-05 11:05  test\baike_spider\html_outputer.pyc
file             1446  2017-04-05 11:24  test\baike_spider\html_parser.py
file             1786  2017-04-05 11:24  test\baike_spider\html_parser.pyc
file               41  2017-04-05 11:30  test\baike_spider\output.html
file             1343  2017-04-05 11:04  test\baike_spider\spider_main.py
file              804  2017-04-05 11:17  test\baike_spider\url_manager.py
file             1573  2017-04-05 11:24  test\baike_spider\url_manager.pyc
file                0  2017-04-04 20:50  test\baike_spider\__init__.py
file              136  2017-04-05 11:01  test\baike_spider\__init__.pyc
directory           0  2017-03-30 21:35  test\test\
file              846  2017-04-04 20:56  test\test\test_bs4.py
file              700  2017-03-30 21:07  test\test\test_urllib2.py
file                0  2017-03-30 20:26  test\test\__init__.py