Resource summary

A Python crawler: an implementation that fetches the first 1000 pages of Baidu Baike.
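
The archive's crawler itself is not shown on this page, so the following is only a minimal sketch of how a crawl capped at a fixed number of pages can be organized, not the author's implementation. It assumes a breadth-first strategy, and the names `crawl`, `LinkCollector`, and the `fetch` callable are hypothetical. To stay self-contained it extracts links with the standard library's `html.parser` rather than BeautifulSoup.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def crawl(root_url, fetch, max_pages=1000):
    """Breadth-first crawl from root_url, visiting at most max_pages pages.

    fetch is a hypothetical callable: it takes a URL and returns the page's
    HTML as a string, or None if the download failed.
    """
    queue = deque([root_url])
    seen = {root_url}   # URLs already queued, so no page is queued twice
    visited = []        # URLs successfully fetched, in crawl order

    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue    # skip pages that could not be downloaded
        visited.append(url)

        collector = LinkCollector()
        collector.feed(html)
        for href in collector.hrefs:
            absolute = urljoin(url, href)   # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited
```

Passing `fetch` in as a parameter keeps the loop independent of any particular download library and lets it be exercised against an in-memory dictionary of pages (for example `crawl(start, pages.get, max_pages=2)`) before it is pointed at a real site.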

Code snippet and file information

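# coding: utf-8

import re

from bs4 import BeautifulSoup

html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>

<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>

<p class="story">...</p>
"""

# html_doc is already a str, so from_encoding is not needed (bs4 ignores it for str input)
soup = BeautifulSoup(html_doc, 'html.parser')

# Find every <a> tag and print its tag name, href and text
print('get all links')
links = soup.find_all('a')
for link in links:
    print(link.name, link['href'], link.get_text())

# Find the single <a> tag whose href matches exactly
print('\nget lacie link')
link_node = soup.find('a', href="http://example.com/lacie")
print(link_node.name, link_node['href'], link_node.get_text())

# Find the first <a> tag whose href contains a regular-expression match
print('\nre')
link_node = soup.find('a', href=re.compile(r"ill"))
print(link_node.name, link_node['href'], link_node.get_text())

# Find the <p> tag by CSS class (class_ avoids the reserved word "class")
print('\np')
p_node = soup.find('p', class_="title")
print(p_node.name, p_node.get_text())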
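
When a compiled regular expression is passed as an attribute filter, as in the `href=re.compile(r"ill")` lookup above, BeautifulSoup tests each attribute value with the pattern's `search()` method, so the pattern may match anywhere in the value. The sketch below illustrates that difference with the standard `re` module alone, using the three hrefs from the sample document; the variable names are illustrative only.

```python
import re

hrefs = [
    "http://example.com/elsie",
    "http://example.com/lacie",
    "http://example.com/tillie",
]

pattern = re.compile(r"ill")

# search() succeeds if the pattern occurs anywhere in the string
matches = [href for href in hrefs if pattern.search(href)]
print(matches)

# match() only succeeds at the start of the string, so nothing is kept here
anchored = [href for href in hrefs if pattern.match(href)]
print(anchored)
```

Only the `tillie` URL contains `ill`, which is why the lookup in the snippet returns the Tillie link.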

Attribute         Size  Date        Time   Name
-----------  ---------  ----------  -----  ----
File              1161  2016-10-30  13:31  reptile\test_bs4.py
File                 0  2016-10-30  13:20  reptile\__init__.py
Directory            0  2016-10-30  13:21  reptile
-----------  ---------  ----------  -----  ----
                  1161                     3

