Resource overview
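A Scrapy project that crawls the cnblog (博客園) article list and saves it to a local database. This is a practice project from my recent study of web crawlers. For a detailed walkthrough of the source code, see the blog post: https://blog.csdn.net/xiaocy66/article/details/83834261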
Code snippet and file information
# -*- coding: utf-8 -*-
# Define here the models for your scraped items
#
# See documentation in:
# https://doc.scrapy.org/en/latest/topics/items.html
import scrapy


class CnblogItem(scrapy.Item):
    # Avatar
    avatar = scrapy.Field()
    # Title
    title = scrapy.Field()
    # Category text
    category = scrapy.Field()
    # Category id
    category_id = scrapy.Field()
    # Channel
    channel = scrapy.Field()
    # Creation time
    created_time = scrapy.Field()
    # Current id
    cur_id = scrapy.Field()
    # User name
    user_name = scrapy.Field()
    # Author nickname
    nickname = scrapy.Field()
    # Cover image URL
    logo_url = scrapy.Field()
    # User detail page URL
    user_url = scrapy.Field()
    # Display time
    showtime = scrapy.Field()
    # Display time as shown on the page, e.g. "August 2018", "19 hours ago", "2 days ago"
    show_datetime = scrapy.Field()

    # Source
    source_from = s
  Attribute      Size  Date       Time   Name
----------- ---------  ---------- -----  ----
  Directory         0  2018-11-03 20:43  cnblog\
  Directory         0  2018-11-03 20:45  cnblog\cnblog\
  Directory         0  2018-11-03 20:45  cnblog\cnblog\db\
       File      2268  2018-11-03 20:45  cnblog\cnblog\db\dbhelper.py
       File      1686  2018-11-03 20:45  cnblog\cnblog\db\init.sql
       File       161  2018-11-03 20:45  cnblog\cnblog\db\__init__.py
  Directory         0  2018-11-03 21:23  cnblog\cnblog\db\__pycache__\
       File      2202  2018-11-03 21:23  cnblog\cnblog\db\__pycache__\dbhelper.cpython-36.pyc
       File       144  2018-11-03 21:23  cnblog\cnblog\db\__pycache__\__init__.cpython-36.pyc
       File      1265  2018-11-03 23:29  cnblog\cnblog\items.py
       File      3597  2018-11-03 20:43  cnblog\cnblog\middlewares.py
       File       488  2018-11-03 20:45  cnblog\cnblog\pipelines.py
       File      3442  2018-11-03 21:25  cnblog\cnblog\settings.py
  Directory         0  2018-11-03 20:44  cnblog\cnblog\spiders\
       File      2980  2018-11-03 22:41  cnblog\cnblog\spiders\cnblogspider.py
       File       161  2018-07-12 05:14  cnblog\cnblog\spiders\__init__.py
  Directory         0  2018-11-04 21:36  cnblog\cnblog\spiders\__pycache__\
       File      2598  2018-11-04 21:36  cnblog\cnblog\spiders\__pycache__\cnblogspider.cpython-36.pyc
       File       149  2018-11-03 20:44  cnblog\cnblog\spiders\__pycache__\__init__.cpython-36.pyc
       File         0  2018-07-12 05:14  cnblog\cnblog\__init__.py
  Directory         0  2018-11-04 21:36  cnblog\cnblog\__pycache__\
       File       738  2018-11-04 21:36  cnblog\cnblog\__pycache__\items.cpython-36.pyc
       File       833  2018-11-03 21:23  cnblog\cnblog\__pycache__\pipelines.cpython-36.pyc
       File       444  2018-11-03 21:25  cnblog\cnblog\__pycache__\settings.cpython-36.pyc
       File       141  2018-11-03 20:44  cnblog\cnblog\__pycache__\__init__.cpython-36.pyc
       File       255  2018-11-03 20:43  cnblog\scrapy.cfg
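
The listing includes pipelines.py and db\dbhelper.py, but their contents are not shown here. As a rough illustration of how a Scrapy item pipeline can write scraped items to a local database, here is a minimal sketch. It is not the author's implementation: the class name SqliteArticlePipeline, the article table, and the choice of SQLite (via Python's standard sqlite3 module) are all assumptions made so the example runs on its own. Only the field names cur_id, title, nickname and created_time come from the CnblogItem snippet above.

```python
import sqlite3


class SqliteArticlePipeline:
    """Hypothetical pipeline: stores a few CnblogItem fields in SQLite."""

    def __init__(self, db_path=":memory:"):
        # db_path is an assumed parameter; ":memory:" keeps the demo self-contained.
        self.db_path = db_path
        self.conn = None

    def open_spider(self, spider):
        # Scrapy calls this hook once when the spider starts.
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS article ("
            "cur_id TEXT PRIMARY KEY, title TEXT, "
            "nickname TEXT, created_time TEXT)"
        )

    def process_item(self, item, spider):
        # INSERT OR REPLACE keeps a re-crawled article from being stored twice.
        self.conn.execute(
            "INSERT OR REPLACE INTO article "
            "(cur_id, title, nickname, created_time) VALUES (?, ?, ?, ?)",
            (
                item.get("cur_id"),
                item.get("title"),
                item.get("nickname"),
                item.get("created_time"),
            ),
        )
        self.conn.commit()
        # Returning the item passes it on to any later pipeline.
        return item

    def close_spider(self, spider):
        # Scrapy calls this hook once when the spider finishes.
        self.conn.close()
```

If a class like this lived in cnblog\cnblog\pipelines.py, it would be enabled through the ITEM_PIPELINES setting in settings.py, for example with the key "cnblog.pipelines.SqliteArticlePipeline" and a priority such as 300. Parameterized queries (the ? placeholders) are used so that titles containing quotes cannot break the SQL statement.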