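Resource overview
The previous tutorial gave a general overview of Scrapy and walked through a simple demo. This one uses crawling the Douban Movie Top 250 as an example to walk through the workflow of a complete crawler.
Tools and environment
Language: Python 2.7
IDE: PyCharm
Browser: Chrome
Crawler framework: Scrapy 1.2.1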
https://zhuanlan.zhihu.com/p/24769534?refer=woodenrobot

Code snippet and file information
# -*- coding: utf-8 -*-
# Define here the models for your scraped items
#
# See documentation in:
# http://doc.scrapy.org/en/latest/topics/items.html
import scrapy


class DoubanMovieItem(scrapy.Item):
    # Position in the Top 250 list
    ranking = scrapy.Field()
    # Movie title
    movie_name = scrapy.Field()
    # Rating score
    score = scrapy.Field()
    # Number of people who rated the movie
    score_num = scrapy.Field()
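The four fields above can be illustrated with a minimal sketch of how one list entry might be turned into a record with the same keys. This is only an illustration under stated assumptions: the HTML fragment, its class names, and the helper parse_entry are made up for the example and are not Douban's real markup or the tutorial's spider, and plain regular expressions from the standard library stand in for Scrapy's selectors so the sketch runs without Scrapy installed.

```python
import re

# Hypothetical HTML for one list entry. The tags and class names are
# invented for illustration and do NOT reflect Douban's real markup.
SAMPLE_ENTRY = """
<li>
  <em class="rank">1</em>
  <span class="title">Example Movie</span>
  <span class="rating">9.6</span>
  <span class="votes">1234567 ratings</span>
</li>
"""

# One pattern per field; the keys mirror the fields of DoubanMovieItem.
FIELD_PATTERNS = {
    "ranking": r'<em class="rank">(\d+)</em>',
    "movie_name": r'<span class="title">([^<]+)</span>',
    "score": r'<span class="rating">([\d.]+)</span>',
    "score_num": r'<span class="votes">(\d+)',
}


def parse_entry(html):
    """Return a dict keyed like DoubanMovieItem (hypothetical helper)."""
    record = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, html)
        # A field that cannot be found is stored as None instead of raising.
        record[field] = match.group(1).strip() if match else None
    return record


movie = parse_entry(SAMPLE_ENTRY)
```

In a Scrapy spider the same keys would instead be assigned on a DoubanMovieItem instance (for example item['ranking'] = ...) and the item yielded from the parse callback, so that item pipelines can process it.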
Type            Size  Date        Time   Name
---------  ---------  ----------  -----  ----
Directory          0  2017-07-15  16:55  scrapy-tutorial-master\
File              17  2017-07-15  16:55  scrapy-tutorial-master\.gitignore
File             268  2017-07-15  16:55  scrapy-tutorial-master\scrapy.cfg
Directory          0  2017-07-15  16:55  scrapy-tutorial-master\scrapyspider\
File               0  2017-07-15  16:55  scrapy-tutorial-master\scrapyspider\__init__.py
File             386  2017-07-15  16:55  scrapy-tutorial-master\scrapyspider\items.py
File             292  2017-07-15  16:55  scrapy-tutorial-master\scrapyspider\pipelines.py
File             205  2017-07-15  16:55  scrapy-tutorial-master\scrapyspider\run.py
File            3177  2017-07-15  16:55  scrapy-tutorial-master\scrapyspider\settings.py
Directory          0  2017-07-15  16:55  scrapy-tutorial-master\scrapyspider\spiders\
File             161  2017-07-15  16:55  scrapy-tutorial-master\scrapyspider\spiders\__init__.py
File             387  2017-07-15  16:55  scrapy-tutorial-master\scrapyspider\spiders\blog_spider.py
File            1326  2017-07-15  16:55  scrapy-tutorial-master\scrapyspider\spiders\douban_ajax_spider.py
File            1446  2017-07-15  16:55  scrapy-tutorial-master\scrapyspider\spiders\douban_spider.py