Resource overview
Crawls Weibo data in bulk for use in data analysis.
Code snippet and file information
# -*- coding: utf-8 -*-
from scrapy import Item, Field


class TweetsItem(Item):
    """Weibo post information."""
    _id = Field()  # Weibo post ID
    weibo_url = Field()  # Weibo post URL
    created_at = Field()  # time the post was published
    like_num = Field()  # number of likes
    repost_num = Field()  # number of reposts
    comment_num = Field()  # number of comments
    content = Field()  # post content
    user_id = Field()  # ID of the user who published the post
    tool = Field()  # tool (client) used to publish the post
    image_url = Field()  # image
    video_url = Field()  # video
    origin_weibo = Field()  # original post; only reposts have this field
    location_map_info = Field()  # latitude/longitude of the tagged location
    crawl_time = Field()  # crawl timestamp


class InformationItem(Item):
    """User profile information."""
    _id = Field()  # user ID
    nick_name = Field()  # nickname
    gender = Field()  # gender
    province = Field()  # province
    city = Field()  # city
    brief_introduction = Field()  # short bio
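As an illustration of the data analysis this resource is aimed at, here is a minimal sketch that is not part of the archive. It assumes crawled posts are available as plain dicts whose keys match the TweetsItem fields above (`user_id`, `like_num`, `repost_num`, `comment_num`, `tool`). The `summarize` helper and the sample records are hypothetical and contain no real crawled data.

```python
from collections import Counter

# Hypothetical sample records; keys mirror the TweetsItem fields above.
# The values are invented for illustration and are not real crawled data.
sample_tweets = [
    {"_id": "u1_a", "user_id": "u1", "like_num": 10, "repost_num": 2,
     "comment_num": 3, "tool": "iPhone"},
    {"_id": "u1_b", "user_id": "u1", "like_num": 4, "repost_num": 0,
     "comment_num": 1, "tool": "Android"},
    {"_id": "u2_a", "user_id": "u2", "like_num": 0, "repost_num": 1,
     "comment_num": 0, "tool": "iPhone"},
]


def summarize(tweets):
    """Return simple engagement statistics for a list of tweet dicts."""
    count = len(tweets)
    if count == 0:
        # Nothing to aggregate; return an empty summary.
        return {"count": 0, "avg_likes": 0.0, "total_reposts": 0,
                "total_comments": 0, "tweets_per_user": {}, "top_tool": None}

    # Count posts per user and per publishing tool.
    per_user = Counter(t["user_id"] for t in tweets)
    tools = Counter(t["tool"] for t in tweets)

    return {
        "count": count,
        "avg_likes": sum(t["like_num"] for t in tweets) / count,
        "total_reposts": sum(t["repost_num"] for t in tweets),
        "total_comments": sum(t["comment_num"] for t in tweets),
        "tweets_per_user": dict(per_user),
        "top_tool": tools.most_common(1)[0][0],
    }


if __name__ == "__main__":
    print(summarize(sample_tweets))
```

In practice the counts would probably need converting to integers first, since values read back from an export such as a CSV file arrive as strings.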
Attributes        Size  Date        Time   Name
-----------  ---------  ----------  -----  ----
File                71  2019-09-05  19:47  requirements.txt
File               252  2019-09-05  19:47  scrapy.cfg
File            624666  2019-12-22  22:46  Sina.Tweets.csv
.......          83579  2019-09-05  19:47  images\account.png
.......         272798  2019-09-05  19:47  images\account_build_screenshot.png
File            895424  2019-09-05  19:47  images\comment.png
File            227144  2019-09-05  19:47  images\information.png
File            253425  2019-09-05  19:47  images\relationship.png
File            585571  2019-09-05  19:47  images\spider.png
File           1046619  2019-09-05  19:47  images\tweet.png
.......        1095462  2019-09-05  19:47  images\xiaohao_shop.png
File               114  2019-12-21  01:13  sina\account_build\account.txt
File               114  2019-12-21  01:13  sina\account_build\account_sample.txt
File                25  2019-12-21  13:31  sina\account_build\account_zlf.txt
File              1331  2019-12-22  19:43  sina\account_build\ghostdriver.log
File              3019  2019-12-22  19:45  sina\account_build\login.py
.......            161  2019-09-05  19:47  sina\account_build\__init__.py
.......           2011  2019-09-05  19:47  sina\items.py
File               178  2019-12-23  17:55  sina\main.py
File              2287  2019-12-22  23:53  sina\middlewares.py
.......           1324  2019-09-05  19:47  sina\pipelines.py
File              2716  2019-12-24  23:24  sina\settings.py
.......           2251  2019-09-05  19:47  sina\spiders\utils.py
File             19553  2019-12-24  23:25  sina\spiders\weibo_spider.py
.......            161  2019-09-05  19:47  sina\spiders\__init__.py
File              1992  2019-12-22  16:46  sina\spiders\__pycache__\utils.cpython-37.pyc
File              9601  2019-12-24  23:15  sina\spiders\__pycache__\weibo_spider.cpython-37.pyc
File               134  2019-12-22  16:46  sina\spiders\__pycache__\__init__.cpython-37.pyc
.......              0  2019-09-05  19:47  sina\__init__.py
File              1677  2019-12-22  16:46  sina\__pycache__\items.cpython-37.pyc
............ 16 more file entries omitted here