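Resource Overview
This resource is a complete worked example that crawls the danmaku (bullet comments) of a Bilibili video, processes the comment text, and visualizes it as a word cloud. Questions are welcome.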

Code Snippet and File Information
import json
import re  # parse the danmaku XML with a regular expression

import chardet
import requests


# 1. Request the cid for a given bvid
def get_cid(bvid):
    # bvid is the value after /video/ in a Bilibili video URL,
    # e.g. https://www.bilibili.com/video/BV1ag4y1i7dP
    url = 'https://api.bilibili.com/x/player/pagelist?bvid=' + bvid + '&jsonp=jsonp'
    res = requests.get(url).text  # fetch the response body
    json_dict = json.loads(res)  # parse the JSON text into a dict
    # pprint(json_dict)
    return json_dict["data"][0]["cid"]  # return the cid of the first page


# 2. Request the danmaku by cid and parse them into the final data
# Note: Bilibili's web page has since changed and the list.so request can no
# longer be found on it; the endpoint used below is the new one.
def get_data(cid):
    final_url = "https://api.bilibili.com/x/v1/dm/list.so?oid=" + str(cid)
    final_res = requests.get(final_url)
    # Let chardet guess the encoding from the raw bytes
    final_res.encoding = chardet.detect(final_res.content)['encoding']
    final_res = final_res.text
    # The tags in this pattern were stripped from the listing; it is rebuilt
    # here on the assumption that each danmaku is a <d ...>text</d> element.
    pattern = re.compile('<d.*?>(.*?)</d>')
    data = pattern.findall(final_res)
    return data


# 3. Save the danmaku list to a txt file
def save_to_file(data):
    with open("../result/dan_mu.txt", mode="w", encoding="utf-8") as f:
        for i in data:
            f.write(i)
            f.write("\n")


# 4. Start crawling and return the execution status
def begin(bvid):
    try:
        cid = get_cid(bvid)
        data = get_data(cid)
        save_to_file(data)
        return True
    except Exception:
        return False
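The extraction step in get_data can be tried without any network access. The following is a minimal sketch, assuming the danmaku response is XML in which each comment sits in a `<d ...>text</d>` element. The sample string, its attribute values, and the helper name `extract_danmaku` are hypothetical and are not part of the original project.

```python
import html
import re

# Hypothetical sample shaped like a danmaku XML response; the attribute
# values and comment texts are made up for illustration.
SAMPLE_XML = (
    '<?xml version="1.0" encoding="UTF-8"?><i>'
    '<d p="1.5,1,25,16777215,0,0,0,0">hello</d>'
    '<d p="3.0,1,25,16777215,0,0,0,0">first &amp; best</d>'
    '</i>'
)


def extract_danmaku(xml_text):
    """Return the text inside every <d ...>...</d> element."""
    # \b keeps the pattern from matching other tags that merely start with "d".
    pattern = re.compile(r'<d\b[^>]*>(.*?)</d>', re.S)
    # Decode XML entities such as &amp; so the saved text is readable.
    return [html.unescape(text) for text in pattern.findall(xml_text)]


print(extract_danmaku(SAMPLE_XML))  # ['hello', 'first & best']
```

Decoding entities is an addition here; the original snippet writes the matched text as-is.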
Attribute       Size  Date        Time   Name
---------  ---------  ----------  -----  ----
Directory          0  2020-06-18  22:01  bilibili_barrage\
Directory          0  2020-09-02  15:43  bilibili_barrage\.idea\
File             408  2020-06-18  21:57  bilibili_barrage\.idea\bilibili_barrage.iml
File             294  2020-06-15  11:08  bilibili_barrage\.idea\misc.xm
File             291  2020-06-18  21:57  bilibili_barrage\.idea\modules.xm
File           20592  2020-09-02  15:43  bilibili_barrage\.idea\workspace.xm
Directory          0  2020-09-02  15:42  bilibili_barrage\demo\
Directory          0  2020-09-02  15:43  bilibili_barrage\demo\__pycache__\
File            1287  2020-08-10  11:09  bilibili_barrage\demo\__pycache__\get_barrage.cpython-38.pyc
File            1674  2020-09-02  15:43  bilibili_barrage\demo\__pycache__\view_barrage.cpython-38.pyc
File            1502  2020-08-10  09:16  bilibili_barrage\demo\get_barrage.py
File             353  2020-07-24  12:37  bilibili_barrage\demo\run.py
File            2138  2020-09-02  15:42  bilibili_barrage\demo\view_barrage.py
Directory          0  2020-09-02  15:42  bilibili_barrage\image\
File           27874  2020-06-16  21:09  bilibili_barrage\image\Ace.jpg
File          271506  2020-06-16  21:12  bilibili_barrage\image\Ace1.jpg
File           17771  2020-06-16  21:14  bilibili_barrage\image\Back.jpg
File          137284  2020-06-16  17:53  bilibili_barrage\image\Luffy.jpg
Directory          0  2020-09-02  15:18  bilibili_barrage\util\
File           21979  2020-06-16  21:16  bilibili_barrage\image\chun.jpg
File           29524  2020-07-31  17:29  bilibili_barrage\image\girl.jpg
File            1434  2020-09-02  15:43  bilibili_barrage\result\dan_mu.txt
File           17932  2020-07-31  17:34  bilibili_barrage\image\獅子.jpg
File           48961  2020-06-16  21:28  bilibili_barrage\image\老虎.jpg
Directory          0  2020-07-24  10:18  bilibili_barrage\result\
File          140689  2020-09-02  15:43  bilibili_barrage\result\ciyun.png
File               6  2020-09-02  15:18  bilibili_barrage\util\stoplist.txt
File           48961  2020-06-16  21:28  bilibili_barrage\image\Tiger.jpg
File           17932  2020-07-31  17:34  bilibili_barrage\image\lion.jpg
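The listing includes util\stoplist.txt and result\dan_mu.txt, which suggests that stop words are filtered out of the saved danmaku before the word cloud is drawn. view_barrage.py is not shown, so the following is only a sketch of what such a processing step could look like, not the project's actual code. The function name `count_terms` and the sample data are hypothetical. Tokens are split on whitespace to keep the example dependency-free; real Chinese danmaku would need a word segmenter instead.

```python
from collections import Counter


def count_terms(lines, stop_words):
    """Count whitespace-separated tokens, skipping any that are stop words."""
    counts = Counter()
    for line in lines:
        for token in line.split():
            if token not in stop_words:
                counts[token] += 1
    return counts


# Made-up stand-ins for the contents of dan_mu.txt and stoplist.txt.
lines = ["nice nice video", "the video is nice"]
stop_words = {"the", "is"}

freq = count_terms(lines, stop_words)
print(freq.most_common(2))  # [('nice', 3), ('video', 2)]
```

A token-to-count mapping like `freq` is the kind of input a word cloud renderer can size words by.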