  • Size: 6 KB
    File type: .zip
    Coins: 2
    Downloads: 0
    Published: 2021-05-10
  • Language: Python
  • Tags: python, MP4

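Resource description

Export to PDF: the images carry an anti-scraping key, so converting the articles to PDF gives a better reading experience.

Parse the TS files and encode them into mp4: iterate over the article ids, obtain the m3u8 address, parse the key for each ts segment and decrypt it, arrange the segments in order, and encode them into a single mp4 file. An ffmpeg-based encoding step could be added; ffmpeg can set the video codec to h264, which suits OSS storage and playback on the web. For now, batch transcoding is handled by a script written locally.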
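The m3u8 step described above can be sketched roughly as follows. This is a minimal illustration, not the code shipped in the archive: `parse_m3u8` and `build_h264_command` are hypothetical helper names, the sketch assumes a media playlist with at most one `#EXT-X-KEY` line covering every segment, and it leaves downloading and AES decryption of the segments out entirely. The ffmpeg options shown (`-i`, `-c:v libx264`, `-c:a aac`, `-y`) are standard ones for producing an h264 mp4.

```python
import re
from urllib.parse import urljoin


def parse_m3u8(playlist_text, base_url):
    """Return (key_info, segment_urls) for a media playlist.

    key_info is None when the playlist is unencrypted; otherwise it is a
    dict holding the METHOD, the absolute key URI and the IV string (or None).
    Assumption: one key applies to the whole playlist.
    """
    key_info = None
    segments = []
    for raw in playlist_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#EXT-X-KEY:"):
            body = line[len("#EXT-X-KEY:"):]
            # Attribute list looks like: METHOD=AES-128,URI="key.bin",IV=0x...
            attrs = dict(re.findall(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)', body))
            method = attrs.get("METHOD")
            if method and method != "NONE":
                key_info = {
                    "method": method,
                    "uri": urljoin(base_url, attrs.get("URI", "").strip('"')),
                    "iv": attrs.get("IV"),
                }
        elif not line.startswith("#"):
            # Any non-tag line is a segment URI; playlist order is playback order
            segments.append(urljoin(base_url, line))
    return key_info, segments


def build_h264_command(src_path, dst_path):
    """Build an ffmpeg argument list that re-encodes src_path to h264/aac."""
    return ["ffmpeg", "-y", "-i", src_path,
            "-c:v", "libx264", "-c:a", "aac", dst_path]


SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-KEY:METHOD=AES-128,URI="key.bin",IV=0x00000000000000000000000000000001
#EXTINF:10.0,
seg0.ts
#EXTINF:10.0,
seg1.ts
#EXT-X-ENDLIST
"""
```

Calling `parse_m3u8(SAMPLE_PLAYLIST, "https://example.com/video/index.m3u8")` resolves the key and segment names against the playlist URL, and the list from `build_h264_command` can be handed to `subprocess.run` once the decrypted segments have been joined into one file.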

Code snippet and file information

# coding=utf-8
import threading
import queue
import requests
import json
import time
import pdfkit
import ahttp
import os
from upload import Upload_oss
import asyncio
import oss2
import re


class LaGou_article():
    def __init__(self, courseId):
        self.url = 'https://gate.lagou.com/v1/neirong/kaiwu/getCourseLessons?courseId=' + courseId
        self.headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36',
            'Cookie': 'Log in to the Lagou video site, copy the cookie from the request headers, and drop the first parameter',
            'Referer': 'https://kaiwu.lagou.com/course/courseInfo.htm?courseId=' + courseId,
            'Origin': 'https://kaiwu.lagou.com',
            'Sec-fetch-dest': 'empty',
            'Sec-fetch-mode': 'cors',
            'Sec-fetch-site': 'same-site',
            'x-l-req-header': '{deviceType:1}'}
        # The request URLs for lesson HTML share the same prefix; only the trailing id differs
        self.textUrl = 'https://gate.lagou.com/v1/neirong/kaiwu/getCourseLessonDetail?lessonId='
        self.queue = queue.Queue()  # initialize a queue
        self.error_queue = queue.Queue()
        self.courseId = courseId

    def replace_all_blank(self, value):
        """
        Remove everything from value that is not a letter or digit,
        including punctuation, spaces, newlines and underscores
        :param value: the content to process
        :return: the processed content
        """
        # \W matches any character that is not a letter, digit or underscore
        result = re.sub(r'\W+', '', value).replace("_", '')
        return result

    def parse_one(self):
        """
        :return: queue holding the request URL of each lesson's HTML
        """
        content = requests.get(url=self.url, headers=self.headers).text
        code_cont = json.loads(content)
        message_list = code_cont['content']['courseSectionList']
        # print(message_list)
        for message in message_list:
            for i in message['courseLessons']:
                real_url = self.textUrl + str(i['id'])
                self.queue.put(real_url)  # request URL of the lesson
        return self.queue

    def courseInfo(self):
        """
        :return: the course details (the 'content' field of the response)
        """
        content = requests.get(url=self.url, headers=self.headers).text
        info = json.loads(content)
        return info['content']

    def get_html(self, real_url):
        """
        Fetch the lesson HTML as a str and pass it to htmltopdf
        """
        html = requests.get(url=real_url, timeout=10,
                            headers=self.headers).text
        dit_message = json.loads(html)
        str_html = str(dit_message['content']['textContent'])
        article_name = dit_message['content']['theme']
        self.htmltopdf(str_html, article_name)

    def htmltopdf(self, str_html, article_name):
        info = self.courseInfo()
        dirkName = info['courseName']
        time.sleep(1)
        dir = os.path.abspath(os.curdir) + '\\resources\\' + dirkName + '\\'
        res = self.mkdir(dir)
        path_wk = r'C:\Users\male\Desktop\wkhtmltox-0.12.6-1.mxe-cross-win64 (1)\wkhtmltox\bin\wkhtmltopdf.exe'
        config = pdfkit.configur
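The behavior of `replace_all_blank` in the snippet above is easy to check in isolation. The sketch below repeats its two steps as a plain function and, as an assumption that is not part of the original code, adds a hypothetical `lesson_dir` helper that builds the output directory with `os.path.join` rather than hard-coded backslashes. Note that digits survive the cleanup, and so do CJK characters, because `\w` is Unicode-aware for `str` patterns in Python 3.

```python
import os
import re


def replace_all_blank(value):
    # Same two steps as the method in the snippet: drop runs of non-word
    # characters, then drop underscores (which \W does not match).
    return re.sub(r"\W+", "", value).replace("_", "")


def lesson_dir(base_dir, course_name):
    # Hypothetical helper (not in the original): os.path.join picks the
    # path separator for the host OS, and the course name is sanitized first.
    return os.path.join(base_dir, "resources", replace_all_blank(course_name))
```

For example, `replace_all_blank("Lesson 01: Redis_cache!")` yields `"Lesson01Rediscache"`.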

       Attr      Size     Date    Time   Name
----------- ---------  ---------- -----  ----
        Dir         0  2020-08-25 17:26  csdn\
       File      5338  2020-08-25 17:37  csdn\pdf.py
       File      1046  2020-08-26 10:25  csdn\README.md
       File      9904  2020-08-26 10:23  csdn\run.py
