
  • Size: 243KB
    File type: .rar
    Coins: 2
    Downloads: 1
    Published: 2021-06-18
  • Language: Python
  • Tags: python, crawler

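Resource overview

Scrapes job listings from Zhaopin (智聯招聘) to analyze the salary distribution and job requirements for a position (Python). The script runs as-is once its dependencies are installed, for example scrapy, pandas, and matplotlib. Install any missing packages one at a time by following the error messages and searching online (the original suggests Baidu).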
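
Because the dependencies have to be installed by hand, it can help to list the missing ones up front instead of discovering them one error at a time. The following is a minimal sketch, not part of the archive: the helper name `missing_packages` is hypothetical, and the module list is taken from the imports shown in the code snippet on this page (note that `bs4` is installed from the `beautifulsoup4` package).

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Top-level modules imported by the snippet on this page (assumed list).
    required = ["jieba", "numpy", "requests", "tqdm", "pandas",
                "scipy", "wordcloud", "bs4", "matplotlib"]
    for name in missing_packages(required):
        print("missing:", name)
```

Each name printed can then be installed with pip before running the script.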


Code snippet and file information

# -*- coding: utf-8 -*-
import re
import csv
import jieba
import numpy
import requests
from tqdm import tqdm
import pandas as pd
# Note: scipy.misc.imread was removed in SciPy 1.2, so this import only works on older SciPy releases.
from scipy.misc import imread
from wordcloud import WordCloud, ImageColorGenerator
from collections import Counter
from bs4 import BeautifulSoup
import matplotlib.pyplot as plt
from requests.exceptions import RequestException

def get_one_page(city, keyword, region, page):
    '''
    Fetch the HTML of one search-results page and return it.
    '''
    paras = {
        'jl': city,         # city to search in
        'kw': keyword,      # search keyword
        'isadv': 0,         # whether to enable the advanced search options
        'isfilter': 1,      # whether to filter the results
        'sg': 'd5259c62115f44e3bbb380dc88411919',
        'p': page,          # page number
        're': region        # region code (district); 2005 stands for Haidian
    }

    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36',
        'Host': 'sou.zhaopin.com',
        'Referer': 'https://www.zhaopin.com/',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
        'Accept-Encoding': 'gzip, deflate, br',
        'Accept-Language': 'zh-CN,zh;q=0.9'
    }

    url = 'https://sou.zhaopin.com/jobs/searchresult.ashx?'
    try:
        # Request the page and return its HTML.
        response = requests.get(url, params=paras, headers=headers)
        print(response.url)
        # Use the status code to decide whether the request succeeded.
        if response.status_code == 200:
            return response.text
        return None
    except RequestException as e:
        return None
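
To see how the query parameters end up in the request URL without touching the network, the query string can be built with the standard library alone. This is a rough sketch under stated assumptions: `build_search_url` is a hypothetical helper, the `sg` token from the snippet is left out, and the output is only expected to resemble what `print(response.url)` shows in `get_one_page`.

```python
from urllib.parse import urlencode


def build_search_url(city, keyword, region, page):
    """Build a search URL from the same query parameters get_one_page uses."""
    base = "https://sou.zhaopin.com/jobs/searchresult.ashx"
    params = {
        "jl": city,       # city to search in
        "kw": keyword,    # search keyword
        "isadv": 0,
        "isfilter": 1,
        "p": page,        # page number
        "re": region,     # region code
    }
    # urlencode percent-encodes non-ASCII values as UTF-8.
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("上海", "java", 2005, 1))
```

Printing the URL for a few page numbers is a quick way to confirm that only the `p` parameter changes between requests.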

def parse_one_page(html):
    '''
    Parse the HTML and yield the useful fields.
    '''
    # Parse with a regular expression.
    # NOTE: the HTML tag literals in this pattern were stripped when the page
    # was extracted, so it must be restored from zhilian.py before use. The
    # loop below reads four groups (item[0]..item[3]); only three survive here.
    pattern = re.compile('(.*?).*?'             # job detail URL and job title
        '.*? target="_blank">(.*?).*?'          # company name
        '(.*?)', re.S)                          # monthly salary

    # Find every match in the page.
    items = re.findall(pattern, html)

    for item in items:
        job_name = item[1]
        job_name = job_name.replace('', '')     # search string lost in extraction
        job_name = job_name.replace('\n', '')   # shown as a line break in the source; '\n' assumed

        salary_avarage = 0
        temp = item[3]
        # '面議' means "negotiable"; the Simplified form '面议' is accepted as well.
        if temp not in ('面議', '面议') and '-' in temp:
            idx = temp.find('-')
            # Average of the lower and upper bounds of the salary range.
            salary_avarage = (int(temp[0:idx]) + int(temp[idx+1:])) // 2

        # html = get_detail_page(job_url)
        # print(html)
        yield {
            'job': job_name,
            'job_url': item[0],
            'company': item[2],
            'salary': salary_avarage
        }

def get_detail_page(url):
    '''
    Fetch the HTML of a job detail page and return it.
    '''
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36',
        'Host': 'jobs.zhaopin.com',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=

 Attribute        Size  Date        Time   Name
 ---------   ---------  ----------  -----  ----
 File           134783  2018-07-10  15:19  智聯招聘爬取工作崗位薪資分布以及崗位要求(python)\2.png
 File            93404  2018-07-10  17:10  智聯招聘爬取工作崗位薪資分布以及崗位要求(python)\output.png
 File             8295  2018-07-10  15:40  智聯招聘爬取工作崗位薪資分布以及崗位要求(python)\stopwords.txt
 File            10665  2018-07-10  17:38  智聯招聘爬取工作崗位薪資分布以及崗位要求(python)\zhilian.py
 File            72208  2018-07-10  17:08  智聯招聘爬取工作崗位薪資分布以及崗位要求(python)\zl_上海_java工程師.csv
 File           289072  2018-07-10  17:08  智聯招聘爬取工作崗位薪資分布以及崗位要求(python)\zl_上海_java工程師.txt
 Directory           0  2018-07-10  19:47  智聯招聘爬取工作崗位薪資分布以及崗位要求(python)
 ---------   ---------  ----------  -----  ----
 Total          608427                     7 entries

