  • Size: 111KB
    File type: .rar
    Coins: 2
    Downloads: 0
    Published: 2024-02-02
  • Language: Python
  • Tags: Weibo crawler

Resource description

Python Sina Weibo crawler.

Code snippet and file information

# coding=utf-8

"""
Created on 2016-04-24 @author: Eastmount

Purpose: crawl Sina Weibo user information and weibo comments
URL: http://weibo.cn/ (smaller data volume than http://weibo.com/)

"""

import time
import re
import os
import sys
import codecs
import shutil
import urllib
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import selenium.webdriver.support.ui as ui
from selenium.webdriver.common.action_chains import ActionChains



# First start the headless browser PhantomJS, or Firefox
#driver = webdriver.PhantomJS(executable_path="G:\phantomjs-1.9.1-windows\phantomjs.exe")
driver = webdriver.Firefox()
wait = ui.WebDriverWait(driver, 10)


# Global variables: files used to read and write data
inforead = codecs.open("SinaWeibo_List_best_1.txt", 'r', 'utf-8')
infofile = codecs.open("SinaWeibo_Info_best_1.txt", 'a', 'utf-8')


#********************************************************************************
#                            Step 1: log in to weibo.cn
#        This method works for weibo.cn (data sent in plaintext); for weibo.com, see my junior schoolmate's method of setting POST and Header
#                LoginWeibo(username, password)  parameters: username, password
#********************************************************************************

def LoginWeibo(username, password):
    try:
        # Enter the username and password to log in
        print u'Preparing to log in to Weibo.cn...'
        driver.get("http://login.sina.com.cn/")
        elem_user = driver.find_element_by_name("username")
        elem_user.send_keys(username)  # username
        elem_pwd = driver.find_element_by_name("password")
        elem_pwd.send_keys(password)   # password
        #elem_rem = driver.find_element_by_name("safe_login")
        #elem_rem.click()              # secure login

        # Important: pause so the CAPTCHA can be typed in (http://login.weibo.cn/login/ requires it on mobile)
        time.sleep(20)

        #elem_sub = driver.find_element_by_xpath("//input[@class='smb_btn']")
        #elem_sub.click()              # click the login button, which has no name attribute
        elem_pwd.send_keys(Keys.RETURN)
        time.sleep(2)

        # Get the cookies. Recommended reading: http://www.cnblogs.com/fnng/p/3269450.html
        print driver.current_url
        print driver.get_cookies()  # cookie information, stored as dicts
        print u'Cookie key-value pairs:'
        for cookie in driver.get_cookies():
            #print cookie
            for key in cookie:
                print key, cookie[key]

        # driver.get_cookies() returns a list; here it holds a single element, a cookie of type dict
        print u'Login successful...'


    except Exception as e:
        print "Error: ", e
    finally:
        print u'End LoginWeibo!\n\n'


#********************************************************************************
#                  Step 2: visit the personal page http://weibo.cn/5824697471 and fetch its information
#                                VisitPersonPage()
#        Common encoding error: UnicodeEncodeError: 'ascii' codec can't encode characters
#********************************************************************************

def VisitPersonPage(user_id):

    try:
        global infofile       # global file variable
        url = "http:/
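
The login step in the snippet prints every cookie returned by driver.get_cookies() key by key. As a minimal sketch of one way that data could be reused, and assuming the usual Selenium shape (a list of dicts, each with "name" and "value" keys), the hypothetical helper cookies_to_header below joins the cookies into a single Cookie header string. The helper name and the sample cookie values are made up for illustration and do not come from the archive. The sketch is written for Python 3, whereas the snippet above uses Python 2 print statements.

```python
def cookies_to_header(cookies):
    """Join Selenium-style cookie dicts into a 'name=value; name=value' string.

    Hypothetical helper: assumes each dict has "name" and "value" keys,
    as in the list returned by driver.get_cookies().
    """
    return "; ".join("{0}={1}".format(c["name"], c["value"]) for c in cookies)


# Made-up sample shaped like driver.get_cookies() output (not real Weibo cookies)
sample_cookies = [
    {"name": "example_token", "value": "abc123", "domain": ".example.com", "path": "/"},
    {"name": "example_flag", "value": "1", "domain": ".example.com", "path": "/"},
]

header = cookies_to_header(sample_cookies)
print(header)
```

A string built this way could be passed as the Cookie header of a plain HTTP client, which avoids driving a browser for every request once the login has succeeded.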

 Type             Size  Date        Time   Name
----------- ---------  ---------- -----  ----
 File            13386  2016-04-24 20:55  [源碼] 爬取移動端微博信息 (強推)\2016-04-23\20160423_SinaWeibo_Num_Best.txt
 File             1595  2016-04-24 20:55  [源碼] 爬取移動端微博信息 (強推)\2016-04-23\Megry_Result_Best.py
 File           237289  2016-04-24 20:52  [源碼] 爬取移動端微博信息 (強推)\2016-04-23\SinaWeibo_Info_best_1.txt
 File              189  2016-04-24 20:46  [源碼] 爬取移動端微博信息 (強推)\2016-04-23\SinaWeibo_List_best_1.txt
 File            12115  2016-04-24 20:54  [源碼] 爬取移動端微博信息 (強推)\2016-04-23\spider_selenium_sina_content.py
 File              840  2016-04-24 21:02  運行配置過程.txt
 File             5628  2016-04-24 20:31  [源碼] 爬取客戶端微博信息\SinaWeibo_Info_best_1.txt
 File               27  2016-04-24 03:45  [源碼] 爬取客戶端微博信息\SinaWeibo_List_best_1.txt
 File             8119  2016-04-24 20:31  [源碼] 爬取客戶端微博信息\weibo_spider2.py
 File            17680  2016-04-24 21:18  [源碼] 爬取移動端個人信息 關注id和粉絲id (速度慢)\SinaWeibo_Info_1.txt
 File               50  2016-04-24 21:17  [源碼] 爬取移動端個人信息 關注id和粉絲id (速度慢)\SinaWeibo_List_1.txt
 File            14884  2016-04-24 21:19  [源碼] 爬取移動端個人信息 關注id和粉絲id (速度慢)\spider_selenium_sina_info_other_userid_all.py
 Directory           0  2016-04-24 20:55  [源碼] 爬取移動端微博信息 (強推)\2016-04-23
 Directory           0  2016-04-24 20:46  [源碼] 爬取移動端微博信息 (強推)
 Directory           0  2016-04-24 20:42  [源碼] 爬取客戶端微博信息
 Directory           0  2016-04-24 21:18  [源碼] 爬取移動端個人信息 關注id和粉絲id (速度慢)
----------- ---------  ---------- -----  ----
                311802                    16
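
Each crawler in the listing ships with a SinaWeibo_List_*.txt input file and a SinaWeibo_Info_*.txt output file, and the code snippet opens them as UTF-8 in read and append mode while warning about UnicodeEncodeError from the 'ascii' codec. The following is a minimal sketch of that read/append pattern with an explicit encoding, which is one common way to avoid that error. It assumes one user ID per line in the list file, which the page does not confirm; the helper names read_ids and append_lines and the demo file name are hypothetical. It is written for Python 3.

```python
import io
import os
import tempfile


def append_lines(path, lines):
    """Append text lines to a UTF-8 file, mirroring the 'a' mode in the snippet."""
    with io.open(path, "a", encoding="utf-8") as f:
        for line in lines:
            f.write(line + u"\n")


def read_ids(path):
    """Read non-empty lines from a UTF-8 file (assumed: one user ID per line)."""
    with io.open(path, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


# Demo in a temporary directory so no real data files are touched
tmp_dir = tempfile.mkdtemp()
list_path = os.path.join(tmp_dir, "SinaWeibo_List_demo.txt")  # hypothetical name

# 5824697471 is the ID from the URL in the snippet; the second entry is
# non-ASCII text (the two characters of "Weibo") to exercise the encoding
append_lines(list_path, [u"5824697471", u"", u"\u5fae\u535a"])
ids = read_ids(list_path)
print(len(ids))
```

Passing the encoding when the file is opened keeps non-ASCII text out of the default 'ascii' codec path that triggers the error quoted in the snippet's comments.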

