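Resource Overview
A database (XML) of the world's major countries, provinces/states, and cities, extracted from QQ International and converted to CSV. The database contains the fields country_name, country_code, state_name, state_code, city_name, and city_code. The uploaded resource includes the XML database, the CSV database, and the XML-to-CSV conversion script.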
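
As a rough illustration of how the CSV version might be consumed, the sketch below reads rows with Python's standard csv module and groups city names by country code. The helper names (load_rows, cities_by_country) and the sample rows are hypothetical and not part of the resource. The empty first header in the sample assumes the file was written with an index column, which is pandas' default for to_csv.

```python
import csv
import io
from collections import defaultdict

# The six fields named in the resource description.
FIELDS = ("country_name", "country_code", "state_name",
          "state_code", "city_name", "city_code")


def load_rows(handle):
    """Read CSV rows as dicts, keeping only the six documented fields."""
    reader = csv.DictReader(handle)
    # Any extra column (such as an unnamed index column) is ignored.
    return [{name: (row.get(name) or "") for name in FIELDS} for row in reader]


def cities_by_country(rows):
    """Group city names by country code, preserving file order."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["country_code"]].append(row["city_name"])
    return dict(grouped)


if __name__ == "__main__":
    # Made-up sample data standing in for the real CSV file.
    sample = io.StringIO(
        ",country_name,country_code,state_name,state_code,city_name,city_code\n"
        "0,Exampleland,EX,North,01,Alpha,001\n"
        "1,Exampleland,EX,North,01,Beta,002\n"
        "2,Otherland,OT,,,Gamma,1\n"
    )
    print(cities_by_country(load_rows(sample)))
```

For the real file, the handle would come from something like open("FSN_fromQQ.csv", newline="", encoding="utf-8"). Every value is kept as text, so codes are not reinterpreted as numbers.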

Code Snippet and File Information
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import pandas as pd
import xml.etree.ElementTree as ET


def xml_to_csv(data="../data/QQ-LocList/LocList.xml"):
    """Flatten the country/state/city XML tree into one CSV row per city."""
    tree = ET.parse(data)
    root = tree.getroot()
    result = []
    for country in root.iter('CountryRegion'):  # country level
        country_name, country_code = country.attrib["Name"], country.attrib["Code"]
        for state in country:  # state level
            try:
                state_name, state_code = state.attrib['Name'], state.attrib['Code']
            except KeyError:
                # Some state elements carry no Name/Code attributes.
                state_name = state_code = None
            for city in state:  # city level
                city_name, city_code = city.attrib['Name'], city.attrib['Code']
                result.append({'country_name': country_name, 'country_code': country_code,
                               'state_name': state_name, 'state_code': state_code,
                               'city_name': city_name, 'city_code': city_code})
    result = pd.DataFrame(result, columns=['country_name', 'country_code', 'state_name',
                                           'state_code', 'city_name', 'city_code'])
    result.to_csv('../output/FSN_fromQQ.csv')
    print("Number of cities:", result.shape[0])
    # Keep one row per country to count the distinct countries.
    countries = result.drop_duplicates(subset=['country_name'], keep='first')
    print("Number of countries:", countries.shape[0])
    return 0


if __name__ == "__main__":
    xml_to_csv()
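
The script implies a three-level XML layout: CountryRegion elements, each containing state elements, each containing city elements, with Name and Code attributes. The sketch below shows the same flattening using only the standard library, run on a small made-up document. The element names Location, State, and City, the sample data, and the helper names iter_rows and write_csv are assumptions for illustration; only CountryRegion, Name, and Code come from the script.

```python
import csv
import io
import xml.etree.ElementTree as ET

FIELDS = ["country_name", "country_code", "state_name",
          "state_code", "city_name", "city_code"]

# Hypothetical sample; the tag names Location/State/City are assumed.
SAMPLE_XML = """<Location>
  <CountryRegion Name="Exampleland" Code="EX">
    <State Name="North" Code="01">
      <City Name="Alpha" Code="1"/>
      <City Name="Beta" Code="2"/>
    </State>
  </CountryRegion>
  <CountryRegion Name="Otherland" Code="OT">
    <State>
      <City Name="Gamma" Code="1"/>
    </State>
  </CountryRegion>
</Location>"""


def iter_rows(root):
    """Yield one dict per city, walking country -> state -> city."""
    for country in root.iter("CountryRegion"):
        for state in country:
            for city in state:
                yield {
                    "country_name": country.get("Name"),
                    "country_code": country.get("Code"),
                    # .get returns None when the attribute is absent.
                    "state_name": state.get("Name"),
                    "state_code": state.get("Code"),
                    "city_name": city.get("Name"),
                    "city_code": city.get("Code"),
                }


def write_csv(root, handle):
    """Write all rows to an open text handle and return the row count."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    count = 0
    for row in iter_rows(root):
        writer.writerow(row)  # None values are written as empty fields
        count += 1
    return count


if __name__ == "__main__":
    buffer = io.StringIO()
    total = write_csv(ET.fromstring(SAMPLE_XML), buffer)
    print("rows written:", total)
    print(buffer.getvalue())
```

Unlike the pandas script, this variant writes no index column and treats each missing attribute independently instead of raising KeyError. As in the script, a country with no city elements produces no rows.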

 Attribute       Size  Date       Time   Name
----------- ---------  ---------- -----  ----
 Directory          0  2019-07-05 10:19  01_qq導出的原始xm
 File          184907  2019-07-04 09:36  01_qq導出的原始xm
 Directory          0  2019-07-05 10:20  02_pyhton腳本轉換后的csv版本\
 File          139146  2019-07-05 09:54  02_pyhton腳本轉換后的csv版本\FSN_fromQQ.csv
 Directory          0  2019-07-05 10:31  03_xm
 File            1359  2019-07-04 17:57  03_xm