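Resource overview

The archive applies the random forest algorithm to the classification problem on the adult dataset. It has four parts: (1) preprocessing of the adult dataset, written in Python; (2) a self-written random forest implementation applied to the adult dataset; (3) the adult classification problem solved with Python's sklearn module; (4) five machine learning classification algorithms called from MATLAB on the same problem, compared to see which one achieves better classification performance.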
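
To make the third part (the sklearn route) concrete, here is a minimal sketch of training a `RandomForestClassifier`. It is not the archive's `RF_sklearn.py`, which is not shown on this page: the synthetic data from `make_classification`, the 14-feature shape, the hyperparameters, and all variable names are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the preprocessed adult data
# (assumption: 14 numeric features and a binary label).
X, y = make_classification(
    n_samples=500, n_features=14, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Fit a forest of 100 trees and score it on the held-out split.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"test accuracy: {accuracy:.3f}")
```

With the real data, `X` and `y` would instead come from the preprocessed train and test files, with `salary` as the label column.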

Code snippet and file information

# -*- coding: utf-8 -*-
"""
Created on Tue Nov  6 13:29:41 2018

@author: 28770
"""

import pandas as pd

excelFile = r'ML_data2.xlsx'
# Read sheet 0 and sheet 1 of the workbook; read_excel already returns a DataFrame
train_df = pd.read_excel(excelFile, sheet_name=0)
test_df = pd.read_excel(excelFile, sheet_name=1)

'''
# workClass_loss marks the missing entries in the 'workClass' column of train_df: True where data is missing
workClass_loss = train_df['workClass'].isnull()  # .notnull() gives the opposite result.
'''

'''
Missing-value filling (each missing value is filled with the value from the previous row).
Alternatives for filling train_df: the column's mode via .mode(), or the column mean via mean().
A column can have several modes, so .mode() returns a Series rather than a single
value as mean() does; .mode()[0] therefore selects the first mode.
'''
train_df_fill = train_df.ffill()  # forward fill; fillna(method="ffill") is deprecated in recent pandas
test_df_fill = test_df.ffill()

'''
Drop the redundant column
'''
train_df_fill = train_df_fill.drop(columns=['education'])
test_df_fill = test_df_fill.drop(columns=['education'])

'''
Map discrete features to integers
'''
salary_mapping = {'<=50K': 0, '>50K': 1}
train_df_fill['salary'] = train_df_fill['salary'].map(salary_mapping)
test_df_fill['salary'] = test_df_fill['salary'].map(salary_mapping)

# 'education' was dropped above, so it is not listed here (indexing it would raise a KeyError)
Discrete_attribute = ['workClass', 'marital_status', 'occupation',
                      'relationship', 'race', 'sex', 'native_country']

for attribute in Discrete_attribute:
    # Build the label -> integer mapping from the training set, then apply it to both sets
    attribute_mapping = {lab: idx for idx, lab in enumerate(set(train_df_fill[attribute]))}
    train_df_fill[attribute] = train_df_fill[attribute].map(attribute_mapping)
    test_df_fill[attribute] = test_df_fill[attribute].map(attribute_mapping)

'''
workClass_mapping = {lab: idx for idx, lab in enumerate(set(train_df_fill['workClass']))}
train_df_fill['workClass'] = train_df_fill['workClass'].map(workClass_mapping)
test_df_fill['workClass'] = test_df_fill['workClass'].map(workClass_mapping)

education_mapping = {lab: idx for idx, lab in enumerate(set(train_df_fill['education']))}
train_df_fill['education'] = train_df_fill['education'].map(education_mapping)
test_df_fill['education'] = test_df_fill['education'].map(education_mapping)

marital_status_mapping = {lab: idx for idx, lab in enumerate(set(train_df_fill['marital_status']))}
train_df_fill['marital_status'] = train_df_fill['marital_status'].map(marital_status_mapping)
test_df_fill['marital_status'] = test_df_fill['marital_status'].map(marital_status_mapping)

occupation_mapping = {lab: idx for idx, lab in enumerate(set(train_df_fill['occupation']))}
train_df_fill['occupation'] = train_df_fill['occupation'].map(occupation_mapping)
test_df_fill['occupation'] = test_df_fill['occupation'].map(occupation_mapping)

relationship_mapping = {lab: idx for idx, lab in enumerate(set(train_df_fill['relationship']))}
train_df_fill['relationship'] = train_df_fill['relationship'].map(relationship_mapping)
test_df_fill['relationship'] = test_df_fill['relationship'].map(relationship_mapping)

race_mapping = {lab: idx for idx, lab in enumerate(set(train_df_fill['race']))}
train_df_fill['race'] = train_df_fill['race'].map(race_mapping)
test_df_
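
The preprocessing steps in the snippet above (missing-value filling, then mapping category labels to integers) can be tried on a toy table. The sketch below is only an illustration under stated assumptions: the two small DataFrames and their values are hypothetical, sorting the labels to get a reproducible mapping is a swapped-in choice (the snippet enumerates an unordered `set`), and using `-1` for categories that never appear in the training data is likewise an assumption rather than something the original code does.

```python
import pandas as pd

# Toy stand-ins for the train/test sheets (hypothetical values).
train = pd.DataFrame({
    "workClass": ["Private", None, "State-gov", "Private"],
    "age": [39.0, 50.0, None, 28.0],
})
test = pd.DataFrame({
    "workClass": ["Private", "Never-worked"],
    "age": [30.0, 41.0],
})

# Option 1: forward fill, as in the snippet above
# (each gap takes the value from the previous row).
train_ffill = train.ffill()

# Option 2: fill a categorical column with its mode and a numeric column
# with its mean. .mode() returns a Series, so [0] picks the first mode.
train_stat = train.copy()
train_stat["workClass"] = train_stat["workClass"].fillna(
    train_stat["workClass"].mode()[0]
)
train_stat["age"] = train_stat["age"].fillna(train_stat["age"].mean())

# Label -> integer mapping built from the training set only.
# Sorting makes the codes reproducible between runs.
labels = sorted(train_stat["workClass"].unique())
mapping = {lab: idx for idx, lab in enumerate(labels)}
train_codes = train_stat["workClass"].map(mapping)

# Categories unseen in training map to NaN; -1 marks them explicitly.
test_codes = test["workClass"].map(mapping).fillna(-1).astype(int)

print(mapping)
print(test_codes.tolist())
```

Building the mapping from the training set alone mirrors the loop in the snippet; the explicit `-1` simply makes unseen test categories visible instead of leaving them as missing values.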

 Attribute       Size        Date   Time  Name
----------  ---------  ----------  -----  ----
      File       4575  2018-11-13  23:33  Random_Forest\excel_change.py
      File       1589  2018-11-13  20:55  Random_Forest\Matlab_xlr\excel_run.m
      File    2677491  2018-11-06  20:50  Random_Forest\Matlab_xlr\ML_data2_trans.xlsx
      File    2918697  2018-11-01  21:57  Random_Forest\ML_data2.xlsx
      File     642592  2018-11-08  10:55  Random_Forest\ML_data2_test.csv
      File    1285749  2018-11-08  10:55  Random_Forest\ML_data2_train.csv
      File    2677491  2018-11-06  20:50  Random_Forest\ML_data2_trans.xlsx
      File     642435  2018-11-08  10:59  Random_Forest\Random Forest\ML_data2_test.csv
      File    1285592  2018-11-08  10:59  Random_Forest\Random Forest\ML_data2_train.csv
      File    2677491  2018-11-06  20:50  Random_Forest\Random Forest\ML_data2_trans.xlsx
      File      10260  2018-11-14  13:26  Random_Forest\Random Forest\Random_Forest.py
      File     642435  2018-11-08  10:59  Random_Forest\RF_sklearn\ML_data2_test.csv
      File    1285592  2018-11-08  10:59  Random_Forest\RF_sklearn\ML_data2_train.csv
      File    2677491  2018-11-06  20:50  Random_Forest\RF_sklearn\ML_data2_trans.xlsx
      File       1259  2018-11-14  14:15  Random_Forest\RF_sklearn\RF_sklearn.py
      File        214  2018-11-14  13:51  Random_Forest\文本描述(首先閱讀).txt
 Directory          0  2018-12-14  10:51  Random_Forest\Matlab_xlr
 Directory          0  2018-12-14  10:51  Random_Forest\Random Forest
 Directory          0  2018-12-14  10:51  Random_Forest\RF_sklearn
 Directory          0  2018-12-14  10:51  Random_Forest
----------  ---------  ----------  -----  ----
     Total   19430953                     20 entries

