  • Size: 2 KB
  • File type: .py
  • Published: 2021-06-16
  • Language: Python
  • Tags: kaggle

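Resource description

Complete code for the data preprocessing stage of the Kaggle House Prices competition, with very detailed comments. It follows the classic step-by-step preprocessing workflow used in data mining.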

Code snippet and file information

#preprocessing for training&test data
#@2016.11.08

import pandas as pd
#step1:reading csv data
train = pd.read_csv('train.csv')
test = pd.read_csv('test.csv')
#train.head()   # take a brief look at training data
all_data = pd.concat((train.loc[:, 'MSSubClass':'SaleCondition'],
                      test.loc[:, 'MSSubClass':'SaleCondition']))  # concat training&test data

import numpy as np
from scipy.stats import skew
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
#step2:log transform for training data (including the labels)
'''  a png for labels' distribution
matplotlib.rcParams['figure.figsize'] = (12.0, 6.0)
prices = pd.DataFrame({"price":train["SalePrice"], "log(price + 1)":np.log1p(train["SalePrice"])})
prices.hist()
plt.savefig('label_dist.png', dpi=150)
'''
train["SalePrice"] = np.log1p(train["SalePrice"]) #log transform the target

#log transform skewed numeric features:
numeric_feats = all_data.dtypes[all_data.dtypes != "object"].index   # get the index of all the n
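
The listing is cut off at this point, so the rest of the step is not shown. As a rough sketch of how a log transform of skewed numeric features could proceed, the example below measures skewness with `scipy.stats.skew` and applies `np.log1p` to columns above a cut-off. The helper name `log_transform_skewed`, the threshold of 0.75, the choice to measure skewness on the whole frame, and the toy data are all assumptions made for illustration; none of them is taken from the original script.

```python
import numpy as np
import pandas as pd
from scipy.stats import skew


def log_transform_skewed(df, threshold=0.75):
    """Return (copy of df with log1p applied to right-skewed numeric columns, column names).

    `threshold` is an assumed cut-off, not a value from the original script.
    """
    out = df.copy()
    # Select numeric columns only; text columns are left untouched
    numeric_cols = out.select_dtypes(include=[np.number]).columns
    # Sample skewness of each numeric column, ignoring missing values
    skewness = out[numeric_cols].apply(lambda col: skew(col.dropna()))
    skewed_cols = skewness[skewness > threshold].index
    # log1p(x) = log(1 + x), which stays defined for zero-valued entries
    out[skewed_cols] = np.log1p(out[skewed_cols])
    return out, list(skewed_cols)


# Toy data standing in for all_data (hypothetical values)
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "LotArea": rng.lognormal(mean=9.0, sigma=1.0, size=500),  # strongly right-skewed
    "YearBuilt": rng.integers(1900, 2010, size=500),          # roughly symmetric
    "Street": ["Pave"] * 500,                                  # non-numeric
})
transformed, skewed_cols = log_transform_skewed(demo)
```

Because the target was transformed with `np.log1p`, predictions made on that scale would be mapped back to prices with `np.expm1`.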
