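Resource Description
Naive Bayes from Chapter 7 of the Watermelon Book, implemented with Laplacian correction. The code is simple and easy to follow, and the key parts are commented, so it is suitable for beginners.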
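
For readers new to the Laplacian correction mentioned above, here is a minimal sketch of the idea. The smoothed class prior is (count + 1) / (total + number of classes), and the smoothed conditional is (matches + 1) / (class count + number of distinct attribute values). The function names and the toy rows are illustrative assumptions and are not taken from the downloadable file.

```python
def smoothed_prior(labels, target, n_classes):
    """Laplace-smoothed class prior: (count + 1) / (total + n_classes)."""
    count = sum(1 for y in labels if y == target)
    return (count + 1) / (len(labels) + n_classes)


def smoothed_conditional(rows, labels, column, value, target, n_values):
    """Laplace-smoothed P(attribute == value | class == target)."""
    in_class = [r for r, y in zip(rows, labels) if y == target]
    matches = sum(1 for r in in_class if r[column] == value)
    return (matches + 1) / (len(in_class) + n_values)


# Toy data (hypothetical): one categorical attribute with 3 possible values.
rows = [["green"], ["black"], ["green"], ["white"]]
labels = ["yes", "yes", "no", "no"]

prior_yes = smoothed_prior(labels, "yes", n_classes=2)
# "white" never occurs with class "yes", yet its estimate stays above zero.
p_white_given_yes = smoothed_conditional(rows, labels, 0, "white", "yes", n_values=3)
```

Without the +1 terms, a value that never co-occurs with a class would get probability zero and wipe out the whole product of attribute probabilities.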

Code Snippet and File Information
import numpy as np
def load_data(filepath):
    '''
    :arg filepath: path to the data file
    :fun load the data; each row looks like: 1,青綠,蜷縮,濁響,清晰,凹陷,硬滑,0.697,0.46,是
    :return: the loaded data
    '''
    file_object = open(filepath, encoding='UTF-8')
    train_data = []
    file_object.readline()  # discard the first line (presumably a header row)
    while 1:
        data = file_object.readline()
        if not data:
            break
        else:
            train_data.append(data)
    file_object.close()
    test = []
    for s in train_data:
        test.append(s.replace('\n', '').split(','))  # strip '\n' and split each row on ','
    return test
def count_labels(data):
    '''
    :param data: the data set
    :return: the number of good melons and the number of bad melons
    '''
    yes = 0
    no = 0
    for s in range(len(data)):
        if data[s][-1] == '是':  # the last column is the label; '是' means "yes" (good melon)
            yes += 1
        else:
            no += 1
    return yes, no
def handle_one_data(data, attr, location, yes, no, attr_dis):
    '''
    :param data: the data set
    :param attr: the attribute value to look up
    :param location: column index of that attribute
    :param yes: number of good melons
    :param no: number of bad melons
    :param attr_dis: number of distinct values for each attribute
    :return: Laplace-smoothed probability of this attribute value given a good melon, and given a bad melon
    '''
    attr_y, attr_n = 0, 0
    for s in range(len(data)):
        if data[s][-1] == '是':
            if data[s][location] == attr:
                attr_y += 1
        else:
            if data[s][location] == attr:
                attr_n += 1
    # Laplacian correction: add 1 to each count and the number of distinct values to each denominator
    return (attr_y + 1) / (yes + attr_dis[location-1]), (attr_n + 1) / (no + attr_dis[location-1])
def handle_data(data):
    '''
    :param data: the data set
    :return: per-class mean and standard deviation of density and sugar content
    '''
    midu_y = []
    tiandu_y = []
    midu_n = []
    tiandu_n = []
    for s in range(len(data)):
        # np.float was removed in NumPy 1.24; the built-in float is equivalent here
        if data[s][-1] == '是':
            midu_y.append(float(data[s][-3]))
            tiandu_y.append(float(data[s][-2]))
        else:
            midu_n.append(float(data[s][-3]))
            tiandu_n.append(float(data[s][-2]))
    m_midu_y = np.mean(midu_y)
    m_midu_n = np.mean(midu_n)
    t_tiandu_y = np.mean(tiandu_y)
    t_tiandu_n = np.mean(tiandu_n)
    std_midu_y = np.std(midu_y)
    std_midu_n = np.std(midu_n)
    std_tiandu_y = np.std(tiandu_y)
    std_tiandu_n = np.std(tiandu_n)
    return m_midu_y, m_midu_n, t_tiandu_y, t_tiandu_n, std_midu_y, std_midu_n, std_tiandu_y, std_tiandu_n
def show_result(p_yes, p_no):
    '''
    :param p_yes: probability of each attribute of the test sample, given a good melon
    :param p_no: probability of each attribute of the test sample, given a bad melon
    :return: prints whether the sample is a good melon or a bad melon
    '''
    p1 = 1.0
    p2 = 1.0
    for s in range(len(p_yes)):
        p1 *= float(p_yes[s])
        p2 *= float(p_no[s])
    if p1 > p2:
        print("好瓜", p1, p2)  # "好瓜" = good melon
    else:
        print("壞瓜", p1, p2)  # "壞瓜" = bad melon
def count_attr_dis(data):
    '''
    :param data: the data set
    :return: the number of distinct values of each attribute
    '''
    count = []  # number of distinct values for each attribute
    for i in range(len(data[0])):
        if i == 0 or i == 7 or i == 8:  # skip the ID, density and sweetness columns
            continue
        d = []
        for s in range(len(data)):
            # (the preview on the source page is cut off here)
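
handle_data returns the per-class mean and standard deviation of density and sugar content. In a naive Bayes classifier such statistics are typically plugged into a Gaussian density to score continuous attributes. The preview ends before that step, so the following is only a sketch under that assumption. gaussian_pdf and log_score are hypothetical helper names, and the numbers are made up for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std**2) evaluated at x."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * std)
    return coeff * math.exp(-((x - mean) ** 2) / (2.0 * std ** 2))


def log_score(probabilities):
    """Sum of log-probabilities; ranks classes the same way as their product."""
    return sum(math.log(p) for p in probabilities)


# Hypothetical values: two categorical probabilities plus one density reading.
p_density = gaussian_pdf(0.5, mean=0.5, std=0.1)
score = log_score([0.5, 0.2, p_density])
```

Summing logarithms instead of multiplying, as show_result does, gives the same ranking of the two classes while avoiding floating-point underflow when many small probabilities are combined.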
  Attribute      Size        Date  Time  Name
----------- ---------  ---------- -----  ----
       File      1114  2018-04-30 20:02  bayes.txt
       File      5245  2018-05-01 17:09  Bayes_laplacian.py