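Resource overview
Predicts iris species with the KNN (k-nearest neighbors) algorithm. The archive contains the source code, the training data, and the test data.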
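
As a rough illustration of the idea, the sketch below classifies one sample by majority vote among its k nearest training rows. It is not the archive's code: `knn_predict` is a hypothetical helper, and the toy measurements and labels are made up for illustration rather than taken from train.txt.

```python
from collections import Counter

import numpy as np


def knn_predict(train_X, train_y, query, k=3):
    """Hypothetical helper: majority label among the k nearest training rows."""
    train_X = np.asarray(train_X, dtype=float)
    query = np.asarray(query, dtype=float)
    # Squared Euclidean distance from the query to every training row.
    dists = ((train_X - query) ** 2).sum(axis=1)
    # Indices of the k smallest distances.
    nearest = np.argsort(dists)[:k]
    # Count the labels of those neighbors and return the most frequent one.
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy values for illustration only (not taken from train.txt).
toy_X = [
    [5.0, 3.4, 1.5, 0.2],
    [4.9, 3.1, 1.4, 0.2],
    [6.4, 3.0, 5.5, 1.9],
    [6.6, 3.0, 5.3, 2.1],
    [5.1, 3.5, 1.3, 0.3],
]
toy_y = ["setosa", "setosa", "virginica", "virginica", "setosa"]

print(knn_predict(toy_X, toy_y, [5.0, 3.3, 1.4, 0.2], k=3))
```

The vectorized distance replaces an explicit per-attribute loop; both give the same neighbor ranking.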

Code snippet and file information
import numpy as np
import operator
def trainingFile2Matrix(filename):
    """
    Description:
        Parse the training data set.
    :param filename:
        Training data file
    :return:
        returnMat - feature values of every training sample
        returnLabel - class label of every training sample
    """
    with open(filename) as file:
        content = file.readlines()
    lineCount = len(content)
    returnMat = np.zeros((lineCount, 4))
    returnLabel = []
    index = 0
    for line in content:
        line = line.strip()
        # The delimiter was lost in this listing; a comma is assumed.
        example = line.split(',')
        returnMat[index, :] = example[0:4]
        index += 1
        returnLabel.append(example[4])
    return returnMat, returnLabel
def testFile2Matrix(filename):
    """
    Description:
        Parse the test data set.
    :param filename:
        Test data file
    :return:
        returnMat - feature values of every test sample
    """
    with open(filename) as file:
        content = file.readlines()
    lineCount = len(content)
    returnMat = np.zeros((lineCount, 4))
    index = 0
    for line in content:
        line = line.strip()
        # The delimiter was lost in this listing; a comma is assumed.
        example = line.split(',')
        returnMat[index, :] = example[0:4]
        index += 1
    return returnMat
def calculateDistance(train_example, test_example, example_length):
    """
    Description:
        Compute the Euclidean distance between a training sample and a
        test sample. The square root is omitted, so the value returned is
        the squared distance; this does not change which neighbors are
        nearest.
    :param train_example:
        Feature values of the training sample
    :param test_example:
        Feature values of the test sample
    :param example_length:
        Number of attributes per sample
    :return:
        distance - squared Euclidean distance between the two samples
    """
    distance = 0.0
    for i in range(example_length):
        distance += pow(train_example[i] - test_example[i], 2)
    return distance
def get_K_Neighbors(trainingSet, trainingLabel, test_example, k):
    """
    Description:
        Find the k training samples closest to the test sample.
    :param trainingSet:
        Training sample data
    :param trainingLabel:
        Training sample labels
    :param test_example:
        Test sample
    :param k:
        The parameter k (number of neighbors)
    :return:
        kNeighbors - labels of the k training samples nearest to the test sample
    """
    length = len(test_example)
    distances = []
    for i in range(len(trainingSet)):
        dis = calculateDistance(trainingSet[i], test_example, length)
        distances.append((trainingLabel[i], dis))
    # Sort (label, distance) pairs by distance, ascending.
    distances.sort(key=operator.itemgetter(1))
    kNeighbors = []
    for i in range(k):
        kNeighbors.append(distances[i][0])
    return kNeighbors
def getReasult(kNeighbors):
    """
    Description:
        Find the most common class among the k training samples nearest
        to the test sample.
    :param kNeighbors:
        Labels of the k training samples nearest to the test sample
    :return:
        sortedLabel[0][0] - predicted class of the test sample
    """
    classLabel = {}
    for i in range(len(kNeighbors)):
        temp = kNeighbors[i]
        if temp in classLabel:
            classLabel[temp] += 1
        else:
            classLabel[temp] = 1
    # Sort (label, count) pairs by count, descending.
    sortedLabel = sorted(classLabel.items(), key=operator.itemgetter(1), reverse=True)
    return sortedLabel[0][0]
def getAccuracy(testLabel, predictions):
    """
    Description:
        Compute the prediction accuracy.
    :param testLabel:
        True classes of the test data
    :param predictions:
        Predicted classes of the test data
    :return:
        (cnt / float(len(testLabel))) * 100.0 - accuracy as a percentage
    """
    cnt = 0
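
The preview stops after `cnt = 0`. Going by the return expression in the docstring, the rest of the function presumably counts how many predictions match the true labels. The block below is only a sketch of that assumed completion, not the archive's actual code; the loop body is a guess.

```python
def getAccuracy(testLabel, predictions):
    """Percentage of predictions that match the true labels (assumed completion)."""
    cnt = 0
    # Assumption: the truncated body compares labels position by position.
    for true_label, predicted in zip(testLabel, predictions):
        if true_label == predicted:
            cnt += 1
    return (cnt / float(len(testLabel))) * 100.0


# Made-up labels for illustration: three of the four predictions match.
print(getAccuracy(["a", "b", "a", "c"], ["a", "b", "c", "c"]))
```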

 Attribute       Size  Date        Time   Name
 ---------  ---------  ----------  -----  ----
 Directory          0  2018-10-19  13:25  KNN\
 File            6148  2018-10-19  13:24  KNN\.DS_Store
 Directory          0  2018-10-19  13:25  __MACOSX\
 Directory          0  2018-10-19  13:25  __MACOSX\KNN\
 File             120  2018-10-19  13:24  __MACOSX\KNN\._.DS_Store
 File            3760  2016-09-26  20:50  KNN\train.txt
 File             228  2016-09-26  20:50  __MACOSX\KNN\._train.txt
 File             508  2017-09-20  09:25  KNN\test.txt
 File             228  2017-09-20  09:25  __MACOSX\KNN\._test.txt
 File            5761  2018-09-20  14:48  KNN\KNN.py
 File             176  2018-09-20  14:48  __MACOSX\KNN\._KNN.py