
  • Size: 9.95 MB
    File type: .zip
    Coins: 2
    Downloads: 0
    Release date: 2023-11-20
  • Language: Python
  • Tags:

Resource description

Xidian University data mining assignment: an implementation of the Apriori algorithm in Python.
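
The description names the Apriori algorithm, but the archive's own implementation is not shown on this page. As a rough illustration only, here is a minimal sketch of level-wise frequent-itemset mining. The function name `apriori`, its parameters, and the sample transactions are all hypothetical and are not taken from the archive.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for every frequent itemset.

    Hypothetical helper: `min_support` is a fraction of all transactions.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of `itemset`.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {c for c in current if support(c) >= min_support}

    frequent = {}
    k = 1
    while current:
        for c in current:
            frequent[c] = support(c)
        k += 1
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Made-up sample data, for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer", "eggs"],
    ["milk", "diapers", "beer", "cola"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "cola"],
]
frequent = apriori(transactions, min_support=0.6)
for itemset, sup in sorted(frequent.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), sup)
```

The prune step relies on the Apriori property: an itemset can only be frequent if all of its subsets are frequent, so candidates with an infrequent subset are discarded before their support is counted.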

Code snippet and file information

# coding=utf-8
import operator
from math import log
import time
import load_bigcsv as ls
# import load_csv as ls


def createDataSet():
    dataSet = [[1, 1, 'yes'],
               [1, 1, 'yes'],
               [1, 0, 'no'],
               [0, 1, 'no'],
               [0, 1, 'no']]
    labels = ['no surfaceing', 'flippers']
    return dataSet, labels

# Compute the Shannon entropy


def calcShannonEnt(dataSet):
    numEntries = len(dataSet)
    labelCounts = {}
    for feaVec in dataSet:
        currentLabel = feaVec[-1]
        if currentLabel not in labelCounts:
            labelCounts[currentLabel] = 0
        labelCounts[currentLabel] += 1
    shannonEnt = 0.0
    for key in labelCounts:
        prob = float(labelCounts[key]) / numEntries
        shannonEnt -= prob * log(prob, 2)
    return shannonEnt


def splitDataSet(dataSet, axis, value):
    retDataSet = []
    for featVec in dataSet:
        if featVec[axis] == value:
            reducedFeatVec = featVec[:axis]
            reducedFeatVec.extend(featVec[axis + 1:])
            retDataSet.append(reducedFeatVec)
    return retDataSet


def chooseBestFeatureToSplit(dataSet):
    numFeatures = len(dataSet[0]) - 1  # the last column of the dataset is the class label
    baseEntropy = calcShannonEnt(dataSet)
    bestInfoGain = 0.0
    bestFeature = -1
    for i in range(numFeatures):
        featList = [example[i] for example in dataSet]
        uniqueVals = set(featList)
        newEntropy = 0.0
        for value in uniqueVals:
            subDataSet = splitDataSet(dataSet, i, value)
            prob = len(subDataSet) / float(len(dataSet))
            newEntropy += prob * calcShannonEnt(subDataSet)
        infoGain = baseEntropy - newEntropy
        if infoGain > bestInfoGain:
            bestInfoGain = infoGain
            bestFeature = i
    return bestFeature

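# The tree is built recursively by consuming one attribute per level, so the
# attributes may run out before the classes are fully separated. In that case
# the node's class is decided by majority vote.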


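def majorityCnt(classList):
    classCount = {}
    for vote in classList:
        if vote not in classCount.keys():
            classCount[vote] = 0
        classCount[vote] += 1
    # Return the label with the highest count; a bare max(classCount)
    # would return the largest key instead of the most frequent one.
    return max(classCount, key=classCount.get)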


def createTree(dataSet, labels):
    classList = [example[-1] for example in dataSet]
    if classList.count(classList[0]) == len(classList):  # stop splitting when all classes are the same
        return classList[0]
    if len(dataSet[0]) == 1:  # all features have been used up
        return majorityCnt(classList)
    bestFeat = chooseBestFeatureToSplit(dataSet)
    bestFeatLabel = labels[bestFeat]
    myTree = {bestFeatLabel: {}}
    del labels[bestFeat]
    featValues = [example[bestFeat] for example in dataSet]
    uniqueVals = set(featValues)
    for value in uniqueVals:
        subLabels = labels[:]  # copy so the recursive call does not modify this list
        myTree[bestFeatLabel][value] = createTree(
            splitDataSet(dataSet, bestFeat, value), subLabels)
    return myTree


def main():
    # data, label = createDataSet()
    data = ls.data2.tolist()
    label = ls.label
    # print(type(data))
    # print(type(label))
    t1 = t
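
The entropy and information-gain steps in the snippet above can be checked by hand on the toy dataset from `createDataSet`. The following is a self-contained sketch, not the archive's code: the helper names `entropy` and `info_gain` are hypothetical and only mirror what `calcShannonEnt` and `chooseBestFeatureToSplit` compute.

```python
from collections import Counter
from math import log2


def entropy(rows):
    """Shannon entropy (in bits) of the class labels in the last column."""
    counts = Counter(row[-1] for row in rows)
    total = len(rows)
    return -sum(c / total * log2(c / total) for c in counts.values())


def info_gain(rows, axis):
    """Entropy reduction obtained by splitting `rows` on column `axis`."""
    remainder = 0.0
    for value in {row[axis] for row in rows}:
        subset = [row for row in rows if row[axis] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(rows) - remainder


# Same toy rows as createDataSet(): two binary features plus a class label.
data = [[1, 1, 'yes'], [1, 1, 'yes'], [1, 0, 'no'], [0, 1, 'no'], [0, 1, 'no']]

base = entropy(data)
gains = [info_gain(data, i) for i in range(2)]
best = max(range(2), key=lambda i: gains[i])
print(base, gains, best)
```

With two 'yes' and three 'no' labels the base entropy is about 0.971 bits. Splitting on the first column gives a gain of about 0.420 against about 0.171 for the second, so the first feature would be chosen as the root of the tree.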

Attribute        Size  Date       Time   Name
----------- ---------  ---------- -----  ----
Dir                 0  2019-07-23 16:47  seventh_homework\
Dir                 0  2018-12-11 23:19  seventh_homework\7_guanlianguice\
Dir                 0  2018-12-05 21:42  seventh_homework\7_guanlianguice\Bigmart\
File             3474  2018-10-10 23:47  seventh_homework\7_guanlianguice\Bigmart\read.txt
File            85268  2018-10-10 23:24  seventh_homework\7_guanlianguice\Bigmart\SampleSubmission_TmnO39y.csv
File           527709  2018-10-10 23:18  seventh_homework\7_guanlianguice\Bigmart\Test_u94Q5KV.csv
File           869537  2018-10-10 23:18  seventh_homework\7_guanlianguice\Bigmart\Train_UWu5bXk.csv
Dir                 0  2018-12-05 21:42  seventh_homework\7_guanlianguice\black_friday\
File             2009  2018-10-10 23:39  seventh_homework\7_guanlianguice\black_friday\readme.txt
File               29  2018-10-10 23:35  seventh_homework\7_guanlianguice\black_friday\Sample_Submission_Tm9Lura.csv
File          9598228  2015-11-20 00:45  seventh_homework\7_guanlianguice\black_friday\test.csv
File         25525678  2015-11-20 00:47  seventh_homework\7_guanlianguice\black_friday\train.csv
File            18939  2018-12-11 18:15  seventh_homework\7_guanlianguice\daima.docx
File             3478  2018-12-11 18:12  seventh_homework\7_guanlianguice\jueceshu.py
File             1628  2018-12-11 18:14  seventh_homework\7_guanlianguice\load_bigcsv.py
File             1690  2018-12-11 18:08  seventh_homework\7_guanlianguice\load_csv.py
Dir                 0  2018-12-11 21:53  seventh_homework\7_guanlianguice\pict\
File            80381  2018-12-11 21:01  seventh_homework\7_guanlianguice\pict\bigmart_10data.png
File            52375  2018-12-11 21:02  seventh_homework\7_guanlianguice\pict\bigmart_all_data.png
File           327607  2018-12-11 16:30  seventh_homework\7_guanlianguice\pict\black_10.png
File            34241  2018-12-11 21:03  seventh_homework\7_guanlianguice\pict\black_100.png
File            68786  2018-12-11 21:05  seventh_homework\7_guanlianguice\pict\black_1000.png
File            88850  2018-12-11 21:04  seventh_homework\7_guanlianguice\pict\black_10000.png
File            12070  2018-12-11 21:03  seventh_homework\7_guanlianguice\pict\black關鍵屬性.png
File           171499  2018-12-11 21:22  seventh_homework\7_guanlianguice\pict\black處理后數據.png
File            11897  2018-12-11 21:00  seventh_homework\7_guanlianguice\pict\Mart關鍵屬性.png
File            91050  2018-12-11 16:32  seventh_homework\7_guanlianguice\pict\Screenshot from 2018-12-11 16-32-03.png
File            85403  2018-12-11 21:15  seventh_homework\7_guanlianguice\pict\數據1.png
File            58939  2018-12-11 21:21  seventh_homework\7_guanlianguice\pict\數據10.png
File            79858  2018-12-11 21:17  seventh_homework\7_guanlianguice\pict\數據2.png
File            17647  2018-12-11 21:17  seventh_homework\7_guanlianguice\pict\數據34.png
............ 14 more file entries omitted
