Resource description
MATLAB-based spam email filtering using the naive Bayes algorithm

Code snippet and file information
[spmatrix, tokenlist, category] = readMatrix('MATRIX.TEST');
testMatrix = full(spmatrix);
numTestDocs = size(testMatrix, 1);
numTokens = size(testMatrix, 2);
% Assume nb_train.m has just been executed and all the parameters computed/needed
% by your classifier are in memory through that execution. You can also assume
% that the columns in the test set are arranged in exactly the same way as for the
% training set (i.e. the j-th column represents the same token in the test data
% matrix as in the original training data matrix).
% Write code below to classify each document in the test set (i.e. each row
% in the current document word matrix) as 1 for SPAM and 0 for NON-SPAM.
% Construct the (numTestDocs x 1) vector 'output' such that the i-th entry
% of this vector is the predicted class (1/0) for the i-th email (i-th row
% in testMatrix) in the test set.
output = zeros(numTestDocs, 1);
%---------------
% YOUR CODE HERE
%%%%%%%%%%%% CODE BY 王委濤 %%%%%%%%%%%%%%%%%%%%%%%%
%result_test = zeros(numTestDocs, 1);    % the result of Test = output
result = zeros(numclassifier, 1);        % log-score of the current test document under each class
for i = 1:numTestDocs                    % process each test document
    for k = 1:numclassifier              % score the document under each class
        for j = 1:numTokens              % process each token
            result(k) = result(k) + testMatrix(i, j) * log(P_numToken(k, j)); % accumulate log P(Xi|Ck)
        end
        result(k) = result(k) + log(P_category(k));  % add log P(Ck), giving log of P(Xi|Ck)P(Ck)
    end
    if (result(1) > result(2))           % compare the class scores to decide the document's class
        output(i) = 0;
    else
        output(i) = 1;
    end
    for r = 1:numclassifier              % reset the score vector for the next document
        result(r) = 0;
    end
end
%%%%%%%%%%%%%% CODE BY 王委濤 %%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%% CODE BY 葛新朋 %%%%%%%%%%%%%%%%%
% count the misclassified documents
numerror = 0;
for g = 1:numTestDocs
    if (category(g) ~= output(g))
        numerror = numerror + 1;
    end
end
% display
disp('The test set error on MATRIX.TEST ')
disp(numerror/numTestDocs);
%%%%%%%%%%%%%%% CODE BY 葛新朋 %%%%%%%%%%%%%%%%%%
%---------------
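The triple loop and the error count in the snippet can also be written as matrix operations. The following is a minimal sketch, not the authors' code: the parameter values, the test counts and the labels are made-up toy data, and it assumes, as the loop implies, that row 1 of P_numToken is the NON-SPAM class, row 2 is the SPAM class, and no token probability is exactly zero (otherwise 0 * log(0) would produce NaN).

```matlab
% Toy parameters (hypothetical values, not the output of nb_train.m).
% Row k of P_numToken holds P(token j | class k); class 1 = non-spam, class 2 = spam.
P_numToken = [0.7 0.2 0.1;
              0.1 0.3 0.6];
P_category = [0.5; 0.5];          % class priors P(Ck)
testMatrix = [3 1 0;              % token counts, one row per document
              0 1 4];
category   = [1; 1];              % hypothetical true labels

% Log-score of every document under every class in one matrix product:
% scores(i, k) = sum_j testMatrix(i, j) * log(P_numToken(k, j)) + log(P_category(k))
scores = bsxfun(@plus, testMatrix * log(P_numToken).', log(P_category(:)).');

% Same tie-breaking as the loop: predict 0 only when class 1 scores strictly higher.
output = double(scores(:, 1) <= scores(:, 2));

% Fraction of misclassified documents.
errorRate = mean(category(:) ~= output(:));
disp(errorRate);
```

Working in log space, as the original loop does, avoids the numerical underflow that multiplying many small token probabilities would cause.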
 Attribute        Size  Date        Time   Name
-----------  ---------  ----------  -----  ----
 File            74967  2013-09-25  13:35  作業\hw1.pdf
 File           129205  2013-11-04  23:16  作業\hw2.pdf
 File              657  2013-11-04  23:16  作業\hw2data.txt
 File           201135  2012-10-26  14:20  作業\MATRIX.TEST
 File           525629  2012-10-26  14:20  作業\MATRIX.TRAIN
 File            33910  2012-10-26  14:20  作業\MATRIX.TRAIN.100
 File           343763  2012-10-26  14:20  作業\MATRIX.TRAIN.1400
 File            58065  2012-10-26  14:20  作業\MATRIX.TRAIN.200
 File           107274  2012-10-26  14:20  作業\MATRIX.TRAIN.400
 File            21769  2012-10-26  14:20  作業\MATRIX.TRAIN.50
 File           205046  2012-10-26  14:20  作業\MATRIX.TRAIN.800
 File             2197  2013-10-27  00:21  作業\nb_test.m
 File             3116  2013-10-27  00:21  作業\nb_train.m
 File             1159  2013-07-14  17:29  作業\readMatrix.m
 File             2665  2013-10-27  00:21  作業\Result_HW1.txt
 File            15391  2012-10-26  14:20  作業\TOKENS_LIST.txt
 Directory           0  2013-11-04  23:17  作業
-----------  ---------  ----------  -----  ----
               1725948                     17