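Resource overview
ngramScore fitness-scoring files (the scoring code to import plus the n-gram data); a fitness-scoring module that can be imported from Python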

Code snippet and file information
'''
Allows scoring of text using n-gram probabilities
17/07/12
'''
from math import log10

class ngram_score(object):
    def __init__(self, ngramfile, sep=' '):
        ''' load a file containing ngrams and counts, calculate log probabilities '''
        self.ngrams = {}
        # open() replaces the Python 2 file() builtin
        with open(ngramfile) as f:
            for line in f:
                key, count = line.split(sep)
                self.ngrams[key] = int(count)
        self.L = len(key)  # n-gram length, taken from the last key read
        self.N = sum(self.ngrams.values())
        # calculate log probabilities
        for key in self.ngrams.keys():
            self.ngrams[key] = log10(float(self.ngrams[key])/self.N)
        # score assigned to any n-gram that is not in the table
        self.floor = log10(0.01/self.N)

    def score(self, text):
        ''' compute the score of text '''
        score = 0
        ngrams = self.ngrams.__getitem__
        for i in range(len(text)-self.L+1):
            if text[i:i+self.L] in self.ngrams: score += ngrams(text[i:i+self.L])
            else: score += self.floor
        return score
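The scoring idea in the class above can be tried without a data file. The following is a minimal sketch, not the author's code: the helper names build_log_probs and score_text and the toy counts are hypothetical, chosen only to show how counts become log10 probabilities and how unseen n-grams fall back to the floor value.

```python
from math import log10

def build_log_probs(counts):
    """Convert raw n-gram counts into log10 probabilities plus a floor value."""
    total = sum(counts.values())
    log_probs = {gram: log10(c / total) for gram, c in counts.items()}
    floor = log10(0.01 / total)  # penalty for n-grams missing from the table
    return log_probs, floor

def score_text(text, log_probs, floor, n):
    """Sum the log10 probability of every length-n window in text."""
    return sum(log_probs.get(text[i:i + n], floor)
               for i in range(len(text) - n + 1))

# Hypothetical toy counts, for illustration only (not from english_quadgrams.txt).
toy_counts = {"TION": 60, "THER": 30, "ZZZZ": 10}
log_probs, floor = build_log_probs(toy_counts)

likely = score_text("TION", log_probs, floor, 4)    # quadgram present in the table
unlikely = score_text("QXJK", log_probs, floor, 4)  # absent quadgram, gets the floor
```

Scores are sums of logarithms, so they are negative and a higher (less negative) score means the text looks more like the source of the counts. When used as a fitness value, candidates are compared by score rather than judged on the absolute number.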
 Attribute         Size  Date        Time   Name
 -----------  ---------  ----------  -----  ----
 File                 0  2020-01-13  02:40  适应度计算\
 File              6148  2020-01-13  02:40  适应度计算\.DS_Store
 File               120  2020-01-13  02:40  __MACOSX\适应度计算\._.DS_Store
 File               972  2019-11-29  01:12  适应度计算\ngram_score.py
 File            975928  2019-11-29  01:08  适应度计算\english_quadgrams.txt.zip
 File               591  2019-11-29  01:08  __MACOSX\适应度计算\._english_quadgrams.txt.zip
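The quadgram data in the listing is shipped as a zip archive, while ngram_score expects a plain text file. One option is to unzip it first; another is to read the counts straight from the archive. The sketch below assumes the archive holds a single text file with one "NGRAM COUNT" pair per line, as the loader in ngram_score.py expects. The function name read_ngram_counts and the choice of the first archive entry as the default are assumptions, since the listing does not show the archive's contents.

```python
import io
import zipfile

def read_ngram_counts(zip_path, member=None, sep=" "):
    """Read 'NGRAM COUNT' lines from a text file stored inside a zip archive."""
    counts = {}
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: if no member is named, use the first entry in the archive.
        name = member or archive.namelist()[0]
        with archive.open(name) as raw:
            for line in io.TextIOWrapper(raw, encoding="utf-8"):
                line = line.strip()
                if not line:
                    continue  # skip blank lines
                key, count = line.split(sep)
                counts[key] = int(count)
    return counts
```

The returned dictionary holds raw counts; converting them to log probabilities is the same step that ngram_score.__init__ performs after reading its file.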