Resource overview
This project implements an N-gram language model in Python, using Kneser-Ney smoothing.
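To make the smoothing idea concrete, here is a minimal sketch of interpolated Kneser-Ney for bigrams only. It is not the project's code: the function name `train_kn_bigram`, the fixed discount of 0.75, and the `<s>`/`</s>` padding tokens are all assumptions made for illustration. The estimate it computes is P(w | v) = max(c(v, w) - d, 0) / c(v) + λ(v) · P_cont(w), where P_cont(w) is the share of distinct bigram types that end in w, and λ(v) = d · (number of distinct words following v) / c(v).

```python
from collections import Counter, defaultdict


def train_kn_bigram(sentences, discount=0.75):
    """Return prob(word, context) for an interpolated Kneser-Ney bigram model.

    Illustrative sketch only: the names and the fixed discount are assumptions.
    """
    bigrams = Counter()
    for tokens in sentences:
        padded = ["<s>"] + list(tokens) + ["</s>"]
        bigrams.update(zip(padded, padded[1:]))

    context_count = Counter()      # c(v): how often v occurs as a bigram context
    followers = defaultdict(set)   # distinct words seen after v
    preceders = defaultdict(set)   # distinct words seen before w
    for (v, w), c in bigrams.items():
        context_count[v] += c
        followers[v].add(w)
        preceders[w].add(v)
    n_bigram_types = len(bigrams)

    def prob(word, context):
        # Continuation probability: in how many distinct contexts does `word` appear?
        p_cont = len(preceders.get(word, ())) / n_bigram_types
        c_ctx = context_count.get(context, 0)
        if c_ctx == 0:
            # Unseen context: fall back to the continuation distribution alone.
            return p_cont
        discounted = max(bigrams.get((context, word), 0) - discount, 0.0) / c_ctx
        backoff_weight = discount * len(followers[context]) / c_ctx
        return discounted + backoff_weight * p_cont

    return prob


corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "cat", "ran"]]
prob = train_kn_bigram(corpus)
print(prob("cat", "the"), prob("ran", "the"))
```

An unseen pair such as ("the", "ran") still gets non-zero probability, because the mass removed by the discount is redistributed in proportion to the continuation probability. A full model would extend this recursion to higher orders and estimate the discount from the data.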
Code snippet and file information
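#!/usr/bin/env python3
# -*- coding: utf-8 -*-
'''
    This script cleans the dictionary: it keeps only the text that
    precedes the first ' /' on each line of dict_old.txt.
'''
# Open both files in one `with` so they are closed even if an error occurs.
with open('dict_old.txt', 'r', encoding='utf-8') as f_in, \
        open('dict.txt', 'w', encoding='utf-8') as f_out:
    for line in f_in:
        # Strip the newline first so lines without ' /' do not produce blank lines.
        result = line.rstrip('\n').split(' /')[0]
        f_out.write(result)
        f_out.write('\n')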
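The effect of the split can be checked without touching any files. The sketch below uses a hypothetical helper, `strip_annotations`, and made-up sample lines. The real layout of dict_old.txt is not shown here, so the "word /tag" shape is an assumption.

```python
def strip_annotations(lines, sep=" /"):
    """Return the text before the first `sep` on each line (hypothetical helper)."""
    return [line.rstrip("\n").split(sep)[0] for line in lines]


# Made-up sample lines; the actual dictionary format may differ.
sample = ["你好 /n\n", "世界 /n /v\n", "plain\n"]
print(strip_annotations(sample))  # ['你好', '世界', 'plain']
```

Lines that contain no separator pass through unchanged, apart from the trailing newline.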
 Attribute        Size  Date        Time   Name
----------- ---------  ----------  -----  ----
 File             372  2018-06-07  10:03  languageModel_kneser-ney\dealDict.py
 File          519392  2018-05-05  08:55  languageModel_kneser-ney\dict.txt
 File          722950  2018-05-05  08:50  languageModel_kneser-ney\dict_old.txt
 File            9972  2018-06-07  10:03  languageModel_kneser-ney\LM_train_backoff.py
 File           11528  2018-06-07  10:03  languageModel_kneser-ney\LM_train_interpolation.py
 File            3029  2018-06-07  10:04  languageModel_kneser-ney\Ngram.py
 File          184397  2018-05-04  08:48  languageModel_kneser-ney\test.trans
 File        50499426  2018-05-05  12:34  languageModel_kneser-ney\Train.trans
 Directory          0  2018-06-07  10:04  languageModel_kneser-ney
----------- ---------  ----------  -----  ----
 Total       51951066                     9 items