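Resource Overview
After studying some word segmentation principles and methods, I made the following improvements:
1. Built two kinds of dictionary tree and compared their performance.
2. Improved the segmentation of numeral phrases by adding grammatical checks.
3. Added a place-name dictionary builder. It segments a dictionary of full place names to produce a segmented place-name dictionary, which reduces the number of entries and speeds up lookup. The program was tested with data for a mid-sized provincial capital, and both fuzzy and exact queries worked well.
4. Added a place-name input suggestion tool.
The code is written in C#. It is the result of a learning exercise and still has many rough edges; feedback is welcome.
The dictionary data comes from the Internet.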
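The description does not say which two dictionary trees were built or how matching is done, so the following is only a minimal sketch of one common approach: a character trie with forward maximum matching for segmentation and a prefix walk for input suggestions. All names here (`TrieNode`, `DictTrie`, `LongestMatch`, `Segment`, `Suggest`) are hypothetical and are not taken from the original ChineseSegmentTools code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; not the original author's implementation.
public sealed class TrieNode
{
    public readonly Dictionary<char, TrieNode> Children = new Dictionary<char, TrieNode>();
    public bool IsWord; // true when the path from the root spells a dictionary entry
}

public sealed class DictTrie
{
    private readonly TrieNode root = new TrieNode();

    // Insert one dictionary entry, one character per tree level.
    public void Add(string word)
    {
        if (string.IsNullOrEmpty(word)) return;
        TrieNode node = root;
        foreach (char c in word)
        {
            TrieNode next;
            if (!node.Children.TryGetValue(c, out next))
            {
                next = new TrieNode();
                node.Children[c] = next;
            }
            node = next;
        }
        node.IsWord = true;
    }

    // Length of the longest dictionary entry starting at text[start], or 0 if none.
    public int LongestMatch(string text, int start)
    {
        TrieNode node = root;
        int best = 0;
        for (int i = start; i < text.Length; i++)
        {
            if (!node.Children.TryGetValue(text[i], out node)) break;
            if (node.IsWord) best = i - start + 1;
        }
        return best;
    }

    // Forward maximum matching: take the longest entry at each position and
    // fall back to a single character when nothing in the dictionary matches.
    public List<string> Segment(string text)
    {
        var tokens = new List<string>();
        int pos = 0;
        while (pos < text.Length)
        {
            int len = Math.Max(1, LongestMatch(text, pos));
            tokens.Add(text.Substring(pos, len));
            pos += len;
        }
        return tokens;
    }

    // Input suggestions: entries that begin with the given prefix, at most limit of them.
    public List<string> Suggest(string prefix, int limit)
    {
        var results = new List<string>();
        TrieNode node = root;
        foreach (char c in prefix)
        {
            if (!node.Children.TryGetValue(c, out node)) return results;
        }
        Collect(node, prefix, limit, results);
        return results;
    }

    // Depth-first walk below a prefix node; the order of results is unspecified.
    private static void Collect(TrieNode node, string current, int limit, List<string> results)
    {
        if (results.Count >= limit) return;
        if (node.IsWord) results.Add(current);
        foreach (KeyValuePair<char, TrieNode> child in node.Children)
        {
            Collect(child.Value, current + child.Key, limit, results);
            if (results.Count >= limit) return;
        }
    }
}
```

With the entries 北京, 中山 and 中山路 added, `Segment("北京中山路")` yields the two tokens 北京 and 中山路, and `Suggest("中", 10)` returns both 中山 and 中山路. Storing segmented place-name parts rather than full names, as item 3 describes, would keep a tree like this smaller.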
Code Snippet and File Information
/*
 * If you modify this program, please note the modifier, the date, and a summary of the changes
 * Content: DictBuilder
 * Email: killo007@163.com
 * Date: August 8th, 2008
 */
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;
namespace ChineseSegement
{
    public partial class DictBuilder : Form
    {
        public DictBuilder()
        {
            InitializeComponent();
        }
        private void DictBuilder_Load(object sender, EventArgs e)
        {
            // The existing-file controls start out disabled until checkBox1 is checked.
            this.txtExistFile.ReadOnly = true;
            this.button2.Enabled = false;
        }
        /// <summary>
        /// Gets the input file path
        /// </summary>
        /// <param name="sender"></param>
        /// <param name="e"></param>
        private void btnInput_Click(object sender, EventArgs e)
        {
            OpenFileDialog dialog = new OpenFileDialog();
            dialog.Filter = "Text documents (*.txt)|*.txt|Data files (*.dat)|*.dat|All files (*.*)|*.*";
            dialog.FilterIndex = 1;
            dialog.InitialDirectory = @"E:\學習\自定義控件\ChineseSegement\ChineseSegmentTools\Dictionary\DictionaryBuild";
            dialog.AddExtension = true;
            if (DialogResult.OK == dialog.ShowDialog())
            {
                this.txtInput.Text = dialog.FileName;
            }
        }
        /// <summary>
        /// Gets the save path for the preprocessed file
        /// </summary>
        /// <param name="sender"></param>
        /// <param name="e"></param>
        private void btnprePath_Click(object sender, EventArgs e)
        {
            SaveFileDialog dialog = new SaveFileDialog();
            dialog.Filter = "Text documents (*.txt)|*.txt|Data files (*.dat)|*.dat|All files (*.*)|*.*";
            dialog.FilterIndex = 1;
            // In a verbatim (@) string a backslash is literal, so a single one follows the drive letter.
            dialog.InitialDirectory = @"E:\學習\自定義控件\ChineseSegement\ChineseSegmentTools\Dictionary\DictionaryBuild";
            dialog.DefaultExt = "txt";
            dialog.RestoreDirectory = true;
            dialog.AddExtension = true;
            if (DialogResult.OK == dialog.ShowDialog())
            {
                this.txtpreTreat.Text = dialog.FileName;
            }
        }
        private void checkBox1_CheckedChanged(object sender, EventArgs e)
        {
            // Enable the existing-file controls only while the box is checked.
            bool useExisting = this.checkBox1.Checked;
            this.txtExistFile.ReadOnly = !useExisting;
            this.button2.Enabled = useExisting;
        }
        /// <summary>
        /// Dictionary preprocessing
        /// </summary>
        /// <param name="sender"></param>
        /// <param name="e"></param>
        private void btnPreTreat_Click(object sender, EventArgs e)
        {
            if (!string.IsNullOrEmpty(this.txtInput.Text))
            {
                if (this.txtpreTreat.Text != "")
                {
                    ChineseSegmentT
 Attributes      Size  Date       Time   Name
----------- ---------  ---------- -----  ----
       File     32768  2008-09-05 10:06  中文地名查詢\ChineseSegement\bin\Debug\ChineseSegement.exe
       File     54784  2008-09-05 10:06  中文地名查詢\ChineseSegement\bin\Debug\ChineseSegement.pdb
       File      5632  2005-12-08 14:51  中文地名查詢\ChineseSegement\bin\Debug\ChineseSegement.vshost.exe
       File     28672  2008-09-05 10:06  中文地名查詢\ChineseSegement\bin\Debug\ChineseSegmentTools.dll
       File     48640  2008-09-05 10:06  中文地名查詢\ChineseSegement\bin\Debug\ChineseSegmentTools.pdb
       File      3888  2008-08-08 09:11  中文地名查詢\ChineseSegement\ChineseSegement.csproj
       File      6949  2008-09-05 09:59  中文地名查詢\ChineseSegement\DictBuilder.cs
       File     16011  2008-08-08 10:28  中文地名查詢\ChineseSegement\DictBuilder.Designer.cs
       File      5814  2008-08-08 10:28  中文地名查詢\ChineseSegement\DictBuilder.resx
       File     14639  2008-09-05 10:05  中文地名查詢\ChineseSegement\Form1.cs
       File     10524  2008-08-08 11:28  中文地名查詢\ChineseSegement\Form1.Designer.cs
       File      5814  2008-08-08 11:28  中文地名查詢\ChineseSegement\Form1.resx
       File       443  2008-09-05 10:43  中文地名查詢\ChineseSegement\obj\ChineseSegement.csproj.FileList.txt
       File       906  2008-08-08 11:28  中文地名查詢\ChineseSegement\obj\Debug\ChineseSegement.csproj.GenerateResource.Cache
       File       180  2008-09-05 10:06  中文地名查詢\ChineseSegement\obj\Debug\ChineseSegement.DictBuilder.resources
       File     32768  2008-09-05 10:06  中文地名查詢\ChineseSegement\obj\Debug\ChineseSegement.exe
       File       180  2008-09-05 10:06  中文地名查詢\ChineseSegement\obj\Debug\ChineseSegement.Form1.resources
       File     54784  2008-09-05 10:06  中文地名查詢\ChineseSegement\obj\Debug\ChineseSegement.pdb
       File       180  2008-09-05 10:06  中文地名查詢\ChineseSegement\obj\Debug\ChineseSegement.Properties.Resources.resources
       File      8846  2008-09-05 10:06  中文地名查詢\ChineseSegement\obj\Debug\ResolveAssemblyReference.cache
       File      4608  2008-08-04 13:16  中文地名查詢\ChineseSegement\obj\Debug\TempPE\Properties.Resources.Designer.cs.dll
       File       474  2008-08-01 10:45  中文地名查詢\ChineseSegement\Program.cs
       File      1196  2008-08-01 10:45  中文地名查詢\ChineseSegement\Properties\AssemblyInfo.cs
       File      2852  2008-08-04 13:16  中文地名查詢\ChineseSegement\Properties\Resources.Designer.cs
       File      5612  2008-08-01 10:45  中文地名查詢\ChineseSegement\Properties\Resources.resx
       File      1111  2008-08-04 13:16  中文地名查詢\ChineseSegement\Properties\Settings.Designer.cs
       File       249  2008-08-01 10:45  中文地名查詢\ChineseSegement\Properties\Settings.settings
       File      1445  2008-08-01 11:06  中文地名查詢\ChineseSegement.sln
    ..A..H.     30720  2008-09-05 10:44  中文地名查詢\ChineseSegement.suo
       File     28672  2008-09-05 10:06  中文地名查詢\ChineseSegmentTools\bin\Debug\ChineseSegmentTools.dll
............ (38 more file entries omitted here)