Study on the Searching Mongolian Based on String Matching Algorithms
DOI: 10.12677/CSA.2013.36050. Supported by national science and technology funding.
Author: Ju Hua (菊花)*: School of Media, Inner Mongolia Normal University
Keywords: Mongolian; Algorithm; String; Search
Abstract: Mongolian is written in an alphabetic script. Its spelling rules take the word as the unit: words are written vertically and separated by spaces, and the phonemes within a word are joined together. Commonly used string matching algorithms include the brute-force algorithm, the Boyer-Moore algorithm, and the Horspool algorithm. Implementing search for Mongolian requires drawing on existing search techniques developed for other languages and also adapting them to the characteristics of Mongolian. This study therefore analyzes the common string matching algorithms and the grammatical features of Mongolian, improves the Horspool algorithm, and, in six steps, searches a Mongolian corpus for the relevant keywords and displays each matched keyword in the corpus in a selected (highlighted) state.
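For orientation, the following is a minimal sketch of the textbook Horspool algorithm that the abstract names as its starting point. It is not the improved six-step variant of this paper, which the abstract does not describe. The function names `build_shift_table` and `horspool_find_all` and the transliterated sample text are illustrative assumptions. Python strings are sequences of Unicode code points, so the same sketch would apply to text encoded in the Unicode Mongolian block.

```python
def build_shift_table(pattern):
    """Map each character of pattern (except the last) to its shift distance.

    Characters absent from the table shift by the full pattern length.
    """
    m = len(pattern)
    table = {}
    for i in range(m - 1):
        # Distance from the rightmost occurrence of the character
        # (excluding the last position) to the end of the pattern.
        table[pattern[i]] = m - 1 - i
    return table


def horspool_find_all(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    shift = build_shift_table(pattern)
    matches = []
    i = m - 1  # text index aligned with the last pattern character
    while i < n:
        k = 0  # number of characters matched, scanning right to left
        while k < m and pattern[m - 1 - k] == text[i - k]:
            k += 1
        if k == m:
            matches.append(i - m + 1)
        # Shift by the table entry for the text character under the
        # pattern's last position, whether or not a match occurred.
        i += shift.get(text[i], m)
    return matches


# Hypothetical transliterated sample, with words separated by spaces.
corpus = "sain baina uu sain"
print(horspool_find_all(corpus, "sain"))  # [0, 14]
```

The returned start indices, together with the keyword length, give the character ranges that a user interface could mark as selected in the corpus.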
Citation: Ju Hua. Study on the Searching Mongolian Based on String Matching Algorithms [J]. Computer Science and Application, 2013, 3(6): 288-291. http://dx.doi.org/10.12677/CSA.2013.36050