Lexical analysis for Chinese‐ difficulties and possible solutions†

 

Author: Keh‐Jiann Chen

 

Journal: Journal of the Chinese Institute of Engineers (Taylor & Francis; available online 1999)
Volume/Issue: Volume 22, Issue 5

Pages: 561-571

 

ISSN:0253-3839

 

Year: 1999

 

DOI:10.1080/02533839.1999.9670494

 

Publisher: Taylor & Francis Group

 

Keywords: lexical analysis; word segmentation; unknown word identification

 

Data source: Taylor

 

Abstract:

Chinese sentences are composed of strings of characters, with no blanks to mark word boundaries. However, the basic unit of sentence processing is the word, the smallest meaningful unit that can be used freely in any natural language. Lexical analysis is therefore the first step in processing Chinese sentences. Usually a lexicon is used during lexical analysis to match words and to supply their syntactic and semantic information. Two problems arise during word matching: segmentation ambiguity and unknown words. This paper discusses the advantages and disadvantages of both statistical and rule-based methods for resolving segmentation ambiguities. For unknown word identification, it surveys off-line word extraction methods and on-line unknown word identification strategies; the two approaches complement each other in solving the problem. The strategies and knowledge sources needed to implement a practical system are also discussed.
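
The two word-matching problems named in the abstract can be illustrated with a minimal sketch. It assumes a toy four-word lexicon and uses greedy maximum matching, a common lexicon-based baseline; the abstract does not say which matching procedure the paper itself uses. The names `forward_max_match`, `backward_max_match`, `LEXICON`, and the `max_len` parameter are hypothetical and chosen only for this illustration.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Greedy left-to-right matching: take the longest lexicon word at each position."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted, so text that is not
            # in the lexicon (an unknown word) still lets the scan advance.
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


def backward_max_match(text, lexicon, max_len=4):
    """Greedy right-to-left matching: take the longest lexicon word ending at each position."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            candidate = text[j - size:j]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                j -= size
                break
    words.reverse()
    return words


# Toy lexicon (an assumption for this example, not data from the paper).
LEXICON = {"研究", "研究生", "生命", "起源"}

sentence = "研究生命起源"  # "study the origin of life"
print(forward_max_match(sentence, LEXICON))   # ['研究生', '命', '起源']
print(backward_max_match(sentence, LEXICON))  # ['研究', '生命', '起源']
```

The two scans disagree because "研究生" (graduate student) and "研究" + "生命" (study + life) are both consistent with the lexicon, which is exactly a segmentation ambiguity; a statistical or rule-based method is then needed to choose between the candidates. The stray single character "命" in the forward result also shows how unmatched material surfaces as isolated characters, which is the usual starting signal for unknown word identification.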

 



