Publications
Scientific publications
Turdakov D.
Sense disambiguation of Wikipedia terms based on Hidden Markov Model
// Digital Libraries: Advanced Methods and Technologies, Digital Collections: Proceedings of the XI All-Russian Research Conference RCDL'2009. Petrozavodsk: KRC RAS, 2009. Pp. 267-275
The paper presents a method for word sense disambiguation that uses external knowledge extracted from the open encyclopedia Wikipedia. We analyse the drawbacks of existing word sense disambiguation algorithms and propose our own algorithm, based on a Hidden Markov Model, to overcome them. The HMM parameters are estimated from empirical probabilities derived from the Wikipedia dictionary and link structure. A heuristic for speeding up the algorithm is proposed, and the algorithm is evaluated on several test collections.
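To illustrate the general idea, the following is a minimal sketch of HMM-based sense selection with Viterbi decoding. It is not the algorithm from the paper: it assumes a first-order model in which hidden states are candidate Wikipedia articles (senses) and observations are the terms in the text. The function name `viterbi`, the table layout, and all probabilities are hypothetical toy values. In a real setting they would be estimated from the Wikipedia dictionary and link structure, as the abstract describes.

```python
import math


def viterbi(terms, candidates, start_p, trans_p, emit_p, floor=1e-9):
    """Return the most probable sense sequence for `terms`.

    candidates: term -> list of candidate senses (hypothetical dictionary)
    start_p:    sense -> prior probability
    trans_p:    (previous sense, sense) -> transition probability
    emit_p:     (sense, term) -> probability of the term given the sense
    Missing entries are smoothed to `floor` so that log() is defined.
    """
    if not terms:
        return []

    def lp(p):
        return math.log(max(p, floor))

    # best[i][s] = (log-probability of the best path ending in s, previous sense)
    best = [{}]
    for s in candidates[terms[0]]:
        score = lp(start_p.get(s, 0.0)) + lp(emit_p.get((s, terms[0]), 0.0))
        best[0][s] = (score, None)

    for i in range(1, len(terms)):
        best.append({})
        for s in candidates[terms[i]]:
            score, prev = max(
                (best[i - 1][p][0] + lp(trans_p.get((p, s), 0.0)), p)
                for p in best[i - 1]
            )
            best[i][s] = (score + lp(emit_p.get((s, terms[i]), 0.0)), prev)

    # Backtrack from the best final sense.
    last = max(best[-1], key=lambda s: best[-1][s][0])
    path = [last]
    for i in range(len(terms) - 1, 0, -1):
        path.append(best[i][path[-1]][1])
    path.reverse()
    return path


# Toy, made-up tables (assumptions for illustration only).
candidates = {
    "jaguar": ["Jaguar (animal)", "Jaguar Cars"],
    "engine": ["Engine"],
}
start_p = {"Jaguar (animal)": 0.6, "Jaguar Cars": 0.4}
emit_p = {
    ("Jaguar (animal)", "jaguar"): 0.9,
    ("Jaguar Cars", "jaguar"): 0.7,
    ("Engine", "engine"): 1.0,
}
trans_p = {
    ("Jaguar Cars", "Engine"): 0.3,
    ("Jaguar (animal)", "Engine"): 0.001,
}

senses = viterbi(["jaguar", "engine"], candidates, start_p, trans_p, emit_p)
print(senses)
```

With these toy numbers, the context term "engine" outweighs the higher prior of the animal sense, so "jaguar" resolves to the car maker. Exact Viterbi decoding is quadratic in the number of candidate senses per position, which is one reason a speed-up heuristic such as the one mentioned in the abstract can matter on real text.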
Last modified: October 16, 2009