In computational linguistics, word sense disambiguation (WSD) is an open problem concerned with identifying which sense of a word is used in a sentence. Word sense disambiguation using conceptual density. Simultaneous disambiguation of all words per sentence, using e. Given a word and its possible senses, as defined by a. Its applications lie in many different areas, including sentiment analysis, information retrieval (IR), machine translation, and knowledge graph construction. Using the WordNet hierarchy, we embed the construction of Abney and Light (1999) in the topic model and show that automatically learned domains improve WSD accuracy compared to alternative contexts. Random walk algorithms such as PageRank (Page et al.)
Alhelbawy and Gaizauskas (2014) successfully apply the PageRank algorithm to the named entity disambiguation (NED) task. Word sense disambiguation using semi-supervised naive Bayes with ontological constraints, Jakob Bauer, Wednesday 23rd November, 2016. Abstract. Background. Chinese word sense disambiguation with PageRank and HowNet. A comparative evaluation of word sense disambiguation.
Knowledge-based word sense disambiguation using topic. In this paper, we consider the problem of calculating fast and accurate approximations to the personalized PageRank score of a webpage. WSD is an important stage in many text-processing tasks. Graph-based word sense disambiguation of biomedical. Computing personalized PageRank quickly by exploiting. Word sense disambiguation and named-entity disambiguation. Personalized PageRank, on the knowledge base (KB) graph to rank the vertices according to the given context.
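The idea of ranking knowledge-base vertices according to a given context can be illustrated with a small sketch. It is not the implementation from any of the papers quoted here: the toy graph, the sense labels such as `cold#illness`, and the helpers `build_graph`, `ppr` and `disambiguate` are all hypothetical names chosen for illustration. The sketch assumes an undirected graph and spreads the teleport mass uniformly over the senses of the context words.

```python
def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def ppr(graph, teleport, damping=0.85, iterations=50):
    """Personalized PageRank by power iteration.

    graph: dict node -> set of neighbours (every node has at least one).
    teleport: dict node -> probability mass, values summing to 1.
    """
    scores = {node: teleport.get(node, 0.0) for node in graph}
    for _ in range(iterations):
        # Random jumps return to the teleport nodes only.
        nxt = {node: (1.0 - damping) * teleport.get(node, 0.0) for node in graph}
        # The rest of the mass follows edges, split evenly among neighbours.
        for node, neighbours in graph.items():
            share = damping * scores[node] / len(neighbours)
            for nb in neighbours:
                nxt[nb] += share
        scores = nxt
    return scores


def disambiguate(target, context_words, graph, senses):
    """Pick the sense of `target` ranked highest when the walk is
    personalized on the senses of the context words."""
    seeds = [s for w in context_words for s in senses.get(w, [])]
    if not seeds:
        raise ValueError("no context word has a sense in the graph")
    teleport = {}
    for s in seeds:
        teleport[s] = teleport.get(s, 0.0) + 1.0 / len(seeds)
    scores = ppr(graph, teleport)
    return max(senses[target], key=lambda s: scores[s])


# Hypothetical toy knowledge base: two senses of "cold" and a few neighbours.
GRAPH = build_graph([
    ("cold#illness", "virus#1"),
    ("cold#illness", "fever#1"),
    ("virus#1", "fever#1"),
    ("cold#temperature", "winter#1"),
    ("cold#temperature", "ice#1"),
    ("winter#1", "ice#1"),
    ("fever#1", "temperature#1"),
    ("temperature#1", "cold#temperature"),
])
SENSES = {
    "cold": ["cold#illness", "cold#temperature"],
    "virus": ["virus#1"],
    "fever": ["fever#1"],
    "winter": ["winter#1"],
    "ice": ["ice#1"],
}

print(disambiguate("cold", ["virus", "fever"], GRAPH, SENSES))
print(disambiguate("cold", ["winter", "ice"], GRAPH, SENSES))
```

In this toy setting the context words decide where the random jumps land, so the sense of the target word that sits closest to the context in the graph collects the most probability mass.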
Personalized Page Rank for named entity disambiguation. Once the graph is built, it can be used as a powerful tool to compute the importance of each interpretation in the graph. Semantic relatedness measures: in order to be able to apply a wide range of WSD algorithms to German, we have reimplemented the same suite of semantic relatedness algorithms for German that were pre. In natural language processing, word sense disambiguation (WSD) is the problem of determining which sense (meaning) of a word is activated by the use of the word in a particular context, a process which appears to be largely unconscious in people. New evaluation methods for word sense disambiguation. Word sense disambiguation is a basic problem in natural language processing. We focus on techniques to improve speed by limiting the amount of web graph data we need to access. PageRank on semantic networks, with application to word sense.
Word sense disambiguation (WSD), an AI-complete problem, is shown to be able to solve the essential problems of artificial intelligence, and has received increasing attention due to its promising applications in the fields of sentiment analysis, information retrieval, information extraction. In Proceedings of the 5th International Workshop on Semantic Evaluation, pages 387–391, Uppsala, Sweden. In our work we use a variant of personalized PageRank empowered with word sense frequencies, utilizing the normalized values of word sense frequencies and the LKB constituted by the semantic connections obtained from ISR-WN, Extended WordNet and word sense pair relations of SemCor. For example, the word cold can refer to the viral infection (common cold) or the sensation of cold. Building on the LU decomposition and using it as a preconditioner, we apply the GMRES method (a state-of-the-art advanced iterative method) to compute PPR for whole web graphs and social networks. Jul 18, 2016. The volume of research published in the biomedical domain has increasingly led to researchers focusing on specific areas of interest and connections between findings being missed. Word sense disambiguation is a key step for many natural language processing tasks e. In this study we developed and evaluated a knowledge-based WSD method that uses semantic similarity measures derived from the Unified Medical Language System (UMLS) and. Distributed algorithms for fully personalized PageRank on. Co-occurrence graphs for word sense disambiguation in the.
Our algorithm uses the full graph of the LKB efficiently, performing better than. In Proceedings of the 16th International Conference on Computational Linguistics, pages 16–22. In the method, a free text is first represented as a sememe graph with sememes as vertices and relatedness of sememes as weighted edges based on HowNet. Personalizing PageRank for word sense disambiguation, ACL. Background: word sense disambiguation (WSD) methods automatically assign an unambiguous concept to an ambiguous term based on context, and are important to many text-processing tasks. Word sense disambiguation and named-entity disambiguation using graph-based algorithms, Eneko Agirre, ixa2.
Word sense disambiguation (WSD) systems use the context surrounding an ambiguous term to assign it a unique unambiguous concept. Knowledge-based word sense disambiguation and similarity. Enriched Page Rank for multilingual word sense disambiguation. This paper presents an unsupervised approach to solve semantic ambiguity based on the integration of the personalized PageRank algorithm with word-sense frequency information.
Computing personalized PageRank quickly by exploiting graph. On any graph, given a starting node s whose point of view we take, personalized PageRank assigns a score to every node t of the graph. However, the storage and computation of all accurate PPR vectors can be prohibitive for large graphs, especially in caching them in memory for real-time online querying. WordNet to determine the sense of a given word by means of PageRank and personalized PageRank (PPR). In EACL 2009, 12th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference, Athens, Greece, March 30 to April 3, 2009. WSD is an important problem in natural language processing (NLP), both in its own right and as a stepping stone to more advanced tasks such as machine translation (Chan, Ng, and Chiang 2007), information extraction and retrieval. A WordNet-based algorithm for word sense disambiguation. Graph-based word sense disambiguation of biomedical documents. Word sense disambiguation (WSD) is the ability to identify the meaning of words in context in a computational manner. UKB is a collection of programs for performing graph-based word sense disambiguation (WSD) and lexical similarity/relatedness using a pre-existing knowledge base.
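The statement that personalized PageRank scores every node t from the point of view of a starting node s can be made concrete with a minimal power-iteration sketch. It is an illustration under stated assumptions, not the method of any paper quoted here: the function name `personalized_pagerank`, the damping value 0.85, the fixed iteration count and the three-node example graph are all choices made for this example. All random jumps return to s, and mass stranded on nodes without outgoing edges is also sent back to s.

```python
def personalized_pagerank(graph, source, damping=0.85, iterations=100):
    """Approximate personalized PageRank scores as seen from `source`.

    graph: dict mapping every node to a list of its out-neighbours
           (each neighbour must itself be a key of the dict).
    Returns a dict node -> score; the scores sum to 1.
    """
    scores = {node: 0.0 for node in graph}
    scores[source] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        for node, neighbours in graph.items():
            if neighbours:
                # Follow an outgoing edge chosen uniformly at random.
                share = damping * scores[node] / len(neighbours)
                for nb in neighbours:
                    nxt[nb] += share
            else:
                # Dangling node: return its walk mass to the source.
                nxt[source] += damping * scores[node]
        # Random jump: all teleport mass is concentrated on the source.
        nxt[source] += 1.0 - damping
        scores = nxt
    return scores


# Hypothetical three-node cycle: s -> a -> b -> s.
cycle = {"s": ["a"], "a": ["b"], "b": ["s"]}
for node, score in sorted(personalized_pagerank(cycle, "s").items()):
    print(node, round(score, 4))
```

Because every jump lands on s, nodes score higher the fewer steps they are from s, which is what distinguishes the personalized variant from global PageRank, where jumps are spread over all nodes.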
Personalizing PageRank for word sense disambiguation. Knowledge-based word sense disambiguation using topic models. Tibetan word sense disambiguation based on a semantic. PageRank is a way of measuring the importance of website pages. Knowledge-based word sense disambiguation and similarity using random walks, Eneko Agirre, ixa2.
WSD is considered an AI-complete problem, that is, a task whose solution is at least as hard as the most dif. The risk of suboptimal use of open source NLP software. It uses the standard WordNet graph plus disambiguated glosses as. Word sense disambiguation (WSD) is the task of mapping an ambiguous word in a given context to its correct meaning.
This paper proposes an unsupervised word sense disambiguation method based on PageRank and HowNet. The two proposed methods are (1) the word sense disambiguation method based on HowNet and Tibetan-Chinese parallel corpora, and. Random walks for knowledge-based word sense disambiguation. For example, if we concentrate all the probability mass on a unique node i, all random jumps on the walk will return to i and thus its rank will be. Knowledge-based biomedical word sense disambiguation. Eneko Agirre, Aitor Soroa, Personalizing PageRank for word sense disambiguation, Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, p. The solution to this problem impacts other computer-related writing, such as discourse, improving relevance of search engines, anaphora resolution, coherence, and inference. The human brain is quite proficient at word-sense disambiguation. Word sense disambiguation using semi-supervised naive Bayes. Word sense disambiguation (WSD) has been a basic and ongoing issue since its introduction in the natural language processing (NLP) community. We apply a direct method to the small treewidth graph to construct an LU decomposition.
The cosine distance between these vectors was used as a feature in a supervised learning process. Knowledge-based word sense disambiguation using topic models, Devendra Singh Chaplot, Ruslan Salakhutdinov. A unified evaluation framework and empirical comparison, Alessandro Raganato, Jose Camacho-Collados and Roberto Navigli. 16 UKB Agirre et al. Using the Multilingual Central Repository for graph-based word sense disambiguation. PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. Last year, a vector of weighted synset nodes was computed for each sentence found in every text and hypothesis. Word sense disambiguation (WSD) systems automatically choose the intended. Personalized PageRank over WordNet for similarity and word. In the NLP area, ambiguity is recognized as a barrier to human language understanding. The algorithm may be applied to any collection of entities with reciprocal quotations and references. Personalizing PageRank for word sense disambiguation, Eneko Agirre and Aitor Soroa, IXA NLP Group, University of the Basque Country, Donostia, Basque Country, fe. Typically WSD systems use the sentence or a small window of words around the target word as the context for disambiguation because their.
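The cosine distance between two vectors of weighted synset nodes, mentioned above as a feature, could be computed along the following lines. This is a sketch under assumptions: the vectors are represented as sparse dictionaries from synset identifier to weight, the identifiers and weights shown are made up, and cosine distance is taken to mean one minus cosine similarity, which the source does not spell out.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two sparse vectors stored as dicts key -> weight."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen here: an empty vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """Cosine distance, assumed here to be 1 minus cosine similarity."""
    return 1.0 - cosine_similarity(u, v)


# Hypothetical synset-weight vectors for a text sentence and a hypothesis.
text_vec = {"synset_a": 0.6, "synset_b": 0.3, "synset_c": 0.1}
hyp_vec = {"synset_a": 0.5, "synset_c": 0.2, "synset_d": 0.3}
print(cosine_distance(text_vec, hyp_vec))
```

A dictionary representation suits this case because each sentence touches only a small fraction of the synsets in the knowledge base, so most weights are zero and need not be stored.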
In this paper, we present a new graph-based unsupervised technique to address this problem. Humans can relatively easily disambiguate the meaning of a term from its context. Approximating personalized PageRank with minimal use of web graph data, David Gleich and Marzia Polito. Abstract. Experiments show that the best results were obtained using the combination of all vocabularies in the MRREL table of the Metathesaurus. Personalized PageRank estimation for large graphs, Peter Lofgren (Stanford), joint work with Siddhartha Banerjee (Stanford), Ashish Goel (Stanford), and C. Spreading semantic information by word sense disambiguation. Automatic sense disambiguation using machine readable dictionaries. In Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, pages 33–41. Personalized PageRank over WordNet for similarity and word sense disambiguation, Eneko Agirre, e. In the NLP community, word sense disambiguation (WSD) is the task of automatically selecting the most appropriate sense for a given word in a given context, be it a sentence or a whole document, among all the possible senses which can be associated with that word. The effect of word sense disambiguation accuracy on. Word sense disambiguation is an open problem in natural language processing which is particularly challenging and useful in the unsupervised setting, where all the words in any given text need to be disambiguated without using any labeled data.
Named entity disambiguation, entity linking, wikification. Natural language tasks such as machine translation or recommender systems are likely to be enriched by. Most methods for personalized PageRank (PPR) precompute and store all accurate PPR vectors, and at query time, return the ones of interest directly. The personalized PageRank algorithm is included in the set of experiments described in section 5. The volume of research published in the biomedical domain has increasingly led to researchers focusing on specific areas of interest and connections between findings being missed. Literature-based discovery (LBD) attempts to address this problem by searching for previously unnoticed connections between published information, also known as hidden knowledge. Random walks over WordNet using personalized PageRank have also been used.