
What replication and localisation teach us: the case of semantic similarity measures

by M.C. Postma




Institution: Universiteit Utrecht
Department:
Year: 2013
Keywords: computational linguistics, semantic similarity measures, wordnet, cornetto, lexical semantic databases
Record ID: 1243829
Full text PDF: http://dspace.library.uu.nl:8080/handle/1874/282499


Abstract

Many tasks in the field of Natural Language Processing make use of so-called semantic similarity measures, which quantify the degree to which two concepts are semantically similar. To determine which semantic similarity measure should be used for a given Natural Language Processing task, the measures are generally evaluated against human judgement. However, because human judgement is subjective, gold standards are created by asking a group of people to indicate the similarity of meaning of a set of word pairs. The correlation between these gold standards and the output of the semantic similarity measures gives a good indication of which measure correlates best with human judgement. Most research, for example Patwardhan and Pedersen (2006) and Pedersen (2010), has focused on English, using the English lexical semantic database WordNet (Miller, 1995) to compute the scores for the semantic similarity measures. The main focus of this thesis is on gaining a better understanding of the workings of semantic similarity measures by also using a different lexical semantic database in a different language, namely Cornetto (Vossen, 2006; Vossen et al., 2007, 2008) for Dutch. To this end, we first inspect the previous English experiments and try to replicate them, to be sure that we fully understand the process. Furthermore, we create a Dutch gold standard and inspect the correlations between this newly created gold standard and the output of the semantic similarity measures computed with the Dutch lexical semantic database Cornetto. For English, we will show that a group of semantic similarity measures approaches human judgement in a similar way. Moreover, we will stress the importance of addressing every detail of the process that leads to the results by showing that even if the main properties are kept stable, variations in minor properties can lead to completely different outcomes.
Furthermore, we will present our gold standard for Dutch and how it was created. In addition, we will show that not only the properties of a semantic similarity measure determine its performance, but that the structure of the lexical semantic database also plays a crucial role.
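
The evaluation step described in the abstract, correlating the output of each measure with a human gold standard, can be illustrated with a minimal sketch. The abstract does not state which correlation coefficient is used; Spearman's rank correlation is assumed here purely for illustration. The scores and the names `gold`, `measure_a` and `measure_b` are hypothetical and are not taken from the thesis.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical human ratings for four word pairs (not data from the thesis).
gold = [3.9, 3.1, 0.4, 1.2]

# Hypothetical outputs of two similarity measures for the same word pairs.
measure_a = [0.95, 0.80, 0.10, 0.30]
measure_b = [0.20, 0.90, 0.50, 0.40]

for name, scores in [("measure_a", measure_a), ("measure_b", measure_b)]:
    print(name, round(spearman(gold, scores), 3))
```

Under this assumption, the measure with the higher coefficient orders the word pairs more like the human raters do. A rank-based coefficient compares only the ordering, so measures that produce scores on different scales remain comparable.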