Abstracts Category : Other

Add abstract

Want to add your dissertation abstract to this database? It only takes a minute!

Search abstract

Search for abstracts by subject, author or institution

Share this abstract

Leveraging Lexical-Semantic Knowledge for Text Classification Tasks

by Lucie Flekova

Institution: Technische Universitt Darmstadt
Year: 2017
Posted: 02/01/2018
Record ID: 2153715
Full text PDF: http://tuprints.ulb.tu-darmstadt.de/6765/


Abstract

This dissertation is concerned with the applicability of knowledge, contained in lexical-semantic resources, to text classification tasks. Lexical-semantic resources aim at systematically encoding various types of information about the meaning of words and their relations. Text classification is the task of sorting a set of documents into categories from a predefined set, for example, spam and not spam. With the increasing amount of digitized text, as well as the increased availability of the computing power, the techniques to automate text classification have witnessed a booming interest. The early techniques classified documents using a set of rules, manually defined by experts, e.g. computational linguists. The rise of big data led to the increased popularity of distributional hypothesis - i.e., ``a meaning of word comes from its context'' - and to the criticism of lexical-semantic resources as too academic for real-world NLP applications. For long, it was assumed that the lexical-semantic knowledge will not lead to better classification results, as the meaning of every word can be directly learned from the document itself. In this thesis, we show that this assumption is not valid as a general statement and present several approaches how lexicon-based knowledge will lead to better results. Moreover, we show why these improved results can be expected.One of the first problems in natural language processing is the lexical-semantic ambiguity. In text classification tasks, the ambiguity problem has often been neglected. For example, to classify a topic of a document containing the word 'bank', we dont need to explicitly disambiguate it, if we find the word 'river' or 'finance'. However, such additional word may not be always present. Conveniently, lexical-semantic resources typically enumerate all senses of a word, letting us choose which word sense is the most plausible in our context. What if we use the knowledge-based sense disambiguation methods in addition to the information provided implicitly by the word context in the document? In this thesis, we evaluate the performance of selected resource-based word sense disambiguation algorithms on a range of document classification tasks (Chapter 3). We note that the lexicographic sense distinctions provided by the lexical-semantic resources are not always optimal for every text classification task, and propose an alternative technique for disambiguation of word meaning in its context for sentiment analysis applications.The second problem in text classification, and natural language processing in general, is the one with synonymy. The words used in training documents represent only a tiny fraction of the words in the total possible vocabulary. If we learn individual words, or senses, as features in the classification model, our system will not be able to interpret the paraphrases, where the synonymous meaning is conveyed using different expressions. How much would the classification performance improve if the system could determine that two veryAdvisors/Committee Members: Gurevych, Iryna (advisor), Stein, Benno (advisor), Daelemans, Walter (advisor).

Add abstract

Want to add your dissertation abstract to this database? It only takes a minute!

Search abstract

Search for abstracts by subject, author or institution

Share this abstract

Featured Books

Book cover thumbnail image
Electric Cooperative Managers' Strategies to Enhan...
by White, Michael Edward
   
Book cover thumbnail image
Bullied! Coping with Workplace Bullying
by Gattis, Vanessa M.
   
Book cover thumbnail image
The Filipina-South Floridian International Interne... Agency, Culture, and Paradox
by Haley, Pamela S.
   
Book cover thumbnail image
Solution or Stalemate? Peace Process in Turkey, 2009-2013
by Yurtbay, Baturay
   
Book cover thumbnail image
Performance, Managerial Skill, and Factor Exposure...
by Avci, S. Burcu
   
Book cover thumbnail image
The Deritualization of Death Toward a Practical Theology of Caregiving for the ...
by Gibson, Charles Lynn
   
Book cover thumbnail image
Emotional Intelligence and Leadership Styles Exploring the Relationship between Emotional Intel...
by Olagundoye, Eniola O.
   
Book cover thumbnail image
Commodification of Sexual Labor Contribution of Internet Communities to Prostituti...
by Young, Jeffrey R.
   
Book cover thumbnail image
The Census of Warm Debris Disks in the Solar Neigh...
by Patel, Rahul I.
   
Book cover thumbnail image
Risk Factors and Business Models Understanding the Five Forces of Entrepreneurial R...
by Miles, D. Anthony