Abstracts Category : Other

Add abstract

Want to add your dissertation abstract to this database? It only takes a minute!

Search abstract

Search for abstracts by subject, author or institution

Share this abstract

Identifying and Resolving Entities in Text

by Gregory Christopher Durrett

Institution: University of California, Berkeley
Year: 2017
Keywords: Computer science
Posted: 02/01/2018
Record ID: 2162062
Full text PDF: http://pqdtopen.proquest.com/#viewpdf?dispub=10192292


Abstract

When automated systems attempt to deal with unstructured text, a key subproblem is identifying the relevant actors in that textanswering the "who" of the narrative being presented. This thesis is concerned with developing tools to solve this NLP subproblem, which we call entity analysis. We focus on two tasks in particular: first, coreference resolution, which consists of within-document identification of entities, and second, entity linking, which involves identifying each of those entities with an entry in a knowledge base like Wikipedia. One of the challenges of coreference is that it requires dealing with many different linguistic phenomenon: constraints in reference resolution arise from syntax, semantics, discourse, and pragmatics. This diversity of effects to handle makes it difficult to build effective learning-based coreference resolution systems rather than relying on handcrafted features. We show that a set of simple features inspecting surface lexical properties of a document is sufficient to capture a range of these effects, and that these can power an efficient, high-performing coreference system. Our analysis of our base coreference system shows that some examples can only be resolved successfully by exploiting world knowledge or deeper knowledge of semantics. Therefore, we turn to the task of entity linking and tackle it not in isolation, but instead jointly with coreference. By doing so, our coreference module can draw upon knowledge from a resource like Wikipedia, and our entity linking module can draw on information from multiple mentions of the entity we are attempting to resolve. Our joint model of these tasks, which additionally models semantic types of entities, gives strong performance across the board and shows that effectively exploiting these interactions is a natural way to build better NLP systems. Having developed these tools, we show that they can be useful for a downstream NLP task, namely automatic summarization. We develop an extractive and compressive automatic summarization system, and argue that one deficiency it has is its inability to use pronouns coherently in generated summaries, as we may have deleted content that contained a pronoun's antecedent. Our entity analysis machinery allows us to place constraints on summarization that guarantee pronoun interpretability: each pronoun must have a valid antecedent included in the summary or it must be expanded into a reference that makes sense in isolation. We see improvements in our system's ability to produce summaries with coherent pronouns, which suggests that deeper integration of various parts of the NLP stack promises to yield better systems for text understanding.

Add abstract

Want to add your dissertation abstract to this database? It only takes a minute!

Search abstract

Search for abstracts by subject, author or institution

Share this abstract

Featured Books

Book cover thumbnail image
Electric Cooperative Managers' Strategies to Enhan...
by White, Michael Edward
   
Book cover thumbnail image
Bullied! Coping with Workplace Bullying
by Gattis, Vanessa M.
   
Book cover thumbnail image
The Filipina-South Floridian International Interne... Agency, Culture, and Paradox
by Haley, Pamela S.
   
Book cover thumbnail image
Solution or Stalemate? Peace Process in Turkey, 2009-2013
by Yurtbay, Baturay
   
Book cover thumbnail image
Performance, Managerial Skill, and Factor Exposure...
by Avci, S. Burcu
   
Book cover thumbnail image
The Deritualization of Death Toward a Practical Theology of Caregiving for the ...
by Gibson, Charles Lynn
   
Book cover thumbnail image
Emotional Intelligence and Leadership Styles Exploring the Relationship between Emotional Intel...
by Olagundoye, Eniola O.
   
Book cover thumbnail image
Commodification of Sexual Labor Contribution of Internet Communities to Prostituti...
by Young, Jeffrey R.
   
Book cover thumbnail image
The Census of Warm Debris Disks in the Solar Neigh...
by Patel, Rahul I.
   
Book cover thumbnail image
Risk Factors and Business Models Understanding the Five Forces of Entrepreneurial R...
by Miles, D. Anthony