A schema conversion approach for constructing heterogeneous information networks from documents

by Hyung Sul Kim

Institution: University of Illinois Urbana-Champaign
Year: 2017
Keywords: Information network construction
Posted: 2/1/2018 12:00:00 AM
Record ID: 2154271
Full text PDF: http://hdl.handle.net/2142/97387


Information networks with multi-typed nodes and edges with different semantics are called heterogeneous information networks. Because their multi-typed nodes and edges embed more complex information than homogeneous information networks do, mining such networks has produced richer knowledge and insights. To extend the application of heterogeneous information network analysis to document analysis, it is necessary to build information networks from a collection of documents while preserving important information in the documents. This thesis describes a schema conversion approach that applies data mining techniques to the outcomes of natural language processing (NLP) tools to construct heterogeneous information networks. First, we use named entity recognition (NER) tools to explore networks over entities, topics, and words, demonstrating how a probabilistic model can convert the data schema of the NER tools. Second, we present a pattern mining method that constructs a network of authors, documents, and writing styles by extracting discriminative writing styles from parse trees and converting them into nodes in a network. Third, we introduce a clustering method to merge redundant nodes in an information network of documents, claims, subjects, objects, and verbs. We use a semantic role labeling (SRL) tool to obtain initial network structures from news articles and merge duplicate nodes using a similarity measure, SynRank. Finally, we present a novel event mining framework for extracting high-quality structured event knowledge from large, redundant, and noisy news data. The proposed framework, ProxiModel, uses named entity recognition, time expression extraction, and phrase mining tools to extract event information from documents.

Advisors/Committee Members: Han, Jiawei (Committee Chair), Hockenmaier, Julia (committee member), Zhai, ChengXiang (committee member), Dmitriev, Pavel (committee member).
