Interactive Data Integration and Entity Resolution for Exploratory Visual Data Analytics

by Kristi Morton

Institution: University of Washington
Year: 2016
Keywords: Data cleaning; Data integration; Tableau; Visual Data Analytics; Computer science; computer science and engineering
Posted: 02/05/2017
Record ID: 2093972
Full text PDF: http://hdl.handle.net/1773/35165


Abstract

Data has become more widely available to the public for consumption, for example, through the Web and the recent “Open Data” movement. An emerging cohort of users, called Data Enthusiasts, want to analyze this data but have limited technical or data science expertise. In response to these trends, online visual analytics systems have emerged as a popular tool for data analysis and sharing. Current visual analytics systems such as Tableau and Many Eyes enable this user cohort to perform sophisticated data analysis visually, at interactive speeds, and without any programming. Together, these two systems have been used by tens of thousands of authors to create hundreds of thousands of views, yet we know very little about how these systems are being used. The first challenge we address in this thesis, thus, is: how are popular visual analytics systems such as Tableau and Many Eyes being used for data analysis? To the best of our knowledge, this is the first study of its kind, and it presents important details about the use of online visual analytics systems. Visual analytics systems provide basic support for data integration. A simple approach for interactive data integration was implemented in Tableau in the context of this dissertation. Visual analytics systems, however, do not currently assist users with detecting or resolving potential data quality problems, including the well-known deduplication problem. Recent approaches for deduplication focus on cleaning entire datasets and commonly require hundreds to thousands of user labels. In this thesis, we address the challenge of deduplication in the context of visual data analytics with an approach that produces significantly cleaner views for small labeling budgets than state-of-the-art alternatives. The key idea behind the approach is to consider the impact that individual tuples have on a visualization and to monitor how the view changes during cleaning.
Advisors/Committee Members: Balazinska, Magdalena (advisor).
