Content-based image retrieval and speech enhancement system using deep learning structure

by Xiangyuan Zhao

Institution:	Texas Tech University
Year:	2017
Keywords:	Deep Learning; CBIR; Speech Enhancement; Autoencoder; Convolutional Neural Network
Posted:	02/01/2018
Record ID:	2151914
Full text PDF:	http://hdl.handle.net/2346/72640

Abstract

Deep learning has recently attracted considerable attention in image processing and signal processing. It shows great potential for reducing the dimensionality of high-dimensional data while abstracting the key information in those data. This characteristic makes deep learning very powerful for content-based image retrieval (CBIR) and speech enhancement (SE), because both need high-quality, low-dimensional semantic features. By using the code layer of a deep autoencoder (DAE), a fully connected deep learning model, CBIR and SE systems can obtain decent results. For CBIR, our newly designed multiple-input multiple-task DAE (MIMT-DAE) using wavelet coefficients can achieve even better performance than the single-input single-task DAE using fewer trainable parameters. However, for image processing the fully connected structure shows limitations, so this dissertation proposes a locally connected structure, the convolutional neural network (CNN), and a hybrid structure. Working as a preprocessing stage for the autoencoder, the CNN can provide better input features than the raw images because of its locally connected weights. The hybrid structure boosts retrieval performance substantially in both grayscale and color image retrieval.

For the SE system, the fully connected DAE trained only on the mask approximation (MA) function does not deliver the desired performance. We design a multiple-task structure that adds a signal approximation (SA) function during training to reduce false positives. Training on both cost functions simultaneously gives much better performance than training only on the MA function, or even than the latest method, which is fine-tuned on the SA function. We also explored the long short-term memory (LSTM) structure and propose it as future work.

Advisors/Committee Members: Mitra, Sunanda (committee member), Pal, Ranadip (committee member), Nutter, Brian (Committee Chair).
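The abstract describes training the SE system on the MA and SA cost functions simultaneously but does not give the formulas. The following is a minimal sketch of what such a joint objective might look like, not the dissertation's implementation. It assumes both costs are mean squared errors, that the SA term compares the masked noisy magnitude against the clean magnitude, and that the two terms are mixed by a weight `alpha`. The names `mse`, `joint_loss`, and `alpha`, and the equal default weighting, are all hypothetical.

```python
from typing import Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("length mismatch")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def joint_loss(
    pred_mask: Sequence[float],
    target_mask: Sequence[float],
    noisy_mag: Sequence[float],
    clean_mag: Sequence[float],
    alpha: float = 0.5,  # assumed weighting; not stated in the abstract
) -> float:
    """Weighted sum of a mask approximation and a signal approximation cost."""
    # MA term: how far the predicted mask is from the target mask.
    ma = mse(pred_mask, target_mask)
    # SA term: apply the predicted mask to the noisy magnitudes and
    # compare the result with the clean magnitudes.
    enhanced = [m * y for m, y in zip(pred_mask, noisy_mag)]
    sa = mse(enhanced, clean_mag)
    return alpha * ma + (1.0 - alpha) * sa
```

Under these assumptions, a mask that matches its target but still leaves residual noise in the masked signal is penalized through the SA term, which is one way a joint objective could discourage the false positives mentioned in the abstract.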