|Institution:||Universidade de Lisboa|
|Keywords:||Cheminformatics; Machine learning; QSAR/QSPR models; Neuroreceptors; Binding affinity; Master's theses - 2016; Departamento de Informática|
|Full text PDF:||http://www.rcaap.pt/detail.jsp?id=oai:repositorio.ul.pt:10451/23791|
Master's thesis, Bioinformatics and Computational Biology (Bioinformatics), Universidade de Lisboa, Faculdade de Ciências, 2016
Only some compounds (e.g. ligands) act as neurotransmitters in the brain, binding to specific neuroreceptors. Understanding why a ligand binds to a particular target in the brain can help design more effective drugs. With the help of data-mining techniques, quantitative structure-activity/property relationship (QSAR/QSPR) models and machine learning methods, a supervised model can be built to predict binding affinities for any molecule, provided sufficient experimental data are available. Models that predict binding affinities for specific neuroreceptors were built using three machine learning methods (Random Forests, Support Vector Machines (SVM) and the Least Absolute Shrinkage and Selection Operator) and two sets of molecular descriptors from different chemical toolboxes (Open Babel and CDK). Experimental data were collected to create the database and curated by removing inconsistencies and duplicates. The final dataset had 43901 binding affinity values for 53 human neuroreceptors. In the model building phase, 75% of the dataset was used for training and 25% for validation. The modelling consisted of choosing the most important variables (descriptors) for each neuroreceptor and validating the models using statistical measures. Random Forests and SVM were the best methods: Random Forests was used to select the most important variables and SVM to obtain the statistical measures. The root mean squared error (RMSE) was below 0.214, more than half of the receptors had a percentage of variance explained (PVE) above 50%, and Pearson's correlation coefficient was above 0.50, confirming the model had a good fit. Small datasets (fewer than 112 entries) led to poor results for some models.
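The validation step described above (a 75%/25% split followed by RMSE, PVE and Pearson's correlation) can be illustrated with a minimal sketch. It is not the thesis code: the function names are hypothetical, the observed and predicted values are made up for illustration, and the PVE formula, 100 × (1 − SSE/SST), is an assumed definition that the abstract does not spell out.

```python
import math
import random


def train_validation_split(rows, train_fraction=0.75, seed=0):
    """Shuffle rows and split them into training and validation parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pve(observed, predicted):
    """Percentage of variance explained (assumed: 100 * (1 - SSE / SST))."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 100.0 * (1.0 - sse / sst)


def pearson(observed, predicted):
    """Pearson's correlation coefficient between two equal-length lists."""
    mean_o = sum(observed) / len(observed)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Illustrative values only; these are not data from the thesis.
observed = [0.2, 0.4, 0.6, 0.8]
predicted = [0.25, 0.35, 0.65, 0.75]
print(rmse(observed, predicted), pve(observed, predicted), pearson(observed, predicted))

# A 75%/25% split of 100 hypothetical records.
train, validation = train_validation_split(range(100))
print(len(train), len(validation))
```

In this sketch a model would be fitted on `train` only and the three measures computed on `validation`, so that similar RMSE values on both parts suggest the model is not overfitting.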
RMSE values from the validation and modelling phases were similar for the best models, indicating a good fit; these models can therefore predict the strength of binding between neuroreceptor and neurotransmitter. The RMSE values for the best models were between 0.087 and 0.201, with PVE above 50% and correlation above 0.50. Some molecular descriptors were selected frequently: 46 descriptors appeared in more than 20 neuroreceptors, but only 6 descriptors appeared in all neuroreceptors. The same descriptors are used to identify the same family of neuroreceptors.
It is important to understand the criteria that determine the binding between a molecule and a specific receptor, particularly in the brain, where only some compounds act as neurotransmitters and bind to specific neuroreceptors. Neurotransmitters depend on their structure to bind to neuroreceptors. This binding can be measured through binding affinity values. With the help of data-mining techniques, machine learning methods and quantitative structure-property/activity relationships (QSAR/QSPR), it is possible to build a model that can predict these binding affinity values, provided that we have…
Advisors/Committee Members: Falcão, André Osório e Cruz de Azerêdo, 1969-.