
Role of fibulin1 in the pathogenesis of chronic pulmonary diseases

by Salman Arif Cheema




Institution: University of Newcastle
Department:
Year: 2016
Keywords: aggregate; categorical; data analysis
Posted: 02/05/2017
Record ID: 2065329
Full text PDF: http://hdl.handle.net/1959.13/1312984


Abstract

Research Doctorate - Doctor of Philosophy (PhD)

The analysis of aggregate data from 2x2 contingency tables has a long and interesting history. Traditionally, the unknown cell frequencies (or some function of them) are estimated using ecological inference (EI). However, EI relies on assumptions that are either untestable or unrealistic. Rather than adopting strategies to estimate the unknown cells, one may instead focus on understanding the underlying association structure between the variables using the Aggregate Association Index (AAI). Given only the aggregate data, the AAI quantifies how likely it is that an association exists between two nominal dichotomous variables when a test of independence is performed at the α level of significance. The index relies on Pearson’s chi-squared statistic, expressed as a function of the conditional proportion P₁. Here, P₁ is the proportion of individuals/subjects classified into the first column category of the 2x2 table given that they are classified into the first row category. This thesis discusses and expands upon the AAI, which was proposed less than a decade ago. The generalisations and variants of the original AAI that we propose highlight the emerging growth of this index in the context of aggregate data analysis and show how the AAI overcomes many of the pitfalls that confront the analyst when performing EI. We generalise the AAI to incorporate various linear transformations of P₁ and demonstrate the invariance of the index under any such transformation; examples include the independence ratio, Pearson contingency, standardised residual and adjusted residual. We also show how the AAI is linked to one of the most common measures of association used to analyse 2x2 contingency tables: the odds ratio.
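To make the role of P₁ concrete, the following is a minimal sketch of how an index of this kind can be computed from the margins alone. It is not taken from the thesis. It assumes the AAI is read as the percentage of the area under the curve of Pearson’s statistic (viewed as a function of P₁ over its feasible range) that lies above the chi-squared critical value with one degree of freedom. The function names and the margins in the usage note are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def p1_bounds(n1_, n2_, n_1):
    """Feasible range of P1 = n11 / n1. given only the row and column totals."""
    lower = max(0.0, (n_1 - n2_) / n1_)
    upper = min(1.0, n_1 / n1_)
    return lower, upper


def chi2_of_p1(p1, n1_, n2_, n_1):
    """Pearson's chi-squared statistic of the 2x2 table implied by P1.

    Algebraically equal to n (n11 n22 - n12 n21)^2 / (n1. n2. n.1 n.2),
    rewritten so that only P1 and the margins appear.
    """
    n = n1_ + n2_
    n_2 = n - n_1
    return n**3 * n1_ * (p1 - n_1 / n) ** 2 / (n2_ * n_1 * n_2)


def aai(n1_, n2_, n_1, alpha=0.05):
    """Assumed area-based reading of the AAI, as a percentage in [0, 100]."""
    n = n1_ + n2_
    n_2 = n - n_1
    p_col = n_1 / n                          # first column marginal proportion
    k = n**3 * n1_ / (n2_ * n_1 * n_2)       # X^2(P1) = k * (P1 - p_col)^2
    # Critical value of chi-squared with 1 df, via the squared normal quantile.
    crit = NormalDist().inv_cdf(1 - alpha / 2) ** 2
    lo, hi = p1_bounds(n1_, n2_, n_1)
    half = sqrt(crit / k)                    # half-width of the region X^2 <= crit
    lo_a, hi_a = max(lo, p_col - half), min(hi, p_col + half)

    def area(a, b):
        # Closed-form integral of k * (P1 - p_col)^2 from a to b.
        return k * ((b - p_col) ** 3 - (a - p_col) ** 3) / 3

    total = area(lo, hi)
    # Area under min(X^2, crit): the curve inside [lo_a, hi_a], the flat line outside.
    below = crit * ((lo_a - lo) + (hi - hi_a)) + area(lo_a, hi_a)
    return 100 * (1 - below / total)
```

For illustrative margins such as `aai(13, 17, 12)` (row totals 13 and 17, first column total 12), the function returns a single percentage. Larger values indicate that more of the feasible range of P₁ is inconsistent with independence at the chosen α.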
The link between the AAI and the odds ratio is investigated further as we establish the theoretical relationship between the index and the extended hypergeometric distribution. In doing so, the analyst may consider any a priori association structure using a new variant of the AAI called the Extended Aggregate Association Index (EAAI). The AAI is also extended to incorporate the structure of ordered dichotomous variables. This is achieved by examining the features of ordinal log-linear models and how they may be used to analyse aggregate data. Since the underlying statistic is Pearson’s chi-squared statistic, its magnitude (and therefore the magnitude of the AAI) is strongly influenced by the size of the sample being studied. This thesis therefore examines the impact of the sample size on the AAI and proposes strategies to minimise that impact on the magnitude of the index. We also introduce the pseudo p-value so that the analyst can evaluate the relative significance of the AAI while isolating the impact of the sample size. Another new measure of association for analysing aggregate data is proposed in this thesis…

Advisors/Committee Members: University of Newcastle. Faculty of Science & Information Technology, School of Mathematical and Physical Sciences.