Improving the direct marketing practices of FMCG retailers through better customer selection. An empirical study comparing the effectiveness of RFM (Recency, Frequency and Monetary), CHAID (Chi-squared Automatic Interaction Detection), stepwise logit (logistic regression) and ANN (Artificial Neural Networks) techniques using different data variable depths

Ian Di Tullio

Institution:	Cranfield University
Department:
Year:	2014
Record ID:	1410589
Full text PDF:	http://dspace.lib.cranfield.ac.uk/handle/1826/8837

Abstract

This thesis examines the effectiveness of data mining techniques in both shallow (RFM variables only) and expanded data environments. It addresses two specific gaps in research: (1) the relationship between customer selection techniques and performance and (2) the effects of using different depths of data on performance. In shallow-data contexts, stepwise logit and neural networks provided the greatest cumulative lift and outperformed both RFM and CHAID across all top deciles. However, RFM showed the second highest fit measure, illustrating its relative stability in predicting outcomes. In addition, RFM performance was tested using both one-month and 12-month time series. The 12-month series performed better and showed a greater level of fit.

The subsequent study, which compared technique effectiveness under expanded variable sets, demonstrated an even more significant and visible lift increase versus the RFM technique. For logistic regression, CHAID and neural networks, the lift and gains obtained in the first two deciles are large enough that the cumulative performance of these techniques continues to surpass RFM well past decile five and into decile six. From a cumulative perspective, the strong performance of logit and ANN allows these techniques to outperform CHAID in deciles one and two, but from decile three onwards the cumulative performance of all three advanced techniques becomes virtually identical. Although CHAID remains the technique with the best fit, the RFM fit value falls to last place once an expanded variable set is introduced. Furthermore, the performance of both logistic regression and ANN increases significantly, and although the two remain very close in overall Gini and PCC scores, logistic regression outperforms ANN when using expanded data.

In both studies, dimensionality reduction plays a role in optimising model response. In limited data sets, the logit applications reduced the data to achieve better response, whereas in extended data sets all models applied reductions. These findings contribute to the growing literature on customer selection techniques and provide a specific contribution to data mining, RM, segmentation and marketing practice by demonstrating how these techniques can be used for better consumer selection for purposes of customer development in FMCG retail.
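
The abstract compares techniques by cumulative lift per decile. As an illustration only, the following minimal Python sketch shows one common way such a figure can be computed from model scores and observed responses. The function name, its parameters and the toy data are assumptions made for this example and are not taken from the thesis, which may define lift and decile boundaries differently.

```python
from typing import List, Sequence


def cumulative_lift_by_decile(
    scores: Sequence[float],
    responded: Sequence[int],
    n_bins: int = 10,
) -> List[float]:
    """Cumulative lift for the top 1..n_bins bins of ranked customers.

    Lift at bin k = response rate among the top k bins (customers ranked
    by model score, highest first) divided by the overall response rate.
    """
    n = len(scores)
    if n != len(responded) or n < n_bins:
        raise ValueError("need equal-length inputs with at least n_bins rows")
    total_responders = sum(responded)
    if total_responders == 0:
        raise ValueError("need at least one responder to compute lift")

    # Rank customers from highest to lowest predicted score.
    pairs = sorted(zip(scores, responded), key=lambda p: p[0], reverse=True)
    ranked = [r for _, r in pairs]
    overall_rate = total_responders / n

    lifts = []
    for k in range(1, n_bins + 1):
        cutoff = (k * n) // n_bins  # customers selected up to bin k
        rate = sum(ranked[:cutoff]) / cutoff
        lifts.append(rate / overall_rate)
    return lifts


# Toy data (hypothetical): 20 customers, the 4 highest-scored ones respond.
toy_scores = list(range(20, 0, -1))
toy_responses = [1] * 4 + [0] * 16
print([round(x, 2) for x in cumulative_lift_by_decile(toy_scores, toy_responses)])
```

A cumulative lift above 1 at a given decile means that selecting customers down to that decile yields a higher response rate than selecting at random. By construction the value at the last decile is 1.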
