|Institution:||University of Windsor|
|Keywords:||Online social networks; Spam links; Spammer; Star sampling; Twitter; Weibo|
|Full text PDF:||http://scholar.uwindsor.ca/etd/5282|
Fake followers in online social networks (OSNs) are accounts created to boost the rank of particular targets. These spammers can be generated by programs or by human beings, which makes them hard to identify. In this thesis, we propose a novel spammer detection method that detects near-duplicate accounts, that is, accounts that share most of their followers. Such near-duplicates are hard to discover on large social networks that provide only limited remote access. We identify the near-duplicates and the corresponding spammers by estimating the Jaccard similarity using star sampling, a combination of uniform random sampling and breadth-first crawling. We apply our method to Sina Weibo and Twitter. On Weibo, we find 395 near-duplicates, 12 million suspected spammers and 741 million spam links. On Twitter, we find 129 near-duplicates, 4.93 million suspected spammers and 2.608 billion spam links. Moreover, we cluster the near-duplicates and the corresponding spammers, and analyze the properties of each group.
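To illustrate the idea of estimating the Jaccard similarity from a sample, the following is a minimal sketch and not the thesis's implementation. It assumes that each sampled user is fetched together with the set of accounts that user follows (one "star"), and that the similarity of two target accounts is estimated as the fraction of sampled users following both targets among those following at least one. The names `jaccard`, `estimate_jaccard`, and the toy graph are hypothetical.

```python
import random


def jaccard(a, b):
    """Exact Jaccard similarity |A ∩ B| / |A ∪ B| of two follower collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def estimate_jaccard(sampled_followees, x, y):
    """Estimate the Jaccard similarity of the follower sets of x and y.

    sampled_followees: iterable of sets, one per sampled user, each holding
    the accounts that user follows (assumed output of the sampling step).
    Returns None when no sampled user follows x or y.
    """
    both = either = 0
    for followees in sampled_followees:
        follows_x = x in followees
        follows_y = y in followees
        both += follows_x and follows_y
        either += follows_x or follows_y
    return both / either if either else None


if __name__ == "__main__":
    # Toy graph (assumption): 1000 users; targets "x" and "y" share 90 followers.
    following = {u: set() for u in range(1000)}
    for u in range(100):
        following[u].add("x")
    for u in range(10, 110):
        following[u].add("y")

    rng = random.Random(0)
    sample = rng.sample(list(following), 500)  # uniform random sample of users
    estimate = estimate_jaccard((following[u] for u in sample), "x", "y")
    print("exact:", jaccard(range(100), range(10, 110)), "estimate:", estimate)
```

Under these assumptions only the sampled users' outgoing links are needed, so the full follower lists of the targets never have to be downloaded, which matters when remote access is limited. The estimate becomes noisy when few sampled users follow either target.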