Abstract
Microblogging social media (mainly represented by Twitter) focuses on fast, open, real-time communication using short messages between users and their followers. These platforms generate large amounts of content, and community finding techniques are an attractive alternative for organising it. However, there is no clear agreement in the literature on a definition of user community for the microblogging use case, leading to unreliable ground-truth data and evaluation. In this work, we differentiate between functional and structural definitions of communities for microblogging. A functional community groups its users by a common, independent social function, e.g. fans of the same football team, while in a structural community membership depends exclusively on the users' connectivity in a network, e.g. as scored by modularity. We build and characterise eight types of functional communities to be used as user-labelled ground-truth, and five types of live user interaction networks from Twitter. We then evaluate thirteen popular structural community definitions using five different Twitter datasets, exploring their goodness and robustness for detecting the functional ground-truth under different perturbation strategies. Our results show that definitions based on internal connectivity, e.g. Triangle Participation Ratio, Fraction Over Median Degree or Conductance, work best for the Twitter use case and are very robust. On the other hand, classic scores such as Modularity are limited and do not fit very well due to the sparsity and noise of microblogging.
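
To make the three best-performing structural scores named in the abstract more concrete, the following is a minimal Python sketch of how they are commonly defined for a candidate community S in an undirected graph. It is an illustration only, not the authors' implementation: the formulas follow the definitions widely used in the community-scoring literature (conductance as boundary edges over total edge endpoints in S, Triangle Participation Ratio as the fraction of members in at least one triangle inside S, Fraction Over Median Degree as the fraction of members whose internal degree exceeds the graph-wide median degree), and the exact variants evaluated in the paper may differ. The toy graph, the function names and the helper `build_adjacency` are all assumptions made for this example.

```python
from statistics import median


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}; self-loops are ignored."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def conductance(adj, community):
    """Boundary edges divided by (2 * internal edges + boundary edges). Lower is better."""
    s = set(community)
    # Each internal edge is seen from both endpoints, hence the division by 2.
    internal = sum(len(adj.get(u, set()) & s) for u in s) // 2
    boundary = sum(len(adj.get(u, set()) - s) for u in s)
    denom = 2 * internal + boundary
    return boundary / denom if denom else 0.0


def triangle_participation_ratio(adj, community):
    """Fraction of members that belong to at least one triangle fully inside the community."""
    s = set(community)
    in_triangle = 0
    for u in s:
        nbrs = adj.get(u, set()) & s
        # u is in a triangle if two of its internal neighbours are themselves connected.
        if any(adj[v] & nbrs for v in nbrs):
            in_triangle += 1
    return in_triangle / len(s)


def fraction_over_median_degree(adj, community):
    """Fraction of members whose internal degree exceeds the median degree of the whole graph."""
    s = set(community)
    d_m = median(len(nbrs) for nbrs in adj.values())
    over = sum(1 for u in s if len(adj.get(u, set()) & s) > d_m)
    return over / len(s)


# Hypothetical toy graph: a 4-clique {a, b, c, d} attached to a path d-e-f-g-h.
edges = [
    ("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
    ("d", "e"), ("e", "f"), ("f", "g"), ("g", "h"),
]
adj = build_adjacency(edges)
clique = {"a", "b", "c", "d"}
tail = {"e", "f", "g", "h"}

for name, members in (("clique", clique), ("tail", tail)):
    print(
        name,
        round(conductance(adj, members), 3),
        triangle_participation_ratio(adj, members),
        fraction_over_median_degree(adj, members),
    )
```

On this toy graph the densely connected clique scores well on all three measures, while the path-like tail has no internal triangles and no member above the median degree, which is the kind of contrast these internal-connectivity scores are meant to capture.
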
| Original language | English (Ireland) |
|---|---|
| Title of host publication | 2018 IEEE ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING (ASONAM) |
| Publisher | IEEE |
| Number of pages | 3 |
| Publication status | Published - 1 Jan 2018 |
Authors (Note for portal: view the doc link for the full list of authors)
- Hromic, H; Hayes, C