Large Scale Crowdsourcing and Characterization of Twitter Abusive Behavior.
In recent years, offensive, abusive and hateful language, sexism, racism and other types of aggressive and cyberbullying behavior have been manifesting with increased frequency on many online social media platforms. In fact, past scientific work has focused on studying these forms in popular media, such as Facebook and Twitter.
Building on such work, we present an 8-month study of the various forms of abusive behavior on Twitter, in a holistic fashion. Departing from past work, we examine a wide variety of labeling schemes, which cover different forms of abusive behavior, at the same time. We propose an incremental and iterative methodology that utilizes the power of crowdsourcing to annotate a large-scale collection of tweets with a set of abuse-related labels. By applying our methodology, including statistical analysis for label merging or elimination, we identify a reduced but robust set of labels. Finally, we offer a first overview and findings of our collected and annotated dataset of 100 thousand tweets, which we make publicly available for further scientific exploration.