Posted on

I published an update to our dataset of crowdsourced annotations on controversy aspects, as part of the ControCurator project.

Experimental Setup

We evaluated the controversy aspects through a crowdsourcing experiment using the CrowdFlower platform. The collected annotations from this experiment were evaluated using the CrowdTruth methodology for measuring the quality of the annotations, the annotators, and the annotated articles. The relevance of each of the aspects was collected by asking the annotators whether they applied to the main topic of a given newspaper article. For this, we used a collection of 5 048 The Guardian articles that were retrieved through the Guardian news API. In order to save cost and focus on the main topic of an article only the first two paragraphs of each article were used. In an initial pilot we used 100 articles to test the use of a five point likert-scale answers versus “yes/no/I don’t know” type answers, and additionally whether showing five comments would help annotators identify whether the topic in an article is controversial. In a second pilot we evaluated with the same dataset whether rephrasing of the aspects and adding the time-persistence would make the identification more clear.


The results of the first pilot showed that for both settings when showing the article comments the number of annotators that select “I don’t know” option is significantly smaller (p-value = 0.003). Additionally, we found that the “yes/no/I don’t know” setup always finished faster. Although this difference is not significant (p-value = 0.0519), it may indicate that annotators were more willing to perform this task. Based on this we conclude that the variant with comments and yes-no answers gave the best performance in terms of speed and annotation quality. The results of the second pilot showed the rephrasing of the questions improved the identification as the number of people that selected the “I don’t know” option dropped from 15% to 3% with p=0.0001.

In the main experiment 5048 articles were annotated by 1 659 annotators resulting in 31 888 annotations. The evaluation of the controversy aspects was a two-fold: first the Pearson correlation coefficients were measured in order to identify how strong an aspect correlated with controversy in each judgment. Second, linear regression was applied to learn the regression coefficient between all of the aspects combined and the controversy score for a judgment. This value indicates the weight of an aspect with respect to the other aspects. The emotion aspect of an article was found to be the strongest indicator for controversy using both measures, while the multitude of actors was the weakest. The openness was said to be present most in 70.9% of the annotations, was annotated with a majority in 73% of the articles, and was found to be the most clearly represented aspect.


This dataset is built and used for the following papers. Please cite them if you decide to use our work.

Benjamin Timmermans, Lora Aroyo, Tobias Kuhn, Kaspar Beelen, Evangelos Kanoulas, Bob van de Velde, Gerben van Eerten: ControCurator: Understanding Controversy Using Collective Intelligence. Collective Intelligence Conference 2017

  title={ControCurator: Human-Machine Framework For Identifying Controversy},
  author={Timmermans, Benjamin and Beelen, Kaspar and Aroyo, Lora and Kanoulas, Evangelos and Kuhn, Tobias and van de Velde, Bob and van Eerten, Gerben},
  journal={Collective Intelligence Conference},

Benjamin Timmermans, Kaspar Beelen, Lora Aroyo, Evangelos Kanoulas, Tobias Kuhn, Bob van de Velde, Gerben van Eerten: ControCurator: Human-Machine Framework For Identifying Controversy. ICT Open 2017

  title={ControCurator: Understanding Controversy Using Collective Intelligence},
  author={Timmermans, B and Aroyo, L and Kuhn, T and Beelen, K and Kanoulas, E and van de Velde, B},
  journal={ICT Open},
Posted on

On June 15-16 the Collective Intelligence conference took place at New York University. The CrowdTruth team was present with Lora Aroyo, Chris Welty and Benjamin Timmermans. Together with Anca Dumitrache and Oana Inel we published a total of six papers at the conference.


The first keynote was presented by Geoff Mulgan, CEO of NESTA. He set the context of the conference by stating that there is a problem with technological development, namely that it only takes knowledge out of society and does not put it back in. Also, he made it clear that many of the tools we see today like Google Maps are actually nothing more than companies that were bought and merged together. This combination of things is what creates the power. He also defined what the biggest trends are in collective intelligence: the observation e.g. citizen generated data on floods, predictive models e.g. fighting fires with data, memory e.g. what works centers on crime reduction, and judgement e.g. adaptive learning tool for schools. Though, there are a few issues with collective intelligence: Who pays for all of this? What skills are needed for CI? What are the design principles of CI? What are the centers of expertise? These are all not yet clear. However, what is clear is that there is a new field emerging through combining AI with CI: Intelligence Design. We used to think systems resolve this intelligence, but actually we need to steer and design it.

In a plenary session there was an interesting talk on public innovation by Thomas Kalil. He defined the value of concreteness as things that happen when particular people or organisations take some action in pursuit of a goal. These actions are more likely to affect change if you can articulate who would needs to do what. He said he would like to identify the current barriers to prediction markets and areas where governments could be a user and funder of collective intelligence. This can be achieved through connecting people that are working to solve similar problems locally, e.g. in local education. Then change can be driven realistically, by making clear who needs to do what. Though, it was noted also that people need to be willing and able for change to work.

Parallel Sessions

There were several interesting talks during the parallel sessions. Thomas Malone spoke about using contest webs to address the problem of global climate change. He claims that funding science can be both straightforward and challenging, for instance government policy does not always correctly address the need of a domain issues, and even conflicts of interest may exist. Also, fundamental research can be tough to convince the general public of its use, as it is not sexy. Digital entrepreneurship is furthermore something that is often overlooked. There are hard problems, and there are new ways of solving them. It is essential now to split the problems up into parts, solve each of them with AI, and combine them back together.

Chris Welty presented our work on Crowdsourcing Ambiguity Aware Ground Truth at Collective Intelligence 2017.

Also Mark Whiting presented his work on Daemo, a new crowdsourcing platform that has a self-governing marketplace. He stress the fact that crowdsourcing platforms are notoriously disconnected from user interests. His new platform has a user driven design, in order to get rid of the flaws that exist in for instance Amazon Mechanical Turk.

Plenary Talks

Daniel Weld from the University of Washington presented his work on argumentation support in crowdsourcing. Their work uses argumentation support in crowd tasks to allow workers to reconsider their answers based on the argumentation of others. They found this to significantly increase the annotation quality of the crowd. He also claimed that humans will always need to stay in the loop of machine intelligence, for instance to define what the crowd should work on. Through this, hybrid human-machine systems are predicted to become very powerful.

Hila Lifshitz-Assaf of NYU Stern School of Business gave an interesting talk on changing innovation processes. The process of innovation has changed from a lane inventor, to labs, to collaborative networks, and now into open innovation platforms. The main issue with this is that the best practices of innovation fail in the new environment. In standard research and development there is a clearly defined and selectively permeable, whereas with open innovation platforms this is not the case. Experts can participate from in and outside the organisation. It is like open innovation: managing undefined and constantly changing knowledge in which anyone can participate. For this to work, you have to change from being a problem solve to a solution seeker. It is a shift from thinking: The lab is my world, to the world is my lab. Still, problem formulation is key as you need to define the problems in ways that cross boundaries. The question always remains, what is really the problem?

Poster Sessions

In the poster sessions there were several interesting works presented, for instance work on real-time synchronous crowdsourcing using “human swarms” by Louis Rosenberg. Their work allows people to change their answers through the influence of the rest of the swarm of people. Another interesting poster was by Jie Ren of Fordham University, who presented a method for comparing the divergent thinking and creative performance of crowds compared to experts. We ourselves had a total of five posters covering both poster sessions, which were received well by the audience.

Posted on

On 9th of June we are organising a Coffee & Data event with the Amsterdam Data Science community. The topic is “How to deal with controversy, bias, quality and opinions on the Web” and will be organised in the context of the COMMIT ControCurator project. In this project VU and UvA computer scientists and humanities researchers investigate jointly the computational modeling of controversial issues on the Web, and explore its application within real use cases in existing organisational pipelines, e.g. Crowdynews and Netherlands Institute for Sound and Vision.

The Agenda is as follows:

09:00-09:10 Coffee

Introduction & Chair by Lora Aroyo, Full Professor at the Web & Media group (VU, Computer Science)

09:10 – 09:25: Gerben van Eerten – Crowdynews deploying ControCurator

09:25 – 09:40: Kaspar Beelen – Detecting Controversies in Online News Media (UvA, Faculty of Humanities)

09:40 – 09:50: Benjamin Timmermans – Understanding Controversy Using Collective Intelligence (VU, Computer Science)

09:50 – 10:00: Davide Ceolin – (VU, Computer Science)

10:00 – 10:15: Damian Trilling – (UvA, Faculty of Social and Behavioural Sciences)

10:15 – 10:30: Daan Oodijk (Blendle)

10:30 – 10:45: Andy Tanenbaum – “Unskewed polls” in 2012

10:45 – 11:00: Q&A Coffee

The event takes place at the Kerkzaal (HG-16A00) on the top floor of the VU Amsterdam main building.

Posted on

Last week I gave another talk in our Weekly AI meeting on the topic of ControCurator. This is a project that I am currently working on, which has the goal to enable the discovery and understanding of controversial issues and events by combining human-machine active learning workflows.

In this talk I went into the different aspects of controversies, which we have identified in this project. You can view the slides here:

Posted on

Our ControCurator paper abstract titled “ControCurator: Understanding Controversy Using Collective Intelligence” has been accepted at Collective Intelligence 2017. In this paper we describe the aspects of controversy: the time-persistence, emotion, multiple actors, polarity and openness. Using crowdsourcing, the ControCurator dataset of 31888 controversy annotations was obtained for the relevance of these aspects to 5048 Guardian articles. The results indicate that each of these aspects is a positive indicator of controversy, but also that there is a clear difference in their signal strength. Most notably, the emotion was found to be the highest indicator. Though, all the measured controversy aspects were found to positively correlate with controversy. These results suggest that the controversy model is accurate and useful for modeling controversy in news articles.

The full dataset with controversy annotations is available for download at

Posted on

Our demo of ControCurator titled “ControCurator: Human-Machine Framework For Identifying Controversy” will be shown at ICT Open 2017. In this demo the ControCurator human-machine framework for identifying controversy in multimodal data is shown. The goal of ControCurator is to enable modern information access systems to discover and understand controversial topics and events by bringing together crowds and machines in a joint active learning workflow for the creation of adequate training data. This active learning workflow allows a user to identify and understand controversy in ongoing issues, regardless of whether there is existing knowledge on the topic.

Posted on

Today I gave a talk in our Weekly AI meeting on the topic of ControCurator. This is a project that I am currently working on, which has the goal to enable the discovery and understanding of controversial issues and events by combining human-machine active learning workflows.

In the talk I explained the issue of defining the space of a controversy, and how this relates to for instance wicked problems. You can see the slides below.

Posted on

The aim of the ControCurator project is for modern information access systems to discover and understand controversial topics and events. This is done by 1) bringing together different types of crowds: niches of experts, lay crowds and engaged social media contributors; and 2) using machines in a joint active learning workflow for real-time and offline creation of adequate training data. The ControCurator system will consist of two end-user applications: a Controversy Barometer for identifying controversial claims in medical forums, and an Event Blender for summarization of high-profile and catastrophic events in broadcast news & social media. Both systems use the ControCurator platform for curating the data.

The ControCurator projects extends and validates the work on the Accurator (a SealincMedia project) by adding (non-expert) crowdsourcing annotations from CrowdTruth. Additionally, event interpretations will be added that are derived from the analysis and mining of user-generated data on social media through Crowdynews. These event interpretations will also be extended with event interpretations derived from analysis and mining of broadcast news through MediaNow. This will allow to expand even further the range of interpretations on specific topics and events with the media perspective, i.e. how news media manages catastrophic events, detecting controversial topics and events. Finally, these perspectives and interpretations will all be combined in joint temporal summarization of controversial streaming broadcast news events, to enable for user feedback to be incorporated in search and access algorithms.