Summary of the project

One of the primary concerns of the news media industry is how to manage the comments that readers post on news articles. Most online news publishers provide content in a form that allows readers not only to access it but to post their own comments: for readers, this is valuable in allowing them to express their opinions and interact with each other; for the publishers, it is valuable in that it provides a way to understand their audience and increase reader engagement. However, the ability to comment is often misused, with comments used to advertise, abuse others, spread misinformation and post illegal content. In many countries, publishers are legally accountable for the content that is posted. Publishers therefore usually employ some form of moderation: human moderators will scan the comments posted, and apply some moderation policy to block those that should not appear, and in severe cases perhaps ban the users from posting again.

This job is not easy. Decisions can be subjective and hard to make consistently, comments that need blocking are easy to miss, and it can be difficult to keep up when comment volumes are high (peak volumes of many thousands of comments per hour are not unusual during events of note). There has therefore been great interest in recent years in AI tools to assist moderators: tools that analyse the content of comments using natural language processing (NLP) methods and help flag those which should or should not be blocked, helping to speed up the moderators’ work and produce consistent results. Recent research reports impressive accuracy for such tools.

However, transferring these AI methods from research to practical industry use is not straightforward. Tools must usually be trained on large volumes of data labelled with the correct expected output decisions: this data must be in the domain, style and language that will be seen in use, so must generally be produced from scratch for any new publisher, newspaper or topic. This process is expensive and needs expertise in NLP and AI methods.

This project seeks to develop new methods to bypass this problem and make the initial implementation process easy and fast. We will develop methods for semi-automatic annotation of data, including new variants of active learning in which the AI tools themselves quickly select the data they most need to have labelled. We will build on recent progress in topic-dependent comment filtering to build tools that can take the context of the associated news article into account, reducing the amount of new data needed. Finally, we will use recent progress in transfer learning to allow tools to be initialised from existing labelled data in other domains and languages, further reducing the amount of new data required.
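
As an illustration of the active-learning idea only, the sketch below shows one standard selection strategy, uncertainty sampling: from a pool of unlabelled comments, the tool asks a human to label those for which its current prediction is closest to a coin flip. This is not the project's own method, which is described above as new variants of active learning. The function name `select_for_labelling`, the `block_probability` scorer and the toy scores are all hypothetical and stand in for a trained classifier.

```python
from typing import Callable, List, Sequence


def select_for_labelling(
    comments: Sequence[str],
    block_probability: Callable[[str], float],
    budget: int,
) -> List[str]:
    """Return the `budget` comments the model is least certain about.

    Uncertainty is measured as how close the predicted probability of
    "block" is to 0.5 (uncertainty sampling for a binary decision).
    """
    ranked = sorted(comments, key=lambda c: abs(block_probability(c) - 0.5))
    return ranked[:budget]


# Hypothetical stand-in for a trained classifier: these scores are made up.
toy_scores = {
    "Great article, thanks!": 0.03,
    "Buy cheap watches at my site": 0.97,
    "People like you should not be allowed to vote": 0.55,
    "This reporter has no idea what they are talking about": 0.40,
}

# Ask a moderator to label only the two most ambiguous comments.
to_annotate = select_for_labelling(
    list(toy_scores), lambda c: toy_scores[c], budget=2
)
print(to_annotate)
```

In this toy pool the clearly acceptable and clearly unacceptable comments are skipped, and the two borderline ones are sent for labelling. In a real loop the classifier would be retrained on the new labels and the selection repeated.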

The result will be a suite of tools to enable easy, fast, practical implementation of accurate, robust comment filtering methods for use in the news media industry.
