Exploring real-world AI applications in Media: A look at seven use cases

AI4Media’s industry partners have defined seven use cases, informed by emerging market opportunities and urgent industry challenges, each raising specific requirements and research questions. The use cases highlight how AI applies throughout the media industry value chain, from research and content creation to production, distribution, consumption/interaction, and performance and quality measurement. These industry cases play a key role in exploiting and sustaining the results of AI4Media’s research activities. Have a look at them:

  • AI for Social Media and Against Disinformation 

This use case from Deutsche Welle (DW) and Athens Technology Center (ATC) leverages AI technologies to improve support tools used by journalists and fact-checking experts for digital content verification and disinformation detection. While partner DW provides journalistic and media-focused requirements, ATC is responsible for AI component integration and the operation of the demonstrators, Truly Media – a web-based platform for collaborative verification – and TruthNest – a Twitter analytics and bot detection tool. Two main topics are covered within the use case: 1) verification of content from social media with a focus on synthetic media detection, and 2) detection of communication narratives and patterns related to disinformation. The key motivation behind this work is to demonstrate how advanced AI support functions can enable news journalists to keep up with rapid new developments in the area of manipulated social media content, synthetic media, and disinformation.

To that end, related AI technologies integrated into the use case demonstrators support journalists in detecting manipulated and synthetically generated images, videos, and audio, as well as in detecting bot-generated tweets and managing their content through media summarization technologies. These AI-based tools are being developed by some of the largest research centres in Europe, such as the Centre for Research and Technology Hellas (CERTH), Fraunhofer, and CEA. We are also experimenting with edge AI applications for journalism, exploring how the latest advances in the area can be leveraged to perform critical media processing tasks on-device, such as deepfake detection, face anonymization, or NLP-based text analysis and question answering. This capability is considered valuable in the context of counteracting disinformation, especially when the media content of interest is confidential or sensitive, or when the surrounding context does not allow it to be shared over public communication networks (e.g. areas without high-bandwidth connectivity or under strict monitoring by local authorities).

Another key aspect is the exploration of Trustworthy AI in relation to these topics and the specific needs of media organisations. Our goals are to explore and demonstrate how an AI component from a third-party provider can be enhanced in terms of transparency and robustness, to develop related AI transparency information documents for different target groups within a media organisation, and to make such transparency information available in the user interface of the demonstrators.

If you wish to know more about this work, please contact Danae Tsabouraki at d.tsabouraki@atc.gr

  • AI for News: The Smart News Assistant 

Journalists face a challenging environment where the amount of incoming content is ever increasing, while the need to publish news as fast as possible is extremely pressing. At the same time, journalists need to ensure the published content is both relevant to their audience and a trustworthy source of information, avoiding errors and misinformation. This use case from the Flemish Public Broadcaster (VRT) focuses on interweaving smart (AI-powered) tools into day-to-day journalistic workflows in modern newsrooms, aiming to optimise repetitive tasks and create opportunities for new story formats supported by these tools. VRT is creating a Smart News Assistant, i.e., a multi-functional and AI-driven toolbox that will support journalists in monitoring, fact-checking, and creating engaging news formats.

Current work has focused on investigating the workflow of a journalist and how to customise it with AI. We enhanced the Image Verification Tool from CERTH by creating a new user interface that provides step-by-step guidance through the image verification process. We also developed a new prototype called Video Curator that matches incoming audiovisual content with news-related text written by journalists, in order to suggest suitable video output. Work is underway on a further prototype that will help journalists better understand and use data in their stories.
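To make the matching idea concrete, here is a deliberately simplified, hypothetical sketch: it ranks candidate clips by keyword overlap between the journalist’s text and clip tags. The actual Video Curator prototype matches audiovisual content with text using much richer signals; all names and data structures below are illustrative assumptions.

```python
def suggest_videos(article_text, videos, k=2):
    """Rank candidate clips by keyword overlap with the article text and
    return the ids of the top-k suggestions (a crude stand-in for
    multimodal text-video matching)."""
    words = set(article_text.lower().split())
    scored = sorted(videos,
                    key=lambda v: len(words & set(v["tags"])),
                    reverse=True)
    return [v["id"] for v in scored[:k]]
```

In a real newsroom pipeline the tags would come from automatic video annotation rather than being hand-assigned, and the overlap score would be replaced by learned embeddings.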

  • AI for Vision: Video Production and Content Automation 

Content preservation, high-quality production and process automation are at the core of the current transformation of Public Service Media (PSM) from its traditional business to the modern digital era. Emerging AI-based technologies can support PSMs in this transition by providing capabilities to simplify and accelerate content production processes and to enhance existing content, such as broadcasters’ archives. 

The use case defined by Rai, the Italian public broadcaster, focuses on three main tasks usually accomplished during everyday operations, namely content access, content production and content monitoring. Content access includes tools that help users find content according to specific semantic features, such as the names of persons, places and organisations referenced in texts, monuments depicted in images, or TV celebrities appearing in videos. Content production involves activities aimed at the creation and enhancement of content (e.g., video super resolution, video denoising). Content monitoring comprises some of the pillars of public media services, such as diversity analysis, content reliability assessment and social media analysis.

The use case aims to explore the plethora of new AI-driven tools and find the most suitable one for each of these domains of application, pursuing the smoothest possible integration of each component into well-established media workflows. Content access tasks have already been tackled, working on the possible introduction into production workflows of four AI-driven components related to informative content and archive exploitation.

Since leveraging visual features, rather than textual metadata alone, can help journalists in their search and retrieval activities, we worked on technologies that allow professionals to identify the faces of TV personalities and geographic landmarks in video, as well as to search for content using images as queries. Another important feature integrated into Rai’s tools for journalists is an AI-driven named entity recognition (NER) component working on English and German content, which will improve the daily workflows of professionals working in bilingual regions of Italy.

As for content production, Public Service Media are extremely interested in video enhancement tasks, in order to upgrade their large amounts of archived content, e.g., from HD to 4K (or sometimes even from SD to HD), for possible reuse. Following this path, we assessed a super-resolution component and compared its performance with state-of-the-art (SOTA) technologies, obtaining promising results. Further tests will follow using different models. Content monitoring activities will also be tackled in the next period.

  • AI for Social Sciences and Humanities 

Researchers working in media studies, history, political sciences, and other fields within social sciences and humanities (SSH) have greatly benefited from the digitization of audiovisual archives. It has expanded the scale and scope of their investigations, stimulating new research questions – a great example of this is an examination of political party representation in the media during the election period. The Netherlands Institute for Sound & Vision developed a use case that investigates how AI-based tooling could enhance SSH research with big data from archival collections. Specifically, AI4Media has provided us with an opportunity to expand the capabilities of the CLARIAH Media Suite, a digital environment for conducting research with multimodal datasets from Dutch heritage organisations. 

Over the last two years we have been collaborating with Fraunhofer IDMT to develop the Partial Audio Matching (PAM) functionality for the Media Suite, which allows researchers to detect and trace the reuse of audiovisual programs based on the matching of identical audio signals. This can show ways in which moving images have been reused to frame a topic by giving source material a different meaning in a new context. For instance, a Media Suite user might choose a particular press conference and perform a PAM analysis to identify how segments from this program have been quoted in the evening news in the weeks that follow, allowing them to compare how the same topic is reported by different TV channels.
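As a toy illustration of the matching principle (not Fraunhofer IDMT’s actual audio fingerprinting), suppose each program has been reduced to a sequence of audio fingerprints, represented here as plain integers. Reused segments then show up as sufficiently long runs of identical fingerprints:

```python
def find_reused_segments(source, broadcast, min_len=3):
    """Return (source_start, broadcast_start, length) for each maximal run
    of identical fingerprints at least min_len frames long, i.e. each
    candidate reuse of source audio inside the broadcast."""
    matches = []
    for i in range(len(source)):
        for j in range(len(broadcast)):
            # Only start counting at the beginning of a maximal run,
            # so interior positions of a run are not reported again.
            if i > 0 and j > 0 and source[i - 1] == broadcast[j - 1]:
                continue
            n = 0
            while (i + n < len(source) and j + n < len(broadcast)
                   and source[i + n] == broadcast[j + n]):
                n += 1
            if n >= min_len:
                matches.append((i, j, n))
    return matches
```

Real fingerprints are robust spectral hashes computed over short audio windows, so matching tolerates re-encoding; the run-finding logic above stays conceptually the same.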

We have already performed an initial evaluation of PAM with researchers in the field of media studies. They confirmed the usefulness of the tool in studying the circulation and ‘canonization’ of images and speeches. Researchers were particularly excited to see a tool that is based on audio rather than visual analysis, which opens up new possibilities for currently underrepresented research areas, such as the analysis of soundscapes. What also became evident during this evaluation is that researchers place a high priority on the explainability and trustworthiness of AI tools: they need to be transparent about the limitations and potential biases of their methods and to make their research replicable. Therefore, the next step in our work will be extending PAM with a model card based on IBM’s AI Fairness 360.

  • AI for Games 

Digital games are one of the fastest-growing multimedia sectors, with the market projected to reach $200 billion by 2023. This incredible trajectory is partly supported by a “games-as-a-service” business model, in which games are continuously developed and expanded beyond their initial release. While the steady flow of content helps with customer retention, it also puts pressure on developers, because this content has to be tested and optimised before release. Artificial Intelligence (AI) can provide a radically new approach to game development and testing by allowing developers to test thousands of different configurations. AI can replace or augment existing practices by providing product evaluations faster than current methods, with a reduced need for human labour or data.

Automated Testing for Games: In the first sub-use case, Automated Testing for Games, MODL.AI demonstrates how AI tools can enhance the development process through automated testing and bug finding. The first objective is to provide a prototype of the platform where users can investigate quality assurance reports generated by an AI agent; these reports are produced by a quality diversity agent run in a simple game demo. We are currently expanding this prototype into a fully functional platform in which such reports can be generated for any game, supported by plug-ins for the world’s most popular game engines, Unity and Unreal Engine, for easy integration by game developers.
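The reporting idea can be sketched in a few lines. This is a hypothetical stand-in for a quality diversity testing agent: instead of evolutionary search it exhaustively tries short action sequences against a stub game, and keeps one failing example per distinct crash state, which plays the role of the “diversity” dimension of the report.

```python
from itertools import product

def run_qa_agent(step_game, horizon=4):
    """Try every short action sequence against the game and bin the first
    crash found at each distinct crash state, yielding a diverse set of
    reproducible bug reports (state -> failing action sequence)."""
    report = {}
    for actions in product([-1, 1], repeat=horizon):
        state = 0
        for t, a in enumerate(actions):
            state, crashed = step_game(state, a)
            if crashed:
                # Keep only the first (shortest-prefix) repro per crash state.
                report.setdefault(state, actions[: t + 1])
                break
    return report
```

A real agent would drive a game build through an engine plug-in and use learned or evolutionary search rather than enumeration, but the output, a set of diverse, reproducible failure cases, is the same kind of artifact a tester would review.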

Improved Music Analysis for Games: Although video game producers usually ask human musicians to compose original background music, the development team needs audio examples that match the ambiance of the game in order to define audio mood boards and to provide music examples that facilitate communication with the composers. Finding suitable music examples is not a simple task, and it can take a long time. In this context, IRCAM intends to demonstrate the benefit of AI methods in the development of video games. Based on an automatic analysis of music files, the demonstrator enables the exploration of a wide music catalogue that has not been manually annotated. 

In the current release, a catalogue of 105,000 songs was analysed to predict attributes (music genres, moods, tempo, etc.) and to compute similarity embeddings. The “Music Explorer” demonstrator, a web service, then allows the exploration of the catalogue in two ways. First, the user defines the musical attributes which fit the ambiance of the game, and the service proposes a list of songs matching those attributes. Unlike similar tools, the criterion here is based on automatically estimated attributes, so the method is applicable even to catalogues that are not manually annotated. The second search method is based on music similarity: the user chooses a reference song and selects one or more music concepts (among genre, mood, instrumentation, era, harmony and rhythm) to define the meaning of “similarity”, and the service returns the list of the closest songs in the catalogue. The analyses for attributes and similarity search are based on AI methods, and the web service is composed of a GUI displayed in the user’s web browser and a back-end on a remote server running the AI components. During the first evaluation, the Music Explorer demonstrator proved its usefulness and its ability to quickly find music examples to help video game producers during the creation of a game.
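The similarity search can be illustrated with a minimal sketch, assuming each song carries one embedding per concept; the real system’s embeddings and ranking are of course far more sophisticated, and the data layout below is an assumption made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def closest_songs(catalogue, reference, concepts, k=2):
    """Rank catalogue songs by average cosine similarity to the reference,
    computed only over the user-selected concept embeddings
    (e.g. ["mood", "genre"])."""
    def score(song):
        return sum(cosine(song[c], reference[c]) for c in concepts) / len(concepts)
    ranked = sorted((s for s in catalogue if s is not reference),
                    key=score, reverse=True)
    return [s["title"] for s in ranked[:k]]
```

Restricting the score to the selected concepts is what lets the user decide whether “similar” means similar mood, similar instrumentation, or both.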

  • AI for Human Co-Creation

This use case, developed by the Barcelona Supercomputing Center, explores the relationship between human creation and AI tools for music composition. Labelled human co-creation, this approach could have a deep impact on an industry that continuously feeds content to a media-consuming society. We are currently developing novel tools for efficient AI-assisted creation, where the efforts of the artist or creator are focused on deeply creative tasks, relying on the assistant to transparently perform less critical parts of content co-creation. As the functionalities of these models can be complex to handle, the purpose is to provide the final user, typically a music creator, with a collection of well-organised functionalities and capabilities for AI-assisted music creation. These enable users to a) train and manipulate the model using a dataset they select, b) generate novel content from the trained model based on a small audio seed, and c) assess the quality of the generated audio content and publish it on popular audio platforms.

The current developments allow a non-expert user to use advanced, pre-trained generative models or to prepare datasets for training under controlled conditions. We include a number of generative models released by the AI4Media project as well as elsewhere. In addition, we have explored user requirements to understand the needs of a community of non-experts approaching AI tools. The implementation of musical processing tools opens up the possibility of transparently creating content for use in multiple formats. Composers use large datasets of music fragments and combine them using machine learning methods. While a single training run can provide a large amount of content (different audio files), using different datasets improves the quality and variability of the generated output. However, the computational requirements are large, and better training methods and data models are needed.

  • AI for (Re-)organisation and Content Moderation

Media companies have accumulated vast digital archives and collections of images and videos over the years. Since these collections have been built gradually and iteratively, often by different departments and units, they usually have little or no metadata such as tags, categories, and other types of annotations. This lack of coherent media asset organisation tailored to the media company’s business and services precludes the easy reuse and successful monetisation of these assets, as well as the creation and offering of new services. In addition, both big traditional media companies and, even more so, digital media platforms hold collections that combine content created by the companies themselves with an increasing amount of user-generated content (UGC). Such hybrid media archives need advanced content moderation (CM) solutions, often working in real time, to safeguard viewers and meet the legal and regulatory requirements of various jurisdictions.

Our current work focuses on the integration and use of Imagga’s content moderation and facial recognition technologies. Imagga has implemented novel methodologies based on advanced deep learning techniques, such as CNNs and RNNs, aimed at photo and video moderation: tagging, categorisation, and facial recognition. As part of the content moderation, we have included object detection of infamous symbols and analysis of whether a video contains not-safe-for-work (NSFW) or explicit content. For facial recognition, we have included a celebrity recognition model able to recognize around 3,000 different celebrities. For each scene in each video, we generate annotation metadata that is used for filtering and searching. The videos are split into keyframes, which are then processed by Imagga’s technologies to obtain the coordinates of infamous symbols and celebrities present in the extracted keyframe images. These frames are also analysed for the presence of NSFW content. The content can then be searched and filtered through a user-friendly web UI.
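The shape of this pipeline, per-keyframe detection followed by metadata-based filtering, can be sketched as follows. This is a hypothetical outline, with stub functions standing in for Imagga’s actual models:

```python
def moderate_video(keyframes, detect_symbols, detect_celebrities, detect_nsfw):
    """Run each extracted keyframe through the three detectors and collect
    per-frame annotation metadata usable for search and filtering."""
    metadata = []
    for idx, frame in enumerate(keyframes):
        metadata.append({
            "frame": idx,
            "symbols": detect_symbols(frame),         # e.g. symbol boxes
            "celebrities": detect_celebrities(frame),  # recognised names
            "nsfw": detect_nsfw(frame),                # explicit-content flag
        })
    return metadata

def filter_frames(metadata, celebrity=None, safe_only=False):
    """Search the annotations, e.g. all safe frames showing a celebrity."""
    hits = metadata
    if celebrity is not None:
        hits = [m for m in hits if celebrity in m["celebrities"]]
    if safe_only:
        hits = [m for m in hits if not m["nsfw"]]
    return [m["frame"] for m in hits]
```

Because the expensive detection runs once per keyframe and everything afterwards is metadata filtering, the same annotations can serve many different search and moderation queries.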

Conclusions

The active guidance provided by use case partners to research partners throughout the integration process plays a crucial role in achieving success in all cases. This emphasises the significance of industry and research collaboration right from the project’s inception, highlighting that a lab-to-market transfer process requires their joint efforts. Moreover, the direct involvement of end-users in iterative and agile development processes further amplifies the potential market adoption of AI-related innovations, fostering a user-centric approach and ensuring the practical relevance of the developed solutions.

Nevertheless, as in every innovation process, various challenges arise and often need to be tackled impromptu, all the more so in the fast-evolving and frequently disrupted domain of digital technologies. Among these challenges are problems related to structural and organisational differences among the consortium partners, integration complexities, usability and understandability issues, human-AI collaboration challenges, and dataset creation concerns. Moreover, ChatGPT’s public release was certainly a game changer for AI-driven innovation, although its long-term impact on the media industry remains to be seen.

To address these challenges, we sought to enhance collaboration by establishing closer, one-to-one relationships between industrial and research partners. Knowledge exchange, co-design activities, and joint events were also used to strengthen collaboration. To address usability issues, we designed more user-friendly interfaces and increased transparency for end users of the demonstrators. Human-AI collaboration was improved through the development of transparency information and feedback mechanisms that enhance user trust in AI-generated results. Finally, careful dataset curation and non-disclosure agreements addressed bias and privacy concerns. Overall, our experience shows that a collaborative approach and ongoing adaptability are key to addressing challenges and ensuring the successful integration of AI research innovations into real-world applications.

Author(s): Danae Tsabouraki (Athens Technology Center); Birgit Gray (Deutsche Welle); Chaja Libot (VRT); Maurizio Montagnuolo (RAI); Rasa Bocyte (Netherlands Institute for Sound & Vision); Christoffer Holmgård (modl.ai); Rémi Mignot (IRCAM); Artur Garcia (BSC); Chris Georgiev (Imagga Technologies).

 

Exploring the future of Media: AI4Media’s fascinating video series

Unveiling the AI4Media Video Series

AI4Media, a European funded project at the intersection of AI and media, has curated a video series that provides a fascinating glimpse into the realm of AI applications within the media industry. Accessible to a global audience, this series aims to demystify AI’s role in shaping the future of media while highlighting the practical implications and potential of this powerful technology.

What makes the AI4Media video series truly captivating is its multifaceted exploration of AI’s applications in media. Each episode delves into a specific facet of this dynamic relationship, offering valuable insights and real-world examples. These are the exciting topics covered in the series:

    1. AI for News Production: This video highlights how AI enhances journalism, aiding in effective news reporting, especially in challenging scenarios, by optimising bandwidth, content management, and enabling real-time mapping and 3D visualisations.
    2. Robot Journalism: This video exemplifies how AI streamlines event coverage and content generation by managing extensive data, integrating information efficiently, and improving the quality of automated content, all while preserving editorial control.
    3. AI for the Next-Gen of Social Media: This video explores the various applications of AI in social media, including automating trend detection, aggregation, categorisation, analysing public sentiment and perceptions, content translation, and more.
    4. AI for Entertainment/Movie Production: This video demonstrates how AI technologies streamline the filming process, enhance content reach, and provide creative options, all while saving valuable time and resources.
    5. AI for Games: This video illustrates how AI assists medium-sized game development companies by streamlining testing processes, pinpointing issues in new content, and elevating overall productivity and quality, establishing itself as an invaluable solution in the industry.
    6. AI for Music: This video demonstrates how AI can enhance music composition and live performances, providing synchronisation support for movie soundtracks and empowering DJs with dynamic, style-adaptive music creation during live shows.
    7. AI for Publishing: This video explores how AI-driven co-creation platforms are revolutionising manuscript selection for publishers through user feedback analysis and content feature assessment.

Real-world applications and case studies are highlighted, illustrating how AI is being harnessed to address challenges and unlock new opportunities in the media ecosystem.

AI4Media’s video series goes beyond mere dissemination; it aims to empower knowledge and foster dialogue. By presenting complex concepts in an accessible manner, the series invites viewers to join the conversation surrounding AI in the media. Whether you are an industry professional, a curious enthusiast, or an academic, the series provides a platform for understanding, discussion, and engagement.

Access the Series Today

The AI4Media video series is just a click away on YouTube, accessible to anyone with an internet connection and a thirst for knowledge. To embark on this enlightening journey and explore the applications of AI and media, follow this link: AI4Media Video Series on YouTube.

Don’t miss the opportunity to uncover the transformative potential of AI in the media industry.

Author: Candela Bravo (LOBA)

Explore the new Scientific Papers page

Navigating the vast landscape of scientific papers can be daunting, but our new and improved filtering system empowers you to effortlessly refine your search. Whether you’re a student seeking cutting-edge studies or a seasoned researcher exploring AI4Media’s subjects, our revamped platform allows you to filter and customise your search results with precision. Check out the updated Scientific Papers page.

We are thrilled to introduce the new set of filters that empower you to precisely tailor your scientific paper searches:

  • Terms: The foundation of your search, allowing you to input specific keywords and phrases to pinpoint exactly what you’re looking for. Whether it’s machine learning, image detection, or the application potential of AI for the Media industry, our term filter ensures your search is laser-focused.
  • Author: Seek papers authored by your favourite experts or discover new voices in your field of interest. With this filter, you can find research directly from those who inspire you.
  • Year of Publication: Stay up-to-date with the latest research or delve into historical archives by narrowing your search to a particular publication year.
  • Institution: Explore papers affiliated with prestigious institutions or uncover hidden gems from lesser-known research centres.
  • Type of Publication: Are you searching for journal articles, conference papers, or books? Choose the publication type that suits your needs.
  • Publisher: Identify papers from trusted publishers, ensuring the credibility and quality of your sources.
  • Access Type: Make sure you access a wealth of research freely and without restrictions by filtering the open-access scientific papers.

Discovering knowledge should be intuitive, and our commitment to innovation ensures that your experience is both seamless and enriching. Welcome to the future of scientific exploration, where finding the research you seek is as simple as a few clicks. Explore, learn, and thrive with us as we continue to advance the way you access scientific papers.

Author: Mariana Carola

Unveiling propaganda in news articles: Cutting-edge models with linguistic and argumentative features

Propaganda has long been employed as a powerful communication tool to promote a cause or a viewpoint, especially in politics, despite its often misleading and harmful nature. Given the number of propagandist and fallacious messages posted on online social media every day, the ability to automatically detect and categorise propagandist content is crucial to safeguard society from its potential harm. We proposed text models that tackle these tasks and analysed the features that characterise propagandist messages. We based our models on state-of-the-art transformer-based architectures and enriched them with a set of linguistic features ranging from sentiment and emotion to argumentation features. The experiments were conducted on two standard benchmarks in the Natural Language Processing field, NLP4IF’19 and SemEval’20-Task 11, both collections of news articles annotated with propaganda classes. Our models outperformed state-of-the-art systems on many of the propaganda detection and classification tasks. F1 scores of 0.72 and 0.68 were achieved on the sentence-level binary classification task for NLP4IF’19 and SemEval’20-Task 11 respectively. For the fragment-level classification task, our models outperformed the SOTA model on some propaganda classes. For instance, on NLP4IF’19, F1 scores of 0.61, 0.42 and 0.40 were obtained for “flag-waving”, “loaded language” and “appeal to fear” respectively.

 

Semantic and argumentative features behind propaganda

In our pursuit to understand propaganda’s linguistic characteristics, we considered four groups of features that have previously shown links to propaganda: persuasion, sentiment, message simplicity, and argumentative features. In the persuasion group, we examined speech style, concreteness, subjectivity, and lexical complexity. For sentiment, we gathered sentiment labels, emotion labels, valence-arousal-dominance (VAD) scores, connotation, and politeness measurements. Message simplicity was analysed through exaggeration and various text length-related metrics. To measure most of these variables we used, or constructed, a variety of lexicons. Finally, we trained classifiers that extract argumentative features, i.e., which parts of the text correspond to claims, premises, or neither; this is important for understanding the logical structure behind propaganda.
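A toy illustration of how such feature groups combine into one feature vector per sentence; the lexicons here are tiny stand-ins for the real resources, and the argument label is assumed to come from a separately trained tagger:

```python
# Placeholder lexicons (the actual work uses much larger resources).
SENTIMENT = {"great": 1, "terrible": -1, "betray": -1, "glorious": 1}
EXAGGERATION = {"always", "never", "everyone", "nothing"}

def extract_features(sentence, argument_label):
    """Build a small feature dict covering three of the groups above:
    sentiment, message simplicity, and argumentative structure."""
    tokens = sentence.lower().rstrip(".!?").split()
    return {
        # sentiment group: mean lexicon polarity of the tokens
        "sentiment": sum(SENTIMENT.get(t, 0) for t in tokens) / len(tokens),
        # message-simplicity group: length and exaggeration markers
        "length": len(tokens),
        "exaggeration": sum(t in EXAGGERATION for t in tokens),
        # argumentative group: "claim", "premise", or "none" from a tagger
        "argument": argument_label,
    }
```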

Propaganda’s levels of detection

We addressed both Sentence-Level Classification (SLC), which asks to predict whether a sentence contains at least one propaganda technique, and Fragment-Level Classification (FLC), which asks to identify both the spans and the type of propaganda technique. The evaluation of the FLC task varied depending on the dataset being used. One of the main differences lies in the number of propaganda categories considered in each corpus: 18 in NLP4IF’ 19, and 14 in SemEval’20-Task 11.

Sentence-Level Classification

To tackle SLC, we employed a range of models, including BERT, T5, Linear-Neuron Attention BERT, Multi-granularity BERT, BERT combined with BiLSTM, and BERT combined with logistic regression. In our proposed models, we utilised the last three architectures and modified them to include semantic and argumentative features. Our proposed models surpassed the state-of-the-art architectures. In some cases, semantic features alone gave slightly better results than combining them with argumentation features.
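The feature-enrichment idea, concatenating the extra linguistic or argumentative features with the sentence representation before classification, can be sketched as follows. The embedding below stands in for a BERT sentence vector and the logistic head for the classifier; the numbers are purely illustrative:

```python
import math

def classify(sentence_embedding, extra_features, weights, bias):
    """Logistic-regression head over the concatenation of a (stub)
    sentence embedding and hand-crafted features, returning the
    probability that the sentence contains propaganda."""
    x = list(sentence_embedding) + list(extra_features)
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))
```

In the actual models the concatenated vector feeds a trained neural head (BiLSTM or logistic regression on top of BERT), but the key design choice is the same: the classifier sees both learned and engineered features side by side.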

Fragment-Level Classification

On the NLP4IF’19 dataset, we evaluated various models, such as BERT, RoBERTa and the transformer-based winning architecture from the NLP4IF’19 shared task. Our proposed architectures used BERT with CRF output layers, outperforming the state-of-the-art model for several propaganda techniques.

On the SemEval’20 T11 dataset, we implemented solutions based on BERT, RoBERTa, and the winning approach of the SemEval’20 T11 challenge. Our proposed model combined a transformer architecture with a BiLSTM. In addition to the textual input, we fed the model with semantic and argumentation features, and we used a joint loss function that combines the loss at the sentence level, at the span level, and for the additional features. This model outperformed the SOTA model on some propaganda classes. In general, we noticed that using different numbers of training epochs helps to detect different propaganda techniques. For instance, the classes “bandwagon and reductio ad Hitlerum” and “thought-terminating clichés” are learnt best with few training epochs, while “causal oversimplification” is learnt with many training epochs.

This task remains challenging, in particular regarding the fine-grained classification of the different propaganda classes.

What’s next?

Propaganda leverages emotional and logical fallacies, and it is present in all kinds of media. That is why we have turned our attention to the study of fallacies on Twitter (now X), the bustling hub of information and opinions. This is a challenging task, since fallacy identification often relies on the context in which the text appears, and given the short length of tweets, such context is not always available. We are currently working on transformer-based architectures that will help us classify fallacies on this platform and continue our journey to fight misinformation and promote a more informed society.

 

Author: Mariana Chaves (UCA-3IA)

How did the European press treat the Covid-19 “no-vax” phenomenon?

In AI4Media, Idiap has worked on the analysis of newspapers in different countries and their relationship with misinformation in the context of Covid-19 vaccination news.

Initially, a dataset was created, comprising more than 50,000 articles on Covid-19 vaccination from 19 newspapers across 5 European countries. From this dataset, a set of subtopics (within the main topic of covid vaccination news) was identified using topic models. Companies, countries, and individuals most frequently mentioned in each country were also identified using Named Entity Recognition techniques, while the sentiment of both headlines and full articles was determined. The results revealed consistencies across countries and subtopics (e.g. a prevalence of a neutral tone, and relatively more negative sentiment in non-neutral articles, with few exceptions like the case of vaccine brands). Moreover, distinctly high negative-to-positive sentiment ratios were identified for the “no-vax” subtopic, showing that this issue had a notably negative tone. This dataset and the results of the analysis were presented at the ACM International Workshop on Multimedia AI against Disinformation (MAD’22) in Newark, US. For more details, the paper can be consulted here.
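The negative-to-positive ratio mentioned above is straightforward to compute once each article carries a subtopic and a sentiment label. The sketch below uses placeholder labels rather than real classifier output:

```python
def negative_to_positive_ratio(articles, subtopic):
    """Ratio of negative to positive articles within one subtopic,
    ignoring neutral articles; a high ratio indicates a distinctly
    negative tone, as observed for the "no-vax" subtopic."""
    labels = [a["sentiment"] for a in articles if a["subtopic"] == subtopic]
    neg = labels.count("negative")
    pos = labels.count("positive")
    return neg / pos if pos else float("inf")
```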

Subsequently, Idiap directed its focus towards the “no-vax” movement theme. This line of research examines how the European press addressed reactions against the Covid-19 vaccine and the disinformation and misinformation associated with this movement. Based on the Covid-19 vaccination news dataset, Idiap employed a number of methods, including named entity recognition, word embeddings, and semantic networks, to characterise the coverage provided by the European press within the disinformation ecosystem. The results of this multi-faceted analysis show that the European press actively countered a variety of hoaxes primarily propagated on social media and criticized the anti-vax trend, irrespective of the political orientation of the newspaper. This confirms the significance of studying the role of high-quality press in the disinformation ecosystem. This research was presented at the ACM International Workshop on Multimedia AI against Disinformation (MAD’23) in Thessaloniki, Greece. For more details, the paper can be found here.

Overall, Idiap’s work serves as a point of comparison with other news sources on a topic where disinformation and misinformation have led to increased risks and negative outcomes for people’s health. We believe that linguistic analyses of high-quality press in Europe can contribute to informing the design of tools against disinformation, serving as a benchmark for what constitutes reliable information.

Authors: David Alonso del Barrio & Daniel Gatica-Perez (Idiap Research Institute)

Making synthetic image detection practical

Detecting whether an image posted on the Internet is authentic or generated by one of the recent generative AI models poses a daily challenge for journalists and fact-checkers. While most people are familiar with tools such as Midjourney, DALL-E 2, and Stable Diffusion, a growing number of tools, services, and apps now make synthetic image generation extremely accessible, enabling anyone to create highly realistic images from plain text descriptions, widely known as prompts. Such capabilities can naturally be exploited by malicious actors to spread disinformation. Therefore, having capable tools in place to detect whether a suspicious image is AI-generated holds clear value for media organisations and newsrooms.

Such detection tools are also abundant, with the large majority based on “deep learning models” – very large neural networks trained to distinguish between authentic and synthetic media. In academic papers, these tools have often been shown to perform exceptionally well at separating authentic from synthetic imagery. However, deploying them in operational settings presents several challenges.

A primary challenge is these models’ tendency to perform well only on the restricted set of cases (referred to as a “domain”) used for their training. Consider a scenario where a researcher primarily used synthetic and authentic images of human faces to train the model. If a journalist then uses this model to check whether an image depicting a building is synthetic, the model is likely to give an unreliable response due to the domain mismatch between training (human faces) and testing (buildings). The Multimedia Knowledge and Social Media Analytics Lab (MKLab) at CERTH has recently developed a method to alleviate this issue. The method achieves better generalisation across domains by training the detection model solely on high-quality synthetic images, which compels the model to “learn” quality-related artifacts instead of content-related cues. This method was presented at the International Workshop on Multimedia AI against Disinformation (MAD’23) in Thessaloniki, Greece. A paper providing technical details is available as part of the workshop proceedings.

A second challenge when employing synthetic image detection models in practice is that most, if not all, available tools take the form of web services that send the provided images to a server for analysis. This is often due to the computational intensity of the detection models, which require a powerful server for quick calculations. However, there are situations where journalists, fact-checkers, or citizens might be uncomfortable or at risk when sharing suspicious images with third-party services. To address this challenge, the MKLab team at CERTH leveraged a newly proposed method to “compress” detection models into a much smaller size, enabling execution on a standard smartphone. This allows a deepfake detection analysis to run without submitting the suspicious image to a third party. The compression relies on “knowledge distillation”, in which a computationally expensive model acts as a “teacher” to train a lighter model (the “student”). In experiments, the model size could be halved while maintaining nearly the same detection accuracy, and even a 10-fold reduction was possible with only a slight decrease in accuracy. The method behind these results has been submitted for publication in an international journal, and a preprint is publicly available.
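The teacher-student idea at the heart of knowledge distillation can be sketched in a few lines: the student is trained to match the teacher's temperature-softened output distribution, typically via a KL-divergence loss. This is the generic technique (Hinton et al.), not CERTH's exact training setup; temperature and logit values below are illustrative:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature softens the
    distribution, exposing the teacher's 'dark knowledge'."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence between the teacher's softened predictions and the
    student's: the core objective of knowledge distillation."""
    p = softmax(teacher_logits, temperature)   # teacher targets
    q = softmax(student_logits, temperature)   # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# The loss shrinks as the student's logits approach the teacher's,
# which is what drives the student towards the teacher's behaviour.
teacher = [2.0, -1.0]
far = distillation_loss([0.0, 0.0], teacher)
near = distillation_loss([1.9, -0.9], teacher)
assert near < far
```

In practice this loss is combined with the ordinary classification loss on ground-truth labels, and the student network is chosen small enough to run on a smartphone.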

It’s important to note a key limitation of the above results: both focus on detecting GAN-generated images (well-known examples are produced by https://thispersondoesnotexist.com/). Reliably detecting images produced by models like DALL-E 2, Stable Diffusion, and Midjourney is not yet feasible, although ongoing experiments show promise in developing tools that could enhance journalists’ and fact-checkers’ capabilities in countering disinformation.

Author: Akis Papadopoulos (CERTH)

AIDA: Maximising efforts toward accessible AI education and research

AIDA (International AI Doctoral Academy) is a non-profit organization comprising academic and industrial partners. It receives support from the European Networks of Excellence AI4Media, ELISE, TAILOR, HumanE-AI NET, and VISION CSA. AIDA’s primary aim is to enhance accessibility to AI education and research.

Its key objectives include:

  • Coordinating educational and training activities in AI for PhD and postdoc students among AIDA partners.
  • Establishing itself as a global reference point for all matters related to AI education and research.
  • Developing mechanisms for the sharing of educational resources in the field of PhD-level AI across universities.
  • Paving the way for future efforts aimed at creating a charter for European universities to share, accredit, and recognize PhD education credits in AI.

In line with these objectives, AIDA has undertaken significant efforts to maximize the impact and user engagement in AI education and research excellence.

More concretely, AIDA’s vision is to:

  • Cultivate a new generation of AI talents in Europe.
  • Establish itself as a leading reference in AI education.
  • Operate with a focus on realism and ensure long-term sustainability.

In this context, AIDA is a strong advocate for providing free access to AI educational resources and materials. The expansion of its AI offerings can be summarized as follows:

These figures reflect significant growth due to AIDA’s effective communication and dissemination practices. Depending on the context, AIDA communicates its efforts either publicly or to selected audiences, directing its offerings to the relevant groups of recipients, primarily through social media and mailing lists.

Focusing on the user, the primary beneficiary of free access to AI education and research, AIDA has improved and enhanced its website to promote transparency, attractiveness, user-friendliness, and content enrichment.

In summary, these efforts have maximized the impact and user engagement of AIDA’s offerings, bringing it closer to establishing itself as an authority and a one-stop-shop for AI excellence in Europe.

For interested readers, becoming an AIDA member is possible by following the links below:

Author: George Bouchagiar (AUTH)

The ten projects from AI4Media’s second funding program are introducing fresh AI research and innovation for the media industry

The objective of AI4Media – Open Call #2, much like the first open call, was to engage companies and researchers in developing new research and applications for AI and media, thus contributing to the enrichment of the technological tools developed within the AI4Media network. Applicants were required to address specific challenges outlined by AI4Media partners, all of which are aligned with the Roadmap on AI technologies and applications for the Media Industry. 

The open call ran from September 29 to November 30, 2022, and received a total of 95 submissions from 24 countries. Eligible submissions underwent external evaluation by independent experts, and a selected group of proposals advanced to the interview stage; 10 projects were ultimately selected, each awarded a grant of up to €50,000 to implement its work plan.

Throughout the remainder of the funding program, AI4Media will provide beneficiaries with tailored coaching, business support, and external visibility. Additionally, a boot camp featuring various workshops will be conducted later in 2023.

Here’s a brief overview of the funded projects and their objectives:

APPLICATION projects:

  1. JECT-CLONE (JECT.AI Limited, SME from the UK): Delivering new computational creativity capabilities as a software-as-a-service (SaaS) that autonomously generates novel themes, angles, and voices for stories, sending them regularly to subscribed journalists and editors through existing channels.
  2. VIREO (Human Opsis, SME from Greece): Recommending images to professionals in the News and Media industry using AI techniques, enhancing the creation of visually compelling articles and improving the reading experience for media consumers.
  3. NLMIE (Kaspar Aps, SME from Denmark): Combining Natural Language Processing with Computer Vision to modernize audiovisual archives.
  4. MBD (Tech Wave Development Srl, SME from Romania): Uniting artists, journalists, and programmers against misinformation by providing ways to visualize the hidden structures of fake information, making it meaningful for both journalists and the general public.
  5. magnet (inknow solutions, lda, SME from Portugal): Offering a tool to support journalists in the early phases of article production by automatically resurfacing relevant content from previous activities.


RESEARCH projects:

  1. CAMOUFLAGE (Politecnico di Torino, Higher Education institution from Italy): Developing diffusion models for extreme image anonymization in social media.
  2. ELMER (University of Surrey, Higher Education institution from the UK): Creating an efficient system for content retrieval capable of handling multi-modal audio, image, text, and video data, particularly for footage longer than 10 seconds.
  3. HoloNeXT (Fundació i2CAT, Research organization from Spain): Developing a novel XR media production tool integrating two volumetric/XR technologies: Neural Radiance Field scene modeling and holographic real-time video volumetric transmission.
  4. CLIP LENS (CENTIC, Research organization from Spain): Enhancing AI-based systems like image classifiers and search engines through generative data augmentation and CLIP.
  5. VolEvol (“Gheorghe Asachi” Technical University of Iasi, Higher Education institution from Romania): Facilitating the rendering of images from volume data using evolutionary algorithms to search for rendering parameters based on quality and diversity-oriented optimization objectives.

Authors: Samuel Almeida, Ellie Shtereva, and Catarina Reis (F6S)

The AI Media Observatory is now fully launched

The AI Media Observatory is a knowledge platform that monitors and curates relevant research on AI in media. Over the last few months, we have been steadily building a knowledge foundation of articles and audiovisual content covering the environmental and societal impact of AI, emerging policies and legislation, how to ensure social and ethical AI, and what the upcoming trends and technologies look like.

The content currently featured on the observatory is curated by the consortium and is based on the expertise of more than 30 leading research and industry partners in the field of AI in media. However, all stakeholders are also invited to submit content to the Observatory to ensure its relevance for everyone working at the intersection of media and AI. More information on how to submit and the criteria can be found on the ‘Editorial Board’ page.

As the latest feature, the AI Media Observatory now also includes an expert directory where AI and Media experts can be featured, allowing stakeholders to easily get in touch with relevant experts in the field. The expert directory is open, and all experts are welcome to sign up and have their profiles featured as long as they meet the eligibility criteria.

The Observatory in short

In short, the overarching goal of the Observatory is to support the ongoing efforts of the multidisciplinary community of professionals working towards ensuring the responsible use of AI in the media sector. It aims to contribute to the broader discussion and understanding of the development and use of AI in the sector and its impacts on society, the economy, and people. The Observatory fulfils this goal by curating relevant content that provides expert perspectives on the potentials and challenges that AI poses for the media sector, through its sections ‘Your AI Media Feed’ for written content and ‘Let’s Talk AI and Media’ for audiovisual content. It also provides an easy overview of relevant experts in the field through the directory ‘Find your AI Media Expert’.

Author: Anna Schjøtt Hansen (UvA)