Education, Content & Creativity INNOVATION
New, improved methods for automated speech/speaker and Named Entity Recognition to generate test descriptions
Market Maturity: Tech Ready
These are innovations that are progressing through the technology development process (e.g. pilots, prototypes, demonstrations).
Market Creation Potential
This innovation was assessed by the JRC’s Market Creation Potential indicator framework as addressing the needs of existing markets and existing customers.
UN Sustainable Development Goals (SDG)
This innovation contributes to the following SDG(s)
SUSTAINABLE DEVELOPMENT GOAL 8
Promote sustained, inclusive and sustainable economic growth, full and productive employment and decent work for all

The UN explains: "Roughly half the world’s population still lives on the equivalent of about US$2 a day. And in too many places, having a job doesn’t guarantee the ability to escape from poverty. This slow and uneven progress requires us to rethink and retool our economic and social policies aimed at eradicating poverty."

The EU-funded Research Project
This innovation was developed under the Horizon 2020 project MeMAD with an end date of 31/12/2020
  • Read more about this project on CORDIS
Description of Project MeMAD
Audiovisual media content created and used in films and videos is key to how people communicate and entertain. It has also become an essential resource for modern history, since a large portion of the memories and records of the 20th and 21st centuries are audiovisual. To fully benefit from this asset, fast and effective methods are needed to cope with the rapidly growing audiovisual big data that is collected in digital repositories and used internationally.

MeMAD will provide novel methods for the efficient re-use and re-purposing of multilingual audiovisual content that will revolutionize video management and digital storytelling in broadcasting and media production. We go far beyond state-of-the-art automatic video description methods by making the machine learn from the human. The resulting description is thus not only a time-aligned semantic extraction of objects but also makes use of the audio and recognizes action sequences. While current methods work mainly for English, MeMAD will handle multilingual source material and produce multilingual descriptions, and thus enhance the user experience. Our method interactively integrates the latest research achievements in deep neural network techniques in computer vision with knowledge bases and with human and machine translation in a continuously improving machine learning framework.

This results in detailed, rich descriptions of the moving images, speech, and audio, which enable people working in the Creative Industries to access and use audiovisual information in more effective ways. Moreover, the intermodal translation from images and sounds into words will attract millions of new users to audiovisual media, including the visually and hearing impaired. Anyone using audiovisual content will also benefit from these verbalisations, as they are non-invasive surrogates for visual and auditory information that can be processed without actually watching or listening, matching the new patterns of video consumption on mobile devices.

Innovation Radar's analysis of this innovation is based on data collected on 30/08/2019.
The unique id of this innovation in the European Commission's IT systems is: 18044