The IPTC took part in a panel on Diversity and Inclusion at the CEPIC Congress 2022, the picture industry’s annual get-together, held this year in Mallorca, Spain.
Google’s Anna Dickson hosted the panel, which also included Debbie Grossman of Adobe Stock, Christina Vaughan of ImageSource and Cultura, and photographer Ayo Banton.
Unfortunately Abhi Chaudhuri of Google couldn’t attend due to Covid, but Anna presented his material on Google’s new work surfacing skin tone in Google image search results.
Brendan Quinn, IPTC Managing Director, participated on behalf of the IPTC Photo Metadata Working Group, which put together the Photo Metadata Standard, including the new properties covering accessibility for visually impaired people: Alt Text (Accessibility) and Extended Description (Accessibility).
Brendan also discussed IPTC’s other Photo Metadata properties concerning diversity, including the Additional Model Information which can include material on “ethnicity and other facets of the model(s) in a model-released image”, and the characteristics sub-property of the Person Shown in the Image with Details property which can be used to enter “a property or trait of the person by selecting a term from a Controlled Vocabulary.”
Some interesting conversations ensued around the difficulty of keeping diversity information up to date in an ever-changing world of diversity language, the pros and cons of using controlled vocabularies (pre-selected word lists) to cover diversity information, and the differences in covering identity and diversity information on a self-reported basis versus reporting by the photographer, photo agency or customer.
It’s a fascinating area and we hope to be able to support the photographic industry’s push forward with concrete work that can be implemented at all types of photographic organisations to make the benefits of photography accessible for as many people as possible, regardless of their cultural, racial, sexual or disability identity.
Where else can you hear about the difficulties of examining photo metadata in NFTs, see a lifelike image of a human being generated from pure data before your eyes, see how Wikidata can be used to take semantic fingerprints of news articles, and discover that an hour is nowhere near long enough to discuss simplifying machine-readable rights? Nowhere but the IPTC Meeting, of course! And this year’s Spring Meeting was the venue for all of this and much more.
We held the meeting virtually from Monday May 16th to Wednesday May 18th, with over 70 people attending from at least 45 organisations across more than 20 countries.
Along with our usual Working Group updates and committee meetings, we invited speakers from several fascinating startups, services and projects at member companies. Here’s a quick summary of their sessions:
- We heard from Kairntech who are working on a classification system based on extracting entities from news stories and building a “semantic fingerprint” which can be used for cross-language classification, search and content enhancement
- The New York Times’ R&D Lab presented PaperTrail, a project to enhance the quality of the Times’ print archive through the use of machine learning to improve on basic OCR techniques (they’re looking for collaborators, more info coming soon!)
- Bria.ai showed us how an API can be used to enhance and create images and videos through the use of a custom GAN model trained in a “responsible AI” method
- Margaret Warren talked us through her efforts in creating and selling an NFT, looking at the process from the perspective of a photo metadata expert
- Consultant and author Henrik de Gyor talked us through the latest in synthetic media, which will help us to finalise our Digital Source Type vocabulary for synthetic media
- Laurent Le Meur from EDRLab presented his project’s recommendation on a Text and Data Mining Reservation Protocol, which can be used by publishers to restrict the rights of data miners in scraping any content for the purpose of analysis or building a model
- We heard from Dominic Young of Axate on his approach to offering pay-as-you-go payment options on paywalled news sites, based on a simple pre-paid wallet mechanism.
We also had many announcements and discussions around IPTC standards, many of which we will be revealing in the coming months. One notable update is that the Standards Committee approved ninjs version 1.4, which we will release soon.
Thanks to all the IPTC members, Working Group leads, committee members and guests who made this member meeting one to remember.
The National Association of Broadcasters (NAB) Show wrapped up its first face-to-face event in three years last week in Las Vegas. In spite of the name, this is an internationally attended trade conference and exhibition showcasing equipment, software and services for film and video production, management and distribution. There were 52,000 attendees, down from a typical 90-100k, with some reduction in booth density; overall the show was reminiscent of pre-COVID days. A few members of IPTC met while there: Mark Milstein (vAIsual), Alison Sullivan (MGM Resorts), Phil Avner (Associated Press) and Pam Fisher (The Media Institute). Kudos to Phil for working, showcasing ENPS on the AP stand, while others walked the exhibition stands.
NAB is a long-running event and several large vendors have large ‘anchor’ booths. Some such as Panasonic and Adobe reduced their normal NAB booth size, while Blackmagic had their normal ‘city block’-sized presence, teeming with traffic. In some ways the reduced booth density was ideal for visitors: plenty of tables and chairs populated the open areas making more meeting and refreshment space available. The NAB exhibition is substantially more widely attended than the conference, and this year several theatres were provided on the show floor for sessions any ‘exhibits only’ attendee could watch. Some content is now available here: https://nabshow.com/2022/videos-on-demand/
For the most part this was a show of ‘consolidation’ rather than ‘innovation’. For example, exhibitors were enjoying welcoming their partners and customers face-to-face rather than launching significant new products. Codecs standardised during the past several years were finally reaching mainstream support, with AV1, VP9 and HEVC well-represented across vendors. SVT-AV1 (Scalable Video Technology for AV1) was particularly prevalent, having been well optimised and made available to use license-free by the standard’s contributors. VVC (Versatile Video Coding), a more recent and more advanced standard, is still too computationally intensive for commercial use, though a small number of exhibitors mentioned it on their stands (e.g. Fraunhofer).
IP is now fairly ubiquitous within broadcast ecosystems. To consolidate further, an IP Showcase booth illustrating support across standards bodies and professional organisations championed more sophisticated adoption. A pyramid graphic showing a cascade of ‘widely available’ to ‘rarely available’ sub-systems encouraged deeper adoption.
Super Resolution – raising the game for video upscaling
One of the show floor sessions – “Improving Video Quality with AI” – presented advances by iSIZE and Intel. The Intel technology may be particularly interesting to IPTC members, and concerns “Super Resolution.” Having followed the subject for over 20 years, for me this was a personal highlight of the show.
Super Resolution is a technique for creating higher resolution content from smaller originals: for example, producing a professional-quality 1080p video from a 480p source, or scaling up a social media-sized image for feature use.
A few years ago, a novel and highly effective Super Resolution method was introduced (“RAISR”, see https://arxiv.org/abs/1606.01299). It represented a major discontinuity in the field, albeit with the usual mountain of investment and work needed to take the ‘R’ (research) to ‘D’ (development).
This is exactly what Intel have done, and the resulting toolsets will be made available at no cost at the company’s Open Visual Cloud repository at the end of May.
Intel invested four years in improving the AI/ML algorithms (having created a massive ground truth library for learning), optimising to CPUs for performance and parallelisation, and then engineering the ‘applied’ tools developers need for integration (e.g. Docker containers, FFmpeg and GStreamer plug-ins). Performance will now be commercially robust.
The visual results are astonishing, and could have a major impact on the commercial potential of photographic and film/video collections needing to reach much higher resolutions or even to repair ‘blurriness’.
Next year’s event is the centennial of the first NAB Show and takes place from April 15th to 19th in Las Vegas.
– Pam Fisher – Lead, IPTC Video Metadata Working Group
With less than two weeks to go, we are pleased to announce the full agenda for the IPTC Spring Meeting 2022.
The IPTC Spring Meeting 2022 will be held virtually from Monday May 16th to Wednesday May 18th, from 1300 to 1800 UTC each day.
IPTC member representatives can view the full agenda and register at https://iptc.org/moz/events/spring-meeting-2022/
Highlights of the meeting include:
- Updates from all IPTC Working Groups, including Photo Metadata, Video Metadata, NewsCodes, Sports Content, NewsML-G2 and News in JSON
- Updates from the IPTC PR Committee and the IPTC Standards Committee, including votes on proposed new versions of IPTC standards
- Invited presentations from:
  - United Robots, presenting their “robot journalism” system built for media companies
  - Axate’s micropayments system for publishers
  - Kairntech presenting their content classification system used by Agence France-Presse among others
  - Bria.ai’s image generation and manipulation API backed with cutting-edge artificial intelligence
  - Consultant Henrik de Gyor speaking on the latest developments in synthetic media
  - Laurent Le Meur from EDRLab discussing the W3C Text and Data Mining Community Group’s recommendation for a Text and Data Mining Reservation Protocol
- Member presentations:
  - Recently-joined IPTC members will have a chance to introduce themselves and their organisations to the IPTC membership
  - The New York Times presenting their “PaperTrail” project, which uses machine learning to improve the quality of the Times’ print archive
  - Margaret Warren from ImageSnippets discussing what she learned when creating NFTs from her artwork
- Member discussions:
  - IPTC members will be discussing how we might be able to simplify rights management with a cut-down basic set of rights assertions, possibly creating a simpler alternative to RightsML
  - IPTC members will also be discussing the News Architecture and how we can better utilise the key data model that underlies both NewsML-G2 and ninjs
- and more!
Attendance at the 2022 IPTC Spring Meeting is free for all delegates and member experts from IPTC member organisations.
Invited speakers are welcome to attend the day on which they are speaking.
IPTC members will be appearing at imaging.org’s Imaging Science and Technology DigiTIPS 2022 meeting series tomorrow, April 26.
The session description is as follows:
Abstract: Learn how embedded photo metadata can aid in a data-driven workflow from capture to publish. Discover what details exist in your images; and learn how you can affix additional information so that you and others can manage your collection of images. See how you can embed info to automatically fill in “Alt Text” to images shown on your website. Explore how you can test your metadata workflow to maximize interoperability.
Registration is still open. You can register at https://www.imaging.org/Site/IST/Conferences/DigiTIPS/DigiTIPS_Home.aspx?Entry_CCO=3#Entry_CCO
A hot topic in media circles these days is “synthetic media”. That is, media that was created either partly or fully by a computer. Usually the term is used to describe content created either partly or wholly by AI algorithms.
IPTC’s Video Metadata Working Group has been looking at the topic recently and we concluded that it would be useful to have a way to describe exactly what type of content a particular media item is. Is it a raw, unmodified photograph, video or audio recording? Is it a collage of existing photos, or a mix of synthetic and captured content? Was it created using software trained on a set of sample images or videos, or is it purely created by an algorithm?
We have an existing vocabulary that suits some of this need: Digital Source Type. This vocabulary was originally created to be able to describe the way in which an image was scanned into a computer, but it also represented software-created images at a high level. So we set about expanding and modifying that vocabulary to cover more detail and more specific use cases.
It is important to note that we are only describing the way a media object has been created: we are not making any statements about the intent of the user (or the machine) in creating the content. So we deliberately don’t have a term “deepfake”, but we do have “trainedAlgorithmicMedia” which would be the term used to describe a piece of content that was created by an AI algorithm such as a Generative Adversarial Network (GAN).
Here are the terms we propose to include in the new version of the Digital Source Type vocabulary. (New terms and definition changes are marked in bold text. Existing terms are included in the list for clarity.)
| Term name | Term description | Examples |
|---|---|---|
| Original digital capture sampled from real life | The digital media is captured from a real-life source using a digital camera or digital recording device | Digital photo or video taken using a digital SLR or smartphone camera |
| Digitised from a negative on film | The digital media was digitised from a negative on film or any other transparent medium | Film scanned from a moving image negative |
| Digitised from a positive on film | The digital media was digitised from a positive on a transparency or any other transparent medium | Digital photo scanned from a photographic positive |
| Digitised from a print on non-transparent medium | The digital image was digitised from an image printed on a non-transparent medium | Digital photo scanned from a photographic print |
| Original media with minor human edits | Minor augmentation or correction by a human, such as a digitally-retouched photo used in a magazine | Video camera recording, manipulated digitally by a human editor |
| Composite of captured elements | Mix or composite of several elements that are all captures of real life | A composite image created by a digital artist in Photoshop based on several source images; Edited sequence or composite of video shots |
| Algorithmically-enhanced media | Minor augmentation or correction by algorithm | A photo that has been digitally enhanced using a mechanism such as Google Photos’ “de-noise” feature |
| Data-driven media | Digital media representation of data via human programming or creativity | A representation of a distant galaxy created by analysing the outputs of a deep-space telescope (as opposed to a regular camera); An infographic created using a computer drawing tool such as Adobe Illustrator or AutoCAD |
| Digital art | Media created by a human using digital tools | A cartoon drawn by an artist into a digital tool using a digital pencil, a tablet and a drawing package such as Procreate or Affinity Designer; A scene from a film/movie created using computer-generated imagery (CGI); Electronic music composition using purely synthesised sounds |
| Virtual recording | Live recording of virtual event based on synthetic and optionally captured elements | A recording of a computer-generated sequence, e.g. from a video game; A recording of a Zoom meeting |
| Composite including synthetic elements | Mix or composite of several elements, at least one of which is synthetic | Movie production using a combination of live-action and CGI content, e.g. using Unreal engine to generate backgrounds; A capture of an augmented reality interaction with computer imagery superimposed on a camera video, e.g. someone playing Pokemon Go |
| Trained algorithmic media | Digital media created algorithmically using a model derived from sampled content | Image based on deep learning from a series of reference examples; A “speech-to-speech” generated audio or “deepfake” video using a combination of a real actor and an AI model; “Text-to-image” using a text input to feed an algorithm that creates a synthetic image |
| Algorithmic media | Media created purely by an algorithm not based on any sampled training data, e.g. an image created by software using a mathematical formula | A purely computer-generated image such as a pattern of pixels generated mathematically, e.g. a Mandelbrot set or fractal diagram; A purely computer-generated moving image such as a pattern of pixels generated mathematically |
We propose that the following term, which exists in the current DigitalSourceType CV, be retired:
| Term ID | Term name | Term description | Note |
|---|---|---|---|
| RETIRE: softwareImage | Created by software | The digital image was created by computer software | We propose that trainedAlgorithmicMedia or algorithmicMedia be used instead of this term. |
We welcome all feedback from across the industry on these proposed terms.
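For anyone wondering how the proposed retirement of softwareImage might play out in a content system, here is a minimal Python sketch. It uses only the three term IDs mentioned in this post (with algorithmicMedia spelled as we assume it will appear in the vocabulary). The function name and the `used_trained_model` flag are hypothetical, and the sketch is an illustration under those assumptions, not an official migration rule.

```python
from typing import Optional

# Term IDs taken from the text of this post. The other proposed terms are
# left out because the post gives their names but not their IDs.
TRAINED = "trainedAlgorithmicMedia"
ALGORITHMIC = "algorithmicMedia"  # assumed spelling of the new term ID
RETIRED = "softwareImage"


def migrate_source_type(term_id: str,
                        used_trained_model: Optional[bool] = None) -> str:
    """Suggest a replacement for the retired term (hypothetical helper).

    Terms other than the retired one are returned unchanged. If we do not
    know whether a trained model was involved, the retired term is kept so
    that a person can review the item.
    """
    if term_id != RETIRED:
        return term_id
    if used_trained_model is None:
        return term_id
    return TRAINED if used_trained_model else ALGORITHMIC


if __name__ == "__main__":
    print(migrate_source_type("softwareImage", used_trained_model=True))
    print(migrate_source_type("softwareImage", used_trained_model=False))
```

The choice between the two replacement terms follows the definitions above: content created with a model derived from sampled content maps to trainedAlgorithmicMedia, while content created purely by formula maps to algorithmicMedia.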
Anyone who has managed photo metadata can attest that it is often difficult to know which metadata properties to use for different purposes. It is especially tricky to know how to tag consistently across different metadata standards. For example, how should a copyright notice be expressed in Exif, IPTC Photo Metadata and schema.org metadata?
For software vendors wanting to build accurate mapping into their tools to make life easier for their customers, it’s no easier. For a while, a document created by a consortium of vendors known as the Metadata Working Group solved some of the problems, but the MWG Guidelines are no longer available online.
To solve this problem, the IPTC collaborated with Exif experts at CIPA, the camera products industry group that maintains the Exif standard. We also spoke with the team behind schema.org. Based on these conversations, we created a document that describes how to map properties between these formats. The aim is to remove any ambiguity regarding which IPTC Photo Metadata properties are semantically equivalent to Exif tags and schema.org properties.
Generally, Exif tags and IPTC Photo Metadata properties represent different things: Exif mainly represents the technical data around capturing an image, while IPTC focuses on describing the image and its administrative and rights metadata, and schema.org covers expressing metadata in a web page. However, quite a few properties are shared by all standards, such as who is the Creator of the image, the free-text description of what the image shows, or the date when the image was taken. Therefore it is highly recommended to have the same value in the corresponding fields of the different standards.
The IPTC Photo Metadata Mapping Guidelines outlines the 17 IPTC Photo Metadata Standard properties with corresponding fields in Exif and/or schema.org. Further short textual notes help to implement these mappings correctly.
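As a rough illustration of the recommendation to keep the same value in corresponding fields, the Python sketch below checks two properties across the three standards. The field names used here (Exif `Copyright` and `Artist`, XMP `dc:rights` and `dc:creator`, schema.org `copyrightNotice` and `creator`), the `MAPPINGS` table and the `find_mismatches` helper are assumptions chosen for the example; the Mapping Guidelines document remains the authoritative source for the actual mappings and their notes.

```python
from typing import Dict, List

# Assumed field names for two shared properties, for illustration only.
MAPPINGS: Dict[str, Dict[str, str]] = {
    "Copyright Notice": {
        "exif": "Copyright",
        "iptc": "dc:rights",
        "schema_org": "copyrightNotice",
    },
    "Creator": {
        "exif": "Artist",
        "iptc": "dc:creator",
        "schema_org": "creator",
    },
}


def find_mismatches(metadata: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the properties whose values differ between standards.

    `metadata` maps a standard name ("exif", "iptc", "schema_org") to the
    field/value pairs already read from an image or web page. Fields that
    are absent are ignored rather than treated as mismatches.
    """
    mismatched = []
    for prop, fields in MAPPINGS.items():
        values = {
            metadata.get(standard, {}).get(field)
            for standard, field in fields.items()
        }
        values.discard(None)
        if len(values) > 1:
            mismatched.append(prop)
    return mismatched


if __name__ == "__main__":
    sample = {
        "exif": {"Copyright": "(c) 2022 Example Agency", "Artist": "Jane Doe"},
        "iptc": {"dc:rights": "(c) 2022 Example Agency", "dc:creator": "J. Doe"},
        "schema_org": {"copyrightNotice": "(c) 2022 Example Agency"},
    }
    print(find_mismatches(sample))
```

A real check would also need to normalise values before comparing them, since some properties (dates in particular) are formatted differently in each standard.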
The intended audience of the document is those managing the use of photo metadata in businesses and the makers of software that handles photo metadata. The IPTC Photo Metadata Mapping Guidelines document can be accessed on the iptc.org website. We encourage IPTC members to provide feedback through the usual channels, and non-members to respond with feedback and questions on the public IPTC Photo Metadata email discussion group.
Next Thursday 10th March, IPTC members will be presenting a webinar on IPTC Media Topics and Wikidata. It will be held in association with the European Broadcasting Union as part of the EBU Wikidata Workshop.
The webinar is part of our series of “member-to-member” webinars, but as this is a special event in conjunction with EBU, attendance is open to the public.
The IPTC component of the workshop features Jennifer Parrucci of The New York Times, lead of the IPTC NewsCodes Working Group which manages the Media Topics vocabulary, and Managing Director of IPTC Brendan Quinn, introducing Media Topics and how they can be used with Wikidata. Then Tor Kristian Flage of Norwegian agency NTB and Gustav Carlberg of vendor and IPTC member iMatrics will present on their recent project to integrate IPTC Media Topics and Wikidata into their newsroom workflow.
Other speakers at the workshop on March 10th include France TV, RAI Italy, YLE Finland, Gruppo RES, Media Press and Perfect Memory.
The IPTC has an ongoing project to help the news and media industry deal with content credibility and provenance. As part of this, we have started working with Project Origin, a consortium of news and technology organisations who have come together to fight misinformation through the use of content provenance technologies.
On Tuesday 22nd February, Managing Director of IPTC Brendan Quinn spoke on a panel at an invite-only Executive Briefing event attended by leaders from news organisations around the world.
Other speakers at the event included Marc Lavallee, Head of R&D for The New York Times, Pascale Doucet of France Télévision, Eric Horvitz of Microsoft Research, Andy Parsons of Adobe, and Laura Ellis, Jamie Angus and Jatin Aythora of the BBC.
The event marks the beginning of the next phase of the industry’s work on content credibility. C2PA has now delivered the 1.0 version of its spec, so the next phase of the work is for the news industry to get together to create best practices around implementing it in news workflows.
IPTC and Project Origin will be working together with stakeholders from all parts of the news industry to establish guidelines for making provenance work in a practical way across the entire news ecosystem.
We have just released a small update to the Media Topics controlled vocabulary for news and media content. The changes support the Winter Olympics, which start this week.
The changes are:
- The definition of bobsleigh (medtop:20000854) was changed to reflect the fact that bobsleigh now offers a one-person version (which is incidentally referred to as “monobob”). The new definition is: One, two or four people racing down a course in a sled that consists of a main hull, a frame, two axles and sets of runners. The total time of all heats in a competition is added together to determine the winner.
- Similarly, the definition of freestyle skiing (medtop:20001058) was changed to reflect new events this year. The new definition is: Skiing competitions which, in contrast to alpine skiing, incorporate acrobatic moves and jumps. Events include aerials, halfpipe, slopestyle, ski cross, moguls and big air.
We also took the opportunity to add a term which was recently suggested by ABC Australia and Fourth Estate in the US:
- tsunami (medtop:20001353), child of medtop:20000151 natural disaster – High and powerful ocean waves caused by an underwater land disturbance, such as an earthquake or volcanic eruption, known to cause significant damage and loss when they hit land
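For systems that store Media Topics as `medtop:` codes, the terms touched by this update could be tracked with something like the Python sketch below. The base URI used to expand the `medtop:` prefix and the helper name `qcode_to_uri` are assumptions made for the example; the term codes, names and the parent relationship come from the list above.

```python
# Assumed expansion of the "medtop:" prefix to a full concept URI.
MEDTOP_BASE = "http://cv.iptc.org/newscodes/mediatopic/"

# Terms changed or added in this update, keyed by their codes.
UPDATED_TERMS = {
    "medtop:20000854": "bobsleigh",
    "medtop:20001058": "freestyle skiing",
    "medtop:20001353": "tsunami",
}

# The new term sits under "natural disaster" in the hierarchy.
PARENTS = {"medtop:20001353": "medtop:20000151"}


def qcode_to_uri(qcode: str) -> str:
    """Expand a 'medtop:NNNNNNNN' code into a full URI (hypothetical helper)."""
    prefix, sep, code = qcode.partition(":")
    if prefix != "medtop" or not sep or not code.isdigit():
        raise ValueError(f"not a Media Topic code: {qcode!r}")
    return MEDTOP_BASE + code


if __name__ == "__main__":
    for qcode, name in UPDATED_TERMS.items():
        print(name, qcode_to_uri(qcode))
```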
We would like to thank all Media Topics users and maintainers for their feedback and support.