Recently IPTC has been working with many organisations who are creating solutions for the ongoing problem of misinformation and disinformation in news. We are happy to announce that this work continues through IPTC’s liaison relationship with C2PA, the Coalition for Content Provenance and Authenticity.
C2PA was created to unify the efforts of the Adobe-led Content Authenticity Initiative (CAI), which focuses on systems to provide context and history for digital media, and Project Origin, a Microsoft- and BBC-led initiative that tackles disinformation in the digital news ecosystem. C2PA creates technical standards for certifying the source and history (or provenance) of media content.
The IPTC has been working with both the Content Authenticity Initiative and Project Origin in recent years. Andy Parsons from CAI presented at the IPTC Photo Metadata Conference in 2020. IPTC members who are also members of CAI and/or Project Origin include Adobe, BBC, CBC/Radio Canada and The New York Times.
IPTC and C2PA have agreed to share information and allow each organisation to attend the other’s meetings in the areas of technical specifications of file formats, particularly around image and video files; to share knowledge and expertise around newsroom practices and workflows; and to collaborate in the areas of content syndication and distribution.
The IPTC Photo Metadata Standard is widely used by photographers, photo agencies and other photo suppliers around the world. To help them use it properly, IPTC publishes a detailed specification document.
Now, we have released a machine-readable version of the spec that can be consumed directly by software tools.
We call it the IPTC Photo Metadata TechReference. (See below for direct links to the data files.)
The TechReference is a data object containing all the details of the IPTC Photo Metadata technical specifications in the easy-to-use JSON and YAML formats.
The file covers all IPTC properties and structures.
For each property, we specify:
- the property’s formal name;
- corresponding identifiers in the ISO XMP and the IPTC IIM formats, if applicable;
- the property’s datatype, such as string, number or a custom property structure like Location;
- the property identifier that can be used with ExifTool to read or write the metadata property (such as “XMP-dc:creator” for XMP or “IPTC:Creator” for IIM);
- … and a few more details.
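As a rough illustration of how a tool might consume such a file, here is a minimal Python sketch. The key names used here (`properties`, `xmp_id`, `exiftool_xmp`, `exiftool_iim`) and the nesting are assumptions made for this example only; the published TechReference files may be organised differently, so check the documentation before relying on any of them.

```python
import json

# Hypothetical excerpt: the real TechReference may use different keys and nesting.
SAMPLE_TECHREF = """
{
  "properties": {
    "creator": {
      "name": "Creator",
      "xmp_id": "dc:creator",
      "datatype": "string",
      "exiftool_xmp": "XMP-dc:creator",
      "exiftool_iim": "IPTC:Creator"
    }
  }
}
"""


def exiftool_tags(techref, prop_key):
    """Return the ExifTool identifiers recorded for one property, XMP first."""
    prop = techref["properties"][prop_key]
    # A property may have no IIM counterpart, so only return keys that exist.
    return [prop[key] for key in ("exiftool_xmp", "exiftool_iim") if key in prop]


techref = json.loads(SAMPLE_TECHREF)
print(exiftool_tags(techref, "creator"))
```

The same lookup would work against the YAML version once it is loaded into a dictionary.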
We have also published rich documentation about the TechReference data object on the IPTC website. The data objects themselves can be downloaded from the IPTC site by both IPTC members and other interested parties.
In time for the 2020 Summer Olympics, soon to be held in Tokyo, Japan (in 2021), we have released a new version of the Media Topics vocabulary covering all Olympic sports.
As many MediaTopics users don’t use the sports facets system, we wanted to make sure that the top-level Olympic and Paralympic sports were all represented in the main Media Topics vocabulary.
To make this possible, we have made the following changes:
New and changed labels and definitions for Olympics and Paralympics
We have added the following new sport concepts, all under Competition Discipline:
- medtop:20001329 3×3 basketball
- medtop:20001330 canoe slalom
- medtop:20001331 canoe sprint
- medtop:20001332 BMX freestyle
- medtop:20001333 road cycling
- medtop:20001334 track cycling
- medtop:20001335 football 5-a-side
- medtop:20001336 goalball
We have “unretired” the following term, which was retired in 2017:
- medtop:20001077 marathon swimming
We have moved the following term:
- medtop:20000887 sport climbing to under “competition discipline”
In another major update we have added labels in French, Spanish and Arabic for most recently-added terms. Thanks to Anne Raynaud and her team at Agence France-Presse (AFP) for this contribution.
Another small change is that in the HTML tree view, we now mark retired concepts more clearly by visually striking out their labels and definitions.
We always welcome feedback on IPTC MediaTopics and the other NewsCodes vocabularies on the public discussion list email@example.com.
IPTC’s Video Metadata Working Group is happy to announce that the first version of the IPTC Video Metadata Hub Generator tool has been released. It can be used to create IPTC Video Metadata Hub records without any knowledge of the underlying technical metadata schema.
The Video Metadata Hub tool serves as a demonstrator to show how easy it could be to enter metadata for a video using the Video Metadata Hub common video metadata schema. It illustrates the power of Video Metadata Hub to video architects, digital asset managers and developers of video software and systems.
How do I use the Video Metadata Hub Generator?
To use the tool, simply start typing text into the fields in the form on the left-hand side of the screen. The right-hand side will automatically update, showing a JSON version of the VMHub data according to the IPTC Video Metadata Hub JSON schema.
Because one of the features of IPTC Video Metadata Hub is its rich set of mappings to other well-known video formats, we will be adding other output formats such as XML (NewsML-G2), EBUCore, XMP and EIDR.
What can I do with the output?
The resulting JSON file can be used to supply data to IT systems. Alternatively, the generated JSON file can be saved alongside your video assets as a “sidecar”. This usage is explained in the section of the IPTC Video Metadata Hub User Guide called “Using Video Metadata Hub with your video content”.
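A sidecar file can be written with a few lines of Python. This is only a sketch: the naming convention (same file name with a `.json` extension) and the field names in the sample record are assumptions for illustration, not taken from the Video Metadata Hub schema or the User Guide.

```python
import json
from pathlib import Path


def write_sidecar(video_path, metadata):
    """Save metadata as JSON next to the video, e.g. clip.mp4 -> clip.json.

    The naming convention is an assumption; follow the User Guide if it
    recommends a different one.
    """
    sidecar = Path(video_path).with_suffix(".json")
    sidecar.write_text(
        json.dumps(metadata, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return sidecar


# Hypothetical field names; use the names produced by the generator instead.
example_record = {"title": "Harbour at dawn", "description": "Wide shot of the port"}
```

Calling `write_sidecar("clip.mp4", example_record)` would create `clip.json` in the same folder as the video.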
In the future, we hope that Video Metadata Hub properties will be built into many video editing tools and digital asset management systems, along with a common way of storing the metadata properties embedded into video files. When this has happened, users will be able to fill out standardised metadata fields in one tool and then view the entered metadata when loading that video file into another tool.
The current version of the Video Metadata Hub Generator shows only a small subset of the 91 Video Metadata Hub fields. In the future, we aim to add a control that lets users specify their use case (for example “video archives” or “news agency”) so that all of the relevant fields for that use case are displayed.
We are very interested in feedback from users. Join the conversation about this tool on the public iptc-videometadata email discussion group.
It is with mixed feelings that we say farewell to Stéphane Guérillot, our friend and colleague and Agence France-Presse’s Delegate to IPTC for 40 years. On one hand, we are happy that it is time for him to start a well-deserved retirement. On the other, we are sad that we will no longer see his big smile or hear his always thoughtful and relevant comments and ideas. Stéphane contributed greatly to our Working Groups and to the Standards Committee, which he chaired for many years.
Over this time, Stéphane has contributed to almost all of IPTC’s standards, including NewsML and NewsML-G2, IPTC Photo Metadata and its predecessor IIM, NewsCodes and more.
What’s more, Stéphane contributed as a Board Member and Treasurer of IPTC for many years, was Chairman of IPTC from 2005 to 2011, and hosted several IPTC Meetings in France. As the main Member Delegate for Agence France-Presse and CEO of AFP’s technology subsidiary AFP Blue, Stéphane was key in driving AFP’s adoption of IPTC standards, which continues to this day.
We wish Stéphane and his family all the best.
At the recent IPTC Spring Meeting, we surprised Stéphane with a special session remembering his time with IPTC. Many previous members were invited to attend to pass along their congratulations.
Unfortunately, this time we could only say farewell over Zoom. But we are very happy to announce that in recognition of his work, the IPTC Board has granted lifetime Honorary Membership to Stéphane, which means that he is entitled to come to future meetings.
So hopefully we can say au revoir et félicitations in person some time soon!
Last week, Brendan Quinn and Jennifer Parrucci presented about IPTC NewsCodes at the EBU’s Metadata Developer Network workshop.
Brendan Quinn of IPTC and Jennifer Parrucci of The New York Times present IPTC’s NewsCodes vocabularies, describing what they are, how they are maintained and how they can be used, with a look into the future. The talk includes a focus on IPTC MediaTopics, our leading vocabulary for topics of news content. Originally presented at the EBU Metadata Developer Network workshop, held online from 25 – 27 May 2021.
The full presentation slides are embedded below. A video recording of the session, including questions and answers, is available to EBU members via the EBU MDN website.
We are happy to announce a new version of the popular NewsML-G2 generator tool.
This version is easier to use, and shows how NewsML-G2 files can be created using either QCodes or URIs for controlled values. It also allows the user to select the body text format – either NITF (IPTC’s News Industry Text Format) or XHTML. Both formats are used by large news agencies to distribute news content, so using the selector can help you to see the difference between the two formats and perhaps help you to make a decision about which format to use.
The new version of the NewsML-G2 Generator can be accessed at the same URL as the older version: https://iptc.org/std/NewsML-G2/generator/.
Using the NewsML-G2 Generator
To use the generator, simply start typing into the form on the left side of the screen. The grey box on the right-hand side will immediately update with the relevant XML markup to represent your content in NewsML-G2 format.
The selectors above the output box on the right-hand side allow you to change the output format:
- QCodes vs URIs: Metadata values such as “itemClass” can be expressed either using IPTC’s QCodes format, or by URIs. So for example the “item type” can be expressed as a QCode (<itemClass qcode="ninat:graphic"/>) or a URI (<itemClass uri="http://cv.iptc.org/newscodes/ninat/graphic"/>). This radio button allows you to switch between the two formats for all controlled values in your NewsML-G2 file.
- XHTML vs NITF: NewsML-G2 describes how the metadata around a news item should be delivered, but the actual content of a text news item must be expressed in another format. Two options expressed here are NITF, IPTC’s News Industry Text Format, and XHTML. This radio button changes the <contentSet/> section to include an embedded XML document in either XHTML or NITF format.
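The relationship between the QCode and URI forms can be sketched in Python. The scheme table below is built only from examples that appear in these posts; in a real NewsML-G2 document the scheme aliases are declared by the document's catalog, and elements live in the NewsML-G2 XML namespace, which this simplified sketch leaves out. The function names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Alias-to-URI table assumed from the examples in this article; a real
# document resolves aliases through its own catalog.
SCHEME_URIS = {
    "ninat": "http://cv.iptc.org/newscodes/ninat/",
    "medtop": "http://cv.iptc.org/newscodes/mediatopic/",
}


def qcode_to_uri(qcode):
    """Expand 'alias:code' into a full concept URI."""
    alias, sep, code = qcode.partition(":")
    if not sep or not code or alias not in SCHEME_URIS:
        raise ValueError(f"cannot resolve QCode: {qcode!r}")
    return SCHEME_URIS[alias] + code


def item_class(qcode, use_uri=False):
    """Serialise an itemClass element in either QCode or URI style."""
    elem = ET.Element("itemClass")  # namespace omitted for brevity
    if use_uri:
        elem.set("uri", qcode_to_uri(qcode))
    else:
        elem.set("qcode", qcode)
    return ET.tostring(elem, encoding="unicode")


print(item_class("ninat:graphic"))
print(item_class("ninat:graphic", use_uri=True))
```

Keeping the expansion in one function means a tool can store QCodes internally and switch output style with a single flag, much as the radio button does.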
More enhancements to come
Currently, the generator tool handles simple text news stories. Ideas for future enhancements are to include support for images, audio and video, packages of multiple news items possibly in multiple formats, the partMeta framework to include metadata about part of a news item, and more. Suggestions are always welcome – please contact us if you have any further ideas.
This week IPTC hosted its Spring Meeting. We’re getting used to the online format now and it worked very well once again! We had over 70 attendees this time, from IPTC member organisations and invited guests.
Day One included a focus on accessibility, with Jeanne Spellman of the W3C Silver Task Force giving a preview of the work towards the next version of WCAG, version 3. Jeanne described how the focus is shifting from “all or nothing” compliance to a graded score, with a fairer approach to multiple disabilities rather than the current focus on only a few conditions.
This dovetailed well with Caroline Desrosiers‘ talk about her company (and recently joined IPTC Startup Member) Scribely, which provides image description services to e-commerce and photography companies.
Monday also included a session from Michael Steidl, now co-lead of the Photo Metadata Working Group along with David Riecks. Michael gave a detailed history lesson on metadata embedded in photos and the various ways that image metadata is used in different image formats. The knowledge came in handy on Wednesday when we looked at the detail of how trust metadata is embedded in images.
Pam Fisher, lead of the Video Metadata Working Group, discussed the WG’s recent work updating the Video Metadata Hub User Guide, looking at new embedded metadata formats, and looking at how to promote the use of Video Metadata Hub as a standardised set of video metadata fields in any formats and tools.
Day Two saw a focus on Knowledge Graphs and Semantic Technology, a growing topic of interest in newsrooms and media organisations around the world. We saw presentations on real-world implementations of knowledge graphs from Stuart Jennings of the BBC; Pia Virtanen of YLE, the Finnish national broadcaster; Ridho Reinanda of Bloomberg, and Manfred Mitterholzer of APA, the Austrian national news agency.
Silver Oliver of consultancy Data Language shared some lessons learned from working with linked data and semantic technologies for 10 years including his work on the BBC Sport Ontology. This led to an update from Paul Kelly on the progress of the IPTC Sports Content Working Group‘s ongoing work on making a semantic web version of our SportsML standard, with help from Silver and others.
Day Three started with an update from the News in JSON Working Group (from WG Lead Johan Lindgren of TT), looking at work towards ninjs 2.0, including Protocol Buffers compatibility and some new fields for rights management. In the NewsML-G2 and News Architecture Working Group update, WG Lead Dave Compton of Refinitiv discussed work on making NewsML-G2 easier to understand, including a soon-to-be-revealed new version of the NewsML-G2 Generator tool.
After Linda Burman presented the recent work of the IPTC Public Relations Committee, we heard from three trust and credibility projects. WeVerify, presented by Denis Tayssou of AFP, is an EU project creating a toolkit for forensic analysis of websites and images that can be used by fact checkers. WordProof, presented by its founder Sebastiaan van der Lans, is a blockchain-based verified time-stamping system that can be used to show when a piece of content was first created. Finally, C2PA (the Coalition for Content Provenance and Authenticity) was presented by technical working group lead Leonard Rosenthal of Adobe. C2PA is working on the technical details underpinning the Content Authenticity Initiative, which we have heard about before, so it was great to learn more about the nuts and bolts of how it is planned to work.
Jennifer Parrucci of The New York Times, lead of the NewsCodes Working Group, presented the WG’s latest work, including last week’s update to NewsCodes (including Media Topics), and looked ahead to future work on supporting new languages, integrating more closely with Wikidata, and explaining how users of Media Topics can extend the vocabulary to include their own terms. Then Kurt Mathiasen of TV2 Danmark discussed his organisation’s use of IPTC Media Topics in their system workflows, and the challenges of their plans to use more industry standards such as IPTC’s News Architecture. The aim is to join up metadata that is distributed between third-party systems and currently must be re-keyed or cut-and-pasted from one system to another.
The Spring Meeting ended with a surprise for Stéphane Guérillot, chair of the Standards Committee. He thought he was going to be chairing a meeting, but instead we introduced many past IPTC member delegates as guest attendees and presented a slideshow of some of his history over his amazing 40-year membership of IPTC! We all value and appreciate the work Stéphane has put into IPTC over his tenure as working group lead, board member and Chair of the Board, Standards Committee Chair, and Treasurer. When Stéphane retires at the end of June, he will be sorely missed, although he is welcome back any time because, as current Chair Robert Schmidt-Nia announced, the Board has agreed to make Stéphane an Honorary Member of IPTC. Congratulations, Stéphane!
We are pleased to announce the latest release of IPTC NewsCodes, including our main subject vocabulary for news content, IPTC MediaTopics.
This update includes:
New Media Topics terms
The new terms were requested by MediaTopics users Ritzau in Denmark, NTB in Norway and AFP in France.
- drowning (https://cv.iptc.org/newscodes/mediatopic/20001321)
- men (https://cv.iptc.org/newscodes/mediatopic/20001328)
- poisoning (https://cv.iptc.org/newscodes/mediatopic/20001322)
- sports coaching (https://cv.iptc.org/newscodes/mediatopic/20001323)
- sports management and ownership (https://cv.iptc.org/newscodes/mediatopic/20001324)
- sports officiating (https://cv.iptc.org/newscodes/mediatopic/20001325)
- torture (https://cv.iptc.org/newscodes/mediatopic/20001320)
- women (https://cv.iptc.org/newscodes/mediatopic/20001327)
- women’s rights (https://cv.iptc.org/newscodes/mediatopic/20001326)
Retired Media Topics terms
- accomplishment (https://cv.iptc.org/newscodes/mediatopic/20000497). Use award and prize (20000498) or record and achievement (20000499) instead.
- people (https://cv.iptc.org/newscodes/mediatopic/20000502). Use more specific terms instead.
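Systems that store Media Topics codes can use these retirement notes to flag content that needs re-tagging. The following Python sketch hard-codes only the two retirements listed above; the table layout and function name are assumptions for illustration, and a production system would more likely read retirement status from the machine-readable vocabulary.

```python
# Retired code -> (label, suggested replacement codes), from the list above.
# An empty list means "use more specific terms" with no single replacement.
RETIRED_MEDIA_TOPICS = {
    "20000497": ("accomplishment", ["20000498", "20000499"]),
    "20000502": ("people", []),
}


def find_retired(codes):
    """Map each retired code found in `codes` to its suggested replacements."""
    report = {}
    for code in codes:
        if code in RETIRED_MEDIA_TOPICS:
            _label, replacements = RETIRED_MEDIA_TOPICS[code]
            report[code] = replacements
    return report


# An item tagged with "accomplishment" and the new "drowning" term.
print(find_retired(["20000497", "20001321"]))
```

Because "accomplishment" has two candidate replacements, the choice between them still needs an editorial decision; the sketch only surfaces the options.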
Label changes to Media Topics
Please note that we only ever change labels to make the meaning clearer; we never change the meaning of a term.
- transfer -> sports transaction (http://cv.iptc.org/newscodes/mediatopic/20001148)
- minister (government) -> minister and secretary (government) (http://cv.iptc.org/newscodes/mediatopic/20000613)
- “athletics, track & field” -> “athletics” in en-GB and “track and field” in en-US (http://cv.iptc.org/newscodes/mediatopic/20000827)
- plant -> flowers and plants (http://cv.iptc.org/newscodes/mediatopic/20000507)
- imperial and royal matters -> royalty (http://cv.iptc.org/newscodes/mediatopic/20000506)
Media Topics hierarchy moves
- award and prize (20000498) and record and achievement (20000499) were moved to the top level “human interest” term because we retired the parent term “accomplishment”.
- birthday (20001238), celebrity (20000505), high society (20000504) and human mishap (20000503) were moved to the top level “human interest” term because we retired the parent term “people”.
Definition changes in Media Topics
- Changes under “human interest” branch: animal (20000500), anniversary (20001237), award and prize (20000498), ceremony (20000501), funeral and memorial service (20001235), wedding (20001236), birthday (20001238)
- Grammar fixes in en-GB and en-US descriptions for 20000037, 03000000, 20000140, 20000215, 20000228, 20000279, 20000321, 20000327, 20000390, 20000426, 20001229, 20001220, 20000504, 20000339, 20000571, 20000575, 20000590, 20000591, 20000600, 20000604, 20000619, 20000630, 20000658, 20000852
Changes to mappings from MediaTopics to other vocabularies
We carried out a major review of MediaTopic to Wikidata mappings. Thanks to Lucy Butcher from Wirecutter (part of The New York Times, an IPTC member) for her contributions. Many terms have had their Wikidata mappings edited or added. In the near future, we are planning to add mappings from Wikidata back to NewsCodes.
Changes to other NewsCodes vocabularies
The Genre vocabulary had a major update: the second half of the review that was started in the February release.
New Genre terms:
- Live Coverage (http://cv.iptc.org/newscodes/genre/LiveCoverage)
- Preview (http://cv.iptc.org/newscodes/genre/Preview)
Retired Genre terms:
- Scener (https://cv.iptc.org/newscodes/genre/Scener) – Use From the Scene instead
- Text only (https://cv.iptc.org/newscodes/genre/Text_only) – Use Transcript and Verbatim instead
- Update (https://cv.iptc.org/newscodes/genre/Update) – Use Synopsis or Briefing instead
- Wrap (https://cv.iptc.org/newscodes/genre/Wrap) – Use Synopsis or Briefing instead
- Wrapup (https://cv.iptc.org/newscodes/genre/Wrapup) – Use Synopsis or Briefing instead
Label (and definition) changes:
- Daybook -> Planner (https://cv.iptc.org/newscodes/genre/Daybook)
- Listing of Facts -> Fact Box (https://cv.iptc.org/newscodes/genre/ListingOfFacts)
- Summary -> Briefing (https://cv.iptc.org/newscodes/genre/Summary)
Definition changes for: Biography, Birth Announcement, Curtain Raiser, Exclusive, Feature, Fixture, Forecast, From the Scene, Interview, Music, Obituary, Opinion, Polls and Surveys, Press Release, Press-Digest, Profile, Program, Question and Answer Session, Quote, Raw Sound, Response to a Question, Results Listings and Statistics, Retrospective, Review, Side bar and Supporting Information, Special Report, Synopsis.
As usual, all changes can be seen:
- Directly on the CV server at http://cv.iptc.org/
- In machine-readable form using SKOS, NewsML-G2 and NewsML 1 (see the cv.iptc.org guidelines for details)
- HTML tree view at https://www.iptc.org/std/NewsCodes/treeview/mediatopic/mediatopic-en-GB.html
- As an interactive diagram on http://show.newscodes.org/
- In Excel format from https://www.iptc.org/std/NewsCodes/IPTC-MediaTopic-NewsCodes.xlsx
Please let us know if you spot any problems. If you are an IPTC member you can post issues, questions and suggestions to the NewsCodes Working Group list at firstname.lastname@example.org.
Brendan Quinn, Managing Director of IPTC, spoke on 20 April 2021 at the regular meeting of the W3C Text and Data Mining Reservation Protocol Community Group.
The Community Group, open to anyone to join, is discussing how to “facilitate a technical protocol to reserve a publisher’s right for content to be made available for text and data mining (TDM). The solution should be capable of expressing the reservation of TDM rights – following the rules set by Article 4 of the new European DSM Directive – and the availability of machine-readable licenses for TDM actors.”
The Community Group is looking at various technologies for representing machine-readable licences, and Brendan presented IPTC’s RightsML as a possible option. Based on W3C’s ODRL, RightsML allows rights holders to specify permissions, prohibitions and constraints on usage of all types of media content, so it may be a good candidate for representing rights around data mining.
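To give a feel for what a machine-readable policy of this kind might look like, here is a rough Python sketch that builds an ODRL-style policy as JSON. The overall shape (a policy with `permission` and `prohibition` rules, each naming a `target` and an `action`) follows the ODRL information model, but everything else is an assumption: the example.com identifiers are placeholders, the text-and-data-mining action URI is hypothetical, and a real RightsML policy may differ in detail.

```python
import json

# Hypothetical action identifier for text and data mining; not a published term.
TDM_ACTION = "https://example.com/actions/text-and-data-mining"


def build_policy(asset_uri):
    """Build a policy that permits display of an asset but prohibits TDM."""
    return {
        "@context": "http://www.w3.org/ns/odrl.jsonld",
        "@type": "Set",
        "uid": "https://example.com/policy/1",  # placeholder identifier
        "permission": [{"target": asset_uri, "action": "display"}],
        "prohibition": [{"target": asset_uri, "action": TDM_ACTION}],
    }


policy = build_policy("https://example.com/articles/123")
print(json.dumps(policy, indent=2))
```

Constraints, such as a time limit or a geographic restriction, would be attached to individual rules in the same structure.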
Laurent Le Meur, Chair of the TDM Reservation Protocol Community Group and previous contributor to IPTC, presented at the IPTC Autumn Meeting in 2020 to discuss the proposed project.