Last week, Brendan Quinn and Jennifer Parrucci gave a presentation on IPTC NewsCodes at the EBU’s Metadata Developer Network workshop.

Brendan Quinn of IPTC and Jennifer Parrucci of The New York Times present IPTC’s NewsCodes vocabularies, describing what they are, how they are maintained and how they can be used, and taking a look into the future. The presentation includes a focus on IPTC MediaTopics, our leading vocabulary for topics of news content. It was originally presented at the EBU Metadata Developer Network workshop, held online from 25 – 27 May 2021.

The full presentation slides are embedded below. A video recording of the session, including questions and answers, is available to EBU members via the EBU MDN website.

We are happy to announce a new version of the popular NewsML-G2 generator tool.

This version is easier to use, and shows how NewsML-G2 files can be created using either QCodes or URIs for controlled values. It also allows the user to select the body text format – either NITF (IPTC’s News Industry Text Format) or XHTML. Both formats are used by large news agencies to distribute news content, so using the selector can help you to see the difference between the two formats and perhaps help you to make a decision about which format to use.

The new version of the NewsML-G2 Generator can be accessed at the same URL as the older version: https://iptc.org/std/NewsML-G2/generator/.

Using the NewsML-G2 Generator

To use the generator, simply start typing into the form on the left side of the screen. The grey box on the right-hand side will immediately update with the relevant XML markup to represent your content in NewsML-G2 format.

The selectors above the output box on the right-hand side allow you to change the output format:

  • QCodes vs URIs: Metadata values such as “itemClass” can be expressed either using IPTC’s QCodes format, or by URIs. So for example the “item type” can be expressed as a QCode (<itemClass qcode="ninat:graphic"/>) or a URI (<itemClass uri="http://cv.iptc.org/newscodes/ninat/graphic"/>). This radio button allows you to switch between the two formats for all controlled values in your NewsML-G2 file.
  • XHTML vs NITF: NewsML-G2 describes how the metadata around a news item should be delivered, but the actual content of a text news item must be expressed in another format. Two options expressed here are NITF, IPTC’s News Industry Text Format, and XHTML. This radio button changes the <contentSet/> section to include an embedded XML document in either XHTML or NITF format.
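
The relationship between the two forms of a controlled value can be sketched in a few lines of Python. This is a minimal illustration and not part of the generator: the function names and the alias table are hypothetical, and the sketch assumes that a QCode expands to a URI by replacing the scheme alias and colon with the scheme's base URI, as the itemClass example above suggests.

```python
# Hypothetical alias table for illustration only. In a real NewsML-G2 item
# the aliases are resolved through the catalog that the item references.
SCHEME_ALIASES = {
    "ninat": "http://cv.iptc.org/newscodes/ninat/",
}


def qcode_to_uri(qcode: str) -> str:
    """Expand a QCode such as 'ninat:graphic' into its full URI."""
    alias, sep, code = qcode.partition(":")
    if not sep or alias not in SCHEME_ALIASES:
        raise ValueError(f"Unknown or malformed QCode: {qcode!r}")
    return SCHEME_ALIASES[alias] + code


def uri_to_qcode(uri: str) -> str:
    """Shorten a full URI back into a QCode using the known aliases."""
    for alias, base in SCHEME_ALIASES.items():
        if uri.startswith(base):
            return f"{alias}:{uri[len(base):]}"
    raise ValueError(f"No known scheme alias for URI: {uri!r}")


def item_class_markup(qcode: str, use_uri: bool) -> str:
    """Render an itemClass element in either of the two styles."""
    if use_uri:
        return f'<itemClass uri="{qcode_to_uri(qcode)}"/>'
    return f'<itemClass qcode="{qcode}"/>'


print(item_class_markup("ninat:graphic", use_uri=False))
print(item_class_markup("ninat:graphic", use_uri=True))
```

Keeping the alias table in one place mirrors the idea that both styles identify the same concept; only the notation in the file changes.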

More enhancements to come

Currently, the generator tool handles simple text news stories. Ideas for future enhancements are to include support for images, audio and video, packages of multiple news items possibly in multiple formats, the partMeta framework to include metadata about part of a news item, and more. Suggestions are always welcome – please contact us if you have any further ideas.

This week IPTC hosted its Spring Meeting. We’re getting used to the online format now and it worked very well once again! We had over 70 attendees this time, from IPTC member organisations and invited guests.

On Tuesday, Ridho Reinanda spoke about Bloomberg’s powerful knowledge graph covering the finance industry.

Day One included a focus on accessibility, with Jeanne Spellman of the W3C Silver Task Force giving a preview of the work towards the next version of WCAG, version 3. Jeanne described how the focus is shifting from “all or nothing” compliance to a graded score, with a fairer approach to multiple disabilities rather than the current focus on only a few conditions.

This dovetailed well with Caroline Desrosiers’ talk about her company Scribely, a recently joined IPTC Startup Member that provides image description services to e-commerce and photography companies.

Monday also included a detailed session from Michael Steidl, now co-lead of the Photo Metadata Working Group along with David Riecks, who gave a detailed history lesson on metadata embedded in photos and the various ways that image metadata is used in different image formats. The knowledge came in handy on Wednesday when we looked at the detail of how trust metadata is embedded in images.

Pam Fisher, lead of the Video Metadata Working Group, discussed the WG’s recent work updating the Video Metadata Hub User Guide, looking at new embedded metadata formats, and looking at how to promote the use of Video Metadata Hub as a standardised set of video metadata fields in any formats and tools.

Day Two saw a focus on Knowledge Graphs and Semantic Technology, a growing topic of interest in newsrooms and media organisations around the world. We saw presentations on real-world implementations of knowledge graphs from Stuart Jennings of the BBC; Pia Virtanen of YLE, the Finnish national broadcaster; Ridho Reinanda of Bloomberg; and Manfred Mitterholzer of APA, the Austrian national news agency.

Silver Oliver of consultancy Data Language shared some lessons learned from working with linked data and semantic technologies for 10 years, including his work on the BBC Sport Ontology. This led to an update from Paul Kelly on the progress of the IPTC Sports Content Working Group’s ongoing work on making a semantic web version of our SportsML standard, with help from Silver and others.

Day Three started with an update from the News in JSON Working Group (from WG Lead Johan Lindgren of TT), looking at work towards ninjs 2.0, including Protocol Buffers compatibility and some new fields for rights management. In the NewsML-G2 and News Architecture Working Group update, WG Lead Dave Compton of Refinitiv discussed work on making NewsML-G2 easier to understand, including a soon-to-be-revealed new version of the NewsML-G2 Generator tool.

After Linda Burman presented the recent work of the IPTC Public Relations Committee, we heard from three trust and credibility projects. WeVerify, presented by Denis Teyssou of AFP, is an EU project creating a toolkit for forensic analysis of websites and images that can be used by fact checkers. WordProof, presented by its founder Sebastiaan van der Lans, is a blockchain-based verified time-stamping system that can be used to show when a piece of content was first created. Finally, C2PA (the Coalition for Content Provenance and Authenticity) was presented by technical working group lead Leonard Rosenthal of Adobe. C2PA is working on the technical details underpinning the Content Authenticity Initiative, which we have heard about before, so it was great to learn more about the nuts and bolts of how it is planned to work.

Jennifer Parrucci of The New York Times, lead of the NewsCodes Working Group, presented the WG’s latest work, including last week’s update to NewsCodes (including Media Topics), and looked ahead to future work on supporting new languages, integrating more closely with Wikidata, and explaining how users of Media Topics can extend the vocabulary to include their own terms. Then Kurt Mathiasen of TV2 Danmark discussed his organisation’s use of IPTC Media Topics in their system workflows, and the challenges of their plans to use more industry standards such as IPTC’s News Architecture as a way to join up the metadata that is distributed between third-party systems and currently must be re-keyed or cut-and-pasted from one system to another.

The Spring Meeting ended with a surprise for Stéphane Guérillot, chair of the Standards Committee. He thought he was going to be chairing a meeting, but instead we introduced many past IPTC member delegates as guest attendees and presented a slideshow of some of his history over his amazing 40-year membership of IPTC! We all value and appreciate the work Stéphane has put into IPTC over his tenure as working group lead, board member and Chair of the Board, Standards Committee Chair, and Treasurer. When Stéphane retires at the end of June, he will be sorely missed, although he is welcome back any time, because as current Chair Robert Schmidt-Nia announced, the Board has agreed to make Stéphane an Honorary Member of IPTC. Congratulations, Stéphane!

extract from IPTC MediaTopics Feb 2021

We are pleased to announce the latest release of IPTC NewsCodes, including our main subject vocabulary for news content, IPTC MediaTopics.

This update includes:

New Media Topics terms

The new terms were requested by MediaTopics users Ritzau in Denmark, NTB in Norway and AFP in France.

  • drowning (https://cv.iptc.org/newscodes/mediatopic/20001321)
  • men (https://cv.iptc.org/newscodes/mediatopic/20001328)
  • poisoning (https://cv.iptc.org/newscodes/mediatopic/20001322)
  • sports coaching (https://cv.iptc.org/newscodes/mediatopic/20001323)
  • sports management and ownership (https://cv.iptc.org/newscodes/mediatopic/20001324)
  • sports officiating (https://cv.iptc.org/newscodes/mediatopic/20001325)
  • torture (https://cv.iptc.org/newscodes/mediatopic/20001320)
  • women (https://cv.iptc.org/newscodes/mediatopic/20001327)
  • women’s rights (https://cv.iptc.org/newscodes/mediatopic/20001326)

Retired Media Topics terms

  • accomplishment (https://cv.iptc.org/newscodes/mediatopic/20000497). Use award and prize (20000498) or record and achievement (20000499) instead.
  • people (https://cv.iptc.org/newscodes/mediatopic/20000502). Use more specific terms instead.
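
For systems that store Media Topics IDs, the retirements above can be handled with a small lookup. The following Python sketch is illustrative only: the table is transcribed by hand from the list above, and the names RETIRED_TERMS and suggest_replacements are hypothetical rather than part of any IPTC tool.

```python
from typing import List

MEDIATOPIC_BASE = "https://cv.iptc.org/newscodes/mediatopic/"

# Retired term ID -> suggested replacement IDs, copied from the list above.
# An empty list means no specific replacement is named, so content tagged
# with that term needs a more specific term chosen by hand.
RETIRED_TERMS = {
    "20000497": ["20000498", "20000499"],  # accomplishment
    "20000502": [],                        # people
}


def suggest_replacements(term_id: str) -> List[str]:
    """Return URIs to use in place of term_id.

    Terms that are not in the retired table are returned unchanged.
    """
    if term_id not in RETIRED_TERMS:
        return [MEDIATOPIC_BASE + term_id]
    return [MEDIATOPIC_BASE + new_id for new_id in RETIRED_TERMS[term_id]]


for term in ("20000497", "20000502", "20001321"):
    print(term, "->", suggest_replacements(term))
```

A retired term with two candidate replacements still needs an editorial decision about which one fits a given story; the lookup only narrows the options.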

Label changes to Media Topics

Please note that we only ever make changes to labels to make the meaning clearer; we never change the meaning of a term.

  • transfer -> sports transaction (http://cv.iptc.org/newscodes/mediatopic/20001148)
  • minister (government) -> minister and secretary (government) (http://cv.iptc.org/newscodes/mediatopic/20000613)
  • “athletics, track & field” -> “athletics” in en-GB and “track and field” in en-US (http://cv.iptc.org/newscodes/mediatopic/20000827)
  • plant -> flowers and plants (http://cv.iptc.org/newscodes/mediatopic/20000507)
  • imperial and royal matters -> royalty (http://cv.iptc.org/newscodes/mediatopic/20000506)

Media Topics hierarchy moves

  • award and prize (20000498) and record and achievement (20000499) were moved to the top level “human interest” term because we retired the parent term “accomplishment”.
  • birthday (20001238), celebrity (20000505), high society (20000504) and human mishap (20000503) were moved to the top level “human interest” term because we retired the parent term “people”.

Definition changes in Media Topics

  • Changes under “human interest” branch: animal (20000500), anniversary (20001237), award and prize (20000498), ceremony (20000501), funeral and memorial service (20001235), wedding (20001236), birthday (20001238)
  • Grammar fixes in en-GB and en-US descriptions for 20000037, 03000000, 20000140, 20000215, 20000228, 20000279, 20000321, 20000327, 20000390, 20000426, 20001229, 20001220, 20000504, 20000339, 20000571, 20000575, 20000590, 20000591, 20000600, 20000604, 20000619, 20000630, 20000658, 20000852

Changes to mappings from MediaTopics to other vocabularies

We carried out a major review of MediaTopic to Wikidata mappings. Thanks to Lucy Butcher from Wirecutter (part of The New York Times, an IPTC member) for her contributions. Many terms have had their Wikidata mappings edited or added. In the near future, we are planning to add mappings from Wikidata back to NewsCodes.

Changes to other NewsCodes vocabularies

The Genre vocabulary had a major update, the second half of the review that was started in the February release.

New Genre terms:

  • Live Coverage (http://cv.iptc.org/newscodes/genre/LiveCoverage)
  • Preview (http://cv.iptc.org/newscodes/genre/Preview)

Retired terms:

  • Scener (https://cv.iptc.org/newscodes/genre/Scener) – use From the Scene instead
  • Text only (https://cv.iptc.org/newscodes/genre/Text_only) – Use Transcript and Verbatim instead
  • Update (https://cv.iptc.org/newscodes/genre/Update) – Use Synopsis or Briefing instead
  • Wrap (https://cv.iptc.org/newscodes/genre/Wrap) – Use Synopsis or Briefing instead
  • Wrapup (https://cv.iptc.org/newscodes/genre/Wrapup) – Use Synopsis or Briefing instead

Label (and definition) changes:

  • Daybook -> Planner (https://cv.iptc.org/newscodes/genre/Daybook)
  • Listing of Facts -> Fact Box (https://cv.iptc.org/newscodes/genre/ListingOfFacts)
  • Summary -> Briefing (https://cv.iptc.org/newscodes/genre/Summary)

Definition changes for: Biography, Birth Announcement, Curtain Raiser, Exclusive, Feature, Fixture, Forecast, From the Scene, Interview, Music, Obituary, Opinion, Polls and Surveys, Press Release, Press-Digest, Profile, Program, Question and Answer Session, Quote, Raw Sound, Response to a Question, Results Listings and Statistics, Retrospective, Review, Side bar and Supporting Information, Special Report, Synopsis.

As usual, all changes can be seen:

Please let us know if you spot any problems. If you are an IPTC member you can post issues, questions and suggestions to the NewsCodes Working Group list at iptc-newscodes-dev@groups.io.

Text and Data Mining Reservation Protocol Community Group home page

Brendan Quinn, Managing Director of IPTC, spoke on 20 April 2021 at the regular meeting of the W3C Text and Data Mining Reservation Protocol Community Group.

The Community Group, open to anyone to join, is discussing how to “facilitate a technical protocol to reserve a publisher’s right for content to be made available for text and data mining (TDM). The solution should be capable of expressing the reservation of TDM rights – following the rules set by Article 4 of the new European DSM Directive – and the availability of machine-readable licenses for TDM actors.”

The Community Group is looking at various technologies for representing machine-readable licences, and Brendan presented IPTC’s RightsML as a possible option. Based on W3C’s ODRL, RightsML allows rights holders to specify permissions, prohibitions and constraints on usage of all types of media content, so it may be a good candidate for representing rights around data mining.

Laurent Le Meur, Chair of the TDM Reservation Protocol Community Group and previous contributor to IPTC, presented at the IPTC Autumn Meeting in 2020 to discuss the proposed project.

We are excited to present to IPTC members the full agenda for the IPTC Spring Meeting 2021, taking place online from Monday May 10th to Wednesday May 12th.

We are honoured to have presentations from IPTC members Adobe, BBC, Agence France-Presse (AFP), The New York Times, Bloomberg, Austria Presse Agentur (APA) and new member Scribely, along with guest presentations from the World Wide Web Consortium (W3C), Data Language, TV2 Denmark, and YLE Finland.

Themes include

  • metadata for content accessibility;
  • knowledge graphs and semantic technologies in news and media; and
  • trust and credibility, including a presentation by Leonard Rosenthal of the new Coalition for Content Provenance and Authenticity (C2PA).

Plus we will have all our regular presentations from our Working Groups in NewsML-G2, Photo Metadata, Video Metadata, NewsCodes (including Media Topics), News in JSON and Sports. We will also have sessions for our Standards Committee and PR Committee.

There will also be some time allocated each day to member networking. While we can’t match the networking opportunities of an in-person meeting, we will be using some new tools to make networking more interesting and approachable for members.

We are also planning to hold a special webinar the week before the meeting, “Introducing knowledge graphs for the media”, so that we can get straight into the interesting content during the member meeting and not spend time introducing the concepts.

All IPTC member organisations are welcome to attend at no cost.

IPTC members can see more information on the Spring Meeting 2021 page in the IPTC Members-Only Zone.

The IPTC Video Metadata Working Group is happy to announce the 1.0 version of the IPTC Video Metadata Hub User Guide.

The guide introduces IPTC’s Video Metadata Hub recommendation and explains how it can be used to solve metadata management problems in any organisation that processes video content, from news agencies to advertising agencies; libraries, galleries and museums; long-form video producers such as broadcasters and movie studios; and stock video services.

As well as explaining the details of each field in the IPTC Video Metadata Hub standard, it shows through a set of use cases how it can be used in a variety of common scenarios to store rights, descriptive and administrative metadata for video content.

Pam Fisher, group lead, and the IPTC’s Video Metadata Working Group welcome feedback on the document. If your organisation handles video content, please read it and let us know what you think and what can be explained better. Comments can be sent via this site’s Contact Us form or to the public Video Metadata Hub discussion list at https://groups.io/g/iptc-videometadata.

The guide can be seen at https://iptc.org/std/videometadatahub/userguide/.

 

We have just released a new version of IPTC NewsCodes, which includes many changes to Media Topics.

This is the first major update since August 2020 (although we released new versions in September and October 2020 to add translations of new terms).

The changes are detailed below:

New translations for Media Topics

After many requests, we have now added an “en-US” language version, based on a contribution by Jeff Brown of Fourth Estate. Thanks Jeff!

Mostly it simply changes British English words to US English, such as “centre”/“center” and “programme”/“program”, but there are a few more substantive changes around cinema / movies and changing “holiday” to “vacation”. Also, where Jeff had suggested changes to definitions, we often changed them for both British and US English.

en-GB will still be the primary language for Media Topics, but we will keep the en-GB and en-US versions in sync as we make changes.

New Media Topics terms

These were suggested by our collaborators from Ritzau via iMatrics, NTB, TT and AFP. Thanks to all.

Please note that the new terms only exist in en-GB and en-US right now; more translations will be added soon.

Update on 15 March: we have now added translations in Danish (thanks to Ritzau and iMatrics), Norwegian (thanks to NTB), Swedish (thanks to TT) and Portuguese for Brazil and Portugal (thanks to Priberam and Lusa).

Update on 12 April: We have now also added Chinese and German translations for these new and updated terms and definitions. Thanks very much to members Xinhua and dpa for their help!

Retired Media Topics terms

  • sports facilities (http://cv.iptc.org/newscodes/mediatopic/20000559 (retired)) – use medtop:20001126 “sport venue” instead
  • inline skating (http://cv.iptc.org/newscodes/mediatopic/20000967 (retired)) – use medtop:20001155 “roller sports” instead

Label changes to Media Topics 

Please note that we only ever make changes to labels to make the meaning clearer; we never change the meaning of a term.

Media Topics hierarchy moves

Definition changes in Media Topics

Changes to other NewsCodes vocabularies

As usual, the changes can be seen:

Please let us know if you spot any problems. If you are an IPTC member you can post issues, questions and suggestions to the NewsCodes Working Group list at iptc-newscodes-dev@groups.io.

We have made it to the end of 2020. And what a year it has been!

A reminder of happier times when we could meet in person – Managing Director Brendan Quinn and IPTC member representatives enjoying dinner at the 2019 Autumn Meeting in Ljubljana, Slovenia. 

The news and media industry has perhaps been affected less than the travel or hospitality industry, but 2020 was still a hugely eventful year for us all professionally and personally. Congratulations on getting through it, and our thoughts go out to those who have suffered in any way this year.

IPTC Events

Of course our member meetings, planned for Tallinn, Estonia and New York, USA this year, quickly became virtual events held via Zoom. It worked surprisingly well, and even allowed us to bring on some speakers and guests who wouldn’t have been able to attend or present if we had held the events physically.

You can look back at our Spring Meeting blog posts (Day 1, Day 2, Day 3) and the summary of our Autumn Meeting.

The IPTC Photo Metadata Conference was very interesting this year: from our usual small room hosted as part of the CEPIC Congress, we went to a virtual event with over 200 attendees. If you missed it, or want to re-visit, videos of the sessions are available on YouTube.

Standards work

The News in JSON Working Group submitted ninjs 1.3 for approval at the Spring Meeting, which added fields for trust indicators and genres, support for different types of headlines and alternative IDs. The ninjs generator, showing how easy it is to create a ninjs document by filling in a web form, was very popular and was the inspiration for some related tools in other working groups. Since then, the working group has been looking at more features to be included in future versions of ninjs. If you handle news in JSON in any way and you haven’t completed our News in JSON survey, please do it now!

The NewsML-G2 Working Group released NewsML-G2 2.29 in July which added some fields required for the trust and credibility project, and a new NewsML-G2 Generator tool based on the ninjs one. The group also participated in the trust and credibility projects described below. The NewsML-G2 specifications and guidelines documents have now been updated to version 2.29.

The Video Metadata Working Group released Video Metadata Hub 1.3 during the summer, which added fields to track the editing of metadata (as opposed to editing the actual video), parent video identifier, and updated the mappings to EBUCore and EIDR. The group is hard at work on promoting Video Metadata Hub and creating more introductory materials to help new users understand VMHub and why it is useful.

The NewsCodes Working Group published three updates this year, in March, June and August, and a new update will be published very soon. The NewsCodes Guidelines document was released this year, and is already proving useful both for those wishing to learn how to use NewsCodes better and for the Working Group to establish clear guidelines about when and how to add new terms. MediaTopics is now available in 11 languages and we have more translations coming!

The Photo Metadata Working Group has been very busy, with the biggest news of the year being that Google now supports IPTC Photo Metadata to display licensor information in search results, including a link back to the image owner’s “licence this image” page. The feature was launched in beta in February and launched fully in August. We have had great take-up so far, and the interest in the Photo Metadata Conference (with over 200 people registered) showed that the industry was very keen to hear about it. We also launched updates to the GetPMD tool to support new schema.org mappings, and browser plugins for Chrome and Firefox to enable easy viewing of embedded IPTC Photo Metadata in photographs on the web.

The Sports Content Working Group has had its collective head down in 2020, re-thinking the data model for sports results, statistics and performances. We have been taking a semantic view, looking at using RDF as the main data model for sports data which can then be serialised into JSON, XML and other formats. The intention is that this will also bring the model closer to schema.org in the future. We have some RDF and semantic web experts on the group who are helping with the modelling, and are taking a use-case based approach to make sure that we’re designing something that’s both useful and usable.

A discussion group “spun out” from the NewsCodes Working Group to consider Named Entities for News. So far we have had a couple of meetings to discuss our thoughts on maintaining vocabularies for named entities such as people, companies and places, and to study different approaches used by IPTC member organisations and non-members.

An ongoing project that spans several working groups is the work on Trust and Credibility. After publishing a draft guidelines document in April and running a webinar in September, we plan to publish a 1.0 version in the new year.

All of our Working Groups are always looking for new participants, so if you’re interested in any of these areas, please consider joining IPTC and taking part in a working group!

IPTC appearances at conferences and in the media

There weren’t many conferences in the first part of the year as everyone adjusted to working remotely, but in the second half of the year IPTC people made quite a few appearances at other conferences and webinars.

In July, Brendan Quinn and Robert Schmidt-Nia spoke about NewsML-G2 at an Arab States Broadcasting Union metadata workshop. In September, Michael Steidl spoke on a panel with Google and Alamy at the Perpignan photojournalism conference about Google’s “Licensable Images” feature, and Brendan Quinn hosted a webinar about our work in trust and credibility.

In October, Pam Fisher and Mark Milstein spoke about Video Metadata Hub at the DMLA conference. In November, Brendan Quinn was invited to give a keynote at the FIBEP World Media Intelligence Congress, speaking to the media monitoring / media intelligence industry, which also uses quite a few IPTC standards.

Also in November, Bill Kasdorf published a column in Publishers Weekly about Media Topics and IPTC Photo Metadata, which raised a lot of interest in the publishing industry. In December, Michael Steidl was invited to present a webinar to IPTC member BVPA about IPTC Photo Metadata.

Membership updates

  • We announced the IPTC Startup Membership category in September, and our first Startup Member to join is IMATAG.
  • DATAGROUP Consulting Services joined as a Voting Member.
  • New Associate Members are CBC / Radio Canada, iMatrics, and DeFodi Images.
  • New Individual Members are Margaret Warren and Alison Sullivan.

We’re very happy to have them all on board and joining in the IPTC community!

Some sad news

It was with great shock that we learned in early November that longstanding member Andy Read of BBC had passed away. He was a key contributor in many areas and his friendliness and enthusiasm will be hugely missed. Rest in peace, friend.

Looking forward

It seems that we have come through the worst 2020 could throw at us and things are looking up for 2021. We are already thinking about 2021’s events and how we can learn from 2020 to improve things for members and friends in 2021.

Best wishes for the holiday season from all of us at IPTC.

PS: If you have any questions or thoughts about how IPTC could help you, or if you are interested in talking about joining IPTC, please contact Managing Director Brendan Quinn at mdirector@iptc.org.

Today we announce the launch of two new browser extensions for viewing IPTC Photo Metadata on web pages.

The GetPMD tool is one of IPTC’s most popular online resources. With the GetPMD tool, users can view the embedded IPTC metadata of any image on the web, whether it was embedded using the IPTC IIM or the ISO XMP format. Until now, however, users have had to copy and paste an image’s URL into the tool, or install a browser “bookmarklet”.

To make that a little bit easier, we have created the IPTC Photo Metadata Inspector, a simple browser extension that currently works with the Google Chrome and Mozilla Firefox browsers.

With the extension installed, a context menu will appear when you right-click on an image anywhere on the Web, with a menu option, “View IPTC Photo Metadata.” If you select that option, you will be taken to getpmd.iptc.org where you can see the embedded metadata for that image.

Example of the IPTC Photo Metadata Inspector extension being used on an image on taz.de.

Please note that the Photo Metadata Inspector only works with simple images: it won’t work with embedded video thumbnails or tweets, for example.

The browser extensions are open source; the code is available from the IPTC’s GitHub repository.

Ideas for fixes and new features are welcome.

If you have feedback, please raise an issue on our GitHub repository, post suggestions to the iptc-photometadata@groups.io public discussion list, or contact us via the form on this site.