All rights reserved
© 2014 IPTC
 

Introduction

The IPTC launched the SportsML project in March 2001 as part of a larger effort to create specialized formats for data of interest to the news industry. Since then, SportsML has evolved to cover a growing number of sports. In 2008, version 2.0 joined the family of IPTC G2-Standards.

Why SportsML

The IPTC saw a vacuum with respect to a cross-sport, cross-language XML standard for the interchange of sports data and statistics. A sufficient number of IPTC members volunteered to devote the necessary amount of resources for the design and implementation of an XML DTD for sports, and its requisite supporting documentation and examples.

It is our hope that non-IPTC members, such as major sports leagues and special-event producers, will rally behind SportsML as a global standard. Our shared goal is to make the deployment of interactive sports data applications as easy as possible for customers of sports data feeds.

What's in SportsML?

SportsML supports the identification and description of a tremendous number of sports characteristics. Highlights include:

  • Scores: Who's winning, and how did the score change?

  • Schedules: Who's playing whom, when, and where?

  • Standings: Who's in first place? Who's closest to qualifying for the championship?

  • Statistics: How do the players and/or teams measure up against one another in various categories?

  • News: How do we combine editorial coverage of sports with all these data feeds? How do we package metadata- and multimedia-filled articles together with sports data?

SportsML consists of a core DTD that contains a large number of properties describing a wide range of sports coverage. Much useful sports reporting can be done through the core DTD alone. In addition, SportsML contains several "plug-in" sport-specific DTDs, which are only necessary when the publisher needs to go in-depth for a specific sport. The fact that only seven sports are covered in SportsML's initial release does not limit SportsML to these seven sports. The core DTD provides an excellent starting point for many sports, and the development process for other plug-ins will continue. Interested users are more than welcome to take part in SportsML's expansion and growth.

More details on the ins and outs of SportsML are available in the SportsML Tutorial below.

Tutorial

SportsML is XML

SportsML is an XML-conforming vocabulary. This means that SportsML uses the constructs standardized by XML to describe elements of content within a document, and the descriptive attributes of that content.

This tutorial covers the most widely used sections of SportsML. For details about each element, consult the Documentation page.


SportsML is a logical representation of sports data, and is not meant to dictate how that sports data is formatted. If a publisher wants to use SportsML to identify how two teams fared in a particular game, the <sports-event> and <team> elements would be used:
<sports-event>
  <sports-metadata event-status="post-event" />
  <team>
    <team-metadata>
      <name first="New York" last="Mets" />
    </team-metadata>
    <team-stats score="4" event-outcome="win" />
  </team>
  <team>
    <team-metadata>
      <name first="Atlanta" last="Braves" />
    </team-metadata>
    <team-stats score="2" event-outcome="loss" />
  </team>
</sports-event>

This SportsML fragment could then be rendered in HTML for a clean display:

New York Mets     4   final
Atlanta Braves    2

Note that the SportsML event-status attribute of "post-event" has been used by the SportsML-to-HTML rendering processor to indicate that this score is a final score.
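
As a rough illustration of such a rendering step, the following Python sketch parses the fragment with the standard library and produces plain-text score lines rather than HTML. The function name render_scoreline and the STATUS_LABELS mapping are assumptions made for this example; they are not part of SportsML or of any IPTC tool.

```python
import xml.etree.ElementTree as ET

# The fragment from the tutorial above, with closing tags written out.
FRAGMENT = """
<sports-event>
  <sports-metadata event-status="post-event" />
  <team>
    <team-metadata>
      <name first="New York" last="Mets" />
    </team-metadata>
    <team-stats score="4" event-outcome="win" />
  </team>
  <team>
    <team-metadata>
      <name first="Atlanta" last="Braves" />
    </team-metadata>
    <team-stats score="2" event-outcome="loss" />
  </team>
</sports-event>
"""

# Hypothetical mapping from event-status values to display labels.
STATUS_LABELS = {"post-event": "final"}


def render_scoreline(xml_text):
    """Return one display line per team; the status label goes on the first line."""
    event = ET.fromstring(xml_text)
    status = event.find("sports-metadata").get("event-status")
    label = STATUS_LABELS.get(status, "")
    lines = []
    for index, team in enumerate(event.findall("team")):
        name = team.find("team-metadata/name")
        score = team.find("team-stats").get("score")
        full_name = f"{name.get('first')} {name.get('last')}"
        suffix = label if index == 0 else ""
        lines.append(f"{full_name:<16}{score:>3}  {suffix}".rstrip())
    return lines


for line in render_scoreline(FRAGMENT):
    print(line)
```

The point of the sketch is that the display label is derived from the data by the renderer; the SportsML fragment itself carries no formatting.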


Basic Structure of SportsML

The root element in SportsML is <sports-content>, which contains a required <sports-metadata> section, followed by zero or more of the following:

  • <sports-event>
  • <tournament>
  • <schedule>
  • <standing>
  • <statistic>
  • <article>

The first five of these items hold XML structures built upon various combinations of <team> and <player> elements. The <article> element is intended to hold a news story, which should preferably adhere to the News Industry Text Format (NITF).

Data structures for these items are outlined as follows:

  • <sports-event>: A set of teams or a set of players, followed by optional information about officials/referees, play-by-play actions, highlights, and awards.
  • <tournament>: Broken into tournament-divisions, which have rounds of sports-events.
  • <schedule>: A structured set of sports-events.
  • <standing>: A set of teams or players.
  • <statistic>: Also a set of teams or players.
  • <article>: A container for an NITF news article.

Each of these structures has an envelope for metadata. For example, <event-metadata> holds such properties as when and where the event takes place, and whether the game has started or not.
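
The top-level layout described in this section can be checked with a few lines of Python. This is only a sketch under stated assumptions: the function name check_basic_structure is invented for the example, the sample document is a bare skeleton, and the check is no substitute for validating against the SportsML DTD.

```python
import xml.etree.ElementTree as ET

# Item elements that may follow <sports-metadata>, as listed in this section.
ITEM_ELEMENTS = {
    "sports-event", "tournament", "schedule",
    "standing", "statistic", "article",
}


def check_basic_structure(xml_text):
    """Return a list of problems found in the top-level layout (empty if none)."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "sports-content":
        problems.append(f"root element is <{root.tag}>, expected <sports-content>")
    children = list(root)
    if not children or children[0].tag != "sports-metadata":
        problems.append("first child must be <sports-metadata>")
    # Everything after the first child must be one of the six item elements.
    for child in children[1:]:
        if child.tag not in ITEM_ELEMENTS:
            problems.append(f"unexpected top-level element <{child.tag}>")
    return problems


# Skeleton only: real documents carry metadata and content inside each element.
SAMPLE = """
<sports-content>
  <sports-metadata />
  <sports-event />
  <schedule />
</sports-content>
"""

print(check_basic_structure(SAMPLE))
```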


Keys and Identifiers

Behind SportsML is a comprehensive strategy for unambiguously identifying which player, team, league, sport, and event is being covered.

These values are generally stored in attributes we call "keys." For example, a team-key might equal "t.7". Where does one go to look up which team has the key of "t.7"? In what we call a Resource File.

The Resource File is an XML file that lists and defines which keys are allowed where. The IPTC has come up with its contents for Resource Files. However, publishers are free to create their own files, either based on the IPTC's, or containing whole new sets of values.

Besides listing items like leagues, conferences, associations, and teams, Resource Files also contain lists of controlled vocabularies used to describe other properties. For example, the various states of health a player is in could be described as "injured" or "fine," or could be described in much more detail.
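
To make the lookup idea concrete, here is a minimal Python sketch of resolving a team-key against a resource file. The layout of the resource file shown here (a <resources> root with <team> entries carrying team-key and name attributes) and the team names are purely hypothetical, chosen for illustration; the actual IPTC Resource Files may be organized quite differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical resource file: element names, attribute names, and team names
# are illustrative only and are not taken from the IPTC Resource Files.
RESOURCE_XML = """
<resources>
  <team team-key="t.7" name="Example Team" />
  <team team-key="t.8" name="Another Team" />
</resources>
"""


def load_team_keys(xml_text):
    """Build a dictionary mapping each team-key to its display name."""
    root = ET.fromstring(xml_text)
    return {team.get("team-key"): team.get("name") for team in root.iter("team")}


def resolve_team(keys, team_key):
    """Look up a team-key, raising a descriptive error for unknown keys."""
    try:
        return keys[team_key]
    except KeyError:
        raise KeyError(
            f"team-key {team_key!r} is not defined in the resource file"
        ) from None


team_keys = load_team_keys(RESOURCE_XML)
print(resolve_team(team_keys, "t.7"))
```

Rejecting unknown keys, rather than passing them through, reflects the role of the Resource File as the list of keys that are allowed.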

A quick aside: In an ideal world, we might also have a central repository for all player-keys in major sports, regardless of which team they're on or country they're in. This is obviously a long-term goal, and comments for how various agencies could go about putting such a reference database together are welcome.


Sports Actions

Another notable characteristic is that SportsML files generally include only one "root reference" to a <player> or <team>. To expand, a <sports-event> may list two teams, and each team may list several players. But lower down in the document, there may be a list of dozens of "actions" that occurred during the game. Each action refers to its participants not by repeating the player-keys and team-keys, but by calling out the idref of the "root reference." This example shows how to portray the fact that Bernie Williams of the New York Yankees hit a grand-slam home run with two outs in the bottom of the ninth inning.

<sports-event>
  <sports-metadata event-status="mid-event" />
  <team>
    <team-metadata>
      <name first="New York" last="Yankees" />
    </team-metadata>
    <team-stats score="4" event-outcome="win" />
    <player id="p1">
      <player-metadata height="157" weight="93" date-of-birth="19680913">
        <name first="Bernie" last="Williams" />
      </player-metadata>
    </player>
  </team>
  <team>

  ...

  </team>

  <event-actions>
    <event-actions-baseball>
      <action-baseball-score
        inning-value="9"
        inning-half="bottom"
        outs="2"
        balls="3"
        strikes="2"
        batter-idref="p1"
        hit-type="home-run"
        rbi="4"
        runs-scored="4"
      />
    </event-actions-baseball>
  </event-actions>
</sports-event>

You could also point to which player threw him the pitch he hit over the fences.
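
A consumer of such a document has to follow each idref back to its root reference. The Python sketch below shows one way this might be done, using a trimmed copy of the example above; the function name describe_scoring_actions and the output wording are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example above: one team, one player, one action.
EVENT_XML = """
<sports-event>
  <team>
    <team-metadata>
      <name first="New York" last="Yankees" />
    </team-metadata>
    <player id="p1">
      <player-metadata>
        <name first="Bernie" last="Williams" />
      </player-metadata>
    </player>
  </team>
  <event-actions>
    <event-actions-baseball>
      <action-baseball-score inning-value="9" inning-half="bottom" outs="2"
        batter-idref="p1" hit-type="home-run" rbi="4" />
    </event-actions-baseball>
  </event-actions>
</sports-event>
"""


def describe_scoring_actions(xml_text):
    """Resolve each batter-idref to the player declared once higher up."""
    event = ET.fromstring(xml_text)
    # Index every <player> root reference by its id attribute.
    players = {}
    for player in event.iter("player"):
        name = player.find("player-metadata/name")
        players[player.get("id")] = f"{name.get('first')} {name.get('last')}"
    descriptions = []
    for action in event.iter("action-baseball-score"):
        batter = players.get(action.get("batter-idref"), "unknown player")
        descriptions.append(
            f"{batter}: {action.get('hit-type')}, "
            f"{action.get('inning-half')} of inning {action.get('inning-value')}, "
            f"{action.get('outs')} out(s), {action.get('rbi')} RBI"
        )
    return descriptions


for line in describe_scoring_actions(EVENT_XML):
    print(line)
```

Because the player is described only once, the name, height, and other metadata never need to be repeated in the action list, however many actions refer to that player.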


The SportsML Document Type Definition

XML vocabularies generally use a Document Type Definition (DTD) to define which elements and attributes are allowed where. XML vocabularies can now also be specified by a Schema, which is viewed by many as the successor specification format to the DTD.

SportsML 1.0 is currently defined by a DTD, though we plan to supply a Schema definition later this year. The DTD consists of a Core DTD plus several sport-specific modules.

The Core SportsML DTD

One requirement of SportsML is that it provide a single, core set of properties that can be used to describe scores, schedules, standings, and statistics for a wide variety of sports. This Core DTD, while using U.S. English to express its constructs, has to support properties of sports in a way that is readily usable by publishers from any nation.

The Control File

The Core SportsML DTD refers to a separate, small DTD file known as the SportsML Control File. This file contains no particular sports properties in it, per se. Instead, the Control File activates any SportsML Plug-In DTDs that a publisher wants to use.

SportsML users who want to validate documents only covering particular sports are welcome to modify the SportsML Control File so that only those modules desired will be loaded.

Plug-In DTDs

SportsML allows a publisher to express properties that are highly specific to particular sports. It does so by supporting individual, sport-specific DTDs that "plug in" to the core SportsML DTD.

As long as the SportsML Control File includes the plug-in for, say, Ice Hockey, a publisher is able to represent such Ice Hockey-specific constructs as shift changes, penalty shots, and power plays. The IPTC has decided to support seven sport-specific plug-ins at the launch of SportsML 1.0:

  • American Football
  • Baseball
  • Basketball
  • Golf
  • Ice Hockey
  • Soccer (a.k.a. Football, everywhere but the U.S.)
  • Tennis

SportsML users who would like to contribute to the authoring process of other modules are welcome to contact the SportsML Committee Chair.

 

LATEST SPORTSML NEWS

SportsML 2.2 Released

The most important changes in this new version of SportsML are: 1) documentation improvements, 2) additions to the core schema, 3) enhancements to the soccer plug-in, and 4) adjustments to the tournament model.

See the Specification and Documentation tabs in the navigation bar on the top of this page to download the latest files.

NEW G2-Guidelines

A 260-page "G2 Guidelines for Implementers" document is available for download; see the Documentation page.
