Easy News Topics

Easy News Topics - RSS2.0 Module

Specification Version: 1.0.13 (DRAFT)

October 2004

This version:
http://www.purl.org/NET/ENT/1.0/

Authors:

Matt Mower <self@mattmower.com>
Paolo Valdemarin <paolo@evectors.it>

Abstract

This specification defines the Easy News Topics (ENT) Module for the RSS2.0 syndication format. ENT is intended to be a very simple standard for describing how topic information can be introduced into an RSS2.0 news feed.

The goals of ENT are to:

  1. be as simple to implement as possible
  2. represent topics well enough to be useful in enabling smart aggregators (e.g. filtering, recombining feeds, etc.)
  3. allow, via linking, use of more powerful and flexible standards where appropriate

Rights

This work is licensed under a Creative Commons License.

Changes since draft 12

  1. Amended Matt Mower's email address

Changes since draft 11

  1. Added a new implementations section.

Changes since draft 10

  1. Added new RSS+ENT logo for blogs!
  2. Added XML content model & attribute definitions
  3. Added a note about the <category> tag.
  4. Corrected the examples to properly use <cloud ent:href=...>
  5. Added a new simpler example

Contents

  1. Status of this document
  2. Purpose
  3. Usage
  4. Elements
  5. Examples
  6. Relationship
  7. Logo
  8. Implementations
  9. References

Status of this document

This is draft #13 of the ENT specification. Please send your feedback to either Matt Mower <self@mattmower.com> or Paolo Valdemarin <paolo@evectors.it>.

Please note that, whilst this specification remains a draft, it is subject to change.

Purpose

Whilst it can be argued that existing standards such as RDF and XTM provide all that is required to annotate weblog posts with topic metadata (and the authors do agree), our contention is that such efforts have, so far, not gained widespread support. Although RSS1.0 has become popular enough to account for some 25% of all RSS feeds, it is interesting to note that less than 3% (and maybe much less) use the RDF-based taxonomy module (despite the fact that it has been available in some form since September 2000!)

It is our position that the chief reason for the lack of widespread support of the existing standards is their perceived complexity among developers. This complexity means that there are few tools to support them and few applications that warrant them. For this reason we believe a simpler standard for introducing topics into RSS will encourage the adoption of metadata use in aggregator applications in a much smaller timeframe than full-blown RDF or XTM support could achieve.

Note that Easy News Topics (ENT) v1.0 is not intended as a replacement for either RDF or XTM; indeed, we hope to promote their use by allowing the linking of ENT topics back to definitions within a properly defined topic map. Our hope is to jumpstart the process of people & applications attaching topic metadata to RSS feeds. If those applications choose to use RDF or XTM then we are still satisfied.

Whilst we hope that readers will come up with many applications for ENT topics that we have not conceived of, our immediate hope is that providing a quick and easy way to add topic metadata to RSS2.0 feeds will enable a new generation of aggregator applications to be written. These aggregators will allow people to filter items they do not wish to see, prioritize those about things they are interested in and recombine items into new feeds. We think these will be very exciting and useful applications and look forward to seeing them emerge.

Model

The ENT model is quite simple but should be sufficient for most aggregator related applications.

ENT defines two concepts:

Cloud
A cloud is defined as a source of topics. Each cloud has a URI which may point at a resource containing a complete description of the topics used in the cloud. An example of such a resource might be an XTM topic map. Note that whilst the URI is required, it need not point at a valid resource. Each application decides how to handle the resource, if present.
Topic
A topic is a metadata item, a named representation of a subject, that may be attached to one or more items. Each topic has an ID which must be unique within its cloud. Optionally topics may have a type. This is intended to allow topics to be classified (examples of suggested types might be 'person', 'place', 'business', 'specification', and 'project'). At the simplest level this type may be a string with a pre-agreed interpretation; however, in combination with the cloud, it may be considered a topic ID within the cloud's external topic map.
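
As a rough illustration of the two concepts, the model could be held in memory as follows. This is only a sketch: the class names Cloud and Topic, the add method and the choice to raise an error on a duplicate id are assumptions made for the example and are not defined by ENT.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Topic:
    """A named subject attached to an item; `id` is unique within its cloud."""
    id: str
    name: str
    classification: Optional[str] = None  # optional type, e.g. "person" or "place"


@dataclass
class Cloud:
    """A source of topics, identified by a URI that need not resolve."""
    href: str
    topics: dict = field(default_factory=dict)  # maps topic id -> Topic

    def add(self, topic: Topic) -> None:
        # Enforce the rule that ids are unique within a single cloud.
        # (Raising here is an assumption; ENT does not say how to react.)
        if topic.id in self.topics:
            raise ValueError(f"duplicate topic id {topic.id!r} in cloud {self.href}")
        self.topics[topic.id] = topic


cloud = Cloud(href="http://www.example.org/")
cloud.add(Topic(id="sample", name="Sample"))
```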

Note that we do not make use of the existing RSS2.0 <category> element in ENT. The reasons are:

  1. We believe that the cloud is an important concept which <category> does not allow us to express.
  2. <category> has been around since RSS0.92 and has existing, different, semantics which we do not wish to change. People using <category> can safely continue to do so.

Usage

ENT is an RSS2.0 module. The elements defined by ENT are enclosed within a namespace and sent as part of an RSS <item>.

Elements

ENT is an extension to RSS2.0 that works by using additional elements held within a namespace separate from the main RSS elements. By using a namespace, non-ENT compliant RSS aggregators may safely ignore the ENT elements and attributes and the topic information contained within them.

The following is a list of all the ENT elements used to augment the RSS feed. These elements are intended to be used within the RSS2.0 <item> element.
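
To illustrate why the namespace makes ENT safe to ignore, here is a minimal sketch of a reader that knows nothing about ENT. It uses Python's xml.etree.ElementTree; the namespace URI is copied from Example#2 of this document, and the helper name core_rss_fields is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

ENT_NS = "http://www.purl.org/NET/ent/1.0/"  # namespace URI as used in Example#2

ITEM_XML = f"""
<item xmlns:ent="{ENT_NS}">
    <title>A sample item</title>
    <ent:cloud ent:href="http://www.example.org/">
        <ent:topic ent:id="sample">Sample</ent:topic>
    </ent:cloud>
</item>
"""


def core_rss_fields(item):
    """Return only the un-namespaced children, as a non-ENT reader would see them."""
    # ElementTree spells namespaced tags as "{uri}local", so a leading "{"
    # marks an element that belongs to some extension module.
    return {
        child.tag: (child.text or "")
        for child in item
        if not child.tag.startswith("{")
    }


item = ET.fromstring(ITEM_XML)
print(core_rss_fields(item))  # prints {'title': 'A sample item'}
```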

Namespace Declarations

<cloud>

The <cloud> element has two related purposes.

  1. It defines a scope for referring to a related group of topics.
  2. It allows applications to ignore/concentrate on known sets of topics.

A cloud might refer to an OPML topic roll, an XTM topic map, an RDF file or any other means of defining topic metadata. Each cloud may offer topics which can be used to annotate RSS items. However an external definition is not strictly required as the basic topic information is also encoded within the RSS file.

The intention of the external file is to allow additional information to be provided to allow, for example:

Content Model

    <!ELEMENT cloud ( topic+ ) >
    	
<topic>
Each <topic> identifies a single occurrence of a topic associated with the current <item>.

Attributes

    <!ATTLIST cloud
        href            CDATA            #REQUIRED
        infoRef         CDATA            #IMPLIED
        description     CDATA            #IMPLIED
    >
    	
href
An RFC2396 URI which acts as both a unique identifier for the cloud and also as an indicator of an external resource where the topics used in the cloud are formally defined. This could be in the form of an OPML topic roll, an XTM topic map or an RDF specification. If such a resource is specified then the id of topics within the cloud should refer to the id of topics within the external resource. However any interpretation of the URI (and any corresponding resource) is entirely application dependent.
infoRef
An optional RFC2396 URI which refers to a page of human readable information about this cloud. This might include the author's or maintainer's contact details as well as a description of the purpose of the cloud.
description
An optional string that can be used to describe the cloud. It is intended that this string be populated with something suitable for use in a user interface (where the href of the cloud would not be useful).

Usage

To add <topic> elements to an <item> they should be enclosed within an appropriate <cloud> element referring to where the topics were defined. This means that there may be multiple <cloud> elements within each RSS <item>.
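
A consumer therefore has to expect several clouds per item. The following sketch, again using Python's xml.etree.ElementTree, groups an item's topics by the href of their enclosing cloud. The namespace URI and the prefixed attribute style follow Example#2; the cloud URIs and the function name topics_by_cloud are hypothetical.

```python
import xml.etree.ElementTree as ET

ENT_NS = "http://www.purl.org/NET/ent/1.0/"  # namespace URI as used in Example#2
ENT = "{" + ENT_NS + "}"  # ElementTree's "{uri}" prefix for qualified names

ITEM_XML = f"""
<item xmlns:ent="{ENT_NS}">
    <title>A sample item</title>
    <ent:cloud ent:href="http://www.example.org/teams">
        <ent:topic ent:id="sf_giants">San Francisco Giants</ent:topic>
    </ent:cloud>
    <ent:cloud ent:href="http://www.example.org/people">
        <ent:topic ent:id="barry_bonds" ent:classification="player">Barry Bonds</ent:topic>
        <ent:topic ent:id="felipe_alou" ent:classification="manager">Felipe Alou</ent:topic>
    </ent:cloud>
</item>
"""


def topics_by_cloud(item):
    """Map each cloud's href to the list of (topic id, topic name) pairs it encloses."""
    grouped = {}
    for cloud in item.findall(ENT + "cloud"):
        href = cloud.get(ENT + "href")
        for topic in cloud.findall(ENT + "topic"):
            grouped.setdefault(href, []).append((topic.get(ENT + "id"), topic.text))
    return grouped


grouped = topics_by_cloud(ET.fromstring(ITEM_XML))
```

Note that the attributes are looked up with their namespace-qualified names because the examples in this document write them as ent:href and ent:id.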

<topic>

The <topic> element associates the <item> with a topic.

Content Model

    <!ELEMENT topic (#PCDATA)>
		
The content of the element should be the descriptive name of the topic. This is assumed to be in the same language as the enclosing RSS feed. If variant naming is required this can be achieved using the cloud external resource.

Attributes

    <!ATTLIST topic
        id                CDATA        #REQUIRED
        classification    CDATA	       #IMPLIED
        href              CDATA	       #IMPLIED
    >    
		
id
Unique identifier for this topic. These are only required to be unique within each cloud since two identical ids are still unique in combination with the <cloud>'s href URI. The id may contain any characters from the ID set.
classification
For systems that wish to relate topics to one another a classification can be specified. This indicates the type of the topic. It can be a simple string, for example person, a more complex string such as sports/baseball/player, or it may be a reference into the cloud resource. The classification may contain any characters from the ID set.
href
The href attribute is available for systems to indicate a human readable page related to the topic. An example usage might be to specify an index page listing other uses of the same topic. To provide machine readable information about the topic (for example a subject indicator) use the cloud resource.

Usage:

There should be one <topic> element for each topic from the <cloud> that is to be associated with this <item>.

The id attribute refers to the cloud-unique identifier for this topic. By cloud-unique we mean that within a particular cloud no two topics may use the same id; however, a topic in another cloud may duplicate an id. The ids are actually still unique since, in combination with the cloud's href attribute, they form a unique reference.
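
One way an aggregator might rely on this is to key its index on the pair (cloud href, topic id) rather than on the id alone. The sketch below is an assumption about how such an index could look; the function name build_index, the tuple layout and the sample URIs and titles are invented for illustration.

```python
from collections import defaultdict


def build_index(entries):
    """Index item titles by (cloud href, topic id).

    `entries` is an iterable of (item_title, cloud_href, topic_id) tuples.
    """
    index = defaultdict(list)
    for title, cloud_href, topic_id in entries:
        # The pair is the unique reference; the id alone may repeat across clouds.
        index[(cloud_href, topic_id)].append(title)
    return dict(index)


entries = [
    ("Giants go 7-0", "http://www.example.org/sports", "giants"),
    ("Giants in folklore", "http://www.example.org/myths", "giants"),
    ("Giants win again", "http://www.example.org/sports", "giants"),
]
index = build_index(entries)
```

Here the id giants appears in two clouds, yet the two topics stay separate because their cloud hrefs differ.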

The [classification] attribute specifies how the topic should be classified. At its simplest this could be a string such as person. However in combination with a cloud specifying an external topic map it may be considered to be a topic id within that map.

The [href] attribute specifies the location of an HTML page (suitable for a human being) which contains further information about the topic. To provide machine-readable information, use an external topic map.
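
On the producing side, a feed generator could emit these elements along the following lines. This is a sketch using Python's xml.etree.ElementTree, not a reference implementation: the helper names q and make_cloud and the tuple layout are assumptions, the namespace URI is taken from Example#2, and the cloud URI is hypothetical.

```python
import xml.etree.ElementTree as ET

ENT_NS = "http://www.purl.org/NET/ent/1.0/"  # namespace URI as used in Example#2
ET.register_namespace("ent", ENT_NS)  # serialize with the "ent:" prefix


def q(local):
    """Return the namespace-qualified name ElementTree expects."""
    return "{%s}%s" % (ENT_NS, local)


def make_cloud(cloud_href, topics):
    """Build an ent:cloud element from (id, name, classification, href) tuples."""
    cloud = ET.Element(q("cloud"), {q("href"): cloud_href})
    for topic_id, name, classification, href in topics:
        topic = ET.SubElement(cloud, q("topic"), {q("id"): topic_id})
        # classification and href are optional, so only emit them when given.
        if classification is not None:
            topic.set(q("classification"), classification)
        if href is not None:
            topic.set(q("href"), href)
        topic.text = name
    return cloud


cloud = make_cloud(
    "http://www.example.org/topics",
    [("sf_giants", "San Francisco Giants", "generic", None)],
)
xml_text = ET.tostring(cloud, encoding="unicode")
```

The resulting element would then be appended to the RSS item alongside the standard RSS2.0 elements.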

Examples

The following are some examples of ENT topics associated with an RSS item.

Example#1: This example shows just the ENT tags required to add a single topic to an RSS <item>. This example is for illustration only and for simplicity's sake the namespace declarations have been removed.

    <item>
        <title>A sample item</title>
        <description>A sample item</description>
        <cloud href="http://www.example.org/">
            <topic id="sample">Sample</topic>
        </cloud>
    </item>
		

In the next example the ENT namespace declaration is shown on the <ent:cloud> element. This is simply to illustrate the namespace declaration. In real world applications the ENT namespace declaration should be located in the <rss> element.

Example#2: The following is an example of an RSS2.0 <item> encoded with ENT topics.

    <item>
        <title>Giants go 7-0</title> 
        <link>http://matt.blogs.it/2003/04/08.html#a855</link> 
        <description>&lt;P&gt;&lt;A href="http://sanfrancisco.giants.mlb.com/NASApp/mlb/sd/news/sd_gameday_recap.jsp?ymd=20030407&amp;amp;content_id=263494&amp;amp;vkey=recap&amp;amp;fext=.jsp"&gt;Giants go 7-0!&amp;nbsp; Woo Hoo&lt;/A&gt;&lt;/P&gt; &lt;P&gt;Franchise history is being made, now if we can just make it to 10-0!&lt;/P&gt;</description> 
        <guid>http://matt.blogs.it/2003/04/08.html#a855</guid> 
        <pubDate>Tue, 08 Apr 2003 10:28:59 GMT</pubDate> 
        <comments>http://radiocomments.userland.com/comments?u=107808&amp;p=855&amp;link=http%3A%2F%2Fmatt.blogs.it%2F2003%2F04%2F08.html%23a855</comments> 
        <ent:cloud xmlns:ent="http://www.purl.org/NET/ent/1.0/" ent:href="http://matt.blogs.it/topics/resources/topicRoll.opml">
            <ent:topic ent:id="sf_giants" ent:classification="generic" ent:href="http://matt.blogs.it/topics/topicsS.html#sf_giants">San Francisco Giants</ent:topic>
        </ent:cloud>
    </item>		
		

Example#3: A slightly contrived example showing the use of multiple clouds and topics.

In this example two of the topics are actually referenced from an external XTM topic map about Major League Baseball players and the ENT namespace declaration is assumed to have been made in the <rss> element.

    <item>
        <title></title>
        <link>http://www.example.org/blog/2003/04/08.html#a855</link>
        <guid>http://www.example.org/blog/2003/04/08.html#a855</guid>
        <pubDate>Tue, 08 Apr 2003 10:28:59 GMT</pubDate>
        <description>Here is the text of the item.</description>
        <ent:cloud ent:href="http://matt.blogs.it/topics/resources/topicRoll.opml">
            <ent:topic ent:id="sf_giants">San Francisco Giants</ent:topic>
        </ent:cloud>
        <ent:cloud ent:href="http://www.examples.com/mlb.xtm">
            <ent:topic ent:id="barry_bonds" ent:classification="player">Barry Bonds</ent:topic>
            <ent:topic ent:id="ray_durham" ent:classification="player">Ray Durham</ent:topic>
            <ent:topic ent:id="felipe_alou" ent:classification="manager">Felipe Alou</ent:topic>
        </ent:cloud>
    </item>
        

A sample feed is available.

Relationship to other standards

ENT was designed to be a simple mechanism for including topics within RSS2.0 feeds. During its design a number of simplicity vs. expressiveness trade-offs were made with the result that the ENT topics model is considerably more limited than that available in related standards such as XTM or XFML.

However ENT offers a means of keeping its simplicity whilst still allowing applications the full power of more advanced models. This is achieved via the <cloud> href attribute.

When the href attribute refers to a valid resource the implication is that this resource contains more information about the topics in the feed. In this situation each <topic> id is intended to correspond to an equivalent id in the resource. Advanced ENT compliant applications should recognise this escape mechanism and refer to the external resource where necessary. Other applications can safely ignore the resource and proceed as normal.

For example, if the cloud is defined as <cloud href="http://www.example.org/mlb.xtm"> then a topic defined as <topic id="barry_bonds"> can also be interpreted as the XTM topic http://www.example.org/mlb.xtm#barry_bonds.
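
The mapping described above amounts to appending the topic id to the cloud href as a fragment. A minimal sketch follows; the function name external_topic_uri is an assumption, and the code assumes the id is already a valid fragment, so no escaping is applied.

```python
def external_topic_uri(cloud_href, topic_id):
    """Combine a cloud href and a topic id into a fragment URI."""
    # Drop any fragment already present on the cloud href before appending the id.
    base = cloud_href.split("#", 1)[0]
    return f"{base}#{topic_id}"


print(external_topic_uri("http://www.example.org/mlb.xtm", "barry_bonds"))
# prints http://www.example.org/mlb.xtm#barry_bonds
```

Whether the resulting URI resolves to anything is, as noted above, up to the resource behind the cloud href.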

Logo

Applications which support the ENT specification are encouraged to display the ENT logo. However please make a local copy of the chosen logo and do not reference it directly.

Large logo (202x79, 3K); small logo (103x40, 2K)

Weblogs that publish RSS using the ENT module are encouraged to display the RSS+ENT logo to let readers know. Again, please make a local copy of the image.

RSS+ENT logo (67x14, 504b)

Implementations

We're trying to create a comprehensive list of products & services based upon, or supporting, ENT. If you know of any we have left out please drop us a line.

References

ID attribute
Refers to the ifragment production in the IRI draft specification
XML1.0
XML 1.0 (2nd edition)
RDF
Resource Description Framework
XTM
XML Topic Maps
XFML
eXtensible Faceted Markup Language
OPML
Outline Processor Markup Language
RFC2396
Uniform Resource Identifiers (URI): Generic Syntax
RSS0.92
RSS Web Syndication Format 0.92
RSS1.0
RDF Site Summary 1.0
RSS2.0
Really Simple Syndication 2.0