8.1 Reading a feed

8.1.1 Example

We will illustrate the usage of this package through the sample file provided by the RSS Advisory Board.

Let’s open this sample:

    >>> import itools.http
    >>> from itools.rss import RSSFile
    >>>
    >>> sample = RSSFile()
    >>> sample.load_state_from('http://www.rssboard.org/files/sample-rss-2.xml')

Notice that we have not used the ability of vfs to open and load the feed directly. The rss package registers itself in itools to handle both the application/rss+xml mimetype and the .rss file extension, but the web server provides neither for this file: it does not send that mimetype, and the file name ends in .xml. This is why we create an RSSFile explicitly and load its state from the URL.
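To make the dispatch problem concrete, here is a minimal sketch of a handler registry keyed on mimetype first and file extension second. It is not the itools API: the names HANDLERS_BY_MIMETYPE, HANDLERS_BY_EXTENSION, DEFAULT_HANDLER and pick_handler are hypothetical, and the 'text/xml' content type is an assumption about what the server might send.

```python
import posixpath

# Hypothetical registry, NOT the itools API: it only illustrates
# dispatching on the mimetype first and on the file extension second.
HANDLERS_BY_MIMETYPE = {'application/rss+xml': 'RSSFile'}
HANDLERS_BY_EXTENSION = {'.rss': 'RSSFile'}
DEFAULT_HANDLER = 'File'  # placeholder name for a generic fallback


def pick_handler(mimetype, url):
    """Return the name of the handler registered for a resource."""
    if mimetype in HANDLERS_BY_MIMETYPE:
        return HANDLERS_BY_MIMETYPE[mimetype]
    extension = posixpath.splitext(url)[1].lower()
    return HANDLERS_BY_EXTENSION.get(extension, DEFAULT_HANDLER)


sample_url = 'http://www.rssboard.org/files/sample-rss-2.xml'
# 'text/xml' is an assumed Content-Type; what matters is that it is
# not application/rss+xml and that the URL does not end in .rss.
generic = pick_handler('text/xml', sample_url)
by_mimetype = pick_handler('application/rss+xml', sample_url)
by_extension = pick_handler('text/xml', 'http://example.com/feed.rss')
```

With such a scheme the sample URL falls through to the generic handler, whereas either the right mimetype or a .rss extension would select the RSS handler.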

The RSSFile API has three main entries:

channel

is a dictionary describing the feed.

image

is an optional dictionary locating an image and its properties, None by default.

items

is a list of dictionaries, one per item.

The channel, image and individual item dictionaries all map RSS 2.0 elements to keys, and their contents to values.
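The shape of these three entries can be sketched with plain Python data. The values below are stand-ins that mimic the attributes of a loaded RSSFile (an assumption made so the example runs without itools or network access), and the variable names are only illustrative.

```python
import datetime

# Stand-in data shaped like the three RSSFile entries; with itools
# these would be sample.channel, sample.image and sample.items.
channel = {
    'title': 'Liftoff News',
    'description': 'Liftoff to Space Exploration.',
    'language': 'en-us',
}
image = None  # the feed declares no image
items = [
    {'title': 'Star City',
     'pubDate': datetime.datetime(2003, 6, 3, 11, 39, 21)},
]

# RSS element names are the dictionary keys.
summary = '%s (%s): %d item(s)' % (
    channel['title'], channel['language'], len(items))
has_image = image is not None
```

Because image may be None, it is worth testing it before reading any of its keys.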

8.1.2 The Channel

An example is worth a thousand words:

    >>> from pprint import pprint
    >>> pprint(sample.channel)
    {'description': u'Liftoff to Space Exploration.',
     'docs': 'http://blogs.law.harvard.edu/tech/rss',
     'generator': u'Weblog Editor 2.0',
     'language': 'en-us',
     'lastBuildDate': datetime.datetime(2003, 6, 10, 11, 41, 1),
     'link': <itools.uri.generic.Reference object at 0x82ef9dc>,
     'managingEditor': 'editor@example.com',
     'pubDate': datetime.datetime(2003, 6, 10, 6, 0),
     'title': u'Liftoff News',
     'webMaster': 'webmaster@example.com'}

As you can see, the most important elements are decoded into Python objects. Specifically, text is decoded into unicode, and the link element is decoded into an itools.uri.Reference object. Notice that the datetimes are converted to your local time zone.
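The time zone conversion can be reproduced with the standard library alone. The sketch below is not how itools necessarily implements it. It assumes the feed stores lastBuildDate as the RFC 822 string shown in the code, and it assumes a local zone of UTC+2, which would be consistent with the 11:41:01 value printed above. It uses Python 3 standard library calls.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Assumed raw value of lastBuildDate in the feed (RFC 822 format).
raw = 'Tue, 10 Jun 2003 09:41:01 GMT'

# Parse into a timezone-aware datetime (GMT means a zero offset).
aware = parsedate_to_datetime(raw)

# Assumed local zone: a fixed UTC+2 offset, chosen for illustration.
assumed_local = timezone(timedelta(hours=2))

# Shift to local time, then drop the tzinfo to get a naive datetime
# like the ones shown in the channel dictionary.
local = aware.astimezone(assumed_local).replace(tzinfo=None)
```

On a machine in another time zone, the same feed would therefore yield different datetime values.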

8.1.3 The Items

The items attribute is a list with one entry per item element contained in the file. The order of the items in the file is preserved.

Knowing the number of items in the feed is straightforward:

    >>> len(sample.items)
    4

An Item

As with the channel, each RSS 2.0 item element is mapped into a dictionary:

    >>> pprint(sample.items[0])
    {'description': u'How do Americans get ready to work with Russians aboard the
    International Space Station? They take a crash course in culture, language and
    protocol at Russia\'s <a href="http://howe.iki.rssi.ru/GCTC/gctc_e.htm">Star
    City</a>.',
     'guid': 'http://liftoff.msfc.nasa.gov/2003/06/03.html#item573',
     'link': <itools.uri.generic.Reference object at 0x82f802c>,
     'pubDate': datetime.datetime(2003, 6, 3, 11, 39, 21),
     'title': u'Star City'}
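Since every element of an RSS 2.0 item is optional (only one of title or description must be present), code that walks the items should not assume a given key exists. The following sketch uses stand-in dictionaries rather than a loaded RSSFile; the second item and the headline helper are hypothetical.

```python
import datetime

# Stand-in for sample.items; the second entry is hypothetical and
# deliberately has no title.
items = [
    {'title': 'Star City',
     'pubDate': datetime.datetime(2003, 6, 3, 11, 39, 21)},
    {'description': 'An item without a title.',
     'pubDate': datetime.datetime(2003, 5, 30, 13, 6, 42)},
]


def headline(item):
    """Return the title, falling back to the description."""
    return item.get('title') or item.get('description', '(untitled)')


# Sort newest first on the decoded datetime, then collect headlines.
newest_first = sorted(items, key=lambda item: item['pubDate'],
                      reverse=True)
headlines = [headline(item) for item in newest_first]
```

Sorting works directly on pubDate because the values are already datetime objects rather than raw strings.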