Digital Libraries and RSS: Reaching Out with Web Feeds
by Joy Weese Moll
April 2005
Abstract: Using web
feeds, digital libraries can syndicate content to the many readers who use news
aggregators. A web feed created with an RSS file requires little or no
technical expertise to produce and, depending on the content in the feed, may
take little or no extra effort to maintain beyond the normal activities and
processes of the digital library. Web feeds allow digital libraries to reach
more readers, more frequently, with very little time, effort, or cost.
Keywords: blogs, communication protocols,
digital library management, electronic publishing, mark-up language, marketing,
RSS, XML
Personal Statement
Before I discovered web feeds, I
spent a lot of time visiting web sites on my “Everyday” list, often to find
nothing new. When I began reading RSS web feeds through a news aggregator a few
months ago, I became more efficient and am now able to read several times as many
news sources as I did before. I first heard the concept of a
personalized electronic newspaper many years ago but had been disappointed with
every implementation I tried—the news aggregator finally fulfilled that
promise. As a consumer of information, I enjoy the experience of receiving it
through web feeds and I appreciate the individuals and organizations that
provide it that way.
As providers of information,
libraries, particularly digital ones with their electronic content, have a
splendid marketing opportunity to reach people like me. Karen G. Schneider, director of the Librarians’
Index to the Internet, made a convincing case for digital library web feeds in
her blog, Free
Range Librarian:
Donning
my lii.org hat, we had a remarkable education when we added RSS feeds. Now
people find us through the blog-finding agents. Librarians,
including me, suck at marketing, but by adding RSS feeds, we stumbled onto a
way for the audience to find us, instead of the glacially slow process of
dissemination through our existing readership.[1]
Introduction to RSS
To participate in this phenomenon of
being found by readers, a digital library must create a web feed, or RSS file.
“In essence, RSS is a simple XML syntax for describing a channel or feed of
recent additions to a website.”[2]
XML stands for Extensible Markup Language; it is a markup language related to HTML
that is particularly useful for describing data.[3]
The emphasis in the above definition of RSS should be on “simple.” Anyone who
is comfortable writing HTML can write an RSS file and there are a number of
ways to create an RSS file with no coding at all.
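To show how small an RSS file can be, here is a minimal sketch in Python that builds an RSS 2.0 feed with one item, using only the standard library. The library name, URLs, and item text are hypothetical placeholders, and the function name build_feed is an illustration rather than part of any RSS tool.

```python
import xml.etree.ElementTree as ET


def build_feed(title, link, description, items):
    """Return an RSS 2.0 document as a string.

    items is a list of dicts with 'title', 'link', and 'description' keys.
    """
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires a title, link, and description for the channel.
    for tag, text in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, tag).text = text
    # Each recent addition to the site becomes one <item>.
    for entry in items:
        item = ET.SubElement(channel, "item")
        for tag in ("title", "link", "description"):
            ET.SubElement(item, tag).text = entry[tag]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical channel and item, for illustration only.
feed_xml = build_feed(
    "Example Digital Library: What's New",
    "http://library.example.org/",
    "Recent additions to a hypothetical collection",
    [
        {
            "title": "New photograph collection online",
            "link": "http://library.example.org/collections/photos",
            "description": "A placeholder announcement.",
        }
    ],
)
print(feed_xml)
```

The output is a single XML document that could be saved as a file on the web server and linked from the library's home page.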
What does RSS stand for? Depending on
the version, it can stand for Rich Site Summary, RDF Site Summary, or Really
Simple Syndication. There is also a related format called Atom.[4]
“RSS goes by many names and sports multiple version numbers that do not reflect
any true lineage or patronage so much as a branding.”[5]
From the reader’s point of view, the particular version of the RSS file is
immaterial since the news aggregators will read all of the half dozen formats.
How does a publisher, like a digital
library, choose an RSS version for its feed? For a simple feed, any version
will do. Two situations may drive a digital library to choose a particular
version of RSS. First, a digital library that uses Dublin Core for its metadata
will want to take a careful look at RSS 1.0 which has long-established
guidelines for use with Dublin Core.[6]
Second, a digital library with a collection of audio content may want to
consider RSS 2.0, which supports podcasting, the delivery of audio
files to iPods and similar audio players.[7]
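In RSS 2.0, podcasting relies on the enclosure element, which attaches a media file to an item through url, length, and type attributes. The following sketch builds one such item. The recording title, URLs, and file size are hypothetical, and audio_item is an illustrative helper name.

```python
import xml.etree.ElementTree as ET


def audio_item(title, page_url, audio_url, size_bytes, mime_type="audio/mpeg"):
    """Build an RSS 2.0 <item> that carries an audio file as an enclosure."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = page_url
    # The enclosure points at the audio file; length is its size in bytes.
    ET.SubElement(
        item,
        "enclosure",
        {"url": audio_url, "length": str(size_bytes), "type": mime_type},
    )
    return item


# Hypothetical oral-history recording, for illustration only.
item = audio_item(
    "Oral history interview, part 1",
    "http://library.example.org/oral-history/1",
    "http://library.example.org/audio/interview1.mp3",
    12345678,
)
print(ET.tostring(item, encoding="unicode"))
```

An item like this would sit inside the channel of an ordinary RSS 2.0 feed; podcast-aware aggregators download the file named in the enclosure.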
Literature Review
Roy Tennant of the California Digital
Library was an early proponent of RSS feeds in libraries, publishing a short
overview article in the
The National Cancer Institute and its
digital library, LION, made use of RSS feeds both for collection and for
outreach. In a Fall 2003 article in the Net
Connect supplement to Library Journal,
Kevin Broun explained how NCI gathered health-related
RSS feeds, parsed the content, and stored it in the LION database. At the same
time, LION began to offer RSS feeds to its users.[9]
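The gather, parse, and store pattern described for LION can be sketched in a few lines. This is not NCI's implementation; it is a minimal illustration that assumes an RSS 2.0 feed, a hypothetical table named feed_items, and an inline sample feed standing in for content fetched from the web.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Inline sample standing in for a feed downloaded from another site.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Hypothetical Health News</title>
<link>http://news.example.org/</link>
<description>Sample feed</description>
<item><title>First story</title><link>http://news.example.org/1</link>
<description>Summary one</description></item>
<item><title>Second story</title><link>http://news.example.org/2</link>
<description>Summary two</description></item>
</channel></rss>"""


def store_items(feed_xml, conn):
    """Parse the feed's items and store them, skipping links already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS feed_items "
        "(link TEXT PRIMARY KEY, title TEXT, description TEXT)"
    )
    root = ET.fromstring(feed_xml)
    rows = [
        (
            item.findtext("link"),
            item.findtext("title"),
            item.findtext("description", default=""),
        )
        for item in root.iter("item")
    ]
    # The link is the primary key, so re-reading a feed adds no duplicates.
    conn.executemany("INSERT OR IGNORE INTO feed_items VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
store_items(SAMPLE_FEED, conn)
store_items(SAMPLE_FEED, conn)  # a second pass over the same feed
count = conn.execute("SELECT COUNT(*) FROM feed_items").fetchone()[0]
print(count)
```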
Gerry McKiernan, of the Iowa State
University Library, published an article on LLRX.com on
In “RSS: The Latest Feed,” Judith Wusteman lists library-related uses for RSS, including blogs, announcements, acquisitions, and journal contents.
She provides detailed listings of aggregators, a history of RSS development,
and a look at the future of RSS.[11]
The Nature Publishing Group chose to
use RSS 1.0 for its web feeds, because it has defined guidelines for Dublin
Core metadata and because they could employ a similar technique to define a
metadata standard for serial publishers. Besides describing their process, this
D-Lib article has an excellent
overview of RSS and information about how a variety of scientific publishers
are currently using RSS feeds.[12]
OCLC’s new publications repository has a web feed. OCLC
built a Content Management System around its existing workflow: publications
are deposited in the repository and sent to the RSS feed when they are added to
WorldCat. This system is described in a March 2005
article in D-Lib.[13]
Creating web feeds
One of the easiest ways to create a
web feed is to write a blog. This can be done with no
coding by using a web-based blogging program and
hosting service, like Blogger and Blogspot
(www.blogger.com). With just a little
HTML and network technology expertise, the digital library can host its own blog, utilizing the same design as the rest of the web
site. The blog can be updated, as easily as typing
email, using either web-based blogging software or a
program that is installed on the digital library’s computer. The blogging software will update the web feed automatically.
With a self-hosted blog, there is no reason for the
average user to know that a blog is the underlying
technology. It can look like a “What’s New” page or column on the web site.
As described in some of the
literature, many RSS feeds are created without blogs.
An RSS file can be coded by hand or generated from a Perl script. Many Content Management Systems are capable
of producing RSS feeds.[14]
A digital library using a Content Management System to put data and metadata
into a database may be able to easily implement a web feed that sends
descriptive metadata to readers’ news aggregators at the same time.
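As a sketch of that idea, the code below reads the newest records from a database and emits an RSS 2.0 feed whose items carry Dublin Core creator and date elements through a namespace extension. The records table, its columns, the sample rows, and all URLs are hypothetical; a real Content Management System would have its own schema and may offer feed generation built in.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Dublin Core element set namespace, used here to extend RSS 2.0 items.
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)


def feed_from_records(conn, limit=10):
    """Build a feed of the most recently added records."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "New in the collection"
    ET.SubElement(channel, "link").text = "http://library.example.org/"
    ET.SubElement(channel, "description").text = "Recently added items"
    rows = conn.execute(
        "SELECT title, url, creator, date_added FROM records "
        "ORDER BY date_added DESC LIMIT ?",
        (limit,),
    )
    for title, url, creator, date_added in rows:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = url
        # Descriptive metadata travels with the item as dc: elements.
        ET.SubElement(item, f"{{{DC}}}creator").text = creator
        ET.SubElement(item, f"{{{DC}}}date").text = date_added
    return ET.tostring(rss, encoding="unicode")


# Hypothetical repository contents, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (title TEXT, url TEXT, creator TEXT, date_added TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?, ?)",
    [
        ("Older report", "http://library.example.org/r/1", "A. Author", "2005-03-01"),
        ("Newer report", "http://library.example.org/r/2", "B. Author", "2005-03-15"),
    ],
)
records_feed = feed_from_records(conn)
print(records_feed)
```

Because the feed is generated from the same database that holds the collection's metadata, adding a record is the only step needed to announce it.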
Creating content for web feeds
Most
digital libraries could benefit from having a “What’s New” or project
announcement feed for patrons, donors, and fans of their sites. There are many
other creative ways for digital libraries to reach out to their readers with
web feeds.
Some
digital libraries have RSS feeds that announce each item added to the
collection. The Librarians’ Index to the Internet has an RSS feed that is
generated weekly to show the new items added that week. New reports from the
Pew Internet and American Life project are announced via categorized web feeds.
Project Gutenberg (www.gutenberg.org)
announces newly-added electronic books via RSS.
Many print
and electronic journal publishers, including Nature, BioMed,
and National Geographic, use RSS
feeds to issue table of contents alerts. OhioLINK, a
consortium of academic libraries in
Digital
libraries can use RSS feeds to supplement their current awareness services. The
Digital Library of Earth System Education (www.dlese.org)
has categorized subject feeds for new items, but it also has a variety of feeds
for current awareness in the profession, including jobs, grants, and
conferences.
Some search
engines are providing RSS feeds of search results—new items detected by the
search appear in the user’s news aggregator each day. HubMed
(http://www.hubmed.org/) provides RSS
feeds of search results in its alternative interface to PubMed,
the National Library of Medicine’s database of medical documents.
Web feeds
do not always have to contain new content to be of interest to the reader.
There are many today-in-history or photo-of-the-day services that could be
emulated by digital libraries, using content that is already present in the
collection. A book-oriented collection could provide a chapter a day of classic
novels through a web feed. Collections with diaries or calendars, such as the
Truman calendar (http://www.trumanlibrary.org/calendar/index.html),
could issue an entry corresponding to the day on the calendar. While of less
practical use to the reader, this type of content offers amusement in the news
aggregator much in the same way that the comics page does in a newspaper. The
benefit to the provider of the content is obvious—daily patrons.
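Selecting an entry that corresponds to the day on the calendar is a simple match on month and day. The sketch below assumes the collection's entries are available as date and text pairs; the entries shown are placeholders, not actual calendar content.

```python
from datetime import date

# Placeholder entries standing in for a digitized diary or calendar.
ENTRIES = [
    (date(1947, 3, 12), "Placeholder entry for March 12, 1947"),
    (date(1948, 3, 12), "Placeholder entry for March 12, 1948"),
    (date(1948, 11, 2), "Placeholder entry for November 2, 1948"),
]


def entries_for_today(entries, today=None):
    """Return the entries whose month and day match today's date."""
    today = today or date.today()
    return [
        (d, text)
        for d, text in entries
        if (d.month, d.day) == (today.month, today.day)
    ]


# A fixed date is passed here so the example is repeatable.
matches = entries_for_today(ENTRIES, today=date(2005, 3, 12))
for d, text in matches:
    print(f"On this day in {d.year}: {text}")
```

Each match could then be written out as one item in the day's feed.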
Further study and conclusion
According to a recent Pew report,
five percent of internet users subscribe to web feeds through RSS news
aggregators. This was the first time that Pew asked that question in a survey,
so there is not yet any measure of growth.[15]
Digital libraries that implement RSS feeds will want to monitor the growth of
usage of news aggregators and be aware of future alternatives that may one day
cause the growth to stagnate.
Anyone implementing RSS feeds as a
primary marketing tool will want to keep an eye on the developments related to
RSS. One is OPML, Outline Processor Markup Language, which allows people to
move their lists of feeds from one computer to another. According to Judith Wusteman, the same process could be used to share lists of
feeds by subject area.[16]
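An OPML subscription list is a short XML outline in which each feed is one outline element with text, type, and xmlUrl attributes. The sketch below writes such a list for a subject area. The feed names and URLs are hypothetical, build_opml is an illustrative name, and the version attribute is set to 2.0 on the assumption that the reader's aggregator accepts it.

```python
import xml.etree.ElementTree as ET


def build_opml(list_title, feeds):
    """Return an OPML subscription list; feeds is a list of (name, url) pairs."""
    opml = ET.Element("opml", {"version": "2.0"})
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = list_title
    body = ET.SubElement(opml, "body")
    for name, feed_url in feeds:
        # One <outline> per feed; xmlUrl holds the address of the RSS file.
        ET.SubElement(
            body, "outline", {"text": name, "type": "rss", "xmlUrl": feed_url}
        )
    return ET.tostring(opml, encoding="unicode")


# Hypothetical subject list, for illustration only.
opml_xml = build_opml(
    "Earth science feeds",
    [
        ("Example feed one", "http://feeds.example.org/one.xml"),
        ("Example feed two", "http://feeds.example.org/two.xml"),
    ],
)
print(opml_xml)
```

A library could publish a file like this so that readers import a whole subject list into their aggregators in one step.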
Another potential development is the presence of RSS reading capability within
browser software.[17]
For any marketing effort, it is
useful to have numeric measures of results. Unfortunately, readership of web
feeds is not as easy to count as “hits” on a web site. The number of
subscribers in Bloglines can be somewhat useful for
comparing the popularity of one RSS feed with another, but is only a partial
measure of readers since there are many other news aggregators.
Fortunately, the investment in
creating and maintaining a web feed is small. The return that can be partially
measured in Bloglines and indirectly measured by
increased traffic on the web site may be enough for many digital libraries to
justify the creation of one or more web feeds. Web feeds are new enough to give
digital libraries the panache of early implementation of technology while being
established enough to provide a good return on the investment required to
create them. A web feed extends the reach of the digital library by delivering
information directly to each patron’s daily electronic newspaper: the news
aggregator.
[1] Karen G.
Schneider, “Lists versus Blogs: Wait and See,” Free Range Librarian,
[2] Judith Wusteman, “RSS: The Latest Feed,” Library Hi Tech, 22, no. 4 (2004) <http://www.ucd.ie/wusteman/lht/wusteman-rss.html>
(
[3] Ken Sall, “XML:
Structuring Data for the Web: An Introduction,” Web Developer’s Virtual Library,
[4]
Elisabeth M. Long, RSS: What it Is, How
it Works, How to Use It, 2004, <http://dldc.lib.uchicago.edu/talks/2004/rss/>
(
[5] Tony
Hammond, Timo Hannay, and
Ben Lund, “The Role of RSS in Science Publishing: Syndication and Annotation on
the Web,” D-Lib, December 2004, <http://www.dlib.org/dlib/december04/hammond/12hammond.html>
(
[6] Ibid.
[7] David Winer, “What is Podcasting,” iPodder.org,
[8] Roy Tennant,
“Feed Your Head: Keeping Up by Using RSS,” Library
Journal,
[9] Kevin Broun, “Integrating Internet Content,” Library Journal: Net Connect, Fall 2003: 20-23. Library Literature and Information Science
Full Text, H.W. Wilson (
[10] Gerry
McKiernan, “Rich Site Services: Web
Feeds for Extended Information and Library Services,” LLRX,
[11] Wusteman.
[12]
[13] Shirley
Hyatt and Jeffrey A. Young, “OCLC Research Publications Repository,” D-Lib, March 2005, <http://www.dlib.org/dlib/march05/hyatt/03hyatt.html> (
[14] Mark
Nottingham, RSS Tutorial for Content
Publishers and Web Masters,
[15] Lee Rainie, “The State of Blogging,”
Pew/Internet: Pew Internet and American Life Project, January 2005, <http://www.pewinternet.org/pdfs/PIP_blogging_data.pdf> (
[16] Wusteman.