This document provides an overview of important OAI-PMH
definitions and concepts. For the complete,
official, and up-to-date OAI-PMH specification, see http://www.openarchives.org/OAI/openarchivesprotocol.html
I. Basic Definitions:
Repository |
An OAI-Compliant Repository is a network-accessible
server to which OAI Requests, embedded in HTTP, can be submitted. OAI-Compliant Repositories may be
registered with a central OAI Registration Authority. |
|
Request |
An OAI Request may be expressed using either the
HTTP GET or POST methods. All OAI
Requests of a given repository are submitted to the single Base-URL
for that repository and consist of a list of arguments in the form of
key=value pairs. One key will always
be an OAI Verb. The other keys
will vary by OAI verb and the specific nature of the request. All keys and most values are
case-sensitive. |
|
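As a minimal sketch of the request form described above, the following Python builds a GET request URL from a Base-URL, a verb, and key=value pairs. The function name `build_request_url`, the Base-URL `http://example.org/oai`, and the identifier are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlencode


def build_request_url(base_url, verb, **arguments):
    """Return a GET URL for an OAI request: Base-URL plus key=value pairs."""
    # The verb key is always present; the other keys depend on the verb.
    params = {"verb": verb}
    # Arguments left as None are omitted so optional keys can be skipped.
    params.update({k: v for k, v in arguments.items() if v is not None})
    # urlencode percent-escapes reserved characters in the values.
    return base_url + "?" + urlencode(params)


url = build_request_url(
    "http://example.org/oai",           # hypothetical Base-URL
    "GetRecord",
    identifier="oai:example.org:1234",  # hypothetical record identifier
    metadataPrefix="oai_dc",
)
print(url)
```

Because keys and most values are case-sensitive, the sketch passes them through unchanged rather than normalizing case.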
Response |
An OAI Response is the XML-encoded byte stream,
embedded in HTTP, which is returned by a repository in response to an OAI
Request. The HTTP status line and
HTTP headers accompanying an OAI response may be used by an OAI repository to
indicate exception conditions. OAI
responses must be valid XML (other than exception condition responses which
may be only well-formed XML). |
|
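A harvester might triage a response along the lines described above before processing it. This sketch assumes that any HTTP status other than 200 signals an exception condition, and it checks only that the body is well-formed XML; the Python standard library does not validate against a schema, so full validity checking is out of scope here. The function name `check_response` is hypothetical.

```python
import xml.etree.ElementTree as ET


def check_response(status_code, body):
    """Classify an OAI response as 'exception', 'malformed', or 'ok'."""
    # Assumption: a non-200 HTTP status indicates an exception condition.
    if status_code != 200:
        return "exception"
    try:
        # Parsing succeeds only if the body is well-formed XML.
        ET.fromstring(body)
    except ET.ParseError:
        return "malformed"
    return "ok"
```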
Record |
An OAI Record is a <record> node of an
OAI Response as returned by a repository to satisfy an OAI Request
for metadata describing an item or items in that repository. Each OAI Record consists of 2
required nodes, <header> and <metadata>, and 1
optional node, <about>. |
Identifier |
An OAI Record Identifier is a persistent,
repository-unique key used to extract and identify a specific OAI Record
held by a repository. If the
repository is registered, all OAI Record Identifiers for that
repository will be unique across the entire registered OAI namespace. However, the same metadata content may be
associated with multiple OAI Record Identifiers (e.g., if the same metadata
content is held by multiple repositories).
To be valid, an OAI Record must include its OAI Record Identifier in
its <header> node. |
|
Datestamp |
An OAI Record Datestamp gives the date of creation,
deletion, or last modification of the <metadata> node contained
in that OAI Record. It is a
date only; no clock time is included.
To be valid, an OAI Record must include an OAI Record Datestamp in its
<header> node. |
|
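The Record, Identifier, and Datestamp definitions above can be illustrated with a short parser. This is a sketch under stated assumptions: the sample record is invented, it omits the XML namespaces a real response would carry, and the child element names `identifier` and `datestamp` inside `<header>` are taken from the protocol rather than from this overview. The datestamp is parsed as a date only, as described above.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Invented sample record; real responses also declare XML namespaces.
SAMPLE_RECORD = """
<record>
  <header>
    <identifier>oai:example.org:1234</identifier>
    <datestamp>2001-09-17</datestamp>
  </header>
  <metadata><title>Sample</title></metadata>
</record>
"""


def local_name(tag):
    """Drop any '{namespace}' prefix that ElementTree adds to a tag."""
    return tag.rsplit("}", 1)[-1]


def read_header(record_xml):
    """Return (identifier, date) from a <record>; raise if either is missing."""
    record = ET.fromstring(record_xml)
    header = next((c for c in record if local_name(c.tag) == "header"), None)
    if header is None:
        raise ValueError("record has no <header> node")
    fields = {local_name(c.tag): (c.text or "").strip() for c in header}
    for required in ("identifier", "datestamp"):
        if not fields.get(required):
            raise ValueError(f"<header> is missing <{required}>")
    # Date only: no clock time is expected in the datestamp.
    day = datetime.strptime(fields["datestamp"], "%Y-%m-%d").date()
    return fields["identifier"], day


identifier, day = read_header(SAMPLE_RECORD)
```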
Set |
An OAI Set is an optional construct for grouping items
in a repository for the purpose of selective harvesting of records. OAI Sets may be hierarchical (if so,
members of child sets are also retrieved as part of the parent set). “set” is an optional key for some OAI
Verbs. |
|
Metadata Prefix |
“metadataPrefix” is a required key for certain OAI
Verbs. It is used to specify the
XML schema of the OAI Response <metadata> node(s) returned from
a repository to satisfy an OAI Request. Currently all fully compliant OAI repositories must support the
“oai_dc” Metadata Prefix for all non-deleted OAI Record Identifiers contained
in the repository. |
|
Flow Control |
Repository resource use may be managed in 2 ways. A repository may chunk a long response to
an OAI Request. When using
this method, a repository includes a <resumptionToken> node as
part of its OAI Response. To retrieve the next chunk of an OAI
Response, a harvest service will include this resumptionToken value as part
of its next OAI Request. Repositories
may also return an HTTP status of 503 (Service Unavailable) as a way to manage
flow control. When returning a
status of 503, the repository must also return a “Retry-After” HTTP
response header. OAI-compliant
harvest services must respect this header value. |
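Both flow-control mechanisms can be sketched as one harvesting loop. The `fetch` callable is hypothetical: it is assumed to perform the HTTP request and return a tuple of (status, headers, body, resumption token or None). The sketch also assumes that Retry-After carries a number of seconds, and `max_retries` is an illustrative safeguard that is not part of the protocol.

```python
import time


def harvest(fetch, first_request, max_retries=3, sleep=time.sleep):
    """Yield each response chunk, following resumptionTokens and honoring 503s.

    `fetch(params)` is a hypothetical callable returning
    (status, headers, body, resumption_token_or_None).
    """
    params = dict(first_request)
    retries = 0
    while True:
        status, headers, body, token = fetch(params)
        if status == 503:
            if retries >= max_retries:
                raise RuntimeError("repository kept answering 503")
            retries += 1
            # Assumption: Retry-After is given in seconds.
            sleep(int(headers["Retry-After"]))
            continue  # repeat the same request after waiting
        retries = 0
        yield body
        if not token:
            return  # no <resumptionToken>: the list is complete
        # The follow-up request carries only the verb and the token.
        params = {"verb": params["verb"], "resumptionToken": token}
```

Passing `sleep` as a parameter keeps the loop testable without real delays; a caller would normally leave it at the default.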
II. Verbs Used in OAI Requests:
Identify |
This verb is used to retrieve information about a Repository. No additional arguments are allowed for this
verb. An Identify Response
includes the Base-URL of the repository, the OAI protocol version supported,
the repository name, and the email address of the repository
administrator. Additional
human-readable and community-specific descriptive information about the
repository also may be provided. |
ListMetadataFormats |
This verb is used to retrieve the Metadata Formats
available from a Repository or for a particular Record. Not all records in a repository need be
available in all formats. The only
allowed optional argument is identifier (used to find available
metadata formats for a particular record).
A ListMetadataFormats Response includes metadata prefix,
namespace (optional), and XSD (for validation) for each metadata format
available. If the identifier
specified in the request is not available, the request does not generate an
error response (rather the response simply contains no metadata formats). |
|
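A harvester could read a ListMetadataFormats response roughly as follows. The sketch assumes each format is reported in a `metadataFormat` element with child elements such as `metadataPrefix` and `schema`; the sample response and its schema URL are invented. An empty result is treated as a normal outcome, matching the behavior described above for an unavailable identifier.

```python
import xml.etree.ElementTree as ET

# Invented sample response; the schema URL is hypothetical.
SAMPLE_FORMATS = """
<ListMetadataFormats>
  <metadataFormat>
    <metadataPrefix>oai_dc</metadataPrefix>
    <schema>http://example.org/schemas/dc.xsd</schema>
  </metadataFormat>
</ListMetadataFormats>
"""


def local_name(tag):
    """Drop any '{namespace}' prefix that ElementTree adds to a tag."""
    return tag.rsplit("}", 1)[-1]


def metadata_formats(response_xml):
    """Return one dict per <metadataFormat>; an empty list is not an error."""
    root = ET.fromstring(response_xml)
    formats = []
    for node in root.iter():
        if local_name(node.tag) == "metadataFormat":
            formats.append(
                {local_name(c.tag): (c.text or "").strip() for c in node}
            )
    return formats


formats = metadata_formats(SAMPLE_FORMATS)
```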
ListSets |
This verb is used to retrieve the Set structure of
a Repository. The only allowed
optional argument is resumptionToken.
A ListSets Response includes setSpec (string used as value for
optional “set” key argument allowed with verbs ListIdentifiers and ListRecords)
and setName (human-readable string useful for display purposes). The required syntax for construction of
the setSpec reveals the hierarchical relationship to parent sets (if any). If a repository has no set structure, a
valid, non-error response is returned containing no information about any
sets. |
|
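The hierarchy encoded in a setSpec can be unpacked with a few lines of Python. This sketch assumes the colon-separated setSpec syntax defined in the protocol specification, which this overview does not spell out; the set names and both function names are hypothetical.

```python
def ancestor_sets(set_spec):
    """Return the setSpecs of all parent sets, outermost first."""
    parts = set_spec.split(":")
    # Every proper prefix of the colon-separated path names a parent set.
    return [":".join(parts[:i]) for i in range(1, len(parts))]


def in_set(record_set_spec, requested_set_spec):
    """True if a record's set is the requested set or one of its descendants."""
    return (
        record_set_spec == requested_set_spec
        or record_set_spec.startswith(requested_set_spec + ":")
    )


parents = ancestor_sets("institution:library:maps")  # hypothetical setSpec
```

The `in_set` check mirrors the rule that members of child sets are also retrieved when the parent set is requested.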
ListIdentifiers |
This verb is used to retrieve the identifiers of
records that can be harvested from a Repository. Allowed optional arguments are until,
from, and set. until
and from are used to limit retrieval by date, while set limits retrieval by
set. resumptionToken is also
an allowed optional argument, but may not be used in combination with any
other argument. Any identifiers that match the
limit criteria are returned. Deleted
identifiers that match limits are returned with their XML status attribute
set to “deleted”. Return order of
identifiers is arbitrary and entirely up to the repository (may vary request
to request). An empty list is a valid
response. |
|
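The argument rules for ListIdentifiers, as this overview states them, could be enforced on the harvester side before a request is sent. The function and parameter names below are hypothetical; the keys in the returned dictionary are the protocol argument names.

```python
def list_identifiers_params(from_date=None, until=None, set_spec=None,
                            resumption_token=None):
    """Build ListIdentifiers arguments, rejecting invalid combinations."""
    selective = {"from": from_date, "until": until, "set": set_spec}
    # Keep only the optional limits the caller actually supplied.
    selective = {k: v for k, v in selective.items() if v is not None}
    if resumption_token is not None:
        if selective:
            raise ValueError(
                "resumptionToken may not be combined with: "
                + ", ".join(sorted(selective))
            )
        return {"verb": "ListIdentifiers", "resumptionToken": resumption_token}
    return {"verb": "ListIdentifiers", **selective}
```

For example, `list_identifiers_params(from_date="2001-01-01", set_spec="maps")` yields the verb plus the `from` and `set` keys, while adding a resumption token to that call raises `ValueError`.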
GetRecord |
This verb is used to retrieve an individual Record
from a Repository. Required
arguments are identifier and metadataPrefix. There are no optional arguments. A GetRecord Response will return a
record containing a metadata node in the requested format, if available. If the identifier is not available, no
<record> node is included in the response. If the identifier is valid but the record is not available in the requested format,
no <metadata> node is included in the <record> node returned. |
|
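The three GetRecord outcomes described above can be told apart by inspecting the response structure. This is a sketch only: the sample documents in the usage note are invented, namespaces are ignored, and the outcome labels are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop any '{namespace}' prefix that ElementTree adds to a tag."""
    return tag.rsplit("}", 1)[-1]


def get_record_outcome(response_xml):
    """Classify a GetRecord response by the nodes it contains."""
    root = ET.fromstring(response_xml)
    record = next(
        (n for n in root.iter() if local_name(n.tag) == "record"), None
    )
    if record is None:
        # No <record> node: the identifier is not available.
        return "identifier not available"
    has_metadata = any(local_name(c.tag) == "metadata" for c in record)
    # A <record> without <metadata>: the requested format is not available.
    return "record returned" if has_metadata else "format not available"
```

Calling it on `<GetRecord/>` gives "identifier not available", and on a response whose `<record>` holds only a `<header>` it gives "format not available".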
ListRecords |
This verb is used to harvest multiple Records from
a Repository. metadataPrefix
is a required argument (except when resumptionToken is used). until, from, set, and
resumptionToken are optional arguments as with ListIdentifiers. All records in the repository that match
the limits specified are retrieved. A
<record> node for a deleted record contains no <metadata> node
and includes an XML status attribute set to “deleted”. A <record> node for a record not
available in the requested metadata format contains no <metadata>
node. Return order of records is
arbitrary and entirely up to the repository (may vary request to
request). An empty list is a valid
response. |
|
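A harvester processing a ListRecords response has to separate ordinary records from deleted ones and from records lacking the requested format. The sketch below follows the description above in reading the status attribute from the `<record>` node; the sample response is invented, namespaces are omitted, and the group names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented sample response covering the three cases described above.
SAMPLE_LIST = """
<ListRecords>
  <record>
    <header><identifier>oai:example.org:1</identifier><datestamp>2001-09-01</datestamp></header>
    <metadata><title>First</title></metadata>
  </record>
  <record status="deleted">
    <header><identifier>oai:example.org:2</identifier><datestamp>2001-09-10</datestamp></header>
  </record>
  <record>
    <header><identifier>oai:example.org:3</identifier><datestamp>2001-09-12</datestamp></header>
  </record>
</ListRecords>
"""


def local_name(tag):
    """Drop any '{namespace}' prefix that ElementTree adds to a tag."""
    return tag.rsplit("}", 1)[-1]


def sort_records(response_xml):
    """Group record identifiers into 'available', 'deleted', 'no_metadata'."""
    groups = {"available": [], "deleted": [], "no_metadata": []}
    root = ET.fromstring(response_xml)
    for record in root:  # direct children only; skips nested metadata content
        if local_name(record.tag) != "record":
            continue
        children = {local_name(c.tag): c for c in record}
        identifier = next(
            (c.text or "").strip()
            for c in children["header"]
            if local_name(c.tag) == "identifier"
        )
        if record.get("status") == "deleted":
            groups["deleted"].append(identifier)
        elif "metadata" in children:
            groups["available"].append(identifier)
        else:
            groups["no_metadata"].append(identifier)
    return groups


groups = sort_records(SAMPLE_LIST)
```

Since the return order of records is up to the repository, the sketch preserves whatever order the response uses and makes no attempt to sort.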
Document |
This verb is not part of the OAI-PMH, but is often
implemented to facilitate testing of the repository using an XML-aware Web
browser. There are no optional
arguments. |
All verbs are case-sensitive.
Timothy W. Cole, University of Illinois at UC
17 September 2001
Comments to: Tom Habing. Updated on: 9-16-01 TWC