unAPI revision 0
================

Background
----------

unAPI is a simple website API convention. There are many wonderful APIs and protocols for syndicating, searching, and harvesting content from diverse services on the web. They're all great, and they're all already widely used, but they're all different. We want one API for the most basic operations necessary to perform simple clipboard-copy functions across all sites. We also want this API to be easy to layer on top of other well-known APIs.

Objective
---------

The objective of unAPI is to enable web sites with HTML interfaces to information-rich objects to simultaneously publish richly structured metadata for those objects, or those objects themselves, in a predictable and consistent way for machine processing.

How it works
------------

unAPI consists of three parts:

1. A URI microformat: describes a standard way of identifying individual information-rich objects on arbitrary web pages;
2. An HTML "autodiscovery" link pointing to a unAPI service applicable to objects on particular sites;
3. A small set of HTTP interface functions for accessing identified objects and information about them. Some of these functions have a standardized response format.

1. A URI microformat

(Note: this has not been endorsed by the microformat project.)

For identifiable objects referenced on a page, add an HTML block inside the parent element representing those objects that looks like this:

    some-uri

For example, if you have a reference to the PubMed record with PMID 12345678, you would publish:

    info:pmid/12345678

inside the object representation's parent element. For example:

    info:pmid/12345678

It is not required that the span-uri element be the first child of the parent element.

2. An autodiscovery link pointing to an appropriate unAPI service

Each web page containing at least one span-uri must also contain an HTML LINK element for unAPI autodiscovery, having the following attribute values:

...where the value of the href attribute is a URL where unAPI requests are to be directed, with the trailing slash.

3. HTTP interface functions

The basic unAPI functions are defined below. Except where a function is noted as having no specified response syntax, responses are expressed in JSON. Each such response is a two-item list comprising:

1. A header object with a required key "status" whose value is a status code, and an optional key "message" whose string value provides further explanation for a particular status.

   Possible status codes are from HTTP and include:

   * 200: OK, any successful request.
   * 400: Bad request, for requests that are incorrect or outside of this specification.
   * 404: Not found, for requests on URIs or specific URI representation formats not available to this server.

   Valid examples of the header object include:

       `{"status": 200}'
       `{"status": 400, "message": "Unknown function"}'

   (spaces after ':'s added for readability here and in subsequent JSON examples)

2. A response object, as defined below.

For each of the following function definitions, UNAPI means "the full URL to a unAPI service", e.g. http://example.com/unapi, and URI means "some URI of interest", e.g. info:pmid/12345678. URIs in unAPI HTTP calls must be URL-encoded.

* go: Go immediately to the application's standard HTML dissemination of this URI, which could be a table of contents, or a thumbnail, etc. For example:

      UNAPI/go/URI

  UNAPI/go has no specified response syntax; an HTML page should be expected.

* formats: List the metadata and object formats available for objects at this site. For example:

      UNAPI/formats

  UNAPI/formats responses are a JSON object with format names as keys and, as each value, an object containing key/value pairs describing that format.

  Examples of format names (keys) might be any of "dc", "opml", "atom", "mods", or "didl". Supported format descriptor keys are:

  * docs - a pointer to where information about this format may be found
  * name - a human-readable brief description of the format
  * schema - a pointer to the format schema, if applicable
  * type - the MIME type of the format

  An example response (not including the required header):

      `{"mods": {"docs": "http://www.loc.gov/standards/mods/", "type": "application/xml", "name": "Metadata Object Description Schema", "schema": "http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-0.xsd"}, "dc": {"docs": "http://dublincore.org/", "type": "application/xml", "name": "Dublin Core Metadata Element Set, Version 1.1", "schema": "http://dublincore.org/documents/2003/04/02/dc-xml-guidelines/"}}'

* URI/formats: List the metadata and object formats available for the object identified by this URI. For example:

      UNAPI/URI/formats

  UNAPI/URI/formats responses are defined exactly as for UNAPI/formats, but list only the formats available for a particular URI, which may differ from the site-wide list.

* URI/SOMEFORMAT: Return the specified object in the specified format. For example:

      UNAPI/URI/dc

  would directly return a Dublin Core metadata record for the object identified by URI. Nothing limits unAPI to returning only metadata, however; it could as easily return objects themselves in bare formats, such as an image:

      UNAPI/URI/jpeg

  or in an object+metadata wrapper structure such as MPEG-21 DIDL or METS:

      UNAPI/URI/mets

  UNAPI/URI/SOMEFORMAT has no specified response syntax; the media type and data should be as expected for the format requested.

Example complete response
-------------------------

This is an example response to the UNAPI/formats function.

    `[{"status": 200}, {"mods": {"docs": "http://www.loc.gov/standards/mods/", "type": "application/xml", "name": "Metadata Object Description Schema", "schema": "http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-0.xsd"}, "dc": {"docs": "http://dublincore.org/", "type": "application/xml", "name": "Dublin Core Metadata Element Set, Version 1.1", "schema": "http://dublincore.org/documents/2003/04/02/dc-xml-guidelines/"}}]'

Open Issues
-----------

The following points are open for debate. Code wins over theory: simpler, easier-to-understand, less time-to-implement, etc.

* Should JSON and XML both be acceptable response formats? Should it be something else entirely, like microformat-marked-up XHTML?
* Do we need response codes?
* Can the path order be more consistent?
* Are GET params better than fixed paths?
* Should we have a UNAPI/services function that lists all available services with paths (e.g. rss, atom, opensearch, oai, etc.)?
* Should there be a UNAPI/search function?
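
Illustrative client sketch

The HTTP interface functions in part 3 can be exercised with very little client code. The following Python sketch is one possible reading of this revision, not a reference implementation. The helper names (encode_uri, go_url, formats_url, object_url, parse_response) are hypothetical. It assumes that "url-encoded" means percent-encoding every reserved character in the URI, including ':' and '/', so that the URI occupies a single path segment. It also assumes the service URL is written without a trailing slash, as in the example http://example.com/unapi, and that a client treats any status other than 200 as an error.

```python
import json
from urllib.parse import quote

# Assumed service URL, taken from the example value given for UNAPI.
UNAPI = "http://example.com/unapi"


def encode_uri(uri):
    """URL-encode an object URI for use in a unAPI request path."""
    # Assumption: safe="" so that ':' and '/' are percent-encoded too.
    return quote(uri, safe="")


def go_url(uri, base=UNAPI):
    """Build the UNAPI/go/URI request URL."""
    return f"{base}/go/{encode_uri(uri)}"


def formats_url(uri=None, base=UNAPI):
    """Build UNAPI/formats, or UNAPI/URI/formats when a URI is given."""
    if uri is None:
        return f"{base}/formats"
    return f"{base}/{encode_uri(uri)}/formats"


def object_url(uri, fmt, base=UNAPI):
    """Build the UNAPI/URI/SOMEFORMAT request URL."""
    return f"{base}/{encode_uri(uri)}/{fmt}"


def parse_response(body):
    """Split a JSON response body into (header, response).

    Raises ValueError if the body is not a two-item list with a header
    object carrying "status", and RuntimeError if the status is not 200
    (an assumed client policy, not something the text above mandates).
    """
    data = json.loads(body)
    if not isinstance(data, list) or len(data) != 2:
        raise ValueError("expected a two-item list: [header, response]")
    header, response = data
    if not isinstance(header, dict) or "status" not in header:
        raise ValueError('header object must contain a "status" key')
    if header["status"] != 200:
        message = header.get("message", "")
        raise RuntimeError(f'{header["status"]}: {message}')
    return header, response
```

Under these assumptions, object_url("info:pmid/12345678", "dc") yields http://example.com/unapi/info%3Apmid%2F12345678/dc, and parse_response applied to the complete example response above returns the header {"status": 200} together with the format listing keyed by "mods" and "dc". Fetching the URLs is left to whatever HTTP client the application already uses.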