Development APIs : The list functions

OK, so there’s some interesting data to get – but how do you get it?

There are three general APIs, or 10… depending on how you count them.

Data returns

All APIs return data in the same ways:

  1. You can specify the format either with the Accept header in the HTTP request, or with the format parameter. The options are ‘json’, ‘xml’, or ‘text’, with ‘json’ being the default if nothing is specified.
    • If there’s a callback parameter, and the format is json, then a crossDomain (JSONP-style) package is returned… very useful!
  2. All return the data as a nested object, with three top-level elements:
 {
   'message' => {},
   'status'  => 'ok',
   'to'      => 'http://.....'
 }

status is “ok” or “fail”, to is the URL that made the query, and message contains the actual data being returned… which is dependent on the query!
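As a minimal sketch of handling that envelope on the client side, the Python below checks status before handing back message. The names unwrap and ApiError are made up for this example, and what a “fail” response puts in message is not described here, so the code does not rely on it.

```python
class ApiError(Exception):
    """Raised when a response envelope does not report status 'ok'."""


def unwrap(envelope):
    """Return the 'message' payload of an already-decoded response.

    `envelope` is the dict obtained by decoding the JSON reply, with the
    three top-level elements 'message', 'status' and 'to'.
    """
    status = envelope.get("status")
    if status != "ok":
        # 'to' is the URL that made the query, which is handy in error text.
        raise ApiError(f"query {envelope.get('to')!r} returned status {status!r}")
    return envelope.get("message", {})
```

Every list query can then be passed through unwrap() before any of the per-list handling.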

The queries

Let’s start with the suite that lists things (cf the AJAXie get_xxx functions and the main api)… currently at http://devel.edina.ac.uk:1201/cgi/list5/xxx, this is a suite of six APIs that each pull out a list of things:

  • type
  • content
  • country
  • lang
  • org
  • net

type

This lists the type (or classification) of repository.

'message' => {
               'type' => [
                           {
                             'code' => 1,
                             'text' => 'Subject (Research Cross-Institutional)'
                           },
                           {
                             'code' => 2,
                             'text' => 'Other'
                           },
                           ......
                         ]
                      },
code  text
1     Undetermined – Repositories whose type has not yet been assessed
2     Institutional (Institutional or departmental repositories)
3     Disciplinary (Cross-institutional subject repositories)
4     Aggregating (Archives aggregating data from several subsidiary repositories)
5     Governmental (Repositories for governmental data)
6     Subject (Research Cross-Institutional)
7     Journal (e-Journal/Publication)
8     Thesis
9     Database (Database/A&I Index)
10    Learning (Learning and Teaching Objects)
11    Other
12    Demonstration

When a repository type is needed by /api, it is the code number you need.
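Since it is the code number that matters, a lookup table in both directions is handy. The sketch below builds one from the decoded message element; code_table and text_to_code are names invented for this example, and the sample data is the fragment shown above.

```python
def code_table(message, key="type"):
    """Map code -> text for one of the simple code lists."""
    return {entry["code"]: entry["text"] for entry in message[key]}


def text_to_code(message, key="type"):
    """Map text -> code, for turning a label back into the number /api wants."""
    return {entry["text"]: entry["code"] for entry in message[key]}


# The fragment of the 'type' response shown above.
sample = {
    "type": [
        {"code": 1, "text": "Subject (Research Cross-Institutional)"},
        {"code": 2, "text": "Other"},
    ]
}
types = code_table(sample)
```

The key argument is there because the content list has the same code/text layout.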

Adding the parameter full=1 will cause the query to return all the repositories that are of that type listed under a repos element. Note that repositories are not exclusively one type or another, and may appear under multiple types.

The repos sub-elements are indexed by repo_id. There is also a count element which will tell you how many repositories are in the set.
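Because a repository can sit under several types, it is sometimes easier to turn the full=1 result inside out. The sketch below assumes each type entry gains a repos element (a mapping keyed by repo_id) and a count element alongside code and text; that exact layout is a guess from the description, and the repo_ids in the sample are made up.

```python
from collections import defaultdict


def types_by_repo(entries):
    """Invert a full=1 type list into repo_id -> [type codes].

    Assumes (not confirmed) that each entry looks like
    {'code': ..., 'text': ..., 'count': N, 'repos': {repo_id: {...}}}.
    """
    result = defaultdict(list)
    for entry in entries:
        repos = entry.get("repos", {})
        # 'count' should say how many repositories are in the set.
        if "count" in entry and int(entry["count"]) != len(repos):
            raise ValueError(f"type {entry['code']}: count does not match repos")
        for repo_id in repos:
            result[str(repo_id)].append(entry["code"])
    return dict(result)


# Hypothetical data: repo_ids 101 and 102 are invented for illustration.
sample_full = [
    {"code": 2, "count": 2, "repos": {"101": {}, "102": {}}},
    {"code": 3, "count": 1, "repos": {"102": {}}},
]
```

With the sample above, repository 102 ends up listed under both type 2 and type 3.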

content

This lists the types of content that repositories accept.

  <message>
    <content>
      <code>1</code>
      <text>Research papers (pre- and postprints)</text>
    </content>
    <content>
      <code>2</code>
      <text>Research papers (preprints only)</text>
    </content>
    .....
  </message>
code  text
1     Research papers (pre- and postprints)
2     Research papers (preprints only)
3     Research papers (postprints only)
4     Bibliographic references
5     Conference and workshop papers
6     Theses and dissertations
7     Unpublished reports and working papers
8     Books & chapters and sections
9     Datasets
10    Learning Objects
11    Multimedia and audio-visual materials
12    Software
13    Patents
14    Other special item types

When a content type is to be defined in /api, it is the code number you need.
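If you ask for format=xml, the same code/text pairs can be pulled out with Python’s standard library. This is a sketch against the fragment shown above; whether the real reply wraps message in an outer element is not shown here, so the code copes with either case.

```python
import xml.etree.ElementTree as ET

# The fragment of the 'content' response shown above.
SAMPLE_XML = """<message>
  <content><code>1</code><text>Research papers (pre- and postprints)</text></content>
  <content><code>2</code><text>Research papers (preprints only)</text></content>
</message>"""


def content_codes(xml_text):
    """Map code -> text from the XML form of the content list."""
    root = ET.fromstring(xml_text)
    # The <message> element may be the root itself or sit inside a wrapper.
    message = root if root.tag == "message" else root.find(".//message")
    if message is None:
        raise ValueError("no <message> element found")
    return {
        int(item.findtext("code")): item.findtext("text")
        for item in message.iter("content")
    }
```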

Adding the parameter full=1 will cause the query to return all the repositories that accept the content-type listed under a repos element. Note that repositories usually accept multiple content-types, so will appear under multiple entries.

The repos sub-elements are indexed by repo_id. There is also a count element which will tell you how many repositories are in the set.

lang

This lists all the languages the dataset knows about (in essence, the ISO 639 codes).

(We are limited to ISO 639-2 as ISO 639-3 and later are not Open Access lists, and there is a clause which states “the product, system, or device does not provide a means to redistribute the code set.”)

{
  "to" : "http://devel.edina.ac.uk:1201/cgi/list5/lang",
  "status" : "ok",
  "message" : {
    "lang" : [
      {
        "text" : "Abkhazian",
        "iso3_b" : "abk",
        "code" : "ab"
      },
      {
        "text" : "Achinese",
        "iso3_b" : "ace"
      },
    ]
  }
}
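Note that in the sample above the Achinese entry has an iso3_b value but no code. A sketch of indexing the list that copes with this, assuming a missing code simply means the language has no two-letter code (language_key and index_languages are names invented here):

```python
def language_key(entry):
    """Prefer the two-letter code, falling back to the three-letter iso3_b."""
    return entry.get("code") or entry["iso3_b"]


def index_languages(langs):
    """Map the best available code to the language name."""
    return {language_key(entry): entry["text"] for entry in langs}


# The two entries from the sample response above.
sample_langs = [
    {"text": "Abkhazian", "iso3_b": "abk", "code": "ab"},
    {"text": "Achinese", "iso3_b": "ace"},
]
languages = index_languages(sample_langs)
```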

Adding the parameter full=1 will cause the query to return all the repositories that assert they use that language in their interface, listed in a repos element. Many non-English interfaces are multilingual, and those repositories will appear in multiple lists.

The repos sub-elements are indexed by repo_id. There is also a count element which will tell you how many repositories are in the set.

country

This lists all the countries the dataset knows about (in essence, the ISO 3166-1 codes).

{
  "to" : "http://devel.edina.ac.uk:1201/cgi/list5/country",
  "status" : "ok",
  "message" : {
    "country" : [
      {
        "text" : "Andora",
        "code" : "ad"
      },
      {
        "text" : "United Arab Emirates",
        "code" : "ae"
      },
    ]
  }
}

Adding the parameter full=1 will cause the query to include all the repositories, under a repos element, that are listed [in OpenDOAR] as being from that country. OpenDOAR does not have a concept of multiple countries for a repository.

The repos sub-elements are indexed by repo_id. There is also a count element which will tell you how many repositories are in the set.
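Since a repository belongs to exactly one country, the full=1 result can be flattened into a simple repo_id to country lookup. As a sketch, and assuming each country entry carries a repos mapping keyed by repo_id (the layout is inferred from the description, and the repo_ids below are made up):

```python
def country_of_repo(countries):
    """Map repo_id -> country code, refusing duplicates.

    Assumes (not confirmed) entries like
    {'code': 'ad', 'text': ..., 'count': N, 'repos': {repo_id: {...}}}.
    """
    owner = {}
    for entry in countries:
        for repo_id in entry.get("repos", {}):
            if repo_id in owner:
                # Should not happen: one country per repository.
                raise ValueError(
                    f"repository {repo_id} is under both "
                    f"{owner[repo_id]} and {entry['code']}"
                )
            owner[repo_id] = entry["code"]
    return owner


# Hypothetical data: the repo_ids are invented for illustration.
sample_countries = [
    {"code": "ad", "count": 1, "repos": {"7": {}}},
    {"code": "ae", "count": 2, "repos": {"8": {}, "9": {}}},
]
```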

org

This lists all the organisations in the dataset. This script will take over 15 minutes to complete… there is a LOT of data to return!

{
  "to" : "http://devel.edina.ac.uk:1201/cgi/list5/org",
  "status" : "ok",
  "message" : {
    "org" : {
      "1" : {
        <as per org listing>
      },
      "4": { 
        <as per org listing>
      },
    }
  }
}

Adding the parameter full=1 will cause the query to return all the repositories that belong to each organisation, listed under a repos element. Running the query with the full flag can take twenty minutes!

The repos sub-elements are, in this situation, listed as described in this post.
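With a query this slow, it is worth allowing a generous timeout and keeping a local copy rather than asking again. A sketch of one way to do that follows; the helper names, the 30-minute timeout and the one-day cache age are arbitrary choices for this example, not anything the API requires.

```python
import json
import os
import time
import urllib.request

ORG_URL = "http://devel.edina.ac.uk:1201/cgi/list5/org"


def fetch_json(url, timeout=1800):
    """Download and decode a JSON reply.

    `timeout` is the socket timeout in seconds: how long we are prepared
    to wait on the connection while the server builds its answer.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def cached(url, cache_path, max_age=86400, fetch=fetch_json):
    """Return a fresh-enough cached copy, else fetch and store a new one."""
    if os.path.exists(cache_path):
        age = time.time() - os.path.getmtime(cache_path)
        if age < max_age:
            with open(cache_path, encoding="utf-8") as handle:
                return json.load(handle)
    data = fetch(url)
    with open(cache_path, "w", encoding="utf-8") as handle:
        json.dump(data, handle)
    return data
```

Calling cached(ORG_URL, "org.json") would then hit the server at most once a day; the fetch argument is there so the download step can be swapped out.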

net

Adding the parameter full=1 will cause the query to return all the repositories that are of that type listed under a repos element. Note that repositories are not exclusively one type or another, and may appear under multiple types.

The repos sub-elements are indexed by repo_id. There is also a count element which will tell you how many repositories are in the set.
