Quick and easy OAI record parsing with PHP

By Alex Dolski on February 11, 2009 8:17 AM

Welcome to the second post on our blog! I'm the Web & Digitization Application Developer (a.k.a. junior programmer), which means that my posts will be the most important. Before I begin, a shout-out to all my dead homies. Now that that's out of the way, I can proceed.

If you work with metadata in a digital repository, the time may come when you need to export it for other purposes. Your repository software may or may not make that easy for you, but as long as it supports OAI-PMH, the export is both possible and simple to do in a standards-compliant way. To do the job, we will use PHP's DOM extension, which is one of several ways to work with XML in PHP. It can, of course, also be done in any other common scripting language. The script below requests every record in one set as Dublin Core (oai_dc) and sets up an XPath object for querying the response.


<?php

$qs = array(
   'verb' => 'ListRecords',
   'metadataPrefix' => 'oai_dc',
   'set' => 'hughes'
);

$url = 'http://digital.library.unlv.edu/cgi-bin/oai.exe?'
   . http_build_query($qs);
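// Fetch the raw OAI-PMH response; file_get_contents() returns false on failure.
$xml = file_get_contents($url);

// loadXML() is an instance method, so create the document first.
$doc = new DOMDocument();
if ($xml === false || !$doc->loadXML($xml)) {
   exit('Could not fetch or parse the OAI response.');
}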

$xpath = new DOMXPath($doc);
$xpath->registerNamespace("dc",
   "http://purl.org/dc/elements/1.1/");

?>


<dl>
   <!-- The query() method returns a DOMNodeList object
   (an array-like list of DOMNodes). The first parameter is
   an XPath query. -->
   <?php foreach ($xpath->query('//dc:*', $doc) as $node): ?>
      <dt><?= htmlspecialchars($node->nodeName) ?></dt>
      <dd><?= htmlspecialchars($node->nodeValue) ?></dd>
   <?php endforeach ?>
</dl>

From here, you can do anything with it - print it to the screen, save it to a database, export it to a spreadsheet, etc. Handy. You can also fetch just a single record, using GetRecord in place of ListRecords:

$qs = array( 
   'verb' => 'GetRecord',
   'metadataPrefix' => 'oai_dc', 
   'identifier' => 'oai:digital.library.unlv.edu:hughes/87' 
);

A mobot - who wouldn't want one?
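
As one illustration of the "export it to a spreadsheet" idea, here is a minimal sketch that writes each DC field as a CSV row. To keep it runnable without a network connection it parses a small made-up response instead of calling the repository; the sample record, its identifier and the in-memory output stream are all assumptions for the example, not part of the real collection.

```php
<?php

// Hypothetical sample standing in for a real GetRecord response.
$xml = <<<'XML'
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord>
    <record>
      <header>
        <identifier>oai:example.org:demo/1</identifier>
        <datestamp>2009-02-11</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample title</dc:title>
          <dc:creator>Sample, Creator</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>
XML;

$doc = new DOMDocument();
if (!$doc->loadXML($xml)) {
    exit('Could not parse the sample XML.');
}

$xpath = new DOMXPath($doc);
$xpath->registerNamespace('dc', 'http://purl.org/dc/elements/1.1/');

// Write to an in-memory stream; use fopen('records.csv', 'w') for a real file.
$out = fopen('php://memory', 'w+');
fputcsv($out, array('field', 'value'), ',', '"', '\\');
foreach ($xpath->query('//dc:*') as $node) {
    // One row per DC element: the element name, then its text content.
    fputcsv($out, array($node->nodeName, $node->nodeValue), ',', '"', '\\');
}

// Read the finished CSV back so it can be inspected or printed.
rewind($out);
$csv = stream_get_contents($out);
fclose($out);

echo $csv;
```

The delimiter, enclosure and escape arguments are spelled out because newer PHP versions complain when the escape character is left to its default.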

There is more information in the OAI record than just the DC fields; to access it, just register the necessary namespace(s) and modify the XPath query. You can always take a look at the contents of $xml to see the raw XML string.
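
For instance, the record header lives in the OAI-PMH namespace (http://www.openarchives.org/OAI/2.0/). The sketch below registers it under the prefix "oai" (the prefix is an arbitrary choice) and groups the DC fields by record identifier. The two sample records are invented so the snippet runs on its own; a real response from your repository will carry more elements than this.

```php
<?php

// Hypothetical sample standing in for a real ListRecords response.
$xml = <<<'XML'
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:demo/1</identifier>
        <datestamp>2009-02-10</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>First sample</dc:title>
          <dc:subject>Robots</dc:subject>
          <dc:subject>Aviation</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header>
        <identifier>oai:example.org:demo/2</identifier>
        <datestamp>2009-02-11</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Second sample</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
XML;

$doc = new DOMDocument();
if (!$doc->loadXML($xml)) {
    exit('Could not parse the sample XML.');
}

$xpath = new DOMXPath($doc);
// The default namespace of the response needs a prefix before XPath can see it.
$xpath->registerNamespace('oai', 'http://www.openarchives.org/OAI/2.0/');
$xpath->registerNamespace('dc', 'http://purl.org/dc/elements/1.1/');

$records = array();
foreach ($xpath->query('//oai:record') as $record) {
    // The second argument makes the expression relative to this record.
    $identifier = $xpath->evaluate('string(oai:header/oai:identifier)', $record);
    $datestamp  = $xpath->evaluate('string(oai:header/oai:datestamp)', $record);

    // DC elements can repeat, so collect each field as a list of values.
    $fields = array();
    foreach ($xpath->query('.//dc:*', $record) as $node) {
        $fields[$node->localName][] = $node->nodeValue;
    }

    $records[$identifier] = array('datestamp' => $datestamp, 'fields' => $fields);
}

print_r($records);
```

Keying the array by identifier keeps each record's fields together, which is usually what you want before saving to a database.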
