Posts Tagged ‘SWORD’

ACErep update for the PORSCHE Executive Board meeting on Wednesday 10th August 2011

August 12, 2011

This was the final meeting of the PORSCHE Executive Board – I attended by telephone and submitted this written update from ACErep:

After initial testing of the EasyDeposit SWORD client with the Jorum development server (see April 6th update) we have now configured the client with a SWORD endpoint for each of the several repository platforms we hope to deposit into. These include a demo install of DSpace – http://demo.dspace.org/ – the NDLR (in lieu of Jorum), the WRRO EPrints repository and our own intraLibrary repository.

We have been able to successfully deposit into http://demo.dspace.org/ and into the NDLR (also DSpace) but the client fails with EPrints with a generic error message. I have not yet established what the problem might be and queries have been raised with ‘sword-app-tech@lists.sourceforge.net’ and ‘eprints-tech@ecs.soton.ac.uk’.

SWORD has now also been implemented at York St John (Archivalware) and is currently being tested – I have not yet been able to test with EasyDeposit.

DSpace, EPrints and Archivalware should all accept METS packages, but in order to deposit into intraLibrary it is necessary to extend EasyDeposit to package as IMS. Negotiation is currently underway to carry out this work in partnership with Intrallect, hopefully in time for the ALPS Showcase on 16th September* (there will still also be the requirement to standardise Application Profiles across the repositories).

In terms of repository cross-search, it was suggested that we may be able to explore doing this via the DSpace (v1.6) API (the version run by the NDLR), though I have not yet seen any documentation. Our content has, however, now been successfully harvested into the NDLR (test install) via OAI-PMH.

* This work has now been scheduled for w/c 29th August

EPrints & SWORD – can you help?

August 4, 2011

I’ve installed Stuart Lewis’ EasyDeposit SWORD client – http://easydeposit.swordapp.org/ – and configured it with a SWORD endpoint for each of the repository platforms we hope to deposit into:

http://demo.dspace.org/

http://training1.eprints.org/ (Thanks to Dave Tarrant)

http://repository-intralibrary.leedsmet.ac.uk/

http://www.ndlr.ie/ (Thanks to Phil Franks)

I’ve been able to successfully deposit into http://demo.dspace.org/ and into the NDLR (also DSpace) but the client fails with the EPrints training install with a generic error message – I don’t know EPrints very well so don’t know if it’s a problem with EasyDeposit or the configuration of the EPrints repository I’m trying to deposit into. I can see two collections in the client (see below) and had a quick rummage around in training1.eprints.org but can’t see an obvious way to configure collections [am Training Admin]….any EPrints folk out there able to help?

(It also won’t yet work into IntraLibrary but that’s due to the fact that EasyDeposit packages as METS and IL only accepts IMS-CP.)

Categories: SWORD

Project update

March 4, 2011

One of the most challenging aspects of ACErep continues to be working across multiple institutions and organisations and recently progress has stalled somewhat as we liaise with our partners and wait for development work essential to the overall infrastructure.

In November last year we were able to build a prototype using Xpert which harvests OAI-PMH from our repository at Leeds Met and which we are then able to selectively search by keyword. We had initially hoped to work with Jorum but at that time there was no Open API, nor were there any plans to harvest by OAI-PMH, so we liaised instead with Pat Lockley of Xpert who was able to help with both requirements.
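
The harvest-then-filter idea can be sketched in a few lines of Python. This is only a rough illustration, not the code behind the prototype: the base URL is a placeholder, the sample response is hand-written, and it assumes that the keyword we filter on (here the "alpsportal" tag) ends up in dc:subject of an oai_dc record.

```python
import urllib.parse
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urllib.parse.urlencode(params)


def titles_with_keyword(response_xml, keyword):
    """Return titles of harvested records that carry the keyword in dc:subject."""
    root = ET.fromstring(response_xml)
    titles = []
    for record in root.iter(OAI + "record"):
        subjects = {s.text.strip().lower()
                    for s in record.iter(DC + "subject") if s.text}
        if keyword.lower() in subjects:
            title = record.find(".//" + DC + "title")
            titles.append(title.text if title is not None else "(untitled)")
    return titles


# Hand-written sample response (assumption), not real repository output.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><metadata>
      <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                 xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title>Example ALPS resource</dc:title>
        <dc:subject>alpsportal</dc:subject>
      </oai_dc:dc>
    </metadata></record>
    <record><metadata>
      <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                 xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title>Unrelated resource</dc:title>
        <dc:subject>history</dc:subject>
      </oai_dc:dc>
    </metadata></record>
  </ListRecords>
</OAI-PMH>"""

# The base URL below is a placeholder, not a real endpoint.
print(list_records_url("http://repository.example.org/oai"))
print(titles_with_keyword(SAMPLE, "alpsportal"))
```

A real harvester would also need to follow OAI-PMH resumption tokens for large result sets, which this sketch leaves out.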

A lot has happened since November, however, both at Xpert and Jorum. Pat is moving on to another (OER related) post, which raises questions for us around the sustainability of that service now that it has lost its key developer. Meanwhile I have been working with the PORSCHE project at Newcastle University, which aims to provide seamless access to academic and clinical learning resources for healthcare students, primarily from the respective collections in Jorum and the NHS National eLearning Repository (NeLR). As such, PORSCHE has also requested that Jorum harvest by OAI-PMH and provide an open API; the project team includes the Jorum service manager Hiten Vaghmaria as well as Kate Lomax of the NeLR (which, like Leeds Met’s repository, is also based on intraLibrary). Jorum have now released a first iteration of an open API and hopefully both projects, and the UKOER community at large, can now work with the national OER service to develop that API to meet our requirements. PORSCHE’s Suzanne Hardy has set up the Jorum API discussion space (wiki) at http://jorumapi.pbworks.com/w/page/35601929/Jorum-API-discussion-space. I am, in fact, yet to contribute to the discussion myself, partly due to a lack of knowledge around API development. I’m far from clear, for example, how we may most effectively use an API to return just a subset of the resources in Jorum (i.e. ALPS/medical resources). In the Xpert prototype we just filtered on keyword but this feels a little unsatisfactory and Pat suggested a custom URL search would be better…

The other pieces of the jigsaw are repository-shaped and variously lacking the dove-tailing standards required to make the model workable. At Leeds University the learning and teaching repository is ExLibris’ DigiTool while York St John have a system called ArchivalWare. Both systems are OAI-PMH compliant but do not support SWORD. Development work is underway to rectify this at York St John, while Jodie has set up an out-of-the-box test install of EPrints to investigate OER object management.

Mike has also made some progress with EasyDeposit, which is now working on a Leeds Met server and, after a few teething problems, will now successfully deposit a METS package to Jorum (DSpace). We will still need to test with EPrints (METS should be OK I think) and with ArchivalWare when SWORD is integrated with that platform…probably METS again. There is an issue with intraLibrary, however, in that it only accepts IMSCP by SWORD, not METS, so we will also need to write an IMS content packager for EasyDeposit.

We are very grateful to Tamsin Treasure-Jones of ALPS for her continued support with this challenging project. I’m hopeful that all of these pieces can be put together over the Spring and Summer as we move towards a local, regional and national infrastructure for sharing medical teaching, learning and assessment material. This approach has the benefit that digital assets are preserved in one location (an institutional repository) while remaining accessible from several points; it also allows the ALPS-branded web-site and the institutional repositories to “piggyback” on Jorum’s Google PageRank, improving discoverability.

Working search prototype and a SWORD fight

November 23, 2010

Since the last meeting, Mike has done a great job in putting together a working (search) prototype that uses the Xpert API to search http://www.nottingham.ac.uk/xpert/ for just those resources tagged “alpsportal” i.e. ALPS resources that I have added to our repository and that have been harvested by Xpert. Currently Xpert is only harvesting Leeds Met but it should now be fairly straightforward to also harvest YSJ and Leeds such that any resources appropriately tagged will be returned from the portal. It’s on a restricted test server at the moment so this is what it looks like (with a few explanatory slides):

And the SWORD fight? There are 3 repositories (possibly 4 if we count JORUM) that we need to be able to deposit into – this is a problem as currently only one of them – intraLibrary at Leeds Met – has fully functioning SWORD.  This did lead us to consider a hybrid deposit process allowing for manual package forwarding but given limited technical resources (this approach would have its own, not inconsiderable, overheads) and the clear advantages of utilising SWORD, it probably makes sense to focus on a prototype that can utilise the protocol and that can, hopefully, plug into the other repositories in the future.

One of the issues we face is that intraLibrary is based on IEEE LOM and only accepts IMS Content Packages by SWORD whereas the majority of (open source) repositories are based on Dublin Core metadata and only accept METS by SWORD; I have been working with Jorum to test their SWORD deposit – as a customised DSpace repository it accepts METS (not IMS) so we have had to map IMSCP for one of our resources -> METS in order for it to be accepted.  Currently I have only been able to achieve this with a very simple package comprising just title, description and author (I’ve also tried to add keyword and rights but these don’t seem to be picked up by Jorum – I’m not sure if this is a problem with Jorum or with my XML – probably the latter!)
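
To make the mapping a little more concrete, here is a minimal Python sketch of turning a handful of LOM-style fields into Dublin Core wrapped in a METS descriptive-metadata section. The field mapping is my own assumption, and the output is a simplified fragment only: a real METS package also needs at least a file section and a structMap, and I have not checked this against the profile Jorum actually expects, so it says nothing about why keyword and rights were not picked up.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("dc", DC_NS)

# Assumed mapping from the fields captured in our record to Dublin Core elements.
LOM_TO_DC = {
    "title": "title",
    "description": "description",
    "author": "creator",
    "keyword": "subject",
    "rights": "rights",
}


def build_mets(record):
    """Return a simplified METS document (descriptive metadata only) as a string."""
    mets = ET.Element(f"{{{METS_NS}}}mets")
    dmd = ET.SubElement(mets, f"{{{METS_NS}}}dmdSec", {"ID": "dmd_1"})
    wrap = ET.SubElement(dmd, f"{{{METS_NS}}}mdWrap", {"MDTYPE": "DC"})
    xml_data = ET.SubElement(wrap, f"{{{METS_NS}}}xmlData")
    for source_field, dc_field in LOM_TO_DC.items():
        values = record.get(source_field, [])
        if isinstance(values, str):
            values = [values]  # allow a single value or a list of values
        for value in values:
            ET.SubElement(xml_data, f"{{{DC_NS}}}{dc_field}").text = value
    return ET.tostring(mets, encoding="unicode")


# Illustrative record; the values are made up.
example = {
    "title": "Example resource",
    "description": "A short description",
    "author": "A. N. Author",
    "keyword": ["alpsportal", "assessment"],
}
print(build_mets(example))
```

Fields missing from the record (rights in the example) are simply skipped, which makes it easy to start with title, description and author and add others later.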

It is still a moot point whether we will need METS or IMSCP for YSJ/Leeds until various questions around their specific repository implementations have been answered but we probably need to progress on the basis that we should support both. We will need to get Mike involved on the technical side and I’m hoping that he can start developing a web-based SWORD deposit client that, eventually, will be able to post to any SWORD service URL whether at Leeds Met, Leeds, YSJ or Jorum; as the only viable target repository at this stage is Leeds Met we will want to package as IMSCP in the first instance, ultimately with a view to also packaging as METS depending on the requirements of the destination repository.
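
As a rough sketch of what "package depending on the destination" might mean in practice, the Python below builds (but does not send) a deposit request whose packaging header is chosen per target. It assumes SWORD 1.3-style HTTP headers and HTTP Basic authentication; the packaging identifiers are the ones I believe are in common use, but what a given repository accepts should be read from its service document. The URL and credentials are placeholders.

```python
import base64
import urllib.request

# Assumed X-Packaging identifiers; confirm against each repository's service document.
PACKAGING = {
    "imscp": "http://www.imsglobal.org/xsd/imscp_v1p1",
    "mets": "http://purl.org/net/sword-types/METSDSpaceSIP",
}


def build_deposit_request(collection_url, package_bytes, filename,
                          packaging, username, password):
    """Build a POST request that deposits a zipped package into a collection."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": f"filename={filename}",
        "X-Packaging": PACKAGING[packaging],
        "Authorization": f"Basic {token}",
    }
    return urllib.request.Request(collection_url, data=package_bytes,
                                  headers=headers, method="POST")


# Placeholder collection URL and credentials (not real).
request = build_deposit_request(
    "http://repository.example.org/sword/deposit/collection-1",
    b"zip bytes would go here", "package.zip", "imscp", "user", "secret")
print(request.get_method(), request.full_url)
print(request.header_items())
```

Sending the request would be a call to urllib.request.urlopen(request); the repository's response should be an Atom entry describing the deposit.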

MT – I may be asking for the moon on a stick here but I know you’ll put me straight and tell me what we can actually achieve and in what time-scale…just to summarise, below, what (I think) I understand about SWORD so far and what I anticipate might be required of this type of implementation:

SWORD works by means of a service URL that a package is posted to via the Atom Publishing Protocol – this service URL allows a “service document” to be retrieved from the target repository which itemises available collections for SWORD deposit. The desktop client I have been using with our repository and with Jorum is available from SourceForge – http://sourceforge.net/projects/sword-app/. It enables me to post a .zip containing a resource + its metadata in an imsmanifest.xml to intraLibrary (or resource + mets.xml to Jorum).

Screenshots below:

Authenticate to repository / retrieve service doc (note SWORD service URL)

Service doc successfully retrieved – choose collection to deposit into:

Confirm collection details for post operation:

Browse to zip on hard drive (already containing imsmanifest.xml) and click button to post to repository:

Resource and full metadata record successfully posted to repository:
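
The first two steps of that walkthrough (retrieve the service document, then list the collections to choose from) can be sketched in Python as below. The sample service document is hand-written for illustration, and I am assuming the standard AtomPub namespace; some older SWORD servers may use a different one, in which case the namespace map would need adjusting.

```python
import xml.etree.ElementTree as ET

NS = {
    "app": "http://www.w3.org/2007/app",
    "atom": "http://www.w3.org/2005/Atom",
}


def list_collections(service_xml):
    """Return (title, deposit URL) pairs for each collection in a service document."""
    root = ET.fromstring(service_xml)
    collections = []
    for collection in root.findall(".//app:collection", NS):
        title = collection.find("atom:title", NS)
        collections.append((title.text if title is not None else "",
                            collection.get("href")))
    return collections


# Hand-written sample (assumption), not a response from any real repository.
SAMPLE_SERVICE_DOC = """<service xmlns="http://www.w3.org/2007/app"
         xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>Example repository</atom:title>
    <collection href="http://repository.example.org/sword/deposit/collection-1">
      <atom:title>Collection One</atom:title>
    </collection>
    <collection href="http://repository.example.org/sword/deposit/collection-2">
      <atom:title>Collection Two</atom:title>
    </collection>
  </workspace>
</service>"""

for title, href in list_collections(SAMPLE_SERVICE_DOC):
    print(title, "->", href)
```

The href of the chosen collection is the URL the package is then posted to.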

I imagine we would need a web-form that allows similar metadata to be captured and transformed into an imsmanifest.xml, then allows file upload and zips resource and manifest into an IMSCP that is then posted to a specified collection in the service doc at http://repository-intralibrary.leedsmet.ac.uk/IntraLibrary-Deposit/service.
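
The packaging half of that idea might look something like the Python sketch below: take the values captured by the form, write a very small imsmanifest.xml, and zip it together with the uploaded file. The manifest structure is deliberately minimal and is an assumption on my part (IMS CP 1.1 with IEEE LOM title and description only); I have not validated it against what intraLibrary actually requires.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

IMSCP_NS = "http://www.imsglobal.org/xsd/imscp_v1p1"
LOM_NS = "http://ltsc.ieee.org/xsd/LOM"
ET.register_namespace("imscp", IMSCP_NS)
ET.register_namespace("lom", LOM_NS)


def cp(tag):
    """Qualified name in the IMS Content Packaging namespace."""
    return f"{{{IMSCP_NS}}}{tag}"


def lom(tag):
    """Qualified name in the IEEE LOM namespace."""
    return f"{{{LOM_NS}}}{tag}"


def build_manifest(title, description, filename):
    """Return a minimal imsmanifest.xml (as bytes) describing one file."""
    manifest = ET.Element(cp("manifest"), {"identifier": "MANIFEST-1"})
    metadata = ET.SubElement(manifest, cp("metadata"))
    general = ET.SubElement(ET.SubElement(metadata, lom("lom")), lom("general"))
    for tag, value in (("title", title), ("description", description)):
        holder = ET.SubElement(general, lom(tag))
        ET.SubElement(holder, lom("string"), {"language": "en"}).text = value
    ET.SubElement(manifest, cp("organizations"))
    resources = ET.SubElement(manifest, cp("resources"))
    resource = ET.SubElement(resources, cp("resource"), {
        "identifier": "RES-1", "type": "webcontent", "href": filename})
    ET.SubElement(resource, cp("file"), {"href": filename})
    return ET.tostring(manifest, encoding="utf-8", xml_declaration=True)


def build_package(title, description, filename, content):
    """Zip the manifest and the uploaded file into an in-memory content package."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("imsmanifest.xml", build_manifest(title, description, filename))
        package.writestr(filename, content)
    return buffer.getvalue()


# Illustrative values only.
package_bytes = build_package("Example resource", "A short description",
                              "slides.ppt", b"file contents")
print(len(package_bytes), "bytes")
```

The resulting bytes are what would be posted to the chosen collection; a fuller version would add the remaining metadata fields from the form.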

Erm, sounds a bit tricky to me, Mike?

It might also be worth looking again at Stuart Lewis’ EasyDeposit (blogged about at http://repositorynews.wordpress.com/2010/06/04/easydeposit-the-sword-client-creation-toolkit/) which packages as METS for DSpace/EPrints etc.

Using Jorum (and/or Xpert?) for ACErep

October 12, 2010

Thanks to Gareth and Hiten at Edina who spared their time to speak to Peter and me yesterday and answer our questions about Jorum to help us determine how we might integrate with ACErep.

In a nutshell we were interested in whether we can deposit into Jorum via SWORD – in addition to our respective institutional repositories – with a view to using the national repository to search across all ALPS resources from the three partner institutions. Ideally we would want to conduct a search from our own ALPS portal and display/format search results in our own environment. However, it seems there is no “open” search facility to query Jorum and return data in a format that we could process ourselves (i.e. XML) and while we may be able to conduct a search from the portal, our only option would then be to “jump-off” to the results in Jorum itself.

In view of limited resources it may be that this is the route we choose to go down but we will need to speak to our stakeholders first to see if it is acceptable to them – it’s obviously not ideal and the additional functionality we could add to the portal would be limited in this scenario (eg. comments/discussion on ALPS resources to bridge the theory/practice gap in health education as suggested by our stakeholders).

The approach that would give us the greatest flexibility, of course, is if we were able to harvest/index/search our three repositories ourselves and Peter will do some research to determine whether we might look into this.   Realistically, however, we may lack the resources (mostly time!) for this to be viable and there is another service that may be worth looking at first:

Xpert, as I blogged recently, is a service at Nottingham University that harvests RSS and OAI-PMH feeds from learning object/OER repositories including Leeds Met (and as of last week Jorum itself) to create a “distributed e-learning repository”.

So…our question for @xpert_project is: Are we able to query Xpert with an appropriate level of sophistication (tbc!) and return XML that we can process, format and display ourselves?

Peter and I had a look at the APIs recently released as Xpert Labs which includes base URLs to return a variety of data formats including XML; I don’t really know enough about querying databases/data transfer to know whether this, in itself, is sufficient to solve our problem and Peter suggested that the XML returned by this service is called up by the user’s browser rather than being in a format that we could further process (?) but we would be very interested to speak with Xpert to see if there is any mileage in these ideas. If we were able to utilise Xpert in some way then a further caveat is that it would necessitate a delay between deposit and discovery to allow for harvest – Xpert harvests every night I think so we would be looking at an overnight delay – is this acceptable?  Any harvesting solution would also necessitate a delay of course; SWORD deposit to Jorum should mean resources are discoverable immediately.

In the short term, Peter and I intend to pursue both the Jorum and Xpert routes; so far I’ve just tried to sketch out the broad picture – as always the devil is in the detail and in this case that devil, in one way or another, is likely to reside in metadata Hell…or Hades at least.

We have had only the briefest discussion about an Application Profile for ALPS but it is probably desirable to adopt a lightweight AP based on UKOER (see previous post), with the only additional requirement being that resources are presented utilising a bespoke taxonomy to accommodate “specific learning/assessment outcomes” (tbc). While it should not be too difficult to map between our disparate systems’ metadata standards to arrive at an AP based on UKOER (Title, Description, Keyword, Classification, Contributor etc), there is a potential issue in that Jorum classifies by JACS and Gareth could not say, without some experimentation, whether non-JACS classification data is indexed and hence searchable; we intend to submit some test METS packages to Jorum by SWORD in order to test this. There may also be issues around managing Content Packages, especially if we want to deposit them by SWORD (but also harvest by Xpert). Our original intention was to deposit everything into Jorum via SWORD (which would require authentication with a UKFed user-account – best option probably to set up a specific ALPS account with a UKFed institutional email?) and this might still be the best option. Even if we do go down the Xpert route, would we want resources to be harvested from the institutional repositories or from Jorum? Jorum only accepts METS by SWORD, so IMSCP could not be deposited via the standard and we would need some sort of contingency process whereby Content Packages are transferred to Jorum by an alternative mechanism, which would also precipitate a (further) delay; such a process is already in place for Unicycle resources.

N.B. In actual fact, I suspect we are unlikely to get large numbers of IMSCPs but the contingency needs considering nevertheless!

At this stage, of course, it is still a moot point whether we can query Xpert at all and return data in an appropriate format but we would also want to be sure that we could query a bespoke taxonomy… There may also be issues with respect to harvesting Content Packages (see discussion on this post).

(Attempting) to summarise:

Option 1: Submit to Jorum by SWORD – search Jorum from portal but jump-off to Jorum itself for results. In addition, submit to one of three institutional repositories (depending on user affiliation)

Pro: Integration with national OER infrastructure / deposited resources available immediately / (relatively) low developmental overheads

Con: User taken out of portal to results in Jorum / limits development of additional functionality

Option 2: Submit to one of three institutional repositories (depending on user affiliation); harvest/index/search the metadata ourselves.

Pro: Maximum control

Con: Resources / timescale / delay associated with harvest / developmental overheads are potentially prohibitive

Option 3: Submit to one of three institutional repositories (depending on user affiliation); ensure all three repositories are harvested by Xpert – utilise API to search Xpert and return XML that we can display/format ourselves.

Pro: (relatively) low developmental overheads (compared to option 2)

Con: Unknown issues (is it feasible?) / delay associated with harvest / does Xpert have resources to help us in ACErep project timescale?

Option 4: Submit to Jorum by SWORD AND one of three institutional repositories (depending on user affiliation); records harvested from Jorum by Xpert – utilise API to search Xpert and return XML that we can display/format ourselves.

Pros/cons: As above (Options 1 & 3)

Portal prototype

September 28, 2010

Peter and I met on Friday – joined virtually by Helen in the afternoon – to discuss how we might implement the “ALPS portal” and what our prototype should look like based on input from the workshop in August. Peter did a quick sketch which I’ve reproduced (almost faithfully) below and we tried to anticipate some of the technical issues that we will face; Peter and I plan to meet with Gareth Waller to discuss possible solutions to some of these issues via mooted integration with Jorum and, in the meantime, Peter will work on some mock-ups while we get feedback from the rest of the group via the blog (hint!).

This outline is with a view to developing a portal whereby users can both search for (ALPS) resources AND deposit into their respective institutional repositories; these two criteria present technical challenges that are quite separate in some respects though more closely related in others.

Deposit:

  • Authentication:

It is essential that users are able to authenticate in some way in order to deposit. At the very least we would want to know name, email and institutional affiliation, and we considered LDAP authentication via the existing institutional system(s). I am not sure how easy or difficult this would be, and it will require us to liaise with systems folk at our respective institutions. Peter suggested that it might be quicker and easier to set up our own LDAP server so that users need to register before they can deposit; sustainability beyond the life of the project is perhaps a drawback with this scenario.

  • Deposit form

What metadata do we want to capture – what is our Application Profile?  What metadata, if any, can be automatically generated?

An AP based on UKOER used at Leeds Met would comprise:

Metadata field | Comments
Title
Description
Keyword(s)
Classification | We classify by HEA subject centres and JACS; Jorum classifies by JACS only – we may wish to classify ALPS resources differently?
Contributor: Role of Contributor
Contributor
Contribution Date
Technical Information: Technical Format | MIME media type – 70 technical formats; intraLibrary identifies file type at upload and field is auto-completed
Educational Properties: Type of Resource | Terms from LOMv1.0: Diagram/ Exam/ Exercise/ Experiment/ Figure/ Graph/ Index/ Narrative Text/ Problem Statement/ Questionnaire/ Self Assessment/ Simulation/ Slide/ Table. Terms we have added to our vocabulary: Podcast/ Not Applicable/ Presentation/ Photograph/ Quiz/ Spreadsheet/ Tutorial/ Video/ Lecture/ Game/ Animation/ Assessment/ Audio/ Case Study/ Database/ Workbook
Subject to Copyright
Statement of Copyright and Restriction* | Our template includes URL for Creative Commons Licence – http://creativecommons.org/licenses/by-nc-sa/2.0/uk/ (required for Jorum Open)
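
To show how a deposit form might check what it captures against this profile, here is a small Python sketch. The field names and vocabulary terms come from the list above, but which fields are treated as required is purely my assumption, as are the class and function names.

```python
from dataclasses import dataclass, field

# "Type of Resource" terms from LOMv1.0, as listed in the profile.
LOM_RESOURCE_TYPES = {
    "Diagram", "Exam", "Exercise", "Experiment", "Figure", "Graph", "Index",
    "Narrative Text", "Problem Statement", "Questionnaire", "Self Assessment",
    "Simulation", "Slide", "Table",
}
# Terms added to the local vocabulary, as listed in the profile.
LOCAL_RESOURCE_TYPES = {
    "Podcast", "Not Applicable", "Presentation", "Photograph", "Quiz",
    "Spreadsheet", "Tutorial", "Video", "Lecture", "Game", "Animation",
    "Assessment", "Audio", "Case Study", "Database", "Workbook",
}
# Assumption: the profile does not say which fields are mandatory.
REQUIRED_FIELDS = ("title", "description", "keywords", "contributor")


@dataclass
class DepositRecord:
    title: str = ""
    description: str = ""
    keywords: list = field(default_factory=list)
    classification: str = ""
    contributor: str = ""
    contributor_role: str = ""
    contribution_date: str = ""
    technical_format: str = ""  # MIME type; auto-completed by intraLibrary at upload
    resource_type: str = "Not Applicable"
    subject_to_copyright: bool = True
    copyright_statement: str = "http://creativecommons.org/licenses/by-nc-sa/2.0/uk/"


def validate(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS
                if not getattr(record, name)]
    if record.resource_type not in LOM_RESOURCE_TYPES | LOCAL_RESOURCE_TYPES:
        problems.append(f"unknown resource type: {record.resource_type}")
    return problems


record = DepositRecord(title="Example", description="A description",
                       keywords=["alpsportal"], contributor="A. N. Author",
                       resource_type="Video")
print(validate(record))
```

A bespoke ALPS classification could be handled in the same way, as another controlled vocabulary checked at deposit time.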

Perhaps a deposit form could be based on that used by Jorum?

Jorum deposit interface

  • Potential issues
    • Different software uses different metadata standards / Application Profiles -> will need to map between them.
    • ALPS may require different metadata than UKOER – e.g. explicit priority is “Resources presented in the context of specific learning/assessment outcomes” – can Jorum accommodate this?

Selective deposit depending on user affiliation

Jorum has a SWORD endpoint that accepts METS (not IMSCP) so, in theory, all resources could be deposited to JorumOpen by default; user-affiliation – derived from the authentication process – could then determine which of the other Institutional Repositories the file is submitted to.

(N.B. Whether this will be via SWORD is still a moot point – Archivalware doesn’t yet have a SWORD endpoint / uncertain whether Leeds will test with DigiTool or EPrints / intraLibrary accepts IMSCP but not METS; could just be a semi-automated process for the prototype?)
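
The routing logic itself is simple enough to sketch in Python. Everything below is hypothetical: the endpoint URLs are placeholders and the table of targets just reflects the state of play described above (Jorum takes METS, intraLibrary takes IMSCP, Archivalware has no SWORD endpoint yet, Leeds is undecided). Targets without a SWORD endpoint fall back to a manual or semi-automated step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Target:
    name: str
    packaging: Optional[str]  # "mets", "imscp", or None if SWORD is unavailable
    sword_url: Optional[str]  # None means no SWORD endpoint (yet)


# Placeholder URLs (assumptions), not real endpoints.
JORUM = Target("JorumOpen", "mets", "http://jorum.example.org/sword/collection")
INSTITUTIONAL = {
    "Leeds Met": Target("intraLibrary", "imscp",
                        "http://leedsmet.example.org/sword/collection"),
    "York St John": Target("Archivalware", None, None),
    "Leeds": Target("DigiTool or EPrints (undecided)", None, None),
}


def plan_deposit(affiliation):
    """Return (automated, manual) target lists for a user's affiliation.

    Every resource goes to Jorum by default; the institutional repository
    is added according to the affiliation derived from authentication.
    """
    targets = [JORUM]
    if affiliation in INSTITUTIONAL:
        targets.append(INSTITUTIONAL[affiliation])
    automated = [t for t in targets if t.sword_url]
    manual = [t for t in targets if not t.sword_url]
    return automated, manual


for affiliation in ("Leeds Met", "York St John"):
    automated, manual = plan_deposit(affiliation)
    print(affiliation, [t.name for t in automated], [t.name for t in manual])
```

Keeping the packaging format alongside each target means the same portal code can build either a METS or an IMSCP package once more endpoints become available.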

In any case, as far as the user-journey is concerned, they log-in (preferably with institutional LDAP), upload file and add basic metadata – the resource is then deposited in their own Institutional Repository AND Jorum Open and is discoverable from the portal (+ JO; + IR)  – can’t see this being immediate and there is likely to be some sort of lag I think – just how much of a lag will depend on technical implementation; obviously want it to be as quick as possible (and email when available?)

Discover:

Via Jorum?

Peter and I discussed in more detail the possibility of somehow integrating with Jorum – with limited time and resources it seems unrealistic to develop the infrastructure to harvest / cross-search the three repositories ourselves and preferable, in any case, to utilise the national UKOER infrastructure…

If all resources are deposited to Jorum by default, can we liaise with them to this end?  As I understand, Jorum’s new search interface – http://www.jorum.ac.uk/searchOptions.html – searches indexed metadata from both JorumOpen and JorumUK so is there any way we could access that index (API, web-service) to build a bespoke facility to search from our own environment?

This approach would have the added benefit that all resources will also be discoverable from Jorum itself (assuming they are released under an appropriate license). Peter, there was a previous discussion, I think, around the option to deposit to an open OR closed collection, which we’ve neglected to consider.

Search/Browse

One of the main requirements that came from the workshop was:

Resources presented in the context of specific learning/assessment outcomes (will guide discovery)

I’m not certain what these specific learning/assessment outcomes will be and we will need further input but if we anticipate a browsable hierarchy then it is clear that, however we search across the ALPS resources, the underlying metadata/classification will need to include this hierarchy…which might be an issue depending on how Jorum (DSpace) deals with metadata that it “doesn’t recognise” as we are likely to require quite specific classification schema (rather than JACS).

Miscellaneous issues

A couple of issues that probably should be thought about sooner rather than later:

  • What will we call the thing – “ALPS CETL repository portal”? Might seem trivial but could hold things up…
  • Where will The Portal be hosted – on an institutional server…or http://www.alps-cetl.ac.uk/…who is responsible for that site?