Sunday, 1 June 2008

Upcoming Changes to JZKit Configuration

Some Sunday musings on the configuration mechanism in the jzkit_service module...

The jzkit_service module is the glue that pulls all the other components together into a federated meta-search component. It resolves internal collection and landscape names into a list of external Z39.50/SRU/SRW/OpenSearch/SOLR/JDBC/etc. services, broadcasts a search to those services, integrates the results and provides a unified result set.
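To make that flow concrete, here's a rough Python sketch of the resolve / broadcast / merge idea. It is not JZKit code: the collection name, the two fake backends and the federated_search function are all invented for illustration, and real backends would be network calls to Z39.50, SRU or SOLR services.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for remote search services (assumptions, not JZKit APIs).
def search_catalogue_a(query):
    return [{"source": "catalogue-a", "title": f"{query} handbook"}]


def search_catalogue_b(query):
    return [
        {"source": "catalogue-b", "title": f"Guide to {query}"},
        {"source": "catalogue-b", "title": f"{query} in practice"},
    ]


# Hypothetical registry: internal collection name -> backends to search.
COLLECTIONS = {
    "all-catalogues": [search_catalogue_a, search_catalogue_b],
}


def federated_search(collection, query):
    """Resolve a collection name, query every backend in parallel, merge the hits."""
    backends = COLLECTIONS[collection]
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        # pool.map returns results in the same order as the backends list.
        result_lists = list(pool.map(lambda backend: backend(query), backends))
    merged = []
    for hits in result_lists:
        merged.extend(hits)
    return merged


if __name__ == "__main__":
    for hit in federated_search("all-catalogues", "Z39.50"):
        print(hit["source"], "-", hit["title"])
```

The merge here is a plain concatenation; ranking, de-duplication and record conversion are left out to keep the sketch short.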

This gives rise to a problem. For solutions like the previous Z39.50/SOLR bridge, we want a simple XML config file that a user can hack once, then leave. For more complicated applications, we need a real relational database behind the app to manage the complex config that goes along with large information network architectures.

Here's the rub: JZKit carries with it a very detailed service registry. Its purpose isn't, like the JISC IESR, to be a registry in its own right, but to support the search process. More and more, though, there is a need to make the JZKit service registry searchable in its own right (as a Z39.50 Explain database, or as an SRU Explain collection / ZeeRex records).

This has been bothering me for a while, yet the answer has pretty much been staring me in the face all along. So, here's what I'm considering for the final release of JZKit3:

1. The current "InMemoryConfig" which is loaded from XML config files will be deprecated.

2. It will be replaced by an in-memory Derby database: essentially just the current database-backed config mechanism, but with an in-memory database.

3. The XML config files will be left intact, but considered to be a "BootStrap" mechanism. At startup, JZKit will scan the config files and update/create any entries in the configuration database.

Thus, the "In-Memory" config will remain as before, but instead of being held in hashmaps the data will be inside a Derby database. This means we can now define, first and foremost, a JDBC-backed datasource for the explain database and make it searchable even for static content.

--> Free explain service for any JZKit shared collections.
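
As a rough illustration of the bootstrap idea in steps 1 to 3, here's a Python sketch that scans an XML file and creates or updates rows in an in-memory SQL database. It is only a sketch under loud assumptions: SQLite stands in for Derby, and the Service element, its attributes and the table layout are made up for the example rather than taken from the real JZKit config or schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical bootstrap file; element and attribute names are invented.
BOOTSTRAP_XML = """
<Config>
  <Service code="Test" protocol="solr" url="http://localhost:8983/solr"/>
  <Service code="Books" protocol="z3950" url="tcp:z3950.example.org:210"/>
</Config>
"""


def open_registry():
    """Create an in-memory database (SQLite here, standing in for Derby)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE service ("
        "code TEXT PRIMARY KEY, protocol TEXT NOT NULL, url TEXT NOT NULL)"
    )
    return conn


def bootstrap(conn, xml_text):
    """Scan the XML and create or update one registry row per Service element."""
    root = ET.fromstring(xml_text.strip())
    for svc in root.iter("Service"):
        # INSERT OR REPLACE makes the load repeatable: running it again
        # updates existing rows instead of failing or duplicating them.
        conn.execute(
            "INSERT OR REPLACE INTO service (code, protocol, url) VALUES (?, ?, ?)",
            (svc.get("code"), svc.get("protocol"), svc.get("url")),
        )
    conn.commit()


def find_by_protocol(conn, protocol):
    """The registry is now searchable with ordinary SQL."""
    rows = conn.execute(
        "SELECT code, url FROM service WHERE protocol = ? ORDER BY code",
        (protocol,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    registry = open_registry()
    bootstrap(registry, BOOTSTRAP_XML)
    bootstrap(registry, BOOTSTRAP_XML)  # a second startup changes nothing
    print(find_by_protocol(registry, "solr"))
```

The point of the exercise is the last function: once the config lives in a database, even one that is rebuilt from static XML at every startup, anything that can speak SQL can search it.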

Tuesday, 27 May 2008

Exposing SOLR service(s) as a Z3950 server

JZKit is a pretty large toolkit for developers of search services to embed in their own systems, and it's not always easy to get to grips with ;). Partly that's because if you're using JZKit you're probably already dealing with the Z39.50 specifications along with a host of other concerns.

What developers need are simple starter apps that they can use to hit the ground running. In JZKit3 we've decided to try and address this by putting up some sample configurations of the tool that do useful stuff out of the box. First up: making a SOLR server visible as a Z39.50 server using an easy-to-change XML config file. Why do that? Well, lots of reasons. The most common one at the moment is that SOLR is being used to provide search interfaces into lots of interesting new content, not least of which is the whole new breed of digital object repository projects like Fedora and DSpace. What seems to keep coming back around is institutional librarians saying "But I want that content to be available alongside everything else".

Kewl, a problem we can do something about. For now, the gateway distro lives in our maven repository Here. If you fancy setting up a Z39.50 server to proxy for a local SOLR server, download the compressed tar file from the above URL and unpack it.

Configuration.

After unpacking, look in etc/JZKitConfig.xml. You'll see a number of elements which define the SOLR services we want to search. (Actually, you can define all kinds of searchable resources here, not just SOLR but other Z39.50 targets, SRU/SRW, OpenSearch, etc., but that's for another day.) You should be able to see that the top one points at a public SOLR server; adjust the URL to point at your local repository. The other elements inside the Repository element control the mapping of Z39.50 use attributes onto the SOLR search language and which record elements are requested as part of the search results. The code attribute specifies the Z39.50 database name this resource will be made available under. Having adjusted the config to point at a local server, cd ../bin and run "jzkit_service.sh start". This will give you a running Z39.50 server, by default on port 2100.
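
To give a feel for what mapping use attributes onto the SOLR search language means, here's a small Python sketch. It is not how JZKit does it internally, and the SOLR field names are assumptions about a local schema; the Bib-1 use attribute numbers (4 for title, 1003 for author, 1016 for any) are the standard ones.

```python
# Hypothetical mapping from Bib-1 use attribute values to SOLR field names.
# The field names are assumptions; substitute whatever your SOLR schema uses.
USE_ATTR_TO_SOLR_FIELD = {
    4: "name",       # Bib-1 use attribute 4 = Title
    1003: "author",  # Bib-1 use attribute 1003 = Author
    1016: "text",    # Bib-1 use attribute 1016 = Any
}


def to_solr_query(use_attr, term):
    """Turn a single '@attr 1=<use_attr> <term>' clause into a SOLR field query."""
    field = USE_ATTR_TO_SOLR_FIELD.get(use_attr)
    if field is None:
        raise ValueError(f"unsupported use attribute: {use_attr}")
    # Quote the term as a phrase, escaping backslashes and double quotes.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


if __name__ == "__main__":
    # Same shape as the yaz-client query "find @attr 1=4 Dell".
    print(to_solr_query(4, "Dell"))
```

A use attribute with no mapping is rejected rather than silently searched against a default field, which is roughly what a Z39.50 client would expect to see reported as an unsupported attribute.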

From here it should be plain sailing. Here's the output of a yaz-client session:

(N.B. XML markup in the result record is being filtered out by Blogger. The actual result record is XML; in this case it's just the SOLR element.)

yaz-client tcp:@:2100
Connecting...OK.
Sent initrequest.
Connection accepted by v3 target.
ID : 174
Name : JZkit generic server / JZKit Meta Search Service
Version: 3.0.0-SNAPSHOT
Options: search present delSet triggerResourceCtrl scan sort extendedServices namedResultSets negotiationModel
Elapsed: 0.006948
Z> base Test
Z> find @attr 1=4 Dell
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 1, setno 1
records returned: 0
Elapsed: 0.002231
Z> format xml
Z> show 1
Sent presentRequest (1+1).
Records: 1
[coll]Record type: XML
3007WFPDell Widescreen UltraSharp 3007WFP6
nextResultSetPosition = 2
Elapsed: 0.008594
Z> quit

Other components of the config file can be used to control the XML records returned, to convert SOLR records into MARC, or to manage meta-searching through the gateway, but that's a subject for next time.

Happy meta-searching.

Knowledge Integration Ltd