


RE: cds/isis, anyone?



Daniel Chudnov wrote:
 
so.  the obvious question is what, among the varied bits of CDS/ISIS there
are out there, might be re-useful?  and is their architecture a worthy one
in 1999, or a legacy framework that needs replacement?
_________________________________________________________________________________________
 
I have been asking myself these questions for quite some time and may have a few elements of an answer:
 
What are the strong points of CDS/ISIS? Most users would probably acknowledge that CDS/ISIS is very good at storing and retrieving text and bibliographic information. In particular, it has:
 
- a flat file structure representing non-tabular data, i.e. a variable number of untyped, variable-length fields, with subfields and repeatable fields (although not repeatable subfields). The way data is represented will be very familiar to anyone who knows MARC. CDS/ISIS can even export its data to a variant of the ISO 2709 format. There is some notion of relation (or reference) between a field and another field in the same or a separate database. But it is obviously not an RDBMS, and it cannot be queried through SQL.
 
- a powerful formatting language, used for two purposes: displaying and printing records, and the very flexible generation of inverted files (indexes). You can, for example, index on simple elements such as fields and/or subfields, or on some complicated mix of them, with all the power of a small programming language. In the latest versions, multiple indexes can be created.
 
- a powerful search language with all the Boolean operators and a set of proximity operators. There is also a powerful sequential search engine for very complicated queries, expressed in the formatting language. Search results are ordered by record number.
 
- a flexible and programmable user interface, which makes it easy to "internationalize", at least for the character-based versions.
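
The record model and inverted file described in the list above can be illustrated with a minimal Python sketch. It is not CDS/ISIS code. The assumptions are: subfields are introduced by "^" followed by a one-letter code, the tags 24, 69 and 70 are arbitrary, the sample records are made up, and the indexing rule in index_terms() is a hypothetical stand-in for what the formatting language would express.

```python
from collections import defaultdict


def parse_subfields(value):
    """Split '^aDoe^bJane' into {'a': 'Doe', 'b': 'Jane'}.

    Assumption: '^' plus one letter introduces a subfield. Text before
    the first '^' is stored under the empty key ''.
    """
    parts = value.split("^")
    subfields = {}
    if parts[0]:
        subfields[""] = parts[0]
    for part in parts[1:]:
        if part:
            subfields[part[0]] = part[1:]
    return subfields


# Each record maps a numeric tag to a list of occurrences, so any field
# may be absent, occur once, or repeat. Sample data is invented.
records = {
    1: {24: ["Open source software for libraries"],
        70: ["^aDoe^bJane"],
        69: ["open source", "libraries"]},
    2: {24: ["Indexing bibliographic records"],
        70: ["^aSmith^bAnn", "^aJones^bLee"],
        69: ["indexing", "libraries"]},
}


def index_terms(record):
    """Hypothetical indexing rule: title words, surnames, whole keywords."""
    for title in record.get(24, []):
        for word in title.lower().split():
            yield word
    for author in record.get(70, []):
        surname = parse_subfields(author).get("a")
        if surname:
            yield surname.lower()
    for keyword in record.get(69, []):
        yield keyword.lower()


def build_inverted_file(recs):
    """Map each index term to the set of record numbers containing it."""
    inverted = defaultdict(set)
    for record_number, record in recs.items():
        for term in index_terms(record):
            inverted[term].add(record_number)
    return inverted


inverted = build_inverted_file(records)


def search_and(*terms):
    """Records containing every term, ordered by record number."""
    sets = [inverted.get(t.lower(), set()) for t in terms]
    return sorted(set.intersection(*sets)) if sets else []


def search_or(*terms):
    """Records containing at least one term, ordered by record number."""
    return sorted(set().union(*(inverted.get(t.lower(), set()) for t in terms)))
```

Because the index terms come from an arbitrary function of the record rather than from a column, the same record can be indexed word by word, by subfield, or by whole field. That is the flexibility the list above attributes to the formatting language.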
 
Now, which of these characteristics can be reused, and how? I know of several tools (there are probably others):
 
- ISISPASCAL: it is used to develop applications running inside the CDS/ISIS environment, but it is not open to the outside world (i.e. no access to external APIs such as those of the OS).
 
- ISISDLL: a library for calling ISIS functions from Windows-based programs. It is not a straightforward implementation, and quite buggy in my experience.
 
- WWWISIS (http://www.bireme.br/wwwisis.htm): a command-line utility with lots of parameters, which can be called from scripts, especially CGIs on the WWW. It uses the CDS/ISIS search language and formatting language with some nice enhancements. It works in Win32 and some Unix environments. It can be used both to query and to update.
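
Calling a command-line utility of this kind from a script could look roughly like the Python sketch below. The helper run_search_tool() is hypothetical, and no actual WWWISIS parameters are shown: the caller would supply the path to the executable and its parameters as documented by the tool.

```python
import subprocess


def run_search_tool(argv, timeout=30):
    """Run a command-line tool and return its standard output as lines.

    argv is the executable path followed by its parameters, passed as a
    list so that no shell is involved. Raises CalledProcessError if the
    tool exits with a non-zero status, TimeoutExpired if it hangs.
    """
    completed = subprocess.run(
        argv,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return completed.stdout.splitlines()
```

Passing the arguments as a list rather than as one shell string matters in a CGI setting, since a search expression typed by a web user is then never interpreted by a shell.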
 
Both ISISDLL and WWWISIS were developed by BIREME (http://www.bireme.br) on top of CISIS, an unpublished C library. CDS/ISIS, ISISDLL and WWWISIS are free but are not open source software: their source code is unpublished. However, some file structures were published in the CDS/ISIS manual.
 
I recently developed an online, subscriber-only bibliographic database that uses CDS/ISIS through WWWISIS for the bibliographic data and an RDBMS for subscriber management. The mix is successful and the system works well so far. The main advantage of using ISIS for the bibliographic data was to harness the powerful search language and indexing capabilities of CDS/ISIS. It would have been very difficult to reproduce this in an SQL-based system. But in my case it was also a matter of legacy: the stand-alone version of the bibliographic database had been here for a long time (15 years) and had used CDS/ISIS from the start.
 
For me, the main drawback of such a mix of CDS/ISIS and an RDBMS is just that: two heterogeneous databases are used, which means two very different ways of representing data and two different APIs. The complexity of the software increases, and so does the risk of bugs. Good software development practice may help, such as three-tier development (i.e. hiding the complexity in the data layer).
 
One way to overcome this would be a unified data access standard, a kind of generalized SQL that could also access non-relational databases. A CDS/ISIS interface to this standard could be written and, presto, software development would be easier...
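
As a rough illustration of that idea, the Python sketch below puts one small query interface in front of two very different stores. Everything in it is hypothetical: DataSource, SqlSource and FieldRecordSource are names invented for the example, the sample data is made up, and the in-memory record list merely stands in for a CDS/ISIS database. It is far from a generalized SQL, but it shows how application code could stay the same across both back ends.

```python
import sqlite3
from typing import Dict, List, Protocol


class DataSource(Protocol):
    """The single interface the application layer would program against."""

    def find(self, field: str, value: str) -> List[Dict]: ...


class SqlSource:
    """Relational back end: one table, exact match on one column."""

    def __init__(self, conn, table, columns):
        self.conn = conn
        self.table = table
        self.columns = tuple(columns)

    def find(self, field, value):
        # Only known column names are interpolated; the value is bound.
        if field not in self.columns:
            raise KeyError(field)
        sql = "SELECT {} FROM {} WHERE {} = ?".format(
            ", ".join(self.columns), self.table, field)
        rows = self.conn.execute(sql, (value,)).fetchall()
        return [dict(zip(self.columns, row)) for row in rows]


class FieldRecordSource:
    """Non-relational back end: records with repeatable fields."""

    def __init__(self, records):
        self.records = records

    def find(self, field, value):
        # A record matches if any occurrence of the field equals value.
        return [rec for rec in self.records if value in rec.get(field, [])]


def count_matches(source: DataSource, field: str, value: str) -> int:
    """Application code that does not care which back end it talks to."""
    return len(source.find(field, value))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subscribers (name TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO subscribers VALUES (?, ?)",
    [("alice", "active"), ("bob", "expired")],
)
subscribers = SqlSource(conn, "subscribers", ["name", "status"])

catalogue = FieldRecordSource([
    {"title": ["Indexing basics"], "keyword": ["indexing", "libraries"]},
    {"title": ["Open source in libraries"], "keyword": ["libraries"]},
])
```

The hard part, which the sketch sidesteps, is that a common interface tends to expose only what both back ends can do, so the richer search and indexing features would still need their own extensions.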
 
Sorry for the long email. I hope it provides some useful information about CDS/ISIS.
 
Jean-Philippe Thouard
Center for Library and Information Resources
Asian Institute of Technology
 

© Copyright 1999-2005, The oss4lib Community, except for readings and comments, which are owned by their posters.
oss4lib is graciously hosted by the good folks at sourceforge.net.
Site URL: http://oss4lib.org/ Questions or comments to maintainers.

