Interoperability
Another important item is bulk data exchange. SuperThes supports two methods of bulk data exchange:
XML data exchange
SuperThes produces XML output from its tabular and structural data, and it can also import from XML sources. The XML format is specified in a document type definition (DTD), which is published and available to everyone.
The document type definition is in fact a meta-definition: it describes a method for describing a thesaurus's actual configuration. A SuperThes XML file consists of three parts:
- The document type definition
- The description of tables and fields of a particular thesaurus
- The data (tabular and structural) itself
For all the advantages of XML data exchange, there is one big disadvantage: the other program needs an XML parser. You can use a publicly available toolkit such as SAX, but the data path from XML into your database engine still has to be programmed. So the initial cost of implementing XML data exchange is most likely high.
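To give a feeling for the programming effort involved, here is a minimal sketch of such a data path in Python, using the standard SAX parser and an in-memory SQLite database. The element names (`thesaurus`, `record`, `field`), the `terms` table and the function `load_terms` are invented for illustration only; they are assumptions and do not reflect the actual SuperThes format, which is defined by the published document type definition.

```python
import sqlite3
import xml.sax

# Hypothetical sample; the element and attribute names are invented for
# illustration and do not reflect the real SuperThes format.
SAMPLE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<thesaurus>
  <record><field name="id">1</field><field name="label">Bridge</field></record>
  <record><field name="id">2</field><field name="label">Viaduct</field></record>
</thesaurus>"""


class RecordHandler(xml.sax.ContentHandler):
    """Collects the <field> values of each <record> and inserts them as a row."""

    def __init__(self, connection):
        super().__init__()
        self.connection = connection
        self.row = None          # fields of the record being read
        self.field_name = None   # name of the field being read
        self.chunks = []         # text pieces of the current field

    def startElement(self, name, attrs):
        if name == "record":
            self.row = {}
        elif name == "field" and self.row is not None:
            self.field_name = attrs.get("name")
            self.chunks = []

    def characters(self, content):
        # SAX may deliver one text node in several pieces.
        if self.field_name is not None:
            self.chunks.append(content)

    def endElement(self, name):
        if name == "field" and self.field_name is not None:
            self.row[self.field_name] = "".join(self.chunks)
            self.field_name = None
        elif name == "record" and self.row is not None:
            self.connection.execute(
                "INSERT INTO terms (id, label) VALUES (?, ?)",
                (int(self.row["id"]), self.row["label"]),
            )
            self.row = None


def load_terms(xml_bytes):
    """Parse the XML and return a database connection holding the rows."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE terms (id INTEGER PRIMARY KEY, label TEXT)")
    xml.sax.parseString(xml_bytes, RecordHandler(connection))
    connection.commit()
    return connection
```

Even in this toy case the handler has to track parser state and map fields to columns by hand, which is where the initial cost comes from.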
When it comes to tasks such as moving data from an old system into SuperThes, most people prefer a simple text format.
XML data may contain images, documents and sounds. Such data is encoded as Base64, which is adapted from RFC 1421. The encoding complies with RFC 2045 (Format of Internet Message Bodies).
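The round trip of binary data through Base64 can be sketched in Python as follows. The element name `image`, the `encoding` attribute and the helper functions are hypothetical and only serve the illustration; how SuperThes actually marks embedded binary data is defined by its document type definition.

```python
import base64
import xml.etree.ElementTree as ET


def embed_binary(parent, tag, payload):
    """Add a child element whose text is the Base64 form of payload."""
    element = ET.SubElement(parent, tag, {"encoding": "base64"})
    # encodebytes breaks the output into lines of at most 76 characters,
    # the line length limit given in RFC 2045.
    element.text = base64.encodebytes(payload).decode("ascii")
    return element


def extract_binary(element):
    """Decode the Base64 text of an element back to raw bytes."""
    # b64decode ignores the line breaks inserted by encodebytes.
    return base64.b64decode(element.text)


record = ET.Element("record")            # hypothetical element name
image_bytes = bytes(range(256))          # stand-in for a real image file
embed_binary(record, "image", image_bytes)

# Serialize to XML text and read the binary data back.
xml_text = ET.tostring(record, encoding="unicode")
restored = extract_binary(ET.fromstring(xml_text).find("image"))
```

Because Base64 uses only plain ASCII characters, the encoded data passes through XML unchanged, at the cost of roughly one third more volume.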
Textual data exchange
Tabular data is a very frequent candidate for import into a thesaurus. Such data collections, typically coming from word processing software such as MS-Word or from a spreadsheet program such as MS-Excel, can be imported into SuperThes very easily. The character encoding, the date and time formats, and the field and record separators can be selected. Text may be encoded in any code page available on Microsoft operating systems, or in Unicode. Unicode is supported as either UTF-16 or UTF-8.
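What selecting an encoding and a field separator amounts to can be sketched in Python. The function `read_delimited` and the sample table are assumptions made for this illustration and are not part of SuperThes; the sketch reads the same table from a Windows code page, from UTF-8 and from 16-bit Unicode.

```python
import csv
import io


def read_delimited(raw, encoding, field_separator="\t"):
    """Decode raw bytes and split them into a header and a list of rows."""
    text = raw.decode(encoding)
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=field_separator)
    rows = [row for row in reader if row]   # skip empty lines
    return rows[0], rows[1:]


# The same small table, exported in three different encodings.
table = "id;label\r\n1;Brücke\r\n2;Café\r\n"
as_cp1252 = table.encode("cp1252")      # Windows code page 1252
as_utf8 = table.encode("utf-8-sig")     # UTF-8 with a byte order mark
as_utf16 = table.encode("utf-16")       # 16-bit Unicode with a byte order mark

header, rows = read_delimited(as_cp1252, "cp1252", ";")
```

If the wrong encoding is chosen, characters such as "ü" and "é" come out garbled or the decoding fails, which is why the encoding has to be selectable at import time.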
Imported data may be appended to an existing table. Columns may reference already existing data, and imported data may be merged with existing data.
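The difference between appending and merging can be illustrated with a small Python sketch. The function `merge_rows`, the key column `id` and the rule that imported values overwrite existing ones are assumptions chosen for this example; the rules SuperThes actually applies may differ.

```python
def merge_rows(existing, imported, key):
    """Merge imported rows into existing ones, matching on the key column.

    Rows whose key already exists are updated field by field; all other
    rows are appended. Returns the counts of updated and appended rows.
    """
    index = {row[key]: row for row in existing}
    updated = appended = 0
    for new_row in imported:
        target = index.get(new_row[key])
        if target is None:
            existing.append(dict(new_row))
            index[new_row[key]] = existing[-1]
            appended += 1
        else:
            target.update(new_row)
            updated += 1
    return updated, appended


terms = [{"id": "1", "label": "Bridge"}, {"id": "2", "label": "Viaduct"}]
incoming = [
    {"id": "2", "label": "Viaduct", "note": "rail or road"},
    {"id": "3", "label": "Tunnel"},
]
result = merge_rows(terms, incoming, "id")
```

Here the row with `id` 2 is merged with its existing counterpart and gains a `note` field, while the row with `id` 3 has no counterpart and is appended.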
The data to export is defined by selecting a table and its fields.
Textual data exchange may also contain image or document data, which is encoded as Base64 (see XML data exchange for details).