Nonstop Database, HiRDB Version 9 Description

10.3.1 HiRDB Text Search Plug-in

The HiRDB Text Search Plug-in provides the following text structure search facilities, each of which is explained in this subsection.

Organization of this subsection
(1) SGML and XML document registration
(2) Flat document registration
(3) Retrieval with structure specification
(4) Synonym and spelling variation retrieval
(5) Score retrieval

(1) SGML and XML document registration

A utility provided by the HiRDB Text Search Plug-in can be used to register a DTD file into the HiRDB database. The DTD file defines the tag names that indicate the structure and elements of an SGML or XML document. When the SGMLTEXT constructor facility is used with the registered DTD file, an SGML or XML document can be registered into the HiRDB database together with its document structure information.

(2) Flat document registration

Flat documents that do not have a structure can be registered into the HiRDB database.

(3) Retrieval with structure specification

A full-text search of SGML and XML documents can be performed by using the contains abstract data type function. The function specifies the columns to be searched and the retrieval condition, which consists of the name of the document structure to be searched and a conditional expression that specifies the keyword.
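The idea of restricting a keyword search to a named document structure can be illustrated with a minimal Python sketch. This is not the plug-in's contains function or its condition syntax; the function name contains_in_structure, the sample document, and the tag names are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def contains_in_structure(document: str, structure_name: str, keyword: str) -> bool:
    """Return True if the keyword occurs in the text of any element
    whose tag is structure_name (hypothetical helper, not HiRDB's API)."""
    root = ET.fromstring(document)
    # iter() visits the root and all descendants that carry the given tag.
    for element in root.iter(structure_name):
        # itertext() also collects text nested inside child elements.
        text = "".join(element.itertext())
        if keyword in text:
            return True
    return False


# Hypothetical sample document with two structures: title and body.
doc = "<manual><title>HiRDB overview</title><body>Backup procedures</body></manual>"

print(contains_in_structure(doc, "title", "HiRDB"))  # True
print(contains_in_structure(doc, "body", "HiRDB"))   # False
```

The same keyword matches or fails to match depending on which structure is named, which is the effect a structure specification has on a full-text search.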

(4) Synonym and spelling variation retrieval

A utility provided by the HiRDB Text Search Plug-in can be used to register synonyms and spelling variations into a local file. Based on the registered entries, a full-text search of SGML documents can also retrieve text that contains synonyms or spelling variations of the search keyword. For example, a search for the keyword computer can also retrieve a synonym such as electronic computing machine and spelling variations such as COMPUTER and Computer.
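The expansion of a keyword into its synonyms and spelling variations can be sketched in Python as follows. The dictionary contents and the function names are assumptions made for illustration, and spelling variation is reduced here to differences in letter case; the plug-in's actual dictionary format and matching rules are not described in this section.

```python
# Hypothetical synonym dictionary: keyword -> registered synonyms.
SYNONYMS = {"computer": ["electronic computing machine"]}


def expand_keyword(keyword: str) -> list:
    """Return the keyword and its registered synonyms, all lowercased."""
    key = keyword.lower()
    return [key] + [synonym.lower() for synonym in SYNONYMS.get(key, [])]


def matches(text: str, keyword: str) -> bool:
    """Return True if the text contains the keyword, a synonym of it,
    or a case variation of either (case-folding is an assumption)."""
    lowered = text.lower()
    return any(term in lowered for term in expand_keyword(keyword))


print(matches("The COMPUTER room is locked", "computer"))                # True
print(matches("An electronic computing machine was used", "computer"))  # True
print(matches("A typewriter was used", "computer"))                     # False
```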

(5) Score retrieval

The contains_with_score and score abstract data type functions provided by the HiRDB Text Search Plug-in can be used to compute a score for each document based on how frequently the keyword occurs in it, and to display the retrieval results according to those scores.
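Frequency-based scoring can be sketched in Python as follows. This section does not give the plug-in's scoring formula, so the sketch assumes the simplest case, in which the score is the raw number of keyword occurrences; the function names and sample documents are hypothetical.

```python
def score(text: str, keyword: str) -> int:
    """Count non-overlapping, case-insensitive occurrences of the keyword.
    Using the raw count as the score is an assumption."""
    return text.lower().count(keyword.lower())


def ranked_search(documents: dict, keyword: str) -> list:
    """Return (name, score) pairs for matching documents, highest score first."""
    scored = [(name, score(body, keyword)) for name, body in documents.items()]
    hits = [(name, points) for name, points in scored if points > 0]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data.
docs = {
    "a": "database server",
    "b": "database backup for the database",
    "c": "network",
}

print(ranked_search(docs, "database"))  # [('b', 2), ('a', 1)]
```

Documents that do not contain the keyword are dropped, and the remaining ones are ordered by score, which corresponds to displaying retrieval results according to the scores.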