Scalable Database Server, HiRDB Version 8 Description

10.3.1 HiRDB Text Search Plug-in

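The HiRDB Text Search Plug-in provides text structure search facilities for registering documents (SGML, XML, and flat documents) and for retrieving them (retrieval with structure specification, synonym and spelling variation retrieval, and score retrieval).

Each of these facilities is explained below.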

Organization of this subsection
(1) SGML and XML document registration
(2) Flat document registration
(3) Retrieval with structure specification
(4) Synonym and spelling variation retrieval
(5) Score retrieval
(6) Note

(1) SGML and XML document registration

A utility provided by the HiRDB Text Search Plug-in can be used to register a DTD file into the HiRDB database. The DTD file defines the tag names that indicate the structure and elements of an SGML or XML document. When the SGMLTEXT constructor facility is used on the basis of the registered DTD file, the SGML or XML document can be registered into the HiRDB database together with its document structure information.

(2) Flat document registration

Flat documents that do not have a structure can be registered into the HiRDB database.

(3) Retrieval with structure specification

A full-text search of SGML and XML documents can be performed by using the contains abstract data type function. The function specifies the columns to be retrieved and the retrieval condition. The retrieval condition consists of the name of the document structure to be retrieved and a conditional expression that specifies the keyword to be retrieved.
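The idea of a retrieval condition that pairs a document structure name with a keyword can be sketched outside HiRDB. The following Python example is an illustration only: it does not call the plug-in or reproduce the syntax of the contains function. The function name contains_in_structure, the DOCS sample data, and the tag names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents standing in for rows of a document column.
DOCS = {
    1: "<doc><title>Computer basics</title><body>An introduction.</body></doc>",
    2: "<doc><title>Networking</title><body>Connect one Computer to another.</body></doc>",
}


def contains_in_structure(document: str, structure_name: str, keyword: str) -> bool:
    """Return True if an element named structure_name contains keyword.

    Illustrative stand-in for a structure-specified search; this is not
    the HiRDB contains function.
    """
    root = ET.fromstring(document)
    for element in root.iter(structure_name):
        # itertext() yields the element's own text and that of its descendants.
        if keyword in "".join(element.itertext()):
            return True
    return False


# Search only inside <title> elements: document 2 mentions the keyword,
# but in <body>, so it is not a hit.
hits = [doc_id for doc_id, text in DOCS.items()
        if contains_in_structure(text, "title", "Computer")]
print(hits)  # [1]
```

Restricting the match to one structure is what distinguishes this kind of retrieval from a plain full-text search over the whole document.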

(4) Synonym and spelling variation retrieval

A utility provided by the HiRDB Text Search Plug-in can be used to register synonyms and spelling variations into a local file. Based on the registered entries, a full-text search for SGML documents also retrieves synonyms and spelling variations of the search keyword. For example, a search for the keyword "computer" can also retrieve a synonym such as "electronic computing machine" and spelling variations such as "COMPUTER" and "Computer".
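The effect of keyword expansion in the example above can be sketched as follows. This is a minimal Python illustration, not the plug-in's mechanism: the SYNONYMS dictionary is a hypothetical in-memory stand-in for the registered file, and case-insensitive comparison is assumed as a simple model of spelling variations.

```python
# Hypothetical synonym dictionary; the real plug-in reads registered entries
# from a file whose format is not described here.
SYNONYMS = {
    "computer": ["electronic computing machine"],
}


def expand_keyword(keyword: str) -> list[str]:
    """Return the keyword plus its registered synonyms, all case-folded."""
    base = keyword.casefold()
    return [base] + [s.casefold() for s in SYNONYMS.get(base, [])]


def matches(text: str, keyword: str) -> bool:
    """Return True if text contains the keyword, a synonym, or a case variant."""
    folded = text.casefold()
    return any(term in folded for term in expand_keyword(keyword))


print(matches("A COMPUTER was installed.", "computer"))                  # True
print(matches("An electronic computing machine arrived.", "computer"))  # True
print(matches("A typewriter was installed.", "computer"))               # False
```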

(5) Score retrieval

The contains_with_score and score abstract data type facilities provided by the HiRDB Text Search Plug-in can be used to compute points (scores) based on the frequency of occurrence of the keyword and to display the retrieval results according to the scores.
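Frequency-based scoring can be sketched as follows. The formula used by the plug-in is not given in this section, so the example assumes the simplest possible score, a raw count of keyword occurrences. The names occurrence_score and ranked_search and the DOCS sample data are hypothetical and do not correspond to the contains_with_score or score facilities.

```python
# Hypothetical sample documents keyed by an identifier.
DOCS = {
    "a": "The computer room holds one computer.",
    "b": "No relevant words here.",
    "c": "A computer, a second computer, and a third computer.",
}


def occurrence_score(text: str, keyword: str) -> int:
    """Assumed score: number of case-insensitive occurrences of keyword."""
    return text.casefold().count(keyword.casefold())


def ranked_search(docs: dict[str, str], keyword: str) -> list[tuple[str, int]]:
    """Return (document id, score) pairs for matching documents, best first."""
    scored = [(doc_id, occurrence_score(text, keyword))
              for doc_id, text in docs.items()]
    hits = [(doc_id, s) for doc_id, s in scored if s > 0]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


print(ranked_search(DOCS, "computer"))  # [('c', 3), ('a', 2)]
```

Documents that do not contain the keyword are dropped before sorting, so the result lists only hits, ordered by score.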

(6) Note

HiRDB Text Search Plug-in versions 02-02 and earlier do not support UTF-8 as a character code classification. If you use a version of HiRDB Text Search Plug-in that does not support UTF-8, do not specify UTF-8 for the character code classification with the pdntenv command. For details about support for UTF-8, see the HiRDB Text Search Plug-in documentation or the Readme file.