Hitachi Advanced Database Setup and Operation Guide


2.17.4 Word-context search

This section describes word-context searches of text data (document data).

(1) Overview of word-context search

Word-context search is a function that quickly searches English-language text data (document data) for specific English words. It supports two methods of searching for English words: complete-match retrieval and leading-match search.

Use a word-context search when you want to quickly search English-language text data (document data) for words using complete-match retrieval or leading-match search. The following figure shows an overview of word-context search:

Figure 2‒61: Overview of word-context search

[Figure]

To perform a word-context search, you use the CONTAINS scalar function. For details about the CONTAINS scalar function, see CONTAINS in Character string functions (Acquisition of character string information) under Scalar Functions in the manual HADB SQL Reference.

When performing a word-context search, you can reduce the number of pages to be loaded by defining a text index for a word-context search. By doing so, you can improve table retrieval performance. For details about text indexes for word-context searches, see CREATE INDEX (define an index) under Definition SQL in the manual HADB SQL Reference.

The text data (document data) targeted by a word-context search must satisfy the following conditions:

The delimiting character itself is ignored by the word-context search. Text that is not written in English can also be used as target data for a word-context search if it satisfies these conditions. For details about delimiting characters, see (2) Relationship between word-context search and delimiting characters.

Note

Word-context search will work with words that contain full-width characters or full-width spaces. For example, if the search term Hitachｉ is specified (where the letter ｉ is a full-width character), the word-context search will be executed with Hitachｉ as the search term.

Important

Word-context search does not use trailing-match or middle-match retrieval. For example, you cannot find the word pineapple by executing a word-context search with apple specified.

If you need middle-match or trailing-match retrieval, use a LIKE predicate. Middle-match retrieval can also be performed with the CONTAINS scalar function.
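The difference between the match types can be illustrated outside HADB with a minimal Python sketch. It models the behavior described in this subsection and is not HADB's implementation; the function names and the delimiter set (space, tab, and period) are assumptions made for illustration only.

```python
# Assumed delimiter set for illustration; the real set is chosen when the
# text index is defined.
DELIMITERS = " \t."

def split_words(text, delimiters=DELIMITERS):
    """Split text into words and drop the delimiting characters."""
    words, current = [], []
    for ch in text:
        if ch in delimiters:
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words

def complete_match(text, term):
    """True if some word in text is exactly equal to term."""
    return term in split_words(text)

def leading_match(text, term):
    """True if some word in text begins with term."""
    return any(word.startswith(term) for word in split_words(text))

def middle_match(text, term):
    """Plain substring test, shown only for contrast with word-level matching."""
    return term in text
```

In this model, searching for apple finds the word apple (complete match) and the word apples (leading match), but neither word-level function finds pineapple. Only the substring test does, which corresponds to the middle-match retrieval that word-context search does not provide.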

(2) Relationship between word-context search and delimiting characters

A word-context search retrieves words in English language documents that are separated by delimiting characters. The following characters are handled as delimiting characters in word-context searches:

The delimiting characters themselves do not constitute part of a word. This means that whether words are separated by one delimiting character or by two or more consecutive delimiting characters has no effect on the surrounding words.

■ When data contains consecutive delimiting characters

In the following sentences, This, is, an, and apple are treated as individual words.

  • This∆is◊an∆apple.

  • This∆is◊an∆∆◊apple.

Legend:

∆: Single-byte space

◊: Tab
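The effect of consecutive delimiting characters can be sketched in Python as follows. This is an illustrative model only; the function name and the delimiter set (space, tab, and period) are assumptions, and the actual delimiting characters depend on the text index definition.

```python
# Assumed delimiter set for illustration only.
DELIMITERS = " \t."

def split_words(text, delimiters=DELIMITERS):
    """Split text into words; runs of delimiters produce no empty words."""
    words, current = [], []
    for ch in text:
        if ch in delimiters:
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words

# The two example sentences, with a space for each ∆ and a tab for each ◊.
single = split_words("This is\tan apple.")
consecutive = split_words("This is\tan  \tapple.")
```

Both calls yield the same four words (This, is, an, apple), because a run of delimiting characters is handled the same way as a single one.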

You specify the characters to handle as delimiting characters when defining a text index for a word-context search. For details, see 5.4.6 Selecting the delimiting characters for word-context searches (DELIMITER).

You can also specify multiple words (phrases) when conducting a word-context search using complete-match retrieval. Delimiting characters do not constitute part of the words in this case either. This means that whether words are separated by one delimiting character or two or more consecutive delimiting characters has no impact on the words retrieved by the search.

■ Searching for multiple words (phrases) by using complete-match retrieval in a word-context search

When you execute a word-context search that specifies this∆∆∆is, the search retrieves phrases such as the following. Note that the search does not retrieve phrases in which the words are in a different order (such as is∆this) or in which other words appear between the specified words (as in this∆apple∆is).

  • this∆is∆an∆apple

  • this∆◊is◊an◊apple

Legend:

∆: Single-byte space

◊: Tab
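Phrase matching at the word level can be sketched in Python as follows. This is a model of the behavior described above, not HADB's implementation; the function names and the delimiter set (space and tab) are assumptions. Search terms that consist only of delimiting characters are deliberately not modeled.

```python
# Assumed delimiter set for illustration only.
DELIMITERS = " \t"

def split_words(text, delimiters=DELIMITERS):
    """Split text into words and drop the delimiting characters."""
    words, current = [], []
    for ch in text:
        if ch in delimiters:
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words

def phrase_match(text, phrase):
    """True if the words of phrase occur in text, adjacent and in order."""
    target = split_words(text)
    wanted = split_words(phrase)
    if not wanted:
        # Delimiter-only search terms are outside the scope of this sketch.
        raise ValueError("search term contains no words")
    n = len(wanted)
    return any(target[i:i + n] == wanted for i in range(len(target) - n + 1))
```

With the search term "this   is" (three spaces), this model matches "this is an apple" and "this \tis\tan\tapple", but not "is this" or "this apple is".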

Note

If the search term consists only of one or more delimiting characters, a full search is performed regardless of the content of the text data (document data) targeted by the word-context search.

(3) Combining word-context search with correction search

Word-context search operations can be performed together with correction search operations. For details about correction search, see 2.17.1 Correction search.

You can specify the scalar function CONTAINS to use both word-context search and correction search. Both complete-match and leading-match forms of word-context search operations can be combined with correction search operations.

By combining word-context search and correction search, all occurrences of a word or phrase can be retrieved while ignoring differences like the following:

(4) Combining word-context search with synonym search

Complete-match word-context search operations can be combined with synonym search operations. For details about synonym search, see 2.17.3 Synonym search.

You can specify the scalar function CONTAINS to use both word-context search and synonym search. This means that words targeted by the search and synonyms of the words targeted by the search can be retrieved in one operation by the word-context search.

Important

You cannot combine leading-match word-context search operations with synonym search operations.

Because word-context search searches at the word level, data is retrieved if it matches the search-target word regardless of the type of delimiting character used between words. However, a word retrieved by a synonym search will include a delimiting character if the word is registered in the synonym dictionary with the delimiting character forming part of the word. The following figure shows an example.

Figure 2‒62: Search example when combining word-context search with synonym search

[Figure]

Explanation
  • When combining word-context search and synonym search, if you search for the term data◊base, the search term will not match the synonym data∆base registered in the synonym dictionary. The synonym registered in the synonym dictionary will not be retrieved.

  • When combining word-context search and synonym search, if you search for the term data∆base, it matches the synonym data∆base registered in the synonym dictionary. Because this operation searches for the synonyms registered in the synonym dictionary, the synonym data◊bank is also retrieved. Because the operation also performs a word-context search, it also retrieves data∆bank, which has a different delimiting character.
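The interaction described in the explanation can be modeled with a short Python sketch. It rests on stated assumptions: the synonym dictionary is represented as a hypothetical list of groups, the search term is compared with dictionary entries exactly (delimiting characters included), and each resulting term is then matched at the word level. None of the names below are HADB identifiers.

```python
# Assumed delimiter set for illustration only.
DELIMITERS = " \t"

# Hypothetical synonym dictionary: "data base" (space) and "data\tbank" (tab).
SYNONYM_GROUPS = [["data base", "data\tbank"]]

def split_words(text, delimiters=DELIMITERS):
    """Split text into words and drop the delimiting characters."""
    words, current = [], []
    for ch in text:
        if ch in delimiters:
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words

def phrase_match(text, phrase):
    """True if the words of phrase occur in text, adjacent and in order."""
    target, wanted = split_words(text), split_words(phrase)
    n = len(wanted)
    return n > 0 and any(
        target[i:i + n] == wanted for i in range(len(target) - n + 1)
    )

def expand(term):
    """Return term plus its synonyms; the dictionary lookup is exact."""
    for group in SYNONYM_GROUPS:
        if term in group:
            return list(group)
    return [term]

def search(text, term):
    """Word-level match of term or any synonym found for it."""
    return any(phrase_match(text, candidate) for candidate in expand(term))
```

In this model, the term "data\tbase" does not match the dictionary entry "data base", so no synonyms are added and text containing only data bank is not found. The term "data base" matches the entry, so text containing data bank is found whether the two words are separated by a space or by a tab.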

(5) Combining word-context search, correction search, and synonym search

Complete-match word-context search operations can be combined with correction search and synonym search operations.