5.4.1 Points to consider in determining the columns to be defined for a text index : Hitachi Advanced Database Setup and Operation Guide

(1) Cases that benefit from a text index

Consider defining a text index when you will be retrieving data in the following ways:

When using the LIKE predicate to perform middle match or trailing match retrievals

If you use a text index, there will be fewer pages to be loaded than if you were using a B-tree index. This means that you can achieve higher-speed retrievals by defining a text index than when you define a B-tree index. Define in the LIKE predicate a text index for a column that is specified as a match value.
When executing retrieval operations with the LIKE_REGEX predicate specified

Using a text index requires fewer pages to be loaded. This means that you can achieve faster retrieval by defining a text index.
When executing retrieval operations with the CONTAINS scalar function specified

Using a text index requires fewer pages to be loaded. This means that you can achieve faster retrieval by defining a text index.
When performing a correction search

Using a text index for correction search requires fewer pages to be loaded. This means that you can achieve faster retrieval by defining a text index for correction search. For details about correction search, see 2.17.1 Correction search.
When performing word-context search

Using a text index for a word-context search requires fewer pages to be loaded. This means that you can achieve faster retrieval by defining a text index for a word-context search. For details about word-context search, see 2.17.4 Word-context search.

The following table explains the relationship between the data to be retrieved and text indexes.

Table 5‒4: Relationship between the data to be retrieved and text indexes
No.	Retrieved data items	Validity of text index	Notes
1	Document data (such as document data and mail containing at least 100 characters)	V	If a character string is long, the data size for the text index might become large. For document data, the data size for the text index tends to become large compared to other retrieval target data because data for the text index includes such character strings that are not used as search conditions.
2	Document data written in English (such as document data and emails containing at least 100 characters)	V	When searching for English words in documents written in English, specify a text-index-word-context search specification and define a text index. To conduct the word-context search, use the `CONTAINS` scalar function. For details about text-index-word-context search specification, see 5.4.5 Specifying a text index for a word-context search (TEXT WORDCONTEXT). Note that if a character string is long, the data size for the text index might become large. For document data, the data size for the text index tends to be larger than for other retrieval target data because data for the text index includes character strings that unlikely to ever be used as search conditions. The data size for the text index will also be larger when you specify a text-index-word-context search specification than when you do not.
3	Character strings consisting of only alphanumeric characters (such as URLs and access logs)	L	Use of text indexes might not yield any benefit in the following cases: Data is extremely short. The variation among characters used is small.
4	Name data (such as product names consisting of about 10 characters)	U	Use of a text index might not yield any benefit in the following case: Each data item consists of about 10 characters, but there are many identical name data items, such as in a history of purchases.
5	Character strings consisting of only numeric characters (such as IDs and codes)	N	Even if retrieval results are unique, the retrieval processing might require a large amount of time because there are not many character combinations (inasmuch as all character strings consist of only numeric characters). Little benefit can be expected from using text indexes.

Legend:

V: Very likely to be valid

L: Likely to be valid

U: Unlikely to be valid

N: Not valid

To Page Top

(2) Cases that benefit when a text index is not defined

Do not define a text index for columns with the following characteristics:

There are many duplications in the column data (many partial or complete matches).
The number of different characters is limited, such as a number column containing only numeric digits (for example, IDs).
The character strings to be retrieved are short and simple, such as the alphabetic character a or the numeric digit 0.
The character strings to be retrieved are too long (1,000 characters or more).

If a text index is defined for such a column, the number of pages to be retrieved cannot be reduced because the data retrieval range cannot be narrowed. No benefit can be expected from using a text index.

To Page Top

(3) Columns for which text indexes can be defined

Text indexes can be defined only for columns that satisfy the following conditions:

Columns for which text indexes can be defined

CHARACTER type column
VARCHAR type column

To Page Top