Hitachi

Hitachi Advanced Database Setup and Operation Guide


5.4.1  Points to consider in determining the columns to be defined for a text index

When choosing the columns on which text indexes will be defined, consider the points described in the following subsections.

Organization of this subsection

(1) Cases that benefit from a text index

Consider defining a text index when you will be retrieving data in the following ways:

The following table explains the relationship between the data to be retrieved and text indexes.

Table 5‒4: Relationship between the data to be retrieved and text indexes

No.

Retrieved data items

Validity of text index

Notes

1

Document data (such as document data and mail containing at least 100 characters)

V

If a character string is long, the data size for the text index might become large.

For document data, the data size for the text index tends to become large compared to other retrieval target data because data for the text index includes such character strings that are not used as search conditions.

2

Document data written in English (such as document data and emails containing at least 100 characters)

V

When searching for English words in documents written in English, specify a text-index-word-context search specification and define a text index. To conduct the word-context search, use the CONTAINS scalar function. For details about text-index-word-context search specification, see 5.4.5 Specifying a text index for a word-context search (TEXT WORDCONTEXT).

Note that if a character string is long, the data size for the text index might become large. For document data, the data size for the text index tends to be larger than for other retrieval target data because data for the text index includes character strings that unlikely to ever be used as search conditions.

The data size for the text index will also be larger when you specify a text-index-word-context search specification than when you do not.

3

Character strings consisting of only alphanumeric characters (such as URLs and access logs)

L

Use of text indexes might not yield any benefit in the following cases:

  • Data is extremely short.

  • The variation among characters used is small.

4

Name data (such as product names consisting of about 10 characters)

U

Use of a text index might not yield any benefit in the following case:

  • Each data item consists of about 10 characters, but there are many identical name data items, such as in a history of purchases.

5

Character strings consisting of only numeric characters (such as IDs and codes)

N

Even if retrieval results are unique, the retrieval processing might require a large amount of time because there are not many character combinations (inasmuch as all character strings consist of only numeric characters). Little benefit can be expected from using text indexes.

Legend:

V: Very likely to be valid

L: Likely to be valid

U: Unlikely to be valid

N: Not valid

(2) Cases that benefit when a text index is not defined

Do not define a text index for columns with the following characteristics:

If a text index is defined for such a column, the number of pages to be retrieved cannot be reduced because the data retrieval range cannot be narrowed. No benefit can be expected from using a text index.

(3) Columns for which text indexes can be defined

Text indexes can be defined only for columns that satisfy the following conditions:

Columns for which text indexes can be defined
  • CHARACTER type column

  • VARCHAR type column