2.3.7 Linking queries

In some cases, a single query cannot produce the desired results. In such cases, you can pass the processing results of one query (the first-stage query) to another query (the second-stage query). This is called query linking. Two types of query linking are possible: stream data-based query linking and relation-based query linking.

When you link queries, the data processed by the first-stage query might be different from the data processed by the second-stage query. For example, if the first-stage query analyzes data in one-minute intervals, and the second-stage query analyzes data in one-hour intervals, it is best to use stream data-based query linking. Conversely, if the data being analyzed by both the first-stage query and the second-stage query is the same, it is best to use relation-based query linking.

The following subsections explain each type of query linking.

(1) Stream data-based query linking

This subsection explains stream data-based query linking, using an example that computes one-minute average temperatures for each observation site, and then outputs the highest average temperature in the last day. The following figure shows the input and output data present when these queries are executed.

Figure 2-21 Input and output data present when stream data-linked queries are executed

[Figure]

The first-stage query (query 1) computes one-minute average temperatures for each observation site, and the second-stage query (query 2) computes the highest average temperature in the last day.

Code
To link queries, you specify the name of the first-stage query in the FROM clause of the second-stage query.

REGISTER STREAM temperature_stream
    (observation_time TIME, id INTEGER, temperature INTEGER);

REGISTER QUERY average_calculation
    ISTREAM (
        SELECT id, AVG(temperature) AS average_temperature
        FROM temperature_stream[RANGE 1 MINUTE]
        GROUP BY id);

REGISTER QUERY MAX_temperature
    ISTREAM (
        SELECT MAX(average_temperature)
        FROM average_calculation[RANGE 1 DAY]);

Explanation
The first-stage query (average_calculation) is the same as the query described in 2.3.5 Categorizing and then summarizing data. The second-stage query (MAX_temperature) takes the stream data output by the first-stage query and uses a window operation to retrieve the tuples to be analyzed. The second-stage query analyzes all tuples that the first-stage query has output during the last day. To achieve this, [RANGE 1 DAY] is specified in the FROM clause of the second-stage query.
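The behavior of these linked queries can be modeled outside the product. The following Python sketch is an illustration only, not the engine's implementation. It assumes that times are given in seconds, that a window covers now - width < time <= now, that relations can be treated as sets, and that queries are re-evaluated only at the times listed in eval_times (a real engine would also react when tuples leave a window). The names window_averages, linked_stream_max, and eval_times are hypothetical.

```python
from collections import defaultdict

MINUTE = 60
DAY = 24 * 60 * MINUTE

def window_averages(readings, now, width=MINUTE):
    """Model of query 1: per-site average of readings with now - width < time <= now."""
    totals = defaultdict(lambda: [0, 0])  # site id -> [sum, count]
    for time, site, temperature in readings:
        if now - width < time <= now:
            totals[site][0] += temperature
            totals[site][1] += 1
    return {site: s / n for site, (s, n) in totals.items()}

def linked_stream_max(readings, eval_times, width=DAY):
    """Model of query 2: MAX over the averages that query 1 emitted in the last `width` seconds."""
    emitted = []      # stream produced by query 1: (time, site, average)
    previous = set()  # relation of query 1 at the previous evaluation
    results = []
    for now in sorted(eval_times):
        current = set(window_averages(readings, now).items())
        # ISTREAM-style step: emit only rows that newly appeared in the relation.
        for site, average in sorted(current - previous):
            emitted.append((now, site, average))
        previous = current
        # Window operation of query 2 over the emitted stream.
        in_window = [avg for t, _, avg in emitted if now - width < t <= now]
        results.append((now, max(in_window) if in_window else None))
    return results

# (time in seconds, site id, temperature)
readings = [(0, 1, 20), (30, 1, 30), (90, 1, 10)]
print(linked_stream_max(readings, eval_times=[0, 30, 90]))
# prints [(0, 20.0), (30, 25.0), (90, 25.0)]
```

At time 90 the one-minute average has dropped to 10.0, but the result is still 25.0, because the second stage looks at every average emitted during the last day and not only at the current one.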

(2) Relation-based query linking

This subsection explains relation-based query linking, using an example that computes the average temperature for each observation site, and then outputs the highest average temperature (maximum average temperature) at the current time. The following figure shows the input and the output data present when these queries are executed.

Figure 2-22 Input and output data present when relation-linked queries are executed

[Figure]

The first-stage query (query 1) computes one-minute average temperatures for each observation site, and the second-stage query (query 2) computes the highest average temperature.

Code
With relation-based query linking, you do not use a stream operation (such as ISTREAM) in the first-stage query, nor do you use a window operation in the second-stage query. You specify the name of the first-stage query in the FROM clause of the second-stage query.
Note that if you do not use a stream clause, you do not need to enclose the SELECT clause in parentheses.

REGISTER STREAM temperature_stream
    (observation_time TIME, id INTEGER, temperature INTEGER);

REGISTER QUERY average_calculation
    SELECT id, AVG(temperature) AS average_temperature
    FROM temperature_stream[RANGE 1 MINUTE]
    GROUP BY id;

REGISTER QUERY MAX_temperature
    ISTREAM (
        SELECT MAX(average_temperature) AS MAX_temperature
        FROM average_calculation);

Explanation
In relation-based query linking, the output relation of the first-stage query (average_calculation) is used directly as the input relation of the second-stage query (MAX_temperature). The two relations therefore have the same content: the average temperature for each observation site.
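To make the relation-based case concrete, the following Python sketch models it under stated assumptions. It is not the engine's implementation. Times are in seconds, a window covers now - width < time <= now, and the names window_averages and linked_relation_max are hypothetical. The second stage applies no window of its own; it reads whatever relation the first stage holds at the evaluation time.

```python
from collections import defaultdict

MINUTE = 60

def window_averages(readings, now, width=MINUTE):
    """Model of query 1: per-site average of readings with now - width < time <= now."""
    totals = defaultdict(lambda: [0, 0])  # site id -> [sum, count]
    for time, site, temperature in readings:
        if now - width < time <= now:
            totals[site][0] += temperature
            totals[site][1] += 1
    return {site: s / n for site, (s, n) in totals.items()}

def linked_relation_max(readings, now):
    """Model of query 2: MAX over the relation that query 1 holds at time `now`."""
    averages = window_averages(readings, now)  # same relation, no extra window
    return max(averages.values()) if averages else None

# (time in seconds, site id, temperature)
readings = [(0, 1, 20), (30, 1, 30), (30, 2, 15), (90, 1, 10)]
for now in (0, 30, 90):
    print(now, linked_relation_max(readings, now))
# prints:
# 0 20.0
# 30 25.0
# 90 10.0
```

At time 90 the result is 10.0, because only the current per-site averages are considered. Earlier averages such as 25.0 are no longer part of the first-stage relation and therefore do not influence the maximum.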