Hitachi

uCosminexus Application Server XML Processor User Guide


3.4.1 Overview of the high-speed parse support function

The high-speed parse support function improves parsing speed by studying the features of the XML document to be parsed through pre-parsing, and then using the study results when parsing XML documents. The function is suitable for systems that repeatedly parse XML documents that share the same structural features, such as the order of elements and attributes or the nesting relationships. The following figure shows a system in which the high-speed parse support function is used:

Figure 3‒4: System in which the high-speed parse support function is used

[Figure]

The features of the XML document studied through pre-parsing are recorded in the preparsed object. The following figure shows the flow of generating the preparsed object and implementing high-speed parse.

Figure 3‒5: Flow of generating the preparsed object and implementing high-speed parse

[Figure]

  1. Generating the preparsed object

    Cosminexus XML Processor generates the preparsed object by pre-parsing the Pre-Parse XML document and recording information such as the order of elements, the order of attributes, and the repetition of elements. The Pre-Parse XML document is an XML document that has the same structural features as the XML documents to be parsed.

    The Pre-Parse XML document must be created by the user. Because XML documents generally have a free structure, a preparsed object that corresponds to all XML documents cannot be generated. Therefore, review the XML documents to be parsed, and create the Pre-Parse XML document by concentrating on the document structures that occur most frequently.

  2. Using the preparsed object (implementing high-speed parse)

    When you parse an XML document with the XML parser function of Cosminexus XML Processor in which the preparsed object is set up, the document is parsed using the preparsed object. The parsing speed improves if the structure of the XML document to be parsed matches the document structure recorded in the preparsed object.
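When preparing a Pre-Parse XML document, it can help to check whether a candidate has the same structural features as the documents you actually parse. The following is a minimal sketch that uses only the standard JAXP SAX API. The class name StructureSignature and the signature format are hypothetical and are introduced only for illustration. They are not part of Cosminexus XML Processor and do not describe how the preparsed object is built internally. Note that the SAX specification does not guarantee the order in which attributes are reported, so the attribute part of the signature depends on the parser implementation.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: summarizes element order, nesting, and attribute names.
public class StructureSignature {

    /** Returns a string that describes the structure of the document; text content and attribute values are ignored. */
    public static String of(String xml) throws Exception {
        StringBuilder sig = new StringBuilder();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                sig.append('<').append(qName);
                // Attribute order is as reported by the parser (implementation-dependent).
                for (int i = 0; i < atts.getLength(); i++) {
                    sig.append(" @").append(atts.getQName(i));
                }
                sig.append('>');
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                sig.append("</>");
            }
        });
        return sig.toString();
    }

    public static void main(String[] args) throws Exception {
        String preParse = "<order id=\"1\"><item code=\"A\" qty=\"2\"/><note>first</note></order>";
        String sameStructure = "<order id=\"9\"><item code=\"Z\" qty=\"7\"/><note>other</note></order>";
        String otherStructure = "<order id=\"9\"><note>other</note><item code=\"Z\" qty=\"7\"/></order>";

        System.out.println(of(preParse).equals(of(sameStructure)));  // true: only the values differ
        System.out.println(of(preParse).equals(of(otherStructure))); // false: the element order differs
    }
}
```

Documents whose signatures match the Pre-Parse XML document are the ones for which a speed improvement is plausible. Whether an improvement actually occurs must still be measured.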

To use the high-speed parse support function according to the flow shown in Figure 3-5, you must write code in the user program that generates the preparsed object and sets it up in an XML parser. For details about how to write this code, see 3.4.2 Flow of operations for high-speed parse.

However, Hitachi does not guarantee that parsing will be faster than normal parsing in all cases. Therefore, when using the high-speed parse support function, Hitachi recommends that you evaluate the performance by using the actual XML documents to be parsed.

When using the high-speed parse support function, also check 6.19 Notes on the high-speed parse support function.
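One way to carry out such an evaluation is a small timing harness. The following is a minimal sketch that uses only the standard JAXP SAX API. The class name ParseTimer and the sample document are hypothetical. The sketch measures a parser that you pass in, so the same method could in principle be run once with a normally configured parser and once with a parser in which the preparsed object is set up. The setup of the preparsed object is product-specific and is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical timing harness for comparing parser configurations.
public class ParseTimer {

    /** Parses the document repeatedly and returns the mean time per parse in nanoseconds. */
    public static long meanParseNanos(SAXParser parser, byte[] xml, int iterations) throws Exception {
        DefaultHandler handler = new DefaultHandler();

        // Warm-up runs, so that class loading and JIT compilation do not dominate the measurement.
        for (int i = 0; i < iterations; i++) {
            parser.parse(new ByteArrayInputStream(xml), handler);
        }

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            parser.parse(new ByteArrayInputStream(xml), handler);
        }
        return (System.nanoTime() - start) / iterations;
    }

    public static void main(String[] args) throws Exception {
        // Replace this sample with the XML documents that your system actually parses.
        byte[] xml = "<order id=\"1\"><item code=\"A\" qty=\"2\"/></order>"
                .getBytes(StandardCharsets.UTF_8);

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        System.out.println("mean ns per parse: " + meanParseNanos(parser, xml, 1000));
    }
}
```

Results from a harness like this vary with the JVM, the hardware, and the document, so compare configurations on the same machine and with representative documents.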

Reference note

About the amount of parsing time that can be reduced using the high-speed parse support function

The high-speed parse support function reduces the parsing time of the SAX parser. Because parsing with a DOM parser consists of parsing with the SAX parser followed by generation of a DOM tree, the parsing time of the DOM parser is also reduced. However, the time for generating the DOM tree does not change when the high-speed parse support function is used. Therefore, as shown in the following figure, the reduction as a proportion of the entire parsing time is smaller for the DOM parser than for the SAX parser.

[Figure]
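The relationship between the SAX parser and the DOM parser described above can be illustrated with simple arithmetic. The following sketch uses hypothetical numbers that are chosen only to make the calculation easy to follow. They are not measurements of Cosminexus XML Processor. The class name ReductionShare is also hypothetical.

```java
// Hypothetical illustration: the same absolute saving is a smaller share of a longer total.
public class ReductionShare {

    /** Returns the saved time as a fraction of the entire parsing time. */
    public static double share(double saxMillis, double treeMillis, double savedMillis) {
        return savedMillis / (saxMillis + treeMillis);
    }

    public static void main(String[] args) {
        double saxMillis = 100.0;   // assumed time for SAX parsing
        double treeMillis = 100.0;  // assumed time for generating the DOM tree (unchanged by the function)
        double savedMillis = 40.0;  // assumed time saved in the SAX parsing part

        System.out.println(share(saxMillis, 0.0, savedMillis));        // SAX parser: 0.4
        System.out.println(share(saxMillis, treeMillis, savedMillis)); // DOM parser: 0.2
    }
}
```

With these assumed values, the same 40 ms saving is 40% of the SAX parsing time but only 20% of the DOM parsing time, because the time for generating the DOM tree is not reduced.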

Also, when validation is performed, the validation process accounts for a large share of the parsing time, and therefore the reduction in parsing time is relatively small. In particular, when validation is executed without using the javax.xml.validation.Schema class, the parsing time is hardly reduced.

Similarly, when a small XML document is parsed, initialization of the XML parser accounts for a large share of the entire parsing time, and therefore the reduction in parsing time is relatively small.
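For reference, the following is a minimal sketch of validation that uses the javax.xml.validation.Schema class with the standard JAXP API. The class name SchemaValidatingParse, the inline schema, and the sample documents are hypothetical. The sketch shows only how a Schema object is set in a SAXParserFactory. It does not show the high-speed parse support function itself.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example of schema validation through javax.xml.validation.Schema.
public class SchemaValidatingParse {

    // Assumed sample schema: an order element that contains one or more item elements.
    private static final String XSD =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"order\"><xs:complexType><xs:sequence>"
        + "<xs:element name=\"item\" type=\"xs:string\" maxOccurs=\"unbounded\"/>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    /** Creates a SAX parser that validates against the compiled Schema object. */
    public static SAXParser newValidatingParser() throws Exception {
        SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = schemaFactory.newSchema(new StreamSource(new StringReader(XSD)));

        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        parserFactory.setNamespaceAware(true);
        parserFactory.setSchema(schema);
        return parserFactory.newSAXParser();
    }

    /** Returns true if the document is well-formed and valid against the sample schema. */
    public static boolean parsesAndValidates(String xml) throws Exception {
        try {
            newValidatingParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e; // DefaultHandler ignores validation errors unless this is overridden
                }
            });
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parsesAndValidates("<order><item>a</item></order>")); // true
        System.out.println(parsesAndValidates("<order><bad/></order>"));         // false
    }
}
```

A Schema object is compiled once and can be shared across parsers, which is a common reason to prefer this approach when the same schema is applied to many documents.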