3.4.3 Creating the Pre-Parse XML document

Create the Pre-Parse XML document so that it reflects the structural features of the XML document for parsing. Follow the policies in the table below:

Table 3‒7: Policies for creating a Pre-Parse XML document
No.	Applicable location	Policy
1	XML declaration	Must match the XML document for parsing.
2	Elements and attributes	Code all elements and attributes that appear in the XML document for parsing.
3	Order of elements and attributes	Must match the XML document for parsing.
4	Embedded structure of elements	Must match the XML document for parsing.
5	Iteration structure of elements	Code iterative elements two or more times in succession.
6	Text, CDATA sections, and attribute values	Need not match the XML document for parsing.
7	DTDs, comments, and processing instructions	Need not be coded (except DTDs that define entities referenced from the XML document for parsing).

Each of these policies is described below with a specific coding example.

Organization of this subsection

(1) XML declaration
(2) Elements and attributes
(3) Order of elements and attributes
(4) Embedded structure of elements
(5) Iteration structure of elements
(6) Text, CDATA sections, and attribute values
(7) DTDs, comments, and processing instructions

(1) XML declaration

The XML declaration of the XML document for parsing is coded exactly as it appears in the Pre-Parse XML document. The following figure shows an example:

Figure 3‒7: Example for describing XML declaration (policy for creating a Pre-Parse XML document)
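As a rough illustration of this policy, the following Python sketch compares the XML declarations of two documents as plain strings. The helper `xml_declaration` and the sample documents are hypothetical and are not part of the product; strict string equality is an assumption about what "match" means here.

```python
import re

def xml_declaration(doc: str) -> str:
    """Return the XML declaration at the start of the document, or "" if absent."""
    m = re.match(r"<\?xml\s.*?\?>", doc)
    return m.group(0) if m else ""

# Hypothetical sample documents (attribute values may differ, see policy 6).
target = '<?xml version="1.0" encoding="UTF-8"?><order id="1"/>'
pre_parse = '<?xml version="1.0" encoding="UTF-8"?><order id=""/>'
mismatch = '<?xml version="1.0" encoding="Shift_JIS"?><order id=""/>'

declaration_matches = xml_declaration(target) == xml_declaration(pre_parse)
declaration_differs = xml_declaration(target) != xml_declaration(mismatch)
```

Here `declaration_matches` is true for the first candidate, while the second candidate declares a different encoding and so would not satisfy the policy under this assumption.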

(2) Elements and attributes

The elements and attributes of the XML document for parsing are coded in the Pre-Parse XML document. The information about the elements and attributes described in the Pre-Parse XML document is recorded in the preparsed object that is generated.

If the XML documents for parsing contain different elements and attributes, code all of those elements and attributes in the Pre-Parse XML document. The following figure shows an example:

Figure 3‒8: Example of coding of elements and attributes (policy for creating a Pre-Parse XML document)
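One way to check this policy before generating a preparsed object is to list the element and attribute names that a parse target uses but the Pre-Parse XML document lacks. The sketch below is only a necessary-condition check written with Python's standard `xml.etree.ElementTree`; the helpers `coded_names` and `missing_names` and the sample documents are hypothetical, and the check says nothing about order or nesting.

```python
import xml.etree.ElementTree as ET

def coded_names(doc: str) -> set:
    """Collect element names and (element, attribute) name pairs in a document."""
    found = set()
    for elem in ET.fromstring(doc).iter():
        found.add(elem.tag)
        found.update((elem.tag, attr) for attr in elem.attrib)
    return found

def missing_names(target: str, pre_parse: str) -> set:
    """Names used by the parse target that the Pre-Parse document does not code."""
    return coded_names(target) - coded_names(pre_parse)

# Hypothetical documents; attribute values are irrelevant to the check.
pre_parse = '<order id=""><item code=""/></order>'
target_a = '<order id="1"><item code="A1"/></order>'
target_b = '<order id="2" priority="high"><item code="B7"/></order>'

gaps_a = missing_names(target_a, pre_parse)  # nothing missing
gaps_b = missing_names(target_b, pre_parse)  # the priority attribute is not coded
```

In this example the second parse target uses a `priority` attribute that the candidate does not code, so the candidate would need to be extended.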

However, the text in the elements, CDATA sections, and the contents of the attribute values need not match the XML document for parsing. For details, see 3.4.3 (6) Text, CDATA sections, and attribute values.

The following forms are not differentiated by the XML standards, but pre-parsing treats them as different elements:

Empty elements and empty-element tags (example: <a></a> and <a/>)
Elements with a different number and position of linefeed characters, space characters, or tabs

If these forms exist in the XML document for parsing, each form must be coded separately in the Pre-Parse XML document. The following figure shows an example:

Figure 3‒9: Example of coding of blank elements and blank element tags (policy for creating the Pre-Parse XML document)

Figure 3‒10: Example of coding elements with different number and position of linefeed characters or blank characters (policy for creating the Pre-Parse XML document)
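To see why the two empty forms are easy to confuse, note that a standard parser reports them identically. The following sketch uses Python's `xml.etree.ElementTree` purely as an example of a standard parser; it does not model the pre-parse function, which according to the text above does distinguish the two forms.

```python
import xml.etree.ElementTree as ET

# The same element written with a start-tag/end-tag pair and as an empty-element tag.
start_end = ET.fromstring("<a></a>")
empty_tag = ET.fromstring("<a/>")

def summary(elem):
    """Name, text content, and child names as a standard parser reports them."""
    return (elem.tag, elem.text, [child.tag for child in elem])

# Both forms produce the same parsed result, so the difference is visible
# only in the source text of the document.
same_for_standard_parser = summary(start_end) == summary(empty_tag)
```

Because the difference survives only in the source text, it is worth inspecting the raw XML of the parse targets, not a parsed or re-serialized copy, when deciding which forms to code.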

(3) Order of elements and attributes

The elements and attributes of the XML document for parsing are coded in the same order in the Pre-Parse XML document. The following figure shows an example:

Figure 3‒11: Example of describing elements and attributes (policy for creating the Pre-Parse XML document, and order of elements and attributes)

If the XML document for parsing contains elements and attributes in the same combination but in a different order, code every such ordering in the Pre-Parse XML document. The following figure shows an example:

Figure 3‒12: Example of describing attributes with the same combination but different order (policy for creating the Pre-Parse XML document)
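Attribute order is not significant under the XML specification, so generic tools often treat reordered attributes as equal. The sketch below uses Python's `xml.etree.ElementTree` with hypothetical sample elements to show that the attribute sets compare equal while the document order differs; it relies on recent CPython versions keeping attributes in document order.

```python
import xml.etree.ElementTree as ET

# Same attribute combination, different order (hypothetical samples).
first = ET.fromstring('<item code="A1" qty="2"/>')
second = ET.fromstring('<item qty="2" code="A1"/>')

# Dictionary comparison ignores order, so the attribute sets look identical.
same_attributes = first.attrib == second.attrib

# The document order of the attribute names is different, which is what
# the ordering policy above is concerned with.
first_order = list(first.attrib)
second_order = list(second.attrib)
same_order = first_order == second_order
```

A comparison that ignores order would therefore miss a variation that, per the policy above, has to be coded separately in the Pre-Parse XML document.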

(4) Embedded structure of elements

The embedded (nested) structure of elements in the XML document for parsing is coded in the same form in the Pre-Parse XML document. The following figure shows an example:

Figure 3‒13: Example of describing the embedded structure of elements (Policy for creating the Pre-Parse XML document)

If the XML document for parsing contains several different embedded structures, code all of those embedded structures in the Pre-Parse XML document. The following figure shows an example:

Figure 3‒14: Example of coding multiple embedded structures of elements (policy for creating the Pre-Parse XML document)
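A simple way to spot embedded structures that a candidate Pre-Parse XML document does not yet cover is to compare the nesting paths of the documents. The following Python sketch is a simplified model under that assumption; `nesting_paths`, `uncovered_paths`, and the sample documents are hypothetical, and the real pre-parse function may compare structures differently.

```python
import xml.etree.ElementTree as ET

def nesting_paths(doc: str) -> set:
    """Return every nesting path in the document, such as "/order/customer"."""
    paths = set()

    def walk(elem, prefix):
        path = prefix + "/" + elem.tag
        paths.add(path)
        for child in elem:
            walk(child, path)

    walk(ET.fromstring(doc), "")
    return paths

def uncovered_paths(target: str, pre_parse: str) -> set:
    """Nesting paths of the parse target that the candidate does not contain."""
    return nesting_paths(target) - nesting_paths(pre_parse)

# Hypothetical documents: the candidate was written from target_a only.
target_a = "<order><customer><name>x</name></customer></order>"
target_b = "<order><customer><address><city>y</city></address></customer></order>"
candidate = "<order><customer><name>n</name></customer></order>"

gaps_a = uncovered_paths(target_a, candidate)
gaps_b = uncovered_paths(target_b, candidate)
```

Here `gaps_b` lists the `address` and `city` nesting that appears only in the second parse target, which indicates that the candidate still needs that embedded structure.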

(5) Iteration structure of elements

If the XML document for parsing contains iterative elements, code those elements two or more times in succession in the Pre-Parse XML document. Elements coded two or more times in succession are treated as iterative elements when parsing. As long as the elements are coded at least twice in succession in the Pre-Parse XML document, the performance of the high-speed parse support function is not affected even if the number of iterations differs from that in the XML document for parsing. The following figure shows an example:

Figure 3‒15: Example of coding the iteration structure of elements (Policy for creating the Pre-Parse XML document)
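The rule of two or more successive occurrences can be checked mechanically. The sketch below, written in Python with hypothetical helper and sample names, finds child elements that repeat in succession and shows that a parse target with five repetitions and a Pre-Parse XML document with two repetitions yield the same set of iterative elements.

```python
import xml.etree.ElementTree as ET
from itertools import groupby

def repeated_runs(doc: str) -> dict:
    """Map each child tag that repeats in succession to its longest run length."""
    runs = {}
    for parent in ET.fromstring(doc).iter():
        for tag, group in groupby(child.tag for child in parent):
            length = len(list(group))
            if length >= 2:
                runs[tag] = max(length, runs.get(tag, 0))
    return runs

# Hypothetical documents: five iterations in the target, two in the Pre-Parse document.
target = "<order>" + "<item>x</item>" * 5 + "</order>"
pre_parse = "<order><item>a</item><item>b</item></order>"
single = "<order><item>a</item></order>"

target_runs = repeated_runs(target)
pre_parse_runs = repeated_runs(pre_parse)
single_runs = repeated_runs(single)
```

The `single` document codes `item` only once, so under the policy above it would not be recognized as describing an iterative element.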

(6) Text, CDATA sections, and attribute values

The text, CDATA sections, and contents of the attribute values coded in the Pre-Parse XML document do not affect the preparsed object that is generated. Therefore, the text, CDATA sections, and contents of the attribute values in the Pre-Parse XML document need not match the XML document for parsing.

Also, text and CDATA sections are not differentiated (the preparsed object is the same no matter which one is coded). The following figure shows an example:

Figure 3‒16: Example of coding the text, CDATA sections, and attribute values (policy for creating the Pre-Parse XML document)
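Treating text and CDATA sections alike is consistent with how standard parsers report character data. As an illustration only, the following Python sketch uses `xml.etree.ElementTree` with hypothetical samples to show that escaped text and a CDATA section yield the same content.

```python
import xml.etree.ElementTree as ET

# The same character data written as escaped text and as a CDATA section.
as_text = ET.fromstring("<a>1 &lt; 2</a>")
as_cdata = ET.fromstring("<a><![CDATA[1 < 2]]></a>")

# A standard parser hands both to the application as ordinary text.
text_value = as_text.text
cdata_value = as_cdata.text
same_content = text_value == cdata_value
```

Either form can therefore stand in for the other when writing placeholder content in the Pre-Parse XML document.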

Elements with contents (example: <a>12</a>) and empty elements (example: <a></a>) are differentiated. Furthermore, because linefeeds, tabs, and spaces are treated as text, elements containing only such text (example: <a>[linefeed]</a>) and empty elements (example: <a></a>) are also differentiated. Therefore, if the XML document for parsing contains such elements, each form must be coded separately in the Pre-Parse XML document. The following figure shows an example:

Figure 3‒17: Example of coding of empty elements (policy for creating the Pre-Parse XML document)
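The distinction between an empty element and an element that holds only white space is also visible in a standard parser. The sketch below uses Python's `xml.etree.ElementTree` with hypothetical samples; it illustrates that a linefeed counts as text and does not model the pre-parse function itself.

```python
import xml.etree.ElementTree as ET

empty = ET.fromstring("<a></a>")
with_text = ET.fromstring("<a>12</a>")
with_linefeed = ET.fromstring("<a>\n</a>")

def has_text(elem) -> bool:
    """True if the element contains any character data, including white space."""
    return elem.text is not None

# The empty element has no text at all; the linefeed-only element does.
empty_has_text = has_text(empty)
text_has_text = has_text(with_text)
linefeed_has_text = has_text(with_linefeed)
```

Pretty-printing a parse target can turn empty elements into white-space-only elements, so the Pre-Parse XML document is best written against the documents as they are actually received.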

(7) DTDs, comments, and processing instructions

DTDs, comments, and processing instructions are not included in the preparsed object. Therefore, they need not be coded in the Pre-Parse XML document (even if they are coded, they do not affect the preparsed object that is generated). The following figure shows an example:

Figure 3‒18: Example related to DTDs, comments, and processing instructions (policy for creating the Pre-Parse XML document)

However, DTDs that define entities referenced from within the XML document for parsing must be coded in the Pre-Parse XML document. The following figure shows an example:

Figure 3‒19: Example of coding the definition of entities referenced from within the XML document (policy for creating the Pre-Parse XML document)
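The reason entity definitions cannot be dropped is that a document referencing an undefined entity is not well-formed. The following Python sketch uses `xml.etree.ElementTree` and a hypothetical `company` entity to show the difference; it illustrates general XML behavior, not the product's own error handling.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: the entity "company" is defined in an internal DTD subset.
with_dtd = '<!DOCTYPE doc [<!ENTITY company "Example Corp">]><doc>&company;</doc>'
without_dtd = "<doc>&company;</doc>"

# With the definition present, the reference is expanded to its replacement text.
expanded = ET.fromstring(with_dtd).text

# Without the definition, the parser rejects the document.
try:
    ET.fromstring(without_dtd)
    undefined_entity_rejected = False
except ET.ParseError:
    undefined_entity_rejected = True
```

A Pre-Parse XML document that keeps the entity reference but omits the defining DTD would fail in the same way, which is why this case is an exception to the rule that DTDs need not be coded.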
