3.4.3 Creating the Pre-Parse XML document
Create the Pre-Parse XML document so that it reflects the structural features of the XML document for parsing. Follow the policies in the table below:
No. | Applicable location | Policy |
---|---|---|
1 | XML declaration | Must match the XML document for parsing. |
2 | Elements and attributes | Code the elements and attributes that appear in the XML document for parsing. |
3 | Order of elements and attributes | Must match the XML document for parsing. |
4 | Embedded structure of elements | Must match the XML document for parsing. |
5 | Iteration structure of elements | Code the iterative elements in an iterative format. |
6 | Text, CDATA sections, and attribute values | Need not match the XML document for parsing. |
7 | DTDs, comments, and processing instructions | Need not be coded. |
Each of these policies is described below with a specific coding example.
(1) XML declaration
The XML declaration of the XML document for parsing is coded as is, without changes, in the Pre-Parse XML document. The following figure shows an example:
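As a rough illustration of this policy, the following Python sketch compares the XML declarations of two documents. Both sample documents and the helper `xml_declaration` are hypothetical and only stand in for a check you might run before pre-parsing; they are not part of the product.

```python
import re

# Hypothetical sample documents (not taken from the manual).
DOC_FOR_PARSING = '<?xml version="1.0" encoding="UTF-8"?>\n<order><id>1001</id></order>'
PRE_PARSE_DOC = '<?xml version="1.0" encoding="UTF-8"?>\n<order><id>0</id></order>'


def xml_declaration(text):
    """Return the XML declaration at the very start of the text, or None."""
    match = re.match(r"<\?xml\s[^?]*\?>", text)
    return match.group(0) if match else None


# The declarations are compared as literal strings; element content is ignored.
declarations_match = xml_declaration(DOC_FOR_PARSING) == xml_declaration(PRE_PARSE_DOC)
print(declarations_match)
```

The text inside `<id>` differs between the two documents, but the declarations are identical, which is the part this policy is concerned with.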
(2) Elements and attributes
Code the elements and attributes of the XML document for parsing in the Pre-Parse XML document. Information about the elements and attributes coded in the Pre-Parse XML document is recorded in the preparsed object that is generated.
If the XML documents for parsing contain differing elements and attributes, code all of those elements and attributes in the Pre-Parse XML document. The following figure shows an example:
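The idea of covering every element and attribute can be sketched as follows. This is only an illustration under assumptions: the three sample documents are hypothetical, Python's standard `xml.etree.ElementTree` parser stands in for the product, and the way the two variants are placed side by side in the Pre-Parse XML document is a guess at a reasonable layout rather than a rule taken from the manual.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents for parsing that use different elements/attributes.
DOC_A = '<contact><email kind="work">a@example.com</email></contact>'
DOC_B = '<contact><phone ext="12">000</phone></contact>'

# Hypothetical Pre-Parse XML document that codes both variants.
PRE_PARSE = '<contact><email kind="">x</email><phone ext="">x</phone></contact>'


def signatures(xml_text):
    """Collect (element name, attribute names) for every element in the document."""
    root = ET.fromstring(xml_text)
    return {(el.tag, tuple(sorted(el.attrib))) for el in root.iter()}


needed = signatures(DOC_A) | signatures(DOC_B)
missing = needed - signatures(PRE_PARSE)
print(sorted(missing))  # an empty list means nothing is missing
```

Text and attribute values are deliberately ignored here, because only the names of elements and attributes are being compared.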
However, the text in the elements, CDATA sections, and the contents of the attribute values need not match the XML document for parsing. For details, see 3.4.3 (6) Text, CDATA sections, and attribute values.
The following notations are not differentiated by the XML standards, but pre-parsing treats them as different elements:
- Empty elements coded with a start tag and an end tag, and empty-element tags (example: <a></a> and <a/>)
- Elements that differ in the number and position of linefeed characters, space characters, or tab characters
If these notations exist in the XML document for parsing, each of them must be coded separately in the Pre-Parse XML document. The following figure shows an example:
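The difference between the two empty-element notations can be made concrete with a short sketch. It assumes Python's standard `xml.etree.ElementTree` as an example of a standards-based parser, and uses a hypothetical helper `empty_forms` that inspects the raw text with simplified regular expressions (attribute values containing `/`, `<`, or `>` are not handled).

```python
import re
import xml.etree.ElementTree as ET

START_END_FORM = "<a></a>"
EMPTY_TAG_FORM = "<a/>"

# A standards-based parser produces the same tree for both notations.
same_for_standard_parser = (
    ET.tostring(ET.fromstring(START_END_FORM))
    == ET.tostring(ET.fromstring(EMPTY_TAG_FORM))
)


def empty_forms(xml_text, name):
    """Report which literal empty-element notations of `name` occur in the raw text."""
    tag = re.escape(name)
    forms = set()
    if re.search(rf"<{tag}\b[^<>]*/>", xml_text):
        forms.add("empty-element tag")
    if re.search(rf"<{tag}\b[^<>/]*></{tag}\s*>", xml_text):
        forms.add("start tag and end tag")
    return forms


# Hypothetical documents: both notations appear, so both are coded.
DOC_FOR_PARSING = "<list><a></a><a/></list>"
PRE_PARSE = "<list><a></a><a/></list>"

all_forms_coded = empty_forms(DOC_FOR_PARSING, "a") <= empty_forms(PRE_PARSE, "a")
print(same_for_standard_parser, all_forms_coded)
```

Because the parsed trees are identical, a check of this kind has to look at the literal text of the documents rather than at the parsed result.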
(3) Order of elements and attributes
The elements and attributes of the XML document for parsing are coded in the same order in the Pre-Parse XML document. The following figure shows an example:
If the XML documents for parsing contain the same combination of elements and attributes in different orders, code every such ordering in the Pre-Parse XML document. The following figure shows an example:
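A minimal sketch of an order-sensitive comparison is shown below. The sample documents are hypothetical, the standard `xml.etree.ElementTree` parser stands in for the product, and placing the two orderings next to each other in the Pre-Parse XML document is an assumed layout.

```python
import xml.etree.ElementTree as ET

# Same elements and attributes, but in a different order (hypothetical samples).
DOC_A = '<root><item code="1" name="n"><x/><y/></item></root>'
DOC_B = '<root><item name="n" code="1"><y/><x/></item></root>'

# Hypothetical Pre-Parse XML document that codes both orderings.
PRE_PARSE = (
    "<root>"
    '<item code="" name=""><x/><y/></item>'
    '<item name="" code=""><y/><x/></item>'
    "</root>"
)


def ordered_shape(el):
    """Element name, attribute names in document order, child shapes in order."""
    return (el.tag, tuple(el.attrib), tuple(ordered_shape(c) for c in el))


def item_shapes(xml_text):
    """Ordered shapes of all <item> elements in the document."""
    return {ordered_shape(item) for item in ET.fromstring(xml_text).iter("item")}


orders_differ = item_shapes(DOC_A) != item_shapes(DOC_B)
all_orders_coded = (item_shapes(DOC_A) | item_shapes(DOC_B)) <= item_shapes(PRE_PARSE)
print(orders_differ, all_orders_coded)
```

Attribute values are left empty in the Pre-Parse XML document because only names and their order are compared.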
(4) Embedded structure of elements
The embedded (nested) structure of elements in the XML document for parsing is coded with the same structure in the Pre-Parse XML document. The following figure shows an example:
If the XML documents for parsing contain several different embedded structures, code all of those embedded structures in the Pre-Parse XML document. The following figure shows an example:
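One way to picture the embedded structures is as the set of element paths in each document. The sketch below assumes hypothetical sample documents and uses the standard `xml.etree.ElementTree` parser only as an illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents for parsing with different nesting of <para>.
DOC_A = "<doc><section><para>text</para></section></doc>"
DOC_B = "<doc><para>text</para></doc>"

# Hypothetical Pre-Parse XML document that codes both nestings.
PRE_PARSE = "<doc><section><para>x</para></section><para>x</para></doc>"


def paths(el, prefix=""):
    """Return the set of element paths (for example /doc/section/para) under el."""
    path = f"{prefix}/{el.tag}"
    found = {path}
    for child in el:
        found |= paths(child, path)
    return found


def doc_paths(xml_text):
    return paths(ET.fromstring(xml_text))


uncovered = (doc_paths(DOC_A) | doc_paths(DOC_B)) - doc_paths(PRE_PARSE)
print(sorted(uncovered))  # an empty list means every nesting is coded
```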
(5) Iteration structure of elements
If the XML document for parsing contains iterative elements, code those elements two or more times in succession in the Pre-Parse XML document. Elements coded two or more times in succession are treated as iterative elements when parsing. As long as the elements are repeated at least twice in the Pre-Parse XML document, the performance of the high-speed parse support function is not affected even if the number of iterations differs from that in the XML document for parsing. The following figure shows an example:
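The "two or more times" rule can be sketched by collapsing each run of identical sibling elements into a single marker. The documents and the `shape` helper are hypothetical, and the standard `xml.etree.ElementTree` parser is used purely for illustration.

```python
import xml.etree.ElementTree as ET
from itertools import groupby

# Hypothetical documents: five iterations when parsing, two in the Pre-Parse document.
DOC_FOR_PARSING = "<orders>" + "<order><id>1</id></order>" * 5 + "</orders>"
PRE_PARSE = "<orders>" + "<order><id>0</id></order>" * 2 + "</orders>"
SINGLE = "<orders><order><id>0</id></order></orders>"


def shape(el):
    """Structure of an element, with runs of 2+ identical siblings collapsed."""
    children = []
    for child_shape, run in groupby(shape(child) for child in el):
        repeated = len(list(run)) >= 2
        children.append((child_shape, "repeated" if repeated else "single"))
    return (el.tag, tuple(children))


same_iteration_shape = shape(ET.fromstring(DOC_FOR_PARSING)) == shape(ET.fromstring(PRE_PARSE))
single_is_different = shape(ET.fromstring(SINGLE)) != shape(ET.fromstring(PRE_PARSE))
print(same_iteration_shape, single_is_different)
```

In this model, two repetitions and five repetitions collapse to the same shape, while a single occurrence does not, which mirrors the reason for coding the element at least twice.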
(6) Text, CDATA sections, and attribute values
The text, CDATA sections, and contents of the attribute values coded in the Pre-Parse XML document do not affect the preparsed object that is generated. Therefore, they need not match those in the XML document for parsing.
Also, text and CDATA sections are not differentiated (the preparsed object is the same whichever one is coded). The following figure shows an example:
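The following sketch shows a comparison that ignores text, CDATA sections, and attribute values. It assumes hypothetical sample documents and uses the standard `xml.etree.ElementTree` parser, which also happens to report text and CDATA sections in the same way.

```python
import xml.etree.ElementTree as ET

# Text and a CDATA section with the same content yield the same parsed value.
text_value = ET.fromstring("<a>12</a>").text
cdata_value = ET.fromstring("<a><![CDATA[12]]></a>").text

# Hypothetical documents whose text and attribute values differ.
DOC_FOR_PARSING = '<price currency="JPY"><![CDATA[1200]]></price>'
PRE_PARSE = '<price currency="x">0</price>'


def structure(el):
    """Element name, attribute names, and child structures; values are ignored."""
    return (el.tag, tuple(el.attrib), tuple(structure(c) for c in el))


same_structure = structure(ET.fromstring(DOC_FOR_PARSING)) == structure(ET.fromstring(PRE_PARSE))
print(text_value == cdata_value, same_structure)
```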
Elements with content (example: <a>12</a>) and empty elements (example: <a></a>) are differentiated. Furthermore, because linefeed, tab, and space characters are treated as text, elements that contain only such characters (example: <a>[linefeed]</a>) are also differentiated from empty elements (example: <a></a>). Therefore, if the XML document for parsing contains both kinds of elements, each kind must be coded separately in the Pre-Parse XML document. The following figure shows an example:
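The distinction between "has text" and "empty" can be sketched as below. The documents are hypothetical, and the standard `xml.etree.ElementTree` parser is used as a stand-in; it reports whitespace-only content as text and an empty element as having no text. Treating `<a>12</a>` and `<a>[linefeed]</a>` as the same kind follows from the statement that the text content itself has no impact, and is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical document for parsing: content, a lone linefeed, and an empty element.
DOC_FOR_PARSING = "<list><a>12</a><a>\n</a><a></a></list>"

# Hypothetical Pre-Parse XML documents.
PRE_PARSE = "<list><a>x</a><a></a></list>"
PRE_PARSE_INCOMPLETE = "<list><a>x</a></list>"


def text_kinds(xml_text):
    """Classify every <a> element as 'has text' or 'empty'."""
    root = ET.fromstring(xml_text)
    return {"has text" if a.text is not None else "empty" for a in root.iter("a")}


all_kinds_coded = text_kinds(DOC_FOR_PARSING) <= text_kinds(PRE_PARSE)
missing_kinds = text_kinds(DOC_FOR_PARSING) - text_kinds(PRE_PARSE_INCOMPLETE)
print(all_kinds_coded, sorted(missing_kinds))
```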
(7) DTDs, comments, and processing instructions
DTDs, comments, and processing instructions are not included in the preparsed object. Therefore, they need not be coded in the Pre-Parse XML document (even if they are coded, they have no impact on the preparsed object that is generated). The following figure shows an example:
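As an illustration, the sketch below compares a document that contains a DTD, comments, and a processing instruction with a Pre-Parse XML document that omits them. The documents are hypothetical; the default `xml.etree.ElementTree` parser, which discards comments and processing instructions, is used only as a stand-in.

```python
import xml.etree.ElementTree as ET

# Hypothetical document for parsing with a DTD, comments, and a processing instruction.
DOC_FOR_PARSING = """<?xml version="1.0"?>
<!DOCTYPE memo [
  <!ELEMENT memo (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<!-- generated nightly -->
<?render mode="compact"?>
<memo><to>Sato</to><!-- inline note --><body>Hello</body></memo>"""

# Hypothetical Pre-Parse XML document: same declaration and elements, nothing else.
PRE_PARSE = """<?xml version="1.0"?>
<memo><to>x</to><body>x</body></memo>"""


def structure(el):
    """Element name, attribute names, and child structures; values are ignored."""
    return (el.tag, tuple(el.attrib), tuple(structure(c) for c in el))


same_structure = structure(ET.fromstring(DOC_FOR_PARSING)) == structure(ET.fromstring(PRE_PARSE))
print(same_structure)
```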
However, DTDs that define entities referenced from within the XML document for parsing must be coded in the Pre-Parse XML document. The following figure shows an example:
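Why the entity-defining DTD cannot be dropped can be seen with any standards-based parser. The sketch below uses Python's `xml.etree.ElementTree` with hypothetical documents; it shows general XML behavior and is not a description of how the product reports the error.

```python
import xml.etree.ElementTree as ET

# Hypothetical document whose DTD defines the entity that the content references.
WITH_DTD = """<!DOCTYPE note [
  <!ENTITY company "Example Corp.">
]>
<note>&company;</note>"""

# The same content without the DTD: the entity reference is now undefined.
WITHOUT_DTD = "<note>&company;</note>"


def is_parsable(xml_text):
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


expanded_text = ET.fromstring(WITH_DTD).text
print(is_parsable(WITH_DTD), is_parsable(WITHOUT_DTD), expanded_text)
```

Without the declaration, the reference `&company;` cannot be resolved, so a Pre-Parse XML document that uses the entity needs the same DTD.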