XML (for ASP) Notes

XML / XSL Notes

*These notes are based on information taken from random sources on the Internet, and WROX Professional ASP XML. - 11/1/2025

XML specifications: http://www.w3c.org/xml
Microsoft client-side XML validator: http://msdn.microsoft.com/downloads/tools/xmlint/xmlint.zip

Intro to XML (chapter 1)

XML enables you to tell a parser exactly how to treat and comprehend data, without concern for how the data is displayed;
Attributes of tags must be in quotes (eg, <font size="3"> );
XML is case-sensitive. For your own sanity, create a standard and stick to it when writing tags/attributes;
Always close all tags - XML is not forgiving, as HTML often is;
Valid XML must conform to the rules defined in a DTD or Schema. Well-formed XML simply needs to be syntactically correct;
Every XML document must have a document element (two tags between which all other elements reside). All other elements are also referred to as child nodes or child elements, and any child node can have children of its own;
IE5 has a built-in XML parser (.xml files); it displays the XML as is, but makes the nodes collapsible. Maybe useful for making a list or something;
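
The well-formedness rules above (quoted attributes, case-sensitive tags, every tag closed) can be tried out with any non-validating parser. A minimal sketch, assuming Python's standard xml.etree.ElementTree as a stand-in for the parsers these notes talk about; the helper name is_well_formed and the sample strings are made up for illustration:

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as XML. This checks syntax only, not validity."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = '<note size="3"><to>Ann</to></note>'
unquoted = '<note size=3><to>Ann</to></note>'       # attribute value not quoted
unclosed = '<note size="3"><to>Ann</note>'          # <to> is never closed
wrong_case = '<note size="3"><to>Ann</To></note>'   # <to> closed as </To>

results = [is_well_formed(s) for s in (good, unquoted, unclosed, wrong_case)]
print(results)
```

Only the first string passes; none of this says anything about validity, which needs a DTD or Schema.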

Understanding XML (chapter 2)

Logical structure: the framework of a document. Physical structure: the actual data (content) within the framework;
Begin an XML document with a prolog, which contains two parts: the XML declaration and the Document Type Declaration. The former simply states what version of XML is being used, the latter refers to a Document Type Definition (DTD), and appears one line below the XML declaration. Comments can also be included in the prolog;
Nothing, not even whitespace, may come before the XML declaration. The declaration itself (xml, version, encoding, standalone) is written in lowercase;
Other optional attributes of the XML declaration include: encoding (eg, "UTF-8" or "UTF-16", the two Unicode encodings every parser must support), and standalone ("yes" or "no", depending on whether the document relies on declarations in an external DTD);
Sample XML declaration: <?xml version="1.0"?> - the <?...?> syntax is the same one used for Processing Instructions (PIs);
Sample DT Declaration: <!DOCTYPE root-element SYSTEM "root-element.dtd"> - root-element refers to the name of the document element, SYSTEM indicates that what follows is the location (URI) of the .dtd file;
You can create an empty tag, which combines open & close (eg, for use as a placeholder) in the following way: <tagName />;
The attribute xml:space, when included in any tag, applies to all child elements as well. It preserves the formatting of content (like <pre> in HTML). There is no guarantee this will be respected by a parser, however. The attribute xml:lang="langType" also affects all child elements - it indicates what language the content is written in (eg, for use when multiple languages are used in the same doc);
CDATA sections (expressed as <![CDATA[...]]> ) are not parsed for markup: everything inside is passed through as plain character data. You can use this to include examples of XML in a document. When using XSL to transform XML into HTML, any scripting should appear in a CDATA section, so that characters such as < and & do not break the stylesheet;
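You can display special characters with a character reference. Decimal references are written as "&#" plus the code, hexadecimal ones as "&#x" plus the code, and both end with ";" (eg, &#169; and &#xA9; are the same character);
Five predefined entities for special characters: &apos; ('), &amp; (&), &gt; (>), &lt; (<), &quot; (");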
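
To see how CDATA sections, character references and the predefined entities come out of a parser, here is a small sketch. It assumes Python's standard xml.etree.ElementTree rather than MSXML, and the element names (sample, code, ref) are invented for the example:

```python
import xml.etree.ElementTree as ET

doc = (
    '<sample>'
    '<code><![CDATA[<b>not markup</b> & friends]]></code>'
    '<ref>&#169; &#xA9; &lt;tag&gt; &amp; &apos;x&apos; &quot;y&quot;</ref>'
    '</sample>'
)
root = ET.fromstring(doc)

# The CDATA content arrives as plain text; the <b> inside is not an element.
cdata_text = root.find('code').text
code_children = len(root.find('code'))

# Decimal and hexadecimal references give the same character,
# and the five predefined entities are expanded.
ref_text = root.find('ref').text

print(cdata_text)
print(ref_text)
```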

Validating XML with the DTD (chapter 3)

If a DTD is public, the DT declaration might look like: <!DOCTYPE root-element PUBLIC "name" "URL_of_DTD">;
If both an internal and an external DTD are used, the internal one is processed first, and it takes precedence over the external. Such a combination might be used to, for example, overwrite a default value contained in the external DTD with a value specific to the locality;
A DTD is structured like an XML document, in that elements are declared in a specific order. The basic form of an element declaration is: <!ELEMENT elementName rule >;
The ANY rule indicates that an element can contain anything allowed by the DTD, as well as nothing;
The EMPTY rule indicates that an element must not contain any data. The element itself can have attributes, however;
A mixed declaration is a list of options, enclosed by parentheses and separated by the (|) operator. This would be used, for example, if a parent element could contain data or a child element: eg. rule = (#PCDATA | ElementA)* - #PCDATA must be listed first, and the group must end with *;
A multiple declaration is a comma-separated list of child elements, in order of appearance in the XML document;
The #PCDATA rule indicates that an element contains parsed character data: text that the parser scans for markup, so entity references in it are expanded, and < and & must be escaped;
The following symbols, when appended to an element name within a rule, indicate...

? - element must appear never or once;
* - may or may not appear, any number of times;
+ - must appear one or more times;
(none) - must only appear once;
Example: <!ELEMENT EMAIL (To+, From, CC*, Subject?, Body?)>;
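
The occurrence symbols behave like regular-expression quantifiers applied to the sequence of child element names. The sketch below is not DTD validation (Python's standard library does not validate); it only re-expresses the EMAIL content model above as a regular expression. The function name matches_model and the sample documents are hypothetical:

```python
import re
import xml.etree.ElementTree as ET

# <!ELEMENT EMAIL (To+, From, CC*, Subject?, Body?)> as a regex over child names.
# Each name is followed by a space so that names cannot run together.
EMAIL_MODEL = re.compile(r'(To )+(From )(CC )*(Subject )?(Body )?')

def matches_model(xml_text):
    """Return True if the children of the root follow the EMAIL content model."""
    root = ET.fromstring(xml_text)
    names = ''.join(child.tag + ' ' for child in root)
    return EMAIL_MODEL.fullmatch(names) is not None

ok = matches_model('<EMAIL><To/><To/><From/><Subject/></EMAIL>')
no_to = matches_model('<EMAIL><From/><Body/></EMAIL>')    # To+ needs at least one To
bad_order = matches_model('<EMAIL><From/><To/></EMAIL>')  # the comma list fixes the order

print(ok, no_to, bad_order)
```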

If an element has attributes, its element declaration should have a corresponding attribute declaration directly beneath it. The syntax is: <!ATTLIST targetElement attribName attribType Defaults>.

Attribute types:

CDATA - only character data can be used (will not be parsed);
ENTITY - value must refer to an external entity declared in the DTD;
ENTITIES - multiple of above (separated by whitespace);

ID - unique element identifier;
IDREF - value of unique ID type attribute;
IDREFS - multiple of above (separated by whitespace);
NMTOKEN - a valid XML token name;
NMTOKENS - multiple...
NOTATION - value must refer to notation declaration in the DTD;
Enumerated_values - attribute value must match one of the included values. Eg, (male | female);

Default settings:

#REQUIRED - attribute value must be specified;
#IMPLIED - value is optional;
#FIXED value - attribute must have the supplied value (the attribute type, eg CDATA, still comes before #FIXED);
value - supplied value is the default (again preceded by the attribute type, which need not be CDATA);
Example: <!ATTLIST Employee gender (male | female) #REQUIRED>

Entity declarations in the DTD are similar to substitution macros; you can store stuff in them. The basic form of an internal entity is: <!ENTITY entityName "entityDefinition">. These work like the pre-defined XML entities (apostrophe, ampersand, etc), in that you can call on entityName in the XML document by simply delimiting it with an (&) and (;).
- The basic form of an external entity is: <!ENTITY entityName SYSTEM "entityURI">. You can use this to import complex data from other sites - a powerful feature. The data from the URL will be parsed. To prevent this, specify NDATA after the URL (eg. "...mapLondon.gif" NDATA GIF);
- The basic form of a parameter entity is: <!ENTITY % entityName "entityDefinition">. It is used exclusively within the DTD (for eliminating repetitive text). The entity is later called as: %entityName;
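
An internal entity can be watched at work with a tiny document that carries its own DTD subset. This is a sketch only, assuming Python's standard xml.etree.ElementTree (which expands internal entities but does not fetch external ones or validate); the entity name company and its text are invented:

```python
import xml.etree.ElementTree as ET

doc = (
    '<!DOCTYPE memo [\n'
    '  <!ENTITY company "Example Widgets Ltd">\n'
    ']>\n'
    '<memo>From &company;, with thanks</memo>'
)

# &company; is replaced by its definition while the document is parsed.
memo_text = ET.fromstring(doc).text
print(memo_text)
```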
Other DTD keywords: <![ IGNORE [...]]> will turn off a block of DTD content. INCLUDE does the opposite;
Comment a line out with <!-- ... -->;
Server-side XML validation:
    Set objXML = Server.CreateObject("Microsoft.XMLDOM")
    objXML.async = False               ' finish loading before parseError is checked
    objXML.validateOnParse = True
    objXML.load(Server.MapPath("somefile.xml"))
    If objXML.parseError.errorCode <> 0 Then
        Response.Write("Error: " & objXML.parseError.reason & "<br>")
        Response.Write("At line: " & objXML.parseError.line & "<br>")
    Else
        Set objRootElement = objXML.documentElement
        Response.Write(objRootElement.xml)
    End If

Validating XML Using Schemas (Chapter 4)

In object-oriented language, a schema (.xsd) describes a class of XML documents, which are each individually an instance of that class;
A Schema (.xsd) begins with a namespace (http://www.rpbourret.com/xml/NamespacesFAQ.htm), which looks like: <schema targetNamespace="nameSpaceURI" nameSpaceID:elementName xmlns:nameSpaceID="nameSpaceURI" nameSpaceID2:elementName2 xmlns:nameSpaceID2="nameSpaceURI2" elementFormDefault="(un?)qualified" attributeFormDefault="(un?)qualified">
A namespace is used to distinguish between duplicate elements and attributes in an XML page. Duplicates will most likely result when information from separate XML pages is combined into one. Assigning each original page (or "language") its own namespace resolves such issues. The nameSpaceID (above) can come before any element or attribute name as an identifier (separated by a colon). Namespace declarations must also be made in the XML instance (along with references to the schema). To check an element for conformance, the processor first locates the declaration for the element in a schema, and then checks that the targetNamespace attribute in the schema matches the actual namespace URI of the element (or, alternatively, that the schema does not have a targetNamespace attribute and the instance element is not namespace-qualified). Qualification of local elements and attributes can be globally specified by a pair of attributes, elementFormDefault and attributeFormDefault, on the schema element, or can be specified separately for each local declaration using the form attribute. All such attributes' values may each be set to unqualified or qualified, to indicate whether or not locally declared elements and attributes must be unqualified;
You can set a default namespace for all elements by dropping the nameSpaceID prefix from the declaration (leaving just xmlns="nameSpaceURI"); unprefixed elements then belong to that namespace;
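
How a namespace keeps two same-named elements apart is easiest to see from what a parser reports. A minimal sketch, assuming Python's standard xml.etree.ElementTree; the URIs urn:example:shop and urn:example:furniture and the element names are placeholders, not anything from the book:

```python
import xml.etree.ElementTree as ET

doc = (
    '<order xmlns="urn:example:shop" xmlns:f="urn:example:furniture">'
    '<table>price list</table>'        # default namespace (no prefix)
    '<f:table>oak, four legs</f:table>'  # furniture namespace via the f: prefix
    '</order>'
)
root = ET.fromstring(doc)

# ElementTree reports each tag as {namespaceURI}localName.
tags = [child.tag for child in root]

# The prefixes used for searching are local to this mapping;
# only the URIs have to match the document.
ns = {'shop': 'urn:example:shop', 'f': 'urn:example:furniture'}
shop_table = root.find('shop:table', ns).text
furniture_table = root.find('f:table', ns).text

print(tags)
```

Both children are called table, yet they are different elements because their namespace URIs differ.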
The top of an XML instance (that refers to a Schema) will look like:
<elementName xmlns="nameSpaceURI" xmlns:xsi="someURI" xsi:schemaLocation="nameSpaceURI nameSpaceURI/schemaName.xsd">
schemaLocation can also be used within an (.xsd); in this case, it will either include (if namespaces are shared) or import (if namespaces are different) another Schema into itself. Example:
    <import namespace="someURI" schemaLocation="someURI2/schemaName.xsd" />
    <include schemaLocation="someURI/schemaName.xsd" />
A complex type may contain other elements and may have attributes, while a simple type must not. Complex types are declared:
    <complexType name="elementName" (?)base="" (?)content="" (?)group order="">
        <element name="" type="" />    <!-- "type definition" -->
        <element name="" type="" />
        <attribute name="" type="" />
    </complexType>
- the base parameter can be any of the following: string, boolean, float, double, decimal, timeInstant, integer, ENTITY, NOTATION... (see pg 61-62 for more). This compares rather favorably with DTDs (limited to CDATA and PCDATA). The content attribute describes what the complexType contains (it is elementOnly by default). Other settings: empty, mixed, and textOnly. The group order attribute (sequence by default) specifies whether the sequence of listed elements matters. The all setting allows for changes in sequence (though it imposes restrictions - you can't have nested elements within a complexType with this attribute, and minOccurs/maxOccurs must both be one - this could change in future standards). The choice setting allows for nested elements to be optional (appear or not appear in an XML instance). Derivation? (pg 68);
To re-use an already declared type definition, substitute ref= for name=, and minOccurs (and/or maxOccurs) for type=. minOccurs can either be 0 (optional) or 1, and maxOccurs can be 1 or * (many);
A simpleType is a basic building block of a schema:
    <simpleType name="" base="">
        <facetName (value or name)="" />
    </simpleType>
-A facet can be used to modify a simpleType declaration (its base type). Example:
    <simpleType name="" base="string">
        <minLength value="4" />
        <maxLength value="32" />
    </simpleType>
The pattern facet can be used to specify data constrained by a regular expression (refer to the XML specs for pattern symbols);
The enumeration facet can be used to specify expected values (eg, "male" and "female" if the base="string", or "1" and "2", or etc..);
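
What the facets do to a base type can be imitated in a few lines. This is an illustration of the idea only, not schema validation: the function satisfies_facets and its parameter names are invented, and Python's regular-expression syntax is not identical to the one used by the pattern facet:

```python
import re

def satisfies_facets(value, min_length=None, max_length=None,
                     pattern=None, enumeration=None):
    """Check a string against a few facet-style restrictions (illustrative only)."""
    if min_length is not None and len(value) < min_length:
        return False
    if max_length is not None and len(value) > max_length:
        return False
    # A pattern has to describe the whole value, hence fullmatch.
    if pattern is not None and re.fullmatch(pattern, value) is None:
        return False
    if enumeration is not None and value not in enumeration:
        return False
    return True

# Mirrors the minLength="4" / maxLength="32" example above.
short_name = satisfies_facets('abc', min_length=4, max_length=32)
good_name = satisfies_facets('abcd', min_length=4, max_length=32)

# A five-digit pattern and a male/female enumeration.
zip_ok = satisfies_facets('12345', pattern=r'\d{5}')
zip_bad = satisfies_facets('12345-', pattern=r'\d{5}')
gender_ok = satisfies_facets('female', enumeration=('male', 'female'))

print(short_name, good_name, zip_ok, zip_bad, gender_ok)
```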
One distinction between attributes and elements is that the order of attributes has no significance: an instance may list an element's attributes in any order, whatever order they were declared in;
Annotation tags can appear within a simple or complex type, and simply contain notes;

Document Object Model (chapter 5)

XSL is used to transform an XML document - preparing pieces of it for presentation - as another XML sheet or HTML;
Ugh - I'm skipping the rest.

Integrating XML with ASP (chapter 6)

Using the DOM:

MS Reference
    Set objXML = Server.CreateObject("Microsoft.XMLDOM")
    objXML.validateOnParse = True      ' or False
    ' For XML from a file:
    objXML.load(Server.MapPath("documentName.xml"))
    ' For XML from a string:
    objXML.loadXML(stringName)
    Set objRootElement = objXML.documentElement
    ...
    Set objXML = Nothing
You can shorten the above process by using objXML to jump straight to the node you want: Set objNode = objXML.selectSingleNode("/rootNode/childNode"). Grab the text from that node with objNode.text
An optional step (alongside validateOnParse, before calling load) is to set objXML.async = False; this forces the file to load all at once, so the script does not continue until loading has finished;

objRootElement.childNodes(#).xml creates a string containing the # child node of the document element (0 being the first), including its tags, and every subchild/attribute within it; .text returns only the text content, without the tags. Leaving off the childNodes(#) applies the same property to the whole document element, children included;
Call a node by name with: Set objNodeName = objRootElement.selectSingleNode("nodeName");

Using For...Next (pg 119):
    For i = 0 To objRootElement.childNodes.length - 1
        strContent = objRootElement.childNodes.item(i).text
        strAttribute = objRootElement.childNodes.item(i).getAttribute("attributeName")
        strNodeName = objRootElement.childNodes.item(i).nodeName
    Next
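
For comparison, the same loop can be tried outside ASP. A rough equivalent, assuming Python's standard xml.dom.minidom instead of MSXML; minidom has no .text property, so the text is read from the element's first child node. The contacts document is made up:

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<contacts>'
    '<person type="personal">Ann</person>'
    '<person type="business">Bob</person>'
    '</contacts>'
)
root = doc.documentElement

rows = []
for i in range(root.childNodes.length):
    node = root.childNodes.item(i)
    # nodeName is a property; getAttribute takes the attribute name.
    rows.append((node.nodeName, node.getAttribute('type'), node.firstChild.data))

print(rows)
```

The sample document is written without line breaks on purpose: whitespace between elements would show up as extra text nodes in childNodes.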

Make sure content exists: If objRootElement Is Nothing Then ... End If (documentElement is Nothing when the load failed, and IsObject would still report True for it)
Add a node via ASP to an XML document:
    Set objXML = Server.CreateObject("Microsoft.XMLDOM")
    objXML.load("documentName.xml")
    Set objRootNode = objXML.documentElement
    Set objXML2 = Server.CreateObject("Microsoft.XMLDOM")
    objXML2.loadXML(strXML)
    Set objNewNode = objXML2.documentElement
    Set objCurrentNode = objRootNode.appendChild(objNewNode)
    objXML.save("documentName.xml")
Delete a node via ASP:
    Set objOldItem = objRootNode.RemoveChild(objRootNode.childNodes(#))
    objXML.save("documentName.xml")
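
The append/remove pattern can be tested in memory before touching a file. A sketch assuming Python's standard xml.dom.minidom rather than MSXML; unlike the ASP code above, minidom wants a node from another document to be copied in with importNode first. The list/item document is invented, and saving to disk is left out:

```python
from xml.dom import minidom

doc = minidom.parseString('<list><item>one</item><item>two</item></list>')
root = doc.documentElement

# Build the new node from a string, as objXML2.loadXML(strXML) does above.
fragment = minidom.parseString('<item>three</item>')
new_node = doc.importNode(fragment.documentElement, True)  # True = deep copy
root.appendChild(new_node)

# Remove the first child; removeChild hands back the removed node.
removed = root.removeChild(root.childNodes.item(0))

result = root.toxml()
print(result)
```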

Using Server-Side Includes:

Can be used to insert XML code into an ASP page like anything else; this code will not be parsed, and will be passed to the client as is. Pretty useless, unless the XML document happens to consist of HTML-like tags. Any tags not resembling HTML would be ignored;

Using the File System Object:

If you want to simply extract a string of text from an XML document, you might want to simply read the document into a file object, then use InStr to locate the text:
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objFile = objFSO.OpenTextFile(Server.MapPath("documentName.xml"), 1, False)
    strXML = objFile.ReadAll
    objFile.Close
    Set objFSO = Nothing
You can also use the FSO to save text (as HTML) once you are done working with it;
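
The read-everything-then-search approach looks like this in outline. A sketch in Python rather than VBScript, with a throwaway temporary file standing in for documentName.xml; note that str.find counts from 0 and returns -1 when nothing is found, whereas InStr counts from 1 and returns 0:

```python
import os
import tempfile

xml_text = '<doc><title>Hello</title><body>World</body></doc>'

# Write a temporary file so the sketch is self-contained.
fd, path = tempfile.mkstemp(suffix='.xml')
with os.fdopen(fd, 'w', encoding='utf-8') as handle:
    handle.write(xml_text)

with open(path, encoding='utf-8') as handle:
    contents = handle.read()        # comparable to objFile.ReadAll
os.remove(path)

start_tag, end_tag = '<title>', '</title>'
start = contents.find(start_tag)    # comparable to InStr
end = contents.find(end_tag, start)
if start != -1 and end != -1:
    title = contents[start + len(start_tag):end]
else:
    title = None

print(title)
```

This treats the document as plain text, so it only suits simple, predictable files; anything with attributes on the tag or nested elements of the same name is better handled through the DOM.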

Using CSS with XML (chapter 7)

To associate an XML document with an external stylesheet: <?xml-stylesheet type="text/css" href="documentName.css" ?>. This is equivalent to <link> in html. It appears right below the XML version declaration (in the prolog). Additionally, you can add the optional media attribute, which currently accepts "screen", "print", and "all". Two stylesheets can be included in one document (each with a different media setting) to achieve the desired result (in IE5);
You can also embed a stylesheet directly within an XML document, if you like;
Defining the HTML namespace in the root element: <elementName xmlns:HTML="http://www.w3.org/TR/REC-html40"> enables you to instruct a browser to interpret any content in that namespace as HTML. For example, the <HTML:UL> tag would initiate an unordered list, just as you would expect;
HTML-like functionality can also be achieved with a behavior (in IE5). It is used in CSS just like any other property (behavior:). Some cool behaviors include:

anchor (anchorClick) - used to open a folder in Web Folder view;
download - used to download HTML pages and other files to the client;
homePage - used to query and change the user's Home Page setting;
saveFavorite - used to save the state of the page as a Favorites entry;
url - used to access another document. For example: behavior:url("documentName.htc"); - this line associates its CSS tag with documentName, which contains more comprehensive code (more on this pgs 157-159);

XSL - Extensible Stylesheet Language (chapter 8)

XSL is used to transform XML to HTML, altering content selectively all the while;
Link an XML document to an XSL document by placing the following in the XML prolog: <?xml-stylesheet type="text/xsl" href="documentName.xsl" ?>. This method is not super-efficient; better to perform a server-side transform (using the XMLDOM object, transformNode method) as follows: objXML.transformNode(objXSL);
An XSL stylesheet consists of one or more templates, each of which directs the transformation of a document. The head of a template contains the code <xsl:template match="/">, which indicates that transformation should begin at the root of a document. <HTML> and <BODY> tags normally come after the root match declaration;
Namespace declaration (also in head of template): <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform">. Note that the IE5-era constructs in these notes (xsl:eval, xsl:pi, the $...$ operators) belong to the older http://www.w3.org/TR/WD-xsl namespace, the one used in the example stylesheet further down. The xsl:stylesheet element becomes the document element of the XSL document, as well as the parent element of all XSL templates. This element also supports four attributes:

default-space - preserve white space from the source document (MSXML only supports "default");
indent-result - preserve white space originating in the stylesheet (MSXML only supports "yes");
language - script language used within scripting elements in the stylesheet
result-ns - denotes the namespace of the output document resulting from the transformation (this is ignored by MSXML 2.0);

XSL pattern characters (used like regular expressions: they help locate nodes in a DOM tree):

/ - when the pattern begins with this character, the search starts at root;
// - the pattern (after these characters) may appear anywhere below the starting point;
. - represents the current node;
* - wildcard: can be any element;
@ - name immediately following this character refers to an attribute;
@* - attribute wildcard;
: - namespace separator;
! - Applies an information method to the reference node. Various methods:

date - casts values to the date format;
end - returns true if the last node in a collection is selected;
index - returns the index number of the node;
nodeName - returns the qualified name of the node;
nodeType - returns a number indicating what type of node is selected;
text - returns the immediate text node of the selected element;
value - returns a type cast version of the value of an element;
Example: <xsl:when test=".[.!nodeName()=='elementName']"> ... </xsl:when>

() - groups contents for precedence;
[] - applies a filter pattern. Various operators:

&& - and;
|| - or;
$not$ - not;
= - equals;
$ieq$ - case insensitive equals;
!= - not equals;
$ine$ - case insensitive not equals;
< - less than;
> - greater than;
<= - less than or equal to;
>= - greater than or equal to;
$all$ - returns true if the condition is true for all items in a collection;
$any$ - returns true if the condition is true for any item in a collection;
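
Several of these pattern characters ( // . * @ [] ) survive in later path syntax, so they can be tried out without IE5. The sketch below assumes Python's standard xml.etree.ElementTree, which implements a small XPath subset; it does not support the ! methods or the $...$ operators listed above, and paths are written relative to the element being searched. The contact_info document is invented:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<contact_info>'
    '<personal><name id="1">Ann</name></personal>'
    '<business><name id="2">Bob</name><name>Cy</name></business>'
    '</contact_info>'
)

# //  : match anywhere below the starting point ( . is the current node)
all_names = [n.text for n in root.findall('.//name')]

# *   : wildcard, any child element
children = [c.tag for c in root.findall('./*')]

# [] with @ : keep only nodes that have (or match) an attribute
with_id = [n.text for n in root.findall('.//name[@id]')]
id_two = [n.text for n in root.findall(".//name[@id='2']")]

print(all_names, children, with_id, id_two)
```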

Template bodies are what enable you to apply logic to an XML document and dynamically transform it:

xsl:apply-templates - guides the XSL processor to the matching template (based on an associated pattern);
xsl:attribute name="" - creates an attribute in the output document;
xsl:choose - multiple conditional testing (similar to a Select statement) in conjunction with xsl:when test="" and xsl:otherwise;
xsl:comment - creates a comment node in the output;
xsl:copy - copies the current node or nodes to..;
xsl:element - creates an element in...;
xsl:eval - evaluates an in-line script to generate text output*. Methods for this found at pg 193;
xsl:for-each select="" - applies some processing to every node in a collection;
xsl:if test="" - simple condition;

xsl:pi - creates a processing instruction in...;
xsl:stylesheet - document element for a multi-template stylesheet;
xsl:template - defines a processing rule for output;
xsl:value-of select="" - inserts the value of the selected node into...;
Note: you can also use the order-by="" attribute with many of these;

Example stylesheet:
    <?xml version="1.0"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
        <xsl:template match="/">
            <xsl:apply-templates />
        </xsl:template>
        <xsl:template match="/contact_info">
            <HTML>
                <BODY>
                    <xsl:apply-templates />
                </BODY>
            </HTML>
        </xsl:template>
        <xsl:template match="contact_info">
            <xsl:for-each select="./*">
                <xsl:choose>
                    <xsl:when test=".[.!nodeName()='personal']">
                        <DIV STYLE="background-color:teal;">Personal Contacts</DIV>
                    </xsl:when>
                    <xsl:otherwise> ... </xsl:otherwise>
                </xsl:choose>
                <xsl:apply-templates />
                <p/>
            </xsl:for-each>
        </xsl:template>
    </xsl:stylesheet>

Advanced XSL Techniques (chapter 9)
