XML / XSL Notes
*These notes are based on information taken from random sources on the Internet, and WROX Professional ASP XML. - 11/1/2024
Intro to XML (chapter 1)
<font size="3">
);
Understanding XML (chapter 2)
The XML declaration can also carry encoding ("UTF-8" or "UTF-16", the two
Unicode encodings every parser must support), and standalone ("yes" or "no",
depending on whether the document relies on an external DTD or Schema);
<?xml version="1.0"?> - the <?...?> syntax is that of a Processing
Instruction (PI), although the XML declaration itself is not formally a PI;
<!DOCTYPE root-element SYSTEM "root-element.dtd"> - root-element refers to
the name of the document element, SYSTEM refers to the location of the .dtd
file (local system);
An empty element can be written as <tagName />;
xml:space, when included in any tag, applies to all child elements as well.
Set to "preserve", it preserves the formatting of content (like <pre> in
HTML). There is no guarantee this will be respected by a parser, however.
The attribute xml:lang="langType" also affects all child elements - it
indicates what language the content is written in (eg, for use when multiple
languages are used in the same doc);
CDATA (<![CDATA[...]]>) sections of an XML document are not parsed for
markup; their content is passed through as literal text. You can use this to
pass examples of XML in a document. When using XSL to transform XML into
HTML, any scripting must appear in a CDATA section;
The predefined entities are &apos; &amp; &gt; &lt; and &quot; (apostrophe,
ampersand, greater-than, less-than, quote);
Validating XML with the DTD (chapter 3)
A public DTD is referenced with <!DOCTYPE root-element PUBLIC "name" "URL_of_DTD">;
Elements are declared with <!ELEMENT elementName rule>;
ANY rule indicates that an element can contain anything allowed by the DTD,
as well as nothing;
EMPTY rule indicates that an element must not contain any data. The element
itself can have attributes, however;
Alternatives are separated with the (|) operator. This would be used, for
example, if a parent element could contain data or a child element: eg.
rule = (#PCDATA | ElementA)* (in mixed content, #PCDATA must come first and
the group must end with *);
#PCDATA rule indicates that an element contains parsed character data (text
that the parser scans for markup and entity references). This data will be
interpreted;
Occurrence symbols, placed after an element name:
? - element may appear once or not at all;
* - may or may not appear, any number of times;
+ - must appear one or more times;
(none) - must appear exactly once;
<!ELEMENT EMAIL (To+, From, CC*, Subject?, Body?)>;
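The occurrence symbols behave much like regular-expression quantifiers. The sketch below is not a DTD validator; it only translates the EMAIL content model from the notes into a Python regular expression to show which child sequences the model would accept. The function name matches_email_model is made up for this illustration.

```python
import re

# Content model from the notes: (To+, From, CC*, Subject?, Body?)
# Each child name is followed by a space so that names cannot run together.
EMAIL_MODEL = re.compile(r"(To )+(From )(CC )*(Subject )?(Body )?")

def matches_email_model(children):
    """Return True if the list of child-element names fits the EMAIL model."""
    sequence = "".join(name + " " for name in children)
    return EMAIL_MODEL.fullmatch(sequence) is not None

print(matches_email_model(["To", "To", "From", "Subject"]))  # True
print(matches_email_model(["From", "To"]))                    # False: To must come first
print(matches_email_model(["To", "From", "From"]))            # False: From appears exactly once
```

A validating parser does the equivalent check for every element against its declared rule.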
Attributes are declared with <!ATTLIST targetElement attribName attribType Defaults>.
CDATA - only character data can be used (will not be parsed);
ENTITY - value must refer to an external (unparsed) entity declared in the
DTD;
ENTITIES - multiple of above (separated by whitespace);
ID - unique element identifier;
IDREF - value must match the ID type attribute of some element in the
document;
IDREFS - multiple of above (separated by whitespace);
NMTOKEN - a valid XML name token;
NMTOKENS - multiple of above (separated by whitespace);
NOTATION - value must refer to notation declaration in the DTD;
Enumerated_values - attribute value must match one of the included values.
Eg, (male | female);
Default settings:
#REQUIRED - attribute value must be specified;
#IMPLIED - value is optional;
#FIXED value - attribute must have the supplied value (always precede with
CDATA?);
value - supplied value is the default (always precede with CDATA?);
<!ATTLIST Employee gender (male | female) #REQUIRED>;
General entities are declared with <!ENTITY entityName "entityDefinition">.
These work like the pre-defined XML entities (apostrophe, ampersand, etc), in
that you can call on entityName in the XML document by simply delimiting it
with an (&) and (;), ie. &entityName;
External entities are declared with <!ENTITY entityName SYSTEM "entityURI">.
You can use this to import complex data from other sites - a powerful
feature. The data from the URL will be parsed. To prevent this, specify NDATA
and a notation name after the URL (eg. "...mapLondon.gif" NDATA GIF);
A parameter entity is declared with <!ENTITY % entityName "entityDefinition">.
It is used exclusively within the DTD (for eliminating repetitive text). The
entity is later called as: %entityName;
<![ IGNORE [...]]> will turn off a block of DTD content. INCLUDE does the
opposite;
Comments are written as <!-- ... -->;
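As a rough illustration of general entities, the sketch below declares one entity in an internal DTD subset and lets Python's standard ElementTree parser expand it. The entity name company, its value, and the note element are invented for the example. ElementTree does not validate against the DTD; it only expands internally declared entities and reports well-formedness errors.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
  <!ENTITY company "Example Widgets Ltd">
]>
<note>Invoice from &company; &amp; partners</note>"""

# &company; is replaced by its definition, &amp; by the predefined ampersand.
note = ET.fromstring(DOC)
print(note.text)  # Invoice from Example Widgets Ltd & partners

# An entity that was never declared is a parse error, reported with its line.
try:
    ET.fromstring("<note>&missing;</note>")
except ET.ParseError as err:
    error_line = err.position[0]
    print("Error:", err)
    print("At line:", error_line)
```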
Set objXML = Server.CreateObject("Microsoft.XMLDOM")
objXML.async = False            ' load synchronously so parseError is filled in before the check
objXML.validateOnParse = True   ' validate against the DTD while loading
objXML.load(Server.MapPath("somefile.xml"))
If objXML.parseError.errorCode <> 0 Then
    ' Report why and where parsing or validation failed
    Response.Write("Error: " & objXML.parseError.reason & "<br>")
    Response.Write("At line: " & objXML.parseError.line & "<br>")
Else
    Set objRootElement = objXML.documentElement
    Response.Write(objRootElement.xml)
End If
Validating XML Using Schemas (Chapter 4)
A Schema (.xsd) begins with a namespace (http://www.rpbourret.com/xml/NamespacesFAQ.htm),
which looks like:
<schema targetNamespace="nameSpaceURI"
    nameSpaceID:elementName xmlns:nameSpaceID="nameSpaceURI"
    nameSpaceID2:elementName2 xmlns:nameSpaceID2="nameSpaceURI2"
    elementFormDefault="(un?)qualified"
    attributeFormDefault="(un?)qualified">
nameSpaceID (above) can come before any element or attribute name as an
identifier (separated by a colon). Namespace declarations must also be made
in the XML instance (along with references to the schema). To check an
element for conformance, the processor first locates the declaration for the
element in a schema, and then checks that the targetNamespace attribute in
the schema matches the actual namespace URI of the element (or,
alternatively, that the schema does not have a targetNamespace attribute and
the instance element is not namespace-qualified). Qualification of local
elements and attributes can be globally specified by a pair of attributes,
elementFormDefault and attributeFormDefault, on the schema element, or can be
specified separately for each local declaration using the form attribute. All
such attributes' values may each be set to unqualified or qualified, to
indicate whether or not locally declared elements and attributes must be
unqualified;
The nameSpaceID:elementName part of the statement can be left out (leaving
just xmlns:nameSpaceID="namespaceURI");
<elementName xmlns="nameSpaceURI"
    xmlns:xsi="someURI"
    xsi:schemaLocation="nameSpaceURI nameSpaceURI/schemaName.xsd">
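To see what a namespace declaration does to element names, here is a small sketch using Python's standard ElementTree parser. The URIs (urn:example:orders, urn:example:customers) and element names are placeholders made up for the example; the point is only that a prefix, or the default namespace, resolves to its URI.

```python
import xml.etree.ElementTree as ET

DOC = """<order xmlns="urn:example:orders"
       xmlns:cust="urn:example:customers">
  <item>Widget</item>
  <cust:name>Ada</cust:name>
</order>"""

root = ET.fromstring(DOC)

# The parser replaces prefixes with the namespace URI in {braces}.
print(root.tag)                        # {urn:example:orders}order
print([child.tag for child in root])

# Prefixes used in a search are local to the search; only the URIs must match.
ns = {"o": "urn:example:orders", "c": "urn:example:customers"}
print(root.find("o:item", ns).text)    # Widget
print(root.find("c:name", ns).text)    # Ada
```

Note that the unprefixed item element still belongs to a namespace, because the default xmlns declaration applies to it.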
A Schema can also draw on another Schema (.xsd); in this case, it will either
include (if namespaces are shared) or import (if namespaces are different)
another Schema into itself. Example:
<import namespace="someURI" schemaLocation="someURI2/schemaName.xsd" />
<include schemaLocation="someURI/schemaName.xsd" />
<complexType name="elementName" (?)base="" (?)content="" (?)group order="">
    <element name="" type="" /> <-- "type definition"
    <element name="" type="" />
    <attribute name="" type="" />
</complexType>
- the base parameter can be any of the following: string, boolean, float,
double, decimal, timeInstant, integer, ENTITY, NOTATION ... (see pg 61-62 for
more). This compares rather favorably with DTDs (limited to CDATA and
PCDATA). The content attribute describes what the complexType contains (it is
elementOnly by default). Other settings: empty, mixed, and textOnly. The
group order attribute (sequence by default) specifies whether the sequence of
listed elements matters. The all setting allows for changes in sequence
(though it imposes restrictions - you can't have nested elements within a
complexType with this attribute, and minOccurs/maxOccurs must both be one -
this could change in future standards). The choice setting allows only one of
the listed elements to appear in an XML instance. Derivation? (pg 68);
To reuse an existing element declaration, substitute ref= for name=, and
minOccurs (and/or maxOccurs) for type=. minOccurs can either be 0 (optional)
or 1, and maxOccurs can be 1 or * (many);
<simpleType name="" base="">
    <facetName (value or name)="" />
</simpleType>
The facets narrow the simpleType declaration (its base type). Example:
<simpleType name="" base="string">
    <minLength value="4" />
    <maxLength value="32" />
</simpleType>
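The following is a minimal sketch of how the length facets in the example could be applied to candidate values. It is not a Schema processor: it simply reads the facet elements with Python's ElementTree and compares string lengths. The type name shortText and the function satisfies_facets are hypothetical.

```python
import xml.etree.ElementTree as ET

SIMPLE_TYPE = """<simpleType name="shortText" base="string">
  <minLength value="4" />
  <maxLength value="32" />
</simpleType>"""

def satisfies_facets(type_xml, candidate):
    """Check a string against the minLength/maxLength facets of a simpleType."""
    type_def = ET.fromstring(type_xml)
    for facet in type_def:
        limit = int(facet.get("value"))
        if facet.tag == "minLength" and len(candidate) < limit:
            return False
        if facet.tag == "maxLength" and len(candidate) > limit:
            return False
    return True

print(satisfies_facets(SIMPLE_TYPE, "abc"))     # False: shorter than minLength
print(satisfies_facets(SIMPLE_TYPE, "abcd"))    # True
print(satisfies_facets(SIMPLE_TYPE, "x" * 33))  # False: longer than maxLength
```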
The pattern facet can be used to specify data constrained by a regular
expression (refer to the XML specs for pattern symbols);
The enumeration facet can be used to specify expected values (eg, "male" and
"female" if base="string", or "1" and "2", etc.);
Annotation tags can appear within a simple or complex type, and simply
contain notes;
Document Object Model (chapter 5)
Integrating XML with ASP (chapter 6)
Set objXML = Server.CreateObject("Microsoft.XMLDOM")
objXML.validateOnParse = True/False
For XML from a file: objXML.load(Server.MapPath("documentName.xml"))
For XML from a string: objXML.loadXML(stringName)
Set objRootElement = objXML.documentElement
...
Set objXML = Nothing
Call selectSingleNode on objXML to jump straight to the node you want:
Set objNode = objXML.selectSingleNode("/rootNode/childNode")
objNode.text
Another useful step (along with validateOnParse) is to set
objXML.async = false; this forces the file to load completely before the
script continues;
objRootElement.childNodes(#).text creates a string containing the text of the
# child node of the document element (0 being the first) and of every
subchild within it, without tags; use .xml instead to get the node including
its tags and attributes. Leaving off the (#) returns the collection of all
the children;
Set objNodeName = objRootElement.selectSingleNode("nodeName");
For...Next (pg 119):
If Not IsObject(objRootElement) Then ...
End If
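' Load the existing document
Set objXML = Server.CreateObject("Microsoft.XMLDOM")
objXML.async = False
objXML.load(Server.MapPath("documentName.xml"))
Set objRootNode = objXML.documentElement
' Parse the new fragment (strXML) in a second document
Set objXML2 = Server.CreateObject("Microsoft.XMLDOM")
objXML2.loadXML(strXML)
Set objNewNode = objXML2.documentElement
' Append the fragment to the root and save
Set objCurrentNode = objRootNode.appendChild(objNewNode)
objXML.save(Server.MapPath("documentName.xml"))
' Remove the # child of the root and save
Set objOldItem = objRootNode.removeChild(objRootNode.childNodes(#))
objXML.save(Server.MapPath("documentName.xml"))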
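For comparison, the same append-and-remove steps can be sketched with Python's xml.dom.minidom, which exposes the same W3C DOM method names (appendChild, removeChild, childNodes). This is an analogue, not the MSXML code from the notes: the sample XML is invented, saving to disk is left out, and minidom needs an explicit importNode call to bring a node in from another document.

```python
from xml.dom import minidom

doc = minidom.parseString("<contacts><person>Ann</person></contacts>")
root = doc.documentElement

# Parse the new fragment in its own document, as the notes do with objXML2.
fragment = minidom.parseString("<person>Bob</person>")
new_node = doc.importNode(fragment.documentElement, True)  # deep copy into doc
root.appendChild(new_node)
after_append = root.toxml()
print(after_append)  # <contacts><person>Ann</person><person>Bob</person></contacts>

# Remove the first child (index 0); removeChild returns the removed node.
removed = root.removeChild(root.childNodes[0])
print(removed.toxml())  # <person>Ann</person>
print(root.toxml())     # <contacts><person>Bob</person></contacts>
```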
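To search an XML file as plain text, read it through a file object, then use
InStr to locate the text:
Set objFSO = CreateObject("Scripting.FileSystemObject")
' 1 = open for reading; False = do not create the file if it is missing
Set objFile = objFSO.OpenTextFile(Server.MapPath("documentName.xml"), 1, False)
strXML = objFile.ReadAll
objFile.Close
Set objFile = Nothing
Set objFSO = Nothing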
<?xml-stylesheet type="text/css" href="documentName.css" ?>. This is
equivalent to <link> in html. It appears right below the XML version
declaration (in the prolog). Additionally, you can add the optional media
attribute, which currently accepts "screen", "print", and "all". Two
stylesheets can be included in one document (each with a different media
setting) to achieve the desired result (in IE5);
<elementName xmlns:HTML="http://www.w3.org/TR/REC-html40"> enables you to
instruct a browser to interpret any content in that namespace as HTML. For
example, the <HTML:UL> tag would initiate an unordered list, just as you
would expect;
Behaviors can be attached through CSS (behavior:). Some cool behaviors
include:
anchor (anchorClick) - used to open a folder in Web Folder view;
download - used to download HTML pages and other files to the client;
homePage - used to query and change the user's Home Page setting;
saveFavorite - used to save the state of the page as a Favorites entry;
url - used to access another document. For example:
behavior:url("documentName.htc"); - this line associates its CSS tag with
documentName, which contains more comprehensive code (more on this pgs
157-159);
XSL - Extensible Stylesheet Language (chapter 8)
<?xml-stylesheet type="text/xsl" href="documentName.xsl" ?>. This leaves the
transform to the browser, which is not super-efficient; better to perform a
server-side transform (using the XMLDOM object, transformNode method) as
follows: objXML.transformNode(objXSL);
The root template is <xsl:template match="/">, which indicates that
transformation should begin at the root of a document. <HTML> and <BODY> tags
normally come after the root match declaration;
Multiple templates are wrapped in
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform">.
The xsl:stylesheet element becomes the document element of the XSL document,
as well as the parent element of all XSL templates. This element also
supports four attributes:
default-space - preserve white space from the source document (MSXML only
supports "default");
indent-result - preserve white space originating in the stylesheet (MSXML
only supports "yes");
language - script language used within scripting elements in the stylesheet;
result-ns - denotes the namespace of the output document resulting from the
transformation (this is ignored by MSXML 2.0);
Pattern syntax:
/ - when the pattern begins with this character, the search starts at root;
// - the pattern (after these characters) may appear anywhere below the
starting point;
. - represents the current node;
* - wildcard: can be any element;
@ - name immediately following this character refers to an attribute;
@* - attribute wildcard;
: - namespace separator;
! - Applies an information method to the reference node. Various methods:
  date - casts values to the date format;
  end - returns true if the last node in a collection is selected;
  index - returns the index number of the node;
  nodeName - returns the qualified name of the node;
  nodeType - returns a number indicating what type of node is selected;
  text - returns the immediate text node of the selected element;
  value - returns a type cast version of the value of an element;
<xsl:when test=".[.!nodeName()='elementName']">
...
</xsl:when>
() - groups contents for precedence;
[] - applies a filter pattern. Various operators:
  && - and;
  || - or;
  $not$ - not;
  = - equals;
  $ieq$ - case insensitive equals;
  != - not equals;
  $ine$ - case insensitive not equals;
  < - less than;
  > - greater than;
  <= - less than or equal to;
  >= - greater than or equal to;
  $all$ - returns true if the condition is true for all items in a
  collection;
  $any$ - returns true if the condition is true for any item in a collection;
xsl:apply-templates - guides the XSL processor to the matching template
(based on an associated pattern);
xsl:attribute name="" - creates an attribute in the output document;
xsl:choose - multiple conditional testing (similar to a Select statement) in
conjunction with xsl:when test="" and xsl:otherwise;
xsl:comment - creates a comment node in the output;
xsl:copy - copies the current node or nodes to..;
xsl:element - creates an element in...;
xsl:eval - evaluates an in-line script to generate text output*. Methods for
this found at pg 193;
xsl:for-each select="" - applies some processing to every node in a
collection;
xsl:if test="" - simple condition;
xsl:pi - creates a processing instruction in...;
xsl:stylesheet - document element for a multi-template stylesheet;
xsl:template - defines a processing rule for output;
xsl:value-of select="" - inserts the value of the selected node into...;
The order-by="" attribute can be used with many of these;
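Several of the pattern characters listed earlier in this chapter (., *, //, @ and the [] filter) carry over to XPath. As a rough way to experiment with them outside MSXML, the sketch below uses the limited XPath subset in Python's standard ElementTree. It does not support the MSXML-specific ! methods or the $...$ operators, and the sample document is invented around the contact_info and personal element names used in these notes.

```python
import xml.etree.ElementTree as ET

DOC = """<contact_info>
  <personal><name id="p1">Ann</name></personal>
  <business><name id="b1">Bob</name><name>Cy</name></business>
</contact_info>"""

root = ET.fromstring(DOC)

# "./*" : every element child of the current node
print([el.tag for el in root.findall("./*")])            # ['personal', 'business']

# ".//name" : <name> elements anywhere below the current node
print([el.text for el in root.findall(".//name")])       # ['Ann', 'Bob', 'Cy']

# "[@id]" : filter on the presence of an attribute
print([el.text for el in root.findall(".//name[@id]")])  # ['Ann', 'Bob']

# "[@id='b1']" : filter on an attribute value
print(root.find(".//name[@id='b1']").text)               # Bob
```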
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <!-- Start at the document root and hand off to the templates below -->
  <xsl:template match="/">
    <xsl:apply-templates />
  </xsl:template>
  <xsl:template match="/contact_info">
    <HTML>
      <BODY>
        <xsl:apply-templates />
      </BODY>
    </HTML>
  </xsl:template>
  <xsl:template match="contact_info">
    <!-- Visit every child element and label the personal contacts -->
    <xsl:for-each select="./*">
      <xsl:choose>
        <xsl:when test=".[.!nodeName()='personal']">
          <DIV STYLE="background-color:teal;">Personal Contacts</DIV>
        </xsl:when>
        <xsl:otherwise>
          ...
        </xsl:otherwise>
      </xsl:choose>
      <xsl:apply-templates />
      <p/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
Advanced XSL Techniques (chapter 9)