Accessing and Manipulating XML Data in Microsoft .NET Framework

The Extensible Markup Language (XML) is a simple, flexible, and portable markup language. It is a standard developed by the World Wide Web Consortium (W3C) and is widely used for exchanging data between applications. An XML document is a text file that carries both human-readable data and a description of that data. The following is an example of a well-formed XML document:

<?xml version="1.0" encoding="UTF-8"?>
  <StudentRecords>
      <Students>
         <StudentID> SO100 </StudentID>
         <StudentName> John Trivolta </StudentName>
         <StudentSubject> Science </StudentSubject>
      </Students>

      <Students>
         <StudentID> SO101 </StudentID>
         <StudentName> Bruce Lee </StudentName>
         <StudentSubject> Computer </StudentSubject>
      </Students>
  </StudentRecords>


An XML document is plain text, so it can be created and edited with any text editor. XML is also regarded as a meta-language, because it is used to define other markup vocabularies. The rules to follow when writing an XML document are as follows:

  • XML describes document structure using markup tags, which are delimited by '<' and '>'.


  • The line '<?xml version="1.0" encoding="UTF-8"?>' is the XML declaration, which is optional. It states the version of the XML specification that the document follows and declares that the document uses the UTF-8 character encoding. No space is allowed between '<?' and 'xml'.


  • A construct that starts with the '<?' string and ends with the '?>' string is called a processing instruction. It passes information to the application that processes the XML document.


  • In a well-formed XML document, every start tag must have a matching end tag. For example, '<Students>' is a start tag and '</Students>' is its end tag. Unlike HTML tags, XML tags are case-sensitive. For example, if a start tag is written as '<StudentID>', its end tag must be written as '</StudentID>', and not as '</studentid>' or '</STUDENTID>'.


  • Unlike HTML, XML has no predefined elements and attributes. They are defined according to the application or business requirements.

An XML document can be associated with a document type definition (DTD) or an XML schema. An XML schema describes the data and the relationships between the data within an XML document. It defines the elements and attributes that the document may contain, including their type information, and it is used to validate XML documents.
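
The well-formedness rules listed above can be checked in code. The following is a minimal sketch in Visual C# .NET, assuming that a parse attempt with XmlDocument.LoadXml is an acceptable test; the class name WellFormedCheck and the method IsWellFormed are hypothetical names chosen for illustration:

```csharp
using System;
using System.Xml;

static class WellFormedCheck
{
    // Returns true when the text parses as well-formed XML.
    // IsWellFormed is an illustrative helper, not a framework method.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            XmlDocument doc = new XmlDocument();
            doc.LoadXml(xml);          // throws XmlException on malformed input
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    static void Main()
    {
        // Matching start and end tags.
        Console.WriteLine(IsWellFormed("<StudentID>SO100</StudentID>"));
        // End tag differs in case, so it does not match the start tag.
        Console.WriteLine(IsWellFormed("<StudentID>SO100</studentid>"));
        // End tag is missing.
        Console.WriteLine(IsWellFormed("<StudentID>SO100"));
    }
}
```

Only the first call is expected to report a well-formed document. This check covers syntax only; it does not validate the document against a DTD or a schema.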

The XML Document Object Model

The XML Document Object Model, commonly known as XML DOM, is a representation of the XML document in memory. The DOM classes read, write, and manipulate an XML document. XML documents are tree-structured and consist of parent and child nodes. In the example above, <StudentRecords> is the parent node and each <Students> element is an immediate child node. Nodes that have the same parent node are called siblings. The XML DOM has different types of nodes, as follows:

DOM Node Type   Description
Document        Represents the document root, which acts as a container for all nodes.
DocumentType    Represents the <!DOCTYPE> node.
Element         Represents an element node.
Attribute       Represents an attribute of an element node.
Comment         Represents a comment node.
Text            Represents the text that belongs to a particular node.

The XmlDocument class, which is derived from the XmlNode class, represents the document node, which is the root of the DOM tree. It performs tasks such as loading and saving XML documents and gives access to all the nodes in the document. The XmlDocument class provides the Load, LoadXml, and Save methods. The Load method loads an XML document from a file, stream, or reader, the LoadXml method loads it from a string, and the Save method saves it. The following code snippet written in Visual Basic .NET shows how to load an XML DOM using the Load method:

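Imports System
Imports System.IO
Imports System.Xml
Module Module1
 Sub Main()
   ' Load students.xml from the current directory into the DOM.
   Dim myDoc As New XmlDocument()
   myDoc.Load("students.xml")
   ' InnerXml returns the markup of the whole document as a string.
   Console.WriteLine(myDoc.InnerXml)
   Console.ReadLine()
 End Sub
End Module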


The following code snippet written in Visual C# .NET shows how to load an XML DOM using the Load method:

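using System;
using System.IO;
using System.Xml;
class Class1
{
   static void Main()
   {
       // Load students.xml from the current directory into the DOM.
       XmlDocument myDoc = new XmlDocument();
       myDoc.Load("students.xml");
       // InnerXml returns the markup of the whole document as a string.
       Console.WriteLine(myDoc.InnerXml);
       Console.ReadLine();
   }
}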
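
The parent, child, and sibling relationships described earlier can be followed through the properties of the XmlNode class. The following is a minimal sketch in Visual C# .NET, assuming the student data is supplied as a string through LoadXml instead of a file; the class name DomWalk and the method StudentNames are hypothetical:

```csharp
using System;
using System.Xml;

static class DomWalk
{
    // Sample data modelled on the StudentRecords document shown earlier.
    const string Xml =
        "<StudentRecords>" +
        "<Students><StudentID>SO100</StudentID><StudentName>John Trivolta</StudentName></Students>" +
        "<Students><StudentID>SO101</StudentID><StudentName>Bruce Lee</StudentName></Students>" +
        "</StudentRecords>";

    // Returns the text of every StudentName element, in document order.
    public static string[] StudentNames(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNodeList nodes = doc.GetElementsByTagName("StudentName");
        string[] names = new string[nodes.Count];
        for (int i = 0; i < nodes.Count; i++)
        {
            names[i] = nodes[i].InnerText;
        }
        return names;
    }

    static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);

        // DocumentElement is the root element, here <StudentRecords>.
        XmlElement root = doc.DocumentElement;
        Console.WriteLine(root.Name + " is a node of type " + root.NodeType);

        // FirstChild and NextSibling walk the chain of sibling nodes.
        for (XmlNode child = root.FirstChild; child != null; child = child.NextSibling)
        {
            Console.WriteLine("  child " + child.Name + ", parent " + child.ParentNode.Name);
        }

        Console.WriteLine(string.Join(", ", StudentNames(Xml)));
    }
}
```

Because the whole tree is held in memory, any node can be reached from any other node, which is the main difference from the forward-only readers described below.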


The XmlReader and XmlWriter classes in the System.Xml namespace are abstract base classes for reading XML data from, and writing XML data to, streams or XML documents. Their properties and methods are described under the following sub-headings:

The XmlReader Class

The XmlReader class is an abstract class that provides fast, non-cached, forward-only, read-only access to XML data. Its methods provide access to the elements and attributes of XML data. The XmlTextReader class is derived from the XmlReader class and implements the methods that XmlReader defines. The XmlValidatingReader class, which is also derived from the XmlReader class, reads XML data and supports validation against a document type definition (DTD) or a schema.

The XmlTextReader class is suited to cases that need fast access to XML data without reading the entire document into memory through the Document Object Model (DOM). The following are some of the properties and methods of the XmlTextReader class:

Member Name              Type      Description
AttributeCount           Property  Returns the number of attributes on the current node.
HasAttributes            Property  Returns true if the current node has attributes.
HasValue                 Property  Returns true if the current node can have a value.
Depth                    Property  Returns the depth of the current node in the XML document.
IsEmptyElement           Property  Returns true if the current node is an empty element, such as <Item/>.
Item                     Property  Returns the value of the attribute with the specified name or index.
Value                    Property  Returns the text value of the current node.
Name                     Property  Returns the qualified name of the current node.
IsStartElement()         Method    Checks whether the current content node is a start tag or an empty element tag.
Read()                   Method    Reads the next node from the stream.
ReadAttributeValue()     Method    Parses the attribute value into one or more Text, EntityReference, or EndEntity nodes.
ReadString()             Method    Reads the text content of an element or text node as a string.
MoveToFirstAttribute()   Method    Moves the reader to the first attribute of the current node.
MoveToNextAttribute()    Method    Moves the reader to the next attribute.

The following code snippet written in Visual Basic .NET shows how to read data using the XmlTextReader class:

' Requires: Imports System and Imports System.Xml
Dim readdata As New XmlTextReader("e:students.xml")
While readdata.Read()
 Select Case readdata.NodeType
     Case XmlNodeType.Element
       ' Write the start tag together with any attributes.
       Console.Write("<" & readdata.Name)
       ' On an element node, MoveToNextAttribute moves to the first attribute.
       While readdata.MoveToNextAttribute()
           Console.Write(" " & readdata.Name & "='" & _
             readdata.Value & "'")
       End While
       Console.Write(">")
     Case XmlNodeType.Text
       Console.Write(readdata.Value)
     Case XmlNodeType.EndElement
       Console.WriteLine("</" & readdata.Name & ">")
 End Select
End While
readdata.Close()


The following code snippet written in Visual C# .NET shows how to read data using the XmlTextReader class:

// Requires: using System; and using System.Xml;
XmlTextReader readdata = new XmlTextReader("e:students.xml");
while (readdata.Read())
{
   switch (readdata.NodeType)
   {
     case XmlNodeType.Element:
        // Write the start tag together with any attributes.
        Console.Write("<" + readdata.Name);
        // On an element node, MoveToNextAttribute moves to the first attribute.
        while (readdata.MoveToNextAttribute())
        {
          Console.Write(" " + readdata.Name + "='"
            + readdata.Value + "'");
        }
        Console.Write(">");
        break;
     case XmlNodeType.Text:
        Console.Write(readdata.Value);
        break;
     case XmlNodeType.EndElement:
        Console.WriteLine("</" + readdata.Name + ">");
        break;
   }
}
readdata.Close();
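
Several of the properties listed in the table above, such as Depth, AttributeCount, and IsEmptyElement, can be seen at work in a single forward pass. The following is a minimal sketch in Visual C# .NET, assuming the XML comes from a string wrapped in a StringReader instead of a file; the class name ReaderOutline, the method Outline, and the sample markup are hypothetical:

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class ReaderOutline
{
    // Builds one line per element: indentation taken from Depth,
    // then the element name, its attribute count, and an "empty" marker.
    public static string Outline(string xml)
    {
        StringBuilder sb = new StringBuilder();
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    sb.Append(new string(' ', reader.Depth * 2));
                    sb.Append(reader.Name);
                    sb.Append(" attrs=").Append(reader.AttributeCount);
                    sb.Append(reader.IsEmptyElement ? " empty" : "");
                    sb.Append('\n');
                }
            }
        }
        finally
        {
            reader.Close();
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.Write(Outline(
            "<StudentRecords><Students id='1'>" +
            "<StudentID>SO100</StudentID><Note/>" +
            "</Students></StudentRecords>"));
    }
}
```

The reader never holds more than the current node, so the outline is produced without building a DOM tree.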


The XmlWriter Class

The XmlWriter class is an abstract class in the System.Xml namespace that writes XML data to streams and files. It performs tasks such as writing multiple documents to one output stream, encoding binary data, and managing, flushing, and closing the output stream. The XmlTextWriter class, derived from the XmlWriter class, provides properties and methods for writing XML data to an output file, stream, or console. The following are some of the properties and methods of the XmlTextWriter class:

Member Name              Type      Description
BaseStream               Property  Returns the stream to which the XmlTextWriter object is writing.
Formatting               Property  Specifies whether the output is formatted as Indented or None.
WriteState               Property  Returns the state of the writer, such as Start, Prolog, Element, Attribute, Content, or Closed.
WriteStartDocument()     Method    Writes the XML declaration <?xml version="1.0"?> at the start of the XML document.
WriteStartElement()      Method    Writes the start tag of the specified element.
WriteEndElement()        Method    Writes the end tag of the current element.
WriteStartAttribute()    Method    Writes the start of an attribute.
WriteEndAttribute()      Method    Writes the end of an attribute.

The following code snippet written in Visual Basic .NET shows how to write data using the XmlTextWriter class:

' Requires: Imports System and Imports System.Xml
Dim writedata As New XmlTextWriter("e:students.xml", System.Text.Encoding.UTF8)
writedata.Formatting = Formatting.Indented
writedata.WriteStartDocument(False)
writedata.WriteDocType("StudentRecords", Nothing, Nothing, Nothing)
writedata.WriteComment("This file represents a fragment of Employees database")
writedata.WriteStartElement("StudentRecords")
writedata.WriteStartElement("Students")
writedata.WriteElementString("StudentID", "SO100")
writedata.WriteElementString("StudentName", "John Trivolta")
writedata.WriteElementString("StudentSubject", "Science")
writedata.WriteEndElement()   ' closes <Students>
writedata.WriteEndElement()   ' closes <StudentRecords>

' Write the XML to file and close the writer
writedata.Flush()
writedata.Close()
Console.WriteLine("Press <Enter> to exit")
Console.Read()


The following code snippet written in Visual C# .NET shows how to write data using the XmlTextWriter class:

// Requires: using System; and using System.Xml;
XmlTextWriter writedata = new XmlTextWriter("e:students.xml", System.Text.Encoding.UTF8);
writedata.Formatting = Formatting.Indented;
writedata.WriteStartDocument(false);
writedata.WriteDocType("StudentRecords", null, null, null);
writedata.WriteComment("This file represents a fragment of Employees database");
writedata.WriteStartElement("StudentRecords");
writedata.WriteStartElement("Students");
writedata.WriteElementString("StudentID", "SO100");
writedata.WriteElementString("StudentName", "John Trivolta");
writedata.WriteElementString("StudentSubject", "Science");
writedata.WriteEndElement();   // closes <Students>
writedata.WriteEndElement();   // closes <StudentRecords>

// Write the XML to file and close the writer
writedata.Flush();
writedata.Close();
Console.WriteLine("Press <Enter> to exit");
Console.Read();
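
The WriteStartAttribute and WriteEndAttribute methods from the table above do not appear in the file-based snippets. The following is a minimal sketch in Visual C# .NET that uses them, assuming the output goes to an in-memory StringWriter instead of a file; the class name WriterSketch, the method BuildStudent, and the id attribute are hypothetical:

```csharp
using System;
using System.IO;
using System.Xml;

static class WriterSketch
{
    // Writes one <Students> element with an id attribute and returns the markup.
    public static string BuildStudent(string id, string name)
    {
        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);

        writer.WriteStartElement("Students");

        // An attribute is written as start, value, end.
        writer.WriteStartAttribute("id");
        writer.WriteString(id);
        writer.WriteEndAttribute();

        // WriteElementString writes a start tag, escaped text, and an end tag.
        writer.WriteElementString("StudentName", name);

        writer.WriteEndElement();   // closes <Students>
        writer.Close();
        return output.ToString();
    }

    static void Main()
    {
        // prints <Students id="SO100"><StudentName>John Trivolta</StudentName></Students>
        Console.WriteLine(BuildStudent("SO100", "John Trivolta"));
    }
}
```

The writer escapes reserved characters such as '&' in text and attribute values, which is one reason to prefer it over building XML by string concatenation.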


The XML Path (XPath) Language

The XML Path (XPath) language, defined by a W3C standard, is a query language for finding information in an XML document. XPath uses path expressions to select a set of nodes and to navigate through elements and attributes in an XML document. XPath contains built-in standard functions to manipulate string, numeric, and boolean values, and to manipulate sequences and nodes. XPath defines different types of nodes, such as element, attribute, text, namespace, processing instruction, and comment nodes, and the relationships among the nodes in a document.

The XPathNavigator class in the System.Xml.XPath namespace performs XPath queries on a data store such as an XML document, a database, or a DataSet. The XPathNavigator object for an XML document can be created by calling the CreateNavigator method of the XmlNode and XPathDocument classes, which implement the IXPathNavigable interface. The CreateNavigator method returns an XPathNavigator object that can be used to perform XPath queries. The XPathNavigator object reads data from an XML document and enables forward and backward navigation among the nodes. In this way, it provides random access to the nodes. XPathNavigator objects created from XmlDocument objects can be edited, whereas those created from XPathDocument objects are read-only.

The following are the steps to perform XPath queries by using some of the methods of the XPathNavigator class:

  1. First, a set of nodes is selected with an XPath query.


  2. The Select method of the XPathNavigator object selects the set of nodes. It returns an XPathNodeIterator object, which is used to iterate through the selected nodes.


  3. For better performance, the XPathNavigator class also provides the SelectChildren, SelectAncestors, SelectDescendants, and IsDescendant methods. All of these except IsDescendant, which returns a Boolean value, return an XPathNodeIterator object. Calling these methods does not change the state or position of the XPathNavigator object.


  4. After a set of nodes is selected, the XPathNavigator object can move freely among the nodes by using its navigation methods. The following table describes these methods:

    Method              Description
    MoveTo()            Moves the XPathNavigator object to the same position as another specified XPathNavigator object and returns a Boolean value that indicates whether the move succeeded.
    MoveToNext()        Moves the XPathNavigator object to the next sibling of the current node.
    MoveToPrevious()    Moves the XPathNavigator object to the previous sibling of the current node.
    MoveToFirst()       Moves the XPathNavigator object to the first sibling of the current node.
    MoveToFirstChild()  Moves the XPathNavigator object to the first child of the current node. The current node must be the root node or an element node that has child nodes.
    MoveToParent()      Moves the XPathNavigator object to the parent node of the current node. The current node must not be the root node.
    MoveToRoot()        Moves the XPathNavigator object to the root node.
    MoveToId()          Moves the XPathNavigator object to the node that has an attribute of type ID whose value matches the specified string.

  5. In addition to the methods described above, the Evaluate method evaluates an XPath expression and returns the typed result, which can be a number, a string, a Boolean value, or a node set. It is useful for calculations such as counting nodes.
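
The steps above can be sketched in Visual C# .NET as follows. This is a minimal illustration, assuming a read-only XPathDocument built from a string; the class name XPathSketch, the helper methods SelectValues and Count, and the sample data are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

static class XPathSketch
{
    // Sample data modelled on the StudentRecords document shown earlier.
    const string Xml =
        "<StudentRecords>" +
        "<Students><StudentID>SO100</StudentID><StudentName>John Trivolta</StudentName></Students>" +
        "<Students><StudentID>SO101</StudentID><StudentName>Bruce Lee</StudentName></Students>" +
        "</StudentRecords>";

    // Steps 1 and 2: select a node set and iterate through it.
    public static List<string> SelectValues(string xml, string xpath)
    {
        XPathDocument doc = new XPathDocument(new StringReader(xml));
        XPathNavigator nav = doc.CreateNavigator();
        XPathNodeIterator it = nav.Select(xpath);
        List<string> values = new List<string>();
        while (it.MoveNext())
        {
            values.Add(it.Current.Value);
        }
        return values;
    }

    // Step 5: Evaluate returns a typed result; count() yields a double.
    public static double Count(string xml, string xpath)
    {
        XPathNavigator nav = new XPathDocument(new StringReader(xml)).CreateNavigator();
        return (double)nav.Evaluate("count(" + xpath + ")");
    }

    static void Main()
    {
        foreach (string name in SelectValues(Xml, "/StudentRecords/Students/StudentName"))
        {
            Console.WriteLine(name);
        }
        Console.WriteLine(Count(Xml, "//Students"));

        // Step 4: explicit navigation from the root node.
        XPathNavigator nav = new XPathDocument(new StringReader(Xml)).CreateNavigator();
        nav.MoveToFirstChild();   // root node -> <StudentRecords>
        nav.MoveToFirstChild();   // -> first <Students>
        nav.MoveToNext();         // -> second <Students>
        Console.WriteLine(nav.Name);
    }
}
```

XPathDocument is used here because the sketch only reads data. If the nodes need to be edited, an XmlDocument could supply the navigator instead.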


The XML Schema Object Model

The structure of an XML document can be constrained by rules specified in an XML Schema Definition (XSD) file, also known as an XML schema. An XSD file contains the definitions of elements, attributes, and data types. XML Schema is used to define and validate the structure of XML documents. A schema specifies the names of the elements that may appear in a document, and the structure and types those elements must have for the document to be valid against the schema.

An XML schema is itself an XML file, with an .xsd file name extension, that uses XML elements to describe the contents of an XML document. The structure of the document is described by using the simpleType and complexType elements. A simpleType element defines a type that is derived from a built-in data type or another simple type and that cannot contain child elements or attributes. A complexType element defines a type that can contain elements and attributes.

The XML Schema Object Model (SOM) in the System.Xml.Schema namespace consists of a set of classes that read a schema definition from a file and also create schema definition files programmatically. A schema that is created in memory needs to be validated and compiled before it is written to a file. The XML SOM can load and save valid XSD schemas from and to files. In-memory schemas can be created by using strongly typed classes. The XmlSchemaCollection class is used to cache and retrieve schemas, and the XmlValidatingReader class is used to validate XML instance documents.

After a schema definition file is created, the XML Schema Object Model is used to edit it in much the same way as the XML DOM is used to edit XML documents. To validate an XSD file, the Compile method of the XmlSchema class verifies that the schema is semantically correct. If the parser reports an error or a warning during validation, the ValidationEventHandler callback is invoked with the details.

In addition, the XmlSchema.Read method loads a schema from a file. The XML SOM is also used to read and write XSD language schemas from files or other sources using the XmlTextReader, XmlTextWriter, and XmlSchema classes. The following code example in Visual Basic .NET shows how to read an XML schema from a file and then write the schema to a new file:

Imports System.IO
Imports System
Imports System.Xml
Imports System.Xml.Schema
Imports System.Text

Class ReadAndWriteFile
   Public Shared Sub Main()
     Try
        ' Read the schema from Students.xsd and echo it to the console.
        Dim readdata As New XmlTextReader("Students.xsd")
        Dim readwrite As XmlSchema = XmlSchema.Read(readdata, Nothing)
        readdata.Close()
        readwrite.Write(Console.Out)
        Console.WriteLine("Exit Now!")
        Console.Read()
        ' Write the same schema, indented, to NewFile.xsd.
        Dim file1 As New FileStream("NewFile.xsd", FileMode.Create, _
          FileAccess.ReadWrite)
        Dim schemaWriter As New XmlTextWriter(file1, New UTF8Encoding())
        schemaWriter.Formatting = Formatting.Indented
        readwrite.Write(schemaWriter)
        schemaWriter.Close()
      Catch e As Exception
       Console.WriteLine(e)
     End Try
   End Sub
End Class


The following code example in Visual C# .NET shows how to read an XML schema from a file and then write the schema to a new file:

using System.IO;
using System;
using System.Xml;
using System.Xml.Schema;
using System.Text;

class ReadAndWriteFile
{
   public static void Main()
   {
     try
     {
       // Read the schema from Students.xsd and echo it to the console.
       XmlTextReader readdata = new XmlTextReader("Students.xsd");
       XmlSchema readwrite = XmlSchema.Read(readdata, null);
       readdata.Close();
       readwrite.Write(Console.Out);
       Console.WriteLine("Exit Now!");
       Console.Read();
       // Write the same schema, indented, to NewFile.xsd.
       FileStream file1 = new FileStream("NewFile.xsd", FileMode.Create,
         FileAccess.ReadWrite);
       XmlTextWriter schemaWriter = new XmlTextWriter(file1,
         new UTF8Encoding());
       schemaWriter.Formatting = Formatting.Indented;
       readwrite.Write(schemaWriter);
       schemaWriter.Close();
     }
     catch (Exception e)
     {
       Console.WriteLine(e);
     }
   }
}
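
The compile step and the validation callback described earlier can be sketched in Visual C# .NET as follows. This sketch swaps in different classes from the ones the article names: it uses XmlSchemaSet and XmlReaderSettings, which later versions of the .NET Framework provide, in place of XmlSchema.Compile, XmlSchemaCollection, and XmlValidatingReader. The class name SchemaSketch, the method Validate, and the sample schema are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

static class SchemaSketch
{
    // Hypothetical schema: a StudentID element limited to five characters.
    const string Xsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
        "<xs:element name='StudentID'>" +
        "<xs:simpleType><xs:restriction base='xs:string'>" +
        "<xs:maxLength value='5'/>" +
        "</xs:restriction></xs:simpleType>" +
        "</xs:element>" +
        "</xs:schema>";

    // Compiles the schema, validates the document against it, and
    // returns every error or warning reported through the callback.
    public static List<string> Validate(string xsd, string xml)
    {
        List<string> problems = new List<string>();
        ValidationEventHandler onProblem = delegate(object sender, ValidationEventArgs e)
        {
            problems.Add(e.Severity + ": " + e.Message);
        };

        // Read the schema, then compile it to check that it is semantically correct.
        XmlSchema schema = XmlSchema.Read(new StringReader(xsd), onProblem);
        XmlSchemaSet schemas = new XmlSchemaSet();
        schemas.ValidationEventHandler += onProblem;
        schemas.Add(schema);
        schemas.Compile();

        // Validate the instance document while reading it.
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas = schemas;
        settings.ValidationEventHandler += onProblem;

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
        return problems;
    }

    static void Main()
    {
        Console.WriteLine(Validate(Xsd, "<StudentID>SO100</StudentID>").Count);
        foreach (string p in Validate(Xsd, "<StudentID>SO100000</StudentID>"))
        {
            Console.WriteLine(p);
        }
    }
}
```

Because a callback is registered, validation problems are collected instead of being thrown as exceptions, so a single pass can report every problem in the document.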


With XML data created, read, and written, and with XML schemas in place, the remaining task is to validate an XML document by using the XmlValidatingReader class.

Validating an XML Document



Summary

  • XML, or Extensible Markup Language, describes document structure using markup tags delimited by '<' and '>'. In a well-formed XML document, every start tag has a matching end tag. Unlike HTML tags, XML tags are case-sensitive.


  • An XML document is plain text that can be created and edited with any text editor. It carries both human-readable data and a description of that data.


  • The XML Document Object Model, commonly known as XML DOM, is a representation of an XML document in memory. The DOM classes read, write, and manipulate an XML document.


  • The XmlDocument class, derived from the XmlNode class, represents the document node, which is the root of the DOM tree. It provides the Load, LoadXml, and Save methods that load and save XML documents.


  • The XmlTextReader class, derived from the XmlReader class, implements the methods defined by the XmlReader class. The XmlValidatingReader class, also derived from the XmlReader class, reads XML data and supports validation against a document type definition (DTD) or a schema.


  • The XML Path (XPath) language is a query language for finding information in an XML document. XPath also contains built-in standard functions to manipulate string, numeric, and boolean values, and to manipulate sequences and nodes.


  • The structure of an XML document can be constrained by rules specified in an XML Schema Definition (XSD) file, also known as an XML schema. An XSD file contains the definitions of elements, attributes, and data types.


  • The XML Schema Object Model (SOM) in the System.Xml.Schema namespace consists of a set of classes that read a schema definition from a file and also create schema definition files programmatically.


  • The XML SOM is also used to read and write XSD language schemas from files or other sources using the XmlTextReader, XmlTextWriter, and XmlSchema classes.


  • The XmlValidatingReader class, derived from the XmlReader class, provides DTD, XML-Data Reduced (XDR), and XSD schema validation services for an XML document or a fragment of an XML document.


  • A DataSet stores data obtained from a data source such as a relational database or an XML document. A DataSet can be either typed or untyped.


  • ADO.NET can write the DataSet data as a DiffGram, which includes both the original and current versions of the rows. A DiffGram is represented in XML format and is used to store and preserve all versions of the data.


© 2008 uCertify.com. All rights reserved. All trademarks are the property of their respective owners.