Microsoft.NET

……………………………………………….Expertise in .NET Technologies

ADO.NET Core Classes:Other .NET XML Classes – Part16

Posted by Ravi Varma Thumati on December 4, 2009

The .NET framework provides a full complement of classes for XML manipulation, which are found in the System.Xml family of namespaces (see Table 17-5).

Table 17-5. XML classes in .NET
Class Description
XmlDocument
An in-memory representation of an XML document. You can navigate freely through its structure, adding, removing, inserting, or modifying nodes.
XmlSchema
An in-memory representation of an XML schema document.
XmlTextReader and XmlTextWriter Allows you to quickly retrieve XML data from or save it to a file or stream. These classes provide optimized forward-only access.
XmlValidatingReader
Similar to XmlReader in that it provides read-only, forward-only access to an XML file or a stream. Unlike XmlReader, XmlValidatingReader throws an exception if the XML document violates the rules in a specified schema.
XslTransform
Allows you to perform XSLT transformations to change an in-memory XML document to another XML document.

These classes offer XML-specific functionality that isn’t directly available through the ADO.NET objects. For example, if you load a serialized DataSet using an XML file and XSD schema, you may not receive a warning if the XML document violates the schema; typically, you’ll just lose some of the data. In cases in which you aren’t sure if the XML file fits a given schema (or, in other words, you aren’t sure that the data in an XML file pertains to a given table or a given version of that table), you can run through the XML file with the XmlValidatingReader before loading it into the DataSet.

XmlDataDocument Object Overview

There is one .NET object that combines an XML view with a DataSet—the XmlDataDocument. The XmlDataDocument inherits from XmlDocument, so it provides the same node-based iteration ability and XPath querying methods. It also provides a single additional property, XmlDataDocument.DataSet, which provides the relational representation of XML data.

When using the XmlDataDocument, you can start with either an XML document or a DataSet. To create an XmlDataDocument based on a DataSet, use the overloaded constructor that accepts a DataSet instance:

XmlDataDocument dataDoc = new XmlDataDocument(ds);

Although you can use the XmlDataDocument.Nodes collection to modify the information in the DataSet, it becomes much more difficult to take into account referential integrity and other database-specific concepts. It can be useful if you want to execute an XPath query or perform an XSLT transform (as shown in the next two examples) or take advantage of some other XML-specific feature. However, it won’t replace the tasks you can accomplish through the DataSet model.

Another useful role for the XmlDataDocument is to provide a DataSet projection on an XML file. To use this approach, you create the XmlDataDocument, load a schema into the DataSet (this is a required step), and then load the XML file using the XmlDataDocument.Load( ) method:

XmlDataDocument dataDoc = new XmlDataDocument();

dataDoc.DataSet.ReadXmlSchema(“myschema.xsd”);

dataDoc.Load(“myfile.xml”)

The primary advantage of this approach, versus just using the DataSet directly, is that you don’t lose any information that isn’t defined in the schema. The DataSet acts as a projection of a portion of the information. If you load a document with an extra element that’s not included in the DataSet schema, you won’t be able to access that column through the XmlDocument.DataSet object. However, you can access it directly through the XmlDocument nodes, and you won’t lose the information when you call the XmlDocument.Save( ) method.

Searching a DataSet with XPath

You can use the XmlDataDocument to perform an XPath query. Example 17-4 demonstrates how you can create the XmlDataDocument for an existing DataSet and then select nodes using an XPath query.

Example 17-4. Searching a DataSet with XPath

// … create and populate a DataSet (ds) with the query:

// SELECT CategoryID, CategoryName, Description  FROM Categories

// Create the XML data document interface.

XmlDataDocument dataDoc = new XmlDataDocument(ds);

// Perform search for all Categories that have a

// CategoryName which starts with “C”.

string XPath;

XPath = @”//Categories[starts-with(child::CategoryName, ‘C’)]”;

XmlNodeList nodes = dataDoc.DocumentElement.SelectNodes(XPath);

foreach (XmlNode node in nodes)

{

Console.Write(“\nMatch: “);

foreach (XmlNode child in node.ChildNodes)

{

Console.WriteLine(child.InnerText);

}

}

The XPath query roughly translates into “get all the nodes named Categories where the contained CategoryName element has a value that starts with the letter ‘C.'” The code then iterates through the results and produces the following console output:

Match: 2

Condiments

Sweet and savory sauces, relishes, spreads, and seasonings

Match: 3

Confections

Desserts, candies, and sweet breads

Transforming DataSet XML with XSLT

Another reason you might use the XmlDataDocument is so you can apply an XSL transformation. XSL is a language that defines transformations. These transformations allow you to convert an XML document into an XML document with a different format or an entirely different document (such as an HTML page). XSL documents are themselves written as XML documents that use a specialized XML grammar. This is usually called the XSL transformation document, and is given the extension .xsl or .xslt. Essentially, the XSL document maps the source XML to a new document. In .NET, XSL transformations are performed by the XslTransform class in the System.Xml.Xsl namespace.

The XSL standard (defined at http://www.w3.org/TR/xslt) is beyond the scope of this book. However, it’s easy to illustrate the fundamentals with a quick example. Consider, for example, this simple XSL document:

<xsl:stylesheet xmlns:xsl=”http://www.w3.org/1999/XSL/Transform&#8221;

version=”1.0″>

<xsl:template match=”CustomerDataSet”>

This is the Customer DataSet.

(If this is an HTML document, we could insert opening tags here.)

Here’s the customer list:

<xsl:apply-templates select=”Customers”/>

(Closing tags belong here.)

</xsl:template>

<xsl:template match=”Customers”>

Customer: <xsl:value-of select=”ContactName”/>

Phone: <xsl:value-of select=”Phone”/></xsl:template>

</xsl:stylesheet>

The document matches two elements in the XML document. First, it matches CustomerDataSet as the root, which must be the name of the DataSet you create. Once this match is made (on the fourth line), a few lines of ordinary text are output into the new document. Then, the Customers element is matched, which is the name of the table rows. Every time a customer row is found, the text from the ContactName and Phone elements are output.

Example 17-5 shows a console application that uses this XSL document. Note that the DataSet is carefully generated to use the matching element names CustomerDataSet and Customers. If it doesn’t, the XSL transform will fail. The source XML for the transformation is supplied directly from the XmlDataDocument. The target document is the output stream for the console window.

Example 17-5. Transforming a DataSet with XSL

using System;

using System.Data;

using System.Data.SqlClient;

using System.Xml;

using System.Xml.Xsl;

public class XslTransformDataSet

{

private static string connectionString = “Data Source=localhost;” +

“Initial Catalog=Northwind;Integrated Security=SSPI”;

public static void Main()

{

string SQL = “SELECT TOP 5 * FROM Customers”;

// Create ADO.NET objects.

SqlConnection con = new SqlConnection(connectionString);

SqlCommand com = new SqlCommand(SQL, con);

SqlDataAdapter adapter = new SqlDataAdapter(com);

DataSet ds = new DataSet(“CustomerDataSet”);

// Execute the command.

try

{

con.Open();

adapter.FillSchema(ds, SchemaType.Mapped, “Customers”);

adapter.Fill(ds, “Customers”);

}

catch (Exception err)

{

Console.WriteLine(err.ToString());

}

finally

{

con.Close();

}

// Create the XML data document interface.

XmlDataDocument dataDoc = new XmlDataDocument(ds);

// Create the XSL transform object.

XslTransform xsl = new XslTransform();

xsl.Load(“transform.xsl”);

// Perform the transformation.

xsl.Transform(dataDoc, null, Console.Out);

}

}

When this code is executed, you’ll see the following output:

<?xml version=”1.0″ encoding=”IBM437″?>

This is the Customer DataSet.

(If this is an HTML document, we could insert opening tags here.)

Here’s the customer list:

Customer: Maria Anders, Phone: 030-0074321

Customer: Ana Trujillo, Phone: (5) 555-4729

Customer: Antonio Moreno, Phone: (5) 555-3932

Customer: Thomas Hardy, Phone: (171) 555-7788

Customer: Christina Berglund, Phone: 0921-12 34 65

(Closing tags belong here.)

Using the Data Objects to Edit XML

Editing relational tables through XML objects is rarely a good idea. You immediately introduce synchronization problems and lose the ability to enforce referential and identity integrity. There’s a more useful side of the ADO.NET and XML, however: using the ADO.NET data objects to work with an ordinary XML file.

For example, your program might need to store user-specific settings. One choice for a location to store these settings is the registry, but this isn’t always appropriate, particularly if users are regularly logging in to the program from different workstations. Another approach might be to use a central database server. However, you may not have the necessary database server in place, the program may need to support local (disconnected use), or you may not want to introduce additional network traffic. In these cases, you can adopt a different approach and assign a small XML configuration file to each user. To read and write the configuration file, use the DataSet.ReadXml( ) and DataSet.WriteXml( ) methods.

To evaluate this technique, here are a few considerations:

  • This approach is best if you want to interact with the data using a table and row-based syntax. You don’t interact natively with the XML.
  • This approach has the same limitations as any other file-based storage. If you need to write large amounts of data, it’s slow, and there is no way to manage concurrent updates. If you need these features, you need a full-fledged RDBMS.
  • This approach works best if you create an XML schema to describe your data format. Otherwise, type conversion errors and other schema inference problems are possible.
  • This approach works only if your XML document follows a DataSet-like organization of elements. This leads to problems for XML documents with deeply nested structures or XML documents that duplicate some elements to represent many-to-many relationships.

Inferring XML Structure

When using the ADO.NET objects to process an ordinary XML file, you should always create an associated schema. Otherwise, ADO.NET will attempt to infer a schema based on the structure of the XML document. This schema may not always be appropriate.

If you don’t provide a schema, you can still load XML data into a DataSet. However, you’re likely to discover a number of ADO.NET idiosyncrasies. The process ADO.NET uses to create a “best guess” schema by reading an XML document is called schema inference.

The DataSet follows a rigid set of inference rules:

  • Elements with attributes become tables. Attributes are inferred as columns.
  • Elements with child elements become tables. Elements that have no attributes or child elements and don’t repeat are inferred as columns.
  • Elements that repeat are inferred as a single table.
  • If the document or root element has no attributes and no child elements that would be inferred as columns, it is inferred as a DataSet. Otherwise, the document element is inferred as a table.
  • For elements that are inferred as tables and contain text, but have no child elements, a new column named TableName_Text is created. The text of each of the elements is placed in this column. If an element is inferred as a table and has text and child elements, any contained text is ignored.
  • If ADO.NET infers that there is a table element nested with another table element, it automatically creates a nested DataRelation between the two tables. A new, primary key column named TableName_Id is added to both tables and used by the DataRelation. A ForeignKeyConstraint is created between the two tables using the TableName_Id column.

The last point is particularly interesting. With a nested table, ADO.NET always generates a new primary key column in the parent and a new foreign key column in the child, even if you already have elements that could serve this purpose.

Finally, note that schema inference has another major limitation: it doesn’t detect data types. Even if you have numeric ID values, they are interpreted as strings.

SQL Server 2000 XML

Finally, SQL Server also provides its own direct support for XML. By using the FOR XML clause in a SELECT query, you indicate that the results should be returned as XML. This technique is a bit of a compromise. Even though it provides XML-savvy development houses with an easy way to work natively with XML, it’s also unavoidably specific to SQL Server, and therefore won’t suit if you need the flexibility to migrate to (or incorporate data from) another platform such as Oracle or DB/2.

By default, the SQL Server XML representation isn’t a full XML document. Instead, it simply returns the result of each record in a separate element, with all the fields as attributes (a marked different from ADO.NET’s default, which includes all fields as elements).

For example, the query:

SELECT CategoryID, CategoryName, Description FROM Categories FOR XML AUTO

returns the following XML document:

<categories categoryID=”1″ categoryname=”Beverages” description=”Soft

drinks, coffees, teas, beers, and ales”/>

<categories categoryID=”2″ categoryname=”Condiments” description=”Sweet and

savory sauces, relishes, spreads, and seasonings”/>

<!– Other categories omitted. –>

It’s possible to reverse SQL Server’s preference by adding the ELEMENTS keyword to the end of your query. For example, the query:

SELECT CategoryID, CategoryName, Description FROM Categories

FOR XML AUTO ELEMENTS

returns the following document:

<Categories>

<CategoryID>1</CategoryID>

<CategoryName>Beverages</CategoryName>

<Description>Soft drinks, coffees, teas, beers, and ales</Description>

</Categories>

<Categories>

<CategoryID>2</CategoryID>

<CategoryName>Condiments</CategoryName>

<Description>Sweet and savory sauces, relishes, spreads, and

seasonings”/>

</Categories>

<!– Other categories omitted. –>

Note that setting the format is an all-or-nothing decision. If you want to provide a more sophisticated XML document that follows a set format (i.e., some fields are represented as attributes, while others are columns) you must master the much more complex and much less compact FOR XML EXPLICIT syntax, which isn’t described in this book. For more information, refer to SQL Server Books Online.

Finally, you can add the XMLDATA clause to return a pregenerated schema at the beginning of your document. However, this clause isn’t of much use because the schema is based on Microsoft’s XDR standard, which was proposed before the XSD standard was accepted. As a result, the schemas generated by SQL Server aren’t recognized by most non-Microsoft XML parsers and will likely be replaced in future SQL Server versions.

In ADO.NET, you can retrieve this document using the SqlCommand.ExecuteXmlReader( ) method. This returns an XmlReader object that provides access to the returned XML.

Example 17-6 shows how to retrieve the query shown earlier and write it to a console window.

Example 17-6. Using SQL Server 2000 direct XML support

using System;

using System.Data;

using System.Data.SqlClient;

using System.Xml;

public class DirectXML

{

private static string connectionString = “Data Source=localhost;” +

“Initial Catalog=Northwind;Integrated Security=SSPI”;

public static void Main()

{

string SQL = “SELECT CategoryID, CategoryName, Description ” +

“FROM Categories FOR XML AUTO”;

// Create ADO.NET objects.

SqlConnection con = new SqlConnection(connectionString);

SqlCommand com = new SqlCommand(SQL, con);

// Execute the command.

try

{

con.Open();

XmlReader reader = com.ExecuteXmlReader();

while (reader.Read())

{

Console.WriteLine(“Element: ” + reader.Name);

if (reader.HasAttributes)

{

for (int i = 0; i < reader.AttributeCount; i++)

{

reader.MoveToAttribute(i);

Console.Write(“\t”);

Console.Write(reader.Name + “: “);

Console.WriteLine(reader.Value);

}

// Move back to the element node.

reader.MoveToElement();

Console.WriteLine();

}

}

reader.Close();

}

catch (Exception err)

{

Console.WriteLine(err.ToString());

}

finally

{

con.Close();

}

}

}

The results for the first two rows are shown here:

Element: Categories

CategoryID: 1

CategoryName: Beverages

Description: Soft drinks, coffees, teas, beers, and ales

Element: Categories

CategoryID: 2

CategoryName: Condiments

Description: Sweet and savory sauces, relishes, spreads

One other interesting ability of the FOR XML AUTO command is that it automatically infers relations with JOIN queries and creates XML documents with a nested structure. For example, the query:

SELECT CategoryName, ProductName Description FROM Categories INNER JOIN

Products ON Products.CategoryID = Categories.CategoryID FOR XML AUTO

creates the following XML document:

<Categories CategoryName=”Beverages”>

<Products Description=”Chai”/>

<Products Description=”Chang”/>

</Categories>

<Categories CategoryName=”Condiments”>

<Products Description=”Aniseed Syrup”/>

<!– Other categories and products omitted. –>

To disable this behavior, use the FOR XML RAW syntax instead, which always returns a rigid single-grid XML result. The XML RAW option also gives every row element the name row instead of the name of the table (for example, Categories).

You can also use variations of the FOR XML EXPLICIT syntax to specify nearly every aspect of how the returned XML document should look, and the OPENXML statement to retrieve an XML document from a file and process it in a stored procedure. For more information about the direct support for XML in SQL Server, consult the SQL Server 2000 Books Online.

The SQLXML Provider

Microsoft also provides a special ADO.NET provider designed exclusively with SQL Server and its XML support in mind. This provider isn’t included with .NET, although you can download it online from MSDN at http://msdn.microsoft.com/downloads/sample.asp?url=/msdn-files/027/001/824/msdncompositedoc.xml.

The SQLXML provider isn’t in all respects a true ADO.NET provider. For example, it provides only three managed objects: SqlXmlCommand, SqlXmlParameter, and SqlXmlAdapter. These objects don’t implement the standard interfaces, and there is no collection class (it is encapsulated by SqlXmlCommand).

The SqlXmlCommand class is the heart of the SQLXML provider. You choose the format of command by setting the SqlXmlCommand.CommandType property. Table 17-6 lists valid CommandType values.

Table 17-6. SqlXmlCommandType values
Property Description
Sql
Uses SQL text (similar to the standard SQL Server provider), for example: SELECT * FROM Employees FOR XML AUTO
XPath
Uses an XPath query; for example, Employees[@EmployeeID=1]
Template
Executes a SQL command defined in an XML template document
TemplateFile
Executes a SQL command defined in an XML template document that’s stored in a file
UpdateGram
Directly executes an UpdateGram, which is a predecessor to the DiffGram
Diffgram
Directly executes an ADO.NET DiffGram, which defines DataSet changes

When you use SQLXML with SQL Server and XML, you have several options. You can:

  • Transform rowsets to XML on the client side, not the server side. This can lessen the work the server needs to perform.
  • Submit XPath queries directly (rather than first retrieving the XML document and than searching it).
  • Submit batch updates as a DiffGram.

We’ll concentrate on these three features in the remainder of this chapter. In addition, the SQLXML provider duplicates some features provided by the standard SQL Server provider, such as the ability to execute a FOR XML query and capture the results with an XmlReader.

Converting to XML on the Client-Side

With FOR XML queries, SQL Server performs a query, converts it to XML, and returns the XML stream to the client. This has the potential for a minor performance penalty, and the network bandwidth required to send an XML document is always greater than that required for SQL Server’s optimized TDS interface, which sends a stream of proprietary binary data. To reduce this effect in performance-sensitive applications, you can use the SQLXML provider’s ability to convert a result set to XML on the client side. The resulting document takes the exact same form as if the server had performed the work.

The following code snippet shows this technique:

string connectionString = “Data Source=localhost;” +

“Initial Catalog=Northwind;Integrated Security=SSPI”;

// Create the command (which encapsulates a connection).

SqlXmlCommand cmd = new SqlXmlCommand(connectionString);

// Create the XML on the client.

cmd.ClientSideXml = true;

// Define the command.

cmd.CommandText = “SELECT * FROM Customers FOR XML AUTO”;

// Get the XML document.

XmlReader r = cmd.ExecuteReader();

One reason to use the ClientSideXml property is to wrap a stored procedure that doesn’t return an XML document; the data will convert to XML seamlessly. For example, consider the following stored procedure that retrieves a list of customers and the products they have ordered:

CREATE PROCEDURE CustOrderHist (@CustomerID nchar(5))

AS

SELECT ProductName, Total=SUM(Quantity)

FROM Products P, [Order Details] OD, Orders O, Customers C

WHERE C.CustomerID = @CustomerID

AND C.CustomerID = O.CustomerID AND O.OrderID = OD.OrderID

AND OD.ProductID = P.ProductID

GROUP BY ProductName

GO

You can execute this stored procedure and convert the result to an XML document on the client side with the following code:

string connectionString = “Data Source=localhost;” +

“Initial Catalog=Northwind;Integrated Security=SSPI”;

SqlXmlCommand cmd = new SqlXmlCommand(connectionString);

SqlXmlParameter p = cmd.CreateParameter();

p.Value = “ALFKI”;

// Define the command.

cmd.CommandText = “exec CustOrderHist ? FOR XML AUTO”;

cmd.ClientSideXml = true;

// Get the XML document.

XmlReader r = cmd.ExecuteReader();

This example also illustrates the slightly different code used to call stored procedures with the SQLXML provider. Unlike other ADO.NET providers, you don’t need to define the data type of the parameters used.

Submitting Direct XPath Queries

With the ordinary SQL Server provider, you must retrieve data using a SQL query before you can search it with XPath. The SQLXML provider removes this restriction. Performing an XPath query is as easy as setting the SqlXmlCommand.XPath property.

For example, consider the following XML document that defines an XPath query to select the FirstName and LastName fields from the Customers table:

<xsd:schema xmlns:xsd=”http://www.w3.org/2001/XMLSchema&#8221;

xmlns:sql=”urn:schemas-microsoft-com:mapping-schema”>

<xsd:element sql:relation=”Employees” >

<xsd:complexType>

<xsd:sequence>

<xsd:element

sql:field=”FirstName”

type=”xsd:string” />

<xsd:element

sql:field=”LastName”

type=”xsd:string” />

</xsd:sequence>

<xsd:attribute />

</xsd:complexType>

</xsd:element>

</xsd:schema>

You can use the defined Emp XPath query in a SqlXmlCommand as follows:

string connectionString = “Data Source=localhost;” +

“Initial Catalog=Northwind;Integrated Security=SSPI”;

SqlXmlCommand cmd = new SqlXmlCommand(connectionString);

SqlXmlParameter p = cmd.CreateParameter();

p.Value = “ALFKI”;

// Define the command.

cmd.CommandText = “Emp”;

cmd.CommandType = SqlXmlCommandType.XPath;

cmd.RootTag = “ROOT”;

cmd.SchemaPath = “XPathDoc.xml”;

// Get the XML document.

XmlReader r = cmd.ExecuteReader();

Note that the XPath document is loaded from the file identified by SqlXmlCommand.SchemaPath.

Batch Updates with the DiffGram

The SQLXML provider can also submit changes in a single batch operation using a DiffGram. In fact, if you use the SqlXmlAdatper to update a data source from a DataSet, this behavior takes place automatically, although you may not realize it.

For example, consider the following snippet of code that fills a DataSet and then applies changes to the data source:

SqlXmlCommand cmd = new SqlXmlCommand(connectionString);

cmd.CommandText = “SELECT * FROM Customers FOR XML AUTO”;

SqlXmlAdapter adapter = new SqlXmlAdapter(cmd);

DataSet ds = new DataSet();

// Fill the DataSet.

adapter.Fill(ds);

// (Modify the DataSet here.)

// Apply changes using the DiffGram.

ad.Update(ds);

When the SqlXmlAdapter.Update( ) method is invoked, the SqlXmlAdapter doesn’t step through the rows one by one looking for changes. Instead, it receives the DiffGram directly, and submits that document. The process is transparent to the .NET programmer.

Advertisements

One Response to “ADO.NET Core Classes:Other .NET XML Classes – Part16”

  1. Now you could have your new site and you’re eager to start making some sales! But, how are you going to make gross sales in case you wouldn’t have excessive volumes of tourists to your website?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: