Difference between XML and HTML
Posted by Ravi Varma Thumati on May 4, 2009
XML is designed to carry data.
XML describes data and focuses on what the data is, while HTML displays data and focuses on how it looks. HTML is about displaying information; XML is about describing it. Today XML is one of the most common tools for manipulating and transmitting data.
XML is used to store data in files and to share data between diverse applications. Unlike an HTML document, where data and display logic live in the same file, an XML document holds only data. Different presentation logic can then be applied to display the XML data in the required format. This separation makes XML well suited to exchanging information.
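To make the idea of separate presentation logic concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. The note document and the two rendering functions are made up for illustration; they are not part of any standard.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical data-only document; the element names are illustrative.
NOTE_XML = "<note><to>Rohan</to><from>Amit</from></note>"

def as_plain_text(xml_text):
    """Render each child element as a 'name: value' line."""
    root = ET.fromstring(xml_text)
    return "\n".join(f"{child.tag}: {child.text or ''}" for child in root)

def as_html_list(xml_text):
    """Render the same data as an HTML unordered list."""
    root = ET.fromstring(xml_text)
    items = "".join(
        f"<li>{escape(child.tag)}: {escape(child.text or '')}</li>"
        for child in root
    )
    return f"<ul>{items}</ul>"

print(as_plain_text(NOTE_XML))
print(as_html_list(NOTE_XML))
```

The XML itself never changes; only the function that presents it does.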
XML is Free and Extensible
XML tags are not predefined. Authors must “invent” their own tags.
The tags used to mark up HTML documents and the structures of HTML documents are predefined. The author of an HTML document can only use tags that are defined in the HTML standard (like <p>, <h1>, etc.). XML lets authors define their own tags and document structure.
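As a small illustration, the following Python sketch builds a document from tags that exist in no standard. The book, title and price names, the currency attribute and the values are all invented for this example.

```python
import xml.etree.ElementTree as ET

# A made-up vocabulary: none of these tag names is predefined anywhere.
book = ET.Element("book")
ET.SubElement(book, "title").text = "Learning XML"
ET.SubElement(book, "price", currency="INR").text = "450"

# Serialize the tree to a string of XML text.
xml_text = ET.tostring(book, encoding="unicode")
print(xml_text)
```

Any program that agrees on this vocabulary can read the data back; XML itself only requires that the tags be well formed.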
XML Tags are Case Sensitive
Unlike HTML, XML tags are case sensitive. In HTML the following will work, even though the opening and closing tags differ in case:
    <Message>This is incorrect</message>
In XML opening and closing tags must therefore be written with the same case:
    <message>This is correct</message>
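You can check this rule yourself with any XML parser. The sketch below assumes Python's standard xml.etree.ElementTree module; the is_well_formed helper is a name invented for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<Message>This is incorrect</message>"))  # False
print(is_well_formed("<message>This is correct</message>"))    # True
```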
XML Elements Must be Properly Nested
Improper nesting of tags makes no sense to XML.
In HTML some elements can be improperly nested within each other like this:
    <b><i>This text is bold and italic</b></i>
In XML all elements must be properly nested within each other like this:
    <b><i>This text is bold and italic</i></b>
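An XML parser reports the improperly nested version as an error instead of guessing what was meant. Here is a minimal Python sketch, again assuming the standard xml.etree.ElementTree module; parse_error is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

def parse_error(xml_text):
    """Return the parser's error message, or None if the text is well formed."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None

# The closing tags are in the wrong order, so the parser reports an error.
print(parse_error("<b><i>This text is bold and italic</b></i>"))
# Properly nested: no error, so this prints None.
print(parse_error("<b><i>This text is bold and italic</i></b>"))
```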
XML is a Complement to HTML
XML is not a replacement for HTML.
It is important to understand that XML is not a replacement for HTML. In Web development it is most likely that XML will be used to describe the data, while HTML will be used to format and display the same data.
XML Syntax Rules
The syntax rules for XML are simple and strict, which makes them easy to learn and use. Because of this, writing software that reads and manipulates XML is relatively easy. XML also lets users create their own tags.
Note – XML documents use a self-describing and simple syntax
Let’s develop a simple XML document:
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <E-mail>
      <To>Rohan</To>
      <From>Amit</From>
      <Subject>Surprise….</Subject>
      <Body>Be ready for a cruise…i will catch u tonight</Body>
    </E-mail>
The XML declaration: when present, it must be the first line of the XML document.
The XML declaration should always be included. It defines the XML version and the character encoding used in the document. In this case the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
    <?xml version="1.0" encoding="ISO-8859-1"?>
Root Element: The next line defines the first element of the document, called the root element. In the above example it is:
    <E-mail>
Child Elements: The next 4 lines describe the four child elements of the root (To, From, Subject and Body).
    <To>Rohan</To>
    <From>Amit</From>
    <Subject>Surprise….</Subject>
    <Body>Be ready for a cruise…i will catch u tonight</Body>
And finally the last line defines the end of the root element.
    </E-mail>
From this example you can tell that the XML document contains an e-mail to Rohan from Amit. Don’t you agree that XML is quite self-descriptive?
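Because the document is self-descriptive, a program can pull the fields out by name. The sketch below uses Python's standard xml.etree.ElementTree module. It is only an illustration, and the Subject and Body text is shortened to plain ASCII so that it fits the declared ISO-8859-1 encoding.

```python
import xml.etree.ElementTree as ET

# The e-mail document from above, with simplified Subject and Body text.
EMAIL_XML = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<E-mail>
  <To>Rohan</To>
  <From>Amit</From>
  <Subject>Surprise</Subject>
  <Body>Be ready for a cruise</Body>
</E-mail>"""

root = ET.fromstring(EMAIL_XML)

# Iterating over the root element yields its child elements in document order.
fields = {child.tag: child.text for child in root}

print(root.tag)                     # E-mail
print(fields["To"], fields["From"]) # Rohan Amit
```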
Now let’s discuss the syntax rules, which are very simple to learn.
All XML elements must have a closing tag
In XML every element must have a closing tag, like this:
    <To>Rohan</To>
    <From>Amit</From>
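The following Python sketch, using the standard xml.etree.ElementTree module, tries three variants of the same fragment. The sample strings are invented for this example. Note that an empty-element tag such as <To/> counts as closed.

```python
import xml.etree.ElementTree as ET

samples = {
    "closed": "<E-mail><To>Rohan</To></E-mail>",
    "unclosed": "<E-mail><To>Rohan</E-mail>",   # <To> is never closed
    "empty-element": "<E-mail><To/></E-mail>",  # <To/> opens and closes at once
}

results = {}
for name, text in samples.items():
    try:
        ET.fromstring(text)
        results[name] = "ok"
    except ET.ParseError:
        results[name] = "rejected"

print(results)
```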
XML tags are case sensitive
XML tags are case sensitive. The tag <To> is different from the tag <to>. Hence the opening and closing tags must be written with the same case:
    <To>Rohan</To>
    <to>Rohan</to>
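Since <To> and <to> are different tags, a parser treats them as two unrelated elements. A short Python sketch with the standard xml.etree.ElementTree module shows this. The second value, Amit, is changed here only to make the difference visible.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<E-mail><To>Rohan</To><to>Amit</to></E-mail>")

upper = root.find("To")    # matches only <To>
lower = root.find("to")    # matches only <to>
missing = root.find("TO")  # no element is spelled this way, so find() returns None

print(upper.text, lower.text, missing)
```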
XML Elements must be properly nested
Improper nesting of tags makes no sense to XML. In XML all elements must be properly nested within each other, closing in the reverse of the order in which they were opened, like this:
    <b><i>Hi, how are you…..</i></b>
XML Documents Must Have a Root Element
All XML documents must contain a single tag pair that defines the root element. All other elements must be written within this root element. Any element can have sub-elements, called child elements. Sub-elements must be correctly nested within their parent element:
    <root>
      <child>
        <subchild>…..</subchild>
      </child>
    </root>
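The parent and child structure can be walked in code. The Python sketch below (standard xml.etree.ElementTree; outline is a hypothetical helper, and the subchild text is a placeholder) prints the element tree as an indented outline. It also shows that a document with two root elements is rejected.

```python
import xml.etree.ElementTree as ET

ONE_ROOT = "<root><child><subchild>text</subchild></child></root>"
TWO_ROOTS = "<root/><root/>"

def outline(element, depth=0):
    """Return the tag names of an element and its descendants, indented by depth."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

print("\n".join(outline(ET.fromstring(ONE_ROOT))))

try:
    ET.fromstring(TWO_ROOTS)
    two_roots_accepted = True
except ET.ParseError:
    two_roots_accepted = False  # a second root element is not allowed

print(two_roots_accepted)  # False
```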
Always Quote the XML Attribute Values
In XML, attribute values must always be quoted. XML elements can have attributes in name/value pairs, just like in HTML. Compare the two XML documents below.
The error in the first document is that the date and version attributes are not quoted.
    <?xml version=1.0 encoding="ISO-8859-1"?>
    <E-mail date=12/11/2002/>
The second document is correct:
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <E-mail date="12/11/2002"/>
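Here is a quick way to see the difference, sketched in Python with the standard xml.etree.ElementTree module. The read_date helper is a name invented for this example. Single quotes are accepted as well as double quotes.

```python
import xml.etree.ElementTree as ET

QUOTED = '<E-mail date="12/11/2002"/>'
SINGLE_QUOTED = "<E-mail date='12/11/2002'/>"
UNQUOTED = "<E-mail date=12/11/2002/>"

def read_date(xml_text):
    """Return the date attribute, or None if the parser rejects the document
    (or the element has no date attribute)."""
    try:
        return ET.fromstring(xml_text).get("date")
    except ET.ParseError:
        return None

print(read_date(QUOTED))         # 12/11/2002
print(read_date(SINGLE_QUOTED))  # 12/11/2002
print(read_date(UNQUOTED))       # None
```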
With XML, White Space is preserved
With XML, the white space in a document is preserved, whereas HTML collapses a run of spaces into a single space.
So a sentence written with several spaces after "Hello" will be kept exactly as written, like this:
    Hello           How are you
Comments in XML
The syntax for writing comments in XML is similar to that of HTML.
    <!-- This is a comment -->
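Comments are for human readers; most programs skip them. As a sketch with Python's standard xml.etree.ElementTree module, the default parser drops the comment when it builds the element tree, while ET.Comment creates one when you write XML yourself.

```python
import xml.etree.ElementTree as ET

DOC = "<E-mail><!-- This is a comment --><To>Rohan</To></E-mail>"

root = ET.fromstring(DOC)

# The default parser discards comments, so only the <To> element remains.
tags = [child.tag for child in root]
print(tags)  # ['To']

# Creating a comment node and serializing it back to text.
comment_text = ET.tostring(ET.Comment(" This is a comment "), encoding="unicode")
print(comment_text)
```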