Coming straight to the point, XML stands for EXtensible Markup Language. As the name suggests, it is a language that can be molded according to the need of the hour. Personally, I don't think calling it a language is justified, but I fanatically follow the ordinance of the techie Gurus, and if they say it is a language, then it is a language.
As is the case with the famous HTML, XML too is a markup language, but its main power, unlike HTML, lies in the ability to describe data without harassing the over-worked server. Since most of the work is done on the client's machine (the computer of the person who is running the browser), there is less strain on the server.
It uses DTDs (Document Type Definitions) to properly define and organize data. But the presence of DTDs is not mandatory, and their absence does not hinder the performance of the web page.
The most interesting thing I have found in XML is that the tags are not pre-defined: you can create your own tags and use them as and when required. XML specifies neither semantics nor a tag set. In fact XML is really a meta-language for describing markup languages. In other words, XML provides a facility to define tags and the structural relationships between them. Since there's no predefined tag set, there can't be any preconceived semantics. All of the semantics of an XML document will be defined either by the applications that process it or by style-sheets.
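To make the point about application-defined semantics concrete, here is a minimal Python sketch. The <recipe> document and all of its tag names are invented for this illustration. The standard library's xml.etree.ElementTree parses whatever tags it is handed, and it is only the application code that decides what "minutes" means.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: none of these tags are predefined anywhere.
DOC = """<recipe>
  <title>Masala chai</title>
  <step minutes="2">Boil water with spices</step>
  <step minutes="3">Add tea leaves and milk</step>
</recipe>"""

root = ET.fromstring(DOC)

# The parser only reports structure; treating "minutes" as a duration
# to be summed is a decision made by this application, not by XML.
total_minutes = sum(int(step.get("minutes")) for step in root.iter("step"))
title = root.findtext("title")

print(f"{title}: {total_minutes} minutes")
```

A different program could read the very same file and ignore the minutes altogether, which is exactly what "no preconceived semantics" amounts to.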
XML allows users to:
* bring multiple files together to form compound documents
* identify where illustrations are to be incorporated into text files, and the format used to encode each illustration
* provide processing control information to supporting programs, such as document validators and browsers
* add editorial comments to a file.
It is important to note, however, that XML is not:
* a predefined set of tags, of the type defined for HTML, that can be used to mark up documents
* a standardized template for producing particular types of documents.
XML was not designed to be a standardized way of coding text: in fact it is impossible to devise a single coding scheme that would suit all languages and all applications. Instead XML is a formal language that can be used to pass information about the component parts of a document to another computer system. XML is flexible enough to be able to describe any logical text structure, whether it be a form, memo, letter, report, book, encyclopedia, dictionary or database.
The primary goal of XML is to enable SGML-coded data to be served, received, and processed on the Web in a way that is as easy as that currently made possible by the fixed SGML tag set provided by HTML. Ok, SGML stands for Standard Generalized Markup Language. SGML was designed in the 1980s as a tool to enable technical documentation and other forms of publishable data to be interchanged between authors, publishers and those responsible for the production of printed copies of data sets. By providing a formal definition of the component parts of a publishable information set, SGML made it possible to verify the correct transmission and receipt of interchanged data sets. It was soon found that these techniques are applicable in areas other than those directly related to publications. For example, SGML is often used as a neutral data format when moving data between databases as part of multinational projects.
XML has been designed for ease of implementation and for interoperability with both SGML and HTML. Unlike early versions of SGML and HTML, XML has been based from the very start on the ISO 10646 Universal Multi-Octet Coded Character Set (UCS, which includes the codes that make up the Unicode character set) so that it can be used in all major trading nations.
XML Part 2
Amrit Hallan
http://www.bytesworth.com/Am... Hallan is a freelance web designer. For all web site development and web promotion needs, you can get in touch with him at http://www.bytesworth.com. For more such articles, visit http://www.bytesworth.com/ar... and http://www.bytesworth.com/le... You can subscribe to his newsletter [BYTESWORTH REACHOUT] on Web Designing Tips & Tricks by sending a blank email at bytesworth-subscribe@topica.com
In order to work with XML, it becomes necessary to know a little about the Extensible Stylesheet Language (XSL).
XSL provides for two forms of output flow objects. The first set is the set of displayable objects defined for HTML, which allows XML data to be mapped into HTML-aware browsers. The second set is based on the DSSSL-O specifications (Document Style Semantics and Specification Language - Online), and allows XML data to be mapped to DSSSL-based text formatters, such as JADE. Both sets of flow objects are described using XML markup.
XSL defines a set of rules which define a set of actions that are to be associated with various patterns of target elements. The selection of target elements can be qualified in a number of ways. For example, XSL allows different rules to be applied to the same element type dependent on what its ancestors, siblings or contents are. In addition, processing rules can be specified for application when particular attribute values have been associated with an element, or when the element has specific contents. This means that specific rules can be applied to elements with unique identifiers or identified content types (classes).
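The rule-and-pattern idea can be sketched in a few lines of Python. Please note that this is not XSL and does not use any XSL processor: the memo document, the RULES list and the render function are all made up for this illustration. It only shows how one element type can receive different treatment depending on its parent or on an attribute value.

```python
import xml.etree.ElementTree as ET

# Hypothetical input document.
DOC = ("<memo><title>Budget</title>"
       "<body><title>Details</title><para class='warning'>Late!</para></body>"
       "</memo>")
root = ET.fromstring(DOC)

# ElementTree has no parent pointers, so build a child -> parent lookup.
parent_of = {child: parent for parent in root.iter() for child in parent}

def parent_tag(element):
    parent = parent_of.get(element)
    return parent.tag if parent is not None else None

# Each rule pairs a pattern test with an action. The first match wins,
# so the more specific rule for <title> is listed before the general one.
RULES = [
    (lambda e: e.tag == "title" and parent_tag(e) == "memo", lambda e: e.text.upper()),
    (lambda e: e.tag == "title", lambda e: e.text),
    (lambda e: e.get("class") == "warning", lambda e: "!! " + e.text),
]

def render(element):
    output = []
    for e in element.iter():          # document order
        for matches, action in RULES:
            if matches(e):
                output.append(action(e))
                break
    return output

result = render(root)
print(result)
```

The two <title> elements come out differently because the first rule looks at the ancestor, while the <para> is picked up purely by its class attribute.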
XSL allows for the definition of sharable sets of style rules. A style rule applies a set of processing characteristics to a target element without creating a new flow object. Where the same style is to be applied to a number of elements, a uniquely named style can be defined for future reference. This provides XSL with the facilities for creating cascading sets of style sheet specifications similar in effect to those defined in the more limited Cascading Style Sheet specification used to process HTML documents.
XSL style sheets can use the ECMAScript programming language to evaluate the contents of elements or attributes prior to or during the creation of flow objects. ECMAScript is the standardized form of JavaScript and JScript, formally defined by the European Computer Manufacturers Association (ECMA). It allows tools containing an ECMAScript interpreter to process data contained within an XML document. The language has been designed to support only a limited set of processing side-effects to ensure that evaluation cannot inhibit the progressive rendering of large documents.
Now coming back to XML, it was originally developed to allow structured documents of the type typically encoded in SGML to be delivered over the Internet as an integrated part of the World Wide Web of documents. Typically these documents require the specification of element types over and above those permitted in HTML (e.g. specific elements for part numbers and other forms of article identification, prices and other forms of calculable measurements, and special classes of displayable text such as health warnings and controlled task lists). XML allows users to define their own sets of document elements and describe how each of these elements should be displayed on a screen in conformance with the supplier's house style.
One area where XML is anticipated to be particularly important is in the area of electronic commerce.
Traditional mechanisms for electronic data interchange (EDI) are based on the interchange of messages between the computer systems of two or more businesses. Each message has to be decoded before its contents can be processed or presented to users. Web-based commerce has, by contrast, been based on the concept of completing an HTML form and then posting the results back to the server for processing, without any details of the transaction being retained by the party completing the form.
XML-coded files are, by their nature, ideal for storing in databases. Because XML files are both object-orientated and hierarchical in nature they can be adapted to virtually any type of database, though care sometimes needs to be taken to ensure that enough structural data is retained in the database to reconstruct the original file.
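Here is a rough sketch of what "retaining enough structural data" can mean in practice, using Python's built-in sqlite3 module. The table layout (node and attr), the store and load helpers, and the <order> document are all assumptions made for this example. Each row remembers its parent and its position among its siblings, which is the structural data needed to rebuild the file. Text that follows a closing tag (the "tail" in ElementTree terms) is deliberately ignored to keep the sketch short.

```python
import sqlite3
import xml.etree.ElementTree as ET

DOC = "<order id='17'><item>Widget</item><item>Gadget</item></order>"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, "
             "pos INTEGER, tag TEXT, text TEXT)")
conn.execute("CREATE TABLE attr (node INTEGER, name TEXT, value TEXT)")

def store(element, parent=None, pos=0):
    """Insert one element and, recursively, its children. Returns the row id."""
    cur = conn.execute(
        "INSERT INTO node (parent, pos, tag, text) VALUES (?, ?, ?, ?)",
        (parent, pos, element.tag, element.text))
    node_id = cur.lastrowid
    for name, value in element.attrib.items():
        conn.execute("INSERT INTO attr VALUES (?, ?, ?)", (node_id, name, value))
    for index, child in enumerate(element):
        store(child, node_id, index)
    return node_id

def load(node_id):
    """Rebuild an element tree from the rows, using parent and pos."""
    tag, text = conn.execute(
        "SELECT tag, text FROM node WHERE id = ?", (node_id,)).fetchone()
    element = ET.Element(tag)
    element.text = text
    for name, value in conn.execute(
            "SELECT name, value FROM attr WHERE node = ?", (node_id,)).fetchall():
        element.set(name, value)
    children = conn.execute(
        "SELECT id FROM node WHERE parent = ? ORDER BY pos", (node_id,)).fetchall()
    for (child_id,) in children:
        element.append(load(child_id))
    return element

root_id = store(ET.fromstring(DOC))
rebuilt = ET.tostring(load(root_id), encoding="unicode")
print(rebuilt)
```

Drop the pos column and the two <item> elements could come back in either order, which is the kind of loss the paragraph above warns about.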
Data stored using non-XML notations will need appropriate application software to process it, but the XML-coded file will correctly identify where each piece of such data belongs in the completed document and where it has been stored prior to use.
By storing data in the clearly defined format provided by XML you can ensure that your data will be transferable to a wide range of hardware and software environments. New techniques in programming and processing data will not affect the logical structure of your document's message. If more detail needs to be added to the file all you need to do is to update the model and then add new markup tags where required in the document instance. If a completely new style is required then the existing document model can be linked to the new one to provide automatic updating of document structures.
XML Part 3
XML documents (and HTML documents) are made up by the following building blocks:
* Elements
* Tags
* Attributes
* Entities
* PCDATA
* CDATA
This is a brief explanation of each of the building blocks:
Elements are the main building blocks of both XML and HTML documents.
Examples of HTML elements are "body" and "table". Examples of XML elements could be "my-schedule" and "date". Elements can contain text, other elements, or be empty. Examples of empty HTML elements are "hr", "br" and "img".
Tags are used to mark up elements.
A starting tag like <element_name> marks the beginning of an element, and an ending tag like </element_name> marks the end of an element.
Examples:
A body element: <body>body text in between</body>
A message element: <message>some message in between</message>
Attributes provide extra information about elements.
Attributes are placed inside the start tag of an element. Attributes come in name/value pairs. The following "img" element carries additional information about a source file:
<img src="computer.gif" />
The name of the element is "img". The name of the attribute is "src". The value of the attribute is "computer.gif". Since the element itself is empty it is closed by a " /".
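If you would like to see how a parser reports these pieces, the following small Python sketch feeds the same "img" element to the standard library's xml.etree.ElementTree. The variable names are my own; nothing else is assumed.

```python
import xml.etree.ElementTree as ET

# The empty element from the text, closed with " /".
img = ET.fromstring('<img src="computer.gif" />')

element_name = img.tag            # name of the element
attributes = dict(img.attrib)     # name/value pairs from the start tag
is_empty = len(img) == 0 and img.text is None   # no children, no text

print(element_name, attributes, is_empty)
```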
PCDATA means parsed character data.
Think of character data as the text found between the start tag and the end tag of an XML element.
PCDATA is text that will be parsed by a parser. Tags inside the text will be treated as markup and entities will be expanded.
CDATA also means character data.
CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be treated as markup and entities will not be expanded.
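The difference is easy to observe with a parser. In this Python sketch I use a CDATA section (the <![CDATA[ ... ]]> wrapper) as the way to mark text that should not be parsed; the <message> element and its contents are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Parsed character data: the tag becomes a child element and
# the entity reference &amp; is expanded to "&".
parsed = ET.fromstring("<message>Tom &amp; <b>Jerry</b></message>")

# The same characters inside a CDATA section are passed through untouched.
literal = ET.fromstring("<message><![CDATA[Tom &amp; <b>Jerry</b>]]></message>")

print(repr(parsed.text), len(parsed))    # text before <b>, one child element
print(repr(literal.text), len(literal))  # raw text, no child elements
```

In the first case the parser hands back the text "Tom & " plus a <b> child; in the second case it hands back the whole string exactly as typed.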
Entities are variables used to define common text. Entity references are references to entities.
Most of you will know the HTML entity reference "&nbsp;", which is used to insert an extra space in an HTML document. Entities are expanded when a document is parsed by an XML parser.
The following entities are predefined in XML:
Entity Reference    Character
&lt;                < (less than)
&gt;                > (greater than)
&amp;               & (ampersand)
&quot;              " (quotation mark)
&apos;              ' (apostrophe)
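A short Python sketch shows the five predefined entities going in both directions: expanded by the parser when reading, and produced by escaping when writing. The <t> element and the sample strings are assumptions of mine; ElementTree and xml.sax.saxutils.escape are part of the standard library.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Reading: the parser expands all five predefined entity references.
element = ET.fromstring("<t>&lt; &gt; &amp; &quot; &apos;</t>")
expanded = element.text

# Writing: escape() always handles &, < and >. The two quote characters
# are optional extras, supplied here through the second argument.
escaped = escape('a < b & "c"', {'"': "&quot;", "'": "&apos;"})

print(expanded)
print(escaped)
```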
Since we do not plan to go very deep into XML coding right now, we'll leave the data definition here and move on to the future implications of XML.
Extensible Markup Language (XML), which complements HTML, promises to increase the benefits that can be derived from the wealth of information found today on IP networks around the world. This is because XML provides a uniform method for describing and exchanging structured data. The ability to describe structured data in an open text-based format and deliver this data using the standard HTTP protocol is significant for two reasons. First, XML will facilitate more precise declarations of content and more meaningful search results across multiple platforms. Second, once the data is located, it will enable a new generation of tools for viewing and manipulating the data.
Consider an industry where interchange of data is vital, such as banking. Banks use proprietary systems to track transactions internally, but if they use a common XML format over the Web, then they'd be able to describe transaction information to another institution or an application (like Quicken or MS Money). Of course, they'd also be able to present the data in a pretty Web page. FYI: This markup does exist. It's called OFX, the Open Financial Exchange format.
Under certain circumstances, if IE 4 on the PC comes across a <SOFTPKG> tag with the proper contents, a function is started that gives a user the opportunity to update installed software. If you're using Windows 98, it's possible that you've seen this process in action without knowing it was an XML application.
XML Part 4
In the following two articles, I'm going to wrap up my pondering on XML. We'll explore the basic schema of a DTD, and the future of XML.
Let's recall that some basic features of XML are:
* XML can keep data separated from your HTML
* XML can be used to store data inside your HTML documents
* XML can be used as a format to exchange information
* XML can be used to store data in files or in databases
The power and beauty of XML is that it maintains the separation of the user interface from structured data, allowing the seamless integration of data from diverse sources. Customer information, purchase orders, research results, bill payments, medical records, catalog data and other information can be converted to XML on the middle tier, allowing data to be exchanged online as easily as HTML pages display data today. Data encoded in XML can then be delivered over the Web to the desktop. No retrofitting is necessary for legacy information stored in mainframe databases or documents, and because HTTP is used to deliver XML over the wire, no changes are required for this function.
Once the data is on the client desktop, it can be manipulated, edited, and presented in multiple views, without return trips to the server. Servers now become more scalable, due to lower computational and bandwidth loads. Also, since data is exchanged in the XML format, it can be easily merged from different sources. Ok, this is the aspect that personally interests me: the portability of data. Database programmers all over the world face endless problems while dealing with data in multifarious formats. If formats cease to matter, anybody, anywhere, on whichever machine, can view and manipulate the data.
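As a small taste of merging data from different sources, here is a hedged Python sketch. The two <catalog> documents, the "id" attribute used to spot duplicates, and the variable names are all invented for the example; in real life the two strings would come from two different servers or files.

```python
import xml.etree.ElementTree as ET

# Two hypothetical sources that happen to share one record.
SOURCE_A = "<catalog><book id='1'>XML Basics</book></catalog>"
SOURCE_B = ("<catalog><book id='2'>DTD Primer</book>"
            "<book id='1'>XML Basics</book></catalog>")

merged = ET.Element("catalog")
seen_ids = set()

for source in (SOURCE_A, SOURCE_B):
    for book in ET.fromstring(source).iter("book"):
        book_id = book.get("id")
        if book_id not in seen_ids:      # skip records already merged
            seen_ids.add(book_id)
            merged.append(book)

titles = [book.text for book in merged]
print(titles)
```

Because both sources describe their data with the same tags, the merging code never needs to know which system produced which record.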
From the previous article, we might recall that XML, unlike HTML, does not have predefined tags. We can go on a wild trip and define our own tags, according to the necessity. Consider this for example:
<?xml version="1.0"?>
<my-schedule>
<date>4/17/2001</date>
<morning-to-noon>
<XML-tutorial>
<XML-Introduction>Telling what exactly XML means</XML-Introduction>
<XML-Examples>Some Examples of XML</XML-Examples>
<XML-Conclusion>Some concluding text</XML-Conclusion>
<XML-Email>Email the XML files to Yagna</XML-Email>
</XML-tutorial>
</morning-to-noon>
<noon-to-mid-noon>
<nothing-important> Have something light to eat and laze around </nothing-important>
</noon-to-mid-noon>
<mid-noon-to-evening>
<work>Work on a client's web site</work>
</mid-noon-to-evening>
............
<date>4/18/2001</date>
..............
</my-schedule>
If you can't make out what this is all about, don't worry. This is just an imaginary schema of a data structure that can be represented through an XML document.
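To see the structure a program would get out of such a file, here is a minimal Python sketch that parses a trimmed copy of the schedule (one day, two periods, no elided parts) and prints it as an indented outline. The outline function is my own helper; only the standard library's xml.etree.ElementTree is used.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the schedule, limited to parts that are fully spelled out.
SCHEDULE = """<my-schedule>
<date>4/17/2001</date>
<morning-to-noon>
<XML-tutorial>
<XML-Introduction>Telling what exactly XML means</XML-Introduction>
<XML-Examples>Some Examples of XML</XML-Examples>
</XML-tutorial>
</morning-to-noon>
<mid-noon-to-evening>
<work>Work on a client's web site</work>
</mid-noon-to-evening>
</my-schedule>"""

def outline(element, depth=0):
    """Return one indented line per element, with its text if it has any."""
    text = (element.text or "").strip()
    line = "  " * depth + element.tag + (f": {text}" if text else "")
    lines = [line]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

lines = outline(ET.fromstring(SCHEDULE))
print("\n".join(lines))
```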
Before you get the time to come to grips with the gory XML introduction, I present a more evolved version of the above mentioned XML code:
<?xml version="1.0"?>
<!DOCTYPE my-schedule [
<!-- A schedule is one or more days: a date followed by its three periods. -->
<!ELEMENT my-schedule (date, morning-to-noon, noon-to-mid-noon, mid-noon-to-evening)+>
<!ELEMENT date (#PCDATA)>
<!ELEMENT morning-to-noon (XML-tutorial)>
<!ELEMENT XML-tutorial (XML-Introduction, XML-Examples, XML-Conclusion, XML-Email)>
<!ELEMENT XML-Introduction (#PCDATA)>
<!ELEMENT XML-Examples (#PCDATA)>
<!ELEMENT XML-Conclusion (#PCDATA)>
<!ELEMENT XML-Email (#PCDATA)>
<!ELEMENT noon-to-mid-noon (nothing-important)>
<!ELEMENT nothing-important (#PCDATA)>
<!ELEMENT mid-noon-to-evening (work+)>
<!ELEMENT work (#PCDATA)>
]>
<my-schedule>
<date>4/17/2001</date>
<morning-to-noon>
<XML-tutorial>
<XML-Introduction>Telling what exactly XML means</XML-Introduction>
<XML-Examples>Some Examples of XML</XML-Examples>
<XML-Conclusion>Some concluding text</XML-Conclusion>
<XML-Email>Email the XML Article</XML-Email>
</XML-tutorial>
</morning-to-noon>
<noon-to-mid-noon>
<nothing-important> Have something light to eat and laze around </nothing-important>
</noon-to-mid-noon>
<mid-noon-to-evening>
<work>Work on a client's web site</work>
</mid-noon-to-evening>
............
<date>4/18/2001</date>
..............
</my-schedule>
The above is a comprehensive example of a DTD - Document Type Definition. XML provides an application-independent way of sharing data. With a DTD, independent groups of people can agree on a common structure for interchanging data. Your application can use a standard DTD to verify that the data you receive from the outside world is valid. You can also use a DTD to verify your own data.
In this example, the data structure is well defined. Parent nodes have child nodes, some child nodes have grand-child nodes of their own, and so on.
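The idea of verifying incoming data against an agreed structure can be sketched in Python. A word of caution: the XML parsers in Python's standard library do not validate against a DTD, so this sketch is not DTD validation. It hand-codes a much simpler stand-in, a dictionary (MODEL, my own invention) that lists the exact children each element must have, and covers only a cut-down version of the schedule.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for a content model: tag -> exact ordered child tags.
# Elements that hold only text map to an empty list.
MODEL = {
    "my-schedule": ["date", "morning-to-noon"],
    "date": [],
    "morning-to-noon": ["XML-tutorial"],
    "XML-tutorial": ["XML-Introduction", "XML-Examples"],
    "XML-Introduction": [],
    "XML-Examples": [],
}

def problems(element):
    """Collect human-readable structure problems for element and descendants."""
    found = []
    expected = MODEL.get(element.tag)
    actual = [child.tag for child in element]
    if expected is None:
        found.append(f"undeclared element <{element.tag}>")
    elif actual != expected:
        found.append(f"<{element.tag}> has children {actual}, expected {expected}")
    for child in element:
        found.extend(problems(child))
    return found

GOOD = ("<my-schedule><date>4/17/2001</date><morning-to-noon><XML-tutorial>"
        "<XML-Introduction>a</XML-Introduction><XML-Examples>b</XML-Examples>"
        "</XML-tutorial></morning-to-noon></my-schedule>")
BAD = GOOD.replace("XML-Examples", "XML-Example")   # one misspelled tag

good_report = problems(ET.fromstring(GOOD))
bad_report = problems(ET.fromstring(BAD))
print(good_report)
print(bad_report)
```

A single misspelled tag is reported twice, once as an unexpected child and once as an undeclared element, which is roughly the kind of complaint a real validating parser would raise.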