Most of the web authors who code their websites by hand have switched to XHTML over the last few years. Most of them switched because it was newer than HTML and because it was an XML application (which was cool, because it's new as well). They were right of course, but is new always the best option?
XHTML was designed as an XML application. We all know what XML is. If not, there's always Google. HTML, however, is an SGML application. XML is based on SGML, but still has some major differences which, of course, also apply to both markup languages: XHTML and HTML. Let's have a look at them:
That means no more tag soup and the use of ownage namespaces like SVG. Oh wait, I think I forgot something… Of course: Internet Explorer (abbreviated as IE). The very browser that was the first to support CSS1 and bits of CSS2 and that brought us AJAX, but is now the most hated browser in web design land.
After several years, IE still doesn't support XHTML properly. By proper
XHTML I mean XHTML served with the correct MIME type. A MIME type is
a string that tells the browser what kind of file it is receiving. HTML has the
MIME type text/html. XHTML has the MIME type
application/xhtml+xml. Because IE doesn't support the MIME type
application/xhtml+xml, you can't use real XHTML in IE. The
"solution" most web authors find is to just send their XHTML document as
text/html. However, when a browser reads the file, it thinks it's
an HTML document and not an XHTML document. In other words, you cannot use
XHTML with the proper MIME type if you don't want to screw over all IE users.
Note: You might think that the DOCTYPE would let the
browser know you're using XHTML, but that's not the case. The
DOCTYPE has no effect on that at all. In browsers, the
DOCTYPE only switches between quirks and standards rendering mode; other
than that, it's just acting cool sitting on the first line of the document. So
the MIME type tells the browser which markup language you're using, not the
DOCTYPE.
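To make that concrete, here's a rough sketch of the decision a browser makes. The function name parserFor is made up for illustration, and it only covers the two MIME types discussed here. The thing to notice is that the DOCTYPE is not an input at all.

```javascript
// Hypothetical sketch: the parser is picked from the Content-Type header only.
// The document's DOCTYPE never enters the decision.
function parserFor(contentType) {
  // Strip parameters such as "; charset=utf-8" before comparing.
  const mime = contentType.split(";")[0].trim().toLowerCase();
  if (mime === "application/xhtml+xml") return "xml";
  if (mime === "text/html") return "html";
  return "other"; // anything else is outside the scope of this sketch
}

console.log(parserFor("text/html; charset=utf-8")); // prints html
console.log(parserFor("application/xhtml+xml"));    // prints xml
```

So an XHTML 1.0 Strict document sent with the first header still ends up in the HTML parser.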
Some of you might wonder what the problem is with sending XHTML as
text/html. The problem is very simple: it's not XHTML anymore.
You're sending XHTML as text/html, so the browser thinks it's an
HTML document. This means that you can't use any of XHTML's
advantages: no more strictness that prevents a web author from writing tag soup
and no more support for namespaces. Besides that, it's also invalid HTML.
None of the XHTML DOCTYPEs are allowed by the HTML 4.01 specification.
The same obviously applies to both the xmlns and the
xml:lang attributes.
Also, imagine this piece of code: <img src="./pic" alt="…"/>.
This is of course well-formed XHTML, but in HTML, we've got a problem. As I
stated before, HTML is an SGML application. SGML is a complex markup
language with many features. One of them is a feature called "Shorttag". This
feature allows the character / to be used to close a start tag and then to end
the element, like this: <strong/some text/. So if we look back at the XHTML
example, it should be interpreted as
<img src="./pic" alt="…">> (the image followed by a literal > character) by
conforming SGML parsers, which is how the document should be parsed when it is
sent as text/html. Luckily for most web authors, browsers don't parse HTML
as SGML because their parsers have been bent to work with tag soup instead,
leaving us with no visible problem. Unfortunately for web authors who actually
care about the HTML standard, well-formed XHTML documents served as text/html
mean something different from what you intended.
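To get a feel for that shorthand, here's a toy sketch of how its simplest form expands. The function expandNullEndTags is hypothetical and only handles attribute-less tags like the strong example above. It is nowhere near a real SGML parser.

```javascript
// Toy illustration only: expands <name/content/ into <name>content</name>.
// Assumes no attributes and no nested markup inside the content.
function expandNullEndTags(markup) {
  return markup.replace(
    /<([A-Za-z][A-Za-z0-9]*)\/([^\/<]*)\//g,
    "<$1>$2</$1>"
  );
}

console.log(expandNullEndTags("<strong/some text/"));
// prints <strong>some text</strong>
```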
As you can see, there are quite a few reasons not to use XHTML served as
text/html. Let's summarize them:
The / character
should be interpreted differently, which results in a different
document than you would have if it were served as
application/xhtml+xml.
So I made the decision to use HTML instead of XHTML, and I don't think that's bound to change soon. There are just no reasons for me to use XHTML, other than that it will likely be what we'll be using in the future. Unfortunately, the future is not now, because we're still stuck with browsers specialized in parsing tag soup and web authors who have no idea what they're doing. Maybe a really interesting XML namespace will persuade me to change to XHTML, but I haven't seen one yet.
Right now, I'm sure some of you are waiting for the line where I say that
content negotiation is the solution to all problems. Unfortunately, it's not
that easy. As I previously stated, there are quite a few differences between
HTML and XHTML, so checking the Accept header for
application/xhtml+xml and changing the DOCTYPE and the root element's
start tag (adding the xmlns and xml:lang attributes)
is clearly not enough.
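For reference, the Accept check on its own could look something like the sketch below. The function name chooseMimeType is an assumption of mine, and the logic is deliberately simplified: it only looks for an explicit application/xhtml+xml entry, treats q=0 as "not acceptable", and ignores wildcards and the relative ordering of q values.

```javascript
// Hypothetical helper: pick a MIME type from the request's Accept header.
// Simplified on purpose: wildcards and relative q ordering are ignored.
function chooseMimeType(acceptHeader) {
  const XHTML = "application/xhtml+xml";
  const HTML = "text/html";
  if (typeof acceptHeader !== "string") return HTML;

  for (const part of acceptHeader.split(",")) {
    const [type, ...params] = part.split(";").map((s) => s.trim());
    if (type.toLowerCase() !== XHTML) continue;

    // Default quality is 1; q=0 means "not acceptable".
    let q = 1;
    for (const param of params) {
      const [name, value] = param.split("=").map((s) => s.trim());
      if (name.toLowerCase() === "q") q = Number.parseFloat(value);
    }
    return q > 0 ? XHTML : HTML;
  }

  // XHTML is not listed explicitly: fall back to HTML.
  return HTML;
}

console.log(chooseMimeType("text/html,application/xhtml+xml;q=0.9")); // prints application/xhtml+xml
console.log(chooseMimeType("*/*")); // prints text/html
```

That part is the easy bit. Everything that follows is what the Accept check doesn't solve for you.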
First of all, you need to make sure your scripts are compatible with both
application/xhtml+xml and text/html. In XHTML, the
document.documentElement.nodeName property returns html
while in HTML it returns HTML. document.write()
also doesn't work in XHTML.
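One way to cope with the nodeName difference is to compare element names case-insensitively. This is a minimal sketch, not the only way to do it. The helper name isElementNamed is made up, and plain objects stand in for DOM nodes so it runs outside a browser.

```javascript
// Hypothetical helper: compare element names without depending on which
// parser (HTML or XML) built the document.
function isElementNamed(node, name) {
  return node.nodeName.toLowerCase() === name.toLowerCase();
}

// Stand-ins for document.documentElement under each MIME type.
const fromHtmlParser = { nodeName: "HTML" };
const fromXmlParser = { nodeName: "html" };

console.log(isElementNamed(fromHtmlParser, "html")); // prints true
console.log(isElementNamed(fromXmlParser, "html"));  // prints true
```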
When you're done fixing your scripts, you're definitely not done yet. By
looking at the XHTML 1.0 DTD you'll see the content of both the
SCRIPT and STYLE elements should be treated as
PCDATA instead of CDATA as we were used to in HTML.
Crap indeed. In order to make sure your CSS and scripts work correctly in
XHTML, you have to put the content of these elements in CDATA
marked sections like this:
<script type="application/javascript">
<![CDATA[
…script…
]]>
</script>
Of course, you have to make sure these CDATA marked sections
are not there when the document is text/html because that would
result in invalid HTML.
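If you generate your pages on the server, that switch might be sketched like this. The helper wrapInlineScript is hypothetical. It assumes you already know which MIME type you're about to send and that the script source doesn't itself contain the sequence ]]>.

```javascript
// Hypothetical helper: wrap inline script or style content in a CDATA marked
// section only when the page goes out as application/xhtml+xml.
// Assumes the source does not contain "]]>".
function wrapInlineScript(source, mimeType) {
  if (mimeType === "application/xhtml+xml") {
    return "<![CDATA[\n" + source + "\n]]>";
  }
  return source; // text/html: leave the content untouched
}

console.log(wrapInlineScript("if (a < b) run();", "application/xhtml+xml"));
console.log(wrapInlineScript("if (a < b) run();", "text/html"));
```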
Congrats! You've now successfully made sure your site works as both
application/xhtml+xml and text/html! But why!?
Why do you want it so badly!? You've still got no advantage compared
to text/html because you still can't use any of XML's powers. And
that, only that, is the difference between HTML 4.01 and XHTML
1.0. You're not helping the user with this, only your own satisfaction. And if
you made an error in your content negotiation script, you even risk
screwing over some of your potential customers.
Although Appendix
C of the XHTML 1.0 specification tells us we're allowed to use
text/html as the MIME type to serve XHTML because XHTML is
supposed to be backwards compatible with HTML, I still stand by my conclusion:
XHTML should not be sent as text/html. Doing so makes it
invalid HTML because HTML 4.01 is simply not forward compatible with
XHTML.
Copyright © 2005 - 2007 Jeroen van der Meer. All rights reserved.