My new blog is here

More insanity-inducing XML parsing. How difficult is it for people to understand XML and XSLT? It's not that complicated! You've got a few tools: tree parsers, stream parsers and XSLT. Read the very simple guides to what they do, and then use each one for the appropriate thing.

Backwardsland

Want to see something utterly insane? How about backwards SAX-style XML parsing? Yes, that is insane. Surely the simple answer is to use SAX to parse the whole document as a stream, use an attribute on the root's child elements to identify the piece of data you want, and then take that particular chunk and load it into a tree parser like DOM.
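For what it's worth, here is a minimal sketch of that forwards approach, in Python rather than anything from the application in question. It assumes the chunks are direct children of the root and carry an identifying attribute; the function name `load_chunk`, the `id` attribute and the sample feed are all made up for illustration. Python's `xml.dom.pulldom` happens to do exactly this job: it streams SAX-style events and only builds a DOM subtree when asked.

```python
from xml.dom import pulldom


def load_chunk(xml_text, attr_name, attr_value):
    """Stream through xml_text and return the matching child of the root
    as a DOM element, or None if no child matches.

    Only the matching chunk is ever expanded into a tree; everything else
    is seen as stream events and thrown away.
    """
    events = pulldom.parseString(xml_text)
    depth = 0
    for event, node in events:
        if event == pulldom.START_ELEMENT:
            depth += 1
            # depth 1 is the root element, depth 2 is a direct child of it
            if depth == 2 and node.getAttribute(attr_name) == attr_value:
                events.expandNode(node)  # pull just this subtree into the DOM
                return node
        elif event == pulldom.END_ELEMENT:
            depth -= 1
    return None


# Hypothetical sample document, purely for illustration.
SAMPLE = (
    '<feed>'
    '<entry id="a"><title>First</title></entry>'
    '<entry id="b"><title>Second</title></entry>'
    '</feed>'
)

chunk = load_chunk(SAMPLE, "id", "b")
title = chunk.getElementsByTagName("title")[0].firstChild.data
```

The returned `chunk` is an ordinary DOM element, so from there on you use the usual tree methods on it and never have to read the document backwards.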

There may be performance hacks that merit this type of approach, but using it to implement a key feature in an application is absurd and insane.

A neat hack I found recently while writing an XSLT-based microformats parsing system in PHP5 is to use regular expressions to search a document. Basically, I have an XSL stylesheet to convert, say, hCards into RDF/XML. But there's no point in going through the process of spawning an XSLT processor or loading the stylesheet into the DOM if there are no hCards in the document. A regular expression just looking for the word "vcard" rules out the vast majority of documents that don't contain hCards. It's just an eliminative process to increase performance and reduce server load.
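A rough sketch of the pre-filter, in Python rather than the PHP5 I actually used, so treat it as an illustration and not the real code. The names `might_contain_hcard` and `hcards_to_rdf` are invented for the example, and `run_xslt` is a stand-in for whatever callable really applies the stylesheet.

```python
import re

# Cheap eliminative test: is the word "vcard" anywhere in the markup?
VCARD_HINT = re.compile(r"\bvcard\b")


def might_contain_hcard(markup):
    """Return True if the document is worth handing to the XSLT step.

    A False result is definitive for this check; a True result only
    means "maybe", since the word could appear outside a class attribute.
    """
    return VCARD_HINT.search(markup) is not None


def hcards_to_rdf(markup, run_xslt):
    """Run the expensive transform only when the cheap check passes.

    run_xslt is an assumed callable (markup -> RDF/XML string) standing
    in for the real XSLT processor.
    """
    if not might_contain_hcard(markup):
        return None  # no processor spawned, no stylesheet loaded
    return run_xslt(markup)
```

False positives don't matter here: a document that merely mentions "vcard" just goes through the stylesheet and produces nothing. The only job of the regular expression is to skip the expensive step for the documents that obviously have no hCards.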

It's not pure XML though. When XProc comes into vogue, I'm not sure what we can replace such hacks with. Perhaps you could do it with XSL by using regular expressions to invoke external stylesheets. I'm betting that such a hack wouldn't be particularly efficient.

I like the idea that M. David Peterson puts in the comments of producing an XML Best Practices specification, although I'm pretty sure I wouldn't contribute to such a specification - my best practices are snark-filled and evil.

The only rule of XML Best Practice is "Actually goddamn use XML!" - a rule that our friends at both the W3C and the WHATWG ought to tattoo backwards on their foreheads.


XHTML has been a failure!

XHTML is being used by Wikipedia, Twitter, Wordpress.com, Blogger, digg, TechCrunch, Comment is Free, upcoming, Facebook, TypePad, LiveJournal, 43things, Grazr, 37signals, MeasureMap, TUAW, Gizmodo, Skype, Adobe, BBC Backstage, TheyWorkForYou, Mashable, Rocketboom, Ze Frank, Jeremy Keith, Zeldman, A List Apart. And that's just scratching the surface.

What a total failure. Let's go back and party like it's 1997 with unclosed tags and implied 'head' and 'body' elements. The next generation of browser development really needs to be based on tag soup! Woohoo! By the time we get to 2012, we'll have mobile browsers that are 200MB in size! Yay for the future of the web!

Seriously, if you want to die young, follow the standards procedure. Crawling across a field of land mines is about as pleasurable as watching the tragedy on wheels that is the web standards process.

The future of the web is at the very least XML-based. Anything else is bullshit. Tag soup needs to die a quick death.


Home | Tom Morris | Opiumfield

Last modified: Sunday, August 26, 2007 at 11:43 AM.

March 2007
