Ruby Markup: A Table By Any Other Name


Disclaimer: this manages to be both hopelessly obscure and hopelessly out of date, at the same time. But if you’re interested in my opinion on a three-year-old W3C recommendation for languages I don’t speak, read on!

Ruby (the W3C spec, not the programming language) is an XHTML 1.1 module for adding short annotations to text. Apparently this is really common in Japanese and Chinese: you’ll have a fine-print pronunciation guide above the main bit of text. An example in English (taken directly from the spec):

Month Day Year
10 31 2002
Expiration Date

The words “Month”, “Day”, “Year” and “Expiration Date” are all ruby annotations to the various bits and pieces of the base text “10 31 2002”.

Simple enough, and for one-to-one pairings of ruby annotations to text blocks, the markup is also reasonably simple:

<ruby><rb>WWW</rb> <rt><rp>(</rp>World Wide Web<rp>)</rp></rt></ruby>

<rb> denotes the “ruby base”, <rt> the “ruby text”. Wrap it in a ruby node and you’re good to go. (The <rp> tag is for backwards compatibility: it contains text — typically parentheses — which should be ignored by clients which recognize ruby tags. So the above example would be rendered by, well, every current web browser, as “WWW (World Wide Web),” while a hypothetical ruby-aware browser would omit the parentheses.)
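To make the two readings concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The function names (fallback_text, text_without_rp, ruby_aware_parts) are my own inventions for illustration, not anything from the spec, and no real browser is claimed to work this way.

```python
import xml.etree.ElementTree as ET

SIMPLE = "<ruby><rb>WWW</rb> <rt><rp>(</rp>World Wide Web<rp>)</rp></rt></ruby>"

def fallback_text(markup):
    """Text as seen by a client that ignores the ruby tags entirely."""
    return "".join(ET.fromstring(markup).itertext())

def text_without_rp(elem):
    """Text of elem with every <rp> element (but not the text after it) dropped."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag != "rp":
            parts.append(text_without_rp(child))
        parts.append(child.tail or "")  # tail text belongs to the parent
    return "".join(parts)

def ruby_aware_parts(markup):
    """(base, annotation) for simple ruby, as a ruby-aware client might read it."""
    root = ET.fromstring(markup)
    return text_without_rp(root.find("rb")), text_without_rp(root.find("rt"))

print(fallback_text(SIMPLE))     # WWW (World Wide Web)
print(ruby_aware_parts(SIMPLE))  # ('WWW', 'World Wide Web')
```

The fallback is nothing more than the text in document order, which is why the parentheses have to live in the markup.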

So far so good. But something interesting happens to the specification when it gets to more complex examples such as the date above. There, we have ruby annotations for each part of the date, but there’s also a separate annotation for the whole date as a block.

The ruby markup for that, according to the spec, is

<ruby><rbc><rb>10</rb> <rb>31</rb> <rb>2002</rb></rbc> <rtc><rt>Month</rt> <rt>Day</rt> <rt>Year</rt></rtc> <rtc><rt rbspan="3">Expiration Date</rt></rtc></ruby>

Does that look familiar at all? Kind of sort of exactly like this?

<table><tr><td class="rt">Month</td> <td class="rt">Day</td> <td class="rt">Year</td></tr> <tr><td class="rb">10</td> <td class="rb">31</td> <td class="rb">2002</td></tr> <tr><td colspan="3" class="rt">Expiration Date</td></tr></table>

Now, surely the guys at the W3C are much smarter than I am, and know what they’re doing. But it sure looks to me like they’re reinventing all the table-related problems that style-based XHTML was supposed to get us away from. The specified ruby markup doesn’t make any direct logical association between each piece of base text and its ruby annotation; it’d be a PITA to write a parser that could determine which annotation goes with what base. The markup describes the layout, not the logic.
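Here is roughly what that pairing logic has to look like, as a Python sketch with the standard library's xml.etree.ElementTree. The function name pair_by_position is hypothetical, and the sketch assumes well-behaved input: one <rbc>, and rbspan values that add up correctly. Nothing in the markup says which base an annotation belongs to, so the code has to count columns, exactly as it would for table cells with colspan.

```python
import xml.etree.ElementTree as ET

COMPLEX = (
    "<ruby><rbc><rb>10</rb> <rb>31</rb> <rb>2002</rb></rbc> "
    "<rtc><rt>Month</rt> <rt>Day</rt> <rt>Year</rt></rtc> "
    '<rtc><rt rbspan="3">Expiration Date</rt></rtc></ruby>'
)

def pair_by_position(markup):
    """Return (annotation, [bases]) pairs, inferred purely from column position."""
    root = ET.fromstring(markup)
    bases = [rb.text for rb in root.find("rbc").findall("rb")]
    pairs = []
    for rtc in root.findall("rtc"):
        column = 0  # running index into the base "row"
        for rt in rtc.findall("rt"):
            span = int(rt.get("rbspan", "1"))  # rbspan plays the role of colspan
            pairs.append((rt.text, bases[column:column + span]))
            column += span
    return pairs

for annotation, covered in pair_by_position(COMPLEX):
    print(annotation, "->", covered)
```

Reorder the <rt> elements inside an <rtc> and every pairing silently changes, which is the table problem all over again.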

And here I thought we were supposed to be moving towards semantic markup.

Wouldn’t something like this make a hell of a lot more sense?

<ruby><rb><ruby><rb>10</rb><rt>Month</rt></ruby> <ruby><rb>31</rb><rt>Day</rt></ruby> <ruby><rb>2002</rb><rt>Year</rt></ruby></rb> <rt>Expiration Date</rt></ruby>

Every annotation is directly paired with its base text; every ruby tag contains exactly one base text node and as many annotation nodes as necessary. (I haven’t included any examples here, but sometimes multiple annotations on the same base text are needed; that would be represented simply as additional <rt> nodes.) No ugly “colspan” analogues, and no need for the additional <rbc> and <rtc> container nodes to stand in for the table row. No dependence on document order. In short, none of what we all deride tables for, nowadays.
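For comparison, a sketch of reading the nested form, again in Python with xml.etree.ElementTree. Keep in mind that this nested markup is my own proposal, not something the spec allows, and the helper names base_text and pairs are made up for the example. Each <ruby> carries its own base and annotations, so there is no column counting at all.

```python
import xml.etree.ElementTree as ET

NESTED = (
    "<ruby><rb>"
    "<ruby><rb>10</rb><rt>Month</rt></ruby> "
    "<ruby><rb>31</rb><rt>Day</rt></ruby> "
    "<ruby><rb>2002</rb><rt>Year</rt></ruby>"
    "</rb> <rt>Expiration Date</rt></ruby>"
)

def base_text(rb):
    """Text of a base, descending into nested <ruby> bases and skipping their <rt>."""
    parts = [rb.text or ""]
    for child in rb:
        if child.tag == "ruby":
            parts.append(base_text(child.find("rb")))
        parts.append(child.tail or "")
    return "".join(parts)

def pairs(ruby):
    """Yield (base text, [annotations]) for this <ruby> and every nested one."""
    rb = ruby.find("rb")
    yield base_text(rb), [rt.text for rt in ruby.findall("rt")]
    for inner in rb.findall("ruby"):
        yield from pairs(inner)

for base, annotations in pairs(ET.fromstring(NESTED)):
    print(base, "->", annotations)
```

Multiple annotations on one base fall out for free: they are just extra entries in that base's list.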

The non-ruby-aware presentation would be a bit confused: it’d come out (if you added the <rp> parens) as 10 (Month) 31 (Day) 2002 (Year) (Expiration Date)… but then again the specified markup would be even more scrambled, since document order puts all of the base text ahead of every annotation: 10 31 2002 (Month) (Day) (Year) (Expiration Date)

The spec explicitly pretends this problem doesn’t exist:

The rp element is not available in the case of complex ruby markup. There are two reasons for this. First, the rp element is only a fallback mechanism, and it was considered that this is much more important for the more frequent simple case. Second, for the more complex cases, it is difficult to come up with a reasonable fallback display, and constructing markup for such cases can be even more difficult if not impossible.

Admittedly, the more semantic markup means the browser has to be a bit smarter about presentation — because you’re not doing the work for it in the markup. Tooltip-like displays, of the sort already seen on ABBR tags, would be much easier to implement (though you would have to handle display of more than one tooltip at a time). Table-like layout would need some CSS to control whether the ruby text should appear above, below, to the side, etc. (The spec also defers this task to CSS.)

So what am I missing, here? Why did all these smart guys at the W3C revert to markup by presentation instead of by semantics, when the whole point of XHTML is to move the other direction? Is it just that it was written three years ago, before the point of XHTML had really sunk in? (If that’s the case, maybe it’s time to revisit this spec before anybody does something crazy and, you know, uses it for anything.)