Bitesize DITA: think before you add a <table>

One of the basic principles of XML in general, and DITA in particular, is the separation of form and content – the idea that by writing content without its formatting, and with semantic information, we can publish and reuse it more effectively. What, though, should we do about tables? Tables are content, but they are also inherently presentational – muddying the otherwise clear water between content and form.

Although you can define a lot of table behaviour, such as column widths, within DITA itself, doing so can complicate publishing and limit how flexibly the content can be delivered in multiple formats. Tables that force a change in page orientation add complexity to a stylesheet, and tables in general can be problematic on mobile devices.
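As a rough illustration of the kind of layout control that can be written into the source, here is a minimal sketch of a table with column widths set in the markup. The content and column names are invented for the example, and how far the width and page-width settings are honoured depends on your publishing stylesheets.

```xml
<!-- Sketch only: invented content; colname values are arbitrary labels -->
<table pgwide="1">
  <tgroup cols="2">
    <!-- Proportional widths: the second column gets three times the first -->
    <colspec colname="setting" colwidth="1*"/>
    <colspec colname="description" colwidth="3*"/>
    <tbody>
      <row>
        <entry>Timeout</entry>
        <entry>Seconds to wait before the connection is dropped.</entry>
      </row>
    </tbody>
  </tgroup>
</table>
```

Proportional widths such as 1* and 3* tend to travel between formats more gracefully than fixed measurements, which tie the content to one page size.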

It would be hard to make the case for abandoning tables altogether – but in the structured world we might need to think a little differently. It’s always a good idea to reflect on your content – what it’s for, what it means and whether it could be removed. All too often, large reference tables end up being the elephants’ graveyard of content – the reader ‘might need’ the information but nobody’s actually sure how it’s being used. So the first step in working with tables in DITA is to establish whether each one is really needed.

Assuming the information in a table is useful, at most locations in a DITA topic you’ll have a choice of two types: the simple table (<simpletable>) and the CALS table (<table>). CALS is a table model developed by the US military, which predates DITA and even XML, so while the DITA standard supports it, it’s an outlier in a way. (Strictly speaking, DITA’s <table> follows the OASIS Exchange Table Model, which is a subset of CALS.) What distinguishes <table> from <simpletable> is the degree of control you have over row and column behaviour and spanning. You can use a CALS table to create the kind of complex constructions that we have all, at some point, created in Word – the question is whether that kind of complex layout is worth the potential problems later in the workflow.
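To give a sense of what that extra control looks like, here is a minimal sketch of a CALS table with a header that spans both columns and rows. The product data is made up; the spanning attributes (namest, nameend and morerows) are the standard CALS ones.

```xml
<!-- Sketch only: invented content -->
<table>
  <title>Enclosure sizes</title>
  <tgroup cols="3">
    <colspec colname="c1"/>
    <colspec colname="c2"/>
    <colspec colname="c3"/>
    <thead>
      <row>
        <!-- morerows="1" makes this cell span down into the next header row -->
        <entry morerows="1">Model</entry>
        <!-- namest/nameend make this cell span columns c2 to c3 -->
        <entry namest="c2" nameend="c3">Dimensions (mm)</entry>
      </row>
      <row>
        <!-- The first column is already occupied by the cell spanning from above -->
        <entry>Width</entry>
        <entry>Height</entry>
      </row>
    </thead>
    <tbody>
      <row>
        <entry>A100</entry>
        <entry>120</entry>
        <entry>80</entry>
      </row>
    </tbody>
  </tgroup>
</table>
```

Even in a table this small, one miscounted entry or mistyped column name is enough to break the structure, which is exactly the kind of error that is tedious to track down at publishing time.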

In general, unless you really need sophisticated grouping and spanning capabilities, <simpletable> has the advantage that it is easier to author, and there is less that can go wrong. (If you need a choice table in a task, that has its own element, <choicetable>, anyway.) By the time an output is created, it can be really time-consuming to troubleshoot errors within a table’s structure. Also, DITA becomes a lot easier for authors to work with if the content model is constrained to those elements which are most commonly used, and <simpletable> is a great example of that.
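For comparison, here is a minimal sketch of the two simpler structures, again with invented content: a <simpletable> with a header row, followed by a <choicetable> inside a task step.

```xml
<!-- Sketch only: invented content -->
<simpletable relcolwidth="1* 3*">
  <sthead>
    <stentry>Setting</stentry>
    <stentry>Description</stentry>
  </sthead>
  <strow>
    <stentry>Timeout</stentry>
    <stentry>Seconds to wait before the connection is dropped.</stentry>
  </strow>
</simpletable>

<!-- A choice table belongs in a task step, after the command -->
<step>
  <cmd>Connect the device.</cmd>
  <choicetable>
    <chhead>
      <choptionhd>Connection type</choptionhd>
      <chdeschd>What to do</chdeschd>
    </chhead>
    <chrow>
      <choption>Wired</choption>
      <chdesc>Plug the cable into port 1.</chdesc>
    </chrow>
    <chrow>
      <choption>Wireless</choption>
      <chdesc>Select the device from the network list.</chdesc>
    </chrow>
  </choicetable>
</step>
```

Neither structure needs a column specification or a separate body wrapper, so there are simply fewer places for a mistake to hide.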

Whichever type you choose, it’s often easier to create tables using the toolbar buttons in your authoring tool. oXygen in particular does a lot of the work for you. Creating each of the elements in a table structure individually can be fiddly, time-consuming and prone to error. Likewise, pasting content from an external source often works superficially, but you may find the time you spend fixing oddities brought in from the source content outweighs the convenience of copy and paste.

To sum up, then: think before you add a table – is it really necessary? If you do need one, use <simpletable> or <choicetable> wherever possible.