06.28.07

Why Valid isn’t Good Enough

Posted in XHTML, general at 8:49 am by Mary McRae

It should be a “good thing” that OpenOffice has an XHTML export filter. It should be an even better thing that the resulting XHTML is actually DTD-valid. But, unfortunately, valid doesn’t always mean “correct.” In the case of OpenOffice the resulting document is of little use. Font sizes will change sporadically, and all numbering (paragraphs, lists) is out the window. What’s even worse is that I seem incapable of getting XHTML that is both valid and usable out of OpenOffice no matter how many sidesteps I take. I’ve tried the Tidy trick. I’ve tried saving as a Word file. I’ve tried opening the OO file in Word using the new OO import/output filter. All to no avail.

So, if you’re one of those Technical Committees who are authoring your documents in OpenOffice, please just save as HTML. :-(

06.27.07

Creating Valid XHTML v1.0 Transitional from MS Word

Posted in XHTML at 1:59 pm by Mary McRae

The bad news is that even Office 2007 doesn’t have an “export as XHTML” option. The good news is that with a few simple steps you can get XHTML that not only validates but also looks good ;-)

The first step is to save as HTML using the filtered option. This removes a lot of extraneous markup that is only necessary if you plan on “round-tripping” back into Microsoft Word. Since you still have the original Word file, there’s no need to carry around the excess baggage.

The next step is to run the HTML file through Tidy. I use Tidy Online – just browse to the HTML file on your local system, select “xhtml” and you’re off and running. Save the resulting file to your local system.
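If you would rather script this step than use the online form, the same conversion can be sketched with the command-line version of Tidy. This is only a sketch under assumptions: it presumes a `tidy` executable is installed and on your path, and the function names and file names are made up for illustration.

```python
import subprocess


def tidy_command(src, dest):
    """Build the argument list for converting src to XHTML with command-line Tidy.

    -asxhtml asks for XHTML output, -quiet suppresses the summary chatter,
    and -o names the output file.
    """
    return ["tidy", "-asxhtml", "-quiet", "-o", dest, src]


def convert_to_xhtml(src, dest):
    """Run Tidy (assumed to be installed) and report whether it produced output.

    Tidy exits with 0 on success, 1 if there were warnings, and 2 if there
    were errors, so anything below 2 is treated as usable here.
    """
    result = subprocess.run(tidy_command(src, dest))
    return result.returncode < 2
```

Calling `convert_to_xhtml("filtered.html", "tidied.html")` would then play the role of the browse-and-save step; both file names are placeholders.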

There’s one more step before you’re done. Tidy still has a few problems, and one of the things it doesn’t get quite right is the “lang” attribute. It should be “xml:lang”. I solve this problem by running a global search-and-replace on “lang=” (include the “=” to make sure you don’t inadvertently replace the string elsewhere in your document) and replace it with “xml:lang=”. You should now be able to validate the instance without error. The bad news is that there will still be numerous warnings, but we’ll save those for another day.
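For anyone who wants to automate the search-and-replace, here is a minimal sketch in Python. It is a slightly more cautious variation on the plain global replace, not the same thing: it only touches start tags, and it skips any tag that already carries “xml:lang=” so the attribute never ends up duplicated. The function name is hypothetical, and the regular expressions assume attribute values that contain neither “>” nor the text “ lang=”.

```python
import re

# A start tag: "<", a letter, then anything up to the closing ">".
# End tags such as "</p>" start with "/" and are not matched.
START_TAG = re.compile(r"<[A-Za-z][^<>]*>")

# A bare lang= attribute: preceded by whitespace, so "xml:lang=" is not matched.
BARE_LANG = re.compile(r"(?<=\s)lang=")


def fix_lang_attributes(xhtml):
    """Rename lang= to xml:lang= inside start tags only."""

    def fix_tag(match):
        tag = match.group(0)
        if "xml:lang=" in tag:
            # Already has xml:lang; renaming would create a duplicate attribute.
            return tag
        return BARE_LANG.sub("xml:lang=", tag, count=1)

    return START_TAG.sub(fix_tag, xhtml)
```

To use it, read the tidied file into a string, pass it through `fix_lang_attributes`, and write the result back out before validating.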