Saturday

Is XML 1.0 (5th ed) backwardly compatible?

Been reading a lot about the XML 1.0 fifth edition, now a "Proposed Recommendation". The reasoning goes like this: there are no XML 1.1 docs because parser writers believe no one will use 1.1, and no one will use it because parser writers don't support it. So why not turn the XML 1.1 changes into an erratum version of XML and call it the 5th edition of 1.0? That seems to be what has been done.

Someone recently wrote about a problem of backward compatibility. The issue is around Unicode. Originally XML was explicitly tied to Unicode 2. But Unicode has not stood still, moving on to version 4 and beyond. So what if someone wants to use Unicode 4 in an XML setting?

The fix is a relaxation of constraints. XML 1.0 up to the 4th edition says Unicode 2, and anything not explicitly allowed is prohibited. The 5th edition says anything not explicitly prohibited is allowed. So, following the logic, the 5th edition is a superset of the previous editions in terms of allowed characters.
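
To make the "anything not explicitly prohibited is allowed" idea concrete, here is a rough Python sketch of a name check. The ranges are my hand transcription of the NameStartChar and NameChar productions in the fifth edition, so check them against the Recommendation before relying on them. The function names (`is_xml_name` and friends) are made up for this post and are not part of any parser's API.

```python
# Sketch only: ranges transcribed by hand from the NameStartChar and
# NameChar productions of XML 1.0 (Fifth Edition). Function names are
# hypothetical, not taken from any real parser.

# Code points that may start a name.
NAME_START_RANGES = [
    (0x3A, 0x3A),        # ":"
    (0x41, 0x5A),        # A-Z
    (0x5F, 0x5F),        # "_"
    (0x61, 0x7A),        # a-z
    (0xC0, 0xD6),
    (0xD8, 0xF6),
    (0xF8, 0x2FF),
    (0x370, 0x37D),
    (0x37F, 0x1FFF),
    (0x200C, 0x200D),
    (0x2070, 0x218F),
    (0x2C00, 0x2FEF),
    (0x3001, 0xD7FF),
    (0xF900, 0xFDCF),
    (0xFDF0, 0xFFFD),
    (0x10000, 0xEFFFF),
]

# Extra code points allowed after the first character of a name.
NAME_EXTRA_RANGES = [
    (0x2D, 0x2E),        # "-" and "."
    (0x30, 0x39),        # 0-9
    (0xB7, 0xB7),        # middle dot
    (0x300, 0x36F),      # combining diacritical marks
    (0x203F, 0x2040),
]


def _in_ranges(ch, ranges):
    """Return True if the code point of ch falls in any (low, high) range."""
    cp = ord(ch)
    return any(low <= cp <= high for low, high in ranges)


def is_name_start_char(ch):
    return _in_ranges(ch, NAME_START_RANGES)


def is_name_char(ch):
    return is_name_start_char(ch) or _in_ranges(ch, NAME_EXTRA_RANGES)


def is_xml_name(name):
    """Check a string against the fifth-edition Name production."""
    if not name:
        return False
    return is_name_start_char(name[0]) and all(
        is_name_char(ch) for ch in name[1:]
    )


print(is_xml_name("title"))         # True
print(is_xml_name("\u1780\u1781"))  # True (two Khmer letters)
print(is_xml_name("1abc"))          # False (a digit cannot start a name)
```

The ranges are broad blocks with a few holes punched in them, which is why a name written in a script that Unicode 2 never covered passes without the table having to mention that script at all.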

Michael Rys says that 1.1 becoming a recommendation was a day for mourning. I assume he would say the same for 1.0 5th edition.

Norm Walsh likes it a lot. "The fifth edition does not change the status of any existing XML 1.0 document with respect to well-formedness or validity. Nor does it introduce any of the backwards-incompatible changes introduced in XML 1.1."

So Norm's comment makes me think it is fully backwardly compatible (the 5th edition, that is).

John Cowan describes the characters that are not allowed in the 5th edition.

David Carlisle says the change is a good one, but that it should explicitly be called a new version and not passed off as an erratum.

Mark Nottingham describes the problem that most seem concerned about, namely "implementation Z (of, say, the 3rd edition) coming across a 5th edition document and blowing up".
This is a logical concern. But in my reading even this scenario leaves backward compatibility intact, as long as the Unicode standard only expands in supersets and does not alter existing material. This is what I understand to be the case.

So I still haven't found out what makes XML 1.0 5th edition not backwardly compatible. For sure it would lead to things not being forwardly compatible. But that is a well understood issue in software development and doesn't prevent forward progress.

3 comments:

  1. XML 5th edition is definitely backward compatible: anything well-formed in a previous edition is also well-formed in the 5th edition, and the same is true of validity. Some invalid documents with DTDs become valid.

    Note that what's at issue here is the names of elements, attributes, processing instructions, and so on; it has always been the case that any character is valid in element content and attribute values, except those few that are explicitly forbidden and are still forbidden.

    Yes, it's a dirty trick. But the clean trick (XML 1.1) got no takeup, and this is the only way that justice between English-speakers and Khmer-speakers can be achieved.

  2. The notion of forward or backward compatible doesn't apply here. The version number is being kept at 1.0 to deliberately obscure the direction of movement, or in fact that any movement has happened.

    If software implementing version 1.0 of a spec is incompatible with version 1.1 of a spec you can talk sensibly about whether there is forwards or backwards compatibility, but if two pieces of software both implementing version 1.0 of a spec are incompatible then that is just bad for everyone.

  3. There seems to be a chicken and egg thing going on here. Vendors won't implement XML 1.1 because there are no XML 1.1 docs, and people won't adopt it because vendors don't support it.

    So I can see why the "dirty trick", as John calls it, is in the works.

    But I do have to say that versioning must be clear. A new MINOR version number is not "a small amount of change" but an indicator of backwards compatibility. I really don't like it when organizations use a major new version number as a marketing theme. If Unicode 4 is backwardly compatible with Unicode 2 (which I have heard to be the case, but have not verified myself), then the new version number is marketing. In that case it should be version 2.1, 2.2, or some such.
