Anti-forgery stuff: XML or no XML

Working on anti-forgery stuff. Here is my latest post to the IETF MARID working group. (It might be interesting to some folks, but I’m mostly keeping it in my journal for myself.)


I made the following comment during the jabber session yesterday, when it seemed we were moving further and further away from an agreement on XML:

> gconnor: I think the XML issue is a very small part of the important work we are doing within MARID. It’s an implementation detail. We have made a lot of progress in terms of coming together on identities and features. let’s not forget that. I know the XML issue pushes a lot of people’s hot buttons. I don’t want this issue, which I consider “not really central to the problem we are trying to solve” to tear the group apart

I don’t have a strong vested interest in whether MARID gets implemented in XML, plain SPF, or Java or Tcl for that matter. I have some opinions, but I am not going to let my opinion stand in the way of the larger goal – MARID success and a proposed standard.

Here is my attempt to review some of the positions on the issue. I agree with Marshall that there really needs to be some compromise between the two extremes or we are not going to be successful. Therefore, please read this message through to the end before replying.

Position 1. In favor of XML:

XML is extensible. We can add new syntax later without breaking things.
Our MARID data should be part of a larger picture of “email policy”.
There are standard tools out there to parse XML that we can re-use.
We are going to look silly and be irrelevant in 5 years if we don’t allow extensions.

Position 2. Opposed to XML:

XML records are too big for what we want.
Rampant extensibility might lead to inconsistent results later.
An XML parser would be too big for my MTA or other mail-analyzing program.
Keep It Simple. Testing an XML parser would be onerous compared to other testing needed.
Mistrust of Microsoft and their motives for wanting XML.
If we implement something too big or complex, we will look silly and be irrelevant in 5 years.

Position 3. Implement both, let domain owners decide:

Plain SPF and XML MUST be supported by all implementations.
Domain owners are free to publish in either format. People can vote with their feet.
If there is no extension really needed, plain text will probably be preferred because it is more readable.
If there is some extension that people end up needing, the XML may be preferred in the future.
If there is a clear winner in terms of market adoption, perhaps a later version of our standard will deprecate one or the other.

Position 4. Leave it up to the MTA:

Plain SPF MUST be supported. XML SHOULD be supported.
Those implementing MTA and mail filters can start with SPF as it is written today, and are encouraged to support XML also, but reading and acting on only the Plain SPF format is also allowed.
Software vendors that don’t support XML do so at their own risk. Users who feel they need XML support may complain. The software not supporting XML may get negative reviews later, or have a checkmark missing on the feature grid. Patches later might be needed.
Software that understands SPF now will be pretty much ready-to-go as is.
Domain owners may choose XML but might be faced with less support among receivers.
Assuming the XML may be extended and the plain format might not be, when both are present, the XML should be used, falling back to plain SPF only if the MTA doesn’t support XML.
Allows people to vote with their feet, sort of. Software vendors have some freedom to choose, but are ultimately accountable to the customers.
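To make the precedence rule in Position 4 concrete, here is a minimal Python sketch. The function name, its arguments, and the ("xml"/"spf") labels are all hypothetical; nothing here comes from a draft. It only shows the "prefer XML when you can read it, otherwise fall back to plain SPF" decision.

```python
from typing import Optional, Tuple


def choose_record(
    plain_spf: Optional[str],
    xml_record: Optional[str],
    supports_xml: bool,
) -> Optional[Tuple[str, str]]:
    """Pick which published record a receiver should evaluate.

    Returns a (format, record) pair, or None when the domain publishes
    nothing this receiver can read. All names are illustrative.
    """
    # XML may carry extensions the plain format lacks, so it wins
    # whenever it is published and the receiver can parse it.
    if xml_record is not None and supports_xml:
        return ("xml", xml_record)
    # Otherwise fall back to plain SPF, which every receiver MUST support.
    if plain_spf is not None:
        return ("spf", plain_spf)
    # XML-only domain meeting a plain-only receiver: nothing usable.
    return None
```

The last case is the "less support among receivers" risk: a domain that publishes only XML gets no result at all from software that reads only plain SPF.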

More thoughts on extensibility:

There are two kinds of extensibility, and there is some disagreement as to which we want, if any.

“Feature” or “semantic” extensibility adds new mechanisms later, and may change the output of the checking function, or even add new inputs to the function. This can be supported, if done carefully. Either an intelligent default, or an “alt=” type of statement attached to the extended feature, would be needed in order for the software of today to “navigate around” the data points we throw at it tomorrow. There is some support for this kind of extensibility, but there are also others who feel strongly that we should NOT be able to add new features on the fly, as this would lead to inconsistent behavior among some implementations and that’s not good.

Regarding feature extensibility, my guess is there are probably a lot of people who like it, but their support is lukewarm and they aren’t passionate enough about it to defend it. At the same time, there are probably some people who feel strongly enough to come out swinging against such flexibility, fearing that the new extensions won’t add enough value to offset the potential harm from flaky software not doing extensibility right or not being consistent with each other.

We also have “syntax” extensibility, which allows us to throw in new data later, but only as “extra” or “not core” data. The new data might give extra info (like publishing a domain’s desire to receive problem reports) or it might be something not relevant to MARID at all (like whether a domain uses domainkeys, or where to find accreditation, or something like that).

Just syntax extensions, with no feature extensions, are easy to add (to any language, not just XML) as long as the rules are clear from the start. For example, no “extra” data will ever result in a change to the accept/deny/unknown function. An MTA can either ignore it totally, or complain that it’s a possible typo, depending on where the data appears. (For example, an “ep” might have nodes other than “out” and “out” might have nodes other than “m”, but nodes within “m” should be of a recognized type or the implementation should interpret this as a typo and throw an error.)
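Here is a rough Python sketch of that "ignore extras outside m, reject unknowns inside m" rule, using the standard library's xml.etree.ElementTree. The element names "ep", "out" and "m" are the ones from the example above. The set of recognized mechanism names, the RecordError class and the function name are placeholders I made up for illustration, and the sketch assumes the record uses no XML namespace.

```python
import xml.etree.ElementTree as ET

# Placeholder list of mechanisms allowed inside <m>; the real list
# would be whatever the final spec defines.
KNOWN_MECHANISMS = {"a", "mx", "ip4", "ip6", "include"}


class RecordError(Exception):
    """Raised when a record is malformed in a place where strictness applies."""


def core_mechanisms(xml_text):
    """Return the (name, text) pairs found under ep/out/m.

    Unknown children of <ep> and <out> are treated as "extra" data and
    skipped. Unknown children of <m> are treated as a probable typo.
    """
    root = ET.fromstring(xml_text)
    if root.tag != "ep":
        raise RecordError("expected <ep> as the root element")

    found = []
    for out in root.findall("out"):      # other children of <ep> are ignored
        for m in out.findall("m"):       # other children of <out> are ignored
            for node in m:               # every child of <m> must be recognized
                if node.tag not in KNOWN_MECHANISMS:
                    raise RecordError(
                        "unrecognized node <%s> inside <m>, possible typo" % node.tag
                    )
                found.append((node.tag, (node.text or "").strip()))
    return found
```

For instance, a record with an extra "report" node under "ep" and an extra "note" node under "out" still yields only the mechanisms inside "m", while a misspelled mechanism inside "m" raises RecordError. Since the extras never reach the list the accept/deny/unknown function sees, they cannot change its result.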

More about size:

XML is going to be a bit bigger than plain SPF or another plain-token language. Frankly, I don’t see size as a big issue… it is a concern, but not enough of a concern for me to exert a veto or leave the party early. We will have an include-type mechanism for stitching together records. DNS *does* actually have a mechanism for fetching larger records, even though it may be blocked by some firewalls… I think it’s the responsibility of the domain owner to make sure that the record fits in a response packet, and test this, and those who choose to publish larger records should do so at their own risk.
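A domain owner could get a first estimate of "does it fit" with something like the Python sketch below. It rests on several assumptions of mine: the record is published as a single TXT answer, the response carries no authority or additional sections, the owner name in the answer is compressed to a 2-byte pointer, and the limit is the classic 512-byte UDP payload from RFC 1035 (no EDNS0). The function names are made up. It is an estimate only and does not replace querying the real record from outside.

```python
def txt_rdata_length(record):
    """Wire size of a TXT RDATA holding `record`.

    TXT data is stored as character-strings of at most 255 bytes,
    each preceded by a 1-byte length.
    """
    data = record.encode("utf-8")
    chunks = max(1, -(-len(data) // 255))  # ceiling division, at least one string
    return len(data) + chunks


def estimated_response_size(domain, record):
    """Estimate the size of a DNS response carrying one TXT answer."""
    labels = [label for label in domain.rstrip(".").split(".") if label]
    qname = sum(len(label) + 1 for label in labels) + 1  # labels plus root byte
    header = 12
    question = qname + 4                 # QTYPE and QCLASS
    answer = 2 + 10 + txt_rdata_length(record)  # pointer, fixed RR fields, RDATA
    return header + question + answer


def fits_in_udp(domain, record, limit=512):
    """True if the estimated response fits in `limit` bytes."""
    return estimated_response_size(domain, record) <= limit
```

Under these assumptions a short plain record such as "v=spf1 mx -all" for example.com comes to 56 bytes, so there is plenty of room. A record of 600 bytes would not fit and would rely on the larger-record mechanism mentioned above.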

Wrap up, and where to go from here:

I would prefer not to see replies to this post where people tear it up and match one or two sentences of mine with some quick comeback of theirs. That’s a great style for arguing, but I think the time for arguing is past. Most people who care about the issue have made most of the intelligent arguments they are going to make.

What I would really prefer to see is a few paragraphs that say:
1. What position you support
2. What other fallback or compromise positions you can live with, and
3. How strongly you feel about one or the other.

If you respond supporting one extreme, and don’t state a compromise position that you can live with (either 3 or 4 or something else I haven’t spelled out), then you might want to add a footnote saying why you believe this issue is enough of a dealbreaker that you are not willing to compromise, and why you are willing to let MARID fail rather than give in on the issue.

This is where the rubber really meets the road, and the hard work of the working group starts. Failure to reach any agreement is a *guaranteed* way to look silly and be irrelevant in 5 years :) Let’s all try to work together and rise to the challenge. We can do this.

Thanks for your time.
gregc
