Chris's Rants

Saturday, January 29, 2005

None of the above

Steve Maine weighs in on the emerging blogosphere permathread on app protocols and transport independence:
Little bit of a diversion here, but this is what I don’t get about the whole MEST thing: if there’s only one application semantic (i.e. “ProcessMessage”), then why do different things happen when I send you different messages? There’s got to be something inside that message with some meaning, a latent semantic lurking in there somewhere, otherwise the system would never do anything interesting. ProcessMessage/POST is not an operation, it’s a punt. It’s a verb that effectively means “hey, I have no idea what you meant by this, so you go deal with this blob of data in whatever way you see fit”. In other words, it’s a non-verb. The interesting part (the part that actually governs the ultimate behavior) is not the POST but what got POSTed. The application protocol moves up a layer; behavior is governed not by the so-called protocol verb, but by the contents of the message and its position relative to other messages in a sequential conversation. I think it’s inaccurate to say that the entire application protocol consists of walking up to an endpoint and saying “deal with this”…(why that particular endpoint and not a different one? Why did it process one message but fault on a different one? There are semantics in here somewhere!). Maybe Savas can explain this to me.
Exactly right. FWIW, HTTP POST could just as easily have been named PUNT, or NONEOFTHEABOVE, or WHATEVERRR. Possibly, we could have named it after the great philosopher, Humpty Dumpty:
"When I accept POST", said Humpty Dumpty, "it means just what I choose it to mean, neither more nor less."
From section 9.5 of RFC2616 (emphasis mine):
The POST method is used to request that the origin server accept the
entity enclosed in the request as a new subordinate of the resource
identified by the Request-URI in the Request-Line. POST is designed to allow a uniform method to cover the following functions:

- Annotation of existing resources;

- Posting a message to a bulletin board, newsgroup, mailing list,
or similar group of articles;

- Providing a block of data, such as the result of submitting a form, to a data-handling process;

- Extending a database through an append operation.

The actual function performed by the POST method is determined by the server and is usually dependent on the Request-URI.
As the specification says, POST was designed as the none-of-the-above method (not GET, not PUT, not DELETE, not HEAD, etc.). It has no fixed semantic of its own other than to pass the data in the entity body of the request message to some server-designated process.
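To make the "none of the above" point concrete, here is a minimal Python sketch of a server that decides what POST means based on the Request-URI. The URIs and handler names are made up for illustration; nothing here comes from RFC 2616 beyond the idea that the server picks the process.

```python
# Hypothetical sketch: the server, not the method name, decides what POST means.
# The URIs and handler functions below are illustrative assumptions.

def annotate(body):
    """Stand-in for "annotation of existing resources"."""
    return "annotated: " + body

def append_record(body):
    """Stand-in for "extending a database through an append operation"."""
    return "appended: " + body

# Server-designated processes, keyed by Request-URI.
POST_HANDLERS = {
    "/notes": annotate,
    "/db/orders": append_record,
}

def handle_post(request_uri, body):
    """Hand the entity body to whatever process the server bound to the URI."""
    handler = POST_HANDLERS.get(request_uri)
    if handler is None:
        return "404 no process bound to " + request_uri
    return handler(body)
```

The same method and the same body produce different behavior at different URIs, which is all the specification promises.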

Savas follows up with a missive that attempts to clarify Steve's confusion regarding MEST:
In MEST, we see a service-oriented application as a collection of services that interact through the exchange of messages. The messages can be grouped into interesting message exchange patterns or protocols. But, how is communication achieved? We need to define the semantics of how a message is transferred from one service to the other. This is where the ProcessMessage() operation is needed.

We wanted to use people's familiarity with the concept of a 'call', 'method', 'operation' (see WSDL) to describe the semantics of what 'message transfer' meant. So, we combined the concept of a one-way message and the implicit request to process that message (extract the content and do something with it) into a logical operation called ProcessMessage. We thought that if people wanted to describe distributed applications in terms of operations, we could give them one but define its semantics in such a way that we can get what we want: one-way messages.
Both HTTP POST and the MEST ProcessMessage() have roughly the same semantic as far as I can tell.

SOAP defines a processing model that is inherently one-way:
SOAP provides a distributed processing model that assumes a SOAP message originates at an initial SOAP sender and is sent to an ultimate SOAP receiver via zero or more SOAP intermediaries. Note that the SOAP distributed processing model can support many MEPs including but not limited to one-way messages, request/response interactions, and peer-to-peer conversations
The message exchange pattern (MEP) is just that: a pattern. The fact that many, if not most, people use SOAP in conjunction with a request/response MEP doesn't preclude SOAP being used in the context of other, possibly more interesting and useful, MEPs.

However, it is when we need to map MEPs to an underlying transfer protocol like HTTP that we encounter difficulties. A transfer protocol like HTTP has its own architectural constraints. HTTP is by definition a request/response protocol. The request message originates at the user agent and (typically) terminates at the origin server. The response message originates at the origin server, terminates at the user agent, and is returned over the same network connection on which the request was received.

Additionally, HTTP defines the user agent as the entity that establishes the network connection:
client
A program that establishes connections for the purpose of sending
requests.

user agent
The client which initiates a request.
Thus, there is no provision for request messages that originate at the origin server and terminate at the user agent. While this presents no problems for the application for which the HTTP protocol was defined (the web), or for applications that interact by means of a request/response MEP in which the roles of the actors map neatly to the roles defined by HTTP, it results in rather unnatural acts when applied to other application uses such as peer-to-peer messaging, publish/subscribe, etc.

The tension in this permathread is really about how to map application semantics to the transfer protocol. Mark would have HTTP be the application protocol. Thus, when the semantic of a request message is "retrieve the representation of the resource identified by this URI", HTTP GET should be used rather than a getSomething() operation tunneled over HTTP POST, and the URI of the resource should be the Request-URI of the HTTP request, not buried in a wsa:To SOAP header block in the SOAP entity body of an HTTP POST request message. Mark's point is that by not leveraging the HTTP application protocol, Web services have effectively opted out of some of the architectural benefits that the infrastructure of the Web provides:
For example, had HTTP not had a GET semantic, then there'd be no need for caching. Now consider that with a "protocol independent" equivalent of GET, ala WS-Transfer, you've lost the ability to optimize the transfer features for that case. So while you could certainly try to deploy WS-Transfer, it would necessarily perform a whole lot worse than HTTP because optimization would require modifying SOAP. At least HTTP is optimized for the general case because it is the result of the merging of the two layers we're talking about.
He's right and he's wrong, at the same time.

He's right that by bypassing the application semantics of HTTP and tunnelling all requests over POST, Web services have effectively opted out of exploiting the deployed web infrastructure for things like caching of responses, and thus will not scale as effectively as they might have had they used HTTP GET for those requests that are effectively equivalent to an HTTP GET.

However, his assertion that you would have to modify SOAP to cache WS-Transfer Get responses in a transport-independent manner is just plain wrong. WS-Addressing, maybe... but not SOAP. Just as with HTTP, there would need to be a SOAP header block defined that is the rough equivalent of the HTTP Cache-Control header. This new header could specify the criteria, if any beyond the wsa:To URI, for a matching request, along with the TTL for the cached response, and be targeted at a soap:role whose URI identifies a caching intermediary SOAP node.
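As a rough illustration of what such a header block might look like, the Python sketch below builds a SOAP 1.2 envelope carrying a cache-control block aimed at a caching intermediary. No such header is defined by any specification I know of: the `urn:example:cache` namespace, the `Control`, `MaxAge` and `MatchOn` element names, and the role URI are all invented for this example. Only the SOAP 1.2 envelope namespace and its `role` attribute are real.

```python
import xml.etree.ElementTree as ET

SOAP = "http://www.w3.org/2003/05/soap-envelope"   # SOAP 1.2 envelope namespace
CACHE = "urn:example:cache"                        # hypothetical namespace
CACHE_ROLE = "urn:example:role:cache"              # hypothetical caching role URI

def build_cacheable_envelope(ttl_seconds, extra_match_headers=()):
    """Build a SOAP envelope carrying a hypothetical cache-control header block."""
    envelope = ET.Element(f"{{{SOAP}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP}}}Header")

    # Target the block at the caching intermediary via the SOAP 1.2 role attribute.
    control = ET.SubElement(header, f"{{{CACHE}}}Control")
    control.set(f"{{{SOAP}}}role", CACHE_ROLE)

    # TTL for the cached response, analogous to Cache-Control: max-age.
    ET.SubElement(control, f"{{{CACHE}}}MaxAge").text = str(ttl_seconds)

    # Matching criteria beyond wsa:To, named here by header element name.
    for name in extra_match_headers:
        ET.SubElement(control, f"{{{CACHE}}}MatchOn").text = name

    ET.SubElement(envelope, f"{{{SOAP}}}Body")
    return envelope
```

Calling `ET.tostring(build_cacheable_envelope(300, ["wsa:Action"]), encoding="unicode")` serializes the envelope. The point is only that the caching hints ride in a header block, so the SOAP processing model itself is untouched.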

Would a caching SOAP intermediary be as efficient as an HTTP caching intermediary? Possibly not. The arguments for the matching criteria are expressed as SOAP header blocks (wsa:To, wsa:Action and possibly one or more reference parameters), which may not be adjacent to one another. With HTTP, the Request-URI and the HTTP method sit next to each other on the first line of the request message, making it a snap to extract the matching criteria for a cached response. Would it be that much less efficient as to render Web services caching irrelevant? I doubt it. (Of course, there's no reason why the WS-Addressing WG couldn't grant me my wish and have the wsa:To and wsa:Action infoset properties be expressed as attributes on the soap:Envelope element rather than as SOAP header blocks.)
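The difference in effort can be sketched in a few lines of Python. This is not a benchmark and not how any real cache is implemented; the function names are mine, and I'm assuming the W3C WS-Addressing 1.0 namespace for the wsa: prefix.

```python
import xml.etree.ElementTree as ET

SOAP = "http://www.w3.org/2003/05/soap-envelope"   # SOAP 1.2 envelope namespace
WSA = "http://www.w3.org/2005/08/addressing"       # WS-Addressing 1.0 namespace

def http_cache_key(raw_request):
    """Method and Request-URI sit together on the first line of the request."""
    request_line = raw_request.split("\r\n", 1)[0]
    method, uri, _version = request_line.split(" ")
    return (method, uri)

def soap_cache_key(envelope_xml):
    """wsa:Action and wsa:To have to be located among the header blocks."""
    header = ET.fromstring(envelope_xml).find(f"{{{SOAP}}}Header")
    action = header.findtext(f"{{{WSA}}}Action")
    to = header.findtext(f"{{{WSA}}}To")
    return (action, to)
```

The HTTP key falls out of one line of text, while the SOAP key needs the envelope parsed and the header blocks searched. That is more work, but hardly prohibitive.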

IMHO, it is misguided to assume, or argue, that the HTTP application protocol is the only-application-protocol-we'll-ever-need. The fact is that many of the types of application for which Web services (and WS-Addressing) are intended don't map neatly to the message exchange patterns, application semantics and/or roles defined by the HTTP application protocol.
