
November 29, 2005

TrackBacks are Pings, Maybe

Byrne Reese from SixApart is pursuing an idea as he works toward a formal specification for TrackBacks to submit to the standards process: are trackbacks and pings really the same thing? In talking to him a week ago, my reflexive answer was: “Sure, the only things that are different are the endpoints.” Having thought about it some more, I’m convinced that pings and trackbacks are in fact synonyms.

 

Or at least they should be. In practice, things are more difficult. First, while there is no single ping format specification, Dave Winer’s format (URLs over XML-RPC) is the de facto standard. (Note: Atom can and does work via XML-RPC; it’s a replacement for the XML document structure, not the transport.) Second, while conceptually analogous, the current spec from SixApart is a (mostly) REST-ful implementation of pings. So right off the bat, for TrackBack and (de facto) Ping to coalesce, the transport issue will need to be resolved.

 

In general, I’m a fan of REST. But REST is really geared to communicating parameters (state) via URLs using CGI encoding, rather than transmission of content. XML-RPC, on the other hand, transmits content directly, and is well suited to large, extended chunks of data being sent over the wire. Why should this matter when considering how to harmonize TrackBack and Ping? Because the emerging idea of what a “ping” is entails much more than just a timestamp and URLs pointing back to the origin server. Pings are that, and will continue to be, but they will be increasingly used to convey much more about the published content: meta-data, usage policy, and even the raw content itself (“fat ping” for those who follow the evolving discussions on the matter).

 

If the payload for a ping is going to support extended metadata, policy and even the content itself, then a REST-ful interface is probably not the right choice. XML-RPC is a fine choice for handling large complex bundles of data, but is problematic for publishers of blog tools. Most blog tools support HTTP POST functions that authors use to publish their entries. TrackBacks, as designed by SixApart, are just HTTP POST requests that arrive not from the author, but from other blogs. So, TrackBacks are straightforward for blog tools to support, since they are REST-ful and rely only on HTTP POST. Changing TrackBack to require XML-RPC transport would place a heavy burden on blog tool providers, and would probably not be embraced quickly or widely. It’s not rocket science, but it is a different processing model than the one blog tools are typically built around.
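To make the two processing models concrete, here is a minimal Python sketch that builds both message bodies without sending anything. The field names (url, title, excerpt, blog_name) and the weblogUpdates.ping method name come from the published TrackBack and weblogs.com ping conventions; the function names and example URLs are made up for illustration.

```python
from urllib.parse import urlencode
import xmlrpc.client


def trackback_body(url, title="", excerpt="", blog_name=""):
    """Form-encode a TrackBack ping, the same encoding an HTML form POST uses."""
    fields = {"url": url, "title": title, "excerpt": excerpt, "blog_name": blog_name}
    # Only "url" is required; leave out anything the caller did not supply.
    return urlencode({k: v for k, v in fields.items() if v})


def xmlrpc_ping_body(blog_name, blog_url):
    """Wrap a similar announcement in an XML-RPC methodCall document."""
    return xmlrpc.client.dumps((blog_name, blog_url), methodname="weblogUpdates.ping")


if __name__ == "__main__":
    print(trackback_body("http://example.com/post/1", title="Hello", blog_name="My Blog"))
    print(xmlrpc_ping_body("My Blog", "http://example.com/"))
```

The first body is something any tool that already handles form posts can read; the second needs an XML-RPC parser and dispatcher on the receiving end, which is the extra burden described above.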

 

I think the answer here is that TrackBacks are lightweight pings. That is, messages that are faithful to the original name “ping” – “I’m here, are you there?”. Maybe TrackBack is the issue that called the bluff on a lot of what has been happening around ping in the last few months: “Ping” is probably not the right word anymore, given the features we hope to infuse it with. If we adopt a new term for “fat ping”, and decide to give “ping” back its original meaning for the blogosphere, then TrackBacks and lightweight pings can probably be merged. TrackBack/Pings can be REST-ful – natural and familiar to use for the blog tools, and workable for ping servers. Fat pings – or whatever they will be called – can continue with XML-RPC, XMPP or whatever transport is needed to support moving larger, structured bundles of content around.

November 28, 2005

OPML and URL-Based Identity

Dave Winer points to Johannes Ernst’s recent post summarizing all the good things happening around URL-based identity. From talking to Dave, he seems to be leaning the same direction as many others are lately: URLs are the natural building blocks for user-powered identity on the Net. It’s a great boost to the project if Dave and OPML can adopt a common framework for identity. There’s more discussion happening around this later this week, so we’ll have to wait and see, but given my last conversation on this with Dave just before Thanksgiving, I’m quite encouraged.

Usenet, RIP

I just heard from my cable company, who is also my ISP, that they are retiring their Usenet servers this month. AOL dropped support for Usenet nearly a year ago. Microsoft.public.* groups are dwindling. ISPs across the net are dropping Usenet from their feature list. Now if I want to boot up TIN and read comp.lang.ruby, well, I guess the days of using TIN are over, and I’ll just have to point my browser at groups.google.com.

 

Never mind that reading Usenet via Firefox and Google isn’t the same experience at all.  It’s not just the ads trying to be inconspicuously conspicuous over at the edge of the page. There’s no way to cycle through messages on groups I’ve subscribed to. No automatic quoting for splicing my comments in reply to someone else’s. I could go on. If you’ve been reading Usenet since the days when Usenet was the Internet, or at least that’s what the press and the rest of the world thought, you know what I mean. I’m sure I could find a Usenet server somewhere that I could use the “right way”. But it would be pointless, because the great global discussion that was Usenet is no more.  Or rather the discussion is happening, and more intensely than ever. It’s just gone elsewhere.

 

There are several factors that contributed to the downfall of Usenet, but ultimately this is another case of the “tragedy of the commons”.  Like email, Usenet promoted free, easy communications to any and all. And the same scaling capabilities that allowed it to grow so big left it defenseless against abuse. At its peak, Usenet spam easily outnumbered legitimate posts on many groups. A spammer with an anonymous re-mailer could flood the channel with noise at near zero cost, and with impunity. That opportunity proved too tempting and since it only takes a relatively small number of spammers to choke the system, over time Usenet succumbed. Readers and posters went elsewhere. Web-based forums flourished, equipped with their own login systems, and importantly, moderators.

 

I’ve no problem with the migration to web-based forums. In some ways, Usenet as über-forum was unworkable at large scale; even if spam wasn’t a problem, the sheer volume of popular groups was such that it was impossible to keep up. But for many “long tail” subjects, Usenet was remarkable in its ability to aggregate interested parties from around the world.

 

Why does this matter? Maybe it doesn’t. But when I looked recently at some trends and stats for content coming through weblogs.com, I couldn’t help thinking about some of the worst days for Usenet several years ago, before everyone left and when spammers were really in high gear there. Vigorous growth and participation, being vigorously exploited by those who understand that the system is nearly defenseless against abuse. Another commons has developed in the blogosphere, and is beset by toxins.

 

I suspect things may be different this time around with the blogosphere, though. The formats and specifications are still very much in development in this space, so the blogosphere isn't doomed to being defenseless. I think many have learned the lessons from the problems that email and Usenet encountered. There's a lot of energy being dedicated right now at VeriSign and elsewhere to enabling open standards that provide means for the commons to defend and police itself from abuse.  Identity, reputation and content management frameworks are emerging that can provide a good measure of defense from spam and abuse, and at the same time keep the "commons" the "commons".

 

November 07, 2005

Yadis and URL-Based Identity

Johannes Ernst has taken another swing at YADIS (Yet Another Decentralized Identity Interoperability System) with Brad Fitzpatrick and David Recordon of SixApart. Johannes is the head of Netmesh, the people behind LID, the Lightweight Identity System. Brad and David are the driving force behind OpenID, an even lighter-weight identification system than LID. Both LID and OpenID focus on the URL as the anchor object for an identity, and in past months have worked to find an abstraction layer that would allow sites and organizations that consume identity to use a single means of discovery to authorize users who have OpenID or LID-based identities. Yadis is still painfully thin in terms of specification, but it seems probable that in the next couple of months the details will be nailed down sufficiently to enable implementations that let identity consumers consume identities from either LID or OpenID users, or from other ID systems as well.

 

It’s not clear that the people behind Yadis have coordinated at all with Kim Cameron of Microsoft, but Yadis, sketchy though it is, sounds a whole lot like the identity metasystem Kim’s been advocating. Microsoft’s metasystem tends toward being large, comprehensive and quite complicated, and Yadis in comparison is much more humble and practical, but they both point to the same need: rather than a “better mousetrap” for SSO and identity, a “plug-and-play” bus needs to be agreed on that will enable those who consume identities to integrate multiple competing identity systems in one implementation.

 

Identity System Discovery

The idea behind YADIS is not about implementing a particular identity scheme. It’s not an identity framework itself. Rather, it’s a discovery mechanism. A Yadis-equipped website – say your blog – would use the Yadis discovery protocol to determine what identity capabilities the submitted URL supported. Right now, the only two that are addressed are LID and OpenID. Yadis provides a mechanism for a user to configure their identity URL(s) for LID, OpenID, or both.

 

It’s possible (and probable) that other identity systems will be plugged into this process, meaning that a Yadis-enabled identity consumer (like our blog server) would have an increasing number of possible ID systems to choose from for its users. The basic requirement for Yadis is that the submitted identity claim be expressed as a URL. If a given identity system can satisfy that requirement, then conceivably it could clip into the Yadis “plug-and-play” framework. It’s not incumbent on an identity consumer to support any particular ID system. The website may inform the user upfront which ID systems it supports, or generate appropriate error messages when the discovery process fails to find a suitable ID framework for the submitted identity URL.

 

If Yadis succeeds, then, it provides a way for websites to support multiple ID frameworks, and to let all the different identity mousetraps compete within a single framework. New, Yadis-compliant ID systems can be introduced without the website having to re-engineer its login systems. All that would have to be changed is the configuration on the back end – adding the new system to the “supported” list and specifying the trust controls for it. In this way, Yadis enables a “marketplace” for identity frameworks, and provides a means by which multiple frameworks can co-exist, competing and ordering themselves in the ecosystem according to their value.
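As a rough sketch of what that back-end configuration might look like, here is a small Python example. Everything in it is hypothetical: Yadis does not define a “supported” list format, and the system names, trust labels and function name are invented purely to illustrate the idea.

```python
# Hypothetical back-end configuration: the ID systems this site accepts,
# in order of preference, with a made-up trust setting for each.
SUPPORTED = {
    "openid": {"trust": "comments-only"},
    "lid": {"trust": "full"},
}


def choose_id_system(discovered, supported=SUPPORTED):
    """Return the first supported system the identity URL advertises, or None."""
    for name in supported:  # dicts keep insertion order, so this is preference order
        if name in discovered:
            return name
    return None  # caller should show an "unsupported identity system" error
```

Under these assumptions, supporting a new ID system is one more entry in SUPPORTED; the login code that calls choose_id_system does not change.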

 

Assessing Capabilities

There’s a tension between simplicity and self-configuration for the identity owner (the end user), and the identity consumer (the website). Identity consumers would prefer an identity URL to be a pointer to a specific capabilities document. For example, if the identity URL was “http://idhost.com/users/joe/xid”, then the URL might easily resolve to an XML document detailing all the configuration info for that identity. The problem is that the user needs to have a single URL, and will want the URL to point to a web page, so that humans – say those reading a business card – can navigate to the user’s home page. So if we have a single URL and that URL must resolve to an HTML page, then we have two options: a) use a naming convention for deducing the extended URL for retrieving capabilities, or b) embed meta information in the HTML home page for the URL. Yadis currently has chosen b), with the additional option of asking the web server for a preferred MIME type of “application/x-meta-identity”. If the web server is configured to support this, regular browsers will resolve “http://idhost.com/users/joe” to the expected “index.html” page. Identity consumers would be asking for “application/x-meta-identity” rather than “text/html”, and would thus receive the expected capabilities document directly. That’s nice, but remember that one of the basic benefits of URL-based identity is that it can be set up and configured by the user, without any special adjustments by the hosting provider. In the case above, the web server would need to be configured to send the capabilities document for the appropriate requests to “http://idhost.com/users/joe”.
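The MIME-type option is ordinary HTTP content negotiation. Here is a minimal Python sketch of the consumer side, which only builds the request and does not send it. The MIME type is the one named above; the function names and the choice to list text/html as a lower-priority fallback are my own assumptions, not something Yadis specifies.

```python
from urllib.request import Request

CAPABILITIES_TYPE = "application/x-meta-identity"


def capabilities_request(identity_url):
    """Build (but do not send) a GET that prefers the capabilities document."""
    # Assumed fallback: accept the HTML home page if the server is not
    # configured to serve the special type.
    accept = f"{CAPABILITIES_TYPE}, text/html;q=0.5"
    return Request(identity_url, headers={"Accept": accept})


def is_capabilities_response(content_type):
    """True if the server honoured the preference instead of returning the home page."""
    return content_type.split(";")[0].strip().lower() == CAPABILITIES_TYPE
```

If is_capabilities_response comes back False, the consumer is holding the HTML home page and has to fall back to reading the embedded meta information.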

 

Convention over Configuration
David Heinemeier Hansson of 37signals.com and Ruby on Rails fame advocates the idea of “Convention over Configuration”, and it applies here. We don’t emphasize configuration for the name of the HTML file to return for a web site URL like “http://www.domain.com” – by convention it’s “index.html”. Configuration can be applied to serve another page or resource, but the convention has served us well. Like “index.html”, it would make life easy for Yadis implementers, on both the user and the identity consumer side, if a set of simple conventions could be established. For example, if we keep the identity URL “http://idhost.com/users/joe” the way it is, and assume by convention that the URL “http://idhost.com/users/joe/yadis-config” is the derived URL for retrieving the Yadis capabilities document for this identity, we can skip an additional step called for right now, which is the slow and error-prone process of fetching and scraping the user’s home page to suck out the appropriate <link> tags to get things going. Similarly, if the convention prescribes “http://idhost.com/users/joe/yadis-key” as the derived URL for retrieving the public key for this identity, interested users (and servers) can go directly there when needed to retrieve the key, rather than having to wander through a set of configuration parameters. Again, in cases where the convention creates a problem, configuration can still be used. For the benefit of the masses, though, conventions offer the best solution, and configuration should be seen as a fall-back/specialization position.
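The convention proposed here amounts to a couple of lines of string handling. A minimal Python sketch, using the “yadis-config” and “yadis-key” suffixes suggested above (they are this post’s proposal, not part of any Yadis spec) and a hypothetical overrides argument standing in for explicit configuration:

```python
# Suffixes proposed in this post; not part of any published Yadis spec.
CONVENTION = {"config": "yadis-config", "key": "yadis-key"}


def derived_urls(identity_url, overrides=None):
    """Derive the capabilities and key URLs by convention, letting config win."""
    base = identity_url.rstrip("/")
    urls = {name: f"{base}/{suffix}" for name, suffix in CONVENTION.items()}
    urls.update(overrides or {})  # configuration as the fall-back/specialization
    return urls
```

With no overrides, derived_urls("http://idhost.com/users/joe") yields the two URLs given in the paragraph above, with no page fetch or scraping needed to find them.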

 

XML vs. HTML vs. Text
In terms of specifying configurations for one’s identity, the OpenID proponents are interested in keeping things as “HTML-centric” as possible – link tags in the <head> section of the HTML doc specify the server and delegate to be used for the identity. This is understandable. For their constituency, editing HTML may not be easy, but it’s at least a fairly familiar process. Johannes Ernst advocates returning simple text documents, with straightforward “line-by-line” directives. This, too, makes sense, as there’s nothing simpler than having just a plain text document to edit.
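For a sense of what the HTML-centric approach asks of an identity consumer, here is a minimal Python sketch that pulls OpenID link tags out of a page’s <head>. It assumes the rel values “openid.server” and “openid.delegate” used by OpenID; the class name and the rule of ignoring link tags outside <head> are my own choices for the example.

```python
from html.parser import HTMLParser


class OpenIDLinkCollector(HTMLParser):
    """Collect <link rel="openid.*" href="..."> tags found inside <head>."""

    def __init__(self):
        super().__init__()
        self.links = {}
        self._in_head = False

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self._in_head = True
        elif tag == "link" and self._in_head:
            attributes = dict(attrs)
            rel, href = attributes.get("rel"), attributes.get("href")
            if rel and href and rel.startswith("openid."):
                self.links[rel] = href

    def handle_endtag(self, tag):
        if tag == "head":
            self._in_head = False


def openid_links(html_text):
    """Return a dict such as {"openid.server": ..., "openid.delegate": ...}."""
    collector = OpenIDLinkCollector()
    collector.feed(html_text)
    return collector.links
```

This works on tidy pages; real home pages with broken markup are where the scraping gets fragile.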

 

My preference would be to bite the bullet and establish a simple XML document format for identity configuration. In my experience, users (including myself here) would be prone to mess up the HTML edits necessary for OpenID style configuration. In the OpenID case, the vast majority of OpenID-equipped users are those with LiveJournal accounts. LiveJournal conveniently configures their home pages for them – no editing necessary. But again, if one of the driving requirements is that a user should be able to set up and configure their own identity URL without any outside help, the “HTML-editing” strategy for configuration is problematic. Or at least no better for the user than editing a plain text file, or an XML file.

 

I believe users can edit simple XML files to define their configuration for their identity URLs. A simple text editor is all that is needed, and it’s easy to conceive that “helper” web pages will emerge that allow users to enter their options into a “control panel” form and download the result as a properly formed XML identity configuration file. (Note to self: something to gin up some evening.) The same can be said for editing HTML documents, but the process is much more complex when you have to edit someone else’s home page cleanly as part of the process.

 

On the consumer side, an XML configuration file wins hands down. Having written several different prototypes with OpenID and LID, I can say that snarfing out the right configuration parameters from an OpenID or LID URL isn’t rocket science, but it’s a hassle, and bound to become a major hassle as the demands on configuration (think security options) get more complex over time. Identity consumers are servers, and as such, feel much more comfortable parsing and managing XML documents than either HTML or plain text. Give me a well-formed XML document, and I can suck out what I need quickly, accurately, and in just a few lines of code. It wasn’t hard to write a simple plain text parser for LID’s capabilities document, but isn’t this why we have tools like XML in the first place?
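To show roughly what “a few lines of code” means, here is a Python sketch against an invented XML layout. No such identity configuration format exists yet; the element and attribute names (identity, system, key, name, server, href) are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration document; the format is invented for illustration.
SAMPLE = """
<identity url="http://idhost.com/users/joe">
  <system name="openid" server="http://idhost.com/openid/server"/>
  <system name="lid" server="http://idhost.com/users/joe"/>
  <key href="http://idhost.com/users/joe/yadis-key"/>
</identity>
"""


def parse_identity_config(xml_text):
    """Extract the identity URL, supported systems and key location."""
    root = ET.fromstring(xml_text)
    systems = {s.get("name"): s.get("server") for s in root.findall("system")}
    key = root.find("key")
    return {
        "url": root.get("url"),
        "systems": systems,
        "key": key.get("href") if key is not None else None,
    }
```

A malformed file fails loudly with a parse error rather than half-working, and new elements can be added later without breaking consumers that only look for the ones they know.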

 

Most importantly, XML provides serious headroom in terms of additional semantics that can be supported in the future. Considering that it adds very little complexity to the user experience, it not only makes the identity consumers’ job much more straightforward, but enables the addition of new features and semantics with little effort. OpenID’s HTML model and LID’s plain-text model will not scale in terms of new semantics. XML configuration files will.

November 02, 2005

Tag++

Can a tag have a ‘spin’? I’ve spent a lot of time tagging lately, and have found that I need a way to reflect positive/negative spin on items I’m tagging. For example, I was looking through my tags (and others’) for items labeled “ajax”, and specifically for items that focused on the problems or shortcomings of AJAX. I wish I had tagged all my ajax resources with something like this: “ajax+1” for positive articles, “ajax-1” for negative articles, “ajax” for “no-spin”. Or maybe, “-ajax”, “+ajax”, and “ajax” would be a cleaner syntax. As a long-time C++ programmer, I like “ajax++” and “ajax--” a lot, too.

 

In any case, it seems too fine grained to split my AJAX tags up into “ajax” and “negative”, “ajax” and “positive”, etc. That sort of works, but breaks down when multiple tags get involved. If I read an article on the realized virtues of AJAX and the perceived weaknesses of ASP 2.0, here’s what I might add for tags using my typical approach: “ajax”, “pos”,  “asp20”, “neg”. Fine.  But now when I want to pull up articles that are “detractors” of AJAX, what do I get? I get this article too, if I search for items tagged “ajax” “neg” – both of those tags were attached to this article, even though it was a pro-AJAX – anti-ASP article. The current tagging model has no way to bind “ajax” and “neg” together logically, and that’s what I’m really looking for.
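The false match is easy to reproduce. A minimal Python sketch, with made-up article names and the tags from the example above:

```python
# Two hypothetical bookmarks, tagged the "typical" flat way.
articles = {
    "ajax-praise-asp-critique": {"ajax", "pos", "asp20", "neg"},
    "ajax-problems": {"ajax", "neg"},
}


def tagged_with(items, *wanted):
    """Return names of items carrying every one of the wanted tags."""
    return sorted(name for name, tags in items.items() if set(wanted) <= tags)


# Both articles come back, although only one is actually critical of AJAX.
print(tagged_with(articles, "ajax", "neg"))
```

Tags on an item form a flat set, so nothing records that “neg” was meant for “asp20” and not for “ajax”.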

 

Of course, problems arise from my idea for tags like “ajax++”. First, the search function would have to be made aware. It hardly helps to tag articles with “ajax++” and “ajax--” when the tag search won’t let me pull them up when I just want to look for “ajax” tags. I’d need the search tool to understand something regular-expression-ish like “ajax*”, where the ‘*’ character serves as a wildcard for any “spin” suffixes. That’s not a significant problem, though, as tag searching is still just getting off the ground, and from a technical standpoint it’s trivial to support it. Whatever syntax is used will have a crowding effect on the tag namespace, however. For example, in the case I’m using above, if I wanted to tag technical articles related to the C++ programming language, the tag search function would think I was trying to retrieve items tagged “C” (the predecessor language) that had a positive spin on the subject. There are ways around this too, but these seem inevitably to end up with really geeky regex-like formulations. I’m looking for a simple way to put positive or negative spin on a tag.
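Here is a minimal Python sketch of a spin-aware search using the “++” and “--” suffixes, which also shows the namespace collision. The function names and the +1/0/-1 encoding are assumptions for the example, not anything an existing tagging service supports.

```python
def split_spin(tag):
    """Split a tag into (base, spin), where spin is +1, -1 or 0."""
    for suffix, spin in (("++", 1), ("--", -1)):
        if tag.endswith(suffix) and len(tag) > len(suffix):
            return tag[: -len(suffix)], spin
    return tag, 0


def has_base(tags, base):
    """True if any tag matches the base, whatever its spin (the "ajax*" search)."""
    return any(split_spin(tag)[0] == base for tag in tags)


print(split_spin("ajax++"))  # base "ajax" with positive spin
print(split_spin("c++"))     # the collision: read as "c" with positive spin
```

Any suffix chosen will collide with some legitimate tag this way, which is what pushes the fix toward escaping rules and regex-like syntax.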

 

Second, if this is a good idea at all, then the idea of supporting just positive and negative spin is probably too narrow. Instead, we’d need some kind of “qualified tag” system that provides more general semiotics for tags. For example, I might use a syntax like “ajax.pro” and “asp.anti” above. “pro” and “anti” here would still be tags, in the sense that I could search for items tagged “anti” (or maybe “*.anti”?) and find articles tagged “asp.anti”. Note that I’m not suggesting this as a way to assign taxonomical or ontological labels – existing tags do that already: I can label something with a tag like “Computers: Internet: Web Design and Development: Promotion: Search Engine Optimization Firms”, if I’m patient enough to type all that in, or have a tool that will let me quickly browse and choose DMOZ nodes for tags. This isn’t about topical classification, but rather about a property I find increasingly useful when I want to create and retrieve tags – advocacy for or against, or neither. That’s often a subjective distinction, but these are my own tags after all.
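A minimal Python sketch of searching such qualified tags with shell-style wildcards follows. The dotted “topic.qualifier” syntax is the one floated above; the article names and the use of glob patterns for search are assumptions for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical bookmarks using "topic.qualifier" tags.
articles = {
    "ajax-praise-asp-critique": {"ajax.pro", "asp.anti"},
    "ajax-problems": {"ajax.anti"},
}


def find(items, pattern):
    """Return names of items with at least one tag matching the glob pattern."""
    return sorted(
        name
        for name, tags in items.items()
        if any(fnmatchcase(tag, pattern) for tag in tags)
    )


print(find(articles, "ajax.anti"))  # only the article critical of AJAX
print(find(articles, "*.anti"))     # anything critical of something
print(find(articles, "ajax*"))      # anything about AJAX, whatever the spin
```

Because the qualifier is bound to its topic inside a single tag, the pro-AJAX, anti-ASP article no longer shows up in a search for AJAX detractors.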

 

On the other hand, I’m aware that one of the keys to tagging catching on was its stark simplicity. If the IETF had created a working group to formalize a tag specification, you’d probably be able to write small games with the syntax provided, but no one would be using it, at least in the way del.icio.us tagging has taken off. Still, as I invest more time tagging things I read and work with – really just advanced bookmarking – I’m losing information I’d like to preserve at the point where I add tags. It’d be nice to have a simple way to “micro-tag” my tags.

Disclaimer: Opinions expressed here and in any corresponding comments are the personal opinions of the original authors, not of VeriSign.
