The Slug HTTP Header

Introduction

The Atom Publishing Protocol (AtomPub, RFC 5023) defines an HTTP header named Slug:

9.7. The Slug Header

Slug is an HTTP entity-header whose presence in a POST to a Collection constitutes a request by the client to use the header’s value as part of any URIs that would normally be used to retrieve the to-be-created Entry or Media Resources.

Servers MAY use the value of the Slug header when creating the Member URI of the newly created Resource, for instance, by using some or all of the words in the value for the last URI segment.

In other words, the Slug header gives a client a way to suggest the URI for a newly created resource. In this post I want to answer two questions:

Why is it called Slug? What do molluscs have to do with URIs?
Why can clients only suggest the URI for a created resource?
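
Before tackling those questions, here is a rough client-side sketch of what sending the header might look like. It assumes my reading of RFC 5023 section 9.7.1 is right: the header value is the UTF-8 encoding of the suggested text, with every octet outside printable ASCII (and the percent sign itself) percent-encoded. The `encode_slug` helper and the `example.org` collection URI are made up for illustration, and the request is only built, never sent.

```python
from urllib.parse import quote
from urllib.request import Request

# Printable ASCII (0x20-0x7E) is left alone, except '%', which must be escaped.
_SLUG_SAFE = "".join(chr(c) for c in range(0x20, 0x7F) if chr(c) != "%")


def encode_slug(text: str) -> str:
    """Percent-encode the UTF-8 octets that cannot appear literally (hypothetical helper)."""
    return quote(text, safe=_SLUG_SAFE)


# Minimal Atom entry body; the collection URI below is an assumption.
entry = b'<entry xmlns="http://www.w3.org/2005/Atom"><title>The Beach</title></entry>'

req = Request(
    "http://example.org/myblog/entries",
    data=entry,
    method="POST",
    headers={
        "Content-Type": "application/atom+xml;type=entry",
        "Slug": encode_slug("The Beach at Sète"),
    },
)

print(req.get_header("Slug"))
```

The server is free to ignore this value entirely, which is the subject of the second question.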

Etymology of the term: ‘Slug’

I’m sure I’m not the only person who finds the term Slug non-intuitive; at first glance it has no obvious relevance to the world of URIs. Of course, if you don’t know something about the Web, you just need to ask it. Stack Overflow has this informative answer:

The term ‘slug’ comes from the world of newspaper production.

It’s an informal name given to a story during the production process. As the story winds its path from the beat reporter (assuming these even exist any more?) through to editor through to the “printing presses”, this is the name it is referenced by, e.g., “Have you fixed those errors in the ‘kate-and-william’ story?”.

Several web publishing systems and frameworks, such as WordPress and Django (which became popular in the same timeframe that AtomPub was created), use the term ‘slug’ in this manner, so I guess the term filtered across from those systems into the AtomPub specification.

Wikipedia offers a good explanation of how the ‘Slug’ term is applied to URIs:

Some systems define a slug as the part of a URL which identifies a page using human-readable keywords. It is usually the end part of the URL, which can be interpreted as the name of the resource, similar to the basename in a filename or the title of a page. The name is based on the use of the word slug in the news media to indicate a short name given to an article for internal use.

Slugs are generally entirely lowercase, with accented characters replaced by letters from the English alphabet and whitespace characters replaced by a dash or an underscore, in order to avoid being encoded. Punctuation marks are generally removed. For example:

Original title: This, That & the Other! Various Outré Considerations

Generated slug: this-that-the-other-various-outre-considerations
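
The transformation Wikipedia describes can be sketched in a few lines of Python. This is only one plausible way to do it, not the algorithm any particular system uses; the `slugify` name and its `separator` parameter are my own.

```python
import re
import unicodedata


def slugify(title: str, separator: str = "-") -> str:
    """Turn a human-readable title into a lowercase, ASCII-only slug (illustrative sketch)."""
    # Decompose accented characters (é becomes e + combining accent),
    # then drop everything that is not plain ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Lowercase, then collapse each run of punctuation or whitespace
    # into a single separator.
    slug = re.sub(r"[^a-z0-9]+", separator, ascii_text.lower())
    # Trim any separator left at either end.
    return slug.strip(separator)


print(slugify("This, That & the Other! Various Outré Considerations"))
# → this-that-the-other-various-outre-considerations
```

Passing `separator="_"` gives the underscore variant mentioned in the quote.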

Why is the Slug only a suggestion?

RFC 5023 states:

Servers MAY use the value of the Slug header when creating the Member URI of the newly created Resource

Why is that only a MAY requirement? Why not SHOULD or MUST? There are two chief reasons the server cannot be obliged to use the client-suggested value:

Limited character set: URIs can only use a limited set of characters. If the Slug contains characters outside the legal URI character set, the server must escape or replace them.
Concurrency control: two or more clients might attempt to create a resource with the same Slug at the same time. The server must give each resource a unique URI, so it may choose to decorate the slug with additional characters to ensure each resource is uniquely named.
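
To make those two constraints concrete, here is a minimal sketch of how a server might treat the Slug as a hint. Everything in it is hypothetical: the in-memory `Collection` class, the `allocate_uri` method, the numeric suffix scheme, and the `entry` fallback name are assumptions for illustration, and a real server would enforce uniqueness in its data store rather than in a Python set.

```python
import re
import threading
import unicodedata
from urllib.parse import unquote


class Collection:
    """Hypothetical in-memory collection that owns its URI namespace."""

    def __init__(self, base_uri: str):
        self.base_uri = base_uri.rstrip("/")
        self._taken = set()
        self._lock = threading.Lock()

    @staticmethod
    def _sanitise(slug: str) -> str:
        # Decode the percent-encoded header value, strip accents, and keep
        # only characters that are safe in a URI path segment.
        text = unicodedata.normalize("NFKD", unquote(slug))
        text = text.encode("ascii", "ignore").decode("ascii").lower()
        return re.sub(r"[^a-z0-9]+", "-", text).strip("-")

    def allocate_uri(self, slug=None) -> str:
        # Fall back to a generic name when no usable Slug was sent.
        base = self._sanitise(slug or "") or "entry"
        # The lock makes check-and-reserve atomic, so two concurrent
        # requests with the same Slug cannot receive the same URI.
        with self._lock:
            candidate, n = base, 2
            while candidate in self._taken:
                candidate = f"{base}-{n}"
                n += 1
            self._taken.add(candidate)
        return f"{self.base_uri}/{candidate}"


entries = Collection("http://example.org/entries/")
print(entries.allocate_uri("Kate and William"))
print(entries.allocate_uri("Kate and William"))
```

The second call gets a decorated name because the first has already claimed the plain one, which is exactly why the client’s value can only ever be a suggestion.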

The bottom line is that the server is in charge of managing its URI namespace, and it ultimately has the final say over the URI of any resource that it stores. Clients can influence the URIs of resources, but they can never have full control of how URIs are generated by a server.