Chris and I had a short discussion last week about what the default content type for Confluence attachments should be. On occasion, we have data uploaded by a user with no content type information provided. Chris had already implemented the fix to serve this data as application/octet-stream. I was worried that the default should be text/plain, based on something I could remember reading but couldn't put my finger on. I decided to have a look further into the situation tonight.
My first stop was the MIME Wikipedia article, because I knew the HTTP Content-Type header was derived from the same header used in email messages. It led me to the MIME RFC, which states the default content type for MIME messages:
Default RFC 822 messages without a MIME Content-Type header are taken by this protocol to be plain text in the US-ASCII character set, which can be explicitly specified as:
Content-type: text/plain; charset=us-ascii
This default is assumed if no Content-Type header field is specified.
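As a quick illustration of that MIME default, Python's standard email package appears to follow the same rule: a message parsed without a Content-Type header reports text/plain as its type. This is only a sketch; the message text and variable names are made up for the example.

```python
from email import message_from_string

# A bare RFC 822 style message with no Content-Type header (made-up sample).
raw = "From: someone@example.com\nSubject: hello\n\nJust some text.\n"
msg = message_from_string(raw)

# The header really is absent from the parsed message...
has_header = msg["Content-Type"] is not None

# ...but the parser falls back to the MIME default type.
effective_type = msg.get_content_type()

print(has_header, effective_type)
```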
Of course, that doesn't mean the default for HTTP would be the same. HTTP is usually used for transmission of HTML data (it is the Hypertext Transfer Protocol, after all), so a different default might make sense. The HTTP spec does in fact specify quite a different default. Strangely, it appears almost as a footnote, completely separate from the discussion of the Content-Type header itself:
Any HTTP/1.1 message containing an entity-body SHOULD include a Content-Type header field defining the media type of that body. If and only if the media type is not given by a Content-Type field, the recipient MAY attempt to guess the media type via inspection of its content and/or the name extension(s) of the URI used to identify the resource. If the media type remains unknown, the recipient SHOULD treat it as type "application/octet-stream".
Since we should provide a content type, and the media type is just as unknown to us as it would be to the recipient, it turns out that serving the data as application/octet-stream is exactly the right thing for Confluence to do. We definitely don't want to delve into the perils of Content-Type sniffing.
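To make the rule concrete, here is a minimal sketch of that fallback in Python. It is not Confluence's actual code (which I haven't reproduced here); the function name content_type_for and the constant DEFAULT_CONTENT_TYPE are hypothetical, and I'm assuming the stored type may be missing or blank.

```python
from typing import Optional

# Fallback suggested by the HTTP/1.1 spec for bodies of unknown media type.
DEFAULT_CONTENT_TYPE = "application/octet-stream"


def content_type_for(stored_type: Optional[str]) -> str:
    """Return the Content-Type header value to send for an attachment.

    Hypothetical helper: uses the type recorded at upload time when there
    is one, and otherwise falls back to the default. It deliberately does
    not inspect the file contents or name.
    """
    if stored_type is None or not stored_type.strip():
        return DEFAULT_CONTENT_TYPE
    return stored_type.strip()


print(content_type_for(None))         # no type recorded at upload
print(content_type_for("image/png"))  # type recorded at upload
```

Keeping the fallback in one place means every download path gives the same answer, and nothing here tries to guess the type from the data.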