The only feature of Apache that I miss when using Lighttpd is content negotiation.

In a nutshell, content negotiation takes an abstract resource URL like http://example.org/2005/chart and maps it to one of the files on the filesystem, based on the available files, their mime-types, and the mime-types in the requester's Accept: header.

Given that URL, an Accept: header of image/svg+xml; q=1, image/*; q=0.5 and the files /www/example.org/2005/chart.png and /www/example.org/2005/chart.svg, the server would see that there is an image/svg+xml file, which matches the highest preference, and return that along with a Vary: Accept header.
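A minimal sketch of that selection step, in Python for brevity. The function names (parse_accept, specificity, quality, negotiate) are made up for illustration and are not anything Apache or Lighttpd provides. It handles only q-values and wildcard ranges, not the rest of the negotiation rules in the HTTP spec.

```python
def parse_accept(header):
    """Parse an Accept header value into a list of (media_range, q) pairs."""
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # a range without an explicit q parameter defaults to 1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        ranges.append((media_range, q))
    return ranges


def specificity(media_range, mime_type):
    """3 for an exact match, 2 for type/*, 1 for */*, 0 for no match."""
    if media_range == mime_type:
        return 3
    r_type, _, r_sub = media_range.partition("/")
    m_type = mime_type.partition("/")[0]
    if r_sub == "*" and r_type == m_type:
        return 2
    if media_range == "*/*":
        return 1
    return 0


def quality(mime_type, ranges):
    """q-value of the most specific range matching mime_type, else 0.0."""
    best = (0, 0.0)
    for media_range, q in ranges:
        s = specificity(media_range, mime_type)
        if s > best[0]:
            best = (s, q)
    return best[1]


def negotiate(available, accept_header):
    """available maps filename -> mime-type. Returns the best filename or None."""
    ranges = parse_accept(accept_header)
    best_name, best_q = None, 0.0
    for name in sorted(available):  # m candidates, each checked against n ranges
        q = quality(available[name], ranges)
        if q > best_q:
            best_name, best_q = name, q
    return best_name


files = {"chart.png": "image/png", "chart.svg": "image/svg+xml"}
print(negotiate(files, "image/svg+xml; q=1, image/*; q=0.5"))  # chart.svg
```

With the example header, chart.svg matches image/svg+xml exactly at q=1, while chart.png only matches image/* at q=0.5, so the SVG wins.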

The efficiency problems come from needing to know the available files and their mime-types. At the most efficient, an expensive scan for available files will happen for one hit, and be cached for subsequent hits. However, cache consistency is a difficult problem, and many of the solutions are as inefficient as no caching at all. Very recent Linux kernels support the inotify mechanism, which could monitor directories efficiently and keep the cache consistent, but it's not a generally portable solution.
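To make the trade-off concrete, here is a rough sketch of one portable consistency check: cache each directory listing and revalidate it against the directory's modification time. This is an assumption of mine for illustration, not a proposal for how Lighttpd should do it, and cached_listing is a hypothetical name. Note that it still costs one stat() per hit, which is exactly the kind of overhead that eats into the benefit of caching.

```python
import os

# directory path -> (mtime in nanoseconds, sorted list of entry names)
_listing_cache = {}


def cached_listing(directory):
    """Return the directory's entry names, rescanning only if its mtime changed."""
    mtime = os.stat(directory).st_mtime_ns  # one stat() on every hit
    cached = _listing_cache.get(directory)
    if cached is not None and cached[0] == mtime:
        return cached[1]
    # Cache miss or stale entry: do the expensive scan once and remember it.
    names = sorted(os.listdir(directory))
    _listing_cache[directory] = (mtime, names)
    return names
```

A directory's mtime changes when entries are added, removed, or renamed, but two changes within the same timestamp tick can go unnoticed, so this is weaker than a notification mechanism like inotify.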

The simplest implementation would take the URL and check whether it's immediately satisfiable, which costs the same as normal serving without content negotiation. If it's not found, then it must perform a directory listing (one open call, some read calls). This gets expensive for huge directories (over 1000 files or so, though the expense depends on the type of filesystem). Candidates are selected, their mime-types mapped, and the best one chosen according to the criteria in the HTTP spec. Unless there are extremely many alternatives or an absurdly large Accept: header, this isn't computationally intensive: on the order of O(m * n), for m candidate files and n entries in the Accept: header.
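The lookup half of that could be sketched roughly as follows. find_candidates is a hypothetical helper, and it assumes mime-types come from file extensions via Python's standard mimetypes table rather than from any server configuration.

```python
import mimetypes
import os


def find_candidates(path):
    """Return {file path: mime-type} for files that could satisfy `path`."""
    if os.path.isfile(path):
        # Fast path: the URL maps straight onto a file, as in normal serving.
        mime, _ = mimetypes.guess_type(path)
        return {path: mime or "application/octet-stream"}

    # Slow path: list the directory and keep entries named "<base>.<ext>".
    directory, base = os.path.split(path)
    prefix = base + "."
    candidates = {}
    try:
        with os.scandir(directory or ".") as entries:
            for entry in entries:  # linear in the size of the directory
                if not entry.name.startswith(prefix):
                    continue
                mime, _ = mimetypes.guess_type(entry.name)
                if mime is not None:  # skip extensions with no known type
                    candidates[os.path.join(directory, entry.name)] = mime
    except OSError:
        return {}  # missing or unreadable directory: nothing to negotiate
    return candidates
```

The resulting mapping is what the selection step would then rank against the Accept: header.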

However, to send Content-Length: headers, at least one stat() call must be made, and handling dangling symbolic links takes a stat() for every file under consideration (though since dangling links are an edge case, this could be implemented as a fallback rather than as normal operation).
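One way the fallback idea might look, as a sketch: stat() only the winning candidate, and move on to the next-best one only if that fails. first_servable is a hypothetical name, and ranked_paths is assumed to be the candidate list already ordered best-first by negotiation.

```python
import os
import stat


def first_servable(ranked_paths):
    """Return (path, size) for the first candidate that is a readable regular file."""
    for path in ranked_paths:
        try:
            info = os.stat(path)  # follows symlinks, so a dangling link raises
        except OSError:
            continue  # edge case: fall back to the next-best candidate
        if not stat.S_ISREG(info.st_mode):
            continue  # directories and other non-files cannot be served as-is
        return path, info.st_size  # st_size supplies the Content-Length value
    return None
```

In the common case this is a single stat() call, the same one needed for Content-Length anyway.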

The biggest issues are the ones dealing with unusually large directories, where a linear scan of the listing can take a long time, and, if caching is performed, how to keep the cache consistent and still gain from it.

Thoughts are always welcome. I'll probably implement this in Lighttpd at some point.



( 8 comments — Leave a comment )
Nov. 4th, 2005 02:01 am (UTC)

I think this is almost completely a filesystem issue; there’s no lasting solution to it without stuff like Reiser dreams. A really scalable setup will have directories indexed by filename[1] and mime type – feel like running a Postgres table to cache your webdirs’ metadata?

&laugh; &laugh; &retch; &laugh; &retch; &retch;.

What’s the best way to decide a file’s mime type? I don’t want to trust its extension. I don’t want to read magic bytes. Probably I’ll only really trust an attribute as per a real filesystem (which will have its own consistency problems, like any metadata[2]).

I’ll be interested to see what you come up with.

  1. By filename, not hash of filename! With the usual filenames-to-resource mapping, you want to find all the file.* directory entries contiguously – that’s the point. Internal locale won’t matter as long as the alphabetization is left-to-right (i18nly, head-side-to-suffix-side) and a total ordering.

  2. If C-only coders do the first popular attributed filesystems, they’re going to say it’s safer and faster to make the file mime type attribute immutable. touch, fopen(), etc., will require a mime type argument to make a file. You’ll have to pass the number of directory entries to ls by hand or it’ll run off the end.

Nov. 4th, 2005 02:11 am (UTC)

Then after a few years they’ll start getting excited about “polymorphic” fileutils which can, amazingly, apply the same semantics to differently typed files. For instance, a polymorphic version of cat could dump several text/plain files or, equally well, several text/tab-separated-values files! With the same binary! Wow! What a head trip!

Sorry. It’s so easy.

Nov. 4th, 2005 10:06 pm (UTC)
And this is why we don't have file-type attributes!
Nov. 4th, 2005 10:05 pm (UTC)
I'm quite content to have MIME types be assigned based on the file name, and trust the author not to botch that up. There's enough else in web design of similar fragility that that doesn't make it any worse. I'd just like to hide all that from the viewers. That much works out pretty well.

Attributed filesystems actually incur a much higher penalty, because that's one syscall per filename like the stat() for dead symlinks above, only not optimizable as a fallback.

If I got to reinvent the OS, I'd definitely do what you suggest regarding types to the open() call being required and immutable. And I'd index it, of course.
Nov. 4th, 2005 11:03 pm (UTC)

If I understand the current practice and realistically envision an attributed filesystem, the extra penalty would be a small constant – the mime type goes in the equivalent of the dirent struct and it’s trivial.

(Unless you were seeing my semi-sarcasm and raising it …) I wouldn’t actually make mime types immutable (for reasons similar to why we want content negotiation over HTTP). And who wants to write portable packages for that? It would be nice to pass opening functions a mime filter, though, if you could supply lists (text/plain ∪ text/rtf), just major types (image/*), and negatives (¬application/riscos).

(As long as I’m supposing, it would be interesting if some layer had a bit marking whether a file strictly validated as what it said.)

Nov. 5th, 2005 06:24 am (UTC)
It would go in the inode, not in the dirent, making it another lookup, not a single operation. Unless you have a hardlinkless filesystem.

Really, mime types aren't expressive enough. Getting there, but we need an inheritance tree to really filter well, not domain/type, where type or type and domain can be wildcard.
Nov. 4th, 2005 02:21 am (UTC)

Also, ignoring your declared nutshell, to nitpick for the benefit of readers new to the subject, there’s nothing specific to filesystems about content negotiation. You can negotiate about objects out of a database, generated (X)HTML, or anything else that might come in more than one roughly equivalent type.

Nov. 4th, 2005 10:06 pm (UTC)
True. My problem, however, is specific to filesystems, since there's no index on type.