Le weblog entièrement nu

Roland, entirely naked... from time to time.

Looking for the ultimate distributed filesystem

(This is not quite a “Dear lazyweb” post. If anything, it's “Dear lazyweb, I've done my homework, now what?”)

I'm looking for the ultimate distributed filesystem. Something that's simple to use, redundant, fault-tolerant, and still smart enough to avoid the more obvious performance chokepoints. Ideally, it should work for all of the following sets of files:

The requirements on the ultimate distributed filesystem (which I'll call UDFS for short, otherwise you'll get bored and go look at pictures of kittens) are as follows:

Quite a few constraints, eh? I have a feeling they are not mutually incompatible though, so I had a look at several candidates. I started with Wikipedia, and I followed the links to Tahoe-LAFS, XtreemFS and Ceph. The following is my evaluation of these candidates based on much reading of docs and websites and wikis, some questioning on IRC, and very little testing.

The overall picture seems full of good things scattered across different solutions, but unfortunately none of the existing ones seems to address the whole problem; at least, not my whole problem. It would be good if each focused on one layer and did that layer well, but that seems not to be the case either, so they can't be combined to get the best of all worlds. It may be that I'm missing something, or that I failed to read some docs properly, or that I misunderstood the docs, or that the docs themselves are simply lacking; but my ideal UDFS currently doesn't seem to exist as a turn-key solution.

However, the main pieces are available, and implementing the remaining parts may be doable. My humble idea of a way forward would be based on Tahoe-LAFS, with the following three changes:

Also nice to have would be a way to work with multiple introducer nodes, but that seems to be in the works already. This would be pretty damn close to my UDFS; read/write performance would certainly be far from what can be obtained on native filesystems stored on local disks, but my use cases involve reasonably small files for which instant access is not compulsory, and the filesystem cache would probably absorb most of the access times.

In case anyone is looking for ideas of things to do in their spare time, here are rough sketches of other possible UDFS implementations I thought of. These are wild ideas, and I'm not even sure they could be doable in practice:

Such is the state of my research so far. I would welcome feedback, pointers to things I neglected to read, corrections for things I misread or misunderstood, comments on the ideas and so on. I'll probably post an update if my search goes significantly forward.

Update: I've already received two pieces of feedback, including a lengthy one with corrections about Tahoe-LAFS. For the sake of fairness, I'll solicit (and wait for) the same from the other candidates I looked at. I was also pointed at HekaFS, GlusterFS and git-annex, which I'll have to look at in more detail. Other suggestions are still welcome, but the more I get, the more the full update will be delayed. Thanks already!

Update 2: See take 2 for the full update.

Unless otherwise noted, the content of this site is made available under a Creative Commons license.