Le weblog entièrement nu

Roland, entirely naked... from time to time.

Geek

A challenge for whoever feels they have too much free time

Open question to enthusiasts, theoretical computer scientists and mathematicians of all sorts: is it possible to construct a valid QR-code that leads to interesting results when used as an initial configuration for the Game of Life?

The rules:

  • QR-code or DataMatrix or whatever, I'm not picky;
  • valid code as in maybe not exactly what would be produced from the text, but which the error-correction can reconstruct;
  • interesting results, as in not decaying to a stable state (or not too soon, at least);
  • for bonus points, it would lead to at least one glider;
  • for extra bonus points, a glider gun or a breeder of such;
  • for awesomeness, the QR-code should encode a URL pointing at the demonstration of its own evolution in Life.

For Science!

Update: The answer seems to be yes. Jurij Smakov assembled a QR-code generator and a Life engine and plugged them together for easy experimenting. And Stefano Zacchiroli noticed that using "free software" (no quotes) as the input leads to a couple of gliders endlessly traveling a field with a few still lifes. This is way beyond awesome.
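For anyone who wants to experiment at home, here is a minimal sketch of a Life engine in Python. It is not Jurij's code: it assumes the QR-code has already been turned into a set of (row, column) coordinates for its dark modules, it assumes an unbounded field, and the names life_step, run and GLIDER are made up for the illustration.

```python
from collections import Counter

def life_step(live):
    """Return the next generation of a set of live (row, col) cells."""
    # For every cell adjacent to a live cell, count its live neighbours.
    neighbours = Counter(
        (r + dr, c + dc)
        for (r, c) in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Standard rules: birth on exactly 3 neighbours, survival on 2 or 3.
    return {cell for cell, n in neighbours.items()
            if n == 3 or (n == 2 and cell in live)}

def run(live, generations):
    """Apply life_step repeatedly and return the final set of live cells."""
    for _ in range(generations):
        live = life_step(live)
    return live

# A glider, the pattern the bonus points are about.
GLIDER = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}
```

Feeding the dark modules of a decoded code to run() and checking whether the result keeps changing is enough for a first, crude test of the “not decaying to a stable state” rule.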

Tags: geek
Posted Tue 19 Feb 2013 20:34:44 CET
Various small bits

Dear reader, I know you're wondering what I'm getting up to these days. Or at least, I guess there's a possibility that you're wondering. So since there's no single bit of news that would be worthy of a post here by itself, here's one summarizing things, in the hope that the accumulation makes a tiny bit of a difference.

On the rock'n'roll front: Eleven did its first true gig on our own in a pub last Saturday. And when I say “on our own”, it could almost be taken literally, for we must have had a grand total of about 10 people. A bit of a disappointment, to say the least, especially since the pub was about 120 kilometers from home, 60 of which I rode on my motorbike at 2 AM (and close to 0 °C). However, the landlord seemed to like us and hinted at further gigs during seasons where people are more likely to go out for drinks and music than to stay warm at home.

On a related note: we got ourselves a small set of stage lights (LED-powered, of course), and they have, in addition to the power cord and various switches, two sockets at the back for plugging in XLR-3 cables. On investigation, it seems that this means they can be controlled by a protocol known as DMX 512, which opens up a lot of possibilities for someone who likes to control various things from a computer. I read a few web pages to get an idea of how this is supposed to work; it seems rather simple and straightforward, but the required software isn't in Debian yet. So I guess that if/when I get the necessary hardware, I'll have a new hobby and new toys to play with. Maybe our next gig will have bursts of lights on the big accented beats, triggered by “strong enough” hits on my drum cymbals.

This allows me to link to… a new minor release of Wiimidi. The only addition is a new configuration section mapping to the default drumkit provided by Hydrogen.

And finally, to stay in the “small bits of software”, I was prompted to give my simple GPG-encrypted password store, a.k.a. SGEPS, its own page, with a proper release and so on. So there, 0.1 is out, with minimal Debian packaging included.

Tags: geek
Posted Wed 30 Jan 2013 18:56:33 CET
Wii(gh2)midi yet again

Not that I'm bored or anything, but I spent some more time on this since last month, and apparently some people are interested enough, so here's the recent news about wiigh2midi.

  • First, I think it's become generic enough that the “GH” part is no longer warranted. I've looked for a suitable name, and the best I found was the simplest. From now on, this piece of code will be known simply as Wiimidi. Various searches on the web led me to believe that it's a traditional name that's been used a couple of times already by projects that stopped evolving years ago. So mine will either continue in that vein and maintain the tradition, or stay alive and keep the name. Time will tell.
  • I ended up doing the Debian packaging. It's fairly minimal, and I don't currently plan to actually upload it (unless asked to), but it's there.
  • I also added support for sending MIDI “program change” events in addition to notes. Why, you wonder? See next item.
  • A new set of settings called program_changes in the sample configuration file, and a new rhythmbox-midi script. The goal is to control a Rhythmbox music player with your Wiimote (or possibly the attached drumkit).
  • I also slapped a GPL-2+ license on it, which is consistent with the python-cwiid license.

Also, since the code moved (and it doesn't harm mentioning it again): Wiimidi is developed with Bazaar, and the public branch is at https://alioth.debian.org/~lolando/bzr/wiimidi/trunk/. So, to grab a copy:

 bzr checkout https://alioth.debian.org/~lolando/bzr/wiimidi/trunk/

Patches welcome, of course! Also, I'll be interested to hear about your applications. I've been told about a Wiimote-powered foot controller, which is exactly the kind of unpredictable results I was hoping to achieve by publishing my code. Keep it up, I want to hear about Wiimote-controlled robot dinosaurs next!

Update: Wiimidi now has its dedicated page at Wiimidi.

Tags: geek
Posted Thu 29 Nov 2012 21:01:11 CET
Guitar Hero drumkits and MIDI, again

Almost a year after my previous post, I felt inclined to spend another Sunday (this one was chilly rather than rainy) working on my script to integrate the drumkit of Guitar Hero World Tour for Wii within a MIDI environment. And wiigh2midi got, if not a rewrite, then at least a few enhancements since I last mentioned it here.

  • The buttons are now handled. The “cross-stick”, the A and B buttons, + and -, Home, 1 and 2. And if the Wiimote is plugged into a drumkit, you get even more buttons: another set of + and -, plus another mini-joystick. Thunder! Martians! Bubbles! Yay for sound effects!
  • A side-effect: the drumkit is no longer even necessary. Run several instances with different configurations, give a wiimote each to a bunch of kids, and get them to perform a batucada by pressing the big round button in time. I tried and failed, but it was due to the age and over-excitedness of the bunch of kids in question rather than the technical side of things.
  • Yeah, configurations. It's all in a nice .ini file now, and you can define several “kits” (input-to-MIDI-note mappings) in different sections, and select which kit to use when running.
  • The configuration can also define which MIDI device to send events to, and which channel to use. My use case is that sometimes I want to use my GHWT drumkit as a set of extra pads and send the MIDI stuff to my “real” electronic drumkit (over a USB adaptor), and sometimes I just want to send the MIDI events to a software synthesiser such as Hydrogen, to have a second, cheaper, lighter drumkit.
  • Speaking of Hydrogen: since the MIDI-to-instrument mappings are somewhat strange in there, I provided a template Hydrogen file with a reasonable drumkit that matches the example configuration of wiigh2midi.
  • Hey, you know, three toms, two cymbals and one pedal are really a small kit, even for beginners. And what about open/closed hi-hats? Well, we could, you know, split some of the pads so they can produce different “sounds” depending on context. Like, maybe if you hit the left pad real soft, you get a cross-sticks sound, but you get the standard snare drum sound if you hit it a bit harder (or very hard). And the open hi-hat could be obtained by hitting the yellow cymbal harder, with normal hits still sounding like closed hi-hats. And since the crash cymbal is usually hit markedly, while the ride is usually more of a soft-to-medium hit, you guess what we could do with the orange cymbal. Well, scratch the “could” there, that's exactly what we do now. So my cheap drumkit has hi-hats that open, a cross-stick sound on the snare drum, ride and crash cymbals, a bass drum and two toms. Not so bad.
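The velocity-dependent splitting described in the last item can be sketched as follows. This is not the actual wiigh2midi code: the pad names, the thresholds and the function pick_note are hypothetical, and velocities are assumed to be in the usual MIDI range of 0 to 127. The note numbers come from the General MIDI percussion map (36 bass drum, 37 side stick, 38 snare, 42 closed hi-hat, 46 open hi-hat, 49 crash, 51 ride).

```python
# Hypothetical kit: each pad maps to a list of (minimum velocity, MIDI note)
# zones, sorted by increasing threshold. Thresholds are arbitrary examples.
KIT = {
    "left_pad":      [(0, 37), (40, 38)],  # soft: cross-stick, harder: snare
    "yellow_cymbal": [(0, 42), (90, 46)],  # normal: closed hi-hat, hard: open
    "orange_cymbal": [(0, 51), (90, 49)],  # soft/medium: ride, hard: crash
    "pedal":         [(0, 36)],            # bass drum, no split
}

def pick_note(pad, velocity, kit=KIT):
    """Return the MIDI note for a hit: the last zone whose threshold is reached."""
    note = None
    for threshold, candidate in kit[pad]:
        if velocity >= threshold:
            note = candidate
    return note
```

With a table like this, the same physical pad yields different notes depending on how hard it is hit, and several such tables could sit side by side as different “kits”.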

It's still not something that I'd put in everyone's hands, but it's coming to be seriously usable. I wonder if there'd be any interest in me packaging that and uploading it to Debian? It would need some cleanup first (and a more generic name, since it's far from restricted to Guitar Hero controllers or drumkits), but I guess it could be useful. Ping me if you're interested.

Update: Wiigh2midi has been renamed to Wiimidi, and has its dedicated page at Wiimidi.

Tags: geek
Posted Sun 28 Oct 2012 21:00:03 CET
Integrating FWbuilder with fail2ban and port-knocking

This article documents how I'm currently building my firewalls. It builds on netfilter-based-port-knocking, and tries to integrate several components of a firewall as gracefully as possible.

For some context: I'm getting involved with servers where the firewall policy goes beyond a handful of “SSH from my home and my other servers” rules. There are many different network streams that need to be policed, some of them are common across several (potentially many) servers, and so on. So I'm gradually giving up on my hand-made scripts, and trying out higher-level tools. I settled on FWbuilder, which seems nice. However, it only allows static policies, and I still want to keep dynamic policies such as what fail2ban provides, as well as my own port-knocking system.

The problem I had was that fail2ban isn't really made to play nice as part of a complex firewalling setup, my port-knocking system was too tightly integrated within my firewall script, and FWbuilder wasn't too flexible when it came to delegating part of the firewall policy to something external. Fortunately, this was only a perceived problem (or a real problem in my understanding), because it is actually possible to have all three blocks playing nicely together.

More context: as usual, I'm focusing on Debian-like systems. More precisely, on those with a Linux kernel; it may be that FreeBSD's firewalling subsystem has a feature comparable to Linux's “recent” iptables module, but I don't know.

Let's start with FWbuilder. This is not the place for a manual; the official documentation is rather complete. I'll assume you have defined most relevant blocks in there: firewall, hosts, IP addresses, services, and so on. You define your “static” policy with the standard rules. From then on, we want to integrate the external tools for dynamic rules.

Step 1: Integrating fail2ban

We want fail2ban to have its own playground, so that it doesn't overwrite anything in the standard policy. The trick is to define a new “policy rule set” named fail2ban. Leave it empty in FWbuilder.

So far so good, but fail2ban (the daemon) still operates on the INPUT chain in the firewall, and could therefore still mangle the static rules. Fortunately, starting with fail2ban 0.8.5 (available from Debian Wheezy, or in the backports for Squeeze), you can define what chain to operate on: with a configuration item such as chain = fail2ban, fail2ban (the daemon) will now only add its rules to fail2ban (the firewall chain), and won't be able to damage the other chains.

The missing part is to send some of the traffic to it using the standard policy: I defined a rule sending the incoming SSH connections to the fail2ban policy (“branching” in FWbuilder jargon).

Voilà: the static policy delegates part of the decision-making to a sub-policy controlled by the fail2ban daemon.

Step 2: Integrating port-knocking

This is a bit trickier, but we'll use a similar method.

First, the traffic used for port-knocking needs to be directed to the chain that does the listening. Define a “policy rule set” named portknocking, and leave it empty in FWbuilder. It'll be used by the dynamic rules to track progression of source IP addresses through the port-knocking sequence, so you'll need to send (“branch”) incoming traffic there, probably after the rules allowing incoming connections from known hosts.

The dynamic part of this will only concern the refreshing of this “listening chain”, which we assume will do its work and mark IP addresses with PK_ESTABLISHED once the sequence is completed. What we do with these marked IP addresses will still be defined within the FWbuilder policy.

We're going to need some complex rules since we want to filter according to this PK_ESTABLISHED bit and according to destination port, for instance; unfortunately FWbuilder doesn't allow combining filter criteria with and, so we define a new policy rule set called accept_if_pk_ok. This ruleset has two rules: the second is an ACCEPT and should be easy to understand. The first rule needs to ensure the ACCEPT is only reached for connections coming from PK_ESTABLISHED addresses, so it's going to be a bit tricky.

  • The “service” needs to be a custom service (I called it “PK not established”), since FWbuilder doesn't know about the marking feature in iptables. Use -m recent ! --rcheck --seconds 86400 --name PK_ESTABLISHED for the definition (change the duration to the number of seconds the door should stay open after the port-knocking sequence has been completed). Note the exclamation mark.
  • The “action” is also going to be custom, defined as -j RETURN. Again, this feature is iptables-specific and FWbuilder doesn't provide any UI for it.

(Explanation: the first rule matches packets coming from IP addresses not marked as PK_ESTABLISHED, and returns them to the calling policy. Packets remaining after this rule are those coming from the appropriate addresses, and they go on to the ACCEPT. We could have had the first rule match on IP addresses that are marked, and branch to yet another ruleset with the ACCEPT part, but that would make it harder to read, I feel.)
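The decision logic of the accept_if_pk_ok ruleset can be modelled in a few lines of Python. This is only a sketch of the reasoning, not something that talks to netfilter: the dictionary pk_established stands in for the kernel's PK_ESTABLISHED list (source address mapped to the time it was last marked), and the function and parameter names are invented for the illustration.

```python
import time

def accept_if_pk_ok(src_ip, pk_established, now=None, seconds=86400):
    """Model the two-rule ruleset; returns "RETURN" or "ACCEPT"."""
    now = time.time() if now is None else now
    last_marked = pk_established.get(src_ip)
    # Rule 1: custom service "PK not established" with custom action -j RETURN.
    # It matches when the address is unknown or its mark is too old.
    if last_marked is None or now - last_marked > seconds:
        return "RETURN"
    # Rule 2: whatever is left comes from a recently marked address.
    return "ACCEPT"
```

A "RETURN" result means the packet goes back to the calling policy, where later static rules may still let it through; only "ACCEPT" opens the door.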

Now let's get back to the main policy and add rules concerning what kind of traffic we want to allow once the port-knocking sequence has completed. For instance, we define a rule matching on the SSH “service”, where the action is to “branch” to accept_if_pk_ok. When an incoming packet tries to establish a connection to the SSH port, it's passed to the accept_if_pk_ok ruleset. If it comes from the same IP as a recent port-knocking sequence, it goes on to be ACCEPTed. If not, it returns to the main policy. Maybe static rules further on will allow it to go through.

Step 3: tying it all together

Now that we have all the pieces, the rest is plumbing.

  • Get FWbuilder to “compile” a script from the data. I called mine $hostname.fw, and stored it into /usr/local/sbin.
  • Write a /usr/local/sbin/port-knocking script that operates on the portknocking chain and manages the PK_ESTABLISHED bit. It need not do more than what's described in netfilter-based-port-knocking.
  • Write an initialisation script that calls both /usr/local/sbin/$hostname.fw and /usr/local/sbin/port-knocking. I called mine /etc/init.d/firewall.
  • Make sure that fail2ban's initialisation script is called after ours, either with boot sequence numbers or with the LSB dependency pseudo-headers: I made my firewall script declare Provides: iptables; since fail2ban's script declares that it Should-Start: […] iptables, we're fine.
  • Run /usr/local/sbin/port-knocking every hour, or as often as needed to recalculate the port numbers.

With this setup, at boot time, the $hostname.fw script creates the static policy and the extra playgrounds; then the port-knocking script implements the listening for the magic sequence; then fail2ban inserts its own rules. And there we are: three different parts for the firewall policy, all integrating nicely. Mission accomplished!

Note: (Mostly copy-and-pasted from the previous article) This article is deliberately short on details and ready-to-run scripts. Firstly because firewall scripts vary wildly so any script would have to be adapted anyway, but mostly because security is best handled with one's brain switched on. Fiddling with a firewall can easily open gaping holes or lock everyone out. So please make sure you understand what goes on before blindly pasting stuff into your own setup. Some bits are left as an exercise to the reader.

Tags: geek
Posted Fri 31 Aug 2012 18:00:02 CEST
Reumeuleuleu

Right, I didn't give any advance warning, but I arrived last night at the 2012 edition of the Rencontres mondiales du logiciel libre (better known in the French-speaking microcosm as the “reumeuleuleu”). Raphaël Hertzog seems to be chickening out, and Olivier Berger fell into a hole, but the place is full of other friendly geeks to chat and drink (free) beer with. If you want a signed copy of the Cahier de l'admin Debian (or of its English translation), come and look for me in the corridors or at the talks.

Oh, and it probably only amuses me, but since this is happening in Geneva, there is an enormous number of signs, posters, shop fronts and assorted signage typeset in Helvetica, Helvetica Bold, Helvetica Light, and so on. So to recognise me, look for a guy with his nose in the air (although I did also spot some Helvetica on a manhole cover). As the T-shirt says: Sex Drugs Helvetica Bold!

Tags: geek
Posted Mon 09 Jul 2012 08:35:09 CEST
The Debian Handbook is out!

For those who have been living in a cave lately, the reason I was living in a cave has just come to an end: the English translation of the Cahier de l'Admin Debian is finally out, under the title “The Debian Administrator's Handbook”. And not by half measures, since it is available:

  • As a paperback from Lulu;
  • For online reading;
  • For download in PDF, Epub and Mobipocket formats; since the site is a tad loaded at the moment, using the torrents (PDF, Epub, Mobipocket, HTML) is recommended;
  • And it is even available directly in Debian, while we're at it: apt-get install debian-handbook (only in unstable for now, but it will certainly migrate to the other distributions).

The last item is important: the book is free (under the GPL-2+ and CC-BY-SA-3 licenses), so contributions are possible and welcome.

For any further details, see the book's website: debian-handbook.info.

Donations are still possible (and appreciated!), even after the official end of the fundraising campaign.

Raphaël and I will most likely be at the RMLL in Geneva in July, and it would be surprising indeed if we couldn't manage to organise a signing session should the need arise…

Tags: geek
Posted Thu 10 May 2012 14:45:02 CEST
Retirement at 10

A quick little post to announce that after ten years of good and loyal service, mirexpress has just been retired. It is replaced by polymir, my shiny, brand-new, thoroughly modern PC. As in 2002, I picked one considerably more powerful than my current needs require, but I fully intend to make it last ten years too, if only to keep providing a counter-example for all the people who think their computer is necessarily obsolete after two years.

Tags: geek
Posted Wed 18 Apr 2012 14:45:01 CEST
Looking for the ultimate distributed filesystem, take 2

This follows up on a previous post, and tries to summarize the corrections and suggestions I received.

First, a reminder: yes, I'm really looking for a filesystem, not a storage system. Even if I only consider the “music files” use case, there are just too many things that want to access the files, and there's no way I'm going to port these things to use the storage system's API instead of plain old file access. Examples? The Rhythmbox player. The XMMS2 player. The mt-daapd/forked-daapd DAAP server. My script to backup or restore metadata (tags). Ex-Falso, to do mass edits on these tags. abcde, to add new files when I buy a new CD. Any kind of shell one-liners I currently do without thinking because find, xargs, cp, ln, and so on all operate on files. I understand that providing filesystem semantics on top of a storage system is hard, but that doesn't change my requirements, and saying otherwise is patronising.

So, on to the meat of the matter.

Apparently my understanding of Tahoe-LAFS was mostly correct. It seems that multiple introducers are on their way to becoming a reality, which would mean the one remaining central point would go away. Most of my other complaints seem to be on their way to being resolved, too, except that it's still targeted as a storage system, and the FUSE/sshfs layer on top still has its drawbacks (quoting from the wiki, “Before uploading a file to a Tahoe filesystem, the whole file has to be available”, “mutable parts of a filesystem should only be accessed via a single sshfs mount [...] data loss may result for concurrently accessed files if this restriction is not followed” and so on). From what I read, the position is that nobody publicly uses Tahoe's filesystem integration, therefore there's no need to fix its shortcomings; my understanding is that nobody uses it precisely because of its shortcomings.

Ceph: apparently it does repair/rebalance automatically, contrary to what I thought. However, two people told me that it really expects all nodes to be on the same LAN, and geographically distributed setups aren't really targeted.

XtreemFS: I don't think I got any comments on my evaluation, so I'll assume it still stands.

git-annex: storage system, not a filesystem. Yes, the files can be made available in the filesystem, but if I need to manually retrieve them before accessing them, this doesn't work.

GlusterFS: I'd like to read more docs on how it works, but I haven't been able to find them. Apart from the installation manuals, the only documentation I found was a very short “Concepts” page, which described some of the concepts but didn't give me the big picture on how the whole thing works. (I'm turning a blind eye to the Red Hat advertisements and the requirements that everything be installed on RH machines; the hardware requirements are also quite out of line with what I want to do.)

HekaFS (aka CloudFS): I found even fewer docs about this one than about GlusterFS, and apparently HekaFS is on its way to being merged into GlusterFS anyway.

PlasmaFS: I didn't know about this one previously, but the one email I got about it was mainly full of “this won't work” and “this won't work either”. I didn't feel inclined to read further docs after that.

In summary, I guess the replies I got didn't cause me to change my mind too much. Tahoe-LAFS still seems to be, if not the best solution, then at least the “least bad”. Hopefully the drawbacks will be fixed soonish; the main sticking point (at least from my point of view) still seems to be the lack of focus on proper filesystem integration. I'll try setting it all up at some point; I may report further if I end up finding interesting things.

Thanks to all who responded!

Tags: geek
Posted Thu 01 Mar 2012 12:30:02 CET
Looking for the ultimate distributed filesystem

(This is not quite a “Dear lazyweb” post. If anything, it's “Dear lazyweb, I've done my homework, now what?”)

I'm looking for the ultimate distributed filesystem. Something that's simple to use, redundant, fault-tolerant, and still smart enough to avoid the more obvious performance chokepoints. Ideally, it should work for all of the following sets of files:

  • Music files. They're currently stored on a plug computer at home and served over DAAP, but DAAP seems to become less and less of a priority for music players (I'll keep the “music players suck” rant for another post), and it has its inconveniences too. Ideally, the files should be integrated in the filesystem of my work computer, the one that serves as a media player in the living-room, the one I use for playing the backing tracks when I'm practising the drums, and the laptop I use when on the road too.
  • Backups. My BackupPC currently runs on the plug computer at home, so backups of my remote servers are in a geographically distinct location; but backups of my home computers are not, and I'd like to change that without having to copy files around by hand. Backups do contain confidential data, so this adds a constraint on the filesystem.
  • Bazaar repositories. I do have a script that pushes and pulls stuff across computers for the many repositories I use, but it's still awkward that I need to do that.
  • Parts of my home directory, such as my browser preferences and bookmarks, or the database of the Hamster time tracking tool, or the working directories of stuff I do for clients, and so on. Again, each of those can be done with specific means (bookmarks synced to a server, CouchDB-like database, DVCS and regular commits, etc.), but wouldn't it be much simpler if there only was one file to begin with?

The requirements on the ultimate distributed filesystem (which I'll call UDFS for short, otherwise you'll get bored and go look at pictures of kittens) are as follows:

  • Availability means redundancy: some of the storage nodes will be on dedicated servers in datacenters, others at home; I can imagine setting up the firewall so that the home computers are reachable from outside, but sometimes network links go down, and the home computers are far from being on 100% of the time.
  • Availability/redundancy also means automated replication and rebalancing: if a node is added to the “grid” (or switched on), it should automatically get its share of the files so as to contribute to availability if another node goes down at a later time.
  • Confidentiality: at the very least, network communications must be encrypted and authenticated; ideally, individual storage nodes wouldn't need to be able to access the stored data. If I store bits of my backups on a friend's server, I don't want to have to trust them not to look at the data; also, my friend may actually want to be unable to look at my data (to provide for deniability in case someone else wants to look at it).
  • Performance: native disk performance may not be realistically reachable, but the system must be smart enough (or configurable enough) to store files on both sides of the ADSL link, for instance, so not all accesses need to go through the bottleneck.
  • Integration with the system: I want a filesystem, not a storage system. All applications know how to navigate a mounted filesystem; very few will interface with a specific application designed to store and fetch chunks of data.
  • Scalability would be nice, although my personal needs are still well below the terabyte range and I can't see myself using more than a dozen nodes or so.

Quite a few constraints, eh? I have a feeling they are not mutually incompatible though, so I had a look at several candidates. I started with Wikipedia, and I followed the links to Tahoe-LAFS, XtreemFS and Ceph. The following is my evaluation of these candidates based on much reading of docs and websites and wikis, some questioning on IRC, and very little testing.

  • Availability/redundancy 1: all three candidates work on the net, and they all provide for data replication. XtreemFS seems to operate on a “fall-back” mode while the other two are more distributed (meaning there's no canonical node hosting any particular file). Tahoe uses erasure codes (files can be split across N shares, k of which are enough to reconstruct the whole file; the N/k ratio controls the amount of redundancy); it seems to require the “introducer” node to be always up, which introduces a single point of failure, but this node can be replaced with no loss of data if it fails (this just requires reconfiguring the storage nodes, which can probably be automated). Ceph can work with any number of meta-data servers, so redundancy is assured there (data itself can be replicated in a configurable way too).
  • Availability/redundancy 2: either I'm a fool and I didn't find out about that, or none of the three candidates provides for automated rebalancing. Apparently Tahoe provides a way to “repair” files without enough redundancy, but that needs to be run manually on each file, rather than being systematic, and it's a kind of add-on to the system rather than being properly integrated. Ceph only does rebalancing of meta-data. XtreemFS seems to work a bit like RAID1 with spare disks, but details are scarce.
  • Confidentiality: Tahoe wins, hands down. Neither network sniffers nor storage nodes can see the contents of the stored files. They can't even find out about their name or the directory structure. XtreemFS encrypts the data on the network (but it is still stored in cleartext on the nodes). Ceph doesn't even try.
  • Performance: Ceph wins (or loses less, at any rate): it seems possible to configure a topology for the storage nodes, and to drive the storage location policy according to this topology; so I could define sets of nodes (“my servers in datacenters”, “my plug computer at home”, “my desktop and laptop computers”, “my friends' computers”) and decide that each file should be stored at least once on each set of nodes. Tahoe stores shares of files on a number of storage nodes chosen in kind of a round-robin way; read access uses the same round-robin system, which means that you will probably end up fetching a file at least partially over your slow link even when you could fetch it entirely from the local network. The XtreemFS website doesn't seem to acknowledge the possibility of the network being slow.
  • Integration: Ceph and XtreemFS win: they're native filesystems. Tahoe is in the “storage and retrieval system” category; the FUSE layer is marked as experimental and not recommended, and the SFTP layer (which can be mounted as a filesystem with sshfs) has many documented drawbacks. Maybe a WebDAV frontend, combined with fusedav, would provide a working alternative, but it's not implemented yet.
  • Scalability: all three candidates boast about scalability (in data size and in number of nodes). Ceph seems to require some configuration every time a new storage node is added. Tahoe seems much more dynamic: new nodes (storage or clients) just need to be told where the introducer node is, and then they merge into the grid seamlessly by being told about other nodes (a bit like Bittorrent, where clients learn about each other by asking the tracker).
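To put numbers on the erasure-coding trade-off mentioned in the first item of this evaluation (N shares, any k of which rebuild the file), here is a small worked example in Python. It rests on simplifying assumptions of mine, not on anything in the Tahoe docs: one share per node, and each node being up independently with the same probability p. The function names and the sample values k = 3, N = 10 are purely illustrative.

```python
from math import comb

def expansion(k, n):
    """Storage used per byte stored: the N/k ratio."""
    return n / k

def availability(k, n, p):
    """Probability that at least k of n shares are reachable,
    each node being up independently with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Example: 10 shares, any 3 of which are enough, on nodes up 90% of the time.
overhead = expansion(3, 10)          # a bit over 3 bytes stored per byte of data
reachable = availability(3, 10, 0.9)
```

Under these assumptions, a file spread over ten nodes that are each up 90% of the time stays retrievable far more often than any single node is up, at the cost of storing a bit over three times the data.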

The overall picture seems full of good things scattered across different solutions, but unfortunately none of the existing ones seems to address the whole problem; at least, not my whole problem. It would be good if each focused on one layer and did that layer well, but that seems not to be the case either, so they can't be combined to get the best of all worlds. It may be that I'm missing something, or that I failed to read some docs properly, or that I misunderstood the docs, or that the docs themselves are simply lacking; but my ideal UDFS currently doesn't seem to exist as a turn-key solution.

However, the main pieces are available, and implementing the remaining parts may be doable. My humble idea of a way forward would be based on Tahoe-LAFS, with the following three changes:

  • A configurable dispatch policy, so that administrators could define their own behaviour. The current Tahoe parts already turn files into a number of blobs that only need to be stored on various nodes, and the function that chooses which nodes seems to be reasonably self-contained, so it could maybe even be made pluggable, and an administrator could implement a storage policy that matches the constraints and the network topology. Ditto for the function that picks which nodes the data is read from (which, as far as I can tell, is the same; splitting could bring some benefits here).
  • A working FUSE implementation, either on top of SFTP if the drawbacks can be fixed, on top of WebDAV if that gets implemented, or native to Tahoe if that can be done.
  • Automatic rebalancing of data when new nodes are added or turned on, or when an intermittent network link comes up. The current script seems to be an afterthought; if it can be made more automated and more reliable, I'll be happy with that.
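The first of these changes, a pluggable dispatch policy, might look something like the sketch below. Nothing in it is Tahoe-LAFS API: the group names, the node names and the function choose_nodes are all invented, and the policy shown (one share per group of nodes first, then fill up at random) is just one example of what an administrator could plug in.

```python
import random

def choose_nodes(groups, shares, rng=random):
    """Pick `shares` distinct nodes, at least one per group when possible.

    `groups` maps a (hypothetical) group name to a collection of node names.
    """
    chosen = []
    # First pass: one node from every group, so each site holds a share.
    for name in sorted(groups):
        candidates = sorted(set(groups[name]) - set(chosen))
        if candidates and len(chosen) < shares:
            chosen.append(rng.choice(candidates))
    # Second pass: fill up with the remaining nodes, in random order.
    remaining = sorted({n for nodes in groups.values() for n in nodes} - set(chosen))
    rng.shuffle(remaining)
    chosen.extend(remaining[: shares - len(chosen)])
    return chosen

# Hypothetical topology, in the spirit of the sets of nodes listed earlier.
GROUPS = {
    "datacenter": ["dc1", "dc2", "dc3"],
    "home": ["plug"],
    "friends": ["alice", "bob"],
}
```

The same kind of function, given knowledge of which nodes sit on the local network, could also decide where to read shares from, which is where splitting the read and write policies would pay off.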

Also nice to have would be a way to work with multiple introducer nodes, but that seems to be in the works already. This would be pretty damn close to my UDFS; read/write performance would certainly be far from what can be obtained on native filesystems stored on local disks, but my use cases involve reasonably small files for which instant access is not compulsory, and the filesystem cache would probably absorb most of the access times.

In case anyone is looking for ideas of things to do in their spare time, here are rough sketches of other possible UDFS implementations I thought of. These are wild ideas, and I'm not even sure they could be doable in practice:

  • eCryptfs on top of Ceph. The file contents are still only accessible to those in possession of the adequate key, and replication/distribution is handled by Ceph. That could probably work; the main drawbacks I can see come from Ceph's administrative overhead and manual configuration.
  • LUKS and a standard filesystem, on top of the part of Ceph that manages block devices (RADOS). I'm not sure this could be mounted simultaneously by several nodes, though.
  • LUKS on top of Ceph's RADOS, and Lustre or GlusterFS on top of that? Am I even making sense?
  • A kind of union filesystem with local caching, on top of Tahoe-LAFS. Say we take unionfs-fuse, and add local caching: if the requested file is already in cache, performance is near native; if not, we still get the advantages of Tahoe. I'm not sure how caching works with WebDAV, but once Tahoe gets a WebDAV frontend it might be a simple matter of adding a Squid cache between that and davfs.
  • I even thought of building a filesystem on top of the Bittorrent protocol; redundancy would be obtained by ensuring that there are at least N “distributed copies” of a file among the peers. I gave up even before reaching the details phase, though, but maybe a variant of the idea could be workable.

Such is the state of my research so far. I would welcome feedback, pointers to things I neglected to read, corrections for things I misread or misunderstood, comments on the ideas and so on. I'll probably post an update if my search goes significantly forward.

Update: I've already received two pieces of feedback, including a lengthy one with corrections about Tahoe-LAFS. For the sake of fairness, I'll solicit (and wait for) the same from the other candidates I looked at. I was also pointed at HekaFS, GlusterFS and git-annex, which I'll have to look at in more detail. Other suggestions are still welcome, but the more I get, the more the full update will be delayed. Thanks already!

Update 2: See take 2 for the full update.

Tags: geek
Posted Sun 15 Jan 2012 21:45:02 CET
Creative Commons License: unless otherwise stated, the content of this site is made available under a Creative Commons license.