Joey Hess is looking for a tool that splits a set of files into subsets of a known maximal size, for archival on DVD. That sounds a bit like what my ~/bin/prepare-gallery-backups.pl script does. Actually, my script doesn't operate at per-file granularity but per-directory (think photo gallery albums), though I suppose it would be rather simple to adapt. Also, there's no guarantee of optimality: the script just sorts the albums by decreasing size and fills one subset after the other, so you may well end up with a few more subsets than strictly needed. I'm not trying to solve the notoriously hard "backpack problem" (better known as the knapsack problem, or bin packing in this particular form), just looking for a "good enough for me" solution. Now if Joey wants to recode that properly and add it to moreutils, I'll be a happy man.
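As a rough illustration of the per-file adaptation, here is a minimal sketch of the packing step on its own. It is not taken from the script: it uses first-fit decreasing, a slightly tighter heuristic that puts each item into the first subset that still has room. The function name pack_first_fit and the sizes are made up for the example, and the sizes can be per-file or per-directory, in whatever unit you measure them.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): distribute items
# over sets of at most $max units using first-fit decreasing.
# %$sizes maps an item name (file or directory) to its size.
sub pack_first_fit {
    my ($sizes, $max) = @_;
    my @sets;    # each set is { size => total, list => [names] }

    # Largest items first; ties are broken by name so the result is stable.
    for my $name (sort { $sizes->{$b} <=> $sizes->{$a} or $a cmp $b } keys %$sizes) {
        my $size = $sizes->{$name};

        # Reuse the first existing set that still has room for this item.
        my ($target) = grep { $_->{size} + $size <= $max } @sets;

        # Otherwise open a new set; an item larger than $max ends up alone.
        if (!$target) {
            $target = { size => 0, list => [] };
            push @sets, $target;
        }
        $target->{size} += $size;
        push @{ $target->{list} }, $name;
    }
    return @sets;
}

# Made-up sizes in kilobytes, purely for illustration.
my %sizes = (a => 400, b => 350, c => 300, d => 250, e => 100);
my @sets = pack_first_fit(\%sizes, 700);
for my $i (0 .. $#sets) {
    printf "Set %d (size %d): %s\n", $i + 1, $sets[$i]{size}, join(' ', @{ $sets[$i]{list} });
}
```

With these made-up sizes the sketch ends up with two full sets of 700 (a and c, then b, d and e). Filling one set at a time in decreasing order and never going back to an earlier set would need three.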
I was about to paste just the relevant excerpt of my script, but since only a few lines would have been omitted, I figured what the hell. Here's the whole script, licensed under a "send me a lot of money if you like, otherwise do what you want" license.
#! /usr/bin/perl

use strict ;
use warnings ;
use File::Find ;

# Maximal size of one backup set, in kilobytes as reported by du -k
# (700000 is about one CD)
my $maxsize = 700000 ;

my $remote_dir = "/srv/www/gallery.placard.fr.eu.org" ;
my $rsync_dir = "/space/backups/gallery-backup/rsync/" ;
my $albumsdir = "/space/backups/gallery-backup/rsync/gallery.placard.fr.eu.org/albums" ;
my $backupdir = "/space/backups/gallery-backup/backups" ;

my (@alist, %albums, @backpacks, @curlist) ;
my $cur = 0 ;

# Refresh the local mirror of the gallery
# system "rsync -avL --rsh=ssh --delete clodomir:$remote_dir $rsync_dir" ;
system "rsync -avL --rsh=ssh --delete mirobole:$remote_dir $rsync_dir" ;

if ( (defined $ARGV[0]) && ($ARGV[0] eq "--make-isos") ) {
    # Collect the top-level entries (the albums) into @alist
    File::Find::find({wanted => \&wanted, follow => 1}, $albumsdir);

    # Record the size and path of each album
    for my $i (@alist) {
        my $fname = $i ;
        $fname =~ s,.*/,, ;
        # -s gives a single total per album, even when it has subdirectories
        my $du = qx/du -sk $i/ ;
        my $size = $du ;
        $size =~ s,\s.*,,s ;
        $albums{$fname}{size} = $size ;
        $albums{$fname}{path} = $i ;
    }

    # Fill the backpacks one after the other, largest albums first
    foreach my $album (sort {$albums{$b}{size} <=> $albums{$a}{size}} keys %albums) {
        my $size = $albums{$album}{size} ;
        my $name = $album ;
        $name =~ s,.*/,,;
        # Close the current backpack when the next album would overflow it
        # (never close an empty one: an oversized album gets its own backpack)
        if(@curlist && $cur + $size > $maxsize) {
            my @bplist = @curlist ;
            my %entry = ('list' => \@bplist, 'size' => $cur) ;
            push @backpacks, \%entry ;
            $cur = 0 ;
            @curlist = () ;
        }
        $cur += $size ;
        push @curlist, $name ;
    }
    # Don't forget the last, partially filled backpack
    if (@curlist) {
        my @bplist = @curlist ;
        my %entry = ('list' => \@bplist, 'size' => $cur) ;
        push @backpacks, \%entry ;
    }

    # Build one ISO image per backpack
    my $i = 1 ;
    foreach my $bp (@backpacks) {
        print "Backpack $i (size $bp->{size}) :\n" ;
        my $bpdir = $backupdir . "/backup-$i" ;
        system "mkdir -p $bpdir" ;
        foreach my $n (@{$bp->{list}}) {
            print "\t$n\n" ;
            system "cp -a $albumsdir/$n $bpdir/" ;
        }
        system "mkisofs -quiet -r -o $backupdir/backup-$i.iso $bpdir" ;
        system "rm -rf $bpdir" ;
        $i++ ;
    }
}

# Keep only the files and directories located directly under $albumsdir
sub wanted {
    # print "$File::Find::name ; $File::Find::dir\n" ;
    if ((-f $File::Find::name) && ($File::Find::dir eq $albumsdir)) {
        # print "Keeping file $File::Find::name\n" ;
        push @alist, "$File::Find::name" ;
    }
    if ((-d $File::Find::name) && ($File::Find::dir eq $albumsdir) && ($File::Find::name ne $albumsdir)) {
        # print "Keeping dir $File::Find::name\n" ;
        push @alist, "$File::Find::name" ;
    }
}