Le weblog entièrement nu

Roland, entirely naked... from time to time.

File set split utility, or the backpack problem

Joey Hess is looking for a tool that splits a set of files into subsets of a known maximum size, for archival on DVD. That sounds a bit like what my ~/bin/prepare-gallery-backups.pl script does. Strictly speaking, my script works per directory (think photo gallery albums) rather than per file, but I suppose it would be rather simple to adapt. Also, there's no guarantee of optimality: you may well end up with a few more subsets than strictly needed. I'm not trying to solve the notoriously hard "backpack problem" (usually called the knapsack problem; this variant is really bin packing), just looking for a "good enough for me" solution. Now if Joey wants to recode that properly and add it to moreutils, I'll be a happy man.
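
To make the "no guarantee of optimality" caveat concrete, here is a minimal sketch of the same greedy strategy on a toy input. The helper name next_fit_decreasing and the sizes are made up for illustration, they are not part of my script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: sort sizes in decreasing
# order, keep a single open subset, and close it as soon as the next
# item would overflow it.
sub next_fit_decreasing {
    my ($maxsize, @sizes) = @_;
    my @subsets;
    my @current;
    my $used = 0;
    for my $size (sort { $b <=> $a } @sizes) {
        if (@current && $used + $size > $maxsize) {
            push @subsets, [@current];
            @current = ();
            $used = 0;
        }
        push @current, $size;
        $used += $size;
    }
    push @subsets, [@current] if @current;
    return @subsets;
}

# Toy sizes with a capacity of 10. Two subsets would be enough:
# (6, 4) and (5, 5).
my @subsets = next_fit_decreasing(10, 5, 4, 6, 5);
print scalar(@subsets), " subsets: ",
    join(" | ", map { join("+", @$_) } @subsets), "\n";
# prints "3 subsets: 6 | 5+5 | 4"
```

The greedy pass never goes back to a subset it has already closed, so the 4 cannot join the 6, and we end up with one subset more than needed.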

I was about to paste just the relevant excerpt of my script, but since only a few lines would have been omitted, I figured what the hell. Here's the whole script, licensed under a "send me a lot of money if you like, otherwise do what you want" license.

#! /usr/bin/perl

use File::Find ;

$maxsize = 700000 ; # in kilobytes, the unit reported by du -k below (about 700 MB)
$remote_dir = "/srv/www/gallery.placard.fr.eu.org" ;
$rsync_dir = "/space/backups/gallery-backup/rsync/" ;
$albumsdir = "/space/backups/gallery-backup/rsync/gallery.placard.fr.eu.org/albums" ;
$backupdir = "/space/backups/gallery-backup/backups" ;

# system "rsync -avL --rsh=ssh --delete clodomir:$remote_dir $rsync_dir" ;
system "rsync -avL --rsh=ssh --delete mirobole:$remote_dir $rsync_dir" ;

if ( (defined $ARGV[0])
   && ($ARGV[0] eq "--make-isos") ) {

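    # Collect the top-level entries of $albumsdir into @alist (see wanted() below).
    File::Find::find({wanted => \&wanted, follow => 1}, $albumsdir);
    for $i (@alist) {
        $fname = $i ;
        $fname =~ s,.*/,, ;
        # -s gives a single total per album; without it, du prints one line
        # per subdirectory. -k reports kilobytes, matching $maxsize.
        $du = qx/du -sk "$i"/ ;
        $size = $du ;
        $size =~ s,\s.*,,s ;
        $albums{$fname}{size} = $size ;
        $albums{$fname}{path} = $i ;
    }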

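    # Greedy packing: take albums from largest to smallest and close the
    # current backpack as soon as the next album would overflow it.
    foreach $album (sort {$albums{$b}{size} <=> $albums{$a}{size}} keys %albums) {
        $size = $albums{$album}{size} ;
        $name = $album ;
        # Only close a non-empty backpack, so that an album larger than
        # $maxsize does not produce an empty one.
        if(@curlist && $cur + $size > $maxsize) {
            my @bplist = @curlist ;
            my %entry = ('list' => \@bplist,
                         'size' => $cur) ;
            push @backpacks, \%entry ;
            $cur = 0 ;
            @curlist = () ;
        }
        $cur += $size ;
        push @curlist, $name ;
    }
    # Store the last backpack, unless there were no albums at all.
    if (@curlist) {
        my @bplist = @curlist ;
        my %entry = ('list' => \@bplist,
                     'size' => $cur) ;
        push @backpacks, \%entry ;
    }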

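    # Build one ISO image per backpack.
    $i = 1 ;
    foreach $bp (@backpacks) {
        print "Backpack $i (size $bp->{size}) :\n" ;
        $bpdir = $backupdir . "/backup-$i" ;
        # List-form system() bypasses the shell, so album names with
        # spaces or quotes are passed through untouched.
        system "mkdir", "-p", $bpdir ;
        foreach $n (@{$bp->{list}}) {
            print "\t$n\n" ;
            system "cp", "-a", "$albumsdir/$n", "$bpdir/" ;
        }
        system "mkisofs", "-quiet", "-r", "-o", "$backupdir/backup-$i.iso", $bpdir ;
        system "rm", "-rf", $bpdir ;
        $i++ ;
    }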
}


# Keep only the direct children of $albumsdir: plain files and album directories.
sub wanted {
    return unless $File::Find::dir eq $albumsdir ;
    return if $File::Find::name eq $albumsdir ;
    push @alist, $File::Find::name
        if (-f $File::Find::name) || (-d $File::Find::name) ;
}
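
For the per-file variant Joey is after, something along these lines might be a starting point. It is only a sketch under a few assumptions: the helper names file_sizes and first_fit_decreasing are made up, sizes are plain file sizes rounded up to whole kilobytes (filesystem and ISO overhead are ignored), and it swaps my "one open backpack" loop for first-fit decreasing, which tries every existing subset before opening a new one. Still no guarantee of optimality.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: map every regular file below $dir to its size,
# rounded up to whole kilobytes.
sub file_sizes {
    my ($dir) = @_;
    my %sizes;
    find({ no_chdir => 1, wanted => sub {
        return unless -f $_;    # with no_chdir, $_ is the full path
        $sizes{$_} = int(((-s _) + 1023) / 1024);
    } }, $dir);
    return %sizes;
}

# Hypothetical helper: first-fit decreasing. Items go from largest to
# smallest into the first subset that still has room; a new subset is
# opened only when none fits. An item larger than $maxsize ends up
# alone in an oversized subset.
sub first_fit_decreasing {
    my ($maxsize, %sizes) = @_;
    my @bins;    # each bin: { size => total, list => [names] }
    for my $name (sort { $sizes{$b} <=> $sizes{$a} || $a cmp $b } keys %sizes) {
        my $size = $sizes{$name};
        my ($bin) = grep { $_->{size} + $size <= $maxsize } @bins;
        if (!$bin) {
            $bin = { size => 0, list => [] };
            push @bins, $bin;
        }
        $bin->{size} += $size;
        push @{ $bin->{list} }, $name;
    }
    return @bins;
}

# Usage: pass a directory as first argument, and optionally a maximum
# size in kilobytes (defaults to the 700000 used in my script).
if (@ARGV) {
    my ($dir, $maxsize) = @ARGV;
    $maxsize = 700000 unless defined $maxsize;
    my $n = 1;
    for my $bin (first_fit_decreasing($maxsize, file_sizes($dir))) {
        print "Subset $n (size $bin->{size}):\n";
        print "\t$_\n" for @{ $bin->{list} };
        $n++;
    }
}
```

On sizes 6, 5, 5 and 4 with a capacity of 10, this yields two subsets (6+4 and 5+5), where the single-pass loop in my script would produce three.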
Creative Commons License: unless otherwise stated, the content of this site is made available under a Creative Commons license.