
Find sparse files

Good morning, world. Please find herein a script for detecting sparse files. This is handy if you download over P2P with a client that stages partial downloads poorly, so that your media library keeps getting littered with them. Simply run:

mike@igor$ ~/bin/sparse /media |
  while read -r ;
   do rm -i "$REPLY" < /dev/tty ;  # rm -i must read its answer from the terminal, not the pipe
  done

It has been tested on FreeBSD, where it works with UFS and does not work with ZFS. It includes basic tests, which may be run with trial, Twisted's test runner:

mike@igor$ trial ~/bin/sparse

Usage: ./sparse [path …]

If no arguments are supplied, ‘.’ is assumed. The script checks the given files and recurses through the given folders, printing sparse files one per line as they are encountered.
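As a rough illustration of the traversal described above, here is a minimal Python 3 sketch built on os.walk. It is not the script itself, which uses twisted.python.filepath. The name find_sparse is borrowed from the script, but the body and the is_sparse predicate parameter are assumptions made for this example; any sparseness test can be plugged in.

```python
import os


def find_sparse(paths, is_sparse):
    """Yield files under `paths` for which the caller's is_sparse(path) is true.

    `is_sparse` is a hypothetical predicate supplied by the caller; this
    sketch only handles the walking. An empty `paths` means '.'.
    """
    for path in paths or ["."]:
        if os.path.isdir(path) and not os.path.islink(path):
            # os.walk does not follow symlinked directories by default.
            for dirpath, _dirnames, filenames in os.walk(path):
                for name in filenames:
                    candidate = os.path.join(dirpath, name)
                    # Skip symlinks, sockets, and other non-regular files.
                    if os.path.islink(candidate) or not os.path.isfile(candidate):
                        continue
                    if is_sparse(candidate):
                        yield candidate
        elif os.path.isfile(path) and is_sparse(path):
            yield path
```

Because it is a generator, results can be printed one per line as they are found, for example with: for path in find_sparse(sys.argv[1:], some_predicate): print(path).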

Bugs:

  • Does not work on ZFS partitions! Either the du report includes the sparse blocks, or the sparse files are not being created properly.

  • Depends on du. Tried using os.stat().st_blocks but haven’t made sense of it.

  • Depends on twisted.python.filepath and twisted.trial.
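For what it is worth, here is one way the st_blocks approach might look, as a hedged Python 3 sketch rather than a drop-in replacement for the du call. It relies on the Python documentation's description of st_blocks as a count of 512-byte blocks allocated to the file. The function names allocated_bytes and looks_sparse are made up for this example. Filesystems that compress data or store small files inline can also report fewer blocks than the size implies, so treat a positive result as a hint, not proof.

```python
import os

# Assumption: st_blocks counts 512-byte units, as the Python docs describe
# for Unix. The field is not available on every platform (e.g. Windows).
BLOCK_UNIT = 512


def allocated_bytes(st_blocks):
    """Bytes actually allocated on disk, according to st_blocks."""
    return st_blocks * BLOCK_UNIT


def looks_sparse(st_size, st_blocks):
    """True if fewer bytes are allocated than the file's apparent size."""
    return allocated_bytes(st_blocks) < st_size


def is_sparse(path):
    """Heuristic check on a real file; may give false positives (see above)."""
    st = os.stat(path)
    return looks_sparse(st.st_size, st.st_blocks)
```

A file with an apparent size of 1 MiB but only 8 blocks (4096 bytes) allocated would be flagged, while a 100-byte file occupying 8 blocks would not.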

UPDATE: This software is offered under certain terms and conditions. If you cannot accept these terms, you cannot use this software. The terms are as follows: I do not give a crap what you do. Try not to hurt anyone. FURTHER: fixed editing.


Comments

2009-Mar-Mon 15:45 Mike

I tried a weird thing to support unit testing. Any function which uses a global rebinds the global to a local using a default argument. The idea was that I would be able to pass in new values for testing purposes without hiding or wrapping the real globals.

It turned out to be less satisfying than I imagined, and I didn’t carry it through all the way. I became discouraged with the idea when I realized that these overrides would have to be passed along to calls made within the tested function. For example, “find_sparse” calls “is_sparse” and, recursively, “find_sparse” itself. Originally I forgot to pass the local value of “complain” in those invocations, so the outermost invocation of “find_sparse” behaved as expected under testing, but the recursions and calls to “is_sparse” reverted to logging to stderr. Passing along whatever my present local value is works for “complain”, but it would not work if I were calling a function that uses globals the caller doesn’t. Then things just get ugly, fast.

I consider this technique a failure as it is implemented here.
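To make the pitfall above concrete, here is a small Python 3 sketch. Only the names complain, is_sparse, and find_sparse come from the comment; the data model (nested lists of (name, flag) tuples standing in for folders and files) and the function bodies are invented for illustration and are not the script's actual code.

```python
import sys


def complain(message):
    """The 'real' global: log problems to stderr."""
    print(message, file=sys.stderr)


def is_sparse(entry, complain=complain):
    # The global `complain` is rebound to a local via the default argument,
    # so a test can pass a replacement without touching the global.
    name, sparse = entry
    if sparse is None:  # hypothetical "could not check this file" case
        complain("cannot check %s" % name)
        return False
    return sparse


def find_sparse_forgetful(tree, complain=complain):
    # Pitfall: the override is not passed along, so the recursion and the
    # is_sparse calls silently fall back to the real `complain`.
    found = []
    for entry in tree:
        if isinstance(entry, list):
            found.extend(find_sparse_forgetful(entry))
        elif is_sparse(entry):
            found.append(entry[0])
    return found


def find_sparse(tree, complain=complain):
    # Fixed: every nested call receives the caller's local `complain`.
    found = []
    for entry in tree:
        if isinstance(entry, list):
            found.extend(find_sparse(entry, complain=complain))
        elif is_sparse(entry, complain=complain):
            found.append(entry[0])
    return found
```

Calling find_sparse(tree, complain=log.append) in a test collects every message in log, while the forgetful variant leaves log empty and writes to stderr instead. The fix only works because both functions happen to take the same override, which is the limitation described above.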
