Stop me if you've heard this one...
fdupes is one of those delightful utilities that you quickly wonder
how you got along without. "All" it does is, given a directory or set
of directories, tell you which files are duplicates. It finds them
first by comparing md5 sums and then, for any matches, by a
byte-by-byte comparison.
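
To make the two-stage idea concrete, here is a minimal Python sketch of the same approach: bucket files by md5 sum, then confirm each bucket byte by byte. This is not fdupes' actual code, just an illustration; the names md5_of and find_duplicates are made up for this example, and skipping symbolic links is my own simplification.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def md5_of(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(*directories):
    """Return a list of groups, each a sorted list of paths with identical content.

    Hypothetical helper for illustration only; not part of fdupes.
    """
    by_hash = defaultdict(list)
    seen = set()
    for directory in directories:
        for root, _dirs, files in os.walk(directory):
            for name in files:
                path = os.path.join(root, name)
                # Assumption: ignore symbolic links and anything that is not a regular file.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                real = os.path.realpath(path)
                if real in seen:
                    continue  # same path reached twice via overlapping directories
                seen.add(real)
                by_hash[md5_of(path)].append(path)

    groups = []
    for candidates in by_hash.values():
        # Stage two: confirm every md5 match with a byte-by-byte comparison.
        remaining = sorted(candidates)
        while len(remaining) > 1:
            first, rest = remaining[0], remaining[1:]
            same = [first] + [p for p in rest if filecmp.cmp(first, p, shallow=False)]
            if len(same) > 1:
                groups.append(same)
            remaining = [p for p in rest if p not in same]
    return groups
```

Calling find_duplicates("photos", "sort-later-dont-delete") would return one list per set of identical files, whatever they happen to be called. Because only content is compared, a renamed copy lands in the same group as its original.
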

In my case I needed to rationalise our photo collection a bit, as I
had too many directories named things like sort-later-dont-delete/
which contained photos already dealt with. Of course, as fdupes makes
use of md5 sums, I would also catch photos that had been renamed.

It's got some nice options for dealing with symbolic and hard links
(optimise your kernel trees, anyone?) and appears to handle large file
sets efficiently. Team it with xargs and you've got a pretty
formidable little tool (or an unintentionally empty home directory if
you're careless, I guess :)