I'm trying to deduplicate my metadata files (files/*/**/entries), which have the following lengths histogram:

    306784442  0 <= x < 2153043           ************************************************************
          495  2153043 <= x < 4306086
          173  4306086 <= x < 6459129
          141  6459129 <= x < 8612172
          445  8612172 <= x < 10765215
           75  10765215 <= x < 12918258
            0  12918258 <= x < 15071301
            0  15071301 <= x < 17224344
            0  17224344 <= x < 19377387
          396  19377387 <= x <= 21530430

This is on a Linux machine with 64 Gigabytes of RAM.

Here are the programs I tried:

equivs3f:
This is from my personal Subversion. It was killed by the OOM killer while still trying to inhale the file sizes.

jdupes:
I tried:

    apt-get install jdupes
    cd /mnt/backshift/save-directory
    jdupes --recurse --link-hard --hard-links files

It was eventually killed by the OOM killer too.

fclones:
I built fclones from source, even though there's a non-repo .deb for it. Tried:

    cd /mnt/backshift/filesdups
    TMPDIR=$(pwd) fclones group /mnt/backshift/save-directory/files > dups.txt

Oddly, this counted (much?) higher on the first stage than the histogram above would seem to suggest should be necessary. It came /very/ close to 64 Gigabytes of RAM, but just before that happened, it moved on to other stages. However, it started doing stuff with XFS extents, which I didn't want, so I killed it.

drs-dedup:
Written (mostly) for this purpose, in bash and python, by me. It too comes from my personal Subversion. It's extremely slow, probably because of the poor locality of reference. But at least it's not because the virtual memory system is thrashing.

    Started 2026 Mon Apr 20 06:30:08 AM PDT
    df -h /mnt/backshift/save-directory/
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda1        11T  7.5T  3.5T  69% /mnt/backshift
    above cmd output
    done 2026 Sat May 23 10:08:06 AM PDT

I did about 7-9 backups while it was running, before it completed; these were not part of the deduplication process.
Also, I had atimes turned on for both files and directories; turning those off probably would have helped, as likely would upgrading the OS from Debian Bookworm to Debian Trixie.

drs-dedup appeared to work, but it only freed up about a tenth of a terabyte, so I shelved the idea.
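
For anyone hitting the same OOM wall, here is a minimal sketch of one way to keep the "collect every file size" stage out of RAM: write (size, path) rows to an on-disk SQLite table, then hash only the files whose size is shared with at least one other file. This is not how equivs3f, jdupes, fclones, or drs-dedup work internally. The function names (index_sizes, sha256_of, duplicate_groups), the table layout, and the batch size are all made up for illustration. It is untested at anything like 300 million files, and it does nothing about the poor locality of reference mentioned above.

```python
import hashlib
import os
import sqlite3
import stat


def index_sizes(root, db_path):
    """Record (size, path) for every regular file under root in an on-disk table."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS files (size INTEGER, path BLOB)")
    batch = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                # Store the path as bytes so non-UTF-8 names survive the round trip.
                batch.append((st.st_size, os.fsencode(path)))
            if len(batch) >= 10000:  # arbitrary batch size (assumption)
                conn.executemany("INSERT INTO files VALUES (?, ?)", batch)
                batch.clear()
    conn.executemany("INSERT INTO files VALUES (?, ?)", batch)
    conn.execute("CREATE INDEX IF NOT EXISTS files_by_size ON files (size)")
    conn.commit()
    return conn


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large files are never read whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.digest()


def duplicate_groups(conn):
    """Yield sorted lists of paths whose size and SHA-256 digest both match."""
    sizes = conn.execute(
        "SELECT size FROM files WHERE size > 0 GROUP BY size HAVING COUNT(*) > 1"
    )
    for (size,) in sizes:
        by_digest = {}  # only holds the files of one size at a time
        rows = conn.execute("SELECT path FROM files WHERE size = ?", (size,))
        for (raw_path,) in rows:
            path = os.fsdecode(raw_path)
            try:
                by_digest.setdefault(sha256_of(path), []).append(path)
            except OSError:
                continue  # unreadable file; leave it out of every group
        for paths in by_digest.values():
            if len(paths) > 1:
                yield sorted(paths)
```

Usage would be something like conn = index_sizes("/mnt/backshift/save-directory/files", "sizes.db") followed by a loop over duplicate_groups(conn). The sketch only reports groups; it does not hard-link anything, it skips zero-length files, and files that are already hard links to each other are still reported together. Memory use during the hashing stage is bounded by the largest set of same-size files rather than by the total file count.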