File Deduplicator
Find and remove duplicate files intelligently. Save storage space, keep your system clean. Perfect for digital hoarders and document management.
Vernox Utility Skill - Clean up your digital hoard.
File-Deduplicator is an intelligent duplicate file finder and remover. It uses content hashing to identify identical files across directories, then offers options for removing the duplicates safely.
clawhub install file-deduplicator
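For background, content-based duplicate detection generally works by hashing each file's bytes and grouping paths whose digests match. The sketch below illustrates that idea with Node's built-in crypto, fs, and path modules; it is an illustration of the technique, not this skill's internal code, and the helper names (hashFile, groupByContent) are made up for the example.

const { createHash } = require('crypto');
const { readFileSync, readdirSync, statSync } = require('fs');
const { join } = require('path');

// Hash a file's full contents; identical bytes produce identical digests.
function hashFile(path) {
  return createHash('sha256').update(readFileSync(path)).digest('hex');
}

// Walk a directory tree and group file paths by content hash.
function groupByContent(dir, groups = new Map()) {
  for (const name of readdirSync(dir)) {
    const path = join(dir, name);
    if (statSync(path).isDirectory()) {
      groupByContent(path, groups);
    } else {
      const digest = hashFile(path);
      groups.set(digest, [...(groups.get(digest) || []), path]);
    }
  }
  return groups;
}

// Any group with more than one path is a set of duplicates.
const duplicates = [...groupByContent('./documents').values()]
  .filter((paths) => paths.length > 1);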
const result = await findDuplicates({
  directories: ['./documents', './downloads', './projects'],
  options: {
    method: 'content',      // content-based comparison
    includeSubdirs: true
  }
});

console.log(`Found ${result.duplicateCount} duplicate groups`);
console.log(`Potential space savings: ${result.spaceSaved}`);
const result = await removeDuplicates({
  directories: ['./documents', './downloads'],
  options: {
    method: 'content',
    keep: 'newest',        // keep newest, delete oldest
    action: 'delete',      // or 'move' to archive
    autoConfirm: false     // show confirmation for each
  }
});

console.log(`Removed ${result.filesRemoved} duplicates`);
console.log(`Space saved: ${result.spaceSaved}`);
const result = await removeDuplicates({
  directories: ['./documents', './downloads'],
  options: {
    method: 'content',
    keep: 'newest',
    action: 'delete',
    dryRun: true           // Preview without actual deletion
  }
});

console.log('Would remove:');
result.duplicates.forEach((dup, i) => {
  console.log(`${i+1}. ${dup.file}`);
});
findDuplicates
Find duplicate files across directories.
Parameters:
directories (array|string, required): Directory paths to scan
options (object, optional):
method (string): 'content' | 'size' | 'name' - comparison method
includeSubdirs (boolean): Scan recursively (default: true)
minSize (number): Minimum size in bytes (default: 0)
maxSize (number): Maximum size in bytes (default: 0)
excludePatterns (array): Glob patterns to exclude (default: ['.git', 'node_modules'])
whitelist (array): Directories to never scan (default: [])

Returns:
duplicates (array): Array of duplicate groups
duplicateCount (number): Number of duplicate groups found
totalFiles (number): Total files scanned
scanDuration (number): Time taken to scan (ms)
spaceWasted (number): Total bytes wasted by duplicates
spaceSaved (number): Potential savings if duplicates removed
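A small scan summary built only from the return fields listed above (the formatting is hypothetical; the field names are as documented):

const scan = await findDuplicates({ directories: ['./documents'] });

console.log(`Scanned ${scan.totalFiles} files in ${scan.scanDuration} ms`);
console.log(`${scan.duplicateCount} duplicate groups waste ${scan.spaceWasted} bytes`);
console.log(`Removing them would save ${scan.spaceSaved} bytes`);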
removeDuplicates
Remove duplicate files based on findings.

Parameters:
directories (array|string, required): Same as findDuplicates
options (object, optional):
keep (string): 'newest' | 'oldest' | 'smallest' | 'largest' - which to keep
action (string): 'delete' | 'move' | 'archive'
archivePath (string): Where to move files when action='move'
dryRun (boolean): Preview without actual action
autoConfirm (boolean): Auto-confirm deletions
sizeThreshold (number): Don't remove files larger than this

Returns:
filesRemoved (number): Number of files removed/moved
spaceSaved (number): Bytes saved
groupsProcessed (number): Number of duplicate groups handled
logPath (string): Path to action log
errors (array): Any errors encountered

analyzeDirectory
Analyze a single directory for duplicates.
Parameters:
directory (string, required): Path to directory
options (object, optional): Same as findDuplicates options

Returns:
fileCount (number): Total files in directory
totalSize (number): Total bytes in directory
duplicateSize (number): Bytes in duplicate files
duplicateRatio (number): Percentage of files that are duplicates

config.json:

{
  "detection": {
    "defaultMethod": "content",
    "sizeTolerancePercent": 0,      // exact match only
    "nameSimilarity": 0.7,          // 0-1, lower = more similar
    "includeSubdirs": true
  },
  "removal": {
    "defaultAction": "delete",
    "defaultKeep": "newest",
    "archivePath": "./archive",
    "sizeThreshold": 10485760,      // 10MB threshold
    "autoConfirm": false,
    "dryRunDefault": false
  },
  "exclude": {
    "patterns": [".git", "node_modules", ".vscode", ".idea"],
    "whitelist": ["important", "work", "projects"]
  }
}
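The same knobs can also be passed per call. For example, a scan that ignores tiny files and a few extra directories, assuming the documented minSize and excludePatterns options and that call-time options override the config.json defaults:

const result = await findDuplicates({
  directories: ['./projects'],
  options: {
    method: 'content',
    minSize: 1024,                                    // ignore files under 1 KB
    excludePatterns: ['.git', 'node_modules', 'dist'] // skip build and VCS dirs
  }
});

console.log(`${result.duplicateCount} duplicate groups above 1 KB`);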
const result = await findDuplicates({
  directories: '~/Documents',
  options: {
    method: 'content',
    includeSubdirs: true
  }
});

console.log(`Found ${result.duplicateCount} duplicate sets`);
result.duplicates.slice(0, 5).forEach((set, i) => {
  console.log(`Set ${i+1}: ${set.files.length} files`);
  console.log(`Total size: ${set.totalSize} bytes`);
});
const result = await removeDuplicates({
  directories: '~/Documents',
  options: {
    keep: 'newest',
    action: 'delete'
  }
});

console.log(`Removed ${result.filesRemoved} files`);
console.log(`Saved ${result.spaceSaved} bytes`);
const result = await removeDuplicates({
  directories: '~/Downloads',
  options: {
    keep: 'newest',
    action: 'move',
    archivePath: '~/Documents/Archive'
  }
});

console.log(`Archived ${result.filesRemoved} files`);
console.log('Safe in: ~/Documents/Archive');
const result = await removeDuplicates({
  directories: '~/Documents',
  options: {
    dryRun: true   // Just show what would happen
  }
});

console.log('=== Dry Run Preview ===');
result.duplicates.forEach((set, i) => {
  console.log(`Would delete: ${set.toDelete.join(', ')}`);
});
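analyzeDirectory is documented above but has no example of its own. Here is a hedged sketch based on its parameter and return descriptions; the call shape is assumed to mirror the other functions:

// Call shape assumed; analyzeDirectory takes a single directory path.
const report = await analyzeDirectory({
  directory: '~/Pictures',
  options: { method: 'content' }
});

console.log(`${report.fileCount} files, ${report.totalSize} bytes total`);
console.log(`${report.duplicateSize} bytes in duplicates (${report.duplicateRatio}% of files)`);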
Won't remove files larger than a configurable threshold (default: 10 MB), which prevents accidental deletion of important large files.
Can move files to an archive directory instead of deleting them: no data loss, full recoverability.
All deletions and moves are logged to a file for recovery and auditing.
The log file can be used to restore accidentally deleted files (limited undo window).
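A cautious end-to-end pass that combines these safeguards: preview with dryRun first, archive instead of deleting, and keep the returned logPath for auditing. Field names are as documented above; the log file's format itself is not specified here.

// 1. Preview what would be affected.
const preview = await removeDuplicates({
  directories: ['~/Downloads'],
  options: { keep: 'newest', dryRun: true }
});
console.log(`Dry run: ${preview.duplicates.length} duplicate groups found`);

// 2. Archive instead of deleting, so everything stays recoverable.
const result = await removeDuplicates({
  directories: ['~/Downloads'],
  options: {
    keep: 'newest',
    action: 'move',
    archivePath: '~/Downloads/dedup-archive',
    sizeThreshold: 10485760   // never touch files over 10 MB
  }
});

console.log(`Moved ${result.filesRemoved} files; action log at ${result.logPath}`);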
MIT
Find duplicates. Save space. Keep your system clean. 🔮