Hi everyone! This is my first post here. I figured I would try to post something helpful to show my thanks for all the work from everyone else that I've enjoyed for so long (HD texture packs, tools, etc.). Thank you, everyone!
I wrote a small Python utility to help clean up large ROM collections by detecting duplicates and letting you manually choose which regional versions to keep.
What it does
The script recursively scans a ROM directory (currently focused on .zip files) and groups games based on the filename up to the first parenthesis.
Example grouping:
- Actraiser (USA)
- Actraiser (Europe)
- Actraiser (Japan)
These are treated as one group.
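The grouping rule can be tried on its own. This is a minimal sketch that reuses the regular expression from the script further down; the helper name group_key is hypothetical and exists only for this illustration.

```python
import re

# Same pattern as the script: capture everything before the first "("
TITLE_RE = re.compile(r"^(.*?)\s*\(")


def group_key(stem):
    """Return the lowercase base title used for grouping (hypothetical helper)."""
    match = TITLE_RE.match(stem)
    base = match.group(1) if match else stem  # no "(" means the whole stem is the title
    return base.strip().lower()


names = [
    "Actraiser (USA)",
    "Actraiser (Europe)",
    "Actraiser (Japan)",
    "Actraiser 2 (USA)",
    "Actraiser",
]
for name in names:
    print(f"{name!r} -> {group_key(name)!r}")
```

Note that "Actraiser 2 (USA)" gets its own key, while a file named just "Actraiser" (no region tag) lands in the same group as the tagged versions.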
How it works
For each detected duplicate group, it pauses and prompts you interactively:
Duplicates found:
1. Actraiser (USA).zip
2. Actraiser (Europe).zip
3. Actraiser (Japan).zip
Keep which numbers?
You then respond with something like:
1, 2
Any files not selected are automatically moved into a deleteme folder inside their original directory.
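For anyone curious how a reply like "1, 2" is checked, here is a rough sketch of the parsing step pulled out into a function. The name parse_selection is hypothetical; the script itself does the same thing inline inside its prompt loop.

```python
def parse_selection(response, count):
    """Turn a reply such as "1, 2" into a set of 1-based indices.

    Raises ValueError if the reply is empty, not numeric, or out of range.
    (Hypothetical helper, for illustration only.)
    """
    keep = {int(part.strip()) for part in response.split(",") if part.strip()}
    if not keep:
        raise ValueError("no numbers given")
    if any(i < 1 or i > count for i in keep):
        raise ValueError(f"numbers must be between 1 and {count}")
    return keep


# With three files listed, "1, 2" keeps the first two and leaves the third to be moved
keep = parse_selection("1, 2", 3)
to_move = [i for i in range(1, 4) if i not in keep]
print(sorted(keep), to_move)
```

An empty reply or an out-of-range number raises ValueError, which is what makes the script ask again instead of moving anything.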
Features
- Recursive directory scanning
- Groups ROMs by base title (before the first parenthesis)
- Interactive selection per group
- Safe removal method (moves instead of deleting)
- Handles multiple duplicates per title
- Prevents overwriting in deleteme folder
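The "prevents overwriting" part works by adding a numeric suffix when a file with the same name is already in the deleteme folder. Here is a small sketch of that idea using a temporary directory; the function name unique_destination is hypothetical, and the script does this inline rather than through a helper.

```python
import tempfile
from pathlib import Path


def unique_destination(folder, name):
    """Return a path inside folder that does not exist yet (hypothetical helper).

    If name is taken, try stem_1.ext, stem_2.ext, and so on.
    """
    original = Path(name)
    candidate = folder / name
    counter = 1
    while candidate.exists():
        candidate = folder / f"{original.stem}_{counter}{original.suffix}"
        counter += 1
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    deleteme = Path(tmp) / "deleteme"
    deleteme.mkdir()
    # Pretend an earlier run already moved this file here
    (deleteme / "Actraiser (Japan).zip").touch()
    dest = unique_destination(deleteme, "Actraiser (Japan).zip")
    print(dest.name)
```

In this run the name is already taken, so the suggested destination gets a _1 suffix before the extension.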
Requirements
- Python 3.x
- No external libraries
Script
#!/usr/bin/env python3
import re
import shutil
from pathlib import Path

ROOT_DIR = Path(input("ROM directory: ").strip()).expanduser()
if not ROOT_DIR.is_dir():
    raise SystemExit(f"Not a directory: {ROOT_DIR}")

# Match everything before the first "("
TITLE_RE = re.compile(r"^(.*?)\s*\(")

# Map each lowercase base title to the files that share it
groups = {}
for file_path in ROOT_DIR.rglob("*.zip"):
    if not file_path.is_file():
        continue
    # Skip files that an earlier run already moved into a deleteme folder
    if file_path.parent.name == "deleteme":
        continue
    match = TITLE_RE.match(file_path.stem)
    if match:
        key = match.group(1).strip().lower()
    else:
        key = file_path.stem.strip().lower()
    groups.setdefault(key, []).append(file_path)

# Only titles with more than one file need a decision
duplicate_groups = {
    key: files
    for key, files in groups.items()
    if len(files) > 1
}

print(f"\nFound {len(duplicate_groups)} duplicate groups.\n")

for key in sorted(duplicate_groups):
    files = sorted(duplicate_groups[key])
    print("=" * 80)
    print("Duplicates found:\n")
    for idx, f in enumerate(files, start=1):
        print(f"{idx}. {f.name}")

    # Ask again until the answer is a valid comma-separated list of numbers
    while True:
        response = input("\nKeep which numbers? ").strip()
        try:
            keep = {
                int(x.strip())
                for x in response.split(",")
                if x.strip()
            }
            if not keep:
                raise ValueError
            if any(i < 1 or i > len(files) for i in keep):
                raise ValueError
            break
        except ValueError:
            print(
                f"Please enter numbers between 1 and {len(files)} "
                "separated by commas."
            )

    # Move every file that was not selected into a deleteme folder next to it
    for idx, file_path in enumerate(files, start=1):
        if idx in keep:
            continue
        deleteme_dir = file_path.parent / "deleteme"
        deleteme_dir.mkdir(exist_ok=True)
        destination = deleteme_dir / file_path.name
        # Add a numeric suffix instead of overwriting an existing file
        counter = 1
        while destination.exists():
            destination = (
                deleteme_dir /
                f"{file_path.stem}_{counter}{file_path.suffix}"
            )
            counter += 1
        shutil.move(str(file_path), str(destination))
        print(f"Moved: {file_path.name}")
    print()

print("Done.")
Notes
- This is intentionally conservative: nothing is deleted, only moved to the deleteme folder.
- It is designed for manual control rather than fully automatic cleanup.
- Works best for ROM sets organized with region tags like (USA), (Europe), (Japan).






