[Python] Interactive ROM Duplicate Finder and Region Selector

SevenSilverCheese

Hi everyone! This is my first post here. I figured I would try to post something helpful to show my thanks for all the work from everyone else that I've enjoyed for so long (HD texture packs, tools, etc.). Thank you, everyone!


I wrote a small Python utility to help clean up large ROM collections by detecting duplicates and letting you manually choose which regional versions to keep.

What it does

The script recursively scans a ROM directory (currently focused on .zip files) and groups games based on the filename up to the first parenthesis.

Example grouping:

  • Actraiser (USA)
  • Actraiser (Europe)
  • Actraiser (Japan)
These are treated as one group.
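
To make the grouping rule concrete, here is a minimal sketch of it in isolation. The helper names `base_title` and `group_by_title` are made up for this example (the full script further down does the same thing inline), and it works on plain filename stems rather than real files.

```python
import re
from collections import defaultdict

# Same pattern as the full script: capture everything before the first "("
TITLE_RE = re.compile(r"^(.*?)\s*\(")


def base_title(stem):
    """Return the lowercased title before the first parenthesis (hypothetical helper)."""
    match = TITLE_RE.match(stem)
    title = match.group(1) if match else stem  # no "(" at all: use the whole stem
    return title.strip().lower()


def group_by_title(stems):
    """Bucket filename stems by their base title (hypothetical helper)."""
    groups = defaultdict(list)
    for stem in stems:
        groups[base_title(stem)].append(stem)
    return dict(groups)


groups = group_by_title(
    ["Actraiser (USA)", "Actraiser (Europe)", "Actraiser (Japan)", "Zelda"]
)
print(groups["actraiser"])  # the three regional versions end up together
```

Because the key is lowercased and cut at the first "(", anything after it, such as a second tag like (Rev 1), does not affect the grouping.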

How it works

For each detected duplicate group, it pauses and prompts you interactively:

Duplicates found:
1. Actraiser (USA).zip
2. Actraiser (Europe).zip
3. Actraiser (Japan).zip

Keep which numbers?
You then respond with the numbers you want to keep, separated by commas, something like 1 or 1,2.

Any files not selected are automatically moved into a deleteme folder inside their original directory.
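
For anyone curious how the answer is checked, this is a rough sketch of the parsing step pulled out into a function. The name `parse_selection` is hypothetical (the script does this inline inside its prompt loop), but the rules are the same: comma-separated numbers, at least one, all within range.

```python
def parse_selection(response, count):
    """Turn '1, 3' into {1, 3}; raise ValueError for empty or out-of-range input."""
    # int() tolerates surrounding spaces; empty pieces (e.g. a trailing comma) are skipped
    keep = {int(part) for part in response.split(",") if part.strip()}
    if not keep or any(i < 1 or i > count for i in keep):
        raise ValueError(f"expected numbers between 1 and {count}")
    return keep


print(parse_selection("1, 3", 3))
```

Anything that is not a number, or a number outside the list, raises ValueError, which is what makes the script ask again instead of moving files.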

Features

  • Recursive directory scanning
  • Groups ROMs by base title (everything before the first parenthesis)
  • Interactive selection per group
  • Safe removal method (moves instead of deleting)
  • Handles multiple duplicates per title
  • Prevents overwriting in deleteme folder
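
The "prevents overwriting" part works by adding a numeric suffix when a file with the same name is already in the deleteme folder. Below is a small sketch of that idea on its own; `unique_destination` is a made-up helper name for this example, and the demo uses a temporary folder rather than a real ROM directory.

```python
import tempfile
from pathlib import Path


def unique_destination(folder, name):
    """Return a path inside folder that does not exist yet (hypothetical helper)."""
    candidate = folder / name
    stem, suffix = Path(name).stem, Path(name).suffix
    counter = 1
    # Try name_1.zip, name_2.zip, ... until a free name is found
    while candidate.exists():
        candidate = folder / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    deleteme = Path(tmp) / "deleteme"
    deleteme.mkdir()
    (deleteme / "Actraiser (Japan).zip").touch()  # simulate an earlier move
    result_name = unique_destination(deleteme, "Actraiser (Japan).zip").name
    print(result_name)
```

The original file name is kept whenever it is free, so the suffix only shows up when there is an actual clash.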

Requirements

  • Python 3.x
  • No external libraries

Script

#!/usr/bin/env python3

import re
import shutil
from pathlib import Path

ROOT_DIR = Path(input("ROM directory: ").strip()).expanduser()

# Match everything before the first "("
TITLE_RE = re.compile(r"^(.*?)\s*\(")

groups = {}

for file_path in ROOT_DIR.rglob("*.zip"):
    if not file_path.is_file():
        continue

    # Skip files already set aside in a deleteme folder by an earlier run
    if "deleteme" in file_path.relative_to(ROOT_DIR).parts:
        continue

    match = TITLE_RE.match(file_path.stem)

    # Group key: lowercased title before the first "(", or the whole stem
    if match:
        key = match.group(1).strip().lower()
    else:
        key = file_path.stem.strip().lower()

    groups.setdefault(key, []).append(file_path)

# Only titles with more than one file need a decision
duplicate_groups = {
    key: files
    for key, files in groups.items()
    if len(files) > 1
}

print(f"\nFound {len(duplicate_groups)} duplicate groups.\n")

for key in sorted(duplicate_groups):
    files = sorted(duplicate_groups[key])

    print("=" * 80)
    print("Duplicates found:\n")

    for idx, f in enumerate(files, start=1):
        print(f"{idx}. {f.name}")

    # Keep asking until the answer is a valid comma-separated list of numbers
    while True:
        response = input("\nKeep which numbers? ").strip()

        try:
            keep = {
                int(x.strip())
                for x in response.split(",")
                if x.strip()
            }

            if not keep:
                raise ValueError

            if any(i < 1 or i > len(files) for i in keep):
                raise ValueError

            break

        except ValueError:
            print(
                f"Please enter numbers between 1 and {len(files)} "
                "separated by commas."
            )

    for idx, file_path in enumerate(files, start=1):
        if idx in keep:
            continue

        # Unselected files go to a deleteme folder next to the original
        deleteme_dir = file_path.parent / "deleteme"
        deleteme_dir.mkdir(exist_ok=True)

        destination = deleteme_dir / file_path.name

        # Add _1, _2, ... so nothing already in deleteme is overwritten
        counter = 1
        while destination.exists():
            destination = (
                deleteme_dir /
                f"{file_path.stem}_{counter}{file_path.suffix}"
            )
            counter += 1

        shutil.move(str(file_path), str(destination))
        print(f"Moved: {file_path.name}")

    print()

print("Done.")

Notes

  • This is intentionally conservative: nothing is deleted, only moved to the deleteme folder.
  • It is designed for manual control rather than fully automatic cleanup.
  • Works best for ROM sets organized with region tags like (USA), (Europe), (Japan).
 
