mirror of https://github.com/ciromattia/kcc synced 2025-12-13 01:36:27 +00:00

huge speed optimization on HDD by removing md5 (#845)

* Eliminate unnecessary use of MD5 checksum

md5Checksum() computes the actual checksum of a specified file, which is appropriately expensive because it reads the whole file. The code appeared to use the checksum only as a key into the imgMetadata dictionary, so that entries would survive image files being renamed during processing steps. That is a very expensive way to handle renames, so I now update the imgMetadata keys with the new filename in the one place where the rename happens, and avoid MD5 checksums entirely.

* merge conflicts

* Add missing handling for image path renames due to nested chapter folder name

* merge conflicts

* merge conflicts

* add perf_counters

* imgFileProcessing perf_counter

* use startswith and removeprefix
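
A minimal sketch of the rekeying idea described above, not the code from this commit: metadata is keyed by file path, and the key is moved whenever a file or its containing chapter folder is renamed. The helper names `rekey_on_rename` and `rekey_on_folder_rename` are hypothetical, and the metadata values are placeholders. `str.removeprefix` requires Python 3.9 or later.

```python
import os


def rekey_on_rename(img_metadata, old_path, new_path):
    """Move one metadata entry from old_path to new_path, if present."""
    if old_path in img_metadata:
        img_metadata[new_path] = img_metadata.pop(old_path)


def rekey_on_folder_rename(img_metadata, old_dir, new_dir):
    """Rewrite every key under old_dir so that it points under new_dir."""
    # A trailing separator keeps 'ch1' from also matching 'ch10'.
    old_prefix = os.path.join(old_dir, '')
    new_prefix = os.path.join(new_dir, '')
    # Iterate over a copy of the keys because the dict is mutated in the loop.
    for key in list(img_metadata):
        if key.startswith(old_prefix):
            new_key = new_prefix + key.removeprefix(old_prefix)
            img_metadata[new_key] = img_metadata.pop(key)
```

Both helpers are dictionary operations only, so no file contents are read, which is where the saving over a per-file checksum would come from.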

---------

Co-authored-by: utopiafallen <utopiafallen@gmail.com>
Alex Xu
2025-03-04 11:28:23 -08:00
committed by GitHub
parent 5f8526da44
commit 01625904d1
3 changed files with 15 additions and 17 deletions


@@ -22,7 +22,6 @@ import io
import os
import mozjpeg_lossless_optimization
from PIL import Image, ImageOps, ImageStat, ImageChops, ImageFilter
-from .shared import md5Checksum
from .page_number_crop_alg import get_bbox_crop_margin_page_number, get_bbox_crop_margin
from .inter_panel_crop_alg import crop_empty_inter_panel
@@ -321,7 +320,7 @@ class ComicPage:
output_jpeg_file.write(output_jpeg_bytes)
else:
self.image.save(self.targetPath, 'JPEG', optimize=1, quality=85)
-return [md5Checksum(self.targetPath), flags, self.orgPath]
+return [self.targetPath, flags, self.orgPath]
except IOError as err:
raise RuntimeError('Cannot save image. ' + str(err))
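
To illustrate why returning the path instead of a checksum matters, here is a rough timing sketch using `time.perf_counter`. It is an assumption-laden stand-in, not the project's code: `md5_of_file` and `time_call` are hypothetical helpers, and the 4 MiB random file only approximates an image. Actual numbers depend on the disk and on OS caching, so none are shown.

```python
import hashlib
import os
import tempfile
from time import perf_counter


def md5_of_file(path, chunk_size=8192):
    """Hash the full file contents; the cost grows with file size and disk speed."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def time_call(func, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = perf_counter()
    result = func(*args)
    return result, perf_counter() - start


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'page.jpg')
    with open(path, 'wb') as f:
        f.write(os.urandom(4 * 1024 * 1024))  # placeholder for image data

    # Old approach: derive the key by reading and hashing the file.
    checksum, hash_seconds = time_call(md5_of_file, path)
    # New approach: the path itself is the key, so no file I/O is needed.
    key, path_seconds = time_call(str, path)

    print(f'md5 key:  {hash_seconds:.6f}s')
    print(f'path key: {path_seconds:.6f}s')
```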