mirror of https://github.com/ciromattia/kcc synced 2026-04-17 14:38:47 +00:00

Compare commits

14 Commits
2.8 ... 2.9

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Ciro Mattia Gonano | 148211a5c7 | Update for 2.9 release. | 2013-04-19 12:14:24 +02:00 |
| Ciro Mattia Gonano | 23e07f47f0 | Merge pull request #46 from ciromattia/slugify: Filenames slugification | 2013-04-19 02:27:54 -07:00 |
| Ciro Mattia Gonano | 724156c554 | Small fixes | 2013-04-12 01:36:51 +02:00 |
| Ciro Mattia Gonano | b972e4c746 | Remove Windows silly 'thumbs.db' too | 2013-04-11 12:33:14 +02:00 |
| Ciro Mattia Gonano | f0afa1fff2 | Convert dot char to hyphen. Removes UNIX-hidden files and dirs from the final archive (prevents .DS_Store and stuff) | 2013-04-11 12:18:02 +02:00 |
| Ciro Mattia Gonano | a36c05f0c5 | Merge from master | 2013-04-11 12:00:51 +02:00 |
| Ciro Mattia Gonano | 4f3a66b4eb | Update README.md | 2013-04-11 12:59:01 +03:00 |
| Ciro Mattia Gonano | 6369c7ea44 | Update after merging of #44 | 2013-04-11 12:54:16 +03:00 |
| Ciro Mattia Gonano | f1b8aff8d4 | Merge pull request #44 from devernay/master: Support more input image formats: GIF, TIFF, ... | 2013-04-11 02:50:37 -07:00 |
| Ciro Mattia Gonano | be270aa797 | Add number padding and lowering for file names (not directory) | 2013-04-11 11:49:29 +02:00 |
| Ciro Mattia Gonano | f33d355024 | Filenames slugifications (#28, #31, #9, #8) | 2013-04-11 10:34:33 +02:00 |
| Ciro Mattia Gonano | 6f913b026e | rarfile updated to 2.6 | 2013-04-11 09:34:20 +02:00 |
| Ciro Mattia Gonano | 220b4e0954 | Add an option to generate a CBZ skipping all the EPUB/Mobi stuff. Prevent output files to overwrite the source (add _kcc if duplicate is detected). Fixes #45 | 2013-04-10 12:29:31 +02:00 |
| Frédéric Devernay | bac4a4fd86 | support more input image formats | 2013-04-04 14:09:21 +02:00 |
7 changed files with 296 additions and 154 deletions

@@ -6,16 +6,22 @@ actually a comic 2 epub converter that every ereader owner can happily use**_.
It can also optionally optimize images by applying a number of transformations. It can also optionally optimize images by applying a number of transformations.
### A word of warning
**KCC** _is not_ [Amazon's Kindle Comic Creator](http://www.amazon.com/gp/feature.html?ie=UTF8&docId=1001103761), nor is it in any way endorsed by Amazon.
Amazon's tool is for comic _publishers_ and involves a lot of manual effort, while **KCC** is for comic _readers_.
If you want to read some comments about *Amazon's kc2*, you can take a look at [this](http://www.mobileread.com/forums/showthread.php?t=207461&page=7#96) and [that](http://www.mobileread.com/forums/showthread.php?t=211047) thread on Mobileread.
_kc2_ is in no way a replacement for **KCC**, so you can be quite confident we're going to carry on developing our little monster ;)
## BINARY RELEASES ## BINARY RELEASES
You can find the latest released binary at the following links: You can find the latest released binary at the following links:
- OS X: [https://dl.dropbox.com/u/16806101/KindleComicConverter_osx_2.8.zip](https://dl.dropbox.com/u/16806101/KindleComicConverter_osx_2.8.zip) - OS X: [https://dl.dropbox.com/u/16806101/KindleComicConverter_osx_2.9.zip](https://dl.dropbox.com/u/16806101/KindleComicConverter_osx_2.9.zip)
- Win64: [https://dl.dropbox.com/u/16806101/KindleComicConverter_win-amd64_2.8.zip](https://dl.dropbox.com/u/16806101/KindleComicConverter_win-amd64_2.7.zip) - Win64: [https://dl.dropbox.com/u/16806101/KindleComicConverter_win-amd64_2.9.zip](https://dl.dropbox.com/u/16806101/KindleComicConverter_win-amd64_2.9.zip)
- Win32: [http://pawelj.vulturis.eu/Shared/KindleComicConverter_win-x86_2.8.zip](http://pawelj.vulturis.eu/Shared/KindleComicConverter_win-x86_2.8.zip) *(thanks to [AcidWeb](https://github.com/AcidWeb))* - Win32: [http://pawelj.vulturis.eu/Shared/KindleComicConverter_win-x86_2.9.zip](http://pawelj.vulturis.eu/Shared/KindleComicConverter_win-x86_2.9.zip) *(thanks to [AcidWeb](https://github.com/AcidWeb))*
- Linux: Just download sourcecode and launch `python kcc.py` *(Provided you have Python and Pillow installed)* - Linux: Just download sourcecode and launch `python kcc.py` *(Provided you have Python and Pillow installed)*
## INPUT FORMATS ## INPUT FORMATS
`kcc` can understand and convert, at the moment, the following file types: `kcc` can understand and convert, at the moment, the following file types:
- PNG, JPG - PNG, JPG, GIF, TIFF, BMP
- Folders - Folders
- CBZ, ZIP - CBZ, ZIP
- CBR, RAR *(With `unrar` executable)* - CBR, RAR *(With `unrar` executable)*
@@ -51,6 +57,7 @@ Options:
-t TITLE, --title=TITLE -t TITLE, --title=TITLE
Comic title [Default=filename] Comic title [Default=filename]
-m, --manga-style Manga style (Right-to-left reading and splitting) [Default=False] -m, --manga-style Manga style (Right-to-left reading and splitting) [Default=False]
-c, --cbz-output Outputs a CBZ archive and does not generate EPUB
--nopanelviewhq Disable high quality Panel View [Default=False] --nopanelviewhq Disable high quality Panel View [Default=False]
--noprocessing Do not apply image preprocessing (Page splitting and optimizations) [Default=True] --noprocessing Do not apply image preprocessing (Page splitting and optimizations) [Default=True]
--forcepng Create PNG files instead JPEG (For non-Kindle devices) [Default=False] --forcepng Create PNG files instead JPEG (For non-Kindle devices) [Default=False]
@@ -62,7 +69,7 @@ Options:
--nosplitrotate Disable splitting and rotation [Default=False] --nosplitrotate Disable splitting and rotation [Default=False]
--nocutpagenumbers Do not try to cut page numbering on images [Default=True] --nocutpagenumbers Do not try to cut page numbering on images [Default=True]
-o OUTPUT, --output=OUTPUT -o OUTPUT, --output=OUTPUT
Output generated EPUB to specified directory or file Output generated file (EPUB or CBZ) to specified directory or file
-v, --verbose Verbose output [Default=False] -v, --verbose Verbose output [Default=False]
``` ```
@@ -121,8 +128,13 @@ The app relies and includes the following scripts/binaries:
Rewrite of Landscape Mode support (huge readability improvement for KPW) Rewrite of Landscape Mode support (huge readability improvement for KPW)
Upscale use now BILINEAR method Upscale use now BILINEAR method
Added generic CSS file Added generic CSS file
Optimized archive extraction for zip/rar files (#40) Optimized archive extraction for zip/rar files (#40)
- 2.9: Added support for generating a plain CBZ (skipping all the EPUB/Mobi generation) (#45)
Prevent output file overwriting the source one: if a duplicate name is detected, append _kcc to the name
Rarfile library updated to 2.6
Added GIF, TIFF and BMP to supported formats (#42)
Filenames slugifications (#28, #31, #9, #8)
## COPYRIGHT ## COPYRIGHT

kcc.py

@@ -16,7 +16,7 @@
# TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR # TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
# PERFORMANCE OF THIS SOFTWARE. # PERFORMANCE OF THIS SOFTWARE.
# #
__version__ = '2.8' __version__ = '2.9'
__license__ = 'ISC' __license__ = 'ISC'
__copyright__ = '2012-2013, Ciro Mattia Gonano <ciromattia@gmail.com>' __copyright__ = '2012-2013, Ciro Mattia Gonano <ciromattia@gmail.com>'
__docformat__ = 'restructuredtext en' __docformat__ = 'restructuredtext en'

@@ -1,4 +1,5 @@
#!/usr/bin/env python #!/usr/bin/env python
# -*- coding: utf-8 -*-
# #
# Copyright (c) 2012 Ciro Mattia Gonano <ciromattia@gmail.com> # Copyright (c) 2012 Ciro Mattia Gonano <ciromattia@gmail.com>
# #
@@ -16,7 +17,7 @@
# TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR # TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
# PERFORMANCE OF THIS SOFTWARE. # PERFORMANCE OF THIS SOFTWARE.
# #
__version__ = '2.8' __version__ = '2.9'
__license__ = 'ISC' __license__ = 'ISC'
__copyright__ = '2012-2013, Ciro Mattia Gonano <ciromattia@gmail.com>' __copyright__ = '2012-2013, Ciro Mattia Gonano <ciromattia@gmail.com>'
__docformat__ = 'restructuredtext en' __docformat__ = 'restructuredtext en'
@@ -128,7 +129,7 @@ def buildNCX(dstdir, title, chapters):
f = open(ncxfile, "w") f = open(ncxfile, "w")
f.writelines(["<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n", f.writelines(["<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n",
"<!DOCTYPE ncx PUBLIC \"-//NISO//DTD ncx 2005-1//EN\" ", "<!DOCTYPE ncx PUBLIC \"-//NISO//DTD ncx 2005-1//EN\" ",
"http://www.daisy.org/z3986/2005/ncx-2005-1.dtd\">\n", "\"http://www.daisy.org/z3986/2005/ncx-2005-1.dtd\">\n",
"<ncx version=\"2005-1\" xml:lang=\"en-US\" xmlns=\"http://www.daisy.org/z3986/2005/ncx/\">\n", "<ncx version=\"2005-1\" xml:lang=\"en-US\" xmlns=\"http://www.daisy.org/z3986/2005/ncx/\">\n",
"<head>\n", "<head>\n",
"<meta name=\"dtb:uid\" content=\"015ffaec-9340-42f8-b163-a0c5ab7d0611\"/>\n", "<meta name=\"dtb:uid\" content=\"015ffaec-9340-42f8-b163-a0c5ab7d0611\"/>\n",
@@ -276,6 +277,10 @@ def getImageFileName(imgfile):
if filename[0].startswith('.') or\ if filename[0].startswith('.') or\
(filename[1].lower() != '.png' and (filename[1].lower() != '.png' and
filename[1].lower() != '.jpg' and filename[1].lower() != '.jpg' and
filename[1].lower() != '.gif' and
filename[1].lower() != '.tif' and
filename[1].lower() != '.tiff' and
filename[1].lower() != '.bmp' and
filename[1].lower() != '.jpeg'): filename[1].lower() != '.jpeg'):
return None return None
return filename return filename
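
The hunk above widens the extension whitelist by adding one more `and` clause per format. As a rough illustration only, the same check can be written with a set lookup. The names `SUPPORTED_EXTENSIONS` and `get_image_file_name` are hypothetical, and the sketch assumes the file name is split with `os.path.splitext`, which the diff does not show.

```python
import os

# Extensions accepted after this change (compared case-insensitively).
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".tif", ".tiff", ".bmp"}


def get_image_file_name(imgfile):
    """Return (stem, extension) for a supported image, or None otherwise."""
    stem, ext = os.path.splitext(imgfile)
    # Hidden files (".DS_Store" and friends) are rejected outright.
    if stem.startswith(".") or ext.lower() not in SUPPORTED_EXTENSIONS:
        return None
    return stem, ext
```

A set keeps the list of formats in one place, so supporting another format is a one-word change rather than a new condition.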
@@ -356,6 +361,7 @@ def genEpubStruct(path):
chapterlist = [] chapterlist = []
cover = None cover = None
_, deviceres, _, _, panelviewsize = image.ProfileData.Profiles[options.profile] _, deviceres, _, _, panelviewsize = image.ProfileData.Profiles[options.profile]
sanitizeTree(os.path.join(path, 'OEBPS', 'Images'))
os.mkdir(os.path.join(path, 'OEBPS', 'Text')) os.mkdir(os.path.join(path, 'OEBPS', 'Text'))
f = open(os.path.join(path, 'OEBPS', 'Text', 'style.css'), 'w') f = open(os.path.join(path, 'OEBPS', 'Text', 'style.css'), 'w')
#DON'T COMPRESS CSS. KINDLE WILL FAIL TO PARSE IT. #DON'T COMPRESS CSS. KINDLE WILL FAIL TO PARSE IT.
@@ -535,6 +541,36 @@ def getWorkFolder(afile):
return path return path
def slugify(value):
"""
Normalizes string, converts to lowercase, removes non-alpha characters,
and converts spaces to hyphens.
"""
import unicodedata
value = unicodedata.normalize('NFKD', unicode(value, 'latin1')).encode('ascii', 'ignore')
value = re.sub('[^\w\s\.-]', '', value).strip().lower()
value = re.sub('[-\.\s]+', '-', value)
value = re.sub(r'([0-9]+)', r'00000\1', value)
value = re.sub(r'0*([0-9]{6,})', r'\1', value)
return value
def sanitizeTree(filetree):
for root, dirs, files in os.walk(filetree):
for name in files:
if name.startswith('.') or name.lower() == 'thumbs.db':
os.remove(os.path.join(root, name))
else:
splitname = os.path.splitext(name)
os.rename(os.path.join(root, name),
os.path.join(root, slugify(splitname[0]) + splitname[1]))
for name in dirs:
if name.startswith('.'):
os.remove(os.path.join(root, name))
else:
os.rename(os.path.join(root, name), os.path.join(root, slugify(name)))
def Copyright(): def Copyright():
print ('comic2ebook v%(__version__)s. ' print ('comic2ebook v%(__version__)s. '
'Written 2012 by Ciro Mattia Gonano.' % globals()) 'Written 2012 by Ciro Mattia Gonano.' % globals())
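
The `slugify` and `sanitizeTree` helpers added in this hunk target Python 2 (`unicode(...)`). The following is a Python 3 sketch of the same idea, not the project's code: the snake_case names are hypothetical, the walk is done bottom-up so that renaming a directory cannot disturb paths still to be visited, and hidden directories are removed with `shutil.rmtree` because `os.remove` cannot delete a directory.

```python
import os
import re
import shutil
import unicodedata


def slugify(value):
    """Lowercase, strip accents and punctuation, hyphenate, zero-pad numbers."""
    value = unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    value = re.sub(r"[^\w\s.-]", "", value).strip().lower()
    value = re.sub(r"[-.\s]+", "-", value)
    # Prefix every digit run with zeros, then trim it back to at least six digits,
    # so that lexicographic order matches numeric order (2 sorts before 10).
    value = re.sub(r"([0-9]+)", r"00000\1", value)
    value = re.sub(r"0*([0-9]{6,})", r"\1", value)
    return value


def sanitize_tree(filetree):
    """Drop hidden files/dirs and 'thumbs.db'; slugify every other name."""
    for root, dirs, files in os.walk(filetree, topdown=False):
        for name in files:
            path = os.path.join(root, name)
            if name.startswith(".") or name.lower() == "thumbs.db":
                os.remove(path)
            else:
                stem, ext = os.path.splitext(name)
                os.rename(path, os.path.join(root, slugify(stem) + ext))
        for name in dirs:
            path = os.path.join(root, name)
            if name.startswith("."):
                shutil.rmtree(path)
            else:
                os.rename(path, os.path.join(root, slugify(name)))
```

With this padding, `slugify("Page 2")` and `slugify("Page 10")` give `page-000002` and `page-000010`, which sort in reading order. The sketch does not guard against two different names collapsing to the same slug.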
@@ -542,7 +578,6 @@ def Copyright():
def Usage(): def Usage():
print "Generates HTML, NCX and OPF for a Comic ebook from a bunch of images." print "Generates HTML, NCX and OPF for a Comic ebook from a bunch of images."
print "Optimized for creating MOBI files to be read on Kindle Paperwhite."
parser.print_help() parser.print_help()
@@ -556,6 +591,8 @@ def main(argv=None):
help="Comic title [Default=filename]") help="Comic title [Default=filename]")
parser.add_option("-m", "--manga-style", action="store_true", dest="righttoleft", default=False, parser.add_option("-m", "--manga-style", action="store_true", dest="righttoleft", default=False,
help="Manga style (Right-to-left reading and splitting) [Default=False]") help="Manga style (Right-to-left reading and splitting) [Default=False]")
parser.add_option("-c", "--cbz-output", action="store_true", dest="cbzoutput", default=False,
help="Outputs a CBZ archive and does not generate EPUB")
parser.add_option("--nopanelviewhq", action="store_true", dest="nopanelviewhq", default=False, parser.add_option("--nopanelviewhq", action="store_true", dest="nopanelviewhq", default=False,
help="Disable high quality Panel View [Default=False]") help="Disable high quality Panel View [Default=False]")
parser.add_option("--noprocessing", action="store_false", dest="imgproc", default=True, parser.add_option("--noprocessing", action="store_false", dest="imgproc", default=True,
@@ -578,7 +615,7 @@ def main(argv=None):
parser.add_option("--nocutpagenumbers", action="store_false", dest="cutpagenumbers", default=True, parser.add_option("--nocutpagenumbers", action="store_false", dest="cutpagenumbers", default=True,
help="Do not try to cut page numbering on images [Default=True]") help="Do not try to cut page numbering on images [Default=True]")
parser.add_option("-o", "--output", action="store", dest="output", default=None, parser.add_option("-o", "--output", action="store", dest="output", default=None,
help="Output generated EPUB to specified directory or file") help="Output generated file (EPUB or CBZ) to specified directory or file")
parser.add_option("-v", "--verbose", action="store_true", dest="verbose", default=False, parser.add_option("-v", "--verbose", action="store_true", dest="verbose", default=False,
help="Verbose output [Default=False]") help="Verbose output [Default=False]")
options, args = parser.parse_args(argv) options, args = parser.parse_args(argv)
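
To see how the new `-c` / `--cbz-output` switch behaves once parsed, here is a minimal, self-contained `optparse` sketch. It reproduces only the two options touched by this diff; `build_parser`, the usage string, and the example arguments are hypothetical.

```python
from optparse import OptionParser


def build_parser():
    """Build a parser holding only the options changed in this release."""
    parser = OptionParser(usage="usage: %prog [options] comic_file|comic_folder")
    parser.add_option("-c", "--cbz-output", action="store_true", dest="cbzoutput", default=False,
                      help="Outputs a CBZ archive and does not generate EPUB")
    parser.add_option("-o", "--output", action="store", dest="output", default=None,
                      help="Output generated file (EPUB or CBZ) to specified directory or file")
    return parser


# Example invocation: request CBZ output into the directory "out".
options, args = build_parser().parse_args(["-c", "-o", "out", "comic.cbr"])
print(options.cbzoutput, options.output, args)
```

Because the flag uses `action="store_true"` with `default=False`, leaving it out keeps the EPUB path unchanged.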
@@ -593,25 +630,40 @@ def main(argv=None):
if options.imgproc: if options.imgproc:
print "Processing images..." print "Processing images..."
dirImgProcess(path + "/OEBPS/Images/") dirImgProcess(path + "/OEBPS/Images/")
print "\nCreating ePub structure..." if options.cbzoutput:
genEpubStruct(path) # if CBZ output wanted, compress all images and return filepath
# actually zip the ePub print "\nCreating CBZ file..."
if options.output is not None: filepath = getOutputFilename(args[0], options.output, '.cbz')
if options.output.endswith('.epub'): make_archive(path + '_comic', 'zip', path + '/OEBPS/Images')
epubpath = os.path.abspath(options.output)
elif os.path.isdir(args[0]):
epubpath = os.path.abspath(options.output) + "/" + os.path.basename(args[0]) + '.epub'
else:
epubpath = os.path.abspath(options.output) + "/" \
+ os.path.basename(os.path.splitext(args[0])[0]) + '.epub'
elif os.path.isdir(args[0]):
epubpath = args[0] + '.epub'
else: else:
epubpath = os.path.splitext(args[0])[0] + '.epub' print "\nCreating ePub structure..."
make_archive(path + '_comic', 'zip', path) genEpubStruct(path)
move(path + '_comic.zip', epubpath) # actually zip the ePub
filepath = getOutputFilename(args[0], options.output, '.epub')
make_archive(path + '_comic', 'zip', path)
move(path + '_comic.zip', filepath)
rmtree(path) rmtree(path)
return epubpath return filepath
def getOutputFilename(srcpath, wantedname, ext):
if not ext.startswith('.'):
ext = '.' + ext
if wantedname is not None:
if wantedname.endswith(ext):
filename = os.path.abspath(wantedname)
elif os.path.isdir(srcpath):
filename = os.path.abspath(options.output) + "/" + os.path.basename(srcpath) + ext
else:
filename = os.path.abspath(options.output) + "/" \
+ os.path.basename(os.path.splitext(srcpath)[0]) + ext
elif os.path.isdir(srcpath):
filename = srcpath + ext
else:
filename = os.path.splitext(srcpath)[0] + ext
if os.path.isfile(filename):
filename = os.path.splitext(filename)[0] + '_kcc' + ext
return filename
def checkOptions(): def checkOptions():
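
The output-name resolution added above can be exercised in isolation. The sketch below is a hedged restatement rather than the committed code: `get_output_filename` is a hypothetical name, it reads the destination from its `wantedname` argument instead of the global `options.output`, and it joins paths with `os.path.join`.

```python
import os


def get_output_filename(srcpath, wantedname, ext):
    """Work out where the generated EPUB/CBZ should be written."""
    if not ext.startswith("."):
        ext = "." + ext
    if wantedname is not None:
        if wantedname.endswith(ext):
            # The caller gave a full file name.
            filename = os.path.abspath(wantedname)
        elif os.path.isdir(srcpath):
            # Source is a folder: reuse the folder name inside the target dir.
            filename = os.path.join(os.path.abspath(wantedname), os.path.basename(srcpath) + ext)
        else:
            # Source is a file: reuse its name without the old extension.
            stem = os.path.basename(os.path.splitext(srcpath)[0])
            filename = os.path.join(os.path.abspath(wantedname), stem + ext)
    elif os.path.isdir(srcpath):
        filename = srcpath + ext
    else:
        filename = os.path.splitext(srcpath)[0] + ext
    # Never overwrite an existing file (for example the source itself).
    if os.path.isfile(filename):
        filename = os.path.splitext(filename)[0] + "_kcc" + ext
    return filename
```

Converting `comic.cbz` to CBZ with no `-o` option would otherwise resolve to `comic.cbz` itself, so the sketch returns `comic_kcc.cbz` instead.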

@@ -93,31 +93,33 @@ class MainWindow:
self.options = { self.options = {
'Aepub_only': IntVar(None, 0), 'Aepub_only': IntVar(None, 0),
'Bmangastyle': IntVar(None, 0), 'Bcbz_only': IntVar(None, 0),
'Cnopanelviewhq': IntVar(None, 0), 'Cmangastyle': IntVar(None, 0),
'Dimage_preprocess': IntVar(None, 0), 'Dnopanelviewhq': IntVar(None, 0),
'Eforcepng': IntVar(None, 0), 'Eimage_preprocess': IntVar(None, 0),
'Fimage_gamma': DoubleVar(None, 0.0), 'Fforcepng': IntVar(None, 0),
'Gimage_upscale': IntVar(None, 0), 'Gimage_gamma': DoubleVar(None, 0.0),
'Himage_stretch': IntVar(None, 0), 'Himage_upscale': IntVar(None, 0),
'Iblack_borders': IntVar(None, 0), 'Iimage_stretch': IntVar(None, 0),
'Jrotate': IntVar(None, 0), 'Jblack_borders': IntVar(None, 0),
'Knosplitrotate': IntVar(None, 0), 'Krotate': IntVar(None, 0),
'Lcut_page_numbers': IntVar(None, 0) 'Lnosplitrotate': IntVar(None, 0),
'Mcut_page_numbers': IntVar(None, 0)
} }
self.optionlabels = { self.optionlabels = {
'Aepub_only': "Generate EPUB only", 'Aepub_only': "Generate EPUB only",
'Bmangastyle': "Manga mode", 'Bcbz_only': "Generate CBZ only (skip EPUB/Mobi generation)",
'Cnopanelviewhq': "Disable high quality Panel View", 'Cmangastyle': "Manga mode",
'Dimage_preprocess': "Disable image optimizations", 'Dnopanelviewhq': "Disable high quality Panel View",
'Eforcepng': "Create PNG files instead JPEG", 'Eimage_preprocess': "Disable image optimizations",
'Fimage_gamma': "Custom gamma correction", 'Fforcepng': "Create PNG files instead of JPEG",
'Gimage_upscale': "Allow image upscaling", 'Gimage_gamma': "Custom gamma correction",
'Himage_stretch': "Stretch images", 'Himage_upscale': "Allow image upscaling",
'Iblack_borders': "Use black borders", 'Iimage_stretch': "Stretch images",
'Jrotate': "Rotate images instead splitting them", 'Jblack_borders': "Use black borders (instead of white ones)",
'Knosplitrotate': "Disable splitting and rotation", 'Krotate': "Rotate images (instead of splitting them)",
'Lcut_page_numbers': "Disable page numbers cutting" 'Lnosplitrotate': "Disable both splitting and rotation",
'Mcut_page_numbers': "Disable page numbers cutting"
} }
self.optionsButtons = {} self.optionsButtons = {}
for key in sorted(self.options): for key in sorted(self.options):
@@ -162,28 +164,30 @@ class MainWindow:
return return
profilekey = ProfileData.ProfileLabels[self.profile.get()] profilekey = ProfileData.ProfileLabels[self.profile.get()]
argv = ["-p", profilekey] argv = ["-p", profilekey]
if self.options['Bmangastyle'].get() == 1: if self.options['Bcbz_only'].get() == 1:
argv.append("-c")
if self.options['Cmangastyle'].get() == 1:
argv.append("-m") argv.append("-m")
if self.options['Cnopanelviewhq'].get() == 1: if self.options['Dnopanelviewhq'].get() == 1:
argv.append("--nopanelviewhq") argv.append("--nopanelviewhq")
if self.options['Dimage_preprocess'].get() == 1: if self.options['Eimage_preprocess'].get() == 1:
argv.append("--noprocessing") argv.append("--noprocessing")
if self.options['Eforcepng'].get() == 1: if self.options['Fforcepng'].get() == 1:
argv.append("--forcepng") argv.append("--forcepng")
if self.options['Fimage_gamma'].get() != 0.0: if self.options['Gimage_gamma'].get() != 0.0:
argv.append("--gamma") argv.append("--gamma")
argv.append(self.options['Fimage_gamma'].get()) argv.append(self.options['Gimage_gamma'].get())
if self.options['Gimage_upscale'].get() == 1: if self.options['Himage_upscale'].get() == 1:
argv.append("--upscale") argv.append("--upscale")
if self.options['Himage_stretch'].get() == 1: if self.options['Iimage_stretch'].get() == 1:
argv.append("--stretch") argv.append("--stretch")
if self.options['Iblack_borders'].get() == 1: if self.options['Jblack_borders'].get() == 1:
argv.append("--blackborders") argv.append("--blackborders")
if self.options['Jrotate'].get() == 1: if self.options['Krotate'].get() == 1:
argv.append("--rotate") argv.append("--rotate")
if self.options['Knosplitrotate'].get() == 1: if self.options['Lnosplitrotate'].get() == 1:
argv.append("--nosplitrotate") argv.append("--nosplitrotate")
if self.options['Lcut_page_numbers'].get() == 1: if self.options['Mcut_page_numbers'].get() == 1:
argv.append("--nocutpagenumbers") argv.append("--nocutpagenumbers")
errors = False errors = False
left_files = len(self.filelist) left_files = len(self.filelist)
@@ -207,7 +211,7 @@ class MainWindow:
(subargv[-1], str(err), traceback.format_tb(traceback_))) (subargv[-1], str(err), traceback.format_tb(traceback_)))
errors = True errors = True
continue continue
if self.options['Aepub_only'].get() == 0: if self.options['Aepub_only'].get() == 0 and self.options['Bcbz_only'].get() == 0:
try: try:
if os.path.getsize(epub_path) > 314572800: if os.path.getsize(epub_path) > 314572800:
# do not call kindlegen if source is bigger than 300MB # do not call kindlegen if source is bigger than 300MB
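
The literal `314572800` in the size check is 300 MiB (300 * 1024 * 1024 bytes). A small sketch of the combined condition follows, assuming the two checkboxes are passed in as plain booleans; `KINDLEGEN_MAX_BYTES` and `should_run_kindlegen` are hypothetical names.

```python
import os

# 300 MiB; above this size the GUI skips the kindlegen step.
KINDLEGEN_MAX_BYTES = 300 * 1024 * 1024


def should_run_kindlegen(epub_path, epub_only=False, cbz_only=False):
    """Return True when the EPUB should be handed to kindlegen."""
    if epub_only or cbz_only:
        # Either "only" option means no MOBI is wanted.
        return False
    return os.path.getsize(epub_path) <= KINDLEGEN_MAX_BYTES
```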

@@ -1,6 +1,6 @@
# rarfile.py # rarfile.py
# #
# Copyright (c) 2005-2012 Marko Kreen <markokr@gmail.com> # Copyright (c) 2005-2013 Marko Kreen <markokr@gmail.com>
# #
# Permission to use, copy, modify, and/or distribute this software for any # Permission to use, copy, modify, and/or distribute this software for any
# purpose with or without fee is hereby granted, provided that the above # purpose with or without fee is hereby granted, provided that the above
@@ -17,7 +17,7 @@
r"""RAR archive reader. r"""RAR archive reader.
This is Python module for Rar archive reading. The interface This is Python module for Rar archive reading. The interface
is made as zipfile like as possible. is made as :mod:`zipfile`-like as possible.
Basic logic: Basic logic:
- Parse archive structure with Python. - Parse archive structure with Python.
@@ -34,7 +34,17 @@ Example::
for f in rf.infolist(): for f in rf.infolist():
print f.filename, f.file_size print f.filename, f.file_size
if f.filename == 'README': if f.filename == 'README':
print rf.read(f) print(rf.read(f))
Archive files can also be accessed via file-like object returned
by :meth:`RarFile.open`::
import rarfile
with rarfile.RarFile('archive.rar') as rf:
with rf.open('README') as f:
for ln in f:
print(ln.strip())
There are few module-level parameters to tune behaviour, There are few module-level parameters to tune behaviour,
here they are with defaults, and reason to change it:: here they are with defaults, and reason to change it::
@@ -64,7 +74,7 @@ For more details, refer to source.
""" """
__version__ = '2.5' __version__ = '2.6'
# export only interesting items # export only interesting items
__all__ = ['is_rarfile', 'RarInfo', 'RarFile', 'RarExtFile'] __all__ = ['is_rarfile', 'RarInfo', 'RarFile', 'RarExtFile']
@@ -148,45 +158,45 @@ except ImportError:
## Module configuration. Can be tuned after importing. ## Module configuration. Can be tuned after importing.
## ##
# default fallback charset #: default fallback charset
DEFAULT_CHARSET = "windows-1252" DEFAULT_CHARSET = "windows-1252"
# list of encodings to try, with fallback to DEFAULT_CHARSET if none succeed #: list of encodings to try, with fallback to DEFAULT_CHARSET if none succeed
TRY_ENCODINGS = ('utf8', 'utf-16le') TRY_ENCODINGS = ('utf8', 'utf-16le')
# 'unrar', 'rar' or full path to either one #: 'unrar', 'rar' or full path to either one
UNRAR_TOOL = "unrar" UNRAR_TOOL = "unrar"
# Command line args to use for opening file for reading. #: Command line args to use for opening file for reading.
OPEN_ARGS = ('p', '-inul') OPEN_ARGS = ('p', '-inul')
# Command line args to use for extracting file to disk. #: Command line args to use for extracting file to disk.
EXTRACT_ARGS = ('x', '-y', '-idq') EXTRACT_ARGS = ('x', '-y', '-idq')
# args for testrar() #: args for testrar()
TEST_ARGS = ('t', '-idq') TEST_ARGS = ('t', '-idq')
# whether to speed up decompression by using tmp archive #: whether to speed up decompression by using tmp archive
USE_EXTRACT_HACK = 1 USE_EXTRACT_HACK = 1
# limit the filesize for tmp archive usage #: limit the filesize for tmp archive usage
HACK_SIZE_LIMIT = 20*1024*1024 HACK_SIZE_LIMIT = 20*1024*1024
# whether to parse file/archive comments. #: whether to parse file/archive comments.
NEED_COMMENTS = 1 NEED_COMMENTS = 1
# whether to convert comments to unicode strings #: whether to convert comments to unicode strings
UNICODE_COMMENTS = 0 UNICODE_COMMENTS = 0
# When RAR is corrupt, stopping on bad header is better #: When RAR is corrupt, stopping on bad header is better
# On unknown/misparsed RAR headers reporting is better #: On unknown/misparsed RAR headers reporting is better
REPORT_BAD_HEADER = 0 REPORT_BAD_HEADER = 0
# Convert RAR time tuple into datetime() object #: Convert RAR time tuple into datetime() object
USE_DATETIME = 0 USE_DATETIME = 0
# Separator for path name components. RAR internally uses '\\'. #: Separator for path name components. RAR internally uses '\\'.
# Use '/' to be similar with zipfile. #: Use '/' to be similar with zipfile.
PATH_SEP = '\\' PATH_SEP = '\\'
## ##
@@ -337,49 +347,57 @@ def is_rarfile(fn):
class RarInfo(object): class RarInfo(object):
'''An entry in rar archive. r'''An entry in rar archive.
@ivar filename:
File name with relative path.
Default path separator is '/', to change set rarfile.PATH_SEP.
Always unicode string.
@ivar date_time:
Modification time, tuple of (year, month, day, hour, minute, second).
Or datetime() object if USE_DATETIME is set.
@ivar file_size:
Uncompressed size.
@ivar compress_size:
Compressed size.
@ivar compress_type:
Compression method: 0x30 - 0x35.
@ivar extract_version:
Minimal Rar version needed for decompressing.
@ivar host_os:
Host OS type, one of RAR_OS_* constants.
@ivar mode:
File attributes. May be either dos-style or unix-style, depending on host_os.
@ivar CRC:
CRC-32 of uncompressed file, unsigned int.
@ivar volume:
Volume nr, starting from 0.
@ivar volume_file:
Volume file name, where file starts.
@ivar type:
One of RAR_BLOCK_* types. Only entries with type==RAR_BLOCK_FILE are shown in .infolist().
@ivar flags:
For files, RAR_FILE_* bits.
@ivar comment:
File comment (unicode string or None).
@ivar mtime: :mod:`zipfile`-compatible fields:
Optional time field: Modification time, with float seconds.
Same as .date_time but with more precision. filename
@ivar ctime: File name with relative path.
Optional time field: creation time, with float seconds. Default path separator is '\\', to change set rarfile.PATH_SEP.
@ivar atime: Always unicode string.
Optional time field: last access time, with float seconds. date_time
@ivar arctime: Modification time, tuple of (year, month, day, hour, minute, second).
Optional time field: archival time, with float seconds. Or datetime() object if USE_DATETIME is set.
file_size
Uncompressed size.
compress_size
Compressed size.
CRC
CRC-32 of uncompressed file, unsigned int.
comment
File comment. Byte string or None. Use UNICODE_COMMENTS
to get automatic decoding to unicode.
volume
Volume nr, starting from 0.
RAR-specific fields:
compress_type
Compression method: 0x30 - 0x35.
extract_version
Minimal Rar version needed for decompressing.
host_os
Host OS type, one of RAR_OS_* constants.
mode
File attributes. May be either dos-style or unix-style, depending on host_os.
volume_file
Volume file name, where file starts.
mtime
Optional time field: Modification time, with float seconds.
Same as .date_time but with more precision.
ctime
Optional time field: creation time, with float seconds.
atime
Optional time field: last access time, with float seconds.
arctime
Optional time field: archival time, with float seconds.
Internal fields:
type
One of RAR_BLOCK_* types. Only entries with type==RAR_BLOCK_FILE are shown in .infolist().
flags
For files, RAR_FILE_* bits.
''' '''
__slots__ = ( __slots__ = (
@@ -433,19 +451,27 @@ class RarInfo(object):
class RarFile(object): class RarFile(object):
'''Parse RAR structure, provide access to files in archive. '''Parse RAR structure, provide access to files in archive.
@ivar comment:
Archive comment (unicode string or None).
''' '''
#: Archive comment. Byte string or None. Use UNICODE_COMMENTS
#: to get automatic decoding to unicode.
comment = None
def __init__(self, rarfile, mode="r", charset=None, info_callback=None, crc_check = True): def __init__(self, rarfile, mode="r", charset=None, info_callback=None, crc_check = True):
"""Open and parse a RAR archive. """Open and parse a RAR archive.
@param rarfile: archive file name Parameters:
@param mode: only 'r' is supported.
@param charset: fallback charset to use, if filenames are not already Unicode-enabled. rarfile
@param info_callback: debug callback, gets to see all archive entries. archive file name
@param crc_check: set to False to disable CRC checks mode
only 'r' is supported.
charset
fallback charset to use, if filenames are not already Unicode-enabled.
info_callback
debug callback, gets to see all archive entries.
crc_check
set to False to disable CRC checks
""" """
self.rarfile = rarfile self.rarfile = rarfile
self.comment = None self.comment = None
@@ -457,6 +483,7 @@ class RarFile(object):
self._needs_password = False self._needs_password = False
self._password = None self._password = None
self._crc_check = crc_check self._crc_check = crc_check
self._vol_list = []
self._main = None self._main = None
@@ -489,6 +516,14 @@ class RarFile(object):
'''Return RarInfo objects for all files/directories in archive.''' '''Return RarInfo objects for all files/directories in archive.'''
return self._info_list return self._info_list
def volumelist(self):
'''Returns filenames of archive volumes.
In case of single-volume archive, the list contains
just the name of main archive file.
'''
return self._vol_list
def getinfo(self, fname): def getinfo(self, fname):
'''Return RarInfo for file.''' '''Return RarInfo for file.'''
@@ -510,7 +545,8 @@ class RarFile(object):
raise NoRarEntry("No such file: "+fname) raise NoRarEntry("No such file: "+fname)
def open(self, fname, mode = 'r', psw = None): def open(self, fname, mode = 'r', psw = None):
'''Return open file object, where the data can be read. '''Returns file-like object (:class:`RarExtFile`),
from where the data can be read.
The object implements io.RawIOBase interface, so it can The object implements io.RawIOBase interface, so it can
be further wrapped with io.BufferedReader and io.TextIOWrapper. be further wrapped with io.BufferedReader and io.TextIOWrapper.
@@ -522,9 +558,14 @@ class RarFile(object):
uncompressed files, on compressed files the seeking is implemented uncompressed files, on compressed files the seeking is implemented
by reading ahead and/or restarting the decompression. by reading ahead and/or restarting the decompression.
@param fname: file name or RarInfo instance. Parameters:
@param mode: must be 'r'
@param psw: password to use for extracting. fname
file name or RarInfo instance.
mode
must be 'r'
psw
password to use for extracting.
''' '''
if mode != 'r': if mode != 'r':
@@ -571,8 +612,12 @@ class RarFile(object):
For longer files using .open() may be better idea. For longer files using .open() may be better idea.
@param fname: filename or RarInfo instance Parameters:
@param psw: password to use for extracting.
fname
filename or RarInfo instance
psw
password to use for extracting.
""" """
f = self.open(fname, 'r', psw) f = self.open(fname, 'r', psw)
@@ -593,9 +638,14 @@ class RarFile(object):
def extract(self, member, path=None, pwd=None): def extract(self, member, path=None, pwd=None):
"""Extract single file into current directory. """Extract single file into current directory.
@param member: filename or RarInfo instance Parameters:
@param path: optional destination path
@param pwd: optional password to use member
filename or RarInfo instance
path
optional destination path
pwd
optional password to use
""" """
if isinstance(member, RarInfo): if isinstance(member, RarInfo):
fname = member.filename fname = member.filename
@@ -606,9 +656,14 @@ class RarFile(object):
def extractall(self, path=None, members=None, pwd=None): def extractall(self, path=None, members=None, pwd=None):
"""Extract all files into current directory. """Extract all files into current directory.
@param path: optional destination path Parameters:
@param members: optional filename or RarInfo instance list to extract
@param pwd: optional password to use path
optional destination path
members
optional filename or RarInfo instance list to extract
pwd
optional password to use
""" """
fnlist = [] fnlist = []
if members is not None: if members is not None:
@@ -693,6 +748,7 @@ class RarFile(object):
more_vols = 0 more_vols = 0
endarc = 0 endarc = 0
volfile = self.rarfile volfile = self.rarfile
self._vol_list = [self.rarfile]
while 1: while 1:
if endarc: if endarc:
h = None # don't read past ENDARC h = None # don't read past ENDARC
@@ -707,6 +763,7 @@ class RarFile(object):
self._fd = fd self._fd = fd
more_vols = 0 more_vols = 0
endarc = 0 endarc = 0
self._vol_list.append(volfile)
continue continue
break break
h.volume = volume h.volume = volume
@@ -1210,7 +1267,7 @@ class UnicodeFilename:
class RarExtFile(RawIOBase): class RarExtFile(RawIOBase):
"""Base class for 'file-like' object that RarFile.open() returns. """Base class for file-like object that :meth:`RarFile.open` returns.
Provides public methods and common crc checking. Provides public methods and common crc checking.
@@ -1218,13 +1275,15 @@ class RarExtFile(RawIOBase):
- no short reads - .read() and .readinfo() read as much as requested. - no short reads - .read() and .readinfo() read as much as requested.
- no internal buffer, use io.BufferedReader for that. - no internal buffer, use io.BufferedReader for that.
@ivar name: If :mod:`io` module is available (Python 2.6+, 3.x), then this class
filename of the archive entry. will inherit from :class:`io.RawIOBase` class. This makes line-based
access available: :meth:`RarExtFile.readline` and ``for ln in f``.
""" """
def __init__(self, rf, inf): #: Filename of the archive entry
"""Fill common fields""" name = None
def __init__(self, rf, inf):
RawIOBase.__init__(self) RawIOBase.__init__(self)
# standard io.* properties # standard io.* properties
@@ -1325,7 +1384,13 @@ class RarExtFile(RawIOBase):
return self.inf.file_size - self.remain return self.inf.file_size - self.remain
def seek(self, ofs, whence = 0): def seek(self, ofs, whence = 0):
"""Seek in data.""" """Seek in data.
On uncompressed files, the seeking works by actual
seeks so it's fast. On compressed files it's slow
- forward seeking happens by reading ahead,
backwards by re-opening and decompressing from the start.
"""
# disable crc check when seeking # disable crc check when seeking
self.crc_check = 0 self.crc_check = 0
@@ -1374,8 +1439,17 @@ class RarExtFile(RawIOBase):
"""Returns True""" """Returns True"""
return True return True
def writable(self):
"""Returns False.
Writing is not supported."""
return False
def seekable(self): def seekable(self):
"""Returns True""" """Returns True.
Seeking is supported, although it's slow on compressed files.
"""
return True return True
def readall(self): def readall(self):

@@ -15,7 +15,7 @@ use_setuptools()
import sys import sys
NAME = "KindleComicConverter" NAME = "KindleComicConverter"
VERSION = "2.8" VERSION = "2.9"
MAIN = "kcc.py" MAIN = "kcc.py"
includefiles = ['README.md', 'MANIFEST.in', 'LICENSE.txt', 'comic2ebook.ico', 'comic2ebook.icns'] includefiles = ['README.md', 'MANIFEST.in', 'LICENSE.txt', 'comic2ebook.ico', 'comic2ebook.icns']

@@ -10,7 +10,7 @@ sys.path.insert(0, 'kcc')
setup( setup(
name = "KindleComicConverter", name = "KindleComicConverter",
version = "2.8", version = "2.9",
author = "Ciro Mattia Gonano", author = "Ciro Mattia Gonano",
author_email = "ciromattia@gmail.com", author_email = "ciromattia@gmail.com",
description = "A tool to convert comics (CBR/CBZ/PDFs/image folders) to MOBI.", description = "A tool to convert comics (CBR/CBZ/PDFs/image folders) to MOBI.",