Simple epub to xhtml Converter, for Offline Browsers (Requires Linux)
-
I didn't want to bother with other programs to read my ebooks, so I made a converter script on Linux that unpacks and reformats epub books from j-novel into a single file I can read on any device. I thought I'd share my code in case anyone is interested, though I have no idea if anyone will find it useful.
Features:
- Should be fully automatic with minimal setup on Linux platforms.
- Flattens the entire book into one browser compatible file.
- Automatically creates chapter navigation.
- Embeds a stylesheet for common elements (dark mode). (If you want to modify it, either open full.xhtml in a text editor, modify the script below, or include another stylesheet in the same folder)
- Basic profanity filter. (Prioritizes resulting readability over removing everything).
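The profanity filter is nothing more than a chain of sed substitutions. Here is a minimal sketch of the idea on its own, using the same substitutions as the script. The sample sentence is made up for illustration and is not from any book.

```shell
# Sample text invented for illustration only.
sample='Hell, that was a hell of a damned mess.'

# Order matters: the specific phrases are rewritten first,
# and the catch-all "hell " rule only sees what is left over.
filtered=$(printf '%s\n' "$sample" | sed 's/the hell//g; s/as hell//g; s/Hell,/In fact,/g; s/The hell/What/g; s/a hell of a/quite a/g; s/hell /grief /g; s/damned //g; s/damning/heinous/g')

printf '%s\n' "$filtered"   # prints: In fact, that was quite a mess.
```

To add or remove words, edit the matching sed expression in the script; each "s/old/new/g" entry is independent of the others apart from the order they run in.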
Potential Issues:
- I have not tested this on any other devices, use at your own risk.
- I have not tested this on any other series.
- A note on security: don't paste random code into your command line unless you are sure it won't hurt you. This script is pretty transparent, so it should be easy enough to validate.
- Unfortunately, at the time of writing, a number of common mobile browsers do not load images stored as separate files, so the illustrations will not show inline. I still prefer reading on my phone and just switch to the photo viewer whenever there is an insert.
Steps:
- Ensure the relevant commands are compatible with your system (I haven't tested on any other devices; I'm just using Ubuntu on Windows Subsystem for Linux). You may need to install unzip using the command below or, alternatively, unzip the epub manually (rename to .zip and extract like usual) into a folder of your choice.
sudo apt install unzip
- Modify the first two commands of the script (the two that start with 'echo "$*" | sed') so that they target the right epub file. The example expects a part number and a volume number: "make 4-3" prints "Processing p4-v3" from the first command and unzips "ascendance-of-a-bookworm-part-4-volume-3.epub" with the second. If that is confusing, it will be faster to unzip the epub manually and remove those two commands.
- Copy the following script into a Makefile (or replace "$*" with a folder of your choice and paste the commands into your command line).
- Place the epub file (or extracted folder) into the same directory as your Makefile and run "make <name of folder>". For example, to convert "ascendance-of-a-bookworm-part-3-volume-1.epub" I would run "make 3-1" which would create the folder "3-1" and place the relevant files in there.
- If everything went well, your folder should now have a bunch of images and full.xhtml, which should be compliant with most browsers.
%:
	echo "$*" | sed 's/\([0-9]*\)-\([0-9]*\)/Processing p\1-v\2/'
	echo "$*" | sed 's/\([0-9]*\)-\([0-9]*\)/ascendance-of-a-bookworm-part-\1-volume-\2.epub/' | xargs unzip -d $* > discard.me
# Start the output header (sb) from the cover page's XML declaration, doctype, <html> and <head> lines.
	grep $*/OEBPS/Text/cover.xhtml -e "utf" -e "DOCTYPE" -e "<html" -e "head" -e "meta" -e "title" -e "link" > sb
# Append the embedded stylesheet and open <body>. echo -e needs a shell whose echo supports -e (e.g. bash).
	echo -e "<style>\nbody {\n line-height: 1.2em;\n font-size: 1em;\n overflow-wrap: break-word;\n background-color: #222;\n color: white;\n font-family: Lato, sans-serif;\n font-size: 110%;\n user-select: none;\n cursor: none;\n margin: 0em;\n padding: 1em;\n}\n\nimg {\n max-width: 100%;\n}\n\n.main { \n font-weight: normal; \n letter-spacing: 0; \n orphans: 1; \n widows: 1; \n word-spacing: 0; \n}\n\np { display: block;\n margin-top: 0em;\n margin-bottom: 0.5em;\n margin-left: 0em;\n margin-right: 0em;\n text-indent: 18pt;\n }\n\np.signature {\n text-align: right;\n}\n\nblockquote {\nmargin-top: 1em;\nmargin-bottom: 1em;\nmargin-left: 1em;\nmargin-right: 1em;\n}\n\nblockquote p {\nmargin-left: 0;\nmargin-right: 0;\n}\n\nli {\nfont-size: 1em;\nmargin-top: 6pt;\n}\n\nli p {\ntext-indent: 0em;\n}\n\nul {\nmargin-top: 1em;\nmargin-bottom: 1em;\n}\n\nol {\nmargin-top: 1em;\nmargin-bottom: 1em;\ntext-align: left;\n}\n\nh1 {\nfont-size: 1.55em;\nmargin-top: 10em;\nmargin-bottom: 1em;\nline-height: 1.2em;\ntext-indent: 20pt;\n}\n\nh2 {\nfont-size: 1.15em;\nmargin-top: 1.5em;\nmargin-bottom: .5em;\nline-height: 1.2em;\n}\n\ntable\n{\nmargin-top: 1.5em;\nmargin-bottom: 1.5em;\nfont-size: 0.9em;\nborder-collapse: collapse;\n}\n\ntr td\n{\nvertical-align: top;\npadding: 0.2em;\n}\n\ncode {\nfont-family: Consolas,\"courier new\",monospace;\n}\n</style>\n<body>" >> sb
# Concatenate the chapters in spine order (skipping signup and toc pages), keeping only structural and text lines.
	grep -v -e "signup.xhtml" -e "toc.xhtml" $*/OEBPS/content.opf | grep -e "<itemref" | sed -z 's/cover"/cover.xhtml"/; s/ <itemref idref="/$*\/OEBPS\/Text\//g; s/"\/>\r*\n/ /g' | xargs cat | grep -e "<section" -e "<div" -e "</section" -e "</div" -e "<h1" -e "<h2" -e "<p" -e "<img" > t.xhtml
# Build the chapter navigation from each section id and its <h1> title.
# The opening <nav><ol> line is an assumed reconstruction to match the closing tags below.
	echo "<nav><ol>" > n.xhtml
	grep -e '<section' -e '<div' -e 'h1' t.xhtml | sed -z -E 's/<section[^i>]*id="([^<"]*)">\s*<div class="main">\s*<h1>([^<]*)<\/h1>/<li class="toc-front"><a href="\#\1">\2<\/a><\/li>/g' | grep -e 'li class="toc-front"' >> n.xhtml
	echo "</ol></nav>" >> n.xhtml
# Join header, navigation and text; point image and style paths at the same folder; apply the profanity filter.
	cat sb n.xhtml t.xhtml | sed 's/"..\/Images\//"/g; s/"..\/Styles\//"/g' | sed 's/the hell//g; s/as hell//g; s/Hell,/In fact,/g; s/The hell/What/g; s/a hell of a/quite a/g; s/hell /grief /g; s/damned //g; s/damning/heinous/g; ' > full.xhtml
	echo "</body> </html>" >> full.xhtml
# Move the result and the images into the target folder, then clean up temporary files and leftover epub internals.
	mv full.xhtml $*/.
	rm -f sb t.xhtml n.xhtml discard.me
	mv $*/OEBPS/Images/* $*/.
	rm -r $*/OEBPS $*/META-INF $*/mimetype
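
If you want to adapt the first two commands to another series, it may help to try the filename pattern on its own before running make. A minimal sketch, assuming the same part-volume naming as above; "epub_name" is a hypothetical helper made up for this example and is not part of the script.

```shell
# epub_name is a hypothetical helper for illustration; it is not part of the Makefile.
# It maps a "part-volume" target such as 3-1 to the epub filename the script looks for.
epub_name() {
    printf '%s\n' "$1" | sed 's/\([0-9]*\)-\([0-9]*\)/ascendance-of-a-bookworm-part-\1-volume-\2.epub/'
}

name=$(epub_name 3-1)
printf '%s\n' "$name"   # prints: ascendance-of-a-bookworm-part-3-volume-1.epub
```

For a different series, change the text around \1 and \2 until the printed name matches your file, then copy the same sed expression into the second command of the script.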