This article isn’t so much about how I love the ePub format, or how appreciative I am that Sony added ADE to the firmware of the PRS-505. It’s about the cruel and never-ending annoyances that cropped up while converting a multi-chapter HTML file into an ePub book.
Before we begin, many would ask: Why in the Hell would you roll your own eBooks? An honest response to this question would be: because I can. There have been more than a few occasions where I didn’t like the layout of a book I’d purchased and thought I could do better if I had a text editor and access to the source HTML.
Enter the ePub format: XHTML 1.0 goodness wrapped in a convenient ZIP file. It’s a good format. Simple to wrap one’s head around and easy to craft, especially with a program such as Calibre to do the heavy lifting.
My first step was Python and a module called lxml. The HTML was formatted decently well, but I had grandiose ideas. I wanted to scrape all 57 chapters (including Prologue and Epilogue) into one massive HTML file. The script started off very simple, but it grew because I wanted it to do exactly what I wanted, exactly the way I thought it should be done.
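For anyone wanting to try something similar, here is a minimal sketch of the merging step. It is not the script described above: it swaps lxml for the standard library's xml.etree.ElementTree, the function name merge_chapters is made up for illustration, and it assumes every chapter is already well-formed XHTML in the standard XHTML namespace.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Serialize XHTML as the default namespace instead of an "ns0:" prefix.
ET.register_namespace("", XHTML_NS)


def merge_chapters(chapter_sources, title):
    """Combine the <body> elements of several XHTML strings into one page.

    Hypothetical helper: only element children of each <body> are copied,
    so loose text sitting directly inside <body> is dropped.
    """
    html = ET.Element(f"{{{XHTML_NS}}}html")
    head = ET.SubElement(html, f"{{{XHTML_NS}}}head")
    ET.SubElement(head, f"{{{XHTML_NS}}}title").text = title
    body = ET.SubElement(html, f"{{{XHTML_NS}}}body")

    for source in chapter_sources:
        chapter_root = ET.fromstring(source)
        chapter_body = chapter_root.find(f"{{{XHTML_NS}}}body")
        if chapter_body is None:
            continue  # nothing to copy from a page without a body
        # Move each top-level element of the chapter into the combined body.
        for child in list(chapter_body):
            body.append(child)

    return ET.tostring(html, encoding="unicode")
```

Real-world pages are often not well-formed XML, which is one reason a more forgiving parser such as lxml is a sensible choice for the scraping itself.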
So after thirty or so versions of this Python script, I finally had a large yet fully compliant XHTML 1.0 web page with proper attributes for the images, proper CSS for centering images and text, and a good layout for the quotation at the beginning of the book. I was happy and even proud of this proto-eBook.
However, there were foul things lurking in the margins. I had decided that each chapter needed a container around it. This would make navigating the tree much easier if I needed to lay out the book again. Also, I had purchased a decorative horizontal rule from iStockPhoto to replace the <br /> with something a bit more presentable and easier to see. I even stripped the ALIGN="CENTER" from the original img tags and used the CSS margin-left and margin-right properties to center the image across the page.
All of these would bite me in the tail by the end of the day.
The first issue was my great idea of containing each chapter in its own div element. Looking back on it, I was an idiot, but that didn’t stop this great idea from coming to fruition. I have no real proof that it can’t be done, but when I looked at Calibre’s output and the generated ePub file, the container divs had been stripped from the final product. Even with Calibre’s changes, my layout looked poor at best. There were other reasons for that, but here is exactly why the containers were a bad idea: they were unnecessarily complex and did nothing but make troubleshooting more difficult. So I removed them and pressed onward.
Next came the decorative horizontal rule. I pushed it through Adobe Illustrator into Adobe Photoshop, saved it as an 800×12 pixel, 64-bit, transparency-enabled PNG file, placed it in the proper directory, and added it to the HTML where the br tags used to be. This was fairly simple and I thought nothing of it. I assumed that using CSS to resize the image to a svelte 400 pixels wide would make it fit right in with everything else.
Then, after I was done adjusting the other images in the XHTML page, I centered them and waited for Calibre to create the ePub file (yet again) and give me exactly what I had seen on my screen.
It simply was not to be.
After much screwing around with the CSS, I looked at some of the other ePub files I have and figured out something: You must wrap the img tags in a div element.
The div must then be set to text-align: center so that the image will, after all of this hard work, be centered across the page.
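The wrapping itself is easy to automate. What follows is a rough sketch rather than the script used for this book: it uses the standard library's xml.etree.ElementTree in place of lxml, the function name center_images is hypothetical, and putting the rule in an inline style attribute (instead of a class in a stylesheet) is an assumption made to keep the example short.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Serialize XHTML as the default namespace instead of an "ns0:" prefix.
ET.register_namespace("", XHTML_NS)


def center_images(xhtml_source):
    """Wrap every <img> in a <div style="text-align: center">.

    Assumes the images sit at block level (for example directly under
    <body>); a div placed inside a <p> would not be valid XHTML.
    """
    root = ET.fromstring(xhtml_source)
    img_tag = f"{{{XHTML_NS}}}img"
    div_tag = f"{{{XHTML_NS}}}div"

    # Collect (parent, image) pairs first so the tree is not changed mid-walk.
    pairs = [
        (parent, child)
        for parent in root.iter()
        for child in list(parent)
        if child.tag == img_tag
    ]

    for parent, img in pairs:
        index = list(parent).index(img)
        wrapper = ET.Element(div_tag, {"style": "text-align: center"})
        # Text that followed the image now follows the wrapper instead.
        wrapper.tail = img.tail
        img.tail = None
        parent.remove(img)
        wrapper.append(img)
        parent.insert(index, wrapper)

    return ET.tostring(root, encoding="unicode")
```

Running it twice would nest one wrapper inside another, so it is best applied once to the unwrapped source.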
The good news was that every image was now centered, except for the horizontal rules, which wouldn’t render at all. They might have been centered, but their only presence was a large blank space in the resulting ebook.
So, another trip to Adobe Photoshop was required. After reducing the rule to 32 bits and downsizing it to 400 pixels across, I re-introduced it to the book and voilà, everything was laid out perfectly.
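A quick way to check what an image editor actually wrote is to read the PNG header directly. The sketch below is an illustration, not part of the original workflow; png_info is a hypothetical helper, and it relies only on the layout of the IHDR chunk defined by the PNG specification. A "64-bit" PNG is RGBA at 16 bits per channel, and a "32-bit" one is RGBA at 8 bits per channel.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Number of channels for each PNG color type in the IHDR chunk.
CHANNELS = {0: 1, 2: 3, 3: 1, 4: 2, 6: 4}


def png_info(data):
    """Return (width, height, bits_per_pixel) from the first bytes of a PNG.

    Only the signature and the start of the IHDR chunk are inspected,
    so the first 26 bytes of the file are enough.
    """
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    _length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR":
        raise ValueError("IHDR chunk missing")
    width, height, bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return width, height, bit_depth * CHANNELS[color_type]


# Example use with a placeholder file name:
# with open("rule.png", "rb") as handle:
#     print(png_info(handle.read(26)))
```

If the third value comes back as 64, the image is in the format that failed to render on the reader in this case.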
These issues aren’t Calibre’s fault: after the first main issue, every one of them looked correct in the program’s own viewer. I don’t know if it’s an ADE issue, or if it’s an issue with the PRS-505. I have a feeling it’s an issue with the version of ADE that comes with the firmware. With that in mind, a quick list of dos and don’ts:
Use div elements as containers for your images (span might work as well). In this case, text-align: center is always a good choice.
Don’t use text-align: justify. Your text will have to settle for the standard alignments of left, right, and center.
Thanks, appreciate the feedback. Calibre is a great program and I’ve been using it to convert my eBooks into formats that don’t make me want to punch people.
I was just crafting an epub of my own. It’s a collection of notes I’m writing as I work through a project. I decided to keep it as an ebook on my nook. One of the things I want to do is include a few images. Some I want on the left, some on the right, and some centered. The usual approach to centering didn’t work. (Hmm…) But wrapping the images in a div container, as you suggested, did the trick, at least in ADE. Now to load my draft on my nook and check it out there. Thanks!
Enjoyed your article. Discovered Calibre through it. Thanks for taking the time to write it!