LaTeX to ePub, experiments with multiformat publishing

You can also find this article in Spanish.

Didn’t I say this before? (#iloveplaintext) You can be sure that there is a way to convert the format you chose for writing a document into a different one, probably keeping most of the layout and appearance of the original text.
This was my premise when this challenge arose: to obtain an ePub file (or .mobi or any other ebook format) similar to this PDF in appearance, but improved for ebook readers such as Kindle or Cool Reader for Android devices. The good thing is that the LaTeX source files were available to use in the conversion.

In this article I publish the recipes and results of my 5 experiments, so anybody can get an idea of how powerful free software tools are for this kind of task, and which ones to try (or not to try) for her particular case. If you want to know the method that worked best for me, go to the 5th attempt.

  • 1st try: LaTeX memoir class.
  • 2nd try: LaTeX geometry package.
  • 3rd try: Pandoc (old version)
  • 4th try: Pandoc (new version)
  • 5th try (Good!): tex4ht + Calibre

1st try: LaTeX memoir class

Time spent in this conversion: 5 minutes max.
The steps I followed:

  • Open the .tex file with Kile (my favorite LaTeX editor)
  • Mark as comments the documentclass line, geometry and todonotes packages
  • Add a documentclass line with the memoir class and several ebook-specific options (taken from here; that article includes a complete template, but I went directly to what I needed).
  • My figure and table widths were in cm (#grr), so I searched and replaced them with \linewidth multiples (roughly). I think it didn’t help though.
  • Heights were in cm too, I was too lazy to change them.
  • Build the PDF with Kile and go.

The result was not worth it, so I tried a different thing.

2nd try: LaTeX geometry package

Time spent in this conversion: 5 minutes max.
The steps I followed:

  • Open the .tex file with Kile (my favorite LaTeX editor)
  • Mark as comments the documentclass line, geometry and todonotes packages
  • Add a new documentclass line and load the geometry package with the margins taken from here.
  • Set the same width for the left and right margins (they were different).
  • Search and replace figure and table widths: 13cm and 14cm to 8cm; 5cm to 3cm; 7cm to 4cm.
  • Heights were in cm too, I was too lazy to change them.
  • Build the PDF with Kile and go.
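
The search-and-replace step above can also be done from the command line. This is only a minimal sketch: it assumes the widths appear in the source as width=13cm and so on (the post does not say exactly how they were written), and the function name shrink_widths is made up for the example.

```shell
#!/bin/sh
# Print a copy of a .tex file with the fixed widths shrunk for a small page.
# The mapping comes from the list above: 13cm and 14cm to 8cm, 5cm to 3cm,
# 7cm to 4cm. Assumption: widths are written as "width=<n>cm".
shrink_widths() {
    sed \
        -e 's/width=1[34]cm/width=8cm/g' \
        -e 's/width=5cm/width=3cm/g' \
        -e 's/width=7cm/width=4cm/g' \
        "$1"
}
```

Calling shrink_widths original.tex > ebook.tex (both file names are placeholders) leaves the original file untouched, so the two versions can be built and compared side by side.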

The result was not worth it either, so I tried a different thing.

3rd try: Pandoc (old version)

Time spent in this conversion: 5 minutes max.
The steps I followed:

  • Convert to HTML with pandoc (installed from apt in Debian Squeeze):
 pandoc -f latex -t html larjona_thesis_translations_en.tex -s -S -R --toc -o larjona2.html
  • Open Calibre and add the HTML book
  • Convert to ePub (I didn’t spend time on metadata etc.)

In Calibre, I could not open the ePub files converted directly with Pandoc; that’s why I did it like this. The result (rename the file removing the “.pdf” and leaving the “.epub” extension) was considerably worse than the earlier PDFs: figures were missing (bad), footnotes were all grouped at the end of the document (bad), Appendix titles were missing (bad), this time I could see the verbatim (good), and the bibliography was missing (bad).

My Kindle reader for Android didn’t recognize the ePub. I did not troubleshoot the problem, since a free software reader, FBReader, opened it without problems.
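
A note for anyone repeating this today: the -S and -R switches in the command above were removed in Pandoc 2.0, so current releases reject them. The following is a minimal sketch of a comparable call without them, assuming pandoc is on the PATH; the function name latex_to_html is made up for the example, and I have not checked that the output matches what the old switches produced.

```shell
#!/bin/sh
# Convert a LaTeX file to a standalone HTML page with a table of contents.
# Assumes a Pandoc release without the -S/-R switches (2.0 or later).
# $1 is the input .tex file, $2 the output .html file.
latex_to_html() {
    pandoc -f latex -t html -s --toc -o "$2" "$1"
}
```

With the file names used in this post, the call would be latex_to_html larjona_thesis_translations_en.tex larjona2.html, and the resulting HTML can be added to Calibre as before.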

4th try: Pandoc (new version)

I repeated the process of my 3rd try in Debian Wheezy, which comes with an updated version of Pandoc. The result (rename the file removing the “.pdf” and leaving the “.epub” extension) is somewhat better, but still without figures, tables or bibliography.

5th try (Good!): tex4ht + Calibre

Time spent in this conversion: 5 minutes max.
The steps I followed:

  • Install tex4ht from apt in Debian Wheezy.
  • Open the .tex file with Kile, and comment out a section title that included a footnote with a URL and a short title for the table of contents, because it was causing an error when building with tex4ht. I wrote a simple title instead.
  • Build again with Kile (PDFLaTeX) so the bibliography files are generated
  • From command line:
 htlatex larjona_thesis_translations_en_tex4ht.tex

(some warnings about figure sizes)

  • Open Calibre, and add the book, choosing the “root” HTML file generated.
  • Convert to ePub, adding a cover and filling in the metadata
  • FINISHED!
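
The last steps can also be run without opening the Calibre window, because Calibre ships a command-line converter called ebook-convert. This is a minimal sketch, not what I did at the time: it assumes htlatex and ebook-convert are on the PATH and that it is run from the directory containing the .tex file; the function name tex_to_epub is made up for the example.

```shell
#!/bin/sh
# Sketch: LaTeX -> HTML (tex4ht) -> ePub (Calibre), without the GUI.
# $1 is the .tex file in the current directory, $2 the book title.
tex_to_epub() {
    tex_file=$1
    title=$2
    base=$(basename "$tex_file" .tex)

    # htlatex writes <base>.html (plus CSS and images) to the current directory.
    htlatex "$tex_file" || return 1

    # ebook-convert picks the input and output formats from the file extensions.
    ebook-convert "$base.html" "$base.epub" --title "$title"
}
```

For example: tex_to_epub larjona_thesis_translations_en_tex4ht.tex "My thesis" (the title here is a placeholder). Other metadata, such as the author or a cover image, can be passed to ebook-convert in the same way with its --authors and --cover options.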

The result (rename the file removing the “.pdf” and leaving the “.epub” extension) is quite acceptable: my ePub reader for Android renders links well (URLs, table of contents, footnotes and bibliography), but not verbatim (the *) or tables. However, on my laptop, I see everything fine with Calibre or Okular.

Comments

  • I did this almost one year ago, so maybe now we can find better tools (or the same, but improved) to do the trick.
  • I didn’t have much time to spend on this, so I didn’t read all the documentation, etc. I just searched the internet a bit, read the most important things, and tried. I’m sure that there are other ways to do the conversion that are better, or more elegant, or anything.
  • I had a particular document that I wanted to convert, and it was already written in LaTeX. I mean: if I started writing a document from scratch (with multiformat publishing in mind), maybe I wouldn’t choose LaTeX as the source. For example, I would use Markdown, or DocBook edited with Publican (as the Debian Administration Handbook does), or Emacs org-mode (yes, it’s possible to write a thesis in org-mode), or HTML. It depends a lot on the type of document: mine was a simple one, but other works may have, for example, a lot of formulae, graphics or tables, and you need to choose the main format and editing tool with that in mind too.
  • Of course I tried only free-as-in-freedom software to do the conversions. Not interested in nonfree tools.

Thanks

Gregorio Robles asked if I could convert my Master’s thesis PDF or LaTeX sources to an ebook-friendly format. For me, it has been an opportunity to:

  • Play with ‘LaTeX to anything’ converters. Indeed, I wanted to publish an HTML version here in my blog, and here you are ;)
  • Play with ebook readers on my laptop and Android devices. I have no Kindle or other ebook reader (machine), I still prefer paper… but I knew Calibre, FBReader, Cool Reader and of course Okular, which handles many types of file, including ePub and .mobi.
  • Put here what I learned, so other people may save a bit of work (I hope!).

So, thanks a lot, Gregorio!

About larjona

My name is Laura Arjona, I am a libre software user and fan of free culture. If you want to contact me you can write an email to larjona [at] larjona [dot] net. I am @larjona at identi.ca in the Pump.io social network.