R, Statistics and Visualization

R Reporting Part 7: Converting R Documents to E-books

This is the seventh in a series of articles on how to use R, RStudio and TexMaker to prepare presentations and batch jobs for automated reporting on a web server or Microsoft SharePoint server. The series is based on the presentation that I gave at the February 27, 2016 Dallas R User Group Meetup. Because the presentation was primarily a demonstration, there really isn’t a presentation to distribute; this series covers the topics from the presentation/demonstration. Further articles will be added to the series as I complete them over the next couple of weeks.

The series describes the process for daily batch jobs that generate the Daily Econometric Graphs web page, which links to the same econometric charts in several formats, all generated from the same R code.

All of the examples are based on the knitr R package. This article is not a replacement for the knitr documentation, which you should consult for details.

Example source is available in images/documents/econometric_source.zip.

Software Required

Creating epub files requires software that converts HTML to epub format. Calibre is one of the better open source tools for e-book generation and management, and it is available for Linux, Windows and OS X. Converting LaTeX files to epub requires an HTML file as an intermediate step. Several tools can produce it; Latex2html is arguably the best of them, but it is not being actively maintained. The LaTeX to HTML FAQ has a list of alternatives.
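
Before running any of the steps in this article, it can help to confirm that the command-line tools are actually installed. The following is a minimal sketch and not part of the original batch job; the function name missing_tools is made up for illustration, and the tool list simply mirrors the commands used in this article.

```shell
# Hypothetical helper: print the name of every command that is NOT on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Commands used by the scripts in this article.
missing=$(missing_tools Rscript optipng ebook-convert latex2html scp)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All required tools found"
fi
```

latex2html is only needed when the input is a LaTeX file, so it could be dropped from the list for Rhtml-only jobs.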

When evaluating alternatives, prefer one that does not generate pages with headers and footers, as these can create problems during the epub generation process.

Creating an E-book from Rhtml

The first step in creating an e-book is to create an HTML file as input to the process. You can generate it by knitting an .Rhtml file or by using one of several utilities that convert LaTeX files to HTML. The script below is a very simplified example that generates an epub. For simplicity, error checking and recovery are omitted:

    #!/bin/bash
    echo $HOME
    cd ~/svn_work/Consulting_Business/web_site/articles/econometric_charts/src/R
    #
    # Run RScript and knitr to generate the HTML file
    #
    /usr/bin/Rscript -e "require('knitr'); knit('econometric_charts_home_page.Rhtml')"
    #
    # Compress images
    #
    optipng images/figures/*.png
    optipng *.png
    #
    # Use Calibre ebook-convert to convert the HTML file to an epub file
    #
    ebook-convert econometric_charts_home_page.html .epub --authors "Bruce Moore" --language en --no-default-epub-cover
    #
    # Upload to the web site
    #
    cd ~/svn_work/Consulting_Business/web_site/articles/econometric_charts/src/R
    scp -i /home/batchuser/.ssh/webuser econometric_charts_home_page_article/econometric_charts_home_page_article.epub webuser@webusertwareservices.com:/home/webuser/www/images/documents/
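
The script above deliberately leaves out error checking. As a rough sketch of one way to add it, the hypothetical run_step helper below wraps each command, reports success or failure, and returns the command's exit status so the caller can stop. The helper name and messages are assumptions for illustration, not part of the original script.

```shell
# Hypothetical helper: run one pipeline step and report whether it worked.
# Usage: run_step "label" command [args...]
run_step() {
    step_label="$1"
    shift
    if "$@"; then
        echo "ok: $step_label"
    else
        step_status=$?
        echo "FAILED ($step_status): $step_label" >&2
        return "$step_status"
    fi
}

# Stand-in commands (true/false) show the behaviour without needing R or
# Calibre. In the real script each step would be wrapped the same way, e.g.:
#   run_step "knit" /usr/bin/Rscript -e "..." || exit 1
run_step "first step" true &&
    run_step "second step" false ||
    echo "pipeline stopped early"
```

Under cron, the messages written to standard error end up in the job's mail or log, which makes a failed nightly run much easier to diagnose.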

Run RScript and Knitr

The first section of the script calls Rscript to run R from the command line and passes it two statements. The first, require('knitr'), loads the knitr package; the second, knit(), runs all of the R code in the Rhtml file and creates an HTML file.

    #
    # Run RScript and knitr to generate the HTML file
    #
    /usr/bin/Rscript -e "require('knitr'); knit('econometric_charts_home_page.Rhtml')"
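
For an .Rhtml input, knit() writes an .html file with the same base name, which is why the later ebook-convert step reads econometric_charts_home_page.html. A batch job may want to confirm that this file really exists before continuing. The sketch below shows one way to do that; html_name_for and check_output are hypothetical helper names, not knitr features.

```shell
# Hypothetical helper: derive the HTML name knit() produces for an .Rhtml file.
html_name_for() {
    printf '%s\n' "${1%.Rhtml}.html"
}

# Hypothetical helper: succeed only if the file exists and is not empty.
check_output() {
    if [ -s "$1" ]; then
        echo "found $1"
    else
        echo "missing or empty: $1" >&2
        return 1
    fi
}

html_file=$(html_name_for econometric_charts_home_page.Rhtml)
echo "expecting $html_file"
# In the batch script, call this right after the Rscript line:
#   check_output "$html_file" || exit 1
```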

Compress Images

The second segment uses optipng to compress the .png images that are created as graphics when running the Rhtml file:

    #
    # Compress images
    #
    optipng images/figures/*.png
    optipng *.png
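
One detail to watch with the wildcard form: if a directory contains no .png files, the shell normally passes the literal pattern to optipng, which then reports an error. A possible workaround, sketched here with a hypothetical list_pngs helper, is to loop over the matches and skip anything that is not a real file. The echo stands in for the optipng call so the sketch runs without optipng installed.

```shell
# Hypothetical helper: print each .png file that actually exists in the
# given directories, one per line. Unmatched patterns are skipped.
list_pngs() {
    for png_dir in "$@"; do
        for png in "$png_dir"/*.png; do
            if [ -f "$png" ]; then
                printf '%s\n' "$png"
            fi
        done
    done
}

# Show what would be compressed; with optipng installed, the loop body
# would be:  optipng "$png"
list_pngs . images/figures | while IFS= read -r png; do
    echo "would compress: $png"
done
```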

Ebook-convert

The third segment uses the ebook-convert program from Calibre to create the epub file. There are numerous options for the ebook-convert command, but the --authors, --language and --no-default-epub-cover options are a minimal set. The first two are self-explanatory and are useful for all files. The --no-default-epub-cover option is useful for input files whose content already produces a good cover page, or where no cover page is needed.

    #
    # Use Calibre ebook-convert to convert the HTML file to an epub file
    #
    ebook-convert econometric_charts_home_page.html .epub --authors "Bruce Moore" --language en --no-default-epub-cover
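
Giving the output file as just .epub tells ebook-convert to derive the output name from the input file name. When the same job converts several documents, it may be clearer to spell the output name out. The following sketch only prints the command it would run, so it works without Calibre installed; epub_command is a hypothetical helper name, and the options are the same three used above.

```shell
# Hypothetical helper: print the ebook-convert command for one HTML file,
# spelling out the output name that the ".epub" shorthand would derive.
epub_command() {
    html_in="$1"
    book_author="$2"
    book_lang="$3"
    epub_out="${html_in%.html}.epub"
    printf 'ebook-convert %s %s --authors "%s" --language %s --no-default-epub-cover\n' \
        "$html_in" "$epub_out" "$book_author" "$book_lang"
}

epub_command econometric_charts_home_page.html "Bruce Moore" en
```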

The final segment uploads the file to a web server or other server, in this case with the scp secure copy command. This assumes that public key authentication has already been configured. The -i option, which points to the private key file to be used for authentication, is not normally needed for interactive use but is necessary for reliable use under cron.

    #
    # Upload to the web site
    #
    cd ~/svn_work/Consulting_Business/web_site/articles/econometric_charts/src/R
    scp -i /home/batchuser/.ssh/webuser econometric_charts_home_page_article/econometric_charts_home_page_article.epub webuser@webusertwareservices.com:/home/webuser/www/images/documents/

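
Because a cron job has nobody to answer a password prompt, it can be worth making the upload fail fast when the key is missing or not accepted. The sketch below is one possible approach and goes beyond the original script: it checks that the key file is readable and adds the OpenSSH option BatchMode=yes, which disables interactive prompts. The upload_file name, the DRY_RUN variable and the example.com destination are all assumptions for illustration.

```shell
# Hypothetical helper: upload one file with scp using an explicit key.
# BatchMode=yes makes ssh fail instead of waiting for a password prompt.
# Set DRY_RUN=1 to print the command instead of running it.
upload_file() {
    key_file="$1"
    local_file="$2"
    destination="$3"
    if [ ! -r "$key_file" ]; then
        echo "cannot read key file: $key_file" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "scp -i $key_file -o BatchMode=yes $local_file $destination"
        return 0
    fi
    scp -i "$key_file" -o BatchMode=yes "$local_file" "$destination"
}

# Dry run against a placeholder host; a missing key is reported, not fatal here.
DRY_RUN=1 upload_file "$HOME/.ssh/id_example" book.epub user@example.com:/tmp/ ||
    echo "upload skipped"
```
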
When using a LaTeX input file, there is one additional step between the Rscript and ebook-convert segments: running latex2html to produce the HTML file. The -split +0 option places the output in a single HTML file rather than splitting it into separate files by section.

    #
    # Convert the LaTeX file to a single HTML file for the epub conversion
    #
    latex2html -split +0 econometric_charts_home_page_slides.tex
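
To connect this step to ebook-convert, the script needs to know where latex2html put the HTML file. The sketch below assumes latex2html's default behaviour of writing into a subdirectory named after the input file, with a main HTML file of the same name; check this against your installation, since options such as -dir change it. The latex2html_output helper is hypothetical, and the commands are only printed rather than run.

```shell
# Hypothetical helper: where latex2html is ASSUMED to write its main HTML
# file for a given .tex input (a subdirectory named after the input).
latex2html_output() {
    tex_base="${1%.tex}"
    printf '%s/%s.html\n' "$tex_base" "$tex_base"
}

tex_file=econometric_charts_home_page_slides.tex
html_file=$(latex2html_output "$tex_file")

# Print the two commands that would run in sequence.
echo "latex2html -split +0 $tex_file"
echo "ebook-convert $html_file .epub"
```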