Server update, 21 Mar 2012: A summary of all the arXiv and Gutenberg documents is available at http://itex.vis.research.att.com/libsum.html. This includes version lengths in pages for all sizes and orientations, and erroneous pages identified by an automated checker. These errors are usually text that doesn't fit on the page. For arXiv, these are either completely failed format changes, or equations that run off the end of the line.
Server update, 19 Feb 2012: The automated translator has been substantially improved. Gutenberg texts should get most of the Latin-1 characters right, so it is worth downloading fresh copies of these texts.
The arXiv documents should be getting better translations to LaTeX now. Many are broken, but the majority seem to work, and many come out quite nicely. Work will continue.
My video presentation of iTeX at TUG 2010 in San Francisco is here.
(These pages modified extensively on 6 Feb 2012.)
TeX and other traditional text layout markup languages are predicated on the assumption that the final output format would be known to the nanometer. Extensive computation and clever algorithms let us optimize the presentation for a high standard of quality, designed by artists who are experts at document layout.
Book readers offer a new way to store and read documents, but they are a challenge to high quality text layout. Ebook users are accustomed to selecting reader orientation, typeface, and font size. We probably cannot run TeX over a document every time a reader shifts position in his chair.
iTeX is an experimental document bundle format and a free iPad application that present documents exactly as they were rendered by LaTeX. The bundle (a tar of a standardized directory structure) contains precomputed page images for portrait and landscape layouts, in standard and large type versions. I hope this may become a suitable standard that encourages similar applications on devices like the Nook or the many versions of the Kindle, with layouts for those devices included in the same bundle.
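To make the idea of a bundle concrete, here is a minimal sketch in Python of packing four sets of page images into a tar file and counting the pages in each. The directory names (portrait-standard and so on), the page file names, and both function names are assumptions made for illustration; the actual directory structure is defined in the bundle generation guide and may well differ.

```python
import tarfile
from pathlib import Path

# Hypothetical layout names; the real iTeX bundle structure may differ.
LAYOUTS = [
    f"{orientation}-{size}"
    for orientation in ("portrait", "landscape")
    for size in ("standard", "large")
]


def build_bundle(src_dir, out_path):
    """Pack src_dir (one subdirectory per layout) into a tar file."""
    src = Path(src_dir)
    missing = [name for name in LAYOUTS if not (src / name).is_dir()]
    if missing:
        raise ValueError(f"missing layout directories: {missing}")
    with tarfile.open(out_path, "w") as tar:
        # arcname keeps paths inside the tar relative to the bundle root.
        tar.add(src, arcname=src.name)
    return out_path


def pages_per_layout(bundle_path):
    """Count the files stored directly under each layout directory."""
    counts = {name: 0 for name in LAYOUTS}
    with tarfile.open(bundle_path, "r") as tar:
        for member in tar.getmembers():
            parts = Path(member.name).parts
            if member.isfile() and len(parts) == 3 and parts[1] in counts:
                counts[parts[1]] += 1
    return counts
```

Calling build_bundle("mydoc", "mydoc.tar") on a directory that holds the four layout subdirectories produces one tar file; pages_per_layout("mydoc.tar") then reports how many pages each layout needed, which will generally differ between the standard and large type versions.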
The iTeX app is available for the iPad at no charge from the iTunes store.
iTeX reads iTeX bundles. If you have loaded the app, and tap on an iTeX bundle link on the Internet, the app will download and display the document.
Of course, iTeX is quite new and not widely adopted (yet: one can hope!). So, to try it out, you need to either generate your own bundle from an existing LaTeX document or use the iTeX document translation service.
The app gives web access to Project Gutenberg and arXiv.org. You can select a document from either library, and iTeX will attempt to translate it for you to a bundle. This works quite well with Project Gutenberg books, but according to my tests it works for only about half of the scientific documents on arXiv. When it does work, you get nicely formatted LaTeX output for exactly the iPad's size and resolution.
In principle, automatic translations of TeX and LaTeX documents require human intervention. Don Knuth is quite adamant about this, and he is right. Really good formatting requires human judgment, and even some rewriting of the text.
But this service is an adequate starting point. I will be making the generated LaTeX documents available for download. Perhaps someone will clean up the results.
The library browser opens up the PG web page. Find the desired document and tap it. You will go to the translation page and, if the translation works, a button to download it will appear.
ArXiv documents are a little trickier, in a couple of ways: they are harder to translate automatically, and one has to tap on the right download option.
The option to select is
I monitor the translation service to try to get the bugs out and give better results. But I do not record any information about the source of the translation requests, or the bundle downloads.
A quick start guide is available at the beginning of this bundle generation guide. It contains some boilerplate that will help you generate an iTeX bundle quickly from an existing LaTeX document of your own.
Though a well-formatted TeX or LaTeX document requires the human touch, scripts can often generate plausibly good documents automatically. Scripts and more complete documentation are available if you wish to dive in further.
This translation is experimental, and often does not work properly. When it does, it is quite convenient. I plan to improve the translation process.
Further information is detailed here:
Summary of recent translations.
This shows all attempts in the past 10 days, and shows an overview of the errors. For arXiv, many of these errors occur because the document is just too complex for the simple translation heuristics. On 5 March I improved the handling of extraneous geometry requests, which should help somewhat.
Visualization of all the arXiv bundles translated.
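Many of the errors in these summaries are text or equations that do not fit on the page. LaTeX reports that condition in its log as an overfull \hbox, so one plausible way to flag such output is to scan the log for those warnings. The Python sketch below does that. It is an illustration only, not necessarily how the iTeX checker works; the function name and the 1pt threshold are assumptions.

```python
import re

# Matches LaTeX log warnings such as:
#   Overfull \hbox (15.0pt too wide) in paragraph at lines 12--14
#   Overfull \hbox (3.25pt too wide) detected at line 40
OVERFULL = re.compile(r"Overfull \\hbox \((\d+(?:\.\d+)?)pt too wide\)")


def overfull_widths(log_text, min_pt=1.0):
    """Return the overflow widths (in points) that are at least min_pt."""
    widths = [float(m.group(1)) for m in OVERFULL.finditer(log_text)]
    return [w for w in widths if w >= min_pt]
```

A document whose log yields an empty list has no line running past the margin by more than the threshold. Large values, especially from "detected at line" warnings, are the ones worth checking for an equation that ran off the end of the line.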
I welcome comments, bug fixes, etc.:
Bill Cheswick
ches at research.att.com
Last modified: 6 Feb 2012