The aim here is to emulate the Unix nroff, which formats
text as best it can for the screen, from the same
input as the Unix typesetting program troff.
Converting DVI to plain text is the basis of many of these techniques; sometimes the simple conversion provides a good enough response. Options are:
dvi2tty (one of the earliest), and
catdvi, which is capable of generating Latin-1 (ISO 8859-1) or UTF-8 encoded output. Catdvi was conceived as a replacement for dvi2tty, but development seems to have stopped before the authors were willing to declare the work complete.
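As a rough illustration of the DVI route, the following Python sketch wraps a call to catdvi. The function names are hypothetical, and the sketch assumes that catdvi is installed and that it selects its output encoding with -e; check the manual page on your system before relying on it.

```python
import shutil
import subprocess


def catdvi_command(dvi_path, encoding="UTF-8"):
    """Build the argument list for a catdvi run.

    Assumption: catdvi takes "-e <encoding>" to choose the output
    encoding (for example UTF-8 or ISO-8859-1).
    """
    return ["catdvi", "-e", encoding, dvi_path]


def dvi_to_text(dvi_path, encoding="UTF-8"):
    """Run catdvi on dvi_path and return the converted text."""
    if shutil.which("catdvi") is None:
        raise FileNotFoundError("catdvi was not found on PATH")
    result = subprocess.run(
        catdvi_command(dvi_path, encoding),
        check=True,            # raise if catdvi reports an error
        capture_output=True,   # collect stdout instead of printing it
    )
    return result.stdout.decode(encoding)
```

A call such as dvi_to_text("paper.dvi") would then return the text of a (hypothetical) paper.dvi as a Python string.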
A common problem is the hyphenation that TeX inserts when typesetting something: since the output is inevitably viewed using fonts that don’t match the original, the hyphenation usually looks silly.
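One way to soften the hyphenation problem is to post-process the converted text. The sketch below is a heuristic of our own rather than a feature of any of the tools mentioned here: it rejoins a word that was split by a hyphen at the end of a line. It cannot tell a hyphen that TeX inserted from one that belongs to the word, so a compound such as "well-known" that happens to break at its hyphen will be joined incorrectly.

```python
import re

# A lowercase letter, a hyphen at the very end of a line, optional
# indentation on the next line, then another lowercase letter.
_LINE_END_HYPHEN = re.compile(r"([a-z])-\n[ \t]*([a-z])")


def rejoin_hyphenated(text):
    """Join words that were hyphenated across a line break."""
    return _LINE_END_HYPHEN.sub(r"\1\2", text)
```

Note that the rejoined word stays on the first line, so lines may end up longer than the width the converter was asked for.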
Ralph Droms provides a
txt bundle of things in support of ASCII generation,
but it doesn’t do a good job with tables and mathematics.
Another possibility is to
use the LaTeX-to-ASCII conversion program,
although this is really more of a de-TeXing program.
The canonical de-TeXing program is
detex, which removes
all comments and control sequences
from its input before writing it to its output. Its original purpose
was to prepare input for a dumb spelling checker, and it’s only usable
for preparing useful ASCII versions of a document in highly
restricted circumstances.
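To make the idea of de-TeXing concrete, here is a deliberately naive Python sketch that strips comments and control sequences with regular expressions. It is not how detex itself is implemented, and the function name is hypothetical. It shows why the approach has such limited use: arguments of every command are kept whether or not they are meant to be printed, and mathematics is reduced to whatever characters remain.

```python
import re


def naive_detex(source):
    """Strip comments, control sequences and grouping characters."""
    # Comments: an unescaped % up to the end of the line.
    # (A comment straight after "\\" is not recognised by this sketch.)
    text = re.sub(r"(?<!\\)%.*", "", source)
    # Control words (\section, \emph, ...) and control symbols (\%, \,).
    text = re.sub(r"\\([A-Za-z]+|.)", "", text)
    # Braces and math-shift characters left behind.
    text = re.sub(r"[{}$]", "", text)
    return text
```

For instance, the input `\emph{Hello} world % note` comes out as "Hello world" followed by a trailing space.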
Tex2mail is slightly more than a de-TeXer — it’s a
Perl script that converts TeX files into
plain text files, expanding various mathematical symbols
(sums, products, integrals, sub/superscripts, fractions, square
roots, …) into “ASCII art” that spreads over
multiple lines if necessary. The result is more readable to human
beings than the flat-style TeX code.
Another significant possibility is to use one of the
HTML-generation solutions,
and then to use a browser such as
lynx to dump the resulting
HTML as plain text.
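The last step of that route can be scripted. The Python sketch below assumes that lynx is installed and that the HTML file has already been generated by whatever converter you chose; the function names and the default width of 72 columns are our own choices. The options used are -dump (write the formatted page to standard output), -nolist (omit the list of links at the end) and -width (set the line width).

```python
import shutil
import subprocess


def lynx_dump_command(html_path, width=72):
    """Build the argument list for dumping html_path as plain text."""
    return ["lynx", "-dump", "-nolist", f"-width={width}", html_path]


def html_to_text(html_path, width=72):
    """Run lynx on html_path and return the formatted plain text."""
    if shutil.which("lynx") is None:
        raise FileNotFoundError("lynx was not found on PATH")
    result = subprocess.run(
        lynx_dump_command(html_path, width),
        check=True,            # raise if lynx reports an error
        capture_output=True,   # collect stdout instead of printing it
        text=True,             # decode using the locale's encoding
    )
    return result.stdout
```

Because the browser lays the text out afresh, this route avoids the inherited line breaks and hyphenation that trouble the DVI converters.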
FAQ ID: Q-toascii