% -*-latex-*-
% Document name: /u/sy/beebe/tex/talks/special/special.ltx
% Creator: Nelson H.F. Beebe [beebe@magna.math.utah.edu]
% Creation Date: Sun Nov 11 07:06:19 1990
% 1.02 -- [04-Jun-1991] fix two small typos
% 1.01 -- [12-Nov-1990] last major changes in original version
%% @texfile{
%%     author    = "Nelson H. F. Beebe",
%%     version   = "1.02",
%%     date      = "04 June 1991",
%%     filename  = "special.ltx",
%%     address   = "Center for Scientific Computing
%%                  Department of Mathematics
%%                  South Physics Building
%%                  University of Utah
%%                  Salt Lake City, UT 84112
%%                  USA
%%                  Tel: (801) 581-5254",
%%     checksum  = "1491 6916 48967",
%%     email     = "beebe@math.utah.edu (Internet)",
%%     codetable = "ISO/ASCII",
%%     keywords  = "",
%%     supported = "yes",
%%     docstring = "This document contains a proposal for the
%%                  handling of \\special and paper
%%                  specifications by DVI drivers.
%%
%%                  The checksum field above contains the
%%                  standard UNIX wc (word count) utility
%%                  output of lines, words, and characters;
%%                  eventually, a better checksum scheme should
%%                  be developed."
%% }
\documentstyle[special,ltugboat]{article}

\title{A Proposal for \protect\TeX{} {\tt\char92special} Commands
       and \protect\DVI{} Driver Paper Specification}
\author{Nelson H. F. Beebe}
\address{Center for Scientific Computing\\
         Department of Mathematics\\
         220 South Physics Building\\
         University of Utah\\
         Salt Lake City, UT 84112\\
         USA\\
         Tel: (801) 581-5254\\
         FAX: (801) 581-4148}
\netaddress[\network{Internet}]{Beebe@math.utah.edu}

\begin{document}
\maketitle
\bibliographystyle{plain}

\section{Introduction}

\TeX{} is now a {\em de facto\/} standard; its source code
development is now frozen, with the version number converging to
$\pi$ as increasingly rare bugs are fixed
\cite{Knuth:TB11-4-???-???}. \TeX{} has been implemented on nearly
every computer architecture commercially available today, from
personal computers to supercomputers, on a wide variety of operating
systems.

\TeX{}'s principal output is a {\em device-independent file}, the \DVI{} file, which contains a compact description of where to set characters on the page. It does not contain any descriptions of the characters themselves, only a reference to the fonts in which they are found. A few other programs besides \TeX{} also produce \DVI{} files. It is the job of separate programs, known as \DVI{} drivers, to translate this description into a form suitable for some output device, which might be a printer, a display screen, a phototypesetter, or even another \DVI{} file. Because a separate \DVI{} driver is needed for each output device, and each operating system, there is the potential for an explosion in the number of auxiliary programs that may be required to obtain usable output from \TeX{}, and regrettably, that seems to have happened. \section{The \protect\DVI{} driver problem} I have previously espoused the view \cite{Beebe:TB8-1-41} that prevention of \DVI{} driver program proliferation is properly addressed by writing a {\em family} of drivers that supports a wide variety of output devices, sharing common source code as much as possible. The code must be highly portable, so as to work on a wide variety of operating systems. My implementation of such a family of drivers has been well-received, and many thousands of copies of the programs are in use around the world. The last public release, version 2.10 of October 1988, consists of about 30~000 lines of code for 19 drivers, together with about 8400 lines of documentation, corresponding to about 150 typeset pages. Five major operating environments (Atari, DEC TOPS-20, DEC \VAX{} VMS, IBM PC, and \UNIX{}) % \typeout{EDITOR: small caps on UNIX looks odd} % are supported, with several different compilers. Ports have been carried out to other operating systems as well, but the changes have not been made generally available. 
The widespread use of those programs has confirmed my thesis, but has also demonstrated that they have some deficiencies. This is to be expected in any software product, whether public or commercial. Even \TeX{} has evolved from its original design. Consequently, in the fall of 1988 I set out to redesign the driver family to remedy all of the deficiencies, to further enhance portability to new operating systems and new compilers, to make it easier to modify existing driver code to support other output devices, and to extend and enhance the documentation. The development version is known as 3.0. This work is, alas, far from complete, and I sometimes wonder whether Don Knuth will finish Volume 4 of the {\em Art of Computer Programming} before I finish the new driver family. However, considerable progress has been made. The number of output devices and operating environments supported has more than doubled. The source code is now over 55~000 lines; for comparison, \TeX{} and \MF{} are each about 20~000 lines when prettyprinted. There are 23~500 lines of documentation, corresponding to about % dvidrive 256 % dviman 53 % dviman2 37 % dvistatu 20 % dvi.ps 33 % Total = 399 400 typeset pages. The `manual pages' are written in a subset of \LaTeX{}, then converted automatically to \UNIX{} \verb|troff| format, \VAX{} VMS \verb|help| file format, Emacs \TeX{}info format, and a simple ASCII text file format. \section{Standardization of \protect\DVI{} drivers} As should be expected, the proliferation of \DVI{} driver code written by many authors has led to considerable variation in driver interfaces and operation. While human interfaces unavoidably depend somewhat on the operating environment, one can demand that the same {\em capabilities\/} (e.g.\ page selection and order, paper sizes, page origin offset, file paths, startup-file support, \ldots{}) be available in all drivers. Operational differences are less excusable. 
For example, most programs, including \TeX{}, have fixed limits
arising from compile-time choices of internal storage sizes. User
annoyance and frustration result when those limits are reached
prematurely. Even on the same output device, slight variations in
page positioning, and placement of rules and characters will be found
in different drivers. Worse, some drivers may refuse to print certain
\DVI{} files, because internal limits are reached, or a particular
font cannot be found.

To address these problems, a committee of the \TeX{} Users Group was
established to develop a standard for \DVI{} drivers. Completion of a
level-0 draft is imminent. This draft is intended to define minimal
standards that all \DVI{} drivers should adhere to. It does not
address some of the thornier issues, particularly the \TeX{}
\verb|\special| command, which will be covered in a higher-level
draft standard yet to be prepared.

\section{The \protect\TeX{} {\tt\char92special} command}

When \TeX{} was first defined in 1977--78, its author realized that
there would be a need for extensibility. He chose to provide this in
a very simple form: an arbitrary string provided as the argument to
the \verb|\special| command is macro-expanded, then passed verbatim
to the \DVI{} file without further interpretation. To guide \TeX{}
users and authors of \DVI{} drivers, he offered this advice
\cite[pp.~228--229]{Knuth:texbook}:
%
\begin{quote}
The $\langle$token list$\rangle$ in a \verb|\special| command should
consist of a keyword followed if necessary by a space and appropriate
arguments. For example,
%
\begin{verbatim}
\special{halftone pic1}
\end{verbatim}
%
might mean that a picture on the file \verb|pic1| should be inserted
on the current page, with its reference point at the current
position.

$\vdots$

\noindent
Software programs that convert \verb|dvi| files to printed or display
output should be able to fail gracefully when they don't recognize
your special keywords.
$\vdots$ \noindent However, the author anticipates that certain standards for common graphics operations will emerge in the \TeX{} user community, after careful experiments have been made by groups of people; then there will be a chance for some uniformity in the use of \verb|\special| extensions. \end{quote} As Knuth noted, the most common use of \verb|\special| is to inform the \DVI{} driver that a graphics file is to be inserted in the output. Many other possibilities exist, including paper specification, operator messages, grey shading, change bars, color selection, page overlays, and output device control. With very few exceptions, existing drivers, including my own 2.10 version, have adopted {\em ad hoc\/} syntax for the \verb|\special| string. The result is that the \DVI{} file is no longer device-independent; it depends both on the output device, and {\em on the particular driver that is expected to process it}. \section{Improving the handling of {\tt\char92special} commands} In the 3.0 \DVI{} driver development, I had to solve the \verb|\special| problem. This section will describe how, and why, I did so. While the complete source code for the 3.0 drivers will not be released for some time, the part described in this article for \verb|\special| strings and paper specifications is complete, and {\bf I am making it available immediately, and without any restrictions whatsoever, to authors of \DVI{} drivers for incorporation in their programs.} The source code is written in ANSI C \cite{ANSI:c89}. C is already used for many \DVI{} driver programs; for drivers that are not written in C, it should be considerably easier to start with this code and reprogram it in some other modern language, such as Pascal or Modula-2, than to redevelop equivalent code from scratch. The previous section observed that most existing drivers have chosen an arbitrary syntax for the \verb|\special| strings they support. 
This is undesirable, for at least these reasons: % \begin{itemize} \item The chosen syntax is mostly unique to a particular driver, and therefore seriously compromises document portability. \item The syntax is not obviously extensible. \item The syntax cannot always be unambiguously parsed. \item The output device, or driver, to which the \verb|\special| applies is not determinable. \item The capabilities are weak, and fail to address many of the potential uses of the \verb|\special| command. \end{itemize} The syntax that I have developed completely resolves these objections. It has the following features: % \begin{itemize} \item The \verb|\special| command string is defined to contain a program written in a small language that consists of sequences of assignment statements, possibly with embedded comments. \item The \verb|\special| language is {\em rigorously defined\/} by a programming language grammar, based on the C language grammar \cite{ANSI:c89,Harbison:carm-2}. Correct parsers for the language can be developed using any of several standard methods that are well-known in computer science \cite{Aho:red-dragon,Aho:green-dragon,% Holub:compiler-design,% Schreiner:compiler} and the \UNIX{} world \cite{Johnson:yacc,Lesk:lex}. Implementations of some of these are available from several sources, and for several operating systems \cite{Abraxas:pcyacc,Donnelly:bison,% Gray:lex,Holub:compiler-design,% MKS:yacc,Paxson:flex}. \item The language is {\em extensible}. An assignment statement consists of a keyword\slash value pair. Several keywords are already defined, and {\em new ones can be added without invalidating existing uses of the language}. \item Keywords are typed, and constant values assigned to them must be of the same type. The supported types are scalar strings, numbers, and dimensions. The latter include all of \TeX{}'s standard dimensional units. \item There is {\em no limit\/} (other than host memory) on the length of a constant string. 
\item Value string concatenation is supported in the style of ANSI C \cite{ANSI:c89}, avoiding the often severe line length limitations of text editors, operating systems, and file systems. \item Provision is made for encoding {\em all} characters in the host character set, so that, e.g.\ binary printer control sequences can be incorporated as {\em printable}, and {\em portable}, text in \TeX{} documents. \item A particular keyword, \verb|language|, is provided to permit the user to specify the output device language, or the \DVI{} driver, to which the \verb|\special| command is directed. \item The language is general enough that it can be used for other purposes. In my 3.0 \DVI{} driver software, the complex issue of paper specification is handled by the same language, and importantly, by the {\em same parser\/} that is used for \verb|\special| strings. \end{itemize} % In the actual implementation of the parser, I chose {\em not\/} to use one of the above-cited parser generators. There are two important reasons for this. \begin{itemize} \item Parser generators convert a grammar file to an output program that is impossible to modify by hand. Portability and extensions of the drivers would be compromised if part of their source code could only be generated on certain systems, or with proprietary software. \item Parser generators encode the language keywords into the parser code, usually in incomprehensible forms; examine the parsing tables in a \verb|yacc|-generated parser \cite{Johnson:yacc} to see why. \end{itemize} By suitable abstractions, it has proven possible to create a recursive-descent parser \cite{Aho:red-dragon,Aho:green-dragon} for the language in which {\em the keywords and value storage locations are provided in a table passed to the parser}. The parser code is therefore completely portable, and {\em independent\/} of the keywords in the language it parses. 
The same code is used for both the \verb|\special| command strings,
and for paper specification.

\section{A proposed syntax for the {\tt\char92special} command}

The preceding section has described the motivation for a new approach
to the definition of a \verb|\special| language. What does the
language look like?

Some examples will give the general flavor before we describe the
details of the grammar. Here are some fragments of \TeX{} input with
\verb|\special| commands intended for a \DVI{} driver that produces
\POSTSCRIPT{}; each of these works with \verb|dvialw| in the version
3.0 development.
%
\begin{verbatim}
% Display a picture with the
% upper-left corner at the current
% point
\special{language "PostScript",
         include "pict.eps"}

% Display a picture at its original
% absolute page position
\special{language "PostScript",
         overlay "pict.eps"}

% Use literal PostScript to draw a
% 1in box with lower-left corner at
% TeX's current point
\special{language "PostScript",
         literal "newpath
         % move origin from upper-left
         % to lower-left
         0 -72 translate
         0 0 moveto
         0 72 rlineto
         72 0 rlineto
         0 -72 rlineto
         -72 0 rlineto
         closepath
         4 setlinewidth
         stroke
         showpage"}

% Display a figure at half size
\special{language "PostScript",
         literal "0.5 0.5 scale",
         include "pict.eps"}

% Display the figure in landscape
% mode by rotating the coordinates
% about the center of the bounding
% box
\special{language "PostScript",
         literal "BoxWidth 2 div
                  BoxHeight 2 div
                  translate
                  90 rotate
                  BoxWidth -2 div
                  BoxHeight -2 div
                  translate",
         include "pict.eps"}
\end{verbatim}

Naturally, the details of a \verb|\special| command invocation should
be hidden away in suitable macros that are easy to use.
Here are some examples from a recent document illustrating the
incorporation by \verb|dvialw| of \POSTSCRIPT{} figures from a
variety of sources:
%
\begin{verbatim}
\newcommand{\FIGPLOT}[4]{%
% Arg 1 = EPS file to plot
% Arg 2 = figure caption
% Arg 3 = width in inches
% Arg 4 = height in inches
\begin{figure}[htb]
  \Figrule\smallskip
  \begin{center}
    \setlength{\unitlength}{1.0in}
    \begin{picture}(#3,#4)(0,0)%
      \put(0,0){\special{
        language "PostScript",
        position "bottom left",
        literal "/SX {#3 72 mul
                 BoxWidth div} def
                 /SY {#4 72 mul
                 BoxHeight div} def
                 1 SX sub BoxLLX mul
                 1 SY sub BoxLLY mul
                 translate
                 SX SY scale",
        include "#1"}}%
      \put(0,0){\circle*{0.5}}%
      \put(0,0){\dashbox{0.1}%
                (#3,#4)[t]{}}%
    \end{picture}%
  \end{center}
  \caption{\tolerance=6000
    \emergencystretch=3pt
    #2 File: {\tt #1}.
    Picture size: #3in wide by
    #4in high.}
  \label{#1}
  \smallskip\Figrule
\end{figure}
}

\newcommand{\Figrule}{\hrule
    width \linewidth
    height 2pt depth 2pt \relax}

\FIGPLOT{roseart.ps}{Adobe
    Illustrator 1.0b2 rose art
    (scaled 1:2)}
    {3.4861}{4.625}

\FIGPLOT{golfart.ps}{Test of
    golfart scaling
    (scaled $1:2$).}
    {3.95833}{4.82639}

\FIGPLOT{tiger.ps}{A bitmapped
    image.}
    {4.5}{3.0107}
\end{verbatim}

The \verb|literal| string makes use of \POSTSCRIPT{} macros output by
\verb|dvialw| to define the position (\verb|BoxLLX|, \verb|BoxLLY|,
\verb|BoxURX|, and \verb|BoxURY|) and size (\verb|BoxHeight| and
\verb|BoxWidth|) of the bounding box. The current page position is
also available as (\verb|CurrentX|, \verb|CurrentY|), and the paper
size as \verb|PaperHeight| and \verb|PaperWidth|. All of these are in
standard \POSTSCRIPT{} units of big points. These quantities are
needed to support things like figure transformations, landscape mode,
change bars, and grey shading.

If a \verb|\special| contains both a \verb|literal| and an
\verb|include| or \verb|overlay| statement, then the literal string
is output {\em before} the inserted file.
Should the reverse order be required, then the literal string must be
specified in a separate following \verb|\special|.

Finally, here are some examples of the same language, now used to
parse paper specifications. The first is a command-line, or
startup-file, switch which provides a paper program as the switch
value:
%
\begin{verbatim}
-paper:{paper="letter";width=8.5in;
        height=11in;dev_init="...";}
\end{verbatim}
%
In a startup file, such options can be written more clearly:
%
\begin{verbatim}
-paper:
{
  % standard US paper size
  paper = "letter";
  width = 8.5in;
  height = 11in;

  % printer origin is off by 0.05in
  x_origin = 1.05in;
  y_origin = 1in;

  % printer wraps coordinates, so
  % we need clipping turned on
  x_clip = 1;
  y_clip = 1;

  % not all of page is imageable
  x_left = 0.3in;
  x_right = 0.3in;
  y_top = 0.5in;
  y_bottom = 0.5in;

  % print pages from last to first
  output_order = -1;

  % adjacent strings are concatenated
  dev_init = "...."
             "...."
             "....";

  % final formfeed and printer reset
  dev_term = "\f\033E";

  page_init = "....";
  page_term = "....";
}

-paper=
{
  paper = "ALW-note";
  use = "letter";
  x_left = 0.41in;
  x_right = 0.41in;
  y_top = 0.42in;
  y_bottom = 0.42in;
}
\end{verbatim}

The last example illustrates a feature of the paper specification
language; the \verb|use| keyword references a paper type defined
elsewhere whose specifications are copied into the internal data
structures for the new type before the new values are installed. This
makes it easy to prepare modifications of base forms. For example,
most laser printers use the same size paper, but differ in the
imageable area and output stacking order. The example above defines a
paper type known to the Apple LaserWriter in terms of a standard
paper type.

Comments are from percent to end-of-line (like \TeX{}), and letter
case is {\em not significant} in variable names. Whitespace is
ignored, so the specification can be formatted for readability, or
for compactness.
Dimensions can be given in any unit known to \TeX{} (bp cc cm dd in mm pc pt sp). The presence of a left brace following the paper switch signals that a forms definition follows; otherwise, the following token is a paper name. To facilitate collection of the complete specification at a higher level without having to parse it in detail when the switch and its value are collected, braces must be balanced; escape sequences and comments provide ways to ensure this. \section{The language grammar} The grammar for the language is based on the C programming language grammar given in Appendix B of \cite{Harbison:carm-2}, with changes affecting hexadecimal escape sequences in strings, and concatenation of adjacent strings, as specified in the ANSI C standard \cite{ANSI:c89}. Adjacent string concatenation is a convenient way of working around limitations on line length when long strings are needed, and adding support for it took only four lines of code. Hexadecimal escape sequences of arbitrary length permit transparent support for character sizes larger than 8 bits. Octal escape sequences remain limited to 3 digits for backward compatibility; hexadecimal escape sequences are new with ANSI C. In the following grammar, the suffix \verb|-opt| means that the item is optional. For brevity, numeric constants are not specified in grammatical form here. They are parsed by the ANSI C library routine, \verb|strtod()|, which expects numbers in the form (\verb|[ ]| marks optional fields, \verb={|}= marks alternatives): % \begin{verbatim} [whitespace][sign][digits][. digits] [{e|E}[sign]digits] \end{verbatim} Except in quoted strings, tokens may not contain embedded blanks. Thus, 210mm is legal input, but 210\verb*| |mm is not. 
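
The two rules just stated (numbers are read by \verb|strtod()|, and
the unit must follow the number with no intervening blank) can be
illustrated by a minimal C sketch. It is not taken from the driver
sources: the function name \verb|parse_dimension|, the decision to
return the value in \TeX{} points, and the matching of unit names in
lowercase only are all assumptions made for this example. The
conversion factors are those given in {\em The \TeX{}book}.
%
\begin{verbatim}
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the
   driver sources): parse a token such
   as "8.5in" and store its value in
   TeX points.  Returns 0 on success,
   -1 on a malformed token. */
int parse_dimension(const char *token,
                    double *pt)
{
    /* TeX points per unit */
    static const struct {
        const char *name;
        double factor;
    } units[] = {
        { "bp", 72.27 / 72.0 },
        { "cc", 12.0 * 1238.0 / 1157.0 },
        { "cm", 72.27 / 2.54 },
        { "dd", 1238.0 / 1157.0 },
        { "in", 72.27 },
        { "mm", 72.27 / 25.4 },
        { "pc", 12.0 },
        { "pt", 1.0 },
        { "sp", 1.0 / 65536.0 }
    };
    char *end;
    double value = strtod(token, &end);
    size_t k;

    if (end == token)
        return -1;   /* no number found */

    /* The unit must follow at once, so
       "210 mm" fails the comparison. */
    for (k = 0;
         k < sizeof(units) / sizeof(units[0]);
         ++k) {
        if (strcmp(end, units[k].name) == 0) {
            *pt = value * units[k].factor;
            return 0;
        }
    }
    return -1;  /* missing or unknown unit */
}
\end{verbatim}
%
\noindent
With this sketch, a call with the token \verb|210mm| succeeds, while
\verb*|210 mm| is rejected, because \verb|strtod()| stops at the
blank and what remains is not a bare unit name.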
Here is the grammar, in standard Backus-Naur form:
%
\begin{verbatim}
program:
        statement

statement:
        assignment-statement
        compound-statement
        null-statement

assignment-statement:
        name = constant
        name : constant
        name constant

compound-statement:
        { statement-list-opt }

null-statement:
        ,
        ;

statement-list:
        statement
        statement ; statement-list
        statement , statement-list

constant:
        dimension-constant
        float-constant
        string-constant
        name

dimension-constant:
        float-constant dimension-unit

dimension-unit: one of
        bp cc cm dd in mm pc pt sp

string-constant:
        simple-string-constant
        string-constant
                simple-string-constant

simple-string-constant:
        " character-sequence-opt "
        ' raw-character-sequence-opt '

character-sequence:
        character
        character-sequence character

raw-character-sequence:
        raw-character
        raw-character-sequence
                raw-character

character:
        printing-character
        escape-character

raw-character:
        printing-character
        \'

printing-character: one of
        (note that " and \ are omitted,
        and ' may be specified by \'
        as well)
        ! # $ % & ( ) * + , - . /
        0 1 2 3 4 5 6 7 8 9 : ; < = > ?
        @ A B C D E F G H I J K L M N O
        P Q R S T U V W X Y Z [ ] ^ _
        ` a b c d e f g h i j k l m n o
        p q r s t u v w x y z { | } ~

escape-character:
        \ escape-code

escape-code:
        character-escape-code
        octal-escape-code
        hexadecimal-escape-code

character-escape-code: one of
        a b f n r t v \ ' "

octal-escape-code:
        octal-digit
        octal-digit octal-digit
        octal-digit octal-digit octal-digit

hexadecimal-escape-code:
        x hexadecimal-digit
        hexadecimal-escape-code
                hexadecimal-digit

octal-digit: one of
        0 1 2 3 4 5 6 7

hexadecimal-digit: one of
        0 1 2 3 4 5 6 7 8 9
        A B C D E F
        a b c d e f

name:
        letter
        letter extended-letter-sequence

extended-letter-sequence:
        extended-letter
        extended-letter-sequence
                extended-letter

letter: one of alphabetic or
        underscore characters
        A B C D E F G H I J K L M N O
        P Q R S T U V W X Y Z
        a b c d e f g h i j k l m n o
        p q r s t u v w x y z _

extended-letter: one of
        0 1 2 3 4 5 6 7 8 9
        A B C D E F G H I J K L M N O
        P Q R S T U V W X Y Z
        a b c d e f g h i j k l m n o
        p q r s t u v w x y z
        - . _
\end{verbatim}

For readers unfamiliar with programming language grammars, a short
explanation is in order. The beginning rules
%
\begin{verbatim}
program:
        statement

statement:
        assignment-statement
        compound-statement
        null-statement
\end{verbatim}
%
say that a \verb|program| is a \verb|statement|, and that a
\verb|statement| is either an {\tt assignment-statement}, or a
{\tt compound-statement}, or a {\tt null-statement}. Further rules in
turn define what each of these is. The last rule says that an
{\tt extended-letter} is a digit, letter, hyphen, dot, or underscore.

The characters permitted in {\tt extended-letter} are chosen
%
\begin{itemize}
\item to avoid conflict with characters otherwise significant in the
      grammar, and
\item to cover the most common filename syntax, so as to allow
      unquoted simple filenames to be collected as single constant
      name tokens for assignments.
\end{itemize}

This grammar supports two kinds of quoted strings.
The {\em normal\/} kind is delimited by double quotes, and inside it
are recognized all the escape sequences supported by the C language.
The {\em raw\/} kind is delimited by single quotes; only
escape-single-quote pairs are recognized inside it. This is more
convenient when it is necessary to have strings with several
backslashes, since it then avoids having to double all of them. Once
normal and raw strings are parsed, they are stored identically.

German \TeX{} styles often change the syntax of the quotation mark to
add an umlaut accent to the following character; users of such styles
can happily use the raw string form to avoid conflict.

Backslashes in literal strings and filenames pose a small problem for
the user, because \TeX{} will ordinarily try to interpret control
sequences triggered by backslashes in the argument of the
\verb|\special| command.

For filenames, IBM PC DOS is the only operating system that normally
would use backslashes, and then only as a directory separator. In
most cases, you should omit directory paths anyway, and rely instead
on the \CODE{DVIINPUTS} search path to let the drivers find the files
at run time; doing so will enhance document portability. If you still
wish to use a directory path in the \verb|\special| command, you can
exploit an unadvertised feature of PC DOS; namely, system calls
accept forward slashes as equivalent to backslashes, so you can use
forward slashes instead. This is normally not possible with PC DOS
commands that accept filenames on the command line, because their
simplistic parsing confuses such names with option switches.

Literal strings are therefore likely to be the only place where
backslashes may be unavoidable. Although it would have been possible
to choose an escape character other than backslash for such strings,
this would likely prove confusing to those users who are used to C
and \UNIX{}, where the backslash escape character is firmly
entrenched.
Fortunately, the solution is not difficult, because \TeX{} does not
have backslash hardcoded as a control sequence prefix; you can change
it by altering \TeX{}'s catcodes. Thus instead of writing something
like
%
\begin{verbatim}
\special{literal = "\033[I"}
\end{verbatim}
%
\noindent
which would raise a \TeX{} {\em Undefined control sequence\/} error,
you can instead write
%
\begin{verbatim}
{
\catcode`\@=0
\catcode`\\=12
@special{literal = "\033[I"}
}
\end{verbatim}
%
\noindent
This changes the \TeX{} control sequence prefix from backslash to
at-sign, and gives backslash a meaning that will not cause problems.
The surrounding braces ensure that the changes disappear when the
braced group is exited. The catcodes are of course ugly magic
numbers, so if you do this more than once, you should hide them in a
macro with a more meaningful name, and use that macro in place of the
first two lines in the group.

The grammar supports statement separators (rather than terminators),
and they may be either commas or semicolons. In a simple language, it
is convenient to allow both kinds of separators. Since there is a
null statement, the separator is optional after the last statement in
a sequence.

Drivers will supply an implicit brace pair surrounding the
\verb|\special| string retrieved from the \DVI{} file, to ensure that
multi-statement text looks like the compound statement required by
the grammar.

Finally, note that the assignment statement may use either the equals
or colon operator, or the operator may be omitted altogether. This
supports the common forms
%
\begin{verbatim}
keyword = value
keyword : value
keyword value
\end{verbatim}
%
Because the values have very limited syntactical possibilities, there
is no ambiguity created by this.

\section{The {\tt\char92special} language}

The preceding section defined the grammar for the \verb|\special|
language. We now need to define what keywords will be recognized.
As emphasized above, the language is {\em extensible}, and the parser
that I have implemented for it makes it very easy to add new keywords
{\em without touching a single line of the parser code itself}. For
example, only a short specification like
%
\begin{verbatim}
{
  {"include", 7},
  (symbol_value*)&spec.include,
  T_STRING
},
\end{verbatim}
%
needs to be added to a table to define a new keyword, together with a
small amount of code to do something with the value returned by the
parser for that keyword.

The current set of keywords recognized is given in the following
table:
%
\begin{center}
\begin{tabular}{llp{1.3in}}
\hline
Keyword & Value & Action \\
\hline
{\tt boundingbox} & string &
        Define the coordinates of the lower-left and upper-right
        corners of the box which bounds the figure input by an
        \verb|include| or \verb|overlay| command.\\
{\tt graphics} & string &
        Execute the generic graphics primitives in string (not yet
        defined).\\
{\tt include} & filename &
        Insert file contents with relative page positioning.\\
{\tt language} & string &
        Name the output-device language for which this
        \verb|\special| is intended.\\
{\tt literal} & string &
        Insert literal output device code.\\
{\tt message} & string &
        Supply an operator message to be sent to the terminal and log
        file.\\
{\tt options} & string &
        Not yet defined.\\
{\tt overlay} & filename &
        Insert file contents with absolute page positioning.\\
{\tt position} & string &
        Specify the reference point on an inserted figure which is to
        be mapped to the current page position.\\
\hline
\end{tabular}
\end{center}
%
In a series of assignment statements, the order of the keywords is
not significant, except that if duplicate keywords are specified, the
value of the last one is used.

It is not necessary to supply a final newline in the strings or
files; one will be provided implicitly to ensure correct parsing.
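
The idea of a keyword table that is kept apart from the parser can be
illustrated by a simplified C sketch. It is not the 3.0 driver code:
the entry layout is flattened relative to the specification shown
earlier, the names \verb|keyword_entry|, \verb|same_keyword|, and
\verb|assign_string| are invented for this example, and only
\verb|T_STRING| is a name that appears in the real table; the other
two type codes are assumptions.
%
\begin{verbatim}
#include <ctype.h>
#include <stddef.h>

typedef enum {
    T_STRING, T_NUMBER, T_DIMENSION
} value_type;

/* One table row (hypothetical layout) */
typedef struct {
    const char *name;    /* keyword text  */
    void       *storage; /* value goes here */
    value_type  type;    /* required type */
} keyword_entry;

/* Return 1 if a and b are the same
   word, ignoring letter case. */
int same_keyword(const char *a,
                 const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) !=
            tolower((unsigned char)*b))
            return 0;
        ++a;
        ++b;
    }
    return *a == *b; /* both at the end? */
}

/* Store a string value for a keyword.
   Returns 0 on success, -1 for an
   unknown keyword, -2 if the keyword
   does not take a string. */
int assign_string(
        const keyword_entry *table,
        size_t n,
        const char *name,
        const char *value)
{
    size_t k;

    for (k = 0; k < n; ++k) {
        if (!same_keyword(table[k].name,
                          name))
            continue;
        if (table[k].type != T_STRING)
            return -2;
        /* a later duplicate overwrites
           the earlier value */
        *(const char **)table[k].storage
            = value;
        return 0;
    }
    return -1;
}
\end{verbatim}
%
\noindent
In such an arrangement, adding a keyword means adding one row to the
table that the caller passes in; the lookup routine itself stays
unchanged.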
The \verb|graphics| keyword value is intended to be used to support a simple generic graphics language, yet to be defined. Such a language would make it possible to obtain simple line graphics on virtually any output device. The \verb|options| keyword value could be used to supply device-dependent information; no particular values have yet been defined in my 3.0 \DVI{} driver code. The \verb|message| string provides a means for operator communication; for example, % \begin{verbatim} message "Thesis bond paper for this job" \end{verbatim} % The message is sent verbatim to the terminal and the log file. The \CODE{position} keyword specifies a string that should contain two blank-separated words. The first should be one of \CODE{top}, \CODE{middle}, or \CODE{bottom}, and the second should be one of \CODE{left}, \CODE{center}, or \CODE{right}. These words may be abbreviated to a single letter if desired. Together, they select on the bounding box one of nine points (four corners, four edge centers, and the box center) which is to be placed at the \TeX{} current point. If this keyword is not given, the default is % \begin{verbatim} position = "top left" \end{verbatim} % \noindent The point selected by this keyword (or by default) will be the {\em reference point\/} for the insertion of graphics files. The following remarks are particular to \POSTSCRIPT{} devices, but the possible generalizations to others should be evident. Literal \POSTSCRIPT{} code from a file or a literal string is expected to be well-behaved, and preferably, should conform to Adobe's Encapsulated \POSTSCRIPT{} File format version 2.0 or later \cite{Adobe:epsf-spec}, and to Adobe's \POSTSCRIPT{} Document Structuring Conventions, version 2.0 or later \cite{Adobe:docstruct-spec}. 
It may contain a \verb|showpage|, which is disabled temporarily by the \DVI{} driver during the execution of the \verb|\special| strings, but it should not contain any of these operators: % \begin{center} \tt \begin{tabular}{lll} \hline banddevice & initgraphics & setdevice \\ copypage & initmatrix & setmatrix \\ erasepage & note & setpageparams\\ exitserver & nulldevice & setsccbatch \\ framedevice & quit & setscreen \\ grestoreall & renderbands & settransfer \\ initclip \\ \hline \end{tabular} \end{center} % If it does, erroneous output is virtually certain. While these commands could be disabled like \verb|showpage| is, Adobe's Encapsulated \POSTSCRIPT{} guidelines do not recommend doing so. The imported \POSTSCRIPT{} should write into its own dictionary if it needs to define objects. Because dictionary sizes must be specified when they are created, it is not possible to define a standard one in advance in the macros that mark the start and end of the imported \POSTSCRIPT{} (called \verb|SB| and \verb|SE| in \verb|dvialw|) to protect from corruption of the dictionary used by the \DVI{} driver. The \verb|language| keyword should specify \verb|"PS"| or \verb|"PostScript"|; letter case does not matter. If any other non-empty value is found, the \verb|\special| command is ignored by a \POSTSCRIPT{} driver, since it presumably applies to some other output device. However, if no \verb|language| keyword is given, the driver assumes it should process the rest of the \verb|\special| command. Files specified by \verb|include| and \verb|overlay| keywords are searched for in the \verb|DVIINPUTS| path. In the common {\tt include filename} case, the upper-left corner of the \POSTSCRIPT{} bounding box will be placed at the current point. 
The \POSTSCRIPT{} file must then contain (usually near the start) a comment of the form
%
\begin{verbatim}
%%BoundingBox: llx lly urx ury
\end{verbatim}
%
specifying the bounding box lower-left and upper-right coordinates in standard \POSTSCRIPT{} units (big points, 1bp = 1/72 inch). Alternatively, if the comment
%
\begin{verbatim}
%%BoundingBox: (atend)
\end{verbatim}
%
is found in the file, the last 4096 characters of the file will be searched to find a bounding box comment that specifies the coordinates of the two corners. The {\em last\/} such comment found is the one used; this requirement permits correct handling of inserted files that themselves contain nested \POSTSCRIPT{} files.

In the {\tt overlay filename} case, the \POSTSCRIPT{} file to be included will be mapped onto the page at precisely the coordinates it specifies, where the page origin is in the lower-left corner, $x$ increasing to the right, and $y$ increasing upward. Any \verb|%%BoundingBox| specification is ignored, since it is not required for positioning. This option might be used to print an overlay page. For actions that are to be done on every page, such as printing a logo, or a string like {\tt Draft} or {\tt Company Confidential}, it is more efficient to redefine the \POSTSCRIPT{} \verb|showpage| command instead.

If the \POSTSCRIPT{} file cannot be opened, or the \verb|\special| command string cannot be recognized, or, for relative positioning, the bounding box cannot be determined, a warning message is issued and the \verb|\special| command is ignored.

Numerous drivers already support \verb|\special| command strings of the form {\tt include filename}; this parser will recognize them.

A key point here is the \verb|language| keyword. If it is {\em not\/} given, the \DVI{} driver must assume that the \verb|\special| command is intended for it, and attempt to process it. Thus,
%
\begin{verbatim}
\special{include tiger.eps}
\end{verbatim}
%
will be handled as before.
However, when the \verb|language| keyword is found, its value determines whether the \DVI{} driver will process the \verb|\special|, or ignore it.

Every \DVI{} driver must recognize a generic language choice relevant to its output device, such as {\tt PostScript} or {\tt Epson}. In addition, each driver must recognize its own name as a language value.

The reason for this requirement is as follows. When startup files are supported, their names are derived from the driver names. In my 3.0 \DVI{} driver code, a driver named \verb|dvialw| will search for startup files named \verb|dvialw.ini| in a list of standard places. The default behavior of a particular driver can be changed merely by storing a copy of its executable program under a different name, and providing a corresponding startup file. Typically, this would be done to provide easy-to-use variants of a basic driver for different paper types, or different page orientations. If the user wishes to incorporate driver-specific \verb|\special| strings, permitting the \verb|language| value to be the driver name provides that flexibility.

Existing mini-languages for graphics, such as \verb|eepic|, \verb|epic|, \verb|tpic|, and \verb|xpic|, are properly handled using the \verb|graphics| and \verb|language| keywords together:
%
\begin{verbatim}
\special{language = "tpic", graphics = "..."}
\end{verbatim}

The \DVI{} Driver Standards Committee has debated whether drivers should issue warning messages about \verb|\special| commands that they are unable to process. In the absence of the \verb|language| keyword, I believe that such warnings are desirable, although the driver should provide an option to suppress them. However, when a \verb|language| value is found, it is important that the driver {\em silently ignore\/} those commands that it is not prepared to process.
The presence of that value is sufficient evidence to conclude that the user intends it to be ignored by some drivers, and certainly does not want those drivers to complain about it.

I expect that with more powerful, and standardized, \verb|\special| command support of the type described in this paper, use of \verb|\special|s will increase. Consider, for example, a document that makes heavy use of color or grey-scale requests via \verb|\special| commands; there could be hundreds, or even thousands, of them in a document of modest size. Were the driver to issue warnings for all of them, the terminal output or log file would be flooded with mostly useless warning messages that obscure much more important information. The \verb|language| value provides a standard means to prevent this.

\section{Paper specification}

Paper handling and specification is a complex issue, and may require future extensions. Thus, it is desirable to have a flexible means of specifying paper characteristics, and a reasonable scheme seems to be to use a small extensible language to define them. The assignment-statement language whose grammar was presented above is suitable for this purpose.

Some examples of the paper specifications supported by my 3.0 \DVI{} driver work were given earlier. Here, we define the keywords recognized.
\begin{center}
\begin{tabular}{llp{1.1in}}
\hline
Keyword & Type & Description \\
\hline
\CODE{dev_init} & string & initiate device use of paper\\
\CODE{dev_term} & string & terminate device use of paper\\
\CODE{height} & dimension & paper height\\
\CODE{output_order} & number & negative for printing last to first\\
\CODE{paper} & string & paper form name\\
\CODE{use} & string & name of copied paper form\\
\CODE{width} & dimension & paper width\\
\CODE{x_clip} & number & clip in x direction if non-zero\\
\CODE{x_left} & dimension & width of left unprintable margin\\
\CODE{x_right} & dimension & width of right unprintable margin\\
\CODE{x_origin} & dimension & horizontal offset of \TeX{} (0,0) point\\
\CODE{y_bottom} & dimension & \sloppy width of bottom unprintable margin\\
\CODE{y_clip} & number & clip in y direction if non-zero\\
\CODE{y_origin} & dimension & vertical offset of \TeX{} (0,0) point\\
\CODE{y_top} & dimension & width of top unprintable margin\\
\hline
\end{tabular}
\end{center}
%
The \CODE{paper} keyword defines a name that is used to tag the collected parameters. If the form name already exists, assignments will replace previous values. Otherwise, a new form is created.

The \CODE{use} keyword names an existing form whose parameters are to be copied to a new one named by the \CODE{paper} keyword in the same program. This copying happens {\em before\/} any of the other keyword assignments are done. The order of the statements in the program does not matter, because the results of the assignments are collected in a temporary form before copying to the specified form. Recursive form references are supported; just don't make them circular! The \CODE{use} keyword should normally be employed to make private modifications of standard form types.

Some printers misbehave if they are presented with data that are off the page, or too close to the margins; for example, the Hewlett-Packard LaserJet wraps such coordinates horizontally.
For such devices, the \CODE{x_clip} and \CODE{y_clip} values should be set non-zero.

Few printers place the (0,0) origin exactly in the upper-left page corner; instead, they have it slightly offset at some other point, which we call (\CODE{x_origin},\CODE{y_origin}). The standard \LaTeX{} file, \FN{testpage.\-tex}, can be used to determine the correct settings of these values. If you print its typeset output, the upper-left corner of the inner frame should be exactly one inch from the page edges. Suppose you actually find that the corner lies 0.75in from the left edge, and 1.1in from the top edge. This means the printer's (0,0) point is to the left of, and just below, the upper-left corner. Setting \CODE{x_origin = -0.25in} and \CODE{y_origin = 0.1in} will compensate, so the next time you print the test page, the inner frame should be correctly positioned.

Most printers are incapable of printing very close to the edges of the physical page; the margin values \CODE{x_left}, \CODE{x_right}, \CODE{y_bottom}, and \CODE{y_top} should be set to indicate the relevant limits. Sometimes these values can be found in the printer documentation. However, if the physical paper position relative to the printing mechanism is adjustable, as it is for most dot-matrix printers, you may have to experiment. If you print the \FN{testpage.tex} typeset output, the tick marks in the four margins will usually not print near the paper edges; use them as a guide to setting reasonable margin values.

\DVI{} drivers that require a page bitmap will allocate memory corresponding to the paper surface inside of these margins. Wide margin settings can therefore reduce the amount of memory required; that in turn can reduce the number of bitmap strips that must be processed for high-resolution printers, speeding the output.
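
As an illustration only, the test-page calibration and margin settings described above might be collected in a paper program such as the following sketch. The form names \verb|"letter"| and \verb|"letter-cal"| and the four margin values are hypothetical placeholders, not measurements from any real printer; only the two origin offsets come from the test-page example.
%
% Hypothetical form names and margin values; measure your own printer.
\begin{verbatim}
paper = "letter-cal";
use = "letter";
x_clip = 1;
y_clip = 1;
x_origin = -0.25in;
y_origin = 0.1in;
x_left = 0.25in;
x_right = 0.25in;
y_top = 0.2in;
y_bottom = 0.2in;
\end{verbatim}
%
\noindent
Because \CODE{use} copies the existing form before the other assignments are applied, only the values that differ from the base form need to be listed.
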
The standard \TeX{} and \LaTeX{} macro packages are parametrized to assume that the \TeX{} (0,0) point will be exactly one inch in from the left, and one inch down from the top. They also usually assume American paper sizes. Text widths and heights are then chosen to ensure identical top and bottom margins, and except for two-sided printing styles, identical left and right margins. While the \DVI{} driver \OPTION{x} and \OPTION{y} command line options can be used to adjust the output position, it is usually better to do so by setting paper parameters.

\begin{sloppypar}
For example, ISO A4 paper is 210mm (8.2677in) wide; \TeX{} macro packages assume 6.5in text width with 1in left and right margins. To center that text on A4 paper, the 1in margins need to be reduced by (8.5 - 8.2677)/2 = 0.1161in, so we could put \CODE{x_origin = +0.1161in}. Similarly, the A4 height of 297mm (11.6929in) exceeds the 11in U.~S. paper height, and requires adding (11.6929 - 11.0)/2 = 0.3465in to the top and bottom margins. That can be accomplished by setting \CODE{y_origin = -0.3465in}. Of course, if you already have non-zero values of these parameters, you will have to adjust them accordingly; just {\em add\/} the above offsets to the existing values.
\end{sloppypar}

If you routinely use non-American paper sizes, then you probably should be using a style file modification that accounts for the different page dimensions, rather than fiddling with paper positioning on your output device.

The \CODE{output_order} value should be set negative if you want pages printed from last to first. This provides an alternative to the \OPTION{backwards} command line option, but affects only the paper form types it is defined for. If \CODE{output_order} is negative, the \DVI{} drivers will simply flip the current setting of the backwards-printing switch, which may have already been set from the command line.
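
A minimal sketch of a paper program that combines the A4 centering offsets with reverse-order output follows. The form name \verb|"a4-centered"| is hypothetical, and the sketch assumes that the origin values were previously zero, so that the offsets computed above can be used unchanged.
%
% Hypothetical form name; offsets assume the existing origin values are zero.
\begin{verbatim}
paper = "a4-centered";
width = 8.2677in;
height = 11.6929in;
x_origin = +0.1161in;
y_origin = -0.3465in;
output_order = -1;
\end{verbatim}
%
\noindent
With such a form selected, also giving the \OPTION{backwards} option on the command line would flip the switch a second time, so the pages would again be printed first to last.
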
If the printer needs to receive some magic codes to select an alternate paper type (e.g.\ some high-speed laser printers support multiple input paper trays), it will be necessary for the \DVI{} driver to write them into the output file. The \CODE{dev_init} and \CODE{dev_term} strings provide for this. The \DVI{} drivers output the initialization string at the start of the job, and the termination string at the end. These are output verbatim with nothing added, not even a newline.

For example, if you are using the \POSTSCRIPT{} driver, \verb|dvialw|, on a system that does not have a \POSTSCRIPT{} printer spooler, you might want the end of the file to have the \POSTSCRIPT{} serial line job terminator character, \CTL{D}. You could arrange that by setting
%
\begin{verbatim}
dev_term = "\004";
\end{verbatim}
%
\noindent
in a paper program.

The \DVI{} drivers already know how to initialize and terminate their output devices under normal conditions, so you should rarely need to specify \CODE{dev_init} and \CODE{dev_term} values.

\bibliography{special}

\makesignature

\end{document}