By running pdflatex on fixltxhyph.dtx the user gets the .sty file,
and the English documentation file in pdf format.
\endpostamble
\askforoverwritefalse
\generateFile{fixltxhyph.sty}{t}{\from{fixltxhyph.dtx}{style}}
\def\tmpa{plain}
\ifx\tmpa\fmtname\endgroup\expandafter\bye\fi
\endgroup
%
%
% \fi
%
% \iffalse
%<*driver>
\documentclass{ltxdoc}
\ProvidesFile{fixltxhyph.dtx}[2012/04/02 v.0.4 Documented TeX file for the FixLtxHyph package]
\GetFileInfo{fixltxhyph.dtx}
\usepackage[latin1]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{lmodern}
\usepackage{color}
\usepackage{multicol}
\title{\centering The FixLtxHyph package\protect\\
A small fix in order to hyphenate emphasized words after a vocalic elision\protect\\
in Catalan, French, Italian, Romansh, and Friulan}
\date{\fileversion\space\filedate}
\author{Claudio Beccari}
\usepackage{array}
\usepackage{metalogo}
\def\prog#1{\textsf{#1}}
\begin{document}\errorcontextlines=9
\maketitle
\begin{multicols}{2}
\tableofcontents
\end{multicols}
\setlength\hfuzz{20pt}
\DocInput{fixltxhyph.dtx}
\end{document}
%</driver>
%
% \fi
% \CheckSum{84}
%
% \begin{abstract}
% This file fixes a small feature of the hyphenation algorithm used by the \TeX\ system
% typesetting engines that manifests itself only with those languages that use the
% apostrophe for marking a vocalic elision. This small package was set up to fix this
% little undesirable feature in Italian, but it was extended to Catalan, French, the
% future implementation of the fourth official Swiss language Rumantsch Grischun (Romansh
% in English) and the future implementation of the Regional Language Friulan, spoken
% and written in North Eastern Italy. This fix operates correctly with both \prog{pdflatex} and \prog{xelatex}.
% \end{abstract}
% \section{What is the feature to be fixed}
% The five languages Catalan, French, Italian, Romansh, and Friulan use the apostrophe
% to mark the elision of the final vowel of prepositions, articles,
% articulated prepositions, definite adjectives, and other words playing similar rôles when
% they immediately precede nouns, adjectives, verbs, or numerals that start with a vowel. Probably
% there are other languages that use the apostrophe in a similar way. I can easily upgrade
% this small package if \LaTeX\ users of other languages let me know about such languages.
%
% This feature is common to most Romance languages (from West to East), from Catalan and
% Valencian to French, Langue d'oc, Occitan, Provençal, Vivaroalpin, Italian, Piedmontese,
% Lombard, Romansh, Ladin, and Friulan; up to now only Catalan, French, and Italian are handled
% by the \TeX\ system programs. Most of these languages are minority languages
% that are protected by local legislation or supported by specific cultural or
% linguistic institutions. Romansh has a national/federal legal status in
% Switzerland and is used in legal and official documents throughout the Swiss
% Confederation, not only in its area of everyday use, the Kanton Graubünden or Canton
% Grigioni or Chantun Grischun (where seven Romansh varieties are spoken, besides
% Swiss German, Italian, and other languages). The Friulan language has an official
% regional status in the North-eastern Italian Region Friuli\,-Venezia Giulia.
%
% This spelling rule is very rigorous in French; I suppose it is also a rigorous rule in
% Catalan, Romansh, and Friulan, but I am not that familiar with these languages even if I
% can understand their written forms. In Italian it used to be a rigorous rule many years
% ago, but nowadays it is less frequently applied when plurals are involved.
% Nevertheless the apostrophe is practically the only non-alphabetic sign you see in an
% Italian text apart from punctuation and quotation marks.
%
% In order to hyphenate these word combinations correctly, all five languages have to
% declare the apostrophe, which has category code~12, as a character with a nonzero
% lowercase code. In fact all five languages declare:
%\begin{verbatim}
%\lccode`\'=`\'
%\end{verbatim}
% or something equivalent. With this little trick, the typesetting engine considers the
% apostrophe a valid word character and treats the whole string as a single word; the
% patterns of these languages, of course, also take the apostrophe into consideration, so
% that the correct line breaks are easily found:
%\begin{center}
%\begin{tabular}{l>{\ttfamily}ll}
%Catalan & d'aquesta     & d'a-ques-ta       \\
%French  & l'électricité & l'élec-tri-ci-té  \\
%Italian & dell'eleganza & del-l'e-le-gan-za \\
%Romansh & l'identitad   & l'i-den-ti-tad    \\
%Friulan & l'arbul       & l'ar-bul
%\end{tabular}
%\end{center}
%
% So where is the problem?
% It emerges when the second part of the string is emphasized,
% because in this case no hyphenation takes place:
%\begin{center}
%\begin{tabular}{l>{\ttfamily}ll}
%Catalan & d'\string\emph\{aquesta\}      & d'\emph{aquesta}      \\
%French  & l'\string\emph\{électricité\}  & l'\emph{électricité}  \\
%Italian & dell'\string\emph\{eleganza\}  & dell'\emph{eleganza}  \\
%Romansh & l'\string\emph\{identitad\}    & l'\emph{identitad}    \\
%Friulan & l'\string\emph\{arbul\}        & l'\emph{arbul}
%\end{tabular}
%\end{center}
%
% This behavior is easily explained, so it should be considered not a bug but a
% feature, albeit one that is annoying when using the five languages named above.
% The point is that all \TeX\ system typesetting engines consider a word to be the
% character string that starts after a character invalid in a word and ends at the
% first token invalid in a word. Notice that when the hyphenation algorithm comes into play
% the command |\emph| has already been expanded, and what remains of it is the selection of
% the emphasized font; therefore a string such as \verb*| d'aquesta | (starting after
% a space and ending before the following space) is made up of valid characters; but
% \verb*| d'\emph{aquesta} | is a ``word'' starting after a space and ending before
% a space, but containing a font change. And this makes the word invalid for hyphenation.
% The \TeX\-book is clear in this respect: ``If a suitable letter is found [as a starting
% character], let it be in font $f$. \dots\ \TeX\ continues to scan forward until coming
% to something that's not one of the following three ``admissible items'': (1) a character
% in font $f$ whose |\lccode| is not zero; (2) a ligature formed entirely from characters
% of type (1); (3) an implicit kern.
% \dots\ Notice that all these letters are
% in font~$f$.''
%
% This was a specific programming choice made by Donald~E.\ Knuth together with Frank
% Liang, his PhD student who developed the hyphenation algorithm implemented in the
% typesetting engines of the \TeX\ system\footnote{I have been told that Lua\TeX\ is
% developing a different algorithm that eliminates this feature.}.
% Like all such decisions, it is a compromise between accuracy and speed. And remember that
% at the beginning the program \prog{tex} was used essentially with English, a language
% that does not use accented letters and uses elision quite differently from the way
% discussed here. The problem did not exist and, I suppose, will never exist in
% English.
%
% \section{The solutions}
% As a compromise I decided to solve the problem automatically only when the second
% part of the ``word'' to be hyphenated is emphasized. I suppose this is the most frequent
% situation, although one can easily think of other situations; for example, the
% second part of such a ``word'' after the apostrophe is set in boldface, is colored, is set
% in another font selected on purpose or in another alphabet, or is in italics (with
% no automatic inclination switching); in such cases the solution is manual and remains
% manual, because there are too many possibilities and it is cumbersome to deal with all
% of them.
%
% But manual or automatic, how should we proceed? We simply must convince \TeX\ that the
% starting letter is not the first letter of the part preceding the apostrophe, but the
% first letter of what follows it.
% This is simple: it suffices to put an unbreakable, zero-width glob of glue after the
% apostrophe; \TeX\ starts looking for a potential starting letter after the glue.
% Therefore the manual solution consists in defining a short macro such as the following
% one:
%\begin{verbatim}
%\newcommand\hz{\nobreak\hskip\z@skip}
%\end{verbatim}
% or, if you want to avoid putting this short command into a personal \texttt{.sty} file,
% simply replace |\z@skip| with |0pt|. You will then have to modify the font changing
% phrase into something such as:
%\begin{verbatim}
%... d'\hz\textbf{aquesta} ...
%\end{verbatim}
% The |\hz| command, whose name recalls the phrase ``Horizontal skip of an unbreakable Zero width
% glob of glue'', finishes the preceding word and sets the ground for the search
% for the starting letter of another word; it will be found after the font selection code
% introduced in the horizontal list by the font change.
%
% The automatic solution, on the contrary, implies a small but substantial modification of
% the |\emph| command. In fact this text command uses the text declaration |\em|; in turn
% |\em| is a robust command, that is, it is defined as \verb*|\protect\em |: it would be
% very unwise to modify the outer, protecting command, so it is necessary to modify the
% inner \texttt{protect}ed one, and this operation is not trivial because of the space in this macro name.
% In any case, once we find out how, we must add |\hz| to the definition of \verb*|\em |
% so that it is executed before the text to emphasize is processed.
%
% This small package does exactly this, only for the five named languages, and only if
% they are used, and only with the |\emph| command. The |\hz| command is available to the
% user globally, so that when this package is loaded, the manual solution remains
% valid for every language, although the need for it arises only in very unlikely situations.
%
% It has been tested with the five languages with both \prog{pdflatex} and \prog{xelatex},
% and apparently it works as expected; it has been thoroughly tested in all situations with
% Italian; it should work properly also in French, in Romansh, and in Friulan.
% The adopted
% solution does not fiddle with active characters and therefore it does not interfere with
% the internal workings and settings of Catalan and the other languages.
%
% \section{Installation}
% With modern \TeX\ distributions these instructions are superfluous; should you need to
% install by hand, download the package from \textsc{ctan} into a scratch directory (create one
% if necessary, and delete it with its contents when you have finished) and run the file
% \texttt{fixltxhyph.dtx} through \prog{pdflatex}; you get two new files, the \texttt{.sty}
% file and the documentation in pdf format. Then proceed as follows:
%\begin{itemize}
%\item Move all the files to the directories listed below; if you don't already
% have those directories, create them.
%\item These directories should be created in your personal \texttt{texmf} tree; if you
% don't have one, create it; how to do this and where to root it depends on your operating
% system; before making any change to your hard disk, please read carefully the TeX Live
% or the MiKTeX documentation in order to find out what a personal tree is.
%\item Move \texttt{fixltxhyph.dtx} to \texttt{.../texmf/source/latex/FixLtxHyph/};
%\item Move \texttt{fixltxhyph.pdf} to \texttt{.../texmf/doc/latex/FixLtxHyph/};
%\item Move \texttt{fixltxhyph.sty} to \texttt{.../texmf/tex/latex/FixLtxHyph/};
%\item If your distribution requires it, refresh the file name database.
%\end{itemize}
% You are now ready to use the package by simply invoking it in the preamble of your
% documents:
%\begin{verbatim}
%\usepackage{fixltxhyph}
%\end{verbatim}
%
%\section{Acknowledgements}
%I wish to thank Lorenzo Pantieri, who tested the preliminary and the current versions of
% this package and directly or indirectly helped debug the code, especially in the
% preliminary version, which used active characters and was particularly buggy. Many
% thanks also to Enrico Gregorio, who spotted the protection problem of the |\em| command.
%
% \StopEventually{}
%
%\section{The documented code}
% We start by identifying the package and the necessary format file:
%    \begin{macrocode}
%<*style>
\ProvidesPackage{fixltxhyph}[2012/04/02 v.0.4 Small fix for hyphenating emphasized words preceded by vocalic elision]
\NeedsTeXFormat{LaTeX2e}[2011/06/27]
%    \end{macrocode}
% Then we make sure that the package \texttt{babel} or \texttt{polyglossia} has already
% been loaded; otherwise we warn the user and exit; no patches can be made to an unknown
% package.
%    \begin{macrocode}
\@ifpackageloaded{babel}{}{\@ifpackageloaded{polyglossia}{}{%
  \PackageWarning{FixLtXHyph}{This package must be loaded after babel or polyglossia}%
  \endinput}}
%    \end{macrocode}
%
% We need the package |etoolbox| in order to operate on control sequences
% that contain spaces in their names; we do not need any means to test if we are working
% with \texttt{babel} or with \texttt{polyglossia} because, thanks to the previous
% tests, one of the two packages has certainly been loaded.
%    \begin{macrocode}
\@ifpackageloaded{etoolbox}{}{\RequirePackage{etoolbox}}
%    \end{macrocode}
% We define a very short command |\hz| in order to have a handy command
% for inserting an unbreakable zero-width glob of glue in case we need to do some
% sort of patching by hand.
%    \begin{macrocode}
\newcommand\hz{\nobreak\hskip\z@skip}
%    \end{macrocode}
% We make patches only if one or more of the five languages Catalan, French, Italian,
% Romansh, or Friulan (or its alias Furlan) has been invoked as an option to \texttt{babel}
% or specified to \texttt{polyglossia}; if none of these languages has been selected,
% evidently the user was thinking of other details and missed the point that this patch is
% necessary only for the five languages mentioned above. In any case no harm is done
% if from now on nothing else happens, except for the definition of |\hz|, which remains
% available to the user.
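%
% As an illustration of the loading order required by the test above, a minimal user
% document might look as follows. This is only a sketch: the document class, the
% \texttt{babel} options, and the sample text are assumptions made for the example.
%\begin{verbatim}
%\documentclass{article}
%\usepackage[T1]{fontenc}
%\usepackage[english,italian]{babel}% Italian is one of the patched languages
%\usepackage{fixltxhyph}% loaded after babel (or polyglossia)
%\begin{document}
%Si parla dell'\emph{eleganza}; con il neretto serve
%la correzione manuale: dell'\hz\textbf{eleganza}.
%\end{document}
%\end{verbatim}
% Here the first elided word relies on the automatic patch of |\emph|, while the
% second one uses the manual |\hz| fix because |\textbf| is not patched.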
%
% The next bit of code defines some aliases in order to keep the original meaning of the
% declaration |\em| and to patch a copy of it, so as to be able to set the patched
% definition only for the five named languages and to restore the original one when
% a change of language takes place.
%    \begin{macrocode}
\letcs{\FLH@originalem}{em }
\let\FLH@newem\FLH@originalem
\preto\FLH@newem{\hz}
%    \end{macrocode}
%
% We then use a loop over a list of language names; if the
% language with one of the listed names has been invoked as an option to \texttt{babel},
% or specified to \texttt{polyglossia}, then the patched \verb*|\em | definition is made
% the default, while when changing language the original definition is restored; we
% define a macro that contains the language names:
%    \begin{macrocode}
\def\@tempB{catalan,french,italian,romansh,friulan,furlan}
%    \end{macrocode}
% then we perform the loops mentioned above; we have to distinguish whether we are using
% \texttt{polyglossia} or \texttt{babel} because the internal setting and resetting macros
% of these two packages have different names; they are all made up by joining
% a prefix to the language name, so we have to build the names of the macros to patch with
% the usual deferred name construction by means of |\expandafter| and |\csname| with its
% companion |\endcsname|.
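%
% To make the name construction concrete: assuming that \texttt{babel} is in use, that
% |\@tempA| currently holds |italian|, and that |\captionsitalian| is defined, one
% iteration of the loop coded below should amount to the following explicit
% instructions. This is an illustrative expansion, not additional package code.
%\begin{verbatim}
%\addto\extrasitalian{\cslet{em }{\FLH@newem}}
%\addto\noextrasitalian{\cslet{em }{\FLH@originalem}}
%\end{verbatim}
% With \texttt{polyglossia} the same iteration would instead address
% |\noextras@italian|, |\blockextras@italian|, and |\inlineextras@italian|.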
%    \begin{macrocode}
\@ifpackageloaded{polyglossia}%
{\@for\@tempA:=\@tempB\do{%
  \expandafter\ifx\csname captions\@tempA\endcsname\relax\else
    \expandafter\addto\csname noextras@\@tempA\endcsname{\cslet{em }{\FLH@originalem}}%
    \expandafter\addto\csname blockextras@\@tempA\endcsname{\cslet{em }{\FLH@newem}}%
    \expandafter\addto\csname inlineextras@\@tempA\endcsname{\cslet{em }{\FLH@newem}}%
  \fi}%
}{\@for\@tempA:=\@tempB\do{%
  \expandafter\ifx\csname captions\@tempA\endcsname\relax\else
    \expandafter\addto\csname extras\@tempA\endcsname{\cslet{em }{\FLH@newem}}%
    \expandafter\addto\csname noextras\@tempA\endcsname{\cslet{em }{\FLH@originalem}}%
  \fi}%
}
%    \end{macrocode}
%
% This documented file is now terminated and its final commands are issued.
%    \begin{macrocode}
\endinput
%</style>
%    \end{macrocode}
%
% \Finale
% \endinput