THE GEMINATED EL


by Marc Antoni Malagarriga i Picas, last modified 17/10/2014 14:28
Presentation by Oriol Moret Viñals at the 58th ATypI Conference (Barcelona, September 20th 2014)




(* Text pending thorough English revision.)


* Good morning. I am here to talk about the geminated el, a rather troublesome Catalan character.

* I speak as part of the Working Group for the Typographic Standardisation of the Geminated El. The name should not mislead anyone: we are a non-profit group (not a team), meeting on an occasional basis, with no official status, no funding, almost nothing at all. We do have a modest website… in Catalan only.

* The group includes people from academic and professional backgrounds in areas such as Philology, Graphic Design and Typography, and Engineering and Computing. Marc Antoni is in the audience; he is a software developer specialized in PostScript and PDF technologies, and he inspired both the project and the working group back in 2004. I lecture in Graphic Design and Letterpress Typography at the Faculty of Fine Arts, University of Barcelona.

* Our presentation is divided into three sections. Section 1 gives a chronological overview of the issue; Section 2 focuses on the major problems around the character today; Section 3 is a sort of assessment of past and future work.


PDF slides





* Apparently, a geminated el is something like this —in upper case and lower case.

* We may consider it “genuine” and “new”.



* By “genuine” we mean “specifically, exclusively Catalan” —no other language makes use of the geminated el.

* The slide shows some data about Catalan. They should give an idea of how “local” the scope is.



* The geminated el is a “new” character. It was officially introduced in 1913, in the Orthographic Norms issued by the academic Institute of Catalan Studies to regulate and standardise the modern usage of Catalan.

* The new geminated el was adopted to satisfy etymology and pronunciation requirements.

* Its graphic representation (its “glyph”) was intensely debated, too. A compromise was eventually reached in a provisional solution: “two els with a hanging dot between them…”



* … let us stop a minute to stress some significant features.

* This means that this hanging dot acts as a diacritic.

* And this means that the geminated el does NOT represent a single phoneme, but two —each el belonging to a different syllable.

* Therefore, the geminated el is not a digraph.

* And therefore, the geminated el —always intervocalic, never at the beginning or end of a word— can be broken into syllables: at the end of a line, the dot “disappears” and “turns into” a hyphen, each el stands within its syllable.

* (These are some of the reasons why the Institute (IEC) today defines the geminated el as a “modified group of letters”.)
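The line-break behaviour just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `break_at_geminated_el` is hypothetical, the geminated el is assumed to be typed as el + U+00B7 + el, and nothing else about Catalan hyphenation is covered.

```python
import re

# el + middle dot (U+00B7) + el, in either case
GEMINATED_EL = re.compile("[lL]\u00b7[lL]")


def break_at_geminated_el(word):
    """Split a word at its first geminated el for an end-of-line break.

    Hypothetical helper: returns (end_of_line, start_of_next_line), or None
    when the word contains no geminated el.
    """
    match = GEMINATED_EL.search(word)
    if match is None:
        return None
    # The first el stays on the current line and takes a hyphen; the dot
    # "disappears"; the second el opens the next line.
    return word[: match.start() + 1] + "-", word[match.end() - 1 :]


print(break_at_geminated_el("col\u00b7legi"))  # ('col-', 'legi')
print(break_at_geminated_el("casa"))           # None
```

Each el ends up within its own syllable, which is exactly what a digraph could not do.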



* We can go back to the provisional graphic solution now: “two els with a hanging dot between them…”

* and it goes on: “…the els being as close to each other as possible, like in a double el.”

* so they must be perceived as being part of a single word (no gap, not a compound word).

* Whether an indication or a plain rule, the academicians posed the typographic problem.



* The geminated el has such a specific usage; stands for such a (double) sound; has such typographic properties.

* That provisional solution would never be modified: it became definitive, still in current use.



* So, orthographic norms implied typographic “norms” for the new character. The geminated el should be “accordingly” cast as a single piece, would become part of any “normal” (standard) Catalan fount; the single “geminated-el-piece” should be “normal” in all type composition techniques —for example, letterpress and linotype.



* But not in typewriters.

* The character-width standard could not reasonably be met in monospaced type (at least in the upper case: the lower case should be no more difficult than /m/).

* So by the 1930s manufacturers like Olivetti devised another way to type the geminated el: they cropped it into /L·/ or /·L/.

* We will not discuss the character (/L·/ or /·L/) or the key on the keyboard.

* Frivolous as it may seem, it would have wider consequences in the long run.



* Then a long gap follows: the Spanish Civil War, the ban on Catalan, the end of the dictatorship…; even the birth and spread of photocomposition… All in just one slide. (Meaning this is a dark period in our story, too.)



* We move to the 80s and 90s.

* Catalonia and Catalan have regained some autonomy and recognition on the map of the new Spanish democracy.

* The electronic era is on its way. We mentioned Olivetti; we must mention IBM now. They operated on Catalan territory, too, and at that time they were the biggest here in data handling. The thing is, some local IBM engineer was stubborn enough to bring the geminated el to the forefront and eventually succeeded in getting approval to encode one little bit of it: the middle dot, 00B7. (Don't laugh. We may thank him for that middle dot. Otherwise we could have been left with no geminated el at all.)

[* The middle dot entered the IBM 7-bit chart (1984) as a “diacritic”; it would feature in the Latin-1 table (the 8-bit extension of ASCII) from 1989 and later in Unicode charts: it became part of standard tables and keyboards.]

* In 1985, a Royal Decree establishes regulations for Spanish keyboards in (data transmission equipment and) electronic typewriters: the text specifies that the geminated el may be typed through two key combinations: /el-dot/ + /el/; or /el/ + /middle-dot/ + /el/ (though the related figure was not that clear).

* Even if that was not code as we know it, the basic story could go like this: the geminated el, “un-coded” as a single character, would only be coded in cropped manner. There were two ways of cropping it: Olivetti's /el-dot/ and IBM's /middle-dot/ —both ways had some sort of “legal status”; the geminated el had not.



* New regulations come with Unicode. In October 1991, the Unicode Standard, Version 1.0 (Volume 1) is issued. Its Character Code Charts include /el-dot-uppercase/ and /el-dot-lowercase/ (in Latin Extended-A), and /middle-dot/ (in Latin-1 Supplement). /el-dot-uppercase/ is given some (shocking) design recommendation; /el-dot-lowercase/ is explicitly acknowledged as “Catalan”; /middle-dot/ shows multiple use.

* They could have encoded /the-geminated-el/ as a single character: characters were no longer tied to mono-spacing limitations. They did not. Why?

* Unicode must have collected officially recognised character encodings. And, according to Spanish encodings, the geminated el only existed in cropped form, in two alternative versions. So Unicode did just that: they gave codes to existing characters only. (And to traditional ligatures, but the geminated el was no such thing, either.)

* Therefore, the cropping of that local character, which had acquired national status, was now an international standard: unique, unified, universal.

* No other language uses the geminated el. No language uses /el-dot/. /el-dot/ accounts for nothing.

/middle-dot/ is used in other languages and contexts, but differently, for other purposes: “consequently”, Unicode encoded it within the “Punctuation: others” category and did not keep the original IBM “diacritic” status.

The issue had become two-sided, there were two problems. Two partial codes for the “same” (uncoded) character made the situation twice awful.

* Let’s just say that the whole affair was a mistake. And Unicode was not the only one to blame. Nobody here said a word. /el-dot/ was there to stay, everywhere.
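For those who want to look at the three code points themselves, here is a small sketch using Python's standard `unicodedata` module. The helper name `describe` is ours, purely for illustration; the names and categories come from the Unicode database shipped with Python.

```python
import unicodedata


def describe(char):
    """Return (code point, Unicode name, general category) for one character."""
    return (
        f"U+{ord(char):04X}",
        unicodedata.name(char),
        unicodedata.category(char),  # Lu/Ll = letter; Po = "Punctuation, other"
    )


# /el-dot-uppercase/, /el-dot-lowercase/ and /middle-dot/
for char in ("\u013f", "\u0140", "\u00b7"):
    print(*describe(char))
```

The two /el-dot/ characters are classed as letters, while the middle dot falls under “Po”, the “Punctuation: others” category mentioned above.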



* This sounds too dramatic. /el-dot/ would hardly be a usual character in a digital font set —at first (PostScript Type1 and TrueType). Later on, with OpenType, it could have had some more impact.

* Be that as it may, the point is that Unicode had given /el-dot/ (the crippled geminated el) a code and it would feature in basic character charts, whereas the “true” geminated el had not and would not. Ever.

* /the geminated el/ would remain an exception, as in Adobe's IBM Courier (1989). The single character would enter the referential Adobe Glyph List (still in use today), but it only made it to the private area in Unicode: LL U+F6BF; ll U+F6C0. (The fact should give us clues about the two ways in which a glyph is identified: by name in PDF technology; by scalar value on the web platform.)

* By the way, that Courier experiment stood as a one-off. (Yes, everyone in the room thinks the upper case looks awful; but it had a code, you know.)



[* Last slide of Section 1. I will not keep to chronological order, here.]

* A (new) Royal Decree from 1993 repeals those 1985 regulations to bring Spanish keyboards into line with European standards. Result: the geminated el is swept away from this map, too. Without the /el-dot/ key, it can only be imagined through the /middle-dot/ key.

* /el-dot/ keeps its hidden place inside the code. But it will wane over time. In 2007, Unicode Standard 5.2 tags /el-dot/ as a “deprecated character”. The alternative, the “preferred representation for Catalan [legacy]”, is /el/ + /middle-dot/, conforming to ISO (6937) compatibility.

The “true” geminated el still does not exist; and the standard geminated el has no alternative anymore: it is a compound of two unique 004C (or two unique 006C) and a unique, but multi-use, 00B7 between them.
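The move from /el-dot/ to /el/ + /middle-dot/ is recorded in the Unicode data as a compatibility decomposition, and can be applied to existing text. The following is a minimal sketch using Python's standard `unicodedata` module; the function name `replace_el_dot` is hypothetical. It replaces only the two /el-dot/ characters, because running a full NFKC normalization over a text would also rewrite many unrelated compatibility characters.

```python
import unicodedata

MIDDLE_DOT = "\u00b7"


def replace_el_dot(text):
    """Rewrite /el-dot/ (U+013F, U+0140) as el + middle dot (U+00B7)."""
    return text.replace("\u013f", "L" + MIDDLE_DOT).replace("\u0140", "l" + MIDDLE_DOT)


# The Unicode database describes the same mapping as a compatibility
# decomposition, so NFKC normalization of the single character agrees.
print(unicodedata.decomposition("\u0140"))                      # <compat> 006C 00B7
print(unicodedata.normalize("NFKC", "\u0140") == "l\u00b7")     # True
print(replace_el_dot("inte\u0140ligent") == "intel\u00b7ligent")  # True
```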

* Nothing seems to worry anyone; nobody cares.

* Well, not quite. From the nineties onwards, a few local voices start to draw some attention to the geminated el of digital times. Complaints grow and crystallize on new platforms such as the World Wide Web. There, in 2005, ICANN approves the Internet domain /.cat/, and /.cat/ allows the registration of names that bear “Catalan glyphs” like accented vowels, ce cedilla… and the geminated el. (Or something of the kind: they also offer /el/ + /hyphen/ + /el/ as an alternative to the geminated el for tokenization reasons.)

* Touchscreen keyboards follow suit. In the worst of cases, here —Catalan configuration provided— the middle dot appears as an option in the /dot/ key. In the best of cases, the one-piece geminated el appears as an option in the /el/ key. Some might say that this is all very well, sufficient, more than we had, even a lot. But in either case they are only options —alternatives to widely acknowledged characters.





* [This should do as an overall view of the issue; we can move on now to tackle the main problems.]

* But first let’s retrieve summary note #1: The geminated el has such a specific usage; stands for such a (double) sound; has such graphic properties.

* As to pronunciation, let's say that we generally stress it less than we used to.

* As to orthography, the character remains —growing strong in neologisms.

* Catalan still exists. The geminated el still exists. We still have to use the geminated el (in (type-)written Catalan). But how can we carry on using it?

* The question is easy, the answers are not. The whole thing can be simplified into three problems and three users, but none of them is one-way only: all of them intertwine and complement each other at different levels.

* This is a bit loose so let’s have some clues. It is often bad-looking; nowhere to be seen on keyboards; cropped in code —unknown to many.



* To most everyday users, the geminated el is a nuisance.

* They have to type a character that is nowhere to be found on the keyboard. They must “make” it by pressing a combination of keys on a 1993 European standard keyboard.

* (The following examples may seem strange but they are not unusual.) Users may be capable of knowing what a geminated el is and where to use it, but they might have some trouble in typing it properly. The environment does not help: the user is exposed to all kinds of geminated els, right and wrong —from both private and official “literature”.

* So the user can choose one of the following options or combinations:

1. /L/ + /./ + /L/

2. /L/ + /-/ + /L/

3. /L/ + /•/ + /L/.

The first two are easy, no need to press the shift-key. The third option is just silly [MSWord: see below]. None of them is a geminated el, all three of them are plain mistakes.

4. /L·/ + /L/. The fourth option assumes the user knows where to find /L·/. But even then, the user would be overlooking Unicode's advice, and might run into line-partition mistakes: /l·-/.

5. /L/ + /·/ + /L/. The best option at present.

* Option 5 is the best of them all, but that does not mean it is good. It may conform to code (it does not make use of /L·/) and orthography (it makes use of the hanging dot). But, like every other option (with the occasional exception of 4), it will always exceed the original width recommendation, “LL”. Therefore, no matter what the option, the resulting geminated el will never be right.
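The five typed forms can be told apart mechanically. The sketch below only detects which option appears in a word; the pattern table and the name `classify` are our own illustration, not an existing tool. It deliberately does not "correct" anything, since sequences such as el + hyphen + el or el + full stop + el may be legitimate in other contexts.

```python
import re

# One pattern per option in the list above (option number -> pattern).
OPTIONS = {
    1: re.compile(r"[lL]\.[lL]"),           # el + full stop + el
    2: re.compile(r"[lL]-[lL]"),            # el + hyphen + el
    3: re.compile("[lL]\u2022[lL]"),        # el + bullet (U+2022) + el
    4: re.compile("[\u013f\u0140][lL]"),    # /el-dot/ (U+013F, U+0140) + el
    5: re.compile("[lL]\u00b7[lL]"),        # el + middle dot (U+00B7) + el
}


def classify(word):
    """Return the numbers of all options found in `word`, in ascending order."""
    return [number for number, pattern in OPTIONS.items() if pattern.search(word)]


print(classify("col.legi"))        # [1]
print(classify("co\u0140legi"))    # [4]
print(classify("col\u00b7legi"))   # [5]
```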

[* This is all rather depressing. The everyday user would be better off if keyboards had a geminated-el-key, and cannot be bothered about “fancy” matters like width recommendations: that's the font designer's affair.]



* The font designer steps in.

* Let us consider a meticulous font designer who wants to design a complete font —everyone should know by now that “no Latin font will be complete without a geminated el”.

* The font designer might not know what a geminated el is —but we said he/she was meticulous: so he/she will check Unicode charts in a font editor to assist him/her; and, being meticulous, he/she will make sure the charts are up-to-date, that is, at least later than 2007 —when /el-dot/ is dismissed as “deprecated”.

* Let's give a couple of warnings before we carry on:

* Warning #1: this is not going to be a quick-and-easy [-ready,-dirty] practical lesson on how to design the geminated el.

* Warning #2 goes like this:



* (I hope it is clear enough.) PLEASE DO NOT USE /el-dot/ (in any case —upper, lower).

(We had some internal discussion on the verb to choose: “use” seems broad and general enough to involve all users.)

* That's for the whole room, not only Catalan-minority designers, but also all ATypI friends, big companies and sponsors: BauerTypes, Glyphs, Google, Adobe, Monotype, Apple and Microsoft…

You all know “Unicode tagged them ‘deprecated characters’ in 2007”.

(If you are not happy with that, we can give you some more reasons:

it can cause accessibility problems;

it may not respond to line-partition properly;

and, no matter how nicely drawn it is in itself, it will often look awkward in context (within a word), even though the original width standard is kept.)

* What then?



* To start with, forget /el-dot/ and stick to /el/-and-/middle-dot/ or, sorry, /periodcentered/.

* /el/ and /periodcentered/ are common characters. Every font designer draws the glyphs in all variants: caps, small caps, lowercase; roman, italic, bold…; serif, sanserif, script… —as many variants as we use the geminated el in.

* But it is pretty unlikely that many font designers will draw them with the geminated el in mind. This is no insult. It is only a way to consider design “recommendations” —whether simple “recommendations” or strict “norms”.

* Take the original character width recommendation of 1913, for instance: “similar to” or “equal to” /double-el/?

[* If font designers ignore any of these, we may have “badly-spaced” geminated els.]

* No recommendation was put forward for the hanging dot back then: how and where it should be placed. We only know that, years later, Unicode recommended what it should be: 00B7.

[* If font designers follow this, we may also have “oversized, misplaced, uneven” hanging dots —“bad” geminated els. (At present, we will not consider any other hanging dot than 00B7.)]

* What then #2?



* So the meticulous font designer might go back to the original standard and think of drawing the geminated el as a single glyph, nicely spaced (width), nicely balanced (hanging dot).

* Our font designer knows such a “true” geminated el can only be a cheat. Its glyph is a composite of OpenType features that link to real characters —but it has no actual encoding itself.

* [Incidentally, this might raise a few questions. Which name should it have? Which cell should it be in? There is not much point (instead of “no point”) in asking which code it should have: no true geminated el has an official code, it is still out of code.]
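To make the “cheat” concrete, here is a toy model in Python of what such a substitution does. It is not OpenType code and makes no claim about how any particular font implements it: the function `shape` and both tables are hypothetical, and the glyph names Lgeminada and lgeminada are simply the internal names some designers already use. The point is that the text keeps its three real characters; only the sequence of glyphs changes.

```python
# Toy model of a ligature-style substitution; an illustration only.
SUBSTITUTIONS = {
    ("L", "periodcentered", "L"): "Lgeminada",
    ("l", "periodcentered", "l"): "lgeminada",
}

# Characters whose glyph name differs from the character itself.
GLYPH_NAMES = {"\u00b7": "periodcentered"}


def shape(text):
    """Map characters to glyph names, merging el + dot + el into one glyph."""
    glyphs = [GLYPH_NAMES.get(char, char) for char in text]
    shaped = []
    i = 0
    while i < len(glyphs):
        triple = tuple(glyphs[i : i + 3])
        if triple in SUBSTITUTIONS:
            shaped.append(SUBSTITUTIONS[triple])  # three glyphs become one
            i += 3
        else:
            shaped.append(glyphs[i])
            i += 1
    return shaped


print(shape("col\u00b7legi"))  # ['c', 'o', 'lgeminada', 'e', 'g', 'i']
```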



* Deep problems come from code. They have nothing to do with (and go far beyond) aesthetics, perception or design standards. They are (mal)functioning problems: it is not that the geminated el is poorly done; it is rather that it does not work properly, or that it does not work at all.

* We may now retrieve our picture gallery and conclude that mistakes become problems within the code arena. Even if there are no mistakes, geminated els can cause trouble.



* As if there was a code story behind every image: problems derive from wrong codes or data type mismatch. Here's a selection of a few —real— examples:

* If you copy-pasted a geminated el from MSWord onto a web editor, the middle dot would turn into a bullet. (The two characters probably have the same code in different character tables: code remains, it is tables that change.)

* If you were browsing geminated els on a database, you would get all kinds of cropped results, or even none: the middle dot might not have been indexed, the middle dot might be “unsearchable”.

* If you were a Gmail or Outlook user linking to a geminated-el-d site/domain or address, you would never get there: the link would break just before the middle dot. In other words, Google and Microsoft ignore officially approved ICANN character tables.

* If your company name had a word like “Intel·ligència”, Intel could well sue you: they could interpret that single word as two words.

* If your written name had a geminated el, it would be ignored or altered (l.l) in the official Spanish census. And if you had an electronic ID card or passport, well, we do not know what could happen exactly: there is no standard on how to flatten the geminated el.

* If…
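Several of these cases come down to tokenization: software that treats the middle dot as punctuation will cut the word in two. A minimal sketch in Python, using only the standard `re` module (the function names and the dot-aware pattern are our own assumptions, not how any of the products mentioned actually work):

```python
import re

MIDDLE_DOT = "\u00b7"

# A common, naive definition of a "word": a run of letters, digits or "_".
NAIVE_WORD = re.compile(r"\w+")
# A variant that lets a middle dot sit between two runs of word characters.
DOT_AWARE_WORD = re.compile(r"\w+(?:" + MIDDLE_DOT + r"\w+)*")


def naive_tokens(text):
    return NAIVE_WORD.findall(text)


def dot_aware_tokens(text):
    return DOT_AWARE_WORD.findall(text)


word = "intel" + MIDDLE_DOT + "ligent"
print(naive_tokens(word))      # ['intel', 'ligent']
print(dot_aware_tokens(word))  # ['intel·ligent']
# /el-dot/ (U+0140) is classed as a letter, so it does not split the word.
print(naive_tokens("inte\u0140ligent"))  # ['inteŀligent']
```

The last line shows why /el-dot/ can look tempting to a programmer, and why the problem cannot be judged by appearance alone.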



* We can close this section now with

* Summary Note #2:

* In our digital days of (en)coded text, the geminated el gives way to problems that are far more complex than in past times: not only graphic, but also of accessibility and tokenization.





* Time to collect things, now.

* We would like to have a proper geminated el, a standard geminated el: the glyph, the key, the code. We may be asking too much. We do what we can, we keep on trying —now we have turned up at a Type Conference and told people here, as if there was any chance to improve things.

* But the problem remains, three-sided —typography, accessibility, tokenization.



* The glyph comes first. It seems it is becoming popular amongst local font designers to draw it as a “single piece”, keeping more or less to the original 1913 standard guidelines (so we are happy). A features composite, of course —perhaps temporary and patchy solutions, but good enough when lacking a real code.

We can only suggest that they name it “consistently”. Internal glyph name: /Lgeminada, /lgeminada, as some already do, with a view to possible future standard compatibility.

* And, by all means, any of these is better than /el-dot/ —should we say it again,


* As to the key, let's not expect much in the near future. However, there have been a few improvements, as touchscreen keyboards from Apple and Firefox show. True, they are option keys, alternatives within custom keyboards, alternatives to standard keys, but they are real options.

* We have the code left… It is a tricky matter that changes depending on the platform: web or PDF. At present, there seems to be no other standard solution than fitting it into the Private Use Area of Unicode, somewhere between E000 and F8FF (a code or a Named Character Sequence). But there is still a lot more to consider before we reach a satisfactory result, and that should also imply further discussions with Unicode, hoping to get some response…
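What “private” means in practice can be checked with a short Python sketch. The range is the one quoted above, and the two code points are the Adobe Glyph List assignments for LL and ll mentioned earlier in this talk; the helper name `in_private_use_area` is hypothetical.

```python
import unicodedata

PUA_START, PUA_END = 0xE000, 0xF8FF  # Private Use Area range quoted above


def in_private_use_area(code_point):
    """True if the code point lies in the Private Use Area E000..F8FF."""
    return PUA_START <= code_point <= PUA_END


for glyph_name, code_point in (("LL", 0xF6BF), ("ll", 0xF6C0)):
    char = chr(code_point)
    print(
        glyph_name,
        f"U+{code_point:04X}",
        in_private_use_area(code_point),
        unicodedata.category(char),    # 'Co' = "Other, private use"
        unicodedata.name(char, None),  # None: private-use code points have no name
    )
```

A private-use code point carries no name and no agreed meaning, so anything stored there is only understood by software and fonts that share the same private agreement.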



* … that would finally stop us asking these questions on the screen. No official institution or big company sees much business in the geminated el, so they do not care much or have not cared much —we know that for sure. So we may turn once again to our ground —users, font designers, programmers, developers.

* The geminated el might only be a local issue, but it might be like some of yours. And that is why we have come to tell you about it. Even if it is only at a type conference on a Saturday morning. Thank you.