A proposal for supplementary characters in Unicode: Medieval Nordic
Subrange 2: Diacritical characters
As explained in
the introduction to this proposal, Unicode contains
a large number of precomposed characters in the
Latin alphabet, but is very reluctant to accept any
new additions. This subrange has therefore been
greatly reduced in the present version, and now
only contains 9 characters, listed in sections 2.1,
2.2 and 2.3.
The remaining
composite characters are discussed in section 2.4
and are no longer included in the proposal.
However, since the inventory of combinations is of
interest to font developers, lists of expected
combinations are included in this section.
Also note that
characters for metrical analysis (i.e. characters
with combinations of macron, acute and breve) have
been moved to subrange 11.1.
2.1 Characters with a loop
In Old Norse
manuscripts, the vowel "o" may have a loop. The
loop should probably be interpreted as a reduced
form of the character "e" or "a", but the resulting
character ought to be distinguished from the
ligatures "oe" and "ao". The loop may be placed in
a high position to the right, or in a low position
to the left.
There is no
combining loop in Unicode 3.2. Since the loop only
appears in connection with the vowel "o", it is
probably best to encode each "o" with a loop as a
separate character rather than to add a new
combining mark.
| Glyph | Entity | Unicode | Descriptive name |
|  | &olll; | 0000 | LATIN SMALL LETTER O WITH LOWER LEFT LOOP |
|  | &Olll; | 0000 | LATIN CAPITAL LETTER O WITH LOWER LEFT LOOP |
|  | &ourl; | 0000 | LATIN SMALL LETTER O WITH UPPER RIGHT LOOP |
|  | &Ourl; | 0000 | LATIN CAPITAL LETTER O WITH UPPER RIGHT LOOP |
2.2 Characters with complex diacritics
The character "y"
may in some Icelandic sources appear with a
combination of a dot above and an acute accent,
placed side by side. There is no single combining
diacritical mark of this type, and since this
combination only appears with the character "y", it
is suggested to introduce a composite character
rather than a new combining mark.
The character "o
ogonek" is included in Unicode 3.2 in the range
Latin Extended-B (01EB and 01EA). However, the
accented form is not included, and must be encoded
as a combination of LATIN SMALL LETTER O WITH
OGONEK (01EB) or LATIN CAPITAL LETTER O WITH OGONEK
(01EA) and COMBINING ACUTE ACCENT (0301). These two
characters have in my opinion the most prominent
position of all characters in this proposal, since
they are the only characters needed for the display
and printing of normalised Old Norse orthography
that are not included in a composite form in
Unicode. In view of recent additions of composite
characters in Unicode, such as several characters
in Latin Extended-B
(01F8, 01F9, 0218, 0219, 021A, 021B, 021E, 021F,
0226, 0227, 0228, 0229, 022A, 022B, 022C, 022D,
022E, 022F, 0230, 0231, 0232, 0233), these two
composite characters ought to be seriously
considered for inclusion.
| Glyph | Entity | Unicode | Descriptive name |
|  | &ydaac; | 0000 | LATIN SMALL LETTER Y WITH DOT ABOVE AND ACUTE |
|  | &Ydaac; | 0000 | LATIN CAPITAL LETTER Y WITH DOT ABOVE AND ACUTE |
|  | &ohbrac; | 0000 | LATIN SMALL LETTER O WITH HOOK BELOW RIGHT AND ACUTE |
|  | &Ohbrac; | 0000 | LATIN CAPITAL LETTER O WITH HOOK BELOW RIGHT AND ACUTE |
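The encoding of the accented "o ogonek" discussed in this section can be illustrated with a short Python sketch. It assumes Python's standard unicodedata module, whose character database is newer than Unicode 3.2, and the helper name code_points is introduced here for illustration only. The sketch applies canonical composition (NFC) to three spellings of the small letter. In each case the result remains a sequence of two code points, since no precomposed form is available.

```python
import unicodedata

O_OGONEK = "\u01EB"          # LATIN SMALL LETTER O WITH OGONEK
COMBINING_ACUTE = "\u0301"   # COMBINING ACUTE ACCENT
COMBINING_OGONEK = "\u0328"  # COMBINING OGONEK


def code_points(text):
    """Return the code points of text as four-digit hex strings (illustrative helper)."""
    return [f"{ord(ch):04X}" for ch in text]


# Three ways of spelling "o with ogonek and acute".
spellings = [
    O_OGONEK + COMBINING_ACUTE,
    "o" + COMBINING_OGONEK + COMBINING_ACUTE,
    "o" + COMBINING_ACUTE + COMBINING_OGONEK,
]

# NFC composes as far as the standard allows; the acute is left over.
normalized = [unicodedata.normalize("NFC", s) for s in spellings]

for original, composed in zip(spellings, normalized):
    # Each line ends in ['01EB', '0301'].
    print(code_points(original), "->", code_points(composed))
```

All three spellings normalise to the same two code points, so searching and comparison work as expected, but display still depends on the font positioning the acute correctly over the base character.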
2.3 Characters with a hook above
In Medieval
Nordic manuscripts, the characters "a", "e", "i",
"j" and "y" may appear with a hook, which is placed
above the character, facing to the left. It has
been suggested that this hook could be rendered
with the combining hook above used as a tone mark
in Vietnamese (0309). However, the hook above in
Medieval Nordic manuscripts has a slightly
different form, and should be defined and designed
as a horizontally and vertically turned ogonek.
Like the ogonek, it usually overlaps with the base
character.
| Glyph | Entity | Unicode | Descriptive name |
|  | &comhal; | 0000 | COMBINING HOOK TO THE LEFT ABOVE |
Recommendation:
characters with a hook above are encoded with the
combining mark proposed here.
List
of expected combinations of base characters and a
combining hook above
Note: In
combination with the character "o", the hook above
may face to the right. This hook is attested, but
is very unusual, and it is an open question whether
it should be recognized as a separate mark. See the
link above for examples.
2.4 Characters with other types of diacritics
Medieval Nordic
characters may appear with a number of existing
combining marks in Unicode. It is suggested that
such combinations are treated as decomposed
characters, i.e. as a combination of a base
character and a combining diacritical mark. Note,
however, that since this proposal includes several
base characters not (yet) included in Unicode, many
of these hitherto "unknown" characters may appear
with diacritical marks, such as the ligatures "au",
"av", "oo" etc.
For this reason,
expected combinations are listed below, so that
font developers can take them into
consideration.
2.4.1 Characters with a single acute
The acute is
widely used in Medieval Nordic sources, primarily
over vowels but also over some consonants. It is
often used simply as a distinctive mark, especially
over "i", which frequently is dotless and easily
mistaken for part of an "m", "n" or "u"
(minims). In some manuscripts the acute is
used to denote length, and this is the usage in
standard orthography.
Recommendation:
characters with a single acute are encoded with the
combining acute accent (0301).
List
of expected combinations of base characters and the
combining acute accent
2.4.2 Characters with a double acute
The double acute
is used in Hungarian over the vowels "o" and "u".
In Medieval Nordic manuscripts, especially late
Icelandic ones, the double acute accent is
sometimes used to denote length and is found over
all vowels, over consonants (semivowels) such as
"j", "v" and "w", and over some of the ligatures.
Recommendation:
characters with a double acute are encoded with the
combining double acute accent (030B).
List
of expected combinations of base characters and the
combining double acute accent
2.4.3 Characters with a single dot above
Single dots above
are used for some Old English characters such as
"c" and "g", and in general as a length mark in
Medieval Nordic manuscripts, above consonants
(geminates) as well as above vowels. In Old Norse
standard orthography dots above are not used, but
they are found in diplomatic editions.
Recommendation:
characters with a single dot above are encoded with
the combining dot above (0307).
List
of expected combinations of base characters and the
combining dot above
2.4.4 Characters with a single dot below
Characters with a
dot below form a special category of signs,
typically indicating an uncertain reading. As such,
they do not appear in the manuscripts themselves,
but they are quite frequent in diplomatic editions
of Medieval Nordic texts. They are also frequently
encountered in epigraphical contexts, e.g. in Runic
inscriptions (namely in the transliteration of
runes into the Latin alphabet).
Recommendation:
characters with a single dot below are encoded with
the combining dot below (0323).
List
of expected combinations of base characters and the
combining dot below
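As an illustration of this recommendation, the following Python sketch marks uncertain readings in a transliteration by adding the combining dot below (0323) after the letters concerned. The function name mark_uncertain and its index-based interface are assumptions made for this example, not part of the proposal. The result is returned in canonical decomposition (NFD), so that a dot below is ordered before any mark above the same letter.

```python
import unicodedata

COMBINING_DOT_BELOW = "\u0323"  # COMBINING DOT BELOW


def mark_uncertain(text, positions):
    """Add a dot below to the characters of text at the given indices.

    Hypothetical helper: positions are indices into text as supplied,
    so text is assumed to hold one code point per letter.
    """
    pieces = []
    for index, ch in enumerate(text):
        pieces.append(ch)
        if index in positions:
            pieces.append(COMBINING_DOT_BELOW)
    # NFD keeps the letters decomposed and puts the marks in canonical order.
    return unicodedata.normalize("NFD", "".join(pieces))


# The second and fifth letters of a transliterated word are uncertain.
marked = mark_uncertain("runar", {1, 4})
print([f"{ord(ch):04X}" for ch in marked])
```

Because the output is decomposed, a letter that already carries an acute, such as "á", ends up as the base letter followed by the dot below and then the acute.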
2.4.5 Characters with a double dot above
Double dots
above, diaeresis, are widely used over vowels, as
in modern German and Swedish. In Medieval Nordic
manuscripts, especially late Icelandic ones,
diaeresis is found over vowels and ligatures, as
well as "v" and "w".
Recommendation:
characters with a double dot above are encoded with
the combining diaeresis (0308).
List
of expected combinations of base characters and the
combining diaeresis
2.4.6 Characters with a hook below
A few vowels,
especially "o" and "e", may appear with a hook in
Old Norse manuscripts. The latter combination, "e
caudata", is common in Latin manuscripts, in which
the letter form alternates with the ligature
"æ". This hook, also called ogonek, is placed
below the base character and faces to the right. In
addition to the vowels, the character "t" may
appear with a hook.
Recommendation:
characters with a hook below are encoded with the
combining ogonek (0328).
List
of expected combinations of base characters and the
combining ogonek
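The recommendations in sections 2.4.1 to 2.4.6 can be summarised in a small Python sketch, which may be of use when generating test strings for a font. The code points are those given in the recommendations. The names RECOMMENDED_MARKS and with_mark are assumptions made for this example, and Python's unicodedata module reflects a later Unicode version than 3.2.

```python
import unicodedata

# Combining marks recommended in sections 2.4.1 to 2.4.6.
RECOMMENDED_MARKS = {
    "acute": "\u0301",         # COMBINING ACUTE ACCENT
    "double acute": "\u030B",  # COMBINING DOUBLE ACUTE ACCENT
    "dot above": "\u0307",     # COMBINING DOT ABOVE
    "dot below": "\u0323",     # COMBINING DOT BELOW
    "diaeresis": "\u0308",     # COMBINING DIAERESIS
    "ogonek": "\u0328",        # COMBINING OGONEK
}


def with_mark(base, mark):
    """Return base followed by the named mark, in decomposed form (illustrative helper)."""
    return unicodedata.normalize("NFD", base + RECOMMENDED_MARKS[mark])


# Print each mark with its code point and its Unicode character name.
for label, mark in RECOMMENDED_MARKS.items():
    print(f"{label}: {ord(mark):04X} {unicodedata.name(mark)}")

# A few combinations mentioned in the text: "w" with double acute,
# the ligature "æ" with acute, and "t" with a hook below.
samples = [
    with_mark("w", "double acute"),
    with_mark("\u00E6", "acute"),
    with_mark("t", "ogonek"),
]
```

Keeping the sequences decomposed follows the suggestion in section 2.4 that such combinations are treated as a base character plus a combining diacritical mark.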