
Re: UNICODE (Re: oh, to hell with this Xkb.)



On Tue, Jul 04, 2000 at 15:49:43 +0400, Alexander Voropay wrote:

>  And where does it say that the UNICODE code space is 000000..10FFFF?
[...]
>  I read it carefully. On the one hand, they state:
> 
> "A single 16-bit number is assigned to each code element defined
> by the Unicode Standard, Version 3.0. Each of these 16-bit numbers
> is called a code value [...]"

A code value (code unit) is not the same thing as a code point (Unicode
scalar value).  Again, I don't want to misstate it here; see UTR17.


From unicode@unicode.org (I can't identify the author, since my
archives have it only as a quotation).

| In UTF-16, each 16-bit code value in the 0x0..0xD7FF range and the
| 0xE000..0xFFFF range directly corresponds to the same scalar value,
| while a "surrogate" pair of 16-bit code values algorithmically
| represents a single scalar value in the range 0x010000..0x10FFFF.
| The first half of the pair is always in the 0xD800..0xDBFF range,
| and the second half of the pair is in the 0xDC00..0xDFFF range.
| Unicode 3.0 and ISO/IEC 10646-1:2000 have adopted the UTF-16
| mechanism as the only official usage of the 0xD800..0xDFFF scalar
| range.
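
The arithmetic behind that "algorithmically represents" can be sketched
in a few lines of Python. This is only an illustration of the mapping
described in the quote; the function names to_surrogates and
from_surrogates are made up for this example and do not come from any
standard or library.

```python
def to_surrogates(scalar):
    """Split a scalar value in 0x10000..0x10FFFF into a UTF-16 pair."""
    if not 0x10000 <= scalar <= 0x10FFFF:
        raise ValueError("scalar value is outside 0x10000..0x10FFFF")
    offset = scalar - 0x10000           # 20-bit offset into the supplementary range
    high = 0xD800 + (offset >> 10)      # top 10 bits -> first half, 0xD800..0xDBFF
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits -> second half, 0xDC00..0xDFFF
    return high, low


def from_surrogates(high, low):
    """Combine a surrogate pair back into a single scalar value."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# The two ends of the supplementary range map to the ends of the pair ranges.
print([hex(v) for v in to_surrogates(0x10000)])    # ['0xd800', '0xdc00']
print([hex(v) for v in to_surrogates(0x10FFFF)])   # ['0xdbff', '0xdfff']
```

Since each half carries 10 bits, the pairs cover exactly 2^20 scalar
values on top of the 16-bit range, which is where the 10FFFF upper
bound comes from.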

| Here are various ways of representing the proposed abstract
| character named "GOTHIC LETTER QAITHRA" (=Q) (which will probably be
| assigned to the Unicode scalar value 0x10335):

|     * in Unicode notation, by its Unicode scalar value: U-00010335
|     * as a UCS-4 code value sequence, in hex notation:  0x00010335
|     * as a UCS-2 code value sequence:           illegal; out of range
|     * as a UTF-16 code value sequence, in hex notation: 0xD800 0xDF35
|     * in Unicode notation, by its Unicode value pair:   U+D800 U+DF35
|     * in EBNF notation:                                 \uD800 \uDF35
|     * as a UTF-8 code value sequence, in hex notation:  0xF0 0x90 0x8C 0xB5
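
The byte sequences in that list can be cross-checked with the codecs
built into Python 3. A small sketch, assuming that the "utf-32-be"
codec is an acceptable stand-in for UCS-4 (for this value the bytes
coincide) and that big-endian byte order is meant throughout:

```python
ch = chr(0x10335)                    # the scalar value from the list above

ucs4 = ch.encode("utf-32-be")        # one 32-bit code value
utf16 = ch.encode("utf-16-be")       # two 16-bit code values (a surrogate pair)
utf8 = ch.encode("utf-8")            # four 8-bit code values

print(ucs4.hex())                    # 00010335
print(utf16.hex())                   # d800df35
print(utf8.hex())                    # f0908cb5

# UCS-2 has only one 16-bit code value per character, so anything
# above 0xFFFF cannot be represented in it at all.
fits_in_ucs2 = ord(ch) <= 0xFFFF
print(fits_in_ucs2)                  # False
```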

SY, Uwe
-- 
uwe@ptc.spbu.ru                         |       Zu Grunde kommen
http://www.ptc.spbu.ru/~uwe/            |       Ist zu Grunde gehen