Compound Text Encoding

X Consortium Standard

Robert W. Scheifler

X Consortium

X Version 11, Release 7.7

Version 1.1

Copyright © 1989 X Consortium

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE X
CONSORTIUM BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Except as contained in this notice, the name of the X Consortium shall not be
used in advertising or otherwise to promote the sale, use or other dealings in
this Software without prior written authorization from the X Consortium.

X Window System is a trademark of The Open Group.

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Table of Contents

Overview
Values
Control Characters
Standard Character Set Encodings
Approved Standard Encodings
Non-Standard Character Set Encodings
Directionality
Resources
Font Names
Extensions
Errors

Overview

Compound Text is a format for multiple character set data, such as
multi-lingual text. The format is based on ISO standards for encoding and
combining character sets. Compound Text is intended to be used in three main
contexts: inter-client communication using selections, as defined in the
Inter-Client Communication Conventions Manual (ICCCM); window properties (e.g.,
window manager hints as defined in the ICCCM); and resources (e.g., as defined
in Xlib and the Xt Intrinsics).

Compound Text is intended as an external representation, or interchange format,
not as an internal representation. It is expected (but not required) that
clients will convert Compound Text to some internal representation for
processing and rendering, and convert from that internal representation to
Compound Text when providing textual data to another client.

Values

The name of this encoding is "COMPOUND_TEXT". When text values are used in the
ICCCM-compliant selection mechanism or are stored as window properties in the
server, the type used should be the atom for "COMPOUND_TEXT".

Octet values are represented in this document as two decimal numbers in the
form col/row. This means the value (col * 16) + row. For example, 02/01 means
the value 33.
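
As a quick check of the col/row arithmetic, the conversion can be sketched in
Python. The function names from_colrow and to_colrow are illustrative
assumptions and are not part of this specification.

```python
def from_colrow(text: str) -> int:
    """Convert 'col/row' notation (two decimal numbers) to an octet value."""
    col_text, row_text = text.split("/")
    col, row = int(col_text), int(row_text)
    if not (0 <= col <= 15 and 0 <= row <= 15):
        raise ValueError(f"column and row must each be in 0..15: {text!r}")
    return col * 16 + row


def to_colrow(value: int) -> str:
    """Format an octet value (0..255) as zero-padded 'col/row' notation."""
    if not 0 <= value <= 255:
        raise ValueError(f"not an octet: {value}")
    return f"{value // 16:02d}/{value % 16:02d}"
```

With these helpers, from_colrow("01/11") is 27 (ESC) and from_colrow("09/11")
is 155 (CSI).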

For our purposes, the octet encoding space is divided into four ranges:

C0 octets from 00/00 to 01/15
GL octets from 02/00 to 07/15
C1 octets from 08/00 to 09/15
GR octets from 10/00 to 15/15

C0 and C1 are "control character" sets, while GL and GR are "graphic character"
sets. Only a subset of C0 and C1 octets is used in the encoding, and depending
on the character set encoding defined as GL or GR, a subset of GL and GR octets
may be used; see below for details. All octets (00/00 to 15/15) may appear
inside the text of extended segments (defined below).
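
The four ranges translate directly into a classification by octet value. The
following Python sketch shows one way to write it; the function name
octet_range is an assumption made for illustration.

```python
def octet_range(value: int) -> str:
    """Classify an octet as 'C0', 'GL', 'C1', or 'GR'."""
    if not 0 <= value <= 0xFF:
        raise ValueError(f"not an octet: {value}")
    if value <= 0x1F:   # 00/00 to 01/15
        return "C0"
    if value <= 0x7F:   # 02/00 to 07/15
        return "GL"
    if value <= 0x9F:   # 08/00 to 09/15
        return "C1"
    return "GR"         # 10/00 to 15/15
```

This classification applies outside extended segments only; inside the text of
an extended segment every octet value is permitted.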

[For those familiar with ISO 2022, we will use only an 8-bit environment, and
we will always use G0 for GL and G1 for GR.]

Control Characters

In C0, only the following values will be used:

00/09 HT  HORIZONTAL TABULATION
00/10 NL  NEW LINE
01/11 ESC ESCAPE

In C1, only the following value will be used:

09/11 CSI CONTROL SEQUENCE INTRODUCER

[The alternate 7-bit CSI encoding 01/11 05/11 is not used in Compound Text.]

No control sequences are defined in Compound Text for changing the C0 and C1
sets.

A horizontal tab can be represented with the octet 00/09. Specification of
tabulation width settings is not part of Compound Text and must be obtained
from context (in an unspecified manner).

[Inclusion of horizontal tab is for consistency with the STRING type currently
defined in the ICCCM.]

A newline (line separator/terminator) can be represented with the octet 00/10.

[Note that 00/10 is normally LINEFEED, but is being interpreted as NEWLINE.
This can be thought of as using the (deprecated) NEW LINE mode, E.1.3, in ISO
6429. Use of this value instead of 08/05 (NEL, NEXT LINE) is for consistency
with the STRING type currently defined in the ICCCM.]

The remaining C0 and C1 values (01/11 and 09/11) are only used in the control
sequences defined below.
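
A simple consequence of these rules is that any other C0 or C1 octet marks a
string as invalid. The Python sketch below locates such octets. It assumes the
input contains no extended segments (whose text may hold arbitrary octets), and
the name disallowed_controls is illustrative only.

```python
from typing import List

HT, NL, ESC, CSI = 0x09, 0x0A, 0x1B, 0x9B
PERMITTED_CONTROLS = frozenset({HT, NL, ESC, CSI})


def disallowed_controls(data: bytes) -> List[int]:
    """Return the offsets of C0/C1 octets other than HT, NL, ESC, and CSI.

    Assumes data holds no extended segments, whose text may contain any octet.
    """
    return [
        offset
        for offset, value in enumerate(data)
        if (value <= 0x1F or 0x80 <= value <= 0x9F)
        and value not in PERMITTED_CONTROLS
    ]
```

An empty result means only that no forbidden control octet was found; it does
not show that the ESC and CSI sequences themselves are well formed.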

Standard Character Set Encodings

The default GL and GR sets in Compound Text correspond to the left and right
halves of ISO 8859-1 (Latin 1). As such, any legal instance of a STRING type
(as defined in the ICCCM) is also a legal instance of type COMPOUND_TEXT.

[The implied initial state in ISO 2022 is defined with the sequence:

01/11 02/00 04/03 G0 and G1 in an 8-bit environment only. Designation also
                  invokes.
01/11 02/00 04/07 In an 8-bit environment, C1 is represented as 8 bits.
01/11 02/00 04/09 Graphic character sets can be 94 or 96.
01/11 02/00 04/11 8-bit code is used.
01/11 02/08 04/02 Designate ASCII into G0.
01/11 02/13 04/01 Designate right-hand part of ISO Latin-1 into G1.
]

To define one of the approved standard character set encodings to be the GL
set, one of the following control sequences is used:

01/11 02/08 {I} F       94 character set
01/11 02/04 02/08 {I} F 94^N character set

To define one of the approved standard character set encodings to be the GR
set, one of the following control sequences is used:

01/11 02/09 {I} F       94 character set
01/11 02/13 {I} F       96 character set
01/11 02/04 02/09 {I} F 94^N character set

The "F" in the control sequences above stands for "Final character", which is
always in the range 04/00 to 07/14. The "{I}" stands for zero or more
"intermediate characters", which are always in the range 02/00 to 02/15, with
the first intermediate character always in the range 02/01 to 02/03. The
registration authority has defined an "{I} F" sequence for each registered
character set encoding.

[Final characters for private encodings (in the range 03/00 to 03/15) are not
permitted here in Compound Text.]
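
The five designation sequences above can be recognized with a short scanner.
The Python sketch below is one possible reading of the rules, not a reference
implementation; the Designation tuple, its field names, and the function name
parse_designation are assumptions made for this example.

```python
from typing import NamedTuple

ESC = 0x1B


class Designation(NamedTuple):
    side: str             # "GL" or "GR"
    kind: str             # "94", "96", or "94^N"
    intermediates: bytes  # the {I} octets, possibly empty
    final: int            # the F octet
    length: int           # octets consumed, including ESC


# Octets that follow ESC and select the side and the kind of set.
_PREFIXES = [
    (bytes([0x24, 0x28]), "GL", "94^N"),  # 02/04 02/08
    (bytes([0x24, 0x29]), "GR", "94^N"),  # 02/04 02/09
    (bytes([0x28]), "GL", "94"),          # 02/08
    (bytes([0x29]), "GR", "94"),          # 02/09
    (bytes([0x2D]), "GR", "96"),          # 02/13
]


def parse_designation(data: bytes, pos: int = 0) -> Designation:
    """Parse the GL/GR designation sequence that starts at data[pos]."""
    if pos >= len(data) or data[pos] != ESC:
        raise ValueError("expected ESC (01/11)")
    for prefix, side, kind in _PREFIXES:
        if data.startswith(prefix, pos + 1):
            break
    else:
        raise ValueError("not a Compound Text designation sequence")
    i = start = pos + 1 + len(prefix)
    while i < len(data) and 0x20 <= data[i] <= 0x2F:
        i += 1
    intermediates = data[start:i]
    if intermediates and not 0x21 <= intermediates[0] <= 0x23:
        raise ValueError("first intermediate must be in 02/01 to 02/03")
    # Private finals (03/00 to 03/15) are rejected by this range check.
    if i >= len(data) or not 0x40 <= data[i] <= 0x7E:
        raise ValueError("missing or invalid final character")
    return Designation(side, kind, intermediates, data[i], i + 1 - pos)
```

For instance, the octets 01/11 02/13 04/01 parse as a 96 character set defined
as GR with final 04/01 and no intermediates.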

For GL, octet 02/00 is always defined as SPACE, and octet 07/15 (normally
DELETE) is never used. For a 94-character set defined as GR, octets 10/00 and
15/15 are never used.

[This is consistent with ISO 2022.]

A 94^N character set uses N octets (N > 1) for each character. The value of N
is derived from the column value for F:

column 04 or 05 2 octets
column 06       3 octets
column 07       4 or more octets

In a 94^N encoding, the octet values 02/00 and 07/15 (in GL) and 10/00 and
15/15 (in GR) are never used.

[The column definitions come from ISO 2022.]
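
The column rule for N can be written as a small lookup. In this Python sketch
the function name octets_per_character is an assumption, and because column 07
only promises "4 or more" octets, the returned 4 should be read as a minimum
in that case.

```python
def octets_per_character(final: int) -> int:
    """Return N for a 94^N set from the column of its final character F.

    For column 07 the result 4 is a lower bound ("4 or more octets").
    """
    if not 0x40 <= final <= 0x7E:
        raise ValueError("final character must be in 04/00 to 07/14")
    column = final >> 4
    if column in (4, 5):
        return 2
    if column == 6:
        return 3
    return 4
```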

Once a GL or GR set has been defined, all further octets in that range (except
within control sequences and extended segments) are interpreted with respect to
that character set encoding, until the GL or GR set is redefined. GL and GR
sets can be defined independently; they do not have to be defined in pairs.

Note that when actually using a character set encoding as the GR set, you must
force the most significant bit (08/00) of each octet to be a one, so that it
falls in the range 10/00 to 15/15.
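
Forcing the most significant bit is a single OR with 08/00 (128) per octet.
The Python sketch below shows the conversion in both directions; the names
to_gr and to_gl are illustrative assumptions.

```python
def to_gr(octets: bytes) -> bytes:
    """Set the most significant bit of each octet (GL form to GR form)."""
    if any(not 0x20 <= value <= 0x7F for value in octets):
        raise ValueError("expected octets in the GL range 02/00 to 07/15")
    return bytes(value | 0x80 for value in octets)


def to_gl(octets: bytes) -> bytes:
    """Clear the most significant bit of each octet (GR form to GL form)."""
    if any(value < 0xA0 for value in octets):
        raise ValueError("expected octets in the GR range 10/00 to 15/15")
    return bytes(value & 0x7F for value in octets)
```

For example, the two-octet value 03/00 02/01 becomes 11/00 10/01 when the set
is used as GR.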

[Control sequences to specify character set encoding revisions (as in section
6.3.13 of ISO 2022) are not used in Compound Text. Revision indicators do not
appear to provide useful information in the context of Compound Text. The most
recent revision can always be assumed, since revisions are upward compatible.]

Approved Standard Encodings

The following are the approved standard encodings to be used with Compound
Text. Note that none have Intermediate characters; however, a good parser will
still deal with Intermediate characters in the event that additional encodings
are later added to this list.

┌─────┬─────┬─────────────────────────────────────────────────────────────────┐
│{I} F│94/96│Description                                                      │
├─────┼─────┼─────────────────────────────────────────────────────────────────┤
│04/02│94   │7-bit ASCII graphics (ANSI X3.4-1968), Left half of ISO 8859 sets│
├─────┼─────┼─────────────────────────────────────────────────────────────────┤
│04/09│94   │Right half of JIS X0201-1976 (reaffirmed 1984), 8-Bit            │
│     │     │Alphanumeric-Katakana Code                                       │
├─────┼─────┼─────────────────────────────────────────────────────────────────┤
│04/10│94   │Left half of JIS X0201-1976 (reaffirmed 1984), 8-Bit             │
│     │     │Alphanumeric-Katakana Code                                       │
├─────┼─────┼─────────────────────────────────────────────────────────────────┤
│04/01│96   │Right half of ISO 8859-1, Latin alphabet No. 1                   │
├─────┼─────┼─────────────────────────────────────────────────────────────────┤
│04/02│96   │Right half of ISO 8859-2, Latin alphabet No. 2                   │
├─────┼─────┼─────────────────────────────────────────────────────────────────┤
│04/03│96   │Right half of ISO 8859-3, Latin alphabet No. 3                   │
├─────┼─────┼─────────────────────────────────────────────────────────────────┤
│04/04│96   │Right half of ISO 8859-4, Latin alphabet No. 4                   │
├─────┼─────┼─────────────────────────────────────────────────────────────────┤
│04/06│96   │Right half of ISO 8859-7, Latin/Greek alphabet                   │
├─────┼─────┼─────────────────────────────────────────────────────────────────┤
│04/07│96   │Right half of ISO 8859-6, Latin/Arabic alphabet                  │
├─────┼─────┼─────────────────────────────────────────────────────────────────┤
│04/08│96   │Right half of ISO 8859-8, Latin/Hebrew alphabet                  │
├─────┼─────┼─────────────────────────────────────────────────────────────────┤
│04/12│96   │Right half of ISO 8859-5, Latin/Cyrillic alphabet                │
├─────┼─────┼─────────────────────────────────────────────────────────────────┤
│04/13│96   │Right half of ISO 8859-9, Latin alphabet No. 5                   │
├─────┼─────┼─────────────────────────────────────────────────────────────────┤
│04/01│94^2 │GB2312-1980, China (PRC) Hanzi                                   │
├─────┼─────┼─────────────────────────────────────────────────────────────────┤
│04/02│94^2 │JIS X0208-1983, Japanese Graphic Character Set                   │
├─────┼─────┼─────────────────────────────────────────────────────────────────┤
│04/03│94^2 │KS C5601-1987, Korean Graphic Character Set                      │
└─────┴─────┴─────────────────────────────────────────────────────────────────┘

The sets listed as "Left half of ..." should always be defined as GL. The sets
listed as "Right half of ..." should always be defined as GR. Other sets can be
defined either as GL or GR.
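
A parser may find it convenient to hold the list above as a lookup table keyed
by set size and final character. The Python sketch below is one such
arrangement; the two-octet sets are keyed as "94^2", and the names APPROVED and
designation_allowed are assumptions. It also treats the "should always be
defined as" guidance as a hard check, which is a stricter reading than the
text requires.

```python
# Keys are (set size, final octet); values are (description, required side).
# A required side of None means the set may be defined as either GL or GR.
APPROVED = {
    ("94", 0x42): ("7-bit ASCII graphics", "GL"),
    ("94", 0x49): ("Right half of JIS X0201-1976", "GR"),
    ("94", 0x4A): ("Left half of JIS X0201-1976", "GL"),
    ("96", 0x41): ("Right half of ISO 8859-1", "GR"),
    ("96", 0x42): ("Right half of ISO 8859-2", "GR"),
    ("96", 0x43): ("Right half of ISO 8859-3", "GR"),
    ("96", 0x44): ("Right half of ISO 8859-4", "GR"),
    ("96", 0x46): ("Right half of ISO 8859-7", "GR"),
    ("96", 0x47): ("Right half of ISO 8859-6", "GR"),
    ("96", 0x48): ("Right half of ISO 8859-8", "GR"),
    ("96", 0x4C): ("Right half of ISO 8859-5", "GR"),
    ("96", 0x4D): ("Right half of ISO 8859-9", "GR"),
    ("94^2", 0x41): ("GB2312-1980", None),
    ("94^2", 0x42): ("JIS X0208-1983", None),
    ("94^2", 0x43): ("KS C5601-1987", None),
}


def designation_allowed(size: str, final: int, side: str) -> bool:
    """Report whether an approved set may be defined as the given side."""
    entry = APPROVED.get((size, final))
    if entry is None:
        return False
    required_side = entry[1]
    return required_side is None or required_side == side
```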

Non-Standard Character Set Encodings

Character set encodings that are not in the list of approved standard encodings
can be included using "extended segments". An extended segment begins with one
of the following sequences:

01/11 02/05 02/15 03/00 M L variable number of octets per character
01/11 02/05 02/15 03/01 M L 1 octet per character
01/11 02/05 02/15 03/02 M L 2 octets per character
01/11 02/05 02/15 03/03 M L 3 octets per character
01/11 02/05 02/15 03/04 M L 4 octets per character

[This uses the "other coding system" of ISO 2022, using private Final
characters.]

The "M" and "L" octets represent a 14-bit unsigned value giving the number of
octets that appear in the remainder of the segment. The number is computed as
((M - 128) * 128) + (L - 128). The most significant bits of M and L are always
set to one. The remainder of the segment consists of two parts, the name of the
character set encoding and the actual text. The name of the encoding comes
first and is separated from the text by the octet 00/02 (STX, START OF TEXT).
Note that the length defined by M and L includes the encoding name and
separator.

[The encoding of the length is chosen to avoid having zero octets in Compound
Text when possible, because embedded NUL values are problematic in many C
language routines. The use of zero octets cannot be ruled out entirely however,
since some octets in the actual text of the extended segment may have to be
zero.]
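
The length arithmetic and the segment layout can be illustrated as follows.
This Python sketch is offered under assumptions: the function names are
invented for the example, and the encoding name "foo-0" used in the usage note
is hypothetical, not a registered name.

```python
ESC = 0x1B
STX = 0x02


def encode_length(count: int) -> bytes:
    """Encode a 14-bit length as the two octets M and L (high bits set)."""
    if not 0 <= count < 1 << 14:
        raise ValueError("length must fit in 14 bits")
    return bytes([0x80 | (count >> 7), 0x80 | (count & 0x7F)])


def decode_length(m: int, l: int) -> int:
    """Compute ((M - 128) * 128) + (L - 128)."""
    if not (0x80 <= m <= 0xFF and 0x80 <= l <= 0xFF):
        raise ValueError("M and L must have their most significant bit set")
    return (m - 128) * 128 + (l - 128)


def build_extended_segment(name: bytes, text: bytes,
                           octets_per_char: int = 0) -> bytes:
    """Build an extended segment; octets_per_char 0 means variable width."""
    if not 0 <= octets_per_char <= 4:
        raise ValueError("octets_per_char must be 0 (variable) or 1 to 4")
    # The length covers the encoding name, the STX separator, and the text.
    payload = name + bytes([STX]) + text
    header = bytes([ESC, 0x25, 0x2F, 0x30 + octets_per_char])
    return header + encode_length(len(payload)) + payload
```

For example, build_extended_segment(b"foo-0", b"ab", 1) carries a payload of
8 octets (5 for the name, 1 for STX, 2 for the text), so M L is 08/00 08/08.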

The name of the encoding should be registered with the X Consortium to avoid
conflicts and should, when appropriate, match the CharSet Registry and Encoding
registration used in the X Logical Font Description. The name itself should be
encoded using ISO 8859-1 (Latin 1), should not use question mark (03/15) or
asterisk (02/10), and should use hyphen (02/13) only in accordance with the X
Logical Font Description.

Extended segments are not to be used for any character set encoding that can be
constructed from a GL/GR pair of approved standard encodings. For example, it
is incorrect to use an extended segment for any of the ISO 8859 family of
encodings.

It should be noted that the contents of an extended segment are arbitrary; for
example, they may contain octets in the C0 and C1 ranges, including 00/00, and
octets comprising a given character may differ in their most significant bit.

[ISO-registered "other coding systems" are not used in Compound Text; extended
segments are the only mechanism for non-2022 encodings.]

Directionality

If desired, horizontal text direction can be indicated using the following
control sequences:

09/11 03/01 05/13 begin left-to-right text
09/11 03/02 05/13 begin right-to-left text
09/11 05/13       end of string

[This is a subset of the SDS (START DIRECTED STRING) control in the Draft
Bidirectional Addendum to ISO 6429.]

Directionality can be nested. Logically, a stack of directions is maintained.
Each of the first two control sequences pushes a new direction on the stack,
and the third sequence (revert) pops a direction from the stack. The stack
starts out empty at the beginning of a Compound Text string. When the stack is
empty, the directionality of the text is unspecified.

Directionality applies to all subsequent text, whether in GL, GR, or an
extended segment. If the desired directionality of GL, GR, or extended segments
differs, then directionality control sequences must be inserted when switching
between them.

Note that definition of GL and GR sets is independent of directionality;
defining a new GL or GR set does not change the current directionality, and
pushing or popping a directionality does not change the current GL and GR
definitions.

Specification of directionality is entirely optional; text direction should be
clear from context in most cases. However, it must be the case that either all
characters in a Compound Text string have explicitly specified direction or
that all characters have unspecified direction. That is, if directionality
control sequences are used, the first such control sequence must precede the
first graphic character in a Compound Text string, and graphic characters are
not permitted whenever the directionality stack is empty.
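
The stack behavior and the all-or-nothing rule can be sketched in Python as
below. This is a simplified illustration: it assumes the input contains no ESC
sequences or extended segments, it treats every GL or GR octet as one graphic
character, and the name directions is invented for the example.

```python
from typing import List, Optional, Tuple

CSI = 0x9B
BEGIN_LTR = bytes([CSI, 0x31, 0x5D])  # 09/11 03/01 05/13
BEGIN_RTL = bytes([CSI, 0x32, 0x5D])  # 09/11 03/02 05/13
END = bytes([CSI, 0x5D])              # 09/11 05/13


def directions(data: bytes) -> List[Tuple[int, Optional[str]]]:
    """Pair each graphic octet with 'ltr', 'rtl', or None (unspecified)."""
    stack: List[str] = []
    used_controls = False
    result: List[Tuple[int, Optional[str]]] = []
    i = 0
    while i < len(data):
        if data.startswith(BEGIN_LTR, i) or data.startswith(BEGIN_RTL, i):
            if result and not used_controls:
                raise ValueError("direction control follows undirected text")
            stack.append("ltr" if data[i + 1] == 0x31 else "rtl")
            used_controls = True
            i += 3
        elif data.startswith(END, i):
            if not stack:
                raise ValueError("end of string with an empty stack")
            stack.pop()
            i += 2
        else:
            value = data[i]
            if 0x20 <= value <= 0x7F or value >= 0xA0:  # GL or GR octet
                if used_controls and not stack:
                    raise ValueError("graphic character with an empty stack")
                result.append((value, stack[-1] if stack else None))
            i += 1
    return result
```

A string with no directionality control sequences yields None for every
character, which corresponds to unspecified direction.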

Resources

To use Compound Text in a resource, you can simply treat all octets as if they
were ASCII/Latin-1 and just replace all "\" octets (05/12) with the two octets
"\\", all newline octets (00/10) with the two octets "\n", and all zero octets
with the four octets "\000". It is up to the client making use of the resource
to interpret the data as Compound Text; the policy by which this is ascertained
is not constrained by the Compound Text specification.
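
The three replacements can be applied in a single pass over the octets, which
avoids escaping a backslash twice. The Python sketch below shows this; the
function name escape_for_resource is an assumption.

```python
def escape_for_resource(data: bytes) -> bytes:
    """Escape backslash, newline, and zero octets for use in a resource."""
    out = bytearray()
    for value in data:
        if value == 0x5C:      # "\" (05/12) becomes the two octets "\\"
            out += b"\\\\"
        elif value == 0x0A:    # newline (00/10) becomes the two octets "\n"
            out += b"\\n"
        elif value == 0x00:    # zero octet becomes the four octets "\000"
            out += b"\\000"
        else:
            out.append(value)
    return bytes(out)
```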

Font Names

The following CharSet names for the standard character set encodings are
registered for use in font names under the X Logical Font Description:

┌───────────────┬──────────────────────────────┬──────────────────────────────┐
│Name           │Encoding Standard             │Description                   │
├───────────────┼──────────────────────────────┼──────────────────────────────┤
│ISO8859-1      │ISO 8859-1                    │Latin alphabet No. 1          │
├───────────────┼──────────────────────────────┼──────────────────────────────┤
│ISO8859-2      │ISO 8859-2                    │Latin alphabet No. 2          │
├───────────────┼──────────────────────────────┼──────────────────────────────┤
│ISO8859-3      │ISO 8859-3                    │Latin alphabet No. 3          │
├───────────────┼──────────────────────────────┼──────────────────────────────┤
│ISO8859-4      │ISO 8859-4                    │Latin alphabet No. 4          │
├───────────────┼──────────────────────────────┼──────────────────────────────┤
│ISO8859-5      │ISO 8859-5                    │Latin/Cyrillic alphabet       │
├───────────────┼──────────────────────────────┼──────────────────────────────┤
│ISO8859-6      │ISO 8859-6                    │Latin/Arabic alphabet         │
├───────────────┼──────────────────────────────┼──────────────────────────────┤
│ISO8859-7      │ISO 8859-7                    │Latin/Greek alphabet          │
├───────────────┼──────────────────────────────┼──────────────────────────────┤
│ISO8859-8      │ISO 8859-8                    │Latin/Hebrew alphabet         │
├───────────────┼──────────────────────────────┼──────────────────────────────┤
│ISO8859-9      │ISO 8859-9                    │Latin alphabet No. 5          │
├───────────────┼──────────────────────────────┼──────────────────────────────┤
│JISX0201.1976-0│JIS X0201-1976 (reaffirmed    │8-bit Alphanumeric-Katakana   │
│               │1984)                         │Code                          │
├───────────────┼──────────────────────────────┼──────────────────────────────┤
│GB2312.1980-0  │GB2312-1980, GL encoding      │China (PRC) Hanzi             │
├───────────────┼──────────────────────────────┼──────────────────────────────┤
│JISX0208.1983-0│JIS X0208-1983, GL encoding   │Japanese Graphic Character Set│
├───────────────┼──────────────────────────────┼──────────────────────────────┤
│KSC5601.1987-0 │KS C5601-1987, GL encoding    │Korean Graphic Character Set  │
└───────────────┴──────────────────────────────┴──────────────────────────────┘

Extensions

There is no absolute requirement for a parser to deal with anything but the
particular encoding syntax defined in this specification. However, it is
possible that Compound Text may be extended in the future, and as such it may
be desirable to construct the parser to handle 2022/6429 syntax more generally.

There are two general formats covering all control sequences that are expected
to appear in extensions:

01/11 {I} F

For this format, I is always in the range 02/00 to 02/15, and F is always in
the range 03/00 to 07/14.

09/11 {P} {I} F

For this format, P is always in the range 03/00 to 03/15, I is always in the
range 02/00 to 02/15, and F is always in the range 04/00 to 07/14.
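
A general parser can use these two formats to find where any control sequence
ends, even one it does not recognize. The Python sketch below measures a
sequence under that assumption; the function name control_sequence_length is
invented for the example, and the result covers only the control sequence
itself, not any M L octets or segment data that may follow it.

```python
ESC = 0x1B
CSI = 0x9B


def control_sequence_length(data: bytes, pos: int = 0) -> int:
    """Return the number of octets in the control sequence at data[pos]."""
    if pos >= len(data):
        raise ValueError("no octet at this position")
    i = pos + 1
    if data[pos] == ESC:
        lowest_final = 0x30   # F in 03/00 to 07/14
    elif data[pos] == CSI:
        lowest_final = 0x40   # F in 04/00 to 07/14
        while i < len(data) and 0x30 <= data[i] <= 0x3F:   # {P}
            i += 1
    else:
        raise ValueError("expected ESC (01/11) or CSI (09/11)")
    while i < len(data) and 0x20 <= data[i] <= 0x2F:       # {I}
        i += 1
    if i >= len(data) or not lowest_final <= data[i] <= 0x7E:
        raise ValueError("missing or invalid final character")
    return i + 1 - pos
```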

In addition, new (singleton) control characters (in the C0 and C1 ranges) might
be defined in the future.

Finally, new kinds of "segments" might be defined in the future using syntax
similar to extended segments:

01/11 02/05 02/15 F M L

For this format, F is in the range 03/05 to 03/15. M and L are as defined in
extended segments. Such a segment will always be followed by the number of
octets defined by M and L. These octets can have arbitrary values and need not
follow the internal structure defined for current extended segments.

If extensions to this specification are defined in the future, then any string
incorporating instances of such extensions must start with one of the following
control sequences:

01/11 02/03 V 03/00 ignoring extensions is OK
01/11 02/03 V 03/01 ignoring extensions is not OK

In either case, V is in the range 02/00 to 02/15 and indicates the major
version minus one of the specification being used. These version control
sequences are for use by clients that implement earlier versions, but have
implemented a general parser. The first control sequence indicates that it is
acceptable to ignore all extension control sequences; no mandatory information
will be lost in the process. The second control sequence indicates that it is
unacceptable to ignore any extension control sequences; mandatory information
would be lost in the process. In general, it will be up to the client
generating the Compound Text to decide which control sequence to use.
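
Detecting a version control sequence at the start of a string might look like
the Python sketch below. It assumes that V counts upward from 02/00, so that
02/00 denotes major version 1; the function name parse_version_sequence and
its return shape are likewise assumptions.

```python
from typing import Optional, Tuple


def parse_version_sequence(data: bytes) -> Optional[Tuple[int, bool]]:
    """Return (major_version, may_ignore_extensions), or None if absent."""
    if len(data) < 4 or data[0] != 0x1B or data[1] != 0x23:  # 01/11 02/03
        return None
    version_octet, final = data[2], data[3]
    if not 0x20 <= version_octet <= 0x2F or final not in (0x30, 0x31):
        return None
    # V holds the major version minus one, offset from 02/00 (assumption).
    major_version = version_octet - 0x20 + 1
    return major_version, final == 0x30
```

A client that implements only this version could reject any string for which
the second element is False, since mandatory information would otherwise be
lost.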

Errors

If a Compound Text string does not match the specification here (e.g., uses
undefined control characters, or undefined control sequences, or incorrectly
formatted extended segments), it is best to treat the entire string as invalid,
except as indicated by a version control sequence.