Ideographic Description Characters

Unicode character block
Version history: 3.0 (1999), 12 characters (+12); 15.1 (2023), 16 characters (+4).[1][2]

Ideographic Description Characters is a Unicode block containing graphic characters used for describing CJK ideographs. They are used in Ideographic Description Sequences (IDS) to describe an ideograph in terms of the other ideographs that make it up and how those are laid out relative to one another.[3] An IDS describes an ideograph that cannot be represented properly, usually because it is not encoded in Unicode. Rendering systems are not intended to compose the pieces automatically into a complete ideograph, and the descriptions are not standardized.

U+2FF0 to U+2FFB were introduced from GBK; U+2FFC to U+2FFF were devised later and introduced in Unicode 15.1 (2023).

Block

Ideographic Description Characters[1]
Official Unicode Consortium code chart (PDF)
  0 1 2 3 4 5 6 7 8 9 A B C D E F
U+2FFx ⿰ ⿱ ⿲ ⿳ ⿴ ⿵ ⿶ ⿷ ⿸ ⿹ ⿺ ⿻ ⿼ ⿽ ⿾ ⿿
Notes
1. ^ As of Unicode version 15.1

Ideographic Description Sequences

An Ideographic Description Sequence is a sequence of characters, defined by the Unicode Standard, that represents the structure of a Chinese character.

The 16 characters defined in this block are:

Unicode | Char | Meaning | Example 1 | IDS | Example 2 | IDS
U+2FF0 | ⿰ | Two components combined left to right | 相 | ⿰木目 | 𠁢 | ⿰丨㇍
U+2FF1 | ⿱ | Two components combined above to below | 杏 | ⿱木口 | 𠚤 | ⿱𠂊丶
U+2FF2 | ⿲ | Three components combined left to middle to right | 衍 | ⿲彳氵亍 | 𠂗 | ⿲丿夕乚
U+2FF3 | ⿳ | Three components combined above to middle to below | 京 | ⿳亠口小 | 𠋑 | ⿳亼目口
U+2FF4 | ⿴ | One component fully surrounding another | 回 | ⿴囗口 | 𠀬 | ⿴㐁人
U+2FF5 | ⿵ | One component surrounding another on three sides (opening at bottom) | 凰 | ⿵几皇 | 𧓉 | ⿵齊虫
U+2FF6 | ⿶ | One component surrounding another on three sides (opening at top) | 凶 | ⿶凵㐅 | | ⿶乂丶
U+2FF7 | ⿷ | One component surrounding another on three sides (opening at right) | 匠 | ⿷匚斤 | 𧆬 | ⿷虎九
U+2FF8 | ⿸ | One component surrounding another on the top and left sides | 病 | ⿸疒丙 | 𤆯 | ⿸耂火
U+2FF9 | ⿹ | One component surrounding another on the top and right sides | 戒 | ⿹戈廾 | 𢧌 | ⿹或壬
U+2FFA | ⿺ | One component surrounding another on the bottom and left sides | 超 | ⿺走召 | 𥘶 | ⿺礼分
U+2FFB | ⿻ | Two components overlapped | 巫 | ⿻工从 | 𣏃 | ⿻木⿻コ一
U+2FFC | ⿼ | One component surrounding another on three sides (opening at left) | 㕚 | ⿼叉丶 | 𬺹 | ⿼コ二
U+2FFD | ⿽ | One component surrounding another on the bottom and right sides | 氷 | ⿽水丶 | | ⿽⺀十
U+2FFE | ⿾ | Horizontal reflection | 卐 | ⿾卍 | 𣥄 | ⿾正
U+2FFF | ⿿ | Rotation | 𠕄 | ⿿凹 | 𠄔 | ⿿予

Two related characters are encoded outside this block but may also be used in ideographic description sequences:

Unicode | Char | Block | Meaning | Example 1 | IDS | Example 2 | IDS
U+303E | 〾 | CJK Symbols and Punctuation | Variant but not equivalent | 㬵 (U+3B35) | 〾胶 (U+80F6)[4] | 𫜵 | 〾爫[5]
U+31EF | ㇯ | CJK Strokes | Subtraction | 乒 | ㇯兵丶 | 𧰨 | ㇯豕一
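
Since every character listed in the two tables above acts as an operator, the components of a description can be recovered by filtering those characters out. The following is a minimal sketch of that idea; the helper name `ids_components` is hypothetical and not part of any standard API.

```python
# Operator code points: the 16 characters of this block (U+2FF0..U+2FFF)
# plus the two related characters U+303E and U+31EF listed above.
IDS_OPERATORS = set(range(0x2FF0, 0x3000)) | {0x303E, 0x31EF}


def ids_components(ids: str) -> list[str]:
    """Return the non-operator characters of an IDS, in order.

    Hypothetical helper for illustration; it does not check that the
    sequence is well formed.
    """
    return [ch for ch in ids if ord(ch) not in IDS_OPERATORS]


print(ids_components("⿻木⿻コ一"))  # ['木', 'コ', '一']
```

The layout information carried by the operators is discarded here, so this is only useful for questions such as "which descriptions mention a given component".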


This is the syntax of IDS in EBNF:

IDS := Ideographic | Radical | CJK_Stroke | Private Use | U+FF1F | IDS_UnaryOperator IDS | IDS_BinaryOperator IDS IDS | IDS_TrinaryOperator IDS IDS IDS 
CJK_Stroke := U+31C0 | U+31C1 | ... | U+31E3
IDS_UnaryOperator := U+2FFE | U+2FFF
IDS_BinaryOperator := U+2FF0 | U+2FF1 | U+2FF4 | ... | U+2FFD | U+31EF
IDS_TrinaryOperator := U+2FF2 | U+2FF3
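
The grammar is in prefix form, so a small recursive-descent check is enough to test whether a string is a single well-formed IDS. The sketch below transcribes the operator arities from the grammar. The code point ranges used for the terminals (Ideographic, Radical, CJK_Stroke, Private Use) are an approximation assumed for illustration, and the names `ARITY`, `TERMINAL_RANGES` and `is_valid_ids` are hypothetical.

```python
# Operator arities, transcribed from the grammar above.
ARITY = {cp: 1 for cp in (0x2FFE, 0x2FFF)}
ARITY.update({cp: 2 for cp in (0x2FF0, 0x2FF1, *range(0x2FF4, 0x2FFE), 0x31EF)})
ARITY.update({cp: 3 for cp in (0x2FF2, 0x2FF3)})

# ASSUMPTION: approximate block ranges standing in for the grammar's
# terminals; the grammar itself does not spell these out.
TERMINAL_RANGES = [
    (0x3400, 0x4DBF), (0x4E00, 0x9FFF),        # Ideographic (BMP)
    (0x20000, 0x2A6DF), (0x2A700, 0x2EE5F),    # Ideographic (plane 2)
    (0x30000, 0x323AF),                        # Ideographic (plane 3)
    (0x2E80, 0x2FDF),                          # Radical
    (0x31C0, 0x31E3),                          # CJK_Stroke
    (0xE000, 0xF8FF), (0xF0000, 0xFFFFD),      # Private Use
    (0x100000, 0x10FFFD),                      # Private Use
    (0xFF1F, 0xFF1F),                          # U+FF1F
]


def _is_terminal(cp: int) -> bool:
    return any(lo <= cp <= hi for lo, hi in TERMINAL_RANGES)


def _parse(s: str, i: int) -> int:
    """Return the index just past one IDS starting at s[i], or -1 on failure."""
    if i >= len(s):
        return -1
    cp = ord(s[i])
    if cp in ARITY:
        j = i + 1
        for _ in range(ARITY[cp]):  # each operator is followed by its operands
            j = _parse(s, j)
            if j < 0:
                return -1
        return j
    return i + 1 if _is_terminal(cp) else -1


def is_valid_ids(s: str) -> bool:
    """True if s is exactly one IDS under the assumptions above."""
    return _parse(s, 0) == len(s)


print(is_valid_ids("⿰木目"))   # True
print(is_valid_ids("⿰木"))     # False: the second operand is missing
```

Under these assumed ranges, a description that uses a character outside the listed terminals, such as the katakana コ, is rejected even though it may appear in practice.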

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Ideographic Description Characters block:

Version Final code points[a] Count UTC ID L2 ID WG2 ID IRG ID Document
3.0 U+2FF0..2FFB 12 X3L2/95-111 N1284 Ideographic Structure Symbol (additional request), 1995-11-07
N1303 (html, doc) Umamaheswaran, V. S.; Ksar, Mike (1996-01-26), "8.13 Ideographic structure symbols", Minutes of Meeting 29, Tokyo
N1348 Ideographic Components and Composition Scheme, 1996-02-05
N1357 Revised Ideographic Structure Symbols, 1996-04-12
N1353 Umamaheswaran, V. S.; Ksar, Mike (1996-06-25), "9", Draft minutes of WG2 Copenhagen Meeting # 30
L2/97-026 N1494 IRG proposal: Ideographic structure character, 1996-06-27
N1430 N365 Proposal Summary Form: Ideographic Structure Character, 1996-08-01
N1453 Ksar, Mike; Umamaheswaran, V. S. (1996-12-06), "9.6 Ideographic Structure Characters", WG 2 Minutes - Quebec Meeting 31
L2/97-023 N1486 N437 IRG #8 Resolutions, 1997-01-16
N1489 Supplement to Ideographic Components and Composition Schemes, 1997-01-16
N1490 N436 Response to WG2 question on Ideographic Structure Characters, 1997-01-16
L2/97-030 N1503 (pdf, doc) Umamaheswaran, V. S.; Ksar, Mike (1997-04-01), "9.6", Unconfirmed Minutes of WG 2 Meeting #32, Singapore; 1997-01-20--24
L2/97-114 N1544 (html, doc) N453 Sato, T. K. (1997-04-08), Questions on the "Han structure method" described in WG2 N1490 (IRG N436)
L2/97-255R Aliprand, Joan (1997-12-03), "4.B.2 Ideographic Structure Characters", Approved Minutes – UTC #73 & L2 #170 joint meeting, Palo Alto, CA – August 4-5, 1997
N1680 Project Sub-Division Proposal on Scheme of Ideograph Description Sequence, 1997-12-18
N1782 Clause X Ideographic Description Sequence (IDS) – IRG N575, 1998-05-06
L2/98-158 Aliprand, Joan; Winkler, Arnold (1998-05-26), "SC2 SC2 Action re Ideographic Description Sequences", Draft Minutes – UTC #76 & NCITS Subgroup L2 #173 joint meeting, Tredyffrin, Pennsylvania, April 20-22, 1998
N1842 Proposed text for a Draft for amendment 28 - Ideographic Description Sequences, 1998-06-03
L2/98-286 N1703 Umamaheswaran, V. S.; Ksar, Mike (1998-07-02), "9.5", Unconfirmed Meeting Minutes, WG 2 Meeting #34, Redmond, WA, USA; 1998-03-16--20, The original proposal was to use character composition. It has changed from being composition to description over its three year development.
L2/98-317 N1892 (pdf, doc) Combined CD registration and consideration ballot on WD for 10646-1/Amd. 28, AMENDMENT 28: Ideographic description characters, 1998-10-22
L2/99-010 N1903 (pdf, html, doc) Umamaheswaran, V. S. (1998-12-30), "10.3", Minutes of WG 2 meeting 35, London, U.K.; 1998-09-21--25
L2/99-072.1 N1971 Irish Comments on SC 2 N 3186, 1999-01-19
L2/99-072 N1970 (html, doc) Summary of Voting on SC 2 N 3186, PDAM ballot on WD for 10646-1/Amd. 28: Ideographic description characters, 1999-02-05
N2023 Paterson, Bruce (1999-04-06), FPDAM 28 Text - Ideographic Description Characters
L2/99-120 Text for FPDAM ballot of ISO/IEC 10646, Amd. 28 - Ideographic description characters, 1999-04-07
UTC/1999-014 Jenkins, John (1999-06-01), Recursion depth limit for IDC's
UTC/1999-015 Whistler, Ken (1999-06-01), Re: Brief note on length of ideograph descriptions
UTC/1999-020 Jenkins, John (1999-06-04), Diagram and language [for Ideograph Description Sequences]
L2/99-176R Moore, Lisa (1999-11-04), "Recursion Limit for Ideographic Description Characters", Minutes from the joint UTC/L2 meeting in Seattle, June 8-10, 1999
L2/99-232 N2003 Umamaheswaran, V. S. (1999-08-03), "6.1.2 PDAM28 - Ideographic Description Characters", Minutes of WG 2 meeting 36, Fukuoka, Japan, 1999-03-09--15
L2/99-253 N2067 Summary of Voting on SC 2 N 3312, ISO 10646-1/FPDAM 28 - Ideographic description characters, 1999-08-19
L2/99-301 N2123 Disposition of Comments Report on SC 2 N 3312, ISO/IEC 10646-1/FPDAM 28, AMENDMENT 28: Ideographic description characters, 1999-09-20
L2/99-302 N2124 Paterson, Bruce (1999-09-24), Revised Text for FDAM ballot of ISO/IEC 10646-1/FDAM 28, AMENDMENT 28: Ideographic description characters
L2/00-010 N2103 Umamaheswaran, V. S. (2000-01-05), "6.4.3", Minutes of WG 2 meeting 37, Copenhagen, Denmark: 1999-09-13--16
L2/00-045 Summary of FDAM voting: ISO 10646 Amd. 28: Ideographic description characters, 2000-01-31
L2/02-221 N2480 Cook, Richard (2002-05-18), Proposal to add Ideographic Description Characters (IDC) to the UCS
L2/02-436 N2534 N955 IRG Radical Classification, 2002-11-21
L2/12-087 Proposed Changes to ISO/IEC 10646 Annex I, Ideographic Description Characters, 2012-02-09
L2/12-007 Moore, Lisa (2012-02-14), "Consensus 130-C13", UTC #130 / L2 #227 Minutes, Submit L2/12-087 on extensions to ideographic description sequences to WG2.
L2/15-065 Jenkins, John (2015-02-02), Proposal to Add IDS Links to Online Unihan Database
L2/15-070 Davis, Mark (2015-02-03), IDS in Unihan
L2/15-313 Lunde, Ken (2015-11-03), Request for IDS Data
15.1 U+2FFC..2FFF 4 L2/17-386 N2273 Yang, Tao; Chan, Eiso; Wang, Yifan (2017-10-13), Submission of 3 IDCes
L2/17-379 Lunde, Ken (2017-10-20), "Proposed Ideographic Description Characters (IDCs)", IRG #49 Liaison Report
L2/18-012 Yang, Tao; Chan, Eiso; Wang, Yifan (2018-01-05), Proposal of Four IDCs
L2/18-168 Anderson, Deborah; Whistler, Ken; Pournader, Roozbeh; Moore, Lisa; Liang, Hai; Chapman, Chris; Cook, Richard (2018-04-28), "22. IDCs", Recommendations to UTC #155 April-May 2018 on Script Proposals
L2/21-118R N2492 Lunde, Ken; Jenkins, John H. (2021-08-11), Preliminary proposal to add a new provisional kIDS property (Unihan)
L2/22-136 West, Andrew (2022-07-08), Feedback on Proposals to Encode New Ideographic Description Characters
L2/22-191 N2572 Lunde, Ken; Jenkins, John; West, Andrew (2022-08-24), Proposal to encode five new Ideographic Description Characters
L2/22-227 SAT Feedback to "Preliminary proposal to add a new provisional kIDS property (Unihan)" (IRGN2492) and "Proposal to encode five new Ideographic Description Characters" (IRGN2572), 2022-08-29
L2/22-228 Fan, Ming (2022-09-02), Feedback on IRGN2572 "Proposal to encode 5 new ideograph description characters"
L2/22-247 Lunde, Ken (2022-11-01), "29", CJK & Unihan Group Recommendations for UTC #173 Meeting
L2/22-241 Constable, Peter (2022-11-09), "E.1 29", Approved Minutes of UTC Meeting 173
  a. ^ Proposed code points and character names may differ from final code points and names

References

  1. ^ "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. ^ "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
  3. ^ IDS are described in chapter 18.2 of the Unicode Standard 9.0 on pages 689 through 692.
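  4. ^ "「㬵(U+3B35)」和「胶(U+80F6)」为什么在《康熙字典》收录了两次? - 知乎" [Why were 㬵 (U+3B35) and 胶 (U+80F6) included twice in the Kangxi Dictionary? - Zhihu] (in Chinese). www.zhihu.com. Retrieved 2023-09-21.
  5. ^ "基本集扩充字考(五・完结)附扩充块新增字考" [A study of characters added to the basic set (part 5, final), with a study of characters newly added to the extension blocks]. 知乎专栏 [Zhihu Column] (in Chinese). Retrieved 2023-09-21.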