GameCube character encoding (Generation III)

Main article: Character encoding (Generation III)

This is the character encoding used in the Generation III side series games for the Nintendo GameCube.

Pokémon Box Ruby & Sapphire

Pokémon Box Ruby & Sapphire encodes its text data as Shift JIS (for Japanese text) or Windows-1252 (for Western text).
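As a rough illustration, plain text in either encoding can be decoded with Python's stock codecs. This is only a sketch: the function name `decode_box_text` is hypothetical, the input is assumed to contain no control sequences, and the stock codecs know nothing about the game's nonstandard glyphs.

```python
def decode_box_text(raw: bytes, japanese: bool) -> str:
    """Decode plain Box text with a stock Python codec (illustrative only)."""
    # Shift JIS for Japanese text, Windows-1252 for Western text.
    codec = "shift_jis" if japanese else "cp1252"
    # errors="replace" keeps the sketch from raising on bytes the codec rejects.
    return raw.decode(codec, errors="replace")


print(decode_box_text(b"Pok\xe9mon", japanese=False))
print(decode_box_text(b"\x82\xa0\x82\xa2", japanese=True))
```

The stock codecs cover far more characters than the game's fonts do, so a successful decode here does not mean the game can display the result.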

Character set

The Japanese font supports the following characters:

Size | Row | Supported code points (ranges) | Count
Single-byte | | 0x20-7E | 95
Double-byte | 1 (Special characters) | 0x8141-8142, 0x8144-8145, 0x8148-814B, 0x815B, 0x815E, 0x8160, 0x8163-8164, 0x8175-8178, 0x817C, 0x8189-818A, 0x8195, 0x819A | 22
Double-byte | 2 (Special characters) | 0x81A8 | 1
Double-byte | 3 (Numerals/Latin letters) | 0x824F-8258, 0x8260-8279, 0x8281-829A | 62
Double-byte | 4 (Hiragana) | 0x829F-82F1 | 83
Double-byte | 5 (Katakana) | 0x8340-837E, 0x8380-8396 | 86
Double-byte | 8 (Box-drawing characters) | 0x84A3-84A8, 0x84AA-84AB | 8
Double-byte | 23 (Kanji) | 0x8C8E | 1
Double-byte | 38 (Kanji) | 0x93FA | 1
Total | | | 359

The Western font supports Windows-1252 code points 0x20-FF, for a total of 224 characters.

In Western versions, the Japanese font only supports 248 of the 359 characters it supports in the Japanese version; specifically, it excludes Shift JIS code points 0x21-7E, 0x8141, 0x8144, 0x815B, 0x8160, 0x8163, 0x8195, 0x819A, 0x81A8, 0x84A3-84A8, 0x84AB, 0x8C8E, and 0x93FA.
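The totals quoted above (359 characters in the Japanese version, 248 in Western versions) can be checked by expanding the listed ranges. The following is a small sketch; the variable and function names are illustrative and the ranges are copied from the text above.

```python
# Inclusive (start, end) ranges supported by the Japanese font.
JAPANESE_FONT = [
    (0x20, 0x7E),
    (0x8141, 0x8142), (0x8144, 0x8145), (0x8148, 0x814B), (0x815B, 0x815B),
    (0x815E, 0x815E), (0x8160, 0x8160), (0x8163, 0x8164), (0x8175, 0x8178),
    (0x817C, 0x817C), (0x8189, 0x818A), (0x8195, 0x8195), (0x819A, 0x819A),
    (0x81A8, 0x81A8),
    (0x824F, 0x8258), (0x8260, 0x8279), (0x8281, 0x829A),
    (0x829F, 0x82F1),
    (0x8340, 0x837E), (0x8380, 0x8396),
    (0x84A3, 0x84A8), (0x84AA, 0x84AB),
    (0x8C8E, 0x8C8E), (0x93FA, 0x93FA),
]

# Code points excluded from the Japanese font in Western versions.
EXCLUDED_IN_WESTERN = [
    (0x21, 0x7E),
    (0x8141, 0x8141), (0x8144, 0x8144), (0x815B, 0x815B), (0x8160, 0x8160),
    (0x8163, 0x8163), (0x8195, 0x8195), (0x819A, 0x819A), (0x81A8, 0x81A8),
    (0x84A3, 0x84A8), (0x84AB, 0x84AB),
    (0x8C8E, 0x8C8E), (0x93FA, 0x93FA),
]


def expand(ranges):
    """Return the set of code points covered by inclusive ranges."""
    return {cp for start, end in ranges for cp in range(start, end + 1)}


japanese = expand(JAPANESE_FONT)
western_subset = japanese - expand(EXCLUDED_IN_WESTERN)
print(len(japanese), len(western_subset))  # 359 248
```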

A space ( ) is used as a fallback character for characters not included in a given font.

Transcoding

Text from the Game Boy Advance games is transcoded from the proprietary encoding used in those games to Shift JIS or Windows-1252 for display, though it is still stored using the proprietary encoding in the game's save file.

Only the characters on a white background below can be input in box names; Ä, Ö, Ü, ä, ö, and ü are available for input only when connected to a German game. Those on a light gray background may be used in other text strings. Characters on a dark gray background are transcoded as a # followed by the hexadecimal value of the code point. For example, code point 0xF7 would be printed as #F7, except in Japanese text in Western games, where it would be printed as spaces because these characters were removed from the Japanese font.
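The "#" fallback described above can be sketched as follows. The function name and the `japanese_font_in_western_game` flag are hypothetical, two-digit zero padding is an assumption, and the all-spaces result is inferred from the space fallback rule and the removal of 0x21-7E from the Japanese font in Western versions.

```python
def render_unmapped(code_point: int, japanese_font_in_western_game: bool = False) -> str:
    """Render a GBA code point that has no direct transcoding (illustrative)."""
    # Assumption: the hex value is printed as two uppercase digits.
    text = f"#{code_point:02X}"
    if japanese_font_in_western_game:
        # '#', digits, and letters are missing from this font,
        # so each one falls back to a space.
        return " " * len(text)
    return text


print(render_unmapped(0xF7))
print(repr(render_unmapped(0xF7, japanese_font_in_western_game=True)))
```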

Western font
-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
0- À Á Â Ç È É Ê Ë Ì Î Ï Ò Ó Ô
1- Œ Ù Ú Û Ñ ß à á ç è é ê ë ì
2- î ï ò ó ô œ ù ú û ñ º ª & +
3- = ;
4-
5- ¿ ¡ Í % ( )
6- â í
7-
8- < >
9-
A- 0 1 2 3 4 5 6 7 8 9 ! ? . - ·
B- ¨ ¢ £ , × / A B C D E
C- F G H I J K L M N O P Q R S T U
D- V W X Y Z a b c d e f g h i j k
E- l m n o p q r s t u v w x y z
F- : Ä Ö Ü ä ö ü
Japanese font
-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
0-
1-
2-
3-
4-
5-
6-
7-
8-
9-
A-
B-
C-
D-
E-
F-

The set of quotation marks used depends on the language of the connected cartridge, not the language of Pokémon Box Ruby & Sapphire.

English Spanish Italian German French
0xB1 «
0xB2 »

Nonstandard characters

The following characters are displayed in a nonstandard manner. Rows ending with JP, NA, or EU apply only to the Japanese, North American, or European version of the game, respectively.

Western font (replaced characters)
Code Windows-1252 character Displayed character
5C \ REVERSE SOLIDUS ¥ Yen sign
87 ‡ DOUBLE DAGGER er Superscript er (EU)
8D <control> Female sign
9D <control> Male sign
A2 ¢ CENT SIGN Male sign
A3 £ POUND SIGN Female sign
A5 ¥ YEN SIGN Black square
A6 ¦ BROKEN BAR Black circle
A7 § SECTION SIGN Black triangle
A8 ¨ DIAERESIS Two dot leader
B0 ° DEGREE SIGN º Masculine ordinal indicator (EU)
B6 ¶ PILCROW SIGN × Multiplication sign
B8 ¸ CEDILLA Two dot leader
BC ¼ VULGAR FRACTION ONE QUARTER Black heart
BD ½ VULGAR FRACTION ONE HALF Black star
BE ¾ VULGAR FRACTION THREE QUARTERS S Space symbol
Western font (approximated characters)
Code Windows-1252 character Displayed character
83 ƒ LATIN SMALL LETTER F WITH HOOK f Latin small letter F
8A Š LATIN CAPITAL LETTER S WITH CARON S Latin capital letter S
8C Œ LATIN CAPITAL LIGATURE OE O Latin capital letter O
8E Ž LATIN CAPITAL LETTER Z WITH CARON Z Latin capital letter Z
95 • BULLET Black circle
98 ˜ SMALL TILDE ~ Tilde
9A š LATIN SMALL LETTER S WITH CARON s Latin small letter S
9C œ LATIN SMALL LIGATURE OE o Latin small letter O
9E ž LATIN SMALL LETTER Z WITH CARON z Latin small letter Z
9F Ÿ LATIN CAPITAL LETTER Y WITH DIAERESIS Y Latin capital letter Y
A1 ¡ INVERTED EXCLAMATION MARK ! Exclamation mark (JP/NA)
AE ® REGISTERED SIGN R Latin capital letter R
B2 ² SUPERSCRIPT TWO 2 Digit two
B3 ³ SUPERSCRIPT THREE 3 Digit three
B9 ¹ SUPERSCRIPT ONE 1 Digit one
BA º MASCULINE ORDINAL INDICATOR 0 Digit zero (JP/NA)
BF ¿ INVERTED QUESTION MARK ? Question mark (JP/NA)
C6 Æ LATIN CAPITAL LETTER AE A Latin capital letter A (JP)
D0 Ð LATIN CAPITAL LETTER ETH D Latin capital letter D
D8 Ø LATIN CAPITAL LETTER O WITH STROKE O Latin capital letter O
DF ß LATIN SMALL LETTER SHARP S β Greek small letter beta
E6 æ LATIN SMALL LETTER AE a Latin small letter A
F8 ø LATIN SMALL LETTER O WITH STROKE o Latin small letter O
Western font (blank characters)
Code Windows-1252 character
7F <control>
80 € EURO SIGN
81 <control>
84 „ DOUBLE LOW-9 QUOTATION MARK (JP)
8F <control>
90 <control>
99 ™ TRADE MARK SIGN
A0   NO-BREAK SPACE
A4 ¤ CURRENCY SIGN
A9 © COPYRIGHT SIGN
AA ª FEMININE ORDINAL INDICATOR (JP)
C6 Æ LATIN CAPITAL LETTER AE (NA/EU)
DE Þ LATIN CAPITAL LETTER THORN
F0 ð LATIN SMALL LETTER ETH
FE þ LATIN SMALL LETTER THORN
Japanese font (replaced characters)
Code Shift JIS character Displayed character
7E ‾ OVERLINE ~ Tilde (JP)
84A3 ┘ BOX DRAWINGS LIGHT UP AND LEFT Ä Latin capital letter A with diaeresis (JP)
84A4 └ BOX DRAWINGS LIGHT UP AND RIGHT Ö Latin capital letter O with diaeresis (JP)
84A5 ├ BOX DRAWINGS LIGHT VERTICAL AND RIGHT Ü Latin capital letter U with diaeresis (JP)
84A6 ┬ BOX DRAWINGS LIGHT DOWN AND HORIZONTAL ä Latin small letter A with diaeresis (JP)
84A7 ┤ BOX DRAWINGS LIGHT VERTICAL AND LEFT ö Latin small letter O with diaeresis (JP)
84A8 ┴ BOX DRAWINGS LIGHT UP AND HORIZONTAL ü Latin small letter U with diaeresis (JP)
84AA ━ BOX DRAWINGS HEAVY HORIZONTAL S Space symbol
84AB ┃ BOX DRAWINGS HEAVY VERTICAL × Multiplication sign (JP)

Control characters

  • 0x00 is a terminator, marking the ends of strings.
  • 0x0A is a line break.
  • 0x1A is an escape character for variables and functions. It is followed by a byte indicating the length of the escape sequence and a byte indicating the function to call.
    • Function 0x01 prints a string from the buffer specified by the following 16-bit integer.
    • Function 0x02 prints a number from the buffer specified by the following 16-bit integer.
    • Function 0x03 performs an action specified by the following 16-bit integer. (Higher function values are unused and have the same effect as 0x03.)
      • Value 0x0001 prompts the player to press a button to continue, which clears the dialogue box before printing the next line.
      • Value 0x0002 causes a picture of an Egg to be shown in a separate box and a sound effect to play.
      • Value 0x0003 causes Brigette to keep her mouth open in a smile and close her eyes for a longer period of time. (Higher values are unused and have the same effect as 0x0003.)
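The control characters above can be illustrated with a small tokenizer. This is a sketch under stated assumptions, not the game's actual parser: the 16-bit argument is assumed to be big-endian, every escape is assumed to carry exactly one 16-bit argument, and the length byte is recorded without being interpreted because the text does not say what it counts. The name `tokenize_box` is hypothetical.

```python
import struct


def tokenize_box(data: bytes):
    """Split Box text into text runs, line breaks, and 0x1A escapes."""
    tokens = []
    run = bytearray()

    def flush():
        # Emit any pending text bytes as one token.
        if run:
            tokens.append(("text", bytes(run)))
            run.clear()

    i = 0
    while i < len(data):
        b = data[i]
        if b == 0x00:            # terminator: stop reading
            break
        if b == 0x0A:            # line break
            flush()
            tokens.append(("newline",))
            i += 1
        elif b == 0x1A:          # escape: length byte, function byte, argument
            flush()
            if i + 5 > len(data):
                raise ValueError("truncated escape sequence")
            length, func = data[i + 1], data[i + 2]
            # Assumption: one big-endian 16-bit argument follows.
            (arg,) = struct.unpack_from(">H", data, i + 3)
            tokens.append(("escape", length, func, arg))
            i += 5
        else:
            run.append(b)
            i += 1
    flush()
    return tokens


# The length value 0x05 below is made up for the example.
sample = b"Hi\n" + bytes([0x1A, 0x05, 0x03, 0x00, 0x01]) + b"Bye\x00junk"
print(tokenize_box(sample))
```

Text runs are left as raw bytes so that the caller can decode them as Shift JIS or Windows-1252 as appropriate. Scanning byte by byte is safe for Shift JIS here because its trail bytes never take the values 0x00, 0x0A, or 0x1A.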

Pokémon Colosseum and XD

Pokémon Colosseum and XD store text data as big-endian UTF-16. When communicating with the Game Boy Advance games, text is transcoded between the proprietary encoding used in those games and its Unicode equivalent for storage and display.

Character set

Pokémon Colosseum and XD split their fonts across multiple files, with fonts containing certain characters such as kanji and symbols only being loaded in menus and areas where they are needed. A filled square (⬛︎) is used as a fallback character for characters not included in a given font. The following Unicode characters are supported in at least one font in either game:

The following characters are only included in certain versions of the games:

  • Pokémon Colosseum (Japanese region):
  • Pokémon XD (all regions): , , , , , , , , , , , , ,
  • Pokémon XD (Japanese region): , , , ,
  • Pokémon XD (PAL region):

Transcoding

The following tables describe the Unicode code points that correspond to each byte value in the GBA games.

  • Cells on a dark gray background indicate the byte is mapped to the null terminator U+0000. Upon reaching a terminator, all subsequent bytes/characters are ignored and the rest of the buffer is filled with U+0000. When traded back to the GBA games, it is mapped to the end-of-string terminator 0xFF, and the rest of the buffer is filled with 0x00.
  • Cells containing indicate the byte is mapped to U+FEFE (a reserved code point in Unicode). When traded back to the GBA games, it is mapped to 0x2C (displayed as er in Western languages).
  • When traded back to the GBA games, any other Unicode character not present in the appropriate table below is mapped to 0xB7 (displayed as in Japanese and $ in Western languages).
Western font
-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
0- À Á Â Ç È É Ê Ë Ì Î Ï Ò Ó Ô
1- Œ Ù Ú Û Ñ ß à á ç è é ê ë ì
2- î ï ò ó ô œ ù ú û ñ º ª & +
3- = ;
4-
5- ¿ ¡ Í % ( )
6- â í
7-
8- < >
9-
A- 0 1 2 3 4 5 6 7 8 9 ! ? . -
B- ' $ , × / A B C D E
C- F G H I J K L M N O P Q R S T U
D- V W X Y Z a b c d e f g h i j k
E- l m n o p q r s t u v w x y z
F- : Ä Ö Ü ä ö ü
Japanese font
-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
0-  
1-
2-
3-
4-
5-
6-
7-
8-
9-
A-
B- ×
C-
D-
E-
F- Ä Ö Ü ä ö ü
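The terminator and fallback rules listed before the tables can be sketched as follows. Several simplifications are assumptions made for brevity: only digits and unaccented letters are tabulated (at their standard Generation III positions), 0xFF is the only GBA byte treated as a terminator, every other untabulated byte is treated like a U+FEFE cell, and both buffers are assumed to hold the same number of characters. The function names are hypothetical.

```python
import string

# Partial GBA byte -> character table: digits and unaccented letters only.
GBA_TO_CHAR = {}
for offset, ch in enumerate(string.digits):
    GBA_TO_CHAR[0xA1 + offset] = ch
for offset, ch in enumerate(string.ascii_uppercase):
    GBA_TO_CHAR[0xBB + offset] = ch
for offset, ch in enumerate(string.ascii_lowercase):
    GBA_TO_CHAR[0xD5 + offset] = ch
CHAR_TO_GBA = {ch: b for b, ch in GBA_TO_CHAR.items()}

GBA_TERMINATOR = 0xFF
PLACEHOLDER = "\uFEFE"


def gba_to_gcn(raw: bytes) -> str:
    """Transcode a fixed-size GBA buffer to a U+0000-padded string."""
    out = []
    for b in raw:
        if b == GBA_TERMINATOR:
            break  # everything after the terminator is ignored
        out.append(GBA_TO_CHAR.get(b, PLACEHOLDER))
    return "".join(out).ljust(len(raw), "\x00")


def gcn_to_gba(text: str, size: int) -> bytes:
    """Transcode a string back into a GBA buffer of `size` bytes."""
    out = bytearray()
    for ch in text:
        if ch == "\x00" or len(out) == size:
            break
        if ch == PLACEHOLDER:
            out.append(0x2C)                          # U+FEFE maps to 0x2C
        else:
            out.append(CHAR_TO_GBA.get(ch, 0xB7))     # unknown maps to 0xB7
    if len(out) < size:
        out.append(GBA_TERMINATOR)                    # terminator 0xFF
    return bytes(out).ljust(size, b"\x00")            # rest is 0x00


print(repr(gba_to_gcn(bytes([0xBB, 0xD6, 0xD7, 0xFF, 0x12, 0x34]))))
print(gcn_to_gba("Abc\x00zz", 6).hex())
```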

The set of quotation marks used depends on the region of Pokémon Colosseum or XD. In Japanese and American region games, Japanese or English quotation marks are always used, respectively. In PAL region games, it depends on the Pokémon's language of origin.

GBA → GCN
English Spanish Italian German French
0xB1 «
0xB2 »
GCN → GBA
English Spanish Italian German French
0xB1 (left quote) 0xB2 (right quote)
0xB2 (right quote) 0xB2 (right quote)
0xB7 (invalid) 0xB1 (left quote)
« 0xB7 (invalid) 0xB1 (left quote)
» 0xB7 (invalid) 0xB2 (right quote)

Nonstandard characters

The following characters are displayed in a nonstandard manner.

Introduced in Japanese versions of Colosseum
  • U+2018 LEFT SINGLE QUOTATION MARK is displayed as the dakuten mark used on the text entry screen in certain fonts in Japanese games.
  • U+FF3E FULLWIDTH CIRCUMFLEX ACCENT is displayed as the handakuten mark used on the text entry screen in Japanese games.
  • U+25A1 WHITE SQUARE is displayed as the space symbol (S) used on the text entry screen.
  • U+337C SQUARE ERA NAME SYOUWA is displayed as the large "e+" in "カードe+" (Card e+).
  • U+337D SQUARE ERA NAME TAISYOU is displayed as the large "e" in "カードeリーダー+" (Card e Reader+).
  • U+337E SQUARE ERA NAME MEIZI is displayed as a fullwidth "é".
  • U+FF04 FULLWIDTH DOLLAR SIGN is displayed as the fullwidth Pokémon Dollar symbol ($).
Introduced in Western versions of Colosseum
  • U+0024 $ DOLLAR SIGN is displayed as the halfwidth Pokémon Dollar symbol ($). It is displayed as a regular dollar sign in Japanese versions of Colosseum.
  • U+03B1 α GREEK SMALL LETTER ALPHA is displayed as the large "e+" in "カードe+" (Card e+) in certain fonts.
  • U+03B2 β GREEK SMALL LETTER BETA is displayed as the large "e" in "e-Reader" in certain fonts.
  • U+03B3 γ GREEK SMALL LETTER GAMMA is displayed as a superscript "er" in certain fonts.
  • U+2030 PER MILLE SIGN is displayed as a halfwidth two-dot ellipsis. The fullwidth form is mapped to the standard Unicode code point for a two-dot ellipsis.
  • U+2031 PER TEN THOUSAND SIGN is displayed as a speech bubble (🗨) or as a space ( ) depending on the font.

Control characters

  • 0x0000 is a terminator, marking the ends of strings.
  • 0xFFFF is an escape character for variables and functions. It is followed by a byte indicating the index of the function to call, and 0, 1, or 4 bytes as arguments. Some of these functions are described below.
    • When followed by 0x00, it marks a line break.
    • When followed by 0x02, it marks a prompt for the player to press a button to dismiss the dialogue box. No visible indicator is displayed.
    • When followed by 0x03, it marks a prompt for the player to press a button to continue, which clears the dialogue box before printing the next line. An indicator is displayed at the end of the line.
    • When followed by 0x04, it marks the start of the base text for furigana.
    • When followed by 0x05, it marks the end of the base text and the start of the ruby text for furigana.
    • When followed by 0x06, it marks the end of the ruby text for furigana.
    • When followed by 0x08, it changes the text color to the RGBA color specified by the following four bytes, though the alpha component is ignored.
    • When followed by 0x09, it marks a pause for a period of time specified by the following byte.
    • When followed by 0x2B, it prints the player's name.
    • When followed by 0x2C, it prints Rui's name (Colosseum only).
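A tokenizer for this format might look like the sketch below. The argument sizes for functions 0x08 (four bytes) and 0x09 (one byte) come from the list above; treating every other listed function as taking no arguments is an assumption. Functions outside the list raise an error, because their argument sizes are not documented here. The names `ARG_BYTES` and `tokenize_gcn` are hypothetical.

```python
# Argument byte counts per function index. Only 0x08 and 0x09 are stated
# in the list above; the zeros are assumptions.
ARG_BYTES = {
    0x00: 0, 0x02: 0, 0x03: 0, 0x04: 0, 0x05: 0, 0x06: 0,
    0x08: 4, 0x09: 1, 0x2B: 0, 0x2C: 0,
}


def tokenize_gcn(data: bytes):
    """Split Colosseum/XD text into text runs and 0xFFFF function calls."""
    tokens, run = [], []

    def flush():
        # Emit any pending characters as one text token.
        if run:
            tokens.append(("text", "".join(run)))
            run.clear()

    i = 0
    while i + 1 < len(data):
        unit = int.from_bytes(data[i:i + 2], "big")  # big-endian UTF-16 unit
        i += 2
        if unit == 0x0000:       # terminator
            break
        if unit != 0xFFFF:       # ordinary character (surrogates not paired)
            run.append(chr(unit))
            continue
        flush()
        if i >= len(data):
            raise ValueError("truncated escape sequence")
        func = data[i]           # one-byte function index
        i += 1
        if func not in ARG_BYTES:
            raise ValueError(f"unknown function 0x{func:02X}")
        n = ARG_BYTES[func]
        args = data[i:i + n]
        if len(args) != n:
            raise ValueError("truncated function arguments")
        i += n
        tokens.append(("func", func, bytes(args)))
    flush()
    return tokens


sample = (
    "Hi".encode("utf-16-be")
    + b"\xff\xff\x08\xff\x00\x00\xff"   # color change with RGBA bytes
    + "!".encode("utf-16-be")
    + b"\xff\xff\x00"                   # line break
    + b"\x00\x00"                       # terminator
)
print(tokenize_gcn(sample))
```

Because the function index and its arguments are single bytes, an escape sequence can leave the following characters at an odd byte offset, which is why the sketch walks the buffer by hand instead of decoding it with a UTF-16 codec in one call.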

