Character encoding (Generation II)

Revision as of 02:02, 12 October 2021 by Aztec Warrior (→ Species names: diplaying > displaying)
This article is incomplete.
Please feel free to edit this article to add missing information and complete it.
Reason: French, German, Italian, and Spanish character encodings

The Generation II games use a proprietary character encoding to store text data. The Generation II encoding is largely similar to the Generation I encoding. Versions of the games in different languages may use different encodings, and some differ from one another more than others.

Character sets

Technically, all characters before 0x60 function as control characters. An asterisk (*) denotes a character that is explained in the control characters section below.

English

Only a few control characters below 0x50 are actually used in the English games; each of them is marked below with an asterisk and explained in the control characters section further down. The characters below 0x50 that are not marked with an asterisk are artifacts of the Japanese control characters that print other characters with a diacritic.

-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
0- ? B C D E F G H I J K L M N O P
1- Q R S T * * * X Y Z ( ) : ; [ *
2- q r * * * * w x y z            
3- Ä Ö Ü ä ö * * * * *           *
4- Z ( ) ":"           * * * * 'r * *
5- * * * * * * * * * * * * * * * *
6-   D E F G H I V S L M :
7- PO Ké text box borders  
8- A B C D E F G H I J K L M N O P
9- Q R S T U V W X Y Z ( ) : ; [ ]
A- a b c d e f g h i j k l m n o p
B- q r s t u v w x y z            
C- Ä Ö Ü ä ö ü                    
D- 'd 'l 'm 'r 's 't 'v                
E- ' PK MN -     ? ! . & é
F- $ × . / , 0 1 2 3 4 5 6 7 8 9

The arrows at 0xDF and 0xEB are only present in Pokémon Crystal.
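As a rough illustration of how the table reads, the sketch below builds a lookup for only those English rows that are complete above (A-Z, the punctuation at 0x9A-0x9F, a-z, and the digits). The names ENGLISH and decode_english are hypothetical, and the digit positions 0xF6-0xFF assume that the digits fill the last ten cells of the F- row. Every other byte is shown as a placeholder rather than guessed.

```python
import string

# Printable cells taken from the complete rows of the English table.
ENGLISH = {}
for i, ch in enumerate(string.ascii_uppercase):   # 0x80-0x99: A-Z
    ENGLISH[0x80 + i] = ch
for i, ch in enumerate("():;[]"):                 # 0x9A-0x9F: ( ) : ; [ ]
    ENGLISH[0x9A + i] = ch
for i, ch in enumerate(string.ascii_lowercase):   # 0xA0-0xB9: a-z
    ENGLISH[0xA0 + i] = ch
for i, ch in enumerate(string.digits):            # 0xF6-0xFF: 0-9 (assumed position)
    ENGLISH[0xF6 + i] = ch


def decode_english(data: bytes) -> str:
    """Decode mapped bytes; anything unmapped is shown as <XX> in hex."""
    return "".join(ENGLISH.get(b, f"<{b:02X}>") for b in data)


print(decode_english(bytes([0x8F, 0x88, 0x8A, 0x80, 0x82, 0x87, 0x94])))  # PIKACHU
```

Control characters are deliberately left undecoded here, so 0x50 comes out as <50>.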

French & German

-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
0- Unsure
1-
2-
3-
4-
5-
6-   D E F G H I V S L M :
7- PO Ké text box borders  
8- A B C D E F G H I J K L M N O P
9- Q R S T U V W X Y Z ( ) : ; [ ]
A- a b c d e f g h i j k l m n o p
B- q r s t u v w x y z à è é ù ß ç
C- Ä Ö Ü ä ö ü ë ï â ô û ê î
D- c' d' j' l' m' n' p' s' 's t' u' y'
E- ' PK MN - + ? ! . & é
F- $ × . / , 0 1 2 3 4 5 6 7 8 9

Italian & Spanish

-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
0-
1- Unsure
2-
3-
4-
5-
6-   D E F G H I V S L M :
7- PO Ké text box borders  
8- A B C D E F G H I J K L M N O P
9- Q R S T U V W X Y Z ( ) : ; [ ]
A- a b c d e f g h i j k l m n o p
B- q r s t u v w x y z à è é ù À Á
C- Ä Ö Ü ä ö ü È É Ì Í Ñ Ò Ó Ù Ú á
D- ì í ñ ò ó ú º & 'd 'l 'm 'r 's 't 'v
E- ' PK MN - ¿ ¡ ? ! . & é
F- $ × . / , 0 1 2 3 4 5 6 7 8 9
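The European tables share their 8-, 9-, and A- rows and diverge from 0xBA onward. One way to model that is a shared base table with a per-language overlay, sketched below for the six cells 0xBA-0xBF only, since both B- rows are listed in full above. The group keys "fr_de" and "it_es" and the function build_table are hypothetical names, and the later rows are left out because their blank cells cannot be placed reliably from the tables as shown.

```python
# Cells 0xBA-0xBF as listed in the B- rows of the two tables.
B_ROW_TAIL = {
    "fr_de": "àèéùßç",   # French & German
    "it_es": "àèéùÀÁ",   # Italian & Spanish
}


def build_table(language_group: str) -> dict:
    """Return a byte -> character table for one language group."""
    table = {}
    for i in range(26):
        table[0x80 + i] = chr(ord("A") + i)   # 0x80-0x99: A-Z (shared)
        table[0xA0 + i] = chr(ord("a") + i)   # 0xA0-0xB9: a-z (shared)
    for i, ch in enumerate(B_ROW_TAIL[language_group]):
        table[0xBA + i] = ch                  # language-specific overlay
    return table


print(build_table("fr_de")[0xBE], build_table("it_es")[0xBE])
```

The same byte therefore decodes differently depending on which table is in use, which is why the language of a ROM has to be known before its text can be read.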

Japanese

-0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -A -B -C -D -E -F
0- ? イ゙ エ゙ オ゙
1- * * * ネ゙ ノ゙ * * *
2- ィ゙ あ゙ * * * *
3- * * * * * *
4- * * * * も゚ * *
5- * * * * * * * * * * * * * * * *
6-   D E F G H I V S L M :
7- text box borders  
8-
9-
A-
B-
C-
D-
E- ? !
F- × . / 0 1 2 3 4 5 6 7 8 9

Korean

Main article: Korean character encoding (Generation II)

Control characters

  This section is incomplete.
Please feel free to edit this section to add missing information and complete it.
Reason: Any alternate defaults or functions in Gold and Silver or in other languages

The characters on a gray background below are not naturally used in the games.

Character Japanese English
0x14 Prints the player's name, including a gendered honorific in Japanese (adds くん for male, ちゃん for female).
0x15 Nothing. May break.
0x16 Nothing. May break.
0x1D Prints . Prints ;.
0x1E Prints って. Prints [.
0x1F Prints . Prints  .
0x22 Prints た!. A "half" line break (moves the print position to the place one tile below the start of the current line).
0x23 Prints こうげき. Prints tzx, but this appears to be junk data.
0x24 Prints は . Prints POKé.
0x25 Prints の . Nothing.
0x35 Prints ばん どうろ. Nothing.
0x36 Prints わたし. Nothing.
0x37 Prints ここは . Nothing.
0x38 Prints レッド. Prints RED.
0x39 Prints グリーン. Prints GREEN.
0x3F Prints the opposing Trainer's name (including their Trainer class). (Outside of battle, this may not terminate properly.)
0x49 Prints おかあさん. Prints MOM.
0x4A Prints . Prints PKMN.
0x4B Appears to be the same as 0x55.
0x4C Appears to be the same as 0x55, except without any prompt or pause (the dialogue box's lines shift upwards immediately).
0x4E Line break: moves the print position to the space two tiles below the start of the current line, as defined by explicit placements of the print position. Mostly used in move descriptions and Pokédex entries.
0x4F Dialogue line break (moves the print position to the expected start of the second line in a standard dialogue box).
0x50 String terminator.
0x51 Prompts the player to press a button, after which the text window is cleared to make way for the following text.
0x52 Prints the player's name without an honorific.
0x53 Prints the rival's name.
0x54 Prints ポケモン. Prints POKé.
0x55 Prompts the player to press a button, after which the top line of the text window is replaced by the bottom, the bottom line is cleared, and the print position moves to the start of the bottom line.
0x56 Prints ...... (in the middle of the line in Japanese).
0x57 Marks the end of dialogue, without a visual prompt to the player.
0x58 Marks the end of dialogue, with a visual prompt to the player.
0x59 Prints the inactive* Pokémon's name in battle. (Outside of battle, this may not terminate properly.)
0x5A Prints the active* Pokémon's name in battle.
0x5B Prints パソコン. Prints PC.
0x5C Prints わざマシン. Prints TM.
0x5D Prints トレーナー. Prints TRAINER.
0x5E Prints ロケットだん. Prints ROCKET.
0x5F Prints a period (0xE8) and simultaneously functions as a string terminator. (Only used in Japanese Pokédex entries.)
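A minimal sketch of how a text dumper might treat some of the English control characters listed above. The function expand_controls and its default player and rival names are hypothetical. It simplifies heavily: both line-break codes become a newline, and prompts, scrolling, battle names, and the end-of-dialogue markers are not modeled and appear as hex placeholders.

```python
TERMINATOR = 0x50

# English expansions copied from the control character table.
FIXED_TEXT = {
    0x24: "POKé", 0x38: "RED", 0x39: "GREEN", 0x49: "MOM", 0x4A: "PKMN",
    0x54: "POKé", 0x56: "......", 0x5B: "PC", 0x5C: "TM",
    0x5D: "TRAINER", 0x5E: "ROCKET",
}


def expand_controls(data: bytes, player: str = "PLAYER", rival: str = "RIVAL") -> str:
    """Expand a few control codes; stop at the first string terminator."""
    out = []
    for b in data:
        if b == TERMINATOR:
            break                       # 0x50 ends the string
        if b == 0x52:
            out.append(player)          # player's name without an honorific
        elif b == 0x53:
            out.append(rival)           # rival's name
        elif b in (0x4E, 0x4F):
            out.append("\n")            # both line breaks, simplified
        elif b in FIXED_TEXT:
            out.append(FIXED_TEXT[b])
        else:
            out.append(f"<{b:02X}>")    # everything else is left undecoded
    return "".join(out)


print(expand_controls(bytes([0x52, 0x4F, 0x5D, 0x50, 0x5B])))
```

Bytes after the terminator are ignored, so the trailing 0x5B in the example never prints.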

Uses

Strings using this character encoding are found in various places, including the save data structure in RAM and, in ROM, the Trainer data structure and the examples below.

Species names

The names of all the Pokémon are stored in a simple list of 256 strings. Each entry is 10 bytes long in international versions and 5 bytes long in Japanese versions. If a name is shorter than its allotted length, it is terminated by 0x50.
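The fixed-width layout can be sketched as follows: cut the list into equal entries and trim each one at the first 0x50, if any. The function split_names is a hypothetical helper, and the filler bytes after the terminator in the example are arbitrary, since the text above does not say what the games store there.

```python
TERMINATOR = 0x50


def split_names(blob: bytes, entry_len: int) -> list:
    """Cut the list into fixed-size entries and trim each at the first 0x50."""
    names = []
    for start in range(0, len(blob), entry_len):
        entry = blob[start:start + entry_len]
        end = entry.find(TERMINATOR)    # -1 when the name fills the whole entry
        names.append(entry if end == -1 else entry[:end])
    return names


# Two 10-byte entries: a 3-byte name plus terminator, then a full-length name.
blob = bytes([0x8C, 0x84, 0x96, 0x50, 0, 0, 0, 0, 0, 0]) + bytes(range(0x80, 0x8A))
print([name.hex() for name in split_names(blob, 10)])
```

A name that uses all of its bytes has no terminator, so the entry length is the only thing that marks where it ends.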

The first name in the list is Bulbasaur while the last is ?????. The game looks up a Pokémon's name by receiving its index number, decrementing it by 1, and then looking for that index in the list. So while Bulbasaur's index number is 1, the game will find Bulbasaur's name at index 0 in the list (the first entry). These names are used for a variety of things, such as naming wild Pokémon and displaying the species name in the Pokédex and on summary screens.
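The off-by-one lookup described above can be sketched like this. The function species_name is hypothetical and works on an already decoded list. The range check is an addition for the sketch, because the text does not describe what the games do with an index number of 0.

```python
def species_name(names: list, index_number: int) -> str:
    """Look up a name by index number: index N lives at list position N - 1."""
    if not 1 <= index_number <= len(names):
        raise IndexError(f"index number {index_number} is outside the list")
    return names[index_number - 1]


names = ["BULBASAUR", "IVYSAUR", "VENUSAUR"]   # first three entries only
print(species_name(names, 1))  # BULBASAUR
```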

The following are ROM offsets for the start of the list in each game:

Game English Japanese
Gold 0x1B0B74 0x053A09
Silver 0x1B0B74 0x053A09
Crystal 0x053384 0x05341A
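Putting the offsets together with the entry lengths gives the position of any single name in the ROM. The sketch below only does the arithmetic. The names NAME_LIST_OFFSET, ENTRY_LENGTH, and name_entry_offset are hypothetical, and it assumes that the English games use the 10-byte international entry size.

```python
# Start of the species name list, copied from the table of ROM offsets.
NAME_LIST_OFFSET = {
    ("Gold", "English"): 0x1B0B74, ("Gold", "Japanese"): 0x053A09,
    ("Silver", "English"): 0x1B0B74, ("Silver", "Japanese"): 0x053A09,
    ("Crystal", "English"): 0x053384, ("Crystal", "Japanese"): 0x05341A,
}

# Bytes per entry (English assumed to use the international size).
ENTRY_LENGTH = {"English": 10, "Japanese": 5}


def name_entry_offset(game: str, language: str, index_number: int) -> int:
    """Return the ROM offset of the name entry for the given index number."""
    base = NAME_LIST_OFFSET[(game, language)]
    return base + (index_number - 1) * ENTRY_LENGTH[language]


print(hex(name_entry_offset("Crystal", "English", 25)))  # 0x53474
```

Reading ENTRY_LENGTH[language] bytes from that offset in a ROM file and trimming at 0x50 would then yield the encoded name.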


