Character encoding (Generation I): Difference between revisions

Taking content out of the intro, and creating meta-sections for that content
No edit summary
Line 1: Line 1:
{{incomplete|needs=French, German, Italian, and Spanish character encodings}}
{{incomplete|needs=French, German, Italian, and Spanish character encodings}}
In [[Generation I]] games, proprietary '''character encoding''' is used to store text data. Different language versions may use different encodings, some more different than others.
In [[Generation I]] games, a proprietary '''character encoding''' is used to store text data.


Fixed-length user-input strings are terminated with 0x50. If a fixed-length string is terminated before using its full capacity, the content of the remaining space is undetermined.
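As a minimal sketch of the terminator rule, the following Python reads a fixed-length buffer up to the first 0x50 and ignores whatever follows it. The name <code>decode_fixed</code> and the partial table are illustrative only; the letter and digit ranges (0x80-0x99, 0xA0-0xB9, 0xF6-0xFF) are assumptions about the English table and are not stated in the text above.

```python
TERMINATOR = 0x50  # string terminator, as described above
SPACE = 0x7F       # a real space character, not "empty"


def build_partial_table():
    """Build a partial English table. The ranges below are assumptions."""
    table = {SPACE: " "}
    for i in range(26):
        table[0x80 + i] = chr(ord("A") + i)  # assumed: 0x80-0x99 = A-Z
        table[0xA0 + i] = chr(ord("a") + i)  # assumed: 0xA0-0xB9 = a-z
    for i in range(10):
        table[0xF6 + i] = str(i)             # assumed: 0xF6-0xFF = 0-9
    return table


TABLE = build_partial_table()


def decode_fixed(buf: bytes) -> str:
    """Decode a fixed-length field, stopping at the first terminator."""
    out = []
    for b in buf:
        if b == TERMINATOR:
            break  # bytes after the terminator are undetermined, so skip them
        out.append(TABLE.get(b, "?"))  # "?" marks codes outside the partial table
    return "".join(out)
```

Because the bytes after the terminator are undetermined, two buffers that differ only in that trailing space decode to the same string.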
==Compatibility==
The exact character encoding differs between languages, although all Western languages use almost-equivalent encodings. The set of [[Text entry (Generation I)|user-enterable characters]] is almost identical between Western languages, with the exception of some letters with umlauts being enterable in the German version (<code>ÄÖÜäöü</code>).


Note that 0x7F is a space (" "), not empty. All non-control characters print in one character.
In the Generation I games, cross-language compatibility is supported only among the Western games. Trading or battling between a Western language game and a Japanese or Korean game (only the Generation II games are available in Korean) usually causes some kind of corruption in both games, and is completely disabled in the [[Virtual Console]] releases. Between Western language games, the only text that can be transferred is the player's name and the [[nickname]]s and [[Original Trainer]]s of their party Pokémon.


In some contexts, some characters may display differently than suggested below. For example, in the character input table, <sup>E</sup><sub>D</sub> is 0xF0 instead of the [[Pokémon Dollar]] symbol, and in the English Pokédex, the feet (') and inches (") symbols are 0x60 and 0x61.
Due to the encodings of Western language games mostly being compatible, when [[trade|trading]] Pokémon between different Western languages, [[nickname]]s and [[Original Trainer]] names are usually displayed correctly, with the exception of characters with diacritics (such as letters with umlauts, and some characters obtainable in names in the Spanish versions of the Generation II games that cannot be [[Text entry (Generation II)|entered by players]]). The [[Original Trainer]] of Pokémon obtained in [[in-game trade]]s in Generation I is codepoint 0x5D, a control character that prints "TRAINER" in the game's language, meaning that it is automatically translated when traded between languages.
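The behavior of codepoint 0x5D can be illustrated with a short Python sketch: the stored bytes never change, and only the label substituted at display time depends on the game's language. The function <code>render_ot</code> and the placeholder label used in the usage example are hypothetical; only the English label "TRAINER" comes from the text above.

```python
TERMINATOR = 0x50       # string terminator
TRAINER_CONTROL = 0x5D  # control character that prints the localized trainer label


def render_ot(raw: bytes, trainer_label: str) -> str:
    """Render an Original Trainer field for a game whose label is trainer_label."""
    out = []
    for b in raw:
        if b == TERMINATOR:
            break
        if b == TRAINER_CONTROL:
            out.append(trainer_label)  # expanded at display time, never stored
        else:
            out.append(f"<{b:02X}>")   # ordinary characters are left undecoded here
    return "".join(out)


# The same stored bytes display differently depending on the receiving game.
stored = bytes([TRAINER_CONTROL, TERMINATOR])
english = render_ot(stored, "TRAINER")
other = render_ot(stored, "LABEL_IN_OTHER_LANGUAGE")  # hypothetical placeholder
```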
 
The [[Character encoding (Generation II)|Generation II character encoding]] for each language is almost the same as the Generation I encoding, with all user-enterable characters remaining at the same codepoints in both generations. Additionally, the English Generation II games support the letters with umlauts that can be entered in German games, unlike the English Generation I games. This means that trading Pokémon between Generation I and II games of the same language will not affect their nicknames or Original Trainer names.
 
===Poké Transporter===
{{main|Poké Transporter#Character transcoding|Poké Transporter → Character transcoding}}
When transferring a Pokémon from a Generation I or II game via [[Poké Transporter]], its nickname and Original Trainer need to be transcoded from this character encoding to that of [[Pokémon Bank]]. Due to differences in the characters that can be entered or otherwise appear in names in these games and the Generation VII games, some characters are not transcoded to the same characters they represent in these games.
 
==Rendering==
Due to how text is rendered in the Generation I Pokémon games, all non-control characters take up the exact same amount of space (i.e. the games effectively use a {{wp|monospaced font}}). In Western languages, some ligature characters exist to display two characters within the width of one (e.g. the character <code>'s</code> is the same width as <code>s</code>).
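The effect of ligatures on width can be sketched as follows: every non-control codepoint occupies one tile, so a ligature decodes to two visible characters while still counting as a single tile. The specific ligature assignments and letter ranges in this Python sketch are assumptions about the English table, and <code>measure</code> is a hypothetical helper.

```python
TERMINATOR = 0x50

# Assumed English ligature assignments (illustrative; verify against the table).
LIGATURES = {
    0xBB: "'d", 0xBC: "'l", 0xBD: "'s", 0xBE: "'t", 0xBF: "'v",
    0xE4: "'r", 0xE5: "'m",
}


def decode_char(b: int) -> str:
    """Decode one codepoint using a partial, assumed English table."""
    if b in LIGATURES:
        return LIGATURES[b]
    if 0x80 <= b <= 0x99:  # assumed: A-Z
        return chr(ord("A") + b - 0x80)
    if 0xA0 <= b <= 0xB9:  # assumed: a-z
        return chr(ord("a") + b - 0xA0)
    if b == 0x7F:          # space
        return " "
    return "?"


def measure(raw: bytes):
    """Return (decoded text, width in tiles) for a terminated string."""
    text, tiles = [], 0
    for b in raw:
        if b == TERMINATOR:
            break
        text.append(decode_char(b))
        tiles += 1  # one tile per codepoint, ligature or not
    return "".join(text), tiles
```

For instance, the three codepoints 0x88, 0xB3, 0xBD would decode to four visible characters under these assumptions while taking up three tiles.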
 
These same code points are also used for rendering elements other than text. For example, code points 0x01-0x48 are used for rendering map elements in the overworld, and code points 0x79-0x7E are [[wikipedia:box-drawing characters|box-drawing characters]] used to draw the boundaries of text boxes.
 
Some codepoints are used for different characters in different contexts. For example, 0xF0 usually represents the [[Pokémon Dollar]] symbol, but on the text entry interface it displays as <sup>E</sup><sub>D</sub> instead.
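One way to model this context dependence is a default glyph table with per-context overrides. The Python sketch below is purely illustrative: the context name <code>text_entry</code>, the function <code>glyph_for</code>, and the bracketed glyph stand-ins are hypothetical and do not describe how the games store their tiles.

```python
# Default meaning of each codepoint (only the example from the text is listed).
DEFAULT_GLYPHS = {0xF0: "[Pokémon Dollar]"}

# Per-context overrides; "text_entry" is a hypothetical context name.
CONTEXT_OVERRIDES = {
    "text_entry": {0xF0: "[ED]"},
}


def glyph_for(code: int, context=None) -> str:
    """Return the glyph stand-in shown for a codepoint in a given context."""
    overrides = CONTEXT_OVERRIDES.get(context, {})
    if code in overrides:
        return overrides[code]
    # Unknown codepoints are shown as a hex placeholder in this sketch.
    return DEFAULT_GLYPHS.get(code, f"<{code:02X}>")
```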


==English==
==English==
Due to how text is rendered in the Generation I Pokémon games, all characters take up the exact same amount of space (i.e. the games effectively use a {{wp|monospaced font}}). Some ligature characters exist to display two characters within the width of one (e.g. the character <code>'s</code> is the same width as <code>s</code>).
{| class="wikitable" style="text-align: center; border-collapse: collapse" cellpadding="2px" width="375px"
{| class="wikitable" style="text-align: center; border-collapse: collapse" cellpadding="2px" width="375px"
|-
|-
Line 109: Line 121:
* 0x6E-0x6F, 0x76-0x78, and 0xE9-0xEB are Japanese [[wikipedia:katakana|katakana]] left over in the character table from the Japanese version. They are not used in the English version.
* 0x6E-0x6F, 0x76-0x78, and 0xE9-0xEB are Japanese [[wikipedia:katakana|katakana]] left over in the character table from the Japanese version. They are not used in the English version.
* 0x74 is an [[wikipedia:interpunct|interpunct]] left over from the Japanese version. It is not used in the English version.
* 0x74 is an [[wikipedia:interpunct|interpunct]] left over from the Japanese version. It is not used in the English version.
* 0x79 are [[wikipedia:box-drawing characters|box-drawing characters]] used to draw the boundaries of text boxes. In the games themselves, they are rendered with Poké Balls in the corners.
* 0x79-0x7E are [[wikipedia:box-drawing characters|box-drawing characters]] used to draw the boundaries of text boxes. In the games themselves, they are rendered with Poké Balls in the corners.
* 0x7F is a [[wikipedia:Space (punctuation)|space]].
* 0x7F is a [[wikipedia:Space (punctuation)|space]].
* 0xBB-0xBF and 0xE4-0xE5 represent an apostrophe followed by a letter. These characters are used to render contractions in dialogue, so that the apostrophe does not take up an entire character-worth of space.
* 0xBB-0xBF and 0xE4-0xE5 represent an apostrophe followed by a letter. These characters are used to render {{wp|Contraction (grammar)|contractions}} and {{wp|English possessive|possessives}} in dialogue, so that the apostrophe does not take up an entire character-width of space.
* 0xC0-0xDF are usually blank, although some parts of the game may load characters in these code points.
* 0xC0-0xDF are usually blank, although some parts of the game may load characters in these code points.
** Letters with [[wikipedia:umlaut|umlaut]]s that are user-enterable in the German version (<code>ÄÖÜäöü</code>) are located at 0xC0-0xC5 in Western languages other than English.
** Letters with [[wikipedia:umlaut|umlaut]]s that are user-enterable in the German version (<code>ÄÖÜäöü</code>) are located at 0xC0-0xC5 in Western languages other than English.