Partially inspired by [LiveOverflow's video series] on Pokemon Red I decided to revisit a few classics from my own childhood: the 4th generation Pokemon main game titles. Mainly I really wanted to replay these games using whatever Pokemon I wanted and not just the ones availiable at certain parts of each game so from the beginning my main goal was to find a way to edit a Pokemon's species. I'll be mostly focusing on [Pokemon Platinum version] but most of this information applies to all main games from the same generation (only offsets differ).
Content:
- Starting Point
- Save file memory blocks and Bypassing CRC checksum
- The Pokemon data structure
- Conclusion and future work
Starting Point
Before doing anything remotely exciting I obviously needed to be able to run the games on my computer. I decided to go with the DeSmuME emulator as it was the easiest to get to run properly on my OS of choice. (Disclaimer: I do own physical copies of these games but if you're going to download them either way at least make sure you get them from a reputable source).
Getting Started With Reverse Engineering:
First I started two new save files, naming my character "AAAAAAA" in the first one and "BBBBBBB" in the second
Then opened both files in a hexeditor (actually I used vimdiff and the xxd command but you get the idea) and checked for differences
In the "A" file we find the sequence of bytes "2b01" seven times in a row and in the "B" file we find the sequence of bytes "2c01" seven times in a row. Considering that our character's name is seven characters long and that the number 2c comes after the number 2b, it seems highly likely that we found the player name within the save file. But if that's the case why are characters encoded in two bytes? What is that "01" for? I later found out through trial and error that it's a sort of region encoding. For example if you were to change "01" to "00" you'd get japanese characters instead of the latin alphabet. Nice so if we were to change all the "2b"s to "2d"s in the first file, our name in game would change to "CCCCCCC" once we load the save file... right...? Well...
... not exactly.
Save file memory blocks and Bypassing CRC error detection
Save file memory blocks:
Before we adress this issue a quick explanation of how information is laid out within the file. Save files are divided into two pairs of blocks, small blocks and big blocks (here you can see at what values they start and end):
With that said the data we want to mess with (player information, party pokemon, items, etc.) is all stored in the small blocks so we'll be focusing on those. We'll be referencing this Bulbapedia* page a lot from this point onward as it gives a lot of usefull information like the offsets for things we want to edit.
(Note: it seems the first time you save the game all data stored in small blocks gets saved only to the second block. The second time you save the game it gets saved to the first block and your previous save remains in the second block. From there on out your most recent save always gets written to the fisrt block and the previous save to the second block. So it appears the game stores your last two saves at all times although I still need to fully confirm this.)
Bypassing CRC:
Turns out the game uses an error detection algorithm to assure data integrity, causing our save file to become corrupted if we try to edit stuff(from the same Bulbapedia page we know the algorithm used is CRC-16-CCITT). Basically, it calculates a value based on part of the contents of the save file (in this case it takes all the bytes in the small blocks except for the block's footer: it's last 20 bytes) and checks if said value is consistent. Good news is these kinds of algorithms aren't design to stop attackers. This value is also stored in the save file so we can just:
- Implement the algorithm
- Make the changes we want to the save file
- Run the algorithm with the modified file to get the updated value
- Switch the previous value with the new one
Let's make sure these values correspond to the CRC output by implementing CRC-16-CCITT ourselves. There are several ways we can go about doing this, I opted for using a pre computed table. Let's have the code print the values it calculates in hex (also note the values are stored in little-endian):
small_block_2 = 0x40000 small_block_footer_offset = 0xcf18 def updateChecksum(dataToUpdate): # Precomputed lookup table table = [0x0000, 0x1021, 0x2042, 0x3063, 0x4084, 0x50A5, 0x60C6, 0x70E7, 0x8108, 0x9129, 0xA14A, 0xB16B, 0xC18C, 0xD1AD, 0xE1CE, 0xF1EF, 0x1231, 0x0210, 0x3273, 0x2252, 0x52B5, 0x4294, 0x72F7, 0x62D6, 0x9339, 0x8318, 0xB37B, 0xA35A, 0xD3BD, 0xC39C, 0xF3FF, 0xE3DE, 0x2462, 0x3443, 0x0420, 0x1401, 0x64E6, 0x74C7, 0x44A4, 0x5485, 0xA56A, 0xB54B, 0x8528, 0x9509, 0xE5EE, 0xF5CF, 0xC5AC, 0xD58D, 0x3653, 0x2672, 0x1611, 0x0630, 0x76D7, 0x66F6, 0x5695, 0x46B4, 0xB75B, 0xA77A, 0x9719, 0x8738, 0xF7DF, 0xE7FE, 0xD79D, 0xC7BC, 0x48C4, 0x58E5, 0x6886, 0x78A7, 0x0840, 0x1861, 0x2802, 0x3823, 0xC9CC, 0xD9ED, 0xE98E, 0xF9AF, 0x8948, 0x9969, 0xA90A, 0xB92B, 0x5AF5, 0x4AD4, 0x7AB7, 0x6A96, 0x1A71, 0x0A50, 0x3A33, 0x2A12, 0xDBFD, 0xCBDC, 0xFBBF, 0xEB9E, 0x9B79, 0x8B58, 0xBB3B, 0xAB1A, 0x6CA6, 0x7C87, 0x4CE4, 0x5CC5, 0x2C22, 0x3C03, 0x0C60, 0x1C41, 0xEDAE, 0xFD8F, 0xCDEC, 0xDDCD, 0xAD2A, 0xBD0B, 0x8D68, 0x9D49, 0x7E97, 0x6EB6, 0x5ED5, 0x4EF4, 0x3E13, 0x2E32, 0x1E51, 0x0E70, 0xFF9F, 0xEFBE, 0xDFDD, 0xCFFC, 0xBF1B, 0xAF3A, 0x9F59, 0x8F78, 0x9188, 0x81A9, 0xB1CA, 0xA1EB, 0xD10C, 0xC12D, 0xF14E, 0xE16F, 0x1080, 0x00A1, 0x30C2, 0x20E3, 0x5004, 0x4025, 0x7046, 0x6067, 0x83B9, 0x9398, 0xA3FB, 0xB3DA, 0xC33D, 0xD31C, 0xE37F, 0xF35E, 0x02B1, 0x1290, 0x22F3, 0x32D2, 0x4235, 0x5214, 0x6277, 0x7256, 0xB5EA, 0xA5CB, 0x95A8, 0x8589, 0xF56E, 0xE54F, 0xD52C, 0xC50D, 0x34E2, 0x24C3, 0x14A0, 0x0481, 0x7466, 0x6447, 0x5424, 0x4405, 0xA7DB, 0xB7FA, 0x8799, 0x97B8, 0xE75F, 0xF77E, 0xC71D, 0xD73C, 0x26D3, 0x36F2, 0x0691, 0x16B0, 0x6657, 0x7676, 0x4615, 0x5634, 0xD94C, 0xC96D, 0xF90E, 0xE92F, 0x99C8, 0x89E9, 0xB98A, 0xA9AB, 0x5844, 0x4865, 0x7806, 0x6827, 0x18C0, 0x08E1, 0x3882, 0x28A3, 0xCB7D, 0xDB5C, 0xEB3F, 0xFB1E, 0x8BF9, 0x9BD8, 0xABBB, 0xBB9A, 0x4A75, 0x5A54, 0x6A37, 0x7A16, 0x0AF1, 0x1AD0, 0x2AB3, 0x3A92, 0xFD2E, 0xED0F, 0xDD6C, 0xCD4D, 0xBDAA, 0xAD8B, 0x9DE8, 0x8DC9, 0x7C26, 0x6C07, 0x5C64, 0x4C45, 0x3CA2, 0x2C83, 0x1CE0, 0x0CC1, 0xEF1F, 0xFF3E, 0xCF5D, 0xDF7C, 0xAF9B, 0xBFBA, 0x8FD9, 0x9FF8, 0x6E17, 0x7E36, 0x4E55, 0x5E74, 0x2E93, 0x3EB2, 0x0ED1, 0x1EF0] sum = 0xffff print ("Computing checksum...") for i in range(0,len(dataToUpdate)): sum = ( (sum & 0xffff) << 8) ^ table[ ( dataToUpdate[i] ^ ( (sum >> 8) & 0xff) ) & 0xff ] return hex(sum & 0xffff) def main(): with open("AAAAAAA.dsv","rb") as f: sav = bytearray(f.read()) print (checksum(sav[small_block_2:(small_block_2 + small_block_footer_offset)])) if __name__ == "__main__": main()
This code outputs the following:
The same values we find at addresses 0x4CF2A-0x4CF2B meaning we can be sure we found the correct value. Now let's try changing all the "2b"s to "2d"s in the "A" file once more:
Then we run the same program to calculate the new value:
And we edit it within the file:
And now if we load the save file...
Success!
*A well known Pokemon wiki
The Pokemon data structure
Now that we can make changes to our save files without corrupting them let's see if we can find a way to change our Pokemon's species. The data for the up to six Pokemon in the player's party is all stored in the small blocks (each party Pokemon is 236 bytes in size). To keep things simple we'll be focusing on editing only the first Pokemon in the party.
From the Bulbapedia page we've been referencing thus far we know we can find the party Pokemon data from save file offsets 0xA0 to 0x627 and we can learn more about the location of specific offsets for things like the Pokemon's species id (or it's national dex number), its moves and ability from this page. This is where we run into another problem though.
Not only are bytes 0x08 to 0xEB of the Pokemon data structure xor encrypted, this encryption uses a pseudorandom number generator (PRNG) based on a checksum independent from the one we just bypassed. Furthermore bytes 0x08 to 0x87 are devided into four 32-byte blocks and shuffled around.
Thankfully, with the resources linked so far we can learn how these things are implemented and how to find workarrounds (hopefully without too much trouble).
Let's start by pointing out that bytes 0x01 to 0x07 are not encrypted and contain usefull information such as the checksum value and the personality value or pv of the Pokemon (for the purposes of this post you can think of it as a value that varies from Pokemon to Pokemon).
Decrypting the Pokemon structure:
Let's tackle the encryption first. From Bulbapedia we know that the PRNG uses the follwing formula:
- X[n+1] = (0x41C64E6D * X[n] + 0x6073)
- Takes the 16 upper bytes of said number
- Takes the nth 2-byte word from Pokemon data offsets 0x08 to 0x87
- And XORs them
small_block_offset = 0x40000 lead_Pokemon_offset = 0xA0 def PRNG(data,checksum): for i in range(0,128,2): seed = (0x41C64E6D * seed) + 0x00006073 data[(small_block_offset+lead_Pokemon_offset+0x08)+i] ^= (seed >> 16) & 0xff; data[(small_block_offset+lead_Pokemon_offset+0x08)+i+1] ^= (seed >> 24) & 0xff;
Unshuffling the 32-byte blocks:
Like we mentioned already, Pokemon data from bytes 0x08 to 0x87 is shuffled into four different blocks (let's call these blocks A, B, C and D). First let's undestand how the blocks are shuffled. Using the following formula:
- ((pv & 0x3E000) >> 0xD) % 24 (where pv is the Pokemon's personality value)
Result | Order |
---|---|
0 | ABCD |
1 | ABDC |
2 | ACBD |
3 | ACDB |
4 | ADBC |
5 | ADCB |
6 | BACD |
7 | BADC |
8 | BCAD |
9 | BCDA |
10 | BDAC |
11 | BDCA |
12 | CABD |
13 | CADB |
14 | CBAD |
15 | CBDA |
16 | CDAB |
17 | CDBA |
18 | DABC |
19 | DACB |
20 | DBAC |
21 | DBCA |
22 | DCAB |
23 | DCBA |
Knowing this as well as the personality value (stored in bytes 0x00 to 0x03 which are not encrypted) we can write a function to calculate the correct block order for the Pokemon we're trying to edit:
def getBlockOffsets(pv): orderTable = { # A offset -> index 0; B offset -> index 1; C offset -> index 2; D offset -> index 3 0:[0,32,64,96], #ABCD 1:[0,32,96,64], #ABDC 2:[0,64,32,96], #ACBD 3:[0,96,32,64], #ACDB 4:[0,64,96,32], #ADBC 5:[0,96,64,32], #ADCB 6:[32,0,64,96], #BACD 7:[32,0,96,64], #BADC 8:[64,0,32,96], #BCAD 9:[96,0,32,64], #BCDA 10:[64,0,96,32], #BDAC 11:[96,0,64,32], #BDCA 12:[32,64,0,96], #CABD 13:[32,96,0,64], #CADB 14:[64,32,0,96], #CBAD 15:[96,32,0,64], #CBDA 16:[64,96,0,32], #CDAB 17:[96,64,0,32], #CDBA 18:[32,64,96,0], #DABC 19:[32,96,64,0], #DACB 20:[64,32,96,0], #DBAC 21:[96,32,64,0], #DBCA 22:[64,96,64,0], #DCAB 23:[96,64,32,0] #DCBA } return orderTable[((pv & 0x3e000) >> 0xd) % 24]
The idea here is to have the function return a list L of offsets such that:
- small_block_offset + lead_Pokemon_offset + 0x08 + L[0] is the first byte in block A
- small_block_offset + lead_Pokemon_offset + 0x08 + L[1] is the first byte in block B
- And so on
Now that we have bypassed the Pokemon data structure's encryption let's combine this with our CRC bypass and try to edit a Pokemon's species id. We can play the game for a bit until we get our starter Pokemon and try to change it to a different Pokemon:
Wait that doesn't look right... Of course, we forgot to update the Pokemon checksum!
Updating the Pokemon checksum:
Thankfully this is very straightforward, the Pokemon checksum consists simply of the 16 lower bits of the sum of bytes 0x08 to 0x87 (unencrypted). We can write a function to calculate and update the Pokemon checksum:
def updatePKMChecksum(data): sum = 0 dataToUpdate = data[small_block_offset+lead_Pokemon_offset+0x08:small_block_offset+lead_Pokemon_offset+0x87] for i in range(0,len(dataToUpdate),2): sum += ([i+1] << 8) + dataToUpdate[i] data[small_block_offset + lead_Pokemon_offset + Pokemon_checksum_offset] = sum & 0xff data[small_block_offset + lead_Pokemon_offset + Pokemon_checksum_offset+1] = sum >> 8
Finally we have all the tools we need to get a hacker worthy starter:
Conclusion and future work
I ended up taking what I learned from this project and writing a small terminal based save file editor for the 4th generation Pokemon games in C++.
Current features include:
- Editing player name
- Editing lead Pokemon's species
- Editing lead Pokemon's ability
- Editing lead Pokemon's Moves
- Making lead Pokemon Shiny
No idea what caused this (also, that was supposed to be a Charizard!).