The Sonic the Hedgehog 2.0 Hacking Page
... or not having anything better to do than hack yet another remake
Written by nextvolume

It was unavoidable. The time had come. Yet another re-release; great cash cow, by the way!
That said, it's Sonic 1, right? It can't be wrong, can it? And most of all, made with the Retro-Engine?
I had to look inside it.


Here I will describe the notation I will be using throughout the document.
I will describe every chunk of data as a C structure, and don't worry, you do not really need to know C to understand what each field means.
For example, let's say we have a signed 16-bit value referenced as A and an unsigned 32-bit value referenced as B inside a block type we reference as Simple; in my notation this would be:

struct Simple
{
    int16_t A; /** Comment 1, for A */
    uint32_t B;
/** Comment 2, for B */
};


Comments always come after the element they refer to.
I also assume you know what little-endian and big-endian order means.

How to find the game executable:
If you have the iPhone version, inside the IPA it is at Payload/Sonic 1
If you have the Android version, inside the APK it is either at lib/armeabi/ or at lib/armeabi-v7a/. The two are the same code, compiled for different CPU versions.
Can't open IPAs or APKs? Simply rename their extension to ZIP!

How to find the game datafile:
If you have the iPhone version, inside the IPA it is at Payload/Sonic
If you have the Android version, inside the APK it is at assets/Data.rsdk.xmf

Datafile format

The files for Sonic 2.0 are stored entirely within a single archive file, called the datafile.
It's like a ZIP or a TAR archive, just in its own custom format.

Datafile structure
struct DataFile
{
    struct Header header;
/** Header */
    struct FileDescriptionBlock fblock[header.numberOfFiles];
/** Sequentially, a file description block for every file stored inside the data file. */
    uint8_t data[];
/** Stored file data. */
};

struct Header
{
    uint32_t magicNumber;
/** Magic number. Used for identifying the file as an RSDK datafile. 32-bit little-endian */
    uint16_t formatVersion;
/** Usually 'vB' */
    uint16_t numberOfFiles;
/** Number of files stored in the datafile. 16-bit little-endian. */
};
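
To make the layout above concrete, here is a minimal sketch of reading the header, assuming the first 8 bytes of the datafile are already in memory. The names rd16le, rd32le, ParsedHeader and parse_header are made up for this example; they do not come from the game or from Retrun-Sonic. The format version is kept as two raw bytes, since all I say above is that it is usually 'vB'.

```c
#include <stddef.h>
#include <stdint.h>

/* Read little-endian values byte by byte, so the result does not
   depend on the byte order of the machine running this code. */
static uint16_t rd16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* In-memory copy of struct Header (hypothetical name). */
struct ParsedHeader {
    uint32_t magicNumber;
    uint8_t  formatVersion[2];  /* raw bytes, usually 'v', 'B' */
    uint16_t numberOfFiles;
};

/* Returns 0 on success, -1 if fewer than 8 bytes are available. */
static int parse_header(const uint8_t *buf, size_t len, struct ParsedHeader *out)
{
    if (len < 8)
        return -1;
    out->magicNumber      = rd32le(buf);
    out->formatVersion[0] = buf[4];
    out->formatVersion[1] = buf[5];
    out->numberOfFiles    = rd16le(buf + 6);
    return 0;
}
```

After a successful call, numberOfFiles tells you how many file description blocks follow the header.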

File Description Block
struct FileDescriptionBlock
{
    uint32_t md5[4];
/** These 32-bit little-endian values are the MD5 key for the file name. See "The MD5 keys and how to compute them" below for more information */
    uint32_t dataFileOffset;
/** Absolute offset in datafile for the file data. Example: if this value is 0xabadec, the file data for the file is found starting at offset 0xabadec in this datafile. 32-bit little endian.*/
    uint32_t fileSize;
/** File size in bytes. 32-bit little endian.
Bit 31 (the most significant bit) of this value has a very important purpose: if it is set, the file is encrypted.
If ([this value] AND 0x80000000) == 0x80000000 the file is encrypted.
To get the real file size value, AND this value with 0x7FFFFFFF */
};
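
The bit fiddling on fileSize is easier to see in code. Below is a minimal sketch of decoding one file description block from its 24 raw bytes, assuming the fields are stored back to back in the order listed above with no padding. ParsedFileBlock, parse_file_block and the FDB_ constants are names invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define FDB_SIZE      24u          /* 16 (md5) + 4 (offset) + 4 (size) */
#define FDB_ENCRYPTED 0x80000000u  /* bit 31 of fileSize */
#define FDB_SIZE_MASK 0x7FFFFFFFu  /* low 31 bits: the real size */

static uint32_t rd32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decoded form of one FileDescriptionBlock (hypothetical name). */
struct ParsedFileBlock {
    uint32_t md5[4];
    uint32_t dataFileOffset;
    uint32_t realSize;   /* fileSize with the flag bit cleared */
    int      encrypted;  /* 1 if bit 31 of fileSize was set */
};

/* raw must point to at least FDB_SIZE bytes. */
static void parse_file_block(const uint8_t *raw, struct ParsedFileBlock *out)
{
    uint32_t sizeField;
    int i;

    for (i = 0; i < 4; i++)
        out->md5[i] = rd32le(raw + 4 * i);
    out->dataFileOffset = rd32le(raw + 16);

    sizeField      = rd32le(raw + 20);
    out->encrypted = (sizeField & FDB_ENCRYPTED) != 0;
    out->realSize  = sizeField & FDB_SIZE_MASK;
}
```

The file data then starts at dataFileOffset and is realSize bytes long; this sketch only reports the encryption flag and does not attempt any decryption.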


The MD5 keys and how to compute them

To thwart hacking, the new datafile format uses the following system:

  1. A path inside the datafile is requested by the Retro-Engine SDK
  2. The path string is converted to lower case
  3. An MD5 hash is computed from the string
  4. The datafile reading code iterates through the file description blocks and compares each block's MD5 key with the computed hash. A block whose key matches is the requested file; if none matches, no file with that name exists in the datafile.
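
Steps 2 and 4 can be sketched as follows. This is only an illustration, not the game's code: lower_path, find_block and FileKey are invented names. Step 3 is left out on purpose; the key is assumed to come from any standard MD5 implementation, already packed into four 32-bit values the same way the datafile stores them (exactly how the 16 digest bytes map onto those four values is an assumption you would need to check against a real datafile).

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Just the md5 field of one FileDescriptionBlock (hypothetical name). */
struct FileKey {
    uint32_t md5[4];
};

/* Step 2: convert the requested path to lower case, in place.
   Only ASCII letters are handled. */
static void lower_path(char *path)
{
    for (; *path != '\0'; path++)
        *path = (char)tolower((unsigned char)*path);
}

/* Step 4: linear scan over the file description blocks.
   Returns the index of the block whose key equals `key`,
   or -1 if no block matches. */
static int find_block(const struct FileKey *blocks, size_t count,
                      const uint32_t key[4])
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (memcmp(blocks[i].md5, key, 4 * sizeof(uint32_t)) == 0)
            return (int)i;
    }
    return -1;
}
```

A return value of -1 corresponds to the "no file named as such" case in step 4.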

It makes our life more difficult when hacking, because we have no directory structure and do not know the path of a file.
But as long as one keeps the original hashes, despite not having the original filename, a new datafile can be created and the data will be found by the game anyway!
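
Here is a rough sketch of what "keeping the original hashes" means when rebuilding: each entry carries its MD5 key over untouched, and only the offsets are recomputed for the new file sizes. It assumes the file data is packed back to back, in table order, right after the last file description block; the original datafiles may be laid out differently. NewEntry and layout_datafile are invented names, and there is no overflow checking.

```c
#include <stddef.h>
#include <stdint.h>

#define HEADER_SIZE 8u   /* magicNumber + formatVersion + numberOfFiles */
#define FDB_SIZE    24u  /* one FileDescriptionBlock */

/* One file going into the new datafile (hypothetical name). */
struct NewEntry {
    uint32_t md5[4];          /* copied unchanged from the original datafile */
    uint32_t size;            /* size in bytes of the (possibly replaced) data */
    uint32_t dataFileOffset;  /* filled in by layout_datafile() */
};

/* Assigns an absolute offset to every entry and returns the total
   size of the resulting datafile. The md5 keys are never touched. */
static uint32_t layout_datafile(struct NewEntry *entries, size_t count)
{
    uint32_t offset = HEADER_SIZE + FDB_SIZE * (uint32_t)count;
    size_t i;

    for (i = 0; i < count; i++) {
        entries[i].dataFileOffset = offset;
        offset += entries[i].size;
    }
    return offset;
}
```

Because the game only looks files up by key, replacing a file with one of a different size works as long as the offsets and sizes in the table are updated to match.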

How to access the remastered soundtrack

The remastered soundtrack is one of the most advertised features of this remake.
The good news is that you can extract the soundtrack with Retrun-Sonic and listen to it!
The music is in the free and open source Ogg Vorbis format. Most common audio players can play the format.
Sample command line: retrun x Data.rsdk -3 -d=sonic_music
The bin files will then appear in the sonic_music directory.
If your audio player doesn't want to play the files, rename the extension of the file you want to play to .OGG
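
Since the extracted files are all named like NN.bin, a quick way to tell which ones are music is to look at their first bytes: every Ogg stream starts with the four characters "OggS". A minimal sketch follows; looks_like_ogg and file_looks_like_ogg are names made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the buffer begins with the Ogg capture pattern "OggS",
   0 otherwise. */
static int looks_like_ogg(const uint8_t *data, size_t len)
{
    return len >= 4 && memcmp(data, "OggS", 4) == 0;
}

/* Same check on a file on disk. Returns 1 or 0 as above,
   or -1 if the file cannot be opened. */
static int file_looks_like_ogg(const char *path)
{
    uint8_t head[4];
    size_t n;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;
    n = fread(head, 1, sizeof head, f);
    fclose(f);
    return looks_like_ogg(head, n);
}
```

This only checks the signature; it does not verify that the rest of the file is a valid Vorbis stream.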

iPhone version: Files from 86.bin to 111.bin

Android version: Files from 88.bin to 113.bin

How to create a datafile with Retrun-Sonic

Retrun-Sonic currently supports creating datafiles in Sonic 1 format, and it can also use a hash information file.
The hash information file allows us to tinker without knowing the original filename requested by the Retro-Engine - MD5 is a one-way hash function.
There is a line for each file for which additional information is needed, in the following format:
Filename on disk Encryption flag (Always "Dec", kept for backwards compatibility) MD5 hash

The following example replaces the Green Hill Zone music with an Ogg Vorbis file supplied by the user:
retrun x Data.rsdk -3 -P=hash.txt
[now replace Data/94.bin with your own Ogg Vorbis file]
retrun c Data Data.rsdk -3 -P=hash.txt

For the Android version, replace Data/94.bin with Data/96.bin.

Tips for reverse engineering

The datafile format has been fully reverse-engineered. Information here is just for the curious.
If you want to help reverse-engineer the game and its formats, grab a copy of the IDA Pro Demo. If you don't run one of the operating systems it is offered for, IDA Pro also runs great under Wine.
It is a crippled version, meaning you can't save and you have other serious limits, but for poking at what's inside the executable it is great.
You can reverse-engineer whatever version you have, but if you want to get the most out of your reverse engineering, it is recommended that you work on the Android version.
The reason for this is that more symbol information was preserved in the binary so you actually have many RSDK function names at hand!


Last Updated: May 23rd, 2013