TSGBAStringFetcher
Welcome to my second blog about TSGBAStringFetcher.
A short explanation of this tool: it “extracts” in-game strings from The Sims Game Boy Advance games.
In this blog, I will explain how that tool works, as I think it fits better in an actual blog than in a README.md, honestly lol.
I will try to explain it as well as I can, so that you can actually get the strings without needing this program. Though honestly, if you can use the program, just use it, as doing it manually costs a lot of time and can get annoying depending on how long the string is.
What you will need
You definitely need:
- Basic knowledge about Hexadecimal.
- A Hex Editor (HxD works pretty well for it) or a way to view the bytes of the Game.
- A Calculator (on Windows 10, the pre-installed calculator in Programmer mode is perfect for it, as it has a >> (right shift) operation and %, so it has everything you need).
- Your dumped backup of the game, of course.
- Something to store notes, because trust me, without them you will definitely get lost with all the things you need to keep in mind.
If you got all of that, then you can continue with the next step.
Preparations
Before we start, we need to make some notes:
- From which game do we want to fetch a string? See Locations below for more.
- Which String ID do we want to fetch? See ID Ranges below for the min and max String ID values you can fetch.
- In which language do we want to fetch the string? See Locations as well, as the addresses depend on it.
In this example I will choose:
- Game: The Sims 2 Game Boy Advance
- StringID: 0x379 (a hexadecimal value; this is Burple, I already checked before)
- Language: English
So with that, the notes would look like the following, for example:
- Locations:
  - Address1: 019B4990
  - Address2: 019B4B20
  - Address3: 019B4994
- StringID: 0x379
Locations
You will find the Locations for all supported games below.
The Sims 2 Game Boy Advance
Language | Address1 | Address2 | Address3 |
---|---|---|---|
English | 019B4990 | 019B4B20 | 019B4994 |
Dutch | 019D7784 | 019D7924 | 019D7788 |
French | 019FAF9C | 019FB154 | 019FAFA0 |
German | 01A1F7E0 | 01A1F98C | 01A1F7E4 |
Italian | 01A460A0 | 01A46254 | 01A460A4 |
Spanish | 01A697C0 | 01A69978 | 01A697C4 |
The Urbz - Sims in the City Game Boy Advance (non-Japanese)
Language | Address1 | Address2 | Address3 |
---|---|---|---|
English | E4F820 | E4F9B0 | E4F824 |
Dutch | E93ECC | E94074 | E93ED0 |
French | EDA9AC | EDAB60 | EDA9B0 |
German | F26B40 | F26CD8 | F26B44 |
Italian | F733B4 | F73560 | F733B8 |
Spanish | FBA2AC | FBA460 | FBA2B0 |
The Sims Bustin’ Out Game Boy Advance (non-Japanese)
Language | Address1 | Address2 | Address3 |
---|---|---|---|
English | 98D488 | 98D5FC | 98D48C |
Dutch | 9C1A7C | 9C1C00 | 9C1A80 |
French | 9F5294 | 9F5438 | 9F5298 |
German | A2FE48 | A2FFD4 | A2FE4C |
Italian | A5ECF0 | A5EE7C | A5ECF4 |
Spanish | A94E60 | A9500C | A94E64 |
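If you would rather keep these addresses in code than in your notes, here is a minimal sketch for the The Sims 2 rows. The struct and function names are my own assumptions and not taken from the tool. The small check at the end only confirms something you can see in the table: Address3 is always Address1 + 4.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical struct; the fields mirror the table columns.
struct Locations {
    const char *language;
    std::uint32_t address1;
    std::uint32_t address2;
    std::uint32_t address3;
};

// The Sims 2 Game Boy Advance rows, copied from the table above.
constexpr Locations kSims2Locations[] = {
    {"English", 0x019B4990, 0x019B4B20, 0x019B4994},
    {"Dutch",   0x019D7784, 0x019D7924, 0x019D7788},
    {"French",  0x019FAF9C, 0x019FB154, 0x019FAFA0},
    {"German",  0x01A1F7E0, 0x01A1F98C, 0x01A1F7E4},
    {"Italian", 0x01A460A0, 0x01A46254, 0x01A460A4},
    {"Spanish", 0x01A697C0, 0x01A69978, 0x01A697C4},
};

// Returns true if every row has Address3 == Address1 + 4.
constexpr bool address3IsAddress1Plus4() {
    for (std::size_t i = 0; i < sizeof(kSims2Locations) / sizeof(kSims2Locations[0]); ++i) {
        if (kSims2Locations[i].address3 != kSims2Locations[i].address1 + 4) {
            return false;
        }
    }
    return true;
}
```

The other two games could be added the same way with their own arrays.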
ID Ranges
You can find the proper min and max values of the String IDs in the table below.
Game | Min | Max |
---|---|---|
The Sims Bustin’ Out | 0x0 | 0x1A02 |
The Urbz - Sims in the City | 0x0 | 0x1AFD |
The Sims 2 | 0x0 | 0xD85 |
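A quick range check can save you from starting with an ID that does not exist. The sketch below assumes that the Max value from the table is itself still a valid ID. The enum and function names are made up for this example.

```cpp
#include <cstdint>

// Hypothetical enum for the three supported games.
enum class Game { BustinOut, Urbz, Sims2 };

// Highest String ID per game, from the table above (the minimum is always 0x0).
constexpr std::uint32_t maxStringId(Game game) {
    return game == Game::BustinOut ? 0x1A02u
         : game == Game::Urbz      ? 0x1AFDu
                                   : 0xD85u;
}

// Assumption: the Max value itself is still a valid ID.
constexpr bool isValidStringId(Game game, std::uint32_t stringId) {
    return stringId <= maxStringId(game);
}
```

For the example in this blog, `isValidStringId(Game::Sims2, 0x379)` is true, since 0x379 is below 0xD85.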
After initial preparations
Now that we know the Addresses, the String ID and the language, we can focus on some of the variables.
- Counter: We need this variable for the right shift and for the ShiftVal + ShiftAddress variables.
- Character: This will contain two bytes (while we only need one), which we will decode later on after getting all of the character bytes.
- ShiftVal: We need this for the right shift operation.
- ShiftAddress: We need this to get the ShiftVal.

Now we can also set the initial values of Counter and Character. Counter should have an initial value of 0, and Character an initial value of 0x100.
For ShiftVal and ShiftAddress, we will focus on that below, as their initial values depend on the StringID and the Addresses.
Getting the ShiftAddress
Now to get the ShiftAddress we’ll have to do the following actions:
- StringID * 0x4 (which would be 0x379 * 0x4 => 0xDE4).
- Address2 + Result from above (which would be 0x019B4B20 + 0xDE4 => 0x19B5904).
Now we have to open the Hex Editor (I use HxD in my case) and go to the location of the last result (in my case 19B5904).
We will have to read 4 bytes starting at that location.
As you can see on the picture, the section I marked B7 EC 00 00 is what we got there. The bytes are stored lowest byte first (little-endian), so we have to swap the byte order: B7 EC 00 00 becomes 00 00 EC B7, which is the value 0xECB7 or 0x0000ECB7.
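The byte swap is the same for every read in this blog, so here is a minimal sketch of it in code. The function name and parameters are my own choice for the example. It reads up to 4 bytes and puts them together lowest byte first, which gives the same value as swapping the bytes by hand.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Reads `count` bytes (at most 4) starting at `offset` and combines them
// lowest byte first, which is what the manual byte swap does.
constexpr std::uint32_t readLittleEndian(const std::uint8_t *data, std::size_t size,
                                         std::size_t offset, std::size_t count) {
    if (count > 4 || offset > size || count > size - offset) {
        throw std::out_of_range("read past end of data");
    }
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < count; ++i) {
        // Byte i ends up 8 * i bits to the left.
        value |= static_cast<std::uint32_t>(data[offset + i]) << (8 * i);
    }
    return value;
}
```

With the four bytes B7 EC 00 00 from above, this returns 0x0000ECB7.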
Now to get the final initial ShiftAddress, what we have to do is:
- Address1 + Read Value From Above (which would be 0x019B4990 + 0xECB7 => 0x19C3647).
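The two calculations for the initial ShiftAddress can be sketched as below. The function names are hypothetical. The value you read from the ROM in between is passed in as a parameter, so the sketch works without a ROM file.

```cpp
#include <cstdint>

// Where the 4-byte entry for a String ID is located: Address2 + StringID * 0x4.
constexpr std::uint32_t tableEntryAddress(std::uint32_t address2, std::uint32_t stringId) {
    return address2 + stringId * 0x4;
}

// Initial ShiftAddress: Address1 + the (byte-swapped) value read from that entry.
constexpr std::uint32_t initialShiftAddress(std::uint32_t address1, std::uint32_t entryValue) {
    return address1 + entryValue;
}
```

For the example, `tableEntryAddress(0x019B4B20, 0x379)` gives 0x19B5904 and `initialShiftAddress(0x019B4990, 0xECB7)` gives 0x19C3647, the same results as the calculator.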
Getting the ShiftVal
Now that we know the initial ShiftAddress, this part is easy.
Basically we’ll have to read 4 bytes starting at the ShiftAddress location.
In this case we’ll go to 19C3647, which gives us E1 EC 20 E3. Now we’ll swap that again, so E1 EC 20 E3 => E3 20 EC E1, which would be 0xE320ECE1.
The actual start
Now let’s get the notes again of what we initialized:
- Locations:
  - Address1: 019B4990
  - Address2: 019B4B20
  - Address3: 019B4994
- Counter: 0x0
- Character: 0x100
- ShiftVal: 0xE320ECE1
- ShiftAddress: 0x19C3647

We don’t need the StringID anymore, as we only needed it for the initial ShiftVal / ShiftAddress, so the variables above are the only ones we have to keep in mind.
Below you can find a “semi” code snippet of how the actions will be handled.
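    do {
        // Every character starts at 0x100.
        Character = 0x100;

        do {
            // Shifting Action: check the bit of ShiftVal selected by Counter.
            const bool IsZero = ((ShiftVal >> Counter) % 0x2) == 0;

            // Character Read Action: the bit decides which 2 bytes get read next.
            if (IsZero) {
                Character = Read2BytesFromROMData((Character * 0x4) + Address3 - 0x400);
            } else {
                Character = Read2BytesFromROMData((Character * 0x4) + Address3 - 0x3FE);
            }

            // Increasing Counter: after 8 bits, move on by one byte and re-read ShiftVal.
            Counter = Counter + 1;
            if (Counter == 8) {
                Counter = 0;
                ShiftAddress = ShiftAddress + 1;
                ShiftVal = Read4BytesFromROMData(ShiftAddress);
            }
        } while (Character > 0xFF); // Check Action: keep going while Character is above 0xFF.

        StringList.push_back(Character);
    } while (Character != 0x0); // A Character of 0x0 ends the string.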
Basically, what it first does is set Character to 0x100. We already did that before.
Then we run a loop with the actions explained below, for as long as Character is larger than 0xFF.
Shifting Action
This is what we’ll do first at the start of the loop:

    const bool IsZero = ((ShiftVal >> Counter) % 0x2) == 0;

Basically, in our calculator, we will do (the shift first, then the %):
(ShiftVal >> Counter) % 0x2
Which would be in the first loop:
(0xE320ECE1 >> 0) % 0x2
And the result of that would be:
0xE320ECE1 >> 0 (0xE320ECE1) % 0x2 => 1.
Because this is NOT zero, IsZero would be false.
But because Counter being 0 would be lame, let’s also do it with Counter being 4.
0xE320ECE1 >> 4 (0xE320ECE) % 0x2 => 0.
So in the case of Counter being 4, we’ll get the result 0, which means IsZero would be true.
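As a small sketch, the Shifting Action fits into one function. The name `isZeroBit` is made up for this example. It uses `& 0x1` instead of `% 0x2`, which gives the same result for unsigned values.

```cpp
#include <cstdint>

// True when the bit of shiftVal selected by counter (0 - 7) is 0.
constexpr bool isZeroBit(std::uint32_t shiftVal, unsigned counter) {
    return ((shiftVal >> counter) & 0x1u) == 0;
}
```

With the ShiftVal 0xE320ECE1 from the example, `isZeroBit(0xE320ECE1, 0)` is false and `isZeroBit(0xE320ECE1, 4)` is true, matching the two calculations above.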
If you don’t have an option for right shift, then the following way will work as well.
ShiftVal / (see table for Counter value below) % 0x2
Here / means a division where you drop everything after the decimal point. So here we will have to take a look at the table below.
Counter | Value |
---|---|
0 | 0x1 |
1 | 0x2 |
2 | 0x4 |
3 | 0x8 |
4 | 0x10 |
5 | 0x20 |
6 | 0x40 |
7 | 0x80 |
Because Counter for that action never reaches 8, we will be fine with 0 - 7.
Let’s do the same example as above with Counter being 0 and then 4.
0xE320ECE1 / 0x1 (0xE320ECE1) % 0x2 => 1.
0xE320ECE1 / 0x10 (0xE320ECE) % 0x2 => 0.
See? You basically get the same result doing it that way. As for %, if you really have no other choice, you can just use Google, as if I remember correctly it also has a calculator. Basically, the % operator gives you what is left over if you did / instead. For example: 9 / 2 would be 4, with 1 left over. If it were 8 / 2, then that would be 4 with 0 left over, as you can divide 8 by 2 completely.
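To convince yourself that the division really gives the same bit as the right shift, here is a small sketch. The divisor for each Counter value is the matching power of two, so Counter 0 divides by 0x1 and leaves the value unchanged. The array and function names are assumptions for this example.

```cpp
#include <cstdint>

// Divisor per Counter value (0 - 7): 2 to the power of Counter.
constexpr std::uint32_t kDivisors[8] = {0x1, 0x2, 0x4, 0x8, 0x10, 0x20, 0x40, 0x80};

// Same bit as (shiftVal >> counter) % 0x2, using only / and %.
// counter must be in the range 0 - 7.
constexpr std::uint32_t bitByDivision(std::uint32_t shiftVal, unsigned counter) {
    return (shiftVal / kDivisors[counter]) % 0x2;
}
```

Integer division in C++ already drops the remainder, which is exactly what the manual method needs.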
You can find a “visual” way below.
Character Read Action
After we did the Shifting Action, we can continue here.

    if (IsZero) {
        Character = Read2BytesFromROMData((Character * 0x4) + Address3 - 0x400);
    } else {
        Character = Read2BytesFromROMData((Character * 0x4) + Address3 - 0x3FE);
    }

IsZero is basically the result from the Shifting Action. Let’s go through this action step by step.
- Character * 0x4
Initially this would be: 0x100 * 0x4 => 0x400.
So far so good for the first one from that list. Let’s continue on.
- Result from Above + Address3
In this case it’d be: 0x400 + 0x019B4994 => 0x19B4D94.
So far so good too, right? Let’s get into the next one.
- Result from Above - (0x400 if IsZero from the Shifting Action was true, else 0x3FE)
Soo, because the result at Counter being 0 was 1 (which is not zero, so IsZero is false), it’d be: 0x19B4D94 - 0x3FE => 0x19B4996.
Now from the result from above, we’ll read 2 bytes through the Hex Editor again.
So like before, go to the result from above (in this case 19B4996) and read the bytes starting from there.
In this case I got 62 01 (like before, we have to swap it again, so 62 01 => 01 62, which would be 0x0162).
This will be our new Character value.
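The address calculation of the Character Read Action can be sketched as a single function. The name `nextReadAddress` is hypothetical. In effect, every Character value above 0xFF points at a pair of 2-byte entries, and IsZero picks the first or the second entry of that pair.

```cpp
#include <cstdint>

// Address of the 2 bytes that become the new Character value.
constexpr std::uint32_t nextReadAddress(std::uint32_t character, std::uint32_t address3,
                                        bool isZero) {
    return (character * 0x4) + address3 - (isZero ? 0x400u : 0x3FEu);
}
```

For the first step of the example, `nextReadAddress(0x100, 0x019B4994, false)` gives 0x19B4996.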
Increasing Counter and Action
After we got the new Character value, it’s time to increase our Counter variable by 1.
If the Counter variable reached 8, we have to do the following Actions:
- Reset Counter back to 0.
- Increase ShiftAddress by 1.
- Re-read ShiftVal from the new ShiftAddress.

In case you forgot how to do the last step, here it is explained again.
Go to the location of ShiftAddress in the Hex Editor, then read the 4 bytes again. Like before, you will have to swap the byte order again, so: 11 22 33 44 => 44 33 22 11, which would be the new value: 0x44332211 (this is only an example).
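Here is a minimal sketch of this bookkeeping step. `BitCursor`, `read32` and `advance` are names I made up for the example, and the ROM is just a byte array where the address is an index into it.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical bundle of the three values we keep in our notes.
struct BitCursor {
    std::uint32_t counter;
    std::uint32_t shiftAddress;
    std::uint32_t shiftVal;
};

// Reads 4 bytes at `address`, lowest byte first.
constexpr std::uint32_t read32(const std::uint8_t *rom, std::size_t size, std::uint32_t address) {
    if (size < 4 || address > size - 4) {
        throw std::out_of_range("read past end of ROM");
    }
    return static_cast<std::uint32_t>(rom[address])
         | static_cast<std::uint32_t>(rom[address + 1]) << 8
         | static_cast<std::uint32_t>(rom[address + 2]) << 16
         | static_cast<std::uint32_t>(rom[address + 3]) << 24;
}

// Increases Counter; at 8 it resets Counter, moves ShiftAddress on by one
// byte and re-reads ShiftVal from there.
constexpr BitCursor advance(BitCursor cursor, const std::uint8_t *rom, std::size_t size) {
    cursor.counter += 1;
    if (cursor.counter == 8) {
        cursor.counter = 0;
        cursor.shiftAddress += 1;
        cursor.shiftVal = read32(rom, size, cursor.shiftAddress);
    }
    return cursor;
}
```

Note that ShiftAddress only moves by 1 byte while ShiftVal is 4 bytes wide, so three of the four bytes are read again each time. Only the lowest 8 bits of ShiftVal are ever used.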
Check Action
Now repeat the steps from Shifting Action up to this one for as long as the Character variable is larger than 0xFF. If Character is 0xFF or smaller, then you can do the following Actions:
- Make a note of the current Character value somewhere, so you won’t forget it, or everything was for nothing.
  - You should note it like an array, so it’d look like: 0x20, 0x55, 0x77..., because we will have to decode that into a readable string at the end.
- If the Character has a value of 0x0, then you are done with this action and can keep going with the Decoding Action, which is the last thing. Else follow the step below.
- Set Character back to 0x100 and repeat from Shifting Action up to this one.
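Putting the Shifting, Character Read, Counter and Check Actions together, a runnable sketch could look like the one below. This is not the tool’s actual source, only my reading of the steps in this blog. `Rom`, `readLE` and `fetchValues` are assumed names, and the ROM is a plain byte vector where addresses are indexes into it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Rom = std::vector<std::uint8_t>;

// Reads `count` bytes (at most 4) at `address`, lowest byte first.
std::uint32_t readLE(const Rom &rom, std::uint32_t address, std::size_t count) {
    assert(count <= 4 && address <= rom.size() && count <= rom.size() - address);
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < count; ++i) {
        value |= static_cast<std::uint32_t>(rom[address + i]) << (8 * i);
    }
    return value;
}

// Collects the Character values of one string, including the final 0x0.
std::vector<std::uint8_t> fetchValues(const Rom &rom, std::uint32_t address3,
                                      std::uint32_t shiftAddress) {
    std::vector<std::uint8_t> values;
    std::uint32_t counter = 0;
    std::uint32_t shiftVal = readLE(rom, shiftAddress, 4);
    std::uint32_t character = 0;

    do {
        character = 0x100;
        do {
            // Shifting Action.
            const bool isZero = ((shiftVal >> counter) % 0x2) == 0;
            // Character Read Action.
            character = readLE(rom, (character * 0x4) + address3 - (isZero ? 0x400u : 0x3FEu), 2);
            // Increasing Counter and Action.
            counter += 1;
            if (counter == 8) {
                counter = 0;
                shiftAddress += 1;
                shiftVal = readLE(rom, shiftAddress, 4);
            }
        } while (character > 0xFF);  // Check Action.
        values.push_back(static_cast<std::uint8_t>(character));
    } while (character != 0x0);

    return values;
}
```

To try it on a real dump, you would load the file into the vector and pass Address3 together with the initial ShiftAddress from your notes.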
Decoding Action
First of all, there are values that are invalid. Those are the following:
- 0x1 - 0x9
- 0xB - 0x1F
- 0xBC - 0xFF
You can just skip them in this process.
Below you will find the table of the characters and their values.
Basically, what you have to do is search for each value from your notes in the Value column of the table, then note its Character.
After you did that for each value you read in the previous steps, you have your string in a readable format!
Value | Character |
---|---|
0x0 | \0 |
0xA | \n |
0x20 | (space) |
0x21 | ! |
0x22 | " |
0x23 | # |
0x24 | $ |
0x25 | % |
0x26 | & |
0x27 | ' |
0x28 | ( |
0x29 | ) |
0x2A | * |
0x2B | + |
0x2C | , |
0x2D | - |
0x2E | . |
0x2F | / |
0x30 | 0 |
0x31 | 1 |
0x32 | 2 |
0x33 | 3 |
0x34 | 4 |
0x35 | 5 |
0x36 | 6 |
0x37 | 7 |
0x38 | 8 |
0x39 | 9 |
0x3A | : |
0x3B | ; |
0x3C | < |
0x3D | = |
0x3E | > |
0x3F | ? |
0x40 | @ |
0x41 | A |
0x42 | B |
0x43 | C |
0x44 | D |
0x45 | E |
0x46 | F |
0x47 | G |
0x48 | H |
0x49 | I |
0x4A | J |
0x4B | K |
0x4C | L |
0x4D | M |
0x4E | N |
0x4F | O |
0x50 | P |
0x51 | Q |
0x52 | R |
0x53 | S |
0x54 | T |
0x55 | U |
0x56 | V |
0x57 | W |
0x58 | X |
0x59 | Y |
0x5A | Z |
0x5B | [ |
0x5C | \ |
0x5D | ] |
0x5E | ^ |
0x5F | _ |
0x60 | ` |
0x61 | a |
0x62 | b |
0x63 | c |
0x64 | d |
0x65 | e |
0x66 | f |
0x67 | g |
0x68 | h |
0x69 | i |
0x6A | j |
0x6B | k |
0x6C | l |
0x6D | m |
0x6E | n |
0x6F | o |
0x70 | p |
0x71 | q |
0x72 | r |
0x73 | s |
0x74 | t |
0x75 | u |
0x76 | v |
0x77 | w |
0x78 | x |
0x79 | y |
0x7A | z |
0x7B | © |
0x7C | œ |
0x7D | ¡ |
0x7E | ¿ |
0x7F | À |
0x80 | Á |
0x81 | Â |
0x82 | Ã |
0x83 | Ä |
0x84 | Å |
0x85 | Æ |
0x86 | Ç |
0x87 | È |
0x88 | É |
0x89 | Ê |
0x8A | Ë |
0x8B | Ì |
0x8C | Í |
0x8D | Î |
0x8E | Ï |
0x8F | Ñ |
0x90 | Ò |
0x91 | Ó |
0x92 | Ô |
0x93 | Õ |
0x94 | Ö |
0x95 | Ø |
0x96 | Ù |
0x97 | Ú |
0x98 | Ü |
0x99 | ß |
0x9A | à |
0x9B | á |
0x9C | â |
0x9D | ã |
0x9E | ä |
0x9F | å |
0xA0 | æ |
0xA1 | ç |
0xA2 | è |
0xA3 | é |
0xA4 | ê |
0xA5 | ë |
0xA6 | ì |
0xA7 | í |
0xA8 | î |
0xA9 | ï |
0xAA | ñ |
0xAB | ò |
0xAC | ó |
0xAD | ô |
0xAE | õ |
0xAF | ö |
0xB0 | ø |
0xB1 | ù |
0xB2 | ú |
0xB3 | û |
0xB4 | ü |
0xB5 | º |
0xB6 | ª |
0xB7 | … |
0xB8 | ™ |
0xB9 | |
0xBA | ® |
0xBB |
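For the values 0x20 - 0x7A the table is identical to ASCII, so that part of the decoding is easy to sketch in code. The version below is a simplification of my own: it skips the invalid values, stops at 0x0, and writes a `?` placeholder for 0x7B - 0xBB, which would need a lookup built from the table above. All names are assumptions for the example.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// True for the values listed as invalid in this section.
constexpr bool isInvalidValue(std::uint8_t value) {
    return (value >= 0x1 && value <= 0x9) || (value >= 0xB && value <= 0x1F) || value >= 0xBC;
}

// True for 0x20 - 0x7A, where the table matches ASCII.
constexpr bool matchesAscii(std::uint8_t value) {
    return value >= 0x20 && value <= 0x7A;
}

// Turns the noted values into text. Simplified: 0x7B - 0xBB become '?'.
std::string decodeValues(const std::vector<std::uint8_t> &values) {
    std::string text;
    for (const std::uint8_t value : values) {
        if (value == 0x0) {
            break;  // End of the string.
        }
        if (isInvalidValue(value)) {
            continue;  // Skipped, as described above.
        }
        if (value == 0xA) {
            text += '\n';
        } else if (matchesAscii(value)) {
            text += static_cast<char>(value);
        } else {
            text += '?';  // Placeholder for the special characters in the table.
        }
    }
    return text;
}
```

Calling `decodeValues({0x42, 0x75, 0x72, 0x70, 0x6C, 0x65, 0x00})` gives Burple.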
Test attempt
You can see how I reproduced it step by step on my own. All numbers in these notes are hexadecimal.
NOTE: This might be large.
PREPARATIONS
Locations:
- Address1 => 019B4990
- Address2 => 019B4B20
- Address3 => 019B4994
- StringID: 379
INITIAL SHIFT ADDRESS
379 * 4 => DE4
019B4B20 + DE4 => 19B5904
^ReadVal => B7 EC 00 00 => ECB7
ShiftAddress => 019B4990 + ECB7 => 19C3647
INITIAL SHIFT VAL
ShiftVal => E1 EC 20 E3 => E320ECE1
NOTES
Counter => 0
Character => 100
ShiftVal => E320ECE1
ShiftAddress => 19C3647
Res => 42, 75, 72, 70, 6C, 65, 00
FINAL => B, u, r, p, l, e, \0
LOOP 1
E320ECE1 >> 0 (E320ECE1) % 2 => 1
100 * 4 (400) + 019B4994 (19B4D94) - 3FE => 19B4996
Character => 62 01 => 162
Counter => 1
E320ECE1 >> 1 (71907670) % 2 => 0
162 * 4 (588) + 019B4994 (19B4F1C) - 400 => 19B4B1C
Character => 5F 01 => 15F
Counter => 2
E320ECE1 >> 2 (38C83B38) % 2 => 0
15F * 4 (57C) + 019B4994 (19B4F10) - 400 => 19B4B10
Character => 5A 01 => 15A
Counter => 3
E320ECE1 >> 3 (1C641D9C) % 2 => 0
15A * 4 (568) + 019B4994 (19B4EFC) - 400 => 19B4AFC
Character => 52 01 => 152
Counter => 4
E320ECE1 >> 4 (E320ECE) % 2 => 0
152 * 4 (548) + 019B4994 (19B4EDC) - 400 => 19B4ADC
Character => 4A 01 => 14A
Counter => 5
E320ECE1 >> 5 (7190767) % 2 => 1
14A * 4 (528) + 019B4994 (19B4EBC) - 3FE => 19B4ABE
Character => 42 01 => 142
Counter => 6
E320ECE1 >> 6 (38C83B3) % 2 => 1
142 * 4 (508) + 019B4994 (19B4E9C) - 3FE => 19B4A9E
Character => 39 01 => 139
Counter => 7
E320ECE1 >> 7 (1C641D9) % 2 => 1
139 * 4 (4E4) + 019B4994 (19B4E78) - 3FE => 19B4A7A
Character => 2B 01 => 12B
Counter => 8
COUNTER RESET
Counter => 0
ShiftAddress => 19C3648
ShiftVal => EC 20 E3 06 ==> 06E320EC
06E320EC >> 0 (06E320EC) % 2 => 0
12B * 4 (4AC) + 019B4994 (19B4E40) - 400 => 19B4A40
Character => 42 00 => 42
Counter => 1
LOOP 2
Character => 100
06E320EC >> 1 (3719076) % 2 => 0
100 * 4 (400) + 019B4994 (19B4D94) - 400 => 19B4994
Character => 61 01 => 161
Counter => 2
06E320EC >> 2 (1B8C83B) % 2 => 1
161 * 4 (584) + 019B4994 (19B4F18) - 3FE => 19B4B1A
Character => 5E 01 => 15E
Counter => 3
06E320EC >> 3 (DC641D) % 2 => 1
15E * 4 (578) + 019B4994 (19B4F0C) - 3FE => 19B4B0E
Character => 59 01 => 159
Counter => 4
06E320EC >> 4 (6E320E) % 2 => 0
159 * 4 (564) + 019B4994 (19B4EF8) - 400 => 19B4AF8
Character => 51 01 => 151
Counter => 5
06E320EC >> 5 (371907) % 2 => 1
151 * 4 (544) + 019B4994 (19B4ED8) - 3FE => 19B4ADA
Character => 75 00 => 75
Counter => 6
LOOP 3
Character => 100
06E320EC >> 6 (1B8C83) % 2 => 1
100 * 4 (400) + 019B4994 (19B4D94) - 3FE => 19B4996
Character => 62 01 => 162
Counter => 7
06E320EC >> 7 (DC641) % 2 => 1
162 * 4 (588) + 019B4994 (19B4F1C) - 3FE => 19B4B1E
Character => 60 01 => 160
Counter => 8
COUNTER RESET
Counter => 0
ShiftAddress => 19C3649
ShiftVal => 20 E3 06 60 ==> 6006E320
6006E320 >> 0 (6006E320) % 2 => 0
160 * 4 (580) + 019B4994 (19B4F14) - 400 => 19B4B14
Character => 5C 01 => 15C
Counter => 1
6006E320 >> 1 (30037190) % 2 => 0
15C * 4 (570) + 019B4994 (19B4F04) - 400 => 19B4B04
Character => 55 01 => 155
Counter => 2
6006E320 >> 2 (1801B8C8) % 2 => 0
155 * 4 (554) + 019B4994 (19B4EE8) - 400 => 19B4AE8
Character => 72 00 => 72
Counter => 3
LOOP 4
Character => 100
6006E320 >> 3 (C00DC64) % 2 => 0
100 * 4 (400) + 019B4994 (19B4D94) - 400 => 19B4994
Character => 61 01 => 161
Counter => 4
6006E320 >> 4 (6006E32) % 2 => 0
161 * 4 (584) + 019B4994 (19B4F18) - 400 => 19B4B18
Character => 5D 01 => 15D
Counter => 5
6006E320 >> 5 (3003719) % 2 => 1
15D * 4 (574) + 019B4994 (19B4F08) - 3FE => 19B4B0A
Character => 57 01 => 157
Counter => 6
6006E320 >> 6 (1801B8C) % 2 => 0
157 * 4 (55C) + 019B4994 (19B4EF0) - 400 => 19B4AF0
Character => 4F 01 => 14F
Counter => 7
6006E320 >> 7 (C00DC6) % 2 => 0
14F * 4 (53C) + 019B4994 (19B4ED0) - 400 => 19B4AD0
Character => 47 01 => 147
Counter => 8
COUNTER RESET
Counter => 0
ShiftAddress => 19C364A
ShiftVal => E3 06 60 5A ==> 5A6006E3
5A6006E3 >> 0 (5A6006E3) % 2 => 1
147 * 4 (51C) + 019B4994 (19B4EB0) - 3FE => 19B4AB2
Character => 70 00 => 70
Counter => 1
LOOP 5
Character => 100
5A6006E3 >> 1 (2D300371) % 2 => 1
100 * 4 (400) + 019B4994 (19B4D94) - 3FE => 19B4996
Character => 62 01 => 162
Counter => 2
5A6006E3 >> 2 (169801B8) % 2 => 0
162 * 4 (588) + 019B4994 (19B4F1C) - 400 => 19B4B1C
Character => 5F 01 => 15F
Counter => 3
5A6006E3 >> 3 (B4C00DC) % 2 => 0
15F * 4 (57C) + 019B4994 (19B4F10) - 400 => 19B4B10
Character => 5A 01 => 15A
Counter => 4
5A6006E3 >> 4 (5A6006E) % 2 => 0
15A * 4 (568) + 019B4994 (19B4EFC) - 400 => 19B4AFC
Character => 52 01 => 152
Counter => 5
5A6006E3 >> 5 (2D30037) % 2 => 1
152 * 4 (548) + 019B4994 (19B4EDC) - 3FE => 19B4ADE
Character => 6C 00 => 6C
Counter => 6
LOOP 6
Character => 100
5A6006E3 >> 6 (169801B) % 2 => 1
100 * 4 (400) + 019B4994 (19B4D94) - 3FE => 19B4996
Character => 62 01 => 162
Counter => 7
5A6006E3 >> 7 (B4C00D) % 2 => 1
162 * 4 (588) + 019B4994 (19B4F1C) - 3FE => 19B4B1E
Character => 60 01 => 160
Counter => 8
COUNTER RESET
Counter => 0
ShiftAddress => 19C364B
ShiftVal => 06 60 5A 72 ==> 725A6006
725A6006 >> 0 (725A6006) % 2 => 0
160 * 4 (580) + 019B4994 (19B4F14) - 400 => 19B4B14
Character => 5C 01 => 15C
Counter => 1
725A6006 >> 1 (392D3003) % 2 => 1
15C * 4 (570) + 019B4994 (19B4F04) - 3FE => 19B4B06
Character => 65 00 => 65
Counter => 2
LOOP 7
Character => 100
725A6006 >> 2 (1C969801) % 2 => 1
100 * 4 (400) + 019B4994 (19B4D94) - 3FE => 19B4996
Character => 62 01 => 162
Counter => 3
725A6006 >> 3 (E4B4C00) % 2 => 0
162 * 4 (588) + 019B4994 (19B4F1C) - 400 => 19B4B1C
Character => 5F 01 => 15F
Counter => 4
725A6006 >> 4 (725A600) % 2 => 0
15F * 4 (57C) + 019B4994 (19B4F10) - 400 => 19B4B10
Character => 5A 01 => 15A
Counter => 5
725A6006 >> 5 (392D300) % 2 => 0
15A * 4 (568) + 019B4994 (19B4EFC) - 400 => 19B4AFC
Character => 52 01 => 152
Counter => 6
725A6006 >> 6 (1C96980) % 2 => 0
152 * 4 (548) + 019B4994 (19B4EDC) - 400 => 19B4ADC
Character => 4A 01 => 14A
Counter => 7
725A6006 >> 7 (E4B4C0) % 2 => 0
14A * 4 (528) + 019B4994 (19B4EBC) - 400 => 19B4ABC
Character => 00 00 => 00
Counter => 8
COUNTER RESET
Counter => 0
ShiftAddress => 19C364C
ShiftVal => 60 5A 72 47 ==> 47725A60
DECODING
42, 75, 72, 70, 6C, 65, 00
42 => B
75 => u
72 => r
70 => p
6C => l
65 => e
00 => \0
^ Burple
You may not need that many notes, though I did it to show how it can be done.
And that’s it! As you could see, it takes quite a few steps to get a string manually, so I’d recommend using the tool instead if you can, as doing it by hand can be time intensive. I might have to edit this blog, but so far I think it looks alright. See ya until the next one I guess! ~SuperSaiyajinStackZ - 05 December 2021.