One way to change the behavior of a program is to modify the executable binary itself, a technique known as binary patching. With this technique, attackers alter part of the program's code while keeping the rest of its functionality intact. This way the program can be manipulated in place without requiring the source code. It can for instance be used to remove license checks from paid software or to change any string present in the binary.
Binary patching can easily be performed with a disassembler such as radare2. As an example, some simple pincode checking code is shown below, with the corresponding assembly instructions next to it. We can see the compare (cmp) instruction followed by a jump if not equal (jne) instruction based on its result, which corresponds to the if-else control flow in the code.
The disassembler shows that 0f8527000000 is the hexadecimal encoding of the jne instruction. Here 0f 85 is the two-byte opcode of a near jne and the remaining four bytes (27000000) are the jump offset, so we could just patch the binary and change 85 to 84, which turns the instruction into a jump if equal (je, opcode 0f 84). Doing so reverses the control flow of the if-else statement. This way any value except for the correct pin results in taking the code path that corresponds to providing a correct pin, as displayed in figures 1a & 1b.
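The one-byte patch described above can be sketched in C++ on an in-memory copy of the code bytes. This is only an illustration: the function name patchJneToJe and the idea of working on a byte vector are assumptions made for the example, not part of radare2 or of the original workflow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Flip the near jump at `offset` from jne (0f 85) to je (0f 84).
// Returns false and leaves the buffer untouched if the bytes at
// `offset` are not a near jne. (Hypothetical helper for illustration.)
bool patchJneToJe(std::vector<std::uint8_t>& code, std::size_t offset) {
    // The two opcode bytes must lie inside the buffer.
    assert(code.size() >= 2 && offset <= code.size() - 2);
    if (code[offset] != 0x0f || code[offset + 1] != 0x85) {
        return false;
    }
    code[offset + 1] = 0x84;  // same encoding, inverted condition
    return true;
}
```

Only the second opcode byte changes; the four offset bytes stay the same, so the jump still targets the same label.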
Since you can’t control what happens to your program in the wild, it is impossible to prevent it from being patched. Instead of trying to prevent it, we could at least try to detect whether our code has been tampered with, i.e. binary patched, and take action accordingly. One way to do this is to use code checksumming. In the remainder of this article, we will explain how code checksumming works and provide an example of how to implement it yourself.
Figure 1a: original code and assembly

#include <iostream>

void checkPin() {
    int pin;
    std::cout << "Please enter pin: ";
    std::cin >> pin;
    if (pin == 1337) {
        std::cout << "pin correct!";
    } else {
        std::cout << "pin incorrect!";
    }
}

    ...
    cmp dword ptr [rbp - 8], 1337
    jne .LBB1_2                     ; 0f8527000000
    ...
    call std::basic_ostream
    ...
    jmp .LBB1_3
.LBB1_2:
    ...
    call std::basic_ostream
    ...
.LBB1_3:
    ...

Figure 1b: equivalent code and assembly after patching

    ...
    if (pin != 1337) {
        std::cout << "pin correct!";
    } else {
        std::cout << "pin incorrect!";
    }

    cmp dword ptr [rbp - 8], 1337
    je .LBB1_2                      ; 0f8427000000
    ...
.LBB1_2:
    ...
.LBB1_3:
    ...
To detect any changes to the original code, we can utilize a checksum of our code. This is a small piece of data derived from our code that can be used to verify its integrity.
The principle works as follows:
1. Retrieve the compiled code
2. Calculate its checksum
3. Store the checksum in the app package
4. Calculate the checksum again at runtime
5. Compare the two checksum values
This allows us to detect, at runtime, whether the code in question was changed since it was compiled. Note that we first have to compile our code before calculating the hash, since we don't know exactly what the code will look like until after compilation. An overview of the principle is shown in figure 2.
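The principle can be sketched in a few lines of C++. To keep the example free of dependencies it uses 64-bit FNV-1a, which is not a cryptographic hash and only stands in for the SHA-256 used later in this article. The names fnv1a and codeIsIntact are made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 64-bit FNV-1a. A compact stand-in for a real cryptographic hash
// such as SHA-256; do not rely on it for actual protection.
std::uint64_t fnv1a(const unsigned char* data, std::size_t length) {
    assert(data != nullptr || length == 0);
    std::uint64_t hash = 0xcbf29ce484222325ULL;  // FNV offset basis
    for (std::size_t i = 0; i < length; ++i) {
        hash ^= data[i];
        hash *= 0x100000001b3ULL;  // FNV prime
    }
    return hash;
}

// Recompute the checksum over the code bytes and compare it with the
// value that was calculated after compilation.
bool codeIsIntact(const unsigned char* code, std::size_t length,
                  std::uint64_t expected) {
    return fnv1a(code, length) == expected;
}
```

Changing a single byte of the input, such as the 85 to 84 patch from the earlier example, changes the resulting checksum, so codeIsIntact returns false for the patched bytes.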
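Implementing a code checksumming mechanism ourselves requires three specific tasks to be completed: calculating the checksum after compilation, including it in the app package, and calculating and verifying it at runtime.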
Before calculating the checksum of our compiled code, we need to decide which piece of code we want to verify. In this case, we will checksum a specific security-sensitive function, such as a license check, to verify its integrity. But you can also checksum any other part of your code that you want. You could even checksum the whole __TEXT segment in order to verify all of your code.
In order to calculate the checksum of the piece of code we are interested in, we first need the compiled bytes that make up the code as input. We can use a disassembler to help us with this task.
For example, we can retrieve the data of the checkPin() function from earlier, and calculate its checksum using radare2 as presented in figure 3.
$ r2 -A binary
> afl
...
0x100003100 4 139 sym.checkPin
...
> s 0x100003100
> ph sha256 139
85c4cc4129549a0b95a826d063470c2574212fd76f5dc5b361cb0573dbb19bb7
1. r2 -A binary: load and analyse the binary with radare2
2. afl: list functions and their information (address, #blocks, size, symbol)
3. s 0x100003100: change offset to the function address
4. ph sha256 139: calculate the hash of the function by providing the function size
We now have the resulting checksum of our code. But in order to have access to the checksum at runtime, it needs to be included in the app package somehow. For simplicity, we will do this by adding the resulting checksum value to the Info.plist file of the app.
Calculating the checksum at runtime requires many of the same steps as calculating it after compilation. The difference is that they need to be performed programmatically by our own code.
First, we need some code to find the location of our function so we can use it as input for our checksum calculation. In this case we can leverage function pointers, since they point to the start of their function. In this example the function in question is a C++ function, so retrieving the pointer only requires taking the function's address (without calling it) and casting it:
void* fnPtr = reinterpret_cast<void*>(&checkPin);
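Once the pointer is available, the bytes behind it can be read like any other memory. The sketch below assumes a mainstream desktop or mobile platform where code pages are readable and where a function pointer can be cast to a data pointer (conditionally supported in C++). The names isCorrectPin and functionBytes are hypothetical, and the length has to come from elsewhere, for example the function size reported by the disassembler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the function we want to protect.
bool isCorrectPin(int pin) { return pin == 1337; }

// Copy `length` bytes of machine code starting at `fn`. The caller must
// supply a length that does not run past readable memory.
std::vector<unsigned char> functionBytes(const void* fn, std::size_t length) {
    assert(fn != nullptr);
    std::vector<unsigned char> bytes(length);
    if (length > 0) {
        std::memcpy(bytes.data(), fn, length);
    }
    return bytes;
}
```

A call such as functionBytes(reinterpret_cast<const void*>(&isCorrectPin), 8) returns the first eight bytes of the compiled function, which can then be passed to the hash function of your choice.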
If the code was written as an Objective-C method, the pointer to its implementation could be retrieved through the Objective-C runtime (declared in <objc/runtime.h>) as follows:
void *fnPtr = (void *)class_getMethodImplementation([self class], @selector(functionName));
There is no concept of C-style function pointers in Swift. To work around this, you could implement the function you want to protect in Objective-C or C++. Or you could annotate your function with @_cdecl("functionName") to expose the function to C to be able to use its function pointer.
Now that we know where our function’s data is located, we can go ahead and write some code to calculate its checksum (i.e. hash). The hash can be calculated using any language or framework, as long as you are consistent with the hash algorithm and its parameters. For example, in Objective-C you could do it as demonstrated in figure 4.
Finally, all that is left to do is to compare the hash calculated at runtime with the previous checksum that was calculated after compilation and stored within the app package. If these two values are not equal, we know that the code has changed since compilation. Based on this information, you could for example decide to crash the app to prevent modified versions of the app from running as shown in figure 5.
#import <CommonCrypto/CommonDigest.h>

// Initialize a buffer for the checksum result
unsigned char resultBuffer[CC_SHA256_DIGEST_LENGTH];
// Calculate the SHA256 hash of the function
// (fnLength is the function size in bytes, 139 in the radare2 example)
CC_SHA256(fnPtr, (CC_LONG)fnLength, resultBuffer);
// Create a string and store the checksum in hexadecimal format
NSMutableString *result = [NSMutableString
    stringWithCapacity:CC_SHA256_DIGEST_LENGTH * 2];
for (int i = 0; i < CC_SHA256_DIGEST_LENGTH; ++i) {
    [result appendFormat:@"%02x", resultBuffer[i]];
}
// Retrieve the checksum from the Info.plist file
NSString *checksum = [[NSBundle mainBundle]
    objectForInfoDictionaryKey:@"checksum"];
// Verify whether the checksums match and exit the app if they don't.
// isEqualToString: compares the contents; != would only compare pointers.
if (![result isEqualToString:checksum]) {
    exit(0);
}
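The same encode-and-compare step could look as follows in C++. This is a sketch with made-up helper names (toHex, checksumMatches). The important detail is that the comparison is done on the string contents, not on object identity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Encode raw digest bytes as a lowercase hexadecimal string.
std::string toHex(const unsigned char* bytes, std::size_t length) {
    assert(bytes != nullptr || length == 0);
    static const char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(length * 2);
    for (std::size_t i = 0; i < length; ++i) {
        out.push_back(digits[bytes[i] >> 4]);    // high nibble
        out.push_back(digits[bytes[i] & 0x0f]);  // low nibble
    }
    return out;
}

// Compare the runtime digest with the stored hexadecimal checksum by value.
bool checksumMatches(const unsigned char* digest, std::size_t length,
                     const std::string& expectedHex) {
    return toHex(digest, length) == expectedHex;
}
```

The stored value must use the same case and format as the encoder, otherwise an untouched binary would be reported as modified.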
In the example we shared above, we chose to embed the checksum value in the app package by adding it to the Info.plist file. This approach is quite convenient, but obviously not very secure since the attacker can easily tamper with the value. For greater security, we could encrypt the checksum before adding it to the app package and decrypt it at runtime before using it.
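As a rough illustration of the store-then-recover flow, the sketch below masks the checksum with a repeating XOR key. This is obfuscation, not real encryption; it only stands in for whichever cipher you would actually use, and both the helper name xorMask and the key are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// XOR every byte with a repeating key. Applying the function twice with
// the same key restores the original data. (Illustrative only; use a
// real cipher for anything that matters.)
std::string xorMask(const std::string& data, const std::string& key) {
    assert(!key.empty());
    std::string out = data;
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = static_cast<char>(out[i] ^ key[i % key.size()]);
    }
    return out;
}
```

The masked value is what would be stored in the app package, and the runtime code would call xorMask again with the same key before comparing checksums. Note that the key then has to live somewhere in the app as well, so this raises the bar rather than removing the problem.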
Additionally, instead of adding the checksum to the Info.plist file, we could include it at another location within the app. For example, the very technique we are trying to fight could be used to add the value into the app post compilation: the checksum value can be stored in a variable with a placeholder value at compile time, which can then be replaced by the actual checksum value using binary patching.
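A minimal sketch of that placeholder replacement, operating on the binary's bytes loaded into memory, could look like this. The function name embedChecksum is hypothetical, and the sketch assumes the placeholder is a unique byte sequence of exactly the same length as the checksum, so that no other bytes in the file shift.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Overwrite the first occurrence of `placeholder` in `binary` with
// `checksum`. Returns false if the placeholder is not present.
bool embedChecksum(std::vector<unsigned char>& binary,
                   const std::string& placeholder,
                   const std::string& checksum) {
    // Equal lengths keep every other offset in the binary unchanged.
    assert(!placeholder.empty() && placeholder.size() == checksum.size());
    auto it = std::search(binary.begin(), binary.end(),
                          placeholder.begin(), placeholder.end());
    if (it == binary.end()) {
        return false;
    }
    std::copy(checksum.begin(), checksum.end(), it);
    return true;
}
```

The placeholder must lie outside the range of code that is being checksummed, otherwise writing the checksum would itself change the value it is supposed to verify.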
The process we used to calculate the checksum after compilation and add it to the app package requires some manual work. To make life easier, this process could be automated by leveraging Xcode post-action build scripts.
Even though we can now detect binary patching, a skilled reverse engineer could simply patch out our code checksumming check, rendering all our work useless. Therefore it is important to use several layers of obfuscation and runtime checks in order to maximize protection. For example, we could obfuscate the checksumming code to make it more challenging for reverse engineers to analyse what is going on.