Good protection schemes should be written in assembler. Applications should be
written in a high level language. These two statements are incompatible if you
truly wish to incorporate protection throughout your application. I have, however,
developed a technique that allows protection to be added to a C or C++ program
after the program has been compiled and linked. This has the
advantage that the inserted code can be generated by another program and can
thus be different for every program or version of a program that you issue.
Fravia+ and others have stated that to be a good protectionist you first need to
be a good cracker. I admit I started out as a protectionist, and with hindsight
I was not a particularly good one. I then learned and used the crackers' techniques,
mostly from Fravia+'s wonderful web site. (Thanks Fravia+). These lessons in cracking
gave me some wonderful insights into how I might in turn protect my own programs
from crackers. I now want to give something back to the community, hence this
article.
I don't want to get into the argument over high level languages versus assembler. I agree that assembler can be much more efficient than a high level language, both in speed of operation and in memory footprint. However, I feel that a high level language (such as C or C++) is generally the only practical choice for many applications. Unfortunately, any protection scheme written in a high level language is significantly weaker than one written in assembler.
I would like to demonstrate a technique that allows assembler code to be added to an executable after it has been compiled and linked.
To demonstrate the technique I will implement two of Mark's famous 14 protector's commandments:
6. Patch your own software. Change your code to call different validation
routines each time.
B. Flood the cracker with bogus calls and hard coded strings.
I have techniques that implement many more of the commandments and I will describe them in later articles which will build on the techniques described here.
Patch your own software
My technique is probably not quite what Mark had in mind. I assume Mark meant that each time the application was run it would modify its own file copy. The idea I have in mind however is that every release of a program will be different. Many releases of a product are just bug-fixes or minor changes to the program. The comparison of two releases will show that they are substantially the same with only a small percentage change. Thus if one version has been cracked then cracking later versions is very easy.
To patch an executable we need to know two things.
1. Where the patch is to be applied
2. What goes into the patch
Where the patch is to be applied
We can determine where to apply a patch by putting a signature in the program which can be recognised by the patch program. This signature should be one that would not normally appear in the output of the compiler/linker. For patches to the code segment I define the following macro.

    #define PATCH10 \
        __asm { _emit 0x72 }; \
        __asm { _emit 0x01 }; \
        __asm { _emit 0x72 }; \
        __asm { _emit 0xf9 }; \
        __asm { _emit 0xe9 }; \
        __asm { _emit 0x01 }; \
        __asm { _emit 0x00 }; \
        __asm { _emit 0x00 }; \
        __asm { _emit 0x00 }; \
        __asm { nop };

The code generated by this macro has the advantage that it could never be generated normally by a compiler. It is also code that cannot be correctly handled by any dis-assembler (including IDA), but more on this later.
The best dis-assembly of this code is:

    72 01            jb   +1
    72 f9            jb   -7
    e9 01 00 00 00   jmp  +1
    90               nop

The reason this cannot be generated normally is clear in the first jb instruction. It actually jumps into the middle of the second instruction, specifically to the byte f9. This op-code is the assembler instruction 'stc'. It is clear that this sequence would not normally be generated by any compiler!
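The two execution routes through the signature can be followed with a small tracer. This is only a sketch: `TraceResult` and `trace_hole` are hypothetical names, and the tracer understands just the four op-codes the signature uses (72 = jb rel8, F9 = stc, E9 = jmp rel32, 90 = nop).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Outcome of tracing a hole (hypothetical helper type). */
typedef struct {
    int  ok;           /* 1 if execution left the buffer cleanly */
    long exit_offset;  /* offset at which execution leaves the buffer */
    int  carry;        /* carry flag at that point */
} TraceResult;

/* Follow execution through `code`, starting at offset 0 with the given
   carry flag. Only 72 (jb rel8), F9 (stc), E9 (jmp rel32) and 90 (nop)
   are understood; anything else stops the trace with ok = 0. */
TraceResult trace_hole(const uint8_t *code, size_t len, int carry)
{
    TraceResult r = { 0, 0, carry };
    long ip = 0;
    int steps;

    assert(code != NULL);
    for (steps = 0; steps < 1000 && ip >= 0 && (size_t)ip < len; steps++) {
        uint8_t op = code[ip];
        if (op == 0x72 && (size_t)ip + 2 <= len) {
            int8_t rel = (int8_t)code[ip + 1];
            ip += 2;                      /* rel8 counts from the next insn */
            if (r.carry)
                ip += rel;                /* jb is taken only if carry is set */
        } else if (op == 0xF9) {
            r.carry = 1;                  /* stc */
            ip += 1;
        } else if (op == 0xE9 && (size_t)ip + 5 <= len) {
            uint32_t rel = (uint32_t)code[ip + 1] |
                           (uint32_t)code[ip + 2] << 8 |
                           (uint32_t)code[ip + 3] << 16 |
                           (uint32_t)code[ip + 4] << 24;
            ip += 5 + (long)(int32_t)rel; /* rel32 counts from the next insn */
        } else if (op == 0x90) {
            ip += 1;                      /* nop */
        } else {
            return r;                     /* unknown or truncated op-code */
        }
    }
    r.ok = !(ip >= 0 && (size_t)ip < len);
    r.exit_offset = ip;
    return r;
}
```

Traced over the ten PATCH10 bytes, execution leaves at offset 10 whether the carry flag starts clear or set, and the flag comes out unchanged. With carry clear, neither jb is taken; with carry set, the first jb lands on the f9 byte, which then executes as stc.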
This pattern of 10 bytes can be searched for by the patch program and replaced by alternative assembler code of length 10 bytes. Of course the macro can be modified to produce different lengths of 'holes' in the executable. For example.

    #define PATCH20 \
        __asm { _emit 0x72 }; \
        __asm { _emit 0x01 }; \
        __asm { _emit 0x72 }; \
        __asm { _emit 0xf9 }; \
        __asm { _emit 0xe9 }; \
        __asm { _emit 0x0b }; \
        __asm { _emit 0x00 }; \
        __asm { _emit 0x00 }; \
        __asm { _emit 0x00 }; \
        __asm { nop }; \
        __asm { nop }; \
        __asm { nop }; \
        __asm { nop }; \
        __asm { nop }; \
        __asm { nop }; \
        __asm { nop }; \
        __asm { nop }; \
        __asm { nop }; \
        __asm { nop }; \
        __asm { nop };

This produces a 'hole' of 20 bytes: nine bytes of jumps followed by the eleven nops that the jmp skips over. The same principle can be extended to provide holes of any size. I use holes of up to 100000 bytes in my programs.
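How the patch program might locate these holes can be sketched as follows. The names `HOLE_SIG` and `find_hole` are hypothetical, and the sketch assumes every hole has the layout produced by the macros above: the five fixed bytes 72 01 72 f9 e9, a 32-bit little-endian count, and then exactly that many nop bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The five fixed bytes that open every hole: jb +1, jb -7, jmp rel32. */
static const uint8_t HOLE_SIG[5] = { 0x72, 0x01, 0x72, 0xF9, 0xE9 };

/* Search image[start..len) for the next hole. On success, return its
   offset and store its total size in *hole_len; return -1 if none. */
long find_hole(const uint8_t *image, size_t len, size_t start,
               size_t *hole_len)
{
    size_t i, k;

    assert(image != NULL && hole_len != NULL);
    for (i = start; i + 9 <= len; i++) {
        uint32_t pad;

        if (memcmp(image + i, HOLE_SIG, sizeof HOLE_SIG) != 0)
            continue;
        /* The jmp displacement is the number of nop bytes that follow. */
        pad = (uint32_t)image[i + 5] |
              (uint32_t)image[i + 6] << 8 |
              (uint32_t)image[i + 7] << 16 |
              (uint32_t)image[i + 8] << 24;
        if (pad > len - (i + 9))
            continue;                     /* padding would run off the end */
        for (k = 0; k < pad && image[i + 9 + k] == 0x90; k++)
            ;
        if (k != pad)
            continue;                     /* padding is not all nops */
        *hole_len = 9 + (size_t)pad;
        return (long)i;
    }
    return -1;
}
```

A patch program could call `find_hole` repeatedly, passing the previous offset plus the previous hole length as the next `start`, until it returns -1.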
Using these macros is very easy. The following source code shows the principle:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        PATCH10
        printf("Hello world\n");
        PATCH20
        exit(1);
        PATCH10
    }

This code will result in 'holes' of 10 or 20 bytes between each line of compiler generated code. The next question is what do we want to put in these 'holes'.
What can be patched in
Now 10 or 20 bytes may not seem much. Consider, however, that a compiler will generate only 20 or 30 bytes of code on average for each line of source code (this varies from only four or five bytes up to 100 or so). Percentage-wise the holes can easily add up to a substantial fraction of our code, with the advantage that they are dispersed throughout it.
I admit that we can't do much functional code in 10 or 20 bytes. We can however (finally) get to Mark's Commandment B. Flood the cracker with bogus code and hard coded strings.
Flood the cracker with bogus code.
The Microsnot C++ compilers can produce either optimised or non-optimised code. I noted that in the non-optimised code every line of source code is self contained, and registers EAX, EBX, ECX and EDX are not used to carry values between the code generated for different source code lines. In optimised mode, however, these registers can be used to hold local variables.
To insert assembler between lines of C source code I compile my program with optimisation turned off. I can then generate bogus code using the following types of instructions.

    mov toReg, frReg
    mov toReg, [ebp+xx]
    inc/dec toReg
    mov toReg, XXXXXXXX
    mov toReg, dword_xxxxxxxx
    cmp toReg, dword_xxxxxxxx

Where toReg is one of EAX, EBX, ECX, EDX and frReg is one of EAX, ECX, EDX, EBX, ESP, EBP, ESI or EDI. The dword_xxxxxxxx is a random address in the data segment. This one is especially useful since it results in many bogus references being produced by IDA. One of the powers of IDA is being able to easily identify any reference to a data value. With hundreds (or thousands) of bogus references this makes the cracker's work so much harder.
This list of bogus code instructions can easily be extended. The trick is to look at the output of the compiler (using IDA) and see what typical instructions it generates. We don't want to generate code that will affect the program execution, so the following instructions would not be allowed:

    mov frReg, toReg
    mov dword_xxxxxxxx, toReg
    push toReg

The source code to generate this code is simply a large switch statement, one case for each type of generated code. The selection is made on a random basis and the choice of registers for each case is also made at random. The switch statement is enclosed in a loop which adds data into a 'hole' until the hole is filled with random garbage. Take care, however, to include a few one byte op-codes so that the holes can be filled completely (e.g. stc, clc).
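The switch-in-a-loop generator described above might look roughly like this. It is a sketch under stated assumptions, not the generator actually used. `fill_hole`, `bogus_insn_length`, `next_rand` and `put32` are hypothetical names. The encodings are standard 32-bit x86 forms of the allowed instructions. The data segment range is supplied by the caller, and, as the text assumes, nothing is carried in registers or flags between source lines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small deterministic generator: a given seed always yields the same filler. */
static uint32_t next_rand(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 8;
}

static void put32(uint8_t *p, uint32_t v)   /* store little-endian dword */
{
    p[0] = (uint8_t)v;         p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16); p[3] = (uint8_t)(v >> 24);
}

/* Fill hole[0..len) completely with bogus instructions and return how many
   were emitted. data_base/data_size describe the data segment (assumed). */
size_t fill_hole(uint8_t *hole, size_t len,
                 uint32_t data_base, uint32_t data_size, uint32_t seed)
{
    static const size_t SIZE[7] = { 2, 3, 1, 5, 6, 6, 1 };
    size_t pos = 0, count = 0;

    assert(hole != NULL && data_size >= 4);
    while (pos < len) {
        uint8_t *p = hole + pos;
        uint8_t to = (uint8_t)(next_rand(&seed) % 4); /* EAX, ECX, EDX, EBX */
        uint8_t fr = (uint8_t)(next_rand(&seed) % 8); /* any register */
        uint32_t addr = data_base + (next_rand(&seed) % (data_size / 4)) * 4;
        uint32_t r = next_rand(&seed);
        unsigned kind = next_rand(&seed) % 7;

        if (SIZE[kind] > len - pos)
            kind = 6;                /* no room left: use a one byte op-code */
        switch (kind) {
        case 0:  /* mov toReg, frReg */
            p[0] = 0x8B; p[1] = (uint8_t)(0xC0 | to << 3 | fr); break;
        case 1:  /* mov toReg, [ebp+xx], xx in -32..+28 */
            p[0] = 0x8B; p[1] = (uint8_t)(0x45 | to << 3);
            p[2] = (uint8_t)((int)(r % 16) * 4 - 32); break;
        case 2:  /* inc toReg or dec toReg */
            p[0] = (uint8_t)(((r & 1) ? 0x40 : 0x48) + to); break;
        case 3:  /* mov toReg, XXXXXXXX */
            p[0] = (uint8_t)(0xB8 + to); put32(p + 1, r * 2654435761u); break;
        case 4:  /* mov toReg, dword_xxxxxxxx */
            p[0] = 0x8B; p[1] = (uint8_t)(0x05 | to << 3);
            put32(p + 2, addr); break;
        case 5:  /* cmp toReg, dword_xxxxxxxx */
            p[0] = 0x3B; p[1] = (uint8_t)(0x05 | to << 3);
            put32(p + 2, addr); break;
        default: /* stc or clc */
            p[0] = (r & 1) ? 0xF9 : 0xF8; break;
        }
        pos += SIZE[kind];
        count++;
    }
    return count;
}

/* Length of one instruction of the kinds emitted above, or 0 if the bytes
   are not one of them. Handy for checking that a hole is filled exactly. */
size_t bogus_insn_length(const uint8_t *p, size_t room)
{
    if (room == 0) return 0;
    if (p[0] == 0xF8 || p[0] == 0xF9) return 1;
    if (p[0] >= 0x40 && p[0] <= 0x4F) return 1;
    if (p[0] >= 0xB8 && p[0] <= 0xBF) return room >= 5 ? 5 : 0;
    if ((p[0] == 0x8B || p[0] == 0x3B) && room >= 2) {
        uint8_t mod = p[1] >> 6, rm = p[1] & 7;
        size_t n = (mod == 3) ? 2 : (mod == 1 && rm != 4) ? 3
                 : (mod == 0 && rm == 5) ? 6 : 0;
        return n <= room ? n : 0;
    }
    return 0;
}
```

Seeding the generator differently for each release gives different filler in every build. The fallback to stc/clc is what guarantees that the last few bytes of a hole can always be filled.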
Confusing IDA and SoftIce
I would also like to demonstrate some techniques that will confuse IDA and SoftIce disassembly. Remember the code generated by the PATCH macros? The code had two execution routes: in one route a byte is used as the first byte of an op-code, in the other it is used as a data byte of a different op-code. IDA is unable to cope with this since it can't show the same byte being used in two different assembler instructions.
This technique can be used to hide functional code. Take for example the following code:

    430502  81 C7 33 C0 F7 F0   add edi, 0F0F7C033h
    430508  0F 82 F6 FF FF FF   jb  loc_430502+2

At first glance this looks fairly innocuous. But note that the conditional jump is to a location in the middle of a multi-byte op-code. If we dis-assemble from address 430504 we get the following code:

    430504  33 C0               xor eax, eax
    430506  F7 F0               div eax
    430508  0F 82 F6 FF FF FF   jb  loc_430504

Now this is certainly not innocuous. It will generate a divide by zero exception and this was hidden from view in the original IDA output.
Now it is certainly more difficult to write code in this way, but for certain key routines (like tests for the presence of SoftIce) it can be an invaluable technique. Personally I have combined this technique with the technique of patching code into holes to put multiple checks for SoftIce into my own code. Each SoftIce check was generated using random registers and random code sections so that each SoftIce check is different. Having found one such routine, the cracker cannot then use a simple byte search to find the other routines.
Techniques I plan to describe in later articles:
How to use encrypted strings and how to avoid having to decrypt them into memory
How to encrypt parts of your C and C++ program
How to calculate Cyclic Redundancy Checks on part of your program
How to detect bpx and bpmb type breakpoints in your code
How to stop a cracker from getting any useful information from your program by putting a bpmb for read or write on a key global data variable
How to use system API calls (such as MessageBoxA) in such a way that breakpoints on them never break and API Spy programs don't report their usage