The Economics of Piracy - 3
An interesting email-exchange between
Brandon Van Every and Russ Williams
(September 1998)
Courtesy of fravia's pages of reverse engineering
Welcome to the third part of a very interesting email-exchange between
some 'real' game programmers. Even if the main concern of these guys is to
avoid CD-ROM pirating, some of the tricks they are proposing and evaluating
are quite relevant for all reverse engineering enthusiasts, as you'll read. I have added very few comments.
Russ Williams wrote in message <906206116.15699.0.nnrp-
>
>Well, if you do the security on every single disc, then the QA
>would be done on the protected game - if it goes wrong,
>the testers will bitch about it...
But the lead programmer could screw up by generating a non-unique ID that
doesn't crash the game. Like, he grabbed the wrong triangle because the
engineers, not knowing any better, moved something. You'd need QA that
actually knows what the identification process is supposed to be. Either
that or an infallible lead programmer. :-)
>OK. Fair point. Something like using the least significant
>bits of each byte in a BMP body would be bad because it
>could be wiped out trivially once detected.
Rather, "could be wiped out trivially." The only thing that needs detecting
is that BMP data exists. Then you wipe the BMP data, you don't need to know
if it's an ID source or not. You just need to add enough noise to the data
that whatever ID info it might have been carrying is completely scrambled.
Although past a certain point of perturbation, the perturbed BMP data could
become meaningless. Ruin all the artwork, maybe this isn't advantageous to
the cracker after all. Then again, maybe people would be happy with games
in new color schemes.
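
A minimal sketch of the point above, under stated assumptions: the "BMP body" is just a stand-in byte array, and `embed_id`, `extract_id` and `scrub` are hypothetical names, not anything Russ or Brandon actually wrote. It shows that a cracker never has to locate the ID: randomizing every least significant bit changes each byte by at most 1 and scrambles whatever was hidden there.

```python
import random

def embed_id(pixels: bytes, disc_id: int, bits: int = 32) -> bytes:
    """Hide disc_id in the least significant bit of the first `bits` bytes."""
    out = bytearray(pixels)
    for i in range(bits):
        out[i] = (out[i] & 0xFE) | ((disc_id >> i) & 1)
    return bytes(out)

def extract_id(pixels: bytes, bits: int = 32) -> int:
    """Read the hidden ID back from the same LSB positions."""
    return sum((pixels[i] & 1) << i for i in range(bits))

def scrub(pixels: bytes, seed: int = 0) -> bytes:
    """Cracker's move: randomize every LSB without knowing where the ID sits."""
    rng = random.Random(seed)
    return bytes((b & 0xFE) | rng.getrandbits(1) for b in pixels)

body = bytes(range(256)) * 4            # stand-in for a BMP pixel array (assumption)
marked = embed_id(body, 0xC0FFEE42)     # hypothetical 32-bit disc ID
scrubbed = scrub(marked)
worst = max(abs(a - b) for a, b in zip(marked, scrubbed))  # largest per-byte change
```

The image survives practically untouched (`worst` is at most 1), while the value read back from the scrubbed copy is just noise.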
>>The *statistical* part comes from how many disks you
>>release to the world. The odds of the extremely determined
>>cracker getting ahold of 1, or 2, of them.
>
>Yup. The way I'd counter that is to provide codes grouped
>on 3 sources. That way you need nC3 keys, but a disc from
><=3 sources will have a piece of the data that's identical
>in all 3 versions and identifies which 3 leaked. If only
>2 leak, then there will be many more keys to identify
>which 2. Obviously, this is trivially expandable to any
>number in a group (in case you think 42 discs will go
>walkabout).
It's late... can you explain this one from the top? You want to somehow
scramble data from 16 different disks into each other? One thing I'm really
missing is what exactly you're scrambling. Different source pools? No that
can't be it... it's late.
>It could work for levels if you use lots and lots of strong
>encryption and chain levels together (ie: the key for level
>n is in the data for level n-1) and alter things after some
>number of levels. They'd crack the first dozen levels, say,
>figure it's working, but have missed that the encryption
>method changes for level 13..
The decryption method is always contained in the binary somewhere. So
you're just presuming about the carefulness/carelessness of the cracker.
There are a million needle-in-a-haystack strategies out there, all can be
defeated with sufficient patience. Just a matter of how much you enjoyed
writing it, vs. how much the cracker enjoys cracking it.
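
A toy sketch of the chaining Russ describes (the key for level n travels inside the data of level n-1). Everything here is an assumption made for illustration: the SHA-256 based XOR keystream is a stand-in for "lots of strong encryption", the 16-byte key length is arbitrary, and the change of method at level 13 is not modelled. It also illustrates Brandon's objection: `play_through` is exactly the routine that must ship in the binary.

```python
import hashlib

KEY_LEN = 16  # arbitrary key size for this sketch

def keystream(key: bytes, n: int) -> bytes:
    """Toy keystream: SHA-256 over key plus a counter (illustration only)."""
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]

def xor_crypt(key: bytes, data: bytes) -> bytes:
    """XOR with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

def pack_levels(levels, keys):
    """Encrypt level i with keys[i]; its plaintext starts with keys[i + 1]."""
    packed = []
    for i, payload in enumerate(levels):
        next_key = keys[i + 1] if i + 1 < len(levels) else bytes(KEY_LEN)
        packed.append(xor_crypt(keys[i], next_key + payload))
    return packed

def play_through(packed, first_key):
    """What the game (and the cracker) does: each level unlocks the next."""
    key, plain_levels = first_key, []
    for blob in packed:
        plain = xor_crypt(key, blob)
        key, payload = plain[:KEY_LEN], plain[KEY_LEN:]
        plain_levels.append(payload)
    return plain_levels

levels = [b"level one map", b"level two map", b"level three map"]
keys = [hashlib.sha256(b"level-%d" % i).digest()[:KEY_LEN] for i in range(len(levels))]
packed = pack_levels(levels, keys)
```

Only `keys[0]` has to live in the executable; a wrong first key yields garbage for every level that follows.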
>>No, you don't want it in the raw data. The raw data is easy
>>to perturb randomly and still get basically the same data.
>
>That depends how you place it. Ultra low frequency
>components across the whole dataset would be much
>more difficult to remove. Imagine a sample with a 4Hz
>sine wave mixed in - it would be undetectable by ear
>and noise wouldn't remove it.
So you run it under an audio filter and chop anything the ear can't hear.
Then you do some nasty things to what you can hear, so that the data is now
sufficiently different.
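
Both halves of this argument can be tried out in a small sketch. The numbers are assumptions chosen for the demo (8000 samples per second, a 440Hz tone as the "music", a 4Hz mark at 1% amplitude, a brick-wall cut at 20Hz), and the function names are hypothetical. Random noise leaves the 4Hz component detectable by correlation, as Russ says, while chopping everything below the audible range removes it, as Brandon says.

```python
import math
import random

RATE = 8000   # samples per second (assumed)
N = RATE      # exactly one second, so every whole-Hz tone fits evenly

def tone(freq, amp):
    return [amp * math.sin(2 * math.pi * freq * i / RATE) for i in range(N)]

def amplitude_at(signal, freq):
    """Estimate the amplitude of one frequency by correlating with cos/sin."""
    re = sum(x * math.cos(2 * math.pi * freq * i / RATE) for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * i / RATE) for i, x in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)

def remove_below(signal, cutoff_hz):
    """Brick-wall high-pass: subtract DC and each whole-Hz component below cutoff."""
    n = len(signal)
    mean = sum(signal) / n
    out = [x - mean for x in signal]
    for k in range(1, cutoff_hz):
        c = [math.cos(2 * math.pi * k * i / RATE) for i in range(n)]
        s = [math.sin(2 * math.pi * k * i / RATE) for i in range(n)]
        a = 2 * sum(x * ci for x, ci in zip(signal, c)) / n
        b = 2 * sum(x * si for x, si in zip(signal, s)) / n
        out = [o - a * ci - b * si for o, ci, si in zip(out, c, s)]
    return out

rng = random.Random(1998)
music = tone(440, 1.0)                                   # audible content
marked = [m + w for m, w in zip(music, tone(4, 0.01))]   # inaudible 4Hz mark added
noisy = [x + rng.uniform(-0.05, 0.05) for x in marked]   # cracker adds random noise
filtered = remove_below(noisy, 20)                       # cracker chops below 20Hz
```

Comparing `amplitude_at(noisy, 4)` with `amplitude_at(filtered, 4)` shows the mark surviving the noise but not the filter, while the 440Hz tone is left essentially as it was.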
>>You want to stick your key in the INDEX STRUCTURE
>>of the data.
>
>Too obvious.
Again, the goal is not "find the key!" The goal is to erase the key. An
index structure is selected because it'll take the most time to erase, you
have to figure out how the damn thing works before you can do it.
>>Somewhere that takes quite a while to figure out how
>>to transform without breaking everything.
>
>But, as you've said, there are crackers willing to spend
>any amount of time doing the grunt work..
Well, what kind of index file would be so onerous that only the most
foolhardy cracker would attempt it?
>>To reiterate, you don't FIND the key. You eradicate
>>everything, including the key.
>
>It's not always that simple. Unless the crackers are going
>to rip out every image, every sample, every piece of
>data from the game then they're not going to be 100%
>effective.
They are going to do exactly that. They are going to develop automated
methods for ripping everything to shreds. Hence why encoding in BMP files
doesn't work, the structure of the file is well-known and easily
transformed. The index methods are not well-known,
they're unique to each app. So the goal is to think up a really onerous,
convoluted one. Something so horrid you'd need months of test-debug
iteration to figure it all out.
;True! Yet the debugging overload is something that programmers
;that are not crackers as well will not endure!
>[...]
>>Incidentally, if you're willing to burn your CDs one at a time,
>>then you could use the same data transformation methods
>>to encode the unique identity of a file. Rather than sticking
>>a unique ID on each disk somewhere within the file, and
>>running the risk of 2 files being compared, you make the data
>>file on *every* CD unique.
>
>That was the idea of #2 above - a compressed and
>encrypted data set can't be compared meaningfully if
>the encryption key changes between builds. You need
>to spend ages decompressing before you can
>compare.
No, your technique is different. I'm not encrypting anything on the CDs,
I'm just guaranteeing that the entire data file has a unique bit pattern for
each CD. The entire file becomes the ID. I can read my unique file without
a decryption mechanism. The problem with your decryption, is that once the
file is decrypted it's the same regardless of what CD it came from. And the
file *can* be decrypted, the code is always available to do this in the
binary itself.
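
One way to picture "the entire file becomes the ID" is a data file whose records are laid out in a different physical order on every disc, with an index telling the game where each record lives. This is only a sketch under assumptions: the index format (big-endian offset/length pairs), the use of a permutation as the identifier, and all function names are invented here, and the index is deliberately far simpler than the onerous one Brandon has in mind. The game reads every disc with the same code and no key, yet no two discs carry the same bytes, and the publisher can read the serial back from the layout.

```python
import struct

def permutation_from_serial(serial, n):
    """Map a serial number in [0, n!) to a unique ordering of n records."""
    items, perm = list(range(n)), []
    for radix in range(n, 0, -1):
        serial, idx = divmod(serial, radix)
        perm.append(items.pop(idx))
    return perm

def serial_from_permutation(perm):
    """Inverse mapping: recover the serial number from a record ordering."""
    items, serial, weight = list(range(len(perm))), 0, 1
    for radix, value in zip(range(len(perm), 0, -1), perm):
        idx = items.index(value)
        items.pop(idx)
        serial += idx * weight
        weight *= radix
    return serial

def build_disc(records, serial):
    """Lay the (non-empty) records out in the physical order the serial dictates."""
    order = permutation_from_serial(serial, len(records))
    base, body, index = 8 * len(records), b"", [None] * len(records)
    for logical in order:
        index[logical] = (base + len(body), len(records[logical]))
        body += records[logical]
    return b"".join(struct.pack(">II", off, ln) for off, ln in index) + body

def read_record(blob, i):
    """What the game does: follow the index, no decryption involved."""
    off, ln = struct.unpack_from(">II", blob, 8 * i)
    return blob[off:off + ln]

def recover_serial(blob, n):
    """What the publisher does with a leaked file: read the ID off the layout."""
    offsets = [struct.unpack_from(">II", blob, 8 * i)[0] for i in range(n)]
    return serial_from_permutation(sorted(range(n), key=lambda i: offsets[i]))

records = [b"mesh", b"texture", b"sound", b"level"]   # hypothetical game assets
disc_a = build_disc(records, 0)
disc_b = build_disc(records, 17)
```

A cracker who works out this particular index can of course re-sort the records and erase the ID, which is why the argument turns on how hard the index structure is to understand.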
Cheers,                    3d graphics optimization jock
Brandon Van Every          Seattle, WA
-----------------------------------------------------------------------
If we are all Gods         and we have thrown our toys the mortals away
and we are Immortal        What shall we do
and we cannot die          to entertain ourselves?
Brandon Van Every wrote:
>Russ Williams wrote:
[...]
>>>The *statistical* part comes from how many disks you
>>>release to the world. The odds of the extremely determined
>>>cracker getting ahold of 1, or 2, of them.
>>
>>Yup. The way I'd counter that is to provide codes grouped
>>on 3 sources. That way you need nC3 keys, but a disc from
>><=3 sources will have a piece of the data that's identical
>>in all 3 versions and identifies which 3 leaked. If only
>>2 leak, then there will be many more keys to identify
>>which 2. Obviously, this is trivially expandable to any
>>number in a group (in case you think 42 discs will go
>>walkabout).
>
>It's late... can you explain this one from the top? You want
>to somehow scramble data from 16 different disks into
>each other? One thing I'm really missing is what exactly
>you're scrambling. Different source pools? No that
>can't be it... it's late.
OK. You're sending discs to 4 people: A, B, C and D.
You want to make sure that even if 3 of them leak, you
can ID them.
You hide 4 keys: ABC, ABD, BCD, ACD.
ie: the first key is in A's copy, B's copy and C's copy
but not in D's.
If the cracker gets A's, B's and C's copies and checks
them for differences, they'll detect 3 of the keys:
ABD won't be in C's, BCD won't be in A's and ACD
won't be in B's. But ABC will be the same in all 3
versions. You know that there are 4 codes and where
they are, but the cracker doesn't. If they eliminate
all the differences, key ABC remains and identifies
the 3 culprits.
If only B and C leak, then keys ABC and BCD will
remain.
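
Russ's A, B, C, D example can be checked mechanically. The sketch below is only an illustration with hypothetical function names: it builds one key per group of 3 recipients (nC3 keys, so 4 keys for 4 recipients), then works out which keys a cracker's diff cannot see, namely those present in every leaked copy, and intersects them to name the culprits.

```python
from itertools import combinations
from math import comb

def make_keys(recipients, group_size=3):
    """One hidden key per group of recipients; a key sits in its group's copies."""
    return [frozenset(g) for g in combinations(recipients, group_size)]

def surviving_keys(keys, leaked):
    """Keys present in every leaked copy: comparing the copies shows no difference."""
    leaked = set(leaked)
    return [k for k in keys if leaked <= k]

def culprits(keys, leaked):
    """Members common to all surviving keys identify who leaked."""
    survivors = surviving_keys(keys, leaked)
    return set.intersection(*(set(k) for k in survivors)) if survivors else set()

keys = make_keys("ABCD")                 # ABC, ABD, ACD, BCD
three_leak = culprits(keys, "ABC")       # only key ABC survives the diff
two_leak = surviving_keys(keys, "BC")    # keys ABC and BCD survive
```

If more copies leak than the group size (all four here), no key is common to every copy and nothing survives, which is why the group size has to match the number of leaks you expect.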
>>It could work for levels if you use lots and lots of strong
>>encryption and chain levels together (ie: the key for level
>>n is in the data for level n-1) and alter things after some
>>number of levels. They'd crack the first dozen levels, say,
>>figure it's working, but have missed that the encryption
>>method changes for level 13..
>
>The decryption method is always contained in the binary
>somewhere. So you're just presuming about the
>carefulness/carelessness of the cracker.
Yup.
>There are a million needle-in-a-haystack strategies out
>there, all can be defeated with sufficient patience.
Yup. But who cares? If it takes them 2 months to crack
the game that's 2 months of sales and then the game
is 'old hat'.
>Just a matter of how much you enjoyed writing it, vs. how
>much the cracker enjoys cracking it.
Yup.
>>>No, you don't want it in the raw data. The raw data is easy
>>>to perturb randomly and still get basically the same data.
>>
>>That depends how you place it. Ultra low frequency
>>components across the whole dataset would be much
>>more difficult to remove. Imagine a sample with a 4Hz
>>sine wave mixed in - it would be undetectable by ear
>>and noise wouldn't remove it.
>
>So you run it under an audio filter and chop anything the ear
>can't hear. Then you do some nasty things to what you can
>hear, so that the data is now sufficiently different.
And who would go to all that trouble?
[...]
>>>Somewhere that takes quite a while to figure out how
>>>to transform without breaking everything.
>>
>>But, as you've said, there are crackers willing to spend
>>any amount of time doing the grunt work..
>
>Well, what kind of index file would be so onerous that only
>the most foolhardy cracker would attempt it?
I have no idea. All indices seem fairly simple formats to
me.
---
Russ
>>>>No, you don't want it in the raw data. The raw data is easy
>>>>to perturb randomly and still get basically the same data.
>>>
>>>That depends how you place it. Ultra low frequency
>>>components across the whole dataset would be much
>>>more difficult to remove. Imagine a sample with a 4Hz
>>>sine wave mixed in - it would be undetectable by ear
>>>and noise wouldn't remove it.
>>
>>So you run it under an audio filter and chop
>>anything the ear can't hear. Then you do some nasty
>>things to what you can hear, so that the data is now
>>sufficiently different.
>
>And who would go to all that trouble?
One other point that I missed before: what about
low-frequency components in *images*? There is no
'outside the seeing range' in this case.
And as for "sufficiently different", these methods are
so subtle that to be sure it's code free the data would
need to be replaced by white/pink noise - hardly worth
the effort of cracking if that's what you end up with.
Most crackers would simply rely on anonymity at
both ends and let the leaks fend for themselves (or
distribute the final version from shops).
---
Russ
Brandon Van Every <vanevery@blarg.net> wrote:
>Russ Williams wrote:
>>One other point that I missed before: what about
>>low-frequency components in *images*? There is
>>no 'outside the seeing range' in this case.
>
>Sure there is. Darkest part of the image, you don't
>need it. People scale the luminescence of photos
>all the time.
But this won't get rid of it. Similar watermarking
techniques are capable of detecting a copy of an
image that's been altered (blur/soften/sharpen/etc.),
printed into a newspaper and scanned back into
a computer. Simple bit flipping or scaling just won't
cut it.
---
Russ
Well, let's see if we can draw some conclusions from this long email-exchange
between game programmers. First of all let's recall our 'bearings'. Top game programmers'
protection techniques are interesting because these guys are under very
sophisticated attack from the 'real piracy' industry: i.e. those guys who BURN hundreds
(thousands?) of pirated CD-ROMs with the latest "hit" in order to steal money. Whole industries
are active in this field in Asia and Eastern Europe.
This activity is not only stealing: it is a "commercial" attitude that we not
only DO NOT condone, but that we are ready to counter together with these same protectors if
need be, for instance by delivering our own thoughts on these protection schemes.
On the other hand, these same attacks seem to push these protectors to a higher
degree of resourceful inventiveness. Unfortunately their main problem is identifying leaks when
they send CD-ROMs to the magazines for testing, yet some of the schemes that they propose
could bear interesting fruit in the future for all forms of software protection, CD-ROM
based or NOT CD-ROM based. That's the reason I have decided to publish this.
As you have read, there are quite a lot of sound protection ideas in here, and I can
only praise the idea of sticking the key "in the INDEX STRUCTURE of the data", somewhere
that takes "quite a while to figure out how to transform without breaking everything".
Yes! Yes! This is the way of the future for all kinds of protections,
and it is, IMO, the only strategy that will work. Once more: "The cracker *has* to solve
the structure of your data file to break the protection scheme. No other choice. Has to
understand what's a float value, what's an index, etc. And you could make it take a very
long time for him to do that."
And exactly that will mark the difference between lam-o-crackers and
real reversers: real reversers will enjoy this kind of protection! Of course
we will reverse these schemes, yet it shall (let's hope) take
some time (as it should when solving any good puzzle), and this delay will
serve this kind of protector well, since by the time we have
cracked their games' protections, the commercial products themselves
will already be "oldies".
So what? Who cares? Man, I'm myself at this very moment cracking in the background
a very old DOS game just for the fun of it! Who cares if the target is the last frizzy-dizzy
flop of tomorrow? It's the amount of intelligence that has been put into the protection scheme
that counts :-)
Let's hope that Brandon and Russ will soon implement some of their
smart and inventive projects!
Awaiting your comments!
(c) 2000: [fravia+], all rights
reserved