It seems that questions about whether (and if so, to what extent) GameMaker: Studio games can be decompiled are asked at a steady pace, and yet there are still no resources that answer them. So I've decided to make a small post on the matter.
If you are not familiar with what this is all about: older versions of GameMaker produced executables that could be reversed into an editable file with the help of a program (a "decompiler"). This unpleasant turn of events was made possible because:
- Game data was more or less just appended to the end of a "runtime" executable
- Source code was inserted into data as text, with code structure and comments intact.
So a program would "read" the appropriate sections of the executable, extract the game data, and repack it into an editable file, permitting various activities, most of which would violate the EULA.
It's worth mentioning that a lot of Lua-based game engines suffer from the same kind of problem. It is generally solved by obfuscating the source code and altering the executable structure (in GM's case, with a so-called "anti-decompiler") so that the "decompiler" cannot extract the obscured game data as easily.
So it's valid to wonder how much of this still applies to GameMaker: Studio...
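To make the idea concrete, here is a toy sketch of the "data appended to the runtime" layout and why it is trivial to read back. The `GDAT` marker, the 8-byte footer, and the `pack`/`extract` helpers are all made up for illustration; this is not the actual layout used by any GameMaker version.

```python
import struct

MAGIC = b"GDAT"  # hypothetical marker, NOT GameMaker's real format


def pack(runtime: bytes, game_data: bytes) -> bytes:
    """Append game data to a runtime stub, followed by a small footer."""
    footer = MAGIC + struct.pack("<I", len(game_data))
    return runtime + game_data + footer


def extract(executable: bytes) -> bytes:
    """Recover the appended game data by reading the footer."""
    footer = executable[-8:]
    if footer[:4] != MAGIC:
        raise ValueError("no appended game data found")
    (size,) = struct.unpack("<I", footer[4:])
    end = len(executable) - 8
    return executable[end - size:end]


runtime_stub = b"MZ" + b"\x00" * 32        # stand-in for the runner binary
source = b"// move right\nx += 4;"         # code stored as plain text
exe = pack(runtime_stub, source)
recovered = extract(exe)
print(recovered.decode())                  # comments and all
```

Since the payload is stored as-is, anyone who knows (or guesses) the layout gets the original text back, comments included.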
How it was
I guess the first thing you could ask is
Why'd someone store unaltered source code in a compiled game at all?
And it's a good question, but I can't know the answer for sure.
Maybe it was because GameMaker started as an educational program, and such things were not given much thought at the time.
It could be due to assuming that people wouldn't do such a thing.
It could have been because the runtime executed interpreted code anyway, which left less reason to introduce another intermediate representation of the program.
Or it might have been because the program was written in Object Pascal/Delphi and it's actually a bit of a surprise that things worked as well as they did.
Either way, the exact reasoning will remain a mystery.
How it is now
With the runtime being rewritten at some early point of GameMaker: Studio's existence, an important change was made - now, instead of being stored as text, all game code is compiled into bytecode at the compilation stage. This makes for smaller files, better performance, and far fewer stored traces of what the program structure originally was.
The source code interpreter no longer being included in the runtime also meant that code could no longer be evaluated dynamically, which is something that people somehow still whine about, even though it was a security hole the size of a small city for any game using it to run code from external files.
Sometime later, YYC (YoYoCompiler) modules were introduced. YYC makes extensive use of LLVM to transform GML code into (slightly terrifying) C++ source files that interface directly with the runtime. Needless to say, this can help a lot with performance of the game logic... and also has pleasant implications in terms of security. More about this later.
So, let's look further into what risks each of the three output types (non-YYC, YYC, HTML5) carries.
Non-YYC
Can it be decompiled?: Maybe, at some point in the future.
It's no news that with sufficient resources bytecode can be transformed back into more or less readable code, and software that does this exists for platforms that have used bytecode to store program data for many years (for example, Lua or C#'s CIL).
In the best case, such a decompiler would be able to recover rough code (without comments, without the original spacing, and with some of the structures not quite where they were) with some of the variable/script/function names intact.
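As a rough analogy (using Python's own bytecode, not GameMaker's, so treat it purely as an illustration), the snippet below compiles a small piece of source and checks what survives in the serialized result. The source text and the `<game>` file name are made up for the example.

```python
import marshal

source = '''
# secret design note: boss is weak to fire
player_speed = 4
player_x = 10 + player_speed
'''

# Compile to a code object and serialize it, roughly what a ".pyc" stores.
code = compile(source, "<game>", "exec")
blob = marshal.dumps(code)

comment_survives = b"secret design note" in blob
name_survives = b"player_speed" in blob

print("comment kept:", comment_survives)
print("variable name kept:", name_survives)
```

The comment should be gone while the variable name is still there, which is roughly the "rough code with some names intact" situation described above.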
To counter this, the developer can obfuscate most names (excluding ones referencing built-in functions) and/or obfuscate the bytecode structure, so that additional manual work is needed to recompile the game, should someone succeed at decompiling it.
While the second option requires purpose-built tools (though such tools do exist for languages that have dealt with decompilation for a while), the first one is pretty simple. In fact, it is so simple that you could even do it manually, should you feel inclined.
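Name obfuscation of the kind described above can be sketched in a few lines. This is a minimal illustration rather than a real tool: the `RESERVED` set is a tiny hand-picked assumption (a real list would have to cover every keyword and built-in), the `_a0`-style naming is arbitrary, and block comments are not handled.

```python
import re

# Assumed, incomplete set of keywords/built-ins that must keep their names.
RESERVED = {"var", "if", "else", "for", "while", "return", "with",
            "x", "y", "keyboard_check", "vk_right", "instance_destroy"}

TOKEN = re.compile(
    r'"[^"]*"'        # double-quoted string
    r"|'[^']*'"       # single-quoted string
    r"|//[^\n]*"      # line comment
    r"|[\d$][\w.]*"   # numeric literal (decimal or $hex)
    r"|[A-Za-z_]\w*"  # identifier
)


def obfuscate(source: str) -> str:
    """Rename user-defined identifiers to opaque names and drop comments."""
    mapping = {}

    def rename(match):
        token = match.group(0)
        if token[0] in "\"'":
            return token              # leave string literals untouched
        if token.startswith("//"):
            return ""                 # comments are removed entirely
        if token[0].isdigit() or token[0] == "$":
            return token              # numbers stay as they are
        if token in RESERVED:
            return token              # built-ins must keep their names
        if token not in mapping:
            mapping[token] = "_a%d" % len(mapping)
        return mapping[token]

    return TOKEN.sub(rename, source)


sample = '''// player movement
var move_speed = 4;
if (keyboard_check(vk_right)) x += move_speed;
hit_points -= 1; // "ouch"
'''
print(obfuscate(sample))
```

The same mapping is reused for every occurrence of a name, so the code keeps working as long as all sources are processed together and nothing refers to a name through a string.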
To summarize, while decompilation of non-YYC games is possible in theory, it is safe to say that new protective tools will emerge and evolve, should it become a reality.
YYC
Can it be decompiled?: Unlikely.
And when I say "unlikely", I mean it - GML is converted into C++ code, and that is compiled. With optimizations. And even if you can reverse the executable into C code (which is already a heck of a task due to C++ structures leaving next to no traces of what they were before compilation), you still have to somehow recover the actual GML from that mess.
Beyond that, YYC builds include only minimal information to aid with displaying error messages, and that can be removed easily.
HTML5
Can it be decompiled?: Sort of.
On one hand, JS code generated from GML is the closest-looking thing to the actual GML.
But then things like with-loops turn into horrible monstrosities.
And everything is renamed to random garbage in a race to bring the size down.
In theory, with sufficient resources (probably lots of pattern recognition), it would be possible to recover some code and resources out of a GMS HTML5 game, but with all variable names and a fair amount of structure lost - so for what purpose, exactly?
Can graphics/audio be extracted?: Yep.
And what did you expect - it is now the year 2015. Even if no one had made a little script to extract texture sheets and sound effects out of data.win files, tools for extracting graphics (textures/models) right out of a DirectX context have been around for more than a decade now.
Various measures can be taken against this, but they are generally not worth the hassle.
While decompilation is theoretically possible for non-YYC and HTML5 games made with GameMaker: Studio, currently only graphics and audio assets can be extracted.
Should you proceed with searching for such things, I strongly encourage you to virus-scan any findings and run them in a virtual machine (e.g. VirtualBox) or a secure sandboxed environment (e.g. Sandboxie) - keep in mind that this is the internet and people can go to some lengths to misrepresent information if it benefits their interests.
If I've missed something, do tell.