Using FCEUX to Play this Proof-of-Concept NES Game is a User Input - Expanding on PoC||GTFO 18:4
2025-07-13 - [54] 4:25
Abstract
Presented is a proof-of-concept Nintendo Entertainment System (NES) ROM that behaves differently if played on versions 2.3.0 through 2.6.6 (the most recent version as of this article) of the FCEUX NES emulator compared to other emulators or hardware. In essence, this NES ROM knows if it is being played on FCEUX 2.3.0 through 2.6.6 or not, which means the choice to use FCEUX 2.3.0 through 2.6.6 or not can be used as a user input. Expanding on my work on "Concealing ZIP Files in NES Cartridges" from the International Journal of "Proof of Concept or GTFO" (PoC||GTFO) issue 0x18 article 4, in order for this proof-of-concept to work, I create an NES ROM that is also a valid ZIP file, which when extracted, produces another NES ROM. Thanks to features of DEFLATE compression (RFC 1951) used for ZIP files, I am able to share/reuse the vast majority of data between the NES file that is also a ZIP file and the extracted NES ROM.
The proof-of-concept NES ROM and the Git Repository are linked below:
`jekyll-klondike.nes` NES ROM
`jekyll-klondike-nes` Git Repository [Git]
Introduction
Hello, neighbors. The world changes over time, and because of that, sometimes expected outcomes change. This article is a writeup on an NES Proof-of-Concept project I worked on over the last month. Hopefully this resulting information will be useful or enlightening to others. "Others" likely includes myself in the future (friendly neighbors, please save... archive... so my future self may too become an enlightened neighbor).
PoC||GTFO 0x18
Back in 2018, I wrote an article for the journal PoC||GTFO issue 0x18 titled "Concealing ZIP Files in NES Cartridges". In it, I explained how to create a ROM file for the NES that is also a valid ZIP file. This file had the added feature that it could be dumped from a physical NES cartridge and the dumped file will also be a valid ZIP file.
Proof of Concept or GTFO 0x18 [HTTPS]
One moment... before continuing, I want to fix an error from article 18:4 that has bugged me for about 7 years...
sed 's/0602/0620/'
There, fixed!
User Inputs
In my presentation at the Hackers of Planet Earth (HOPE) Conference in 2018 titled "I Dream of Game Genies and ZIP Files", I go over how unexpected things can in fact be used as user inputs. These unexpected user inputs include things like pressing the RESET button, using a Game Genie cheating device, or taking apart a controller so Left + Right can be pressed at the same time. In this article, I intend to show another user input: simply choosing to use the FCEUX emulator to play the NES game.
"I Dream of Game Genies and ZIP Files" Talk Slides (PDF)
FCEUX 2.3.0+
When I initially submitted "Concealing ZIP Files in NES Cartridges", the example proof-of-concept NES ROM worked on the FCEUX emulator. Since then, there have been quite a few changes made to FCEUX. First (and most useful for me) was that the Linux port of FCEUX finally got a debugger and hex editor, so I no longer need to run the Windows port under WINE. Second was that the way FCEUX handles ZIP files changed... This ended up causing issues opening the example proof-of-concept NES ROM, but from that frustration arises an opportunity!
FCEUX ZIP Handling Change
Before, FCEUX handled NES ROMs much like many other NES emulators, such as Mesen. First, the emulator checks to see if a selected file is an NES ROM, then if it is not, it checks to see if the file is a ZIP file that contains an NES ROM. That was the order of operations I got used to while working on the NES ROM for "Concealing ZIP Files in NES Cartridges".
Since FCEUX version 2.3.0 (continuing through 2.6.6, the current version of FCEUX at the time of writing), the order of checking file types has changed: FCEUX now checks to see if the selected file is a ZIP file first, and only after that checks to see if it is an NES ROM. If FCEUX considers the file to be a ZIP file, FCEUX will give an error if the ZIP file does not contain an NES ROM. What does this mean? Well, if I want to create an NES ROM with a valid ZIP file inside of it, the ZIP file MUST contain an NES ROM, as FCEUX will completely ignore the outer NES ROM data.
FCEUX ZIP Detection
Technically, FCEUX 2.3.0 through 2.6.6 don't check to see if the file is a ZIP file. They actually just check to see if the 4 byte "End of Central Directory" signature of "PK\x05\x06" (50 4B 05 06) exists somewhere in the last 65,535 bytes (64KiB - 1 byte) of the file. It's unlikely that an NES ROM will randomly have these 4 bytes, as any 4 randomly chosen bytes have a 1 in 4,294,967,296 chance of matching the signature, but it is not hard to modify an NES ROM to add these 4 bytes in unused space and prevent the ROM from opening up in FCEUX 2.3.0 through 2.6.6. There are 2 workarounds though, which are:
- Compress the NES ROM inside of a ZIP file, as FCEUX does not check if a compressed NES ROM is a ZIP file, only if the initial selected file is a ZIP file
- Append 65,532 null (0x00) bytes to the NES ROM to guarantee the 4 byte "End of Central Directory" does not occur within the last 65,535 bytes of the file
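The check and the second workaround can be modeled in a few lines of Python. This is only a sketch of the behavior as described in this article, not FCEUX's actual source code; the names `fceux_sees_zip` and `pad_against_zip_detection` are made up for illustration, and the ROM contents are placeholders.

```python
EOCD_SIGNATURE = b"PK\x05\x06"   # ZIP "End of Central Directory" signature
SEARCH_WINDOW = 65_535           # only the tail of the file is searched


def fceux_sees_zip(data: bytes) -> bool:
    """Model of the described check: is the signature in the last 65,535 bytes?"""
    return EOCD_SIGNATURE in data[-SEARCH_WINDOW:]


def pad_against_zip_detection(rom: bytes) -> bytes:
    """Append 65,532 null bytes so that at most 3 bytes of the original ROM
    remain inside the search window (too few to hold the 4 byte signature)."""
    return rom + b"\x00" * 65_532


# Hypothetical ROM with the 4 signature bytes sitting in otherwise unused space.
rom = b"NES\x1a" + bytes(12) + bytes(100) + EOCD_SIGNATURE + bytes(50)
print(fceux_sees_zip(rom))                             # True
print(fceux_sees_zip(pad_against_zip_detection(rom)))  # False
```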
Schizophrenic File
A Schizophrenic File is a file that is interpreted differently depending on the program interpreting it. Two programs could interpret the same file of the same file type differently for multiple reasons, but one might be that the first program searches for the important file data starting from the beginning of the file and moving towards the end, while the second program searches starting from the end of the file and moving towards the beginning. The second program often ends up being quicker to find the important file data, which could save time for the user, while the first program uses the more straightforward approach.
In the case of the proof-of-concept presented here, I intend to create an NES ROM that behaves differently depending on if the emulator attempts to interpret the file as an NES ROM first or a ZIP file that might contain an NES ROM first. In essence, this NES ROM should behave differently on FCEUX 2.3.0 through 2.6.6 than it does on Mesen or many/most other NES emulators.
The following 2 images show the difference between the proof-of-concept NES ROM being run on Mesen 2.1.0 and FCEUX 2.6.6. Note the different text under the large title (I decided to use the Shavian alphabet for this ROM as a design choice, so these glyphs are not in the Latin alphabet).
Proof-of-Concept being run on Mesen 2.1.0

The subtitle text `"Jekyll" Klondike Solitaire` in the Shavian alphabet is shown.
Proof-of-Concept being run on FCEUX 2.6.6

The subtitle text `"Hyde" Klondike Solitaire` in the Shavian alphabet is shown.
NES ROM Files
NES ROM files generally contain 3 parts, the iNES header (alternatively an NES 2.0 header), PRG (Program) ROM, and CHR (Character/Graphics) ROM. The iNES header provides required information for the NES emulator to run the NES ROM properly as though it were an actual cartridge running on actual NES hardware (a little more info is given in the next paragraph). The PRG ROM is 6502 machine code and data used for the NES to parse, interpret, and run. The CHR ROM is graphical data used for sprite and background tile data.
The iNES header is a 16 byte header that lets the NES emulator determine how to handle the NES ROM. For instance, it specifies PRG ROM size, CHR ROM size, mapper type (PCB/hardware/circuit capabilities), intended region (NTSC or PAL), whether or not PRG RAM is used and whether that PRG RAM could stay saved with the use of a battery in the cartridge, along with some info on the kind of scrolling used for the cartridge (called "Mirroring"). The iNES header always starts at file offset 0 and begins with the 4 byte magic string "NES\x1A" (ASCII N, ASCII E, ASCII S, MS-DOS EOF character). As we are using one of the simple mapper types, the NROM-256, we will have the resulting 16 byte iNES header for this proof-of-concept (presented in hexadecimal to remove ambiguity).
4E 45 53 1A 02 01 00 00 00 00 00 00 00 00 00 00
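As a quick sanity check, the 16 header bytes above can be decoded in Python. This is a minimal sketch that only looks at the iNES fields mentioned in this article (PRG size, CHR size, mapper number, battery flag); the variable names are my own.

```python
header = bytes.fromhex("4E 45 53 1A 02 01 00 00 00 00 00 00 00 00 00 00")

assert header[:4] == b"NES\x1a"           # magic string "NES" + MS-DOS EOF
prg_size = header[4] * 16 * 1024          # byte 4: PRG ROM size in 16KiB units
chr_size = header[5] * 8 * 1024           # byte 5: CHR ROM size in 8KiB units
mapper = (header[6] >> 4) | (header[7] & 0xF0)  # low nibble, then high nibble
has_battery = bool(header[6] & 0x02)      # battery-backed PRG RAM flag

rom_size = len(header) + prg_size + chr_size
print(hex(prg_size), hex(chr_size), mapper, has_battery, hex(rom_size))
# 0x8000 0x2000 0 False 0xa010
```

Mapper 0 with 32KiB of PRG ROM is the NROM-256 configuration, and the total of 0xA010 bytes matches the file size used throughout the rest of the article.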
The following is a diagram of how an NROM-256 NES ROM looks, using hexadecimal ranges, where the first number is the beginning offset of the data and the second number is the offset immediately after the end of the data. For instance, [0000:0010] is data that starts at offset 0 (hex 0x0000) and ends on (inclusive) offset 15 (hex 0x000F), where the immediate offset after the data is offset 16 (hex 0x0010). The resulting data is 16 bytes (0x0010 [second offset] minus 0x0000 [first offset]) in length. This is how slice offsets in Python and many other programming languages work, so if you write software, you are likely familiar with this syntax (although the numbers are in hexadecimal instead of decimal in this case).
+NES-ROM--------------[0000:A010]+
|iNES Header          [0000:0010]|
+--------------------------------+
|PRG ROM              [0010:8010]|
|                                |
|                                |
|                                |
+--------------------------------+
|CHR ROM              [8010:A010]|
+--------------------------------+
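Since the bracketed ranges work like Python slices, the diagram translates directly into code. A small sketch, where `split_nrom256` is a hypothetical helper name and the ROM is an all-zero placeholder:

```python
HEADER = slice(0x0000, 0x0010)
PRG = slice(0x0010, 0x8010)
CHR = slice(0x8010, 0xA010)


def split_nrom256(rom: bytes):
    """Split an NROM-256 ROM image into (header, PRG ROM, CHR ROM)."""
    if len(rom) != 0xA010:
        raise ValueError("expected a 0xA010 byte NROM-256 image")
    return rom[HEADER], rom[PRG], rom[CHR]


rom = bytes(0xA010)  # placeholder contents, only the sizes matter here
header, prg, chr_rom = split_nrom256(rom)
print(len(header), len(prg), len(chr_rom))  # 16 32768 8192
```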
This Proof-of-Concept File - Expanding on PoC||GTFO 18:4
In "Concealing ZIP Files in NES Cartridges", I crammed the ZIP file in its entirety into the PRG ROM data of the NES ROM. (I used the NROM-128 NES mapper for that proof-of-concept, which resulted in a 16KiB PRG ROM instead of a 32KiB PRG ROM. The difference isn't important, but I may as well mention it.) To simplify the article, I essentially just made a ZIP file and slapped a valid NES ROM around it, then updated offsets in the ZIP file data so ZIP file extractors wouldn't complain.
Things were simple and worked until FCEUX 2.3.0, but that doesn't work anymore, as explained earlier, so I need to be more clever! The simplest way of handling things would be compressing a highly compressible NES ROM inside of the ZIP file, resulting in an NES ROM with a ZIP file inside it with an NES ROM inside of it. That's too easy though! Let's do better!
To reiterate, for this proof-of-concept to work, I need to make an NES ROM that is also a ZIP file that when extracted, produces a different NES ROM.
ZIP Compression (DEFLATE)
It's possible to create a ZIP file with either compressed or uncompressed files. For this proof-of-concept file, I want to compress an NES ROM inside of the ZIP file that is the same size as the overall NES+ZIP file. In other words, I want to make an NES ROM (in this case an NROM-256 ROM, which is 40KiB + 16B in length, for a total of 40976 bytes) that contains a ZIP file which contains a compressed NES ROM that, when uncompressed, is also 40976 bytes in size.
The compression ZIP files use is called DEFLATE, which is explained in RFC 1951.
RFC 1951 - DEFLATE Compressed Data Format Specification version 1.3 [HTTPS]
It turns out DEFLATE has a nice feature that I can take advantage of... if I want to reuse the vast majority of bytes between the outer NES ROM and the compressed NES ROM inside of the ZIP file data, I can!
The data of a DEFLATE data stream consists of multiple blocks of data. Each block of data occurs sequentially and can be one of three types of blocks (BTYPE), "Non-Compressed" (BTYPE=00), "Compressed with fixed Huffman codes" (BTYPE=01), and "Compressed with dynamic Huffman codes" (BTYPE=10). At the time of writing this, I won't pretend to understand all of the details for BTYPE=10, so in this proof-of-concept, I will only focus on BTYPE=00 and BTYPE=01. Blocks of BTYPE=01 can have an arbitrary length while Non-Compressed blocks of BTYPE=00 can contain a maximum of 65535 bytes of uncompressed data.
Because the way DEFLATE data blocks are packed into bytes can be confusing, I'll go over that quickly.
There are 2 kinds of values for DEFLATE data blocks, Literals and Numbers. Numbers are inserted into the DEFLATE data block from least significant bit to most significant bit. Literals are inserted into the DEFLATE data block from most significant bit to least significant bit.
Once you have the DEFLATE data stream bits, split them into 8 bit chunks to create bytes. As stated in RFC 1951, a byte is a number from 0 through 255, so we write the value from least significant bit to most significant bit. For instance, if we had 16 bits of a DEFLATE data stream (aligned to a byte boundary) of 00011011 10111000, the resulting bytes would end up being 0xD8 (11011000) 0x1D (00011101). Because Numbers and bytes are both written least significant bit first, a multi-byte Number that is aligned to a byte boundary ends up in the resulting bytes as an ordinary Little-Endian value (least significant byte first).
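The packing rule is easy to get backwards, so here is a minimal Python sketch of it. The helper names `stream_bits_to_bytes` and `number_bits` are my own; the first example is the 16 bit stream from the paragraph above, and the second shows a byte-aligned 16 bit Number coming out Little-Endian.

```python
def stream_bits_to_bytes(bits: str) -> bytes:
    """Pack DEFLATE stream bits (written in stream order) into bytes.

    The first stream bit becomes the least significant bit of each byte.
    The last byte is padded with 0 bits if needed.
    """
    bits = bits.replace(" ", "")
    bits += "0" * (-len(bits) % 8)
    return bytes(int(bits[i:i + 8][::-1], 2) for i in range(0, len(bits), 8))


def number_bits(value: int, width: int) -> str:
    """Stream bits of a Number: least significant bit first."""
    return format(value, f"0{width}b")[::-1]


print(stream_bits_to_bytes("00011011 10111000").hex(" "))      # d8 1d
print(stream_bits_to_bytes(number_bits(0x1234, 16)).hex(" "))  # 34 12
```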
DEFLATE data streams contain DEFLATE data blocks that occur immediately after each other, with the final DEFLATE data block being padded with up to seven 0 bits to align it to a byte boundary. DEFLATE data blocks, in relation to each other, are NOT aligned to byte boundaries. To make my life easier though, I crafted particular BTYPE=01 data blocks to make sure everything stayed byte aligned.
BTYPE=00
All "Non-Compressed" DEFLATE data blocks follow the same format:
- 1-bit BFINAL: 0 if NOT final block, 1 if final block (Literal)
- 2-bit BTYPE: 00 (Number)
- Padding of 0 bits to align to a byte boundary (Literal)
- 16-bit length of non-compressed data (Number)
- 16-bit ones' complement of the length (Number)
Immediately after the 16-bit ones' complement of the length, there will be `N` raw, non-compressed, unmodified bytes, where `N` is specified by the 16-bit length.
The following example is a full BTYPE 00 data block that started 3 bits past a byte boundary that is NOT the final block (BFINAL = 0) and contains the uncompressed ASCII data "Yo!":
0 00 00 1100000000000000 0011111111111111 01011001 01101111 00100001
| || || |||||||||||||||| |||||||||||||||| |||||||| |||||||| ||||||||
| || || |||||||||||||||| |||||||||||||||| |||||||| |||||||| ASCII "!"
| || || |||||||||||||||| |||||||||||||||| |||||||| ASCII "o"
| || || |||||||||||||||| |||||||||||||||| ASCII "Y"
| || || |||||||||||||||| Ones' Complement of Length = 3
| || || Length = 3
| || Padding to align to byte boundary
| BTYPE = 00
BFINAL = 0
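For the common case where the block starts exactly on a byte boundary, a "Non-Compressed" block is just a 5 byte header followed by the raw data. The sketch below builds one and feeds it to Python's `zlib` as a raw DEFLATE stream (`wbits=-15`). Unlike the diagram above, it assumes byte alignment and marks the block as final so that `zlib` sees a complete stream; `stored_block` is a hypothetical helper name.

```python
import struct
import zlib


def stored_block(data: bytes, final: bool = False) -> bytes:
    """Byte-aligned BTYPE=00 block: header byte, LEN, ones' complement of LEN, data."""
    if len(data) > 0xFFFF:
        raise ValueError("a Non-Compressed block holds at most 65535 bytes")
    first_byte = 0x01 if final else 0x00  # BFINAL in bit 0, BTYPE=00, zero padding
    return (bytes([first_byte])
            + struct.pack("<HH", len(data), len(data) ^ 0xFFFF)
            + data)


block = stored_block(b"Yo!", final=True)
print(block.hex(" "))                     # 01 03 00 fc ff 59 6f 21
print(zlib.decompress(block, wbits=-15))  # b'Yo!'
```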
BTYPE=01
All "Compressed with fixed Huffman codes" DEFLATE data blocks follow the same format:
- 1-bit BFINAL: 0 if NOT final block, 1 if final block (Literal)
- 2-bit BTYPE: 01 (Number)
- Remainder of data block
- End-of-Block value of 256 (Literal)
After the 2-bit BTYPE, all Literal values will use the following Huffman code mapping:
- 0 through 143: 00110000 through 10111111 (8 bit length)
- 144 through 255: 110010000 through 111111111 (9 bit length)
- 256 through 279: 0000000 through 0010111 (7 bit length)
- 280 through 287: 11000000 through 11000111 (8 bit length)
This means the End-of-Block value of 256 will be the bit sequence 0000000.
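The four ranges above can be written as a small lookup function. This is a sketch of the fixed Huffman code table from RFC 1951 section 3.2.6; the function name `fixed_code` is my own, and the returned string is in stream order (most significant bit of the code first).

```python
def fixed_code(symbol: int) -> str:
    """Fixed Huffman code for a literal/length symbol (0 through 287)."""
    if 0 <= symbol <= 143:
        return format(0b00110000 + symbol, "08b")
    if 144 <= symbol <= 255:
        return format(0b110010000 + (symbol - 144), "09b")
    if 256 <= symbol <= 279:
        return format(symbol - 256, "07b")
    if 280 <= symbol <= 287:
        return format(0b11000000 + (symbol - 280), "08b")
    raise ValueError(f"symbol out of range: {symbol}")


print(fixed_code(0x00))  # 00110000
print(fixed_code(256))   # 0000000
print(fixed_code(285))   # 11000101
```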
There are 3 kinds of content for the data block: a Literal value, a Length value, or a Backwards Distance value. The Length value is ALWAYS followed by the Backwards Distance value. The Length value starts with one of the Literal values 257 through 285, followed by a 0 through 5 bit Number value, depending on the Literal value. The Backwards Distance starts with a 5 bit code from 0 through 29 (per RFC 1951 this is a fixed-length Huffman code, so it is inserted most significant bit first, like a Literal), followed by 0 through 13 extra bits as a Number, depending on the initial 5 bit code. The Backwards Distance value states how many bytes back in the already-decompressed output to go to find the data to be copied, and the Length value states how many bytes to copy from there. With a Backwards Distance of 1, this simply repeats the previous byte Length times.
The following example is a full BTYPE 01 data block that is the final block (BFINAL = 1) that expands to a Literal byte value of 0x00, then repeats that initial 0x00 byte 74 times, then repeats that 0x00 byte another 258 times:
1 10 00110000 0010101 1110 00000 11000101 00000 0000000
| || |||||||| ||||||| |||| ||||| |||||||| ||||| |||||||
| || |||||||| ||||||| |||| ||||| |||||||| ||||| End of Block
| || |||||||| ||||||| |||| ||||| |||||||| Backwards Distance = 1
| || |||||||| ||||||| |||| ||||| Length = 258
| || |||||||| ||||||| |||| Backwards Distance = 1
| || |||||||| ||||||| Length Addition = 7
| || |||||||| Length = 67
| || Literal value of 0x00
| BTYPE = 01
BFINAL = 1
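One way to double-check a hand-written block like this is to pack the bits and let `zlib` inflate them. In the sketch below, BTYPE is a Number, so the value 01 enters the stream least significant bit first, as the bits 1 then 0. The helper `stream_bits_to_bytes` is my own and not part of any library.

```python
import zlib


def stream_bits_to_bytes(bits: str) -> bytes:
    """Pack stream-order bits into bytes (first bit = least significant bit)."""
    bits = bits.replace(" ", "")
    bits += "0" * (-len(bits) % 8)
    return bytes(int(bits[i:i + 8][::-1], 2) for i in range(0, len(bits), 8))


block_bits = (
    "1"         # BFINAL = 1
    "10"        # BTYPE = 01 (Number, least significant bit first)
    "00110000"  # Literal value of 0x00
    "0010101"   # Length code 277, base length 67
    "1110"      # 4 extra bits, Number 7, so length 74
    "00000"     # Backwards Distance = 1
    "11000101"  # Length code 285, length 258
    "00000"     # Backwards Distance = 1
    "0000000"   # End of Block
)
data = stream_bits_to_bytes(block_bits)
out = zlib.decompress(data, wbits=-15)   # raw DEFLATE, no zlib/gzip wrapper
print(len(data), len(out), set(out))     # 6 333 {0}
```

The 333 output bytes are the single Literal 0x00 plus the 74 and 258 repeated copies.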
Synchronizing NES ROM Offsets
To make things easier, I'm going to call the overall proof-of-concept NES+ZIP file "OUTSIDE" and the NES ROM that would be extracted from the ZIP file "INSIDE". I will call the data that is shared between OUTSIDE and INSIDE "SHARED".
Something that is getting in the way of reusing most of the data of OUTSIDE is the fact that a ZIP file starts and ends with some data specific to ZIP files. File data inside of a ZIP file is unable to start before the ZIP file data and unable to end after the ZIP file data. This means in order to synchronize OUTSIDE and INSIDE so SHARED can be used between OUTSIDE and INSIDE, I need to compress at least some data of INSIDE. In this case, some data near the beginning and near the end of INSIDE.
For this proof-of-concept, the start of OUTSIDE will be an iNES header, then some data that is unique to OUTSIDE, then the start of the ZIP file data.
The DEFLATE data stream of INSIDE will start with an uncompressed iNES header, identical to the iNES header of OUTSIDE, then some uncompressed data that is unique to INSIDE, then compressed data, which we will call COMPRESSED1, the length of which will be figured out later in this article. After that will be the vast majority of the uncompressed NES ROM data of SHARED, then some more compressed data, which we will call COMPRESSED2, the length of which will be figured out later in this article, then the end of the ZIP file data, and finally the rest of OUTSIDE (which will just be null bytes in this case).
The length of COMPRESSED2 is the same length as the end of the ZIP file data after the shared NES ROM data between OUTSIDE and INSIDE. The reason for this is because INSIDE will not include the end of the ZIP file data once it has been extracted from the ZIP file. The length of COMPRESSED1 is exactly enough to align the shared NES ROM data between OUTSIDE and INSIDE. In both cases of COMPRESSED1 and COMPRESSED2, it is easy enough to just compress a bunch of null bytes for padding.
To review, the following terms will be used from this point forward in the article:
- OUTSIDE - The NES ROM that is also a ZIP file that contains an inner NES ROM
- INSIDE - The NES ROM that is contained inside of the ZIP file
- SHARED - Shared data between OUTSIDE and INSIDE
- COMPRESSED1 - Compressed DEFLATE data block used for padding to synchronize the byte offset of OUTSIDE and INSIDE to SHARED
- COMPRESSED2 - Compressed DEFLATE data block used for padding the end of INSIDE to match the file size of OUTSIDE
NES+ZIP Polyglot
To expand on the NES file structure diagram earlier, here is the same diagram with an inserted ZIP file structure diagram. Compared to the previous structure diagram, we are including another data construct, this time "length". The length is placed within parentheses and is a hexadecimal value. For instance, (0010) is a length of 16 in decimal, as 16 is 0x10 in hexadecimal. An arbitrary value `n` is included in some lengths as well and is specifically the length of the file name of INSIDE, which in this case is `hyde-klondike.nes`, which is a length of 17 in decimal or 0x11 in hexadecimal.
+NES-ROM-----------------[0000:A010]+
|iNES Header             [0000:0010]|
+-----------------------------------+
|PRG ROM                 [0010:8010]|
| +ZIP-File------------[XXXX:YYYY]+ |
| |Local File Header      (001E+n)| |
| | +---------------------------+ | |
| | |DEFLATE Data               | | |
| | |                           | | |
| | |                           | | |
| | |                           | | |
+-| |                           | |-+
| | |                           | | |
| | +---------------------------+ | |
| +-------------------------------+ |
| |Central Directory      (002E+n)| |
| +-------------------------------+ |
| |End of Central Directory (0016)| |
| +-------------------------------+ |
|CHR ROM                 [8010:A010]|
+-----------------------------------+
With an `n` value of 17 (0x11), we have the following structure diagram:
+NES-ROM-----------------[0000:A010]+
|iNES Header             [0000:0010]|
+-----------------------------------+
|PRG ROM                 [0010:8010]|
| +ZIP-File------------[XXXX:YYYY]+ |
| |Local File Header        (002F)| |
| | +---------------------------+ | |
| | |DEFLATE Data               | | |
| | |                           | | |
| | |                           | | |
| | |                           | | |
+-| |                           | |-+
| | |                           | | |
| | +---------------------------+ | |
| +-------------------------------+ |
| |Central Directory        (003F)| |
| +-------------------------------+ |
| |End of Central Directory (0016)| |
| +-------------------------------+ |
|CHR ROM                 [8010:A010]|
+-----------------------------------+
The following structure diagram is used for the DEFLATE Data of the previous structure diagram:
+DEFLATE-DATA---------[IIII:JJJJ]+
|INSIDE Unique Data      BTYPE=00|
+--------------------------------+
|COMPRESSED1             BTYPE=01|
+--------------------------------+
|SHARED Data             BTYPE=00|
|                                |
|                                |
+--------------------------------+
|COMPRESSED2             BTYPE=01|
+--------------------------------+
For this proof-of-concept, we want to reuse almost every single byte of data of the NES ROM. This means XXXX will be close to 0x0010 and YYYY will be close to 0xA010.
The following subsections explain how I determine values for XXXX, YYYY, IIII, JJJJ, the length of SHARED Data, and the content of COMPRESSED1 and COMPRESSED2. They are presented in the order I figured them out to make the proof-of-concept possible.
XXXX Value
To make my life easier, I decided the unique OUTSIDE PRG ROM data would be 0x200 (512) bytes long, so I set XXXX to be 0x0210.
IIII Value
Because there are 0x2F (47) bytes in the Local File Header before the DEFLATE data, IIII will be 0x023F (0x0210 + 0x2F).
SHARED Data
I didn't want to make this proof-of-concept more complicated than need be with arbitrary lengths, so I decided the SHARED Data would start at offset 0x0810 and end at 0x9E10. Because the SHARED DEFLATE data block is a "Non-Compressed" block, the 5 bytes of the "Non-Compressed" DEFLATE block header (the BTYPE/BFINAL/Padding byte, the 2 length bytes, and the 2 ones' complement length bytes) will need to be set immediately before 0x0810.
COMPRESSED1 Data
This was the hardest part for me to figure out, as I did not want to over-complicate things further down in the proof-of-concept data. As explained in the previous subsection, the DEFLATE data block for the SHARED Data needs to start 5 bytes before 0x0810, so at 0x080B. Due to the INSIDE Unique Data being 0x0210 bytes in length and the content bytes of SHARED Data starting at 0x0810, COMPRESSED1 needs to decompress into 0x600 (1536) bytes of padding.
The INSIDE Unique DEFLATE data block starts at offset 0x023F. It is a "Non-Compressed" DEFLATE data block, so it has a DEFLATE block header that is 5 bytes in length. It also has 0x0210 raw bytes, so the first offset after the INSIDE Unique Data is 0x0454. The end of the COMPRESSED1 DEFLATE data, as mentioned earlier, is at 0x080B. This means the COMPRESSED1 Data needs to be represented in 0x3B7 (951) bytes.
1536 bytes need to be created using 951 bytes. It's decently straightforward to create a short "Compressed with fixed Huffman codes" DEFLATE data block, but in order to align everything, all 951 bytes need to be used up. A solution is to throw in a bunch of empty "Non-Compressed" DEFLATE data blocks after the "Compressed with fixed Huffman codes" DEFLATE data block. Each empty "Non-Compressed" DEFLATE data block is 5 bytes in length, assuming I make the COMPRESSED1 Data "Compressed with fixed Huffman codes" DEFLATE data block aligned to a byte boundary. I also need to make sure the COMPRESSED1 DEFLATE data block is a length where 951 bytes minus the length of the COMPRESSED1 DEFLATE data block is divisible by 5 so I can include an arbitrary number of empty "Non-Compressed" DEFLATE data blocks.
The solution I came up with was creating a 21 byte COMPRESSED1 DEFLATE data block, as follows:
0 10 00110000 00110000 00110000 00110000 00110000 00110000 11000101
| || |||||||| |||||||| |||||||| |||||||| |||||||| |||||||| ||||||||
| || |||||||| |||||||| |||||||| |||||||| |||||||| |||||||| Length = 258
| || |||||||| |||||||| |||||||| |||||||| |||||||| Literal value of 0x00
| || |||||||| |||||||| |||||||| |||||||| Literal value of 0x00
| || |||||||| |||||||| |||||||| Literal value of 0x00
| || |||||||| |||||||| Literal value of 0x00
| || |||||||| Literal value of 0x00
| || Literal value of 0x00
| BTYPE = 01
BFINAL = 0
00000 11000101 00000 11000101 00000 11000101 00000 11000101
||||| |||||||| ||||| |||||||| ||||| |||||||| ||||| ||||||||
||||| |||||||| ||||| |||||||| ||||| |||||||| ||||| Length = 258
||||| |||||||| ||||| |||||||| ||||| |||||||| Backwards Distance = 1
||||| |||||||| ||||| |||||||| ||||| Length = 258
||||| |||||||| ||||| |||||||| Backwards Distance = 1
||||| |||||||| ||||| Length = 258
||||| |||||||| Backwards Distance = 1
||||| Length = 258
Backwards Distance = 1
00000 0001001 0 00000 0001101 00 00000 11000011 11110
||||| ||||||| | ||||| ||||||| || ||||| |||||||| |||||
||||| ||||||| | ||||| ||||||| || ||||| |||||||| |||||
||||| ||||||| | ||||| ||||||| || ||||| |||||||| |||||
||||| ||||||| | ||||| ||||||| || ||||| |||||||| Length Addition = 15
||||| ||||||| | ||||| ||||||| || ||||| Length = 195
||||| ||||||| | ||||| ||||||| || Backwards Distance = 1
||||| ||||||| | ||||| ||||||| Length Addition = 0
||||| ||||||| | ||||| Length = 19
||||| ||||||| | Backwards Distance = 1
||||| ||||||| Length Addition = 0
||||| Length = 11
Backwards Distance = 1
00000 0000000
||||| |||||||
||||| End of Block
Backwards Distance = 1
You might notice that I didn't need 5 of the Literal 0x00 values. This is what we in the industry might consider a "'wrong, but not broken' mistake". The same result can occur if we remove 5 of the Literal 0x00 values and change "Length Addition = 15" to "Length Addition = 20". If that change were to occur, 1 extra empty "Non-Compressed" DEFLATE data block would be used.
Overall for this proof-of-concept AS WRITTEN, the COMPRESSED1 DEFLATE data block is 21 bytes in length, starts at offset 0x0454, and is followed by 186 empty "Non-Compressed" DEFLATE data blocks, finally bringing us to offset 0x080B and the start of the SHARED DEFLATE data block.
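The byte budget and the decompressed length can both be checked mechanically. The sketch below re-types the COMPRESSED1 bits from the diagram, packs them, appends as many empty "Non-Compressed" blocks as fit in the 951 byte budget, and inflates the result with `zlib`. A streaming `decompressobj` is used because none of these blocks is the final block. The helper and variable names are my own.

```python
import zlib


def stream_bits_to_bytes(bits: str) -> bytes:
    """Pack stream-order bits into bytes (first bit = least significant bit)."""
    bits += "0" * (-len(bits) % 8)
    return bytes(int(bits[i:i + 8][::-1], 2) for i in range(0, len(bits), 8))


LEN258, DIST1 = "11000101", "00000"
compressed1 = stream_bits_to_bytes(
    "0" + "10"                      # BFINAL = 0, BTYPE = 01
    + "00110000" * 6                # six Literal 0x00 values
    + (LEN258 + DIST1) * 5          # five copies of Length 258, Distance 1
    + "0001001" + "0" + DIST1       # Length 11 + 0, Distance 1
    + "0001101" + "00" + DIST1      # Length 19 + 0, Distance 1
    + "11000011" + "11110" + DIST1  # Length 195 + 15, Distance 1
    + "0000000"                     # End of Block
)
empty_block = bytes.fromhex("00 00 00 FF FF")  # byte-aligned empty BTYPE=00 block

budget = 0x080B - 0x0454            # bytes available for COMPRESSED1 plus filler
n_empty, leftover = divmod(budget - len(compressed1), len(empty_block))
padding = compressed1 + empty_block * n_empty
out = zlib.decompressobj(wbits=-15).decompress(padding)

print(len(compressed1), budget, n_empty, leftover, hex(len(out)))
# 21 951 186 0 0x600
```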
COMPRESSED2 Data
The COMPRESSED2 DEFLATE data starts at offset 0x9E10, so 0x200 (512) bytes of padding need to be produced. The number of bytes needed to do this is not important anymore, as this is nearly the end of the ZIP file data. Because this is the final DEFLATE data block, padding is added to the end of the DEFLATE data block to align it to a byte boundary.
The COMPRESSED2 DEFLATE data block looks as follows:
1 10 00110000 11000101 00000 11000100 01011 00000 0000000
| || |||||||| |||||||| ||||| |||||||| ||||| ||||| |||||||
| || |||||||| |||||||| ||||| |||||||| ||||| ||||| End of Block
| || |||||||| |||||||| ||||| |||||||| ||||| Backwards Distance = 1
| || |||||||| |||||||| ||||| |||||||| Length Addition = 26
| || |||||||| |||||||| ||||| Length = 227
| || |||||||| |||||||| Backwards Distance = 1
| || |||||||| Length = 258
| || Literal value of 0x00
| BTYPE = 01
BFINAL = 1
0000000
|||||||
Padding to align end of last DEFLATE data block to byte border
The COMPRESSED2 DEFLATE data block is 7 bytes long, so the first offset after the COMPRESSED2 DEFLATE data block is 0x9E17.
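Instead of typing the bits by hand, the same block can be generated from the two run lengths (258 and 253). The sketch below encodes Length values using the base/extra-bit table from RFC 1951 section 3.2.5 and compares the result against the bit string in the diagram. The names `fixed_code`, `length_bits`, and `stream_bits_to_bytes` are my own helpers, not library functions.

```python
import zlib

# Base lengths and extra bit counts for length symbols 257 through 285 (RFC 1951).
LENGTH_BASE = [3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 15, 17, 19, 23, 27, 31,
               35, 43, 51, 59, 67, 83, 99, 115, 131, 163, 195, 227, 258]
LENGTH_EXTRA = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2,
                3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 0]


def fixed_code(symbol: int) -> str:
    """Fixed Huffman code for a literal/length symbol, most significant bit first."""
    if symbol <= 143:
        return format(0b00110000 + symbol, "08b")
    if symbol <= 255:
        return format(0b110010000 + (symbol - 144), "09b")
    if symbol <= 279:
        return format(symbol - 256, "07b")
    return format(0b11000000 + (symbol - 280), "08b")


def length_bits(length: int) -> str:
    """Length code followed by its extra bits (a Number, so LSB first)."""
    if not 3 <= length <= 258:
        raise ValueError("DEFLATE lengths are 3 through 258")
    i = max(k for k, base in enumerate(LENGTH_BASE) if base <= length)
    extra = LENGTH_EXTRA[i]
    extra_bits = format(length - LENGTH_BASE[i], f"0{extra}b")[::-1] if extra else ""
    return fixed_code(257 + i) + extra_bits


def stream_bits_to_bytes(bits: str) -> bytes:
    bits += "0" * (-len(bits) % 8)   # final block: pad to a byte boundary
    return bytes(int(bits[i:i + 8][::-1], 2) for i in range(0, len(bits), 8))


bits = "1" + "10" + fixed_code(0x00)          # BFINAL = 1, BTYPE = 01, Literal 0x00
for run in (258, 253):
    bits += length_bits(run) + "00000"        # Length, then Backwards Distance = 1
bits += fixed_code(256)                       # End of Block

article_bits = "1 10 00110000 11000101 00000 11000100 01011 00000 0000000"
data = stream_bits_to_bytes(bits)
out = zlib.decompress(data, wbits=-15)
print(bits == article_bits.replace(" ", ""), len(data), len(out))  # True 7 512
```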
JJJJ Value
As the COMPRESSED2 DEFLATE data block was the final DEFLATE data block, JJJJ is equal to 0x9E17.
YYYY Value
There are 0x55 (0x3F + 0x16, or decimal 85) bytes after the DEFLATE data to finish off the ZIP file, that being the Central Directory and End of Central Directory data. That means YYYY is equal to 0x9E6C.
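All of the offsets derived in these subsections follow from a handful of constants, so they are easy to recompute. A minimal sketch of the arithmetic (the variable names are mine, the values are the ones chosen above):

```python
NAME_LEN = len("hyde-klondike.nes")    # n = 17 (0x11)
LOCAL_HEADER = 0x1E + NAME_LEN         # 0x2F
CENTRAL_DIR = 0x2E + NAME_LEN          # 0x3F
EOCD = 0x16
STORED_HEADER = 5                      # byte-aligned BTYPE=00 block header

XXXX = 0x0010 + 0x200                  # iNES header + OUTSIDE Unique Data
IIII = XXXX + LOCAL_HEADER
compressed1_start = IIII + STORED_HEADER + 0x0210
shared_start, shared_end = 0x0810, 0x9E10
compressed1_budget = (shared_start - STORED_HEADER) - compressed1_start
JJJJ = shared_end + 7                  # COMPRESSED2 is 7 bytes long
YYYY = JJJJ + CENTRAL_DIR + EOCD

print(hex(XXXX), hex(IIII), hex(compressed1_start),
      compressed1_budget, hex(JJJJ), hex(YYYY))
# 0x210 0x23f 0x454 951 0x9e17 0x9e6c
```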
Final NES+ZIP Polyglot data structure
+NES-ROM-----------------------------(A010)-[0000:A010]+
|iNES Header                         (0010) [0000:0010]|
+------------------------------------------------------+
|PRG ROM                             (8000) [0010:8010]|
| +--------------------------------------------------+ |
| |OUTSIDE Unique Data             (0200) [0010:0210]| |
| +--------------------------------------------------+ |
| +ZIP-File------------------------(9C5C)-[0210:9E6C]+ |
| |Local File Header               (002F) [0210:023F]| |
| | +DEFLATE-Data----------------(9BD8)-[023F:9E17]+ | |
| | |INSIDE Unique Data          (0215) [023F:0454]| | |
| | +----------------------------------------------+ | |
| | |COMPRESSED1 Data            (0015) [0454:0469]| | |
| | +----------------------------------------------+ | |
| | |Empty BTYPE=00 Data Blocks  (03A7) [0469:0810]| | |
| | +----------------------------------------------+ | |
| | |SHARED Data                 (9600) [0810:9E10]| | |
+-| |                                              | |-+
| | |                                              | | |
| | +----------------------------------------------+ | |
| | |COMPRESSED2 Data            (0007) [9E10:9E17]| | |
| | +----------------------------------------------+ | |
| +--------------------------------------------------+ |
| |Central Directory               (003F) [9E17:9E56]| |
| +--------------------------------------------------+ |
| |End of Central Directory        (0016) [9E56:9E6C]| |
| +--------------------------------------------------+ |
|CHR ROM                             (2000) [8010:A010]|
+------------------------------------------------------+
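To see the whole DEFLATE layout line up, the stream can be assembled with placeholder contents and inflated with `zlib`. This is a sketch under stated assumptions: the unique and shared bytes are dummy values (not the real game data), only the DEFLATE stream is built (no ZIP headers), and the helper names are my own.

```python
import struct
import zlib


def stream_bits_to_bytes(bits: str) -> bytes:
    bits += "0" * (-len(bits) % 8)
    return bytes(int(bits[i:i + 8][::-1], 2) for i in range(0, len(bits), 8))


def stored_header(length: int) -> bytes:
    """5 byte header of a byte-aligned, non-final BTYPE=00 block."""
    return b"\x00" + struct.pack("<HH", length, length ^ 0xFFFF)


LEN258, DIST1 = "11000101", "00000"
compressed1 = stream_bits_to_bytes(
    "0" + "10" + "00110000" * 6 + (LEN258 + DIST1) * 5
    + "0001001" + "0" + DIST1 + "0001101" + "00" + DIST1
    + "11000011" + "11110" + DIST1 + "0000000")
compressed2 = stream_bits_to_bytes(
    "1" + "10" + "00110000" + LEN258 + DIST1
    + "11000100" + "01011" + DIST1 + "0000000")

# Placeholder contents: a real build would use the actual ROM data here.
ines_header = bytes.fromhex("4E45531A020100000000000000000000")
inside_unique = ines_header + b"\xAA" * 0x200
shared = bytes(range(256)) * 0x96            # 0x9600 bytes of dummy SHARED data

deflate = (
    stored_header(len(inside_unique)) + inside_unique
    + compressed1
    + stored_header(0) * 186                 # empty filler blocks
    + stored_header(len(shared)) + shared
    + compressed2
)
IIII = 0x023F
shared_offset = IIII + len(deflate) - len(compressed2) - len(shared)
inside = zlib.decompress(deflate, wbits=-15)

print(hex(len(deflate)), hex(len(inside)), hex(shared_offset))
# 0x9bd8 0xa010 0x810
print(inside[0x0810:0x9E10] == shared)       # True
```

The SHARED bytes sit at file offset 0x0810 in OUTSIDE and at the same offset in the inflated INSIDE, which is the synchronization the previous subsections were working towards.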
OUTSIDE/INSIDE Unique Data
It is important that all references to PRG ROM data outside of SHARED are handled by both OUTSIDE's and INSIDE's Unique Data at the same offsets. For this proof-of-concept NES ROM, a subroutine called `ModifyShuffle`, some data called `NameBytes`, and a byte called `NameLen` all live within the 0x8000 to 0x8200 address range (PRG ROM data is mapped to the CPU address range 0x8000 through 0xFFFF). For both the OUTSIDE and INSIDE Unique data, I have `ModifyShuffle` mapped at 0x8000 (NES file offset 0x0010), `NameBytes` mapped at 0x8100 (NES file offset 0x0110), and `NameLen` mapped at 0x8110 (NES file offset 0x0120).
For the INSIDE ROM ("Hyde" Klondike), `ModifyShuffle` modifies a deck shuffle to create a "blocking stack" of cards that makes the shuffle impossible to win. For the OUTSIDE ROM ("Jekyll" Klondike), `ModifyShuffle` does nothing (it returns immediately). In the SHARED data, `ModifyShuffle` is called after every deck shuffle, so in the INSIDE ROM, every game will be unwinnable, and in the OUTSIDE ROM, the shuffles will remain fair. `NameBytes` along with `NameLen` are used to write the correct subtitle on the title screen.
Impossible Klondike Solitaire Shuffles
Most of us who have played Klondike Solitaire know the feeling of only having a few more face-down cards in the tableau, and realizing that the face up card blocking them can't move because the cards required to move it are in that face-down stack. This is called a "blocking stack" and guarantees a loss. For "Hyde" Klondike (INSIDE ROM), I generate a 4-card blocking stack in the following way:
- Pick a random value from 2 through Queen and suit for the card closest to the player of the stack
- Place the card immediately 1 value below but the same suit as the randomly chosen card under the chosen card (so further down the pack, away from the player)
- Place the 2 cards 1 value above but opposite color of the randomly chosen card under the chosen card (so further down the pack, away from the player)
I choose a random column from 4 through 7 to stick the 4-card blocking stack in. For instance, from closest to furthest from the player, I could have the 2 of Spades, Ace of Spades, 3 of Hearts, and 3 of Diamonds somewhere in column 7. The 2 cannot move because either the Ace of Spades needs to be free and on the foundation for the 2 of Spades to sit on, or either of the red 3s needs to be free in the tableau. Because the red 3s and the Ace of Spades are all under the 2 of Spades, the 2 of Spades has no opportunity to move, so the game is impossible to win. It is possible to be locked out of winning a game earlier than dealing with the 4-card blocking stack, but in the longest-case scenario, the blocking stack will prevent a win.
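The selection rules above can be sketched in Python. This is only an illustration of the rules as described, not a port of the 6502 `ModifyShuffle` routine; the card representation, the `blocking_stack` name, and the use of Python's `random` module are all assumptions made for the example.

```python
import random

SUIT_COLORS = {"Spades": "black", "Clubs": "black",
               "Hearts": "red", "Diamonds": "red"}
QUEEN = 12  # Ace = 1, Jack = 11, Queen = 12, King = 13


def blocking_stack(rng: random.Random):
    """Return 4 (value, suit) cards, ordered from closest to the player downwards."""
    value = rng.randint(2, QUEEN)                 # 2 through Queen
    suit = rng.choice(list(SUIT_COLORS))
    opposite = [s for s, color in SUIT_COLORS.items()
                if color != SUIT_COLORS[suit]]
    rng.shuffle(opposite)
    return [
        (value, suit),              # the blocking card
        (value - 1, suit),          # needed on the foundation first
        (value + 1, opposite[0]),   # the only two cards the blocker
        (value + 1, opposite[1]),   # could be moved onto in the tableau
    ]


rng = random.Random()               # pass a seed for repeatable shuffles
stack = blocking_stack(rng)
column = rng.randint(4, 7)          # tableau column that receives the stack
print(column, stack)
```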
Matching CRC32 checksums
As an Easter Egg of sorts, I decided this proof-of-concept NES ROM should have the same CRC32 checksums for both the OUTSIDE and INSIDE ROMs.
FCEUX and Mesen both include CRC32 checksums for NES ROMs that are played. The CRC32 checksums usually end up being of the entire NES ROM including the iNES header, the PRG ROM data by itself, the CHR ROM data by itself, or the PRG and CHR ROM concatenated. I decided that I wanted to match all of these CRC32 checksums between the OUTSIDE ROM and the INSIDE ROM.
- iNES Header [0000:0010]
- OUTSIDE Unique PRG Data [0010:0210]
- OUTSIDE Unique PRG Unmodifiable Data [0210:0810]
- SHARED Data [0810:9E10]
- OUTSIDE Unique CHR Unmodifiable Data [9E10:9E6C]
- OUTSIDE Unique CHR Data [9E63:A010]
Both the OUTSIDE and INSIDE ROM share the same iNES header. There are then 0x800 bytes that are potentially different between the OUTSIDE and INSIDE PRG ROM. If the 0x800 bytes of OUTSIDE Unique PRG data can be modified to have the same CRC32 value as the first 0x800 bytes of the INSIDE PRG data, then both ROMs will be at the same CRC32 checksum at byte 0x810 and will continue having the same CRC32 checksum through the end of the SHARED data. That would mean the iNES Header and the entire PRG ROM of both OUTSIDE and INSIDE will have the same CRC32 checksums.
Most of the CHR ROM data for both OUTSIDE and INSIDE is in the SHARED data. At the end of the SHARED data, both ROMs will still be at the same CRC32 checksum. The final 0x200 bytes of the OUTSIDE and INSIDE CHR ROM are potentially different, and if the OUTSIDE Unique CHR data can be modified to have the same CRC32 value as the last 0x200 bytes of the INSIDE CHR data, then both ROMs will finish their file with the same CRC32 checksum. That would mean both the OUTSIDE and INSIDE ROMs have the same iNES header CRC32 checksum, the same PRG ROM CRC32 checksum, and the same CHR ROM CRC32 checksum.
Due to how CRC32 works, equal-length chunks of bytes with the same CRC32 checksum are interchangeable. This means any combination of the iNES header, PRG ROM, and CHR ROM data categories will have the same CRC32 checksum between the OUTSIDE and INSIDE ROMs. Also due to how CRC32 works, every 32-bit CRC32 checksum is reachable by modifying only 4 consecutive bytes (32 bits) somewhere in the data.
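Both properties can be checked with Python's `zlib.crc32`. The sketch below is not the brute-force approach used for this project. It swaps in a different technique: because CRC32 is affine over GF(2), the effect of each of the 32 patch bits can be measured once and the right combination solved for directly. The names `solve_gf2` and `force_crc32` are hypothetical.

```python
import zlib


def solve_gf2(columns, target):
    """Return a bitmask of columns whose XOR equals target, or None if impossible."""
    basis = {}  # leading bit -> (value, mask of original columns XORed into value)
    for index, column in enumerate(columns):
        value, combo = column, 1 << index
        while value:
            lead = value.bit_length() - 1
            if lead not in basis:
                basis[lead] = (value, combo)
                break
            basis_value, basis_combo = basis[lead]
            value ^= basis_value
            combo ^= basis_combo
    value, combo = target, 0
    while value:
        lead = value.bit_length() - 1
        if lead not in basis:
            return None
        basis_value, basis_combo = basis[lead]
        value ^= basis_value
        combo ^= basis_combo
    return combo


def force_crc32(data, offset, target):
    """Rewrite the 4 bytes at data[offset:offset+4] so that crc32(result) == target."""
    base = bytearray(data)
    base[offset:offset + 4] = bytes(4)
    zero_crc = zlib.crc32(base)
    columns = []
    for bit in range(32):
        trial = bytearray(base)
        trial[offset + bit // 8] ^= 1 << (bit % 8)
        columns.append(zlib.crc32(trial) ^ zero_crc)  # effect of flipping this one bit
    mask = solve_gf2(columns, target ^ zero_crc)
    if mask is None:
        raise ValueError("target not reachable from this patch location")
    for bit in range(32):
        if (mask >> bit) & 1:
            base[offset + bit // 8] ^= 1 << (bit % 8)
    return bytes(base)


if __name__ == "__main__":
    chunk_a = bytes(range(32))
    chunk_b = force_crc32(b"\xAA" * 32, 28, zlib.crc32(chunk_a))
    before, after = b"header bytes", b"shared data that follows"
    # Different chunks, same length, same CRC32: swapping them leaves the total unchanged.
    print(chunk_a != chunk_b)
    print(zlib.crc32(before + chunk_a + after) == zlib.crc32(before + chunk_b + after))
```

The two chunks here are arbitrary stand-ins for the unique regions of the two ROMs; only their equal length and equal checksum matter.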
For matching the PRG ROMs, I brute-forced the match by setting every possible 4-byte value at bytes 0x20C through 0x20F [020C:0210] of the OUTSIDE ROM and checking whether the CRC32 checksums of offsets 0x10 through 0x80F [0010:0810] of the OUTSIDE ROM and INSIDE ROM match.
For matching the CHR ROMs, I brute-forced the match by setting every possible 4-byte value at bytes 0xA00C through 0xA00F [A00C:A010] of the OUTSIDE ROM and checking whether the CRC32 checksums of offsets 0x8010 through 0xA00F [8010:A010] of the OUTSIDE ROM and INSIDE ROM match.
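A minimal sketch of that brute-force search, assuming Python's `zlib.crc32` and its running-checksum argument. The function name `brute_force_patch` and its parameters are hypothetical, and the byte order of the counter is arbitrary since every value gets tried.

```python
import zlib


def brute_force_patch(outside, start, end, patch_at, target_crc,
                      candidates=range(2 ** 32)):
    """Find 4 bytes for outside[patch_at:patch_at+4] so that
    crc32(outside[start:end]) equals target_crc. Returns None if nothing matches."""
    # The bytes before the patch never change, so checksum them only once.
    prefix_crc = zlib.crc32(outside[start:patch_at])
    suffix = outside[patch_at + 4:end]
    for value in candidates:
        patch = value.to_bytes(4, "big")
        # Continue the running checksum through the patch, then through the suffix.
        if zlib.crc32(suffix, zlib.crc32(patch, prefix_crc)) == target_crc:
            return patch
    return None


if __name__ == "__main__":
    # Synthetic stand-ins for the two ROMs, sized to cover [0010:0810].
    inside = bytes((i * 7) % 256 for i in range(0x810))
    outside = bytes((i * 13) % 256 for i in range(0x810))
    target = zlib.crc32(inside[0x10:0x810])
    # Only a small slice of the search space is tried here to keep the demo fast.
    print(brute_force_patch(outside, 0x10, 0x810, 0x20C, target,
                            candidates=range(100_000)))
```

For the PRG match described above, the call would use `start=0x10`, `end=0x810`, and `patch_at=0x20C`; for the CHR match, `start=0x8010`, `end=0xA010`, and `patch_at=0xA00C`. Walking the full 2^32 range in pure Python would be slow, so a compiled language is likely a better fit for the real search.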
- iNES Header CRC32 checksums match
- OUTSIDE [020C:0210] modified so [0010:0810] CRC32 checksums match
- [0810:8010] CRC32 checksums match
- Because [0010:0810] and [0810:8010] CRC32 checksums match, PRG ROM CRC32 checksums match
- [8010:9E10] CRC32 checksums match
- OUTSIDE [A00C:A010] modified so [9E10:A010] CRC32 checksums match
- Because [8010:9E10] and [9E10:A010] CRC32 checksums match, CHR ROM CRC32 checksums match
- Because iNES Header, PRG ROM, and CHR ROM CRC32 checksums match, any category combination CRC32 checksum will match
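The checklist above can be verified mechanically. The sketch below assumes a plain iNES layout (16-byte header, no trainer, PRG size in 16 KiB units at header byte 4, CHR size in 8 KiB units at header byte 5); the function names are hypothetical.

```python
import zlib
from itertools import combinations


def split_ines(rom):
    """Split an iNES file into header, PRG ROM, and CHR ROM (assumes no trainer)."""
    prg_size = rom[4] * 0x4000  # header byte 4: PRG ROM size in 16 KiB units
    chr_size = rom[5] * 0x2000  # header byte 5: CHR ROM size in 8 KiB units
    prg_end = 0x10 + prg_size
    return {
        "header": rom[:0x10],
        "PRG": rom[0x10:prg_end],
        "CHR": rom[prg_end:prg_end + chr_size],
    }


def category_crcs(rom):
    """CRC32 of every combination of header, PRG, and CHR, concatenated in file order."""
    parts = split_ines(rom)
    result = {}
    for count in (1, 2, 3):
        for names in combinations(parts, count):
            data = b"".join(parts[name] for name in names)
            result["+".join(names)] = zlib.crc32(data)
    return result


def crcs_match(outside, inside):
    """True if every category combination has the same CRC32 in both ROMs."""
    return category_crcs(outside) == category_crcs(inside)
```

Calling `crcs_match` with the bytes of the OUTSIDE and INSIDE ROM files compares all seven combinations at once, including the PRG and CHR concatenation and the whole file.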
Defeating the Proof of Concept
Just because the default version of this proof-of-concept NES ROM is "Jekyll" Klondike on hardware and most emulators does not mean that other emulators cannot play the "Hyde" Klondike version. Likewise, just because the default version of this proof-of-concept NES ROM is "Hyde" Klondike on FCEUX 2.3.0 through 2.6.6 does not mean FCEUX 2.3.0 through 2.6.6 cannot play the "Jekyll" Klondike version.
Play "Jekyll" Klondike on FCEUX 2.3.0 through 2.6.6
There are 2 simple ways to make the "Jekyll" Klondike NES ROM play on FCEUX 2.3.0 through 2.6.6. The first is to append a sufficiently large amount of unimportant data to the end of the NES ROM file. The second (and probably easier) way is to compress this proof-of-concept NES ROM inside of a ZIP file. FCEUX only checks if the file is a ZIP file a single time and then plays the `.nes` ROM file inside of it. Because this proof-of-concept NES ROM is in fact an NES ROM, even if it's also a ZIP file, FCEUX will detect the proof-of-concept NES ROM as an NES ROM and play it.
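The second approach needs nothing more than an ordinary ZIP tool. As a sketch using Python's standard `zipfile` module (the function name `wrap_in_zip` and the output filename are hypothetical):

```python
import os
import zipfile


def wrap_in_zip(rom_path, zip_path):
    """Store the ROM file, unmodified, as the only entry of a new ZIP archive."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # arcname drops any directory components so the entry is just the file name.
        archive.write(rom_path, arcname=os.path.basename(rom_path))


if __name__ == "__main__":
    wrap_in_zip("jekyll-klondike.nes", "jekyll-klondike-wrapped.zip")
```

Opening the resulting archive in FCEUX should then load the wrapped file as a plain NES ROM, per the single ZIP check described above.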
Play "Hyde" Klondike on other emulators
This proof-of-concept NES ROM is just an NES ROM that is also a ZIP file that contains an NES ROM. FCEUX 2.3.0 through 2.6.6 will play that ZIPped up NES ROM by default, but it is still easy to play the "Hyde" Klondike NES ROM on other emulators. Because the NES ROM is also a ZIP file that contains the INSIDE NES ROM, the INSIDE NES ROM can be extracted by unzipping the proof-of-concept NES ROM.
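Any unzip tool should do for this. As one sketch, Python's standard `zipfile` module locates the archive from the end of the file, so it generally tolerates non-ZIP data in front of the ZIP structures. Whether it accepts this particular polyglot's layout is an assumption, and `extract_nes_roms` is a hypothetical name.

```python
import io
import zipfile


def extract_nes_roms(source):
    """Return {entry name: bytes} for every .nes entry in a ZIP or NES+ZIP polyglot.

    source may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        return {
            name: archive.read(name)
            for name in archive.namelist()
            if name.lower().endswith(".nes")
        }


if __name__ == "__main__":
    for name, data in extract_nes_roms("jekyll-klondike.nes").items():
        print(name, len(data), "bytes")
```

The extracted bytes can be written to a new `.nes` file and opened in any emulator.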
Future Work
It is certainly possible to include more than 1 file in the ZIP file data part of an NES+ZIP polyglot file, which is left as an exercise to y'all neighbors. There is also potential to make considerably more complicated unique data for the OUTSIDE and INSIDE ROMs or to change their offsets or the number of offsets. All in all, it is certainly possible to make a MUCH more complicated NES ROM and/or ZIP file data.
Conclusion
Thank you, neighbors, for exploring this idea of allowing for the creation of a user input with me. The user input of "FCEUX" may end up being patched in the future (I would personally recommend patching this out in FCEUX version 2.6.7 or later to prevent errors if the 4 bytes "PK\x05\x06" (50 4B 05 06) exist near enough to the end of the selected NES ROM file), but for now, we have the playground of an unexpected and experimental user input.
Contact/Reply
If you would like to reply to this post, feel free to send me an email.
Email: vi@vigrey.com [Email]
PGP Public Key [515F AD67 F931 0A2B 9B93 CE19 814F ECB1 A398 63CE]