Here's the ugly script I had used to find C:
http://pastebin.com/RVWKhDJc
Obviously, you need to supply your own keyY, normalkey, and 3DS AES output(info on that inside).
That should work, yes. The 32C3 presentation didn't explain very well
why it works.
The first part is figuring out the basic algorithm. From the DSi keyslot crack, they guessed correctly that the algorithm would have a similar form. The DSi one was like this as big-endian:
K = ((X XOR Y) + C) ROL S
You can write that as:
K = (G(X, Y) + C) ROL S for some function G that's G(X, Y) = X XOR Y on DSi.
The 3DS keyslot formula has the same form. If you guess that the new G function on X and Y is still an XOR on 3DS keyslots, you can just try a few things and derive that G(X, Y) = (X ROL 2) XOR Y. It's easy to test it without knowing C or S. In fact, this was discovered long before the whole formula was, because you don't need the normal-key in order to figure it out.
From this point on, the rest is done with differential cryptanalysis. One bit of Y is flipped, and you try to modify bits of the normal-key until it matches the results of the keyslot.
To determine S, you can just look at the effect of changing the low bit of Y on the normal-key. The lowest bit it changes is the rotation value, because flipping the low bit of an addend always flips the low bit of the sum. This derives that S is 87 (or equivalently -41).
Now that the rotation value is known, you think about what happens when you change arbitrary bits of Y. The first important observation is that when doing addition, a flip of bit B can only change a contiguous sequence of bits starting with bit 87+B mod 128. You can see that from this 16-bit binary wrapping addition:
| 1101111010101101
+ 1100000011011110
------------------
| 1001111110001011
Let's say that we change bit 9 (from the right) of the bottom number to 1:
| 1101111010101101
+ 1100001011011110
------------------
| 1010000110001011
Only bits 9 and above in the sum can change. Also, they must be contiguous. This property comes from how addition carrying works--they have to be contiguous, or a carry chain will stop.
G(X, Y) is added directly to C, and that you can flip bits of G(X, Y) by flipping bits in Y. You don't know what G(X, Y) is, but it can be manipulated.
The second important observation is that when you flip a 0 to a 1, G(X, Y) becomes larger. Because C is a constant, this means that G(X, Y) + C is also larger (*see note). When you flip a 1 to a 0, G(X, Y) and thus G(X, Y) + C becomes smaller. You can use this to determine the bits of G(X, Y).
After encrypting something with keyslot 0x39 and a Y with a bit flipped, you can flip bits in K and encrypt the same data in normal-key mode until the encrypted version matches the one with the keyslot modified Y. This tells you the new value of K. By the carry observation above, you only will have to change a small number of bits of K. This takes at worst ~128*127 attempts per key, but a lot less in practice. (This algorithm can fail in an extraordinarily rare circumstance: two different keys encrypt a data block to the same ciphertext. This is so rare as to not matter in practice.)
The final algorithm becomes something like this, working with overwhelming probability (for a random C; it's known to work fine with the actual C and V=0):
(3DS part)
1. Choose a 128-bit constant V. V=0 is fine, and is what we used. Encrypt V with the normal key K to make E. It doesn't matter which cipher mode you use, so just use ECB since it's the simplest.
2. Encrypt V with keyslot 0x39 and the matching key Y. You should get an identical result, since they're two representations of the same key.
3. For B in 0 to 127 inclusive:
3a. Flip bit B in Y to make Y'. That is, Y' = B XOR (1 << B).
3b. Encrypt V with keyslot 0x39 and Y' to make E'.
3c. Record E' and its B to a file.
(PC part, though a 3DS could do it, too)
4. Copy the file to a PC for easier manipulation.
5. Set Z = 0. Z represents G(X, Y).
6. For each line of the file, reading E' and B:
6a. Flip bit (B+87)%128 in K to make K'.
6b. Encrypt V using normal key K' to make E''.
6c. If E'' does not equal E', flip the next bit in K' and go to 5b. (The bit flips are in the order 87, 88, ... 126, 127, 0, 1, ... 86.) It is an error if all 128 bits have been flipped with no match.
6d. Compare K' ROR 87 and K ROR 87 numerically as 128-bit big-endian integers.
6e. If K' ROR 87 < K ROR 87, set bit B of Z to 1, otherwise 0. (This means that G(X, Y') < G(X, Y), and thus a bit of G(X, Y') went from 1 to 0. Accordingly, this bit of G(X, Y) is 1. Vice versa for 0.)
7. Let W = Z XOR Y. (This undoes the XOR Y part of G(X, Y) = (X ROL 2) XOR Y.)
8. Let X = W ROR 2. (This undoes the ROL 2.)
9. Let C = (K ROR 87) - Z.
Now you have key X and the secret constant C.
* A wart in this algorithm is that if wraparound (overflow) occurs, this less than/greater than check will be wrong. You will flip a 0 bit to a 1, and the overflow will cause the final result to be less, rather than greater. However, in the case of the 3DS constant C, this does not occur, and thus doesn't have to be worried about in the end. (The solution if it were required would be to flip one bit at a time in Z for bits 127, 126, 125... until the result matches.)