Linear Cryptanalysis Demo

The function f takes an 8-bit input (x) and an 8-bit subkey (k) as input and produces an 8 bit output (y). We can write this as y = f(x, k) (mod 2). What if SDES had been designed in such a way that we could write the function f as a linear combination of x and k modulo 2? That is, what if the function f were designed as y = f(x, k) = Mx + Dk (mod 2) where M and D are constant 8 x 8 matrices. The function f would look like this:

_{}

To do this we would only have to change the S-boxes to linear functions. All permutations and XORs are already linear functions. For example, P4 can be written as something like y = h(x):

_{}

and XOR4 can be written like z = i(x, y):

_{}

So if the S-boxes were linear equations we could easily find a linear function y = f(x, k) (mod 2). The SW switch function is just a permutation so it is also a linear function, which we can write as y = g(x) = Ex where E is a constant 8x8 matrix:

_{}

SDES has two rounds of the function f (one for each subkey K1 and K2) with a switch function SW in the middle. So if P is the plaintext and C is the ciphertext and you ignore the initial and final permutations, then:

C = f(g(f(P, K1)), K2)

=M´g(f(P, K1)) + D´K2

=E´M´f(P, K1) + D´K2

=E´M´(M´P + D ´ K1) + D´K2

=E´M^{2}´P + E´M´D´K1 +
D´K2 (mod 2)

Now define
three new constant 8x8 matrices: R= E´M^{2}, S= E´M´D,
and T=D. Even if you use independent
subkeys for K1 and K2, you have a linear equation:

C = R´P + S´K1 + T´K2 (mod 2)

Furthermore, we can get a linear equation like this for any number of rounds, for example let’s try four rounds where K1, K2, K3, and K4 are all independently chosen subkeys and not generated off of the same encryption key:

C = R´P + S´K1 +
T´K2
+ U´K3
+ V´K4 (mod 2)

Every element in R, S, T, U, and V is either a 0 or a 1. That means that the respective bit in P, K1, K2, K3, or K4 is either present in the equation for the ciphertext bit or not. So you might get something like

C_{0} = P_{3} + P_{4}
+ P_{5} + K1_{0} + K1_{2} + K1_{3} + K1_{5}
+ … + K4_{5} + K4_{7}

This says that bit 0 of the ciphertext is equal to bit 3 of the plaintext XORed with bit 4 of the plaintext XORed with bit 5 of the plaintext XORed with bit 0 of K1, etc. You would have similar equations for bits 1 through 7 of the ciphertext. So for a known ciphertext attack with one plaintext-ciphertext pair you would have 8 equations and 32 unknowns, which doesn’t do much good.

But with 4 plaintext-ciphertext
pairs you would have 32 equations and 32 unkowns. Using Gaussian elimination or Cramer’s rule it is easy to see
that you could solve this system with something on the order of 32
calculations. If you found that two or
more of your 32 equations were not linearly independent then you could just add
another plaintext-ciphertext pair. So a
key-size of 32-bits in this case gives us a cryptanalytic effort of 32, instead
of 2^{32} as we would have hoped, if we have several known
plaintext-ciphertext pairs.

Fortunately, S-DES and DES use S-boxes which are non-linear. But this doesn’t mean that a linear equation for the function f won’t hold for some pairs of inputs and outputs. In a perfect world, a linear equation modulo 2 on any input-output combination would hold exactly 50% of the time. This is because if you just take a bunch of randomly chosen bits and put them in a linear equation modulo 2 you would expect to get a 0 half the time and a 1 half the time.

But sometimes in S-DES and DES we
can find linear functions for the S-boxes that occur with a probability ¹
50%. For example, consider S-box 0 of
S-DES with input X_{0}X_{1}X_{2}X_{3} and
output Y_{0}Y_{1}. The
probability that X_{0}+X_{1}+X_{3}+Y_{0} = 0 is
81.25%. The probability that X_{2}+Y_{1}
= 0 is 18.75%. Let’s rewrite this one
as the probability that X_{2}+Y_{1} = 1 is 100% - 18.75% =
81.25%. So what we can do is come up
with a bunch of equations that that will be satisfied by the S-boxes with a
probability of more than 50%, regardless of what the subkey does to mash up the
input. The bits of the subkey in the
XOR8 will either make our linear approximations hold often or prevent them from
holding often, but in either case they should push the likelihood of our
equations holding away from 50%.

Let’s do step-by-step linear cryptanalysis on a modified version of S-DES which has three rounds instead of two, uses three independent subkeys (K1, K2, and K3), and has no initial or final permutation (since these permutations don’t do much anyway).

Step 1: We need about 16 good linear approximations for each S-box. They should hold for more than 50% of the possible inputs to the S-box. Each equation just takes a few bits from the input and also from the output of the S-box and adds them together modulo 2 (equivalent to XORing them all together) to get an answer of either 0 or 1. We’ll need about eight equations for S0 and eight for S1.

Step 2: We need about 100 plaintext/ciphertext pairs. They don’t have to be chosen plaintext, this is a known plaintext attack.

Step 3: Do step 4 for every possible subkey K3

Step 4: For each plaintext/ciphertext pair that we have

Step 4.1: Find Q, the output of the S-boxes for the second round. In our 3-round modified S-DES example this should be pretty easy. The XOR4 in the second round took the right half of the plaintext as input (since the first round didn’t touch the right half of the plaintext) and produced the right half of the ciphertext as output (since the third round didn’t touch the right half of the ciphertext). So if we XOR these two we should get the other input into the second round’s XOR4. Then the inverse of a P4 is all we have to do and we know what the output of the S-boxes for the second round was.

Step 4.2: Find S, the input to the XOR8 of the second round. We can do this by taking the right half of the ciphertext and using it as input to E/P in the third round. We can just use the subkey K3 that we are trying and do the XOR8, S-boxes, and P4 for the third round. The output of P4 and the left half of the ciphertext are one input and the output of the XOR4. We just XOR these two together to get the other input to the XOR4 in the third round. We can trace this input back and observe that it is the same as the input to the E/P in the second round. So we plug it into E/P for the second round and we have S.

Step 4.3: See if each of our linear approximations hold for S as input and Q as output. We don’t really care what the subkey K2 is because any bit in K2 has a 50% chance of being a 0 and not touching anything. If it were a 1 then it would change one of our S bits but we wouldn’t really care because our linear approximations are still biased away from 50%. If S was chosen well (meaning our guess for K3 was good) then we would expect S and Q to show bias with the linear approximations. If S is not anything like the input that was really used when the ciphertext/plaintext pair was generated then we’re just plugging in random bits for S and Q and we would expect our linear approximations to hold about 50% of the time.

The subkey that we guess that shows the highest deviation away from 50% is the one most likely to be the real K3. If not then the real K3 is definitely second or third on the list.

The important thing to notice is
that once we know K3 we can attack K2 the same way and then K1. The level of effort to break the cipher then
comes from the size of the subkey and not the key size. O(2^{8}) to break K3 + O(2^{8})
to break K2 + O(2^{8}) to break K1 means an overall effort of O(2^{8}). This is a lot easier than a brute force
attack on the 24-bit key which would be O(2^{24}).

Linear cryptanalysis won’t produce such dramatic results on real DES, though, because DES uses 16 rounds and a much better S-box design. The level of effort to do linear cryptanalysis on DES is still dependent on the size of the subkey, but you need a lot of plaintext/ciphertext pairs which makes it pretty much infeasible. The math is a lot harder, too, because you end up trying to find linear equations for 15 rounds of DES instead of just a single round.

This was created as part of the Cryptography Module
of NSF Award No. 0113627: "Increasing
Security Expertise in Aviation-oriented Computing Education: A Modular Approach",
at Embry-Riddle Aeronautical University in Prescott, Arizona.