Correlation Power Analysis on time-invariant simple password

In one of the previous articles we’ve seen how we can use ChipWhisperer to break weak password checking using timing attacks. In this tutorial we will work on a time-invariant version of the program and we will use Correlation Power Analysis (CPA) to recover the password from the device.

In CPA we assume that power consumption of the device will behave according to some underlying general model (“leak model”) which depends on the secret key. If the model is correct, then we can significantly narrow down the search space for our secret or even recover it completely.

In this case we are interested in this particular part of the code of our program:

for(unsigned i=0; i<sizeof(q); i++) {
c |= q[i] ^ p[i];

The loop goes through all the bytes of the input password, XOR-ing them with the bytes of the reference password and writes partial results to variable c. If is something other than 0 after the loop, it means that passwords differ.

How can we attack this algorithm? We need to know how XOR operation reflects in our power measurements. Notice that during XOR operation on two bytes, the number of bits which are changing is equal to the Hamming weight of the result of this operation, which in turn is equal to Hamming distance between the inputs. It turns out that, on the processor we’re working with, the Hamming distance predicts the power consumption. We expect that power measurement at the time of XOR operation will highly correlate with the Hamming weight of the result. We don’t know where XOR is exactly in the power traces, but this doesn’t matter – if we collect enough traces, the correlation we’re looking for should dominate the noise in the rest of the trace.

Let’s dive into it. Firstly, we generate 1000 power traces for random, 8-byte passwords. Generating the whole password randomly has the advantage that it won’t influence the correlation. We compute correlations for the whole power traces, so it is important that there are no other patterns.

trcs = []
inps = []
for i in tnrange(ntrc, desc='Getting traces'):
x = os.urandom(8)

For convenience we also send the reference password on each iteration. This doesn’t affect the attack as key load happens entirely outside of our power measurements.

Here’s our function performing a guess on each password byte using our leak model:

def guessbyte(trcs,avg,byte):
ntrc = len(trcs)
ltrc = len(trcs[0])
ris = []
for guess in range(256):
hs = []
for i in range(len(trcs)):
c = ord(inps[i][byte])
d = HW[guess^c]
meanh = np.mean(hs, dtype=np.float64)
covht = np.zeros(ltrc)
varh = np.zeros(ltrc)
vart = np.zeros(ltrc)
for i in range(ntrc):
hdiff = hs[i] - meanh
tdiff = trcs[i] - avg
covht += hdiff*tdiff
varh += hdiff2
vart += tdiff2
rs = covht / np.sqrt(varh * vart)
np.nan_to_num(rs, copy=False)
best_correlation = max(-rs)

We go through all bytes of the password. For every possible value of a password byte (a “guess”) we compute the leak model: we calculate XOR of the guess with the input byte that was sent to the device during our 1000-trace sampling. Then we compute the Hamming weight and calculate the correlations between the model and the measurements we collected. Due to noise we may not be able to recover the password directly, so we report top-N highest correlation guesses (N ranks). We repeat the process for all password bytes.

Best guesses for byte 0:   
20 :0.861236
Best guesses for byte 1:
Best guesses for byte 2:

As you can see, the correct bytes appear typically in rank 0 or rank 1. We are left with a reduced set of password byte candidates that we can use to brute-force the full password in a reasonable time. This reduces brute-force time from years (64 bits entropy) to minutes (8 bits entropy):

Keep in mind this is a very simple CPA. We needed very little knowledge of the device. As long as the traces are aligned, we don’t even need to know where our vulnerable code is in the trace. With more precise attack we should be able to recover the password directly without a brute-force pass and in much lower amount of samples. This will be a topic for a future post.

Our full source codes for this post can be found in our repository:

Leave a Reply

Close Menu