Defending against side-channel attacks - Part 3

Gilbert Goodwill, Cryptography Research, Inc.

10/3/2012 5:52 AM EDT

Editor's Note: This article was originally presented at ESC Boston 2011.

Part One provided a brief introduction to side-channel analysis, including timing analysis and simple and differential power analysis (SPA and DPA). Part Two discussed a DPA attack against AES using EM emissions from the devices. In Part Three, Cryptography Research discusses a new standard recently proposed by CRI to enable developers and accredited testing laboratories to test devices for potential side-channel vulnerabilities.

4. Standardized side-channel testing methodology

This section discusses a new standard recently proposed by CRI [5] to enable developers and accredited testing laboratories to test devices for potential side-channel vulnerabilities. The goal is to provide evaluators with a standardized methodology for performing side-channel analysis that is sensitive enough to uncover many potential problems.

No standardized testing program can guarantee complete protection against all attacks. Rather, the program is designed to ensure that sufficient care was taken in the design of the device under test (DUT). Sections 4.1-4.3 give a rationale, overview, and description of the testing methodology. Section 4.4 gives some example test results for different devices.

4.1 Rationale for the t-test methodology
Side-channel attacks such as SPA and DPA exploit the presence of information about sensitive algorithmic intermediates within the power traces collected from a device. Any sensitive computational intermediate that influences the power consumption in a statistically significant way could potentially create vulnerabilities.

Our testing approach uses statistical hypothesis testing to detect whether any of a number of sensitive intermediates significantly influences the measurement data. For each sensitive intermediate, the collected traces are partitioned into two sets in which the value of the intermediate is substantially different. The null hypothesis is that the two sets of power traces have identical means and variances; in other words, the sensitive intermediate has no influence on these quantities. The alternate hypothesis is that the means of the two distributions are different. Welch’s t-test, which is used in these tests, determines whether a data set comparable in size to the one an attacker could acquire provides sufficient evidence to reject the null hypothesis.
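As a minimal sketch of the partitioning step, the Python fragment below splits a set of traces into two subsets according to one bit of a per-trace intermediate value. The function name partition_by_bit, the toy traces, and the use of a single bit as the partitioning criterion are illustrative assumptions; the proposed standard defines its own algorithm-specific partitions.

```python
def partition_by_bit(traces, intermediates, bit):
    """Split traces into two subsets by one bit of a per-trace intermediate.

    traces        -- list of traces, each a list of power samples
    intermediates -- one (hypothetical) intermediate value per trace
    bit           -- index of the bit used to partition the traces
    """
    subset_a, subset_b = [], []
    for trace, value in zip(traces, intermediates):
        if (value >> bit) & 1:
            subset_a.append(trace)   # bit is 1
        else:
            subset_b.append(trace)   # bit is 0
    return subset_a, subset_b


# Toy data: four 3-sample traces with a made-up intermediate byte for each.
traces = [[1.0, 2.0, 3.0], [1.1, 2.1, 2.9], [0.9, 2.5, 3.1], [1.0, 2.6, 3.0]]
intermediates = [0x00, 0x02, 0x81, 0x83]
subset_a, subset_b = partition_by_bit(traces, intermediates, bit=0)
```

Here the last two traces land in subset A because bit 0 of their intermediate is set, and the first two land in subset B.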

4.2 Overview of the t-tests
The core statistical technique for checking for statistical differences between the two subsets of power traces is Welch’s t-test, which is an extension of Student’s t-test for unequal sample sizes and unequal variances. A high positive or negative value of the t-statistic T at a point in time indicates a high degree of confidence that the null hypothesis is incorrect. The confidence threshold C is specified by the evaluator and corresponds to a high confidence in rejecting the null hypothesis. For example, C may be chosen so that a t-statistic greater than C or less than -C corresponds to 95 percent, 99 percent, or even 99.99999 percent confidence that the null hypothesis can be rejected.
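The relationship between a threshold C and a confidence level can be sketched as follows. This assumes that the number of traces is large enough for the t-distribution to be well approximated by a standard normal distribution; the function name two_sided_confidence is an illustrative choice, not something defined by the standard.

```python
import math


def two_sided_confidence(c):
    """Probability that a standard normal variable lies within [-c, +c].

    Assumption: with many traces, the t-statistic under the null
    hypothesis is approximately standard normal.
    """
    return math.erf(c / math.sqrt(2.0))


for c in (1.96, 2.576, 4.5):
    print(f"C = {c}: confidence of about {100 * two_sided_confidence(c):.5f} percent")
```

Under this approximation, C = 1.96 and C = 2.576 correspond to the familiar 95 percent and 99 percent levels, and C = 4.5 corresponds to a confidence above 99.999 percent.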

Each trace can contain several thousand power measurements across time. Therefore, even for a fairly high threshold C, chosen to make the likelihood of a false positive at any particular point in time small, there could be a significant likelihood that the t-test statistic exceeds ±C at some point in a long trace. To balance the need for detecting leakages (by keeping C small) against the need to minimize false positives, two independent experiments are required, and a device can be rejected only if the t-test statistic exceeds ±C at the same point in time in both experiments. If a leakage of information occurs at a particular point in the traces, it should appear in both tests, whereas if the t-test statistic exceeds ±C at a particular instant purely by chance, this rare occurrence is unlikely to repeat at the same instant in the other independent experiment.
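The effect of requiring two independent experiments can be illustrated with a rough calculation. The sketch below assumes a normal approximation for the t-statistic and treats the points in a trace as statistically independent, which real traces generally are not, so the numbers are only indicative. The trace length of 5,000 points is a made-up example value.

```python
import math


def p_exceed(c):
    """Per-point chance that |t| > c under the null hypothesis
    (normal approximation)."""
    return math.erfc(c / math.sqrt(2.0))


def p_any_false_positive(c, n_points, experiments=1):
    """Chance that at least one of n_points points exceeds +/-c by chance
    in every one of `experiments` independent experiments.

    Assumption: points are independent of one another.
    """
    p_point = p_exceed(c) ** experiments
    # 1 - (1 - p)^n, computed in a numerically stable way.
    return -math.expm1(n_points * math.log1p(-p_point))


single = p_any_false_positive(4.5, n_points=5000, experiments=1)
double = p_any_false_positive(4.5, n_points=5000, experiments=2)
print(f"one experiment:  {single:.3e}")
print(f"two experiments: {double:.3e}")
```

With these example values, a single experiment has a chance of a few percent of producing a spurious exceedance somewhere in the trace, while requiring the same point to exceed the threshold in both experiments reduces that chance by several orders of magnitude.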

For each algorithm, multiple t-tests must be performed, each targeting a different type of leakage. Each test must be performed twice, with two different data sets.

4.3 Description of the t-tests
Each test is performed as follows:
1. Specification of data to be used: The evaluator specifies which set of traces must be used for the test. The set of traces is divided into two disjoint groups, Group 1 and Group 2. These are the two disjoint data sets used for performing the two independent Welch’s t-tests.

2. Group 1 test:
a. Based on the algorithm being tested, the evaluator specifies a partitioning of the traces in Group 1 into two subsets A and B. This partitioning is algorithm-specific, and is selected to highlight calculations likely to have problems. Let NA and NB be the sizes of subsets A and B, respectively.

b. Compute XA, the average of all the traces in subset A; XB, the average of all the traces in subset B; SA, the sample standard deviation of all the traces in subset A; and SB, the sample standard deviation of all the traces in subset B. Note that, as each trace is a vector of measurements across time, the averages and sample standard deviations of the traces are also vectors over the same points in time.

That is, the averages and sample standard deviations are computed point-wise within the traces for each point in time.

c. Compute the t-statistic trace T (over the same time instants) as

$$T = \frac{X_A - X_B}{\sqrt{\dfrac{S_A^2}{N_A} + \dfrac{S_B^2}{N_B}}}$$

Note that the above calculation is performed point-wise, for each time instant in the traces for XA, XB, SA and SB.

d. Note the time instants in the t-test statistic trace T where the value exceeds the confidence threshold ±C, where C is specified by the evaluator.

3. Group 2 test: The testing steps are the same as those for Group 1, except that the measurement traces and their subsets A and B will be different. Again, the time instants where the t-statistic trace computed over Group 2 exceeds the threshold ±C are noted.

If there is any point in time for which the t-test statistic exceeds ±C for both Group 1 and Group 2, the device fails. Otherwise, the device passes this test. In some cases, the evaluator may specify a particular region in the trace to consider for a particular test. For example, the test may require that only the time instants corresponding to the middle third of the AES calculation be used to determine pass/fail.
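The procedure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the reference implementation of the standard: the function names, the optional region argument, and the synthetic Gaussian traces (with an artificial leak injected at point 2) are all made up for the example.

```python
import math
import random
from statistics import mean, variance


def welch_t_trace(subset_a, subset_b):
    """Point-wise Welch t-statistic between two lists of equal-length traces.

    Assumes each subset has at least two traces and that the variance is
    not zero in both subsets at the same point (no division by zero).
    """
    n_a, n_b = len(subset_a), len(subset_b)
    t_trace = []
    # zip(*subset) walks the traces column by column, one time instant at a time.
    for col_a, col_b in zip(zip(*subset_a), zip(*subset_b)):
        x_a, x_b = mean(col_a), mean(col_b)
        var_a, var_b = variance(col_a), variance(col_b)  # sample variances S^2
        t_trace.append((x_a - x_b) / math.sqrt(var_a / n_a + var_b / n_b))
    return t_trace


def exceed_points(t_trace, c):
    """Indices of the time instants where |T| exceeds the threshold c."""
    return {i for i, t in enumerate(t_trace) if abs(t) > c}


def device_fails(group1, group2, c, region=None):
    """group1 and group2 are (subset_a, subset_b) pairs from disjoint trace sets.

    The device fails only if the same time instant exceeds +/-c in both groups.
    `region` optionally restricts the decision to selected time instants.
    """
    common = exceed_points(welch_t_trace(*group1), c) & \
             exceed_points(welch_t_trace(*group2), c)
    if region is not None:
        common &= set(region)
    return bool(common)


# Synthetic example: 5-point traces of Gaussian noise, with an optional
# artificial offset ("leak") added at point 2 of subset A.
rng = random.Random(1)


def make_group(n_traces, leak):
    subset_a = [[rng.gauss(0, 1) + (leak if i == 2 else 0.0) for i in range(5)]
                for _ in range(n_traces)]
    subset_b = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(n_traces)]
    return subset_a, subset_b


leaky = device_fails(make_group(200, 2.0), make_group(200, 2.0), c=4.5)
clean = device_fails(make_group(200, 0.0), make_group(200, 0.0), c=4.5)
print("leaky device fails:", leaky)
print("clean device fails:", clean)
```

Passing region=range(0, 2) to device_fails in the leaky case would exclude point 2 from the decision, which mirrors the evaluator restricting the test to a particular part of the trace.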

4.4 Example t-test results
This section discusses t-tests performed on two different AES implementations, one of which failed the t-tests while the other passed. The tests were run on a SASEBO FPGA test board developed by AIST for side-channel testing [6]. The SASEBO test platform is shown in Figure 17 below.


Figure 17: SASEBO FPGA test platform

The first set of t-tests was performed on an implementation of AES-128 containing no side-channel countermeasures. The tests were performed on 60,000 traces collected over 50 minutes. Round 4 was selected by the tester, and the threshold for failure was selected to be |T| > 4.5. (The value 4.5 corresponds roughly to 99.999 percent confidence.) The worst-case values for each of the round 4 tests are shown in Figure 18 below. As can be seen in the table, the implementation failed all the tests except for the S-box and Round Output bit tests. Hence, this implementation would be graded as failing the t-tests.


Figure 18: t-test results for AES-128 implementation without countermeasures

The second set of t-tests was performed on an implementation of AES-256 containing a masked S-box countermeasure. The tests were performed on 216,000 traces collected over 3 hours. Round 7 was selected by the tester, and the threshold for failure was selected to be |T| > 4.5. The worst-case values for each of the round 7 tests are shown in Figure 19 below. As can be seen in the table, the implementation passed all the tests performed, and hence would be graded as passing the t-tests.


Figure 19: t-test results for AES-256 implementation with S-box masking countermeasure

4.5 Summary of t-test methodology
The t-tests provide a simple, repeatable methodology for testing devices for side-channel vulnerabilities, with clear pass/fail criteria. In our lab, the worst-case data collection and analysis time was 6 hours, but it can be significantly less. The tests can be run on partially collected data, and remaining data collection and tests can be aborted early upon test failure. Finally, the data collection and tests are scriptable. The evaluation process can be automated once an evaluator has established confidence in the collected data.
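One way to run the tests on partially collected data is to keep running per-point statistics that are updated as each trace arrives, so the t-statistic trace can be recomputed at any time without storing all the traces. The sketch below uses Welford's online algorithm for this; that choice, along with the class and function names, is an assumption made for illustration and is not prescribed by the methodology.

```python
import math


class RunningStats:
    """Per-point running mean and sample variance (Welford's algorithm)."""

    def __init__(self, n_points):
        self.n = 0
        self.mean = [0.0] * n_points
        self.m2 = [0.0] * n_points   # running sum of squared deviations

    def add(self, trace):
        """Fold one new trace into the running statistics."""
        self.n += 1
        for i, x in enumerate(trace):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.n
            self.m2[i] += delta * (x - self.mean[i])

    def variance(self):
        """Sample variance at each point; requires at least two traces."""
        return [m2 / (self.n - 1) for m2 in self.m2]


def welch_t_trace(stats_a, stats_b):
    """Point-wise Welch t-statistic from two RunningStats objects."""
    return [
        (mean_a - mean_b) / math.sqrt(var_a / stats_a.n + var_b / stats_b.n)
        for mean_a, mean_b, var_a, var_b in zip(
            stats_a.mean, stats_b.mean, stats_a.variance(), stats_b.variance())
    ]


# Toy usage: three 2-point traces per subset, added one at a time.
stats_a, stats_b = RunningStats(2), RunningStats(2)
for trace in ([1.0, 2.0], [3.0, 4.0], [5.0, 9.0]):
    stats_a.add(trace)
for trace in ([1.0, 1.0], [2.0, 3.0], [3.0, 2.0]):
    stats_b.add(trace)
t_trace = welch_t_trace(stats_a, stats_b)
```

In a collection script, welch_t_trace could be evaluated periodically for both groups, and collection stopped as soon as the same point exceeds the threshold in both.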

5. References
[1] Paul Kocher, “Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other Systems”, Advances in Cryptology – Crypto ‘96 Proceedings, Lecture Notes in Computer Science Vol. 1109, Neal Koblitz (Ed.), Springer-Verlag, 1996, pp. 104–113.
[2] Joseph Bonneau and Ilya Mironov, “Cache-Collision Timing Attacks against AES”, Cryptographic Hardware and Embedded Systems – CHES 2006, Lecture Notes in Computer Science, Vol. 4249, L. Goubin and M. Matsui (Eds.), Springer-Verlag, 2006, pp. 201–215.
[3] D. Brumley and D. Boneh, “Remote Timing Attacks are Practical”, Proceedings of the 12th USENIX Security Symposium, August 4–8, 2003. (Paper available at http://crypto.stanford.edu/~dabo/papers/ssl-timing.pdf)
[4] Paul Kocher, Joshua Jaffe, Benjamin Jun, “Differential Power Analysis”, Advances in Cryptology – Crypto ‘99 Proceedings, Lecture Notes in Computer Science Vol. 1666, M. Wiener (Ed.), Springer-Verlag, 1999, pp. 388–397. (Whitepaper available at http://www.cryptography.com/resources/whitepapers/DPATechInfo.pdf)
[5] Cryptography Research, Inc., “A Standardized Testing Methodology for Side-Channel Resistance Validation”, Version 0.9 (Draft), July 1, 2011.
[6] Side-channel Attack Standard Evaluation Board (SASEBO), http://www.rcis.aist.go.jp/special/SASEBO/index-en.html
