This paper describes the PCLMULQDQ (carry-less multiplication) instruction and its use for computing the Galois Hash. It also provides code examples that combine PCLMULQDQ with the Intel® AES New Instructions (Intel® AES-NI) for an efficient implementation of AES in Galois Counter Mode (AES-GCM).
Retrieved from https://software.intel.com/sites/default/files/managed/af/98/carry-less-multiplication-instruction.pdf on 2017 May 09.
Author(s): Shay Gueron; Michael E. Kounavis
Series: White Paper 323640-001
Edition: Revision 2.0
Publisher: Intel Corporation
Year: 2010
Language: English
Pages: 76
Contents
Introduction, p. 4
Preliminaries, p. 4
PCLMULQDQ Instruction Definition, p. 6
The Galois Counter Mode (GCM), p. 8
Efficient Algorithms for Computing GCM, p. 12
Code Examples: Ghash Computation, p. 22
Code Examples: AES128-GCM, p. 28
PCLMULQDQ and GFMUL Test Vectors, p. 70
Performance, p. 72
Summary, p. 74
References, p. 74
Acknowledgements, p. 75
About the Authors, p. 75
Figures
Figure 1. The Galois Counter Mode, p. 9
Figure 2. The OpenSolaris “gfmul” C Function, p. 10
Figure 3. Lookup Table Based Implementation of AES-GCM, p. 11
Figure 4. Code Sample – Reflecting Bits of a 128-bits Quantity, p. 18
Figure 5. Code Sample - Performing Ghash Using Algorithms 1 and 5 (C), p. 23
Figure 6. Code Sample - Performing Ghash Using Algorithms 1 and 5 (Assembly), p. 24
Figure 7. Code Sample - Performing Ghash Using Algorithms 2 and 4 with Reflected Input and Output, p. 25
Figure 8. Code Sample - Performing Ghash Using an Aggregated Reduction Method, p. 26
Figure 9. AES-GCM – Encrypt With Single Block Ghash at a Time, p. 29
Figure 10. AES-GCM – Decrypt With Single Block Ghash at a Time, p. 32
Figure 11. AES-GCM – One Block at a Time with Bit Reflection (to Be Used with the Multiplication Function from Figure 7), p. 36
Figure 12. AES-GCM: Processing Four Blocks in Parallel with Aggregated Every Four Blocks, p. 44
Figure 13. AES128 Key Expansion, p. 49
Figure 14. A Main Function for Testing, p. 50
Figure 15. AES-GCM (Assembly code): Processing Four Blocks in Parallel with Aggregated Every Four Blocks, p. 55
Figure 16. Test Vector 1: Code Output, p. 67
Figure 17. Test Vector 2: Code Output, p. 68
Figure 18. Test Vector 3: Code Output, p. 68
Figure 19. Test Vector 4: Code Output, p. 69
Figure 20. Test Vector 5: Code Output, p. 69
Figure 21. Test Vector 6: Code Output, p. 70
Intel® Carry-Less Multiplication Instruction and its Usage for Computing the GCM Mode
Tables
Table 1. The Performance of AES-128 in GCM Mode (on a processor based on Intel microarchitecture codename Westmere), p. 73