Introduction to cryptology

2WF80 – Introduction to cryptology - Winter 2022


Tanja Lange
Coding Theory and Cryptology
Eindhoven Institute for the Protection of Systems and Information
Department of Mathematics and Computer Science
Room MF 6.104B
Technische Universiteit Eindhoven
P.O. Box 513
5600 MB Eindhoven
Netherlands

Phone: +31 (0) 40 247 4764

The easiest ways to reach me wherever I am:
e-mail: tanja@hyperelliptic.org

This page belongs to course 2WF80 - Introduction to cryptology. This course is offered at TU/e as part of the bachelor's elective package Security.

Contents
Classical systems (Caesar cipher, Vigenère, Playfair, rotor machines), shift register sequences, DES, RC4, RSA, Diffie-Hellman key exchange, cryptanalysis by using statistics, factorization, attacks on WEP (aircrack).

Some words up front: Crypto is an exciting area of research. Learning crypto makes you more aware of the limitations of security and privacy, which might make you feel less secure, but that's just a more accurate impression of reality and it is a good step towards improving your security.
Here is a nice link collection of software to help you stay secure https://prism-break.org/en/ and private https://www.privacytools.io/.

Announcements

If you study mathematics, you should have participated in "2WF50 - Algebra" and "2WF70 - Algorithmic algebra and number theory".
If you study computer science or any other program you should have participated in "2DBI00 - Linear Algebra and Applications", "2IT50 or 2IT80 - Discrete structures", and "2WF90 - Algebra for security" before taking this course.
If not you can find some material in the Literature section but note that you are on your own for learning this.

Lectures take place Mon block 3 and 4 Helix 0.01 (except for Jan 16 where it is in Flux 1.02) and Thu block 7 and 8; the latter are in Flux 1.02 on 17 Nov and 19 Jan and in Aud 04 otherwise. Out of the 4 rooms for instructions (Thu block 5 and 6) we are merging into two rooms Luna 1.056 and Gemini A3.08. Please join the room that's emptier. There is also an online instruction session on wonder.me; check the announcements for details and the room link.
There will be no lectures on 22 December.
Lectures will be recorded and for the last two years the course has been running online with short videos – one video per topic, so several videos per unit. This means that you do not need to attend the in-person lectures if you're concerned about your health. Please stay home and participate remotely if you are sick.

The teaching assistants for this course are:

Literature and software

It is not necessary to purchase a book to follow the course.

For some background on algebra see

Some nice books on crypto (but going beyond what we need for this course) are

For easy prototyping of crypto implementations I like the computer algebra system Sage. It is based on Python and you can use it online or install it on your computer (in a virtual box in case you're running Windows). I recorded a video to demonstrate how to use Sage https://www.sagemath.org/, covering basics of finite fields and elliptic curves. The latter are not needed for this course, so watch from the start until minute 10:50 and then again for about 2 minutes after 19:30.
I also wrote a short "cheat sheet" with commands for Sage, see here.

For encrypting your homeworks you should use GPG/PGP. If you're running Linux then GnuPG is easy to install. If you're using Windows I recommend using GPG4win; if you're using macOS you can use GPG Suite. We are OK with having only the attachment being encrypted and signed, but prefer proper encryption of the whole email. Thunderbird has good integration and I hear that Outlook can also work well with the plugins.

Examination

30% of the grade is determined by homeworks. There will be six sets of homework during the quarter. You should hand in your homework in groups of 2 or 3. To make sure that you get used to crypto we require solutions to be sent encrypted and signed with GPG/PGP. Each participant must have communicated with the TAs at least once using GPG/PGP. You can find the keys for the TAs linked above or also on the key servers.
If for some reason you need to email me, you can find my public key for tanja@hyperelliptic.org on the key servers and on my homepage.

The exam will take place on 23 January 2023, 13:30 - 16:30; the rooms are SSC Sporthal 2(A-C). The retake is scheduled for April 17, 18:00 - 21:00, the rooms are Atlas 4.215 and 4.225. The exam accounts for the remaining 70% of the grade. The plan is to go back to the pre-Corona style of written exams without notebook support. So check out the exams from the 2019 edition and earlier (last exam Jan 2020) for examples.

Videos and slides

The in-person lectures will be recorded and should appear at videocollege.tue.nl.
For the 2020 and 2021 editions of the course I recorded a lot of short videos which you can find on the YouTube Channel.
The course page for 2021 has short descriptions of all videos, slides, and no-cookie links to the YouTube videos. Watch them from there if you're on a low-cookie diet. For even fewer cookies, you can find the videos on surfdrive. File names match the file names of the slides.

Class notes and exercise sheets

This section is extended through the course with notes of what happened in class and links to blackboard pictures. You can also find here the exercise sheets for the sessions on Thursdays.

14 Nov 2022
This was a full-on Murphy's law lecture with the system for the whole room failing, meaning that audio didn't work, I couldn't show slides (apart from the last 5 min of the second hour using a portable beamer), and the light for the blackboards was dim. The person from AV support made a heroic effort to fix it but there was a remarkable level of things going wrong.

We covered the requirements that cryptography should achieve, namely to provide confidentiality, integrity, and authenticity, against attackers that have access to all messages sent and who may attempt to inject messages or to modify messages. We covered the Caesar cipher (original and a keyed version), substitution ciphers, the one-time pad, and the Vigenère cipher as well as how to break them and how big the keyspace is. Here are the slides that I tried to show.
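To make the keyspace argument concrete, here is a minimal Python sketch of the plain Caesar cipher together with an exhaustive search over all 26 shifts. The function names and the restriction to the letters A-Z are choices made for this illustration, not something fixed by the lecture.

```python
import string

ALPHABET = string.ascii_uppercase  # assumption: messages use only A-Z


def caesar_encrypt(plaintext: str, shift: int) -> str:
    """Shift every letter by `shift` positions; leave other characters alone."""
    out = []
    for ch in plaintext.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)


def caesar_decrypt(ciphertext: str, shift: int) -> str:
    """Decryption is encryption with the negated shift."""
    return caesar_encrypt(ciphertext, -shift)


def brute_force(ciphertext: str):
    """The keyspace has only 26 elements, so simply try every shift."""
    return [(shift, caesar_decrypt(ciphertext, shift)) for shift in range(26)]


ciphertext = caesar_encrypt("HELLO", 3)
candidates = brute_force(ciphertext)
```

A human (or a simple letter-frequency score) then picks the candidate that reads as real text.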

Pictures of black boards are here.

17 Nov 2022
Here is the exercise sheet for block 5 and 6: exercise-1.pdf. See also the raw data if paste fails.
If you get stuck on the Hill cipher watch the 2021 video on XGCD under Math background on the 2021 course page.

For most of the exercises the solution is obvious when you have it. If you're not sure, please ask the TA in your room to check your solution. There are many more pages on the web with tools for cryptanalysis of classical ciphers, e.g. https://www.guballa.de/vigenere-solver, https://www.braingle.com/brainteasers/codes/index.php, http://www.cryptool-online.org, http://practicalcryptography.com/ciphers/, https://www.boxentriq.com/code-breaking.

We covered column transposition as another historical cipher. The Playfair and Hill ciphers were covered in the instruction session, see the exercise sheet. I commented briefly on rotor machines like the Enigma. I showed column transposition and a picture of the Enigma from these slides.
We had some rotor machines from the Cryptomuseum on show in the MetaForum. Currently we have Benne de Weger's Hagelin on display (near the elevator on floor 6 of the MetaForum). Check out the extensive website of the Cryptomuseum on crypto machines and spy craft.

As a second topic I covered stream ciphers as an example of symmetric-key encryption and as a more practical alternative to one-time pads. All security concerns of one-time pads apply – never use them twice – and additionally stream ciphers need to be checked for whether they really produce random-looking outputs. The basic essential for avoiding the two-time-pad issue is to use an IV (initialization vector) to make sure that the stream cipher has a fresh randomized starting point for each message, and to make the space for the IV big enough to avoid repetitions.
The slides I used for the two-time-pad example are here.
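The two-time-pad problem can be seen in a few lines of Python. This is only a sketch: the keystream is taken from `os.urandom` as a stand-in for a stream-cipher output, and the two messages are made up for the illustration.

```python
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equally long byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


# Stand-in for a stream-cipher keystream that is (wrongly) used twice.
keystream = os.urandom(16)

m1 = b"attack at dawn!!"
m2 = b"retreat at nine!"

c1 = xor_bytes(m1, keystream)
c2 = xor_bytes(m2, keystream)

# The keystream cancels: c1 XOR c2 equals m1 XOR m2, without any key.
leak = xor_bytes(c1, c2)

# An attacker who knows (or guesses) m1 immediately obtains m2.
recovered_m2 = xor_bytes(leak, m1)
```

With a fresh IV per message the two keystreams differ and this cancellation no longer happens.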

On a high level I explained the difference between symmetric-key cryptography and public-key cryptography and what public-key encryption and signatures achieve along with the data flow. For submitting your homework you need to generate a keypair (a private and a public key). As the names suggest you should keep the private key private and can publish your public key. The latter is needed to encrypt messages to you and to verify signatures made by you, so that's what you need to send in along with your homework solution and that's also what your team mates need from you in order to communicate securely with you. You can find the keys of the TAs above under Announcements. I didn't want to wipe the board and used some slides for explaining the rest of public-key cryptography. You can find the slides here.

Pictures of black boards are here.

The first homework is due on 24 November 2022 at 13:30. Here is the first homework sheet. Please remember to submit your homework by encrypted, signed email to the TAs. Don't forget to include your public key and those of your team mates.

21 Nov 2022
We covered feedback-shift registers (FSRs), LFSRs and how to represent them via matrices.
(L)FSRs are examples of stream ciphers. The IV is used as the initial state and the key defines the feedback function. We have seen that depending on the function the output sequence can have a very short repeat. For LFSRs we have established that the output sequence is periodic (not just ultimately periodic) if c_0=1. We also established that the all-zero state leads to the all-zero sequence of period 1 for any LFSR. We covered 2 complete examples with state size 3, going through all states: one had periods 7 and 1, the other had 4, 2, 1, 1. Note that there are 2^n different states, so the periods need to sum to 2^n, so 8 in this example.
Finally we looked at the state update matrix and showed that its characteristic polynomial is x^n - c_{n-1}x^{n-1} - c_{n-2}x^{n-2} - ... - c_1x - c_0. Most of the time we consider sequences modulo 2 and then all the - turn into +. Note that n is the state size and the c_i are the update coefficients (being 1 for a closed wire and 0 for an open one).
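The period patterns can be reproduced with a short Python sketch that walks through all states of a small LFSR. The indexing convention (new bit = c_0 s_0 + ... + c_{n-1} s_{n-1} mod 2, state shifted by one position) and the two coefficient choices are assumptions made for this illustration; they happen to give the patterns 7, 1 and 4, 2, 1, 1, but they need not be the exact examples from the board.

```python
from itertools import product


def lfsr_step(state, coeffs):
    """One clock tick: drop s_0 and append sum of c_i * s_i modulo 2."""
    feedback = sum(c * s for c, s in zip(coeffs, state)) % 2
    return state[1:] + (feedback,)


def cycle_lengths(coeffs):
    """Return the lengths of all state cycles, largest first.

    Assumes c_0 = 1, so that the state update is invertible and every
    state lies on a cycle (the sequence is periodic, not just ultimately
    periodic).
    """
    n = len(coeffs)
    seen = set()
    lengths = []
    for start in product((0, 1), repeat=n):
        if start in seen:
            continue
        state, length = start, 0
        while state not in seen:
            seen.add(state)
            state = lfsr_step(state, coeffs)
            length += 1
        lengths.append(length)
    return sorted(lengths, reverse=True)


# coeffs = (c_0, c_1, c_2)
primitive = cycle_lengths((1, 1, 0))   # characteristic polynomial x^3 + x + 1
reducible = cycle_lengths((1, 1, 1))   # characteristic polynomial x^3 + x^2 + x + 1
```

In both cases the cycle lengths add up to 2^3 = 8, and the all-zero state always forms a cycle of length 1.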

Pictures of black boards are here. Somebody noticed that I made a mistake in the determinant computation. I have fixed this on the last picture (in pink), the c1 needed a minus sign.

24 Nov 2022
Here is the exercise sheet for block 5 and 6: exercise-2.pdf. You should really try to solve these exercises and make some conjectures about how orders, degrees, and periods fit together. You can call over the TAs for checking and I've put a quiz on Canvas so that you can check the numbers yourself -- and ask the TAs if you didn't get the right answer. I do expect that you use a computer for the larger examples.

In class we covered more details on LFSRs, in particular we took some of the conjectures on orders of C and f(x) that you should have found in the exercise session, turned them into theorems, and proved them. Now that they are proven, you can use them as facts, so proofs are useful. I mentioned Rabin's irreducibility test, you should know that from previous lectures and you should be able to recognize some irreducible polynomials of small degree namely x, x+1, x^2+x+1, x^3+x+1, and x^3+x^2+1 as factors of polynomials.

As a second topic we covered sums of LFSRs. This can be motivated by asking what we can build from given small hardware components, and in fact many designs are actually built from such pieces, but it is also a useful step in our quest to understand LFSRs and we got some ideas – and disproved some. See here for the slides that I showed of the sums (nicer drawings than I did on the board).

Pictures of black boards are here.

The second homework is due on 1 December 2022 at 13:30. Here is the second homework sheet. Please remember to submit your homework by encrypted, signed email to the TAs. Don't forget to include your public key and those of your team mates.

28 Nov 2022
This was another bout of Murphy's law. From what I understood from the person doing the recordings, the video for the second hour is likely messed up, too, in addition to the beamer being unkillable, showing wrong colors, and me messing up a proof. If you're following remotely it likely makes more sense for this lecture to watch the 'Math vs. Mystery' video (maybe with pauses and rewinding) and check the slides. In the live lecture I also resorted to using the slides when I was out of space where I could write.

At the end of the previous lecture we had found a counterexample to our hypotheses of what happens when we add two LFSRs. The main result of this lecture is that the characteristic polynomial of the LFSR that one obtains as the sum of two LFSRs equals the lcm of the characteristic polynomials of the LFSRs that are added up. We also proved that for an irreducible characteristic polynomial all non-zero output sequences have the same period length. Both of these results needed generating functions, which are very useful tools to know anyways. I got stuck in some pretty simple portion of the proof where I needed to rename the indices. In looking back, I think I had the result right in front of me without realizing it, because I somehow wanted to rename both rather than just one. I hope it's clearer on the picture or on the slides.

Further reading on finite fields and the mathematical theory of LFSRs is in Lidl/Niederreiter (see literature section), van Tilborg (see literature section), and David Kohel's lecture notes.

Pictures of black boards are here.

01 Dec 2022
Here is the exercise sheet for block 5 and 6: exercise-3.pdf.

LFSRs are used in practice because they are small and efficient, but they need a non-linear component to be secure. I covered three ciphers from telecommunication (A5/1, A5/2, and SNOW-3G), all of which are based on LFSRs with some nonlinear components. See the slides for their definitions and some details.
These ciphers were in every cell phone using GSM 2G and downgrade attacks were possible for a long time. It sounds like the 54 bits were a compromise between countries wanting strong crypto and others wanting weak crypto. One of the first postings on it, with some details on the history (note that the original link does not work now) and an attack idea for A5/1, is by Ross Anderson from 1994, but many details were missing. The full algorithm descriptions of A5/1 and its purposefully weakened sibling A5/2 were reverse engineered and posted in 1999 by Marc Briceno, Ian Goldberg, and David Wagner. The same group also showed a devastating attack on A5/2, allowing for real-time decryption. Sadly enough, the A5 algorithms allow downgrade attacks, so this is a problem for any phone which has code for it, which was most phones until recently. Also A5/1 does not offer 2^54 security (54 bits is the effective key length) but only 2^24 (with some precomputation/space). However, A5/2 is broken even worse, in 2^16 computations, with efficient code online, e.g. A5/2 Hack Tool.

Since making those slides in 2020 even more details on GSM have come out and just last year details of two more GSM ciphers, GEA-1 and GEA-2, were published and analyzed. I showed their design using page 7 of the scientific publication about the attack. The attack made news last year because many new cell phones were still supporting GEA-1, even though it was removed from the standard by ETSI in 2013, and because it was unusually weak: GEA-1 has a 64-bit key but it is designed such that after initialization there are only 2^40 different states (instead of 2^64 that should be possible). The design was not public, but a cursory look would not reveal this property; this strongly looks like a back door. GEA-2 does not have this problem (and so far is not discontinued) but it also offers far less security than it should. It is unclear whether the techniques used in the attack were known in the 1990s.

A nice overview of lightweight ciphers, including more modern and less broken ones is given by Alex Biryukov and Leo Perrin. This paper is from 2017, thus does not include the GEA attacks.

The cipher we analyzed in the exercise session is RC4. Even as recently as 2013 this was the preferred symmetric cipher for https. I find it quite surprising that such a widely used cipher exhibits properties we can find in an exercise session (well, knowing where to look, of course).
Note that RC4 was a secret design, available only as black box implementation. Soon after it was leaked as "arcfour" weaknesses were found.
I summarized some weaknesses on the board but for showing some of the biases I used these slides.

Better stream ciphers exist, e.g. the final portfolio from the eSTREAM competition has held up well.

Pictures of black boards are here.

The third homework is due on 08 December 2022 at 13:30. Here is the third homework sheet. Please remember to submit your homework by encrypted, signed email to all the TAs (and not to Tanja). Don't forget to include your public key and those of your team mates. Do not submit as a singleton, the minimum group size is 2.

05 Dec 2022
I showed an example for the proof from last week Monday using these slides
Stream ciphers protect the confidentiality of messages but not the integrity. An attacker flipping a ciphertext bit causes the same bit to flip in the plaintext. Integrity protection is achieved using Message Authentication Codes (MACs). MACs belong to symmetric-key crypto and verifying a MAC tag takes the same key as generating one. Hence the receiver is convinced of the authenticity of a message but cannot convince a 3rd party of it. The latter is achieved by signatures, which can be verified using a public key.
To build MACs we need the definition of cryptographic hash functions. These compress arbitrary-length messages to fixed-length messages. Hash functions find use also outside crypto, e.g. to deduplicate data and as short fingerprints, but we require more properties. Cryptographic hash functions need to provide pre-image resistance, second pre-image resistance, and collision resistance. If the output of the hash function has n bits then finding a collision takes on average 2^(n/2) trials (use the birthday paradox to see this) and finding a preimage or second preimage takes on average 2^n trials.
MD4 is completely broken; for MD5 it's easy to find collisions and meaningful ones are possible, see MD5 considered harmful today: Creating a rogue CA certificate. First SHA-1 collisions were computed in 2017 (see https://shattered.io/) followed by a way to create meaningful SHA-1 collisions in https://sha-mbles.github.io/. SHA-2 (which includes SHA-256, SHA-384, and SHA-512) and SHA-3 (and the other SHA-3 finalists) are likely to be OK.
We covered design of hash functions using the Merkle-Damgaard construction and how this enables length-extension attacks on a simple MAC construction unless special padding is used. On the bright side, the iterative design of the function makes it easier to study it. H is collision resistant if the compression function C is, so cryptanalysts can focus on C. Note that the IV is fixed in hash functions.
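The birthday bound for collisions mentioned above can be tried out on a deliberately weakened hash. The sketch below truncates SHA-256 to n = 24 bits (the truncation, the choice of n, and the use of counter values as messages are assumptions for this illustration) and stores all hash values until two messages collide; the expected number of trials is on the order of 2^(n/2) = 4096.

```python
import hashlib
from itertools import count


def truncated_hash(data: bytes, n_bits: int) -> int:
    """Toy hash: the top n_bits of SHA-256, as an integer."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - n_bits)


def find_collision(n_bits: int):
    """Hash the messages b'1', b'2', ... until two of them collide.

    Returns (first_message, second_message, number_of_trials).
    """
    seen = {}
    for trials in count(1):
        msg = str(trials).encode()
        h = truncated_hash(msg, n_bits)
        if h in seen:
            return seen[h], msg, trials
        seen[h] = msg


msg_a, msg_b, trials = find_collision(24)
```

Finding a preimage for a given 24-bit value by the same trial-and-error approach would take about 2^24 trials, so the gap between the two bounds is already visible at this toy size.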

Pictures of black boards are here.

08 Dec 2022
Here is the exercise sheet for block 5 and 6: exercise-4.pdf.

After a short recap of the MD construction and length extension (as feature and attack vector) we finished discussing MACs. I showed HMAC as a design that protects against length-extension attacks even when an MD-based hash function is used.
Block ciphers can encrypt data in blocks of fixed length n. If you encrypt each block separately your encryption is vulnerable to statistical attacks. A famous example of how weak this is is the ECB penguin. The name for this approach is ECB (electronic code book) mode. The approach means that identical blocks encrypt the same way.
To encrypt messages longer than one block you need to use a mode of operation. More reasonable modes than ECB are CBC, OFB, and CTR. These modes ensure that identical plaintext blocks do not lead to identical ciphertext blocks. Always make sure to include a MAC! I showed OFB and CTR from these slides. Note that the OFB slide was correct, the _input_ to the Enc function has been encrypted t-1 times, then it gets encrypted one more time before being xor'ed into M_{t-1}.
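Since the advice is to always include a MAC, here is a small Python sketch of HMAC with SHA-256, written out by hand and compared against the standard-library `hmac` module. The key and message are made up; the construction (two nested hash calls with the padded key XORed with the constants 0x36 and 0x5c) follows the usual HMAC definition.

```python
import hashlib
import hmac


def hmac_sha256_manual(key: bytes, message: bytes) -> bytes:
    """HMAC-SHA-256 written out: H((K xor opad) || H((K xor ipad) || m))."""
    block_size = 64  # SHA-256 processes 64-byte blocks
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()   # long keys are hashed first
    key = key.ljust(block_size, b"\x00")     # then padded with zero bytes
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + message).digest()
    return hashlib.sha256(opad + inner).digest()


key = b"shared secret key"
message = b"ciphertext or message to authenticate"

tag_manual = hmac_sha256_manual(key, message)
tag_library = hmac.new(key, message, hashlib.sha256).digest()

# Tags should be compared in constant time when verifying.
tags_match = hmac.compare_digest(tag_manual, tag_library)
```

The outer hash call is what stops length extension: an attacker who extends the inner hash still cannot produce the outer hash without the key.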

The exercise session today covered DES as an example of a block cipher. DES is a Feistel cipher. I showed the general design and gave some description of how it is possible to decrypt even though the functions f_i are not invertible. I showed the schematics of the function for DES from page 4 of these slides. The S-box is the non-linear part; the NSA strengthened the S-boxes in the original design (Lucifer) against differential attacks (but made the keys much shorter). In the exercises we saw that small changes in the input lead to big changes in the output. 56 bits for the key is not secure enough! The first brute-force attack was done with "DES Cracker" for 250k USD. In 2006 a team from Bochum and Kiel built COPACOBANA which can break DES in a week for 8980 EUR (plus some grad-student time).
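To see why a Feistel cipher can be decrypted although the round functions are not invertible, here is a toy Feistel network in Python. It is not DES: the 8-byte block, the round function built from truncated SHA-256, and the four made-up round keys are all assumptions for this illustration.

```python
import hashlib


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def round_function(half: bytes, round_key: bytes) -> bytes:
    """A non-invertible f_i: hash round key and half block, keep 4 bytes."""
    return hashlib.sha256(round_key + half).digest()[:4]


def feistel_encrypt(block: bytes, round_keys) -> bytes:
    """Each round maps (L, R) to (R, L xor f(R, k))."""
    left, right = block[:4], block[4:]
    for k in round_keys:
        left, right = right, xor_bytes(left, round_function(right, k))
    # Undo the last swap so that decryption can reuse the same code.
    return right + left


def feistel_decrypt(block: bytes, round_keys) -> bytes:
    """Decryption is encryption with the round keys in reverse order."""
    return feistel_encrypt(block, list(reversed(round_keys)))


round_keys = [b"k1", b"k2", b"k3", b"k4"]   # hypothetical round keys
plaintext = b"8 bytes!"
ciphertext = feistel_encrypt(plaintext, round_keys)
decrypted = feistel_decrypt(ciphertext, round_keys)
```

Decryption never inverts f: it recomputes f on the half that passed through the round unchanged and XORs the result away again.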

Pictures of black boards are here.

The fourth homework is due on 15 December 2022 at 13:30. Here is the fourth homework sheet. Please remember to submit your homework by encrypted, signed email to all TAs (and not to Tanja). Don't forget to include your public key and those of your team mates. Do not submit as a singleton, the minimum group size is 2.

12 Dec 2022
We covered attack scenarios against symmetric encryption in general and against DES in particular, covering single target and multi-target attacks and attacks succeeding with low probability.
DES takes only 2^56 trials for complete key search. 2-DES is only marginally harder to break than DES, taking 2^57 with a divide-and-conquer/meet-in-the-middle approach. Still in common use is 3-DES, sometimes with k1=k3. Full 3-DES needs 2^112 steps to break; there are some more weaknesses in 2-key 3-DES. This concludes the symmetric-key cryptography part of the course.

After setting up public key crypto for public-key encryption and signatures, we covered schoolbook RSA encryption and why it works. We also covered one pitfall of schoolbook RSA which happens because of too small exponents and lack of padding. We'll see more attacks in this scenario on Thursday (exercise and lecture). This is a good moment to brush up some mathematics, we need Fermat's little theorem, XGCD, and the Chinese Remainder theorem. If you don't remember how to compute exponentiations with the square-and-multiply method you should also revive that; note that for modular exponentiation it is important to compute a reduction after every squaring or multiplication to avoid having the results get too large. Try this on your pocket calculator to get immediate feedback on what I mean or see your computer running out of memory when computing m^d for large m and d. There is a YouTube video from the 2020 edition of this course on exponentiation. The slides are here. For CRT see video and slides.
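As a reminder of square-and-multiply with a reduction after every step, here is a minimal Python sketch (left-to-right over the bits of the exponent). The function name is chosen for this illustration; Python's built-in `pow(base, exponent, modulus)` does the same job and is used below only as a cross-check.

```python
def square_and_multiply(base: int, exponent: int, modulus: int) -> int:
    """Compute base^exponent mod modulus for exponent >= 0."""
    result = 1
    base %= modulus
    for bit in bin(exponent)[2:]:             # bits from most significant down
        result = (result * result) % modulus  # always square, then reduce
        if bit == "1":
            result = (result * base) % modulus  # multiply on 1-bits, then reduce
    return result


# Toy numbers: all intermediate values stay below modulus^2.
example = square_and_multiply(1234, 2753, 3233)
```

Without the reductions the intermediate result for m^d would have roughly d times as many digits as m, which is what makes the naive computation run out of memory.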

Pictures of black boards are here.

15 Dec 2022
Here is the exercise sheet for block 5 and 6: exercise-5.pdf.

We covered RSA signatures and that RSA is atypical in that decryption matches signing and encryption matches verification. Normally such operations are very different (apart from the general data flow that the former need the private key and the latter the public key).
We covered some more weaknesses of schoolbook RSA encryption: If e recipients all use public exponent e (and their own moduli n_i) then one can recover m^e modulo the product of the moduli which is m^e as an integer (without reduction) from which one can compute the integer e-th root with a normal calculator. I showed an example for e=3 on the last page of these slides.
The second weakness is if there are related messages, such as m_2=am_1 + b for known a and b. The proof of the relation between the expressions A and B is on these slides.
As a third weakness we showed that RSA encryption is homomorphic; per se this is not a problem (and can be a feature in some systems) but we showed that this means that schoolbook RSA is not OW-CCA-II secure. To understand what this means and why it matters we looked into security notions.
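The first weakness (the same message sent to e recipients with public exponent e) can be replayed with toy numbers. In this Python sketch the three moduli, the message, and the helper names are all made up for the illustration; the steps are CRT followed by an integer cube root. It needs Python 3.8 or later for `math.prod` and for modular inverses via `pow`.

```python
from math import prod


def crt(residues, moduli):
    """Chinese Remainder Theorem for pairwise coprime moduli."""
    N = prod(moduli)
    x = 0
    for r, n in zip(residues, moduli):
        Ni = N // n
        x += r * Ni * pow(Ni, -1, n)   # pow(..., -1, n) is the inverse mod n
    return x % N


def integer_root(value: int, e: int) -> int:
    """Largest integer r with r^e <= value, found by binary search."""
    low, high = 0, 1
    while high ** e <= value:
        high *= 2
    while high - low > 1:              # invariant: low^e <= value < high^e
        mid = (low + high) // 2
        if mid ** e <= value:
            low = mid
        else:
            high = mid
    return low


e = 3
moduli = [53 * 59, 71 * 83, 89 * 101]  # toy RSA moduli, pairwise coprime
m = 1234                               # the message, smaller than every modulus
ciphertexts = [pow(m, e, n) for n in moduli]

# m^e < n_1 * n_2 * n_3, so CRT returns m^e as an integer.
m_to_e = crt(ciphertexts, moduli)
recovered = integer_root(m_to_e, e)
```

No private key is involved anywhere; randomized padding prevents the attack because the recipients then no longer encrypt the same integer.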

Security notions and attack definitions formalize what we consider an attack and what powers the attacker has. We covered these for public-key encryption and signatures:

The abilities of the attacker vary; for signatures it might be a key-only attack (KOA), a known message attack (KMA), or a chosen message attack (CMA). In the latter two cases the attacker sees valid signature pairs (m,s); in CMA the attacker can choose for which messages he sees signatures.

For encryption the attacker may do a chosen-plaintext attack (CPA) or a chosen-ciphertext attack (CCA). There are two versions of CCA security: in CCA-I the attacker gets to request decryptions of arbitrary ciphertexts until he sees c; in CCA-II the attacker can request decryptions of ciphertexts c' (not equal to c) also after receiving c.

For signatures, the attack goal is to forge signatures, this could mean to generate any valid (m,s) pair (existential forgery) or to generate valid (m,s) for a meaningful message m (universal forgery).
For encryption the goal is to recover plaintext from ciphertext, i.e. to break one-wayness. We typically request that a scheme is so strong that the attacker learns no information about the plaintext from the ciphertext; this is called semantic security. However, this is hard to deal with in practice. An equivalent and more practical security requirement is indistinguishability: the attacker chooses two messages m0 and m1 and is then presented with a ciphertext c which encrypts one of m0 and m1. The attacker wins if he correctly guesses which, i.e., if he can distinguish the ciphertexts. To deal with a 50% chance of guessing, the advantage of the attacker is defined as the extra chance on top of the 50%.
See the blackboard pictures or these slides for more precise definitions.

As said above, RSA is homomorphic, which defeats some security notions and can mean real attacks (depending on the setting). Schoolbook RSA is not OW-CCA-II secure, because of the homomorphic property: to decrypt c the attacker can ask for the decryption of c'=c*2^e (using the public exponent e), obtain m' = 2m and get m = m'/2 modulo n.
Because schoolbook RSA is deterministic, it is not even CPA secure: the attacker can simply encrypt m0 and m1 himself and compare the results to c.
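Both attacks on schoolbook RSA fit into a few lines of Python. The sketch uses a toy key (p = 61, q = 53, e = 17) and a hypothetical `decryption_oracle` that refuses only the challenge ciphertext itself; the attacker multiplies c by an encryption of 2, which needs only the public key.

```python
# Toy RSA key; real keys are far larger.
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (Python 3.8+)


def encrypt(m: int) -> int:
    return pow(m, e, n)


def decryption_oracle(c: int, challenge: int) -> int:
    """CCA-II oracle: decrypts anything except the challenge ciphertext."""
    if c == challenge:
        raise ValueError("decryption of the challenge is not allowed")
    return pow(c, d, n)


# Homomorphic attack on one-wayness.
m = 65
c = encrypt(m)
c_prime = (c * pow(2, e, n)) % n           # encryption of 2*m, made from c
m_prime = decryption_oracle(c_prime, c)    # oracle returns 2*m mod n
recovered = (m_prime * pow(2, -1, n)) % n  # divide by 2 modulo n

# Determinism breaks indistinguishability without any oracle.
m0, m1 = 100, 200
challenge = encrypt(m1)                    # the challenger picked m1
guess = 0 if encrypt(m0) == challenge else 1
```

Randomized padding such as RSA-OAEP addresses both problems at once: ciphertexts are no longer deterministic and a modified ciphertext no longer decrypts to a related valid message.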

To make RSA a randomized encryption one uses some padding; however this is also error prone. PKCS#1 v1.5 is a negative example which is broken using Bleichenbacher's attack. Take a look at https://robotattack.org/ for a recent use of Bleichenbacher's attack in practice. You should be able to understand details of the full paper Return Of Bleichenbacher's Oracle Threat. RSA-OAEP is a better padding scheme.
I showed PKCS#1 v1.5 from the last page of these slides

Pictures of black boards are here.

The fifth homework is due on 22 December 2022 at 13:30. Here is the homework sheet. Please remember to submit your homework by encrypted, signed email to all TAs. Don't forget to include your public keys. There is no instruction or lecture on 22 December, but the homework deadline goes through as normal and we will post a sheet of exercises for your entertainment.

19 Dec 2022
Today we covered the Diffie–Hellman key exchange as interactive protocol and as semi-static DH where one party uses a long-term public key. Eve can compute the shared secret if she can compute a from h_A or b from h_B. There might be other ways than computing these discrete logarithms but it is important to choose the groups such that these attacks are as hard as possible.
After discussing some examples of bad groups and good groups (multiplicative group of a finite field or group of points on an elliptic curve) we covered a generic attack, i.e., an attack that works for any group.
The Baby-Step-Giant-Step (BSGS) algorithm is an algorithm to compute discrete logarithms in a cyclic group with generator g, i.e. given g and h_A =g^a it computes a.
Put m=floor(√l), where l is the order of g. For us that is p-1 for now. BSGS computes all powers of the generator from g^0=1 up to g^(m-1), these are the baby steps (incrementing by 1 in the exponent). Then it computes S=g^(-m) and checks for each h_A * S^j for j = 0,... whether it is in the list of baby steps; these are the giant steps (moving by -m in the exponent). If a match happens, we have g^i=h_A * S^j = g^a*g^(-mj), thus a=i+mj.
To see an example of BSGS you can watch this video on YouTube along with the slides
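For experimenting, here is a minimal Python sketch of BSGS in the multiplicative group modulo a prime p, with S = g^(-m) as the giant step. The function signature and the toy parameters p = 101, g = 2 (a generator of order 100) are choices made for this illustration. It needs Python 3.8 or later for negative exponents in `pow`.

```python
from math import isqrt


def bsgs(g: int, h: int, p: int, order: int):
    """Return a with g^a = h mod p and 0 <= a < order, or None if none exists."""
    m = isqrt(order)                     # m = floor(sqrt(l))

    # Baby steps: table of g^i -> i for i = 0, ..., m-1.
    baby = {}
    power = 1
    for i in range(m):
        baby.setdefault(power, i)
        power = power * g % p

    # Giant steps: h * S^j with S = g^(-m), for j = 0, 1, ...
    S = pow(g, -m, p)
    gamma = h % p
    for j in range(order // m + 1):
        if gamma in baby:
            return baby[gamma] + m * j   # g^i = h * g^(-m*j), so a = i + m*j
        gamma = gamma * S % p
    return None


p, g, order = 101, 2, 100
secret = 77
h_A = pow(g, secret, p)
found = bsgs(g, h_A, p, order)
```

The table has about √l entries and the loop takes about √l steps, so time and memory both grow like the square root of the group order.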

Pictures of black boards are here.

22 Dec 2022
There are no lectures or instructions today. You can do the 6th exercise sheet exercise-6.pdf whenever you get around to it and verify your results with the quiz on Canvas. Please let Tanja know if some answer in Canvas does not work. This is a new quiz.

The sixth (and last) homework is due on 12 January 2023 at 13:30. Here is the homework sheet. Please remember to submit your homework by encrypted, signed email to all TAs. Don't forget to include your public keys. This is your last chance if you still need to do an encrypted round of communication.

Enjoy your holidays. If you want to do some crypto take a look at the old exams (below). The exam for 2WF80 will be in person, on paper, without laptops, without books, so pick exams no more recent than Jan 2020 for practicing in the same setting. The last two years had different conditions.

09 Jan 2023
Stan Korzilius was so nice to give this lecture.
The DH protocol was published in 1976 and at the time they seemed disappointed, because they "only" had a key exchange protocol, not an encryption system. Nowadays the DH key exchange in combination with symmetric crypto is used more than RSA, for example.
Nevertheless, ElGamal later found a way to turn the idea into a cryptosystem. The system is homomorphic, so not OW-CCA II secure. It also assumes that the message m is in G. It starts the same as the DH key exchange, with Alice choosing a secret key a and computing and publishing public key h_a = g^a. Then Bob, who wants to encrypt something to Alice, chooses a random k and computes r = g^k. He also takes Alice's public key and computes (h_a)^k. This is basically the shared key from DH. He then encrypts the message by multiplying with this key: c = m(h_a)^k. Finally, he sends (r,c) to Alice. Alice can decrypt by computing m = c / r^a. The system has some weaknesses and is included mainly for historical purposes. In practice, semi-static DH in combination with symmetric crypto is a better solution.
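The ElGamal encryption steps can be followed in a short Python sketch. The toy parameters p = 467 and g = 2 (a generator of the full multiplicative group modulo 467) and the function names are assumptions for this illustration; messages are integers between 1 and p-1.

```python
import secrets

p, g = 467, 2          # toy group: g generates all of (Z/p)^*


def keygen():
    """Alice: secret a, public h_a = g^a."""
    a = secrets.randbelow(p - 2) + 1
    return a, pow(g, a, p)


def encrypt(h_a: int, m: int):
    """Bob: fresh random k, r = g^k, c = m * h_a^k."""
    k = secrets.randbelow(p - 2) + 1
    r = pow(g, k, p)
    c = m * pow(h_a, k, p) % p
    return r, c


def decrypt(a: int, r: int, c: int) -> int:
    """Alice: m = c / r^a, with the division done as a modular inverse."""
    return c * pow(pow(r, a, p), -1, p) % p


a, h_a = keygen()
r1, c1 = encrypt(h_a, 123)
r2, c2 = encrypt(h_a, 3)

# Homomorphic property: componentwise products encrypt the product 123 * 3.
product_plain = decrypt(a, r1 * r2 % p, c1 * c2 % p)
```

The last line is the reason the scheme is not OW-CCA-II secure: anyone can turn a ciphertext into a different ciphertext of a related message.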

ElGamal signatures are a little more involved. We will use l = ord(g). Alice again picks random a and computes and publishes h_a = g^a. To sign a message m, she picks a random k, computes r=g^k, and s=k^{-1}(H(m)-ar) mod l. Every term in s is defined modulo l, except r, which is defined modulo p. Therefore the value that is sent for r should be used exactly as is. The signature is then (r,s). It is accepted when g^{H(m)}-r^s(h_a)^r = 0. Anyone not knowing a cannot compute s. Note that computations in the exponent happen automatically modulo l, otherwise modulo p. Breaking the system is as hard as the DDH problem. ElGamal signatures serve as a basis for more modern signature systems such as DSA.
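Signing and verification can be checked with toy numbers. In this Python sketch p = 467, g = 2, l = p-1, and H is SHA-256 reduced modulo l; these parameters and the function names are assumptions for the illustration. The random k is chosen coprime to l so that k^{-1} mod l exists.

```python
import hashlib
import secrets
from math import gcd

p, g = 467, 2
l = p - 1                      # order of g in this toy group


def H(message: bytes) -> int:
    """Hash to an exponent: SHA-256 reduced modulo l."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % l


def keygen():
    a = secrets.randbelow(l - 1) + 1
    return a, pow(g, a, p)


def sign(a: int, message: bytes):
    while True:
        k = secrets.randbelow(l - 1) + 1
        if gcd(k, l) == 1:     # k must be invertible modulo l
            break
    r = pow(g, k, p)
    # r is used as an integer here, exactly as it is sent.
    s = pow(k, -1, l) * (H(message) - a * r) % l
    return r, s


def verify(h_a: int, message: bytes, r: int, s: int) -> bool:
    if not 0 < r < p:
        return False
    # Accept if g^H(m) = r^s * h_a^r modulo p.
    return pow(g, H(message), p) == pow(r, s, p) * pow(h_a, r, p) % p


a, h_a = keygen()
r, s = sign(a, b"homework 6")
ok = verify(h_a, b"homework 6", r, s)
bad = verify(h_a, b"homework 6", r, (s + 1) % l)
```

Verification works because k*s = H(m) - a*r modulo l, so r^s * h_a^r = g^(k*s + a*r) = g^H(m).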

We also discussed the idea of secret sharing. In many cases the private key should not be in possession of a single user. A nice analogy is a bank vault which requires multiple keys to open it. The Shamir secret sharing scheme is a t-out-of-N threshold system based on polynomials. The idea behind it is that it requires t points to recover a (t-1)-degree polynomial. The sharer generates a random polynomial of degree (t-1), such that f(0)=s. The values at x=1, x=2, etc. are the shares received by the parties. The polynomial, or only the secret value, can be reconstructed from sufficient shares by Lagrange interpolation. t-1 or fewer users learn nothing about the secret (every possible value has the same probability).
In practice nobody should ever know s. In theory a trusted party could compute and use s, then forget it, and only share the result of the required computation. A better solution is that users locally compute the desired output with their shares, and only then combine their shares to reveal the result (but not s). Also, often you want nobody to know the secret (which could be a secret key), not even the sharer. This can be achieved by generating it in a distributed manner: Everybody picks a random integer. The secret is then the sum of all the integers. To compute it every user shares its integer in a t-out-of-N manner (every user should receive the shares corresponding to the same x-value), and then adds all the shares it received (including the share of its own secret). This is basically adding the underlying random polynomials.
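
The distributed generation described above can be simulated in one process. This is a sketch under assumptions: the field size PRIME and all function names are made up for the example, and the function returns the individual picks only so that the result can be checked; in a real run nobody would see them together.

```python
import secrets

PRIME = 2**61 - 1   # assumed field size for this sketch

def share(value, t, n):
    """Return [f(1), ..., f(n)] for a random polynomial f of degree < t with f(0) = value."""
    coeffs = [value] + [secrets.randbelow(PRIME) for _ in range(t - 1)]
    return [sum(c * pow(x, e, PRIME) for e, c in enumerate(coeffs)) % PRIME
            for x in range(1, n + 1)]

def interpolate_at_zero(points):
    """Lagrange interpolation at x = 0 from (x, y) pairs."""
    total = 0
    for xi, yi in points:
        num = den = 1
        for xj, _ in points:
            if xj != xi:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total

def distributed_generation(t, n):
    picks = [secrets.randbelow(PRIME) for _ in range(n)]   # user i's private integer
    dealt = [share(s_i, t, n) for s_i in picks]            # dealt[i][j] goes to user j+1
    # User j+1 adds everything it received for its own x-value j+1,
    # which amounts to adding the underlying polynomials.
    combined = [sum(dealt[i][j] for i in range(n)) % PRIME for j in range(n)]
    return picks, combined

picks, combined = distributed_generation(t=3, n=5)
points = [(j + 1, combined[j]) for j in (0, 2, 4)]         # any 3 users
assert interpolate_at_zero(points) == sum(picks) % PRIME
```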

Stan was kind enough to take pictures of the blackboards; they are here.

12 Jan 2023
Here is the exercise sheet for block 5 and 6: exercise-7.pdf.

In the lecture we discussed some more issues with ElGamal and that it is important that m is in the group generated by g. Here is the article on the Moscow voting system that I mentioned which didn't ensure that all messages were squares while g generated the subgroup of squares.
Instead of having to worry about getting m into this group (not so terrible for DH in finite fields, much more of a hassle when the group changes to elliptic curves) we can use the semi-static DH system which, instead of computing c=(g^a)^r * m, computes a symmetric-crypto key k = Hash((g^a)^r) and uses that as the key in symmetric authenticated encryption to send m. A system combining a public-key crypto system with a symmetric-key one is called a hybrid system.
This is a special case of the KEM-DEM terminology (Key-Encapsulation Mechanism and Data-Encryption Mechanism), where the KEM makes sure that sender and receiver have the same k and then use that in the DEM part for symmetric-key authenticated encryption to send the data. I showed a generic method of how to turn a public-key encryption system into a KEM. This nicely avoids several of the issues we have seen with RSA.
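
As a rough illustration of the KEM part, here is a sketch of a semi-static DH KEM with k = Hash((g^a)^r). It is not the generic construction from the lecture; the toy parameters, the use of SHA-256 as Hash, and the names encapsulate and decapsulate are assumptions for this example. The DEM part (symmetric authenticated encryption under k) is left out.

```python
import hashlib
import secrets

# Toy parameters, assumed for illustration only (far too small for real use).
P, Q, G = 2039, 1019, 4

def _hash_to_key(x: int) -> bytes:
    """k = Hash(x); SHA-256 over a fixed-length encoding is an assumption."""
    return hashlib.sha256(x.to_bytes((P.bit_length() + 7) // 8, "big")).digest()

def keygen():
    a = secrets.randbelow(Q - 1) + 1
    return a, pow(G, a, P)                       # long-term (semi-static) key pair

def encapsulate(h_a):
    """Sender: returns (ciphertext g^r, symmetric key k = Hash((g^a)^r))."""
    r = secrets.randbelow(Q - 1) + 1             # ephemeral secret
    return pow(G, r, P), _hash_to_key(pow(h_a, r, P))

def decapsulate(a, c):
    """Receiver: recomputes k = Hash((g^r)^a)."""
    return _hash_to_key(pow(c, a, P))

a, h_a = keygen()
c, k_sender = encapsulate(h_a)
assert decapsulate(a, c) == k_sender             # both sides now hold the same k
```

The message m never has to be mapped into the group; it is only handled by the symmetric part that uses k.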
Finally I covered the security notions DLP (we have seen that already), CDHP (Computational Diffie-Hellman problem), and DDHP (Decisional Diffie-Hellman problem). Coming back to the fact that it's easy to determine if an h in F_p is a square we found an attack on the DDHP that successfully finds that (g,g^a,g^b,h) is not valid with probability 1/2. If you want to see an example of this attack, check out the last page of these slides.
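
The square test behind this attack can be sketched with Euler's criterion. The toy prime P and the generator G of the full group are assumptions chosen for this example; the only property the distinguisher needs is that g is not a square.

```python
# Toy parameters, assumed for illustration: P is prime and G = P - 2 is a
# non-square modulo P (it generates the full multiplicative group).
P = 2039
G = 2037

def is_square(x: int) -> bool:
    """Euler's criterion: x is a nonzero square mod P iff x^((P-1)/2) = 1."""
    return pow(x, (P - 1) // 2, P) == 1

def looks_like_dh_tuple(ga: int, gb: int, h: int) -> bool:
    """Return False if (g, g^a, g^b, h) is certainly not a DH tuple."""
    a_even = is_square(ga)            # g non-square: g^a is a square iff a is even
    b_even = is_square(gb)
    expected_square = a_even or b_even    # g^{ab} is a square iff ab is even
    return is_square(h) == expected_square

ga, gb = pow(G, 3, P), pow(G, 5, P)
assert looks_like_dh_tuple(ga, gb, pow(G, 15, P))     # a real DH tuple always passes
# Exactly half of all possible h are rejected for this pair.
rejected = sum(not looks_like_dh_tuple(ga, gb, h) for h in range(1, P))
assert rejected == (P - 1) // 2
```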
Similar attacks would work for checking cubes, 5th powers, 7th powers etc. which motivates working in a prime order subgroup. Note that then the DDHP is adjusted to having h picked as g^{ab} or as g^c for random c so that one does not trivially break it by noticing that h is not in the right subgroup.
Any system based on the DLP has hardness at most the square root of the group order, because generic attacks solve the DLP in about that many group operations. For elliptic-curve groups that's also the best known attack complexity, while there are faster attacks on the finite-field DLP which reduce the complexity to that of RSA numbers of the same size, so p has 2048 - 4096 bits. g is always taken to generate a prime-order subgroup. The choices are that p = 2p'+1 with p' a prime as well and g generating the subgroup of squares, or that g generates a much smaller (256 bits) subgroup.
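
To make the square-root bound concrete, here is a sketch of baby-step giant-step, one standard generic algorithm that reaches it (not necessarily the one presented in the lecture). The toy parameters and the function name are assumptions for this example.

```python
import math

def baby_step_giant_step(g, h, order, p):
    """Find x in [0, order) with g^x = h mod p using about sqrt(order) steps."""
    m = math.isqrt(order) + 1
    # Baby steps: table of g^j for j = 0, ..., m-1.
    baby = {}
    cur = 1
    for j in range(m):
        baby.setdefault(cur, j)
        cur = cur * g % p
    # Giant steps: compare h * g^{-i*m} against the table.
    factor = pow(g, -m, p)
    gamma = h
    for i in range(m):
        if gamma in baby:
            return (i * m + baby[gamma]) % order
        gamma = gamma * factor % p
    return None                      # h is not in the subgroup generated by g

# Toy parameters, assumed for illustration: G = 4 has prime order Q mod P.
P, Q, G = 2039, 1019, 4
assert baby_step_giant_step(G, pow(G, 777, P), Q, P) == 777
```

For this toy group the table has 32 entries instead of the 1019 candidates an exhaustive search would try.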

Pictures of black boards are here.

16 Jan 2023

We covered how Eve can mount a man-in-the-middle attack on unauthenticated DH and how to avoid this attack using signatures or 3-DH with long-term keys and ephemeral keys. We also covered the Needham-Schroeder protocol as an important example of how not to achieve authenticity by showing how Eve can impersonate Alice to Bob if she can get Alice to start a conversation with her. The reason that this works is that Bob's reply is not linked to the scope of his exchange with Eve, so Eve can pass it on to Alice as her own message in the conversation between Alice and Eve. It's easy to fix this by including the public keys of all communicating parties in the encrypted message, or by including the hash of the transcript in the encrypted message (if space is an issue).
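
The man-in-the-middle attack on unauthenticated DH can be simulated in a few lines. The toy parameters and the function names are assumptions for this example; the point is only that Eve ends up sharing one key with Alice and another with Bob.

```python
import secrets

# Toy parameters, assumed for illustration only (far too small for real use).
P, Q, G = 2039, 1019, 4

def dh_keypair():
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)

def run_mitm():
    a, A = dh_keypair()          # Alice sends A, intended for Bob
    b, B = dh_keypair()          # Bob sends B, intended for Alice
    e, E = dh_keypair()          # Eve intercepts both and forwards E instead
    alice_key = pow(E, a, P)     # Alice believes this is shared with Bob
    bob_key = pow(E, b, P)       # Bob believes this is shared with Alice
    eve_with_alice = pow(A, e, P)
    eve_with_bob = pow(B, e, P)
    return alice_key, bob_key, eve_with_alice, eve_with_bob

alice_key, bob_key, eve_with_alice, eve_with_bob = run_mitm()
assert alice_key == eve_with_alice   # Eve can read and re-encrypt Alice's traffic
assert bob_key == eve_with_bob       # ... and likewise Bob's
```

Nothing in the exchange lets Alice or Bob notice the substitution, which is why the public values need to be authenticated.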

Finally we covered some unusual types of signatures. Blind signatures are used in eCash and are easy to get from homomorphic RSA signatures. Undeniable signatures limit who can verify a signature and require interaction between the signer and the verifier to verify. I showed Chaum's protocol which uses the DLP. For an example, and to show how Alice can convince Bob that she did not sign a message that she in fact did not sign, I used the last two pages from these slides.
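
A sketch of how a blind signature falls out of the homomorphic property of RSA. The toy key (primes 61 and 53, e = 17) and the function names are assumptions for this example; a real system would sign a hash of the message and use much larger primes.

```python
import math
import secrets

# Toy RSA key, assumed for illustration only.
PRIME1, PRIME2 = 61, 53
N = PRIME1 * PRIME2
E = 17
D = pow(E, -1, (PRIME1 - 1) * (PRIME2 - 1))   # signer's secret exponent

def blind(m):
    """User: hide m as m * r^e mod n for a random r coprime to n."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            return (m * pow(r, E, N)) % N, r

def sign_blinded(m_blinded):
    """Signer: ordinary RSA signature on the blinded value, without seeing m."""
    return pow(m_blinded, D, N)

def unblind(s_blinded, r):
    """User: (m * r^e)^d = m^d * r, so dividing by r leaves m^d."""
    return (s_blinded * pow(r, -1, N)) % N

def verify(m, s):
    return pow(s, E, N) == m % N

m = 1234
m_blinded, r = blind(m)
s = unblind(sign_blinded(m_blinded), r)
assert verify(m, s)
```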

Pictures of black boards are here.

19 Jan 2023
Both the instruction and the lecture will be Q&A sessions. Ask anything, including old exams. Most relevant are the exams of Jan 2020 and earlier, i.e., in the handwritten format which matches what you'll encounter on the 23rd. It's best if you come prepared with questions.
You can see the questions on the pictures below. The LFSR is an example I made up on the fly, the part about ECB and CBC as well; everything else comes from old exams and is labeled as such.

Pictures of black boards are here.
Addition in 2024: exercise 7 as presented in the video and on the board is not correct; I have updated the picture to fix this.

23 Jan 2023
Exam! See above for room info.


Old Exams

This course was given for the first time in Q2 of 2014. Here are my exams so far