2WF80 -- Introduction to cryptology - Winter 2021

Contents Announcements Exams Literature Videos and slides Course notes and exercise sheets Old exams

Tanja Lange
Coding Theory and Cryptology
Eindhoven Institute for the Protection of Information
Department of Mathematics and Computer Science
Room MF 6.104B
Technische Universiteit Eindhoven
P.O. Box 513
5600 MB Eindhoven
Netherlands

Phone: +31 (0) 40 247 4764

The easiest ways to reach me wherever I am:
e-mail:tanja@hyperelliptic.org

This page belongs to course 2WF80 - Introduction to cryptology. This course is offered at TU/e as part of the bachelor's elective package Security.

Contents
Classical systems (Caesar cipher, Vigenère, Playfair, rotor machines), shift register sequences, DES, RC4, RSA, Diffie-Hellman key exchange, cryptanalysis by using statistics, factorization, attacks on WEP (aircrack).

Some words up front: Crypto is an exciting area of research. Learning crypto makes you more aware of the limitations of security and privacy which might make you feel less secure but that's just a more accurate impression of reality and it a good step to improve your security.
Here is a nice link collection of software to help you stay secure https://prism-break.org/en/ and private https://www.privacytools.io/.

Announcements

If you study mathematics, you should have participated in "2WF50 - Algebra" and "2WF70 - Algorithmic algebra and number theory".
If you study computer science or any other program you should have participated in "2DBI00 - Linear Algebra and Applications", "2IT50 or 2IT80 - Discrete structures", and "2WF90 - Algebra for security" before taking this course.
If not you can find some material in the Literature section but note that you are on your own for learning this.

Because of the Covid-19 situation we will not have any in-person lectures. I will provide recorded videos – one video per topic, so several videos per unit – and we'll have live sessions for the exercise hours, Thursdays block 5 and 6, and Q & A sessions on Mondays block 3 and 4. Last year I often stayed around after the exercise sessions for more questions and am happy to do this again this year.

The teaching assistants for this course are:

Literature and software

It is not necessary to purchase a book to follow the course.

For some background on algebra see

Some nice books on crypto (but going beyond what we need for this course) are For easy prototyping of crypto implementations I like the computer algebra system Sage. It is based on python and you can use it online or install it on your computer (in a virtual box in case you're running windows).
For encrypting your homeworks you should use GPG/PGP. If you're running Linux then GnuPG is easy to install. If you're using windows I recommend using GPG4win; if you're using MAC-OS you can use GPG Suite. We are OK with having only the attachment being encrypted and signed, but prefer proper encryption of the whole email. Thunderbird has good integration and I hear that also outlook can work well with the plugins.

Examination

30% of the grade is determined by homeworks. There will be six sets of homework during the quarter. You should hand in your homework in groups of 2 or 3. To make sure that you get used to crypto we require solutions to be sent encrypted and signed with GPG/PGP. Each participant must have communicated with the TAs at least once using GPG/PGP. You can find my public key for tanja@hyperelliptic.org on the key servers and on my homepage. You can find the keys for the TAs linked above or also on the key servers.

The exam will take place on 24 January 2022, 13:30 - 16:30. The exam accounts for the remaining 70% of the grade.

Videos and slides

The course covers the following topics, mostly in this order. See Course notes and exercise sheets for what you should prepare for each session and Canvas for quizzes for the videos.

This part has www.youtube-nocookie.com links to the YouTube videos. Watch them from here if you're on a low-cookie diet. For even fewer cookies, you can find the videos on surfdrive. File names match the file names of the slides.

If you prefer watching on YouTube, you can also go to the YouTube Channel for the course.

Introduction and concepts

This video covers basic concepts of cryptology.

See also the slides.

I recorded a video to explain public-key cryptography.
The video explains public-key cryptography for encryption and signatures. To submit your homeworks you will need to use PGP/GnuPG. See above for software suggestions. Note that the sender needs the public key of the recipient in order to send a message. This means that you need to send your public Key and those of your team mates to the TAs so that they can reply to you.

You can find the slides here.

Math background

If you don't recall how to invert integers modulo other integers I made a video explaining the extended Euclidean algorithm (XGCD)

with slides.

I recorded a video to demonstrate how to use Sage https://www.sagemath.org/, covering basics of finite fields and elliptic curves. The latter do not for this course so watch start till minute 10:50 and then again for about 2 minutes after 19:30.

I also wrote a short ``cheat sheet'' with commands for Sage, see here

More TBD

Historical ciphers

Watch this video to learn about the substitution cipher, Caesar cipher, one-time pad, and Vigen&egrav;re encryption as well as how to analyze them.

See also the slides.

This video covers more historical ciphers (Playfair, Hill cipher, column transposition, and very briefly rotor machines.

See also the here.

We currently have one rotor machines from the Cryptomuseum on show in the MetaForum (near the elevator on floor 6 - not that that helps you now, but it's something to look forward to). Check out their extensive website on crypto machines and spy craft.

(Linear) Feedback Shift registers

This first video introduces feedback shift registers and how to use them for encryption; the definitions of state, period, pre-period,, ultimately periodic sequences. We also prove the following lemma: If the sequence is periodic with period r and si = si+l for all i then r divides l.

The slides are here.

This video introduces linear feedback shift registers (LFSRs) and how we draw them, motivates that we want c0=1 (else there is just a delay in output).
I didn't show this but one can run the LFSR backwards and thus the is periodic, not just ultimately periodic.
The zero state produces the zero sequence, thus the max period is 2^n-1, For two examples the video determines all periods that can be generated by those LFSRs.

See here for the slides.

This video establishes a relationship between state update and a matrix, the state-update matrix. The order of the matrix is a multiple of the period of the LFSR.The video gives an example of how this order is computed and proves a formula for the characteristic polynomial of the matrix. This will be useful for further investigations.

See here for the slides.

Do not watch this video before you're done with exercise 1 on sheet 2 and have developed some conjectures.

This video takes some of the conjectures that you should have found, turns them into theorems and proves them. Once they are proven, you can use them as facts, so proofs are useful. The video also recalls the definition of irreducible polynomials and Rabin's irreducibly test.
See here for the slides.

Hardware design motivates combining short LFSRs, so this video looks at sums of LFSRs.

The video develops some hypotheses about sums of LFSRs and (partially) disproves them. Like for every movie ending with tension, there will be a sequel. Stay tuned for finding out why 21 is missing!
See here for the slides.

We have accumulated a bunch of open problems and hypotheses. This video

cleans everything up so that now we can look at an LFSR and with very few steps determine its properties. The slides are here.

LFSRs are used in practice because they are small and efficient, but they need a non-linear component to be secure. I cover three examples (A5/11, A5/2, and SNOW-3G) in this video:

The slides show the LFSRs and some details. I also cover the history. It sounds like the 54 bits were a compromise between countries wanting strong crypto and others wanting weak crypto. One of the first postings on it with some details on the history (note that the original link does not work now) an attack idea for A5/1 by Ross Anderson is from 1994, but many details were missing. The full algorithm descriptions of A5/1 and its purposefully weakened sibling A5/2 were reverse engineered and posted in 1999 by Marc Briceno, Ian Goldberg, and David Wagner. The same group also showed a devastating attack on A5/2, allowing for real-time decryption. Sadly enough, the A5 algorithms allow downgrade attacks, so this is a problem for any phone which has code for it, which is most until recently. Also A5/1 does not offer 2^54 security (54 bits is the effective key length) but only 2^24 (with some precomputation/space). However, A5/2 is broken even worse, in 2^16 computations, with efficient code online, e.g A5/2 Hack Tool.

Further reading.

Stream ciphers, block ciphers, modes

This video introduces basic concepts of stream ciphers and highlights the importance of the IV (initialization vector). To show why it is necessary I review issues with the "two-time pad".

The slides are here.
Stream ciphers are much more practical than the OTP in that the key is much shorter. To encrypt a message, expand the key into a stream of pseudo-random bits and xor those to the message, i.e., treat the stream-cipher output as the one-time pad. To encrypt multiple message it becomes necessary to remember how many bits have been used and either stay in that state or forward by that many positions the next time one uses the cipher. This is impractical. Initialization Vectors (IVs) deal with that problem in that they move the beginning of the stream to a random position. The IV is then sent in clear along with the ciphertext, so that the receiving end can compute the same starting position.

I've recorded a video to summarize the biases in RC4 and explain some. You should have solved exercise sheet 3 before watching this video.

The slides are here.

Note that RC4 was a secret design, available only as black box implementation. Soon after it was leaked as "arcfour" weaknesses were found.

Better stream ciphers exist, e.g. the final portfolio from the eSTREAM competition has held up well.

This video covers cryptographic hash functions

see here for the slides

Cryptographic hash functions need to provide preimage resistance, second preimage resistance, and collision resistance.

If the output of the hash function has n bits then finding a collision takes on average 2n/2 trials (use the birthday paradox to see this) and finding a preimage or second preimage takes on average 2n trials.

We covered design of hash functions using the Merkle-Damgaard construction and how this enables length-extension attacks. We used this as a feature in computing more SHA-1 iterations per second when searching for near collisions. See our write-up for more details. But there are also situations when this property is not welcome. One can deal with it by insisting on a fixed padding in the last block and checking for that.

Short summary of hash functions: MD4 is completely broken; for MD5 it's easy to find collisions, first SHA-1 collisions were computed in 2017 (see https://shattered.io/). SHA-256, SHA-512 and SHA-3 (and the other SHA-3 finalists) are likely to be OK.

This video is on message authentication codes (MACs)

see here for the slides

Stream ciphers are susceptible to attacks flipping bits in the ciphertext, which cause the same bits to flip in the plaintext. A fingerprint protects against accidental bit flips, but a proper Message Authentication Codes (MACs) need to resist adversarially chosen changes. Communicating parties A and B need an authentication key along with the encryption key. The easiest version of a MAC is to use a hash function to compute cryptographic checksum over the authentication key and the ciphertext. We want to have checksum on the ciphertext for easy and quick rejection of forged packets = Encrypt, then MAC. If you would use the simple MAC you would run into trouble with length-extension attacks. There are many MACs either as standalone constructions ore integrated into a block cipher + mode as in AES-GCM. The video covers HMAC as an example of MAC used in practice..

More TBD

Schoolbook RSA

TBD

Diffie-Hellman key exchange

TBD

Advanced use of crypto

TBD

Class notes and exercise sheets

This section is extended through the course. I will announce here which topics we cover which weeks and which videos you should watch to prepare for the live sessions. See here for videos and slides.

15 Nov 2021
Please watch the first two videos, about basic concepts and historical ciphers, before the first live session. In the live session on Mon 15 2021 in block 3 and 4 we will discuss general topics about the course and your questions about these videos. There is also a quiz on Canvas.

Here are the slides we discussed in the live Q & A session.

18 Nov 2021
In preparation, please watch the videos on XGCD (optional, only if you don't remember), the second historical ciphers video, and the first video on stream ciphers. Here is the exercise sheet for block 5 and 6: exercise-1.pdf. See also the raw data if paste fails.

For most of the exercises the solution is obvious when you have it. There are many more pages on the web with tools for cryptanalysis of classical ciphers, e.g. https://www.guballa.de/vigenere-solver, https://www.braingle.com/brainteasers/codes/index.php, http://www.cryptool-online.org, http://practicalcryptography.com/ciphers/, https://www.boxentriq.com/code-breaking.

The first homework is due on 25 November 2021 at 13:30. Here is the first homework sheet. Please remember to submit your homework by encrypted, signed email to the TAs. Don't forget to include your public key and those of your team mates.
Two errors have now been fixed in exercse 1. I was mixing solution and question when I typed the text; the known friends are called Wilhelmina and Theodor. There was also a character flip in an early versio of this sheet. Make sure to reload if you don't see Theodor mentioned.

22 Nov 2021
Please watch the videos on feedback-shift registers (FSRs), LFSRs and how to represent them via matrices.
Some time before Thursday also watch the lecture on public-key cryptography (under general concepts). You should watch this video to understand the data flow in PGP/GnuPG which you need for submitting your homework.
Please ask questions on Zulip so that I can prepare something for the Q&A session.

This time I got a few questions before the lecture, thanks for that. I clarified a few things regarding the Hill cipher and the column transposition cipher, see the screenshots here and here.
For questions regarding the exercise about the Hill cipher I used the slides from last year's live Q & A session which show how to compute c and d. I also talked about what we mean by non-random properties and why the IV matters/how it is used, but without making notes.

25 Nov 2021
We'll do another round on wonder.me; really hope the quirks from last week won't reappear. See you there at 13:30.
There are no new videos to watch before the exercise session, but brush up your abiilities with computer-algebra systems so that you can deal with large matrices and polynomials modulo 2.

Here is the exercise sheet for block 5 and 6: exercise-2.pdf. You should really try to solve these exercises and make some conjectures about how orders, degrees, and periods fit together. I'm holding back on some video with explanations to make you solve these. You shold call us over for checking but I'll also make a quizz available if you don't manage to finish on time.

The second homework is due on 2 December 2021 at 13:30. Here is the second homework sheet. Please remember to submit your homework by encrypted, signed email to the TAs. Don't forget to include your public key and those of your team mates.

29 Nov 2021
Please bring your questions on LFSRs.

In the live session we did some examples of sums of LFSRs see pictures 1, 2 3, 4.

02 Dec 2021
Here is the exercise sheet for block 5 and 6: exercise-3.pdf.

The cipher we analyzed in the exercise session is RC. I find it quite surprising that such a widely used cipher exhibits properties we can find in an exercise session (well, knowing where to look, of course). For much more info, see the video above.

The third homework is due on 09 December 2020 at 13:30. Here is the second homework sheet. Please remember to submit your homework by encrypted, signed email to all the TAs (and not to Tanja). Don't forget to include your public key and those of your team mates. Do not submit as a singleton, the minimum group size is 2.

06 Dec 2021
The topics for the Q & A session are everything we have covered so far, in particular the videos on RC4 and hash functions, but please ask about LFSRs as well, if anything is unclear. Again, I appreciate geting questions in advance.


Old Exams

This course was given for the first time in Q2 of 2014. Here are my exams so far