Contents | Announcements | Exams | Literature | Videos | Course notes & exercise sheets | Follow-up courses | Old Exams |
Tanja Lange
Coding Theory and Cryptology
Eindhoven Institute for the Protection of Information
Department of Mathematics and Computer Science
Room MF 6.103
Technische Universiteit Eindhoven
P.O. Box 513
5600 MB Eindhoven
Netherlands
Phone: +31 (0) 40 247 4764
The easiest ways to reach me wherever I am:
e-mail: tanja@hyperelliptic.org
Contents
Note that one of the course requirements is algebra. I will not repeat basic background on groups, rings and fields in class. If you don't have the necessary background, take the summer and work through the "Number Theory and Algebra" script.
It is not necessary to purchase a book to follow the course.
Previous versions of this course used
Henk van Tilborg's "Fundamentals of Cryptology", Kluwer academic
Publishers, Boston, 2000. But the book is out of print.
A preliminary author's copy by Henk can be downloaded in pdf form
here
and as a mathematica worksheet here.
Other books you might find useful (in alphabetical order):
The first exam is on 29 Oct 09:30 - 12:00 in AUD 15 and 16. The retake is on 28 Jan 18:00 - 21:00.
The videos from this course appear on
this page of videocollege.tue.nl. You can find
older editions of this course
here.
Note that this page shows lectures
from multiple years and that there are some differences between the
course versions; I taught the course with recordings in 2022 and 2019 and my colleague Andreas
Hülsing taught in 2018, so you can get different explanations.
For the 2021 edition of the course I recorded a lot of short videos
which you can find on the YouTube Channel.
The
course page for 2021 has short descriptions of all videos, slides,
and no-cookie links to the YouTube videos. Watch them
from there if you're on a low-cookie diet.
For even fewer cookies, you can find the
videos on surfdrive. File names match the file names of the slides.
This section is extended through the course with notes of what happened in class and links to blackboard pictures.
03 Sep 2024
First lecture, followed by exercise session.
General introduction to cryptography; concepts of public-key
and symmetric-key cryptography.
We covered Diffie-Hellman key exchange with some generic P and showed that A
and B both compute abP while E sees P, aP, and bP.
Then we covered the clock group over the reals (or rational numbers) as a bad example
where the attacker can easily solve the discrete logarithm problem by observing
how large the power of 5 grows when using P=(3/5,4/5). This study
also gave us the addition formulas for the clock group, which we then used to consider
the clock group over the integers modulo a prime.
We showed that the addition law has (0,1) as neutral element, -(x,y) = (-x,y),
and that the law is commutative. In the instruction you'll show that the
resulting point is on the clock. We skipped associativity as it is not
particularly illuminating.
Clocks modulo primes are an example of cryptosystems
against which the best attacks have subexponential (but superpolynomial) complexity; we
will get to this attack much later. If possible, we would like to have systems
where the best attacks are exponential.
Finally we saw Edwards curves and the addition law and how much it resembles addition
on the clock.
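As a minimal sketch (my own toy parameters, not from the lecture), the clock and Edwards addition laws can be written out in Python; the prime p = 13 and the non-square d = 2 are assumptions for illustration only.

```python
# Sketch: clock and Edwards addition over F_p with toy parameters.
# Clock: x^2 + y^2 = 1; Edwards: x^2 + y^2 = 1 + d*x^2*y^2 (d a non-square).
p = 13
d = 2  # 2 is a non-square mod 13, so the Edwards addition law has no exceptions

def clock_add(P, Q):
    (x1, y1), (x2, y2) = P, Q
    return ((x1*y2 + y1*x2) % p, (y1*y2 - x1*x2) % p)

def edwards_add(P, Q):
    (x1, y1), (x2, y2) = P, Q
    t = d*x1*x2*y1*y2 % p
    x3 = (x1*y2 + y1*x2) * pow(1 + t, -1, p) % p
    y3 = (y1*y2 - x1*x2) * pow(1 - t, -1, p) % p
    return (x3, y3)

# (0,1) is the neutral element in both groups; (2,6) is on the clock mod 13:
assert clock_add((2, 6), (0, 1)) == (2, 6)
# (1,0) lies on x^2 + y^2 = 1 + 2x^2y^2 mod 13 and doubles to (0,-1),
# so it has order 4, matching the starfish picture:
assert edwards_add((1, 0), (1, 0)) == (0, p - 1)
```

Note how the two addition laws share the numerators; setting d = 0 in the Edwards formulas recovers the clock addition exactly, which is the special case mentioned above.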
Some students asked for some intuition on the Edwards addition law.
Here is a link to a
blog
post describing a unified way of addition on circles and Edwards curves by
Thomas Hales. I prefer the simple and intuitive addition law on the circle
which we've seen to be a special case of the Edwards addition law for d = 0.
He goes the other way around – starting with a geometric interpretation
of the addition law that I worked out with Arene, Naehrig, and Ritzenthaler,
and then showing that that can also be used for the circle. But tastes differ,
so you might like his presentation better.
For our paper, Michael Naehrig recorded a video about it together with his kids
which you can find
here.
Pictures of blackboards are here.
Here is the sheet for the instruction session (block 7 & 8).
Submitting homework is optional. If you want feedback, please submit by next
Tue (10 Sep)
before 13:30 through Canvas. Please submit in groups of 2-3 people; we do not have
capacity to grade everybody individually.
To explain the 'optional':
I do expect that you look at the exercises (homework and instructions), in
particular if they cover pieces we leave out in the lectures. In general it's a
good idea to engage with the material.
Here is the first homework sheet.
05 Sep 2024
Recap of Diffie–Hellman key exchange and related problems (computational
Diffie–Hellman problem, decisional Diffie–Hellman problem, and discrete logarithm problem) abstractly.
Consideration of what points are bad starting points for the clock; you'll have
a similar proof for Edwards curves in the homework, but remember that you cannot
argue about angles there, so you need to use the formulas.
Definition of order of a group element (this should be a recap for you).
For computing aP I explained the double-and-add method with examples 17P and
23P. If this went too fast, watch ECC II from the
YouTube
Channel.
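The double-and-add method from the lecture can be sketched generically; the helper name scalar_mult and the toy group (integers mod 1000 under addition, where aP is just a*P) are my own choices for illustration.

```python
# Sketch: the double-and-add method for computing aP, written for a generic
# group given by an add() function and a neutral element.
def scalar_mult(a, P, add, neutral):
    R = neutral
    for bit in bin(a)[2:]:          # scan the bits of a from the top
        R = add(R, R)               # double
        if bit == '1':
            R = add(R, P)           # add
    return R

# Toy check in the additive group of integers mod 1000, where aP = a*P mod 1000:
add = lambda X, Y: (X + Y) % 1000
assert scalar_mult(17, 7, add, 0) == (17 * 7) % 1000
assert scalar_mult(23, 7, add, 0) == (23 * 7) % 1000
```

For 17 = 10001 in binary this performs 4 doublings and 2 additions, matching the 17P example from class.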
Recap of Edwards curves; proof that denominators are never 0 over the reals if d is
negative.
We looked at what happens if 0 < d < 1 and d > 1 and that the pictures show
that there are points at infinity. So typically the addition law has exceptions
and is thus not complete. These arguments do not make sense over a finite
field, as we cannot argue about sizes there. The matching properties are that
negative d means that it's not a square and that a positive d is a square.
Being a square or not is a property that makes sense over a finite field. We
showed that over a finite field there are as many squares as non-squares, using
that the multiplicative group of a finite field is cyclic (it can be written as
powers of a single element, called a generator). We also repeated the theorem of
Lagrange, that the order of an element in a finite group divides the group
order, where group order means the number of elements in the group.
For the proof that there are no exceptions over F_p for p>2 and d a non-square
please watch ECC III from the
YouTube Channel.
On Tuesday we had already shown that -(x,y) = (-x,y) continues to hold on
Edwards curves and that (0,1) remains the neutral element. We skipped showing
that addition is associative and we also didn't show that the sum of two points
is on the curve, but given that there are no exceptions to the addition law
this is something you can ask your computer to check. Taken together, these
show that the points on Edwards curves form a group under this addition.
The group is commutative.
From the pictures and the addition formula it is clear that the number of
points on an Edwards curve is a multiple of 4. This is also clear by Lagrange
because we have points of order 4 (the hands of the starfish or equivalent
points for the other pictures) and using Lagrange to show that the group order
must be a multiple of 4.
Twisted Edwards curves are a generalization of Edwards curves which, depending
on the choice of a, do not have a point of order 4. But we will show next week
that the number of points over any finite field remains divisible by 4. This
will be shown once we cover Montgomery curves.
Pictures of blackboards are here.
10 Sep 2024
We covered Weierstrass curves as the most general form of elliptic curves, saw
the Jacobi criterion, and what singularities (cusp and node) look like.
Over fields in which 6
is not 0 we can transform the curve with an isomorphism to short Weierstrass
form y^2 = x^3 + a_4 x + a_6, which is non-singular if 4a_4^3 + 27a_6^2 is not
0.
Starting from the addition law that points on a line add up to 0 we developed
the addition formulas, incl.
proving that there is exactly one third point on the curve for a line of the
form v = λu + μ going through two input points (or being the tangent
at one input point, in the case of doubling),
and then showed that there are 5 cases to consider for
the inputs for addition. The Weierstrass form is the most general curve equation
for elliptic curves but also the most annoying to implement – missing one
of the special cases can cause wrong results, and in crypto that's often enough
for an attacker to get in.
The geometric addition law is called the
chord-and-tangent method.
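A sketch of chord-and-tangent addition with the special cases spelled out; the curve y^2 = x^3 + 2x + 2 over F_17 and the point (5,1) are my own toy choices, not values from the lecture.

```python
# Sketch: chord-and-tangent addition on y^2 = x^3 + a4*x + a6 over F_p,
# spelling out the 5 cases. O denotes the point at infinity.
p, a4, a6 = 17, 2, 2          # toy curve
O = None                      # point at infinity

def add(P, Q):
    if P is O: return Q                       # case 1: O + Q
    if Q is O: return P                       # case 2: P + O
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return O                              # case 3: P + (-P) = O
    if P == Q:
        lam = (3*x1*x1 + a4) * pow(2*y1, -1, p) % p   # case 4: tangent (doubling)
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p     # case 5: chord
    x3 = (lam*lam - x1 - x2) % p
    y3 = (lam*(x1 - x3) - y1) % p
    return (x3, y3)

P = (5, 1)                    # 1^2 = 5^3 + 2*5 + 2 mod 17, so P is on the curve
assert add(P, P) == (6, 3)    # doubling via the tangent line
assert add(P, (5, 16)) is O   # P plus its negative gives O
```

Forgetting any one of the first three cases is exactly the kind of implementation mistake described above.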
Montgomery curves are another special curve shape; we covered that the curve is
nonsingular if A is not 2 or -2 and B is not 0, where we used the Jacobi criterion.
We covered the addition law on Montgomery curves.
A relaxation of isomorphisms are birational equivalences
which are also invertible maps between curves which are compatible with
addition, but permit a finite number of exceptions; as the name
suggests, these are given by fractions of polynomials.
I said (without proof) that Montgomery curves are birationally equivalent to Edwards
curves and stated the formulas.
Pictures of blackboards are here.
Here is the sheet for the instruction session (block 7 & 8).
The next homework is due on 17 September at 13:30 via Canvas. Here is the homework sheet.
12 Sep 2024
We showed that on a Montgomery curve over a finite field at least one of the
following holds: there are 3 points of order 2 or 2 points of order 4. This
implies that the
group order is always divisible by 4. The proof boils down again to using that in a
finite field g^(2i) is a square while g^(2i+1) is not, where g is a generator of
the multiplicative group of the finite field. The part of showing that the
first 2 points have order 4 is part of exercise sheet 2 (see there).
We then showed that Montgomery curves
and twisted Edwards curves are birationally equivalent by giving the explicit
map and also checking for exceptional cases and figured out where they are
using projective coordinates.
Projective coordinates help understand the points at
infinity. These appear when Z=0 while at least one of X or Y is nonzero. For
Edwards curves it's nicer to study points at infinity with separate
denominators for x and y, so that (x,y) turns into ((X:Z), (Y:T)), and we found
two points at infinity in the x direction: ((1:0), (±√(a/d) : 1)) if a/d is
a square; this matches the picture I drew for twisted Edwards curves when both
a and d are non-squares and the curve stretches far out in the x and -x
direction.
Geometrically, this means working with projective coordinates (the
normal coordinates that we've looked at are called affine coordinates) and
instead of using just two coordinates (x,y) we work with three (X:Y:Z) with the
understanding that x=X/Z and y=Y/Z; this requires Z nonzero and means that we
do not have a unique representation (the : in (X:Y:Z) are used by convention to
indicate the redundancy of the representation; note (X:Y:Z)=(cX:cY:cZ) for nonzero
c).
We used projective coordinates for this explanation and developed X3 and Z3 for
twisted Edwards when x and y have independent denominators Z and T to show that
the exceptional points have orders 2 and 4.
Inversions are computationally expensive, so for efficient implementations we
like to work with fractions and clear denominators only at the end of the
computation.
For more addition formulas in projective coordinates and optimized operation count in
multiplications etc. see https://hyperelliptic.org/EFD/.
I gave a short show-and-tell of how Sage works, showing
E = EllipticCurve(GF(41),[1,0]) and
P4 = E((1,17)); 2* P4; 4*P4 as example usage and that Sage uses
(0:1:0) to denote the point at infinity.
Here is a longer example that I prepared last year:
E=EllipticCurve(GF(41),[1,5]) generates an elliptic curve. Typing
E in an interactive session or print(E) otherwise
outputs
Elliptic Curve defined by y^2 = x^3 + x + 5 over Finite Field of size
41
E.cardinality() tells us how many points there are on E, which is 47
in this case. To avoid knowing what the solution is I'll generate basepoint and
target randomly
P=E.random_point() for me resulted in P=(38 : 4 : 1), note the
projective coordinates we covered today. We also need a target discrete log
problem, so
PA=E.random_point() which gave me (28 : 3 : 1).
If you want to reproduce exactly this example, you will use
PA=E((28,3)) to create the point PA on E. If you need the point at
infinity it's
Q=E(0) which Sage then outputs as (0:1:0) (this is the same on
Montgomery and Weierstrass).
After this setup phase everything is nice, you can just compute 2*P or
PA + 5*P and Sage does all the work for you. Sage is based on Python,
so you can build loops and if statements, and for the BS you can make a set.
In 2021 I recorded a
video to
demonstrate how to use Sage https://www.sagemath.org/, covering basics of finite fields and
elliptic curves.
I also wrote a short ``cheat sheet'' with commands for Sage,
see here.
Finally, we very briefly covered the baby-step giant-step (BSGS) algorithm,
just enough to see what it does but without any explanation. Those will come on
Tue.
Pictures of blackboards are here.
17 Sep 2024
Montgomery curves are interesting because they offer efficient differential additions
in which we compute only the first coordinate of a point (the u-coordinate in how
we write them); a dADD operation adds points of known difference and uses the
u-coordinate of the difference to do that. We developed u-only doubling and I
wrote the u-only dADD formula. I showed the projective version of this computation
using slides and also showed the definition of Curve25519.
The Montgomery ladder computes scalar multiplication aP by using two
intermediate points of difference P and one DBL and one dADD per bit of a.
The Montgomery ladder is also interesting because it hides the exact operations
we are doing. This matters as
an attacker might be able to obtain side channel information, such as timing
(at different levels of precision), electromagnetic radiation, or power consumption
and from this infer information on the secret scalar that was being used.
I showed the timing distribution from some TPMs (Trusted Platform Modules) from
TPM-Fail where you can clearly see the influence
of the length of a and the Hamming weight of a. In several cases this leaks 12 bits; for
Diffie-Hellman that's not a big problem, but for the signature system that they
were attacking it was -- and you don't know where your scalar multiplication will
be used.
We covered the double-and-add method, the double-and-always-add method (using
dummy operations to hide the pattern). Windowing methods lead to a speedup for
scalar multiplications as they need fewer additions, and they also need fewer dummy
operations if doing one addition per step.
All of this is covered in these slides,
which also show how to do swaps in the Montgomery ladder so that the sequence
of operations becomes 1 dADD, 1 DBL, independent of the bits of the scalar
a.
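The ladder structure can be sketched with ordinary group operations (a toy additive group of my own choosing stands in for the curve; a real implementation would use the u-only dADD/DBL formulas and a constant-time conditional swap).

```python
# Sketch: the Montgomery ladder, one dADD and one DBL per bit in a uniform
# order, here written for a generic group with an add() function.
def ladder(a, P, add, neutral):
    R0, R1 = neutral, P           # invariant: R1 - R0 = P
    for bit in bin(a)[2:]:
        if bit == '1':
            R0, R1 = R1, R0       # conditional swap (constant-time in practice)
        R1 = add(R0, R1)          # dADD: the difference is always P
        R0 = add(R0, R0)          # DBL
        if bit == '1':
            R0, R1 = R1, R0       # swap back
    return R0

# Toy check in the additive group of integers mod 1000:
add = lambda X, Y: (X + Y) % 1000
assert ladder(23, 7, add, 0) == (23 * 7) % 1000
```

With the swaps, every iteration executes the same two operations in the same order, which is exactly what hides the bits of a from the side channels described above.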
If you want to see a lot more on side-channel attacks and countermeasures, check
out the CHES conference series.
Talks are available on YouTube on the "The IACR" channel.
In the DLP we want to find a with P_A = aP and a should be less than the number
of points n.
The number of points on the curve over F_p is in [p+1-2√p, p+1+2√p].
The Baby-Step-Giant-Step (BSGS) attack computes a = i - jm mod l for m ≈ √l
and 0 ≤ i < m, 0 ≤ j ≤ m.
Hence, the DLP in a group is no harder than the square root of the group order.
BSGS has storage costs of m which might be prohibitive, and thus choosing a less
optimal ratio of BS and GS might be better.
The large storage requirement motivates us to look for other solutions.
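The BSGS computation above can be sketched in the multiplicative group mod p (one common variant, writing a = i + jm; the toy parameters p = 101, g = 2 are my own choices).

```python
# Sketch: baby-step giant-step for g^a = h in the multiplicative group mod p,
# where l is the order of g. Storage is about sqrt(l) group elements.
from math import isqrt

def bsgs(g, h, l, p):
    m = isqrt(l) + 1
    baby = {}                      # baby steps: g^i -> i
    x = 1
    for i in range(m):
        baby[x] = i
        x = x * g % p
    factor = pow(g, -m, p)         # g^(-m)
    gamma = h
    for j in range(m + 1):         # giant steps: h * g^(-jm)
        if gamma in baby:
            return (baby[gamma] + j * m) % l
        gamma = gamma * factor % p
    raise ValueError("no discrete log found")

# Toy example: g = 2 generates the multiplicative group mod 101 (order l = 100).
p, l, g = 101, 100, 2
h = pow(g, 53, p)
assert pow(g, bsgs(g, h, l, p), p) == h
```

The dictionary of baby steps is exactly the storage cost of m mentioned above; shrinking it means more giant steps.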
I showed some pictures from the first part of these
slides,
explaining collisions of random walks on a finite
set and that the resulting graph looks like the Greek letter ρ. The reason
we see collisions after about √l steps is the birthday paradox. The time
to collision splits about evenly over the tail and the cycle part.
I explained Floyd's cycle-finding method which permits to identify collisions
without requiring to store all visited points while requiring about twice as
much computation. Finally, I showed how to recover the DLP if at every step we
know the reached point as a linear combination of the base point P and the
target P_A. This method is called Pollard's rho method and I very briefly
showed the schoolbook version of how one can define such a walk. Much more on
that on Thursday.
Here is the
sheet for the instruction session (block 7 & 8).
The next homework is due on 24 September at 13:30 via Canvas. Here is the homework sheet.
Pictures of blackboards are here.
19 Sep 2024
Pollard's rho method with Floyd's cycle-finding method runs in O(√l) in a
group of size l and only uses two
points. The method does a fast (two basic steps per fast step) and a slow walk and
only compares the current points, not past points.
The birthday paradox guarantees that each walk collides with itself and the
pseudorandom nature of the walk means that it enters a cycle after a collision,
which we can find using cycle finding. I showed an expensive way to
instantiate the step function, taking 2 scalar multiplications per step
and how we can use this to turn the collision into solving the DLP.
The step function can just be some hash of the coordinates
of W which produces b and c. Getting the
DL a from the collision is just computing a=(bj-bi)/(ci-cj) modulo l.
For these walks, as well as for BSGS,
we work with unique representatives of points, so we need to use
affine coordinates rather than projective ones, and there Weierstrass curves are faster.
We can use the birational equivalence to move our attack targets to Weierstrass
form in case it is given on a twisted Edwards curve.
Then I
showed the schoolbook method, which is less random, and additive walks which
get more random with more precomputed steps (still a small number).
There is still some loss in randomness:
doing a pseudo-random walk with 2^k different precomputed R_i leads to a
slowdown factor of 1/√(1-1/2^k), so we need to choose k large enough.
I showed
the graph for the parallel attack due to van Oorschot and Wiener, in which each
walk ends when it hits a 'distinguished point' (a point marked by some
properties in its coordinates, e.g., a large number of 0 bits in x(P)).
Instead of waiting for walks to close a cycle we then hope for walks to join
paths and reach the same distinguished point. This uses M computers
efficiently, giving a factor of M speedup.
The attack designer
can choose the frequency of distinguished points to control the lengths of the
walks, and the storage (and communication) costs.
The same approach is also taken in finding collisions in hash functions using
many computers. There it is important to find the first collision, for DLPs any
collision is good as long as c_i is not c_j mod n.
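The schoolbook walk with Floyd cycle-finding can be sketched as follows; the parameters (g = 4 of prime order l = 509 in the multiplicative group mod p = 1019, target exponent 87) are my own toy choices, and the 3-way split of the walk follows the schoolbook version mentioned above.

```python
# Sketch: Pollard's rho for the DLP with Floyd cycle-finding and the
# schoolbook 3-way walk. Each state is (W, b, c) with W = g^b * h^c.
import random

p, l, g = 1019, 509, 4             # g has prime order l mod p
h = pow(g, 87, p)                  # target: find 87 back

def step(W, b, c):
    if W % 3 == 0:
        return W * g % p, (b + 1) % l, c
    if W % 3 == 1:
        return W * W % p, 2 * b % l, 2 * c % l
    return W * h % p, b, (c + 1) % l

def rho():
    while True:
        b0, c0 = random.randrange(l), random.randrange(l)
        W = pow(g, b0, p) * pow(h, c0, p) % p
        slow = fast = (W, b0, c0)
        while True:
            slow = step(*slow)              # one step
            fast = step(*step(*fast))       # two steps
            if slow[0] == fast[0]:
                break
        (_, b1, c1), (_, b2, c2) = slow, fast
        if (c1 - c2) % l != 0:              # otherwise retry with a new start
            return (b2 - b1) * pow(c1 - c2, -1, l) % l

a = rho()
assert pow(g, a, p) == h
```

Only the two current states are kept, matching the "two points" storage claim; the division at the end is exactly the a = (b2-b1)/(c1-c2) mod l computation from the collision.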
Pictures of blackboards are here. Thanks to the student who took them and made them available to me.
24 Sep 2024
This lecture was given by Andreas
Hülsing and covered hash functions and proofs by reduction.
Here are the slides for his presentation
Further reading on hash functions
Please see my version from 2022 or the short videos from 2021 for a more algorithmic focus. Here is the
sheet for the instruction session (block 7 & 8).
Homework is optional. If you want feedback, please submit by next Tue (03 Oct)
before 13:30 through Canvas. Please submit in groups of 2-3 people; we do not have
capacity to grade everybody individually.
Here is the homework sheet.
26 Sep 2024
I repeated the cost statement for Pollard rho and how additive walks with
predetermined steps have anti-collisions, but then I messed up in writing the
extra cost. A student caught that √(1-1/2^k) cannot be correct as the
extra factor for how much longer the walks take, because increasing k should
reduce the factor. I then changed the - to + while I should have taken the
inverse. The blackboard picture is not fixed; the correct value is
1/√(1-1/2^k), which is close to 1 + 1/2^(k+1). Please also see the
slides, which also calculate the formula.
Note that the k there is our 2^k.
Last week we did an attack on the DDHP where we checked whether a, b, and c
matched modulo small divisors of l (the order of P).
The Pohlig-Hellman attack turns this information into an attack to compute
the full DLP: if all factors of l are small enough, then
a modulo the divisors is efficiently computable and we can recover
a using the CRT.
Important: For l = Π p_i^(e_i) the attack
solves e_i DLPs in a group of size p_i, not one DLP
in a group of size p_i^(e_i).
I showed slides to give a numerical example and
to highlight the different costs of what you might try to do in sort of following
this attack idea. These costs state the costs of the DLPs; the scalar multiplications
don't matter asymptotically, and the CRT computation is very fast.
The slides also contain a systematic statement of how to run the Pohlig-Hellman
attack which you should look at and understand.
If you don't remember how the CRT computation works, take a look at this
YouTube video I
made for 2WF80 and the corresponding
slides.
The lesson of the Pohlig-Hellman attack is that the DLP is no harder than the
DLP in the largest prime-order subgroup, so we try to choose points P with prime
order l; there might be some cofactor c so that the curve has c*l points.
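The Pohlig-Hellman attack can be sketched in the multiplicative group mod p; the toy parameters (p = 101, generator g = 2, group order n = 100 = 2^2 * 5^2) are my own choices. Note how each prime power p_i^e_i is handled by e_i brute-force DLPs of size p_i, as the "Important" note above stresses.

```python
# Sketch: Pohlig-Hellman in the multiplicative group mod p, solving e_i small
# DLPs of size p_i per prime power p_i^e_i and combining them with the CRT.
p, n, g = 101, 100, 2
factors = [(2, 2), (5, 2)]                 # n = 2^2 * 5^2

def dlog_bruteforce(gamma, target, q):     # DLP in a subgroup of size q
    x = 1
    for k in range(q):
        if x == target:
            return k
        x = x * gamma % p
    raise ValueError

def dlog_prime_power(h, pi, ei):
    # recover a mod pi^ei digit by digit, one size-pi DLP per digit
    gq = pow(g, n // pi**ei, p)            # has order pi^ei
    gamma = pow(gq, pi**(ei - 1), p)       # has order pi
    hq, a = pow(h, n // pi**ei, p), 0
    for k in range(ei):
        rest = hq * pow(gq, -a, p) % p
        d = dlog_bruteforce(gamma, pow(rest, pi**(ei - 1 - k), p), pi)
        a += d * pi**k
    return a

def crt(residues, moduli):
    M, x = 1, 0
    for m in moduli:
        M *= m
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)
    return x % M

h = pow(g, 53, p)
a = crt([dlog_prime_power(h, pi, ei) for pi, ei in factors],
        [pi**ei for pi, ei in factors])
assert pow(g, a, p) == h                   # recovers a = 53
```

Replacing dlog_bruteforce by BSGS or rho gives the asymptotics stated in the slides: the cost is dominated by the largest p_i, not by p_i^e_i.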
As a step towards signature schemes we covered Schnorr's identification scheme.
We showed that a valid transcript passes verification. Think about whether an
attacker can learn a from seeing many transcripts.
Pictures of blackboards are here.
Further reading on DLP:
01 Oct 2024
Brief summary of consequences of Pohlig-Hellman attack: we want groups which
have a large prime factor as the DLP is as hard as the DLP in the largest
prime-order subgroup. To avoid attacks on the DDHP we want the base point to
have prime order. In the following ord(P) is l while #E(F_p) is c*l, for some
co-factor c.
In the first lecture this semester we
covered properties of signatures: authenticity, integrity, and non-repudiation,
meaning that the message really came from the sender, was not modified, and the sender
cannot claim not to have sent it. A signature scheme is part of public-key cryptography;
the public key is used to verify a signature while the private key is used to sign.
Note that the message is known to the sender and the verifier.
As a step towards signature schemes we covered Schnorr's identification scheme, which
is a zero-knowledge proof of knowledge of the discrete log of P_{A}. We
showed why this is zero knowledge and why it proves that the prover knows a with P_{A}
= aP. The latter part also showed that if one signs two messages with the same
randomness R then an attacker can compute the private key, see also the fourth
homework.
In Schnorr signatures the random challenge picked by the verifier is replaced
by the output of a hash function which is computed over the message and the commitment.
I showed EdDSA (up to some technicalities), ECDSA, and DSA.
EdDSA is a direct instantiation of Schnorr
signatures with some improvements and handling that the basepoint P has order n/8
(where n is the total number of points on the curve and n/8 is prime). ECDSA is
slightly different in the order of how the operands are combined and is more
annoying in the verification because we need to compute inverses modulo the order
of P -- which is a good reason in addition to Pohlig-Hellman to choose groups of
prime order.
The Sony playstation
disaster (see 27C3 talk
PS3 Epic Fail)
gives a real-world example where Sony failed to update the random r. As you will
also see in the homework, this means you can compute a from just two signatures.
To avoid this fragility, EdDSA picks r pseudorandomly, by generating the private
key as (a,b), with a the usual DL of the public key and b a seed for generating
r = hash(b,m), to ensure that different messages lead to different r.
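The nonce-reuse break can be sketched concretely; everything here is a toy of my own choosing (g = 4 of prime order l = 509 mod p = 1019, and a trivial stand-in "hash" so the example is deterministic — a real scheme uses a cryptographic hash).

```python
# Sketch: why repeating the randomness k breaks Schnorr-style signatures
# (the PlayStation mistake). Toy group and toy hash, for illustration only.
p, l, g = 1019, 509, 4
a = 123                              # private key
PA = pow(g, a, p)                    # public key

def H(R, m):                         # toy stand-in for a real hash function
    return (R + m) % l

def sign(m, k):                      # k is the per-signature randomness
    R = pow(g, k, p)
    e = H(R, m)
    return R, (k + a * e) % l        # s = k + a*e mod l

# Two signatures on different messages with the SAME k:
k = 77
(R1, s1), (R2, s2) = sign(10, k), sign(20, k)
e1, e2 = H(R1, 10), H(R2, 20)
# s1 - s2 = a*(e1 - e2) mod l, so anyone can solve for the private key:
recovered = (s1 - s2) * pow(e1 - e2, -1, l) % l
assert recovered == a
```

The same algebra is what the homework asks for: one division mod l, given two transcripts with the same R.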
A scheme is considered broken if an attacker can produce a new valid signature (R',s') different from any (R,s) they have seen. Note that m can be the same here. Schemes like ECDSA that only use the x-coordinate of a point and do not include R in the hash inputs are malleable, meaning that one can state a different signature for the same m by flipping signs. This is not considered an attack but is not really desired either.
As a second topic we covered schoolbook RSA, how KeyGen, Encrypt, and Decrypt
work. I skipped showing that m'=m, but you can easily get that from Euler's
theorem.
I mentioned that one
can speed up encryption by choosing small e which have only very few non-zero
bits in the binary expansion such as e=65537. For decryption we cannot make
special choices of d as that's part of the secret key and we shouldn't restrict
the number of choices there. The CRT method (named after the Chinese Remainder
Theorem) gives a speed-up by a factor of 3 - 4 (depending on whether the
modular arithmetic uses quadratic-time schoolbook multiplication or Karatsuba
multiplication).
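The CRT decryption can be sketched as follows; the toy primes p = 61, q = 53 and e = 17 are my own illustrative choices (real moduli use primes of 1024+ bits, which is where the factor 3 - 4 comes from).

```python
# Sketch: schoolbook RSA decryption with the CRT speed-up. The two
# exponentiations use half-size exponents and half-size moduli.
p, q = 61, 53
n, phi = p * q, (p - 1) * (q - 1)
e = 17
d = pow(e, -1, phi)                  # private exponent

def decrypt_crt(c):
    mp = pow(c, d % (p - 1), p)      # m mod p
    mq = pow(c, d % (q - 1), q)      # m mod q
    # combine with the CRT (Garner): m = mq + q * ((mp - mq) * q^(-1) mod p)
    return (mq + q * ((mp - mq) * pow(q, -1, p) % p)) % n

m = 1234
c = pow(m, e, n)
assert decrypt_crt(c) == m == pow(c, d, n)
```

Each half-size exponentiation costs roughly 1/8 of the full one with schoolbook multiplication, so the two together cost about 1/4, giving the stated factor of about 4.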
Pictures of blackboards are here.
Here is the
sheet for the instruction session (block 7 & 8).
Homework is optional. If you want feedback, please submit by next Tue (08 Oct)
before 13:30 through Canvas. Please submit in groups of 2-3 people; we do not have
capacity to grade everybody individually.
Here is the homework sheet.
03 Oct 2024
I showed the Fermat primality test and mentioned that Carmichael numbers are a
class of composite numbers n for which a^(n-1) = 1 mod n for all a coprime to
n.
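A minimal sketch of the Fermat test and how a Carmichael number fools it (561 = 3*11*17 is the smallest Carmichael number; the test values are my own choices):

```python
# Sketch: the Fermat primality test, and a Carmichael number passing it.
from math import gcd

def fermat_test(n, a):
    return pow(a, n - 1, n) == 1     # "probably prime" if this holds

assert fermat_test(101, 2)           # 101 is prime
assert not fermat_test(91, 2)        # 91 = 7*13 is caught by base 2
# 561 = 3*11*17 is a Carmichael number: every base coprime to 561 passes.
assert all(fermat_test(561, a) for a in range(2, 561) if gcd(a, 561) == 1)
```

So the Fermat test can only be trusted when combined with bases sharing a factor with n, or replaced by a stronger test.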
I filled in from the previous lecture what I meant by the warning about
schoolbook RSA (homomorphic property, dictionary attacks due to lack of
randomization) and what kind of padding one would want. I showed RSA-OAEP from
the last slide of rsa-1.pdf and explained
which functionality the different components have.
Coppersmith's attack works when we know some parts of the prime, and I showed
on the example of our attack on the Taiwanese citizen cards https://smartfacts.cr.yp.to that this
can happen in practice. See rsa-9.pdf and
rsa-10.pdf (only first slide)
for the slides I showed.
In this lecture I showed Coppersmith's approach in that he sets up a system of
equations that are 0 modulo some b^k, in our case modulo p, and that one can
combine such equations to get new ones that are also 0 modulo b^k. A
theorem of Howgrave-Graham gives a bound on how large the root may be so that
the resulting polynomial has an integer root (instead of just a root modulo
some b^k). In our example we have b^k equal to p and X is the bound on the unknown
part of p (the top part is assumed to be known as in the example of the TW
citizen cards). We
first treated LLL as a black-box algorithm which provides short integer linear
combinations. Then I explained what LLL does and what
guarantees it gives on the shortness of the output vectors.
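For reference, the Howgrave-Graham bound mentioned above is usually stated as follows (my paraphrase of the standard formulation, in the b^k notation of the lecture):

```latex
\text{If } g(x)=\textstyle\sum_i g_i x^i \text{ has } w \text{ monomials},\quad
g(x_0) \equiv 0 \pmod{b^k},\quad |x_0| \le X,\quad
\|g(xX)\| < \frac{b^k}{\sqrt{w}},
\quad\text{then } g(x_0) = 0 \text{ over } \mathbb{Z},
```

where ‖g(xX)‖ is the Euclidean norm of the coefficient vector of g(xX). LLL's guarantee on the length of its first output vector is what lets us satisfy this norm condition.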
Coppersmith's attack uses LLL to find a polynomial with bounded coefficients.
I didn't get to the end of this, but will show on Tuesday that you can
factor n if you know the top 2/3 of the bits of p.
We covered how to set up the systems in general but I'll repeat that again as
well. See the beginning of rsa-11.pdf
for LLL and the bounds.
08 Oct 2024
We covered how to set up the systems in general and how the requirement
from Howgrave-Graham's theorem fits to the guarantees given by LLL about the
size of the first vector. With that we showed that we can recover all of p with
the 3x3 matrix if the top 2/3 of p's bits are given.
I showed that going to x^2*f does not improve the number of bits of p that
can be recovered. I then showed how to build a system modulo p^2,
by taking n^2, n*f, f^2. I skipped using just d = 4 as that doesn't add
anything.
Including X*f^2 and X^2*f^2, i.e. going up to dimension 5 and setting up
M=matrix([
[X^4,2*a*X^3,a^2*X^2, 0, 0],
[ 0, X^3,2*a*X^2,a^2*X, 0],
[ 0, 0, X^2,2*a*X,a^2],
[ 0, 0, 0, X*n,a*n],
[ 0, 0, 0, 0,n^2]
])
helps. It has determinant n^3*X^10. The second condition for Howgrave-Graham is then that the 5th root of this is less than p^2, which is about n. This means that X is less than n^(1/5) (which is larger than n^(1/6)).
Generalizations exist to
bottom bits and bits in two chunks. We showed that considering relations modulo
p^2 and a 5x5 matrix we can miss 2/5 of the bits of p.
Finally, we did stereotyped
messages as another example to see how to deal with polynomials of degree
larger than 1. See rsa-11.pdf
for the details of the example.
Try setting up a matrix for stereotyped messages with e=5 and consider how many bits you can recover for e = 3 (as in the example) and for e = 5.
For factorization methods I mentioned
trial division/sieving and that for random integers m which need to be factored
this efficiently removes all small factors.
Then we covered the p-1 method and why it works when it works.
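The p-1 method can be sketched in a few lines; the toy modulus is my own choice: n = 211 * 1013, where 211 - 1 = 2*3*5*7 is 7-smooth but 1013 - 1 = 4*11*23 is not, which is exactly the "when it works" condition.

```python
# Sketch: Pollard's p-1 method. It finds the factor p when p-1 is B-smooth.
from math import gcd

def p_minus_1(n, B):
    a = 2
    for j in range(2, B + 1):
        a = pow(a, j, n)             # a = 2^(B!) mod n after the loop
    g = gcd(a - 1, n)
    return g if 1 < g < n else None  # None: B too small, or both p-1 and q-1 smooth

assert p_minus_1(211 * 1013, 7) == 211
```

Since 210 divides 7! we get a ≡ 1 mod 211, while the factors 11 and 23 in 1012 keep a from being 1 mod 1013, so the gcd isolates 211.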
Pictures of blackboards are here.
Here is the
sheet for the instruction session (block 7 & 8).
Homework is optional. If you want feedback, please submit by next Tue (15 Oct )
before 13:30 through Canvas. Please submit in groups of 2-3 people; we do not have
capacity to grade everybody individually.
Here is the homework sheet.
10 Oct 2024
We covered Pollard's rho method of
factorization and that it takes time about √p to find p.
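A sketch of rho factorization with Floyd cycle-finding and the classic walk x → x^2 + 1; the target 8051 = 83 * 97 is a standard toy example of my own choosing.

```python
# Sketch: Pollard's rho factorization. Collisions of the walk modulo the
# unknown factor p show up as gcd(x - y, n) > 1 after about sqrt(p) steps.
from math import gcd

def rho_factor(n, c=1):
    f = lambda x: (x * x + c) % n
    x = y = 2
    while True:
        x = f(x)                          # slow walk
        y = f(f(y))                       # fast walk
        g = gcd(abs(x - y), n)
        if g == n:
            return rho_factor(n, c + 1)   # bad luck: restart with another walk
        if g > 1:
            return g

assert rho_factor(8051) in (83, 97)
```

The walk is taken modulo n, but it collides modulo the factor p after about √p steps, which is where the √p running time comes from.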
For the p-1 method we covered the generalizations to p+1 and
the Elliptic Curve Method (ECM) of
factorization. The question of how to find points on a curve modulo a composite
integer led us to discussing that computing square-roots modulo p * q can be
used for factoring. I showed Dixon's method (congruence of squares) for how to
build b and c with b^2 and c^2 congruent to each other modulo n. This is the
first method we see for factoring RSA numbers and the motivation for
considering the factorization of the auxiliary numbers before.
Take a look at rsa-6.pdf as an example for Dixon's
method.
Pictures of blackboards are here.
15 Oct 2024
To finish off the integer-factorization topic,
I showed an idea for keeping the sizes of the auxiliary numbers small.
A big factorization attack using the Number Field Sieve factors the auxiliary
numbers by first doing sieving or trial division, then rho, then p-1 and ECM, keeping only
those numbers that factor sufficiently well.
NFS runs in subexponential time, as L_{n}(1/3); we covered the
L-notation as a way to express complexities between exponential and polynomial
and how those are covered as L(1) and L(0).
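Written out (the standard definition, not verbatim from the lecture), the L-notation is

```latex
L_n(\alpha, c) = \exp\!\bigl((c + o(1))\,(\ln n)^{\alpha}\,(\ln \ln n)^{1-\alpha}\bigr),
```

so α = 0 gives (ln n)^(c+o(1)), i.e. polynomial in the bit length, and α = 1 gives n^(c+o(1)), i.e. exponential; NFS sits in between at α = 1/3.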
Here is the sheet for the instruction session (block 7 & 8).
Pictures of blackboards are here.
19 Oct 2024
To finish off ElGamal encryption I commented on the tension that to avoid
Pohlig-Hellman attacks one needs to work in a prime-order subgroup while for
ElGamal one needs to have that the messages are part of the group.
Finite-field DLP has lower security per bit than ECC.
Index calculus attacks have many similarities with the number-field sieve
attack: again we replace one hard computation with many easier ones. Here we
solve the DLPs for many small primes by setting up a system of linear equations
(to get there we again need to factor). Note that these attacks require fixing
a factor base for which it is easy to see whether a given element factors over
it. We have this for finite fields (including extension fields) using
factorization into integers (or irreducible polynomials) but not for
elliptic curves over finite fields. That's why we can safely use elliptic
curves over fields of size 256 bits while the DLP in finite fields needs to have
3072 bits to be equally hard.
If you want to see the idea and an example, take a look at
ff-dl-4.pdf for a small example of
index calculus attacks. For keysize recommendations see e.g. the
ECRYPT-CSA
Algorithms, Key Size and Protocols Report. This report is from 2018 and an
update from 2020 exists but is locked up with ENISA. If you just want the
summary (which hasn't changed) see the first slide of
ff-dl-3.pdf.
These attacks also run in time L(1/3, c). Interestingly, one can backdoor the
prime selection to have an easier DLP, see
A kilobit hidden SNFS discrete
logarithm computation. This attack uses that the hardness of the DLP
depends on features of p and that one needs to know them in order to actually
break the system.
As a second topic we covered symmetric crypto: A and B share a key k and then
use it to encrypt and authenticate their messages. First I gave some short
sketches of block ciphers and modes and an even shorter sketch of stream
ciphers. You should watch Tomer Ashur's awesome videos
symmetric crypto,
stream ciphers,
block ciphers, and
modes of operation
for a more detailed introduction.
We covered counter mode, the encryption part of Galois Counter Mode (GCM),
where ci = mi + Enc(v,i) for some randomly chosen initialization vector/nonce
v and counter i in binary representation. The ciphertext needs to include
c0 = v so that the receiving party can decrypt. Decryption is easy:
mi = ci + Enc(v,i). The n-bit block input to Enc gets split between v and the
counter. The counter limits how many blocks can be sent with the same v. To
encrypt longer messages, pick another v; this costs < n bits. If v repeated
then two messages would be encrypted by adding the same strings, which would
be a security problem, so v needs to be long enough to make this unlikely.
Note that the birthday paradox applies for randomly chosen vs: if n=128 gets
split into 96 bits for v and 32 bits for the counter, then after about
2^(96/2) = 2^48 encryptions a v is likely to repeat, so at that point a new
key needs to be used (generated by using Diffie-Hellman or as k' = h(k) for
some hash function h).
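The counter-mode part can be sketched in a few lines. This is my own toy code, not the AES-GCM standard: the block cipher Enc is replaced by a SHA-256-based stand-in (the Python standard library has no AES), with the 96-bit nonce / 32-bit counter split described above:

```python
import hashlib, os

BLOCK = 16  # n = 128 bits per block

def enc_block(key: bytes, v: bytes, i: int) -> bytes:
    # stand-in for Enc(v,i); a real implementation would apply AES to the
    # 96-bit nonce v concatenated with the 32-bit counter i
    return hashlib.sha256(key + v + i.to_bytes(4, 'big')).digest()[:BLOCK]

def ctr_encrypt(key: bytes, msg: bytes) -> bytes:
    v = os.urandom(12)  # 96-bit nonce, sent in the clear as c0
    out = [v]
    for i in range(0, len(msg), BLOCK):
        pad = enc_block(key, v, i // BLOCK + 1)  # counter 0 is reserved for the MAC key s
        out.append(bytes(a ^ b for a, b in zip(msg[i:i + BLOCK], pad)))
    return b''.join(out)

def ctr_decrypt(key: bytes, ct: bytes) -> bytes:
    v, body = ct[:12], ct[12:]
    out = []
    for i in range(0, len(body), BLOCK):
        pad = enc_block(key, v, i // BLOCK + 1)
        out.append(bytes(a ^ b for a, b in zip(body[i:i + BLOCK], pad)))
    return b''.join(out)

key = os.urandom(16)
msg = b'attack at dawn, attack at dawn!'
assert ctr_decrypt(key, ctr_encrypt(key, msg)) == msg
```

Note that decryption never runs the cipher backwards: both directions XOR the same keystream, determined entirely by k, v, and the counter.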
AES-GCM then takes the output ciphertext blocks, additional data to be
authenticated, and s=Enc(v,0) and computes a Wegman-Carter MAC in F_{2^{128}}.
I didn't write out how H is computed and also simplified things a bit. The full
description handles shorter IVs and some parallelism in the computation. See
NIST's
SP800-38D for the official standard and Dan Bernstein's
dummy submission
for the CAESAR competition, which used AES-GCM as an example of what a
submission should look like.
As an example of block ciphers we studied the Tiny Encryption Algorithm (TEA) and also how small variations can lead to catastrophic attacks. I should have emphasized at the beginning what properties we're out to get -- recovering the key is of course nice, but ciphers are already discarded if they are distinguishable from a pseudo-random permutation (PRP), i.e. a bijective map from {0,1}^n to itself. Finding any special structure that always holds for a cipher but not necessarily for a PRP gives a distinguisher. The slides I used are here and here. These attacks give a taste of how symmetric-key cryptanalysis works, showing small examples of linear and differential attacks as well as the slide attack.
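For reference, TEA itself fits in a few lines. This is a plain Python transcription of the Wheeler-Needham routine (the test key and plaintext are arbitrary values of my choosing); it also checks one piece of "special structure": TEA's well-known equivalent keys, which by themselves already rule it out as a PRP with a 128-bit key.

```python
DELTA = 0x9E3779B9  # derived from the golden ratio
MASK = 0xFFFFFFFF   # all arithmetic is mod 2^32

def tea_encrypt(v, k, rounds=32):
    # encrypt one 64-bit block v = (v0, v1) under 128-bit key k = (k0..k3)
    v0, v1 = v
    s = 0
    for _ in range(rounds):
        s = (s + DELTA) & MASK
        v0 = (v0 + (((v1 << 4) + k[0]) ^ (v1 + s) ^ ((v1 >> 5) + k[1]))) & MASK
        v1 = (v1 + (((v0 << 4) + k[2]) ^ (v0 + s) ^ ((v0 >> 5) + k[3]))) & MASK
    return v0, v1

def tea_decrypt(v, k, rounds=32):
    # run the rounds in reverse, subtracting instead of adding
    v0, v1 = v
    s = (DELTA * rounds) & MASK
    for _ in range(rounds):
        v1 = (v1 - (((v0 << 4) + k[2]) ^ (v0 + s) ^ ((v0 >> 5) + k[3]))) & MASK
        v0 = (v0 - (((v1 << 4) + k[0]) ^ (v1 + s) ^ ((v1 >> 5) + k[1]))) & MASK
        s = (s - DELTA) & MASK
    return v0, v1

key = (0x1234, 0x5678, 0x9ABC, 0xDEF0)  # arbitrary demo key
pt = (0xDEADBEEF, 0x01234567)           # arbitrary demo plaintext
assert tea_decrypt(tea_encrypt(pt, key), key) == pt

# equivalent keys: flipping the top bits of k[0] and k[1] (or of k[2] and
# k[3]) leaves every round value unchanged, so each key has 3 equivalents
key2 = (key[0] ^ 0x80000000, key[1] ^ 0x80000000, key[2], key[3])
assert tea_encrypt(pt, key) == tea_encrypt(pt, key2)
```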
A Message-Authentication Code (MAC) ensures authenticity and integrity of messages between two users sharing a key. The authentication tag should be such that an attacker has only negligible chance of forging a valid message. We apply the MAC to the ciphertext so that the decryption key gets used only on ciphertexts that are valid. Always use a MAC when encrypting; we want authenticated encryption. As an example we covered Wegman-Carter MACs, first as a small example and then Poly1305 which is deployed in practice; see the last slide of sym-7.pdf for full details of Poly1305.
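The small Wegman-Carter example can be captured in a few lines. This toy version is my own (a 61-bit Mersenne prime stands in for Poly1305's 2^130 - 5, the key values are arbitrary, and Poly1305's encoding and clamping details are omitted): it evaluates the message polynomial at the secret point r and masks the result with the one-time value s.

```python
# Toy Wegman-Carter MAC: polynomial evaluation over F_p, then add a
# one-time value s (in Poly1305, p = 2^130 - 5 and s = Enc(v,0);
# everything here is scaled down for illustration).
p = 2**61 - 1  # a convenient Mersenne prime standing in for 2^130 - 5

def wc_mac(r: int, s: int, blocks: list) -> int:
    acc = 0
    for m in blocks:          # Horner evaluation of the message polynomial at r
        acc = (acc + m) * r % p
    return (acc + s) % p      # mask with the one-time key s

r, s = 0x123456789ABCDEF, 0xFEDCBA987654321  # arbitrary demo key pair
msg = [3, 1, 4, 1, 5]
tag = wc_mac(r, s, msg)
assert tag != wc_mac(r, s, [3, 1, 4, 1, 6])  # changing a block changes the tag
```

An attacker who modifies a block must predict how the tag changes, which requires guessing r; for l-block messages the forgery probability is about l/p.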
Pictures of blackboards are here.
22 Oct
We will have an instruction session in the lecture slot (blocks 5 and 6).
Please use the first exam from 2023 (here)
as exercise sheet.
Mike and Jonathan covered parts of exercises 2, 3, 4, and 5 from the October
2023 exam. They worked through the 2^4 portion of the Pohlig-Hellman attack
and sketched the solutions for the 3^2 and Pollard-rho portions.
For Exercise 3 they discussed how the p-1 method for factoring works and under
what conditions it is guaranteed to succeed: p-1 | s, but not q-1 | s.
Note that these are the conditions under which it succeeds for any choice of
a; what you actually need is that the order of a modulo p divides s while that
of a modulo q does not.
For part c, they explained that results may vary depending on the choice of a:
p-1, for example, might not divide s, but a could itself be a k-th power of
another element g, and if p-1 divides k*s then the method still succeeds. So
let k*s = m*(p-1); then a^s - 1 = (g^k)^s - 1 = (g^(p-1))^m - 1 = 0 mod p.
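That order argument is exactly what a p-1 implementation relies on. Here is a minimal sketch (my own, with a toy modulus n = 61 * 107 of my choosing, where 61 - 1 = 2^2 * 3 * 5 is smooth but 107 - 1 = 2 * 53 is not):

```python
from math import gcd

def pollard_p_minus_1(n, bound, a=2):
    # raise a to a highly smooth exponent (here bound!) step by step and hope
    # that ord_p(a) divides it for some prime p | n while ord_q(a) does not
    # for the cofactor q, so that gcd(a^s - 1, n) reveals p
    x = a
    for j in range(2, bound + 1):
        x = pow(x, j, n)
        g = gcd(x - 1, n)
        if 1 < g < n:
            return g
    return 0  # failed: raise the bound or pick a different a

n = 61 * 107  # 61 - 1 = 2^2 * 3 * 5 is 5-smooth, 107 - 1 = 2 * 53 is not
assert pollard_p_minus_1(n, 5) == 61
```

With bound 5 the exponent reaches 5! = 120, a multiple of ord_61(2) = 60, while ord_107(2) does not divide 120, so the gcd isolates 61.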
They also reviewed how to count points on Edwards curves and how to compute
their order. They emphasized using the symmetries of the curve equation to
limit the search over x and y values as much as possible, i.e. compute a table
of squares for y in [2...(p-1)/2], and check whether values for x in
+/- [2...(p-1)/2] result in values for y^2 that are in the table.
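The table-of-squares idea can be written down directly. Here is a small sketch (my own, for the toy complete Edwards curve x^2 + y^2 = 1 + 2x^2y^2 over F_13 of my choosing, where d = 2 is a non-square) that solves the curve equation for y^2 and looks the result up in a table:

```python
def count_edwards_points(p, d):
    # count (x, y) on x^2 + y^2 = 1 + d*x^2*y^2 over F_p by writing
    # y^2 = (1 - x^2)/(1 - d*x^2) and looking the result up in a table
    squares = {}  # table: value of y^2 -> number of y hitting that square
    for y in range(p):
        squares[y * y % p] = squares.get(y * y % p, 0) + 1
    total = 0
    for x in range(p):
        denom = (1 - d * x * x) % p  # never 0 when d is a non-square
        if denom == 0:
            continue
        y2 = (1 - x * x) * pow(denom, -1, p) % p
        total += squares.get(y2, 0)
    return total

order = count_edwards_points(13, 2)
assert order == 8 and order % 4 == 0  # (1,0) has order 4, so 4 | group order
```

The class version additionally exploits the +/- symmetry to halve the table; the full loops here just keep the sketch short.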
When computing the order of a point in 4b) they reviewed the relationship
between the order of a point and the order of the group of points on the
curve, namely that the order of a point must divide the order of the group;
that information should be used to limit the number of point orders one has
to compute.
Pictures of blackboards are here.
Old exams by me: