THE SELBERG SIEVE, APPLIED TO TWIN PRIMES 

PART III PRIME NUMBERS, MICHAELMAS 2004 

The theory of “small” sieves is large and rather bewildering for the beginner. In this 

chapter we present one example of a small sieve in action, using the so-called Selberg 

sieve to prove the following result: 

Theorem 0.1. The number of twin primes less than $N$ (that is, primes $p$ such that $p + 2$ is also prime) is at most $CN/\log^2 N$.

Remarks. Thus most primes are not twin primes. An amusing consequence of this 

result, which is often quoted, is that the sum of the reciprocals of the twin primes 

converges (to what is known as Brun’s constant). It is a famous unsolved problem to 

decide whether there are infinitely many twin primes. 

One of the biggest difficulties for the would-be sieve theorist is the bewildering array 

of notation in the subject. By studying one specific problem we can avoid a lot of 

that, whilst hardly losing any of the ideas, but it would be remiss not to make some 

remarks relevant to the more general context. To place an upper bound on the number 

of twin primes less than $N$, one studies the sequence $A = (a_n)$, where $a_n = n(n + 2)$ for $1 \le n \le N$. If $\sqrt{N} \le n \le N$ and $n$ is a twin prime, then $a_n$ does not have any prime divisors smaller than $\sqrt{N}$. Thus, for any $z \le \sqrt{N}$ we have the upper bound

$$\#\{\text{twin primes } p \in [\sqrt{N}, N]\} \le |S(A, P, z)|,$$

where $S(A, P, z)$ is the collection of all $a \in A$ which are not divisible by any $p \in P$ with $p \le z$. In our example we will take $P$ to be the set of all primes, though there are problems where we might wish to use just a subset of the primes.
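The reduction above can be checked by brute force for small $N$. The following is a minimal sketch, not part of the original notes; the function names `primes_up_to`, `sifted_count` and `twin_count` are hypothetical helpers introduced only for illustration.

```python
from math import isqrt

def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes <= n."""
    if n < 2:
        return []
    flags = bytearray([1]) * (n + 1)
    flags[0] = flags[1] = 0
    for p in range(2, isqrt(n) + 1):
        if flags[p]:
            # Cross out p*p, p*p + p, ... up to n.
            flags[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i, is_prime in enumerate(flags) if is_prime]

def sifted_count(N, z):
    """|S(A, P, z)|: number of n <= N with n(n+2) free of prime factors <= z."""
    small = primes_up_to(z)
    return sum(1 for n in range(1, N + 1)
               if all((n * (n + 2)) % p for p in small))

def twin_count(N, lo):
    """Number of primes p with lo <= p <= N such that p + 2 is also prime."""
    ps = set(primes_up_to(N + 2))
    return sum(1 for p in ps if lo <= p <= N and p + 2 in ps)

if __name__ == "__main__":
    N = 10_000
    for z in (10, 20, 50, 100):
        print(z, twin_count(N, isqrt(N)), sifted_count(N, z))
```

For each $z \le \sqrt{N}$ the second printed number should not exceed the third, and smaller values of $z$ give a weaker (larger) bound.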

What information do we know about A? Well, it is not very difficult to estimate the 

number of a ∈ A which are divisible by 2, 3, 4, . . . quite accurately. Write 

$$A_d := \{a \in A : d \mid a\}.$$

Then $|A_d|$ can be studied using the function $\omega$, defined as follows. If $p$ is prime, then $\omega(p)$ is the number of residues $x \pmod p$ such that $x(x + 2) \equiv 0 \pmod p$. Thus $\omega(2) = 1$ and $\omega(p) = 2$ for $p \ge 3$. Extend $\omega$ to a function on all of $\mathbb{N}$ by insisting that it be completely multiplicative, that is to say $\omega(mm') = \omega(m)\omega(m')$ for all $m, m'$ (as opposed to just multiplicative, which means that $\omega(mm') = \omega(m)\omega(m')$ when $(m, m') = 1$).

As regards $|A_d|$, one easily sees that if $d$ is squarefree then $\omega(d)$ is the number of solutions to $x(x + 2) \equiv 0 \pmod d$. Thus in this case (which will be the only one that interests us)

$$|A_d| = \frac{\omega(d)}{d} N + R_d, \tag{0.1}$$

where $|R_d| \le \omega(d)$.
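The two claims just made (that $\omega(d)$ counts the roots of $x(x+2)$ modulo a squarefree $d$, and that the remainder in (0.1) is at most $\omega(d)$ in absolute value) are easy to test numerically. This is an illustrative sketch only; `omega`, `count_roots` and `remainder` are names chosen here, not notation from the notes.

```python
from fractions import Fraction
from math import isqrt

def omega(d):
    """Completely multiplicative omega: omega(2) = 1, omega(p) = 2 for odd p."""
    value, p = 1, 2
    while d > 1:
        while d % p == 0:
            value *= 1 if p == 2 else 2
            d //= p
        p += 1
    return value

def is_squarefree(d):
    """True if no square q*q with q >= 2 divides d."""
    return all(d % (q * q) for q in range(2, isqrt(d) + 1))

def count_roots(d):
    """Number of residues x mod d with x(x + 2) = 0 mod d, by brute force."""
    return sum(1 for x in range(d) if (x * (x + 2)) % d == 0)

def remainder(N, d):
    """R_d = |A_d| - omega(d) N / d, where A_d = {n <= N : d | n(n + 2)}."""
    size = sum(1 for n in range(1, N + 1) if (n * (n + 2)) % d == 0)
    return size - Fraction(omega(d) * N, d)

if __name__ == "__main__":
    N = 1000
    for d in (2, 3, 6, 15, 30, 35):
        print(d, omega(d), count_roots(d), remainder(N, d))
```

The restriction to squarefree $d$ matters: for instance `count_roots(4)` is 2 (the residues 0 and 2), whereas the completely multiplicative extension gives `omega(4) == 1`.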

All of the above features are more-or-less typical in sieve theory. Once one has an understanding of $A_d$ one might hope to estimate $S(A, P, z)$ by using inclusion-exclusion:

$$|S(A, P, z)| = N - \sum_{p_1 \le z} |A_{p_1}| + \sum_{p_1 < p_2 \le z} |A_{p_1 p_2}| - \cdots$$


2. Selberg’s observation

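Let $\lambda : \{1, \dots, z\} \to \mathbb{R}$ be any function whatsoever with $\lambda_1 = 1$. Then if $a \in S(A, P, z)$, we have

$$\Big( \sum_{\substack{d \mid a \\ d \le z}} \lambda_d \Big)^2 \ge 1,$$

since the sum collapses to just the one term with $d = 1$. It follows that

$$|S(A, P, z)| \le \sum_{n \le N} \Big( \sum_{\substack{d \mid a_n \\ d \le z}} \lambda_d \Big)^2 = \sum_{d_1, d_2 \le z} \lambda_{d_1} \lambda_{d_2} \sum_{\substack{n \le N \\ d_1, d_2 \mid a_n}} 1 = \sum_{d_1, d_2 \le z} |A_{[d_1, d_2]}| \, \lambda_{d_1} \lambda_{d_2}.$$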
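Since the inequality above holds for every real weight function with $\lambda_1 = 1$, it can be tested with arbitrary weights. The sketch below does this with exact rational arithmetic; the helper names and the choice of random weights are assumptions made purely for illustration.

```python
import random
from fractions import Fraction

def sifted_count(N, z):
    """Number of n <= N such that n(n + 2) has no divisor d with 2 <= d <= z.

    Having no divisor in [2, z] is the same as having no prime factor <= z.
    """
    return sum(1 for n in range(1, N + 1)
               if all((n * (n + 2)) % d for d in range(2, z + 1)))

def selberg_sum(N, z, lam):
    """Sum over n <= N of (sum of lam[d] over d | n(n + 2), d <= z) squared."""
    total = Fraction(0)
    for n in range(1, N + 1):
        a = n * (n + 2)
        inner = sum((lam[d] for d in range(1, z + 1) if a % d == 0), Fraction(0))
        total += inner * inner
    return total

def random_weights(z, seed):
    """Arbitrary rational weights on {1, ..., z}, normalised so lam[1] = 1."""
    rng = random.Random(seed)
    lam = {d: Fraction(rng.randint(-10, 10), 10) for d in range(1, z + 1)}
    lam[1] = Fraction(1)
    return lam

if __name__ == "__main__":
    N, z = 500, 10
    print("sifted:", sifted_count(N, z))
    for seed in range(3):
        print("upper bound:", float(selberg_sum(N, z, random_weights(z, seed))))
```

Random weights typically give a poor bound; the point of the rest of the section is to choose the weights so that the bound is as small as possible.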

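Let us assume furthermore that $\lambda_d = 0$ if $d$ is not squarefree. Then we may use (0.1) to write this in the form

$$|S(A, P, z)| \le N \sum_{\substack{d_1, d_2 \le z \\ \mu(d_i) \ne 0}} \frac{\omega([d_1, d_2])}{[d_1, d_2]} \lambda_{d_1} \lambda_{d_2} + O\Big( \sum_{\substack{d_1, d_2 \le z \\ \mu(d_i) \ne 0}} \omega([d_1, d_2]) \, |\lambda_{d_1} \lambda_{d_2}| \Big). \tag{2.1}$$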

For squarefree $d$ we have $\omega(d) = O_\epsilon(d^\epsilon)$, since by multiplicativity $\omega(d)$ is no more than $2^{\varpi(d)}$, where $\varpi(d)$ is the number of prime factors of $d$ (this is an exercise on the second example sheet; we also used it in [PN9]). It will turn out later on that whenever we apply (2.1) we will have the bound

$$\lambda_d = O_\epsilon(d^\epsilon). \tag{2.2}$$

In this case the error term in (2.1) is $O(z^2 N^\epsilon)$, which will be dominated by the main term if $z = N^{1/2 - \delta}$ for some $\delta > 0$.

Write

$$Q := \sum_{\substack{d_1, d_2 \le z \\ \mu(d_i) \ne 0}} \frac{\omega([d_1, d_2])}{[d_1, d_2]} \lambda_{d_1} \lambda_{d_2}$$

for the main term in (2.1). Thus under the assumption (2.2) we have

$$|S(A, P, z)| \le NQ + O_\epsilon(z^2 N^\epsilon). \tag{2.3}$$

$Q$ is a quadratic form, and we wish to choose values of $\lambda_d$ so that it is as small as possible. We do this by diagonalising the form. To begin with, we rewrite $Q$ in the form

$$Q = \sum_{\substack{d_1, d_2 \le z \\ \mu(d_i) \ne 0}} \frac{\omega(d_1)\omega(d_2)}{d_1 d_2} \, \lambda_{d_1} \lambda_{d_2} \, \frac{(d_1, d_2)}{\omega((d_1, d_2))}. \tag{2.4}$$

To get a handle on this, we use a fairly standard trick. Write $g(k) = k/\omega(k)$, and observe that by Möbius inversion we have

$$g(k) = \sum_{\delta \mid k} f(\delta), \qquad \text{where} \qquad f(k) = \sum_{\delta \mid k} \mu\Big(\frac{k}{\delta}\Big) g(\delta).$$
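The Möbius inversion relating $g$ and $f$ can be verified directly for small arguments. The following sketch uses exact fractions; the function names mirror the symbols in the text but the implementation (trial division, brute-force divisor lists) is just one convenient choice.

```python
from fractions import Fraction

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divided the original n
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def omega(d):
    """Completely multiplicative omega: omega(2) = 1, omega(p) = 2 for odd p."""
    value, p = 1, 2
    while d > 1:
        while d % p == 0:
            value *= 1 if p == 2 else 2
            d //= p
        p += 1
    return value

def divisors(k):
    return [d for d in range(1, k + 1) if k % d == 0]

def g(k):
    """g(k) = k / omega(k)."""
    return Fraction(k, omega(k))

def f(k):
    """f(k) = sum over delta | k of mu(k / delta) g(delta)."""
    return sum((mobius(k // d) * g(d) for d in divisors(k)), Fraction(0))

if __name__ == "__main__":
    for k in (1, 2, 3, 5, 6, 15, 30):
        print(k, g(k), f(k), sum(f(d) for d in divisors(k)))
```

In each printed row the second and fourth entries should agree, and on primes one sees $f(2) = 1$, $f(3) = 1/2$, $f(5) = 3/2$.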

Substituting into (2.4) and swapping the order of summation yields

$$Q = \sum_{\delta \le z} f(\delta) \Big( \sum_{\substack{\delta \mid d, \ d \le z \\ \mu(d) \ne 0}} \frac{\omega(d)}{d} \lambda_d \Big)^2. \tag{2.5}$$
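As a sanity check on the diagonalisation, one can evaluate $Q$ both from its definition (the main term of (2.1)) and from (2.5) for arbitrary weights and confirm that the two agree exactly. This is a hedged sketch: `Q_direct` and `Q_diagonal` are hypothetical names, and the random weights are only test data.

```python
import random
from fractions import Fraction
from math import gcd

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def omega(d):
    """Completely multiplicative omega: omega(2) = 1, omega(p) = 2 for odd p."""
    value, p = 1, 2
    while d > 1:
        while d % p == 0:
            value *= 1 if p == 2 else 2
            d //= p
        p += 1
    return value

def f(k):
    """f(k) = sum over delta | k of mu(k / delta) * delta / omega(delta)."""
    return sum((mobius(k // d) * Fraction(d, omega(d))
                for d in range(1, k + 1) if k % d == 0), Fraction(0))

def Q_direct(lam, z):
    """Q as the sum over squarefree d1, d2 <= z of omega(l)/l * lam lam, l = lcm."""
    sf = [d for d in range(1, z + 1) if mobius(d) != 0]
    total = Fraction(0)
    for d1 in sf:
        for d2 in sf:
            l = d1 * d2 // gcd(d1, d2)
            total += Fraction(omega(l), l) * lam[d1] * lam[d2]
    return total

def Q_diagonal(lam, z):
    """Q in the diagonal form (2.5): sum of f(delta) * u_delta^2."""
    total = Fraction(0)
    for delta in range(1, z + 1):
        u = sum((Fraction(omega(d), d) * lam[d]
                 for d in range(delta, z + 1, delta) if mobius(d) != 0),
                Fraction(0))
        total += f(delta) * u * u
    return total

if __name__ == "__main__":
    z = 20
    rng = random.Random(1)
    lam = {d: Fraction(rng.randint(-5, 5), 7) for d in range(1, z + 1)}
    print(Q_direct(lam, z), Q_diagonal(lam, z))
```

For non-squarefree $\delta$ the inner sum is empty, so those terms contribute nothing regardless of the value of $f(\delta)$.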

This is indeed a diagonal quadratic form, in the variables

$$u_\delta := \sum_{\substack{\delta \mid d, \ d \le z \\ \mu(d) \ne 0}} \frac{\omega(d)}{d} \lambda_d.$$

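To minimise $Q$ we need to express the constraint $\lambda_1 = 1$ in terms of these new variables $u_\delta$. This can be achieved by applying Lemma 1.2. One obtains

$$\frac{\omega(d)}{d} \lambda_d = \sum_{\substack{d \mid \delta, \ \delta \le z \\ \mu(\delta) \ne 0}} \mu\Big(\frac{\delta}{d}\Big) u_\delta, \tag{2.6}$$

and so that constraint becomes simply

$$\sum_{\delta \le z} \mu(\delta) u_\delta = 1. \tag{2.7}$$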

The minimisation of $Q$, as given in (2.5), subject to the constraint (2.7), is a simple matter of completing the square. The minimum value is $Q_0 = 1/D$, where

$$D = D(z) := \sum_{d \le z} \frac{\mu^2(d)}{f(d)}.$$

The optimal choice for $u_\delta$ is $u_\delta = \mu(\delta)/(D f(\delta))$, which corresponds in view of (2.6) to

$$\lambda_d = \frac{d}{\omega(d) D} \sum_{\substack{d \mid \delta \\ \delta \le z}} \frac{\mu(\delta/d)\mu(\delta)}{f(\delta)}.$$
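The optimal weights can be computed exactly for small $z$, which makes it possible to confirm that $\lambda_1 = 1$ and that the quadratic form takes the value $1/D$. The sketch below is illustrative only: the helper names are invented here, and it evaluates $f$ on squarefree arguments via the product $f(d) = \prod_{p \mid d} (p/\omega(p) - 1)$, which is valid because $f$ is multiplicative.

```python
from fractions import Fraction
from math import gcd

def prime_factors(n):
    """Distinct prime factors of n, by trial division."""
    out, p = [], 2
    while p * p <= n:
        if n % p == 0:
            out.append(p)
            while n % p == 0:
                n //= p
        p += 1
    return out + [n] if n > 1 else out

def squarefree(n):
    return all(n % (p * p) for p in prime_factors(n))

def omega(d):
    """omega on squarefree d: factor 1 for the prime 2, factor 2 for odd primes."""
    value = 1
    for p in prime_factors(d):
        value *= 1 if p == 2 else 2
    return value

def mu(d):
    """Moebius function, only ever called on squarefree d."""
    return (-1) ** len(prime_factors(d))

def f(d):
    """f on squarefree d as the product of f(p) = p / omega(p) - 1."""
    value = Fraction(1)
    for p in prime_factors(d):
        value *= Fraction(p, omega(p)) - 1
    return value

def optimal_weights(z):
    """Return (squarefree d <= z, D(z), Selberg's optimal lambda on those d)."""
    sf = [d for d in range(1, z + 1) if squarefree(d)]
    D = sum(1 / f(d) for d in sf)
    lam = {d: Fraction(d, omega(d)) / D
              * sum(mu(e // d) * mu(e) / f(e) for e in sf if e % d == 0)
           for d in sf}
    return sf, D, lam

def Q(lam, sf):
    """The quadratic form Q evaluated on weights supported on squarefree d."""
    total = Fraction(0)
    for d1 in sf:
        for d2 in sf:
            l = d1 * d2 // gcd(d1, d2)
            total += Fraction(omega(l), l) * lam[d1] * lam[d2]
    return total

if __name__ == "__main__":
    sf, D, lam = optimal_weights(30)
    print("D =", D, " Q =", Q(lam, sf))
    print({d: float(lam[d]) for d in sf[:8]})
```

Any other choice of weights with $\lambda_1 = 1$ should give a value of $Q$ at least $1/D$; perturbing a single coefficient is a quick way to see this.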

We may now verify the claimed bound (2.2) for our choice of $\lambda_d$. We start with the observation that since $g$ (as defined above) is multiplicative, then so is $f = g * \mu$. Furthermore it is easy to check that $f(p) = p/\omega(p) - 1$. Thus (note that $f \ge 0$)

$$|\omega(d)\lambda_d| = \Big| \frac{d}{D} \sum_{\substack{d \mid \delta \\ \delta \le z}} \frac{\mu(\delta/d)\mu(\delta)}{f(\delta)} \Big| \le \frac{d \mu^2(d)}{D f(d)} \sum_{\delta \le z} \frac{\mu^2(\delta)}{f(\delta)} \le \frac{d \mu^2(d)}{f(d)}.$$

Remember that $\mu^2(n)$ is simply either 1 or 0 according as $n$ is or is not squarefree, thus the above is not as intimidating as it looks. Now if $d$ is squarefree we have, further,

$$\frac{d}{f(d)} = \prod_{p \mid d} \frac{p}{f(p)} = \prod_{p \mid d} \frac{1}{\frac{1}{\omega(p)} - \frac{1}{p}} \le 6^{\varpi(d)} = O_\epsilon(d^\epsilon).$$

This establishes (2.2).

Remark. There are times when it is helpful to know more about the coefficients $\lambda_d$. In our particular problem they behave rather like $\mu(d) \big( \frac{\log(z/d)}{\log z} \big)^2$, the 2 here being a consequence of the fact that $\omega(p) = 2$ for almost all primes. They are bounded by 1, rather than just by $O_\epsilon(d^\epsilon)$.

Now that (2.2) is established, we may consider (2.3) to have been confirmed, with $Q = 1/D$:

$$|S(A, P, z)| \le \frac{N}{D} + O_\epsilon(z^2 N^\epsilon). \tag{2.8}$$

When $z = N^{1/3}$, the error term here is $O_\epsilon(N^{2/3 + \epsilon})$, which is small compared with $N/\log^2 N$. In the next section we will obtain the bound $D(N^{1/3}) \gg \log^2 N$, which will conclude the proof of Theorem 0.1.
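To see how the main term $N/D$ and the error term interact in practice, one can compute, for a small $N$ and $z \approx N^{1/3}$, the exact Selberg sum $\sum_{d_1, d_2} |A_{[d_1, d_2]}| \lambda_{d_1} \lambda_{d_2}$ with the optimal weights, and compare it with $N/D$. The sketch below is a numerical illustration under these assumptions; `selberg_data` and the other helper names are hypothetical, and the values $N = 2000$, $z = 12$ are arbitrary.

```python
from fractions import Fraction
from math import gcd

def prime_factors(n):
    """Distinct prime factors of n, by trial division."""
    out, p = [], 2
    while p * p <= n:
        if n % p == 0:
            out.append(p)
            while n % p == 0:
                n //= p
        p += 1
    return out + [n] if n > 1 else out

def squarefree(n):
    return all(n % (p * p) for p in prime_factors(n))

def omega(d):
    """omega on squarefree d: factor 1 for the prime 2, factor 2 for odd primes."""
    value = 1
    for p in prime_factors(d):
        value *= 1 if p == 2 else 2
    return value

def mu(d):
    """Moebius function, only ever called on squarefree d."""
    return (-1) ** len(prime_factors(d))

def f(d):
    """f on squarefree d as the product of f(p) = p / omega(p) - 1."""
    value = Fraction(1)
    for p in prime_factors(d):
        value *= Fraction(p, omega(p)) - 1
    return value

def sifted_count(N, z):
    """Number of n <= N such that n(n + 2) has no divisor d with 2 <= d <= z."""
    return sum(1 for n in range(1, N + 1)
               if all((n * (n + 2)) % d for d in range(2, z + 1)))

def selberg_data(N, z):
    """Return (D, exact Selberg sum, bound on the error term of (2.1))."""
    sf = [d for d in range(1, z + 1) if squarefree(d)]
    D = sum(1 / f(d) for d in sf)
    lam = {d: Fraction(d, omega(d)) / D
              * sum(mu(e // d) * mu(e) / f(e) for e in sf if e % d == 0)
           for d in sf}
    sizes = {}                       # cache of |A_l| for each lcm l
    exact, err = Fraction(0), Fraction(0)
    for d1 in sf:
        for d2 in sf:
            l = d1 * d2 // gcd(d1, d2)
            if l not in sizes:
                sizes[l] = sum(1 for n in range(1, N + 1)
                               if (n * (n + 2)) % l == 0)
            exact += sizes[l] * lam[d1] * lam[d2]
            err += omega(l) * abs(lam[d1] * lam[d2])
    return D, exact, err

if __name__ == "__main__":
    N, z = 2000, 12
    D, exact, err = selberg_data(N, z)
    print("sifted count      :", sifted_count(N, z))
    print("exact Selberg sum :", float(exact))
    print("main term N / D   :", float(N / D))
    print("error-term bound  :", float(err))
```

The sifted count never exceeds the exact Selberg sum, and the exact sum differs from $N/D$ by at most the printed error-term bound, which is what (2.1) asserts with the $O$-constant made explicit.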

3. Twin primes

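Our aim here is to finish the proof of Theorem 0.1. In view of (2.8) it is enough to prove the bound $D(N^{1/3}) \gg \log^2 N$, where

$$D(z) = \sum_{d \le z} \frac{\mu^2(d)}{f(d)}.$$

Thus we have

$$D = \sum_{d \le z} \frac{\mu^2(d)}{f(d)} = \sum_{\substack{d \le z \\ \mu(d) \ne 0}} \prod_{p \mid d} \frac{\omega(p)/p}{1 - \omega(p)/p} = \sum_{\substack{d \le z \\ \mu(d) \ne 0}} \prod_{p \mid d} \frac{\omega(p)}{p} \Big( 1 + \frac{\omega(p)}{p} + \frac{\omega(p^2)}{p^2} + \cdots \Big) \ge \sum_{m \le z} \frac{\omega(m)}{m}.$$

To bound $\sum_{m \le z} \omega(m)/m$ below one can observe that $\omega(m) \ge d(m)$, the number of divisors of $m$, for all odd $m$. This follows by writing $m = p_1^{\alpha_1} \cdots p_k^{\alpha_k}$, so that $\omega(m) = 2^{\alpha_1} \cdots 2^{\alpha_k}$ and $d(m) = (\alpha_1 + 1) \cdots (\alpha_k + 1)$. From this observation one can deduce that

$$\sum_{m \le z} \frac{\omega(m)}{m} \ge \sum_{\substack{m \le z \\ m \text{ odd}}} \frac{\omega(m)}{m} \ge \sum_{\substack{m \le z \\ m \text{ odd}}} \frac{d(m)}{m} \ge \Big( \sum_{\substack{m \le \sqrt{z} \\ m \text{ odd}}} \frac{1}{m} \Big)^2 \gg (\log z)^2.$$

This concludes our establishment of a lower bound for $D(z)$, and hence the proof of Theorem 0.1 (the at most $\sqrt{N}$ twin primes below $\sqrt{N}$ do not affect the bound).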
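The chain of inequalities used for the lower bound on $D(z)$ can be checked term by term for small $z$. This is a sketch under the same conventions as the text (completely multiplicative $\omega$ with $\omega(2) = 1$); the function names are illustrative and exact fractions are used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction
from math import isqrt

def factorization(n):
    """Prime factorization of n as a list of (prime, exponent) pairs."""
    out, p = [], 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            out.append((p, e))
        p += 1
    if n > 1:
        out.append((n, 1))
    return out

def omega(m):
    """Completely multiplicative omega: omega(2) = 1, omega(p) = 2 for odd p."""
    value = 1
    for p, e in factorization(m):
        value *= 1 if p == 2 else 2 ** e
    return value

def num_divisors(m):
    """d(m) = product of (exponent + 1) over the prime factorization."""
    value = 1
    for _, e in factorization(m):
        value *= e + 1
    return value

def D(z):
    """D(z) = sum over squarefree d <= z of 1 / f(d), f(p) = p / omega(p) - 1."""
    total = Fraction(0)
    for d in range(1, z + 1):
        fac = factorization(d)
        if all(e == 1 for _, e in fac):
            fd = Fraction(1)
            for p, _ in fac:
                fd *= Fraction(p, omega(p)) - 1
            total += 1 / fd
    return total

def chain(z):
    """The four quantities in the chain, from D(z) down to the squared odd sum."""
    s_omega = sum(Fraction(omega(m), m) for m in range(1, z + 1))
    s_div = sum(Fraction(num_divisors(m), m) for m in range(1, z + 1, 2))
    s_odd = sum(Fraction(1, m) for m in range(1, isqrt(z) + 1, 2))
    return D(z), s_omega, s_div, s_odd ** 2

if __name__ == "__main__":
    for z in (10, 100, 400):
        print(z, [round(float(x), 3) for x in chain(z)])
```

Each printed list should be non-increasing from left to right, in line with the displayed inequalities.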
