# Integer factorization

In number theory, integer factorization is the decomposition of a composite number into a product of smaller integers. If these factors are further restricted to prime numbers, the process is called prime factorization.

Unsolved problem in computer science: Can integer factorization be solved in polynomial time on a classical computer?

When the numbers are sufficiently large, no efficient, non-quantum integer factorization algorithm is known. In 2019, Fabrice Boudot, Pierrick Gaudry, Aurore Guillevic, Nadia Heninger, Emmanuel Thomé and Paul Zimmermann factored a 240-digit number (RSA-240) using approximately 900 core-years of computing power.[1] The researchers estimated that a 1024-bit RSA modulus would take about 500 times as long.[2] However, it has not been proven that no efficient algorithm exists. The presumed difficulty of this problem is at the heart of widely used algorithms in cryptography such as RSA. Many areas of mathematics and computer science have been brought to bear on the problem, including elliptic curves, algebraic number theory, and quantum computing.

Not all numbers of a given length are equally hard to factor. The hardest instances (for currently known techniques) are semiprimes, the products of two prime numbers. When both primes are large (for instance, more than two thousand bits long), randomly chosen, and about the same size (but not too close, to avoid efficient factorization by Fermat's factorization method, for example), even the fastest prime factorization algorithms on the fastest computers can take enough time to make the search impractical; that is, as the number of digits of the integer being factored increases, the number of operations required to perform the factorization on any computer increases drastically.

Many cryptographic protocols are based on the difficulty of factoring large composite integers or a related problem—for example, the RSA problem. An algorithm that efficiently factors an arbitrary integer would render RSA-based public-key cryptography insecure.

## Prime decomposition

By the fundamental theorem of arithmetic, every positive integer has a unique prime factorization. (By convention, 1 is the empty product.) Testing whether the integer is prime can be done in polynomial time, for example, by the AKS primality test. If the integer is composite, however, the polynomial-time tests give no insight into how to obtain the factors.

Given a general algorithm for integer factorization, any integer can be factored into its constituent prime factors by repeated application of this algorithm. The situation is more complicated with special-purpose factorization algorithms, whose benefits may not be realized as well or even at all with the factors produced during decomposition. For example, if N = 171 × p × q where p < q are very large primes, trial division will quickly produce the factors 3 and 19 but will take p divisions to find the next factor. As a contrasting example, if N is the product of the primes 13729, 1372933, and 18848997161, where 13729 × 1372933 = 18848997157, Fermat's factorization method will begin with ${\displaystyle a=\lceil {\sqrt {N}}\rceil =18848997159}$, which immediately yields ${\displaystyle b={\sqrt {a^{2}-N}}={\sqrt {4}}=2}$ and hence the factors a − b = 18848997157 and a + b = 18848997161. While these are easily recognized as composite and prime respectively, Fermat's method will take much longer to factor the composite number because the starting value of ${\displaystyle \lceil {\sqrt {18848997157}}\rceil =137292}$ for a is nowhere near 1372933.
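The Fermat example above can be reproduced with a short sketch. The function name `fermat_factor`, the returned step count, and the `max_steps` cutoff are illustrative choices, not part of any standard library; the code assumes an odd composite input.

```python
from math import isqrt

def fermat_factor(n, max_steps=1_000_000):
    """Fermat's method for odd n: search for a such that a*a - n is a perfect square.

    Returns (a - b, a + b, steps_used), or None if max_steps is exhausted.
    """
    if n % 2 == 0:
        raise ValueError("this sketch expects an odd integer")
    a = isqrt(n)
    if a * a < n:
        a += 1  # a = ceil(sqrt(n))
    for step in range(max_steps):
        b_squared = a * a - n
        b = isqrt(b_squared)
        if b * b == b_squared:
            return a - b, a + b, step + 1
        a += 1
    return None

N = 13729 * 1372933 * 18848997161
print(fermat_factor(N))  # → (18848997157, 18848997161, 1)

# The composite factor splits far more slowly: a must climb from 137292
# up to (13729 + 1372933) / 2, which takes several hundred thousand steps.
print(fermat_factor(18848997157))
```

The first call succeeds on the very first candidate because the two factors differ by only 4, while the second call illustrates why the method is unsuitable once the factors are far apart.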

## Current state of the art

Among the b-bit numbers, the most difficult to factor in practice using existing algorithms are those that are products of two primes of similar size. For this reason, these are the integers used in cryptographic applications. The largest such semiprime yet factored was RSA-240, a 795-bit number with 240 decimal digits, in November 2019. This factorization was a collaboration of several research institutions, spanning two years and taking the equivalent of almost 1000 years of computing on a single-core 2.1 GHz Intel Xeon. Like all recent factorization records, this factorization was completed with a highly optimized implementation of the general number field sieve run on hundreds of machines.

### Difficulty and complexity

No algorithm has been published that can factor all integers in polynomial time, that is, that can factor b-bit numbers in time ${\displaystyle O(b^{k})}$ for some constant k. Neither the existence nor non-existence of such algorithms has been proved, but it is generally suspected that they do not exist and hence that the problem is not in class P.[3][4] The problem is clearly in class NP but has not been proved to be or not be NP-complete. It is generally suspected not to be NP-complete.[5]

There are published algorithms that are faster than ${\displaystyle O((1+\varepsilon )^{b})}$ for all positive ε, that is, sub-exponential. The best published asymptotic running time is for the general number field sieve (GNFS) algorithm, which, for a b-bit number n, is

${\displaystyle \exp \left(\left({\sqrt[{3}]{\frac {64}{9}}}+o(1)\right)(\ln n)^{\frac {1}{3}}(\ln \ln n)^{\frac {2}{3}}\right)}$
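To get a feel for how this expression grows, the following sketch evaluates it with the o(1) term simply dropped. That simplification is an assumption made here for illustration, so the results are only meaningful as rough relative comparisons, not as operation counts. The function name `gnfs_log2_cost` is hypothetical.

```python
import math

def gnfs_log2_cost(bits):
    """Base-2 logarithm of the GNFS running-time expression for a `bits`-bit n,
    with the o(1) term set to zero (an illustrative simplification)."""
    ln_n = bits * math.log(2)                 # ln n for n close to 2**bits
    c = (64 / 9) ** (1 / 3)
    exponent = c * ln_n ** (1 / 3) * math.log(ln_n) ** (2 / 3)
    return exponent / math.log(2)             # convert from nats to bits

# Relative cost of a 1024-bit modulus versus a 795-bit one (RSA-240 size).
ratio = 2 ** (gnfs_log2_cost(1024) - gnfs_log2_cost(795))
print(f"1024-bit vs 795-bit: roughly {ratio:.0f} times the work")
```

Under this simplification the ratio comes out as a factor of several hundred.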

For current computers, GNFS is the best published algorithm for large n (more than about 400 bits). For a quantum computer, however, Peter Shor discovered an algorithm in 1994 that factors integers in polynomial time. This will have significant implications for cryptography if quantum computation becomes scalable. Shor's algorithm takes only ${\displaystyle O(b^{3})}$ time and O(b) space on b-bit number inputs. In 2001, Shor's algorithm was implemented for the first time, by using NMR techniques on molecules that provide 7 qubits.[6]

When discussing what complexity classes the integer factorization problem falls into, it is necessary to distinguish two slightly different versions of the problem:

• The function problem version: given an integer N, find an integer d with 1 < d < N that divides N (or conclude that N is prime). This problem is trivially in FNP, and it is not known whether it lies in FP or not. This is the version solved by practical implementations.
• The decision problem version: given an integer N and an integer M with 1 < M < N, does N have a factor d with 1 < d ≤ M? This version is useful because most well-studied complexity classes are defined as classes of decision problems, not function problems.

For √N ≤ M < N, the decision problem is equivalent to asking whether N is not prime.

An algorithm for either version provides one for the other. Repeated application of the function problem (applied to d and N/d, and their factors, if needed) will eventually provide either a factor of N no larger than M or a factorization into primes all greater than M. All known algorithms for the decision problem work in this way. Hence it is only of theoretical interest that, with at most log N queries using an algorithm for the decision problem, one would isolate a factor of N (or prove it prime) by binary search.
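The binary search mentioned above can be sketched as follows. The oracle `has_factor_at_most` is a hypothetical stand-in implemented by brute force so that the example runs; any algorithm for the decision problem could be substituted for it.

```python
def has_factor_at_most(n, m):
    """Stand-in decision oracle: does n have a divisor d with 1 < d <= m?
    (Brute force here; the point is only how the oracle is used.)"""
    return any(n % d == 0 for d in range(2, m + 1))

def smallest_factor_via_oracle(n, oracle=has_factor_at_most):
    """Return the smallest nontrivial divisor of n using O(log n) oracle
    queries, or None if n has no nontrivial divisor."""
    if n < 4 or not oracle(n, n - 1):
        return None
    lo, hi = 2, n - 1  # invariant: oracle(n, hi) is True
    while lo < hi:
        mid = (lo + hi) // 2
        if oracle(n, mid):
            hi = mid       # a divisor exists at or below mid
        else:
            lo = mid + 1   # every nontrivial divisor exceeds mid
    return lo

print(smallest_factor_via_oracle(91))  # → 7
print(smallest_factor_via_oracle(97))  # → None
```

Each iteration halves the search interval, so the number of oracle calls is bounded by about log₂ N, matching the bound stated in the text.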

It is not known exactly which complexity classes contain the decision version of the integer factorization problem. It is known to be in both NP and co-NP. This is because both "yes" and "no" answers can be verified in polynomial time. An answer of "yes" can be certified by exhibiting a factorization N = d(N/d) with d ≤ M. An answer of "no" can be certified by exhibiting the factorization of N into primes, all larger than M. We can verify their primality using the AKS primality test and that their product is N by multiplication. The fundamental theorem of arithmetic guarantees that there is only one possible string that will be accepted (providing the factors are required to be listed in order), which shows that the problem is in both UP and co-UP.[7] It is known to be in BQP because of Shor's algorithm. It is suspected to be outside of all three of the complexity classes P, NP-complete, and co-NP-complete. It is therefore a candidate for the NP-intermediate complexity class. If it could be proved that it is in either NP-complete or co-NP-complete, that would imply NP = co-NP. That would be a very surprising result, and therefore integer factorization is widely suspected to be outside both of those classes. Many people have tried to find classical polynomial-time algorithms for it and failed, and therefore it is widely suspected to be outside P.
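The two certificate checks described above might look like the following sketch. The function names are hypothetical, and `is_prime` uses plain trial division purely as a stand-in so the example is self-contained; the polynomial-time argument in the text relies on a test such as AKS instead.

```python
from math import prod

def is_prime(n):
    """Stand-in primality check by trial division (NOT polynomial time;
    the text's argument uses AKS here)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def verify_yes(n, m, d):
    """Check a 'yes' certificate: d is a divisor of n with 1 < d <= m."""
    return 1 < d <= m and n % d == 0

def verify_no(n, m, primes):
    """Check a 'no' certificate: the listed primes (with multiplicity)
    multiply to n and every one of them exceeds m."""
    return prod(primes) == n and all(p > m and is_prime(p) for p in primes)

print(verify_yes(91, 10, 7))        # → True
print(verify_no(91, 6, [7, 13]))    # → True
print(verify_no(91, 7, [7, 13]))    # → False, since 7 is not larger than 7
```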

In contrast, the decision problem "is N a composite number?" (or equivalently: "is N a prime number?") appears to be much easier than the problem of actually finding the factors of N. Specifically, the former can be solved in polynomial time (in the number n of digits of N) with the AKS primality test. In addition, there are a number of probabilistic algorithms that can test primality very quickly in practice if one is willing to accept the vanishingly small possibility of error. The ease of primality testing is a crucial part of the RSA algorithm, as it is necessary to find large prime numbers to start with.
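One widely used probabilistic test of the kind mentioned above is the Miller-Rabin test; the source does not name a specific test, so its use here is an illustrative choice. The sketch below uses a hypothetical function name and an arbitrary default of 20 rounds.

```python
import random

def is_probable_prime(n, rounds=20):
    """Miller-Rabin test. A False result is always correct; a True result
    is wrong with probability at most 4**(-rounds) for composite n."""
    if n < 2:
        return False
    # Quick screen by small primes; also guarantees n > 37 below.
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    # Write n - 1 = d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True

print(is_probable_prime(2**61 - 1))         # a known Mersenne prime
print(is_probable_prime(13729 * 1372933))   # composite from the example above
```

Note that the test reports compositeness without revealing any factor, which is exactly the gap between primality testing and factoring described in the text.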

## Factoring algorithms

### Special-purpose

A special-purpose factoring algorithm's running time depends on the properties of the number to be factored or on one of its unknown factors: size, special form, etc. Exactly what the running time depends on varies between algorithms.

An important subclass of special-purpose factoring algorithms is the Category 1 or First Category algorithms, whose running time depends on the size of the smallest prime factor. Given an integer of unknown form, these methods are usually applied before general-purpose methods to remove small factors.[8] For example, trial division is a Category 1 algorithm.
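As a minimal sketch of this preprocessing step, the function below strips small prime factors by trial division and hands back the remaining cofactor. The name `strip_small_factors` and the default `bound` are assumptions chosen for illustration.

```python
def strip_small_factors(n, bound=1000):
    """Trial division up to `bound`.

    Returns (small_factors, cofactor); the cofactor has no prime factor
    <= bound and may be 1, a prime, or a composite that needs a
    general-purpose method.
    """
    factors = []
    d = 2
    while d <= bound and n > 1:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1 if d == 2 else 2  # after 2, try odd candidates only
    return factors, n

# 171 = 3 * 3 * 19 is removed quickly; the large cofactor is left untouched.
print(strip_small_factors(171 * 13729 * 1372933, bound=100))
# → ([3, 3, 19], 18848997157)
```

The running time here depends on `bound` and on the small factors found, not on the size of the cofactor, which is the defining property of a Category 1 method.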

### General-purpose

A general-purpose factoring algorithm, also known as a Category 2, Second Category, or Kraitchik family algorithm (after Maurice Kraitchik),[8] has a running time which depends solely on the size of the integer to be factored. This is the type of algorithm used to factor RSA numbers. Most general-purpose factoring algorithms are based on the congruence of squares method.
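The core step of the congruence of squares method is small enough to show directly: once integers x and y with x² ≡ y² (mod n) and x ≢ ±y (mod n) are known, a gcd yields a nontrivial factor. The sketch below covers only that final step; finding suitable x and y is the hard part that sieving algorithms address. The function name and the worked numbers are illustrative.

```python
from math import gcd

def split_from_congruence(n, x, y):
    """Given x*x ≡ y*y (mod n), try to split n via gcd(x - y, n).

    Returns (g, n // g) for a nontrivial factor g, or None when the
    congruence is trivial (x ≡ ±y mod n).
    """
    if (x * x - y * y) % n != 0:
        raise ValueError("x^2 and y^2 are not congruent modulo n")
    g = gcd(x - y, n)
    return (g, n // g) if 1 < g < n else None

# For n = 1649: 41^2 ≡ 32 and 43^2 ≡ 200 (mod n), and 32 * 200 = 6400 = 80^2,
# so (41 * 43)^2 ≡ 80^2 (mod n).
print(split_from_congruence(1649, 41 * 43, 80))  # → (17, 97)
```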

## Heuristic running time

In number theory, there are many integer factoring algorithms that heuristically have expected running time

${\displaystyle L_{n}\left[{\tfrac {1}{2}},1+o(1)\right]=e^{(1+o(1)){\sqrt {(\log n)(\log \log n)}}}}$

in little-o and L-notation. Some examples of those algorithms are the elliptic curve method and the quadratic sieve. Another such algorithm is the class group relations method proposed by Schnorr,[9] Seysen,[10] and Lenstra,[11] whose running time is proved under the assumption of the Generalized Riemann Hypothesis (GRH).
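The L-notation used here, ${\displaystyle L_{n}[\alpha ,c]=e^{(c+o(1))(\log n)^{\alpha }(\log \log n)^{1-\alpha }}}$, interpolates between polynomial time (α = 0) and exponential time (α = 1) in the bit length of n. The following sketch evaluates it with the o(1) term dropped, which is an assumption made for illustration; the function name is hypothetical.

```python
import math

def l_notation_log2(bits, alpha, c):
    """Base-2 logarithm of L_n[alpha, c] for n close to 2**bits,
    ignoring the o(1) term (illustrative simplification)."""
    ln_n = bits * math.log(2)
    exponent = c * ln_n ** alpha * math.log(ln_n) ** (1 - alpha)
    return exponent / math.log(2)

for alpha in (0.0, 0.5, 1.0):
    print(alpha, round(l_notation_log2(512, alpha, 1.0), 1))
```

With α = 1 and c = 1 the result is just the bit length itself (cost proportional to n), while α = 1/2 gives the sub-exponential growth of the algorithms listed above.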

## Rigorous running time

The Schnorr-Seysen-Lenstra probabilistic algorithm has been rigorously proven by Lenstra and Pomerance[12] to have expected running time ${\displaystyle L_{n}\left[{\tfrac {1}{2}},1+o(1)\right]}$ by replacing the GRH assumption with the use of multipliers. The algorithm uses the class group of positive binary quadratic forms of discriminant Δ, denoted by ${\displaystyle G_{\Delta }}$. ${\displaystyle G_{\Delta }}$ is the set of triples of integers (a, b, c) in which those integers are relatively prime.

### Schnorr-Seysen-Lenstra Algorithm

Let n be the integer to be factored, where n is an odd positive integer greater than a certain constant. In this factoring algorithm the discriminant Δ is chosen as a multiple of n, Δ = −dn, where d is some positive multiplier. The algorithm expects that for one d there exist enough smooth forms in ${\displaystyle G_{\Delta }}$. Lenstra and Pomerance show that the choice of d can be restricted to a small set to guarantee the smoothness result.

Denote by ${\displaystyle P_{\Delta }}$ the set of all primes q with Kronecker symbol ${\displaystyle \left({\tfrac {\Delta }{q}}\right)=1}$. By constructing a set of generators of ${\displaystyle G_{\Delta }}$ and prime forms ${\displaystyle f_{q}}$ of ${\displaystyle G_{\Delta }}$ with q in ${\displaystyle P_{\Delta }}$, a sequence of relations between the set of generators and ${\displaystyle f_{q}}$ is produced. The size of q can be bounded by ${\displaystyle c_{0}(\log |\Delta |)^{2}}$ for some constant ${\displaystyle c_{0}}$.

The relation that will be used is a relation between the product of powers that is equal to the neutral element of ${\displaystyle G_{\Delta }}$. These relations will be used to construct a so-called ambiguous form of ${\displaystyle G_{\Delta }}$, which is an element of ${\displaystyle G_{\Delta }}$ of order dividing 2. By calculating the corresponding factorization of Δ and by taking a gcd, this ambiguous form provides the complete prime factorization of n. This algorithm has these main steps:

Let n be the number to be factored.

1. Let Δ be a negative integer with Δ = −dn, where d is a multiplier and Δ is the negative discriminant of some quadratic form.
2. Take the t first primes ${\displaystyle p_{1}=2,p_{2}=3,p_{3}=5,\dots ,p_{t}}$, for some ${\displaystyle t\in {\mathbb {N} }}$.
3. Let ${\displaystyle f_{q}}$ be a random prime form of ${\displaystyle G_{\Delta }}$ with ${\displaystyle \left({\tfrac {\Delta }{q}}\right)=1}$.
4. Find a generating set X of ${\displaystyle G_{\Delta }}$.
5. Collect a sequence of relations between the set X and ${\displaystyle \{f_{q}:q\in P_{\Delta }\}}$ satisfying: ${\displaystyle \left(\prod _{x\in X}x^{r(x)}\right)\cdot \left(\prod _{q\in P_{\Delta }}f_{q}^{t(q)}\right)=1}$
6. Construct an ambiguous form ${\displaystyle (a,b,c)}$ that is an element ${\displaystyle f\in G_{\Delta }}$ of order dividing 2 to obtain a coprime factorization of the largest odd divisor of Δ in which ${\displaystyle \Delta =-4ac{\text{ or }}a(a-4c){\text{ or }}(b-2a)(b+2a)}$.
7. If the ambiguous form provides a factorization of n then stop, otherwise find another ambiguous form until the factorization of n is found. In order to prevent useless ambiguous forms from being generated, build up the 2-Sylow group Sll₂(Δ) of G(Δ).
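Steps 6 and 7 can be illustrated on a toy scale. The sketch below assumes the usual convention that a form (a, b, c) has discriminant Δ = b² − 4ac, which the text does not state explicitly; under that convention the three shapes b = 0, a = b, and a = c give exactly the three factorizations of Δ listed in step 6. Finding the ambiguous form (steps 1 to 5) is not shown, and the function names are hypothetical.

```python
from math import gcd

def split_discriminant(a, b, c):
    """Factor Δ = b*b - 4*a*c as u * v for an ambiguous form (a, b, c)
    of one of the shapes b == 0, a == b, or a == c."""
    delta = b * b - 4 * a * c
    if b == 0:
        u, v = a, -4 * c               # Δ = -4ac
    elif a == b:
        u, v = a, a - 4 * c            # Δ = a(a - 4c)
    elif a == c:
        u, v = b - 2 * a, b + 2 * a    # Δ = (b - 2a)(b + 2a)
    else:
        raise ValueError("form is not of an ambiguous shape handled here")
    assert u * v == delta
    return u, v

def factor_from_form(n, form):
    """Try to split n using the factorization of Δ given by the form."""
    for part in split_discriminant(*form):
        g = gcd(abs(part), n)
        if 1 < g < n:
            return g, n // g
    return None  # a useless form, as in step 7

# Toy case n = 15 with multiplier d = 1, so Δ = -15.
print(factor_from_form(15, (2, 1, 2)))  # → (3, 5)
print(factor_from_form(15, (1, 1, 4)))  # → None (principal form, trivial split)
```

The second call shows why step 7 may need to be repeated: the principal form is ambiguous but only yields the trivial factorization.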

To obtain an algorithm for factoring any positive integer, it is necessary to add a few steps to this algorithm, such as trial division and the Jacobi sum test.

### Expected running time

The algorithm as stated is a probabilistic algorithm as it makes random choices. Its expected running time is at most ${\displaystyle L_{n}\left[{\tfrac {1}{2}},1+o(1)\right]}$.[12]

## Notes

2. Kleinjung; et al. (2010-02-18). "Factorization of a 768-bit RSA modulus" (PDF). International Association for Cryptologic Research. Retrieved 2010-08-09.
3. Krantz, Steven G. (2011), The Proof is in the Pudding: The Changing Nature of Mathematical Proof, New York: Springer, p. 203, doi:10.1007/978-0-387-48744-1, ISBN 978-0-387-48908-7, MR 2789493
4. Arora, Sanjeev; Barak, Boaz (2009), Computational complexity, Cambridge: Cambridge University Press, p. 230, doi:10.1017/CBO9780511804090, ISBN 978-0-521-42426-4, MR 2500087
5. Goldreich, Oded; Wigderson, Avi (2008), "IV.20 Computational Complexity", in Gowers, Timothy; Barrow-Green, June; Leader, Imre (eds.), The Princeton Companion to Mathematics, Princeton, New Jersey: Princeton University Press, pp. 575–604, ISBN 978-0-691-11880-2, MR 2467561. See in particular p. 583.
6. Vandersypen, Lieven M. K.; et al. (2001). "Experimental realization of Shor's quantum factoring algorithm using nuclear magnetic resonance". Nature. 414: 883–887. arXiv:quant-ph/0112176. doi:10.1038/414883a.
7. Lance Fortnow (2002-09-13). "Computational Complexity Blog: Complexity Class of the Week: Factoring".
8. David Bressoud and Stan Wagon (2000). A Course in Computational Number Theory. Key College Publishing/Springer. pp. 168–69. ISBN 978-1-930190-10-8.
9. Schnorr, Claus P. (1982). "Refined analysis and improvements on some factoring algorithms". Journal of Algorithms. 3 (2): 101–127. doi:10.1016/0196-6774(82)90012-8. MR 0657269.
10. Seysen, Martin (1987). "A probabilistic factorization algorithm with quadratic forms of negative discriminant". Mathematics of Computation. 48 (178): 757–780. doi:10.1090/S0025-5718-1987-0878705-X. MR 0878705.
11. Lenstra, Arjen K (1988). "Fast and rigorous factorization under the generalized Riemann hypothesis". Indagationes Mathematicae. 50: 443–454.
12. Lenstra, H. W.; Pomerance, Carl (July 1992). "A Rigorous Time Bound for Factoring Integers" (PDF). Journal of the American Mathematical Society. 5 (3): 483–516. doi:10.1090/S0894-0347-1992-1137100-0. MR 1137100.