Math::Prime::Util::GMP - Utilities related to prime numbers and factoring, using GMP |
Math::Prime::Util::GMP - Utilities related to prime numbers and factoring, using GMP
Version 0.34
use Math::Prime::Util::GMP ':all'; my $n = "115792089237316195423570985008687907853269984665640564039457584007913129639937";
# This doesn't impact the operation of the module at all, but does let you # enter big number arguments directly as well as enter (e.g.): 2**2048 + 1. use bigint;
# These return 0 for composite, 2 for prime, and 1 for probably prime # Numbers under 2^64 will return 0 or 2. # is_prob_prime does a BPSW primality test for numbers > 2^64 # is_prime adds some MR tests and a quick test to try to prove the result # is_provable_prime will spend a lot of effort on proving primality
say "$n is probably prime" if is_prob_prime($n); say "$n is ", qw(composite prob_prime def_prime)[is_prime($n)]; say "$n is definitely prime" if is_provable_prime($n) == 2;
# Miller-Rabin and strong Lucas-Selfridge pseudoprime tests say "$n is a prime or spsp-2/7/61" if is_strong_pseudoprime($n, 2, 7, 61); say "$n is a prime or slpsp" if is_strong_lucas_pseudoprime($n); say "$n is a prime or eslpsp" if is_extra_strong_lucas_pseudoprime($n);
# Return array reference to primes in a range. my $aref = primes( 10 ** 200, 10 ** 200 + 10000 );
$next = next_prime($n); # next prime > n $prev = prev_prime($n); # previous prime < n
# Primorials and lcm say "23# is ", primorial(23); say "The product of the first 47 primes is ", pn_primorial(47); say "lcm(1..1000) is ", consecutive_integer_lcm(1000);
# Find prime factors of big numbers @factors = factor(5465610891074107968111136514192945634873647594456118359804135903459867604844945580205745718497);
# Finer control over factoring. # These stop after finding one factor or exceeding their limit. # # optional arguments o1, o2, ... @factors = trial_factor($n); # test up to o1 @factors = prho_factor($n); # no more than o1 rounds @factors = pbrent_factor($n); # no more than o1 rounds @factors = holf_factor($n); # no more than o1 rounds @factors = squfof_factor($n); # no more than o1 rounds @factors = pminus1_factor($n); # o1 = smoothness limit, o2 = stage 2 limit @factors = ecm_factor($n); # o1 = B1, o2 = # of curves @factors = qs_factor($n); # (no arguments)
A module for number theory in Perl using GMP. This includes primality tests, getting primes in a range, factoring, and more.
While it certainly can be used directly, the main purpose of this module is for the Math::Prime::Util manpage. That module will automatically load this one if it is installed, greatly speeding up many of its operations on big numbers.
Inputs and outputs for big numbers are via strings, so you do not need to use a bigint package in your program. However if you do use bigints, inputs will be converted internally so there is no need to convert before a call. Output results are returned as either Perl scalars (for native-size) or strings (for bigints). the Math::Prime::Util manpage tries to reconvert all strings back into the callers bigint type if possible, which makes it more convenient for calculations.
The various is_*_pseudoprime
tests are more appropriately called
is_*_probable_prime
or is_*_prp
. They return 1 if the input is a
probable prime based on their test. The naming convention is historical
and follows Pari, the Math::Primality manpage, and some other math packages.
The modern definition of pseudoprime is a composite that passes the
test, rather than any number.
my $prob_prime = is_prob_prime($n); # Returns 0 (composite), 2 (prime), or 1 (probably prime)
Takes a positive number as input and returns back either 0 (composite), 2 (definitely prime), or 1 (probably prime).
For inputs below 2^64
the test is deterministic, so the possible
return values are 0 (composite) or 2 (definitely prime).
For inputs above 2^64
, a probabilistic test is performed. Only 0
(composite) and 1 (probably prime) are returned. The current
implementation uses the Baillie-PSW (BPSW) test. There is a
possibility that composites may be returned marked prime, but since
the test was published in 1980, not a single BPSW pseudoprime has
been found, so it is extremely likely to be prime.
While we believe (Pomerance 1984) that an infinite number of
counterexamples exist, there is a weak conjecture (Martin) that
none exist under 10000 digits.
In more detail, we are using the extra-strong Lucas test (Grantham 2000) using the Baillie parameter selection method (see OEIS A217719). Previous versions of this module used the strong Lucas test with Selfridge parameters, but the extra-strong version produces fewer pseudoprimes while running 1.2 - 1.5x faster. It is slightly stronger than the test used in Pari.
say "$n is prime!" if is_prime($n);
Takes a positive number as input and returns back either 0 (composite),
2 (definitely prime), or 1 (probably prime). Composites will act
exactly like is_prob_prime
, as will numbers less than 2^64
.
For numbers larger than 2^64
, some additional tests are performed
on probable primes to see if they can be proven by another means.
This call walks the line between the performance of is_prob_prime and the certainty of is_provable_prime. Those calls may be more appropriate in some cases. What this function does is give most of the performance of the former, while adding more certainty. For finer tuning of this tradeoff, especially if performance is critical for 65- to 200-bit inputs, you may instead use is_prob_prime followed by additional probable prime tests such as miller_rabin_random and/or is_frobenius_underwood_pseudoprime.
As with is_prob_prime, a BPSW test is first performed. This is
deterministic for all 64-bit numbers. Next, if the number is a
Proth or LLR form, then a proof is constructed. If the result is
still ``probably prime'' and the input is smaller than the
Sorenson/Webster (2015) deterministic Miller-Rabin limit
(approximately 82 bits) then the 11 or 12 Miller-Rabin tests are
performed and the result is confirmed. For larger inputs that are
still ``probably prime'' but under 200 bits, a quick
BLS75 n-1
primality proof is attempted. This is tuned to give up
if the result cannot be quickly determined, and results in success
rates of ~80% at 80 bits, ~30% at 128 bits, and ~13% at 160 bits.
Lastly, for results still ``probably prime'', a small number of
Miller-Rabin tests with random bases are performed.
The result is that many numbers will return 2 (definitely prime), and the numbers that return 1 (probably prime) have gone through more tests than is_prob_prime while not taking too long.
For cryptographic key generation, you may want even more testing for probable primes (NIST recommends a few more additional M-R tests than we perform). The function miller_rabin_random is made for this. Alternately, a different test such as is_frobenius_underwood_pseudoprime can be used. Even better, use is_provable_prime which should be reasonably fast for sizes under 2048 bits. Typically for key generation one wants random primes, and there are many functions for that.
say "$n is definitely prime!" if is_provable_prime($n) == 2;
Takes a positive number as input and returns back either 0 (composite), 2 (definitely prime), or 1 (probably prime). A great deal of effort is taken to return either 0 or 2 for all numbers.
The current method first uses BPSW to find composites and provide a
deterministic answer for tiny numbers (under 2^64
). If no
certificate is required, LLR and Proth tests can be run, and small
numbers (under approximately 2^82
) can be satisfied with a
deterministic Miller-Rabin test. If the result is still not determined,
a quick BLS75 n-1
test is attempted, followed by ECPP.
The time required for primes of different input sizes on a circa-2009
workstation averages about 3ms
for 30-digits, 5ms
for 40-digit,
20ms
for 60-digit, 50ms
for 80-digit, 100ms
for 100-digit,
2s
for 200-digit, and 400-digit inputs about a minute.
Expect a lot of time variation for larger inputs. You can see progress
indication if verbose is turned on (some at level 1, and a lot at level 2).
A certificate can be obtained along with the result using the is_provable_prime_with_cert method. There is no appreciable extra performance cost for returning a certificate.
Takes a positive number as input and returns back an array with two elements. The result will be one of:
(0, '') The input is composite.
(1, '') The input is probably prime but we could not prove it. This is a failure in our ability to factor some necessary element in a reasonable time, not a significant proof failure (in other words, it remains a probable prime).
(2, '...') The input is prime, and the certificate contains all the information necessary to verify this.
The certificate is a text representation containing all the necessary information to verify the primality of the input in a reasonable time. The result can be used with verify_prime in the Math::Prime::Util manpage for verification. Proof types used include:
ECPP BLS3 BLS15 BLS5 Small
Takes a positive number n
and a base a
as input, and returns 1 if
n
is a probable prime to base a
. This is the simple Fermat primality
test. Removing primes, given base 2 this produces the sequence
OEIS A001567.
my $maybe_prime = is_strong_pseudoprime($n, 2); my $probably_prime = is_strong_pseudoprime($n, 2, 3, 5, 7, 11, 13, 17);
Takes a positive number as input and one or more bases. Returns 1 if the input is a prime or a strong pseudoprime to all of the bases, and 0 if not. The base must be a positive integer. This is often called the Miller-Rabin test.
If 0 is returned, then the number really is a composite. If 1 is returned, then it is either a prime or a strong pseudoprime to all the given bases. Given enough distinct bases, the chances become very strong that the number is actually prime.
Both the input number and the bases may be big integers. If base modulo n <= 1 or base modulo n = n-1, then the result will be 1. This allows the bases to be larger than n if desired, while still returning meaningful results. For example,
is_strong_pseudoprime(367, 1101)
would incorrectly return 0 if this was not done properly. A 0 result should be returned only if n is composite, regardless of the base.
This is usually used in combination with other tests to make either stronger tests (e.g. the strong BPSW test) or deterministic results for numbers less than some verified limit (e.g. Jaeschke showed in 1993 that no more than three selected bases are required to give correct primality test results for any 32-bit number). Given the small chances of passing multiple bases, there are some math packages that just use multiple MR tests for primality testing, though in the early 1990s almost all serious software switched to the BPSW test.
Even numbers other than 2 will always return 0 (composite). While the
algorithm works with even input, most sources define it only on odd input.
Returning composite for all non-2 even input makes the function match most
other implementations including the Math::Primality manpage's is_strong_pseudoprime
function.
my $maybe_prime = miller_rabin_random($n, 10); # 10 random bases
Takes a positive number (n
) as input and a positive number (k
) of bases
to use. Performs k
Miller-Rabin tests using uniform random bases
between 2 and n-2
. This is the correct way to perform k
Miller-Rabin
tests, rather than the common but broken method of using the first k
primes.
An optional third argument may be given, which is a seed to use. The seed
should be a number either in decimal, binary with a leading 0b
, hex with
a leading 0x
, or octal with a leading 0
. It will be converted to a
GMP integer, so may be large. Typically this is not necessary, but
cryptographic applications may prefer the ability to use this, and it
allows repeatable test results.
There is no check for duplicate bases. Input sizes below 65-bits make little sense for this function since the is_prob_prime manpage is deterministic at that size. For numbers of 65+ bits, the chance of duplicate bases is quite small. The exponentiation approximation for the birthday problem gives a probability of less than 2e-16 for 100 random bases to have a duplicate with a 65-bit input, and less than 2e-35 with a 128-bit input.
Takes a positive number as input, and returns 1 if the input is a standard or strong Lucas probable prime. The Selfridge method of choosing D, P, and Q are used (some sources call this a Lucas-Selfridge test). This is one half of the BPSW primality test (the Miller-Rabin strong probable prime test with base 2 being the other half). The canonical BPSW test (page 1401 of Baillie and Wagstaff (1980)) uses the strong Lucas test with Selfridge parameters, but in practice a variety of Lucas tests with different parameters are used by tests calling themselves BPSW.
The standard Lucas test implemented here corresponds to the Lucas test described in FIPS 186-4 section C.3.3, though uses a slightly more efficient calculation. Since the standard Lucas-Selfridge test is a subset of the strong Lucas-Selfridge test, I recommend using the strong test rather than the standard test for cryptographic purposes. It is often slightly faster, has over 4x fewer pseudoprimes, and is the method recommended by Baillie and Wagstaff in their 1980 paper.
Takes a positive number as input, and returns 1 if the input is an extra-strong Lucas probable prime. This is defined in Grantham (2000), and is a slightly more stringent test than the strong Lucas test, though because different parameters are used the pseudoprimes are not a subset. As expected by the extra conditions, the number of pseudoprimes is less than 2/3 that of the strong Lucas-Selfridge test. Runtime performance is 1.2 to 1.5x faster than the strong Lucas test.
The parameters are selected using the Baillie-OEIS method:
P = 3; Q = 1; while ( jacobi( P*P-4, n ) != -1 ) P += 1;
Takes a positive number as input and returns 1 if the input is an ``almost'' extra-strong Lucas probable prime. This is the classic extra-strong Lucas test but without calculating the U sequence. This makes it very fast, although as the input increases in size the time converges to the conventional extra-strong implementation: at 30 digits this routine is about 15% faster, at 300 digits it is only 2% faster.
With the current implementations, there is little reason to prefer this unless trying to reproduce specific results. The extra-strong implementation has been optimized to use similar features, removing most of the performance advantage.
An optional second argument (must be between 1 and 256) indicates the increment amount for P parameter selection. The default value of one yields the method described in is_extra_strong_lucas_pseudoprime. A value of 2 yields the method used in Pari.
Because the U = 0
condition is ignored, this produces about 5% more
pseudoprimes than the extra-strong Lucas test. However this is still only
66% of the number produced by the strong Lucas-Selfridge test. No BPSW
counterexamples have been found with any of the Lucas tests described.
Takes a positive number n
as input and returns 1 if n
divides P(n)
where P(n)
is the Perrin number of n
. The Perrin sequence is defined by
C<P(0) = 3, P(1) = 0, P(2) = 2; P(n) = P(n-2) + P(n-3)>
This is not a commonly used test, as it runs 5 to 10 times slower than most of the other probable prime tests and offers little benefit, especially over combined tests like is_bpsw_prime and is_frobenius_underwood_pseudoprime.
Takes a positive number n
as input, and two optional parameters a
and
b
, and returns 1 if the n
is a Frobenius probable prime with respect
to the polynomial x^2 - ax + b
. Without the parameters, b = 2
and
a
is the least positive odd number such that (a^2-4b|n) = -1
.
This selection has no pseudoprimes below 2^64
and none known. In any
case, the discriminant a^2-4b
must not be a perfect square.
Takes a positive number as input, and returns 1 if the input passes the
efficient Frobenius test of Paul Underwood. This selects a parameter a
as the least non-negative integer such that (a^2-4|n)=-1
, then verifies that
(x+2)^(n+1) = 2a + 5 mod (x^2-ax+1,n)
. This combines a Fermat and Lucas
test at a computational cost of about 2.5x a strong pseudoprime test. This
makes it similar to, but faster than, a Frobenius test.
This test is deterministic (no randomness is used). There are no known pseudoprimes to this test. This test also has no overlap with the BPSW test, making it a very effective method for adding additional certainty.
Takes a positive number as input, and returns 1 if the input passes the
Frobenius test of Sergey Khashin. This ensures n
is not a perfect square,
selects the parameter c
as the smallest odd prime such that (c|n)=-1
,
then verifies that (1+D)^n = (1-D) mod n
where D = sqrt(c) mod n
.
This test is deterministic (no randomness is used). There are no known pseudoprimes to this test.
Given a positive number input, returns 0 (composite), 2 (definitely prime), or 1 (probably prime), using the BPSW primality test (extra-strong variant).
This function does the extra-strong BPSW test and nothing more. That is, it will skip all pretests and any extra work that the is_prob_prime test may add.
say "$n is definitely prime" if is_aks_prime($n);
Takes a positive number as input, and returns 1 if the input passes the Agrawal-Kayal-Saxena (AKS) primality test. This is a deterministic unconditional primality test which runs in polynomial time for general input.
In theory, AKS is extremely important. In practice, it is essentially useless. Estimated run time for a 150 digit input is about 9 years, making the case that while the algorithmic complexity growth is polynomial, the constants are ludicrously high. There are some ideas of Bernstein that can reduce this a little, but it would still take years for numbers that ECPP or APR-CL can prove in seconds.
Typically you should use is_provable_prime and let it decide the method.
say "2^607-1 (M607) is a Mersenne prime" if is_mersenne_prime(607);
Takes a positive number p
as input and returns 1 if 2^p-1
is prime.
After some pre-testing, the Lucas-Lehmer test is performed.
This is a deterministic unconditional test that runs very fast compared
to other primality methods for numbers of comparable size, and vastly
faster than any known general-form primality proof methods.
Takes a positive number n
as input and returns one of: 0 (definitely
composite), 2 (definitely prime), or -1 (test does not indicate anything).
This implements the Lucas-Lehmer-Riesel test for
fast deterministic primality testing on numbers of the form k * 2^n - 1
.
If k = 1
then this is a Mersenne number and the Lucas-Lehmer test is used.
If the number is not of this form, or if k <= 2^n
, then -1
will
be returned as the test does not apply.
Otherwise, the LLR
test is performed. While not as fast as the Lucas-Lehmer test for Mersenne
numbers, it is almost as fast as a single strong pseudoprime test (i.e.
Miller-Rabin test) while giving a certain answer.
Takes a positive number n
as input and returns one of: 0 (definitely
composite), 2 (definitely prime), or -1 (test does not indicate anything).
This applies Proth's theorem for fast Las Vegas
primality testing on numbers of the form k * 2^n + 1
.
If the number is not of this form, or if k <= 2^n
, then -1
will
be returned as the test does not apply.
Otherwise, a search is performed to find a quadratic nonresidue modulo n
.
If none can be found after a brief search, -1
is returned as no conclusion
can be reached. Otherwise, Proth's theorem is checked which conclusively
indicates primality.
While not as fast as the Lucas-Lehmer test for Mersenne
numbers, it is almost as fast as a single strong pseudoprime test (i.e.
Miller-Rabin test) while giving a certain answer.
say "$n is definitely prime" if is_miller_prime($n); say "$n is definitely prime assuming the GRH" if is_miller_prime($n, 1);
Takes a positive number as input, and returns 1 if the input passes the deterministic Miller test. An optional second argument indicates whether the Generalized Riemann Hypothesis should be assumed, and defaults to 0. Setting the verbose flag to 2 or higher will show how many bases are used. The unconditional test is exponential time, while the conditional test (assuming the GRH) is polynomial time.
This is a very slow method in practice, and generally should not be used. The asymptotic complexity of the GRH version is good in theory, matching ECPP, but in practice it is much slower. The number of bases used by the unconditional test grows quite rapidly, impractically many past about 160 bits, and overflows a 64-bit integer at 456 bits -- sizes that are trivial for the unconditional APR-CL and ECPP tests.
say "$n is definitely prime" if is_nminus1_prime($n);
Takes a positive number as input, and returns 1 if the input passes either
theorem 5 or theorem 7 of the Brillhart-Lehmer-Selfridge primality test.
This is a deterministic unconditional primality test which requires factoring
n-1
to a linear factor less than the cube root of the input. For small
inputs (under 40 digits) this is typically very easy, and some numbers will
naturally lead to this being very fast. As the input grows, this method
slows down rapidly.
Typically you should use is_provable_prime and let it decide the method.
say "$n is definitely prime" if is_ecpp_prime($n);
Takes a positive number as input, and returns 1 if the input passes the ECPP primality test. This is the Atkin-Morain Elliptic Curve Primality Proving algorithm. It is the fastest primality proving method in Math::Prime::Util.
This implementation uses a ``factor all strategy'' (FAS) with backtracking.
A limited set of about 500 precalculated discriminants are used, which works
well for inputs up to 300 digits, and for many inputs up to one thousand
digits. Having a larger set will help with large numbers (a set of 2650
is available on github in the xt/
directory). A future implementation
may include code to generate class polynomials as needed.
Typically you should use is_provable_prime and let it decide the method.
my $aref1 = primes( 1_000_000 ); my $aref2 = primes( 2 ** 448, 2 ** 448 + 10000 ); say join ",", @{primes( 2**2048, 2**2048 + 10000 )};
Returns all the primes between the lower and upper limits (inclusive), with
a lower limit of 2
if none is given.
An array reference is returned, matching the signature of the function of the same name in the Math::Prime::Util manpage.
Values above 64-bit are extra-strong BPSW probable primes.
my @primes = sieve_primes(2**100, 2**100 + 10000); my @candidates = sieve_primes(2**1000, 2**1000 + 10000, 40000);
Given two arguments low
and high
, this returns the primes in the
interval (inclusive) as a list. It operates similar to the primes manpage, though
must always have an lower and upper bound and returns a list.
With three arguments low
, high
, and limit
, this does a partial
sieve over the inclusive range and returns the list that pass the sieve.
If limit
is less than 2
then it is identical to the two-argument
version, in that a primality test will be performed after sieving.
Otherwise, sieving is performed up to limit
.
The two-argument version is typically only used internally and adds little functionality. The three-argument version is quite useful for applications that want to apply their own primality or other tests, and wish to have a list of values in the range with no small factors. This is quite common for applications involving prime gaps.
my @primes = sieve_twin_primes(2**1000, 2**1000 + 500000);
Given two arguments low
and high
, this returns each lower twin prime
in the interval (inclusive). The result is a list, not a reference.
This does a partial sieve of the range, removes any non-twin candidates, then checks that each pair are both BPSW probable primes. This is substantially more efficient than sieving for all primes followed by removing those that are not twin primes.
# Find some prime septuplets my @s = sieve_prime_cluster(2**100, 2**100+1e12, 2,6,8,12,18,20);
Efficiently finds prime clusters between the first two arguments low
and high
(inclusive). The remaining arguments describe the cluster.
The cluster values must be even, less than 31 bits, and strictly increasing.
Given a cluster set C
, the returned values are all primes in the
range where p+c
is prime for all c
in the cluster set C
.
The cluster is described as offsets from 0, with the implicit prime
at 0. Hence an empty list is asking for all primes (the cluster
p+0
). A list with the single value 2
will find all twin primes
(the cluster where p+0
and p+2
are prime). The list 2,6,8
will find prime quadruplets. Note that there is no requirement that
the list denote a constellation (a cluster with minimal distance) --
the list 42,92,606
is just fine.
For long clusters, e.g. OEIS series A213601
prime 12-tuplets, this will be immensely more efficient than filtering
out the cluster from a list of primes. For that example, a range of
10^13
takes less than a second to search -- thousands of times faster
than filtering results from primes or twin primes.
Shorter clusters are not quite this efficient, and the overhead for
returning large arrays should not be ignored.
$n = next_prime($n);
Returns the prime following the input number (the smallest prime number that is greater than the input number). The function is_prob_prime is used to determine when a prime is found, hence the result is a probable prime (using BPSW).
For large inputs this function is quite a bit faster than GMP's
mpz_nextprime
or Pari's nextprime
.
$n = prev_prime($n);
Returns the prime preceding the input number (the largest prime number
that is less than the input number).
0 is returned if the input is 2
or lower.
The function is_prob_prime is used to determine when a prime is found,
hence the result is a probable prime (using BPSW).
say "Fibonacci($_) = ", lucasu(1,-1,$_) for 0..100;
Given integers P
, Q
, and the non-negative integer k
,
computes U_k
for the Lucas sequence defined by P
,Q
. These include
the Fibonacci numbers (1,-1
), the Pell numbers (2,-1
), the Jacobsthal
numbers (1,-2
), the Mersenne numbers (3,2
), and more.
This corresponds to OpenPFGW's lucasU
function and gmpy2's lucasu
function.
say "Lucas($_) = ", lucasv(1,-1,$_) for 0..100;
Given integers P
, Q
, and the non-negative integer k
,
computes V_k
for the Lucas sequence defined by P
,Q
. These include
the Lucas numbers (1,-1
).
This corresponds to OpenPFGW's lucasV
function and gmpy2's lucasv
function.
my($U, $V, $Qk) = lucas_sequence($n, $P, $Q, $k)
Computes U_k
, V_k
, and Q_k
for the Lucas sequence defined by
P
,Q
, modulo n
. The modular Lucas sequence is used in a
number of primality tests and proofs.
The following conditions must hold:
- D = P*P - 4*Q != 0
- P > 0
- P < n
- Q < n
- k >= 0
- n >= 2
$p = primorial($n);
Given an unsigned integer argument, returns the product of the prime numbers
which are less than or equal to n
. This definition of n#
follows
OEIS series A034386 and
Wikipedia: Primorial definition for natural numbers.
$p = pn_primorial($n)
Given an unsigned integer argument, returns the product of the first n
prime numbers. This definition of p_n#
follows
OEIS series A002110 and
Wikipedia: Primorial definition for prime numbers.
The two are related with the relationships:
pn_primorial($n) == primorial( nth_prime($n) ) primorial($n) == pn_primorial( prime_count($n) )
Given positive integer argument n
, returns the factorial of n
,
defined as the product of the integers 1 to n
with the special case
of factorial(0) = 1
. This corresponds to Pari's factorial(n)
and Mathematica's Factorial[n]
functions.
Given a list of integers, returns the greatest common divisor. This is often used to test for coprimality.
Given a list of integers, returns the least common multiple.
Given two integers x
and y
, returns u,v,d
such that d = gcd(x,y)
and u*x + v*y = d
. This uses the extended Euclidian algorithm to compute
the values satisfying Bézout's Identity.
This corresponds to Pari's gcdext
function, which was renamed from
bezout
out Pari 2.6. The results will hence match bezout in the Math::Pari manpage.
say chinese( [14,643], [254,419], [87,733] ); # 87041638
Solves a system of simultaneous congruences using the Chinese Remainder
Theorem (with extension to non-coprime moduli). A list of [a,n]
pairs
are taken as input, each representing an equation x ≡ a mod n
. If no
solution exists, undef
is returned. If a solution is returned, the
modulus is equal to the lcm of all the given moduli (see lcm. In
the standard case where all values of n
are coprime, this is just the
product. The n
values must be positive integers, while the a
values
are integers.
Returns the sum of all arguments, each of which must be an integer.
Returns the product of all arguments, each of which must be an integer.
Returns the Kronecker symbol (a|n)
for two integers. The possible
return values with their meanings for odd positive n
are:
0 a = 0 mod n 1 a is a quadratic residue modulo n (a = x^2 mod n for some x) -1 a is a quadratic non-residue modulo n
The Kronecker symbol is an extension of the Jacobi symbol to all integer
values of n
from the latter's domain of positive odd values of n
.
The Jacobi symbol is itself an extension of the Legendre symbol, which is
only defined for odd prime values of n
. This corresponds to Pari's
kronecker(a,n)
function and Mathematica's KroneckerSymbol[n,m]
function.
Given integer arguments n
and k
, returns the binomial coefficient
n*(n-1)*...*(n-k+1)/k!
, also known as the choose function. Negative
arguments use the Kronenburg extensions.
This corresponds to Mathematica's Binomial[n,k]
function, Pari's
binomial(n,k)
function, and GMP's mpz_bin_ui
function.
For negative arguments, this matches Mathematica. Pari does not implement
the n < 0, k <= n
extension and instead returns 0
for this
case. GMP's API does not allow negative k
but otherwise matches.
the Math::BigInt manpage does not implement any extensions and the results for
n < 0, k
0> are undefined.
Returns the Bernoulli number B_n
for an integer argument n
, as a
rational number. Two values are returned, the numerator and denominator.
B_1 = 1/2.
This corresponds to Pari's bernfrac(n)
and Mathematica's BernoulliB
functions.
Returns the Harmonic number H_n
for an integer argument n
, as a
rational number. Two values are returned, the numerator and denominator.
numbers are the sum of reciprocals of the first n
natural numbers:
1 + 1/2 + 1/3 + ... + 1/n
.
Returns the Harmonic number H_n
for an integer argument n
, as a
string floating point. An optional second argument indicates the number
of digits to be preserved past the decimal place, with a default of 40.
say "s(14,2) = ", stirling(14, 2); say "S(14,2) = ", stirling(14, 2, 2);
Returns the Stirling numbers of either the first kind (default), the
second kind, or the third kind (the unsigned Lah numbers), with the kind
selected as an optional third argument. It takes two non-negative integer
arguments n
and k
plus the optional type
. This corresponds to Pari's
stirling(n,k,{type})
function and Mathematica's
StirlingS1
/ StirlingS2
functions.
Stirling numbers of the first kind are -1^(n-k)
times the number of
permutations of n
symbols with exactly k
cycles. Stirling numbers
of the second kind are the number of ways to partition a set of n
elements into k
non-empty subsets. The Lah numbers are the number of
ways to split a set of n
elements into k
non-empty lists.
$order = znorder(17, "100000000000000000000000065");
Given two positive integers a
and n
, returns the multiplicative order
of a
modulo n
. This is the smallest positive integer k
such that
a^k ≡ 1 mod n
. Returns 1 if a = 1
. Returns undef if a = 0
or if
a
and n
are not coprime, since no value will result in 1 mod n.
This corresponds to Pari's znorder(Mod(a,n))
function and Mathematica's
MultiplicativeOrder[a,n]
function.
Given a positive integer n
, returns the smallest primitive root
of (Z/nZ)^*
, or undef
if no root exists. A root exists when
euler_phi($n) == carmichael_lambda($n)
, which will be true for
all prime n
and some composites.
OEIS A033948 is a sequence of integers where the primitive root exists, while OEIS A046145 is a list of the smallest primitive roots, which is what this function produces.
say "Sum of divisors of $n:", sigma( $n ); say "sigma_2($n) = ", sigma($n, 2); say "Number of divisors: sigma_0($n) = ", sigma($n, 0);
This function takes a positive integer as input and returns the sum of
its divisors, including 1 and itself. An optional second argument k
may be given, which will result in the sum of the k-th
powers of the
divisors to be returned.
This is known as the sigma function (see Hardy and Wright section 16.7,
or OEIS A000203). The API is identical to Pari/GP's sigma
function.
This function is useful for calculating things like aliquot sums, abundant
numbers, perfect numbers, etc.
Takes a positive integer as input and returns the value of Ramanujan's tau
function. The result is a signed integer.
This corresponds to Mathematica's RamanujanTau
function.
say "$n is divisible by 2 ", valuation($n,2), " times.";
Given integers n
and k
, returns the numbers of times n
is divisible
by k
. This is a very limited version of the algebraic valuation meaning,
just applied to integers.
This corresponds to Pari's valuation
function.
0
is returned if n
or k
is one of the values -1
, 0
, or 1
.
say "$n is square free" if moebius($n) != 0; $sum += moebius($_) for (1..200); say "Mertens(200) = $sum"; say "Mertens(2000) = ", vecsum(moebius(0,2000));
Returns μ(n), the Möbius function (also known as the Moebius, Mobius, or
MoebiusMu function) for an integer input. This function is 1 if
n = 1
, 0 if n
is not square free (i.e. n
has a repeated factor),
and -1^t
if n
is a product of t
distinct primes. This is an
important function in prime number theory. Like SAGE, we define
moebius(0) = 0
for convenience.
If called with two arguments, they define a range low
to high
, and the
function returns an array with the value of the Möbius function for every n
from low to high inclusive.
say "The inverse of 42 mod 2017 = ", invmod(42,2017);
Given two integers a
and n
, return the inverse of a
modulo n
.
If not defined, undef is returned. If defined, then the return value
multiplied by a
equals 1
modulo n
.
$lcm = consecutive_integer_lcm($n);
Given an unsigned integer argument, returns the least common multiple of all
integers from 1 to n
. This can be done by manipulation of the primes up
to n
, resulting in much faster and memory-friendly results than using
factorials.
Calculates the partition function p(n)
for a non-negative integer input.
This is the number of ways of writing the integer n as a sum of positive
integers, without restrictions. This corresponds to Pari's numbpart
function and Mathematica's PartitionsP
function. The values produced
in order are OEIS series A000041.
This uses a combinatorial calculation, which means it will not be very
fast compared to Pari, Mathematica, or FLINT which use the Rademacher
formula using multi-precision floating point. In 10 seconds, the pure
Perl version can produce partitions(10_000)
while with
the Math::Prime::Util::GMP manpage it can do partitions(220_000)
. In contrast,
in about 10 seconds Pari can solve numbpart(22_000_000)
.
If you want the enumerated partitions, see the Integer::Partition manpage. It is very fast and uses an extremely memory efficient iterator. It is not, however, practical for producing the partition number for values over 100 or so.
Takes a positive integer argument n
and returns the constant Pi with that
many digits (including the leading 3). Rounding is performed.
The implementation uses AGM and is only slightly slower than MPFR (which has tighter bounds on the intermediate bits and exit conditions).
say "exp(lambda($_)) = ", exp_mangoldt($_) for 1 .. 100;
Returns EXP(Λ(n)), the exponential of the Mangoldt function (also known
as von Mangoldt's function) for an integer value.
The Mangoldt function is equal to log p if n is prime or a power of a prime,
and 0 otherwise. We return the exponential so all results are integers.
Hence the return value for exp_mangoldt
is:
p if n = p^m for some prime p and integer m >= 1 1 otherwise.
say "The Euler totient of $n is ", totient($n);
Returns φ(n), the Euler totient function (also called Euler's phi or phi
function) for an integer value. This is an arithmetic function which counts
the number of positive integers less than or equal to n
that are relatively
prime to n
. Given the definition used, totient
will return 0 for all
n < 1
. This follows the logic used by SAGE. Mathematica and Pari
return totient(-n)
for n < 0
. Mathematica returns 0 for n = 0
,
Pari pre-2.6.2 raises and exception, and Pari 2.6.2 and newer returns 2.
say "Jordan's totient J_$k($n) is ", jordan_totient($k, $n);
Returns Jordan's totient function for a given integer value. Jordan's totient
is a generalization of Euler's totient, where
jordan_totient(1,$n) == euler_totient($n)
This counts the number of k-tuples less than or equal to n that form a coprime
tuple with n. As with totient
, 0 is returned for all n < 1
.
This function can be used to generate some other useful functions, such as
the Dedekind psi function, where psi(n) = J(2,n) / J(1,n)
.
Returns the Carmichael function (also called the reduced totient function,
or Carmichael λ(n)) of a positive integer argument. It is the smallest
positive integer m
such that a^m = 1 mod n
for every integer a
coprime to n
. This is OEIS series A002322.
Returns λ(n), the Liouville function for a non-negative integer input. This is -1 raised to Ω(n) (the total number of prime factors).
say "$n is a perfect square" if is_power($n, 2); say "$n is a perfect cube" if is_power($n, 3); say "$n is a ", is_power($n), "-th power";
Given a single positive integer input n
, returns k if n = p^k
for
some integer p > 1, k > 1
, and 0 otherwise. The k returned is
the largest possible. This can be used in a boolean statement to
determine if n
is a perfect power.
If given two arguments n
and k
, returns 1 if n
is a k-th
power,
and 0 otherwise. For example, if k=2
then this detects perfect squares.
This corresponds to Pari/GP's ispower
function, with the limitations of
only integer arguments and no third argument may be given to return the root.
@factors = factor(640552686568398413516426919223357728279912327120302109778516984973296910867431808451611740398561987580967216226094312377767778241368426651540749005659); # Returns an array of 11 factors
Returns a list of prime factors of a positive number, in numerical order. The
special cases of n = 0
and n = 1
will return n
.
Like most advanced factoring programs, a mix of methods is used. This includes trial division for small factors, perfect power detection, Pollard's Rho, Pollard's P-1 with various smoothness and stage settings, Hart's OLF (a Fermat variant), ECM (elliptic curve method), and QS (quadratic sieve). Certainly improvements could be designed for this algorithm (suggestions are welcome).
In practice, this factors 26-digit semiprimes in under 100ms
, 36-digit
semiprimes in under one second. Arbitrary integers are factored faster.
It is many orders of magnitude faster than any other factoring module on
CPAN circa 2013. It is comparable in speed to Math::Pari's factorint
for most inputs.
If you want better factoring in general, I recommend looking at the standalone programs yafu, msieve, gmp-ecm, and GGNFS.
my @factors = trial_factor($n); my @factors = trial_factor($n, 1000);
Given a positive number input, tries to discover a factor using trial division.
The resulting array will contain either two factors (it succeeded) or the
original number (no factor was found). In either case, multiplying @factors
yields the original input. An optional divisor limit may be given as the
second parameter. Factoring will stop when the input is a prime, one factor
is found, or the input has been tested for divisibility with all primes less
than or equal to the limit. If no limit is given, then 2**31-1
will be used.
This is a good and fast initial test, and will be very fast for small numbers (e.g. under 1 million). For larger numbers, faster methods for complete factoring have been known since the 17th century.
For inputs larger than about 1000 digits, a dynamic product/remainder tree is used, which is faster than GMP's native methods. This helps when pruning composites or looking for very small factors.
my @factors = prho_factor($n); my @factors = prho_factor($n, 100_000_000);
Given a positive number input, tries to discover a factor using Pollard's Rho method. The resulting array will contain either two factors (it succeeded) or the original number (no factor was found). In either case, multiplying @factors yields the original input. An optional number of rounds may be given as the second parameter. Factoring will stop when the input is a prime, one factor has been found, or the number of rounds has been exceeded.
This is the Pollard Rho method with f = x^2 + 3
and default rounds 64M. It
is very good at finding small factors. Typically pbrent_factor will be
preferred as it behaves similarly but runs quite a bit faster. They use
different parameters however, so are not completely identical.
my @factors = pbrent_factor($n); my @factors = pbrent_factor($n, 100_000_000);
Given a positive number input, tries to discover a factor using Pollard's Rho method with Brent's algorithm. The resulting array will contain either two factors (it succeeded) or the original number (no factor was found). In either case, multiplying @factors yields the original input. An optional number of rounds may be given as the second parameter. Factoring will stop when the input is a prime, one factor has been found, or the number of rounds has been exceeded.
This is the Pollard Rho method using Brent's modified cycle detection,
delayed gcd
computations, and backtracking. It is essentially
Algorithm P''2 from Brent (1980). Parameters used are f = x^2 + 3
and default rounds 64M. It is very good at finding small factors.
my @factors = pminus1_factor($n);
# Set B1 smoothness to 10M, second stage automatically set. my @factors = pminus1_factor($n, 10_000_000);
# Run p-1 with B1 = 10M, B2 = 100M. my @factors = pminus1_factor($n, 10_000_000, 100_000_000);
Given a positive number input, tries to discover a factor using Pollard's
p-1
method. The resulting array will contain either two factors (it
succeeded) or the original number (no factor was found). In either case,
multiplying @factors yields the original input. An optional first stage
smoothness factor (B1) may be given as the second parameter. This will be
the smoothness limit B1 for the first stage, and will use 10*B1
for
the second stage limit B2. If a third parameter is given, it will be used
as the second stage limit B2.
Factoring will stop when the input is a prime, one factor has been found, or
the algorithm fails to find a factor with the given smoothness.
This is Pollard's p-1
method using a default smoothness of 5M and a
second stage of B2 = 10 * B1
. It can quickly find a factor p
of the input
n
if the number p-1
factors into small primes. For example
n = 22095311209999409685885162322219
has the factor p = 3916587618943361
,
where p-1 = 2^7 * 5 * 47 * 59 * 3137 * 703499
, so this method will find
a factor in the first stage if B1 >= 703499
or in the second stage if
B1 >= 3137
and B2 >= 703499
.
The implementation is written from scratch using the basic algorithm including a second stage as described in Montgomery 1987. It is faster than most simple implementations I have seen (many of which are written assuming native precision inputs), but slower than Ben Buhrow's code used in earlier versions of yafu, and nowhere close to the speed of the version included with modern GMP-ECM with large B values (it is actually quite a bit faster than GMP-ECM with small smoothness values).
my @factors = pplus1_factor($n);
Given a positive number input, tries to discover a factor using Williams'
p+1
method. The resulting array will contain either two factors (it
succeeded) or the original number (no factor was found). In either case,
multiplying @factors yields the original input. An optional first stage
smoothness factor (B1) may be given as the second parameter. This will be
the smoothness limit B1 for the first stage.
Factoring will stop when the input is a prime, one factor has been found, or
the algorithm fails to find a factor with the given smoothness.
my @factors = holf_factor($n); my @factors = holf_factor($n, 100_000_000);
Given a positive number input, tries to discover a factor using Hart's OLF method. The resulting array will contain either two factors (it succeeded) or the original number (no factor was found). In either case, multiplying @factors yields the original input. An optional number of rounds may be given as the second parameter. Factoring will stop when the input is a prime, one factor has been found, or the number of rounds has been exceeded.
This is Hart's One Line Factorization method, which is a variant of Fermat's algorithm. A premultiplier of 480 is used. It is very good at factoring numbers that are close to perfect squares, or small numbers. Very naive methods of picking RSA parameters sometimes yield numbers in this form, so it can be useful to run a few rounds to check. For example, the number:
18548676741817250104151622545580576823736636896432849057 \ 10984160646722888555430591384041316374473729421512365598 \ 29709849969346650897776687202384767704706338162219624578 \ 777915220190863619885201763980069247978050169295918863
was proposed by someone as an RSA key. It is indeed composed of two distinct prime numbers of similar bit length. Most factoring methods will take a very long time to break this. However one factor is almost exactly 5x larger than the other, allowing HOLF to factor this 222-digit semiprime in only a few milliseconds.
my @factors = squfof_factor($n); my @factors = squfof_factor($n, 100_000_000);
Given a positive number input, tries to discover a factor using Shanks' square forms factorization method (usually known as SQUFOF). The resulting array will contain either two factors (it succeeded) or the original number (no factor was found). In either case, multiplying @factors yields the original input. An optional number of rounds may be given as the second parameter. Factoring will stop when the input is a prime, one factor has been found, or the number of rounds has been exceeded.
This is Daniel Shanks' SQUFOF (square forms factorization) algorithm. The
particular implementation is a non-racing multiple-multiplier version, based
on code ideas of Ben Buhrow and Jason Papadopoulos as well as many others.
SQUFOF is often the preferred method for small numbers, and the Math::Prime::Util manpage
as well as many other packages use it was the default method for native size
(e.g. 32-bit or 64-bit) numbers after trial division. The GMP version used
in this module will work for larger values, but my testing indicates it is
generally slower than the prho
and pbrent
implementations.
my @factors = ecm_factor($n); my @factors = ecm_factor($n, 12500); # B1 = 12500 my @factors = ecm_factor($n, 12500, 10); # B1 = 12500, curves = 10
Given a positive number input, tries to discover a factor using ECM. The resulting array will contain either two factors (it succeeded) or the original number (no factor was found). In either case, multiplying @factors yields the original input. An optional maximum smoothness may be given as the second parameter, which relates to the size of factor to search for. An optional third parameter indicates the number of random curves to use at each smoothness value being searched.
This is an implementation of Hendrik Lenstra's elliptic curve factoring method, usually referred to as ECM. The implementation is reasonable, using projective coordinates, Montgomery's PRAC heuristic for EC multiplication, and two stages. It is much slower than the latest GMP-ECM, but still quite useful for factoring reasonably sized inputs.
my @factors = qs_factor($n);
Given a positive number input, tries to discover factors using QS (the
quadratic sieve). The resulting array will contain one or more numbers such
that multiplying @factors yields the original input. Typically multiple
factors will be produced, unlike the other ..._factor
routines.
The current implementation is a modified version of SIMPQS, a predecessor to the QS in FLINT, and was written by William Hart in 2006. It will not operate on input less than 30 digits. The memory use for large inputs is more than desired, so other methods such as pbrent_factor, pminus1_factor, and ecm_factor are recommended to begin with to filter out small factors. However, it is substantially faster than the other methods on large inputs having large factors, and is the method of choice for 35+ digit semiprimes.
Dana Jacobsen <dana@acm.org>
William Hart wrote the SIMPQS code which is the basis for the QS code.
Obviously none of this would be possible without the mathematicians who created and published their work. Eratosthenes, Gauss, Euler, Riemann, Fermat, Lucas, Baillie, Pollard, Brent, Montgomery, Shanks, Hart, Wagstaff, Dixon, Pomerance, A.K. Lenstra, H. W. Lenstra Jr., Atkin, Knuth, etc.
The GNU GMP team, whose product allows me to concentrate on coding high-level algorithms and not worry about any of the details of how modular exponentiation and the like happen, and still get decent performance for my purposes.
Ben Buhrow and Jason Papadopoulos deserve special mention for their open source factoring tools, which are both readable and fast. In particular I am leveraging their SQUFOF work in the current implementation. They are a huge resource to the community.
Jonathan Leto and Bob Kuo, who wrote and distributed the the Math::Primality manpage module on CPAN. Their implementation of BPSW provided the motivation I needed to do it in this module and the Math::Prime::Util manpage. I also used their module quite a bit for testing against.
Paul Zimmermann's papers and GMP-ECM code were of great value for my projective ECM implementation, as well as the papers by Brent and Montgomery.
Copyright 2011-2015 by Dana Jacobsen <dana@acm.org>
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
SIMPQS Copyright 2006, William Hart. SIMPQS is distributed under GPL v2+.
Math::Prime::Util::GMP - Utilities related to prime numbers and factoring, using GMP |