exe
--------------------------------------------
1) Edwards ECM curves. Given a reasonable amount of memory, Edwards ECM curves are
roughly 10% faster than Montgomery ECM curves used in all earlier prime95 versions.
2) Added support for running ECM stage 2 given a resume file from GMP-ECM.
Especially useful when GMP-ECM uses a GPU to run ECM stage 1. Syntax for worktodo
entries is:
ECMSTAGE2=k,b,n,c,filename[,B2-or-zero][,skip_curves][,num_curves][,"known-factors"]
ECMSTAGE2N=filename[,B2-or-zero][,skip_curves][,num_curves][,"known-factors"]
3) New timestamp options in undoc.txt.
4) Manual communication menu choice will also start proof uploads.
5) From the Test/Primenet... dialog box, turning on Use Primenet will bring up the
Test/Worker Windows... dialog box to allow changing work preferences prior to
contacting the PrimeNet server.
6) The P-1 and ECM stage 2 vs. stage 1 runtime estimate used for optimal B2
calculations is now compared to the actual runtime so that future estimates will
hopefully be more accurate.
7) Advanced/Test now creates a PRP worktodo.txt entry for large exponents.
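
The ECMSTAGE2 syntax in item 2 above can be illustrated with a minimal sketch
that formats one worktodo line. The helper name, the example exponent, and the
resume file name are hypothetical and not part of prime95. The sketch assumes
that k,b,n,c describe the number k*b^n+c and that the bracketed fields are
positional, so a later field may only be given when the earlier ones are given.

```python
def build_ecmstage2_entry(k, b, n, c, filename, b2=None, skip_curves=None,
                          num_curves=None, known_factors=None):
    """Format an ECMSTAGE2 worktodo line (hypothetical helper, not prime95 code)."""
    optional = [b2, skip_curves, num_curves, known_factors]
    given = [value is not None for value in optional]
    # Assumption: optional fields are positional, so a later field can only
    # be supplied when every earlier optional field is supplied as well.
    if any(later and not earlier for earlier, later in zip(given, given[1:])):
        raise ValueError("optional fields must be supplied in order")

    fields = [str(k), str(b), str(n), str(c), filename]
    fields += [str(value) for value in (b2, skip_curves, num_curves)
               if value is not None]
    if known_factors is not None:
        # The syntax line shows the known factors wrapped in double quotes.
        fields.append('"' + known_factors + '"')
    return "ECMSTAGE2=" + ",".join(fields)


# Example with made-up inputs: 2^1277-1 and a resume file called resume.txt.
print(build_ecmstage2_entry(1, 2, 1277, -1, "resume.txt", b2=0))
```

Passing 0 for b2 follows the "B2-or-zero" wording of the syntax line; what
prime95 does with a zero B2 is described in its own documentation, not here.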
1) Faster generic mod. This is more useful for programs like PFGW than prime95.
1) Smaller executable. Many SSE2 FFTs optimized for CPUs made before 2010 were
removed. These CPUs will still be able to run all FFT sizes, but 30.15 might have
a more efficient implementation available.
1) ECM stage 2 using fast polynomial multiplication similar to the GMP-ECM program.
If lots of memory is available for stage 2 this implementation will be
substantially faster.
2) Proof files start uploading within a minute of sending a PRP result to the
server. Previously it could take up to an hour for proof uploading to begin.
3) Worktodo.add files are now processed without stopping all workers.
4) Settings from local.txt have been moved to prime.txt. Settings in prime.txt
grouped for clarity.
New features in Version 30.8 of prime95.exe
-------------------------------------------
1) P-1 stage 2 using fast polynomial multiplication similar to the GMP-ECM program.
If lots of memory is available for stage 2 this implementation will be
substantially faster.
1) PRP proofs. This allows GIMPS to double-check a PRP test at less than 1% of the
cost of a full PRP test!
PRP proofs require lots of temporary disk space. See readme.txt for details.
PRP proofs require uploading a large proof file. See readme.txt for details.
PRP proof verifications require downloading a modest verification file. See
readme.txt for details.
2) Proofs automatically uploaded to server in 30.2.
3) First time LL, World-record LL, 100M-digit LL work preference is deprecated.
4) New resource limits menu choice and dialog box. Consult readme.txt before
making changes to these settings.
Some options previously in Test/Worker Windows and Options/CPU are moved to the
resources dialog box.
5) LL-DC and PRP-DC combined into a single work preference.
6) Warning raised if temporary disk space is less than 1.5GB -- you may not get
first time prime tests.
7) Thanks to Mihai Preda, the P-1 probability calculator has been improved. This
change results in a lower optimal B1 value and higher optimal B2 value.
1) A new error check for LL testing has been implemented. This error check,
called a Jacobi error check, has a 50% chance of detecting hardware error(s)
since the last time a Jacobi error check was performed. This error check
takes roughly 30 seconds and is scheduled to run twice a day. The program
now saves two additional intermediate files that have passed the Jacobi
error check. This test requires use of the GMP (GNU multi-precision) library.
2) The GCD step in P-1 and ECM factoring is faster.
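
As a rough illustration of the Jacobi error check in item 1 above, the sketch
below runs a tiny Lucas-Lehmer test and verifies a Jacobi symbol after each
iteration. It rests on an assumption not stated in this file: that the check
relies on the property that the Jacobi symbol of (s - 2) modulo 2^p - 1 is -1
for every LL residue s after the first iteration. The function names are
illustrative, and the sketch uses plain Python integers rather than the GMP
library that prime95 uses.

```python
def jacobi(a, n):
    """Jacobi symbol (a | n) for odd positive n, using the standard binary algorithm."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be odd and positive")
    a %= n
    result = 1
    while a != 0:
        # Pull out factors of two; each flips the sign when n is 3 or 5 mod 8.
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Quadratic reciprocity: swap, flipping the sign if both are 3 mod 4.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def ll_with_jacobi_check(p):
    """Run the LL test for 2^p - 1, checking the assumed Jacobi property each step."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
        if jacobi((s - 2) % m, m) != -1:
            raise RuntimeError("Jacobi check failed: residue looks corrupted")
    return s == 0  # True means 2^p - 1 is prime


print(ll_with_jacobi_check(7))
```

A corrupted residue is essentially a random value, so it passes a check of
this kind about half the time, which is consistent with the 50% detection
rate quoted above. A real implementation would run the check only
occasionally, since computing a Jacobi symbol on a huge number is far more
expensive than one LL iteration.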
1) Faster trial factoring for machines that support FMA (Haswell and later).
Multi-threaded trial factoring now supports more than one thread sieving for
small primes. Several tuning parameters added - see undoc.txt.
2) The portable library, hwloc, for analyzing a machine's topology is now used.
This replaces the buggy code prime95 used to detect hyperthreading. It also
eliminates the need for AffinityScramble2. Running a benchmark will output
this topology information to results.txt.
3) AVX-512 trial factoring support added.
4) Dialog box for benchmarking added.
5) In the Test/Worker Windows dialog box you no longer choose how many threads
each worker uses. Instead, you choose how many CPU cores each worker uses.
The affinity options have been removed. There are two new options that
will decide if each worker also uses hyperthreading.
1) Since GPUs are so much better at trial factoring than CPUs, benchmarking no
longer times prime95's trial factoring by default. Two new benchmarking options
are available: OnlyBenchThroughput and OnlyBenchMaxCPUs. See undoc.txt for
details.
2) Slightly reduced the memory bandwidth requirements for several large FFTs. May
lead to a very small speed increase for users testing 100 million digit numbers.
3) If running more than one worker, prime95 looks for any sin/cos data that it can
share among the workers. Depending on the FFT sizes you are running, this could
lead to a very slight reduction in needed memory bandwidth.
4) Method for choosing the best FFT implementation changed. In previous versions,
the FFT implementation that resulted in the fastest single worker timing was
used. In this version the FFT implementation that had the best throughput was
selected. For FMA3 FFTs I used a 4-core Skylake to measure best throughput. For
AVX FFTs I used a 4-core Sandy Bridge to measure best throughput. Not many FFTs
were affected, but you may see a few percent variation in throughput with this
version.
5) Improved AVX2 trial factoring in 64-bit executable. Trial factoring should
still be done on a GPU. A GPU is over 10 times more efficient at trial factoring
than a CPU!!!
6) Trial factoring now defines one "iteration" as processing 128KB of sieve, or 1M
possible factors. In previous versions an iteration was defined as 16KB of sieve
in 32-bit executables and 48KB in 64-bit executables. The trial factoring
benchmark still times processing 16KB of sieve.
7) Trial factoring in 64-bit executables is now multi-threaded.
8) On initial install, the default settings for number of worker windows will be
set to the number of cores / 4 with multithreading turned on.
9) The worker windows dialog box now enforces a minimum number of multi-threaded
cores for some work types to ensure timely completion of assignments. Also, the
worker windows menu choice no longer allows assigning work to hyperthreads (they
are rarely beneficial in prime95). This behavior can be overridden with the
ConfigureHyperthreads undoc.txt feature.
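
The "128KB of sieve, or 1M possible factors" figure in item 6 above can be
checked with a short calculation. The sketch assumes one sieve bit per
candidate factor, which is what makes the two numbers agree; the function name
is illustrative only.

```python
BITS_PER_BYTE = 8


def candidates_per_iteration(sieve_kb):
    """Candidate factors covered by a sieve of the given size, one bit each (assumed)."""
    return sieve_kb * 1024 * BITS_PER_BYTE


for size_kb in (128, 48, 16):
    print(size_kb, "KB of sieve ->", candidates_per_iteration(size_kb), "candidates")
```

Under that assumption a new-style iteration covers 8 times as many candidates
as an old 32-bit iteration (16KB), so iteration counts and per-iteration
timings from older versions are not directly comparable.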
1) The "days between sending new end dates" preference now accepts values between
0.125 and 7.0 (was between 1 and 7). This lets you contact the server as
frequently as every three hours. This can be useful in conjunction with the
server feature that sends an email to you if the computer is more than one hour
late contacting the server. To turn on that server feature, go to the CPUs web
page and click on the CPU; there you can turn on a couple of email features.
2) AVX2 support for trial factoring. About a 50% speedup over the previous
version. However, all trial factoring should still be done on a GPU. A GPU is
over 10 times more efficient at trial factoring than a CPU!!!
1) Changed the output to the worker windows during LL and PRP tests. The new
output includes the estimated time to complete the test. There are two new
options described in undoc.txt: ClassicOutput and OutputRoundoff.
3) Benchmarking on hyperthreaded machines now times only the most common cases.
Specifically, hyperthreading is used only in the one cpu and all cpu cases.
4) Benchmarking trial factoring is now off by default. Prime95 should not be used
for trial factoring. GPUs are over 10 times more efficient at that task.
6) There are many new options described in undoc.txt to customize the benchmarking
process.
1) Reduced memory usage. This may make some single-thread benchmarks slower, but
when running several workers on machines where memory is a bottleneck there
should be a small performance increase.
1) Supports Intel's new fused multiply-add instruction introduced with the Haswell
CPU. This results in faster FFTs. Note that performance on many Haswell systems
is memory-bandwidth limited. This means that when running workers on all cores
performance gains will be small.
2) Some minor optimizations may give a very small performance boost for AVX CPUs.
3) All new torture test data for AVX CPUs. The new data runs more iterations, thus
more time is spent torturing the CPU rather than initializing the FFT routines.
4) Information added to result lines containing "has a factor". This information
may be used by the server's manual web page to give proper TF / P-1 / ECM cpu
credit at a future date.
1) When an error occurs reading a save file it is renamed with a .bad extension.
On rare occasions the file can be read successfully at a later time.
1) Multi-threaded tests might be a little bit faster especially when using a lot
of threads. Of course, single-threaded usage still gives the best throughput.
2) FFT crossover points were adjusted. Many higher, a few lower.
1) 32-bit FFTs optimized for AVX-capable computers. Intel Sandy Bridge computers
should see a 25% speed increase.
1) For rare cases where the program cannot figure out the number of cores and
hyperthreading, the NumPhysicalCores option may help. See undoc.txt.
2) Faster FFT implementations are now selected for Core 2 CPUs with 1MB L2 cache
or less (marketed under the Celeron and Pentium label).
3) New, slightly higher, trial factoring breakeven points.
1) A few crash bugs were fixed that affected only some CPU architectures and some
FFT lengths. Other minor bugs were fixed.
1) A bug that caused pfactor save file names to begin with the letter p
was fixed. It now uses m like all other P-1 efforts. Old save file
names are automatically upgraded.
2) Prime95 now recovers gracefully from more out-of-memory conditions
when doing ECM or P-1.
3) We now do P-1 factoring one bit level before the trial factoring limit.
The previous version started P-1 two bit levels before the trial
factoring limit.
1) A bug that caused the torture test to hang on 256K FFTs on SSE2 machines
with 128K of L2 cache was fixed.
1) Celeron D (256K L2 cache) and Willamette (also 256K L2 cache) now have
different implementations for several FFT sizes. This results in an
improvement of several percent for the Celeron D.
2) A bug that caused some machines to generate "Error 2252" when communicating
with the server was fixed.
3) SSE2 trial factoring code had a bug when factoring very large exponents.
1) For SSE2 machines the larger FFTs have been changed to more effectively
use a wide variety of L2 cache sizes. The previous version was optimized
for a 256KB L2 cache only. Depending on your CPU and FFT size, you could
see an improvement of several percent.
2) As a side "benefit" even larger FFT sizes are now supported. This allows
testing of exponents up to 596 million. Not recommended.
3) The factoring breakeven points have been recalculated using my 2 GHz P4.
This version of prime95 will do less trial factoring.
4) Since server database crashes cause spurious error 3 messages, prime95
will now ignore error 3 messages from the server for 72 hours. This should
work around the problem whereby a result is reported but no CPU credit is
given and the reservation is not cleared.
5) Fixed a crash bug when trial factoring exponents above 286 million.
1) Faster FFTs for AMD64 using prefetchw in both 32-bit and 64-bit mode.
You can expect about a 15% speed improvement.
2) Prime95 now detects support for 3DNow! instructions. See undoc.txt
for overriding this detection with CpuSupports3DNow=n in local.ini.
3) Factoring entries in worktodo.ini now accept exponents up to 2 billion.
4) Improved checking for memory allocation errors during a torture test.
Better guessing of amount of memory to use in a blend test.
5) Added timeouts to PrimeNet communications in hopes of avoiding rare hangs
when contacting the PrimeNet server.
6) Fixed rare bug where P-1's GCD could miss a factor.
7) Added trial factoring to the benchmark.
8) Fixed bug in ECM when using zero-padded FFTs.
9) SSE2 macros optimized for an additional 1-3% improvement on P4 and AMD64
CPUs.
1) The blend torture test now uses less memory by default to prevent
thrashing. The in-place torture test is now the default in more cases
to reduce complaints dealing with memory allocation issues (no pagefile,
thrashing, etc.).
2) A bug in continuing after finding a factor when using AdvancedFactor was
fixed.
1) A bug was fixed where the torture test used more virtual memory than
necessary, sometimes resulting in an "Out of memory" error.
1) Five changes have been made after GIMPS' first false positive report
in 7 1/2 years of operation.
a) The program now returns the number of errors that occurred when the
result is prime. A non-zero value will make us more suspicious of the
reported prime.
b) The save files will not be deleted. The user can then email these to
me and we can rerun the last 30 minutes of the LL test. It is hard
to imagine a second false positive report in this mini LL run.
c) The shift counter is now checked every iteration. If this variable
AND the FFT data were corrupted, then an endless loop of LL iterations
generating zero is possible - resulting in a false prime report.
d) Every iteration the FFT data is checked to see if the data has been
zeroed.
e) The is-this-a-prime check now makes sure the FFT data is not NaN.
NaN stands for not a number and means the data is corrupt. The
previous version checked for zero and my C compiler returns TRUE for
the test NaN == 0.0!
2) I restored the old behavior sending relative URLs. Some users had trouble
with this feature introduced in 23.4. So, UseFullURL=0 is now the default.
3) Some prefetching improvements were made for the Athlon, Pentium 3, and
Celeron 2 processors. You can expect speed improvements between 3%
and 10% for most FFT sizes. Warning: the new code is slower for Durons
and Celerons with small L2 caches for FFT sizes 1024K and above.
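
The NaN problem described in item 1e above can be sketched as follows. Under
IEEE 754 rules a comparison such as NaN == 0.0 is false, but the text reports
a compiler that returned TRUE for it, so an explicit check avoids relying on
the comparison at all. The function below is a hypothetical stand-in for the
is-this-a-prime test, not prime95's actual code.

```python
import math


def residue_is_zero(words):
    """Return True only if every FFT word is a finite value equal to zero."""
    for word in words:
        # Reject NaN and infinities explicitly instead of trusting the
        # result of comparing a corrupt value against zero.
        if not math.isfinite(word):
            return False
        if word != 0.0:
            return False
    return True


print(residue_is_zero([0.0, 0.0, 0.0]))
print(residue_is_zero([0.0, float("nan"), 0.0]))
```

With this guard, corrupt data can never be reported as a zero residue, which
is the failure that would otherwise produce a false prime report.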
1) Further performance improvements in the SSE2 code for FFTs larger than
640K. You should see about a 4% improvement in LL tests on a P4. However,
FFTs between 40K and 512K might be a tiny bit slower.
2) Stage 2 of P-1 factoring has been recoded for more speed. WARNING: when
continuing from a save file created in stage 2 by a previous prime95
version, this version will restart stage 2 from the beginning. ALSO:
the stage 2 P-1 save file format has changed - in dual boot scenarios
you must upgrade mprime and prime95 at the same time.
1) Further performance improvements in the SSE2 code. You can expect about
a 5% improvement in LL tests on a P4.
1) Big SSE2 FFTs now take the L2 cache size into account. P4 Celeron (128KB
L2 cache) is faster for FFTs between 512K and 2M. P4 Northwood (512KB
L2 cache) is faster for FFTs larger than 1M.
2) Benchmark no longer times 256K and 320K FFTs, but does time 2048K FFT.
3) Support for torture testing FFT sizes from 1280K to 4096K added.
4) A 900 MHz P-III is now required to get first time LL tests by default.
5) Slightly faster SSE2 FFTs for lengths of 5*2^N and 7*2^N (e.g. 640K, 896K).
1) A P-1 and ECM QA suite was implemented. A bug in ECM for exponents
below 172,700 and near the limit of an FFT size and using SSE2 code
was fixed.
1) A bug was fixed that caused some factors to be missed in stage 2 of P-1
when the available memory did not let the program allocate 12 temporary
variables. If testing a number in the 16 millions using an FFT size of
768K, then each temporary takes 768K * 8 bytes or 6MB. If your memory
setting was less than 72MB (6MB * 12 temporaries) then you were affected
by the bug. Actually, the program allocates some fixed tables, so anything
less than about 75MB triggered the bug.
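
The memory arithmetic in the item above can be reproduced with a few lines.
The constants come straight from the text (a 768K FFT, 8 bytes per value, 12
temporaries); the function names are illustrative only.

```python
BYTES_PER_VALUE = 8        # each FFT element takes 8 bytes, per the text
TEMPORARIES_NEEDED = 12    # stage 2 temporaries required to avoid the bug
MB = 1024 * 1024


def mb_per_temporary(fft_length_k):
    """Size in MB of one temporary for an FFT of fft_length_k * 1024 elements."""
    return fft_length_k * 1024 * BYTES_PER_VALUE // MB


def mb_for_temporaries(fft_length_k):
    """Memory in MB needed to hold all twelve temporaries."""
    return mb_per_temporary(fft_length_k) * TEMPORARIES_NEEDED


print(mb_per_temporary(768), "MB per temporary")
print(mb_for_temporaries(768), "MB for 12 temporaries")
```

The gap between this 72MB figure and the roughly 75MB quoted above is the
fixed tables the program also allocates.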
1) A bug that caused factors to be missed in the last stage of P-1 and ECM
factoring was fixed. The bug was introduced in executables built after
Sept. 28, 2002.
1) Error rate for a clean run is now estimated at 1.8% raising your chances
of finding a Mersenne prime while double-checking.
2) You can now stop and continue testing from the system tray menu.
3) You can now pause prime95 when another program starts running. See the
PauseWhileRunning option in undoc.txt.
4) Fixed bug introduced in 22.8 where No Icon did not work if Start at Bootup
was also specified in Windows 98.
5) A bug in unscrambling the proxy server password in primenet.ini was fixed.
1) Soft FFT crossovers have been implemented. If you test an exponent that
is within 0.2% of the old hard FFT crossover point, then 1000 test
iterations are run to determine if the smaller or larger FFT is
appropriate for the exponent. Although not recommended, you can adjust
prime95's behavior to be more aggressive using the SoftCrossover and
SoftCrossoverAdjust features discussed in undoc.txt.
2) To better stress main memory, the torture test will now use up to the
amount of memory specified in the Options/CPU dialog box.
3) Iterations with roundoff checking are a little faster for non-SSE2 CPUs.
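
The soft crossover rule in item 1 above amounts to a simple proximity test. The
sketch below assumes "within 0.2%" applies on both sides of the crossover
point and uses a made-up crossover value; the function name is hypothetical.

```python
def near_crossover(exponent, crossover, percent=0.2):
    """True if the exponent lies within the given percentage of the crossover."""
    return abs(exponent - crossover) <= crossover * percent / 100.0


# Hypothetical crossover at exponent 1,000,000, chosen only for illustration.
CROSSOVER = 1_000_000
for exponent in (999_500, 1_001_000, 1_003_000):
    if near_crossover(exponent, CROSSOVER):
        print(exponent, "-> run 1000 test iterations to pick the FFT size")
    else:
        print(exponent, "-> use the FFT size given by the hard crossover")
```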
1) Given more data, the roundoff error checking is now done on every
iteration for exponents within 0.5% of the maximum that can be tested
by the current FFT length. If a roundoff error above 0.4 occurs,
then the iteration is now repeated without any change to the shift count.
It now takes a roundoff error greater than 0.6 to corrupt the results.
2) Many of the FFT ranges have changed. Version 21 was too aggressive
in choosing FFT sizes for the P4. The new handling of roundoff
errors above 0.4 lets us be more aggressive with the non-P4 code.
3) Result lines are now WYn rather than WXn.
4) A crash bug affecting P-1 and ECM using the 2^N+1 option for large N was
fixed.
5) A rare memory corruption and possible crash bug in the GCD code was
fixed.
6) The -t command line argument will run the torture test.
7) To reduce wild fluctuations in the RollingAverage, it will be updated
roughly twice per day.
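
The roundoff handling in item 1 above can be summarized in a small decision
sketch. The two thresholds (0.5% of the FFT length's maximum exponent, and a
roundoff error of 0.4) are taken from the text; the function names and the
example maximum exponent are hypothetical.

```python
REPEAT_THRESHOLD = 0.4  # roundoff errors above this trigger a repeat


def needs_check_every_iteration(exponent, max_exponent):
    """True if the exponent is within 0.5% of the maximum for this FFT length."""
    # Integer arithmetic avoids floating-point surprises at the boundary.
    return exponent * 1000 >= max_exponent * 995


def handle_roundoff(roundoff_error):
    """Decide what to do with one iteration's measured roundoff error."""
    if roundoff_error > REPEAT_THRESHOLD:
        return "repeat"  # redo the iteration, keeping the same shift count
    return "accept"


# Hypothetical FFT length whose largest testable exponent is 1,000,000.
print(needs_check_every_iteration(996_000, 1_000_000))
print(handle_roundoff(0.45))
```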
New features in Version 22.2 of prime95.exe
-------------------------------------------
1) Some bugs in error handling when communicating with the server have
been fixed.
2) Communicating with the server by RPC is no longer supported. The
HTTPNET.DLL and RPCNET.DLL have been deleted. Only "Basic" Proxy server
authentication is supported. Version 21 may have supported MS Proxy
Server 2.0's NTLM (NT Lan Manager) challenge/response authentication.
3) The program now uses a high resolution timer rather than the RDTSC
instruction to time events. This should help the program display accurate
timings on laptops with SpeedStep or desktops that can hibernate. You can
force the program to use the old RDTSC timing method with the RdtscTiming
option described in undoc.txt.
4) The program uses an updated algorithm to automatically detect CPU type and
speed. The Options/CPU dialog box no longer lets you set this
information. Instead, the Options/CPU dialog displays the detected
CPU type and speed. This new feature helps prevent incorrect settings
when users upgrade or try several overclocking speeds. If the new
algorithm fails, you can override the settings as described in undoc.txt.
5) Benchmark now writes the program version, timing methodology, cpu type
and speed, L1 and L2 cache information to results.txt. It will refuse to
benchmark if rdtsctiming is 10.
1) Exiting the Torture Test now prints out how long it ran.
2) P4 error checking was relaxed slightly to reduce false alarms.
1) The program will now skip the P-1 factoring stage if another user has
already performed this step.
2) The Advanced/Quit GIMPS menu choice now lets you quit after current
work completes or quit immediately.
3) A bug was fixed in the error recovery code. After getting a "Disregard
last error" message, the user was treated to a new error on every
iteration. The end result was incorrect. The bug only affected the
error recovery of the new P4 FFT introduced in the beta version 21.2.
1) The program now uses the SSE2 instructions introduced on the Pentium 4 CPU.
This version is about 3 times faster than the previous version on a P4.
2) The program now uses the prefetch instructions on the Celeron 2, Pentium 3,
and all Athlon CPUs. This results in about a 20% performance boost on
these machines.
3) Process priority is now set to idle. Microsoft documentation says that
an idle thread priority overrides process priority. The program's
priority scheme has always worked well. However, several Win2K users
have reported that the program works better if the process priority is
also set to idle.
4) The program now delays calculations until 90 seconds after bootup. This
lets your machine boot up as fast as possible. This can be changed; see
undoc.txt.
5) The default crossover between double-check assignments and first-time
tests has been increased to a 400 MHz PII.
6) After 5 1/2 years, a help file now exists! It is HTML Help which may
have problems on older Windows machines. If so, try downloading
hhupd.exe from Microsoft's web site.
7) The program used to do factoring and P-1 testing on new exponents before
completing LL tests on older exponents. This was confusing to many.
The program now processes the worktodo.ini file in sequential order.
See undoc.txt on how to restore the old behavior.
8) Error messages such as ILLEGAL SUMOUT, SUMINP != SUMOUT, etc. are no
longer sent to the server since the final result contains a count of how
many errors occurred during the LL test.
9) Some of the FFT crossover points have changed slightly.
10) Test/Status now outputs the day of the week each work item will complete.
11) Interim output lines have changed - hopefully they are now more useful,
especially to newbies. By default, output lines no longer contain the
clocks count. Lines now contain a timestamp. Benchmark timings are now
output in milliseconds. These defaults can be changed using options
in undoc.txt.
12) A new welcome screen for new users will encourage stress testers
to use the program without reserving exponents.
13) There is now a menu choice that lets you unreserve a specific exponent.
This is for knowledgeable users only. You might do this if the server
assigned a small exponent and you'd rather be testing larger ones. Or
the prime95 you set up on another machine had a hard drive failure.
14) The Windows 9x Service menu choice has been replaced by the more general
Start at Bootup menu choice. This choice now sets registry entries to
autostart prime95 on any Windows machine. WARNING: Using this option
will delete any StartUp menu shortcuts so the registry entry and startup
menu do not both try to start prime95.
15) Prime95 will now ask for confirmation if you enter a CPU speed that
differs from the computed CPU speed by more than 4%.
16) The self-test menu choice was deleted - use the torture test instead.
17) A benchmark menu choice has been added.
18) The torture test code now includes more FFT lengths including smaller
ones that run completely in the L2 cache which may increase CPU
temperatures. Each FFT size is tested for 15 minutes by default.
There are now several options in the undoc.txt for fine tuning the
torture test's behavior.
1) A crash bug was fixed. If you did P-1 factoring followed by trial
factoring and then found a factor, unallocated memory was accessed,
usually resulting in a crash.
1) A rare error was fixed. When the memory settings changed during the
GCD step in P-1 or ECM factoring, then a spurious "ERROR: Factor
does not divide N!" error was raised.
1) Another fairly uncommon ECM bug was fixed. The bug caused "Factor does
not divide N!" errors.
2) A couple of minor bugs in computing the optimal P-1 bounds to use prior
to a Lucas-Lehmer test were fixed. The program now does a better job at
estimating the memory required in P-1 stage 2. Finally, although P-1
stage 2 working set size is unchanged, the program allocates less memory
in stage 2.
3) Prime95 no longer searches for a smaller factor when trial factoring
discovers a factor. The reasons are two-fold. 1) Version 19 had a
bug where stopping and restarting the program bypassed the search for
smaller factors. Thus, my database may already be missing smaller
factors. 2) As we factor larger exponents to a deeper depth it may
no longer be a quick job to determine if there are smaller factors.
Note that version 20 will still look for smaller factors if you are
looking for factors below 2^60 with the FactorOverride option in undoc.txt.
4) The undocumented AMPM feature controls how times are formatted in the
Options/CPU dialog box.
1) If P-1 stage 1 completed and there was not enough memory to start
stage 2 immediately, then an incorrect save file was generated. This
bug was introduced in version 20.1. Upon restart of the P-1 factoring
job a crash or other unpredictable behavior was possible. This bug was
fixed and this version has special code to properly read these
incorrect save files.
2) P-1 will restart any time the memory settings change. This is done
so that the optimal P-1 bounds can be computed with the new memory
settings.
3) A bug in ECM testing was fixed.
1) The program now does some P-1 factoring prior to running first time
and double-checking Lucas-Lehmer tests. This will increase overall
GIMPS throughput. If you install version 20 in the middle of an LL test
the program will run the P-1 step if the LL test is less than 50%
complete.
2) The Options/CPU dialog box now asks how much memory the program can
use during the P-1 factoring. See the "Setting Available Memory"
section in the readme.txt file.
3) Stage 1 of P-1 factoring is now faster.
4) The GCD used in P-1 and ECM factoring is now faster.
5) The Test/Manual Operation menu choice has been deleted.
6) The memory options in P-1 and ECM dialog boxes have been deleted.
7) The "send new completion dates" checkbox was moved from the Test/Primenet
dialog box to the Advanced/Manual Communication dialog box.
8) A bug in estimating time remaining for a factoring job was fixed.
9) AdvancedFactor now writes a line to the worktodo.ini - just like all
the other work types.
10) ECM and P-1 are now consistent with Lucas-Lehmer testing in the use
of the "Iterations between screen outputs" setting. An iteration is
defined as the time it takes to do a squaring. If you are doing ECM
on small exponents you will probably want to increase this setting.
1) Added code so that server can distinguish between a v17 and v18 client.
2) Only v17 save files above 4194304 are deleted.
1) Quit GIMPS choice moved to the Advanced Menu (so novice users do
not confuse it with exit).
2) A bug in httpnet.dll was fixed. It now works for even more
proxy servers and firewalls.
1) More errors in detecting whether you are connected to the Internet have
been fixed.
2) Minor bugs fixed.
1) Command line arguments are now available to help you run the program
on your co-workers' machines!
1) Factoring speed has been doubled for Pentiums. 486 machines will
notice a 15% improvement when factoring.
2) The program will now automatically determine your CPU type and speed.
This information is used in choosing the optimal factoring algorithm.