Benchmarking FreeBSD
The previous major run comparing FreeBSD to Linux was done by Kris Kennaway in 2008. There are occasional benchmarks appearing on blogs and mailing lists which show interesting results...
Target audience
Mostly developers
The results are decent, but not rosy
Significant space for improvement
What to do, what not to do
Tuning? Do you need it?
Avoid benchmarking?
Purpose of benchmarking...
To check your system for configuration bugs and obvious problems (hw/sw)
To check your capacity for running an application (web server, db server, etc.)
To compare your hardware to others'
To compare your favorite application / operating system to others...
Repeatability
The best benchmarks are performed in a way anyone can repeat them
This allows both result verification (error checking) and letting others compare their own setup to the one tested
Repeatability is a good thing...
Test hardware
2x IBM x3250 M3
Xeon E3-1220 3.1 GHz, 4-core (no HTT)
32 GB RAM
4x SATA drive RAID0
1 Gbps NIC (2 ports)
Directly connected NICs
Test software
FreeBSD 9.1 and 10-CURRENT (HEAD)
CentOS 6.3
PostgreSQL 9.2.3
blogbench 1.1
bonnie++ 1.97
filebench 1.4.8
mdcached 1.0.7
Accurate = measures what we think it measures (or is it off the mark?)
Precise = the measured value is a good approximation of the real one (or is it affected by noise?)
Systematic error = we're not measuring what we think we're measuring (the method of benchmarking is wrong, even though the results may be repeatable and look correct)
Random error = affected by noise outside our control, resulting in non-repeatable measurements
Expressing precision
Standard deviation
Error bars: 1234 +/- 5 of something
Expresses confidence in your measurements: how precise they are (clustered around the measured value)
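As a sketch of how repeated runs reduce to a mean with error bars (the throughput numbers below are made up for illustration):

```shell
# Five hypothetical throughput results (MB/s) from identical runs:
runs="312 305 318 309 314"

# Mean and sample standard deviation via awk, reported as "mean +/- sd":
echo $runs | tr ' ' '\n' | awk '
    { sum += $1; sumsq += $1 * $1; n++ }
    END {
        mean = sum / n
        sd = sqrt((sumsq - n * mean * mean) / (n - 1))
        printf "%.1f +/- %.1f MB/s\n", mean, sd
    }'
```

With these example numbers this prints `311.6 +/- 4.9 MB/s`; a large spread relative to the mean is a sign of random error in the setup.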
(Systematic: you are not measuring what you think you are measuring)
File systems vs (spinning) drives
[Chart: read throughput (250-350 MB/s) by data location on the drive: middle vs. inside tracks]
Data location-dependent performance
Data structure (re)use-dependent performance (fresh newfs vs. an old file system)
Whole drive
[Chart: whole-drive throughput (MB/s), showing noise]
Speaking of noise...
Create a partition spanning the minimal required disk space (e.g. 3x-4x the RAM size)
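A FreeBSD sketch of carving out such a partition (the device name da0 and the GPT/UFS choices are assumptions; the gpart/newfs lines are commented out because they destroy data):

```shell
# Size the test partition at ~4x RAM so the page cache cannot hide the
# drive's real speed, while keeping head movement roughly constant.
RAM_GB=32                       # this test machine has 32 GB RAM
PART_GB=$((RAM_GB * 4))         # 128 GB test partition

# Destructive steps, shown commented out (da0 is an assumed device name):
#   gpart create -s gpt da0
#   gpart add -t freebsd-ufs -s "${PART_GB}G" -l bench da0
#   newfs -U /dev/gpt/bench
echo "test partition size: ${PART_GB} GB"
```

Keeping the partition at the start of the drive also avoids mixing the fast outer tracks with the slow inner ones in one measurement.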
[Chart: throughput (50-350 MB/s): partition vs. whole drive]
bonnie++
[Chart: bonnie++ throughput (100-600 MB/s)]
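A plausible bonnie++ invocation for this setup (the mount point, user, and 64 GB file size, i.e. 2x RAM, are assumptions, not the exact values from the talk):

```shell
# Run bonnie++ on the test partition; -s sets the file size (2x RAM so
# caching cannot mask drive speed), -n 0 skips the small-file phase,
# -u drops privileges when run as root. /mnt/bench is an assumed path.
if command -v bonnie++ >/dev/null 2>&1; then
    bonnie++ -d /mnt/bench -s 64g -n 0 -u nobody
else
    echo "bonnie++ not installed; invocation shown for reference"
fi
```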
Think about it: it's a database...
Data layout issues (especially on rotating rust)
Concurrent access issues
Miscellaneous features such as attributes, security, reliability, TRIM
NFS is also a horror, but for different reasons
Blogbench
Creates a tree of small-ish files (2 KiB - 64 KiB)
Reads and writes them
Atomic renames for (some) writes
Multithreaded
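A minimal way to point blogbench at the file system under test (the target directory is an assumption; blogbench uses its default reader/writer thread counts unless told otherwise):

```shell
# Run blogbench against a scratch directory on the tested file system;
# it creates and re-reads a tree of small "blog" files from many threads.
# /mnt/bench/blog is an assumed path.
if command -v blogbench >/dev/null 2>&1; then
    mkdir -p /mnt/bench/blog
    blogbench -d /mnt/bench/blog
else
    echo "blogbench not installed; invocation shown for reference"
fi
```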
Blogbench results
[Charts: blogbench read scores (0-2,000,000; annotated 700,000 and 1,900,000) and write scores (0-4,500) for Linux, FreeBSD 9.1, and FreeBSD 10]
open(O_WRITE), write(), rename(), close(), read(), etc. block each other (exclusive lock) + writers block readers
[Chart: open() and write() operation rates (100,000-700,000)]
PostgreSQL pgbench
PostgreSQL configuration:
(fits in RAM, for read-only tests)
8 GB shared buffers
32 MB work memory
autovacuum off
30 checkpoint segments
Data is first written to checkpoint (WAL) segments, so restarts are needed between write benchmarks
Data is transferred to its proper storage almost unpredictably, so long-ish runs are needed for repeatable benchmarks
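A pgbench driver along these lines (the database name, scale, client counts, and run length are assumed example values, not the exact ones from the talk):

```shell
# Initialize a dataset that fits in 32 GB RAM, then run read-only and
# read-write tests; restart the server between read-write runs so
# checkpoint state is comparable. "bench", -s 1000, 32 clients, and
# 300 s are assumed example values.
if command -v pgbench >/dev/null 2>&1; then
    pgbench -i -s 1000 bench                # initialize ~15 GB of data
    pgbench -S -c 32 -j 4 -T 300 bench      # read-only (SELECT-only) run
    pgbench -c 32 -j 4 -T 300 bench         # read-write (TPC-B-like) run
else
    echo "pgbench not installed; invocations shown for reference"
fi
```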
PostgreSQL results
[Chart: pgbench transactions/s (0-60,000) vs. client count (1-128); series: Linux disk RO avg, Linux disk RW avg, Linux ramfs RW avg, Linux remote ramfs RO avg, FreeBSD disk RW avg, FreeBSD tmpfs RO avg, FreeBSD remote tmpfs RO avg]
On Linux, there's almost no difference between ext4 with fully cached (warmed) data and ramfs
On FreeBSD, tmpfs is faster
[Charts: pgbench transactions/s (0-60,000) vs. client count (1-128), with a region annotated "Scheduler issue"]
However...
Different benchmark, by Florian Smeets
40-core, 80-thread system, 256 GB RAM
scale 100 (10,000,000 records)
[Chart: transactions/s (0-200,000) vs. thread count; series: FreeBSD-VMC-233854-postgres-9.2-devel, Linux-kernel-3.3.0-glibc-2.15-postgres-9.2-devel, FreeBSD-head-r233892]
Filebench
fileserver: Emulates simple file-server I/O activity. This workload performs a sequence of creates, deletes, appends, reads, writes and attribute operations on a directory tree. 50 threads are used by default. The workload generated is somewhat similar to SPECsfs.
webproxy: Emulates I/O activity of a simple web proxy server. A mix of create-write-close, open-read-close, and delete operations on multiple files in a directory tree, plus a file append to simulate the proxy log. 100 threads are used by default.
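Roughly how a filebench 1.4.x run is driven (the interactive commands and the target directory are assumptions; newer filebench releases instead take a workload file via -f):

```shell
# Load the fileserver personality, point it at the file system under
# test, and run for 60 seconds. /mnt/bench is an assumed mount point.
if command -v filebench >/dev/null 2>&1; then
    filebench <<'EOF'
load fileserver
set $dir=/mnt/bench
run 60
EOF
else
    echo "filebench not installed; workload commands shown for reference"
fi
```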
fileserver profile
[Chart: fileserver throughput (0-900 MB/s): Linux local, Linux NFSv3, Linux NFSv4, FreeBSD local, FreeBSD NFSv3, FreeBSD NFSv4]
webproxy profile
[Chart: webproxy throughput (0-800 MB/s): Linux local, Linux NFSv3, Linux NFSv4, FreeBSD local, FreeBSD NFSv3, FreeBSD NFSv4]
Cross-benchmarking
[Chart: cross-benchmarking results (50-150)]
Bullet Cache
My own cache server, presented at BSDCan 2012 :)
Lots of small (32 byte - 128 byte) TCP command / response transactions
Multithreaded + non-blocking IO (nearly 2,000,000 TPS over Unix sockets on medium-end hardware)
[Chart: Bullet Cache transactions/s (0-600,000) on CentOS 6.3, FreeBSD 9.1, and FreeBSD 10 at 100-500 connections]
Result notes
TCP is fairly concurrent in FreeBSD; multiple TCP streams do not block each other
> 470,000 PPS per direction
Over Unix sockets (local): 1,020,000 TPS
On tuning...
The year is 2013 and FreeBSD actually auto-tunes reasonably well
Example #1: vfs.read_max
Example #2: hw.em.txd/rxd
Example #3: kern.maxusers (+ hi/lobufspace)
Example #4: vm.pmap.shpgperproc
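A few of these auto-tuned values can be inspected directly; a sketch that prints them (the OIDs exist on FreeBSD; elsewhere the loop simply reports them as missing):

```shell
# Print some of the auto-tuned sysctl values named above, if present.
for oid in vfs.read_max kern.maxusers vm.pmap.shpgperproc; do
    sysctl "$oid" 2>/dev/null || echo "$oid: not present on this system"
done
```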
You are not going to influence raw CPU performance from the OS (except in edge cases / misconfiguration)
You can test the quality of the libc and libm implementations... but be sure that is what you want to measure
You can also test the quality of compiler optimizations... if you want to
(LLVM vs GCC)
[Chart: benchmark run times in seconds (15-75 scale, lower is better): GCC 4.7.2: 39.35 (SE +/- 0.00), GCC 4.8.0: 37.91 (SE +/- 0.00); other results: 39.34, 65.75]
Is performance more important than features?
Is it cheaper to buy another machine than to reconfigure / develop a better OS?
During the 7.x timeframe: 8-CPU scalability
During the 10.x timeframe: 32-CPU scalability
As a sysadmin, you just want to know
Planning / budgeting for a project
When your boss tells you to :)
Advocacy...
Continuous benchmarking?
Some talks were had...
It would probably be easier to set up now, after the developer cluster has been updated...
Performance regression monitoring
The End
Benchmarking FreeBSD
Ivan Voras <ivoras@freebsd.org>