Jetstress Field Guide v2.0.0.8
Prepared by
neil.johnson@microsoft.com
Date        Author        Version
22/03/2013  Neil Johnson  2.0.0.1
03/04/2013  Neil Johnson  2.0.0.2
19/06/2013  Neil Johnson  2.0.0.5
20/06/2013  Neil Johnson  2.0.0.6
20/06/2013  Neil Johnson  2.0.0.7
Document Contributors
Name              Position                 Section
Neil Johnson                               Author
Alexandre Costa                            Jetstress internals
Ross Smith IV
Ramon b. Infante  DIR, WW COMMUNITIES, UC  Various
Matt Gossage                               Various
Umair Ahmad                                Various
Reviewers
Name              Version  Position                 Date
Neil Johnson      2.0.0.1
Alexandre Costa   2.0.0.1
Ross Smith IV     2.0.0.1
Ramon b. Infante  2.0.0.1  DIR, WW COMMUNITIES, UC
Matt Gossage      2.0.0.1
Umair Ahmad       2.0.0.1
Scott Schnoll     2.0.0.1
Boris Lokhvitsky  2.0.0.1
Jeff Mealiffe     2.0.0.1
Robert Gillies    2.0.0.1
David Mosier      2.0.0.1
Table of Contents
1 Purpose
2 What is New in Jetstress 2013
3 Introduction to Jetstress
4 Jetstress Internals
    Thread Dispatcher
    Initialisation
    Testing
    Clean-up
6 Installing Jetstress
    Documentation
    Prerequisites
    Installation
    Application Installation
7 Configuring Jetstress
    Initial configuration
    Test Summary
    Database Configuration
    Test evaluation
Common Issues
Purpose
This document is intended to explain the process and requirements for validating
an Exchange 2013 storage solution prior to releasing an Exchange deployment
into production.
It will explain how Jetstress works, how to plan for and perform a Jetstress test,
and how to analyse the results of the test.
This document is not intended to provide Exchange storage design guidance. For
guidance on Exchange 2013 server design and planning, refer to Planning and
Deployment.
What is New in Jetstress 2013
The Event log is captured and logged to the test log. These events show
up in the Jetstress UI as the test progresses.

Any errors are logged against the volume on which they occurred. The final
report shows the error counts per volume in a new sub-section.

A single IO error anywhere will fail the test. In the case of CRC errors, they
might be remapped. A re-run of Jetstress should verify that they were indeed
remapped.

Jetstress detects -1018, -1019, -1021, -1022, -1119, hung IO, DbtimeTooNew and
DbtimeTooOld.

Threads, which generate IO, are now controlled at a global level. Instead
of specifying Threads/DB, you now specify a global thread count, which
works against all databases. This improves the granularity of thread
tuning and enables automatic tuning to work more effectively.

Jetstress configuration files (JetstressConfig.XML) generated by an older
version of Jetstress are no longer allowed.
Important Changes
Do not use Jetstress 2013 for older versions of Exchange Server. Jetstress
2013 has only been tested with Exchange Server 2013.
Jetstress 2013, Jetstress Field Guide, Version 2.0.0.8
Prepared by neil.johnson@microsoft.com
"323222109" last modified on 8 Jul. 13, Rev 2
Introduction to Jetstress
Jetstress is a tool for simulating Exchange database I/O load without requiring
Exchange to be installed. It is primarily used to validate physical deployments
against the theoretical design targets that were derived during the design phase.
To simulate the complex Exchange database I/O pattern effectively, Jetstress
makes use of the same ESE.DLL that Exchange uses in production. It is therefore
vital that Jetstress uses the same version of the Extensible Storage Engine (ESE)
files that your Exchange infrastructure will be built with in production.
Ideally, Jetstress testing will be part of the overall project plan. The best time to
schedule Jetstress testing is just before Exchange will be physically installed onto
the servers.
Jetstress testing provides the following benefits prior to deploying live users.
The most important aspect of Jetstress testing is that it allows you to see how
the physically deployed storage and server infrastructure will behave once a real
Exchange workload is applied. This often works out differently from
expectations, especially in scenarios where shared storage infrastructure is
deployed or where the storage design is complex.
Often the Jetstress test will not provide the results that were expected.
Sometimes by making subtle configuration changes to the storage infrastructure
(for example, driver or firmware updates) it is then possible to get the test to
pass.
It is important to remember that when the Jetstress test reports a failure,
Jetstress has not failed; Jetstress is just reporting on the performance of your
storage solution. This may seem an obvious point; however, a large number of
customer escalation cases for Jetstress are not actually Jetstress cases and are
instead storage performance cases. If you need to remediate a test failure,
remember that Jetstress is a dumb tool that is used worldwide by thousands of
Exchange professionals and in Office 365. It is extremely unlikely that Jetstress
is broken; it is far more likely that you have a design issue or misconfiguration
in your storage deployment.
Fundamentally, a successful Jetstress test validates that all of the hardware and
software components within the I/O stack from the operating system down to the
physical disk drive are working to a sufficient level to meet the predicted
performance required by Exchange to operate successfully.
Important:
The validity of your Jetstress testing is only as good as the user profile
analysis and workload prediction that was completed during the design
phase of the project.
2 Jetstress Internals
2.1 Main Jetstress Components
Like Exchange, Jetstress is an ESE-based application. It runs in user memory
space, makes API calls to ESE, which in turn makes calls to the Windows File
system and I/O Manager to gain access to the data stored on disk. During each
of these tasks Windows records performance information about the specific task
and the operating system as a whole. Once the test is completed, Jetstress
analyses the performance data to determine if the system meets the targets
specified at the beginning of the test.
New:
Auto-tuning has been improved in Jetstress 2013 by moving to a global
thread controller. Auto-tuning may still fail; however, it should be
successful in many more scenarios than in Jetstress 2010.
While working out the correct thread count to use it is not necessary to let the
checksum part of the test complete. To stop the checksum you can either click
on cancel, which will stop the checksum part of the test but still generate the
performance test report, or edit the Jetstress configuration file and change the
VerifyChecksum value to false (default is true).
<VerifyChecksum>false</VerifyChecksum>
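The configuration file edit described above can also be scripted. The following is a minimal sketch only: the element name comes from the snippet above, but where that element sits inside JetstressConfig.XML is not stated in this guide, so the sketch searches the whole tree. The function name and the file path are hypothetical, and if the file declares a default XML namespace the tag match would need adjusting.

```python
import xml.etree.ElementTree as ET


def set_verify_checksum(config_path: str, enabled: bool) -> int:
    """Set every <VerifyChecksum> element in the file; return how many were set."""
    tree = ET.parse(config_path)
    changed = 0
    # iter() walks the whole tree, so the nesting depth of the element
    # (which this guide does not state) does not matter.
    for element in tree.getroot().iter("VerifyChecksum"):
        element.text = "true" if enabled else "false"
        changed += 1
    if changed:
        # Rewrites the file in place; keep a backup copy of the original.
        tree.write(config_path, encoding="utf-8", xml_declaration=True)
    return changed
```

A call such as set_verify_checksum("JetstressConfig.xml", False) would switch the checksum off while tuning; remember to set it back to true before running the real validation test.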
The following process assumes that you are using the disk subsystem
throughput test and auto-tuning as recommended.
So, why would you run Jetstress during the planning/design phase of a project?
The simple answer is that with today's powerful hardware, Exchange design
teams must use standard chunks of hardware to create their design. Rather
than attempt to guess the I/O limits of the hardware, it is preferable to
perform some Jetstress tests on the hardware to determine the maximum
storage IO capacity of the system. This allows the design team to specify the bill
of materials much more precisely, thereby saving money and reducing risk.
However, if you have already proven the solution in the lab, why test again at
build time? This is a common question. Many projects only schedule sufficient
time for testing a single server and its storage solution with the belief that they
only need to validate the design. The problem with this approach is that it
assumes a zero error rate in the build out. What happens if someone forgets a
part of the build on one server, or deploys a different device driver
from the one used in the lab? What happens if a faulty piece of hardware has
been deployed? Jetstress testing at build time is a great way to validate that the
physically deployed hardware and software are capable of providing the required
I/O performance for Exchange. Jetstress testing at build time is also a way to
identify failing components such as disk drives; it is much less stressful to
identify a weak batch of disks during a Jetstress test than on a Monday morning
after a large user migration!
If the project plan will allow it, build in sufficient time to test each server and
storage chassis that will be deployed before migrating user mailboxes to it.
Remember that Jetstress can be fully automated, so with a little bit of planning it
can be left to run overnight and may not actually add any significant overhead to
the project.
Each database copy must be designed to provide sufficient I/O to support the
copy if it were to become active. Therefore, by testing each database LUN in
parallel, we are validating that the storage solution is able to meet the design
requirements. We are also validating that any pieces of shared infrastructure are
able to meet the demand of the entire solution, rather than simply testing each
server individually.
Note:
Where there is no shared infrastructure and all storage is directly
attached, servers may be tested individually. However, to be valid, the
test must be configured to include any active, replica or lagged LUNs
that could come online at the same time.
Test importance
Description
Optimal
Degraded
Rebuilding
Ideally, the Jetstress test should still pass during a degraded mode test. If the
test fails, refer to this post to analyse the failure severity.
1 If your array does not contain a hot spare, you can choose to perform array
rebuilds out of hours so that the end user impact is minimized; however, your
data loss exposure is increased. If you plan on performing array rebuilds during
working hours, it is recommended to perform a Jetstress test run while the array
is rebuilding, even if you do not have a hot spare configured.
Note:
Please refer to the following article about understanding storage
configuration for Exchange Server 2013 for more information on
recommended RAID configurations for Exchange Server.
http://technet.microsoft.com/en-us/library/ee832792.aspx
Additionally the host may be the failover location for other guests,
meaning that workload may increase dramatically in a failure scenario.
3. Follow the current recommended practices from both Microsoft and your
hypervisor vendor. Yes, I know this is obvious but it still amazes me how
many problems are resolved by following the recommended guidance!
Guidance
The spirit of the test is to ensure that the system can meet its predicted
workload during normal working conditions and during any common
failure modes that the system has been designed to survive.
Initialisation
Testing
Clean-up
3.6.1 Initialisation
This phase includes installation, prerequisites and initial database creation. Of
these tasks, the initial database creation will take the longest amount of time.
Database creation time varies between hardware deployments; however, expect
around 24 hours for 10TB of data per server (~7GB/minute). If you are using
direct attached storage and initialise multiple servers in parallel, these
predictions apply to each server. If you are using shared storage,
initialisation may take considerably longer.
DATA (TB)  TIME (Hours)  TIME (Days)
1TB        2.4           0.1
2TB        4.8           0.2
5TB        12.0          0.5
10TB       24.1          1.0
50TB       120.3         5.0
100TB      240.6         10.0
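For data sizes that are not listed, the same estimate can be worked out from the quoted rate. This is a rough sketch rather than a Jetstress feature: it assumes the approximate 7GB/minute figure given above and 1TB = 1024GB, so its results land close to, but not exactly on, the table values (the exact rate behind the table is not stated). The function name is hypothetical.

```python
def estimate_init_hours(data_tb: float, rate_gb_per_min: float = 7.0) -> float:
    """Estimate database creation time in hours for one server."""
    data_gb = data_tb * 1024  # assumption: 1 TB = 1024 GB
    return data_gb / rate_gb_per_min / 60


for size_tb in (1, 2, 5, 10, 50, 100):
    hours = estimate_init_hours(size_tb)
    print(f"{size_tb:>4} TB  {hours:7.1f} h  {hours / 24:5.1f} d")
```

Replace the default rate with a value measured on your own hardware once the first database has been created; shared storage in particular may be considerably slower.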
3.6.2 Testing
The actual testing phase will vary depending on the complexity and maturity of
the design. If your design is based on complex, cutting-edge storage
technology, it is highly likely that you will need to allocate more time for testing.
If your design is based on common direct attached components, the testing
phase is likely to be quite short. For simple direct attached solutions allow
2 to 5 days; for complex SAN solutions try to allocate up to 10 working
days. If you are working in a complex enterprise with large-scale, complex
storage infrastructure, budget 4 to 6 weeks for Jetstress testing.
Troubleshooting storage performance issues can often be very time-consuming.
3.6.3 Clean-up
Before the server can be put into production, it is necessary to remove the
Jetstress application and the test databases that were created. The
recommended procedure is as follows:
JetstressScripts.zip
The scripts will parse your JetstressConfig.XML file and remove all
database and log folders defined in the test. The scripts take two input
parameters:
Note that these scripts are unsupported and you use them
entirely at your own risk. They are provided here for convenience
only.
Installing Jetstress
4.1 Documentation
The document that you are currently reading represents the main source of
information for Jetstress 2013. If you are validating Exchange Server 2003, 2007
or 2010 refer to the Jetstress Field Guide for Jetstress 2010.
Build                    Usage                         Link
14.01.0225.017 (32 bit)  Exchange 2003                 http://www.microsoft.com/en-us/download/details.aspx?id=20054
14.01.0225.017 (64 bit)  Exchange 2007, Exchange 2010  http://www.microsoft.com/en-us/download/details.aspx?id=4167
15.0.658.4 (64 bit)      Exchange 2013                 http://www.microsoft.com/en-us/download/details.aspx?id=36849
Jetstress 2013 will not allow you to use an XML configuration file from an
older version of Jetstress.
Always ensure that you use the same version of Jetstress to initialise the
databases and to perform the testing.
4.3 Prerequisites
It is important that the version of ESE that is used for the test is the same
version that will be used in production.
3 See section 5.4 Getting ESE Files necessary for Jetstress for the locations of
these files.
Path
C:\Program Files\Microsoft\Exchange Server\V15\Bin
C:\Program Files\Microsoft\Exchange Server\V15\Bin\perf\AMD64
C:\Program Files\Microsoft\Exchange Server\V15\Bin\perf\AMD64
C:\Program Files\Microsoft\Exchange Server\V15\Bin\perf\AMD64
C:\Program Files\Microsoft\Exchange Server\V15\Bin\perf\AMD64
Path
\setup\serverroles\common
\setup\serverroles\common\perf\amd64
\setup\serverroles\common\perf\amd64
\setup\serverroles\common\perf\amd64
\setup\serverroles\common\perf\amd64
Caution
Remember to use the same version of ESE files in your Jetstress
tests that you will use in production.
4.5 Installation
Before performing the steps in this section, ensure that all prerequisites
have been met and that Exchange Server is not installed on any servers being
used for Jetstress testing.
Configuring Jetstress
For the purposes of this document, we will be configuring a disk subsystem
throughput test. The goal of this test is to identify the peak working IOPS value
that the storage subsystem can sustain while remaining within the disk latency
targets established by the Exchange Product Group.
8.
0.75 = 45m
0.50 = 30m
0.25 = 15m
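The values above appear to express a duration as a decimal fraction of an hour. As a small worked example under that assumption, the conversion in both directions is sketched below; the helper names are hypothetical.

```python
def hours_to_minutes(hours: float) -> int:
    """Convert a decimal-hour duration to whole minutes."""
    return round(hours * 60)


def minutes_to_hours(minutes: int) -> float:
    """Convert whole minutes to a decimal-hour duration."""
    return minutes / 60


for value in (0.75, 0.50, 0.25):
    print(f"{value:.2f} = {hours_to_minutes(value)}m")
```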
Performance_<date>.XML
Performance_<date>.HTML
Performance_<date>.BLG
DBChecksum_<date>.XML
DBChecksum_<date>.HTML
DBChecksum_<date>.BLG
XMLConfig_<date>.XML
the test.
Table 9 - Jetstress initial configuration
Content                   Purpose
Performance_<date>.BLG
Performance_<date>.XML
Performance_<date>.HTML
DBChecksum_<date>.BLG     Provides binary performance data gathered during the
                          CRC checksum of the database. Useful if the checksum
                          fails or takes a long time to complete.
DBChecksum_<date>.XML
XMLConfig_<date>.XML
This section is a basic summary of the test, when it started, finished and which
versions of operating system and ESE were used.
The most important part of this section is the overall test result, pass or fail.
This section shows some more detailed parameters regarding the test. A
disk subsystem throughput test report will always show 100% for Capacity
Percentage and Throughput Percentage. In this example, 4 x 25GB databases
were created on a 126GB LUN. Jetstress created a total of 101GB (109154926592
bytes) of data for testing, which is 80% of the available space. This is normal
behaviour; by default, in performance mode Jetstress will use 80% of the disk
capacity to allow room for growth during the test process.
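The capacity figures quoted above can be checked with a little arithmetic. The sketch below assumes that the report's GB means 1024^3 bytes and that the LUN size is 126 of the same units; the function name is hypothetical.

```python
GIB = 1024 ** 3  # assumption: the report counts 1 GB as 1024^3 bytes


def used_capacity(bytes_created: int, lun_size_gb: float) -> tuple[float, float]:
    """Return (data created in GB, percentage of the LUN that it occupies)."""
    created_gb = bytes_created / GIB
    return created_gb, 100 * created_gb / lun_size_gb


created_gb, percent = used_capacity(109_154_926_592, 126)
print(f"{created_gb:.1f} GB created, {percent:.1f}% of the LUN")
```

Under those assumptions the byte count works out at a little over 101GB and just under 81% of the LUN, which is consistent with the default 80% target once rounding is taken into account.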
The most important value in this section is the Achieved Transactional I/O per
Second. In this example, the test validated that the storage can provide 231
transactional I/O per second. This represents random database IOPS.
Note:
To validate that the test has met the design requirements compare the
Achieved Transactional I/O per Second from your Jetstress report to the
Total Database Required IOPS / Server value recorded in section 8.1 Target
design values, from the Mailbox Role Calculator.
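The comparison described in the note is a simple greater-than-or-equal check. As an illustrative sketch, the function below also reports the headroom over the target; the function name is hypothetical, and the target of 200 IOPS in the example is an invented figure, not a value from this guide.

```python
def iops_headroom(achieved_iops: float, required_iops: float) -> tuple[bool, float]:
    """Return (target met?, headroom over the target as a percentage)."""
    if required_iops <= 0:
        raise ValueError("required_iops must be positive")
    headroom = 100 * (achieved_iops - required_iops) / required_iops
    return achieved_iops >= required_iops, headroom


# 231 is the achieved value from the example report; 200 is a made-up target.
met, headroom = iops_headroom(231, 200)
print(f"target met: {met}, headroom: {headroom:.1f}%")
```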
This section displays some system values that Jetstress used for this test. The
important values for analysis here are the thread count and number of copies
per database.
This section lists the paths for each database and log combination. In this
example, 4 x 25GB databases were configured on a single LUN. Check that all of
the test databases are listed here and the path names are correct.
This section of the report displays the Transactional I/O values that were
achieved for each database. Transactional I/O does not include I/O for
Background Database Maintenance.
BDM I/O is mostly sequential so it is not usually considered during the design
phase.
Information:
If you sum the values highlighted in the red box the result should add up
to the Achieved Transactional I/O per second reported in the Database
Sizing and Throughput table.
This section displays the I/O that was used to perform Background Database
Maintenance only. The sum of values in the red box shows the total amount of
IO used for BDM operations. These are sequential operations and we do not
usually need to account for them in our design. However, take the advice of
your storage vendor on this aspect; some storage platforms do not handle
sequential IO as well as others and may require some additional design work to
help them deal with BDM more gracefully.
This section displays the I/O overhead for LOG file replication. In this example
there were two replica copies (replicas=2); this is shown by a non-zero count for
I/O Log Reads/sec. If this value is greater than zero, it confirms that database
replication is being simulated.
Note:
For those that noticed, I finally provided a report that shows log IO. I
know, the little things count.
This table shows all I/O that was recorded during the test (transactional I/O plus
BDM I/O plus LOG I/O). The sum of the I/O values from the areas highlighted in
red in this table should agree (roughly) with those observed at the storage
subsystem.
In this case, the sum suggests that the storage subsystem had to deal
with 349 IOPS. However, roughly 1/3rd of those IOPS (349-231=118) were
sequential and so were not accounted for during the design process, since
sequential I/O is very easy on most disk subsystems.
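The split between transactional and sequential I/O can be checked with the figures quoted above (349 total IOPS, 231 transactional IOPS). This is a worked example only, with a hypothetical function name.

```python
def split_io(total_iops: float, transactional_iops: float) -> tuple[float, float]:
    """Return (sequential IOPS, sequential share of the total as a fraction)."""
    sequential = total_iops - transactional_iops
    return sequential, sequential / total_iops


sequential, share = split_io(349, 231)
print(f"sequential IOPS: {sequential}, share of total: {share:.0%}")
```

The remainder is 118 IOPS, or about a third of the total, which is the BDM and log I/O that the design process normally leaves out.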
The following chart shows the observed IOPS from the Windows host during the
Jetstress test. This counter includes all system IOPS as well as the test IOPS;
however, there should be a strong correlation between the IOPS observed on the
Windows host and at the storage subsystem. In the event of contradiction
between observed IOPS at the Windows host and those at the storage controller,
the Windows host values take precedence from a Jetstress validation perspective.
This section of the report shows the observed system performance during the
test. This section is most often used for troubleshooting. The most important
thing to note from this section is that the CPU load from Jetstress is usually
minimal. Jetstress has been optimized to evaluate the storage subsystem and
not the host performance itself.
Error Type categories: IO Failures, Filesystem Corruptions, Lost Flush
Error                                Code
JET_errDiskIO                        -1022
JET_errReadVerifyFailure             -1018
JET_errPageNotInitialized            -1019
JET_errReadPgnoVerifyFailure         -1118
JET_errDiskReadVerificationFailure   -1021
JET_errCheckpointCorrupt             -533
JET_errMissingLogFile                -528
JET_errLogFileCorrupt                -501
JET_errInvalidPath                   -1023
JET_errInvalidSystemPath             -1024
JET_errInvalidLogDirectory           -1025
JET_errFileAccessDenied              -1032
JET_errFileInvalidType               -1812
JET_errLogCorrupted                  -1852
JET_errObjectNotFound                -1305
JET_errReadLostFlushVerifyFailure    -1119
JET_errDbTimeTooOld                  -566
JET_errDbTimeTooNew                  -567
Information
Some failure events are more important than others. Lost Flush events
signal that significant data corruption has occurred and that something is
very wrong with your storage (under no circumstances should you entertain
putting a system into production that is experiencing ANY lost flush
events during a test). However, some other IO Failures are relatively
normal. For example, in a JBOD environment we may see -1021
(JET_errDiskReadVerificationFailure). Although this signifies that the data
we read was not the same as the data we originally wrote (checksum failed),
Exchange will try to deal with this scenario via Page Patching in normal
operation, so it is not of critical importance.
For a full list of JET/ESE event types see the following article Extensible Storage
Engine Error Codes.
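When reading through test logs, a small lookup built from the codes listed in this section can save time. The sketch below is illustrative only: the code-to-name pairs are those given in this guide, while treating -1119, -566 and -567 as lost-flush related is this sketch's reading of the error names and of the guide's discussion of lost flushes, not an official grouping. The names ESE_ERRORS, LOST_FLUSH_RELATED and describe are hypothetical.

```python
ESE_ERRORS = {
    -1022: "JET_errDiskIO",
    -1018: "JET_errReadVerifyFailure",
    -1019: "JET_errPageNotInitialized",
    -1118: "JET_errReadPgnoVerifyFailure",
    -1021: "JET_errDiskReadVerificationFailure",
    -533: "JET_errCheckpointCorrupt",
    -528: "JET_errMissingLogFile",
    -501: "JET_errLogFileCorrupt",
    -1023: "JET_errInvalidPath",
    -1024: "JET_errInvalidSystemPath",
    -1025: "JET_errInvalidLogDirectory",
    -1032: "JET_errFileAccessDenied",
    -1812: "JET_errFileInvalidType",
    -1852: "JET_errLogCorrupted",
    -1305: "JET_errObjectNotFound",
    -1119: "JET_errReadLostFlushVerifyFailure",
    -566: "JET_errDbTimeTooOld",
    -567: "JET_errDbTimeTooNew",
}

# Assumption: these three codes are the ones tied to lost flushes.
LOST_FLUSH_RELATED = {-1119, -566, -567}


def describe(code: int) -> str:
    """Return a one-line description of an ESE error code from the table."""
    name = ESE_ERRORS.get(code, "unknown (not in this table)")
    flag = " [lost-flush related]" if code in LOST_FLUSH_RELATED else ""
    return f"{code}: {name}{flag}"


print(describe(-1021))
print(describe(-1119))
```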
A lost flush is a very insidious type of storage failure for a database engine
because the consequences can range from none (if we are very lucky) to nasty
and potentially undetectable logical database corruption (more likely).
Undetected lost flushes on the active copy may show up as a
JET_errDbTimeTooNew (-567) replication error on the passive copy. Undetected
lost flushes on the passive copy may show up as a JET_errDbTimeTooOld (-566)
replication error on the passive copy.
ESE has implemented lost flush detection, based on a flush map. Basically, every
time we issue a write on a page, we flip a bit on the actual page and also store
that bit in a flush map in memory. If we read the page again off the disk, we
check the bit against the in-memory flush map and if they don't match, it means
the flush was lost.
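The flush map idea can be illustrated with a toy model. This is not ESE's actual implementation or on-disk format; it is a minimal sketch of the comparison described above, with a dictionary standing in for the disk and a hypothetical lose_flush switch that simulates a write that never reaches the media.

```python
class ToyStore:
    """Toy model of a disk plus an in-memory flush map (not ESE's real format)."""

    def __init__(self):
        self.disk = {}       # page number -> (data, flush bit stored on the page)
        self.flush_map = {}  # page number -> flush bit the engine expects

    def write(self, page, data, lose_flush=False):
        bit = 1 - self.flush_map.get(page, 0)  # flip the bit on every write
        self.flush_map[page] = bit
        if not lose_flush:  # a lost flush never reaches the disk
            self.disk[page] = (data, bit)

    def read(self, page):
        data, bit = self.disk.get(page, (None, 0))
        if bit != self.flush_map.get(page, 0):
            raise IOError(f"lost flush detected on page {page}")
        return data


store = ToyStore()
store.write(7, "v1")
print(store.read(7))                     # the bits match, so the read succeeds
store.write(7, "v2", lose_flush=True)    # the storage silently drops this write
try:
    store.read(7)
except IOError as err:
    print(err)
```

With a single bit per page, two consecutive lost writes to the same page would cancel each other out in this toy; it is only meant to show the comparison between the on-page bit and the in-memory map.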
Important:
The bottom line for lost flushes is that you should NEVER put a system
into production that has recorded lost flushes during the Jetstress test.
You must be 100% certain that you have resolved the underlying problem
and have at least one good 24 hour test that has no lost flushes recorded
before accepting the solution into production.
DB IOPS Target  DB Read Latency  LOG Write Latency  Action
PASS            PASS             PASS               Test successful
FAIL            PASS             PASS
PASS            FAIL             FAIL
PASS            PASS             FAIL
PASS            FAIL             PASS
FAIL            FAIL             FAIL
FAIL            FAIL             PASS
FAIL            PASS             FAIL
Estimated thread count = TargetIOPS / 65
Example: 1000 / 65
Notes:
If in doubt, start with one thread and increase the count until the test
fails.
The exact quantity of IOPS generated per thread will change as the
storage system workload changes. As the storage system gets
closer to its performance limit, the IOPS per thread value will reduce.
Jetstress was designed to produce approximately 60 IOPS per thread
at 20 ms disk latency.
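As a rough illustration of the notes above, a starting thread count can be estimated by dividing the target IOPS by an assumed IOPS-per-thread figure. This is a sketch only: the function name `estimate_thread_count` is hypothetical, the default of 60 IOPS per thread is taken from the design figure quoted in the notes, and rounding up to a whole thread is an assumption. The real per-thread value varies with storage latency, so treat the result as a starting point for testing.

```python
import math


def estimate_thread_count(target_iops, iops_per_thread=60.0):
    """Estimate a starting thread count (hypothetical helper, not part of Jetstress).

    iops_per_thread defaults to the approximate design figure quoted in the
    notes (about 60 IOPS per thread at 20 ms latency). Rounding up is an
    assumption made for this sketch.
    """
    if target_iops <= 0 or iops_per_thread <= 0:
        raise ValueError("target_iops and iops_per_thread must be positive")
    return max(1, math.ceil(target_iops / iops_per_thread))


if __name__ == "__main__":
    for target in (250, 1000, 5000):
        print(target, "IOPS ->", estimate_thread_count(target), "threads")
```

If the test then fails on latency, reduce the thread count; if it passes with headroom but misses the IOPS target, increase it.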
10
Parameter                Example of Use           Description
help                     /?
Config                   /c JetstressConfig.xml
Generate                 /g
TimeOut                  /TimeOut 2H0M0S
Output                   /output c:\output
DBPath                   /dbpath m:\sg1\mdb
                         /dbpath n:\sg2\mdb
LogPath
PctCapacity              /pctcapacity 100         Specify capacity percentage
Throughput               /throughput 100          Specify throughput percentage
Threads                  /threads
DoNotRunDBMPerformance
RunDBMPerformance
                                                  recovery test
New                      /new
Open                     /open
Bak                      /bak
Recovery                 /recovery
Streaming
Transaction                                       Run transaction performance test
VerifyCheckSum
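When scripting test runs, the switches listed above can be assembled into an argument list rather than typed by hand. The following is a minimal sketch, with several assumptions: the helper name `build_jetstress_command` is hypothetical, the executable name `JetstressCmd.exe` is assumed to be the command-line binary on your installation, and it is assumed that these switches can be combined in a single invocation. Only switches that appear in the list above are used.

```python
def build_jetstress_command(config=None, timeout=None, output=None,
                            db_paths=(), exe="JetstressCmd.exe"):
    """Assemble an argument list from switches shown in the table above.

    Hypothetical helper: the executable name and the assumption that the
    switches can be combined freely should be checked against /? output.
    """
    args = [exe]
    if config:
        args += ["/c", config]          # configuration file
    if timeout:
        args += ["/TimeOut", timeout]   # for example 2H0M0S
    if output:
        args += ["/output", output]     # output directory
    for path in db_paths:
        args += ["/dbpath", path]       # one /dbpath switch per database path
    return args


if __name__ == "__main__":
    command = build_jetstress_command(
        config="JetstressConfig.xml",
        timeout="2H0M0S",
        output=r"c:\output",
        db_paths=[r"m:\sg1\mdb", r"n:\sg2\mdb"],
    )
    print(" ".join(command))
```

Returning a list rather than a single string keeps paths containing spaces intact if the result is later passed to a process launcher.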
11
12 Common Issues
12.1 Troubleshooting Jetstress
You may encounter some known issues while using Jetstress. This section
lists their possible causes and the recommended solutions.
12.1.1
Possible cause: Permissions are insufficient to access the .edb file or the
log files.
Solution: Verify that permissions are sufficient for the account under which
Jetstress is running. Jetstress requires read/write permission to the
directories it is using.
Possible cause: The last time Jetstress was run, it was ended uncleanly.
This caused the log files to become unsynchronized with the database.
Solution: Delete the Jetstress database (*.edb), log files (*.log), and check
file (*.chk), and re-create the Jetstress database. You can also use
Eseutil.exe with the /r switch to resynchronize the logs and database.
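Before deleting anything, it can help to list exactly which files would be affected. The sketch below is an illustration only: the helper name `find_jetstress_files` is hypothetical, and it assumes the database, log, and checkpoint files sit directly in the directory you pass in. It only lists files matching the three patterns named in the solution above and deletes nothing.

```python
from pathlib import Path

# File patterns named in the solution above: database, logs, checkpoint.
PATTERNS = ("*.edb", "*.log", "*.chk")


def find_jetstress_files(directory):
    """Return the sorted names of files in `directory` matching PATTERNS.

    Hypothetical helper for review purposes; it does not delete anything.
    """
    root = Path(directory)
    found = []
    for pattern in PATTERNS:
        found.extend(root.glob(pattern))
    return sorted(p.name for p in found)


if __name__ == "__main__":
    import sys

    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for name in find_jetstress_files(target):
        print(name)
```

Review the list, confirm that none of the files belong to a production database, and only then remove them and re-create the Jetstress database.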
12.1.2
Cause: When the counters are not loaded correctly, you may see
exception errors related to performance counters.
Solution: To reload the counters, exit from JetstressWin.exe. Locate the
directory where JetstressWin.exe was installed and verify that eseperf.dll,
eseperf.hxx, and eseperf.ini files exist in the directory. In a command shell
window, type the command unlodctr ESE and then press Enter. This will
12.1.3
This error indicates that Jetstress could not find appropriate parameters that
could be used to run a performance or stress test at the desired level of I/O load.
Cause: This can be caused by several factors. The most common reason is
that the storage subsystem has multiple hosts attached to it, and those
hosts are competing for common resources during the tuning process.
Solution: When you are running in a scenario such as this, you can run
Jetstress on a single host with tuning enabled to generate the appropriate
load parameters, and then rerun the test on the other hosts with the
Suppress Tuning option enabled and the tuning parameters entered
manually from the results of the first test.
12.1.4 Unable to mount databases due to invalid mount point configuration
When using mount points and running the Prepare phase of Jetstress, the
operation fails with the error "There is insufficient disk space on volume
<system drive>:\", where <system drive> is the drive letter that holds your
root mount folder.
Cause: This error means that one or more of the mount points is invalid or
the mount point folder path is not connected to its LUN. Some of the mount
points mapped to directories on the system volume are not properly
configured, so Jetstress sees an ordinary directory and checks free space
on the system drive itself rather than on the actual disk. Database
creation then fails, reporting that volume C: (or, in general, the system
volume) does not have enough space.
Solution: The mount path folder could be listed as <DIR> for a number of
reasons. Work through the following checks:
1. Verify the LUN is present and in good health.
2. Use the storage system array management software to verify that the
LUN has an assigned logical drive.
3. Using the Disk Management MMC, re-assign the LUN to the correct
mount point.
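The checks above can be complemented by a quick script that flags configured database or log folders that are not actually mount points, before the Prepare phase is run. This is a minimal sketch: the helper name `find_unmounted_paths` and the example paths are hypothetical, and it relies on Python's standard `os.path.ismount`, which should be verified against your Python version and operating system before it is trusted for volume mount points.

```python
import os


def find_unmounted_paths(paths, is_mount=os.path.ismount):
    """Return the paths that are NOT mount points.

    Hypothetical helper. `is_mount` defaults to os.path.ismount and can be
    replaced with another check if needed.
    """
    return [path for path in paths if not is_mount(path)]


if __name__ == "__main__":
    # Example folder names are placeholders; substitute your own mount folders.
    configured = [r"c:\mounts\db1", r"c:\mounts\db2"]
    for path in find_unmounted_paths(configured):
        print("Not a mount point (plain directory or missing):", path)
```

Any path reported here would be treated as an ordinary folder on the parent volume, which matches the insufficient disk space symptom described in this section.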