(12) United States Patent
Gasser et al.
(54) INSTALLING DATA STORAGE SYSTEM SOFTWARE ON DISK DRIVE SYSTEMS
(75) Inventors: Morrie Gasser, Hopkinton, MA (US); Matthew Ferson, Worcester, MA (US)
(73) Assignee: EMC Corporation, Hopkinton, MA (US)
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 554 days.
(21) Appl. No.: 11/725,346
(22) Filed: Mar. 19, 2007
(51) Int. Cl.
G06F 12/00 (2006.01)
us.cl. ‘puna; TLL; TANI
Field of Classification Search nuit,
Fini, 14
See application file for complete search history.
(10) Patent No.: US 8,037,243 B1
(45) Date of Patent: Oct. 11, 2011
Primary Examiner: Mardochee Chery
(74) Attorney, Agent, or Firm: Krishnendu Gupta; Jason A. Reyes; Deepika Bhayana
(57) ABSTRACT
Data storage system software is installed from nonvolatile
memory. A storage processor is booted, transferring information
stored in a nonvolatile memory module to a disk drive
system, thereby enabling the system processor to boot
directly from the disk drive system in subsequent boots. After
the information is transferred, the storage processor reboots
using the information transferred to the disk drive system.
17 Claims, 6 Drawing Sheets
[FIG. 1 (Sheet 1 of 6): block diagram showing host computer/servers, storage processors, front end controllers, nonvolatile memory modules, and SAS expanders.]
[FIG. 2 (Sheet 2 of 6): DPE chassis showing storage processor (SP) boards A and B with nonvolatile memory module, front end controllers, SAS controllers and SAS expanders, connections to the host computer/servers, and an interposer board carrying multiplexers (MUX) coupled to the disk drives.]
[FIG. 3 (Sheet 3 of 6): DAE chassis showing SAS expander boards A and B, each with a SAS expander and a management controller with display, IN and OUT expansion ports, and multiplexers (MUX) coupled to the disk drives.]
[FIG. 4 and FIG. 4A (Sheet 4 of 6): cabinet containing a DPE chassis and four DAE chassis, showing two alternative wirings of the IN and OUT ports on sides A and B, with connections to and from the host computer/servers.]
[FIG. 5 (Sheet 5 of 6): FLASH MEMORY MODULE PARTITION LAYOUT]
FLASH MEMORY MODULE 500 PARTITION LAYOUT
PARTITION 1 (BOOT) 510
PARTITION 2 540
ICA COMPRESSED FLARE PARTITION IMAGE
ICA COMPRESSED UTILITY PARTITION IMAGE
[FIG. 6 (Sheet 6 of 6): flow chart of the DPE power up sequence. Legible box labels, in order: memory checked for bootable image; SP boots from flash memory module and runs initializer program; signatures and Flare partitions on disk drive; transfers compressed image from flash module into memory on SP A; SP transfers decompressed image to disk drives (650); SP reboots (660).]
INSTALLING DATA STORAGE SYSTEM
SOFTWARE ON DISK DRIVE SYSTEMS
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to installing data storage system
software on disk drive systems.
2. Brief Description of Related Prior Art
As is known in the art, large mainframe computer systems
and data servers sometimes require large capacity data storage
systems. One type of data storage system is a magnetic
disk storage system. Here a bank of disk drives and the computer
systems and data servers are coupled together through
an interface. The interface includes storage processors that
operate in such a way that they are transparent to the computer.
That is, data is stored in, and retrieved from, the bank of
disk drives in such a way that the mainframe computer system
or data server functions as if it is operating with one mainframe
memory. One type of data storage system is a Redundant
Array of Inexpensive Disks (RAID) data storage system.
A RAID data storage system includes two or more disk drives
in combination for fault tolerance and performance. A RAID
data storage system is typically made up of a front-end data
processor (DPE) mated with a back-end storage disk array
enclosure (DAE). Typically the DPE boots up using information
which is preinstalled on disks of either the DPE or the
DAE. The pre-installed information conventionally needs to
be customized to the exact type, configuration and nature of
the DPE and the exact type, configuration and nature of the
disk drives chosen to form the RAID data storage system.
Consequently the disk drives which are preinstalled with
information for a specific DPE and disk drive configuration
are different from similar disk drives which are pre-installed
with information for a different DPE and disk drive configuration,
due to the difference in the information installed on the drives.
SUMMARY OF THE INVENTION
Data storage system software is installed from nonvolatile
memory. A storage processor is booted, transferring information
stored in a nonvolatile memory module to a disk drive
system, thereby enabling the system processor to boot
tional front end ports, for example port 48a, is connected to a
corresponding backend port 38a of the SAS expander 34a
disposed on a first one of the pair of storage processor printed
circuit boards, here STORAGE PROCESSOR BOARD A;
and a second one of the pair of bidirectional front end ports
48b is connected to a corresponding backend port 38b of the
SAS expander 34b disposed on a second one of the pair of
storage processor printed circuit boards, here STORAGE
PROCESSOR BOARD B.
As noted above, the DPE 14 includes a plurality of disk
drives 22a-22n. Each one of the disk drives is coupled to at
least one backend port 50a, 50b of a corresponding one of the
plurality of multiplexers 24a-24n. More particularly, if the
disk drive 22a-22n is a SAS disk drive having a pair of ports,
as shown in FIG. 2, the pair of ports is connected to the pair of
backend ports of the multiplexer; on the other hand, if the disk
drive is a Serial ATA (SATA) disk drive having a single port,
the single port is connected to only one of the pair of backend
ports of the multiplexer. The multiplexers are here active
multiplexers described in the above referenced pending
patent application, the subject matter thereof being incorporated
herein by reference.
The DPE 14 also includes a pair of management controllers
60, each one being disposed on a corresponding one of the
pair of storage processor printed circuit boards, here STORAGE
PROCESSOR BOARD A and here STORAGE PROCESSOR
BOARD B, as shown. A first of the pair of management
controllers 60, here the controller 60 disposed on
STORAGE PROCESSOR BOARD A, includes an additional
front end port 36a of the SAS expander 34a disposed on such
storage processor printed circuit boards and the second one of
the pair of management controllers 60 disposed on the STORAGE
PROCESSOR BOARD B is coupled to an additional
front end port 36b of the SAS expander 34b, as shown.
Monitors 62a, 62b, 62c, herein sometimes referred to as a
Vital Product Data (VPD), are disposed on the STORAGE
PROCESSOR BOARD A, STORAGE PROCESSOR
BOARD B and interposer board 44, respectively, as shown.
The monitors 62a, 62b, and 62c are coupled to the pair of
management controllers 60 on the STORAGE PROCESSOR
BOARDS A and B, as shown. Vital Product Data includes
information programmed by the factory into a "resume"
EEPROM on some Field Replaceable Units (FRU), generally
containing some unique information on each part such as a
World Wide Number and serial number. The term "VPD" is
often used to refer to the EEPROM itself. Here, there is a VPD
EEPROM on each STORAGE PROCESSOR BOARD A,
STORAGE PROCESSOR BOARD B and interposer board
44.
Referring now to FIG. 3, DAE 16 is shown to include a pair
of SAS expander printed circuit boards 64a, 64b, a pair of
SAS expanders 66a, 66b, each one being disposed on a
corresponding one of the pair of SAS expander printed circuit
boards 64a, 64b, each one of the pair of SAS expanders 66a,
66b has a bidirectional front end expansion port 68a, 68b,
respectively, and a bidirectional backend expansion port 70a,
70b, respectively.
Also included in DAE 16 is an interposer printed circuit
board 72. A plurality of, here twelve, multiplexers 74a-74n is
disposed on the interposer printed circuit board 72, each one
of the plurality of multiplexers 74a-74n includes (a) a pair of
bidirectional front end ports 76a, 76b; (b) a pair of bidirectional
backend ports 78a, 78b. For each one of the multiplexers
74a-74n a first one of the pair of bidirectional front end
ports, here port 76a, for example, is connected to a
corresponding one of backend ports 80a-80n of the SAS expander
66a and a second one of the pair of bidirectional front end
ports, here 76b for example, is connected to a corresponding
backend port of the SAS expander 66b as shown. The DAE 16
includes, as noted above, the plurality of disk drives 22'a-22'n,
each one being coupled to at least one backend port 78a,
78b of a corresponding one of the plurality of multiplexers
74a-74n. More particularly, if the disk drive 22'a-22'n is a
SAS disk drive having a pair of ports, as shown in FIG. 3, the
pair of ports is connected to the pair of backend ports of the
multiplexer; on the other hand, if the disk drive is a SATA disk
drive having a single port, the single port is connected to only
one of the pair of backend ports of the multiplexer. The
multiplexers are here active multiplexers described in the
above referenced pending patent application, the subject matter
thereof being incorporated herein by reference.
Referring again also to FIGS. 1 and 2, the expansion ports
40a, 40b of SAS expanders 34a, 34b are connected to the
bidirectional front end expansion ports 68a, 68b, respectively,
as shown. Thus, SAS expander 34a is connected to
SAS expander 64a through cable 130a and SAS expander 34b
is connected to SAS expander 64b through cable 130b. Thus,
referring to FIG. 1, data can pass between any one of the host
computer/servers 12a, 12b and any one of the here twenty
four disk drives 22a-22n and 22'a-22'n.
Referring again to FIG. 3, as with DPE 14 (FIG. 2) the
DAE 16 includes a pair of management controllers, each one
being disposed on a corresponding one of the pair of expander
printed circuit boards, a first of the pair of expansion board
management controllers being coupled to an additional front
end port of the SAS expander disposed on the first one of the
pair of expander printed circuit boards and a second one of the
pair of expansion management controllers being coupled to
an additional front end port of the SAS expander disposed on
the second one of the pair of expander printed circuit boards.
Further, as with the DPE 14, the DAE 16 includes monitors
62a, 62b, 62c having Vital Product Data (VPD) as well as
enclosure numerical displays.
Thus, the data storage system 10 (FIG. 1) may be further
expanded as shown in FIG. 4 in a cabinet here having four
DAEs 16 and a DPE 14. As noted above, here a DPE has up to
12 disk drives, and each one of the four DAEs has 12 disk
drives to provide, in this example, a data storage system
having up to 60 disk drives. Enclosures can be wired up in
various ways, two of which are shown in FIG. 4 and another
being shown in FIG. 4A. The connections between enclosures
consist of standard SAS signals and cables.
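The 60-drive figure follows directly from the enclosure counts in this example. A minimal sketch of that arithmetic, in Python, with hypothetical function and constant names that are not part of the described system:

```python
# Illustrative arithmetic only; the names below are hypothetical.
DRIVES_PER_ENCLOSURE = 12  # "up to 12 disk drives" per DPE or DAE in the example


def max_drives(num_dpe: int, num_dae: int,
               drives_per_enclosure: int = DRIVES_PER_ENCLOSURE) -> int:
    """Upper bound on the number of disk drives in a cabinet."""
    return (num_dpe + num_dae) * drives_per_enclosure


# One DPE plus four DAEs, as in the FIG. 4 example.
cabinet_total = max_drives(num_dpe=1, num_dae=4)
```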
Each one of the cables includes four SAS lanes so that at
any one instant in time, at most 4 messages can be going to 4
different drives, but successive messages can be sent to different
drives using the same SAS lane. Those 4 lanes are also
used to send traffic to drives on downstream expanders, so a
message can be sent on one of the input lanes, out one of the
4 output lanes to an input lane on the next box.
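To illustrate the four-lane limit, the sketch below computes the fewest transfer intervals needed to send a batch of messages over one cable. It rests on a simplifying assumption that is not stated in the description, namely that every message occupies one lane for exactly one interval; the names are hypothetical.

```python
import math

LANES_PER_CABLE = 4  # each cable carries four SAS lanes


def min_intervals(num_messages: int, lanes: int = LANES_PER_CABLE) -> int:
    """Fewest intervals needed when at most `lanes` messages move at once."""
    if num_messages < 0 or lanes <= 0:
        raise ValueError("num_messages must be >= 0 and lanes must be > 0")
    return math.ceil(num_messages / lanes)
```

Under this assumption, four messages fit in a single interval, while a fifth message must wait for a lane to be reused in the next interval.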
Here, in the DPE there are eight lanes between the translator
and the SAS controller; four SAS lanes between the pair
of SAS controllers; one SAS lane between each multiplexer
and a backend SAS port; and four lanes at each of the expansion
ports 40a, 40b. For each DAE there are four SAS lanes
between each one of the ports 70a, 70b and the connected one
of the pair of SAS expanders 64a, 64b, respectively, and one
SAS lane between each multiplexer and a backend SAS port.
The conventional manufacturing process preloads a boot
image directly on the disks, 20a-20n, as part of the manufacturing
process using an Image Copy Application (ICA) process.
The ICA provides a mechanism to load "virgin" copies
of operating system (OS) images to the appropriate regions of
the array's drives with a minimum of support hardware
required.
ICA images specific to the DPE and selected disk drive
combination are downloaded to the drives which are to be
configured into the DPE. ICA images are compressed to save
space and download time. This process creates disk drives
which are now customized parts mated to a particular DPE
technology. This avoids the need for distributors to stock a
multitude of preconfigured RAID data storage system combinations
representing all the unique DPE and disk drive type
configurations. It is preferable to allow for separate DPE and
non-customized disk shipments. This allows the RAID data
storage system to initialize its first four boot drives without
using conventional LAN-based ICA manufacturing tools.
One SP in each DPE includes a bootable flash memory module
containing an initializer program as well as compressed
ICA images to be written to the OS disks. The initialization of
the disk drives to include bootable information occurs upon
the first bootup, typically at the customer's site. This eliminates
the preloading of the disks as part of the manufacturing
process.
A customer can purchase a DPE chassis and standard
drives of various interfaces such as the 3.0 Gbit/s SATA
(SATA2) or SAS and of different storage capacities. Upon
initial installation the customer must populate the DPE with
disks, assuring that a minimum of first four drives are inserted
properly in the appropriate slots and are of the same type
(SAS or SATA2) and of the same storage capacity. Initially,
Extended Power On Self Test (POST) executing on the DPE
checks the drives for a bootable image. If there is no bootable
image, Extended POST boots the SP from the nonvolatile
memory module. In at least one implementation a flash
memory module is used as the nonvolatile memory module.
The preferred implementation uses M-Systems' uDiskOnChip
(uDOC™) as the flash memory module. The
uDOC™ is a flash memory storage device that uses a Universal
Serial Bus (USB) interface. Referring to FIG. 5, the
information 500 on flash memory module 90 includes partition
1 510, and partition 2 540. In an embedded system the
uDiskOnChip acts as a boot device filling the same role as an
IDE hard drive. Microsoft XP Embedded (XPe) 520 does not
have native support to boot from a USB device, but M-Systems
uDiskOnChip is delivered with XP Embedded components
that enable it to boot. These components are included in the
XPe OS data 520. The preferred environment uses Microsoft
XP Embedded (XPe); however, other implementations are
possible. The uDOC part was chosen because it provides the
industry's fastest OS boot and application load time. It
applies error detection and on-the-fly error correction, as well
as automatic bad block handling to map out bad blocks and
ensure that no data is lost.
A uDOCPan utility provided by M-Systems is used to
partition and format the uDiskOnChip. uDiskOnChip 500
can be divided into up to four partitions, where the first one
can be designated as a bootable drive 510. A proposed partition
table is shown in FIG. 5. The first bootable partition 510
is formatted with NTFS. It holds Windows XP embedded OS
520, with the initializer program 530 in its Startup directory.
The bootable XP software 520 contains minimal XP components
to save space in the flash memory module and to speed up
the booting process. Second partition 540 is also formatted
with a Microsoft Windows NTFS file system and holds two
images: Flare (data storage system operating system) Partition
550 and Utility Partition 560. The Flare Partition is
configured to be the bootable partition. Both the FLARE
Partition 550 and Utility Partition 560 images are stored in a
compressed ICA format.
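The partition layout described above can be written down as a small data structure. The following Python sketch is illustrative only: the class, field, and function names are hypothetical, and marking partition 2 as non-bootable on the flash module is an assumption drawn from the statement that the first partition is the bootable one.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Partition:
    number: int
    ref: int                 # reference numeral used in FIG. 5
    filesystem: str
    bootable: bool           # bootable on the flash module itself (assumed for 540)
    contents: List[str] = field(default_factory=list)


# Layout of the flash memory module 500 as described in the text.
FLASH_LAYOUT = [
    Partition(1, 510, "NTFS", True,
              ["Windows XP Embedded OS (520)", "initializer program (530)"]),
    Partition(2, 540, "NTFS", False,
              ["ICA-compressed Flare partition image (550)",
               "ICA-compressed Utility partition image (560)"]),
]

MAX_PARTITIONS = 4  # the module can be divided into up to four partitions


def boot_partition(layout: List[Partition]) -> Partition:
    """Return the single partition marked bootable on the flash module."""
    bootable = [p for p in layout if p.bootable]
    if len(layout) > MAX_PARTITIONS or len(bootable) != 1:
        raise ValueError("layout needs <= 4 partitions and exactly one boot partition")
    return bootable[0]
```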
FIG. 6 is a flow chart that describes the DPE power up
sequence. This sequence allows for the unattended transfer of
information to and configuring of disks 22a-22d as bootable
devices. In step 610 the SP A 20a powers up and runs Basic
Input/Output System (BIOS) and POST. In step 620 SP A
looks for the Flare signature on the appropriate drives 22a-22n
depending on the SP, and boots from the selected drive.
If SP A cannot find a Flare signature on the appropriate disk
drives 22a-22n, the BIOS/POST code in SP A looks for a
bootable image on the flash memory module, step 630. If a
bootable image is found on the flash memory module, the SP
proceeds to step 640 and boots from the flash memory module,
running an initializer program contained in the boot image. The
initializer mode runs required diagnostic checks on the
system configuration and hardware 692. Part of this diagnostic
validates that the system has the required number of disks,
and disk types and sizes installed. In case of errors the imaging
process will terminate with the error 698.
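The start of this sequence, together with the drive check performed by the initializer, can be sketched as follows. This is a minimal illustration in Python under stated assumptions: the `Drive` type, its capacity unit, the outcome strings, and both function names are hypothetical, and the check covers only the rule that the first four drives must match in type and capacity.

```python
from dataclasses import dataclass
from typing import List, Tuple

REQUIRED_BOOT_DRIVES = 4  # "a minimum of first four drives"


@dataclass
class Drive:
    kind: str          # "SAS" or "SATA2"
    capacity_gb: int   # unit chosen for illustration only


def validate_boot_drives(drives: List[Drive]) -> List[str]:
    """Return a list of problems with the first four drives (empty if none)."""
    boot = drives[:REQUIRED_BOOT_DRIVES]
    problems = []
    if len(boot) < REQUIRED_BOOT_DRIVES:
        problems.append("fewer than four boot drives installed")
    if len({d.kind for d in boot}) > 1:
        problems.append("boot drives are not all the same type")
    if len({d.capacity_gb for d in boot}) > 1:
        problems.append("boot drives are not all the same capacity")
    return problems


def power_up(flare_signature_on_disk: bool, image_on_flash: bool,
             drives: List[Drive]) -> Tuple[str, List[int]]:
    """Return (outcome, steps visited) for the start of the FIG. 6 sequence."""
    steps = [610, 620]                  # BIOS/POST, then look for a Flare signature
    if flare_signature_on_disk:
        return "boot from disk", steps
    steps.append(630)                   # look for a bootable image on flash
    if not image_on_flash:
        steps.append(698)               # no image anywhere: error status
        return "error", steps
    steps.append(640)                   # boot from flash and run the initializer
    if validate_boot_drives(drives):
        steps.append(698)               # diagnostics failed: imaging terminates
        return "error", steps
    return "install from flash", steps
```

For example, four matching SAS drives with no Flare signature and a bootable flash image yield `("install from flash", [610, 620, 630, 640])`, whereas removing one drive ends the sequence at step 698.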
If no bootable image is found on either the disk drives or the
flash memory module, the SP BIOS/POST proceeds to step
698, which sets an error status. In step 640, the initializer
program creates signatures and Flare Partitions on disks 22a-22d,
wiping out any data already on the disks. The initializer
program transfers encrypted/compressed partition images
550 and 560 from the flash module into memory on SP A in step
644. In step 650 the initializer program then decrypts/decompresses
the image in memory and writes it to the appropriate
places on disks 22a-22d. The SP copies the decompressed
image from the Flare partition image 550 and the decompressed
utility partition image 560 from system memory to
the disks 20a-20n, creating two partitions respectively on
each drive.
The initializer reboots both SP A 20a and SP B 20b, in step
660. BIOS POST runs on both SP A and SP B and proceeds to
booting Flare code which was moved to the appropriate drives
in steps 640, 644, 650. In step 670 Flare continues with its
normal initialization and checks to see whether there is a boot
image on the flash memory module. If there is a boot image on
the flash memory module the software proceeds to step 680
where the Flare code erases the boot image from the flash
memory module. Erasing the image has many benefits,
among which, but not limited to, are the ability to use this
flash memory module for other data uses, and preventing the