F 1 - VCS Hardware Troubleshooting & Linux OS

VCS Hardware Troubleshooting &

Boot Process
Vernon Depee
After this lesson, learners will be able to:
Recall the most common forms of hardware failure on the Cisco VCS
Determine if one of these hardware failures is taking place
Recall the important steps of the boot process on the Cisco VCS and use this knowledge to troubleshoot issues with VCS booting
Phase 1 - VCS Hardware
VCS Hard Drives
sda - Primary Drive. If this fails, the VCS will not boot at all. This is mounted to the root and /tandberg.
sdb - Secondary Drive. If this fails, the VCS will act "odd". This is mounted to /mnt/harddisk.
SDA Failure:
The VCS will never begin loading the OS. Something similar to this may be frozen on the console:
SDB failure:
SDB can fail in several ways:
1.) When the system boots, sdb will never be detected
2.) sdb will go unreachable after the system has booted
3.) sdb may be detected during system boot, but bounce back and forth between operating and not operating
Testing sdb: ls
ls -l /dev/sd*
brw-rw---- 1 root root 8, 0 2012-12-14 15:25 /dev/sda
brw-rw---- 1 root root 8, 1 2012-12-14 15:25 /dev/sda1
brw-rw---- 1 root root 8, 2 2012-12-14 15:25 /dev/sda2
brw-rw---- 1 root root 8, 3 2012-12-14 15:25 /dev/sda3
brw-rw---- 1 root root 8, 5 2012-12-14 15:25 /dev/sda5
brw-rw---- 1 root root 8, 6 2012-12-14 15:25 /dev/sda6
brw-rw---- 1 root root 8, 7 2012-12-14 15:25 /dev/sda7
brw-rw---- 1 root root 8, 8 2012-12-14 15:25 /dev/sda8
brw-rw---- 1 root root 8, 16 2012-12-14 15:25 /dev/sdb
brw-rw---- 1 root root 8, 17 2012-12-14 15:25 /dev/sdb1
brw-rw---- 1 root root 8, 18 2012-12-14 15:25 /dev/sdb2
Testing sdb: ls
ls -l /dev/sd*
brw-rw---- 1 root root 8, 0 2012-12-14 15:25 /dev/sda
brw-rw---- 1 root root 8, 1 2012-12-14 15:25 /dev/sda1
brw-rw---- 1 root root 8, 2 2012-12-14 15:25 /dev/sda2
brw-rw---- 1 root root 8, 3 2012-12-14 15:25 /dev/sda3
brw-rw---- 1 root root 8, 5 2012-12-14 15:25 /dev/sda5
brw-rw---- 1 root root 8, 6 2012-12-14 15:25 /dev/sda6
brw-rw---- 1 root root 8, 7 2012-12-14 15:25 /dev/sda7
brw-rw---- 1 root root 8, 8 2012-12-14 15:25 /dev/sda8
sdb was not detected by the VCS
Testing sdb: smartctl
~ # smartctl --all /dev/sdb
smartctl 5.40 2010-10-16 r3189 [x86_64-pc-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen,
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.12 family
Device Model: ST3250318AS
Serial Number: 5VY11DNS
Firmware Version: CC38
User Capacity: 250,059,350,016 bytes
Device is: In smartctl database [for details use: -P
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Fri Dec 14 15:28:01 2012 GMT
SMART support is: Available - device has SMART
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result:
Testing sdb: smartctl
~ # smartctl --all /dev/sdb
smartctl 5.40 2010-10-16 r3189 [x86_64-pc-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
Smartctl open device: /dev/sdb failed: No such device
Basically, anything other than PASSED is bad.
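The "anything other than PASSED is bad" rule can be scripted. The following is a minimal sketch, not part of the VCS software: the function name smart_health_ok is hypothetical, and it only inspects text in the format smartctl prints. A missing result line (for example the "No such device" case) counts as a failure.

```shell
# smart_health_ok: read smartctl output on stdin.
# Succeeds only if the overall-health line ends in PASSED.
# Hypothetical helper for illustration.
smart_health_ok() {
    grep 'overall-health self-assessment test result' |
        grep -Eq 'result: +PASSED[[:space:]]*$'
}
# Example: smartctl --all /dev/sdb | smart_health_ok || echo "sdb is not healthy"
```

Because the check is on the output text rather than on smartctl's exit code, it can also be run against smartctl output saved in a file.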
Testing sdb: df
~ # df | grep sdb
/dev/sdb2 230741396 1691148 217329224 1% /mnt/harddisk
~ #
Testing sdb: df
~ # df | grep sdb
~ #
sdb is not mounted
Testing Ethernet Ports
~ # ifconfig -a | grep eth
eth0 Link encap:Ethernet HWaddr 00:10:F3:19:A0:B4
eth1 Link encap:Ethernet HWaddr 00:10:F3:19:A0:B5
eth2 Link encap:Ethernet HWaddr 00:10:F3:19:A0:B6
eth3 Link encap:Ethernet HWaddr 00:10:F3:19:A0:B7
Testing Ethernet Ports
~ # ifconfig -a | grep eth
eth0 Link encap:Ethernet HWaddr 00:10:F3:19:A0:B4
eth1 Link encap:Ethernet HWaddr 00:10:F3:19:A0:B6
eth2 Link encap:Ethernet HWaddr 00:10:F3:19:A0:B7
All 4 Ethernet ports must show
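A minimal sketch of automating the "all 4 ports" check. The helper name count_eth_ports is hypothetical, and it assumes the classic ifconfig layout shown above, where each interface name starts a line.

```shell
# count_eth_ports: read `ifconfig -a` output on stdin and print the number
# of ethN interfaces (aliases such as eth0:1 are not counted).
# Hypothetical helper for illustration.
count_eth_ports() {
    grep -Ec '^eth[0-9]+[[:space:]]' || true
}
# Example: [ "$(ifconfig -a | count_eth_ports)" -eq 4 ] || echo "NIC port missing"
```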
Serial Number Range for Potential NIC Issues
If the serial number is in the range 52A19267-52A25510 for VCS and 55A00001-55A0001 for Conductor, the VCS is at risk for a failed NIC.
Exact serial numbers can be seen on the Excel doc at
Fan Issues
A VCS can run fine with 2 failed fans
Process RMA if requested by customer even if only one fan failure. Customer will get a refurbished VCS that may have more severe problems.
VCS running X5 and X6 will report a fan alarm even if the fan is now operating at a proper speed. Alarms can be manually reset.
Current VCS code will only raise fan alarm if 2 or more fans have failed.
VCS Generations
52A0 - Brage1
52A1, 52A2 - Brage2
Brage1 is quite old and has reached the expected HDD life for sdb
Phase 2 - VCS Boot Process
Init scripts
VCS Boot Process - Grub
The VCS boots off of grub on sda1. Grub will then have the VCS boot either off of sda5 (image1) or sda6
The active image is mounted onto /.
VCS Boot Process - inittab
/etc/inittab is referenced. /etc/inittab calls /etc/init.d/rc with the current run level.
/etc/init.d/rc has different groups of run levels
0: call /etc/rc.shutdown and halt system.
1-5: call /etc/rc.sysinit with the run level.
6: call /etc/rc.shutdown and reboot.
VCS Boot Process - rc.sysinit
/etc/rc.sysinit calls all of the startup scripts in /etc/init.d/. Scripts starting with E are called first; after this, scripts starting with S are called. Scripts are called in numerical order.
Example: E00bootlogd is called first and S99vmtoolsd is called last.
/etc/init.d/ scripts are called with an argument of either start, stop, or restart.
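The call order described above can be sketched as a small shell function. This is an illustration only, not the contents of /etc/rc.sysinit: the name boot_order is hypothetical, and it assumes every script name starts with E or S followed by a two-digit number, as in the examples.

```shell
# boot_order: read init script names on stdin (one per line) and print them
# in the order described above: E* scripts first, then S* scripts, each
# group sorted by its two-digit prefix. Hypothetical helper, not VCS code.
boot_order() {
    names=$(cat)
    printf '%s\n' "$names" | grep '^E[0-9][0-9]' | LC_ALL=C sort
    printf '%s\n' "$names" | grep '^S[0-9][0-9]' | LC_ALL=C sort
}
# Example: ls /etc/init.d | boot_order
```

Sorting as plain text is enough here because the numeric prefixes all have the same width.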
VCS Boot Process - Mounting of other Partitions
/etc/init.d/E26mount reads /etc/partitions.conf to get the location of the ro and rw partitions.
The script will then detect which partition is mounted as root and mount the appropriate rw partition on /tandberg.
Ro partition 1 is /dev/sda5
Rw partition 1 is /dev/sda7
Ro partition 2 is /dev/sda6
Rw partition 2 is /dev/sda8
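The ro/rw pairing above amounts to a simple lookup. The sketch below hard-codes the four partitions listed on this slide; the real E26mount script reads them from /etc/partitions.conf, whose format is not shown here. The function name rw_partition_for is hypothetical.

```shell
# rw_partition_for: print the rw partition that pairs with the given
# ro (root) partition, using the pairs listed above.
# Returns 1 for any other device. Hypothetical helper for illustration.
rw_partition_for() {
    case "$1" in
        /dev/sda5) echo /dev/sda7 ;;
        /dev/sda6) echo /dev/sda8 ;;
        *) return 1 ;;
    esac
}
# Example: rw_partition_for /dev/sda5
```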
VCS Boot Process - Clusterdb
With /etc/init.d/S66clusterdb the VCS loads the cluster database, which stores the configuration of the VCS and can replicate the configuration to cluster peers (if there are any).
If, for any reason, clusterdb fails to start, it can take several minutes for this process to report a failure. Since the VCS will not move forward in the boot process until it receives feedback from the current stage in the boot process, this may look like a frozen VCS.
VCS Boot Process - /tmp/hwfail
The /etc/init.d/S75tandberg script, which launches the tandberg application, checks to see if a file named /tmp/hwfail exists. If it does, the tandberg application will not start and a message will be printed to the console saying that /tmp/hwfail exists and that the app will not start.
# Don't start the image if the hardware is broken
if [ -f /tmp/hwfail ]; then
    echo "/tmp/hwfail exists: TANDBERG application startup inhibited"
    do_log "Event=\"Hardware Failure\" Detail=\"Application startup
    exit 0
VCS Boot Process - /tmp/hwfail
Luckily, if the /tmp/hwfail file exists, it will tell you why the file exists. The contents of this file can be read with the command:
cat /tmp/hwfail
The output of the cat command should tell you what section of the hardware has failed.
VCS Boot Process - Useful
/etc/init.d/S10network brings up the network interfaces on the VCS.
/etc/init.d/S11dnsmasq starts the DNS masquerade.
/etc/init.d/S76opends calls OpenDS on TMSAgent
/etc/init.d/S80httpd calls apache.
Go ahead and ls /etc/init.d to see a complete list of what's going on here.
Live Lookup of Boot Process
The End!
© 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential
All About Snapshots
Alan Ford

After this lesson learners will be able to:
Collect system snapshots from a Cisco VCS
Analyze key portions of a full system snapshot
Types of Snapshots
Status snapshot
Configuration and Status XML
Generated from clusterdb: More complete than "xconf" and "xstat" output
Aside: note difference between clustering and clusterdb
Can also get from http://vcs/configuration.xml etc
Logging snapshot
Last two instances of key logs
Rarely useful - diagnostics logs are typically more useful when replicating a
Full snapshot
What you typically want - all logs stored, and other useful information
Layout of snapshot
Most useful things are under mnt/harddisk/snapshot/plugins/
Also of note:
tandberg/etc and tandberg/persistent: contains the contents of these configuration directories, so you can check for configuration errors
XML Representations of system configuration and status
Equivalent to doing xconf / xstat / etc
Includes additional config no longer present in xconf
System configuration. Includes version, option keys, etc.
System status. Of particular interest: ResourceUsage, Registrations
Recent detailed call and search history
oak_crashlogs (pre-X7.2) / crashlogs
.tar.gz file of recent crashes
Note any process on the box can crash - some serious, some not
VCS software ("app") will automatically restart, but call and registration state will be lost
"net" is harmless
As are "linuxstatus.py" or "managementframework.py" (prior to X7.2)
"winbindd" could necessitate a restart
Look inside crash XML dumps to see if they are repetitive
Load into https://10.50.152.110 to see if they're known
Please check the output makes sense based on symptoms/logs!
If in doubt, please do check with us!
Textual dumps of the ClusterDB tables
Those with a serial number suffix: sharded (peer-specific) records
i.e. records under complete control of that peer
Non-sharded (global) tables
Any peer can edit
Status is typically sharded
All logs in logs/ subdirectory
Prior to X7: you need to untar the logs.tar.gz file too
Logs are on a single partition on the VCS - not version-specific
Logs upgrades on the box. Allows correlation of upgrade transforms, reboots, service and load changes, etc.
Logs all configuration changes. Very useful.
Logs web server accesses and errors, including TMS probes.
harddisklogs (2)
Very useful - logs every startup and shutdown
You can see uncontrolled shutdowns here (i.e. a startup followed by another startup) - box would have been hard power cycled
You can also see clusterdb, opends, etc failing to start.
Output from the kernel. E.g. useful to look for hard disk errors.
Critical errors from other logs. Not actually particularly useful.
daemon.log (previously samba.log)
Output from system daemons, notably ntpd (time synchronisation) and racoon (IPSec for clustering)
Clean reboot:
Mon Nov 19 14:14:49 UTC 2012 system shutdown completed Linux csn-hen-vcsd1 2.6.39.4 #1 SMP Thu Jul 12 11:36:44 BST 2012 x86_64 GNU/Linux
Mon Nov 19 09:15:02 EST 2012 system initialisation started Linux (none) 2.6.39.4 #1 SMP Thu Jul 12 11:36:44 BST 2012 x86_64 GNU/Linux
Unclean reboot:
Mon Nov 19 14:43:59 EST 2012 system initialisation complete
Mon Nov 26 09:38:35 EST 2012 system initialisation started Linux (none) 2.6.39.4 #1 SMP Thu Jul 12 11:36:44 BST 2012 x86_64 GNU/Linux
Sun Jun 5 02:36:24 GMT 2011 system restart complete
Failed service startups:
Sat Jun 4 23:10:44 PDT 2011 S66clusterdb startup Failed!
Mon May 21 14:09:28 GMT 2012 S76opends startup Failed!
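The "startup followed by another startup" pattern can be searched for mechanically. The sketch below is an assumption-laden illustration: the function name find_unclean_boots is hypothetical, and the phrases it matches ("system shutdown completed", "system initialisation started") are taken from the sample entries above, so other software versions may word them differently.

```shell
# find_unclean_boots: read a sysinit-style log (file argument or stdin) and
# print every "initialisation started" entry that was not preceded by a
# "shutdown completed" entry since the previous start.
# Hypothetical helper; the matched phrases come from the samples above.
find_unclean_boots() {
    awk '
        /system shutdown completed/     { clean = 1; seen = 1; next }
        /system initialisation started/ { if (seen && !clean) print; clean = 0 }
        /system initialisation/         { seen = 1 }
    ' "$@"
}
# Example: find_unclean_boots sysinit.log
```

The first start entry in a log is never reported, because nothing is known about what happened before it.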
Hard disk errors:
Oct 31 19:49:43 vcs01 kernel: ata4: lost interrupt (Status 0x50)
Oct 31 19:49:43 vcs01 kernel: ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x4050002 action 0xe frozen
Oct 31 19:49:43 vcs01 kernel: ata4: SError: { RecovComm PHYRdyChg CommWake DevExch }
Oct 31 19:49:43 vcs01 kernel: ata4.00: failed command: WRITE DMA
Oct 31 19:49:43 vcs01 kernel: ata4.00: cmd ca/00:08:11:05:68/00:00:00:00:00/ea tag 0 dma 4096 out
Oct 31 19:49:43 vcs01 kernel: res 40/00:01:09:4f:c2/00:00:00:00:00/00 Emask 0x14 (ATA bus
Oct 31 19:49:43 vcs01 kernel: ata4.00: status: { DRDY }
Oct 31 19:49:43 vcs01 kernel: ata4: hard resetting link
Oct 31 19:49:43 vcs01 kernel: ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Oct 31 19:49:43 vcs01 kernel: ata4.00: configured for UDMA/133
Oct 31 19:49:43 vcs01 kernel: ata4: EH complete
Ethernet link down
Aug 3 11:46:03 cisco kernel: e1000e: eth0 NIC Link is Down
Probable memory issue
Jun 1 02:48:10 vcsc kernel: EXT2-fs (loop18): error: unable to read superblock
Jun 1 02:48:10 vcsc kernel: FAT: unable to read boot sector
Jun 1 02:48:10 vcsc kernel: FAT: unable to read boot sector
Jun 1 02:48:10 vcsc kernel: isofs_fill_super: bread failed, dev=loop18, iso_blknum=16,
Some kernel panics (attaching monitor is best)
harddisklogs - sensors
Very, very useful. Data polled every 10 minutes.
<date>Tue May 1 06:11:06 BST 2012</date>
<df> = Disk Usage
/dev/sda5 1004496 360156 593312 38% /
/dev/ram0 193687 53898 129789 30% /var
/dev/ram1 1470144 10316 1385128 1% /tmp
/dev/sda7 1004496 510568 442900 54% /tandberg
/dev/sdb2 230741396 17203248 201817124 8% /mnt/harddisk
<inodes> = i-node usage
/dev/sda5 63872 9330 54542 15% /
/dev/ram0 50000 96 49904 1% /var
/dev/ram1 93696 1763 91933 2% /tmp
/dev/sda7 63872 3122 60750 5% /tandberg
/dev/sdb2 29310976 2111 29308865 1% /mnt/harddisk
Full Disks?
100% df or inode usage can cause all sorts of odd behaviour including crashes, no web/CLI access, and odd failures
May see "no space left on device" in crashes
Some common scenarios…
Full df /tandberg?
Anything unexpected in /tandberg
Remember to always cd /mnt/harddisk before running tcpdumps!!
Full df /mnt/harddisk?
Check out /mnt/harddisk/log for excessive OpenDS logs
Full inodes in /tmp?
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/ram1 93696 93696 0 100% /tmp
Probably OpenDS failing to start. Check sysinit.log and /tmp contents
Run tmsagent_destroy_and_purge_data (reboot first)
…And ideally, upgrade to TMS PE!
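Spotting a full filesystem in df or inode output is easy to automate. A minimal sketch, assuming the column layout shown above (use percentage in column 5, mount point in column 6); the function name full_filesystems is hypothetical.

```shell
# full_filesystems: read `df` or `df -i` output on stdin and print the mount
# point of every filesystem whose use column is at or above the limit
# (default 100). Header lines are skipped because their fifth column is
# not a bare percentage. Hypothetical helper for illustration.
full_filesystems() {
    limit=${1:-100}
    awk -v limit="$limit" '
        $5 ~ /^[0-9]+%$/ {
            use = $5
            sub(/%/, "", use)
            if (use + 0 >= limit) print $6
        }'
}
# Example: df -i | full_filesystems
```

Passing a lower limit, for example `full_filesystems 90`, gives early warning before the disk is actually full.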
Hard Drive Failure
Kernel log errors normally seen
Also worth checking <smartctl> in sensors:
SMART Error Log Version: 1
No Errors Logged
If anything says "ERROR", it's most likely dead
Also, any very large jumps (NOT the absolute value, but the change) in seek errors would mean slow response, and repeated occurrences of such a jump = high likelihood of imminent failure
7 Seek_Error_Rate 0x000f 084 060 030 Pre-fail Always - 262400221
7 Seek_Error_Rate 0x000f 084 060 030 Pre-fail Always - 262803563
harddisklogs - sensors (2)
Fan 1: 10546 RPM (min = 7670 RPM, div = 8)
Fan 2: 10546 RPM (min = 7670 RPM, div = 8)
Fan 3: 10546 RPM (min = 7670 RPM, div = 8)
Sys Temp1: +35.0 C (high = +45.0 C) sensor = thermistor
Sys Temp2: +32.0 C (high = +45.0 C) sensor = thermistor
CPU Temp: +35.0 C (high = +50.0 C) sensor = thermal diode
eth0 Link encap:Ethernet HWaddr 00:10:F3:0F:5F:38
inet addr:10.50.164.11 Bcast:10.50.164.127 Mask:255.255.255.128
inet6 addr: fe80::210:f3ff:fe0f:5f38/64 Scope:Link
inet6 addr: 2001:420:4:ea:8::164:11/64 Scope:Global
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:13222992 errors:0 dropped:0 overruns:0 frame:0
TX packets:13522239 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:5676051031 (5413.1 Mb) TX bytes:6576901582 (6272.2 Mb)
Interrupt:18 Memory:fdae0000-fdb00000
Fan and NIC Failures
Fan failures
In X7.2.1+ we will only raise an alarm if two or more fans have failed.
Testing has shown the VCS can fully operate with one failed fan.
We also have a high temperature alarm as a safety net.
Single fan failures pre-X7.2.1 can be RMA'd if customer insists, but please point to CSCud24211 first - it is safe to operate with one fan failure and alarm can be ignored. Check out sensors to make sure only one fan has failed.
NIC failure
If a VCS is uncommunicative, check out the number of eth devices presented in kernel log
"ifconfig -a | grep eth" should show four devices
CSCua4749 documents the batch of VCSs with potential LAN faults
Route misconfiguration
If only certain boxes are reported unavailable, do check out the routing table…
harddisklogs - sensors (3)
Destination Gateway Genmask Flags Metric Ref Use Iface
default 10.50.164.1 0.0.0.0 UG 0 0 0 eth0
10.50.164.0 * 255.255.255.128 U 0 0 0 eth0
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program
tcp 0 0 10.50.164.11:5061 10.54.26.2:22992 ESTABLISHED 13612/app
tcp 0 0 10.50.164.11:22104 10.50.161.49:5061 ESTABLISHED 13612/app
11:24:33 up 3 days, 14:29, 2 users, load average: 0.00, 0.04, 0.05
<ps> = what programs are running, and since when
root 13565 0.0 0.0 10984 1220 ? S May24 0:00 /bin/sh /sbin/tandberg start
root 13612 2.7 9.1 720856 368892 ? Sl May24 143:45 \_ /tandberg/images/app
harddisklogs - sensors (4)
<top> = processes ordered by CPU usage, & resource stats
top - 11:24:34 up 3 days, 14:29, 2 users, load average: 0.00, 0.04, 0.05
Tasks: 149 total, 1 running, 148 sleeping, 0 stopped, 0 zombie
Cpu(s): 4.1%us, 2.2%sy, 0.3%ni, 93.2%id, 0.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 4044584k total, 3994964k used, 49620k free, 103884k buffers
Swap: 9225516k total, 1428k used, 9224088k free, 1667680k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13612 root 20 0 703m 360m 19m S 8 9.1 143:45.20 /tandberg/images/app
<proc_meminfo> = lots of useful memory stats, particularly:
Committed_AS: 3212248 kB
NB: Output from all sensors log modules - sometimes in more detail - at the time the snapshot was taken is in the plugins/sysmonitor directory in a snapshot
Linux Memory Management
Numbers can be very misleading. Top output:
Mem: 4044584k total, 3994964k used, 49620k free, 103884k buffers
Swap: 9225516k total, 1428k used, 9224088k free, 1667680k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13612 root 20 0 703m 360m 19m S 8 9.1 143:45.20 /tandberg/images/app
Used memory: this is not important. A Linux box will attempt to use all memory - that which is not used will be used to cache (see "cached").
That which is in use is roughly SUM(RESident)
Swap: once all memory has been used, and most caching is eliminated, Linux will start to swap. This can be bad.
Committed_AS: roughly what the box needs to use to service all applications' memory requests (roughly SUM(VIRTual))
<proc_meminfo> gives more caching data too.
Load Issues
If Committed_AS goes significantly over 4GB (about 4500000), the VCS will likely start swapping. Swapping leads to the VCS becoming less responsive. This is indicative of a need to reduce load before more problems persist.
Load average: if regularly exceeding 2, things are worrying.
VCS "app" CPU usage %. Check <top> over time. If regularly exceeding 50%, things are worrying.
Re-registration intervals
SIP in particular is very heavy on messaging
If the VCS is a Control, it is safe - and recommended - to increase the refresh intervals to 1800/3600 or even 3600/3600, depending on how busy the VCS is.
Search rule optimization to reduce search load
Try not to use AnyAlias if at all possible; do tailored, intelligent search rules
Memory leaks
Most known issues fixed in X7.1, a few more in X7.2, and one about intra-cluster searches fixed in X7.2.2
High intra-cluster latency causes partitioning/re-clustering which is load-
OpenDS: If OpenDS is in use, move to TMS PE!
Unnecessary high logging levels (snapshot full of DEBUG?)
Unbalanced load across cluster (=> use DNS SRV)
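The Committed_AS and load average thresholds at the start of this slide can be checked against a single sensors sample. The sketch below is hypothetical (both function names are made up for illustration) and it looks at one sample only, whereas the guidance above is about values that are regularly exceeded, so treat one hit as a prompt to look at the trend.

```shell
# committed_as_kb: read /proc/meminfo-style text on stdin and print the
# Committed_AS value in kB. Hypothetical helper for illustration.
committed_as_kb() {
    awk '$1 == "Committed_AS:" { print $2 }'
}

# load_warnings: arguments are Committed_AS in kB and a load average.
# Prints one line per threshold from the slide that the sample exceeds.
load_warnings() {
    awk -v cas="$1" -v load="$2" 'BEGIN {
        if (cas + 0 > 4500000) print "Committed_AS over about 4500000 kB: swapping likely"
        if (load + 0 > 2)      print "load average over 2"
    }'
}
# Example: load_warnings "$(committed_as_kb < /proc/meminfo)" "$(cut -d" " -f1 /proc/loadavg)"
```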
harddisklogs (3)
Errors from the Erlang runtime, which runs the ClusterDB.
Search for "CRASH" or "ERROR".
Gives indication if "mnesia" is overloaded - if so all sorts of crashes and unresponsiveness and dropped registrations can happen. Will typically happen when box is heavily loaded.
Useful for debugging OpenDS startup issues. Regular occurrence:
OpenDS's configuration file gets corrupted (fixed in X7.1)
Fails to start
Gets stuck in restart loop
Fills /tmp with files (fixed in X7.2), until web server stops responding
Fix with tmsagent_destroy_and_purge_data
Or upgrade to TMS PE!
harddisklogs - developer_log
Developer-oriented logging. Useful place to look for ERRORs.
2012-05-24T21:25:42+01:00 usc01-vcs1 UTCTime="2012-05-24 20:25:42,122" Module="developer.clusterdb.clustermanager" Level="INFO" Node="clusterdb@10.50.164.11" Detail="Received erlang node up event" Node="clusterdb@10.50.165.11" State="undefined"
But don't be misled:
May 1 05:48:29 gbsyca1vg001 tvcs: UTCTime="2012-05-01 04:48:29,223" Module="developer.ssl" Level="ERROR" CodeLocation="ppcmains/ssl/ttssl/ttssl_openssl.cpp(67)" Method="::TTSSLErrorOutput" Thread="0x7fa51ea49700": TTSSL_continueHandshake: Failed to establish SSL connection
Merging data in clusterdb
See how long things take - system will be suffering during this. If lots of node up/node down events seen, that is indicative of network issues (including >30ms latency between peers if no significant box load is seen)
Useful to compare with other peer snapshots too
Alarms, replication, etc, etc
harddisklogs - messages
messages = events, e.g.
2012-05-20T10:49:48+01:00 usc01-vcs1 tvcs: Event="Registration Requested" Service="H323" Src-ip="10.50.162.52" Src-port="1719" Src-alias-type="H323" Src-alias="alanford.ex90@usc01.cisco.com" Protocol="UDP" Level="1" UTCTime="2012-05-20 09:49:48,524"
Messages include:
Registration [ Requested | Accepted | Rejected ]
Search [ Attempted | Completed | Cancelled ]
Source Aliases Rewritten
Call [ Attempted | Connected | Rejected | Disconnected ]
Message [ Sent | Received ] (H.225 Signalling)
Request [ Sent | Received ] (SIP Signalling)
harddisklogs - network_log
Logs actual network messages. Example H.225:
2012-05-28T13:59:51+01:00 usc01-vcs1 tvcs: UTCTime="2012-05-28 12:59:51,410" Module="network.h323" Level="INFO": Dst-ip="10.50.161.23" Dst-port="1720" TM Detail="Sending H.225 Setup H.323v6 Bandwidth:768kbps DestAlias:hallam2.ex90@usc01.cisco.com DestCSAddr: ['IPv4''TCP''10.50.161.23:1720'] Vendor:TANDBERG"
Example SIP. Trace with Call-ID:
2012-05-28T16:00:19+01:00 usc01-vcs1 tvcs: UTCTime="2012-05-28 15:00:19,232" Module="network.sip" Level="INFO": Src-ip="10.50.165.15" Src-port="25053" Detail="Receive Request Method=INVITE, Request-URI=sip:usc01-15edgbaston@cisco.com, Call-
2012-05-28T16:00:19+01:00 usc01-vcs1 tvcs: UTCTime="2012-05-28 15:00:19,245" Module="network.sip" Level="INFO": Dst-ip="10.50.165.15" Dst-port="25053" Detail="Sending Response Code=100, Method=INVITE, To=sip:usc01-15edgbaston@cisco.com, Call-
Note: unified_log = network + messages + developer_log
Diagnostic Logging
Provides a simple way to get protocol logs and other pertinent log messages from the VCS
Select Maintenance > Diagnostics > Diagnostics Logging
Replaces "netlog" from X7.0 onwards
Set "network" to DEBUG (equivalent to "netlog 2")
Also set "interworking" to DEBUG unless you're sure no interworking is going on
Will affect what is in snapshot (log is taken from unified_log)
Logging is started/stopped across whole cluster, but must be downloaded
New Snapshot Features in X7.2
Alarms folder logging active alarms
Sensors log now logs "appstats" - the ResourceUsage output (calls and registrations in use at that point in time)
<ResourceUsage item="1">
<Calls item="1">
<Traversal item="1">
<Current item="1">0</Current>
<Max item="1">?</Max>
<Total item="1">680</Total>
<NonTraversal item="1">
<Current item="1">0</Current>
<Max item="1">8</Max>
<Total item="1">1099</Total>
<Registrations item="1">
<Current item="1">?7</Current>
<Max item="1">89</Max>
<Total item="1">?92</Total>
Tips, tricks and common issues…
Working with Diagnostic Logs
Tracing SIP call
Look for the INVITE
Then trace the Call-ID
Beware of forking
Debugging Interworking
OpenLogicalChannel (OLC) only sent on media
Use the "LCList" output for monitoring logical channels
"Status snapshot" is more complete than xstat/xconf
Resetting Stuff
Emergency command-line backups:
touch /tmp/request/system-backup
Check for /tmp/backup-restore-complete
Find it in /mnt/harddisk/backuprestore/
NB: private key not backed up; do this separately
xcom DefaultValuesSet 2 + xcom DefaultLinksAdd
Files in /mnt/harddisk/factory-reset/
tandberg-image.tar.gz + rk
NB will not be present on a non-upgraded !
Command-line snapshot
Run "snapshot.sh", find in /mnt/harddisk/snapshot/
Manual upgrade (scp to /tmp/tandberg-image.tar.gz + /tmp/release-key)
Should only be used for hardware failure
See earlier diagnosis of some issues
See RMA Guide on Cisco website (VCS Docs/Troubleshooting):
Factory reset script (log in as root, run "factory-reset") reinstalls the last valid image; suitable for many software issues
Security Vulnerabilities
"VCS Fails scan for reason XXX"
Check out grwiki list for fixes in later VCS versions:
There are a number of false positives (e.g. PHP CGI, WebDAV)
If you have a CVE number for a third party application (e.g. Apache, openssl), check out the report e.g.:
This will typically say which versions are vulnerable
Then check against the version of software running on the most recent VCS version
For certificate errors, familiarise and point towards VCS Certificate Creation and Use Guide (Cisco.com, VCS Configuration Guides):
Checking versions of software
Log into VCS as root (examples from X7.2.1)
~ # /apache2/bin/httpd -version
Server version: Apache/2.4.2 (Unix)
Server built: Sep 21 2012 09:55:19
~ # openssl
OpenSSL> version
OpenSSL 1.0.1c 10 May 2012
~ # sshd -v
sshd: illegal option -- v
OpenSSH_5.9p2, OpenSSL 1.0.1c 10 May 2012
~ # php --version
PHP 5.3.15 with Suhosin-Patch (cli) (built: Sep 21 2012 10:04:36)
Copyright (c) 1997-2012 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2012 Zend Technologies
Linux kernel:
~ # uname -a
Linux X082 2.6.39.4 #1 SMP Fri Sep 21 10:48:47 BST 2012 x86_64
"VCS is Unresponsive"
Can mean a lot of things
Just speed of access? Check out load on box.
Network problems? Or crashes?
Before restarting an unresponsive VCS, attach a keyboard/monitor to see if there is a kernel panic etc. If so, RMA is appropriate.
Snapshot logs can tell a lot:
Kernel logs: network interfaces up/down?
Apache logs: TMS or other communications still functioning?
Unified logs: any noticeable gaps in the logs?
Sysinit logs: clean/unclean shutdowns etc
Sensors logs: uptime etc
License Alarms
Raised when license usage hits 90% of a limit.
Alerting that at some point since the last restart this limit has been hit. It is a courtesy to inform the customer in case service is affected and they wish to purchase more licenses.
The alarm is not lowered when usage reduces, since usage may only occur briefly. However, the customer can acknowledge this alarm.
There are two different errors listed, which are subtly different:
Capacity warning - The number of concurrent traversal calls has approached the unit's physical limit
Capacity warning - The number of concurrent non-traversal calls has approached the licensed limit
The "unit's physical limit" refers to 100 Traversal Calls, or 500 Non-Traversal calls. The "licensed limit" refers to the number of licenses the customer has in their cluster.
Bear in mind that a call license is allocated as soon as a call is attempted, so even calls to unknown or busy users can temporarily eat a
Customers should deploy cluster load balancing (e.g. DNS SRV)
Other common issues
Stuck calls in status, pre-X7.2
Call status sync between "xstat" and what was seen on the web had issues before X7.2 - rewritten in X7.2 and issues resolved (CSCtr28842)
Upgrades losing data (FindMe accounts, SNMP group, etc)
Non-ASCII character issue. Fixed in X7.2.2.
Out of Search Resources
A lot can go wrong if searches start to be dropped. Check out "xstatus zones searches" (or equivalent in status.xml). Optimize the search rules!
<Status> <Zones>
<Current item="1">1</Current>
<Total item="1">10024</Total>
<Dropped item="1">0</Dropped>
</Zones> </Status>
Upgrade issues
Remember upgrades from pre-X6 must go to X6.1 before going to X7+
