
TROUBLESHOOTING Q&A

FOR TRINITY CHIPSET


Steven Wong

27-Sept-2011
PACKET FORWARDING QUICK REVIEW

2 Copyright © 2010 Juniper Networks, Inc. www.juniper.net


1. WHERE IS THE NEXTHOP STATISTICS ?
By default, the nexthop statistics counter is enabled only for
MPLS-related nexthops:
MPLS
IPv4->MPLS
MPLS->IPv4
…etc

For any other nexthop, the counter must be enabled manually:


debug nh hw count <NHID>



1. WHERE IS THE NEXTHOP STATISTICS ?
TAZ-TBB-0(cheese vty)# show nhdb id 766 detail
ID Type Interface Next Hop Addr Protocol Encap MTU Fla
----- -------- ------------- --------------- ---------- ------------ ---- --------
766 Unicast ge-1/1/2.0 10.1.3.3 IPv4 Ethernet 1500 0x000000

Dram Bytes : 304


PreComputed MTU: 0
Flags : 0x0
Tunnel ID : 0x0
Parent NHID : 0
Feature List: NH
[pfe-0]: 0x1003a0b900d09167;
f_mask:0x00080000; c_mask:0x80000000; f_num:14; c_num:1, inst:0
Idx#12 ucast:
[pfe-0]: 0x1003a0b900d09167



1. WHERE IS THE NEXTHOP STATISTICS ?
TAZ-TBB-0(cheese vty)# debug nh hw count 766
TAZ-TBB-0(cheese vty)# show nhdb id 766 detail
ID Type Interface Next Hop Addr Protocol Encap MTU Fla
----- -------- ------------- --------------- ---------- ------------ ---- --------
766 Unicast ge-1/1/2.0 10.1.3.3 IPv4 Ethernet 1500 0x000000

Dram Bytes : 304


PreComputed MTU: 0
Flags : 0x0
Tunnel ID : 0x0
Parent NHID : 0
Feature List: NH
[pfe-0]: 0x08d11a5000010000;
f_mask:0x00480000; c_mask:0xc0000000; f_num:14; c_num:2, inst:0
Idx#9 counter:
[pfe-0]: 0x2ffffffc0006a200

Idx#12 ucast:
[pfe-0]: 0x1003a0b900d09167
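
The two outputs above differ in f_mask (0x00080000 versus 0x00480000) and in the extra "Idx#9 counter" entry. One reading that fits the feature lists shown in this deck is that bit (31 - idx) of f_mask marks feature index idx as present. This is inferred from the captured outputs only, not from the JNH source, so the Python sketch below should be read as an assumption; the helper name feature_indices is made up for illustration.

```python
def feature_indices(f_mask: int, width: int = 32) -> list:
    """Return the feature indices whose bit is set, counting from the MSB.

    Assumption: index 0 maps to the most significant bit of f_mask.
    """
    return [i for i in range(width) if f_mask & (1 << (width - 1 - i))]


# f_mask values copied from the "show nhdb id 766 detail" outputs above
before = feature_indices(0x00080000)  # counter not enabled
after = feature_indices(0x00480000)   # after "debug nh hw count 766"
print("before:", before)
print("after: ", after)
```

Under that assumption, the only difference between the two masks is index 9, which is the slot the output labels "counter".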



1. WHERE IS THE NEXTHOP STATISTICS ?
TAZ-TBB-0(cheese vty)# show nhdb id 766 statistics
Nexthop Statistics:
Interface NH ID Next Hop Addr Output Pkts Pkt Rate Output Bytes Byte Rate
------------ ------- --------------- ------------ -------- --------------- ----------
ge-1/1/2.0 766 10.1.3.3 1180880 69942 54320480 3217333

TAZ-TBB-0(cheese vty)# undebug nh hw count 766


TAZ-TBB-0(cheese vty)# show nhdb id 766 statistics
Nexthop Statistics:
Interface NH ID Next Hop Addr Output Pkts Pkt Rate Output Bytes Byte Rate
------------ ------- --------------- ------------ -------- --------------- ----------
ge-1/1/2.0 766 10.1.3.3 0 0 0 0

TAZ-TBB-0(cheese vty)# show nhdb id 766 detail


ID Type Interface Next Hop Addr Protocol Encap MTU Fla
----- -------- ------------- --------------- ---------- ------------ ---- --------
766 Unicast ge-1/1/2.0 10.1.3.3 IPv4 Ethernet 1500 0x000000
......
Feature List: NH
[pfe-0]: 0x1003a0b900d09167;
f_mask:0x00080000; c_mask:0x80000000; f_num:14; c_num:1, inst:0
Idx#12 ucast:
[pfe-0]: 0x1003a0b900d09167



2. HOW THE IPV4 FRAGMENTATION IS DONE ?
The MTU check is done on the egress PFE:
TAZ-TBB-0(cheese vty)# show jnh if 341 output
......
--------- Output Families --------
IPv4:
Feature List: off
[pfe-0]: 0x08d0f5f000010000;
f_mask:0x80800000; c_mask:0xc0000000; f_num:11; c_num:2, inst:0
Idx#0 set-ifl-state:
[pfe-0]: 0x12c80020005545dc

Idx#8 redirect-check:
[pfe-0]: 0x27fffff80000000c
........

TAZ-TBB-0(cheese vty)# show jnh 0 decode 0x12c80020005545dc

(Q:0x20008, OIF:341 ge-1/0/1.0, mtu:1500)

TAZ-TBB-0(cheese vty)#



2. HOW THE IPV4 FRAGMENTATION IS DONE ?
The PPE thread generates the header of the first fragment and sends
the header and the head-tail descriptors (an LMEM pointer and a 9-bit
length field that indicates the data size) to the LU Reorder block.

Subsequent fragments are sent in the same way, without a head-tail,
until the last fragment.

The packet must remain at the head of its reorder-hash-queue until
the last fragment is sent, so that no subsequent packets from the
same flow get in between the fragments.

Since the fragments are generated by wan_output() on the fly, there
is no record of them in the ucode state, not even a counter.



2. HOW THE IPV4 FRAGMENTATION IS DONE ?
Even the TAP interface will not see the fragments, because sampling
is an action that runs before wan_output():
TAZ-TBB-0(cheese vty)# show jnh if 341 output
-------- Output Features ---------
Topology: ifl(341)
Flavor: Output-IFL (49), Refcount 0, Flags 0x1
Addr: 0x4aa4f050, Next: 0x4acd31e0, Context 0x155
Link 0: 08d10528:00020000, Offset 12, Next: 08d10528:00020000
Link 1: 00000000:00000000, Offset 12, Next: 00000000:00000000
Link 2: 00000000:00000000, Offset 12, Next: 00000000:00000000
Link 3: 00000000:00000000, Offset 12, Next: 00000000:00000000

Topology Neighbors:
[none]-> ifl(341)-> flist-master(oif)
Feature List: oif
[pfe-0]: 0x08d1052800020000;
f_mask:0x00204400; c_mask:0xe0000000; f_num:23; c_num:3, inst:0
Idx#10 ptype-mux:
[pfe-0]: 0x02000c16fb000804
Idx#17 sample:
[pfe-0]: 0x1400000000004000
Idx#21 wan-output:
[pfe-0]: 0x240030d000000000
2. HOW THE IPV4 FRAGMENTATION IS DONE ?
The fragments can only be seen by capturing the parcel directly from
the PPE thread, via the packet-via-dmem dump function:
Dispatch: Size 14e Copy 0000 M2L_PKT_HEAD (1) TailEntry 0015 Stream Wan (f) Off 0
IxPreClass 2 IxPort 01 IxMyMac 0
Data: 80711fc266788071
Data: 1fc266610800(4500
Data: 05dc000020003f3d
^^^^
Data: b1e0c0010102c800
Data: 1b01000000000000
.......

Dispatch: Size 14e Copy 0000 M2L_PKT_HEAD (1) TailEntry 0018 Stream Wan (f) Off 0
IxPreClass 2 IxPort 01 IxMyMac 0
Data: 80711fc266788071
Data: 1fc266610800(4500
Data: 0208000000b93f3d
^^^^
Data: d4fbc0010102c800
Data: 1b01000000000000
.......
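
The two dumps above can be decoded by hand from the bytes that follow the "(" marker, which is where the IPv4 header starts. The Python sketch below does that for the 20 header bytes of each parcel. The hex strings are transcribed from the Data lines; the function name parse_ipv4_header is only illustrative.

```python
import struct


def parse_ipv4_header(hex_header: str) -> dict:
    """Decode the fixed 20-byte IPv4 header given as a hex string."""
    raw = bytes.fromhex(hex_header)
    if len(raw) < 20:
        raise ValueError("need at least 20 header bytes")
    (_ver_ihl, _tos, total_len, _ident, flags_frag,
     ttl, proto, _csum) = struct.unpack("!BBHHHBBH", raw[:12])
    # The one's-complement sum of all ten 16-bit words is 0xffff
    # when the header checksum is valid.
    total = sum(struct.unpack("!10H", raw[:20]))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return {
        "total_length": total_len,
        "more_fragments": bool(flags_frag & 0x2000),
        "fragment_offset": (flags_frag & 0x1FFF) * 8,  # in bytes
        "ttl": ttl,
        "protocol": proto,
        "checksum_ok": total == 0xFFFF,
    }


# Header bytes transcribed from the two parcels above
FIRST = "450005dc000020003f3db1e0c0010102c8001b01"
SECOND = "45000208000000b93f3dd4fbc0010102c8001b01"

for name, hdr in (("first", FIRST), ("second", SECOND)):
    print(name, parse_ipv4_header(hdr))
```

The first parcel decodes to a 1500-byte fragment with the more-fragments flag set and offset 0; the second is a 520-byte fragment at byte offset 1480 with the flag clear, which is what a 1500-byte egress MTU would produce.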



3. HOW DOES QOS WORK IN TRINITY ?
3. HOW DOES QOS WORK IN TRINITY (BA)?
BA (behavior aggregate) and MF (multifield) classification are done in LU
Queuing and scheduling are done in MQ/QX
TAZ-TBB-0(cheese vty)# show cos classifier
id type num_entries
-- ---- -----------
9 DSCP IPv6(8) 64 , PFE 0: vaddr 0xd09a00.
10 EXP(2) 8 , PFE 0: vaddr 0xd020f0.
13 IP precedence(4) 8 , PFE 0: vaddr 0xd09a08.
27489 IP precedence(4) 8 , PFE 0: vaddr 0xd09a38.

TAZ-TBB-0(cheese vty)# show cos classifier bindings


Classifier id: 27489
IFLs: 346
on pfe 0: bound to families defaut 1 ipv4

TAZ-TBB-0(cheese vty)# show cos ifl-entry 346


CoS IFL IDX: 346
Flags: 0x0
CoS Flags: 0x0
classifier[ EXP] : 10
classifier[IP precedence] : 27489
rewrites[2][0] : 33 rw flags 0x3
3. HOW DOES QOS WORK IN TRINITY (BA) ?
TAZ-TBB-0(cheese vty)# show cos classifier 13

classifier id: 13 type: IP precedence

CP Q PLP CP Q PLP CP Q PLP CP Q PLP


----------- ----------- ----------- -----------
0x00 0 0 0x01 0 1 0x02 0 0 0x03 0 1
0x04 0 0 0x05 0 1 0x06 3 0 0x07 3 1

Classifer table read from PFE 0 DMEM virtual address 0xd09a08:


.......
TAZ-TBB-0(cheese vty)# show cos classifier 27489

classifier id: 27489 type: IP precedence

CP Q PLP CP Q PLP CP Q PLP CP Q PLP


----------- ----------- ----------- -----------
0x00 2 0 0x01 2 0 0x02 2 0 0x03 2 0
0x04 1 0 0x05 1 0 0x06 1 0 0x07 1 0

Classifer table read from PFE 0 DMEM virtual address 0xd09a38:


CP FC PLP
--------------
0x00 2 0
0x01 2 0
0x02 2 0
0x03 2 0
3. HOW DOES QOS WORK IN TRINITY (BA) ?
TAZ-TBB-0(cheese vty)# show jnh 0 vread 0xd09a38 8
Addr:0xd09a38, Data = 0x0808080808080808
Addr:0xd09a39, Data = 0x0808080808080808
Addr:0xd09a3a, Data = 0x0808080808080808
Addr:0xd09a3b, Data = 0x0808080808080808
Addr:0xd09a3c, Data = 0x0404040404040404
Addr:0xd09a3d, Data = 0x0404040404040404
Addr:0xd09a3e, Data = 0x0404040404040404
Addr:0xd09a3f, Data = 0x0404040404040404
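
The raw words above line up with the "show cos classifier 27489" table if each byte is read as a 6-bit forwarding class followed by a 2-bit loss priority, and if the 64 bytes are indexed by the 6-bit DSCP (so each IP precedence value covers 8 consecutive bytes). Both points are assumptions drawn from the outputs in this deck (the field widths match fwd_class and drop_prio in CtxQos_t, and ttrace names the step use_dscp_with_main_tbl); they are not taken from the ucode source. A short Python sketch under those assumptions:

```python
# 64-bit words copied from "show jnh 0 vread 0xd09a38 8"
WORDS = [0x0808080808080808] * 4 + [0x0404040404040404] * 4


def decode_entry(byte: int) -> tuple:
    """Split one table byte into (fwd_class, drop_prio).

    Assumed layout: upper 6 bits forwarding class, lower 2 bits PLP.
    """
    return byte >> 2, byte & 0x3


# Flatten the 8 words into 64 one-byte entries, most significant byte first.
table = [decode_entry(b) for w in WORDS for b in w.to_bytes(8, "big")]

for prec in range(8):
    fc, plp = table[prec << 3]  # first DSCP value of each precedence
    print(f"precedence {prec}: fwd_class={fc} plp={plp}")
```

With this reading, precedence 0 to 3 map to forwarding class 2 and precedence 4 to 7 map to forwarding class 1, all with PLP 0, matching the classifier table shown earlier.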

TAZ-TBB-0(cheese vty)# show jnh if 346 input proto 2


IPv4:
Feature List: iff
[pfe-0]: 0x08d0f16000010000;
f_mask:0x02008000; c_mask:0xc0000000; f_num:18; c_num:2, inst:0
Idx#6 ba_classifier:
[pfe-0]: 0x268107803a13470f
Idx#16 fwd-lookup:
[pfe-0]: 0x18106845a0400008

TAZ-TBB-0(cheese vty)# show jnh 0 decode 0x268107803a13470f


UcodeNH:Classify-L3

TAZ-TBB-0(cheese vty)#
3. HOW DOES QOS WORK IN TRINITY (BA) ?
CoS debug messages are available in the ukern trace buffers
TAZ-TBB-0(cheese vty)# show ukern_trace handles
Ukernel Trace Info:
ID Name Level Printf Logging Size Wrap
----- --------------- --------- ----- ----- ----- -----
29 COS-HALP terse Off On 65536 0
30 COS terse Off On 65536 14

TAZ-TBB-0(cheese vty)# show ukern_trace 29


[7047725] cos_trinity_compare_types: HALP-comparing protocol family ipv4 against classifie
[7047726] cos_trinity_compare_types: HALP-combo type general case type match
[7047727] cos_trinity_update_family_binding: HALP-family protocol ipv4, classifier type IP
[7047728] cos_trinity_unbind_classifier: HALP-unbinding classifier 13 from family ipv4: us
[7047729] cos_trinity_unbind_classifier_per_pfe: HALP-fetched existing classifier
[7047730] cos_trinity_unbind_classifier_per_pfe: HALP-family feature list 0x52972128
[7047731] cos_trinity_unbind_classifier_per_pfe: HALP-classifier count is 2
[7047732] cos_trinity_clear_table_from_nexthop: HALP-clear table: default flag 1, bound cl
[7047733] cos_trinity_clear_table_from_nexthop: HALP-looking for 0xd09a08, num bound is 0x
[7047734] cos_trinity_clear_table_from_nexthop: HALP-cleared table with 0x0, default flag
[7047735] cos_trinity_clear_table_from_nexthop: HALP-nh now 0x268107802000000f, num_bound_
......



3. HOW DOES QOS WORK IN TRINITY (BA) ?
CoS debug messages are available in the ukern trace buffers
TAZ-TBB-0(cheese vty)# show ukern_trace 30
[7047744] cos_gencfg_classifier_update_cmd_debug: COS IPC: Received msg class table update
[7047745] cos_gencfg_classifier_update_cmd_debug: table id = 27489, type = IP precedence,
[7047746] cos_gencfg_classifier_update_cmd_debug: entry 0, fc-id = 2,queue num = 2, code p
[7047747] cos_gencfg_classifier_update_cmd_debug: entry 1, fc-id = 2,queue num = 2, code p
[7047748] cos_gencfg_classifier_update_cmd_debug: entry 2, fc-id = 2,queue num = 2, code p
[7047749] cos_gencfg_classifier_update_cmd_debug: entry 3, fc-id = 2,queue num = 2, code p
[7047750] cos_gencfg_classifier_update_cmd_debug: entry 4, fc-id = 1,queue num = 1, code p
[7047751] cos_gencfg_classifier_update_cmd_debug: entry 5, fc-id = 1,queue num = 1, code p
[7047752] cos_gencfg_classifier_update_cmd_debug: entry 6, fc-id = 1,queue num = 1, code p
[7047753] cos_gencfg_classifier_update_cmd_debug: entry 7, fc-id = 1,queue num = 1, code p
[7047837] cos_gencfg_bind_classifier_cmd_debug: ifl = 346, table type = IP precedence, tab
[7047838] cos_classifier_do_bind_add_action: IFL:346 classifier:27489



3. HOW DOES QOS WORK IN TRINITY (BA) ?
The whole lookup process can be followed with ttrace
The output is verbose, but it is useful
.........
75 classify_layer3 @ 0x038c
Prev_PC 0x0024 -> 0x038c

76 classify_layer3_ip @ 0x0434
Prev_PC 0x038c -> 0x0434

77 use_dscp_with_main_tbl @ 0x0438
Cond SYNC XTXN DMEM_RD(VA 0xd09a38 -> PA 0xc0009a38)
Reply64 is 0x0808080808080808
Prev_PC 0x0434 -> 0x0438
xrs: 0x268107803a13470f -> 0x0808080808080808
IR1 0x1a1e2c01 -> 0x00000000

78 extract_result_byte @ 0x0431
GPR29 0x000000000000003f -> 0x000000100000003f
Prev_PC 0x0438 -> 0x0431
LMEM[0x3f] 0x000000000000003f -> 0x000000100000003f
.........



3. HOW DOES QOS WORK IN TRINITY (BA) ?
src/pfe/ucode/lu/

ucode.th:register 29 CtxQos_t RCtxQos;

struct CtxQos_t { // 64 (8 bytes)


num_tags : 3; // need to handle more than 3 tags
cos2 : 3;
dei2 : 1;
cos1 : 3;
dei1 : 1;
exp_valid : 1;
exp : 3;
rw_flags_t rw_flags;
union {
fwd_class : 6; // q_valid = 0, See Note above
q_offset : 6; // q_valid = 1
} u;
drop_prio : 2; // See Note above
q_valid : 1; // Indicates field in union, fwd_class or q_offset

// 32 bit boundary
exp1 : 3;
exp2 : 3;
tos_t tos; // 8
tos2 : 8;
// TTL and control word related fields are placed here to conserve
// context space.
3. HOW DOES QOS WORK IN TRINITY (BA) ?
Engineering provides a tool that splits the 64-bit value into its fields:
~swong/bin/bits

svl-junos-pool72% bits 3 3 1 3 1 1 3 8 6 2 1 3 3 8 8 1 1 8
0x000000100000003f
Wid 3 3 1 3 1 1 3 8 6 2 1 3 3 8 8 1 1 8
Bin 000 000 0 000 0 0 000 00000000 000010 00 0 000 000 00000000 00000000 0 0 00111111
Hex 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 3f
Dec 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 63
^^^^^^ fwd_class = 2
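
For readers without access to the bits tool, the same split is easy to reproduce. The Python sketch below is not that tool; it is a minimal stand-in that takes the same field widths as the command line above and returns the field values from most significant to least significant. The function name split_bits is made up for illustration.

```python
def split_bits(value: int, widths: list, total: int = 64) -> list:
    """Split value into consecutive bit fields, most significant field first."""
    if sum(widths) != total:
        raise ValueError("field widths must add up to %d bits" % total)
    fields = []
    shift = total
    for width in widths:
        shift -= width
        fields.append((value >> shift) & ((1 << width) - 1))
    return fields


# Same widths and value as the "bits" invocation above
WIDTHS = [3, 3, 1, 3, 1, 1, 3, 8, 6, 2, 1, 3, 3, 8, 8, 1, 1, 8]
fields = split_bits(0x000000100000003F, WIDTHS)
for width, field in zip(WIDTHS, fields):
    print(f"{width:2d} bits: {field:#x}")
```

The ninth field (6 bits wide) comes out as 2, which is the fwd_class value marked in the tool output.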



3. HOW DOES QOS WORK IN TRINITY (SCHED)?
Scheduler and queue configuration information can also be found in
the ukern trace buffer
TAZ-TBB-0(cheese vty)# show ukern_trace 29
[7048121] cos_halp_bind_scheduler_map: HALP-Element type:2, IFD:153
[7048122] cos_halp_bind_scheduler_map: HALP-current queue config state: 0
[7048123] cos_halp_get_lifetime_shift: HALP-tx_rate: 50000000 tx_rate_bytes; 625000 Lifeti
[7048124] cos_halp_bind_scheduler_map: HALP-Qsys (1) Q(104) Lifetime shift: 7
[7048125] cos_halp_bind_scheduler_map: HALP-Queue:104 sh_rate_percent:0 tx_rate_percent:50
[7048133] cos_halp_bind_scheduler_map: HALP-Configured Q : 104
[7048134] cos_halp_get_lifetime_shift: HALP-tx_rate: 450000000 tx_rate_bytes; 5625000 Life
[7048135] cos_halp_bind_scheduler_map: HALP-Qsys (1) Q(105) Lifetime shift: 10
[7048136] cos_halp_bind_scheduler_map: HALP-Queue:105 sh_rate_percent:0 tx_rate_percent:45
[7048144] cos_halp_bind_scheduler_map: HALP-Configured Q : 105
[7048145] cos_halp_get_lifetime_shift: HALP-tx_rate: 450000000 tx_rate_bytes; 5625000 Life
[7048146] cos_halp_bind_scheduler_map: HALP-Qsys (1) Q(106) Lifetime shift: 10
[7048147] cos_halp_bind_scheduler_map: HALP-Queue:106 sh_rate_percent:0 tx_rate_percent:45
[7048155] cos_halp_bind_scheduler_map: HALP-Configured Q : 106
[7048156] cos_halp_get_lifetime_shift: HALP-tx_rate: 50000000 tx_rate_bytes; 625000 Lifeti
[7048157] cos_halp_bind_scheduler_map: HALP-Qsys (1) Q(107) Lifetime shift: 7
[7048158] cos_halp_bind_scheduler_map: HALP-Queue:107 sh_rate_percent:0 tx_rate_percent:50
[7048166] cos_halp_bind_scheduler_map: HALP-Configured Q : 107
[7048167] cos_halp_bind_scheduler_map: HALP-Updated Q(104) Weight: 6
[7048168] cos_halp_bind_scheduler_map: HALP-Updated Q(105) Weight: 56
......
3. HOW DOES QOS WORK IN TRINITY (SCHED)?
......
[7048169] cos_halp_bind_scheduler_map: HALP-Updated Q(106) Weight: 56
[7048170] cos_halp_bind_scheduler_map: HALP-Updated Q(107) Weight: 6
[7048178] cos_halp_bind_scheduler_map: HALP-Updated Q(108) Weight: 1
[7048179] cos_halp_bind_scheduler_map: HALP-Configured Q : 108
[7048187] cos_halp_bind_scheduler_map: HALP-Updated Q(109) Weight: 1
[7048188] cos_halp_bind_scheduler_map: HALP-Configured Q : 109
[7048196] cos_halp_bind_scheduler_map: HALP-Updated Q(110) Weight: 1
[7048197] cos_halp_bind_scheduler_map: HALP-Configured Q : 110
[7048205] cos_halp_bind_scheduler_map: HALP-Updated Q(111) Weight: 1
[7048206] cos_halp_bind_scheduler_map: HALP-Configured Q : 111
[7048207] cos_halp_bind_scheduler_map: HALP-Rate-limit setup not needed for ifd 153



3. HOW DOES QOS WORK IN TRINITY (SCHED) ?
The CoS HALP commands show the QoS configuration at both the IFD and
the IFL/IFLSET level
TAZ-TBB-0(cheese vty)# show cos halp ifd 153

--------------------------------------------------------------------------------
IFD name: ge-1/1/1 (Index 153)
MQ chip id: 0
MQ chip Scheduler: 1
MQ chip L1 index: 13
MQ chip dummy L2 index: 13
MQ chip base Q index: 104
Queue State Max Guaranteed Burst Weight Priorities Drop-Rules
Index rate rate size G E Wred Tail
------ ----------- ------------ ------------ ------ ------ ---------- ----------
104 Configured 1000000000 50000000 32767 6 GL EL 4 33
105 Configured 1000000000 450000000 32767 56 GL EL 4 112
106 Configured 1000000000 450000000 32767 56 GL EL 4 112
107 Configured 1000000000 50000000 32767 6 GL EL 4 33
108 Configured 1000000000 0 32767 1 GL EL 0 255
109 Configured 1000000000 0 32767 1 GL EL 0 255
110 Configured 1000000000 0 32767 1 GL EL 0 255
111 Configured 1000000000 0 32767 1 GL EL 0 255
TAZ-TBB-0(cheese vty)#
3. HOW DOES QOS WORK IN TRINITY (STATS) ?
To check the queue statistics, first identify the queue number from
the egress stream number
TAZ-TBB-0(cheese vty)# show mqchip 0 ifd
......
Output IFD IFD Base
Stream Index Name Qsys Qnum
------ ------ ---------- ------ ------
181 143 ge-1/0/1 MQ1 8

TAZ-TBB-0(cheese vty)# show mqchip 0 stream 181


.....
Output Stream 181
------------------
attached : 1
enabled : 1
pic slot : 1
mac mode : 0
wan if : 0
port : 3
conn : 12
weight : 1
sched : 1 MQ1
l1 node : 1
queues : 8..15
hcpq start : 31872
hcpq size : 128
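
The lookup chain above (IFD row, then Qsys and base queue number, then the per-queue statistics command) can be sketched as a small Python helper. It only rearranges values already present in the "show mqchip 0 ifd" row. Two things are assumptions: that the Qsys label "MQ1" corresponds to Qsys number 1 (consistent with "sched : 1 MQ1" in the stream output), and that each IFD owns 8 consecutive queues starting at the base queue number (consistent with "queues : 8..15"). The function name dstat_command is hypothetical.

```python
def dstat_command(ifd_row: str, mq_chip: int = 0, queue_offset: int = 0) -> str:
    """Build the per-queue statistics command for one queue of an IFD.

    ifd_row is one data row of "show mqchip <n> ifd":
    output stream, IFD index, IFD name, Qsys label, base queue number.
    """
    _stream, _ifd_index, _name, qsys, base_q = ifd_row.split()
    if not 0 <= queue_offset < 8:
        raise ValueError("queue_offset must be in 0..7")
    # Assumption: the label "MQ1" means Qsys number 1.
    qsys_num = int(qsys[len("MQ"):])
    return f"show mqchip {mq_chip} dstat stats {qsys_num} {int(base_q) + queue_offset}"


print(dstat_command("181 143 ge-1/0/1 MQ1 8"))
```

For the ge-1/0/1 row this yields the command for Qsys 1, queue 8; passing queue_offset=3 would target queue 11 on the same IFD.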
3. HOW DOES QOS WORK IN TRINITY (STATS) ?
TAZ-TBB-0(cheese vty)# show mqchip 0 dstat stats 1 8
QSYS 1 QUEUE 8 colormap 3 stats index 3840:
Counter Packets Pkt Rate Bytes Byte Rate
-------------------- ---------------- ------------ ---------------- ------------
Color 0
Forwarded (NoRule) 0 0 0 0
Forwarded (Rule) 1202519070 24085 975028390127 25048400
Dropped (WRED) 0 0 0 0
Dropped (TAIL) 0 0 0 0
Dropped (Force) 0 0 0 0
Dropped (Error) 6564 0 6813432 0
Color 1
Forwarded (NoRule) 0 0 0 0
Forwarded (Rule) 0 0 0 0
Dropped (WRED) 0 0 0 0
Dropped (TAIL) 0 0 0 0
Dropped (Force) 0 0 0 0
Dropped (Error) 0 0 0 0
......
Color 3
Forwarded (NoRule) 0 0 0 0
Forwarded (Rule) 0 0 0 0
Dropped (WRED) 0 0 0 0
Dropped (TAIL) 0 0 0 0
Dropped (Force) 0 0 0 0
Dropped (Error) 0 0 0 0

Queue inst depth: 0


Queue taql : 0

TAZ-TBB-0(cheese vty)#
3. HOW DOES QOS WORK IN TRINITY (REWRITE) ?
The rewrite function can be checked in a similar way
TAZ-TBB-0(cheese vty)# show ukern_trace 29
[7048213] cos_trinity_rewrite_bind: HALP-cos_trinity_rewrite_bind rewrite_mask 2
[7048214] cos_trinity_rewrite_bind_per_pfe: HALP-cos_trinity_rewrite_bind_per_pfe:
[7048215] trinity_get_rewrite_table_index: HALP-trinity_get_rewrite_table_index fe_id
[7048216] cos_trinity_rewrite_bind_per_pfe: HALP-creating cos rewrite nh for the first tim
[7048217] cos_trinity_rewrite_bind_per_pfe: HALP-cos_rewrite_nh 0x9010000000000000

TAZ-TBB-0(cheese vty)# show ukern_trace 30


[7048212] cos_gencfg_bind_rewrite_table_cmd_debug: ifl:334, action:add, proto:default

TAZ-TBB-0(cheese vty)# show cos rewrite


id type num_entries
-- ---- -----------
33 EXP(2) 4
27488 IP precedence(4) 4

TAZ-TBB-0(cheese vty)# show cos rewrite bindings


.....
rewrite id: 27488
IFLs:
334



3. HOW DOES QOS WORK IN TRINITY (REWRITE) ?
Rewrite profile configuration
TAZ-TBB-0(cheese vty)# show cos rewrite 27488

rewrite id: 27488 type: IP precedence flag:0x0

FC PLP CP EN FC PLP CP EN FC PLP CP EN FC PLP CP EN


-------------- ------------- ------------- -------------
0x00 0 2 1 0x00 1 2 1 0x00 2 0 0 0x00 3 0 0
0x01 0 0 1 0x01 1 0 1 0x01 2 0 0 0x01 3 0 0
0x02 0 6 1 0x02 1 6 1 0x02 2 0 0 0x02 3 0 0
0x03 0 6 1 0x03 1 7 1 0x03 2 0 0 0x03 3 0 0
FE id 0:
Trinity rewrite type: rewrite-prec-or-dscp Rewrite table index: 0
FC DP Code-point-byte
1 3 0x00 // 802_1p entry: 802_1ad entry: Precedence entry: Dscp entry:
1 0 0x00 // +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+ +-+-+-
0 3 0x10 // |0|0|0|0| cos |0| |1|0|0|0|cos/dei| |0|0| prec| 0 | |1|0| DSCP
0 0 0x10 // +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+ +-+-+-
3 3 0x38 // 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1
3 0 0x30 //
// Exp entry: No-op entry
2 3 0x30 // +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+
2 0 0x30 // | | exp | |1 1 1 1 1 1 1 1|
...... // +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+
// 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
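
The "Precedence entry" layout in the comment column (two zero bits, three precedence bits, three zero bits) is enough to decode the Code-point-byte column by hand. The Python sketch below does so for the values in the FE table. It assumes that layout is read correctly from the partly truncated diagram; the function names are illustrative only.

```python
def precedence_from_entry(entry: int) -> int:
    """Extract the IP precedence from a rewrite-table precedence entry.

    Assumed layout: bits 7..6 = 00, bits 5..3 = precedence, bits 2..0 = 0.
    """
    if entry >> 6 != 0 or entry & 0x7 != 0:
        raise ValueError("not a precedence entry: %#04x" % entry)
    return (entry >> 3) & 0x7


def tos_from_precedence(prec: int) -> int:
    """IP precedence occupies the top three bits of the IPv4 TOS byte."""
    return prec << 5


for entry in (0x00, 0x10, 0x30, 0x38):
    prec = precedence_from_entry(entry)
    print(f"entry {entry:#04x} -> precedence {prec}, TOS {tos_from_precedence(prec):#04x}")
```

Decoded this way, the entries give precedence 0, 2, 6 and 7, which are the CP values listed in the "show cos rewrite 27488" table.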
3. HOW DOES QOS WORK IN TRINITY (REWRITE) ?
Rewrite is applied on the IFL
TAZ-TBB-0(cheese vty)# show jnh if 334 output
-------- Output Features ---------
Topology: ifl(334)
Flavor: Output-IFL (49), Refcount 0, Flags 0x1
......
Topology Neighbors:
[none]-> ifl(334)-> flist-master(oif)
Feature List: oif
[pfe-0]: 0x08d0fd0800020000;
f_mask:0x00210400; c_mask:0xe0000000; f_num:23; c_num:3, inst:0
Idx#10 ptype-mux:
[pfe-0]: 0x02000d0e1b800804
Idx#15 cos-rewrite:
[pfe-0]: 0x9010000000000000
Idx#21 wan-output:
[pfe-0]: 0x24002b6000000000

TAZ-TBB-0(cheese vty)# show jnh 0 decode 0x9010000000000000


COS RW

TAZ-TBB-0(cheese vty)#



3. HOW DOES QOS WORK IN TRINITY (REWRITE) ?
The rewrite can be confirmed with ttrace again
The same result should be visible with an IFL output tap
......
181 ip_entry @ 0x0239
GPR01 0x9e0000c000000052 -> 0x9e0000c00000cb14
Prev_PC 0x01b7 -> 0x0239

182 ipv4_update_tos @ 0x01b0


Prev_PC 0x0239 -> 0x01b0
PCSD: 1 -> 0, dropped 0x01ba
LMEM[0xa] 0x1fc2666108004500 -> 0x1fc26661080045c0
LMEM[0xc] 0xcbd4c0010102c800 -> 0xcb14c0010102c800
......
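
In the trace above, the TOS byte changes from 0x00 to 0xc0 (LMEM[0xa]) and the header checksum changes from 0xcbd4 to 0xcb14 (LMEM[0xc]). That pair of values is consistent with a standard incremental checksum update as described in RFC 1624. Whether the microcode computes it this way is not stated in the deck, so the Python sketch below only shows that the numbers agree.

```python
def ones_complement_add(a: int, b: int) -> int:
    """Add two 16-bit values with end-around carry."""
    total = a + b
    return (total & 0xFFFF) + (total >> 16)


def incremental_checksum(old_csum: int, old_word: int, new_word: int) -> int:
    """RFC 1624 update: HC' = ~(~HC + ~m + m')."""
    total = ones_complement_add(~old_csum & 0xFFFF, ~old_word & 0xFFFF)
    total = ones_complement_add(total, new_word)
    return ~total & 0xFFFF


# 16-bit word holding version/IHL and TOS, before and after the rewrite
new_csum = incremental_checksum(0xCBD4, 0x4500, 0x45C0)
print(f"{new_csum:#06x}")
```

The result matches the checksum that ipv4_update_tos leaves in LMEM[0xc].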



3. HOW DOES QOS WORK IN TRINITY ?
QoS on Trinity is a big topic.

More information about scheduling and queueing can be found here:
http://www-in/~swong/trinity_cos.pdf



4. WHERE IS THE PACKET GOT DROPPED ?
We need to narrow down where the packet is dropped before we can
tell what is wrong.

Drops usually happen in:


LU lookup due to configuration or bug
MQ/QX queuing drop
IX ingress drop due to over-subscription
Fabric drop
HW problem
…etc



4. WHERE IS THE PACKET GOT DROPPED (LU) ?
A drop can be the result of a lookup in LU
Always check the exception statistics
TAZ-TBB-0(cheese vty)# show jnh 0 exceptions terse
Reason Type Packets Bytes
==================================================================
PFE State Invalid
----------------------
unknown family DISC(73) 1002 1120008

Packet Exceptions
----------------------
IP options PUNT( 2) 6372 1496928

Routing
----------------------
discard route DISC(66) 324 288636
hold route DISC(70) 6 504
resolve route PUNT(33) 11 700
control pkt punt via nh PUNT(34) 169411 18575370
host route PUNT(32) 151056 29012267
reject route PUNT(40) 7969480 20364386
4. WHERE IS THE PACKET GOT DROPPED (LU) ?
All drops are classified into categories, each with a REASON code
src/pfe/ucode/lu/ucode_public.h
#define DROP_CODE_BASE 128
#define DROP_ETH_FRAME_ERROR (DROP_CODE_BASE + 0)
#define DROP_UNKNOWN_IIF (DROP_CODE_BASE + 1)
#define DROP_CSUM_ERR (DROP_CODE_BASE + 2)
#define DROP_INPUT_STP_BLOCKED (DROP_CODE_BASE + 3)
#define DROP_NON_V4_TUNNEL (DROP_CODE_BASE + 4)
#define DROP_GRE_UNSUPPORTED_FLAGS (DROP_CODE_BASE + 5)
#define DROP_TUNNEL_PKT_TOO_SHORT (DROP_CODE_BASE + 6)
#define DROP_TUNNEL_HDR_TOO_LONG (DROP_CODE_BASE + 7)
#define DROP_ENUM_CHK_MISMATCH (DROP_CODE_BASE + 8)
#define DROP_V6_OPTS_TOO_LONG (DROP_CODE_BASE + 9)
#define DROP_UNUSED_CODE_10 (DROP_CODE_BASE + 10)
#define DROP_IP_HDR_ERR (DROP_CODE_BASE + 11)
#define DROP_IP_LEN_ERR (DROP_CODE_BASE + 12)
#define DROP_L4_LEN_ERR (DROP_CODE_BASE + 13)
#define DROP_TCP_FRAG_OFF_ERR (DROP_CODE_BASE + 14)
#define DROP_DMAC_MISS (DROP_CODE_BASE + 15)
#define DROP_TCAM_MISS (DROP_CODE_BASE + 16)
#define DROP_LEARN_LIMIT_EXCEEDED (DROP_CODE_BASE + 17)
#define DROP_STATIC_MAC_MOVE (DROP_CODE_BASE + 18)
#define DROP_PFE_MASK_ZERO (DROP_CODE_BASE + 19)
#define DROP_NO_LOCAL_SWITCHING (DROP_CODE_BASE + 20)
......
4. WHERE IS THE PACKET GOT DROPPED (LU) ?
......
#define DROP_MTU_EXCEEDED (DROP_CODE_BASE + 21)
#define DROP_FRAG_NEEDED_DF_SET (DROP_CODE_BASE + 22)
#define DROP_IP_OPTIONS_ERR (DROP_CODE_BASE + 23)
#define DROP_MCAST_SMAC (DROP_CODE_BASE + 24)
#define DROP_FLOW_DISCARD (DROP_CODE_BASE + 25)
#define DROP_UCAST_SPLIT_HORIZON (DROP_CODE_BASE + 26)
#define DROP_MCAST_PFE_CHECK (DROP_CODE_BASE + 27)
#define DROP_UNKNOWN_DMAC (DROP_CODE_BASE + 28)
#define DROP_PPPOE_HEADER (DROP_CODE_BASE + 29)
#define DROP_PPPOE_LENGTH (DROP_CODE_BASE + 30)
#define DROP_OUTPUT_STP_BLOCKED (DROP_CODE_BASE + 31)
#define DROP_VLAN_RANGE_CHECK (DROP_CODE_BASE + 32)
#define DROP_MC_STK_OVERFLOW (DROP_CODE_BASE + 33)
#define DROP_NH_STK_ERROR (DROP_CODE_BASE + 34)
#define DROP_SAMPLING_STK_ERROR (DROP_CODE_BASE + 35)
#define DROP_NH_CHAIN_ERROR (DROP_CODE_BASE + 36)
#define DROP_UCODE_ERROR (DROP_CODE_BASE + 37)
#define DROP_L2TP_HEADER_ERROR (DROP_CODE_BASE + 39)
#define DROP_L2TP_CONTROL_PACKET (DROP_CODE_BASE + 40)
#define DROP_DISCARD_SWERR (DROP_CODE_BASE + 64 + 0)
#define DROP_DISCARD_DEBUG (DROP_CODE_BASE + 64 + 1)
#define DROP_DISCARD_PROTOCOL (DROP_CODE_BASE + 64 + 2)
#define DROP_DISCARD_FW (DROP_CODE_BASE + 64 + 3)
......
4. WHERE IS THE PACKET GOT DROPPED (LU) ?
......
#define DROP_DISCARD_IIF (DROP_CODE_BASE + 64 + 4)
#define DROP_NOT_USED_69 (DROP_CODE_BASE + 64 + 5)
#define DROP_DISCARD_HOLD (DROP_CODE_BASE + 64 + 6)
#define DROP_URPF (DROP_CODE_BASE + 64 + 7)
#define DROP_UNINIT_STREAM (DROP_CODE_BASE + 64 + 8)
#define DROP_UNKNOWN_IFF (DROP_CODE_BASE + 64 + 9)
#define DROP_DISCARD_SWERR_IIF_NOMEM (DROP_CODE_BASE + 64 + 10)
#define DROP_DISCARD_FABRIC (DROP_CODE_BASE + 64 + 11)
#define DROP_DISCARD_GREKEY_MISMATCH (DROP_CODE_BASE + 64 + 12)
#define DROP_UNKNOWN_VRF (DROP_CODE_BASE + 64 + 13)
#define DROP_DISCARD_MAC_FW (DROP_CODE_BASE + 64 + 14)
#define DROP_PPPOE_NO_SESSION (DROP_CODE_BASE + 64 + 15)
#define DROP_PPPOE_SESSION_DOWN (DROP_CODE_BASE + 64 + 16)
#define DROP_PPPOE_SESSION_UNKNOWN_MAC (DROP_CODE_BASE + 64 + 17)
#define DROP_PPP_PROTO_UNCONFIGURED (DROP_CODE_BASE + 64 + 18)
#define DROP_PPP_PROTO_DOWN (DROP_CODE_BASE + 64 + 19)
#define DROP_DISCARD_LT (DROP_CODE_BASE + 64 + 20)
#define DROP_AGG_MCAST (DROP_CODE_BASE + 64 + 21)
#define DROP_DISCARD_L2_TOKEN (DROP_CODE_BASE + 64 + 22)
#define DROP_DISCARD_IIF_DOWN (DROP_CODE_BASE + 64 + 23)



4. WHERE IS THE PACKET GOT DROPPED (LU) ?
Some drops are associated with counters
src/pfe/common/pfe-arch/trinity/toolkits/jnh/jnh_exception.h

{
.e_category = CAT_BRIDGING,
.e_code = PACKET_ERR_UCAST_SPLIT_HORIZON,
.e_name = "bridge ucast split horizon",
.e_type = DISCARD,
.e_nh = CNT,
.e_help =
"Internal counter. Number of L2 packets failing split-horizon check."
},



4. WHERE IS THE PACKET GOT DROPPED (LU) ?
How to capture the dropped packets ?
Routing
----------------------
discard route DISC(66) 361 291744

TAZ-TBB-0(cheese vty)# debug jnh exceptions 66 discard

TAZ-TBB-0(cheese vty)# debug jnh exceptions-trace

TAZ-TBB-0(cheese vty)# show ukern_trace handles


Ukernel Trace Info:
ID Name Level Printf Logging Size Wrap
----- --------------- --------- ----- ----- ----- -----
7 JNH-EXCEPTIONS terse Off On 65536 0

TAZ-TBB-0(cheese vty)# test jnh exceptions-trace throttle


<number> value
default value 1000
none log every packet

TAZ-TBB-0(cheese vty)# test jnh exceptions-trace throttle none



4. WHERE IS THE PACKET GOT DROPPED (LU) ?
The captured packet can be seen in the exception trace buffer
TAZ-TBB-0(cheese vty)# show jnh exceptions-trace
[7049204] jnh_exception_packet_trace: ###############
[7049205] jnh_exception_packet_trace: [iif:346,code/info:194/0x23,score:mcast|(0x20),ptyp
[7049206] jnh_exception_packet_trace: 0x00: 20 20 c2 00 02 30 01 5a 00 0e 00 62 00 00 00
[7049207] jnh_exception_packet_trace: 0x10: 00 0c 00 01 00 5e 01 01 ad 80 71 1f c2 66 61
[7049208] jnh_exception_packet_trace: 0x20: 00 45 00 00 54 12 ea 00 00 01 01 ba 0f 0a 01
[7049209] jnh_exception_packet_trace: 0x30: 01 e0 01 01 ad 08 00 7d d0 74 67 00 99 4e 80
[7049210] jnh_exception_packet_trace: 0x40: 01 00 0a 4d a0 08 09 0a 0b 0c 0d 0e 0f 10 11
[7049211] jnh_exception_packet_trace: 0x50: 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21
[7049212] jnh_exception_packet_trace: 0x60: 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31
[7049213] jnh_exception_packet_trace: 0x70: 33 34 35 36 37

TAZ-TBB-0(cheese vty)# undebug jnh exceptions 66 discard

TAZ-TBB-0(cheese vty)# undebug jnh exceptions-trace

TAZ-TBB-0(cheese vty)# show ukern_trace handles


Ukernel Trace Info:
ID Name Level Printf Logging Size Wrap
----- --------------- --------- ----- ----- ----- -----
7 JNH-EXCEPTIONS none Off Off 0 0



4. WHERE IS THE PACKET GOT DROPPED (LU) ?
Code 194 = 128 + 64 + 2 = DROP_DISCARD_PROTOCOL
#define PACKET_ERR_DISCARD_PROTOCOL PACKET_ERR(DROP_DISCARD_PROTOCOL) /* + 2*/

{
.e_category = CAT_ROUTING,
.e_code = PACKET_ERR_DISCARD_PROTOCOL,
.e_name = "discard route",
.e_type = DISCARD,
.e_nh = CNT,
.e_help =
"RNH_DISCARD nexthops explicitly installed by routing protocols."
},
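
The arithmetic above also links the trace code to the DISC(n) number printed by "show jnh 0 exceptions": in every example in this deck, n is the trace code minus DROP_CODE_BASE (194 - 128 = 66 for "discard route"). The Python sketch below encodes that relationship. The name table holds only three entries copied from ucode_public.h as quoted earlier, and the function names are made up for illustration.

```python
DROP_CODE_BASE = 128

# A few offsets copied from the #define list quoted earlier (not the full list)
DROP_NAMES = {
    64 + 2: "DROP_DISCARD_PROTOCOL",
    64 + 6: "DROP_DISCARD_HOLD",
    64 + 9: "DROP_UNKNOWN_IFF",
}


def disc_number(trace_code: int) -> int:
    """Convert the code from the exception trace into the DISC(n) number."""
    if trace_code < DROP_CODE_BASE:
        raise ValueError("not a drop code: %d" % trace_code)
    return trace_code - DROP_CODE_BASE


def drop_name(trace_code: int) -> str:
    """Look up the symbolic name, if it is one of the entries listed above."""
    return DROP_NAMES.get(disc_number(trace_code), "unknown (not in this short table)")


print(disc_number(194), drop_name(194))
```

The same subtraction explains the other counters shown earlier: "hold route DISC(70)" and "unknown family DISC(73)" correspond to DROP_DISCARD_HOLD and DROP_UNKNOWN_IFF.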

TAZ-TBB-0(cheese vty)# sh route ip prefix 224/4

IPv4 Route Table 0, default.0, 0x0:


Destination NH IP Addr Type NH ID Interface
------------ --------------- -------- ----- ---------
224/4 mdiscard 35 RT-ifl 0
224.0.0.1 Mcast 31 RT-ifl 0
224.0.0.5 Mcast 31 RT-ifl 0
224.0.0.6 Mgroup 714 RT-ifl 0

TAZ-TBB-0(cheese vty)#
4. WHERE IS THE PACKET GOT DROPPED (IX) ?
What about drops in other places, such as IX and MQ ?
IX
TAZ-TBB-0(cheese vty)# show ixchip ifd

IFD IFD IX WAN Ing Queue Egr Queue


Index Name Id Port Rt/Ct/Be H/L
====== ========== ====== ====== ============== ======
140 ge-1/0/0 2 0 0/32/64 0/32
141 ge-1/0/1 2 1 1/33/65 1/33
142 ge-1/0/2 2 2 2/34/66 2/34
143 ge-1/0/3 2 3 3/35/67 3/35
144 ge-1/0/4 2 4 4/36/68 4/36
.....
182 ge-1/3/6 3 18 18/50/82 18/50
183 ge-1/3/7 3 19 19/51/83 19/51
184 ge-1/3/8 3 20 20/52/84 20/52
185 ge-1/3/9 3 21 21/53/85 21/53
186 ge-1/3/10 3 22 22/54/86 22/54
187 ge-1/3/11 3 23 23/55/87 23/55
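
Every row of the table above follows the same pattern: for WAN port p, the ingress queues are p, p + 32 and p + 64 (Rt/Ct/Be) and the egress queues are p and p + 32 (H/L). The Python sketch below encodes that pattern. It is inferred from the rows shown, not from the IX documentation, and the ingress discard queue at p + 96 rests on a single data point (queue 97 for port 1 in the statistics output that follows). The function name ix_queues is hypothetical.

```python
def ix_queues(wan_port: int) -> dict:
    """Return the IX queue numbers that belong to one WAN port.

    Assumption: the 32-queue stride seen in "show ixchip ifd" holds for
    every port, and the discard queue sits one stride above the BE queue.
    """
    if not 0 <= wan_port < 32:
        raise ValueError("wan_port must be in 0..31")
    return {
        "ingress": {"RT": wan_port, "CTRL": wan_port + 32, "BE": wan_port + 64},
        "ingress_discard": wan_port + 96,
        "egress": {"High": wan_port, "Low": wan_port + 32},
    }


print(ix_queues(1))
```

For ge-1/0/1 (WAN port 1) this gives ingress queues 1, 33 and 65 and egress queues 1 and 33, the same queue numbers that appear in the per-port statistics.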



4. WHERE IS THE PACKET GOT DROPPED (IX) ?
This shows a summary of both ingress and egress statistics
TAZ-TBB-0(cheese vty)# show ixchip 2 statistics wan_port 1
IXCHIP 2 IX wan port 1 stats:

INGRESS Traffic Class - RT, Queue 1


IXCHIP 2 Ingress Buffer Mgr Queue 1 stats:

Counter Name Total Rate Peak Rate


------------------------ ---------------- -------------- --------------
Receive Byte Count 0 0 0
Receive EOP Count 0 0 0
Receive EOPE Count 0 0 0
Transmit Byte Count 0 0 0
Transmit EOP Count 0 0 0
Transmit EOPE Count 0 0 0
Headdrop Byte Count 0 0 0
Headdrop EOP Count 0 0 0
Headdrop EOPE Count 0 0 0
......



4. WHERE IS THE PACKET GOT DROPPED (IX) ?
......
INGRESS Traffic Class - CTRL, Queue 33
IXCHIP 2 Ingress Buffer Mgr Queue 33 stats:

Counter Name Total Rate Peak Rate


------------------------ ---------------- -------------- --------------
Receive Byte Count 2583825 0 7546
Receive EOP Count 28090 0 20
Receive EOPE Count 0 0 0
Transmit Byte Count 2583825 0 7546
Transmit EOP Count 28090 0 20
Transmit EOPE Count 0 0 0
Headdrop Byte Count 0 0 0
Headdrop EOP Count 0 0 0
Headdrop EOPE Count 0 0 0

INGRESS Traffic Class - BE, Queue 65


IXCHIP 2 Ingress Buffer Mgr Queue 65 stats:

Counter Name Total Rate Peak Rate


------------------------ ---------------- -------------- --------------
Receive Byte Count 0 0 0
Receive EOP Count 0 0 0
Receive EOPE Count 0 0 0
......
4. WHERE IS THE PACKET GOT DROPPED (IX) ?
......
Transmit Byte Count 0 0 0
Transmit EOP Count 0 0 0
Transmit EOPE Count 0 0 0
Headdrop Byte Count 0 0 0
Headdrop EOP Count 0 0 0
Headdrop EOPE Count 0 0 0

INGRESS Discard, Queue 97


IXCHIP 2 Ingress Buffer Mgr Discard Queue 1 stats:

Counter Name Total Rate Peak Rate


------------------------ ---------------- -------------- --------------
Discard Byte Count 0 0 0
Discard EOP Count 0 0 0
Discard EOPE Count 0 0 0

EGRESS Traffic class - High, Queue 1


IXCHIP 2 Egress Buffer Mgr Queue 1 stats:

Counter Name Total Rate Peak Rate


------------------------ ---------------- -------------- --------------
Receive Byte Count 843020942583 24471470 24580577
Receive EOP Count 829770637 24087 24202
Receive EOPE Count 0 0 0
......
4. WHERE IS THE PACKET GOT DROPPED (IX) ?
......
Transmit Byte Count 843020942583 24471470 24580577
Transmit EOP Count 829770637 24087 24202
Transmit EOPE Count 0 0 0

EGRESS Traffic class - Low, Queue 33


IXCHIP 2 Egress Buffer Mgr Queue 33 stats:

Counter Name Total Rate Peak Rate


------------------------ ---------------- -------------- --------------
Receive Byte Count 0 0 0
Receive EOP Count 0 0 0
Receive EOPE Count 0 0 0
Transmit Byte Count 0 0 0
Transmit EOP Count 0 0 0
Transmit EOPE Count 0 0 0

TAZ-TBB-0(cheese vty)#
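When the wan_port output is long, it can help to scan it mechanically. The sketch below parses captured text of the form shown above and lists the queues that have non-zero Headdrop/Discard totals or whose Receive and Transmit EOP totals differ. The function names are hypothetical and the line patterns are derived only from the captures in this deck; on a live system a small Receive/Transmit difference may just be packets in flight.

```python
import re

SECTION_RE = re.compile(
    r"IXCHIP \d+ (Ingress|Egress) Buffer Mgr (Discard )?Queue (\d+) stats:")
COUNTER_RE = re.compile(r"^(.*? Count)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def parse_ix_stats(text):
    """Map (direction, queue label) -> {counter name: total}."""
    queues, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        m = SECTION_RE.search(line)
        if m:
            label = ("discard-" if m.group(2) else "") + m.group(3)
            current = queues.setdefault((m.group(1).lower(), label), {})
            continue
        m = COUNTER_RE.match(line)
        if m and current is not None:
            current[m.group(1)] = int(m.group(2))
    return queues


def suspicious_queues(queues):
    """Queues with drops, or with Receive/Transmit EOP totals that differ."""
    flagged = []
    for key, counters in queues.items():
        dropped = any(v for name, v in counters.items()
                      if name.startswith(("Headdrop", "Discard")))
        mismatch = (counters.get("Receive EOP Count", 0)
                    != counters.get("Transmit EOP Count", 0))
        if dropped or mismatch:
            flagged.append(key)
    return flagged


SAMPLE = """\
IXCHIP 2 Ingress Buffer Mgr Queue 33 stats:
Counter Name Total Rate Peak Rate
Receive Byte Count 2583825 0 7546
Receive EOP Count 28090 0 20
Transmit Byte Count 2583825 0 7546
Transmit EOP Count 28090 0 20
Headdrop EOP Count 0 0 0
"""
print(suspicious_queues(parse_ix_stats(SAMPLE)))
```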



4. WHERE IS THE PACKET GOT DROPPED (IX) ?
IX error stats on the corresponding stream
TAZ-TBB-0(cheese vty)# show ixchip 2 statistics inq stream 12
IXCHIP 2 INQ stream 12 stats:
Counter Name Total Rate Peak Rate
------------------------ ---------------- -------------- --------------
Drop Pkt Cnt 0 0 0
Abort Pkt Cnt 0 0 0
WI Seq Error Cnt 0 0 0
WI MTU Error Cnt 0 0 0



4. WHERE IS THE PACKET GOT DROPPED (MQ) ?
Only 4 counters are available on MQ for the WI statistics
To measure the statistics for a specific stream, one of the counters
must be remapped to that stream
TAZ-TBB-0(cheese vty)# test mqchip 0 counter wi_rx 0 1025
TAZ-TBB-0(cheese vty)# show mqchip 0 counters input stream 1025
WI Counters:
Counter Packets Pkt Rate Bytes Byte Rate
-------------------- ---------------- ------------ ---------------- ------------
RX Stream 1025 (001) 0 0 0 0
RX Stream 1026 (002) 83421 0 13212261 0
RX Stream 1027 (003) 92559739 0 73029806853 0
RX Stream 1151 (127) 499195 1 88078580 154

DROP Port 0 TClass 0 0 0 0 0


DROP Port 0 TClass 1 0 0 0 0
DROP Port 0 TClass 2 0 0 0 0
DROP Port 0 TClass 3 0 0 0 0
PT phy_stream 1025 Counters:
to LU pkt incr cnt 0
from LU pkt decr cnt 0

TAZ-TBB-0(cheese vty)#
4. WHERE IS THE PACKET GOT DROPPED (MQ) ?
The same applies on the fabric side
TAZ-TBB-0(cheese vty)# show mqchip 0 fi stats

FI Counters:

Stream Mask Match


0x000 0x000 Packets Pkt Rate Cells Cell Rate
-------------------- ---------------- ------------ ---------------- ------------
Received 3995718261 96341 50142294989 1541442
Dropped 0 0 0 0
Pkts to PT 3995718261 96340
Errored Pkts to PT 456128 0

TAZ-TBB-0(cheese vty)# test mqchip 0 counter


fi set the FI counter stream mask and match
fo set the FO counter stream mask and match
li set the LI counter parcel mask
wi_drop map a WI DROP counter to port, traffic class
wi_rx map a WI RX counter to a stream
wo set the WO counter stream mask and match

TAZ-TBB-0(cheese vty)#



4. WHERE IS THE PACKET GOT DROPPED ?
JNH provides a summary on a per-IFD basis
TAZ-TBB-0(cheese vty)# show jnh ifd 141 stream
ifd = 141, Stream = 225
Stream ID: 225 (inst = 0)
Cntr : 0x00d0ea84
Encap : Ether
Encap = 0, StartNH = 0xd020d5
lacp:-, stp:-/0, esmc:-, lfm:-, erp:-, lldp:-, mvrp:-/-, vc:-, natVlan:-/4095, native

Input Statistics: 0000000335396406 pkts, 0000343122136230 bytes


Detail Statistics:
rx0: 0000000000000000 pkts, 0000000000000000 bytes
rx1: 0000000000013344 pkts, 0000000001307242 bytes
rx2: 0000000335383062 pkts, 0000343120828988 bytes
drop0: 0000000000000000 pkts, 0000000000000000 bytes
drop1: 0000000000000000 pkts, 0000000000000000 bytes
drop2: 0000000000000000 pkts, 0000000000000000 bytes
unknown-iif: 0000000000000007 pkts, 0000000000000370 bytes
checksum: 0000000000000000 pkts, 0000000000000000 bytes
unknown-proto: 0000000000000000 pkts, 0000000000000000 bytes
bad-ucastmac: 0000000335384126 pkts, 0000338425522568 bytes
bad-vrrpmac: 0000000000000000 pkts, 0000000000000000 bytes
......
4. WHERE IS THE PACKET GOT DROPPED ?
......
bad-smac: 0000000000000000 pkts, 0000000000000000 bytes
in-stp: 0000000000000000 pkts, 0000000000000000 bytes
out-stp: 0000000000000000 pkts, 0000000000000000 bytes
vlan-check: 0000000000000000 pkts, 0000000000000000 bytes
frame-errors: 0000000000000000 pkts, 0000000000000000 bytes
L4-len: 0000000000000000 pkts, 0000000000000000 bytes
Stream Features:
Topology: stream-(225)
Flavor: i-root (1), Refcount 0, Flags 0x1
Addr: 0x4955d458, Next: 0x494be4a8, Context 0x4955d450
Link 0: 0240684f:23000303, Offset 12, Next: 0240684f:23000303
Link 1: 00000000:00000000, Offset 12, Next: 00000000:00000000
Link 2: 00000000:00000000, Offset 12, Next: 00000000:00000000
Link 3: 00000000:00000000, Offset 12, Next: 00000000:00000000

Topology Neighbors:
[none]-> stream-(225)-> flist-master(stream)
Feature List: stream
[pfe-0]: 0x0240684f23000303;
f_mask:0x02000000; c_mask:0x80000000; f_num:7; c_num:1, inst:0
Idx#6 iif-lookup:
[pfe-0]: 0x0240684f23000303

TAZ-TBB-0(cheese vty)#
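The Detail Statistics block is mostly zeros, so the interesting lines are easy to miss. A minimal sketch, assuming the "name: N pkts, M bytes" line format shown above (the function name is hypothetical), that returns only the non-rx counters with a non-zero packet count:

```python
import re

STAT_RE = re.compile(r"^\s*([\w-]+):\s+(\d+) pkts,\s+(\d+) bytes")


def nonzero_exceptions(text):
    """Return {counter: packets} for non-rx counters with packets > 0."""
    hits = {}
    for line in text.splitlines():
        m = STAT_RE.match(line)
        if m and not m.group(1).startswith("rx") and int(m.group(2)) > 0:
            hits[m.group(1)] = int(m.group(2))
    return hits


SAMPLE = """\
Input Statistics: 0000000335396406 pkts, 0000343122136230 bytes
rx2: 0000000335383062 pkts, 0000343120828988 bytes
drop0: 0000000000000000 pkts, 0000000000000000 bytes
unknown-iif: 0000000000000007 pkts, 0000000000000370 bytes
bad-ucastmac: 0000000335384126 pkts, 0000338425522568 bytes
L4-len: 0000000000000000 pkts, 0000000000000000 bytes
"""
print(nonzero_exceptions(SAMPLE))
```

On the capture above this leaves unknown-iif and bad-ucastmac, with bad-ucastmac accounting for nearly all of the input packets.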
5. HOW DOES TRINITY IMPLEMENT LOAD SHARING ?
Trinity uses the same polynomials as Ichip to calculate the hash value

Trinity supports the hash rotation feature, like Stoli and Ichip

Trinity supports hash seed configuration

Trinity supports hash key configuration, but with different combinations
than the standard configuration we have

http://cvs.juniper.net/cgi-bin/viewcvs.cgi/sw-projects/platform/trinity/pfe/core/load_balance_whitepaper.doc?rev=1.1&view=log



5. HOW DOES TRINITY IMPLEMENT LOAD SHARING ?
Trinity load sharing key configuration is under enhanced-hash-key
configuration hierarchy
[edit]
lab@cheese# set forwarding-options enhanced-hash-key family ?
Possible completions:
> inet IPv4 protocol family
> inet6 IPv6 protocol family
> mpls MPLS protocol family
> multiservice Multiservice protocol (bridged/CCC/VPLS) family
[edit]

lab@cheese# set forwarding-options enhanced-hash-key family inet ?


Possible completions:
+ apply-groups Groups from which to inherit configuration data
+ apply-groups-except Don't inherit configuration data from these groups
incoming-interface-index Include incoming interface index in the hash key
no-destination-port Omit IP destination port in the hash key
no-source-port Omit IP source port in the hash key
type-of-service Include TOS byte in the hash key

[edit]
lab@cheese#
5. HOW DOES TRINITY IMPLEMENT LOAD SHARING ?
[edit]
lab@cheese# set forwarding-options enhanced-hash-key family inet6 ?
Possible completions:
+ apply-groups Groups from which to inherit configuration data
+ apply-groups-except Don't inherit configuration data from these groups
incoming-interface-index Include incoming interface index in the hash key
no-destination-port Omit IP destination port in the hash key
no-source-port Omit IP source port in the hash key
traffic-class Include Traffic Class byte in the hash key
[edit]
lab@cheese# set forwarding-options enhanced-hash-key family mpls ?
Possible completions:
+ apply-groups Groups from which to inherit configuration data
+ apply-groups-except Don't inherit configuration data from these groups
incoming-interface-index Include incoming interface index in the hash key
label-1-exp Include EXP of first MPLS label from the hash key
no-payload Omit MPLS payload data from the hash key
[edit]
lab@cheese# set forwarding-options enhanced-hash-key family multiservice ?
Possible completions:
+ apply-groups Groups from which to inherit configuration data
+ apply-groups-except Don't inherit configuration data from these groups
incoming-interface-index Include incoming interface index in hash key
no-payload Omit payload data from the hash key
outer-priority Include Outer 802.1 Priority bits in the hash key
[edit]
lab@cheese#
5. HOW DOES TRINITY IMPLEMENT LOAD SHARING ?
Here is the default hash key configuration
Some of these keys can be changed only from the PFE
TAZ-TBB-0(cheese vty)# show jnh lb
Unilist Seed Configured 0x8bce4c39 System Mac address 00:00:00:00:00:00
Hash Key Configuration: 0x0000000000e00000 0xffffffffffffffff
IIF-V4: No
SPORT-V4: Yes
DPORT-V4: Yes
TOS: No

IIF-V6: No
SPORT-V6: Yes
DPORT-V6: Yes
TRAFFIC_CLASS: No

IIF-MPLS: No
MPLS_PAYLOAD: Yes
MPLS_EXP: No

IIF-BRIDGED: No
MAC ADDRESSES: Yes
ETHER_PAYLOAD: Yes
802.1P OUTER: No

Services Hash Key Configuration:


SADDR-V4: No
5. HOW DOES TRINITY IMPLEMENT LOAD SHARING ?

Protocol  Fields                    Configurable  Default
--------  ------------------------  ------------  -------
IPv4      Incoming Interface        Yes           No
          Destination IP            No            Yes
          Source IP                 No            Yes
          Protocol ID               No            Yes
          Source TCP/UDP Port       Yes           Yes
          Destination TCP/UDP Port  Yes           Yes
          DSCP                      Yes           No
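The IPv4 table can be read as a small rule set: three fields are always in the key, two are on by default but can be removed, and two are off by default but can be added. The sketch below encodes just that table. The field names and the function are illustrative (they are not Junos identifiers); the CLI knobs that correspond to the overrides are the ones listed under "enhanced-hash-key family inet" (no-source-port, no-destination-port, incoming-interface-index, type-of-service).

```python
# (field, configurable, included by default), transcribed from the IPv4 table.
IPV4_FIELDS = [
    ("incoming-interface", True, False),
    ("destination-ip", False, True),
    ("source-ip", False, True),
    ("protocol-id", False, True),
    ("source-port", True, True),
    ("destination-port", True, True),
    ("dscp", True, False),
]


def ipv4_hash_fields(overrides=None):
    """Return the fields that feed the hash, given per-field overrides."""
    overrides = overrides or {}
    configurable = {name for name, conf, _ in IPV4_FIELDS if conf}
    for name in overrides:
        if name not in configurable:
            raise ValueError(f"{name} is not a configurable hash field")
    return [name for name, _, default in IPV4_FIELDS
            if overrides.get(name, default)]


print(ipv4_hash_fields())
print(ipv4_hash_fields({"source-port": False, "dscp": True}))
```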



5. HOW DOES TRINITY IMPLEMENT LOAD SHARING ?

Protocol  Fields                    Configurable  Default
--------  ------------------------  ------------  -------
IPv6      Incoming Interface        Yes           No
          Destination IP            No            Yes
          Source IP                 No            Yes
          Protocol ID               No            Yes
          Source TCP/UDP Port       Yes           Yes
          Destination TCP/UDP Port  Yes           Yes
          Traffic Class             Yes           No



5. HOW DOES TRINITY IMPLEMENT LOAD SHARING ?

Protocol        Fields                     Configurable  Default
--------------  -------------------------  ------------  -------
Ethernet        Incoming Interface         Yes           No
(Bridge, CCC,   Destination MAC            Yes*          Yes
TCC, VPLS)      Source MAC                 Yes*          Yes
                Outer 802.1p bits          Yes           No
                Payload (check Ethertype)  Yes           Yes
                Payload = IPv4             See IPv4
                Payload = IPv6             See IPv6

* Cannot be separately enabled/disabled.

set forwarding-options enhanced-hash-key family multiservice no-mac-addresses



5. HOW DOES TRINITY IMPLEMENT LOAD SHARING ?

Protocol  Fields                 Configurable  Default
--------  ---------------------  ------------  -------
MPLS      Incoming Interface     Yes           No
          Top 5 labels           No            Yes
          Outermost Label’s EXP  Yes           No
          Payload*               Yes           Yes
* Payload recognition uses the following criteria to estimate the nature of the payload:
• IPv4 is assumed if the first nibble following the bottom label is 0x4. If so, the
fields in the IPv4 section are used to load-balance the payload.
• IPv6 is assumed if the first nibble following the bottom label is 0x6. If so, the
fields in the IPv6 section are used to load-balance the payload.
• Otherwise Ethernet is assumed, and an attempt is made to parse the remaining bytes as an
Ethernet header (see the previous slide for the Ethernet header fields). The following checks
are then done on the Ethernet header and, if matched, the appropriate L3 payload is used for
load-balancing:
• Check for 0x0800 after the first 12 bytes: if yes, then the payload is IPv4
• Check for 0x86DD after the first 12 bytes: if yes, then the payload is IPv6
• Check for 0x8100 after the first 12 bytes: if yes, skip over (up to 2) VLAN tags, and
check for the real Ethertype
• The MAC address is ignored for L2VPN applications when Ethernet is present inside
MPLS. It is considered for VPLS, though.
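The payload-recognition rules above can be sketched as a short classifier. This is an illustration of the decision order described on the slide, not the microcode; the function name and the returned labels are hypothetical, and the input is assumed to be the bytes that follow the bottom MPLS label.

```python
import struct


def classify_mpls_payload(payload: bytes) -> str:
    """Guess the payload type that follows the bottom MPLS label."""
    if not payload:
        return "unknown"
    nibble = payload[0] >> 4
    if nibble == 0x4:
        return "ipv4"
    if nibble == 0x6:
        return "ipv6"
    # Otherwise assume Ethernet: 6-byte dst MAC + 6-byte src MAC, then Ethertype.
    offset = 12
    for _ in range(3):  # outer Ethertype, then at most 2 VLAN tags skipped
        if len(payload) < offset + 2:
            return "ethernet"
        (ethertype,) = struct.unpack_from("!H", payload, offset)
        if ethertype == 0x0800:
            return "ethernet/ipv4"
        if ethertype == 0x86DD:
            return "ethernet/ipv6"
        if ethertype != 0x8100:
            return "ethernet"
        offset += 4  # skip one 4-byte VLAN tag
    return "ethernet"


macs = bytes(12)  # dst + src MAC, all zeros for the example
print(classify_mpls_payload(b"\x45" + bytes(19)))
print(classify_mpls_payload(macs + b"\x81\x00\x00\x64" + b"\x86\xdd"))
```

One consequence of the first-nibble test is that an Ethernet payload whose destination MAC starts with nibble 0x4 or 0x6 would be taken as IP under this heuristic, which is why the slide calls it an estimate.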
5. HOW DOES TRINITY IMPLEMENT LOAD SHARING ?
An enhancement over Ichip/Stoli is that Trinity supports a selector
table to provide a better load sharing result.

The selector table is a 16x16 (16x32 in 11.3 or later) array containing the
ECMP nexthops in a “random” pattern. As a result, we can get a
better load sharing result.

[Figure: Hash Distribution over the hash value range 0x0000 to 0xFFFF.
Ichip: three contiguous ranges, NH #0, NH #1, NH #2.
Trinity: nexthops interleaved across the range (NH #0, #1, #2, #0, #1, #2, #0, #0, #1, #2, ...).]



5. HOW DOES TRINITY IMPLEMENT LOAD SHARING ?
The selector table is created for unbalanced mode:
ECMP with RSVP LSP
Aggregated Ethernet interface
BGP multipath with link bandwidth community received

Basically, it is a Unilist nexthop that comes with a balance value on each
member link.

All intermediate nexthops use the legacy way to select the nexthop.
The selector table only appears on the final nexthop.

Plain IP ECMP nexthops won’t have one.



5. HOW DOES TRINITY IMPLEMENT LOAD SHARING ?
This is IP prefix with ECMPs via LSPs
With RSVP load-balance bandwidth enabled
lab@cheese> show route 100.0.0.0/24 logical-system R1 detail

inet.0: 212 destinations, 313 routes (212 active, 0 holddown, 0 hidden)


100.0.0.0/24 (1 entry, 1 announced)
*BGP Preference: 170/-101
Next hop type: Indirect
Address: 0x2b20354
Next-hop reference count: 400
Source: 4.4.4.4
Next hop type: Router, Next hop index: 1048577
Next hop: 10.1.1.2 via ge-1/0/1.0 weight 0x1 balance 10%
Label-switched-path lsp1
Label operation: Push 299968
Label TTL action: prop-ttl
Next hop: 10.1.1.2 via ge-1/0/1.0 weight 0x1 balance 20%, selected
Label-switched-path lsp2
Label operation: Push 299952
Label TTL action: prop-ttl
Next hop: 10.1.1.2 via ge-1/0/1.0 weight 0x1 balance 30%
Label-switched-path lsp3
Label operation: Push 299920
Label TTL action: prop-ttl
Next hop: 10.1.1.2 via ge-1/0/1.0 weight 0x1 balance 40%
Label-switched-path lsp4
Label operation: Push 299936
Label TTL action: prop-ttl
Protocol next hop: 4.4.4.4
Indirect next hop: 2b78bc8 1048575
5. HOW DOES TRINITY IMPLEMENT LOAD SHARING ?
Selector table is created for the unilist nexthop
TAZ-TBB-0(cheese vty)# show nhdb id 1048577 extensive
ID Type Interface Next Hop Addr Protocol Encap MTU Flags PFE intern
----- -------- ------------- --------------- ---------- ------------ ---- ---------- ----------
1048577 Unilist ge-1/0/1.0 - IPv4 Ethernet 0 0x00000000 0x0000000
Selector:
ID:22, Ref:1
Key:FRR/size = 4 agg_type: Unicast

Weight Info (Selector's view): Current Weight = 1


Idx Balance Weight Orig-Weight Ifl Install
----- ------- ------- ----------- ------ -------
0 6554 1 1 337 Yes
1 13108 1 1 337 Yes
2 19661 1 1 337 Yes
3 26215 1 1 337 Yes
Unilist Table (4 entries): List flags 0x0

824 Unicast ge-1/0/1.0 - IPv4->MPLS Ethernet 1500 0x00000001 0x00000002


821 Unicast ge-1/0/1.0 - IPv4->MPLS Ethernet 1500 0x00000001 0x00000002
783 Unicast ge-1/0/1.0 - IPv4->MPLS Ethernet 1500 0x00000001 0x00000002
815 Unicast ge-1/0/1.0 - IPv4->MPLS Ethernet 1500 0x00000001 0x00000002

Weight Info: Current Weight = 1


ID Balance Orig-Balance Weight Orig-Weight State Install Flags
----- ------- --------- ------ ----------- -------- ----------- -----
824 6553 6553 1 1 Active Installed 0x00
821 19660 19660 1 1 Active Installed 0x00
783 39320 39320 1 1 Active Installed 0x00
815 65534 65534 1 1 Active Installed 0x00
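The balance numbers in this output line up with the 10/20/30/40% split from "show route" once the percentages are scaled to a 16-bit range. The sketch below reproduces both columns: the selector's view matches each share rounded up, and the Weight Info column matches a running sum of each share rounded down. That the PFE computes them this way is an assumption; only the arithmetic match with this capture is certain. Function names are hypothetical.

```python
SCALE = 1 << 16  # 16-bit hash range


def selector_balances(percentages):
    """Per-member share of the 16-bit range, rounded up."""
    return [-(-p * SCALE // 100) for p in percentages]


def cumulative_balances(percentages):
    """Running upper bound per member, with each share rounded down."""
    total, bounds = 0, []
    for p in percentages:
        total += p * SCALE // 100
        bounds.append(total)
    return bounds


print(selector_balances([10, 20, 30, 40]))
print(cumulative_balances([10, 20, 30, 40]))
```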
5. HOW DOES TRINITY IMPLEMENT LOAD SHARING ?
The nexthop distribution within the array is according to the
balance values (1[0000]:2[0001]:3[0002]:4[0003])
TAZ-TBB-0(cheese vty)# show nhdb hw unilist-sel 22
ID:22, Ref:1
Key:FRR/size = 4 agg_type: Unicast
Sel table seed: 1356(0x54c), key size: 162 words
Distribution Table:PFE0

0 0002 0002 0003 0002 0003 0003 0003 0003 0001 0000 0003 0003 0003 0002 0001 0003
1 0003 0001 0003 0002 0003 0000 0002 0002 0002 0002 0003 0002 0002 0003 0002 0003
2 0003 0002 0000 0001 0002 0002 0003 0002 0001 0000 0002 0001 0002 0001 0003 0001
3 0001 0003 0000 0001 0003 0002 0003 0003 0001 0002 0000 0003 0003 0001 0002 0002
4 0003 0002 0003 0002 0001 0003 0003 0003 0001 0003 0002 0003 0002 0003 0003 0002
5 0002 0003 0001 0002 0000 0003 0001 0001 0003 0002 0003 0003 0001 0002 0003 0003
6 0002 0001 0003 0003 0001 0001 0002 0001 0001 0001 0001 0003 0003 0003 0001 0003
7 0002 0003 0002 0002 0003 0001 0003 0003 0002 0001 0003 0002 0003 0002 0002 0002
8 0002 0000 0002 0003 0000 0002 0003 0003 0002 0003 0003 0002 0003 0002 0001 0001
9 0003 0003 0002 0003 0000 0001 0002 0002 0003 0003 0001 0001 0003 0002 0002 0002
10 0003 0002 0002 0002 0001 0003 0003 0001 0002 0003 0000 0001 0003 0003 0001 0001
11 0003 0003 0002 0002 0002 0000 0003 0002 0002 0002 0002 0001 0002 0002 0001 0003
12 0000 0001 0003 0002 0001 0002 0002 0003 0001 0002 0003 0003 0001 0003 0001 0003
13 0002 0001 0002 0003 0001 0001 0000 0001 0003 0003 0003 0003 0002 0002 0003 0002
14 0002 0002 0002 0000 0003 0003 0002 0003 0002 0001 0001 0003 0002 0001 0002 0003
15 0002 0003 0003 0002 0003 0002 0000 0001 0001 0003 0002 0003 0002 0001 0001 0002
16 0000 0002 0001 0001 0002 0003 0002 0003 0002 0002 0003 0003 0003 0002 0003 0001
17 0001 0002 0002 0003 0002 0003 0002 0003 0002 0003 0001 0003 0000 0002 0001 0001
18 0000 0002 0003 0003 0000 0002 0001 0003 0003 0001 0000 0003 0000 0002 0003 0000
......
5. HOW DOES TRINITY IMPLEMENT LOAD SHARING ?
......
19 0001 0003 0003 0002 0003 0003 0000 0001 0003 0003 0003 0000 0002 0003 0003 0001
20 0001 0003 0003 0002 0003 0002 0003 0003 0003 0002 0003 0001 0003 0002 0003 0001
21 0003 0003 0002 0003 0003 0003 0003 0002 0001 0003 0003 0001 0003 0002 0000 0002
22 0002 0002 0000 0003 0000 0001 0001 0001 0002 0001 0001 0002 0003 0003 0002 0002
23 0003 0000 0003 0003 0002 0001 0001 0003 0003 0002 0002 0002 0001 0002 0000 0003
24 0000 0003 0003 0003 0001 0002 0002 0000 0003 0002 0002 0003 0002 0003 0003 0002
25 0003 0000 0000 0003 0003 0000 0003 0003 0003 0003 0003 0003 0001 0002 0003 0003
26 0002 0001 0003 0001 0003 0003 0003 0003 0001 0002 0003 0000 0001 0001 0002 0001
27 0002 0003 0003 0003 0000 0003 0003 0003 0003 0002 0002 0001 0001 0002 0003 0002
28 0002 0002 0003 0003 0003 0001 0000 0000 0002 0000 0003 0003 0003 0003 0003 0003
29 0002 0003 0002 0000 0000 0000 0001 0001 0002 0003 0002 0001 0000 0000 0002 0001
30 0003 0000 0003 0003 0003 0000 0003 0001 0003 0003 0002 0000 0000 0003 0003 0001
31 0002 0003 0001 0002 0000 0001 0002 0003 0003 0002 0001 0002 0001 0000 0000 0001

Topology: UnilistSelector(22-0)
Flavor: Unilist-Selector (50), Refcount 0, Flags 0x1
Addr: 0x43b51bf8, Next: 0x49a61968, Context 0x4c703e90
Link 0: 00000000:00000000, Offset 0, Next: 08ba6620:00020000
Link 1: 00000000:00000000, Offset 0, Next: 00000000:00000000
Link 2: 00000000:00000000, Offset 0, Next: 00000000:00000000
Link 3: 00000000:00000000, Offset 0, Next: 00000000:00000000
......



5. HOW DOES TRINITY IMPLEMENT LOAD SHARING ?
......
Topology Neighbors:
[none]-> UnilistSelector(22-0)-> ifl(337)-> flist-master(iif)
Topology: UnilistSelector(22-1)
Flavor: Unilist-Selector (50), Refcount 0, Flags 0x1
Addr: 0x43b51bb0, Next: 0x49a61968, Context 0x4c703e90
Link 0: 00000000:00000000, Offset 0, Next: 08ba6620:00020000
Link 1: 00000000:00000000, Offset 0, Next: 00000000:00000000
Link 2: 00000000:00000000, Offset 0, Next: 00000000:00000000
Link 3: 00000000:00000000, Offset 0, Next: 00000000:00000000

Topology Neighbors:
[none]-> UnilistSelector(22-1)-> ifl(337)-> flist-master(iif)
Topology: UnilistSelector(22-2)
Flavor: Unilist-Selector (50), Refcount 0, Flags 0x1
Addr: 0x43b51b68, Next: 0x49a61968, Context 0x4c703e90
Link 0: 00000000:00000000, Offset 0, Next: 08ba6620:00020000
Link 1: 00000000:00000000, Offset 0, Next: 00000000:00000000
Link 2: 00000000:00000000, Offset 0, Next: 00000000:00000000
Link 3: 00000000:00000000, Offset 0, Next: 00000000:00000000
......



5. HOW DOES TRINITY IMPLEMENT LOAD SHARING ?
......
Topology Neighbors:
[none]-> UnilistSelector(22-2)-> ifl(337)-> flist-master(iif)
Topology: UnilistSelector(22-3)
Flavor: Unilist-Selector (50), Refcount 0, Flags 0x1
Addr: 0x43b51b20, Next: 0x49a61968, Context 0x4c703e90
Link 0: 00000000:00000000, Offset 0, Next: 08ba6620:00020000
Link 1: 00000000:00000000, Offset 0, Next: 00000000:00000000
Link 2: 00000000:00000000, Offset 0, Next: 00000000:00000000
Link 3: 00000000:00000000, Offset 0, Next: 00000000:00000000

Topology Neighbors:
[none]-> UnilistSelector(22-3)-> ifl(337)-> flist-master(iif)

Weight Info (Selector's view): Current Weight = 1


Idx Balance Weight Orig-Weight Ifl Install
----- ------- ------- ----------- ------ -------
0 6554 1 1 337 Yes
1 13108 1 1 337 Yes
2 19661 1 1 337 Yes
3 26215 1 1 337 Yes

TAZ-TBB-0(cheese vty)#
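To make the idea of the distribution table concrete, here is a sketch that fills a 32x16 table with member indexes in proportion to the balance values and then shuffles it. Everything about the fill and shuffle is an assumption for illustration: the actual placement algorithm, the way the seed is used, and the way the hash indexes the table are not described in this deck. The seed value is borrowed from the "Sel table seed" line only as a convenient constant.

```python
import random


def build_selector_table(balances, rows=32, cols=16, seed=0x54C):
    """Fill rows*cols slots in proportion to balances, then shuffle."""
    slots = rows * cols
    total = sum(balances)
    counts = [b * slots // total for b in balances]
    # Hand the rounding remainder to the largest members first.
    by_size = sorted(range(len(balances)), key=lambda i: -balances[i])
    for i in by_size[: slots - sum(counts)]:
        counts[i] += 1
    flat = [idx for idx, n in enumerate(counts) for _ in range(n)]
    random.Random(seed).shuffle(flat)
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


def select_member(table, hash16):
    """Scale a 16-bit hash onto the flattened table (assumed indexing)."""
    cols = len(table[0])
    slot = (hash16 * len(table) * cols) >> 16
    return table[slot // cols][slot % cols]


table = build_selector_table([6554, 13108, 19661, 26215])
print(table[0])
print(select_member(table, 0x8BCE))
```

With the balances from this example, the 512 slots split into 51, 102, 154 and 205 entries for members 0 to 3, so a uniformly distributed hash lands on each member roughly in the 10/20/30/40 ratio regardless of which part of the hash range it falls in.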



5. HOW DOES TRINITY IMPLEMENT LOAD SHARING ?
The hash seed is, by default, generated with the chassis MAC
address
TAZ-TBB-0(cheese vty)# show jnh 0 master-record 0
Master Record:
.....
9 LB Seed : 0x000000008bce4c39



5. HOW DOES TRINITY IMPLEMENT LOAD SHARING ?
For the unilist nexthop without selector table, the nexthop selection
is as usual
– Hash16 % number of nexthops
// index = ( nentries * hash16) / (max hash)
// = ( (max_entry + 1) * hash16) / (1 << 16)
// = ((max_entry * hash16) + hash16) / (1 << 16)
// = ((max_entry * hash16) + hash16) >> 16

TAZ-TBB-0(cheese vty)# show nhdb id 1048580 detail


ID Type Interface Next Hop Addr Protocol Encap MTU Flags PFE inter
----- -------- ------------- --------------- ---------- ------------ ---- ---------- ---------
1048580 Unilist ge-1/1/1.0 - IPv4 Ethernet 0 0x00000000 0x000000
Unilist Table (2 entries): List flags 0x0

795 Unicast ge-1/1/1.0 - IPv4->MPLS Ethernet 1500 0x00000001 0x00000002


777 Unicast ge-1/1/2.0 - IPv4->MPLS Ethernet 1500 0x00000001 0x00000002

Weight Info: Current Weight = 0


ID Balance Orig-Balance Weight Orig-Weight State Install Flags
----- ------- --------- ------ ----------- -------- ----------- -----
795 0 0 0 0 Active Installed 0x00
777 0 0 0 0 Active Installed 0x00
Routing-table id: 0
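The index arithmetic from the comment block on this slide, written out as runnable Python. Note that the comment block scales the 16-bit hash into the range of entries rather than taking a literal remainder; the sketch follows the comment block. The function name is hypothetical.

```python
def unilist_index(hash16, nentries):
    """index = (nentries * hash16) >> 16, per the slide's comment block."""
    if not 0 <= hash16 <= 0xFFFF:
        raise ValueError("hash16 must fit in 16 bits")
    return (nentries * hash16) >> 16


# Two-entry unilist, as in the nhdb output above (IDs 795 and 777).
members = [795, 777]
for h in (0x0000, 0x7FFF, 0x8000, 0xFFFF):
    print(hex(h), members[unilist_index(h, len(members))])
```

With two entries, the lower half of the hash range selects the first member and the upper half the second, which is the contiguous-range behaviour that the selector table is meant to improve on.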



5. HOW DOES TRINITY IMPLEMENT LOAD SHARING ?
Hash manipulation enhancement on AE interface
https://gnats.juniper.net/web/default/677483
https://gnats.juniper.net/web/default/600071



6. HOW TO IDENTIFY THE MQ/IX/JNH STREAM FOR AN
INTERFACE ?
For IX, run “show ixchip ifd” to find the IX ID and WAN
port for the interface, then poll the statistics for that port.
TAZ-TBB-0(cheese vty)# show ixchip ifd

IFD IFD IX WAN Ing Queue Egr Queue


Index Name Id Port Rt/Ct/Be H/L
====== ========== ====== ====== ============== ======
143 ge-1/0/1 2 1 1/33/65 1/33

TAZ-TBB-0(cheese vty)# show ixchip 2 statistics wan_port 1


IXCHIP 2 IX wan port 1 stats:
......



6. HOW TO IDENTIFY THE MQ/IX/JNH STREAM FOR AN
INTERFACE ?
For MQ, we need to identify the stream and modify the wi_rx
counter mask to collect the statistics.
This is an MX80, so the fabric stream is used for the WAN port as
well.
TAZ-TBB-0(cheese vty)# show mqchip 0 ifd

Input IFD IFD LU


Stream Index Name Sid TClass
------ ------ ---------- ------ ------
12 143 ge-1/0/1 225 drop
13 143 ge-1/0/1 225 hi
14 143 ge-1/0/1 225 med
15 143 ge-1/0/1 225 lo

Output IFD IFD Base


Stream Index Name Qsys Qnum
------ ------ ---------- ------ ------
181 143 ge-1/0/1 MQ1 8

TAZ-TBB-0(cheese vty)#



6. HOW TO IDENTIFY THE MQ/IX/JNH STREAM FOR AN
INTERFACE ?
TAZ-TBB-0(cheese vty)# test mqchip 0 counter fi 15

TAZ-TBB-0(cheese vty)# show mqchip 0 fi stats

FI Counters:

StreamMask Match
0x3ff 0x00f Packets Pkt Rate Cells Cell Rate
-------------------- ---------------- ------------ ---------------- ------------
Received 147300 48170 2357424 770914
Dropped 0 0 0 0
Pkts to PT 147300 48169
Errored Pkts to PT 0 0

TAZ-TBB-0(cheese vty)#
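After "test mqchip 0 counter fi 15" the FI counter header shows mask 0x3ff and match 0x00f. A minimal sketch of how such a mask/match pair is usually interpreted: a stream is counted when its masked number equals the match value. This is the conventional meaning of mask/match and is assumed here, not taken from MQ documentation; the function name is hypothetical.

```python
def fi_counter_matches(stream, mask=0x3FF, match=0x00F):
    """True when the masked stream number equals the match value."""
    return (stream & mask) == match


print(fi_counter_matches(15))              # the stream selected above
print(fi_counter_matches(14))              # a neighbouring stream
print(fi_counter_matches(14, 0x0, 0x0))    # default mask/match of 0x000/0x000
```

Under this reading, the default 0x000/0x000 pair shown earlier matches every stream, which is consistent with the unfiltered totals in that capture.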



6. HOW TO IDENTIFY THE MQ/IX/JNH STREAM FOR AN
INTERFACE ?
Similar for JNH stream (SID).
TAZ-TBB-0(cheese vty)# show mqchip 0 ifd

Input IFD IFD LU


Stream Index Name Sid TClass
------ ------ ---------- ------ ------
12 143 ge-1/0/1 225 drop
13 143 ge-1/0/1 225 hi
14 143 ge-1/0/1 225 med
15 143 ge-1/0/1 225 lo

TAZ-TBB-0(cheese vty)# show jnh 0 stream 225

Stream ID: 225 (inst = 0)


Cntr : 0x00d0eab0
Encap : Ether
Encap = 0, StartNH = 0xd020d6
lacp:-, stp:-/0, esmc:-, lfm:-, erp:-, lldp:-, mvrp:-/-, vc:-, natVlan:-/4095, native tpid 0, tpidMask:0x0001

Input Statistics: 0000000000110639 pkts, 0000000009966765 bytes


Detail Statistics:
.....
6. HOW TO IDENTIFY THE MQ/IX/JNH STREAM FOR AN
INTERFACE ?
.....
rx0: 0000000000000000 pkts, 0000000000000000 bytes
rx1: 0000000000110639 pkts, 0000000009966765 bytes
rx2: 0000000000000000 pkts, 0000000000000000 bytes
drop0: 0000000000000000 pkts, 0000000000000000 bytes
drop1: 0000000000000000 pkts, 0000000000000000 bytes
drop2: 0000000000000000 pkts, 0000000000000000 bytes
unknown-iif: 0000000000000000 pkts, 0000000000000000 bytes
checksum: 0000000000000000 pkts, 0000000000000000 bytes
unknown-proto: 0000000000000000 pkts, 0000000000000000 bytes
bad-ucastmac: 0000000000000000 pkts, 0000000000000000 bytes
bad-vrrpmac: 0000000000000000 pkts, 0000000000000000 bytes
bad-smac: 0000000000000000 pkts, 0000000000000000 bytes
in-stp: 0000000000000000 pkts, 0000000000000000 bytes
out-stp: 0000000000000000 pkts, 0000000000000000 bytes
vlan-check: 0000000000000000 pkts, 0000000000000000 bytes
frame-errors: 0000000000000000 pkts, 0000000000000000 bytes
L4-len: 0000000000000000 pkts, 0000000000000000 bytes
Stream Features:
Topology: stream-(225)
Flavor: i-root (1), Refcount 0, Flags 0x1
Addr: 0x496f6c30, Next: 0x49a5eed8, Context 0x496f6c28
Link 0: 0240684f:3c800303, Offset 12, Next: 0240684f:3c800303
Link 1: 00000000:00000000, Offset 12, Next: 00000000:00000000
Link 2: 00000000:00000000, Offset 12, Next: 00000000:00000000
Link 3: 00000000:00000000, Offset 12, Next: 00000000:00000000

Topology Neighbors:
7. HOW TO CHECK ASIC STUCK ?
There are a few places where we can see a “stuck” condition in Trinity: LU, MQ and
TOE

A stuck condition can happen in any of them, for different reasons

http://confluence.jnpr.net/confluence/display/IPGE/Trinity+Wedges

Wedge detection is an on-going effort

http://cvs.juniper.net/cgi-bin/viewcvs.cgi/*checkout*/sw-projects/platform/trinity/pfe/RLI/Trio_Wedge_Detection_Functional_Spec_RLI_17022.docx

https://gnats.juniper.net/web/default/669131
– LU wedge detection
– https://matrix.juniper.net/docs/DOC-84959
https://gnats.juniper.net/web/default/680609
– MQ wedge detection
https://gnats.juniper.net/web/default/680604
– Host path wedge detection



7. HOW TO CHECK ASIC STUCK (LU) ?
How do I know if LU is stuck ?
No trap message is found
https://wintermute.juniper.net/projects/trinity/trinity-software/trinity-debugging/zone-timeout-with-no-ppe-trap

It usually comes with errors


Jun 13 15:04:02 mx960-2-re0 fpc1 LUCHIP(1): Secondary PPE 0 zone 1 timeout.
Jun 13 15:04:02 mx960-2-re0 fpc1 LUCHIP(1) PPE_3 Errors thread timeout error
Jun 13 15:04:02 mx960-2-re0 fpc1 LUCHIP(1) PPE_4 Errors thread timeout error
Jun 13 15:04:02 mx960-2-re0 fpc1 LUCHIP(1) PPE_9 Errors thread timeout error
Jun 13 15:04:02 mx960-2-re0 fpc1 LUCHIP(1) PPE_15 Errors thread timeout error
Jun 13 15:04:02 mx960-2-re0 fpc1 LUCHIP(1) PPE_15 Errors thread timeout error



7. HOW TO CHECK ASIC STUCK (LU) ?
The Dispatch block is responsible for monitoring the zones. When a parcel
is loaded into a zone, the zone_state_timeout counter is set to 0 and the
zone is dispatched to the PPE. The PPE accepts the zone and binds it to
a free context. The context processes the parcel until it is done, prior to
which the context will SEND the parcel to the Reorder block. When the
Reorder block has unloaded the relevant contents of the zone, the zone is freed
back to dispatch and the zone timer can be retired.

The zone timer is set to 0 at zone load time. Each time the dispatch prescaler
reaches 0, this counter is incremented. Because dispatch does not care
about the prescaler value at the time of the load, the zone_state_timeout
transition 0->1 is ignored. When the dispatch prescaler hits 0 the second
time, zone_state_timeout transitions 1->2 and the dispatch block
notifies the PPE of the thread_timeout event.

When the dispatch prescaler hits 0 the third time, zone_state_timeout
transitions 2->3 and the dispatch block notifies the control plane that a
secondary timeout has occurred. The PPE is expected to process the
thread timeout by cleaning up the context state and dropping the parcel.
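The zone timer described above is a small state machine, sketched below. The class and event strings are hypothetical, and what happens after the third expiry is not described on the slide, so the sketch simply stops counting at 3.

```python
class ZoneTimer:
    """Model of zone_state_timeout for one zone (illustrative only)."""

    EVENTS = {
        2: "thread_timeout -> PPE",
        3: "secondary_timeout -> control plane",
    }

    def __init__(self):
        self.zone_state_timeout = 0  # set to 0 when the parcel is loaded
        self.events = []

    def prescaler_expired(self):
        """Called each time the dispatch prescaler reaches 0."""
        if self.zone_state_timeout >= 3:
            return None  # behaviour past the secondary timeout is not described
        self.zone_state_timeout += 1
        event = self.EVENTS.get(self.zone_state_timeout)  # 0->1 is ignored
        if event:
            self.events.append(event)
        return event


timer = ZoneTimer()
for _ in range(3):
    print(timer.zone_state_timeout, "->", timer.prescaler_expired())
```

This matches the log pattern shown earlier, where "thread timeout error" messages for individual PPEs appear together with a "Secondary PPE ... zone ... timeout" message.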



7. HOW TO CHECK ASIC STUCK (LU) ?
Check the dispatch and reorder blocks to see if any parcel
is moving in and out.
parcel_count[3] shouldn’t be close to zero
rord_packets rate shouldn’t be close to zero

TAZ-TBB-0(cheese vty)# show luchip 0 disp rate 200

LUCHIP(0) Dispa: Aggregate Counts Rates 205 msec


hsl2_chunks 143636154522 4067804/sec 4067/ms
recirc_chunks 0
callout_count 69428714 1960/sec 1/ms
parcel_count[0] 1 < 1/sec
parcel_count[1] 3391073204 96053/sec 96/ms
parcel_count[2] 0
parcel_count[3] 3338141849 94629/sec 94/ms
error_parcels 0
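A sketch for turning the rate output above into numbers that can be checked in a script. It assumes the "name total rate/sec" layout shown in this capture; the function names and the threshold of 10/sec are hypothetical. A near-zero rate can also simply mean there is no traffic, so this is a hint to look further, not proof of a wedge.

```python
import re

# name, total, then an optional "<n>/sec" or "< n/sec" rate
RATE_RE = re.compile(r"^\s*(\S+)\s+(\d+)(?:\s+(?:<\s*)?(\d+)/sec)?")


def parse_rates(text):
    """Return {counter: rate per second}; counters without a rate map to 0."""
    rates = {}
    for line in text.splitlines():
        m = RATE_RE.match(line)
        if m:
            rates[m.group(1)] = int(m.group(3) or 0)
    return rates


def looks_idle(rates, counter, floor=10):
    """True when the counter's rate is below the (hypothetical) floor."""
    return rates.get(counter, 0) < floor


SAMPLE = """\
LUCHIP(0) Dispa: Aggregate Counts Rates 205 msec
hsl2_chunks 143636154522 4067804/sec 4067/ms
recirc_chunks 0
parcel_count[0] 1 < 1/sec
parcel_count[1] 3391073204 96053/sec 96/ms
parcel_count[2] 0
parcel_count[3] 3338141849 94629/sec 94/ms
error_parcels 0
"""
rates = parse_rates(SAMPLE)
print(rates["parcel_count[3]"], looks_idle(rates, "parcel_count[3]"))
```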



7. HOW TO CHECK ASIC STUCK (LU) ?
TAZ-TBB-0(cheese vty)# show luchip 0 rord rate 200

LUCHIP(0) Reord: Aggregate Counts Rates 215 msec


rord_packets 3472522584 96516/sec 96/ms
rord_nonpackets 1 < 1/sec
rord_recirc 0
rord_hsl2_chunks 117618375854 3269488/sec 3269/ms
perf counter 8653060951056 206782000/sec 206782/ms

Check if the PPEs are busy


Busy PPE zone is marked with 1
TAZ-TBB-0(cheese vty)# show luchip 0 disp
LUCHIP(0) DISP config:
......
PPE[0] Zone Enable Mask 0x00fffffe (0x00000000)
PPE[1] Zone Enable Mask 0x00ffffff (0x00000000)
PPE[2] Zone Enable Mask 0x00ffffff (0x00000000)
......
PPE[11] Zone Enable Mask 0x00ffffff (0x00000000)
PPE[12] Zone Enable Mask 0x00ffffff (0x00000000)
PPE[13] Zone Enable Mask 0x00ffffff (0x00000000)
PPE[14] Zone Enable Mask 0x00ffffff (0x00000000)
PPE[15] Zone Enable Mask 0x00ffffff (0x00000000)
7. HOW TO CHECK ASIC STUCK (LU) ?
Are all the PPEs busy ?
unused_slot_count should be low when it’s busy

TAZ-TBB-0(cheese vty)# show luchip 0 ppe_perf rate 0xffff 200

LUCHIP(0) PPE0 Perf Mon: Aggregate Counts Rates 225 msec


umem_instr_count 224016457644 1612902/sec 1612/ms
gumem_instr_count 74695208583 538017/sec 538/ms
cancel_instr_count 41171489704 298560/sec 298/ms
unused_slot_count 1996233216699 14147871/sec 14147/ms

LUCHIP(0) PPE1 Perf Mon: Aggregate Counts Rates 225 msec


umem_instr_count 169627946079 1223871/sec 1223/ms
gumem_instr_count 4887330085 12844/sec 12/ms
cancel_instr_count 842985896 6120/sec 6/ms
unused_slot_count 1045509817766 5830595/sec 5830/ms
.....
LUCHIP(0) PPE15 Perf Mo: Aggregate Counts Rates 225 msec
umem_instr_count 169180795281 1217404/sec 1217/ms
gumem_instr_count 6984145088 12951/sec 12/ms
cancel_instr_count 847220031 6204/sec 6/ms
unused_slot_count 1178828829006 5749804/sec 5749/ms
7. HOW TO CHECK ASIC STUCK (LU) ?
Get a packet and see what the PPE is working on
For example, PPE 5 Zone 3 is working on something
Extract the packet and run ttrace to check what it is stuck on, for example a
lookup loop.
TAZ-TBB-0(cheese vty)# show luchip 0 ppe 5 cntx
PPE 5 Zone 12 Context 0 State: free Ucode_addr 0x02a8
PPE 5 Zone 1 Context 1 State: free Ucode_addr 0x02a8
PPE 5 Zone 14 Context 2 State: free Ucode_addr 0x02a8
PPE 5 Zone 3 Context 3 State: busy blocked Ucode_addr 0x62a8
.....

TAZ-TBB-0(cheese vty)# show luchip 0 ppe 5 zone 3


PPE 5 LMEM Zone 3:
0x0000: 0x00000000c8003b01 c00101023d000000 0000000000000000 0000000000000000
0x0020: 0x0000000000000000 0000000000000000 0000000000000000 014e000000000000
^^^
packet is 334-8 = 326 bytes long (parcel_head)
0x0040: 0x1005807003f88408 80711fc266948071 1fc2669308004500 03e8000000003c3d
^^^^^^^^^^cookie^^ ^^^^^^^^^^Ethernet hdr^^^^^^^^....IPv4 pkt
0x0060: 0xb6d4c0010102c800 3b01000000000000 0000000000000000 0000000000000000
0x0080: 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
.....
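
The annotated rows of the zone dump can be decoded offline. This is a minimal sketch, assuming the cookie is 8 bytes wide as the slide marker suggests and that an untagged Ethernet header and a 20-byte IPv4 header follow; the function names are hypothetical.

```python
import ipaddress
import struct

# Hex words copied from LMEM offsets 0x0040-0x0060 of the dump above.
LMEM_WORDS = (
    "1005807003f88408 80711fc266948071 1fc2669308004500 03e8000000003c3d "
    "b6d4c0010102c800 3b01"
)
COOKIE_LEN = 8             # assumption: width taken from the slide's "cookie" marker
PARCEL_LEN_FIELD = 0x014E  # length field at the end of LMEM row 0x0020


def mac(raw):
    """Format raw bytes as a colon-separated MAC address."""
    return ":".join(f"{byte:02x}" for byte in raw)


def decode_parcel_head(hex_words):
    """Split cookie, Ethernet header and IPv4 header out of the hex dump."""
    raw = bytes.fromhex(hex_words.replace(" ", ""))
    eth = raw[COOKIE_LEN:COOKIE_LEN + 14]
    ip = raw[COOKIE_LEN + 14:COOKIE_LEN + 34]
    _, _, total_len, _, _, ttl, proto, _, src, dst = struct.unpack("!BBHHHBBH4s4s", ip)
    return {
        "dst_mac": mac(eth[0:6]),
        "src_mac": mac(eth[6:12]),
        "ethertype": struct.unpack("!H", eth[12:14])[0],
        "ip_total_len": total_len,
        "ttl": ttl,
        "protocol": proto,
        "src_ip": str(ipaddress.IPv4Address(src)),
        "dst_ip": str(ipaddress.IPv4Address(dst)),
    }


print("parcel length:", PARCEL_LEN_FIELD - 8)  # 334 - 8, as on the slide
print(decode_parcel_head(LMEM_WORDS))
```

The decoded addresses match the values that also appear in LMEM row 0x0000 (c0010102 and c8003b01), which is a quick sanity check on the offsets.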
7. HOW TO CHECK ASIC STUCK (MQ) ?
Usually, the MQ gets stuck because it is running out of PT entries.
[Jun 10 03:45:00.400 LOG: Debug] MQCHIP(0) FI PT or usemeter drops
[Jun 10 03:45:01.416 LOG: Debug] MQCHIP(0) FI PT or usemeter drops

Jun 13 15:04:01 mx960-2-re0 fpc1 MQCHIP(1) PT Errors in tail length/offset in CPT scan
Jun 13 15:04:01 mx960-2-re0 fpc1 MQCHIP(1) LI Packet length error, pt entry 24
Jun 13 15:04:01 mx960-2-re0 fpc1 MQCHIP(1) MALLOC Pre-Q Reference Count underflow -
Jun 13 15:04:01 mx960-2-re0 fpc1 MQCHIP(1) PT Errors in tail length/offset in CPT scan
Jun 13 15:04:01 mx960-2-re0 fpc1 MQCHIP(1) LI Packet length error, pt entry 2
Jun 13 15:04:01 mx960-2-re0 fpc1 MQCHIP(1) PT Errors in tail length/offset in CPT scan
Jun 13 15:04:01 mx960-2-re0 fpc1 MQCHIP(1) LI Packet length error, pt entry 24
Jun 13 15:04:02 mx960-2-re0 fpc1 MQCHIP(1) LI Packet length error, pt entry 0

[Jul 11 02:27:04.398 LOG: Err] MQCHIP(0) FI Enqueuing error


[Jul 11 02:27:05.387 LOG: Err] MQCHIP(0) FI cell underflow at engine stage
[Jul 11 02:27:05.387 LOG: Err] MQCHIP(0) FI Reorder cell timeout
[Jul 11 02:27:05.387 LOG: Err] MQCHIP(0) FI Cell underflow at the state stage
[Jul 11 02:27:05.387 LOG: Err] MQCHIP(0) FI Enqueuing error
[Jul 11 02:27:06.398 LOG: Err] MQCHIP(0) FI cell underflow at engine stage
[Jul 11 02:27:06.398 LOG: Err] MQCHIP(0) FI Reorder cell timeout
[Jul 11 02:27:06.398 LOG: Err] MQCHIP(0) FI Cell underflow at the state stage
[Jul 11 02:27:06.399 LOG: Err] MQCHIP(0) FI Enqueuing error



7. HOW TO CHECK ASIC STUCK (MQ) ?
Because the MQ is running out of PT entries, it may back-pressure the LU
(the MQ fails to receive the unloaded parcel from the PPE after
processing), so we can see the LU timeouts below as well. They are not
necessarily an LU problem.
[Aug 22 07:41:59.296 LOG: Err] LUCHIP(3): Secondary PPE 0 zone 2 timeout.
[Aug 22 07:42:01.296 LOG: Err] LUCHIP(3): Secondary PPE 0 zone 5 timeout.
[Aug 22 07:42:03.296 LOG: Err] LUCHIP(3): Secondary PPE 0 zone 20 timeout.
[Aug 22 07:42:05.296 LOG: Err] LUCHIP(3): Secondary PPE 0 zone 1 timeout.
[Aug 22 07:42:07.296 LOG: Err] LUCHIP(3): Secondary PPE 0 zone 1 timeout.
[Aug 22 07:42:09.296 LOG: Err] LUCHIP(3): Secondary PPE 0 zone 1 timeout.

NPC3(eabu-bng-dt-f vty)# sho luchip 0 disp rate 200

LUCHIP(0) Dispa: Aggregate Counts Rates 205 msec


hsl2_chunks 19081161 2204/sec 2/ms
recirc_chunks 0
callout_count 2862138 < 1/sec
parcel_count[0] 1 < 1/sec
parcel_count[1] 6672455 1102/sec 1/ms
parcel_count[2] 0
parcel_count[3] 542609 < 1/sec
error_parcels 13 < 1/sec
7. HOW TO CHECK ASIC STUCK (MQ) ?
Check the MQ counters and PT usage meter
TAZ-TBB-0(cheese vty)# show mqchip 0 counters
......
PT Counters:
PCT FI entries 391
PCT WI entries 2197
PCT LI entries 1

CPT FI entries 303


CPT WI entries 5167
CPT LI entries 0
CPT MCAST entries 0

TAZ-TBB-0(cheese vty)# show mqchip 0 pt status


......
fromli:
lu_map_empty[1] : 0x0
pct_rdy[1] : 0x0



7. HOW TO CHECK ASIC STUCK (MQ) ?
A set bit means we are dropping PT entry requests coming from
that input
TAZ-TBB-0(cheese vty)# show mqchip 0 pt usemeter status
PCT UM Status:
FI drop bits : 0x0
WI drop bits : 0x0
LI drop bits : 0x0
FI region : 0
WI region : 0
LI region : 0
TOT region : 0

CPT UM Status:
FI drop bits : 0xf
WI drop bits : 0x3
LI drop bits : 0x3
FI region : 3
WI region : 0
LI region : 0
TOT region : 3
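
The usemeter check can be scripted. The sketch below lists every input with a non-zero drop mask and which bits are set; the sample text is trimmed from the output above and the function name is hypothetical.

```python
import re

SAMPLE = """\
PCT UM Status:
FI drop bits : 0x0
WI drop bits : 0x0
LI drop bits : 0x0
CPT UM Status:
FI drop bits : 0xf
WI drop bits : 0x3
LI drop bits : 0x3
"""

SECTION = re.compile(r"^(PCT|CPT) UM Status:")
DROP = re.compile(r"^(FI|WI|LI) drop bits\s*:\s*(0x[0-9a-fA-F]+)")


def dropping_inputs(text):
    """Return (table, input, set bit positions) for every non-zero drop mask."""
    found, table = [], None
    for line in text.splitlines():
        line = line.strip()
        section = SECTION.match(line)
        if section:
            table = section.group(1)
            continue
        drop = DROP.match(line)
        if drop:
            mask = int(drop.group(2), 16)
            if mask:
                bits = [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]
                found.append((table, drop.group(1), bits))
    return found


for table, source, bits in dropping_inputs(SAMPLE):
    print(f"{table}: dropping {source} requests, bits set: {bits}")
```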



7. HOW TO CHECK ASIC STUCK (MQ) ?
Memory can get stuck on the queues even if the board is good
This happens when there is a fabric destination error on MX
Since the MQ uses port groups (group 0 for all Trinity PFEs, group 1 for all
Ichip PFEs), an error on one remote PFE can cause problems for the whole
group!

NPC5(GRTWASEQ5 vty)# show mqchip 1 malloc memory


SRAM total : 32768 chunks
SRAM WI usage : 3194 chunks (9%)
SRAM FI usage : 5883 chunks (17%)
SRAM LI usage : 416 chunks (1%)
SRAM DBB usage : 0 chunks (0%)

DRAM DBB0 total : 327680 pages


DRAM DBB0 usage : 60 pages (0%)
DRAM DBB1 total : 196608 pages
DRAM DBB1 usage : 196608 pages (100%)
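
A quick way to spot an exhausted pool in this output is to compute usage over total for each DRAM DBB pool. The sketch below does that; the 95% alert threshold is a hypothetical value chosen for illustration, and the function name is an assumption.

```python
import re

SAMPLE = """\
SRAM total : 32768 chunks
SRAM WI usage : 3194 chunks (9%)
DRAM DBB0 total : 327680 pages
DRAM DBB0 usage : 60 pages (0%)
DRAM DBB1 total : 196608 pages
DRAM DBB1 usage : 196608 pages (100%)
"""

DRAM = re.compile(r"^DRAM\s+(DBB\d+)\s+(total|usage)\s*:\s*(\d+)\s+pages")
ALERT_THRESHOLD = 0.95  # hypothetical threshold, not from the slides


def dram_usage(text):
    """Map DBB pool name -> fraction of pages in use."""
    pools = {}
    for line in text.splitlines():
        match = DRAM.match(line.strip())
        if match:
            pools.setdefault(match.group(1), {})[match.group(2)] = int(match.group(3))
    return {
        name: values["usage"] / values["total"]
        for name, values in pools.items()
        if values.get("total") and "usage" in values
    }


for name, share in sorted(dram_usage(SAMPLE).items()):
    flag = "  <-- exhausted?" if share >= ALERT_THRESHOLD else ""
    print(f"{name}: {share:.1%}{flag}")
```

In the sample, DBB1 is fully used while DBB0 is nearly empty, which is the pattern the slide is pointing at.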



7. HOW TO CHECK ASIC STUCK (MQ) ?
NPC5(mx2-shooter-re0 vty)# show mqchip 0 fo pgrp 0

FO PGRP 0:
port rq_en gr_en rcsel
---- ----- ----- -----
0 1 1 4
1 0 0 0
2 1 1 5
3 0 0 0
4 1 1 0
5 0 0 0
6 1 1 1
7 0 0 0
8 1 1 2
9 0 0 0
10 1 1 3
11 0 0 0
12 0 0 0
13 0 0 0
14 0 0 0
15 0 0 0
......
Request Outstanding counters:
rcnt 0: 14
rcnt 1: 8
rcnt 2: 7
rcnt 3: 7
rcnt 4: 8
rcnt 5: 4
rcnt 6: 0
rcnt 7: 0



7. HOW TO CHECK ASIC STUCK (MQ) ?
NPC5(mx2-shooter-re0 vty)# show mqchip 0 fo stream 36

FO phy_stream 36:
Port Group : 0
Req Enable : 1
Grant Enable : 1
Drop Enable : 0

Request Outstanding counters:


rcnt 0: 17
rcnt 1: 6
rcnt 2: 8
rcnt 3: 8
rcnt 4: 8
rcnt 5: 3
rcnt 6: 0
rcnt 7: 0



7. HOW TO CHECK ASIC STUCK (MQ) ?
Sometimes it is not a stuck ASIC…
When the PFE is tested with small packets at line rate, these messages might show
up. That is a performance issue
The key check for a real stuck condition: stop the traffic and see if the
PT usage is still high

[Sep 30 04:40:13.397 LOG: Debug] MQCHIP(0) FI PT or usemeter drops


[Sep 30 04:40:14.398 LOG: Debug] MQCHIP(0) FI PT or usemeter drops
[Sep 30 04:40:15.397 LOG: Debug] MQCHIP(0) FI PT or usemeter drops
[Sep 30 04:40:16.397 LOG: Debug] MQCHIP(0) FI PT or usemeter drops



7. HOW TO CHECK ASIC STUCK (MQ) ?
There is not much we can do once it is broken!
The board has to be reset to make it work again.
We need to replicate the problem before we can fix it.
For example:
– https://gnats.juniper.net/web/default/661698
– https://gnats.juniper.net/web/default/662160



7. HOW TO CHECK ASIC STUCK (MQ) ?
Fabric hardening project should help with that.
http://cvs.juniper.net/cgi-bin/viewcvs.cgi/sw-projects/platform/atlas/fabric_hardening/15438_fabric_hardening_funcspec.txt
PR/602847, PR/610828 and RLI 15438
http://cvs.juniper.net/cgi-bin/viewcvs.cgi/sw-projects/platform/atlas/fabric_hardening/ichip_fabric_selfping_spec.txt



7. HOW TO CHECK ASIC STUCK (TOE) ?
The TOE (Trinity Offload Engine) is the host interface for LU and MQ
Connects to the host via PCIe
ASIC configuration
Updates data structures in DMEM
Gathers statistics (LU)
Hostbound exception packets (MQ)

Problems are usually related to hostbound traffic



7. HOW TO CHECK ASIC STUCK (TOE) ?
TOE in LU
TAZ-TBB-0(cheese vty)# show toe pfe 0 lu 0 packet-stats
TX packets
tx accepted: 0000000000823737
tx transferred: 0000000000823737
tx rejected: 0000000003052611

TX descriptors
tx accepted: 0000000000823737
tx completed: 0000000000823737

elapsed time: 140792.000


TX Rates:
TX packets per second: 5
TX descriptors per second: 5
TX bytes per second: 641
TX descriptors completed since last count: 823737

RX packets
rx accepted: 0000000001269326
......



7. HOW TO CHECK ASIC STUCK (TOE) ?
.....
RX descriptors
RX completed: 0000000001811580
RX recycled: 0000000001811580
RX recycle fails: 0000000000000000

total captured tokens: 0000000001351161


total nacked tokens: 0000000000006242

RX Rates:
RX packets per second: 9
RX descriptors per second: 12
RX bytes per second: 4565
RX completed since last count: 1811580
TX errors:
FIFO not initialized: 0000000000000000
head descriptor invalid: 0000000007636926
init descriptor idx invalid: 0000000000000000
packet null: 0000000000000000
packet buffer out of range: 0000000000000000
packet length out of range: 0000000000000000
......



7. HOW TO CHECK ASIC STUCK (TOE) ?
......
RX errors:
SOP/prev packet: 0000000000000000
no prev packet: 0000000000000000
no start packet: 0000000000000000
init descriptor idx invalid: 0000000000000000

Other errors:
Chip start fail: 0000000000000000
TOE chip:
Current TOE status: TOE thread 0 status:

Good! (Status code 0x0 Unknown status code 0x0)


LU TOE pfe 0 asic 0 thread 0 current PC: 0x00000e3c
LU TOE pfe 0 asic 0 thread 0 reset reg contains 0x00000000
LU TOE pfe 0 asic 0 thread 0 reset PC reg contains 0x00000000
LU TOE pfe 0 asic 0 thread 0 stall reg contains 0x00000000
TOE congestion:
No space in TOE TX packet FIFO: 0
No space in driver RX descriptor FIFO: 0



7. HOW TO CHECK ASIC STUCK (TOE) ?
TOE in MQ
TAZ-TBB-0(cheese vty)# show toe pfe 0 mq 0 packet-stats
TX packets
tx accepted: 0000000000824886
tx transferred: 0000000000824886
tx rejected: 0000000003052670

TX descriptors
tx accepted: 0000000000824886
tx completed: 0000000000824886

elapsed time: 174.000


TX Rates:
TX packets per second: 6
TX descriptors per second: 6
TX bytes per second: 730
TX descriptors completed since last count: 1149

RX packets
rx accepted: 0000000001270356
......



7. HOW TO CHECK ASIC STUCK (TOE) ?
......
RX descriptors
RX completed: 0000000001812610
RX recycled: 0000000001812610
RX recycle fails: 0000000000000000

total captured tokens: 0000000001352191


total nacked tokens: 0000000000006242

RX Rates:
RX packets per second: 5
RX descriptors per second: 5
RX bytes per second: 678
RX completed since last count: 1030
TX errors:
FIFO not initialized: 0000000000000000
head descriptor invalid: 0000000007637373
init descriptor idx invalid: 0000000000000000
packet null: 0000000000000000
packet buffer out of range: 0000000000000000
packet length out of range: 0000000000000000
.....



7. HOW TO CHECK ASIC STUCK (TOE) ?
......
RX errors:
SOP/prev packet: 0000000000000000
no prev packet: 0000000000000000
no start packet: 0000000000000000
init descriptor idx invalid: 0000000000000000

Other errors:
Chip start fail: 0000000000000000
TOE chip:
Current TOE status: TOE thread 0 status:

Bad! (Packet Transfer: bad stream data)


MQ TOE pfe 0 asic 0 thread 0 current PC: 0x000008dc
MQ TOE pfe 0 asic 0 thread 0 reset reg contains 0x00000000
MQ TOE pfe 0 asic 0 thread 0 reset PC reg contains 0x00000000
MQ TOE pfe 0 asic 0 thread 0 stall reg contains 0x00000000
TOE congestion:
No space in TOE TX packet FIFO: 0
No space in driver RX descriptor FIFO: 9057821
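
The two things that differ between the LU and MQ outputs are the TOE status line (Good! versus Bad!) and the congestion counters. The sketch below pulls out just those; it assumes the status line always starts with "Good!" or "Bad!" as in the samples, and the function name is hypothetical.

```python
import re

SAMPLE = """\
TOE chip:
Current TOE status: TOE thread 0 status:

Bad! (Packet Transfer: bad stream data)

TOE congestion:
No space in TOE TX packet FIFO: 0
No space in driver RX descriptor FIFO: 9057821
"""

CONGESTION = re.compile(r"^(No space in .+?):\s*(\d+)$")


def toe_health(text):
    """Return (status_ok, non-zero congestion counters) from packet-stats text."""
    status_ok, congestion = None, {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Good!"):
            status_ok = True
        elif line.startswith("Bad!"):
            status_ok = False
        match = CONGESTION.match(line)
        if match:
            congestion[match.group(1)] = int(match.group(2))
    return status_ok, {name: value for name, value in congestion.items() if value}


print(toe_health(SAMPLE))
```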



7. HOW TO CHECK ASIC STUCK (TOE) ?
TAZ-TBB-0(cheese vty)# show toe pfe 0 lu 0 wedge-stats

PFE 0 Host Path Wedge Test Status


--------------------------------
Declared Wedge: NO
Wedge window size: 0
TOE ucode halted..................NO
Suspected to-asic blockage........NO
TOE driver host path app halted...NO

TAZ-TBB-0(cheese vty)# show toe pfe 0 mq 0 wedge-stats

PFE 0 Host Path Wedge Test Status


--------------------------------
Declared Wedge: NO
Wedge window size: 0
TOE ucode halted..................NO
Suspected to-asic blockage........NO
TOE driver host path app halted...NO

TAZ-TBB-0(cheese vty)#



7. HOW TO CHECK ASIC STUCK (TOE) ?
TAZ-TBB-0(cheese vty)# show host_loopback 0 status
PFE invalid: 0000000000000000
No CM Alarm handle: 0000000000000000

PFE 0 Host Path Wedge Test Status


--------------------------------
Declared Wedge: NO
Suspected wedge: NO

Wedge window size: 0

Send failures: 11
Send ack timeout: 329
Packet get failures: 0
Current time is 3669 seconds
Sent Wedge test packets: 3608.
Test packet 3600 sent at 3661
Test packet 3601 sent at 3662
Test packet 3602 sent at 3663
Test packet 3603 sent at 3664
Test packet 3604 sent at 3665
Test packet 3605 sent at 3666
Test packet 3606 sent at 3667
Test packet 3607 sent at 3668
.....
7. HOW TO CHECK ASIC STUCK (TOE) ?
......

Wedge test packets counted on the PFE: 3607

Received Wedge test packets: 3584


Test packet 16 received at 3661
Test packet 17 received at 3662
Test packet 18 received at 3663
Test packet 19 received at 3664
Test packet 20 received at 3665
Test packet 21 received at 3666
Test packet 22 received at 3667
Test packet 23 received at 3668
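
The loopback counters can be compared directly. The sketch below parses the summary lines and reports how many test packets were sent but not received back; the sample is trimmed from the output above, and the field handling is an assumption about the text layout.

```python
import re

SAMPLE = """\
Send failures: 11
Send ack timeout: 329
Packet get failures: 0
Sent Wedge test packets: 3608.
Wedge test packets counted on the PFE: 3607
Received Wedge test packets: 3584
"""

FIELD = re.compile(r"^(.+?):\s*(\d+)\.?$")


def parse_loopback(text):
    """Map each 'name: value' line of the status output to an integer."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line.strip())
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def loopback_gap(fields):
    """Test packets sent by the host but not received back."""
    return fields["Sent Wedge test packets"] - fields["Received Wedge test packets"]


stats = parse_loopback(SAMPLE)
print("sent but not received:", loopback_gap(stats))
```

A gap that keeps growing between two runs of the command is more telling than the absolute value.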

TAZ-TBB-0(cheese vty)#



7. HOW TO CHECK ASIC STUCK (TOE) ?
Hostbound traffic stats
TAZ-TBB-0(cheese vty)# show jnh host-path-stats
Transmits: 0000000000827256
Bypass: 0000000000827058
Non-Bypass: 0000000000000198
Commands: 0000000000000000
Probes: 0000000000000000
OAM LFM Loop: 0000000000000000
CCC injects: 0000000000000000

Receives: 0000000001272472
Options: 0000000000041767
MLP: 0000000000000000
Probes: 0000000000000000
Skip services: 0000000000000000

Receive Errors: 0000000000542034


Bad Ifl Indx: 0000000000000000
Bad Reason: 0000000000000000
Bad PFE Parse: 0000000000542034
Bad GRE Ctrl : 0000000000000000
......
7. HOW TO CHECK ASIC STUCK (TOE) ?
......
Null packet_t ptr: 0000000000000000
Null packet alloc: 0000000000000000
Bad packet notif: 0000000000000000
Bad IFL: 0000000000000000
Bad local IFL: 0000000000000000
Non-ISIS frag GRE: 0000000000000000
Hardware error: 0000000000000000
Bad Notif Hint: 0000000000000000

Transmit Errors: 0000000000000000


Bad Inst 0000000000000000
Bad Ifl 0000000000000000
Bad Ifd 0000000000000000
Bad Channel 0000000000000000
Bad Length 0000000000000000
ASIC err 0000000000571745
OAM LFM Loop-L2 0000000000000000
OAM LFM Loop-L3 0000000000000000

TAZ-TBB-0(cheese vty)#
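
These counters are cumulative, so a single snapshot does not show whether an error is still occurring. The sketch below diffs two snapshots; the first uses values from the output above, while the second snapshot is made-up data for illustration only, and the function names are hypothetical.

```python
import re

BEFORE = """\
Receive Errors: 0000000000542034
Bad PFE Parse: 0000000000542034
Bad GRE Ctrl : 0000000000000000
ASIC err 0000000000571745
"""

# Hypothetical second snapshot, invented to demonstrate the diff.
AFTER = """\
Receive Errors: 0000000000542034
Bad PFE Parse: 0000000000542034
Bad GRE Ctrl : 0000000000000000
ASIC err 0000000000571802
"""

FIELD = re.compile(r"^(.+?)\s*:?\s+(\d{16})$")


def parse_counters(text):
    """Map counter name -> value for lines ending in a 16-digit counter."""
    counters = {}
    for line in text.splitlines():
        match = FIELD.match(line.strip())
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def incrementing(before, after):
    """Return only the counters that changed between two snapshots."""
    return {name: after[name] - value
            for name, value in before.items()
            if name in after and after[name] != value}


print(incrementing(parse_counters(BEFORE), parse_counters(AFTER)))
```

Names such as Probes appear under both Transmits and Receives in the full output, so a real parser would need to track the section as well.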



7. HOW TO CHECK ASIC STUCK (TOE) ?
ASIC err is bad

Bad PFE Parse is not necessarily bad

pfe_notif_parse_no_dma() is used to rate limit notifications based on
the notification type, so rate-limited notifications also increment this counter
Check the “show pfe statistics notification” output

printf(" Bad PFE Parse: %016llu\n", jnh_pkt_stats.rx_err_parse);

status = pfe_notif_parse_no_dma(notifp);
if (status != EOK) {
    jnh_pkt_stats.rx_err_parse++;
    goto error;
}



7. HOW TO CHECK ASIC STUCK (TOE) ?
Some known issues to check
https://gnats.juniper.net/web/default/676729
https://gnats.juniper.net/web/default/671205
https://gnats.juniper.net/web/default/590547
https://gnats.juniper.net/web/default/586866
https://gnats.juniper.net/web/default/584521
https://gnats.juniper.net/web/default/579340
https://gnats.juniper.net/web/default/584957
and more to come….



8. IS THE ASIC RUNNING OUT OF HORSEPOWER ?
How can I tell if the ASIC is already running at full speed,
i.e. it is running at its maximum and hitting the limit?

It depends on the usage of various resources:

Microcode instructions
External and internal data memory references
Counter writes/reads
Hash block accesses



8. IS THE ASIC RUNNING OUT OF HORSEPOWER ?
LU Contexts and Zones
A Context in the LU is like a process within a CPU
A Zone is the private memory used by the Context
There are 24 Zones and 20 Contexts (per PPE)
The Dispatch block loads each parcel onto the least loaded PPE
The Reorder block unloads the parcel when the lookup is done

When the number of active Zones is close to the number of PPEs,
the LU is operating near full throttle
No zone is in use here.
TAZ-TBB-0(cheese vty)# show jspec luchip[0] registers disp pzarb zone_active 0

Offset Name Current


0x00121000 lu.disp.pzarb.zone_active[0].ppe 00000000

TAZ-TBB-0(cheese vty)#



8. IS THE ASIC RUNNING OUT OF HORSEPOWER ?
Most of the information can be found in the output of the “show luchip x rate”
command.
How to check the LU processing rate?
Parcel[0] – Callouts used by applications like aging and the flow exporter
Parcel[1] – Stats update parcels from MQ, which are used by LU to
increment counters for packets, based on the enqueue result in MQ
Parcel[2] – Re-circulated parcels, when LU re-injects packets to itself for
more processing
Parcel[3] – Data packets



8. IS THE ASIC RUNNING OUT OF HORSEPOWER ?
TAZ-TBB-0(cheese vty)# show luchip 0 rate 200

LUCHIP(0) Dispa: Aggregate Counts Rates 215 msec


hsl2_chunks 919687521309 4094586/sec 4094/ms
recirc_chunks 0
callout_count 444039635 1976/sec 1/ms
parcel_count[0] 1 < 1/sec
parcel_count[1] 21718161880 96646/sec 96/ms
parcel_count[2] 0
parcel_count[3] 21373715129 95158/sec 95/ms
error_parcels 0
......

The maximum throughput would be around 50Mpps when the
ingress and egress PFE are the same. Otherwise, we can reach
100Mpps
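
To relate the sampled rate to those ceilings, the sketch below takes the parcel_count[3] rate and divides it by 50Mpps or 100Mpps. Treating the data-parcel rate as packets per second is an assumption, and the function names are hypothetical.

```python
import re

SAMPLE = """\
parcel_count[0] 1 < 1/sec
parcel_count[1] 21718161880 96646/sec 96/ms
parcel_count[2] 0
parcel_count[3] 21373715129 95158/sec 95/ms
"""

RATE = re.compile(r"^parcel_count\[(\d)\]\s+\d+\s+(?:<\s*)?(\d+)/sec")
CEILING_SAME_PFE = 50_000_000    # ~50Mpps, ingress and egress on the same PFE
CEILING_OTHER_PFE = 100_000_000  # ~100Mpps otherwise


def parcel_rates(text):
    """Map parcel type -> rate per second ('< 1/sec' is kept as 1)."""
    rates = {}
    for line in text.splitlines():
        match = RATE.match(line.strip())
        if match:
            rates[int(match.group(1))] = int(match.group(2))
    return rates


def data_parcel_load(text, same_pfe):
    """Share of the quoted ceiling used by data parcels (parcel type 3)."""
    ceiling = CEILING_SAME_PFE if same_pfe else CEILING_OTHER_PFE
    return parcel_rates(text).get(3, 0) / ceiling


print(f"{data_parcel_load(SAMPLE, same_pfe=True):.3%} of the same-PFE ceiling")
```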



8. IS THE ASIC RUNNING OUT OF HORSEPOWER ?
External RLDRAM access can be a bottleneck
There are 4 RLDRAM parts, each part has 8 memory banks
Each part runs at 533MHz (or 533M transactions per second) and each
bank takes 1/8 of the load
......
LUCHIP(0) RLDRA: Aggregate READ Counts Rates 220 msec
Bank 0 7027938180 33000/sec 33/ms
Bank 1 7053390312 32868/sec 32/ms
Bank 2 7067851960 33054/sec 33/ms
Bank 3 6848931658 31504/sec 31/ms
Bank 4 6954803756 31981/sec 31/ms
Bank 5 6966272064 32218/sec 32/ms
Bank 6 6987172517 32095/sec 32/ms
Bank 7 7088862925 32954/sec 32/ms

LUCHIP(0) RLDRA: Aggregate WRITE Counts Rates 220 msec


Bank 0 9968243 < 1/sec
Bank 1 104365440 581/sec
Bank 2 36280924 9/sec
Bank 3 26924071 272/sec
Bank 4 107434928 963/sec
Bank 5 30984794 4/sec
Bank 6 52295590 < 1/sec
Bank 7 185036570 1395/sec 1/ms
......
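
As a rough worked example of the numbers above: if each bank takes 1/8 of a part's 533M transactions per second, the per-bank budget is about 66.6M/sec. The sketch compares the sampled read rates with that budget. Whether the displayed rates are per part or summed across parts is not stated, so this reading is an assumption.

```python
PART_RATE = 533_000_000  # transactions per second per RLDRAM part (from the slide)
BANKS_PER_PART = 8

# Read rates per second copied from the READ table above.
READ_RATES = {0: 33000, 1: 32868, 2: 33054, 3: 31504,
              4: 31981, 5: 32218, 6: 32095, 7: 32954}

PER_BANK_BUDGET = PART_RATE / BANKS_PER_PART  # assumes an even 1/8 split per bank


def bank_load(rate):
    """Fraction of the per-bank budget consumed by the given rate."""
    return rate / PER_BANK_BUDGET


busiest = max(READ_RATES, key=READ_RATES.get)
print(f"per-bank budget: {PER_BANK_BUDGET:,.0f}/sec")
print(f"busiest bank: {busiest} at {bank_load(READ_RATES[busiest]):.4%}")
```

An uneven spread across banks is worth noting too, since one hot bank can limit throughput before the average looks high.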
8. IS THE ASIC RUNNING OUT OF HORSEPOWER ?
It is possible that the RLDRAM load is not high even if the LU is
busy. That is because of the cache implemented in the memory
controller
There are 4 external and 4 internal memory controllers
......
LUCHIP(0) MXI X: Aggregate Counts Rates 215 msec
emc_0_reads 187763973147 834539/sec 834/ms
emc_1_reads 178385027596 783330/sec 783/ms
emc_2_reads 177198741258 779000/sec 779/ms
emc_3_reads 181273416417 821604/sec 821/ms
imc_0_reads 79327601291 353948/sec 353/ms
imc_1_reads 54466776260 257041/sec 257/ms
imc_2_reads 26232989868 134200/sec 134/ms
imc_3_reads 37635601508 188716/sec 188/ms
emc_0_writes 9997083886 42837/sec 42/ms
emc_1_writes 2027536772 6079/sec 6/ms
emc_2_writes 10846411253 50897/sec 50/ms
emc_3_writes 12661523753 58632/sec 58/ms
imc_0_writes 5109156021 23506/sec 23/ms
imc_1_writes 2282243769 9390/sec 9/ms
imc_2_writes 4592116280 20441/sec 20/ms
imc_3_writes 453140527 1934/sec 1/ms
hash_reads 139597971456 < 1/sec
plct_reads 46487491775 207227/sec 207/ms
PIO_reads 24077561978 990660/sec 990/ms
hash_writes 0
plct_writes 47034337650 209660/sec 209/ms
PIO_writes 17592764 < 1/sec
8. IS THE ASIC RUNNING OUT OF HORSEPOWER ?
How about the counter block ?
A favorite customer test, usually done with firewall filters

plct_p0_xtxn – Policer
Worst Case
– 1 Policer in LU
– Performance 800/14 = 57M Policer XTXN/sec
Best Case
– 200M Policer XTXN/sec (Number of Policers in LU < 24)

plct_p1_xtxn – Counters
Worst Case
– 1 Counter in LU
– Performance 800/5 = 160M Counter XTXN/sec
Best Case
– 400M Counter XTXN/sec (Number of Counters in LU < 32)



8. IS THE ASIC RUNNING OUT OF HORSEPOWER ?
Typical Number – Around 80-90% of best case
......
LUCHIP(0) XP2B : Aggregate Counts Rates 215 msec
rord_p0_xtxn 121961475326 543079/sec 543/ms
rord_p1_xtxn 109784498683 492576/sec 492/ms
rord_p2_xtxn 100331906139 445702/sec 445/ms
rord_p3_xtxn 99693835356 444172/sec 444/ms
rord_p4_xtxn 99781014531 448065/sec 448/ms
rord_p5_xtxn 99849088710 445990/sec 445/ms
rord_p6_xtxn 99902348300 446255/sec 446/ms
rord_p7_xtxn 99659043211 447548/sec 447/ms
plct_p0_xtxn 2535541 < 1/sec
plct_p1_xtxn 73497867356 329088/sec 329/ms
lock_xtxn 1331763764 5962/sec 5/ms
hash_xtxn 23266328576 < 1/sec
hash_lmem_xtxn 0
host_xtxn 470801100 < 1/sec
tcam_q0_xtxn 0
tcam_q1_xtxn 0
tcam_q2_xtxn 0
tcam_q3_xtxn 0
xb2mw_xtxn 916487095 3967/sec 3/ms
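
The counter-block limits can be checked against the sampled rates with simple arithmetic. The sketch below recomputes the worst-case figures from the 800/14 and 800/5 formulas quoted for policers and counters and compares the sampled plct_p1_xtxn rate with them; the variable and function names are hypothetical.

```python
BASE_RATE_M = 800  # the "800" used in the worst-case formulas, in millions/sec

POLICER_WORST_M = BASE_RATE_M / 14  # about 57M policer XTXN/sec
POLICER_BEST_M = 200
COUNTER_WORST_M = BASE_RATE_M / 5   # 160M counter XTXN/sec
COUNTER_BEST_M = 400

PLCT_P1_RATE = 329088  # counter XTXN/sec from the XP2B sample above


def share(rate_per_sec, ceiling_m):
    """Fraction of a ceiling (given in millions/sec) used by a rate."""
    return rate_per_sec / (ceiling_m * 1_000_000)


print(f"policer worst case: {POLICER_WORST_M:.1f}M XTXN/sec")
print(f"counter worst case: {COUNTER_WORST_M:.0f}M XTXN/sec")
print(f"sampled counters: {share(PLCT_P1_RATE, COUNTER_WORST_M):.3%} of worst case, "
      f"{share(PLCT_P1_RATE, COUNTER_BEST_M):.3%} of best case")
```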



8. IS THE ASIC RUNNING OUT OF HORSEPOWER ?
Microcode instructions
show luchip <> ppe_perf rate 0xffff <period>
Each PPE reports these instruction execution rates
Their sum is limited to 800M instructions/sec
– umem_instr_count
– gumem_instr_count
– cancel_instr_count
– unused_slot_count
TAZ-TBB-0(cheese vty)# show luchip 0 ppe_perf rate 0xffff 200

LUCHIP(0) PPE0 Perf Mon: Aggregate Counts Rates 215 msec


umem_instr_count 364942023153 1678125/sec 1678/ms
gumem_instr_count 122270596888 542069/sec 542/ms
cancel_instr_count 67396085124 299758/sec 299/ms
unused_slot_count 3260193077924 15741316/sec 15741/ms

LUCHIP(0) PPE1 Perf Mon: Aggregate Counts Rates 215 msec


umem_instr_count 275919150812 1282758/sec 1282/ms
gumem_instr_count 7989757639 14041/sec 14/ms
cancel_instr_count 1379897679 6520/sec 6/ms
unused_slot_count 1703142252490 7377432/sec 7377/ms

......
9. WHAT IS TRAP MESSAGE ?
When the PPE encounters a lookup error, a trap message is
generated for offline analysis
The trap information can be collected from the MPC
– “show jnh <lu_instance> trap-info”
Watch out for PR660002!
fpc11 PPE Thread Timeout Trap: Count 42, PC 2a4, 0x02a4: fw_op_eq
fpc11 PPE Thread Timeout Trap: Count 43, PC 36, 0x0036: entry_fw_stop
fpc11 PPE Thread Timeout Trap: Count 108, PC 181, 0x0181: call_table_launch_nh
fpc11 PPE Thread Timeout Trap: Count 44, PC 181, 0x0181: call_table_launch_nh
fpc11 PPE Thread Timeout Trap: Count 45, PC 196, 0x0196: counter_nh_read_next_nh
fpc11 PPE Thread Timeout Trap: Count 113, PC 19a, 0x019a: compute_pkt_len
fpc11 PPE Thread Timeout Trap: Count 46, PC 2ac, 0x02ac: fw_rd_value_mask
fpc11 PPE Thread Timeout Trap: Count 117, PC 23, 0x0023: entry_index_nh
fpc11 PPE Thread Timeout Trap: Count 47, PC 36, 0x0036: entry_fw_stop
fpc11 PPE Thread Timeout Trap: Count 119, PC 23, 0x0023: entry_index_nh
fpc11 PPE Thread Timeout Trap: Count 49, PC 1c0, 0x01c0: wr_ctx_info_ret_or_read_nh

fpc5 PPE PPE HW Fault Trap: Count 13540176663, PC 26, 0x0026: fw_set_layer3_length 0x0026:
fw_set_layer3_length
fpc4 PPE PPE HW Fault Trap: Count 73615, PC 6109, 0x6109: mac_age_entry
fpc7 PPE PPE HW Fault Trap: Count 1391192282, PC 315, 0x0315: fab_out_apply_pfe_alive_mask
fpc4 PPE PPE HW Fault Trap: Count 61298, PC 2b1, 0x02b1: set_iif_compute_hash
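
When many of these messages pile up in the log, grouping them by trap type and microcode label shows where the PPEs keep failing. The sketch below does that with a regular expression written against the sample lines above; the pattern is an assumption about the message layout, not a documented format.

```python
import re
from collections import Counter

SAMPLE = """\
fpc11 PPE Thread Timeout Trap: Count 42, PC 2a4, 0x02a4: fw_op_eq
fpc11 PPE Thread Timeout Trap: Count 43, PC 36, 0x0036: entry_fw_stop
fpc11 PPE Thread Timeout Trap: Count 47, PC 36, 0x0036: entry_fw_stop
fpc4 PPE PPE HW Fault Trap: Count 73615, PC 6109, 0x6109: mac_age_entry
fpc4 PPE PPE HW Fault Trap: Count 61298, PC 2b1, 0x02b1: set_iif_compute_hash
"""

TRAP = re.compile(
    r"^(fpc\d+) PPE (.+?) Trap: Count (\d+), PC ([0-9a-f]+), 0x[0-9a-f]+: (\w+)"
)


def tally_traps(text):
    """Count log lines per (fpc, trap type, microcode label)."""
    seen = Counter()
    for line in text.splitlines():
        match = TRAP.match(line.strip())
        if match:
            seen[(match.group(1), match.group(2), match.group(5))] += 1
    return seen


for (fpc, trap, label), hits in tally_traps(SAMPLE).most_common():
    print(f"{fpc} {trap:15s} {label:25s} {hits}")
```

The extracted trap type strings line up with the names in the lu_trap_entry table, for example "Thread Timeout" and "PPE HW Fault".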



9. WHAT IS TRAP MESSAGE ?
There are different trap types generated by different errors
Some are errors encountered in the lookup process
Some are memory errors
lu_trap_entry[TRAP_INFO_OFF_SYNC_DMEM_WP].name = "Sync DMEM WP";
lu_trap_entry[TRAP_INFO_OFF_ASYNC_DMEM_WP].name = "Async DMEM WP";
lu_trap_entry[TRAP_INFO_OFF_SYNC_XTXN_ERR].name = "Sync XTXN Err";
lu_trap_entry[TRAP_INFO_OFF_ASYNC_XTXN_ERR].name = "Async XTXN Err";
lu_trap_entry[TRAP_INFO_OFF_THREAD_TIMEOUT].name = "Thread Timeout";
lu_trap_entry[TRAP_INFO_OFF_PPE_STACK_ERR].name = "PPE Stack Err";
lu_trap_entry[TRAP_INFO_OFF_XRA_READ_ERR].name = "XRA Read Err";
lu_trap_entry[TRAP_INFO_OFF_PPE_HW_FAULT].name = "PPE HW Fault";

JSPEC_FIELD("lmem_data_err", 5, 3, 3, "LMem Data Error")


JSPEC_FIELD("kmem_data_err", 5, 2, 2, "KMem Data Error")
JSPEC_FIELD("ucode_addr_err", 5, 1, 1, "UCode Addr Error")
JSPEC_FIELD("ucode_data_err", 5, 0, 0, "UCode Data Error")



9. WHAT IS TRAP MESSAGE ?
This is what it looks like. It contains all the registers, an LMEM
dump, and a parcel dump.
TAZ-TBB-0(1399A-MX80-01 vty)# show jnh 0 trap-info

Trap Type Count PC


PPE HW Fault 69 0x395

PPE HW Fault PPE/Context 8/5 @ PC 0x0395:


GPR Registers:
R00-03: 0x27fffff800000002 800191c700000046 0000590000000000 1812f00000000000
R04-07: 0x0000000000e00000 a06000ea10000e0c 0fb400000008003f 00f6ff4600240000
R08-11: 0x008c85e4b19db367 60107f49ae38d766 0001870100000320 733e67f02c9bc1e6
R12-15: 0x0000000000000000 0000000000000000 003f820008000000 6310000a40038395
R16-19: 0x0000000000000000 0000000000000000 8100080c080045c0 3e82f3e800002020
R20-23: 0x804e001464166cd0 00010600013df982 0000000000000000 0000000000000000
R24-27: 0x0f083e3c1fb87c78 000000000033ffff 182a930000001000 5902010030120066
R28-31: 0x0052300000000000 2000003103000001 0000000000000082 2020200001f00066

LMEM Dump:
LMEM 0x0000: 0x80c5c000e0000005 5045606659020100 30000005fe800000 000000008a43e1ff
LMEM 0x0004: 0xfe37fc2459100200 0000000000000000 0000000000000000 0074000000000000
LMEM 0x0008: 0x000680e041080100 5e00000500171003 63438100080c0800 45c000547a8b0000
LMEM 0x000c: 0x0159ad5550456066 e000000502010030 504565fb63000000 0000000200000b10
LMEM 0x0010: 0x4e5c4250fffffffc 0001020000000004 5045606500000000 5045685e175cdfb6
LMEM 0x0014: 0x82bef1f35d4a92b4 39bb3ab600000003 0008050100007530 000808010000000a
LMEM 0x0018: 0x0024090200000007 050000067f000005 0000000000000000 7f80000000000014
LMEM 0x001c: 0x000005dc000c0a07 504568630000003b 000810010005afd4 0034150101085045
LMEM 0x0020: 0x68c5200001085045 6b06200001085045 6bcd200001085045 6b0a200001085045
......
9. WHAT IS TRAP MESSAGE ?
......
LMEM 0x004c: 0x0000000220000000 c000ffff00460066 0000001902c00230 0f400e3801571e64
LMEM 0x0050: 0x000c0000000f094b 529a728100000000 0000000000040000 0000000000000000
LMEM 0x0054: 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
LMEM 0x0058: 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
LMEM 0x005c: 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
LMEM 0x0060: 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
LMEM 0x0064: 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
LMEM 0x0068: 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
LMEM 0x006c: 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
LMEM 0x0070: 0x0000000000000000 0000000000000000 0000000000000000 0000000000000f08
LMEM 0x0074: 0x3e3c1fb87c780000 ffffff0000000000 0000000000000000 0000000000000000
LMEM 0x0078: 0x0f62bc0100010000 0e31280100010000 166ba60100010000 d0149e0100010000
LMEM 0x007c: 0xffff010200000000 ffffff0000000000 0000000000000000 6737a8d600000000
XRS XRA IR64 WP64 BASE_0
0x27fffff800000002 20000000c000ffff 0046006600000019 02c002300f400e38 0000
Dispatch size is 116
000680e041080100
5e00000500171003
63438100080c0800
45c000547a8b0000
0159ad5550456066
e000000502010030
504565fb63000000
0000000200000b10
4e5c4250fffffffc
0001020000000004
5045606500000000
5045685e175cdfb6
82bef1f35d4a92b4
39bb3ab6
9. WHAT IS TRAP MESSAGE ?
To decode the core dump, we need the decode_parcels script
Make sure to use the correct version!
It is under /src/pfe/ucode/lu/scripts/
% ~swong/code/10.4/release/10.4R6.5/src/pfe/ucode/lu/scripts/decode_parcels ./trap_fpc0_pfe0.core.0
Dispatch: Size 000 Copy 0000 M2L_PKT (0)

This is a trap dump context, not all Regs/LMEM may be relevant,
as some values could carry over from previously processed packets.

Register Context:

Info PType IPV4 (2) subtype 0 score 20 reason 20 AddInfo 0001f IIF 00066
Pkt Proto_TcpF 59 SrcPort 0201 DstPort 0030 EncLen 12 PktLen 0066
Qos NTags 1 cos2 0 dei2 0 cos1 0 dei1 0 ExpV 0 Exp 0 RwFlg 00
FC 06 DP 0 Qv 1 Exp1 0 Exp2 0 Tos c0 Tos2 00 InsCW 0 NoPropTTL 0 TTL 01
Fwd HcopyIFfwd 0 Token 00000 VRF 0000 Egr64BitMC 0 PfeHi 00 PfeLo 82
FwdEgr Queue 003f8 OIF 20008 EID 00000 Media 0 RestrictedQ 0
Vars ImpLen 00 OPtype 2 OEncLen 12 OL2off 06 IRB 0 Bridge 0
OLabel 0 LabelCnt 0 PolRes 0 NHStkIdx 0 XConn 0
ChkEth 0 InsNative 0 NatTPID 0 NatVid 000
Bridge MGid 00 BDid 0000 Lvlan 0000 LrnEna 1 LSwEna 0 DMissDrp 0 Age 3ffff
Bridge2 STPid 182a BDdata_addr 930000 Epoch 00 L2Token 01000
......
9. WHAT IS TRAP MESSAGE ?
......
BDdata StatsEna 0 V4MC_IRB 1 V6MC_IRB 0 CollStat 0 FTF 1
LrnEna 1 LSwEna 1 IRBnh_L2iif 000a32 IgmpSnoop 0 MGarr 166cd0
Vlan 8100080c08004800
Hash ReorderQueue 0f08 LB Hash 3e3c1fb87c78
PostProc Mlist 000000 Count 0000 Flags 0000
Services 0000000000000000
SampleClass 0000000000000000
PfeMask 0000000000000000
FW 00010600013df982
Trap DumpBase 10000a40 TrapOff 03 TrapPC 8395

LMEM Context:

Info PType IPV4 (2) subtype 0 score 20 reason 20 AddInfo 0001f IIF 00066
Pkt Proto_TcpF 59 SrcPort 0201 DstPort 0030 EncLen 12 PktLen 0054
Qos NTags 1 cos2 0 dei2 0 cos1 0 dei1 0 ExpV 0 Exp 0 RwFlg 00
FC 01 DP 0 Qv 0 Exp1 0 Exp2 0 Tos c0 Tos2 00 InsCW 0 NoPropTTL 0 TTL 01
Fwd HcopyIFfwd 1 Token fffe0 VRF 0000 Egr64BitMC 0 PfeHi 00 PfeLo 82
Opt OptionMap 00 GRE Key 00000000 CP Hash 0000
L2 Uvlan 0000 Lvlan 0000 Etype 0000 BUM 0



9. WHAT IS TRAP MESSAGE ?
We can also inject the parcel into the PPE again with ttrace enabled
and see how it breaks
000680e041080100 M2L cookie
5e00000500171003 Ethernet hdr (vlan)
63438100080c0800
45c000547a8b0000 IP pkts
0159ad5550456066
e000000502010030
504565fb63000000
0000000200000b10
4e5c4250fffffffc
0001020000000004
5045606500000000
5045685e175cdfb6
82bef1f35d4a92b4
39bb3ab6

https://wintermute.juniper.net/projects/trinity/trinity-software/trinity-debugging/copy3_of_gen_ttrace



9. WHAT IS TRAP MESSAGE ?
With the change in PR563318, the trap message is saved
on the RE under /var/crash automatically
The file name is trap_fpc<>_pfe<>.core.#
Auto ttrace can be enabled (it is disabled by default)
NPC0(MX960-DUT2 vty)# show jnh 0 trap-info struct
TRAP INFO Config:
trap_rd_sem: 0x4e64ab00
trap_rd_thread: 0x4e64bba8
trap_info_savefile: trap_fpc0_pfe0
trap_trace_savefile:
save_count: 32
autottrace_count: 0
flags: SAVE
TRAP INFO Status:
save_index: 0
show_index: 0

NPC0(MX960-DUT2 vty)# test jnh 0 trap-info-read


<carriage return> Completes command
disable Disable PPE Trap info reader
save Enable autosave option for PPE Trap info reader
trace Enable auto-TTRACE for PPE traps

NPC0(MX960-DUT2 vty)#
9. WHAT IS TRAP MESSAGE ?
OK. What should I do with that then ?
We can use ttrace to see what’s wrong
– https://gnats.juniper.net/web/default/611029
Apr 26 17:44:11 tkmrt2 tfeb0 PPE PPE HW Fault Trap: Count 22, PC 6c, 0x006c: l2tp_hdlc_hdr_size
Apr 26 17:45:39 tkmrt2 tfeb0 LUCHIP(0) PPE_2 Errors lmem addr error
Apr 26 17:45:39 tkmrt2 tfeb0 LUCHIP(0) PPE_3 Errors lmem addr error

svl-junos-pool72% ~/code/10.4/release/10.4R1.4/src/pfe/ucode/lu/scripts/decode_parcels ~nfujita/q1
Dispatch: Size 052 Copy 0000 M2L_PKT (0) TailEntry 0004 Stream Wan (406) Off 0
IxPreClass 1 IxPort 01 IxMyMac 1
Data: 80711fc3de000012
Data: f2f5890081000065
Data: 0800450000325aa0
Data: 00b178111b4b7889
Data: e24f7d18f42de4ab
Data: 06a5cded575413c2
Data: c6e363a9651cd7c2
Data: df9ee1cfebaf1218
Data: 074c67e8



9. WHAT IS TRAP MESSAGE ?
The trap happens at l2tp_hdlc_hdr_size()
TAZ-TBB-0(cheese vty)# test jnh 0 packet-via-dmem inject trace
Please paste hex dump of the packet, end with a dot (.)
0002406041088071
1fc3de000012f2f5
8900810000650800
450000325aa000b1
78111b4b7889e24f
7d18f42de4ab06a5
cded575413c2c6e3
63a9651cd7c2df9e
e1cfebaf1218074c
67e8
.
......
l2tp_hdr_size_offset_check @ 0x006f
Prev_PC 0x006d -> 0x006f

l2tp_hdr_too_long @ 0x0070
GPR31 0x2000000000000000 -> 0x2000870000000000
Prev_PC 0x006f -> 0x0070
PCSD: 2 -> 1, dropped 0x005d

l2tp_check_pkt_size @ 0x005d
Prev_PC 0x0070 -> 0x005d
......
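
When the ttrace output is long, it can help to pull out only the steps that change a given register. The sketch below is a hypothetical helper, not part of any Juniper tool. It handles only the two line shapes visible in the excerpt above ("label @ 0xPC" and "REG 0xOLD -> 0xNEW") and ignores everything else.

```python
import re

# Excerpt copied from the ttrace output shown in this deck.
TRACE = """\
l2tp_hdr_size_offset_check @ 0x006f
Prev_PC 0x006d -> 0x006f

l2tp_hdr_too_long @ 0x0070
GPR31 0x2000000000000000 -> 0x2000870000000000
Prev_PC 0x006f -> 0x0070
PCSD: 2 -> 1, dropped 0x005d

l2tp_check_pkt_size @ 0x005d
Prev_PC 0x0070 -> 0x005d
"""

STEP_RE = re.compile(r"^(\w+) @ (0x[0-9a-fA-F]+)$")
CHANGE_RE = re.compile(r"^(\w+)\s+(0x[0-9a-fA-F]+) -> (0x[0-9a-fA-F]+)$")


def parse_trace(text):
    """Return a list of steps: {'label', 'pc', 'changes': {reg: (old, new)}}."""
    steps = []
    for line in text.splitlines():
        line = line.strip()
        step = STEP_RE.match(line)
        if step:
            steps.append({"label": step.group(1),
                          "pc": int(step.group(2), 16),
                          "changes": {}})
            continue
        change = CHANGE_RE.match(line)
        if change and steps:
            reg, old, new = change.groups()
            steps[-1]["changes"][reg] = (int(old, 16), int(new, 16))
    return steps


def steps_touching(steps, reg):
    """Keep only the steps in which register `reg` changed."""
    return [s for s in steps if reg in s["changes"]]


steps = parse_trace(TRACE)
for s in steps_touching(steps, "GPR31"):
    old, new = s["changes"]["GPR31"]
    print(f"{s['label']} @ 0x{s['pc']:04x}: GPR31 0x{old:016x} -> 0x{new:016x}")
```

For this excerpt the only step that touches GPR31 is l2tp_hdr_too_long, which is where the drop reason is written.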
9. WHAT IS TRAP MESSAGE ?
What does that ucode instruction do ?
/src/pfe/ucode/lu/layer3_input.t

l2tp_hdr_size_offset_check:
begin
// In the above instructions ir_l2tp_encap_len is set to the length
// of the L2 encapsulation header and here we check to insure it fits
// in the available parcel space before attempting the next access.
if (ir_l2tp_encap_len > r_l2tpsz.rem_off) {
goto l2tp_hdr_too_long;
}
end

l2tp_hdlc_hdr_size:
begin
// HDLC header may or may not be included. If included, the HDLC frame
// will have the value = 0xff03 (All stations, unnumbered).
const :16 hdlc_frame_field = *cast<:16 *>(layer3_ptr + ir_l2tp_encap_len);

// assume header length for PPP header = 2 bytes
RCtxInfo2.ppp_header_len = 2;

// the field is two bytes wide (typically 0xff03)
if (hdlc_frame_field == HDLC_PPP_FRAME) {
// add size of HDLC header
ir_l2tp_encap_len += HDLC_HDR_LEN;
}
end
9. WHAT IS TRAP MESSAGE ?
It tells us why the packet was dropped
Here is the error: GPR31 holds CtxInfo_t, and
the drop reason field is 0x87 = 135.

ucode.th:register 31 CtxInfo_t RCtxInfo;

struct CtxInfo_t {
PType : 4; // Packet type, see PTYPE_TAG
subtype : 4;
scorecard_t scorecard;
reason : 8; // punt/drop reason (see PUNT_REASON_TAG)
add_info : 20; // additional info, usage depends on punt/drop reason
iif : 20;
};

svl-junos-pool72% bits 4 4 8 8 20 20
0x2000870000000000
0010 0x2 0000 0x0 00000000 0x0 10000111 0x87 00000000000000000000 0x0 00000000000000000000 0x0

#define DROP_CODE_BASE 128
#define DROP_TUNNEL_HDR_TOO_LONG (DROP_CODE_BASE + 7)
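
The same split can be reproduced without the bits helper. This is a minimal sketch: the field names follow struct CtxInfo_t, and treating scorecard_t as 8 bits wide is an assumption taken from the widths passed on the command line above (4 4 8 8 20 20).

```python
# Field widths as passed to the `bits` helper on the slide: 4 4 8 8 20 20.
# Names follow struct CtxInfo_t; the 8-bit scorecard width is an assumption.
CTXINFO_FIELDS = [
    ("PType", 4), ("subtype", 4), ("scorecard", 8),
    ("reason", 8), ("add_info", 20), ("iif", 20),
]
DROP_CODE_BASE = 128


def decode_ctxinfo(value):
    """Split a 64-bit register value into fields, most significant first."""
    fields = {}
    shift = 64
    for name, width in CTXINFO_FIELDS:
        shift -= width
        fields[name] = (value >> shift) & ((1 << width) - 1)
    return fields


info = decode_ctxinfo(0x2000870000000000)
print(info)
print("offset above DROP_CODE_BASE:", info["reason"] - DROP_CODE_BASE)
```

The reason field comes out as 0x87 = 135 = DROP_CODE_BASE + 7, which matches DROP_TUNNEL_HDR_TOO_LONG.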



9. WHAT IS TRAP MESSAGE ?
Going back to the packet, this doesn’t make any sense because it’s not
an L2TP packet !
Where do we classify it as an L2TP packet ?
svl-junos-pool72% ~/code/10.4/release/10.4R1.4/src/pfe/ucode/lu/scripts/decode_parcels ~nfujita/q1
Dispatch: Size 052 Copy 0000 M2L_PKT (0) TailEntry 0004 Stream Wan (406) Off 0
IxPreClass 1 IxPort 01 IxMyMac 1
Data: 80711fc3de000012
Data: f2f5890081000065
Data: 0800450000325aa0 <-- IPv4 header starts at 0x45
Data: 00b178111b4b7889 <-- Last frag, offset 0xb1; protocol 0x11 (UDP); SRC: 120.137.226.79
Data: e24f7d18f42de4ab <-- DST: 125.24.244.45
Data: 06a5cded575413c2 <-- payload data 0x06a5 = 1701
Data: c6e363a9651cd7c2
Data: df9ee1cfebaf1218
Data: 074c67e8
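
The annotations can be cross-checked by parsing the first data words directly. This is a sketch that assumes a single VLAN tag followed by IPv4, which is what these bytes show. The function name and dictionary keys are made up for illustration.

```python
import struct

# First six parcel data words from the decode_parcels output above.
DATA = bytes.fromhex(
    "80711fc3de000012" "f2f5890081000065" "0800450000325aa0"
    "00b178111b4b7889" "e24f7d18f42de4ab" "06a5cded575413c2"
)


def parse_vlan_ipv4(frame):
    """Parse a single-tagged Ethernet frame carrying IPv4 (layout assumed)."""
    tpid, _tci, ethertype = struct.unpack("!HHH", frame[12:18])
    if tpid != 0x8100 or ethertype != 0x0800:
        raise ValueError("not a single-tagged IPv4 frame")
    ip = frame[18:]
    ihl = (ip[0] & 0x0F) * 4                 # header length in bytes
    (flags_frag,) = struct.unpack("!H", ip[6:8])
    return {
        "proto": ip[9],
        "more_fragments": bool(flags_frag & 0x2000),
        "frag_offset": flags_frag & 0x1FFF,  # in 8-byte units
        "src": ".".join(str(b) for b in ip[12:16]),
        "dst": ".".join(str(b) for b in ip[16:20]),
        # Bytes sitting where a UDP destination port would be in a first fragment.
        "bytes_at_dport": struct.unpack("!H", ip[ihl + 2:ihl + 4])[0],
    }


pkt = parse_vlan_ipv4(DATA)
print(pkt)
```

The fragment offset is non-zero, so the bytes after the IP header are payload, not a UDP header. They just happen to read 1701 at the position where a destination port would sit.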



9. WHAT IS TRAP MESSAGE ?
Going back to the ttrace, the jump into l2tp_input comes right after ipv4_input_verify_checksum()
.....
ipv4_input_verify_checksum @ 0x0054
Prev_PC 0x0053 -> 0x0054
IR0 0x000000a0 -> 0x00000000

l2tp_input @ 0x0057
GPR05 0x0060004270000e0c -> 0x13c2004270000e0c
GPR12 0x0000000000000000 -> 0x0000000000000004
Prev_PC 0x0054 -> 0x0057
WP0 0x0360 -> 0x03a0
.....



9. WHAT IS TRAP MESSAGE ?
Hmm…did we miss something ?
We should check that the port number really is an L2TP port and not just
bytes from a fragmented payload
ipv4_input_verify_checksum:
begin
// MSbit will be set if IHL > 5 (options packet)
ir0 = 5 - ipv4_ptr->ihl;

if (csh_reg.csum != 0) {
goto ipv4_csum_error;
}

// The RE will NEVER use a src_port other than 0x1701 for L2TP.
// The ucode today was simplified to use this fact to determine if
// the header following the UDP header is L2TP or not.
// This could be changed by adding 17 bits to the tunnel decaps NH
// to specify L2TP=true and the actual dest port to check (16 bits).

// Bypass L2TP parsing if protocol != UDP, dest_port != 1701,
// or packet comes from a loopback or host stream.
if ((RCtxPkt.proto_tcp_flags != LAYER4_PROTO_UDP) ||
(RCtxPkt.dest_port != UDP_PORT_L2TP) ||
(RCtxVars.host_tunnel_pkt == 1)) {
goto l2tp_check_done;
}
end
9. WHAT IS TRAP MESSAGE ?
Here is the fix
The L2TP check on dest_port was incorrect for non-initial fragments,
because these packets do not contain Layer 4 info (ports). The fix is to set
dest_port to 0 for a non-initial fragment, which ensures that dest_port
cannot match UDP_PORT_L2TP, effectively bypassing L2TP parsing when
ipv4_ptr->frag_off > 0.
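
The effect of the fix can be modelled in a few lines. This is a sketch of the logic only, not the ucode itself: the two function names are hypothetical, and 17 for LAYER4_PROTO_UDP is the IANA protocol number, assumed to be what the ucode constant holds.

```python
LAYER4_PROTO_UDP = 17   # IANA protocol number for UDP (assumed constant value)
UDP_PORT_L2TP = 1701


def effective_dest_port(raw_dest_port, frag_off):
    """Model of the fix: a non-initial fragment has no L4 header, so use 0."""
    return 0 if frag_off > 0 else raw_dest_port


def goes_to_l2tp_parsing(proto, dest_port, host_tunnel_pkt):
    """Mirror of the bypass test quoted from ipv4_input_verify_checksum."""
    bypass = (proto != LAYER4_PROTO_UDP
              or dest_port != UDP_PORT_L2TP
              or host_tunnel_pkt)
    return not bypass


# Values from the trapped packet: UDP, fragment offset 0xb1, payload bytes 0x06a5.
before = goes_to_l2tp_parsing(17, 0x06A5, False)
after = goes_to_l2tp_parsing(17, effective_dest_port(0x06A5, 0xB1), False)
print(before, after)  # True False
```

Before the fix the fragment is sent to L2TP parsing. After the fix it is bypassed, while a first fragment (frag_off == 0) with destination port 1701 is still parsed.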



10. MEMORY ERROR. IS THAT HW OR SW ?
Some memory related errors can show up
Is this HW or SW ?
fpc4 LU 3 PPE_8 Errors kmem data error 0x00000000
fpc5 LU 3 PPE_11 Errors ucode data error 0x00000276
fpc6 LU 0 PPE_8 Errors lmem data error 0x908
fpc9 LU 0 IDMEM Parity error in Bank 1, Count 12, IDMEM Bank 1 Offset 0x000168ac IDMEM[0x0005a2a4].

It’s hard to tell. Some of them might be SW and some of them might be HW
PR564760
– EDMEM differentiates between ECC errors and accesses to uninitialized
memory by checking for the pattern (left by BIST) in the error read data. IDMEM
does not have similar support, hence SW errors are flagged as Parity (HW)
errors.



10. MEMORY ERROR. IS THAT HW OR SW ?
Some are most likely bad HW
LUCHIP(0) PPE_8 Errors sync xtxn error
LUCHIP(0) PPE_8 Errors lmem data error 0x908
LUCHIP(0) PPE_8 Errors KMA[17] parity error
LUCHIP(0) PPE_8 Errors ucode data error 0x0128
LUCHIP(0) RMC 2 Uncorrectable ECC 0x1122334455667788 @ 0x1d2345, cnt 6, syn 0x0 - EDMEM[0x1d2345]
LUCHIP(0) RMC 2 Correctable ECC @ 0x1d2345, cnt 6, syn 0x0 - EDMEM[0x1d2345]

But there is always an exception
PR613358 shows sync xtxn error but it’s a SW bug.

Have a recovery procedure in place in case it is just a transient HW error.
Let’s reduce the RMA rate if that’s a one-time error.
10. MEMORY ERROR. IS THAT HW OR SW ?
With PR593906, the system performs best-effort recovery
when it sees a memory error.
A background thread will poll status (errors), poll counters (events), and
perform LU memory integrity checks.
EDMEM: Let ECC handle this. There is no checking/recovery
planned. Given that ECC will correct any single bit error, recovery is not
required in most cases. In addition, EDMEM has the highest degree of
flux, thus making recovery problematic.
OMEM: No plan to check/recover this memory. Error detection is not
straightforward and there is no opportunity to recover since this
memory is most often changed by the PPE.
IDMEM: Use parity for error detection. In addition to implicitly checking
this memory using parity, a low priority background thread scans
IDMEM seeking parity errors in those locations used infrequently.
Recovery is provided from a SW shadow.



10. MEMORY ERROR. IS THAT HW OR SW ?
UMEM/GUMEM: These memories are parity protected. The PPE
checks a portion of these memories (those instructions used). In
addition, a low priority background thread scans UMEM/GUMEM
seeking parity errors and data mismatches. All errors and
mismatches are recovered from a shadow.
KMA/KMB: These memories are parity protected. The PPE checks a
portion of these memories (those locations used). In addition, a low
priority background thread scans KMA/KMB seeking parity errors and
data mismatches. All errors and mismatches are recovered from a
shadow.
CBO: There is no error detection logic in the HW. A low priority
background thread scans CBO for mismatches. All mismatches are
recovered from a shadow.
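
The "scan and recover from a shadow" idea shared by IDMEM, UMEM/GUMEM, KMA/KMB and CBO can be pictured with a toy model. This is only a sketch under simple assumptions: memory and shadow are plain Python lists of words, the word values are arbitrary, and the scrub function is hypothetical. The real thread runs in the PFE software and also uses HW parity where available.

```python
def scrub(memory, shadow):
    """Compare every word against the shadow copy and repair mismatches.

    Returns the list of offsets that were rewritten.
    """
    repaired = []
    for offset, good in enumerate(shadow):
        if memory[offset] != good:
            memory[offset] = good
            repaired.append(offset)
    return repaired


# Arbitrary example words; the shadow is the SW copy of what was programmed.
shadow = [0x1003A0B9, 0x00D09167, 0x2FFFFFFC, 0x0006A200]
memory = list(shadow)
memory[2] ^= 0x00000400  # simulate a single flipped bit

print(scrub(memory, shadow))
```

This also shows why OMEM and most of LMEM are not scanned this way: the data plane changes them, so there is no stable shadow to compare against.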



10. MEMORY ERROR. IS THAT HW OR SW ?
LMEM: The bulk of LMEM is managed by the data plane, making
recovery from the control plane potentially problematic. LMEM is not
checked with a background thread. When an LMEM error is
detected, the failing zone is identified. If the failing zone is a
SHARED zone, the location is fixed; if the zone is PRIVATE, the
error is counted. Multiple PRIVATE errors force a repair. Each repair is
counted as well, and repeated repairs lead to the zone (PRIVATE) or the PPE
(SHARED) being disabled.
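
The LMEM handling described above is a small escalation policy. The sketch below is an illustration only: the class name, the action strings, and both thresholds are hypothetical, since the slide does not say how many errors or repairs trigger each step.

```python
from collections import Counter

PRIVATE_ERRORS_BEFORE_REPAIR = 3   # hypothetical threshold
REPAIRS_BEFORE_DISABLE = 2         # hypothetical threshold


class LmemErrorPolicy:
    def __init__(self):
        self.private_errors = Counter()
        self.repairs = Counter()

    def on_error(self, zone, shared):
        """Return the action taken for one LMEM error in `zone`."""
        if not shared:
            # PRIVATE zone: count the error, repair only after several.
            self.private_errors[zone] += 1
            if self.private_errors[zone] < PRIVATE_ERRORS_BEFORE_REPAIR:
                return "count"
            self.private_errors[zone] = 0
        # SHARED zone errors reach this point immediately.
        self.repairs[zone] += 1
        if self.repairs[zone] > REPAIRS_BEFORE_DISABLE:
            return "disable_ppe" if shared else "disable_zone"
        return "repair"


policy = LmemErrorPolicy()
print([policy.on_error("private-7", shared=False) for _ in range(9)])
print([policy.on_error("shared-0", shared=True) for _ in range(3)])
```

With these made-up thresholds, a PRIVATE zone is disabled on its ninth error and a SHARED zone disables the PPE on its third.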

We shouldn’t see a high rate of correction actions. If errors keep
happening on the same board and memory location, it is better to replace
the board and check.

Some guidelines from Engineering
http://www-in/~swong/PPE_errors.docx



11. WHERE IS THE JSIM ?
JSIM is a tool to simulate a route lookup on the forwarding plane
Just input parameters such as iif, src and dst addresses, etc.

It’s different on Trinity, as the lookup is no longer done on the notification.
Instead, the whole packet header is loaded into the PPE for
lookup. So we can’t simulate a lookup that way.

The tool we have is called ttrace. Basically, we craft a packet, then
inject it into the PPE to show us the whole packet lookup process:
which microcode instructions are involved, and the results.

Here is the catch. Since the PPE will treat this as a real
packet, we need to inject a *valid* packet. You even need to make
sure the IP packet header carries a correct checksum !
If you inject a bad packet to run ttrace, the error will be logged as an
error from a real packet !
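
Before injecting a hand-built packet, the IPv4 header checksum can be verified offline. This is a minimal sketch of the standard Internet checksum (RFC 1071). The function names are made up, and the header bytes are the IPv4 header from the offline-built parcel shown later in this deck.

```python
def ipv4_checksum(data):
    """One's-complement Internet checksum (RFC 1071) over the given bytes."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                      # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def header_is_valid(header):
    """A header that includes a correct checksum yields 0."""
    return ipv4_checksum(header) == 0


hdr = bytes.fromhex("4500001400010000ff00f1e501010101c8000001")
zeroed = hdr[:10] + b"\x00\x00" + hdr[12:]  # checksum field set to 0

print(hex(ipv4_checksum(zeroed)), header_is_valid(hdr))
```

Computing over the header with the checksum field zeroed gives the value to write into that field. Running the check over the finished header should return True before the parcel is pasted into the vty.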



11. WHERE IS THE JSIM ?
There are two ways to get a packet to run ttrace
Capture it from the PPE
Create one offline

To capture the packet from the PPE, we use the packet-via-dmem function.
https://wintermute.juniper.net/projects/trinity/trinity-software/trinity-debugging/packet-capture-via-dmem
Basically, we capture the frame with the corresponding cookie (M2L),
then inject the packet back to see what the lookup looks like.



11. WHERE IS THE JSIM ?
We can also make up one offline
~swong/misc/ttrace.sh
svl-junos-pool72% sh ./ttrace.sh
Input Port (WAN/FAB): WAN
Protocol (IPv4/MPLS/Bridge/ARP/ISIS): IPv4
Source MAC address (enter to skip): 80711fc26661
Destination MAC address (enter to skip): 80711fc26678
vlan (enter to skip):
Source IP address: 1.1.1.1
Destination IP address: 200.0.0.1
Protocol Number (enter to skip):
TTL (enter to skip): 255
DSCP (enter to skip):
TCP/UDP Source Port (enter to skip):
TCP/UDP Destination Port (enter to skip):
MQ stream: 15
IXPort: 12
Version: 11.4

TTRACE M2L_Packet parcel:
000000f08c0880711fc2667880711fc2666108004500001400010000ff00f1e501010101c8000001
svl-junos-pool72%



11. WHERE IS THE JSIM ?
How to read the output ?
Step by step ucode instructions
All are in src/pfe/ucode/lu/

11. WHERE IS THE JSIM ?
More info can be found here
http://www-in/~swong/ttrace.pdf

Be careful when you are using ttrace
http://gnats.juniper.net/default?cmd=view+audit-trail&pr=688270

Too complicated. Can I get an easier one to use ?
https://gnats.juniper.net/web/default/581484



12. RUNNING OUT OF LOOKUP MEMORY ?
The memory is partitioned for different applications
Nexthop
Firewall
Counters
UEID

Each of them has a private and a shared pool
Reserved space
Shared space



12. RUNNING OUT OF LOOKUP MEMORY ?
To check the memory map layout
show jnh <> pool layout

To get a memory usage summary
show jnh <> pool summary verbose bytes

To check the memory usage
show jnh <> pool bytes
show jnh <> pool detail bytes
show jnh <> pool composition bytes

To check if the memory is under severe fragmentation
show jnh <> pool stats <>



12. RUNNING OUT OF LOOKUP MEMORY ?
A better layout ?
https://gnats.juniper.net/web/default/685597
NPC5(halfpint vty)# show jnh 0 pool summary bytes
Name Size(b) Allocated(b) % Utilization
EDMEM 268435456 149716720 55%
IDMEM 2621440 1627192 62%
OMEM 268435456 264634432 98%
Shared LMEM 4096 504 12%

NPC5(halfpint vty)# show jnh 0 pool bytes
Name               MemType     Total(b)   Used(b)    (%)   Free(b)   (%)
Next Hop           EDMEM       50331648   49604736   98%   726912    2%
                   IDMEM       2424832    1601720    66%   823112    34%
Firewall           EDMEM       16777216   267328     1%    16509888  99%
                   IDMEM       2228224    1276752    57%   951472    43%
Counters           EDMEM       41943040   468112     1%    41474928  99%
                   IDMEM       65536      28832      43%   36704     57%
HASH               EDMEM       65798144   65798144   100%  0         0%
ENCAPS             EDMEM       34078720   34078720   100%  0         0%
LMEM               LMEM        4096       504        12%   3592      88%
OMEM               OMEM        264634432  264634432  100%  0         0%
UEID_SPACE         0x01c00000  1048576    131        < 1%  1048445   >99%
UEID_SHARED_SPACE  0x01bf0000  65536      2          < 1%  65534     >99%
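
With output like this, the pools that are close to exhaustion can be picked out mechanically. The sketch below hard-codes the rows from the output above instead of parsing vty text. The 95% alert level and the function name are hypothetical.

```python
# (pool, memory type, total bytes, used bytes) copied from the output above.
ROWS = [
    ("Next Hop", "EDMEM", 50331648, 49604736),
    ("Next Hop", "IDMEM", 2424832, 1601720),
    ("Firewall", "EDMEM", 16777216, 267328),
    ("Firewall", "IDMEM", 2228224, 1276752),
    ("Counters", "EDMEM", 41943040, 468112),
    ("Counters", "IDMEM", 65536, 28832),
    ("HASH", "EDMEM", 65798144, 65798144),
    ("ENCAPS", "EDMEM", 34078720, 34078720),
    ("LMEM", "LMEM", 4096, 504),
    ("OMEM", "OMEM", 264634432, 264634432),
]
THRESHOLD_PCT = 95  # hypothetical alert level


def nearly_full(rows, threshold_pct=THRESHOLD_PCT):
    """Return (pool, memtype, percent used) for pools at or above the threshold."""
    hot = []
    for pool, mem, total, used in rows:
        pct = used * 100 // total  # integer percent, rounded down
        if pct >= threshold_pct:
            hot.append((pool, mem, pct))
    return hot


for pool, mem, pct in nearly_full(ROWS):
    print(f"{pool:10s} {mem:6s} {pct}%")
```

Rounding down reproduces the percentages printed in the table. The sketch only flags high utilization; it cannot tell whether a pool at 100% is a fixed allocation or a real shortage.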



12. RUNNING OUT OF LOOKUP MEMORY ?
NPC5(halfpint vty)# sho jnh 0 pool usage
EDMEM overall usage:
[NH///|FW///|CNTR////////|HASH///////////////|ENCAPS////|---------------------------]
0 2.0 4.0 9.0 16.8 20.9 32.0M
Next Hop
[***************|-----] 2.0M (75% | 25%)
Firewall
[|--------------------] 2.0M (1% | 99%)
Counters
[|--------------------------------------------------] 5.0M (1% | 99%)
HASH
[********************************************************************************] 7.8M (100% | 0%)
ENCAPS
[*****************************************] 4.1M (100% | 0%)

NPC5(halfpint vty)# show jnh 0 pool usage bytes
EDMEM overall usage:
[NH/////////////|FW///|CNTR////////|HASH///////////////|ENCAPS////|-----------------]
0 48.0 64.0 104.0 166.8 199.3 256.0MB
Next Hop
[************************************************************|-] 48.0MB (98% | 2%)
Firewall
[|--------------------] 16.0MB (1% | 99%)
Counters
[|--------------------------------------------------] 40.0MB (1% | 99%)
HASH
[********************************************************************************] 62.8MB (100% | 0%)
ENCAPS
[*****************************************] 32.5MB (100% | 0%)
13. WHAT TO CAPTURE WHEN PACKET DROP
HAPPENS ?
The information required will depend on the packet drop type
Transit packet drop ?
Hostbound packet drop ?
Drop related to certain error ?
ASIC wedge problem ?

It’s good to collect some baseline information anyway
Command list is in ~swong/public_html/trinity_wedge.cmd
Collect at least 3 times
Get an MPC coredump file



14. ANYTHING I CAN READ MYSELF ?
Debugging tips from Engineering
https://wintermute.juniper.net/projects/trinity/trinity-software/trinity-debugging

Fabric hardening project
http://cvs.juniper.net/cgi-bin/viewcvs.cgi/*checkout*/sw-projects/platform/trinity/pfe/RLI/Trio_Wedge_Detection_Design_Spec_RLI_17022.docx
http://cvs.juniper.net/cgi-bin/viewcvs.cgi/sw-projects/platform/atlas/fabric_hardening/15438_fabric_hardening_funcspec.txt

A summary of different issues being seen from the field
http://confluence.jnpr.net/confluence/display/IPGE/Trinity+Wedges
http://confluence.jnpr.net/confluence/display/IPGE/Trinity+Fabric+Stuff



MORE QUESTIONS?
