Download as pdf or txt
Download as pdf or txt
You are on page 1of 18

Recent Progress in the

APL Fault Management


Process
Kristin Fretz & Adrian Hill
JHU/APL
Overview
§  2008 FM Workshop published a White
Paper Report identifying 12 top-level
findings and recommendations for
improving FM systems
§  APL recognized these needs and
improved its FM and Autonomy processes
to address many of these findings
q  Processes are part of APL’s Quality
Management System (QMS)
q  Changes implemented on the Radiation Belt
Storm Probes (RBSP) project
§  Presentation will highlight 6 of the
findings, discuss the APL process
changes, and the impact of the changes
on both the FM and Autonomy processes
APL FM Development Process
RBSP Overview
2  Observatories  (RBSP-­‐A,  RBSP-­‐B)  
Launch  and  Orbit  Inser=on   • Nearly  iden.cal,  single  string  architecture  
• Single  EELV  (Observatories  Stacked)  
• Launch  from  KSC   • Spin  Stabilized  ~5  RPM  
• Each  observatory  independently  released   • Spin-­‐Axis  15°-­‐27°  off  Sun  
Sun  pointed   • APtude  Maneuvers  Every  21  days  
• LV  performs  maneuver  to  achieve  nominal   • No  on-­‐board  G&C  aPtude  control    
lapping  rate   • Opera.onal  Design  Life  of  2  years  
• Launch:    May  2012  
Perigee  Al.tudes  
605  &  625  km  
APL  Ground     NEN  Sta.on(s)    &  30,540  km
 
Sta.on   •   Data  Augmenta.on   d es   30,410
Apogee  Al.tu Inclina.on  
•   Primary   •   Back-­‐up  
10°  
TDRSS  

Differing  apogees  allow  for  simultaneous  


measurements  to  be  taken  over  the  full  range  of  observatory  
separa.on  distances  several  .mes  over  the  course  of  the  
Cri.cal  Events   mission.    This  design  allows  one  observatory  to  lap  the  other  
at  Launch   every  75  days.  
Commands  &  Telemetry  
Decoupled  Opera.ons  
Instrument  
commands   • Basic  Approach  Based  on  TIMED,  STEREO  
MOC  
Instrument  
Telemetry   EFW   EMFISIS   ECT   PSBR   RBSPICE  
SOC   SOC   SOC   SOC   SOC  
Finding 2:
Diffused FM
Finding:
§  “Responsibility for FM currently is diffused throughout
multiple organizations; unclear ownership leads to gaps,
overlap and inconsistencies in FM design,
implementation and validation”
Recommendation:
§  “Establish clear roles and responsibilities for FM
engineering”
§  “Establish a process to train personnel to be FM
engineers and establish or foster dedicated education
programs in FM”
Implemented Change for Finding 2:
Diffused FM
§  APL split FM functionality into two distinct roles: Fault
Management and Autonomy
§  FM is a system engineering function that:
q  Develops concept of operations
q  Develops FM architecture
q  Oversees reliability analyses such as FMEA/PRA
q  Defines requirements and allocation of those requirements to
hardware, software, autonomy, and operations
q  Verifies and validates system with mission-level tests during I&T
§  Autonomy is a software engineering function that:
q  Designs and implements autonomy rule-based system derived
from allocated FM requirements
q  Performs unit-level/subsystem-level verification
Finding 4:
Insufficient Formality of Documentation

Finding:
§  “There is insufficient formality in the documentation of
FM designs and architectures, as well as a lack of
principles to guide the processes”
Recommendation:
§  “Identify representation techniques to improve the
design, implementation and review of FM systems”
§  “Establish a set of design guidelines to aid in FM design”
Implemented Change for Finding 4:
Insufficient Formality of Documentation
§  APL’s QMS has formal FM and Autonomy engineering
processes which define required documentation and
reviews
FM     Autonomy  
FM  Architecture  Document   N/A  
FM  Requirements  Document   Autonomy  Requirements  Specifica.on  
FM  Design  Specifica.on   Autonomy  Design  Spec  &  Users  Guide  
FM  Test  Plan  &  Verifica.on  Matrix   Autonomy  Acceptance  Test  Plan  
FM  Test  Procedures   Autonomy  Acceptance  Test  Specifica.on  
FM  Test  Reports   Autonomy  Acceptance  Test  Report  
(includes  verifica.on  matrix)  
§  RBSP FM and Autonomy developed system and
subsystem diagrams to communicate information
captured in architecture, requirements, and specifications
Implemented Change for Finding 4:
Insufficient Formality of Documentation
§  Example of Fault Management Modes Diagram used to convey interactions between
Hardware, Software, Autonomy, and Ground/Operations for managing spacecraft faults
Ground Commands
XCVR FSW
OPERATIONAL MODE self-initiated reset RF processor watchdog
•Normal power-positive conditions exist timeout
•Spacecraft is available for science data collection Nominal Operations Configurations
Communication C&DH FSW self-initiated reset
XCVR FSW
self-initiated reset RF processor watchdog
Power Low battery current not during eclipse or
current controller over-power Battery electronics over-power Payload Instrument power down timeout
request, instrument
RF Processor Reset heartbeat failure, or over-power
Avionics IEM processor
watchdog timeout Communication
Battery Discharge Battery Electronics Protection
Protection RF Electronics
Individual Establish
XCVR FSW defaults to RF Processor Reset
Power Off Disconnect Non-Optimal Instrument Safing Emergency IEM Non Power-On Reset
Current
Reset PSE Battery
Turn Battery
Battery Reboots Emergency RF Electronics
Interface Card Electronics Off Comm Recover Establish
Controller Electronics Maintenance Comm
Power Off
C&DH
Existing SSR Spin Pulse
Establish
Emergency
XCVR FSW defaults to
Emergency
FSW File System Management Reboots Emergency
Instrument Comm Comm
Reboots (3x max) Comm
Battery over-temperature Battery Temperature
returns to nominal SSPA over-power IEM Reset CCD Command SSPA over-power
Battery Over-Temperature Protection Battery Over-Temperature
Recovery Latch Valve IEM Reset via IEM CCD RF Electronics
Enable
G&C/Propulsion continuous current
RF Electronics Protection
Disable Set Charge Enable Over- Enable Over- Reset IEM Recover
Coulometer
Charge Control
Current to
Trickle
Temperature
Recovery
Coulometer
Charge
Temperature
Protection
Protection
Latch Valve
Continuous-Current
non-critical
modules*
Existing SSR
File System
Power Off
Control SSPA
Power Off Power-Off *Resets SBC, SSR (lose open files only), SCIF (non-critical) SWCLT expiration or
FSW off-pulse command Latch Valve Transceiver over-power
to PSE I/F card SSPA IEM off-pulse
Local

Load over-current/power RF CCD off-pulse command RF CCD Command


Minimum sun angleSWCLT
violation, expiration or
thruster armed in eclipse, Software Command Loss Timer
Load PDU Off-Pulse via RF CCD off-Nominal spinTransceiver
rate, or IEM Off-Pulse via RF CCD
over-power
VT Off-Pulse
Over-Current Maneuver Abort thruster arm time-out Interrupt IEM Reinitialize Ground Off-pulse
RF Electronics
Establish
defaults to
1 of 3 VT CB or AUT Unswitched File System Recovery of XCVR Emergency
Ground Disable Thruster Emergency
Regulators removes Power* (Destructive) IEM** (3x max) Comm
PDU Off-Pulse Reconfigures Comm
Off-Pulsed
(Limited by PSE I/F)
Switched Power
from Load
PDU Software Command Loss Timer
Fire I/F & Power
Off Catbeds *Resets SBC, SSR (lose all data), SCIF (full FPGA)
**Restore MET, reload time-tagged cmds & SW updates, restart science data collection
Response Location

HWCLT Expiration
RF Electronics
LVS or LBSOC Off-pulse Establish
XCVR Softdefaults
LVS to MaxEmergency
Sun Angle Exceedance Hardware Command Loss Timer Sequence
Emergency
PDU Load Shed Sequence (3x max)
Comm
Comm
SAFE MODE PDU
Off-Pulse
XCVR
Off-Pulse
IEM
Off-Pulse
Reinitialize
File System
Switched Loads Separated •Protects against life-threatening low-power (Destructive)
PGS VT-only
OFF
or communication fault HWCLT Expiration
Separated
•No science data collection Separated

KEY
Not
Separated
Hardware Command Loss Timer Sequence
Go-Safe Sequence
Post-Separation Sequence Disable Disable C&DH/ BME, PSE Power on RF Load
System

Assert PDU

Autonomy Autonomy Post-Sep Macro Separated & Thruster Fire I/F Delete Inst CC, & Downlink & Resume SSR Re-enforcement
PDU Detects 3 of
4 Switches and
IEM Provides
Switch Status to
Detects 3 of 4 (Enable Safety Nominal -OR- Fault LVS or LBSOC & Power Off Time-tag Instruments Configure Emer Operations
Reinitialize (SS, HTRs,
Go-Safe
Relay KEY
Switches via IEM Buses, RF ON, Sun SA Deploy Macro PDU XCVR
Catbeds Commands IEMOFF Comm PSE I/F ON)
Removes Inhibit Autonomy
telemetry Sensor ON) File System Autonomy
Off-Pulse Off-Pulse Off-Pulse
Hardware Rule that triggers post-sep macro disabled by ground after confirmation that macro successfully completed
(Destructive)
Hardware
MOPS
MOPS Launch Vehicle Separation
FSW
LVS or
Launch Safe
FSW LBSOC
Configuration Configuration
Implemented Change for Finding 4:
Insufficient Formality of Documentation
§  Example of Autonomy Design Diagram used to convey implementation of monitors
and responses that comprise the on-board autonomy system for responding to faults
RF COMMUNICATIONS LAUNCH SAFE MODE DETECT SUN ANGLE MAX [R12]
If sun angle exceeds Maximum Sun Angle
SSR INITIALIZATION
[SV04] and either Mission Phase [SV03] is
DETECT SW CLT TIMEOUT [R15] DETECT ANY CDH REBOOT [R01] DETECT SEPARATION [R05] DETECT SA DEPLOY READY [R06] DETECT CDH REBOOT WITH SSR
ORBIT or Deployments Complete Flag
If time elapsed since If C&DH processor reboots for any If 3 (or more) of 4 switches indicate separation and If 3 (or more) of 4 switches indicate sep -AND- CONTENTS LOST [R04]
DETECT HARDWARE LVS [R10] [SV11] is TRUE
Time of Last SWCLT Restart reason and (spacecraft is separated Mission Phase [SV03] is LAUNCH for 5 seconds Mission Phase [SV03] is LAUNCH -AND- If C&DH processor reboots and SSR
If Hardware LVS detected and either
[SV01] exceeds SWCLT Interval or Mission Phase [SV03] is ORBIT) Time of Separation [SV05] is non-zero -AND- contents are lost (e.g., IEM Off-
Mission Phase [SV03] is ORBIT or
[SV02] and (spacecraft is separated ( time elapsed since Time of Separation [SV05] DETECT HARDWARE CLT [R13] Pulse) and SSR H/W Initialization of
Deployments Complete Flag [SV11] is
or Mission Phase [SV03] is ORBIT) exceeds SA Deploy Delay Interval [SV13] -OR- If C&DH processor reboots as result of HW its SDRAM is complete
TRUE
POST-SEPARATION SEQUENCE EXECUTIVE LVS detected -OR- LBSOC is detected ) CLT expiration (HWCLT Relay set) and
[M05] (spacecraft is separated or Mission Phase
q  
RF COMMUNICATIONS
Set Time of Separation [SV05] to current DETECT LOW BATT SOC [R11] [SV03] is ORBIT)
POWER ON RF DOWNLINK PATH
POWER ON RF DOWNLINK PATH FSW uptime If LBSOC is detected and and either REINITIALIZE SSR RECORDING
AND OFF PULSE TRANSCEIVER
AND CONFIGURE EMERGENCY q   CALL PERFORM POST-SEPARATION Mission Phase [SV03] is ORBIT or [M04]
[M15] DETECT SOFTWARE LVS [R14]
COMM [M01] SEQUENCE [M07] SA DEPLOYMENTS EXECUTIVE [M06] Deployments Complete Flag [SV11] is q   Enable SSR memory scrub
q   Power on SSPA (A) If batt voltage is less than Software LVS
q   Power on SSPA (A) q   Sleep 20 seconds q   Sleep 60 seconds TRUE q   Create new SSR file system
q   CALL OFF PULSE
q   Sleep 5 seconds q  
DETECT SW CLT TIMEOUT
CALL PERFORM POST-SEPARATION
[R15]
q  
DETECT
CALL PERFORM X-AXIS SA DEPLOYMENTS [M08]
ANY CDH REBOOT [R01] Threshold [SV14] and either Mission Phase
(destructive)
TRANSCEIVER [M16] [SV03] is ORBIT or Deployments Complete
q   Disable baseband uplink
q   Issue transceiver firmware SEQUENCE If [M07]
time elapsed since q   Sleep 60 seconds If C&DH processor reboots for any Flag [SV11] is TRUE
q   Sleep 5 seconds
q  
q  
reset
Sleep 15 seconds
Time of Last SWCLT Restartqq     CALL PERFORM Y-AXIS SA DEPLOYMENTS [M13]
Sleep 60 seconds
reason and (spacecraft is separated CALL RECORD ENGINEERING
DATA TO SSR [M27]
DETECT TRANSCEIVER EXCESS
q   CALL CONFIGURE RF FOR [SV01] exceeds
PERFORM POST-SEPARATION SEQUENCE
SWCLT Interval
q   or
CALL PERFORM X-AXIS SA DEPLOYMENTS Mission
[M08] Phase [SV03] is ORBIT) PERFORM GOSAFE [M10]
EMERGENCY COMM [M19] q   Sleep 60 seconds q   CALL DISABLE THRUSTERS [M20]
POWER [R16] [M07] [SV02] and (spacecraft is separated q   CALL PERFORM Y-AXIS SA DEPLOYMENTS [M13] q   Delete instrument timetags
If transceiver power exceeds limits q   Enable RF safety buses (A&B) DETECT CDH REBOOT WITH SSR
q  
or Mission Phase [SV03] is ORBIT)
Enable actuator safety buses (A&B) q   Set Deployments Complete Flag [SV11] to TRUE q   Suspend C&DH timetags at priorities 4-15
CONTENTS PRESERVED [R03]
q   CALL POWER OFF BATTERY
q   Enable propulsion safety buses (A&B) If C&DH processor reboots and SSR
MANAGEMENT ELECTRONICS [M42]
q   Enable battery conn relays (1&2) RECORD ENGINEERING DATA TO contents are preserved (subject to
q   Power off PSE current controller
CONFIGURE RF FOR q   Enable SAJB Relay at Reduced Curr (A&B) SSR [M27] maximum 3 attempts)
OFF PULSE TRANSCEIVER [M16] q   CALL SHUTDOWN PAYLOAD [M47]
EMERGENCY COMM [M19] q   Select VT Level #4 q   Open files on SSR to record
DisconnectPOWER ONrelay
RF DOWNLINK PATH
q   Off-pulse transceiver (h/w- PERFORM X-AXIS SA PERFORM Y-AXIS SA q   CALL POWER ON RF DOWNLINK PATH
q   Configure FSW tables for q  
limited to 3 attempts)
BME cell bypass DEPLOYMENTS [M08] POWER ON[M13]
DEPLOYMENTS RF DOWNLINK PATH AND CONFIGURE EMERGENCY COMM engineering data (includes
emergency downlink rates q   CALL POWERAND ONOFF PULSEPATH
RF DOWNLINK TRANSCEIVERq   Fire +X primary / -X q   Fire +Y primary / -Y housekeeping, event message
q   Disable FSW playback AND CONFIGURE EMERGENCY COMM ANDredundant CONFIGURE EMERGENCY [M01]
and command status packets) RESUME SSR RECORDING [M03]
off pulse will
q   Sleep 30 seconds [M15] redundant actuators actuators q   CALL RECORD ENGINEERING DATA TO q   Enable SSR memory scrub
cause transceiver [M01] q   Sleep 5 seconds qCOMM
  Sleep[M01]
5 seconds SSR [M27]
software reboot q   Configure transceiver for q   q  sensorPower
Power on sun on
electronics unitSSPA (A) q   Fire +X redundant / -X qq     FirePower
+Y redundant / -Y
q   Recover existing SSR file system
emergency uplink/downlink on SSPA (A) q   Power on propulsion module heaters q   Sleep 5 seconds
q   CALL OFF PULSE primary actuators primary actuators q   Power on sun sensor electronics unit
DETECT TRANSCEIVER *** This macro is disabled by default for I&T *** q   Sleep 5 seconds q   Power on bus survival heaters
q   CALL RECORD ENGINEERING
SOFTWARE REBOOT [R17] TRANSCEIVER [M16] RECORD INSTRUMENT DATA TO DATA TO SSR [M27]
DETECT SSPA EXCESS POWER q   Issue transceiver firmware q   Inhibit PDU Response to Low Battery State of SSR [M28] q   CALL RECORD INSTRUMENT
If transceiver flight software reboots
[R18] q   Disable baseband uplink Charge from PSE Interface Card
unexpectedly
If SSPA power exceeds limits
reset q   Power on PSE interface card
q   Open files on SSR to record DATA TO SSR [M28]
instrument data
q   Sleep 15 seconds q   Assert PDU Go-Safe relay

SAVE TRANSCEIVER REBOOT q   CALL CONFIGURE RF FOR


CAUSE [M17]
DETECT TRANSCEIVER EXCESS
EMERGENCY COMM [M19]
q   Save transceiver reboot flags
in Last Transceiver Reboot
POWER OFF SSPA [M18] C LATCH
POWERVALVE
[R16] ACTUATION
q   Power off SSPA (A&B) If transceiver power exceeds limits
q  
Cause [SV12]
Clear transceiver reboot flags
DETECT LV-1 OPEN/CLOSE CMD
CONTINUOUS CURRENT [R33]
DETECT LV-2 OPEN/CLOSE CMD
CONTINUOUS CURRENT [R34]
DETECT XOVER LV OPEN CMD
CONTINUOUS CURRENT [R35]
PGS DETECT PSE CURR CNTRLR
q   CALL CONFIGURE RF FOR OVERCURRENT [R46]
If latch valve-1 open or close If latch valve-2 open or close If xover latch valve open command
EMERGENCY COMM [M19] If PSE current controller load current
command asserts continuous current command asserts continuous current asserts continuous current for 2
DETECT BATTERY MANAGEMENT DETECT PSE INTERFACE CARD exceeds limits
for 2 seconds for 2 seconds seconds
ELECTRONICS EXCESS POWER EXCESS POWER [R44]
CONFIGURE RF FOR
OFF PULSE TRANSCEIVER [M16] [R42] If PSE Interface card power exceeds
EMERGENCY COMM [M19] If battery management electronics limits
MANEUVER ABORTS q   Off-pulse transceiver (h/w- power exceeds limits
DETECT PSE CURR CNTRLR
REMOVE LATCH VALVE-1 OPEN/ REMOVE LATCH VALVE-2 OPEN/ q REMOVE
  Configure
XOVER LATCH FSW tables for
VALVE CIRCUIT BREAKER TRIP [R47]
limited
CLOSE PULSE to 3[M33]
POWER attempts) CLOSE PULSE POWER [M34] OPEN PULSE POWER [M35] If PSE current controller trips circuit
DETECT SPIN RATE ERROR [R20] DISABLE THRUSTERS [M20] q   Terminate latch valve-1 open q   Terminate latch valve-2 open q  
emergency downlink rates
Terminate xover latch valve breaker
DETECT PSE INTERFACE CARD
If spacecraft spin rate is below q   Disable IEM thruster fire command pulse
off actuation
pulse will command pulse actuation q   Disable
open FSW
cmd pulse playback
actuation DETECT BATTERY MANAGEMENT
CIRCUIT BREAKER TRIP [R45]
Minimum Spin Rate [SV07] or interface q   Terminate latch valve-1 close q   Terminate latch valve-2 close ELECTRONICS CIRCUIT
exceeds Maximum Spin Rate q   Power off catbed heaters command cause transceiver
pulse actuation command pulse actuation
q   Sleep 30 seconds BREAKER TRIP [R43]
If PSE Interface card trips circuit
breaker DETECT UNEXPECTED BATTERY
[SV06] and manuever in progress software reboot q DETECT
  Configure
XOVER LV CLOSEtransceiver
CMD for If battery management electronics
DISCHARGE [R48]
CONTINUOUS CURRENT [R38] trips circuit breaker
emergency uplink/downlink
If xover latch valve close command
If battery is discharging while shunt
current is above acceptable limits
DETECT MANEUVER EXECUTION PRESSURE XDCRS PROCESSOR RAMTRANSCEIVER
DETECT DISK SSR MAINTENANCE asserts continuous current for 2
seconds POWER OFF PSE INTERFACE
TIMEOUT [R21] SOFTWARE REBOOT [R17] CARD AND CURRENT
If maneuver execution longer than DETECT PRESSURE DETECT CDH REBOOT [R02] DETECT SSR RESET [R26] DETECT SSPA EXCESS POWER CONTROLLER [M44]
If C&DH reboots andIf transceiver
this is the first flight
software reboots
If FSW has induced an SSR Reset POWER OFF BATTERY
Maximum Maneuver Time [SV08] TRANSDUCERS OVERCURRENT [R18] MANAGEMENT ELECTRONICS
q   Inhibit PDU Response to Low
[R29] unexpectedly
reboot since last ground contact REMOVE XOVER LV CLOSE Battery State of Charge from
If pressure transducers load current If PULSE
SSPAPOWER
power exceeds limits
[M38]
[M42]
PSE Interface Card
RESET PSE INTERFACE CARD [M46]
q   Throw relay to disconnect q   Inhibit PDU Response to Low Battery
exceeds limits q   Terminate xover latch valve q   Power off PSE interface card
Battery Management State of Charge from PSE Interface
close cmd pulse actuation q   Power off PSE current
Electronics from battery Card
DETECT MANEUVER EXECUTION controller
IN ECLIPSE [R22]
RECORD HK TO PROCESSOR SAVE TRANSCEIVER
RAM DISK REBOOT
RE-ENABLE SSR MEMORY q   Sleep 2 seconds q   Power off PSE current controller
[M02] SCRUB [M26] q   Power off Battery q   Reset PSE Interface Card
If solar array current is below limits or POWER OFF PRESSURE q   Change record CAUSE [M17]
rate of G&C Maneuver q   Re-enable SSR memory scrub Management Electronics
no sun pulses and maneuver in TRANSDUCERS [M29] Tlm Packet to 1qpacket
LEGEND   every
Save 300 transceiver reboot
q   flags
Clear count of FSW-induced DETECT BATTERY OVER- RESUME BATTERY OVERTEMP
progress q   Power off pressure seconds SSR resets POWER OFF SSPA [M18] TEMPERATURE [R40] MONITORING [R41]
transducers q   in Last
Record engineering data (h/k, evt Transceiver Reboot
q   Power off SSPA (A&B) If battery temperature exceeds over- If battery temperature is below
RULE [ID] q  
msgs, cmd status) to localCause
RAM disk [SV12]
Freeze recording of CFE_EVS log
temp limit and charge rate is above
limit and coulometer charge control is
acceptable limits and PSE current
DETECT SUN ANGLE VIOLATION q   q   errors
Dump memory scrub Clear
to RAM transceiver reboot flags enabled
controller is powered on and
coulometer charge control is disabled
[R23]
If sun angle is below Maneuver Min LEGEND q  
disk
Sleep 60 minutes
q   CALL CONFIGURE RF FORSUN SENSOR
Sun Angle [SV10] or beyond
MACRO [ID] q   EMERGENCY
Halt recording of engineering data to COMM [M19]
Maneuver Max Sun Angle [SV09] RULE [ID] local RAM disk DETECT SUN SENSOR
DETECT SUN SPIN PULSE USE DETECT SIMULATED SPIN PULSE
DISABLE COULOMETER CHARGE RESUME COULOMETER CHARGE
and maneuver in progress q   Restore record rate of G&C Maneuver MISCONFIGURATION [R31] USE MISCONFIGURATION {R32]
OVERCURRENT [R30] CONTROL [M40] CONTROL [M41]
Tlm Packet to 1 packet every second If sun sensor based spin pulse is If hardware timer spin pulse is
If sun sensor load current exceeds q   Disable coulometer charge q   Enable coulometer charge
MACRO [ID] selected but sun presence is not selected but sun presence is
Stor Var [ID] limits
detected (i.e., sun sensor eclipse) detected (i.e., sun sensor pulse is A q  
control
Set battery chrg rate to C/100
control
DETECT CDH REBOOT AND sensed)
CATBED ON [R24] Stor Var [ID]
If C&DH processor reboots for any
POWER OFF SUN SENSOR [M30]
To
reason and catbed heaters were left SHUTDOWN
q   Power off sun sensor SELECT SIMULATED SPIN PULSE SELECT SUN SENSOR SPIN
powered on
electronics unit [M31] PULSE [M32] PAYLOAD
MOPS MOPS q   Configure IEM to use q   Configure IEM to use q   Configure IEM to use Sun
Macro
Hardware Timer Spin Pulse Hardware Timer Spin Pulse Sensor Spin Pulse
(Page 2)
Findings 7 & 8:
FM Complexity
Finding 7:
§  “The impact of mission-level requirements on FM complexity
and V&V is not fully recognized”
q  “Review and understand the impacts of mission-level requirements
on FM complexity; FM designers should not suffer in silence, but
should assess and elevate impacts to the appropriate levels of
management”
Finding 8:
§  “FM architectures often contain complexity beyond what is
defined by project specific definitions of faults and required
fault tolerance”
§  “Increased FM architecture complexity leads to increased
challenges during I&T and mission operations”
q  “Assess the appropriateness of the FM architecture with respect to
the scale and complexity of the mission, and the scope of the
autonomy functions to be implemented within the architecture”
Implemented Change for Findings 7 & 8:
FM Complexity

§  FM involvement on RBSP started in early phase-A as a


part of the systems engineering team
q  Involved in mission-level trades and assessed impact of
mission-level decisions on FM
q  FM “ownership” of requirements at all levels with
requirements allocated from mission-to-system-to-
subsystem
§  Development of FM Architecture Document defined FM
approach for the project
q  Architecture document now a formal part of FM process
which helped to communicate FM concepts/approach
q  Resulted in project “buy-in” on FM approach early in
mission development
Finding 11:
Inadequate Testbed Resources
Finding:
§  “Inadequate testbed resources is a significant schedule driver
during V&V”
Recommendation:
§  “Develop high-fidelity simulations and hardware testbeds to
comprehensively exercise the FM system prior to spacecraft
level testing”
Implemented Change for Finding 11:
Inadequate Testbed Resources
§  RBSP flight software testbeds available for both FM and
Autonomy testing
q  Available early and provided enough fidelity to verify
Autonomy system
q  Allowed for dry-running portions of FM system-level tests
§  RBSP project also developed high-fidelity hardware-in-the-
loop (HIL) simulator
q  Purpose of HIL was to provide high-fidelity test platform for
the development of FM system tests prior to spacecraft I&T
•  Due to hardware issues completion and availability of HIL was
delayed until late I&T
q  FM system-level testing (dry-run and for-score) was
performed on spacecraft without the opportunity to dry-run
on the HIL
Finding 1:
Cost & Schedule
Finding:
§  “Unexpected cost and schedule growth during final system
integration and test are a result of underestimated Verification
and Validation (V&V) complexity combined with late resource
availability and staffing”
Recommendation:
§  “Allocate FM resources and staffing early, with appropriate
schedule, resource scoping, allocation, and prioritizing;
schedule V&V time to capitalize on learning opportunity”
§  “Establish Hardware / software / “sequences” /operations
function allocations within an architecture early to minimize
downstream testing complexity”
§  “Engrain FM into the system architecture. FM should be “dyed
into design” rather than “painted on””
Implemented Change for Finding 1:
Cost & Schedule
§  RBSP spacecraft-level testing for both FM and autonomy
started early in I&T
q  FM tests were ready at start of I&T which allowed for dry-
running of tests (utilized partially integrated spacecraft)
q  Staffing for FM and Autonomy testing increased in
response to lessons learned from New Horizons and
MESSENGER
§  FM Design Specification clearly defined allocations of
FM functions to hardware, software, autonomy, and
operations
§  Improved approach, planning, and staffing resulted in
avoiding “death spiral” and staffing “bump” during I&T
Implemented Change for Finding 1:
Cost & Schedule
§  Past missions resulted in staffing “bump” during I&T
§  Noticeable change in shape of staffing curve for RBSP
q  FM/Autonomy waited on the spacecraft for testing rather than spacecraft waiting
on FM/Autonomy
q  Remaining RBSP time is projected

KEY:  
 Past  APL  Mission  #1  
 Past  APL  Mission  #2  
 Past  APL  Mission  #3  
 RBSP  
Staff  Months  

-­‐70.00   -­‐60.00   -­‐50.00   -­‐40.00   -­‐30.00   -­‐20.00   -­‐10.00   0.00   10.00  


Months  to  Launch  
Conclusion

§  Significant changes seen in staffing curve for RBSP as


compared to previous APL missions
§  Credit process changes for these improvements:
q  Better defined FM/Autonomy roles
q  Required documentation at key milestones
q  Use of diagrams as communication tool for documentation
q  Early program involvement to manage complexity
q  Improved testbed resources
§  APL FM/Autonomy processes continuing to evolve by
using lessons learned from RBSP and findings from
2008 FM Workshop

You might also like