Download as pdf or txt
Download as pdf or txt
You are on page 1of 8

Fault tolerant TTCAN networks

B. Müller, T. Führer, F. Hartwich, R. Hugel, H. Weiler, Robert Bosch GmbH

TTCAN is a time triggered layer using the CAN protocol to communicate in a time trig-
gered fashion. As TTCAN is based on CAN it uses the power of CAN´s error detection
mechanisms and robustness, but it also provides a step towards determinism and
time triggered technology.
Future system architectures will include applications that need to access more than
one TTCAN controller. This article describes how to build fault-tolerant TTCAN net-
works, in particular the mechanisms to synchronize different TTCAN busses. It is
shown that it is very easy to implement a synchronized network of any reasonable
redundancy level, even if non-trivial architectures (for instance more than a simple
dual channel network) are involved. Moreover, this synchronization can be achieved
even when the individual TTCAN busses use different time bases without ever violat-
ing the modular integrity of one single bus.

1 Introduction 2 TTCAN
There is a variety of real-time bus-systems As fault tolerant TTCAN networks consist
that are used to connect electronic control of combinations of TTCAN busses we be-
units in automation or in the automobile. gin with a short description of the TTCAN
Most of these communication protocols bus. The interested reader may find more
are one channel systems, i.e. although detailed descriptions in [1].
there are possibly some fault-tolerance
mechanisms, there is no really redundant 2.1 TTCAN Basics
transmission of messages. In some safety
Time triggered communication in TTCAN
critical applications however, redundant
is based on the reference message being
message transmission becomes a re-
transmitted regularly by the time master.
quirement.
Following the reference message there is
A time triggered variant of CAN, denoted
a sequence of time windows that provide
in the sequel by TTCAN, is described by
the time slots for individual message
the ISO standard 11898-4 (currently still a
transmissions. There are three types of
draft version). Essentially CAN and hence
time windows: exclusive time windows that
TTCAN is a one channel system, redun-
are exclusively reserved for one message,
dancy can only be provided by using mul-
arbitrating time windows during which
tiple TTCAN busses. However, compared
messages can compete for the bus by the
with intrinsically redundant systems (e.g.
non-destructive arbitrating mechanism of
FlexRay, TTP/C), the use of multiple sin-
CAN, and free time windows that are re-
gle channel busses introduces the prob-
served for future extensions of the net-
lem of management of redundancy. This
work. The pattern of time windows follow-
mainly consists of synchronizing the dif-
ing a reference message is called a basic
ferent busses, but it must also be ensured
cycle, i.e. each basic cycle starts with a
that the main services of a time triggered
reference message and contains an off-
communication system (providing a global
line configured set of time windows.
time and a consistent schedule all over the
In TTCAN not all basic cycles necessarily
network) can be used by an application
have to be the same. It is possible to dis-
from either of the channels. This means it
tinguish different basic cycles by the cycle
must be possible for an application to treat
count, a counter that is incremented each
the set of different busses as one commu-
cycle up to the maximum value after which
nication system. In the paper it is shown
it is restarted again. Combining all these
that the TTCAN interfaces allow to easily
different cycles we get the so called matrix
combine TTCAN busses in a modular way
cycle which represents the complete
so that this can be achieved even in sys-
communication overview of a TTCAN net-
tem architectures that go far beyond the
work.
standard dual channel scenario.
time, and global time. All three times run
with the same rate but use different and
varying relative phases to each other. Lo-
cal time is the controller internal basis for
all other times.
The basic network time unit is the so
called NTU and within a level 2 controller
the local time counter must be able to
count even fractional parts of an NTU (with
a fractional resolution of at least 3 bits).
With each reference message, cycle time
is restarted. More precisely: The sample
Figure 1 Example of a matrix cycle point of the SoF bit of any message gen-
erates a (logical) frame synchronization
pulse in the network. On occurrence of
2.2 TTCAN Timing such a frame synchronization pulse each
A TTCAN controller can be configured to controller captures the current value of its
support two levels, level 1 and level 2. local time and, after identification of a
Level 2 is an extension of level 1, and only message as reference message, treats
in level 2 high end synchronization and this value (the local reference mark) as the
global time are provided [1]. In the sequel starting point of cycle time, i.e. within a
we will always consider level 2 networks controller we have
although some of the mechanisms de- Cycle time = Local time – Local reference
scribed below can also be used in level 1 mark.
networks.
Within a TTCAN controller we basically Actually the fractional parts of this differ-
have three time notions: local time, cycle ence are ignored as cycle time only counts
NTUs.

Byte 1: Next_Is_Gap bit and cycle count


7 6 5 4 3 2 1 0
Next_Is_ Reserved Cycle Cycle Cycle Cycle Cycle Cycle
Gap Count (5) Count (4) Count (3) Count (2) Count (1) Count (0)

Byte 2: Fractional part NTU_Res of Master_Ref_Mark and discontinuity bit


7 6 5 4 3 2 1 0
NTU_Res NTU_Res NTU_Res NTU_Res NTU_Res NTU_Res NTU_Res Disc_Bit
(6) (5) (4) (3) (2) (1) (0)

Byte 3: Low byte of Master_Ref_Mark


7 6 5 4 3 2 1 0
MRM MRM MRM MRM MRM MRM MRM MRM
(7) (6) (5) (4) (3) (2) (1) (0)

Byte 4: High byte of Master_Ref_Mark


7 6 5 4 3 2 1 0
MRM MRM MRM MRM MRM MRM MRM MRM
(15) (14) (13) (12) (11) (10) (9) (8)

Figure 2: Reference message in TTCAN level 2


Global time is not only using the existence sage. Figure 2 shows the four mandatory
but also the content of the reference mes- bytes of the reference message, further
data bytes may be added by the applica- master´s) length of the NTU as good as
tion. possible. The TTCAN clock synchroniza-
By definition, at a given time, global time in tion automatically computes the optimum
a TTCAN network is what the current TUR value by use of the global time given
master thinks it is. So within the reference by the master and therefore ensures that
message the master transmits the global the different local times within the network
time value including fractional parts (the have the same rate.
Master_Ref_Mark) that is valid at the
pulse of the respective frame synchroniza- 2.3 TTCAN Initialization
tion. The local offset - the difference be-
In a single channel TTCAN network there
tween the Master_Ref_Mark and the cor-
can be up to 8 potential time masters, dis-
responding local reference mark - gives
tinguished by the three bit time master
the difference between local and global
priority. The time master priority is given
time at the point of frame synchronization,
by the three least significant bits of the
hence the local approximation to global
reference message that is transmitted by
time is given by
the respective potential time master. Al-
Global time = local time +local offset. though we do not describe the initialization
procedure in detail (see for instance [1]), it
This even holds for the time master. We
is important to note that eventually the
see that both cycle time and global time
potential time master with the highest time
are derived from local time, and both are
master priority becomes the time master of
resynchronized with every reference mes-
the TTCAN network.
sage. Again the fractional parts of this dif-
ference are ignored as cycle time only
2.4 The Gap Case
counts NTUs.
Moreover, each controller adjusts by use Although it is possible to transmit the ref-
of the clock synchronization algorithm the erence message completely periodically
rate of its local time, so that it almost per- this is not required in TTCAN. To support
fectly matches the rate of the master: this the time master announces in the ref-
Within a TTCAN controller the rate of local erence message by use of the
time is determined by the so called TUR Next_Is_Gap bit that after the current ba-
(time unit ratio) variable, a variable that sic cycle there will be a gap of undeter-
indicates the (non-integer) ratio between mined length. (However, the maximum
an NTU and the local clock period. TUR length of the gap is specified off-line). At
has a configuration value within each con- some point the application initiates the
troller, but a perfect adjustment to an os- transmission of a reference message and
cillator in another node can only occur if so brings the gap to an end. This typically
the TUR value is adapted so that the local is synchronized with some application
length of an NTU equals the global (the specific event.

Figure 3: The gap


This feature is particularly useful if one continuity of global time, i.e. global time
wants to synchronize applications to proc- just continues to increase during the gap
esses that are not periodic in the TTCAN as it does during a cycle.
time. Note that the gap does not influence
2.5 Global time discontinuities nization of the TTCAN global time to an
external source.
By definition the relation between
TTCAN's global time and any external time
3 Fault-tolerant TTCAN networks
is determined by the more or less random
time at which the initialization of the Although CAN is intrinsically a two wire
TTCAN network began. In some applica- system with strong error detection and
tions it is desirable to synchronize TTCAN handling capabilities it is usually consid-
global time to some external source. In ered a one channel system. This means
some cases the external source is not al- that fault-tolerance and redundant chan-
ways present, in particular not during ini- nels can only be achieved by combining
tialization (consider for instance synchro- two or more CAN busses. Naturally ex-
nization to GPS time during start of an actly the same is true for TTCAN. So this
automobile in a garage). In this case at section deals with the problem how to
some point of time there will be a TTCAN combine two or more TTCAN busses. The
network running with a global time that is subsequent definitions may seem ex-
significantly different from the external tremely formal when considering the com-
time and a synchronization problem parably obvious architectures they cover,
arises. A TTCAN controller allows the ap- however they allow a precise notation and
plication (of the master) to add some value they allow to demonstrate clearly the ex-
to the global time, but, as the TTCAN tremely strong fault-tolerance capabilities
global time is also used for protocol inter- that can be achieved by a combination of
nal clock synchronization reasons, this TTCAN networks.
must be signaled to the other nodes in the We call a system of two TTCAN busses a
network. This signal is given by setting the coupled TTCAN pair if there is at least one
discontinuity bit. The synchronization pro- “gateway” node that has access to both
cedure contains the following steps: TTCAN busses.
• The application determines the differ-
ence between external time and
TTCAN global time.
• The master is told to add this differ-
ence to the TTCAN global time. Channel 1
• When transmitting the next reference
message, the master uses the “new”
Channel 2
global time within the reference mes-
sage and sets the discontinuity bit. Figure 4: Classical redundant network
• The receivers of the reference mes-
sage adapt their global time according
to the reference message but do not
update their TUR value. Figure 4 shows a typical redundant net-
After these steps the TTCAN network is work, where each node is a “gateway”
synchronized to the external global time. node. This is the most straight forward
example of a coupled TTCAN pair. How-
2.6 Maintaining synchronization ever, a coupled TTCAN pair covers also
architectures as depicted in Figure 5.
The above procedure describes how to
synchronize the TTCAN global time to an
external source by a discontinuous jump.
However, to maintain synchronization it is
important to adjust the rates as well. As all
nodes will follow the master, it is sufficient Channel 1
to adjust the rate of the master. This can
be done by allowing the application to in- Channel 2
fluence the TUR value of the master. By
using both mechanisms it is very easy to Figure 5: Mixed redundant network
build and maintain a high quality synchro-
Within a system of many TTCAN busses vided for each channel and the manage-
we call any two of those TTCAN-coupled if ment of redundant transmission is the task
they are connected by a chain of coupled of a higher layer (e.g. FTCom in OSEK-
TTCAN pairs. More precisely: busa and time). When synchronizing two TTCAN
busb are TTCAN coupled if there is a se- busses there is a variety of synchroniza-
quence (bus1,...,busn) of TTCAN busses tion problems to solve. First of all each
with bus1 = busa and busn = busb where controller has three times. Local time es-
(busi,busi+1) are coupled TTCAN pairs for sentially is a controller internal time so
all i=1,...,n-1. Now a fault tolerant TTCAN cycle time and global time remain as can-
network is a system of TTCAN busses didates for synchronizing. Secondly both
where each two of them are TTCAN cou- of those times have a rate and a phase
pled. aspect to synchronize. Now by definition
the rates of global and cycle time within a
controller are equal as they are both de-
rived from local time. So three synchroni-
zation problems remain
Bus a • Phase synchronization of cycle time
• Phase synchronization of global time
Bus b Bus c • Rate synchronization
As for one TTCAN bus all nodes will follow
the time master in all three aspects it is
sufficient to solve these problems for the
master of the TTCAN bus. In the sequel
for shortness we use the abbreviation SL
for the synchronization layer that synchro-
nizes the different TTCAN busses.

4.1 Phase synchronization of cycle time


After initialization each TTCAN bus is run-
ning with its own time master, cycle time,
and global time. We first consider only two
busses. First of all the SL has to measure
the phase difference between the two
busses. This can be done in any of the
Figure 6: Examples for fault-tolerant TTCAN “gateway” nodes at any time as a phase
networks (essentially) is not time dependent. To
formulate this differently: As cycle time is
Using fault-tolerant TTCAN networks it is part of the interface of a TTCAN controller
possible to map the system safety struc- the phase difference measurement is ex-
ture into a system architecture that is best tremely simple for the SL and the real time
suited to solve the original problem. requirements for this measurement are
The main basic message of the paper is negligible for practical purposes.
that it is possible to treat all the busses of A few remarks on the possible generaliza-
a fault-tolerant TTCAN network as one tions and implications should be added.
communication system. First of all in order to have something like
a “phase” we implicitly might assume that
4 Synchronization of TTCAN busses the (nominal) cycle lengths on the two
busses are equal. Although this certainly
The main problem of a combination of two
will be one of the most important applica-
time triggered busses is that of synchroni-
tions, this is not at all necessary. Consider
zation. Redundancy management on
for instance the case where the cycle
message level itself is not a communica-
length of the first bus is twice the cycle
tion system problem as even in a two
length of the second. It is obvious that a
channel system it should be possible to
phase concept still is reasonably well de-
transmit different messages on the two
fined and that the measurement still is as
busses, so the message memory is pro-
easy as in the case when all cycle lengths not necessary that one of the gateway
are equal. One might even go for more nodes is a time master on any of the par-
complicated scenarios like two cycles on ticipating busses.
the one bus versus three on the other. A time master that has to produce a phase
The second point is that we also may have shift will set the Next_Is_Gap bit in the
assumed that the time unit, the NTU, is the next reference message and will use the
same on both busses. Again this certainly desired phase jump to initiate the trans-
will be an important application case but it mission of the reference message one
is not a necessary assumption. For in- cycle later. After the transmission of this
stance in some applications the resolution deferred reference message on both bus-
for one bus can be much higher than on ses the cycle times of both busses have
the other. This even extends to the use of the desired phase relation. Again by the
different baud rates. As long as the SL interfaces of the TTCAN controller [2] this
knows the ratio between the involved can be handled completely without any
NTUs there is still no problem. When significant real-time implications on the
thinking about a standardization of the SL involved application CPU. So the problem
one doubtlessly will restrict the possibilities of synchronizing two TTCAN busses can
but the fact that the definition of the be considered to be solved. When consid-
TTCAN interfaces removes all non trivial ering a whole fault-tolerant TTCAN net-
real-time requirements from the measure- work, the general case can by definition be
ment remains true. reduced to the above, although the SL
After having performed the measurement then additionally has to ensure that two
the SL has to tell the time masters of the TTCAN busses that are synchronized at
two busses the desired phase jump for the some time stay so during the rest of the
relevant bus. All of the above statements synchronization process. Without addi-
about real-time implications hold here as tional network load this can for instance be
well. The desired phase difference might solved by introducing some kind of order-
be zero, but this is not necessary. On the ing structure (not even necessarily linear)
contrary in a redundant system one might on the system of TTCAN busses. Just to
explicitly wish to have a non zero phase give an example: Each TTCAN bus is as-
difference to avoid common mode failures. signed an order level. There is one bus
We make some remarks on possible im- with order level 1, and each bus of order
plementations and special cases. In the level n must form a coupled TTCAN pair
easiest and most obvious case one bus with some bus of order level n-1. By defi-
synchronizes on the other, i.e. the first bus nition of a fault-tolerant TTCAN network it
is not influenced at all by the SL and only is clear that such an ordering structure can
the second bus adjusts its phase. If the always be found. The synchronizing strat-
current time master of the second bus is egy then is that each bus of order level n
one of the gateway nodes, the SL can op- considers exactly one bus of order level n-
erate solely within this node and the nec- 1 as a master bus. This certainly is just an
essary information transfer is more or less example but existence of one example
trivial. So an efficient implementation of shows that it is possible to handle the cy-
the SL might use the following strategy: cle time synchronization within a complete
Bus2 synchronizes on bus1 (considers bus1 fault-tolerant TTCAN network.
as the “master bus”), the time master(s) of
bus2 with the highest time master priorities 4.2 Phase synchronization of global time
are gateway nodes, the SL operates within
Essentially similar statements and strate-
each of those gateway nodes internally
gies can be applied as for the cycle time
(without any communication to the node
although naturally another mechanism for
outside), and the SL within a node can
the synchronization must be used. There-
only get active when it detects that the
fore this part is only sketched. As above it
node is the current master of bus2. How-
is sufficient to consider only two busses.
ever, this strategy is the only one. It is
The SL has to measure the phase differ-
possible to use CAN messages to com-
ence, calculate the desired phase shifts for
municate the desired phase jump to the
both busses and give the information to
current master of bus2. In particular it is
the current time master of the respective
bus. Both busses can have different cycle desired ratio. Without interrupts, the SL
lengths and bit times as those are irrele- may capture the time values of both con-
vant for global time. Different nominal NTU trollers on a regular (but not necessarily
lengths can easily be handled if there is an periodic) basis and get the ratio out the
integer factor (preferably a power of 2) differences. There are a few variations of
between them. The TTCAN interfaces al- this strategy as well (like monitoring the
low to do this any time without imposing behavior of time differences). Taking all
any real-time requirements onto the appli- things together it is possible to measure
cation CPU. the ratio without any real-time implications.
A time master that has to produce a phase However, it may be possible that a division
shift uses the discontinuity bit for the is required (acceptable in software). An
global time exactly as explained in section optimized strategy can be used again if
2.5. After successful transmission of this the second bus is synchronized on the first
reference message the global time phases one and if the SL operates on the time
of the two busses are synchronized. master of the second bus. Then the SL
can just take the TUR value that is gained
4.3 Rate synchronization from the first bus and enforce this value for
the second bus. In case the involved
The rate both of global and cycle time on a
TTCAN controllers have the same inter-
TTCAN bus is determined by the rate of
face, the whole rate synchronization es-
the local time of the current time master on
sentially is a copying action.
this bus. Besides oscillator frequency the
only relevant influence on the rate of the
5 Conclusion
local time is given by the TUR value.
Within the time master the TUR value re- We have demonstrated in this paper that
mains constant during normal operation of there exist simple and in real-time applica-
the protocol (slow changes towards the tions usable mechanisms to synchronize
configuration value are allowed, but this is all TTCAN busses of a fault-tolerant
not relevant here). However the applica- TTCAN network up to an extent that the
tion is allowed to enforce some TUR value whole fault-tolerant TTCAN network can
using a TTCAN interface. This again al- be considered as one communication
lows a very simple way to synchronize system. The range of supportable archi-
rates (as above we can concentrate on the tectures goes far beyond redundant or
synchronization of two busses): The SL partially redundant systems. The fault-
starts to measure the rate difference (or tolerance properties of such a network are
ratio) between both busses, it then calcu- extremely strong as each TTCAN bus is a
lates the new TUR values (or a corre- modular component that is not dependent
sponding quantity) for the two busses and on other channels of the system. Hence
gives this information to the current time already by design the channels have some
masters of the respective busses. A time level of independence that allows “local”
master receives a new TUR value from the error management and reduces error
application and, beginning with the trans- propagation. A simple application of this
mission of the next reference message, principle allows to increase the effective
starts to use this value. One basic cycle data rate by using multiple TTCAN busses
later the other nodes on the bus are rate not for fault tolerance but just for higher
synchronized as well. data rate.
Some comments on the measurement
step: This can be done in different ways. A References
precise measurement can for instance be [1] Road Vehicles – Controller Area
performed by measuring the length of the Network (CAN) – part 4: Time trig-
same physical time interval in units of both gered communication, ISO CD
busses. Using interrupts (this has real-time 11898-4
implications, i.e. may not be a preferred [2] Timing in the TTCAN Network; F.
option) one can use a time interrupt from Hartwich, T. Führer, R. Hugel, B.
one TTCAN controller at a given time to Müller, Robert Bosch GmbH; Pro-
capture the local time of the other TTCAN ceedings 8th International CAN
controller. Doing this twice, we get the Conference; 2002; Las Vegas
Dr. Bernd Müller
Robert Bosch GmbH, FV/FLI
P.O. Box 106050
70049 Stuttgart, Germany
Tel. +49 711 811 7053
Fax. +49 711 811 7136
mueller.bernd@de.bosch.com

Thomas Führer
Robert Bosch GmbH, FV/FLI
P.O. Box 106050
70049 Stuttgart, Germany
Tel. +49 711 811 7597
Fax. +49 711 811 7136
thomas-peter.fuehrer@de.bosch.com

Florian Hartwich
Robert Bosch GmbH, AE/EIS
P.O. Box 1342
72703 Reutlingen, Germany
Tel. +49 7121 35 2594
Fax. +49 7121 35 1746
florian.hartwich@de.bosch.com

Robert Hugel
Robert Bosch GmbH, FV/SLN
P.O. Box 30 02 40
70049 Stuttgart, Germany
Tel. +49 711 811 8517
Fax. +49 711 811 1052
robert.hugel@de.bosch.com

Dr. Harald Weiler


Robert Bosch GmbH, FV/FLI
P.O. Box 106050
70049 Stuttgart, Germany
Tel. +49 711 811 7060
Fax. +49 711 811 7136
harald.weiler2@de.bosch.com

You might also like