
21, rue d’Artois, F-75008 PARIS D2/B5_108_2010 CIGRE 2010

http://www.cigre.org

Communication issues using line protection schemes

C. SAMITIER
On Behalf of
CIGRE JWG B5/D2.30

SUMMARY
Modern digital data communication has increasingly become an essential part of different protection
schemes. The reliability and correct performance of the telecommunication infrastructure have become
vital for protection security and dependability.
Communication links for teleprotection may share transmission resources with other users and the
physical route of teleprotection data is not necessarily known. Hence, the complexity has increased
and the performance of the telecommunication depends on an increasing number of systems.
Several utilities have faced severe problems with inadequate functionality when installing modern
communication and teleprotection systems, due to insufficient coordination between protection and
communication equipment.
The paper identifies the most relevant issues that may impair or influence protection schemes
operation and performance with a specific focus on line differential protection scheme. Remedies and
alternative solutions to these performance mismatches are proposed.
This paper identifies different architectural approaches to the implementation of telecommunication
networks for protection applications, along with their advantages and limitations for some protection
schemes.
The paper analyses the interactions between protection schemes using telecommunication and the
network implemented using SDH/SONET/PDH and WDM. The most relevant aspects to be developed
are reliability and synchronisation.

KEYWORDS

Line differential, Teleprotection, Digital channels.

1 INTRODUCTION
Electrical power system protection is provided to detect unwanted conditions on the system and to
initiate actions to remove the unwanted condition. It is required to do this quickly and selectively, and
often this is achieved by having two or more protection devices communicating with each other.
At the highest voltage levels, detection of faults and initiation of circuit isolation is typically
required in less than one power system cycle, which means <20 ms for a 50 Hz system (e.g.
Europe) or <17 ms for a 60 Hz system (e.g. America). Achieving this is constrained by the delays
through the communications system. Knowledge or prediction of these communications
delays, to an accuracy of microseconds, can be critical for the safe, effective and efficient
operation of the power system.

csamitier@gne-eng.com
Further, the protection is required 24/7 and so the communications channel must be similarly available
(there is no time to make a ‘phone call to clear the fault!); continuously open, connected, and available
communications channels are required for electrical power system protection.
Today, the protection of national and international electricity transmission grids has a dependency
upon the characteristics of the communication networks. An unpredicted change in the
communications can cause false protection operation with unwanted isolation of parts of the electricity
network. In extreme cases this can lead to large scale blackouts, compromising safety and causing
huge financial losses.
Over time, different protection techniques have been developed to take advantage of evolving
communication technologies. Some techniques require the communication of “command”
information (i.e. ON/OFF signalling); others require the communication of “data” (the transportation
of power system signal values across the system), and the necessary characteristics of the
communications can differ between these “command” and “data” applications.
Communications channels may be realised in the form of dedicated communications channels (for
example, pilot wires, power line carrier, or dedicated fibre-optic links) or they may be leased from
telecommunication service providers. Dedicated channels are normally under the full control of the
electricity utility and are more likely to provide the predictable, deterministic operation required for
electrical power system protection.
Now, as we move towards “next generation network” communications technologies, the
interfaces presented to the electrical power system protection equipment remain the same, but the
management of the communications traffic behind the network is dynamic. Some of the underlying
characteristics reflected in the table that affect the correct operation of the protection, and hence
the operation of the electrical power system, can no longer be assumed.
An important consideration for each of the different categories of protection scheme is the effect of the
communication channel on overall operating times. Parameters such as delay, delay variation, errors
and channel availability may increase the protection operation time or even disable the protection. The
following chapters analyse telecommunication performance, the key aspects that may influence protection
operation, the most common problems found and possible remedies.

2 TELECOMS PERFORMANCE
2.1 Performance Indicators
Telecommunication channels are not perfect. Physical limitations and impairments produce errors and
other defects that limit their performance.
The analysis of the parameters that characterize the performance of the communication service used
by a protection relay application and the relation between the protection relaying parameters on one
side, and those of the communication channel on the other side, leads to the definition of
Performance Objectives. The result of this analysis can be used by the Protection engineer to specify
his communication service requirements for the telecom service provider, and by the telecom
professional to design and implement communication network infrastructures.
The discussion is based on a general model presented in Figure 1, which applies equally to command
schemes and to analogue comparison schemes.
[Figure 1 labels: Protection Relay – Communication Channel – Protection Relay; Propagation Parameters;
Channel Adaptation or Performance Mapping; Communication Channel Objective; Protection
Communication Service Objective; Protection Performance Objective]


Figure 1 – General Protection Communication Model

Protection Performance Parameters
Performance of a protection system can be characterized by the parameters listed in the following
table:
Protection Performance | Parameter                                   | Objective Dimensioning
Parameters             |                                             |
-----------------------+---------------------------------------------+------------------------
Security               | Pr (Fault Detected / No Fault)              | Osec < 1 in S years
Dependability          | Pr (No Fault Detected / Fault)              | Odep < 1 in D years
Speed                  | Time of Operation                           | Odelay < TPR
Selectivity            | Pr (Fault Detected / Fault outside          | Not related to Comms.
                       | Protected Section)                          |

The time objective for the protection system is determined by the overall fault clearance time and the
portion of time that has to be allocated to the operation of the circuit breakers (and other processes
such as Breaker Failure detection that may be performed during the clearance time).
The maximum operation time of the Protection system may also be based upon the time limit beyond
which another concurrent mechanism (i.e. protection isolating a wider section of the network) shall
operate. In this case, operating too late can be regarded as a lack of dependability (i.e. not
operating).
Protection Performance can be related to communication parameters using the State Space table
presented hereafter, which classifies different communication impairments in terms of the resulting
protection system anomalies. The causes of system anomalies are listed below and discussed in the
following sections:
A1: Data Integrity – Protection receiving invalid data
A2: Channel Availability – Communication service unavailable
A3: Time limit exceeded – Maximum time to be specified using Fault Clearance Time and/or Timing
of Protection Zones
B1: Time Incoherence – Comparison of samples which are not captured at the same time, e.g. in
differential protection systems with no time-stamping of samples
B2: Residual Errors – Invalid data interpreted by the protection as a command or as a value, e.g. long
period of false data without blocking the operation of the Protection
Other causes of protection system malfunctioning which are not related to communications are as
follows:
C1: Wrong setting, e.g. Protection setting too sensitive
C2: Wrong type of Protection, e.g. Short Lines, Parallel Lines, etc.
C3: Installation issues, e.g. interference from substation electromagnetic disturbances

                  | Fault in the Protected   | Fault outside Protected        | No Fault
                  | Section                  | Section                        |
------------------+--------------------------+--------------------------------+-----------------------
Fault Detected    | Proper Operation         | C1 – Wrong setting             | B1 – Time Incoherence
                  |                          | C2 – Wrong Protection          | B2 – Residual Error
                  |                          | C3 – EMC issues, interface     |
                  |                          |      & wire, sync loss         |
No Fault Detected | A1 – Data Integrity      | Proper Operation               | Proper Operation
                  | A2 – Comm Unavailable    |                                |
                  | A3 – Time limit exceeded |                                |

Figure 2 – State space for the operation of a Protection System with different events contributing to
each

2.2 Data Integrity, Bit Error Rate and Communication Availability
Data integrity is the capability of the communication network to deliver error-free frames (composed
of commands or sample values) to the protection system. It includes any error correcting mechanism
which is incorporated in the communication system (including the teleprotection signalling
equipment).
Sampled values and commands are transmitted continuously without any acknowledgement or
retransmission request. When a valid sample or command does not reach its destination within
the time objective, it can be considered as “lost”. If this happens during a time interval when an
operation is required, a “missed operation” can result.
Data integrity depends upon the error generating behaviour of the communication channel which is
generally characterized by an average Bit Error Rate (BER). The BER is commonly used to
characterize the performance of digital communication channels and general data transmission circuits
in serial or TDM links. For packet switched circuits other parameters such as packet loss rate are
commonly used but the basic considerations given here similarly apply.
However, BER cannot be used for specifying the requirements for protection signal transmission.
Similarly, telecom standards definitions of availability/unavailability, which are based on the average
BER over a communication channel, are not meaningful for protection, as illustrated below.
• BER is a statistical figure, obtained from error measurements spanning a sufficient measuring
  interval, which can be significantly longer than the time that is relevant for protection
  operation.
• The principle of BER is based on the assumption that bit errors occur with random distribution
  over time, i.e. every bit has the same probability of being received in error. In particular,
  the BER does not characterize errors which occur in “bursts” or errors which are correlated
  with some contingency, such as a power system fault.
• Even a low BER does not help if the error(s) occur at the very moment when protection
  needs to act.
Bit errors generally do not hit communication frames randomly but in bursts of varying length
depending upon their origins. This can be modelled as the channel presenting a long term “steady
state” BER whose level is determined by the dimensioning of the system, and a short term BER whose
level can be very high (more than 1E-3) with a probability of occurrence and a distribution of
durations that must be characterized.
It should be noted that even for low error probabilities of 1E-6 or 1E-9, an error occurs on average
every 16 seconds or 4.3 hours respectively on a 64 kbps channel. Hence even on “healthy”
communication links, an error will occur every now and then. The protection relay needs to respond to
such conditions in an appropriate and defined manner with regard to stated loss of performance and
alarming.
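The figures quoted above can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and it assumes independent, uniformly distributed bit errors, which is exactly the assumption that the text warns does not hold for bursty channels.

```python
def mean_seconds_between_errors(ber: float, bit_rate_bps: float) -> float:
    """Average time between bit errors for a given BER and bit rate.

    Assumes errors are independent and evenly spread over time
    (an idealisation; real channels tend to produce bursts).
    """
    if ber <= 0 or bit_rate_bps <= 0:
        raise ValueError("ber and bit_rate_bps must be positive")
    # Errors per second = BER * bits per second; the mean gap is the inverse.
    return 1.0 / (ber * bit_rate_bps)


if __name__ == "__main__":
    for ber in (1e-6, 1e-9):
        gap_s = mean_seconds_between_errors(ber, 64_000)
        print(f"BER {ber:.0e}: one error every {gap_s:.1f} s ({gap_s / 3600:.2f} h)")
```

On a 64 kbps channel this gives roughly 15.6 s for a BER of 1E-6 and about 4.3 h for 1E-9, in line with the rounded values in the text.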
For protection purposes it is relevant to know:
• How does a bit error, if it happens, influence the operation of the protection relay, for example by
  o Producing unwanted operation (Security)
  o Delaying wanted operation (Dependability)
• How should the protection deal with errors
  o Should the relay include error correction? If so, what is the impact on the overhead
    (= increase of the gross transmission rate) in order to correct single or multiple bit
    errors? Or should it just block its operation and “wait” until an error-free sequence is
    received?
  o When should the protection raise an alarm if it discovers error(s) in its “message” or
    “frame”?
  o Should the protection adapt its settings when it discovers erroneous messages (e.g. by
    lowering the sensitivity) or should it block operation and wait for an error-free
    transmission?
It may therefore be more relevant
o to characterize the communication by the probability that a sequence of bits is received error-free,
  which would not jeopardize the protection performance at all
o to characterize the protection relay by how protection performance deteriorates in the presence
  of error(s).
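As a rough illustration of the first characterization, the probability that a frame arrives error-free can be estimated from the bit error probability and the frame length. This is a minimal sketch under the random-error assumption; the function name and the 256-bit frame length are hypothetical, and for bursty channels the short-term error probability would have to be used instead of the long-term average.

```python
def error_free_frame_probability(bit_error_prob: float, frame_bits: int) -> float:
    """Probability that all bits of a frame are received correctly.

    Assumes each bit fails independently with the same probability.
    """
    if not 0.0 <= bit_error_prob <= 1.0:
        raise ValueError("bit_error_prob must be between 0 and 1")
    if frame_bits < 0:
        raise ValueError("frame_bits must not be negative")
    return (1.0 - bit_error_prob) ** frame_bits


if __name__ == "__main__":
    # Hypothetical 256-bit frame at the error levels discussed in the text.
    for p in (1e-9, 1e-6, 1e-3, 0.5):
        print(f"bit error probability {p:g}: "
              f"P(frame error-free) = {error_free_frame_probability(p, 256):.6g}")
```

With these assumptions, a short-term error probability of 1E-3 already corrupts roughly one 256-bit frame in four, while at 0.5 practically no frame survives.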
The following situations have to be considered:
o An error probability of 0.5 may temporarily occur when a link fails, until the failure is detected
  and the link is blocked and/or an alarm is signalled
o Error rates up to 1E-3 are not uncommon for fading radio links and may even occur on optical
  systems operating at the system limit due to increased fibre attenuation, dirty connectors,
  transmitter degradation etc.

Communication Availability
Unavailability of the protection communication channel can be caused by
• A service interruption due to a failure in the communication network
• Network reconfiguration for service restoration
• Synchronization loss in the network or at the communication service interface
• Excessive communication errors
Considering the latter case, ITU-T G.821 declares the channel unavailable due to excessive errors if 10
consecutive SES (Severely Errored Seconds) are detected, and available again when no SES is
detected for 10 consecutive seconds.
This means that the system may remain “available” for 10 seconds with an error rate that exceeds
1E-3 and, once declared unavailable, may remain blocked for 10 seconds even if the channel is
error-free. The consequences for Security (spurious operation due to undetected errors) and Dependability
(channel being out of operation for 10-second intervals) can be enormous.
In order to avoid undetected invalid frames which may generate spurious operations, the
communication channel must be blocked after N consecutive invalid frames. Frames shall then be
examined for validity but not used for system operation, until M consecutive valid frames are detected.
The communication channel shall be considered as unavailable during this time interval.
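The N/M blocking rule described above can be sketched as a small state machine. This is only an illustration: the class and attribute names are hypothetical, the values of N and M are left to the application, and the choice not to use the M qualifying frames themselves is an assumption based on the wording "examined for validity but not used".

```python
class ChannelSupervisor:
    """Blocks the channel after N consecutive invalid frames and
    releases it after M consecutive valid frames (hypothetical sketch)."""

    def __init__(self, n_block: int, m_restore: int) -> None:
        if n_block < 1 or m_restore < 1:
            raise ValueError("n_block and m_restore must be at least 1")
        self.n_block = n_block
        self.m_restore = m_restore
        self.blocked = False      # True while the channel counts as unavailable
        self._invalid_run = 0     # consecutive invalid frames seen
        self._valid_run = 0       # consecutive valid frames seen while blocked

    def on_frame(self, frame_valid: bool) -> bool:
        """Process one received frame; return True if it may be used."""
        if not frame_valid:
            self._valid_run = 0
            self._invalid_run += 1
            if self._invalid_run >= self.n_block:
                self.blocked = True
            return False          # an invalid frame is never used
        self._invalid_run = 0
        if self.blocked:
            # Assumption: qualifying frames are checked but not used.
            self._valid_run += 1
            if self._valid_run >= self.m_restore:
                self.blocked = False
                self._valid_run = 0
            return False
        return True


if __name__ == "__main__":
    supervisor = ChannelSupervisor(n_block=3, m_restore=2)
    for valid in (True, False, False, False, True, True, True):
        usable = supervisor.on_frame(valid)
        print(f"valid={valid!s:5} usable={usable!s:5} blocked={supervisor.blocked}")
```

Counting frames rather than seconds keeps the unavailability decision within the time scale of the protection itself, which is the point of replacing the 10-second criterion.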

2.3 Error handling


As stated before, the design of a protection system using telecommunication should consider the
presence of channel impairments, and should be able to cope with adverse transmission conditions
when they occur, and adequately handle them.
The two key aspects that must be adequately handled by TDM network users are:

1. Unavailability periods and Telecom signal alarms:


The telecom network is subject to faults that lead to a total temporary interruption of user services, due
to fibre cuts, hardware failures, power blackouts, wrong configurations, wiring failures, and other
sources. These kinds of failures can affect the user’s equipment connection links in a unidirectional or
bidirectional way. During the fault periods, instead of the signal from the peer user at the other end,
user equipment may not receive any signal at all, or may receive a special signal generated by the TDM
system with an alarm code. Standard alarm codes for G.703 signals like AIS (Alarm Indication Signal)
and RDI (Remote Defect Indication) give valuable information about the type of fault condition, its
location and direction. These alarm codes should be decoded and interpreted by the user system. Also,
when one of these codes is received, an alarm identifying the corresponding situation should be raised
on the local/remote management console of the user system, and a record of the alarm should be
stored in the user system event log. Other alarms related to G.703 signals like LOS (Loss of Signal)
and LOF (Loss of Frame), among others, should also be generated and raised by the user system
hardware/software when one of these standard error conditions is present.

2. Link errors / Performance monitoring:


Another possible impairment of telecom networks is the continuous or occasional presence of
data errors in the signal received by the user system. These data errors can be caused by hardware failures,
increased attenuation in fibres, electromagnetic interference in copper wiring, synchronization problems, and
other causes. As the data error events sometimes appear as unpredictable bursts, troubleshooting
the origin of the errors can be very difficult, so any help that the telecom link user system can provide
in identifying the exact type of error and the time of each occurrence is very valuable. Error bursts
that repeat at approximately equal intervals can reveal synchronization problems. Line Code
Errors reveal the presence of errors induced in the local wiring, while CRC and/or other upper layer errors
in the absence of Line Code Errors reveal problems inside the network, on the trunk links or hardware, but
not in the local interconnection between the end TDM equipment and the end user equipment.
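The diagnostic rules in the paragraph above can be expressed as a simple decision aid. The following is a sketch under stated assumptions: the function names, the returned labels and the 10 % tolerance used to judge whether bursts are "approximately" periodic are all hypothetical.

```python
from typing import Sequence


def locate_error_source(line_code_errors: int, crc_errors: int) -> str:
    """Suggest where to look first, following the rules given in the text."""
    if line_code_errors > 0:
        # Line code errors point to the local wiring between the end TDM
        # equipment and the end user equipment.
        return "local wiring"
    if crc_errors > 0:
        # CRC (or upper layer) errors without line code errors point to the
        # network itself: trunk links or hardware.
        return "network"
    return "no errors"


def bursts_look_periodic(burst_times_s: Sequence[float],
                         tolerance: float = 0.1) -> bool:
    """True if error bursts repeat at roughly equal intervals, which may
    indicate a synchronization problem. The tolerance is an assumed value."""
    if len(burst_times_s) < 3:
        return False  # too few bursts to judge periodicity
    gaps = [later - earlier
            for earlier, later in zip(burst_times_s, burst_times_s[1:])]
    mean_gap = sum(gaps) / len(gaps)
    if mean_gap <= 0:
        return False
    return all(abs(gap - mean_gap) <= tolerance * mean_gap for gap in gaps)


if __name__ == "__main__":
    print(locate_error_source(line_code_errors=0, crc_errors=12))
    print(bursts_look_periodic([0.0, 60.0, 120.5, 180.0]))
```

Such a classification is only useful if the user system time-stamps each error event, which is why the text stresses recording the exact type and time of the occurrences.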

3 EXPERIENCES TO DATE: WHAT ARE THE PROBLEMS?


The problems identified by the JWG B5/D2.30 can be classified into two general categories:
1. The use of channels or services provided by public operators
2. Transient failures of SDH/TDM networks, mostly related to outage recovery
The following sections develop these issues and the next chapter proposes solutions.

3.1 Use of an external telecommunication provider


For cost reduction, protective relaying often makes use of a multiple-user data network instead of
a dedicated network. This network is sometimes rented or leased from an external provider.
Teleprotection represents a very small service with requirements much stricter than those of other
services, so its requirements are sometimes not fully taken into account in the design of the network.
This leads to a lack of performance guarantees and to unexpected behaviour that may produce
misoperation of the protection system.

3.2 Telecom transient misoperation


The most relevant causes of transient misoperation are timing and synchronisation problems, and
unavailability of the telecommunication service.

3.2.1 Timing constraints


Timing constraints are related to the transmission delay; the following issues have been identified:
- Excessive signal propagation delay
The transmission delay affects the performance of both analogue and state comparison schemes by
increasing the tripping times.
- Delay variation
Transmission delay variation can affect the performance of some current differential relays, which
require a constant propagation time to start the synchronization of the clocks. Once a delay variation is
detected, the synchronization is blocked until the propagation delay has been constant again for a period
of time. The aim of this check is to use a reliable propagation time, filtering out the transients that occur
during switching of the forward and return paths. Even though the synchronization of the clocks is
blocked, the line differential relays can continue in operation for some period of time because, just before
the channel delay variation, they were synchronized. Some line differential relays measure the drift
between the clocks during the synchronization process. This data can be used to extend the time the
protective relays can operate without synchronization through communications. If delay variations
occur frequently, the synchronization process will be blocked for a long time, resulting in blocking of the
line differential relays.
- Delay asymmetry
SDH networks include self-healing ring architectures, which allow switching from one path to another when
a failure is detected. There are two types of switching:
  o Unidirectional: only the faulted path is switched. The non-faulted path remains unchanged.
  o Bidirectional: when a failure is detected in one path, both the forward and return paths are switched.
When bidirectional switching is used there is no permanent asymmetry in the propagation time (the
forward and return delays are equal). This is not the case for unidirectional switching. When
synchronization of the clocks in the line differential relays is done via communications, it is assumed
that the propagation times in the forward and return paths are equal (the propagation time is calculated by
dividing the round-trip time by two). Any asymmetry between the two delays creates an error in the clock
synchronization, which translates into a phase shift between the local and remote currents. This
phase shift emulates a differential current which can make the differential unit trip.
Re-routing, buffering in some network equipment and network synchronization errors contribute to all
three issues described above.
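The chain from delay asymmetry to a false differential current can be made concrete with a small calculation. This sketch rests on two assumptions that are not stated in the paper: the relay estimates the one-way delay as exactly half the round-trip time, and a pure through-load current of equal magnitude flows at both ends, so that a phase error of θ yields an apparent differential current of 2·I·sin(θ/2). Function names and the example delays are hypothetical.

```python
import math


def sync_error_s(delay_forward_s: float, delay_return_s: float) -> float:
    """Clock alignment error when the one-way delay is taken as half the
    round-trip time: true forward delay minus the estimated delay."""
    return (delay_forward_s - delay_return_s) / 2.0


def phase_shift_deg(time_error_s: float, system_freq_hz: float) -> float:
    """Phase shift between local and remote currents caused by a time error."""
    return 360.0 * system_freq_hz * time_error_s


def false_differential_pu(phase_deg: float, through_current_pu: float = 1.0) -> float:
    """Apparent differential current for a through current rotated by phase_deg.

    Magnitude of I - I*exp(j*theta), which equals 2*I*|sin(theta/2)|.
    """
    return 2.0 * through_current_pu * abs(math.sin(math.radians(phase_deg) / 2.0))


if __name__ == "__main__":
    # Hypothetical case: 2 ms forward, 4 ms return after unidirectional switching.
    error = sync_error_s(0.002, 0.004)
    shift = phase_shift_deg(error, 50.0)
    print(f"sync error {error * 1e3:.1f} ms, phase shift {shift:.1f} deg, "
          f"false differential {false_differential_pu(shift):.3f} pu")
```

Under these assumptions, a 2 ms difference between the two directions produces an 18° phase error at 50 Hz and an apparent differential current of about 0.31 pu of the load current, which may exceed the pickup of a sensitive setting.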

3.2.2 Addressing Capabilities
It is important that line differential relays include terminal addressing in order to check that the
received messages come from the corresponding device. Channel loopbacks or routing errors
make messages reach wrong destinations, resulting in false differential currents. False trips have
occurred on line differential relays without addressing capabilities, due to unexpected channel
loopbacks.
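A minimal sketch of such an address check is shown below. The frame layout, field names and address values are hypothetical; real relays use vendor-specific message formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """Hypothetical teleprotection frame carrying terminal addresses."""
    source_addr: int
    dest_addr: int
    payload: bytes


def accept_frame(frame: Frame, local_addr: int, expected_remote_addr: int) -> bool:
    """Accept a frame only if it comes from the expected remote terminal
    and is addressed to this terminal."""
    return (frame.source_addr == expected_remote_addr
            and frame.dest_addr == local_addr)


if __name__ == "__main__":
    local, remote = 1, 2
    genuine = Frame(source_addr=remote, dest_addr=local, payload=b"\x00")
    # A channel loopback returns the relay's own transmitted frame.
    looped_back = Frame(source_addr=local, dest_addr=remote, payload=b"\x00")
    print(accept_frame(genuine, local, remote))       # accepted
    print(accept_frame(looped_back, local, remote))   # rejected
```

Without the check, a looped-back frame would be treated as remote current data and compared with the local current, producing a false differential current.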

3.2.3 Service availability


The way the ITU-T defines availability for a telecommunication link is not adequate for protection
applications, as it specifies a maximum BER of 1E-3 over a 10-second period. Availability for protection
should be defined in the millisecond range.

3.3 Alarm Handling


The alarm information can be valuable for the user system itself, to make decisions about how to react
to the fault in the communications link, or for the maintenance staff, to locate the failure more easily
with the information supplied by the alarms.
During the period of telecom network unavailability the user system should raise the appropriate
warnings and disable those functions that require communications, but continue monitoring the telecom
link so that, when normal performance is restored, the suspended functions can be resumed without
any external intervention.

3.4 Handling of faulty conditions


A good TDM user system should implement telecom link Performance Monitoring functions in
order to register and make available detailed information about received data errors, including all
the standard types of error count (Bit Errors, Block Errors, ES, SES, LSS, SLIPS, ALL0, ALL1,
among others), accumulated over 15-minute and 24-hour periods. The memory buffer dedicated to storing
this error count data should have the capacity to hold historical information for a suitably long
period of time, so that it can be retrieved and examined several days after the error event has occurred.
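A simplified sketch of such a performance monitor is given below. It is not a complete implementation of any standard: only bit errors, ES and SES are counted, a second is treated as severely errored when its error ratio exceeds 1E-3, and the class name, the 64 kbps default and the one-week history depth are assumptions made for illustration.

```python
from collections import deque


def _empty_bin() -> dict:
    return {"seconds": 0, "bit_errors": 0, "es": 0, "ses": 0}


class PerformanceMonitor:
    """Accumulates per-second error counts into fixed-length bins
    (15 minutes by default) and keeps a bounded history of closed bins."""

    def __init__(self, bit_rate_bps: int = 64_000, bin_seconds: int = 900,
                 history_bins: int = 96 * 7) -> None:
        # More errors than this in one second means an error ratio above 1E-3.
        self.ses_threshold = 1e-3 * bit_rate_bps
        self.bin_seconds = bin_seconds
        # 96 bins of 15 minutes cover 24 hours; the default keeps one week.
        self.history = deque(maxlen=history_bins)
        self._current = _empty_bin()

    def record_second(self, bit_errors: int) -> None:
        """Register the number of bit errors observed during one second."""
        cur = self._current
        cur["seconds"] += 1
        cur["bit_errors"] += bit_errors
        if bit_errors > 0:
            cur["es"] += 1                     # errored second
        if bit_errors > self.ses_threshold:
            cur["ses"] += 1                    # severely errored second
        if cur["seconds"] >= self.bin_seconds:
            self.history.append(cur)           # close the bin, start a new one
            self._current = _empty_bin()

    def totals(self, last_bins: int = 96) -> dict:
        """Sum the most recent closed bins (96 bins = 24 hours by default)."""
        total = _empty_bin()
        for closed_bin in list(self.history)[-last_bins:]:
            for key in total:
                total[key] += closed_bin[key]
        return total


if __name__ == "__main__":
    monitor = PerformanceMonitor(bin_seconds=3)   # short bins for the demo
    for errors in (0, 1, 100, 0, 0, 0):
        monitor.record_second(errors)
    print(list(monitor.history))
    print(monitor.totals())
```

The bounded history means old bins are discarded automatically, so the buffer depth has to be chosen to cover the several days mentioned in the text.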
A good TDM user system should adequately handle the received data errors, without entering a blocked
or hung state when data errors are received, and without confusing them with other events such as changes in
network propagation delays. If the number of errors is enough to impair the performance of the user's
intended function, the user system must raise the appropriate warning alarm and disable its function until
the abnormal situation of high data error rate ends. The recovery must be automatic, without the need
for external intervention.
The hardware of the TDM user system should have filters at the received signal input that
attenuate frequencies higher than the maximum frequency component of the normal received signal. These
filters help to prevent induced out-of-band noise from causing errors.

3.5 Interface Converters


Current differential relays are usually equipped with a fibre interface originally
intended to be connected directly to the Protection relay at the remote end, via dark fibre, and without
any active equipment or system in between. In most cases, this fibre interface is proprietary, so there is
no possibility of direct connection to the TDM system. An additional piece of equipment, an
Electro-Optical converter, must be used to achieve the connection of the Protection to the TDM system at both
ends of the link.
The Electro-Optical converter device is considered part of the Protection system because it is designed
and supplied by the same vendor as the Protection, and it is intended to adapt the Protection's
proprietary fibre interface to an open standard interface of suitable data rate that is available in normal
TDM systems, usually an electrical G.703 64 kbps co-directional or E1/T1 interface.
As the hardware that directly handles the G.703 electrical interface between the Protection and the
TDM system is located inside the Electro-Optical converter, many of the desirable alarm and error
handling procedures proposed in the previous sections must be implemented in the Electro-Optical
converter hardware/software. This device is usually a very simple box without sophisticated
management functions, so it is not able to register or display any alarm or error counter. The transfer of this
information from the Electro-Optical converter to the actual Protection, in order to be registered,
processed and used, is also not feasible, because Line Current Differential Relays were originally
designed to work only when directly connected to each other over dedicated fibre links, so any network
in between has to be transparent to them. As a result, the current solution using Electro-Optical
converters is a partial adaptation in which a lot of information and performance is lost.
Another possibility available today is to use the IEEE C37.94 standard for a direct fibre interface
between the Protection and the TDM system. The IEEE C37.94 standard has been widely accepted,
and many TDM equipment vendors already provide IEEE C37.94 interface cards that allow the direct
connection of the Protections to the TDM systems without the need for converters. This solution is
particularly advantageous because it provides “fibre only” interconnection between the Protection and
the Telecom equipment, eliminating the copper wiring, with the consequent improvement in EMI
immunity. However, the installed base of Protections and TDM systems that do not support this standard
is so vast that Electro-Optical converters are expected to remain in use in the mid term, and any
effort made to improve their operation will be valuable.

3.6 Physical connection/installation


Physical connections are a potential source of interference that can cause errors and misoperation of
the telecommunication service.
The most common sources of problems in the installations are:
• Poor or no shielding of the metallic links
• Poor or no grounding of the metallic links
• Poor or insecure connections
• Use of unsuitable wiring, e.g. not using twisted pairs
3.7 Lack of understanding between disciplines


Protection engineers would like to have point-to-point connections so that they can easily control them.
However, it is more cost-effective to use an existing communication network which, in addition, can offer
better redundancy and a high bandwidth. The complexity of the new communication network is
difficult for protection engineers to understand. This means that sometimes the requirements for
teleprotection are not clearly stated.

4 STRATEGIES TO SOLVE PROBLEMS EXPERIENCED


Previous chapters have presented the most common problems experienced by protection using
communication. Several techniques can be applied to remedy these problems. The following clauses
summarise possible actions. The work produced by the JWG will fully explain the solutions.
• Block recovery mechanisms in SDH networks
  The route switching function is blocked, so that when a failure in the communication link is detected
  the line differential relays are taken out of service and the distance relays (back-up devices) are the
  only ones responsible for protection.
• Use of WDM instead of SDH
  With WDM the traditional point-to-point architecture is achieved. At the same time it
  is cost effective, as other services are also carried through multiplexing. It removes the delay
  asymmetry; however, in case of a fibre failure the communication link is lost.
• Limit propagation delay differences between main and back-up paths
  This implies the use of bidirectional switching and a particular configuration and topology of
  the telecommunication network.
• Use line differential relays that tolerate asymmetry
  There are some techniques that tolerate delay asymmetry:
  - Use of GPS for synchronization. It also avoids the problem with variable propagation
    delay [2]
  - The principle of charge comparison tolerates 90° asymmetry [1]
  - Use the phase shift measured during load conditions to adjust the clock difference [3]. In
    this case, capacitive charging compensation has to be used by means of the voltage
    measurement.
  - Use fault detectors not based on differential current to avoid tripping during load
    conditions.
• Remove need for interface adaptors
  It is advisable to use the IEEE C37.94 standard interface or a direct optical fibre interface through
  a WDM system.
• Simplify the protection schemes
  A typical application for network protection is the use of a distance protection (with
  teleprotection) together with a line differential protection.
  Another possibility is the use of two multifunction protections, each comprising a
  line differential function (main) and a distance function (back-up).
  Since they are complementary, there is no need for telecom recovery mechanisms. If the
  differential protection is disabled for any reason, the back-up will act.
• Observe proper installation instructions
4.1 External Telecoms


When using services leased from external operators, both long-term and very short-term performance
measurements have to be carried out to verify compliance with the agreed performance.
The use of emulated TDM circuits over IP networks has to be avoided, since there is no experience
proving that these circuits can comply with protection requirements. Since most public operators are
migrating their networks to packet technology, there are many concerns about the future availability of
suitable services from Public Telecommunication Operators.

5 BIBLIOGRAPHY

[1] Charge Comparison Protection of Transmission Lines – Relaying Concepts – L.J. Ernst, W. L.
Hinman, D. H. Quam, J. S. Thorp – IEEE Transactions on Power Delivery, Vol 7, No. 4, Oct 1992

[2] New Line Current Differential Relay using GPS Synchronization – I. Hall, P. G. Beaumont, G. P.
Baber, I. Shuto, M. Saga, K. Okuno, H. Ito - IEEE Bologna Power Tech Conference, Jun 2003

[3] US Patent 6,456,947 Sep 2002. Digital Current Differential Systems. M. Adamiak, et al.

[4] "Protection using Telecommunications" Cigre TB192.

[5] “Electromagnetic Compatibility Problems with Telecommunications Equipment in the Power
Utility Environment”. DC Smith. National Telecommunications Transmission Group. Oct 1997.
