
Train door systems

Duplicated timestamp (noise) processing


Duplicated timestamp identification
Example: T020R-C11-D1

Mixture of closing and opening
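As an illustration of the identification step, duplicated timestamps can be flagged with pandas. This is a minimal sketch: the column names (`timestamp`, `voltage`, `current`) and the sample values are assumptions for illustration, not taken from the slides.

```python
import pandas as pd

# Hypothetical sample; column names and values are illustrative assumptions.
df = pd.DataFrame({
    "timestamp": [0, 1, 1, 2, 3, 3, 4],
    "voltage":   [0.0, 1.0, 5.0, 2.0, 3.0, -4.0, 4.0],
    "current":   [0.0, 0.5, 3.0, 1.0, 1.5, 6.0, 2.0],
})

# keep=False flags every row whose timestamp occurs more than once.
is_dup = df["timestamp"].duplicated(keep=False)
duplicates = df[is_dup]

# keep="last" flags every occurrence except the last one, i.e. the first
# row of each duplicated pair.
first_dup_positions = df.index[df["timestamp"].duplicated(keep="last")].tolist()

print(duplicates)
print(first_dup_positions)
```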
Deleting noise duplicates
- Solution 1 -

Example of duplicated timestamp in current and voltage space

Data intrinsic characteristic:
• The voltage and current of the engine are gradually changing (increasing/decreasing).

1st Solution:
• Among each couple of duplicates, we only keep the nearest observation to the previous timestamp in voltage and current space.

[Figure: duplicated n°1, duplicated n°2 and the previous timestamp in voltage and current space, with distance 1 and distance 2]

distance 2 < distance 1: delete duplicated n°1
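The rule of solution 1 can be sketched in a few lines. This sketch assumes each observation is a `(voltage, current)` tuple and uses an unscaled Euclidean distance; the slides do not say whether the two variables are rescaled first, and the function name is hypothetical.

```python
import math

def keep_nearest_to_previous(previous, dup1, dup2):
    """Return the duplicate closest to the previous observation.

    Each argument is a (voltage, current) tuple. The unscaled Euclidean
    distance is an assumption, not something stated in the slides.
    """
    d1 = math.dist(previous, dup1)
    d2 = math.dist(previous, dup2)
    # distance 2 < distance 1: delete duplicated n°1, keep n°2.
    return dup2 if d2 < d1 else dup1

previous = (10.0, 2.0)   # observation at the previous timestamp
dup1 = (25.0, 9.0)       # duplicated n°1
dup2 = (11.0, 2.5)       # duplicated n°2
kept = keep_nearest_to_previous(previous, dup1, dup2)
print(kept)
```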


Deleting noise duplicates
- Solution 1 results -
Example 1
[Figure: Example 1, solution 1 result]

Example 2
[Figure: Example 2, solution 1 result, with a zoomed-in view]
Deleting noise duplicates
- Solution 2 -

Example of duplicated timestamp in current and voltage space


2nd Solution:
• Among the couple of duplicates, we only keep the nearest observation to the centroid of the 5 previous timestamps in voltage and current space.

[Figure: duplicated n°1, duplicated n°2 and the previous timestamps t-5 to t-1 with their centroid, with distance 1 and distance 2]

distance 2 < distance 1: delete duplicated n°1
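A minimal sketch of the centroid rule, under the same assumptions as before: observations are `(voltage, current)` tuples, the distance is unscaled Euclidean, and the helper names are hypothetical.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equally sized tuples."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))

def keep_nearest_to_centroid(history, dup1, dup2, window=5):
    """Return the duplicate closest to the centroid of the last `window`
    observations in `history` (oldest first)."""
    c = centroid(history[-window:])
    return dup2 if math.dist(c, dup2) < math.dist(c, dup1) else dup1

history = [(10.0, 2.0), (11.0, 2.1), (12.0, 2.2),
           (13.0, 2.3), (14.0, 2.4), (15.0, 2.5)]
dup1 = (30.0, 8.0)   # duplicated n°1
dup2 = (16.0, 2.6)   # duplicated n°2
kept = keep_nearest_to_centroid(history, dup1, dup2)
print(centroid(history[-5:]), kept)
```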


Deleting noise duplicates
- Solution 2 results -
Example 2
[Figure: Example 2, solution 2 result]

Example 3
[Figure: Example 3, solution 2 result, with a zoomed-in view]
Deleting noise duplicates
- Solution 2 behavior explanation -
Example 3

Why do we have these unusual patterns?

The red duplicate has the nearest voltage (tension) but a high (far) current value. It is more logical that the voltage stayed negative while the current had an extreme value!

[Figure label: nearest to centroid of last 5 timestamps]
Deleting noise duplicates
- Solution 3 -

3rd Solution:
• Among the couple of duplicates, we only keep the nearest observation to the centroid of the 5 previous timestamps in voltage space.

[Figure: distance 1 and distance 2 from each duplicate to the centroid]

distance 2 < distance 1: delete duplicated n°1
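In one dimension the centroid is simply the mean, so the voltage-only rule can be sketched as follows. The function name and the sample values are hypothetical; the example is built so that the duplicate with a nearby voltage but an extreme current is the one kept.

```python
def keep_nearest_voltage(voltage_history, dup1, dup2, window=5):
    """Return the duplicate whose voltage is closest to the mean voltage
    of the last `window` observations. Duplicates are (voltage, current)
    tuples; the current is ignored by this rule."""
    recent = voltage_history[-window:]
    mean_v = sum(recent) / len(recent)
    d1 = abs(dup1[0] - mean_v)
    d2 = abs(dup2[0] - mean_v)
    return dup2 if d2 < d1 else dup1

voltage_history = [-5.0, -5.0, -4.0, -6.0, -5.0]   # mean is -5.0
dup1 = (20.0, 1.0)    # far voltage, ordinary current
dup2 = (-5.5, 40.0)   # near voltage, extreme current
kept = keep_nearest_voltage(voltage_history, dup1, dup2)
print(kept)
```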


Deleting noise duplicates
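- Solution 3 results -
Example 2
[Figure: result panel, labelled "Solution 2" on the slide]

Example 3
[Figure: result panel, labelled "Solution 2" on the slide, with a zoomed-in view]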
Deleting noise duplicates
- Solution 3 behavior explanation -
Example 3

[Figure: initial data and data after suppression]

Has no duplicate timestamps while having 800 as position, as if the door was open while it is closed.

Should the opening start earlier? Even then, a huge jump in voltage and position would occur.
Deleting noise duplicates
- Solution 3 results final -
Example 4
[Figure: two result panels, each labelled "Solution 2" on the slide]

Deleting noise duplicates
- Solution 3 code -

Takes 6 min on a dataset with 700 000 rows.
If cleaning one system takes 5 minutes, then for 480 systems we will need 40 hours!
• Extract the index of each first duplicate in timestamps (index of the dataframe). Executed once.

• Given df a dataframe and "i" the first timestamp duplicate index, return either "i", the first duplicate index, or "i+1", the last duplicate index. Takes 0.003 seconds.

• Given df a dataframe and "i" the first timestamp duplicate index, return a dataframe without the i'th row. Takes 0.3 s.

• Given df a dataframe and "dup_index" a list of the first timestamp duplicate indexes, iteratively delete the farthest duplicate to the centroid of the 5 previous timestamps of each duplicated index. Takes 0.3 * 1180 = 6 min (1180: number of duplicates).
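The steps above could look roughly like the following pandas sketch. It is not the code from the slides: the function and column names (`timestamp`, `voltage`) are hypothetical, duplicates are assumed to come in adjacent pairs, and the function that returns "i" or "i+1" is assumed to return the index of the duplicate to delete. Unlike the steps above, the sketch collects the rows to delete and drops them in a single call instead of building a new dataframe for each duplicate, which is the step timed at 0.3 s per call.

```python
import pandas as pd

def first_duplicate_indexes(df):
    """Positions of the first row of each duplicated timestamp pair (run once)."""
    return df.index[df["timestamp"].duplicated(keep="last")].tolist()

def farthest_duplicate(df, i, dropped, window=5):
    """Return i or i+1, whichever voltage is farther from the mean voltage
    of the `window` previous rows that have not been marked for deletion."""
    prev = []
    j = i - 1
    while j >= 0 and len(prev) < window:
        if j not in dropped:
            prev.append(j)
        j -= 1
    center = df.loc[prev, "voltage"].mean()
    d_first = abs(df.at[i, "voltage"] - center)
    d_last = abs(df.at[i + 1, "voltage"] - center)
    return i if d_first > d_last else i + 1

def delete_noise_duplicates(df, window=5):
    """Remove, for each duplicated pair, the row farthest from the centroid."""
    df = df.reset_index(drop=True)   # assumes a default integer index
    dropped = set()
    for i in first_duplicate_indexes(df):
        dropped.add(farthest_duplicate(df, i, dropped, window))
    # Single drop at the end instead of one dataframe copy per duplicate.
    return df.drop(index=sorted(dropped)).reset_index(drop=True)

# Hypothetical sample with two duplicated timestamps (5 and 7).
df = pd.DataFrame({
    "timestamp": [0, 1, 2, 3, 4, 5, 5, 6, 7, 7, 8],
    "voltage":   [1.0, 1.0, 1.0, 1.0, 1.0, 9.0, 1.2, 1.0, 1.1, -7.0, 1.0],
})
cleaned = delete_noise_duplicates(df)
print(cleaned)
```

If a duplicated pair sits at the very start of the data, there are no previous rows and the centroid is undefined; the sketch then drops the second duplicate, which is an arbitrary choice.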
