Professional Documents
Culture Documents
HandGest Hierarchical Sensing For Robust In-The-Air Handwriting Recognition With Commodity WiFi Devices
HandGest Hierarchical Sensing For Robust In-The-Air Handwriting Recognition With Commodity WiFi Devices
HandGest Hierarchical Sensing For Robust In-The-Air Handwriting Recognition With Commodity WiFi Devices
fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2022.3170157, IEEE Internet of
Things Journal
IEEE INTERNET OF THINGS JOURNAL 1
Abstract—Recent advances in wireless sensing techniques have teers, evaluation results demonstrate that HandGest outperforms
made it possible to recognize hand gestures using Channel state-of-the-art methods on a large number of handwritings with
State Information (CSI) in commodity WiFi devices. Existing different transceivers’ location, different initial hand locations
WiFi based gesture recognition systems mainly use learning- and orientations, as well as different drawing sizes. Given its
based pattern recognition methods to recognize different gestures, superior performance, we believe that HandGest paves a new
however, these methods fail to work well when the locations of way to enhance the real-world practicality of WiFi-based gesture
transceivers, the relative location and orientation of the hand recognition.
with respect to transceivers, and/or the hand gesturing size
change, leading to inconsistent signal patterns caused by those Index Terms—Handwriting recognition, dynamic phase vector,
factors. Although some recent efforts have been made to address motion rotation variable, hierarchical sensing.
the so-called “domain-dependent” gesture recognition problem,
they either require prior knowledge on initial locations of the
hand and WiFi devices or need to train several classifiers for
I. I NTRODUCTION
the specific domains. Different from state-of-the-art methods, N the past decade, the feasibility of gesture-based human
we construct two distinct features from a hand-oriented view
(rather than from a transceiver’s view), namely dynamic phase
I computer interaction (HCI) has been widely explored by
researchers, aiming at developing advanced interactive tech-
vector (DPV) and motion rotation variable (MRV), which are
quite consistent in characterizing a big set of handwriting nologies with intelligent devices [1]–[3]. Different from tradi-
gestures, despite significant change in locations of transceivers, tional touch-based interfaces, contactless hand gesture-based
the relative location and orientation of the hand with respect to HCI supports the natural interaction through hand drawing in
transceivers, and the drawing sizes. We further incorporate a the air without dedicated on-body hardware. Existing state-of-
hierarchical sensing framework and develop HandGest – a real-
time handwriting gesture recognition system using commodity the-art methods can be grouped into the following categories:
WiFi devices, to precisely recognize a great number of “in-the- camera-based, acoustic-based, and WiFi-based methods, etc.
air” handwritings based on the aforementioned two domain- [4]. Blessed by the ubiquitousness of WiFi signals at low costs,
independent features and a pipeline of specific features. Extensive WiFi-based solutions, capturing the variations of WiFi signals
experiments have been done in practical settings with 20 volun- caused by hand movements, have attracted growing attention
from both academia and industries [5]–[7].
This work is supported in part by the National Natural Science Foundation To capture gestures in the space covered by WiFi signals,
of China A3 Foresight Program under Grant 62061146001, and in part by the
PKU-Baidu Fund under Grant 2019BD005. (Corresponding Author: Daqing Channel State Information (CSI) has been used by existing
Zhang) studies [8]–[10]. Specifically, CSI characterizes the frequency
Jie Zhang is with the Key Laboratory of High Confidence Software response of the wireless channel between the transceivers at
Technologies, Ministry of Education, Beijing 100871, China, also with the
School of Computer Science, Peking University, Beijing 100871, China, orthogonal frequency-division multiplexing (OFDM) subcarri-
and also with the School of Automation & Electrical Engineering, Uni- ers, and it would vary greatly along with moving objects in
versity of Science and Technology Beijing, Beijing 100083, China (e-mail: the space [11]–[13]. Given a commodity WiFi device pair, one
zhangjie.saee@pku.edu.cn).
Yang Li is with the Key Laboratory of High Confidence Software Tech- could retrieve CSI from the network interface card (NIC), such
nologies, Ministry of Education, Beijing 100871, China, and also with the as Intel 5300 NIC, with CSITOOL [14]. Based on raw CSI
School of Computer Science, Peking University, Beijing 100871, China (e- measurements, most state-of-the-art methods adopt learning-
mail: liyang@stu.pku.edu.cn).
Haoyi Xiong and Dejing Dou are with the Big Data Laboratory, Baidu Inc., based methods [15]–[17] to first extract features from CSI, and
Beijing 100193, China, and also with the National Engineering Laboratory of then recognize different gestures though classification, where
Deep Learning Technology and Applications, Beijing 100193, China (e-mails: the primitive features, such as CSI amplitude and statistical
haoyi.xiong.fr@ieee.org; doudejing@baidu.com).
Chunyan Miao is with the School of Computer Science and Engineering, feature, have been frequently used to classify gesture with
Nanyang Technological University, Singapore 639798, and also with the respect to signal variations.
Joint NTU-UBC Research Centre of Excellence in Active Living for the However, these methods fail to recognize hand gestures
Elderly (LILY), Nanyang Technological University, Singapore 639798 (e-mail:
ascymiao@ntu.edu.sg). robustly in practical settings, as they usually cannot tolerate
Daqing Zhang is with the Key Laboratory of High Confidence Software the change of the relative location and orientation of the hand
Technologies, Ministry of Education, Beijing 100871, China, also with the with respect to transceivers, and the drawing size. Specific
School of Computer Science, Peking University, Beijing 100871, China, and
also with the Telecom SudParis, Institute Polytechnique de Paris, 91011 Evry constraints or prior knowledge on those factors are required
Cedex, France (e-mail: dqzhang@sei.pku.edu.cn). by these methods for hand gesture recognition, as signal
2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: BEIJING INSTITUTE OF TECHNOLOGY. Downloaded on May 04,2022 at 12:52:51 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2022.3170157, IEEE Internet of
Things Journal
IEEE INTERNET OF THINGS JOURNAL 2
patterns of hand gestures are usually inconsistent and they consistent and distinguishable representations of handwriting
are indistinguishable with primitive features. To show the gestures in a domain-independent way. Specifically, the use of
inconsistency of signal patterns with varying factors, in Fig. 1, DPV and MRV accurately categorizes various handwritings
we visualize CSI amplitudes of four hand moving straight into a big number of groups, thus it is quite straight-forward
lines with different initial locations, directions and lengths. As to distinguish gestures from different groups accurately. In
shown in Fig. 1a, line 1 and line 2 share the common moving order to differentiate gestures in the same group, a hierarchical
direction, but with different initial locations and lengths. As sensing framework is proposed to apply with fine-grained
a result, CSI amplitudes of line 1 and line 2 are significantly feature to refine the classification results.
different. Line 2 and line 3 share the same initial location, but In this way, we construct HandGest that incorporates the
with different moving directions and lengths. Such that, CSI two distinct features and a hierarchical sensing framework to
amplitude of line 2 has more peaks and valleys than line 3 (see recognize the hand gestures robustly. Note that, HandGest is
Fig. 1b). Moreover, when the hand moves another straight line just an initial attempt to address the fundamental problem
with different lengths, i.e., line 4, it appears to share similar in CSI-based gesture/activity recognition – the location and
CSI amplitude with line 1. Apparently, all primitive features orientation dependent problem – in the wireless sensing field,
extracted from CSI measurements are affected by the above- where we use digits and letters recognition as an example.
mentioned three factors. The inconsistency of signal patterns With the aforementioned problem solved, HandGest provides
would fail most of the existing WiFi-based gesture/activity researchers in the community critical insights in system design
recognition methods, which highly rely on the assumption that while leading a way to move on. The main contributions can
there are pair-to-pair feature mappings between signal varia- be summarized as follows:
tions and gestures/activities. This is one fundamental problem
1) Two hand-centric features, i.e., DPV and MRV, are
hindering the existing WiFi-based gesture/activity recognition
constructed to convert every handwriting gesture to a
systems from achieving consistent and high accuracy [18].
unique profile from a hand-oriented view, which is not
To recognize gestures under the change of aforementioned
affected by the initial locations and orientations of the
factors (or namely “configurations” in [19] and “domains”
hand, and drawing sizes.
in [20]), some “domain adaptation” techniques have been
2) A hierarchical sensing framework is proposed for ro-
proposed. Specifically, WiAG [19] first leveraged a translation
bust handwriting recognition in real-world application
function to generate virtual samples with realistic CSI mea-
scenarios, where DPV, MRV and fine-grained feature
surements of hand gestures for other possible configurations.
are fused in a hierarchical decision making pathway to
After that, several classifiers were trained using the generated
ensure the distinguishability of numerous handwriting
virtual samples for pattern recognition under different configu-
gestures.
rations. Note that, an additional smart phone had been request-
3) We design and implement HandGest on commodity
ed to collect data from users, so as to model the translation
WiFi devices, and conduct comprehensive experiments
function. Widar3.0 [20] proposed a domain-independent body-
in real-world settings. Experimental results of handwrit-
coordinate velocity profile (BVP) to unify the characterization
ing gestures performed by 20 volunteers with different
of gestures in the same type with at least three pairs of WiFi
initial locations, drawing orientations/sizes, and WiFi
devices, where the locations of the users and WiFi devices are
transceiver settings in 3 typical indoor environments
requested for BVP construction. In summary, state-of-the-art
show that the average recognition accuracy of HandGest
methods usually acquire additional information to tackle the
outperforms state-of-the-art methods, indicating its ro-
“domain adaptation” issue, which limits their practicability and
bustness and great generalization performance.
applicability in real-world application scenarios.
Our Work and Contributions. In this work, we study The remaining of this paper is organized as follows: Section
a specific hand gesture recognition task, namely “in-the- II reviews recent advances of WiFi-based gesture recognition.
air” handwriting recognition, aiming at recognizing the digits The preliminaries and backgrounds of gesture sensing with
and letters written by a hand in the air. While the state- CSI ratio are given in Section III. Section IV presents the
of-the-art learning-based solutions either are intolerant with methodologies of our work, including the design of DPV and
changing factors/domains/configurations, or rely on additional MRV features, and the hierarchical sensing framework. System
prior knowledge, data annotation, devices and transceiver de- implementation is shown in Section V. Performance evaluation
ployments for hand gesture recognition, we propose to address and further analysis are reported in Section VI, followed
the domain-dependent issue from a different perspective. by some discussions in Section VII. Finally, conclusions are
While all existing gesture recognition methods apply the presented in Section VIII.
transceiver-oriented view to receive CSI and extract primitive
features, it is difficult to recognize the hand gestures robustly
as the pattern of the same gesture may vary from the view II. R ELATED W ORK
of WiFi devices due to initial locations and orientations of
the hand, and drawing sizes. Differently, we characterize the WiFi-based gesture recognition can be grouped into two
relative motion of handwriting using a hand-oriented view categories depending on whether the initial locations and
and construct two distinct features – dynamic phase vector orientations of handwritings have been specified or included
(DPV) and motion rotation variable (MRV) – that provide as the input for recognition.
2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: BEIJING INSTITUTE OF TECHNOLOGY. Downloaded on May 04,2022 at 12:52:51 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2022.3170157, IEEE Internet of
Things Journal
IEEE INTERNET OF THINGS JOURNAL 3
Line 1 Line 2
Amplitude
Amplitude
Line 2
Line 3 Line 4
Tx Rx
Amplitude
Amplitude
(a) (b)
Fig. 1. Effects of initial locations, orientations and drawing sizes on gesture patterns: (a) Drawing lines with different initial locations, directions and lengths,
(b) Patterns of different lines.
A. WiFi-based Gesture Recognition that Depends on the Ini- advanced location/orientation independent feature or/and re-
tial Hand Locations and Orientations sort to machine/deep learning methods. WiAG [19] proposed
a modified translation function to generate virtual samples of
Most of the existing WiFi-based gesture recognition works
all the gestures in the possible configurations only using the
assume the initial hand locations, orientations and drawing
samples of all the gestures in one configuration. Widar3.0 [20]
sizes of handwritings are known or specified. They barely per-
constructed a domain-independent BVP to guarantee each type
form well with the changes of initial location and orientation
of gestures has its unique velocity profile in the established
of the hand, and drawing sizes. Specifically, WiFinger [15]
body coordinate system, which could achieve satisfactory
adopted multi-dimensional DTW (MD-DTW) to calculate the
recognition accuracy on relatively complex gestures. However,
similarity between the testing CSI measurements and each of
it requires to know the locations of user and WiFi devices in
the known gesture profiles, and the one with highest similarity
advance, and a large number of data for training deep neural
score was identified as recognized gesture. QGesture [17]
network, which limit its practicability and scalability in real-
adopted phase correction algorithm and a novel estimation
world application scenarios. WiHand [25] extracted location
algorithm to reduce the phase noise of CSI measurements
independent gesture signals from the raw WiFi signals through
and impacts of environmental dynamics for high accuracy, but
the low rank and sparse decomposition algorithm, and then
user must perform two predefined actions to separate gesture
utilized SVM as classifier to identify the gestures. Regani et
from daily activities. WiMU [21] was designed for multi-user
al. [26] proposed a unique hand gesture model and derived the
gesture recognition, which first generated virtual samples for
correspondence between the time reversal resonating strength
various plausible combinations of simultaneously performed
decay and relative distance moved by the hand in the through-
gestures, and then classified those gestures utilizing the Jac-
the-wall scenario. They achieved 87% recognition accuracy
card similarity coefficient based method. Yang et al. [22]
on 6 uppercase English alphabets, but the hand movement
designed a modified OpenWrt-based platform, in which deep
must be straight line during performing gestures in the air.
neural networks were adopted to extract the spatiotemporal
Additionally, Ma et al. [27] developed a novel deep neural
patterns of gestures. WiDraw [23] extracted angle-of-arrival
network for gesture recognition, which could simultaneously
(AoA) values from CSI and average received signal strength
learn high-level features and a transferrable similarity eval-
(RSS) measurements for hand motion tracking, but it required
uation ability, making the learned knowledge transferrable
dense transmitters to guarantee its high accuracy. WiGest [24]
between the training dataset and new testing scenarios. While
defined three unique signal states, including rising edge, falling
state-of-the-art methods intend to solve the location/orientation
edge and pause, and parsed the combinations of those signal
dependent problem, they actually are with poor robustness
states to identify various gestures. While the aforementioned
and generalization performance especially when the initial
works have made some progresses in recognition, they fail to
location, orientation and drawing size, etc., are not constrained.
solve the location/orientation dependent problem – a funda-
mental problem hindering the wireless sensing techniques to
be applied in real settings. III. G ESTURE S ENSING WITH CSI R ATIO
Given K subcarriers, denoted as H(f1 ), H(f2 ), . . . H(fK )
B. Location/Orientation Independent WiFi-based Gesture over the frequencies f1 , f2 , . . . fK , CSI aggregates the real-
Recognition time characterization of the K subcarriers of radio propaga-
tion, which can be mathematically represented as
In order to tackle the location/orientation dependent prob-
lem of WiFi-based gesture recognition, some works propose H = [H (f1 ) , H (f2 ) , . . . , H (fK )] . (1)
2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: BEIJING INSTITUTE OF TECHNOLOGY. Downloaded on May 04,2022 at 12:52:51 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2022.3170157, IEEE Internet of
Things Journal
IEEE INTERNET OF THINGS JOURNAL 4
Specifically, a single subcarrier H(fk ) of CSI can be two antennas can be represented as
expressed as ( d (t)
)
−jθoffset −j2π 1λ
H1 (f, t) e Hs,1 + A1 e
= ( )
̸ H(fk ) H2 (f, t) d2 (t)
H (fk ) = |H (fk )| ej (2) e−jθoffset H + A e−j2π λ s,2 2
d (t)
−j2π 1λ
A1 e +Hs,1 (4)
where |H (fk )| denotes the amplitude, and ̸ H (fk ) denotes = d1 (t)+∆d
2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: BEIJING INSTITUTE OF TECHNOLOGY. Downloaded on May 04,2022 at 12:52:51 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2022.3170157, IEEE Internet of
Things Journal
IEEE INTERNET OF THINGS JOURNAL 5
Q
Q
-0.4
-0.7
Rx2 Rx1 -0.5
-0.8
-0.6
-0.9
-0.8 -0.6 -0.4 -0.2 0.4 0.5 0.6 0.7 0.8 0.9 1
I I
18
man body, such as hand, arm and even shoulder. Furthermore,
Amplitude
0.32
16 the CSI signals of more than two times reflection are usually
0.3 14
weak, it is reasonable to ignore them. In this way, we only
12
y (cm)
0.28
0 200 400 600 800 1000 1200 1400 1600 1800
Tx-Rx2
consider the CSI signals that directly propagate to the moving
8
0.26
hand and then are reflected by the hand to the WiFi receivers.
Amplitude
0.24
4
0.22 2
0.27 0.28 0.29 0.3 0.31 0.32 0.33 0.34 0 200 400 600 800 1000 1200 1400 1600 1800
x (cm) Samples
(a) (b)
Tx-Rx1
0.34 16
Qtk
15
Amplitude
0.33
14
0.32 13
12
0.31
y (cm)
0.29 6
0.28 5
0.27
0.32 0.33 0.34 0.35 0.36 0.37 0.38 0.39
4
0 200 400 600 800 1000 1200 1400 1600 1800
Fig. 6. Illustration of DPV feature.
x (cm) Samples
(c) (d)
Fig. 4. Empirical study of drawing digit “3”: (a) Drawing digit “3” for the B. Hand-Centric Feature – Dynamic Phase Vector
first time, (b) CSI amplitude of the first digit “3”, (c) Drawing digit “3” for
the second time, (d) CSI amplitude of the second digit “3”. Practically, when performing the same handwriting, it is
impossible to request users to keep the same initial location-
Remark 1. Without loss of generality, we take digits “0- s/orientations of the hand, and sizes of handwriting every
9” as an example to illustrate the proposed hand-centric fea- time. From a transceiver-oriented view, the change of such
tures, fine-grained feature, and hierarchical sensing framework factors makes the representation of hand movement different.
in the following sections. We use these ten digits as examples Therefore, it is worthwhile to construct a hand-centric feature
for handwriting recognition and uncover every digit with a that can characterize the hand movement in a hand-oriented
unique and consistent pattern from a hand-oriented view. view.
2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: BEIJING INSTITUTE OF TECHNOLOGY. Downloaded on May 04,2022 at 12:52:51 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2022.3170157, IEEE Internet of
Things Journal
IEEE INTERNET OF THINGS JOURNAL 6
DPV Pattern of Digit "0" DPV Pattern of Digit "1" DPV Pattern of Digit "2" DPV Pattern of Digit "3" DPV Pattern of Digit "4"
2 0 3
2 2
2
1 -1
Phase Changes
Phase Changes
Phase Changes
Phase Changes
Phase Changes
1 1 1
0 -2
0 0 0
-1 -3
-1
-1 -1
-2 -4 -2
0 200 400 600 800 1000 0 200 400 600 800 1000 0 200 400 600 800 1000 0 200 400 600 800 1000 0 200 400 600 800 1000
Samples Samples Samples Samples Samples
(a) Digit “0” (b) Digit “1” (c) Digit “2” (d) Digit “3” (e) Digit “4”
DPV Pattern of Digit "5" DPV Pattern of Digit "6" DPV Pattern of Digit "7" DPV Pattern of Digit "8" DPV Pattern of Digit "9"
2 1.5 2 2
1 1
1 1 1
Phase Changes
Phase Changes
Phase Changes
Phase Changes
Phase Changes
0.5
0
0
0 0 0
-0.5 -1
-1
-1 -1 -1 -2
-1.5
-2 -2 -2 -2 -3
0 200 400 600 800 1000 0 200 400 600 800 1000 0 200 400 600 800 1000 0 200 400 600 800 1000 0 200 400 600 800 1000
Samples Samples Samples Samples Samples
(f) Digit “5” (g) Digit “6” (h) Digit “7” (i) Digit “8” (j) Digit “9”
Fig. 7. Pattern templates of DPV of the ten digits.
As shown in Fig. 6, we establish a coordinate system “2” and “4”, digits “5”, “6” and “9” should be two independent
with initial location of the hand as the origin. We define a subgroups.
modified hand-centric feature, named dynamic phase vector We still take digit “3” as an example to demonstrate the
(DPV), as the phase variation between the initial location of representation capability of DPV for the hand movement. As
the hand and the horizontal axis of each point on the trajectory. illustrated in Fig. 8a, when we draw digit “3” without stroke
Mathematically, dynamic phase sequence can be represented reversal, its DPV first rotates clockwise and then counter
as clockwise (see Fig. 8c and Fig. 8d). When we draw digit
Θ= {Θt1 , Θt2 , ..., Θtn } (5) “3” with stroke reversal, the corresponding DPV also first
rotates clockwise and then counter clockwise (see Fig. 8b,
where Θtk stands for the dynamic phase at time step tk , k ∈
Fig. 8c and Fig. 8d). According to Fig. 6, digit “3” consists
[1, n].
three components in DPV feature space, sequentially denoted
Note that, with the hand movement in ideal settings, CSI
by blue, green and red, corresponding to the three-segment
of every subcarrier should vary consistently with the same
curves in Fig. 7d. DPV can be regraded as a kind of coarse-
pace while the reflection path caused by the moving hand
grained hand-centric feature, which has excellent representa-
could be measured by any of them. However, due to multipath
tion capability for the hand movement and is not sensitive
propagation and random noise, etc., such consistency may
to the local characteristics. Thus, it can efficiently reduce the
not always exist during the whole hand moving process in
effects of different handwriting styles on the same digit. Fig. 9
practical scenarios. Based on the observation, we find that
illustrates a practical case of drawing digit “3” without and
the subcarrier, whose phase change difference has the lowest
with stroke reversal, we can find that the two corresponding
variance, can characterize the hand movement better than
DPV patterns are consistent, verifying the above analysis about
others in a sliding window. Therefore, to refine DPV, we select
the representation capability of DPV for the hand movement.
the subcarrier with the lowest variance in each sliding window:
Υtk ,Q = argmin var (Υtk ,q ) (6)
C. Hand-Centric Feature - Motion Rotation Variable
where Υtk ,q denotes the dynamic phase of the specific qth In this section, we introduce another hand-centric feature,
subcarrier at time step tk , q ∈ [1, K], and Υtk ,i is the dynamic named motion rotation variable (MRV). As was discussed,
phase of the ith subcarrier, respectively. DPV represents digits with consistent patterns, however, it
Fig. 7 illustrates the pattern templates of DPV of the ten also fails to provide distinguishable characterizations to the
digits in three independent experiments of different settings, digits “2” and “4”, and digits “5”, “6” and “9”. For example,
where every pattern template refers to the phase of DPV in Fig. 10, the DPVs of both digits “2” and “4” first rotate
changing over time when handwriting the digit in a common clockwise and then counter clockwise. It means that the
way. Apparently, DPV is capable of preserving characteristics essential difference between digits “2” and “4” is the different
of all digits through capturing the hand movement in the hand- segment sequence, but DPV cannot characterize these detailed
oriented view. Though these digits are not fully distinguishable segments (including straight line and arc, etc.) well.
using the DPV features, DPV is still able to categorize them To tackle the problem, we propose MRV, as the angle
into several subgroups with consistent patterns. For example, variation of adjacent sampling points in a certain period of
digits “0”, “1”, “3”, “7” and “8” have their own unique time (see Fig. 11):
patterns, but digits “2” and “4” have similar DPV patterns,
digits “5”, “6” and “9” have similar DPV patterns, thus digits ∆θ= {∆θt1 , ∆θt2 , ..., ∆θtn } (7)
2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: BEIJING INSTITUTE OF TECHNOLOGY. Downloaded on May 04,2022 at 12:52:51 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2022.3170157, IEEE Internet of
Things Journal
IEEE INTERNET OF THINGS JOURNAL 7
Qt Qt
Stroke Reversal
(a) (b)
Fig. 10. Similarity between digits “2” and “4” in DPV feature space.
Tx
qt k
movement.
0.44 0.46
Phase Changes
Phase Changes
0.42 0.44
0.4 0.42
distinguished patterns. Similarly, digit “6” has different MRV
0.38 0.4 pattern compared with digits “5” and “9”. Accordingly, the ten
0.36 0.38 digits can be furthered classified with the help of MRV. Un-
0.34 0.36
0.01 0.02 0.03 0.04 0.05
Samples
0.06 0.07 0.05 0.1
Samples
0.15 fortunately, patterns of some digits are still similar, motivating
(a) (b) us to construct targeted fine-grained feature to try to separate
them.
0
-0.4
-0.5
-0.6 D. Fine-grained Feature
Phase Changes
Phase Changes
-1
-0.8
-1.5 -1 In this section, we introduce fine-grained feature designed
-2
-1.2
-1.4
for the indistinguishable digits by DPV and MRV. As men-
-2.5
-1.6 tioned above, two digits are still indistinguishable, but they
-3
0 500 1000 1500
-1.8
0 200 400 600 800 1000 1200 1400 could be divided into one subgroup. Thus, we can construct
Samples Samples
(c) (d)
case-by-case fine-grained feature depending on the characteris-
tics of the digits in the subgroup to finally separate all of them.
Fig. 9. Empirical study of drawing digit “3” without and with stroke reversal:
(a) Digit “3” without stroke reversal, (b) Digit “3” with stroke reversal, (c) The constructed fine-grained feature is actually the qualitative
DPV pattern of digit “3” without stroke reversal, (d) DPV pattern of digit “3” representation of primitive features, which is enough for the
with stroke reversal. rest of unrecognized digits in the subgroup.
According to Fig. 7 and Fig. 12, digits “5” and “9” have
where ∆θtk is the angle variation at time step tk (θtk and similar DPV and MPV patterns. Experimentally, the velocity
θtk−1 represent the instantaneous angles at time step tk and directions of the last segment of digits “5” and “9” are
tk−1 ): significantly different relative to link Tx-Rx1. The velocity
direction of the last segment of dight “5” is always positive,
∆θtk =θtk − θtk−1 (8) and the corresponding one of digit “9” is negative. Thus, the
velocity direction of the last segment can be a potential fine-
According to Eq. (8), ∆θtk can be either positive or negative grained feature for separating digits “5” and “9”. Practically,
depending on the hand moving direction. Specifically, if hand we randomly select some digits “5” and “9” to verify this
rotates clockwise, ∆θtk > 0, and if hand rotates counter observation, which are demonstrated in Fig. 13. According
clockwise, ∆θtk < 0. Thus, MRV also can be utilized to show to Fig. 13, we can find that the last segment of velocity
the hand moving direction. directions of the selected digit “5” are always positive, and
Fig. 12 illustrates the pattern templates of MRV of the ten the corresponding ones of the selected digit “9” are always
digits in three independent experiments of different settings, negative, indicating that velocity direction is enough for only
where every pattern template refers to the angle of MRV recognizing digits “5” and “9”. The velocity direction of the
changing over time when handwriting the digit. In Section last segment of handwriting is only a kind of case-by-case
IV-B, we have demonstrated that DPV cannot separate digits fine-grained feature proposed in this paper. If more gestures
“2” and “4”, and they can be identified using MRV with are supposed to be recognized using the proposed method in
2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: BEIJING INSTITUTE OF TECHNOLOGY. Downloaded on May 04,2022 at 12:52:51 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2022.3170157, IEEE Internet of
Things Journal
IEEE INTERNET OF THINGS JOURNAL 8
MRV Pattern of Digit "0" MRV Pattern of Digit "1" MRV Pattern of Digit "2" MRV Pattern of Digit "3" MRV Pattern of Digit "4"
2 0 2 2 1.5
1.5 1
1 -1 1
Angle Changes
Angle Changes
Angle Changes
Angle Changes
Angle Changes
1
0.5
0.5
0 -2 0 0
0
-0.5
-0.5
-1 -3 -1
-1 -1
-2 -4 -1.5 -2 -1.5
0 200 400 600 800 1000 0 200 400 600 800 1000 0 200 400 600 800 1000 0 200 400 600 800 1000 0 200 400 600 800 1000
Samples Samples Samples Samples Samples
(a) Digit “0” (b) Digit “1” (c) Digit “2” (d) Digit “3” (e) Digit “4”
MRV Pattern of Digit "5" MRV Pattern of Digit "6" MRV Pattern of Digit "7" MRV Pattern of Digit "8" MRV Pattern of Digit "9"
2 2 1.5 2
2
1
1 1 1
Angle Changes
Angle Changes
Angle Changes
Angle Changes
0.5 1
Angles
0
0 0 0
0
-1 -0.5
-1 -1 -1
-1
-2
-2 -1.5 -2 -2
0 200 400 600 800 1000 0 200 400 600 800 1000 0 200 400 600 800 1000 0 200 400 600 800 1000 0 200 400 600 800 1000
Samples Samples Samples Samples Samples
(f) Digit “5” (g) Digit “6” (h) Digit “7” (i) Digit “8” (j) Digit “9”
Fig. 12. Pattern templates of MRV of the ten digits.
a location/orientation independent manner, more features need Each or several digits have their own common char-
•
to be constructed following the idea. acteristics. A certain feature, including both traditional
primitive feature and hand-centric feature, may have
good representation capability only for specific digits, but
Tx-Rx1 Tx-Rx1
0.04
relatively poor representation capability for other digits.
0.02
5 0.02 5
0
0
-0.02
Thus, it is impossible to completely separate all the digits
-0.02
-0.04
-0.06
only utilizing any of the feature. For example, DPV
0 200 400 600 800 1000 1200 1400 0 200 400 600 800 1000 1200
Tx-Rx2 Tx-Rx2 cannot identify digits “2” and “4”, and MRV does not
0
-0.05
0
function well for digits “0” and “6”.
-0.05
-0.1 • Every feature has different representation capability for
-0.1 -0.15
0 200 400 600 800 1000 1200 1400 0 200 400 600 800 1000 1200 different digits, if all the features are utilized equally, the
(a) (b) features with poor representation capability for specific
digits may provide redundant information, leading to
Tx-Rx1 Tx-Rx1
0.05 0.02 performance loss.
0
0 In this way, we design the hierarchical sensing framework
9 -0.02
9 as illustrated in Fig. 14. Accordingly, all the digits are divided
-0.05 -0.04
0 200 400 600 800 1000 0 200 400 600 800
Tx-Rx2
0.02
Tx-Rx2 into several subgroups by DPV in the first layer, named first
0.02
0 0
coarse-grained classification. In the next layer, MRV-based
-0.02 -0.02 second coarse-grained classification is implemented for the
0 200 400 600 800 1000 0 200 400 600 800 grouped digits. After that, unrecognized digits with similar
(c) (d) MRV patterns have been categorized into smaller subgroup.
Fig. 13. Empirical study of identifying digits “5” and “9” using targeted In the final layer, targeted finer-grained feature is designed
fine-grained feature. for the subgroup to separate the digits completely, i.e., fine-
grained classification. Fig. 15 illustrates the hierarchical sens-
ing framework for English letters, which also employs a three-
E. Hierarchical Sensing Framework layer structure with DPV-based first coarse-grained classifi-
With the above three types of features, in this section, we cation, MRV-based second coarse-grained classification, and
present a hierarchical sensing framework for the handwriting fine-grained classification.
recognition. The objective of the hierarchical sensing frame-
work is to enhance the discriminant capability of the features V. S YSTEM I MPLEMENTATION
for pattern recognition, while the design of the framework is In this section, we detail the design and implementation of
motivated by the following observations: HandGest, which consists of data collection, data preprocess-
• Compared with traditional primitive features, the pro- ing, feature construction, and hierarchical sensing components.
posed hand-centric features, i.e., DPV and MRV, play Fig. 16 depicts its overview architecture, in which data collec-
important roles in maintaining pattern consistency of dig- tion component collects raw CSI measurements from the WiFi
its, which make the digit patterns not change significantly receivers and divides them to produce CSI ratio streams, data
due to the difference of initial location and orientation of preprocessing component is mainly for denoising CSI ratio
the hand, and drawing size, etc. data through Savitzky-Golay (SG) filter, feature construction
2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: BEIJING INSTITUTE OF TECHNOLOGY. Downloaded on May 04,2022 at 12:52:51 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2022.3170157, IEEE Internet of
Things Journal
IEEE INTERNET OF THINGS JOURNAL 9
0 Classification Pool
Fine-grained Velocity
1 feature direction 5
Rx1
3 relative to link 9
7 Tx-Rx1 3
DPV 2 5 Red digits have similar 5 9 7
Tx 4 9 DPV/MRV patterns
5 MRV 2 1
6 4 Green digits have
Rx2
8 6 Distinguishable
8 DPV/MRV patterns
9 Labeling
Fig. 14. Hierarchical sensing framework for digits.
l No
Recognized
Yes
q m Subgroup1
p Classification Pool Hierarchical
MRV Contruction
Sensing
Fig. 15. Hierarchical sensing framework for letters. Yes
Recognized
No
Subgroup2
A. Data Collection
Fine-grained
In this component, we collect CSI measurements from the Feature Labeling
two WiFi receivers equipped with Intel 5300 NICs with three Construction
2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: BEIJING INSTITUTE OF TECHNOLOGY. Downloaded on May 04,2022 at 12:52:51 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2022.3170157, IEEE Internet of
Things Journal
IEEE INTERNET OF THINGS JOURNAL 10
where Φ1tk and Φ2tk denote the phase of CSI ratio of links state-of-the-art methods. Finally, we show experimental results
Tx-Rx1 and Tx-Rx2. of HandGest with different initial locations of the hand, draw-
DPV has good representation capability for characterizing ing sizes, drawing speeds, drawing manners, LoS distances of
the hand movement, which can reduce the effects of drawing WiFi transceivers, angles between WiFi transceivers, sampling
inconsistency of the same handwriting gestures, but it may not rates, static environments, and dynamic environments, etc.
identify all the digits only using DPV. Thus, we then calculate
the angle variation of adjacent sampling points in a certain A. Experimental Settings
period of time to construct MRV: The experimental environments include an office, a meeting
room, and an empty hall of Peking University, layouts of
∆Φ1tk = Φ1tk − Φ1tk−1 (10) the office and meeting room are illustrated in Fig. 17. The
size of the office is 5m×5m, the size of the meeting room
is 6m×8m, and the size of the empty hall is 8m×12m,
∆Φ2tk = Φ2tk − Φ2tk−1 (11) respectively. Accordingly, both of the office and meeting room
are multipath-rich due to their complex layouts. The tested
( )
∆Φ1tk handwriting gestures include selected digits and letters.
∆θtk = arctan (12)
∆Φ2tk
Differently, MRV is sensitive to the local characteristics of
hand movement, which is discriminative to the gestures with
similar DPV patterns. Finally, targeted fine-grained feature is
proposed to identify the unrecognized gestures by DPV and
MRV, such as velocity relative to link Tx-Rx1, direction of
displacement relative to link Tx-Rx1, etc.
D. Hierarchical Sensing
In this component, all the constructed features are embedded (a) (b)
into the modified hierarchical sensing framework to identify Fig. 17. Layouts of the office and meeting room: (a) Office, (b) Meeting
handwriting gestures, enhancing discriminant capability of room.
those features in different layers. Specifically, first coarse- In each of the three indoor environments, we build a 2-
grained classification separates some of the gestures using dimensional Fresnel zones with two pairs of WiFi transceivers,
DPV, in which all the handwriting gestures have their own utilizing GigaByte mini PCs with an Intel 5300 NIC and
consistent patterns. In addition, second coarse-grained classi- external omni-directional antennas for data transmitting and
fication is performed for the handwriting gestures with similar receiving, as shown in Fig. 18 (experimental locations of
DPV patterns using MRV. Finally, fine-grained classification HandGest are denoted by the red points in Fig. 17). HandGest
is implemented for the unrecognized handwriting gestures by runs in 5.24 GHz frequency with 40 MHz, and the packet
DPV and MRV. In the offline preparation phase, to minimize transmission rate is set as 500 packets per second, installing
the efforts required for training and calibration, our method CSITOOL on the mini PCs for data collection.
only needs to collect the WiFi signals for each handwrit-
ing once as the training sample. In the online recognition
phase, when an unknown handwriting is drawn and processed,
HandGest extracts the two features from the WiFi signals
and searches the nearest example as the recognition label
using DTW by calculating the Euclidean distance of DPV
and MRV features between the unknown handwriting and the
template handwritings. Note that, no matter the handwritings
are performed by whom and in which location/orientation
settings, the proposed method is capable of producing a
robust model that can accurately recognize the handwritings
performed by different persons in varying location/orientation
settings.
Fig. 18. Experimental setting in the office.
2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: BEIJING INSTITUTE OF TECHNOLOGY. Downloaded on May 04,2022 at 12:52:51 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2022.3170157, IEEE Internet of
Things Journal
IEEE INTERNET OF THINGS JOURNAL 11
Accuracy
in WiFi sensing, including support vector machine (SVM),
convolutional neural network (CNN), long short-term memory 40
True Label
level gestures. Furthermore, it needs at least three wireless 4 0.00 0.00 0.00 0.00 0.96 0.03 0.00 0.00 0.01 0.00
links, and the locations of user and WiFi devices are required, 5 0.00 0.00 0.00 0.00 0.01 0.94 0.00 0.00 0.00 0.05
which may be practically inconvenient. HandGest achieves 6 0.10 0.00 0.00 0.00 0.00 0.00 0.90 0.00 0.00 0.00
better recognition accuracy than FingerDraw and AirDraw. 7 0.01 0.02 0.00 0.00 0.00 0.00 0.00 0.98 0.00 0.00
The main reasons include: 1) Both FingerDraw and AirDraw 8 0.01 0.00 0.00 0.00 0.04 0.00 0.00 0.00 0.94 0.02
calculate the displacement information of the hand/finger to 9 0.00 0.00 0.00 0.00 0.00 0.03 0.00 0.01 0.00 0.97
92
is around 95%.
90
C. Evaluation of Robustness and Generalization Performance
In this section, we evaluate the robustness and generalization 88
performance of HandGest under various factors.
1) Impact of Different Initial Locations of the Hand: We
print templates for all the selected digits and letters on a 1 2 3 4 5 6 7 8 9 1011121314151617181920
cardboard, and hang them in the sensing area between the Volunteer Number
two pairs of WiFi transceivers, making participants draw at
Fig. 21. Comparison results of different participants without restrictions on
different initial locations by adjusting the locations of those initial location of the hand.
templates. In this manner, there are no special constraints
on the initial locations of the hand, all the participants can 2) Impact of Different Drawing Sizes: We print three
perform handwriting gesture at any locations naturally be- templates on a cardboard in three different sizes (5cm×7cm
tween the WiFi transceivers. Fig. 21 depicts the corresponding for small size handwriting gestures, 7cm×10cm for middle
2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: BEIJING INSTITUTE OF TECHNOLOGY. Downloaded on May 04,2022 at 12:52:51 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2022.3170157, IEEE Internet of
Things Journal
IEEE INTERNET OF THINGS JOURNAL 12
Accuracy 95.35% 95.79% 92.97% Accuracy 95.41% 95.62% 94.63% 93.50% 91.30%
TABLE II TABLE IV
C OMPARISON R ESULTS OF D IFFERENT D RAWING S PEEDS C OMPARISON R ESULTS OF D IFFERENT A NGLES BETWEEN W I F I
T RANSCEIVERS
Low speed Normal speed High speed
60 degree 90 degree 120 degree
Accuracy 94.17% 95.10% 90.79%
Accuracy 94.71% 96.03% 95.09%
2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: BEIJING INSTITUTE OF TECHNOLOGY. Downloaded on May 04,2022 at 12:52:51 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2022.3170157, IEEE Internet of
Things Journal
IEEE INTERNET OF THINGS JOURNAL 13
VII. D ISCUSSIONS
100
In this section, we discuss several open issues of this
work, including the flexibility of the proposed hierarchical
80
sensing framework, interpretability of feature extraction for
WiFi sensing, additional benefits of the proposed hierarchical
Accuracy
2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: BEIJING INSTITUTE OF TECHNOLOGY. Downloaded on May 04,2022 at 12:52:51 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2022.3170157, IEEE Internet of
Things Journal
IEEE INTERNET OF THINGS JOURNAL 14
DPV MRV
Fine-grained
Labeling hands by increasing additional WiFi devices to construct 3-
Features
dimensional Fresnel zones in a multiview manner. Moreover,
we treat the moving hand as a mass point due to the weak
Unknown
reflection of other parts of human body. However, it should
Gestures
be multiple points reflection when our body shakes vigorously
during performing handwritings, which is a challenging issue.
General
Local Details Qualitative Details While the proposed hierarchical sensing framework is extensi-
Moving Trend
ble, but more fine-grained features are needed to be identified
Fig. 24. Interpretable feature extraction of handwriting gesture recognition. if additional types of handwritings are required for recognition.
VIII. C ONCLUSION
C. Additional Benefits of the Hierarchical Sensing Mechanism In this paper, we design and implement HandGest, the
In this paper, we propose the hierarchical sensing mechanis- real-time handwriting recognition system using commodity
m for handwriting recognition, but we envision that it should WiFi devices. Specifically, we first construct two hand-centric
be a general methodology to guarantee the robustness and gen- features, i.e., DPV and MRV, to ensure the consistency and
eralization performance for many WiFi sensing applications. discriminability of handwriting, in the feature spaces, without
In addition, we also can identify handwriting gestures with strict constraints of initial locations and orientations of the
similar trajectories using the hand-centric features embedded hand, and drawing sizes, etc. Furthermore, we propose a
in the hierarchical structure. For example, we can draw digit hierarchical sensing framework that only needs two pairs
“0” tall and slender, and letter “o” short and fat. Some of commodity WiFi devices and achieves good robustness
existing methods directly utilize quantitative features, such as and great generalization performance in practical scenarios
velocity and accumulated displacement, making them achieve through a layer-by-layer recognition procedure. Comprehen-
satisfactory accuracy only when the hand moves in specific sive experiments on a large number of handwritings performed
areas of the two pairs of WiFi transceivers. Because the by 20 volunteers in different real-world settings demonstrate
intersected Fresnel zones of the two pairs of WiFi transceivers that HandGest outperforms state-of-the-art solutions without
are orthogonal at specific areas between them (illustrated as knowing or constraining the initial locations and orientations
optimal sensing area in Fig. 25), and it is sparse and dense of the hand, and drawing sizes, etc. The main objective of
alternately in other areas. In the optimal sensing area, quan- this work is to address the fundamental problem in CSI-based
titative relationship of the features can be held. However, the gesture/activity recognition – the location and orientation
performance of the proposed hierarchical sensing mechanism dependent problem. The proposed HandGest is just an initial
is not limited by the so-called “optimal sensing area” due to attempt to solve this problem, where we use the ten digits
the embedded hand-centric features recording instantaneous recognition as an example. We envision that the proposed
status of the moving hand at each time step from a hand- hierarchical sensing mechanism is a general methodology
oriented view. for both gesture recognition and other CSI-based sensing
applications, which may fill in the gap between laboratory
prototype systems and real-world deployments.
R EFERENCES
[1] D. Zhang, H. Wang, and D. Wu, “Toward centimeter-scale human activity
sensing with Wi-Fi signals,” Computer, vol. 50, vol. 1, 2017, pp. 48-57.
[2] Y. Ma, G. Zhou, and S. Wang, “WiFi sensing with channel state
information: A survey,” ACM Comput. Surv. Tut., vol. 52, no. 3, 2019,
pp. 1-36.
[3] J. Liu, H. Liu, Y. Chen, Y. Wang, and C. Wang, “Wireless sensing for
human activity: A survey,” IEEE Commun. Surv. Tut., vol. 22, no. 3, 2020,
pp. 1629-1645.
[4] Y. Ma, G. Zhou, S. Wang, H. Zhao, and W. Jung, “SignFi: Sign language
recognition using WiFi,” Proc. ACM Interact. Mob. Wearable Ubiquitous
Technol., vol. 2, no. 1, 2018, pp. 1-21.
[5] X. Wang, X. Wang, and S. Mao, “RF sensing with the Internet of Things:
A general deep learning framework,” IEEE Commun. Mag., vol. 56, no.
9, 2018, pp. 62-67.
[6] J. Zhang, Y. Li, and W. Xiao, “Integrated multiple kernel learning for
Fig. 25. Intersection of Fresnel zones of the WiFi transceivers. device-free localization in cluttered environmnets using spatiotemporal
information,” IEEE Internet Things J., vol. 8, no. 6, 2021, pp. 4749-4761.
[7] Y. Wang, J. Liu, Y. Chen, M. Gruteser, J. Yang, and H. Liu, “E-
eyes: Device-free location-oriented activity indentification using fine-
grained WiFi gestures,” in: Proceedings of the 20th Annual International
D. Limitations Conference on Mobile Computing and Networking (MobiCom), Maui,
Hawaii, USA, 2014, pp. 617-628.
HandGest focuses on handwriting recognition of one hand, [8] W. Wang, A.X. Liu, M. Shahzad, K. Li, and S. Lu, “Understanding
it is difficult to separate the WiFi signals reflected by multiple and modeling of WiFi signal based human activity recognition,” in:
Proceedings of the 21st Annual International Conference on Mobile
moving hands simultaneously using two pairs of commodi- Computing and Networking (MobiCom), New York, NY, USA, 2015, pp.
ty WiFi devices. It is possible to capture multiple moving 65-76.
2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: BEIJING INSTITUTE OF TECHNOLOGY. Downloaded on May 04,2022 at 12:52:51 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2022.3170157, IEEE Internet of
Things Journal
IEEE INTERNET OF THINGS JOURNAL 15
[9] K. Ali, A.X. Liu, W. Wang, and M. Shahzad, “Keystroke recognition ratio of two antennas,” Proc. ACM Interact. Mob. Wearable Ubiquitous
using WiFi signals,” in: Proceedings of the 21st Annual International Technol., vol. 3, no. 3, 2019, pp. 1-26.
Conference on Mobile Computing and Networking (MobiCom), London, [31] T. Needham, “Visual complex analysis,” Oxford University Press, 1998.
UK, 2015, pp. 1-13. [32] R.W. Schafer, “What is a Savitzky-Golay filter?” IEEE Signal Proc.
[10] Y. Zou, W. Liu, K. Wu, and L.M. Ni, “Wi-Fi radar: Recognizing human Mag., vol. 28, no. 4, 2011, pp. 111-117.
behavior with commodity Wi-Fi,” IEEE Commun. Mag., vol. 55, no. 10, [33] D. Wu, D. Zhang, C. Xu, Y. Wang, and H. Wang, “WiDir: Walking
2017, pp. 105-111. direction estimation using wireless signals,” in: Proceedings of the
[11] J. Zhang, W. Xiao, and Y. Li, “Data and knowledge twin driven 2016 ACM International Joint Conference on Pervasive and Ubiquitous
integration for large-scale device-free localization,” IEEE Internet Things Computing (UbiComp), Heidelberg, Germany, 2016, pp. 351-362.
J., vol. 8, no. 1, 2021, pp. 320-331. [34] Z. Han, Z. Lu, X. Wen, J. Zhao, L. Guo, and Y. Liu, “In-air handwriting
[12] Z. Wang, B. Guo, Z. Yu, and X. Zhou, “Wi-Fi CSI-based behavior by passive gesture tracking using commodity WiFi,” IEEE Commun. Lett.,
recogniton: From signals and actions to activies,” IEEE Commun. Mag., vol. 24, no. 11, 2020, pp. 2652-2656.
vol. 56, no. 5, 2018, pp. 109-115. [35] S. Yousefi, H. Narui, S. Dayal, S. Ermon, and S. Valaee, “A survey
[13] F. Wang, F. Zhang, C. Wu, B. Wang, and K.J.R. Liu, “Respiration on behavior recognition using WiFi channel state information,” IEEE
tracking for people counting and recognition,” IEEE Internet Things J., Commun. Mag., vol. 55, no. 10, 2017, pp. 98-104.
vol. 7, no. 6, 2020, pp. 5233-5245. [36] Y. Yang, J. Cao, X. Liu, and X. Liu, “Door-monitor: Counting in-and-
[14] D. Halperin, W. Hu, A. Sheth, and D. Wetherall, “Tool release: Gath- out visitors with COTS WiFi devices,” IEEE Internet Things J., vol. 7,
ering 802.11 n traces with channle state information,” ACM SIGCOMM no. 3, 2020, pp. 1704-1717.
Computer Commun. Rev., vol. 41, no. 1, 2011, pp. 53. [37] W. Jiang, C. Miao, F. Ma, S. Yao, Y. Wang, Y. Yuan, H. Xue, C. Song,
[15] S. Tang and J. Yang, “WiFinger: Leveraging commodity WiFi for fine- X. Ma, D. Koutsonikolas, W. Xu, and L. Su, “Towards environmnet
grained finger gesture recognition,” in: Proceedings of the 17th ACM independent device free human activity recognition,” in: Proceedings
International Symposium on Mobile Ad Hoc Networking and Computing of the 24th Annual International Conference on Mobile Computing and
(MobiHoc), Paderborn, Germany, 2016, pp. 201-210. Networking (MobiCom), New Delhi, India, 2018, pp. 1-16.
[16] O. Zhang, and K. Srinivasan, “Mudra: User-friendly fine-grained gesture [38] F. Wang, W. Gong, and J. Liu, “On spatial diversity in WiFi-based human
recognition using WiFi signals,” in: Proceedings of the 12th Internation- activity recognition: A deep learning-based approach,” IEEE Internet
al Conference on emerging Networking EXperiments and Technologies Things J., vol. 6, no. 2, 2019, pp. 2035-2047.
(CoNEXT), Irvine, CA, USA, 2016, pp. 83-96. [39] J. Zhang, Y. Li, W. Xiao, and Z. Zhang, “Robust extreme learning
[17] N. Yu, W. Wang, A.X. Liu, and L. Kong, “QGesture: Quantifying gesture machine for modeling with unknown noise,” J. Franklin Inst., vol. 357,
distance and direction with WiFi signals,” Proc. ACM Interact. Mob. no. 14, 2020, pp. 9885-9908.
Wearable Ubiquitous Technol., vol. 1, no. 4, 2018, pp. 1-22. [40] D. Wu, D. Zhang, C. Xu, H. Wang, and X. Li, “Device-free WiFi human
[18] Y. Zeng, D. Wu, J. Xiong, and D. Zhang, “Boosting WiFi sensing sensing: From pattern-based to model-based approaches,” IEEE Commun.
performance via CSI ratio,” IEEE Pervas. Comput., vol. 20, no. 1, 2021, Mag., vol. 55, no. 10, 2017, pp. 91-97.
pp. 62-70. [41] J. Zhang, Y. Li, W. Xiao, and Z. Zhang, “Non-iterative and fast deep
learning: Multilayer extreme learning machines,” J. Franklin Inst., vol.
[19] A. Virmani, and M. Shahzad, “Position and orientation agnostic gesture
357, no. 13, 2020, pp. 8925-8955.
recognition using WiFi,” in: Proceedings of the 15th ACM International
[42] T. Zheng, Z. Chen, S. Ding, and J. Luo, “Enhancing RF sensing with
Conference on Mobile System, Applications and Services (MobiSys),
deep learning: A layered approach,” IEEE Commun. Mag., vol. 59, no.
Niagara, NY, USA, 2017, pp. 252-264.
2, 2021, pp. 70-76.
[20] Y. Zheng, Y. Zhang, K. Qian, G. Zhang, Y. Liu, C. Wu, and Z. Yang,
“Zero-effort cross-domain gesture recognition with Wi-Fi,” in: Proceed-
ings of the 17th ACM International Conference on Mobile System,
Applications and Services (MobiSys), Seoul, South Korea, 2019, pp. 313-
325.
[21] R.H. Venkatnarayan, G. Page, and M. Shahzad, “Multi-user gesture
recognition using WiFi,” in: Proceedings of the 16th ACM International Jie Zhang (Member, IEEE) received the Ph.D.
Conference on Mobile System, Applications and Services (MobiSys), degree from University of Science and Technology
Munich, Germany, 2018, pp. 401-413. Beijing, China, in 2019.
[22] J. Yang, H. Zou, Y. Zhou, and L. Xie, “Learning gestures from WiFi: He was a Visiting Research Fellow with Uni-
A siamese recurrent convolutional architecture,” IEEE Internet Things J., versity of Leeds, U.K., in 2018. From 2019 to
vol. 6, no. 6, 2019, pp. 10763-10772. 2021, he was a Postdoctoral Research Fellow with
[23] L. Sun, S. Sen, D. Koutsonikolas, and K.H. Kim, “WiDraw: Enabling Peking University, China. He is currently an Asso-
hands-free drawing in the air on commodity WiFi devices,” in: Proceed- ciate Professor with the School of Automation &
ings of the 21st Annual International Conference on Mobile Computing Electrical Engineering, University of Science and
and Networking (MobiCom), Paris, France, 2015, pp. 77-89. Technology Beijing, China, and a Marie Sklodowska
[24] H. Abdelnasser, M. Youssef, and K.A. Harras, “WiGest: A ubiquitous Curie Research Fellow with the School of Electronic
WiFi-based gesture recognition system,” in: Proceedings of the 2015 and Electrical Engineering, University of Leeds, U.K. His current research
IEEE Conference on Computer Communications (INFOCOM), Hong interests include pervasive computing, body sensor network, wireless sensor
Kong, China, 2015, pp. 1472-1480. networks, and machine learning.
[25] Y. Lu, S. Lv, and X. Wang, “Towards location independent gesture Dr. Zhang was a recipient of the Marie Sklodowska-Curie Actions Individ-
recognition with commodity WiFi devices,” Electronics, vol. 8, no. 9, ual Fellowships (MSCA-IF), European Commission.
2019, pp. 1069-1088.
[26] S.D. Regani, B. Wang, and K.J.R. Liu, “WiFi-based device-free gesture
recognition through-the-wall,” in: Proceedings of the 2011 IEEE Interna-
tional Conference on Acoustics, Speech and Signal Processing (ICASSP),
Toronto, ON, Canada, 2021, pp. 8017-8021.
[27] X. Ma, Z. Zhao, L. Zhang, Q. Gao, M. Pan, and J. Wang, “Practical Yang Li received the B.E. degree from Northwest-
device-free gesture recognition using WiFi signals based on metalearn- ern Polytechnical University, China, in 2020. He
ing,” IEEE Trans. Ind. Inform., vol. 16, no. 1, 2020, pp. 228-237. is currently a Ph.D. student with the School of
[28] C. Wu, Z. Yang, Z. Zhou, X. Liu, Y. Liu, and J. Cao, “Non-invasive Computer Science, Peking University, China. His
detection of moving and stationary human with WiFi,” IEEE J. Sel. Areas research interest includes wireless sensing.
Commun., vol. 33, no. 11, 2015, pp. 2329-2342.
[29] D. Wu, R. Gao, Y. Zeng, J. Liu, L. Wang, T. Gu, and D. Zhang,
“FingerDraw: Sub-wavelength level finger motion tracking with WiFi
signals,” Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., vol.
4, no. 1, 2020, pp. 1-27.
[30] Y. Zeng, D. Wu, J. Xiong, E. Yi, R. Gao, and D. Zhang, “FarSense:
Pushing the range limit of WiFi-based respiration sensing with CSI
2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: BEIJING INSTITUTE OF TECHNOLOGY. Downloaded on May 04,2022 at 12:52:51 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JIOT.2022.3170157, IEEE Internet of
Things Journal
IEEE INTERNET OF THINGS JOURNAL 16
Haoyi Xiong (Senior Member, IEEE) received Ph.D Daqing Zhang (Fellow, IEEE) received the Ph.D.
in computer science from Télécom SudParis and degree from the University of Rome “La Sapienza”,
Université Pierre et Marie Curie, France in 2015. Italy, in 1996.
From 2016 to 2018, he was a Tenure-Track Assistant He is currently a Chair Professor with the School
Professor with the Department of Computer Science, of Computer Science, Peking University, China,
Missouri University of Science and Technology, and Telecom SudParis, France. His current research
Rolla, MO, and a Postdoc at University of Virginia, interests include context-aware computing, urban
Charlottesville, VA, USA, from 2015 to 2016. He computing, mobile computing, big data analytics,
is currently a Principal Architect at Big Data Lab, and pervasive elderly care. He has published over
Baidu Inc., Beijing, China, and also a Graduate 300 technical papers in leading conferences and
Faculty Scholar affiliated to University of Central journals.
Florida, Orlando FL, USA. Prof. Zhang was a recipient of the Ten-Years CoMoRea Impact Paper Award
His current research interests include AutoDL and ubiquitous computing. at IEEE PerCom 2013, the Honorable Mention Award at ACM UbiComp 2015
He has published more than 70 papers in top computer science conferences and 2016, the Ten-Years Most Influential Paper Award at IEEE UIC 2019,
and journals, including UbiComp, ICML, ICLR, RTSS, KDD, AAAI/IJCAI, and the Distinguished Paper Award at ACM UbiComp 2021. He served as
TMC, TC, TKDE, TNNLS, IoTJ, and TKDD. He was a co-recipient of the the General or Program Chair for over 17 international conferences, giving
2020 IEEE TCSC Award for Excellence in Scalable Computing (Early Career keynote talks at more than 20 international conferences. He is an Associate
Researcher) and the prestigious Science & Technology Advancement Award Editor for IEEE Pervasive Computing, ACM Transactions on Intelligent
(First Prize) from Chinese Institute of Electronics in 2019. Systems and Technology, and Proceedings of the ACM on Interactive, Mobile,
Wearable and Ubiquitous Technologies.
2327-4662 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: BEIJING INSTITUTE OF TECHNOLOGY. Downloaded on May 04,2022 at 12:52:51 UTC from IEEE Xplore. Restrictions apply.