
Differentially private depth functions and their associated medians

Kelly Ramsay, Shoja’eddin Chenouri


April 2020

arXiv:2101.02800v1 [math.ST] 7 Jan 2021

Abstract
In this paper, we investigate differentially private estimation of data depth functions and their
associated medians. We begin with several methods for privatizing depth values at a fixed point, and
show that for some depth functions, when the depth is computed at an out-of-sample point, privacy
can be gained for free as n → ∞. We also present a method for privately estimating the vector of
sample depth values, and show that in this case privacy is not gained for free asymptotically. We introduce
estimation methods for depth-based medians, both for depth functions with low global sensitivity and
for depth functions whose local sensitivity is low with high probability. We provide a general result (Lemma
1) which can be used to prove consistency of an estimator produced by the exponential mechanism,
provided the asymptotic cost function is uniquely minimized and sufficiently smooth. We introduce a
general algorithm to privately estimate minimizers of a cost function which has low local sensitivity but
high global sensitivity; this algorithm combines propose-test-release with the exponential mechanism.
An application of this algorithm to generating consistent estimates of the projection depth-based median
is presented. For these private depth-based medians, we show that it is possible for privacy to be free
as n → ∞.

Keywords— Differential Privacy, Depth function, Multivariate Median, Propose-test-release

1 Introduction
There is a large body of literature showing that simply removing identifying information about subjects from a
database is not enough to ensure data privacy [see Dwork et al., 2017, and the references therein]. One can learn
a surprising amount about a database even if only certain summary statistics are released [Dwork et al., 2017], due
in part to the wide variety of publicly available information. In contrast, if a statistic is differentially
private, an adversary cannot learn about the attributes of specific individuals in the original database, regardless
of the amount of initial information the adversary possesses. This property, coupled with the lack of assumptions
needed on the data itself to ensure privacy, accounts for the volume of recent literature on differentially private
statistics. There is growing interest in the statistical community in private inference [see, e.g., Wasserman and Zhou,
2010, Awan et al., 2019, Cai et al., 2019, Brunel and Avella-Medina, 2020]. One burgeoning area is the
connection between robust statistics and differentially private statistics, first discussed by Dwork and Lei [2009].
Private M-estimators were studied by Lei [2011] and Avella-Medina [2019]. The connection between private estimators
and gross error sensitivity was formalized by Chaudhuri and Hsu [2012], who present upper and lower bounds on the
convergence of private estimators in relation to their gross error sensitivity; this connection is further exploited
to construct differentially private statistics in [Avella-Medina, 2019]. Brunel and Avella-Medina [2020] greatly
expanded the propose-test-release paradigm of Dwork and Lei [2009] using the concept of the finite sample breakdown
point, and the same authors use this idea to construct private median estimators with sub-Gaussian errors
[Avella-Medina and Brunel, 2019]. Our present work is inspired by these recent papers: we explore the privatization
of depth functions, a robust and nonparametric data analysis tool. Given the recent success of robust procedures in
the private setting, it is worthwhile to develop and study privatized depth functions and their associated medians.
Depth functions facilitate the extension of, among other things, medians and rank statistics to the multivariate
setting. The robustness properties of depth functions, including breakdown points and gross error sensitivities, are well
studied and favourable [Romanazzi, 2001, Chen and Tyler, 2002, Zuo, 2004, Dang et al., 2009], making them a
promising direction of study for use in the private setting. We take some of the first steps in privatizing depth-based
inference. The contributions are as follows:

• We present several approaches for the privatization of depth functions, including a discussion of advantages
and disadvantages of each approach.
• We present algorithms for the private release of sample depth values of several popular depth functions. These
include halfspace depth [Tukey, 1974], simplicial depth [Liu, 1990], IRW depth [Ramsay et al., 2019] and
projection depth [Zuo, 2003]. Our algorithms and analysis can also be applied to depth functions with similar
characteristics. We present asymptotic results concerning these private depth value estimates.
• We present algorithms for generating consistent, private depth-based medians, using the exponential mechanism
and the propose-test-release framework of [Dwork and Lei, 2009, Brunel and Avella-Medina, 2020].
• As a result of the privatization of depth-based medians, we extend the propose-test-release algorithm of Brunel
and Avella-Medina [2020] to be used with the exponential mechanism. We present a general algorithm for
releasing a private maximizer (or minimizer) of an objective function which has infinite global sensitivity but,
with high probability, low local sensitivity.
• We present a general theorem that can be used to prove weak consistency of estimators generated from the
exponential mechanism, even when the cost function is not necessarily differentiable.
Some work has been done surrounding the private computation of halfspace depth regions and the halfspace median
[Beimel et al., 2019, Gao and Sheffet, 2020] from a computational geometry point of view. Though Beimel et al.
[2019] mentions that the Tukey depth function can be used with the exponential mechanism, they do not study the
estimator’s properties from a statistical point of view; it is used as a method of finding a point in the convex hull of
a set of points. To the best of our knowledge, no one has attempted to privatize other depth functions.

2 Differential Privacy
In this section we introduce the fundamentals of differential privacy. Two essential concepts are that of a mechanism
and that of adjacent databases. In order for a statistic (or database) to be differentially private, it must be
stochastically computed [Dwork and Roth, 2014]. This differs from typical data analysis in that all differentially private
statistics are generated from a distribution T̃(Xn) ∼ QXn rather than being deterministically computed. In other
words, all differentially private statistics T̃(Xn) admit measures QXn given the data Xn. We assume here that the data
is a random sample of size n such that each observation is in R^d; we will denote the sample by Xn = {X1, . . . , Xn}.
We call the procedure that determines QXn and then outputs T̃(Xn) ∼ QXn a mechanism. We may also refer to the
mechanism by T̃, with an abuse of notation.
Along with mechanisms, we also must define adjacent databases. We say that Xn and Yn (another random
sample of size n) are adjacent if they differ by one observation. In other words, Xn and Yn are adjacent if they have
symmetric difference equal to one. Equipped with these concepts, we can define differential privacy:

Definition 1. A mechanism T̃ is ε-differentially private for ε > 0 if the following holds for all measurable sets B
and adjacent Xn and Yn:

QXn(B) / QYn(B) ≤ e^ε.    (1)

The parameter ε should be small, implying that

QXn(B) / QYn(B) ≈ 1,

which gives the interpretation that the two measures QXn and QYn are almost equivalent. To understand this
definition, it helps to think of the problem from the adversary’s point of view. Suppose that we are the adversary
and that we have access to all the entries in the database except for one, call it θ, which we are trying to learn about.
If T̃ is released, how can we use it to conduct inference about θ? In order to test

H0 : θ = θ0 vs. H1 : θ = θ1

we, as statisticians, would then ask two questions:

How likely was it to observe T̃ under H0 ? and How likely was it to observe T̃ under H1 ?

Differential privacy stipulates that both of these questions have practically the same answer, making it impossible to
infer anything about θ from T̃. Definition 1 implies that if someone in the dataset was replaced, we are just as likely to
have seen T̃ (or some value very close to T̃ if QXn is continuous). Another way to interpret the definition is to observe
that differential privacy implies that KL(QXn, QYn) ≤ ε, where KL is the Kullback–Leibler divergence, implying that
the distributions are necessarily close. Differential privacy is a worst case restriction, in that the inequality covers all
databases and all possible outcomes of the mechanism.
Definition 1 can be difficult to satisfy because the umbrella of ‘all databases and mechanism outputs’ can include
both extreme databases and extreme mechanism outputs. One may wish to relax this definition over unlikely
mechanism outputs; for instance, if B is such that QXn(B) is very small, then the bound could be allowed
to fail. This relaxation is called approximate differential privacy, or (ε, δ)-differential privacy, in which we have

QXn(B) ≤ e^ε QYn(B) + δ    (2)

in place of condition (1). Typically δ ≪ ε, and δ can be interpreted as the probability with which the bound
is allowed to fail. To see this, observe that for B such that QXn(B) < δ, (2) holds regardless of ε.
Central to many private algorithms is the concept of sensitivity. Consider some function S : (R^d)^n → R^k, where
(R^d)^n denotes the sample space. Usually S represents a statistic or a data driven objective function. There are three
main types of sensitivity: local sensitivity, global sensitivity and smooth sensitivity:

LS(S; Xn) = sup_{Yn} ||S(Xn) − S(Yn)||,

GS(S) = sup_{Xn, Yn} ||S(Xn) − S(Yn)||,

SS(S; Xn, β) = sup_{Yn ∈ (R^d)^n} [ e^{−β sd(Xn, Yn)} LS(S; Yn) ],

where sd represents the symmetric difference between two databases. In some cases it is necessary to use different
norms, and so we add a subscript, GS_p, to indicate global sensitivity computed with respect to the p-norm. With
sensitivity understood, we now introduce some important building blocks of differentially private algorithms. Let
W1, . . . , Wk, . . . and Z1, . . . , Zk, . . . denote sequences of independent standard Laplace random variables and
independent standard Gaussian random variables, respectively. The Laplace and Gaussian mechanisms
are essential differentially private mechanisms; they define how much an estimator must be perturbed in order for it
to be differentially private.

Mechanism 1 (Dwork et al. [2006]). For a statistic T : (R^d)^n → R^k and some ε > 0, the mechanism that outputs

T̃(Xn) = T(Xn) + (W1, . . . , Wk) GS_1(T)/ε

is ε-differentially private.
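As a concrete sketch of Mechanism 1, the helper below (our own, not from the paper) adds standard Laplace noise scaled by a caller-supplied L1 global sensitivity:

```python
import numpy as np

def laplace_mechanism(t, gs1, eps, rng=None):
    """Mechanism 1 sketch: add standard Laplace noise W_1, ..., W_k
    scaled by GS_1(T)/eps to a k-dimensional statistic t."""
    rng = np.random.default_rng() if rng is None else rng
    t = np.atleast_1d(np.asarray(t, dtype=float))
    w = rng.laplace(loc=0.0, scale=1.0, size=t.shape)  # standard Laplace draws
    return t + w * gs1 / eps

# Example: a private mean of n values known to lie in [0, 1],
# so that GS_1(mean) = 1/n.
x = np.linspace(0.0, 1.0, 101)
private_mean = laplace_mechanism(x.mean(), gs1=1 / len(x), eps=1.0)
```

Note that the noise scale GS_1(T)/ε vanishes as n grows whenever the sensitivity is O(1/n), which is the situation exploited throughout the paper.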

Mechanism 2 (Dwork et al. [2006], Dwork and Roth [2014]). For a statistic T : (R^d)^n → R^k and some ε, δ > 0, the
mechanism that outputs

T̃(Xn) = T(Xn) + (Z1, . . . , Zk) √(2 log(1.25/δ)) GS_2(T)/ε

is (ε, δ)-differentially private.

This can be improved in strict privacy scenarios [Balle and Wang, 2018]. We can also add noise based on smooth
sensitivity [Nissim et al., 2007]; using the smooth sensitivity allows the user to leverage improbable worst case local
sensitivities. Often in practice, statistics are computed by maximizing a data driven objective function φXn(·). We
can privatize such a procedure via the exponential mechanism, which can be defined as follows:

Mechanism 3 (McSherry and Talwar [2007]). Given the data, consider a function φXn : R^d → R and define the
global sensitivity of such a function as GS(φ) = sup_{Xn, Yn} ||φXn − φYn||_∞. Then a random draw from the density
f(y; φXn, ε) that satisfies

f(y; φXn, ε) ∝ exp( ε φXn(y) / (2 GS(φ)) )

is an ε-differentially private mechanism. It is assumed that the following is satisfied:

∫_{R^d} exp( ε φXn(y) / (2 GS(φ)) ) dy < ∞.
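For intuition, here is a minimal sketch of Mechanism 3 over a finite candidate set (in practice y ranges over R^d and sampling requires more care); the function name and interface are our own:

```python
import numpy as np

def exponential_mechanism(candidates, utilities, gs_phi, eps, rng=None):
    """Draw one candidate with probability proportional to
    exp(eps * phi_Xn(y) / (2 GS(phi))), as in Mechanism 3."""
    rng = np.random.default_rng() if rng is None else rng
    logits = eps * np.asarray(utilities, dtype=float) / (2.0 * gs_phi)
    logits -= logits.max()            # stabilize before exponentiating
    p = np.exp(logits)
    p /= p.sum()                      # normalize to a probability vector
    return candidates[rng.choice(len(candidates), p=p)]
```

Larger ε concentrates the draw on high-utility candidates, while as ε → 0 the draw approaches the uniform distribution over candidates.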

The factor of 2 can be removed if the normalizing term is independent of the sample. All of the mechanisms
discussed so far require that the statistic has finite global sensitivity. This is a somewhat strict requirement; under
the normal model, neither the sample mean nor the sample median has finite global sensitivity. The sample median
does, however, have low local sensitivity, viz.

LS(Med(Xn)) ≤ |Fn^{−1}(1/2 − 1/n) − Fn^{−1}(1/2 + 1/n)|,

where F −1 refers to the left continuous quantile function for a distribution F and Fn is the empirical distribution
of a univariate sample Xn . Since 1/n → 0, we expect this value to be small (assuming the sample comes from a
distribution which is continuous at its median). Throughout the paper we may also write the median as a function of
the distribution Med(F ), where it is understood that Med(Xn ) and Med(Fn ) are equivalent. The propose-test-release
mechanism, or PTR, can be used to generate private versions of statistics with infinite global sensitivity but highly
probable low local sensitivity. The propose-test-release idea was introduced by Dwork and Lei [2009] but was greatly
expanded in the recent paper by Brunel and Avella-Medina [2020]. The PTR algorithm of Brunel and Avella-Medina
[2020] relies on the truncated breakdown point Aη , which is the minimum number of points that must be changed in
order to move an estimator by η:
Aη(S; Xn) = min{ k : sup_{Yn ∈ D(Xn, k)} |S(Xn) − S(Yn)| > η },    (3)

where D(Xn , k) is the set of all samples that differ from Xn by k observations. Unlike the traditional breakdown
point, the dependence of Aη (S; Xn ) on Xn is important. PTR works by proposing a statistic, testing if it is insensitive
and then releasing it if it is, in fact, insensitive. A private version of Aη (S; Xn ) is used to check the sensitivity.

Mechanism 4. For a statistic T : (R^d)^n → R^k and some ε, δ > 0, the mechanism that outputs

T̃(Xn) = ⊥,                    if Aη(T; Xn) + W1/ε ≤ 1 + log(2/δ)/ε,
T̃(Xn) = T(Xn) + η W2/ε,       otherwise,

is (2ε, δ)-differentially private, and the statistic

T̃(Xn) = ⊥,                                   if Aη(T; Xn) + √(2 log(1.25/δ)) Z1/ε ≤ 1 + 2 log(1.25/δ)/ε,
T̃(Xn) = T(Xn) + η √(2 log(1.25/δ)) Z2/ε,     otherwise,

is (2ε, 2e^ε δ + δ²)-differentially private.

The release of ⊥ means that the dataset was too sensitive for the statistic to be released. The goal is to use T
such that releasing ⊥ is incredibly unlikely; Aη(T; Xn) should be large with high probability.
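To make Mechanism 4 concrete, the following sketch applies the Laplace variant to the univariate median; the helper names and the simple order-statistic formula for Aη are ours, not the paper's:

```python
import numpy as np

def a_eta_median(x, eta):
    """Truncated breakdown point A_eta for the univariate median: the
    least k such that changing k observations can move the median by
    more than eta (computed via order statistics)."""
    xs = np.sort(np.asarray(x, dtype=float))
    n = len(xs)
    m = (n - 1) // 2                  # index of the (lower) sample median
    for k in range(1, n + 1):
        if m + k > n - 1 or m - k < 0:
            return k                  # median can be pushed arbitrarily far
        if xs[m + k] - xs[m] > eta or xs[m] - xs[m - k] > eta:
            return k
    return n

def ptr_median(x, eta, eps, delta, rng=None):
    """Mechanism 4 (Laplace variant) sketch: release a noisy median only
    if a privatized A_eta passes the sensitivity test; (2 eps, delta)-DP."""
    rng = np.random.default_rng() if rng is None else rng
    a_noisy = a_eta_median(x, eta) + rng.laplace(scale=1.0) / eps
    if a_noisy <= 1 + np.log(2.0 / delta) / eps:
        return None                   # the "bottom" symbol: refuse to release
    return float(np.median(x)) + eta * rng.laplace(scale=1.0) / eps
```

For a tightly clustered sample and a moderate η, Aη is large and the test passes with overwhelming probability, so the mechanism almost always releases.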
All of the mechanisms discussed thus far can be combined to produce more sophisticated algorithms; combining
two or more mechanisms is called composition. One type of composition is computing functions (which do not
depend on the data) of differentially private statistics or databases; these are also differentially private. It is also true
that sums and products of k differentially private procedures, each with privacy budget ε_i, are Σ_{i=1}^k ε_i-differentially
private [Dwork et al., 2006]. This can be improved with advanced composition [Dwork and Roth, 2014].
Theorem 1 (Dwork and Roth [2014]). For given 0 < ε < 1 and δ′ > 0, the composition of k mechanisms which are
each ( ε / (2√(2k log(1/δ′))), δ )-differentially private is (ε, kδ + δ′)-differentially private.
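As a quick numerical illustration of Theorem 1, the helper below (our own, not from the paper) computes the per-mechanism budget needed so that k mechanisms compose to a total of (ε, kδ + δ′)-differential privacy:

```python
import math

def per_mechanism_eps(total_eps, k, delta_prime):
    """Advanced composition (Theorem 1): each of k mechanisms may use
    budget total_eps / (2 sqrt(2 k log(1/delta_prime)))."""
    return total_eps / (2.0 * math.sqrt(2.0 * k * math.log(1.0 / delta_prime)))

# With k = 100 queries, total eps = 1 and delta' = 1e-5, each query may
# use slightly more budget than the naive split of eps/k, and the
# advantage grows with k.
eps_i = per_mechanism_eps(1.0, 100, 1e-5)
```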

3 Data Depth
A data depth function is a robust, nonparametric tool used for a variety of inference procedures in multivariate
spaces, as well as more general spaces. A data depth function gives meaning to centrality, order and outlyingness
in spaces beyond R. Data depth functions do this by giving all points in the working space a rating based on how
central the point is in the sample. Precisely, we can write multivariate depth functions as D : Rd × Fn → R+ ; given
the empirical distribution of a sample Fn and a point in the domain, the depth function assigns a real valued depth
to that point. Writing depth functions as functions of the empirical distribution D(·; Fn ) rather than functions of
the sample, say, D(·; Xn), provides a natural definition for population values of depth: D(·; F). If we let F be the set
of distributions on R^d, we can write depth functions as D : R^d × F → R^+. Figure 1(a) shows a sample of 20 points
labelled by their depth values; we can see that the points in the centre of the data cloud have larger values. Note
that it is not necessary to restrict the domain of the depth function to points in the sample; we can compute depth

Figure 1: (a) Sample halfspace depth values, i.e., D(Xi ; Fn ), are displayed in white text. The heatmap
of the sample depth function, i.e., D(·; Fn ), is also displayed. This sample is drawn from a standard, two
dimensional normal distribution. (b) Theoretical halfspace depth contours for the standard, two dimensional
normal distribution.

values for each point in the sample space. The heatmap in Figure 1(a) gives the depth value for each point in the
plot. Figure 1(b) gives the corresponding theoretical depth values when F is the two dimensional standard normal
distribution.
Depth functions provide an immediate definition of order statistics; observations can be ordered by their depth
values. However, since the ordering of the sample is centre-outward, the depth-based order statistics have a different
interpretation than univariate order statistics. Nevertheless, data depth-based order can be used to define multivariate
analogues of many univariate, nonparametric inference procedures. For example, the definition of the depth-based
median is:

Med(F; D) = argmax_{x ∈ R^d} D(x; F).

These medians are usually robust, in the sense that they are not heavily affected by outliers. Many depth-based medians
have a high breakdown point and favourable properties related to the influence function [Chen and Tyler, 2002, Zuo,
2004]. Furthermore, depth-based medians inherit any transformation invariance properties possessed by the depth
function. We can subsequently define sample depth ranks as

Ri = #{Xj : D(Xj; Fn) ≤ D(Xi; Fn)},

which are the building blocks of various multivariate depth-based rank tests [Liu and Singh, 1993, Serfling, 2002,
Chenouri et al., 2011], as well as providing a method to construct trimmed means [Zuo, 2002]. Depth values can also
be used directly in testing procedures [Li and Liu, 2004]. We have also seen depth functions used for visualization,
including the bivariate extension of the boxplot (bagplots) and dd-plots, which allow the analysts to visually compare
two samples of any dimension [Liu et al., 1999, Li and Liu, 2004]. In the same vein of data exploration, we can
visualise multivariate distributions through one dimensional curves based on depth values [Liu et al., 1999]. Such
curves describe scale, kurtosis, skew ect. . In the past decade this depth-based inference framework has expanded to
include solutions to clustering [Jörnsten, 2004, Baidari and Patil, 2019], classification [Jörnsten, 2004, Lange et al.,
2014], outlier detection [Chen et al., 2009, Cárdenas-Montes, 2014], process monitoring [Liu, 1995], change-point
problems [Chenouri et al., 2019] and discriminant analysis [Chakraborti and Graham, 2019]. In summary, depth
functions facilitate a framework for robust, nonparametric inference in R^d. A major motivating factor for this work
is that by privatizing depth functions, we consequently privatize many of the procedures in this framework. This
means that private depth values imply access to private procedures for nonparametric location and scale estimation,
rank tests, classification and more.

In their seminal paper Zuo and Serfling [2000] give a concrete set of mathematical properties which a multivariate
depth function should satisfy in order to be considered a statistical depth function. These properties include:
1. Affine invariance: This implies any depth based analysis is independent of the coordinate system, particularly
the scales used to measure the data.
2. Maximality at centres of symmetry: If a distribution is symmetric about a point, then surely this point should
be regarded as the most central point.
3. Decreasing along rays: This property ensures that as one moves away from the deepest point, the depth
decreases.
4. Vanishing at infinity: As a point moves toward infinity along some ray, the depth vanishes.
A function D : Rd × F → R+ which satisfies these four properties is known as a statistical depth function. The last
three properties are all related to centrality, where the first is to ensure there is no dependence on the measurement
system. Not all popular depth functions satisfy all four of these properties, but they typically satisfy most of them.
Affine invariance, as discussed previously, ensures that the function is not dependent on the coordinate system, which
from a practical point of view means that the measurement scales can be adjusted freely. Maximality at centre means
that if a distribution is symmetric about some point θ, the depth is maximal at that point. Think of the median
coinciding with the mean in the univariate case. Decreasing along rays means that as one moves away from the
deepest point along some ray, i.e., moves away from the centre, the depth decreases. Vanishing at infinity means that
as a point moves along a ray to infinity, its depth approaches 0. Note that if a depth function does not satisfy all
four of these properties, it is not necessarily invalid or useless in data analysis; this is merely a
limitation to consider.
Aside from coordinate invariance and centrality, there are other properties that are desirable for a depth function
to satisfy. We shall list the main ones here:
• Robustness: A robust depth function implies subsequent inference will be robust, and may make it more
amenable to privatization.
• Consistency/Limiting Distribution: Consistency for a population depth value and existence of a limiting dis-
tribution is useful for developing inference procedures.
• Continuity: It can be a building block for consistency and for optimizing the depth function.
• Computation: In order to apply depth-based inference, it is necessary that the depth values are computed
quickly. Specifically, being able to compute or approximate the depth values in polynomial time with respect
to both d and n is useful.
On top of having these properties, a depth function that is to be used in the private setting should be insensitive. In
other words, the depth function should have low global sensitivity and/or highly probable, low local sensitivity.
We can now introduce several depth functions, and evaluate their sensitivities. The first depth function we will
discuss is halfspace depth [Tukey, 1974].
Definition 2 (Halfspace depth). Let S^{d−1} = {x : x ∈ R^d, ||x|| = 1} be the set of unit vectors in R^d. Define the
halfspace depth HD of a point x ∈ R^d with respect to some distribution X ∼ F as

HD(x; F) = inf_{u ∈ S^{d−1}} Pr( X^⊤u ≤ x^⊤u ).

Halfspace depth is the minimum of the projected mass above and below the projection of x, over all univariate
projections. We can interpret the sample depth of some point x as the minimum normalised univariate centre-
outward rank of x's projection amongst the sample's projections, over all univariate directions. Therefore, if a point
is exchanged, all the ranks are shifted by at most one, and the global sensitivity of the unnormalised halfspace depth
is 1. We get GS(HD) = 1/n, which leads us to conclude that this depth function is relatively insensitive. In terms
of known properties, halfspace depth is a statistical depth function as defined above. Its sample depth function
is also uniformly consistent [Massé, 2004], and halfspace depth characterizes the distribution in certain cases
[Nagy, 2018]. Halfspace depth is frequently cited as being computationally complex [Serfling, 2006]; however, an
algorithm for computing halfspace depth in high dimensions has recently been proposed [Zuo, 2019]. This function
is also upper semi-continuous [Zuo and Serfling, 2000].
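The sample version of Definition 2 is easy to approximate by searching over random directions; the following sketch (our own implementation, with the direction count as a tuning parameter) also makes the GS(HD) = 1/n claim tangible, since each directional tail count changes by at most one when a single observation is exchanged:

```python
import numpy as np

def halfspace_depth(x, data, n_dirs=500, rng=None):
    """Approximate sample halfspace depth of x in R^d: minimize the
    projected tail mass over n_dirs random unit directions."""
    rng = np.random.default_rng(0) if rng is None else rng
    data = np.asarray(data, dtype=float)
    u = rng.standard_normal((n_dirs, data.shape[1]))
    u /= np.linalg.norm(u, axis=1, keepdims=True)   # directions on S^{d-1}
    proj_x = u @ np.asarray(x, dtype=float)         # x's projections
    proj_data = data @ u.T                          # (n, n_dirs) sample projections
    below = (proj_data <= proj_x).mean(axis=0)      # mass at or below x, per direction
    above = (proj_data >= proj_x).mean(axis=0)      # mass at or above x, per direction
    return float(min(below.min(), above.min()))
```

Since the search is over finitely many directions, this returns an upper bound on the exact sample depth; central points receive visibly larger values than extreme points.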
We can replace the minimum in Definition 2 with an average [Ramsay et al., 2019].

Definition 3 (IRW depth). Define the integrated rank-weighted depth as

IRW(x; F) = ∫_{S^{d−1}} min( Pr(X^⊤u ≤ x^⊤u), 1 − Pr(X^⊤u < x^⊤u) ) dν(u),

where ν is the uniform measure on S^{d−1}.

It immediately follows from the discussion on the sensitivity of halfspace depth that GS(IRW) = 1/n; this
depth function has the interpretation of the average normalised univariate centre-outward rank over all projections.
Therefore, IRW depth is also insensitive. Aside from being insensitive, IRW depth also vanishes at infinity and
is maximal at points of symmetry. It is invariant under similarity transformations, which is weaker than affine
invariance. It is conjectured that this function also has the decreasing along rays property. This depth function
is also continuous, and can be approximately computed very quickly [Ramsay et al., 2019]. This depth function’s
sample depths are also uniformly consistent and asymptotically normal under mild assumptions.
Another asymptotically normal depth function is simplicial depth, which was introduced by Liu [1988].

Definition 4 (Simplicial Depth.). Suppose that Y1 , . . . , Yd+1 are i.i.d. from F . Define simplicial depth as

SMD(x; F ) = Pr(x ∈ ∆(Y1 , . . . , Yd+1 )),

where ∆(Y1 , . . . , Yd+1 ) is the simplex with vertices Y1 , . . . , Yd+1 .

We can also show that sample simplicial depth has finite sensitivity. Note that

SMD(x; Fn) = (n choose d+1)^{−1} Σ_{1 ≤ i_1 < · · · < i_{d+1} ≤ n} 1{ x ∈ ∆(X_{i_1}, . . . , X_{i_{d+1}}) }.

Changing one observation can influence a maximum of (n−1 choose d) terms, and each term has a sensitivity of 1. It follows
that GS(SMD) = (n−1 choose d) / (n choose d+1) = (d + 1)/n.
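For d = 2, the sample version of Definition 4 can be computed by direct enumeration; the sketch below (our own, O(n^3) and for illustration only) counts points on a triangle's boundary as inside:

```python
import numpy as np
from itertools import combinations

def _in_triangle(p, a, b, c):
    """Orientation test: p lies in the closed triangle abc iff the three
    cross products share a sign (zero counts as on the boundary)."""
    def cross(o, u, v):
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])
    s1, s2, s3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    return (s1 >= 0 and s2 >= 0 and s3 >= 0) or (s1 <= 0 and s2 <= 0 and s3 <= 0)

def simplicial_depth(x, data):
    """Sample simplicial depth in R^2: the fraction of the
    (n choose 3) data triangles that contain x."""
    pts = [tuple(map(float, p)) for p in data]
    triangles = list(combinations(pts, 3))
    return sum(_in_triangle(tuple(x), *t) for t in triangles) / len(triangles)
```

Exchanging one data point perturbs at most (n−1 choose 2) of the (n choose 3) indicator terms, which recovers the (d + 1)/n = 3/n sensitivity bound above for d = 2.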
Although it is insensitive, this depth function can be difficult to compute in even moderate dimensions (d > 3).
Simplicial depth is a statistical depth function if F is angularly symmetric, but fails to satisfy the maximality at
centre and decreasing along rays for some discrete distributions. This is not usually a major concern, seeing as we
are typically concerned with continuous distributions in a depth-based inference context [Zuo and Serfling, 2000, Liu,
1988].
The investigation of Zuo and Serfling [2000] led to the study of a general and powerful statistical depth function
based on outlyingness functions. Outlyingness functions O : R^d × F → R^+ measure the degree of outlyingness of a
point. A particular version of depth based on outlyingness is projection depth.

Definition 5 (Projection depth). Given a univariate, translation and scale equivariant location measure µ and a
univariate, scale equivariant and translation invariant measure of scale ς, we can define the projected outlyingness as

O(x; F; µ, ς) = sup_{u ∈ S^{d−1}} |u^⊤x − µ(F_u)| / ς(F_u),

and thus projection depth as

PD(x; F; µ, ς) = 1 / (1 + O(x; F; µ, ς)).

Typically, µ and ς are taken to be the median and the median absolute deviation, but properties have been investigated
for general µ and ς. One idea is to choose µ and ς such that O(x; Fn) has low global sensitivity, but that is left to
later work. Here, we will use either

O_1(x; Fn) = O(x; Fn; Med, MAD) = sup_{||u||=1} |u^⊤x − Med(X_n^⊤u)| / MAD(X_n^⊤u),

or

O_2(x; Fn) = O(x; Fn; Med, IQR) = sup_{||u||=1} |u^⊤x − Med(X_n^⊤u)| / IQR(X_n^⊤u).

The global sensitivities of O_1 and O_2 are unbounded, but they have bounded local sensitivities, making projection depth
a good candidate for the propose-test-release procedure. Since the range of projection depth is (0, 1), its global
sensitivity is 1. However, a global sensitivity spanning the range of the depth is not desirable, so we hope to make
improvements by using a propose-test-release algorithm. Note that we use a slight abuse of notation, where X_n^⊤u
refers to the sample {X_1^⊤u, . . . , X_n^⊤u}. We may also refer to the empirical distribution implied by this sample as
F_{n,u}. A thorough investigation of the properties of projection depth was carried out in the successive papers [Zuo, 2003,
2004]. As a result of these papers, it has been shown that projection depth is a statistical depth function; it also has
a limiting distribution and is quite robust against outliers.

4 Private Data Depth
There are several ways in which we could approach privatizing depth functions. A natural and easy way to do
this is to start with a differentially private estimate F̃n of the distribution of the data and use D̃(x; F̃n), which is
differentially private. Computing F̃n relies on existing methods for generating private multidimensional empirical
distribution functions. This method fails to take advantage of any local robustness properties of depth functions; it
does not leverage the low local and global sensitivities of the depth function itself. This method also does
not give a way to compute the sample depth values D(X1; Fn), since D(X1; F̃n) is not private. Computing the
sample depth values is often part of depth-based inference [see, e.g., Li and Liu, 2004, Lange et al., 2014]. In this
paper, we aim to study the advantages of the robustness properties of depth functions in the private setting, and so
we forgo study of D̃(x; F̃n).
If the global sensitivity of D is finite, then an obvious private estimate is

D̃(x; Fn) = D(x; Fn) + V_{δ,ε} GS(D),

where V_{δ,ε} is independent noise from the Laplace or Gaussian distribution, with scale calibrated to ensure privacy.
If a depth function has infinite or large global sensitivity, then we can leverage some robustness properties, such as a
high breakdown point, to show that the depth function generally has low local sensitivity and use PTR.
We can also produce a more direct privatized estimate of D(·; F) based on the sample, as is done with histogram
bins [Wasserman and Zhou, 2010]. For example, many depth functions are defined in terms of functions of
projections h(·; X_n^⊤u), u ∈ U_n, where U_n ⊂ S^{d−1} is some set of directions. We could then produce private
versions of X_n^⊤u, or private versions of h(·; X_n^⊤u), if they are insensitive. The advantage of this approach would be that
the entire depth function could be privatized at once, including the sample depth values. In the same vein, recalling
that ν is the uniform measure on S^{d−1}, there exists an image measure µ_{x,Fn}(A) = ν(h^{−1}(A; x, Fn)) on the Borel sets
of the range of the depth, A ∈ B(I_D). If µ_{x,Fn} is insensitive, then we can construct a differentially private estimator
based on random draws from µ_{x,Fn}. This approach is somewhat complicated, and could be tedious since we must
set up a sampler for each x at which we want to compute a depth value. We leave these projection type approaches
for future research.
From the discussion above, it is clear that a key question is: at which points would we like to estimate depth
values? Algorithms which estimate the depth of a single point are of course of interest; they can be composed to
compute depth values at several points privately. Additionally, simple algorithms to compute the depth of a single
point can be used as building blocks for private versions of depth-based inference procedures. As mentioned previously,
it is also of interest to compute the depth values of the sample points:

D̂(Fn) := (D(X1; Fn), D(X2; Fn), . . . , D(Xn; Fn)).

Since Xi appears in both arguments, the sensitivity of D(Xi; Fn) is larger than that of D(x; Fn) for x ∉ Xn. We
investigate private methods of estimating this vector of depth values. A further question is whether or not we
can estimate several depth values from different samples simultaneously, e.g., for use in depth-based clustering. To
elaborate, if Xn contains the samples for J groups, then Xn = X_n^1 ∪ . . . ∪ X_n^J. For example, if we privatize the one
dimensional projections of the entire sample, X_n^⊤u, we can then compute the depth of each point in Xn with respect
to each group X_n^j.
An important question is how well the privatized inference procedures perform compared to their non-private
counterparts. Do the privatized depth values converge to their non-private counterparts? If so, what is the
rate of convergence? Does the private estimate converge weakly? If so, is the limiting distribution different from
that of the non-private limiting distribution? We investigate some of these questions in the next section.
5 Algorithms for Private Depth Values


As mentioned previously, for depth functions with finite global sensitivity, we can make use of the Gaussian and
Laplace mechanisms.
Mechanism 5. For given ε, δ > 0 and x given independently of the data, the following estimators,

D̃1 (x; Fn ) = D(x; Fn ) + W GS(D)/ε,

D̃2 (x; Fn ) = D(x; Fn ) + Z GS(D)√(2 log(1.25/δ))/ε,

are ε-differentially private and (ε, δ)-differentially private, respectively.
The fact that these mechanisms are differentially private follows from the differential privacy of Mechanisms 1 and 2.
The following results are immediate.

Theorem 2. For a given depth function, let VD (x) be the limiting distribution of the sample depth values D(x; Fn ).
Suppose we can write GS(D) = C(D)/n, where C(D) does not depend on n. Let k, r > 0 and ` = 1, 2. For depth values
generated under Mechanism 5, the following holds:

1. For any δ = o(n^{−k}) and any ε = O(n^{−1+r}), D̃` (x; Fn ) → D(x; F ) in probability.
2. For any δ = o(n^{−k}) and any ε = O(n^{−1/2+r}), √n(D̃` (x; Fn ) − D(x; F )) → VD (x) in distribution.

It should be noted that choosing δ = o(n^{−k}) and ε = O(n^{−1/2+r}) maintains a reasonable level of privacy. For
example, Cai et al. [2019] state that choosing ε = O(1) and δ < 1/n is “the most-permissive setting under which
(ε, δ)-differential privacy is a nontrivial guarantee.” From Theorem 2 we can conclude that, for large samples and
certain privacy parameters, depth value estimates generated via Mechanism 5 are minimally affected by privatization.
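To make Mechanism 5 concrete, here is a minimal sketch; the function name and the one-dimensional halfspace-depth example are our own illustration, not part of the mechanism's definition:

```python
import numpy as np

def private_depth_value(depth_value, global_sens, eps, delta=None, rng=None):
    """Privatize one depth value: Laplace mechanism when delta is None
    (eps-DP), Gaussian mechanism otherwise ((eps, delta)-DP)."""
    rng = np.random.default_rng() if rng is None else rng
    if delta is None:
        noise = rng.laplace(scale=global_sens / eps)                   # W GS(D)/eps
    else:
        sigma = global_sens * np.sqrt(2 * np.log(1.25 / delta)) / eps  # Z GS(D) sqrt(2 log(1.25/delta))/eps
        noise = rng.normal(scale=sigma)
    return depth_value + noise

# One-dimensional halfspace depth min(F_n(x), 1 - F_n(x-)) has GS(D) = 1/n,
# since changing one observation moves the empirical CDF by at most 1/n.
data = np.array([0.1, 0.4, 0.5, 0.9, 1.3])
x = 0.6
depth = min(np.mean(data <= x), np.mean(data >= x))
noisy_depth = private_depth_value(depth, global_sens=1 / len(data), eps=1.0)
```

Because the noise scale is GS(D)/ε = 1/(nε), the perturbation vanishes relative to the sampling error whenever ε shrinks more slowly than n^{−1/2}, matching Theorem 2.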
What if we want to calculate depth at a sample point? How can we estimate the vector of depth values at the
sample points

D̂(Fn ) := (D(X1 ; Fn ), D(X2 ; Fn ), . . . , D(Xn ; Fn ))

privately? The sample values now appear in both arguments of D, so we must do a bit more work to compute
sensitivities. First we look at halfspace and IRW depth. Consider one set of projections X>n u and their corresponding
empirical distribution Fn,u . We want to compute the sensitivity of the vector Rn,u = (R1,u , . . . , Rn,u ), with

Ri,u = min{Fn,u (Xi>u), 1 − Fn,u (Xi>u−)},

and F (x−) = P (X < x). If we change one observation, we can change at most n − 1 values of Rn,u in the odd case,
and at most n − 2 values of Rn,u in the even case. Alternatively, we can change one depth value by
⌊(n + 1)/2⌋/n − 1/n and ⌊(n + 1)/2⌋ − 1 values by 1/n. This gives that

GS1 (Rn,u ) = (2/n)( ⌊(n + 1)/2⌋ − 1 ) ≈ 1,

and that

GS2 (Rn,u ) = (1/n)√( ( ⌊(n + 1)/2⌋ − 1 )^2 + ( ⌊(n + 1)/2⌋ − 1 ) ) ≈ 1.
Since averaging or taking the supremum over such Rn,u does not affect these sensitivities, it follows that for halfspace
and IRW depth,
GS` (D) = GS` (Rn,u ).
Concerning simplicial depth, with respect to some adjacent dataset, the depth values of the unchanged points can
each change by at most (d + 1)/n. For the point that is different, we can bound the change above by 1 − (d + 1)/n.
It follows that GS1 (SMD) ≤ 1 and GS2 (SMD) ≤ √( (1 − (d + 1)/n)^2 + ((d + 1)/n)^2 ) ≈ 1, where SMD is the vector
D with D = SMD. In summary, the global sensitivities of the vector of sample depth values for halfspace, IRW and
simplicial depth are all close to 1. Considering that we pay roughly 1/n for each depth value in Mechanism 5, the fact
that we would pay n · 1/n for n sample values makes sense. In fact, we use no extra privacy budget for the
fact that we are computing the depth at the sample values. We can then use the following mechanism to estimate
the vector of depth values:

Mechanism 6. The following estimators for the vector of depth values of the sample points,

D̃1 (Fn ) = D(Fn ) + (W1 , . . . , Wn ) GS(D)/ε,

and

D̃2 (Fn ) = D(Fn ) + (Z1 , . . . , Zn ) GS(D)√(2 log(1.25/δ))/ε,

are ε-differentially private and (ε, δ)-differentially private, respectively.
The fact that these mechanisms are differentially private follows from the differential privacy of Mechanisms 1 and
2. For the full vector of sample depth values we do not get privacy for free in the limit. Observe that

‖D̃1 (Fn ) − D(F )‖ ≤ ‖D̃1 (Fn ) − D(Fn )‖ + ‖D(Fn ) − D(F )‖ = ‖(W1 , . . . , Wn ) GS(D)/ε‖ + Op (n^{1/2}) = Op (n^{1/2}).   (4)

It seems reasonable that we would have to pay, even asymptotically, to maintain privacy of the entire distribution.
The level of noise is of the same order as the sampling error, and therefore must be accounted for in inference
procedures.
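The rank computation and Mechanism 6 can be sketched for a single direction u as follows; the function names and the single-projection simplification are our own, and the noise scale uses the bound GS1 (Rn,u ) ≈ 1 derived above:

```python
import numpy as np

def projected_depth_ranks(proj):
    """R_{i,u} = min(F_{n,u}(X_i^T u), 1 - F_{n,u}(X_i^T u -)) for one direction u."""
    proj = np.asarray(proj, dtype=float)
    left = np.array([np.mean(proj <= x) for x in proj])    # F_{n,u}(x)
    right = np.array([np.mean(proj >= x) for x in proj])   # 1 - F_{n,u}(x-)
    return np.minimum(left, right)

def private_sample_depths(proj, eps, rng=None):
    """Laplace version of Mechanism 6 restricted to one projection,
    with global sensitivity taken to be 1 (GS_1(R_{n,u}) ~ 1)."""
    rng = np.random.default_rng() if rng is None else rng
    depths = projected_depth_ranks(proj)
    return depths + rng.laplace(scale=1.0 / eps, size=len(depths))
```

Changing one observation can move nearly every rank by 1/n, which is why the per-coordinate noise scale stays of constant order rather than shrinking with n; this is exactly the asymptotic cost recorded in (4).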

We now turn our attention to a depth function with unbounded or high global sensitivity. Projection depth has
been shown to have many favourable properties [Zuo, 2003, 2004], and so we investigate an algorithm for privatizing
this depth function. For projection depth, we would like to generate private outlyingness values, which have
unbounded sensitivity. Note that Med, MAD and IQR are all robust statistics, in the sense that they are not perturbed
by extreme data points. This implies that O1 and O2 are unlikely to attain their worst-case sensitivity, which
makes this depth function a good candidate for the propose-test-release framework [Dwork and Lei, 2009, Brunel and
Avella-Medina, 2020]. Suppose that IQR ≈ 1 for all u. If

η ≳ max( Fn,u^{−1}(1/2) − Fn,u^{−1}(1/2 − 1/n), Fn,u^{−1}(1/2 + 1/n) − Fn,u^{−1}(1/2) )

for all u, then Aη ≈ ⌊n/2⌋ − 1, which means it is very unlikely that Mechanism 4 will return ⊥. Of course, the
choice of η cannot depend on the data, and so we want η large enough that this holds with high
probability.
Mechanism 7. For ` ∈ {1, 2} and ε, δ > 0, define privatized projection depth as

PD̃` (x; Fn ) = 1/( 1 + Õ` (x; Fn ) ),

where

Õ` (x; Fn ) = ⊥ if Aη (O` (x; Fn ); Xn ) + (aδ /ε)V1 ≤ 1 + bδ /ε, and Õ` (x; Fn ) = O` (x; Fn ) + (η aδ /ε)V2 otherwise,

where aδ , bδ , the level of privacy and the Vj are according to Mechanism 4.
We can actually show that this algorithm is consistent for the population depth values when using IQR as the
univariate measure of scale.
Theorem 3. Let ξp,u be the pth quantile of Fu . Suppose that for all h > 0 and all u,

|Fu (ξp,u + h) − Fu (ξp,u )| = M |h|^q (1 + O(|h|^{q/2})) with M > 0, q > 0,

for p = 1/4, 1/2, 3/4. Suppose that supu ξp,u < ∞ for p = 1/4, 3/4. For η ∝ log n/n^{3/4−r} with r > 0, δn = O(n^{−k})
and εn n^{1/4} (log log n)^{3/4} / log(1/δn ) → ∞, we have that

|PD̃2 (x; Fn ) − PD(x; F ; Med, IQR)| → 0 in probability.

Theorem 3 shows we can choose both εn and ηn decreasing in n and still maintain a consistent estimator. In terms
of computing this estimator, the difficulty lies in computing Aη for a given dataset. This is non-trivial for projection
depth, as the ratio of estimators makes this difficult. Instead of computing

O` = sup_{u ∈ S d−1} |x>u − Med(X>n u)| / ς` (X>n u),

we can approximate the depth by computing

Ô` = max_{u ∈ {U1 , . . . , Um }} |x>u − Med(X>n u)| / ς` (X>n u),

where the Uj are sampled uniformly from S d−1 . Then, we can compute the truncated breakdown point of each

Ô`Uj (x) = |x>Uj − Med(X>n Uj )| / ς` (X>n Uj ),

(which can be done easily) to construct the breakdown Aη . We must analyze the necessary m for stability and compute
the running time. It may also be possible to compute this estimator exactly using techniques from computational
geometry [e.g., Liu and Zuo, 2014].
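The sampled-direction approximation Ô2 can be sketched as follows; this is our own illustration (function name, sample sizes), with IQR as the univariate scale ς:

```python
import numpy as np

def approx_projection_outlyingness(x, X, m=500, rng=None):
    """Approximate O_2(x; X_n) by maximizing the projected outlyingness
    |x^T u - Med(X_n^T u)| / IQR(X_n^T u) over m directions drawn
    uniformly from the unit sphere S^{d-1}."""
    rng = np.random.default_rng() if rng is None else rng
    U = rng.normal(size=(m, X.shape[1]))
    U /= np.linalg.norm(U, axis=1, keepdims=True)   # uniform directions on S^{d-1}
    proj = X @ U.T                                  # n x m matrix of projections X_n^T u
    med = np.median(proj, axis=0)
    iqr = np.quantile(proj, 0.75, axis=0) - np.quantile(proj, 0.25, axis=0)
    return np.max(np.abs(x @ U.T - med) / iqr)
```

The per-direction values inside the max are exactly the quantities Ô`Uj (x) above, which is what makes the direction-by-direction breakdown analysis tractable.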
We can present an algorithm to check if Aη (Ô2Uj (x); Xn ) ≤ k∗ , where k∗ = 1 + bδ /ε − (aδ /ε)V1 . Suppose that
Yn ∈ D(Xn , k∗ ). We will actually check if Aη (O2u , Xn ) > k∗ . To this end, note that

|x − Med(Y>n u)| ≤ max( |x − Fn,u^{−1}(1/2 + k∗ /n)|, |x − Fn,u^{−1}(1/2 − k∗ /n)| ) := up(Med, u)

and that

|x − Med(Y>n u)| ≥ min( |x − Fn,u^{−1}(1/2 + k∗ /n)|, |x − Fn,u^{−1}(1/2 − k∗ /n)|, |x − m1 (u)|, |x − m2 (u)| ) := lo(Med, u),
with m1 (u) being the median of a dataset the same as X>n u, except that the smallest k∗ observations of X>n u are
replaced with x>u; m2 (u) is the same as m1 (u), except that instead the largest k∗ observations of X>n u are replaced.
Defining

B = { Fn,u^{−1}(3/4 + k1 /n) − Fn,u^{−1}(1/4 + k2 /n) : −k∗ ≤ k1 , k2 ≤ k∗ , |k1 | + |k2 | = k∗ },

we have that

lo(IQR, u) := min B ≤ IQR(Y>n u) ≤ max B := up(IQR, u).

Therefore, it holds that

Ô`u (x) ∈ [ lo(Med, u)/up(IQR, u), up(Med, u)/lo(IQR, u) ] = [ lo(Ô`u (x)), up(Ô`u (x)) ],

and we can check if

max( Ô`u (x) − lo(Ô`u (x)), up(Ô`u (x)) − Ô`u (x) ) < η.

Then if this holds for all u, we must have that Aη (Ô2 (x; Fn ); Xn ) ≥ k∗ , which gives a lower bound on the truncated
breakdown point.
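The bound on the median can be sketched directly in order-statistic form; this is our own helper, using the lower-median convention and assuming distinct projected values: changing at most k points moves the median order statistic by at most k positions.

```python
import numpy as np

def median_range(proj, k):
    """Interval containing Med(Y_n^T u) for every dataset Y_n differing
    from the projected data in at most k points: replacing k observations
    shifts the median order statistic by at most k positions."""
    s = np.sort(np.asarray(proj, dtype=float))
    n = len(s)
    m = (n - 1) // 2                          # index of the lower median
    return s[max(m - k, 0)], s[min(m + k, n - 1)]
```

Analogous index arithmetic at the 1/4 and 3/4 quantiles yields lo(IQR, u) and up(IQR, u).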
The methods used to construct private depth values can be used to privatize inference procedures based solely
on functions of depth values. For example, a common way to compare scale between two multivariate samples, say
Xn1 and Yn2 , is to compute the sample depth values with respect to the empirical distribution of the pooled sample
Xn1 ∪ Yn2 [Li and Liu, 2004, Chenouri et al., 2011]. We can denote this empirical distribution by Gn1 +n2 . Private
depth-based ranks could then be defined as

R̃ji = #{ Xk` : D̃(Xk` ; Gn1 +n2 ) ≤ D̃(Xji ; Gn1 +n2 ) },

where Xji is the ith observation from sample j. We can use these ranks to privately test for a difference in scale
between the two groups with the rank sum test statistic, viz.

T̃(Xn1 ∪ Yn2 ) = Σ_{i=1}^{n1} R̃ji .

The distribution of such a statistic remains the same under the null hypothesis, and (4) can be used to assess its
performance under the alternative hypothesis. It is clear that the power will be lowered, as the noise biases the
statistic toward failing to reject the null hypothesis. We can also take a similar approach in multivariate, covariance
change-point models [Chenouri et al., 2019, Ramsay and Chenouri, 2020]. The algorithms of this section cannot be
used to compute private depth-based medians, i.e., private maximizers of the depth functions, and so we investigate
algorithms to compute depth-based medians in the next section.
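The depth-based rank sum statistic above can be sketched as follows; the function name and the two-group layout are our own, and the depth values are assumed to be already privatized (e.g., by Mechanism 6), so the statistic is private by post-processing:

```python
import numpy as np

def depth_rank_sum(noisy_depths, n1):
    """Rank each pooled (privatized) depth value among all pooled values,
    i.e. R~_{ji} = #{k,l : D~(X_{kl}) <= D~(X_{ji})}, then sum the ranks
    of the first sample's n1 observations."""
    noisy_depths = np.asarray(noisy_depths, dtype=float)
    ranks = np.array([np.sum(noisy_depths <= d) for d in noisy_depths])
    return int(ranks[:n1].sum())
```

Under the null hypothesis of equal scale, the distribution of the rank sum is unchanged by adding i.i.d. noise to all pooled depths; only the power of the test is affected.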

6 Private Multivariate Medians


For depth functions with finite global sensitivity, it is natural to estimate the depth-based median using the exponential
mechanism. As such, we could generate an observation from

f (v; Fn ) ∝ exp( ε D(v; Fn )/(2 GS(D)) ),

to be used as a private estimate of the D-based median. One issue is that this density is not necessarily valid. For
example,

f (v; Fn ) = exp( ε HD(v; Fn )/(2 GS(HD)) ) / ∫_{Rd} exp( ε HD(v; Fn )/(2 GS(HD)) ) dv

is not a valid density, since ∫_{Rd} exp( ε D(v; Fn )/(2 GS(D)) ) dv = ∞. To see this, note that

1 < exp( ε HD(v; Fn )/(2 GS(HD)) ) < ∞,

and so even if we transform this to

exp( −ε(α − HD(v; Fn ))/(2 GS(HD)) ),

it is still bounded below for any α. This implies that

∫_{Rd} exp( ε HD(v; Fn )/(2 GS(HD)) ) dv = ∞.
Similar results follow for the remaining depth functions, since they all have a range that lies in an interval. If the
data for which we would like to estimate the median is within some compact set B, then we can easily reduce the
range of the estimator to B, and the density

f (v; Fn ) = exp( ε HD(v; Fn )/(2 GS(HD)) ) 1{v ∈ B} / ∫_B exp( ε HD(v; Fn )/(2 GS(HD)) ) dv

is valid. If there is no clear set B in which the median will lie, then we propose a Bayesian approach, and recommend
using a prior π(v) on the median such that

f (v; Fn ) = exp( ε HD(v; Fn )/(2 GS(HD)) ) π(v) / ∫_{Rd} exp( ε HD(v; Fn )/(2 GS(HD)) ) π(v) dv

is a valid density. Seeing as 1{v ∈ B} normalized by ∫_B dv is a special case of a prior, we can summarise this
procedure as follows:
Mechanism 8. Suppose that GS(D) = C(D)/n. Suppose also that π(v) is a density chosen independently of the
data. Provided that

f (v; Fn ) = exp( εn D(v; Fn )/(2C(D)) ) π(v) / ∫_{Rd} exp( εn D(v; Fn )/(2C(D)) ) π(v) dλ(v)

is a valid Lebesgue density, a random draw from f (v; Fn ) is an ε-differentially private estimate of the depth-based
median of Xn .
We remark that it is imperative that this prior is chosen independently of the data, or the privacy of the procedure
will be violated. It is easy to see that for any depth function bounded above and below, this is a valid density.
Suppose that the range of D is [0, 1]; then

∫_{Rd} exp( εn D(v; Fn )/(2C(D)) ) π(v) dλ(v) ≤ ∫_{Rd} exp( εn/(2C(D)) ) π(v) dλ(v) = exp( εn/(2C(D)) ),

∫_{Rd} exp( εn D(v; Fn )/(2C(D)) ) π(v) dλ(v) ≥ ∫_{Rd} π(v) dλ(v) = 1.
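As a concrete illustration of Mechanism 8 in one dimension, the sketch below restricts the prior to a finite grid of candidate medians, which makes exact sampling trivial. The grid, the use of one-dimensional halfspace depth with C(D) = 1, and the function names are our own assumptions; sampling under a continuous prior would require a sampler whose privacy is itself verified.

```python
import numpy as np

def halfspace_depth_1d(v, data):
    """min(F_n(v), 1 - F_n(v-)): one-dimensional halfspace depth, GS = 1/n."""
    return min(np.mean(data <= v), np.mean(data >= v))

def exp_mech_median(data, eps, grid, rng=None):
    """Exponential-mechanism draw with weights proportional to
    exp(eps * n * D(v; F_n) / 2), i.e. Mechanism 8 with C(D) = 1 and a
    uniform prior supported on the grid points."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(data)
    scores = np.array([eps * n * halfspace_depth_1d(v, data) / 2 for v in grid])
    w = np.exp(scores - scores.max())          # numerically stable weights
    return rng.choice(grid, p=w / w.sum())

# Example: the draw concentrates near the sample median when eps*n is large.
sample = np.random.default_rng(7).normal(size=200)
draw = exp_mech_median(sample, eps=1.0, grid=np.linspace(-3, 3, 121))
```

With εn growing, the weights concentrate on the deepest grid point (the sample median), matching the first limit in Theorem 4; with εn/C(D) → K, the draw instead targets the soft-max density exp(−K(ρ − D)/2).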

Some asymptotics of the exponential mechanism have been investigated by Awan et al. [2019], but their result requires
that the cost function is twice differentiable and convex. Depth functions do not typically satisfy these requirements.
The following lemma is useful for proving asymptotic results related to the exponential mechanism when the
cost function is not necessarily differentiable, but is smooth at the limiting minimizer.
Lemma 1. Let 0 be the zero vector in Rd and π(v) be a density on Rd . Suppose that φn (ω, v) : Ω × Rd → R+ is a
sequence of random functions on the probability space (Ω, A , P ). Assume that:

1. εn = Cn^r , r < 1/2.
2. ‖φn (ω, ·) − φ(ω, ·)‖∞ = O(n^{−1/2}) P -a.s.
3. For some α > 0, φ(ω, v) is α-Hölder continuous in a neighborhood around 0, P -almost surely. This means that
   |φ(ω, v) − φ(ω, 0)| ≤ C′‖v‖^α for some constant C′ which may depend on ω.
4. φ(ω, v) = 0 if and only if v = 0. This means that φ is uniquely minimized at v = 0.
5. π(v) is a bounded Lebesgue density which is positive in some neighborhood around 0.

Let Vn be a sequence of random variables whose measure on (Rd , B(Rd )) is given by

Qn (A) = ∫_Ω [ ∫_A e^{−εn φn (ω,v)} π(v) dv / ∫_{Rd} e^{−εn φn (ω,v)} π(v) dv ] dP,

for A ∈ B(Rd ). Then Vn → 0 in probability.

Lemma 1 may be applied outside the context of depth functions, and can be used to prove weak consistency of
an estimator based on the exponential mechanism. In Lemma 1, εn refers to the ratio of the privacy parameter and
the global sensitivity of the cost functions. This lemma shows that smoother, insensitive cost functions will allow
the estimator to be consistent for smaller privacy budgets. Additionally, if e^{−εn φn (ω,v)} is integrable, then we can let
π(v) = 1 and the result still holds. Assumption 5 gives that the prior must be positive in the region of the maximizer.
We can apply Lemma 1 to data depth functions, which results in the following theorem.
Theorem 4. Suppose that |D(v; Fn ) − D(v; F )| = O(n^{−1/2}) a.s., ρ := supx D(x; F ) < ∞ and GS(D) = C(D)/n.
Suppose that F is such that D is Hölder continuous for some α > 0. Suppose that the maximum of D(x; F ) occurs
uniquely at θ(F ), and π(v) is a bounded Lebesgue density which is positive in a neighborhood around θ(F ). Then
T̃(Xn ), drawn from the density

f (v; Fn ) = exp( −ε(ρ − D(v; Fn ))/(2 GS(D)) ) π(v) / ∫_{Rd} exp( −ε(ρ − D(v; Fn ))/(2 GS(D)) ) π(v) dλ(v),

satisfies the following:

T̃(Xn ) → θ(F ) in probability when n^{1/2} ε → ∞, and T̃(Xn ) → exp( −K(ρ − D(v; F ))/2 ) in distribution when εn/C(D) → K < ∞.

The continuity condition is weak, in the sense that α can be very small. For halfspace depth and IRW depth it
follows if the Fu are Hölder continuous. For simplicial depth, we need F to be such that the probability of lying in a
given simplex is Hölder continuous. If the median is not unique, this estimator is still consistent for a median of F, and
such a median would be drawn uniformly from the set of medians. Theorem 4 can be applied to the three depths of
[Tukey, 1974, Liu, 1988, Ramsay et al., 2019]. For fixed ε this estimator reaches the lower bound on convergence of
a differentially private estimator with finite gross error sensitivity [Hsu et al., 2014], so it is also somewhat optimal.
Algorithms that implement Mechanism 8 are an interesting line of new research; we cannot directly use, say, Markov
chain Monte Carlo methods without first ensuring that they maintain the privacy of the estimators.
For projection depth, we cannot use the exponential mechanism without injecting a significant level of noise into
the estimator, and so we instead extend the propose-test-release framework [Brunel and Avella-Medina, 2020] to be
used with the exponential mechanism. Suppose φXn : Rd → R+ is some cost function which we would like to minimize.
Then, define

Aη (φXn ; Xn ) = min{ k ∈ N : sup_{Yn ∈ D(Xn ;k)} sup_x |φXn (x) − φYn (x)| > η },

as the truncated breakdown point of the cost function. This is a direct extension of the truncated breakdown point
of Brunel and Avella-Medina [2020] to the functional context; we are essentially using the infinity norm and writing
‖φXn − φYn ‖∞ > η. We can write the estimator as follows:
Mechanism 9. The estimator

T̃(Xn ) = ⊥ if Aη (φXn ; Xn ) + (aδ /ε)V ≤ 1 + bδ /ε, and T̃(Xn ) = T̂(Xn ) ∼ exp( −ε φXn (v)/(2η) ) otherwise,

with aδ , bδ and V as in Mechanism 4, is a differentially private estimate of argmin φXn (v). Under the Laplace version,
the estimator is (2ε, δ)-differentially private and under the Gaussian version, the estimator is (2ε, 2δ)-differentially
private.
This result shows that we can still use PTR with the exponential mechanism. The level of privacy under the
Gaussian version is slightly lower than for the original PTR mechanism, which is due to the pure differential privacy
of the exponential mechanism.
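The Laplace version of this propose-test-release step can be sketched generically; the names are our own, the release step is treated as a black box, and the threshold log(2/δ)/ε follows the Laplace calibration used in the proofs. Computing Aη itself remains the hard part.

```python
import numpy as np

BOT = object()   # stands for the symbol ⊥

def propose_test_release(a_eta, eps, delta, release, rng=None):
    """Laplace version of the test in Mechanism 9: noisily check that the
    truncated breakdown point a_eta clears the threshold, else return ⊥.
    a_eta has global sensitivity 1, so Laplace(1/eps) noise suffices."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = a_eta + rng.laplace(scale=1.0 / eps)
    if noisy <= 1 + np.log(2 / delta) / eps:
        return BOT
    return release()    # e.g. a draw from the exponential mechanism with scale eta
```

The release callable is only invoked when the test passes, which is exactly what lets the exponential mechanism run with the local scale η instead of the (possibly infinite) global sensitivity.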
Theorem 5. Suppose that the sequence of random functions φXn (v) : Rd → R+ satisfies the conditions of Lemma 1.
Suppose that T̃(Xn ) is generated according to Mechanism 9. If the sequences εn , δn , ηn imply that

Pr( Aηn (φXn ; Xn ) + (aδn /εn )V ≤ bδn /εn + 1 ) → 0,

then it holds that

T̃(Xn ) → argmin φF in probability when n^{−1/2} εn /ηn → ∞, and T̃(Xn ) → exp( −KφF (v)/2 ) in distribution when εn /ηn → K < ∞.

We can now substitute in the outlyingness function and see how this algorithm works for the purposes of privately
estimating the projection depth median. A first question is whether or not the density

f (v) = exp( −ε supu |v>u − Med(X>n u)|/ς(X>n u) / (2η) ) / ∫_{Rd} exp( −ε supu |v>u − Med(X>n u)|/ς(X>n u) / (2η) ) dv

even exists. If supu Med(X>n u) < ∞ and inf u ς(X>n u) > 0, then

∫_{Rd} exp( −ε supu |v>u − Med(X>n u)|/ς(X>n u) / (2η) ) dv ≤ C1 ∫_{‖v‖>1} exp( −ε‖v‖/(2η) ) dv + C2 ∫_{‖w‖≤1} exp( −ε/(2η) ) dw < ∞.

Unfortunately, immediately using PTR with the exponential mechanism gives no gains in estimating the projection
median over using the global sensitivity of projection depth (which is 1). If the points in Xn are distinct, we have
that Aη (O` (·; Xn ); Xn ) = 1 for any η. To see this, suppose that Yn is a neighboring dataset, with X1 changed to be
some observation such that ς(Y>n u) ≠ ς(X>n u). It follows that for any u,

sup_x |O`u (x; Xn ) − O`u (x; Yn )| ≈ sup_x |x>u| · |ς(X>n u) − ς(Y>n u)| / ( ς(X>n u) ς(Y>n u) ) = ∞.

In order to estimate the projection depth median privately, we can truncate the outlyingness function O at a fixed
radius; for ‖x‖ ≥ Mn we set O(x) = ∞. Define this function to be

O` (x; Xn ; M ) = O` (x; Xn ) if ‖x‖ < M , and O` (x; Xn ; M ) = ∞ if ‖x‖ ≥ M .

Note that projection depth based on O` (·; Xn ; M ) does not have the ‘vanishing at infinity’ property of a depth
function; however, since it is used only in the specific context of estimating the median, this is not a concern. The
following shows consistency of the private projection median estimator.

Theorem 6. Suppose that n^{−1/2} εn /ηn → ∞ and that Mn = o(n^{1/2}). Additionally, suppose F is such that (C0)–(C4)
of Zuo [2003] are satisfied, with σ = IQR. Then, for the estimator of Mechanism 9 with φXn (v) = O2 (v; Xn ; Mn ),
the conditions of Theorem 5 are met and the estimator is weakly consistent.

The obvious issue is choosing Mn in practice, which can be partially informed by the above theorem. Clearly, if
the data are known to be bounded, it is easy to choose Mn . If the data are not bounded, one can choose Mn
independently of the data, given domain knowledge. The choice of Mn should not depend on the data, which would
violate the conditions of the consistency theorem: if Mn is chosen based on the data, then Mn could differ between two
datasets, implying that O`u (x; Xn ) − O`u (x; Yn ) = ∞ for some x, and consequently the truncated breakdown point
of the outlyingness function is 1. Computationally, the difficulty again lies in computing Aη (φXn ; Xn ), for which we
can use methods similar to those of the previous section.

7 Proofs
Proof of Theorem 2. The first statement follows directly from consistency of the sample depths and the fact that
GS(D) → 0. The second statement holds for the same reasons.

Proof of Theorem 3. We show the result for the Laplace case; the Gaussian case follows the same path. First,
we want to show that

Pr( Aη (O2 (x; Fn ); Xn ) ≤ 1 + (log(2/δn ) − W1 )/εn ) → 0.

Note that

Pr( |W1 | > log(2/δn ) ) = e^{−log(2/δn )} = δn /2 = O(n^{−k}),

from the properties of the Laplace distribution and the rate of convergence of δn . We can then write

Pr( Aη (O2 (x; Fn ); Xn ) ≤ 1 + (log(2/δn ) − W1 )/εn )
  ≤ Pr( Aη (O2 (x; Fn ); Xn ) ≤ 1 + (log(2/δn ) − W1 )/εn , W1 > −log(2/δn ) )
  + Pr( Aη (O2 (x; Fn ); Xn ) ≤ 1 + (log(2/δn ) − W1 )/εn , W1 < −log(2/δn ) )
  ≤ Pr( Aη (O2 (x; Fn ); Xn ) ≤ 1 + 2 log(2/δn )/εn ) + O(n^{−k}).

Now, let ρn = 2 log(2/δn )/εn ; we want to show that

Pr( Aη (O2 (x; Fn ); Xn ) ≤ 1 + ρn ) → 0.

It holds that

Pr( Aη (O2 (x; Fn ); Xn ) ≤ 1 + ρn ) = Pr( sup_{Yn ∈ D(Xn ,1+ρn )} |O2 (x; Xn ) − O2 (x; Yn )| ≥ η ).

Consider the Taylor series expansion of f (x, y) = x/y about the point ( |x>u − Med(Fu )|, IQR(Fu ) ):

O2u (x; Xn ) = |x>u − Med(Fu )|/IQR(Fu ) + ( |x>u − Med(X>n u)| − |x>u − Med(Fu )| )/IQR(Fu )
  − ( |x>u − Med(Fu )|/IQR(Fu )^2 )( IQR(X>n u) − IQR(Fu ) ) + Rn,u
  = |x>u − Med(X>n u)|/IQR(Fu ) − ( |x>u − Med(Fu )|/IQR(Fu )^2 )( IQR(X>n u) − IQR(Fu ) ) + Op (n^{−1}).

It is easy to see that Rn,u = Op (n^{−1}), since (IQR(X>n u) − IQR(Fu ))^2 = Op (n^{−1}). We can then write

|O2u (x; Xn ) − O2u (x; Yn )| = | |x>u − Med(X>n u)|/IQR(X>n u) − |x>u − Med(Y>n u)|/IQR(Y>n u) |
  = | ( |x>u − Med(X>n u)| − |x>u − Med(Y>n u)| )/IQR(Fu )
      − ( |x>u − Med(Fu )|/IQR(Fu )^2 )( IQR(X>n u) − IQR(Y>n u) ) + Op (n^{−1}) |
  ≤ | ( |x>u − Med(X>n u)| − |x>u − Med(Y>n u)| )/IQR(Fu ) |
      + ( |x>u − Med(Fu )|/IQR(Fu )^2 )| IQR(X>n u) − IQR(Y>n u) | + Op (n^{−1})
  ≤ | Med(Y>n u) − Med(X>n u) |/IQR(Fu )
      + ( |x>u − Med(Fu )|/IQR(Fu )^2 )| IQR(X>n u) − IQR(Y>n u) | + Op (n^{−1}),
where the last line follows from the reverse triangle inequality. Now, recall that Yn differs from Xn by ρn + 1 points.
So if Fn,u is the empirical distribution corresponding to X>n u and Gn,u (x) = 1 − Fn,u (x) , it holds that
   
| Med(Y>n u) − Med(X>n u) | ≤ max( |Fn,u^{−1}(1/2) − Fn,u^{−1}(1/2 + (ρn + 1)/n)|, |Fn,u^{−1}(1/2) − Fn,u^{−1}(1/2 − (ρn + 1)/n)| )
  = |Fn,u^{−1}(1/2) − Fn,u^{−1}(1/2 ± (ρn + 1)/n)|
  = |Fu^{−1}(1/2) − Fn,u^{−1}(1/2) + Fn,u^{−1}(1/2 ± (ρn + 1)/n) − Fu^{−1}(1/2)|
  = |Gn,u (Med(Fu )) + R″n,u − 1/2 − Gn,u (Med(Fu )) − R′n,u + 1/2|
  = O(n^{−3/4} log n) a.s.,
 
where the second last and last lines follow from a Bahadur type representation of quantiles, as long as
(1/n)(1 + 2 log(2/δn )/εn ) = O((log log n/n)^{3/4}); see Theorem 2 on page 2 of de Haan and Taconis-Haantjes [1979].
However, we know that (1/n)(1 + 2 log(2/δn )/εn ) = O((log log n/n)^{3/4}) holds from the assumptions on εn and δn .
We can show something similar for the inter-quartile range by simply replacing 1/2 with 1/4 and 3/4. Now, we must
show that

| sup_u O2u (x; Xn ) − sup_u O2u (x; Yn ) | → 0 a.s.

We see that

| sup_u O2u (x; Xn ) − sup_u O2u (x; Yn ) | ≤ 2 sup_u |O2u (x; Xn ) − O2u (x; Yn )| = O(n^{−3/4} log n) a.s.,

where the last line follows from the fact that ξ1/4,u and ξ3/4,u are bounded as functions of u, implying that R′n,u and
R″n,u are also bounded in u. See the proof of Theorem 2′ of de Haan and Taconis-Haantjes [1979] for the exact
expressions of R′n,u and R″n,u . This implies that for η ∝ log n/n^{3/4−r} it holds that Pr[ Aη (O2 (x; Xn ); Xn ) ≤ 1 + ρn ] → 0.
Now, since (η/εn )W2 → 0 in probability and PD(x; Fn ; Med, IQR) → PD(x; F ; Med, IQR) in probability, we have
that PD̃2 (x; Fn ) → PD(x; F ; Med, IQR) in probability.

Proof of Lemma 1. It is clear that Qn (A) is a valid probability measure, since e^{−x} is bounded for all x ∈ R+ . The
goal is to show that Qn converges weakly to 1_0 (·), since this is equivalent to Vn → 0 in probability. We use the
Portmanteau Theorem and show that for all 1_0 (·)-continuity sets A, Qn (A) → 1_0 (A).

lim_{n→∞} Qn (A) = lim_{n→∞} ∫_Ω [ ∫_A e^{−εn φn (ω,v)} π(v) dv / ∫_{Rd} e^{−εn φn (ω,v)} π(v) dv ] dP
  := lim_{n→∞} ∫_Ω Q^ω_n (A) dP
  = ∫_Ω lim_{n→∞} Q^ω_n (A) dP
  = ∫_{Ω0} lim_{n→∞} Q^ω_n (A) dP,

where Ω0 is a set such that P (Ω0 ) = 1 and, for ω ∈ Ω0 , φn (ω, ·) → φ(ω, ·), which exists from Assumption 2. The
second last line follows from Lebesgue's dominated convergence theorem, noting that Q^ω_n (A) < 1. We now consider
lim_{n→∞} Q^ω_n (A) for fixed ω ∈ Ω0 . Recall that A is a 1_0 (·)-continuity set, which means that 0 is not on the boundary
of A. This also implies that 0 is not on the boundary of A^c .

lim_{n→∞} Q^ω_n (A) = lim_{n→∞} ∫_A e^{−εn φn (ω,v)} π(v) dv / ∫_{Rd} e^{−εn φn (ω,v)} π(v) dv
  = lim_{n→∞} ∫_A e^{−εn φn (ω,v)} π(v) dv / ( ∫_A e^{−εn φn (ω,v)} π(v) dv + ∫_{A^c} e^{−εn φn (ω,v)} π(v) dv ).

First, let AI be a 1_0 (·)-continuity set such that 0 is interior to AI .

∫_{AI} e^{−εn φn (ω,v)} π(v) dv = ∫_{AI} e^{−εn (φn (ω,v)−φ(ω,v)+φ(ω,v)−φ(ω,0))} π(v) dv
  = ∫_{AI} e^{−εn (φn (ω,v)−φ(ω,v))} e^{−εn (φ(ω,v)−φ(ω,0))} π(v) dv
  ≥ ∫_{AI} e^{−εn (φ(ω,v)−φ(ω,0))} e^{−C″ n^{−β}} π(v) dv.

Assumptions 1 and 2 imply that εn (φn (ω, v) − φ(ω, v)) ≤ C″ n^{−β} for some β > 0, independent of v. Observe that
εn (φn (ω, v) − φ(ω, v)) = O(n^r )O(n^{−1/2}) = O(n^{r−1/2}), which implies that there exists β with 0 < β < 1/2 − r such
that εn (φn (ω, v) − φ(ω, v)) = o(n^{−β}). Since 0 is interior to AI , for large n there exists a neighborhood N_{1/n^ξ}(0)
in AI . From Assumption 3 (Hölder continuity) we have that

sup_{v ∈ N_{1/n^ξ}(0)} |φ(ω, v) − φ(ω, 0)| < C∗ d^{ξα}/n^{ξα}.

Choose ξ such that α′ = ξα > r. Now, we can use this to write

∫_{AI} e^{−εn (φ(ω,v)−φ(ω,0))} e^{−C″ n^{−β}} π(v) dv ≥ ∫_{N_{1/n^ξ}(0)} e^{−εn (φ(ω,v)−φ(ω,0))} e^{−C″ n^{−β}} π(v) dv
  ≥ M (d) n^{−dξ} e^{−C∗ d^{ξα} n^{−α′} εn } e^{−C″ n^{−β}}
  ≥ M (d) n^{−dξ} e^{−C∗ d^{ξα} n^{−α′+r}} e^{−C″ n^{−β}}
  = O(n^{−dξ}).

Note that Assumption 5 implies that there exists some N such that for all n > N , π(v) is bounded below on
N_{1/n^ξ}(0); therefore, we can include it in the constant M (d). Now, consider A^c_I , a 1_0 (·)-continuity set such that 0 is
not interior to A^c_I . It follows that
lim_{n→∞} ∫_{A^c_I} e^{−εn φn (ω,v)} π(v) dv = lim_{n→∞} ∫_{A^c_I} e^{−εn φ(ω,v)} e^{−εn (φn (ω,v)−φ(ω,v))} π(v) dv
  ≤ lim_{n→∞} e^{εn sup_v |φn (ω,v)−φ(ω,v)|} ∫_{A^c_I} e^{−εn φ(ω,v)} π(v) dv
  = ∫_{A^c_I} lim_{n→∞} e^{−εn φ(ω,v)} π(v) dv
  = 0.

The fourth equality follows from the monotone convergence theorem and the fact that e^{εn sup_v |φn (ω,v)−φ(ω,v)|} converges
to 1. The last line follows from the fact that 0 is not interior to A^c_I and π(v) is bounded above. To elaborate, there
exists a neighborhood around 0, call it Nk (0), such that Nk (0) ∩ A^c_I = ∅. By Assumptions 3 and 4, we can choose k
such that φ(ω, v) > k′ > 0 on A^c_I . So, this implies that lim_{n→∞} e^{−εn φ(ω,v)} ≤ lim_{n→∞} e^{−εn k′} = 0. The same line of
reasoning gives that

∫_{A^c_I} e^{−εn φn (ω,v)} π(v) dv = O(e^{−εn k′}).

So, it then follows that

lim_{n→∞} Q^ω_n (AI ) = lim_{n→∞} ∫_{AI} e^{−εn φn (ω,v)} π(v) dv / ( ∫_{AI} e^{−εn φn (ω,v)} π(v) dv + ∫_{A^c_I} e^{−εn φn (ω,v)} π(v) dv )
  = lim_{n→∞} cn /(cn + bn )
  = 1,

since cn = O(n^{−dξ}) and bn = O(e^{−εn k′}). So it follows that lim_{n→∞} Q^ω_n (A) = 1_0 (A) for 1_0 (·)-continuity sets A.
Then, since this holds on a set Ω0 with P (Ω0 ) = 1, we have that

lim_{n→∞} Qn (A) = ∫_{Ω0} lim_{n→∞} Q^ω_n (A) dP = 1_0 (A),

which implies that Qn → 1_0 (·) weakly.

Lemma 2. Suppose the conditions of Lemma 1 hold, except that lim_{n→∞} εn = K. Let Vn be a sequence of random
variables whose measure on (Rd , B(Rd )) is given by

Qn (A) = ∫_Ω [ ∫_A e^{−εn φn (ω,v)} dv / ∫_{Rd} e^{−εn φn (ω,v)} dv ] dP,

for A ∈ B(Rd ). Then Vn converges in distribution to the law Q, where

Q(A) = ∫_A e^{−Kφ(v)} dv / ∫_{Rd} e^{−Kφ(v)} dv .

Proof. Note that ‖φn (ω, ·) − φ(ω, ·)‖∞ → 0 a.e., which implies that an (v) = e^{−εn (φn (ω,v)−φ(ω,v))} → 1 a.e., uniformly
in v. We can then write

lim_{n→∞} ∫_A e^{−εn φn (ω,v)} dv = lim_{n→∞} ∫_A e^{−εn φ(ω,v)} an (v) dv = ∫_A e^{−Kφ(ω,v)} dv,

and applying the same argument to the normalizing integral over Rd gives the result.

Proof of Mechanism 9. The proof has the same outline as Brunel and Avella-Medina [2020]. First, assume that
|φXn (x) − φYn (x)| ≤ η for all x. Then

fXn (v)/fYn (v) = [ exp(−ε φXn (v)/(2η)) / exp(−ε φYn (v)/(2η)) ] · [ ∫ exp(−ε φYn (v)/(2η)) dv / ∫ exp(−ε φXn (v)/(2η)) dv ]
  ≤ e^{ε/2} ∫ exp(−ε φYn (v)/(2η)) dv / ∫ exp(−ε φXn (v)/(2η)) dv
  ≤ e^{ε/2} e^{ε/2} ∫ exp(−ε φXn (v)/(2η)) dv / ∫ exp(−ε φXn (v)/(2η)) dv
  = e^ε .

Note that, for B ∈ B(Rd ) (the Borel sets of Rd ), this implies that

Pr( T̂(Xn ) ∈ B ) ≤ e^ε Pr( T̂(Yn ) ∈ B ).    (5)


It follows from Brunel and Avella-Medina [2020] that Aη (φXn ; Xn ) has global sensitivity equal to 1, since changing
one point can change the breakdown by at most 1. Then

Pr( T̃(Xn ) ∈ B ) = Pr( Aη (φXn ; Xn ) + (1/ε)V ≥ 1 + log(2/δ)/ε, T̂(Xn ) ∈ B )
  ≤ e^ε Pr( Aη (φYn ; Yn ) + (1/ε)V ≥ 1 + log(2/δ)/ε ) Pr( T̂(Xn ) ∈ B )
  ≤ e^{2ε} Pr( Aη (φYn ; Yn ) + (1/ε)V ≥ 1 + log(2/δ)/ε ) Pr( T̂(Yn ) ∈ B )
  = e^{2ε} Pr( T̃(Yn ) ∈ B ).

The first inequality is from independence and the fact that Aη (φXn ; Xn ) + (1/ε)V is an ε-differentially private
estimator. The second inequality is from (5). Now, what if there exists x with |φXn (x) − φYn (x)| ≥ η? This implies
that Aη (φXn ; Xn ) = 1 and

Pr( T̃(Xn ) ∈ B ) ≤ Pr( Aη (φXn ; Xn ) + (1/ε)V ≥ 1 + log(2/δ)/ε ) = Pr( V ≥ log(2/δ) ) ≤ δ ≤ δ + e^{2ε} Pr( T̃(Yn ) ∈ B ).
So, we get (2ε, δ)-differential privacy if B is restricted to B(Rd ). For completeness, we need to include sets of the
form B = B′ ∪ {⊥}, where B′ ∈ B(Rd ). Consider

Pr( T̃(Xn ) ∈ B ) = Pr( T̂(Xn ) ∈ B′ , Aη (φXn ; Xn ) + (1/ε)V > 1 + log(2/δ)/ε ) + Pr( Aη (φXn ; Xn ) + (1/ε)V ≤ 1 + log(2/δ)/ε )
  ≤ e^{2ε} [ Pr( T̂(Yn ) ∈ B′ , Aη (φYn ; Yn ) + (1/ε)V > 1 + log(2/δ)/ε ) + Pr( Aη (φYn ; Yn ) + (1/ε)V ≤ 1 + log(2/δ)/ε ) ] + δ
  = e^{2ε} Pr( T̃(Yn ) ∈ B ) + δ.

The first inequality comes from the fact that we get (2ε, δ)-differential privacy when B is restricted to B(Rd ) and
the fact that Aη (φYn ; Yn ) + (1/ε)V is ε-differentially private.
Now, suppose that V , aδ and bδ correspond to the Gaussian version of PTR. Then, following the same steps as
for the Laplace version gives, for B ∈ B(Rd ),

Pr( T̃(Xn ) ∈ B ) ≤ e^{2ε} Pr( T̃(Yn ) ∈ B ) + δ,

when ‖φXn − φYn ‖∞ < η. When ‖φXn − φYn ‖∞ ≥ η,

Pr( T̃(Xn ) ∈ B ) ≤ Pr( Z ≥ √(2 log(1.25/δ)) ) ≤ δ.

So, we have that

Pr( T̃(Xn ) ∈ B ) ≤ e^{2ε} Pr( T̃(Yn ) ∈ B ) + δ.

Again, we need to include sets of the form B = B′ ∪ {⊥}, where B′ ∈ B(Rd ). Consider

Pr( T̃(Xn ) ∈ B ) = Pr( T̂(Xn ) ∈ B′ , Aη (φXn ; Xn ) + (√(2 log(1.25/δ))/ε)Z > 1 + 2 log(1.25/δ)/ε )
  + Pr( Aη (φXn ; Xn ) + (√(2 log(1.25/δ))/ε)Z ≤ 1 + 2 log(1.25/δ)/ε )
  ≤ e^{2ε} [ Pr( T̂(Yn ) ∈ B′ , Aη (φYn ; Yn ) + (√(2 log(1.25/δ))/ε)Z > 1 + 2 log(1.25/δ)/ε )
  + Pr( Aη (φYn ; Yn ) + (√(2 log(1.25/δ))/ε)Z ≤ 1 + 2 log(1.25/δ)/ε ) ] + 2δ
  = e^{2ε} Pr( T̃(Yn ) ∈ B ) + 2δ.
Proof of Theorem 5. Lemmas 1 and 2 imply that T̂(Xn ) satisfies the stated convergence results. The assumption
implies that Pr( T̃(Xn ) = ⊥ ) → 0, and hence |T̂(Xn ) − T̃(Xn )| → 0 in probability.

Proof of Theorem 6. We need to show that

Pr( Aηn (φXn ; Xn ) + (aδ /ε)V ≤ bδ /ε + 1 ) → 0.

Following the same lines as the proof of Theorem 3, we want to show that

Pr( Aηn (φXn ; Xn ) ≤ 1 + ρn ) → 0,

where ρn = 2 log(2/δn )/εn . To this end,

Pr( Aηn (φXn ; Xn ) ≤ 1 + ρn ) = Pr( sup_{Yn ∈ D(Xn ,1+ρn )} sup_x |O` (x; Xn ) − O` (x; Yn )| ≥ ηn ).

First, suppose that ‖x‖ < Mn . It immediately follows that the remainder term of the series expansion of O satisfies

( |x>u − Med(Fu )|/ς(Fu ) )( ς(X>n u) − ς(Y>n u) ) → 0 in probability.

Therefore, the same analysis as in the proof of Theorem 3 can be used, and it holds that

Pr( sup_{Yn ∈ D(Xn ,1+ρn )} sup_x |O` (x; Xn ) − O` (x; Yn )| ≥ ηn ) → 0.

Now, suppose that ‖x‖ ≥ Mn , which implies that O`u (x; Xn ) − O`u (x; Yn ) = 0.

References
M. Avella-Medina. Privacy-preserving parametric inference: A case for robust statistics. Journal of the American
Statistical Association, pages 1–45, 2019. ISSN 0162-1459. doi: 10.1080/01621459.2019.1700130. 1

M. Avella-Medina and V.-E. Brunel. Differentially private sub-Gaussian location estimators. arXiv e-prints, pages
1–16, 2019. URL http://arxiv.org/abs/1906.11923. 1

J. Awan, A. Kenney, M. Reimherr, and A. Slavković. Benefits and pitfalls of the exponential mechanism with
applications to hilbert spaces and functional pca. arXiv e-prints, art. arXiv:1901.10864, Jan. 2019. 1, 12

I. Baidari and C. Patil. K-data depth based clustering algorithm. In Computational Intelligence: Theories, Applica-
tions and Future Directions, volume 1 of Advances in Intelligent Systems and Computing, pages 13–24. Springer,
Singapore, 2019. doi: 10.1007/978-981-13-1132-1_2. 5

B. Balle and Y.-X. Wang. Improving the Gaussian mechanism for differential privacy: Analytical calibration and
optimal denoising. arXiv e-prints, art. arXiv:1805.06530, May 2018. 3

A. Beimel, S. Moran, K. Nissim, and U. Stemmer. Private center points and learning of halfspaces. arXiv e-prints,
art. arXiv:1902.10731, Feb. 2019. 2

V.-E. Brunel and M. Avella-Medina. Propose, test, release: Differentially private estimation with high probability.
arXiv e-prints, art. arXiv:2002.08774, Feb. 2020. 1, 2, 4, 10, 13, 18

T. T. Cai, Y. Wang, and L. Zhang. The cost of privacy: Optimal rates of convergence for parameter estimation with
differential privacy. arXiv e-prints, art. arXiv:1902.04495, Feb. 2019. 1, 9

M. Cárdenas-Montes. Depth-based outlier detection algorithm. In M. Polycarpou, A. C. P. L. F. de Carvalho, J.-S.
Pan, M. Woźniak, H. Quintian, and E. Corchado, editors, Hybrid Artificial Intelligence Systems, pages 122–132,
Cham, 2014. Springer International Publishing. ISBN 978-3-319-07617-1. 5

S. Chakraborti and M. A. Graham. Nonparametric (distribution-free) control charts: An updated overview and some
results. Quality Engineering, pages 1–22, may 2019. ISSN 0898-2112. doi: 10.1080/08982112.2018.1549330. URL
https://www.tandfonline.com/doi/full/10.1080/08982112.2018.1549330. 5

K. Chaudhuri and D. Hsu. Convergence rates for differentially private statistical estimation. arXiv e-prints, art.
arXiv:1206.6395, June 2012. 1

Y. Chen, X. Dang, H. Peng, and H. Bart. Outlier detection with the kernelized spatial depth function. IEEE
Transactions on Pattern Analysis and Machine Intelligence, 31:288–305, 2009. 5

Z. Chen and D. E. Tyler. The influence function and maximum bias of Tukey’s median. Annals of Statistics, 30(6):
1737–1759, 2002. ISSN 00905364. doi: 10.1214/aos/1043351255. 1, 5

S. Chenouri, C. G. Small, and T. J. Farrar. Data depth-based nonparametric scale tests. Canadian Journal of
Statistics, 39(2):356–369, 2011. doi: 10.1002/cjs.10099. URL https://onlinelibrary.wiley.com/doi/abs/10.
1002/cjs.10099. 5, 11

S. Chenouri, A. Mozaffari, and G. Rice. Robust multivariate change point analysis based on data depth. To appear
in Canadian Journal of Statistics, 2019. 5, 11

X. Dang, R. Serfling, and W. Zhou. Influence functions of some depth functions, and application to depth-
weighted L-statistics. Journal of Nonparametric Statistics, 21(1):49–66, jan 2009. ISSN 1048-5252. doi:
10.1080/10485250802447981. URL http://www.tandfonline.com/doi/abs/10.1080/10485250802447981. 1

L. de Haan and E. Taconis-Haantjes. On Bahadur’s representation of sample quantiles. Annals of the Institute
of Statistical Mathematics, 31(2):299–308, dec 1979. ISSN 0020-3157. doi: 10.1007/BF02480286. URL http:
//link.springer.com/10.1007/BF02480286. 15, 16

C. Dwork and J. Lei. Differential privacy and robust statistics. Proceedings of the 41st annual ACM symposium on
Symposium on theory of computing - STOC ’09, page 371, 2009. ISSN 07378017. doi: 10.1145/1536414.1536466.
URL http://portal.acm.org/citation.cfm?doid=1536414.1536466. 1, 2, 4, 10

C. Dwork and A. Roth. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical
Computer Science, 9(3–4):211–407, 2014. ISSN 1551-305X. doi: 10.1561/0400000042. URL http://dx.doi.org/
10.1561/0400000042. 2, 3, 4

C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. In Theory
of Cryptography Conference, pages 265–284, 2006. 3, 4

C. Dwork, A. Smith, T. Steinke, and J. Ullman. Exposed! A survey of attacks on private data. Annual
Review of Statistics and Its Application, 4(1):61–84, mar 2017. ISSN 2326-8298. doi: 10.1146/
annurev-statistics-060116-054123. URL http://www.annualreviews.org/doi/10.1146/annurev-statistics-060116-054123. 1

Y. Gao and O. Sheffet. Private approximations of a convex hull in low dimensions. arXiv e-prints, art.
arXiv:2007.08110, July 2020. 2

J. Hsu, M. Gaboardi, A. Haeberlen, S. Khanna, A. Narayan, B. C. Pierce, and A. Roth. Differential privacy: An
economic method for choosing epsilon. In 2014 IEEE 27th Computer Security Foundations Symposium, pages
398–410, 2014. doi: 10.1109/CSF.2014.35. 13

R. Jörnsten. Clustering and classification based on the L1 data depth. Journal of Multivariate Analysis, 90(1):67–89,
jul 2004. ISSN 0047259X. doi: 10.1016/j.jmva.2004.02.013. URL http://linkinghub.elsevier.com/retrieve/pii/S0047259X04000272. 5

T. Lange, K. Mosler, and P. Mozharovskyi. Fast nonparametric classification based on data depth. Statistical Papers,
55(1):49–69, feb 2014. ISSN 0932-5026. doi: 10.1007/s00362-012-0488-4. URL http://link.springer.com/10.
1007/s00362-012-0488-4. 5, 8

J. Lei. Differentially private m-estimators. In J. Shawe-Taylor, R. Zemel, P. Bartlett, F. Pereira, and K. Q. Weinberger,
editors, Advances in Neural Information Processing Systems, volume 24, pages 361–369. Curran Associates, Inc.,
2011. 1

J. Li and R. Y. Liu. New nonparametric tests of multivariate locations and scales using data depth. Statistical Science,
19(4):686–696, nov 2004. ISSN 0883-4237. doi: 10.1214/088342304000000594. URL http://projecteuclid.org/
euclid.ss/1113832733. 5, 8, 11

R. Y. Liu. On a notion of simplicial depth. Proceedings of the National Academy of Sciences, 85(6):1732–1734, 1988.
ISSN 0027-8424. doi: 10.1073/pnas.85.6.1732. URL https://www.pnas.org/content/85/6/1732. 7, 13

R. Y. Liu. On a notion of data depth based on random simplices. Annals of Statistics., 18(1):405–414, 03 1990. doi:
10.1214/aos/1176347507. URL https://doi.org/10.1214/aos/1176347507. 2

R. Y. Liu. Control charts for multivariate processes. Journal of the American Statistical Association, 90(432):1380–
1387, dec 1995. ISSN 0162-1459. doi: 10.1080/01621459.1995.10476643. URL http://www.tandfonline.com/doi/
abs/10.1080/01621459.1995.10476643. 5

R. Y. Liu and K. Singh. A quality index based on data depth and multivariate rank tests. Journal of the American
Statistical Association, 88(421):252–260, 1993. ISSN 01621459. URL http://www.jstor.org/stable/2290720. 5

R. Y. Liu, J. M. Parelius, and K. Singh. Multivariate analysis by data depth: Descriptive statistics, graphics and
inference. The Annals of Statistics, 27(3):783–840, 1999. ISSN 00905364. URL http://www.jstor.org/stable/
120138. 5

X. Liu and Y. Zuo. Computing projection depth and its associated estimators. Statistics and Computing,
24(1):51–63, jan 2014. ISSN 0960-3174. doi: 10.1007/s11222-012-9352-6. URL http://link.springer.com/10.1007/s11222-012-9352-6. 10

J.-C. Massé. Asymptotics for the Tukey depth process, with an application to a multivariate trimmed mean. Bernoulli,
10(3):397–419, 06 2004. doi: 10.3150/bj/1089206404. URL https://doi.org/10.3150/bj/1089206404. 6

F. McSherry and K. Talwar. Mechanism design via differential privacy. In 48th Annual IEEE Symposium on
Foundations of Computer Science (FOCS’07), pages 94–103. IEEE, oct 2007. ISBN 0-7695-3010-9. doi: 10.1109/
FOCS.2007.4389483. 3

S. Nagy. Halfspace depth does not characterize probability distributions. arXiv e-prints, art. arXiv:1810.09207, Oct.
2018. 6

K. Nissim, S. Raskhodnikova, and A. Smith. Smooth sensitivity and sampling in private data analysis. In Proceedings
of the Thirty-Ninth Annual ACM Symposium on Theory of Computing, STOC ’07, page 75–84, New York, NY,
USA, 2007. Association for Computing Machinery. ISBN 9781595936318. doi: 10.1145/1250790.1250803. URL
https://doi.org/10.1145/1250790.1250803. 3

K. Ramsay and S. Chenouri. Robust, multiple change-point detection for covariance matrices using data depth. arXiv
e-prints, art. arXiv:2011.09558, Nov. 2020. 11

K. Ramsay, S. Durocher, and A. Leblanc. Integrated rank-weighted depth. Journal of Multivariate Analysis, 173:
51–69, 2019. ISSN 0047-259X. doi: 10.1016/j.jmva.2019.02.001. URL http://www.sciencedirect.com/science/article/pii/S0047259X18304068. 2, 6, 7, 13

M. Romanazzi. Influence function of halfspace depth. Journal of Multivariate Analysis, 77(1):138–161, 2001. ISSN
0047259X. doi: 10.1006/jmva.2000.1929. 1

R. Serfling. A depth function and a scale curve based on spatial quantiles. In Statistical Data Analysis Based on
the L1 -Norm and Related Methods, pages 25–38. Birkhäuser Basel, Basel, 2002. doi: 10.1007/978-3-0348-8201-9_3.
URL http://link.springer.com/10.1007/978-3-0348-8201-9_3. 5

R. J. Serfling. Depth functions in nonparametric multivariate inference. Data Depth: Robust Multivariate Analysis,
Computational Geometry, and Applications, pages 1–16, 2006. 6

J. W. Tukey. Mathematics and the picturing of data. In Proceedings of the International Congress of Mathematicians,
1974. 2, 6, 13

L. Wasserman and S. Zhou. A statistical framework for differential privacy. Journal of the American Statistical
Association, 105(489):375–389, 2010. ISSN 01621459. doi: 10.1198/jasa.2009.tm08651. 1, 8

Y. Zuo. Multivariate trimmed means based on data depth. In Y. Dodge, editor, Statistical Data Analysis Based on
the L1-Norm and Related Methods, pages 313–322, Basel, 2002. Birkhäuser Basel. ISBN 978-3-0348-8201-9. 5

Y. Zuo. Projection-based depth functions and associated medians. The Annals of Statistics, 31(5):1460–1490, 10
2003. doi: 10.1214/aos/1065705115. URL https://doi.org/10.1214/aos/1065705115. 2, 7, 10, 14

Y. Zuo. Influence function and maximum bias of projection depth based estimators. Annals of Statistics, 32(1):
189–218, 2004. 1, 5, 7, 10

Y. Zuo. A new approach for the computation of halfspace depth in high dimensions. Communications in Statistics
- Simulation and Computation, 48(3):900–921, mar 2019. ISSN 0361-0918. doi: 10.1080/03610918.2017.1402040.
URL https://www.tandfonline.com/doi/full/10.1080/03610918.2017.1402040. 6

Y. Zuo and R. Serfling. General notions of statistical depth function. Annals of Statistics, 28(2):461–482, 2000. 6, 7
