On Lower Bounds for Standard and Robust Gaussian Process Bandit Optimization

Abstract

In this paper, we consider algorithm-independent lower bounds for the problem of black-box optimization of functions having a bounded norm in some Reproducing Kernel Hilbert Space (RKHS), which can be viewed as a non-Bayesian Gaussian process bandit problem. In the standard noisy setting, we provide a novel proof technique for deriving lower bounds on the regret, with benefits including simplicity, versatility, and an improved dependence on the error probability. In a robust setting in which every sampled point may be perturbed by a suitably-constrained adversary, we provide a novel lower bound for deterministic strategies, demonstrating an inevitable joint dependence of the cumulative regret on the corruption level and the time horizon, in contrast with existing lower bounds that only characterize the individual dependencies. Furthermore, in a distinct robust setting in which the final point is perturbed by an adversary, we strengthen an existing lower bound that only holds for target success probabilities very close to one, by allowing for arbitrary success probabilities above 2/3.

1. Introduction

…(Bogunovic et al., 2016; Wang & Jegelka, 2017; Janz et al., 2020), and algorithm-independent lower bounds have been given in several settings of interest (Bull, 2011; Scarlett et al., 2017; Scarlett, 2018; Chowdhury & Gopalan, 2019; Wang et al., 2020).

These theoretical works can be broadly categorized into one of two types: In the Bayesian setting, one adopts a Gaussian process prior according to some kernel function, whereas in the non-Bayesian setting, the function is assumed to lie in some Reproducing Kernel Hilbert Space (RKHS) and be upper bounded in terms of the corresponding RKHS norm.

In this paper, we focus on the non-Bayesian setting, and seek to broaden the existing understanding of algorithm-independent lower bounds on the regret, which have received significantly less attention than upper bounds. Our main contributions are briefly summarized as follows:

• In the standard noisy GP optimization setting, we provide an alternative proof strategy for the existing lower bounds of (Scarlett et al., 2017), which we believe to be of significant importance in itself due to the lack of techniques in the literature. We additionally show that our approach strengthens the dependence on the error probability, and give scenarios in which our approach is simpler and/or more versatile.
• …required just to succeed with probability at least 2/3.

The relevant existing results are highlighted throughout the paper, with further details in Appendix A.

2. Problem Setup

The three problem settings considered throughout the paper are formally described as follows. We additionally informally summarize the existing lower bounds in each of these settings, with formal statements given in Appendix A along with existing upper bounds. The existing and new bounds are summarized in Table 1 below.

2.1. Standard Setting

Let $f$ be a function on the compact domain $D = [0,1]^d$; by simple re-scaling, the results that we state readily extend to other rectangular domains. The smoothness of $f$ is modeled by assuming that $\|f\|_k \le B$, where $\|\cdot\|_k$ is the RKHS norm associated with some kernel function $k(x, x')$ (Rasmussen, 2006). The set of all functions satisfying $\|f\|_k \le B$ is denoted by $\mathcal{F}_k(B)$, and $x^*$ denotes an arbitrary maximizer of $f$.

At each round indexed by $t$, the algorithm selects some $x_t \in D$, and observes a noisy sample $y_t = f(x_t) + z_t$. Here the noise term is distributed as $N(0, \sigma^2)$, with $\sigma^2 > 0$ and independence between times.

We measure the performance using the following two widespread notions of regret, with a small numerical sketch given after the list:

• Simple regret: After $T$ rounds, an additional point $x^{(T)}$ is returned, and the simple regret is given by $r(x^{(T)}) = f(x^*) - f(x^{(T)})$.

• Cumulative regret: After $T$ rounds, the cumulative regret incurred is $R_T = \sum_{t=1}^{T} r_t$, where $r_t = f(x^*) - f(x_t)$.
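As a concrete illustration of these two notions, the following minimal sketch computes both regrets from a query history; the function f, its maximum value f_max, and the query points xs are assumptions supplied by the caller:

    import numpy as np

    def regrets(f, f_max, xs, x_final):
        """Simple regret of the returned point and cumulative regret of the queries."""
        inst = [f_max - f(x) for x in xs]   # r_t = f(x*) - f(x_t)
        simple = f_max - f(x_final)         # r(x^(T)) = f(x*) - f(x^(T))
        return simple, float(np.sum(inst))  # (simple regret, R_T)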
Note that in the special case that Φt (·) is deterministic, the
As with the previous work on noisy lower bounds (Scarlett
adversary knowing Φt also implies knowledge of xt .
et al., 2017), we focus on the squared exponential (SE) and
Matérn kernels, defined as follows with length-scale l > 0 For this problem to be meaningful, the adversary must be
and smoothness ν > 0 (Rasmussen, 2006): constrained. Following (Bogunovic et al., 2020), we as-
2 sume the following constraint for some corruption level C:
rx,x
0
0
kSE (x, x ) = exp − (1)
2l2 T
√ √
X
ν max |ct (x)| ≤ C. (4)
21−ν
2ν rx,x0 2ν rx,x0 x∈D
kMatérn (x, x0 ) = Jν , t=1
Γ(ν) l l
(2) When C = 0, we reduce to the setup of Section 2.1.
where rx,x0 = kx − x0 k, and Jν denotes the modified While both the simple regret and cumulative regret could
Bessel function. be considered here, we focus entirely on the latter, as it has
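To make the definitions concrete, here is a minimal numerical sketch of (1)-(2), assuming NumPy/SciPy; note that SciPy's kv implements the modified Bessel function denoted $J_\nu$ in (2):

    import numpy as np
    from scipy.special import gamma, kv  # kv: modified Bessel function of the second kind

    def k_se(x, xp, l=0.1):
        """Squared exponential kernel, Eq. (1)."""
        r = np.linalg.norm(np.atleast_1d(x) - np.atleast_1d(xp))
        return np.exp(-r**2 / (2 * l**2))

    def k_matern(x, xp, l=0.1, nu=2.5):
        """Matern-nu kernel, Eq. (2)."""
        r = np.linalg.norm(np.atleast_1d(x) - np.atleast_1d(xp))
        if r == 0.0:
            return 1.0  # k(x, x) = 1 in the limit r -> 0
        z = np.sqrt(2 * nu) * r / l
        return (2 ** (1 - nu) / gamma(nu)) * z**nu * kv(nu, z)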
Existing lower bounds. The results of (Scarlett et al., 2017) are informally summarized as follows:

• Attaining (average or constant-probability) simple regret $\epsilon$ requires the time horizon to satisfy $T = \Omega\big( \frac{1}{\epsilon^2} \big(\log\frac{1}{\epsilon}\big)^{d/2} \big)$ for the SE kernel, and $T = \Omega\big( \big(\frac{1}{\epsilon}\big)^{2+d/\nu} \big)$ for the Matérn kernel.

• The (average or constant-probability) cumulative regret is lower bounded according to $R_T = \Omega\big( \sqrt{T (\log T)^{d/2}} \big)$ for the SE kernel, and $R_T = \Omega\big( T^{\frac{\nu+d}{2\nu+d}} \big)$ for the Matérn kernel.

The SE kernel bounds have near-matching upper bounds (Srinivas et al., 2010), and while standard results yield wider gaps for the Matérn kernel, these have been tightened in recent works; see Appendix A for details.

In Sections 4.2–4.5, we will present novel analysis techniques that can both simplify the proofs and strengthen the dependence on the error probability compared to the lower bounds in (Scarlett et al., 2017).³

³ We are not aware of any way to adapt the analysis of (Scarlett et al., 2017) to obtain a high-probability lower bound that grows unbounded as the target error probability approaches zero.

2.2. Robust Setting – Corrupted Samples

In the robust setting studied in (Bogunovic et al., 2020), the optimization goal is similar, but each sampled point is further subject to adversarial noise; for $t = 1, \ldots, T$:

• Based on the previous samples $\{(x_i, \tilde{y}_i)\}_{i=1}^{t-1}$, the player selects a distribution $\Phi_t(\cdot)$ over $D$.

• Given knowledge of the true function $f$, the previous samples $\{(x_i, y_i)\}_{i=1}^{t-1}$, and the player's distribution $\Phi_t(\cdot)$, an adversary selects a function $c_t(\cdot) : D \to [-B_0, B_0]$, where $B_0 > 0$ is constant.

• The player draws $x_t \in D$ from the distribution $\Phi_t$, and observes the corrupted sample
$$\tilde{y}_t = y_t + c_t(x_t), \quad (3)$$
where $y_t$ is the noisy non-corrupted observation $y_t = f(x_t) + z_t$ as in Section 2.1.

Note that in the special case that $\Phi_t(\cdot)$ is deterministic, the adversary knowing $\Phi_t$ also implies knowledge of $x_t$.

For this problem to be meaningful, the adversary must be constrained. Following (Bogunovic et al., 2020), we assume the following constraint for some corruption level $C$:
$$\sum_{t=1}^{T} \max_{x \in D} |c_t(x)| \le C. \quad (4)$$

When $C = 0$, we reduce to the setup of Section 2.1.
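As a simplified illustration of the budget constraint (4), consider the special case where each $c_t(\cdot)$ is a constant function, so that $\max_{x \in D} |c_t(x)| = |c_t|$; a minimal sketch under that assumption:

    class BudgetedAdversary:
        """Tracks the corruption budget C of Eq. (4) when each c_t is constant in x."""
        def __init__(self, C):
            self.remaining = C
        def corrupt(self, y_t, c_t):
            # Truncate the requested corruption so the running total stays within C.
            c = max(-self.remaining, min(c_t, self.remaining))
            self.remaining -= abs(c)
            return y_t + c  # corrupted observation, cf. Eq. (3)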
While both the simple regret and cumulative regret could be considered here, we focus entirely on the latter, as it has been the focus of the related existing works (Bogunovic et al., 2020; 2021; Lykouris et al., 2018; Gupta et al., 2019; Li et al., 2019). See also (Bogunovic et al., 2020, App. C) for discussion on the use of simple regret in this setting.

Existing lower bound. The only lower bound stated in (Bogunovic et al., 2020) states that $R_T = \Omega(C)$ for any algorithm, whereas the upper bound therein essentially amounts to multiplying (rather than adding) the uncorrupted regret bound by $C$. Thus, significant gaps remain in terms of the joint dependence on $C$ and $T$, which our lower bound in Section 4.6 will partially address.
Table 1. Summary of new and existing regret bounds. $T$ denotes the time horizon, $d$ denotes the dimension, $\xi$ denotes the corruption radius, and $\delta$ denotes the allowed error probability. In the middle rows, $\bar{R}^{\rm std}_T$ and $R^{\rm std}_T$ denote upper and lower bounds on the standard cumulative regret. The existing upper and lower bounds are from (Srinivas et al., 2010; Chowdhury & Gopalan, 2017; Scarlett et al., 2017; Bogunovic et al., 2018a; 2020), with the partial exception of the Matérn kernel upper bounds, which are detailed at the end of Appendix A.4. The notation $O^*(\cdot)$ and $\Omega^*(\cdot)$ hides dimension-independent $\log T$ factors, as well as $\log\log\frac{1}{\delta}$ factors.

SE kernel:

• Standard, Cumulative Regret¹,²: Upper bound $O^*\big(\sqrt{T(\log T)^{2d}\log\frac{1}{\delta}}\big)$; Existing lower bound $\Omega\big(\sqrt{T(\log T)^{d/2}}\big)$; Our lower bound $\Omega^*\big(\sqrt{T(\log T)^{d/2}\log\frac{1}{\delta}}\big)$.

• Corrupted Samples, Cumul. Regret, $\delta = \Theta(1)$: Upper bound $O^*\big(\bar{R}^{\rm std}_T + C\sqrt{T}(\log T)^{d}\big)$; Existing lower bound $\Omega\big(R^{\rm std}_T + C\big)$; Our lower bound $\Omega\big(R^{\rm std}_T + C(\log T)^{d/2}\big)$.

• Corrupted Final Point, Time to $\epsilon$-optimality²: Upper bound $O^*\big(\frac{1}{\epsilon^2}\big(\log\frac{1}{\epsilon}\big)^{2d}\log\frac{1}{\delta}\big)$; Existing lower bound $\Omega\big(\frac{1}{\epsilon^2}\big(\log\frac{1}{\epsilon}\big)^{d/2}\big)$ (only for $\delta \le O(\xi^d)$); Our lower bound $\Omega\big(\frac{1}{\epsilon^2}\big(\log\frac{1}{\epsilon}\big)^{d/2}\log\frac{1}{\delta}\big)$.

Matérn-$\nu$ kernel:

• Standard, Cumulative Regret¹: Upper bound $O^*\big(T^{\frac{\nu+d}{2\nu+d}}\sqrt{\log\frac{1}{\delta}}\big)$; Existing lower bound $\Omega\big(T^{\frac{\nu+d}{2\nu+d}}\big)$; Our lower bound $\Omega\big(T^{\frac{\nu+d}{2\nu+d}}\big(\log\frac{1}{\delta}\big)^{\frac{\nu}{2\nu+d}}\big)$.

• Corrupted Samples, Cumul. Regret, $\delta = \Theta(1)$: Upper bound $O^*\big(\bar{R}^{\rm std}_T + C\,T^{\frac{\nu+d}{2\nu+d}}\big)$; Existing lower bound $\Omega\big(R^{\rm std}_T + C\big)$; Our lower bound $\Omega\big(R^{\rm std}_T + C^{\frac{\nu}{d+\nu}}T^{\frac{d}{d+\nu}}\big)$.

• Corrupted Final Point, Time to $\epsilon$-optimality: Upper bound $O^*\big(\big(\frac{1}{\epsilon^{2+d/\nu}}\big)^{1+\frac{d}{2\nu-d}} + \frac{1}{\epsilon^2}\log\frac{1}{\delta}\big)$ (only for $d < 2\nu$); Existing lower bound $\Omega\big(\frac{1}{\epsilon^2}\big(\frac{1}{\epsilon}\big)^{d/\nu}\big)$ (only for $\delta \le O(\xi^d)$); Our lower bound $\Omega\big(\frac{1}{\epsilon^2}\big(\frac{1}{\epsilon}\big)^{d/\nu}\log\frac{1}{\delta}\big)$.
2.3. Robust Setting – Corrupted Final Point

…following the worst-case perturbation within $\Delta_\xi(\cdot)$; in particular, the global robust optimizer is given by
$$x^*_\xi \in \arg\max_{x \in D}\, \min_{\delta \in \Delta_\xi(x)} f(x + \delta). \quad (6)$$

Then, if the algorithm returns $x^{(T)}$, the performance is measured by the $\xi$-regret:
$$r_\xi(x) = \min_{\delta \in \Delta_\xi(x^*_\xi)} f(x^*_\xi + \delta) - \min_{\delta \in \Delta_\xi(x)} f(x + \delta). \quad (7)$$
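To illustrate (6)-(7) concretely, the following sketch approximates the robust value and the $\xi$-regret over a finite grid of candidate points; the grid-based search, the candidate set, and f are assumptions made for illustration only:

    import numpy as np

    def robust_value(f, x0, candidates, xi):
        """Approximate min over perturbations ||delta||_2 <= xi of f(x0 + delta),
        searching a finite grid (delta = 0 is always included)."""
        x0 = np.asarray(x0)
        pts = [x0] + [c for c in candidates if np.linalg.norm(c - x0) <= xi]
        return min(f(p) for p in pts)

    def xi_regret(f, x, candidates, xi):
        """xi-regret of a reported point x, cf. Eqs. (6)-(7)."""
        best = max(robust_value(f, c, candidates, xi) for c in candidates)  # value at x*_xi
        return best - robust_value(f, x, candidates, xi)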
…Here, the implied constants may depend on $(d, l, \nu)$.

In Section 4.5, we show that for the Matérn kernel, the analysis can be simplified even further by using a function class proposed in (Bull, 2011) (which studied the noiseless setting) with bounded support.

3.2. Robust Setting – Corrupted Samples

In the general setup studied in (Bogunovic et al., 2020) and presented in Section 2.2, the player may randomize the choice of action, and the adversary can know the distribution but not the specific action. However, if the player's actions are deterministic (given the history), then knowing the distribution is equivalent to knowing the specific action. In this section, we provide a lower bound for such scenarios. While a lower bound that only holds for deterministic algorithms may seem limited, it is worth noting that the smallest regret upper bound in (Bogunovic et al., 2020) (see Theorem 9 in Appendix A) is established using such an algorithm. More generally, it is important to know to what extent randomization is needed for robustness, so bounds for both deterministic and randomized algorithms are of significant interest.

Theorem 3. (Lower Bound – Corrupted Samples) In the setting of corrupted samples with a corruption level satisfying $\Theta(1) \le C \le T^{1-\Omega(1)}$, even in the noiseless setting ($\sigma^2 = 0$), any deterministic algorithm (including those having knowledge of $C$) yields the following with probability one for some $f \in \mathcal{F}_k(B)$:

• Under the SE kernel, $R_T = \Omega\big(C(\log T)^{d/2}\big)$;

• Under the Matérn-$\nu$ kernel, $R_T = \Omega\big(C^{\frac{\nu}{d+\nu}} T^{\frac{d}{d+\nu}}\big)$.

We provide a proof outline in Section 4.6, and the full details in Appendix B. We note that the assumption $\Theta(1) \le C \le T^{1-\Omega(1)}$ primarily rules out the case $C = \Theta(T)$, in which the adversary can corrupt every point by a constant amount. This assumption also ensures that the bound $R_T = \Omega\big(C^{\frac{\nu}{d+\nu}} T^{\frac{d}{d+\nu}}\big)$ is stronger than the bound $R_T = \Omega(C)$ from (Bogunovic et al., 2020). Note also that any lower bound for the standard setting applies here, since the adversary can choose not to corrupt.
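To see why the second bullet of Theorem 3 dominates $\Omega(C)$ under the stated assumption, note the one-line check (using $C \le T^{1-\Omega(1)}$):
$$\frac{C^{\frac{\nu}{d+\nu}}\, T^{\frac{d}{d+\nu}}}{C} = \Big(\frac{T}{C}\Big)^{\frac{d}{d+\nu}} \ge T^{\Omega(1)\cdot\frac{d}{d+\nu}},$$
which grows polynomially in $T$.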
Theorem 3 addresses a question posed in (Bogunovic et al., 2020) on the joint dependence of the cumulative regret on $C$ and $T$. The upper bounds established therein (one of which we replicate in Theorem 9 in Appendix A) are of the form $O(C R^{(0)})$, where $R^{(0)}$ is a standard (non-corrupted) regret bound, whereas analogous results from the multi-armed bandit literature (Gupta et al., 2019) suggest that $\tilde{O}(R^{(0)} + C)$ may be possible, where the $\tilde{O}(\cdot)$ notation hides dimension-independent logarithmic factors. Theorem 3 shows that, at least for deterministic algorithms, such a level of improvement is impossible in the RKHS setting. On the other hand, further gaps remain between the lower bounds in Theorem 3 and the $O(C R^{(0)})$ upper bounds of (Bogunovic et al., 2020) (e.g., for the SE kernel, the latter introduces an $\tilde{O}\big(C\sqrt{T(\log T)^{2d}}\big)$ term, whereas Theorem 3 gives an $\Omega\big(C(\log T)^{d/2}\big)$ lower bound). Recent results for the linear bandit setting (Bogunovic et al., 2021) suggest that the looseness here may be in the upper bound; this is left for future work.
laws even when the algorithm is only required to succeed
We provide a proof outline in Section 4.6, and the full de-
with a small probability such as 0.01 (i.e, δ = 0.99). In
tails in Appendix B. We note that the assumption Θ(1) ≤
addition, for small δ, we attain a log 1δ factor improvement
C ≤ T 1−Ω(1) primarily rules out the case C = Θ(T )
similar to Theorem 1.
in which the adversary can corrupt every point by a con-
stant amount. This assumption also ensures that the bound
ν d
RT = Ω C d+ν T d+ν is stronger than the bound RT = 4. Mathematical Analysis and Proofs
Ω(C) from (Bogunovic et al., 2020). Note also that any 4.1. Preliminaries
lower bound for the standard setting applies here, since the
adversary can choose not to corrupt. In this section, we introduce some preliminary auxiliary re-
sults from (Scarlett et al., 2017) that will be used through-
Theorem 3 addresses a question posed in (Bogunovic et al.,
out our analysis. While we utilize the function class and
2020) on the joint dependence of the cumulative regret on
auxiliary results from this existing work, we apply them in
C and T . The upper bounds established therein (one of
a significantly different manner in order to broaden the lim-
which we replicate in Theorem 9 in Appendix A) are of the
ited techniques known for GP bandit lower bounds, and to
form O(CR(0) ), where R(0) is a standard (non-corrupted)
reap the advantages outlined above and in Section 4.4.
regret bound, whereas analogous results from the multi-
armed bandit literature (Gupta et al., 2019) suggest that We proceed as follows (Scarlett et al., 2017):
Õ(R(0) + C) may be possible, where the Õ(·) notation
• We lower bound the worst-case regret within Fk (B)
hides dimension-independent logarithmic factors.
by the regret averaged over a finite collection
Theorem 3 shows that, at least for deterministic algorithms, {f1 , . . . , fM } ⊂ Fk (B) of size M .
…such functions; we will always consider $w \ll 1$, so that the case $M = 0$ is avoided. See Figure 1 for an illustration of the function class.

• It is shown in (Scarlett et al., 2017) that the above properties can be achieved with
$$M = \Bigg\lfloor \Bigg( \frac{\sqrt{\log\frac{B(2\pi l^2)^{d/4} h(0)}{2\epsilon}}}{\zeta \pi l} \Bigg)^{d} \Bigg\rfloor \quad (15)$$
in the case of the SE kernel, and with
$$M = \bigg\lfloor \Big( \frac{B c_3}{\epsilon} \Big)^{d/\nu} \bigg\rfloor \quad (16)$$
in the case of the Matérn kernel, where $c_3 := \zeta \cdot \frac{c_2^{-1/2}}{2(8\pi^2)^{(\nu+d/2)/2}}$, and where $c_2 > 0$ is an absolute constant. Note that these values of $M$ amount to …

…where $P_m(y|x)$ is the distribution of an observation $y$ for a given selected point $x$ under the function $f_m$, and similarly for $P_0(y|x)$.

• Let $N_j \in \{0, \ldots, T\}$ be a random variable representing the number of points from $R_j$ that are selected throughout the $T$ rounds.

Finally, the following auxiliary lemmas will be useful.

Lemma 2. (Scarlett et al., 2017, Eq. (36)) For $P_1$ and $P_2$ being Gaussian with means $(\mu_1, \mu_2)$ and a common variance $\sigma^2$, we have $D(P_1 \| P_2) = \frac{(\mu_1 - \mu_2)^2}{2\sigma^2}$.

Lemma 3. (Scarlett et al., 2017, Lemma 7) The functions $\{f_m\}$ corresponding to (15)–(16) are such that the quantities $v_{jm}$ in (17) satisfy (i) $\sum_{j=1}^M v_{jm} = O(\epsilon)$ for all $m$; (ii) $\sum_{m=1}^M v_{jm} = O(\epsilon)$ for all $j$; and (iii) $\sum_{m=1}^M (v_{jm})^2 = O(\epsilon^2)$ for all $j$.
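For intuition, the formula in Lemma 2 is a direct computation; writing out the Gaussian log-density ratio and taking the expectation under $P_1$ (a standard identity, stated here for completeness):
$$D(P_1 \| P_2) = \mathbb{E}_{P_1}\Big[\log\frac{P_1(y)}{P_2(y)}\Big] = \mathbb{E}_{P_1}\Big[\frac{(y-\mu_2)^2 - (y-\mu_1)^2}{2\sigma^2}\Big] = \frac{(\mu_1-\mu_2)^2}{2\sigma^2},$$
using $\mathbb{E}_{P_1}[(y-\mu_1)^2] = \sigma^2$ and $\mathbb{E}_{P_1}[(y-\mu_2)^2] = \sigma^2 + (\mu_1-\mu_2)^2$.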
4.2. Standard Setting – Simple Regret

For fixed $\epsilon > 0$, we consider the function class $\{f_1, \ldots, f_M\}$ described in Section 4.1, with $B$ replaced by $\frac{B}{3}$. This change should be understood as applying to all previous equations and auxiliary results that we use, e.g., replacing $B$ by $\frac{B}{3}$ in (16), but this only affects constant factors, which we do not attempt to optimize anyway. Before continuing, we recall the important property that any $x \in [0,1]^d$ can be $\epsilon$-optimal for at most one function.

For fixed $m$ and $m'$, we will apply Lemma 1 with $f(x) = f_m(x)$ and $f'(x) = f_m(x) + 2f_{m'}(x)$. Intuitively, this choice is made so that $f$ and $f'$ have different maximizers, but remain near-identical except in a small region around the peak of $f'$. It will be useful to characterize the quantity $D_j^{f,f'}$ in (9); by Lemma 2,
$$D_j^{f,f'} = \max_{x \in R_j} \frac{|2f_{m'}(x)|^2}{2\sigma^2} = \frac{2(v_{jm'})^2}{\sigma^2}, \quad (19)$$
using the definition of $v_{jm}$ in (17).

In the following, let $A$ be the event that the returned point $x^{(T)}$ lies in the region $R_m$ (defined just above (17)). Suppose that an algorithm attains simple regret at most $\epsilon$ for both $f$ and $f'$ (note that $\|f'\|_k \le B$ by the triangle inequality and $\max_j \|f_j\|_k \le \frac{B}{3}$), each with probability at least $1 - \delta$. We claim that this implies $P_f[A] \ge 1 - \delta$ and $P_{f'}[A] \le \delta$. Indeed, by construction in Section 4.1, only points in $R_m$ can be $\epsilon$-optimal under $f = f_m$, and only points in $R_{m'}$ can be $\epsilon$-optimal under $f' = f_m + 2f_{m'}$. Hence, Lemma 1 and (19) give
$$\frac{2}{\sigma^2} \sum_{j=1}^{M} \mathbb{E}_m[N_j] \cdot (v_{jm'})^2 \ge \log\frac{1}{2.4\delta}, \quad (20)$$
and summing over all $m' \ne m$ gives
$$\frac{2}{\sigma^2} \sum_{m' \ne m} \sum_{j=1}^{M} \mathbb{E}_m[N_j] \cdot (v_{jm'})^2 \ge (M-1) \log\frac{1}{2.4\delta}. \quad (21)$$

…

Remark 1. The preceding analysis can easily be adapted to show that when $T$ is allowed to have variable length (i.e., the algorithm is allowed to choose when to stop), $\mathbb{E}[T]$ is lower bounded by the right-hand side of (23). See (Gabillon et al., 2012) for a discussion on analogous variations in the context of multi-armed bandits.

4.3. Standard Setting – Cumulative Regret

We fix some $\epsilon > 0$ to be specified later, consider the function class $\{f_1, \ldots, f_M\}$ from Section 4.1, and show that it is not possible to attain $R_T \le \frac{T\epsilon}{2}$ with probability at least $1 - \delta$ for all functions with $\|f\|_k \le B$.

Assuming by contradiction that the preceding goal is possible, this class of functions includes the choices of $f$ and $f'$ at the start of Section 4.2. However, if we let $A$ be the event that at least $\frac{T}{2}$ of the sampled points lie in $R_m$, it follows that $P_f[A] \ge 1 - \delta$ and $P_{f'}[A] \le \delta$, since (i) each sample outside $R_m$ incurs regret at least $\epsilon$ under $f$; and (ii) each sample within $R_m$ incurs regret at least $\epsilon$ under $f'$.

Hence, despite being derived with a different choice of $A$, (20) still holds in this case, and (23) follows. This was derived under the assumption that $R_T \le \frac{T\epsilon}{2}$ with probability at least $1 - \delta$ for all functions with $\|f\|_k \le B$; the contrapositive statement is that when
$$T < \frac{(M-1)\sigma^2}{2c_0\epsilon^2} \log\frac{1}{2.4\delta}, \quad (24)$$
it must be the case that some function yields $R_T > \frac{T\epsilon}{2}$ with probability at least $\delta$.

The remainder of the proof of Theorem 2 follows that of (Scarlett et al., 2017, Sec. 5.3), but with $\sigma^2 \log\frac{1}{\delta}$ in place of $\sigma^2$, and the final regret expressions adjusted accordingly (e.g., compare Theorem 2 with Theorem 8 in Appendix A). Due to this similarity, we only outline the details:

(i) Consider (24) nearly holding with equality (e.g., $T = \frac{M\sigma^2}{4c_0\epsilon^2}\log\frac{1}{2.4\delta}$ suffices). …
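To illustrate step (i) informally for the Matérn kernel: substituting the scaling $M = \Theta\big(\big(\frac{B}{\epsilon}\big)^{d/\nu}\big)$ from (16) and treating $B$, $\sigma^2$, and $c_0$ as constants, the threshold in (24) scales as $T \asymp \frac{\sigma^2}{\epsilon^{2+d/\nu}} \log\frac{1}{\delta}$; solving for $\epsilon$ and substituting into $R_T > \frac{T\epsilon}{2}$ gives
$$\epsilon \asymp \Big(\frac{\log\frac{1}{\delta}}{T}\Big)^{\frac{\nu}{2\nu+d}} \implies R_T = \Omega\Big(T^{\frac{\nu+d}{2\nu+d}}\big(\log\tfrac{1}{\delta}\big)^{\frac{\nu}{2\nu+d}}\Big),$$
matching the Matérn entry in Table 1.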
…change-of-measure result (see Lemma 8 in Appendix D). We highlight the following advantages of our approach:

• While both approaches can be used to lower bound the average or constant-probability regret,⁶ the above analysis gives the more precise $\log\frac{1}{\delta}$ dependence when the algorithm is required to succeed with probability at least $1 - \delta$.

• As highlighted in Remark 1, the above simple regret analysis extends immediately to provide a lower bound on $\mathbb{E}[T]$ in the varying-$T$ setting, whereas attaining this via the approach of (Scarlett et al., 2017) appears to be less straightforward.

• Although we do not explore it in this paper, we expect our approach to be more amenable to deriving instance-dependent regret bounds, rather than worst-case regret bounds over the function class. The idea, as in the multi-armed bandit setting (Kaufmann et al., 2016), is that if we can "perturb" one function/instance to another so that the set of near-optimal points changes significantly, we can use Lemma 1 to infer bounds on the required number of time steps in the original instance.

• As evidence of the versatility of our approach, we will use it in Section 4.7 to derive an improved result over that of (Bogunovic et al., 2018a) in the robust setting with a corrupted final point.

4.5. Simplified Analysis – Matérn Kernel

In the function class from (Scarlett et al., 2017) used above, each function is a bump function (with bounded support) in the frequency domain, meaning that it is non-zero almost everywhere in the spatial domain. In contrast, the earlier work of Bull (Bull, 2011) for the noiseless setting directly …

…
• $g(x) = 0$ for all $x$ outside the $\ell_2$-ball of radius $w$ centered at the origin;
• $g(x) \in [0, 2\epsilon]$ for all $x$, and $g(0) = 2\epsilon$;
• $\|g\|_k \le c_1 \frac{2\epsilon}{h(0)} \big(\frac{1}{w}\big)^{\nu} \|h\|_k$ when $k$ is the Matérn-$\nu$ kernel on $\mathbb{R}^d$, where $c_1$ is constant. In particular, we have $\|g\|_k \le B$ when $w = \big(\frac{2 c_1 \epsilon \|h\|_k}{h(0) B}\big)^{1/\nu}$.

This function can be used to simplify both the original analysis in (Scarlett et al., 2017), and the alternative proof in Sections 4.2–4.3. We focus on the latter, and on the simple regret; the cumulative regret can be handled similarly.

We consider functions $\{f_1, \ldots, f_M\}$ constructed similarly to Section 4.1, but with each $f_m$ being a shifted and scaled version of $g(x)$ in Lemma 4, using the choice of $w$ in the third statement of the lemma. By the first part of the lemma, the functions have disjoint support as long as their center points are separated by at least $w$. We also use $\frac{B}{3}$ in place of $B$ in the same way as Section 4.2. By forming a regularly spaced grid in each dimension, it follows that we can form
$$M = \Big\lfloor \frac{1}{w} \Big\rfloor^{d} = \bigg\lfloor \frac{h(0)B}{6 c_1 \epsilon \|h\|_k} \bigg\rfloor^{d/\nu} \quad (25)$$
such functions.⁷ Observe that this matches the $O\big(\big(\frac{B}{\epsilon}\big)^{d/\nu}\big)$ scaling in (16).

We clearly still have the property that any point $x$ is $\epsilon$-optimal for at most one function. The additional useful property here is that any point $x$ yields a non-zero value for at most one function. Letting $\{R_j\}_{j=1}^{M}$ be the partition of the domain induced by the above-mentioned grid (so that each $f_m$'s support is a subset of $R_m$), we notice that (20) still holds (with the choice of function in the definition of $v_{jm'}$ suitably modified), but now simplifies to …
References

Aronszajn, N. Theory of reproducing kernels. Trans. Amer. Math. Soc., 68(3):337–404, 1950.

Auer, P., Cesa-Bianchi, N., Freund, Y., and Schapire, R. E. Gambling in a rigged casino: The adversarial multi-armed bandit problem. In IEEE Conf. Found. Comp. Sci. (FOCS), 1995.

Aziz, M., Anderton, J., Kaufmann, E., and Aslam, J. Pure exploration in infinitely-armed bandit models with fixed-confidence. In Alg. Learn. Theory (ALT), 2018.

Beland, J. J. and Nair, P. B. Bayesian optimization under uncertainty. NIPS BayesOpt 2017 workshop, 2017.

Bertsimas, D., Nohadani, O., and Teo, K. M. Nonconvex robust optimization for problems with constraints. INFORMS Journal on Computing, 22(1):44–58, 2010.

Bogunovic, I., Scarlett, J., Krause, A., and Cevher, V. Truncated variance reduction: A unified approach to Bayesian optimization and level-set estimation. In Conf. Neur. Inf. Proc. Sys. (NeurIPS), 2016.

Bogunovic, I., Scarlett, J., Jegelka, S., and Cevher, V. Adversarially robust optimization with Gaussian processes. In Conf. Neur. Inf. Proc. Sys. (NeurIPS), 2018a.

Bogunovic, I., Zhao, J., and Cevher, V. Robust maximization of non-submodular objectives. In Int. Conf. Art. Intel. Stats. (AISTATS), 2018b.

Bogunovic, I., Krause, A., and Scarlett, J. Corruption-tolerant Gaussian process bandit optimization. In Int. Conf. Art. Intel. Stats. (AISTATS), 2020.

Bogunovic, I., Losalka, A., Krause, A., and Scarlett, J. Stochastic linear bandits robust to adversarial attacks. In Int. Conf. Art. Intel. Stats. (AISTATS), 2021.

Bull, A. D. Convergence rates of efficient global optimization algorithms. J. Mach. Learn. Res., 12(Oct.):2879–2904, 2011.

Calandriello, D., Carratino, L., Lazaric, A., Valko, M., and Rosasco, L. Gaussian process optimization with adaptive sketching: Scalable and no regret. In Conf. Learn. Theory (COLT), 2019.

Chowdhury, S. R. and Gopalan, A. On kernelized multi-armed bandits. In Int. Conf. Mach. Learn. (ICML), 2017.

Chowdhury, S. R. and Gopalan, A. Bayesian optimization under heavy-tailed payoffs. In Conf. Neur. Inf. Proc. Sys. (NeurIPS), 2019.

Contal, E., Buffoni, D., Robicquet, A., and Vayatis, N. Machine Learning and Knowledge Discovery in Databases, chapter Parallel Gaussian Process Optimization with Upper Confidence Bound and Pure Exploration, pp. 225–240. Springer Berlin Heidelberg, 2013.

Dai Nguyen, T., Gupta, S., Rana, S., and Venkatesh, S. Stable Bayesian optimization. In Pacific-Asia Conf. Knowl. Disc. and Data Mining, 2017.

De Freitas, N., Smola, A. J., and Zoghi, M. Exponential regret bounds for Gaussian process bandits with deterministic observations. In Int. Conf. Mach. Learn. (ICML), 2012.

Duchi, J. Lecture notes for statistics 311/electrical engineering 377. http://stanford.edu/class/stats311/.

Gabillon, V., Ghavamzadeh, M., and Lazaric, A. Best arm identification: A unified approach to fixed budget and fixed confidence. In Conf. Neur. Inf. Proc. Sys. (NeurIPS), 2012.

Grill, J.-B., Valko, M., and Munos, R. Optimistic optimization of a Brownian. In Conf. Neur. Inf. Proc. Sys. (NeurIPS), 2018.

Grünewälder, S., Audibert, J.-Y., Opper, M., and Shawe-Taylor, J. Regret bounds for Gaussian process bandit problems. In Int. Conf. Art. Intel. Stats. (AISTATS), pp. 273–280, 2010.

Gupta, A., Koren, T., and Talwar, K. Better algorithms for stochastic bandits with adversarial corruptions. In Conf. Learn. Theory (COLT), 2019.

Janz, D., Burt, D. R., and González, J. Bandit optimisation of functions in the Matérn kernel RKHS. In Int. Conf. Art. Intel. Stats. (AISTATS), 2020.

Kaufmann, E., Cappé, O., and Garivier, A. On the complexity of best-arm identification in multi-armed bandit models. J. Mach. Learn. Res. (JMLR), 17(1):1–42, 2016.

Kawaguchi, K., Kaelbling, L. P., and Lozano-Pérez, T. Bayesian optimization with exponential convergence. In Conf. Neur. Inf. Proc. Sys. (NeurIPS), 2015.

Kirschner, J., Bogunovic, I., Jegelka, S., and Krause, A. Distributionally robust Bayesian optimization. In Int. Conf. Art. Intel. Stats. (AISTATS), 2020.

Li, Y., Lou, E. Y., and Shan, L. Stochastic linear optimization with adversarial corruption. https://arxiv.org/abs/1909.02109, 2019.

Lykouris, T., Mirrokni, V., and Paes Leme, R. Stochastic bandits robust to adversarial corruptions. In ACM Symp. Theory Comp. (STOC), 2018.

Lyu, Y., Yuan, Y., and Tsang, I. W. Efficient batch black-box optimization with deterministic regret bounds. https://arxiv.org/abs/1905.10041, 2019.

Martinez-Cantin, R., Tee, K., and McCourt, M. Practical Bayesian optimization in the presence of outliers. In Int. Conf. Art. Intel. Stats. (AISTATS), 2018.

Nguyen, T. T., Gupta, S., Ha, H., Rana, S., and Venkatesh, S. Distributionally robust Bayesian quadrature optimization. In Int. Conf. Art. Intel. Stats. (AISTATS), 2020.

Nogueira, J., Martinez-Cantin, R., Bernardino, A., and Jamone, L. Unscented Bayesian optimization for safe robot grasping. In IEEE/RSJ Int. Conf. Intel. Robots and Systems (IROS), 2016.

Vakili, S., Khezeli, K., and Picheny, V. On information gain and regret bounds in Gaussian process bandits. In Int. Conf. Art. Intel. Stats. (AISTATS), 2021.

Valko, M., Korda, N., Munos, R., Flaounas, I., and Cristianini, N. Finite-time analysis of kernelised contextual bandits. In Conf. Uncertainty in AI (UAI), 2013.

Wang, Z. and Jegelka, S. Max-value entropy search for efficient Bayesian optimization. In Int. Conf. Mach. Learn. (ICML), pp. 3627–3635, 2017.

Wang, Z., Zhou, B., and Jegelka, S. Optimization as estimation with Gaussian processes in bandit settings. In Int. Conf. Art. Intel. Stats. (AISTATS), 2016.

Wang, Z., Tan, V. Y. F., and Scarlett, J. Tight regret bounds for noisy optimization of a Brownian motion. https://arxiv.org/abs/2001.09327, 2020.