In many real-world settings, binary classification decisions are made bas...
Model selection in the context of bandit optimization is a challenging p...
Large transformer models trained on diverse datasets have shown a remark...
We study an abstract framework for interactive learning called interacti...
We consider model selection for sequential decision making in stochastic...
We propose Heuristic Blending (HUBL), a simple performance-improving tec...
In many bandit problems, the maximal reward achievable by a policy is of...
Transferring knowledge across domains is one of the most fundamental pro...
Two central paradigms have emerged in the reinforcement learning (RL) co...
Building generally capable agents is a grand challenge for deep reinforc...
Reinforcement learning provides an automated framework for learning beha...
The problem of how to genetically modify cells in order to maximize a ce...
We study the problem of model selection in bandit scenarios in the prese...
Classical theory in reinforcement learning (RL) predominantly focuses on...
Motivated by applications to online learning in sparse estimation and Ba...
We study meta-learning in Markov Decision Processes (MDPs) with linear tr...
We study a class of classification problems best exemplified by the bank...
We study the problem of information sharing and cooperation in Multi-Pla...
Much of the recent success of deep reinforcement learning has been drive...
We study the role of the representation of state-action value functions ...
Reinforcement learning (RL) is empirically successful in complex nonline...
We study a theory of reinforcement learning (RL) in which the learner re...
Standard approaches to decision-making under uncertainty focus on sequen...
Since its introduction a decade ago, relative entropy policy search (REP...
There has recently been significant interest in training reinforcement l...
In recent years, deep off-policy actor-critic algorithms have become a d...
We introduce ES-ENAS, a simple neural architecture search (NAS) algorith...
Whilst optimal transport (OT) is increasingly being recognized as a powe...
We propose a simple model selection approach for algorithms in stochasti...
Deep reinforcement learning has achieved impressive successes yet often ...
Over the last decade, a single algorithm has changed many facets of our ...
Maximum a posteriori (MAP) inference in discrete-valued Markov random fi...
The principle of optimism in the face of uncertainty is prevalent throug...
We study a constrained contextual linear bandit setting, where the goal ...
We consider model selection in stochastic bandit and reinforcement learn...
Learning under one-sided feedback (i.e., where examples arrive in an onl...
We present a new class of stochastic, geometrically-driven optimization ...
Mode estimation is a classical problem in statistics with a wide range o...
We study model selection in stochastic bandit problems. Our approach rel...
Thompson sampling is a methodology for multi-armed bandit problems that ...
Model-Based Reinforcement Learning (MBRL) offers a promising direction f...
Maintaining a population of solutions has been shown to increase explora...
We introduce ES-MAML, a new framework for solving the model agnostic met...
We propose an approach to fair classification that enforces independence...
We present a new algorithm for finding compact neural networks encoding ...
Maximum a posteriori (MAP) inference is a fundamental computational para...
We propose behavior-driven optimization via Wasserstein distances (WDs) ...
We propose a new class of structured methods for Monte Carlo (MC) sampli...
We present a new algorithm ASEBO for conducting optimization of high-dim...
Interest in derivative-free optimization (DFO) and "evolutionary strateg...