Paul Mineiro

research

∙ 02/28/2023

Time-uniform confidence bands for the CDF under nonstationarity

Estimation of the complete distribution of a random variable is a useful...

0 Paul Mineiro, et al. ∙

research

∙ 02/17/2023

Graph Feedback via Reduction to Regression

When feedback is partial, leveraging all available information is critic...

0 Paul Mineiro, et al. ∙

research

∙ 02/16/2023

Infinite Action Contextual Bandits with Reusable Data Exhaust

For infinite action contextual bandits, smoothed regret and reduction to...

0 Mark Rucker, et al. ∙

research

∙ 11/28/2022

Personalized Reward Learning with Interaction-Grounded Learning (IGL)

In an era of countless content offerings, recommender systems alleviate ...

0 Jessica Maghakian, et al. ∙

research

∙ 11/14/2022

Towards Data-Driven Offline Simulations for Online Reinforcement Learning

Modern decision-making systems, from robots to web recommendation engine...

0 Shengpu Tang, et al. ∙

research

∙ 10/25/2022

Eigen Memory Tree

This work introduces the Eigen Memory Tree (EMT), a novel online memory ...

0 Mark Rucker, et al. ∙

research

∙ 10/24/2022

Deploying a Steered Query Optimizer in Production at Microsoft

Modern analytical workloads are highly heterogeneous and massively compl...

0 Wangda Zhang, et al. ∙

research

∙ 10/24/2022

Conditionally Risk-Averse Contextual Bandits

We desire to apply contextual bandits to scenarios where average-case st...

0 Mónika Farsang, et al. ∙

research

∙ 10/20/2022

A lower confidence sequence for the changing mean of non-negative right heavy-tailed observations with bounded mean

A confidence sequence (CS) is an anytime-valid sequential inference prim...

0 Paul Mineiro, et al. ∙

research

∙ 10/19/2022

Anytime-valid off-policy inference for contextual bandits

Contextual bandit algorithms are ubiquitous tools for active sequential ...

0 Ian Waudby-Smith, et al. ∙

research

∙ 07/12/2022

Contextual Bandits with Smooth Regret: Efficient Learning in Continuous Action Spaces

Designing efficient general-purpose contextual bandit algorithms that wo...

0 Yinglun Zhu, et al. ∙

research

∙ 07/12/2022

Contextual Bandits with Large Action Spaces: Made Practical

A central problem in sequential decision making is to develop algorithms...

0 Yinglun Zhu, et al. ∙

research

∙ 06/16/2022

Interaction-Grounded Learning with Action-inclusive Feedback

Consider the problem setting of Interaction-Grounded Learning (IGL), in ...

2 Tengyang Xie, et al. ∙

research

∙ 06/13/2021

Bellman-consistent Pessimism for Offline Reinforcement Learning

The use of pessimism, when reasoning about datasets lacking exhaustive e...

0 Tengyang Xie, et al. ∙

research

∙ 06/09/2021

Interaction-Grounded Learning

Consider a prosthetic arm, learning to adapt to its user's control signa...

0 Tengyang Xie, et al. ∙

research

∙ 06/09/2021

ChaCha for Online AutoML

We propose the ChaCha (Champion-Challengers) algorithm for making an onl...

0 Qingyun Wu, et al. ∙

research

∙ 06/01/2021

Improving Long-Term Metrics in Recommendation Systems using Short-Horizon Offline RL

We study session-based recommendation scenarios where we want to recomme...

1 Bogdan Mazoure, et al. ∙

research

∙ 02/18/2021

Off-policy Confidence Sequences

We develop confidence bounds that hold uniformly over time for off-polic...

0 Nikos Karampatziakis, et al. ∙

research

∙ 06/07/2019

Empirical Likelihood for Contextual Bandits

We apply empirical likelihood techniques to contextual bandit policy val...

5 Nikos Karampatziakis, et al. ∙

research

∙ 05/06/2019

Lessons from Real-World Reinforcement Learning in a Customer Support Bot

In this work, we describe practical lessons we have learned from success...

0 Nikos Karampatziakis, et al. ∙

research

∙ 07/17/2018

Contextual Memory Trees

We design and study a Contextual Memory Tree (CMT), a learning memory co...

6 Wen Sun, et al. ∙

research

∙ 06/15/2016

Logarithmic Time One-Against-Some

We create a new online reduction of multiclass classification to binary ...

0 Hal Daumé III, et al. ∙

research

∙ 02/05/2016

Active Information Acquisition

We propose a general framework for sequential and dynamic acquisition of...

0 He He, et al. ∙

research

∙ 11/10/2015

A Hierarchical Spectral Method for Extreme Classification

Extreme classification problems are multiclass and multilabel classifica...

0 Paul Mineiro, et al. ∙

research

∙ 11/13/2014

A Randomized Algorithm for CCA

We present RandomizedCCA, a randomized algorithm for computing canonical...

0 Paul Mineiro, et al. ∙

research

∙ 08/09/2014

Normalized Online Learning

We introduce online learning algorithms which are independent of feature...

0 Stéphane Ross, et al. ∙

research

∙ 10/07/2013

Discriminative Features via Generalized Eigenvectors

Representing examples in a way that is compatible with the underlying cl...

0 Nikos Karampatziakis, et al. ∙

research

∙ 06/07/2013

Loss-Proportional Subsampling for Subsequent ERM

We propose a sampling scheme suitable for reducing a data set prior to s...

0 Paul Mineiro, et al. ∙
