A central problem in the theory of multi-agent reinforcement learning (M...
We study the problem of Reinforcement Learning (RL) with linear function...
Calibration means that forecasts and average realized frequencies are cl...
We propose to smooth out the calibration score, which measures how good ...
In order to identify expertise, forecasters should not be tested by thei...
The current paper studies sample-efficient Reinforcement Learning (RL) i...
We consider the problem of contextual bandits where actions are subsets ...
Stochastic gradient descent (SGD) exhibits strong algorithmic regulariza...
This paper introduces a martingale that characterizes two properties of ...
Offline reinforcement learning seeks to utilize offline (observational) ...
This work describes a novel recurrent model for music composition, which...
Canonical Correlation Analysis (CCA) is a widely used statistical tool w...
We propose a new two stage algorithm LING for large scale regression pro...
In Natural Language Processing (NLP) tasks, data often has the following...
The problem of topic modeling can be seen as a generalization of the clu...
Hidden Markov Models (HMMs) can be accurately approximated using co-occu...
We compare the risk of ridge regression to a simple variant of ordinary ...