Stochastic multi-armed bandits are a sequential-decision-making framewor...
Many real-world machine learning applications are characterized by a hug...
Policy-based algorithms are among the most widely adopted techniques in ...
A large variety of real-world Reinforcement Learning (RL) tasks is chara...
In Reinforcement Learning (RL), an agent acts in an unknown environment ...
Inverse reinforcement learning (IRL) denotes a powerful family of algori...
The most relevant problems in discounted reinforcement learning involve ...
One of the central issues of several machine learning applications on re...
We investigate the problem of bandits with expert advice when the expert...
Uncertainty quantification has been extensively used as a means to achie...
Stochastic Rising Bandits is a setting in which the values of the expect...
Autoregressive processes naturally arise in a large variety of real-worl...
Behavioral Cloning (BC) aims at learning a policy that mimics the behavi...
This paper is in the field of stochastic Multi-Armed Bandits (MABs), i.e...
In reinforcement learning, the performance of learning agents is highly ...
According to the main international reports, more pervasive industrial a...
In many real-world sequential decision-making problems, an action does n...
With the continuous growth of the global economy and markets, resource i...
Warehouse Management Systems have been evolving and improving thanks to ...
Atmospheric Extreme Events (EEs) cause severe damage to human societies...
There is a rising interest in industrial online applications where data ...
Automated Reinforcement Learning (AutoRL) is a relatively new area of re...
When the agent's observations or interactions are delayed, classic reinf...
In reinforcement learning, we encode the potential behaviors of an agent...
In the sequential decision making setting, an agent aims to achieve syst...
In the maximum state entropy exploration framework, an agent interacts w...
The classic Reinforcement Learning (RL) formulation concerns the maximiz...
Several recent works have been dedicated to unsupervised reinforcement l...
Learning in a lifelong setting, where the dynamics continually evolve, i...
We study the role of the representation of state-action value functions ...
Many real-world domains are subject to a structured non-stationarity whi...
The linear contextual bandit literature is mostly focused on the design ...
Real-world decision-making tasks are generally complex, requiring trade-...
Policy Optimization (PO) is a widely used approach to address continuous...
In the contextual linear bandit setting, algorithms built on the optimis...
In this paper we show how risk-averse reinforcement learning can be used...
Inverse Reinforcement Learning addresses the problem of inferring an exp...
Many learning problems involve multiple agents optimizing different inte...
In a reward-free environment, what is a suitable intrinsic objective for...
We are interested in how to design reinforcement learning agents that pr...
In most transfer learning approaches to reinforcement learning (RL) the ...
We study finite-armed stochastic bandits where the rewards of each arm m...
Pay-per-click advertising includes various formats (e.g., search, contex...
The choice of the control frequency of a system has a relevant impact on...
MushroomRL is an open-source Python library developed to simplify the pr...
In real-world decision-making problems, for instance in the fields of fi...
Traditional model-based reinforcement learning approaches learn a model ...
We study the problem of identifying the policy space of a learning agent...
Mutual information has been successfully adopted in filter feature-selec...
What is a good exploration strategy for an agent that interacts with an ...