Marcello Restelli

research

∙ 08/29/2023

Pure Exploration under Mediators' Feedback

Stochastic multi-armed bandits are a sequential-decision-making framewor...

0 Riccardo Poiani, et al. ∙

research

∙ 06/19/2023

Nonlinear Feature Aggregation: Two Algorithms driven by Theory

Many real-world machine learning applications are characterized by a hug...

0 Paolo Bonetti, et al. ∙

research

∙ 06/13/2023

Stepsize Learning for Policy Gradient Methods in Contextual Markov Decision Processes

Policy-based algorithms are among the most widely adopted techniques in ...

0 Luca Sabbioni, et al. ∙

research

∙ 05/10/2023

An Option-Dependent Analysis of Regret Minimization Algorithms in Finite-Horizon Semi-Markov Decision Processes

A large variety of real-world Reinforcement Learning (RL) tasks is chara...

0 Gianluca Drappo, et al. ∙

research

∙ 05/07/2023

Truncating Trajectories in Monte Carlo Reinforcement Learning

In Reinforcement Learning (RL), an agent acts in an unknown environment ...

0 Riccardo Poiani, et al. ∙

research

∙ 04/25/2023

Towards Theoretical Understanding of Inverse Reinforcement Learning

Inverse reinforcement learning (IRL) denotes a powerful family of algori...

0 Alberto Maria Metelli, et al. ∙

research

∙ 04/11/2023

A Tale of Sampling and Estimation in Discounted Reinforcement Learning

The most relevant problems in discounted reinforcement learning involve ...

0 Alberto Maria Metelli, et al. ∙

research

∙ 03/26/2023

Interpretable Linear Dimensionality Reduction based on Bias-Variance Analysis

One of the central issues of several machine learning applications on re...

0 Paolo Bonetti, et al. ∙

research

∙ 03/14/2023

Information-Theoretic Regret Bounds for Bandits with Fixed Expert Advice

We investigate the problem of bandits with expert advice when the expert...

0 Khaled Eldowa, et al. ∙

research

∙ 03/04/2023

Wasserstein Actor-Critic: Directed Exploration via Optimism for Continuous-Actions Control

Uncertainty quantification has been extensively used as a means to achie...

0 Amarildo Likmeta, et al. ∙

research

∙ 02/15/2023

Best Arm Identification for Stochastic Rising Bandits

Stochastic Rising Bandits is a setting in which the values of the expect...

0 Marco Mussi, et al. ∙

research

∙ 12/12/2022

Autoregressive Bandits

Autoregressive processes naturally arise in a large variety of real-worl...

0 Francesco Bacchiocchi, et al. ∙

research

∙ 12/07/2022

Tight Performance Guarantees of Imitator Policies with Continuous Actions

Behavioral Cloning (BC) aims at learning a policy that mimics the behavi...

0 Davide Maran, et al. ∙

research

∙ 12/07/2022

Stochastic Rising Bandits

This paper is in the field of stochastic Multi-Armed Bandits (MABs), i.e...

0 Alberto Maria Metelli, et al. ∙

research

∙ 11/21/2022

Simultaneously Updating All Persistence Values in Reinforcement Learning

In reinforcement learning, the performance of learning agents is highly ...

0 Luca Sabbioni, et al. ∙

research

∙ 11/17/2022

Dynamic Pricing with Volume Discounts in Online Settings

According to the main international reports, more pervasive industrial a...

0 Marco Mussi, et al. ∙

research

∙ 11/16/2022

Dynamical Linear Bandits

In many real-world sequential decision-making problems, an action does n...

0 Marco Mussi, et al. ∙

research

∙ 07/25/2022

Optimizing Empty Container Repositioning and Fleet Deployment via Configurable Semi-POMDPs

With the continuous growth of the global economy and markets, resource i...

0 Riccardo Poiani, et al. ∙

research

∙ 07/08/2022

Storehouse: a Reinforcement Learning Environment for Optimizing Warehouse Management

Warehouse Management Systems have been evolving and improving thanks to ...

0 Julen Cestero, et al. ∙

research

∙ 06/03/2022

Analysis, Characterization, Prediction and Attribution of Extreme Atmospheric Events with Machine Learning: a Review

Atmospheric Extreme Events (EEs) cause severe damages to human societies...

0 Sancho Salcedo-Sanz, et al. ∙

research

∙ 06/01/2022

Multi-Armed Bandit Problem with Temporally-Partitioned Rewards: When Partial Feedback Counts

There is a rising interest in industrial online applications where data ...

0 Giulia Romano, et al. ∙

research

∙ 05/20/2022

ARLO: A Framework for Automated Reinforcement Learning

Automated Reinforcement Learning (AutoRL) is a relatively new area of re...

0 Marco Mussi, et al. ∙

research

∙ 05/11/2022

Delayed Reinforcement Learning by Imitation

When the agent's observations or interactions are delayed, classic reinf...

0 Pierre Liotet, et al. ∙

research

∙ 02/22/2022

Reward-Free Policy Space Compression for Reinforcement Learning

In reinforcement learning, we encode the potential behaviors of an agent...

0 Mirco Mutti, et al. ∙

research

∙ 02/14/2022

Provably Efficient Causal Model-Based Reinforcement Learning for Systematic Generalization

In the sequential decision making setting, an agent aims to achieve syst...

0 Mirco Mutti, et al. ∙

research

∙ 02/07/2022

The Importance of Non-Markovianity in Maximum State Entropy Exploration

In the maximum state entropy exploration framework, an agent interacts w...

0 Mirco Mutti, et al. ∙

research

∙ 02/03/2022

Challenging Common Assumptions in Convex Reinforcement Learning

The classic Reinforcement Learning (RL) formulation concerns the maximiz...

6 Mirco Mutti, et al. ∙

research

∙ 12/16/2021

Unsupervised Reinforcement Learning in Multiple Environments

Several recent works have been dedicated to unsupervised reinforcement l...

0 Mirco Mutti, et al. ∙

research

∙ 12/13/2021

Lifelong Hyper-Policy Optimization with Multiple Importance Sampling Regularization

Learning in a lifelong setting, where the dynamics continually evolve, i...

0 Pierre Liotet, et al. ∙

research

∙ 10/27/2021

Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection

We study the role of the representation of state-action value functions ...

12 Matteo Papini, et al. ∙

research

∙ 05/18/2021

Meta-Reinforcement Learning by Tracking Task Non-stationarity

Many real-world domains are subject to a structured non-stationarity whi...

0 Riccardo Poiani, et al. ∙

research

∙ 04/08/2021

Leveraging Good Representations in Linear Contextual Bandits

The linear contextual bandit literature is mostly focused on the design ...

5 Matteo Papini, et al. ∙

research

∙ 03/17/2021

A Practical Guide to Multi-Objective Reinforcement Learning and Planning

Real-world decision-making tasks are generally complex, requiring trade-...

55 Conor F. Hayes, et al. ∙

research

∙ 12/15/2020

Policy Optimization as Online Learning with Mediator Feedback

Policy Optimization (PO) is a widely used approach to address continuous...

3 Alberto Maria Metelli, et al. ∙

research

∙ 10/23/2020

An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits

In the contextual linear bandit setting, algorithms built on the optimis...

0 Andrea Tirinzoni, et al. ∙

research

∙ 10/23/2020

Option Hedging with Risk Averse Reinforcement Learning

In this paper we show how risk-averse reinforcement learning can be used...

0 Edoardo Vittori, et al. ∙

research

∙ 07/15/2020

Inverse Reinforcement Learning from a Gradient-based Learner

Inverse Reinforcement Learning addresses the problem of inferring an exp...

0 Giorgia Ramponi, et al. ∙

research

∙ 07/15/2020

Newton-based Policy Optimization for Games

Many learning problems involve multiple agents optimizing different inte...

0 Giorgia Ramponi, et al. ∙

research

∙ 07/09/2020

A Policy Gradient Method for Task-Agnostic Exploration

In a reward-free environment, what is a suitable intrinsic objective for...

0 Mirco Mutti, et al. ∙

research

∙ 07/01/2020

Sequential Transfer in Reinforcement Learning with a Generative Model

We are interested in how to design reinforcement learning agents that pr...

0 Andrea Tirinzoni, et al. ∙

research

∙ 05/26/2020

Time-Variant Variational Transfer for Value Functions

In most transfer learning approaches to reinforcement learning (RL) the ...

5 Giuseppe Canonaco, et al. ∙

research

∙ 05/23/2020

A Novel Confidence-Based Algorithm for Structured Bandits

We study finite-armed stochastic bandits where the rewards of each arm m...

0 Andrea Tirinzoni, et al. ∙

research

∙ 03/03/2020

Online Joint Bid/Daily Budget Optimization of Internet Advertising Campaigns

Pay-per-click advertising includes various formats (e.g., search, contex...

0 Alessandro Nuara, et al. ∙

research

∙ 02/17/2020

Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning

The choice of the control frequency of a system has a relevant impact on...

0 Alberto Maria Metelli, et al. ∙

research

∙ 01/04/2020

MushroomRL: Simplifying Reinforcement Learning Research

MushroomRL is an open-source Python library developed to simplify the pr...

0 Carlo D'Eramo, et al. ∙

research

∙ 12/06/2019

Risk-Averse Trust Region Optimization for Reward-Volatility Reduction

In real-world decision-making problems, for instance in the fields of fi...

0 Lorenzo Bisi, et al. ∙

research

∙ 09/09/2019

Gradient-Aware Model-based Policy Search

Traditional model-based reinforcement learning approaches learn a model ...

0 Pierluca D'Oro, et al. ∙

research

∙ 09/09/2019

Policy Space Identification in Configurable Environments

We study the problem of identifying the policy space of a learning agent...

0 Alberto Maria Metelli, et al. ∙

research

∙ 07/17/2019

Feature Selection via Mutual Information: New Theoretical Insights

Mutual information has been successfully adopted in filter feature-selec...

0 Mario Beraha, et al. ∙

research

∙ 07/10/2019

An Intrinsically-Motivated Approach for Learning Highly Exploring and Fast Mixing Policies

What is a good exploration strategy for an agent that interacts with an ...

0 Mirco Mutti, et al. ∙
