Abstract:
Reinforcement Learning (RL) is a powerful tool for achieving a
specific task without explicit programming. However, to achieve optimal
behavior, an RL algorithm must interact extensively with its environment,
which can be time-consuming and costly in real-world settings. Simulation
software offers a virtually unlimited number of trials, but it also
introduces the sim-to-real gap problem. The problem arises when a simulated environment
inaccurately represents reality in ways that degrade real-world performance. One
of the popular approaches to bridging the gap is domain randomization. It enables
the learned policy to generalize across a variety of possible real-world configurations, but
also leads to suboptimal performance in any specific configuration. Inspired by
domain adaptation techniques for efficient knowledge transfer in supervised learning,
i.e. training a model on data from one domain and then adapting it to
excel on a separate target domain, this thesis proposes additional training
on the real plant after initial training with domain randomization. We evaluate
our approach first in sim-to-sim and then in sim-to-real transfer, and find that
the proposed method achieves a higher task success rate and higher average scores
than plain domain randomization. However, the effectiveness of
the approach depends on the amount of affordable real-world training. Overall, our
results suggest that utilizing domain randomization followed by additional affordable
real-world training can help bridge the sim-to-real gap.