Dissertation

Hybrid CAVIAR Simulations and Reinforcement Learning Applied to 5G Systems: Experiments with Scheduling and Beam Selection


Main author: BORGES, João Paulo Tavares
Degree: Dissertation
Language: eng
Published by: Universidade Federal do Pará, 2024
Subjects:
Online access: https://repositorio.ufpa.br/jspui/handle/2011/16551
Abstract:
Reinforcement Learning (RL) is a learning paradigm suited to problems in which an agent must maximize a given reward while interacting with an ever-changing environment. This class of problems appears in several research topics of the 5th Generation (5G) and the 6th Generation (6G) of mobile networks. However, the lack of freely available data sets or environments to train and assess RL agents is a practical obstacle that delays the widespread adoption of RL in 5G and future networks. These environments must be able to close the so-called reality gap, in which RL agents trained in virtual environments are able to generalize their decisions when exposed to real, never-before-seen situations. Therefore, this work describes a simulation methodology named CAVIAR (Communication Networks, Artificial Intelligence and Computer Vision with 3D Computer-Generated Imagery), tailored for research on RL methods applied to the physical layer (PHY) of wireless communication systems. In this work, the methodology is used to generate an environment for the tasks of user scheduling and beam selection, in which, at each time step, the RL agent must schedule a user and then choose an index from a fixed beamforming codebook to serve it. A key aspect of this proposal is that the simulation of the communication system and the artificial intelligence engine must be closely integrated, so that actions taken by the agent reflect back on the simulation loop. This makes the trade-off between processing time and simulation realism an element to be considered. This work also describes the modeling of the communication systems and RL agents used for experimentation, and presents statistics concerning the environment dynamics, such as data traffic, as well as results for baseline systems. Finally, it discusses how the methods described in this work can be leveraged in the development of digital twins.
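The abstract describes a joint decision at each time step: schedule one user, then pick one beam index from a fixed codebook. A common way to expose such a compound choice to a standard RL agent is to flatten it into a single discrete action space. The sketch below illustrates that encoding only; the user count, codebook size, and function names are illustrative assumptions, not the dissertation's actual CAVIAR implementation.

```python
# Hypothetical sketch of the scheduling + beam-selection action described
# in the abstract: one flat discrete action jointly selects a user and a
# beam index from a fixed beamforming codebook. Sizes are assumptions.

NUM_USERS = 3        # assumed number of users competing for service
CODEBOOK_SIZE = 64   # assumed number of beams in the fixed codebook

def encode_action(user: int, beam: int) -> int:
    """Map a (user, beam) pair to a flat action index."""
    return user * CODEBOOK_SIZE + beam

def decode_action(action: int) -> tuple[int, int]:
    """Inverse mapping: flat action index -> (user, beam) pair."""
    return action // CODEBOOK_SIZE, action % CODEBOOK_SIZE

# The flat action space then has NUM_USERS * CODEBOOK_SIZE joint actions,
# so a standard discrete-action agent can make both choices at once.
assert decode_action(encode_action(2, 17)) == (2, 17)
```

With this encoding, the environment's step function receives one integer per time step and internally recovers which user to schedule and which beam to apply, which keeps the agent side compatible with off-the-shelf discrete-action RL algorithms.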