Rather than increasing the transmission power of the pico base station (PBS), cell range expansion (CRE) virtually enlarges the pico cell range by adding a bias (offset) value to the pico received power. It is a technique that improves coverage, cell-edge throughput, and overall network throughput. Many studies of CRE have focused on inter-cell interference coordination (ICIC), because the strong transmit power of the macro base station (MBS) interferes with user equipment (UE) in the expanded region (ER), that is, UEs that select the PBS only because of the bias value. The optimal bias value that minimizes the number of UEs in outage depends on several factors, such as the radio resource allocation ratio between the MBS and the PBS. It also varies from UE to UE. Therefore, most articles use a common bias value for all UEs, determined by trial and error. In this paper, we propose a scheme that determines the bias value of each UE using the Q-learning algorithm: each UE independently learns, from its past experience, the bias value that minimizes the number of UEs in outage. Simulation results show that this scheme reduces the number of UEs in outage and improves network throughput compared to a scheme using the optimal common bias value.
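The biased cell selection described above can be illustrated with a minimal Python sketch. The function names, the dBm/dB units, and the tie-breaking rule (a tie goes to the PBS) are assumptions made for illustration only; the abstract does not specify them.

```python
def select_cell(macro_rx_dbm: float, pico_rx_dbm: float, bias_db: float) -> str:
    """Pick the serving cell after adding the CRE bias to the pico received power.

    Assumption: powers are in dBm, the bias is in dB, and ties go to the PBS.
    """
    return "PBS" if pico_rx_dbm + bias_db >= macro_rx_dbm else "MBS"


def in_expanded_region(macro_rx_dbm: float, pico_rx_dbm: float, bias_db: float) -> bool:
    """True when the UE selects the PBS only because of the bias.

    Such a UE hears the MBS more strongly than its serving PBS, which is
    why it is exposed to strong macro interference.
    """
    return pico_rx_dbm < macro_rx_dbm <= pico_rx_dbm + bias_db


# Hypothetical UE: macro received at -80 dBm, pico at -86 dBm.
print(select_cell(-80.0, -86.0, bias_db=0.0))
print(select_cell(-80.0, -86.0, bias_db=8.0))
print(in_expanded_region(-80.0, -86.0, bias_db=8.0))
```

With no bias the hypothetical UE stays on the MBS; with an 8 dB bias it moves to the PBS and falls in the expanded region.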
The network learns to solve the FrozenLake problem, but it turns out that it does not do so as efficiently as the Q-Table. Although neural networks allow for greater flexibility, they do so at the cost of stability in Q-Learning. Our simple Q-Network has many possible extensions that provide better performance and more robust learning. Two techniques in particular are known as experience replay and freezing the target network. These and other tweaks were the key to getting Deep Q-Networks to play Atari games. We will examine these additions in the future. For details on the theory behind Q-Learning, see this wonderful article by Tambet Matiisen. I hope that this tutorial will be useful to those who are interested in implementing a simple Q-Learning algorithm!
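As a rough illustration of the first technique, here is a minimal experience replay buffer in plain Python. It is a sketch under assumptions, not the implementation used by any particular Deep Q-Network: the class name, the transition layout, and the uniform sampling are all chosen here for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions, sampled uniformly at random.

    Assumption: a transition is the tuple
    (state, action, reward, next_state, done).
    """

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently drops the oldest item when full.
        self._items = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self._items.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Sample without replacement; never ask for more than is stored.
        k = min(batch_size, len(self._items))
        return self._rng.sample(list(self._items), k)

    def __len__(self):
        return len(self._items)


buffer = ReplayBuffer(capacity=3, seed=0)
for t in range(5):
    buffer.add(state=t, action=0, reward=0.0, next_state=t + 1, done=False)
batch = buffer.sample(2)
```

Training on randomly sampled past transitions, rather than only on the most recent one, breaks the correlation between consecutive updates.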
In this tutorial of my reinforcement learning series, we examine a family of RL algorithms called Q-Learning algorithms. These are slightly different from the policy-based algorithms, which are covered in the following tutorials (Parts 1-3). Instead of beginning with a complex and unwieldy deep neural network, we first implement a simple lookup-table version of the algorithm and then show how to implement a neural-network equivalent using Tensorflow. Given that we are going back to basics, it is best to think of this as Part 0 of the series. It should give an intuition for what actually happens in Q-Learning, which we can build on when we eventually combine the policy gradient and Q-learning approaches to build state-of-the-art RL agents. To learn, please start the tutorial series from here.
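The lookup-table version can be sketched in a few lines of plain Python. This is a minimal illustration, not the tutorial's own code. It assumes a small deterministic 4x4 grid in the style of FrozenLake (no slipping), a uniformly random behavior policy (permissible because Q-learning is off-policy), and illustrative values for the learning rate `alpha` and discount `gamma`.

```python
import random

# Assumed map in the style of FrozenLake: S start, F frozen, H hole, G goal.
GRID = "SFFF" "FHFH" "FFFH" "HFFG"
N_STATES, N_ACTIONS = 16, 4                 # actions: 0=left, 1=down, 2=right, 3=up
MOVES = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # (row delta, column delta)


def step(state, action):
    """Deterministic transition; returns (next_state, reward, done)."""
    row, col = divmod(state, 4)
    d_row, d_col = MOVES[action]
    row = min(max(row + d_row, 0), 3)       # walking into a wall keeps you in place
    col = min(max(col + d_col, 0), 3)
    nxt = row * 4 + col
    tile = GRID[nxt]
    return nxt, (1.0 if tile == "G" else 0.0), tile in "HG"


def train(episodes=5000, alpha=0.8, gamma=0.95, seed=0):
    """Fill the Q-table with the standard Q-learning update."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):                # cap on episode length
            action = rng.randrange(N_ACTIONS)   # purely random exploration
            nxt, reward, done = step(state, action)
            # Terminal states have no future value to bootstrap from.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_reaches_goal(q, max_steps=16):
    """Follow the highest-valued action from the start state."""
    state = 0
    for _ in range(max_steps):
        action = max(range(N_ACTIONS), key=lambda a: q[state][a])
        state, reward, done = step(state, action)
        if done:
            return reward == 1.0
    return False


q_table = train()
print(greedy_reaches_goal(q_table))
```

The table has one row per state and one column per action; each update moves a single entry toward the observed reward plus the discounted best value of the next state.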
RL techniques have been used to improve network security. The minimax-Q learning based spectrum allocation scheme developed in the referenced work improves the spectral efficiency of cognitive radio networks. The two-dimensional anti-jamming communication scheme based on Q-learning proposed in the referenced work increases the signal-to-interference-plus-noise ratio of secondary users against cooperative jamming in cognitive radio networks. The spoofing detection scheme proposed in the referenced work uses Q-learning and Dyna-Q to obtain the optimal test threshold for physical-layer authentication in wireless networks. The deep Q-network based transmission scheme developed in the referenced work achieves optimal power and node mobility control to cope with interference in underwater acoustic networks. The deep Q-learning based power allocation strategy for drones proposed in the referenced work achieves optimal power allocation against smart attacks without knowing the attack model or the channel model.