The authors consider a cognitive radio network overlaid on a legacy primary network, in which a secondary user is allowed to access a primary channel by overhearing feedback signals over the primary channels. Each channel is assumed to be a two-state Markovian process. With the aim of maximising the expected accumulated discounted network throughput, the sequential decision-making problem can be cast as a restless multi-armed bandit (RMAB) problem, which is well known to be PSPACE-hard; a natural alternative is therefore to seek a simple myopic policy.
This study presents a theoretical analysis of the optimality of the proposed myopic policy for this special RMAB problem by considering four different cases: negatively correlated homogeneous channels, heterogeneous channels, positively correlated heterogeneous channels and negatively correlated heterogeneous channels. More specifically, the authors establish closed-form conditions that guarantee the optimality of the myopic policy in each of the four cases; combined with the case of positively correlated homogeneous channels, these results constitute a complete paradigm for the optimality of the myopic policy.
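
To make the setting concrete, the following is a minimal sketch of the belief dynamics of a two-state Markovian channel and of a myopic (one-step greedy) channel choice. It is an illustration under stated assumptions, not the authors' formulation: the names `p11` (probability of remaining in the good state), `p01` (probability of moving from the bad to the good state), `propagate_belief`, `update_belief` and `myopic_choice` are hypothetical, the feedback is assumed to reveal the channel state without error, and each channel is assumed to yield unit reward when good.

```python
from typing import List, Optional


def propagate_belief(omega: float, p11: float, p01: float) -> float:
    """Predict P(channel is good) one slot ahead when no feedback is available.

    omega: current belief that the channel is good.
    p11:   assumed transition probability good -> good.
    p01:   assumed transition probability bad -> good.
    """
    return omega * p11 + (1.0 - omega) * p01


def update_belief(omega: float, p11: float, p01: float,
                  observed_good: Optional[bool] = None) -> float:
    """Belief for the next slot, given optional error-free feedback (assumption)."""
    if observed_good is None:
        # No feedback overheard: fall back to the one-step Markov prediction.
        return propagate_belief(omega, p11, p01)
    # Feedback reveals the state, so the next belief is a transition probability.
    return p11 if observed_good else p01


def myopic_choice(beliefs: List[float]) -> int:
    """Index of the channel with the largest belief.

    Under the unit-reward assumption this maximises the expected immediate
    reward; ties are broken in favour of the lowest index.
    """
    return max(range(len(beliefs)), key=lambda i: beliefs[i])
```

In this sketch a channel is positively correlated when `p11 >= p01` and negatively correlated when `p11 < p01`; homogeneous channels share one `(p11, p01)` pair, whereas heterogeneous channels each have their own.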