UNIVERSITY PARK, Pa. — Jing Yang, assistant professor of electrical engineering and computer science at Penn State, and Cong Shen, assistant professor of electrical and computer engineering at the University of Virginia (UVA), have earned a $550,000, three-year grant to pursue machine learning for wireless networking systems. The National Science Foundation (NSF) and Intel have partnered to fund this multi-university research program to accelerate fundamental, broad-based research on wireless-specific machine learning techniques applied to new wireless systems and architecture design.
Traditionally, wireless network operation and maintenance systems relied on experts who possessed first-hand knowledge of the wireless network, its functions and its end-users. It is generally acknowledged within the industry that the complexity of 5G networks and the Internet of Things will render these manual approaches obsolete.
“What worked well in the past will not work well in the future,” Shen said.
Authors of a widely cited report published in June 2019 by Ericsson, a multinational networking and telecommunications company, predict that in four years 5G networks will cover 45% of the world’s population and carry 1.9 billion subscriptions.
“Our machine learning approach will allow network managers to reliably meet user demands across a wide range of applications and infrastructure,” Yang said. “Reliability is key to meeting rising demand for virtual reality and high-resolution video applications, embedded and wearable tech, and large-scale infrastructure for smart homes and autonomous cars.”
Yang and Shen offer a machine learning solution that combines purely data-driven reinforcement learning algorithms with domain knowledge. They call their technique “domain knowledge enriched reinforcement learning framework for wireless network optimization,” or Dino-RL. Reinforcement learning algorithms are well suited to the exponential growth and open-ended design of wireless networks: the algorithm receives continuous feedback and dynamically adjusts its decision rules to maximize a reward, in this case to detect and fix network problems quickly without human intervention.
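To make that reinforcement learning loop concrete, the sketch below runs tabular Q-learning on a made-up toy environment in which coarse network-health states are mapped to maintenance actions. The states, actions, rewards and parameters are illustrative assumptions for this article, not part of Dino-RL.

    import random
    from collections import defaultdict

    # Toy environment (illustrative only): states are coarse network-health labels,
    # actions are maintenance decisions.
    STATES = ["healthy", "congested", "faulty"]
    ACTIONS = ["no_op", "reroute_traffic", "restart_cell"]

    def step(state, action):
        """Return (next_state, reward); rewards favor fixing problems quickly."""
        if state == "healthy":
            return ("healthy", 1.0) if action == "no_op" else ("healthy", 0.5)
        if state == "congested":
            return ("healthy", 1.0) if action == "reroute_traffic" else ("congested", -0.5)
        return ("healthy", 1.0) if action == "restart_cell" else ("faulty", -1.0)

    q = defaultdict(float)               # Q-value estimates for (state, action) pairs
    alpha, gamma, epsilon = 0.1, 0.9, 0.1

    state = random.choice(STATES)
    for t in range(10_000):
        # Epsilon-greedy: mostly exploit current knowledge, occasionally explore.
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        next_state, reward = step(state, action)
        # Q-learning update: move the estimate toward reward plus discounted future value.
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        # New problems occasionally arise so every state keeps being visited.
        state = next_state if random.random() < 0.8 else random.choice(["congested", "faulty"])

    for s in STATES:
        print(s, "->", max(ACTIONS, key=lambda a: q[(s, a)]))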
To incorporate wireless network domain knowledge, Shen will build an efficient episodic memory to drive Dino-RL. Episodic RL is a novel paradigm that can significantly speed up learning by keeping an explicit record of past events and using such “memory” directly as a point of reference in making future decisions.
“It mimics how we humans process learning tasks, to subconsciously resort to similar situations we have encountered in the past when facing new decisions,” Shen said.
Episodic RL has emerged only recently in the machine learning community, and this project will attempt to build a novel episodic reinforcement learning framework tailored for wireless networks.
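As a rough illustration of the episodic-memory idea, the sketch below records past (state, action, outcome) triples and, when facing a new situation, reuses the action that produced the best outcomes in the most similar past situations. The nearest-neighbor lookup and every name in it are assumptions made for this example, not the framework the project will build.

    import numpy as np

    class EpisodicMemory:
        """Illustrative episodic memory: store past experiences and act on the
        best outcomes observed in the most similar past situations."""

        def __init__(self, k=5):
            self.k = k
            self.states, self.actions, self.returns = [], [], []

        def store(self, state, action, episodic_return):
            self.states.append(np.asarray(state, dtype=float))
            self.actions.append(action)
            self.returns.append(episodic_return)

        def suggest(self, state, candidate_actions):
            """Pick the action with the highest average return among the k most
            similar stored states; fall back to a random choice if memory is empty."""
            if not self.states:
                return np.random.choice(candidate_actions)
            dists = np.linalg.norm(np.stack(self.states) - np.asarray(state, dtype=float), axis=1)
            nearest = np.argsort(dists)[: self.k]
            scores = {a: [] for a in candidate_actions}
            for i in nearest:
                if self.actions[i] in scores:
                    scores[self.actions[i]].append(self.returns[i])
            return max(candidate_actions,
                       key=lambda a: np.mean(scores[a]) if scores[a] else -np.inf)

    # Hypothetical usage: the features might summarize load and fault indicators.
    memory = EpisodicMemory(k=3)
    memory.store([0.9, 0.1], "reroute_traffic", 1.0)   # a past congestion episode
    memory.store([0.2, 0.8], "restart_cell", 0.7)      # a past fault episode
    print(memory.suggest([0.85, 0.15], ["no_op", "reroute_traffic", "restart_cell"]))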
The second research direction focuses on how to enable knowledge transfer and efficient exploration.
“If I develop a machine learning system to manage networks for Penn State and its State College neighborhood, with the proposed meta-learning method, it can be easily transferred to ‘plug-and-play’ with wireless systems supporting UVA grounds and Charlottesville,” Yang said.
Yang will develop a technique called meta-reinforcement learning to ensure that Dino-RL achieves efficient knowledge transfer across multiple instantiations of the algorithm. This is particularly challenging because wireless networks have many parameters to tune, and methods that allow a new wireless network to reach its optimal operating point quickly are highly valuable.
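The following toy sketch conveys the flavor of this kind of knowledge transfer using a simple first-order meta-learning loop in the spirit of the Reptile algorithm: meta-training over many hypothetical networks produces an initialization that a previously unseen network can fine-tune in a few steps. The one-parameter “task” and every number here are assumptions for illustration, not the meta-reinforcement learning method the project will develop.

    import numpy as np

    rng = np.random.default_rng(0)

    # Each hypothetical "network" has its own optimal parameter value (say, a
    # scheduling threshold); the loss for a network is (theta - optimum)^2.
    def loss_grad(theta, optimum):
        return 2.0 * (theta - optimum)

    def adapt(theta, optimum, steps=5, lr=0.1):
        """Fine-tune theta to one network with a few gradient steps."""
        for _ in range(steps):
            theta = theta - lr * loss_grad(theta, optimum)
        return theta

    # Reptile-style meta-training: nudge the shared initialization toward the
    # parameters obtained after adapting to each seen network.
    meta_theta = 0.0
    for optimum in rng.normal(loc=3.0, scale=0.5, size=200):
        meta_theta += 0.05 * (adapt(meta_theta, optimum) - meta_theta)

    # A previously unseen network adapts much faster from the meta-learned
    # initialization than from the naive default.
    new_optimum = 3.2
    print("error from naive init after 5 steps:", abs(adapt(0.0, new_optimum) - new_optimum))
    print("error from meta init after 5 steps :", abs(adapt(meta_theta, new_optimum) - new_optimum))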
Both episodic and meta-RL methods are known to be data-hungry, which is why they have mostly been used in non-real-time tasks such as the game of Go. Wireless network operation, however, is highly time-sensitive, so data efficiency becomes the key challenge. Yang and Shen plan to attack this problem with tools from multi-armed bandits, a subfield of online learning on which they have collaborated over the past few years.
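A multi-armed bandit algorithm balances exploring untried options against exploiting the best option seen so far. The sketch below uses the classic UCB1 rule; one might imagine each arm as a candidate network configuration, though the arms and reward probabilities are made up for illustration and are not part of the project.

    import math
    import random

    # Three arms with unknown success probabilities; each arm could stand for a
    # candidate network configuration (values are made up for illustration).
    true_success_prob = [0.2, 0.5, 0.8]
    counts = [0] * len(true_success_prob)    # times each arm was tried
    values = [0.0] * len(true_success_prob)  # running mean reward per arm

    def pull(arm):
        return 1.0 if random.random() < true_success_prob[arm] else 0.0

    for t in range(1, 5001):
        if 0 in counts:
            arm = counts.index(0)            # try every arm once first
        else:
            # UCB1: favor arms with a high estimated value or high uncertainty.
            arm = max(range(len(counts)),
                      key=lambda a: values[a] + math.sqrt(2 * math.log(t) / counts[a]))
        reward = pull(arm)
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]

    print("pulls per arm:", counts)          # the best arm dominates over time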
They plan to develop Dino-RL to switch between the real-world wireless network and an imaginary “dream-world” model that is built from domain knowledge and gradually refined by assimilating real-world observations. They can then use this “dream-world” model to test future actions before committing them to the real-world network. Shen and Yang believe this will ultimately solve the data-efficiency problem and enable Dino-RL to learn and act using as little data as possible.
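This switching between real and imagined experience resembles model-based planning in the style of Dyna-Q, sketched below: a simple model of the environment is learned from real interactions, and extra “imagined” updates are then rehearsed inside that model so that fewer real samples are needed. The tiny fault-recovery environment and all names are illustrative assumptions, not the Dino-RL system.

    import random
    from collections import defaultdict

    # Tiny fault-recovery environment (illustrative only).
    STATES = ["ok", "fault"]
    ACTIONS = ["wait", "repair"]

    def real_step(state, action):
        if state == "fault":
            return ("ok", 1.0) if action == "repair" else ("fault", -1.0)
        return "ok", (0.1 if action == "wait" else 0.0)

    q = defaultdict(float)
    model = {}                               # learned "dream world": (s, a) -> (s', r)
    alpha, gamma, planning_steps = 0.1, 0.9, 20

    def q_update(s, a, s2, r):
        best_next = max(q[(s2, b)] for b in ACTIONS)
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])

    state = "fault"
    for t in range(500):
        action = (random.choice(ACTIONS) if random.random() < 0.1
                  else max(ACTIONS, key=lambda a: q[(state, a)]))
        next_state, reward = real_step(state, action)   # one real interaction
        model[(state, action)] = (next_state, reward)   # refine the learned model
        q_update(state, action, next_state, reward)     # learn from reality
        for _ in range(planning_steps):                 # learn from the "dream world"
            s, a = random.choice(list(model))
            s2, r = model[(s, a)]
            q_update(s, a, s2, r)
        state = next_state if random.random() < 0.9 else "fault"   # faults recur

    print({s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES})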
Yang and Shen are optimistic that Dino-RL will accelerate industry adoption of machine learning techniques for wireless network management.
“Intel’s partnership with NSF is key,” Shen said. “This is a problem I have known for years, tracing back to my days at Qualcomm Research when we attempted to solve it with Self-Organizing Networks. Unfortunately, that didn’t really take off, as we did not have the right tools [of machine learning] at the time. I am glad that this particular grant is funded by both NSF and Intel, which gives us a unique opportunity to take it to the real world.”
NSF and Intel will launch the research program in June 2020, following a kick-off meeting with all university grantees.