Project Details
Description
Recent years have seen the emergence of a variety of resource-intensive distributed compute loads in the context of machine learning model training and streaming data analytics. These compute loads often consist of graph-based jobs, whose nodes are computation tasks and whose edges are communication-based dependencies between these tasks. Because of high communication costs or privacy concerns, it is often preferable to run such a graph-based job in a distributed manner at the wireless edge cloud. To achieve this, the computation tasks of a job, which require both computation and communication resources, must be mapped to physical servers in the wireless edge cloud. However, most existing wireless network scheduling algorithms do not account for the logical relationships between the computation tasks of a graph-based job, and most existing learning algorithms pay little attention to the underlying wireless network constraints; their practical success is further impeded by the curse of dimensionality and by a lack of expressiveness and adaptation. This project aims to bridge the gap between prevailing graph-based job services and wireless edge cloud designs by advocating structured learning and optimization solutions with provable performance guarantees. The project will additionally focus on advancing curriculum development, recruitment of students, involvement of undergraduate students in research, K-12 outreach via summer camps, and research dissemination via workshops and conferences.
The project aims to serve concurrent graph-based jobs that contend for resources in the wireless edge cloud. It brings together mathematical methods to develop and analyze structured learning and optimization solutions that holistically exploit the inherent problem structure encoded in classical network models and algorithms, using that structure to design data features and learning architectures for improved sample efficiency, accelerated learning speed, and robust performance. The project addresses the key challenges of doing so via three interdependent thrusts. The first thrust focuses on designing structured reinforcement learning solutions for concurrent graph-based job scheduling at a fast timescale, to minimize job service latency when concurrent graph-based jobs arrive randomly and contend for limited network resources. The second thrust focuses on the complementary fast-timescale problem of maximizing a job’s output given the allocated resources. The third thrust addresses the pertinent problem of adaptive resource provisioning at a slow timescale to make graph-based job services cost-effective. The project also conducts extensive performance evaluations to validate the developed approaches and algorithms.
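As a purely illustrative aid, the graph-based job model described above (tasks as nodes, communication dependencies as edges, tasks mapped to edge servers) can be sketched in a few lines of Python. Everything in this sketch is an assumption made for illustration and is not taken from the project: the task names, compute demands, data volumes, server speeds, the single shared `LINK_RATE`, and the simplification that a server runs all of its tasks in parallel without slowdown. It only shows how a task-to-server placement determines a job's service latency; it is not the project's scheduling method.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical job: task -> compute demand (arbitrary work units).
tasks = {"a": 4.0, "b": 2.0, "c": 6.0, "d": 2.0}
# Hypothetical dependencies: (upstream, downstream) -> data volume to transfer.
edges = {("a", "b"): 10.0, ("a", "c"): 20.0, ("b", "d"): 10.0, ("c", "d"): 10.0}
# Hypothetical edge servers: name -> compute speed (work units per second).
speed = {"s1": 2.0, "s2": 1.0}
# Assumed uniform wireless link rate between any two distinct servers.
LINK_RATE = 10.0


def job_latency(placement):
    """Return the finish time of the last task for a task -> server placement.

    Simplifying assumption: tasks on the same server do not slow each other
    down, and data between co-located tasks transfers instantly.
    """
    # Build the predecessor sets that TopologicalSorter expects.
    preds = {t: set() for t in tasks}
    for upstream, downstream in edges:
        preds[downstream].add(upstream)

    finish = {}
    for t in TopologicalSorter(preds).static_order():
        ready = 0.0  # time at which all inputs of task t have arrived
        for u in preds[t]:
            same_server = placement[u] == placement[t]
            transfer = 0.0 if same_server else edges[(u, t)] / LINK_RATE
            ready = max(ready, finish[u] + transfer)
        finish[t] = ready + tasks[t] / speed[placement[t]]
    return max(finish.values())


all_on_s1 = {t: "s1" for t in tasks}
split = {"a": "s1", "b": "s2", "c": "s1", "d": "s1"}
print(job_latency(all_on_s1), job_latency(split))
```

Comparing the two placements shows the trade-off the description points to: moving a task to another server adds communication delay on every edge that crosses servers, so a scheduler has to weigh compute speed against the dependency structure of the job.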
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
| Status | Active |
|---|---|
| Effective start/end date | 10/01/24 → 09/30/29 |
Funding
- National Science Foundation: $559,653.00