Jinlai Xu is a Researcher/Engineer at the Infrastructure System Lab of ByteDance US. He is currently working on cutting-edge technologies that will evolve the infrastructure to a new generation, including but not limited to Next-Generation ML/Big Data Infrastructure, Graph Learning/Computing, Cloud-Native & Serverless Infrastructure, and Hyper-Scale Heterogeneous Cluster Management.
He received his PhD from the University of Pittsburgh in 2021. His research interests include Big Model Training/Inference/Fine-tuning Frameworks, Serverless Computing, Distributed Systems, Fog/Edge and Cloud Computing, Stream Processing Optimization, and Blockchain-based Techniques. His PhD advisor was Balaji Palanisamy. He received his bachelor's and master's degrees from China University of Geosciences, where his undergraduate and graduate advisor was Zhongwen Luo.
PhD in Information Science, 2021
University of Pittsburgh
M.Phil in Software Engineering, 2015
China University of Geosciences
BEng in Software Engineering, 2012
China University of Geosciences
Big Model Training/Inference/Fine-tuning Frameworks
Low-latency Stream Processing/Resilience/Elasticity in Edge Computing
RL on Systems
Incentive Design for Edge Resource Sharing
Low-latency data processing is critical for enabling next-generation Internet-of-Things (IoT) applications. Edge computing-based stream processing techniques that optimize for low latency and high throughput provide a promising solution to ensure a rich user experience by meeting strict application requirements. However, manual performance tuning of stream processing applications in heterogeneous and dynamic edge computing environments is not only time-consuming but also not scalable. Our work presented in this paper achieves elasticity for stream processing applications deployed at the edge by automatically tuning the applications to meet the performance requirements. The proposed approach adopts a learning model to configure the parallelism of the operators in the stream processing application using a reinforcement learning (RL) method. We model the elastic control problem as a Markov Decision Process (MDP) and solve it by reducing it to a contextual Multi-Armed Bandit (MAB) problem. The techniques proposed in our work use Upper Confidence Bound (UCB)-based methods to improve the sample efficiency in comparison to traditional random exploration methods such as the epsilon-greedy method. They achieve a significantly improved rate of convergence compared to other RL methods through the innovative use of MAB methods to deal with the tradeoff between exploration and exploitation. In addition, the use of model-based pre-training results in substantially improved performance by initializing the model with appropriate and well-tuned parameters. The proposed techniques are evaluated using realistic and synthetic workloads through both simulation and real testbed experiments. The experiment results demonstrate the effectiveness of the proposed approach compared to standard methods in terms of cumulative reward and convergence speed.
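To make the UCB idea above concrete, here is a minimal, hedged sketch of UCB1-style selection over candidate parallelism levels. The class name, the toy reward, and treating each parallelism level as an independent arm are illustrative assumptions for exposition; the paper's actual contextual-MAB formulation and pre-training are not reproduced here.

```python
import math

class UCBTuner:
    """Toy UCB1 bandit over candidate operator parallelism levels (arms).
    Reward is assumed to be a scalar in [0, 1], e.g. derived from latency."""

    def __init__(self, arms):
        self.arms = list(arms)
        self.counts = {a: 0 for a in self.arms}   # times each arm was played
        self.values = {a: 0.0 for a in self.arms} # running mean reward per arm
        self.t = 0                                # total rounds so far

    def select(self):
        self.t += 1
        # Play each arm once before applying the UCB rule.
        for a in self.arms:
            if self.counts[a] == 0:
                return a
        # UCB1: mean reward plus a confidence bonus that shrinks with samples,
        # balancing exploitation (mean) against exploration (bonus).
        return max(self.arms, key=lambda a: self.values[a]
                   + math.sqrt(2 * math.log(self.t) / self.counts[a]))

    def update(self, arm, reward):
        # Incremental update of the arm's mean reward.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

With a stable reward signal, the confidence bonus decays for well-sampled arms, so the tuner concentrates on the best-performing parallelism level instead of exploring uniformly at random as epsilon-greedy would.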
The proliferation of Internet-of-Things (IoT) devices is rapidly increasing the demands for efficient processing of low-latency stream data generated close to the edge of the network. Edge computing-based stream processing techniques that carefully consider the heterogeneity of the computational and network resources available in the infrastructure provide significant benefits in optimizing the throughput and end-to-end latency of the data streams. In this paper, we propose a novel stream query processing framework called Amnis that optimizes the performance of stream processing applications through a careful allocation of the computational and network resources available at the edge. The Amnis approach differentiates itself through its consideration of data locality and resource constraints during physical plan generation and operator placement for the stream queries. Additionally, Amnis considers the coflow dependencies to optimize the network resource allocation through an application-level rate control mechanism. We implement a prototype of Amnis in Apache Storm. Our performance evaluation carried out in a real testbed shows that the proposed techniques achieve as much as a 200X improvement in end-to-end latency and a 10X improvement in overall throughput compared to the default resource-aware scheduler in Storm.
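The locality-aware placement step described above can be illustrated with a small, hedged sketch: assign each operator to the node that already holds its input data when that node has spare capacity, and otherwise fall back to the least-loaded node. The function signature, capacity model, and greedy fallback are illustrative assumptions, not the actual Amnis algorithm or Storm scheduler API.

```python
def place_operators(operators, nodes, capacity, data_location):
    """Greedy, locality-first operator placement (illustrative sketch).

    operators:     list of operator names, in scheduling order
    nodes:         list of node names
    capacity:      node -> max number of operators it can host
    data_location: operator -> node holding its input data (optional)
    Returns a dict mapping each operator to its assigned node.
    """
    load = {n: 0 for n in nodes}
    placement = {}
    for op in operators:
        preferred = data_location.get(op)
        if preferred in load and load[preferred] < capacity[preferred]:
            chosen = preferred  # data-local placement avoids network transfer
        else:
            # Fall back to the proportionally least-loaded node.
            chosen = min(nodes, key=lambda n: load[n] / capacity[n])
        load[chosen] += 1
        placement[op] = chosen
    return placement
```

The design choice here mirrors the intuition in the abstract: preferring data-local nodes shrinks the network traffic that the coflow-aware rate control then has to manage.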
In the Internet-of-Things (IoT) era, the demands for low-latency computing for time-sensitive applications (e.g., location-based augmented reality games, real-time smart grid management, real-time navigation using wearables) have been growing rapidly. Edge computing provides an additional layer of infrastructure to fill latency gaps between the IoT devices and the back-end computing infrastructure. In the edge computing model, small-scale micro-datacenters that represent an ad-hoc and distributed collection of computing infrastructure pose new challenges in terms of management and effective resource sharing to achieve a globally efficient resource allocation. In this paper, we propose Zenith, a novel model for allocating computing resources in an edge computing platform that allows service providers to establish resource sharing contracts with edge infrastructure providers a priori. Based on the established contracts, service providers employ a latency-aware scheduling and resource provisioning algorithm that enables tasks to complete and meet their latency requirements. The proposed techniques are evaluated through extensive experiments that demonstrate the effectiveness, scalability, and performance efficiency of the proposed model.
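The latency-aware scheduling idea above can be sketched as follows: among the contracted edge micro-datacenters, pick one whose network latency and estimated service time still meet the task's deadline. The site tuple layout, the linear latency model, and the earliest-completion tie-break are illustrative assumptions for exposition, not Zenith's actual contract or provisioning mechanism.

```python
def schedule(task, sites):
    """Pick a site that can meet the task's deadline (illustrative sketch).

    task:  {"work": units_of_work, "deadline_ms": latency requirement}
    sites: list of (name, rtt_ms, free_slots, service_ms_per_unit)
    Returns the chosen site name, or None if no site meets the deadline.
    """
    feasible = []
    for name, rtt_ms, free_slots, svc_ms in sites:
        if free_slots <= 0:
            continue  # no contracted capacity left at this site
        # Estimated completion = network round trip + service time.
        completion = rtt_ms + task["work"] * svc_ms
        if completion <= task["deadline_ms"]:
            feasible.append((completion, name))
    if not feasible:
        return None
    return min(feasible)[1]  # earliest estimated completion wins
```

A distant cloud site may have ample capacity yet fail the deadline on round-trip time alone, which is exactly the gap the edge micro-datacenters are meant to fill.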
Teaching Assistant, September 2015 - Present
Teaching Assistant, September 2013 - January 2014