A Joint Model Provisioning and Request Dispatch Solution for Mobile Inference Serving at the Edge

dc.contributor.advisor: Peng, Yang
dc.contributor.author: Prasad, Anish Nagendra
dc.date.accessioned: 2021-08-26T18:02:51Z
dc.date.available: 2021-08-26T18:02:51Z
dc.date.issued: 2021-08-26
dc.date.submitted: 2021
dc.description: Thesis (Master's)--University of Washington, 2021
dc.description.abstract: With the advancement of machine learning (ML), a growing number of mobile clients rely on ML inference to make time-sensitive and safety-critical decisions. The demand for high-quality, low-latency inference services at the network edge has therefore become central to the modern intelligent society. This thesis proposes a novel solution that jointly provisions inference models and dispatches inference requests to reduce the latency of mobile inference serving on edge nodes. Unlike existing solutions that either direct inference requests to the nearest edge node or balance the workload between edge nodes, the proposed solution provisions each edge node with the optimal type and number of inference serving instances under a holistic consideration of networking, computing, and memory resources. Mobile clients can thus use ML inference services on the edge nodes that offer the lowest inference serving latency. We implement the proposed solution using TensorFlow Serving and Kubernetes on a cluster of edge nodes, including Nvidia Jetson Nano and Jetson Xavier devices. Through simulation and testbed experiments, respectively, we further demonstrate the proposed solution's effectiveness in reducing overall inference latency under various system parameters and practical system settings.
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Prasad_washington_0250O_23263.pdf
dc.identifier.uri: http://hdl.handle.net/1773/47193
dc.language.iso: en_US
dc.rights: none
dc.subject: Computer science
dc.subject.other: Computing and software systems
dc.title: A Joint Model Provisioning and Request Dispatch Solution for Mobile Inference Serving at the Edge
dc.type: Thesis

Files

Original bundle
Name: Prasad_washington_0250O_23263.pdf
Size: 552.28 KB
Format: Adobe Portable Document Format