[DynDNNs] Outline: Optimizing Runtime & System Level for Dynamic DNN
Abstract

This outline presents the high-level research plan for accelerating dynamic DNN workloads at the runtime and system software levels.

1. Introduction

Dynamic DNNs—neural networks whose input shapes or structures change at runtime—pose unique challenges for both hardware and system software.
To address these challenges, we will use the Gemmini accelerator as the hardware acceleration target and llama.cpp as the execution platform, integrating and optimizing them for variable-shape inference.
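To make the variable-shape problem concrete, the sketch below models one common symptom: a fixed-size systolic array (such as Gemmini's) processes matrices in fixed tiles, so when an operand dimension grows token by token during autoregressive decoding, the operand must be padded to the next tile multiple and hardware utilization fluctuates. The tile size and the utilization model here are illustrative assumptions, not Gemmini's actual configuration.

```python
# Hedged sketch: why variable-shape inference stresses a fixed-size
# systolic array. TILE is a hypothetical array dimension chosen for
# illustration, not a real Gemmini parameter.

TILE = 16

def padded(dim: int, tile: int = TILE) -> int:
    """Round a matrix dimension up to the next tile multiple."""
    return ((dim + tile - 1) // tile) * tile

def utilization(rows: int, cols: int, tile: int = TILE) -> float:
    """Fraction of useful work when a (rows x cols) matmul operand
    is zero-padded out to tile boundaries."""
    return (rows * cols) / (padded(rows, tile) * padded(cols, tile))

# In autoregressive decoding the KV length grows by one each step,
# so the attention matmul shape changes on every token:
for seq_len in (1, 17, 32):
    print(f"seq_len={seq_len:3d}  utilization={utilization(seq_len, 64):.3f}")
```

Under these assumptions, a sequence length of 1 wastes most of a 16-wide tile, while a length that lands exactly on a tile boundary reaches full utilization; smoothing out this variation is one of the runtime-level optimization targets.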

2. Organization

  1. Gemmini Hardware Analysis (Acceleration Tool) — Analysis of the Gemmini accelerator’s microarchitecture and its key components.

  2. llama.cpp Framework Analysis (Execution Platform) — Review of the llama.cpp (GGML) inference engine’s architecture and inference pipeline.

  3. Research Progress Updates — Summary of ongoing work: porting, profiling, and optimization experiments.

This post is licensed under CC BY 4.0 by the author.