Publications | PSAL at POSTECH

Publications

2026

Exploring High-Bandwidth Flash for Modern LLM Inference: Opportunities and Challenges

IEEE Computer Architecture Letters (CAL)

Dowon Son, Yonggon Park, Hyunuk Cho, Hyungkyu Ham, Onur Mutlu, Sungjin Lee, Gwangsun Kim, Jisung Park

[ IEEE Xplore ]

A Programming Model for Efficient Inter-kernel Control-flow on Memory-mapped Near-data Processing Architecture (WIP)

The 27th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES)

Seungheon Lee, Wonhyuk Yang, Seonyeong Heo, Gwangsun Kim

[ ACM DL ] [ Slides ]

2025

PyTorchSim: A Comprehensive, Fast, and Accurate NPU Simulation Framework

The 58th IEEE/ACM International Symposium on Microarchitecture (MICRO) (Acceptance rate: 20.8%)

Wonhyuk Yang*, Yunseon Shin*, Okkyun Woo*, Geonwoo Park, Hyungkyu Ham, Jeehoon Kang, Jongse Park, Gwangsun Kim (*: co-first authors)

[ ACM DL ] [ Slides ] [ Github ]

LibraPIM: Dynamic Load Rebalancing to Maximize Utilization in PIM-Assisted LLM Inference Systems

2025 International Conference on Parallel Architectures and Compilation Techniques (PACT)

Hyeongjun Cho, Yoonho Jang, Hyungi Kim, Seongwook Kim, Keewon Kwon, Gwangsun Kim, Seokin Hong

[ IEEE Xplore ]

Cost-Effective Extension of DRAM-PIM for Group-Wise LLM Quantization

IEEE Computer Architecture Letters (CAL)

Byeori Kim, Changhun Lee, Gwangsun Kim, Eunhyeok Park

[ IEEE Xplore ]

2024

Low-overhead General-purpose Near-Data Processing in CXL Memory Expanders

The 57th IEEE/ACM International Symposium on Microarchitecture (MICRO) (Acceptance rate: 22.7%)

Hyungkyu Ham*, Jeongmin Hong*, Geonwoo Park, Yunseon Shin, Okkyun Woo, Wonhyuk Yang, Jinhoon Bae, Eunhyeok Park, Hyojin Sung, Euicheol Lim, Gwangsun Kim (*: co-first authors)

[ IEEE Xplore ] [ arXiv ] [ Slides ]

NeuPIMs: NPU-PIM Heterogeneous Acceleration for Batched LLM Inferencing

The 29th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)

Guseul Heo, Sangyeop Lee, Jaehong Cho, Hyunmin Choi, Sanghyeon Lee, Hyungkyu Ham, Gwangsun Kim, Divya Mahajan, Jongse Park

[ ACM DL ] [ arXiv ]

Bandwidth-Effective DRAM Cache for GPUs with Storage-Class Memory

The 30th International Symposium on High-Performance Computer Architecture (HPCA) (Acceptance rate: 18.3%)

Jeongmin Hong, Sungjun Cho, Geonwoo Park, Wonhyuk Yang, Young-Ho Gong, Gwangsun Kim

[ IEEE Xplore ] [ arXiv ] [ Slides ] [ Github ]

CR2: Community-aware Compressed Regular Representation for Graph Processing on a GPU

The 53rd International Conference on Parallel Processing (ICPP)

Shinnung Jeong, Sungjun Cho, Yongwoo Lee, Hyunjun Park, Seonyeong Heo, Gwangsun Kim, Youngsok Kim, Hanjun Kim

[ ACM DL ]

Non-Invasive, Memory Access-Triggered Near-Data Processing for DNN Training Acceleration on GPUs

IEEE Access

Hyungkyu Ham*, Hyunuk Cho*, Minjae Kim, Jueon Park, Jeongmin Hong, Hyojin Sung, Eunhyeok Park, Euicheol Lim, Gwangsun Kim (*: co-first authors)

[ IEEE Xplore ]

ONNXim: A Fast, Cycle-level Multi-core NPU Simulator

IEEE Computer Architecture Letters (CAL)

Hyungkyu Ham*, Wonhyuk Yang*, Yunseon Shin, Okkyun Woo, Guseul Heo, Sangyeop Lee, Jongse Park, Gwangsun Kim (*: co-first authors)

[ IEEE Xplore ] [ arXiv ] [ Github ]

2022

Dynamic Global Adaptive Routing in High-radix Networks

The 49th International Symposium on Computer Architecture (ISCA)

Hans Kasan, Gwangsun Kim, Yung Yi, John Kim

[ Paper ]

Overcoming Memory Capacity Wall of GPUs With Heterogeneous Memory Stack

IEEE Computer Architecture Letters (CAL)

Jeongmin Hong*, Sungjun Cho*, Gwangsun Kim (*: co-first authors)

[ Paper ]

2021

Near-Data Processing in Memory Expander for DNN Acceleration on GPUs

IEEE Computer Architecture Letters (CAL)

Hyungkyu Ham*, Hyunuk Cho*, Minjae Kim, Jueon Park, Jeongmin Hong, Hyojin Sung, Eunhyeok Park, Euicheol Lim, Gwangsun Kim (*: co-first authors)

[ Paper ]

2018

TCEP: Traffic Consolidation for Energy-Proportional High-Radix Networks

The 45th International Symposium on Computer Architecture (ISCA) (Acceptance rate: 16.9%)

Gwangsun Kim, Hayoung Choi, John Kim

[ Paper ] [ Slides ]

2017

History-Based Arbitration for Fairness in Processor-Interconnect of NUMA Servers

The 22nd ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) (Acceptance rate: 17.4%)

Wonjun Song, Gwangsun Kim, Hyungjoon Jung, Jongwook Chung, Jung Ho Ahn, Jae W Lee, John Kim

[ Paper ]

Toward Standardized Near-Data Processing with Unrestricted Data Placement for GPUs

The International Conference for High Performance Computing, Networking, Storage and Analysis (SC) (Acceptance rate: 18.7%)

Gwangsun Kim, Niladrish Chatterjee, Mike O’Connor, Kevin Hsieh

[ Paper ] [ Slides ]

2016

Contention-based Congestion Management in Large-Scale Networks

The 49th IEEE/ACM International Symposium on Microarchitecture (MICRO) (Acceptance rate: 21.6%)

Gwangsun Kim, Changhyun Kim, Jiyun Jeong, Mike Parker, John Kim

[ Paper ] [ Slides ]

Transparent Offloading and Mapping (TOM): Enabling Programmer-Transparent Near-Data Processing in GPU Systems

The 43rd International Symposium on Computer Architecture (ISCA) (Acceptance rate: 19.6%)

Kevin Hsieh, Eiman Ebrahimi, Gwangsun Kim, Niladrish Chatterjee, Mike O’Connor, Nandita Vijaykumar, Onur Mutlu, Stephen W. Keckler

[ Paper ] [ Slides ]

iPAWS: Instruction-Issue Pattern-based Adaptive Warp Scheduling for GPGPUs

The 22nd IEEE International Symposium on High Performance Computer Architecture (HPCA) (Acceptance rate: 22.1%)

Minseok Lee, Gwangsun Kim, John Kim, Woong Seo, Yeongon Cho, Soojung Ryu

[ Paper ]

Automatically Exploiting Implicit Pipeline Parallelism from Multiple Dependent Kernels for GPUs

The 25th International Conference on Parallel Architectures and Compilation Techniques (PACT) (Acceptance rate: 26.1%)

Gwangsun Kim, Jiyun Jeong, John Kim, Mark Stephenson

[ Paper ] [ Slides ]

Accelerating Linked-list Traversal through Near-Data Processing

The 25th International Conference on Parallel Architectures and Compilation Techniques (PACT) (Acceptance rate: 26.1%)

Byungchul Hong, Gwangsun Kim, Jung Ho Ahn, Yongkee Kwon, Hongsik Kim, John Kim

[ Paper ] [ Slides ]

Best Paper Nominee

Design and Analysis of Hybrid Flow Control for Hierarchical Ring Network-on-Chip

IEEE Transactions on Computers, vol. 65, no. 2, pp. 480-494

Hanjoon Kim, Gwangsun Kim, Hwasoo Yeo, John Kim, Seungryoul Maeng

[ Paper ]

2015

Overcoming Far-end Congestion in Large-Scale Networks

The 21st IEEE International Symposium on High Performance Computer Architecture (HPCA) (Acceptance rate: 22.1%)

Jongmin Won, Gwangsun Kim, John Kim, Ted Jiang, Mike Parker, Steve Scott

[ Paper ]

2014

Multi-GPU System Design with Memory Networks

The 47th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) (Acceptance rate: 19.4%)

Gwangsun Kim, Minseok Lee, Jiyun Jeong, John Kim

[ Paper ] [ Slides ]

Memory Network: Enabling Technology for Scalable Near-Data Computing

The 2nd Workshop on Near-Data Processing (WoNDP) (in conjunction with MICRO-47)

Gwangsun Kim, John Kim, Jung Ho Ahn, Yongkee Kwon

[ Paper ] [ Slides ]

Transportation-Network Inspired Network-on-Chip

The 20th International Symposium on High Performance Computer Architecture (HPCA) (Acceptance rate: 25.6%)

Hanjoon Kim, Gwangsun Kim, Hwasoo Yeo, Seungryoul Maeng, John Kim

[ Paper ]

Low-overhead Network-on-Chip Support for Location-oblivious Task Placement

IEEE Transactions on Computers, vol. 63, no. 6, pp. 1487-1500, June 2014

Gwangsun Kim, Michael M. Lee, John Kim, Jae W. Lee, Dennis Abts, Michael Marty

[ Paper ]

2013

Memory-centric System Interconnect Design with Hybrid Memory Cubes

The 22nd International Conference on Parallel Architectures and Compilation Techniques (PACT) (Acceptance rate: 17.3%)

Gwangsun Kim, John Kim, Jung Ho Ahn, Jaeha Kim

[ Paper ] [ Slides ]

Best Paper Award

2012

Scalable On-chip Network in Power Constrained Manycore Processors

The 3rd International Green Computing Conference (IGCC)

Hanjoon Kim, Gwangsun Kim, John Kim

[ Paper ]

2011

FlexiBuffer: Reducing Leakage Power in On-Chip Network Routers

The 48th ACM/EDAC/IEEE Design Automation Conference (DAC) (Acceptance rate: 22%)

Gwangsun Kim, John Kim, Sungjoo Yoo

[ Paper ] [ Slides ]