Publications
2025
PyTorchSim: A Comprehensive, Fast, and Accurate NPU Simulation Framework
The 58th IEEE/ACM International Symposium on Microarchitecture (MICRO) (Acceptance rate: 20.8%)
Wonhyuk Yang*, Yunseon Shin*, Okkyun Woo*, Geonwoo Park, Hyungkyu Ham, Jeehoon Kang, Jongse Park, Gwangsun Kim (*: co-first authors)
LibraPIM: Dynamic Load Rebalancing to Maximize Utilization in PIM-Assisted LLM Inference Systems
2025 International Conference on Parallel Architectures and Compilation Techniques (PACT)
Hyeongjun Cho, Yoonho Jang, Hyungi Kim, Seongwook Kim, Keewon Kwon, Gwangsun Kim, Seokin Hong
Cost-Effective Extension of DRAM-PIM for Group-Wise LLM Quantization
IEEE Computer Architecture Letters (CAL)
Byeori Kim, Changhun Lee, Gwangsun Kim, Eunhyeok Park
[ IEEE Xplore ]
2024
Low-overhead General-purpose Near-Data Processing in CXL Memory Expanders
The 57th IEEE/ACM International Symposium on Microarchitecture (MICRO) (Acceptance rate: 22.7%)
Hyungkyu Ham*, Jeongmin Hong*, Geonwoo Park, Yunseon Shin, Okkyun Woo, Wonhyuk Yang, Jinhoon Bae, Eunhyeok Park, Hyojin Sung, Euicheol Lim, Gwangsun Kim (*: co-first authors)
[ IEEE Xplore ] [ arXiv ] [ Slides ]
NeuPIMs: NPU-PIM Heterogeneous Acceleration for Batched LLM Inferencing
The 29th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)
Guseul Heo, Sangyeop Lee, Jaehong Cho, Hyunmin Choi, Sanghyeon Lee, Hyungkyu Ham, Gwangsun Kim, Divya Mahajan, Jongse Park
Bandwidth-Effective DRAM Cache for GPUs with Storage-Class Memory
The 30th International Symposium on High-Performance Computer Architecture (HPCA) (Acceptance rate: 18.3%)
Jeongmin Hong, Sungjun Cho, Geonwoo Park, Wonhyuk Yang, Young-Ho Gong, Gwangsun Kim
[ IEEE Xplore ] [ arXiv ] [ Slides ] [ Github ]
CR2: Community-aware Compressed Regular Representation for Graph Processing on a GPU
The 53rd International Conference on Parallel Processing (ICPP)
Shinnung Jeong, Sungjun Cho, Yongwoo Lee, Hyunjun Park, Seonyeong Heo, Gwangsun Kim, Youngsok Kim, Hanjun Kim
[ ACM DL ]
Non-Invasive, Memory Access-Triggered Near-Data Processing for DNN Training Acceleration on GPUs
IEEE Access
Hyungkyu Ham*, Hyunuk Cho*, Minjae Kim, Jueon Park, Jeongmin Hong, Hyojin Sung, Eunhyeok Park, Euicheol Lim, Gwangsun Kim (*: co-first authors)
[ IEEE Xplore ]
ONNXim: A Fast, Cycle-level Multi-core NPU Simulator
IEEE Computer Architecture Letters (CAL)
Hyungkyu Ham*, Wonhyuk Yang*, Yunseon Shin, Okkyun Woo, Guseul Heo, Sangyeop Lee, Jongse Park, Gwangsun Kim (*: co-first authors)
[ IEEE Xplore ] [ arXiv ] [ Github ]
2022
Dynamic Global Adaptive Routing in High-radix Networks
The 49th International Symposium on Computer Architecture (ISCA)
Hans Kasan, Gwangsun Kim, Yung Yi, John Kim
[ Paper ]
Overcoming Memory Capacity Wall of GPUs With Heterogeneous Memory Stack
IEEE Computer Architecture Letters (CAL)
Jeongmin Hong*, Sungjun Cho*, Gwangsun Kim (*: co-first authors)
[ Paper ]
2021
Near-Data Processing in Memory Expander for DNN Acceleration on GPUs
IEEE Computer Architecture Letters (CAL)
Hyungkyu Ham*, Hyunuk Cho*, Minjae Kim, Jueon Park, Jeongmin Hong, Hyojin Sung, Eunhyeok Park, Euicheol Lim, Gwangsun Kim (*: co-first authors)
[ Paper ]
2017
History-Based Arbitration for Fairness in Processor-Interconnect of NUMA Servers
The 22nd ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) (Acceptance rate: 17.4%)
Wonjun Song, Gwangsun Kim, Hyungjoon Jung, Jongwook Chung, Jung Ho Ahn, Jae W. Lee, John Kim
[ Paper ]
2016
Transparent Offloading and Mapping (TOM): Enabling Programmer-Transparent Near-Data Processing in GPU Systems
The 43rd International Symposium on Computer Architecture (ISCA) (Acceptance rate: 19.6%)
Kevin Hsieh, Eiman Ebrahimi, Gwangsun Kim, Niladrish Chatterjee, Mike O’Connor, Nandita Vijaykumar, Onur Mutlu, Stephen W. Keckler
iPAWS: Instruction-Issue Pattern-based Adaptive Warp Scheduling for GPGPUs
The 22nd IEEE International Symposium on High Performance Computer Architecture (HPCA) (Acceptance rate: 22.1%)
Minseok Lee, Gwangsun Kim, John Kim, Woong Seo, Yeongon Cho, Soojung Ryu
[ Paper ]
Design and Analysis of Hybrid Flow Control for Hierarchical Ring Network-on-Chip
IEEE Transactions on Computers, vol. 65, no. 2, pp. 480-494, 1
Hanjoon Kim, Gwangsun Kim, Hwasoo Yeo, John Kim, Seungryoul Maeng
[ Paper ]
2015
Overcoming Far-end Congestion in Large-Scale Networks
The 21st IEEE International Symposium on High Performance Computer Architecture (HPCA) (Acceptance rate: 22.1%)
Jongmin Won, Gwangsun Kim, John Kim, Ted Jiang, Mike Parker, Steve Scott
[ Paper ]
2014
Transportation-Network Inspired Network-on-Chip
The 20th International Symposium on High Performance Computer Architecture (HPCA) (Acceptance rate: 25.6%)
Hanjoon Kim, Gwangsun Kim, Hwasoo Yeo, Seungryoul Maeng, John Kim
[ Paper ]
Low-overhead Network-on-Chip Support for Location-oblivious Task Placement
IEEE Transactions on Computers, vol. 63, no. 6, pp. 1487-1500, June 2014
Gwangsun Kim, Michael M. Lee, John Kim, Jae W. Lee, Dennis Abts, Michael Marty
[ Paper ]
2012
Scalable On-chip Network in Power Constrained Manycore Processors
The 3rd International Green Computing Conference (IGCC)
Hanjoon Kim, Gwangsun Kim, John Kim
[ Paper ]