KMS

Browse/Search results: 4 items, showing 1-4

Make LLM Inference Affordable to Everyone: Augmenting GPU Memory with NDP-DIMM (Conference paper)
2025 IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA), Las Vegas, NV, USA, 1-5 March 2025
Authors:  Lian Liu;  Shixin Zhao;  Bing Li;  Haimeng Ren;  Zhaohui Xu
Adobe PDF(1300Kb)  |  Views/Downloads: 34/1  |  Submitted: 2025/04/14
Accelerating Mini-batch HGNN Training by Reducing CUDA Kernels (Conference paper)
LECTURE NOTES IN COMPUTER SCIENCE (INCLUDING SUBSERIES LECTURE NOTES IN ARTIFICIAL INTELLIGENCE AND LECTURE NOTES IN BIOINFORMATICS), Macau, China, October 29, 2024 - October 31, 2024
Authors:  Wu, Meng;  Qiu, Jingkai;  Yan, Mingyu;  Li, Wenming;  Zhang, Yang
Views/Downloads: 339/0  |  Submitted: 2025/03/14
COMET: Towards Practical W4A4KV4 LLMs Serving (Conference paper)
INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS - ASPLOS, Rotterdam, Netherlands, March 30, 2025 - April 3, 2025
Authors:  Liu, Lian;  Cheng, Long;  Ren, Haimeng;  Xu, Zhaohui;  Pan, Yudong
Adobe PDF(2187Kb)  |  Views/Downloads: 37/1  |  Submitted: 2025/05/09
ZeroTetris: A Spacial Feature Similarity-based Sparse MLP Engine for Neural Volume Rendering (Conference paper)
PROCEEDINGS - DESIGN AUTOMATION CONFERENCE, San Francisco, CA, United States, June 23, 2024 - June 27, 2024
Authors:  Wan, Haochuan;  Ma, Linjie;  Li, Antong;  Zhou, Pingqiang;  Yu, Jingyi
Adobe PDF(1288Kb)  |  Views/Downloads: 204/3  |  Submitted: 2024/12/27