Tag: Model Quantization
All posts tagged "Model Quantization".
-
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
Updated on: Now that I am starting QAT/PTQ work on SNN-LLM, I am rereading some of the activation quantization papers I had looked at before.
-
I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models
Updated on: Integer-only PTQ for LLMs.
-
I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference
Updated on: Integer-only quantization for ViT, W8A8; Chinese Academy of Sciences, ICCV 2023.
-
Efficient and Effective Methods for Mixed Precision Neural Network Quantization for Faster, Energy-efficient Inference
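Updated on: EAGL, which claims to complete quantization of ResNet in under 3 seconds using only a CPU, far more efficiently than HAWQ and other traditional methods.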
-
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
Updated on: From Google; the first work to get a complete integer-only quantized inference pipeline running end to end.
-
HAWQ: Hessian Aware Quantization of Neural Networks with Mixed-Precision
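Updated on: A classic model quantization method based on the Hessian matrix, that is, a quantization method that uses second-order information.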