Tag: 模型量化 (Model Quantization)
All the articles with the tag "模型量化" (Model Quantization).
I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models
Updated: at 15:06 · Published: at 17:11
Integer-only post-training quantization (PTQ) for LLMs.
I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference
Updated: at 15:06 · Published: at 15:56
Integer-only quantization for ViT (W8A8), from the Chinese Academy of Sciences, ICCV 2023.
Efficient and Effective Methods for Mixed Precision Neural Network Quantization for Faster, Energy-efficient Inference
Updated: at 15:06 · Published: at 16:28
EAGL, which claims to quantize ResNet within 3 seconds using only a CPU, far more efficient than HAWQ and other traditional methods.
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
Updated: at 15:06 · Published: at 18:33
From Google; the first work to run a complete integer-only quantized inference pipeline end to end.
HAWQ: Hessian Aware Quantization of Neural Networks with Mixed-Precision
Updated: at 15:06 · Published: at 18:30
A classic model quantization method based on the Hessian matrix, i.e., a quantization approach driven by second-order information.