Tag: 模型量化 (Model Quantization)
All the articles with the tag "模型量化" (Model Quantization).
I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models
Updated: at 15:06 · Published: at 17:11
Integer-only post-training quantization (PTQ) for LLMs.
I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference
Updated: at 15:06 · Published: at 15:56
Integer-only quantization for ViT (W8A8), from the Chinese Academy of Sciences, ICCV 2023.
Efficient and Effective Methods for Mixed Precision Neural Network Quantization for Faster, Energy-efficient Inference
Updated: at 15:06 · Published: at 16:28
EAGL, which claims to quantize ResNet within 3 seconds using only a CPU, far more efficient than HAWQ and other traditional methods.
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
Updated: at 15:06 · Published: at 18:33
From Google; the first work to run a complete integer-only quantized inference pipeline end to end.
HAWQ: Hessian Aware Quantization of Neural Networks with Mixed-Precision
Updated: at 15:06 · Published: at 18:30
A classic model quantization method based on the Hessian matrix, i.e., a quantization approach driven by second-order information.