Tag: Accelerators

All the articles with the tag "Accelerators".

Prosperity: Accelerating Spiking Neural Networks via Product Sparsity
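Published: June 11, 2025 at 16:52
An SNN accelerator paper under submission to HPCA. Its “Product Sparsity” is essentially about eliminating repeated computation on identical content, which is a different concept from sparsity as it is usually discussed.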
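
To make the "avoid recomputing identical content" reading concrete, here is a minimal Python sketch. It is a simplification based only on the note above, not the paper's actual mechanism: it assumes binary spike rows and reuses the result whenever two rows are exactly identical. The function name `spike_matmul_with_reuse` and the toy data are hypothetical.

```python
def spike_matmul_with_reuse(spikes, weights):
    """Multiply a binary spike matrix by a weight matrix,
    computing each distinct spike row only once (assumed simplification)."""
    cache = {}      # maps a spike row (as a tuple) to its output row
    out = []
    computed = 0    # number of rows that actually required arithmetic
    for row in spikes:
        key = tuple(row)
        if key not in cache:
            # Spikes are 0/1, so the product is the sum of the selected weight rows.
            acc = [0] * len(weights[0])
            for k, s in enumerate(row):
                if s:
                    for j, w in enumerate(weights[k]):
                        acc[j] += w
            cache[key] = acc
            computed += 1
        out.append(list(cache[key]))
    return out, computed


spikes = [[1, 0, 1], [0, 1, 0], [1, 0, 1], [1, 0, 1]]
weights = [[1, 2], [3, 4], [5, 6]]
result, computed = spike_matmul_with_reuse(spikes, weights)
print(result, computed)
```

Note the contrast with ordinary sparsity: skipping zeros saves work inside one row, whereas this kind of reuse saves work across rows that carry the same content.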
DeltaCNN: End-to-End CNN Inference of Sparse Frame Differences in Videos
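Updated: May 23, 2025 at 15:07; Published: May 23, 2025 at 12:11
Exploits the “linearity” of CNN layers to compute feature differences between frames, with a CUDA-accelerated implementation. The idea is almost identical to ViStream; could it solve our current problem?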
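
The linearity argument can be checked with a small Python sketch. For a convolution without bias, conv(x_t) = conv(x_{t-1}) + conv(x_t - x_{t-1}), so a cached output can be updated from the nonzero entries of the frame difference alone. This is an illustrative 1D toy, not the DeltaCNN CUDA implementation; the function names are hypothetical, and nonlinear layers need separate handling that is not covered here.

```python
def conv1d(x, kernel):
    """Valid 1D correlation without bias: the linear part of a conv layer."""
    n = len(x) - len(kernel) + 1
    return [sum(x[i + k] * kernel[k] for k in range(len(kernel))) for i in range(n)]


def conv1d_delta_update(prev_out, delta, kernel):
    """Update a cached output using only the nonzero entries of the frame delta."""
    out = list(prev_out)
    touched = 0  # number of input positions that actually changed
    for pos, d in enumerate(delta):
        if d == 0:
            continue  # an unchanged input contributes nothing, by linearity
        touched += 1
        for k, w in enumerate(kernel):
            i = pos - k  # output index whose window sees input `pos` at tap k
            if 0 <= i < len(out):
                out[i] += d * w
    return out, touched


frame0 = [1, 2, 3, 4, 5, 6]
frame1 = [1, 2, 3, 9, 5, 6]          # only one value differs from frame0
kernel = [1, 0, -1]
delta = [b - a for a, b in zip(frame0, frame1)]

cached = conv1d(frame0, kernel)
updated, touched = conv1d_delta_update(cached, delta, kernel)
print(updated == conv1d(frame1, kernel), touched)
```

The saving comes from `touched` being small when consecutive frames are similar; the dense recomputation and the delta update produce the same output.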
Phi: Leveraging Pattern-based Hierarchical Sparsity for High-Efficiency Spiking Neural Networks
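Published: May 21, 2025 at 17:45
ISCA 2025. An SNN accelerator based on structured sparsity. Storing results directly in a LUT could require too many sparse patterns and an excessive memory footprint, so the paper pre-calibrates one level of “structured sparsity” that splits the online spike activations into an L1 Sparse part, which can be computed entirely through the LUT, and an L2 Sparse part with very high sparsity. Could we borrow the idea and port it to the GPU?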
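
One way to picture the two-level split described above is the following Python sketch. It is an assumption-laden toy, not the paper's design: it assumes a small, pre-calibrated set of binary patterns, picks the pattern closest in Hamming distance to each activation row (the L1 part, answered from a LUT), and applies the few remaining +1/-1 differences as corrections (the L2 part). All names (`build_lut`, `two_level_product`) and the data are hypothetical.

```python
def weighted_sum(vec, weights):
    """Sum of weight rows scaled by the entries of vec (entries in {-1, 0, 1})."""
    acc = [0] * len(weights[0])
    for k, v in enumerate(vec):
        if v:
            for j, w in enumerate(weights[k]):
                acc[j] += v * w
    return acc


def build_lut(patterns, weights):
    """Precompute the product of every calibrated pattern with the weights."""
    return {tuple(p): weighted_sum(p, weights) for p in patterns}


def two_level_product(activation, patterns, lut, weights):
    """L1: look up the nearest pattern; L2: correct the few differing positions."""
    best = min(patterns, key=lambda p: sum(a != b for a, b in zip(activation, p)))
    residual = [a - b for a, b in zip(activation, best)]  # entries in {-1, 0, 1}
    out = list(lut[tuple(best)])
    corrections = 0
    for k, r in enumerate(residual):
        if r:
            corrections += 1
            for j, w in enumerate(weights[k]):
                out[j] += r * w
    return out, corrections


weights = [[1, 2], [3, 4], [5, 6], [7, 8]]
patterns = [[1, 1, 0, 0], [0, 0, 1, 1]]   # assumed output of an offline calibration
lut = build_lut(patterns, weights)
out, corrections = two_level_product([1, 1, 0, 1], patterns, lut, weights)
print(out, corrections)
```

The LUT stays small because it only holds the calibrated patterns, and the residual stays sparse as long as calibration picks patterns that are close to typical activations.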
SpikeSim: An end-to-end Compute-in-Memory Hardware Evaluation Tool for Benchmarking Spiking Neural Networks
Updated: March 8, 2025 at 15:06; Published: March 4, 2024 at 18:33
A hardware design or evaluation benchmark for SNN deployment.
Evaluating Spatial Accelerator Architectures with Tiled Matrix-Matrix Multiplication
Updated: March 8, 2025 at 15:06; Published: March 4, 2024 at 18:31
An introduction to GEMM data mapping, mainly covering accelerators built around various systolic arrays.
Optimizing Bit-Serial Matrix Multiplication for Reconfigurable Computing
Updated: March 8, 2025 at 15:06; Published: March 4, 2024 at 18:29
Optimizations to BISMO.
TVM: An Automated End-to-End Optimizing Compiler for Deep Learning
Updated: March 8, 2025 at 15:06; Published: March 4, 2024 at 18:29
TVM.
A Hardware-Software Blueprint for Flexible Deep Learning Specialization
Updated: March 8, 2025 at 15:06; Published: March 4, 2024 at 16:33
VTA.
BISMO: A Scalable Bit Serial Matrix Multiplication Overlay for Reconfigurable Computing
Updated: March 8, 2025 at 15:06; Published: March 4, 2024 at 14:31
BISMO.
