Tag: Accelerators
All the articles with the tag "Accelerators".
DeltaCNN: End-to-End CNN Inference of Sparse Frame Differences in Videos
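Updated: at 15:07 | Published: at 12:11
Exploits the "linear" property of CNN layers to compute feature differences between frames, with CUDA acceleration. Almost the same idea as ViStream; could it solve our current problem?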
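A minimal sketch of the linearity argument, not DeltaCNN's actual CUDA implementation: for a bias-free convolution, conv(next) = conv(prev) + conv(next - prev), so only the non-zero entries of the frame difference need any work. The 1D setting and the names `conv1d` and `delta_update` are assumptions made for illustration.

```python
def conv1d(signal, kernel):
    """Valid-mode 1D correlation without bias, i.e. a linear operator."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def delta_update(prev_output, prev_frame, next_frame, kernel):
    """Update a cached output using only the entries that changed between frames."""
    out = list(prev_output)
    k = len(kernel)
    for idx, (old, new) in enumerate(zip(prev_frame, next_frame)):
        d = new - old
        if d == 0:
            continue  # unchanged input: nothing to compute
        # Input position idx contributes to outputs idx - j for each kernel tap j.
        for j in range(k):
            i = idx - j
            if 0 <= i < len(out):
                out[i] += d * kernel[j]
    return out


# Hypothetical data: two "frames" that differ in a single position.
frame0 = [1, 2, 3, 4, 5, 6]
frame1 = [1, 2, 9, 4, 5, 6]
kernel = [1, 0, -1]

cached = conv1d(frame0, kernel)
updated = delta_update(cached, frame0, frame1, kernel)
print(updated == conv1d(frame1, kernel))  # dense and delta paths agree
```

The identity only holds for the linear part of a layer; non-linear activations need separate handling, which is presumably why "linear" is in quotes in the note above.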
Phi: Leveraging Pattern-based Hierarchical Sparsity for High-Efficiency Spiking Neural Networks
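Published: at 17:45
ISCA 2025. An SNN accelerator based on structured sparsity. Storing results directly in a LUT could require too many sparse patterns, making the memory footprint prohibitive. Instead, a level of "structured sparsity" is calibrated in advance, which splits the online spike activations into an L1 sparse part that can be computed entirely by LUT lookup and an L2 sparse part with very high sparsity. Could we borrow the idea and port it to the GPU?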
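A rough sketch of the two-level split as I read it from the note above, not the paper's actual hardware dataflow. It assumes binary spike vectors, a small set of pre-calibrated patterns chosen by Hamming distance, and a single weight column; `decompose`, `hierarchical_dot`, and the toy data are all hypothetical.

```python
def dot(spikes, weights):
    """Dense reference: dot product of a binary spike vector and a weight column."""
    return sum(s * w for s, w in zip(spikes, weights))


def decompose(spikes, patterns):
    """Split spikes into the closest calibrated pattern (L1) and a sparse residual (L2)."""
    best = min(range(len(patterns)),
               key=lambda p: sum(a != b for a, b in zip(spikes, patterns[p])))
    # Residual entries are +1 (spike missing from the pattern) or -1 (extra spike in it).
    residual = {i: s - q
                for i, (s, q) in enumerate(zip(spikes, patterns[best]))
                if s != q}
    return best, residual


def hierarchical_dot(spikes, patterns, lut, weights):
    """L1 part comes from a table lookup; L2 part touches only the few corrected weights."""
    idx, residual = decompose(spikes, patterns)
    return lut[idx] + sum(sign * weights[i] for i, sign in residual.items())


# Hypothetical calibration result: three patterns over 4-wide spike vectors.
patterns = [[0, 0, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1]]
weights = [3, -2, 5, 7]
lut = [dot(p, weights) for p in patterns]  # precomputed offline, one entry per pattern

spikes = [1, 0, 1, 1]
print(hierarchical_dot(spikes, patterns, lut, weights) == dot(spikes, weights))
```

The LUT size grows with the number of calibrated patterns rather than with all 2^n possible spike vectors, which is the memory saving the note refers to; the price is the residual correction, which stays cheap only if the patterns fit the activations well.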
SpikeSim: An end-to-end Compute-in-Memory Hardware Evaluation Tool for Benchmarking Spiking Neural Networks
Updated: at 15:06 | Published: at 18:33
A hardware design or evaluation benchmark for deploying SNNs.
Evaluating Spatial Accelerator Architectures with Tiled Matrix-Matrix Multiplication
Updated: at 15:06 | Published: at 18:31
An introduction to GEMM data mapping, mainly covering accelerators built around various systolic arrays.
Optimizing Bit-Serial Matrix Multiplication for Reconfigurable Computing
Updated: at 15:06 | Published: at 18:29
BISMO optimizations.
TVM: An Automated End-to-End Optimizing Compiler for Deep Learning
Updated: at 15:06 | Published: at 18:29
TVM.
A Hardware-Software Blueprint for Flexible Deep Learning Specialization
Updated: at 15:06 | Published: at 16:33
VTA.
BISMO: A Scalable Bit Serial Matrix Multiplication Overlay for Reconfigurable Computing
Updated: at 15:06 | Published: at 14:31
BISMO.