AndyBlocker

Recent Posts

T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge
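Published: July 7, 2025 at 16:23
T-MAC uses lookup tables (LUTs) to accelerate BitNet-style low-bit models on the CPU. A follow-up work, T-MAN, runs LUT-based acceleration on the NPU inside Qualcomm mobile SoCs.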
HYTE: Flexible Tiling for Sparse Accelerators via Hybrid Static-Dynamic Approaches
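Published: June 25, 2025 at 16:27
ISCA 2025. A paper on tiling for sparse dataflows. I did not have the energy to read the second half closely, since my current work does not involve sparse encoding yet.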
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Published: June 23, 2025 at 17:47
A look at shifted-window attention.
SpikeVideoFormer: An Efficient Spike-Driven Video Transformer with Hamming Attention and O(T) Complexity
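Published: June 17, 2025 at 16:56
Replaces the dot product in attention with Hamming distance to avoid cases where spikes are misaligned. The method itself is fairly interesting, but the experiments feel mediocre. In particular, the paper claims a hardware implementation, yet the energy numbers are still computed purely at the algorithm level, and the FPGA implementation is either not disclosed or not explained clearly. Accuracy does not exceed the ANN2SNN SOTA. The main takeaway is still that operators ill-suited to SNNs need to be replaced with other operations.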
Sparse Spiking Neural Network: Exploiting Heterogeneity in Timescales for Pruning Recurrent SNN
Published: June 11, 2025 at 19:11
ICLR 2024 Spotlight. Uses Lyapunov Noise for SNN pruning.
Prosperity: Accelerating Spiking Neural Networks via Product Sparsity
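Published: June 11, 2025 at 16:52
An SNN accelerator paper submitted to HPCA. Its "Product Sparsity" is essentially about eliminating repeated computation of identical content, which is a different concept from sparsity as it is usually discussed.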
Towards Scalable GPU-Accelerated SNN Training via Temporal Fusion
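Published: June 10, 2025 at 14:34
Hard to see the point. It implements LIF in a layer-by-layer fashion and has no other contribution. It was published at a conference called ICANN. The amount of work is far too small.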
Recurrent Residual Module for Fast Inference in Videos
Published: June 9, 2025 at 15:25
CVPR 2018. DiffEncode plus sparse acceleration, but it feels too dated.
Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models
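Published: June 9, 2025 at 14:18
A fairly influential NeurIPS 2022 paper on inference acceleration for GANs and diffusion models. It proposes Spatially Sparse Inference: convolutional filters are applied sparsely, only on the edited regions, while cached features are reused for the unedited regions.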
SlowFast Networks for Video Recognition
Updated: May 30, 2025 at 06:15 | Published: May 27, 2025 at 16:57
A multi-branch CNN. Could some of the branches end up learning more similar inter-frame changes?

All Posts
