Tag: 算法
All the articles with the tag "算法" (Algorithms).
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Published: at 17:47
Take a look at the Shift-Window Attention mechanism.
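A minimal NumPy sketch (my own illustration, not the paper's code) of the shifted-window trick: Swin cyclically shifts the feature map (`torch.roll` in the official implementation) so that the next round of non-overlapping window attention mixes information across window boundaries.

```python
import numpy as np

def cyclic_shift(x, shift):
    # Cyclically shift the H and W axes of an (H, W, C) feature map,
    # as Swin does before re-partitioning into windows.
    return np.roll(x, shift=(-shift, -shift), axis=(0, 1))

def window_partition(x, ws):
    # Split an (H, W, C) map into non-overlapping (ws, ws, C) windows.
    H, W, C = x.shape
    x = x.reshape(H // ws, ws, W // ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws, ws, C)

x = np.arange(16).reshape(4, 4, 1)
shifted = cyclic_shift(x, 2)       # shifted[0, 0] now holds x[2, 2]
windows = window_partition(shifted, 2)  # 4 windows of shape (2, 2, 1)
```

After attention, the shift is undone with the opposite roll; the paper also masks attention between tokens that were wrapped around from opposite edges.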
SpikeVideoFormer: An Efficient Spike-Driven Video Transformer with Hamming Attention and O(T) Complexity
Published: at 16:56
Replaces the dot product in attention with Hamming distance to avoid misaligned spikes. The core idea is interesting, but the experiments feel mediocre: despite claiming a hardware implementation, the energy numbers are computed purely analytically, the FPGA details are not disclosed, and what is said is not said clearly. Accuracy does not surpass the ANN2SNN SOTA. The key takeaway remains: replace the operators that are unfriendly to SNNs with alternative operations.
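A sketch of my reading of the idea (not the paper's exact formulation): for binary spike tensors, the usual score `q·k` is swapped for a Hamming-based similarity, `d - HammingDistance(q, k)`, which only needs XOR and popcount in hardware.

```python
import numpy as np

def hamming_attention_scores(Q, K):
    # Q, K: binary spike matrices of shape (n, d) / (m, d), entries in {0, 1}.
    # Hamming distance counts mismatched positions; d - distance turns it
    # into a similarity score replacing the dot product Q K^T.
    d = Q.shape[1]
    # XOR via broadcasting: (n, 1, d) against (1, m, d)
    dist = np.logical_xor(Q[:, None, :], K[None, :, :]).sum(-1)
    return d - dist

Q = np.array([[1, 0, 1, 1], [0, 0, 1, 0]])
K = np.array([[1, 0, 1, 1], [1, 1, 0, 0]])
scores = hamming_attention_scores(Q, K)  # identical rows score d = 4
```

Unlike the dot product, this rewards matching zeros as well as matching ones, which is presumably why it is more robust to spikes being shifted in time.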
SlowFast Networks for Video Recognition
Updated: at 06:15, Published: at 16:57
A multi-branch CNN; could some of the branches learn more similar inter-frame changes?
DeltaCNN: End-to-End CNN Inference of Sparse Frame Differences in Videos
Updated: at 15:07, Published: at 12:11
Exploits the "linearity" of CNN layers to compute feature differences between frames, with CUDA acceleration. Almost the same idea as ViStream; could it solve our current problem?
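The linearity argument can be sketched in a few lines (a toy illustration, not DeltaCNN's CUDA kernels): since convolution is linear, `conv(x_t) = conv(x_{t-1}) + conv(x_t - x_{t-1})`, so when consecutive frames differ in only a few pixels, the per-frame update only needs a sparse convolution of the difference.

```python
import numpy as np

def conv2d(x, k):
    # Naive "valid" 2-D convolution (correlation); linear in x, no bias.
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (x[i:i + kh, j:j + kw] * k).sum()
    return out

rng = np.random.default_rng(0)
k = rng.standard_normal((3, 3))
prev = rng.standard_normal((8, 8))
curr = prev.copy()
curr[2, 3] += 1.0  # only one pixel changed between frames

# Incremental update equals recomputation: conv(curr) = conv(prev) + conv(delta)
delta_out = conv2d(prev, k) + conv2d(curr - prev, k)
full_out = conv2d(curr, k)
```

The equality breaks at biases and nonlinearities, which is exactly the bookkeeping DeltaCNN has to handle between layers.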
Temporal Flexibility in Spiking Neural Networks: Towards Generalization Across Time Steps and Deployment Friendliness
Published: at 15:38
ICLR 2025 Poster; it also seems to be working on elastic inference?
A Simple Framework for Contrastive Learning of Visual Representations
Published: at 13:42
The SimCLR contrastive learning paper. Can contrastive learning align the features of every layer?
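For reference, a compact NumPy version of SimCLR's NT-Xent loss (my own sketch of the paper's Eq. 1): rows `2i` and `2i+1` are the two augmented views of sample `i`, each view's partner is its positive, and all other rows are negatives.

```python
import numpy as np

def nt_xent(z, tau=0.5):
    # z: (2N, d) embeddings; rows 2i and 2i+1 are two views of sample i.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # L2-normalize
    sim = z @ z.T / tau                               # cosine similarities
    np.fill_diagonal(sim, -np.inf)                    # exclude self-pairs
    n = len(z)
    pos = np.arange(n) ^ 1                            # partner view index
    logits = sim - sim.max(axis=1, keepdims=True)     # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(n), pos].mean()

z = np.random.default_rng(0).standard_normal((4, 8))
loss = nt_xent(z)  # a positive scalar
```

This operates on the final projection head only; aligning intermediate layers would mean applying a loss like this per layer, which is exactly the open question in the note above.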
QKFormer: Hierarchical Spiking Transformer using Q-K Attention
Published: at 18:09
QKFormer, a NeurIPS 2024 Spotlight, pushes directly-trained SNN accuracy on ImageNet and CIFAR remarkably high; it feels unavoidable for any follow-up work in this space.
Transformers without Normalization
Published: at 16:09
New work from Kaiming He's group that replaces Norm with DyT, turning a synchronized operation into an element-wise one. The new article uses it, so it is worth studying.
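The replacement itself is tiny; a sketch of Dynamic Tanh as the paper describes it, `DyT(x) = γ ⊙ tanh(αx) + β` with learnable scalar `α` and per-channel `γ`, `β` (values below are illustrative, not the paper's initialization):

```python
import numpy as np

def dyt(x, alpha, gamma, beta):
    # Dynamic Tanh from "Transformers without Normalization":
    # DyT(x) = gamma * tanh(alpha * x) + beta, purely element-wise.
    # Unlike LayerNorm, no per-token mean/variance reduction is required,
    # which is what removes the synchronization.
    return gamma * np.tanh(alpha * x) + beta

x = np.array([[-2.0, 0.0, 2.0]])
y = dyt(x, alpha=0.5, gamma=np.ones(3), beta=np.zeros(3))  # shape preserved
```

The appeal for SNN-style hardware is the same as in the note above: the cross-element reduction of LayerNorm disappears, leaving only a pointwise nonlinearity.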
Visualizing and Understanding the Effectiveness of BERT
Published: at 10:21
While training SNNs recently I have been studying how to visualize the loss during training, wondering whether the newly added method changes the model's loss landscape. Papers on visualizing loss landscapes generally cite this one for its analysis and methodology.
One-Minute Video Generation with Test-Time Training
Published: at 18:17
The TTT-based video generation whose demo has been popular recently; it can generate long videos on the order of 60 seconds. Worth studying TTT; could SNN on-chip learning be combined with TTT?