<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>起居室老虎</title><description>Notes from my studying and reading.</description><link>https://mer.run/</link><item><title>A Unified View of Attention and Residual Sinks: Outlier-Driven Rescaling is Essential for Transformer Training</title><link>https://mer.run/posts/a-unified-view-of-attention-and-residual-sinks-outlier-driven-rescaling-is-essential-for-transforme/</link><guid isPermaLink="true">https://mer.run/posts/a-unified-view-of-attention-and-residual-sinks-outlier-driven-rescaling-is-essential-for-transforme/</guid><description>From the Qwen team: an analysis of how outliers arise in LLMs and what effects they have.</description><pubDate>Mon, 02 Mar 2026 08:05:00 GMT</pubDate></item><item><title>2025</title><link>https://mer.run/posts/2025/</link><guid isPermaLink="true">https://mer.run/posts/2025/</guid><description>2025.</description><pubDate>Sun, 18 Jan 2026 16:55:00 GMT</pubDate></item><item><title>SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models</title><link>https://mer.run/posts/smoothquant-accurate-and-efficient-post-training-quantization-for-large-language-models/</link><guid isPermaLink="true">https://mer.run/posts/smoothquant-accurate-and-efficient-post-training-quantization-for-large-language-models/</guid><description>Starting QAT/PTQ for SNN-LLMs, so I am rereading some activation-quantization papers I had read before.</description><pubDate>Tue, 30 Dec 2025 08:20:00 GMT</pubDate></item><item><title>Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free</title><link>https://mer.run/posts/gated-attention-for-large-language-models-non-linearity-sparsity-and-attention-sink-free/</link><guid isPermaLink="true">https://mer.run/posts/gated-attention-for-large-language-models-non-linearity-sparsity-and-attention-sink-free/</guid><description>NeurIPS 2025 Best Paper, from Qwen. The experiments are just absurdly solid; they really do have the money for it.</description><pubDate>Wed, 03 Dec 2025 09:46:00 GMT</pubDate></item><item><title>Nested Learning: The Illusion of Deep Learning 
Architectures</title><link>https://mer.run/posts/nested-learning-the-illusion-of-deep-learning-architectures/</link><guid isPermaLink="true">https://mer.run/posts/nested-learning-the-illusion-of-deep-learning-architectures/</guid><description>A new paper from Google, billed as a "new paradigm for deep learning". It mentions asynchrony, specifically updating the layers near the input more frequently than the later ones, an idea somewhat similar to the earlier Sakana AI paper. But the material in the paper feels entirely like Fast Weight Programming, and the full text has still not appeared on arXiv.</description><pubDate>Mon, 10 Nov 2025 09:08:00 GMT</pubDate></item><item><title>Kimi Linear: An Expressive, Efficient Attention Architecture</title><link>https://mer.run/posts/kimi-linear-an-expressive-efficient-attention-architecture/</link><guid isPermaLink="true">https://mer.run/posts/kimi-linear-an-expressive-efficient-attention-architecture/</guid><description>Kimi Linear, with fairly detailed experiments &amp; scale-up. The conclusion that linear attention allows dropping RoPE is a pleasant surprise.</description><pubDate>Tue, 04 Nov 2025 11:10:00 GMT</pubDate></item><item><title>Speed Always Wins: A Survey on Efficient Architectures for Large Language Models</title><link>https://mer.run/posts/speed-always-wins-a-survey-on-efficient-architectures-for-large-language-models/</link><guid isPermaLink="true">https://mer.run/posts/speed-always-wins-a-survey-on-efficient-architectures-for-large-language-models/</guid><description>Work from AI Lab on LLM inference acceleration in the "broad" sense, covering linear attention, sparse attention, diffusion LLMs, applications, and more.</description><pubDate>Thu, 16 Oct 2025 07:15:00 GMT</pubDate></item><item><title>Neuromorphic Principles for Efficient Large Language Models on Intel Loihi 2</title><link>https://mer.run/posts/neuromorphic-principles-for-efficient-large-language-models-on-intel-loihi-2/</link><guid isPermaLink="true">https://mer.run/posts/neuromorphic-principles-for-efficient-large-language-models-on-intel-loihi-2/</guid><description>ICLR 2025 Workshop. A matmul-free SNN LLM built on HAQ (though the experiments only go up to 370M parameters) deployed on Loihi 2, achieving 3× the throughput and 2× the energy efficiency of a Qwen-500M model. Honestly, though, the paper barely explains its key points and offers nothing particularly exciting.</description><pubDate>Sun, 28 Sep 2025 16:39:00 
GMT</pubDate></item><item><title>Parallelizing Linear Transformers with the Delta Rule over Sequence Length</title><link>https://mer.run/posts/parallelizing-linear-transformers-with-the-delta-rule-over-sequence-length/</link><guid isPermaLink="true">https://mer.run/posts/parallelizing-linear-transformers-with-the-delta-rule-over-sequence-length/</guid><description>DeltaNet.</description><pubDate>Fri, 26 Sep 2025 08:46:00 GMT</pubDate></item><item><title>Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity</title><link>https://mer.run/posts/flash-llm-enabling-cost-effective-and-highly-efficient-large-generative-model-inference-with-unstru/</link><guid isPermaLink="true">https://mer.run/posts/flash-llm-enabling-cost-effective-and-highly-efficient-large-generative-model-inference-with-unstru/</guid><description>VLDB 2024, from Alibaba; the engineering looks remarkably solid. On LLM workloads, sparse loading of the weights alone yields a 3-4× speedup in the decode stage.</description><pubDate>Wed, 24 Sep 2025 07:07:00 GMT</pubDate></item><item><title>SpikingBrain-瞬息 1.0 Technical Report: A Home-Grown, Independently Controllable Brain-Inspired Spiking Large Model</title><link>https://mer.run/posts/spikingbrain-%E7%9E%AC%E6%81%AF-10%E6%8A%80%E6%9C%AF%E6%8A%A5%E5%91%8A%E5%8E%9F%E7%94%9F%E5%9B%BD%E4%BA%A7%E8%87%AA%E4%B8%BB%E5%8F%AF%E6%8E%A7%E7%B1%BB%E8%84%91%E8%84%89%E5%86%B2%E5%A4%A7%E6%A8%A1%E5%9E%8B/</link><guid isPermaLink="true">https://mer.run/posts/spikingbrain-%E7%9E%AC%E6%81%AF-10%E6%8A%80%E6%9C%AF%E6%8A%A5%E5%91%8A%E5%8E%9F%E7%94%9F%E5%9B%BD%E4%BA%A7%E8%87%AA%E4%B8%BB%E5%8F%AF%E6%8E%A7%E7%B1%BB%E8%84%91%E8%84%89%E5%86%B2%E5%A4%A7%E6%A8%A1%E5%9E%8B/</guid><description>A technical report on new work from Guoqi Li's group. Honestly, I don't consider this a proper SNN-LLM work; it feels entirely like a domestic re-creation of linear attention. Hard to judge.</description><pubDate>Mon, 15 Sep 2025 06:34:00 GMT</pubDate></item><item><title>MLP Memory: Language Modeling with Retriever-pretrained External Memory</title><link>https://mer.run/posts/mlp-memory-language-modeling-with-retriever-pretrained-external-memory/</link><guid 
isPermaLink="true">https://mer.run/posts/mlp-memory-language-modeling-with-retriever-pretrained-external-memory/</guid><description>Uses an MLP to learn, and stand in for, the probability distribution output by kNN retrieval in RAG.</description><pubDate>Mon, 25 Aug 2025 06:23:00 GMT</pubDate></item><item><title>Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention</title><link>https://mer.run/posts/native-sparse-attention-hardware-aligned-and-natively-trainable-sparse-attention/</link><guid isPermaLink="true">https://mer.run/posts/native-sparse-attention-hardware-aligned-and-natively-trainable-sparse-attention/</guid><description>ACL 2025 Best Paper, a new work from DeepSeek. A hierarchical KV cache raises sparsity, improving performance in both training and inference.</description><pubDate>Thu, 14 Aug 2025 08:12:00 GMT</pubDate></item><item><title>Sparse SNN Acceleration on GPUs</title><link>https://mer.run/posts/gpu%E4%B8%8A%E7%9A%84snn%E7%A8%80%E7%96%8F%E5%8A%A0%E9%80%9F/</link><guid isPermaLink="true">https://mer.run/posts/gpu%E4%B8%8A%E7%9A%84snn%E7%A8%80%E7%96%8F%E5%8A%A0%E9%80%9F/</guid><description>A summary of my recent work on sparse SNN acceleration on GPUs, even though it was not very successful.</description><pubDate>Mon, 14 Jul 2025 03:09:00 GMT</pubDate></item><item><title>T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge</title><link>https://mer.run/posts/t-mac-cpu-renaissance-via-table-lookup-for-low-bit-llm-deployment-on-edge/</link><guid isPermaLink="true">https://mer.run/posts/t-mac-cpu-renaissance-via-table-lookup-for-low-bit-llm-deployment-on-edge/</guid><description>T-MAC accelerates BitNet-family models with LUTs on CPUs; a follow-up called T-MAN runs the LUT acceleration on the NPU inside Qualcomm mobile chips.</description><pubDate>Mon, 07 Jul 2025 08:24:00 GMT</pubDate></item><item><title>HYTE: Flexible Tiling for Sparse Accelerators via Hybrid Static-Dynamic Approaches</title><link>https://mer.run/posts/hyte-flexible-tiling-for-sparse-accelerators-via-hybrid-static-dynamic-approaches/</link><guid 
isPermaLink="true">https://mer.run/posts/hyte-flexible-tiling-for-sparse-accelerators-via-hybrid-static-dynamic-approaches/</guid><description>ISCA 2025, on tiling for sparse dataflows. I ran out of energy for the second half; my current work has not yet involved sparse encoding.</description><pubDate>Wed, 25 Jun 2025 08:28:00 GMT</pubDate></item><item><title>SNN on GPU</title><link>https://mer.run/posts/snn-on-gpu/</link><guid isPermaLink="true">https://mer.run/posts/snn-on-gpu/</guid><description>About to start working on SNN inference acceleration on GPUs; writing some notes to organize my thoughts.</description><pubDate>Tue, 24 Jun 2025 03:50:00 GMT</pubDate></item><item><title>Swin Transformer: Hierarchical Vision Transformer using Shifted Windows</title><link>https://mer.run/posts/swin-transformer-hierarchical-vision-transformer-using-shifted-windows/</link><guid isPermaLink="true">https://mer.run/posts/swin-transformer-hierarchical-vision-transformer-using-shifted-windows/</guid><description>A look at shifted-window attention.</description><pubDate>Mon, 23 Jun 2025 09:47:00 GMT</pubDate></item><item><title>SpikeVideoFormer: An Efficient Spike-Driven Video Transformer with Hamming Attention and O(T) Complexity</title><link>https://mer.run/posts/spikevideoformer-an-efficient-spike-driven-video-transformer-with-hamming-attention-and-ot-comple/</link><guid isPermaLink="true">https://mer.run/posts/spikevideoformer-an-efficient-spike-driven-video-transformer-with-hamming-attention-and-ot-comple/</guid><description>Replaces the dot product in attention with Hamming distance, avoiding cases where spikes miss each other in time. The core approach is interesting, but the experiments feel mediocre: despite claiming a hardware implementation, the energy numbers are computed purely analytically, and the FPGA implementation is neither disclosed nor clearly described. Accuracy does not surpass the ANN2SNN SOTA. The main takeaway is still to replace operators that are ill-suited to SNNs with alternatives.</description><pubDate>Tue, 17 Jun 2025 08:56:00 GMT</pubDate></item><item><title>Sparse Spiking Neural Network: Exploiting Heterogeneity in Timescales for Pruning Recurrent SNN</title><link>https://mer.run/posts/sparse-spiking-neural-network-exploiting-heterogeneity-in-timescales-for-pruning-recurrent-snn/</link><guid 
isPermaLink="true">https://mer.run/posts/sparse-spiking-neural-network-exploiting-heterogeneity-in-timescales-for-pruning-recurrent-snn/</guid><description>ICLR 2024 Spotlight: SNN pruning using Lyapunov noise.</description><pubDate>Wed, 11 Jun 2025 11:11:00 GMT</pubDate></item><item><title>Prosperity: Accelerating Spiking Neural Networks via Product Sparsity</title><link>https://mer.run/posts/prosperity-accelerating-spiking-neural-networks-via-product-sparsity/</link><guid isPermaLink="true">https://mer.run/posts/prosperity-accelerating-spiking-neural-networks-via-product-sparsity/</guid><description>An SNN accelerator paper under submission to HPCA. Its "Product Sparsity" is essentially about avoiding repeated computation of identical content, a concept distinct from sparsity as usually discussed.</description><pubDate>Wed, 11 Jun 2025 08:53:00 GMT</pubDate></item><item><title>Towards Scalable GPU-Accelerated SNN Training via Temporal Fusion</title><link>https://mer.run/posts/towards-scalable-gpu-accelerated-snn-training-via-temporal-fusion/</link><guid isPermaLink="true">https://mer.run/posts/towards-scalable-gpu-accelerated-snn-training-via-temporal-fusion/</guid><description>Of unclear significance: writing LIF in a layer-by-layer fashion is the only contribution, published at a conference called ICANN. Far too little work.</description><pubDate>Tue, 10 Jun 2025 06:34:00 GMT</pubDate></item><item><title>Recurrent Residual Module for Fast Inference in Videos</title><link>https://mer.run/posts/recurrent-residual-module-for-fast-inference-in-videos/</link><guid isPermaLink="true">https://mer.run/posts/recurrent-residual-module-for-fast-inference-in-videos/</guid><description>CVPR 2018. DiffEncode + sparse acceleration, but it feels too dated.</description><pubDate>Mon, 09 Jun 2025 07:26:00 GMT</pubDate></item><item><title>Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models</title><link>https://mer.run/posts/efficient-spatially-sparse-inference-for-conditional-gans-and-diffusion-models/</link><guid 
isPermaLink="true">https://mer.run/posts/efficient-spatially-sparse-inference-for-conditional-gans-and-diffusion-models/</guid><description>An influential NeurIPS 2022 paper on inference acceleration for GANs and diffusion models. It proposes Spatially Sparse Inference: convolution filters are applied sparsely only to the edited regions, while cached features are reused for the unedited regions.</description><pubDate>Mon, 09 Jun 2025 06:19:00 GMT</pubDate></item><item><title>SlowFast Networks for Video Recognition</title><link>https://mer.run/posts/slowfast-networks-for-video-recognition/</link><guid isPermaLink="true">https://mer.run/posts/slowfast-networks-for-video-recognition/</guid><description>A multi-branch CNN; might some branches learn more similar inter-frame changes?</description><pubDate>Thu, 29 May 2025 22:15:00 GMT</pubDate></item><item><title>DeltaCNN: End-to-End CNN Inference of Sparse Frame Differences in Videos</title><link>https://mer.run/posts/deltacnn-end-to-end-cnn-inference-of-sparse-frame-differences-in-videos/</link><guid isPermaLink="true">https://mer.run/posts/deltacnn-end-to-end-cnn-inference-of-sparse-frame-differences-in-videos/</guid><description>Exploits the "linearity" of CNN layers to compute feature differences between frames, with CUDA acceleration. Almost the same idea as ViStream; could it solve our current problem?</description><pubDate>Fri, 23 May 2025 07:07:00 GMT</pubDate></item><item><title>Phi: Leveraging Pattern-based Hierarchical Sparsity for High-Efficiency Spiking Neural Networks</title><link>https://mer.run/posts/phi-leveraging-pattern-based-hierarchical-sparsity-for-high-efficiency-spiking-neural-networks/</link><guid isPermaLink="true">https://mer.run/posts/phi-leveraging-pattern-based-hierarchical-sparsity-for-high-efficiency-spiking-neural-networks/</guid><description>ISCA 2025, an SNN accelerator based on structured sparsity. Storing patterns directly in a LUT could require far too many sparse patterns and far too much memory, so a pre-calibrated level of "structured sparsity" splits the online spike activations into an L1 sparse level computable entirely with LUTs and an L2 sparse level with very high sparsity. Worth imitating the idea on GPUs?</description><pubDate>Wed, 21 May 2025 09:46:00 GMT</pubDate></item><item><title>Temporal Flexibility in Spiking Neural Networks: Towards Generalization Across Time Steps and Deployment 
Friendliness</title><link>https://mer.run/posts/temporal-flexibility-in-spiking-neural-networks-towards-generalization-across-time-steps-and-deploy/</link><guid isPermaLink="true">https://mer.run/posts/temporal-flexibility-in-spiking-neural-networks-towards-generalization-across-time-steps-and-deploy/</guid><description>ICLR 2025 Poster; it also seems to be doing elastic inference?</description><pubDate>Wed, 21 May 2025 07:38:00 GMT</pubDate></item><item><title>A Simple Framework for Contrastive Learning of Visual Representations</title><link>https://mer.run/posts/a-simple-framework-for-contrastive-learning-of-visual-representations/</link><guid isPermaLink="true">https://mer.run/posts/a-simple-framework-for-contrastive-learning-of-visual-representations/</guid><description>The SimCLR contrastive-learning paper. Could contrastive learning align the features of every layer?</description><pubDate>Tue, 20 May 2025 05:42:00 GMT</pubDate></item><item><title>QKFormer: Hierarchical Spiking Transformer using Q-K Attention</title><link>https://mer.run/posts/qkformer-hierarchical-spiking-transformer-using-q-k-attention/</link><guid isPermaLink="true">https://mer.run/posts/qkformer-hierarchical-spiking-transformer-using-q-k-attention/</guid><description>QKFormer, NeurIPS 2024 Spotlight. It pushes directly trained SNN accuracy on ImageNet and CIFAR remarkably high; any future work in this area will have to contend with it.</description><pubDate>Thu, 08 May 2025 10:10:00 GMT</pubDate></item><item><title>Transformers without Normalization</title><link>https://mer.run/posts/transformers-without-normalization/</link><guid isPermaLink="true">https://mer.run/posts/transformers-without-normalization/</guid><description>A new work from Kaiming He: DyT replaces normalization, turning a synchronized operation into an element-wise one. It shows up in a new paper, so worth studying.</description><pubDate>Wed, 07 May 2025 08:09:00 GMT</pubDate></item><item><title>Visualizing and Understanding the Effectiveness of BERT</title><link>https://mer.run/posts/visualizing-and-understanding-the-effectiveness-of-bert/</link><guid 
isPermaLink="true">https://mer.run/posts/visualizing-and-understanding-the-effectiveness-of-bert/</guid><description>While training SNNs recently I have been studying how to visualize the loss during training, wondering whether a newly added method affects the model's loss landscape. Papers on visualizing loss landscapes generally cite this paper's analysis and methodology.</description><pubDate>Tue, 06 May 2025 02:22:00 GMT</pubDate></item><item><title>One-Minute Video Generation with Test-Time Training</title><link>https://mer.run/posts/one-minute-video-generation-with-test-time-training/</link><guid isPermaLink="true">https://mer.run/posts/one-minute-video-generation-with-test-time-training/</guid><description>The TTT video-generation work whose demo recently went viral; it can generate videos on the order of 60 seconds. Studying the TTT material: could SNN on-chip learning be combined with TTT?</description><pubDate>Tue, 22 Apr 2025 10:18:00 GMT</pubDate></item><item><title>Evolution Strategies as a Scalable Alternative to Reinforcement Learning</title><link>https://mer.run/posts/evolution-strategies-as-a-scalable-alternative-to-reinforcement-learning/</link><guid isPermaLink="true">https://mer.run/posts/evolution-strategies-as-a-scalable-alternative-to-reinforcement-learning/</guid><description>Working on SNN training these past few days, I need to verify the accuracy of the surrogate gradient in use. My advisor recommended this paper: use evolution strategies to check how accurate the current gradient estimate is.</description><pubDate>Mon, 21 Apr 2025 08:48:00 GMT</pubDate></item><item><title>SparTA: Deep-Learning Model Sparsity via Tensor-with-Sparsity-Attribute</title><link>https://mer.run/posts/sparta-deep-learning-model-sparsity-via-tensor-with-sparsity-attribute/</link><guid isPermaLink="true">https://mer.run/posts/sparta-deep-learning-model-sparsity-via-tensor-with-sparsity-attribute/</guid><description>SparTA, a DNN compiler with sparsity optimization: tensor sparsity is treated as a first-class attribute during compilation to generate efficient code.</description><pubDate>Tue, 15 Apr 2025 03:07:00 GMT</pubDate></item><item><title>Scalable Diffusion Models with Transformers</title><link>https://mer.run/posts/scalable-diffusion-models-with-transformers/</link><guid isPermaLink="true">https://mer.run/posts/scalable-diffusion-models-with-transformers/</guid><description>Diffusion Transformer.</description><pubDate>Sun, 16 Mar 
2025 08:33:00 GMT</pubDate></item><item><title>A First Look at AI Infra</title><link>https://mer.run/posts/%E5%88%9D%E6%8E%A2ai-infra/</link><guid isPermaLink="true">https://mer.run/posts/%E5%88%9D%E6%8E%A2ai-infra/</guid><description>Taking advantage of my recent internship hunt to study and summarize the model inference/training acceleration knowledge I had previously encountered only piecemeal, plus some CUDA programming and architecture material.</description><pubDate>Tue, 11 Mar 2025 10:30:00 GMT</pubDate></item><item><title>Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition</title><link>https://mer.run/posts/conv2former-a-simple-transformer-style-convnet-for-visual-recognition/</link><guid isPermaLink="true">https://mer.run/posts/conv2former-a-simple-transformer-style-convnet-for-visual-recognition/</guid><description>Replaces self-attention with large-kernel depthwise-separable convolutions. Work from ByteDance Singapore.</description><pubDate>Sat, 08 Mar 2025 06:57:00 GMT</pubDate></item><item><title>SpikeCV: Open a Continuous Computer Vision Era</title><link>https://mer.run/posts/spikecv-open-a-continuous-computer-vision-era/</link><guid isPermaLink="true">https://mer.run/posts/spikecv-open-a-continuous-computer-vision-era/</guid><description>An open-source framework for event cameras.</description><pubDate>Sat, 08 Mar 2025 06:57:00 GMT</pubDate></item><item><title>Neuromorphic computing at scale</title><link>https://mer.run/posts/neuromorphic-computing-at-scale/</link><guid isPermaLink="true">https://mer.run/posts/neuromorphic-computing-at-scale/</guid><description>A review in Nature discussing problems and challenges currently facing the SNN / neuromorphic computing community, along with some possible directions for the field.</description><pubDate>Sat, 08 Mar 2025 06:57:00 GMT</pubDate></item><item><title>Titans: Learning to Memorize at Test Time</title><link>https://mer.run/posts/titans-learning-to-memorize-at-test-time/</link><guid isPermaLink="true">https://mer.run/posts/titans-learning-to-memorize-at-test-time/</guid><description>A new architecture evolved from TTT, attempting to improve the model's memory capability through test-time training.</description><pubDate>Sat, 08 Mar 2025 06:57:00 GMT</pubDate></item><item><title>Segment Anything</title><link>https://mer.run/posts/segment-anything/</link><guid 
isPermaLink="true">https://mer.run/posts/segment-anything/</guid><description>Meta's SAM.</description><pubDate>Sat, 08 Mar 2025 06:57:00 GMT</pubDate></item><item><title>SDiT: Spiking Diffusion Model with Transformer</title><link>https://mer.run/posts/sdit-spiking-diffusion-model-with-transformer/</link><guid isPermaLink="true">https://mer.run/posts/sdit-spiking-diffusion-model-with-transformer/</guid><description>A spiking diffusion transformer; the transformer structure inside is RWKV's.</description><pubDate>Sat, 08 Mar 2025 06:57:00 GMT</pubDate></item><item><title>2024</title><link>https://mer.run/posts/2024/</link><guid isPermaLink="true">https://mer.run/posts/2024/</guid><description>2024.</description><pubDate>Sat, 08 Mar 2025 06:57:00 GMT</pubDate></item><item><title>ConvUNeXt: An efficient convolution neural network for medical image segmentation</title><link>https://mer.run/posts/convunextan-efficient-convolution-neural-network-for-medical-image-segmentation/</link><guid isPermaLink="true">https://mer.run/posts/convunextan-efficient-convolution-neural-network-for-medical-image-segmentation/</guid><description>ConvNeXt + UNet, published in a Chinese core journal. Studying it for reference while thinking about how to design my own module.</description><pubDate>Sat, 08 Mar 2025 06:57:00 GMT</pubDate></item><item><title>Rethinking the Membrane Dynamics and Optimization Objectives of Spiking Neural Networks</title><link>https://mer.run/posts/rethinking-the-membrane-dynamics-and-optimization-objectives-of-spiking-neural-networks/</link><guid isPermaLink="true">https://mer.run/posts/rethinking-the-membrane-dynamics-and-optimization-objectives-of-spiking-neural-networks/</guid><description>NeurIPS 2024. Mainly studies how the initial membrane potential set before inference affects accuracy on static tasks.</description><pubDate>Sat, 08 Mar 2025 06:57:00 GMT</pubDate></item><item><title>ConvNext V2: Co-designing and Scaling ConvNets with Masked Autoencoders</title><link>https://mer.run/posts/convnext-v2-co-designing-and-scaling-convnets-with-masked-autoencoders/</link><guid 
isPermaLink="true">https://mer.run/posts/convnext-v2-co-designing-and-scaling-convnets-with-masked-autoencoders/</guid><description>The ConvNeXt sequel, introducing MAE.</description><pubDate>Sat, 08 Mar 2025 06:57:00 GMT</pubDate></item><item><title>A ConvNet for the 2020s</title><link>https://mer.run/posts/a-convnet-for-the-2020s/</link><guid isPermaLink="true">https://mer.run/posts/a-convnet-for-the-2020s/</guid><description>CVPR 2022, from Meta. With ViT-related work dominating vision, it redesigns a purely convolutional network and achieves very strong results.</description><pubDate>Sat, 08 Mar 2025 06:57:00 GMT</pubDate></item><item><title>LoCC Work Summary</title><link>https://mer.run/posts/locc%E5%B7%A5%E4%BD%9C%E6%80%BB%E7%BB%93/</link><guid isPermaLink="true">https://mer.run/posts/locc%E5%B7%A5%E4%BD%9C%E6%80%BB%E7%BB%93/</guid><description>From my advisor finding the idea to submitting the paper took only two weeks; my first time following a full paper through from start to finish.</description><pubDate>Sat, 08 Mar 2025 07:05:00 GMT</pubDate></item><item><title>Were RNNs All We Needed?</title><link>https://mer.run/posts/were-rnns-all-we-needed/</link><guid isPermaLink="true">https://mer.run/posts/were-rnns-all-we-needed/</guid><description>Improves RNNs to make them easier to scale up.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>SparseRT: Accelerating Unstructured Sparsity on GPUs for Deep Learning Inference</title><link>https://mer.run/posts/sparsertaccelerating-unstructured-sparsity-on-gpus-for-deep-learning-inference/</link><guid isPermaLink="true">https://mer.run/posts/sparsertaccelerating-unstructured-sparsity-on-gpus-for-deep-learning-inference/</guid><description>Generates matrix-multiplication kernels on GPUs, using load balancing and sparsity for acceleration; emits PTX code tailored to the model.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>VPRTempo: A Fast Temporally Encoded Spiking Neural Network for Visual Place Recognition</title><link>https://mer.run/posts/vprtempo-a-fast-temporally-encoded-spiking-neural-network-for-visual-place-recognition/</link><guid 
isPermaLink="true">https://mer.run/posts/vprtempo-a-fast-temporally-encoded-spiking-neural-network-for-visual-place-recognition/</guid><description>An ICRA 2024 paper using a temporally encoded, STDP direct-trained SNN for visual place recognition. Too simple.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>Memory-Efficient Reversible Spiking Neural Networks</title><link>https://mer.run/posts/memory-efficient-reversible-spiking-neural-networks/</link><guid isPermaLink="true">https://mer.run/posts/memory-efficient-reversible-spiking-neural-networks/</guid><description>Work that improves training speed and reduces memory footprint by architectural design.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>SpikeMba: Multi-Modal Spiking Saliency Mamba for
Temporal Video Grounding</title><link>https://mer.run/posts/spikemba-multi-modal-spiking-saliency-mamba-fortemporal-video-grounding/</link><guid isPermaLink="true">https://mer.run/posts/spikemba-multi-modal-spiking-saliency-mamba-fortemporal-video-grounding/</guid><description>SNN + Mamba for the temporal video grounding (TVG) task; work from HIT and Peking University.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>Integer-Valued Training and Spike-Driven Inference Spiking Neural Network for High-performance and Energy-efficient Object Detection</title><link>https://mer.run/posts/integer-valued-training-and-spike-driven-inference-spiking-neural-network-for-high-performance-and-e/</link><guid isPermaLink="true">https://mer.run/posts/integer-valued-training-and-spike-driven-inference-spiking-neural-network-for-high-performance-and-e/</guid><description>SpikeYOLO, from the Institute of Automation, CAS; ECCV 2024 Oral.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>A Survey of SNN Video-Stream Tasks</title><link>https://mer.run/posts/snn%E8%A7%86%E9%A2%91%E6%B5%81%E4%BB%BB%E5%8A%A1%E8%B0%83%E7%A0%94/</link><guid isPermaLink="true">https://mer.run/posts/snn%E8%A7%86%E9%A2%91%E6%B5%81%E4%BB%BB%E5%8A%A1%E8%B0%83%E7%A0%94/</guid><description>Studying some work on video-stream tasks and roughly planning the follow-up work.</description><pubDate>Wed, 07 May 2025 02:14:00 GMT</pubDate></item><item><title>SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN</title><link>https://mer.run/posts/spikezip-tf-conversion-is-all-you-need-for-transformer-based-snn/</link><guid isPermaLink="true">https://mer.run/posts/spikezip-tf-conversion-is-all-you-need-for-transformer-based-snn/</guid><description>Work by senior labmate You Kang: an ANN2SNN transformer.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>SpikingJelly: An open-source machine learning infrastructure platform for spike-based intelligence</title><link>https://mer.run/posts/spikingjelly-an-open-source-machine-learning-infrastructure-platform-for-spike-based-intelligence/</link><guid 
isPermaLink="true">https://mer.run/posts/spikingjelly-an-open-source-machine-learning-infrastructure-platform-for-spike-based-intelligence/</guid><description>SpikingJelly (惊蛰) from Peking University, a highly influential SNN framework implementing the full pipeline from data encoding and dataset integration to training and hardware deployment; the PyTorch-level work for SNNs. Published in Science Advances.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>I-LLM: Efficient Integer-Only Inference for
Fully-Quantized Low-Bit Large Language Models</title><link>https://mer.run/posts/i-llm-efficient-integer-only-inference-forfully-quantized-low-bit-large-language-models/</link><guid isPermaLink="true">https://mer.run/posts/i-llm-efficient-integer-only-inference-forfully-quantized-low-bit-large-language-models/</guid><description>Integer-only PTQ quantization work for LLMs.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>Programming Language Theory Notes</title><link>https://mer.run/posts/%E7%A8%8B%E5%BA%8F%E8%AF%AD%E8%A8%80%E7%90%86%E8%AE%BA%E7%AC%94%E8%AE%B0/</link><guid isPermaLink="true">https://mer.run/posts/%E7%A8%8B%E5%BA%8F%E8%AF%AD%E8%A8%80%E7%90%86%E8%AE%BA%E7%AC%94%E8%AE%B0/</guid><description>Review notes for the programming language theory course.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>The Minimum Equivalent DNF Problem and
Shortest Implicants</title><link>https://mer.run/posts/the-minimum-equivalent-dnf-problem-andshortest-implicants/</link><guid isPermaLink="true">https://mer.run/posts/the-minimum-equivalent-dnf-problem-andshortest-implicants/</guid><description>Proves that the MIN-DNF problem is Σ₂ᵖ-complete.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference</title><link>https://mer.run/posts/i-vit-integer-only-quantization-for-efficient-vision-transformer-inference/</link><guid isPermaLink="true">https://mer.run/posts/i-vit-integer-only-quantization-for-efficient-vision-transformer-inference/</guid><description>Integer-only quantization for ViT, W8A8; from CAS, ICCV 2023.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>Efficient and Effective Methods for Mixed Precision Neural Network Quantization for Faster, Energy-efficient Inference</title><link>https://mer.run/posts/efficient-and-effective-methods-for-mixed-precision-neural-network-quantization-for-faster-energy-e/</link><guid isPermaLink="true">https://mer.run/posts/efficient-and-effective-methods-for-mixed-precision-neural-network-quantization-for-faster-energy-e/</guid><description>EAGL, which claims to quantize ResNet within 3 seconds on a CPU alone, far more efficiently than HAWQ and other traditional methods.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>Towards spike-based machine intelligence with neuromorphic computing</title><link>https://mer.run/posts/towards-spike-based-machine-intelligence-with-neuromorphic-computing/</link><guid isPermaLink="true">https://mer.run/posts/towards-spike-based-machine-intelligence-with-neuromorphic-computing/</guid><description>A survey on SNNs in Nature.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness</title><link>https://mer.run/posts/flashattention-fast-and-memory-efficient-exact-attention-with-io-awareness/</link><guid 
isPermaLink="true">https://mer.run/posts/flashattention-fast-and-memory-efficient-exact-attention-with-io-awareness/</guid><description>FlashAttention: an algorithm that exploits the hardware memory hierarchy to speed up attention and reduce memory usage. The core ideas are tiling, online softmax, and kernel fusion.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>WWW: What, When, Where to Compute-in-Memory</title><link>https://mer.run/posts/www-what-when-where-to-compute-in-memory/</link><guid isPermaLink="true">https://mer.run/posts/www-what-when-where-to-compute-in-memory/</guid><description>Some validation and reflections on compute-in-memory.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference</title><link>https://mer.run/posts/quantization-and-training-of-neural-networks-for-efficient-integer-arithmetic-only-inference/</link><guid isPermaLink="true">https://mer.run/posts/quantization-and-training-of-neural-networks-for-efficient-integer-arithmetic-only-inference/</guid><description>From Google: the first work to run a complete integer-only quantized inference pipeline end to end.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>SpikeSim: An end-to-end Compute-in-Memory Hardware Evaluation Tool for Benchmarking Spiking Neural Networks</title><link>https://mer.run/posts/spikesim-an-end-to-end-compute-in-memory-hardware-evaluation-tool-for-benchmarking-spiking-neural-n/</link><guid isPermaLink="true">https://mer.run/posts/spikesim-an-end-to-end-compute-in-memory-hardware-evaluation-tool-for-benchmarking-spiking-neural-n/</guid><description>A hardware design / evaluation benchmark for SNN deployment.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
</title><link>https://mer.run/posts/powerinfer-fast-large-language-model-serving-with-a-consumer-grade-gpu/</link><guid isPermaLink="true">https://mer.run/posts/powerinfer-fast-large-language-model-serving-with-a-consumer-grade-gpu/</guid><description>From IPADS: uses a predictor to determine which MoE experts or neurons in an LLM need to be activated, reducing resource consumption.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>Evaluating Spatial Accelerator Architectures with
Tiled Matrix-Matrix Multiplication
</title><link>https://mer.run/posts/evaluating-spatial-accelerator-architectures-withtiled-matrix-matrix-multiplication/</link><guid isPermaLink="true">https://mer.run/posts/evaluating-spatial-accelerator-architectures-withtiled-matrix-matrix-multiplication/</guid><description>An introduction to GEMM data mapping, mainly across systolic-array-style accelerators.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>HAWQ: Hessian Aware Quantization of Neural Networks with Mixed-Precision</title><link>https://mer.run/posts/hawq-hessian-aware-quantization-of-neural-networks-with-mixed-precision/</link><guid isPermaLink="true">https://mer.run/posts/hawq-hessian-aware-quantization-of-neural-networks-with-mixed-precision/</guid><description>A classic model-quantization method based on the Hessian matrix, i.e. second-order information.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>Optimizing Bit-Serial Matrix Multiplication for Reconfigurable Computing</title><link>https://mer.run/posts/optimizing-bit-serial-matrix-multiplication-for-reconfigurable-computing/</link><guid isPermaLink="true">https://mer.run/posts/optimizing-bit-serial-matrix-multiplication-for-reconfigurable-computing/</guid><description>BISMO optimizations.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>TVM: An Automated End-to-End Optimizing Compiler for Deep Learning</title><link>https://mer.run/posts/tvm-an-automated-end-to-end-optimizing-compiler-for-deep-learning/</link><guid isPermaLink="true">https://mer.run/posts/tvm-an-automated-end-to-end-optimizing-compiler-for-deep-learning/</guid><description>TVM.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>Roofline: An Insightful Visual Performance Model for Floating-Point Programs and Multicore Architectures</title><link>https://mer.run/posts/roofline-an-insightful-visual-performance-model-for-floating-point-programs-and-multicore-architect/</link><guid 
isPermaLink="true">https://mer.run/posts/roofline-an-insightful-visual-performance-model-for-floating-point-programs-and-multicore-architect/</guid><description>The roofline model, which characterizes whether a system's performance is memory-bound or compute-bound.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>A Comprehensive Survey on Electronic Design Automation and Graph Neural Networks: Theory and Applications</title><link>https://mer.run/posts/a-comprehensive-survey-on-electronic-design-automation-and-graph-neural-networks-theory-and-applica/</link><guid isPermaLink="true">https://mer.run/posts/a-comprehensive-survey-on-electronic-design-automation-and-graph-neural-networks-theory-and-applica/</guid><description>A survey of graph neural network applications in EDA.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>A Hardware-Software Blueprint for Flexible Deep Learning Specialization</title><link>https://mer.run/posts/a-hardware-software-blueprint-for-flexible-deep-learning-specialization/</link><guid isPermaLink="true">https://mer.run/posts/a-hardware-software-blueprint-for-flexible-deep-learning-specialization/</guid><description>VTA.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>BISMO: A Scalable Bit Serial Matrix Multiplication Overlay for Reconfigurable Computing</title><link>https://mer.run/posts/bismo-a-scalable-bit-serial-matrix-multiplication-overlay-for-reconfigurable-computing/</link><guid isPermaLink="true">https://mer.run/posts/bismo-a-scalable-bit-serial-matrix-multiplication-overlay-for-reconfigurable-computing/</guid><description>BISMO.</description><pubDate>Sat, 08 Mar 2025 07:06:00 GMT</pubDate></item><item><title>Code Transpilation for Hardware Accelerators</title><link>https://mer.run/posts/code-transpilation-for-hardware-accelerators/</link><guid isPermaLink="true">https://mer.run/posts/code-transpilation-for-hardware-accelerators/</guid><description>Based on Metalift; still quite incomplete.</description><pubDate>Sat, 08 Mar 2025 07:06:00 
GMT</pubDate></item></channel></rss>