How TGI Improves on Hugging Face's transformers pipeline

Posted on 2025-04-21 | Updated on 2025-04-25 | Word count: 644 | Reading time ≈ 2 min

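TGI (Text Generation Inference) is Hugging Face's text generation server, optimized for production use. Compared with calling the transformers library's pipeline directly, it offers significant gains in performance, features, and deployment efficiency. The main improvements are:

1. Performance optimizations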

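Continuous batching
TGI batches requests dynamically: new requests join the running batch as earlier ones finish, so requests of different lengths share the GPU and utilization rises markedly (especially when request lengths vary widely). pipeline batching, by contrast, is static: every input in a batch must be padded to the same length, which wastes compute.
Flash Attention and Paged Attention
Flash Attention lowers the memory cost of the attention computation, and Paged Attention manages the KV cache in blocks to reduce fragmentation. Together they cut GPU memory usage and support longer contexts (e.g. 100K tokens). With transformers, these features must be enabled manually and depend on specific hardware.
Quantization support
TGI has built-in support for quantization methods such as GPTQ and bitsandbytes, which lowers GPU memory requirements; pipeline needs additional configuration.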
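
The difference between static and continuous batching can be made concrete with a toy scheduler simulation. This is a minimal sketch, not TGI's actual scheduler: the function names, the one-token-per-step model, and the request lengths are all assumptions made for illustration. It only counts how many decode steps each strategy needs for the same workload.

```python
from collections import deque


def static_batching_steps(lengths, batch_size):
    """Decode steps when each batch runs until its longest request finishes."""
    steps = 0
    for i in range(0, len(lengths), batch_size):
        batch = lengths[i:i + batch_size]
        steps += max(batch)  # shorter requests sit idle until the longest ends
    return steps


def continuous_batching_steps(lengths, batch_size):
    """Decode steps when a finished request's slot is refilled immediately."""
    waiting = deque(lengths)
    running = []  # remaining tokens for each in-flight request
    steps = 0
    while waiting or running:
        # Fill free slots from the queue before every decode step.
        while waiting and len(running) < batch_size:
            running.append(waiting.popleft())
        # One decode step emits one token per running request;
        # requests that reach zero remaining tokens leave the batch.
        running = [r - 1 for r in running if r - 1 > 0]
        steps += 1
    return steps


# Hypothetical numbers of tokens to generate for six requests.
lengths = [2, 10, 3, 10, 2, 3]
static_steps = static_batching_steps(lengths, batch_size=2)
continuous_steps = continuous_batching_steps(lengths, batch_size=2)
print(static_steps, continuous_steps)  # prints: 23 15
```

With two slots and 30 tokens in total, 15 steps is the best possible result, so in this toy workload the continuous scheduler keeps both slots busy on every step, while the static one spends 8 extra steps waiting on the longest request in each batch.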


Rust GUI Graphics Rendering

Posted on 2025-03-25 | Updated on 2025-04-16 | Word count: 52 | Reading time ≈ 1 min

public:: true

wgpu[^1]
[[raqote]]
- jrmuizel/raqote: Rust 2D graphics library (github.com)
[[GPUI]]
[[tiny-skia]]
- RazrFalcon/tiny-skia: A tiny Skia subset ported to Rust (github.com)
femtovg/femtovg: Antialiased 2D vector drawing library written in Rust (github.com)
- A Rust port of the C library nanovg

[^1]: # wgpu

- ![big-picture.png](https://github.com/gfx-rs/wgpu/blob/trunk/etc/big-picture.png?raw=true)


Publishing a Blog with Logseq

Posted on 2025-03-25 | Updated on 2025-04-27 | Word count: 366 | Reading time ≈ 1 min

public:: true
type:: note
item-type:: software recommendation
plane:: done

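Logseq is a note-taking app built around bidirectional links.
There are two ways to turn a Logseq graph into static pages:
- 1. Use Logseq's graph export feature to write the static files into a folder, then deploy that folder directly with GitHub/Gitee Pages
- 2. Use Logseq Publish Action: push the entire Logseq graph to GitHub and add Logseq Publish Action to the repository; GitHub then publishes the static pages to another branch automatically
  - logseq/publish-spa@v0.3.0
~~Publishing to both Gitee and GitHub~~ Gitee has shut down its Pages service
- git-mirror-action can also sync between repositories. First push all Logseq files to a public or private GitHub repository, then let Logseq Publish Action generate the static files and push them to another public branch or repository. git-mirror-action can then mirror the GitHub repository to Gitee. Gitee Pages does not redeploy automatically, however, so gitee-pages-action is needed to trigger the Gitee Pages deployment
Comment system
- Adding a comment system to Logseq (abosen.top)
Deployment options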
- Cloudflare Pages
  - (wzzc.pages.dev)
- GitHub Pages
  - wzzc-dev.github.io
- Vercel
  - wzzc.vercel.app/#/page/overview
- Render
  - wzzc.onrender.com
- Netlify
  - wzzc.netlify.app
References

Sophon Academy (算丰学院) - LLM Concepts and Practice - LLM: Lossless Compression of World Knowledge

Posted on 2025-03-24 | Updated on 2025-03-31 | Word count: 507 | Reading time ≈ 2 min

LM (language model)

Statistical language models (SLM)

N-gram models
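
As a minimal sketch of the idea behind an N-gram model (here N = 2, a bigram model), the probability of a token given the previous one can be estimated from raw counts. The toy corpus and the function names below are assumptions made for illustration and do not come from the course material.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for prev, cur in zip(tokens, tokens[1:]):
        counts[prev][cur] += 1
    return counts


def bigram_prob(counts, prev, cur):
    """Maximum-likelihood estimate of P(cur | prev); 0.0 if prev was never seen."""
    total = sum(counts[prev].values())
    return counts[prev][cur] / total if total else 0.0


# Hypothetical toy corpus, split on whitespace.
corpus = "the cat sat on the mat the cat ran".split()
model = train_bigram(corpus)
p_cat = bigram_prob(model, "the", "cat")  # "the" is followed by "cat" 2 times out of 3
p_mat = bigram_prob(model, "the", "mat")  # and by "mat" 1 time out of 3
print(p_cat, p_mat)
```

A real statistical language model would also smooth these estimates so that unseen token pairs do not get probability zero; the sketch leaves that out to keep the counting step visible.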

Neural language models (NLM)


Hello World

Posted on 2025-03-01 | Updated on 2025-03-31 | Word count: 78 | Reading time ≈ 1 min

Welcome to Hexo! This is your very first post. Check documentation for more info. If you get any problems when using Hexo, you can find the answer in troubleshooting or you can ask me on GitHub.

Quick Start

Create a new post

$ hexo new "My New Post"

More info: Writing
