<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>AI on 🌲Treetopia🌲</title>
    <link>https://tree2601.github.io/en/categories/ai/</link>
    <description>Recent content in AI on 🌲Treetopia🌲</description>
    <generator>Hugo -- 0.154.2</generator>
    <language>en</language>
    <lastBuildDate>Wed, 04 Feb 2026 22:13:42 +0800</lastBuildDate>
    <atom:link href="https://tree2601.github.io/en/categories/ai/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Z Image Turbo</title>
      <link>https://tree2601.github.io/en/posts/2026/z-image-turbo/</link>
      <pubDate>Wed, 04 Feb 2026 22:13:42 +0800</pubDate>
      <guid>https://tree2601.github.io/en/posts/2026/z-image-turbo/</guid>
      <description>&lt;h3 id=&#34;content&#34;&gt;&lt;strong&gt;Content&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;This article covers how to use Z-Image-Turbo together with ComfyUI.&lt;/p&gt;
&lt;p&gt;Advantages of Z-Image-Turbo:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Strong Chinese prompt-following and Chinese character generation capabilities.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Requires only 8 inference steps for image generation. With a compact 6B parameter count, it can run on consumer-grade hardware (16GB VRAM) using quantization.&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Because network restrictions in some regions prevent ComfyUI-Manager from downloading files automatically, download links for every required file are provided for manual installation.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Llamafactory Distributed Training</title>
      <link>https://tree2601.github.io/en/posts/2026/llamafactory/</link>
      <pubDate>Wed, 21 Jan 2026 23:06:53 +0800</pubDate>
      <guid>https://tree2601.github.io/en/posts/2026/llamafactory/</guid>
      <description>&lt;h3 id=&#34;content&#34;&gt;Content&lt;/h3&gt;
&lt;p&gt;Run SFT training with the LLaMA-Factory framework on servers with 8x L20 GPUs running Ubuntu 22.04, using both single-node multi-GPU and multi-node multi-GPU modes. Base model: Qwen3-32B.&lt;/p&gt;
&lt;h3 id=&#34;environment-configuration&#34;&gt;&lt;strong&gt;Environment Configuration&lt;/strong&gt;&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;Clone the code repository, set up a new conda environment, and install dependencies.&lt;/li&gt;
&lt;/ol&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-Plain&#34; data-lang=&#34;Plain&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;cd LLaMA-Factory
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;conda create -n llamafactory_env python=3.10 -y
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;conda activate llamafactory_env
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;pip install -e .
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;pip install -r requirements/metrics.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;ol start=&#34;2&#34;&gt;
&lt;li&gt;Prepare SFT data, place it in the &lt;code&gt;data&lt;/code&gt; folder, and register it in &lt;code&gt;dataset_info.json&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-Plain&#34; data-lang=&#34;Plain&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;cd ./data
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;# Open dataset_info.json and add the dataset, for example:
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&amp;#34;my_example&amp;#34;: {
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;    &amp;#34;file_name&amp;#34;: &amp;#34;my_example.json&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  },
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;# Use Alpaca or ShareGPT format for SFT data. Alpaca format example is used here.
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;# Alpaca format: (where `instruction` and `input` are automatically concatenated with `\n`)
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;[{
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&amp;#34;instruction&amp;#34;: &amp;#34;Human instruction (required)&amp;#34;,
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&amp;#34;input&amp;#34;: &amp;#34;Human input (optional)&amp;#34;,
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&amp;#34;output&amp;#34;: &amp;#34;Model response (required)&amp;#34;,
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&amp;#34;system&amp;#34;: &amp;#34;System prompt (optional)&amp;#34;,
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;&amp;#34;history&amp;#34;:
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;[
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;[&amp;#34;First round instruction (optional)&amp;#34;, &amp;#34;First round response (optional)&amp;#34;],
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;[&amp;#34;Second round instruction (optional)&amp;#34;, &amp;#34;Second round response (optional)&amp;#34;]
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;]
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;}]
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;single-node-multi-gpu-training&#34;&gt;&lt;strong&gt;Single-Node Multi-GPU Training&lt;/strong&gt;&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-Plain&#34; data-lang=&#34;Plain&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;# Prepare a yaml file by referring to existing templates in `./examples` and run it.
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;# If using deepspeed for multi-GPU training, specify the number of GPUs via CUDA_VISIBLE_DEVICES.
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;CUDA_VISIBLE_DEVICES=0,1 FORCE_TORCHRUN=1 llamafactory-cli train examples/train_lora/qwen3_30b_lora_sft.yaml
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;# After training, the parameters can be found in the path specified by `output_dir` in the yaml file.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;After training, you can serve the LoRA adapter with vLLM. Here is a docker compose template.&lt;/p&gt;</description>
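Once the adapter is served behind vLLM's OpenAI-compatible API, requests select the adapter by model name. A minimal Python sketch of building such a request; the adapter name `my_lora` and the endpoint address are assumptions:

```python
import json

def build_chat_payload(model: str, prompt: str) -> dict:
    """Request body for a vLLM OpenAI-compatible /v1/chat/completions endpoint.

    For a LoRA deployment, `model` is the adapter name registered via
    vLLM's --lora-modules flag ("my_lora" here is an assumption).
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Serialize for an HTTP POST to http://HOST:8000/v1/chat/completions
body = json.dumps(build_chat_payload("my_lora", "Hello"))
```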
    </item>
    <item>
      <title>ComfyUI Guide</title>
      <link>https://tree2601.github.io/en/posts/2026/comfyui/</link>
      <pubDate>Wed, 14 Jan 2026 11:11:51 +0800</pubDate>
      <guid>https://tree2601.github.io/en/posts/2026/comfyui/</guid>
      <description>&lt;h3 id=&#34;content&#34;&gt;Content&lt;/h3&gt;
&lt;p&gt;This guide details deploying ComfyUI on Ubuntu 22.04, including manually installing ComfyUI-Manager (which adds custom-node management) and manually installing individual custom nodes, specifically tailored for &lt;strong&gt;users within China&amp;rsquo;s network environment&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Official Documentation: &lt;a href=&#34;https://docs.comfy.org/zh-CN/installation/manual_install&#34;&gt;ComfyUI&lt;/a&gt;&lt;/p&gt;
&lt;h3 id=&#34;installation&#34;&gt;Installation&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;First, ensure &lt;a href=&#34;https://tree2601.github.io/en/posts/2026/conda/&#34;&gt;conda&lt;/a&gt; is installed on your server, then create a new environment.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-Plain&#34; data-lang=&#34;Plain&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;# Clone the ComfyUI git repository
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;git clone https://github.com/Comfy-Org/ComfyUI.git
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;# Navigate to the ComfyUI directory, activate the conda environment, and install dependencies
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;cd ComfyUI
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;conda activate comfyui_env
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;pip install -r requirements.txt
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;# Start ComfyUI, specifying the port number and GPU device
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;python main.py --listen --port 10020 --cuda-device 0
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&#34;install-comfyui-manager&#34;&gt;Install ComfyUI Manager&lt;/h3&gt;
&lt;pre&gt;&lt;code&gt;# Change to the custom_nodes subdirectory
cd custom_nodes

# If your network environment has no restrictions
git clone https://github.com/ltdrdata/ComfyUI-Manager.git

# If your network environment has restrictions, manually download the repository
# (https://github.com/Comfy-Org/ComfyUI-Manager) as a zip, unzip it, rename the
# folder to ComfyUI-Manager, and place it in custom_nodes.
# Then, restart ComfyUI.
python main.py --listen --port 10020 --cuda-device 0
&lt;/code&gt;&lt;/pre&gt;
&lt;h3 id=&#34;install-any-plugins&#34;&gt;Install Any Plugins&lt;/h3&gt;
&lt;p&gt;i. If the ComfyUI Manager GUI can download nodes, install them directly through the Manager.&lt;/p&gt;
&lt;p&gt;ii. If the ComfyUI Manager GUI consistently fails to download nodes, install them manually:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Clone the corresponding git repository, rename it if necessary, and place it in custom_nodes.
git clone https://github.com/some/custom/nodes.git

# Navigate into the node&#39;s directory and install its dependencies.
pip install -r requirements.txt

# Restart ComfyUI.

# Some commonly used custom nodes:
# --ControlNet preprocessors: https://github.com/Fannovel16/comfyui_controlnet_aux
# --ComfyUI-Impact-Pack: https://github.com/ltdrdata/ComfyUI-Impact-Pack
# --rgthree-comfy: https://github.com/rgthree/rgthree-comfy
&lt;/code&gt;&lt;/pre&gt;
&lt;h3 id=&#34;quick-start&#34;&gt;Quick Start&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;Flux is among the most widely used text-to-image models, and setting up a workflow with it is an excellent starting point.&lt;/li&gt;
&lt;li&gt;Images generated by ComfyUI embed their workflow metadata, so an image can be dragged directly into the GUI to re-create its workflow.&lt;/li&gt;
&lt;li&gt;Example 1: Flux + Lora + ControlNet Workflow
&lt;img alt=&#34;workflow-1&#34; loading=&#34;lazy&#34; src=&#34;https://tree2601.github.io/images/comfyui/workflow-1.png&#34;&gt;&lt;/li&gt;
&lt;li&gt;Example 2: MimicMotion Action Simulation Workflow
&lt;img alt=&#34;workflow-2&#34; loading=&#34;lazy&#34; src=&#34;https://tree2601.github.io/images/comfyui/workflow-2.png&#34;&gt;&lt;/li&gt;
&lt;/ol&gt;</description>
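The manual node-installation steps above can also be scripted. A sketch, assuming a standard `ComfyUI/custom_nodes` layout; the repository URLs are the ones listed in the post:

```python
import subprocess
from pathlib import Path

def repo_dir_name(url: str) -> str:
    """Directory name that `git clone url` creates, e.g. .../rgthree-comfy -> rgthree-comfy."""
    return url.rstrip("/").rsplit("/", 1)[-1]

def install_nodes(custom_nodes: Path, repo_urls: list[str]) -> None:
    """Clone each custom-node repo into custom_nodes and install its requirements, if any."""
    for url in repo_urls:
        dest = custom_nodes / repo_dir_name(url)
        if not dest.exists():
            subprocess.run(["git", "clone", url, str(dest)], check=True)
        reqs = dest / "requirements.txt"
        if reqs.exists():
            subprocess.run(["pip", "install", "-r", str(reqs)], check=True)
```

Usage: `install_nodes(Path("ComfyUI/custom_nodes"), ["https://github.com/rgthree/rgthree-comfy"])`, then restart ComfyUI so the new nodes are loaded.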
    </item>
    <item>
      <title>DeepSeek-671B Distributed Deployment</title>
      <link>https://tree2601.github.io/en/posts/2026/deepseek-671b/</link>
      <pubDate>Tue, 06 Jan 2026 11:16:30 +0800</pubDate>
      <guid>https://tree2601.github.io/en/posts/2026/deepseek-671b/</guid>
      <description>&lt;h3 id=&#34;1-overview&#34;&gt;1. Overview&lt;/h3&gt;
&lt;p&gt;a. This guide describes the deployment of the DeepSeek-671B model across two servers, each equipped with 8x NVIDIA L20 GPUs. The technology stack utilizes Docker for containerization, the vLLM high-performance inference engine, and the Ray distributed computing framework.&lt;/p&gt;
&lt;p&gt;b. Official Documentation: &lt;a href=&#34;https://docs.vllm.ai/en/v0.8.1/serving/distributed_serving.html&#34;&gt;vLLM-Distributed&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;c. The official tutorial involves complex steps requiring frequent switching between multiple SSH sessions. To simplify the process, this article consolidates and optimizes the official workflow into a systematic, one-stop deployment guide.&lt;/p&gt;</description>
    </item>
    <item>
      <title>L20 8-GPU Server Deep Dive: Integrated Deployment Guide for Multimodal AI Systems (LLM &#43; VLM &#43; RAG &#43; ASR &#43; Dify &#43; MinerU)</title>
      <link>https://tree2601.github.io/en/posts/2026/l20/</link>
      <pubDate>Mon, 05 Jan 2026 16:56:40 +0800</pubDate>
      <guid>https://tree2601.github.io/en/posts/2026/l20/</guid>
      <description>&lt;h3 id=&#34;overview&#34;&gt;Overview&lt;/h3&gt;
&lt;p&gt;This guide provides a step-by-step walkthrough for deploying a full-stack multimodal AI system on a single server equipped with 8x NVIDIA L20 GPUs. The stack includes LLM, VLM, Embedding/Reranker (RAG), ASR, Dify (LLM Orchestration Agent Platform), and MinerU (PDF Extraction).&lt;/p&gt;
&lt;h3 id=&#34;vram-estimation-for-llms&#34;&gt;VRAM Estimation for LLMs&lt;/h3&gt;
&lt;p&gt;&lt;img alt=&#34;formula&#34; loading=&#34;lazy&#34; src=&#34;https://tree2601.github.io/images/l20/formula.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;img alt=&#34;parameter&#34; loading=&#34;lazy&#34; src=&#34;https://tree2601.github.io/images/l20/parameter.png&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Key Strategy:&lt;/strong&gt; Since LLM performance correlates more strongly with parameter count (in billions) than with quantization precision, we prioritize larger models. For this deployment, we selected the &lt;strong&gt;int4 AWQ versions&lt;/strong&gt; of &lt;strong&gt;Qwen3-235B&lt;/strong&gt; and &lt;strong&gt;GLM-4.5V-106B&lt;/strong&gt; to maximize overall capability within the available VRAM.&lt;/p&gt;</description>
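As a worked instance of the estimation above: weight memory is roughly parameter count times bytes per parameter, plus overhead for activations and KV cache. The 20% overhead factor below is an illustrative assumption; real KV-cache usage grows with context length and batch size:

```python
def vram_estimate_gb(params_billion: float, bytes_per_param: float,
                     overhead: float = 0.2) -> float:
    """Rough VRAM (GB) for model weights plus a fixed overhead fraction (assumption)."""
    return params_billion * bytes_per_param * (1 + overhead)

# Qwen3-235B at int4 (~0.5 bytes/param): ~141 GB, comfortably within
# an 8x L20 node's 8 * 48 = 384 GB of total VRAM.
print(round(vram_estimate_gb(235, 0.5)))  # 141
```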
    </item>
  </channel>
</rss>
