IntelliStream

IntelliStream Research Group

专注于流处理、AI系统与智能数据库的研究与开发

Focused on Stream Processing, AI Systems, and Intelligent Databases

🌟 SAGE 项目生态系统 | SAGE Project Ecosystem

SAGE (Streaming-Augmented Generative Execution) 是一个高性能、模块化的 AI 推理框架生态系统，通过数据流抽象实现透明、可扩展的 LLM 驱动系统。

SAGE is a high-performance, modular AI inference framework ecosystem that enables transparent, scalable LLM-powered systems through dataflow abstractions.

📦 核心仓库 | Core Repositories

🎯 SAGE

主框架 | Main Framework

声明式、可组合的流式增强生成执行框架，用于通过数据流抽象构建透明的 LLM 驱动系统。

A declarative, composable framework for building transparent LLM-powered systems through dataflow abstractions.

特性 | Features:

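⚡ 生产就绪的企业级应用 | Production-ready for enterprise applications
🔧 直观的声明式 API | Intuitive declarative API
🚀 高吞吐量流式工作负载优化 | Optimized for high-throughput streaming workloads
👁️ 内置可观测性和调试工具 | Built-in observability and debugging tools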

sage-benchmark

SAGE 系统基准测试 | SAGE System Benchmarks

SAGE 框架的端到端基准测试套件，评估系统整体性能。

End-to-end benchmark suite for the SAGE framework that evaluates overall system performance.

测试维度 | Test Dimensions:

🔄 控制面调度 | Control Plane Scheduling
🧪 端到端流水线 | E2E Pipeline
📈 隔离性与扩展性 | Isolation & Scalability

📚 SAGE-Pub

文档中心 | Documentation Hub

SAGE 系统的官方对外文档仓库，包含快速开始、架构图、API 文档等。

Official public documentation repository for the SAGE system, including quick start guides, architecture diagrams, and API documentation.

内容 | Contents:

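📘 快速开始指南 | Quick start guide
🏗️ 架构与核心模块说明 | Architecture and core module overview
📊 Dashboard 使用指南 | Dashboard user guide
🔗 API 文档 | API documentation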

🔧 数据库与系统组件 | Database & System Components

💾 向量数据库 & ANNS | Vector Database & ANNS

🔍 sageVDB

向量数据库核心 | Vector Database Core

高性能向量数据库 C++ 核心库，支持可插拔 ANNS 架构和多模态特性。

High-performance C++20 vector database library with pluggable ANNS architecture and multimodal support.

🔍 sage-anns

ANNS 算法库 | ANNS Algorithm Library

提供统一 Python 接口的近似最近邻搜索算法集合，被 sageVDB 调用。

A collection of approximate nearest neighbor search (ANNS) algorithms behind a unified Python interface, used by sageVDB.

📊 CANDOR-Bench

ANNS 基准测试 | ANNS Benchmark [SIGMOD'26]

全面的 ANNS 算法基准测试套件，评估 sage-anns 和 sageVDB 性能。

Comprehensive ANNS benchmark suite evaluating sage-anns and sageVDB performance.
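
ANNS benchmarks commonly report recall@k, the fraction of the true k nearest neighbors that an approximate index returns. The sketch below is a minimal illustration of that metric only; the function name and signature are hypothetical and are not taken from CANDOR-Bench, sage-anns, or sageVDB.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k neighbor IDs found in the approximate top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Compare only the first k results from each ranking.
    hits = set(approx_ids[:k]) & set(exact_ids[:k])
    return len(hits) / k


# Two of the four exact neighbors (IDs 1 and 3) were retrieved.
print(recall_at_k([1, 2, 3, 4], [1, 3, 5, 7], k=4))
```

The exact IDs would typically come from a brute-force scan over the same dataset.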

🌊 流处理引擎 | Stream Processing

sageFlow

向量流处理引擎 | Vector Stream Processing Engine

向量原生流处理引擎，专为实时 LLM 生成任务维护和物化语义状态快照而设计。

Vector-native stream processing engine designed to maintain and materialize semantic state snapshots for real-time LLM generation tasks.

🔗 分布式运行时 | Distributed Runtime

sageFlownet

分布式通信框架 | Distributed Communication Framework

类似 Ray 的分布式运行时基础组件，提供高性能通信堆栈。

Ray-like distributed runtime infrastructure that provides a high-performance communication stack.

⏱️ 时序数据库 | Time Series Database

sageTSDB

时序数据库 | Time Series Database

SAGE 生态系统的时序数据库组件，用于处理时间序列数据。

Time series database component for handling temporal data streams.

📊 数据集 | Datasets

sageData

基准数据集 | Benchmark Datasets

SAGE 基准测试的共享数据集和资源库。

Shared test datasets and resources for SAGE benchmarks.

🤏 上下文压缩 | Context Compression

sageRefiner

上下文压缩 | Context Compression

SAGE 生态系统的上下文压缩组件，用于优化 RAG 应用的输入长度。

Context compression component for optimizing input length in RAG applications.

📊 sage-refiner-benchmark

Refiner 基准测试 | Refiner Benchmarks

评估各种上下文压缩算法在 RAG 应用中的性能。

Benchmark suite for context compression algorithms in RAG applications.

🤖 AI 与智能体组件 | AI & Agent Components

🧠 LLM 推理引擎 | LLM Inference Engine

sageLLM

LLM 推理引擎 | LLM Inference Engine

面向华为昇腾与 NVIDIA 的模块化 LLM 推理引擎，默认 CPU 优先，提供统一的 Python/HTTP 接口。

Modular LLM inference engine targeting Huawei Ascend and NVIDIA hardware, CPU-first by default, with unified Python/HTTP APIs. See the sageLLM Modular Architecture section below for its sub-modules.

📊 sagellm-benchmark

E2E 验证 | E2E Validation

sageLLM 推理引擎的端到端验证套件，年度验证与演示运行器。

End-to-end validation suite for the sageLLM inference engine, with yearly validation and demo runners.

📊 sagellm-control-plane-benchmark

Control Plane 评测 | Control Plane Benchmark

专门评测 sageLLM Control Plane 模块的调度策略、吞吐量、延迟等性能指标。

Dedicated benchmark for the sageLLM Control Plane module, covering scheduling policies, throughput, latency, and other performance metrics.

Agent 工具选择 | Agent Tool Selection

sage-agentic

工具选择算法框架 | Tool Selection Framework

Agent 工具选择算法框架，提供多种工具选择策略的统一接口。

Framework for agent tool selection algorithms with unified interface for multiple strategies.

📊 sage-agent-benchmark

工具选择评测 | Tool Selection Benchmark

配置驱动的 Agent 工具选择能力评估框架（工具选择、规划、时序检测）。

Configuration-driven benchmark for agent tool selection, planning, and timing detection.

🎯 sage-agentic-sias

SIAS 工具选择算法 | SIAS Tool Selection Algorithm

基于样本重要性感知选择（SIAS）的 Agent 工具选择算法实现。

Agent tool selection algorithm based on Sample-Importance-Aware Selection (SIAS).

🧩 记忆体 | Memory Systems

neuromem

记忆管理引擎 | Memory Management Engine

SAGE 项目的记忆体组件，RAG 应用的独立内存管理引擎。

Standalone memory management engine for RAG applications.

📊 sage-memory-benchmark

记忆系统评测 | Memory System Benchmark

NeuroMem 记忆系统性能评估。

Performance evaluation for NeuroMem memory systems.

📚 RAG 框架 | RAG Framework

sage-rag

RAG 框架 | RAG Framework

RAG 流水线的文档加载、分块与检索框架。

Document loaders, chunkers, and retrievers for RAG pipelines.
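
As a rough illustration of what a chunker in a RAG pipeline does, the sketch below splits text into fixed-size character windows with overlap, so that content near a boundary appears in two adjacent chunks. The function `chunk_text` and its parameters are hypothetical; they are not the sage-rag API.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character windows that overlap by `overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
```

Real chunkers often split on sentence or token boundaries instead of raw characters; character windows are used here only to keep the example short.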

📊 sage-rag-benchmark

RAG 评测 | RAG Benchmark

RAG 流水线端到端性能评估框架。

End-to-end performance evaluation for RAG pipelines.

示例与教程 | Examples & Tutorials

📖 sage-examples

示例代码库 | Examples Repository

SAGE 框架的应用示例代码和使用案例集合。

Collection of application examples and use cases for the SAGE framework.

📘 sage-tutorials

教程代码库 | Tutorials Repository

SAGE 框架的分层教程，从 L1-L5 逐步学习。

Layered tutorials for the SAGE framework, progressing step by step from L1 to L5.

🛠️ 其他工具 | Other AI Tools

🎯 sage-intent

意图识别 | Intent Recognition

基于关键词和大模型的对话 AI 意图分类工具。

Keyword and LLM-based intent classification for conversational AI.
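
One common way to combine keyword and LLM-based classification is to try cheap keyword matching first and call a model only when no keyword fires. The sketch below assumes that design; the keyword table, intent names, and the `llm_fallback` hook are all hypothetical and do not describe how sage-intent is implemented.

```python
# Hypothetical keyword table: intent name -> trigger phrases.
KEYWORDS = {
    "search": ("find", "search", "look up"),
    "summarize": ("summarize", "summary", "tl;dr"),
}


def classify_intent(utterance, llm_fallback=None):
    """Return the intent with the most keyword hits, else defer to the fallback."""
    text = utterance.lower()
    scores = {
        intent: sum(phrase in text for phrase in phrases)
        for intent, phrases in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] > 0:
        return best
    if llm_fallback is not None:
        # Placeholder for an LLM call; any callable str -> str works here.
        return llm_fallback(utterance)
    return "unknown"


print(classify_intent("Please find papers on stream processing"))
print(classify_intent("hello there", llm_fallback=lambda u: "chitchat"))
```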

🔧 sage-finetune

轻量微调工具 | Lightweight Fine-tuning

SAGE 生态系统的 LLM 轻量级微调工具箱。

Lightweight LLM fine-tuning toolkit for the SAGE ecosystem.

🔒 sage-safety

安全框架 | Safety Framework

AI 系统的安全护栏与检测器。

Safety guardrails and detectors for AI systems.

🔒 sage-privacy

隐私保护 | Privacy Protection

机器学习遗忘与差分隐私工具。

Machine unlearning and differential privacy tools.
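
For readers new to differential privacy, the Laplace mechanism is the textbook starting point: it releases a numeric query result plus noise whose scale is the query's sensitivity divided by the privacy budget epsilon. The sketch below illustrates that mechanism only; it is not the sage-privacy API, and the function name is an assumption.

```python
import random


def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Return true_value plus Laplace(0, sensitivity / epsilon) noise."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and the same scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_value + noise


rng = random.Random(0)
# A counting query has sensitivity 1: one record changes the count by at most 1.
print(laplace_mechanism(true_value=100.0, sensitivity=1.0, epsilon=0.5, rng=rng))
```

A smaller epsilon gives stronger privacy and noisier answers. Production code should use a vetted library instead of Python's `random` module, which is not a cryptographically secure source of randomness.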

🧪 sage-eval

评估工具库 | Evaluation Toolkit

L3 纯算法库，提供评估指标（F1/ROUGE/BLEU）、性能分析器与 LLM 评审工具。

L3 pure-algorithm library providing evaluation metrics (F1/ROUGE/BLEU), profilers, and LLM judges.
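
Of the metrics named above, token-level F1 is the simplest to show. The sketch below computes it the way it is commonly defined for question answering (harmonic mean of token precision and recall, with whitespace tokenization); it is an illustration and not sage-eval's implementation.

```python
from collections import Counter


def token_f1(prediction, reference):
    """Token-level F1 between two strings, using lowercase whitespace tokens."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Two empty strings match perfectly; one empty string scores zero.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(token_f1("the cat", "the cat sat down"))
```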

🧠 sageLLM 模块架构 | sageLLM Modular Architecture

The modular ecosystem behind the sageLLM inference engine.

🔒 sagellm-protocol

基础协议 | Protocol & Foundations

定义推理引擎的 Schema、Error Codes 和基础类型 (Task0.1)。

Schemas, error codes, and base types for the sageLLM inference engine (Task0.1).

🔒 sagellm-core

引擎核心 | Engine Core

推理引擎的核心运行时与执行逻辑 (Task0)。

Core runtime and execution logic of the sageLLM inference engine (Task0).

🔒 sagellm-backend

计算后端 | Compute Backend

面向国产硬件（华为昇腾 / CPU）的计算抽象层 (Task0)。

Compute abstraction layer for domestic hardware (Huawei Ascend / CPU) (Task0).

🔒 sagellm-comm

通信层 | Communication Layer

分布式推理的通信硬件抽象层与拓扑管理 (Task1)。

Communication hardware abstraction layer and topology management for distributed inference (Task1).

🔒 sagellm-kv-cache

KV 缓存 | KV Cache Management

KV 缓存池、前缀缓存与驱逐策略管理 (Task2)。

KV cache pool, prefix caching, and eviction policy management (Task2).
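
To make prefix caching and eviction concrete: a prefix cache lets a new request reuse the KV state of the longest token prefix it shares with earlier requests, and an eviction policy decides what to drop when the pool is full. The toy class below assumes least-recently-used (LRU) eviction and opaque KV handles; the class and method names are hypothetical and do not reflect sagellm-kv-cache internals.

```python
from collections import OrderedDict


class PrefixCache:
    """Toy prefix cache mapping token-ID prefixes to opaque KV handles."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._entries = OrderedDict()  # ordered from least to most recently used

    def put(self, tokens, kv_handle):
        key = tuple(tokens)
        self._entries[key] = kv_handle
        self._entries.move_to_end(key)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict the least recently used

    def longest_prefix(self, tokens):
        """Return (matched_length, handle) for the longest cached prefix."""
        for length in range(len(tokens), 0, -1):
            key = tuple(tokens[:length])
            if key in self._entries:
                self._entries.move_to_end(key)  # a hit refreshes recency
                return length, self._entries[key]
        return 0, None


cache = PrefixCache(capacity=2)
cache.put([1, 2], "kv-a")
cache.put([1, 2, 3], "kv-b")
print(cache.longest_prefix([1, 2, 3, 4]))
```

The linear scan over prefix lengths keeps the example short; production systems typically use a trie or block-hash index so lookups do not rescan every prefix.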

🔒 sagellm-control-plane

控制面 | Control Plane

请求路由、调度器 IR 与生命周期管理。

Request routing, scheduler IR, and lifecycle management.

🔒 sagellm-gateway

API 网关 | API Gateway

OpenAI 兼容的 REST API 网关。

OpenAI-compatible REST API gateway.

🔒 sagellm-compression

模型压缩 | Model Compression

量化、稀疏化与投机解码加速技术 (Task3)。

Quantization, sparsification, and speculative decoding for acceleration (Task3).

✖️ sage-amms

近似矩阵乘法算子 | AMM Operators

为 sageLLM 提供基础矩阵乘法算子的 C++ 实现。

C++ implementation of approximate matrix multiplication (AMM) operators that provide foundational matrix multiplication for sageLLM.
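
For context, one classical AMM technique is randomized column-row sampling: the product A·B is a sum of outer products of A's columns with B's rows, and sampling a few of those terms with probability proportional to their norms gives an unbiased estimate. The pure-Python sketch below illustrates that single technique; it is not drawn from sage-amms or LibAMM, and the function name is an assumption.

```python
import math
import random


def sampled_matmul(a, b, num_samples, rng):
    """Estimate a @ b by sampling column-row outer products (lists of lists)."""
    n = len(b)
    if any(len(row) != n for row in a):
        raise ValueError("inner dimensions of a and b must match")
    m, p = len(a), len(b[0])
    # Weight of index i is ||A[:, i]|| * ||B[i, :]||.
    weights = [
        math.sqrt(sum(a[r][i] ** 2 for r in range(m)))
        * math.sqrt(sum(x * x for x in b[i]))
        for i in range(n)
    ]
    total = sum(weights)
    result = [[0.0] * p for _ in range(m)]
    if total == 0:
        return result  # every outer product is zero, so the product is zero
    for i in rng.choices(range(n), weights=weights, k=num_samples):
        # Dividing by (num_samples * p_i) keeps the estimator unbiased.
        scale = total / (weights[i] * num_samples)
        for r in range(m):
            for c in range(p):
                result[r][c] += scale * a[r][i] * b[i][c]
    return result


a = [[1.0, 2.0, 0.5], [0.0, 3.0, 1.0]]
b = [[1.0, 1.0], [4.0, 5.0], [2.0, 0.0]]
print(sampled_matmul(a, b, num_samples=2, rng=random.Random(0)))
```

Increasing `num_samples` trades speed for accuracy, which is the trade-off AMM benchmarks measure.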

📊 LibAMM

AMM 基准测试 | AMM Benchmark Library [NeurIPS'24]

聚合主流 AMM 算法的高性能基准测试库。

High-performance benchmark library that aggregates mainstream AMM algorithms, with CUDA acceleration.

🔒 sagellm-docs

文档 | Documentation

内部任务书、规范与研究文档。

Internal task books, specifications, and research docs.

工具与基础设施 | Tools & Infrastructure

📦 sage-pypi-publisher

PyPI 发布工具 | PyPI Publisher Toolkit

Python monorepos 的字节码编译与 PyPI 发布工具。

Bytecode compiler and PyPI publisher toolkit for Python monorepos.

🌐 sage-edge

SAGE 网关聚合器 | SAGE Gateway Aggregator

轻量级 FastAPI 网关聚合器，为 SAGE 提供统一的 API 入口。

Lightweight FastAPI gateway aggregator that provides a unified API entry point for SAGE.

🐙 sage-github-manager

GitHub 问题管理工具 | GitHub Issues Manager

SAGE 项目的 GitHub Issues 管理工具，具有 AI 增强功能。

GitHub Issues management tool for the SAGE project, with AI-powered features.

🎨 sage-studio

可视化工作流 | Visual Workflow

SAGE AI 流水线的可视化构建器与 LLM Playground。

Visual workflow builder and LLM playground for SAGE AI pipelines.

🔒 sage-team-info

团队信息 | Team Info

SAGE 项目人员分配和敏感信息。

Internal team allocation and sensitive information.

🗄️ 历史仓库 | Historical Repositories

sage-db_outdated - SAGE 数据库的早期版本（已过时）| Early version of SAGE database (outdated)

🚀 其他研究项目 | Other Research Projects

流处理系统 | Stream Processing Systems

MorphStream ⭐ 141 - [ICDE'20, SIGMOD'23, TKDE'24] 可扩展的事务性流处理引擎 | Scalable transactional stream processing engine
AllianceDB ⭐ 16 - [SIGMOD'21] 并行数据库系统 | Parallel database system

基准测试与工具 | Benchmarks & Tools

Sesame ⭐ 26 - [SIGMOD'23] 数据流聚类实证研究 | Data stream clustering empirical study
PDSC - 并行数据流聚类基准 | Parallel data stream clustering benchmark

机器学习与AI | Machine Learning & AI

SentiStream ⭐ 7 - [EMNLP'23] 情感分析流处理 | Sentiment analysis stream processing
StreamLearning - 流式学习框架 | Stream learning framework

资源与文档 | Resources & Documentation

StreamProcessing_ReadingList ⭐ 69 - 流处理文献阅读列表 | Stream processing reading list
Awesome-Online-Continual-Learning - 在线持续学习资源 | Online continual learning resources

📖 快速开始 | Quick Start

安装 SAGE | Install SAGE

# PyPI 安装 | Install from PyPI
pip install isage

# 开发安装 | Development installation
git clone https://github.com/intellistream/SAGE.git
cd SAGE
./quickstart.sh --dev --yes

简单示例 | Simple Example

from sage.kernel.api.local_environment import LocalEnvironment
from sage.libs.io.source import FileSource
from sage.middleware.operators.rag import DenseRetriever, QAPromptor, OpenAIGenerator
from sage.libs.io.sink import TerminalSink

# 创建执行环境 | Create execution environment
env = LocalEnvironment("rag_pipeline")

# 构建声明式管道 | Build declarative pipeline
(
    env.from_source(FileSource, {"file_path": "questions.txt"})
    .map(DenseRetriever, {"model": "sentence-transformers/all-MiniLM-L6-v2"})
    .map(QAPromptor, {"template": "Answer based on: {context}\nQ: {query}\nA:"})
    .map(OpenAIGenerator, {"model": "gpt-3.5-turbo"})
    .sink(TerminalSink)
)

# 执行管道 | Execute pipeline
env.submit()

详细文档请访问：SAGE Documentation

For detailed documentation, visit: SAGE Documentation

🤝 参与贡献 | Contributing

我们欢迎各种形式的贡献！请查看各个仓库的 CONTRIBUTING.md 文件了解详情。

We welcome contributions of all kinds! Please check the CONTRIBUTING.md file in each repository for details.

📞 联系我们 | Contact Us

💬 Email: shuhao_zhang at hust.edu.cn
🌐 Website: intellistream.github.io

📄 许可证 | License

各项目许可证详见各仓库的 LICENSE 文件。大多数项目采用 MIT 或 Apache 2.0 许可证。

License details can be found in each repository's LICENSE file. Most projects use MIT or Apache 2.0 licenses.

⭐ 如果我们的项目对您有帮助，请给我们一个 Star！

If our projects help you, please give us a Star!
