This project demonstrates optimized inference for large language models using Intel's Neural Processing Unit (NPU) via IPEX-LLM.
- Optimized Inference: Leverages the Intel NPU for efficient LLM inference.
- Modular Design: Core logic is encapsulated in `NPUInferenceEngine` for easy integration.
- Hardware Verification: Includes tools to verify NPU availability.
- Example Applications: Includes a summarization example.
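To illustrate the intended integration surface, here is a hypothetical usage sketch of `NPUInferenceEngine`. The class body below is a stand-in written for this README; the constructor arguments, method names, and return values are assumptions, and the real class in this repository may differ.

```python
class NPUInferenceEngine:
    """Illustrative stand-in for the project's NPUInferenceEngine.

    The real engine would load the model with IPEX-LLM, optimize it
    for the Intel NPU, and cache the optimized weights on disk.
    """

    def __init__(self, model_path: str, save_directory: str = "./model_weights"):
        self.model_path = model_path          # Hugging Face ID or local path
        self.save_directory = save_directory  # where optimized weights are cached

    def generate(self, prompt: str, n_predict: int = 128) -> str:
        # Placeholder: the real implementation runs NPU inference here.
        return f"<generated up to {n_predict} tokens for: {prompt!r}>"


# Hypothetical integration: construct once, call generate() as needed.
engine = NPUInferenceEngine("Qwen/Qwen2.5-0.5B-Instruct")
output = engine.generate("What is life?")
```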
- Create and activate a virtual environment:

  ```
  uv venv .env
  .env\Scripts\activate
  ```

- Install dependencies:

  ```
  uv pip install -e .
  ```

- Verify NPU hardware:

  ```
  python tests/hardware_test.py
  ```
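A hardware check of this kind can be sketched as below. This is an assumption about the approach, not the actual contents of `tests/hardware_test.py`: it uses the OpenVINO runtime (which IPEX-LLM's NPU path builds on) to see whether an `NPU` device is enumerated.

```python
def npu_available() -> bool:
    """Return True if the OpenVINO runtime reports an NPU device.

    Sketch only -- the repository's tests/hardware_test.py may use a
    different mechanism.
    """
    try:
        from openvino import Core  # OpenVINO runtime enumerates devices
    except ImportError:
        return False  # runtime not installed, so the NPU path cannot work
    return "NPU" in Core().available_devices


if __name__ == "__main__":
    print("NPU available:", npu_available())
```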
Run the main script to generate text. The model is automatically downloaded, optimized, and saved to `./model_weights` by default.
```
python src/main.py --prompt "What is the flavor of water?"
```

Options:

- `--prompt`: The input prompt (default: `"What is life?"`).
- `--repo-id-or-model-path`: Hugging Face model ID or local path (default: `Qwen/Qwen2.5-0.5B-Instruct`).
- `--save-directory`: Directory in which to save the optimized model (default: `./model_weights`).
- `--n-predict`: Maximum number of tokens to predict (default: 128).
- `--disable-streaming`: Disable streaming output.
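The options and defaults above correspond to an `argparse` setup along these lines (a sketch of the CLI surface, not the project's actual `src/main.py` source):

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Mirrors the documented options and their defaults.
    p = argparse.ArgumentParser(description="NPU-optimized LLM inference")
    p.add_argument("--prompt", default="What is life?",
                   help="input prompt for generation")
    p.add_argument("--repo-id-or-model-path", default="Qwen/Qwen2.5-0.5B-Instruct",
                   help="Hugging Face model ID or local path")
    p.add_argument("--save-directory", default="./model_weights",
                   help="where the optimized model is saved")
    p.add_argument("--n-predict", type=int, default=128,
                   help="maximum number of tokens to predict")
    p.add_argument("--disable-streaming", action="store_true",
                   help="disable streaming output")
    return p


# Parsing an empty argument list yields the documented defaults.
args = build_parser().parse_args([])
```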
Run the example script to summarize a text file (`examples/story.txt`):

```
python examples/summarize.py
```
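One plausible shape for such a summarization example is shown below. The helper name and prompt wording are illustrative assumptions; the actual `examples/summarize.py` may construct its prompt differently.

```python
def build_summary_prompt(text: str) -> str:
    """Wrap the input text in a simple summarization instruction."""
    return f"Summarize the following text in a few sentences:\n\n{text}"


# In the real example this string would be read from examples/story.txt.
story = "Once upon a time..."
prompt = build_summary_prompt(story)
# The prompt would then be passed to the NPU inference engine for generation.
```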