[FEATURE] Integrate LLM Compressor #367

Description

@sdiazlor

‼️ If you want to work on this issue: please comment below and wait until a maintainer assigns this issue to you before opening a PR to avoid several contributions on the same issue. Thanks! 😊

✨ What You’ll Do

Making the models faster, smaller, and greener is essential in Pruna, which is why we continually seek new ways to increase efficiency.

LLM Compressor is a library for optimizing models for deployment with vLLM. Right now, we have only integrated AWQ, but it would be nice to implement the remaining algorithms as well: Simple PTQ, GPTQ, SmoothQuant, and SparseGPT.

Don't hesitate to coordinate and collaborate to work on each of them.
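
For orientation, "Simple PTQ" usually means round-to-nearest post-training quantization: weights are scaled into an integer range and rounded, with no calibration-based correction. The sketch below illustrates that idea in plain Python. It is not LLM Compressor's or Pruna's actual code, and `quantize_rtn` and `dequantize` are hypothetical names chosen for this example.

```python
from typing import List, Tuple


def quantize_rtn(weights: List[float], num_bits: int = 8) -> Tuple[List[int], float]:
    """Symmetric, per-tensor round-to-nearest quantization (illustrative only)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max((abs(w) for w in weights), default=0.0)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round each weight to the nearest integer level and clamp to the valid range.
    quantized = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Map integer levels back to approximate floating-point weights."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.031, 1.0]
q, scale = quantize_rtn(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

The rounding error per weight is bounded by half the scale. Methods such as GPTQ and SmoothQuant exist to reduce the accuracy loss that this plain rounding introduces.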

🤖 Useful Resources

✅ Acceptance Criteria

  • Tests & Docs: All existing and new unit tests pass, and the documentation is updated

And don’t forget to give us a ⭐️!


❓ Questions?

Feel free to jump into the #contributing Discord channel if you hit any roadblocks. Can’t wait to see your contribution! 🚀


Metadata

Labels

  • enhancement: New feature or request
  • first-prune: Celebrate the 1st pruna OSS anniversary
  • good first issue: Good for newcomers
