‼️ If you want to work on this issue: please comment below and wait until a maintainer assigns this issue to you before opening a PR to avoid several contributions on the same issue. Thanks! 😊
✨ What You’ll Do
Making the models faster, smaller, and greener is essential in Pruna, which is why we continually seek new ways to increase efficiency.
LLM Compressor is a library for optimizing models for deployment with vLLM. Right now, we have only integrated AWQ, but it would be great to implement the remaining algorithms as well: Simple PTQ, GPTQ, SmoothQuant, and SparseGPT.
Each algorithm can be tackled separately, so feel free to coordinate with other contributors and split the work.
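For contributors new to these methods, here is a rough sketch of what the simplest one, round-to-nearest post-training quantization, does to a list of weights. It is plain Python for illustration only: the function names `quantize_int8` and `dequantize_int8` are hypothetical and are not part of Pruna or LLM Compressor, and the symmetric per-tensor int8 scheme is an assumption, not necessarily what Simple PTQ does in the library.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor round-to-nearest quantization to int8 (illustrative only)."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Fall back to a scale of 1.0 for an empty or all-zero tensor to avoid dividing by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round each weight to the nearest integer step and clamp to the int8 range.
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Map int8 values back to approximate float weights."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    quantized, scale = quantize_int8(weights)
    restored = dequantize_int8(quantized, scale)
    for original, approx in zip(weights, restored):
        print(f"{original:+.4f} -> {approx:+.4f}")
```

Methods such as GPTQ or SmoothQuant go beyond this by using calibration data to reduce the rounding error, which is why each one needs its own integration.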
🤖 Useful Resources
✅ Acceptance Criteria
- Tests & Docs: All existing and new unit tests pass, and the documentation is updated
And don’t forget to give us a ⭐️!
❓ Questions?
Feel free to jump into the #contributing Discord channel if you hit any roadblocks. Can’t wait to see your contribution! 🚀