ktk-07/specdec


Main Objective

An educational repository for understanding Speculative Decoding and the math behind it. The goal is to understand why it works, not just how to use it.

Speculative Decoding/Sampling was introduced in two papers: Fast Inference from Transformers via Speculative Decoding by Google, and Accelerating Large Language Model Decoding with Speculative Sampling by Google DeepMind.

How we will go about this Repository

Go to the notebooks directory of the repository.

  • There are a few notebooks to work through:
  1. Understanding Model Inference and Different Decoding/Generation Techniques
    • Deterministic Decoding
      • Greedy Sampling
      • Beam Search
    • Probabilistic Decoding/Sampling
      • Sampling with temperature
      • Top K Sampling
      • Top P Sampling
  2. Understanding the Speculative Decoding Theory
  3. Proving that Speculative Decoding actually samples from the target distribution
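
The decoding techniques listed under the first notebook can be illustrated on a toy example. The sketch below is not taken from the notebooks: the function names and the four-token `logits` vector are hypothetical, and it uses only the Python standard library. Beam search is left out because it needs a model that scores whole sequences rather than a single step.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; temperature < 1 sharpens, > 1 flattens."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy(logits):
    """Deterministic: always pick the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])

def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, mass = [], 0.0
    for i in order:
        keep.append(i)
        mass += probs[i]
        if mass >= p:
            break
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]

def sample(probs, rng=random):
    """Draw one token index from a probability vector."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical scores for a 4-token vocabulary
probs = softmax(logits)

token_greedy = greedy(logits)                        # deterministic choice
token_temp = sample(softmax(logits, 0.7))            # sampling with temperature
token_top_k = sample(top_k_filter(probs, k=2))       # top-k sampling
token_top_p = sample(top_p_filter(probs, p=0.8))     # top-p (nucleus) sampling
```

Each probabilistic variant only changes the distribution that `sample` draws from, which is why temperature, top-k, and top-p can be combined by applying the filters one after another.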

Final Product

  • Build a mini speculative inference library
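
As a possible starting point for such a library, the accept/reject step described in the two papers can be sketched for a single token. This is a minimal illustration under stated assumptions, not the repository's implementation: `speculative_step`, `p_draft`, and `q_target` are hypothetical names, and the two distributions are hard-coded here where a real library would obtain them from draft and target model forward passes.

```python
import random

def speculative_step(p_draft, q_target, rng=random):
    """One accept/reject step over a shared vocabulary.

    p_draft, q_target: next-token probability lists from the draft and
    target models (hypothetical inputs for this sketch).
    Returns (token, accepted).
    """
    # 1. The draft model proposes a token.
    x = rng.choices(range(len(p_draft)), weights=p_draft, k=1)[0]
    # 2. Accept the proposal with probability min(1, target(x) / draft(x)).
    if rng.random() < min(1.0, q_target[x] / p_draft[x]):
        return x, True
    # 3. On rejection, resample from the residual max(0, target - draft).
    #    rng.choices normalizes the weights, so no explicit division is needed.
    residual = [max(0.0, q - p) for p, q in zip(p_draft, q_target)]
    x = rng.choices(range(len(residual)), weights=residual, k=1)[0]
    return x, False

# Empirical check with toy distributions: the emitted tokens should follow
# the target distribution even though every proposal comes from the draft.
rng = random.Random(0)
p_draft = [0.5, 0.3, 0.2]
q_target = [0.2, 0.3, 0.5]
n = 100_000
counts = [0, 0, 0]
n_accepted = 0
for _ in range(n):
    token, accepted = speculative_step(p_draft, q_target, rng)
    counts[token] += 1
    n_accepted += accepted
empirical = [c / n for c in counts]
acceptance_rate = n_accepted / n
```

The reason this works: a token x is emitted with probability min(draft(x), target(x)) through acceptance, plus P(reject) times the normalized residual through resampling. Since P(reject) equals the sum of max(0, target - draft) over the vocabulary, the second term reduces to max(0, target(x) - draft(x)), and the two terms add up to target(x). The expected acceptance rate is the sum of min(draft, target), which is 0.7 for the toy distributions above.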

Additional Research

  1. Can we use models of different but similar architectures to do speculative decoding?

Keywords to Be Mindful Of

  • Target Model
  • Draft Model
  • Speculative Sampling
  • Speculative Decoding

Prerequisites

Citations

  • Fast Inference from Transformers via Speculative Decoding
  • Accelerating Large Language Model Decoding with Speculative Sampling
