This repository contains the code for our paper "Accelerating Newton-Schulz Iteration for Orthogonalization via Chebyshev-type Polynomials" by Ekaterina Grishina, Matvey Smirnov and Maxim Rakhuba.
- polynomials.py contains code for finding optimal polynomials.
- notebooks/polynomials.ipynb contains examples of how to generate optimal polynomials using these functions.
- riemannian_opt.py contains code for polar retraction and for Riemannian SGD and Adam optimizers on the Stiefel manifold.
- /nanoGPT contains code for running NanoGPT with the Muon optimizer and different polynomials. This code is based on the NanoGPT speedrun repository.
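For context, the iteration named in the paper title can be illustrated with its classical cubic form. The following is a minimal NumPy sketch of that textbook iteration, not the code from polynomials.py and not the optimized polynomials from the paper. The function name `newton_schulz_orthogonalize`, the Frobenius-norm scaling, and the iteration count are illustrative assumptions.

```python
import numpy as np


def newton_schulz_orthogonalize(a, num_iters=30):
    """Approximate the orthogonal polar factor of `a` (hypothetical helper).

    Uses the classical cubic Newton-Schulz step, not the paper's polynomials.
    """
    # Scale by the Frobenius norm so every singular value lies in (0, 1],
    # which is inside the convergence region (0, sqrt(3)) of the iteration.
    x = a / np.linalg.norm(a)
    for _ in range(num_iters):
        # Classical step: X <- 1.5 * X - 0.5 * X X^T X.
        # Each singular value s is mapped to 1.5 * s - 0.5 * s**3, pushing it toward 1.
        x = 1.5 * x - 0.5 * x @ x.T @ x
    return x


# Example on a random tall matrix (assumed to have full column rank).
rng = np.random.default_rng(0)
a = rng.standard_normal((6, 4))
q = newton_schulz_orthogonalize(a)

# Distance of q from having orthonormal columns.
err = np.linalg.norm(q.T @ q - np.eye(4))
```

Here `err` measures how far the columns of `q` are from being orthonormal. Small singular values grow by a factor of roughly 1.5 per step before the fast final phase begins, so this classical form can need many iterations on ill-conditioned inputs.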
Navigate to the /nanoGPT directory and install the requirements. To download the data for training NanoGPT, run:
python data/cached_fineweb10B.py 8
To train the model, run:
./run.sh