Hi Owen, thanks for your work. It's really helpful 🙌 🙌
I'm trying to run DoReMi on the full set of 21 datasets to get the mixing weights used to create global_combine_v1, but I'm running into some CUDA version problems. As I understand it, DoReMi is only used to find the weights for the datasets to combine. Could you share the DoReMi weights that went into global_combine_v1, so I can avoid rerunning DoReMi?
Alternatively, if you already have global_combine_v1 itself, it would be great if you could share it. Thanks!
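In the meantime, here's a minimal sketch of how I'd apply the weights once available, assuming hypothetical dataset names and weights and using the Hugging Face `datasets` library (not necessarily how global_combine_v1 was actually built):

```python
# Minimal sketch: mix domains in proportion to DoReMi weights.
# Dataset names and weights below are placeholders, not the actual 21 datasets.
from datasets import load_dataset, interleave_datasets

domain_weights = {
    "dataset_a": 0.5,  # placeholder domains; weights should sum to 1
    "dataset_b": 0.3,
    "dataset_c": 0.2,
}

streams = [
    load_dataset(name, split="train", streaming=True)
    for name in domain_weights
]

# Sample from each domain with probability equal to its DoReMi weight
combined = interleave_datasets(
    streams,
    probabilities=list(domain_weights.values()),
    seed=42,
    stopping_strategy="all_exhausted",
)
```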
I'm thinking along the same lines: using Karpathy's llama2.c instead of Pythia, together with the high-quality synthetic datasets, to build 100-300 MB task-specific models.
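For sizing intuition, a quick back-of-envelope sketch (my own assumption of dense weights at fp32 or fp16, not llama2.c's actual export format): a 100-300 MB checkpoint corresponds to roughly 25-75M parameters at fp32, or 50-150M at fp16.

```python
# Back-of-envelope: checkpoint size scales as parameters x bytes-per-parameter.
# This is a sizing estimate only, not the llama2.c export format.
def checkpoint_size_mb(n_params: float, bytes_per_param: int) -> float:
    return n_params * bytes_per_param / 1e6

for n_params in (25e6, 75e6, 150e6):
    fp32 = checkpoint_size_mb(n_params, 4)  # 4 bytes/param
    fp16 = checkpoint_size_mb(n_params, 2)  # 2 bytes/param
    print(f"{n_params/1e6:.0f}M params: ~{fp32:.0f} MB fp32, ~{fp16:.0f} MB fp16")
```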
Did you publish your pipeline?
Yes, please take a look at https://github.com/emrgnt-cmplxty/sciphi for the data pipeline and https://github.com/emrgnt-cmplxty/SmolTrainer for the training pipeline.