Discussion about this post

Raphael Kalandadze

Hi Owen, thanks for your work. It's really helpful 🙌 🙌

I'm trying to run DoReMi on all 21 datasets to get the weights needed to create global_combine_v1.

But I'm running into some CUDA version problems. As I understand it, DoReMi is only used to determine the weights of the datasets to combine. Could you share the DoReMi weights that were used for global_combine_v1, so that I don't have to rerun DoReMi?

Alternatively, if you already have global_combine_v1, it would be great if you could share it. Thanks!

Gireesan

I've been thinking along the same lines and was looking for a way to use Karpathy's llama2.c instead of Pythia, together with high-quality synthetic datasets, to build task-specific models of 100 MB to 300 MB.

Did you publish your pipeline?
