18: Examples & Tutorials

Fine-tuning Vicuna-13B, using SkyPilot spot instances

FastChat is a platform for training, serving, and evaluating LLM-based chatbots. It includes the full code used to fine-tune Vicuna from Llama.

Nicely, the recipe also works with SkyPilot, a UC Berkeley framework for running ML workloads on any cloud (AWS, GCP, Lambda, etc.). Vicuna trains on 8 A100 GPUs with 80GB of memory each, and SkyPilot can provision them as spot instances to cut the cost; a sketch of what that looks like is below.
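
A minimal sketch using SkyPilot's Python API, assuming you run it from a FastChat checkout; the setup and run commands, entry-point path, and cluster name here are placeholders rather than FastChat's actual training script.

```python
import sky

task = sky.Task(
    name="vicuna-finetune",
    setup="pip install -e .",  # hypothetical: install FastChat from the current checkout
    run="torchrun --nproc_per_node=8 fastchat/train/train.py",  # hypothetical entry point
)

# Request 8x A100-80GB and allow spot (preemptible) instances.
task.set_resources(sky.Resources(accelerators="A100-80GB:8", use_spot=True))

# SkyPilot picks whichever enabled cloud can satisfy the request.
sky.launch(task, cluster_name="vicuna")
```

The same request can be written as a SkyPilot YAML task file and launched with the `sky launch` CLI; the Python API is just the programmatic equivalent.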


StableVicuna, an open-source RLHF LLM chatbot, including its training datasets

StableVicuna is another fine-tuning of the Llama base weights. There isn’t anything new in the training code here, but thankfully they publish all of their datasets: human-annotated assistant-style conversations comprising 161K messages; 438K prompts and responses generated by GPT-3.5; and Alpaca, Stanford’s dataset of 52K instructions produced with OpenAI’s text-davinci-003 model. A sketch of pulling these from the Hugging Face Hub follows.
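
A minimal sketch of loading the referenced datasets with the Hugging Face `datasets` library; the Hub IDs are assumptions on my part (the GPT-3.5-generated set in particular may live under a different name), so check the StableVicuna release notes for the exact sources.

```python
from datasets import load_dataset

# Human-annotated assistant-style conversations (~161K messages).
oasst = load_dataset("OpenAssistant/oasst1")

# Stanford Alpaca: 52K instructions generated with text-davinci-003.
alpaca = load_dataset("tatsu-lab/alpaca")

# Assumed Hub ID for the GPT-3.5-generated prompt/response pairs.
gpt4all = load_dataset("nomic-ai/gpt4all_prompt_generations")

print(oasst["train"][0])   # inspect one raw conversation record
print(alpaca["train"][0])  # fields: instruction, input, output
```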


Creating a coding assistant with StarCoder

This is an excellent Hugging Face tutorial on how to use and fine-tune StarCoder (the 16B open-source code generation model), including how to prepare and format datasets for fine-tuning.
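
For context, here is a minimal sketch of loading StarCoder for plain generation with `transformers`; the fine-tuning side (dataset formatting, dialogue templates, parameter-efficient training) is what the tutorial itself walks through. Note the checkpoint is gated, so you need to accept the license on the Hub first.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, torch_dtype=torch.float16, device_map="auto"
)

# Complete a function signature as a quick sanity check.
prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```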