101 Train Decision Transformers
Training Decision Transformers with 🤗 transformers
In this tutorial, you'll learn to train your first Offline Decision Transformer model from scratch to make a half-cheetah run.
If you have questions, please post them in the #study-group Discord channel: https://discord.gg/aYka4Yhff9
🎮 Environments:
⬇️ Here's what you'll achieve at the end of this tutorial. ⬇️
Step 0: Set the GPU 💪
- To speed up the agent's training, we'll use a GPU. To enable it, go to
Runtime > Change Runtime type
Hardware Accelerator > GPU
Step 1: Install dependencies for model evaluation 🔽
Step 2: Install and import the packages 📦
Step 3: Loading the dataset from the 🤗 Hub and instantiating the model
We host a number of offline RL datasets on the Hub. Today we will be training with the halfcheetah "expert" dataset, which is hosted on the Hub.
First, we need to import the load_dataset function from the 🤗 datasets package and download the dataset to our machine.
Step 4: Defining a custom DataCollator for the transformers Trainer class
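A Decision Transformer is conditioned on the return-to-go at each timestep, so a collator for it usually has to turn per-step rewards into that quantity. Below is a minimal, framework-free sketch of that one piece. The helper name `discount_cumsum` and the `gamma` parameter are assumptions made for illustration, and the function works on plain Python lists rather than on tensors.

```python
def discount_cumsum(rewards, gamma=1.0):
    """Return the discounted sum of future rewards at every timestep.

    Hypothetical helper: rewards is a list of floats for one trajectory.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


rewards = [1.0, 2.0, 3.0]
returns_to_go = discount_cumsum(rewards)
print(returns_to_go)  # [6.0, 5.0, 3.0]
```

With `gamma=1.0` the result is simply the undiscounted sum of the remaining rewards, which is the usual return-to-go; a smaller `gamma` down-weights rewards that are further in the future.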
Step 5: Extending the Decision Transformer Model to include a loss function
In order to train the model with the 🤗 Trainer class, we first need to ensure that the dictionary the model returns contains a loss, in this case the L2 norm of the difference between the model's action predictions and the targets.
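To make the loss concrete, here is a small sketch that assumes the L2 loss is computed as a mean squared error over the action dimensions, and that padded timesteps are excluded through an attention mask. Both are assumptions, as is the function name `masked_action_mse`. It uses nested Python lists to stay framework-free; in the notebook the same arithmetic would run on tensors inside the model's forward pass.

```python
def masked_action_mse(action_preds, action_targets, attention_mask):
    """Mean squared error over the action dimensions of unpadded timesteps.

    Actions are nested lists shaped [batch][time][act_dim]; the mask is
    shaped [batch][time], with 1 for a real timestep and 0 for padding.
    """
    squared_error_sum = 0.0
    element_count = 0
    for preds_seq, targets_seq, mask_seq in zip(
        action_preds, action_targets, attention_mask
    ):
        for pred, target, keep in zip(preds_seq, targets_seq, mask_seq):
            if not keep:
                continue  # padded timestep: contributes nothing to the loss
            for p, t in zip(pred, target):
                squared_error_sum += (p - t) ** 2
                element_count += 1
    if element_count == 0:
        raise ValueError("attention_mask masks out every timestep")
    return squared_error_sum / element_count


# One sequence, two timesteps, two action dimensions; the second step is padding.
preds = [[[1.0, 2.0], [9.0, 9.0]]]
targets = [[[0.0, 0.0], [0.0, 0.0]]]
mask = [[1, 0]]
loss = masked_action_mse(preds, targets, mask)

# The Trainer looks for a "loss" entry in the dictionary the model returns.
forward_output = {"loss": loss}
print(forward_output)  # {'loss': 2.5}
```

Only the first timestep counts here, so the loss is (1 + 4) / 2 = 2.5; the large errors on the padded step are ignored.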
Step 6: Defining the training hyperparameters and training the model
Here, we define the training hyperparameters and our Trainer class that we'll use to train our Decision Transformer model.
This step takes about an hour, so you may leave it running. Note that the authors train for at least 3 hours, so the results presented here will not be as strong as those of the models hosted on the 🤗 Hub.
Step 7: Visualize the performance of the agent
Step 8: Publish our trained model on the Hub 🔥
Now that we have seen good results after training, we can publish our trained model on the 🤗 Hub with one line of code.
Under the hood, the Hub uses git-based repositories (don't worry if you don't know what git is), which means you can update the model with new versions as you experiment and improve your agent.
To be able to share your model with the community, there are three more steps to follow:
1️⃣ (If it's not already done) create a Hugging Face account ➡ https://huggingface.co/join
2️⃣ Sign in, then store your authentication token from the Hugging Face website.
- Create a new token (https://huggingface.co/settings/tokens) with the write role
If you are not using Google Colab or a Jupyter Notebook, use this command instead: huggingface-cli login
3️⃣ We're now ready to push our trained model to the 🤗 Hub 🔥!
Some additional challenges
Congratulations, you've just trained your first Decision Transformer 🥳.
Now, the best way to learn is to try things on your own! Why not try another environment?
We provide datasets for some other environments:
Have fun!