SimpleLLM is a basic implementation for training Large Language Models. This project aims to include both a working module for training and generating text from trained language models, and a series of tutorial notebooks explaining each part of the process. The initial implementation is loosely based on the excellent work of the NanoGPT project, but will be expanded to include more model architectures (e.g. selective structured state space models) and a more modular design (multiple tokenizers and tools for running experiments).
SimpleLLM is currently only available on GitHub, so you will need to clone the repository to use it. You can do this by running the following command in your terminal:
```sh
git clone https://github.com/SamPIngram/SimpleLLM.git
```
Once you have cloned the repository, you can install the required packages by running the following command in the root directory of the project:
```sh
pip install -r requirements.txt
```
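If you prefer to keep SimpleLLM's dependencies isolated, one option (standard Python tooling, not a SimpleLLM requirement) is to install into a virtual environment first:

```sh
# Create and activate a virtual environment, then install the requirements
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```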
SimpleLLM is designed to be as simple to use as possible. The following code snippet shows how to train a model on a text file and then generate text from it:
```python
import simplellm

# Use a config file to set all the parameters
config = "configs/shakespeare_config.py"
# Get the Tiny Shakespeare dataset
# Train a model on the Tiny Shakespeare dataset
trainer = simplellm.Trainer(config_fp=config)
# Generate text from the trained model
```
This demo can be run end to end from the root directory of the project:

```sh
python demo.py
```
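The config file referenced above collects the training parameters in one place. The exact parameter names are defined by SimpleLLM itself, so the following is only a hypothetical sketch of what such a file might contain:

```python
# Hypothetical config values -- the real names used in
# configs/shakespeare_config.py are defined by SimpleLLM and may differ.
dataset = "tinyshakespeare"  # which dataset to download and tokenize
batch_size = 64              # sequences per optimizer step
block_size = 256             # context length in tokens
learning_rate = 1e-3         # optimizer learning rate
max_iters = 5000             # total training iterations
```

Keeping every hyperparameter in a single config file makes it easy to rerun or tweak an experiment without touching the training code.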
Training can also be done using multiple GPUs by running the following command (adjusting the number of GPUs to match your machine):

```sh
torchrun --standalone --nproc_per_node=8 demo.py
```
Or, if you have a cluster with multiple GPU nodes, you can run the following command on each node (adjusting the number of nodes, node rank, master address, and master port):

```sh
torchrun --nproc_per_node=8 --nnodes=2 --node_rank=0 --master_addr=123.456.123.456 --master_port=1234 demo.py
```
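Under the hood, torchrun launches one process per GPU and tells each process who it is through environment variables. A training script reads them roughly like this (a sketch of the standard torchrun contract, not SimpleLLM-specific code):

```python
import os

# torchrun sets RANK, LOCAL_RANK and WORLD_SIZE for every process it spawns.
# With the defaults below, the same script also works as a single process.
rank = int(os.environ.get("RANK", 0))              # global process index
local_rank = int(os.environ.get("LOCAL_RANK", 0))  # GPU index on this node
world_size = int(os.environ.get("WORLD_SIZE", 1))  # total number of processes

# Each process would then bind to its own GPU before training, e.g.
# torch.cuda.set_device(local_rank)
print(f"process {rank} of {world_size}, local GPU {local_rank}")
```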
SimpleLLM also includes a basic GUI for running experiments, which can be launched from the root directory of the project.
The following is a list of features that I would like to add to SimpleLLM in the future:
- Add a basic implementation of a language model
- Add a basic implementation of a trainer
- Add a basic implementation of a generator
- Add a basic implementation of a tokenizer
- Add a basic implementation of dataset fetching
- Add a GUI for running training experiments
- Add other model architectures:
  - selective structured state space models
  - mixture of experts models
- Add a tutorial notebook for each part of the process
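To make the tokenizer item above concrete, here is a minimal character-level tokenizer of the kind NanoGPT uses for the Tiny Shakespeare dataset (an illustrative sketch, not SimpleLLM's actual tokenizer API):

```python
class CharTokenizer:
    """Minimal character-level tokenizer (illustrative only)."""

    def __init__(self, text):
        # Build the vocabulary from every unique character in the corpus
        chars = sorted(set(text))
        self.stoi = {ch: i for i, ch in enumerate(chars)}  # string -> int
        self.itos = {i: ch for ch, i in self.stoi.items()}  # int -> string

    def encode(self, s):
        return [self.stoi[ch] for ch in s]

    def decode(self, ids):
        return "".join(self.itos[i] for i in ids)

tok = CharTokenizer("hello world")
ids = tok.encode("hello")
print(tok.decode(ids))  # prints "hello"
```

A character-level vocabulary is tiny and needs no training, which makes it a good starting point before swapping in subword tokenizers such as BPE.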
If you would like to contribute to SimpleLLM, please feel free to open a pull request. If you have any questions about the project, please feel free to open an issue.
SimpleLLM is licensed under the MIT license. See LICENSE for more details.