Finetune Text Generation (Generative Models)

Synopsis

Finetunes an LLM for Text Generation tasks

Description

Finetunes a Large Language Model for Text Generation tasks. The foundation model must first be downloaded from Huggingface into a model directory, and that directory must be provided as an input to this operator alongside the training data. You can use the Download Model and / or Load Model operators for this.

The training data set must have a text column which will be used as the input column for this finetuning. All texts from this column will be concatenated and then split into training chunks of the specified size (input length). The goal of Text Generation is to predict the next word(s), so unlike other finetuning operators this one needs only this single input column instead of input and target columns.

The resulting model will be stored in a directory of your project (recommended), in your file system, or in a temporary location if you choose to do so. Please note that only models which are marked for Text Generation on Huggingface can be used for this task. Since finetuning is a complex topic, we recommend reading the documentation here: https://docs.rapidminer.com/latest/studio/generative-ai/#finetuning-a-model
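The concatenate-and-chunk step described above can be illustrated with a minimal Python sketch. This is not the operator's actual implementation: the function names are hypothetical, whitespace splitting stands in for the model's real tokenizer, and dropping the incomplete last chunk is an assumption.

```python
from typing import List, Tuple


def chunk_texts(texts: List[str], input_length: int) -> List[List[str]]:
    """Concatenate all texts and split them into chunks of input_length tokens.

    Assumption: whitespace splitting replaces the model's tokenizer, and a
    trailing remainder shorter than input_length is dropped.
    """
    tokens: List[str] = []
    for text in texts:
        tokens.extend(text.split())
    # Keep only as many tokens as fit into complete chunks.
    usable = (len(tokens) // input_length) * input_length
    return [tokens[i:i + input_length] for i in range(0, usable, input_length)]


def next_token_pairs(chunk: List[str]) -> List[Tuple[str, str]]:
    """Pair every token with its successor, i.e. the shifted input is the label."""
    return list(zip(chunk[:-1], chunk[1:]))


texts = ["a b c", "d e", "f g h i"]
chunks = chunk_texts(texts, input_length=4)
pairs = next_token_pairs(chunks[0])
print(chunks)
print(pairs)
```

In this sketch the nine tokens yield two complete chunks of four tokens each, and the ninth token is lost. Each chunk supplies its own labels, which is why no target column is needed.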

Input

data (Data table)
The training data for the finetuning task. Needs at least one text column which is used as the input column. A separate column with the desired outcomes is not required for this operator.
model (File)
The model directory for the foundation model you are using for this finetuning job.

Output

model (File)
The model directory into which the finetuned result has been stored. Depending on the storage type, this is a folder in one of your projects, a folder in your file system, or a temporary folder.

Parameters

storage type Determines where the finetuned model will be stored. Either in a folder in one of your projects / repositories (recommended), in a folder of your file system, or in a temporary folder.
project folder The folder in a project / repository to store the finetuned model in.
file folder The folder in your file system to store the finetuned model in.
input column The name of the attribute or column which should be used as input for the fine-tuning. Unlike other finetuning tasks we do not need to provide a target column since the goal is to predict the next word(s) and the shifted inputs will serve as the labels.
input length All the texts in the input column will be concatenated and then split into parts, each of which will have this length.
epochs The number of epochs for this fine-tuning.
device Where the finetuning should take place. Either on a GPU, a CPU, or Apple’s MPS architecture. If set to Automatic, the training will prefer the GPU if available and will fall back to CPU otherwise.
device indices If you have multiple GPUs and computation is set up to happen on GPUs you can specify which ones are used with this parameter. Counting of devices starts with 0. The default of “0” means that the first GPU device in the system will be used, a value of “1” would refer to the second and so on. You can utilize multiple GPUs by providing a comma-separated list of device indices. For example, you could use “0,1,2,3” on a machine with four GPUs if all four should be utilized. Please note that RapidMiner performs data-parallel computation which means that the model needs to be small enough to be completely loaded on each of your GPUs.
finetuning mode Indicates if a full finetuning is performed or PEFT / LoRA which can dramatically accelerate the finetuning task.
lora r The dimension of the low-rank matrices used by LoRA.
lora alpha The scaling factor for the weight matrices used by LoRA.
lora dropout The dropout probability of the LoRA layers.
target modules mode If set to None, no specific definition is made for which modules (or layers) should be finetuned with PEFT / LoRA. This is the best setting for all the models which are natively supported by PEFT. If set to Automatic, we will extract the names of all linear layers automatically which is the recommended approach. And if set to Manual, you can specify a comma-separated list of target module names yourself.
target modules Only shown if the target modules mode is set to Manual. Here you can specify a comma-separated list of target module names. Those modules will then be finetuned with PEFT / LoRA. You can see the structure of the model including the module names in the logs.
quantization Quantization techniques reduce memory and computational costs by representing weights and activations with lower-precision data types like 8-bit or 4-bit integers.
16 bit precision Whether to use 16-bit (mixed) precision training (fp16) instead of 32-bit training.
input grouping length The size of each group or batch which will be concatenated and then split into chunks of input length. The larger this grouping is the less data loss from the concatenation and chunking you will suffer but processing times may be longer.
add separator tokens Indicates if tokens should be added marking the beginning and the end of each sentence (BOS and EOS). Using those tokens often helps the model to determine when to stop the text generation.
prep threads The number of parallel threads used for the data preprocessing.
batch size The batch size for this fine-tuning. The number of GPUs x batch size x gradient accumulation steps should usually be a multiple of 8.
gradient accumulation steps The gradient accumulation steps used for this fine-tuning. The number of GPUs x batch size x gradient accumulation steps should usually be a multiple of 8.
train test ratio The ratio of rows which is used for testing the fine-tuned model.
learning rate The learning rate for this fine-tuning.
weight decay Weight decay is a form of regularization to prevent overfitting. Weight decay reduces the size of the model’s weights by adding a penalty to the loss, meaning that weights can’t become overly important compared to all the others.
conda environment The conda environment used for this model task. Additional packages may be installed into this environment. Please refer to the extension documentation for additional details on this and on version requirements for Python and some packages which have to be present in this environment.
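The rule of thumb given for batch size and gradient accumulation steps can be checked with simple arithmetic. The following Python sketch is only an illustration; the function names are hypothetical and not part of the operator.

```python
def effective_batch_size(num_gpus: int, batch_size: int, grad_accum_steps: int) -> int:
    """Number of GPUs x batch size x gradient accumulation steps."""
    return num_gpus * batch_size * grad_accum_steps


def follows_rule_of_thumb(num_gpus: int, batch_size: int, grad_accum_steps: int) -> bool:
    """True if the effective batch size is a multiple of 8."""
    return effective_batch_size(num_gpus, batch_size, grad_accum_steps) % 8 == 0


# Four GPUs (device indices "0,1,2,3"), batch size 2, one accumulation step.
four_gpu_total = effective_batch_size(4, 2, 1)
four_gpu_ok = follows_rule_of_thumb(4, 2, 1)

# One GPU, batch size 3, two accumulation steps.
one_gpu_total = effective_batch_size(1, 3, 2)
one_gpu_ok = follows_rule_of_thumb(1, 3, 2)

print(four_gpu_total, four_gpu_ok)
print(one_gpu_total, one_gpu_ok)
```

The first configuration gives an effective batch size of 8 and follows the recommendation. The second gives 6, so adjusting the batch size or the gradient accumulation steps would be advisable.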

Tutorial Processes

Finetune a model for generating show descriptions

This process downloads a foundation model from Huggingface and finetunes it to generate show descriptions based on show titles. The training data consists of about 8,000 rows of Netflix shows including their titles and descriptions. The finetuned model is then applied to three test examples and deleted afterwards. The original foundation model is also deleted. Normally one would not delete the finetuned model after a single application, but we wanted to keep things clean for this tutorial. Please note that this process may run for a long time depending on your hardware setup.
