Finetune Text2Text Generation (Generative Models)
Synopsis
Finetunes an LLM for Text2Text Generation tasks
Description
Finetunes a Large Language Model for Text2Text Generation tasks. The foundation model must first be downloaded from Huggingface into a model directory, which is provided as an input to this operator alongside the training data. You can use the Download Model and / or Load Model operators for this. The training data set must have at least two columns: an input column and a target column. The resulting model is stored in a directory of your project (recommended), in your file system, or in a temporary location if you choose to do so. Please note that only models marked for Text2Text Generation on Huggingface can be used for these tasks. Since finetuning is a complex topic, we recommend reading the documentation here: https://docs.rapidminer.com/latest/studio/generative-ai/#finetuning-a-model
Input
- data (Data table)
The training data for the finetuning task. Needs at least an input column as well as a column with the desired outcomes.
- model (File)
The model directory for the foundation model you are using for this finetuning job.
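As an illustration of the expected data layout, the following minimal Python sketch builds a tiny training table with one input and one target column and shows how a prompt prefix could be placed in front of each input. The column names `english` and `dutch`, the prefix text, and the helper `build_examples` are hypothetical and chosen only for illustration; the operator's internal preprocessing may differ.

```python
# Hypothetical training rows: one input column and one target column.
rows = [
    {"english": "Good morning.", "dutch": "Goedemorgen."},
    {"english": "Thank you.", "dutch": "Dank je."},
]


def build_examples(rows, input_column, target_column, prompt_prefix=""):
    """Pair each prefixed input with its target (illustrative helper only)."""
    examples = []
    for row in rows:
        examples.append({
            # The prefix tells the model which task the input belongs to.
            "input": f"{prompt_prefix}{row[input_column]}",
            "target": row[target_column],
        })
    return examples


examples = build_examples(rows, "english", "dutch", "translate English to Dutch: ")
for example in examples:
    print(example["input"], "->", example["target"])
```

In this sketch the prefix is simply concatenated with the input text, which mirrors the role of the prompt prefix, input column, and target column parameters described under Parameters.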
Output
- model (File)
The model directory into which the finetuned result has been stored, either a folder in your projects or your file system.
Parameters
- storage type Determines where the finetuned model will be stored. Either in a folder in one of your projects / repositories (recommended), in a folder of your file system, or in a temporary folder.
- project folder The folder in a project / repository to store the finetuned model in.
- file folder The folder in your file system to store the finetuned model in.
- prompt prefix This prefix is used in front of each input to tell the finetuned model the task of this finetuning.
- input column The name of the attribute or column which should be used as input for the fine-tuning.
- target column The name of the attribute or column which should be used as the target for this fine-tuning. Since this is a translation task, the model will try to learn how to translate the values from the input column to those in the target column.
- max input tokens The maximum number of tokens allowed for the inputs. Longer sequences will be ignored.
- max target tokens The maximum number of tokens allowed for the target or output of the fine-tuned model. Longer sequences will be ignored.
- epochs The number of epochs for this fine-tuning.
- device Where the finetuning should take place. Either on a GPU, a CPU, or Apple’s MPS architecture. If set to Automatic, the training will prefer the GPU if available and will fall back to CPU otherwise.
- device indices If you have multiple GPUs and computation is set up to happen on GPUs you can specify which ones are used with this parameter. Counting of devices starts with 0. The default of “0” means that the first GPU device in the system will be used, a value of “1” would refer to the second and so on. You can utilize multiple GPUs by providing a comma-separated list of device indices. For example, you could use “0,1,2,3” on a machine with four GPUs if all four should be utilized. Please note that RapidMiner performs data-parallel computation which means that the model needs to be small enough to be completely loaded on each of your GPUs.
- finetuning mode Indicates whether a full finetuning is performed or PEFT / LoRA, which can dramatically accelerate the finetuning task.
- lora r The dimension of the low-rank matrices used by LoRA.
- lora alpha The scaling factor for the weight matrices used by LoRA.
- lora dropout The dropout probability of the LoRA layers.
- target modules mode If set to None, no specific definition is made for which modules (or layers) should be finetuned with PEFT / LoRA. This is the best setting for all the models which are natively supported by PEFT. If set to Automatic, we will extract the names of all linear layers automatically which is the recommended approach. And if set to Manual, you can specify a comma-separated list of target module names yourself.
- target modules Only shown if the target modules mode is set to Manual. Here you can specify a comma-separated list of target module names. These modules will then be finetuned with PEFT / LoRA. You can see the structure of the model, including the module names, in the logs.
- quantization Quantization techniques reduce memory and computational costs by representing weights and activations with lower-precision data types like 8-bit or 4-bit integers.
- 16 bit precision Whether to use 16-bit (mixed) precision training (fp16) instead of 32-bit training.
- add separator tokens Indicates whether the beginning / end of sentence tokens (bos, eos) should be added at the beginning and end of each statement.
- prep threads The number of parallel threads used for the data preprocessing.
- batch size The batch size for this fine-tuning. The number of GPUs x batch size x gradient accumulation steps should usually be a multiple of 8.
- gradient accumulation steps The gradient accumulation steps used for this fine-tuning. The number of GPUs x batch size x gradient accumulation steps should usually be a multiple of 8.
- train test ratio The fraction of rows that is used for testing the fine-tuned model.
- learning rate The learning rate for this fine-tuning.
- conda environment The conda environment used for this model task. Additional packages may be installed into this environment. Please refer to the extension documentation for additional details on this and on version requirements for Python and some packages which have to be present in this environment.
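The rule of thumb for batch size and gradient accumulation steps can be checked with a little arithmetic. The sketch below is only an illustration and not part of the operator: the helper names `parse_device_indices` and `effective_batch_size` are hypothetical, and it assumes that the number of GPUs equals the number of entries in the comma-separated device indices value.

```python
def parse_device_indices(device_indices):
    """Turn a comma-separated string such as "0,1,2,3" into a list of ints."""
    return [int(part) for part in device_indices.split(",") if part.strip()]


def effective_batch_size(device_indices, batch_size, gradient_accumulation_steps):
    """Number of GPUs x batch size x gradient accumulation steps."""
    num_gpus = len(parse_device_indices(device_indices))
    return num_gpus * batch_size * gradient_accumulation_steps


# Four GPUs, batch size 2, gradient accumulation steps 4.
total = effective_batch_size("0,1,2,3", batch_size=2, gradient_accumulation_steps=4)
print(total, total % 8 == 0)
```

With a single GPU (device indices "0"), a batch size of 4 and one gradient accumulation step would give a product of 4, which is not a multiple of 8. Raising the gradient accumulation steps to 2 would satisfy the rule without increasing the memory needed per step.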
Tutorial Processes
Finetune a model for a translation task
This process downloads a foundation model from Huggingface and finetunes it to translate from English to Dutch based on only 5,000 training examples. The finetuned model is then applied to three test examples and deleted afterwards. The original foundation model is also deleted. Obviously one would not delete the finetuned model after one application, but we wanted to keep things clean for this tutorial. Please note that this process may run for many hours, depending on your hardware setup.