Text Generation (Generative Models)


Applies a Text Generation model for a given prompt


Applies a Text Generation model for a given prompt. These models have been trained to extend a given text with the next word. For example, a model could predict that the next word after “Once upon a time in a dark forest lived an evil…” should be “witch”. Prompts can refer to data columns by putting them into double brackets, for example [[column_name]]. The result will be stored in a new column with the specified name.
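The placeholder mechanism can be sketched in plain Python. The function name `fill_prompt` and the column names below are illustrative only and are not part of the extension's API:

```python
import re

def fill_prompt(template, row):
    """Replace every [[column_name]] placeholder with the row's value."""
    def lookup(match):
        name = match.group(1)
        if name not in row:
            raise KeyError(f"prompt references unknown column: {name}")
        return str(row[name])
    return re.sub(r"\[\[(.+?)\]\]", lookup, template)

row = {"review": "Great battery life, mediocre screen."}
print(fill_prompt("Summarize this review in one word: [[review]]", row))
# prints: Summarize this review in one word: Great battery life, mediocre screen.
```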


Input

  • data (Data Table)

    The data which will be injected into the prompt. Column values of the data set can be accessed with [[column_name]] in the prompt.

  • model (File)

    The optional model directory (in your project / repository or your file system). It must be provided if the parameter "use local model" is true. Typically, this is only necessary if you want to use your own finetuned local version of a model.


Output

  • data (Data Table)

    The input data plus a new column (or several new columns) containing the results of the prompts sent to the model.

  • model (File)

    The model directory which has been delivered as input.


Parameters

  • use_local_model Indicates whether a local model from a local file directory should be used or a model from the Huggingface portal. If a local model is used, all task operators require a file object referencing the model directory as a second input. If this parameter is unchecked, you will need to specify the full model name from the Huggingface portal in the “model” parameter.
  • model The model from the Huggingface portal which will be used by the operator. Only used when the “use local model” parameter is unchecked. The model name needs to be the full name as found on the model card on the Huggingface portal. Please be aware that using large models can result in downloads of many gigabytes of data; models will be stored in a local cache.
  • name The name of the new column which will be created as a result.
  • prompt The prompt used for querying the model. You can reference the values of any input data column with [[column_name]]. You may need to use a prompt prefix such as “Translate to Dutch: [[column_name]]” to tell the model what it is supposed to do.
  • max_target_tokens The maximum number of tokens produced as output by the model. Note that some models can only work with specific maximum numbers of tokens. Please refer to the model documentation pages on Huggingface for more information about such limits.
  • device Where the computation should take place: on a GPU, a CPU, or Apple’s MPS architecture. If set to Automatic, the operator will prefer the GPU if available and fall back to the CPU otherwise.
  • device_indices If you have multiple GPUs and computation is set to happen on GPUs, you can specify which ones are used with this parameter. Counting of devices starts at 0. The default of “0” means that the first GPU device in the system will be used, a value of “1” would refer to the second, and so on. You can utilize multiple GPUs by providing a comma-separated list of device indices, for example “0,1,2,3” on a machine with four GPUs if all four should be utilized. Please note that RapidMiner performs data-parallel computation, which means that the model needs to be small enough to be completely loaded on each of your GPUs.
  • data_type Specifies the data type under which the model should be loaded. Using lower precisions can reduce memory usage while leading to slightly less accurate results in some cases. If set to “auto”, the data precision is derived from the model itself.
  • revision The specific model version to use. The default is “main”. The value can be a branch name, a tag name, or a commit id of the model in the Huggingface git repository.
  • trust_remote_code Whether or not to allow custom code defined on the Hub in the model’s own modeling, configuration, tokenization, or even pipeline files.
  • do_sample If true, the next word is randomly sampled from its conditional probability distribution instead of always taking the most likely word (greedy decoding).
  • temperature Controls the randomness of the answers. Lower values lead to less random answers. A temperature of 0 represents fully deterministic model behavior.
  • top_k The Top-K most likely next words are selected, and the entire probability mass is shifted to these k words, i.e., the low-probability words are removed altogether.
  • top_p Top-P sampling is like Top-K, but instead of choosing the top k most likely words, we choose the smallest set of words whose total probability is larger than p; the entire probability mass is then shifted to the words in this set.
  • number_of_beams Beam search is essentially greedy search (with a single beam), but the model tracks and keeps this number of hypotheses at each time step, so it can compare alternative paths as it generates text.
  • penalty_alpha The penalty for degeneration in contrastive search. When alpha is zero, contrastive search degenerates to vanilla greedy search.
  • no_repeat_ngram_size The size of n-grams which are not allowed to appear twice in the output. For example, a value of 2 means that no pair of consecutive tokens may occur more than once.
  • conda_environment The conda environment used for this model task. Additional packages may be installed into this environment; please refer to the extension documentation for details on this and on version requirements.
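The interplay of the temperature, top_k, and top_p parameters can be illustrated on a toy distribution. This is a simplified pure-Python sketch of the filtering ideas, not the extension's actual implementation:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature < 1 sharpens the distribution, > 1 flattens it.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_filter(probs, k):
    # Keep only the k most likely tokens and renormalize.
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    filtered = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(filtered)
    return [p / total for p in filtered]

def top_p_filter(probs, p):
    # Keep the smallest set of tokens whose cumulative probability exceeds p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cum = set(), 0.0
    for i in order:
        keep.add(i)
        cum += probs[i]
        if cum > p:
            break
    filtered = [q if i in keep else 0.0 for i, q in enumerate(probs)]
    total = sum(filtered)
    return [q / total for q in filtered]

probs = softmax([2.0, 1.0, 0.5, -1.0], temperature=0.7)
print(top_k_filter(probs, 2))  # all probability mass on the two best tokens
```

Sampling then simply draws the next token from the filtered, renormalized distribution instead of the full one.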
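The effect of no_repeat_ngram_size can likewise be sketched: before each generation step, any token that would complete an n-gram already present in the output is banned. The following is a simplified illustration of the idea, not the actual logits processor used under the hood:

```python
def banned_next_tokens(generated, n):
    """Tokens that would complete an n-gram already present in `generated`."""
    if n <= 0 or len(generated) < n - 1:
        return set()
    prefix = tuple(generated[-(n - 1):]) if n > 1 else tuple()
    banned = set()
    # Scan every n-gram start; if its first n-1 tokens match the current
    # suffix, the token that followed it is forbidden as the next token.
    for i in range(len(generated) - n + 1):
        if tuple(generated[i:i + n - 1]) == prefix:
            banned.add(generated[i + n - 1])
    return banned

tokens = ["the", "cat", "sat", "on", "the", "cat"]
print(banned_next_tokens(tokens, 2))  # prints {'sat'}: "cat sat" already occurred
```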

Tutorial Processes

Using a text generation model

This tutorial shows how to use a text generation model. It creates some prompts and feeds them into the task operator. You can also deliver a local model using the second operator input or specify a different model from Huggingface using the model parameter.