Send Prompt (OpenAI)
(Generative Models)
Synopsis
Generates a new column from LLM results using OpenAI.
Description
This operator sends prompts to OpenAI (ChatGPT, a different OpenAI model, or even a fine-tuned variant of their models) or to the equivalent models hosted by Microsoft Azure. The results are stored in a new column of the input data table. Prompts can refer to the values of one or several other columns using the format [[column_name]]. Please note that you need an account with OpenAI and an API key, which you provide as a Dictionary Connection at the second operator input. You may also need to buy credits, since OpenAI only allows a few queries for free. You can select one of the listed models or simply enter the model ID of a model that you have fine-tuned yourself. If you select Azure as the type, you need to provide a connection to Azure OpenAI instead. This Dictionary Connection needs to contain the 'api_key' as well as the 'api_base_url' of your Azure OpenAI environment. Please refer to the Azure OpenAI documentation for additional information.
Input
data (Data table)
The data which will be injected into your prompt template. You can refer to columns of the data with [[column_name]] in the prompt.
connection (Connection)
A Dictionary Connection with a single key-value pair whose key is 'api_key' and whose value is a valid OpenAI API key.
Output
data (Data table)
The resulting data set where the prompt results are added as a new column to the input data.
connection (Connection)
The input connection.
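Conceptually, the operator iterates over the rows of the input data, fills the [[column_name]] placeholders of the prompt with the row's values, sends the resulting prompt to the model, and collects the answers in a new column. The following Python sketch illustrates this idea under some assumptions: the function names, defaults, and the sequential loop are illustrative only (the actual operator sends requests in parallel and also supports the Azure type), and it assumes the openai package (version 1.x) and pandas are available.

    import re
    import pandas as pd
    from openai import OpenAI

    def fill_template(prompt_template: str, row: pd.Series) -> str:
        # Replace every [[column_name]] placeholder with that column's value in this row.
        return re.sub(r"\[\[(.+?)\]\]", lambda m: str(row[m.group(1)]), prompt_template)

    def send_prompts(data: pd.DataFrame, api_key: str, prompt_template: str,
                     model: str = "gpt-3.5-turbo", new_column: str = "response",
                     system_prompt: str = "", max_tokens: int = 512,
                     temperature: float = 0.0) -> pd.DataFrame:
        # 'api_key' corresponds to the value stored in the Dictionary Connection.
        client = OpenAI(api_key=api_key)
        answers = []
        for _, row in data.iterrows():
            messages = []
            if system_prompt:
                messages.append({"role": "system", "content": system_prompt})
            messages.append({"role": "user", "content": fill_template(prompt_template, row)})
            response = client.chat.completions.create(
                model=model,
                messages=messages,
                max_tokens=max_tokens,
                temperature=temperature,
            )
            answers.append(response.choices[0].message.content)
        data[new_column] = answers  # the prompt results become a new column
        return data

For the Azure type, only the client construction would change (for example, the AzureOpenAI client from the same package takes the endpoint corresponding to the connection's 'api_base_url' entry); the templating and the new result column work the same way.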
Parameters
- type Indicates if this operator should use an OpenAI model or a model hosted by Microsoft Azure.
- model Select a model which should be used for this application. You can also enter the model ID of a previously fine-tuned model instead.
- name The name of the new column which will be created as a result.
- prompt The prompt used for querying the model. Please note that you can reference the values of any of the input data columns with [[column_name]].
- max target tokens The maximum number of tokens which will be generated by the model.
- system prompt A system prompt which can be used to initialize the model or make it embody a specific persona.
- number of parallel requests The number of requests sent in parallel to OpenAI. Depending on your usage limits, you may need to reduce this value, or you can increase it further to speed up computation for larger data sets.
- check price limit Indicates whether an estimated price should be calculated before the model is used; execution is aborted if the estimated price exceeds the defined limit.
- price limit The price limit for this execution. If the estimated price exceeds this limit, the model will not be applied.
- price output length factor Used for the calculation of the estimated price. Since the length of the expected outputs often correlates with the length of the provided inputs, this factor defines the expected output lengths in relation to the inputs. A value of 1 means the outputs are expected to contain roughly the same number of tokens as the inputs, a value of 0.5 means only half as many tokens are expected, a value of 2 means double the amount of output, and so on (see the sketch after this parameter list).
- completion type The completion type for the model, typically Chat Completion but in certain cases you may need to use a regular Completion instead.
- temperature Controls the randomness used in the answers. Lower values will lead to less random answers. A temperature of 0 represents a fully deterministic model behavior.
- top p Controls diversity via nucleus sampling. A value of 0.5 means that half of all likelihood-weighted options would be considered.
- frequency penalty How much to penalize new tokens based on their frequency in the answer so far.
- presence penalty How much to penalize new tokens based on their presence in the answer so far. Increases the model's likeliness to talk about new topics.
- conda environment The conda environment used for this task. Additional packages may be installed into this environment; please refer to the extension documentation for additional details on this and on the version requirements for Python and some packages which have to be present in this environment.
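To make the interplay of the price parameters more concrete, here is a small sketch of how such a check could look. The per-1000-token prices and the exact formula are assumptions for illustration; the extension's actual calculation may differ.

    def estimate_price(input_tokens: int, output_length_factor: float,
                       price_per_1k_input: float, price_per_1k_output: float) -> float:
        # The expected output length is derived from the input length via the factor.
        expected_output_tokens = input_tokens * output_length_factor
        return (input_tokens / 1000.0) * price_per_1k_input \
            + (expected_output_tokens / 1000.0) * price_per_1k_output

    def check_price_limit(input_tokens: int, output_length_factor: float, price_limit: float,
                          price_per_1k_input: float, price_per_1k_output: float) -> None:
        estimated = estimate_price(input_tokens, output_length_factor,
                                   price_per_1k_input, price_per_1k_output)
        if estimated > price_limit:
            # Abort before any request is sent, so no tokens are billed.
            raise RuntimeError("Estimated price %.2f exceeds the limit %.2f"
                               % (estimated, price_limit))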
Tutorial Processes
Send Prompts to OpenAI for Data Enrichment
This process simply generates a data set with five countries in a column called Country. It then asks ChatGPT for the capitals for the countries which are added as an additional column. IMPORTANT: you will need an account with OpenAI and provide your own API key as a Dictionary Connection to make this tutorial process work.
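Expressed with the hypothetical send_prompts sketch shown earlier, the tutorial process corresponds roughly to the following; the concrete country values and the exact prompt wording are assumptions for illustration, not taken from the process itself.

    import pandas as pd

    # Hypothetical stand-in for the generated tutorial data:
    # five countries in a column called Country.
    countries = pd.DataFrame({"Country": ["France", "Germany", "Italy", "Spain", "Japan"]})

    enriched = send_prompts(
        data=countries,
        api_key="YOUR_OPENAI_API_KEY",  # the 'api_key' value of your Dictionary Connection
        prompt_template="What is the capital of [[Country]]?",
        new_column="Capital",
    )
    print(enriched)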