Insert Documents (Milvus) (Generative Models)

Synopsis

Inserts data rows as documents into a collection of the Milvus vector database.

Description

Inserts all data rows as documents into a collection of the Milvus vector database. You need to specify a collection name and the column that contains the embedding for each document (as a comma-separated list of values). The size of these embedding vectors must match the vector size of the target collection, which was specified when the collection was created. All columns except the embedding column become part of the documents added to the collection. For example, if you have two additional columns called "Id" and "Text", then their contents become the documents stored in the collection under each embedding vector.
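The split between the embedding column and the remaining document columns can be sketched in a few lines of Python. This is only an illustration of the rule described above, not the operator's actual implementation; the helper names `parse_embedding` and `row_to_entity` and the column name "Embedding" are hypothetical.

```python
from typing import Any


def parse_embedding(text: str) -> list[float]:
    """Turn a comma-separated string such as "0.1, 0.2" into a list of floats."""
    return [float(part) for part in text.split(",")]


def row_to_entity(row: dict[str, Any], embeddings_column: str) -> tuple[list[float], dict[str, Any]]:
    """Split one data row into its embedding vector and its document fields."""
    vector = parse_embedding(row[embeddings_column])
    # Every column except the embedding column becomes part of the document.
    document = {name: value for name, value in row.items() if name != embeddings_column}
    return vector, document


# Hypothetical example row with the "Id" and "Text" columns mentioned above.
row = {"Id": 1, "Text": "Milvus stores vectors.", "Embedding": "0.1, 0.2, 0.3"}
vector, document = row_to_entity(row, "Embedding")
```

Here `vector` holds the three parsed values and `document` holds only the "Id" and "Text" entries.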

Input

  • data (Data table)

    A data set with one embedding column and at least one additional column for the document information.

  • connection (Connection)

    A Dictionary Connection to a Milvus vector database.

Output

  • data (Data table)

    The input data, passed through as output.

  • connection (Connection)

    The input connection, passed through as output.

Parameters

  • collection The name of the collection into which the documents are inserted. The vector size of the collection must equal the size of the provided embeddings.
  • embeddings column The column in your data containing the embedding for each document. All other columns become the documents inserted into the collection. Each value in the embeddings column must be a comma-separated list of embedding values, and the number of values must equal the vector size of the collection to which the documents are added.
  • conda environment The conda environment used for this task. Please refer to the extension documentation for additional details on this and on version requirements for Python and all used packages in this environment.
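
Because a size mismatch between the embeddings and the collection prevents the insert, it can help to check the embeddings column before running the operator. The following is a minimal sketch of such a check, assuming the embeddings are available as a list of comma-separated strings; the function name `find_size_mismatches` is hypothetical and not part of the extension.

```python
def find_size_mismatches(embeddings: list[str], vector_size: int) -> list[int]:
    """Return the positions of embedding strings whose value count differs from vector_size."""
    bad = []
    for index, text in enumerate(embeddings):
        # Count the comma-separated values without converting them to numbers.
        count = len(text.split(","))
        if count != vector_size:
            bad.append(index)
    return bad


# Hypothetical embeddings column; the second entry has only two values.
column = ["0.1,0.2,0.3", "0.4,0.5", "0.6,0.7,0.8"]
mismatches = find_size_mismatches(column, vector_size=3)
```

An empty result means every row has the number of values the collection expects.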

Tutorial Processes

Inserting documents into a collection using embedding vectors

This tutorial loads some sample data and creates a new embeddings column from the input text documents. The documents and their embedding vectors are then passed to the Insert Documents operator. Each document is stored at the position of its embedding and can be retrieved later. Please note that this tutorial only works if a Milvus database is running and the database connection is provided as input. The connection must be a Dictionary Connection with the keys 'uri' and 'token'.
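
For readers who want to see what the 'uri' and 'token' keys correspond to outside the operator, here is a hedged sketch using the pymilvus client. It assumes pymilvus is installed in the environment, that the collection already exists, and that the field names in each entity match the collection schema. The helper names and the placeholder connection values are hypothetical, and the operator itself may work differently.

```python
def check_connection(connection: dict) -> dict:
    """Return only the 'uri' and 'token' entries, raising if either is missing."""
    missing = [key for key in ("uri", "token") if key not in connection]
    if missing:
        raise KeyError(f"connection is missing keys: {missing}")
    return {"uri": connection["uri"], "token": connection["token"]}


def insert_entities(connection: dict, collection: str, entities: list[dict]):
    """Insert entities into an existing collection (requires a running Milvus)."""
    # Imported lazily so check_connection works without pymilvus installed.
    from pymilvus import MilvusClient

    client = MilvusClient(**check_connection(connection))
    return client.insert(collection_name=collection, data=entities)


# Placeholder values; replace them with the details of your own Milvus instance.
connection = {"uri": "http://localhost:19530", "token": "root:Milvus"}
```

Calling `insert_entities(connection, "my_collection", entities)` would then send the entities to the database, where "my_collection" is again a placeholder name.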