You are viewing the RapidMiner Studio documentation for version 9.2.

Loop Azure Blob Storage (Cloud Connectivity)

Synopsis

This operator loops over all files in the specified container/folder in Microsoft Azure Blob Storage.

Description

After you have configured your Azure Blob Storage account, you can process all Azure Blob Storage files within the selected folder.

Be aware that the operator does not read the file as an example set itself. For this reason, you must connect the file input in the inner process of this operator to an appropriate operator that processes the file. For example, if you want to load Excel files from your Azure Blob Storage folder, you must connect the file input in the inner process to the Read Excel operator.
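
The division of labor described above can be illustrated outside RapidMiner with a minimal Python sketch. It is only an analogy, not RapidMiner's API: the names `loop_files` and `read_csv` are hypothetical, the files are hard-coded byte strings rather than blobs fetched from Azure, and a CSV reader stands in for Read Excel so that the example needs nothing beyond the standard library.

```python
import csv
import io
from typing import Callable, Iterable


def read_csv(raw: bytes) -> list[dict[str, str]]:
    """Parse raw CSV bytes into rows. Stands in for a reader operator such as Read Excel."""
    return list(csv.DictReader(io.StringIO(raw.decode("utf-8"))))


def loop_files(
    files: Iterable[tuple[str, bytes]],
    inner_process: Callable[[bytes], object],
) -> list[object]:
    """Hand each raw file to the inner process and collect whatever it returns.

    The loop itself never parses the file; that is the inner process's job.
    """
    return [inner_process(raw) for _name, raw in files]


# Hypothetical sample data in place of files stored in Azure Blob Storage.
files = [
    ("myfolder/a.csv", b"id,value\n1,x\n"),
    ("myfolder/b.csv", b"id,value\n2,y\n"),
]
tables = loop_files(files, read_csv)
```

Swapping `read_csv` for a different reader changes how each file is interpreted without touching the loop, which mirrors connecting a different operator to the file input.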

Input

  • in (IOObject)

    Optional input data which is delivered to the inner process.

Output

  • out (IOObject)

    Output data of the inner process.

Parameters

  • connection Specifies the Azure Blob Storage connection to use. If you have already configured an Azure Blob Storage connection, you can select it from the drop-down list. If you have not configured an Azure Blob Storage connection yet, select the icon to the right of the drop-down list and create a new Azure Blob Storage connection in the Manage connections box. The account name and account key are required. Range: configurable
  • folder Provide the name of the Azure Blob Storage 'folder' over which you want to loop. Note that the concept of folders does not exist in Azure Blob Storage, so the default delimiter ('/') is used to represent them. If your file was stored as 'name1/name2/my_file.xls' on Azure Blob Storage, the file 'my_file.xls' would be displayed as residing in the folder 'name1/name2/'. Range: selection
  • filter Optional regular expression used to filter the files that the loop iterates over, e.g. 'a.*b' for all files starting with 'a' and ending with 'b'. Ignored if empty. Range: string
  • filtered_string Indicates which part of the file name is matched against the filter expression.
    • file_name: Filtered on the name, e.g. 'myfolder/myfile.txt'
    • full_path: Filtered on the full path, e.g. 'mycontainer/myfolder/myfile.txt'
    • parent_path: Filtered on the parent folder, e.g. 'myfolder/'
    Range: selection
  • file_name_macro The name of the macro which will contain the name of the current file for each file the loop iterates over, e.g. 'myfolder/myfile.txt'. Range: string
  • file_path_macro The name of the macro which will contain the full path of the current file for each file the loop iterates over, e.g. 'mycontainer/myfolder/myfile.txt'. Range: string
  • parent_path_macro The name of the macro which will contain the parent folder of the current file for each file the loop iterates over, e.g. 'myfolder/'. Range: string
  • recursive If selected, the loop will also iterate over all files in all subfolders of the selected folder. Otherwise, it will only iterate over the files in the selected folder. Range: boolean
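
To make the interplay of folder, filter, filtered_string, recursive and the three macros more concrete, here is a minimal Python sketch that mimics the selection logic on a plain list of blob names. It is not the operator's implementation. The names `LoopItem`, `split_blob_name` and `loop_blobs` are hypothetical, and two behaviors are assumptions rather than documented facts: the filter is matched against the whole string (as the 'a.*b' example suggests), and files that match are the ones kept in the loop. The three derived values follow the examples given in the parameter list above.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class LoopItem(NamedTuple):
    """Values the three macros would hold for one file (per the examples above)."""
    file_name: str    # e.g. 'myfolder/myfile.txt'
    full_path: str    # e.g. 'mycontainer/myfolder/myfile.txt'
    parent_path: str  # e.g. 'myfolder/'


def split_blob_name(container: str, blob_name: str) -> LoopItem:
    """Derive virtual folder information from a flat blob name using '/'."""
    head, sep, _ = blob_name.rpartition("/")
    # For a blob without any '/', head and sep are both empty.
    return LoopItem(blob_name, f"{container}/{blob_name}", head + sep)


def loop_blobs(
    container: str,
    blob_names: Iterable[str],
    folder: str = "",
    filter_regex: str = "",
    filtered_string: str = "file_name",
    recursive: bool = False,
) -> Iterator[LoopItem]:
    """Yield one LoopItem per blob selected by the parameters."""
    if filtered_string not in LoopItem._fields:
        raise ValueError(f"unknown filtered_string: {filtered_string}")
    # Normalize the virtual folder to a prefix ending in '/'.
    prefix = folder if not folder or folder.endswith("/") else folder + "/"
    # An empty filter is ignored.
    pattern = re.compile(filter_regex) if filter_regex else None
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if not rest:
            continue  # skip a placeholder entry for the folder itself
        if not recursive and "/" in rest:
            continue  # the blob lives in a virtual subfolder
        item = split_blob_name(container, name)
        # ASSUMPTION: whole-string match, and matching files are kept.
        if pattern and not pattern.fullmatch(getattr(item, filtered_string)):
            continue
        yield item


if __name__ == "__main__":
    # Hypothetical blob names; a real run would list them from the container.
    blobs = ["myfolder/myfile.txt", "myfolder/sub/other.xls", "root.csv"]
    for item in loop_blobs("mycontainer", blobs, folder="myfolder", recursive=True):
        print(item)
```

Because folders are only a naming convention, the sketch never asks the storage for a directory listing; it filters flat names by prefix and treats any remaining '/' as a subfolder boundary.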