You are viewing the RapidMiner Applications documentation for version 2024.0.

Altair Monarch extension

Currently, the Altair Monarch extension is only compatible with

The extension does not work with Altair AI Hub or on other operating systems.

This document provides a tutorial for using the Altair Monarch extension in Altair AI Studio.

Altair® Monarch® is a market-leading, desktop-based, self-service data preparation solution. It helps you connect to multiple data sources, including cloud-based and big data sources, and to cleanse and manipulate both structured and unstructured data. No coding is required.

The Altair Monarch extension for AI Studio provides three operators:

  • Execute Monarch,
  • Set Monarch Options, and
  • Open Monarch Workspace.

For installation instructions, see Adding Extensions.

System requirements

The system requirements are the following:

Notes on Altair Units

The Altair Monarch extension runs the Altair Monarch command-line tool in the back end. The command-line tool checks out the Altair Units and releases them once the execution is completed.

If you are using Altair AI Studio together with an Altair Units license, you should be able to use the same license configuration.

Tutorial

The following tutorial demonstrates how to use Altair Monarch operators for loading, configuring, and running an Altair Monarch Workspace.

Look at the process below. A PDF file containing semi-structured data is loaded. The data is extracted into a table and exported into an Excel file. Finally, the resulting Excel file is imported via the Read Excel operator, and the data is shown in the Results view.

Use Monarch to create Excel from PDF

Let’s look at the process and the operators involved. The Open File operator loads an Altair Monarch workspace file and pipes it into the Open Monarch Workspace operator. To function, all other Altair Monarch operators must be nested inside an Open Monarch Workspace operator. Double-click this operator to access its subprocess.

Open Monarch Workspace subprocess

The actual call to Altair Monarch Data Prep Studio happens via the Execute Monarch operator, which is configured by the Set Monarch Options operators.

The first Set Monarch Options operator overwrites the file path of the input PDF file in the workspace. This is useful, for example, if you have a workspace that you want to apply to different inputs on the fly. In our example, we want to apply the workspace to the current month's data, which we hardcoded to February.

Set Monarch Options parameters 1
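The idea of applying one workspace to a different input on each run can be sketched outside AI Studio. The following Python snippet is a minimal illustration and not part of the extension: the folder layout, the file-naming pattern, and the function name `input_path_for_month` are assumptions made for this example. It only builds the path string that you would then enter as the value in the Set Monarch Options operator.

```python
from datetime import date
from pathlib import PurePosixPath


def input_path_for_month(base_dir: str, day: date) -> str:
    """Build the path of the PDF report for the month that contains `day`.

    Hypothetical naming scheme: <base_dir>/report_<YYYY>_<MM>.pdf
    """
    file_name = f"report_{day.year:04d}_{day.month:02d}.pdf"
    return str(PurePosixPath(base_dir) / file_name)


# Hardcoded to February, as in the tutorial; pass date.today() for the current month.
print(input_path_for_month("/data/reports", date(2024, 2, 1)))
```

Keeping the naming scheme in one function means only the date has to change between runs, which mirrors the single parameter that is overwritten in the workspace.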

The second Set Monarch Options operator tells Monarch to export the "CurrentMonth" table to the table "table1" of an Excel file called CurrentMonth.xlsx. Monarch looks at the file extension (".xlsx") of the file path and infers that Excel is the desired output file type. To export to another supported file format, change the file extension.

Set Monarch Options parameters 2
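Inferring the output type from the file extension can be pictured as a simple lookup. The Python sketch below is only an illustration of that idea, not Monarch's actual logic: the function name `infer_export_format` and the contents of the mapping are assumptions, and the formats Monarch really supports are documented in the operator description.

```python
from pathlib import PurePath

# Hypothetical mapping for illustration only; it is not the list of
# formats that Monarch supports.
EXTENSION_TO_FORMAT = {
    ".xlsx": "Excel",
    ".csv": "CSV",
}


def infer_export_format(file_path: str) -> str:
    """Return the export format implied by the file extension (case-insensitive)."""
    suffix = PurePath(file_path).suffix.lower()
    if suffix not in EXTENSION_TO_FORMAT:
        raise ValueError(f"No export format known for extension {suffix!r}")
    return EXTENSION_TO_FORMAT[suffix]


print(infer_export_format("CurrentMonth.xlsx"))  # Excel
```

An unknown extension raises an error in this sketch rather than falling back to a default, so a typo in the file name is noticed immediately.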

After running the process, switch to the log view to see the Altair Monarch Data Prep Studio command-line call that was executed based on the process configuration.

Here, you can see the absolute paths of the files involved in this process. Open the Monarch workspace (dpwx) file and the PDF input file to get an overview of how the Monarch workspace is configured.

Monarch log view

Additional information

Other Monarch options are available from the "type" dropdown of the Set Monarch Options operator. The operator description provides details on all the possible options.

You can add as many options to a Monarch run as appropriate. For one run, you can define multiple input file paths and exports.

The Execute Monarch operator will provide additional ports as needed. Alternatively, the options ports also accept collections. Therefore, you can group your parameters to clean up your processes or even use loop operators and macros to create collections of options automatically.
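Building a collection of options in a loop can be sketched as follows. This Python snippet is a conceptual model only: `MonarchOption`, `build_export_options`, the "export" label, the output folder, and the "PreviousMonth" table are all hypothetical names chosen for the illustration. In AI Studio, the same effect is achieved with loop operators and macros rather than with code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonarchOption:
    """Hypothetical stand-in for one Set Monarch Options configuration."""
    option_type: str  # illustrative label, e.g. "export"
    value: str        # illustrative value, here the export file path


def build_export_options(table_names: list[str], out_dir: str) -> list[MonarchOption]:
    """Create one export option per table, each writing to its own .xlsx file."""
    return [
        MonarchOption("export", f"{out_dir}/{name}.xlsx")
        for name in table_names
    ]


# One collection holding an export option for every table of interest.
options = build_export_options(["CurrentMonth", "PreviousMonth"], "/data/out")
for opt in options:
    print(opt.option_type, opt.value)
```

Grouping the options in a single collection corresponds to connecting one collection to an options port instead of wiring each option separately.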

The Open Monarch Workspace operator extracts metadata from the workspace and provides it to the inner operators to suggest table names, named exports, and runtime parameter names. Therefore, it is recommended that you connect its workspace port before configuring the subprocess. Please note that it can take some time to extract the metadata. You can inspect the progress at the bottom right of the AI Studio window.

For advanced users: The "command line argument" option type lets you directly specify one of the many command-line arguments the Altair Monarch Data Prep Studio Command Line Interface accepts.