You are viewing the RapidMiner Hub documentation for version 10.1.
Upgrade from AI Hub 9
For a more streamlined approach to migration, including a migration script, see the docker compose-based approach described in Upgrade from 9.10.11 to 10.0.0.
The document that follows describes the migration in more detail, but it does not include a migration script.
You need to be on at least AI Hub version 9.10.4 and use a containerized setup.
This document outlines migration instructions for instances coming from AI Hub version >= 9.10.4 (minimum requirement).
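The minimum version requirement can be checked with a small helper. The following is only a sketch: the function names are hypothetical, and it assumes the installed version is available as a plain dotted string such as 9.10.4. Comparing the parts as integers matters, because a plain string comparison would wrongly rank 9.9.x above 9.10.x.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as '9.10.4' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


# Minimum AI Hub version stated by this guide.
MINIMUM_VERSION = parse_version("9.10.4")


def meets_minimum(installed: str) -> bool:
    """Return True if the installed version is at least 9.10.4."""
    return parse_version(installed) >= MINIMUM_VERSION
```

For example, meets_minimum("9.10.11") is True, while meets_minimum("9.9.2") is False.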
Migration is not enabled by default: the required migration properties need to be enabled explicitly for each application (AI Hub/Server, Job Agent, Scoring Agent).
The updated deployment descriptor files have reasonable defaults that work out of the box, but old data sources, such as the old home directories of AI Hub and of the Job Agents, need to be accessible to the new setup. This means that the old data and database must still be available and running during migration.
Please read this documentation in full before starting!
None of the AI Hub 10 migration steps alter existing data. Instead, they copy data from your old instance and its database to the new instance and its new database. This way, a failed migration will not affect your existing installation, and you can easily roll back to the state before AI Hub 10 by simply starting up the old instance.
Keep in mind that migration might take quite some time, as data is actually being copied. The time needed depends on the number and size of all your Projects, as well as on the number of jobs, schedules and queues.
This guide will not cover how to set up new Job Agents, but if you had multiple queues and multiple Job Agents or Scoring Agents attached to them before, you need to re-create the same setup in the new deployment by adapting the deployment's descriptor files, e.g. the docker-compose.yml and .env files.
The following sections outline steps that we strongly advise you to complete before starting the migration to AI Hub/Server 10.
Preparation steps useful in AI Hub 9
AI Hub 10 drops legacy web services, the legacy Repository and Web Apps. Only Scoring Agents, Projects and Dashboards are available. We recommend performing some pre-migration steps in your current installation before actually upgrading to version 10. This will ultimately make the migration process easier and more automated.
The following sections outline steps you should take upfront in your old instance. This also gives you a fresh, cleaned-up baseline for the migration and for your future work with AI Hub, utilizing the full power and synergies of collaboration with Projects.
If you choose not to do the steps pre-migration, post-migration will be more complex.
Moving to Projects from Legacy Repository
Projects are the new default for managing your assets and data. Please refer to the Projects section for how to create Projects and add contents to them. This section outlines two approaches for moving to Projects before starting the major version upgrade to AI Hub 10:
- utilizing RapidMiner Studio or
- using AI Hub/Server legacy repository dump functionality.
For each of these approaches, your users need to later re-create any vault entries for injected parameters of Connections! Legacy repository and Project vaults are different systems.
In contrast to the legacy repository, Projects are self-contained and designed to encapsulate use cases or a project itself. Please think about how your current legacy repository structure might fit into this concept and keep in mind that cross-Project sharing of assets or data is not supported. In addition, access is granted per Project and not on a file and folder basis.
We recommend against moving all contents to a single Project!
There's also an extension available for RapidMiner Studio which helps you to set up a proper Project structure. It's called Projects on Marketplace.
With RapidMiner Studio
- Create the desired Project structure in AI Hub 9 and assign access rights
- Connect from RapidMiner Studio 9 to your legacy repository
- Clone the Projects you've created in RapidMiner Studio 9
- In RapidMiner Studio 9, copy desired legacy repository contents needed for your Projects to the respective newly cloned Project and create Snapshots to push the contents to AI Hub/Server
Keep in mind that users who have assigned injected values for Connections need to enter their credentials inside the Project's contents interface again. This step is optional, because it can also be done after the migration has succeeded.
In addition, you might need to look into each process and ensure that it's using the correct paths for storing and retrieving assets.
With Server dump functionality
This approach comes with the major downside of not dumping Connections, but it can still serve your needs for moving other assets and data to a Project. Please consider the conceptual changes and what a Project is, as described above.
- Create the desired Project structure in AI Hub 9 and assign access rights
- Go to the Repository web interface of AI Hub/Server 9 and
  - Click on Download in the right menu
  - Click on ZIP dump, which creates a ZIP file containing the contents of the folder you're currently browsing
- Go to the desired Project's contents page, click Add Content, select the previously downloaded ZIP file and click the Add button
As outlined above, Connections are not dumped, as this would pose a security risk. Please re-create them manually inside RapidMiner Studio 9 or copy them over as outlined in the previous Moving to Projects via RapidMiner Studio section.
In addition, you might need to look into each process and ensure that it's using the correct paths for storing and retrieving assets.
Moving to Scoring Agents
After moving to Projects, you can create Deployments from the AI Hub/Server web interface when browsing the contents of a Project and add them to your Scoring Agents.
Please refer to the Scoring Agents documentation for how to create and deploy deployments.
Moving existing web services to Scoring Agents affects how your process takes input and produces output. For further information, please consult the example creation of a deployment and its process inside the Scoring Agents section.
Moving to Dashboards
Dashboards are backed by Grafana and can point to any Scoring Agent instance. Please refer to the Dashboards and Scoring Agent sections for how to create deployments and wire them into Dashboards.
Requirements
- Your old instance is on version 9.10.4 or later
- Your old instance is preferably on a platform/docker setup
- Ensure that your environment has enough disk space left, because existing source migration data is not altered, but copied
  - Projects data including large files
  - Jobs data from Job Agents
  - Deployment data from Scoring Agents
- Old home directories and old databases are still available, because migration needs to access them
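Because data is copied rather than moved, it can help to estimate the required disk space up front. The sketch below sums the size of the old home directories and compares it with the free space on the target file system. The function names and the safety margin are hypothetical, and the estimate does not cover the size of the new database.

```python
import os
import shutil


def directory_size(path: str) -> int:
    """Sum the sizes in bytes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symbolic links so that link targets are not counted.
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def has_room_for_copy(sources: list[str], target: str, margin: float = 1.2) -> bool:
    """Check whether the file system of target can hold a copy of all sources.

    margin is a hypothetical safety factor, not a documented value.
    """
    needed = sum(directory_size(source) for source in sources)
    free = shutil.disk_usage(target).free
    return free >= needed * margin
```

Pass the old home directories of AI Hub/Server, the Job Agents and the Scoring Agents as sources, and the location of the new volumes as target.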
What data is migrated?
Before you start the migration, review the following table, which outlines what data is migrated to AI Hub 10.
Application | Data | Description |
---|---|---|
Job Agent | Jobs | The jobs directory (logs and error files distributed to AI Hub/Server for the web interface on demand) of the old home directory will be copied to the new home directory. Existing files will be overwritten. |
Job Agent | ID file | The .id.properties of the old home directory will be copied to the new home directory. Existing files will be overwritten. |
Scoring Agent | Deployments | The deployments directory of the old home directory will be copied to the new home directory. Existing files will be overwritten. |
AI Hub/Server | Permissions | The Permissions for Queues and Projects are migrated. The existence of users and groups is verified in the Keycloak service. |
AI Hub/Server | Queues | The Queues are migrated and created in the new system with the migrated permissions. If a Queue with the same name already exists, the Queue will not be migrated. If no suitable owner is found, the admin will become the owner. |
AI Hub/Server | Projects | The Projects are migrated and copied into the home directory of the new system. If a Project with the same name already exists, the Project will not be migrated. If no suitable owner is found, the admin will become the owner. |
AI Hub/Server | Injectable Values | The injectable values are migrated if the secret used for encrypting them is valid. |
AI Hub/Server | Schedules | The schedules are migrated. |
AI Hub/Server | Job Agents | The job agents saved in the old database table JOBSERVICE_JOB_AGENT will be migrated into the new database table. |
AI Hub/Server | Jobs | The jobs residing in table JOBSERVICE_JOB will be migrated except for running jobs. Pending jobs will be scheduled. All related job errors, contexts and logs will also be migrated. |
AI Hub/Server | Archived Jobs | All jobs residing in the database table A_JOBSERVICE_JOB will be migrated. All related job errors, contexts and logs will also be migrated. |
AI Hub/Server | Users | All local users will be migrated and stored inside Keycloak. External users will not be migrated. |
AI Hub/Server | Groups | All groups will be migrated. If the group is mirrored, there will be additional logging as to what external group the local group was mirrored to. |
Users and Groups
While migrating users and groups, you may encounter some additional logging if a user or group was externally managed, that is, by Keycloak or LDAP. External users are not migrated. External groups are still migrated, and the corresponding LDAP/SAML groups are logged as well so that you can replicate them manually.
At the end of a successful user and group migration, a result file named migrated_users.csv is created inside the rapidminer-server-home directory. This CSV file includes all users and their new, temporary passwords. For security reasons, old passwords are not migrated. When migrated users log into AI Hub/Server for the first time, they are asked to change their password.
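To hand out the temporary passwords, the result file can be read with any CSV tool. The following sketch uses Python's csv module. The column layout of migrated_users.csv is not described here, so the code assumes only a comma-separated file and returns raw rows without interpreting them; the function name is hypothetical.

```python
import csv


def read_migrated_users(path: str) -> list[list[str]]:
    """Return every non-empty row of the CSV file as a list of strings.

    No column names are assumed; check the real file for its layout.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return [row for row in csv.reader(handle) if row]
```

Since the file contains passwords, treat it and anything derived from it as sensitive.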
What data is NOT migrated?
Legacy Repository contents including vault entries, Web Services and Web Apps are not migrated, because support for them has been dropped. Please refer to the prior section on how to move to the latest concepts with Scoring Agents, Projects and Dashboards.
Start migration
To ensure consistency, it's advised to follow these steps in your old instance.
- In the running AI Hub/Server 9 instance
  - Pause all schedules so that no new jobs are submitted
  - Ensure that no jobs are running or pending; if there are any, stop them manually on the Executions page
- Shut down the AI Hub 9 instance
- Make a proper backup
- Ensure that your environment has enough disk space, because existing source migration data is not altered, but copied
  - Projects data including large files
  - Jobs data from Job Agents
  - Deployment data from Scoring Agents
Mirroring deployments and application descriptors
The default deployment descriptors of AI Hub 10, like docker-compose.yml, assume one Job Agent and one Scoring Agent.
If you added further Job Agents or Scoring Agents in your old instance, your new version 10 deployment descriptor and environment need to reflect those changes to fully mirror the old behavior. Please change the deployment descriptor files accordingly.
You can look up any property inside the respective settings tables for AI Hub/Server, Job Agents and Scoring Agents.
Example
Depending on your previous setup of AI Hub, the required environment variables may differ. Please refer to the migration property tables for detailed information about each of them; the example below should serve as a good starting point.
The migration environment variables must be applied to your deployment's descriptor file according to the application you want to migrate.
AI Hub/Server
Please ensure that the old home directory is properly bound/mounted to /migration/old-home in the deployment descriptor file, e.g. rm-server-home-vol:/migration/old-home when coming from the default docker-compose 9.x setup.
In this example, PostgreSQL has been used in the old instance. Please refer to the database properties section for other database types. Your old database probably needs to join the new setup's network in order to be accessible via <address-or-service-name-of-old-database>.
SPRING_PROFILES_ACTIVE=default,migration10,migration10-postgres
MIGRATION_DATABASE_DATASOURCE_URL=jdbc:postgresql://<address-or-service-name-of-old-database>:5432/rm_server
MIGRATION_DATABASE_DATASOURCE_USERNAME=<your-old-database-username>
MIGRATION_DATABASE_DATASOURCE_PASSWORD=<your-old-database-password>
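The variables above can be sanity-checked before the container is started. The following sketch is a hypothetical preflight helper, not part of AI Hub: it only verifies that the profiles and data source variables from this example are present in a dictionary of environment variables, for instance os.environ or the parsed contents of your .env file.

```python
REQUIRED_VARIABLES = (
    "MIGRATION_DATABASE_DATASOURCE_URL",
    "MIGRATION_DATABASE_DATASOURCE_USERNAME",
    "MIGRATION_DATABASE_DATASOURCE_PASSWORD",
)
REQUIRED_PROFILES = ("default", "migration10")


def check_migration_env(env: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means nothing obvious is missing."""
    problems = []
    raw_profiles = env.get("SPRING_PROFILES_ACTIVE", "")
    profiles = [profile.strip() for profile in raw_profiles.split(",")]
    for profile in REQUIRED_PROFILES:
        if profile not in profiles:
            problems.append(f"profile '{profile}' missing from SPRING_PROFILES_ACTIVE")
    # A database-specific profile such as migration10-postgres is also needed.
    if not any(profile.startswith("migration10-") for profile in profiles):
        problems.append("no database profile (migration10-<type>) in SPRING_PROFILES_ACTIVE")
    for name in REQUIRED_VARIABLES:
        if not env.get(name):
            problems.append(f"{name} is not set")
    return problems
```

The check is purely syntactic. It cannot tell whether the old database is actually reachable from the new setup's network.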
Job Agent
If you have multiple Job Agents, you need to migrate all of them.
Please ensure that the old home directory is properly mounted to /migration/old-home in the deployment descriptor file.
JOB_AGENT_MIGRATION_ENABLE_PRE_X_MIGRATION=true
Scoring Agent
If you have multiple Scoring Agents, you need to migrate all of them.
Please ensure that the old home directory is properly mounted to /migration/old-home in the deployment descriptor file.
RTS_MIGRATION_ENABLE_PRE_X_MIGRATION=true
Advanced properties
The following tables outline in detail the properties which need to be set in your deployment to invoke the migration steps required for AI Hub/Server 10, Job Agents and Scoring Agents. You may need those properties to extend the example.
AI Hub 10 only supports containerized setups. You need to make your old home directory available at the default location outlined in the following table and ensure that user 2011 has access rights to it. You can also change the default location; in that case, the mounted volume or host bind needs to point to the changed location instead of the default location.
In addition, you need to set the MIGRATION_ENABLE_PRE_X_MIGRATION property to true.
Application | Property | Default | Description |
---|---|---|---|
AI Hub/Server | MIGRATION_ENABLE_PRE_X_MIGRATION | false (true in migration10 profile) | If migration tasks from pre X installations should be carried out |
AI Hub/Server | MIGRATION_ENABLE_PRE_X_DATABASE | false (true in migration10-dbtype profiles) | If migration tasks which need a connection to an old database should be carried out |
AI Hub/Server | MIGRATION_ENABLE_PRE_X_SAMPLE_MIGRATION | false | If migration of the sample projects should be carried out |
AI Hub/Server | MIGRATION_PRE_X_HOME_DIR | /migration/old-home | The home dir of pre AI Hub X installations |
AI Hub/Server | MIGRATION_PRE_X_REPOSITORY_DIRECTORY | /migration/old-home/data/repositories | The dir where repositories have been located inside the old home dir of pre AI Hub X installations |
AI Hub/Server | MIGRATION_DATABASE_DATASOURCE | none | The datasource configuration of the old AI Hub database. See documentation of main application database |
AI Hub/Server | MIGRATION_DATABASE_JPA | none | The JPA configuration of the old AI Hub database. See documentation of main application database |
AI Hub/Server | MIGRATION_QUARTZ | Pre X default values | The Quartz scheduler configuration of the old AI Hub database. If changes were done in the old scheduler configuration they need to be transposed to here as well |
Job Agent | JOB_AGENT_MIGRATION_ENABLE_PRE_X_MIGRATION | false | If migration tasks from pre X installations should be carried out |
Job Agent | JOB_AGENT_MIGRATION_PRE_X_HOME_DIR | /migration/old-home | The home dir of pre AI Hub X installations |
Scoring Agent | RTS_MIGRATION_ENABLE_PRE_X_MIGRATION | false | If migration tasks from pre X installations should be carried out |
Scoring Agent | RTS_MIGRATION_PRE_X_HOME_DIR | /migration/old-home | The home dir of pre AI Hub X installations |
AI Hub/Server
At the start of the application, when using a deployed docker image, migration will be performed. The migration will use the current home directory of AI Hub, which is set from the AIHUB_HOME_DIR environment variable. Migration tasks which take care of the migration from a pre AI Hub X home dir will only be executed when the MIGRATION_ENABLE_PRE_X_MIGRATION environment variable is set to true. The default dir of MIGRATION_PRE_X_HOME_DIR can be used as a mounting point for the old home directory. Migration tasks which use the old database will only be carried out if MIGRATION_ENABLE_PRE_X_DATABASE is set to true.
Database
For migration steps, which require the old database, the connection details can be defined with the known Spring Data values. You can set these values as environment variables before starting the migration. Because the database configuration varies between systems, no default values for the connection properties are provided.
If any changes were made to the Hibernate configuration of pre X AI Hub installations, you need to make sure to transpose them to the MIGRATION_DATABASE_JPA_PROPERTIES_HIBERNATE configuration.
Property | Description |
---|---|
MIGRATION_DATABASE_DATASOURCE_URL | The complete URL of the data source, e.g. jdbc:mysql://localhost:1456/velox?useSSL=false&useUnicode=yes&characterEncoding=UTF-8&allowPublicKeyRetrieval=true |
MIGRATION_DATABASE_DATASOURCE_USERNAME | The username of the database user |
MIGRATION_DATABASE_DATASOURCE_PASSWORD | The password of the database user |
MIGRATION_DATABASE_JPA_PROPERTIES_HIBERNATE_DIALECT | The Hibernate dialect of the old system, e.g. org.hibernate.dialect.MySQL57InnoDBDialect |
Please ensure that the old database joins the deployment’s network in order to be accessible, otherwise migration will fail.
Migrating from AI Hub/Server 9 instances using Oracle as database is not supported yet.
Depending on your old database, you need to enable different profiles with the SPRING_PROFILES_ACTIVE environment variable. Always provide the default and migration10 profiles and, in addition, the profile for your database type. Please see the example for a full setup of migration.
Property | Profile name |
---|---|
Main migration invoked | migration10 |
Postgres | migration10-postgres |
MySQL 5.7 | migration10-mysql-5-7 |
MySQL 8 | migration10-mysql-8 |
MSSQL 8 | migration10-mssql-8 |
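The profile table above can be turned into a small lookup that builds the value of SPRING_PROFILES_ACTIVE. This is only a sketch: the dictionary keys are hypothetical labels chosen for this example, while the profile names are taken from the table.

```python
# Keys are hypothetical labels; values are the profile names from the table.
DATABASE_PROFILES = {
    "postgres": "migration10-postgres",
    "mysql-5.7": "migration10-mysql-5-7",
    "mysql-8": "migration10-mysql-8",
    "mssql-8": "migration10-mssql-8",
}


def spring_profiles(database: str) -> str:
    """Build the SPRING_PROFILES_ACTIVE value for the given old database type."""
    if database not in DATABASE_PROFILES:
        # Oracle, for example, has no migration profile.
        raise ValueError(f"no migration profile for database type: {database}")
    return ",".join(("default", "migration10", DATABASE_PROFILES[database]))
```

For instance, spring_profiles("postgres") returns default,migration10,migration10-postgres, which matches the value used in the example.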
Users, groups and permissions
The migration of users, groups and permissions requires the new Keycloak instance, and this step uses the service account of aihub-backend.
You need to make sure that it has the realm-management -> manage-users role (for creating groups and users during migration).
Please refer to the special roles section for more information.
By default, the role should be applied to the service account.
Job Agent
At the start of the application, when using a deployed docker image, migration will be performed. The migration will use the current home directory of the Job Agent, which is set from the JOBAGENT_HOME_DIR environment variable.
Migration tasks which take care of the migration from a pre AI Hub X home dir will only be executed when the JOB_AGENT_MIGRATION_ENABLE_PRE_X_MIGRATION environment variable is set to true. The default dir of JOB_AGENT_MIGRATION_PRE_X_HOME_DIR can be used as a mounting point for older home dirs.
Scoring Agent
At the start of the application, when using a deployed docker image, migration will be performed. The migration will use the current home directory of the Scoring Agent, which is set from the SCORING_AGENT_HOME_DIR environment variable.
Migration tasks which take care of the migration from a pre AI Hub X home dir will only be executed when the RTS_MIGRATION_ENABLE_PRE_X_MIGRATION environment variable is set to true. The default dir of RTS_MIGRATION_PRE_X_HOME_DIR can be used as a mounting point for older home dirs.
Post migration
When the new upgraded version has started up and migration has been completed, there might be some steps you need to perform manually.
Legacy repository contents
If you have not moved to Projects in your old instance before starting the migration process, you'll notice that there are no legacy repository contents available in your new instance. Those are not migrated as support for legacy Repository has been dropped.
It’s advised to move to Projects before migration, but you can also migrate contents manually afterwards by utilizing different RapidMiner Studio versions, although this manual migration step involves more work.
Please read through the Moving to Projects from legacy repository section to see the differences between Projects and the legacy repository on a conceptual level.
Depending on your setup and on whether your old instance is available in parallel to the new instance, the steps needed differ slightly, but you can achieve the same result by performing the steps sequentially.
During the migration of legacy repository contents, please keep in mind that RapidMiner Studio 10 can only connect to AI Hub 10 instances and RapidMiner Studio 9 can only connect to AI Hub 9 instances. This means that you need to have different RapidMiner Studio versions installed on your system to perform the manual migration step.
- In RapidMiner Studio 9 and with the old AI Hub/Server 9 instance being up and running
  - Create a new Repository in RapidMiner Studio 9 (local, not connected to an AI Hub/Server instance)
  - Connect to the old AI Hub 9 instance
  - Copy contents to the freshly created Repository, so you have everything on your local disk
- Create necessary Projects in AI Hub/Server 10
- Switch to RapidMiner Studio 10 and
  - check out desired Projects
  - copy contents from (local) repository to the Project
Connection injected parameters and vault entries
If you have been working with Projects and created Connections containing injected parameters, your existing vault entries are migrated seamlessly.
If you have been working mostly with the legacy repository, where Connections reside in the top-level global /Connections folder, then you either moved to Projects before or after the major version upgrade.
In either case, chances are high that either your Connections are not present because you've used the dump approach, so you need to re-create them through RapidMiner Studio 10 with your AI Hub/Server 10 instance running, or your Connections were moved to Projects before. If the latter is the case, you need to ask users to add their values for any injected parameters manually, although they should notice quickly, as process execution will fail if required values are not present in the Project's vault.
Additional configuration and system settings
The System Settings web interface of RapidMiner AI Hub/Server 9 has been dropped in version 10. Its main responsibility was to configure the in-application execution of Web Services and Web Apps and, additionally, some parameters of the Server configuration itself.
Starting from version 10, the behavior of AI Hub/Server is configured through environment variables. This means that all properties containing rapidanalytics are no longer supported.
Please refer to the list of available environment variables in the configuration section of AI Hub/Server 10 to find what matches your explicitly changed system settings.
In contrast to the aforementioned properties, RapidMiner Studio related properties like rapidminer.python_scripting.python_binary or rapidminer.proxy.mode can be modified for AI Hub/Server, Job Agents and Scoring Agents by changing the rapidminer.properties file inside the respective application's home directory or volume.
Scoring Agents provide web service functionality. For securing them or allowing anonymous access, please refer to the Scoring Agent's documentation section.
Moving to Keycloak and adapting mirror groups
If you have been on a native installation of AI Hub/Server 9 and have been using direct LDAP or SAML integration, you need to look into the migration.log file for further instructions; for example, mappings of mirror groups are not created automatically. Please also look into the users and groups section and cross-check whether it matches your setup.
If you're already on the platform docker-based setup, then Keycloak migration is done automatically to adapt to changes in special roles and clients. You should cross-check configuration of externalized Identity Provider integration nevertheless.
Clean up deployment descriptor files
After the migration has succeeded, you can remove all related environment variables from your deployment descriptor file because they're no longer needed, e.g. SPRING_PROFILES_ACTIVE or any environment variable starting with MIGRATION_.
In addition, you can shut down your old instance and old database.
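The cleanup of migration variables can be scripted for a .env style file. The sketch below is a hypothetical helper that makes two assumptions: SPRING_PROFILES_ACTIVE was only set for the migration, and the agent variables prefixed with JOB_AGENT_MIGRATION_ and RTS_MIGRATION_ should be dropped as well. It handles NAME=value lines only, not the YAML syntax of docker-compose.yml.

```python
MIGRATION_PREFIXES = ("MIGRATION_", "JOB_AGENT_MIGRATION_", "RTS_MIGRATION_")
# Assumption: the profile variable was introduced only for the migration.
MIGRATION_NAMES = ("SPRING_PROFILES_ACTIVE",)


def strip_migration_variables(env_text: str) -> str:
    """Return env_text without migration-related NAME=value lines."""
    kept = []
    for line in env_text.splitlines():
        name = line.split("=", 1)[0].strip()
        # Comments and lines without '=' are always preserved.
        is_assignment = "=" in line and not line.lstrip().startswith("#")
        if is_assignment and (name.startswith(MIGRATION_PREFIXES) or name in MIGRATION_NAMES):
            continue
        kept.append(line)
    trailing_newline = "\n" if env_text.endswith("\n") else ""
    return "\n".join(kept) + trailing_newline
```

Review the result before replacing your real file, and keep a backup of the original descriptor files.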