In this document, you'll learn the purpose of Radoop Proxy and decide whether or not it is needed in your deployment.
Radoop Proxy lets you tunnel all Radoop connections through a single machine residing on the edge of your secure Hadoop cluster. Because clients connect only to that machine, fewer ports need to be opened on the firewall protecting the Hadoop cluster, which simplifies the network configuration.
Check out our detailed Radoop Proxy documentation for more info.
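To see which cluster endpoints are currently unreachable from a client machine (and would therefore need either an open firewall port or a proxy), a quick TCP reachability check can help. The following is a minimal sketch and not part of Radoop: the function names `reachable` and `blocked_endpoints` are hypothetical, and the host names and ports you pass in depend entirely on your own cluster.

```python
import socket


def reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def blocked_endpoints(endpoints, timeout=2.0):
    """Return the (host, port) pairs from `endpoints` that cannot be reached."""
    return [(host, port) for host, port in endpoints
            if not reachable(host, port, timeout)]
```

Run from a RapidMiner Studio machine with the list of service endpoints your cluster exposes, a long result suggests that many firewall ports would have to be opened for direct connections. With a proxy on the edge of the cluster, only the proxy's own endpoint has to be reachable.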
When should I deploy Radoop Proxy?
Radoop Proxy is useful in several situations. If the answer to ANY of the questions below is yes, you should deploy Radoop Proxy.
- Is there a firewall between the network that your Hadoop cluster is located on, and the network that users of RapidMiner Studio are on?
- Is there a firewall between the network that your Hadoop cluster is located on, and the network that your RapidMiner platform deployment is on?
- Is your Hadoop cluster deployed in a public cloud infrastructure, such as Amazon Elastic MapReduce?
Radoop Proxy may be deployed standalone, or as part of a RapidMiner platform deployment. We recommend the latter approach if there is no firewall between your RapidMiner platform deployment and your Hadoop cluster. Check out our deployment templates to get started.
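The checklist and the deployment recommendation above can be summarized as a small decision helper. This is an illustrative Python sketch only: the `Environment` class and both function names are hypothetical, and suggesting a standalone proxy when a firewall does separate the platform from the cluster is an assumption inferred from the recommendation, not something this page states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Environment:
    # One flag per question in the checklist.
    firewall_between_cluster_and_studio_users: bool
    firewall_between_cluster_and_platform: bool
    cluster_in_public_cloud: bool


def should_deploy_proxy(env: Environment) -> bool:
    """A yes to ANY of the three questions means the proxy should be deployed."""
    return any([
        env.firewall_between_cluster_and_studio_users,
        env.firewall_between_cluster_and_platform,
        env.cluster_in_public_cloud,
    ])


def suggested_deployment(env: Environment) -> Optional[str]:
    """Return 'platform', 'standalone', or None if no proxy is needed."""
    if not should_deploy_proxy(env):
        return None
    if not env.firewall_between_cluster_and_platform:
        # Recommended: run the proxy as part of the platform deployment.
        return "platform"
    # Assumption: with a firewall between platform and cluster,
    # a standalone proxy on the cluster edge is the remaining option.
    return "standalone"
```

For example, `suggested_deployment(Environment(True, False, False))` yields `"platform"`: Studio users sit behind a firewall, but the platform deployment can reach the cluster directly.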