Categories

Versions

You are viewing the RapidMiner Studio documentation for version 9.5 - Check here for latest version

Logistic Regression (SVM) (RapidMiner Studio Core)

Synopsis

This operator is a Logistic Regression Learner. It is based on the internal Java implementation of the myKLR by Stefan Rueping.

Description

This learner uses the Java implementation of the myKLR by Stefan Rueping. myKLR is a tool for large scale kernel logistic regression based on the algorithm of Keerthi etal (2003) and the code of mySVM. For compatibility reasons, the model of myKLR differs slightly from that of Keerthi etal (2003). As myKLR is based on the code of mySVM; the format of example files, parameter files and kernel definition are identical. Please see the documentation of the SVM operator for further information. This learning method can be used for both regression and classification and provides a fast algorithm and good results for many learning tasks. mySVM works with linear or quadratic and even asymmetric loss functions.

This operator supports various kernel types including dot, radial, polynomial, neural, anova, epachnenikov, gaussian combination and multiquadric. Explanation of these kernel types is given in the parameters section.

Input

  • training set (IOObject)

    This input port expects an ExampleSet. This operator cannot handle nominal attributes; it can be applied on data sets with numeric attributes. Thus often you may have to use the Nominal to Numerical operator before application of this operator.

Output

  • model (Model)

    The Logistic Regression model is delivered from this output port. This model can now be applied on unseen data sets.

  • weights (Average Vector)

    This port delivers the attribute weights. This is only possible when the dot kernel type is used, it is not possible with other kernel types.

  • example set (IOObject)

    The ExampleSet that was given as input is passed without changing to the output through this port. This is usually used to reuse the same ExampleSet in further operators or to view the ExampleSet in the Results Workspace.

Parameters

  • kernel_typeThe type of the kernel function is selected through this parameter. Following kernel types are supported: dot, radial, polynomial, neural, anova, epachnenikov, gaussian combination, multiquadric
    • dot: The dot kernel is defined by k(x,y)=x*y i.e. it is inner product of x and y.
    • radial: The radial kernel is defined by exp(-g ||x-y||^2) where g is the gamma, it is specified by the kernel gamma parameter. The adjustable parameter gamma plays a major role in the performance of the kernel, and should be carefully tuned to the problem at hand.
    • polynomial: The polynomial kernel is defined by k(x,y)=(x*y+1)^d where d is the degree of polynomial and it is specified by the kernel degree parameter. The polynomial kernels are well suited for problems where all the training data is normalized.
    • neural: The neural kernel is defined by a two layered neural net tanh(a x*y+b) where a is alpha and b is the intercept constant. These parameters can be adjusted using the kernel a and kernel b parameters. A common value for alpha is 1/N, where N is the data dimension. Note that not all choices of a and b lead to a valid kernel function.
    • anova: The anova kernel is defined by raised to power d of summation of exp(-g (x-y)) where g is gamma and d is degree. gamma and degree are adjusted by the kernel gamma and kernel degree parameters respectively.
    • epachnenikov: The epachnenikov kernel is this function (3/4)(1-u2) for u between -1 and 1 and zero for u outside that range. It has two adjustable parameters kernel sigma1 and kernel degree.
    • gaussian_combination: This is the gaussian combination kernel. It has adjustable parameters kernel sigma1, kernel sigma2 and kernel sigma3.
    • multiquadric: The multiquadric kernel is defined by the square root of ||x-y||^2 + c^2. It has adjustable parameters kernel sigma1 and kernel sigma shift.
    Range: selection
  • kernel_gammaThis is the SVM kernel parameter gamma. This is only available when the kernel type parameter is set to radial or anova. Range: real
  • kernel_sigma1This is the SVM kernel parameter sigma1. This is only available when the kernel type parameter is set to epachnenikov, gaussian combination or multiquadric. Range: real
  • kernel_sigma2This is the SVM kernel parameter sigma2. This is only available when the kernel type parameter is set to gaussian combination. Range: real
  • kernel_sigma3This is the SVM kernel parameter sigma3. This is only available when the kernel type parameter is set to gaussian combination. Range: real
  • kernel_shiftThis is the SVM kernel parameter shift. This is only available when the kernel type parameter is set to multiquadric. Range: real
  • kernel_degreeThis is the SVM kernel parameter degree. This is only available when the kernel type parameter is set to polynomial, anova or epachnenikov. Range: real
  • kernel_aThis is the SVM kernel parameter a. This is only available when the kernel type parameter is set to neural. Range: real
  • kernel_bThis is the SVM kernel parameter b. This is only available when the kernel type parameter is set to neural. Range: real
  • kernel_cacheThis is an expert parameter. It specifies the size of the cache for kernel evaluations in megabytes. Range: real
  • CThis is the SVM complexity constant which sets the tolerance for misclassification, where higher C values allow for 'softer' boundaries and lower values create 'harder' boundaries. A complexity constant that is too large can lead to over-fitting, while values that are too small may result in over-generalization. Range: real
  • convergence_epsilonThis is an optimizer parameter. It specifies the precision on the KKT conditions. Range:
  • max_iterationsThis is an optimizer parameter. It specifies to stop iterations after a specified number of iterations. Range: integer
  • scaleThis is a global parameter. If checked, the example values are scaled and the scaling parameters are stored for a test set. Range: boolean

Tutorial Processes

Introduction to the Logistic Regression operator

The 'Sonar' data set is loaded using the Retrieve operator. The Split Validation operator is applied on it for training and testing a regression model. The Logistic Regression operator is applied in the training subprocess of the Split Validation operator. All parameters are used with default values. The Logistic Regression operator generates a regression model. The Apply Model operator is used in the testing subprocess to apply this model on the testing data set. The resultant labeled ExampleSet is used by the Performance operator for measuring the performance of the model. The regression model and its performance vector are connected to the output and it can be seen in the Results Workspace.