Support Vector Machine (PSO) (RapidMiner Studio Core)

Synopsis

This operator is a Support Vector Machine (SVM) Learner which uses Particle Swarm Optimization (PSO) for optimization. PSO is a computational method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality.

Description

This operator implements a hybrid approach which combines support vector classifier with particle swarm optimization, in order to improve the strength of each individual technique and compensate for each other's weaknesses. Particle Swarm Optimization (PSO) is an evolutionary computation technique in which each potential solution is seen as a particle with a certain velocity flying through the problem space. Support Vector Machine (SVM) classification operates a linear separation in an augmented space by means of some defined kernels satisfying Mercer's condition. These kernels map the input vectors into a very high dimensional space, possibly of infinite dimension, where linear separation is more likely. Then a linear separating hyper plane is found by maximizing the margin between two classes in this space. Hence the complexity of the separating hyper plane depends on the nature and the properties of the used kernel.

Support Vector Machine (SVM)

Here is a basic description of the SVM. The standard SVM takes a set of input data and predicts, for each given input, which of the two possible classes comprises the input, making the SVM a non-probabilistic binary linear classifier. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples into one category or the other. An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on. For more information about SVM please study the description of the SVM operator.

Particle Swarm Optimization (PSO)

Particle swarm optimization (PSO) is a computational method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. PSO is a metaheuristic as it makes few or no assumptions about the problem being optimized and can search very large spaces of candidate solutions. However, metaheuristics such as PSO do not guarantee an optimal solution is ever found. More specifically, PSO does not use the gradient of the problem being optimized, which means PSO does not require that the optimization problem be differentiable as is required by most classic optimization methods. PSO can therefore also be used on optimization problems that are partially irregular, noisy, change over time, etc.

Input

  • training set (Data Table)

    This input port expects an ExampleSet. This operator cannot handle nominal attributes; it can be applied on data sets with numeric attributes. Moreover, this operator can only be applied on ExampleSets with binominal label.

Output

  • model (Kernel Model)

    The SVM model is delivered from this output port. This model can now be applied on unseen data sets.

  • example set (Data Table)

    The ExampleSet that was given as input is passed without changing to the output through this port. This is usually used to reuse the same ExampleSet in further operators or to view the ExampleSet in the Results Workspace.

Parameters

  • show_convergence_plotThis parameter indicates if a dialog with a convergence plot should be drawn. Range: boolean
  • kernel_typeThe type of the kernel function is selected through this parameter. Following kernel types are supported: dot, radial, polynomial, neural, anova, epachnenikov, gaussian combination, multiquadric
    • dot: The dot kernel is defined by k(x,y)=x*y i.e. it is inner product of x and y.
    • radial: The radial kernel is defined by exp(-g ||x-y||^2) where g is the gamma, it is specified by the kernel gamma parameter. The adjustable parameter gamma plays a major role in the performance of the kernel, and should be carefully tuned to the problem at hand.
    • polynomial: The polynomial kernel is defined by k(x,y)=(x*y+1)^d where d is the degree of polynomial and it is specified by the kernel degree parameter. The polynomial kernels are well suited for problems where all the training data is normalized.
    • neural: The neural kernel is defined by a two layered neural net tanh(a x*y+b) where a is alpha and b is the intercept constant. These parameters can be adjusted using the kernel a and kernel b parameters. A common value for alpha is 1/N, where N is the data dimension. Note that not all choices of a and b lead to a valid kernel function.
    • anova: The anova kernel is defined by raised to power d of summation of exp(-g (x-y)) where g is gamma and d is degree. gamma and degree are adjusted by the kernel gamma and kernel degree parameters respectively.
    • epachnenikov: The epachnenikov kernel is this function (3/4)(1-u2) for u between -1 and 1 and zero for u outside that range. It has two adjustable parameters kernel sigma1 and kernel degree.
    • gaussian_combination: This is the gaussian combination kernel. It has adjustable parameters kernel sigma1, kernel sigma2 and kernel sigma3.
    • multiquadric: The multiquadric kernel is defined by the square root of ||x-y||^2 + c^2. It has adjustable parameters kernel sigma1 and kernel sigma shift.
    Range: selection
  • kernel_gammaThis is the SVM kernel parameter gamma. This is available only when the kernel type parameter is set to radial or anova. Range: real
  • kernel_sigma1This is the SVM kernel parameter sigma1. This is available only when the kernel type parameter is set to epachnenikov, gaussian combination or multiquadric. Range: real
  • kernel_sigma2This is the SVM kernel parameter sigma2. This is available only when the kernel type parameter is set to gaussian combination. Range: real
  • kernel_sigma3This is the SVM kernel parameter sigma3. This is available only when the kernel type parameter is set to gaussian combination. Range: real
  • kernel_shiftThis is the SVM kernel parameter shift. This is available only when the kernel type parameter is set to multiquadric. Range: real
  • kernel_degreeThis is the SVM kernel parameter degree. This is available only when the kernel type parameter is set to polynomial, anova or epachnenikov. Range: real
  • kernel_aThis is the SVM kernel parameter a. This is available only when the kernel type parameter is set to neural. Range: real
  • kernel_bThis is the SVM kernel parameter b. This is available only when the kernel type parameter is set to neural. Range: real
  • CThis is the SVM complexity constant which sets the tolerance for misclassification, where higher C values allow for 'softer' boundaries and lower values create 'harder' boundaries. A complexity constant that is too large can lead to over-fitting, while values that are too small may result in over-generalization. Range: real
  • max_evaluationThis is an optimizer parameter. It specifies to stop evaluations after the specified number of evaluations. Range: integer
  • generations_without_improvalThis parameter specifies the stop criterion for early stopping i.e. it stops after n generations without improvement in the performance. n is specified by this parameter. Range: integer
  • population_sizeThis parameter specifies the population size i.e. the number of individuals per generation. Range: integer
  • inertia_weightThis parameter specifies the (initial) weight for the old weighting. Range: real
  • local_best_weightThis parameter specifies the weight for the individual's best position during run. Range: real
  • global_best_weightThis parameter specifies the weight for the population's best position during run. Range: real
  • dynamic_inertia_weightThis parameter specifies if the inertia weight should be improved during run. Range: boolean
  • use_local_random_seedThis parameter indicates if a local random seed should be used for randomization. Using the same value of local random seed will produce the same randomization. Range: boolean
  • local_random_seedThis parameter specifies the local random seed. This parameter is only available if the use local random seed parameter is set to true. Range: integer

Tutorial Processes

Introduction to the SVM (PSO) operator

The 'Ripley-Set' data set is loaded using the Retrieve operator. The Split Validation operator is applied on it for training and testing a classification model. The SVM (PSO) operator is applied in the training subprocess of the Split Validation operator. The SVM (PSO) operator is applied with default values of all parameters. The Apply Model operator is used in the testing subprocess for applying the model generated by the SVM (PSO) operator. The resultant labeled ExampleSet is used by the Performance (Classification) operator for measuring the performance of the model. The classification model and its performance vector are connected to the output and they can be seen in the Results Workspace. The accuracy of this model turns out to be around 85%.

Default values were used for most of the parameters. To get more reliable results these values should be carefully selected. Usually techniques like cross-validation are used to find the best values of these parameters for the ExampleSet under consideration.