Smote Upsampling (Operator Toolbox)

Synopsis

This operator implements the Synthetic Minority Over-sampling Technique (SMOTE) as proposed by Chawla et al., Journal of Artificial Intelligence Research 16 (2002), 321-357.

Description

In the first step, the ExampleSet is filtered so that only examples of the minority class are considered. Afterwards, the k nearest neighbours of each of these examples are determined. The algorithm then selects a random example and a random one of its nearest neighbours, and creates a new example that lies on the line between the two. This is repeated until the requested number of synthetic examples has been created.
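The steps above can be sketched in a few lines of Python. This is only an illustration of the general SMOTE idea for purely numeric data, not the operator's actual implementation: the function name smote_sketch, the brute-force Euclidean neighbour search, and the uniformly random position along the line are all assumptions made for this example.

```python
import math
import random


def smote_sketch(minority, k, n_new, seed=None):
    """Create n_new synthetic points from a list of numeric minority examples.

    Hypothetical helper for illustration; assumes the input already
    contains only examples of the minority class.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority examples")
    if k < 1:
        raise ValueError("k must be at least 1")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    # For every minority example, find its k nearest neighbours
    # (brute force, Euclidean distance).
    neighbours = []
    for i, p in enumerate(minority):
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(p, minority[j]))
        neighbours.append(others[:k])

    # Pick a random example and one of its neighbours, then place a new
    # point on the line between the two.
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        j = rng.choice(neighbours[i])
        gap = rng.random()  # assumption: uniform position along the line
        p, q = minority[i], minority[j]
        synthetic.append([a + gap * (b - a) for a, b in zip(p, q)])
    return synthetic


# Four minority examples on the corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_sketch(minority, k=2, n_new=5, seed=42)
```

With k=2, the nearest neighbours of each corner are its two adjacent corners, so every synthetic point in this example falls on an edge of the square.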

Input

  • exa (Data table)

    ExampleSet you want to upsample.

Output

  • ups (Data table)

    The original ExampleSet with the attached synthetic examples.

  • ori (Data table)

    The original ExampleSet.

Parameters

  • number of neighbours

    The number k of nearest neighbours that SMOTE considers for each example.

  • normalize

    If checked, a range transformation to [0,1] is performed so that attributes on different scales contribute comparably to the distance calculation.

  • equalize classes

    If activated, as many new examples as are needed to balance the classes are created.

  • upsampling size

    The number of examples you want to create.

  • auto detect minority class

    If activated, the class to upsample is the class with the fewest occurrences.

  • minority class

    The class you want to upsample.

  • round integers

    If checked, the values of integer attributes are rounded to integers.

  • nominal change rate

    The probability of changing a nominal value to the nominal value of its nearest neighbour.

  • use local random seed

    Indicates whether a local random seed should be used.

  • local random seed

    If the use local random seed parameter is checked, this parameter determines the local random seed.
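As a rough illustration of what auto detect minority class, equalize classes, and normalize describe, the following Python sketch computes the rarest class, the number of synthetic examples needed to balance it, and a [0,1] range transformation. The helper names are hypothetical, the class counts are made up, and the choice to fill the minority class up to the size of the largest class is an assumption; the operator's own behaviour may differ in detail.

```python
from collections import Counter


def detect_minority(labels):
    """Return the label with the fewest occurrences."""
    counts = Counter(labels)
    return min(counts, key=counts.get)


def examples_to_balance(labels, minority):
    """Number of new examples that lifts `minority` to the largest class size.

    Assumption: "balanced" means equal in size to the largest class.
    """
    counts = Counter(labels)
    return max(counts.values()) - counts[minority]


def range_transform(rows):
    """Rescale every numeric column to [0, 1]; constant columns map to 0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [max(c) - min(c) for c in columns]
    return [
        [(v - lo) / span if span else 0.0 for v, lo, span in zip(row, lows, spans)]
        for row in rows
    ]


# Made-up class distribution for illustration.
labels = ["Rock"] * 90 + ["Mine"] * 10
minority = detect_minority(labels)               # "Mine"
needed = examples_to_balance(labels, minority)   # 80

# Two attributes on very different scales end up on the same [0, 1] scale.
scaled = range_transform([[1.0, 200.0], [3.0, 400.0], [2.0, 300.0]])
```

Without the range transformation, the second attribute in this example would dominate a Euclidean distance simply because its values are larger.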

Tutorial Processes

Use SMOTE on imbalanced Sonar

In this tutorial we unbalance the Sonar data set with a Sample operator and then create synthetic examples to restore the class balance.