What is Data Preparation
Data preparation is the process of cleaning and transforming raw data into organized data that can be processed and analyzed further. During data preparation, data is reformatted, corrected, and combined to enrich it.
Why is Data Preparation required
Data preparation is complex, yet essential for creating relevant, contextual data. Well-prepared data makes analysis efficient and produces reliable, insightful results. Without careful data preparation, the data may be biased, which can lead to poor analysis and erroneous results.

How is Data Preparation done in rubiscape
rubiscape provides a comprehensive set of algorithms for data preparation. They can be used singly or in combination with other algorithms to remove anomalies from the dataset. Each algorithm has a specific function that can be used to enhance data quality. For example, the user can find missing values, create additional data columns, merge and join data, and so on.
In rubiscape, the Data Preparation algorithms are:

  • Aggregation
  • Combined Data Cleansing
  • Data Joiner
  • Data Merge
  • Data Pivot
  • Data Unpivot
  • Descriptive Statistics
  • Expression
  • Factor Analysis
  • File Management
  • Filtering
  • Lookup 
  • Missing Value Imputation
  • Outlier Detection
  • PCA
  • Sequence Generator
  • Sorting
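
To make the ideas behind a few of these algorithms concrete, the following is a minimal sketch in plain Python. It is not rubiscape code and does not reflect how rubiscape implements its algorithms. The sample data, the 10% tax rate, and the helper names (impute_missing, add_column, join_lookup, aggregate_sum) are all hypothetical and chosen only for illustration. The sketch fills a missing value with the column mean, derives an additional column, joins a lookup table, and aggregates the result.

```python
from statistics import mean

# Hypothetical raw data: order records (one amount is missing)
# and a lookup table that maps each customer to a region.
orders = [
    {"order_id": 1, "customer_id": "C1", "amount": 120.0},
    {"order_id": 2, "customer_id": "C2", "amount": None},  # missing value
    {"order_id": 3, "customer_id": "C1", "amount": 80.0},
    {"order_id": 4, "customer_id": "C2", "amount": 100.0},
]
customers = {"C1": "North", "C2": "South"}


def impute_missing(rows, column):
    """Replace missing values in `column` with the mean of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = mean(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def add_column(rows, name, func):
    """Create an additional column whose value is computed from each row."""
    return [{**r, name: func(r)} for r in rows]


def join_lookup(rows, key, lookup, new_column):
    """Attach a value from a lookup table to each row (a simple left join)."""
    return [{**r, new_column: lookup.get(r[key])} for r in rows]


def aggregate_sum(rows, group_by, column):
    """Sum `column` for each distinct value of `group_by`."""
    totals = {}
    for r in rows:
        totals[r[group_by]] = totals.get(r[group_by], 0.0) + r[column]
    return totals


# Each step returns new rows, so the raw `orders` list is left unchanged.
prepared = impute_missing(orders, "amount")
prepared = add_column(
    prepared, "amount_with_tax", lambda r: round(r["amount"] * 1.1, 2)  # assumed 10% rate
)
prepared = join_lookup(prepared, "customer_id", customers, "region")
totals_by_region = aggregate_sum(prepared, "region", "amount")
print(totals_by_region)
```

Each helper returns a new list instead of modifying its input, so the steps can be chained in any order, in the same way that several preparation algorithms can be used in combination.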

To access the Data Preparation algorithms, in the task pane, click Model Studio, and then click Data Preparation.



For more information, refer to Data Preparation