Overview | |
What | Understanding the diverse types of datasets in Rubiscape and their creation process. |
When | When you want to use data from various sources in your algorithm flows. |
Why | To extract data from various sources and create datasets as per your requirements. |
Where | Inside a workspace that is assigned to you. |
Who | A user with dataset creation rights. |
How | The dataset creation process is described in the following sections. |
A dataset is a collection of data, usually in tabular form. Non-tabular datasets are also possible, as in an XML file, where the data appears as marked-up strings of characters.
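To make the tabular versus non-tabular distinction concrete, the following minimal sketch reads the same records from CSV text and from XML text using only the Python standard library. The sample data, the element names (`orders`, `order`), and the helper functions are hypothetical and used only for illustration; this is not how Rubiscape itself reads datasets.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data, used only for illustration.
CSV_TEXT = "id,city,sales\n1,Pune,120\n2,Mumbai,95\n"
XML_TEXT = """<orders>
  <order id="1"><city>Pune</city><sales>120</sales></order>
  <order id="2"><city>Mumbai</city><sales>95</sales></order>
</orders>"""


def read_tabular(text):
    """Parse CSV text into a list of row dictionaries (one per line)."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def read_xml(text):
    """Flatten each <order> element into a row dictionary."""
    root = ET.fromstring(text)
    rows = []
    for order in root.findall("order"):
        row = {"id": order.get("id")}   # the id is stored as an attribute
        for child in order:             # child elements become columns
            row[child.tag] = child.text
        rows.append(row)
    return rows


tabular_rows = read_tabular(CSV_TEXT)
xml_rows = read_xml(XML_TEXT)
```

In the CSV text the rows and columns are explicit, whereas in the XML text the structure comes from the markup and has to be flattened before it can be treated as a table.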
In machine learning, data is mostly categorized into the following types.
Numerical data | Categorical data | Time-series data | Textual data | Geographical data |
---|
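As a rough illustration of these categories, the sketch below tags each field of a single record with one of the types listed above. The record, the field names, the `KNOWN_CATEGORIES` set, and the `data_kind` heuristic are all assumptions made for this example; real data profiling is more involved, and this is not a Rubiscape API.

```python
from datetime import date

# Hypothetical set of labels that should be treated as categories.
KNOWN_CATEGORIES = {"retail", "wholesale"}

# Hypothetical record; field names and values are illustrative only.
record = {
    "amount": 249.5,                              # numerical
    "segment": "retail",                          # categorical
    "visit_date": date(2024, 1, 15),              # time-series (time index)
    "review": "Quick delivery, good packaging.",  # textual
    "location": (18.5204, 73.8567),               # geographical (lat, lon)
}


def data_kind(value):
    """Return a coarse data type label for a single value (crude heuristic)."""
    if isinstance(value, date):
        return "time-series"
    if isinstance(value, (int, float)):
        return "numerical"
    if isinstance(value, tuple) and len(value) == 2:
        return "geographical"
    if isinstance(value, str):
        return "categorical" if value in KNOWN_CATEGORIES else "textual"
    raise TypeError(f"unsupported value: {value!r}")


kinds = {name: data_kind(value) for name, value in record.items()}
```

The order of the checks matters only in that each value matches exactly one branch; a free-text string falls through to "textual" because it is not in the known category set.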
The data types and corresponding datasets supported in Rubiscape are given below.
As shown in the above figure, Rubiscape supports various data sources under each of the dataset types.
The dataset creation process for these types is explained in the sections that follow.
Data Types | Social Media | RDBMS | File | Hadoop | API | |
Datasets |