Data Merge

Description

Data Merge combines the rows of two or more tables into a single table.

Why to use

For Data Preparation 

When to use

When you want to merge two or more datasets into one table and at least one column is common to all of them.

When not to use

Prerequisites

It can be used on numerical data. At least one column should be common in the datasets. 

Input

Two or more tables

Output

Single table with rows merged together

Statistical Methods used


Limitations

It can only add rows to the resulting table.


Data Merge is located under Model Studio in Data Preparation, in the task pane on the left. Drag and drop the algorithm onto the canvas to use it. Click the algorithm to view and select different properties for analysis.

The Data Merge algorithm merges the rows of two or more tables into one table. The rows of data appear one below the other in the resulting table. All the columns in the input tables (common or uncommon) are retained in the resulting table.
Data Merge is used to combine tables with similar data derived from different sources.
Generally, the tables being merged share a common element (a column with the same variable).
However, tables that do not have any common elements can still be merged. The values for columns that are absent from an individual table are marked as 'na' in the combined table. Thus, no data is deleted or omitted from any of the input tables.
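The merge behaviour described above can be illustrated with a minimal Python sketch. It is not the product's implementation: the function name `merge_rows`, the `MISSING` constant, and the sample values are hypothetical and serve only to show rows being stacked one below the other, all columns being retained, and absent values being marked as 'na'.

```python
from __future__ import annotations

from typing import Any

# Placeholder for values absent from a table, as described in the text.
MISSING = "na"


def merge_rows(*tables: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Stack the rows of all tables, keeping the union of their columns."""
    # Collect every column name in first-seen order.
    columns: list[str] = []
    for table in tables:
        for row in table:
            for name in row:
                if name not in columns:
                    columns.append(name)

    # Append rows table by table; fill columns a row lacks with MISSING.
    merged: list[dict[str, Any]] = []
    for table in tables:
        for row in table:
            merged.append({name: row.get(name, MISSING) for name in columns})
    return merged


# Hypothetical input tables (illustrative values only).
table_1 = [
    {"Student": "A", "Maths": 78, "Science": 85},
    {"Student": "B", "Maths": 64, "Science": 70},
]
table_2 = [
    {"Student": "C", "Maths": 91, "History": 59},
]

merged = merge_rows(table_1, table_2)
for row in merged:
    print(row)
```

In this sketch, the rows of `table_2` follow the rows of `table_1`, and the `Science` and `History` columns are both kept even though each appears in only one input table.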

Properties of Data Merge

The available properties of Data Merge are as shown in the figure given below.

In the above figure, the Task Name 'Data Merge' appears by default. You can click the text box to edit the name of the task as required.

Example of Data Merge

In the example, the data obtained from school test results for four subjects is used. Table 1 and Table 2 are the two tables to be merged together.
The figure given below displays the input data from Table 1.

The figure given below displays the input data from Table 2.

Data Merge is applied to the two tables. The figure given below displays the output data obtained after merging. The table contains all the rows from the two input tables. Cells with no data are marked as 'na'.

