Description

The Kruskal-Wallis test is a non-parametric alternative to one-way ANOVA. It determines whether the medians of two or more independent groups differ.

Why to use

To identify whether there is a significant difference between the medians in the groups.

When to use

  • When all the independent variables are numerical.
  • When there are two or more variables (groups) in single or multiple datasets.
  • When the groups are independent, that is, there is no relationship between the members of different groups.

When not to use

  • In the case of non-continuous, categorical, or textual variables.
  • If the variable in any group is not measured on the Ordinal, Interval, or Ratio scale.
  • If a group contains constant, discrete, or empty/missing values.

Prerequisites

  • The independent variable should be numerical. It should not contain any infinite or missing values.
  • All the groups should have distributions of the same shape.
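The checks listed above can be run on the data before it is passed to the test. The following is a minimal sketch that uses only the Python standard library; the function name `check_groups` and the sample values are hypothetical and are not part of the product.

```python
import math

def check_groups(groups):
    """Return a list of problems; an empty list means the groups look usable.

    `groups` maps a column name to a list of numeric observations.
    This helper is illustrative only and is not part of the product.
    """
    problems = []
    if len(groups) < 2:
        problems.append("at least two groups are required")
    for name, values in groups.items():
        if len(values) == 0:
            problems.append(f"{name}: no observations")
            continue
        # None and NaN are treated as missing; +/-inf is treated as infinite.
        if any(v is None or not math.isfinite(v) for v in values):
            problems.append(f"{name}: contains missing or infinite values")
        elif len(set(values)) == 1:
            problems.append(f"{name}: all values are constant")
    return problems

# Hypothetical data: Drug 2 has a missing value, Drug 3 has an infinite value.
sample = {
    "Drug 1": [2.9, 3.0, 2.5],
    "Drug 2": [3.8, float("nan"), 4.0],
    "Drug 3": [2.8, float("inf"), 2.2],
}
problems = check_groups(sample)
for problem in problems:
    print(problem)
```

An empty result means none of the listed problems were found; it does not verify that the groups have similarly shaped distributions.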

Input

Numeric dataset

Output

  • H-statistic
  • p Value

Statistical Methods Used

  • H-statistic
  • p Value
  • Alpha (α)

Limitations

This method cannot identify which dataset/column has a different median. It only indicates whether at least one group differs from the others.
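One common way to work around this limitation outside the product is to follow a significant result with pairwise comparisons. The sketch below assumes SciPy is available and uses pairwise Mann-Whitney U tests with a Bonferroni correction. This is one possible post-hoc approach and is not a feature of the algorithm described here; the function name `pairwise_posthoc` and the data are hypothetical.

```python
from itertools import combinations

from scipy.stats import mannwhitneyu

def pairwise_posthoc(groups, alpha=0.05):
    """Compare every pair of groups with a two-sided Mann-Whitney U test.

    p values are multiplied by the number of pairs (Bonferroni correction)
    and capped at 1. Returns {(name_a, name_b): (adjusted_p, is_significant)}.
    """
    pairs = list(combinations(groups, 2))
    results = {}
    for a, b in pairs:
        _, p = mannwhitneyu(groups[a], groups[b], alternative="two-sided")
        p_adj = min(1.0, float(p) * len(pairs))
        results[(a, b)] = (p_adj, bool(p_adj < alpha))
    return results

# Hypothetical observations in which Drug 3 is clearly shifted upwards.
data = {
    "Drug 1": [1.1, 2.3, 3.2, 4.8, 5.5],
    "Drug 2": [1.9, 2.7, 3.9, 4.1, 6.0],
    "Drug 3": [20.4, 21.7, 22.5, 23.1, 24.8],
}
results = pairwise_posthoc(data)
for pair, (p_adj, significant) in results.items():
    print(pair, round(p_adj, 4), significant)
```

The Bonferroni correction is conservative; other post-hoc procedures, such as Dunn's test, are also commonly used for this purpose.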

In the left task pane, Kruskal-Wallis is located under Model Studio in Statistical Analysis, inside Hypothesis Test under Non Parametric Test. Drag and drop the algorithm onto the canvas to use it. Click the algorithm to view and select different properties for analysis. Refer to Properties of Kruskal-Wallis Test.

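The test calculates the p value. Compare this p value with the alpha value to conclude hypothesis testing. Reject the null hypothesis if the p value is less than alpha.
The test also calculates and displays the H-statistic. The H-statistic is the rank-based test statistic of the Kruskal-Wallis test. It measures how much the rank sums of the groups differ from each other; larger values indicate stronger evidence that at least one group differs.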
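To illustrate how the H-statistic is obtained from ranks, the following is a minimal sketch in plain Python. It pools all observations, ranks them (tied values receive the average rank), and applies the textbook formula H = 12 / (N (N + 1)) × Σ (R_i² / n_i) − 3 (N + 1), where R_i is the rank sum and n_i the size of group i. The function names and the data are hypothetical, and the sketch omits the tie correction that statistical libraries usually apply, so it is not necessarily how the product computes the value.

```python
def average_ranks(values):
    """Rank values from 1..N; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_h(groups):
    """H-statistic without tie correction for a list of groups."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

# Hypothetical, tie-free observations for three groups.
group_1 = [2.9, 3.0, 2.5, 2.6, 3.2]
group_2 = [3.8, 2.7, 4.0, 2.4]
group_3 = [2.8, 3.4, 3.7, 2.2, 2.0]

h = kruskal_h([group_1, group_2, group_3])
print(round(h, 4))
```

The p value is then obtained by comparing H with a chi-square distribution with k − 1 degrees of freedom, where k is the number of groups.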

Properties of Kruskal-Wallis Test

The available properties of the Kruskal-Wallis test are shown below.

The table below describes the different properties of Kruskal-Wallis.

Field

Description

Remark

Task Name


It is the name of the task selected on the workbook canvas.

  • You can click the text field to edit or modify the task's name.
  • Spaces between words are not allowed in the Task Name.

Features

It allows you to select the variables on which the test is applied.

You need to choose a minimum of two variables.

Advanced

Alpha

It allows you to enter the Alpha value, a significance level.

  • The default value is 0.05. You can modify this value.
  • The Alpha value must lie strictly between 0 and 1. A value of 0 would correspond to 100% confidence in the test, and a value of 1 to 0% confidence. The system cannot guarantee either scenario; hence the value cannot be 0 or 1.

Node Configuration

It allows you to select the instance of the AWS server to provide control over the execution of a task in a workbook or workflow.

For more details, refer to

Example of Kruskal-Wallis

As a medical researcher, you want to compare the effectiveness of three independent drugs: Drug 1, Drug 2, and Drug 3. You run a Kruskal-Wallis test to compare the efficacy of the drug treatments. The observations for Drug 1, Drug 2, and Drug 3 are given below.



The Kruskal-Wallis test is applied to the input data by selecting three independent columns. The chosen values are given below.

Property | Value
Task Name | Kruskal_Wallis_Test
Features (Independent Variable) | Drug 1, Drug 2, Drug 3
Alpha | 0.05

The Data tab shows the columns selected in the Features dropdown.
Further, the Result page is as follows.

The result page consists of the following sections:

  1. Hypothesis

This section displays the Null Hypothesis and Alternative Hypothesis defined by the system.

  2. Kruskal-Wallis Test

This section displays the Kruskal-Wallis test result. It shows the values of the following parameters:

    • H-Statistic
    • p Value
    • Alpha (α)

  3. Interpretation

    • Since the p value is greater than the alpha value, the null hypothesis is not rejected. Failing to reject the null hypothesis means there is no significant evidence that the medians differ across the groups.
    • You can conclude your study by stating that the data do not show that Drug 1, Drug 2, and Drug 3 come from different distributions.
    • No statistically significant difference is found in the effectiveness of the Drug 1, Drug 2, and Drug 3 treatments.
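The same decision can be reproduced outside the product, for example with SciPy's `scipy.stats.kruskal` function. The following is a minimal sketch that assumes SciPy is installed. The observations are hypothetical and are not the data used in the example above; only the column names and the alpha value of 0.05 are taken from it.

```python
from scipy.stats import kruskal

# Hypothetical observations; replace with the actual column values.
drug_1 = [2.9, 3.0, 2.5, 2.6, 3.2]
drug_2 = [3.8, 2.7, 4.0, 2.4]
drug_3 = [2.8, 3.4, 3.7, 2.2, 2.0]

alpha = 0.05  # significance level, same default as in the properties

# kruskal returns the H-statistic and the p value.
h_stat, p_value = kruskal(drug_1, drug_2, drug_3)

# Decision rule: reject the null hypothesis only if p < alpha.
reject_null = bool(p_value < alpha)

print(f"H-statistic: {h_stat:.4f}")
print(f"p value: {p_value:.4f}")
if reject_null:
    print("Reject the null hypothesis: at least one group differs.")
else:
    print("Fail to reject the null hypothesis: no significant difference found.")
```

With these hypothetical values the p value is larger than alpha, so the null hypothesis is not rejected, which mirrors the interpretation above.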
