Operators in Expression Builder

The Expression Builder on the Feature Definition page contains the following elements:

  • Operator categories
  • Constant types
  • Features available in the predecessor node

Regular Expression Operators

The Regular Expression operator is explained below.

Operator: Replace
Code Editor: re.sub(0,0,0,flags=re.IGNORECASE)
Syntax/Description: Replace(String1, String2, Feature) (Boolean Value)

  • String1 = string to be replaced
  • String2 = string that replaces String1
  • Feature = categorical column on which the replace function is applied.
  • Boolean Value = True or False
  • It replaces one string with another.
  • The Boolean Value controls whether the match is case-sensitive.
  • If Boolean Value is True, case is ignored.
  • Thus, String1 is replaced even if its casing differs from that of the value in the categorical column.
  • For example, if String1 is 'Bitter' and String2 is 'Sweet', both 'Bitter' and 'bitter' in the column are replaced by 'Sweet'.
  • If Boolean Value is False, case is NOT ignored.
  • Thus, String1 is replaced only where its casing is identical to that of the value in the categorical column.
  • For example, if String1 is 'Bitter' and String2 is 'Sweet', only the value 'Bitter' is replaced by 'Sweet'. Any other casing, such as 'bitter', remains unchanged.
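
The behavior described above can be sketched in plain Python. This is only an illustration, not the Expression Builder's actual implementation: the helper name replace and its ignore_case parameter are hypothetical, and the sketch assumes that String1 is passed unchanged to re.sub as the pattern, as the Code Editor form suggests.

```python
import re

def replace(string1, string2, value, ignore_case):
    """Hypothetical helper mirroring Replace(String1, String2, Feature) (Boolean Value)."""
    # Assumption: String1 is handed to re.sub unchanged as the pattern.
    flags = re.IGNORECASE if ignore_case else 0
    return re.sub(string1, string2, value, flags=flags)

# Stand-in for the values of a categorical column.
column = ["Bitter", "bitter", "Sour"]

# Boolean Value = True: casing is ignored.
ignored = [replace("Bitter", "Sweet", v, True) for v in column]

# Boolean Value = False: only identical casing matches.
exact = [replace("Bitter", "Sweet", v, False) for v in column]

print(ignored)  # ['Sweet', 'Sweet', 'Sour']
print(exact)    # ['Sweet', 'bitter', 'Sour']
```

Because the sketch treats String1 as a regular expression pattern, characters such as '.' or '(' would have their regex meaning there. Whether the Expression Builder escapes them is not stated here.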

Example of Regular Expression

Consider a dataset containing a Species column with 150 values, of which four are written in lowercase (setosa) and six are capitalized (Setosa).
The input data is shown in the figure below.



We create the expression shown below. It replaces the word Setosa with Flora in the Species column. The Boolean value is True, so case sensitivity is ignored.


The result of the Expression node is displayed below. You can see that both types of values (setosa and Setosa) are replaced with the new string Flora.



Next, we select False as the Boolean value so that case sensitivity is not ignored.


The result of the Expression node is displayed below. You can see that only the capitalized values (Setosa) are replaced with Flora. The lowercase values (setosa) remain unchanged.
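
The two runs of this example can be approximated outside the product with a short Python sketch. It is an illustration under stated assumptions, not the Expression node itself: the species list is a hypothetical stand-in holding only the four lowercase and six capitalized values mentioned above (the rest of the column is not reproduced), and replace_column is a made-up helper name.

```python
import re
from collections import Counter

# Hypothetical sample: only the 4 lowercase and 6 capitalized values
# from the example; the other rows of the dataset are not reproduced.
species = ["setosa"] * 4 + ["Setosa"] * 6

def replace_column(pattern, replacement, values, ignore_case):
    """Apply a regex replacement to every value of a column (hypothetical helper)."""
    flags = re.IGNORECASE if ignore_case else 0
    compiled = re.compile(pattern, flags)
    return [compiled.sub(replacement, v) for v in values]

# Boolean value True: both casings are replaced.
counts_true = Counter(replace_column("Setosa", "Flora", species, True))

# Boolean value False: only the capitalized values are replaced.
counts_false = Counter(replace_column("Setosa", "Flora", species, False))

print(counts_true)   # Counter({'Flora': 10})
print(counts_false)  # Counter({'Flora': 6, 'setosa': 4})
```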