Custom Words Remover 

Description

Custom Words Remover removes the user-specified custom words from the text before further processing.

Why to use

Textual Analysis – Pre Processing 

When to use

When user-defined custom words are to be removed from the textual data.

When not to use

On numerical data.

Prerequisites

The custom word to be removed should be present in the selected field of the dataset.

Input

Nick likes to play football; however, he is not too fond of tennis.


Output

Nick likes to play; however, he is not too fond of.
In this example, the custom words "football, tennis" are removed.
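The input and output above can be reproduced with a minimal Python sketch. This is an illustration only, not the product's actual implementation: the function name remove_custom_words and the ignore_case parameter are hypothetical, and the whole-word, case-sensitive matching is an assumption.

```python
import re


def remove_custom_words(text, custom_words, ignore_case=False):
    """Remove each custom word from text, matching whole words only.

    Assumption: whitespace directly before a removed word is dropped too,
    so no double spaces or stray spaces before punctuation remain.
    """
    if not custom_words:
        return text
    # Escape each word so that characters such as "." or "+" are literal.
    alternatives = "|".join(re.escape(word) for word in custom_words)
    # (?<!\w) and (?!\w) stop the match inside longer words,
    # for example "football" inside "footballer".
    pattern = re.compile(
        r"\s*(?<!\w)(?:" + alternatives + r")(?!\w)",
        flags=re.IGNORECASE if ignore_case else 0,
    )
    return pattern.sub("", text).strip()


sentence = "Nick likes to play football; however, he is not too fond of tennis."
print(remove_custom_words(sentence, ["football", "tennis"]))
# Nick likes to play; however, he is not too fond of.
```

Matching whole words only is a design choice in this sketch; without it, removing "ten" would also damage "tennis".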

Related algorithms

  • Case Convertor
  • Frequent Words Remover
  • Lemmatizer
  • Punctuation Remover
  • Spelling Corrector
  • Stemmer
  • Advanced Entity Extraction
  • Word Correlation
  • Word Frequency 

Alternative algorithm

-

Statistical Methods used

-

Limitations

It cannot be used on numerical data.

Custom Words Remover is located under Textual Analysis in Pre Processing, in the task pane on the left. Drag and drop the algorithm onto the canvas to use it. Click the algorithm to view and select different properties for analysis.



One of the major tasks of data pre-processing in text mining is to filter out useless data. Custom Words Remover removes the user-specified custom words before further processing. This helps you extract the data you require.

Properties of Custom Words Remover

The available properties of Custom Words Remover are shown in the figure below.



The table below describes the fields present in the properties of Custom Words Remover.

Field

Description

Remark

Task Name

It displays the name of the selected task.

You can click the text field to edit or modify the name of the task as required.

Custom Terms to Remove

It allows you to type words that you want to remove.

To add multiple words, separate them with commas.

Text

It allows you to select the text from which you want to remove the custom words.

  • Only one data field can be selected.
  • Only the textual data fields selected for the reader are visible.
  • Only a textual data field can be selected.
Advanced

Node Configuration

It allows you to select the instance of the AWS server to provide control on the execution of a task in a workbook or workflow.

For more details, refer to Worker Node Configuration.
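The Custom Terms to Remove and Text properties described above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the product's internal logic: the helper names parse_custom_terms and apply_to_field are hypothetical, rows are modelled as plain dictionaries, matching is assumed to be case-sensitive and whole-word, and the output field name CCWRText is borrowed from the result column described in the interpretation section.

```python
import re


def parse_custom_terms(raw_terms):
    """Split a comma-separated string into trimmed, non-empty terms."""
    return [term.strip() for term in raw_terms.split(",") if term.strip()]


def apply_to_field(rows, field, raw_terms, output_field="CCWRText"):
    """Remove the custom terms from one textual field of every row.

    rows is a list of dictionaries (a stand-in for a dataset). The cleaned
    text is written to output_field; the original field is left unchanged.
    """
    terms = parse_custom_terms(raw_terms)
    pattern = None
    if terms:
        alternatives = "|".join(re.escape(term) for term in terms)
        # Whole-word, case-sensitive matching is an assumption of this sketch.
        pattern = re.compile(r"\s*(?<!\w)(?:" + alternatives + r")(?!\w)")

    result = []
    for row in rows:
        value = row[field]
        # Mirrors the rule that only a textual data field can be selected.
        if not isinstance(value, str):
            raise TypeError(f"Field {field!r} must contain text")
        cleaned = pattern.sub("", value).strip() if pattern else value
        result.append({**row, output_field: cleaned})
    return result


tweets = [{"tweet": "Have a good day"}, {"tweet": "All good here"}]
for row in apply_to_field(tweets, "tweet", "good"):
    print(row["CCWRText"])
```

Only one field name is accepted per call, which reflects the remark that a single data field can be selected.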

Interpretation of Custom Words Remover

The figure below shows the properties of the Custom Words Remover applied to tweets in a Twitter dataset.
The custom word "good" is selected for removal from the tweets.











The figure below shows the results of the Custom Words Remover algorithm.
In the figure, the column heading CCWRText represents the text after the Custom Words Remover is applied.
In the highlighted example, the selected custom word "The" has been removed.









