Punctuation Remover 

Description

Punctuation Remover is an algorithm that removes punctuation marks, such as full stops, commas, semicolons, question marks, and exclamation marks, from the given text.

Why to use

Textual Analysis – Pre Processing 

When to use

When the text contains redundant punctuation marks that can be removed before textual analysis is performed.

When not to use

On numerical data.

Prerequisites

Punctuation marks should be present in the textual data. 

Input

Hello!!!, he said ---and went.

Output

Hello he said and went

In this example, the punctuation marks (!!!), (,), (---), and (.) are removed.
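
The example above can be reproduced with a minimal Python sketch. This is an illustration only, not the product's actual implementation, which is not documented here. It assumes that "punctuation" means the ASCII set in Python's `string.punctuation` and that leftover runs of whitespace are collapsed to single spaces. The function name `remove_punctuation` is hypothetical.

```python
import string

# Assumption: punctuation is the ASCII set !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def remove_punctuation(text: str) -> str:
    """Delete ASCII punctuation, then collapse leftover runs of whitespace."""
    stripped = text.translate(_PUNCT_TABLE)
    return " ".join(stripped.split())


print(remove_punctuation("Hello!!!, he said ---and went."))
# prints: Hello he said and went
```

Non-ASCII marks such as curly quotes are not covered by `string.punctuation`, so a sketch like this would leave them in place.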

Related algorithms

  • Case Convertor
  • Custom Words Remover
  • Frequent Words Remover
  • Lemmatizer
  • Spelling Corrector
  • Stemmer
  • Advanced Entity Extraction
  • Word Correlation
  • Word Frequency

Alternative algorithm

-

Statistical Methods used

-

Limitations

It cannot be used on numerical data.
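
One plausible reason for this limitation, offered here as an assumption rather than a documented behavior of the product, is that decimal points, minus signs, and thousands separators are themselves punctuation characters. The sketch below uses Python's `string.punctuation` and made-up sample values to show how stripping them changes what a number means.

```python
import string

# Assumption: punctuation is the ASCII set in string.punctuation.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)

# Hypothetical numeric values stored as text.
samples = ["-3.14", "1,250.75", "2024-01-15"]

for value in samples:
    # The sign, decimal point, and separators are all deleted.
    print(value, "->", value.translate(PUNCT_TABLE))
# prints:
# -3.14 -> 314
# 1,250.75 -> 125075
# 2024-01-15 -> 20240115
```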


Punctuation Remover is located under Textual Analysis, in Pre Processing, in the task pane on the left. Drag and drop the algorithm onto the canvas to use it. Click the algorithm to view and select the properties for analysis.

Properties of Punctuation Remover

The available properties of Punctuation Remover are shown in the figure below.

The table below describes the fields in the properties of Punctuation Remover.

| Field | Description | Remark |
|---|---|---|
| Task Name | It displays the name of the selected task. | You can click the text field to edit or modify the name of the task as required. |
| Text | It allows you to select the text from which you want to remove punctuation marks. | Only one data field can be selected. Only textual data fields can be selected. Textual data fields selected for the reader are visible. |
| Advanced: Node configuration | It allows you to select the instance of the AWS server to provide control on the execution of a task in a workbook or workflow. | For more details, refer to Worker Node Configuration. |

Interpretation of Punctuation Remover

The figure below shows the result of Punctuation Remover applied to Google News snippets.
In the figure, the column heading CPRText represents the text after Punctuation Remover is applied.
In the two highlighted examples, the comma (,) and the dash (-) have been removed.
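
The same idea can be sketched for a whole column. The snippet below is a rough illustration only: the rows are made-up placeholders rather than the Google News snippets from the figure, and the input field name `Snippet` and the helper `add_cpr_text` are hypothetical. Only the output name CPRText is taken from the description above. It also assumes that the selected field must be textual, in line with the Text property.

```python
import string

# Assumption: punctuation is the ASCII set in string.punctuation.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def add_cpr_text(rows, field):
    """Return copies of rows with a CPRText key holding the cleaned text."""
    cleaned_rows = []
    for row in rows:
        value = row[field]
        if not isinstance(value, str):
            # Mirrors the rule that only a textual data field can be selected.
            raise TypeError(f"field {field!r} must contain text")
        cleaned = " ".join(value.translate(PUNCT_TABLE).split())
        cleaned_rows.append({**row, "CPRText": cleaned})
    return cleaned_rows


# Made-up placeholder rows, not real data.
rows = [
    {"Snippet": "Markets rise, tech leads"},
    {"Snippet": "Budget update - key points"},
]
for row in add_cpr_text(rows, "Snippet"):
    print(row["CPRText"])
# prints:
# Markets rise tech leads
# Budget update key points
```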








