Data preprocessing is a data mining technique that transforms raw data into an understandable and useful format. Real-world data is often incomplete, inconsistent, or lacking in certain behaviors or trends, and it is likely to contain many errors. Data preprocessing resolves such issues through data cleaning, data transformation, and data reduction.
List of Pre-Processing Algorithms
- Case Convertor
- Custom Words Remover
- Frequent Words Remover
- Lemmatizer
- Punctuation Remover
- Spelling Corrector
- Stemmer
- Word Correlation
- Word Embedding
- Word Frequency
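To give a rough idea of what a few of the listed steps do to text, here is a minimal Python sketch using only the standard library. The function names (convert_case, remove_punctuation, remove_custom_words, word_frequency) are hypothetical and chosen for illustration; they are not the platform's own implementation, and the real algorithms may behave differently.

```python
import string
from collections import Counter


def convert_case(text: str) -> str:
    # Case conversion (assumed here to mean lowercasing).
    return text.lower()


def remove_punctuation(text: str) -> str:
    # Delete every ASCII punctuation character.
    return text.translate(str.maketrans("", "", string.punctuation))


def remove_custom_words(text: str, words: set) -> str:
    # Drop whitespace-separated tokens that appear in a user-supplied set.
    return " ".join(t for t in text.split() if t not in words)


def word_frequency(text: str) -> Counter:
    # Count how often each whitespace-separated token occurs.
    return Counter(text.split())


raw = "The cat sat. The CAT ran!"
clean = remove_custom_words(remove_punctuation(convert_case(raw)), {"the"})
freq = word_frequency(clean)
print(clean)  # cat sat cat ran
print(freq["cat"])  # 2
```

Stemming, lemmatization, spelling correction, and word embedding usually rely on language resources or trained models, so they are left out of this sketch.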
Notes:
- The Reader (Dataset) should be connected to the algorithm.
- These algorithms can be used only on textual data.
- You can apply these algorithms one after another to prepare your data.
These algorithms are used for text mining and for preparing your data for a targeted approach.
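The note about using one algorithm after another can be sketched as a simple pipeline in which each step receives the output of the previous one. This is only an illustration under stated assumptions: lowercase, drop_frequent, and run_pipeline are hypothetical names, and the frequent-word rule (remove the single most common token) is a simplification, not the platform's behavior.

```python
from collections import Counter
from functools import reduce


def lowercase(tokens):
    # Normalize case so "Data" and "data" count as the same word.
    return [t.lower() for t in tokens]


def drop_frequent(tokens, top_n=1):
    # Remove the top_n most common tokens (a simplified frequent-words rule).
    most_common = {word for word, _ in Counter(tokens).most_common(top_n)}
    return [t for t in tokens if t not in most_common]


def run_pipeline(tokens, steps):
    # Feed the output of each step into the next one, in order.
    return reduce(lambda acc, step: step(acc), steps, tokens)


tokens = "Data data data needs cleaning".split()
result = run_pipeline(tokens, [lowercase, drop_frequent])
swapped = run_pipeline(tokens, [drop_frequent, lowercase])
print(result)   # ['needs', 'cleaning']
print(swapped)  # ['data', 'needs', 'cleaning']
```

The two runs differ because the order of the steps matters: lowercasing first merges "Data" and "data" before the frequent word is removed, while the swapped order leaves one occurrence behind.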