A Study of Discovery of Duplicate Data Utilizing Token-Based Technique
A Sequential Approach to Duplicate Data Identification and Elimination
Keywords:
duplicate data, token-based technique, data cleaning, database deformities, duplicate discovery, missing information, typographical mistakes, missing values, contractions, similarity value
Abstract
The process of identifying and removing database defects and duplicates is referred to as data cleaning. The basic problem of duplicate detection is that approximate duplicates in a database may refer to the same real-world object because of errors and missing data. Duplicate elimination is hard because duplicates arise from several kinds of errors, such as typographical mistakes, missing values, abbreviations, and different representations of the same logical value. In existing approaches, duplicate detection and elimination is domain dependent. These domain-dependent methods rely on similarity functions and thresholds, and they produce a high rate of false positives. This paper presents a general sequential framework for duplicate detection and elimination. The proposed framework uses six steps to improve the process of duplicate detection and elimination. First, an attribute selection algorithm is used to identify the best and most suitable attributes for duplicate identification and elimination. In the next step, tokens are formed from the field values of the selected attributes. After token formation, a clustering algorithm or blocking method is used to group the records based on their similarity values.
Published
2017-01-01
How to Cite
[1]
“A Study of Discovery of Duplicate Data Utilizing Token-Based Technique: A Sequential Approach to Duplicate Data Identification and Elimination”, JASRAE, vol. 12, no. 2, pp. 651–654, Jan. 2017, Accessed: Aug. 07, 2025. [Online]. Available: https://ignited.in/index.php/jasrae/article/view/6316
Section
Articles