Analysis of Token Formation towards Blocking and Similarity Computation

Parvesh Kumari; Dr. Kalpana

Authors

Parvesh Kumari
Dr. Kalpana

Keywords:

blocking key, similarity computation, duplicate identification, threshold value, rule-based approach, low-quality duplicates, cleaned records, false positives, token concept, data cleaning process, complexity

Abstract

The best blocking key is chosen for blocking records by comparing duplicate-identification performance. In the next stage, a threshold value is computed from the similarities between records and fields. A rule-based approach is then used to identify duplicates and to eliminate low-quality duplicates, retaining only a single copy of the best duplicate record. Finally, all the cleaned records are grouped or merged and made available for the next process. This research aims to reduce the number of false positives without missing true duplicates. To compare the new framework with previous approaches, the token concept is incorporated to speed up the data cleaning process and reduce its complexity. Several blocking keys are analyzed through extensive experiments to select the one that best brings similar records together, so that not all pairs of records need to be compared. A rule-based approach is used to recognize exact and approximate duplicates and to eliminate them.
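The stages named in the abstract (token-based blocking, similarity computation against a threshold, and rule-based retention of the best copy) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the field names, the sample records, the `blocking_key` recipe, the 0.8 threshold, the use of Python's `difflib.SequenceMatcher` as the field similarity measure, and the "most complete record wins" retention rule.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

FIELDS = ("name", "city", "phone")  # illustrative field names (assumption)
THRESHOLD = 0.8                     # assumed cutoff, not a value from the paper

# Hypothetical sample records.
RECORDS = [
    {"id": 1, "name": "John Smith", "city": "Delhi", "phone": "9811122233"},
    {"id": 2, "name": "Jon Smith", "city": "Delhi", "phone": "9811122233"},
    {"id": 3, "name": "Priya Sharma", "city": "Mumbai", "phone": ""},
    {"id": 4, "name": "Priya Sharma", "city": "Mumbai", "phone": "9900011122"},
    {"id": 5, "name": "Amit Verma", "city": "Pune", "phone": "9700055566"},
]


def blocking_key(record):
    """Hypothetical token key: initial of the last name token + first 3 letters of city."""
    tokens = record["name"].split()
    last_token = tokens[-1] if tokens else ""
    return (last_token[:1] + record["city"][:3]).upper()


def record_similarity(a, b):
    """Mean string similarity over the fields that are non-empty in both records."""
    scores = [
        SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio()
        for f in FIELDS
        if a[f] and b[f]
    ]
    return sum(scores) / len(scores) if scores else 0.0


def completeness(record):
    """Number of non-empty fields; used by the assumed 'best copy' rule."""
    return sum(1 for f in FIELDS if record[f])


def deduplicate(records):
    # 1. Blocking: only records that share a key are compared with each other.
    blocks = defaultdict(list)
    for rec in records:
        blocks[blocking_key(rec)].append(rec)

    # 2. Similarity computation: link pairs whose score reaches the threshold.
    parent = {rec["id"]: rec["id"] for rec in records}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for block in blocks.values():
        for a, b in combinations(block, 2):
            if record_similarity(a, b) >= THRESHOLD:
                parent[find(b["id"])] = find(a["id"])

    # 3. Rule-based retention: keep the most complete record of each cluster,
    #    breaking ties by the lowest id.
    clusters = defaultdict(list)
    for rec in records:
        clusters[find(rec["id"])].append(rec)
    cleaned = [
        max(group, key=lambda r: (completeness(r), -r["id"]))
        for group in clusters.values()
    ]

    # 4. Merge the cleaned records into one list for the next process.
    return sorted(cleaned, key=lambda r: r["id"])
```

With the sample data, records 1 and 2 share a block and are linked, as are records 3 and 4, so `deduplicate(RECORDS)` keeps ids 1, 4, and 5. Because comparisons happen only inside blocks, the choice of blocking key controls both how many pairs are compared and whether true duplicates ever meet, which is why the abstract treats key selection as an experimental question.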
