Article Details

Content Based Filtering and Fraud Detection on Social Networking Sites | Original Article

D. P. Gadekar*, Y. P. Singh, in Journal of Advances in Science and Technology | Science & Technology


Online social networking (OSN) sites offer a wide variety of additional data to enhance standard learning algorithms, but the most challenging part is extracting the relevant information from networked data. Fraudulent behavior is imperceptibly concealed in both local and social data, making it even harder to define useful input for prediction models. One aim is to incorporate social network effects efficiently to detect fraud for the Belgian governmental social security institution, and to improve the performance of traditional non-social fraud prediction tasks. Discovering semantically coherent topics from the large volume of user-generated content (UGC) in social media would facilitate many downstream applications of intelligent computing. Topic models, among the most powerful algorithms for this purpose, have been widely used to discover latent semantic patterns in text collections. However, one key weakness of topic models is that they need documents of a certain length to provide reliable statistics for generating coherent topics. On Twitter, users' tweets are mostly short and noisy, so observations of word occurrences are too sparse for topic models to use reliably. In this research work, we propose a novel text-mining-based OSN fraud detection method. The proposed method detects fraud through text mining in four stages: noise removal, a bag-of-words representation, feature extraction, and finally a Naive Bayes classifier for fraud classification. The method aims to improve the quality and accuracy of the results while minimizing total processing time. An extensive experimental evaluation on several research datasets is presented to assess the efficiency of the proposed approach against state-of-the-art methods.
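The stages named in the abstract (noise removal, bag of words, feature extraction, and a Naive Bayes classifier) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `clean` function, the `NaiveBayes` class, the choice of a multinomial model with Laplace smoothing, and the toy training messages are all hypothetical, since the abstract does not specify these details.

```python
import math
import re
from collections import Counter, defaultdict


def clean(text):
    """Noise removal (assumed rules): lowercase, drop URLs and @mentions,
    then keep only alphabetic tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z]+", text)


class NaiveBayes:
    """Multinomial Naive Bayes over bag-of-words counts (Laplace smoothing)."""

    def fit(self, docs, labels):
        self.word_counts = defaultdict(Counter)  # per-class word frequencies
        self.class_counts = Counter(labels)      # number of documents per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = clean(doc)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, doc):
        bag = Counter(clean(doc))  # bag-of-words features for the new message
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for word, count in bag.items():
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                p = (self.word_counts[label][word] + 1) / (total_words + len(self.vocab))
                score += count * math.log(p)  # log likelihood
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical toy data, for illustration only.
TRAIN = [
    ("Win a FREE prize now!!! click http://spam.example", "fraud"),
    ("Claim your free money, click the link @user", "fraud"),
    ("Meeting friends for lunch today", "legit"),
    ("Great lunch with friends at the park", "legit"),
]
model = NaiveBayes().fit([text for text, _ in TRAIN], [label for _, label in TRAIN])
```

Calling `model.predict(...)` on a new message returns the label with the highest posterior score. A real system would train on a labelled research dataset and would likely add steps the sketch omits, such as stop-word removal or term weighting.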