Article Details

Evaluation of MapReduce (MR) and Parallel SQL Database Management Systems (DBMS) for Performance and Development Complexity

Surbhi Agarwal, in Journal of Advances in Science and Technology | Science & Technology


There is currently considerable enthusiasm around the MapReduce (MR) paradigm for large-scale data analysis. Although the basic control flow of this framework has existed in parallel SQL database management systems (DBMS) for over 20 years, some have called MR a dramatically new computing model. In this paper, we describe and compare both paradigms. Furthermore, we evaluate both kinds of systems in terms of performance and development complexity. To this end, we define a benchmark consisting of a collection of tasks that we have run on an open source version of MR as well as on two parallel DBMSs. For each task, we measure each system’s performance for various degrees of parallelism on a cluster of 100 nodes. Our results reveal some interesting trade-offs. Although the process to load data into and tune the execution of parallel DBMSs took much longer than for the MR system, the observed performance of these DBMSs was strikingly better. We speculate about the causes of the dramatic performance difference and consider implementation concepts that future systems should take from both kinds of architectures.