Study of Cloud Computing & Databases

Pradeep Deshmukh

A study on the impact of information technology on banking products and services in Haryana

by Pradeep Deshmukh*,

- Published in Journal of Advances and Scholarly Researches in Allied Education, E-ISSN: 2230-7540

Volume 2, Issue No. 2, Oct 2011, Pages 0 - 0 (0)

Published by: Ignited Minds Journals

ABSTRACT

In the present study, we have tried to assess the new products and services available from banks to customers in Haryana; to ascertain the impact of information technology on bank deals in Haryana; to ascertain the marketing strategies adopted by bank branches in Haryana for marketing their products and services; to examine the extent of acceptance of banking products and services by customers in Haryana; and to evaluate the attitude of bank personnel vis-a-vis customers towards the marketing of new bank products and services.

KEYWORD

cloud computing, databases, new products, services, banks, customers, Haryana, information technology, marketing strategies, acceptance

INTRODUCTION

The potential benefits of cloud computing are overwhelming. However, attaining these benefits requires that each aspect of the cloud platform support the key design principles of the cloud model. One of the core design principles is dynamic scalability, or the ability to provision and decommission servers on demand. Unfortunately, the majority of today’s database servers are incapable of satisfying this requirement. This paper reviews the benefits of cloud computing and then evaluates two database architectures, shared-disk and shared-nothing, for their compatibility with cloud computing.

Cloud computing is the latest evolution of Internet-based computing. The Internet provided a common infrastructure for applications. Soon, static web pages began to add interactivity. This was followed by hosted applications like Hotmail. As these web applications added more user-configuration, they were renamed Software-as-a-Service (SaaS). Companies like Salesforce.com have led this wave. With a growing number of companies looking to get in on the SaaS opportunity, Amazon released Amazon Web Services (AWS), which enables companies to operate their own SaaS applications. In effect, Amazon hosted the LAMP stack, which it has since expanded to include Windows as well. Soon others followed suit.

Then, large companies began to realize that they could create their own cloud platform for internal use, a sort of private cloud. So, just as the public Internet spawned private corporate intranets, cloud computing is now spawning private cloud platforms. Both public and private cloud platforms are looking to deliver the benefits of cloud computing to their customers. Whether yours is a private or public cloud, the database is a critical part of that platform. Therefore it is imperative that your cloud database be compatible with cloud computing. In order to understand cloud computing requirements, we must first understand the benefits that drive these requirements.

The shared-disk database architecture is ideally suited to cloud computing. The shared-disk architecture requires fewer and lower-cost servers, it provides high availability, it reduces maintenance costs by eliminating partitioning, and it delivers dynamic scalability.

THE BENEFITS OF CLOUD COMPUTING

Cloud computing is not a fad; it is driven by some tangible and very powerful benefits. Whether the cloud is provided as an internal corporate resource, as a service hosted by a third party, or as a hybrid of these two models, there are some very real advantages to this model. These advantages derive from specialization and economies of scale.

Specialization: There is a great deal of specialized knowledge required to set up and operate systems to address security, scalability, platform maintenance (patches, updates), data maintenance (backups) and more. In a traditional model, each development effort had to include this expertise on staff. Cloud computing enables these capabilities to be staffed by experts who are shared across many customers. Instead of hiring that one person who does a decent job across all of these elements, cloud computing entities can hire individuals with deep expertise in each area, and then amortize this expense across a large number of customers. This degree of specialization enables a variety of benefits that are driving cloud computing.

Economies of Scale: This is also a powerful driver for cloud computing. The ideal platform is very expensive to build. The servers, networking equipment, data storage/backup, power, redundant high-speed connectivity, etc. can result in a huge start-up cost for a single product or project. Add to this the fact that most development efforts fail, and the economics simply don’t make sense for investment of this level in each project. Cloud computing enjoys economies of scale, because that same investment can be amortized over a large number of projects. If one project fails, it can be replaced by a number of new projects that continue to amortize the initial investment. Economies of scale also apply to IT tasks. Take backup as an example of a standard IT task. In a standalone environment, an IT person might schedule and manage the backup process. In a cloud environment, backup is highly automated, so that same IT person can oversee simultaneous backups for hundreds or thousands of customers.

KEY BENEFITS OF CLOUD COMPUTING:

- Lower costs: All resources, including expensive networking equipment, servers, IT personnel, etc. are shared, resulting in reduced costs, especially for small to mid-sized applications and prototypes.
- Shifting CapEx to OpEx: Cloud computing enables companies to shift money from capital expenses (CapEx) to operating expenses (OpEx), enabling the customer to focus on adding value in their areas of core competence, such as business and process insight, instead of building and maintaining IT infrastructure. In short, cloud computing allows you to focus your money and resources on innovating.
- Agility: Provisioning-on-demand enables faster set-up and tear-down of resources on an as-needed basis. When a project is funded, you initiate service; if the project is killed, you simply terminate the cloud contract.
- Dynamic scalability: Most applications experience spikes in traffic. Instead of over-buying your own equipment to accommodate these spikes, many cloud services can smoothly and efficiently scale to handle these spikes with a more cost-effective pay-as-you-go model. This is also known as elasticity and is behind the name of Amazon’s Elastic Compute Cloud (EC2).
- Simplified maintenance: Patches and upgrades are rapidly deployed across the shared infrastructure, as are backups.
- Large scale prototyping/load testing: Cloud computing makes large scale prototyping and load testing much easier. You can easily spawn 1,000 servers in the cloud to load test your application and then release them as soon as you are done; try doing that with owned or corporate servers.
- Diverse platform support: Many cloud computing services offer built-in support for a rich collection of client platforms, including browsers, mobile, and more. This diverse platform support enables applications to reach a broader base of users right out of the gate.
- Faster management approval: This is closely aligned with cost savings. Since cloud computing has very low upfront costs, the management approval process is greatly accelerated, enabling faster innovation. In fact, costs are so low that individuals can easily fund the expense personally to demonstrate the benefits of their solution, while avoiding organizational inertia.
- Faster development: Cloud computing platforms provide many of the core services that, under traditional development models, would normally be built in house. These services, plus templates and other tools, can significantly accelerate the development cycle.

The combination of these benefits is driving cloud computing from mere buzzword to disruptive and transformational tsunami.

With corporate adoption of cloud computing, we are seeing an explosion of cloud options. One of those options is the provisioning of database services in the form of cloud databases or Database-as-a-Service (DaaS). For the remainder of this paper, we focus on the requirements of cloud databases and the various options available to you.

EVOLVING CLOUD DATABASE REQUIREMENTS

Cloud database usage patterns are evolving, and business adoption of these technologies accelerates that evolution. Initially, cloud databases serviced consumer applications. These early applications put a priority on read access, because the ratio of reads to writes was very high. Delivering high-performance read access was the primary purchase criterion. However, this is changing.

Consumer-centric cloud database applications have been evolving with the adoption of Web 2.0 technologies. User-generated content, particularly in the form of social networking, has placed somewhat more emphasis on updates. Reads still outnumber writes, but the gap is narrowing. With support for transactional business applications, this gap between database updates and reads is shrinking further. Business applications also demand that the cloud database be ACID compliant, providing Atomicity, Consistency, Isolation and Durability. Two examples help illustrate the differing cloud database requirements.

EXAMPLE 1: CONSUMER CLOUD DATABASE

Consider a database powering a consumer-centric cosmetics website. If the user does a search for a certain shade of lipstick, it is important that the results be delivered instantaneously to keep the user engaged, so she doesn’t click on another cosmetics site. If the site said that the chosen lipstick is in inventory and completed the sale, it wouldn’t be the end of the world to later find out that, as a result of inconsistent data, that lipstick wasn’t really in inventory. In this case, the consumer receives an email explaining that it is on backorder and will be shipped soon…no problem.

EXAMPLE 2: CORPORATE CLOUD DATABASE

Consider a company that sells widgets to manufacturers. A large company purchases a load of widgets necessary to keep its production line running. In this example, if the inventory was incorrect due to inconsistent data and the shipment is delayed, the company that purchased the widgets may be forced to shut down a production line at a cost of $1,000,000 per day…big problem! With this understanding of the different stakes involved, it is easy to see how corporate adoption of cloud databases is changing the game considerably.
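
To make the stakes in Example 2 concrete, the following is a minimal sketch of an atomic inventory check, written with Python's built-in sqlite3 module. SQLite is used here only as a convenient stand-in for any ACID-compliant database; it is not a system discussed in this paper. The table names, the place_order helper, and the quantities are all hypothetical.

```python
import sqlite3


def make_db():
    # In-memory database; the schema and starting stock are illustrative assumptions.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE inventory (item TEXT PRIMARY KEY, qty INTEGER NOT NULL)")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT, qty INTEGER)")
    conn.execute("INSERT INTO inventory VALUES ('widget', 100)")
    conn.commit()
    return conn


def place_order(conn, item, qty):
    """Record the order and decrement stock as one unit; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute("INSERT INTO orders (item, qty) VALUES (?, ?)", (item, qty))
            cur = conn.execute(
                "UPDATE inventory SET qty = qty - ? WHERE item = ? AND qty >= ?",
                (qty, item, qty),
            )
            if cur.rowcount != 1:
                # Not enough stock: raising here undoes the order row as well.
                raise ValueError("insufficient stock")
        return True
    except ValueError:
        return False


def stock(conn, item):
    return conn.execute("SELECT qty FROM inventory WHERE item = ?", (item,)).fetchone()[0]


def order_count(conn):
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Because the order row and the stock decrement succeed or fail together, the database never records a sale for widgets that are not in inventory, which is the guarantee the corporate example depends on.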

THE ACHILLES HEEL OF CLOUD DATABASES

Dynamic scalability, one of the core principles of cloud computing, has proven to be a particularly vexing problem for databases. The reason is simple: most databases use a shared-nothing architecture. The shared-nothing architecture relies on splitting (partitioning) the data into separate silos of data, one per server. You might think that dynamically adding another database server is as simple as splitting the data across one more server. For example, if you have two servers, each with 50% of the total data, and you add a third server, you just take a third of the data from each server and now you have three servers each owning 33% of the data. Unfortunately, it isn’t that simple.

Many user requests involve related information. For example, you might want to find all customers who placed an order in the last month. You need to go to the invoices table and find the invoices dated for last month. Then you follow a database key to the customer table to collect their contact information. If this is spread across multiple servers, you end up processing information on one machine and then passing that data to the second machine for processing. This passing of information, called data shipping, will kill your database performance. For this reason, the partitioning of the data must be done very carefully to minimize data shipping. Partitioning data, a time-consuming process, is referred to as a black art because of the level of skill required. The ability to partition data in an efficient and high-performance manner really separates the men from the boys in the world of DBAs.

Automating this process remains an elusive goal. Sure, you can use middleware to automatically repartition the data on the fly to accommodate a changing number of database servers, but your performance can quickly go down the toilet. If we use the example above, let’s say that you have two servers with partitioned data and a query is taking 0.5 seconds. Then you add a third database server, dynamically repartition the data with some middleware, and now that same query takes 1.0 seconds because of the data shipping between nodes. Yes, the performance can actually decrease with the addition of more servers. This is the Achilles Heel of deploying a shared-nothing database in the cloud.
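
The data-shipping effect described above can be illustrated with a small sketch. It assumes a hypothetical placement rule (simple modulo partitioning on an integer key) and made-up invoice data; it counts how many invoice rows sit on a different server from the customer row they reference, which is a rough proxy for the rows that would have to be shipped to answer the "customers who ordered last month" query. None of the names below come from a real database product.

```python
def server_for(key, n_servers):
    # Hypothetical placement rule: modulo partitioning on an integer key.
    return key % n_servers


def shipped_rows(invoices, n_servers, partition_invoices_by):
    """Count invoice rows whose customer row lives on a different server.

    invoices: list of (invoice_id, customer_id) tuples.
    partition_invoices_by: "invoice_id" or "customer_id".
    Customer rows are always placed by customer_id.
    """
    shipped = 0
    for invoice_id, customer_id in invoices:
        key = invoice_id if partition_invoices_by == "invoice_id" else customer_id
        invoice_server = server_for(key, n_servers)
        customer_server = server_for(customer_id, n_servers)
        if invoice_server != customer_server:
            shipped += 1  # the join for this row crosses servers
    return shipped


# Made-up data: 12 invoices; invoice i belongs to customer (i * 7) % 5.
invoices = [(i, (i * 7) % 5) for i in range(12)]

naive_two = shipped_rows(invoices, 2, "invoice_id")
naive_three = shipped_rows(invoices, 3, "invoice_id")
colocated_three = shipped_rows(invoices, 3, "customer_id")
```

On this made-up data, partitioning invoices by their own ID ships 6 of 12 rows with two servers and 8 of 12 with three, while co-locating invoices with their customers ships none. The second layout is the kind of careful partitioning the text refers to, and it has to be redesigned by hand whenever the access pattern changes.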

ARE REPLICATED TABLES THE ANSWER?

Since data partitioning and cloud databases are inherently incompatible, Amazon, Facebook and Google have taken another approach to solve the cloud database challenge. They have created a persistence engine (technically not a database) that abandons typical ACID compliance in favor of replicated tables of data that store and retrieve information while supporting dynamic or elastic scalability. Google offers BigTable, Amazon has SimpleDB and Facebook is working on Cassandra. These solutions are ideal for the needs defined in consumer Example 1 above. However, they are not a replacement for a real database, and they do not address corporate cloud computing requirements.

THE SHARED-DISK DATABASE ARCHITECTURE IS IDEAL FOR CLOUD DATABASES

The database architecture called shared-disk, which eliminates the need to partition data, is ideal for cloud databases. Shared-disk databases allow clusters of low-cost servers to use a single collection of data, typically served up by a Storage Area Network (SAN) or Network Attached Storage (NAS). All of the data is available to all of the servers; there is no partitioning of the data. As a result, if you are using two servers and your query takes 0.5 seconds, you can dynamically add another server and the same query might now take 0.35 seconds. In other words, shared-disk databases support elastic scalability.

The shared-disk DBMS architecture has other important advantages, in addition to elastic scalability, that make it very appealing for deployment in the cloud. The following are some of these advantages:

Fewer servers required: Since shared-nothing databases break the data into distinct pieces, it is not sufficient to have a single server for each data set; you need a back-up in case the first one fails. This is called a master-slave configuration. In other words, you must duplicate your server infrastructure. Shared-disk is a master-master configuration, so each node provides fail-over for the other nodes. This reduces the number of servers required by half when using a shared-disk database.

Lower cost servers (extend the life of your current servers): In a shared-nothing database, each server must be run at low CPU utilization in order to be able to accommodate spikes in usage for that server’s data. This means that you are buying large (expensive) servers to handle the peaks. Shared-disk, on the other hand, spreads these usage spikes across the entire cluster. As a result, each system can be run at a higher CPU utilization. This means that with a shared-disk database you can purchase lower-cost commodity servers instead of paying a large premium for high-end computers. This also extends the lifespan of existing servers, since they needn’t deliver cutting-edge performance.

Scale-in: The scale-in model enables cloud providers to allocate and bill customers on the basis of how many instances of a database are being run on a multi-core machine. Scale-in enables you to launch one instance of MySQL per CPU core. For example, a 32-core machine could support a cluster-in-a-box of 32 instances of MySQL.

Simplified maintenance/upgrade process: Servers that are part of a shared-disk database can be upgraded individually, while the cluster remains online. You can selectively take nodes out of service, upgrade them, and put them back in service while the other nodes continue to operate. You cannot do this with a shared-nothing database because each individual node owns a specific piece of data. Take out one server in a shared-nothing database and the entire cluster must be shut down.

High-availability: Because the nodes in a shared-disk database are completely interchangeable, you can lose nodes and your performance may degrade, but the system keeps operating. If a shared-nothing database loses a server, the system goes down until you manually promote a slave to the master role. In addition, each time you (re)partition the database, you must take the system down. In other words, shared-nothing involves more scheduled and unscheduled downtime than shared-disk systems.

Reduced partitioning and tuning services: In a shared-nothing cloud database, the data must be partitioned. While it is fairly straightforward to simply split the data across servers, thoughtfully partitioning the data to minimize the traffic between nodes in the cluster (also known as function or data shipping) requires a great deal of ongoing analysis and tuning. Attempting to accomplish this in a static shared-nothing cluster is a significant challenge, but attempting to do so with a dynamically scaling database cluster is a Sisyphean task.

Reduced support costs: One of the benefits of cloud databases is that they shift much of the low-level DBA functions to experts who are managing the databases in a centralized manner for all of the users. However, tuning a shared-nothing database requires the coordinated involvement of both the DBA and the application programmer. This significantly increases support costs. Shared-disk databases cleanly separate the functions of the DBA and the application developer, which is ideal for cloud databases. Shared-disk databases also provide seamless load-balancing, further reducing support costs in a cloud environment.
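
The availability and elasticity claims above can be sketched with a toy model. The two classes below are hypothetical and heavily simplified: the shared-nothing cluster is modeled without the slave replicas the text mentions, and the shared-disk cluster ignores the lock and cache coordination that real shared-disk systems need. The sketch only shows who can answer a read when a server is lost, and what adding a node involves.

```python
class SharedNothingCluster:
    """Each server owns one partition of the keys (no replicas in this toy model)."""

    def __init__(self, data, n_servers):
        self.n = n_servers
        self.partitions = [dict() for _ in range(n_servers)]
        for key, value in data.items():
            self.partitions[key % n_servers][key] = value
        self.up = [True] * n_servers

    def fail(self, server):
        self.up[server] = False

    def read(self, key):
        owner = key % self.n
        if not self.up[owner]:
            # Only the owner holds this key, so the read cannot be served.
            raise RuntimeError(f"partition {owner} is unavailable")
        return self.partitions[owner][key]


class SharedDiskCluster:
    """All nodes read the same storage, so any live node can serve any key."""

    def __init__(self, data, n_nodes):
        self.storage = dict(data)  # stands in for the SAN/NAS
        self.up = [True] * n_nodes

    def add_node(self):
        self.up.append(True)  # no data is moved or repartitioned

    def fail(self, node):
        self.up[node] = False

    def read(self, key):
        if not any(self.up):
            raise RuntimeError("no live nodes")
        return self.storage[key]
```

In this model, losing one shared-nothing server makes every key in its partition unreadable, whereas the shared-disk cluster keeps answering as long as any node is up, and add_node touches no data at all.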

CONCLUSION

Whether you are assembling, managing or developing on a cloud computing platform, you need a cloud-compatible database. Shared-nothing databases require data partitioning, which is structurally incompatible with dynamic scalability, a core foundation of cloud computing. The shared-disk database architecture, on the other hand, does support elastic scalability. It also supports other cloud objectives such as lower costs for hardware, maintenance, tuning and support. It delivers high availability in support of Service Level Agreements (SLAs). As with every tectonic shift in technology, there is a Darwinian ripple effect as we realize which technologies support these changes and which are relegated to legacy systems. Because of this compatibility, cloud computing will usher in an ascendance of the shared-disk database.
