A Critical Study on Cloud Computing

Exploring the Potential of Cloud Computing

by Madhuri*, Dr. C. Ram Singla

- Published in International Journal of Information Technology and Management, E-ISSN: 2249-4510

Volume 5, Issue No. 1, Aug 2013

Published by: Ignited Minds Journals


ABSTRACT

“Cloud” computing, a relatively recent term, builds on decades of research in virtualization, distributed computing, utility computing, and, more recently, networking, web and software services. It implies a service-oriented architecture, reduced information technology overhead for the end-user, great flexibility, reduced total cost of ownership, on-demand services and many other things. This paper discusses the concept of “cloud” computing, some of the issues it tries to address, related research topics, and a “cloud” implementation available today.

KEYWORDS

cloud computing, virtualization, distributed computing, utility computing, networking, web services, software services, service oriented architecture, information technology, total cost of ownership

INTRODUCTION

“Cloud computing” is the next natural step in the evolution of on-demand information technology services and products. To a large extent, cloud computing will be based on virtualized resources. Cloud computing predecessors have been around for some time, but the term became “popular” sometime in October 2007, when IBM and Google announced a collaboration in that domain. This paper discusses the concept of “cloud” computing, some of the issues it tries to address, related research topics, and an implementation based on Virtual Computing Laboratory (VCL) technology. VCL has been in production use at NC State University since 2004 and is a suitable vehicle for dynamic implementation of almost any current “cloud” computing solution. A key differentiating element of a successful information technology (IT) is its ability to become a true, valuable, and economical contributor to cyber-infrastructure. “Cloud” computing embraces cyber-infrastructure and builds upon decades of research in virtualization, distributed computing, “grid computing”, utility computing, and, more recently, networking, web and software services. It implies a service-oriented architecture, reduced information technology overhead for the end-user, greater flexibility, reduced total cost of ownership, on-demand services and many other things.

CYBER-INFRASTRUCTURE

“Cyber-infrastructure makes applications dramatically easier to develop and deploy, thus expanding the feasible scope of applications possible within budget and organizational constraints, and shifting the scientist’s and engineer’s effort away from information technology development and concentrating it on scientific and engineering research. Cyber-infrastructure also increases efficiency, quality, and reliability by capturing commonalities among application needs, and facilitates the efficient sharing of equipment and services.” Today, almost any business or major activity uses, or relies in some form on, IT and IT services. These services need to be enabling and appliance-like, and there must be an economy of scale for the total cost of ownership to be better than it would be without cyber-infrastructure. Technology needs to improve end-user productivity and reduce technology-driven overhead. For example, unless IT is the primary business of an organization, no more than about 20% of its effort should go to IT overhead that is not directly connected to its primary business, even though 80% of its business might be conducted by electronic means.

CONCEPTS

A powerful underlying and enabling concept is computing through service-oriented architectures (SOA): delivery of an integrated and orchestrated suite of functions to an end-user through composition of both loosely and tightly coupled functions, or services, which are often network-based. Related concepts are component-based system engineering, orchestration of different services through workflows, and virtualization.


SERVICE-ORIENTED ARCHITECTURE

SOA is not a new concept, although it has again been receiving considerable attention in recent years. Examples of some of the first network-based service-oriented architectures are remote procedure calls (RPC), DCOM, and Object Request Brokers (ORBs) based on the CORBA specifications. More recent examples are the so-called “grid computing” architectures and solutions. In an SOA environment, end-users request an IT service (or an integrated collection of such services) at the desired functional, quality, and capacity level, and receive it either at the time requested or at a specified later time. Service discovery, brokering, and reliability are important, and services are usually designed to interoperate, as are the composites made of these services. It is expected that in the next 10 years service-based solutions will be a major vehicle for delivery of information and other IT-assisted functions at both individual and organizational levels, e.g., software applications, web-based services, personal and business “desktop” computing, and high-performance computing.
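The request-discovery-brokering pattern described above can be illustrated with a minimal sketch. The names below (ServiceBroker, Service, register, request) are hypothetical and are used only to show how a broker matches an end-user request to a registered, interoperable service; they are not part of any specific SOA product mentioned in this paper.

```python
# A minimal sketch of SOA-style service registration, discovery, and brokering.
# All class and method names are illustrative only.

from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Service:
    name: str                           # what the service does, e.g. "image-render"
    capacity: int                       # how many concurrent requests it can take
    handler: Callable[[dict], dict]     # the function that performs the work


class ServiceBroker:
    """Registers services and matches end-user requests to them."""

    def __init__(self) -> None:
        self._registry: Dict[str, Service] = {}

    def register(self, service: Service) -> None:
        self._registry[service.name] = service

    def request(self, name: str, payload: dict) -> dict:
        # Discovery: look the service up by its advertised name.
        service = self._registry.get(name)
        if service is None:
            raise LookupError(f"no provider found for service '{name}'")
        # Brokering: hand the request to the provider and return the result.
        return service.handler(payload)


if __name__ == "__main__":
    broker = ServiceBroker()
    broker.register(Service("echo", capacity=10, handler=lambda p: {"echoed": p}))
    print(broker.request("echo", {"msg": "hello"}))
```

In a real SOA deployment the broker, the services, and the end-user would typically sit on different network nodes, and quality and capacity requirements would be part of the request.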

COMPONENTS

The key to a SOA framework that supports workflows is componentization of its services: an ability to support a range of couplings among workflow building blocks, fault-tolerance in its data- and process-aware service-based delivery, and an ability to audit processes, data, and results, i.e., to collect and use provenance information. A component-based approach is characterized by reusability (elements can be re-used in other workflows), substitutability (alternative implementations are easy to insert, precisely specified interfaces are available, runtime component replacement mechanisms exist, and substitutions can be verified and validated), extensibility and scalability (the ability to readily extend and scale the system component pool, increase the capabilities of individual components, and have an extensible and scalable architecture that can automatically discover new functionalities and resources), customizability (the ability to customize generic features to the needs of a particular scientific domain and problem), and composability (easy construction of more complex functional solutions from basic components, and the ability to reason about such compositions). Other characteristics are also very important, including reliability and availability of the components and services, the cost of the services, security, total cost of ownership, economy of scale, and so on. In the context of cloud computing we distinguish many categories of components: from undifferentiated and differentiated resources, to workflow-based environments and collections of services, and so on.
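Substitutability behind a precisely specified interface, as described above, can be sketched briefly. The interface and class names below are illustrative assumptions, not part of any framework discussed in this paper.

```python
# A minimal sketch of component substitutability behind a precisely
# specified interface: the workflow depends only on the interface, so
# conforming components can be swapped at runtime.

from abc import ABC, abstractmethod


class Renderer(ABC):
    """The precisely specified interface every implementation must honour."""

    @abstractmethod
    def render(self, data: list[float]) -> str: ...


class TextRenderer(Renderer):
    def render(self, data: list[float]) -> str:
        return ", ".join(f"{x:.2f}" for x in data)


class CsvRenderer(Renderer):
    def render(self, data: list[float]) -> str:
        return "\n".join(str(x) for x in data)


def run_workflow(renderer: Renderer) -> str:
    # Only the interface is referenced here, which is what makes
    # verification and validation of substitutions tractable.
    return renderer.render([1.0, 2.5, 3.75])


if __name__ == "__main__":
    print(run_workflow(TextRenderer()))   # one implementation
    print(run_workflow(CsvRenderer()))    # a drop-in substitute
```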

WORKFLOWS

An integrated view of service-based activities is provided by the concept of a workflow. An IT-assisted workflow represents a series of structured activities and computations that arise in information-assisted problem solving. Workflows have been drawing enormous attention in the database and information systems research and development communities. Similarly, the scientific community has developed a number of problem-solving environments, most of them as integrated solutions. Scientific workflows merge advances in these two areas to automate support for sophisticated scientific problem solving. A workflow can be represented by a directed graph of data flows that connect loosely and tightly coupled, and often asynchronous, processing components. One such graph illustrates a Kepler-based implementation of a part of a fusion simulation workflow. In the context of “cloud computing”, the key question is whether the underlying infrastructure is supportive of this workflow-oriented view of the world. This includes on-demand and advance-reservation-based access to individual and aggregated computational and other resources, autonomics, the ability to group resources from potentially different “clouds” to deliver workflow results, an appropriate level of security and privacy, etc.
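To make the directed-graph view concrete, here is a minimal sketch of a workflow executed in dependency order. The step names and toy functions are assumptions for illustration; real processing components would usually be remote, asynchronous services.

```python
# A minimal sketch of a workflow as a directed graph of processing steps,
# executed in dependency (topological) order.

from graphlib import TopologicalSorter   # standard library, Python 3.9+

# Each step maps to the set of steps it depends on.
dependencies = {
    "fetch": set(),
    "clean": {"fetch"},
    "simulate": {"clean"},
    "visualize": {"simulate"},
}

# Toy processing components sharing a common context dictionary.
steps = {
    "fetch": lambda ctx: ctx.update(raw=[3, 1, 2]),
    "clean": lambda ctx: ctx.update(data=sorted(ctx["raw"])),
    "simulate": lambda ctx: ctx.update(result=sum(ctx["data"])),
    "visualize": lambda ctx: print("result:", ctx["result"]),
}


def run_workflow() -> None:
    context: dict = {}
    # Run each step only after all of its upstream data dependencies.
    for name in TopologicalSorter(dependencies).static_order():
        steps[name](context)


if __name__ == "__main__":
    run_workflow()
```

A cloud that supports this view must be able to reserve and release the resources behind each step, on demand or in advance, which is exactly the infrastructure question raised above.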

VIRTUALIZATION

Virtualization is another very useful concept. It allows abstraction and isolation of lower-level functionalities and underlying hardware. This enables portability of higher-level functions and sharing and/or aggregation of physical resources. The virtualization concept has been around in some form since the 1960s, e.g., in IBM mainframe systems. Since then, the concept has matured considerably, and it has been applied to all aspects of computing: memory, storage, processors, software, networks, as well as the services that IT offers. It is the combination of the growing needs and the recent advances in IT architectures and solutions that is now bringing virtualization to the forefront.


Through its economy of scale and its ability to offer advanced and complex IT services at a reasonable cost, virtualization is poised to become, along with wireless and highly distributed and pervasive computing devices such as sensors and personal cell-based access devices, the driving technology behind the next wave in IT growth. Not surprisingly, there are dozens of virtualization products, and a number of small and large companies that make them. Some examples in the operating systems and software applications space are VMware, Xen (an open-source Linux-based product developed by XenSource), and Microsoft virtualization products, to mention a few. Major IT players have also shown a renewed interest in the technology (e.g., IBM, Hewlett-Packard, Intel, Sun, Red Hat). Classical storage players such as EMC, NetApp, IBM and Hitachi have not been standing still either. In addition, the network virtualization market is teeming with activity.
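The abstraction and isolation that virtualization provides can be seen in how a hypervisor is driven programmatically. The following is a minimal sketch using the libvirt Python bindings, assuming a host with libvirt and QEMU/KVM installed; the domain XML is deliberately minimal (no disk or devices) and only illustrates the management layer, not a usable guest.

```python
# A minimal sketch of defining, starting, and tearing down a virtual machine
# through libvirt. Assumes libvirt, QEMU/KVM, and the libvirt-python bindings
# are installed; the guest below has no disk and is for illustration only.

import libvirt

DOMAIN_XML = """
<domain type='qemu'>
  <name>vcl-demo</name>
  <memory unit='MiB'>256</memory>
  <vcpu>1</vcpu>
  <os><type arch='x86_64'>hvm</type></os>
</domain>
"""


def main() -> None:
    conn = libvirt.open("qemu:///system")   # connect to the local hypervisor
    try:
        dom = conn.defineXML(DOMAIN_XML)    # register the virtual machine
        dom.create()                        # start it, isolated from the host
        print("running guests:", [d.name() for d in conn.listAllDomains()])
        dom.destroy()                       # power it off
        dom.undefine()                      # remove its definition
    finally:
        conn.close()


if __name__ == "__main__":
    main()
```

The same guest definition can, in principle, be carried to another host, which is the portability property discussed above.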

RESEARCH METHODOLOGY

The general cloud computing approach discussed so far, as well as the specific VCL implementation of a cloud, continues a number of research directions and opens some new ones. For example, the economy of scale and the economics of image and service construction depend to a large extent on the ease of construction and the mobility of these images, not only within a cloud but also among different clouds. Of special interest is the construction of complex environments of resources and of complex control images for those resources, including workflow-oriented images. The temporal and spatial feedback that large-scale workflows may present is also a valid research issue.

Underlying all of this is a considerable amount of metadata: some permanently attached to an image, some dynamically attached to an image, and some kept in the cloud management databases. Cloud provenance data, and metadata management in general, is an open issue. The classification we use divides provenance information into:

• Cloud process provenance: dynamics of control flows and their progression, execution information, code performance tracking, etc.

• Cloud data provenance: dynamics of data and data flows, file locations, application input/output information, etc.

• Cloud workflow provenance: structure, form, and evolution of the workflow itself.

Open challenges include: how to collect provenance information in a standardized and seamless way and with minimal overhead (modularized design and integrated provenance recording); how to store this information permanently so that one can come back to it at any time (a standardized schema); and how to present this information to the user in a logical manner (an intuitive web “dashboard” interface).

Some other image- and service-related practical issues involve finding optimal image and service composites and optimizing image and environment loading times. There is also the issue of image portability and, by implication, of the image format. Given the proliferation of different virtualization environments and the variety in the hardware, standardization of image formats is of considerable interest. Some open solutions exist or are under consideration, and a number of more proprietary solutions are already available. For example, VCL currently uses standard image snapshots that may be operating-system-, hypervisor- and platform-specific, and thus exchange of images requires relatively complex mapping and additional storage.

Another research and engineering challenge is security. For end-users to feel comfortable with a “cloud” solution that holds their software, data and processes, there must be considerable assurance that services are highly reliable and available, as well as secure and safe, and that privacy is protected. This raises the issues of end-to-end service isolation through VPNs, SSH tunnels and VLANs, and of the guarantees one may have that the data and the images keep their integrity in the “cloud”. Some of the work being done by the NC State Secure Open Systems Initiative involves watermarking of images and data to ensure verifiable integrity. While NC State’s experience with VCL has been excellent and our security solution has held up well over the last four years, security tends to be a moving target and many challenges remain. Direct comparisons with existing solutions are lacking at this point; however, the cost of service construction, maintenance and commonality definitely plays a role.
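To make the three provenance categories above concrete, here is a minimal sketch of how such records might be collected and queried. The record fields and the ProvenanceStore class are illustrative assumptions, not part of VCL or of any standardized schema.

```python
# A minimal sketch of collecting cloud provenance records in the three
# categories discussed above: process, data, and workflow provenance.

from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class ProvenanceRecord:
    category: str          # "process", "data", or "workflow"
    subject: str           # e.g. image name, file path, or workflow id
    detail: str            # what happened
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))


class ProvenanceStore:
    """Accumulates records; a real store would persist them to a standardized schema."""

    def __init__(self) -> None:
        self._records: List[ProvenanceRecord] = []

    def record(self, category: str, subject: str, detail: str) -> None:
        self._records.append(ProvenanceRecord(category, subject, detail))

    def by_category(self, category: str) -> List[ProvenanceRecord]:
        return [r for r in self._records if r.category == category]


if __name__ == "__main__":
    store = ProvenanceStore()
    store.record("process", "image:win-xp-lab", "loaded on blade b042")
    store.record("data", "/results/run-17.h5", "written by simulate step")
    store.record("workflow", "fusion-wf-v3", "step 'visualize' added")
    print(len(store.by_category("process")), "process record(s)")
```

The open questions raised above (minimal collection overhead, permanent storage, and an intuitive dashboard presentation) sit on top of exactly this kind of record stream.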
The figure shows utilization of the VCL seat-oriented resources by day over the last four years. Currently, the average number of blades participating on the single-seat side is over 200; initially it was in the forties. The overall number of reservation transactions covered by the graph exceeds 200,000. A much more agile re-distribution of the resources (perhaps nightly) is possible, since we have all the images and the reload mechanisms needed to re-purpose the resources.


It is not clear, however, that this would be a cost-saving measure. Another option is to react to the rising concerns about data-center energy costs and turn off some of the equipment during low-usage hours. There are issues there too: how often one would do that, whether it would shorten the lifetime of the equipment, and so on.

USERS

The most important Cloud entity, and the principal quality driver and constraining influence, is of course the user. The value of a solution depends very much on the view it has of its end-user requirements and user categories. For example, a solution aimed at the educational domain (K-20 and continuing education) would be expected to:

b. Support construction and delivery of content and curricula for these users. For that, the system needs to provide support and tools for thousands of instructors, teachers, professors, and others who serve the students.

c. Generate adequate content diversity, quality, and range. This may require many hundreds of authors.

d. Be reliable and cost-effective to operate and maintain. The effort to maintain the system should be relatively small, although introduction of new paradigms and solutions may require a considerable start-up development effort.

DEVELOPERS

Cyber-infrastructure developers are responsible for the development and maintenance of the Cloud framework. They develop and integrate system hardware, storage, networks, interfaces, administration and management software, communications and scheduling algorithms, services authoring tools, workflow generation and resource access algorithms and software, and so on. They must be experts in specialized areas such as networks, computational hardware, storage, low-level middleware, operating systems imaging, and similar. In addition to innovation and development of new “cloud” functionalities, they are also responsible for keeping the complexity of the framework away from the higher-level users through judicious abstraction, layering and middleware.

AUTHORS

Service authors are developers of individual base-line “images” and services that may be used directly, or may be integrated into more complex service aggregates and workflows by service provisioning and integration experts. In the context of the VCL technology, an “image” is a tangible abstraction of the software stack. It incorporates:

a. any base-line operating system and, if virtualization is needed for scalability, a hypervisor layer;

b. any desired middleware or application that runs on that operating system; and

c. any end-user access solution that is appropriate, e.g., web, RDP, VNC, etc.

Images can be loaded on “bare metal”, or into an operating system/application virtual environment of choice. When a user has the right to create an image, that user usually starts with a “No App” or base-line image (e.g., Win XP or Linux) without any applications, and extends it with the desired middleware and applications.


By constructing composite images (aggregates of two or more images, which we call environments, that are loaded synchronously), the user extends the service capabilities of VCL. An author can program an image for sole use on one or more hardware units, if that is desired, or for sharing of the resources with other users. Scalability is achieved through a combination of multi-user service hosting, application virtualization, and both time and CPU multiplexing and load balancing. Authors must be component (base-line image and application) experts and must have a good understanding of the needs of the user categories above them. Some of the functionalities a cloud framework must provide for them are image creation tools, image and service management tools, service brokers, service registration and discovery tools, security tools, provenance collection tools, cloud component aggregation tools, resource mapping tools, license management tools, fault-tolerance and fail-over mechanisms, and so on. It is important to note that the authors, for the most part, will not be cloud framework experts, and thus the authoring tools and interfaces must be appliance-like: easy to learn and easy to use, allowing the authors to concentrate on “image” and service development rather than struggle with the intricacies of the cloud infrastructure.
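The layered image structure described above (base operating system with an optional hypervisor, middleware or application stack, and an end-user access method) can be sketched as a simple descriptor. The field names and the extend helper below are illustrative assumptions, not the actual VCL image format.

```python
# A minimal sketch of a VCL-style "image" descriptor and of extending a
# base-line ("No App") image with applications. Field names are assumptions.

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Image:
    name: str
    base_os: str                       # e.g. "linux" or "winxp"
    hypervisor: Optional[str] = None   # set when virtualization is needed
    applications: List[str] = field(default_factory=list)
    access: str = "rdp"                # "web", "rdp", "vnc", ...
    bare_metal: bool = False           # load directly on hardware if True


def extend(base: Image, name: str, extra_apps: List[str]) -> Image:
    """Start from a base-line image and add applications to produce a new image."""
    return Image(name=name,
                 base_os=base.base_os,
                 hypervisor=base.hypervisor,
                 applications=base.applications + extra_apps,
                 access=base.access,
                 bare_metal=base.bare_metal)


if __name__ == "__main__":
    noapp = Image(name="noapp-linux", base_os="linux", hypervisor="xen")
    lab = extend(noapp, "stats-lab", ["R", "octave"])
    print(lab)
```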

SERVICE COMPOSITION

Similarly, services integration and provisioning experts should be able to focus on the creation of composite and orchestrated solutions needed for an end-user. They sample and combine existing services and images, customize them, update existing services and images, and develop new composites. They may also be the front for delivery of these new services (e.g., an instructor in an educational institution, with “images” being cloud-based in-lab virtual desktops), they may oversee the usage of the services, and they may collect and manage service usage information, statistics, etc. This may require some expertise in the construction of images and services, but for the most part their expertise lies elsewhere: it may range from workflow automation through a variety of tools and languages, to the domain expertise needed to understand what aggregates of services, if any, the end-user needs, to management of end-user accounting needs, to inter-, intra- and extra-cloud service orchestration and engagement, and to provenance data analysis.

The needs may range from “bare metal” loaded images and images on virtual platforms (on hypervisors), to collections of image aggregates (environments, including high-performance computing clusters), images with some restrictions, and workflow-based services. A service management node may use undifferentiated resources that can be reloaded at will, differentiating them with images of choice; after they have been used, these resources are returned to an undifferentiated state for re-use. An “environment”, on the other hand, is a collection of images loaded on one or more platforms: for example, a web server, a database server, and a visualization application server, or a high-performance cluster. A workflow image is typically a process control image that also has a temporal component; it can launch any number of the previous resources as needed and then manage their use and release based on an automated workflow.

Users of images that load onto undifferentiated resources can be given root or administrative access rights, since those resources are “wiped clean” after their use. On the other hand, a resource that exposes only some of its virtual partitions may allow only non-root cloud users: for example, a z-Series mainframe may offer one of its LPARs as a resource. Similarly, an ESX-loaded platform may be non-root access, while its guest operating system images may be of the root-access type.
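As a concrete illustration of the environment notion and the undifferentiated resource pool described above, here is a minimal sketch that loads a set of images together on pooled nodes and then wipes and returns the nodes. All names and the pool mechanics are illustrative assumptions, not VCL internals.

```python
# A minimal sketch of an "environment": a set of images loaded synchronously
# on resources drawn from an undifferentiated pool, then wiped and returned.

from typing import Dict, List


class ResourcePool:
    """Undifferentiated resources that are differentiated by loading images."""

    def __init__(self, node_names: List[str]) -> None:
        self._free = list(node_names)
        self._loaded: Dict[str, str] = {}   # node -> image currently loaded

    def load(self, image: str) -> str:
        node = self._free.pop()             # take any undifferentiated node
        self._loaded[node] = image          # differentiate it with the image
        return node

    def release(self, node: str) -> None:
        self._loaded.pop(node)              # "wipe clean" ...
        self._free.append(node)             # ... and return it to the pool


def run_environment(pool: ResourcePool, images: List[str]) -> None:
    # Load all images of the environment together, use them, then release them.
    nodes = [pool.load(img) for img in images]
    print("environment up:", dict(zip(images, nodes)))
    for node in nodes:
        pool.release(node)


if __name__ == "__main__":
    pool = ResourcePool(["blade01", "blade02", "blade03"])
    run_environment(pool, ["web-server", "db-server", "viz-server"])
```

A workflow image would sit one level above this sketch, launching and releasing such environments according to its own temporal logic.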

END-USERS

End-users of services are the most important users. They require appropriately reliable and timely service delivery, easy-to-use interfaces, collaborative support, information about their services, etc. The distribution of services across the network and across resources will depend on task complexity, desired schedules, and resource constraints. Solutions should not rule out the use of any network type (wired, optical, wireless) or access mode (high speed and low speed). However, VCL has set a lower bound on end-to-end connectivity throughput, roughly at the level of DSL and cable modem speeds.


The services delivered to end-users may range from single-seat desktops (“computer images”) that deliver any operating system and application appropriate to the educational domain, to a group of lab or classroom seats for support of synchronous or asynchronous learning sessions, to one or more servers supporting different educational functions, to groups of coupled servers or environments (e.g., an Apache server, a database server, and a workflow management server all working together to support a particular class), and to research clusters and high-performance computing clusters.

CONCLUSIONS

“Cloud” computing builds on decades of research in virtualization, distributed computing, utility computing, and, more recently, networking, web and software services. It implies a service-oriented architecture, reduced information technology overhead for the end-user, great flexibility, reduced total cost of ownership, on-demand services and many other things. This paper discusses the concept of “cloud” computing, the issues it tries to address, related research topics, and a “cloud” implementation based on VCL technology. Our experience with VCL technology is excellent and we are working on additional functionalities and features that will make it even more suitable for cloud framework construction.

REFERENCES

  • I. Foster and C. Kesselman (Eds.), The Grid: Blueprint for a New Computing Infrastructure, 2nd Edition, Morgan Kaufmann, ISBN 1-55860-933-4.
  • D. Georgakopoulos, M. Hornick, and A. Sheth, “An Overview of Workflow Management: From Process Modeling to Workflow Automation Infrastructure”, Distributed and Parallel Databases, Vol. 3(2).
  • Globus: http://www.globus.org/
  • Hadoop: http://hadoop.apache.org/core/
  • E. N. Houstis, J. R. Rice, E. Gallopoulos, and R. Bramley (Eds.), Enabling Technologies for Computational Science: Frameworks, Middleware and Environments, Kluwer Academic Publishers, Hardbound, ISBN 0-7923-7809-1.
  • IBM, “IBM Launches New System x Servers and Software Targeting Large Scale x86 Virtualization”, http://www-03.ibm.com/press/us/en/pressrelease/19545.wss
  • IBM, “Google and IBM Announce University Initiative to Address Internet-Scale Computing Challenges”, http://www-03.ibm.com/press/us/en/pressrelease/22414.wss
  • IBM, “IBM Introduces Ready-to-Use Cloud Computing”, http://www-03.ibm.com/press/us/en/pressrelease/22613.