WHY YOU NEED A PRIVATE CLOUD

A private cloud is a cloud deployment dedicated to a single customer or organization. It resides either in a data center on the organization's premises or in the data center of a private cloud service provider. Unlike public clouds, where multiple customers consume the same shared infrastructure of the public cloud provider, a private cloud is confined entirely to the owning business. It may be managed by the business itself through its in-house IT team, or by the service provider or a third-party administrator. Private clouds offer enhanced security compared to shared public clouds accessed by many customers, while still providing the advantages typically associated with public clouds, such as automation, scalability, flexibility, fault tolerance, high availability, and rapid, on-demand provisioning.
Public clouds have an initial cost advantage over private clouds because their infrastructure is shared across customers. But other factors, related to the nature of the business and its IT requirements, can make a private cloud the better fit. A good percentage of SMEs and large businesses need a cloud platform that is dynamic, mission-critical and highly secure, and that also offers enterprise-level scalability and uptime. A private cloud that is well secured behind enterprise firewalls, and over which the organization has direct control, satisfies these requirements. Organizations that already operate on-premise data centers can transform them to host their private clouds without much effort.
Private cloud and security
Private cloud servers are normally hosted inside the organizational boundary or in a data center of the service provider. In both cases, data access and security protocols comply with the organization's policies and security guidelines. The communication network, as well as the physical and logical security controls, can be configured to suit organizational needs. Access to private cloud resources is restricted to designated delegates of the organization, which provides higher data integrity and isolation from the outside world. Organizations can choose the best possible security hardware and configuration for their private clouds, and can add and customize additional layers of security according to their security policy and planning.
Customization of the cloud components
For a private cloud, the organization can select and customize the various components, such as compute CPUs, memory, volumes, load balancers, network components, and identity and access services, according to its needs and objectives. The level of security, automation, resource pool capacity and so on can likewise be customized and optimized to achieve the desired capabilities.
Compliance
Modern business computing is governed by strict compliance regulations such as HIPAA and PCI-DSS, which must be reflected in business application design and communication. These rules exist to achieve data security, integrity and interoperability between diverse applications across business domains. When audited by a standards organization, a private cloud can be brought into compliance with the prescribed standards relatively easily, since the organization controls every layer of the stack.
Performance and reliability
In a private cloud you can set up the combination of hardware, middleware and software best suited to your business needs. Since you have already invested in the infrastructure, it costs only marginally more to add the hardware and software resources needed to deliver maximum performance, whether that means 4 or 100 CPU cores, 1 or 500 GB of RAM, or terabytes of replicated high-availability HDD and SSD storage. You can therefore select a powerful configuration for high performance and availability. Since you control monitoring and administration, you can ensure high reliability through proper configuration, maintenance, capacity planning and load balancing. You are not limited to the public cloud provider's chosen hardware, nor do you have to pay extra for enhanced performance under a pay-as-you-go or reserved instance scheme. You have the flexibility to deploy and consume your own resources in your own workflows, without worrying about hardware sharing or resource outages.
Business continuity with private clouds
Business continuity has different dimensions and impacts for your business than it does for a public cloud provider. Your business must look after its own interests along with those of its contractors, stakeholders and partners. In such a collaborative environment, it is crucial to ensure business continuity to the maximum level through high uptime (99.99% or better), uninterrupted and robust delivery of services, and continuous infrastructure availability. There is no guarantee that a public cloud provider, with its own infrastructure, vast customer base, resource contention, management overheads, technology lock-ins and public regulations, can assure you the continuity you need. The concern is not only operational continuity without downtime, but also the continuity of the provider's cloud business itself: transfers, mergers, acquisitions and upgrades can all affect your deployment and, in turn, your business continuity. A dedicated private cloud, on the other hand, lets you guarantee continuity and uptime tailored to your business policies and operational requirements, because you can continuously monitor, analyze, scale and restructure the cloud in whatever direction keeps your business running.
Cost factors for private clouds
Since organizations need to set up dedicated infrastructure for their private clouds, or hire dedicated infrastructure from a private cloud service provider, the initial investment is higher than for public cloud offerings. But over time, as private cloud usage and its benefits grow, the cost benefit of the private cloud overtakes that of public cloud spending. Public cloud cost models such as pay-as-you-go and reserved instances do not apply to a private cloud: allocation and consumption of compute, storage, memory and networking resources are largely independent of cost constraints, because their lower and upper bounds are already included in the initial cost of setting up and deploying the cloud. The nominal costs that arise during operation are mainly for staff, consumption of third-party services, and infrastructure maintenance. It is the Total Cost of Ownership (TCO), spanning from initial deployment across the entire life cycle of the private cloud platform, that delivers the cost benefit for organizations.
Conclusion
The initial cost and the additional effort of maintaining dedicated infrastructure may push small businesses towards public cloud, with its lower operational costs and reduced in-house responsibilities. But for strategic, mission-critical and dynamic applications and IT environments, SMEs and enterprises can choose dedicated private clouds for their stronger high availability, flexibility, control, business continuity and favorable TCO. A private cloud effectively tailors the IT environment to the organization's backbone, providing a streamlined environment and room for growth and development.



WHY YOU NEED TO SCALE YOUR CLOUD

Today, the cloud dominates the IT world because of the many ways it has helped businesses address their major IT concerns: cost, durability, flexibility, performance, security, operability, expertise, manual effort, extensibility and scalability. Cloud computing and its associated technologies have revolutionized IT, and most SMEs and almost all enterprises are heavily harnessing the power of cloud technologies.
Why scalability?
Businesses and projects are not meant to be static; they grow, develop and expand, and IT infrastructure has become the main channel for that growth because of the capabilities it provides across operations, research, administration, marketing, management and other divisions. As a result of both planned and evolutionary growth, businesses need to expand, or scale, their IT infrastructure to keep pace with business needs. This is where the cloud again proves its strength: the ease and flexibility with which it can be scaled up or down to match the computational and data needs of the business. Scalability is the capability of an IT infrastructure to dynamically readjust itself to the desired capacity by adding or removing resources for computation, storage, networking and so on.
Scaling a cloud based IT infrastructure normally happens when one or more of the following (but not limited to) scenarios occur:
1. The business wants to expand its IT infrastructure for implementing a growth plan or the growth has demanded an expansion of the infrastructure.
2. Public-facing services and/or internal IT resources are experiencing high workloads due to new scenarios such as policy changes, new marketing campaigns, or operational changes.
3. Business needs more resources for one or more of the business divisions like administration, operation etc.
4. An optimization plan demands change in quantity, quality and type of resources.
5. A peak load time or seasonal hike in traffic or computational requirements needs more resources.
6. An expansion – logical or geographical – demands more capabilities.
Types of scaling
Scale Up or Vertical Scaling: this happens at a more physical level, such as adding more RAM, additional physical disks, cache or processors to take a system from a lower to a higher capability level. It can also mean enhancing the server configuration, adding more physical units to the existing server, or even replacing the current server with an entirely new and more powerful one. This is typically needed when it is time to decommission the current server, when a new RAID configuration must replace the current storage partitioning, or when more memory is needed to support high computational requirements. Since it happens at the hardware level, it involves extra cost for upgrading or purchasing new hardware, components and servers.
Scale Out or Horizontal Scaling: this happens side to side. More compute, storage or networking resources are added at a logical level through technologies such as virtualization and elastic instances, and the additional resources extend the overall IT capability horizontally. A typical example is adding more compute or storage instances on the same physical server using virtualization with hypervisors. A load balancer distributes load across the connected resources, and scaling is achieved that way, so a distributed or clustered architecture of interconnected nodes suits horizontal scaling best. This type of scaling is more cost-effective, quick and flexible, since it uses virtualization to obtain more resources on existing physical hosts and therefore avoids additional hardware purchases and lengthy upgrade routines.
Scaling techniques
Auto-scaling is based on workload triggers that cause resource provisioning events to allocate more resources automatically and dynamically without manual intervention.
Scheduled scaling uses workload and resource utilization planning to schedule scaling at specific points in time. Rather than waiting for the need to occur, the system is scaled beforehand at the scheduled times; any requirements that arise unexpectedly are handled by auto-scaling.
Predictive scaling analyzes network and resource usage patterns, trends and historical data to forecast the capacity needed in different periods, such as peak, off-peak and seasonal-spike periods, and scales the cloud according to those capacity requirements.
What is a scalable architecture?
As the descriptions above suggest, scaling out, or horizontal scaling, is the most flexible and cost-effective way to scale your cloud. The architecture of the cloud largely decides how easily and effectively it can be scaled. A loosely coupled, component- and service-based, pluggable architecture is the best foundation for scaling out, because the application can be logically separated into components and services that scale independently. For example, an internally used management component may need far less compute, storage and network resources than an e-commerce component that receives heavy traffic and transactions from the outside world. Resource allocation for these two components need not be the same: only organizational staff, management and stakeholders consume the management component, while the general public, including customers, forms a heavy client base for the e-commerce component. The architecture and scaling framework must therefore support component-based resource allocation and scaling, optimizing resource usage and preventing both over- and under-allocation. This type of service-based scaling is also called product-level scaling, since the product's requirements are the deciding factor.
Another aspect of a scalable architecture is its capability to scale at technology component level. Various technology components like the API server, transaction server, transaction DB, reporting DB, network interfaces, orchestration service, containerization service, etc. need to be scaled independently depending upon the requirements. For the above example, the transaction DB may need more resources compared to the reporting DB.
Auto-scaling in OpenStack Cloud
OpenStack, the popular open source cloud platform, is horizontally scalable and has built-in mechanisms for auto-scaling. OpenStack supports horizontal scaling through virtualization and load balancing, and its component services or instances can be quickly provisioned and released to implement scalability. OpenStack Heat, the orchestration engine of OpenStack, performs auto-scaling based on triggers received from the metering component, OpenStack Ceilometer. Ceilometer, the telemetry service, can be configured by the user with resource consumption alarms. For example, when an alarm is set at 90% usage of the compute instances, Ceilometer, which monitors usage, detects the 90% load and raises an alarm to Heat. Heat then scales out the cloud by adding the compute resources specified in a Heat Orchestration Template (HOT). A lower limit, say 25% usage, triggers another alarm for scale-down, and Heat reclaims the previously allocated compute resources to restore the cloud to its minimum capacity. Users define the upper and lower usage or load limits that trigger scale-up and scale-down in the HOT template.
The figure illustrates a scenario in which additional nodes or instances for the Apache httpd server are created and managed by the orchestration service and load balanced to distribute traffic, achieving true scalability.
More about OpenStack Heat
OpenStack Heat can handle multiple instances through simple commands and HOT template files. To implement auto-scaling in an OpenStack cloud, Heat can:
1. Create new instances from images.
2. Use the metadata service for configuration.
3. Use Ceilometer to create alarms for CPU/Memory/Network usage in instances.
4. Based on the alarms, perform events attached to these triggers for auto-scaling, such as deploying or terminating instances according to work load.
The code below shows an example Heat Orchestration Template:
heat_template_version: 2015-04-30
description: Simple template to deploy a single compute instance
parameters:
  key_name:
    type: string
    label: Key Name
    description: Name of key-pair to be used for compute instance
  image_id:
    type: string
    label: Image ID
    description: Image to be used for compute instance
  instance_type:
    type: string
    label: Instance Type
    description: Type of instance (flavor) to be used
resources:
  my_instance:
    type: OS::Nova::Server
    properties:
      key_name: { get_param: key_name }
      image: { get_param: image_id }
      flavor: { get_param: instance_type }
Heat is the orchestration engine built into OpenStack. It orchestrates the cloud components using a declarative template format exposed through an OpenStack-native REST API. The Heat Orchestration Template (HOT) defines the cloud application infrastructure in YAML text files, and relationships between the various OpenStack components are specified there (for example, that a particular volume is attached to a particular server). Using these templates, Heat makes OpenStack API calls to create the infrastructure in the required order and configuration. Although Heat's primary responsibility is managing the infrastructure, HOT templates also integrate with automation and configuration management tools such as Puppet and Ansible.
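To connect this back to the alarm-driven scaling described earlier, the following is a minimal sketch, not a production template, of a HOT file that pairs an auto-scaling group with scale-up and scale-down policies. It follows the pattern of older releases where OS::Ceilometer::Alarm was available (newer releases use Aodh/Gnocchi based alarm resources instead), and the thresholds, group sizes and parameter names are illustrative only.
heat_template_version: 2015-04-30
description: Sketch of an auto-scaling group driven by Ceilometer CPU alarms
parameters:
  key_name:
    type: string
  image_id:
    type: string
  instance_type:
    type: string
resources:
  web_server_group:
    type: OS::Heat::AutoScalingGroup
    properties:
      min_size: 1
      max_size: 5
      resource:
        type: OS::Nova::Server
        properties:
          key_name: { get_param: key_name }
          image: { get_param: image_id }
          flavor: { get_param: instance_type }
  scaleup_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: { get_resource: web_server_group }
      cooldown: 60
      scaling_adjustment: 1
  scaledown_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: { get_resource: web_server_group }
      cooldown: 60
      scaling_adjustment: -1
  cpu_alarm_high:
    type: OS::Ceilometer::Alarm
    properties:
      description: Scale out when average CPU usage stays above 90%
      meter_name: cpu_util
      statistic: avg
      period: 60
      evaluation_periods: 1
      threshold: 90
      comparison_operator: gt
      alarm_actions:
        - { get_attr: [scaleup_policy, alarm_url] }
  cpu_alarm_low:
    type: OS::Ceilometer::Alarm
    properties:
      description: Scale in when average CPU usage stays below 25%
      meter_name: cpu_util
      statistic: avg
      period: 60
      evaluation_periods: 1
      threshold: 25
      comparison_operator: lt
      alarm_actions:
        - { get_attr: [scaledown_policy, alarm_url] }
      # For brevity this sketch omits the metadata matching that is normally
      # used to scope the alarms to instances belonging to this group only.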
Conclusion
Auto-scaling is the mechanism that provides meaningful scaling through on-demand scale-up and scale-down. The true potential of a scalable cloud infrastructure is realized when resource allocation is optimized, that is, without over- or under-allocation. OpenStack Heat and the Heat Orchestration Template are a great way to achieve auto-scaling in an OpenStack cloud, optimizing resource usage while maintaining a stable and growing production deployment.



SIX CLOUD TOOLS FOR AUTOMATION AND DEPLOYMENT OF YOUR CLOUD INFRASTRUCTURE

A cloud based infrastructure is really an ecosystem for your IT needs. It contains numerous components, from the base hardware up to the cloud software and the analytics and monitoring tools, all well integrated in a pluggable architecture that gives you fault-tolerant computation, data storage and access, service delivery, and product development and deployment. A good cloud infrastructure needs high availability and horizontal scalability, so that growing IT and business requirements are met by adding more logical resources rather than separate physical systems. This is where the popular open source cloud software, OpenStack, helps you reach your goals. OpenStack builds your IT environment as Infrastructure as a Service (IaaS): it runs on commodity hardware, with a number of component services on top providing compute, storage, networking, dashboard and other services. This segregation into component services is the key to achieving horizontal scalability with OpenStack.
Automation and Deployment Management for Cloud
Much of the effectiveness of the cloud, and of OpenStack, lies in the degree of automation across the phases of the cloud life cycle: installation, setup, deployment, configuration, management, scaling and troubleshooting. OpenStack provides the foundation for this automation, and its co-projects, together with numerous other open source and enterprise projects, provide the tools to achieve it. It is up to you to select the right tool, and that choice decides how much of the power of cloud automation and scaling you actually tap into, and hence how much costly effort and manpower you save. An automated deployment management system handles every aspect of cloud administration, such as creating virtual instances, starting and stopping them, resource provisioning, configuration management, integration management, DNS configuration and overall system administration, with little human intervention. Below we examine some of the most effective tools for automation, deployment and configuration management with clouds like OpenStack.
Fuel – the OpenStack Deployment and Management Tool
Fuel is a GUI based tool for deploying and managing an OpenStack cloud along with its components. It accelerates the otherwise complex deployment and testing process by deploying and running various configuration flavors of OpenStack, and it handles life cycle processes of the cloud such as scaling up and down, configuration management and plugin management. Fuel performs the following operations as part of cloud deployment, testing and configuration:
1. Discovers hardware, both bare metal and virtual nodes that can be configured to boot from network.
2. Allows for configuring the hardware through the GUI.
3. Provides capability to spawn and manage multiple OpenStack clusters.
4. Performs pre-deployment check and network validations.
5. Performs post-deployment checks and tests for validating the deployed OpenStack cloud.
6. Provides access to logs in real time through the GUI.
The power of Fuel lies both in the intuitive GUI it provides for setup, testing, validation, deployment and configuration of the OpenStack cloud, and in its ability to perform complex deployment related processes such as resource and network detection, pre- and post-deployment testing and validation, and configuration testing. The step-by-step wizard offers an array of configuration options the user can select, including:
1. The host OS.
2. The hypervisor.
3. Storage back end.
4. Network topology.
5. Controller configuration that best suits the HA (high availability) requirements of the cloud.
 
At the cloud level, Fuel assists in creating and deleting nodes, assigning roles and access to nodes, replicating nodes, creating and managing volumes, and partitioning hard disks that span multiple physical hosts in the cloud, and it provides configuration and network templates from which the best suited one can be selected.
More about Fuel: https://www.fuel-infra.org
Mistral – the OpenStack Workflow Service
Each process associated with a cloud platform workflow is identified as a task, and a set of tasks defines a process. In a distributed cloud environment these distinct processes must be interconnected and executed in a particular order. Mistral provides the mechanism to define these tasks and their relationships, called workflows, and then execute them. The typical workflow concerns Mistral handles are state management, synchronization, parallelization, high availability and task scheduling. Mistral also helps enhance fault tolerance: the user can split business tasks into parallel processes executed on multiple nodes, and if one node crashes, Mistral continues the partially completed job on another node from the checkpoint reached before the crash.
 
Mistral can be used by a user, or by client software or a framework, to specify cloud deployment workflows that include the creation of multiple VMs and their associated resources and components. It can schedule the tasks essential to the cloud workflow: local processes (shell scripts or binaries) on virtual instances, instance management tasks such as creation, starting, stopping and termination, and batches of tasks grouped into specific workflows and executed on a schedule, possibly in parallel.

Mistral can also be used for data crawling that aids Big Data analysis, and it plays a role in live VM migration: when a VM exceeds its CPU usage limit, the Ceilometer component that measures resource usage triggers the migration.
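To make this concrete, here is a minimal sketch of a Mistral workflow written in the YAML based Mistral workflow language (version 2). It is an illustrative example, not an official template: the action names (nova.servers_create, nova.servers_find) are auto-generated by Mistral from the OpenStack client libraries and may vary between releases, and the parameter values are placeholders.
---
version: '2.0'

create_vm:
  description: Hypothetical workflow that boots a VM and waits for it to become active
  type: direct
  input:
    - vm_name
    - image_ref
    - flavor_ref
  output:
    vm_id: <% $.vm_id %>
  tasks:
    create_server:
      # Boot the server through the Nova action and remember its ID
      action: nova.servers_create name=<% $.vm_name %> image=<% $.image_ref %> flavor=<% $.flavor_ref %>
      publish:
        vm_id: <% task(create_server).result.id %>
      on-success:
        - wait_for_instance
    wait_for_instance:
      # Poll until the server reaches ACTIVE state
      action: nova.servers_find id=<% $.vm_id %> status='ACTIVE'
      retry:
        delay: 5
        count: 15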
More about Mistral: https://docs.openstack.org/mistral/latest/
Compass – Automating Deployment and Management of OpenStack
Compass helps bootstrap the server pool associated with a cloud platform such as OpenStack, starting from bare metal nodes. Compass and its integrated plugins assist in discovering hardware, deploying the OS and hypervisor, and providing configuration management. Compass helps in the following areas of OpenStack cloud deployment and management:
1. Assists in the infrastructure bootstrapping process and offers operators programmability over it.
2. Allows for implementing different configuration flavors through meta-data.
3. Implements extensibility by integrating a number of tools, such as Chef and Ansible, for OpenStack cluster configuration. By default, Ansible is used for OpenStack installation, and the Compass core blends with other tools for resource discovery, OS provisioning and package deployment.
 
 
More about Compass: http://www.syscompass.org/install.html
 
Ansible – the OpenStack Configuration and Orchestration Tool
Ansible provides automation for the provisioning, configuration, intra-service orchestration and deployment of applications on the cloud. It provides playbooks and roles for performing various deployment and configuration tasks in an OpenStack cloud. Ansible playbooks describe automation jobs in YAML, which is also referred to as orchestration of the cloud. Ansible uses modules, or routines, to perform the intended tasks on nodes, and the user can drive this from the shell command line. It communicates over SSH for security. Ansible provides the following capabilities (a minimal playbook sketch follows the list):
1. Parallel task execution
2. Orchestration through Playbooks.
3. Dynamic building of configuration files.
4. Provides a simple language for overall automation
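As an illustration, the following is a minimal, hypothetical playbook sketch that installs and starts an Apache web server on a group of cloud hosts. The host group name webservers is a placeholder from an assumed inventory, and the apt module assumes Debian/Ubuntu style hosts.
---
- name: Configure web server nodes
  hosts: webservers
  become: true
  tasks:
    - name: Install Apache
      apt:
        name: apache2
        state: present
        update_cache: yes

    - name: Ensure Apache is running and enabled at boot
      service:
        name: apache2
        state: started
        enabled: yes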
More about Ansible: https://www.ansible.com/how-ansible-works
 
Chef – Integration Framework for Cloud Infrastructure Automation
Chef is an open source integration framework for automating cloud deployment. It enables programmers to create machines and other cloud infrastructure components programmatically, transforming the infrastructure into code; regardless of the number of servers, the entire infrastructure can be managed this way. Chef provides configuration management and other infrastructure management tasks using recipes, which are reusable definitions for automating infrastructure tasks. Chef has the following components:
1. Chef cookbooks are units of configuration and policy distribution, that is, the code that describes the desired state of the infrastructure.
2. A Chef Server is a central repository for cookbooks and for every node it manages.
3. A Chef Client runs on each node in the cloud and communicates with the Chef server to get the latest configuration description. This is used as instructions for bringing that node to the desired state.
4. Chef DK (Development Kit) provides the tools needed to develop the infrastructure automation code directly from your workstation.
 
More about Chef: https://docs.chef.io/chef_overview.html
 
Puppet – a State based Configuration Management System
Puppet is a configuration management system that lets users define the desired state of the IT infrastructure and then automatically and continuously enforces that state. It can handle thousands of physical machines as well as logical VMs, and it also allows the infrastructure to be defined as code. With this approach, multiple teams can collaborate on cloud infrastructure development through agile methodologies, automated testing and continuous delivery. Through configuration management, Puppet thus brings standardization to cloud management.
More about Puppet: https://puppet.com



HOW OPENSTACK COSTS LESS THAN AMAZON AWS

Amazon AWS and EC2 are public cloud services implemented through an API driven infrastructure while OpenStack supports all private, public and hybrid cloud models through server based virtual machine technologies. According to AWS, Amazon EC2 is a web service that provides resizable compute capacity in the cloud and is designed to make web-scale cloud computing easier for developers. According to OpenStack, it controls large pools of compute, storage and networking resources throughout a data center, managed through a dashboard or through the OpenStack API. OpenStack works with popular enterprise and open source technologies making it ideal for heterogeneous infrastructure.
The Cost Models

The cost of running a cloud falls into two categories: capital investment and operational costs. For AWS there are two cost models, which apply to public clouds in general:
a) Reserved instance or spot pricing: resources are purchased in advance (reserved instances provide a capacity reservation) or at discounted spot rates, and are used when needed.
b) On-demand pricing: billing per minute or hour of resource usage. This can cause unpredictable cost spikes as usage grows.
With OpenStack, following cost or implementation models can be utilized:
a) Buy hardware for setting up and running the OpenStack cloud.
b) Purchase a licensed distribution from a vendor; the cost covers the license, annual support and recurring renewal fees.
c) Buy an OpenStack subscription service from a provider and receive support, maintenance etc.
d) Consume a fully managed OpenStack service also known as hosted cloud or private cloud as a service.
e) Freely download OpenStack and use either an in-house team or contractors to install, setup, maintain and operate the cloud.
Given the cost models above, it is apparent that public cloud pricing suits an IT setup with varying demands for resources and workload, while the private cloud cost model suits setups that expect linear growth in resource needs and workload. When the infrastructure scales out with more and more compute, storage and networking resources, there are obviously impacts on the application and client side as well: a growing IT estate means increased usage and network bandwidth consumption, and under an on-demand cost model the metered traffic, and hence the cost, rises with it. In this scenario, OpenStack's private or hybrid cloud cost models help keep costs under your control in bandwidth-intensive production setups, by allowing computational resources to be segregated across an interconnected hybrid cloud. Computation and data that need little public access, and hence little network bandwidth, can be safely deployed in the private cloud, while the commercial or production workload stays on the public side.
Flexibility in investment

A public cloud based IT strategy often runs into limits on flexibility when long term operational and growth strategies are considered. In this model the business is locked into the public cloud vendor's technology stack, which limits interoperability with other clouds and can prevent a transition to another offering when demands and requirements change. The high cost and the lack of interoperability and compliance between vendor locked-in public clouds can be overcome with a more flexible option such as OpenStack. With OpenStack you can deploy a scalable and flexible infrastructure on a hybrid cloud platform that runs on commodity hardware and uses virtualization for scale-out. OpenStack, in its plain form or as a customized distribution, offers the flexibility to choose what you need and leaves room to expand and customize your infrastructure as demands grow and vary, at a significantly lower cost than the public cloud model.
Where AWS costs higher than private OpenStack clouds

AWS is ideal for quick prototyping and for setups where resource and expansion requirements stay within predictable limits. During those stages everything is easy to control under the reserved or on-demand pricing models, since capacity and utilization are under control. But when your infrastructure, resource consumption and growth rate expand at large scale, AWS becomes expensive. When data and workload grow too large to manage with an in-house solution, moving the infrastructure to a public cloud like AWS costs more and raises more security concerns than an on-premise or co-location based private cloud. Another area is departmental workloads, such as accounting, human resources, development and quality assurance, where you do not want to spend on network bandwidth: with a public cloud like AWS you pay for every byte, whereas on a local OpenStack private cloud that bandwidth cost is essentially zero.
Getting benefits of a public cloud through private clouds as a service

The advantages of a public cloud, such as avoiding infrastructure setup and other capital expenses, along with benefits like resource sharing, elasticity and easy procurement, can also be obtained through Private Cloud as a Service offerings. The operating-expense cost models of public clouds are well implemented in vendor solutions that use OpenStack to provide a private cloud as a service architecture. With such a service, businesses can deploy their IT infrastructure or workloads on their own premises or at a co-location facility. This setup offers all the benefits of a private cloud, such as flexibility, authority, security and sovereignty, with a public cloud style consumption model, and businesses can take advantage of capacity planning, monitoring, resource optimization and management for their projects. For well calculated and planned workloads, as well as seasonal spikes, this setup costs less than a public cloud like AWS.
The planning factors for an OpenStack setup

Organizations need to plan their requirements, optimize the solution and anticipate growth in advance to get the most out of an OpenStack solution. Unless it is a managed service model, there must be in-house or contracted expertise for managing the OpenStack deployment. Vendor support is available at reasonable cost for OpenStack service models, from the deployment phase through the operational and growth phases, and the global community behind OpenStack provides integration and consultation support that lets organizations leverage OpenStack for their IT environment. It is true that an OpenStack setup needs planning, expertise and initial capital investment unless you use a hosted OpenStack solution, but that lag lasts only until the initial resource, computational and bandwidth requirements are covered. Once the system matures and starts operating and growing at a steady rate, the OpenStack cloud platform returns results, including cost effectiveness, steadily and continuously.



WHY YOU SHOULD USE OPENSTACK

What is OpenStack?
OpenStack is an open source cloud computing platform that can be used to set up, control and manage the various components of a cloud, such as compute nodes, storage, database and networking resources, through a web based administrative dashboard (Horizon). It provides the cloud platform as an Infrastructure as a Service (IaaS) solution through a set of connected services. Users can provision resources quickly, for example adding new instances through the dashboard. OpenStack allows for great horizontal scalability and is highly configurable, and it can be used to set up and deploy private, public and hybrid clouds. OpenStack is evolving continuously and has the strength and resources needed to create highly available cloud platforms. It has revolutionized the cloud sector through the simplicity and flexibility with which businesses can set up their own private, public or hybrid clouds and use rapid provisioning for scalability.
Here are the top reasons why you should use OpenStack in your IT infrastructure.
Global support and collaboration
OpenStack is maintained by a global community and backed by major enterprises such as IBM, Intel, Red Hat, AMD, HP, Oracle, AT&T, Cisco, Dell, Ubuntu and others. Thousands of developers work on it and its components to enhance and enrich them further, so with OpenStack you effectively get the assistance of the whole world.
Cost effective and flexible
OpenStack is free and open source and uses virtualization over commodity hardware to deliver distributed Infrastructure as a Service (IaaS) functionality. You can create abstract pools of compute, storage and networking resources easily through a user friendly dashboard; since the resources are virtual machines, you spend on logical resources rather than on additional hardware. A major cost factor is bandwidth utilization. Most businesses have data that does not need to be exposed and only needs to be accessed in-house or by a closed group. Using OpenStack, the high-traffic commercial components can be deployed in a public cloud while sensitive in-house data stays in a private cloud, which not only reduces cost but also increases security. The distinction between development and production environments can also be drawn by using a low cost private deployment as the development platform and an optimized, high quality configuration as the production platform.
High Security
OpenStack's role based access controls ensure a high level of security. The identity service, Keystone, provides authentication through username-password schemes as well as token based mechanisms. Access to the cloud and its resources can be configured and controlled at the level of users, roles and projects. OpenStack thus ensures strong data security, sovereignty and control through fine grained access and authentication mechanisms, and additionally through closed, on-premise private cloud deployment.
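For illustration, client tools authenticate against Keystone using credentials such as those in a clouds.yaml file, which the OpenStack command line client and SDKs read to obtain tokens and then access the other services. The sketch below is a minimal example with placeholder endpoint, project and credential values; it assumes the Keystone v3 Identity API.
clouds:
  mycloud:                  # placeholder cloud name
    auth:
      auth_url: https://controller.example.com:5000/v3   # placeholder Keystone endpoint
      project_name: demo
      username: demo
      password: secretpassword
      user_domain_name: Default
      project_domain_name: Default
    region_name: RegionOne
    identity_api_version: 3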
Easy and powerful administrative Dashboard
Through the dashboard component, Horizon, OpenStack provides a powerful, intuitive, web based GUI for administering the cloud. It lets you monitor and control your cloud services and resources for compute, storage, networking, users and so on. Administrators can monitor resource usage, VM instances and user activities, and quickly provision additional resources from the dashboard. Alongside the dashboard, OpenStack also provides command line tools and APIs for accessing and managing the cloud.
Distinct and well integrated component services
The OpenStack cloud is implemented through core, individually maintained component services such as Nova for compute, Neutron for networking, Cinder for block storage and Horizon for the dashboard. Since the whole setup uses a pluggable architecture, the services can be configured, scaled and managed independently, and business specific customizations can be performed easily. The components can also be integrated with other technologies, such as Galera Cluster, to achieve synchronous, replication based clustering.
Performance and horizontal scalability
The OpenStack cloud has all the elements needed for high performance, high availability and horizontal scalability at little cost. By using parallel, distributed nodes and avoiding single points of failure, high availability and performance are ensured. Rapid, flexible provisioning of compute, storage and networking resources lets the cloud user exploit OpenStack's horizontal scalability to the maximum benefit of their business.
OpenStack has a reputed customer base
OpenStack's flexibility and simplicity make it easier for businesses to choose the cloud model that best suits their requirements. Keeping sensitive data in a private cloud while placing commercial data in a public cloud, and managing both simultaneously, is no longer a complex or costly project: it can be implemented using OpenStack's virtualization, authentication and security, ease of administration and rapid resource provisioning. Red Hat, IBM, Rackspace, Intel and others have developed and released customized enterprise cloud offerings that combine the OpenStack core and components with their own software, middleware and hardware to suit a wide variety of business requirements. Along with supporting and funding the project, these companies use customized OpenStack clouds for their own business needs. Examples include AT&T's Integrated Cloud (AIC), Dropbox's OpenStack private cloud implementation, and the OpenStack deployments used by organizations such as eBay, TD Bank, BMW, Walt Disney, Wal-Mart and Verizon.
A balanced approach for your IT needs
OpenStack enables businesses to use its power and flexibility to plan and implement a hybrid, highly customizable and controlled cloud infrastructure for business growth. The diverse array of choices for the various elements of the infrastructure, and the options for fine grained control over them, can be used to set up a dynamic infrastructure matched to business needs. With OpenStack you can choose which hardware you need, how your resources are architected, how much capacity satisfies your computational needs in specific periods, and what performance and availability your customers are to be given. OpenStack delivers ROI within a reasonable time frame through optimized deployment of your infrastructure, and as the OpenStack technology and community grow, your business and its infrastructure can grow with minimal effort, because OpenStack adheres so closely to your IT infrastructure needs.



BIG DATA AND OPENSTACK FOR DATA INSIGHTS

Big Data refers to large volumes of data and to the processes applied to that data to extract insights for better decision making and strategic planning. Big Data operations include data capture, storage, sharing, analysis, search, transfer, querying, projection and so on. Modern Big Data practice encompasses predictive analytics, user behavior analytics and similar techniques that extract value from data and turn the analysis into insights, products and services. From the Big Data perspective, it is not the amount of data that matters but what businesses do with it. Big Data analysis helps businesses save cost and time, develop and implement new technologies, optimize product offerings, improve operational efficiency, and more.
Big Data can be defined by the five Vs:
Volume: the combined quantity of data generated and stored. The size of the data is a determining factor in whether it is worth analyzing for insights, and even in whether it qualifies as Big Data.
Variety: the type and format of the data. It can be anything like structured numeric data, unstructured text data, media, emails, transactions etc.
Velocity: the speed at which data is acquired and processed to create the Big Data set, and how this influences the growth and development of the business.
Variability: the inconsistency of the data as well as the data flow. There will be seasonal and triggered peaks as well as bottoms that may affect the processes that handle the data.
Veracity: the variation in the quality of the captured data, which can affect the accuracy of the analysis.
Important sources for Big Data are:
1. Data received from connected devices, including IT systems, electronics and appliances, sensors, monitors, communication equipment, etc.
2. Social media, marketing responses, feedback and surveys, etc.
3. Portals, data banks and other public data sources.
Big Data Technologies
Once the data is received, it must be decided what to store and what to discard, how much to analyze and how to use the resulting insights. For the storage, processing, management and analysis of Big Data, the following technology requirements must be met:
1. Large amount of storage at affordable cost.
2. Fast and performance optimized processors.
3. Affordable and distributed Big Data platforms for hosting the data.
4. Functional capabilities like parallel processing, clustering, virtualization, large grid environments, and high availability and fault tolerant systems.
5. Cloud computing and flexible provisioning facilities.
6. Standardized operations in the data center.
7. Data security and privacy.
To handle the volume and power of Big Data you need an infrastructure capable of managing and processing high volumes of structured and unstructured data with strong data privacy and security. There are two classes of Big Data technology:
a. Operational Big Data: systems that have operational capabilities for real time capturing, processing and storage of data.
b. Analytical Big Data: systems such as Massively Parallel Processing (MPP) databases and MapReduce that make it possible to build systems that scale out enormously to perform complex analytical computations.
In traditional, centralized RDBMS based systems there is a limit to the amount of data that can be handled without hitting performance and storage constraints, so Big Data needs a more scalable solution. Implementing Big Data computation on a cloud based infrastructure brings greater flexibility, better resource utilization, scalable capacity and lower costs. A private cloud boosts security and compliance adherence, while a multi-technology hybrid cloud offers an elastic infrastructure suitable for hosting the high end storage and processing needs of Big Data.
Apache Hadoop, the Big Data framework, together with MapReduce, implements one such solution: it distributes data storage and processing across a cluster of commodity hardware operating as a parallel processing network that can easily be scaled to thousands of computational nodes.
Hadoop and OpenStack
Hadoop, combined with the open source cloud platform OpenStack, provides an infrastructure that meets the technical, functional and non-functional requirements of a Big Data processing platform. OpenStack's Sahara service provisions Hadoop data-intensive clusters (or clusters for other frameworks such as Spark) on top of OpenStack, and OpenStack supports this architecture through its component services: Nova (compute), Neutron (networking), Cinder (block storage), Keystone (identity), Swift (object storage) and so on. On the Hadoop side, the components providing the distributed computing service are the Hadoop Distributed File System (HDFS) for replicated data storage, YARN for job scheduling and cluster resource management, MapReduce, a YARN based system for parallel data processing, and Hadoop Common, a set of utilities for managing Hadoop modules.
Hadoop and OpenStack can be integrated to provide Big Data Platform as a Service (BDPaaS), a hybrid cloud based Big Data as a Service offering that combines Hadoop's analytics capability, an OpenStack platform such as Red Hat's, and Intel x86 based servers. Such a service collects data from different sources, performs comprehensive, strategic and business analysis on it, and feeds the resulting insights back into the business processes. The advantages of such a solution are:
1. Open source and commonly available components.
2. Compliance to industry standards.
3. Simple deployment and flexible operation.
4. Reduced risk, high security and scalability.
5. Intelligent distribution of workloads to the most appropriate environment, whether on-premise, a private cloud or even a public cloud.
6. Built-in security practices for protecting data and insights.
Hadoop cluster on OpenStack
The main advantage OpenStack brings to this integration is template based provisioning, which reduces provisioning and deployment time. Cluster level and node level templates can be defined to allow automated provisioning of Hadoop clusters. Templates also give flexibility in configuration and in defining the cluster type, that is, whether or not it is Hadoop based. This architecture also provides efficient cluster time-sharing, load distribution and high availability of the infrastructure. OpenStack's Sahara provides simple provisioning of Hadoop clusters and elastic data processing capabilities, and users can manage the Hadoop clusters through Horizon, the OpenStack dashboard service.
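As a rough illustration of template based provisioning, the sketch below defines a Sahara node group template and a cluster template as Heat resources. This is an assumption-laden sketch, not an official template: the plugin name, version, flavor and process names are placeholders, and the exact resource property names can differ between OpenStack releases.
heat_template_version: 2015-04-30
description: Sketch of Sahara templates for a small Hadoop cluster (illustrative only)
resources:
  worker_template:
    type: OS::Sahara::NodeGroupTemplate
    properties:
      name: hadoop-worker
      plugin_name: vanilla            # placeholder plugin
      hadoop_version: "2.7.1"         # placeholder version
      flavor: m1.large                # placeholder flavor
      node_processes:
        - datanode
        - nodemanager
  cluster_template:
    type: OS::Sahara::ClusterTemplate
    properties:
      name: hadoop-cluster-template
      plugin_name: vanilla
      hadoop_version: "2.7.1"
      node_groups:
        - name: workers
          count: 3
          node_group_template_id: { get_resource: worker_template }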
 
Conclusion
Organizations that use and benefit from this Big Data Platform as a Service model with OpenStack span a wide range of business domains, including insurance, health care, technology, logistics, media, advertising, education, manufacturing and government. The integrated Big Data analysis and insights are used to deliver better products and services and to enhance customer satisfaction and experience, producing higher sales and larger profits.



HIGH AVAILABILITY CONCEPTS FOR OPENSTACK CLOUD

A production OpenStack deployment must meet the following requirements in order to serve business needs with performance and efficiency:
1. High availability of the service
2. Scalability
3. Automated business and management operations
There are different deployment approaches that can be employed to ensure these qualities; this article examines one such approach.
Any High Availability (HA) approach focuses on the different OpenStack services, which can be divided into the following groups.
The HTTP/REST services provided by the API servers (nova-api, glance-api, glance-registry and keystone) can be made highly available by distributing load across them. For this, a load balancer that supports health checking of the API servers can be used, so that it distributes load evenly and also ensures that the servers are healthy.
Then there are the compute services responsible for provisioning and managing the VMs and the resources they need: nova-compute, nova-network and nova-volume. These are the fundamental management services that keep OpenStack running, so at the most basic level they must be up and running all the time. Since all three services need to work together, any single failure can bring the system down, even if only temporarily. The main HA approach here is therefore to avoid a single point of failure in their coordinated operation. This can be achieved with an external monitoring system that watches these services and handles common failure situations through recovery scenarios implemented as failure event handlers; at a minimum, such a handler can notify the administrator to restart the service or trigger an automatic restart. For nova-network, which provides the networking service, the per-project routing task can be separated out and handed over to an external hardware router. Such a multi-host setup removes a single point of failure, since nova-network then handles only DHCP functions.
The Scheduler and Queue Server
The RabbitMQ message broker is the queue server that enables communication between the nova services. Queue mirroring is built into RabbitMQ, and a load balancer can be used to distribute connections across RabbitMQ servers set up as a cluster. The nova-scheduler service consumes messages from the scheduler queue on the RabbitMQ server; new scheduler instances are created automatically when needed, and all of them work together to provide redundancy and HA.
OpenStack Database High Availability
The MySQL Multi-Master Replication Manager (MySQL-MMM) is a commonly used database solution with OpenStack, providing high availability and scalability for the OpenStack database service. Alongside it, the wsrep API based Galera Cluster provides excellent clustering and scalability for the entire OpenStack database service. The synchronous multi-master topology of Galera Cluster for MySQL is well known for high availability and fits well with OpenStack.
Scalability through Node Roles
Horizontal scalability is typically achieved in a cloud by adding instances and including them in the load balancer configuration, and with OpenStack this process is simple and flexible. One consideration in this kind of scaling, however, is the role of the node. In an OpenStack cloud every node has a set of services associated with it, so the services a node is assigned, also known as its role, must be defined when the node is instantiated. A development OpenStack deployment needs only two types of node: compute nodes, which perform computational services and host VMs, and controller nodes, which run all the other management services. A production deployment additionally needs the API services to run on compute nodes. The following node roles should be defined in a production deployment:
Endpoint node: responsible for load balancing and other high availability services, so it carries the required load balancing and clustering software or firmware. Instead of a single node, a dedicated load balancer network can also perform this task. At least two such nodes are required in any cluster for redundancy and failover.
Controller node: responsible for managing the control and communication between the various services of the cloud, such as the queue server, database and dashboard. It can also host the scheduler service (nova-scheduler) and the API servers. For redundancy, the API servers must be load balanced by an endpoint node, and at least two controller nodes are needed.
Compute node: hosts the hypervisor and the VM instances and provides the resources those instances need. In a multi-host network scheme it can also act as the network controller for the instances it hosts.
Volume node: this is associated with the nova-volume service that aids in storage volume management.
To achieve high availability, configuration management is needed in the following scenarios.
1. When a controller node is added to the cluster as part of scaling out, configuration changes are needed in multiple places: the node must be deployed, its services started, and finally the load balancer configuration updated to include the new node. For compute nodes the required configuration is much smaller than for controller nodes, but it may still span from bare metal up to the service level. Careful automation of these configuration changes is needed for HA.
2. If the controller node and the endpoint node are combined on a single host, configuration changes are needed for the nova services and the load balancer. This too should be automated to ensure HA.
Configuration Management and Automation
Since configuration changes are needed at the node level, at load balancers, in replication and so on, automated configuration management and scripted scalability are necessary for any OpenStack cloud platform. Two well known projects for this automation are DevStack and Crowbar, and integrated automation projects such as Puppet and Chef also contribute heavily to deploying automated, scalable and highly available OpenStack clouds. The scaling process can also be automated with orchestration engines, which apply configuration templates every time a new node or instance is allocated, thereby automating scalability.
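As a small illustration of template driven scaling, the following is a minimal HOT sketch using an OS::Heat::ResourceGroup whose count can be raised or lowered (by a stack update or a scaling policy) to add or remove identically configured nodes. The image, flavor and count values are placeholders, not recommendations.
heat_template_version: 2015-04-30
description: Sketch of a resource group of identically configured nodes
parameters:
  node_count:
    type: number
    default: 2
  image_id:
    type: string
  instance_type:
    type: string
resources:
  compute_nodes:
    type: OS::Heat::ResourceGroup
    properties:
      count: { get_param: node_count }
      resource_def:
        type: OS::Nova::Server
        properties:
          name: compute-node-%index%   # %index% is replaced by the member index
          image: { get_param: image_id }
          flavor: { get_param: instance_type }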
Topologies for High Availability
There are different topologies that utilize different approaches to achieve high availability in OpenStack clouds. Some of these topologies are:
1. With a hardware load balancer: a physical load balancer appliance provides connection endpoints for the OpenStack services deployed on different nodes. Compute nodes host the API servers and nova-scheduler instances, while controller nodes host glance-registry instances and the Horizon dashboard.
2. With a dedicated endpoint node: instead of a hardware load balancer, an endpoint node distributes traffic between the different OpenStack services. In this topology the API services are deployed on the controller nodes rather than the compute nodes.
3. With simple controller redundancy: here the endpoint nodes and controller nodes are combined. Controller nodes host the API services and nova-scheduler instances, and these nodes can be easily scaled by instantiating new nodes and reconfiguring HAProxy, which provides high availability, load balancing and proxying for TCP and HTTP communications.
Network Topology
The OpenStack instances use static IPs for internal communications (the private network), and this network is isolated from the outside environment by the firewall implemented by the nova-network component. The outside world communicates with the public network part of the deployment, which uses dynamic or floating IPs. Separate component networks can be used for management, storage (for nova-volumes), etc. The public network provides access to the cloud instances through dynamic IPs, and clients connect to the OpenStack service APIs through the virtual IPs of the endpoint nodes.
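As an illustration of the access path described above, a client could allocate a floating IP from the public network and attach it to an instance through the OpenStack Python SDK. A minimal sketch follows; the cloud, network and server names are assumptions.

```python
# Sketch: allocate a floating IP from an external network and attach it to an
# instance using the OpenStack SDK (openstacksdk).
import openstack

conn = openstack.connect(cloud="mycloud")  # credentials come from clouds.yaml

external_net = conn.network.find_network("public")   # assumed external network
server = conn.compute.find_server("web01")           # assumed instance name

fip = conn.network.create_ip(floating_network_id=external_net.id)
# Attach via the compute API; newer deployments may instead associate the
# floating IP with the instance port through Neutron.
conn.compute.add_floating_ip_to_server(server, fip.floating_ip_address)

print(server.name, "reachable at", fip.floating_ip_address)
```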
 
Summary
High availability can be achieved in many ways, and load balancing and monitoring are the keys to achieving it. A central control node is not mandatory for an HA deployment, and you can reach a benchmark level with some experimental configurations. The main points are distributing traffic and load evenly across the multiple instances of a service and providing as much replication as possible for stateful resources like MySQL and RabbitMQ.



OPENSTACK CLOUD DESIGN AND ARCHITECTURE CONSIDERATIONS

In a previous article, we provided an Introduction to the OpenStack Technology, with an overview of its architecture, service components and advantages. As we specified there, OpenStack is a cloud deployment and management system that allows administrators to control the cloud elements for computation, storage and networking, and allows users to manage resource provisioning through a web interface. This article goes deeper into the architecture and design guidelines for an OpenStack cloud deployment that uses the Red Hat OpenStack Platform to build a private or public IaaS cloud on Red Hat Enterprise Linux.
OpenStack Components or Core Services
The OpenStack cloud platform is implemented as a group of interrelated and interacting components for compute, storage and networking services. Nine core services, each with a code name, provide these functions. Administrators can manage and provision the OpenStack cloud and its components in three ways: through the web-based dashboard, through command-line clients and through the OpenStack API. The 9 core services are:
Horizon: web-based Dashboard service to manage the OpenStack cloud resources and services.
Keystone: the centralized Identity service for the authentication and authorization of the OpenStack users, roles, services and projects.
Neutron: the Networking service that interconnects the various OpenStack services by providing connectivity between their interfaces.
Cinder: the Block Storage service that provides disk volumes for the OpenStack virtual machines.
Nova: the Compute service responsible for managing and provisioning OpenStack virtual machines on hypervisor nodes.
Glance: the Image registry service that manages the virtual machine images and volume snapshots that are taken from existing servers and are used as templates for new servers.
Swift: the Object Storage service for storing data, VM images, etc. as objects in the underlying file system.
Ceilometer: the Telemetry service that provides monitoring and measurement of the cloud resources.
Heat: the OpenStack Orchestration service that provides templates to create and manage cloud resources like storage, networking, instances etc.
There are other OpenStack services too, like Ironic that provisions bare-metal or physical machines, Trove that provides the Database as a Service functionality, Sahara that allows users to provision and manage Hadoop clusters on OpenStack, etc.
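The third management method mentioned above, the OpenStack API, can be exercised through the official Python SDK (openstacksdk). Below is a minimal read-only sketch; the cloud name refers to an entry in clouds.yaml and is an assumption.

```python
# Sketch: querying a few of the core services listed above via openstacksdk.
import openstack

conn = openstack.connect(cloud="mycloud")  # assumed clouds.yaml entry

for server in conn.compute.servers():      # Nova
    print("instance:", server.name, server.status)

for image in conn.image.images():          # Glance
    print("image:", image.name)

for network in conn.network.networks():    # Neutron
    print("network:", network.name)
```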
High level overview of the OpenStack core services. Image courtesy of Red Hat Enterprise Linux - https://access.redhat.com/
 
Architectural guidelines for designing an OpenStack Cloud Platform
When designing an OpenStack cloud platform, there are certain factors to be considered that have a direct impact on the configuration and resource allocation for the platform. The first is the duration of the project. Resources like vRAM, network traffic and storage change noticeably over time: the longer the duration, the more memory and storage are needed. Network traffic can be estimated from an initial sample period, taking into consideration the spikes that can occur during peak periods. The frequency of usage affects the number of vCPUs and the amount of vRAM to allocate, and helps determine the computational load.
The next design consideration is the compute resources needed for the OpenStack cloud. The initial compute resource allocation can be based on the expected service and workload calculations, and additional resources can be provisioned later on demand. Since instances come in different flavors, specific resource pools matching those flavors should be designed so that they can be provisioned and used as needed. A consistent and common hardware design across the resources in a pool enables maximum usage of the available hardware and also makes deployment and support easier.
The next consideration is flavors, the resource templates that determine instance size and capacity. Unlike the default flavors, user-defined flavors can specify storage, swap disk and metadata to restrict usage, and they help with capacity forecasting. The vCPU-to-physical-CPU-core ratio is another design factor. The default allocation ratio in the Red Hat Enterprise Linux OpenStack Platform is 16 vCPUs per physical or hyper-threaded core; it can be reduced to 8 vCPUs per core if memory is low. This ratio also depends on the total RAM available, including the 4GB reserved as system memory. An example allocation is 56 vCPUs for a host with 14 VMs and 64GB of RAM.
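The flavor definitions and the overcommit ratio discussed above can be sketched as follows. Creating flavors requires admin credentials, and every concrete value in this example (names, sizes, core count) is an assumption for illustration.

```python
# Sketch: defining a custom flavor with the OpenStack SDK and a rough capacity
# check against the default 16:1 vCPU-to-core ratio mentioned above.
import openstack

conn = openstack.connect(cloud="mycloud-admin")  # assumed admin cloud entry

flavor = conn.compute.create_flavor(
    name="m2.medium",   # hypothetical flavor name
    vcpus=4,
    ram=4096,           # MB of vRAM
    disk=40,            # GB root disk
    swap=1024,          # MB swap disk, as user-defined flavors allow
)
print("created flavor:", flavor.name)

# Rough vCPU capacity of one host at the default 16:1 overcommit ratio.
physical_cores = 8          # assumed core count per host
cpu_allocation_ratio = 16
print("vCPU capacity per host:", physical_cores * cpu_allocation_ratio)
```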
Under the memory overhead consideration, both the VM memory overhead and the KVM hypervisor memory overhead have to be accounted for. For example, a VM with 256MB of vRAM needs about 310MB of physical memory, a 512MB VM needs about 610MB, and so on; a good estimate of the hypervisor overhead is 100MB per VM. Another factor is over-subscription, in which more VMs are allocated than the memory available on the host can support, leading to poor performance. For example, a quad-core CPU with 256GB of RAM and more than 200 1GB instances will cause performance issues. Therefore an optimal ratio of the number of instances to the available host memory has to be determined and applied.
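A rough sizing check based on the figures above might look like the sketch below. The host size, reserved memory, VM count, flavor size and the roughly 20% per-VM overhead factor (inferred from the 256MB-to-310MB and 512MB-to-610MB examples) are assumptions.

```python
# Rough over-subscription check: estimate total physical memory needed for a
# set of VMs (vRAM + per-VM overhead + hypervisor overhead) and compare it
# with the memory the host actually has left after the system reservation.
HOST_RAM_MB = 64 * 1024          # assumed 64 GB host
RESERVED_SYSTEM_MB = 4 * 1024    # 4 GB reserved for the system
HYPERVISOR_OVERHEAD_MB = 100     # per VM, as estimated above

vms = 14                         # assumed number of VMs on the host
vram_per_vm_mb = 4096            # assumed flavor vRAM
vm_overhead_factor = 1.2         # ~20% overhead, inferred from the examples above

needed = vms * (vram_per_vm_mb * vm_overhead_factor + HYPERVISOR_OVERHEAD_MB)
available = HOST_RAM_MB - RESERVED_SYSTEM_MB

print("needed ~ {:.0f} MB, available = {} MB".format(needed, available))
if needed > available:
    print("host is over-subscribed; reduce the instance count or add memory")
```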
Regarding the density factor, the following points should be considered: if instance density is lower, more hosts are needed to support the compute requirements; a higher host density with a dual-socket design can be reduced by using quad-socket designs; and for data centers with older infrastructure or higher rack counts, it is important to reduce the power and cooling density. It is also important to select the right compute hardware for the performance and scalability of the cloud platform. Blade servers that support dual-socket, multi-core CPUs decrease server density compared to 1U rack-mounted servers, and 2U rack-mounted servers provide half the density of 1U servers. Large rack-mounted servers like 4U servers provide higher compute capacity and support 4 to 8 CPU sockets, but have much lower server density and come at a higher cost. Sled rack-mounted servers support multiple independent servers in a single 2U or 3U enclosure and offer higher density than typical 1U or 2U rack-mounted servers.
When designing and selecting storage resources, the following general factors should be considered:
1. The applications must be compatible with the cloud based storage sub-system.
2. I/O performance benchmarks and data should be analyzed to find out platform behavior under different loads and storage should be accordingly selected.
3. The storage sub-system, including its hardware, should be interoperable with the OpenStack components, especially the KVM hypervisor.
4. A robust security design focused on SLAs, legal requirements, industry regulations, required certifications, compliance with needed standards like HIPAA, ISO9000, etc. and suitable access controls should be implemented.
Swift Object Storage Service
The object storage resource pools should be designed so that they are sufficient for your object data needs. The rack-level and zone-level designs should satisfy the number of replicas required by the project, and each replica should be set up in its own availability zone with independent power, cooling and network resources available to that zone. Keep in mind that, even though the object storage distributes data across the storage cluster, each partition cannot span more than one disk, so the maximum number of partitions should always be less than the number of disks.
Cinder Block Storage Service
A RAID configuration is suitable for achieving redundancy, and the hardware design and configuration should be the same for all hardware nodes in the device pool. Apart from the block storage resource needs, it should be taken into account that the service must provide high availability and redundancy for the APIs that provide access to the storage nodes. An additional load balancing layer is preferred to provide access to the backend database services that store and serve the state of the block storage volumes. The storage hardware should be optimized for capacity, connectivity, cost-effectiveness, direct attachment, scalability, fault tolerance and performance.
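As a small illustration of consuming the block storage service, the sketch below creates a Cinder volume and attaches it to an instance using the higher-level interface of the OpenStack Python SDK; the volume size and the names used are assumptions.

```python
# Sketch: create a 10 GB Cinder volume and attach it to an existing instance.
import openstack

conn = openstack.connect(cloud="mycloud")  # assumed clouds.yaml entry

volume = conn.create_volume(size=10, name="data-vol01", wait=True)
server = conn.get_server("web01")          # assumed instance name
conn.attach_volume(server, volume)         # volume appears as a new block device
print("attached", volume.name, "to", server.name)
```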
Network Resources
Network availability is crucial for the efficient functioning of the hypervisors in a cloud deployment. A cloud uses more peer-to-peer communication than a core network topology usually does, because the VMs need to communicate with each other as if they were on the same network. To deal with this additional overhead, OpenStack uses multiple network segments, and services are segregated onto separate segments for security and to prevent unauthorized access. The OpenStack Networking service, Neutron, is the core software-defined networking (SDN) component of the OpenStack platform. The general design considerations for network resources include security, capacity planning, complexity, configuration, avoiding single points of failure, tuning etc.
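A tenant-side sketch of creating one of the segregated network segments mentioned above, using the OpenStack Python SDK, is shown below; the network name and CIDR are assumptions.

```python
# Sketch: create a network segment and a subnet through Neutron.
import openstack

conn = openstack.connect(cloud="mycloud")  # assumed clouds.yaml entry

net = conn.network.create_network(name="storage-segment")  # assumed name
subnet = conn.network.create_subnet(
    network_id=net.id,
    name="storage-subnet",
    ip_version=4,
    cidr="10.10.20.0/24",                                   # assumed CIDR
)
print("created", net.name, "with subnet", subnet.cidr)
```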
Performance
For the OpenStack cloud platform, performance requirements relate to network performance, compute resource performance and storage system performance. Hardware load balancers can boost network performance by providing fast and reliable front-end services to the cloud APIs. Hardware specifications and configurations, along with other tunable parameters of the OpenStack components, also influence performance, and using a dedicated data storage network with dedicated interfaces on the compute nodes can improve it further. The controller nodes, which provide services to the users and assist in internal cloud operation, also need to be carefully designed for optimal hardware and configuration. To avoid single points of failure, OpenStack services should be deployed over multiple servers with adequate backup capabilities.
Security
Security considerations are organized on the basis of security domains, where a domain includes the users, applications, servers and networks that share the same user-access and authentication rules. These security domains are categorized as public, guest, management and data, and the domain trust requirements depend on whether the cloud is private, public or hybrid. For authentication, the Identity service Keystone can use technologies like LDAP for user management. The authentication API services should be placed behind hardware that performs SSL termination, since they deal with sensitive information like user names, passwords and authentication tokens.



AN INTRODUCTION TO OPENSTACK TECHNOLOGY

OpenStack is an open source project that consists of a set of software tools for building, deploying and managing cloud computing platforms for both private and public clouds. OpenStack is managed by the non-profit organization, OpenStack Foundation that promotes the global development, distribution and adoption of the OpenStack technology. It is supported and funded by a number of major enterprises, corporations and organizations and has thousands of community members world-wide who actively participate in technical contributions and community building efforts. 
According to OpenStack.org: “OpenStack is a cloud operating system that controls large pools of compute, storage, and networking resources throughout a datacenter, all managed through a dashboard that gives administrators control while empowering their users to provision resources through a web interface.”
 
What is OpenStack?
It is a cloud computing management platform that can be used to set up, control and manage the various components of a cloud, like the computing nodes, storage, database and networking resources, through an administrative dashboard (Horizon). It provides the cloud platform as an Infrastructure as a Service (IaaS) solution through a set of connected services. Users can provision resources quickly, such as adding new instances through the dashboard. OpenStack allows great horizontal scalability and is highly configurable. OpenStack can run on standard hardware but needs an operating system that supports virtualization. It can utilize almost any hypervisor as the virtualization host and can run in a mixed-hypervisor IT environment; supported hypervisors include KVM, LXC, QEMU, Xen, VMware, etc.
Hypervisors
Hypervisors or Virtual Machine Managers (VMMs) are software or firmware that make virtualization possible by allowing a single hardware platform and its resources to be shared by virtual systems that operate independently on the same host. A hypervisor allows creating, running and monitoring virtual machines: guest operating systems and their applications that share the physical hardware resources of the platform on which the hypervisor runs. There are 2 types of hypervisors:
1. Type 1 or Bare-metal hypervisors: they run on top of the system hardware directly without a host OS. Eg: VMware ESXi, Citrix XenServer, Microsoft Hyper-V, Oracle VM Server for SPARC, Oracle VM Server for x86, etc.
2. Type 2 or hosted hypervisors: run on a host OS that provides virtualization services. They cannot directly access the hardware. Eg: VMware Workstation, Parallels Desktop, Oracle VirtualBox, and Microsoft Virtual PC.
Hypervisors - Image courtesy: Wikipedia.org
Kernel-based Virtual Machine (KVM) is a set of Linux kernel modules that provide a virtualization infrastructure for the Linux kernel, turning it into a hypervisor. For this, KVM requires a processor with hardware virtualization extensions (Intel VT-x or AMD-V). Other hypervisors include Xen, ESXi, QEMU, etc.
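A simple way to verify on Linux that the processor exposes these extensions is to look for the vmx (Intel VT-x) or svm (AMD-V) flags in /proc/cpuinfo, as in the sketch below.

```python
# Sketch: check whether the CPU advertises the hardware virtualization
# extensions that KVM needs (vmx for Intel VT-x, svm for AMD-V).
def has_virtualization_extensions(cpuinfo_path="/proc/cpuinfo"):
    with open(cpuinfo_path) as f:
        flags = f.read()
    return ("vmx" in flags) or ("svm" in flags)

if __name__ == "__main__":
    if has_virtualization_extensions():
        print("CPU supports hardware virtualization; KVM can be used")
    else:
        print("no vmx/svm flags found; KVM acceleration is not available")
```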
The OpenStack cloud platform can currently be installed and implemented on the following operating systems with their supported hypervisors: Ubuntu (KVM), Red Hat Enterprise Linux (KVM), Debian (KVM), Oracle Linux (Xen), Oracle Solaris (Zones), Microsoft Hyper-V and VMware ESXi.
OpenStack Architecture
The OpenStack cloud platform, running over a host hypervisor, provides the shared OpenStack services needed to run and manage the cloud components – computing, storage and network – to implement private and public clouds. User applications run over this OpenStack cloud platform, and the user controls the cloud through the OpenStack administrative interface and through the API provided by OpenStack.
 
OpenStack Services (Components)
The OpenStack architecture uses several component services, nine of which are identified and maintained as key components by the OpenStack community. These components form the core of OpenStack and are distributed with the OpenStack system. Each component is denoted by its service, its project name and a description of its responsibilities in the system:
 
1. Nova: the primary Computing Engine. Deploys and manages the virtual machines and adds instances on demand for computing.
2. Horizon: web based Dashboard for users to configure, deploy and manage other services like instance creation, manage resources and networking.
3. Swift: provides an Object Storage and Reading mechanism for files and directories.
4. Cinder: provides Block Storage feature for running instances by which volumes can be allocated and de-allocated from instances.
5. Neutron: facilitates Networking for implementing connectivity between the VMs.
6. Keystone: provides Identity Services for the authentication of OpenStack users and management of their access to various services.
7. Glance: provides Image Service in which images or virtual copies of hard disks are used during the provisioning of new VMs.
8. Ceilometer: provides Telemetry or Billing Service by which usage of various cloud resources including VMs, network usage etc. are monitored and metered for billing and benchmarking.
9. Heat: the Orchestration component of OpenStack that allows developers and users to store the resource requirements and configuration details for cloud applications in text files known as HOT templates (Heat Orchestration Templates). These files are used to define and manage the infrastructure needed for the cloud application (a minimal example follows this list).
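As a minimal illustration of a HOT template, the sketch below builds a one-server template from Python and writes it to a YAML file. The image, flavor and network names are assumptions, and PyYAML is assumed to be installed.

```python
# Sketch: generate a minimal HOT template describing a single Nova server.
import yaml  # PyYAML

hot_template = {
    "heat_template_version": "2015-10-15",
    "description": "Minimal example: one Nova server",
    "resources": {
        "example_server": {
            "type": "OS::Nova::Server",
            "properties": {
                "name": "example-instance",
                "image": "cirros",                     # assumed image name
                "flavor": "m1.small",                  # assumed flavor
                "networks": [{"network": "private"}],  # assumed network
            },
        }
    },
}

with open("example_stack.yaml", "w") as f:
    yaml.safe_dump(hot_template, f, default_flow_style=False)

# The resulting file could then be launched with, for example:
#   openstack stack create -t example_stack.yaml example-stack
```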
Advantages of OpenStack
OpenStack’s biggest advantage is the technology and global collaboration platform on which the project is implemented. This provides businesses with a solid and dependable service to base their IT infrastructure on. OpenStack is developing and expanding rapidly, adding to its repository more and more features and capabilities. Below is a list of main advantages being offered by the OpenStack Cloud Platform:
1. Automated deployment and management of the cloud.
2. Rapid setup and provisioning of instances and other cloud resources that allows immense horizontal scalability.
3. Centralized management through Horizon and the OpenStack API allows simplicity in implementation and flexibility in configuration.
4. Open source and is globally supported by thousands of community members.
5. Supports deployment of cloud platforms of any size or type from private clouds and public clouds to global cloud solutions spread over continents.
6. Supports a wide variety of host platforms and hypervisors.
7. Provides high availability and cost-effectiveness.



WHY IS ENTERPRISE DATA BACKUP IMPORTANT

Enterprise data backup is a critical requirement for any business to secure its vital data resources. Although specific technology components like file servers, database servers, storage servers etc. have broad provisions for backing up and protecting the data associated with them, these work on a per-component basis rather than at the global business level. The term “global business level” is used here to represent the data associated with all aspects and interests of a particular business, spread over the individual technology components. This includes data in file servers, databases, private or public storage spaces, system and application configurations for desktops, laptops and mobile devices, network-related hardware and software such as router configurations, firewall settings and rules, and data associated with online SaaS/IaaS/PaaS services like CRMs, payment gateways, cloud-deployed data etc. Backing up and securing the data associated with these global business aspects is therefore a complex procedure that cannot be fulfilled by the backup capabilities of any single component, such as database backup solutions that back up database data only.

Complexity of Enterprise Data

As mentioned above, business-critical data spans the entire business space and extends outside to the premises of:
1. Business stakeholders like vendors, contractors, service providers etc.
2. 3rd party services that are consumed by the business like Software as a Service (SaaS), Platform as a Service (PaaS), Infrastructure as a Service (IaaS) or more broadly Anything as a Service (XaaS). These include websites, knowledge-base, hosted business portal, shared resources and other online services.
Enterprise file data consists of both the application level data and system level data. This includes:
1. Business process files created using Office Software suites, graphical files, process documentations, procedure and operational manuals, statistical and tabular data files, departmental files like payrolls, employee files, purchase/order/invoice files, customer data files, transaction files etc.
2. Files related to the product or service being offered by the business like software, technical and user manuals, system analysis and design files, development and operational logs, archives etc.
3. System configurations that exist in the form of files, database records etc.
4. Configured OS, firewall, utilities, administrative tools, internal software etc.
5. Network configurations and settings files.
6. Application files hosted in-house.
7. System and application software associated with desktops, laptops, mobile devices, printers, scanners etc.
It is a nightmare for IT administrators and security professionals to manage the responsibility of backing up and securing all these resources without a consolidated and automated mechanism. It takes enormous effort and expertise to reach all the above areas, pick the data that needs to be backed up, perform the backup operation over these high volumes of data, store the backup in a secure place in a restorable format and, finally, restore it in a distributable and distinct manner whenever and wherever needed.

Enterprise data components
 
Why Enterprise Data Backup is Necessary?
Since we have identified the nature and volume of the data to be backed up and the complexities involved in accomplishing this, it is time to decide: does all this data need to be backed up, and if so, why? Let’s try to find the answer to this question.
If data is classified according to the timeline that encompasses its creation and modification as well as its structure, there are broadly 4 categories of data:
1. Historical or archived data that are no longer actively used but needed for reference and analytical purposes.
2. Data groups created in the past/present and being modified actively in the present and used extensively for business operations.
3. Data that controls configurations, created in the past and used in the present, passively.
4. Data created and modified in the present that is continuously added to or accumulated, like logs and reports.
From the above list, data group no. 1 seems the least eligible for backup, mainly due to its lower priority and limited involvement in daily operations and due to the high volume it consumes for storage and backup. But any organization that wants to learn from the past to aid its evolution, to identify benchmarks and optimal levels of performance, and to record history as part of the business track record can subject that data to a backup scheme that uses heavy compression and cheap backup storage space.
Data group no. 2 comprises most of the business operational data that powers the business and fuels its processes. Any loss of integrity or consistency in this data means the business will suffer in its functioning and operations, integrity and reputation, uptime, finances and market position, and in the benefits and advantages gained so far. So this mission-critical data is the primary candidate for active backup and security.
Data group no. 3 is responsible for all the hardware, software and logical processes running in the business. Without proper configuration, malfunctioning and under-performance will occur, causing downtime and losses. So this data, along with data group no. 2, is an active candidate for backup and security.
Data group no. 4 consists of the traces and evidence of the operation of all the resources associated with the business. It is needed for fine-tuning, optimizing and troubleshooting the business processes, both physical and logical, and is also used as evidence during investigations and forensic analysis. So this data group, though not as critical as data groups 2 and 3, is eligible for backup and security with high compression and moderate-cost storage.
Features needed for Enterprise Data Backup Solution
Any solution for business data backup should meet the following requirements, though some can be termed as ideal.
1. Real-time backup facility: this performs a backup whenever data changes, since restoration needs the latest data snapshot rather than a scheduled backup taken at a past checkpoint. Such a backup changes with time or data updates and should be optimized for space utilization, since too many backups can be costly in terms of storage. A preferred backup type for this is the incremental backup, which does not back up the entire data set but adds to the backup only those parts that have changed (see the sketch after this list).
2. Automatic restore and recovery: an optimal backup system should detect failure scenarios and provide restore and recovery with little or no manual intervention. The restore should be rapid so that business functions are not critically affected by downtime or lack of resources.
3. Bare-metal disaster recovery: backup, restore and recovery should be handled at the infrastructure and platform level so that the entire business operational system is recovered and restored immediately after a disaster, attack or corruption.
4. Logical and discriminative restore: even though the backup solution handles infrastructure- or platform-level backups that contain the entire data set, most of the time only a sub-system or even a single component of the main business information system needs to be restored. The logical and discriminative restore mechanism should segregate the backed-up data on a sub-system and component basis so that individual components can be restored when needed.
5. Solid and controlled compression and intelligent backups: the ability to create highly compressed yet lossless backups that avoid redundancy as far as possible.
6. Distributed backup storage to avoid a single point of failure, with a central administrative unit for controlling the backup process.
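A minimal sketch of the incremental, real-time style of backup described in point 1 above could look like the following; the paths are placeholders, and a production solution would also handle deletions, hashing, compression, encryption and off-site replication.

```python
# Minimal incremental backup sketch: copy only files whose modification time
# is newer than the previous backup run.
import os
import shutil
import time

SOURCE = "/srv/business-data"      # hypothetical data directory
DEST = "/backup/business-data"     # hypothetical backup location
STATE_FILE = "/backup/.last_run"   # records the time of the previous run

def last_run_time():
    try:
        with open(STATE_FILE) as f:
            return float(f.read().strip())
    except FileNotFoundError:
        return 0.0  # first run: back up everything

def incremental_backup():
    since = last_run_time()
    for root, _dirs, files in os.walk(SOURCE):
        for name in files:
            src = os.path.join(root, name)
            if os.path.getmtime(src) > since:
                rel = os.path.relpath(src, SOURCE)
                dst = os.path.join(DEST, rel)
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                shutil.copy2(src, dst)  # copy only what changed
    with open(STATE_FILE, "w") as f:
        f.write(str(time.time()))

if __name__ == "__main__":
    incremental_backup()
```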
Essential Qualities that aid in Enterprise Data Backup
The main constraints associated with enterprise data backup are:
1. Technology selection that suits the business infrastructure and its limitations.
2. Cost of backup solution (one time/subscription/on-demand).
3. Data compression and the space consumed by backups, since high-frequency, high-volume or loosely compressed backups mean higher storage consumption, adding to the cost.
4. Reliability and security – the backup solution should be up and functional, ideally 100% of the time, and should be secure from the threats applicable to business data, so that it can act as the savior of the business data. If the backup location is external to the business premises, then network and communication security is critical.
5. Low complexity and administrative overhead.
6. Compliance with organizational data standards.
7. LAN/WAN coverage to backup data across networks.
An ideal solution will utilize modern cloud- and cluster-based computing and storage to achieve such a backup system. Since backing up is a resource-consuming and compute-intensive process, momentary downtime or resource unavailability can occur while the backup process is running. The process must therefore be time-sliced or run as a concurrent background process to eliminate these disadvantages. Load balancing and secure networking are crucial in order to implement a sophisticated and robust backup solution.
