IT teams spend a good deal of time thinking about disaster recovery – and rightly so. Unexpected downtime is hugely expensive, from a loss of productivity, to an immediate loss of revenue, to long-term damage to the corporate brand.
For disaster recovery of virtual machines already in the public cloud, CenturyLink Cloud’s premium storage option offers a simple, elegant option for enterprises. Just click a button, and you have an RTO of 8 hours, and RPO of 24 hours in the event of a declared disaster.
What about protecting on-premise VMs? This is a perfect workload for public clouds like CenturyLink Cloud. Traditional DR systems are simply too hard to get up and running. Too often, these products:
- Take too long to deploy. Many take months to plan, and then even more time to deploy.
- Add cost and complexity. Enterprises often need to bring in consultants or additional resources to manage the effort.
- Increase CapEx. New hardware usually needs to be purchased and managed accordingly.
- Require burdensome testing procedures. Once a DR solution is deployed, users are often forced to navigate long, labor intensive lead times to conduct DR tests throughout the year.
To help customers protect on-premise VMs, we’re excited to launch Safehaven for CenturyLink Cloud. This solution offers much of the simplicity as CenturyLink’s cloud-to-cloud DR – including self-service and a point-and-click interface – with powerful customization options. In addition, customers receive a “white-glove” on-boarding experience to ensure your configuration aligns with your DR strategy. Here’s a high level overview of how Safehaven for CenturyLink Cloud works:
- Build the CenturyLink Cloud servers using our self-service interface.
- Install the Safehaven replication software in your Cloud and Production environments
- Configure Safehaven including settings for RPO requirements and server bring up sequencing
- Test your environment against your with failover and failback operations runbook
Upon disaster declaration initiate the failover sequence in the Safehaven Console and let the pre-configured API calls boot your Cloud VMs, and start your OS so you can resume application processing.
Once your production environment is restored, initiate a failback command that automatically pauses your CenturyLink Cloud VMs and starts data transfer back to your production site.
SafeHaven provides full recovery orchestration and planning at the server group and data center levels. Users can be sure that multi-tiered applications involving multiple servers and data volumes will come up in the exact sequence you configured. In the case of controlled shut down events, SafeHaven can bring up a replica of your data center in the cloud without any data loss.
The service shines through where traditional DR systems fall short, specifically when it comes to:
- Cost. Safehaven is orders of magnitude less expensive compared for enterprises looking for recovery of private data centers in a multi-tenant cloud.
- On Demand Cloud Infrastructure. Configure and provision your VM and storage in minutes, and avoid long and expensive lead times to provision infrastructure.
- Performance, specifically ultra-low recovery times. Frequently an entire data center can be restarted in the CenturyLink Cloud in a matter of minutes.
- Non-disruptive Testing. Users can run test to validate application–layer recovery without affecting production systems.
- Group Consistency & Recovery Plans. Users can develop and test automated run book recovery plans.
- Automated Failback. When disaster conditions are resolved, users can failback to their original production data centers in minutes and without data loss.
- Continuous Data Protection. Users can retain up to 2,048 checkpoints to protect against data corruption or loss.
- Versatility. SafeHaven can be configured to protect both physical and virtual IT systems.
- Ease of use. The product UI is very intuitive and simple-to-use.
Let’s dive into a technical overview of the product to see exactly how it works.
Core SafeHaven for CenturyLink Cloud replication software services include:
- Inter-site migration, failover, and failback
- Continuous data protection with up to 64 checkpoints for rollback in cases of server infection or data corruption
- Non-disruptive testing of recovery and migration plans
These features can be executed at the level of individual Servers, groups of servers or entire sites. How? A virtual appliance called a SafeHaven Replication Node or SRN is uploaded into the site that is to be protected. SafeHaven then leverages local mirroring software already embedded within standard operating systems to replicate new writes from protected servers to the SRN. The SRN buffers these changes and transmits them asynchronously to a protection VM in the CenturyLink Cloud that then writes the changes to disk. When disasters occur, SafeHaven will boot the recovery VMs using the replica data images it has been maintaining in the cloud. Meanwhile, another SafeHaven virtual appliance called a Central management server transmits control traffic between the SRNs and the SafeHaven Management Console.
Disaster Recovery is a complex topic area and that there are many protection strategies to achieve the desired results. We created this service offering in response to customer feedback as yet another option for our customers to protect their production IT environments. We also recognize this solution is not a panacea for DR; rather it’s a solution oriented tool that may help you service certain production workloads in a cost effective, low investment model. And for those workloads that require a different protection technique we’ve got the depth of resources to help you realize your goals.
Recent history has shown that after a cloud provider is acquired, the pace of innovation slows and there’s a loss of focus (and staff). If you don’t believe me, check out the release notes (if you can find them!) of some recently acquired cloud companies. It’s not pretty. I’m here to say that we’re different.
140 days ago, the acquisition of Tier 3 by CenturyLink was described as a "transformational deal for the industry." Instead of randomizing Engineering post-acquisition with unnecessary process, and haphazard integrations with legacy and redundant products, we’ve actually accelerated pace of development on our go-forward platform, CenturyLink Cloud. In the past four months, we’ve maintained our software release cadence, grown our team, expanded our data center footprint, actively integrated with our parent company, and solidified a game-changing vision that has retained and attracted a phenomenal set of customers.
We update our cloud platform every month with new, meaningful capabilities. Only a very small subset of cloud providers can make that claim. In the past 140 days, we’ve shipped over 1,200 features, enhancements, and fixes. This includes a new high performance server class, faster virtual machine provisioning, new reseller services, a major user interface redesign, a compelling monitoring/alerting service, a new RESTful API, and a pair of new data centers.
Our ambitious data center expansion is on track. In the past few weeks, we’ve lit up a pair of new data centers in the US. This gives customers access to world-class CenturyLink network, security, and management services in those locations. With 11 total data centers, the CenturyLink Cloud has a greater geographic breadth than all but two public cloud providers. That’s pretty awesome for our customers who want a highly distributed environment for running their portfolio of applications.
Our Engineering team has also grown as additional experienced developers have come on board and contributed in a major way. The Operations team continues to scale out as well while becoming even more efficient at managing infrastructure at scale. Just as important, we’ve integrated with the broader CenturyLink teams and have a single, comprehensive vision for delivering multiple infrastructure options on a unified platform to a global customer base. Why should organizations compromise when trying to fit their needs into the cloud? With CenturyLink, customers can consume co-location, dedicated hardware, managed services, public infrastructure-as-a-service, and platform-as-a-service all with a single provider. And we’re working to integrate these options into a groundbreaking customer experience.
We aren’t close to being done disrupting this space. The next 140 days will be just as exciting. Try out our compelling platform, or join the team building the future of cloud and infrastructure.
Elasticity and quick provisioning are hallmarks of any good cloud platform. Cloud customers have gotten used to rapidly acquiring right-sized resources that fit a given workload. No longer do developers have to build the biggest (physical) server possible just to avoid requests to resize later on. Rather, provision for what you need now, and adjust the capacity as the usage dictates. But how do you know when it’s time to size up?
The CenturyLink Cloud engineering team just released a monitoring and alert service (alongside our powerful server UI redesign) that gives you the data you need! We designed this feature with three things in mind:
- Offer a simple, straightforward toolset that users can understand and take advantage of quickly.
- Deliver reliable, accurate statistics that reflect the current state of a server.
- Provide multiple ways to identify that an alert was fired.
Together, these three principles kept us focused on delivering a service that met market need. Let’s take a look at how the new monitoring and alert service applies each principles.
It’s easy to get lost in a sea of rarely-used options offered by a monitoring platform. Instead, we focused on ease of setup, a common theme in the CenturyLink Cloud. Users only have to follow two steps.
First, access the Alerts item in the top level navigation menu. This takes you to a list of all the alert policies for your account. Policies can measure CPU, memory, or storage consumption of a server. Creating a policy is as simple as providing a friendly name for the alert, indicating the measure and usage threshold, choosing a duration that the chosen threshold must be exceeded before an alert fires, and a list of the alert’s email recipients.
Once a policy (or polices) are created, simply apply it to one or many servers. The server’s Settings page now has a tab for Alerts where users can quickly add or more policies to the server. To aid usability, we show you a preview of the policy’s core parameters as you select it. This keeps policy names crisp, and prevents incorrect assignment of policies.
Immediately after applying a policy, the platform compares a server’s consumption to the policy’s trigger. Furthermore, you can update policies in a central location and instantly impact all of the servers attached to that policy. Simple, easy – and elegantly powerful!
What’s more, you will easily see when a server has alert policies attached. In our new user interface (available to all users as a public beta!), there are three ways you’ll identify that a server has an alert policy. First, we put an indicator on the monitoring chart that displays the alert level. Secondly, all of a server’s policies are listed in the summary pane. Finally, all policy activities are logged and available in the server’s audit trail.
Monitoring and alerting features exist to deliver proactive, timely, accurate statistics about a virtual machine. It does no good to find out that a server was running hot yesterday. False alarms are counterproductive as well.
In the CenturyLink Cloud monitoring and alerting service, we capture near-real time statistics about each server and show both current and aggregate perspectives. There’s the current consumption highlighted on the left, and the aggregated consumption available on the chart. You’re able to look at a long term aggregation, or even jump down to the average consumption on an hourly basis.
Because the CenturyLink Cloud runs a highly tuned virtualized environment, you may see a difference between what a virtual server shows for consumption, and the value we show in the Control Portal. The Control Portal identifies what the hypervisor itself thinks the utilization is, and this is MORE accurate because the hypervisor can intelligently add horsepower to servers under stress. So, keep this in mind and don’t worry if a server appears slightly stressed to you, but the platform itself doesn’t completely agree!
Finally, it’s important to be able to consume alerting information in multiple ways. We offer three wildly different but extremely complementary mechanisms. By default, a policy must have an email recipient for any alerts. So even if you aren’t logged into the Control Portal, you can instantly find out, in real-time, if an alert condition has been met for the threshold period. Additionally, Control Portal clearly displays when a server is in an alerting stage. If you’re on the server’s details page itself, you’ll see a warning as well as the utilization indicator turned to red. But even better, we highlight the offending server at different levels in the UI - in the left side navigation, the server’s group, and the group’s data center! This means that you can easily see where you have servers experiencing alerts from anywhere in the interface.
The final option is to configure a webhook. Recall that the CenturyLink Cloud offers webhook capabilities which push notifications to an external endpoint of your choosing whenever certain platform conditions occur. We’ve added a new webhook for “alert notification” that will send a data-rich message to any endpoint. For example, you could configure the webhook to feed into your support system so that the two environments (cloud and on-premises) are automatically integrated.
Alerts aren’t helpful if you don’t know they are occurring! So, we’ve built in a host of ways to send notifications and quickly see relevant information.
We’re excited to ship this new capability, and have other plans for building upon these services. Don’t hesitate to provide feedback or feature suggestions by accessing the “feedback” link within the Control Portal!
Multitenancy – the concept of using a single (software) platform to serve multiple customers – is a key aspect of nearly every cloud computing platform. Pooling resources results in lower costs for all parties, greater efficiencies, and faster innovation for customers. Are there risks and tradeoffs with this model? Sure, but every technology paradigm has them.
In this blog post, we’ll look at some core principles for successful multitenancy, see how the CenturyLink Cloud provides tenant isolation, and review the ways that CenturyLink Cloud customers create isolation within their own account. The goal is to simply help customers understand what to look for when assessing multi-tenant environments to run their workloads, SaaS applications, and more.
Any service provider delivering a multi-tenant environment must adhere to these six commandments:
- Thou shalt isolate tenants within their own network. This one applies mainly to infrastructure-as-a-service (IaaS) providers who promise secure computing environments. Software-as-a-Service (SaaS) customers on a platform like Salesforce.com don’t have this issue as customers do not have access to low level network traffic. When granting virtual machine access to users, the service provider has to ensure that there’s no opportunity to intercept network traffic from other customers.
- Thou shalt not allow tenants to see another tenant’s metadata. Sometimes metadata can be just as sensitive as transactional data! Multi-tenant service providers must make sure that customers are logically or physically walled off from seeing the settings or user-defined customizations created by other customers.
- Thou shalt encrypt data in transit AND at rest. Providers shouldn’t let their guard down just because data is within their internal network. Rather, data should constantly be transferred over secure channels, and encrypted whenever it’s stored on disk.
- Thou shalt properly clean up deleted resources. In a multi-tenant IaaS environment, there is clearly reuse. When a network is released by one customer, another can use it. When a storage volume is removed, that space on the SAN is now available for others. It’s imperative that service providers reset and clear resources before allowing anyone else to acquire them.
- Thou shalt prevent noisy neighbors from impacting others. This phenomenon is one of the hardest problems to address in multi-tenant environments. As a user, you have no say in who *else* is using the same environment. It’s up to the service provider to make sure that one customer can’t (intentionally or unintentionally) adversely impact the performance of other customers by overwhelming the shared compute, storage, or networking resources.
- Thou shalt define and audit policies to ensure proper administration of shared environments. Let’s be honest – using a multi-tenant environment involves a bit of trust. As a customer, you have to trust that the service provider has built a platform that properly isolates each customer, and that operational staff can’t go off the reservation and compromise your business. However, to run mission-critical apps in someone’s multi-tenant platform requires more than blind trust; you should also be able to demand to see 3rd party certifications and audits that prove that a mature organization is behind the platform.
Built-in Platform Isolation
With those principles in mind, how does the CenturyLink Cloud platform deliver secure isolation?
IaaS customers can create sophisticated network topologies with one or more VLANs. All of these logical networks are part of a giant physical network and we do best-practice VLAN isolation to make sure that data packets stay within the appropriate VLANs. This ensures that our customers cannot intercept traffic from other customers and creates a protected barrier around your virtual hardware.
What about data? The CenturyLink Cloud makes it easy to provision terabytes of persistent storage that you can easily resize as needed. But when it comes time to delete volumes, we make sure that all virtual disks are automatically wiped so that the next customer always get a blank volume with no way to retrieve data from the previous user. Regarding data encryption, by the end of 2014 we plan on being 100% encrypted at rest and support 3rd party tools for customers to manage their keys.
As mentioned above, noisy neighbors are one of the biggest challenges for multi-tenant cloud providers to handle. The CenturyLink Cloud takes a multi-pronged approach. First, we always leave headroom on host machines and closely monitor usage to know when it’s time to scale. Second, we use features in our hypervisor platform to protect against capacity and latency bursts in CPU and disk. Our storage subsystem is built to handle multi-tenancy and provide protection against I/O bursts. Third, the network is designed to prevent any one tenant from overwhelming the firewalls, and our ample bandwidth ensures that network saturation is nearly impossible.
Finally, you can certainly just “trust us” that we do everything right. But most customers, at first anyway, trust those who audit us. Our data centers and policies are regularly reviewed and we maintain certifications and standards that prove our extreme focus on building a secure environment for your applications.
The platform itself provides built-in multi-tenancy to isolate customers, but how can you build your own isolation WITHIN your account? This is a common scenario for resellers, SaaS provider, and large enterprises who want to logically segment business units or departments. Let’s look at a few options.
One of the best ways to create isolation in your account is through sub-accounts. Sub accounts are containers that can have unique users, permissions, billing procedures, networks, and even branding (look-and-feel). You can choose to inherit various settings from a parent account (e.g. “share parent networks”, governance limits) or treat them as completely independent resources.
Another choice? Use separate VLANS to isolate servers within an account. Consider providing users with remote access to cloud servers but only allowing a small subset of administrators to place the servers on the appropriate VLANs. This makes it possible to have project-specific VLANs where traffic is cleanly isolated from other networks in the account.
A final way to isolate users within an account is through the use of different data centers. The CenturyLink Cloud is spread across the globe, and expanding even more this year. It’s easy to spin up sub-accounts and intentionally constrain users to a chosen set of data centers. This helps you isolate accounts (and applications) to the geographies that work best for your business.
The most advanced cloud deployments depend on multi-tenant platforms. Building systems in this way isn’t easy - it takes careful upfront consideration and steady vigilance to ensure that all users get reliable, consistent performance. The CenturyLink Cloud was designed from day one to excel at multi-tenancy, and you can see that in how we’ve architected the platform and the features we expose to our customers.
Want to try it out? Spin up an account and see how our high-performing cloud can meet your needs today.
Last year, we made 12 predictions about what would happen in the cloud space in 2013. As the year comes to a close, it’s only fair for us to assess our hits and misses to see how well we did.
Recap and Scorecard
PREDICTION #1: 2013 will be the year of cloud management software.
REALITY: Hit. We saw this come true on multiple fronts. First, cloud management providers Enstratius and ServiceMesh were acquired by Dell and CSC, respectively. Tier 3 – known for the sophisticated management software that runs our IaaS – was acquired by CenturyLink. On top of this, Gartner estimates that a new vendor enters the cloud management space every month, and nearly every cloud provider is constantly beefing up their own management offerings. This shows the strategic value of comprehensive management capabilities in a cloud portfolio. Customer adoption of these platforms is also on the rise and Gartner sees 60% of Global 2000 enterprises using cloud management technology (up from 30% in 2013).
PREDICTION #2: While the largest cloud providers duke it out on price and scale, smaller cloud providers see that enterprise adoption really depends on tight integration with existing tools and processes.
REALITY: Mixed. Of course, cloud prices definitely declined in 2013 and massive scale continued to be a key selling point. Hybrid cloud picked up momentum this year as more companies looked to establish an IT landscape that leveraged on-premises assets while taking advantage of cloud scale. In order to maximize the efficiency of hybrid scenarios, companies need consistency in processes and tools. While cloud management platforms have helped with this a bit, there wasn’t a wholesale move by cloud providers to seamlessly integrate their core offerings with established products.
PREDICTION #3: Enterprises move from pilots to projects, and architecture takes a front seat.
REALITY: Hit. There’s been much less gnashing of teeth on “should I use the cloud” this year, and much more discussion about how to capitalize on the cloud. We’ve seen our customers move to more substantial solutions and ask for more sophisticated capabilities, such as self-service networking. Throughout the industry, we’re seeing more enterprise-class case studies where customers are putting mission critical workloads in the cloud. However, outages still occur on any cloud, and providers are publishing guidelines on how to properly architect for high availability. The recent AWS conference was full of sessions on architecture best practices, and developers are hungry for information about how those best practices are applied.
PREDICTION #5: Standalone, public PaaS offerings will be slow to gain enterprise adoption.
REALITY: Hit. In 2013 we saw renewed discussion on what PaaS actually is and what it SHOULD be. Longtime PaaS providers Microsoft and Google added IaaS products to their portfolio, while smaller firms like Apprenda saw success in private PaaS. Our sister company, AppFog, has launched over 100,000 apps, including some impressive enterprise deployments. Former Tier 3 colleague Adron Hall asked whether PaaS was still “a thing” or whether new container technologies like Docker were going to replace it. However, as some like our own Jared Wray and Red Hat’s Krish Subramanian have said, PaaS is about more than JUST application containers. A rich PaaS also includes the orchestration, management, and services that make it a valuable platform for web applications of any type. Either way, PaaS is still in its infancy and will continue to morph as customer scenarios take shape.
PREDICTION #6: Public goes private.
REALITY: Mixed. There were hints of this in 2013 as Amazon won a bid to win a private cloud for the CIA (and for you too if you have half a billion sitting around!), Microsoft offered a “pack” for making on-premises environments resemble their public cloud, and platforms like OpenStack gained traction as a private cloud alternative. We continued to make advances in supporting private scenarios by adding self-service site-to-site VPN capabilities to an already-robust set of connectivity options. I gave this a “mixed” score because as a whole, public cloud providers don’t yet (and may never) make it simple to run their stack in a private data center for mainstream enterprises.
PREDICTION #7: Cloud providers embrace alternate costing models.
REALITY: Hit. 2013 saw some changes to how cloud customers paid for resources. We modified our pricing to decouple some components while still making it easy to provision exactly the amount of CPU, memory and storage that you need for a given server. Google and Microsoft both launched their IaaS clouds with “per minute” pricing for compute resources. Cloud providers have yet to move to a “pay for consumption instead of allocation” model for things like storage, but overall we’ve seen a maturation of pricing considerations in 2013.
PREDICTION #8: While portability will increase at the application and hypervisor layer, middleware and environment metadata will remain more proprietary.
REALITY: Mixed. We might have been too pessimistic last year! DevOps tools have flourished in 2013 and platform adapters have made it possible to move workloads between clouds without a massive re-architecture effort. To be sure, code portability is still MUCH simpler than environment portability. Each cloud provider has their own value-added services that rarely transfer easily to other locations, and no clear IaaS standard has emerged. However, platforms like OpenStack are attempting to make cloud portability a reality, and the increasing prevalence of public APIs makes it possible for tools like Pivotal’s BOSH or Chef to orchestrate deployments in diverse provider environments.
PREDICTION #9: Global expansion takes center stage.
REALITY: Hit. One of the first questions we hear from prospective customers is “where are your data centers?” This year, almost all of the leading cloud providers expanded their footprint around the globe. For our part, we added data centers in Canada, the UK, and Germany. Now, as part of CenturyLink, we have major expansion plans in 2014.
PREDICTION #10: IaaS providers who don’t court developers get left behind.
REALITY: Hit. In 2013, Stephen O’Grady wrote that developers are the “new kingmakers” and this was reinforced by Gartner analyst Lydia Leong who wrote that IT operations no longer has a monopoly on cloud procurement. Developers are now running the show – bringing in vendors that meet their unique criteria. Consequently, a new crop of developer-centric cloud providers has popped up. While they don’t offer managed services or sophisticated resource management, they DO help developers get going quickly in the cloud. We wooed developers with new self-service capabilities, API improvements, and with new features like Autoscale and webhooks. Developers will continue to be a focus for us at CenturyLink and we plan on continuing our regular Open Source contributions!
PREDICTION #11: Clouds that cannot be remotely managed through an API will fall behind.
REALITY: Hit. APIs are the gateway to modern services and allow ecosystems to flourish. Consider the vibrant crop of cloud management platforms discussed in prediction #1. And that is just one small example. The vast majority of clouds listed in Gartner’s 2013 Magic Quadrant for Cloud Infrastructure have public, comprehensive APIs that developers can use to consume the cloud in whatever way they want. In 2013, we started an effort to replace our existing API with an even more expansive offering that offers complete parity with our industry leading Control Portal user interface. That effort will continue into the next year. When complete, a new host of capabilities will be accessible for CenturyLink, our partners, and mostly important, our customers.
PREDICTION #12: Usability and self-service become table stakes for cloud providers.
REALITY: Mixed. In 2013, we seemed to hit the point where “clouds that aren’t really clouds” struggled as the market began to demand more. Customers expected more and more self-service capabilities, and Tier 3 – along with most every other major provider – focused heavily on that in 2013. Platform usability was a lesser focus this year. While new clouds from Microsoft and Google included relatively straightforward user experiences, few providers made any massive visual improvements. While the CenturyLink Cloud continues to be lauded for an easy to use, powerful interface, we haven’t stood still. A major redesign is underway that will surface more data, simplify activities, and improve performance.
2013 was an important year in the maturation of the cloud industry. New vendors were introduced, popular platforms were acquired, and consumption of cloud services skyrocketed. What will happen in 2014? Stay tuned for our predictions!