Elasticity and quick provisioning are hallmarks of any good cloud platform. Cloud customers have gotten used to rapidly acquiring right-sized resources that fit a given workload. No longer do developers have to build the biggest (physical) server possible just to avoid requests to resize later on. Rather, provision for what you need now, and adjust the capacity as the usage dictates. But how do you know when it’s time to size up?
The CenturyLink Cloud engineering team just released a monitoring and alert service (alongside our powerful server UI redesign) that gives you the data you need! We designed this feature with three things in mind:
- Offer a simple, straightforward toolset that users can understand and take advantage of quickly.
- Deliver reliable, accurate statistics that reflect the current state of a server.
- Provide multiple ways to identify that an alert was fired.
Together, these three principles kept us focused on delivering a service that met market need. Let’s take a look at how the new monitoring and alert service applies each principle.
It’s easy to get lost in a sea of rarely-used options offered by a monitoring platform. Instead, we focused on ease of setup, a common theme in the CenturyLink Cloud. Users only have to follow two steps.
First, access the Alerts item in the top-level navigation menu. This takes you to a list of all the alert policies for your account. Policies can measure CPU, memory, or storage consumption of a server. Creating a policy is as simple as providing a friendly name for the alert, indicating the metric and usage threshold, choosing how long the threshold must be exceeded before an alert fires, and listing the alert’s email recipients.
Once a policy (or policies) is created, simply apply it to one or more servers. The server’s Settings page now has an Alerts tab where users can quickly add one or more policies to the server. To aid usability, we show you a preview of the policy’s core parameters as you select it. This keeps policy names crisp and prevents incorrect assignment of policies.
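Conceptually, a policy just bundles those parameters together. Here’s a minimal sketch in Python of what such a policy could look like – the class and field names are illustrative assumptions, not the actual platform schema:

```python
from dataclasses import dataclass, field

# Hypothetical representation of an alert policy; field names are
# illustrative, not the real CenturyLink Cloud API schema.
@dataclass
class AlertPolicy:
    name: str                 # friendly name shown in the Control Portal
    metric: str               # "cpu", "memory", or "storage"
    threshold_percent: float  # utilization level that triggers the alert
    duration_minutes: int     # how long the threshold must be exceeded
    recipients: list = field(default_factory=list)  # emails to notify

    def is_breached(self, utilization: float, minutes_over: int) -> bool:
        """True once utilization has stayed above the threshold long enough."""
        return (utilization >= self.threshold_percent
                and minutes_over >= self.duration_minutes)

policy = AlertPolicy("High CPU", "cpu", 90.0, 15, ["ops@example.com"])
print(policy.is_breached(95.0, 20))  # sustained breach -> True
print(policy.is_breached(95.0, 5))   # too brief -> False
```

Note how the duration requirement keeps short, harmless spikes from firing false alarms.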
Immediately after applying a policy, the platform compares a server’s consumption to the policy’s trigger. Furthermore, you can update policies in a central location and instantly impact all of the servers attached to that policy. Simple, easy – and elegantly powerful!
What’s more, you will easily see when a server has alert policies attached. In our new user interface (available to all users as a public beta!), there are three ways you’ll identify that a server has an alert policy. First, we put an indicator on the monitoring chart that displays the alert level. Secondly, all of a server’s policies are listed in the summary pane. Finally, all policy activities are logged and available in the server’s audit trail.
Monitoring and alerting features exist to deliver proactive, timely, accurate statistics about a virtual machine. It does no good to find out that a server was running hot yesterday. False alarms are counterproductive as well.
In the CenturyLink Cloud monitoring and alerting service, we capture near-real time statistics about each server and show both current and aggregate perspectives. There’s the current consumption highlighted on the left, and the aggregated consumption available on the chart. You’re able to look at a long term aggregation, or even jump down to the average consumption on an hourly basis.
Because the CenturyLink Cloud runs a highly tuned virtualized environment, you may see a difference between what a virtual server shows for consumption, and the value we show in the Control Portal. The Control Portal identifies what the hypervisor itself thinks the utilization is, and this is MORE accurate because the hypervisor can intelligently add horsepower to servers under stress. So, keep this in mind and don’t worry if a server appears slightly stressed to you, but the platform itself doesn’t completely agree!
Finally, it’s important to be able to consume alerting information in multiple ways. We offer three wildly different but extremely complementary mechanisms. By default, a policy must have an email recipient for any alerts. So even if you aren’t logged into the Control Portal, you can instantly find out, in real time, if an alert condition has been met for the threshold period. Additionally, the Control Portal clearly displays when a server is in an alerting state. If you’re on the server’s details page itself, you’ll see a warning as well as the utilization indicator turned red. But even better, we highlight the offending server at different levels in the UI - in the left side navigation, the server’s group, and the group’s data center! This means that you can easily see where you have servers experiencing alerts from anywhere in the interface.
The final option is to configure a webhook. Recall that the CenturyLink Cloud offers webhook capabilities which push notifications to an external endpoint of your choosing whenever certain platform conditions occur. We’ve added a new webhook for “alert notification” that will send a data-rich message to any endpoint. For example, you could configure the webhook to feed into your support system so that the two environments (cloud and on-premises) are automatically integrated.
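As a sketch of what a receiving endpoint might do with such a message, here’s a small Python function that turns an alert notification into a support ticket. The payload fields (serverName, metric, threshold) are illustrative guesses – consult the Webhook FAQ for the real message schema:

```python
import json

def handle_alert_webhook(raw_body: str) -> dict:
    """Turn an incoming alert-notification payload into a support ticket.

    The payload fields used here are assumptions for illustration,
    not the documented CenturyLink Cloud webhook format.
    """
    event = json.loads(raw_body)
    return {
        "title": f"Cloud alert: {event['serverName']} {event['metric']} high",
        "body": (f"Utilization crossed {event['threshold']}% "
                 f"for the configured duration."),
        "priority": "high",
    }

sample = '{"serverName": "WEB01", "metric": "cpu", "threshold": 90}'
ticket = handle_alert_webhook(sample)
print(ticket["title"])  # Cloud alert: WEB01 cpu high
```

In practice this function would sit behind the HTTPS endpoint you register for the webhook, and the returned dictionary would be posted to your ticketing system’s API.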
Alerts aren’t helpful if you don’t know they are occurring! So, we’ve built in a host of ways to send notifications and quickly see relevant information.
We’re excited to ship this new capability, and have other plans for building upon these services. Don’t hesitate to provide feedback or feature suggestions by accessing the “feedback” link within the Control Portal!
“Getting a little bit of the right information just ahead of when it’s needed is a lot more valuable than all the information in the world a month or a day later.” That quote – found in the book The Two Second Advantage by Vivek Ranadive and Kevin Maney – highlights a new reality where responsiveness can be a competitive advantage. Smart companies are building a responsive IT infrastructure where data isn’t just hoarded in massive repositories, but analyzed quickly and acted upon. How can you know more, faster and have better situational awareness?
With an increasing amount of critical IT systems running in the cloud, there’s a need to know what’s happening and act on it. This month, CenturyLink Cloud introduced Webhooks, making us among the first public IaaS cloud providers to send real-time notifications to a web service endpoint. For this initial release, customers can set up Webhooks for events within accounts, users, and servers.
When To Use This?
Webhooks are a relatively new idea, though already used by diverse web properties like WordPress and Zoho. Let’s look at three different scenarios where CenturyLink Cloud Webhooks can lead to better decisions.
Scenario #1 – Data Synchronization
Polling is an inefficient way to retrieve data from an external system, but it remains a popular choice. When you poll a system for changes, you’re effectively asking “do you have anything new for me?” Many times, the answer is “no.” With push-based notifications, the only time you are contacted is when something relevant happens. For example, some customers synchronize CenturyLink Cloud data with their internal support or configuration management systems. They do this for auditing purposes, or to give support staff an accurate picture of cloud deployments. The issue? Staying in sync requires an aggressive polling frequency that needlessly encumbers both systems. Webhooks provide a better alternative.
In the scenario visualized below, as soon as a new server is created in the CenturyLink Cloud, an event fires and a message is sent to an endpoint specified by the customer. That listener service then updates the appropriate internal system. Within seconds, the systems are completely synchronized!
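A minimal sketch of that listener logic in Python might look like the following – the event types and field names are assumptions for illustration, not the platform’s actual event schema:

```python
# Sketch of the listener side of the scenario: webhook events keep an
# internal inventory in sync. Event names and fields are assumptions.
inventory = {}  # stands in for the internal CMDB / support system

def on_webhook_event(event: dict) -> None:
    """Apply one pushed event to the internal record store."""
    kind = event.get("type")
    server = event.get("serverName")
    if kind == "server.created":
        inventory[server] = {"status": "active", "source": "cloud"}
    elif kind == "server.deleted":
        inventory.pop(server, None)

on_webhook_event({"type": "server.created", "serverName": "APP02"})
print(inventory)  # {'APP02': {'status': 'active', 'source': 'cloud'}}
```

Contrast this with polling: no scheduled “anything new?” requests, no wasted round trips – the inventory changes only when the cloud itself reports a change.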
Scenario #2 – Anomaly Detection
People love the cloud because of the self-service capabilities and freedom to instantly create and delete servers at will. One downside of this freedom – for service providers anyway – is fraudulent signups. CenturyLink Cloud resellers actively monitor new accounts, but the sheer volume of manual analysis can be daunting. What if resellers could programmatically monitor specific sequences of events and then use that data to flag an account as “suspect” and deserving of special attention? Again, we turn to Webhooks to help react faster.
It’s great that developers can quickly bring gobs of new cloud machines online. But rapid provisioning can occur within the wrong sub-account or under unusual circumstances. In both of these examples, consider using a complex event processing solution that monitors streams of Webhook events and detects aggregate patterns that reveal more than any single event can.
Scenario #3 – Compliance Monitoring
Cloud and governance don’t have to be at odds with each other – and in fact, these two ideas go hand-in-hand when it comes to IT as a service. CenturyLink Cloud already provides customers with many ways to do this today through sophisticated account management capabilities. But we often get customers requesting a “corner case” scenario – like preventing a certain user from being added to an account, or making sure that database servers aren’t given a public IP address. Webhooks are a way for us to programmatically empower customers to support unique scenarios, in self-service fashion. Via Webhooks, users can capture events in a data repository and compare each new event to previous ones. This way, customers can immediately find out if a server was changed inappropriately, a user was added to an account, or the contact information was changed. If an out-of-compliance change is made, the customer can respond almost instantly!
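As an illustration, a compliance check for one of the corner cases above – database servers must never receive a public IP – might look like this hypothetical Python sketch. The event fields are assumed for the example, not the platform’s actual schema:

```python
# Hypothetical compliance rule: database servers must never be given a
# public IP address. Event field names are illustrative assumptions.
def check_compliance(event: dict) -> list:
    """Return a list of violation messages for one webhook event."""
    violations = []
    if (event.get("type") == "server.updated"
            and event.get("role") == "database"
            and event.get("publicIp")):
        violations.append(f"{event['serverName']}: database server "
                          f"assigned public IP {event['publicIp']}")
    return violations

print(check_compliance({"type": "server.updated", "serverName": "DB01",
                        "role": "database", "publicIp": "203.0.113.7"}))
# ['DB01: database server assigned public IP 203.0.113.7']
```

An empty list means the change is compliant; a non-empty list could trigger an alert to the account owner or even an automated rollback.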
It’s very simple to configure Webhooks in the CenturyLink Cloud. Simply visit the API section of the Control Portal and choose Webhooks. Here, users can browse the list of available Webhooks, then specify the “target” URL to receive a JSON-encoded message. Each Webhook is configured with an HTTPS URL, and includes an optional capability to send events that occur within sub-accounts.
For more details on how to create a Webhook listener service, take a look at our Webhook FAQ article in the Knowledge Base. This is an innovative and exciting capability for the platform and we can’t wait to see how customers use it to create more responsive systems and processes!
Elasticity is a core tenet of cloud computing. Cloud has become so popular precisely because resources can be adjusted up or down, instantly, based on business need. Manually resizing cloud environments is still MUCH easier than altering physical hardware. But human action is still required, adding human cost to the cloud.
A few cloud vendors have attempted to automate this process through “auto scaling” – services that expand and reduce the size of environments based on user-defined parameters. However, this capability by and large automates the addition and removal of virtual machines to an existing resource pool. In engineering terms, this is “horizontal scaling” – adding capacity across multiple virtual machines. This approach is useful for consumer applications (think Netflix scaling up for Saturday night), but the enterprise scenario is much different, as we found out in our market research when developing this feature.
While we always recommend that our customers build highly available cloud systems with no single points of failure, there is value in sizing those resources up and down (i.e. “vertical scaling”) instead of only being able to add or remove entire servers. Having multiple servers is key for fault tolerance, but some workloads can benefit from additional server capacity, not just more servers!
This month, CenturyLink Cloud introduced our new Autoscale service. The initial release is focused on vertical scaling of CPU resources, with more vertical scaling (and, yes, horizontal scaling!) on the roadmap. Today, you can now add and subtract CPUs from cloud servers based on user-defined utilization limits. Capacity is added instantly without a reboot and capacity is removed only during user-defined windows of time, to prevent a reboot from occurring during prime usage hours.
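The scaling rules just described can be sketched as a simple decision function. The thresholds and maintenance window below are illustrative assumptions, not platform defaults:

```python
from datetime import time

def autoscale_decision(cpu_util: float, now: time,
                       high: float = 85.0, low: float = 30.0,
                       window=(time(2, 0), time(4, 0))) -> str:
    """Sketch of the vertical-scaling rules described above.

    CPUs are added immediately under load (no reboot required); they are
    removed only inside a user-defined window, since removal forces a
    reboot that shouldn't happen during prime usage hours.
    Thresholds and the 2am-4am window are assumed values.
    """
    if cpu_util >= high:
        return "add_cpu"        # capacity added instantly, no reboot
    if cpu_util <= low and window[0] <= now <= window[1]:
        return "remove_cpu"     # reboot-safe maintenance window
    return "no_change"

print(autoscale_decision(92.0, time(14, 0)))  # add_cpu
print(autoscale_decision(10.0, time(14, 0)))  # no_change (outside window)
print(autoscale_decision(10.0, time(3, 0)))   # remove_cpu
```

The asymmetry is the key design point: scale-up is safe to do at any moment, while scale-down waits for a quiet period.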
Companies embrace the cloud because it offers agility, speed to market, self-service, rapid innovation, and yes, cost savings. There are plenty of cases where organizations can save money by using cloud resources, but it’s easy to focus on vendor compute and storage pricing, and forget about all the other financial components of a cloud application. See Joe Weinman’s Cloudonomics for an excellent analysis of how to assess the economic impact of using the cloud. An application can very easily cost MORE in the cloud – but that might still be just fine, since it helps the business shed some CapEx and remove servers from corporate data centers. In this post, we’ll talk about the full scope of pricing cloud applications and give you a useful perspective for assessing the overall cost.
Businesses deploy applications, not servers. A typical application comprises multiple servers that perform different roles. For instance, let’s consider an existing, commercial website that receives a healthy amount of traffic. It uses a load balancer to route traffic to one of multiple web servers, leverages a series of application servers for caching and business services, and uses a relational database for persistent storage.
To maximize revenue and customer satisfaction, the application is replicated in another geography for availability reasons and traffic can be quickly steered to the alternate site in the case of a disaster or prolonged outage.
“Hidden costs” often bite cloud users. This is especially true for those who buy from a cloud that offers “cheap virtual cores!” but also require you to buy countless other services to assemble an enterprise-class infrastructure landscape. Let’s look at each area where it’s possible – and likely – that you will incur a charge from your cloud provider.
- Application migration. If you are doing greenfield development in the cloud, then this won’t apply. But if you have existing applications that are moving to the cloud, there are a few migration-related costs. First, there can be a labor cost with doing virtual machine imports. Some cloud providers let you import for free; others charge you. In most cases, there is also a bandwidth charge for the transfer of virtual machine images. Finally, there’s likely a cost for storing the virtual machine image during the import process.
- Server CPU. This – along with RAM – is the number most frequently bandied about when talking about the costs of running a cloud application. Some providers let you provision the exact number of virtual CPU cores desired; others provide fixed “instance sizes” that come with a pre-defined allocation of CPUs and memory.
- Server memory. Cloud providers are ratcheting up the amount of RAM they offer to address memory-hungry applications, caching products, and in-memory databases.
- Server storage. There are many different types of storage (e.g. block storage, object storage, vSAN storage) and costs vary with each. Don’t forget to include the cost of storing data backups, virtual machine templates, and persistent disks that survive even after servers have been deleted.
- Bandwidth. It’s easy to forget about bandwidth, but it’s a charge that can bite you if you’re not expecting it! You may need to factor in public bandwidth, intra-data center bandwidth, inter-data center bandwidth, CDN bandwidth, and load balancer bandwidth. Not all of these may apply, and some may not be charged by your cloud provider, but it’s important to check ahead of time. Most cloud providers use the “GB transfer” model, charging for all data transferred – and penalizing customers for bursting above their commitments. CenturyLink Cloud utilizes the 95th percentile billing method, preventing surges in traffic from grossly affecting costs.
- Public IP addresses. Nearly every cloud provider offers a way to expose servers to the public Internet, and some charge for the use of public IP addresses. This is usually a nominal monthly charge, but one to consider for scenarios where there are dozens of Internet-facing servers.
- Load balancing. There is often a charge to not only use a load balancer, but also for the traffic that passes through it.
- VPN and Direct Connect. Cloud users are looking for ways to connect cloud environments to on-premises infrastructure, and vendors now offer a rich set of connectivity options. However, those options come at a cost. Depending on the choice, you could be subjected to fees for setup, operations, and bandwidth associated with these connections.
- Firewalls. This is usually baked into each cloud provider’s native offering, but you will want to check and make sure that sophisticated firewall rules don’t come with an additional charge.
- Server monitoring. Just because cloud servers aren’t in your data center doesn’t mean that you don’t need to monitor them! Depending on your monitoring needs, there can be a range of charges associated with standard and advanced monitors for each cloud server.
- Intrusion detection. Given that cloud servers are often accessible through the public Internet, it’s important to use a defense-in-depth approach that includes screening incoming traffic for potential attacks. CenturyLink Cloud is somewhat unique in offering this at no cost; you can get this sort of protection from other vendors, but rarely for free.
- Labor for integrating with on-premises assets. You don’t want to create silos in the cloud, and you will likely spend a non-trivial amount of time integrating with your critical applications, data, identity provider, and network. If this effort requires assistance from the cloud provider themselves, there could be a charge for that time and effort.
- Distributed, disaster recovery environments. Applications fail, and clouds fail. If you require very high availability, you may need to duplicate your application in other geographically-dispersed cloud data centers. You could choose to keep that environment “warm” by synchronizing a data repository while keeping web/application servers offline. Or, you may choose to build a truly distributed system that leverages active infrastructure across geographies. Either way, it’s possible that you’ll incur noticeable charges for establishing replica environments.
- Development / QA environments. Applications may run differently in the cloud than in your local data center. Hence, you could choose to provision pre-production environments in the cloud for building and running your applications.
- System administrator labor costs. One of the wonderful things about the cloud is the widespread automation that makes it possible to provision and maintain massive server clusters without adding to your pool of system administrators. However, there are still activities that require administration. This may involve server patching and software updates, deploying new applications, and scaling the environments. Some of those activities can be automated as well, but you should factor in human costs to your cloud budget.
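On the bandwidth point above, it’s worth seeing how 95th percentile billing differs from paying for every gigabyte. Here’s a short Python sketch of the standard 95th percentile method: sort the usage samples, discard the top 5%, and bill at the highest remaining value:

```python
import math

def ninety_fifth_percentile(samples_mbps: list) -> float:
    """Standard 95th-percentile billing method: sort the usage samples,
    discard the top 5%, and bill at the highest remaining value.
    Bursts that fall within the discarded 5% don't raise the bill."""
    ordered = sorted(samples_mbps)
    index = math.ceil(0.95 * len(ordered)) - 1
    return ordered[index]

# 100 five-minute samples: steady 10 Mbps with four 500 Mbps bursts.
samples = [10.0] * 96 + [500.0] * 4
print(ninety_fifth_percentile(samples))  # 10.0 -- the bursts are free
```

Under a per-GB transfer model, those four bursts would be billed in full; under 95th percentile billing, brief spikes are effectively forgiven.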
Places to save money
Given the various charges you may incur by moving to the cloud, how can you optimize your spend and take full advantage of what the cloud has to offer? Here are five tips:
- Don’t over-provision. Gone are the days when you have to request a massive server from an internal IT department because you MAY need the extra resources in the future and don’t want to deal with the hassle of upgrading the server later. CenturyLink Cloud makes it simple to change the number of virtual CPUs, amount of RAM, or amount of storage in seconds. Only spend money on what you need right now, and only pay more when you have to scale up. In addition, don’t settle for cloud providers who force you into fixed “instance sizes” that don’t deliver the mix of vCPU/RAM/storage that your application needs. CenturyLink Cloud encourages you to provision whatever combination of vCPU/RAM/storage that you want! In fact, we usually tell customers to under-provision to start with, and ratchet up resources as needed.
- Turn off idle servers. If you decide to create development or QA environments in the cloud, it’s likely that those environments will be fairly quiet over weekends. By shutting those down – and doing it automatically – you can potentially save hundreds or thousands of dollars per year.
- Automate mundane server management tasks. Running maintenance scripts or installing software on a cluster of servers is time consuming and tedious. CenturyLink Cloud provides an innovative Group capability that makes it possible to issue power commands, install software, and run scripts against large batches of servers.
- Add resource limits to prevent runaway provisioning. Elasticity is a foundational aspect of cloud computing, but it’s not a bad idea to establish resource caps. With CenturyLink Cloud for example, customers can define the maximum amount of vCPUs, memory, and storage that any one Group can consume.
- Carefully consider uptime requirements and disaster recovery needs. Even though the cloud makes it easier, it’s still not cheap or simple to build a globally distributed, highly available application. Evaluate whether you need cross-data center availability or a defined disaster recovery plan. The simplest solution for CenturyLink Cloud customers is to provision Premium block storage which provides daily snapshots and replication to an in-country data center. In the event of a disaster, CenturyLink Cloud brings up your server in an alternate data center and gets you back in business. If you want to avoid nearly any downtime, then you can architect a solution that operates across multiple data centers. To save money, you could choose to keep the alternate location offline but synchronized so that it could be quickly activated if needed.
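The “turn off idle servers” tip above can be automated with a simple schedule check. Here’s a hypothetical Python sketch – the environment tags and the weekend rule are assumptions for illustration, not platform features:

```python
from datetime import datetime

def should_be_powered_on(env: str, now: datetime) -> bool:
    """Weekend-shutdown rule sketch for pre-production servers.

    The 'env' tag and the schedule itself are hypothetical; adapt them
    to however your environments are labeled.
    """
    if env == "production":
        return True                  # production always stays up
    # dev/QA servers: powered off on weekends (Saturday=5, Sunday=6)
    return now.weekday() < 5

print(should_be_powered_on("qa", datetime(2014, 3, 8, 10, 0)))   # Saturday -> False
print(should_be_powered_on("qa", datetime(2014, 3, 10, 10, 0)))  # Monday -> True
```

A scheduled job could run a check like this and issue power-off commands for any non-production server outside its business-hours window, harvesting those weekend savings automatically.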
When considering all the services you need to deploy and operate enterprise-level business applications, the “cheap virtual cores!” pitch is less compelling. It’s about finding a cloud provider that offers an all-up, integrated offering that gives you the set of services you need to deploy and maintain a robust, connected infrastructure. Give CenturyLink Cloud a try and see if our innovative platform is exactly what you’re looking for!
In the coming months, CenturyLink Cloud will launch new, enterprise monitoring capabilities, powered by ScienceLogic and New Relic. We wrote a guest blog post for ScienceLogic describing our approach to monitoring – check it out here.