New: Alert Management System For AWS

When we talk to our customers about their AWS bills, the problem most of them mention is the lack of alerts for the AWS services they use. When many services are running and different people are accessing them, it becomes hard to keep track of everything in use.

AWS charges for these services per hour, and the only way to keep track of them is through the reports AWS provides. These reports arrive with a 24-hour delay, which can be catastrophic if a spike in a particular instance adds thousands of dollars to your AWS bill.

There are many ways to forget about a running service: an instance left alive in the testing environment, an underutilized S3 bucket, and so on. To address this concern of our customers, we are introducing our Alert Management tool.

Your Environment, Your Rules

While discussing alert management with our customers, we learned that they didn't want generic alerts that become monotonous after a point. We realized that every organization has a unique architecture, and a one-size-fits-all approach won't work here. To combat this problem, we designed our alert management system so that you have full control over the alerts. This is how it works:

Our Alert Management system dashboard has two tabs:

  • Alert Rules: Shows the alerts you currently have set and their details, which are explained later in this post.
Alert Management Dashboard
  • Alert History: This tab is designed to help you keep track of all the alerts you have set in the past, so you can cross-verify that every service is covered and you're not missing anything.

Create Alerts, Cut Chaos

On the top right of the Alert Management dashboard you will see a Create Alert option. Clicking it presents various options for configuring your alert. Let us walk through them:

  • Name: You can specify the name of the alert according to your use case.
Name and Description of Alert
  • Description: Optionally, add a description for your alert so that it is easily understandable if someone else accesses the dashboard.
  • Services: You can choose from 20 services, such as Amazon DynamoDB, EC2, S3, and Redshift, to set an alert for. We are rigorously adding more services to expand the options.
Select the service for the alert
  • Linked Account: After choosing the service, specify which account you want the alert for. If you have more than one AWS account, this helps you choose the appropriate one.
Choose the linked account
  • Instance Type: If you are setting alerts for AWS EC2, you get the option of choosing which particular instance type you want the alert for.
Select the type of instance
  • Resource ID: To make alerts more specific and helpful, we give you the option of selecting the particular resource ID you want the alert for.
  • Region: You can also choose the region for which you want the alert set up, such as us-east-1 or us-west-2.
Choose the appropriate region
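Conceptually, the options above combine into a single alert rule. Here is a minimal sketch in Python; the field names are illustrative, not OpsLyft's actual schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AlertRule:
    """Hypothetical shape of an alert rule built from the options above."""
    name: str
    service: str                        # e.g. "AmazonEC2"
    linked_account: str                 # AWS account ID
    description: str = ""
    instance_type: Optional[str] = None  # only meaningful for EC2 alerts
    resource_id: Optional[str] = None    # narrow the alert to one resource
    region: Optional[str] = None         # e.g. "us-east-1"

# Example: an EC2 spend alert scoped to one instance type in one region.
rule = AlertRule(
    name="staging-ec2-spend",
    service="AmazonEC2",
    linked_account="123456789012",
    instance_type="t3.large",
    region="us-east-1",
)
```

Optional fields default to `None`, mirroring how Instance Type, Resource ID, and Region are refinements rather than requirements.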

When you're done setting the rules of the alert, click Next. You then choose a few important things concerning the people involved and the costs related to the alert:

  • Cost Type: You can choose the type of cost you want the alert based on, which can be either the Actual cost or the Forecasted cost of that particular service.
  • Alert Threshold and Period: Specify the cost threshold that, when breached, triggers an alert for that service. In addition, choose whether the threshold applies to the daily, weekly, or monthly cost of the service.
Configure the thresholds for your alerts
  • Email Recipients: Specify the email address(es) of the people who should receive the alert so they can rectify the issue immediately.
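The threshold-and-period logic described above can be sketched in a few lines of Python. This is an illustration of the idea only, not the product's actual implementation, and all names here are made up:

```python
# Map each alert period to the number of trailing days it covers.
# (A "monthly" window of 30 days is a simplifying assumption.)
PERIOD_DAYS = {"daily": 1, "weekly": 7, "monthly": 30}

def should_alert(daily_costs, threshold, period="daily"):
    """daily_costs: list of per-day USD costs, most recent day last."""
    window = PERIOD_DAYS[period]
    spend = sum(daily_costs[-window:])   # spend over the chosen period
    return spend > threshold

def recipients_to_notify(daily_costs, threshold, period, emails):
    """Return the configured email recipients only if the threshold is breached."""
    return emails if should_alert(daily_costs, threshold, period) else []
```

For example, three days of spend `[40, 55, 62]` against a weekly threshold of 150 sums to 157 and would trigger the alert.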

After this, just click Next and you'll be taken to a confirmation page where you can review the parameters of your alert and confirm it.

Preview of your configurations
Review the parameters before you confirm

Let’s Talk!

We are just getting started with our product and will keep adding features as we move forward. Our product is already deployed at Innovaccer, Niki.ai, and others, and has helped them save more than $300k in cloud bills. If you want to know more about how we can help you cut down your cloud costs, contact us at contact@opslyft.com or leave your email address in the dialog box and we will get in touch.

Schedule A Demo!

OpsLyft: Your Personal Cloud Economist

Whenever you go online, you are using some kind of cloud service, whether you realize it or not. In this modern, do-it-as-you-go world, cloud technologies have played a pivotal part in paving a path for companies to deploy their products and services as fast as possible.


The cloud has been deeply integrated into the infrastructure of most organizations in the world, especially tech companies. But the cloud can't be all rosy, right? There has to be something that irritates organizations, something most common folks don't know about. The culprit here is cloud bills.


Major cloud providers are really smart when it comes to billing. Their pay-as-you-go model is hard to keep track of once things reach enterprise scale. The result is cloud bills running into millions of dollars, leaving companies puzzled, because the bills and the providers' native cost explorers don't give a clear picture of what accumulates such massive amounts.


We recognized this lack of information and built our product to combat the two major problems organizations face: lack of visibility and slow remediation. Our product focuses on providing industry-leading visibility into your AWS costs, down to the cost incurred by the smallest instance. And for taking prompt action on whatever is causing a surge in your cloud bill, we built our Plutus CLI, so developers can focus on developing more and debugging less. We have built the following to help you cut down your cloud costs:


Account Summary Dashboard


This dashboard is made for executives that want to have a holistic view of their AWS account and get a good amount of information without fiddling with a lot of data. You get an idea of your AWS costs and what steps you need to take to control them if the cost is more than what you expected. Account Summary Dashboard provides you with the following metrics:


  • All accounts' summarised details on a single screen.

  • A cost trend for the account, giving a brief view of how costs are shaping up through the month.

Account Summary

  • Forecasts for the top 5 costliest services by day, week, and month.

  • Operational details showing the real-time cost of the top five resources of basic services.

Top 5 Costliest Services

  • Determine the cost wasted on idle resources, which could otherwise be avoided.

  • Track down the cost of untagged resources, which is highly beneficial for pinning down unexpected costs.

True Cost Explorer 


True Cost Explorer is our answer to the various native cost explorer tools out there. The biggest problem with native cost explorers is the lack of clarity: you can't get a clear picture of why you are getting such massive cloud bills, and the metrics they provide can't help pinpoint what needs to be done to curb them.

We made a cost explorer that provides everything you need when it comes to “exploring” your AWS account. The benefits of our true cost explorer are the following:


  • To make tracking cloud costs easy, we segregate costs into different categories, namely the cost of each service, account, region, instance type, usage type, API operation, and resource ID, all on a single screen, reducing your monitoring time. You can go as deep as you want to explore your AWS costs.

True Cost Explorer

  • We don't force a particular view on the cost explorer, which could restrict or overwhelm your analysis. A number of filters are in place so that you can slice and dice the data however you like and decipher your cloud costs according to your needs.

  • With our True Cost Explorer, you can even keep track of cost for an individual resource.
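The "slice and dice" idea above boils down to grouping flat cost records by whichever dimension you care about. A minimal sketch, with hard-coded sample records standing in for real billing data (the record fields are assumptions for illustration):

```python
from collections import defaultdict

# Sample flat cost records; in reality these would come from billing data.
records = [
    {"service": "AmazonEC2", "region": "us-east-1", "cost": 120.0},
    {"service": "AmazonS3",  "region": "us-east-1", "cost": 30.0},
    {"service": "AmazonEC2", "region": "us-west-2", "cost": 45.0},
]

def group_costs(records, dimension):
    """Total cost per value of the chosen dimension (service, region, ...)."""
    totals = defaultdict(float)
    for r in records:
        totals[r[dimension]] += r["cost"]
    return dict(totals)
```

The same records can be re-grouped by `"service"`, `"region"`, or any other field without reshaping the data, which is what makes a filter-driven explorer flexible.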

Idle Resource Coverage Dashboard


One of the biggest culprits behind cloud costs is idle resources running in the background. These resources are hidden from plain sight and hardly anyone pays attention to them, so they keep churning up your cloud bill, and you realize it when it's already too late.

The main challenge with idle resources is that they are hard to trace back in a big infrastructure. The time that goes into hunting for such resources and shutting them down could instead be used to build more features and products. With this in mind, we made an Idle Resource Coverage dashboard that helps developers as follows:


  • To keep track of cost wastage, the dashboard has an idle-resource count tab that tells you how many idle resources exist in your AWS account, along with the total cost these idle resources add to your bill per day.

Idle EC2 Services

  • So far we cover detailed data for cost wasted in the EC2 service, namely Elastic IP addresses (EIP), Elastic Load Balancing (ELB), Elastic Block Storage (EBS), and underutilized EC2 instances.
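The detection logic can be sketched simply: a resource with no attachment is idle, and its per-day cost is pure waste. In a real implementation the inventory would come from the AWS APIs; here it is hard-coded sample data and the fields are illustrative:

```python
# Sample inventory; a real tool would fetch this from AWS (e.g. volumes,
# Elastic IPs) rather than hard-coding it.
resources = [
    {"id": "vol-0a1", "type": "EBS", "attached": False, "daily_cost": 1.60},
    {"id": "eip-0b2", "type": "EIP", "attached": False, "daily_cost": 0.12},
    {"id": "vol-0c3", "type": "EBS", "attached": True,  "daily_cost": 2.40},
]

def idle_resources(resources):
    """Resources not attached to anything are considered idle."""
    return [r for r in resources if not r["attached"]]

def idle_daily_cost(resources):
    """Total per-day spend attributable to idle resources."""
    return sum(r["daily_cost"] for r in idle_resources(resources))
```

With the sample data above, two of the three resources are idle, wasting $1.72 per day.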

Untagged Resources Dashboard


Tagging is considered one of the most fundamental principles of good engineering practice, and organizations abide by it with pride. But as a company grows and its resources multiply, efficient tagging becomes hard to practice no matter how much you try.

Untagged resources contribute significantly to cloud bills. With a plethora of processes and activities going on, they get lost and are forgotten until the AWS bill arrives and you wonder what drove it up. To solve this problem, we made our Untagged Resources dashboard, which:


  • Identifies resources that are not tagged but still contribute to cloud cost, solving the challenge of tracking down unexpected spend.

Untagged Resources

  • Provides detailed views that help you track untagged resource cost by service, down to the individual resource level.
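At its core, the dashboard answers one question: which untagged resources are spending money, grouped by service? A toy sketch of that roll-up (the inventory and its fields are invented for illustration):

```python
from collections import defaultdict

# Sample inventory; real data would come from AWS resource and billing APIs.
inventory = [
    {"id": "i-01", "service": "AmazonEC2", "tags": {"team": "web"}, "cost": 50.0},
    {"id": "i-02", "service": "AmazonEC2", "tags": {},              "cost": 20.0},
    {"id": "b-03", "service": "AmazonS3",  "tags": {},              "cost": 5.0},
]

def untagged_cost_by_service(inventory):
    """Sum the cost of resources with no tags, keyed by service."""
    totals = defaultdict(float)
    for r in inventory:
        if not r["tags"]:
            totals[r["service"]] += r["cost"]
    return dict(totals)
```

Here $25 of spend has no tags at all, which is exactly the spend that otherwise goes unexplained on the bill.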

Root Cause Analysis Dashboards


The problem with the data provided by native cost explorers is the delay between when data is collected and when it actually appears on the dashboard. This gives you only a vague idea of the resources causing a cost surge, and by the time you act, the metrics have already changed. We combat this by providing near-real-time data, delayed by no more than about 5 minutes, so you can take action before it's too late.

Our RCA dashboards are your go-to for solving issues without hunting for data in a number of places. They are designed in such a way that:


  • They solve the limitations of AWS Cost Explorer and AWS CloudWatch by providing cost data down to the very last instance.

  • They monitor anomalies in cost and performance for basic services in real time, to avoid unexpected cost spikes or server crashes.

  • They provide a detailed real-time cost breakdown for every cost-affecting variable or static action for the service.

Detailed S3 Real Time Costs

  • They provide detailed performance metrics in real time.

  • Cost and performance metrics can be tracked even to the resource level for the service.

Plutus CLI


There are many other cloud cost management tools on the market that also provide a number of dashboards. While these dashboards are useful to finance teams and top management, they don't serve developers well enough. Developers get an idea of where cost is rising or which service or account is misbehaving, but they have no way to solve the problem quickly; they have to sit down and work through the issue with rigorous effort.


We realized this pain point and developed our Plutus CLI, which empowers developers to take quick action and solve problems as soon as possible so they can focus on developing more. We built the CLI with the following features, keeping developers in mind:


  • A built-in action set to track unwanted resources and surface potential cost optimizations.

  • Developers have the freedom to list and remove all potential cost-optimization opportunities and unused resources directly from the command line.

Plutus CLI

  • The engineering team can set up a custom, configuration-driven alert management system through the command line to keep a check on unwanted cost spikes.

  • You get notified whenever cost crosses a user-defined threshold at the service or resource level.

List Of Commands

  • You can even deploy alerts from a YAML configuration and check alert history through the command line.
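For illustration, a deployable alert definition might look like the following YAML. The keys shown here are hypothetical, not Plutus's actual schema:

```yaml
# Hypothetical alert configuration; field names are illustrative only.
alert:
  name: staging-ec2-spend
  service: AmazonEC2
  account: "123456789012"
  cost_type: actual        # or: forecasted
  threshold_usd: 500
  period: monthly          # daily | weekly | monthly
  notify:
    - ops@example.com
```

Keeping alerts in version-controlled configuration like this means the engineering team can review and roll back alert changes the same way they review code.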

We are just getting started with our product and will keep adding features as we move forward. Our product is already deployed at Innovaccer, Niki.ai, and others, and has helped them save more than $300k in cloud bills. If you want to know more about how we can help you cut down your cloud costs, feel free to fill in the form below and we will give you a product demo.


Schedule A Demo!

Strong Interdependency Between AI & DevOps

As a part of my entrepreneurial journey, meeting industry experts and gathering insights from them is a regular thing now. I met someone this weekend, and we talked about what it takes to build an ideal DevOps world: a world where product owners, development, QA, IT Operations, and Infosec work together, not only to help each other, but also to ensure that the overall organization succeeds. By working towards a common goal, they enable the fast flow of planned work into production (e.g. performing tens, hundreds, or thousands of code deploys per day) while achieving world-class stability, reliability, availability, and security.

In this world, cross-functional teams rigorously test their hypotheses about which features will most delight users and advance the organization's goals. They care not just about implementing user features, but also about actively ensuring their work flows smoothly and frequently through the entire value stream without causing chaos and disruption to IT Operations or any other internal or external customer.

But this is hard: the process of constant improvement with human involvement is a challenge when you operate at scale. Hence, automation is the key! It's the lifeline. It contributes to better sync among the teams and, eventually, faster and more accurate deployments and releases.

However, can we make our automations smart and self-learning? Think about automation on top of automations that just knows what you want for your infrastructure!

Yes, I'm talking about using AI and machine learning capabilities to enhance DevOps. But to realize any benefit from AI in DevOps, a creative mindset is required. AI can change how DevOps teams develop, deliver, deploy, and organize applications, improving both performance and the day-to-day business operations of DevOps.

The future of DevOps is AI-driven, helping to manage the immense capacity of data and computation in day-to-day operations. AI has the potential to become the primary tool for assessing, computing and decision-making procedures in DevOps.

For example, consider the most effective medical diagnostic processes: you don't depend on automated detection alone. You use that detection to empower a human diagnostician, who can apply a broad understanding of pathologies and deep experience with the complexities of individual patients to deliver the highest quality of care. In DevOps, we can do the same: use AI to capture insights that teach us how to continuously optimize our workflows and processes, and use those learnings to push our work higher up the value chain.

Collaboration between DevOps and AI has numerous use cases. Some of them are:

  1. Smarter Development: We all learn through iteration, and the same goes for machines. Most machine learning systems use neural networks: sets of layered algorithms that accept multiple data streams and process them through the layers. You train them by inputting past data with a known result. These learning systems can also be applied to data collected from other parts of the DevOps process, including traditional development metrics such as velocity, burn rate, and defects found.
  2. Smarter Monitoring: If you're beyond the beginner's level in DevOps, you are likely using multiple tools to view and act upon data, each monitoring the application's health and performance in a different way. What we lack, however, is the ability to find relationships across this wealth of data from different tools. Learning systems can take all of these disparate data streams as inputs and produce a more robust picture of application health than is available today.
  3. Predicting Faults: This relates to analyzing trends. If you know that your monitoring systems produce certain readings at the time of a failure, a machine learning application can look for those patterns as a prelude to a specific type of fault. If you understand the root cause of that fault, you can take steps to prevent it from happening.
  4. Feedback Mechanisms: One of the biggest problems with DevOps is that we don't seem to learn from our mistakes. Even with an ongoing feedback strategy, we likely have little more than a wiki describing the problems we've encountered and what we did to investigate them. All too often, the answer is that we rebooted our servers or restarted the application. Machine learning systems can dissect the data to show clearly what happened over the last day, week, month, or year. They can look at seasonal or daily trends and give us a picture of our application at any given moment.
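The fault-prediction idea above can be illustrated with a deliberately simple baseline: flag a metric reading as anomalous when it sits far outside the recent norm. Real systems would use much richer models; this toy sketch only shows the principle, and all names in it are made up:

```python
import statistics

def is_anomalous(history, reading, z_threshold=3.0):
    """Flag `reading` if it is more than `z_threshold` standard deviations
    from the mean of the recent `history` of the metric."""
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history)
    if stdev == 0:
        # Flat history: any deviation at all is unusual.
        return reading != mean
    return abs(reading - mean) / stdev > z_threshold
```

Against a stable baseline like `[100, 102, 98, 101, 99]`, a reading of 160 is flagged while 101 is not. A production system would feed such a detector with the correlated streams described in Smarter Monitoring rather than a single series.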

We could literally derive numerous use cases over a coffee when it comes to pairing AI with DevOps. Having said that, it's very important to know your DevOps first. As enticing as it may be to dive headfirst into AI, you won't be as effective as you could be if you lose the humanity from your dev team. You don't want to be so reliant on robots, and so dysfunctional as humans, that when complex problems arise you are functionally unable to process or resolve them. At OpsLyft, we believe the future of AI and DevOps is bright. There's a future here where the rote business of work we all deal with every day will be as archaic as accounting by hand. We're in an exciting time.

We'd love to hear your stories about DevOps automation and possible AI/ML use cases for it. Reach out to us at contact@opslyft.com; we can surely help you enhance your cloud by simplifying DevOps for you 🙂