Automate AWS Cleanup with Lambda: Delete Unnecessary EBS Snapshots
To delete unnecessary EBS snapshots in AWS, you can use an AWS Lambda function. This article walks through a Lambda function written in Python (using Boto3, the AWS SDK for Python) that finds stale snapshots and deletes them. An age check, such as deleting only snapshots older than a certain number of days, can be added as an extra condition.
If we create a snapshot and forget to delete it, AWS keeps charging for it. The same applies to many other resources, such as EBS volumes, S3 buckets, and EKS clusters. Resources that someone created and then forgot to delete are called stale resources. If the cost of the cloud platform is not managed efficiently, the bill keeps growing.
A DevOps engineer is expected to bring cloud costs down, which includes proactively checking the existing infrastructure for stale resources.
To avoid this cost, we will implement a Lambda function using Boto3, the Python module that talks to the AWS API. The function will delete snapshots that are no longer in use.
The Lambda function will fetch the details of all EBS snapshots and filter out the ones that are no longer in use (stale).
In this example, we'll create a Lambda function that identifies EBS snapshots that are no longer associated with any active EC2 instance and deletes them to save on storage costs.
Description:
The Lambda function fetches all EBS snapshots owned by the same account ('self') and also retrieves a list of active EC2 instances (running and stopped). For each snapshot, it checks whether the associated volume (if one exists) is attached to any active instance. If it is not, the snapshot is considered stale and is deleted, which reduces storage costs.
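The logic described above can be sketched roughly as follows. This is a minimal illustration written for this article, not the code from the GitHub repo used later. The helper name find_stale_snapshots is our own, the EC2 client is passed in as a parameter, and pagination of the Describe calls is ignored for brevity.

```python
def find_stale_snapshots(ec2):
    """Return IDs of snapshots whose volume is not attached to an active instance."""
    # Collect the IDs of all volumes attached to running or stopped instances.
    attached_volume_ids = set()
    response = ec2.describe_instances(
        Filters=[{"Name": "instance-state-name", "Values": ["running", "stopped"]}]
    )
    for reservation in response["Reservations"]:
        for instance in reservation["Instances"]:
            for mapping in instance.get("BlockDeviceMappings", []):
                ebs = mapping.get("Ebs")
                if ebs:
                    attached_volume_ids.add(ebs["VolumeId"])

    # A snapshot is treated as stale when its source volume is not in that set.
    stale = []
    for snapshot in ec2.describe_snapshots(OwnerIds=["self"])["Snapshots"]:
        if snapshot.get("VolumeId") not in attached_volume_ids:
            stale.append(snapshot["SnapshotId"])
    return stale


def lambda_handler(event, context):
    # boto3 is imported here so the helper above can be tried with a stub client.
    import boto3

    ec2 = boto3.client("ec2")
    deleted = []
    for snapshot_id in find_stale_snapshots(ec2):
        ec2.delete_snapshot(SnapshotId=snapshot_id)
        deleted.append(snapshot_id)
    return {"deleted": deleted}
```

With this shape, the function only needs permission to describe snapshots, describe instances, and delete snapshots.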
Let's walk through a practical example to understand this better:
There is an instance already in the running state in my account.
The volume below was created along with the EC2 instance.
Suppose one of the developers who has access to the console creates a snapshot of this volume every day. To create a snapshot, go back to the EC2 Dashboard.
Click on Snapshots.
Click on Create Snapshot.
Fill in the details, then click on Create Snapshot again to confirm.
The snapshot starts in the Pending state.
After a short while, the snapshot is completed.
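The same snapshot can also be created from code instead of the console. The sketch below assumes a Boto3 EC2 client is passed in; the function name create_volume_snapshot, the description text, and the volume ID in the comment are placeholders of our own.

```python
def create_volume_snapshot(ec2, volume_id, description="Daily manual snapshot"):
    """Create a snapshot of one EBS volume and return its ID and initial state."""
    response = ec2.create_snapshot(VolumeId=volume_id, Description=description)
    # A new snapshot normally starts in the "pending" state.
    return response["SnapshotId"], response["State"]


# Example (requires AWS credentials; the volume ID is a placeholder):
# import boto3
# ec2 = boto3.client("ec2")
# snapshot_id, state = create_volume_snapshot(ec2, "vol-0123456789abcdef0")
# ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])
```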
Now take a real-world scenario: the developer created the snapshot and forgot to delete it. He later deleted the EC2 instance, and the volume was deleted automatically along with it, but the snapshot remains available. This is the case the Lambda function handles.
Next, we will write the Lambda function.
Go to the Lambda console and click on Create function.
Keep the other settings as default and click on Create function.
Refer to the Python Boto3 documentation as well.
Now copy the code from the GitHub repo below and paste it into the function's code editor:
https://github.com/AmitP9999/aws-devops-zero-to-hero/tree/main/day-18
Then save the code (Ctrl+S) and click on Deploy.
After this, click on Test, give the Event name as “test”, and click on Save. Now click on Test.
We got the error below.
We have to fix this issue.
Steps to fix the timeout error (increase the Lambda timeout):
Navigate to the AWS Lambda Console.
Open your specific Lambda function.
Under the Configuration tab, go to the General configuration section.
Click Edit and, in Basic settings, increase the Timeout value (e.g., to 30 seconds, 1 minute, or whatever suits your task).
Here we increase the timeout to 10 seconds. Keep in mind that the default timeout is 3 seconds.
Then click Save.
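If you prefer to change the timeout from code, a minimal sketch looks like this. It assumes a Boto3 Lambda client is passed in; the helper name set_lambda_timeout and the function name in the comment are placeholders of our own.

```python
def set_lambda_timeout(lambda_client, function_name, timeout_seconds=10):
    """Update a function's timeout and return the value Lambda reports back."""
    # Lambda accepts timeouts from 1 second up to 900 seconds (15 minutes).
    if not 1 <= timeout_seconds <= 900:
        raise ValueError("timeout_seconds must be between 1 and 900")
    response = lambda_client.update_function_configuration(
        FunctionName=function_name,
        Timeout=timeout_seconds,
    )
    return response["Timeout"]


# Example (requires AWS credentials; the function name is a placeholder):
# import boto3
# set_lambda_timeout(boto3.client("lambda"), "my-snapshot-cleanup", 10)
```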
Point to remember: it is better to keep the timeout as small as possible, because Lambda charges based on how long the function runs. Also, after finishing this project, delete the snapshots and EBS volumes that you created.
Then go back to the code and test it again. This time we got the error below.
The error message, UnauthorizedOperation, indicates that the AWS Lambda execution role does not have the required permissions to perform the ec2:DescribeSnapshots action. To resolve this issue, you need to modify the IAM role that your Lambda function is using to include the necessary permissions.
Steps to fix the issue:
Identify the Lambda execution role:
In the AWS Lambda Console, navigate to your Lambda function.
In the Configuration tab, find the Execution role under the Permissions section.
This will show the IAM role associated with your Lambda function.
Add the necessary permissions:
Go to the IAM Console in AWS.
Locate and open the IAM role being used by your Lambda function.
Click on the Permissions tab and then Add inline policy.
In our case, we click on the Configuration tab and then on Permissions:
Now add the permission to this role.
Click on Add permissions and then Attach policies.
The policy was successfully attached to the role. Now let's try to execute the function.
Again we received the error message below.
Go back to the role. This time we have to create a custom policy to attach. Click on Add permissions and then Attach policies.
We didn't find any existing policy related to snapshots here, so we are going to create a custom policy. Click on Policies and then on Create policy.
Then choose a service.
Select EC2 as the service.
We select two actions: DescribeSnapshots and DeleteSnapshot.
Next, specify the resources:
Then click on Next.
Click Create policy.
The policy has been created. Now we have to go back to the role and attach the policy.
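For reference, the custom policy created in the console corresponds roughly to the policy document built below. This is a sketch under assumptions: the Resource value "*" is our assumption (the article does not state which resources were selected), and the helper names and the policy name in the comment are our own. The IAM client is passed in as a parameter.

```python
import json


def build_snapshot_policy(actions=("ec2:DescribeSnapshots", "ec2:DeleteSnapshot")):
    """Build an IAM policy document that allows the given EC2 actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": "*",  # assumption: all resources
            }
        ],
    }


def create_and_attach_policy(iam, role_name, policy_name, document):
    """Create a managed policy from the document and attach it to the role."""
    created = iam.create_policy(
        PolicyName=policy_name,
        PolicyDocument=json.dumps(document),
    )
    policy_arn = created["Policy"]["Arn"]
    iam.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    return policy_arn


# Example (requires AWS credentials; the policy name is a placeholder):
# import boto3
# create_and_attach_policy(
#     boto3.client("iam"),
#     "cost-optimization-ebs-snapshot-role-jppv2dge",
#     "ebs-snapshot-cleanup",
#     build_snapshot_policy(),
# )
```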
Click on Add permissions and attach the new policy. Now let's try to execute the function.
We got the same error again, so we opened the IAM role “cost-optimization-ebs-snapshot-role-jppv2dge”.
Click the Permissions tab, then select Add inline policy.
Create a new inline policy:
In the policy editor, select EC2 as the service.
Add the DescribeInstances action to the policy to grant permission for describing instances.
Give the policy the name “EC2_Describe******”.
Then click on Create policy. Now we run the code again and check the result: the Lambda function executed successfully.
But when we checked the console, the snapshot had not been deleted. This is expected, because its volume is still attached to the running instance.
Now we will delete the instance. Once the instance is terminated, the volume is deleted as well.
I’m going to terminate the instance.
The instance has now been terminated. On the EC2 console, the volume that was attached to that instance is also gone, but the snapshot is still available. This is exactly the kind of stale snapshot the Lambda function is meant to remove.
Within an organization, we would usually add one more condition to the code. Instead of deleting a snapshot directly, we should verify when the EBS volume was last used, or apply a threshold such as 30 days, so that a snapshot is deleted only if it is more than 30 days old or was last used more than 30 days ago.
We can add if/else conditions to the code as needed. For example, we can specify that a snapshot is also deleted if it is not attached to any volume.
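A 30-day age check of this kind could look like the sketch below. It assumes each snapshot is a dictionary as returned by Boto3's describe_snapshots call, where StartTime is a timezone-aware datetime. The helper name is_older_than and the now parameter (useful for testing) are our own.

```python
from datetime import datetime, timedelta, timezone


def is_older_than(snapshot, days=30, now=None):
    """Return True if the snapshot was started more than `days` days ago."""
    now = now or datetime.now(timezone.utc)
    return now - snapshot["StartTime"] > timedelta(days=days)


# Inside the cleanup loop, the extra condition could then be written as:
# if is_stale and is_older_than(snapshot, days=30):
#     ec2.delete_snapshot(SnapshotId=snapshot["SnapshotId"])
```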
Happy Learning!!