How to find unused RDS instances in AWS
It is not uncommon to have developers run their own development servers in AWS. Sometimes things are too large to run locally. The problem is that those servers sit idle a lot and cost money doing so. This becomes especially true if you have a large database running on RDS.
I have scripted the deployment of a dev RDS database. With our snapshots that can take up to ten minutes to deploy. IMHO that is fast enough that you can spin one up when you need it. The argument that I am facing is that they “always” need the RDS database. I disagree and set out to prove this.
I believe in public shaming to change behavior. So I started sending slack messages when I would see a database that was up for an extended period of time. But that approach did not work and took a lot of my time. The argument that came back was that they were “using” the database during that time. This was a daily manual process which was not fun for me. So I set out to automate this and get a little creative as well.
The bash script I am going to explain here can be found on my public gist. First off why is this bash? I am using the aws cli for all the heavy lifting so it made sense. Plus I love me some bash.
First get all the databases that are “available”:
aws rds describe-db-instances --output text --query 'DBInstances[*].{DBInstanceIdentifier:DBInstanceIdentifier,InstanceCreateTime:InstanceCreateTime,DBInstanceStatus:DBInstanceStatus}' | grep available
Now that we have the list, let’s loop through it and use some cloudwatch metrics. Not many people realize the wealth of data in cloudwatch. In this script we are going to grab the “DatabaseConnections” metric. In my gist we are looking for how many database connections in the last hour. In my cron script I am looking over a 12 hour period. Kinda hard to say you needed that server when you have not even connected to it in the past 12 hours. You can change that as you need.
for j in $servers do server=$(echo "$j" | cut -f1 -d$'\t') update=$(echo "$j" | cut -f3 -d$'\t') connections=$(aws cloudwatch get-metric-statistics --metric-name DatabaseConnections --start-time $STARTDATE --end-time $UTCDATE --period $period --namespace AWS/RDS --statistics Maximum --dimensions Name=DBInstanceIdentifier,Value=$server --output text --query 'Datapoints[0].{Maximum:Maximum}') if [ "$connections" == "0.0" ] then echo "Server $server has been up since $update" echo "There have been $connections maximun connections in the last hour" echo "To terminate this instance run one of the following commands:" echo "aws rds delete-db-instance --db-instance-identifier $server --final-db-snapshot-identifier ${server}-final-${MYDATE}" echo "aws rds delete-db-instance --db-instance-identifier $server --skip-final-snapshot" echo "---------------------------------------------------------------------------------" fi done
In my production script I send the message to slack, but you can change that as you need. Now when this gets put in a public channel it is hard to defend.