With a microservices architecture, distributed teams often need a central operational excellence team to make sure that the rest of the organization is following operational best practices.
For example, you might want to know whether lifecycle policies, versioning, and access policies are configured properly for objects in an Amazon S3 bucket. Proper configuration ensures that you have the desired retention and deletion policies, and helps you avoid accidental sharing of Amazon S3 objects.
Similarly, you might want to know whether teams have enabled Amazon DynamoDB auto scaling in their tables. Doing so increases throughput capacity (read and write capacity units) to handle increased traffic seamlessly, and lowers the throughput capacity when workloads decrease. This scaling means that you pay for the right amount of provisioned capacity. Finally, you might want to make sure that you configured Amazon CloudWatch alarms on your DynamoDB tables (or other AWS resources) for effective and automated response.
AWS provides services such as Amazon CloudWatch, AWS CloudTrail, AWS Config, and AWS Trusted Advisor to enable operational auditing. This blog post covers how you can use AWS Lambda and APIs provided by different AWS services to automate the auditing of your operational best practices.
In this post, you create an AWS Identity and Access Management (IAM) role for your Lambda function and use the DynamoDB API to review DynamoDB tables and indexes for different rules. You also set up automated notifications to let you know if your settings violate any of those rules. You can extend this solution to add more rules, or you can modify the solution to audit AWS services for operational best practices. The code in this post shows how to:
- Check the current AWS account limit for your DynamoDB tables.
- Calculate your total provisioned throughput—tables plus global secondary indexes (GSIs)—and warn you if it is greater than x percent of your AWS account limit.
- Calculate the provisioned throughput for each table and warn you if it’s greater than x percent of the table-maximum limit for your account.
- Check the provisioned throughput of each GSI and warn you if it is x percent less than the throughput of the table.
- Check whether you configured CloudWatch alarms for the following DynamoDB metrics logged by CloudWatch:
- Calculate the total number of warnings, and optionally write those warnings as a custom metric to CloudWatch.
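The throughput rules in the list above can be sketched as plain arithmetic over API responses. This is a minimal sketch, not the post's original function: it assumes `limits` is shaped like a DynamoDB `DescribeLimits` response and each entry in `tables` is shaped like the `Table` part of a `DescribeTable` response. The parameter names `threshold_pct` and `gsi_gap_pct` are hypothetical stand-ins for "x percent", and the GSI rule is interpreted here as "GSI capacity is more than `gsi_gap_pct` percent below the table's capacity". The per-table check adds GSI capacity to the table's own capacity, because the per-table maximum reported by `DescribeLimits` is documented as including GSIs.

```python
def percent_of(part, whole):
    """Return part as a percentage of whole; 0.0 if whole is zero."""
    return 100.0 * part / whole if whole else 0.0


def audit_throughput(limits, tables, threshold_pct=80.0, gsi_gap_pct=50.0):
    """Return warning strings for the account, table, and GSI throughput rules.

    threshold_pct and gsi_gap_pct are hypothetical names for "x percent".
    """
    warnings = []
    account_total = {"Read": 0, "Write": 0}
    for table in tables:
        name = table["TableName"]
        gsis = table.get("GlobalSecondaryIndexes", [])
        for kind in ("Read", "Write"):
            key = kind + "CapacityUnits"
            table_units = table["ProvisionedThroughput"][key]
            table_total = table_units
            for gsi in gsis:
                gsi_units = gsi["ProvisionedThroughput"][key]
                table_total += gsi_units
                # GSI rule (assumed reading): GSI capacity is more than
                # gsi_gap_pct percent below the base table's capacity.
                if gsi_units < table_units * (1 - gsi_gap_pct / 100.0):
                    warnings.append(
                        f"{name}/{gsi['IndexName']}: {kind.lower()} capacity "
                        f"{gsi_units} is well below the table's {table_units}"
                    )
            # Table rule: table plus its GSIs against the per-table maximum.
            used = percent_of(table_total, limits["TableMax" + key])
            if used > threshold_pct:
                warnings.append(
                    f"{name}: {kind.lower()} capacity at {used:.1f}% "
                    "of the per-table limit"
                )
            account_total[kind] += table_total
    # Account rule: everything provisioned against the account maximum.
    for kind, total in account_total.items():
        used = percent_of(total, limits["AccountMax" + kind + "CapacityUnits"])
        if used > threshold_pct:
            warnings.append(
                f"account: {kind.lower()} capacity at {used:.1f}% "
                "of the account limit"
            )
    return warnings
```

With boto3, the inputs would typically come from `client.describe_limits()` and `client.describe_table(TableName=name)["Table"]` on a DynamoDB client. The total number of warnings is then `len(warnings)`.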
Create an IAM role for your Lambda function
To get started, create an IAM role and attach an existing or new custom policy to it. The policy must grant the permissions that the function needs, including permissions for cloudwatch, as shown in the following policy:
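As a rough illustration of what such a policy might contain, the sketch below builds a policy document as a Python dictionary and prints it as JSON. The selection of actions is an assumption based on the checks this post describes (reading DynamoDB limits and table metadata, reading CloudWatch alarms, writing a custom metric, and writing function logs). The statement IDs and the variable name `AUDIT_POLICY` are hypothetical, so adjust the actions to match the API calls your function actually makes.

```python
import json

# Hypothetical policy document for the audit function. The actions listed
# are an assumption derived from the checks described in this post.
AUDIT_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadDynamoDBMetadata",
            "Effect": "Allow",
            "Action": [
                "dynamodb:DescribeLimits",
                "dynamodb:ListTables",
                "dynamodb:DescribeTable",
            ],
            "Resource": "*",
        },
        {
            "Sid": "ReadAlarmsAndWriteMetrics",
            "Effect": "Allow",
            "Action": [
                "cloudwatch:DescribeAlarmsForMetric",
                "cloudwatch:PutMetricData",
            ],
            "Resource": "*",
        },
        {
            "Sid": "WriteFunctionLogs",
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
            ],
            "Resource": "arn:aws:logs:*:*:*",
        },
    ],
}

# Serialize to the JSON text that the IAM console's policy editor accepts.
policy_json = json.dumps(AUDIT_POLICY, indent=2)
print(policy_json)
```

The printed JSON can be pasted into the JSON tab of the IAM policy editor when you create the custom policy.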
Create and configure your Lambda function
Next, create and configure a Lambda function:
- In the Lambda console, choose Create function.
- Under the Create function step, choose Author from scratch.
- Under Basic information, add the following information:
  - Name: AWSAccountAudit
  - Runtime: Python 3.6
  - Role: Choose an existing role
  - Existing role: Choose the IAM role that you created in the previous section
- Choose Create function.
- Under Function code, do the following:
  - Leave Code entry type, Runtime, and Handler at their default values.
  - Replace the existing code in the text box with the preceding code block.
- Under Basic settings, update the following:
  - Memory (MB): Leave the default setting (128)
  - Timeout: 5 min
- Choose Save to save changes.
- Under Configuration and Add triggers, choose CloudWatch Events from the list of available options.
- Under Rule, choose Create a new rule, and then fill in the following information:
  - Rule name: AWS-DynamoDB-Daily-Audit
  - Rule description: Daily audit of DynamoDB tables for operational best practices
  - Rule type: Scheduled expression
  - Schedule expression: rate(1 day). You can choose a different frequency depending on how often you want to audit your AWS account.
- Choose Add to add a trigger for your Lambda function.
- Choose Test. Under Configure test events, keep Create new test event selected and keep the Hello World event template. Type the event name, AWSAccountAuditTest, and choose Create.
- Choose Test to execute the Lambda function.
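For orientation, the function that this test invokes could be shaped roughly like the following sketch. It is not the post's original code. The helper names `list_all_tables` and `run_audit` and the `rules` parameter are hypothetical, while `list_tables`, `describe_table`, `ExclusiveStartTableName`, and `LastEvaluatedTableName` are part of the DynamoDB API. The sketch follows `ListTables` pagination so that accounts with many tables are fully covered, and returns the warning count as the function result.

```python
def list_all_tables(client):
    """Collect every table name, following ListTables pagination."""
    names, start = [], None
    while True:
        kwargs = {"ExclusiveStartTableName": start} if start else {}
        page = client.list_tables(**kwargs)
        names.extend(page.get("TableNames", []))
        start = page.get("LastEvaluatedTableName")
        if not start:
            return names


def run_audit(client, rules):
    """Apply each rule to each table description and collect the warnings.

    rules is a hypothetical list of callables; each takes the "Table" part
    of a DescribeTable response and returns a list of warning strings.
    """
    warnings = []
    for name in list_all_tables(client):
        table = client.describe_table(TableName=name)["Table"]
        for rule in rules:
            warnings.extend(rule(table))
    return warnings


def lambda_handler(event, context):
    """Entry point: run the audit, log each warning, return the count."""
    import boto3  # bundled with the Lambda Python runtime

    client = boto3.client("dynamodb")
    warnings = run_audit(client, rules=[])  # plug your rule functions in here
    for warning in warnings:
        print(warning)  # printed lines end up in CloudWatch Logs
    return {"warnings": len(warnings)}
```

Passing the client in as an argument keeps `run_audit` testable without AWS credentials, because a stub object with the same two methods can stand in for the real client.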
Under Execution result, you can see the number of warnings that result from this script. You can view the output details under Log output. If there are many warnings, they might not all appear in the Log output area. To get the complete output of the function, choose the Click here link under Log output to view the CloudWatch log group.
You can provide different parameters for the code in the Lambda function, including the AWS Region and the thresholds at which warnings are generated. The following code block shows how to provide these parameters.
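One possible way to supply such parameters is through Lambda environment variables with defaults in code, sketched below. The variable names that start with `AUDIT_`, the default values, and the `load_settings` helper are all hypothetical. `AWS_REGION` is the environment variable that the Lambda runtime sets to the function's own Region.

```python
import os


def load_settings(env=None):
    """Read audit parameters from environment variables, with defaults.

    All AUDIT_* names and the default values are hypothetical.
    """
    env = os.environ if env is None else env
    return {
        # Fall back to the function's own Region, then to a fixed default.
        "region": env.get("AUDIT_REGION", env.get("AWS_REGION", "us-east-1")),
        # Warn when usage exceeds this percentage of the account limit.
        "account_threshold_pct": float(env.get("AUDIT_ACCOUNT_THRESHOLD_PCT", "80")),
        # Warn when usage exceeds this percentage of the per-table limit.
        "table_threshold_pct": float(env.get("AUDIT_TABLE_THRESHOLD_PCT", "80")),
        # Warn when a GSI is this many percent below its table's throughput.
        "gsi_gap_pct": float(env.get("AUDIT_GSI_GAP_PCT", "50")),
        # Whether to write the warning count to CloudWatch as a custom metric.
        "publish_metric": env.get("AUDIT_PUBLISH_METRIC", "false").lower() == "true",
    }
```

Keeping the thresholds outside the code means that you can change them in the Lambda console without editing and redeploying the function.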
By automating the auditing of your DynamoDB tables, you can help make sure that operational best practices are being followed, and that you have automated monitoring and notifications in place.
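To make the monitoring part concrete, the alarm check and the optional custom metric from the rule list could be sketched as follows. This is an illustration under stated assumptions: the helper names, the `Custom/DynamoDBAudit` namespace, and the `AuditWarnings` metric name are hypothetical, and the caller chooses which metric names to check. `describe_alarms_for_metric` and `put_metric_data` are CloudWatch API operations, and `AWS/DynamoDB` is the namespace that DynamoDB publishes its metrics under.

```python
def tables_missing_alarms(cloudwatch, table_names, metric_names):
    """Return (table, metric) pairs that have no CloudWatch alarm."""
    missing = []
    for table in table_names:
        for metric in metric_names:
            response = cloudwatch.describe_alarms_for_metric(
                MetricName=metric,
                Namespace="AWS/DynamoDB",
                Dimensions=[{"Name": "TableName", "Value": table}],
            )
            # An empty MetricAlarms list means nobody is alerting on this.
            if not response.get("MetricAlarms"):
                missing.append((table, metric))
    return missing


def publish_warning_count(cloudwatch, count, namespace="Custom/DynamoDBAudit"):
    """Write the warning total as a custom metric (names are hypothetical)."""
    cloudwatch.put_metric_data(
        Namespace=namespace,
        MetricData=[
            {"MetricName": "AuditWarnings", "Value": count, "Unit": "Count"}
        ],
    )
```

Here `cloudwatch` would typically be `boto3.client("cloudwatch")`. Once the count is published as a metric, you can define a CloudWatch alarm on it to get notified when the audit finds problems.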
You can always add more rules. For example, you can read a few items at random from each table to understand how large the items in that table are. If very large items are being stored, the audit can recommend, for example, that you use Amazon S3 to store large objects and use DynamoDB to store key metadata, along with links to the objects in Amazon S3. You can also extend the Lambda function shown in this post by using APIs for other AWS services to make sure that you are following operational best practices for those AWS services.
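As a sketch of the item-size rule, the following code samples a few items and flags large ones. Several simplifications are assumptions on my part: it uses `scan` with `Limit`, which reads the first few items rather than a truly random sample. It approximates item size by the length of the item's JSON encoding, which overstates the size DynamoDB itself computes. The 100 KB default threshold and both helper names are hypothetical. For comparison, the maximum DynamoDB item size is 400 KB.

```python
import json


def sample_item_sizes(client, table_name, sample_size=5):
    """Return approximate sizes in bytes of a few items from the table.

    Uses scan with Limit, so this reads the first items found, not a
    random sample. Size is estimated from the JSON encoding of each item.
    """
    response = client.scan(TableName=table_name, Limit=sample_size)
    return [
        len(json.dumps(item, default=str).encode("utf-8"))
        for item in response.get("Items", [])
    ]


def large_item_warnings(client, table_name, max_bytes=100 * 1024, sample_size=5):
    """Warn for each sampled item above max_bytes (hypothetical threshold)."""
    sizes = sample_item_sizes(client, table_name, sample_size)
    return [
        f"{table_name}: sampled item is about {size} bytes; "
        "consider storing the payload in Amazon S3"
        for size in sizes
        if size > max_bytes
    ]
```

A rule like this needs the `dynamodb:Scan` permission on the audited tables and consumes read capacity, so keep the sample size small.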