Mastering CloudTrail Logs, Part 2
Overview
In Part 1 of this series, we looked at what CloudTrail logs are, the value they provide, and some of the problems involved in processing and storing them. In Part 2, we look at how Observo helps organizations process CloudTrail logs at scale and derive value from them.
As a quick recap, let’s look at what a CloudTrail event looks like. For more details on the important fields in a CloudTrail event, refer to Part 1 of this series.
{
  "eventVersion": "1.07",
  "userIdentity": {
    "type": "AssumedRole",
    "principalId": "AROAEXAMPLEID:ExampleRole",
    "arn": "arn:aws:sts::123456789012:assumed-role/ExampleRole/ExampleSessionName",
    "accountId": "123456789012",
    "accessKeyId": "AKIAEXAMPLEKEY",
    "sessionContext": {
      "sessionIssuer": {
        "type": "Role",
        "principalId": "AROAEXAMPLEID:ExampleRole",
        "arn": "arn:aws:iam::123456789012:role/ExampleRole",
        "accountId": "123456789012",
        "userName": "ExampleRole"
      },
      "webIdFederationData": {},
      "attributes": {
        "mfaAuthenticated": "false",
        "creationDate": "2023-04-06T12:34:56Z"
      }
    }
  },
  "eventTime": "2023-04-06T12:36:56Z",
  "eventSource": "s3.amazonaws.com",
  "eventName": "GetObject",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "203.0.113.123",
  "userAgent": "aws-cli/1.23.4 Python/3.9.7 Darwin/21.1.0 botocore/1.23.4",
  "requestParameters": {
    "bucketName": "example-bucket",
    "key": "example-object.txt",
    "x-amz-id-2": "EXAMPLE-ID-STRING",
    "x-amz-request-id": "EXAMPLE-REQUEST-ID"
  },
  "responseElements": {
    "x-amz-request-id": "EXAMPLE-REQUEST-ID",
    "x-amz-id-2": "EXAMPLE-ID-STRING",
    "ETag": "EXAMPLEETAG"
  },
  "requestID": "EXAMPLE-REQUEST-ID",
  "eventID": "EXAMPLE-EVENT-ID",
  "eventType": "AwsApiCall",
  "apiVersion": "2021-10-08",
  "recipientAccountId": "123456789012"
}
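To make the structure above concrete, here is a minimal Python sketch that pulls the "who, what, when, where" out of one event. The function name summarize and the output keys are our own illustrative choices, not part of CloudTrail or Observo; the input field names are the ones shown in the sample event.

```python
def summarize(event):
    """Return a compact who/what/when/where view of a CloudTrail event.

    Hypothetical helper for illustration; missing fields come back as None.
    """
    identity = event.get("userIdentity", {})
    return {
        "who": identity.get("arn"),
        "what": f'{event.get("eventSource")}:{event.get("eventName")}',
        "when": event.get("eventTime"),
        "where": event.get("awsRegion"),
        "from": event.get("sourceIPAddress"),
    }
```

Applied to the sample event, this would report the assumed-role ARN as the caller and s3.amazonaws.com:GetObject as the action.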
Characteristics of CloudTrail Logs
The distribution of CloudTrail events in your organization’s AWS account mimics the activity that takes place in AWS. Some common characteristics of CloudTrail events are:
- Read actions often account for more than 50% of events. These include events like:
  - GetObject - Retrieves an object from an Amazon S3 bucket.
  - DescribeInstances - Describes Amazon EC2 instances.
  - DescribeDBInstances - Describes Amazon RDS DB instances.
  - ReceiveMessage - Receives messages from an Amazon SQS queue.
- The majority of CloudTrail events are generated by automated processes in your organization. These include CI/CD pipeline runs, tests that spin resources up and down in AWS, and clusters scaling up and down in response to traffic patterns to your services. The result is a repetitive pattern of CloudTrail events in your account.
- Some fields in a CloudTrail event contribute significantly to the overall size of each event but may not have analytical value in downstream SIEMs. For example, additionalEventData can account for over 10% of an event's overall size.
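You can check these characteristics against your own data. The Python sketch below is one rough way to measure the share of read actions and the share of an event's serialized size taken by each top-level field. The prefix list and both function names are assumptions made for illustration; they are not an Observo or AWS API.

```python
import json

# Assumption: event names starting with these verbs are treated as read actions.
READ_PREFIXES = ("Get", "List", "Describe", "Receive")


def read_share(events):
    """Fraction of events whose eventName starts with a read-style verb."""
    if not events:
        return 0.0
    reads = sum(
        1 for e in events if e.get("eventName", "").startswith(READ_PREFIXES)
    )
    return reads / len(events)


def field_size_share(event):
    """Share of the compact JSON size taken by each top-level field.

    Each field is counted as its quoted key, the colon, and its value.
    Shares sum to slightly less than 1 because the outer braces and the
    commas between fields are not attributed to any field.
    """
    total = len(json.dumps(event, separators=(",", ":")))
    return {
        key: (len(json.dumps(key)) + 1
              + len(json.dumps(value, separators=(",", ":")))) / total
        for key, value in event.items()
    }
```

Averaging field_size_share over a sample of events shows which fields dominate payload size in your account before you decide what to drop.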
Optimizing CloudTrail logs with Observo
Observo has native support for optimizing CloudTrail logs. Within minutes, customers typically realize a reduction of more than 50% in data volume for CloudTrail logs. In this section, we’ll go over how Observo can help you optimize your CloudTrail data while ensuring that high-signal data is forwarded downstream.
Aggregating CloudTrail Events
As mentioned in the previous section, CloudTrail events tend to be highly repetitive. Observo aggregates repetitive CloudTrail logs using its highly scalable aggregation engine. Events that have the same request, response, and caller and that occur within a 1-minute window are aggregated into a single event.
Here’s an illustration of how aggregation of events takes place:
As seen above, the events marked in yellow are aggregated into a single event because they are repetitive and were produced within a short period of time. This is typical of most production environments: automated processes like CI/CD and E2E testing pipelines generate the bulk of CloudTrail events.
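The idea can be sketched in a few lines of Python. This is not Observo's aggregation engine, only a simplified illustration under stated assumptions: the window is a fixed 60-second bucket derived from eventTime, the caller is identified by userIdentity.arn, and aggregatedCount is a hypothetical field name for the number of merged events.

```python
import json
from datetime import datetime, timezone

WINDOW_SECONDS = 60  # the 1-minute window, modeled here as fixed buckets


def _canonical(value):
    """Serialize a value deterministically so equal payloads compare equal."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"))


def aggregation_key(event):
    """Group by time bucket, caller, request, and response."""
    ts = datetime.strptime(event["eventTime"], "%Y-%m-%dT%H:%M:%SZ")
    bucket = int(ts.replace(tzinfo=timezone.utc).timestamp()) // WINDOW_SECONDS
    return (
        bucket,
        event.get("userIdentity", {}).get("arn"),
        event.get("eventSource"),
        event.get("eventName"),
        _canonical(event.get("requestParameters")),
        _canonical(event.get("responseElements")),
    )


def aggregate(events):
    """Keep the first event of each group and count how many it represents."""
    groups = {}
    for event in events:
        key = aggregation_key(event)
        if key not in groups:
            merged = dict(event)
            merged["aggregatedCount"] = 0  # hypothetical field name
            groups[key] = merged
        groups[key]["aggregatedCount"] += 1
    return list(groups.values())
```

In real traffic, values that differ on every call (request IDs, for example) would have to be left out of the key, otherwise no two events would ever match.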
Filtering CloudTrail Events
Observo gives you the ability to filter out CloudTrail events that are not of value to your organization. As described in the sections above, CloudTrail events tend to be dominated by read actions that take place in your AWS account. Below we have a Filter defined in Observo that removes all events that match the regular expression “List*|Get*|Describe*” and that come from a trusted source with a sourceIPAddress in the CIDR range 10.11.2.0/24. Note that this condition expression is customizable to match your organization's needs.
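For readers who want to see the logic of such a condition spelled out, here is a minimal Python sketch. It does not use Observo's filter syntax. It assumes the pattern is meant to match event names that start with List, Get, or Describe, and that the match is applied to the eventName field; both are our reading of the condition, and the function names are hypothetical.

```python
import ipaddress
import re

# Assumption: the filter targets event names beginning with these verbs.
READ_EVENT_PATTERN = re.compile(r"^(List|Get|Describe)")
TRUSTED_NETWORK = ipaddress.ip_network("10.11.2.0/24")


def is_trusted_source(source):
    """True if source is an IP address inside the trusted CIDR range."""
    try:
        return ipaddress.ip_address(source) in TRUSTED_NETWORK
    except ValueError:
        # sourceIPAddress can also hold a service name such as
        # "ec2.amazonaws.com"; treat anything non-numeric as untrusted.
        return False


def should_drop(event):
    """Drop only read-style events that come from the trusted range."""
    is_read = READ_EVENT_PATTERN.match(event.get("eventName", "")) is not None
    return is_read and is_trusted_source(event.get("sourceIPAddress", ""))


def filter_events(events):
    """Return the events that should still be forwarded downstream."""
    return [e for e in events if not should_drop(e)]
```

Both conditions must hold for an event to be dropped, so a GetObject call from an unknown address, or a write from a trusted one, is still forwarded.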
Reducing Event Payload of CloudTrail Events
Aside from aggregating and filtering CloudTrail events, Observo also lets you define which fields within each CloudTrail event are forwarded to downstream destinations. As seen below, fields such as “additionalEventData” usually do not add value from a security posture perspective. Observo gives you the ability to define a rich set of conditions that determine which fields are dropped from each event.
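As a rough illustration of condition-based field dropping, the Python sketch below pairs each condition with a list of dotted field paths to remove. The rule format, the function names, and the second rule are hypothetical examples of ours, not Observo configuration; only additionalEventData comes from the discussion above.

```python
import copy

# Illustrative rules: (condition, dotted field paths to drop when it holds).
DROP_RULES = [
    # Always drop additionalEventData.
    (lambda e: True, ["additionalEventData"]),
    # Hypothetical conditional rule: trim an S3-specific response header.
    (lambda e: e.get("eventSource") == "s3.amazonaws.com",
     ["responseElements.x-amz-id-2"]),
]


def drop_path(event, path):
    """Remove a (possibly nested) field in place; ignore missing paths."""
    *parents, leaf = path.split(".")
    node = event
    for part in parents:
        node = node.get(part)
        if not isinstance(node, dict):
            return
    node.pop(leaf, None)


def reduce_payload(event, rules=DROP_RULES):
    """Return a copy of the event with every matching rule applied."""
    reduced = copy.deepcopy(event)
    for condition, paths in rules:
        if condition(reduced):
            for path in paths:
                drop_path(reduced, path)
    return reduced
```

Working on a deep copy keeps the original event intact, which is convenient if the full record is also archived elsewhere.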
Conclusion
Observo enables your organization to get the most out of your AWS CloudTrail logs without compromising security posture. With Observo, you can realize a reduction of over 50% in CloudTrail log data volume within minutes. Look out for Part 3 of this series, where we will take a deep dive into a case study from our customers.
Learn More
For more information on how you can save 50% or more on your SIEM and observability costs while cutting the time to resolve critical incidents by more than 40% with the AI-powered Observability Pipeline, read the Observo.ai white paper, Elevating Observability with AI.