Automated Traffic Mirroring in AWS with Full Coverage

Other Posts in This Series

  1. Automated Traffic Mirroring in AWS
  2. Automated Traffic Mirroring in AWS at Scale
  3. Automated Traffic Mirroring in AWS with Full Coverage (you are here)
  4. Automated Traffic Mirroring in AWS with Tags

The previous posts in this series focused on a single use case: the RunInstances operation is hooked by Lambda via CloudWatch Rules to automatically create VPC Traffic Mirror Sessions for all attached ENIs within a few seconds of any new instance being created. We then decoupled the Lambda components so the solution can handle whatever load you throw at it.

The next thing to do is expand the coverage of this solution to ensure there are as few scenarios as possible where you might miss packets. To do this, we will now be adding support for the following Operations:

  • StartInstances: by subscribing to this operation you will ensure coverage for any existing instances that do not already have active Mirror Sessions, as soon as the request is made to start the instance.
  • AttachNetworkInterface: this provides coverage for existing instances that have additional network interfaces (ENIs) attached. This can occur long after an instance is started, or almost immediately after the instance is created if not all of the ENIs were included in the initial RunInstances operation.
  • DeleteTrafficMirrorSession: in the event a Traffic Mirror Session is deleted, either by accident or with malicious intent, the updated Lambda function will use this as an excuse to crawl through all of your EC2 instances and ensure Traffic Mirror Sessions are properly set up.

Thankfully we’ve already done most of the hard work. The rx-vpctm-create function created in the previous post doesn’t even need to change! All we need to do is update our rx-vpctm-manager function to handle the new operations by building a list of EC2 Instance IDs to pass to rx-vpctm-create. That function doesn’t care at all why it’s being asked; it just takes an Instance ID and does its job.

Step 1: Update the existing Manager Function in Lambda
You’ll notice that the RunInstances and StartInstances Operations are able to share the same code for creating our list of instanceIds. The first difference is with the AttachNetworkInterface operation, which does not support multiples so at this time we just grab the instanceId and away we go… easy.
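
To make the difference concrete, here is a minimal sketch of where the instance IDs live in each event. The sample events are abbreviated and hand-written (real CloudTrail events carry many more fields, and the IDs are made up), and extractInstanceIds is a hypothetical helper that only mirrors the first branches of the switch statement in the manager function.

```javascript
// Abbreviated, hand-written event samples: real CloudTrail events carry many
// more fields, and the IDs below are made up for illustration.
const sampleEvents = {
  RunInstances: {
    detail: {
      eventName: 'RunInstances',
      responseElements: {
        instancesSet: {items: [{instanceId: 'i-0aaa'}, {instanceId: 'i-0bbb'}]}
      }
    }
  },
  AttachNetworkInterface: {
    detail: {
      eventName: 'AttachNetworkInterface',
      requestParameters: {instanceId: 'i-0ccc', networkInterfaceId: 'eni-0ddd'},
      responseElements: {attachmentId: 'eni-attach-0eee'}
    }
  }
}

// Hypothetical helper mirroring the first two branches of the switch statement
function extractInstanceIds (event) {
  const {eventName, requestParameters, responseElements} = event.detail
  switch (eventName) {
    // The response lists every affected instance
    case 'RunInstances':
    case 'StartInstances':
      return responseElements.instancesSet.items.map(i => i.instanceId)
    // The request names exactly one instance
    case 'AttachNetworkInterface':
      return [requestParameters.instanceId]
    default:
      throw new Error(`Unsupported Operation: ${eventName}`)
  }
}

console.log(extractInstanceIds(sampleEvents.RunInstances))
console.log(extractInstanceIds(sampleEvents.AttachNetworkInterface))
```

RunInstances and StartInstances read from the response, while AttachNetworkInterface reads from the request, which is why the two groups cannot share a single code path.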

The heavy lift this time around, aside from lots of testing, was adding support for the DeleteTrafficMirrorSession operation. CloudTrail does not decorate the event with any information about the Traffic Mirror Session other than its ID, which is useless once the session has been deleted. That means there’s no easy, stateless way to determine what the source/filter/target for that session was. So at this time, the only solution I’m aware of is to grab all of your EC2 instance IDs and pass them to rx-vpctm-create, which is already set up to silently ignore the error messages generated by trying to create a duplicate mirror session.

NOTE: the method being used here for handling the DeleteTrafficMirrorSession operation is definitely a blunt instrument. If you are not ready to automatically mirror every single EC2 instance in the region where this solution is deployed, you may want to wait for the next post in this series, where I will add an orchestration example using a tag to determine which EC2 instances should be managed for you. Once we add that, then this will be much more tightly scoped for those who are not ready to mirror all the things.

'use strict';

const AWS = require('aws-sdk')
const LAMBDA = new AWS.Lambda({apiVersion:'2015-03-31'})
const EC2 = new AWS.EC2({apiVersion:'2016-11-15'})

// The name/alias of the Lambda function to invoke for creating VPC TM Sessions
const LAMBDA_CREATE = 'rx-vpctm-create'

exports.handler = async (event) =>
{
  try
  {
    // Safety Net to ensure the actual request resulted in a successful response
    if (! event.detail.responseElements) {throw `This does not look like a successful API request`}

    // Populate the [instanceIds] Array based on [eventName]
    let instanceIds = []
    switch (event.detail.eventName)
    {
      // Convert [instancesSet] to an Array of [instanceIds]
      case 'RunInstances':
      case 'StartInstances':
        instanceIds = event.detail.responseElements.instancesSet.items.map(i => (i.instanceId))
        break

      // Simply pluck [instanceId] and go (operation does not support multiples)
      case 'AttachNetworkInterface':
        instanceIds.push(event.detail.requestParameters.instanceId)
        break

      // Get [instanceIds] for all NITRO EC2 Instances in this REGION
      case 'DeleteTrafficMirrorSession':
        instanceIds = await getNitroInstances()
        break

      // Safety Net for unsupported Operations
      default: throw `Unsupported Operation: ${event.detail.eventName}`
    }

    // Create a VPC Traffic Mirror Session for all [instanceIds]
    await Promise.all(
      instanceIds.map(async id =>
        LAMBDA.invoke({
          FunctionName: LAMBDA_CREATE,
          Payload: JSON.stringify({instanceId:id})
        }).promise().then(response =>
        {
          // Send LOG/ERROR to CloudWatch for each [instanceId] individually
          let payload = JSON.parse(response.Payload || '{}')
          if (payload.statusCode === 200)
          {console.log(event.detail.eventName, payload.body)}
          else
          {console.error(event.detail.eventName, payload.body, payload.statusCode, payload.error)}
        })
      )
    )
  }
  catch (err) {console.error(event.detail.eventName, err.message || err)}
}

/* ========================================================================== >>
   WORKER FUNCTIONS
============================================================================= */
async function getNitroInstances ()
{
  let instanceIds = []

  // Get all EC2 Instances as [reservations] in this REGION
  let reservations = [], token = null
  do {
    let data = await EC2.describeInstances({MaxResults:1000, NextToken:token}).promise()
    reservations = reservations.concat(data.Reservations)
    token = data.NextToken
  }
  while (token)

  // Extract the [instanceId] and add to list if [isNitro]
  reservations.forEach(r =>
  {
    r.Instances.forEach(i =>
    {
      if (isNitro(i.InstanceType)) {instanceIds.push(i.InstanceId)}
    })
  })

  return instanceIds
}

/* ========================================================================== >>
   HELPER FUNCTIONS
============================================================================= */

// The [hypervisor] property on the [instance] object is not useful for this, so
// it's necessary to check the instance family manually.
// Reference: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-types.html#ec2-nitro-instances
function isNitro (instanceType)
{
  const nitro = [
    'A1', 'C5', 'C5d', 'C5n', 'G4', 'I3en', 'Inf1', 'M5', 'M5a', 'M5ad', 'M5d',
    'M5dn', 'M5n', 'p3dn.24xlarge', 'R5', 'R5a', 'R5ad', 'R5d', 'R5dn', 'R5n',
    'T3', 'T3a', 'z1d', 'a1.metal', 'c5.metal', 'c5d.metal', 'c5n.metal',
    'i3.metal', 'i3en.metal', 'm5.metal', 'm5d.metal', 'r5.metal', 'r5d.metal',
    'u-6tb1.metal', 'u-9tb1.metal', 'u-12tb1.metal', 'u-18tb1.metal',
    'u-24tb1.metal', 'z1d.metal'
  ]

  return (new RegExp(`^(?:${nitro.join('|')})`, 'i').test(instanceType))
}
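
One thing to keep in mind with the DeleteTrafficMirrorSession branch above: Promise.all starts one invocation per instance at the same moment, which in a busy region can mean a large burst of simultaneous Lambda calls. If that ever becomes a concern, a batching helper along these lines might help. This is only a sketch: inBatches and fakeInvoke are hypothetical names, and fakeInvoke stands in for the LAMBDA.invoke(...).promise() call so the example runs without AWS credentials.

```javascript
// Hypothetical helper: run [worker] over [items] in batches of [size], so that
// at most [size] calls are in flight at any one time.
async function inBatches (items, size, worker) {
  const results = []
  for (let i = 0; i < items.length; i += size) {
    const batch = items.slice(i, i + size)
    // Wait for the whole batch to settle before starting the next one
    results.push(...await Promise.all(batch.map(worker)))
  }
  return results
}

// Stand-in for the real Lambda invocation (assumption, not the actual call)
const fakeInvoke = async id => ({instanceId: id, statusCode: 200})

inBatches(['i-0aaa', 'i-0bbb', 'i-0ccc'], 2, fakeInvoke)
  .then(results => console.log(results.length, 'invocations completed'))
```

Batches trade a little latency for a predictable ceiling on concurrency; the batch size of 2 here is purely illustrative.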

So what’s really happening here …
You will find that the code is heavily commented, so I would ask that you give it a read; you’ll be surprised how simple this all really is. If you’ve been following the series, you probably have a good idea of what’s happening here, as all that’s really been added is the handling of different eventName values for the StartInstances, AttachNetworkInterface, and DeleteTrafficMirrorSession operations. As promised, the DeleteTrafficMirrorSession operation required a new worker function, getNitroInstances(), which requests all EC2 instances in the region, filters them by InstanceType to keep only Nitro instance types, and returns a list of InstanceIds to pass to rx-vpctm-create just like the other operations.
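
If you want to convince yourself how the Nitro filter behaves without touching AWS, something like the following can be run locally. It is a sketch under assumptions: isNitro here is a shortened copy of the function above (same regular expression, only a handful of families), and the reservations array is a hand-written stand-in for the describeInstances response with made-up IDs.

```javascript
// Shortened copy of isNitro() from the listing above: same regular expression,
// but only a handful of families so the example stays readable.
function isNitro (instanceType) {
  const nitro = ['C5', 'M5', 'T3', 'p3dn.24xlarge', 'i3.metal']
  return new RegExp(`^(?:${nitro.join('|')})`, 'i').test(instanceType)
}

// Hand-written stand-in for the [Reservations] array from describeInstances
const reservations = [
  {Instances: [{InstanceId: 'i-0aaa', InstanceType: 'c5.large'}]},
  {Instances: [
    {InstanceId: 'i-0bbb', InstanceType: 't2.micro'},
    {InstanceId: 'i-0ccc', InstanceType: 't3a.nano'},
    {InstanceId: 'i-0ddd', InstanceType: 'i3.large'}
  ]}
]

// Flatten the reservations, keep the Nitro types, and pluck the IDs
const instanceIds = reservations
  .flatMap(r => r.Instances)
  .filter(i => isNitro(i.InstanceType))
  .map(i => i.InstanceId)

console.log(instanceIds) // [ 'i-0aaa', 'i-0ccc' ]
```

Note that the match is a case-insensitive prefix test, so 't3a.nano' passes via the 'T3' entry, while 'i3.large' is rejected because only 'i3.metal' is listed for that family.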

Step 2: Update CloudWatch Rule Event Pattern
In order for the rx-vpctm-manager function to receive the new events, the new operations must be added to the Event Pattern of the CloudWatch Rule created in the first post of this series. Until now the event pattern instructed CloudWatch to trigger the Lambda only on the RunInstances operation. Adding more operations is easy: either type them into the UI, or copy/paste the Event Pattern in JSON format below:

{
  "source": [
    "aws.ec2"
  ],
  "detail-type": [
    "AWS API Call via CloudTrail"
  ],
  "detail": {
    "eventSource": [
      "ec2.amazonaws.com"
    ],
    "eventName": [
      "RunInstances",
      "StartInstances",
      "AttachNetworkInterface",
      "DeleteTrafficMirrorSession"
    ]
  }
}
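
For a rough local sanity check of the pattern above before pasting it into the console, a simplified matcher like this one can be handy. It is an approximation, not the service's actual matching logic: matchesPattern is a hypothetical helper that only handles exact-value lists and nested objects, and the sample event is trimmed down to the fields the pattern inspects.

```javascript
const pattern = {
  'source': ['aws.ec2'],
  'detail-type': ['AWS API Call via CloudTrail'],
  'detail': {
    'eventSource': ['ec2.amazonaws.com'],
    'eventName': [
      'RunInstances', 'StartInstances',
      'AttachNetworkInterface', 'DeleteTrafficMirrorSession'
    ]
  }
}

// Simplified matcher: every key in [pattern] must exist in [event]; arrays
// list the accepted values and nested objects are matched recursively.
function matchesPattern (pattern, event) {
  return Object.entries(pattern).every(([key, expected]) => {
    const actual = event == null ? undefined : event[key]
    if (Array.isArray(expected)) return expected.includes(actual)
    return typeof actual === 'object' && actual !== null &&
      matchesPattern(expected, actual)
  })
}

// Trimmed-down sample event (assumption: only the fields the pattern checks)
const makeEvent = eventName => ({
  'source': 'aws.ec2',
  'detail-type': 'AWS API Call via CloudTrail',
  'detail': {eventSource: 'ec2.amazonaws.com', eventName}
})

console.log(matchesPattern(pattern, makeEvent('StartInstances')))     // true
console.log(matchesPattern(pattern, makeEvent('TerminateInstances'))) // false
```

Any operation not listed under eventName never reaches the Lambda, which is why the default branch in the manager function should only fire if the rule and the code drift apart.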

Conclusion, for now …

That’s it! Now you have full coverage. As mentioned above, there is still one more step to take if you need to have the ability to scope this automation by adding an orchestration element with tags. This is what will be covered in the fourth and final post of the series. However, if that’s not your style, the solution presented up to this point is all you need. Unless I’m missing something? Can you see any coverage gaps with this solution?
