Cloud Security Case Study: Analysis and Learnings from CommuteAir’s Leak of the No-Fly List

Nick Doyle
Feb 16, 2023

In January 2023, maia arson crimew stumbled across a public Jenkins server that used default login creds and stored static AWS Access Keys. Those keys gave access to CommuteAir’s AWS infrastructure, including DynamoDB databases and, most juicily, S3 buckets holding (among other things) an old copy of the TSC No-Fly List.

This was interesting to me for a few reasons:

  • The domain: aviation is cool
  • CommuteAir uses AWS, so the learnings could be applied in my work
  • There were also a number of process & policy failures which, although they might sound boring, I find interesting to think about improving

maia’s original writeup is here: https://maia.crimew.gay/posts/how-to-hack-an-airline/

For CommuteAir the outcome could have been far worse; in fact what happened is arguably the best case for them. Options available to maia which she did not take:

  • not disclose
  • persist
  • move laterally
  • exfiltrate more than just an out-of-date no-fly list
  • sell the access (i.e. act as an Initial Access Broker, aka IAB)

I’ll skip my rant on why I think nobody should be running Jenkins (aka “the WordPress of CICD”) let alone exposed publicly; dead horse — let’s move on to the more interesting stuff.

OSINT Visibility — Know when you’re on the Radar

The reason this public Jenkins host came to maia’s attention was that it was indexed on ZoomEye (the Chinese Shodan), available for all to see, yet CommuteAir had no knowledge of this. How could this have been detected? 3 approaches I’d consider:

On the DIY front, an architecture I could see working would be a Lambda regularly querying centralized AWS Config for ENI public IPs. This Lambda could then either:

Once the detection is in place you’ll need to actually make it useful by integrating it into your SecOps processes, whether that be auto-creation of a Jira ticket, a notification in Slack or Teams, or plain old email. Shodan alerts can do this by hitting a webhook.
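As a rough illustration of the DIY approach above, here is a minimal Python sketch of the Lambda's moving parts: query a Config aggregator for ENIs, pull out any public IPs, and build a Slack-style webhook payload. The aggregator name, the webhook URL, the function names and the exact query expression are all my assumptions for illustration; check the result shape against your own Config output before relying on it.

```python
import json
import urllib.request

# Assumed advanced-query expression: list all ENIs and let Python filter for public IPs.
ENI_QUERY = (
    "SELECT resourceId, accountId, awsRegion, configuration.association.publicIp "
    "WHERE resourceType = 'AWS::EC2::NetworkInterface'"
)


def fetch_eni_rows(aggregator_name):
    """Page through the Config aggregator (aggregator_name is a hypothetical name)."""
    import boto3  # imported lazily so the helpers below work without AWS access

    client = boto3.client("config")
    rows, token = [], None
    while True:
        kwargs = {"Expression": ENI_QUERY,
                  "ConfigurationAggregatorName": aggregator_name}
        if token:
            kwargs["NextToken"] = token
        page = client.select_aggregate_resource_config(**kwargs)
        rows.extend(page["Results"])
        token = page.get("NextToken")
        if not token:
            return rows


def parse_public_ips(results):
    """Each result is a JSON string; return {public_ip: eni_id} for ENIs that have one."""
    found = {}
    for raw in results:
        row = json.loads(raw)
        ip = row.get("configuration", {}).get("association", {}).get("publicIp")
        if ip:
            found[ip] = row["resourceId"]
    return found


def build_alert(ip_to_eni):
    """Payload in the shape Slack incoming webhooks accept: {"text": ...}."""
    lines = [f"{ip} ({eni})" for ip, eni in sorted(ip_to_eni.items())]
    return {"text": "Public IPs detected:\n" + "\n".join(lines)}


def post_alert(webhook_url, payload):
    """POST the payload as JSON to the (assumed) webhook URL."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)
```

In practice you'd diff the result against the previous run (or an allowlist) so that only newly exposed IPs generate noise.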

Bonus Round: Auto-Remediation

Once detection is in place, auto-remediation would be a potential next step: query AWS Config to determine the ENI (e.g. as detailed in this AWS blog post) and automatically apply a restrictive security group to the affected ENI.
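A hedged sketch of that remediation step might look like the following. The quarantine security group ID, the allowlist and the function names are hypothetical; the only AWS call used is EC2's modify_network_interface_attribute, which replaces the ENI's security groups with the list you pass.

```python
def plan_quarantine(exposed_ip, ip_to_eni, quarantine_sg_id, allowlist=frozenset()):
    """Return kwargs for ec2.modify_network_interface_attribute, or None to skip.

    ip_to_eni is a {public_ip: eni_id} mapping, e.g. built from AWS Config results.
    """
    if exposed_ip in allowlist:
        return None  # intentionally public, leave it alone
    eni_id = ip_to_eni.get(exposed_ip)
    if eni_id is None:
        return None  # not one of ours (or Config hasn't caught up yet)
    return {"NetworkInterfaceId": eni_id, "Groups": [quarantine_sg_id]}


def apply_quarantine(plan, region_name=None):
    """Swap the ENI's security groups for the restrictive one."""
    import boto3  # lazy import keeps plan_quarantine testable offline

    ec2 = boto3.client("ec2", region_name=region_name)
    ec2.modify_network_interface_attribute(**plan)
```

Splitting "decide" from "apply" means you can run the planner in report-only mode for a while before letting it touch anything. Note the quarantine security group has to live in the same VPC as the ENI.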

Secrets in the Source

Apologies for omitting the Dad Joke Disclaimer

“Cryptographic failures” is in position #2 of the OWASP 2021 Top 10 list.
This includes storing static credentials with weak (or... no) encryption.

Organizational leadership, including a low appetite for poor security practices (and correspondingly prioritizing security), goes a long way to improving this. Developer security education also helps: primarily for developer growth, but it also serves as evidence that the organization prioritizes application security, is willing to invest in it, and puts its money where its mouth is.

How about technical measures?

All major hosted SCM + CI/CD platforms (GitLab, GitHub, Azure DevOps) now provide Secret Scanning out of the box (for their paid offerings, which can admittedly get pricey).
If you’re not using them, there are plenty of tools to 80/20 the DIY, e.g. git-secrets.
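To give a feel for what the DIY end of this looks like, here is a minimal Python sketch that flags strings shaped like AWS access key IDs. It is not how git-secrets itself is implemented; the regex covers only the common AKIA/ASIA prefixes, and the function name is mine.

```python
import re

# Access key IDs are 20 characters: a 4-letter prefix plus 16 uppercase alphanumerics.
# AKIA = long-lived IAM user keys, ASIA = temporary STS keys.
AWS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[A-Z0-9]{16}\b")


def find_aws_key_ids(text):
    """Return (line_number, key_id) pairs for anything that looks like a key ID."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in AWS_KEY_ID.finditer(line):
            hits.append((number, match.group(0)))
    return hits
```

Wired into a pre-commit hook or a pipeline step over the staged diff, even something this crude catches the most embarrassing class of mistake. Real tools add patterns for secret keys, entropy checks and allowlists.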

Bonus Round: There is no Secret

Wouldn’t it be great if your CI/CD jobs didn’t even /need/ static credentials in order to do their work? GitLab, GitHub and Buildkite (unfortunately not ADO) CI/CD runners support OIDC, which means you can:

  1. Create an AWS OIDC Provider
  2. Create an AWS IAM Role trusting them via this Provider
    (so long as the job is for your repo!), and
  3. Your CI/CD jobs can then automatically retrieve temporary credentials as-needed to assume the Role and do their work

It’s very easy to set up and provides excellent security.
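Step 2 is where the "so long as the job is for your repo" part lives. As a sketch, assuming GitHub Actions as the CI platform, the Role's trust policy could be generated like this (the account ID, org and repo are placeholders, and the function name is mine):

```python
import json

GITHUB_OIDC_HOST = "token.actions.githubusercontent.com"


def github_oidc_trust_policy(account_id, org, repo):
    """Trust policy letting only jobs from org/repo assume the Role via OIDC."""
    provider_arn = f"arn:aws:iam::{account_id}:oidc-provider/{GITHUB_OIDC_HOST}"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Federated": provider_arn},
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                # The token must have been issued for AWS STS...
                "StringEquals": {f"{GITHUB_OIDC_HOST}:aud": "sts.amazonaws.com"},
                # ...and must come from a job in this specific repo.
                "StringLike": {f"{GITHUB_OIDC_HOST}:sub": f"repo:{org}/{repo}:*"},
            },
        }],
    }


if __name__ == "__main__":
    print(json.dumps(github_oidc_trust_policy("123456789012", "example-org", "example-repo"), indent=2))
```

The sub condition is the important bit: without it, any repository on the platform could assume your Role. Tighten the trailing wildcard to a branch or environment if you want to be stricter.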

Bonus Round: A Canary in the Pipeline

Even if your repos contain no credentials, it would have been very useful for CommuteAir to get a notification that someone was snooping and trying out the AWS Access Keys found therein. An excellent lightweight tool for this is Thinkst Canarytokens. It literally does take under 5 minutes to set up. You create a token, store it somewhere people shouldn’t be snooping, and get a notification when it’s used.

AWS Security

What AWS features could CommuteAir use, in order to prevent this happening again?

Security in AWS, like AWS in general, can be a bit of a disjointed conceptual mess, and designing and implementing useful controls requires experience and planning.

Prevention with SCP — Guard Rails for Governance at Speed

Guard Rails help you move quickly, while avoiding total disaster

By far the most effective measure — if feasible — is to prevent even using IAM Users and Access Keys at all, with the following statement in an org-wide SCP:

  - Sid: DenyCreateIamResources
    Effect: Deny
    # A bare * is a YAML alias marker, so it must be quoted
    Resource: "*"
    Action:
      - iam:CreateUser
      - iam:DeleteUser
      - iam:CreateAccessKey
      - iam:DeleteAccessKey

To implement this, though, the org needs clear policy, guidance and education on exactly why this is forbidden, as well as the Right ways to do things. Those require much more effort than a few lines of YAML.
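SCPs are ultimately submitted to AWS Organizations as JSON, so if you're not templating the YAML through CloudFormation or similar, the statement above could be wrapped up roughly as follows. This is a sketch: the policy name and description are placeholders, and attaching the policy to a root or OU is a separate step.

```python
import json


def deny_iam_users_scp():
    """The statement from the article, wrapped in a complete policy document."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyCreateIamResources",
            "Effect": "Deny",
            "Action": [
                "iam:CreateUser",
                "iam:DeleteUser",
                "iam:CreateAccessKey",
                "iam:DeleteAccessKey",
            ],
            "Resource": "*",
        }],
    }


def create_scp(name="deny-iam-users"):
    """Create (but not attach) the SCP; run from the Organizations management account."""
    import boto3  # lazy import so the document builder works without AWS access

    org = boto3.client("organizations")
    return org.create_policy(
        Name=name,  # hypothetical name
        Description="Deny IAM User and Access Key management",
        Type="SERVICE_CONTROL_POLICY",
        Content=json.dumps(deny_iam_users_scp()),
    )
```

You'd then attach the returned policy ID to the org root or an OU. Test it on a sandbox OU first; SCPs are very good at breaking things you forgot existed.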

Security Hub

Best practice is to centralize your security controls using AWS Security Hub, designed to be AWS’ one central place to manage your cloud security.

For configuration security, unfortunately, Security Hub rulesets don’t have out-of-the-box detection of IAM User or Access Key creation. The best you can do is enable “CIS AWS Foundations Benchmark v1.4.0”, which includes the check “1.14 Ensure access keys are rotated every 90 days or less”, to at least detect such keys after 90 days.
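The logic behind that 90-day check is simple enough to sketch yourself. The function below is my own illustration, not Security Hub's implementation; it assumes input shaped like the AccessKeyMetadata entries that IAM's list_access_keys returns (AccessKeyId, Status, and a timezone-aware CreateDate).

```python
from datetime import datetime, timedelta, timezone


def stale_access_keys(key_metadata, now=None, max_age_days=90):
    """Return the IDs of active access keys created more than max_age_days ago."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        key["AccessKeyId"]
        for key in key_metadata
        if key["Status"] == "Active" and key["CreateDate"] < cutoff
    ]
```

Setting max_age_days to something much smaller is a crude way to get closer to "detect Access Keys existing at all", which is what you actually want here.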

GuardDuty

For runtime detection, the relevant GuardDuty controls that would generate findings in this scenario are:

The now-deprecated (nfi why) GuardDuty finding UnauthorizedAccess:IAMUser/UnusualASNCaller would also have fired when the principal first authenticated from somewhere not-the-Jenkins-host, assuming the attacker didn’t think to perform such requests from the Jenkins host itself (possibly for best OpSec points, from an ephemeral container running on the Jenkins host?)

HOWEVER. GuardDuty findings take FOREVER.

So long, in fact, that by the time you get alerts, an intruder possessing any more dynamism than rail infrastructure will have long exfiltrated your no-fly list, your fly list, and your pet’s “places I would like to fly” list.

In a previous team we joked that GuardDuty is most effective as a summary postmortem of what the attacker has accomplished as they casually owned your cloud infrastructure. An automated preparation of meeting notes for the PIR. A CliffsNotes for the CTO on “How We Got Pwned”. In fact, a very considerate automation to put in place might be a Lambda to congratulatorily email these to your attacker. Perhaps include a little video montage of quirky AWS Console clicking with a generic upbeat soundtrack, like relive.cc for Strava, but for cloud infrastructure pwnage.

OK /s - I think this fact doesn’t get enough attention. I think the reason is that often, the people architecting and deploying AWS environments are not the ops/security team. They’re architects or platform engineers whose vision (and incentive) is to do the initial design and build, but likely have never worked in — nor have incentives aligned to — an operational team. For them it’s good enough to say “yep switched on GuardDuty, security in place, tick”. Looks good on paper, ticks compliance, aligned to the Security Pillar of the AWS Well-Architected Framework, sounds authoritative in presentations. So everyone does. And in practice it’s almost useless.

AWS WAF

AWS WAF is in my opinion OK. Not great, not terrible. It’s OK. I mean it doesn’t actively harm your security. As far as I know.

It’s unlikely WAF would have blocked the principal in this scenario since the Jenkins box was just sitting on the public internet with default credentials. It’s also quite possible given CommuteAir’s demonstrated security posture, that it was a public pet with no ALB in front of it precluding use of WAF. But let’s assume not, for the sake of exploring this option.

Possibly, if initial access attempts generated multiple 401s, then the “VolumetricIpFailedLoginResponseHigh” rule in the AWS-managed rule group for Account Takeover Prevention may have triggered and blocked the user.

Regardless, WAF is always worth putting in front of any public web interface.

Pluses

  • Easy to set up & configure OOTB rulesets
  • Easy to apply org-wide across all resources (mainly ALB & APIGW), with centralized config & monitoring

Minuses

  • Terrible management of WCUs, the capacity units consumed by each rule you apply.
    TBH you normally “just guess”; to actually check, pull your rules out into a JSON file and run the super-friendly CLI command:
    aws wafv2 check-capacity --rules fileb://FMManagedWebACLV2-core-waf-wallet-ap-southeast-2-1665024682490.json --scope REGIONAL
    Thanks AWS! I hate it.
  • Terrible documentation; rule descriptions incredibly opaque
    e.g. “Inspects for multiple requests from the same client session that use stolen credentials.” — would this have prevented this situation? Who knows. Maybe. I’d probably add that one (shrug)
  • Terrible documentation again especially if you want to configure multiple rulesets and exemptions
  • Terrible intel on false positives
    Either use the web console to view a “sample of blocked requests” (that’s the actual name, AWS-speak perchance for some DevOpsian Dilettante’s Digital Degustation, un petit-sample of what might lie in the block logs to whet your appetite!)
    Or dump your ALB Access Logs, create an Athena table like so & query where action = WAF. Just like the days of logging in to hardware firewalls to find blocked requests. Only more complex and annoying. And for those who used such device interfaces that’s saying something.

Anyway enough criticizing WAF. You could also actually do something cool with it for this situation: Auto-blocking of dodgy actors with GuardDuty, WAF + Lambda
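A rough Python sketch of that idea: a Lambda triggered by GuardDuty findings (via EventBridge) pulls the remote IP out of the finding and appends it to a WAF IP set that a block rule references. The IP set name and ID are placeholders, the helper names are mine, and I'm assuming the finding carries the caller's address under remoteIpDetails, which is where API-call and network-connection findings put it.

```python
import ipaddress


def remote_ip_from_finding(event):
    """Extract the remote IPv4 address from a GuardDuty finding event, if present."""
    action = event.get("detail", {}).get("service", {}).get("action", {})
    for key in ("awsApiCallAction", "networkConnectionAction"):
        ip = action.get(key, {}).get("remoteIpDetails", {}).get("ipAddressV4")
        if ip:
            return str(ipaddress.ip_address(ip))  # raises ValueError on junk input
    return None


def add_to_block_list(addresses, ip):
    """Return the IP set's address list with ip added as a /32 (no duplicates)."""
    cidr = f"{ip}/32"
    return addresses if cidr in addresses else addresses + [cidr]


def block_ip(ip, ip_set_name, ip_set_id, scope="REGIONAL"):
    """Read-modify-write the WAF IP set; the LockToken guards against concurrent edits."""
    import boto3  # lazy import so the helpers above are testable offline

    waf = boto3.client("wafv2")
    current = waf.get_ip_set(Name=ip_set_name, Scope=scope, Id=ip_set_id)
    waf.update_ip_set(
        Name=ip_set_name,
        Scope=scope,
        Id=ip_set_id,
        Addresses=add_to_block_list(current["IPSet"]["Addresses"], ip),
        LockToken=current["LockToken"],
    )
```

Given how long GuardDuty findings take to arrive, treat this as cleanup rather than prevention, and add some expiry for the blocked addresses so the IP set doesn't grow forever.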

Unhelpful AWS Security Services

(in this situation)

  • Amazon Macie
    While Macie would have alerted on the presence of PII in the no-fly list stored on S3, this was by-design and likely would have been squelched
  • IAM Credential Report
    While useful to get a report of IAM creds, including Access Keys, the use of such was by-design. Besides, no busy ops person is going to log in, manually run & check such. Decent idea, with half-assed execution. Again.
  • AWS Audit Manager
    Really only useful if your primary concern is generating a report with little effort, rather than actual security. Target audience: Security consultants, interns

Summary

  • Security controls in AWS aren’t actually that helpful
  • Preventative controls, particularly in CICD pipelines, are far more effective
  • But the most effective steps happen at the policy and education levels; although preventative controls are very effective, you’re neither infallible nor omnipotent. This is why having your people security-educated is the most effective preventative measure you can have.

For this particular incident, if I were to apply this as CommuteAir’s CTO, my to-do lists might look something like this:

Today

  • Mobilize cleanup crew
  • Engage professional Incident Response, SecOps work with & learn
  • Lock down Jenkins hosts
  • Cross-reference public IPs with Shodan and ZoomEye, triage others
    (consider automating)
  • Secret scan repos — manual, rotate, plan to remove where possible
  • Query logs (CloudTrail, Jenkins app & system) for evidence of all actions by the actor. Ensure actor is completely ejected from the environment.

This Week

  • PIR
  • Work with SLT and align on making Security an organizational priority, one that will require investment. Setting this at the top will give teams clear direction, and the time and means to make this reality.
  • Comms — Articulate to the wider business what has happened, what has been done, what the vision for the next 3 months is — and what steps & support we will be taking to get there. Get everyone on board.
  • App Teams — Security training
    Personally I’ve used and highly rate Secure Code Warrior
    Consider a tournament / gameday, it’s fun
  • IF CICD modernization is feasible — Mobilize — CICD Modernization Crew, consider hosted solutions supporting OIDC
    Mandate secret and security scanning in pipelines
    IF modernization is not feasible (unlikely with accurate articulation), ensure risk is articulated to business & accepted, lock down & uplift Jenkins hosts
  • Mobilize — SecOps Cloud Security Uplift Crew
    Cleanup static creds (work with App Teams)
    Check for persistence: Amazon Detective, Lambdas, IAM credential reports
    Security Hub, GuardDuty, integrate to Slack & define basic ops process for these
    Security Hub standards enable, make Cloud Security scores a team KPI, setup process (and allocate time) to regularly review & uplift

Thanks for reading, hope you enjoyed, feedback welcome. Until next time!

Nick Doyle

Cloud-Security-Agile, in Melbourne Australia, experience includes writing profanity-laced Perl, surprise Migrations, furious DB Admin and Motorcycle instructing