Securonix Threat Research: Uber Hack: Software Code Repository/VCS Leaked Credential Usage Detection

By Oleg Kolesnikov, Securonix Threat Research Team


Just about a week ago we learned of a new cyber attack, this time involving Uber, the leading ride-sharing app and company. Uber disclosed that it paid hackers $100,000 to delete sensitive data stolen from Uber data stores. The hackers gained unauthorized access to a private GitHub repository used by Uber software engineers, and then used credentials found there to access Uber data storage instances on AWS.

Here is a work-in-progress summary of what we currently know and our recommendations on some possible mitigations and Securonix predictive indicators that can be used to detect this and potentially future variants of this attack.

Figure 1: Uber Github Repository Source Code Credentials Examples

Summary – TL;DR

According to the publicly available details, there was nothing inherently “Über” about the Uber hack reported last week [1,2]. The hack was relatively trivial: Amazon AWS S3 credentials that Uber had left in source code were exposed on GitHub and were then used by attackers to steal sensitive data from Uber’s Amazon AWS S3 account. Still, the hack had a significant impact. Here is what you need to know:

Impact: Personally identifiable information (PII) of 50 million Uber customers and 7 million Uber drivers was exposed in this attack in October 2016. The data exposed included names, email addresses, telephone numbers, and ~600k driver’s license numbers. Subsequently, Uber secretly paid the perpetrators $100,000 to keep quiet and to erase the stolen user data. The hack was publicly reported by Uber on November 21, 2017.

Infiltration vector(s): According to Uber, the primary attack vector involved Amazon AWS S3 credentials checked into an Uber GitHub repository, which were exposed to attackers and subsequently used to access an Amazon AWS S3 storage account containing sensitive data. There are some unconfirmed reports that the repository used was a private repository.
In May 2014, Uber experienced a similar security incident, which also involved an adversary accessing an Amazon AWS S3 datastore using an access key posted on GitHub and which impacted sensitive data of more than 100,000 Uber drivers. This past August, Uber agreed to 20 years of privacy audits to settle the FTC data mishandling probe regarding the 2014 incident.

Other/GDPR: While the amount Uber may be fined as a result of this breach in the US is likely to be quite limited, it is worth mentioning that the EU’s General Data Protection Regulation (GDPR) comes into force in May 2018, and its impact on companies that process personal data will be substantial. Under the regulation, the estimated maximum fine Uber would have been facing for this breach in the EU is roughly $260M (4% of its $6.5 billion revenue for 2016), with a mandatory 72-hour disclosure window.

Figure 2: Examples of AWS/Custom Credentials Identified in Github repo

Uber Attack/Data Breach – Observations/Artifacts

The infamous MITRE CWE-798, or use of hard-coded credentials, that is often involved in source code repository/version control system (VCS) credential leaks has been among the SANS Top 25 Most Dangerous Software Errors for years. In spite of this, we keep seeing this issue impacting organizations, most recently Uber, and resulting in high-profile breaches exposing critical customer data.

As shown in Figures 1 and 2, it is quite easy for attackers to obtain leaked credentials from the GitHub repositories of different organizations, including Uber, and there are a number of automated tools for this, including the truffleHog tool shown in the figures above [5]. This issue has impacted a number of other individuals and organizations to date besides Uber [7,3].
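Besides regex matching, tools of this kind can also flag strings that simply look random. The following is a minimal sketch of such an entropy heuristic, not truffleHog's actual implementation: the function names, the 20-character token cutoff, and the 4.5 bits-per-character threshold are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Candidate tokens: runs of base64-style characters, 20 or more long.
# The cutoff of 20 is an assumption for this sketch.
TOKEN_RE = re.compile(r"[A-Za-z0-9+/=]{20,}")


def shannon_entropy(text):
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def high_entropy_tokens(line, threshold=4.5):
    """Return the candidate tokens in a line whose entropy exceeds the threshold."""
    return [t for t in TOKEN_RE.findall(line) if shannon_entropy(t) > threshold]


if __name__ == "__main__":
    # The value below is the placeholder secret key used in AWS documentation.
    line = 'aws_secret_access_key = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"'
    print(high_entropy_tokens(line))
```

Entropy checks catch secrets that follow no known format, at the cost of more false positives (hashes, compressed data), which is why they are usually combined with regex checks rather than used alone.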

Accessing the corresponding Amazon S3 buckets containing the sensitive information, as the Uber attackers did, is also quite trivial and can be done either using the information leaked through GitHub source code or using simple AWS S3 bucket brute-force tools such as lazys3.

Figure 3: Scanning Other Github Repositories For Credentials – Lyft

The bottom line is that high-profile security breaches through AWS credentials posted by developers in code repositories such as GitHub, followed by access to unsecured AWS S3 buckets, continue to be a troubling trend.

Prevention – Securonix Recommendations

Here are some of the Securonix recommendations to help prevent, detect, and mitigate such breaches within your organization:

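  • Scan your organization’s repositories for leaked credentials using a tool such as truffleHog [5], e.g.:
  $ curl -s<YOUR_ORGANIZATION>/repos\?per_page\=200 | grep clone_url | awk -F '"' '{print $4}' | xargs -n 1 -P 4 trufflehog --regex --entropy=False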
  • Check for regex patterns associated with commonly leaked credentials in your repositories. For instance:
   "Internal subdomain": re.compile('([a-z0-9]+[.]*supersecretinternal[.]com)'),
   "Slack Token": re.compile('(xox[p|b|o|a]-[0-9]{12}-[0-9]{12}-[0-9]{12}-[a-z0-9]{32})'),
   "RSA private key": re.compile('-----BEGIN RSA PRIVATE KEY-----'),
   "Facebook Oauth": re.compile('[f|F][a|A][c|C][e|E][b|B][o|O][o|O][k|K].*[\'|"][0-9a-f]{32}[\'|"]'),
   "Twitter Oauth": re.compile('[t|T][w|W][i|I][t|T][t|T][e|E][r|R].*[\'|"][0-9a-zA-Z]{35,44}[\'|"]'),
   "Google Oauth": re.compile('("client_secret":"[a-zA-Z0-9-_]{24}")'),
   "AWS API Key": re.compile('AKIA[0-9A-Z]{16}'),#[a|A][w|W][s|S].*AKIA[0-9A-Z]{16}'),
   "Heroku API Key": re.compile('[h|H][e|E][r|R][o|O][k|K][u|U].*[0-9A-F]{8}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{12}'),
   "Generic Secret": re.compile('[s|S][e|E][c|C][r|R][e|E][t|T].*[\'|"][0-9a-zA-Z]{32,45}[\'|"]')
  • Follow the GitHub recommendations regarding removal of sensitive data from repositories, e.g. use git filter-branch or BFG Repo-Cleaner to remove unwanted/sensitive data, including credentials, from your Git history [4].
  • Adhere to your Cloud provider’s IAM best practices. In case of Amazon IAM:
    • Grant least privilege
    • Rotate security credentials regularly
    • Create and use IAM users instead of your root account
    • Enable multi-factor authentication (MFA) for privileged users
    • Restrict privileged access further with policy conditions
    • etc. (see [6] for more details)
  • Proactively search your source code repositories for sensitive information that could potentially be exposed to attackers using e.g. github dorks (see for more details/automation):
  • Assume that some credentials are going to be leaked in your source code sooner or later, and perform behavior analysis of your AWS log/data sources to identify anomalies that may be associated with leaked credentials (see below).
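As a minimal sketch of the regex check recommended in the list above, the following Python snippet applies three of the listed patterns to text line by line. The names `PATTERNS` and `scan_text` are hypothetical, and this is an illustration of the idea rather than how truffleHog itself is implemented.

```python
import re

# Three of the patterns from the list above, written with plain ASCII quotes.
PATTERNS = {
    "RSA private key": re.compile('-----BEGIN RSA PRIVATE KEY-----'),
    "AWS API Key": re.compile('AKIA[0-9A-Z]{16}'),
    "Generic Secret": re.compile(
        '[s|S][e|E][c|C][r|R][e|E][t|T].*[\'|"][0-9a-zA-Z]{32,45}[\'|"]'
    ),
}


def scan_text(text):
    """Return (line_number, pattern_name, matched_text) for every hit."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                findings.append((line_number, name, match.group(0)))
    return findings


if __name__ == "__main__":
    # AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
    sample = 'region = "us-east-1"\naws_access_key_id = "AKIAIOSFODNN7EXAMPLE"\n'
    for finding in scan_text(sample):
        print(finding)
```

A check like this could run as a pre-commit hook or CI step so that a match blocks the change before it reaches the shared repository. Scanning only the current files would miss secrets that remain in earlier commits.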

Detection – Securonix Behavior Analytics/Security Analytics

1.1 Recommended Data Sources

To cover some of the key vectors in the GitHub-to-AWS attack path leveraged in the Uber breach, recommended data sources to consider include:

#1 – CLO – Cloud Services/Alert Logs, e.g. Amazon AWS CloudTrail/CloudWatch/Macie, EC2, IAM, S3 Access etc.;
#2 – WEB – Web Server Logs (Apache Tomcat, Webserver/IIS etc.);
#3 – SSH – Sshd Access Logs, where appropriate, e.g. if used for git/repo activity;
#4 – OCU – Other/custom logs related to development activity/check-ins/check-outs/pulls etc.

1.2 Some Examples of Relevant High-Level Behavior Analytics/Predictive Indicators

Here are some high-level examples of relevant Securonix behavior analytics/predictive indicators that can increase the chances of early detection of malicious activity associated with breaches similar to the one that impacted Uber:

#1 – Suspicious Cloud Activity (AWS S3 and EC2)

Suspicious Cloud Activity Peak AWS S3 Increase For Bucket Analytic
Suspicious Cloud Activity Rare AWS S3 Source Address For Bucket Analytic
Suspicious Cloud Activity Rare AWS S3 Operation For Bucket Analytic
Suspicious Cloud Activity Rare AWS S3 Access Useragent For Bucket Analytic
Suspicious Cloud Activity Diurnal AWS S3 Access For Bucket Analytic
Suspicious Cloud Activity Peak AWS EC2 StartInstances Increase For AccessKeyId Analytic
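To illustrate the idea behind an indicator such as "Rare AWS S3 Source Address For Bucket", here is a heavily simplified sketch. It is not the Securonix implementation. It assumes events shaped like CloudTrail S3 records, using the `sourceIPAddress` and `requestParameters.bucketName` fields, and treats any address never seen before for a bucket as rare. The function names are hypothetical, and the addresses come from the reserved documentation ranges.

```python
from collections import defaultdict


def build_baseline(events):
    """Map each bucket name to the set of source addresses seen in history."""
    baseline = defaultdict(set)
    for event in events:
        bucket = event["requestParameters"]["bucketName"]
        baseline[bucket].add(event["sourceIPAddress"])
    return baseline


def rare_source_events(baseline, new_events):
    """Return new events whose source address was never seen for that bucket."""
    flagged = []
    for event in new_events:
        bucket = event["requestParameters"]["bucketName"]
        if event["sourceIPAddress"] not in baseline.get(bucket, set()):
            flagged.append(event)
    return flagged


if __name__ == "__main__":
    history = [
        {"eventName": "GetObject", "sourceIPAddress": "192.0.2.10",
         "requestParameters": {"bucketName": "example-backups"}},
        {"eventName": "PutObject", "sourceIPAddress": "192.0.2.11",
         "requestParameters": {"bucketName": "example-backups"}},
    ]
    today = [
        {"eventName": "GetObject", "sourceIPAddress": "192.0.2.10",
         "requestParameters": {"bucketName": "example-backups"}},
        {"eventName": "GetObject", "sourceIPAddress": "203.0.113.7",
         "requestParameters": {"bucketName": "example-backups"}},
    ]
    for event in rare_source_events(build_baseline(history), today):
        print(event["sourceIPAddress"], event["requestParameters"]["bucketName"])
```

A production analytic would more likely score rarity over a sliding time window and combine it with other signals (operation, user agent, time of day) instead of alerting on every first-seen address.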

#2 – Suspicious Web Server Activity (Apache Software Version Control System (AS/VCS))

Suspicious Apache Software Version Control System Activity Peak Checkout/Clone Increase For User Analytic
Suspicious Apache Software Version Control System Activity Diurnal Checkout/Clone For User Analytic
Suspicious Apache Software Version Control System Activity Rare Checkout/Clone Source IP Address For User Analytic
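The "Peak Checkout/Clone Increase For User" idea can be sketched in a similar way. The snippet below is an assumption-laden illustration, not the Securonix analytic: it compares each user's clone count for today against the mean and standard deviation of that user's own daily history. The function names, the threshold of 3 standard deviations, and the minimum spread of 1.0 are hypothetical choices.

```python
from statistics import mean, pstdev


def is_peak(history, today, threshold=3.0, min_sigma=1.0):
    """Flag today's count if it exceeds mean + threshold * stddev of history."""
    if len(history) < 2:
        return False  # not enough history to form a baseline
    baseline = mean(history)
    # Floor the spread so a perfectly flat history does not flag tiny changes.
    spread = max(pstdev(history), min_sigma)
    return today > baseline + threshold * spread


def peak_users(daily_history, today_counts, threshold=3.0):
    """Return the sorted users whose count today is a peak versus their own past."""
    return sorted(
        user
        for user, count in today_counts.items()
        if is_peak(daily_history.get(user, []), count, threshold)
    )


if __name__ == "__main__":
    daily_history = {
        "alice": [3, 4, 2, 5, 3, 4, 3],
        "bob": [10, 12, 9, 11, 10, 12, 11],
    }
    today_counts = {"alice": 60, "bob": 12}
    print(peak_users(daily_history, today_counts))
```

Users with no history are never flagged by this sketch. In practice a first-time bulk clone is itself worth reviewing, so new accounts would need separate handling.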

It is important to keep in mind that there are many other attack vectors and log/data sources that need to be considered depending on the software version control system and the cloud infrastructure used by your organization, e.g. Microsoft Azure, Slack tokens, Google OAuth etc.


[1] Jeremy Kahn. Uber Hack Shows Vulnerability of Software Code-Sharing Services. November 22, 2017. Last accessed: 11-28-2017.

[2] Dara Khosrowshahi. Uber 2016 Data Security Incident. November 21, 2017. Last accessed: 11-28-2017.

[3] Darren Pauli. Dev put AWS keys on Github. Then BAD THINGS happened. January 6, 2015. Last accessed: 11-28-2017.

[4] GitHub. Removing Sensitive Data From a Repository. November 28, 2017. Last accessed: 11-28-2017.

[5] TruffleHog. November 28, 2017. Last accessed: 11-28-2017.

[6] Craig Liebendorfer. Amazon IAM Best Practices. January 6, 2016. Last accessed: 11-28-2017.

[7] Rich. My $500 Cloud Security Screwup. January 7, 2014. Last accessed: 11-28-2017.