Employees in Microsoft’s artificial intelligence research division accidentally made several dozen terabytes of sensitive data, including passwords and private keys, publicly available. The data was exposed when a training dataset was published on GitHub as part of an open source project.
Visitors to the public GitHub repository were offered open source image recognition code and AI models, along with a URL for downloading the models from Azure storage. Researchers at cloud security startup Wiz found that the URL actually granted full access to the storage account: 38 TB of sensitive data, including the personal PC backups of two Microsoft employees, passwords for Microsoft services, secret access keys, and more than 30,000 internal Microsoft Teams messages from several hundred Microsoft employees.
The URL, available since 2020, contained a shared access signature (SAS) token that not only allowed data to be read from Azure storage but also granted permission to add and edit it – anyone could have planted malicious content there. Wiz reported its discovery to Microsoft on June 22nd, and the over-privileged SAS token was revoked on June 24th. The company completed its investigation into the incident on August 16.
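To illustrate the underlying mechanism: a shared access signature is a URL query string whose cryptographic signature binds together a resource scope, a permission set, and an expiry date. The sketch below is a simplified, hypothetical model of such a token (it is not Azure's actual SAS wire format, and the key and resource names are made up); it shows why the scope and permissions embedded in the leaked token mattered so much.

```python
import hashlib
import hmac
from datetime import datetime, timedelta, timezone

# Hypothetical signing key, standing in for an Azure storage account key.
ACCOUNT_KEY = b"example-account-key"

def make_token(resource: str, permissions: str, valid_for: timedelta) -> str:
    """Build a signed, scoped access token (conceptual sketch only).

    The HMAC signature covers the resource path, the permission string,
    and the expiry time, so a holder of the token cannot alter any of
    them without invalidating the signature.
    """
    expiry = (datetime.now(timezone.utc) + valid_for).strftime("%Y-%m-%dT%H:%MZ")
    payload = f"{resource}\n{permissions}\n{expiry}"
    sig = hmac.new(ACCOUNT_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"sr={resource}&sp={permissions}&se={expiry}&sig={sig}"

# A narrowly scoped grant: read-only ("r"), one container, one-hour lifetime.
safe = make_token("models/dataset", "r", timedelta(hours=1))

# The kind of grant behind the incident: read, write, and list ("rwl")
# over the whole account, with an expiry decades in the future.
risky = make_token("/", "rwl", timedelta(days=365 * 30))

print(safe)
print(risky)
```

The contrast between the two tokens is the whole story: scoping the first token to a single container with read-only permission and a short lifetime would have limited the blast radius, whereas the account-wide read/write grant exposed everything in storage to anyone who saw the URL.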
“As a result of this issue, no customer data was exposed and no other internal services were impacted,” according to Microsoft. The company has since expanded GitHub’s security scanning to detect exposed credentials and other sensitive information, including SAS tokens, which can carry extremely broad permissions and distant expiration dates.