Microsoft said on Monday that it has taken steps to address a glaring security lapse that exposed 38 terabytes of sensitive data.
The leak was discovered in the company’s AI GitHub repository and is believed to have been inadvertently made public when a collection of open-source training data was published, according to Wiz. The exposed data also included a disk backup of two former employees’ workstations, which contained secrets, keys, passwords, and more than 30,000 internal Teams messages.
The repository, known as “robust-models-transfer,” is no longer accessible. Before it was taken down, it hosted the source code and machine learning models for the 2020 research paper “Do Adversarially Robust ImageNet Models Transfer Better?”
The exposure traces back to an overly permissive SAS (Shared Access Signature) token, an Azure feature that lets users share data in a way that is both difficult to track and difficult to revoke, according to Wiz’s analysis. Microsoft was notified of the problem on June 22, 2023.
The repository’s README.md file instructed developers to download the models from an Azure Storage URL that inadvertently granted access to the entire storage account, thereby exposing additional private data.
The token was incorrectly configured to grant “full control” permissions rather than read-only access, according to Wiz researchers Hillai Ben-Sasson and Ronny Greenberg. “In addition to the overly permissive access scope, the token was also misconfigured to allow ‘full control’ permissions instead of read-only,” they said, meaning an attacker could not only view all the files in the storage account but also delete and overwrite existing ones.
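The difference in blast radius is easiest to see at the point where a token is generated. The sketch below, a minimal illustration using the azure-storage-blob Python SDK with placeholder account names, keys, container, and blob (none of which come from the incident), contrasts an account-level SAS carrying write and delete rights with a narrowly scoped, short-lived, read-only blob SAS.

```python
from datetime import datetime, timedelta, timezone

from azure.storage.blob import (
    AccountSasPermissions,
    BlobSasPermissions,
    ResourceTypes,
    generate_account_sas,
    generate_blob_sas,
)

# Placeholder credentials: purely illustrative, not from the incident.
ACCOUNT_NAME = "examplestorage"
ACCOUNT_KEY = "<account-key>"

# Overly permissive account SAS: covers every container and blob in the
# account and grants write/delete rights, roughly the "full control"
# situation Wiz describes. Anyone holding the URL can tamper with the data.
risky_sas = generate_account_sas(
    account_name=ACCOUNT_NAME,
    account_key=ACCOUNT_KEY,
    resource_types=ResourceTypes(service=True, container=True, object=True),
    permission=AccountSasPermissions(read=True, write=True, delete=True, list=True),
    expiry=datetime.now(timezone.utc) + timedelta(days=365 * 10),  # effectively never expires
)

# Narrow alternative: read-only access to a single blob, expiring in one hour.
scoped_sas = generate_blob_sas(
    account_name=ACCOUNT_NAME,
    container_name="models",
    blob_name="robust_resnet50.pt",
    account_key=ACCOUNT_KEY,
    permission=BlobSasPermissions(read=True),
    expiry=datetime.now(timezone.utc) + timedelta(hours=1),
)

print(f"https://{ACCOUNT_NAME}.blob.core.windows.net/models/robust_resnet50.pt?{scoped_sas}")
```

In both cases the SAS is just a query string appended to the storage URL; whoever obtains the full URL holds the access it encodes, which is why a link shared in a public README carries the token’s entire permission set with it.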
In response to the findings, Microsoft said there was no evidence of unauthorized exposure of customer data and that “no other internal services were put at risk because of this issue.” It also emphasized that customers do not need to take any action.
The SAS token has since been revoked and all external access to the storage account blocked, the Windows maker said. The problem was remediated two days after the initial disclosure.
To mitigate such risks going forward, the company has expanded its secret scanning service to cover any SAS token that may have overly permissive expirations or privileges. It also noted that a flaw in its scanning system had caused the specific SAS URL in the repository to be incorrectly flagged as a false positive.
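Microsoft’s internal scanner is not public, but the shape of such a check can be sketched in a few lines: SAS URLs carry recognizable query parameters (sv, sig, se, sp), and the permission (sp) and expiry (se) fields can be inspected directly. The script below is a simplified, hypothetical illustration of that idea, not Microsoft’s tooling.

```python
import re
import sys
from pathlib import Path
from urllib.parse import parse_qs, urlparse

# Rough pattern for an Azure Storage URL carrying a SAS query string (the sig parameter).
SAS_URL = re.compile(
    r"https://[\w.-]+\.(?:blob|file|queue|table)\.core\.windows\.net/\S*\bsig=\S+"
)

def audit(path: str) -> None:
    """Flag SAS URLs whose permissions go beyond read/list."""
    for file in Path(path).rglob("*"):
        if not file.is_file():
            continue
        try:
            text = file.read_text(errors="ignore")
        except OSError:
            continue
        for match in SAS_URL.finditer(text):
            params = parse_qs(urlparse(match.group()).query)
            perms = params.get("sp", [""])[0]
            expiry = params.get("se", [""])[0]
            # Anything beyond read (r) and list (l) deserves a closer look.
            if set(perms) - {"r", "l"}:
                print(f"{file}: SAS grants '{perms}', expires '{expiry or 'not stated'}'")

if __name__ == "__main__":
    audit(sys.argv[1] if len(sys.argv) > 1 else ".")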
Due to the lack of security and governance around Account SAS tokens, the researchers said they “should be considered as sensitive as the account key itself.” “As a result, it is strongly advised against using Account SAS for external sharing. Sensitive data can readily be exposed by token creation errors that go undetected,” they added.
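One commonly recommended alternative to an Account SAS is a user delegation SAS, which is signed with Microsoft Entra ID credentials rather than the account key, so it can be scoped and expired independently of that key. The sketch below is a hedged illustration of that approach, assuming the azure-storage-blob and azure-identity Python packages and the same placeholder names used above.

```python
from datetime import datetime, timedelta, timezone

from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobSasPermissions, BlobServiceClient, generate_blob_sas

ACCOUNT_URL = "https://examplestorage.blob.core.windows.net"  # placeholder account

# Authenticate with Entra ID instead of the shared account key.
service = BlobServiceClient(account_url=ACCOUNT_URL, credential=DefaultAzureCredential())

# Request a short-lived user delegation key to sign the SAS with.
now = datetime.now(timezone.utc)
delegation_key = service.get_user_delegation_key(now, now + timedelta(hours=1))

# Read-only, single-blob token that expires along with the delegation key.
sas = generate_blob_sas(
    account_name="examplestorage",
    container_name="models",
    blob_name="robust_resnet50.pt",
    user_delegation_key=delegation_key,
    permission=BlobSasPermissions(read=True),
    expiry=now + timedelta(hours=1),
)

print(f"{ACCOUNT_URL}/models/robust_resnet50.pt?{sas}")
```

Because the token is tied to a delegation key rather than the account key, its lifetime is bounded and it never exposes account-wide credentials in the first place.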
This is not the first time misconfigured Azure storage accounts have come to light. In July 2022, JUMPSEC Labs detailed a scenario in which a threat actor could take advantage of such accounts to gain access to an enterprise’s on-premises environment.
The latest security blunder comes almost two weeks after Microsoft disclosed that hackers based in China had infiltrated the company’s systems and stolen a highly sensitive signing key by compromising an engineer’s corporate account and likely accessing a crash dump of the consumer signing system.
AI gives tech companies access to enormous potential, according to Wiz CTO and co-founder Ami Luttwak. However, the enormous amounts of data that data scientists and engineers handle as they strive to implement new AI solutions necessitate additional security checks and precautions.
“Large data sets are required to train this new technology. Cases like Microsoft’s are becoming harder to monitor and prevent as more development teams need to manipulate vast volumes of data, share it with their peers, or work together on open-source projects,” he added.