Cybersecurity incidents are becoming more frequent. In an unprecedented series of events, the Internet Archive, home of the well-known Wayback Machine, suffered several significant cyberattacks in October 2024. These incidents have reverberated through the digital preservation community and raised critical questions about how well our collective online heritage is protected.
The First Data Breach:
The Internet Archive suffered a cyberattack in late September 2024 that exposed the personal information of approximately 31 million users [1]. The data is believed to have been stolen on September 28, and the breach became public on October 9. News began circulating that afternoon, when visitors to archive.org started seeing a JavaScript alert announcing that the Internet Archive had been breached. The alert read: “Have you ever felt like the Internet Archive runs on sticks and is constantly on the verge of suffering a catastrophic security breach? It just happened. See 31 million of you on HIBP!” [2].
In the alert, HIBP refers to Have I Been Pwned, a data breach notification service created by Troy Hunt, with whom hackers commonly share stolen data so that it can be added to the service. Hunt told BleepingComputer that on September 30 a hacker shared a database of Internet Archive user information with him: a 6.4GB SQL file called “ia_users.sql” [2] containing email addresses, screen names and bcrypt-hashed passwords. He contacted the Internet Archive about the data on October 6 [1]. The most recent timestamp in the stolen records is September 28, 2024, which is probably when the database was stolen [1] [2].
Hunt says the database contains 31 million unique email addresses, many of them already subscribed to the HIBP breach alert service. He said the data would soon be added to HIBP, letting users check whether their email address was involved in this breach. Hunt validated the data by reaching out to users listed in the compromised database. Among those contacted was cybersecurity researcher Scott Helme, who allowed BleepingComputer to share his exposed record [2].
Helme confirmed that the bcrypt-hashed password in the data record matched the bcrypt-hashed password stored in his password manager. He also confirmed that the timestamp in the database record aligned with the date of his last password update, providing further evidence of the data’s accuracy [2].
In this first breach, the hackers got in through a GitLab token that had been left exposed since December 2022. Using it, they accessed the Internet Archive’s source code and stole user data [5].
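A token left exposed like this is the kind of thing an automated secret scan can sometimes catch early. The snippet below is only a minimal sketch, not the Internet Archive’s tooling and not a substitute for a dedicated scanner. It assumes the credential is a GitLab personal access token with the default `glpat-` prefix followed by at least 20 characters; the reports do not say what kind of token was actually exposed. The function name `find_gitlab_tokens` and the sample text are hypothetical.

```python
import re

# Assumption: GitLab personal access tokens carry the default "glpat-" prefix
# followed by at least 20 URL-safe characters. Custom prefixes are not covered.
GITLAB_PAT = re.compile(r"glpat-[0-9A-Za-z_-]{20,}")


def find_gitlab_tokens(text):
    """Return (line_number, redacted_token) pairs for every suspected token."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in GITLAB_PAT.finditer(line):
            token = match.group(0)
            # Report only the prefix and the last four characters.
            findings.append((line_number, token[:6] + "..." + token[-4:]))
    return findings


if __name__ == "__main__":
    # Hypothetical config file contents; the token is a made-up placeholder.
    sample = (
        "[remote]\n"
        "url = https://oauth2:glpat-" + "x" * 16 + "ABCD@gitlab.example.com/repo.git\n"
    )
    for line_number, redacted in find_gitlab_tokens(sample):
        print(f"line {line_number}: possible GitLab token {redacted}")
```

Running a check like this over configuration files and repository history before they are published is one way to shorten the time a leaked credential stays unnoticed.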
The DDoS Attacks:
While the Archive was still dealing with the aftermath of the data breach, it was hit by back-to-back DDoS (distributed denial-of-service) attacks on October 9 and October 10 [1] [4]. Brewster Kahle, chair of the Internet Archive’s board, posted an update on X about the Archive’s response:
“What we know: DDOS attack – fended off for now; defacement of our website via JS library; breach of usernames/email/salted-encrypted passwords.
“What we’ve done: Disabled the JS library, scrubbing systems, upgrading security.” [1] [2]
On October 10, Brewster Kahle posted on X again to say that the DDoS attacks had resumed, taking archive.org and openlibrary.org offline once more.
The Russia-based SN_BlackMeta hacking group claimed responsibility for the DDoS attacks in a post on X and said it would carry out more [3] [4].
Mid-October 2024 – The Second Breach:
In mid-October 2024, hackers used an unrotated access token to get into the Internet Archive’s Zendesk support site without permission. These tokens, which work like digital keys, were supposed to have been secured after the earlier alerts, but they remained exposed [5]. This breach revealed a critical flaw in the Archive’s security practices, particularly its failure to rotate API tokens regularly [5].
October 20, 2024 – The Third Breach:
This breach happened because bad actors continued to exploit Zendesk API tokens that had not been rotated. The tokens were exposed in earlier attacks, but the Internet Archive did not change or replace them. As a result, the hackers kept their access to the Internet Archive’s Zendesk support platform, where sensitive user support tickets were stored [5].
The Link Between the Breaches:
The third breach is directly connected to the vulnerability that was exploited during the first two breaches:
First Breach: October 9, 2024
In the first breach, bad actors exploited a GitLab token that had been exposed since late 2022. This allowed them to access the Internet Archive’s source code and put the sensitive information of 31 million users at risk. At the same time, the group SN_BlackMeta launched a DDoS attack that disrupted access to the website [5].
Second Breach: Mid-October, 2024
In this breach, the bad actors targeted the Internet Archive’s Zendesk support platform by taking advantage of unrotated access tokens. These tokens should have been replaced after the initial breach. Because they were not, the attackers gained unauthorized access to support tickets containing users’ personal data [5].
Third Breach: October 20, 2024
The third breach happened because of the same root cause as the first two attacks: a failure to manage and rotate access tokens properly. This allowed the attackers to keep exploiting the same vulnerabilities to reach critical areas of the Internet Archive’s systems. Each new attack took advantage of problems left unfixed after the previous one, making the damage worse [5].
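The rotation failure described above can be illustrated with a small audit routine. This is a sketch under stated assumptions, not a description of how the Internet Archive manages credentials: the `Token` class, the `tokens_needing_rotation` function, the 90-day rotation window, and the inventory entries are all hypothetical. It flags any token that is older than the allowed age or that was issued before a known incident, since a credential that existed at the time of a breach may already be in an attacker’s hands.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Token:
    name: str
    issued_at: datetime  # when the credential was created or last rotated


def tokens_needing_rotation(tokens, now, max_age, incident_at=None):
    """Return names of tokens that are too old or predate a known incident."""
    stale = []
    for token in tokens:
        too_old = now - token.issued_at > max_age
        # Anything issued before a breach may have been copied by the attacker.
        predates_incident = incident_at is not None and token.issued_at <= incident_at
        if too_old or predates_incident:
            stale.append(token.name)
    return stale


if __name__ == "__main__":
    utc = timezone.utc
    # Hypothetical inventory; names and dates are illustrative only.
    inventory = [
        Token("gitlab-ci", datetime(2022, 12, 1, tzinfo=utc)),
        Token("zendesk-api", datetime(2024, 9, 1, tzinfo=utc)),
        Token("mail-relay", datetime(2024, 10, 12, tzinfo=utc)),
    ]
    flagged = tokens_needing_rotation(
        inventory,
        now=datetime(2024, 10, 20, tzinfo=utc),
        max_age=timedelta(days=90),
        incident_at=datetime(2024, 10, 9, tzinfo=utc),
    )
    print(flagged)  # ['gitlab-ci', 'zendesk-api']
```

In this example the second token is well inside the 90-day window but is still flagged because it predates the incident, which is the case the second and third breaches illustrate.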
Conclusion:
This incident serves as a wake-up call for organizations to implement proactive security measures, update systems regularly and rotate security tokens on a schedule. It shows that even organizations like the Internet Archive are vulnerable when security fundamentals are neglected.
Hackers are exploiting systems with advanced technologies and tools, pushing cybersecurity professionals to come up with robust solutions to counter these sophisticated threats. It is our collective responsibility to counter such attacks by managing our systems properly.
References:
[1] https://www.siliconrepublic.com/enterprise/internet-archive-cyber-attack-2024
[2] https://www.bleepingcomputer.com/news/security/internet-archive-hacked-data-breach-impacts-31-million-users/
[3] https://www.cyberdaily.au/security/11215-internet-archive-down-claims-catastrophic-data-breach-impacting-31-million
[4] https://www.techrepublic.com/article/internet-archive-accounts-exposed/
[5] https://www.forbes.com/sites/larsdaniel/2024/10/20/internet-archive-breached-again-third-cyber-attack-in-october-2024/
[6] https://x.com/disclosetv/status/1844135950324203802
[7] https://x.com/troyhunt/status/1844148532703526928
[8] https://x.com/brewster_kahle/status/1844183111514603812
[9] https://x.com/Sn_darkmeta/status/1844080692772401399
Nice post! It is important for organizations to run periodic vulnerability tests on their archives to mitigate attacks, and it is also a reminder that the security of an organization is the responsibility of every individual who works within it.
Nice work!
Nice post!
It is really concerning that the hackers used the same vulnerabilities to breach the Internet Archive’s systems. These repeated attacks, made possible by unrotated API tokens, show how a small failure in security can lead to major consequences. Patching alone is not enough; timely audits and management of security protocols are important too.
Hi, Akshar. I really enjoyed the topic you chose to discuss for this post. I gathered and understood many new ideas. This particular issue can be seen as a result of the necessity for routine system updates. By doing so, we are able to uphold a proactive security measure where attacks are prevented due to token rotation. This incident helps showcase the value of safe cybersecurity practices on an organization’s end. As attackers increasingly use sophisticated tools to exploit even minor security oversights, staying updated and informed is one of the best ways to reduce the level of impact that may occur.
Excellent post! This blog is a stark reminder of how tricky it has become to secure digital objects, even for non-profit organizations such as the Internet Archive. This pattern of hacks illustrates how a single mistake, such as not rotating access tokens, can turn into a cycle of attacks. It is a dramatic example of how even very simple security procedures such as token rotation help to significantly protect a system. The Internet Archive’s experience can be used to encourage others in the space to push further on token control and system audits.
Nice job, Akshar! The string of breaches at the Internet Archive is a wake-up call to take security measures to update systems and work to rotate access tokens. This proves that there is a great need for a good security framework to secure sensitive data and prevent recurrent attacks.