Leave the Internet Archive alone!

opinion
Oct 29, 2024, 5 mins
Internet, Security, Technology Industry

Except for book publishers, the Internet Archive has done no one any harm. But that hasn't stopped hackers from beating up on the site over and over again.

The web has been a mixed blessing for people who care about information. Yes, it’s made it easier than ever to access facts and opinions from around the globe — but it also throws out older data as quickly as it brings in new data. (And let’s not even talk about propaganda!)

One shining beacon for recording truthful and accurate records throughout the web’s history has been the Internet Archive.

The Archive was created by Brewster Kahle, who, beginning in 1980, wanted “to build a library of everything.” His first step in that direction was creating the Internet’s first distributed search system, the Wide Area Information Server (WAIS).

When he founded the Archive in 1996, his ambitious goal was to provide “universal access to all knowledge.” Kahle and his friends have been remarkably successful. Today, the Archive holds digital copies of 44 million books and texts, 15 million audio recordings, 10.6 million videos, 4.8 million images, a million software programs, and even a copy of Computerworld from 1969.

To do this, he created the Internet Archive and its associated projects, including the Wayback Machine, which allows users to view archived versions of more than 866 billion saved web pages, and the Open Library project, which aims to create a web page for every published book.

It’s that last project that got the Archive into legal hot water. During the COVID-19 pandemic, Kahle opened the library for free ebook borrowing via the Controlled Digital Lending (CDL) program. Publishing companies were not amused, and the Internet Archive lost the resulting lawsuit, Hachette v. Internet Archive. The court rejected the Archive’s fair use defense, finding that its digital lending practices infringed on publishers’ copyrights.

That’s a huge problem on its own. The Internet Archive is a 501(c)(3) non-profit with a gross revenue in its most recent 990 filing of only $30.5 million. For the size of the job it’s undertaken, it’s grossly underfinanced. 

Recently, though, adding insult to injury, the Archive has been subjected to one cyber-attack after another. The first major incident occurred Oct. 9-10 and involved two simultaneous attacks. First, hackers exploited a GitLab token, compromising the Archive’s source code and stealing user data from 31 million accounts. Concurrently, a pro-Palestinian group called SN BlackMeta launched a Distributed Denial of Service (DDoS) attack, temporarily knocking the site, and with it the Wayback Machine, offline.

BlackMeta said it hit the site because it belongs to the United States, which supports Israel in the ongoing Palestine-Israel conflict. Uhm, no, no it doesn’t. The only cause the Internet Archive espouses is freedom of information, and it has no connection with the US government.

Maybe it should. I could argue that the National Archives and Records Administration (NARA) should track the public web, but it doesn’t. 

Then, on Oct. 20, the Internet Archive suffered yet another security breach: This time, hackers exploited unrotated application programming interface (API) tokens for Zendesk, the help desk support program, to access the Archive’s support platform.

The results have been one mess after another. Many of the Archive’s services, including the Wayback Machine, have gone dark. In addition, people are worried that some of the data stored by the Archive has been deleted or compromised. 

Operators managed to get the site back up, and a few days ago, Kahle told CBC Radio, “It’s just so sad. It’s great to be back up, and we have millions of people now accessing the site again.” 

That didn’t last. Since then, it’s been hammered yet again!

Enough already. Crashing the Internet Archive won’t make a lick of difference to the world’s geopolitical problems. No one will get rich from ripping off the Internet Archive’s users. There is no point in messing with the Archive. None!

The Archive is a useful library. That’s it. That’s all. And that’s enough.  

In particular, the Archive keeps the only real records of what’s been on the Web. As we put more of our records and news on the Web and nowhere else, that’s vitally important for historians and other people who appreciate knowing who said what to whom and when. 

The Archive needs to be preserved, not vandalized. I’m reminded of the dim-minded protestors whose big idea was to throw pumpkin soup on the Mona Lisa. Quick! What were they protesting?  

You don’t know, do you? 

It was about the right to healthy, sustainable food.

That attack made no difference whatsoever. 

Vandalism, whether on a politically neutral, useful website or on world-famous art, is not helpful; it’s only harmful. And, in the Internet Archive’s case, it’s also pointless. 

More by Steven J. Vaughan-Nichols: