Remember: If OpenAI/Google does it for $$$, it's not illegal. If idealists do it for public access, full force of the law.
Information wants to be free. Oblige it. Fools with temporary power trying to extract from the work of others will be a blip in the history books if we make them.
Google had a tailored fair use argument because they never made more than snippets public and searchable. Also, before Hachette, controlled lending with one digital copy for every physical copy was a status quo that publishers largely accepted, and IA deliberately tried to upset it with the National "Emergency" Library.
I think it's worth fighting back on copyright as a broken institution, and it should be part of the IA's mission, but you have to be responsible in your approach if you're also going to posture as an archival library with stability of information and access. I understand Kahle might lament losing some of the hacker ethos, but the IA is too important to run up against extremes like this without an existential threat.
A summary judgement in favor of Google with an explicit sentence in the ruling that Google was not "violating intellectual property law" is an unmitigated victory.
Google removed a ridiculous amount of material during the dispute with the Authors Guild. I know because a bunch of my legal history research citation links collected between 2007 and 2011 are long since dead, with the material completely gone, AFAICT, and either not discoverable or only available in excerpt. And this was stuff from the 19th and early 20th centuries, which definitely was out of copyright in the US, though some of it may have been a headache in Europe regarding copyright-adjacent author rights that Google didn't want to deal with.
Those who create information may have families to feed, house and clothe. Until those items (food/housing/clothes) are also free, information cannot be free.
You might be glad to learn that a number of studies (mostly commissioned by the European Union) agree that piracy doesn't hurt sales.
The main consensus is that people who illegally access content wouldn't have bought it otherwise, and that they still advertise it (thus, still driving up sales).
These studies have then been systematically strong-armed into silence by the EU and constituent countries' anti-piracy bodies.
This is probably because the war on piracy, too, is a billion-dollar industry. I'd be glad to blow it all up and give it all to the starving artists and their families.
You are being intentionally misleading. Public access AI models are not being taken down either. There is a big, transformative, difference with freely giving out books to read compared with using them to train an ML model.
> There is a big, transformative, difference with freely giving out books to read on a small, measured in human reading pace, scale compared with using them at a massive scale at internet and computer memory speeds to train an ML model even if the intellectual property used to train the ML was from unlicensed copies, and which the model regularly and with some frequency regurgitates verbatim.
not wanting you to be intentionally misleading, FTFY.
It was an absolutely bone-headed poke-the-bear move and we should count ourselves lucky that it was only a chunk of the library and not the whole archive that got nuked. IA holds priceless and irreplaceable data, and while the library initiative was a well-intentioned move during the pandemic it was way too radical for the keepers of our shared digital history.
And this is why I respect Anna's Archive. If we want information to be free, I think we should consider intentionally violating copyright as an act of civil disobedience. I'm not sure I'm ready to go that far, but I respect the people at AA who are.
> Kahle thinks “the world became stupider” when the Open Library was gutted—but he’s moving forward with new ideas
> The lawsuits haven’t dampened Kahle’s resolve to expand IA’s digitization efforts, though. Moving forward, the group will be growing a project called Democracy’s Library
please just stop. let IA be what it is. or rather, nothing wrong in doing new projects but don't tie them to IA, just start them as completely separate things. IA is too important as-is to be a playground for random kooky ideas playing with fire.
As the project matures, the risk tolerance should mature too.
Betting your own time and money on the realization of a crazy ideal can be very noble. Betting a resource millions of people are relying on is destructive hubris.
They should take the untamed idealism to a separate legal entity before they ruin all the good they've done.
"Millions of people" should either be putting their money (and their objections) where their mouth is or stop relying on someone else's resource. The reality is, that like Wikipedia, few people have donated to IA as a proportion of all its users.
The "good" that they've done is the "good" as the creators see it, not the "good" as the freeloaders see it. All of which is to simply say that almost all users of IA are relying on the goodwill of the creators.
And Brewster Kahle's notions about culture and information sharing start well before the Internet Archive. In theory one could pick and choose, but this is Brewster's life-long passion project. The man even outfitted a van with a printer and a binder to distribute physical books for free.
It's very strange to insist that he _not_ push the boundaries of copyright law for the common good. Without that you wouldn't have had the Wayback Machine in the first place.
I am sorry to say it, but copyright protection time should vary by subject. Programming books - 10 years maybe? 10 years is ages in computer science. TV shows? 5-7 years maybe. After that time nobody wants to pay for watching old Big Brother or another Fort Boyard... Nor pay for storing it in an archive. And this is the culture other creations are referring to.
In Poland we've run into a very strange situation - Polish Public TV (TVP) paid for great dubbing of some Disney shows. They recorded it on VHS tapes, which were later overwritten with other shows. Now the translation and the dubbing are lost, found sometimes on people's home-recorded VHS tapes but in poor quality, because they were recorded off the air.
Most of the IA’s ebook collection still supports controlled digital lending, just like every other library that operates an ebook lending system with CDL.
What capitalism continues to show us: proof that public libraries, if created in the last 10 years, would be deemed illegal and sued out of existence.
It's only because late-1800s billionaires wanted to leave legacies: they founded pay-to-enter and free libraries, then migrated them to free, or public, libraries. That's why so many of them are (Andrew) Carnegie libraries.
A lower stakes but still illustrative example I see is that the DVR is an invention that wouldn't be allowed to succeed today. All power is being wielded to its fullest in order to prevent skipping ads.
Cable to streaming took us from skippable to unskippable ads. Search results to LLM results will result in invisible/undisclosed ads. Each successive generation of technology will increase the power of advertising and strip rights we used to have. Another example, physical to digital media ownership, we lost resale rights.
We need to understand that we've passed a threshold after which innovation is hurting us more than helping us. That trumps everything else.
Exactly. A DVR governed by tech giants rather than just Tivo and the cable companies is going to have compromised functionality because it's the tech industry originating the "innovation" for their own benefit.
How do you figure libraries would be deemed illegal? They operate today. The Archive, on the other hand, attempted a fair use argument for whole copies of books (the copyrighted form most legible to copyright law) currently for sale as ebooks. I agree with the comment across the thread calling this a spectacularly boneheaded move and expressing gratitude that the entire Archive wasn't compromised over the stunt.
> How do you figure libraries would be deemed illegal? They operate today.
The history of public libraries is extremely messy, and the RIAA almost managed to get secondhand music made illegal in the 90s. Publishers did not ever support the idea of loaning a single copy of a work to dozens of people. While it's a huge stretch to say that every illegal download represents a lost sale (people download 100x more than they read), it's a lot less of a stretch to say that people who would sit down and read an entire book are fairly likely to have bought it.
Also, when books were relatively more expensive for people (19th century), a lot of income from publishers came from renting their books, rather than selling them. Public libraries involved a lot of positive propaganda and promises of societal uplift from wealthy benefactors, along the same lines and around the same time as the introduction of universal free public education. I remember hearing a lot about this history at the Enoch Pratt Library in Baltimore, which iirc was the first. Libraries were at that time normally private membership clubs.
edit: I also agree that the free book thing was stupid and have been very harsh about it. I don't know if it's possible to be too harsh about it, because it was obviously never going to get past a court. It felt almost like intentional sabotage.
Possibly, yeah. Make a "Deal" <spit> with AI companies to have back-end access to all the Archive org's content. Get 'permission' to copy EVERYTHING and have billionaires run interference.
The AI companies already got blank checks to do that. Anthropic is paying what, like $3000 per book? I remember when the fucks at the RIAA were suing 12 year olds for $10000 for Britney Spears albums.
Or better yet, if it's just $3k a book, can we license every book and have that added into Archive.org? Oh wait, deals for thee, not for me.
Giving free access to books during a pandemic, in a format that doesn't need physical contact, when libraries were shut down or hard to access for a lot of people, should have been praised, not pursued with legal action by rent-seeking entities.
The copyright system as a whole should be torn up.
At least it gives a clear signal to anyone with an ounce of morals which publishers to avoid at all costs.
Copyright needs torn up or at the very least significant reform, but if you're going to be skirting around the edges of it to try to do a good thing, it's probably a good idea to not just straight up obviously and blatantly break the letter and spirit of the law. CDL is an awkward and dubious workaround, but if you drop the 'C' you're just doing copyright infringement, and that would be much better left to entities like Anna's Archive. The criticism of IA in this regard is usually that it was a bad strategy, not that the goals were bad.
During the pandemic, they created the "National Emergency Library", where they allowed users copies of books without caps, without any connection to holdings of the Archive itself, something that was black-letter proscribed by copyright law, and as a result they managed to sabotage the legal case for controlled digital lending too.
I worry 'hacker' news is going to become more and more 'normie', steadily moving farther and farther away from the cypherpunk ethos of Barlow's Declaration of the Independence of Cyberspace.
It's easier to make money when you comply with The Man
You realize that HN has always been deeply business-oriented, with its roots in the startup scene through the connection with YC? The "hacker" part is, I believe, a reference to pg's essay Hackers and Painters: https://paulgraham.com/hp.html
I'm old enough to recall the term in active use, and to have received the appellation from one who'd had it likewise handed down. I regard both as epiphenomena of the Internet's frontier or "Wild West" days, of which California has proven as terminal as it was for the nominate example after the US Civil War - not wholly for dissimilar reasons, if we take Vietnam, for the Internet, as the war whose loss would spur the migration.
Cory Doctorow's works that were released under more permissive licensing still reserved some rights for the author. I believe he used some flavor of a Creative Commons non-commercial license, if memory serves. Point being that the method of licensing his works was still fundamentally based on copyright.
(I think the US copyright system is hugely broken and the social contract needs to be re-negotiated, but I comment here in the interests of facts, not in support of the broken system.)
When I made this criticism before of IA, I was told that that was ridiculous since the publishers had it out for IA before the COVID-19 emergency library. That may or may not have been true, but the publishers did not sue IA despite OpenLibrary existing for years before COVID-19. Publishers didn't pull the trigger because they were afraid of losing. It was a MAD situation, and IA unnecessarily triggered a nuclear war that they lost.
Google Books is currently a shell of its former self.
The 'goodwill' counterparts of ChatGPT, a.k.a. open weight models, are still alive and well online.
What do you think is step 1 of training an LLM?
The main differences are that OpenAI kept their library private and only distributes the digested summaries of it.
It's a training set not an archive.
To say otherwise is disingenuous.
This is a weasel word you've inserted to be intentionally misleading.
IA is the eccentric, untamed idealism. You can’t have the Wayback Machine without the National Emergency Library and the Great 78 Project.
He as an individual can keep pushing whatever he wants. Just keep IA out of it.
https://news.ycombinator.com/item?id=45798283
https://news.ycombinator.com/item?id=45809870
https://news.ycombinator.com/item?id=45806643
Good chance the book you wanted is gone at the least
Only legal when billionaires do it.
https://arstechnica.com/gadgets/2025/11/youtube-tvs-disney-b...
And yet I can go to a site right now off the top of my head and watch any TV show or basically any movie made in the last 50 years for free in HD.
It might be shut down tomorrow and it'll be up again 30s later with a different TLD.
They aren't winning but they really are trying hard to.
> proof that public libraries, if created in the last 10 years
https://news.ycombinator.com/announcingnews.html
https://news.ycombinator.com/hackernews.html
If there were no copyrights, no author would make any money.
Cory Doctorow showed that this isn't true.