Meta allegedly torrented over 81TB worth of pirated books to train its AI models

Current Events

02/06/2025 11:37:14 PM

https://arstechnica.com/tech-policy/2025/02/meta-torrented-over-81-7tb-of-pirated-books-to-train-ai-authors-say/

remember kids, piracy is bad

https://i.imgur.com/TGkNCva.gif https://i.imgur.com/8mWCvA4.gif

divot1338

02/06/2025 11:42:34 PM

How do you know that Zuckerberg didn't just get them a library card?

Moustache twirling villain
https://i.imgur.com/U3lt3H4.jpg - Kerbey

SomeUsername529

02/06/2025 11:45:10 PM

It is well known that all the LLMs are based on massive theft. It's why people laughed so hard about OpenAI crying about DeepSeek "copying" their data or whatever. Current AI/LLMs would not exist if the creators had been stopped from stealing mountains of data to train on. They could probably have done it legitimately with enough time, but clearly they didn't want to take the time.

TMOG

02/06/2025 11:45:36 PM

They're not music or movies, so nobody's going to care.

FL81

02/09/2025 05:10:36 AM

Meta also allegedly modified settings "so that the smallest amount of seeding possible could occur," a Meta executive in charge of project management, Michael Clark, said in a deposition.

And to think, they didn't even seed!

https://i.imgur.com/TGkNCva.gif https://i.imgur.com/8mWCvA4.gif

NeonPhoenix

02/09/2025 05:13:23 AM

FL81 posted...

And to think, they didn't even seed!

Those selfish bastards

https://imgur.com/u2HR4nG

Hejiru

02/09/2025 05:16:12 AM

TMOG posted...

They're not music or movies, so nobody's going to care.

Textbook companies sure care when you pirate their mandatory $400 book that's slightly different from last year's edition.

The difference between fiction and reality is that fiction has to make sense.

FL81

02/09/2025 05:47:24 PM

Hejiru posted...

Textbook companies sure care when you pirate their mandatory $400 book that's slightly different from last year's edition.

*meanwhile Facebook downloads literal millions of textbooks*

https://i.imgur.com/TGkNCva.gif https://i.imgur.com/8mWCvA4.gif

Post #9 was unavailable or deleted.

Storm_Shadow

02/09/2025 05:55:38 PM

#10

81TB? Pfft. I have a system at work that generates 81TB of data in a single month. Git gud, Zuckyboy.

If you treat people as equals, they start to think they ARE your equals.
http://www.gamefaqs.com/boards/1005-warhammer-40k

MabinogiFan

02/09/2025 05:58:53 PM

#11

Hejiru posted...

Textbook companies sure care when you pirate their mandatory $400 book that's slightly different from last year's edition.

Lmao I remember people sharing PDFs of textbooks back when I was in college. I think even some professors looked the other way.

DrizztLink

02/09/2025 06:01:19 PM

#12

MabinogiFan posted...

Lmao I remember people sharing PDFs of textbooks back when I was in college. I think even some professors looked the other way.

I pirated my Ethics textbook and I shared the link with all in my class who asked.

Zero.

Regrets.

He/Him http://guidesmedia.ign.com/guides/9846/images/slowpoke.gif https://i.imgur.com/M8h2ATe.png
https://i.imgur.com/6ezFwG1.png

FL81

02/12/2025 03:41:11 PM

#13

Storm_Shadow posted...

81TB? Pfft. I have a system at work that generates 81TB of data in a single month. Git gud, Zuckyboy.

For what it's worth, the average eBook is, like, 500 KB or something lol
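Taking that guess at face value, here is a rough back-of-envelope sketch of the scale. The 81.7 TB figure comes from the Ars Technica URL in the first post; the 500 KB average is only the guess above, not a measured value, and `estimate_book_count` is a made-up helper name for illustration.

```python
# Back-of-envelope estimate only. Both inputs are approximations:
# - 81.7 TB (decimal terabytes) is the figure in the linked article's URL
# - 500 KB per eBook is an unverified guess from this thread
TOTAL_BYTES = 81.7e12
AVG_EBOOK_BYTES = 500e3


def estimate_book_count(total_bytes: float, avg_book_bytes: float) -> float:
    """Return how many files of avg_book_bytes would fit in total_bytes."""
    return total_bytes / avg_book_bytes


books = estimate_book_count(TOTAL_BYTES, AVG_EBOOK_BYTES)
print(f"~{books / 1e6:.0f} million books")
```

If the 500 KB average holds, that works out to roughly 163 million files. Treat it as an order-of-magnitude figure at best, since real file sizes vary a lot by format.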

https://i.imgur.com/TGkNCva.gif https://i.imgur.com/8mWCvA4.gif

BewmHedshot

02/12/2025 03:47:30 PM

#14

Storm_Shadow posted...

81TB? Pfft. I have a system at work that generates 81TB of data in a single month. Git gud, Zuckyboy.

Yeah but is that data worth anything?

HudGard

02/12/2025 03:57:24 PM

#15

DrizztLink posted...

I pirated my Ethics textbook and I shared the link with all in my class who asked.

Zero.

Regrets.

Being able to Ctrl+F a textbook is a godly ability to have, saving money aside

did that every chance I could

Trumpo

02/12/2025 04:22:35 PM

#16

Those are rookie numbers

Lordgold666

02/12/2025 06:12:10 PM

#17

FL81 posted...

And to think, they didn't even seed!

Lol

"May the Father of Understanding guide you."

Notti

02/16/2025 04:38:34 AM

#18

SomeUsername529 posted...

It is well known that all the LLMs are based on massive theft. It's why people laughed so hard about OpenAI crying about DeepSeek "copying" their data or whatever. Current AI/LLMs would not exist if the creators had been stopped from stealing mountains of data to train on. They could probably have done it legitimately with enough time, but clearly they didn't want to take the time.

No fair! We stole it first!

I've seen "content producers" cry similarly.

http://youtube.com/TheYoungTurks/videos
http://youtube.com/SamSeder/videos http://RightWingWatch.org http://reddit.com/r/BreadTube http://fb.me/OccupyDemocrats

Ratchetrockon

02/19/2025 11:45:22 PM

#19

Omg they use the sites I occasionally visit

I'm a Taurus. Currently playing: Oldschool Runescape & DMC 3 Ubisoft Port. He/Him

#20

Post #20 was unavailable or deleted.

Homeless_Waifu

02/19/2025 11:49:17 PM

#21

Meanwhile the founder of Reddit who tried to pirate books from an archive and spread the knowledge to the general public got arrested or something like that

HERRO EvryBody. HOW ARE YOU? FINE OKAY!

FL81

02/19/2025 11:58:04 PM

#22

Homeless_Waifu posted...

Meanwhile the founder of Reddit who tried to pirate books from an archive and spread the knowledge to the general public got arrested or something like that

what happened to Aaron Swartz was really fucked up

https://i.imgur.com/TGkNCva.gif https://i.imgur.com/8mWCvA4.gif
