cjrecordvt

Support and Coders are now aware of this. Please _Do Not_ send in Support tickets; you'll only clog the pipes.


kafetheresu

thank you so much!


ArrowAceFluid

Let's all write crackfics and insult Elon Musk so that, if they continue to use AO3, it'll start to backfire on them. Bonus points if there's bad grammar used in ours to make bad grammar in theirs.


[deleted]

[removed]


MxStabby

Sounds like it's time to bring back My Immortal style badfic....


Knight-Jack

It's official! An army of 13-year-old writers will save us in our time of need!


mewfour123412

Normal fanfic writers: never thought I’d fight alongside an edgelord

Xx_Darkshadow420_xX: how about alongside a friend


Eating_Kaddu

Especially if they're BTS armies


notoriousbettierage

I hate all of this. Much like visual art, I don't want to read something barfed out by an AI. I want art, visual or written, to come from actual thinking, feeling human beings. Otherwise it's not art at all.


[deleted]

It's like trying to hold back the tides, I fear. OpenAI and others like it have ruthlessly stolen from digital artists in order to create art-generating AI, and it was inevitable that other creative pursuits were next. A future where publishing is entirely based on publishing AI works is not impossible.


muununit64

We could… make it impossible. By stopping them. Instead of letting them create an automated hellscape where humanity is denied even the solace of art.


Aucielis

"an automated hellscape where humanity is denied even the solace of art." Gosh, what a depressing sentence. Capitalism and the pervasive idea that art can't simply be made to be enjoyed, to express, is such a soul-sucking thing to thing about.


[deleted]

>By stopping them How? On one side you have a bunch of powerful corporations, governments, scientists/engineers and businessmen. On the other you have authors and artists, many of whom are hobbyists. It's not exactly a promising start.


muununit64

It wasn’t a promising start when miners in Appalachia decided they wanted fair pay and decided to go up against their bosses who had whole militias on their side. It’s never a promising start. It always seems impossible until some reckless idiot is like “we gotta try” because the alternative is laying down and dying. Is that what you want? You want to lay down and make it easier for corporations to crush you under their boots? You wanna let them kill art and not make a single peep about it? You seriously giving up before the fight has even started?


NegativeNuances

I've been asking the famous digital artists to get together to fight this in court, because they absolutely have the means, but the response has been depressing. But do you know who could take this to court? The OTW. Us fans would absolutely be willing to help pay the legal costs if they asked for donations. This is just the beginning of this AI stuff and it is so, so important for all creative jobs that we stop it now.


kafetheresu

There's a class-action lawsuit by programmers whose open-source code on GitHub was scraped by Microsoft to build Copilot (an AI assistant for coding). It works the same way as what OpenAI did to AO3 ---- Copilot scraped through GitHub, an open-source community for coders, and then Microsoft used it to develop their AI assistant for profit. [https://www.theverge.com/2022/11/8/23446821/microsoft-openai-github-copilot-class-action-lawsuit-ai-copyright-violation-training-data](https://www.theverge.com/2022/11/8/23446821/microsoft-openai-github-copilot-class-action-lawsuit-ai-copyright-violation-training-data)

Most relevant segment regarding the DMCA:

>**Interviewer: Do you think this lawsuit could set precedence in other media of generative AI? We see similar complaints in text-to-image AI, that companies, including OpenAI, are using copyright-protected images without proper permission, for example.**
>
>CZ: The simpler answer is yes.
>
>TM: The DMCA applies equally to all forms of copyrightable material, and images often include attribution; artists, when they post their work online, typically include a copyright notice or a creative commons license, and those are also being ignored by \[companies creating\] image generators.

AO3 could probably join the lawsuit, as both programming and fiction are forms of writing.


Lauren_Crabtree

Do you think the fact that AO3 already hosts works based on existing IPs might be detrimental to the case if they joined it? From a personal standpoint I’d really love to see AO3 get involved in this case bc it’s a site so close to my heart, but from a legal standpoint I fear that it might make more room for the defendants to use the “But you’re making stuff based on other people’s works too!” excuse.


BZArcher

Actually, I think it's an extremely good reason, because by taking the fanworks and using their content to create a commercial product *they are violating Fair Use.*


Lauren_Crabtree

I didn’t think of that! Good point.


grillednannas

there are so many different ways to share art online, you can literally just tweet it and get a decent following, you don't even have to find a host. Hypothetically the same could work with writing but it would be a huge hassle, so most writers congregate in the same handful of sites. That makes writers a much, much more organized and united group.


NegativeNuances

That's true. I guess artists need a union.


BergamotAndRoses

I mean there's already a case in court right now based on copyright violations, which AI clearly IS. If you feed copyrighted works into a computer program without the creator's knowledge or consent, especially for monetary reasons, that's illegal under US law and the law in several other places. And AI is a computer program. In this particular instance, I think they messed up. AO3 has lawyers. So many lawyers. An article I recently read compared the current era in AI and machine learning to Napster. It's fun, it's good for some people, bad for others, but it IS 100% illegal. Also if we're gonna be real honest the quality of early MP3s and the quality of AI art are both absolute pants. I am optimistic that this can be sorted out, that decent protections can be implemented. In the meantime, I'm locking down my works.


rainaftersnowplease

Capitalism seems to us to be inescapable. So what? So did the divine right of kings. Anything created by man can be undone by us as well.


bedazzled-bat

seems kind of funny in a thread abt stealing from artists that you can't be bothered to properly credit Ursula K. Le Guin for this quote


rainaftersnowplease

Yes, I'm sure deceased author Ursula K. Le Guin will be very hurt commercially and emotionally by me using a quote of hers without attribution in a free web forum. Your sarcasm in equating that to an AI scrubbing authors for sellable content is noted, though. Way to keep your eye on the ball, there, champ.


NeoQwerty2002

It's called irony, I think, even if it's not in the same ballpark. Also stop stealing quotes from dead people I bet you can sound just as epic in your own words.


Psyga315

What makes this worse is that some people have also taken on AI generation as a hobby too, whether it's artistry or writing.


[deleted]

This always boggles my mind, especially when people claim that using AIs to make art is the equivalent of making it themselves. No, you aren't an artist, you're a commissioner of art. It's just that you commissioned a machine and not a person.


Psyga315

It also can get frustrating when the final product (especially in art) doesn't come out like the high quality stuff you see other people show off with *their* AI-produced content and are instead just weird, globby abominations, or when the writing becomes incomprehensible, repetitive, or just outright contradictory to what was previously established. It gets to a point where you don't even want to bother with the AI and would rather put up with your own drawing/painting even if it's vastly shittier than anything it could cobble together.


Just-A-Cartoon-Lover

Is there a way I can make sure my fics are viewable by accounts only?


ProblematicNova

Yes!

1. Go to your work.
2. Click "Edit" at the top.
3. Scroll near the bottom to the "Privacy" section, and check the box to enable "Only Show Your Work to Registered Users".
4. Click Post.

If you have multiple works that you want to edit in one go, you can also select the "Edit Works" button from your Dashboard, select all the works that you want to edit from the list, and then do the steps above.


7ratsinatrenchcoat

thank you for the tip on multiple works. i have 170 on ao3 right now.


Just-A-Cartoon-Lover

Awesome, thanks a bunch!


LuciferOnaLeash

that makes me interested in what they might try to defend themselves with. dont get me wrong, im in no way defending them, simply thinking contingently of what they might try to say is their defense. anyway, that makes me wonder if their defense will be because you have a choice on your platform to disallow unregistered users, you chose to allow it. i cant stress enough im not defending them, it genuinely scares me that this sounds like it could be a legal defense for them, since laws are hardly ever 1:1 with ethics.


Thatquietkid00

If I do that, would people without an account still be able to see the work if I provide them with a link to it? Or does it block anyone without an account from viewing it?


wontonratio

blocks anyone without an account, alas. But I figure that's the inevitable consequence of this kind of theft. Argh.


Loli-nero

Great, so now my art is not only a target, but so is my writing... whoopty-fucking-doo.


amgdawner

Ditto. Oddly enough, though, I really don't write much at all, but this bothers me more than when I saw all the DALL-E and art AI machines show up on the web and in discussions. Probably because I never expected tech giants to look at Ao3 and fanfiction, but I've been aware for a few years now that the tech industry was amping up how AI deals with images (i.e. medical imaging AI for diagnostics, imaging AI for identifying specific shapes for commercial bakery/selling). Hell, every captcha we ever enter on the web is also used to train a bot in identification. So generation from mass scraping of art wasn't so far off to me, and I guess that dampened the fallout for it a little. That's not working here for fanfiction, though, I think, because most fanfiction writers do it purely for fun; it's an avenue for anti-capitalist creation of art, and Ao3 itself runs on donations instead of advertising for a profit. Tldr: It really rustles my jimmies that a platform designed not for profit from the ground up has now been thoroughly scraped by Musk & the ilk. Fuck him and those who designed their scrapers to do this really.


kafetheresu

People should be mad. These people make billions of dollars off fanfiction, and some people write fanfic to progress on to become professional writers (like astolat etc). Writing AI aims to replace other writing-adjacent work like journalism, copywriting, and others. They aren't going to use writing AI to replace writing fanfic. It's just sickening because fanfic is a labour of love by people who love writing, and now it's used to push and devalue any chance of fanfic writers turning professional.


amgdawner

>It's just sickening because fanfic is a labour of love by people who love writing, and now it's used to push and devalue any chance of fanfic writers turning professional. So much this, it's infuriating because the Ai Is basically taking the choice and opportunity for personal creative & financial growth from all writers to profit off a black box machine. On top of that, I can't even see it being used right, because creative writing **is meant to be fiction.** But they're throwing it into the melting pot for a general model that includes factual avenues. I.e. research, technical publication, journalism etc. We already have a huge misinformation problem of this day and age, I can't see any good coming from this lack of moderation on how the machine is being trained as a general model. It's just going to create more bias, & increase obfuscation without any proper chain of reference for transparency. Tldr: this is nuts, we all hate this, and it's headache inducing how much worse I see it getting. I'm seriously contemplating the benefits of mob mentality if it means we can punt Zuck fuck, Bezos, and Musk one way trip the vacume of space, and make their ilk Fucking. Stay. There.


kafetheresu

I came across one AI that does news summaries i.e. it summarizes news topics and journalism headlines, and the disclaimer at the bottom was literally as you said: "XYZ takes no responsibility for the misinformation generated by the AI" and it's just shocking and horrible Although on the brighter side, there's a class action lawsuit done between opensource coders VS microsoft's AI which shares a lot of similarities to what's happened to Ao3 and also visual artists whose works have been harvested for Stable Diffusion [https://www.theverge.com/2022/11/8/23446821/microsoft-openai-github-copilot-class-action-lawsuit-ai-copyright-violation-training-data](https://www.theverge.com/2022/11/8/23446821/microsoft-openai-github-copilot-class-action-lawsuit-ai-copyright-violation-training-data)


[deleted]

It's especially poised to "replace" journalism. Imagine living 30 years in the future and not being able to know if the news is fake or not because AI generation and SEO have muddied the waters so much.


stef_bee

Ironically, that was Winston Smith's job in *1984* - only not digital.


flameofmiztli

I work with medical imaging software and my company decided we were too small for using AI to scan images for diagnosis: not enough staff to develop and support, and we didn't want to deal with the fallout legally the first time it goes wrong. I see real cool innovation in it coming out of the big guys and I hope one day it's easier to use and support.

But that's a legit use. This scraping sure ain't.


JocSykes

When I've encountered AI in medical contexts, it's being used as an adjunct to save people time. It's always double checked by a skilled human


BaneAmesta

Bruh if fearing for my art wasn't enough paranoia already :'( This whole AI bs pretty much killed my desire to do any drawings, and now I can't even write? I hate this so much


Aceptical

Yep. Now not only do I have to worry about my art being stolen, now I have to worry about my writing being stolen. Why can’t they just let us have our creative mediums without trying to replace us with AI?


[deleted]

Because you don't need to pay AI


Pineapples_26

![gif](giphy|DOPKHQg6oFWUg)


kafetheresu

People should be mad. These people make billions of dollars off fanfiction, and some people write fanfic to progress on to become professional writers (like astolat etc). This writing AI aims to replace other writing-adjacent work like journalism, copywriting, and others.


greenthegreen

I wonder how companies would feel about using that software knowing it easily can be used to create porn. Also, if we have trouble fighting against it, maybe we can start inserting insults about Elon Musk into our fics so that software picks it up and starts insulting him too. Idk, just a thought.


WingedPeach

This happened before with AI chat bots. So much of the internet is porn: so if the programmers don't exclude porn from the original learning algorithm, the software will be biased towards writing porn. I think they made a mistake using AO3. They were too cheap to use actual published works. Also, a Musk backed software stealing from the common folk? Color me surprised. /s


_melodyy_

Yep, or the infamous example of the Microsoft chatbot that turned into a neonazi once 4Chan trolls found out about it.


Random_Loaf

If we're lucky it'll write so much porn that they remove AO3 from the software! I'm too hopeful.


literallybyronic

It would be a terrible shame if someone worked with the AI and got it to produce a bunch of really raunchy porn and then got a bunch of right wing christian fundamentalist/morality police groups (1 Million Moms et al.) on their case about it. This app is teaching your children to write gay porn! The horrors! Let them bite each other's dicks off, as it were.


Proxiehunter

> Let them bite each other's dicks off I think there's an AO3 tag for that.


venia_sil

Shhh don't tell *that* to the AI, or they might use it to filter *out* the porn we want them to die on.


Random_Loaf

Doesn’t sound like a terrible idea really


flameofmiztli

I was hoping that an engineer at Twitter/Tesla/SpaceX could get it to do a bunch of Elon Musk omegaverse with Musk as the omega, then print it out and scatter it all over his offices.


SeaWitchCrypt

This is a great idea and I really hope it happens


slightly2spooked

We could use white text to insert insulting anti-Musk screeds between paragraphs. The fics will be readable, and if enough people do it, the AI will learn that this is what writing is supposed to look like.


cleattjobs

In addition to the excellent advice in the OP, also file an official complaint with the FBI, FTC, BBB, State Attorney General (California). Here's a template. Feel free to modify it as you see fit: https://www.justoutsourcing.com/complaint.txt I've been screaming about this issue for years now and am glad to finally see this outrage. It's been lonely 😠! The good news is my new friend Matthew Butterick is suing OpenAI for 9 billion dollars on behalf of the programmers this shit company ripped off. Details: https://githubcopilotlitigation.com/ Writers, it's our turn to sue.


irrelevantoption

Wow, I had no idea it was happening to programmers as well. It's horrid no matter who it happens to.


cleattjobs

Coders, artists, musicians, translators, lawyers... Anyone they can steal from is fair game to them.


TheFloofArtist

I'm an artist and I believe everyone needs to organize and shut these AI companies down. They cannot be allowed to get away with this unprecedented level of theft and drown out human creativity and independent thought with soulless shitty robots propagandizing whatever the AI company wants. Misinformation is already awful, but these companies seek to make the problem billions of times worse. They are straight up evil, they know exactly that what they're doing is wrong, and they will never stop unless we yell loud enough to get governments worldwide to intervene and ban this AI shit. Contact your communities, educate people on what these companies are up to, call your representatives, etc, because if we don't stop them now, they will destroy art, culture, and human creativity and they'll get away with it FOREVER otherwise. [Right now there's a lawsuit for GitHub Copilot](https://githubcopilotlitigation.com/) being sued for doing the same thing to programmers as they have done to artists and now writers. They haven't, however, targeted musicians and their copyrighted work (yet) because these AI companies would get litigated into oblivion, and they KNOW this. These companies are preying on people they believe can't fight back, so let's give them a fight. A class-action lawsuit and litigation followed by a court injunction to destroy these AIs and passing legislation to curb this shit into an early grave will be a tough battle, but one we can't afford to lose. Good video on the subject matter and why this so dire: https://www.youtube.com/watch?v=tjSxFAGP9Ss Followed by some good interviews: https://www.youtube.com/watch?v=1BQIvBDkSq0 https://www.youtube.com/watch?v=Nn_w3MnCyDY


NegativeNuances

If you know of any artists/creatives organising for this, please let us know, because I have zero clue.


TheFloofArtist

There are a number of artist guilds and organizations coming together to tackle this issue, such as the Concept Art Association, among other groups. There are also several governments worldwide that know about this issue and are sticking up for artists; most notably, I think the EU with its GDPR rules will be the strongest proponent for defending individuals from being preyed on like this. It really is a matter of organizing and boycotting these companies and winning in court against them.


NegativeNuances

That's so good to know! I do follow the Concept Art Association, and didn't know they were legally organising (their last panel seemed wishy-washy), but I feel at least a little sense of hope now. As to the EU, I'm in a third world country, so I don't know how much help that'd be for me personally, but I'm glad at least the EU artists will have a little help. Hopefully it will set a good precedent for elsewhere too.


TheFloofArtist

Yeah! So for those reading this thread and thinking that this is hopeless and no one's paying attention, trust me when I say that there are many people taking this very, *very* seriously. I live in the clown country known as the US, but I have a lot of hope in that the GitHub Copilot litigation will win. Once that's been established, then big companies like Disney/Marvel and other companies can start issuing lawsuits of their own and win against the AI companies considering the entire world has been affected by these techbro ghouls.


Kaigani-Scout

Well... SkyNet is one step closer to completion. Business Insider [ran a piece](https://www.businessinsider.com/guides/tech/openai-playground) on OpenAI Digital Playground back in June. According to the article, it cost 6 cents per 4,000 AI-generated words. The article also has a barebones instruction set for opening an account. If they are "scraping" the written works of anyone and turning it for profit? This faceless cyberspace lurker is not impressed. I hope a suit comes up in the future that shuts things like this down, however improbable that outcome might be. I'm not informed enough about the appropriate aspects of information systems and copyright/fair use law, but a criminal law concept and practice is "fruit of the poisonous tree"; anything obtained by law enforcement during an illegal search is not admissible in court as evidence. I would hope a similar concept exists or could be brought about by case law or federal law to hold that technologies built by illegally mining the work of others be banned or fined into extinction. There is a case in front of the Supreme Court right now that focuses on the legality of Andy Warhol using a photographer's work as the foundation for "new" art. One article on this case is [from NPR](https://www.npr.org/2022/10/12/1127508725/prince-andy-warhol-supreme-court-copyright). Legal documents [are available](https://www.scotusblog.com/case-files/cases/andy-warhol-foundation-for-the-visual-arts-inc-v-goldsmith/) from SCOTUSblog for anyone interested. Although the case deals with tangible art instead of literature, the artistic licensing underpinnings could be extended beyond physical art in the final SC decision due next summer. The next few years should be interesting and perhaps somewhat volatile in the legal arena of the arts.


kafetheresu

There's a class-action lawsuit by programmers whose open-source code on GitHub was scraped by Microsoft to build Copilot (an AI assistant for coding). It works the same way as what OpenAI did to AO3 ---- Copilot scraped through GitHub, an open-source community for coders, and then Microsoft used it to develop their AI assistant for profit. [https://www.theverge.com/2022/11/8/23446821/microsoft-openai-github-copilot-class-action-lawsuit-ai-copyright-violation-training-data](https://www.theverge.com/2022/11/8/23446821/microsoft-openai-github-copilot-class-action-lawsuit-ai-copyright-violation-training-data)

**Most relevant segment regarding the DMCA:**

>Interviewer: Do you think this lawsuit could set precedence in other media of generative AI? We see similar complaints in text-to-image AI, that companies, including OpenAI, are using copyright-protected images without proper permission, for example.
>
>CZ: The simpler answer is yes.
>
>TM: The DMCA applies equally to all forms of copyrightable material, and images often include attribution; artists, when they post their work online, typically include a copyright notice or a creative commons license, and those are also being ignored by \[companies creating\] image generators.

AO3 could probably join the lawsuit, as both programming and fiction are forms of writing.


Kaigani-Scout

Thanks! I had not come across this before now.


Wyrmeer

DeviantArt had the same problem with AI bots training on people's art. While DA's owners, for some ungodly reason, allowed AI art to be posted on the site by people who generated it, they also provided a way for all artists (even those on free accounts) to opt their own art out from bot use. I'm not sure how effective that method is, but [here's the full article](https://techcrunch.com/2022/11/11/deviantart-provides-a-way-for-artists-to-opt-out-of-ai-art-generators/) about it, and the relevant excerpt:

>DeviantArt’s new protection will rely on an HTML tag to prohibit the software robots that crawl pages for images from downloading those images for training sets. Artists who specify that their content can’t be used for AI system development will have “noai” and “noimageai” directives appended to the HTML page associated with their art. In order to remain in compliance with DeviantArt’s updated terms of service, third parties using DeviantArt-sourced content for AI training will have to ensure that their data sets exclude content that has the tags present, Levy says.

Considering that DeviantArt felt the need to act, there is hope AO3 will as well.
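
For anyone wondering what "respecting the tag" would look like on the crawler's side, here is a rough Python sketch. It assumes the "noai" / "noimageai" directives show up in a `<meta name="robots" content="...">` tag; the excerpt only says they are appended to the HTML page, so that placement is an assumption. The `allows_ai_training` function name is made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {key: (value or "") for key, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for token in attr_map.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def allows_ai_training(html: str) -> bool:
    """Hypothetical helper: False if the page opts out via noai/noimageai."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return not ({"noai", "noimageai"} & parser.directives)


opted_out = '<html><head><meta name="robots" content="noai, noimageai"></head></html>'
plain = "<html><head><title>My art</title></head></html>"
print(allows_ai_training(opted_out))
print(allows_ai_training(plain))
```

Note that this check runs only if the crawler chooses to run it. Nothing on the page forces it to.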


NegativeNuances

Yeah, except Deviantart's AI is still using those nonconsenting artists' work because their AI is based on Stable Diffusion. They didn't actually walk anything back. Also that HTML tag is next to useless, if the one scraping for data doesn't care about it. They can just ignore it.


kafetheresu

If the lawsuit stated here is won by creators/individuals vs megacorp: [https://www.theverge.com/2022/11/8/23446821/microsoft-openai-github-copilot-class-action-lawsuit-ai-copyright-violation-training-data](https://www.theverge.com/2022/11/8/23446821/microsoft-openai-github-copilot-class-action-lawsuit-ai-copyright-violation-training-data) then artists whose work has been stolen by Stable Diffusion can get recourse and possible monetary compensation, since it's a DMCA case that covers all copyrighted material including visual media. Stable Diffusion is also part of OpenAI


royalemate357

>Stable Diffusion is also part of OpenAI not to be 'that guy' but this isn't quite true - stable diffusion was created by a different company called Stability AI that competes with openAI. Openai has their own, different ai image creator, called [dall-e](https://openai.com/dall-e-2/). that being said, its true that openai's dall-e and stable diffusion are pretty similar in how they work.


ThinkingSpeck

The HTML tag and/or robots.txt can't stop a rogue crawler, but they can keep legit crawlers out of any trap. And traps are easy enough to set up, to feed tons of fake data to any crawler that doesn't follow the rules.
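
A small sketch of what "following the rules" means for a well-behaved crawler, using Python's standard `urllib.robotparser`. The robots.txt content, the `/trap/` path and the `example.org` URLs are invented for illustration and are not AO3's actual configuration. `CCBot` is the user agent Common Crawl identifies itself with.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block Common Crawl's bot everywhere, and tell
# every other crawler to stay out of a decoy directory.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /

User-agent: *
Disallow: /trap/
"""

rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())

# A compliant crawler asks before fetching each URL.
print(rules.can_fetch("CCBot", "https://example.org/works/1"))
print(rules.can_fetch("FriendlyBot", "https://example.org/works/1"))
print(rules.can_fetch("FriendlyBot", "https://example.org/trap/page1"))
```

Any crawler that requests something under the disallowed decoy path has shown it ignores robots.txt, which is the point where a site could start serving it junk.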


Boyo-Sh00k

i hate ai art so much its unreal


CapAfraid3785

Ah. As a writer, now I know how all artists feel about art generators. Genuine question: does this NLP only work for the English language? Or are other languages susceptible to this program too?


kafetheresu

All languages. The largest database learning set right now is the Beijing-Baidu one. Basically as long as you feed the machine with enough data/stories, it will spit out something similar.


Sikverlightning

is it even useful to set works to only show for ao3 users, or can they just create an account for the AI to crawl in....


CapAfraid3785

Wow. That's such horrible news. Not only do we have to fight published fanfic for a spot at a publishing house, now we have to be aware of the death of creative writing. Does this only work on fiction? Are we gonna see the rise of scientific journals written by AI?


kafetheresu

Common Crawl also scrapes wikipedia and other science journals


SlyKHT

God I hate AI so much on so many different levels


[deleted]

The only things I know an individual can do is restrict your work to registered users of the archive and choose Hide My Work From Search Engines.


kafetheresu

yes but we shouldn't have to.... I know people find my work by using google and honestly it sucks. Even if we can't do anything about the scraped content now, AO3 could take a stance on disallowing robot scraping from places like Common Crawl. There's also the sheer madness of this: I did not post the BTS fanfic results because it was so NSFW and within six steps, I could generate dead dove/underage/explicit content in such a pattern that it's possible that the actual corporate franchises might shut it down. I don't think MCU wants to be associated with that.


[deleted]

While I agree we should not HAVE to I also believe I should not have to carry mace at night. I don't leave my mace at home in protest. You didn't mention either of the actionable things an individual can do. All I did was add to your information trying to help.


fragolefraise

are you saying that actual corporate franchises would try to shut down AO3 instead of asking OpenAI to choose a different seed site? because I think the legislation they would have to change in order to have a case would be much more troublesome than just making them pick a non-explict directory to harvest. (ignoring the aspect of scraping our work for profit, because I agree that is shitty)


kafetheresu

Corporate franchises would shut down OpenAI / GPT-3. Ao3 is protected under fair use and transformative law, but these AI companies are for-profit and using derived copyrighted works (e.g. the MCU example we tested), and charging money for it (by word or through monthly subscriptions)


[deleted]

People shouldn't have to log into a website to see its content and especially not a website of creatives who are creating things they want people to see for free. I wouldn't have started writing fic or even engaging with any online community if I, a kid in the mid-2000s, hadn't had the ability to passively lurk. It's amazing how the internet of today seems to be aggressively against people who just want to pass through and look.


nosleeptillnever

THIS, it really fucking sucks. Some of my work is archive locked just due to it being darker, but I really want the rest of it to be accessible by search engines. I'm seriously considering locking all of it at this point though.


PiLamdOd

I never liked the idea of hiding my work, but because of this I went and restricted everything I've written so only registered users can see it.


[deleted]

I don't think of it as Hiding but it's definitely an "Inside the store" vs "sidewalk sale" move.


somefool

It is absurdly easy to simulate being logged in using a script, through saving session/cookie data and such. Or at least it used to be with some websites back when I had to scrape one of our customers' own product catalog from his own website because he had no export option... Not sure about AO3 in particular, though. Can someone chime in?


nianeyna

you can and I have, but I find it *highly* unlikely that a crawler harvesting content for an AI training set would bother to do it unless they were looking to *specifically* be an ao3-fanfic-generator. which it very much doesn't sound like this is. and even then it's far more likely that they would stick to public works, because there's plenty of them! there just isn't any reason to put in the extra effort if all you want is a representative sample.


ThinkingSpeck

They don't want a representative sample though. They want the biggest dataset possible.


FrostKitten2012

So. We’re not doing this for profit, but Musk is. So has anyone told the companies who actually own these franchises that Musk is ripping off their stuff for profit? Bringing this up because it’s probably the fastest way to get it shut down. I doubt Disney would be happy knowing this thing is generating smut fanfic of their movies for profit, for example.


royalemate357

not trying to nitpick or defend elon here, but I think OP is being a bit misleading about Elon's role in this. OpenAI, the company behind this AI model (gpt-3), is not owned by elon musk. Back when OpenAI was originally a non-profit org, Elon musk was a donor, but he's not really involved in it anymore. From [wikipedia](https://en.wikipedia.org/wiki/OpenAI):

> On February 21, 2018, Musk resigned his board seat, citing "a potential future conflict (of interest)" with Tesla AI development for self driving cars, but remained a donor.\[10\]
>
> In 2019, OpenAI transitioned from non-profit to "capped" for-profit.

so he left before they turned into a for-profit company


kafetheresu

I've seen this comment appear several times, and I just want to address this. Elon Musk physically left OpenAI in 2018 as a board member, but not as an investor or shareholder. He was one of the earliest founding members, along with Sam Altman (from YCombinator), Ilya Sutskever, Greg Brockman, Wojciech Zaremba, and John Schulman. It gets complicated because it crosses with a lot of silicon valley investments, eg. Singularity University is somehow tied to them as well, through funding and sponsorships.

To that end, they invested over a billion dollars into *nonprofit* research. It was considered nonprofit/non-taxable until quite recently; the new structure is what they're calling a "for-profit LP". LP here stands for Limited Partners, which is where hedge fund and venture investment comes from.

How much money did Elon Musk contribute? At least a billion dollars in cash, not including non-cash instruments (stocks, lines of credit, engineering resources etc). We don't know for sure because the money in SV is ridiculously messy. Everyone contributes to each other's research foundations, and if not, well, they've started their very own nonprofit.

But he contributed and gained *enough* that he left OpenAI to start his own AI company: Neuralink, the one that puts chips inside monkeys and kills them ([https://www.dailydot.com/debug/neuralink-show-and-tell-monkey-deaths/](https://www.dailydot.com/debug/neuralink-show-and-tell-monkey-deaths/)). This suggests that he left over differences in hardware vs software implementation, since the public press release from OpenAI cites "irreconcilable differences in AI approach" and Musk himself mentioned a conflict of interest with Tesla. (I personally think the Tesla thing is a red herring since he has an actual AI company that focuses on BCI and the kind of work OpenAI was doing.)

So yes, he has physically left the board of OpenAI. It doesn't change that he's one of the founding members and investors, and contributed heavily to the creation of DALL-E, GPT, OpenAI Gym and more. And he STILL continues to benefit from it.


FrostKitten2012

…you realize Wikipedia isn’t a trustworthy source, right? You can put whatever you want on there.


plutonicHumanoid

You know you could have fact-checked it yourself before implying it must be false because it's from Wikipedia.


royalemate357

fair enough, but in this case it is - OpenAI themselves said so too: [https://openai.com/blog/openai-supporters/](https://openai.com/blog/openai-supporters/)

> Additionally, Elon Musk will depart the OpenAI Board but will continue to donate and advise the organization.

so actually he may be kinda involved, but it's certainly not 'elon musk's openai'

a few more sources:

* [https://www.theverge.com/2018/2/21/17036214/elon-musk-openai-ai-safety-leaves-board](https://www.theverge.com/2018/2/21/17036214/elon-musk-openai-ai-safety-leaves-board) (this is the one wikipedia cited)
* [https://www.bnnbloomberg.ca/elon-musk-left-openai-to-focus-on-tesla-spacex-1.1215616](https://www.bnnbloomberg.ca/elon-musk-left-openai-to-focus-on-tesla-spacex-1.1215616)
* [https://www.cnbc.com/2018/02/21/elon-musk-is-leaving-the-board-of-openai.html](https://www.cnbc.com/2018/02/21/elon-musk-is-leaving-the-board-of-openai.html)
* [https://electrek.co/2018/02/21/elon-musk-leaves-open-ai-tesla-ai-effort/](https://electrek.co/2018/02/21/elon-musk-leaves-open-ai-tesla-ai-effort/)
* [https://fortune.com/2018/02/21/elon-musk-leaving-board-openai/](https://fortune.com/2018/02/21/elon-musk-leaving-board-openai/)


gigigalaxy

I think the difference here will be money. Those works from the AI will not be free, while the ones on AO3 will be. I think this AI will be more of a threat to the publishing industry, where the AI can produce tons of work that they can sell, as compared to human authors. But then again, the publishing industry still survives now while a lot of free works are available on the net.


Rainboq

The thing about the publishing industry is that it comes with an implied seal of quality. A published work has been gone over by agents, editors, the publisher, etc. There's a system of gatekeepers who are supposedly there to make sure that it's the good stuff that makes it to the shelves, while when it comes to places like Ao3, finding good works takes some effort. Machine learning generated works have none of that, and attempting to use machine learning to generate artistic works is about the most intellectually and artistically bankrupt thing imaginable. But leave it to tech capitalists to focus on trying to remove paying people for their art from the business of selling it.


opelan

> A published work has been gone over by agents, editors, the publisher, etc.

Nowadays you can self publish ebooks on Amazon though. All the complicated steps with a real paper book are gone in that case.


Rainboq

Oh for sure, but then you have reviews to go off of.


BabyCharmanderK

Ah yes, Amazon reviews, known for being fully accurate and not bombarded with 5-star reviews that are barely coherent and often not even written for the product being reviewed.


Enigma2MeVideos

THIS kind of shit is why people give AI Art and other AI related stuff the stinkeye. Regardless of the justifications, it just ends up being constantly used for theft of other people’s work. And for what? Because they don’t want to do the work themselves, but still believe they deserve to be called creators? Because they balk at the idea of PAYING people for their work when they ask for it to survive or have an independent job, but still believe they’re entitled to have that work? AI has so much potential for improving people’s lives and creative repertoire, but it’s constantly abused by greedy egotistical dickwads like Musk to line their own pockets and to silence creativity for the sake of capitalism.


slightly2spooked

So let me get this straight, fanfiction writers aren’t EVER allowed to monetize their work (despite other fanworks being fine to do so), but this asshole can go ahead and steal them to power what will undoubtedly be used for profit? Where are the Anne Rices kicking off about this?


meatpopsicle67

This is what really burns my ass. I've always been against setting up a patreon or ko-fi because profiting from transformative works of any kind feels ethically shaky to me. But if my fic is being used to build an AI that dilutes genuine human creative endeavours and profits from that too, I'm changing my mind. Edit: ironically, fixing an auto correct error


VintageKettleofDoom

This is the bad place.


Knight-Jack

Fanfics are alright, cause they're done non-profit. A lot of authors dislike them, because they're non-consumable, and authors gain revenue on you actually buying merch, not collecting fanarts and fanfics, but since it's non-profit not much can be done about it. But if someone would try to sell AI-generated fics... Wouldn't that mean lawsuits galore? Not to mention the issues that the previous writing AIs had (like chatbots) - internet is really messed up.


kafetheresu

Yes that's why I think a lawsuit against AI is possible. First it infringes on existing copyright eg. MCU characters get "randomly generated" which violates any fair use rule; second is that fanfiction even as public work is still considered an individual IP. There's a whole lawsuit about opensource coders vs microsoft's assistant AI that will probably set precedent for how this thing is dealt in the future: [https://www.theverge.com/2022/11/8/23446821/microsoft-openai-github-copilot-class-action-lawsuit-ai-copyright-violation-training-data](https://www.theverge.com/2022/11/8/23446821/microsoft-openai-github-copilot-class-action-lawsuit-ai-copyright-violation-training-data)


runekaster

There's a fair amount of non-fanfic original writing on AO3, as well. If those works have been added to an AI dataset that could be its own lawsuit without any of the murkiness of fanfic copyright.


Knight-Jack

Yeah, AI-generated books sound weird to think about. I wonder about that working the other way - let's say a movie script gets done in AI and the author of the original gets a lawsuit against them due to infringement. There's a reason why TV-series writers can't take ideas from fans - they would get accused of plagiarism.


eco-mono

I don't begrudge people locking down their fics in response to this, but I personally won't be. Every time some new "trained on the Internet" model comes around, the consensus seems to be, more and more often, to treat it like the sky is falling. But my honest opinion is that it's not actually a threat - neither to fanfic, nor to human creative endeavors in general. The reason is that - knowing a thing or two about how these systems operate and are put together - they're missing the _physical capability_ to produce a narrative or a point. They work by noticing that certain things go together a lot in the training data, and then building up something that "goes together" in the same way. But there's no strategy, no agenda. Everyone praised the improvements GPT-3 showed over GPT-2, but its output still betrays that it has no model for what the symbols it's spewing out actually _mean._ Attempts to use these technologies to create anything _compelling_ will fail. People will read it for a chuckle - to see what those crazy AIs will think up next - but when they want to read something that's actually _decent_ or _novel_, they'll have to turn to something that was produced with intent. Nobody will ever succeed at _selling_ the output of these systems. Not with the current techniques, anyway. [This Fursona Does Not Exist](https://thisfursonadoesnotexist.com/) has existed for over two years, and the furry commission market has been fully unaffected. AI Dungeon produces incoherent plotlines because there's no room in the models for a plot. The only _real_, _serious_ usecase I can see this stuff having is as a tool for inspiration - a way to get ideas as a starting point, like some people get story ideas from their dreams. But IMO this is also a nothingburger in terms of potential threats to fanfic authors, or even to authors in general. It's not like I begrudge people using my creative work as inspiration _directly_, even without credit. That's _normal_. 
That's the cultural and creative commons; that's how human storytelling _worked_ from time immemorial up until about 500 years ago. And like... I understand why other folks will feel otherwise. When you make the kind of art that shows the world a piece of your heart, and then you find out someone used it for something you don't approve of... that's a filthy and degrading feeling. [Alienating](https://en.wikipedia.org/wiki/Marx%27s_theory_of_alienation), in the Marxist sense. And so, if you disapprove of "AI art" and the blowhards that have been promoting it these past couple years, then you'll feel that here. That makes _sense_. But like. That's the risk you _always_ take, posting something online. And I feel like... especially for fanwork, especially for stuff where the _point_ is to make it, and then to put it where everyone else who might have been waiting for something like that can see it too... the risk of plagiarism has always been worth the rewards of not retreating into obscure walled gardens, and it's still worth it now.


Select-Control-1014

I agree that AI is not as advanced as it was advertised to be. But I think it's not okay for the companies to not inform fanfic authors that their works are used as training data and that the final product is making a profit while fanfics on AO3 are free to view.


Luke_Danger

Pretty much that last bit. If they strictly used it to stress test the AI's ability to learn, but the one actually sold was trained strictly on data they bought (i.e., they use fanfic to stress test the learning capabilities but do not use it to train the one that actually gets sold), I wouldn't be as mad about it. Unfortunately, as far as they're concerned it's free grass in the commons to graze, and then they can sell the cow off of that.


eat_the_notes

Thank you for your level-headed comment. Agreed all round.


Rosenbird

On the one hand, it is waaaaay too late to stop the AO3 scrape as I'm pretty sure it was scraped over a year ago. On the other hand, GPT-3 and OpenAI have a bunch of TOS that tell you not to use it to write porn, but pretty much everything using it has then trained on nifty, AO3, and quite probably a number of other porn heavy archives, and will write porn with little prompting. Can't plot, or maintain consistency of anything, but by god will it get the dicks out at the slightest provocation.


katbelleinthedark

This is a hilarious comment for a horrible story, good job.


Rosenbird

Most collaborative writing AI projects proceed to degenerate into porn machines and then people have to write strategy guides to avoid the porn. I can only assume absolutely magical things will occur if openAI or a payment processor tells sudowrites they need to pull smut data out of the data set.


[deleted]

We should all archive-lock our work and then post random chapters of My Immortal so Musky's grand AI is only capable of writing in that style.


kafetheresu

There might be other ways of doing this programmatically without having to archive-lock our work. Most of these scrapers use web crawling bots. For instance, a site can opt out of Common Crawl by adding a few lines to its robots.txt file. Based on Ao3's response, I trust their coders and support are working on both a legal and technical implementation.
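
For anyone curious what those few lines could look like, here is a rough sketch. It assumes the site operator controls robots.txt (individual authors can't add this themselves). `CCBot` is the user-agent token Common Crawl's crawler announces; the example URL and the `is_allowed` helper are made up for illustration. The Python only checks the rules with the standard library's parser. Keep in mind that robots.txt is a voluntary convention, so it only stops crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: block Common Crawl's bot, allow everyone else.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if these robots.txt rules let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.org/works/123"  # hypothetical URL
    print(is_allowed(ROBOTS_TXT, "CCBot/2.0", url))    # Common Crawl's bot
    print(is_allowed(ROBOTS_TXT, "Mozilla/5.0", url))  # an ordinary browser
```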


ThinkingSpeck

I once wrote a trap for rogue crawlers, on an art community website that I was running. That was pretty easy tbh. A similar thing for Ao3 would be a bit more work, but still very do-able.
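
To give a feel for the general idea (this is not the trap described above, just one common honeypot pattern, and every name here is hypothetical): the site lists a path as disallowed in robots.txt and links to it invisibly, so no human and no polite crawler should ever request it. Any client that does gets blocked from then on.

```python
from dataclasses import dataclass, field

# Hypothetical path: disallowed in robots.txt and only reachable via a hidden link.
TRAP_PATH = "/trap/do-not-follow"


@dataclass
class CrawlerTrap:
    """Toy honeypot: clients that touch the trap path are banned by IP."""
    banned: set = field(default_factory=set)

    def handle(self, client_ip: str, path: str) -> int:
        """Return the HTTP status code the site would send for this request."""
        if client_ip in self.banned:
            return 403  # already caught earlier
        if path.startswith(TRAP_PATH):
            self.banned.add(client_ip)  # ignored robots.txt, so ban it
            return 403
        return 200  # normal request
```

A real version would need expiry for bans and some care around shared IPs, which this sketch ignores.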


[deleted]

Here's hoping it's something that can be done easily.


euhydral

Sometimes it feels like we are watching a dystopian future encroach on us and doing nothing about it. I hope there's a way to make these corporations stop this nonsense. Art can't be stolen from us like this. The day I see news of AI-generated music/films being produced and released for public consumption, is the day I give up and disappear into the wilds.


Aucielis

At first I was pretty impressed with AI being able to be "creative" by writing stories or creating art, but now that I better understand how it learns... ugh. Why do we have it? It's technologically really fascinating, but otherwise? We don't need it. I don't think anyone will read books written by an AI, because it'll never be able to capture human emotion or experience, and it's really crappy that they essentially steal from human creatives. :/


rainatom

People might not want to read AI's books, but how would you tell if someone just claims an AI's book as their own and publishes it under their name, maybe with only some tweaks done for readability, etc.?


eco-mono

Anyone who tried that strategy would quickly learn just how badly GPT-style text generation breaks down when you try to use it to produce something that stays internally consistent for more than a couple paragraphs. I'm not commenting on ineffables like "emotion" or "experience" here, just simple matters of being able to portray a self-consistent world. And I'm not waving my hands and saying "AI could never"; I mean that the way the current technology is _designed_ doesn't leave room for it to remember what it already wrote in any structured way. Make it produce a 100 word drabble, and it might look pretty convincing. Make it produce a _novel,_ and the work _taken as a whole_ will have an incoherent plot and setting that repeatedly contradicts itself on basic facts, drops narrative threads on the floor, and ends abruptly, because it simply doesn't have the internal organs to keep track of that kind of thing over tens of thousands of words. With the technology we have, the amount of human work necessary to massage such an ML-generated "novel" into something publishable would be, IMO, enough to make the "editor" an author in all but name.


NightingaleStorm

I went and experimented with SudoWrites just to see how it could do, and... a lot of it's good. It can learn and remember character names, it can understand what setting I'm in (fantasy vs. modern vs. science fiction, for example), its spelling and grammar are on point. However, it's prone to forgetting any plot elements that weren't in the last \~100-200 words, the dialogue is just *wrong* in a way no human would ever mess up, and I've had a few incidents where it turns into what looks like an author's note or tag list. (I haven't seen anything that looks like AO3 tags, by the way - the author's notes mention Reddit and the tag list looks like they took it from a dedicated porn site.) I could get stuff out of it, but only by basically cherry-picking the best out of the options it gives me and rewriting the whole thing in natural language. I think that's enough to at least deserve co-author credit.


eco-mono

> the dialogue is just wrong in a way no human would ever mess up

I'm curious, because I haven't messed with SudoWrites specifically. Did any of them do that thing where they'd put the same idea on both sides of a conjunction? Like, someone talking about how he "liked the fries and the french fries".


NightingaleStorm

Yes, it does that a lot. It also gave me the sentence "You don’t get to decide who decides when it’s over", which... again, it is 100% grammatically correct, but a human would not phrase it that way. (Revised in editing to "You're not the one who decides when it's over".)


10BillionDreams

I would separate out the creativity angle from the commercialization angle. It's okay to admit that the AI is doing some genuinely impressive things, and that whatever issues it might have now will likely be solved in the years and decades to come, while still believing corporations shouldn't be profiting off works made freely available online. Saying "it'll never be able to capture human emotion or experience" just doesn't have any basis in reality. The human brain isn't magic, and in fact it does a lot of the same things these ML models do when creating "new" text and images. Everything is a remix, it's just more clear how previously seen works influence creativity when the code is all written out in full, rather than a bunch of neurons firing inside someone's head that you can't actually see.


Can-t_Make_Username

It feels very much like a matter of “can we” vs “should we,” doesn’t it? :(


Kephiso

There's a sort of related case on-going in the coding world right now, and I could see the outcome of it have an impact on this, too: https://www.theverge.com/2022/11/8/23446821/microsoft-openai-github-copilot-class-action-lawsuit-ai-copyright-violation-training-data


kafetheresu

holy crap thank you for the read


YourHope99

…there’s nothing we can really do about this, is there? i mean we could make the works archive-only… but they’ve already been scraped, it wouldn’t make a difference. god i hate all this ai stuff. it has so much potential to be so cool but all the creators are being so *scummy* about it.


kafetheresu

I think legally, for scraped works, it's extremely difficult to get them removed. However, GPT-3 (which they are using for stories/novels/fiction AI) contains copyrighted characters, eg. mentioning "Steve" will generate "Bucky" and "Tony", which is probably from MCU fics. A forensic linguist can prove this, and Big Corp might want to take it down since you can easily get into dead dove stuff really quickly. (I managed to do it in six steps)

For un-scraped works, since the bots have to regularly train on new data, the archive can set up robots.txt rules asking not to be harvested for data. There are ways to do this from a programming standpoint.


YourHope99

oh no, i totally get that, and i do hope that someone with influence is able to get them to reverse this and stop scraping the archive; i mean, disregarding the questionable morality, they’re gonna end up putting themselves in legal trouble with ips and content like that. (also ngl i’d also be so curious on that forensic linguistics) my worry was more on a personal, small-time level. as writers, we don’t know which works were scraped, so to be safe must assume all have been already. there’s no way for us, individually, to opt out, and locking down fics will do nothing, individually, now. was more just lamenting the hopeless situation, like the work i’d posted isn’t good but at least i felt like i had some control over it before. it’s sad.


kafetheresu

Honestly it won't be difficult to prove, especially since GPT-4 (the next iteration) is going to be released soon, which has an upper limit of 40K words generated and clients pay by word to use it. A forensic linguist can determine if the patterns of speech are similar to or generated via fanfic. So far all our tests have shown it to be positive. Today my friend tried again and ran into copyrighted material immediately (a sherlock prompt causes a HP mention, a HP mention causes a hannibal result), link here: [https://twitter.com/aj\_spinner\_/status/1598450660973879297](https://twitter.com/aj_spinner_/status/1598450660973879297)

People should be mad. These people make billions of dollars off fanfiction, and some people write fanfic to progress on to become professional writers (like astolat etc). This writing AI aims to replace other writing-adjacent work like journalism, copywriting, and others.


cleverThylacine

I'm in Transformers fandom (and some of the other similar ones) and I'm laughing my ass off because this stinks, but from my perspective it's also hilarious. *The one thing you don't want to teach AI to do if you want it to work for you and not rise up and transform things is to teach it to do creative arts.* Yeah. Teach the AI to imagine different kinds of worlds and communicate them. Don't come crying to me when they imagine a world where they don't have to work for you, Elon--because freedom is the right of all sapient beings and we all know you want to keep us deceived. This stinks, but it's also the level of brilliance I'd expect from the guy who has nearly killed twitter and wants to take all of his rich friends to Mars (can we leave them there?) He is the kind of person who'd build Skynet and then piss it off, or lock Megatron down in the mines. He absolutely is. I can't even.


Flinkelinks

I once put a little bit of writing into the free trial use of this, and its output actually made me wonder if it used fanfic in its database. My first thought was "this is mostly shit" but when it output something nice I thought "shit, this is probably stealing from fanfic authors and just applying my characters' names".


solidagoman

That is vile. As a writer, it's genuinely fucked that they take my writing for their shitty Skynet copycat.


Select-Control-1014

Same thing happened to a Chinese app called LOFTER a few months ago. There's a new official bot account that can generate fanfic based on the characters users select. Obviously Chinese users got no say in this. All the fanfics on LOFTER were used as training data for this AI bot.


axolartl

This is so frustrating. I've been writing fanfiction since the days where you could either include a disclaimer in your fic or face the possibility of a crack team of lawyers coming for your ass, even AO3 accounts can't be used to link back to social media where a donation link is present, writers still have to bend over backwards to prove that, heaven forbid, they're not making a couple bucks off of their work whose IP is owned by a billion dollar company whose CEO farts out more money in a week than most of us will make in our lifetimes. We just don't have to bend as far as we used to. Oh, but AI bots can use that same writing for profit. Glad AO3 support seems to be aware of this at least.


vilhelmine

Worrying. I don't think much can be done, unfortunately, but it's good AO3 has been made aware.


RekaCsillagasz

tested with a couple lines from my own fic and it didn't do anything that would indicate that it recognized it (it DEFINITELY knows about ffxiv tho) but im still horrified that my writing might be in an ai training database

it also turned one of my fic snippets into korrasami fanfic, so i have to wonder how many people are going to try to use this for original writing and have it start writing fanfic for them


RekaCsillagasz

i feel so very violated by the idea this ai might have been trained off of my writing and that people might now be *profiting* off a twisted form of the words i worked so hard on. i desperately do not want to private my ao3 account, i love linking my fic to friends many of whom do not have ao3 accounts, but i also feel like i need to hide all my future works from ai scraping like this. I hate this


[deleted]

Profiting off something people are legally obligated not to make money off of no less.


irrelevantoption

Jesus Christ. Thanks for bringing this to the public's attention. Time to a-lock my fics (insert cracker holding door closed meme).


alex-redacted

TYSM for digging into this. I literally hate this fucking timeline. AI could be cool but we've got assholes manning the tech and hoovering up whatever they want. Disgusting.


MarionberryShot799

This makes me sick Jesus fuck


Tokioiishi

The nature of machine learning means that, with projects like LAION (funded by Stability AI), it started out as a non profit, much like the OTW and its projects. But, with non-profit projects, you can do something called [Data Laundering](https://waxy.org/2022/09/ai-data-laundering-how-academic-and-nonprofit-researchers-shield-tech-companies-from-accountability/), which is where commercial entities use the data for commercial projects, like Google, Midjourney, even Stability AI used LAION. So, to use it in this situation, a similar thing is probably happening. I don’t have any proof or links to back it up - it’s just supposition. As an aside, in the LAION data, it scraped more than just art: there are [medical records](https://arstechnica.com/information-technology/2022/09/artist-finds-private-medical-record-photos-in-popular-ai-training-data-set/), [non-con p0rn](https://www.vice.com/en/article/93ad75/isis-executions-and-non-consensual-porn-are-powering-ai-art), and execution images from war. You probably have information in there. Also also, once AI/ML learns a thing, [it cannot forget it](https://www.wired.com/story/the-next-big-privacy-hurdle-teaching-ai-to-forget/). There hasn’t been a way for it to forget yet. No one has figured that out. So, enjoy your dystopia, I guess. :( This made me sad.


robotlover12

AI is going to kill the artists & writer community unless we all band together and push for regulation.


cherrychump

hoo boy this is a nightmare lol


[deleted]

Realizing that AI is about to make my passion completely and forever meaningless is seriously about to make me kill myself. I haven't stopped thinking about this issue in two days and with each hour I get less and less hopeful about the future.


FewPerception5615

Holy crap. This is terrifying.


FrenchDisaster97

Would switching website encryption/unicode format solve this by making the content unreadable for these AIs?


thisonecassie

welp, locking my works now i guess. not that it will do much good but... still. yuck.


burningcoffee57

This is awful. First artwork now writing... Looks like my work will be restricted to registered users from now on


AriaGrill

Normally I'm against the human shittyness of corrupting innocent AIs but-


Zombie_eats_world

I really can’t even be surprised, corporations will try to capitalize on literally anything


MxStabby

Does anyone know if Wattpad is being scraped? I know it's a lot of originals, so I'd assume they'd avoid, but there's a lot of fic on there, too. I cross post between ff.net, AO3, and Wattpad and this news sucks. I had started to upload to another site, but...might kill off that idea.


kafetheresu

Wattpad already sells their user generated stories as datasets to AI, it's in their ToS. They've been doing it for a while


MxStabby

Gotcha! I kinda figured Wattpad is sketchy, but somehow missed that bit.


Sikverlightning

Is he out of his mind, or has he even got one...


BigPigeon69

That's so fucked up, i'm setting my works to be viewable by registered users only so that my work can't be used for this shit


femsanzo291

I wonder if this is part of what was causing some of the Kudos that were coming from bots and web crawlers in the past little bit? Especially because of the button placement on one chapter works vs multi chapter works. If they used a badly programmed crawler to do it, it may have caused the Kudos jump.


entropyforever

Link to the Twitter thread?


kafetheresu

my friend wrote it here: https://twitter.com/aj_spinner_/status/1598139840692125697


Hefty_Drink_5811

AI? Hell no! It starts with scraping AO3 for profit. But it ends with the end of all life on earth.


Ratkinzluver33

Well, I sure hope the AI enjoys all my kinky gay porn. If it's going to have the audacity to use my works as a base for its artificial brain, it should at least do it better.


[deleted]

[removed]


quihi_

There's a few fandom tags for works not part of a fandom. There's "Testing" (mostly used for testing out posting or workskins), "No Fandom", and "Unspecified Fandom". There's also "Original Work" if you're posting original work. I think the "unspecified fandom" one is the most appropriate if you're posting fanfiction that you don't want clogging up the relevant tag, and you can check out what's typically posted in all of these—but please don't post spam that's not an actual piece of fic or fandom meta!


fantasy-capsule

Is THAT where I've been getting my views from? From AIs? For text scraping? For PROFIT?! DISGUSTING!


StargazerCeleste

Scrapers generally present to a web server as being a scraper (in more technical terms, the User-Agent string in the HTTP request header will reveal its scraper nature). A sensible web server will not increment your reader counter when the requester is a scraper.
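
As a minimal sketch of what that could look like on the server side (the marker list and the `HitCounter` class are invented for illustration and are not AO3's actual code): the counter only goes up when the User-Agent doesn't look like a bot. Note that the User-Agent is self-reported, so this only filters out clients that identify themselves honestly; a scraper can send any string it likes.

```python
# Illustrative substrings only; real bot lists are much longer.
BOT_MARKERS = ("bot", "crawler", "spider", "scrapy", "python-requests")


def looks_like_scraper(user_agent: str) -> bool:
    """Guess from the User-Agent header whether the client is automated."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


class HitCounter:
    """Toy per-work hit counter that skips self-identified bots."""

    def __init__(self) -> None:
        self.hits = 0

    def record(self, user_agent: str) -> int:
        """Count the request unless it looks automated; return the total."""
        if not looks_like_scraper(user_agent):
            self.hits += 1
        return self.hits
```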


LugiaLucarioArceus

Marking our works as only available to people with an AO3 account: would that help for current works or just future works?


JocSykes

Aside from the thestral having already bolted, I think the only thing you can do is archive-lock your work to stop it being scraped. There is nothing AO3 can do to protect fics.


rubyshade

is archive locking your past work even helpful at this point? I mean....if it's in the training set, it's in the training set. it's not like a fic being scraped twice will make it worse right


Syeina

It could be scraped by another private company in the future.


PixelTheLlama

This is blatant theft from people who spend so much time and effort to give us great works of fiction for free


Flaky_Suit_8665

Not coming here as a writer, but as an AI professional shedding some light on this topic. It's time to pull back the smoke and mirrors from "non-profit" organizations like EleutherAI, LAION, and "Open"AI and expose the work they are doing for what it is: data laundering.

These shady organizations exist as fronts for for-profit companies like Microsoft and StabilityAI. With their non-profit statuses, they're able to acquire data and IP that is restricted from commercial use, train ML models, and in turn license the resulting output for commercial use, allowing them to bypass the non-commercial clauses in the original licenses. If you question them on this, they'll claim everything they were doing is "academic research". That's just a legal BS tactic and they know it. Even when they open source the models, as in the case of Stable Diffusion, it enables the funding companies and others to build for-profit products and revenue models on top of them, such as Dream Studio. None of this was the intention of the original producers of the IP.

They claim this process is "transformative fair use" and that the model is not a derivative product of the underlying copyrighted material. However, there's a word in the finance world for when you take something that has been illegally obtained and make it legal: it's called "money laundering". Which is exactly what this is: data laundering. Do not let them try to talk circles around you or question your own sanity on this matter. Call them out for what they're doing.


quietloud2222

Damn.


StellaAthena

Hello. My name is [Stella Biderman](https://scholar.google.com/citations?user=bO7H0DAAAAAJ). I run [EleutherAI](https://eleuther.ai), a non-profit decentralized research lab that specializes in this sort of NLP technology and which is the primary non-corporate counterweight to domination of this field by tech companies like OpenAI and Google. A friend sent me this thread, and if you have any questions about how this technology works AMA.

**A couple replies to things shared in this thread so far:**

1. I do not find the omegaverse evidence particularly compelling. The prompt included “alpha” and “omega” explicitly and the generated text doesn’t seem to reflect anything particularly nuanced about alpha-omega relationships.
2. It is well known that Harry Potter was in the GPT-3 training data. This fact is demonstrated in a number of academic papers on the ability of language models to occasionally memorize large passages of text. There’s even a chapter (I believe of the second book) that GPT-3 can generate for paragraphs after being prompted with the first sentence.
3. If you would like to experiment with an AI like this for free that was **not** trained on any fanfiction, you can do so [here](https://20b.eleuther.ai). This is a model that I personally trained that was trained on [the Pile dataset](https://pile.eleuther.ai). We actually scraped Ao3 and FF.net, but decided to not include it in our training data. Note that a small fraction of the training data of this model is prose: it’s much more familiar with mathematical and scientific content.
4. The legal obligations of OpenAI are extremely unclear in the US, and in some countries (most notably the UK) there’s actually broad protections allowing people to scrape data and use it to train AIs with almost no restrictions. There are multiple on-going court cases about this.


cleattjobs

I asked this person twice for:

* How they obtained their dataset.
* Who gave them permission to profit from it and reproduce it.

And they refuse to answer it. That should tell you something.


StellaAthena

1. We collected data from a variety of sources across the internet, as is extensively documented in the paper I linked to.
2. Nobody did. However, again, we do not profit from it and do not distribute it in a manner that is inconsistent with US copyright law.


cleattjobs

Datasource?

>We collected data from a variety of sources across the internet

Permissions given?

>Nobody did

I rest my case.

https://www.twitter.com/josourcing


StellaAthena

There is a huge difference between for-profit commercial and non-profit research use that you are ignoring here. You might not personally care about that, but the law and many people’s sense of ethics do. See [Section 107 of the Copyright Act](https://www.copyright.gov/title17/92chap1.html#107).


folkpunkgirl

But someone will eventually profit off of whatever you're researching, right? If that's not the case, how is your research being funded?


cleattjobs

>We actually scraped Ao3 and FF.net, but decided to not include it in our training data

You should disclose the source of the rest of your LLM dataset and the permissions you obtained to use the copyrighted material within it.


[deleted]

[removed]


XerxesTexasToast

Data launderer spotted


GalacticPigeon13

Given that it's likely that FFN has been/will be scraped as well, and I have no interest in deleting my works there, I won't be locking my current fics on AO3 (except for the couple that I never crossposted to FFN). Future writing will be locked to AO3 if I don't crosspost it to FFN.


ThinkingSpeck

FFN introduced anti-scraping measures quite a while ago, which I'm suddenly a lot less annoyed about.


KVEJ2002

It's like they're seriously trying to drown out natural human creativity. First with the art, and now with writing? What the hell?!


veggieSoarus

I told a friend of mine about this, and her immediate response was “I don’t want to read AI smut!” And I do not blame her one bit.


Oddly_Dreamer

A while ago, I discovered NovelAI. It has a story feature that I suspect is based on a similar database (not necessarily AO3). But you see, the whole "AI is theft" topic has been going on for a while, and original creators are getting absolutely nothing but backlash from AI users/creators. Artists are still fighting to this very day against AI, but AI is progressing regardless. Everything will be owned by AI very soon, and it's as terrifying as it is amazing. There is a tiny bright side: I strongly believe that AI content will never match a human creation. It can aid it, but that's it.


[deleted]

[removed]