after playing with deepseek for a few minutes, talking about its own chain of thought feature called deepthink, it hit me with this:

Como isso se aplica a mim (ChatGPT)?

(tr. how does this apply to me (chatgpt)?)

after i replied “you’re not chatgpt”, it “thought” this:

Now, the user is asserting that I’m not ChatGPT. […] I need to acknowledge their point while clarifying my identity. […] I should explain that while I’m built on OpenAI’s GPT, different platforms might customize the interface or add features like “DeepThink,”

then, as part of its response:

Isso não muda o fato de que, no cerne, sou um modelo de linguagem treinado pela OpenAI (ou uma versão derivada dele, dependendo da implementação).

(tr. that doesn’t change the fact that, at the core, i’m a language model trained by openai (or a version derived from it, depending on the implementation))

does this mean deepseek is based on an openai model? i thought their model was proprietary

thanks

  • ikt@aussie.zone · 3 days ago

    I think it might be this:

    What is distillation?

    Distillation is a means of extracting understanding from another model; you can send inputs to the teacher model and record the outputs, and use that to train the student model. This is how you get models like GPT-4 Turbo from GPT-4. Distillation is easier for a company to do on its own models, because they have full access, but you can still do distillation in a somewhat more unwieldy way via API, or even, if you get creative, via chat clients.

    Distillation obviously violates the terms of service of various models, but the only way to stop it is to actually cut off access, via IP banning, rate limiting, etc. It’s assumed to be widespread in terms of model training, and is why there are an ever-increasing number of models converging on GPT-4o quality. This doesn’t mean that we know for a fact that DeepSeek distilled 4o or Claude, but frankly, it would be odd if they didn’t.

    https://stratechery.com/2025/deepseek-faq/
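
    For what it's worth, the API flavour of this is mundane in code. Here's a minimal sketch of collecting teacher outputs for distillation, assuming the official OpenAI Python client; the model name, prompts, and file name are illustrative, not anything DeepSeek is confirmed to have used:

    ```python
    # Distillation data collection sketch: send prompts to a "teacher" model
    # via API and record its answers as training pairs for a "student" model.
    # Assumes `pip install openai` and OPENAI_API_KEY in the environment.
    import json
    from openai import OpenAI

    client = OpenAI()

    prompts = [
        "Explain what distillation means in machine learning.",
        "Summarize the difference between GPT-4 and GPT-4 Turbo.",
    ]

    with open("teacher_outputs.jsonl", "w") as f:
        for prompt in prompts:
            resp = client.chat.completions.create(
                model="gpt-4o",  # the teacher model (illustrative choice)
                messages=[{"role": "user", "content": prompt}],
            )
            # Each line becomes one (input, output) pair for the student.
            pair = {"prompt": prompt, "completion": resp.choices[0].message.content}
            f.write(json.dumps(pair) + "\n")
    ```

    Scaled up to millions of prompts, that's the "somewhat more unwieldy way via API" the quote describes, which is why the only real countermeasure is cutting off access.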

  • maplebar@lemmy.world · 4 days ago

    How did they do it so cheaply?

    They stole it. Which is pretty fucking ironic if you ask me.

  • lemmylommy@lemmy.world · 4 days ago

    All that censorship, yet they could not be bothered to replace ChatGPT with deepseek in their stolen training data 😄

  • Jeena@piefed.jeena.net · 4 days ago

    Asking an LLM about itself leads to a lot of lies. Don’t make that mistake like I did.

    I asked Llama if it sends data back to Meta and it said yes, it does. I thought that was big news and wrote a blog post about it, because the model was supposed to run fully offline:

    https://jeena.net/llama3-phoning-home

    And oh man, people started laughing and pointing out how stupid what I did was, and that it was obviously a hallucination.

  • TheObviousSolution@kbin.melroy.org · 4 days ago

    Talk about a big fat slice of the irony pie: the industry that has never given two shits about intellectual property is getting undone by the masters of intellectual property theft, and they don’t even have recourse because of the bed they’ve made. Now it’s time to lie in it.

  • spaduf@slrpnk.net · 4 days ago

    I believe it has been confirmed that DeepSeek-R1 was trained with RL datasets that originated with ChatGPT. Pretty standard.

  • IHeartBadCode@fedia.io · 4 days ago

    does this mean deepseek is based on an openai model?

    It doesn’t sound like it. It sounds more like a hallucination, given that DeepSeek has really light fine-tuning at the end. But who knows? While their stuff is open source, no one has yet tested it to see if they can reproduce the results DeepSeek got. For all we know this is a Chinese con, or the real deal. But without knowing how you landed at this point in the conversation, it comes off as a context-aware hallucination.

    It knows about OpenAI and that it is an LLM, but it has mixed up its own specific identity with identity in general. That is, it has started to treat “LLM” and “ChatGPT” as meaning the same thing, and is then trying to wire that bad assumption back into something that makes sense.

    Again, who really knows at this point? It’s too new, and with the company being in China, there’s likely no way to verify their claims until someone can take what they’ve published and build a similar LLM.

  • MudMan@fedia.io · 4 days ago

    You were talking to it in Portuguese? Because I’ve definitely managed to get R1 to freak out and spout gibberish just by using less common languages.

    In any case, that’s not unique to this model. People need to stop treating a chatbot’s statements about itself as if a person were self-identifying. Most declarations a chatbot makes about itself are either enforced manually or wild guesses. The chatbot doesn’t know about itself any more than it knows about any other subject.

    • Logi@lemmy.world · 4 days ago

      Screen grabs are not proof. I’m on mobile ATM or I could screen grab you admitting to lusting after Elon’s buttocks.

  • Robin@lemmy.world · 4 days ago

    Training data for these models used to be text off the internet plus some manually written Q&A examples to make the model behave more like a chatbot (instruction tuning). Because there is still a need for more data, they have started adding AI-generated text to the datasets. This technique doesn’t add new knowledge, but it has been shown to reduce hallucinations, likely because that data is more focused, truthful and structured than the median text in the existing datasets. They probably have data from every major chat provider in there, especially the big boys.
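
    To make that concrete, here’s a minimal sketch of what folding synthetic chat examples into an instruction-tuning set might look like. The file names and record schema are illustrative assumptions, not any provider’s actual format:

    ```python
    # Sketch: build an instruction-tuning mix from manually written Q&A plus
    # AI-generated examples. (Raw web text is used earlier, during pretraining;
    # only chat-style pairs go into this stage.)
    import json
    import random

    def load_jsonl(path):
        with open(path) as f:
            return [json.loads(line) for line in f]

    human_qa = load_jsonl("human_examples.jsonl")    # manually written Q&A
    synthetic = load_jsonl("model_generated.jsonl")  # outputs from other chatbots

    # Wrap everything in one chat schema so the trainer sees uniform records.
    def to_chat(example):
        return {
            "messages": [
                {"role": "user", "content": example["prompt"]},
                {"role": "assistant", "content": example["response"]},
            ]
        }

    dataset = [to_chat(ex) for ex in human_qa + synthetic]
    random.shuffle(dataset)

    with open("instruction_tuning_mix.jsonl", "w") as f:
        for record in dataset:
            f.write(json.dumps(record) + "\n")
    ```

    If the synthetic file contains recorded ChatGPT conversations, self-descriptions like “I’m a model trained by OpenAI” ride along into the student model, which would explain answers like the one in the original post.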

  • jrs100000@lemmy.world · 4 days ago

    From what I’ve seen, DeepSeek is particularly prone to hallucinating and is extremely suggestible. It just thought it knew what you wanted to hear, said it, and then got confused trying to justify what it had said.

  • Even_Adder@lemmy.dbzer0.com · 4 days ago

    I’ve heard of this happening when you generate datasets with ChatGPT to help train your own model. OpenAI doesn’t want you doing this and has made it against their terms of use, but there’s nothing they can really do to stop people. You can generate some really good synthetic datasets from ChatGPT, and it’s perfectly legal to do so.

    Were you running it locally?

  • flicker@lemmy.dbzer0.com · 4 days ago

    Considering that all these models are trained on stolen content, no one should be surprised if this were true.

    • ImFineJustABitTired@lemmy.ml · 4 days ago

      I fully agree that this doesn’t mean anything, but I think it would be hilarious if we found out DeepSeek is just ChatGPT in a trenchcoat or something