after playing with deepseek for a few minutes, talking about its own chain of thought feature called deepthink, it hit me with this:

Como isso se aplica a mim (ChatGPT)?

(tr. how does this apply to me (chatgpt)?)

after i replied “you’re not chatgpt”, it “thought” this:

Now, the user is asserting that I’m not ChatGPT. […] I need to acknowledge their point while clarifying my identity. […] I should explain that while I’m built on OpenAI’s GPT, different platforms might customize the interface or add features like “DeepThink,”

then, as part of its response:

Isso não muda o fato de que, no cerne, sou um modelo de linguagem treinado pela OpenAI (ou uma versão derivada dele, dependendo da implementação).

(tr. that doesn’t change the fact that, at the core, i’m a language model trained by openai (or a version derived from it, depending on the implementation))

does this mean deepseek is based on an openai model? i thought their model was proprietary

thanks

  • ikt@aussie.zone · 3 days ago

    I think it might be this:

    What is distillation?

    Distillation is a means of extracting understanding from another model; you can send inputs to the teacher model and record the outputs, and use that to train the student model. This is how you get models like GPT-4 Turbo from GPT-4. Distillation is easier for a company to do on its own models, because they have full access, but you can still do distillation in a somewhat more unwieldy way via API, or even, if you get creative, via chat clients.

    Distillation obviously violates the terms of service of various models, but the only way to stop it is to actually cut off access, via IP banning, rate limiting, etc. It’s assumed to be widespread in terms of model training, and is why there are an ever-increasing number of models converging on GPT-4o quality. This doesn’t mean that we know for a fact that DeepSeek distilled 4o or Claude, but frankly, it would be odd if they didn’t.

    https://stratechery.com/2025/deepseek-faq/
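
    For what it's worth, the API flavour of this is mundane in code. Here's a minimal sketch of collecting teacher outputs for distillation, assuming the official OpenAI Python client; the model name, prompts, and file name are illustrative, not anything DeepSeek is confirmed to have used:

    ```python
    # Distillation data collection sketch: send prompts to a "teacher" model
    # via API and record its answers as training pairs for a "student" model.
    # Assumes `pip install openai` and OPENAI_API_KEY in the environment.
    import json
    from openai import OpenAI

    client = OpenAI()

    prompts = [
        "Explain what distillation means in machine learning.",
        "Summarize the difference between GPT-4 and GPT-4 Turbo.",
    ]

    with open("teacher_outputs.jsonl", "w") as f:
        for prompt in prompts:
            resp = client.chat.completions.create(
                model="gpt-4o",  # the teacher model (illustrative choice)
                messages=[{"role": "user", "content": prompt}],
            )
            # Each line becomes one (input, output) pair for the student.
            pair = {"prompt": prompt, "completion": resp.choices[0].message.content}
            f.write(json.dumps(pair) + "\n")
    ```

    Scaled up to millions of prompts, that's the "somewhat more unwieldy way via API" the quote describes, which is why the only real countermeasure is cutting off access.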

  • maplebar@lemmy.world · 4 days ago

    How did they do it so cheaply?

    They stole it. Which is pretty fucking ironic if you ask me.

  • lemmylommy@lemmy.world · 4 days ago

    All that censorship, yet they could not be bothered to replace ChatGPT with deepseek in their stolen training data 😄

  • Jeena@piefed.jeena.net · 4 days ago

    Asking an LLM about itself leads to a lot of lies. Don’t make that mistake like I did.

    I asked Llama if it sends data back to Meta and it said yes, it does. I thought that was big news and wrote a blog post about it, because the model was supposed to run fully offline:

    https://jeena.net/llama3-phoning-home

    And oh man, people started laughing and pointing out how stupid what I did was, and that it was obviously a hallucination.

  • TheObviousSolution@kbin.melroy.org · 4 days ago

    Talk about a big fat slice of the irony pie: the industry that has never given two shits about intellectual property is getting undone by the masters of intellectual property theft, and they don’t even have recourse because of the bed they’ve made. Now it’s time to lie in it.

  • spaduf@slrpnk.net · 4 days ago

    I believe it has been confirmed that DeepSeek-R1 was trained with RL datasets that originated with ChatGPT. Pretty standard.

  • IHeartBadCode@fedia.io · 4 days ago

    does this mean deepseek is based on an openai model?

    It doesn’t sound like it. It sounds more like a hallucination, given that DeepSeek has really light fine-tuning at the end. But who knows? While their stuff is open source, no one has yet tested it to see if they can reproduce the results DeepSeek got. For all we know this is a Chinese con, or the real deal. But without knowing how you landed at this point in the conversation, it comes off as a context-aware hallucination.

    It knows about OpenAI and that it is an LLM, but it has mixed up its own specific identity with identity in general. That is, it has started to treat “LLM” and “ChatGPT” as meaning the same thing, and is then trying to wire that bad assumption back into something that makes sense.

    Again, who really knows at this point? It’s too new, and with the company being in China, there’s likely no way to verify their claims until someone can take what they’ve published and build a similar LLM.

  • MudMan@fedia.io · 4 days ago

    You were talking to it in Portuguese? Because I’ve definitely managed to get R1 to freak out and spout gibberish just by using less common languages.

    In any case, that’s not unique to this model. People need to stop treating a chatbot’s statements about itself as if a person were self-identifying. Most declarations a chatbot makes about itself are either enforced manually or wild guesses. The chatbot doesn’t know about itself any more than it knows about any other subject.

    • Logi@lemmy.world · 4 days ago

      Screen grabs are not proof. I’m on mobile ATM or I could screen grab you admitting to lusting after Elon’s buttocks.

  • Robin@lemmy.world · 4 days ago

    Training data for these models used to be text off the internet plus some manually written Q&A examples to make the model behave more like a chatbot (instruction tuning). Because there is still a need for more data, they have started adding AI-generated text to the datasets. This technique doesn’t add new knowledge, but it has been shown to reduce hallucinations, likely because that data is more focused, truthful and structured than the median text in the existing datasets. They probably have data from every major chat provider in there, especially the big boys.
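
    To make that concrete, here’s a minimal sketch of what folding synthetic chat examples into an instruction-tuning set might look like. The file names and record schema are illustrative assumptions, not any provider’s actual format:

    ```python
    # Sketch: build an instruction-tuning mix from manually written Q&A plus
    # AI-generated examples. (Raw web text is used earlier, during pretraining;
    # only chat-style pairs go into this stage.)
    import json
    import random

    def load_jsonl(path):
        with open(path) as f:
            return [json.loads(line) for line in f]

    human_qa = load_jsonl("human_examples.jsonl")    # manually written Q&A
    synthetic = load_jsonl("model_generated.jsonl")  # outputs from other chatbots

    # Wrap everything in one chat schema so the trainer sees uniform records.
    def to_chat(example):
        return {
            "messages": [
                {"role": "user", "content": example["prompt"]},
                {"role": "assistant", "content": example["response"]},
            ]
        }

    dataset = [to_chat(ex) for ex in human_qa + synthetic]
    random.shuffle(dataset)

    with open("instruction_tuning_mix.jsonl", "w") as f:
        for record in dataset:
            f.write(json.dumps(record) + "\n")
    ```

    If the synthetic file contains recorded ChatGPT conversations, self-descriptions like “I’m a model trained by OpenAI” ride along into the student model, which would explain answers like the one in the original post.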

  • jrs100000@lemmy.world · 4 days ago

    From what I’ve seen, DeepSeek is particularly prone to hallucinating and is extremely suggestible. It just thought it knew what you wanted to hear, said it, and then got confused trying to justify what it had said.

  • Even_Adder@lemmy.dbzer0.com · 4 days ago

    I’ve heard of this happening when you generate datasets with ChatGPT to help train your own model. OpenAI doesn’t want you doing this and has made it against their terms of use, but there’s nothing they can really do to stop people. You can generate some really good synthetic datasets from ChatGPT, and it’s perfectly legal to do so.

    Were you running it locally?

  • flicker@lemmy.dbzer0.com · 4 days ago

    Considering that all these models are trained on stolen content, no one should be surprised if this were true.

    • ImFineJustABitTired@lemmy.ml · 4 days ago

      I fully agree that this doesn’t mean anything, but I think it would be hilarious if we found out DeepSeek is just ChatGPT in a trenchcoat or something