Office Space meme:

“If y’all could stop calling an LLM ‘open source’ just because they published the weights… that would be great.”

  • WraithGear@lemmy.world
    3 days ago

    So like an emulator. Or at least the PS2 ones, where you had to dump the BIOS from your own machine (or snatch someone else’s).

    But that’s my point! The data set is interchangeable, so it’s not what makes DeepSeek THE DeepSeek LLM. But without the data set it would be functionally useless, and there would be no possible way to satisfy your requirement for data set openness. You said there is some line in the sand where you might be satisfied with some amount of the data, but your argument states that the granularity must be absolute in order to justify calling it open source. You demand an impossible, unnecessary standard that other open source projects are not held to.

    • Prunebutt@slrpnk.net (OP)
      2 days ago

      The difference is that the dataset is baked into the weights of the model. Your emulation analogy simply doesn’t have a leg to stand on. I don’t think you know how neural networks work.

      The standards are literally the basis of open source.

      • WraithGear@lemmy.world
        2 days ago

        I made my level of understanding kinda open at the start. You say it’s not open source; most say it is, and they explained why, and when I checked, all their points were true, and I tried to understand as best I could. The bottom line is that we disagree because you say the training data and the weights together are an inseparable part of the whole, and if any part of that is not open then the project as a whole is not open. I don’t see how that tracks when the weights are open, and both they and the training data can be removed and switched for something else. But I have come to believe the response would just boil down to “you can’t separate it.” There really is nowhere else to go at this point.

    • xttweaponttx@sh.itjust.works
      3 days ago

      Just wanted to thank you both for this discourse! As somebody who’s interested in AI but totally ignorant of how the hell it works, I found this conversation very helpful! I would say you both have good points. Happy days to you both! 🙂

    • mamotromico@lemmy.ml
      3 days ago

      Just to add, a good chunk of newer emulators require you to get a dump of the firmware externally, not just the PS2 ones. Pretty much anything from the PS2 onwards is like that.