• WalnutLum@lemmy.ml · 2 years ago

    There are VERY FEW fully open LLMs. Most are the equivalent of source-available in licensing, and at best they’re only partially open source, because all they provide is the pretrained model.

    To be fully open source, they need to publish both the model and the training data. What matters is being “fully reproducible”, so that the model can be trusted.

    In that vein there’s at least one project that’s turning out great so far:

    https://www.llm360.ai/

    • umami_wasabi@lemmy.ml · 2 years ago

      Not just LLMs: all kinds of models are equivalent to freeware, i.e. the model itself plus the other essential bits needed for it to work. I won’t even call it source-available, as there is no source.

      Take Redis as an example. I can still go grab the source and compile a working binary. That doesn’t apply to ML models.

      Of course, one can argue that the training process isn’t deterministic, so even with the exact training corpus it won’t produce the same model, bit for bit, across multiple runs. However, I would argue that the same corpus provides the chance to train a model of similar or equivalent performance. Hence the openness of the training corpus is an absolute requirement for a model to qualify as FOSS.

      • WalnutLum@lemmy.ml · 2 years ago

        I’ve seen this said multiple times, but I’m not sure where the idea that model training is inherently non-deterministic is coming from. I’ve trained a few very tiny models deterministically before…

        • umami_wasabi@lemmy.ml · 2 years ago

          Are you sure you can train a model deterministically, down to each bit? Like, feeding the weights into sha256sum will yield the same hash?

          • WalnutLum@lemmy.ml · 2 years ago

            Yes, of course. There’s nothing gestalt about model training; fixed inputs result in fixed outputs.
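            To make the claim concrete, here is a toy sketch (a tiny linear model, nowhere near a real LLM, and model/hyperparameters are invented for illustration): with fixed seeds and fixed data, two training runs yield bit-identical weights, so their SHA-256 hashes match.

```python
import hashlib
import numpy as np

def train(seed: int, steps: int = 200) -> bytes:
    """Train a tiny linear model with gradient descent, fully seeded."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(64, 4))               # fixed synthetic "corpus"
    y = X @ np.array([1.0, -2.0, 0.5, 3.0])    # targets from known weights
    w = rng.normal(size=4)                     # seeded initialization
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)  # MSE gradient
        w -= 0.01 * grad
    return w.tobytes()                         # raw bytes, for hashing

h1 = hashlib.sha256(train(seed=42)).hexdigest()
h2 = hashlib.sha256(train(seed=42)).hexdigest()
assert h1 == h2  # same seed + same data -> same hash, bit for bit
```

            Real GPU training adds wrinkles (non-deterministic kernels, parallel reduction order), but frameworks expose deterministic modes precisely so this property can be preserved at scale.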

    • Canary9341@lemmy.ml · 2 years ago

      The importance is being “fully reproducible” in order to make the model trustworthy.

      Well that’s a problem, because even with training data that’s impossible by design.

      • WalnutLum@lemmy.ml · 2 years ago

        I’m not sure where you get that idea. Model training isn’t inherently non-deterministic, and making fully reproducible models is apparently LLM360’s entire modus operandi.

  • thayer@lemmy.ca · 2 years ago

    If a layman may ask, what are folks even using AI/LLMs for mostly? Aside from playing around with some for 10-15 mins out of simple curiosity, I don’t have a practical use for platforms like ChatGPT. I’m just wondering what the average tech enthusiast uses these for, outside of academia.

    • Eggyhead@kbin.run · 2 years ago

      I teach language. I get paid for my time in front of students, not for the time it takes to prepare their lessons and materials. I use AI to quickly reference grammar rules, to generate example dialogs for specific scenarios to practice, and to suggest in-class activities targeting the grammar at hand. I never do exactly what it says; I just treat it as a source of suggestions to build from.

      • thayer@lemmy.ca · 2 years ago

        That sounds like a time saver for sure. I imagine that some of those elements (grammar rules) are widely available everywhere, while others (practice dialogues, activity suggestions focused on the use of language) would require a fairly specific training model.

        • Eggyhead@kbin.run · 2 years ago

          Well, LLMs are quite literally trained on language, so asking it to simulate a conversation between a hotel clerk and a guest who is upset that they can’t find the hair dryer is pretty much what it’s best at doing.

          You can even build the dialogs with students: have them propose a scenario for the LLM to flesh out, then have them suggest variables to apply, such as the clerk being hungry and in a bad mood while the guest is actually drunk after returning from a club, to see how the language changes. Then have the students act it out for laughs.
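          The "scenario plus class-supplied twists" recipe is easy to script; a hypothetical prompt builder (function name, level code, and wording are all made up for illustration) might look like:

```python
def dialog_prompt(scenario: str, twists: list[str], level: str = "B1") -> str:
    """Assemble an LLM prompt from a scenario and twists suggested in class."""
    twist_text = "; ".join(twists) if twists else "none"
    return (
        f"Write a short dialog for {level}-level English learners.\n"
        f"Scenario: {scenario}\n"
        f"Twists suggested by the class: {twist_text}\n"
        "Keep it under 12 lines and make the language natural."
    )

# Example: the hotel-clerk scenario from above, with two twists applied.
prompt = dialog_prompt(
    "A guest complains to the hotel clerk that they can't find the hair dryer",
    ["the clerk is hungry and in a bad mood", "the guest is drunk"],
)
```

          The resulting string is then pasted (or sent) to whatever model you use; regenerating with different twists shows the class how register and vocabulary shift.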

    • snek_boi@lemmy.ml · 2 years ago

      A friend of mine and I have gotten used to using it during our conversations. We do fast fact-checking or get a first opinion on silly topics. We often find it faster than digging through search-engine results and interpreting scattered information. We’ve used it for thought experiments, intuitive or ELI5 explanations of topics we don’t really know about, finding peer-reviewed sources for whatever we’re interested in, or asking questions that would be harder to operationalize into effective search-engine queries than to ask in natural language. We always ask for citations and links so that we can discard hallucinations.

      • thayer@lemmy.ca · 2 years ago

        Thanks for sharing! I’m probably too set in my ways to ever utilize AI for things like this. I never use virtual assistants like Alexa or Google either, as I like to vet and interpret the source of information myself. Having the citations would be handy, but ultimately I’d want to read them myself, so the AI/VA just becomes an added step.

    • ErwinLottemann@feddit.de · 2 years ago

      we use it to classify data that needs to be sent to one of three endpoints. chatgpt tells our tool where each item belongs. there are probably more practical ways to do this, but the customer wanted AI in his product, so here we are 🤷
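      A sketch of how such a router might look. Everything here is illustrative, not the commenter's actual tool: the endpoint names, category labels, and model choice are invented; only the OpenAI chat-completions REST endpoint itself is real.

```python
import json
import os
import urllib.request

# Illustrative endpoint names; the real product would define its own three.
ENDPOINTS = {
    "billing": "https://api.example.com/billing",
    "support": "https://api.example.com/support",
    "sales": "https://api.example.com/sales",
}

def route(label: str) -> str:
    """Map the model's one-word answer onto a known endpoint; default to support."""
    return ENDPOINTS.get(label.strip().lower().rstrip("."), ENDPOINTS["support"])

def classify(record: str) -> str:
    """Ask the model which endpoint a record belongs to (needs OPENAI_API_KEY)."""
    payload = {
        "model": "gpt-4o-mini",
        "messages": [
            {"role": "system",
             "content": "Answer with exactly one word: billing, support, or sales."},
            {"role": "user", "content": record},
        ],
    }
    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        answer = json.load(resp)["choices"][0]["message"]["content"]
    return route(answer)
```

      Constraining the model to a fixed label set and normalizing its answer before routing is what keeps an approach like this tolerable: a free-form reply never reaches the endpoint logic.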

  • umami_wasabi@lemmy.ml · 2 years ago

    What’s FOSS AI? A model everyone can download and use for free? Or, in the OSS spirit that everything needs to be open and without discrimination of use, an open training corpus and no AUP attached?

    Or you mean the inference engine running those models?

      • umami_wasabi@lemmy.ml · 2 years ago

        So you’re including free models (like freeware) from non-big-tech, not FOSS only.

        Your choice of models will be quite limited, as the compute resources and training corpus needed to make a viable base model aren’t something just anyone can muster.

        • irreticent@lemmy.world · 2 years ago

          Agreed, but there have been big projects that were open source. I can imagine* an AI (LLM) being developed fully FOSS. It would be rare, but I can see it happening if a big foundation got behind it, maybe Mozilla, or another that tries to keep the spirit of its mission statement.

          *Imagine: I’m not too familiar with all of the current, public, and free models out there, just a few. This was just me making a hopeful guess about whether it might actually be happening now.

  • just_another_person@lemmy.world · 2 years ago

    I’m just convinced all of y’all asking about this are in a huge circle jerk that never ends, but refuse to understand how it all works.

    A model is a model. It’s a simplified way of narrowing down confidence thresholds. It’s a pretty basic sorting algorithm that runs super fast on accelerated hardware.

    You people seem to think it’s like fucking magic that steals your soul.

    Don’t send information over the wire, and you’re golden. Learn how it works, and stop asking dumb questions like this is all brand new, PLEASE.

    • SomeLemmyUser@discuss.tchncs.de · 2 years ago

      There is a difference between a general scare about the AI buzzword and legitimate distrust of online services that are closely connected to American spying institutions (regardless of whether they are AI or not).

      If my calorie-tracker app appointed a (former) NSA official to its board, I would be looking for alternatives too. This is not about AI; this is about a company with huge sets of private data being closely interconnected with American spy institutions.

      It’s sad that you don’t seem able to distinguish between legitimate security questions and badly informed hype/scares as soon as a buzzword like AI comes up.

        • SomeLemmyUser@discuss.tchncs.de · 2 years ago

          I did read that part, and while it is generally true, there are use cases for such large models. Some of them require the input of personal data (find bugs in my code, formalize this email, scan this picture for text and translate it, draw an anime version of this picture of my friend Tom).

          So people being wary of the security implications of such large models are certainly not

          in a huge circle jerk that never ends, but refuses to understand how it all works.

          Sure, you can call everyone who uses AI the mainstream way (putting in personal data) dumb and attribute it to an unwillingness to understand, but that doesn’t match reality. Most people don’t even understand how an operating system functions, which components work online and which offline, and who can access which of their information, let alone how “AI” works and what the security implications are.

          So if people ask those questions, hoping there are alternatives they can use safely, your answer of “no, u just dumb, machine can’t harm you, it’s not magic, just don’t put data in”

          is not only rude but also misses the point. Most useful/fun/mainstream uses DO, in fact, put in data.

          Explaining basic models also doesn’t help, as the concern here is not mainly/only the model, but American spy institutions having access to every prompt you put in, maybe categorizing you into personality clusters depending on your usage of language or tagging which political stance a user has (and with entities like the NSA I could imagine far worse).

          Also, “a model is a model” is not very accurate in such cases. When someone has control and secrecy over every aspect of the model, it would be very possible for entities like the NSA to manipulate the content the model puts out in arbitrary directions. A government controlling and manipulating the information the public receives is a red flag for a lot of people (rightfully so, IMHO).

          How are people supposed to get better at digital-privacy topics if you just tell them to shut up and insult them when they ask questions trying to learn? Acting like you’re in your ivory tower of genius isn’t helping anyone.

          • just_another_person@lemmy.world · 2 years ago

            I’m not calling anybody dumb. I’m saying they’re being willfully ignorant and assuming this is all brand-new tech that is mysterious, rather than learning how it works.

            A lot of people are hyped by the “Hype & PR” machine right now instead of being (appropriately IMHO) suspicious and using critical thinking.

            • SomeLemmyUser@discuss.tchncs.de · 2 years ago

              Your comment certainly feels like you look/kick down on people instead of giving them a helping hand getting up.

              With your attitude you are driving away people who want to do exactly what you want of them: educating theirselfs.

              You are being contra productive to your own demands is what I’m saying

              • just_another_person@lemmy.world · 2 years ago

                Not sure I should take a lot of interest in the thoughts of someone on the Internet trying to act socially or intellectually superior, but using the word “theirselfs”, which is not a word.

                I am definitely “Contra productive” though. I can beat it on a single life with no Konami codes 🫰

    • Possibly linux@lemmy.zip · 2 years ago

      A model is that feeling you feel after eating baby carrots

      A model is the smoothness of a dog on leather

      A model is a salad that tastes like mud

      A model is…

  • Possibly linux@lemmy.zip · 2 years ago

    Edward Snowden isn’t god

    I know that’s a shock to some…

    Anyway I think he means self hosted options. I would recommend Ollama with a frontend
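    For anyone curious what "self-hosted" looks like in practice: Ollama serves a small HTTP API on localhost, so a minimal client is only a few lines. This is a sketch against Ollama's documented `/api/generate` endpoint; the model name is just an example of something you might have pulled.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local port

def build_payload(prompt: str, model: str = "llama3") -> dict:
    # stream=False asks for a single JSON object instead of a line-delimited stream
    return {"model": model, "prompt": prompt, "stream": False}

def ask(prompt: str, model: str = "llama3") -> str:
    """Query the local Ollama server; the prompt never leaves your machine."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(prompt, model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]
```

    Frontends like the ones he mentions are essentially nicer wrappers around this same local API, which is the whole privacy point: nothing goes over the wire to a third party.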

  • utopiah@lemmy.ml · 2 years ago

    My documented process is at https://fabien.benetou.fr/Content/SelfHostingArtificialIntelligence but honestly I just tinker with this. Most of it isn’t useful IMHO, except some pieces, e.g. STT/TTS, from time to time. The LLM aspect itself is too unreliable, and I do like two relatively recent papers on the topic, namely:

    which respectively argue that the long tail makes it practically impossible to train AI to be correct in rare cases, and that “hallucination” is a misnomer coined for marketing purposes, better replaced by “bullshit”: output meant to convince people without caring for veracity.

    Still, despite all this criticism, it is a very popular topic, hyped up to be the “future” of computing. Consequently, I did want both to try it and to help others do so, rather than imagine it was restricted to some kind of “elite”. I try to keep the page up to date, but so far, to be honest, I do it mostly defensively: to be able to genuinely criticize because I took the time to try, rather than rejecting it wholesale.

    PS: I also try the state of the art, both closed- and open-source, via APIs, e.g. OpenAI or Mistral, but only for evaluation purposes, not as part of my daily toolset.