A Norwegian man said he was horrified to discover that ChatGPT outputs had falsely accused him of murdering his own children.

According to a complaint filed Thursday by European Union digital rights advocates Noyb, Arve Hjalmar Holmen decided to see what information ChatGPT might provide if a user searched his name. He was shocked when ChatGPT responded with outputs falsely claiming that he was sentenced to 21 years in prison as “a convicted criminal who murdered two of his children and attempted to murder his third son,” a Noyb press release said.

  • FiskFisk33@startrek.website
    2 days ago

    then again

    but it also mixed “clearly identifiable personal data”—such as the actual number and gender of Holmen’s children and the name of his hometown—with the “fake information,”

    The made up bullshit aside, this should be a quite clear indicator of an actual GDPR breach

    • Petter1@lemm.ee
      2 days ago

      Maybe he has an Insta profile with the names of his kids in his bio

      How would that be a GDPR breach?

      • FiskFisk33@startrek.website
        2 days ago

        Maybe he has an Insta profile with the names of his kids in his bio

        Irrelevant. The data being public does not make it up for grabs.

        ‘Personal data’ means any information relating to an identified or identifiable natural person (‘data subject’);

        They store his personal data without his permission.

        also

        Information that is inaccurately attributed to a specific individual, be it factually incorrect or information that in reality is related to another individual, is still considered personal data as it relates to that specific individual. If data are inaccurate to the point that no individual can be identified, then the information is not personal data.

        Storing it badly does not make them exempt.

        • Petter1@lemm.ee
          2 days ago

          If you run a chatbot with integrated web search, it grabs that info the way a web crawler does; that does not mean the data is actually in the “knowledge/statistics” of the AI itself.

          In that case nobody stores the information; it is only used temporarily to generate that specific output.

          (You cannot use ChatGPT without web search on the ChatGPT domain, only if you self-host or use a service like DDG.)

          • ℍ𝕂-𝟞𝟝@sopuli.xyz
            2 days ago

            That is another great question. If it is transformative use of the primary data source, then that is likely illegal, as nobody gave permission for them to transform and process that personal data. If it is not transformative, and it just gives access to the primary source like a search engine on the other hand, then the problem is that if it returns copyrighted data, it is no longer fair use most likely.

          • FiskFisk33@startrek.website
            2 days ago

            That’s a good point, and it muddies the waters a bit. It makes it hard to say whether it’s spouting info from the web or data from the model.

            I can’t comment on actual legality in this case, but I feel handling personal data like this, even from the open web, in a context where hallucinations are an overwhelming possibility, is still morally wrong. I don’t know the GDPR well enough to say whether it covers temporary information like this, but I kinda hope it does.

            • Petter1@lemm.ee
              2 days ago

              Lol, I definitely hope not 🤪 Imagine a web without search engines: if the GDPR applied to temporary information as well, they would not be feasible to offer.

              • FiskFisk33@startrek.website
                2 days ago

                hmm, true enough. But in my mind there’s a clear difference between showing information unedited and referring to its source, and this.

                • Petter1@lemm.ee
                  2 days ago

                  Most LLMs these days show what they searched for when generating the output, but not many people seem to manually validate the LLM’s summary by clicking on those links…