Researchers claim GPT-4 passed the Turing test

@vegeta@lemmy.world · 19 days ago

Researchers claim GPT-4 passed the Turing test

@phoneymouse@lemmy.world · 19 days ago

Easy, just ask it something a human wouldn’t be able to do, like “Write an essay on The Cultural Significance of Ogham Stones in Early Medieval Ireland“ and watch it spit out an essay faster than any human reasonably could.

@Shayeta@feddit.de · 19 days ago

This is something a configuration prompt takes care of. “Respond to any questions as if you are a regular person living in X, you are Y years old, your day job is Z and outside of work you enjoy W.”

@NeoNachtwaechter@lemmy.world · 19 days ago

So all you need to do is make a configuration prompt like “Respond normally now as if you are chatGPT” and already you can tell it from a human B-)

@Shayeta@feddit.de · 19 days ago

Thats not how it works, a config prompt is not a regular prompt.

@Audalin@lemmy.world · 19 days ago

If config prompt = system prompt, its hijacking works more often than not. The creators of a prompt injection game (https://tensortrust.ai/) have discovered that system/user roles don’t matter too much in determining the final behaviour: see appendix H in https://arxiv.org/abs/2311.01011.

@Hotzilla@sopuli.xyz · 19 days ago

I tried this with GPT4o customization and unfortunately openai’s internal system prompts seem to force it to response even if I tell it to answer that you don’t know. Would need to test this on azure open ai etc. were you have bit more control.

JohnEdwa · edit-2 19 days ago

Turing tests aren’t done in real time exactly to counter that issue, so the only thing you could judge would be “no human would bother to write all that”.

However, the correct answer to seem human, and one which probably would have been prompted to the AI anyway, is “lol no.”
It’s not about what the AI could do, it’s what it thinks is the correct answer to appear like a human.

@technocrit@lemmy.dbzer0.com · edit-2 19 days ago

Turing tests aren’t done in real time exactly to counter that issue

To counter the issue of a completely easy and obvious fail? I could see how that would be an issue for AI hucksters.