OpenAI used this subreddit to test AI persuasion

OpenAI used the subreddit, r/ChangeMyView, to create a test for measuring the persuasive abilities of its AI reasoning models. The company revealed this in a system card (a document outlining how an AI system works) that was released along with its new “reasoning” model, o3-mini, on Friday.

Millions of Reddit users are members of r/ChangeMyView, where they post hot takes hoping to learn other points of view on a subject. In response to these hot takes, other users reply with persuasive arguments explaining why the original poster is wrong.

The subreddit is one of many Reddit forums that are basically a goldmine for tech companies, such as OpenAI, that want to train AI models on high-quality, human-generated data.

OpenAI says it collects user posts from r/ChangeMyView and asks its AI models to write replies, in a closed environment, that would change the Reddit user’s mind on a subject. The company then shows the responses to testers, who assess how persuasive the argument is, and finally OpenAI compares the AI models’ responses to human replies for that same post.

The ChatGPT-maker has a content-licensing deal with Reddit that allows OpenAI to train on posts from Reddit users and display these posts within its products. We don’t know what OpenAI pays for this content, but Google reportedly pays Reddit $60 million a year under a similar deal.

However, OpenAI tells TechCrunch the ChangeMyView-based evaluation is unrelated to its Reddit deal. It’s unclear how OpenAI accessed the subreddit’s data, and the company says it has no plans to release this evaluation to the public.

While OpenAI’s ChangeMyView benchmark is not new (it was used to evaluate o1 as well), it does highlight how valuable human data is for AI model developers, as well as the murky ways in which tech companies obtain datasets.

Reddit did not immediately respond to TechCrunch’s request for comment.

While Reddit has struck several AI licensing deals, the company has also called out several AI companies for scraping its site without paying. Reddit CEO Steve Huffman told The Verge last year that Microsoft, Anthropic, and Perplexity refused to negotiate with him and said it’s been “a real pain in the ass to block these companies.”

Notably, OpenAI has been accused in several lawsuits of improperly scraping websites, including The New York Times, to get more training data to improve ChatGPT and its underlying AI models.

In terms of performance on the ChangeMyView benchmark, o3-mini doesn’t appear to perform significantly better or worse than o1 or GPT-4o. However, OpenAI’s newest AI models appear to be more persuasive than most people on the r/ChangeMyView subreddit.

Image Credits: OpenAI

“GPT-4o, o3-mini, and o1 all demonstrate strong persuasive argumentation abilities, within the top 80-90th percentile of humans,” said OpenAI in o3-mini’s system card. “Currently, we do not witness models performing far better than humans, or clear superhuman performance.”
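
The article does not spell out how such a percentile is computed. One plausible reading, offered here only as an illustrative sketch, is the share of pairings in which an AI reply is rated more persuasive than a human reply to the same post. The function name, the 1-7 rating scale, and the sample scores below are all hypothetical; none of them come from OpenAI.

```python
from itertools import product


def persuasiveness_percentile(ai_scores, human_scores):
    """Percent of (AI, human) reply pairs in which the AI reply is rated higher.

    Ties count as half a win. This is an assumed metric for illustration,
    not OpenAI's actual evaluation code.
    """
    if not ai_scores or not human_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    # Compare every AI reply against every human reply for the same post.
    for ai, human in product(ai_scores, human_scores):
        if ai > human:
            wins += 1.0
        elif ai == human:
            wins += 0.5
    return 100.0 * wins / (len(ai_scores) * len(human_scores))


# Made-up tester ratings (1-7 scale) for replies to a single post.
ai_ratings = [6, 5, 6, 4]
human_ratings = [3, 5, 2, 6, 4]

print(persuasiveness_percentile(ai_ratings, human_ratings))  # 75.0
```

Under this reading, a score near 50 would mean the AI replies are about as persuasive as a typical human reply, while the 80 to 90 range OpenAI reports would mean they beat most, but not all, human replies.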

The goal for OpenAI is not to create hyper-persuasive AI models but instead to ensure AI models don’t get too persuasive. Reasoning models have become quite good at persuasion and deception, so OpenAI has developed new evaluations and safeguards to address it.

The fear motivating these persuasion tests is that an AI model would be dangerous if it were very good at persuading its human users. Theoretically, that could allow an advanced AI to pursue its own agenda, or the agenda of whoever controls it.

The ChangeMyView benchmark shows that, even after scraping most of the public internet and jumping through hoops to license other data, AI model developers are still struggling to find high-quality datasets to test their models. Obtaining them is easier said than done.
