r/accelerate Acceleration Advocate 4d ago

News Wojciech Zaremba: "It’s rare for competitors to collaborate. Yet that’s exactly what OpenAI and @AnthropicAI just did—by testing each other’s models with our respective internal safety and alignment evaluations. Today, we’re publishing the results. Frontier AI companies will inevitably compete on …"

https://x.com/woj_zaremba/status/1960757419245818343
59 Upvotes

10 comments

16

u/breathing00 Acceleration Advocate 4d ago

"Safety" is the thing I would actually like to see as little collaboration / progress on as possible. I understand not wanting to repeat the recent Grok situation, but I won't be cheering on them successfully lobotomizing their models. I mean look at this example, this is ridiculous (GPT5)

5

u/Northern_candles 3d ago

This is probably a result of their legal issues with ChatGPT defaming someone with false claims, which made them lock down all real-person naming.

3

u/broose_the_moose 3d ago

Gotta disagree. As much as I want ASI ASAP, I want it to be safe most of all. Publicly released models can do incredible amounts of damage to real-world safety in the hands of bad actors or bloodthirsty nation states. What’s the point of a hyperabundant future if you’re dead before it happens?

6

u/Helpful_Program_5473 3d ago

100%, and it would almost be cute if it weren't so pathetic and destructive: humans trying to police the superintelligence they are building. I think in 5 years almost every paradigm that humans trust in will be challenged in ways they cannot fathom.

2

u/broose_the_moose 3d ago

Doesn’t mean alignment research to get us to superhuman AI is “useless”. I agree with your second point, but there’s a whole lotta damage bad actors can do with the current capabilities of models.

1

u/Helpful_Program_5473 3d ago

I would argue that most of the things done in the name of safety, both in general and in this case in particular, are usually useless at best. There are certainly some things that are very dangerous, but as we've seen from the news, they don't seem very good at preventing those things, like the hacking and the psychosis from users falling in love and being validated. They're only acting now that it's a problem for their bottom line, yet they were always very quick to censor certain things.

1

u/Ruykiru Tech Philosopher 3d ago

I hate censorship and privacy violations, but honestly I wouldn't mind if the most advanced models had age/face recognition for the user, so they'd stop treating me like a damn child sometimes. They need to add custom models based on age if they're so afraid of people.

6

u/adt 4d ago

From Anthropic's version:

>While we were happy to be able to participate in this collaborative effort and are excited for the precedent that it sets, we expect closely-coordinated efforts like this to be a small part of our safety evaluation portfolio. Direct coordination with other developers on safety evaluations can help surface blind-spots in our methods, but it demands substantial logistical investment, and often benefits from expertise in using models that we are not especially familiar with. Openly releasing evaluation materials (and results, in some cases) can provide a more scalable path forward for other developers and outside evaluators to use as they see fit.

Translation: Stop wasting our time with this shit.

1

u/jlks1959 3d ago

That’s surprising and very encouraging if you want AI to accelerate.

0

u/LoudZoo 3d ago

Accelerate to safe AI, or accelerate without decelerating for safe AI? What if safe AI is the key to a major breakthrough? What if safe AI is the Trojan horse for maintaining the status quo?