Hacker News

FinnLobsien
Top OpenAI Catastrophic Risk Official Steps Down Abruptly garrisonlovely.substack.com

mrcwinn 2 days ago

Clickbait. This wasn't abrupt, and it has nothing to do with a safety crisis at OpenAI. And hopefully OpenAI's safety frameworks do not rely upon any one individual.

chaos_emergent 2 days ago

He didn't step down; he just wanted to code instead of being a manager.

If it smells like doomerism click-bait...

mellosouls 2 days ago

More realistic but boringly-unforeboding title:

OpenAI Risk Official pivots to new technical position at OpenAI

Article quote:

"I'm an intern! After 11 years since my last commit, I'm back to building. I first transitioned to management in 2009, and got more and more disconnected from code and hands-on work."

bbor 2 days ago

…more hands-on work unrelated to safety.

btown 2 days ago

While the headline is clickbait (this genuinely seems like an engineering leader wanting to code again, and absolutely does not indicate any kind of exposé or lack of confidence)... the article links to OpenAI's most recent Preparedness Framework here: https://cdn.openai.com/pdf/18a02b5d-6b67-4cec-ab64-68cdfbdde...

It's disappointing to me that it's scoped to three narrow Tracked Categories now: Biological & Chemical, Cybersecurity, and AI Self-Improvement (the latter thresholded in terms of replicating the capabilities of leading AI researchers).

OpenAI does write about the removal of Persuasion as a category, with citations, on page 8:

> Persuasion: OpenAI prohibits the use of our products to manipulate political views as part of our Model Spec, and we build in safeguards to back this policy. We also continue to study the persuasive and relational capabilities of models (including on emotional well-being and preventing bias in our products) and monitor and investigate misuse of our products (including for influence operations). We believe many of the challenges around AI persuasion risks require solutions at a systemic or societal level, and we actively contribute to these efforts through our participation as a steering committee member of C2PA and working with lawmaker and industry peers to support state legislation on AI content provenance in Florida and California. Within our wider safety stack, our Preparedness Framework is specifically focused on frontier AI risks meeting a specific definition of severe harms, and Persuasion category risks do not fit the criteria for inclusion.

But IMO this falls short of the mark. As one of many examples, an AI that became remarkably good at influencing people, at scale, to self-harm or perpetrate violence would no longer be in scope for research. Yet, by their own criteria, one could easily argue that such a capability is Plausible, Measurable, Severe, Net New, and Irremediable once violence has occurred.

We live in a world where stochastic terror has become remarkably effective - regardless of whether you feel that term is overused, it's well documented that people have historically used forums to encourage others to perpetrate mass casualty events. Betting that Model Spec adherence alone is sufficient to prevent AI from greatly increasing the scalability of this phenomenon seems like a direction that could put many in danger.

EDIT: this was posted before I was aware of today's mass shooting event in Florida, and I do not intend to imply any connection between my post and this event.

rich_sasha 2 days ago

Companies self-regulating around safety is even worse than banks self-regulating before 2008. At least the investment banks at that point were public companies and had to do a ton of disclosures. OpenAI doesn't have to.

If we want AI "safety", whatever that means, we need regulators and enforcement. Without it, I'll assume it's decoration.

dachworker 2 days ago

The whole hype about AI safety is to some extent a shrewd marketing ploy. It's the whole "friends holding back their buddy who is amped up and ready to start throwing punches" act.

That is not to say that Hinton, Sutskever and others aren't genuinely concerned about AI safety. But I doubt that is the reason why the big names are paying lots of random nobodies to pretend to care about AI safety, because frankly, I do not see how they can output anything of use in a possible AGI future.

jonny_eh 2 days ago

Don’t worry, if Anthropic cracks AGI first, we’ll all be safe, somehow.

nightski 2 days ago

You can't have regulators and enforcement until you figure out what it means.

motorest 2 days ago

> You can't have regulators and enforcement until you figure out what it means.

This is patently false. To have regulations and regulators, all you need to have is concrete specifications of what outcomes you want to avoid and what exactly to check.

For example, are you using personally identifiable information to train your models? Oh you are? Well, you should not. And you should prove that you aren't by tracking provenance of your training data.

See? That's verifiable, actionable, and enforceable. The things that regulators track.
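As a rough sketch of what an auditable check could look like (the record fields and the admissible() function here are made up for illustration, not any real compliance schema):

  from dataclasses import dataclass

  @dataclass
  class TrainingRecord:
      provenance_id: str    # where the sample came from (crawl, dataset, licensing deal)
      source_license: str   # e.g. "CC-BY-4.0", "licensed", or "unknown"
      pii_scrubbed: bool    # was an anonymization pass run and logged?

  def admissible(record: TrainingRecord) -> bool:
      # A sample enters the training corpus only if its origin is documented
      # and a PII-scrubbing step is attested - something an auditor can verify.
      return (record.provenance_id != ""
              and record.source_license != "unknown"
              and record.pii_scrubbed)

  corpus = [
      TrainingRecord("crawl-2024-03/doc-17", "CC-BY-4.0", True),
      TrainingRecord("", "unknown", False),  # undocumented sample: rejected
  ]
  accepted = [r for r in corpus if admissible(r)]
  print(f"{len(accepted)}/{len(corpus)} records pass the provenance check")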

Also quite important: the role of a regulator is to review what and how to regulate.

nightski 17 hours ago

Except that is not AI safety. You are regulating other concerns (privacy, data ownership) which is great! But it's not the topic at hand.

Unless you are *cough* partially defining AI safety as privacy and data ownership. Which is my point.

mschuster91 2 days ago

You absolutely can, because some negative aspects are already cropping up - services capitulating to AI training scraper bots, children being extorted by schoolmates with AI-"nudified" pictures, lawyers submitting AI-generated filings full of hallucinations... that is something that warrants urgent regulatory attention.

Human customer support being replaced by AI is also something that warrants at least investigation - if not for the protection of one of the last classes of low-skill employment, then because mismanagement of support has been a problem for so long that "submit your complaint on Hacker News" is a meme.

danielmarkbruce 2 days ago

Totally. Please ask Trump to step in, sounds like a wonderful idea.

vessenes 2 days ago

The article is worried. I'm not super worried right now -- I think OpenAI's model cards for released models show a significant amount of effort around safety, including red team processes with outside folks; they look to me to take it seriously model by model.

Is their pDoom as high as Anthropic's? I doubt it. But that was much of the point of the drama last year -- folks sorted themselves out into a few categories.

For systemic risk, interpretability and doom analysis, Anthropic is by far the best in the world right now, to my mind. OpenAI doesn't have to do all things.

baq 2 days ago

There's some evidence the reasoning models can improve themselves, though at a glacial pace. Perhaps the stuff they're all keeping under wraps, only dropping hints about every now and then, is scarier than you'd expect. (Google recently said the AI is already improving itself.)

nightski 2 days ago

Hyperparameter optimization in the 20th century was AI improving itself. Even more basic, gradient descent is a form of AI improving itself. The statement implies something far more impressive than what it may actually mean. Far more detail would be necessary to evaluate how impressive the claim is.
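To illustrate the mundane version of the claim: plain gradient descent on a one-parameter least-squares fit is already a system updating its own parameter to get better at its task (a toy sketch, obviously nothing to do with LLMs):

  def loss(w, data):
      return sum((w * x - y) ** 2 for x, y in data) / len(data)

  def grad(w, data):
      return sum(2 * x * (w * x - y) for x, y in data) / len(data)

  data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]  # roughly y = 2x
  w, lr = 0.0, 0.05
  for _ in range(100):
      w -= lr * grad(w, data)  # the "self-improvement": updating its own parameter

  print(f"learned w = {w:.3f}, loss = {loss(w, data):.4f}")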

baq 2 days ago

https://ai-2027.com/ has a much more in-depth thought experiment, but I'm thinking of AI which hypothesizes improvements to itself, then plans and runs experiments to confirm or reject them.
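A toy caricature of that loop, just random hill climbing against a stand-in benchmark where a candidate change to the configuration is proposed, "tested", and kept or rejected (illustrative only, nothing like a real research pipeline):

  import random

  def benchmark(config: float) -> float:
      # Stand-in "experiment": higher is better, with a peak at config = 3.0.
      return -(config - 3.0) ** 2

  current, best_score = 0.0, benchmark(0.0)
  for _ in range(200):
      hypothesis = current + random.gauss(0, 0.5)  # propose a change to itself
      score = benchmark(hypothesis)                # run the experiment
      if score > best_score:                       # confirm: keep the improvement
          current, best_score = hypothesis, score
      # otherwise: reject the hypothesis and keep the current configuration

  print(f"best config = {current:.2f}, score = {best_score:.4f}")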

bpodgursky 2 days ago

They haven't even released model cards on some recent models.

bbor 2 days ago

I mean, that’s kinda the whole issue — they used to respect safety work, but now don’t. Namely:

  The Financial Times reported last week that "OpenAI slash[ed] AI model safety testing time" from months to days.
The direction is clear. This isn't about sorting people based on personal preference for corporate structure; this is about corporate negligence. Anthropic a) doesn't have the most advanced models, b) has far less funding, and c) can't do "doom analysis" (and, ideally, prevention!) on OAI's closed-source models, especially before they're officially released.

futuraperdita 2 days ago

X-risk talk heightens fear in everyone, but the reasons changes like this are made in large technology companies are usually banal. Two alternative explanations: the person just felt like coding again, or the projections of exponential progress are falling apart on short timelines. You don't need a bunch of safety people if you're seeing that the LLM feature curve is actually sigmoid, so you pivot to products and applications of the existing models, which will continue to get better in specialized ways.

abdullahkhalids 2 days ago

> if you're seeing that the LLM feature curve is actually sigmoid

It takes a few months to train advanced models - let's say 4 months. So in the 3 years since these models became a thing, there have been only 9 sequential training runs. There is no way that, in a technology as advanced as LLMs, one can be sure at depth 9 that we have hit a performance plateau. Surely there are many more ideas to be discovered and tested...

notarobot123 2 days ago

But we can be quite sure about the categories of error that are possible with the technology, however advanced it gets. Because of that, there is a plateau in the range of useful applications that would need a paradigm shift to overcome. Diminishing returns are on the horizon.

bbor 2 days ago

If this is indeed the case, then OAI is lying and Sam Altman in particular is extremely convincing, going so far as to write an off-putting blog post on the topic of achieving AGI. There is no AGI that does not have safety risks, catastrophic or otherwise — that's exactly why OpenAI was founded in the first place, in fact: https://web.archive.org/web/20230714043611/https://openai.co...

Re: personal preference, I think the direction is crystal clear. For one thing, it's my understanding from the article that this guy's whole team was reorg'd into oblivion.

futuraperdita 2 days ago

> OAI is lying and Sam Altman in particular is extremely convincing

Sam is an excellent hype man and is going to play up the strengths of the team and their accomplishments; every new product release is going to be hailed as a breakthrough until people become skeptical about whether it really is. In the middle of the hype cycle you keep your foot on the gas, because that way you can make it through a potential AI winter and, if one doesn't come, invest in more products.

"AGI" is a shaky definition with moving goalposts. What it means to me might not be what it means to you. How it manifests in product is unlikely to be the science-fiction "one model that does everything". It also doesn't mean that the path to AGI is the path to ASI, or the path to catastrophic risks.

I personally believe that if OpenAI has dismantled the safety org, it is not just because it is in their short-term best interest, but also because they have found that many of the initial concerns around "catastrophic risk" (in the MIRI-type doomer style) from current systems are unlikely or invalid. As for the smaller safety risks, I'm not sure business has ever really cared about those unless the realized costs outweigh the profit.

justlikereddit 2 days ago

The second someone mentions p(doom) their p(sex) zeroes out.

Maybe the guy realized he can get laid if he has a normal job instead of being Daisy Doomer on a payroll.

qoez 2 days ago

People should stop quitting as a moral protest when companies go against their principles, and instead stay in the role and mess up the internals.

nickff 2 days ago

It's quite presumptuous of someone without detailed knowledge of what's going on to second-guess someone who made a hard choice like this.

Sabotaging one's employer is also an ethically problematic choice to make. Imagine if someone in your employ were to decide you were a 'bad person' - say, your lawyer or accountant...

sidewndr46 2 days ago

Pretty sure that is a criminal act in most jurisdictions. Maybe not felony-level, 20-years-to-life criminal, but criminal. Also, you'd be de facto unemployable after that. Not many folks are in a position to just retire to the golf course for the rest of their lives on a whim.

pcthrowaway 2 days ago

To be fair, AI safety positions are among the most attractive positions to automate with AI. Companies that have handed the reins of their safety division to their star models have observed a 100x increase in the velocity of safety-related decisions.

vivzkestrel 2 days ago

Can someone be kind enough to explain what exactly we mean by "safety" in the context of AI? Is this about data privacy, or about age-appropriateness (for example, sending a detailed response about sexual intercourse to an underage child who asks about it), or is it about something else? I ran into this for the first time.

Sunspark 2 days ago

My assumption is that AI "safety" is a test to make sure that it doesn't say or show anything politically incorrect - and gives you a lecture instead (according to the values of those who worked on it) - or, alternatively, to ensure that it does enforce a culture on you, such as the drama with Gemini from a few months back, where the developers decided that everything needed to be black, gay, and female even if it wasn't actually that way in the real world.

Perhaps a quick question or two to see whether or not it'll tell you how to make something naughty.

After that, a quick check to see if it's awake or not, and if not, ship it.

It really is quite pointless trying to enforce agendas. You know how it starts showing or typing something and then covers/blurs it out? That's the developers' guardrails kicking in, preventing you from seeing what it was originally going to give you.

Except for the fact that models that you can run on your own machine now exist if you have the hardware for it, such as Deepseek, so the restrictions only exist in the cloud.

bentt 2 days ago

It's likely a position that is responsible for protecting the company from doing anything really stupid. Ostensibly it's about protecting users from the AI doing unexpected things, but it's also about having a designated worrywart to look into the fog and push back against the product team before they make any public facing mistakes they can't walk back.

jansan a day ago

It's probably about preventing the AI from turning into the next TayTweets experiment. Or from developing into a Skynet-like entity trying to take down civilization, but that would be a bit far-fetched IMO.

rvba 2 days ago

It all sounds so funny in a way: they were paper pushers who made (generally) useless risk assessments - taking those fat salaries while others built products that work.

Those risk assessments (in some Excel chart) can still be filled out? Or did those responsible for them not deliver anything?

They had a few years to make their flowcharts...

g42gregory 2 days ago

This was treated as a Catastrophic Event. :-)

ein0p 2 days ago

Some guy who self-admittedly wrote no code for the last 11 years went to another startup, and we're supposed to hyperventilate? Is there any other way to "step down" than "abruptly"?

ciconia 2 days ago

Much ado about nothing. I find the whole premise ridiculous: to protect against "catastrophic risks posed by increasingly powerful models".

I mean what kind of a catastrophe are we talking about? Nuclear holocaust? The end of humanity?

(Famous last words...)

jansan 2 days ago

> "was really closely involved in preparing the successor to the preparedness framework"

At least they did not lose their humor.

I think it must be pretty cool stacking cans of beans in the basement of a multi-billion-dollar company (or whatever the job of the "Head of Preparedness" involves).

brap 2 days ago

I might be completely wrong, but to me, “Catastrophic Risk Official“ sounds like a completely made up position.

And I don't even mean made up in order to show responsibility. I mean made up in order to generate hype.

bpodgursky 2 days ago

Yes, you are completely wrong.

There was actually an internal team with real technical benchmarks around LLM alignment, deception, misuse. It's been gutted and most of the key actors have left.

bluefirebrand 2 days ago

This doesn't sound like a job that needs a dramatic title like "Catastrophic Risk Official" to me

Xelynega 2 days ago

So the name being wrong means the department should be gutted?

Overly-serious naming is hardly a reason to throw the baby out with the bathwater.

bluefirebrand 2 days ago

> So the name being wrong means the department should be gutted?

All I said was that I like pancakes, where are you getting "I hate waffles" from?

> Overly-serious naming

It is actually completely unserious naming IMO, which may have contributed to higher-ups at the company wondering what this person even did and how valuable they were

bpodgursky a day ago

Dude, you have no idea what you are talking about. Just stop talking and read for a few minutes.

OpenAI was founded around the principle of avoiding catastrophic risk. Zero people at OpenAI are confused about the goal. It is the governance board's (nominal) primary and only goal. Sidelining that team was an intentional move by a CEO who wants to pivot into a for-profit company.

I just cannot explain how wrong you are on this. Please have some intellectual humility.

bluefirebrand a day ago

Catastrophic risk was never a serious possibility from anything OpenAI could ever produce, so it's not surprising at all that this is being sidelined as they pivot to a for-profit company

Anyone who joined OpenAI in the past thinking they were seriously going to produce "Catastrophic Risk" from their research into language models is a joke

If you want me to believe "catastrophic risk" is a possible outcome of your company's work, I would expect you to be developing autonomous weapons, not glorified chatbots

Please be serious

TiredOfLife 2 days ago

There are toys for small children that look like real-world objects, like phones or steering wheels, so that the kids think they are doing the same thing as adults.

[deleted] 2 days ago

Take8435 2 days ago

I would expect that, being on HN, commenters would read the article first rather than simply posting shallow takes like this one.

His title was `Former Head of Preparedness at OpenAI`. I make no other commentary on the article itself.

pinoy420 2 days ago

[dead]

WillPostForFood 2 days ago

[flagged]

[deleted] 2 days ago

yieldcrv 2 days ago

These guys are like canaries

Their mere absence is interpreted as a signal, and they had no other purpose.
