Hacker News

Uber’s COO says it’s getting harder to justify money spent on tokenmaxxing (businessinsider.com)

izanton25 minutes ago

What if... we stop for a moment, and then, after thinking for a moment, we stop hammering nails with a microscope, and stop using token usage as a metric of productivity?

I know it sounds stupid, but what if

Lalabadie6 minutes ago

You're now in the last frame of the comic, getting thrown out the window.

tekno4517 minutes ago

Not very Billion Dollar Valuation of you.

devin15 minutes ago

The people who have ascended to leadership positions are deeply divorced from reality.

"It is difficult to get a man to understand something, when his salary depends on his not understanding it." -Upton Sinclair

lorecore13 minutes ago

The crazy thing is their salary does not actually benefit from riding these trends. Unless it's equally/even more clueless board level pressure with ulterior motives (i.e., lifting their other AI investments or the sector as a whole).

repeekad11 minutes ago

Every C-suite in the country is panicking about being left behind. From their perspective it’s either token max or fade into obscurity, or at least that’s what they were sold.

treis6 minutes ago

Please. These are the same people that force their employees to use Microsoft teams because slack is $5 an employee a month. They're not going to sit idly by while employees burn thousands a month in tokens.

lorecore7 minutes ago

I don't think that's accurate. I think every C suite in the country is looking to do away with labor's leverage as much as possible. I think this is a cultural thing more than anything else, C suite + investors looking to get rid of those pesky humans required to prop up their lifestyles. AI is the most credible path toward that. Short, medium or long term returns be damned, this is a reconfiguration of society and they want to shed what they consider to be baggage.

zeroonetwothree13 minutes ago

Come on, don’t be crazy

mrkeen14 minutes ago

I always used to wonder this about software stacks even prior to LLMs, but it seems more relevant now somehow:

When will Uber (or your favourite company) be 'done'? They've been writing software for 16 years.

They match drivers to passengers. More software isn't going to increase the chance that I seek them out instead of taking a bus or train.

Will their software be finished in 20 years? 80?

great_psy4 minutes ago

[delayed]

goldenarm8 minutes ago

Most of the codebase is custom integrations for local markets. You can systematize some of it but most of the complexity comes from there.

bee_rider6 minutes ago

Weren’t they trying to do their own self-driving thing?

I think this is partly a problem with companies that have had heavy investment. Uber’s value isn’t based on what they are doing, it is based on the idea that they are going to render ideas like owning your own car or taking public transit obsolete (I mean that’s an exaggeration but less of one than it ought to be).

dag1007 minutes ago

There are always newer technologies and techniques to be implemented. Better algorithms. Larger deployments. Better reliability. There are also almost always bugs to fix. So, so many bugs.

FartyMcFarter23 minutes ago

If any company announces that they use token consumption as an employee performance signal, for me that's close to a red flag to stay away from that company.

No company with good engineering leadership should act like this is remotely a good idea.

LaurensBER22 minutes ago

Tokens are the new "lines of code per engineer". Easy to graph, easy to "manage".

KellyCriterion15 minutes ago

...and easier to bill! Back then, nobody had the idea to charge per "lines of code", but today it seems accepted to charge per words processed?

abvdasker8 minutes ago

Meta does this. Guess what one of the criteria for their recent layoffs was.

an0malous12 minutes ago

I worked at a YC company that was doing this and left last month. I wonder where this all started from, VCs and tech execs are such a monoculture

bilater8 minutes ago

The black pill that is coming that nobody is prepared for is that the value of a token varies greatly depending on the human. Companies will quickly find out it's much better to give your top 10% engineers a lot more tokens and lay off your average engineers. The 10x engineer will become the 1000x engineer.

Wrote about this and the impact on jobs here: https://x.com/deepwhitman/status/2058324179506831372

hmokiguess4 minutes ago

Why do we keep doing this? It's the same as measuring by LoC, we know it's not gonna work. Also, see Goodhart's Law [1]

[1] https://en.wikipedia.org/wiki/Goodhart%27s_law

crorella16 minutes ago

Tokenmaxxing makes no sense. It is akin to writing extremely inefficient SQL / Spark jobs, full of cartesian joins, ultra-skewed datasets, etc., just for the sake of using as much compute / memory / IO as possible.

This always happens when the metric becomes the goal. Companies should nurture and foster an environment where AI is used in the most efficient way possible, first asking "do we really need an agent for this" and if so, what kind of agent is needed, what model, reasoning level, etc.

They should also promote projects that aim at saving tokens, increasing cache hits, and codifying the information in such a way that it uses as little context as possible (graphs of knowledge are pretty good for this!)
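To make the cache-hit point concrete, here is a minimal sketch of how input cost falls as more of a prompt is served from cache. Everything in it is an assumption for illustration: the function name, the $3 per million input tokens price, and the 90% discount on cached tokens are hypothetical and not any provider's actual pricing.

```python
def input_cost_usd(
    input_tokens: int,
    cache_hit_ratio: float,
    price_per_million_usd: float,
    cached_discount: float,
) -> float:
    """Estimate input-token cost when part of the prompt is served from cache.

    cache_hit_ratio: fraction of input tokens read from cache (0.0 to 1.0).
    cached_discount: fraction taken off the price for cached tokens (0.0 to 1.0).
    All prices are hypothetical placeholders.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    if not 0.0 <= cached_discount <= 1.0:
        raise ValueError("cached_discount must be between 0 and 1")

    cached_tokens = input_tokens * cache_hit_ratio
    uncached_tokens = input_tokens - cached_tokens
    cached_price = price_per_million_usd * (1.0 - cached_discount)

    return (uncached_tokens * price_per_million_usd
            + cached_tokens * cached_price) / 1_000_000


# Hypothetical numbers: 1M input tokens at $3 per million, 90% cached discount.
no_cache = input_cost_usd(1_000_000, 0.0, 3.0, 0.9)
mostly_cached = input_cost_usd(1_000_000, 0.8, 3.0, 0.9)
print(f"no cache: ${no_cache:.2f}, 80% cache hits: ${mostly_cached:.2f}")
```

With these made-up prices, the same prompt volume costs roughly a quarter as much at an 80% hit rate, which is the kind of saving a token count alone would register as "lower usage".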

InsideOutSanta6 minutes ago

It's toddler-level logic. "You can achieve positive outcomes by using X. Therefore, we need to use as much X as possible to maximize positive outcomes."

It's like trying to win a race by setting a gas station on fire.

SpicyLemonZest5 minutes ago

The argument in favor of "tokenmaxxing" has always been that it's creating space for employees to freely explore the broad and novel space of AI-enabled workflows. I've seen a number of use cases where I'm skeptical any value is being produced, but a number of others where some team or another has finally solved a long-standing problem of theirs with an agentic workflow that would have been hard to justify to a cost review committee.

> They should also promote projects that aim at saving tokens, increasing cache hits, and codifying the information in such a way that it uses as little context as possible (graphs of knowledge are pretty good for this!)

My understanding is that most big "tokenmaxxing" companies do have teams who are working on this in the background.

simonw15 minutes ago

I'd be interested to know if this is about individual employee AI usage, or use of AI tokens in production features, or both - and assuming both, what the split is.

I can see how Uber could burn unbelievable amounts of tokens if they start running internal features that run a bunch of prompts against every completed ride, or every customer profile, for example.

Or maybe this is about employee usage, but they introduced some stupid "you get evaluated on how many tokens you used" thing a couple of months ago when that was trendy and are just beginning to notice how much that cost?

devin a minute ago

IMO, it's undoubtedly both.

The number of product teams who have shipped expensive-to-operate AI features is wayyyy up there, and for many of the scenarios I've seen, customers simply don't care or are unwilling to pay a significant premium for access to it.

At the same time I'm starting to see some direction from people in leadership that I should "use the right model for the job" and things along those lines, which is a very, very different line from what I was hearing 12 months ago.

My continued prediction is that we are going to see a tweak on the SaaS model where the sweet spot moves to metered usage pricing of really fine-grained API-based access for apps which traditionally have been operated solely via the UI. Long term the trend is going to be "we'll house the data, enrich it, maintain it, provide fine-grained API access over it tailored to model usage, and you bring the model" with some services opting to give you the model interaction layer/harness. IOW I don't think SaaS is dead. Far from it. However, I do think that a lot of people are going to be looking to interact with SaaS apps via their own models with APIs that support those use cases better than a lot of those APIs do today.

jhack22 minutes ago

Maybe don't use the most expensive models on the planet? Maybe use AI like a tool and not this black box that grants wishes?

onlyrealcuzzo16 minutes ago

I think companies are reluctantly realizing that AI is not a magic genie in a bottle, and is instead a tool.

Still very valuable. They just need to have strategies that match what the tools are capable of - not strategies that involve "rub the magic lamp and increase profits 80%".

If the market is rewarding companies going after the "rub the lamp" strategy, they're going to say they're doing that to juice stock prices.

Maybe the market is finally realizing blindly spending billions on LLMs with almost no strategy is not a good strategy.

Who knows.

dgellow16 minutes ago

Sounds like you want to be in the next round of layoffs?

InsideOutSanta8 minutes ago

"He said that, based on talks with Uber's senior engineering leaders, he realized higher token usage did not translate into a proportional increase in useful consumer features."

He's saying that like it's some grand epiphany and not the most self-evident, obvious thing I've heard this month. Some of the literal dumbest people on earth are in charge of these major companies.

JackDanMeier14 minutes ago

At what point is there a difference between a burn rate and tokenmaxxing? Isn't it the same as during the dotcom bubble?

cryo3232 minutes ago

Waiting for tokenedging next.

postsantum9 minutes ago

^ Philip K. Dick's unreleased book title

SecretDreams26 minutes ago

Is this when you type the prompt into the text window, but don't hit enter? Make the GPU see the message "x is typing"? Lol.

FartyMcFarter21 minutes ago

As long as there's an RPC connection established and a partially sent request, I think it would count.

rcvassallo8314 minutes ago

Oof, leaders of the bubble are starting to take a step back?

chihuahua34 minutes ago

It's amazing that it took months to figure this out. "Well we thought that if engineers are told to maximize costs through AI use, to consume as much as possible of a resource that costs us money, then obviously good things will happen. Imagine my surprise when it didn't turn out that way."

Imagine if engineers were ranked based on their AWS spend. People allocate VMs and fill databases with terabytes of random bits, to get to the top of the AWS leaderboard. If you don't do this, you're ranked at the bottom, and good luck at the next review cycle. Who could have expected that this is not the road to success?

davnicwil5 minutes ago

I think unfortunately it's not about what seems obvious, or even what seems more likely, but about what seems retrospectively justifiable regardless of outcome.

The incentive structure of this type of decision is 'absolutely under no circumstances existentially mess up'. Ostensibly with respect to the organisation, but in actual reality much more so with respect to the individual(s) involved in the decision.

If everyone else is doing something that kind of obviously makes no sense, and you decide to break from the crowd by instead doing what does make sense, then there's a pretty solid chance of gaining a temporary edge while reality resolves the truth. But those gains probably won't matter all that much for the organisation, or indeed your position within it. It's a solid chance of an unimportant gain.

However on the other hand, there's a tail risk that something very unexpected happens and the thing everyone's doing that makes no sense actually turns out to make sense - sometimes even for entirely unpredictable incidental reasons - and then, well, you're in trouble. Not necessarily 'you' the organisation.. they'll likely be able to catch up and it won't matter that much. But for 'you' personally, the decision maker, it's very much not good.

As a bonus, in the much more likely scenario that the thing that makes no sense turns out to indeed make no sense, you're in the same boat as everyone else, there's no relative loss, and most importantly you don't stick out as someone who did something as risky as to go against the prevailing, albeit pretty clearly nonsensical, sentiment.

So basically, game theory tells you pretty quickly to just go with the thing that makes no sense if you're optimising for some (weighted) cross of what's best for the organisation and yourself as the decision maker.

this_user24 minutes ago

The point of this was always to explore what is possible with AI as quickly as possible. Obviously, there is going to be a lot of waste, but the 5-10% of employees who are truly thinking about it and discovering novel applications are what you are truly after. Because right now, you effectively have a giant, as of yet poorly explored space of potential uses.

Anyone who can find the actually valuable portions of the space early has a potentially huge competitive advantage. Even if the result of the experiment is the negative that AI is actually mostly not that useful, that is still extremely useful information in a time of great uncertainty regarding outcomes.

The bottom line is that this approach may be expensive, but if you have the money to burn, it's far from the worst strategy if you are trying to position yourself correctly for the future.

adrianN21 minutes ago

What’s the huge advantage though? Adopting workflows that give big productivity gains is relatively easy even for big corporations. It’s only an advantage if you can keep it secret.

OTOH maybe we’re in for a future of patenting prompts.

uejfiweun13 minutes ago

The thing I don't get though, is that most people just don't have that much work they need to do. I can use AI to pretty easily get my work done just via the regular chat interfaces. But because of the tokenmaxxing metrics that leadership tracks, I end up just having the AI deliberate for hours on random things just so that I can boost my token numbers. I think tokenmaxxing for the end goal you described is only realistic when the engineers are truly buried under a backlog of work.

roxolotl25 minutes ago

The inability of leaders to understand Goodhart’s Law is always a sight to behold. They see a number go up and pat themselves on the back for how well their employees are making it go up without ever wondering if the thing they care about is happening.

saghm21 minutes ago

Someday maybe Goodhart's Law will be intuitive to people making decisions like this, but not any time soon I guess

dgellow14 minutes ago

> It's amazing that it took months to figure this out

We aren’t there yet, so far it is just a COO questioning the investment

solenoid093726 minutes ago

You say "amazing that it took months to figure this out" as if the answer to the question is obvious.

But it's not. Some FAANGs are doing amazing things with unlimited tokens. Other companies have no clue what to do with tokens, they've just told their engineers to max them.

It really depends on how you're using the tokens. If you're just using them for Codex and Claude Code - yeah, tokenmaxxing is incredibly dumb.

saghm15 minutes ago

In other words, people who are productive get more done when you scale up what they're already doing, and people who aren't productive will not magically become productive when you scale up what they're already doing. That's incredibly obvious, because we've seen how this plays out repeatedly in so many different ways (lines of code, commits, tickets closed, etc.), and it has nothing to do with tokens or even programming, but just how trying to manage people works.

morpheuskafka23 minutes ago

> But it's not. Some FAANGs are doing amazing things with unlimited tokens

Giving someone unlimited access to a resource is not the same as directing or incentivizing them to use it for the sake of using it, which is what the parent comment criticized.

As for the other FAANGs, Meta and Google have (not good but still) frontier models of their own, so they are very different from a company paying API costs per token.

steveBK12324 minutes ago

> Some FAANGs are doing amazing things with unlimited tokens. Others have no clue what to do with tokens.

Unlimited tokens is different from “use AI a lot or we will fire you, and we are counting token consumption as usage”. Obviously the latter is stupid and yet it was done in many places.

solenoid093715 minutes ago

My company did this and it was done reasonably:

1. If you don't use AI most of your work days, you aren't actually doing any work, since AI is deeply integrated into every internal tool and even the smallest usage counts. Also, it doesn't lead to firing, just your manager noticing.

The only way to not hit the usage requirement is to actively avoid AI, and I think it's fair if a business decides it doesn't want people that are actively avoiding it.

2. We counted token usage, but it wasn't used to inform firings or layoffs, though that is widely reported FUD. The guidance was "don't worry about token usage." Usage was just tracked for cost attribution reasons.

dgellow12 minutes ago

Where can I see those amazing things done by FAANGs?

fsloth21 minutes ago

> Some FAANGs are doing amazing things with unlimited tokens.

Would love to know what things!

SecretDreams22 minutes ago

Show me some FAANGs that have made nice outward-facing products through a fully embraced AI workflow?

AI is an accelerator that engineers should know and have access to, but it's not something that should have mandated usage and quotas around it. It's also absolutely dangerous for young engineers and the like - it fundamentally denies you the "learning" aspect. I'm now seeing in interviews young graduates being given AI tasks to complete and they come back with a correct solution and no concept of how it is working.

You learn and reinforce learning by DOING and reading in depth. High level summaries don't teach anything and are the kinds of things only VPs care about. So, unless the intention in the future is for everyone to be a VP using AI to do the work, we need some middle ground here and some real thought around implementation of these tools or there's going to be a generational canyon gap of knowledge between being able to "say" and being able to "do".

lorecore15 minutes ago

Not all tokens are created equal. It's easy to use a ton of tokens by having agents work together in parallel. That's basically the equivalent of people spending time in meetings, hardly a productivity win. As with everything in development, results matter, how you get there doesn't (unless you're a bad manager).

yapyap9 minutes ago

wtv

irishcoffee21 minutes ago

I just realized my company is months behind this curve. About to blow my token allocation. Before I do, anyone have requests? Sincerely.

kibwen a minute ago

I hereby suggest you take the fragmentary excerpts of the infamous erotic stage play The Lusty Argonian Maid shown in The Elder Scrolls series of games and extrapolate them to 100,000 additional full-length acts.

egypturnash36 minutes ago

Uber COO says he just decided to short a bunch of AI company stock.

epolanski33 minutes ago

Slightly OT, but I really dislike this Reddit WSBization of HN.

Adds nothing insightful to these discussions.

cwillu27 minutes ago

“Please don't post comments saying that HN is turning into Reddit. It's a semi-noob illusion, as old as the hills.” --hn guidelines (there are links to examples in the original)

noman-land13 minutes ago

It's unfortunately the WSBification of the entire society.

paulpauper14 minutes ago

Many of these leading AI companies are operating at large losses and subsidizing users with VC money. Profitability will entail having to impose greater limits and raise prices, so this will reduce to some degree the value proposition of AI compared to humans.

7777777phil36 minutes ago

As soon as tokens stop being subsidized, heavy agentic use will become at least as expensive as paying an (entry-level) employee. When this happens, many companies will trade heavy token usage for (maybe a bit slower, a bit less accurate) employees again.

stult8 minutes ago

You're assuming the price won't come down as the tech matures. That seems like a big assumption, considering how quickly open weights models are catching up to frontier models, and how little effort has been invested so far in optimizing inference costs.

It's especially a crazy assumption to make relative to the costs of employing a human. The costs of paying an entry level employee are unlikely to go down at all, and even if those costs do decline, there's a floor they can't drop below (minimum wage at the extreme end), whereas companies are free to optimize agentic costs as close to zero as possible.

So you are assuming that a cost which is extremely susceptible to optimization but which no one has yet seriously attempted to minimize will remain perpetually above a cost which is much less susceptible to optimization, is already subject to enormous efforts to minimize, and has a legally mandated floor. That seems like a bad bet.

Wowfunhappy25 minutes ago

DeepSeek is an open weights model. It's possible the hosted versions are subsidized, but we know what it costs to run locally. And it's expensive, but it's also pretty clearly cheaper than an employee.

Of course, the latest DeepSeek models are not as good as Claude, but they're not super far off either.

irishcoffee14 minutes ago

They're not far off, but getting the same seamless integration as hosted models is a full time job. I think what just happened is that devops is about to explode. What will naturally follow is local hosting of all the things when people realize subscription costs for cloud-whatever are absurd.

Gitlab is going to take off? This is not investment advice.

Wowfunhappy8 minutes ago

> What will naturally follow is local hosting of all the things when people realize subscription costs for cloud-whatever are absurd.

Even acknowledging we don't know exactly what costs would look like in a world without VC money, wouldn't hosting models logically be cheaper to do at scale in a data center?

When I compared to the cost of running DeepSeek locally, I meant that we can treat that cost as a price ceiling, not the floor.

skybrian13 minutes ago

Maybe this just counts as “light use” since I’m a hobbyist programmer and I only run one coding agent session at a time, but I get about as much done as I did back when I was working while spending a lot of time browsing the Internet, etc.

I’ve spent $10-$20 a day using Claude to write code and closer to $5 a day now that I mostly use Deepseek and GLM, using API pricing (no subscriptions) since I don’t use Claude Code.

This is a rounding error for a company. So I think there’s plenty of room to use AI extensively while being more cost-conscious.

helloplanets11 minutes ago

More straightforward to talk about the hardware directly. Full Kimi K2.6 needs an 8x H200 node to run and serve around 20 heavy users. You can rent an 8x H200 node for around $30/hr.

I'd imagine GPT-5.5 and Claude Opus 4.7 could run just fine on a 16x H200 node and serve at least 10 heavy users without the token output getting choppy.
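As a rough sanity check on the Kimi K2.6 rental figures above, the per-user cost can be worked out directly. This is a back-of-the-envelope sketch: the $30/hr node price and the 20 heavy users come from the comment, while the function names and the 160 working hours per month are assumptions added for illustration.

```python
def cost_per_user_hour(node_cost_per_hour: float, heavy_users: int) -> float:
    """Hourly node rental split evenly across the heavy users it serves."""
    if heavy_users <= 0:
        raise ValueError("heavy_users must be positive")
    return node_cost_per_hour / heavy_users


def monthly_cost_per_user(
    node_cost_per_hour: float,
    heavy_users: int,
    hours_per_month: float = 160.0,  # assumption: node rented only for working hours
) -> float:
    """Monthly per-user cost for a node rented hours_per_month hours."""
    return cost_per_user_hour(node_cost_per_hour, heavy_users) * hours_per_month


# Figures quoted in the comment: 8x H200 node at about $30/hr, about 20 heavy users.
kimi_hourly = cost_per_user_hour(30.0, 20)
kimi_monthly = monthly_cost_per_user(30.0, 20)
kimi_monthly_always_on = monthly_cost_per_user(30.0, 20, hours_per_month=24 * 30)

print(f"${kimi_hourly:.2f} per user-hour")
print(f"${kimi_monthly:.2f} per user-month (working hours only)")
print(f"${kimi_monthly_always_on:.2f} per user-month (node kept up 24/7)")
```

Under those assumptions the node costs about $1.50 per heavy user per hour, or roughly $240 per user per month if it is only rented during working hours. Keeping it up around the clock raises that to about $1,080.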

saghm12 minutes ago

What's funny is that this apparently wasn't something that the Uber COO seemed to think about when their company is arguably one of the most successful ever at the "subsidize to drive down costs until you capture nearly the entire market" strategy.

BadBadJellyBean19 minutes ago

I have been saying the same for a while. Someone always says "but Anthropic is making money on their API" or "but inference will get cheaper". But I don't believe it. First of all, the investments have to be paid off at some point, and second of all, there are other things that cost money. I don't believe that any of them have a positive balance sheet.

I also don't think that blitz scaling will work like with Uber. The engineers are still there. We can work without the LLM tools.

solenoid09379 minutes ago

If by "investments will pay off" you mean major profits, that's never going to happen as long as scaling laws hold. All revenue will just go to financing more compute, and either we hit AGI or have the greatest economic collapse in modern history.

The world will look drastically different 5 years from now, for better or worse, so save every penny (especially if you work in tech).

cryo3231 minutes ago

This is what I’m betting on.

The financials don’t make sense now. Based on the expenditure the finances won’t ever make sense.

pocksuppet18 minutes ago

what the fuck is this timeline I am stuck living in

illithid040 minutes ago

>"He said that, based on talks with Uber's senior engineering leaders, he realized higher token usage did not translate into a proportional increase in useful consumer features."

Goodhart's law strikes again at someone with enough power to be both ignorant of it and make others suffer their ignorance. You cannot simply measure productivity by tokens spent just like you can't measure it by hours spent in a chair at a desk.

colechristensen39 minutes ago

You can measure productivity by hours spent at a desk?

batch1236 minutes ago

You can measure attendance by hours spent at a desk

devttyeu28 minutes ago

Well if you're a devshop just billing hours of mostly low impact work then hours are very much equal to productivity.

saghm10 minutes ago

Next time you're going to work for an hour, ping me, and I bet I can surprise you with how much less productive I am than you

epolanski30 minutes ago

Productivity is measured by economists in $/hour.

Which is why two identical jobs with the same real life output have drastically different productivity.

A nursing home in Luxembourg has 5 times the productivity of one in Romania despite the services being identical and tech-unrelated.

Rohunyyy25 minutes ago

Now we are going to get a new profession. Token Engineer! They will be experts on tokenmaxxing! The job growth that the billionaire CEOs promised us from AI is finally here!

fsloth18 minutes ago

Well, there are already offerings like githits (https://news.ycombinator.com/item?id=46105112) that sort of promise to optimize the bang-per-buck of inference

aplomb10266 minutes ago

[flagged]

nekzn36 minutes ago

It’s funny that “maxxing” entered the common vocabulary.

chihuahua33 minutes ago

If you're not tokenmaxxing, you're getting tokenmogged on the AI leaderboard, and your next review ain't gonna be pretty.

internet200028 minutes ago

A good 80% by volume of the modern vernacular is 4chan language that got sanded down.

nekzn27 minutes ago

Sanding down is how we got goyslop turned into slop.

harvey914 minutes ago

Slop is a word in its own right which got the goy prefix later in life.

amirhirsch14 minutes ago

I like this too. I have been intentionally -maxxingmaxxing to get the meme out there. It's a good canary to sort out who gets the spicy takes from the pedestrians who probably still copy-paste into the ChatGPT web app like a psychopath.

gigatexal an hour ago

I find it useful that if they cut the use altogether I will pay for it out of pocket.

dghlsakjg19 minutes ago

Would you decide its usefulness based on how high the bill is, or how many things you get done while using it?

The former is the issue, and how many companies have been operating. It's like a trucking company ranking driver effectiveness by fuel used instead of by cargo moved.
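The fuel-versus-cargo point can be shown with a toy ranking. This is only an illustrative sketch: the engineers, token counts, and "tasks shipped" numbers are invented, and "tasks shipped" is a stand-in for whatever outcome a company actually cares about.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineerMonth:
    name: str
    tokens_used: int    # the "fuel"
    tasks_shipped: int  # the "cargo" (hypothetical outcome measure)


def rank_by_tokens(rows: list[EngineerMonth]) -> list[str]:
    """Leaderboard as a tokenmaxxing dashboard would show it."""
    return [r.name for r in sorted(rows, key=lambda r: r.tokens_used, reverse=True)]


def rank_by_tasks(rows: list[EngineerMonth]) -> list[str]:
    """Leaderboard by outcome, ignoring how much fuel was burned."""
    return [r.name for r in sorted(rows, key=lambda r: r.tasks_shipped, reverse=True)]


def tasks_per_million_tokens(row: EngineerMonth) -> float:
    """Outcome per unit of spend; zero usage is reported as 0.0 to avoid dividing by zero."""
    if row.tokens_used == 0:
        return 0.0
    return row.tasks_shipped / (row.tokens_used / 1_000_000)


# Invented data: one engineer lets an agent loop for hours to pad the numbers.
rows = [
    EngineerMonth("idle_looper", tokens_used=900_000_000, tasks_shipped=3),
    EngineerMonth("steady", tokens_used=120_000_000, tasks_shipped=12),
    EngineerMonth("frugal", tokens_used=40_000_000, tasks_shipped=10),
]

by_tokens = rank_by_tokens(rows)
by_tasks = rank_by_tasks(rows)
most_efficient = max(rows, key=tasks_per_million_tokens).name

print("by tokens:", by_tokens)
print("by tasks: ", by_tasks)
print("most tasks per million tokens:", most_efficient)
```

In this made-up data the engineer at the top of the token leaderboard is at the bottom of the outcome ranking, which is the failure mode the trucking analogy describes.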

sottol41 minutes ago

Maybe that's the plan :)

But on a more serious note, do we know how much Uber spent per technical employee/month? I assume it is far more than even any of those $200 "max ai" plans.

And the other question is how much the public would be willing to spend, in my estimation this is as "cheap" as it will ever get (main-stream at least).

KronisLV38 minutes ago

> I assume it is far more than even any of those $200 "max ai" plans.

Am in a random small company, colleague spent 100 EUR a day on Sonnet through AWS Bedrock (needed to use an EU region). Paying for tokens will get you in a deep hole financially compared to any of the subscriptions, unless it's like DeepSeek or one of the other models that are priced a bit better, though that's also a tradeoff in what they can/cannot do and also where the data goes. Ended up trying out the Mistral subscription for the US stuff btw, it was fine.

Marciplan37 minutes ago

BigCos don’t get to do the $200 Max plans, they have unlimited plans but get charged like API

sottol29 minutes ago

Exactly. But I did find an article ([1]) and spend doesn't seem that high per engineer ($150 to $250 per eng) - at least on average, I assume the costs were skyrocketing towards the end.

> Adoption climbed from 32 percent of engineers in February to 84 percent classified as agentic coding users by March. By spring, 95 percent of Uber engineers used artificial intelligence tools monthly, and roughly 70 percent of committed code originated from those tools. About 11 percent of live backend updates were written by agents with no human in the loop, according to Uber's own disclosures.

> The numbers behind the spend are what make the story instructive rather than anecdotal. Monthly cost per engineer ranged from $150 to $250 on average, with power users running between $500 and $2,000.

My guess is that the reason to rethink AI-spend was probably the exponential growth in cost over time, and tokenmaxxing payoff not being immediately obvious as mentioned in the article.

[1] https://www.forbes.com/sites/janakirammsv/2026/05/17/uber-bu...

mattlondon28 minutes ago

Probably long term each dev gets their own GPU and runs a model locally I expect. Seems like a more sustainable approach, even if a local model is not absolute SOTA.

iwontberude33 minutes ago

Except you won’t because they will threaten to fire you and force you to route all of your AI through a data protection proxy to stop exfiltration by filtering and tracking prompts/response tokens.

throwaway61374624 minutes ago

[dead]
