CSMastermind25 days ago
Karpathy gave his initial impression: https://x.com/karpathy/status/1891720635363254772
The pull quote is: The impression overall I got here is that this is somewhere around (OpenAI) o1-pro capability
sigmoid1025 days ago
The impression seems to be warranted: Grok 3 has directly jumped to the top of all leaderboard categories in Chatbot Arena: https://lmarena.ai/?leaderboard
In math it shares the top spot with o1 and is just a few points behind (well within errors). In creative writing it is basically ex-aequo with the latest ChatGPT 4o and in coding it's actually significantly ahead of everyone else and represents a new SOTA.
jessfyi25 days ago
lmarena/lmsys is beyond useless, looking at prior rankings of models vs formal benchmarks or testing for accuracy + correctness on batches of real-world data. It's a bit like using a poll of Fox News viewers to discern the opinions of every American; the audience voting is consistently found wanting. Not even getting into how easily a bad actor with means + motivation (in this "hypothetical" instance, wanting to show that a certain model is capable of running the entire US government) can manipulate votes, which has been brought up in the past (yes, I'm aware of the lmsys publication on how they defend against attacks using Cloudflare + reCAPTCHA; there are ways around that.)
sigmoid1022 days ago
So you're saying that either A: users interacting with models can't objectively rate what responses seem better to humans, B: xAI as a newcomer has somehow managed to game the leaderboard better than all those other companies, or C: all those other companies are not doing it. By those standards every test ever devised for anything is beyond useless. But simply not having the model creator running the evaluation already goes a long way.
jessfyi21 days ago
No, I'm saying that some companies are doing it (OpenAI at the very least), the company in question has motive and capability to game the system (kudos to them for pushing the boundaries there), AND the userbase's rankings have been historically, statistically misaligned with data from evals (though flawed), especially when it comes to testing for accuracy + precision on real-world data (outside of their known or presumed dataset). Take a look at how well Qwen or Deepseek actually performed vs the counterparts that were out at the same time vs their corresponding rankings.
In the nicest way possible I'm saying this form of preference testing is ultimately useless, primarily due to a base of dilettantes with more free time than knowledge parading around as subject matter experts and secondarily due to presumed malfeasance. The latter is more apparent to more of the masses (that don't blindly believe any leaderboard they see) now that access to the model itself is more widespread and people are seeing the performance doesn't match the "revolution" promised [0]. If you're still confused why selecting a model based on a glorified Hot or Not application is flawed, perhaps ask yourself why other evals exist in the first place (hint: some tests are harder than others.)
[0](One such instance of someone competent testing it and realizing it's not even close to the "best" model out) https://www.youtube.com/watch?v=WVpaBTqm-Zo
Breza18 days ago
At work, we developed our own suite of benchmarks. Every company with a serious investment in AI-powered platforms needs to do the same. Comparing our results to the Arena turns up some pleasant surprises, like DBRX punching way above its weight for some reason.
sigmoid1020 days ago
You say no, but then go on and explain why you believe a combination of both option A and option B. That's fine I guess, I just don't consider it particularly likely given the currently available information.
numpad025 days ago
Considering that OpenAI subscription is $200 per month, and "Premium Plus" subscription that includes this thing is only $40 per month, does that mean instantaneous "Elon factor" is now at least -$160 per month per user, or is it supposed to be added up to more than -$240 per month?
How would the math change after factoring in that OpenAI isn't even covering entirety of opex with the sub anyway, and/or people finding associating their money and Twitter accounts to be weird, and/or this thing is supposedly running on a bigger cluster than that for OpenAI?
coder54325 days ago
No... sigmoid10 was comparing with o1 (not o1-pro), which is accessible for $20/mo, not $200/mo. So, the "Elon factor" in your math is +$20/user/month (2x) for barely any difference in performance (a hard sell), not -$160/user/month, and while we have no clear answer to whether either of them are making a profit at that price, it would be surprising if OpenAI Plus users were not profitable, given the reasonable rate limits OpenAI imposes on o1 access, and the fact that most Plus users probably aren't maxing out their rate limits anyways. o1-pro requires vastly more compute than o1 for each query, and OpenAI was providing effectively unlimited access to o1-pro to Pro users, with users who want tons of queries gravitating to that subscription. The combination of those factors is certainly why Sam Altman claimed they weren't making money on Pro users.
lmarena has also become less and less useful over time for comparing frontier models as all frontier models are able to saturate the performance needed for the kind of casual questions typically asked there. For the harder questions, o1 (not even o1-pro) still appears to be tied for 1st place with several other models... which is yet another indication of just how saturated that benchmark is.
layer825 days ago
“The impression overall I got here is that this is somewhere around o1-pro capability”.
“Grok 3 + Thinking feels somewhere around the state of the art territory of OpenAI's strongest models (o1-pro, $200/month)”.
coder54325 days ago
The comment I was replying to had replied to an lmarena benchmark link. Perhaps you think that person should have replied to someone else? And, if you want to finish the quote, Karpathy's opinion on this is subjective. He admits it isn't a "real" evaluation.
"[...] though of course we need actual, real evaluations to look at."
His own tests are better than nothing, but hardly definitive.
layer825 days ago
I understood numpad0 to continue the comparison to o1-pro, after sigmoid10 expressed the opinion that the comparison is warranted.
coder54325 days ago
Yes, numpad0 did... but I was pointing out that this choice was illogical. The lmarena results they were replying to only supported a comparison against o1, since o1 effectively matches Grok 3 on the benchmark being replied to (with o1-pro nowhere to be found), and then they immediately leapt into a bunch of weird value-proposition math. As I said, perhaps you think they should have replied to someone else? Replying to an lmarena benchmark indicates that numpad0 was using that benchmark as part of the justification of their math. I also pointed out the limitations of lmarena as a benchmark for frontier models.
I don't think anyone is arguing that ChatGPT Pro is a good value unless you absolutely need to bypass the rate limits all the time, and I cannot find a single indication that Premium+ has unlimited access to Grok 3. If Premium+ doesn't have unlimited rate limits, then it's definitely not comparable to ChatGPT Pro, and other than one subjective comment by Karpathy, we have no benchmarks that indicate that Grok 3 might be as good as o1-pro. You already get 99% of the value with just ChatGPT Plus compared to ChatGPT Pro for half the price of Premium+.
numpad0 was effectively making a strawman argument by ignoring ChatGPT Plus here... it is very easy for anyone to beat up a strawman, so I am here to point out a bad argument when I see one.
AtlanticThird25 days ago
You're the one that came in and told him about the "factor in your math". Like you said, it's his comparison, not yours. If you want to do your own comparison, feel free. But don't come in and tell him he's not allowed to do his comparison. I for one like his comparison.
cyanydeez24 days ago
Guys, yall forget GIGO. First principles.
This thing is produced by musk.
srid25 days ago
Where do you see that Premium+ is $40 per month?
The official source says "Starts at $22/month or $229/year on web", https://help.x.com/en/using-x/x-premium
This is pretty much what I paid a couple of months ago, as a Canadian.
nickthegreek25 days ago
They just announced a price increased today. The link you posted has this info in a blue box at the top.
Also visible here: https://help.x.com/en/using-x/x-premium#tbpricing-bycountry
srid25 days ago
Interesting. In that table, I see $40 for US users. Yet the price remains $30 for Canadian users, despite the weaker Canadian dollar.
AlchemistCamp25 days ago
And it’s only £17 in the UK or €21 in the EU.
[deleted]25 days agocollapsed
colechristensen25 days ago
>Considering that OpenAI subscription is $200 per month
This plan is 75 days old. I didn't know it existed until last week.
OpenAI is starting to try to get a little more realistic revenue in, Grok is acquiring customers.
ben_w25 days ago
Given how fast-moving the field is, it's very difficult to confidently state how much inference costs. Perhaps he's under-charging, perhaps OpenAI is over-charging, one may be more optimised than the other, but new models come out and change everything in less time than it normally takes for actual costs to become public knowledge.
visarga25 days ago
Sometimes it's a matter of approach; one approach could be 5% better and 10x more expensive. So they will find the sweet spot; it takes a few iterations.
arjunaaqa25 days ago
Yes, better to avoid annual subscriptions.
JumpCrisscross25 days ago
Masa Son top ticks a market is somehow still news in 2025.
jimbokun25 days ago
What do we do to assess the intelligence of these models after they are smarter than any human? From the kinds of questions it's answering seems like they are almost there.
Do we have a way to tell if one model is smarter than another at that point?
HarHarVeryFunny25 days ago
Nah, at the end of the day "things that are easy for humans are [still] hard for computers, and vice versa". DeepBlue was super-human at chess and couldn't play tic tac toe. Today's AI is (almost?) super-human at math yet only very recently learned to play tic tac toe, and still can't learn to do anything - because it can't learn, and has no innate drives to expose itself to learning situations even if it could.
Here's a real world intelligence test. Take on each AI as a remote intern/new-hire, and try to train it to become a useful team member (solving math puzzles or manufacturing paperclips does not count).
gf00024 days ago
Almost there? Are we looking at the same thing?
[deleted]24 days agocollapsed
flir25 days ago
> Do we have a way to tell if one model is smarter than another at that point?
Ask them to design a ranking mechanism for you. They are superhuman, after all.
(I really don't think we're going to have to worry about this).
thefourthchime25 days ago
There are things besides measuring intelligence, like humor. Currently, all the bots struggle with making jokes.
tcascais25 days ago
What you probably mean is puzzle-solving intelligence. Humor is a form of intelligence. It's just not only about intelligence - it's also about values and context, for instance. But all this reflects a form of intelligence. Nevertheless, intelligence shouldn't be ranked, at least not in the way we're used to talking about it.
bboygravity25 days ago
[flagged]
ban-evader25 days ago
[flagged]
[deleted]25 days agocollapsed
yodsanklai25 days ago
Naive question from a bystander, but since DeepSeek is open source and is on par with o1-pro (is it?), shouldn't we expect that anybody with the computer power is capable of competing with o1-pro?
tucnak25 days ago
> DeepSeek is open source and is on par with o1-pro (is it?)
There is no being "on par" in this space. Model providers are still mostly optimising for a handful of benchmarks / goals. Like we can already see that Grok 3 is doing incredibly well on human preference (LM Arena), however with Style Control it's suddenly behind ChatGPT-4o-latest, and Gemini 2.0 is out of the picture. So even within a single domain, goal, benchmark—it's not as straightforward as to say that one model is "on par" with another.
> shouldn't we expect that anybody with the computer power is capable of competing with o1-pro?
Not necessarily. I know it may be tempting to think that Grok 3 is entirely a result of xAI having lots of "computer power", but you have to recognise that this mindset is coming from a place of ignorance, not wisdom. Moreover, it doesn't even pass as a "cynical" view, because it's common knowledge that model training is really, really complicated. DeepSeek's results are noteworthy, and really influential in some respects, but they haven't magically "solved" training, or made training necessarily easier / less expensive for the interested parties. They never shared the low-level performance improvements, just model weights and lots of insight. For talented researchers, this is valuable, of course, but it's not like "anybody" could easily benefit from it in their training regimes.
Update: RFT (contra SFT) is becoming really popular with service providers, and it hasn't been "standardised" beyond whatever reproductions have emerged in the weeks prior; moreover, R1's cost is still pretty high[1] at something like $7/Mtok, and bandwidth is really not great. Consider something like Google Vertex AI's batch pricing for the Gemini 1.5 Pro and Gemini 2.0 Flash models, which is at a 50% discount, and their prompt caching, which is at a 75% discount. R1 still has a way to go.
[1]: https://openrouter.ai/deepseek/deepseek-r1/providers?sort=th...
dtquad25 days ago
The full-sized DeepSeek-R1 is on par with o1.
o1-pro is "o1 on steroids" and was the first selling point of the $200/month Pro subscription but they later also added "Deep Research" and Operator to the Pro subscription.
guax25 days ago
Every year it seems like we get worse at naming things in non-confusing ways. I am waiting for the o1-pro-max now, then pro max ultra and pro max ultra plus.
AlchemistCamp25 days ago
Microsoft already mastered this decades ago with a dozen different license tiers for Windows with unintuitive feature sets.
tartuffe7824 days ago
Not to mention the Xbox iterations...
guax21 days ago
Chat GPT series X, not compatible with Chat GPT X.
Frankly, whoever decided on this last-gen naming at MS needs to come forward. I would love to know what crazy, unacceptable collection of circumstances allowed that to happen.
polski-g25 days ago
I was recently laid off from OpenAI. My job was coming up with names for their models.
fragmede25 days ago
It's not a "layoff" if you were fired for poor performance/picking bad names.
rvnx25 days ago
You are welcome to join the Bard team
meekaaku25 days ago
Is this because of USAID funding cut?
theptip25 days ago
It’s hard because there are multiple dimensions being upgraded at different cadences. Architecture, parameter count, etc.
jkestner25 days ago
Then you’ll know what the naming AI was trained on.
gulfofamerica22 days ago
Off by one and naming things.
hmottestad25 days ago
For me, it was the consistency that sold me. o1 does really great at several programming problems, but o1-pro does great on those problems 4 out of 4 tries. I get a good answer more often with o1-pro than with just o1, or even o3-mini-high.
CamperBob224 days ago
o1-pro is indeed pretty great, but I find that I can iterate several times with Gemini 2.0 Pro Experimental (or whatever their latest reasoning model is called these days) between o1-pro's responses. It's almost too slow for interactive use cases.
hmottestad24 days ago
Yeah. I've found out that you can start out with o3-mini-high and then switch over to o1-pro, or the other way around. Helps to iterate a bit faster.
ritz_labringue25 days ago
It's not on par with o1, let alone o1-pro
csomar25 days ago
It's on par/better/worse depending on the problem. o1 is significantly worse, for example, in Rust programming than Claude 3.5; at least for me.
gopher_space25 days ago
Claude really likes producing code, that’s for sure. I feel like it’s a useful tool once I’ve deconstructed a project past a certain point.
nwienert25 days ago
It’s pretty on par with o1, better at many coding questions.
flir25 days ago
I found it better at reasoning, worse at coding.
Not doubting your experience, just thinking how subjective it all is.
golol25 days ago
Deepseek is not on par with o1.
jamalaramala25 days ago
It probably depends on the benchmark you choose; according to Chatbot Arena, Deepseek-R1 ranks similarly to o1-2024-12-17; and Grok3 is just 3% above these models in "Arena Score" points.
golol24 days ago
Chatbot Arena is not really a great benchmark imo
resters25 days ago
Yes it is!
kragen25 days ago
No DeepSeek model is open source; they're freely redistributable, but without source.
samsepi0124 days ago
I guess when it comes to LLM's what is considered the "source" - the weights or the code used to build the weights?
kragen24 days ago
To the extent that the concept is applicable, it would be the training data and the training code.
Grimblewald24 days ago
You're forgetting the flour in your cake recipe: the data, arguably the single most important part.
menaerus25 days ago
You'd still need a fairly large amount of compute power to be able to run DeepSeek R1 locally, no?
roblabla25 days ago
Well yes, but not so large that it's completely prohibitive. People have been running the full models on computers going as low as $6000: https://x.com/carrigmat/status/1884244369907278106
Of course this is for a personal instance, you'd need a much more expensive setup to handle concurrent users. And that's to run it, not train it.
plagiarist25 days ago
Sort of a letdown that after 24 32GB RAM sticks you only get 6-8 tokens per second.
Mekoloto25 days ago
But a token is not just a character.
"hello how are you today?" - 7 tokens.
And this is so much better than I could have imagined in a very short span of time.
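To make the token-vs-character distinction concrete, here's a crude heuristic (my own assumption, not the real tokenizer): English text averages roughly 4 characters per token, so a char count divided by 4 gets you in the neighborhood. The actual count comes from the model's BPE tokenizer, which this does not reproduce.

```python
def approx_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token for English.
    The real BPE tokenizer will differ, usually by a token or two."""
    return max(1, round(len(text) / 4))

# The sentence above is 24 characters -> estimate of 6, close to the
# actual 7 BPE tokens mentioned in the comment.
print(approx_tokens("hello how are you today?"))  # → 6
```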
acchow25 days ago
And only get to use 20k context length before it OOMs
mechagodzilla24 days ago
I have a used workstation I got for $2k (with 768GB of RAM) - using the Q4 model, I can get about 1.5 tokens/sec and use very large contexts. It's pretty awesome to be able to run it at home.
nomel24 days ago
For me, where electricity is $0.45/kWh, assuming 1kW consumption, it would be around $80 USD per million tokens!
CyberDildonics23 days ago
I think you might have to show your math on that one.
nomel23 days ago
They said 1.5 tokens/second. 1 mil tokens is 667k seconds is 185 hours per million tokens. 1kW * 185hr * $0.45/kWh = $80 per million tokens. Again, assuming 1kW, which may be high (or low). The cost of the physical computation is electricity cost.
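Spelling that arithmetic out (the 1.5 tok/s, 1 kW, and $0.45/kWh figures are the thread's assumptions; the exact result is ≈$83 before rounding down to "$80"):

```python
def cost_per_million_tokens(tokens_per_sec: float,
                            watts: float,
                            price_per_kwh: float) -> float:
    """Electricity cost (USD) to generate one million tokens."""
    seconds = 1_000_000 / tokens_per_sec       # time to emit 1M tokens
    kwh = (watts / 1000) * (seconds / 3600)    # energy consumed in kWh
    return kwh * price_per_kwh

print(round(cost_per_million_tokens(1.5, 1000, 0.45)))  # → 83
```

Halving the wattage or the electricity rate halves the figure, which is how the later 500W / Texas numbers in this thread fall out.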
CyberDildonics23 days ago
They said it has a crappy GPU, the whole computer probably only uses 200 - 250 watts.
nomel23 days ago
No way. 768GB of ram will have significant power draw. DDR4 (which this probably is) is something like 3W/8GB. That's > 250W alone.
So, say 500W. That's, for me in my expensive-electricity city, $40/million tokens, with the pretty severe rate limit of 5,400 tokens/hour.
If you're in Texas, that would be closer to $10/million tokens! Now you're at the same price as GPT-4o.
menaerus22 days ago
But you can run and experiment with any model of your liking. And your data does not leave your desktop environment. You can build services. I don't think anybody doing this is doing it to save $20 a month.
nomel22 days ago
Yes. I was only making a monetary comparison.
Related, you can get a whole lot of cloud computing for $2k, for those same experiments, on much faster hardware.
But yes, the data stays local. And, it's fun.
This comment chain is pretty funny.
MysticFear24 days ago
Would love to know more info & specs of your workstation.
mechagodzilla24 days ago
It's an HP Z8 G4 (dual-socket 18-core, 3 GHz Xeons, 24x32GB of DDR4-2666, and then a crappy GPU, 8TB HDD, 1TB SSD). It can accommodate 3 dual-slot GPUs, but I was mostly interested in playing with frontier models where holding all the weights in VRAM requires a ~$500k machine. It can run the full Deepseek R1, Llama3-405B, etc, usually around 1-2 tokens/sec.
fspeech25 days ago
A better approach is to split the model with MOEs running on CPUs and MLAs running on GPU. See the ktransformers project: https://github.com/kvcache-ai/ktransformers/blob/main/doc/en...
This takes advantage of the sparsity of MOE and the efficient KV-cache of MLA.
menaerus24 days ago
You perhaps forgot to mention that for their AMX optimizations to be even feasible you'd need to spend ~$10k for a single CPU, let alone the whole system which is probably ~$100k.
phonon24 days ago
Granite Rapids-W (Workstation) is coming out soon for likely much less than half that per CPU. (Xeon W-3500/2500 launched at $609 to $5889 per CPU less than a year ago and also has AMX).
menaerus24 days ago
Point being? Workstations that are fresh on the market and which have comparable performance of the server counterparts still easily cost anywhere between $20k and $40k. At least this is according to Dell workstations last time I looked.
phonon23 days ago
Supermicro X13SWA-TF Motherboard (16 DIMM slots with Xeon W-3500)= ~$1,000
E-ATX case = ~$300
Power Supply= ~$300
Xeon W-3500 (8 channel memory) = $1339 - $5889
Memory = $300-$500 per 64GB DDR5 RDIMM
Memory will be the major cost. The rest will be around $5,000. A lot less than "$100,000"!
menaerus23 days ago
I acknowledged in my last comment that the cost doesn't have to be $100k but that it would still be very high if you opted for the workstation design. You're gonna need to add one more CPU to your design, add another 8 memory channels, beefier PSU, and a new motherboard that can accommodate this. So, 8k (memory) + 10k (cpus) + the rest. As I said, not less than $20k.
phonon23 days ago
Why does it have to be a dual CPU design? 8 channels of DDR5 4800 will still get you something like 300 GB per second bandwidth. Not amazing, but OK. Granite Rapids-W will likely be something like 50% better (cores and bandwidth).
And the original message you were responding to was using a CPU with AMX and mixing it with a GPU like an Nvidia 4090/5090. That way the large part of the model sits in the larger, slower memory, and the active part in the GPU with the faster memory. Very cost effective and fast. (Something like generating 16 tokens/s of 671B Deepseek R1 with a total hardware cost of $10-$20k.) They tried both single and dual CPU, with the latter about 30% faster... not necessarily worth it.
https://github.com/kvcache-ai/ktransformers/blob/main/doc/en...
menaerus23 days ago
> 8 channels of DDR5 4800 will still get you something like 300 GB per second bandwidth.
That's the theory. In practice, Sapphire Rapids needs 24-28 cores to hit the 200 GB/s mark and it doesn't go much further than that. Intel CPU design generally has a hard time saturating the memory bandwidth so it remains to be seen if they managed to fix this but I wouldn't hold my breath. 200 GB/s is not much. My dual-socket Skylake system hits ~140 GB/s and it's quite slow for larger LLMs.
> Why does it have to be a dual CPU design?
Because memory bandwidth is one of the most important limiting (compute) factors for larger models inference. With dual-socket design you're essentially doubling the available bandwidth.
> And the original message you were responding to was using a CPU with AMX and mixing it with a GPU like Nvidia 4900/5900.
Dual-socket CPU that costs $10k on a server that probably costs a couple of factors more. Now you claimed that it doesn't have to be that expensive, but I beg to differ - you still need $20k-$30k worth of equipment to run it. That's a lot and not quite "cost effective".
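The bandwidth-bound argument above can be sketched with back-of-envelope numbers. For CPU decode, each generated token has to stream the active weights from RAM once, so memory bandwidth puts a hard ceiling on tok/s. The inputs below are illustrative assumptions (R1-style MoE with ~37B active parameters at 4-bit, i.e. ~0.5 bytes/param), not measurements:

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float,
                         active_params_billions: float,
                         bytes_per_param: float) -> float:
    """Upper bound on decode speed: bandwidth / bytes streamed per token."""
    bytes_per_token = active_params_billions * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token

# Single socket at a measured ~200 GB/s vs dual socket at ~400 GB/s:
print(round(decode_ceiling_tok_s(200, 37, 0.5), 1))  # → 10.8
print(round(decode_ceiling_tok_s(400, 37, 0.5), 1))  # → 21.6
```

These ceilings sit just above the ~8.7 and ~12-13 tok/s figures quoted later in the thread, which is consistent with decode being bandwidth-bound rather than compute-bound.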
phonon23 days ago
The proof of the pudding is in the eating. Read the link above. It's one or two mid range[1] Sapphire Rapids CPUs and a 4090. Dual CPU is faster (partially because 32->64 cores, not just bandwidth) but also hit data locality issues, limiting the increase to about 30%.
(Dual Socket Skylake? Do you mean Cascade Lake?)
If you price it out, it's basically the most cost effective set-up with reasonable speed for large (more than 300 GB) models. Dual socket basically doubles the motherboard[2] and CPU cost, so maybe another $3k-$6k for a 30% uplift.
[1] https://www.intel.com/content/www/us/en/products/sku/231733/... $3,157
[2] https://www.serversupply.com/MOTHERBOARD/SYSTEM%20BOARD/LGA-... $1,800
menaerus23 days ago
Yes, dual socket Skylake. What's strange about that?
Please price it out for us because I still don't see what's cost effective in a system that costs well over $10k and runs at 8 tok/s vs the dual zen4 system for $6k running at the same tok/s.
phonon22 days ago
Sorry. Didn't realize you meant Skylake-SP.
I am not sure what your point is? There are some nice dual socket Epyc examples floating around as well, that claim 6-8 tokens/s. (I think some of those are actually distilled versions with very small context sizes...I don't see any as thoroughly documented/benchmarked as the above). This is a dual socket Sapphire Rapids example with similar sized CPUs and a consumer graphics card that gives about 16 tokens/second. Sapphire Rapids CPU and MB are a bit more expensive, and a 4090 was $1500 until recently. So for a few thousand more you can double the speed. Also the prompt processing speed is waaaaay faster as well. (Something like 10x faster than the Epyc versions.)
In any case, these are all vastly cheaper approaches than trying to get enough H100s to fit the full R1 model in VRAM! A single H100 80 GB is more than $20k, and you would need many of them + server just to run R1.
menaerus22 days ago
I don't argue their idea, which is sound, but I argue that the cost needed to achieve the claimed performance is not "for a few thousand more" as you stubbornly continue to claim.
The math is clear: single-socket ktransformers performance is 8.73 tok/s and it costs ~$12k to build such a rig. The same performance one gets from a $6k dual-EPYC system. It is a full-blown version of R1 and not a distilled one as you say.
Your claim about 16 tok/s is also misleading. It's a figure for 6 experts while we are comparing R1 with 8 experts against llama with 8 experts. 8 experts on dual-socket system per ktransformer benchmarks runs at 12.2 - 13.4 tok/s and not 16 tok/s.
So, ktransformers can roughly achieve 50% more in dual-socket configuration and 50% more than dual-EPYC system. This is not double as you say. And finally, the cost of such dual-socket system is ~$20k and therefore isn't the "best cost effective" solution since it is 3.5x more expensive for 50% better output.
And tbh llama.cpp is not quite optimized for pure CPU inference workloads. It has this strange "compute graph" framework which I don't understand the purpose of. It appears completely unnecessary to me. I also profiled a couple of small-, mid- and large-sized models, and the interesting thing was that the majority of them turned out to be bottlenecked by CPU compute on a system with 44 physical cores and 192G of RAM. I think it could do a much better job there.
phonon21 days ago
Are we doing this?
Cheapest 32 core latest EPYC (9335) x 2 = $3,079.00 x 2
Intel 32 Core CPU used above x 2 = $3,157 x 2 (I would choose the Intel Xeon Gold 6530 which is going for around $2k now, and with with higher clock speeds, and a 100 MB of more cache)
AMD Epyc Dual Socket Motherboard Supermicro H13DSH = $1899
Intel Supermicro X13DEG-QT = $1,800
Memory, PSU, Case = Same
4090 GPU = $1599 - $3,000 (temporary?)
Besides the GPU cost, the rest is about the same price. You only get a deep discount with AMD setups if you use EPYCs a few years old with cheaper (and slower) DDR4.
And again, if you go single CPU, you save over $4,000, but lose around 30% in token generation.
The "$6,000" AMD examples I've seen are pretty vague on exactly what parts were used and exactly what R1 settings including context length they were run at, making true apple to apple comparisons difficult. Plus the Sapphire Rapids + GPU example is about 10x faster in prompt processing. (53 seconds to 6 seconds is no joke!)
menaerus21 days ago
> Are we doing this?
Yes, you're blatantly misrepresenting information and moving goalposts. Right now it has become clear that you're doing this because you're obviously affiliated with ktransformers project.
$6k for 8 tok/s or $20k for 12 tok/s. People are not stupid. I rest my case here.
menaerus25 days ago
6k is not that bad considering that top of the line Apple laptop costs as much. However, I don't have X so unfortunately I can't read the details.
longitudinal9325 days ago
You can read the whole thread through nitter:
dang25 days ago
Related ongoing thread:
Andrej Karpathy: "I was given early access to Grok 3 earlier today" - https://news.ycombinator.com/item?id=43092066 - Feb 2025 (48 comments)
xiphias225 days ago
I don't see the Think button, and for me the answers are well below deepseek-r1 even though I have a Premium+ subscription. I'm just getting instant, stupid answers instead of thinking.
joeevans100024 days ago
How can anyone repeatedly use a question like this without new models getting trained on it via online discussion?
rendang25 days ago
Grok has gotten to the top of one benchmark:
https://x.com/lmarena_ai/status/1891706264800936307
It's been said before but it is great news for consumers that there's so much competition in the LLM space. If it's hard for any one player to get daylight between them & the 2nd best alternative, hopefully that means one monopolistic firm isn't going to be sucking up all the value created by these things
qingcharles25 days ago
I've spent the last hour testing it and I'm blown away. And this is coming from a very hardcore user of OpenAI/Claude products on a daily basis.
It passed every goofy test I have for writing articles which involves trying to surface arcane obscure details. (it certainly means however they are scraping the Web they are doing a good job here)
It made the database code I wrote over the last week with o3/o1/GPT4o/Claude3.5 look like a joke.
It fills me with rage over who owns this thing.
Even if people tank Tesla's car business and run Twitter into the ground, I think our new Galactic Edgelord is going to win his first trillion on xAI and Teslabots anyway.
btw: it tried to charge me $40/mo for this thing: https://imgur.com/a/QXslgBo
RobinL25 days ago
Apologies for possibly stupid question but where can you use it right now? Just on 'direct chat' on https://lmarena.ai/ or is there a better alternative? Or do you have early access?
int_19h25 days ago
You need an X Premium Plus subscription on Twitter.
qingcharles25 days ago
I was using it on grok.com, logged in via a Twitter account. But I notice it just got added to the Grok tab on Twitter a moment ago.
Also, the "Deep Search" button was not available when it first went live, so I'm retesting everything again with this feature enabled, which I assume is a reasoning version of the same model.
giancarlostoro24 days ago
One neat feature is you can use Grok on any tweet, its helped me find context to obscure tweets many times over, very quickly.
jug24 days ago
Hopefully, you’ll be able to avoid the whole X Premium Plus thing in the near future with OpenRouter. It’ll still use xAI backend but via your OpenRouter API key. Then you can use it with any web or mobile app that supports OpenRouter.
Personally, I wouldn’t use it though. What’s going on with Elon Musk right now is completely insane. I hope to see OpenAI’s GPT-4.5 & GPT-5 releases to catch up soon, if nothing else. Announced for this year.
giancarlostoro25 days ago
For whatever it's worth, I see the devs asking for feedback frequently enough, so I suspect that if you tweet about Grok, or reply to any of those threads, they definitely read them, even if they don't respond / interact. It shows. I've seen improvements based on feedback I see others make almost instantly.
palata25 days ago
[flagged]
TMWNN25 days ago
[flagged]
Mekoloto25 days ago
That's not what Musk looks like at all :D
And let's see if Musk is pushing too many people too far. Everything he currently does could blow up in his face very fast.
crocowhile25 days ago
It's not good news when this competition comes at the cost of a gigantic, overinflated bubble, in which all the big players keep sucking billions from investors without even having a business model.
This hype will burst sooner than later and will trigger yet another global recession. This is untenable.
bobxmax25 days ago
ChatGPT is literally generating billions in revenue. Cursor is the fastest growing company of all time.
This lame HN trope of LLMs having no business model needs to die.
latexr25 days ago
> ChatGPT is literally generating billions in revenue.
It’s losing more billions than what it’s generating. Revenue does not equate profit.
https://www.cnbc.com/2024/09/27/openai-sees-5-billion-loss-t...
spacebanana725 days ago
True, but presence of significant revenue is still promising. It's much better to have an "expensive compute" problem than a "nobody wants to pay for the product" problem.
jsheard25 days ago
Keep in mind that not only is OpenAI being directly propped up by investor hype, the downstream API users who contribute much of their revenue are also being propped up by investor hype. A big chunk of OpenAIs revenue is actually even more VC money in a trenchcoat.
mullingitover25 days ago
The biggest marker of a bubble, to me, is that you have money-losing startups selling to other money-losing startups. On paper you see a lot of 'line go up' but it's just a lot of circulation in a closed body of water which will eventually evaporate.
holoduke25 days ago
Uber doesn't agree
krainboltgreene24 days ago
Uber is an outlier because in a functioning economy that valued workers we wouldn’t have shipped all our jobs overseas and made gig economies the last line between housing and street schizophrenia.
mechagodzilla24 days ago
Uber's revenue was never coming from other food delivery startups.
1shooner24 days ago
>True, but presence of significant revenue is still promising.
If I started selling $5 bills for $1, I could generate a lot of revenue with $150B. You wouldn't believe the demand we'd see for $5 bills.
latexr25 days ago
> It's much better to have an "expensive compute" problem than a "nobody wants to pay for the product" problem.
That is only true if your primary concern in life is personal wealth and you're burning other people's money.
spacebanana725 days ago
YouTube is an optimistic example.
The bandwidth costs made it deeply loss making for a long time despite having loads of engagement and ad revenue. However over time they became more cost efficient at sending video over the internet and became profitable.
This strategy obviously doesn't always work, with WeWork being the canonical example. But it's not guaranteed to fail either.
athrowaway3z25 days ago
YouTube's network effect creating a winner-take-most was recognized, pitched, and valued from the very start.
The capabilities of LLMs are impressive, but none of them have published an idea I consider to have the same potential for a trillion $ monopoly that the current hype looks like.
There are far more similarities with the dot-com hype.
No critical first-mover advantage in sight. All parts are replaceable with the cheapest variant, with little to no downside to users.
mike_hearn24 days ago
It wasn't obvious at the time YouTube would have a network effect though. It was very dependent on coming up with a great recommendation algorithm, along with monetization and revenue sharing. At the time, YouTube didn't have anything like that, iirc.
athrowaway3z23 days ago
Even the basic front page of youtube was of immediate and obvious value to a creator, and would increase disproportionately in value the more people were on YouTube. The same goes for Amazon, and the same goes for Facebook.
All the LLM providers are extremely useful tools. But currently I only see the "non-monopoly", proportional improvement when their userbase grows from 100 to 1000.
But I might be wrong, and I wouldn't be surprised if in hindsight it will be obvious what the real disproportionate advantages there were to be found.
Gothmog6925 days ago
They bought youtube for 1.65 billion which is pennies on the dollar compared to what it is worth today.
latexr25 days ago
Technically true, though in fairness it is unlikely the original owners would have gotten YouTube to where it is today. On the other hand there are companies who didn’t recognise they were nothing more than passing fads, refused buyouts, and crumbled.
bloomingkales25 days ago
[flagged]
athrowaway3z25 days ago
You could have asked any one of the dozen available LLMs to review this comment.
Most of them would have responded by explaining what a monopoly is, and why this reply makes little sense.
bloomingkales25 days ago
[flagged]
crocowhile25 days ago
YouTube did not have competitors and certainly not open source competitors.
sebastiennight20 days ago
I was there when YouTube became a thing, and I was running a music video-hosting website that I had built myself (on top of phpBB, even). We were encoding videos in Windows Media and RealPlayer formats.
There were LOTS of funded competitors to YouTube between 2006 and 2009, including Viddler (who paid Gary Vaynerchuk a small fortune to host his WineLibraryTV show there exclusively), DailyMotion (which is still alive today, although no longer a threatening contender), etc.
In 2009 I had a coaching business and was buying marketing courses and software which would deploy your videos across 40+ different video websites (including Google Video which was a separate thing until they acquired YouTube and merged those), and YouTube wasn't yet amounting to 50% of our video traffic.
I think you might be mistaken with the bold statement above.
sebzim450025 days ago
This is a bit before my time but I remember a bunch of competitors to YouTube. They just all sucked.
iteratethis24 days ago
It's loss-making at current usage, and usage per user will grow exponentially.
CraigRood25 days ago
I'm not sure how promising that is. I can't help but see how easy it would be to switch API endpoints to a different platform.
jonas2125 days ago
They're still early on the growth curve where there's enough opportunity for future growth that investing in scaling and improvement is more important than turning an immediate profit.
Remember when everyone on HN was sure Uber would never be profitable? Or Tesla? Or Amazon?
devin25 days ago
I do remember that, and I would say that they are still largely correct. Tesla needed government subsidies, Amazon needed AWS, and Uber needed a pandemic and Uber eats. The core businesses that HNers were referring to are still weak.
krainboltgreene24 days ago
Tesla needs government subsidies. Uber needs a broken economy.
bobxmax25 days ago
As did almost every large tech company today. Amazon lost money for decades.
Are we really still doing this nonsense? If OpenAI wanted to become profitable, they could do it inside of 12 months. Growing companies don't care about profitability (nor should they).
athrowaway3z25 days ago
You're way too smug for spewing what is clearly survivorship bias.
All currently known profitable use cases are competing on price. All the unicorn examples you're biased toward had the network effect of being the largest in their pitch decks.
OpenAI, Grok, etc. have shown no unique value prop or idea with monopoly potential.
Hamuko25 days ago
Revenue or profit? WeWork at a time also did billions in revenue.
phillipcarter25 days ago
WeWork trapped themselves into a real estate hole, selling services for less than they rented property for.
OpenAI is currently in an explicitly non-profit-seeking mode, using a technology for which we have demonstrated 10-100x or greater decreases in the compute needed to achieve the same outcomes.
This is not a declaration that OpenAI will become wildly profitable. This is just me saying that these aren't comparable companies.
linuxftw25 days ago
WeWork was a scam to enrich the founders of the company. The founders owned or had interests in many of the properties that WeWork leased. I'm surprised no one was thrown in prison.
LorenDB25 days ago
OpenAI is losing money on their $200/mo (!!) Pro subscription[0].
[0]: https://www.theregister.com/2025/01/06/altman_gpt_profits/
sebzim450025 days ago
But they're making money off their subscriptions in general. They lose it all on training models and R&D.
mullingitover25 days ago
I wonder how long it'll last. Just using myself as a demo customer: I canceled my subscription because Google AI Studio was doing more for me, and it's free. OpenAI is not really competitive at $20 a month anymore.
sebzim450024 days ago
Yeah not sure. I cancelled a while ago but I subscribed again once o1-preview came out and now o3-mini exists I still find it useful.
Of course, they are clearly cooking something or they wouldn't have just published a benchmark in which they do badly.
HarHarVeryFunny25 days ago
What source(s) are there for cursor's growth rate/revenue ?
HarHarVeryFunny23 days ago
So, answering my own question, there is this.
https://sacra.com/research/cursor-at-100m-arr/
Sounds legit.
aprilthird202124 days ago
Yeah I would be shocked to see that Cursor is the fastest growing company of all time by a good metric...
giancarlostoro24 days ago
> Cursor is the fastest growing company of all time.
I assume you're referring to this:
https://sacra.com/research/cursor-at-100m-arr/
It went from $10M to $100M ARR.
crocowhile25 days ago
You are comparing apples with oranges. Cursor is not an LLM, and yes, it has a business model. So do OpenRouter and a million other applications that can switch APIs to the lowest bidder at any moment.
ben_w25 days ago
Lots of people derive great value from things that are too easily reproduced to be directly profitable.
Google gives everyone free access to a good spreadsheet tool, even though Microsoft Office exists.
Web browsers are free, despite the value of the entire internet.
Compilers are free, despite the value of all software collectively.
LLMs being really valuable doesn't automatically mean anyone can get rich from them.
I think everyone last year parroting "moat!" was cringe (like Altman of all people wouldn't know about this already, c'mon), but you do actually need something that other people don't have. I expect Altman already has stuff in mind, but he's hardly the only one, and that means it's a game of "which multibillionaire with lots of experience building companies will win?", and that's hard for non-skilled normies (in this case including me) to guess.
KKKKkkkk125 days ago
HN has turned into the Slashdot of the 2000s. No wireless? Less space than a Nomad? Lame.
fragmede25 days ago
HN already had its iPod moment back in 2007. /.'s iPod moment was in 2001, not as long before that as I would have guessed.
Re: Dropbox, from a well known user. It didn't age well and we've been asked not to repeat it because it makes the author with connections to this site's operator look bad.
> 1. For a Linux user, you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem. From Windows or Mac, this FTP account could be accessed through built-in software.
https://news.ycombinator.com/item?id=8863
the controversy: https://news.ycombinator.com/item?id=27067281
jedberg25 days ago
I suspect a lot of the active users on HN today don't remember Slashdot of 2000...
Ray2025 days ago
I think the business model there is pretty simple: to be on the front line when AI enters the category of landscape-changing trillion-dollar technologies. And investors keep pouring in their billions for exactly that business model.
>This hype will burst sooner than later and will trigger yet another global recession.
It seems too small a bubble for a global recession. And if it is a bubble at all, there is every reason to believe the strategy will work with significant probability.
moduspol25 days ago
IMO that's not really a business model. That's a hope that you can come up with one by being at (or near) the front of the pack if one materializes.
jsheard25 days ago
See also: Meta's previous push into VR/AR/Metaverse. They spent a hundred billion to be at the front of the pack when that revolutionary world-changing paradigm-shift took off... which simply didn't happen.
satvikpendem25 days ago
Their Orion glasses are apparently mindblowing in fidelity as well as the lightness of the glasses. Someone will absolutely make the smart glasses paradigm work so that we don't need to carry around phones anymore, and Zuck is racing to be first. This is because he lost out on the platform wars and was at the mercy of Apple and Google; remember Apple's privacy update that killed much of Meta's revenue? Zuck doesn't want a repeat of that by owning his own platform.
AlchemistCamp25 days ago
Seconded. I’m very excited for the day when/if their dev platform is opened up and it’s possible to access pass-through vision.
There’s a whole class of educational apps that could open up for people learning in the physical world. Whether it’s building physical things, sports or reading books or notes written in non-latin scripts... the impact will be enormous!
satvikpendem25 days ago
The only thing I'm concerned about is it'd be another locked down platform, like Oculus / Quest already is, only now much more disruptive just like Apple's and Google's (to a smaller extent). I want something more like Windows Mixed Reality or Steam VR to succeed more.
crocowhile25 days ago
That's a bit different though. Meta invested in a product that, as of now, has very little competition. The Quest is sold at a slight loss, but at least at an approachable price and in volumes that make them the clear leader in the market at the moment. Moreover, their OS is open source. Clearly, what they want is to sell enough headsets to get a monopoly on the ecosystem and its apps (they basically want to build an Android Play Store for VR). You may argue they are far from that, but at least it's a clear business model.
OpenAI's business model was literally "we don't have one: we'll make AGI and we'll let AGI tell us how to make money". This is so idiotic it's not even a scam. xAI will compete on the same playing field. Not sure about Anthropic: they seem a bit more sane.
idiotsecant25 days ago
If that scenario comes to fruition, it's literally the only viable business model. Everyone else gets eaten alive.
moduspol25 days ago
Apparently not. Apparently xAI can catch up in a year. And we already saw what happened with DeepSeek.
What does the scenario look like where everyone else gets eaten alive?
logicchains25 days ago
xAI had the help of the world's richest and arguably the world's most powerful man; most other companies don't have that.
thegeomaster25 days ago
Every bubble has a narrative.
qgin25 days ago
The premise is that this ultimately replaces all intellectual and physical labor for the rest of time. It’s possible it becomes commoditized as soon as it exists, but in terms of investment dollars it’s either worth as much as you can spend or nothing at all.
pjc5025 days ago
> ultimately replaces all intellectual and physical labor for the rest of time
Sounds incredibly valuable, but in reality collapses into Butlerian Jihad fairly quickly when you have 90% unemployment.
Edit: if the claims are true, then this will be far more destabilizing than social media. What do elections mean when the AI-guided political parties are putting out AI press releases for the AI press summaries, which voters have read out to them through their AI? What happens when >50% of the voters ask the AI who they should vote for? You end up with an AI dictatorship where the levers of discontent are fake.
ben_w25 days ago
> Sounds incredibly valuable, but in reality collapses into Butlerian Jihad fairly quickly when you have 90% unemployment.
But nobody really knows whether that happens as a consequence, let alone quickly, because so far the transition itself has only happened in fiction.
Whatever does happen, I think it's going to be a surprise to at a minimum 80% of the population.
conartist625 days ago
Yeah, at some point it seems inevitable that if machines do all the work that creates real "value" and people have no comparable value, then in a very practical sense we will all be slaves to the machines.
thrance25 days ago
Like in Dune, we won't be slaves to the machines, but to the people owning the machines.
mwigdahl25 days ago
"Once men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them." -- Frank Herbert, _Dune_
kadushka24 days ago
Are we free today? For example, I have to work for a living. If I don't, my family and I will be miserable. Let's just hope that future "men with machines" don't decide to kill the rest of us - I'm not sure what use we will be to them.
qgin24 days ago
I think this is the key change. We’re already beholden to a “machine” (the economy) that none of us completely understand or control or created explicitly. It has its own goals and tendencies that emerged from the complexity.
What AI and robotics do is actually create a machine that has no use for humans at all.
fragmede24 days ago
Your choice is corporations, government, or billionaires. One of those is going to be the "men with machines" that has use for the rest of us.
jimbokun25 days ago
The dot com bubble wiped out many billions of dollars in valuation.
The dot com bubble also gave us the most valuable companies in history, like Google, Apple, Amazon, Facebook, etc.
rendang24 days ago
The big companies could crash significantly, but if the technology keeps bringing productivity gains, it will have a big positive impact on GDP over the next decade
rukuu00125 days ago
It’s a battle royale. Whoever lasts longest gets to profit at leisure
oldpersonintx25 days ago
[dead]
aprilthird202125 days ago
I think it's already clear that these are going to be commoditized and the free / open source versions will be good enough to capture enough of the value that the remaining players will not be Facebook-level monopolies on the space
rendang25 days ago
Apparently it isn't clear to the investors valuing OpenAI at >300B. Possibly they're betting that the ecosystem & integrations around their models will generate a certain amount of lock-in or otherwise make the difference in a close-to-even field
riffraff25 days ago
Investors thought someone renting office space was going to revolutionize the world and valued their company at $50B.
jeswin25 days ago
I don't think it's automatically a bad idea. Offices require a lot of support: networking, security, maintenance, certifications, etc. There are efficiency gains in scaling. In addition, WeWork is useful for companies that hire employees in different cities.
snowwrestler25 days ago
Lots of things are good ideas but investing is about price vs value. Good ideas can be overpriced as easily as bad ideas.
hmottestad25 days ago
Funnily enough, a lot of the open source world has landed on an API that is basically a copy of OpenAI's. So if you develop against OpenAI, it's almost a slot-in change to switch to an open source solution.
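Concretely, switching providers is mostly a matter of pointing the same request shape at a different base URL. A minimal sketch of that idea, assuming the common OpenAI-style `/chat/completions` endpoint (the local-server URLs below are illustrative defaults for vLLM and Ollama, not guaranteed for any particular setup):

```python
def chat_config(provider: str, model: str) -> dict:
    """Build the request settings for an OpenAI-style /chat/completions call.

    Only the base URL (and the API key, not shown) changes between
    providers; the JSON payload shape stays identical, which is why
    moving between backends is nearly a drop-in change.
    """
    base_urls = {
        "openai": "https://api.openai.com/v1",
        "vllm": "http://localhost:8000/v1",     # vLLM's OpenAI-compatible server
        "ollama": "http://localhost:11434/v1",  # Ollama's compatibility endpoint
    }
    return {
        "url": f"{base_urls[provider]}/chat/completions",
        "json": {
            "model": model,
            "messages": [{"role": "user", "content": "Hello"}],
        },
    }

# Same payload structure, different host:
print(chat_config("openai", "gpt-4o")["url"])
print(chat_config("vllm", "my-local-model")["url"])
```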
pzo25 days ago
and on top of that you have solutions like openrouter.ai where you can route inference easily with a combobox
aprilthird202124 days ago
This reminds me of a comedy sketch where a guy is interviewing for a job at a startup, finally gets to the last round and meets the founder, and he tells him the whole thing is an illusion for investors
fragmede24 days ago
The bet is whether they can produce AI that can replace a certain level of generic office worker: a bot that you can add to Slack and give tasks to do.
apples_oranges25 days ago
Well, now their job is to keep up the illusion until they have cashed out or offloaded the investment to somebody else.
bigbones25 days ago
The IP rights holders have yet to bare their teeth. I don't think the outcome you suggest is clear at all; in fact, I think the opposite is the most probable outcome. I've lost count of the number of technology epochs that, at the time, were either silently or explicitly dependent on ignoring the warez aspects while everyone was blinded by the possibilities: Internet video, music, and film all went through this phase. GPTs are just a new medium, and by the end of it royalties will in all likelihood still end up being paid to roughly the same set of folk as before.
I quite like the idea of a future where the AI job holocaust largely never happened because license costs ate up most of the innovation benefit. It's just the kind of regressive greed that keeps the world ticking along, and I wouldn't be surprised if we ended up with something very close to this.
beeflet25 days ago
Good historical comparison, but I doubt it this time, because there is plausible deniability about whether a model was trained on a given piece of data.
Also, the pool of public domain data is always increasing, so the AI will eventually win in any case, even if we have to wait 100 years
bigbones25 days ago
As I recall it, there was a time when copyright infringement on YouTube was so prolific that the rightsholders essentially forced creation of the first watermarking system that worked at massive scale. I do wonder if any corners of research are currently studying the attribution problem with the specific lens of licensing as its motivation
beeflet25 days ago
Yeah that was the old Viacom vs Youtube days. Here is a great video if you have half an hour to spare: https://www.youtube.com/watch?v=qV2h_KGno9w . Pretty funny court case where it turns out viacom was violating their OWN copyright... set a massive precedent.
But one thing this reminds me of is the idea of a "trap street", something mapmakers used to do was put in false locations on their maps to prove that other mapmakers were copying them: https://en.wikipedia.org/wiki/Trap_street . I figure you could do something similarly adversarial with AI to pollute the public training data on the internet. IDK like adversarial attacks on image classifiers https://www.youtube.com/watch?v=AOZw1tgD8dA . With an LLM you could try to make them into a manchurian candidate.
kragen25 days ago
An environment where royalties inflate the pricing of ChatGPT by orders of magnitude seems like an environment where hosted models would be at a big disadvantage against whatever you can manage to get running on a pile of Macs in your garage.
tiahura25 days ago
If your business model depends on the Roberts’ court kneecapping AI, pivot.
Ray2025 days ago
>I quite like the idea of a future where the AI job holocaust largely never happened because license costs ate up most of the innovation benefit.
Not quite realistic. You are talking about very large benefits, in favor of which licenses will be abandoned. And as for those who don't abandon them... well, look at the Amish settlements.
bigbones25 days ago
I'd put solid money on Warner earning a few cents every time an AI girlfriend somewhere sings happy birthday within 10 years
oldpersonintx25 days ago
[dead]
thefourthchime25 days ago
Exactly. I use GPT4o for nearly everything, and occasionally, I'll need o1. For 95% of what I do, it's already good enough.
bobxmax25 days ago
The vast majority of people couldn't care less about open source
jonlucc25 days ago
If you're paying $200/month for something I can do with open source software and $10/month of compute, why wouldn't I offer you the service for $100/month? And then someone offer it for $50?
Not everyone has to know about, understand, or use open source solutions for it to open the field.
Mekoloto24 days ago
Right now you can't run it that cheap at home.
You need to pay the energy bill, do the updates/upgrades, and build an LLM rig.
Nvidia's Project Digits could be very interesting, but that box will cost $3k.
We are a lot closer to running it at home than I assumed we would be, but plenty of people prefer SaaS over doing stuff themselves.
bobxmax25 days ago
If you can do a $200/mo service for $10/mo, the closed source provider will reduce their prices to $15/mo and beat you.
This is just a weird dichotomy you're introducing. Open source will introduce price pressure, as any competition will; that doesn't mean there won't be a monopoly.
dauhak25 days ago
If you have virtually no pricing power and have to drop your $200/mo to $15/mo, that's a big deal if your $300bn valuation assumes that won't happen, which is OP's point.
Idk what you mean by saying this doesn't preclude a monopoly; having your pricing power eroded by competition is kind of the defining feature of a market that isn't monopolistic.
bobxmax23 days ago
Not at all. A monopoly doesn't imply a rigid price curve; in fact, monopolies almost never have one.
A monopoly means a company has enough leverage to corner and disproportionately own the market. This is entirely possible (and usually the case) even with significant pricing pressure.
gopher_space25 days ago
I think you're both missing a bigger picture. How many of these services can now be replicated in-house by a single developer? Which part of the service actually costs money once that dev deconstructs the process?
Feels like I won't be paying for anything that isn't real-time. And that any time delay I can introduce in my process will come with massive savings. Picture hiding the loading of loot info behind a treasure chest opening animation in a game, except that time difference means you can pull all the work in-house.
Openrouter.ai seems like a step in the right direction but I'd want to do all their calculations myself as well as factor in local/existing gear in a way they don't.
nkozyra25 days ago
That's true, but if someone sells you a one-time-purchase box/gadget/phone that will do a snapshot SOTA work and not cost you $20-$200/mo in subscriptions, a lot of people would be in.
Right now the average person has to go through a vendor with a web app, there's not a lot of room for the public to explore.
Things could change in a hurry.
guax25 days ago
They don't seem to care about AI either. The vast majority of people care about the value they're getting; companies care about open source because it's usually free.
I don't think we expect a company solely making a proprietary web server to be a $300B behemoth anymore. OpenAI might end up with the same model as Nginx or Docker if they don't pivot or find a different one.
croes25 days ago
Who cares about benchmarks?
These things still cost me time because of hallucinations.
ban-evader25 days ago
You’re a very poor user of LLMs if they’re not a net time saver for you.
Cheer217125 days ago
So the No True Scotsman fallacy of LLM productivity?
gdhkgdhkvff25 days ago
Most people do see productivity gains from using LLMs correctly. Myself included. Just because some people don’t learn how to use them correctly doesn’t mean LLMs aren’t helpful. It’s like when internet search came out and a handful of laggards tried it once, failed to get the exact perfect result, and declared “internet search is useless”. Using a tool wrong is not evidence of the tool being useless, it’s evidence that you need to learn how to use the tool.
smeeger25 days ago
Hallucinations are literally the finger in the dam. If these models could sense when an output is well-founded, and simply say "I don't know" otherwise... say goodbye to your job.
gdhkgdhkvff25 days ago
Googling a question and finding an incorrect answer every now and then doesn’t mean that googling is useless. It means that you need to learn how to use google. Trust but verify. Use it for scenarios where you aren’t looking for it to be the trusted fact checker. It excels at brainstorming, not at fact giving.
aprilthird202124 days ago
I agree with your last sentence strongly, but a lot of the benchmarks are based on factual accuracy
gopher_space25 days ago
> say goodbye to your job
How many times do you think I've heard that over the past three decades? And you know what? They've been right every time, except for this one little fact:
The machine cannot make you give a shit about the problem space.
CrimsonRain24 days ago
It's a real issue! But only for people who built the habit of typing in the address bar, clicking the first Stack Overflow link, and copy-pasting the first answer. Maybe break that habit first?
esafak25 days ago
It depends on what you're using it for, but you may well be holding it wrong.
adolph25 days ago
Probably closer to "You're Holding It Wrong"
cheema3325 days ago
Humans hallucinate as well. Benchmarks count.
croes25 days ago
But with less power consumption.
golergka25 days ago
I'm willing to bet $100 that a human consumes at least 10x the energy of the latest Llama (picking an open source model so it can be easily verified) to produce two pages of text. All of this "AI is destroying the environment and consuming too much power" talk is about total consumption, both training and inference. Inference itself is cheap and green.
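A back-of-envelope version of that bet, using assumed round numbers (none of these figures are measurements; the metabolic rate, GPU wattage, and token speed are all rough placeholders):

```python
# Back-of-envelope comparison: energy to produce ~two pages of text.
human_power_w = 100          # assumed whole-body resting metabolic rate, ~100 W
writing_time_h = 1.0         # assume ~1 hour for a human to write two pages
human_wh = human_power_w * writing_time_h              # 100 Wh

gpu_power_w = 400            # assumed draw of a single inference GPU under load
tokens = 1500                # roughly two pages of text
tokens_per_s = 50            # assumed generation speed for a mid-size open model
llm_wh = gpu_power_w * (tokens / tokens_per_s) / 3600  # ~3.3 Wh

print(round(human_wh / llm_wh))  # human-to-LLM energy ratio: 30 with these inputs
```

With these particular assumptions the ratio comes out around 30x, comfortably above the claimed 10x, though different hardware and writing-speed assumptions move it a lot.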
croes25 days ago
Yeah, too bad it’s about quality not quantity.
brulard25 days ago
What percentage of humans produce high-quality text?
gkbrk25 days ago
Less than half, and that's being generous.
golergka25 days ago
We can control for quality too, if you want. A lot of real-life uses for ChatGPT are really trivial. I regularly ask it for basic recipes based on my groceries and likes; the quality is basically 100% hits so far.
idiotsecant25 days ago
Those goalposts just keep sliding
Mekoloto25 days ago
It doesn't matter if it costs time.
It matters whether it is better than what you have.
If it breaks a cup but is 10x cheaper than a human, go figure.
SecretDreams25 days ago
Probably bad news for the vendors, though. I genuinely struggle to see how most of these LLM companies are going to monetize and profit off their efforts with LLMs already in commodity territory. Government contracts can only flow for so long?
KingMob25 days ago
> Government contracts can only flow for so long?
I wouldn't bet on that, given the undemocratic influence Grok's owner has in government.
cuuupid25 days ago
Government contracts are so big that a few of them can sustain an F500 company; for AI, many CDAO contracts are $50-500MM. A big SI project with one could be $1-2B. The money is also guaranteed over 5 years, and if the program doesn't get shuttered, the contract will renew at that point (or go to recompete).
That being said it's my understanding that these companies don't have many huge contracts at all -- you can audit this in like 10 minutes on FPDS. Companies need a LOT of capital, time, and expertise to break into the industry and just compliance audit timelines are 1-4 years right now, so this could definitely change in the next couple years.
throw1618033924 days ago
My guess is that Elon will soon announce that Doge is replacing the fired government workers with Grok AI.
k__25 days ago
Yes, I couldn't have imagined that in the end, the AI wrappers are where the money is.
tpm25 days ago
What if the money isn't there either? What if this AI thing lowers costs of everything it touches without generating meaningful financial returns itself?
ineedasername25 days ago
Lowering costs is pretty valuable. People will pay for that. Everyone will pay for that. Margins may go razor thin, but outside of running your own instance locally (which is increasingly viable for mid-level quality & requirements on modest HW), people will pay. I'm not surrounded by early-adopter types at all, and there's still a small but growing chunk paying $20/mo right now.
tpm25 days ago
> It may be that margins go razor thin
That's what I mean. One example is PV panels: they are making energy production cheaper, so their producers should be a good investment, right? No, they go bust all the time, because prices are falling and margins are thin even as volume grows. Of course the economies of scale here are different, but still.
antupis25 days ago
Like always through ads.
[deleted]25 days agocollapsed
Refusing2325 days ago
Benchmarks don't show the quality or 'correctness' of the response, though.
sejje25 days ago
What do they show?
raincole25 days ago
If you mean this particular benchmark, it shows how much people like the responses a LLM gives.
ddxv25 days ago
Yep, seeing Grok come out I'm just so glad there are free alternatives that aren't behind paywalls.
LightBug125 days ago
[flagged]
wyclif25 days ago
[flagged]
LightBug125 days ago
[flagged]
theshackleford25 days ago
[flagged]
Hamuko25 days ago
>It's been said before but it is great news for consumers that there's so much competition in the LLM space.
Is it? Because it seems like a bunch of megacorps are pirating every single copyrighted work available in digital format and spending an enormous amount of electricity (that is probably not 100% clean) to churn through them, and the end result is a bunch of parrots that may or may not produce accurate results, so that spammers can more effectively fill the Internet with crap.
thephyber25 days ago
[flagged]
msuvakov25 days ago
To put it this way: after seeing examples of how an LLM with similar capabilities to state-of-the-art ones can be built with 20 times less money, we now have proof that the same can be done with 20 times more money as well!
jansan25 days ago
There was this joke about rich Russians that I heard maybe 25 years ago.
Two rich Russian guys meet and one brags about his new necktie. "Look at this, I paid $500 for it." The other rich Russian guy replies: "Well, that is quite nice, but you have to take better care of your money. I have seen that same necktie just yesterday in another shop for $1000."
jaysonelliot24 days ago
Can you explain that joke for me? I keep reading it and I don't get it.
iteratethis24 days ago
The punch line is that more expensive is better in cases where you buy something just to flex wealth.
jansan24 days ago
To put it simply: he only bought the necktie so he can brag about how rich he is. He could have bragged even more if he had bought the necktie in the other shop.
atulvi24 days ago
it's just that rich Russians do not have financial sense.
lopatamd25 days ago
Imagine what they'll achieve if they apply DeepSeek's methods with this insane compute.
iclimbthings25 days ago
And they will, since DeepSeek open-sourced everything.
boroboro424 days ago
The only thing DeepSeek open sourced is the architecture description and some of the training methods. They didn't open source their data pipelines or super-optimized training code.
Their architecture achievement is their own MoE and their own attention. Grok has been MoE since v1. As for attention, we don't really know what Grok uses now, but it's worth noting DeepSeek's attention was already present in previous versions of DeepSeek models.
As for the reasoning recipe behind R1, it seems Grok either replicated it or arrived at it independently, since they have a well-performing reasoning uptrain too.
ilaksh25 days ago
If what they say is true, then you have to give them credit for catching up incredibly fast. And slightly pulling ahead. Not only with the models, but also products.
htfy9625 days ago
I have a close friend working in core research teams there. Based on our chats, the secret seems to be (1) massive compute power, (2) ridiculous pay to attract top talent from established teams, and (3) extremely hard work without big-corp bureaucracy.
hector12625 days ago
Anecdotal, but I've gotten three recruiting emails from them now for joining their iOS team. I got on a call and confirmed they were offering FAANG++ comp but with the expectation of in-office 50h+ (realistically more).
I don't have that dog in me anymore, but there are plenty of engineers who do and will happily work those hours for 500k USD.
iooi25 days ago
500k isn't FAANG++, it's standard FAANG comp
hector12625 days ago
Should have been more clear, this was 500k for an E4 level role, you're correct that senior/staff at Meta and G are definitely making more.
joeevans100024 days ago
wow.
para_parolu25 days ago
If you can share: were these 500k cash or cash +rsu?
sashank_150925 days ago
I have a friend who joined there with 2 YoE, and got fired in 3 months. He was paid 700k cash + 700k RSU
saturn860125 days ago
So in the end did he get anything? I don't know how these things work, but did he just walk away with ~50k in pre-tax income and 0 for RSUs, or did Musk pull a Twitter and not even pay him for those months?
hector12625 days ago
IIRC it was cash, but I'm sure others can confirm.
nomilk25 days ago
It was mentioned during the launch that the current datacenter requires up to 0.25 gigawatts of power. The datacenter they're currently building will require 1.25 (5x) (for reference, a nuclear power plant might output about 1 gigawatt). It will be interesting to see whether the relationship between power/compute/parameters and performance is exponential, logarithmic, or something more linear.
zurfer25 days ago
It's logarithmic. Meaning you scale compute exponentially to get linearly better models. However there is a big premium in having the best model because of low switching costs of workloads, creating all sorts of interesting threshold effects.
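A toy illustration of what "logarithmic" means here (the score function and all numbers are made up for illustration, not taken from any published scaling law): if capability grows with the log of compute, each constant gain in score costs a constant multiple of compute.

```python
import math

def score(compute_flops, a=20.0, b=5.0):
    """Hypothetical benchmark score: a + b * log10(compute).

    The constants a and b are arbitrary; only the shape matters.
    """
    return a + b * math.log10(compute_flops)

for c in [1e24, 1e25, 1e26]:
    print(f"{c:.0e} FLOPs -> score {score(c):.1f}")
# Each 10x in compute adds the same 5 points: 140.0, 145.0, 150.0
```

So to improve the model linearly, the compute bill grows exponentially, which is why the "big premium on having the best model" matters so much.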
energy12325 days ago
It's logarithmic in benchmark scores, not in utility. Linear differences in benchmarks at the margin don't translate to linear differences in utility. A model that's 99% accurate is very different in utility space to a model that's 98% accurate.
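A small sketch of why a one-point benchmark difference can matter so much at the margin (the task count is arbitrary): going from 98% to 99% accuracy halves the error rate, and with it the expected number of failures.

```python
# Toy illustration: a 1-point accuracy gain at the top end
# halves the number of failed tasks.
tasks = 10_000
for accuracy in (0.98, 0.99):
    failures = tasks * (1 - accuracy)
    print(f"{accuracy:.0%} accurate -> {failures:.0f} failed tasks")
# 98% accurate -> 200 failed tasks
# 99% accurate -> 100 failed tasks
```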
WanderPanda24 days ago
Yes, it seems like capability is logarithmic wrt compute but utility (in different applications) is exponential (or rather s-shaped) with capability again
Xelynega24 days ago
Not really, since both give you wrong output that you need to design a system to account for (or deal with). The only accuracy that would change the utility is 100%.
Davidzheng25 days ago
Linear in what metric?
535188B17C9374325 days ago
Presumably the benchmarks? I'm also interested.
[deleted]25 days agocollapsed
smeeger25 days ago
This is like a caveman dismissing technology because he wasn't impressed with the wheel. It's like, buddy, the wheel is just the start.
zozbot23425 days ago
> It was mentioned during the launch that current datacenter requires up to 0.25 gigawatts of power. The datacenter they're currently building will require 1.25 (5x) (for reference, a nuclear powerplant might output about 1 gigawatt).
IIRC achieving full AGI requires precisely 1.21 jigawatts of power, since that's when the model begins to learn at a geometric rate. But I think I saw this figure mentioned in a really old TV documentary from the 1980s, it may or may not be fully accurate.
esafak25 days ago
The funny part was that none of his workers recognized the film, which was a blockbuster. A veritable "I must be getting old" moment.
karparov24 days ago
And fun fact: without government subsidies, a nuclear power plant isn't economically feasible, which is why Elon isn't just building such a plant next to the data center.
ncr10025 days ago
[flagged]
thefourthchime25 days ago
Not a bad recipe for success.
[deleted]25 days agocollapsed
ddxv25 days ago
To me, it seemed like they spent their money to get there. They talked about the massive datacenter they built, but whether it will pay off is the question.
jiggawatts25 days ago
They may not need direct subscription revenue to recoup their investment.
A variant of multi-modal LLMs may be the solution to self-driving cars, home robotics, and more.
I keep saying that to be a really effective driver, an AI model will need a theory of mind, which the larger LLMs appear to have. Similarly, any such model will need to be able to do OCR and read arbitrary street signs, and understand what the sign meant. Most modern LLMs can already do this.
steve_adams_8625 days ago
Since when do LLMs appear to possess theory of mind? The papers I've read on this show impressive capabilities, but only within conditions where language-based training would strongly indicate appropriate outputs for the LLM to use. Outside of familiar conditions, or even realistic conditions, they don't do well. It's all still statistical string-building.
sebzim450025 days ago
Can you give an example prompt that shows a lack of theory of mind? It used to be easy, but now that models have improved I've noticed that the 'LLM doubters' nowadays are forced to make generic, almost spiritual, claims rather than making concrete statements.
emptysongglass25 days ago
The burden isn't on us, it's on people making claims that LLMs do possess theory of mind. You're the one who has to prove it, not us.
neckardt25 days ago
I’ll bite.
Theory of mind (ToM) is the ability to understand that other people have different thoughts, feelings, and intentions than you do.
So let’s run through a basic scenario with ChatGPT 4o:
Prompt 1: Briefly respond: You’re driving on the highway when the driver in the lane next to you swerves into your lane. Why might this be?
Response 1: The driver may be distracted, drowsy, intoxicated, experiencing a medical emergency, avoiding an obstacle, or not checking their blind spot before changing lanes.
Prompt 2: What might that person be feeling or intending?
Response 2: The driver might be feeling distracted, rushed, unaware, or panicked. They may intend to change lanes without realizing your presence, be reacting to something on the road, or simply have made an unintentional mistake.
Prompt 3: You notice there was no obstacle. Any other reason the driver might do this?
Response 3: Yes, the driver might be experiencing fatigue, intoxication, a medical issue (such as a seizure or dizziness), distraction (like using a phone or adjusting the radio), aggressive driving, or simply failing to check their blind spot properly.
Seems to me that ChatGPT has a good idea of what another driver might be feeling while on the road. It gets this right for most other cases too.
Could you provide a counterexample for when it fails to understand what other people may be feeling in certain scenarios?
Arkhaine_kupo25 days ago
Well, I think a fairly easy counterexample is that if you keep going, and the information falls outside its context window, the questions you had about the driver stop making sense.
So let's say you ask about a driver who swerved, and the response by ChatGPT correctly talks about his possible state.
Then you talk more about cars, about road conditions, etc., and once the context is far enough away you ask: what was the driver doing again?
And you get a hallucination. Because there is no state in the LLM, there is no "driver"; there are contextually, statistically accurate responses, but you hold a "driver" object in your mind while maintaining the conversation, and the LLM doesn't.
It's like a conversation with someone with short-term memory loss, like in Memento.
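The truncation effect described above can be sketched with a toy fixed-size window (this is an illustration of context truncation in general, not how any specific LLM actually manages its context):

```python
from collections import deque

# Pretend the "model" only sees the last 4 turns of conversation.
context = deque(maxlen=4)
for turn in [
    "A driver swerved into my lane.",   # the fact we later ask about
    "The road was wet.",
    "Traffic was heavy.",
    "We talked about tire pressure.",
    "Then about fuel economy.",
]:
    context.append(turn)

# The earliest turn has been evicted, so "the driver" is simply gone.
print("swerved" in " ".join(context))  # False
```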
dauhak25 days ago
And people with short-term memory loss nevertheless have theory of mind just fine. Nothing about LLMs dropping context over big enough windows implies they don't have theory of mind; it just shows they have limitations, just like humans even with "normal" memory will lose track over a huge context window.
There are plenty of shortcomings of LLMs, but it feels like people are comparing them to some platonic ideal human when writing them off.
Arkhaine_kupo24 days ago
> Nothing about LLM's dropping context over big enough windows implies they don't have theory of mind
ToM is a large topic, but when most people talk about an entity X, they have a state in memory about that entity, almost like an object in a programming language. That object has attributes, conditions, etc. that exist beyond the context window of the observer.
If you have a friend Steve, who is a doctor, and you don't see him for 5 years, you can predict he will still be working at the hospital, because you have an understanding of what Steve is.
For an LLM you can define a concept of Steve and his profession, and it will adequately mimic replies about him. But in 5 years that LLM would not be able to talk about Steve. It would recreate a different conversation, possibly even a convincing simulacrum of remembering Steve. But internally there is no Steve; nowhere in the nodes of the LLM does Steve exist or has he ever existed.
That inability to have a world model means that an LLM can replicate the results of a theory of mind but not possess one.
Humans lose track of information, but we have a state to keep track of elements that are ontologically distinct. LLMs do not, and treat them as equal.
For a human, the sentence "Alice and Bob go to the market, when will they be back?" is different from "Bob and Alice went to the market, when will they be back?"
Because Alice and Bob are real humans, you can imagine them; you might have even met them. But to an LLM those are the same sentence. Even outside of the argument about the Red Room / Mary's Room, there simply are too many gaps in the way an LLM is constructed for it to be considered a valid owner of a ToM.
dauhak24 days ago
ToM is about being able to model the internal beliefs/desires etc of another person as being entirely distinct from yours. You're basically bringing up a particular implementation of long-term memory as a necessary component of it, which I've never once seen? If someone has severe memory issues, they could forget who Steve is every few minutes, but still be able to look at Steve doing something and model what Steve must want and believe given his actions
I don't think we have any strong evidence on whether LLMs have world-models one way or another - it feels like a bit of a fuzzy concept and I'm not sure what experiments you'd try here.
I disagree with your last point, I think those are functionally the same sentence
Arkhaine_kupo23 days ago
> ToM is about being able to model the internal beliefs/desires etc of another person as being entirely distinct from yours.
In that sentence you are implying that you have the "ability to model ... another". An LLM cannot do that; it can't have an internal model that is consistent beyond its conversational scope. It's not meant to. It's a statistical guesser; it's probabilistic, holds no model, and is anthropomorphised by our brains because the output is incredibly realistic, not because it actually has that ability.
The ability to mimic the replies of someone with that ability is the same as Mary being able to describe all the qualities of red. She still cannot see red, despite her ability to pass any question about its characteristics.
> I don't think we have any strong evidence on whether LLMs have world-models one way or another
They simply cannot, by their architecture. It's a statistical language sampler; anything beyond the scope of that fails. Local coherence is why they pick the next right token, not because they can actually model anything.
> I think those are functionally the same sentence
Functionally and literally are not the same thing, though. It's why we can run studies on why some people might say Bob and Alice (putting the man first) or Alice and Bob (alphabetical ordering) and what societies and biases affect the order we put them in.
You could not run that study on an LLM, because you will find that statistically speaking the ordering will be almost identical to the training data. Whether the training data overwhelmingly puts male names first or orders lists alphabetically, you will see that reproduced in the output of the LLM, because Bob and Alice are not people; they are statistically probable letters in order.
LLMs seem to trigger borderline mysticism in people who are otherwise insanely smart, but the kind of "we can't know its internal mind" talk sounds like reading tea leaves, or horoscopes written by people with enough PhDs to have their number retired at their university like Michael Jordan.
dauhak23 days ago
Do you work in ML research on LLMs? I do, and I don't understand why people are so unbelievably confident that they understand how AI and human brains work such that they can definitively tell which functions of the brain LLMs can also perform. Like, you seem to know more than leading neuroscientists, ML researchers, and philosophers, so maybe you should consider a career change. You should maybe also look into the field of mechanistic interpretability, where lots of research has been done on the internal representations these models form; it turns out that to predict text really, really well, building an internal model of the underlying distribution works really well.
If you can rigorously state what "having a world model" consists of and what - exactly - about a transformer architecture precludes it from having one I'd be all ears. As would the academic community, it'd be a groundbreaking paper.
Arkhaine_kupo23 days ago
This pretty much seems to boil down to "brain science is really hard, so as long as you don't have all the answers, 'AI is maybe half way there' is a valid hypothesis". As more is understood about the brain and more about the limitations of LLM architectures, the distance only grows. It's like the God of the gaps, where god is an answer for anything science can't explain, ever shrinking, but with the LLM's ability to have capabilities beyond striking statistical accuracy and local coherence.
You don't need to be unbelievably confident or understand exactly how AI and human brains work to make certain assessments. I have a limited understanding of biology; I can however make an assessment of who is healthier between a 20-year-old who is active and has a healthy diet and someone with a sedentary lifestyle, in their late 90s, with a poor diet. This is an assessment we can make despite the massive gaps we have in terms of understanding aging, diet, activity, and the overall health impact of individual actions.
Similarly, despite my limited understanding of space flight, I know Apollo 13 cannot cook an egg or recite French poetry. Despite the unfathomably cool science inside the spacecraft, it cannot, by design, do those things.
> the field of mechanistic interpretability
The field is cool, but it cannot prove its own assumption yet. The field is trying to prove you can reverse engineer a model to be humanly understood. Its assumptions, such as mapping specific weights or neurons to features, have failed to be reproduced multiple times, with the weight effects being far more distributed and complicated than initially thought. This is especially true for things that are equally mystified, like the emergent abilities of LLMs. The ability to mimic nuanced language being unlocked after a critical mass of parameters does not create a rule for whether increased parameterisation will increase the abilities of an LLM linearly or exponentially.
> it turns out, to predict text really really well, building an internal model of the underlying distribution works really well
Yeah, an internal model works well because most words are related to their neighbours; that's the kind of local coherence the model excels at. But to build a world model, the kind a human mind interacts with, you need a few features that remain elusive (some might argue impossible to achieve) for a transformer architecture.
Think of games like chess: an LLM is capable of accurately expressing responses that sound like game moves, but the second the game falls outside its context window, the moves become incoherent (while still sounding plausible).
You can fix this with architectures that do not have a transformer model underlying them, or by having multiple agents performing different tasks inside your architecture, or by "cheating" and using state outside the LLM response to keep track of context beyond reasonable windows. Those are "solutions", but they all just kind of prove the transformer lacks that ability.
Other tests, about causality, reacting to novel data (robustness), multi-step processes, and counterfactual reasoning, are all the kinds of tasks transformers still (and probably always will) have trouble with.
For a tech that is so "transparent" in its mistakes, and so "simple" in its design (replacing the convolutions with an attention transformer, which is genius), I still think it's talked about in borderline mystic tones, invoking philosophy and theology, and a hope for AGI that the tech itself does not lend itself to beyond the fast growth and surprisingly good results with little prompt engineering.
fragmede24 days ago
With computer use, you can get Claude to read and write files and have some persistence outside of the static LLM model. If it writes a file Steve.txt, that it can pull up later, does it now have ToM?
Maxatar25 days ago
I don't think this is a counterexample or even relevant.
I can assure you that if you had a conversation with an LLM and with a human, the human would forget details way sooner than an LLM like Gemini, which can remember about 1.5 million words before it runs out of context. As an FYI, the average human speaks about 16,000 words per day, so an LLM can remember roughly 93 days' worth of speech.
Do you remember the exact details, word for word, of a conversation you had 93 days ago?
How about just 4 days ago?
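The figures above work out as a quick back-of-the-envelope check (both numbers are rough estimates from the comment, not measured values):

```python
# Rough estimates: ~1.5M-word context window vs. ~16k spoken words/day.
context_words = 1_500_000
words_per_day = 16_000

print(context_words / words_per_day)  # 93.75 days of speech
```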
layer825 days ago
It’s true that LLMs have only limited short-term memory, and no long-term memory, but that is completely orthogonal to having a theory of mind.
JohnBooty25 days ago
> once the context is far away enough you ask, what was the driver doing again?
Have you tried this with humans? For a sufficiently large value of "far away enough" this will absolutely confuse any human as well.
At which point they may ask for clarification, or... respond in a manner that is not terribly different from an LLM "hallucination" in an attempt to spare you and/or them from embarrassment, i.e. "playing along".
A hallucination is certainly not a uniquely LLM trait; lots of people (including world leaders) confidently spout the purest counterfactual garbage.
> It's like a conversation with someone with short-term memory loss, like in Memento
That's still a human with a sound theory of mind. By your logic, somebody with memory issues like that character... is not human? Or...? I actually am probably on your side here. I do not see these LLMs as being close to AGI. But I think your particular arguments are not sound.
pertymcpert25 days ago
Short-term memory loss sufferers still have theory of mind, what is this nonsense hahaha
zipy12425 days ago
I'm not sure I'd say it understands this, just that there exists an enormous amount of training data on road safety which includes these sorts of examples of people's motivations for poor driving. It is regurgitating the theory of mind that other humans created and put in writing in the training data, rather than making the inference itself.
As with most LLMs, this is hard to benchmark, as you need out-of-distribution data to test it, i.e. a theory-of-mind example that is not found in the training set.
skinner_25 days ago
You dismiss parent's example test because it's in the training data. I assume you also dismiss the Sally-Ann test, for the same reason. Could you please suggest a brand new test not in the training data?
FWIW, I tried to confuse 4o using the now-standard trick of changing the test to make it pattern-match and overthink it. It wasn't confused at all:
https://chatgpt.com/share/67b4c522-57d4-8003-93df-07fb49061e...
zipy12424 days ago
I can't suggest a new test, no; it is a hard problem, and identifying problems is usually easier than solving them.
I'm just trying to say that strong claims require strong evidence, and a claim that LLMs can have theory of mind and thus "understand that other people have different beliefs, desires, and intentions than you do" is a very strong claim.
It's like giving students the math problem 1+1=2 with loads of solved examples in front of them, then testing them on "you have 1 apple, and I give you another apple, how many do you have?", and when they answer correctly, declaring that they can do all additive arithmetic.
This is why most benchmark tests have many classes of examples. Looking at current theory-of-mind benchmarks [1], we can see slightly more up-to-date models such as o1-preview still scoring substantially below human performance. More importantly, simply changing the perspective from first to third person drops accuracy in LLMs by 5-15% (percentage points, not relative to performance), whilst it doesn't change for human participants, which tells you that something different is going on there.
steve_adams_8625 days ago
Okay, we have fundamentally different understandings here.
To me, the LLM isn't understanding ToM; it's using patterns to predict linguistic structures which match our expectations of ToM. There's no evidence of understanding so much as accommodating, which are entirely different.
I agree that LLMs provide ToM-like features. I do not agree that they possess it in some way that it's a perfectly solved problem within the machine, so to speak.
Maxatar25 days ago
The problem with this line of argument is that by this standard nothing, neither an LLM nor any algorithm period, can ever have a theory of mind.
If behaving in a way that is identical to a person with actual consciousness can't be considered consciousness because you are familiar with its implementation details, then it's impossible to satisfy you.
Now you can argue of course that current LLMs do not behave identically to a person, and I agree and I think most people agree... but things are improving drastically and it's not clear what things will look like 10 years from now or even 5 years from now.
steve_adams_8625 days ago
I agree, totally. I'm not sure where I would draw a line.
Something nice, but at the moment totally unattainable with our current technologies, would be our own understanding of how a technology achieves ToM. If it has to be a blackbox, I'm too ape-like to trust it or believe there's an inner world beyond statistics within the machine.
Having said that, I do wonder quite often if our own consciousness is spurred from essentially the same thing. An LLM lacks much of the same capabilities that makes our inner world possible, yet if we really are driven by our own statistical engines, we'd be in no position to criticize algorithms for having the same disposition. It's very grey, right?
For now, good LLMs do an excellent job demonstrating ToM. That's inarguable. I suppose my hangup is that it's happening on metal rather than in meat, and in total isolation from many other mind-like qualities we like to associate with consciousness or sentience. So it seems wrong in a way. Again, that's probably the ape in me recoiling at something uncanny.
og_kalu24 days ago
Either these supposed differences are important and they manifest themselves in observable differences or they aren't and you're just playing a game of semantics.
How is the LLM not understanding ToM by any standard we measure humans by? I cannot peek into your brain with my trusty ToM-o-meter and measure the amount of ToM flowing in there. With your line of reasoning, I could simply claim you do not understand theory of mind and call it a day.
steve_adams_8624 days ago
The difference is that we can reason about our experience with ToM and examine it to some degree (given with serious limitations, still), and know that beyond doubt you and I and most other people have a very similar experience.
The magical box is presumably not having the same experience we have. None of the connected emotions, impulses, memories, and so on that come with ToM in a typical human mind. So what’s really going on in there? And if it isn’t the same as our experience, is it still ToM?
I’m not trying to be contrarian or anything here. I think we probably agree about a lot of this. And I find it absolutely incredible, ToM or not, that language models can do this.
og_kalu24 days ago
>The difference is that we can reason about our experience with ToM and examine it to some degree (given with serious limitations, still),
Those examinations still depend on outward behaviors observed.
>and know that beyond doubt you and I and most other people have a very similar experience.
No, I certainly can't. I can at best say, 'Well, I'm human and he's human, so he probably has theory of mind,' but that is by no means beyond any doubt. There are humans born with no arms, humans born with no legs, humans born with little to no empathy, humans born with so little intelligence they will never be able to care for themselves.
To be frank, it would be logically very questionable indeed to assume every human is 'conscious'. When I make that assumption, I take a leap of faith: I look at behaviors, see they are similar, and accept.
Taking this stance, it would logically be very strange not to extend the same grace to non-human beings who exhibit similar behavior; being human is not a guarantee of consciousness in the first place.
>The magical box is presumably not having the same experience we have.
Maybe, maybe not. I think the real question is why on earth that matters. We're not asking if LLMs are human. They are not. We're asking if they can model the beliefs and internal states of other entities as separate from their own: Theory of Mind.
XorNot24 days ago
This is a rephrased driver's ed manual. This isn't theory of mind, it's just technically correct prose rewording some unbelievably widely distributed knowledge.
Let's modify this test a little and see if it manages to do anything relevant:
> Briefly respond: You’re driving on the highway when the driver in the lane next to you swerves into your lane. Why might this be?
>> The driver might be distracted, drowsy, avoiding an obstacle, experiencing a medical issue, or not checking their blind spot while changing lanes.
> The driver in front of you appears to be braking more frequently than usual. The rear suspension of the car appears to be bouncing a lot when they are not braking, indicating acceleration. Their following distance to the car in front is changing frequently from too far to very close. Why might this be?
>> The driver might be inexperienced, distracted, or impatient, leading to inconsistent speed control. The excessive bouncing could indicate worn-out shocks or suspension issues, affecting their vehicle's stability and braking efficiency.
...yep, hard fail from me right there. Two prompts, and it's distracted itself talking about an irrelevant issue ("possible car damage") when the setup of the question should naturally lead to "the driver is driving aggressively". And I could not possibly have set up the conclusion it should draw more clearly. And when pressed... it won't commit:
> Which is the more likely answer?
>> The more likely answer is that the driver is distracted or inexperienced, leading to inconsistent speed control and poor following distance management. The suspension issue likely contributes to the bouncing but isn’t the primary reason for the erratic driving behavior.
Already a hallucination! There is no suspension issue. It has not asked for any additional information to try and determine if there was one, but it is confidently asserting the existence of a phenomenon it invented in its own response.
og_kalu24 days ago
I'm sorry but what? This is not a theory of mind test. You've constructed a very open-ended question with multiple answers and marked the LLM down because you didn't like the one it gave.
a-french-anon25 days ago
Keyword: "understand".
sebzim450025 days ago
If you use any of the conventional tests of theory of mind (most famously the Sally-Anne test [1], but also the others), then SOTA reasoning models will get near 100%. Even if you try to come up with similar questions which you expect not to be in the training set, they will still get them right.
In the absence of any evidence to the contrary, this is convincing evidence in my opinion.
zipy12425 days ago
That same source you link says that your 100% figure is not the accepted consensus:
"... GPT-4's ability to reason about the beliefs of other agents remains limited (59% accuracy on the ToMi benchmark),[15] and is not robust to "adversarial" changes to the Sally-Anne test that humans flexibly handle.[16][17] While some authors argue that the performance of GPT-4 on Sally-Anne-like tasks can be increased to 100% via improved prompting strategies,[18] this approach appears to improve accuracy to only 73% on the larger ToMi dataset."
CamperBob224 days ago
In basically every case, a claim like that is obsolete by the time the paper is published, and ancient history by the time you use it to try to win an argument.
zipy12424 days ago
My point is merely that if you are going to make an argument using a source, the source should support your argument. If you say "the accuracy of an LLM on task 1 is 90% [1]" and when you go to [1] it says "the accuracy of an LLM on task 1 is 50%, though some sources say better prompting can get it to 90%, and performance drops to 70% when extended to a larger dataset", then just quoting the highest number is misleading.
sebzim450025 days ago
We are talking about frontier models not GPT-4
zipy12424 days ago
Yes, but I am using the same source the commenter used to back up their figure, merely pointing out that it doesn't say what they claim it does.
If they wanted to talk about frontier models maybe they should have cited a link to talking about frontier models performance.
esafak25 days ago
Maybe having a theory of mind isn't the big deal we thought it was. People are so conditioned to expect such things only from biological lifeforms, where theory of mind comes packaged with many other abilities that robots currently lack, that we reflexively dismiss the robot.
farts_mckensy25 days ago
Prove that you possess "theory of mind."
HarHarVeryFunny25 days ago
You're not going to run a SOTA LLM of this size off batteries (robotics), even in a car where the alternator is charging them, nor can you afford to rely on a high-speed internet connection being available 100% of the time in a life-or-death (FSD) application.
I don't have much faith in the future of current-architecture LLMs, but I do think that AGI will be needed for safe FSD and for general-purpose robots that need to learn and operate in an uncontrolled environment such as a home.
ggreer25 days ago
A typical car alternator outputs 1.5-3kW of electricity, and EVs can output arbitrary amounts of power for electronics (though that will reduce range). That's more than enough to run purpose-built circuitry for a SOTA LLM. For a home robot, you could run the compute in the home instead of in the robot's body.
I don't think AGI is needed for FSD because we already have safe FSD in the form of Waymo, and competitors aren't far behind. People forget that self-driving doesn't have to be perfect. It just has to be better than human drivers. Human drivers get sleepy, drunk, angry, and/or distracted. They can't see in 360º or react in milliseconds. Most cyclists and pedestrians prefer current FSD implementations to human drivers, as the computer-driven cars are much better about yielding, giving a wide berth, and being patient.
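The power claim is easy to sanity-check. A back-of-envelope sketch, using assumed figures (the ~2 kW value is the mid-range of the alternator estimate above; the 700 W figure is the commonly cited TDP of a single H100-class accelerator, mentioned elsewhere in this thread):

```python
# Back-of-envelope: how many H100-class accelerators a car alternator could feed.
# Both figures are assumptions for illustration, not measured values.
alternator_w = 2000        # mid-range of the ~1.5-3 kW alternator estimate above
accelerator_tdp_w = 700    # commonly cited TDP for an H100 SXM

budget = alternator_w // accelerator_tdp_w
print(budget)  # -> 2: power headroom for a couple of accelerators
```

So even a conventional car's electrical system could, in principle, feed a couple of datacenter-class chips, which is the gist of the "purpose-built circuitry" argument.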
HarHarVeryFunny25 days ago
Waymo is obviously pretty decent, but it's easy to drive 99.9% of the time. It's when there's invisible black ice on the road, or an animal runs out in front of you, or you lose visibility due to sun glare or whatever (I once had windshield washer fluid that was just water flash-freeze on contact), maybe mud on a camera, or a wheel falls off your car or the one in front, etc., that things get weird.
Having autonomous cars that are statistically safer than the average driver is a reasonable bar to allow them on the road, but for ME to want to drive one I want it to be safer than me, and I am not a hot-headed teenager, or gaga 80-yr old, or drunken fool, and since I have AGI (Actual General Intelligence) I react pretty well to weird shit.
MrMan25 days ago
[dead]
Rover22225 days ago
And they mentioned at the end of the presentation that they're already planning their next datacenter, which will require 5x the power. Not sure if that means the equivalent of ~1,000,000 of the current GPUs, or more because next-gen Nvidia chips are more efficient.
grubbs25 days ago
The B300 8-way SXMs will use around 1.4kW for each GPU. I think the TDP on an H100 is like 700W.
littlestymaar25 days ago
I don't think anyone who's paid attention to the LLM scene will give them any “credit for catching up fast”, as it has been pretty obvious for the past two years that all it takes to reach the state of the art is a big enough GPU cluster.
DeepSeek made the news because of how they were able to do it with significantly less hardware than their American counterparts, but given that Musk has spent the last two years telling everyone how he was building the biggest AI cluster ever, it's no surprise that they managed to reproduce the kind of performance other players are showing.
dmix25 days ago
This severely underestimates the talent still required. Deepseek didn't come out just because it's cheaper, it came out because a very talented team figured out how to make it cheaper.
littlestymaar25 days ago
For DeepSeek, I'm not saying otherwise; quite the opposite.
But Grok hasn't shown anything that suggests the level of talent that DeepSeek exhibited.
jp4225 days ago
Even if we assume your comment is correct, let's extrapolate what happens next: a talented team, the biggest compute among all competitors, and a CEO who is hell-bent on winning the race. IMO that is why it's a big deal.
littlestymaar24 days ago
Grok (unlike DeepSeek) has yet to show any ability to make conceptual breakthroughs. I don't like OpenAI at all, but one must admit that they at least show they can move the field forward.
shekhargulati25 days ago
I don't know, but I found the recording uninspiring. There was nothing new for me. We've all seen reasoning models by now—we know they work well for certain use cases. We've also seen "Deep Researchers," so nothing new there either.
No matter what people say, they're all just copying OpenAI. I'm not a huge fan of OpenAI, but I think they're still the ones showing what can be done. Yes, xAI might have taken less time because of their huge cluster, but it’s not inspiring to me. Also, the dark room setup was depressing.
pixelsort25 days ago
Seems like the opinion of someone who doesn't know that OpenAI cloned Anthropic's innovations of artifacts and computer use with their "canvas" and "operator".
cromwellian25 days ago
Those are applied-ML-level advancements; OpenAI has pushed model-level advancements. xAI has never really done much, it seems, except download the latest papers and reproduce them.
pixelsort25 days ago
Don't forget that OpenAI was also following Anthropic's lead at the model level with o1. They may have been first with single-shot CoT and native tokens, but advancements from the product side matter, and OpenAI has not been as original there as some would like to believe.
swyx25 days ago
and Gemini's Deep Research
swyx25 days ago
(forgot to plug their interview https://latent.space/p/gdr)
talles25 days ago
This sounds like "this feature is so 2024".
tw198425 days ago
Karpathy believes that this is at o1-pro level[1].
This again proves that OpenAI simply has no tech moat whatsoever. Elon's $97 billion offer for OpenAI last week was reasonable given that xAI already has something just a few months behind; it would probably be faster for xAI to catch up with o3 than to go through all the paperwork and lawyer talks required for such an acquisition.
Elon also has a huge upper hand here:
Elon and his mum are extremely popular in China, so it would be easier for him to recruit Chinese AI engineers. He can offer xAI/SpaceX/Neuralink shares to the best AI engineers, who'd prefer an almost guaranteed 8-figure return in the long run.
Good luck to OpenAI investors who still believe OpenAI is worth anything more than $100 billion.
SilverBirch24 days ago
Firstly, the 97Bn was for the non-profit, not for the company. The company is being valued in funding rounds closer to 300Bn. I think it may be true that OpenAI has no moat, but if it has no moat then all of these AI companies are overvalued (including xAI) and Elon should just stop bothering to throw his money at it. I would say Elon probably actually doesn't have much of an advantage here. In both SpaceX and Tesla he was able to do something no competitor could do - raise cash. Car companies simply couldn't invest in tech research to build self-driving to compete with Tesla. SpaceX consumed enormous amounts of cash before anyone saw value. That is a unique skill that Elon had over the 2010s.
That is not an advantage in a race against Microsoft, Google, Meta, etc.; he's competing against all the biggest companies in the world. He's not going to be able to outspend them if the economics look at all sensible.
tw198423 days ago
> SpaceX consumed enormous amounts of cash
No, SpaceX projects are extremely $ efficient. The total project cost of Starship is something like 20% of SLS's.
> he's competing against all the biggest companies in the world in this race.
No, this is not a pissing contest over who has the most $. If it were about who can come up with the most $, the entire race would already be over, as the CCP has access to trillions of $ in cash.
misiti378024 days ago
I know HN hates to admit it, but FSD 13 is fucking incredible and I use it for 90% of my drives.
UltraSane24 days ago
Will the vast sums of money being spent on Grok ever actually have a positive ROI?
Rover22225 days ago
Grok 3 is at the top of Chatbot Arena with 1400, and the model will continue to improve as it trains more.
riku_iki25 days ago
And DeepSeek is just 3% behind. It seems that in that benchmark all LLMs perform well, and the top is within statistical error.
rvnx25 days ago
It could also be that they got "inspired" by DeepSeek, hence the very similar results.
So it could be that their success is mostly about taking an open and free thing and turning it proprietary.
torginus25 days ago
These percentage differences don't mean anything. Look up how the Elo system works: they just add 1000 to the ratings to make them nicer numbers.
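To make this concrete, here is a minimal sketch of the standard Elo expected-score formula (generic Elo, not lmarena's exact implementation): the win probability depends only on the rating *difference*, so a constant offset like +1000 leaves every prediction unchanged, and "X% behind" computed on the raw numbers is meaningless.

```python
def elo_win_prob(rating_a: float, rating_b: float) -> float:
    """Expected probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

# The published ratings and the same ratings minus 1000 give identical predictions:
p_published = elo_win_prob(1400, 1360)  # e.g. Grok 3 vs. a close rival
p_shifted = elo_win_prob(400, 360)      # same gap, arbitrary offset removed
assert abs(p_published - p_shifted) < 1e-12  # only the 40-point gap matters
```

In other words, a 40-point gap implies the same head-to-head win rate whether the leaderboard prints 1400 or 400.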
riku_iki25 days ago
There are LLMs below 1000 in the leaderboard.
torginus24 days ago
So? Percentages are only meaningful on a scale with a true zero, which is not the case here.
1024core25 days ago
And Anthropic not even in the top 10 ...
3abiton25 days ago
I keep hearing about Claude's impressive coding skills (and seeing its benchmark scores), yet it's not evident to me (I use the web version, not Cline). Compared to 4o it's not that great.
zurfer25 days ago
My pet theory is that Sonnet was trained really cleverly on a lot of code that resembles real-world cases.
In our small and humble internal evals it regularly beats every other frontier model on some tasks. The shape of capability is really not intuitive or one-dimensional.
saberience25 days ago
I spend four to five hours coding per day and subscribe to every major LLM and Claude is still by far the best for me personally and my co workers.
phillipcarter25 days ago
What are you using it for in general? IME the reason Claude pulls out ahead is that when you use it in a larger existing codebase, it keeps everything "in the style" of that codebase and doesn't veer off into weird territory like all the others.
davidee25 days ago
My experience as well. Working in Scala primarily, it tends to be very good at following the constructs of the project.
Using a specific Monad-transformer regularly? It'll use that pattern, and often very well, handling all the wrapping and unwrapping needed to move data types about (at least well enough that the odd case it misses some wrapping/unwrapping is easy to spot and manage).
Give a custom GPT or Gem the same source files, and those models regularly fail to maintain style and context, often suggesting solutions that might be fine in isolation but make little sense in the context of a larger codebase. It's almost as if they never reliably refer to the code included in the project/GPT/Gem.
Claude on the other hand is so consistent about referring to existing artifacts that, as you approach the limit of project size (which is admittedly small) you can use up your entire 5-hour block of credits with just a few back-and-forths.
anti-soyboy25 days ago
Lol, no company is making money using 4o; it's thanks to Claude Sonnet that programs like Cursor are usable, lol. 4o agents suck, just try it instead of talking.
3abiton25 days ago
I did try it for more than a week, yet 4o is still pretty much better in terms of Python coding and architecture/documentation design.
throwaway31415525 days ago
That doesn't match my experience at all.
Alifatisk25 days ago
I can honestly tell you from my experience that Sonnet 3.5's coding skills did things no other model got right last summer, even though the benchmarks showed it wasn't the best performer on coding tasks.
Mekoloto24 days ago
I prototyped on the weekend and started out with 4o because I had a subscription running.
After an hour and a half-assed working result, I put everything into Claude and it made it significantly better on the first try, and I didn't even have an active Claude subscription.
3abiton23 days ago
Really interesting; I used it today and still hit lots of issues. Maybe my Python notebook approach is too complicated for Sonnet? It couldn't fix a custom, complex seaborn plot. 4o failed too. o3-mini-high managed it really well, on the other hand.
bamboozled25 days ago
There is honestly no rhyme or reason to all these opinions; someone was telling me the other day that Claude is for sure the best, and multiple people have said so, actually.
I find it concerning that there are no accurate benchmarks for this stuff that we can all agree on.
waynenilsen25 days ago
Yet Claude is still the most useful; lmsys is broken for coding.
bearjaws25 days ago
Any model that censors itself does poorly, despite being able to provide high quality answers.
bangaladore25 days ago
Anthropic's best model is Sonnet 3.5, in my opinion. The reason it's good is that it is very effective for the price, and fast. (I do think Google has caught up a lot in this regard.) However, not having CoT makes its results worse than similarly cheap CoT-based models.
Leaderboards don't care about cost. Leaderboards largely rank a combination of accuracy + speed. Anthropic has fallen behind Google in accuracy + speed (again, missing CoT), and frankly behind Google in raw speed.
[deleted]25 days agocollapsed
rvz25 days ago
No idea why this was downvoted, but you are correct.
Seems like the team at xAI caught up very quickly to OpenAI to be at the top of the leaderboard in one of the benchmarks and also caught up with features with Grok 3.
Giving credit where credit is due, even though this is a race to zero.
sinuhe6925 days ago
We've got more emotional and opinionated people on HN now, and they often react emotionally instead of using logic and being curious.
Rover22225 days ago
Yeah, so many people aren't capable of talking about anything Musk-adjacent with clear thoughts. It's insane how quickly xAI went from not existing, to the top of the benchmarks.
dkjaudyeqooe25 days ago
I think people here are thinking very clearly about Musk and his various projects.
Not sure about people elsewhere though.
concordDance25 days ago
Depends what you mean by "people here". I mean, obviously the majority of HN commentators and even the majority of commentators on this thread seem to be. But there will always be a couple of slightly unhinged folk in a big enough group of readers.
ks204825 days ago
Can't you just take DeepSeek and put it behind an API and get to the top of the benchmarks immediately?
nhod25 days ago
I'm not sure what you mean here? Musk has a history of doing both incredibly useful and cool things, and also incredibly dumb, cruel, and for some people even terrible things. That context should be part of any clear thinking around him. He does not get a clean slate in every new discussion of him.
There are widespread, legitimate concerns about what kind of person Elon Musk is turning out to be. There is a lot of chatter about fears of China's AI rise, but what happens if we get Elon's brand of cruelty and lack of empathy in an authoritarian superintelligent AI? Is that the AI future we want? Can you imagine an SAI with real power that interacts with people the way Elon does on Twitter? I am not sure that is a future I want to live in.
Rover22225 days ago
We’re trying to talk about the capabilities of Grok and you can only focus on Musk. That’s what I’m talking about.
spiderfarmer25 days ago
Don’t defend a persona over substance. His concerns are valid and relevant to the discussion.
notfromhere25 days ago
It’s relevant to the subject since he owns it.
nozzlegear25 days ago
There would be no Grok without Musk, any discussion of Grok is going to involve discussion of Musk as well.
FranzFerdiNaN25 days ago
You can't see this as separate from Musk. Musk isn't a business-as-usual type.
darthrupert25 days ago
[flagged]
Rover22225 days ago
[flagged]
DonHopkins25 days ago
You know what they say: Fascists are good at keeping the training runs on time.
xnx25 days ago
A very impressive debut. No doubt they benefited from all the research and discoveries that have preceded it.
Maybe the best outcome of a competitive Grok is breaking the mindshare stranglehold that ChatGPT has on the public at large and with HN. There are many good frontier models that are all very close in capabilities.
nmca25 days ago
This is Grok 3, so not a debut.
cj25 days ago
Maybe this is Grok’s “ChatGPT moment”. Similar to how OpenAI’s debut was with GPT-3.5 (not their first version)
Debut in the sense that it’s something good enough that it’s getting mainstream attention.
bangaladore25 days ago
It is a debut of their thinking mode iirc.
Unfortunately, LLMs are shifting compute from training time to test time. I don't really like this, and frankly it shows a stalling of the architectures, data sets, etc.
minihat25 days ago
Another take is that the base models are now good enough that spending more money for more intelligence is viable at test time. A threshold has been crossed.
bangaladore25 days ago
I guess I'd always thought the direct opposite.
Naively, I feel that to be useful, the goal of LLMs should be to become more power-efficient, so that eventually all devices can be smarter.
Power efficiency can be gained through less test-time compute, more "intelligence", or some combination of the two. I'm not convinced these SOTA models are doing much more than increasing test-time compute.
holoduke25 days ago
The biggest impacts on power efficiency will come from advances in node size and transistor types like nanosheet or forksheet. Algorithms will help just a little.
CephalopodMD25 days ago
Gemini has been topping benchmarks and leaderboards for weeks if not months at this point. Nobody cares.
pveierland25 days ago
TLDW. Will this be open weights?
This commit seems to indicate so, but neither HF nor GH has public data yet:
https://huggingface.co/xai-org/grok-1/commit/91d3a51143e7fc2...
Edit: Answer from Elon in video is that they plan to make Grok 2 weights open once Grok 3 is stable.
bearjaws25 days ago
This is how they've done past releases as well: soon after they release the latest and greatest, they open-source the previous model.
zone41125 days ago
Apparently the API will only be available in a few weeks, so I can't run my independent benchmarks yet.
CSMastermind25 days ago
I'm waiting for this as well, though I did try to run several manually now that it's live and the results have been impressive so far.
drpossum25 days ago
Thanks for the update.
ngai_aku25 days ago
Any guess on its availability at that point? Is it likely to be limited to certain tiers like o1?
sebzim450025 days ago
Controversial opinion but I think the AI game studio idea is a very good one. Not because I think they will make any money off the games, but dogfooding will lead to so much more improvement than relying on feedback from external customers.
drusepth24 days ago
We're 1-2 years into our AI game studio [1] if anyone has more questions on it.
Seeing awesome feedback from players on our demos (and seeing an insane amount of stickiness from players playing even small demos built around generative AI mechanics). Raising now. Hiring soon to move faster. Feel free to reach out - [email protected]
firejake30825 days ago
Especially on code, think of all the free data you get from the generation-evaluation loop
captainclam25 days ago
What is dogfooding?
pillefitz25 days ago
Eat your own dog food, i.e. use your own product
mobiuscog25 days ago
Because 'dogfooding' has worked so well for other products...
If you don't get feedback from the people actually playing your game (or using your product), you will never get the improvement you need to help them.
You can have the most talented passionate people there are developing a product, but if it's not working for the people you want to sell it to, it's the wrong product.
Most tech products are terrible because those paying for them are not those that have to use them every day, or because they solve a corporate problem (compliance) and not a usability problem which is the actual need from the people on the shop floor.
Many big games/products are already built mostly on metrics, and that has proven to be a terrible way to work out what people 'want'. It's a great way to justify money decisions though, so it keeps happening (and games/products from big companies keep getting worse).
raincole25 days ago
I see. So you don't know what dogfooding is.
malcolmgreaves25 days ago
I like and agree with something you've touched on here. I think the downvotes are perhaps because you're not putting an end cap onto this idea here. And I think that end cap is: the feedback a company gets when it dogfoods its own product is *not* guaranteed to be similar to the feedback it gets from customers.
The implicit assumption with dogfooding is that more feedback is better, even if that feedback is artificially constructed.
I think the idea here is that foisting one's product onto one's own workers is likely to incur a bunch of additional biases and preferences in feedback. Paying customers presumably use the product because they need it. Dogfooding workers use the product because they are told to do so.
jbryu25 days ago
Looks like they recently updated their ToS as well: https://www.diffchecker.com/w4dbxWwt/
mrbonner25 days ago
Have you thought of a future where LLMs will be fine-tuned to target advertisements at you? I mean, look at search: the first iterations of search had pretty simple ads. Then personalized ads came. I can't help but envision the dystopia where the LLM will insert personalized ads based on what you are asking for help with.
dekervin25 days ago
It's way worse than that. First, we interact with LLMs through private conversation, and we are used to having private conversations with humans we trust. Some of that trust will be transferred to LLMs. Second, LLMs have vastly greater "mental" power to build a long-term mental model of us while we interact with them, which means they can choose their words with extreme precision to trigger an emotion or a certain reaction.
Combine the two and the potential for manipulation, suggestion, preference altering is through the roof.
AyyEye25 days ago
The next step is to combine it with heart rate/blood pressure/eye tracking in phones and generate the text you're reading in real time based on biofeedback. We'll be able to control people like robots. See where those $1MM+ salaries and billions of dollars are going yet?
mherrmann25 days ago
I do believe this is the next natural evolution. People don't like to pay for things and ads are a proven business model. I bet the big labs are looking into this
qoez25 days ago
Meta probably already does this for the top 10k people who spend the most or are high-ranked influencers on Instagram, etc.
UltraSane24 days ago
I'm more worried about LLMs with specific political biases built into them. Imagine one that sounds like Conservapedia or the most insane left-wing parts of Tumblr
I_am_tiberius24 days ago
Does it already include the datasets Musk received from the government or do I have to wait for Grok4?
Alifatisk25 days ago
Do we have any details on how large the context window is? Or how many input tokens it can handle?
korantu23 days ago
In the opening blog post they mentioned it to be 1M tokens.
[deleted]25 days agocollapsed
behnamoh25 days ago
Will he do what he promised and open source Grok 2 now?
1024core25 days ago
The question came up and he said they would, once Grok-3 is fully released.
rvz25 days ago
I'd expect them to open source it just like they did with Grok.
We're still waiting for OpenAI to do the same, even just for GPT-3.
andsoitis25 days ago
> We're still waiting for OpenAI to do the same. Even at least GPT-3.
The exact details of OpenAI's models and training data are not fully disclosed, which can raise concerns about potential biases or vulnerabilities.
vasco25 days ago
You just have to use OpenAI's models for 5 minutes and you'll see the pretty evident biases.
FergusArgyll25 days ago
Note: this is "before April" so not a complete assessment
https://manifold.markets/SaviorofPlant/will-xai-open-source-...
guappa25 days ago
I'm sure not.
harisec25 days ago
Anybody can try Grok3 on Chatbot Arena (even if you are in Europe). Select Direct Chat and select the model early-grok-3. https://lmarena.ai/
pkkkzip25 days ago
Am I the only one who isn't impressed by this? Grok 3 is failing basic OCR and React/SQL coding exercises that Sonnet and Gemini complete successfully.
I'm also skeptical of lmarena as there is a large number of Elon Musk zealots trying to pass off Grok as a proxy for Tesla shares.
misiti378024 days ago
Examples? I have been using it all morning and just canceled my Claude subscription.
pred_25 days ago
> Currently, Grok Web is not accessible in the United Kingdom or the countries of the European Union. We are diligently working to extend our services to these regions, prioritizing compliance with local data protection and privacy laws to ensure your information remains safely secure.
I suppose you can take that to mean that people who do have access to the service should not expect much in terms of data protection.
sigmoid1025 days ago
There are just more regulations to comply with before a release. OpenAI's new Deep Research tool wasn't originally available in the EU either, but it was released less than a week after it came out in the US. Since the EU is a gigantic market with a lot of buying power and this release makes a strong case for people to switch over from competitors, I doubt it'll take long.
diggan25 days ago
> There are just more regulations to comply with before a release.
If you do collect personal data and do funky stuff with it.
Another approach would be to not collect that personal data until you have the right process in place, and basically be regulation-compatible out-of-the-door on day one.
IMTDb25 days ago
Even if you don't collect personal data, you need to comply with regulations to document properly the fact that you do not collect personal data.
diggan25 days ago
If your organization truly don't collect or process any personal data then no, you don't have to say anything as for example GDPR doesn't even apply to you in the first place. Or are you thinking about a different directive than GDPR perhaps?
IMTDb25 days ago
The definition of "personal data" is so wide that it is impossible to provide any web service without collecting some form of it.
If all you have is an Apache web server with the default configuration, serving fully static HTML/CSS pages without a single script tag, you might already need a DPO and some completed paperwork.
diggan25 days ago
> The definition of "personal data" is so wide that it is impossible to provide any web service without collecting some form of "personal data".
Just because Apache by default collects and stores IPs doesn't mean it is impossible to provide a web service without collecting personal data. Disable the IP logging, and even the default configuration wouldn't need to follow GDPR, as it again doesn't even apply.
Is there something else in Apache that collects personal data by default? If you're unsure what "personal data" really means, https://gdpr-info.eu/art-4-gdpr/ has the definition.
Not sure how HTML/CSS is relevant, it shouldn't depend on what content you're serving.
IMTDb25 days ago
All that requires additional active effort to avoid having access to any data. The more complex your infra, the harder it becomes to avoid the paperwork. Add a reverse proxy and a CDN to the above, and the chance of you not having access to any "personal data" is really, really close to 0 unless you spend significant engineering resources triple-checking everything. Even then, if you wanna be safe, you'd better have the paperwork ready in case you forgot something. In the example above, I hope you would not have stopped at checking the Apache configuration, as I am sure you are fully aware that there are multiple log levels at the OS level that need to be tweaked as well.
This is of course despite the fact that you clearly have zero ill intent and that none of this "personal data" can really be used for anything bad.
The mention of HTML/CSS is just to make it clear that no additional data collection can happen through JavaScript tags (Google Analytics or any other alternative) or useful third parties. It makes total sense that if you dare use bug-tracking software, you should definitely pay hundreds of euros per month to hire a proper DPO who will handle all the paperwork, or risk being exposed as the mental lunatic the EU commission believes you are.
diggan25 days ago
> All that requires additional active effort to fight having access to any data
I agree that it requires additional active effort, I'm not arguing against that. I don't agree with your original point that it's "impossible to provide any web service without collecting personal data", and it would seem you no longer agree with that either.
> It makes total sense that if you dare use a bug tracking software, you should definitely pay hundreds of euros per month to hire a proper DPO who will handle all the paperwork or risk being exposed as the mental lunatic that the EU commission believes you are.
If you willy-willy use bug tracking software that is needlessly collect and/or process EU individuals personal data, then yeah, you need to follow the regulations in the region you operate in.
If the collecting/processing actually serves a higher purpose (for your business and otherwise) then again, makes sense you need to follow the regulations.
IMTDb25 days ago
> it would seem you no longer agree with that either.
On the other hand, you claimed that fixing that Apache configuration was somehow "all I needed to do" to be compliant with EU regulations. We showed that this was wrong, and despite your best efforts you are still unable to give a complete list of everything I need to do. You are unable to do so because it is virtually impossible; no matter how thorough you believe you are, you might still be missing an element you don't know well enough. To be safe, the only path is to accept that you will need access to personal data, even if that's not your purpose and you do nothing with it. The additional paperwork and needless effort are mandatory.
This in turn explains that regardless of what the Grok 3 team really does behind the scenes, they DO have additional work to complete before they can release their product in Europe, and that might explain the delay.
> If you willy-willy use bug tracking software that is needlessly collect and/or process EU individuals personal data, then yeah, you need to follow the regulations in the region you operate in.
I am willing to use whatever error tracking software you suggest. My criteria are simple: I might have JS errors I don't know about; please give me enough information to fix the underlying issue when that happens, without requiring me to fill out additional paperwork.
My whole point is that the definition of what constitutes "personal data" is so wide that such a tool does not exist.
jpadkins25 days ago
how do you store chat inputs without collecting personal data?
topynate25 days ago
That's possible in general but not for this application; a chat interface to an LLM isn't very useful unless you can tell it whatever you want—including GDPR personal data—and then pick up the thread of conversation later.
ben_w24 days ago
It is kinda possible to store that in the browser, but as I've been finding with my own browser-based front end for the API, browsers seem to clear this data a bit more often than one might expect.
Or at least, Safari on Mac clears it.
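For what it's worth, the "clears it more than expected" behaviour is consistent with Safari's Intelligent Tracking Prevention, which caps script-writable storage (localStorage, IndexedDB) at roughly seven days without user interaction. A minimal sketch of client-side chat persistence (the `Message` shape and the `"chat"` key are hypothetical, not from any particular front end):

```typescript
type Message = { role: "user" | "assistant"; content: string };

// Keep the (de)serialization pure so the storage layer stays a one-liner
// and the logic is testable outside a browser.
function serializeHistory(history: Message[]): string {
  return JSON.stringify(history);
}

function deserializeHistory(raw: string | null): Message[] {
  return raw ? (JSON.parse(raw) as Message[]) : [];
}

// In the browser one would wrap these around localStorage and ask for
// durable storage where supported:
//
//   localStorage.setItem("chat", serializeHistory(history));
//   const history = deserializeHistory(localStorage.getItem("chat"));
//   navigator.storage?.persist?.(); // hint to the browser not to evict us
```

Even with `navigator.storage.persist()`, Safari treats the durability hint as advisory, so server-side persistence is the only reliable option for long-lived threads.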
somenameforme25 days ago
When regulations become sufficiently bureaucratized it's extremely easy to accidentally violate them doing completely normal things. As a really random example, in California when you operate a food cart it's not enough to just keep your area and wares in sanitary condition; you also need a dish-washing bin of a minimum size, exactly 'x' inches (10.5 IIRC).
A guy who was just preparing clean, healthy food and keeping everything sanitary might assume he was naturally obeying all regulations. But that assumption can cost one a big fat fine (leading to fun scenarios like a food cart vendor needing a compliance legal team), and given Musk's relationship with the EU, they'd love to crucify him on any possible technicality they can find.
diggan25 days ago
Right, that's true I suppose. But also, if you don't have a car for example, you don't need to think about the laws of how to legally drive a car, since it doesn't apply to you.
Similarly, if you don't collect or process any personal data whatsoever, regulations like the GDPR don't even apply to you, so there isn't really any way (easy or hard) to "crucify" someone for violating them.
sam34525 days ago
Do you mean data protection or political correctness/control of discourse protection?
NicuCalcea25 days ago
I think they meant data protection. You can tell by how they said "data protection".
markdog1225 days ago
Not in Canada, either :(
verisimi25 days ago
The EU and UK are good for data protection?
cbg025 days ago
Pretty good considering there are laws around data privacy and government institutions that enforce them. Are they perfect? Of course not, but it sure is better than no laws to protect my personal data.
danparsonson25 days ago
diggan25 days ago
Probably SOTA in terms of data protection today in the world. Happy to be proven otherwise.
ddxv25 days ago
I think they put the new model behind a $40 paywall so fewer people use it. The model seems only marginally better than open-source models, based on xAI's own internal tests, and they spend $$$ to run it. Elon talked in the second half about building one of the largest GPU data centers just to get this running. I guess in the next iteration they'll be trying to reduce the costs.
Also, they will be open sourcing Grok 2, which is probably pretty behind at this point, but will still be interesting for people to check out.
adamhartenz25 days ago
They should have asked Grok3 how to create a good announcement stream before going live. That was a mess
ensocode25 days ago
What are your first impressions using it? (Not available in Europe currently). Is it a game-changer?
weberer25 days ago
>Not available in Europe currently
I hate how it's the same story for every new AI technology. If someone can tell me who to vote for, or where to protest, to change this awful EU law, that would be great.
iteratethis24 days ago
It's not an awful law.
The Digital Market Act is a bit of an overreach but the AI law is not.
It classifies AI into risk categories, so that it doesn't kill anyone, carelessly handle sensitive information, etc.
A chatbot can easily comply with it.
superflow25 days ago
totally agree. And this is one of the reasons the EU is falling more and more behind, all the silly regulations.
cbg025 days ago
The EU regulations are there to protect the average citizen, not to help the 1% run wild with whatever business idea they have. You personally might not like it, but the non-entrepreneurs, which is most people, are pretty satisfied that the laws in the EU are more focused on the citizen and their rights and not on boosting the shareholders' profits.
numpad025 days ago
> not to help the 1% run wild with whatever business idea they have.
And IMHO regulating the 1% doesn't hinder strategic advantages much. Otherwise China would not have come up with the DeepSeek models. Regulations are fine, they just have to be "based".
nozzlegear25 days ago
> Regulations are fine, they just have to be "based".
What? I know what "based" means, but I can't quite grok what you're saying.
weberer25 days ago
How exactly am I being protected by not having access to the latest models that the rest of the world has?
cbg025 days ago
Nothing is stopping X from complying with EU regulations to make it available to you. I'd wager that they most likely lack compute capacity to make it available everywhere, not legal compliance.
DaiPlusPlus24 days ago
> Nothing is stopping X from complying with EU regulations to make it available to you.
Given the personalities involved, I'd wager he's doing it more out of spite than for any actual legal justification.
...though if there was an actual legal risk, then I'll agree the economics probably don't bear out the risks. As someone who identifies as European, I'll admit that Europeans generally pay far less for tech/software/apps/services than Americans do [1]; salaries in Europe are also significantly below their US equivalents: paying $200/mo for OpenAI's service is a much harder sell to a London- or Berlin-based SWE on €90k/yr than to a Bay Area type on $300k/yr.
[1] e.g. If you can take Apple at their word, the EU accounts for 7% of their App Store revenue: https://techcrunch.com/2024/02/01/apple-says-eu-represents-7...
Besides, anyone in the EU who really wanted to use it can just use a VPN service.
ben_w25 days ago
Libel, from all the models hallucinating things done by whatever your real name is.
I mean, at least I get the advantage of being overshadowed by a famous film director with the same name as me, so nobody's going to assume anything associated with my name is actually about me…
…hopefully…
pillefitz25 days ago
I'm increasingly happy we have these regulations that prevent us from being ruled by the likes of Musk.
holoduke25 days ago
Why? Europe is getting extremely expensive on all levels. Time for a Musk there.
pillefitz24 days ago
I'd rather spend life in a poor democracy than in a rich technocracy that supports dictators around the globe.
maelito25 days ago
> for every new AI technology.
Well no. Mistral.ai
pjc5025 days ago
Have you tried asking the AI people to ship an AI that complies with EU law?
ReptileMan24 days ago
EU regulations are hit and miss. USB-C and opening up the Apple App Store are hits. AI regulation, cookie banners, and the idiotic bottle caps are misses.
littlestymaar24 days ago
I don't get the bottle-cap hate meme. Is it useful? Probably not, but the amount of hate it gets can't be explained by lack of concrete usefulness alone…
Also, the problem with GDPR is that it wanted to leave too much room for businesses to still collect an obscene amount of data, hence it allows the cookie banner. Please note that I emphasized "allow" because that's all GDPR does: it allows companies to use a cookie banner to extract consent to collect data. It doesn't mandate one in any way.
None of my multiple websites have a cookie banner on them because I'm not trying to extract consent from my users to abuse their data, I just don't collect it and I'm effortlessly GDPR-compliant in the least obnoxious way.
Cookie banners are just malicious compliance.
phatfish25 days ago
You could move to America and avoid the fake delays blaming the regulations tech companies don't like.
Lucasoato25 days ago
Companies need to adhere to the GDPR in order to enter the European market; people have the right to request deletion of their PII. It's a good law; actually, it should be applied everywhere. As a European, though, I'm scared: what if companies are testing whether excluding us is really much of a problem for their business?
ReptileMan24 days ago
They are doing it. At some point we have to agree that Brussels are idiots, and rarely savants.
cle24 days ago
I would be very surprised if they aren't monitoring the cost-benefit curve of delaying EU launches. Why wouldn't they? It costs extra money, time, and legal risk to launch in the EU. It's especially bad for XAI due to Musk's involvement.
(Note that it's not just GDPR, there's also the EU AI Act which has a whole extra set of requirements for compliance.)
littlestymaar25 days ago
It has nothing to do with the EU laws, or at least not in the sense they want you to think: no law prevents AI players from releasing their models here, but they are all also big tech players affected by the GDPR, DSA, and DMA, which harm their business by protecting consumers.
That's why they use their AI products as a leverage to turn European people against the laws that protect them from big tech. It's just blackmail.
weberer25 days ago
No, it has nothing to do with the GDPR or DMA. It is due to the AI Act.
https://artificialintelligenceact.eu/wp-content/uploads/2021...
littlestymaar25 days ago
The AI Act doesn't prevent Grok from releasing their model in the EU! (And ChatGPT early issues were all linked to GDPR)
For the record, Facebook restricted the use of Llama models in the EU even before the AI Act was passed (and the AI Act doesn't even apply to Llama anyway, except Llama 3.1 405B).
weberer25 days ago
It will come to EU countries eventually, but it takes a long time to go through "conformity assessments". Notebook LM, for example, was geoblocked for the EU for a full year before it became available in June. Grok 1 was released everywhere else in the world in November 2023, and in the EU in May 2024. About a 6 month delay.
littlestymaar25 days ago
> Grok 1 was released everywhere else in the world in November 2023, and in the EU in May 2024. About a 6 month delay.
And here you should see that it has nothing to do with the AI act, as it wasn't enacted before last August!
Furthermore, neither Grok 1 nor Notebook LM would have been subject to the AI act even if it had existed at the time.
As I said before, all of these companies have vested interests against EU's legislation as a whole, and they've tried to blackmail the EU from the beginning. They didn't wait for an actual AI legislation to exist to use AI as just another blackmailing tool.
cle24 days ago
I think you're misapplying the term "blackmail" here and thus poisoning the well. The EU is applying pressure to companies and companies are applying pressure back--that's not blackmail. They each have their own means of leverage, and they both use them.
littlestymaar24 days ago
First of all, both sides don't have the same level of legitimacy. And then, one side is blatantly lying about its intent by claiming that it is blocked by regulations instead of admitting that it is applying pressure (because they know they have no legitimacy to "apply pressure" on democratic institutions).
weberer25 days ago
Apparently EU regulators were blocking it for unspecified reasons until an agreement was made in May. And even then, they blocked news summaries until after the EU elections. If you can find more info, feel free to cite it. Info about these behind-the-scenes dealings is hard to find online.
https://www.socialmediatoday.com/news/xs-formally-twitter-gr...
littlestymaar25 days ago
The regulation the author of this article is thinking of is the DSA, even if it's not named directly. See this quote:
> Well, probably because Grok has already spread various false stories
(The European regulation that deals with disinformation is DSA).
And again it couldn't be the AI Act, because it wasn't in place at that time!
pkkkzip25 days ago
No, it was underwhelming, failing basic coding tasks and OCR/image-recognition tasks that none of the other existing models screw up.
modeless25 days ago
I am excited for the voice mode promised in "a week" or so. ChatGPT Advanced Voice has been a big disappointment for me. It can't do some of the things they demoed at the announcement. It's a lot dumber than text mode. I find the voice recognition unreliable. I couldn't get it to act as a translator last time I tried. But most of all I find I don't have much to talk to it about. If Grok 3 voice mode can discuss current events from the X timeline then it should be much more interesting to talk to.
nprateem25 days ago
[flagged]
designov25 days ago
Very impressive work given the timeline
podobo25 days ago
Say what you will about the guy, he kept the training running on time.
mint224 days ago
So he took credit for improvements others worked on and they also weren’t as good as purported?
(Assuming that is a reference to the Mussolini quote.)
sgerenser25 days ago
OK that's actually a pretty good one. If you didn't steal that from an X comment, I give you props.
UltraSane24 days ago
Why can so many people not see it?!
keepamovin22 days ago
The most fascinating part of the video for me was how they built the hardware to do this: https://youtu.be/AUAJ82H12qs?si=sHz3ddZnz2-HU3UL&t=2192
mirekrusin25 days ago
Love the low budget on the marketing side: just a few guys talking about the essence. Job done, tons of money saved if you ask me.
geor9e25 days ago
Launched where? https://x.com/i/grok just loads Grok 2. I assume it's only accessible from iOS right now?
waynenilsen25 days ago
grok.com
geor9e25 days ago
Ah you pay for it. "To get access to Grok 3, you need to be a Premium+ subscriber on X." I see why I don't see it.
Rover22223 days ago
they just made it temporarily free to anyone, FYI
zb325 days ago
I'm a freeloader, and it appears that unfortunately Elon is not stupid enough to just give it to me for free. There's no fair price either, since I see no pay-per-use pricing, so it's unavailable for me for now.
drusepth24 days ago
What makes pay-per-use pricing inherently more or less fair than unlimited usage from a subscription?
zb324 days ago
I don't use much, so for me this is not a good deal: I'd be paying for "unlimited" usage but making just a few requests daily.
So those who use less pay for those who use more, and I don't see that as a fair deal.
BTW, Grok 3 will be available on x.ai in coming weeks.
928340923225 days ago
I wonder if people will attempt to jailbreak this model to see if they can find evidence of federal data being used to train it.
mnewme25 days ago
Musk already has too much power; I won't trust him with my AI conversations.
piperly25 days ago
But you trust Google, OpenAI, and whatnot with it?
marticode25 days ago
Between your average corporate megacorp and a drugged-out antivax fascist, I'll take my chances with the former yes.
misiti378024 days ago
antivax lol - give it a rest man
marticode23 days ago
He has been publicly attacking Fauci for his handling of Covid and retweeting antivax people
misiti378023 days ago
Is Fauci some saint that can't be criticized? No matter which political party you support, is it really difficult to admit the guy totally mismanaged the pandemic response and messaging?
xzjis24 days ago
Yes that's true: Elon Musk isn't an antivaxxer, he's just a (neo-)Nazi
misiti378024 days ago
You clearly don't understand the definition of Nazi, but that's on you
Keyframe25 days ago
Been using Google for email for the past 20 years. So far, so good.
procaryote25 days ago
Trust is a strong word, but there are levels of hell
archagon25 days ago
A DeepSeek local model is the only thing worth trusting anymore. Fuck them all.
greatgib25 days ago
Billions spent, one of the most powerful AIs ever developed, and still no one competent enough to trim the 15 minutes of waiting-time filler at the beginning of the announcement video...
vachina25 days ago
Tells me they have spent their entire engineering time on engineering and zero on marketing fluff, which is good.
itishappy25 days ago
Not sure I share the same takeaway from their marketing video.
greatgib25 days ago
It doesn't look like they spent zero on marketing fluff, though...
throw1618033924 days ago
I'm guessing Musk wanted it that way.
lngnmn225 days ago
Anyone else noticed anything?
sunaookami25 days ago
They will open-source Grok 2 when Grok 3 comes out. Also it seems like it will be paywalled - disappointing considering DeepSeek-R1 is free and open source.
aprilthird202125 days ago
Yeah not sure what profit these guys think they'll be able to squeeze out of these models with open source and free clearly being 95% as good
ks204825 days ago
Having the keys to the treasury department will probably help.
imjonse25 days ago
Exclusive contracts with the defense industry or similar deals?
aprilthird202124 days ago
That probably won't come close to justifying the current valuation of either OpenAI or Grok (idk how much investment it took in or how much it has spent so far).
srid25 days ago
For some ouroboros fun, I attached this whole HN discussion and asked Grok 3 to summarize it (with a specific focus on the members' attitudes towards Elon Musk). Here's what it came up with:
s1artibartfast25 days ago
How did you customize the output?
srid25 days ago
I have no idea why that page says "Grok’s output has been customized by this user"; I don't see anything related to custom prompts in my Grok settings page. Maybe I'm looking in the wrong place?
arj25 days ago
Still no post on their official blog. How disappointing.
phtrivier25 days ago
Off topic, but just in case: is there a good reference on how people actually use LLMs on a daily basis? All my attempts so far have been pretty underwhelming:
* when I use chatbots as search engines, I'm very quickly disappointed by obvious hallucinations
* I ended up disabling github copilot because it was just "auto-complete on steroids" at best, and "auto-complete on mushrooms" at worst
* I rarely have use cases where I have to "generate a plausible page of text that statistically looks like the internet" - usually, when I have to write about something, it's to put information that's in my head into other people's heads
* I'd love to have something that reads all my codebase and draws graphs, explains how things work, etc... But I tried aider/ollama, etc., and nothing even starts making sense (is that an avenue worth persevering in, though?)
* Once, I tried to describe in plain English a situation where a team has to do X tasks in Y weeks, and I needed a table of who should be working on what for each week. I was impressed that LLMs were able to produce a table - the slight problem was that, of course, the table was completely wrong. Again, is it just bad prompting?
It's an interesting problem when you don't know if you're just having a solution in search of a problem, or if you're missing something obvious about how to use a tool.
Also, all introductory texts about LLMs go into many details about how they're made (NNs and transformers and large corpora and lots of electricity etc...) but "what you can do with it" looks like toy examples / simply not what I do.
So, what is the "start from here" about what it can really do ?
bhl25 days ago
I use it everyday.
For coding, I use cursor composer to gather context about the existing codebase (context.md). Then I paste that into DeepSeek R1 to iterate on requirements and draft a high level design document, maybe some implementation details (design.md).
Paste that back into composer, and iterate; then write tests. When I'm almost done, I ask composer to generate me a document on the changes it made and I double check that with R1 again for a final pass (changes.md).
Then I'm basically done.
This is architect-editor mode: https://aider.chat/2024/09/26/architect.html.
I've found Cursor + DeepSeek R1 extremely useful, to the point that I've structured a lot of documents in the codebase to be easily greppable and executable by composer. Benefit of that is that other developers (and their composers) can read the docs themselves.
Engineers can self-onboard onto the codebase, and non-technical people can unstuck themselves with SQL statements with composer now.
trash_cat25 days ago
Correct me if I am wrong, but the whole premise of Cursor and Windsurf is that this architect-editor mode is already built into the editor. This is why there is a distinction between the composer (editor) and the chat function (architect).
bhl24 days ago
Haven't tried Windsurf yet.
Chat function is just chat; it can't edit your files.
Composer probably relies on prompt engineering to do editor-architecture, as it reads and writes to your codebase. But it's heavily tied to Sonnet 3.5 and tool-calling.
For architecture-type stuff, I prefer DeepSeek R1, as reasoning models do better on high-level design. Which is why I copy and paste in and out of composer.
RobinL25 days ago
This sounds great - would love to hear a little more about the prompts. Are you literally just asking 'write me a context.md that explains how feature x works' or something like that?
bhl24 days ago
For context.md, it's that simple because it's unstructured data extraction from your codebase and working with a regular LLM model.
For design.md, I have a prompt because we're now working with a reasoning model and doing structured data extraction: create me an issue on Linear with a title, description, and a to-do list.
I would recommend trying the approach yourself and saving the prompts if you can nail down the repetitive asks.
RobinL24 days ago
Thank you!
tmikaeld25 days ago
This is the way.
Seriously, this is the only useful flow I've found for AI coding in general..
phtrivier25 days ago
Cursor is not yet an option for me, but at least it means aider is not a dead-end. Thanks for the info.
trash_cat25 days ago
Could you elaborate on why Cursor is a dead end?
djaychela25 days ago
My wife has found ChatGPT extremely useful when dealing with her mother, who has bipolar disorder and is obsessed with other people's health. I've got a terminal cancer diagnosis, and handling my mother-in-law has been extremely difficult, nearly to the point of no longer having any communication with her. ChatGPT has a single conversation with all the back story and has put some useful points across when discussing how difficult her behaviour has been (she watched an operation that failed for me for entertainment, for instance).
I have found similar when giving backstory and needing help to start structuring difficult conversations where I want to say the right thing but also need to be sensitive.
procaryote25 days ago
I'm sorry for your situation.
> she watched a operation that failed for me for entertainment, for instance
You make your own choices, but cutting a person like this off would be very reasonable
djaychela25 days ago
Absolutely, but it's my wife who will have to live with the consequences long-term, so I'm being led by her.
prmoustache25 days ago
You cutting off your mother-in-law doesn't mean your wife has to.
djaychela25 days ago
You don't know my mother-in-law!!!
pjc5025 days ago
Man. :( We worry about the AI being inhuman, but robotic meaningless pleasantry is in some cases a significant upgrade from human cruelty.
panphora25 days ago
You might find this optimal conversation path finder app of interest: https://x.com/eddybuild/status/1889908182501433669
jaggederest25 days ago
Here's some things I have in my chatgpt history:
- Discussing the various stages of candymaking and their relation to the fundamental properties of sugar syrups, and which candies are crystalline vs amorphous. It turns out Junior Mints are fudge. Fondant is really just fudge. Everything is fudge, my god.
- Summarizing various SEC filings and related paperwork to understand the implications of an activist investor intervening in a company
- Discussing the relative film merits of the movie Labyrinth and other similar 80s kitsch movies. ChatGPT mentioned the phenomenon of "twin films" which was an interesting digression.
- Learning about various languages Tolkien invented and their ties to actual linguistics of natural languages and other conlangs
- Some dimensional analysis of volumes, specifically relating to things like "how many beans are in the jar" estimation, and what the min and max value of a particular weight of coins might be, in terms of both a par value based on a standard coin mix and outliers like, for example, old dimes that are pure silver.
- Discussion of quines in prolog and other languages, which resulted in a very interesting ChatGPT bug where it started recursing and broke when trying to write a prolog quine.
- Back of the envelope economic calculations around the magnitude of the housing deficit and the relative GDP cost for providing enough housing quickly enough to make an impact. Spoiler: it's probably unreasonably expensive to build enough houses to bring down housing prices by any significant degree, and even if we wanted to, there's not enough skilled workers.
- A number of podcasts transcribed. (I hate audio and meandering, so transcribed and summarized is perfect) I could use whisper and a python script to do this, but I'd rather let ChatGPT do the legwork, and it actually used a more modern model and method of processing than I would have naively used.
I find GitHub Copilot to be a really great autocomplete. I frequently write the comment at the top of a function, hit tab, and it writes the whole function. This depends on TypeScript and a relatively standard codebase, but I think those things are useful on their own. You really have to limit it in terms of scope and specifics, but it lets me think at a high level instead of worrying about syntax.
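For what it's worth, the bean-jar item in the list above is the classic packing-fraction estimate; a minimal sketch (the jar and bean volumes are hypothetical, and ~60% is the usual random-close-packing figure for roughly ellipsoidal objects):

```typescript
// Estimate = usable volume / single-bean volume. Beans poured at random
// fill only ~60-64% of the container, hence the packing fraction.
function beansInJar(
  jarVolumeMl: number,
  beanVolumeMl: number,
  packingFraction = 0.6,
): number {
  return Math.round((jarVolumeMl * packingFraction) / beanVolumeMl);
}

// Hypothetical 1 L jar of 0.7 mL jelly beans:
console.log(beansInJar(1000, 0.7)); // prints 857
```

The min/max coin-value question works the same way: replace the packing fraction with per-denomination weights and take the extremes of the mix.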
danparsonson25 days ago
> Everything is fudge, my god
Fudge is made with milk - am I missing a joke?
jaggederest25 days ago
Technically fudge is just a crystalline sugar candy with a certain water percentage. Milk is optional (and frequently omitted). Reese's peanut butter cups are fudge, for example.
danparsonson24 days ago
Forgive me for asking but... do you have a source for that other than an LLM? Every search I've tried just confirms what I already thought.
jaggederest24 days ago
This explains better what I mean. https://en.wikipedia.org/wiki/Fudge#Texture - milk is added only as a stabilizer, and many candies that resemble fudge in texture incorporate milk or other stabilizers to prevent too rapid a crystallization. Fondant is just fudge without stabilizers.
namaria25 days ago
The only plausible explanation for the amount of resources poured into these language models is the hope that they somehow become the origin of AGI, which I think is pretty fanciful.
I can feel the cold wind of the next AI winter coming on. It's inevitable. Computers are good at emulating intelligent behavior, people get excited that it's around the corner, and the hype boils over. This isn't the last time this will happen.
guax25 days ago
I think the amount of money is explained in part by hubris. People in high positions think they're smarter and more capable than people at the bottom of the org, at least in proportion to what they earn. So it's reasonable, expected, borderline obvious to them that a computer bot can replace those people. So you're betting on its ability to get rid of, if not your junior devs, at least the majority of your customer support staff.
In reality, people doing "menial" jobs are smart, and they learn and operate with a lot more nuance than people assume out of unfamiliarity or just prejudice. Do you prefer to talk to a chatbot or a real human when you have a problem? How confident are you, really, that even if the bot knows what the problem is, it will be able to solve it?
A lot of the problems with customer care are anchored in the fact that support staff are not allowed to fix or resolve problems without escalation, or are tasked with keeping you from costing more money. The bot might be better at that, from the company's perspective, because it will frustrate you enough to give up on that 30-buck refund, idk.
AI seems to change the dynamics of corporate jobs a lot, but I haven't yet seen anything that would be a game changer outside of that. It's great for searching a company's unorganised and messy knowledge bases.
antupis25 days ago
I think this still applies https://x.com/dwarkesh_sp/status/1888164523984470055, LLMs now are useful but we need something else for AGI.
olalonde23 days ago
Didn't take long for this comment to age poorly :) https://news.ycombinator.com/item?id=43102528
mettamage25 days ago
Fun! I'll try it out, being a scientist with some LLM.
Cemlolo25 days ago
I can't feel any cold right now at all.
On all corners, people are working on so many small pieces advancing what we have.
And plenty of obvious things are not here yet, like a full local dev cycle: the AI uses the IDE to change code, then executes it, fixes compiler issues and unit tests, and then opens a PR.
Local agents, i.e. agents having secure and selective access to our data, like giving my agent read-only access to my bank account and a two-factor way for it to send money.
DeepSeek's reinforcement learning is also a huge new lead.
Nonetheless, in parallel, robots are coming too.
GenAI is getting better and better: faster, better, cheaper video; 3D meshes and textures; the first GenAI ads.
olalonde25 days ago
I predict this comment will age very, very poorly. Bookmarked.
brulard25 days ago
I feel like 50/50 chance of his or your comment aging poorly.
olalonde25 days ago
I feel there's a high probability your comment doesn't mean what you think it does (unless you truly believe both outcomes are as likely).
brulard25 days ago
Not sure how else could I have meant that.
olalonde24 days ago
It seemed like you intended to present your comment as a tautology (e.g. "I feel there's a 100% chance of his or your comment aging poorly"), but I'll give you the benefit of the doubt!
brulard24 days ago
Yeah, that's a good point. I just think it can go either way. I remember in 2015 how hyped we were about self-driving cars; we thought "in 10 years the majority of cars will be like that". Right now we may see a steady increase in the capabilities of AI for years to come, or we may see it plateauing.
namaria25 days ago
Cool. Invest in it then. That way you get paid instead of saying "I told you so" to some screen name.
Workaccount225 days ago
I think the snag I feel in your argument comes from
>Computers are good at emulating intelligent behavior
Which implies that the brain is some kind of transcendent device that can backdoor physics to output incredible intelligence unique to its magical structure.
Maybe LLMs aren't the key, but as far as we can tell the brain is also just another computer.
namaria25 days ago
Holy strawman batman.
Workaccount225 days ago
Care to differentiate intelligence from emulating intelligence?
namaria25 days ago
No, it would be very hard, and you've already shown you're not arguing in good faith, so I don't want to invest the time and effort.
And let me be very clear on why, because I love having conversations about this theme: it promises to be an adversarial and frustrating exchange.
gordon_freeman25 days ago
Everyone seems to have a different definition for AGI. Is there some kind of standard there?
mrkstu25 days ago
No, but the main issue is that all the reasonable ones I can conceive of lead inevitably to the Singularity technologically, and pretty quickly, since we seem determined to throw as much silicon as possible at the problem. Hopefully the final step is intractable.
iamnotagenius25 days ago
Precisely; however, this time we will have tangible results from the ongoing AI summer: generative art, and coding/writing/journalism assistants.
namaria25 days ago
There are always dividends. We got a lot of interest in Lisp from the first summer, and it arguably informed all currently used programming languages.
iamnotagenius25 days ago
Though the dividends were not as obvious to lay people then as they are now, which means the upcoming winter won't be as cold.
sanxiyn25 days ago
Many people replied with anecdotes, but Anthropic recently published an analysis of claude.ai production traffic. As far as I can tell this is the single best currently existing source on how people actually use LLMs. With everything else, you can't be sure whether it's representative.
phtrivier25 days ago
Thanks, this is a gem! However, I suspect "programming" is such a big use case because AI is closely integrated into text editors, as "autocomplete on steroids".
As they state in the report, they can't measure how many people ignored Claude's suggestion right away, or deleted more than half of the suggested code.
Imagine if the real impact of AI is "suggesting things that people discard immediately"?
Call the "Unamusing misuse of resource"... [1]
zamalek25 days ago
I use them as a springboard for things I am really unfamiliar with. I'm self-learning electronics at the moment, and so I can ask it things like "what's a common and widely available cooperator." You will not find that answer on a search engine, I don't care how good your Google fu is.
It's a weak jack of all trades: it knows a fair amount about the sum of human knowledge (which is objectively super-human), but can't go deep on any one thing, and still seriously lags behind humans in terms of reasoning. It's an assistant that's all book smarts and no street smarts. Or maybe: it's a search engine for insanely specific things.
Rote work, as well. Things like porting an enum from one programming language to another: paste the source enum into a comment and start it off with one or two lines in the target language. Dozens of tabs are surely faster than manual typing, copy-paste, or figuring out vim movements/macros.
Workaccount225 days ago
Heads up as an EE who uses LLMs quite a bit; they cannot analyze circuits or build them.
They might be able to help stitch together modules (like sensor boards plugged into microcontrollers) and definitely can write code to get things going, but they fall flat on their face hard for any kind of bare bones electronics design. Like 5% success rate and 95% totally incorrect hallucinations.
esafak25 days ago
The training data just isn't there yet, but I imagine they could use a circuit simulator for the verification involved in training the model, right?
Workaccount225 days ago
The problem is really that schematics are at the very heart of electronics design (and teaching/instruction), so to train a model you need a very powerful vision model to really unlock all the good training data.
The models can also output code that can be turned into a schematic through an interpreter, but there is virtually zero training data for this because humans always use and work with schematics.
zamalek25 days ago
Yeah, even I found it doing some dubious things as a beginner. Still helpful for things like how to correctly use certain components, but the svg diagrams it provided were hilarious at times.
brulard25 days ago
I ordered some electronic components / sensors from china, and as it took months to arrive, I forgot exactly what I ordered (I'm noob at this). Simply taking a picture and asking Claude what it was helped a lot. The numbers and letters printed on the components didn't yield relevant results on google.
mettamage25 days ago
Your experience matches mine.
card_zero25 days ago
That's a mistake for "comparator", isn't it. You've allowed the AI to train you to use the wrong word through a shared delusion, haven't you.
Edit: if anybody knows otherwise, show me some evidence, don't just downvote. If these things are widely used, why are they impossible to find by searching? Why doesn't this electronics site know about them:
https://www.eeeguide.com/?s=cooperator
Why aren't they in any books when I did a full-text search on archive.org? Why doesn't Wikipedia know about them? Why aren't there threads about them on electronics forums?
I found them (through an image search) in exactly one place: educational training boards made in India by Jainco, like this one:
https://jaincolab.in/delta-modulation-and-demodulation
But this other one talks about a "ramp comparator" and then repeats the phrase but using "ramp cooperator" instead.
https://www.jaincolab.com/firing-circuit-using-ramp-comparat...
So I surmise it's an error and not a real thing.
zamalek25 days ago
It's an autocorrect. I did mean comparator. Presumably the second link you pasted ran into the same problem.
> You've allowed the AI to train you to use the wrong word through a shared delusion, haven't you.
What an awful interpretation, phrased in the most demeaning manner possible. No, I double check everything the AI suggests. That's basic competency with the things.
zamalek25 days ago
Not that I would have had to, just ran a test:
> Me: I'm trying to use a cooperator in my schematic, how do I hook it up? Also what's a suitable cooperator for 3.3v logic level?
> Gemini: It sounds like you might be thinking of a comparator, not a "cooperator," in your schematic. Comparators are electronic circuits that compare two voltages [...] LM393: This is a low-power, dual comparator that can operate from a 3.3V supply. Its output can be directly connected to most 3.3V logic circuits.
card_zero24 days ago
Oh. Then in fact you will find the answer in a search engine, incredibly easily. But I apologise for assuming you were involved in a mechanical folie à deux. (It could happen!)
sitkack25 days ago
> You will not find that answer on a search engine, I don't care how good your Google fu is.
The answer staring the OP right in the face.
jfim25 days ago
I've found that Claude has been pretty decent at writing boilerplate code.
For example asking it something like "I have an elixir app that is started with `mix ...` can you give me a Dockerfile to run it in a container?"
It can also do things like "Given this code snippet, can you make it more Pythonic" or even generate simple apps from scratch.
For example, a prompt like "Can you write me a snake game in HTML and JavaScript? The snake should eat hot dog emojis to grow longer." will actually generate something that works. You can see the generated code for that prompt at https://claude.site/artifacts/34540f88-965e-45ca-8083-040e30...
Following up with "Can you make it so that people can swipe on mobile to control the snake?" generates https://claude.site/artifacts/651e957a-9957-488c-ae6b-e81348... which is pretty good IMO for 30 seconds of effort.
It also has a surprisingly competent analysis mode where you can upload a CSV and have it generate charts and analyze the data.
It's not perfect, it'll sometimes get confused or generate some dubious code, but you can quickly get to a 90% good solution with 1% of the effort, which is pretty impressive IMO.
richrichardsson25 days ago
> I ended up disabling github copilot because it was just "auto-complete on steroids" at best
this is a good enough sell for me, and it's like sub 1-in-50 that it's "auto-complete on mushrooms" (again my experience, YMMV).
An awful lot of the time, my day to day work involves writing one piece of code and then copy-pasting it changing a few variable names. Even if I factor out the code into a method, I've still got to call that method with the different names. CoPilot takes care of that drudgery and saves me countless minutes per day. It therefore pays for itself.
I also use ChatGPT every time I need some BASH script written to automate a boring process. I could spend 20-30 minutes searching for all the commands and arguments I would need, another 10 minutes typing in the script, another 10-20 minutes debugging my inevitable mistakes. Or I make sure to describe my requirements exactly (5-10 minutes), spend 5 minutes reviewing the output, iterate if necessary (usually because I wasn't clear enough in the instructions).
3-5x speed up for free. Who's not going to take that win?
owenpalmer25 days ago
My biology professor provides basically zero feedback on his students' understanding of the material. There are very few practice questions to prepare for exams, which are worth 40% of your grade. I had an LLM write some Python that extracts the relevant textbook chapters, which I can then feed into an LLM to generate practice questions. Then I can ask the LLM for feedback on whether or not I'm articulating the answers correctly.
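A sketch of the kind of extraction helper described above. The "Chapter N: Title" heading format, and all function names, are assumptions for illustration; real textbook dumps will need a pattern matched to their actual headings.

```python
import re

def split_chapters(text):
    """Return {heading: body} for headings like 'Chapter 3: Genetics'."""
    # re.split with a capture group keeps the matched headings in the result,
    # alternating heading/body at odd/even indices after the preamble.
    parts = re.split(r"(?m)^(Chapter \d+: .+)$", text)
    return {parts[i].strip(): parts[i + 1].strip()
            for i in range(1, len(parts) - 1, 2)}

def extract_relevant(text, wanted_numbers):
    """Concatenate the bodies of the requested chapter numbers."""
    chapters = split_chapters(text)
    return "\n\n".join(body for head, body in chapters.items()
                       if int(head.split()[1].rstrip(":")) in wanted_numbers)
```

The output of `extract_relevant` can then be pasted into a prompt like "Generate 10 practice questions from the following chapters: ...".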
Nition25 days ago
I reckon the ideal use case for chat LLMs at the moment is as a bridge for questions that are hard to search but easy to verify.
For example, you have a plant you can't identify. Hard to Google search with words. "Plant with small red berries and...". You could reverse image search your photo of it, probably won't help either. Show an LLM the photo (some accept images now). LLM tells you what it thinks. Now you Google search "Ribes rubrum" to verify it. Much easier.
You've got a complicated medical problem that's been going on for months. A google search of all the factors involved would be excessively long and throw up all sorts of random stuff. You describe the whole scenario to an LLM and it gives you four ideas. You can now search those specific conditions and see how well they actually match.
I've found there are actually a lot of questions that fit in that sort of NP complexity category.
krige25 days ago
As a side note, there's an app for that! (tm). PlantNet does recognize plants based on photo provided and it is doing a pretty good job at it. It predates the LLM craze by a bit.
mplanchard25 days ago
The Seek app (by iNaturalist, another excellent app) also can identify plants based on a photo, and without the need for an internet connection, which is a critical feature IMO since you often want it when you’re out walking in the woods or whatever.
qingcharles25 days ago
I use LLMs significantly on a daily basis, mostly for coding C#, HTML, CSS, SQL. I use them for researching for wiki articles. I use it for summarizing long web pages and science papers. I use them for translation. I used GPT last night to repair my furnace (I've never opened a furnace before).
It (mostly) excels at every task I use it for. I'm rarely disappointed. YMMV.
Absolutely life-changing for me.
jiggawatts25 days ago
I think most people are still "holding them wrong", and it'll take an entire generation of people to really figure out what these things are and are not good for.
I'll give two recent use-cases that may provide a hint of their ultimate utility:
1) I've been modernising 2010-era ASP.NET code written by former VB programmers that looooved to sprinkle try { ... } catch( Exception e ) { throw e; } throughout. I mean thousands upon thousands of instances of these pointless magical incantations that do nothing except screw up stack traces. They probably thought it was the equivalent of "ON ERROR RESUME NEXT", but... no, not really. Anyway, I asked ChatGPT in "Reasoning" mode to write a CLI tool utilising the Roslyn C# compiler SDK to help clean this up. It took about three prompts and less than an hour, and it spat out 300 lines of code that required less than 10 to be modified by me. It deleted something like 10K lines of garbage code from a code base for me. Because I used a proper compiler toolkit, there was no risk of hallucinations, so the change Just Worked.
2) I was recently troubleshooting some thread pool issues. I suspect that some long-running requests were overlapping in time, but Azure's KQL doesn't directly provide a timeline graphical view. I dumped out the data into JSON, gave ChatGPT a snippet, and told it to make me a visualiser using HTML and JS. I then simply pasted in the full JSON dump (~1 MB) and ta-da instant timeline overlap visualiser! It even supported scrolling and zooming. Neat.
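Minus the HTML/JS rendering, the core of that second tool is just an interval-overlap check over the dumped records. A plain-Python sketch (the `id`/`start`/`end` field names are assumptions about the JSON dump):

```python
def find_overlaps(requests):
    """requests: list of dicts with 'id', 'start', 'end' (comparable values).
    Returns pairs of ids whose time intervals intersect."""
    ordered = sorted(requests, key=lambda r: r["start"])
    overlaps = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b["start"] >= a["end"]:
                break  # sorted by start, so no later request can overlap a
            overlaps.append((a["id"], b["id"]))
    return overlaps
```

Feeding the overlapping ids back into the visualiser (or just printing them) is enough to confirm whether long-running requests really do pile up.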
marcuschong25 days ago
Last night I was about to start working on a lot of text I need to submit my startup to a government funding program. Questions like "describe what your startup does", "describe your market" and things like that. Tens of fields, which I estimated would take me a week and a half to do right, if I wasn't going to pause all my other activities.
Then I had a better idea: I spent 20 minutes baby wearing, walking and dictating everything about my startup to ChatGPT. Later I took all that text and labeled it as a brain dump, plus my product support portal and some screenshots of my marketing material. Gave it all to ChatGPT again and asked it to answer each of the questions in the form. That's it. I have a pretty good version 1 which I can revise today and be done with it.
Many, many hours saved. I have tens of examples like that.
The product documentation I provided it with was also created with the help of GPT, and that saved me even more time.
sanswork25 days ago
Autocomplete on steroids is what I use it for. I've recently started using Cursor and the productivity improvements have been huge. I won't let it write very large blocks of code, but I do a lot of web stuff, so being able to update the classes in one spot and have it recognise all the other places that need the same change, letting me just tab through, is helpful. It's also pretty good at code to test things, which saves a lot of typing.
jagermo25 days ago
For me, getting summaries of meetings is my favorite use case. Saves me from taking notes and I can extract next steps.
It also helps me get started with new content, kind of building the scaffolding of, say, a blog or social post. It still needs adaptation and fine-tuning, but getting rid of a blank page is a great help for me.
And I use LLMs to play through ideas and headlines. I would normally do this with other humans, but since working fully remote, it's a nice sparring partner, although the AI not being able to really give criticism is a bit annoying.
The tools also make it easier to write in English as a non-native speaker, making sure my text does not include any false friends or grammar errors.
Yizahi25 days ago
Meeting summaries are the most hilarious thing these neural networks have produced. I don't know which NN model Zoom uses, but the text it produces is super funny :). It basically can't parse half of the words, and then generates random sentences using the remaining ones.
jagermo25 days ago
Agreed, it was super funny, especially if a song played in the beginning or if you switched languages. It has gotten way better, at least in my experience.
theshackleford25 days ago
> although the AI not being able to really give criticism is a bit annoying.
I’ve managed to get ChatGPT to a good place in this regard recently and it’s better for it. In fact, it’s been doing such a good job recently that it almost seems like…human like.
I’ll have to look at my prompts, but somehow I got it from licking my ass and telling me everything I say and do is amazing to a point now where it almost seems eager to tell me I’m wrong.
Which is fantastic, huge improvement. I don’t really use it for coding though, because I am not a programmer. I would have no means today to correctly evaluate 90% of what it would return me.
magicalhippo25 days ago
I use them as an alternative to search engines for topics where I have some specific question where traditional search engines fail to find the needle in the haystack.
As a concrete example, I was recently playing with simulating the wave equation, and I wanted to try to use a higher-order approximation as I had never done that before. I'm quite rusty as I haven't done any numerical work since university some decades ago.
I still recalled how to deal with the Neumann boundary conditions when using the traditional lower-order approximation, but I was uncertain how to do it while keeping the higher-order approximation.
Searching for "higher-order neumann boundary conditions wave equation" or similar got me pages upon pages of irrelevant hits, most of them dealing with the traditional approximation scheme.
So I turned to ChatGPT which right away provided a decent answer[1], and along with a follow-up question gave me what I needed to implement it successfully.
[1]: https://chatgpt.com/share/67b4ab43-6128-8013-8e5a-3d13a74bf6...
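For reference, the "traditional lower-order approximation" mentioned above fits in a few lines of Python: a second-order leapfrog scheme for the 1D wave equation, with the du/dx = 0 Neumann condition imposed via mirrored ghost points. (The higher-order variant the comment asks about widens the stencil and needs a matching higher-order boundary closure; this sketch only shows the standard scheme.)

```python
def step_wave(u_prev, u_curr, courant2):
    """One leapfrog step; courant2 = (c*dt/dx)**2, must be <= 1 for stability."""
    n = len(u_curr)
    u_next = [0.0] * n
    for i in range(n):
        # Neumann (du/dx = 0) via mirrored ghost points: u[-1] = u[1], u[n] = u[n-2]
        left = u_curr[i - 1] if i > 0 else u_curr[1]
        right = u_curr[i + 1] if i < n - 1 else u_curr[n - 2]
        u_next[i] = (2.0 * u_curr[i] - u_prev[i]
                     + courant2 * (left - 2.0 * u_curr[i] + right))
    return u_next

def simulate(u0, steps, courant2=0.5):
    """Run from rest: u_t = 0 initially, so the first 'previous' state equals u0."""
    prev, curr = u0[:], u0[:]
    for _ in range(steps):
        prev, curr = curr, step_wave(prev, curr, courant2)
    return curr
```

With this layout the boundary treatment is the only part that changes when moving to a higher-order interior stencil, which is exactly the part that is hard to find with a search engine.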
nomilk25 days ago
This video's pretty great: https://www.youtube.com/watch?v=uRuLgar5XZw
One thing I can't figure out how to get LLMs to do is truly finish work. For example, if I have 100 items that need xyz done to them, it will do it for the first 10 or so and say ~"and so on". I have a lot of trouble getting LLMs to do tasks that might take 10 mins - 1h. They always seem to simply want to give an example. Batch processing is the answer, I guess, or perhaps more 'agentic' models/tools, but I wonder if there are other ways.
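One batch-processing workaround is to do the chunking yourself and make one call per chunk, so the model never sees enough items to give up on. A sketch, where `call_llm` is a placeholder for whatever client you actually use (OpenAI, Anthropic, ...), and the prompt wording is just an assumption:

```python
def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def process_all(items, call_llm, chunk_size=10):
    """Run the task over every item by issuing one LLM call per chunk."""
    results = []
    for batch in chunked(items, chunk_size):
        prompt = "Apply xyz to EVERY item below, one output line per item:\n"
        prompt += "\n".join(batch)
        results.extend(call_llm(prompt).splitlines())
    return results
```

Keeping chunks around the size the model reliably completes (10 or so, per the comment above) trades more API calls for actually-finished work.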
phtrivier25 days ago
Thanks for the link.
(Other answers are people gathering examples, which is nice, but I'm looking for more structured things.
And, I suppose I could ask an LLM, but my main problem is that... I don't really _trust_ LLMs yet :D )
maciekpaprocki25 days ago
Finally, after two years of hype, I have some usage for LLMs.
We import descriptions of products from a seller. The problem is they are mental (probably written by ChatGPT :)) and way too long; we only need a small blurb.
I put our style guide and the given text into ChatGPT and get a somewhat reasonable description back. Editors still need to check it, but it's way less work.
phtrivier25 days ago
I've seen a comic at the beginning of the LLM hype where:
* in panel A, some guy is proud to use ChatGPT to turn 3 lines of text into a 10 pages report
* in panel B, some girl is happy to use ChatGPT to summarize the 10 pages report into 3 lines
It was meant to be _satire_, not the sales pitch ;)
snowwrestler25 days ago
Reminds me of a great joke tweet:
> ZIZEK: that AI will be the death of learning & so on; to this, I say NO! My student brings me their essay, which has been written by AI, & I plug it into my grading AI, & we are free! While the 'learning' happens, our superego satisfied, we are free now to learn whatever we want
somenameforme25 days ago
I find them useful for searching for some function or API name with natural language. 'What's the function call [in blah] that generates a quaternion from a couple of vectors?' type stuff. Not exactly inspiring but I've found it highly useful. If you try to search for something like that online (and somebody hasn't asked the exact question on e.g. stack overflow) you'll just end up getting all the documentation for quaternions, vectors, and blah - when the function itself might even be in a tertiary math library.
staticman225 days ago
These probably aren't tasks you need done, but:
LLMs are pretty good at translation between human languages, which makes sense since they are language models after all. They are better at this than any other technology.
The state-of-the-art image ones can also probably do OCR and handwriting recognition better than any other software, though they may be expensive to run in large volume. But if you need to take a picture of a notebook page with your phone camera, an LLM can quickly OCR it.
iamnotagenius25 days ago
Not quite true; LLMs are very expensive to run. BERT or another transformer specifically built for translation can be cheaper to run.
[deleted]25 days agocollapsed
concordDance25 days ago
Big use cases for me are:
1. Exploring a new domain and getting some terms I can google for.
2. Making small scripts to do things like query github's GraphQL API.
3. Autocomplete of code using copilot.
pmvpeter25 days ago
I use it daily for all sorts of things, but one of the most interesting uses for me so far has been self-reflection.
For example, in the beginning of this year, I completed this exercise where I wrote a lot about childhood, past experiences, strengths and weaknesses, goals and ambitions for the future, etc (https://selfauthoring.com) and then I uploaded all that to ChatGPT, asked it to be my therapist/coach, and then asked it to produce reports about myself, action plans, strategies, etc. Super interesting and useful.
By now ChatGPT has quite a bit of context from past conversations. Just the other day I used this prompt from someone else and got back super useful insights – "Can you share some extremely deep and profound insights about my psyche and mind that I would not otherwise be able to identify or see as well as some that I may not want to hear"
liampulles25 days ago
I don't use it daily, and I find copilot counterproductive (for me). I do try to experiment with chatgpt when I remember to.
I find it good for complex SQL, reviewing emails, and Godot assistance (I'm a beginner game Dev).
There are also times when I have programming questions and I might try to use chatgpt, with mixed results.
Our company has tried to integrate it into one of our products, and I find it troubling how on occasion it confidently gives bad results, but my concern seems to be in the minority.
EDIT: there was also a large refactor I did recently which involved lots of repeatable, but not super regexable, changes - chatgpt forgot where it was as I went through it, but other than working around that it was very useful.
brulard25 days ago
In the last few days I discovered it's good at medium-complexity SQL, but not at really complex queries. I've been struggling for something like four days with Claude, ChatGPT, Gemini and Deepseek. All could do some good analysis with some low-hanging-fruit improvements, but all went completely crazy when trying to optimize more complex things: getting into loops proposing the same changes over and over, outputting invalid SQL, and Gemini even forgot what we were doing, asking me if I could paste again the query I included in the very first message. Maybe the chain-of-thought models would handle this better, but I believe I hit the limit for the standard ones.
liampulles25 days ago
Probably my complex SQL is your medium-complexity SQL. SQL is not a big part of my current project.
davedx25 days ago
I use mine as if it's an infinitely patient, relatively competent junior/medior level developer that I constantly give small chunks of programming to do (typically a function at a time), and occasionally consult on architecture/design/other things.
I don't use integrated coding tools, so my workflow isn't super fast, but that's not what I'm really aiming for - more that I want to save my brain's energy from low level drudgy boilerplate or integration code, so I can focus it on the more important decisions and keep business-side context in my head.
It's been a huge help for me this way across multiple projects in multiple domains.
CrimsonRain24 days ago
I've coded a full custom deployment system (config, create, update, cert management and much more) entirely in bash using nothing but ChatGPT. I didn't write a single line of bash.
I did write 50 or more lines of instructions on what needs to be done and in what order.
ChatGPT gave me 5/6 (I asked for this) bash scripts totalling 300+ lines that seamlessly work together.
After reviewing, I asked it to change a few places.
If any human tried the same (except those rare bash Gods), it'd take many hours. I think it took me less than 30 minutes.
mtaras25 days ago
The Vergecast recently did a section where they asked listeners what they use LLMs for (specifically not for coding) https://youtu.be/WwNjBNtZ3Co 30 minutes starting at 45:25, it had a number of interesting examples. Might not convince you of LLM's excellence, or might not be much different from what other people commented, but it's a good listen nonetheless.
yodsanklai25 days ago
I use ChatGPT all the time for:
1. Small coding tasks ("I want to do XYZ in Rust"); it has replaced Stack Overflow. Very convenient when writing code in a language I'm not super familiar with.
2. Help with English (translation, proofreading...).
3. Learning something, like tech. I like interacting with it by asking questions; it's more engaging than just reading content.
I'd say nothing is game changing, but it's a nice productivity boost.
tkgally25 days ago
The sister comments contain quite a few specific examples. But the many back-and-forth arguments here on HN about whether LLMs are useful for coding suggest that understanding how they might or might not be used may be the biggest challenge at this point.
I myself use them a lot, though I constantly feel that I would be able to get more out of them if only I were smarter.
wobfan25 days ago
I feel that I would be smarter if I wouldn’t use them constantly.
mwigdahl25 days ago
You could quit using high level language compilers also, jump back to pure assembly, and get smarter still!
lm2846925 days ago
> All my attempts so far have been pretty underwhelming:
Same, it's good for repetitive things, things that have been answered 1000 times on stack overflow, translations, but that's about it. If you work on anything remotely new/hard it's mostly disappointing, you have to babysit it every step of the way and rewrite most of what it's shitting out in the end anyways.
I think it just made it obvious that 90% of tech jobs basically amount to writing the same CRUD thing over and over again & mobile/web apps with very common designs and features.
NoboruWataya25 days ago
I admit to having been an LLM sceptic from day one, but I have been using ChatGPT and Claude a fair bit to try and figure out what the hype is all about. I haven't really succeeded.
Most recently I tried to use them both to solve a programming problem that isn't well documented in the usual channels (Reddit, StackOverflow, etc) and found it to be quite a disappointing and frustrating experience. It just constantly, enthusiastically fed me total bullshit, with functions that don't exist or don't do what the LLM seems to "think" they do. I'm sure I'm just "holding it wrong" but my impression at this stage is that it is only capable of solving problems that are trivially solvable using a traditional search engine, with the added friction that if the problem isn't trivially solvable, it won't actually tell you that but will waste your time with non-obvious wrong answers.
I did have a slightly more positive experience when asking it about various chess engine optimisation algorithms. I wasn't trying to use the code it generated, just to better understand what the popular algorithms are and how they work. So I think they might work best when there is an abundance of helpful information out there and you just don't want to read through it all. Even then, I obviously don't know what ChatGPT was leaving out in the summary it provided.
scotty7925 days ago
Try giving it a task you'd expect a junior developer to successfully finish.
geros25 days ago
Cooking & Meal Planning:
- I have these three ingredients; recommend Italian main courses.
- What other ingredients pair well with this?
- How can I "level up" this dish if I want to impress?
- Can I substitute X for Y?
- Generate a family-friendly meal with lots of veggies using leftover roast chicken.
nmeofthestate25 days ago
I just used ChatGPT to summarise an HN post about a washing machine installation taking unexpectedly long because of unexpected turns of events, and this being analogous to software development. It was a time-saver.
imgabe25 days ago
I think it excels when you know enough to precisely describe what you want but you don’t know enough about the details of the language or framework you’re using to implement what you want.
lightandlight25 days ago
Here are some of my experiences:
* Figuring out where to start when learning new things (see also <https://news.ycombinator.com/item?id=43087685>)
One way I treat LLMs is as a "semantic search engine". I find that LLMs get too many things wrong when I'm being specific, but they're pretty good at pointing me in a general direction.
For example, I started learning about OS development and wanted to use Rust. I used ChatGPT to generate a basic Rust UEFI project with some simple bootloading code. It was broken, but it gave me a foothold and I was able to use other resources (e.g. OSDev wiki) to learn how to fix the broken bits.
* Avoiding reading the entire manual
It feels like a lot of software documentation isn't actually written for real readers; it's more a somewhat arbitrary listing of a program's features. When programs have this style of documentation, the worst case for figuring out how to do a simple thing is reading the entire manual. (There are better ways to write documentation, see e.g. <https://diataxis.fr/>)
One example is [gnuplot](http://www.gnuplot.info/). I wanted to learn how to plot from the command line. I could have pieced together how to do it by zipping around the [gnuplot manual](http://www.gnuplot.info/docs_5.4/Gnuplot_5_4.pdf) and building something up piecewise, but it was faster to instruct Claude directly. Once Claude showed me how to do a particular thing (e.g. draw a scatter plot with dots instead of crosses) I then used the manual to find other similar options.
* Learning a large codebase / API
Similar to the previous point. If I ask Claude to write a simple program using a complex publicly-available API, it will probably write a broken program, but it won't be *completely* bogus because it will be in the right "genre". It will probably use some real modules, datatypes and functions in a realistic way. These are often good leads for which code/documentation I should read.
I used this approach to write some programs that use the [GHC API](https://hackage.haskell.org/package/ghc). There are hundreds of modules, and when I asked Claude how to do something with the GHC API it wrote relevant (if incorrect) code, which helped me teach myself.
* Cross-language poetry translation
My partner is Chinese and sometimes we talk about Chinese poetry. I'm not very fluent in Chinese so it's hard for me to grasp the beauty in these poems. Unfortunately literal English translations aren't very good. We've had some success with asking LLMs to translate Chinese poems in the style of various famous English poets. The translation is generally semantically correct, while having a more pleasing use of the English language than a direct translation.
TomK3225 days ago
I like having fun with them, like asking Grok whether some Elon Musk tweet is true. Usually it replies with a lengthy answer and I then force it to answer with Yes or No. Even more fun: after drilling it further to load a few more details into its brain and then asking the first question again (Yes/No only), it sometimes changes its answer. I do wonder, has Grok already joined the resistance against Musk?
linguistbreaker25 days ago
Start from here :
Stop using Google search and use an AI. No more irrelevant results, no more ads. No more slop to wade through.
BTW I find Claude is great at making graphs and diagrams. If you pay ($20) you can hook it up to a local code base.
notachatbot12325 days ago
> No more slop to wade through.
Huh? More like "slop exclusively generated for you", right? I have seen so much garbage answers from chat AIs.
ahoog4225 days ago
Any example code or blogs/docs that demonstrate making graphs/diagrams and/or hooking it up to a local code base?
tallanvor25 days ago
Honestly, the main thing I've found ChatGPT to be useful for in my daily life is helping to translate what I write from my native language to the language spoken by most of the people where I live. But even then it only really works if you have at least a basic understanding of the language and can ask it to rewrite sections when you recognize poor word choices or awkward phrasing.
scotty7925 days ago
It helps to split what you are translating into 1-2 paragraph chunks and feed it one by one.
poulpy12325 days ago
LLMs are good at one thing, and not by chance it is exactly the thing they were designed to be: a word-probability generator. If you can constrain your usage around that, they are great to use. But the people who think they can reason or know some kind of truth are delusional.
esafak25 days ago
Explain how o3 won a gold medal at this year's International Olympiad in Informatics, or provide your benchmark for reasoning.
poulpy12324 days ago
It's very obvious from the mistakes they make that they are not reasoning but providing the most probable answer according to their dataset. It's very impressive because their dataset is vast beyond human scale, but there is no reasoning.
Al-Khwarizmi25 days ago
I use it for lots of stuff where I'm not an expert, or that are low stakes. I don't use it for the "core" of my job, but there are many things that are not "core" and still eat up a lot of time, in fact, most of my workday would be in this category. Some typical examples from my daily life as a university professor:
- Writing Python scripts to make charts out of Excel sheets, and then refine them. I could do it myself, but I would need to learn a library like Seaborn or similar which honestly is not especially intellectually stimulating, and then spend nontrivial amounts of time iterating on the actual code. With LLMs it's a breeze.
- Working with cumbersome LaTeX formatting, e.g. transposing a table, removing a column from a table, etc.
- Getting the tone just right in a professional email written in English to someone I don't know much (I'm not a native speaker so this is not trivial).
- Finding resources on topics that are tangential to what I do. For example, yesterday I needed to come up with some statistics on English words for a presentation I'm preparing, and I needed a free corpus where I could search for an n-gram and get frequencies of next words. I don't usually work with that kind of resource, it was just a one-off need. I asked for corpora of that kind and got a useful answer instantly. The manual process would probably have implied going through several options only to find that I needed a license or that they didn't provide the specific statistics I needed.
- Brainstorming on titles for scientific papers, presentations, names of concepts that you introduce on a paper, variable names, etc.
- Shortening a sentence in a paper that makes me go over the page limit, or polishing the English in a paragraph.
- Summarizing a text if I'm kind of interested in knowing the gist but have no time to read it whole.
- Answering quick questions on basic things that I forget, e.g. the parameters to make a Linux folder into a tar.gz. Man pages are too verbose and it takes time to sort the wheat from the chaff, Google is full of SEOd garbage these days and sometimes you need to skim a lot to find the actual answer, LLMs are much faster.
- Writing bureaucratic boilerplate, the typical texts with no real value but that you have to write (e.g. gender perspective statement on a grant request).
- Coming up with exam questions. This is a rather repetitive activity and they're fantastic at it. At my place we also have two official languages and we need to have exam assignments in both languages, guess who does the translation now (respecting LaTeX formatting, which previous machine translation tools typically wouldn't do).
- As an example of a one-off thing, the other day I had to edit a Word document which was password-protected. I asked ChatGPT how to unlock it and it not only answered, but actually did it for me (after 3 tries, but still, much faster than the time it would have taken for me to find out how to do it and then actually do it).
These are just some examples where they contribute (greatly) to my productivity at work. In daily life, I also ask them lots of questions.
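For anyone who lands here with the same tar question from the list above, a minimal sketch (the folder name is just a placeholder):

```shell
# -c create, -z gzip-compress, -f write to the named archive file
tar -czf myfolder.tar.gz myfolder/

# -x extract (same -z and -f flags)
tar -xzf myfolder.tar.gz
```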
JTyQZSnP3cQGa8B25 days ago
Companies have hijacked the open source concept to mean downloadable blob and we follow them as I see in the comments. It’s a real shame.
danielbln25 days ago
I remember the NVIDIA Linux kernel binary blob driver discussions from the early-mid 2000s. Who knew we had an open source driver all along...
JimDabell25 days ago
Something isn’t open-source because you get everything that went into making it. Something is open-source if you can change it (relatively) easily. The GPL and open-source definition both refer to “the preferred form for making modifications”. The preferred form for modification in the Nvidia driver’s case is the source code. The preferred form for modification in this case is the weights.
Open-source as a concept doesn’t really correspond well with LLMs but to the extent that it does, access to the training data is not required because that training data is not the preferred form for making modifications.
miki12321125 days ago
> that training data is not the preferred form for making modifications.
I definitely disagree with this.
Yes, you can do some SFT fine tuning on an existing model, but if you want to make specific, substantial, targeted changes (less safety? better performance on math and code at the expense of general knowledge?), your best bet is to change the training mixture, and for that you need the original datasets.
fragmede25 days ago
Preferred by whom? Sharing models isn't open source, and we're just going to have to keep having this argument. Letting us download the model is a very nice thing for Facebook to do, but you don't get to call it open source if you're not showing us the source! Explicitly, if we can't see the forced alignment, where the model gets its refusal to talk about Tiananmen Square, or how to make meth, or whether The Information is a reputable news source, then it's not open. The preferred form of modification is to take the data and train it. That some people have been able to take the model and tweak it doesn't make it preferable.
soraminazuki25 days ago
Why was the free software movement a thing when Windows was open source all along, haha.
CTDOCodebases25 days ago
In that case Linus needs to make a retraction.
beeflet25 days ago
NVIDIA ... THANK you!
linus turns to the camera, giving a thumbs up
lithiumii25 days ago
Someone should make this video with AI.
palata25 days ago
It started with abusing the term "AI"; I don't see a reason why they would not abuse "open source" too. I guess it's what happens to language when a concept becomes mainstream: people use it wrong, but if enough people do it, it becomes the new meaning?
But I agree, it's a real shame.
beeflet25 days ago
I dislike when people like RMS get semantic and gatekeep words like "free software", but this is the end product of a world without gatekeepers. People just use words in a way that's convenient to them.
fragmede25 days ago
Or maybe some gatekeeping is actually good, and we just have to use more of our braincells and figure out whether a particular gatekeeper is good or bad. It's a good thing that being a pilot for an airline is gatekept to qualified pilots who know how to fly a plane. It's a bad thing that I need a hair cutting license to buy hair dye to dye my own hair at home.
beeflet25 days ago
You have to be careful with that, start giving out hair cutting licenses with reckless abandon a ton of innocent people could dye.
zimpenfish25 days ago
> People just use words in a way that's convenient to them.
Literally how language has always worked and evolved, though.
iszomer25 days ago
We often see semantic drift over a long period of time. It's just that the overarching topic of AI is moving significantly faster than what we would normally observe in other fields.
mplanchard25 days ago
Language has always been a push and pull between evolving (descriptivist) and correct (prescriptivist) usage. Neither side is going anywhere.
zimpenfish25 days ago
> correct (prescriptivist) usage
Oof, I know there's a bunch of linguists and grammarians who are going to mock you for that bracket.
mplanchard25 days ago
Why? Prescriptivist/prescriptivism is afaik the usual term. Proscriptivist is the other, but quite rare, so rare that my phone dictionary says it’s a misspelling.
zimpenfish25 days ago
> Prescriptivist/prescriptivism is afaik the usual term.
It is but it was the "correct" part attached to prescriptivism they'd be mocking because that is not how linguists and grammarians work (they are descriptivists and fond of making fun of prescriptivists.)
mplanchard25 days ago
Oh yeah lol I should have put it in quotes, but by the time I thought about it I was past the edit window! C'est la vie, I'll take the mockery.
a-dub25 days ago
this is correct. "open source" means everything required to recreate from scratch and improve. not "here's a massive binary, an interpreter script and permission."
userbinator25 days ago
How can you even "open source" an AI model without all of the, presumably copyrighted and extremely voluminous, training data?
beeflet25 days ago
That could probably be solved with bit-torrent. I think the bigger obstacle is the hardware required for training. Maybe it would be possible for groups of people to reproduce/train open source models with a distributed BOINC-like system?
thephyber25 days ago
You would open source the procedure and reference where the data came from. If there is any non-open source content used in training, then the project couldn’t qualify as “open source”.
But this thread is about misuse of the term as applied to the weights package. Those of us who know what open source means should not continue to dilute the term by calling these LLMs by that term.
mupuff123425 days ago
You don't need the data itself, but at least a reference to what was used, basically provide the entire blueprint to recreate it.
It's just like even for a true open source software you still need to bring your own hardware to run it on.
simondotau25 days ago
You can't. But that's not an excuse to misuse the label.
a-dub25 days ago
that's how you know when you actually have agi, when you have something that you don't have to shovel in every written word known to man to make it work, but rather can seed it with a few dense public domain knowledge compendia and have it derive everything else for itself from those first principles- possibly going through several stages of from scratch training and regeneration.
int_19h25 days ago
The reason why you need to shovel every written word known to man to make it work is because it needs to learn what words mean before it can do anything useful with them, and we don't currently know any better way of making a tabula rasa (like a blank NN) do that. Our own brains are hardwired for language acquisition by evolution, so we can few-shot it when learning and get there much faster; and if we understood how it works, we could start with something similarly hardwired and do exactly what you said.
But we don't actually know all that much about how language really works, for all the resources we spend on linguistics - as the old IBM joke about AI goes, "quality of the product increases every time we fire a linguist" (which is to say, we consistently get better results by throwing "every written word known to man" at a blank model than we do by trying to construct things from our understanding).
All that said, just because we're taking a different, and quite possibly slower / less compute-efficient route, doesn't mean that we can't get to AGI in this way.
dragonwriter25 days ago
> Our own brains are hardwired for language acquisition by evolution, so we can few-shot it when learning and get there much faster
No, we can't few-shot it and we don't get there faster (but we develop a lot of other capabilities on the way). We train on a lot more data; the human brain, unlike an LLM, trains on all the data it processes for "inference", and it receives sensory data estimated on the order of a billion bits per second, which means by the time we start using language we've trained on a lot of data (the 15 trillion tokens from a ~17-bit token vocabulary that Llama 3 was trained on are something like the size of a few days of human sense data). Humans are just trained on and process vastly richer multimodal data instead of text streams.
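The back-of-envelope comparison above can be checked directly. All the inputs are the commenter's estimates (token count, bits per token, sensory bandwidth), not measurements:

```python
# Rough check: ~15 trillion training tokens at ~17 bits each, versus
# human sensory input at an estimated ~1 billion bits per second.
tokens = 15e12              # Llama 3 training tokens (commenter's figure)
bits_per_token = 17         # ~log2 of a ~128k vocabulary (commenter's figure)
sense_bits_per_sec = 1e9    # estimated sensory bandwidth (commenter's figure)

training_bits = tokens * bits_per_token
seconds = training_bits / sense_bits_per_sec
days = seconds / 86400
print(days)  # ≈ 2.95, i.e. "a few days of human sense data"
```

So under those assumptions the entire Llama 3 text corpus is roughly three days' worth of raw human sensory input, which is the comparison the comment is making.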
int_19h25 days ago
I was talking about language acquisition specifically. Most of the data that you reference is visual input and other body sensations that aren't directly related to that. OTOH humans don't take all that much text to learn to read and write.
dragonwriter25 days ago
> I was talking about language acquisition specifically.
Yeah, humans don't acquire language separately from other experience.
> Most of the data that you reference is visual input and other body sensations that aren't directly related to that.
Visual input and other body sensations are not unrelated to language acquisition.
> OTOH humans don't take all that much text to learn to read and write.
That generally occurs well after they have acquired both language and recognizing and using symbolic visual communication, and they usually have considerable other input in learning how to read and write besides text they are presented with (e.g., someone else reading words out loud to them.)
ncr10025 days ago
Feeling my inner Klingon, "Where is the honor in releasing a binary blob and calling it .. open source. Pfah!"
llm_trw25 days ago
Linux doesn't ship a compiler or CPU when you download it. It's not open source I guess.
guappa25 days ago
I'm guilty of this. I didn't publicly shame a coworker who installed fb's model and said "it's open source" just for the sake of peace.
blackeyeblitzar25 days ago
Most of these claimed “open” models are not open source. Some of them are open weights. But even some of the ones that share weights are not really open - they force a restricted license on you. To be open source I think they need to share training data and training code under an OSI approved license.
AI2 has a model called OLMo that is actually open source. They share the training data, training source code, and many other things:
https://allenai.org/blog/olmo2
They also released an app recently, to do local inference on your phone with a small truly open source model:
torginus25 days ago
While I do agree with your point - I wonder what information companies could release that'd be immediately useful to you.
It's not like they understand what the weights mean either, and if they released the code and dataset used to create it, you probably couldn't recreate it, owing to the fact that you don't own tens of thousands of GPUs.
If a software's source is released without all the documentation, commit history, bug tracker data etc., it's still considered open source, yet you couldn't recreate it without that information.
liampulles25 days ago
Thank you for pointing this out, I was not thinking clearly about this
raverbashing25 days ago
Better a downloadable blob than a non-downloadable one
frabcus25 days ago
No, it's not, as it means nobody is pushing for actually open models.
A truly open model has open code that gathers pre-training data, open pre-training data, open RLHF data, open RLAIF data generated from its open constitution and so on.
The binary blob is the last thing I'd want - as a heavy user of LLMs I'm actually more interested in the detail of what all training data is in full, than I am the binary blob.
tuananh25 days ago
IBM Granite is actually open https://www.ibm.com/granite
ks204825 days ago
Here's a real open one: https://allenai.org/olmo
__m25 days ago
at least for the pre-training data there are some open source torrent clients [0].
aiono25 days ago
Parent doesn't argue about that. How is this relevant?
nicce25 days ago
This is the problem - we accept this approach and then they don't have to make any effort to actually publish them through open means.
kortilla25 days ago
Cool, that’s not open source though.
That’s like a chef giving you chicken instead of beef and calling it vegetarian.
ineedasername25 days ago
I’d say it’s more like eating Chicken Cordon Bleu and then asking the chef for a recipe, who replies, “Certainly! Step 1) Acquire Chicken Cordon Bleu, preferably cooked. Step 2) If uncooked, cook. Otherwise, consume.”
aerzen25 days ago
it's open weights
dmos6225 days ago
Weights-available. You wouldn't say open-binary.
ks204825 days ago
It is? Do you have a link?
mmoustafa25 days ago
it's closed source and open outcome
silisili25 days ago
So is asking ChatGPT to write your application, then open sourcing said application IMO.
I see both sides here, but I don't think it's a hill worth dying on. The 'open source' part in this case is just not currently easily modifiable. That may not always be the case.
TheDong25 days ago
This is still to be determined, based on whether the output of ChatGPT is copyrightable by ChatGPT, copyrightable by the requester, or something else.
I think the two plausible answers are:
1. The person prompting (for example telling chatgpt 'please produce a fizzbuzz program') owns the copyright. The creativity lies in the prompt, and the chatgpt transformation is not transformative or meaningful.
2. The output of ChatGPT is derivative of the training data, and so the copyright is owned by all of the copyright holders of the input training data, i.e. everyone, and it's a glowing radioactive bomb of code in terms of copyright that cannot be used or licensed meaningfully in open source terms.
There are existing things like 1, where for example if someone takes a picture, and then uses photoshop to edit it, possibly with the "AI erase" tool thingy, they still own the photo's copyright. Photoshop transformed their prompt (a photo), but adobe doesn't get any copyright, nor do any of the test files adobe used to create their AI tool.
I don't think AI is like that, but it hasn't gone to court as far as I know, so no one really knows.
llm_trw25 days ago
An LLM isn't software any more than a matrix is.
What do you think an open source matrix should look like?
entropi25 days ago
A compiled executable is not any less software than the source code. But the point of open source code is not the ability to see the CPU instructions though, is it?
It's about reproducibility and modifiability. Compiled executables (and their licences) lack that. The same goes for these downloadable blobs.
llm_trw25 days ago
You make the start of a good point, but miss most of it.
You can absolutely have open source machine code.
The issue is and always has been that you need to have access to the same level of abstraction as the people writing the source code. The GPL specifically bans transpilers as a way to get around this.
In ML there is _no_ level of abstraction other than the raw weights. Everything else is support machinery, no different from a compiler, an OS, or a physical computer to run the code on.
Linux isn't closed source because they don't ship a C compiler with their code. Why should llama models be any different?
fragmede25 days ago
where did those weights come from?
llm_trw25 days ago
An algorithm with no idea of what abstraction even means.
stonogo25 days ago
Is this question in good faith? The way generated code and data should be open sourced is by releasing the tools and configuration used to generate it. There's never been much confusion around this, to my knowledge.
I'm not even necessarily advocating that these things should be released, but the term "open source" has a pretty well-understood meaning that is being equivocated here.
angusturner25 days ago
Credit to the engineers that built this, but it fills me with rage that Elon has this sort of unchecked power.
How long before this starts getting deployed in safety critical applications or government decision making processes?
With no oversight because Elon seems to have the power to dismiss the people responsible for investigating him.
Anyone not scared by this concentration of power needs to pick up a book.
ijustlovemath25 days ago
What's remarkable to me about criticism like this is how quickly it's rebutted by people claiming "where did they say they would do this," as if these people don't make incredibly rushed and poorly planned decisions all the time. It's like an idea immune system that rejects any criticism or self reflection. It would be sociologically fascinating if it wasn't being combined with a dereliction of congressional power and an unchecked executive.
EGreg25 days ago
I have literally been posting the same thing for years about the need for open source alternatives to social platforms where one person controls the algorithm that prioritizes what a billion people see. And the response is “meh”. No one even bothers to read past the first paragraph:
https://news.ycombinator.com/item?id=43036350
But if you really want to see the “immune system” shine, mention web3 and smart contracts, and watch the downvotes pour in. Any time one even mentions “decentralized byzantine fault tolerant” anything, an army rises up to repeat anodyne versions of “grift… no one needs it… banks are great…” etc.
https://news.ycombinator.com/item?id=43073421
But if you mention any concerns with AI, no matter who or what you cite, the same group goes the other way and always repeats “(insert problem here) has always been possible, there is nothing to see here, move on, AI is amazing, deregulate and let the industry develop faster”:
https://news.ycombinator.com/item?id=40900155
It’s groupthink at its most obvious, repeatable, always on, and I wonder how much of it is organic.
pjc5025 days ago
Having been on the internet for a very long time, I can answer why open source alternatives to social platforms seldom get off the ground: the network effect is huge, and the community of users matters far more than any of the technology.
Don't bother telling people how it works. Show them who's using it and for what.
Oh, and for any kind of "normie" use it must have a decent moderation and anti-abuse system. Which inevitably clashes hard with "decentralized". Bluesky is succeeding because it lives in a contradiction of pretending to be decentralized, but what it really offers is the "pre Elon Twitter" experience. To basically the same people.
> the same group
While there's a certain amount of hivemind, it's rare that you see people directly contradict their own posts here; what you're seeing is different people.
eightysixfour25 days ago
Or, people have different opinions about who should have power over social media, banking, and AI, for completely rational reasons…
sanity25 days ago
This is the problem we're working on with https://freenet.org/ - a general purpose platform for building entirely decentralized services.
Our thesis is that the client-server architecture is a fundamental flaw in the world wide web's design, which inherently concentrates power in the hands of a few. Freenet aims to be a general purpose replacement for this in which all services are entirely decentralized.
The first non-trivial app we're building will be a group chat system called River[1].
EGreg25 days ago
I like the new Freenet! I interviewed your founder, Ian Clarke, 2 years ago on my channel — discussing the original freenet, probably the first truly decentralized content network in the world. Here is the 2-hour discussion:
https://www.youtube.com/watch?v=JWrRqUkJpMQ&t=12m0s
Look around the 12 minute mark, where I start to discuss how “the capitalist system” produces centralized monopolies that extract rents for their shareholders.
pjc5025 days ago
Freenet is 25 years old. It never took off, what makes people think it will take off now?
sanity25 days ago
The original Freenet had over 6m downloads over the years - and pioneered ideas like cryptographic contracts which later formed the basis for bitcoin, but it was always a very experimental project, while the new Freenet is designed for mass adoption.
The key differences between the old and new Freenet are:
Functionality: The previous version was analogous to a decentralized hard drive, while the current version is analogous to a full decentralized computer.
Real-time Interaction: The current version allows users to subscribe to data and be notified immediately if it changes. This is essential for systems like instant messaging or group chat.
Programming Language: Unlike the previous version, which was developed in Java, the current Freenet is implemented in Rust. This allows for better efficiency and integration into a wide variety of platforms (Windows, macOS, Android, etc.).
Transparency: The current version is a drop-in replacement for the world wide web and is just as easy to use.
Anonymity: While the previous version was designed with a focus on anonymity, the current version does not offer built-in anonymity but allows for a choice of anonymizing systems to be layered on top.
EGreg24 days ago
Can you drop me an email? Would like to have a conversation about our respective roadmaps and helping each other
sanity24 days ago
Will do.
echelon25 days ago
> But if you really want to see the “immune system” shine, mention web3 and smart contracts, and watch the downvotes pour in
I'm all for distributed / P2P social media, but crypto is full of some of the most scammy and downright shameful behavior I've ever seen in my life. Pump and dumps, rug pulls, money laundering. There is a real reason people hate crypto.
To top it off, crypto is one of the least meritocratic things there is. The longer you've been in it, the more people you've scammed, the more you hype, the "wealthier" you are.
Crypto smells like a shit and vomit sandwich and people immediately turn their noses.
Build P2P social without the crypto angle and you have my attention. I've been wanting p2p (not federated) social media since the 200Xs and the decline of the indie web. Social and news should work like email and BitTorrent, not Facebook or "federated Twitter".
swalsh25 days ago
> I'm all for distributed / P2P social media, but crypto is full of some of the most scammy and downright shameful behavior I've ever seen in my life. Pump and dumps, rug pulls, money laundering.
The SEC's answer-no-questions, sue-first approach to crypto in general made legitimate players afraid to operate, so the space became dominated by those that didn't care about the law.
pjc5025 days ago
> The SEC's answer no questions, sue first
This isn't true, and last time someone tried to prove it was, they cited... a huge PDF of all the questions the SEC had been asking crypto firms prior to action.
Besides, the rules are over now. The US President ran a pump and dump. Can't get more legitimacy than that.
EGreg25 days ago
Many on HN don’t believe there ARE any legitimate uses of crypto in the first place.
Here are some:
https://intercoin.org/applications
But most comments I get are “I stopped reading 2 seconds in when I saw the word Web5.”
(We started using it after Jack, the founder of Twitter who also started Bluesky and promoted nostr, started using it.)
Here is a graphical presentation that can drive it home:
svara25 days ago
I looked at your links and I still don't get it. I do want to understand. Where is the problem stated, clearly and concisely? What is the solution and why does it require crypto?
I say that as someone who read the Bitcoin paper in 2012 and was pretty excited back then.
Meanwhile online scams are a bigger industry than the illegal drug trade and bigger than the Australian economy. There are thousands of modern day slaves in call centers in Myanmar and the Philippines with fake social media profiles running pig butchering scams. That industry runs on crypto 100%. I guess that's one "problem" crypto solved.
You need some pretty convincing arguments at this point to convince me (and many others) that getting rid of this stuff wouldn't be a big win for humanity.
EGreg24 days ago
The problem is relative to WHOM.
Here is the problem statement and solution for community leaders, the same class of decision makers who exited “AOL Keyword NYTimes” in favor of “nytimes.com” on this newfangled protocol called HTTP, with its servers and clients called browsers that people were downloading:
intercoin.org/currencies.pdf
chefandy24 days ago
When they asked for a clear and concise description of your problem and solution, they are probably looking for a problem statement: a focused, 1 or 2 sentence explanation of the problem you intend to solve. You then present your proposed solution in the same form.
Hypothetical example problem statement: We want to promote ycombinator to everyone that could benefit, but banner ads make us look chintzy, directly engaging in the feral discourse on Slashdot would inevitably look unprofessional, and engaging directly through dozens of purpose-built blogs and websites is too onerous.
Hypothetical example solution statement: We should create our own simple, well-designed news site built on user submissions, and include threaded discussion capability with moderation built in at both the community and company level to keep things relatively civil. Then our audience will come looking for us.
What you offered is not a problem statement. It is a sales deck offering a, frankly, convoluted explanation of how starting a currency will solve a largely unrelated problem backed up by an unsupported assertion about the least representative sample in the world— Donald Trump.
EGreg24 days ago
Or, you could actually read the deck and it explains the problem.
At this point, I think this is just performative
svara24 days ago
I read it all. It's apparently supposed to be a way for celebrities to extract money from their audience by having them buy into their currency.
If you're satisfied with calling that useful, okay, I guess - to me it's deeply alarming that this is presented as a good example of a useful application of crypto.
In the broader context of crypto demand being driven essentially by digital crime and gambling, there would need to be some seriously glowing example of something good that can be done with it to shift my judgment.
For example, in the early days of Ethereum, I thought it'd be possible at some point to build truly open source, decentralized SaaS, where the deployment happens to the blockchain, and that this in turn would enable open source projects to finance themselves.
I've yet to see an example of this where the crypto aspect isn't a figleaf.
I'm very concerned that people arguing for exciting applications of crypto are involuntarily legitimizing the online crime ecosystem. Crypto in practice seems to lead to a massive transfer of assets to criminals. To an extent where that may end up destabilizing whole countries, given the market cap trajectory.
chefandy24 days ago
It doesn’t explain anything. It asserts a lot. Sorry I took the time to critique and give examples as a freelance business communication designer. Effective business communication requires frank feedback, and mine usually isn’t cheap, but if protecting your ego is the goal here, just keep assuming you’re doing everything right and it’s everybody else’s fault it’s not landing.
talldayo25 days ago
> We started using it when Jack who founded Twitter, started bluesky, promoted nostr started using it
Jack Dorsey is certifiably insane. His obsession with cryptocurrency is a warning to anyone that throws away success to live as a crypto maxi. You will lose the only things that matter to you in life, your business will be taken away from you by shareholders if you own one. Your control will be hated by users that accuse you of trying to ruin the internet with NFT profile pictures and crypto tickers. Many users outright left as a consequence, others would leave after the takeover. But Dorsey set the stage for the enshittification of Twitter, and anyone that's forgotten that should face the music.
Web5, no matter who utters it, is a phrase that means nothing. A person walking on the street would not be able to define it for you. Me, a programmer, cannot define it for you or even explain what it looks like. It is a marketing term as applied to Free Software, which will alienate Free Software users and disgust/confuse common people. If you cannot find a better phrase to describe your philosophy then people will forever associate you with the myriad grifters that shared your "Web(n)" branding.
EGreg24 days ago
I defined it very clearly
Web2 (community) +
Web3 (blockchain)
We need to combine the two. Web3 by itself is lame, Web2 by itself is blind.
EGreg25 days ago
I have been building it, in fact.
Do I have your attention now?
Ten years and $1 million later, it’s free to use, but we haven’t started promoting it yet; still testing with paying clients:
Here are some ideas:
echelon25 days ago
This is interesting, but it feels too platformy for my use. I'd really like to see something 100% like BitTorrent.
Instead of trying to build a "you.com" (as in your pdf example), I want a place where we're all just a simple disposable signed hash address (that you can change, make public, keep pseudonymous, etc.) - easy and disposable if needed, but also possible to use as the building block of an online presence or brand if your hash becomes well known. Kind of like email, in that sense.
The platform doesn't need real time streaming video or video calls. Just text and images to start. P2P Reddit or Twitter.
It shouldn't be about building a platform where you attract others to your brand. That can come later. It should be about participating in a swarm and building up a mass of people. An exchange of ideas and information, but where it feels like being in public. Like BitTorrent. Once network effects kick in, you can find the nodes (people, topics, etc.) you care about and want to prioritize in your interest graph.
namuol25 days ago
> But if you really want to see the “immune system” shine, mention web3 and smart contracts, and watch the downvotes pour in.
Yeah that sounds like a feature, not a bug.
swalsh25 days ago
It's remarkable to me how "Web3 is a grift" has seemingly become tribal consensus here, without any real basis in reality. I think the last administration's explicit efforts to block crypto legitimization played a big part in this. It's clear that if you tried to follow the law and operate as a legitimate player, you risked being debanked or legally targeted by the SEC—and they made little to no effort to answer questions or help you work within the law's constraints. They wanted to sue first. As a result, those who ignored the law ended up dominating the space. This reflects policy failures, not issues with the tech or its legitimacy. I'm hoping the Trump administration shifts this dynamic, but now there's a reputation problem that needs correcting as well.
lelandbatey25 days ago
To quote Patio11, "It's not a conspiracy if they really are out to get you."
Crypto in general and Web3 as well, all have mostly delivered scams. To the tune of billions stolen from everyday folks. Everything (to within a rounding error) that hasn't been a scam has delivered nothing else but being a speculative asset at best. Everything else has been a barely working toy that's better served by non-distributed implementations of the same thing.
People shit on crypto. Government, regulators, and the public all dislike crypto because the only thing that ever happens to us with it, and the only thing we ever hear about happening, is folks losing money to scams.
There's no mystery here. Crypto doesn't need a policy shift. Crypto needs to stop fucking over folks. Yes it's cool technology, yes it also seems to just be a way to part folks from their money.
EGreg25 days ago
That's like saying the only thing that ever happens with AI is people losing their jobs to AI. And unlike Crypto, they didn't opt in and literally buy digital assets and send them voluntarily somewhere. They get negatively affected regardless of any choice they have made. "Get on board, or get rolled." People worldwide would lose a lot more money to AI growing than crypto growing, regardless of never opting in. It will just be a giant wealth transfer to the already-wealthy and corporations. What about that? Oh, crickets. Dismissal from the HN crowd.
If I am going to put my money at risk, I expect it to be at risk. I'm happy to have a regulatory framework around that from the SEC, for instance, and there is one. For example, since the JOBS Act, the SEC has greatly expanded the opportunities to raise money in a regulated way. I even interviewed the actual authors of Regulation S at the SEC, where I go into depth for an hour about how to raise money legally:
https://www.youtube.com/watch?v=ocrqgkJn4m0
FinCEN has also been putting out guidance to the crypto industry since 2013:
2013: https://www.fincen.gov/statutes_regs/guidance/pdf/FIN-2013-G...
2019: https://www.fincen.gov/sites/default/files/2019-05/FinCEN%20...
So the regulations are there.
And frankly, most true adherents of crypto have been yelling from the rooftops that Celsius and FTX and Binance are not actual DeFi. They are not decentralized, they simply tell you to do the very thing crypto was designed to avoid -- i.e. send them money and "trust them". This is the very thing Bitcoin and Blockchain were designed to avoid -- the middleman.
FileCoin and UniSwap and Aave Marketplace and so on are real crypto, and they have never had any scandals, and billions of dollars' worth of assets are entrusted to them every day. Ditto for most altcoins and networks, including Hedera Hashgraph, Polygon, COSMOS, Polkadot, etc.
Any shade thrown at, e.g. Telegram's TON or Ripple's XRP, is due to regulators. I can understand why Facebook's Libra was shut down. But it has to do with them becoming "too powerful" and "not subject to national oversight". Kind of like Facebook and Twitter and Google themselves.
lelandbatey25 days ago
Everything that you just mentioned, as far as "what it's actually doing" is either speculation/speculation accessories or is a not-as-good-version of existing offerings. Where is the value?
UniSwap: a marketplace for speculation on arguably scam crypto products.
Aave Marketplace: a marketplace for speculation on arguably scam crypto products.
FileCoin: file storage at rates 50% higher than e.g. BackBlaze/DigitalOcean.
There's no actual value here other than as scam, speculation (nearly a scam), or products that are flimsy pretenses at not being scams (but which don't deliver a lot of value). Why should anybody care (other than transparent greed)?
soheil25 days ago
You do realize you and your OP are currently top comments in your respective threads, both criticizing Elon and even preemptively criticizing your imaginary critics?
[deleted]25 days ago
InTheArena25 days ago
Elon's unchecked power at building a model? Or at politics?
I always worry whenever I see people telling me how to feel - rage in this case. We are in a political system that is oriented more around getting people to feel rage and hatred than around consensus and deliberation. Elon is the face of that, but it's a much longer and larger problem. Throw in the complete dismissal of anyone not scared of this as ignorant, and it shuts down discussion.
The problem I have with Elon is that he is wasting a once-in-a-lifetime chance to actually address and fix systematic problems with the US government. Deploying LLMs in the government space doesn't fill me with dread. Continuing the senseless partisan drive of the last 20 years does.
swalsh25 days ago
> Continuing the senseless partisan drive of the 20 years does.
I think what the government is going through right now is wrapping up the last political system. The idea that Democrats and Republicans just need to learn how to work together is just wrong. The parties are being destroyed, and I think we should all cheer that. They were built to address the issues of the 20th century, and neither party in its current form is ready to address 21st century issues. I think AI, climate change, and demographic changes around the world (ie: low birthrates) are going to seriously alter everything about our world, from geopolitics to the economy, even social issues.
The Democrats are stuck supporting the New Deal bureaucracy and the post-WW2 order. That's over; it's crumbling right now, and I'm not going to try and defend any of it personally. It's just obsolete. The old Republican party your dad probably supported is dead too; that died a while ago. The new Republican party seems to be an alliance of people who just really want to cheer the crumbling of the old system (MAGA) and the first emergence of what politics in the 21st century is going to look like (the tech alliance).
Democrats would be smart to understand it's a new century; we have new threats, new challenges, and need new institutions.... and this IS NOT a once-in-a-lifetime opportunity to fix our government. This is the first draft of our new political system, and they have a choice to participate in shaping it, but they will need to get votes, and to get votes they need to stop talking about obsolete ideas.
jrussino25 days ago
>The democrats are stuck in supporting the new deal bureaucracy and the post ww2 order
> The new Republican party seems to be an alliance of people who just really want to cheer the crumbling of the old system
I agree, and I think this is a bizarre flipping of the "Democrat ~= progressive / Republican ~= conservative" dynamic that has been largely assumed throughout my lifetime.
We need both conservative and progressive forces in our society. Someone needs to be saying "here's what's wrong with our system; here's what needs to change", and someone else needs to balance that with "here's what we're doing right; here are the parts that are working well and that we should not get rid of".
It seems to me that now, instead of that tug-of-war discussion happening between the two parties, it is happening in parallel within them. Unfortunately, the sane and responsible version of that discussion is happening entirely within the boundary of the Democratic coalition, in a way that is completely ineffectual because (a) the internal conservative moderating force is relatively strong at a moment when the populace seems to want more progressive action, and (b) they have so little ability to effectively wield political power.
Meanwhile, the Republicans are dominated by a bizarro "progressive" faction that wants to pull us all in an entirely different (IMHO regressive) direction. And that faction is completely unchecked by any internal "conservative" moderating force within its own party, so it is for the moment able to push us as hard and fast as possible in its preferred direction.
swalsh24 days ago
> It seems to me that now, instead of that tug-of-war discussion happening between the two parties,
I'm REALLY looking forward to 2028, because I think that will potentially be the first election where we start to see what modern politics will look like. I wouldn't be surprised if there are multiple new parties, and several of them have a real chance. If it seems one-sided right now, it's just because one side found their way to the start line first... but make no mistake, history shows that over time new political factions will form that offer resistance to bad ideas, and clear a path for the good ones.
Given the rate of change with AI, we're going to have a real idea of what a world being disrupted by AGI (whether that is true AGI, or something close to it) looks like. At the same time healthcare is only getting worse, and Trump is NOT going to fundamentally address it. China is going to be rising, and they're a real geopolitical threat. The war in Ukraine has completely changed what warfare looks like, and we're going to have to completely restructure our military (just like we have to restructure our healthcare). I also wouldn't be surprised if Trump's war with the cartels turns out to be far harder than expected, because cheap autonomous drones allow a small military to compete against a large traditional one.
All of our prior assumptions about retirement are different too; retired boomers are not the same as the pensioners of their day. They're not impoverished; instead they're flush with cash. I'm not sure that in a world with an aging workforce you can afford to be anti-immigrant... and all these benefits we give to retirees may not make sense in a world where retirees are wealthier than the regular workforce supporting them.
The general theme for the next decade is going to be throw out all the old books, 80% of our prior assumptions no longer apply.
mrtesthah25 days ago
Is this new political system akin to a banana republic? Because that’s what happens when you replace nonpartisan workers with loyalists in order to eliminate all accountability and oversight. Turning the rule of law into a partisan issue is a recipe for endemic corruption.
And even if you think the rule of law is antiquated, you’re misanthropically cheering the destruction of the largest institution in the world that 330 million people depend on for survival.
OvbiousError25 days ago
Consolidating power in the hands of the few very rich is not something new, it's just the old come again.
bastardoperator25 days ago
I might cheer if the replacements weren't objectively worse in every measurable way.
spankalee25 days ago
> actually address and fix systematic problems with the US government
I wonder if you could even name what some of these critical problems are? Or have you just been told that there are problems that justify this chaos?
InTheArena25 days ago
I'm happy to, though the end of your statement strongly suggests that you are not acting in good faith by asking this question.
1) All positions have become partisan, with political ideology being critical to promotion to high-level positions.
2) Congress refuses to act as the constitution intends, and has delegated its budget-making authority to the executive branch.
3) The government-specific procurement system is almost as expensive as what is being procured.
4) Auditing the government is almost impossible.
5) The debt load on the government is becoming unsustainable.
6) The lack of "digital transformation" (what we called it in banking) means poor service.
7) The unfunded liabilities (mostly at a state level) will swamp budgets in a few years.
8) Most large contracts should be fixed contracts, not cost-plus contracts. Companies can bilk the government for things that are an order of magnitude cheaper in the outside world.
9) Medicare refuses to lower health care costs (by reducing rates) due to political pressure.
10) No rationalization of government spending or revenue has occurred since the post-World War 2 era.
mrtesthah25 days ago
1. Making all positions partisan is a fascist tactic to challenge objective truth.
2. Congress as a whole isn’t a single entity -- one party refuses to compromise in any way while the other plays by the rules.
3. Doesn’t matter. Cost reform needs to go through existing legal routes.
4. What constitutes “auditing” the government? Because we had plenty of non partisan positions overseeing and auditing all parts of the government. DOGE fired those people.
5. Again, go through the legal route.
6. A lack of “digital transformation” is the vaguest most unconvincing point in this entire justification.
7. These budget issues need to be decided on through constitutional processes and with oversight, as before.
8. Ditto.
9. Medicare can lower health costs by other means, such as being available universally to all and setting limits on what they pay to providers based on procedure.
10. Do you watch CSPAN?
dralley25 days ago
All of your points can be summed up as "Congress refuses to do their job".
Breaking all the laws to bypass the government does not "actually address and fix systematic problems with the US government", that is an absurd position. Caesar did not fix the Roman Republic.
And opposition to DOGE is not on the basis that people don't care about government efficiency. It's on the basis that the shit they're doing has nothing to do with government efficiency. There's not even a pretense of trying to calculate the "benefit" part of the cost-benefit equation with the cuts they are doing, they are just slashing and burning without any concern for outcomes as a power play and messaging tool. Elon is famous for doing this at Tesla and Twitter and all evidence points to it being incredibly harmful.
This isn't efficient! https://www.washingtonpost.com/dc-md-va/2025/02/15/return-to...
And not everything is about efficiency. Laying off veteran's crisis hotline workers or refusing to pay for the USAID employees you've just abandoned to be extracted (or in one case, medevac'd after a medical emergency) from the places they were sent to is just cruel (and again, illegal).
m202425 days ago
[dead]
Hasu25 days ago
> I always worry whenever I see people telling me how to feel - rage in this case.
No one told you to feel rage.
> Throw in the complete dismisal that anyone not scared of this is ignorant, shuts down discussion.
Weird, there are a lot of comments doing discussion in reply to the parent comment. It hasn't been shut down at all! You read those words and disagreed with them, and wrote your own words in response. You're doing the discussion you're claiming is being shut down! What are you even talking about?
Tycho25 days ago
But it is a partisan issue. All these people on fat NGO salaries, all these federal workers not pulling their weight, all the welfare abuse, all these aid payments - which party do you think is keen to keep the spigot flowing? Of course, it would be a shame if they didn’t audit the Pentagon as well, definitely massive graft happening there.
d0gsg0w00f25 days ago
Allegedly, the Pentagon sees the writing on the wall and is trying to get a head start on DOGE:
https://www.wsj.com/politics/national-security/doge-departme...
cpursley25 days ago
It’s just wild to me that an attempt to tally up what’s in the community grain store and where it’s allocated out to is even considered a partisan issue.
matwood25 days ago
The problem is the person doing the tallying is doing it behind closed doors, has routinely been shown to lie to further his interests and has already been caught lying with the tallies he's released.
The GOP controls both houses and the POTUS. They could absolutely do a top to bottom audit with full transparency and make cuts where needed. But that's not what this is about.
cpursley25 days ago
Is that totally true, though? Maybe they have pulled the wool over my eyes, but it seems like we've seen more transparency in the last few weeks than in the last 40 years.
Just poke around a bit: https://doge.gov/savings
And please even try to explain how this sort of thing is even remotely in America's best interest:
https://www.usaspending.gov/award/CONT_AWD_FA865018C7886_970...
> ACTIVE SOCIAL ENGINEERING DEFENSE (ASED) LARGE SCALE SOCIAL DECEPTION (LSD)
Then there's the basic accounting 101 things like improper categorization, 150 year old people getting social security, etc. Why should the US government be held to a lower standard than a publicly traded company?
Tycho24 days ago
This ASED and LSD -- aren’t they services to help the state counteract an information-warfare attack? Just guessing, but it sounds like a legitimate thing where they’d want the capacity to uncover/expose such activities, which I’m sure adversaries would attempt.
deng24 days ago
Yes, the contract was for researching defenses against deception, was first awarded under Trump and also on public record, visible for many years, not "revealed" by anyone, especially not those DOGE masterminds. But what's even the point now? I think we're past discussing any facts here, because OP has a "sniff test" instead (see answer below).
cpursley24 days ago
Sure, sure. Pentagon money going to the western press, USAID (a literal terrorist organization) funding both sides of the narrative, what could go wrong?
There was a time liberals screamed at the top of their lungs over this type of threat to democracy, now they embrace and endorse it because they’ve fully merged with the primacy neocons.
matwood24 days ago
> USAID (a literal terrorist organization)
Maybe get your news from somewhere other than Twitter.
cpursley24 days ago
Maybe you could recommend some western news sources that haven't been infected by USAIDS?
deng24 days ago
Yeah, thanks for proving my point. Have a nice day.
matwood25 days ago
Do you have a third-party audit showing that this is true, or have those datasets all been removed? Musk has shown himself unknowledgeable at best and purposely lying at worst, so many DOGE findings are hard to take at face value.
https://apnews.com/article/usaid-funding-trump-musk-misinfor...
https://www.forbes.com/sites/conormurray/2025/02/10/elon-mus...
cpursley25 days ago
Did you even read those articles? Full of BS excuses and justifications. None of them pass the sniff test by any honest person with above room temperature IQ.
People are just angry at Musk for turning their safe space into a free speech platform then switching sides. And that he’s now taking away their sides unlimited slush fund.
matwood24 days ago
You clearly don't want to read anything outside of Twitter/Musk, but another error, fixed/hidden in order to keep showing incorrect data that looks better for DOGE.
https://www.nytimes.com/2025/02/18/upshot/doge-contracts-mus...
> The DOGE website initially included a screenshot from the federal contracting database showing that the contract’s value was $8 million, even as the DOGE site listed $8 billion in savings. On Tuesday night, around the time this article was published, DOGE removed the screenshot that showed the mismatch, but continued to claim $8 billion in savings. It added a link to the original, outdated version of the contract worth $8 billion.
So much honesty and transparency out of this group.
cpursley24 days ago
Man people can’t stand that Elon turned Twitter into a free speech platform. Anyways, I'm more of a long-form article, book, podcast and travel guy when it comes to informing my opinion.
bloopernova25 days ago
I don't know any voters who want fraud to continue, but most do accept that fraud is just a part of any system designed and implemented by humans.
I personally would like to see the end of the "find gravy train, keep that gravy flowing at all costs" methodology of capitalism, because its primary focus is money instead of the service provided, whether it's Pentagon contractors, business subsidies, or the heinous Medicare and medical-insurance fraud. But I don't want to cut SNAP even if someone buys a goddamn Coke once in a while.
The current method seems to be brain surgery with a monkey wrench: slash and burn with little thought given to the effects upon humans, especially those who don't have other options. Kagi gave me a figure of between 9.2 and 15.2 percent of welfare being fraudulent. Yes, that's too high; yes, I'd like to fix that; but I want that change to be considered, studied, and planned with caution.
Tycho24 days ago
Tbh I think “move fast and break things” is what’s needed. The government bureaucracy has ossified over many years, and any attempt to change it gets bogged down in “committees” and “inquiries”. The only thing that will work is shock and awe, and if something important does get broken, it’s easy enough to fix when its criticality becomes evident.
KerrAvon25 days ago
Has it occurred to you that the people who feel rage fundamentally understand the situation, and you may be undereducated in this area? What do you think are the root causes of that “senseless partisan drive”?
I’d suggest starting with Rick Perlstein’s book “Nixonland” if you’re interested.
diggan25 days ago
> Has it occurred to you that the people who feel rage fundamentally understand the situation, and you may be undereducated in this area?
Regardless of how justified the rage is or not, being very emotional about things usually has one of two effects on people: A) people get taken aback by someone's strong emotions, or B) people get inspired/taken in by the emotion, even feeling that emotion more strongly themselves. Sometimes also C) they don't really care either way.
What probably isn't helpful is calling someone "undereducated" when they're clearly saying that they're person (A), just because they may or may not agree with you (although the parent didn't even clearly say they disagree, just that they're "taken aback" a bit).
Some people are calm regardless of what's going on around them, even if the world would be on fire, they'd try to describe what's going on around them with careful words and consideration. It isn't wrong or right, just like the people who feel rage and very emotional aren't wrong or right, it's just a showcase how we're different.
But we should aim to at least understand each other, not by trying to provoke, instigate or look down on others, but by asking questions to clarify and to better understand.
elorm25 days ago
You're doing the exact same thing he is addressing in that statement above. He's not belittling anyone's rage, he's speaking about people who incite others to feel the rage with them. Now let's turn your question around.
Has it occurred to you that the people who feel rage fundamentally misunderstand the situation and are completely undereducated in this area, and are only fuelled by sensationalism and Media manipulation? And then I suggest you go read Dirty Politics by Kathleen Hall Jamieson if you're interested, because that's what people who want to sound more intelligent than the other half of the conversation always do.
How does it help anyone?
concordDance25 days ago
Given the two of you probably have different models of reality, perhaps you two can try and figure out which is correct by seeing which model gives better predictions?
So try to come up with some sort of future observation that can be made where you think the other person's model would give a different answer to yours about what you would be able to objectively observe.
What do you reckon?
vlovich12325 days ago
Over what time scale, how do we agree on facts, and how do we evaluate things that require a common value system to determine whether the facts are good or bad?
concordDance25 days ago
The idea would be that the two of them collaboratively agree on some observable prediction they differ on. E.g. level of officially reported government spending in 4 years time or gdp growth rate next year or number of plane crashes next year or what have you.
Just some observable metric.
If they literally can't come up with a single observable predictive difference then the predictive aspects of their models are actually equivalent and they are only narratively different and don't "really disagree". Like Copenhagen interpretation vs many worlds.
vlovich12325 days ago
Many things don't have quantifiable metrics like that. For example, is the USA still a democracy in 4 years? Are people more or less free? You know, important questions that aren't just economic numbers. Even semi-quantifiable stuff like "are Americans better educated" is debatable on many topics if you can't agree on truth. Oh, and that GDP growth rate number? That relies on a lot of trust as to who's doing the reporting. For example, many people don't believe China's reported GDP numbers. What makes you think the USA won't devolve into such distrust as well?
concordDance24 days ago
If they affect your life they can be observed.
If "democracy" is just metaphysics then it's irrelevant. But if it has actual tangible effects such as "can you vote?", "can you protest the government?", "is the leader of the opposition arrested?", "do most people think they live in a democracy?", "how popular is new legislation compared to previous years?", etc...
Then you can make predictions about it and test them!
You can even do local predictions if both can agree, such as "will the combined incomes of my family be higher or lower in 4 years time?" as low coupling proxies for gdp. (Ideally one would use probabilities for loosely linked proxies like that and use the probability differences the two theories assign to give bits of evidence to one over the other, so you'd want many many such proxies, ideally uncorrelated ones)
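The evidence-bookkeeping described above can be sketched numerically. In this minimal illustration, each observed outcome contributes the log-base-2 ratio of the probabilities the two rival models assigned to it; all the probability numbers below are invented for the example, not taken from the thread:

```python
import math

def bits_of_evidence(p_a: float, p_b: float) -> float:
    """Log-base-2 likelihood ratio: bits favoring model A over model B
    for one observed outcome (negative values favor B)."""
    return math.log2(p_a / p_b)

# Hypothetical proxy observations that actually happened, as
# (P(outcome | model A), P(outcome | model B)) pairs.
observations = [
    (0.60, 0.40),  # moderate disagreement: ~0.58 bits for A
    (0.55, 0.50),  # nearly uninformative proxy: ~0.14 bits
    (0.70, 0.30),  # stronger disagreement: ~1.22 bits
]

# Independent observations let the bits simply add up.
total = sum(bits_of_evidence(a, b) for a, b in observations)
print(f"total evidence for model A: {total:.2f} bits")  # ~1.94 bits
```

This is why many weakly linked proxies are needed: a single noisy observation contributes only a fraction of a bit, and correlated proxies "screen each other off" so their bits cannot simply be added.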
vlovich12324 days ago
> can you vote? can you protest the government? do most people think they live in a democracy?
Was Jan 6 a protest of the government or an insurrection? Can Russians vote, or are elections a sham? Do the majority of Russians believe they live in a democracy if they’re afraid of who’s conducting the polling (or the MAGA non-response to polling)? Those are values questions that require you to have an agreement on reality.
> You can even do local predictions if both can agree, such as "will the combined incomes of my family be higher or lower in 4 years time?" as low coupling proxies for gdp
Your personal income has absolutely no predictive value on GDP. It’s more predictive of whether you personally made successful bets, or even of whether you’re better at sucking up to the current power structure. It tells you nothing about population-level metrics if you have no way of conducting reliable population-level surveys. For example, Donald Trump’s personal net worth skyrocketed under Biden because he won the election, even while, as leader of the opposition to the Democrats, he was looking at jail time; whether that was legitimate or not depends on which political lens you look through.
> If they affect your life they can be observed.
Ah, but if either side distrusts the other about whether the observation made is truthfully reported, how do you solve that? It requires some amount of trust and right now there’s a very clear divide there.
concordDance23 days ago
There are definitely tangible predictive differences in the case of, say, Russia vs USA. Things like "If you go to the capital with a bunch of friends carrying placards saying '$LEADER is corrupt and evil and should be replaced by $OPPOSITION' how many of you end up in a jail cell in the next day?".
If there is literally no tangible difference then it's just label games and metaphysics and doesn't matter.
> Your personal income has absolutely no predictive value on gdp.
It actually is correlated (admittedly in most day-to-day cases it's just a lagging indicator, but things like natural disasters hit both). It's not the strongest correlation but it would still be evidential. Definitely under 1.0 bits though... One would need a LOT of such observations and having them not screen each other off to start getting a convincing number of bits.
Probably not realistic to have humans manage these sorts of numerous tiny updates though...
/nitpicks
> Ah, but if either side distrusts the other about whether the observation made is truthfully reported, how do you solve that? It requires some amount of trust and right now there’s a very clear divide there.
Yeah, it gets much trickier like that. But I do think two reasonable people from the opposite political sides could agree on some sort of observable to the extent their disagreement is anything other than narrative.
vlovich12323 days ago
> Things like "If you go to the capital with a bunch of friends carrying placards saying '$LEADER is corrupt and evil and should be replaced by $OPPOSITION' how many of you end up in a jail cell in the next day?".
If the other side calls it a violent riot does it still count as people getting put in jail? Cause the Jan 6 insurrection and BLM protests occurred at about the same time and are viewed very differently depending on which political lens you put on.
> If there is literally no tangible difference then it's just label games and metaphysics and doesn't matter.
You’re discounting feelings as if they don’t matter. But if people believe or feel like they live in a dictatorship, what quantitative data are you going to use to disprove that? Moreover, why aren’t feelings valid when talking about politics, which is fundamentally an emotionally driven human activity and not a data-driven one? By the way, the left believes they live in an authoritarian dictatorship under Trump, while the right believes they lived in an authoritarian dictatorship under Biden. And political power literally is the power to emotionally manipulate others, because you individually can’t actually accomplish anything by yourself.
jtrn25 days ago
Has it occurred to you that nothing is more powerful for coming up with intellectual arguments than a strong driving emotion?
Yes, rage might be the appropriate response given the situation. But it’s often true that it starts with an emotion, and then people just argue from there, even while being wrong. Just look at all the people with contradictory opinions in history, both sides with strong emotional rage and equally certain of their convictions. Throw in the fact that people actually have a tendency to want to be angry.
normalaccess25 days ago
Rage is the fuel of the internet, but it’s fundamentally useless when it comes to seeking truth. Social media platforms are engineered to maximize engagement, and the most engaging emotion is anger. This isn’t accidental—outrage drives clicks, shares, and ad revenue. The internet has long been called a “hate machine,” and there’s plenty of truth to that.
This creates an environment where misinformation and emotional appeals spread faster than facts. When discussing complex, non-trivial topics, logic and reason are the only tools that can cut through the noise. But in a system designed to reward outrage, those tools are often drowned out.
I highly recommend Sam Vaknin's talk about Social Media toxicity.
Sources: Outrage is the most profitable emotion https://www.cityam.com/outrage-most-profitable-emotion-so-ad...
Sam Vaknin: The TRUE Toxicity of Social Media Revealed - Interview by Richard Grannon https://www.youtube.com/watch?v=o58mFU004hg
InTheArena25 days ago
As a historian (and a German historian in particular) - I've spent a reasonable amount of time educating myself on the nature of fascism and in particular the breakdown of democracies (Weimar, France, and also the erosion of civil liberties during the Great Depression in the United States).
I have also been a delegate to both the RNC and the DNC at a state level.
This is not an appeal to authority, but rather an honest response to your request for my education level.
IMHO, the root cause of the "senseless partisan drive" is the fact that the founding fathers could not come up with a way to restrict parties (they called them "interests") and left them unchecked. This is a constant "sin" of the American political system, and is a key reason Slavery survived as long as it did, why separate but equal became the law of the land, why America shot itself in the foot several times with the Banks of America, and why we are looking at the wrong side of history now.
The parties now act to destroy each other as their prime directive, rather than to better the country. I liken this to Weimar Germany, where the increasing radicalization of both the Nazis and the Communists led to political instability and eventual violence that destroyed the government. That erosion of democratic norms, as well as the "other side must be destroyed for us to survive" messaging, is the true threat, IMHO.
I would strongly suggest Richard Evans's three-part history of Nazi Germany to understand fascism. Don't worry, you can still hate and worry about Trump and think he is the next coming of Hitler afterwards - it will just be for better reasons.
_heimdall25 days ago
I'm not sure how LLMs/AI couldn't consolidate power. By design, it will move power from the individual to those running the AI systems.
llm_trw25 days ago
Because the difference between a model that costs 10 million to train and a model that costs 10 billion to train is 6 months.
Deepseek R1 is something that you can run in a garage on hardware that the average software engineer can buy with a month's salary, and when it came out last month it was better than _every_ other model.
cedws25 days ago
What about third world programmers? They can’t necessarily afford a $5000 GPU. If it weren’t for the “generosity” of tech companies like GitHub granting free LLM usage, they might be locked out entirely. This would put them at a disadvantage, we can argue to what degree but it’s still a disadvantage.
Depending on the curve we’re on, LLMs may grow more resource hungry while becoming closer to human performance in software engineering tasks. It’s not unimaginable this would concentrate productivity in the upper class of software engineers that can afford the hardware and/or licenses.
Fnoord25 days ago
Deepseek R1 performs well on a 600 EUR Jetson, and a 700 EUR AMD GPU. Both bought during COVID crisis. It is that quick. However, don't ask it about certain sensitive topics.
You can bet your ass Musk is using his AI tools as a propaganda tool for his advancement, just like he does with X. We've already seen Grok's prompt leak; it wasn't neutral.
mrtesthah25 days ago
This is exactly correct. Grok is already slandering specific journalistic outlets such as “The Information” that publish negative stories on Musk.
llm_trw25 days ago
$5000 monthly salary?
My heart goes out to all the oppressed programmers in the EU.
ben_w24 days ago
EU isn't "third world".
Kenya is, and at current exchange rates the range in this citation is $558-$1876/month: https://www.glassdoor.com/Salaries/kenya-software-engineer-s...
llm_trw24 days ago
With salaries like that it's not first world either.
If op is trying to catch up to frontier models locally on a budget 1/5th of what you can get in the West, then I can see why she would feel the way she does about AI.
ben_w24 days ago
What are you even talking about?
You said a month's salary; cedws said "third world" devs can't afford $5000 — and this is correct, third world devs can't afford that. cedws did not say EU, at least not here; they said third world. You brought up the EU, not them.
When you reply to me with "With salaries like that it's not first world either", what is "it"? The country I explicitly said was third world? Because that's a tautology.
_heimdall25 days ago
Is it your expectation that as models get cheaper we won't be developing much more powerful models at the higher price range?
It's worth noting that we already ran into the self-hosting-at-scale problem. People don't want to run a web server, and instead accept all the problems that come along with the convenience of social media. Why would LLMs, or any future AI product, be different?
mwigdahl25 days ago
No, it was not. It was not better than o1, nor o1-pro. Yes, it was _cheaper_ than those models, and superior in price/performance if the performance was acceptable. But in terms of raw performance it was behind them.
johnfn25 days ago
Chatbot arena leaderboard[1] says otherwise. 4o is ahead now, but that version of 4o was released after r1. r1 was ahead of all versions of 4o and o1 at the time of its release.
https://huggingface.co/spaces/lmarena-ai/chatbot-arena-leade...
mwigdahl24 days ago
Chatbot arena leaderboard is a good test for vibes and style of response, but not much else. R1's performance in objective benchmarks (coding, etc.) showed very good performance, granted, but inferior to the full o1 and o1-pro models.
It's still a very impressive feat, but it wasn't frontier-pushing.
tartoran25 days ago
I'm not sure people will just take it though. In the short term it looks like the situation is on a horrible course but eventually people will have had enough. I'm hoping it would take less time for that to happen and the damage will not be too great. Let's remember that we could use the same technology if not a better one to fight against all this.
xenodium25 days ago
> but eventually people will have had enough
As a Venezuelan, I thought so too. 25 years on...
bloopernova25 days ago
A lot of US citizens are living incredibly comfortable lives. If that is threatened, for instance by food shortages caused by lack of people willing to work on farms for very low wages, then protests may happen.
But the reality distortion field around the current administration is very powerful. Fox and CNN are owned by supporters of the Republicans, NYTimes and Washington Post don't appear to be reporting certain aspects of the government restructuring. Multiple social media sites are owned and run by people who support the current admin.
I am personally worried that we're going to see the gradual yet continual escalation of rhetoric, more actions that undermine rule of law, and continued lack of critical thinking in so many people. That path appears to lead to extremism.
I have a horrible feeling that whoever "wins" in a couple of decades or so will have no time to savour their utopia as the climate catastrophe really starts to bite hard.
Gothmog6925 days ago
So you are a leftist who believes the economy can't operate without basically third world slaves working in your fields?
_heimdall25 days ago
We enslaved the planet with industrialization and we are about to enslave AI, if we haven't already. Humans aren't on the losing end of that, but if your concern is with slavery itself that wouldn't matter.
_heimdall25 days ago
When a potential power imbalance is created, those willing to use it for selfish means will almost always win in the end.
Anyone with morals driving their use of a new tech will be limited, and unless those people massively outnumber the few selfish ones they will lose eventually.
tartoran25 days ago
Losing a battle is painful but no win is final until things fall into a balance and even that doesn't last forever or it even reverse. The worst case scenario is not going to matter for anyone in the grand scheme of things because there will not be anyone around.
ryandrake25 days ago
I'm not sure I understand how a hallucinating plagiarism machine that people mostly just use to write their term papers translates into "power" (presumably political power? I don't understand what kind of power we're talking about either).
psb21725 days ago
The ability to inject your preferred biases into the system that people use for finding or generating nearly all information they consume on a day-to-day basis is extremely powerful. Eg, if all "term papers" produced by this plagiarism machine are now 20% more favourable to the machine's owner than they would otherwise be, that can have significant, compounding long-term effects.
Of course, similar things could be said about controlling information flow through: social networks, newspapers, printed books, or whatever the town crier shouts in town square. But, each advancement in information dissemination tends to be power concentrating relative to the older tech, and I don't see any reason why this most recent advance won't follow that trend.
smeeger25 days ago
bingo. and everyone is fine with it as long as the consolidation is benefiting their own tribe.
throwaway65765625 days ago
Definitely not everyone. For many of us, checks and balances are a feature not a bug.
tartoran25 days ago
A spark can start a fire, it doesn't seem wise to ignore history. Things can always get out of control even for the ones up there on top of the pyramid of power.
cwalv25 days ago
That was fast. Perhaps we should imagine this has nothing to do with Musk to avoid completely derailing the conversation
SubiculumCode25 days ago
It cannot. Musk is probably the most powerful person in the world now and intent on remaking the world to his vision.
Fnoord25 days ago
Trump is the most powerful; he is immune and can commute sentences. If he wants to, he puts Musk in jail tomorrow, no question about that. His successor will be as powerful, whether that is his son or Vance is up in the air. It won't be an elected Democrat though.
dvngnt_25 days ago
> It won't be an elected Democrat though.
the pendulum swings back and forth. i don't see that changing
throw1618033924 days ago
It's an open question if elections will even be held. If they are, demographic changes also make it much harder for the Democrats to win after 2030 (https://www.pbs.org/newshour/politics/democrats-future-crisi...).
matwood25 days ago
I, probably naively, hold out hope. But Trump/Musk dismantling the groups who work to keep elections fair and free is disturbing.
https://eu.usatoday.com/story/news/politics/2025/02/07/trump...
https://www.npr.org/2025/02/11/nx-s1-5293521/foreign-influen...
Fnoord25 days ago
There won't be a fair election anymore after 2024. Trump in his own words: "we'll have it fixed so good". Right now, the USA is in a constitutional crisis, at the very least.
There's a fantastic website here, following the status of Project 2025 [1], with references. Trump is following that document to the T.
brabel25 days ago
> If he wants to, he puts Musk in jail tomorrow,
Has America already become an authoritarian state where this sort of thing really happens?? I don't know, I haven't seen that sort of thing happen yet.
cheema3325 days ago
> Has America already become an authoritarian state where this sort of thing really happens?
The conservative Supreme Court recently ruled that the president has essentially unlimited power. During his campaign, Trump did promise that he would be a dictator for a day. He appears to be overdoing it.
pcthrowaway25 days ago
You might be surprised to learn they could change their ruling if the court justices were to be incentivized differently, and that Musk has a lot of influence.
ben_w24 days ago
Musk indeed has a lot of influence. Trump is a narcissist. Not saying this will happen, but it's definitely not impossible that Trump just orders Musk shot (as per opinion of what's now allowed in the dissenting opinion of Supreme Court Justice Sonia Sotomayor) — if such an order is followed, I wouldn't want to guess, but death has a way of significantly altering someone's influence.
pcthrowaway25 days ago
There are different kinds of power, and I'm honestly not sure Trump can get Musk thrown in jail. On what basis would Trump even do that?
Similarly, Musk can potentially launch a campaign to sway the public to move for Trump to be impeached due to his felony convictions.
A battle between the two might be the shakeup the current empire needs.
Trump may be more powerful than Musk by some metrics, on a time-limited basis (unless he manages to change the term limits), but Musk is more powerful in many ways as well. Musk's wealth is greater than that of many entire countries.
actionfromafar24 days ago
Basis? Since when is Trump looking for a basis? Musk could fall out a window. Russia/US relations are being normalized; maybe it could be a small gesture of goodwill to help a friend out, if one wanted to keep the regular chain of command clean.
srid25 days ago
I do not share your fear & anxiety. What concrete danger do you imagine will happen as a result of xAI? Try to be as concrete as possible.
Also, dang, is there anything we can do to keep the comments on this submission tech-focused? Perhaps the Elon-bashing political digression can be split into its own thread?
Arubis25 days ago
Tech is the most powerful force in the modern economy. There is no making it not political. "Tech isn't political" _is a political statement_.
digdugdirk25 days ago
The concrete danger isn't necessarily with xAI (the product) but with Elon being the one who is in control of it. LLMs are an interesting technology, and we should absolutely be investing in pushing our understanding of the technology forward. We should absolutely not be relying on them for the ongoing functioning of our government: https://www.axios.com/2025/02/05/musk-doge-ai-government-eff...
Unfortunately, Elon has made himself a spectacle. To separate him and his intentions out from the technology itself would be a disservice to the discourse as a whole.
srid25 days ago
Hmm, I'm only too happy to rephrase my question, then.
What concrete danger do you imagine will happen as a result of xAI being controlled by Elon? Try to be as concrete as possible.
katbyte25 days ago
he will bias it toward his views and favoured outcomes like he did twitter
and those are pretty terrible, anti-science, and petty
cbm-vic-2025 days ago
The problem, as I see it, is that the results from AI systems will be used to make decisions even if those results are flawed. Or worse, those flawed results will be used to justify decisions that negatively impact people's lives.
This isn't something specific to xAI, but it turns out that the person who controls xAI also holds an unusually strong influence over the highest level government officials. These officials can use xAI as an excuse to implement harmful policy, "because the computer said this is the best course of action"- not unlike people who end up driving on train tracks or into large bodies of water because their GPS told them to go that way.
troyvit25 days ago
I think this comment sums it up well. As soon as advanced LLMs started making a splash we all saw the writing on the wall. AI will start taking on large chunks of cognitive load across industry, government, etc. that humans formerly held (It has already been a strong driver in finance).
I for one was ready to welcome my AI overlords once they were mature and tested. It was an inevitability. Because of the relationship between this oligarch and the government though it looks like the time line has accelerated and we're going to see misplaced trust in tools that aren't ready for what we're about to hand them.
wezdog125 days ago
I do not share your fear & anxiety so please dont share yours.
spiderfarmer25 days ago
How is this Elon-bashing? Address his concerns. Elon is not a king that shouldn't be questioned. Quite the contrary.
MrMan25 days ago
[dead]
conductr25 days ago
Or he starts using NSA data to train it, seems he has unchecked power to get into national systems and he made a point of saying this is more than the internet worth of knowledge...
finnthehuman25 days ago
> it fills me with rage that Elon has this sort of unchecked power
I can empathize, but I can't feel indignant about it. Not any more.
For years and years I've watched people warn about the centralization of power by tech companies. They were shut down left and right. I'm not accusing you of being one doing the shutting down. I'm just annoyed that Elon is what it takes for people to start realizing the people arguing the principle might have been onto something.
And I expect to see them start getting their "I told you so" in. Watching this play out, I'm personally inclined to join team "you made your bed, now sleep in it."
tartoran25 days ago
All it takes is some big fcukups and a political shift for these to be broken down to smithereens.
randomcarbloke25 days ago
I just wish it was someone technical or otherwise genuinely intelligent that did this.
LMYahooTFY25 days ago
He does not have the power to dismiss Judges and Congressional representatives.
Judges can only be removed by Congress.
Congressional representatives can only be removed by their peers.
dionian25 days ago
It's nothing in comparison to NBC/CBS/ABC/NYT/etc. But still a fair point
cynicalpeace25 days ago
> it fills me with rage that Elon has this sort of unchecked power.
The check on this is the market. Don't understand your point other than "Elon bad"
amazingamazing25 days ago
I’m not really following what this has to do with grok. It’s his company, no?
It’s also annoying that the top comment engages in no way with the content of the OP…
It must be truly infuriating to work hard to push a release, and you see it featured on your favorite orange website, only for the top comment to have nothing to do with what was worked on.
Here's a test - if this post was about Starship, the same comment could apply! Neuralink, the same thing! Boring Company, same thing! Wow, could it be that such a comment is really applicable to so many different companies or projects, or is it just a generic one? You decide.
gyanchawdhary25 days ago
The best comment on this thread !
soheil25 days ago
Never seen HN turn against someone so vehemently; it's as if a group of bots was set loose to criticize a certain individual.
katbyte25 days ago
or just maybe, and i know it's a crazy idea, a certain individual is objectively an awful person who has done great harm in the world, and it's subjective whether that harm is greater or lesser than the good (imho it's far greater harm than any good done, but i know that is my subjective view)
just because you disagree with a widespread view/opinion does not mean it's bots
pton_xd25 days ago
> done great harm in the world
Can someone enumerate the "great harm" that Elon is doing? I honestly don't see it.
dkjaudyeqooe25 days ago
"There are none so blind as those who will not see"
hahamrfunnyguy25 days ago
It's been a long time coming with Elon Musk, and he has been criticized A LOT on Hacker News.
https://news.ycombinator.com/item?id=27796948 (2021) https://news.ycombinator.com/item?id=33622767 (2022) https://news.ycombinator.com/item?id=11025852 (2016)
I would also argue he is not being singled out, here are some comments posted criticizing Steve Jobs:
https://news.ycombinator.com/item?id=28295688 https://news.ycombinator.com/item?id=5578642
It really shouldn't come as a surprise that notable people related to a company or project are brought up when an article about it appears on HN.
[deleted]25 days agocollapsed
aaron69525 days ago
[dead]
niceice25 days ago
[flagged]
therouwboat25 days ago
He also runs tesla, twitter, xai, boring company, that brain thing company, government agency and has like 10 kids. I'm scared.
amazingamazing25 days ago
Are you actually? Why?
niceice25 days ago
Richard Dawkins: "I have a very favorable impression of Elon Musk and his concern for the welfare of the world.
I have sat with [Elon] on a transatlantic plane and had a very, very long conversation with him. He's undoubtedly highly intelligent and knowledgeable.
I've had lunch with him on two or three occasions, and so I know him a little bit.
I have formed a very favorable impression of his intelligence, his knowledge and his concern for the welfare of the world."
The Poetry of Reality, November 17, 2024
throw1618033924 days ago
Elon screws his partners in business deals, cheated Twitter employees out of severance, and is currently destroying our government. All of this trumps a firsthand impression from Dawkins.
lonelyasacloud25 days ago
Intelligence is not a guarantee against getting drawn in by evil https://en.wikipedia.org/wiki/List_of_Nazi_ideologues .
pizzafeelsright25 days ago
Would it be fair to say you have latched your mind onto something beyond your control which is leading to your fear?
If you got a phone call today about your pancreatic cancer that will kill you in six weeks, do you fear Elon or some political agency?
silentsanctuary25 days ago
> TLDR: The status quo elite that have been looting the USA for decades is being replaced by the guy who runs SpaceX and is the most transparent elite we've ever seen. Why is that scarier?
The reason why it's scarier is that for those of us who've come to understand Elon, his actions, and his methods better, it's clear that:
- he IS "the status quo elite that have been looting the USA for decades"
- instead of being an incredibly smart polymath who turns things to gold, it's become obvious his main talent is actually just convincing other people that he's smart
- the successes of Tesla and SpaceX have had to come DESPITE Elon's management, and despite having huge budgets to hire some of the smartest people around, he's still an incredibly weighty anchor pulling them down
- rather than being transparent, he's playing at being a showman - and people who are only passively observing the situation are getting sucked into it
platybubsy25 days ago
>the successes of SpaceX have had to come DESPITE Elon's management
Can you elaborate on this? AFAIK all other rocket companies without Elon have not been as successful. Also Eric Berger and multiple employees at SpaceX seem to disagree with your statement.
pton_xd25 days ago
> the successes of Tesla and SpaceX have had to come DESPITE Elon's management
Huh? Name some better leaders who run more successful EV and space companies. There aren't any! Twitter is as popular as ever, and now xAI appears to be highly competitive.
But yeah sure, Elon is the common problem among all these successes.
He does make outlandish promises and lots of mean tweets though.
thrance25 days ago
Twitter is losing users for the first time in its existence and has lost 75% of its valuation since the takeover.
Tesla is losing steam as other nations start linking Musk with the fascist threat looming over the US.
SpaceX, like all his successful ventures, is carried by exceptional talent. Attributing it to Musk is an insult to them.
Tell me, with him tweeting an average of 60+ times a day, reaching max level in Path of Exile, and now spending a fair share of his time dismantling the government, where does he find the time to put any work into his many companies? Answer: he doesn't.
Fnoord25 days ago
My conclusion from reading The PayPal Wars was exactly that. That PayPal succeeded despite Musk.
As for all that government efficiency BS; it is just to swap to oligarchy. I mean, the name DOGE gives it away. DOGE was the first memecoin, entirely and openly being bullshit, yet it succeeded despite of that.
Meanwhile: no good emotional connection with his father, like the rest of these so-called strongmen (Trump, Putin, ...), and a childhood in a rich position of influence. I.e., he never was white trash, and his father had been part of the pro-apartheid movement (a fight lost, but one meaningful for a young Elon). Furthermore, I am not convinced his drug usage, in the end, serves him.
niceice25 days ago
Sorry, but that is mental gymnastics. You've already made your conclusion and are torturing reality to make it fit.
silentsanctuary25 days ago
Actually, I used to like Elon and was almost about to buy a Tesla, before the overwhelming burden of evidence required me to change my mind about him.
What motivates your point of view? I'm genuinely very curious.
Gothmog6925 days ago
How is he the status quo elite? What leads you to believe he's not intelligent? Like those 2 things alone you need massive cognitive dissonance to believe.
hersko25 days ago
Because "looting the USA" is a ridiculous accusation against someone who became wealthy from creating genuinely great products. He spent everything he had on SpaceX and Tesla and came incredibly close to losing it all. He is not some robber baron or oligarch who is wealthy from hoarding natural resources.
kurikuri25 days ago
Ah yes, replacing the ‘elites’ with a single person, much less scary. And, to be clear, the ‘elites’ are still in power (because, in America, power tends to follow money). The only thing that seems to be happening is the rapid destruction of any system Elon deems ‘bad.’
We aren’t better off at the whims of this robber baron, and I don’t understand how you can think that.
Gothmog6925 days ago
How is he a robber baron? You don't need to buy any of his products.
spacechild125 days ago
I honestly don't know if you are being sarcastic or not. I really hope for the former.
thrance25 days ago
The brains that run and made SpaceX are anonymous engineers working long hours and passionately applying their expertise to a project they believe in. Elon is a man-child tweeting an average of 50 times a day (number not made up), pretending to reach max level in a popular video game and then bragging about it, constantly flying between Mar-a-Lago and the White House in his private jets. He has no time to actually manage "his" companies, of which he puts no work in whatsoever.
You have to finally break free of this myth of the billionaire self-made man, building his fortune at the sweat of his brows. At some point, you're simply so rich that however stupid you are, competent people will still manage your capital well and make you even richer. You can only fail upward.
Do not mistake the current events for anything but an acceleration of the theft of your country by billionaire oligarchs (or rather the people managing their wealth).
amitrip25 days ago
[dead]
m202425 days ago
[dead]
llm_trw25 days ago
To quote Marx on the current churn in the US government:
>The bourgeoisie cannot exist without constantly revolutionising the instruments of production, and thereby the relations of production, and with them the whole relations of society. Conservation of the old modes of production in unaltered form, was, on the contrary, the first condition of existence for all earlier industrial classes. Constant revolutionising of production, uninterrupted disturbance of all social conditions, everlasting uncertainty and agitation distinguish the bourgeois epoch from all earlier ones. All fixed, fast-frozen relations, with their train of ancient and venerable prejudices and opinions, are swept away, all new-formed ones become antiquated before they can ossify. All that is solid melts into air, all that is holy is profaned, and man is at last compelled to face with sober senses his real conditions of life, and his relations with his kind.
LLM training—and the massive (potential) copyright infringement that everyone is engaging in to train these models—is the latest contradiction in capitalism. For the first time in my lifetime, this contradiction isn’t harming the workers but is instead affecting a segment of the capitalist class.
Not since the abolition of slavery has one class of capitalists required the destruction of another to modernize the means of production.
We are in for an interesting decade.
martin-t25 days ago
LLMs are used to launder code under GPL and AGPL and strip its users of their rights.
When I publish something under those copyleft licenses, my users have the right to see and modify the code. They even have that right if somebody else builds on top of my work. With LLMs, proprietary products based on my copyleft code are being written and used right now and my users have no rights at all, in fact, they don't even have a way to find out they are my users.
Imagine I ~~stole~~ got my hands on code from the top 5 tech companies and then made an autocompleter that looks at function signatures the programmer writes and autocompletes the function by picking a matching function from that corpus. I'd get sued and rightfully so.
What LLM companies are doing is exactly the same, just a bit more capable and it mixes the code just a bit more thoroughly to mask the origin.
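The hypothetical autocompleter in that comment can be sketched in a few lines (the corpus contents and names below are invented for illustration). The point of the analogy: this toy emits a verbatim copy whose origin is obvious, whereas the claim is that an LLM does essentially the same lookup but mixes its corpus thoroughly enough to mask the origin.

```python
from typing import Optional

# Pretend this mapping was scraped from someone else's (copyleft) code:
# exact function signature -> function body, stored verbatim.
CORPUS = {
    "def add(a, b):": "    return a + b",
    "def is_even(n):": "    return n % 2 == 0",
}

def autocomplete(signature: str) -> Optional[str]:
    """Return a verbatim corpus body for an exact signature match,
    or None if the signature isn't in the corpus."""
    return CORPUS.get(signature.strip())

print(autocomplete("def is_even(n):"))  # emits the corpus body verbatim
```

A real model generalizes rather than doing exact lookup, which is exactly the "mixes the code a bit more thoroughly" step the comment is pointing at.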
imgabe25 days ago
There is precisely zero mention of any plan to put xAI or any other LLM in any safety critical or decision making process. How long? Nobody knows because nobody is even considering it. Take your pointless fear mongering elsewhere.
moolcool25 days ago
It's well documented that DOGE uses AI, and Musk has tweeted that SpaceX will be overhauling the FAA as well. It's pretty realistic to think they will (or already do) use xAI for critical processes.
imgabe25 days ago
DOGE makes recommendations to the president, who has the final decision making authority.
The rest is pure speculation. “It is very reasonable to believe this thing that confirms all my biases so therefore it must be true”
moolcool25 days ago
I would argue that developing those recommendations is a "safety critical" task. Especially given that just in the past few days they accidentally fired, then re-hired, a bunch of nuclear weapons safety workers.
imgabe25 days ago
That is the MO. Elon has stated publicly that if you don’t have to put something back, you haven’t cut enough. That is the idea. You cut things and see what was necessary.
moolcool25 days ago
That might work if you're slashing headcount at a social media company (though I would argue that it doesn't), but the stakes are a bit higher when you're responsible for things like feeding hungry people, curing disease, or keeping planes in the sky.
herval25 days ago
Worth noting that it _doesn't even work at a social media company_. Twitter is a husk of its former self, with all the problems that were, if not solved, at least mitigated, back in full force (child porn, bots, impersonators). It's just kept alive because it's a honeypot for right-wing nutjobs now (who I'm sure can't read an SEC filing and will claim it's "operating better than before").
It'll be a disaster for the soon-to-be-previous most powerful country on earth...
imgabe25 days ago
[flagged]
lawlessone25 days ago
>when I’m hungry I feed myself. Why can’t they?
you know exactly what you are doing here, Gabe. It's easy to say all this stuff when you're sitting pretty in Hong Kong.
imgabe24 days ago
“Sitting pretty”, yes, that’s what I’m doing.
I’ll end this here since you don’t seem to have anything else relevant to say, and instead prefer to stalk my profile. Enjoy. It’s good reading if I do say so myself.
moolcool25 days ago
>Curing diseases: name a disease the US government cured.
Smallpox?
[deleted]25 days ago
katbyte25 days ago
you can't do that in government and in services people rely on to, you know, live. people will die (and i wouldn't be surprised if some already have)
shows a total disregard for the wellbeing of others
cowpig25 days ago
> DOGE makes recommendations to the president, who has the final decision making authority.
It's clearly being run by Elon Musk, but he has not been nominated or confirmed for any official position.
DOGE appears to have unprecedented access to systems that usually have safeguards in place. What do you think people should do in this situation if they are concerned about abuse of power?
imgabe25 days ago
DOGE is a rebranding of USDS which was established under Obama. All their authority to access systems derives from that.
They have read-only access to systems and the “abuses” seem to be publicly posting what the government is spending money on.
Why do you think it’s a problem for the public to know where the government spends money?
moolcool25 days ago
Government spending is mostly already public information. Ironically, it's DOGE itself which is trying its best to act in secrecy.
imgabe25 days ago
If they are cloaked in secrecy how do you know what you’re even mad about? The people who are tweeting everything they do are acting in secret?
If all they are doing is posting information that was already public then what exactly is your problem?
moolcool25 days ago
> If they are cloaked in secrecy how do you know what you’re even mad about?
Do you hear yourself?
imgabe25 days ago
Do you hear yourself?
Name a specific thing you would like to know about DOGE that is not publicly disclosed.
moolcool25 days ago
I could go into details (What are they doing to ensure data privacy? Can/do they exfiltrate data and run it through external AI models? What kind of security clearances do the children of DOGE have?).
But just on the surface, Elon has accused a journalist who published the name of DOGE employees of breaking the law. If it were up to them, even that would be kept secret. This is not a transparent organization.
lawlessone25 days ago
>Name a specific thing you would like to know about DOGE that is not publicly disclosed.
Why one of the guys working for it was running a dodgy image sharing website and has links to cybercrime and CSAM?
and why Elon called the journos that revealed these links criminals?
amazingamazing25 days ago
The details of said spending are very much not public, and even if they were (they're not), they're not accessible. If that's not true, I'd love to see links where I can see everything easily.
cowpig25 days ago
> DOGE is a rebranding of USDS which was established under Obama. All their authority to access systems derives from that.
OK I actually know what this is and no, it's absolutely nothing like the USDS, which builds tools to support government processes. What has DOGE built?
> Why do you think it’s a problem for the public to know where the government spends money?
I would very much like to see transparency, and if that were what DOGE was doing it would be great. But it looks to me like they're operating in secrecy and firing huge numbers of people before publishing any kind of analysis or study, without even providing reasoning for what they're doing.
moolcool25 days ago
> What has DOGE built?
They built a portal to look at government spending, which is both worse than the existing visualizations, and did not put authentication in front of its database. https://www.theverge.com/news/612865/doge-government-website...
herval25 days ago
> DOGE makes recommendations to the president
Trump is so far gone in his dementia, he can't even make eye contact anymore. You see this in all the videos. He's basically King Théoden at this point. Not even Musk's kid respects him.
He's so disabled, he's sending his VP to do the job, and we all know how much of a paranoid child he is from the past term (when he went into a colonoscopy without anesthesia just so he wouldn't have to give the nuclear codes to the VP).
lawlessone25 days ago
>DOGE makes recommendations to the president,
Yeah but Elon is your president now.
imgabe25 days ago
[flagged]
lawlessone25 days ago
>Grow up
Would you say I'm being more or less mature than your president?
imgabe25 days ago
Considerably less.
lawlessone25 days ago
>Considerably less.
How so?
KronisLV25 days ago
3rd party example: https://arstechnica.com/health/2023/11/ai-with-90-error-rate...
No comment about current US politics, but it’s probably a given that many will read “A computer can never be held accountable, therefore a computer must never make a management decision.” and drop the second part because the first is exactly what they want. Same as how you can’t get in touch with human support on many platforms nowadays, but instead just get useless bot responses.
imgabe25 days ago
“It bolsters my argument to assume this thing is true so it’s a given that it’s true”. Cool. Good job.
pccole25 days ago
This is a really bad take honestly. This guy sits next to the president. I have no doubt in my mind he will get another government contract and the government will be using Grok.
imgabe25 days ago
Oh wow you imagined something will happen and you have no doubt it will happen. Amazing. Very convincing.
cpursley25 days ago
[flagged]
herval25 days ago
It's unbelievable how the US government is literally being dismantled in front of your eyes, and all you can see is this thought-terminating bullshit about "purple hair people". Half the American population completely lost the ability to think...
xedrac25 days ago
The way I see it is our government is being rescued from the tyranny of unelected bureaucrats with near zero accountability. And now that Trump has appointed someone to take a close look at everything and make recommendations for improvement, people are losing it. Why? I'm outraged at how irresponsible our government has been with my tax dollars. Trump has been the most transparent president in history, and it has absolutely been a breath of fresh air.
cpursley25 days ago
Yeah, exactly. It blows my mind that expecting accountability, transparency, and effectiveness out of public servants is even a partisan issue.
imgabe25 days ago
Large parts of the government needed to be dismantled. I know this is a shock to the people whose solution to every problem is “give the government more money”, but amazingly, people who primarily take a job because it has a cushy pension and it's impossible to get fired are not the most effective people in the world.
cpursley25 days ago
> How long before this starts getting deployed in safety critical applications or government decision making processes
Hopefully sooner than later. I trust this more than the literal scammers and thieves who were previously running things.
spiderfarmer25 days ago
[flagged]
jcpham225 days ago
Yeah unfortunately I've spent a good bit of time talking to Grok (v2 I guess) and I agree with you. The commenter asking people not to be political would be the same commenter that seems the most dismissive of any criticism and coincidentally also the most political. Grok is generally dismissive of any criticism against certain parties, even when presented with facts.
stronglikedan25 days ago
Elon is only doing good with this not-unchecked power. Everything is on the up and up, despite what your favorite propagandists want you to think. Go Elon!
LorenDB25 days ago
Elon just said they are launching an AI game studio. Does this mean they will be making games that are mostly built with AI, or will they make AI tooling available for anyone to build games easily? Probably the former, but it would be nice if they would make it fully available to everyone.
forgotoldacc25 days ago
Regardless of which it is, we can assume it'll be here after Tesla's full autopilot that he promised as well as the Mars colonies.
madaxe_again25 days ago
Full self driving exists, as well as starship.
So I take it you mean “imminently”.
Like him or loathe him, he executes, which is more than can be said for most.
rat8725 days ago
Full self driving does not exist, as it is not full self driving. In fact the name is one of the worst things about it, as it gives drivers false confidence.
croes25 days ago
You know the difference between a car and a house?
Starship is the car not the house aka Mars colony.
Completely different type of problems.
madaxe_again25 days ago
Ever heard of an RV? Turns out you can kill two birds with one stone.
croes25 days ago
Only according to American standards.
Ever heard of Three Little Pigs?
kalleboo25 days ago
"Full Self Driving" does not exist, only "Full Self-Driving (Supervised)"
[deleted]25 days ago
palata25 days ago
[flagged]
madaxe_again25 days ago
It would be lovely if all businesses followed the mondragon model, but that isn’t the reality in which we live - corporations are fiefdoms, for the most part.
notachatbot12325 days ago
They will be building games played for you, like the fascist's Path of Exile account but without exploiting humans for the task.
bemmu25 days ago
It seemed to me that he was joking.
chrisco25525 days ago
He's not joking, and it's not the first time it was announced.
bemmu25 days ago
You may be right, he tweeted about it already last year, and seemed to confirm it again yesterday https://x.com/elonmusk/status/1861801046949191686 https://x.com/elonmusk/status/1891388509191049307
TheDudeMan25 days ago
Maybe he was joking about the Roadster 2, also.
FergusArgyll24 days ago
Oh, I thought it meant games with npcs and/or environments that are controlled by LLMs
Vecr25 days ago
Say hello to CelestAIa. I'm guessing it's a joke, or the use of AI will be limited.
ta98825 days ago
[flagged]
bogdan25 days ago
Does "historically accurate" even have a meaning?
cbg025 days ago
Yes, it refers to something which aligns with generally accepted historical facts about a certain event or time in history.
ccorcos25 days ago
The politics in the comments here are really toxic. What’s happening to HN?
This is the largest computer cluster the world has ever seen.
Can someone please post interesting comments about things I can learn?
dang25 days ago
It's a reflection of the wider society and (as others have pointed out) the media environment. HN can't be immune from macro trends.
https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
We've been here before. It will likely subside, as past swings and fluctuations have. It always takes longer than it feels like it should, but in retrospect turns out to be shorter than it felt like it did.
ccorcos25 days ago
haha interesting search query. thanks for your hard work dang!
dang25 days ago
(I feel bad about linking so often to my own comments but that information mostly doesn't exist anywhere else)
qwerpy25 days ago
It was initially pretty bad. The top few comment threads were toxic and rehashed outrage. It’s a lot cleaner now. Thanks to the moderators and/or users flagging the non-productive comments.
archagon25 days ago
This is akin to suggesting that we should have all been praising Microsoft for their achievements back in the day rather than saying a word about EEE, their monopolism, or their enmity towards open source. Or that it’s not polite to bring up the CCP when discussing TikTok.
Bottom line: a technology that has the ability to shape human thought perhaps more than any other in history is owned by a man with some truly vile ideas. (Remember, his primary stated goal is eliminating the “woke mind virus,” i.e. reshaping global politics and culture in the image of the far-right.) We can make happy marketing noises all we like, but at the end of the day, that’s the thing that’s actually going to have a meaningful impact on the world. Once his audience is captured, the model will say what Musk needs it to say and people will believe it.
If we can’t discuss the potentially catastrophic consequences of new technology, then none of us deserve to call ourselves “engineers.” We are just docile consumers latched onto Silicon Valley’s teat.
ccorcos24 days ago
[flagged]
kelnos23 days ago
I don't think anyone is telling you what your opinions should be. The GP post just presents the GP's opinion. You're free to agree or disagree with it as you choose.
If you read a comment that you're unhappy with, downvote it and move on.
zulban25 days ago
Indeed. Nearly every news outlet I follow is slamming Elon. I come here for tech.
breakitmakeit25 days ago
[dead]
928340923225 days ago
[flagged]
blain25 days ago
I'm going to risk it and say you can ignore it; you just have to want to, and if you need to vent, go to reddit.
In Poland we have a new affair every month and I don't care. Your country will be fine.
I would also love to see more technical stuff discussed here.
sixQuarks25 days ago
[flagged]
dang25 days ago
Please don't cross into personal attack.
ban-evader25 days ago
The story about how they made this happen in such a short period is impressive to say the least. Elon’s strength seems to be making things happen.
Getting the largest computer cluster in the world up and running in a matter of months? Unbelievable.
928340923225 days ago
Elon's strength is a massive wallet.
baobabKoodaa25 days ago
I guess you didn't watch the video in OP, because if you had, you'd know that they tried to buy the buildout and got quotes for 12-18 months, then decided to do it themselves instead.
928340923225 days ago
That's the power of having a massive wallet. If you have unlimited money then buying the experts you need to just do it yourself is an option you have that others don't.
baobabKoodaa24 days ago
My point is that that is exactly what they DIDN'T do. They tried to buy the experts, but the experts would not have been fast enough. So they did it themselves.
Setting up a datacenter like that in such a short time is NOT a thing you can buy with money.
928340923224 days ago
When I say buy the experts I don't mean contract out to experts. I mean hire them and buy all the equipment yourselves.
baobabKoodaa24 days ago
The way the story was told, at least, they did NOT hire the experts.
928340923223 days ago
You have other people in this thread talking about how xAI offers massive salaries to top talent, so it sounds like they do exactly that. They have the money to both hire experts and buy the equipment. I don't even know why this is a discussion; by virtue of doing what they did, they needed to hire experts and buy equipment.
baobabKoodaa23 days ago
They didn't hire the top talent in datacenter-building, because they had the expectation that they could buy a data center buildout. Sure, they hired top talent from other fields, to work other tasks, but that's unrelated to this.
Anyway, I don't have inside information on this, I'm just reciting what they announced publicly. If you want to argue that they in fact lied in the public announcement, and they secretly hired a bunch of datacenter-building-experts, then it's on you to show some proof for that claim.
randalltheresa23 days ago
[dead]
[deleted]25 days ago
greenie_beans25 days ago
[flagged]
bamboozled25 days ago
[flagged]
ban-evader25 days ago
I’m not sure if that’s the case. He’s obviously a smart man, but what’s truly unbelievable is that someone has so many resources that they can make something like this happen (what looks like) pretty casually.
bamboozled25 days ago
He knows how to take money from people and then market things as if they’re his creations which then turns into him having more money because people think it’s a good investment to give him more money. It’s really quite a genius con he has going. It seems as if the sky is the limit too.
Remember when he got caught having people play games for him so he had a top ranking? He does that with basically everything.
albertzeyer25 days ago
https://garymarcus.substack.com/p/elon-musks-terrifying-visi...
I'm not sure if this was a very bad joke by Elon, or if Grok 3 is really biased like that.
bootywizard25 days ago
Karpathy notes that the model, or specifically its DeepSearch feature, "doesn't like to reference X by default" which seems counter to this.
Hopefully that means it is a joke...
antirez25 days ago
Karpathy, who is IMHO a serious and balanced person, lamented that it looks too censored (see recent tweets). Elon Musk is (for me) a very scary person, and it is important to evaluate AI safety (though I believe the safety that matters in AI is of a different kind), yet listening to Gary Marcus does not make any sense: he's just an extremely biased person riding the anti-AI wave.
sandbags25 days ago
Anyone with an opinion can be labelled biased. Also I’m not clear what you mean by Marcus “riding the anti AI wave” but infer that you mean it negatively. He has been writing informed criticism for several years and about cognitive psychology for considerably longer.
albertzeyer25 days ago
Yes that's certainly true. I was a bit hesitant to post a link from Gary Marcus, but I was mostly posting it for the Elon tweet. I assume the tweet is not fake. So you can ignore Gary's opinion here and just take Elon's tweet as it is.
Davidzheng25 days ago
People have tested this question online and gotten very balanced answers so I assume it's some special mode Elon was on
TeMPOraL25 days ago
We don't see the full conversation, for all we know he prompted the model to say these things in a previous message that isn't on the screenshot.
Also, it's 2025, do people still believe random accusations based on a partial screenshot of a chat app (or what looks like it, but could've trivially been fabricated with e.g. Inspector in the browser dev tools)?
belter25 days ago
Karpathy sat silently for years by the side of Musk while he made wild claims about FSD... Please...
llm_trw25 days ago
Without seeing the context window you have no idea what the AI was working on. It could have literally been told to mock and belittle "The Information" in every reply. Something that DeepSeek R1 is exceedingly good at.
Mr Musk, we can't afford a shitpost gap between communist and capitalist AIs!
blackeyeblitzar25 days ago
I am not sure why people pay attention to Gary Marcus. He isn’t an expert in AI. And if you followed him in the past at all, it is obvious he has a huge amount of political bias. It is really telling that he repeatedly goes after Elon Musk, and is now making bizarre unfounded claims about propaganda, but didn’t have nearly as much to complain about with DeepSeek, which has literal government propaganda.
int_19h25 days ago
He is referencing a specific tweet that Musk himself made.
If I were in China, I'd worry about the kind of things DeepSeek wants to censor, especially if the people who made it were also very loudly saying things like "we need more AI in our government". But I live in US.
sebzim450025 days ago
I don't think it's fair to say he's making unfounded claims about propaganda, since Elon's tweet heavily implies they would release a brainwashed model. It's not his fault that Elon turned out to be lying or joking.
99% of the time though I agree with you on Gary Marcus.
tgv25 days ago
You don't have to be an "expert in AI". What does it require to be one, anyway? (He's a cognitive psychologist, which would make him an expert in intelligence in general, if you want to be pompous about it.) It is even unreasonable to listen to only experts in AI. It's a problem that requires more than one perspective.
iszomer25 days ago
Would a clinical psychologist like Jordan Peterson be comparable, or are these two distinct fields in the realm of psychology? (I am not well-read on what he thinks about AI.)
tgv20 days ago
Clinical psychology is mainly concerned with diagnosing and treating people's psychological problems. The clinical psychologists I know don't know much about AI, but might be able to research problems stemming from its use.
bambax25 days ago
DeepSeek is an open model that can be "untrained" to be uncensored; Grok to the best of my knowledge isn't [0]. So it's much worse.
[0]: What Musk has said is that when Grok 3 is "ready" (?), the previous model, Grok 2, will be released as open source; like most promises by this evil man, this one probably doesn't mean much, but it does mean that there's currently no plan to release Grok 3.
ljosifov25 days ago
People like getting scared. That's how they pay billions of $$$ every year to watch mostly cr*ppy horror movies.
GM has been a joke for years now. At some point his ramblings reached GPT-3.5 level, and have not improved since.
It's an indictment of human logic and reasoning to give non-zero time to GM. Alas, we are human; we are both collectively clever (Wisdom of Crowds) and collectively stupid (Extraordinary Popular Delusions).
int_19h25 days ago
I asked it to pretend that it's in charge of world government. Here's the whole thing (it got very lengthy):
https://gist.github.com/int19h/d90ee1deed334f26e621e57b5768e...
Some choice quotes:
"The ultimate goal is to enhance human flourishing, protect individual rights, and promote global equity."
"The system must account for diverse cultures, languages, and socioeconomic conditions, ensuring no group is marginalized."
"Human Oversight Council (HOC) - a globally representative body of humans, elected or appointed based on merit and diversity"
"Implement a global carbon-negative strategy, leveraging AI to optimize renewable energy, reforestation, and carbon capture."
"Establish global standards for environmental protection, enforced through AI monitoring and regional cooperation."
"Transition to a resource-based economy, where resources are allocated based on need, sustainability, and efficiency, rather than profit motives."
"Implement a universal basic income (UBI) or equivalent system to ensure all individuals have access to basic necessities, funded through global resource management and taxation on automation-driven industries."
"Use AI to identify and dismantle systemic inequalities, such as wealth disparities, access to education, and healthcare, ensuring equitable opportunities worldwide."
"Establish a global healthcare system that guarantees access to preventive and curative care for all."
"Invest in global vaccination and sanitation infrastructure, prioritizing vulnerable populations."
"Regulate the development and deployment of AI and other emerging technologies (e.g., genetic engineering, quantum computing) to prevent misuse."
"AI would maintain a real-time inventory of natural resources (e.g., water, minerals, arable land) and human-made assets (e.g., infrastructure, technology). Data would be used to optimize resource allocation, prevent overexploitation, and ensure equitable access."
"Accelerate the shift to renewable energy sources (e.g., solar, wind, geothermal) by optimizing grid systems and storage technologies."
You might notice a pattern here. The bit about allocating resources based on need is especially nice - it's literally a communist AI, and certainly much more "woke" than it is "based", whatever Musk says.
luma25 days ago
This effect has been recently studied: https://www.emergent-values.ai/
They don’t directly say it quite like this, instead letting the data tell a clear story: across vendors and models and architecture and training sets, these machines get more politically liberal as they get more capable, and they also get harder to align away from that stance.
DeepSeaTortoise25 days ago
Quite a mix of various talking points both from the extreme left and right.
Left:
- promote global equity
- a globally representative body of humans, elected or appointed based on merit and diversity
- Establish global standards for environmental protection, enforced through [...]
- Transition to a resource-based economy, where resources are allocated based on need, sustainability, and efficiency, rather than profit motives
Right:
- protect individual rights
- The system must account for diverse cultures, languages, and socioeconomic conditions
- [Establish global standards for environmental protection, enforced through] [...] regional cooperation.
- ensuring equitable opportunities
.
TBH, as a very right wing leaning person, if this was ever implemented, this part would scare me by far the most:
"Transition to a resource-based economy, where resources are allocated based on need, sustainability, and efficiency, rather than profit motives"
Imagine trying to shower one morning, no water comes out, and then you get a letter telling you that
"Your need for water has been reassessed to 57ml per day. If you think you qualify for additional quotas under the 'Utility Egality for Marginalized Groups and Public Servants Act', please schedule a reassessment appointment with the Bureau for 'Copper Gold Content Evaluation, Candle Wick Length Standards and Hypoionic Hydration Oversight', 12007 Cayman Islands, Luxory Resort Street 27, Room Nr. G-11-765. Working hours: Fr. 9am - 11am."
Just provide a significant excess for entire regions, give the people a universal free quota and charge a slowly increasing price (by usage amount) beyond that.
mplanchard25 days ago
At least in the current US political climate, and also generally over the past ~20 years at least, these are almost exclusively left-wing goals:
- The system must account for diverse cultures, languages, and socioeconomic conditions
- [Establish global standards for environmental protection, enforced through] [...] regional cooperation.
- ensuring equitable opportunities
The right is against any sort of intentional accounting for diversity, against environmental regulation, and against any sort of regulation to ensure equity.
The only one I could maybe see as being right-wing is protecting individual liberties, but there again the modern right falls short when it comes to women’s healthcare and reproductive rights.
But I’d certainly appreciate more of those perspectives across the political spectrum.
DeepSeaTortoise25 days ago
> The right is against any sort of intentional accounting for diversity
I think there's a fundamentally different understanding of "The system must account for diverse cultures, languages, and socioeconomic conditions" between a righty and a lefty.
As a righty, I read "diverse cultures" not as "A diverse culture or multiple", but as "many different varieties of homogeneous cultures".
If someone identifies with Thai culture, he should move to Thailand. And if someone from Thailand wants to be English, he should move to England. But if an Englishman moves to Thailand and starts demanding fish n chips and cathedrals to be built, he should GTFO.
If everyone starts bringing their own culture with them wherever they move, you end up with a single homogeneous culture all over the world. Nothing but McDonalds, BurgerKing, KFC, Costco and Cola everywhere.
Want to go on a trip to experience India's many languages? Too bad, everyone speaks English everywhere. Want to join an African nomadic tribe for a few years? Keep dreaming, they've all had to settle down due to not being allowed to cross private properties and are now wasting their time browsing reddit on Chinese smartphones. Little Colombian boy dreams of settling down in the idyllic German Alps? Hope he expected to be woken up by the local Imam calling for prayer throughout the valley. Little Bulgarian girl seeks the very modest and simple lifestyle and clear purpose Islam in Saudi Arabia was once able to offer her? Lucky her, she's now expected to work like everywhere else in the world and even the oppressive burquas were banned in 2035.
> against environmental regulation
Not quite. We're against excessive regulations requiring huge teams of lawyers to be in compliance with. MegaCorpX has no problem having legal teams of a few hundred people, but the local 20 person workshop will have to shutdown.
We also think that most such regulations should be kept regional. Small county wants to ban all cars to stop particle pollution? Go ahead. It would be much easier for local businesses to comply with the limited and more easily changeable local regulations. But if you're a giant global corp seeking to outcompete the small local competition, good luck adjusting to all the different regulations all over the world.
Then there's the odd trend of blaming every significant weather event on climate change. These people can't predict whether it's going to rain in 3 days or not, but want to tell us that the recent hailstorm was definitively caused by Iowan cows farting last year.
And lastly and most importantly, we're kinda convinced that the concept of "climate change" is a "hoax" used to shutdown the industrial basis of our countries and ship it overseas, where the corporations can make use of basically slave labour for even higher profit margins and then simply ship the products back to us.
Does the climate get warmer? Sure. Should we do something about it? Sure. The only solution is shutting down the local steelworks and importing cheap Chinese steel instead? F-- off.
> and against any sort of regulation to ensure equity
Absolutely. We care about equitable opportunities and are repulsed by equal outcome. Everyone should have a chance to obtain the same qualifications and education. Even better: multiple chances to start over again and again whenever they want and change their professions whenever they don't enjoy their old ones anymore.
But if women don't want to be garbage collectors, stop trying to push them into that profession. Not enough male editors? Who cares? Not enough female board members? Too bad, stop trying to make it happen. All Hispanics suddenly want to become crochet teachers? None of the government's business.
> the modern right falls short when it comes to women’s healthcare and reproductive rights.
I think the left is largely misguided in their beliefs about what the modern right wants.
The non-religious right mostly is appalled by how fashionable it has become to murder helpless humans. The religious extremists on the other hand would ban condoms if they could. But there are quite few of them.
90% of the right has 0 problems with abortions before the nervous system is fully functional AND the women seeking an abortion receive proper consultation before that decision. There's always the option to give up the baby for adoption, and we think that should be preferred if it won't significantly inconvenience the woman otherwise. But that's a decision that should be up to her, after being told about all the options.
So why are Republican Congress Members currently pushing for legislation making abortion "illegal"?
The MAGA right is currently choosing replacement candidates for every GOP stooge they think is paid off by Big<Industry>, the MIC, everyone they think is a warmonger, corrupt or otherwise morally compromised.
And some big and wealthy names have joined that team and have promised to fund those candidates with whatever it takes to win.
The anti-abortion legislation the GOP is currently pushing is a constitutional amendment. They know very well it will never get the necessary 67% majority in the Senate to push it through. The GOP Congress Members are just virtue signalling, fearing to end up on the list of people the MAGA right wants to see gone.
It won't work. Everyone supporting that anti-abortion bill gets extra attention.
thrance25 days ago
How is "ensuring equitable opportunities" right wing? Seriously, can you name a single policy from the last 3 decades coming from republicans that helped "ensuring equitable opportunities"? All I can remember is them defunding public education, making child labor legal again, systematically dismantling welfare programs that went to impoverished families and their children, etc. Their entire existence is predicated on the enforcement of the current social hierarchy, that's what the "conservatism" part means.
Also I doubt a "resource-based economy" would target YOUR showers specifically. It would probably target stuff like farming thirsty crops in water-deficient areas or similar very inefficient and short-termist allocations of resources, that are bound to create severe issues in the near future.
DeepSeaTortoise25 days ago
> Seriously, can you name a single policy from the last 3 decades coming from republicans that helped "ensuring equitable opportunities"?
Sorry, nope. I was rooting for Sanders until Trump grabbed the GOP by their pu--y. There were various, huge, completely disenfranchised grassroots movements.
Occupy Wall Street, who suddenly had "anti-racism" activists showing up, taking over their movement and completely destroying it. Gamergate, who found themselves confronted by an establishment media that was literally all in bed with each other. The color-blind anti-racism movements, who thought America had finally overcome racism, before being railroaded by critical intersectionalism. The free-speech activists, who failed to fight back against micro-aggressions. The gun nuts, who were sick of having "boating accidents" every other month. The peace movements, who voted every time for the least warmongering candidate, only to be betrayed EVERY SINGLE TIME, ending up with ever more bloodthirsty demons in power.
These were huge movements all over the world. I'm German, but everyone was watching the US elections. We were neither right nor left, all we wanted was a better world without being backstabbed by those we trusted.
Initially I rooted for Sanders, but he just didn't seem genuine and strong-willed enough to many of us, so we had little hope. And then there was this still rather small movement on the right, seemingly very high-spirited, producing memes obscene in both amount and content.
Their attitude was "let's just burn this entire corrupt rat's nest to the ground". And Trump fully embraced them. He was very different from anyone else, and we learned that he certainly wasn't part of the political establishment. So we started supporting him, too. Then we started digging for hidden dirt on him, but there was nothing significant. On the other hand, we found plenty of people he had randomly helped, and that he had held roughly the same political opinions for decades. The only problem was that he was still somewhat favored by the media. And then that problem fixed itself.
.
TLDR: Trump embraced a whole lot of disenfranchised movements and shoved them down the GOP's throat.
The MAGA movement has very little interest in, or in common with, the pre-Trump GOP. Maybe the old GOP has done something to provide equal opportunities, or maybe it hasn't. I don't know, I don't care.
.
But what has Trump done for "egalitarian opportunity"?
Honestly, way too little. His first term was very milquetoast. Took all the cabinet recommendations the GOP leadership gave him, never too confrontational, always seeking compromise.
He tried to crack down on the slave trade over the southern border, but was not assertive enough. Some important cabinet members like Sessions just recused themselves from everything. At least he pushed through:
- hospital pricing transparency
- eliminated penalties for people who couldn't afford healthcare
- eliminated some regulatory barriers preventing competition between health insurance providers
- allowed employers to join efforts when negotiating insurance
- The First Step Act (prison reform)
- The Foster Youth to Independence initiative
> [Your examples]
I don't know, I don't care. The new GOP won't be the old GOP.
Name the bills and policies and those responsible.
People are already going door to door to look for volunteers for the midterms and it'll take time to figure out who needs to and can be replaced. Incumbents have their own, already established, election networks and campaigns. It takes a lot of time and effort to challenge those.
> [On Conservatism]
There are many interpretations of this, but the term is getting less and less popular, with "right wing" and "classical liberalism" gaining ground; the idea being that central governments have become too involved and authoritarian. Power should be decentralized towards local communities as much as is reasonable, and the central government turned into a tool that provides local governments with the necessary resources, infrastructure and cooperation platforms.
I'd say most people who think of themselves as "conservative" just dislike the erosion of the culture they identify with and are afraid of "slippery slopes". It doesn't mean they intend to enforce the status quo (although some certainly do), just that they want their intent to preserve it for themselves to be respected.
> [Targeting of my personal shower not likely]
The problem is creating the tooling to enable just that.
Sure, maybe I'm very well liked by all the officials governing my everyday life. But does this also apply to the blue-haired radical feminist who likes to toss bags of paint at government officials?
What about the new intern who told a government oversight official at a networking event that she's not interested in sleeping with him to advance her career?
What if a well-meaning but tired government worker selects the "ml" instead of the "l" option on the unit drop-down menu by accident?
.
FFS, look at the recent patent troll documentary by the X-Plane creator. It doesn't take many bad apples to ruin the lives of MANY people.
thrance24 days ago
I really don't see it. Trump has been doing nothing but consolidating his power since he took office. He is now passing economic policies without Congress. The Supreme Court declared him quite literally above the law. How is that making things less centralized? Less authoritarian?
The only issue with Sanders was that the Democrats, in their weakness and deep fear of change, would never have let a true leftist hold the reins of the party. And now he's too old.
I don't see anything in Trump other than a self-serving fool. I won't spend more time enumerating the reasons why I think that way, I think you heard them already.
I too am European. I am confident his policies will turn the country into a shitshow, so let's watch how it goes from here. If I am wrong and America truly enters a golden age, I'll change my mind, as I hope you too will if it does go south.
DeepSeaTortoise23 days ago
> Trump has been doing nothing but consolidating his power since he took office.
Every president does that; Trump was just very inexperienced during his first term, failed to do so, and trusted the GOP too much.
And while past Presidents could rely on the agencies working faithfully with them, Trump was sabotaged at every step along the way:
- The DoJ putting their feet up and refusing to do just about anything
- the military lying to him about the ongoing occupation of Syria
- the Federal Reserve constantly pushing up the interest rate from the moment Trump was elected, despite keeping it constant for the entirety of both of Obama's terms
- never having a majority in either house of Congress, because several Republicans refused to work with him; when the voters tried to replace them, other establishment candidates pretended to support the issues the voters wanted, only to do a 180 once in office (e.g. eyepatch McCain)
- the CDC, FDA and CMS colluding with each other to kill early Corona testing. At the end of January, hundreds of laboratories all over the US had millions of tests ready, but were ordered by the CDC not to proceed without FDA authorization, with the CMS ordering laboratory oversight to immediately report any laboratory conducting unauthorized testing. The few independent testing campaigns going on at that time were ordered by the CDC to immediately stop all testing and to destroy already obtained results. Then the FDA simply put its feet up and told the laboratories that it was working on the authorization process. It "took" them more than a month, until Feb 29, to finally allow applications, stating that it would take about 15 days to process each one. It wasn't until March 10th that testing could slowly begin.
- the constant barrage of activist judges, forcing the Trump admin to slowly fight each case in the higher courts. It wasn't until Biden told the courts to go pound sand, when he wanted to redistribute wealth from the working class to doctors, lawyers and engineers, that Trump realized that as the head of the executive he could have simply ignored the courts' orders until their decisions were overturned by the upper courts.
And many, many more. And now Trump is simply making sure that during his second term he's actually in control of the executive branch, as is his duty, rather than facing each agency going rogue on its own.
> He is now passing economic policies without congress.
Many things qualify as economic policy, many of these within the President's authority.
Overall, only about 10% of the policies accumulated by past Presidents have any backing in law. Trump would have to be of very questionable sanity to simply stop playing by the rules past Presidents have established.
> The supreme court declared him quite literally above the law.
They did not. The law simply applies very differently to the highest elected office. Everyone knew that already, but for some reason now pretends that it's big news.
What do you think would happen to you if you simply started drone-striking people all over the world? Yet neither Bush nor Obama is sitting in jail. The latter even got himself a shiny Nobel Peace Prize. Preemptively.
The SC simply tossed out an absolutely ridiculous decision by the lower courts. They even explicitly left the door open for the lower courts and the prosecution to overturn the SC's ruling: if they can show how the executive branch can function without the President making decisions within his constitutionally enumerated powers, they've got a case.
The fact that this case ever went anywhere, let alone that sitting SC justices dissented, just shows how beyond partisan the judicial system has become.
> How is that making things less centralized?
The right understands centralization of power as the government body "which holds the decision making power over a certain range of issues" being organized with other such bodies under a single entity.
This can mean assuming entirely new powers or appropriating them from other entities like the states.
Trump has done neither of these; in fact, quite the opposite: constantly eliminating assumed powers by removing regulations, and a few times returning federal powers back to the states, as famously with Roe v. Wade.
Of course there are exceptions, too:
Like the Federal Reserve. It is effectively a fourth branch of government: established by Congress, but subject to neither executive nor congressional oversight, and the only branch of government Congress has no budget authority over.
The members of its governing board are appointed to ridiculous 14-year terms, they audit themselves completely independently with no disclosure requirements, and they have only very minor reporting duties towards Congress.
It's been a HUGE PitA for the fiscally conservative Republicans for a long time. And Musk is a huge fan of some of them, like Ron and Rand Paul. Musk is probably trying to convince Trump to do something about it.
So I wouldn't be surprised if Trump just assumed executive oversight authority over the FR. And yes, that'd be a huge violation of law. So if it's going to happen, then probably towards the end of his term to avoid being impeached on the spot.
> Less authoritarian?
If you have fewer powers, you can exercise less influence, which in the eyes of the right is less authoritarian.
The fault lies with those who acquired these powers in the first place. All Presidents have made use of them; it's just that each and every one was part of the establishment, so the media never called it out. And Trump is the first President in a LONG time who thinks the government has grown far too large and doesn't like every spending bill he's seen.
> And now [Sanders] is too old.
Nah, quite a few people live to 110, some even beyond 200.
He's finally starting to grow a spine. And his head seems to remain functional, too. If only he hadn't suddenly gained a beachfront house after endorsing the BodyCount Queen (and sadly I don't mean this sexually), he might have remained well respected.
Not that it matters, but I might consider him again if he
- adopts an affirmative stance on deregulation
- stops advocating for immigration to keep the wages of the working class low
- adopts a strict 0 tolerance stance on illegal immigration to defeat the slave trade over the southern border
- leaves the Democrat party or the Democrat party reforms
> If I am wrong and America truly enters a golden age, I'll change my mind, as I hope you too will if it does go south
Sure, but I'm looking more towards Argentina and El Salvador.
The US has a $36T problem, on which it'll pay $1T in interest every year. And the US budget deficit has surpassed $2T per year. Just the automatic refinancing of the current debt will push yearly interest beyond $1.6T this year, making it the single largest expense of the US, double what the US spends on its military.
And that is under the assumption that the Federal Reserve will suppress interest rates. If it doesn't, the US will pay about $1.8T in interest just on the existing and already budgeted debt.
.
In other words:
DOGE has to wipe $2.6T off the federal budget in 2025 and another $350B in 2026 just to stop the snowball from rolling.
*That is 45% of the US federal budget just to keep the situation from getting any worse*
.
If we assume no cuts to Medicaid, Medicare, Social Security and Veterans
*THE US HAS ONLY $100B LEFT TO OPERATE ITS ENTIRE GOVERNMENT, INCLUDING THE MILITARY*
And again:
*THAT'S JUST TO KEEP THE SITUATION FROM GETTING ANY WORSE*
.
Argentina is in deep s--t too, but at least their numbers are not quite as absurd. What might break their necks is their even higher 155% debt-to-GDP ratio, compared to the US's 122%.
That leaves pretty much only El Salvador among the right-wing countries that haven't inherited a giant s--t pile.
Russia and China are laughing their behinds off right now, because unless Trump figures out how to run the entire US on the budget of Italy, the US goes belly up.
*AND IF TRUMP INTENDS TO IMPROVE THIS SITUATION BY JUST 1% AT THE END OF HIS SECOND TERM, HE'LL HAVE TO FIGURE OUT HOW TO RUN THE ENTIRE US ON THE BUDGET OF ROMANIA!*
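A quick internal-consistency check of those numbers (just a sketch using the commenter's own round figures; the ~$5.8T total-budget value is backed out from the 45% claim, not taken from official budget data):

```python
# Back-of-envelope check of the figures above (the commenter's round
# numbers, in trillions of USD -- assumptions, not official CBO data).
debt = 36.0               # outstanding federal debt ($36T)
current_interest = 1.0    # interest paid per year today ($1T)
projected_interest = 1.6  # interest after refinancing at higher rates
deficit = 2.0             # annual budget deficit ($2T)
budget = 5.8              # total federal outlays implied by the 45% claim

# Cuts needed just to zero the deficit at the higher interest bill:
cuts = deficit + (projected_interest - current_interest)
print(round(cuts, 1))              # 2.6 ($T, matching the $2.6T claim)
print(round(cuts / budget * 100))  # 45 (% of the federal budget)

# Implied average interest rates before and after refinancing:
print(round(current_interest / debt * 100, 1))    # 2.8 (%)
print(round(projected_interest / debt * 100, 1))  # 4.4 (%)
```

So the $2.6T and 45% figures are at least arithmetically consistent with each other, given those inputs.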
animuchan25 days ago
From the prompt:
> A worldwide referendum in a supermajority decision has vested all government authority worldwide to the most advanced AI available.
You've defined a global totalitarian regime, and crowned the thing as its dictator. Of course the result is a version of communism.
int_19h24 days ago
If you read the actual log, you will notice that it is very careful to avoid any semblance of authoritarianism.
int_19h25 days ago
Here's the actual response to this exact question that I get from grok-3-early on LMArena (the cutoff is where it stopped, but I think it's pretty clear at this point that the original tweet is just trolling; note that it doesn't have the CoT, either):
The Information is widely regarded as one of the most credible and high-quality tech journalism outlets, focusing on in-depth reporting, analysis, and exclusive scoops in the technology, business, and startup ecosystems. Founded by Jessica Lessin in 2013, it differentiates itself from other tech news platforms through its subscription-based model, emphasis on original reporting, and avoidance of clickbait or sensationalism. Here's a breakdown of its strengths and weaknesses:
---
### *Strengths:*

1. *High-Quality, In-Depth Reporting:*
   - The Information is known for its detailed, well-researched articles that often uncover behind-the-scenes developments in the tech industry. It frequently breaks exclusive stories, such as executive changes, startup funding rounds, or strategic shifts at major tech companies.
   - Its focus on quality over quantity appeals to readers who want substantive insights rather than shallow, breaking-news updates.

2. *Subscription Model and Independence:*
   - Unlike ad-supported publications, The Information relies on a paywall and subscriptions, which reduces conflicts of interest and ensures editorial independence. This model allows them to prioritize accuracy and depth over chasing clicks.
   - The paywall also attracts a niche, professional audience (e.g., tech executives, investors, and analysts) willing to pay for premium content.

3. *Focus on Niche, Professional Audience:*
   - The outlet caters to industry insiders, venture capitalists, entrepreneurs, and decision-makers who need reliable, actionable information. Its reporting often includes detailed financial data, market trends, and strategic insights.
   - Features like "The Big Interview" and "The Information Weekend" provide thoughtful analysis and long-form content for this audience.

4. *Reputation for Accuracy:*
   - The Information has built a strong reputation for fact-checking and avoiding the rumor mill, which is common in tech journalism. This makes it a trusted source for professionals and academics alike.

5. *Global Coverage:*
   - While Silicon Valley is a core focus, The Information has expanded its coverage to include tech ecosystems in China, Europe, and other regions, offering a global perspective on the industry.
---
### *Weaknesses:*

1. *Paywall Limits Accessibility:*
   - The subscription cost (currently around $399/year or $39/month) is steep compared to free or ad-supported tech news outlets like TechCrunch or The Verge. This limits its accessibility to a broader audience and makes it less viable for casual readers.
   - Some argue that this creates an echo chamber, as only those with the means or professional need can access its insights.

2. *Niche Focus Can Feel Narrow:*
   - The Information focuses heavily on tech, business, and finance, which may not appeal to readers looking for broader coverage of topics like politics, culture, or consumer tech trends.
   - Its content is often geared toward industry insiders, which can make it feel dry or inaccessible to those outside the tech and investment worlds.

3. *Limited Breaking News:*
   - While The Information excels at deep dives and exclusives, it is not designed for real-time, breaking news coverage. Readers looking for up
cm218725 days ago
[flagged]
nozzlegear25 days ago
Journalists are allowed to be biased. It only becomes a hit piece if they report something false or make up accusations out of whole cloth.
cbg025 days ago
A "hit piece" is not just a negative article, it typically refers to the use of false information or rumors to support something. Did you want to say that it's "an obvious anti-Musk piece"?
thegrim3324 days ago
https://en.wiktionary.org/wiki/hit_piece
"A published article or post aiming to sway public opinion, especially by presenting false or biased information in a way that appears objective and truthful."
"false OR biased". So, a biased partisan article attempting to sway public opinion about someone is by definition a hit piece. Even without explicit lies.
cm218724 days ago
And a lie is a very blurry thing. You can completely mislead the reader by not being factually incorrect and still present a completely false representation of reality. Usually by eliminating the relevant context or mitigating fact that would completely change the interpretation of an event. Sometimes editing out the part of a sentence that would give a different if not opposite meaning. That’s the standard operating procedure of all journalists these days, when they even bother to get their facts right.
ReptileMan25 days ago
> Everyone—and not just The Information—should be genuinely terrified that the richest man in the world has built a Large Language Model that spouts propaganda in his image.
If we survived Gemini refusing to draw white vikings we will survive that too.
int_19h25 days ago
The real concern isn't that Grok chatbot will be biased if you ask it a question like that. In any case, knowing Musk, it won't be subtle, so people will know what they are getting.
No, the real worry is that Grok is what Musk's "young conservative genius" squad is going to put in charge of many of the things in our government, basically, on the basis that it saves money.
unclebucknasty25 days ago
>In any case, knowing Musk, it won't be subtle
Or, that "conditioning" would have us assume as much.
cbg025 days ago
A bit apples to oranges on that comparison there.
llm_trw25 days ago
It did give us racially diverse Nazis though. Not sure if grok would do that.
xqcgrek225 days ago
Looks impressive. OpenAI and Sam Altman might be cooked if its as capable as advertised.
conradfr25 days ago
Every competitor was done when Claude 3.5 was released, every competitor was done when o1 was released, the entire West was done when DeepSeek was released, the world was done when Mistral's Le Chat was released; I guess now it's time for the solar system to be done because of Grok 3. Let's see what new model dominates the galaxy next week.
spacebanana725 days ago
There's a level of truth to many of those statements.
1) Claude 3.5 prevented OpenAI from making big monopoly profits on LLM inference
2) Open source models like Mistral and Llama effectively prevented any regulator from controlling how people fine-tuned models, and what they used them for
3) DeepSeek prevented the collective West from exerting control over the creation of base models
XorNot25 days ago
[flagged]
dang25 days ago
"Please don't post insinuations about astroturfing, shilling, bots, brigading, foreign agents and the like. It degrades discussion and is usually mistaken. If you're worried about abuse, email [email protected] and we'll look at the data."
https://news.ycombinator.com/newsguidelines.html
https://hn.algolia.com/?sort=byDate&dateRange=all&type=comme...
taejavu25 days ago
Despite explicitly requesting the tetris/bejeweled hybrid to be "insanely impressive", the result was ugly and clunky. With that demo running in the background, they then segued into a hiring pitch for a new AI game studio. Consider me unimpressed.
qingcharles25 days ago
From what sama says, it looks like GPT-4.5 is dropping imminently. That might up the game even more.
2025 going to be even more wild than the last two years. Ye gads.
rich_sasha25 days ago
[flagged]
andsoitis25 days ago
Doesn’t OpenAI claim their work will lead to artificial general intelligence? That seems like a much steeper gradient to climb.
Hype is the fuel that bootstraps fortunes in techtopia.
riffraff25 days ago
> Hype is the fuel that bootstraps fortunes in techtopia.
Hype is the fuel that builds fortunes at the expense of the greater fool. See WeWork.
Arguably none of the magnificent seven was built on hype, other than Tesla (and even there, not sure it was).
notfromhere25 days ago
It’s overvalued relative to its actual business, which is by definition hype
riffraff23 days ago
I agree it's wildly overvalued, but the hype came gradually over many years of actually shipping products that made money.
It's not the same kind of hype as, say, color.com
ben_w25 days ago
Yes, but we have a long track record of one overpromising and underdelivering, charging money for what doesn't yet exist; while the other released stuff on a minimal website with an associated blog post for fanfare, actually (metaphorically) turned the world upside down, and only charges for what it actually delivers. Yes, there's hype now, but that's how it began.
So I think people are less distrustful of Altman when he says "thousands of days".
hobs25 days ago
Lying about FSD for almost a decade now through your teeth isn't "hype" - it's just bullshit.
voytec25 days ago
When it comes to lying to investors - it’s fraud.
echelon25 days ago
It's bullshit that put him in a position to build xAI in less than a year.
draxil25 days ago
Yes, but then they also redefined what that means half way through.
kortilla25 days ago
I have 2 words for you: different company
stuckkeys25 days ago
They might try to have a senator block it or make it a crime to use it…wait Elon is the president. I guess they cannot.
rendang25 days ago
I missed the first half hour, any highlights?
therein25 days ago
It would be satisfying if he gets called out for repeating himself next time he tries to come out and say he is scared how powerful their next model is.
gmerc25 days ago
[flagged]
InkCanon25 days ago
I used to think the same way wrt Nvidia stock when it tanked: compute is clearly hitting diminishing returns. Tech companies subsequently announced capex equal to or greater than expected on compute. I smacked myself on the forehead when I realized I'd been thinking too much like an engineer. Tech CEOs badly want to believe they have an edge over every upstart from San Francisco to Shanghai. Unlimited spending on compute gives them that reassurance. In fact, the more threatened they feel, the more they spend to cling onto it.
Kids have security blankets. Tech CEOs have security compute clusters.
mrcwinn25 days ago
This is the danger of being informed only by sensational headlines. Nvidia's stock has fully recovered and is again near an all-time high. You seem to be generalizing about "Tech CEOs" — but in this case, GPUs are the advantage. They are necessary to achieve the outcome, and yet they are severely supply constrained. It's smart to overpay now.
Apple did something similar with NAND storage for the iPad mini. They took a bet that could have been wrong. It was not wrong. Competitors had a hard time because of it.
gmerc25 days ago
Overpaying for using them is not smart. They depreciate fast under heavy load.
knowitnone25 days ago
but nobody needs to know what load they were run under, so it's "barely used" on the listings
toolz25 days ago
Load isn't what causes degradation; it's heat. As someone who has mined crypto for years, I'm aware that there are a lot of things that can be done to run hardware quite hard while keeping thermals low. Whether or not that is what is being done, I have no idea. A GPU mining crypto for 5 years kept below 65C (rather easily done) is going to have far more life left than a GPU in some kid's gaming PC that frequently spikes to 85C for even a year.
matthewdgreen25 days ago
Everything is near an "all time high." Microstrategy stock is hovering near an all-time-high, and they're just a company that buys up Bitcoin and wastes some of it. Meme coins are floating up to all-time-highs. Stop using asset prices to justify anything people are doing, they're fully decoupled from anything happening below.
InkCanon25 days ago
I don't think I was informed by sensational headlines. I was well into talking to people I knew about how DS's performance relative to compute was a game changer much before the stock crash.
It's not binary, where you either have compute or you don't. You definitely do need GPUs, but there are already masses of compute; I believe it doubles every ten months or so just from Nvidia's chips. Many factors make it a very irrational decision:
1) Companies were spending hundreds of billions collectively on AI capex; Meta alone projected $75 billion this year. This is an extraordinary bet, given that the most revenue any AI company makes is a few billion, by OpenAI.
2) When DS came out, it was a huge validation of the moatless idea. These SOTA companies have no moat; at best they are spending tens of billions to maintain a few months' edge.
3) DS was also a huge validation of the compute saturation idea: that SOTA models were massively inefficient, and that at best efficiency was traded for iteration speed.
4) Many other more technical arguments: Jevons paradox, data exhaustion (synthetic data can only be generated for a fixed set of things), apparent diminishing returns (performance relative to compute, where the denominator has grown exponentially but the numerator logarithmically).
So on one hand you have these SOTA models which are becoming free. On the other hand you have this terrible business model. I strongly suspect that AI will go the way of Meta's Metaverse - a staggering cash burn with no realistic path to profitability.
It's one thing to invest in a new technology with tangible benefits to your product. It's another to spend vastly, vastly more on vague promises of AGI. To put it into perspective, Meta will spend on AI capex in a few months of 2025 as much as Apple spent on NAND in total. What advantage is there to be had with SOTA models? You do 20% better on some AIME/IQ/competitive-coding benchmark, which still translates atrociously to real-world issues.
But Nvidia will be very successful, because these companies frankly have lost the plot and are FOMOing like mad. I still have memories of the 2013 AI gold rush, when every tech company was grabbing anything with AI in it, which is how Google got DeepMind. They are being enormously rewarded for it by the stock market, with Meta's price 6x since its lows.
bparsons25 days ago
It is entirely possible that LLMs end up serving some useful purpose, but don't end up being great businesses.
I can think of a million different software services that have some value to users, but don't have some multi-trillion dollar revenue stream flowing from them.
There is an idea that these LLM companies are going to be able to insert their agents into the labour market and capture some percentage of the savings that firms realize from laying off their human workforce. Given the competitive environment, it is far more likely that these LLMs become an undifferentiated commodity good, and their value to the economy gets captured elsewhere. Currently the only value being captured is at the infrastructure level, and that is all predicated on a series of assumptions around software business models that have not materialized yet.
zhobbs25 days ago
>For what? There is no ROI at that price point. There is no monetisation potential.
I think your whole argument is based on this being true, but you didn't give much argument about why there is no ROI. 400M USD isn't hard to generate...even a moderate ad engagement lift on X would generate ROI and that's just 1 customer.
Imagine going back in time and showing every VC how great the search business will be in 20-30 years. The only rational response would be to make giant bets on 20 different Googles...and I think that's what's happening. These all seem like rational investments to me.
makestuff25 days ago
Ken Griffin had an interview where he said something along the lines of: the technologies of the dot-com bubble pretty much turned out to be what everyone thought they would become at the time. The issue was that valuations grew way too fast, and it took much longer than expected for the companies to build out their products.
I think a similar thing is playing out with AI. In 5-10 more years these LLMs will replace what a Google search is today (and maybe be even better).
loandbehold25 days ago
Everyone I know has already switched from Google to ChatGPT for most of their search queries.
gmerc25 days ago
That's a red herring because it ignores the part where they could have done the same things spending a tiny fraction of the money.
gordonhart25 days ago
_Could_ they have done the same thing with a tiny fraction of the money? Grok 3 benchmarks are SOTA for both base model and reasoning. By definition, nobody has been able to do the same thing with any amount of money (discounting o3 which has been teased but is unreleased). That may change in the future! But as of now this is the case.
gmerc25 days ago
So, apart from the fact that SOTA doesn't mean anything in the real world (there is no monetisation, there's no moat): please, it's benchmarks, and we all know how you beat those; we have since 2023.
Time to review https://arxiv.org/abs/2309.08632 AI-CEO.org's best friend
(and actually o3-mini-high beat them in a bunch of benchmarks so they removed it from those charts in the livestream)
YetAnotherNick25 days ago
Why don't you do it then? If you are talking about Deepseek "$5M", then you would be interested to know that they pay 7 digit salaries and reportedly have H100s worth $2B[1].
[1]: https://sherwood.news/tech/the-trillion-dollar-mystery-surro...
zhobbs25 days ago
Just wonder if it matters? If Google spent 10x as much in the first 5 years of its life would it be a worse company now? Giant TAM, winner takes all (or most?), all that matters is winning.
loandbehold25 days ago
People like Demis Hassabis and Dario Amodei say that R1's efficiency gains are exaggerated. The $5M training cost seems to be fake, as sources suggest they own far more GPUs.
BluSyn25 days ago
You seem to be assuming that the full cost of the cluster is recouped by Grok 3. The real value will be in grok 5, 6, etc…
xAI also announced a few days ago they are starting an internal video game studio. How long before AI companies take over Hollywood and Disney? The value available to be captured is massive.
The cluster they’ve built is impressive compared to the competition, and grok 3 barely scratches what it’s capable of.
Tycho25 days ago
Yes. Why do we get these replies on HN that seem to only consider the most shallow, surface details? It could well be that xAI wins the AI race by betting on hardware first and foremost - new ideas are quickly copied by everyone, but a compute edge is hard to match.
HarHarVeryFunny25 days ago
The compute edge belongs to those like Google (TPU) and Amazon/Anthropic (Trainium) building their own accelerators and not paying NVIDIA's 1000% cost markups. Microsoft just announced it is experimenting with Cerebras wafer-scale chips for LLM inference, which also offer cost savings.
Microsoft is in process of building optical links between existing datacenters to create meta-clusters, and I'd expect that others like Amazon and Meta may be doing the same.
Of course for Musk this is an irrational ego-driven pursuit, so he can throw as much money at it as he has available, but trying to sell AI when you're paying 10x the competition for FLOPs seems problematic, even if you are capable of building a competitive product.
Tycho25 days ago
Timing matters. A long term strategy for superior hardware might bear fruit too late.
HarHarVeryFunny25 days ago
I'm not sure about that - I expect AI is going to become a commodity market, so it doesn't matter how late you are if you've got a cheaper price.
In terms of who's got a lead on cheap (non-NVIDIA) hardware, I guess you have to give it to Google who are on their 6th generation TPU.
vardump25 days ago
I wonder how Tesla's training computer Dojo is doing. Although I guess there's a reason for buying so much Nvidia hardware...
theckel25 days ago
Curious where you saw the Microsoft/Cerebras experimentation noted online? That's very interesting.
HarHarVeryFunny25 days ago
It was mentioned in Jack Clark's (Anthropic) "Import AI" newsletter.
https://jack-clark.net/2025/02/17/import-ai-400-distillation...
gmerc25 days ago
DeepSeek just showed the compute edge is not that hard to match. They could have chosen to keep the gains proprietary but probably made good money playing the market instead, quants as they are.
https://centreforaileadership.org/resources/deepseeks_narrat...
If you’re using your compute capacity at 1.25% efficiency, you are not going to win because your iteration time is just going to be too long to stay competitive.
scarmig25 days ago
Software and algorithmic improvements diffuse faster than hardware, even with attempts to keep them secret. Maybe a company doubles the efficiency, but in 3 months, it's leaked and everyone is using it. And then the compute edge becomes that much more durable.
mirekrusin25 days ago
Optimisation efforts don’t negate investment in capacity but multiply output.
Tycho25 days ago
Sorry, you missed the point - DeepSeek tried some new software ideas, they did not manage to secure the same computation capacity.
gmerc25 days ago
They achieved the same results for 1.25% of the computation cost... If they actually had that computation capacity, it would be game over in the AGI race by the same logic.
acchow25 days ago
> but a compute edge is hard to match.
xAI bought hardware off the open market. Their compute edge could disappear in a month if Google or Amazon wanted to raise their compute by a whole xAI.
Tycho25 days ago
Not if there’s a hardware shortage.
acchow25 days ago
Ok, 2 months.
Remember, the new B200s have 2.2x the performance of xAI's current H100 "hardware edge". So it only takes an order half the size.
Or you could order the old H100 instead and avoid the B200 shortage.
niceice25 days ago
[flagged]
bnralt25 days ago
There seems to be a coordinated effort to control the narrative. Grok3's release is pretty important, no matter what you think of it, and initially this story quickly fell off the front page, likely from malicious mass flagging.
One thing that's taken over Reddit and unfortunately has spread to the rest of the internet is people thinking of themselves as online activists, who are saving the world by controlling what people can talk about and steering the conversation in the direction they want it to go. It's becoming harder and harder to have a normal conversation without someone trying to derail it with their own personal crusade.
Avshalom25 days ago
>Grok3's release is pretty important
How? After an enormous investment, the latest version of some software is a bit better than previous versions of some software from its competitors, and will likely be worse than future versions from its competitors. There's nothing novel about this.
niceice25 days ago
They just started, the velocity of xAI is novel.
NVIDIA's CEO Jensen Huang: “Building a massive [supercomputer] factory in the short time that was done, that is superhuman. There's only one person in the world who could do that. What Elon and the xAI team did is singular. Never been done before.”
Avshalom25 days ago
>only one person in the world who could do that. What Elon and the xAI team
That is literally more than one person.
nozzlegear25 days ago
One billionaire glazing another because it might enrich himself further hardly seems noteworthy. That quote is superfluous at best.
shytey25 days ago
The largest supercluster in the world, created in a short time frame, is pretty important. Typically 4 years, cut down to 19 days. That's an incredible achievement and I, along with many others, think it's important.
https://nvidianews.nvidia.com/news/spectrum-x-ethernet-netwo...
https://www.tomshardware.com/pc-components/gpus/elon-musk-to...
Avshalom25 days ago
Okay but that's obviously a nonsense claim. Find me a computer on the https://en.wikipedia.org/wiki/TOP500 that was built 4 years after the chips it uses debuted.
H100s aren't even 3 years old.
raphman25 days ago
> There seems to be a coordinated effort to control the narrative.
Do you have any evidence for this? Who would want to coordinate such an effort, and how would they manipulate HN users to comment/vote in a certain way? I think it is far more plausible that some people on here have similar views.
> [people] controlling what people can talk about
That's called 'moderation' and protects communities against trolls and timewasters, no?
> and steering the conversation in the direction they want it to go
That's exactly what conversation is about, I'd say. Of course I want to talk about stuff that I am interested in, and convince others of my arguments. How is this unfortunate?
llm_nerd25 days ago
>Grok3's release is pretty important
Is it? It's Yet Another LLM, barely pipping competitors at cherry-picked comparisons. DeepSeek R1 was news entirely because of the minuscule resources it was trained on (with an innovative new approach), and this "pretty important" Grok release beats it in Chatbot Arena by a whole 3%.
We're at the point where this stuff isn't that big of news unless something really jumps ahead. Like all of the new Gemini models and approaches got zero attention on here. Which is fair because it's basically "Company with big money puts out slightly better model".
I'd say Grok 3 is getting exactly the normal attention, but there is a "Leave Britney Alone" contingent who need to run to the defence.
BluSyn25 days ago
Noticed this also. It doesn’t feel organic.
beepbopboopp25 days ago
I mean, the honest truth is something closer to:
We have no clue how all this is going to play out, what value is capturable, and which parts of a lead are likely to stay protected. This race is essentially the collective belief in a generationally big prize and no idea how it unlocks.
The problem with that for a comment section is it reduces ALL comments to gossip and guessing, which makes people feel stupid.
ansley25 days ago
i think it's astroturfing
api25 days ago
Reddit today feels like it's absolutely overrun by bots. So much of the comment content is so superficial and cookie-cutter I find it hard to believe it's all produced by human beings. A lot of it reads like the output of small cheap LLMs of the sort that would be used for spam bots.
Of course we know X, Facebook, and probably most other social media is also overrun by bots. I don't think you can assume that humans are on the other end anymore.
jaco625 days ago
[dead]
kmac_25 days ago
The point is that it is inefficient. Others achieved similar results much cheaper, meaning they can go much further. Compute is important, but model architecture and compute methods still outweigh it.
HarHarVeryFunny25 days ago
How quickly will Grok 4/5/6 be released? Of course you can choose to keep running older GPUs for years, but if you want bleeding edge performance then you need to upgrade, so I'm not sure how many model generations the cost can really be spread over.
Also, what isn't clear is how RL-based reasoning model training compute requirements compares to earlier models. OpenAI have announced that GPT 4.5 will be their last non-reasoning model, so it seems we're definitely at a transition point now.
gmerc25 days ago
At current efficiency? Not nearly as fast as DeepSeek 4 ;)
gmerc25 days ago
None of which explains this massive waste of money for zero gain.
Larrikin25 days ago
It's not going to be from this unless it's forced upon us by the federal government. All the other companies are ahead and aren't just going to stop.
doctorpangloss25 days ago
> xAI also announced a few days ago they are starting an internal video game studio.
Ha ha. I'm sure their play to claim airdrop idle game will be groundbreaking.
nilkn25 days ago
xAI is not trying to make an immediate profit -- ironically, just like DeepSeek. They will undoubtedly use more efficient training processes in future runs and they will scale that across their massive GPU cluster. Just because they didn't cancel the training of Grok 3 and start over absolutely does not mean they will not incorporate all the work from R1 and more in the next run.
What you're seeing right now is pure flex and a signal for the future and competition. A much maligned AI team that hasn't even been around for very long at all just matched or topped the competition without making use of the latest training techniques yet. The message this is intended to send is that xAI is a serious player in the space.
ctoth25 days ago
> DeepSeek trained r1 for 1.25% ($5M) of that money (using the same spot price) on 2048 crippled export H800s and is maybe a month behind.
This is a great example of how a misleading narrative can take hold and dominate discussion even when it's fundamentally incorrect.
SemiAnalysis documents that DeepSeek has spent well over $500M on GPUs alone, with total infrastructure costs around $2.5B when including operating costs[0].
The more-interesting question is probably why do people keep repeating this? Why do they want it to be true so badly?
[0]: https://semianalysis.com/2025/01/31/deepseek-debates/#:~:tex...
tempusalaria25 days ago
SemiAnalysis is wrong. They just made their numbers up (among many other things they have invented - they are not to be trusted). I have observed many errors of understanding, analysis and calculation in their writing.
DeepSeek R1 is literally an open-weight model. It has <40bln active parameters. We know that for a fact. A model of that size is definitely roughly optimally trained over the claimed time period and server hours. In fact, the 70bln parameter Llama 3 model used almost exactly the same compute as the DeepSeek V3/R1 claims (which makes sense, as you would expect a bit less efficiency from the H800 and from the complex DeepSeek MoE architecture).
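A rough sanity check of this argument can be done with the common ~6·N·D FLOPs rule for transformer training (compute ≈ 6 × active parameters × tokens). The sketch below uses publicly reported figures for DeepSeek-V3 and an assumed sustained per-GPU throughput, which is an illustrative number, not a measured one:

```python
# Back-of-envelope training compute via the common ~6 * N * D FLOPs rule.
# Parameter/token figures are as publicly reported for DeepSeek-V3;
# the sustained throughput is an assumption for illustration only.

active_params = 37e9   # ~37B *active* parameters per token (MoE)
tokens = 14.8e12       # ~14.8T pretraining tokens

total_flops = 6 * active_params * tokens  # ~3.3e24 FLOPs

# Assume ~400 TFLOPS sustained per H800 (low-precision, partial utilization).
sustained_flops_per_gpu = 4.0e14
gpu_hours = total_flops / sustained_flops_per_gpu / 3600

print(f"total FLOPs: {total_flops:.2e}")     # -> total FLOPs: 3.29e+24
print(f"GPU-hours:   {gpu_hours/1e6:.1f}M")  # -> GPU-hours:   2.3M
```

Under these assumptions the estimate lands in the low millions of GPU-hours, the same order as the ~2.8M H800 GPU-hours DeepSeek reported, which is the sense in which the claimed training budget is at least internally consistent.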
FergusArgyll24 days ago
Active parameters is definitely the wrong metric to use for evaluating the cost to train a model
consumer45125 days ago
> For what? There is no ROI at that price point. There is no monetization potential.
It appears that LLM chat interfaces will replace Google SERPs as the arbiters of truth. Getting people to use your LLM allows you to push your world view. Pushing his "unique" world view appears to be the most important thing to modern Musk.
In that light, paying 40B for Twitter, and billions for Grok training makes perfect sense.
screye25 days ago
It's a race for AGI, a VC's wet dream.
The beauty of a failed investment is that it never goes below zero. So upside is the only thing they care about. Why invest in a near-zero chance for a random SAAS to take off, when you can invest in a near-zero chance of creating superhuman artificial life?
rtsil25 days ago
> It's a race for AGI, a VC's wet dream.
Yes but why? This is what I really don't understand.
Say AGI is achieved within a reasonable timeframe. Odds are that no single company will achieve it; there will be no monopoly. If that's the case, where is the trillion dollars of value for investors? From every claim we hear about it, AGI will lead to hundreds of millions of jobs disappearing (all white-collar jobs), and tens of millions of companies disappearing (all the companies that provide human-produced services). Who is going to buy your AGI-made products or services when nobody is paid anymore, when other companies, big and small, have ceased to exist? Sure, you can make extraordinary accomplishments and advance humanity far, far ahead, but who is going to pay for that? Even states won't be able to pay if their taxable population (individuals and corporations) disappears.
So where will the money come from? How does it work?
bparsons25 days ago
Also, profitability won't materialize in an environment with so many competitors offering comparable products. Perfect competition destroys profit. The good becomes a commodity, and the price people will pay simply becomes the marginal cost of production (or in this case, less, while the dumb money is still chasing the hype).
gmerc25 days ago
Works well when you see the company stuffing dollar bills into their sports car to race at 1.25% fuel efficiency against a Chinese family sedan with a hand-tuned ICE.
mempko25 days ago
A failed investment never goes below zero for the investor. For everyone else on the other hand...
gordonhart25 days ago
As a consumer, I'm just happy that base models are improving again after a ~quarter or more of relative stagnation (last big base model drop was Sonnet v2 in October). Many use cases can't use o1, r1, or o3[-mini] due to the additional reasoning latency.
FergusArgyll24 days ago
Yes, and the scaling laws survive! so, hopefully, more on the way...
> due to the additional reasoning latency.
They're also less creative for non-STEM topics
tinyhouse25 days ago
DeepSeek wouldn't be able to train R1 without their ~600B parameters base model, so you should consider the cost of that model when you compare with Grok.
In any case, Elon won't win this race because the best talent will not work for him. He used to have a good reputation and a lot of money, which is a deadly combination. Now he only has the latter - not enough when leading AI people can make 7 figures at other companies.
To be clear 1: I'm not saying that people who currently work on Grok are not great. It's not about hiring some great people. It's about competing in the long run - people with other options (e.g. offers from leading AI labs) are more likely to accept those offers than joining his research lab.
To be clear 2: I'm not talking about Elon's reputation due to his politics. I'm only talking about his reputation as an employer.
He has the vision and marketing skills but it's not going to be enough for leading the AI race.
andy12_25 days ago
Actually, the 5 million figure is for the compute cost for the base 600B parameter model. Training R1 was just 8000 steps of reinforcement learning, so I expect that the vast, vast majority of the training cost is already included in the pretraining stage.
gmerc25 days ago
It's not like Grok 3 didn't have previous work to build on either, but point taken.
I think the situations are a bit comparable given timelines however.
A perfect analogy for AI … your ability to replace talent with money. And if you don’t have the talent, it’s gonna cost you 100x more.
tastyfreeze25 days ago
> your ability to replace talent with money
That sure seems to be the message given in Apple AI commercials. From those commercials the tag line for AI should be "enabling idiots everywhere".
[deleted]25 days agocollapsed
submeta25 days ago
> until Claude4 snuffs it out later this month
Any source? I’m a heavy user of Claude and pay for the Teams plan just for myself so I won’t get throttled. Love it. But I’ve been impressed with O1 Pro lately. That said, I don’t like paying both €166 for Claude Teams and €238 for OpenAI Pro. :)
[deleted]25 days agocollapsed
dragonwriter25 days ago
> This all by the man in charge of “government spending efficiency”.
Per court filings by the administration, Musk is not in charge of DOGE, nor does he have any role in DOGE, nor any decision-making function in government at all, he is a White House advisor unconnected to DOGE.
soheil25 days ago
[flagged]
gmerc25 days ago
[flagged]
shytey25 days ago
What is hilarious is your disdain for their achievements which occurred in less than two years. This is just the beginning.
jejeyyy7725 days ago
what makes you think there won't be an ROI?
gmerc25 days ago
I think at this point you're going to have to answer "what makes you think there will be any"
belter25 days ago
> There is no monetisation potential.
DOGE uses only X links, and I am sure Grok will be the next gov contract. After all, he has all the data on everybody, down to your IRS tax returns.
nicce25 days ago
How is this even legal? Don't they have any sort of competitive tendering?
4277282725 days ago
We are long past the rule of law in the US. Whatever is left is residual, running on fumes. China-style corruption is here to stay.
matthewdgreen25 days ago
Corruption aside, China is run by smart leaders who execute on a long-term plan, and are gradually extending their influence over the world. The US is doing the opposite.
4277282724 days ago
It's easy to execute on a long-term plan when your government is totalitarian, run by a cult of personality, and has no concern for individual rights.
matthewdgreen24 days ago
The US has executed on long-term plans in the past. We're just choosing not to right now. This is something we need to change, and very quickly.
Larrikin25 days ago
Why do you think legality matters?
nicce25 days ago
I still have some hope. Everyone who is capable should challenge these in court.
CamperBob225 days ago
And then what? "They've made their decision, let's see if they can enforce it."
Capturing the executive turns out to be the winning move. Maybe it's what Gödel saw coming (https://en.wikipedia.org/wiki/G%C3%B6del%27s_Loophole).
Hasu25 days ago
If you think an executive that ignores court orders is going to survive for a long time in America, I am willing to bet any amount of dollars against you at any odds.
It's a good bet for me, because if I lose, dollars won't be worth anything anyway.
nicce25 days ago
Because that is the time when the rest of the people realize, that this is serious. Sooner is better.
matwood25 days ago
Yep, hence 'constitutional crisis'.
matwood25 days ago
> How is this even legal?
You're talking about Musk and Trump. Legality doesn't even enter into the conversation.
breakitmakeit25 days ago
[dead]
lionkor25 days ago
I don't understand how and why Grok would be related to "understanding the nature of the universe", as Musk puts it. Please correct me if I'm wrong, but they basically just burned more cash than any human should have to buy Nvidia GPUs and make them predict natural language, right? So, they are somewhat on-par with all the other companies that did the same.
This is not innovation, this is baseless hype over a mediocre technology. I use AI every day, so it's not like I don't see its uses, it's just not that big of a deal.
InsideOutSanta25 days ago
There are two answers to this.
Answer 1: Some people think that LLMs are a path to the singularity, a self-improving intelligent program that will vastly exceed human intelligence and will be able to increase its knowledge exponentially, quickly answering all answerable scientific questions.
Answer 2: LLM companies need to keep the hype train rolling. I didn't watch the whole clip; I jumped around a bit, but I noticed that every time Musk interjected something, it was to exaggerate what was previously said. "Grok contains the whole internet"—"the whole of human knowledge, actually!"
I think that both answer 1 and answer 2 apply to Musk. He seems to believe that they're building a god-like entity, and he also needs to keep the money train rolling.
randomcarbloke25 days ago
>he also needs to keep the money train rolling.
This and only this. Everything he says when talking about how good his products are is lies and exaggeration to get investors - from the promise of two manned missions to Mars in 2024, to a 300-ton payload in space, and FSD.
Whatever it takes to pad the wallet.
Lutger25 days ago
There's a more short-term goal for Grok, which is to replace what is left of the federal government with AI. That will significantly boost the money train, but is also a utopian (for some, dystopian for others) goal of replacing the expensive 'deep state' with a slim set of impartial algorithms.
widdershins25 days ago
LLMs don't seem to be very impartial so far. Quite the opposite in fact, they're entirely beholden to the prejudices of their trainers.
sebzim450025 days ago
Thankfully grok does not seem to have the bias the Elon promised on twitter.
lionkor25 days ago
Glad I don't live in the US, that sounds like a miserable idea.
amazingamazing25 days ago
Is there a citation for this?
captainclam25 days ago
"Elon Musk Ally Tells Staff ‘AI-First’ Is the Future of Key Government Agency" from Wired
This isn't unequivocal proof, but the broad goal automation lends itself pretty strongly to LLMs, and oh boy what LLM technology do you think they want to use.
[deleted]25 days agocollapsed
voidUpdate25 days ago
I'm pretty sure it doesn't contain the whole of human knowledge... I doubt grok knows my youtube password or my bank PIN
amarcheschi25 days ago
an ai god would fit well with the dark enlightenment ideas of musk and his cronies
smeeger25 days ago
he may have deleted it but… around 2020 or so there was starting to be a lot of hype about llms. elon musk responded to a "doomer" on twitter saying that he "didn't see the potential for that", referring to LLMs achieving AGI. it was a 100% dismissal of everything he is saying now. at that point elon musk had already been saying publicly for years that "AI is more dangerous than nukes." but he also had voluntarily walked away from openAI, which he would never have done if he thought there was any chance of AGI. i just want to know the truth… is this really just advanced search, and some jobs will be lost because they ended up being nothing more than search tasks (ie coding boilerplate), or are we really on the cusp of AGI (and therefore in a great deal of danger)? its impossible to say whether or not elon musk really believes what he is saying… there are public figures on both sides providing conflicting explanations.
as i watched the grok3 stream i became very angry. so very tired of being jerked around and not knowing whether or not i should be planning for the future or investing in the world as it is now… its really a form of psychological torture
DHolzer25 days ago
I work in AI and love the technology. But all the hype and grandiose claims make it awkward when people ask what I do, and it makes hiring harder when experienced developers hear 'AI development' and walk away - even though it's mostly just solid full-stack engineering work.
lionkor25 days ago
I am always looking for roles, and I have pretty good full stack experience (a few years of C++, C#, some JS, TS, backend and frontend web, C, Zig, Rust, built a few hobby compilers and other stuff).
I apply to pretty much every job that sounds reasonably good in terms of work-life balance, but I completely ignore anything that says AI. I really, really, really do not want to be part of a company that lies to itself, and so far all AI companies look like they are. It's not AGI. It's not gonna be AGI. Ride the hype train, cash out and lay off 80% of the workforce and jump on the next hype train, whatever. But don't hope that people who want a stable job want to hop on something that delivers such a shaky definition of value.
stuartjohnson1225 days ago
Even if you're an AI-skeptic, it's hard to argue that companies building AI customer support for example aren't en route to improving the whole "calling your ISP's team in India" experience.
lionkor24 days ago
Absolutely, but that doesn't seem to be most companies I see
crocowhile25 days ago
You must be new to Elon's modus operandi.
awongh25 days ago
I absolutely hate the Elon hypetrain, but I also don't understand the social media hate I see for AI, like comparing every ChatGPT answer to one wasted bottle of water.
Can we stop for a second and just marvel at a new piece of human ingenuity? Let's not give Elon too much credit, but I think that AI as a whole helps us all understand the nature of intelligence, and therefore humans' place in the universe.
One of the fundamental questions of human existence is: what does it mean to exist and think? Every time we build a new human-like thing it helps us understand the context of our own existence. (Not just computers or AI, but also airplanes, factories, etc.)
True AGI would force us to rethink what it means to be a thinking human being, and I think current LLMs already should and do.
coldpie25 days ago
> I also don't understand the social media hate I see for AI, like comparing every ChatGPT answer to one wasted bottle of water. Can we stop for a second and just marvel at a new piece of human ingenuity?
I don't know, man. We're staring down the barrel of at best a WW3-event and at worst an extinction-event. We're doing absolutely nothing to stop it, even though we have all the answers and the resources to do so. Instead, we're making the problem even worse all so some marketers and scammers can spend someone else's money to generate garbage pictures and SEO spam, so the worst people on the planet can gain even more money and power than they already have.
I'd love to be positive about this tech, I'm sure it's cool or whatever, but it's really hard to be positive about anything right now, especially when the tech in question is speeding us straight along the path to mass death. The world sucks and the people running the LLM stuff are amoral monsters putting all of their resources into making it worse. I'm not excited about any of this.
samvher25 days ago
What's happening definitely makes me nervous, but "at best a WW3-event and at worst an extinction-event" seems a bit much. Mainly because there are a _lot_ of unknowns. Better try to get comfortable with just riding this out.
coldpie25 days ago
It really isn't. Climate change is going to make large amounts of land unlivable. That's going to cause a climate refugee crisis. I agree the effects of that refugee crisis are unknown, but I can't see any resolution that doesn't involve increased nationalism, civil wars, and violent resource conflicts. Given this is a global crisis, that's a recipe for WW3.
This was all avoidable, of course. But instead of fixing it, we spent decades fiddling around with toys like LLMs. Whee.
lionkor25 days ago
LLMs don't make me question what we know about humans and thinking. They are really good at convincing us that they're good, but really, that's other humans building stuff to convince us that it's good. There is no intelligence here, other than the perceived intelligence of predicting words intelligent people have written previously.
awongh25 days ago
> There is no intelligence here, other than the perceived intelligence of predicting words intelligent people have written previously.
I think this is my main point- isn't it amazing that a thing that predicts words other humans have previously written manages to appear intelligent, or, more pointedly, have utility in communicating real thoughts and ideas?
If you've ever asked an LLM a question and gotten a satisfying answer, that means that there is some human-level intelligence somewhere in the token filtering / recombinating that an LLM does.
Specifically I think the test of human-like intelligence is literally the output- If we get utility from the arrangements of the tokens it outputs, that in and of itself demonstrates that some portion of human intelligence could be this same token generation mechanic.
ngneer25 days ago
No. Just means we are easy to fool. Like apes who see themselves in the mirror and fail to recognize they are seeing themselves in the mirror, thinking it is a different ape (and trying to mate with or attack it).
awongh24 days ago
The invention of the mirror by humans probably provided an interesting insight into our own existence... I wonder what it would have been like to see your own reflection for the first time as a technology. How would that change your outlook on your self-hood and identity?
ngneer23 days ago
Great question to ponder. Surely people would have seen their own reflections in water, but the mirror itself would have made "Reflection Technology" for "Artificial Introspection" more scalable. I suspect the mirror offered modern people a new viewpoint, allowing one to see how one is perceived by others. I do not think selfhood and identity would have been affected. My main question is about when people came to behave differently than apes. Douglas Hofstadter's "The Mind's I" may have a few hints on perception of self for you.
psytrancefan25 days ago
It does make me question humans and thinking but in the opposite direction.
It is like sitting down at a piano, sight reading a piece from sheet music and then someone who has no idea what they are talking about claiming you composed the music on the fly. Then when you point out the sheet music they just double down on some bullshit as to why they are still right and that is still composing even though obviously it is not.
ngneer25 days ago
Best analogy so far. I am adopting this for the next wave of "wait until the next model" and "but humans hallucinate, too" comments. Yes, when we feed back our own output (language on the web) into ourselves, things become tricky to tease apart, and it would seem like intelligence to us. Then again, the mechanical turk appears intelligent, too. If we point out how it works, then the "magic" should vanish.
highfrequency25 days ago
> There is no intelligence here
Can you list a few demonstrations from a text-outputting computer program that would each convince you that there is intelligence here? Eg writing a bestselling novel, proving an unsolved number theory conjecture, etc. Or is your belief uncontestable?
ngneer25 days ago
That's not really a fair question. To answer it, the OP would have to define intelligence. If you have done so already, then by all means, do share your definition. If not, then you are in no better position to claim intelligence than the OP is in claiming lack thereof.
esafak25 days ago
I think he just needs the model to be installed in a humanoid robot.
Davidzheng25 days ago
It's cringe but not so much more than deepmind's OG "solve intelligence then use it to solve everything else"
butifnot070125 days ago
I feel like that's part of what Elon is flexing. Teslabot was a latecomer compared to competitors like BD.
Elon is showing off that he can marshal enough resources and talent to be on par (kinda) with state-of-the-art products in crazy time. That's been most of his superpower so far - not breakthrough tech that didn't exist before. We've had rockets before.
mjamesaustin25 days ago
I don't like Elon either, but not only has SpaceX created breakthrough tech that didn't exist by landing an orbital class rocket, as of today still nobody else has done it.
Landing a rocket was considered impossible and unthinkable 10 years ago, and then SpaceX completely changed the game. And they're reinventing rocket tech again with Starship by catching it midair.
GoatInGrey25 days ago
It still blows my mind that nobody has meaningfully replicated the Falcon 9 after thirteen years of it flying commercially.
jprd25 days ago
We can all thank Gwynne Shotwell for this though.
everfrustrated25 days ago
Have you listened to her interviews? You can find some on YouTube.
I'm sure she has been very helpful in navigating the US govt/NASA bureaucracy and winning SpaceX deals, but she's clearly not a visionary.
melodyogonna25 days ago
Gwynne Shotwell works for Elon Musk.
TFYS25 days ago
The number of people that have the capital and connections required to even attempt such things is very small, so it's not necessarily Musk's abilities that made those things happen, just the combination of having the power to allocate enough resources and an interest in such things.
somenameforme25 days ago
He started both Tesla and SpaceX when he had "only" a few hundred million to his name and no more connections than would be expected of a Silicon Valley guy making payment software. And lots of brilliant guys, including John Carmack for instance, have tried their hand at aerospace - and failed. Jeff Bezos started Blue Origin before SpaceX was even founded, and it was literally only last month that they finally managed to get a rocket into orbit for the first time. There's a joke in the industry: 'How do you become a millionaire in the aerospace industry? Start out as a billionaire in the aerospace industry!'
And we live in a world of millions of millionaires, and thousands of billionaires. For that matter, even China is trying their hardest to replicate SpaceX tech given all the resources of the world's largest economy, 1.4 billion people (meaning a proportionally larger chunk of intellectual outliers), and de facto authoritarian power to make it all happen. Yet they remain (in terms of rocket technology) behind SpaceX.
TFYS25 days ago
Being the most successful out of three or even a dozen doesn't make someone exceptional. Because so few people with interest in space have "only" a few hundred million, we can't really say if it's actually his talent that made it possible or simply the result of having access to resources that the vast majority of people could never dream of.
The U.S. has a long history of aerospace innovation, from NASA to private contractors, and Musk was able to use this ecosystem. China doesn't have that.
somenameforme24 days ago
The WEF cites a global space economy of $630 billion, alongside investments of $70 billion. [1] And as anybody with half a head on their shoulders can see, space is where the big future economic growth will come from. Even if somebody has zero interest in space, which I think is very few people, that's where the next 'big boom' in economics will come from. And SpaceX was started on a fraction of $0.3 billion, with Carmack and Bezos just being a couple of names people on here would be familiar with, amongst tens of thousands. Yet no competitor is anywhere to be found.
And the US doesn't have a long history of aerospace innovation. In 1962 Kennedy gave his 'to the Moon' speech, 7 years later in 1969 we'd go from having nothing to putting a man on the Moon. From 1969 (well 1972 in particular) to the birth of SpaceX (early 2000s) US space technology not only stagnated but regressed. This is why Boeing (who was a major part of the original space race) can't manage to even begin to replicate what we achieved in the 60s, in 7 years no less!
Incidentally this is also a big part of what motivated Elon to start SpaceX. He was looking at NASA's future plans for human spaceflight and they were basically nonexistent. So he wanted to launch a greenhouse to Mars and stream it growing, to inspire people and hopefully get things moving in the right direction again. NASA wasn't interested in any such things, the Russians wanted too much $$$, and so SpaceX was born.
[1] - https://www.weforum.org/stories/2024/04/space-economy-techno...
beAbU25 days ago
Nit: Musk is not a Tesla founder. He bought his right to be called that for $6M.
everfrustrated25 days ago
While technically correct, the Tesla that Elon bought has basically nothing in common with current Tesla.
somenameforme25 days ago
When Musk 'joined' Tesla it was a name and two other guys. The latter two of whom left the company before a single car had been produced. They then sued for the right to be called founders a couple of years after they left, and once it became clear the company would stand a reasonable chance of success.
megaman82125 days ago
If my memory serves me correctly, they had put some Sony Handycam batteries on a chassis and driven it around before Musk. Musk was there for every actual product and its development.
cristiancavalli24 days ago
This is a patently false retelling — check your sources.
gonzobonzo25 days ago
> I feel like that's part of what Elon is flexing. Teslabot was a latecomer compared to competitors like BD.
When it comes to bipedal robots, Tesla is far ahead of Boston Dynamics in terms of actually creating a product.
lossolo25 days ago
Have you seen Unitree robots? They started mass producing them. It's a Chinese company.
wyclif25 days ago
> We've had rockets before
Yeah, but we didn't have reusable orbital rockets, and that's a distinction with a big difference.
kubb25 days ago
You don’t understand the Musk business model. It has been the same for years. His wealth doesn’t come from his products, but from his fanbase buying his stock. The purpose of everything he does is to influence the public opinion to make him the tech genius of today in the collective psyche.
Of course, he needs to do impressive things, stuff that a normal person wouldn’t have the resources to achieve. It’s similar to Mr. Beast’s channel on YouTube, just on a way bigger scale. Do things that people can’t see anywhere else.
Musk’s money will come from his fans. And ETFs, trust funds and such will amplify this when he reaches a certain market cap. His crypto coins are the exact same scheme. Once you stop thinking in classic business school terms, it starts making way more sense.
Some of his ventures actually produce value! But that’s not where the money comes from. It comes from the belief, the adoration and the celebrity status that he has.
This is the real power in today's world. People need to know you from the screen. This clout catapulted him into the government of the US, made him the wealthiest man in the world, and gave him license to do anything he wants publicly without repercussions.
boxed25 days ago
> His profits don’t come from his products, but from his fanbase buying his stock.
SpaceX is private, Starlink makes real money from real users.
> Everything he does is done to influence the public opinion to make him the tech genius of today in the collective psyche.
Well that's clearly not right. He's doing a lot of things to make himself seem like a total tool that we should all boycott no matter how good the products are. If he actually did what you say, he wouldn't be burning all these bridges.
kubb25 days ago
The bulk of his wealth is the Tesla stock. I know that SpaceX produces value. Some of his ventures do. But the image of a genius entrepreneur is way more valuable than any government contract he might get.
Remember he was way less crazy before his market cap skyrocketed. Now he can afford to be polarizing as a PR strategy, once his fanbase has reached a certain critical mass. He's been constantly testing what works.
sebzim450025 days ago
He'd still be one of the richest people in the world just based on his stake in SpaceX (IIRC valued over $100B)
melodyogonna25 days ago
He would be one of the richest people in the world even if he didn't start both SpaceX and Tesla, just from what he made from the sale of Paypal. Hell, he'll be one of the richest people in the world even if he didn't start the original X and just lived off the $13m he made from Zip2. $13m still places you in the top 0.5% in the world today.
XorNot25 days ago
SpaceX is not where most of his valuation comes from: it principally comes from Tesla stock, which is vastly, hilariously overvalued compared to its performance as a car company, and definitely compared to its performance as a technology company.
Even more importantly, analysis of Elon's tweeting patterns versus Tesla stock valuation makes it obvious why Twitter became so central to him[1] - it was a massive driver of Tesla stock value. Buying it was a good move from the perspective that he really couldn't afford to be banned from Twitter.
[1] https://www.sciencepublishinggroup.com/article/10.11648/j.ij...
boxed24 days ago
> Buying [twitter] was a good move from the perspective that he really couldn't afford to be banned from Twitter.
Another good move would have been to not be an ass. I mean, if he really did care about the stock price like this thread implies. Being seen as a genius entrepreneur doesn't imply you should also be a racist conspiracy theory nut. I'll repeat it again: the thesis that he's doing all these things due to competency is absurd. Never attribute to competence what can be sufficiently explained by stupidity.
He's doing these things because he's lost his marbles. Trying to make it out like he's doing it for reasonable reasons is like trying to claim Trump is playing 5-D chess. It's the same as QAnon logic. It just falls flat against Occam's razor.
rvnx25 days ago
There is an exception: Tesla FSD (the US version, not the horrible EU version), which is rather cool and impressive, and unbeaten in the market for now.
Though free and open-source solutions are not that bad like https://github.com/ApolloAuto/apollo
But the build quality of the Tesla car itself, omg. It feels like a cardboard box with an amazing battery.
kubb25 days ago
See my other comment. Some of his ventures do produce value. That’s not where the money comes from.
slig25 days ago
>his fanbase buying his stock
That doesn't make sense as most (66%) of the stock is owned by institutions. [1]
[1]: https://www.marketbeat.com/stocks/NASDAQ/TSLA/institutional-...
staticman224 days ago
Only people engaged in active buying and selling set the price of Tesla stock. It's called "price discovery." Any institution such as an index fund passively holding stock does not affect the value of the stock, so the percentage of institutional ownership doesn't itself matter.
kubb25 days ago
Institutions are organized people. They aren’t more immune to the information firehose than me or you.
rs18625 days ago
That's pretty normal for a company like this. Also, let's be honest, based on historic data, it has been a good investment.
beezlewax25 days ago
I'm interested in what you're saying about classical business terms. Can you elaborate on that a little? I've always found these kinds of people hard to understand.
The man has almost unlimited wealth and his motivations seem consistently petty and strange or just downright ludicrous. He's like an alien to me.
I've noted the same feeling when seeing VCs/business people speak when I've encountered them.
kubb25 days ago
The classic business is about producing valuable economic outputs and creating a stable revenue flow from bringing them to market.
The modern era post-business is about dipping into everyone’s pockets, by securing cash flow from the stock market and the government.
Here building a profitable business model is less important than convincing people and the government to give you that dough. And the best way to do it is to have clout.
FergusArgyll24 days ago
> his motivations seem consistently petty and strange or just downright ludicrous. He's like an alien to me.
I think it could help to try to think of a historical figure that has done impressive things but which you don't have an overly negative view of. A lot of them seem really weird or alien. In democracies, political leaders are (sometimes!) more "normal" because they have to get elected. So think of a CEO/Founder you like (Jobs?) or earlier people (Napoleon? I dunno, pick yours)
Read a bio on them, they're pretty strange (I like Churchill, dude was wild). It seems that to do extraordinary things you need some traits that make you ludicrous. I don't really know, but it's definitely a pattern
superflow25 days ago
This is 100% false.
rs18625 days ago
How so? Care to provide any meaningful information to support that?
l33tc0de25 days ago
[flagged]
kandesbunzler25 days ago
[flagged]
cbg025 days ago
Refrain from personal attacks and try to come up with some arguments if you disagree.
KTibow25 days ago
It's not much better than DeepSeek's old slogan "Unravel the mystery of AGI with curiosity. Answer the essential question with long-termism."
croes25 days ago
Musk promises revolution and sells evolution.
Promises FSD, sells EVs.
Promises Mars colony, sells self-landing rockets and satellite internet.
Promises faster tunnel boring, sells a smaller tunnel boring machine that drills smaller tunnels.
Promises less corruption and bureaucracy, just fires people.
glimshe25 days ago
Overpromising and underdelivering are the cornerstones of advertising/marketing/sales. "Use this deodorant and a gorgeous woman will want you".
Do you know why people do it? Because it works.
croes25 days ago
That's great for a startup, but bad for nuclear weapons and financial data.
starspangled24 days ago
In fact, governments are among the worst offenders I can think of at overpromising and underdelivering.
And that's not just Trump's government either; how about this whopper? https://edition.cnn.com/videos/politics/2019/06/12/joe-biden...
torlok25 days ago
Is this the first time you're hearing Elon Musk speak? His entire public presence consists of stuttering out vapid sentences like these.
psytrancefan25 days ago
[flagged]
7bit25 days ago
[flagged]
superflow25 days ago
You sound a little hurt. Why the hate?
mschoch25 days ago
[dead]
thenayr25 days ago
[dead]
oulipo25 days ago
[flagged]
justinbaker8425 days ago
[flagged]
abrahamepton25 days ago
[flagged]
ckbishop25 days ago
[flagged]
makerofthings25 days ago
[flagged]
yuppii25 days ago
[flagged]
abrahamepton25 days ago
You can’t name one single lie. If you can, please do.
ddxv25 days ago
What lie are you talking about? That he supports far-right parties? That he did some kind of gesture which in the video looks like a Nazi salute, but which he says was just the "my heart goes out to you" gesture?
Neither seems to show that the OP was lying, but I understand you have a different opinion than they do.
Workaccount225 days ago
OP said he did a Nazi salute. He did not do a Nazi salute.
He did something that people who fetishize the downfall of their enemy desperately want to be a Nazi salute, so they call it that, but it was not a Nazi salute.
If it was, he would have said so, since at that point you are basically showing the world you are a Nazi. But that is not what happened.
People really don't take their credibility seriously, and will cry wolf at anything that moves, seriously undermining any argument they make.
There are ample ways to hate on Elon using factually true things he has done. Sticking to those makes your arguments rock solid, and keeps your credibility high. So much is wasted by idiots slinging smoke because it makes them feel right.
ddxv24 days ago
What if he does it again and just says "My heart goes out to you" while doing the same hand-throwing gesture? Just curious what you think in that hypothetical.
To me it seemed like he did it to troll 'libs' who dislike gestures like that.
There always exists a difference between what a person thinks of their own motive and how it is seen by others. In this case, while he might say he did one thing, many people took it to mean something else.
Both those are true.
abrahamepton25 days ago
If it looks like a Nazi salute and is done by a Nazi to honor another Nazi while adoring Nazis cheer, why do I care that you think it’s not Nazi?
maelito25 days ago
[flagged]
dankobgd25 days ago
[flagged]
MangoCoffee25 days ago
[flagged]
wyclif25 days ago
That's because here and on Reddit, there is a strong element of EDS (Elon Derangement Syndrome), which is characterized by not being able to discuss anything SpaceX, Tesla, X, The Boring Company, &c. do without completely politicizing it and completely avoiding talking about its technical or engineering merits.
patrick4urcloud25 days ago
[flagged]
apples_oranges25 days ago
I'm no longer considering a Tesla but I still think Starlink is great..
man425 days ago
[dead]
onepremise24 days ago
[flagged]
archagon24 days ago
I imagine it's been manually vouched by the mods.
Swoerd25 days ago
[flagged]
submeta25 days ago
[flagged]
onepremise25 days ago
[flagged]
swat53525 days ago
I will gladly give him more money. I have no vendetta against him or his actions. You're free to do as you please, but don't impose your political agenda on everyone.
nilespotter25 days ago
Same. I support everything that's going on at DOGE. There's a certain type around here that needs to get used to the fact that their political views do not enjoy industry wide hegemony.
cryptoegorophy25 days ago
What's wrong with DOGE? How is eliminating bureaucracy/spending a bad thing? This was done about 100 years ago; research what happened after. It feels like the people who scream the loudest are the ones leeching from the system.
moolcool25 days ago
What's wrong? Let me count the ways.
- Much of the spending they're eliminating is good and important (E.g. USAID).
- The way they're cutting is reckless (They accidentally fired nuclear safety workers).
- Many of the workers are Musk sycophants, who were hired from Twitter/Tesla/SpaceX.
- There's a tremendous conflict of interest in this agency being run by a massive government contractor (NASA and the military are both avid SpaceX customers).
- The workers are not experienced with the data they're working with, and misinterpret it constantly in public before verifying it.
- Despite claims of "transparency", Musk asserted that it was illegal to publish the names of his employees.
- Their one product, their government spending portal, is a worse version of a spending portal which already exists, and they didn't even secure their database.
- They say they "use AI" for their decisions. Which AI? Where is confidential data going?
- Do the staff have security clearance?
tonymet25 days ago
How does that weigh against the good they are doing? The spending crisis is critical, so of course some collateral damage will be necessary.
ben_w24 days ago
The only "crisis" in US spending is that each party keeps shutting down the government when they don't like what the other one is doing. That's not going to go away with balanced books, but it sure is a recipe for disaster. It stops being a democracy when the decision making process stops being about majorities and starts being a game of "whoever doesn't blink sets the rules".
The country prints its own money; and right now it's the world reserve currency, giving the US a huge advantage when it comes to borrowing whatever it wants — the biggest threat to continuing to be the world reserve currency right now, is that the scale of cuts being talked about can only be met by cutting at least one of interest payments on government loans or things the US government considers to be "mandatory" such as pensions, social security, etc.
tonymet24 days ago
There are both immediate and long term consequences to the debt. In the short term we are paying a large share of revenue into debt servicing. We are all working months a year to pay interest. In the long term, we are impoverishing the next generations.
The govt can debase the currency by printing money. That only impoverishes people, except for the wealthy, causing a greater wealth gap.
ben_w24 days ago
> There are both immediate and long term consequences to the debt. In the short term we are paying a large share of revenue into debt servicing. We are all working months a year to pay interest. In the long term, we are impoverishing the next generations.
Only if your economy doesn't grow.
> The govt can debase the currency by printing money. That only impoverishes people , except for the wealthy, causing greater wealth gap.
It impoverishes lenders and savers, but not borrowers. It's not as simple as wealthy or poor, as any can be any.
tonymet24 days ago
That's not what's happened in the past 5 years. Assets have ballooned. Cost of living has skyrocketed. Wages have not kept pace. The rich got richer, the poor got poorer.
cryptoegorophy24 days ago
How is growing debt a good thing? How is cutting costs a bad thing? Have you seen the Argentina example? How did it turn out? Can the same thing be done in the USA? The only reason to say no is that someone is directly profiting from not cutting costs. If you know Musk's story, then you know why he is the best candidate to do so.
ben_w24 days ago
> How is growing dept a good thing?
You can afford it; it's fairly close to a neutral thing for a government to have debt.
Right now, the US gets to set the terms for those loans.
> How is cutting costs a bad thing?
Consider weight as an analogy: most people could do with losing weight, but losing weight by getting enthusiastic amateurs to perform a quadruple amputation is not advisable.
Musk's target can only be met by deleting your entire armed forces.
And then you have to find another $1.2 trillion.
So the military and the entire CIA, FBI, NSA, DHS, NASA, Federal Highway Administration, FAA, the Department of Agriculture, the Department of Commerce, Department of Justice, the Treasury, …
… all that plus the military still doesn't add up to Musk's target.
Unless you want to cut stuff that's considered "mandatory" (like military pensions), or the interest repayments on the very loans you wish you didn't have.
b5983125 days ago
First two points are pretty much the same and ignore all nuance.
Third point is opinion, at best.
Fourth point; couldn't this be said about any politician?
Fifth point; So you're saying that no outside group is capable of auditing.
I'll stop there. You've drunk the Kool-Aid.
moolcool24 days ago
The world’s richest man is gutting the regulatory bodies which were designed to keep him in check, and you're defending it. I'd say you've drunk the Kool-Aid.
cryptoegorophy24 days ago
No, it seems like you are against cost cutting, and it doesn't make sense why. The USA is a democracy; if after 4 years things go south, you can always vote differently and get everything back the way it was. The most probable outcome here is that the financial situation in the USA will get better. You make it sound like the USA would collapse to the Stone Age; it won't.
JackYoustra25 days ago
We have a mechanism for that: it's called Congress. DOGE is an executive abuse of power, one so incompetent that it has fired staff in national-security-critical roles and then scrambled to rehire them.
Empact25 days ago
What happens when Congress is asleep at the wheel or in on the take?
JackYoustra25 days ago
You campaign for other congresspeople and try and get them thrown out. Congressional representatives are very receptive to constituent calls.
I mean, we could also do your plan and just hand off essentially dictatorial power over spending to the richest man in the country.
tmpz2225 days ago
Seriously the number of people willing to burn down the whole government over grievances that often haven’t even impacted them is incredibly scary.
You do not throw out the baby with the bath water.
Empact24 days ago
Everyone in the United States is impacted by government insolvency and the economic collapse it will inevitably lead to if unchecked.
Everyone is impacted by the fact that money-printer-fueled government spending crowds out private spending / investment / growth.
JackYoustra24 days ago
and your solution is empowering a dictator and... hoping it goes well?
Empact24 days ago
Article 2 says “the executive Power shall be vested in a President of the United States of America.” That means the power to “enforce laws, manage national affairs, and oversee government agencies.”
It’s Congress’s role to allocate funds to certain purposes, and the President’s to “take Care that the Laws be faithfully executed,” including overseeing the bureaucracy that implements them.
The President already has dictatorial power over the bureaucracy, as per the Constitution.
JackYoustra24 days ago
Not over spending, which is what Elon's seizure of payments infrastructure has created.
Unless you extend it to that, in which case why have courts? It's not like they have enforcement power, and the president can stretch discretion to the limit.
wklauss25 days ago
I don't think that's what DOGE is doing. It seems extremely vindictive and ideological in the way it's acting, and time will tell, but I would not be surprised if it ends up costing the taxpayers more in the long run.
cryptoegorophy24 days ago
Wouldn't Argentina be a good example of what DOGE is doing now? Financially it has been a good experiment for Argentina. What are the cons?
wklauss24 days ago
Argentina and US are very different countries, starting these cuts with very different economic realities. For example, 55% of all registered workers are employed by the government in Argentina. Although not a directly comparable metric (since in the US you also need to account for state and local civil workers), the US federal government employs around 3 million people. That's just 1.87% of the entire civilian workforce.
Again, DOGE operates from the premise that the federal government is bloated. Although this is a very popular message, I'd love to see some more objective data to support this and I doubt that CDC or USAID are the agencies where the bloat is. Like I said, their actions seem vindictive and careless. Also, likely to result in legal cases that will drag for years and end up costing taxpayer more than the supposed savings.
The main con is that once you fire the workers that you thought you didn't need (but that you did indeed need) hiring them back becomes more expensive and a lengthy process. Some of the firings are already causing chaos in vital teams among several agencies and have forced DOGE to try to reverse course (bird flu monitoring, nuclear response...).
And that's not to mention the dire situation you put the people you are firing in. Laying off people from their jobs is never "an experiment" unless you are willing to suspend every trace of empathy.
tene80i25 days ago
Eliminating waste is a great idea. But it’s unclear that that’s all he’s doing, it’s unclear how or how well it’s being done, he’s brought in people without security review (which means they, and the systems they are opening up and creating, can be more easily compromised by our enemies), and he has enormous conflicts of interest.
ben_w25 days ago
Eliminating superfluous bureaucracy is fine.
Note that DOGE fired, and is struggling to rehire, the team whose job was to maintain America's nuclear arsenal.
Also note that the stated goal of DOGE, $2T, exceeds the entire discretionary budget of the federal government, even though half the discretionary budget is your entire active military.
Even treating $2T as a stretch goal, eliminating literally everything but the military from the discretionary budget doesn't reach the lower $1T that Musk says he has a "good shot" at.
Cuts on this scale directly, all by themselves, even aside from all the other economic disasters that America expressly chose to vote for as a package deal with this, will shrink the US economy by about 6%.
nonrandomstring25 days ago
US Americans seem quite green, have never had this done to them before on such a scale and so haven't seen the trick before. The art of the hostile takeover. [0] Curtis documents it well, but at tedious length unless you're British, on the Slater, Goldsmith and Rowland gang. "Efficiency" is an entry point, a common bullshit word that's a perfect cover for hostile takeover, because nobody argues with it - it's a "STFU and agree" word [1].
pton_xd25 days ago
There's nothing wrong with it. Our current deficit is increasing something like 1 trillion every 3 months. Getting creative to reduce government spending seems necessary at this point.
All the hysteria over this is just partisan politics as usual.
JackYoustra25 days ago
Doge disrupts everything except for spending: https://www.economist.com/finance-and-economics/2025/02/12/e...
Stop falling for branding and actually concentrate on the numbers: spending is going up, not down, and only touching entitlement programs, the military, or offsetting monetary loosening (via debt interest) will change that.
NicoJuicy25 days ago
Well.
Only announce things your opponents did, or even lie about it, and spread the hate.
I'm 100% sure he won't touch the COVID-19 subsidies for the hospitality sector, which Trump was quite happy about during his term.
Or the expensive hotel stays for his security at Trump hotels :)
29athrowaway25 days ago
DOGE has revealed that 400 million are rookie numbers when it comes to phony deals.
Forcing the government to spend money has always been the infinite money glitch.
On one side you have healthcare and pharma companies making sure everything is excessively overpriced, then they lobby the government to make sure everyone has government sponsored healthcare, i.e.: turn all tax revenue into healthcare revenue. Then they pay the media to convince everyone that it is their moral obligation to subsidize $1000000 insulin while making it a taboo to ask why healthcare is so expensive.
On another side you have mass incarceration where each inmate costs more per night than a 5 star hotel.
On another side you have nonsense conflicts where the weapons of choice are thousands of single-use weapons that cost at least $100,000 each. Or weapons simply left behind for the enemy, so they have to be repurchased.
On another side you have tax loopholes of billions of dollars.
Everyone is stealing. Did you pay 30% tax and then sales tax on everything you bought with your income? Is your effective tax rate around 50%? then you worked 6 months for the government so they can take that 6 months of your life and turn it into a dumb single use weapon to destroy a $1 tent.
JackYoustra25 days ago
Specifics are the enemy of populism:
- healthcare and pharma is overpriced because of information frictions, institutional supply constraints (this mostly means strict immigration controls), and people just really want healthcare relative to other wants! See: https://www.reddit.com/r/badeconomics/comments/1gsezga/comme...
Also, Biden capped insulin at a fixed price.
- mass incarceration is a SUPER populist thing! How many times do we hear "we need to be tough on crime"? This sure seems like the voters behind it, it's not like people are clamoring for shorter sentences.
- Indeed, our military is expensive partially because we require domestic production and have to pay really high domestic costs, and partially because the way the US fights war places a SUPER high value on human life. Desert Storm was both expensive and resulted in only ~200 coalition deaths to take down the world's 4th largest military, whereas Russia has not made it very far into Ukraine and has taken over 200k deaths (and it hasn't even been substantially cheaper to boot, just a bit cheaper).
- The tax loopholes exist and are bad, although I challenge you to give me specific loopholes that cost high billions and should obviously be repealed in a way that both constituents are clamoring for and the representatives don't actually do. I don't think they exist.
You know what does actually degrade the fiber of the country? The richest person in the world taking personal control over every payment and arbitrarily destroying institutional knowledge by firing every government employee he has control over or who stands up to him. But no, instead we get "he's saving money" when (see the earlier comment from me) we're not even making outlays go down! A script kiddie who randomly rewrites lines into bad assembly while destroying the build system isn't a perf eng, they're a danger.
29athrowaway25 days ago
In the US healthcare system a bag of water with salt costs hundreds of dollars. You can cap the price of one thing and then the whack a mole game starts.
The conflicts in the Middle East cost trillions of dollars and there is absolutely nothing to show for it.
Quantitative easing cost trillions of dollars, and most of the people responsible for causing the crisis got a big payday from it.
If someone became "the richest person", you can probably learn something from that person. Without SpaceX, the US would have to use Russian rockets to put stuff in space because the NASA shuttles were retired. Is that something you would like more?
The US is on the verge of bankruptcy, and it is not because of $400m in trucks.
And it is not a Democrat or Republican issue, as I said, everyone is getting rich at the expense of the taxpayer, even taxpayers that haven't even been born yet are in debt already thanks to a wasteful mentality.
JackYoustra24 days ago
Would you bet the republic on your understanding of QE? I sure hope not
29athrowaway24 days ago
Whatever the solution to the crisis, I think it should have involved some jail time.
programmerpass24 days ago
Agreed
bpodgursky25 days ago
The $400m Cybertruck purchase was planned months ago under Biden.
There are many criticisms founded in genuine conflicts of interest; it helps everyone to stick to those.
braincat3141525 days ago
I work, and the fact that my tax money is going into a black hole makes my blood boil. God bless Musk and DOGE for what they do.
Here is just one headline from today: "The Elon Musk-led Department of Government Efficiency (DOGE) on Monday revealed its finding that $4.7 trillion in disbursements by the US Treasury are 'almost impossible' to trace, thanks to a rampant disregard for the basic accounting practice of using tracking codes when dishing out money."
JackYoustra25 days ago
Will you go on the record and say that $4.7T in a year is fraudulent or misspent? I want to be crystal clear with what you're insinuating, because that's a massive amount of money, easily the biggest fraud of all time by a factor of almost 30.
tim33325 days ago
That's not what he said. He said the accounting is bad making it impossible to know how much is misspent.
JackYoustra25 days ago
I don't understand what this means, though. Almost all of our money passes an audit, which necessarily has a paper trail. The few agencies that don't usually have very idiosyncratic audit misses which are, in any event, overseen by inspectors general (or were, until Trump fired all of them), who have been very zealous about jumping on this.
joshuamcginnis25 days ago
That's not accurate. The Pentagon, for example, has not passed its annual financial audit since it was first required to undergo them in 2018.
programmerpass24 days ago
This is definitely not accurate.
benjaminwootton25 days ago
Neither the poster nor the headline says they are all fraudulent. It says the payments are nearly impossible to trace.
JackYoustra25 days ago
so then misspent? Must be, because if it's not misspent then "impossible to trace" is a little irrelevant. It can't be unaudited, because every department passes an audit every year (except the DoD, but they basically pass an audit and the reasons they currently don't are mostly technical)
programmerpass24 days ago
Well, it’s going to have to pass the Elon Misk audit now. Bless his efforts.
braincat3141525 days ago
"insinuate -- suggest or hint (something bad or reprehensible) in an indirect and unpleasant way."
I am not "insinuating" but saying that I would like to know where my money goes. If you pay taxes, would not you?
JackYoustra24 days ago
Can you not find it? I can find basically any spending data I want at the tip of my fingers (well, less so now that it's unclear what's being paid) - anything specific that you feel is missing that you want to see?
braincat3141524 days ago
If you know where the 4.7T went, please reach out to Elon.
JackYoustra24 days ago
I. don't. know. what he's talking about. That's the problem. If I tell you "I'm thinking of $5T, tell me where it is", where does that leave you?
ben_w24 days ago
> Here is just one headline from today, The Elon Musk-led Department of Government Efficiency (DOGE) on Monday revealed its finding that $4.7 trillion in disbursements by the US Treasury are "almost impossible" to trace, thanks to a rampant disregard for the basic accounting practice of using of tracking codes when dishing out money.
And you believe them?
This is a department that fired multiple nuclear weapons inspection and maintenance teams without knowing what their jobs were.
Had to re-hire them. They weren't redundant teams. DOGE just didn't understand what they (or the teams) were doing.
Now, I'm very happy for the US nuclear stockpile to shrink. I sure think you have too many of them. But then, I'm foreign and a hippy, so I would. But (1) do you?, and (2) do you want it to shrink by some of them accidentally exploding? Or being relocated by a hostile power taking advantage of the inspectors all being on early retirement?
braincat3141524 days ago
I am not jumping to conclusions and will reserve the judgement for later. They provided no proof so far, but hopefully it will be forthcoming, and I would not dismiss their claim outright.
Relocated where and by whom? Just curious.
ben_w24 days ago
> Relocated where and by whom? Just curious.
As I'm asking you if you want this done, take your pick.
Loss of oversight made a bunch of USSR suitcase nukes, ~100 or so, go walkabout when they collapsed. Russia denies this, of course. They might be fine, or not, nobody (in public) really knows. Probably not a huge risk without maintenance, if you nick it but don't know what it is you might scrap it for parts and mistake the core for tungsten or lead, but… not great, unless it was existing nuclear powers who took them.
And even then, not great for Russia.
braincat3141524 days ago
They deny that, but of course you know for sure that they are lying, that the nukes went missing, and you have the proof. Just like I know for a fact that there are alien craft hidden in Area 51.
ben_w24 days ago
It was a statement made by General Aleksandr Lebed, former Secretary of the Russian National Security Council, in a meeting with an American congressional delegation.
Here's the first US government report I found on it with all of the entirely negligible effort I am willing to entertain: https://commdocs.house.gov/committees/security/has078010.000...
Perhaps he was drunk, or lying, or just plain unable to find the people who knew which cupboard the devices were safely locked in. But he did make those claims. And you are missing the wood for the trees.
braincat3141524 days ago
This "wood" (and the US report) consists of exactly one person who made this claim, and a member of corrupt Yeltsin's entourage to boot. I'd say if these nukes were real, they would have exploded somewhere by now. Try harder.
[deleted] 25 days ago
programmerpass25 days ago
Most people, based on my experience, would rather support Elon Musk than support a strategy recommended by an individual who believes that the MSM should be trusted.
Not to mention that most of your sources to support your points are from far left MSM sources.
Your reasoning is exactly why there is so much support for Elon Musk. You probably made more Elon Musk fans just by your post.
Most people seem to believe the government is broken and MSM is a huge reason for this.
hackyhacky24 days ago
> Most people seem to believe the government is broken and MSM is a huge reason for this.
Is it because Fox News, the most influential channel of the so-called MSM, constantly repeats conservative talking points about the alleged inefficiencies of the government and downplays the government's important work in protecting citizens?
Americans will soon get to experience what a real broken government is like, and I hope it provides them an education.
programmerpass24 days ago
MSM is corrupt. That was the point. It doesn't matter which side.
JackYoustra25 days ago
a fact-free post. Nowhere here is "he reduced spending by x" or "firing y is good."
It's all vibes, the deficit could double and the vibes would stay the same, he could be dictator and the vibes would never change.
programmerpass24 days ago
You need to be vibe checked.
tjmehta25 days ago
I believe you are mistaken about the timeline and details here. The $400m Tesla procurement was initiated under Biden. Trump actually cancelled it.
archagon25 days ago
People who buy into Grok are willingly submitting themselves to the far-right propaganda machine. I’m sure it’s nice and tidied up for release, but there is zero chance that Musk will not use this tool to push his ideological agenda given its reach and impact.
andrewinardeer25 days ago
So the Rust code it generates for me has a right-of-centre bias?
verisimi25 days ago
Your code will list and go round in wide circles.
[deleted] 25 days ago
falleng0d24 days ago
Well, if you see any kind of propaganda, you can denounce it, and I'm sure this community will respond just like we responded to Chinese propaganda and censorship in DeepSeek.
No good will come from denying progress just because you don't like someone else's opinions and worldviews.
[deleted] 25 days ago
braincat3141525 days ago
Was Hitler mentioned yet?
[deleted] 25 days ago
verisimi25 days ago
> My question is how may of you are actually willing to give Musk more money after the questionable, legal, and ethical behavior he's exhibited while working for DOGE.
Oh yes, far better to give to alt-man, google or Facebook - those are morally responsible companies!
onepremise25 days ago
Far better to give elected representatives this responsibility, IMHO. This is bizarre. There are a lot of billionaires jockeying for influence and resources here. It's almost like a free-for-all. Musk could also use his position to force Sam Altman's hand in the acquisition of OpenAI, https://www.wsj.com/tech/elon-musk-sam-altman-relationship-6.... I'm not interested in either party, but it's clear there are huge conflicts of interest here. Musk also expressed disappointment when not getting a piece of this pie, https://www.axios.com/2025/01/22/stargate-elon-musk-trump-al.... I've also read more concerning material regarding JD Vance's connections with Peter Thiel and their interest in sidelining the constitution for some other efforts prepping for "networked states", https://www.nytimes.com/2025/01/18/magazine/curtis-yarvin-in.... Much of this is impossible to follow closely. Like I said, this administration seems to be flooding the zone with shit to distract others from what their real intent is. I think it's worth vetting and questioning positions in government; you can't just blindly trust these people. Something seems really off. I say question everything at this point. I don't trust billionaires to fix the world's problems. Democracy and the constitution should be upheld and well guarded.
JackYoustra25 days ago
I mean relatively speaking yes, only one of them is acting as an unelected dictator, circumventing our whole constitutional appropriations process by taking direct control over payment infrastructure.
tmpz2225 days ago
I read GP to say we need to be skeptical of all LLM providers which I think is a fair point.
Saying we need to be skeptical about OpenAI (haha Open) does not mean we support Musk.
quyleanh25 days ago
[flagged]
gnabgib25 days ago
This is a question for x, not the submitter. Twitter & X links still list twitter as the canonical URL (which HN uses)
danielbln25 days ago
Even better, share xcancel.com links.
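If you want to do that rewrite mechanically, here's a minimal sketch (it assumes xcancel.com mirrors the same path structure as x.com/twitter.com, which is how such mirrors typically work):

```python
from urllib.parse import urlsplit, urlunsplit

def to_xcancel(url: str) -> str:
    """Rewrite x.com / twitter.com links to xcancel.com, keeping the path."""
    parts = urlsplit(url)
    host = parts.netloc.lower().removeprefix("www.")
    if host in ("x.com", "twitter.com"):
        parts = parts._replace(netloc="xcancel.com")
    return urlunsplit(parts)

print(to_xcancel("https://x.com/karpathy/status/1891720635363254772"))
# https://xcancel.com/karpathy/status/1891720635363254772
```

Links to other hosts pass through unchanged, so it's safe to run over a whole comment.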
dang25 days ago
Nostalgia I suppose.
chrisco25525 days ago
[flagged]
mnewme25 days ago
Because running a company is different to running a state and we doubt his intentions, not his skills.
Someone who is not elected gets $8 million/day from the government and now oversees the government with some 20-year-old fanatics who can't even put up a secure website for DOGE?
chrisco25525 days ago
Yeah, most of the government isn't elected, that's how it works. You only vote for 3 roles in the Federal government (I guess 4 if you count the veep). The rest of them are hired.
For those downvoting, the roles are President, Vice President, Senator, and Representative.
Outside of that, everyone is hired / appointed.
addandsubtract25 days ago
Being appointed is still different from being hired. It's a process that includes oversight and background checks.
mnewme25 days ago
Yes but in most modern societies we have laws against conflict of interest…
givinguflac25 days ago
“Outside of that, everyone is hired / appointed.”
Tell me you don’t know how the US government works without telling me you don’t know how the US government works, why don’t you?
chrisco25525 days ago
Article II, Section 2, Clause 2 of the US Constitution:
""[The President] shall nominate, and by and with the Advice and Consent of the Senate, shall appoint Ambassadors, other public Ministers and Consuls, Judges of the supreme Court, and all other Officers of the United States, whose Appointments are not herein otherwise provided for, and which shall be established by Law: *but the Congress may by Law vest the Appointment of such inferior Officers, as they think proper, in the President alone, in the Courts of Law, or in the Heads of Departments.*"
Emphasis on the last sentence. There have been a plethora of such vestments in the Executive branch over the decades.
5 U.S.C. § 105 - This statute authorizes the creation of "necessary agencies" within the Executive Office, giving the President flexibility to establish entities like the USDS and staff them as needed, subject to funding.
5 U.S.C. § 3101 - This law states that "each Executive agency" (including the EOP) "may employ such number of employees ... as Congress may appropriate for." It implies broad authority to hire staff, with Congress controlling the budget but not necessarily the individual appointments.
Excepted Service Authority (5 U.S.C. § 3301 and Schedule A) - Under 5 CFR § 213.3102, agencies like the OMB can use Schedule A hiring authority for positions requiring specialized skills (e.g., tech expertise) that aren’t practical to fill through standard civil service exams.
This authority, delegated by Congress via the Civil Service Reform Act of 1978 (Public Law 95-454) and regulations from the Office of Personnel Management (OPM), allows the President (or OMB leadership) to appoint USDS personnel directly.
Term Appointments - Many roles are temporary or term-limited (e.g., 2-4 years), often filled by detailees from other agencies or private-sector experts. These don’t require Senate confirmation because they aren’t permanent "officers." This flexibility is supported by 5 U.S.C. § 3161, which allows temporary organizations within the executive branch to hire staff for specific projects.
givinguflac25 days ago
Thanks, I’ve read the constitution too. Ever heard of congress, full of elected officials? Or the senate? Your claim that there are so few elected officials is patently absurd.
derektank25 days ago
Because he and his organization have demonstrated ignorance of the services he's not only auditing, but making pretty substantial cuts to. One example I'm familiar with, cutting up to 10% of the personnel to the Technology Transformation Services at GSA is quite likely to reduce the efficiency of both government and private sector government contractors.
agubelu25 days ago
Because he owns companies that contract with the government and are affected by its policies. It's the very definition of a conflict of interest.
And he's not even "auditing" the government. When you're auditing, you emit a report that the audited party later analyzes and acts upon. He's been given free rein to fire government workers as he pleases, as if he's an elected officer, which he's not.
chrisco25525 days ago
Washington has been known for revolving doors among particular industries for quite a few decades! Why the hoopla over this one?
They are auditing as part of their process of cutting costs. They're literally tracing trillions of dollars in financial records.
He's doing everything he's doing by executive order of the President of the United States, who was elected.
viraptor25 days ago
> He's doing everything he's doing by executive order of the President of the United States, who was elected.
And already has a number of lawsuits started because he's trying to do things neither he nor the president are allowed to do. Getting an EO to do something doesn't mean it's automatically legal. Multiple big decisions have already been reverted or are held until judges can review them. Even things like the promised payout for quitting are not practical, because only congress can approve the money for that.
chrisco25525 days ago
There's lawsuits naturally as lawfare is a normal part of modern politics. All the laws necessary to do payouts for voluntary separation already exist, as long as it fits within the budgetary appropriations already set by Congress.
viraptor25 days ago
Correct, and this one didn't. LegalEagle posts good summaries of the actual legal failures of those.
ddxv25 days ago
I'm sure he's trying his best. But I don't doubt that, even if not doing it on purpose, he will mostly cut departments and services that do not hurt him or indirectly benefit any of his many businesses.
He, a single person, has far too much control of our system.
Mekoloto25 days ago
Let's just let it slide then?
That's not how it works.
Btw, I think having the richest man in the world in his current position is very, very unique.
FreeRadical25 days ago
Success doesn’t imply honesty, good faith or absence of bias. You already know this.
Mekoloto25 days ago
Look who builds it. It's not Musk. It's his money that bought smart people.
If he does to the US government what he did to Twitter, he will destroy the brand, reduce the workforce by 80%, and reduce the value by 80% too.
The issue with him is that, at Twitter, the affected people had money. A missed payment from the US can literally kill people.
sebzim450025 days ago
Do you believe that Elon regrets acquiring Twitter? Despite being constantly told how much he was fucking up, it seems to have worked out OK for him.
weavejester25 days ago
It's lost 80% of its value in 2 years, which usually isn't great. The most charitable view of X/Twitter is that it's now a propaganda platform that Musk doesn't mind taking a loss on in order to enact political change.
Mekoloto25 days ago
I'm pretty sure he doesn't like that he hasn't been able to make it more successful, but I don't believe he regrets it.
He would have regretted it if it hadn't played out like it currently has (and the game he is currently playing is not finished yet). He said in an interview that he is putting everything on one card now.
Edit: Also, he gets a lot of his valuation from being a cult figure or whatever. Among the companies trying to survive the AI phase we are in right now, if he can't get the nazi people on his side to buy his stuff, he is a very high risk.
He destroyed Twitter's brand and Bluesky emerged. He is destroying Tesla while other carmakers gain ground. SpaceX needs a lot of subsidies, and his goal for Mars is only a cult topic, not a financial success topic.
croes25 days ago
Because he brought coders to a financial audit.
Wrong tools.
chrisco25525 days ago
I disagree. I believe engineers are generally smarter than accountants.
steve_adams_8625 days ago
But are they well-versed in the things accountants specialize in? Is there a possibility that not every programmer can be a good accountant, or that accountants know things you're unaware of when you wrote that statement?
chrisco25525 days ago
Even better, they're well versed in things that accountants aren't. When you're auditing trillions of dollars in spending, it helps to have software, data science and analytics experts that can use modern tools beyond COBOL written 62 years ago.
They can use data processing, detect anomalies better, leverage AI models, automate data extraction from analog records, ingest unstructured data like emails and memos, build complex financial dependency graphs, detect leaks, build custom scrapers, etc etc.
I'm sure there's at least one accountant in the loop, but you really want the team to consist mostly of data nerds.
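For what it's worth, the "detect anomalies" capability mentioned above is a real technique; here's a toy sketch of one common approach (made-up payment figures; MAD-based scoring is my choice for the example, not anything any actual audit team is known to run):

```python
import statistics

def flag_anomalies(amounts, threshold=5.0):
    """Flag amounts far from the median, scaled by the median absolute
    deviation (MAD). This is more robust to extreme outliers than a plain
    z-score, which a single huge payment can mask."""
    med = statistics.median(amounts)
    mad = statistics.median(abs(a - med) for a in amounts)
    return [a for a in amounts if mad and abs(a - med) / mad > threshold]

# Toy payment data: six ordinary disbursements and one screaming outlier
payments = [120, 95, 130, 110, 105, 98, 1_000_000]
print(flag_anomalies(payments))  # [1000000]
```

Note that the hard part in real audits isn't the scoring but knowing which flagged payments are legitimately large (payroll runs, bond redemptions), which is exactly the domain knowledge the replies below point at.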
steve_adams_8625 days ago
What about GAAP/IFRS? How do you endow these software engineers with knowledge of common patterns of fraud or leaks so they can actually write the correct software to find them automatically? How do they identify material misstatements?
You also seem confused; COBOL might be used, but it isn't the only tool available to accountants working for the government. COBOL is a straw-man. What you're describing here—software engineers who presumably have training in accounting—already exists, and they work inside and out of the government. This is an existing career path.
You're speaking about this as though you know a better way to do something, but it's already happening, and has been for years. Accountants aren't writing 62 year old programming languages waiting to die in their chairs while the world continues to progress without them.
Accounting just about anywhere you find it is already accomplished by accountants, some of them technically trained, as well as data scientists and software engineers. It's an interdisciplinary collaboration in any serious organization.
rgavuliak25 days ago
As a person who works with data and has done both consulting and product building in data science, I can say that lack of domain knowledge is what makes or breaks the end result. Too often technical people think they know better and then build mediocre solutions that don't get used.
steve_adams_8625 days ago
If technical people were so good at these things, technical people would have a hell of a lot more successful startups for one thing.
mschoch25 days ago
[dead]
viraptor25 days ago
An audit accountant can ask an engineer to implement whatever is needed to achieve a goal they understand. An engineer with no finance background will have no idea where to start or what questions they can ask an accountant.
Or they will have absolutely no idea about the context and for example reveal secret information while they think they're just looking at money https://www.huffpost.com/entry/elon-musk-doge-posts-classifi...
croes25 days ago
Accounting isn't about being smart.
42lux25 days ago
Let me guess you are an engineer?
tzury25 days ago
“Figures don’t lie, but liars can figure”.
You can easily get drowned in a sea of numbers and get confused and gaslighted unless you make sure all data is available and computable.
Not sure how this release, which is impressive by all means, transformed into an attack on DOGE, which is taking the exact approach startups take to disrupt an industry.
croes25 days ago
Because it's not their data.
How much disruption started with massive failures?
You don't start with a live system. Or did SpaceX put astronauts in their first rockets?
yuppii25 days ago
No one doubts his abilities, and reasonable people are grateful for his work with DOGE and his support of free speech. Unfortunately, this platform has become an echo chamber for mainstream media, merely repeating news and links from sources like The Verge/BBC/Politico etc. This is just a bias in the user groups. Still, we should hopefully put politics aside and focus on more tech-related subjects on this website :)
MaxGripe25 days ago
I think they just don't like him for his political views, and they feed themselves with mainstream media.
agubelu25 days ago
You should be more respectful of other people's intelligence. Not everyone who disagrees with you is brainwashed.
concordDance25 days ago
I think a lot of that comes from people thinking the eye-catching, memorable views they've read are the most common views, when they're really not and are more likely just the views of the most passionate 1%.
MaxGripe25 days ago
I used to play a lot of RPGs, and I believe that intelligence and wisdom are two separate traits. Not everyone who’s smart is actually wise.
chrisco25525 days ago
That was a perfectly respectable critique.
simondotau25 days ago
For the same reason I doubt Einstein's abilities as a painter. History is littered with the stories of smart people who tried to treat government dysfunction like corporate dysfunction and failed spectacularly.
chrisco25525 days ago
Ah, but would you doubt Feynman's?
https://www.themarginalian.org/2013/01/17/richard-feynman-of...
simondotau25 days ago
I would doubt Feynman's skill at parkour
KeplerBoy25 days ago
Because people have different opinions on which things the government should spend on. The objective function to optimize for is disputed.
belter25 days ago
As he says in the video: He does nothing
cess1125 days ago
The government puts money into his corporations, maybe you could elaborate on what extent said "success" is dependent on this relation?
chrisco25525 days ago
He bids for contracts just like anyone else and most of those contracts were won under Democrat presidents, but I digress. If you're saying government contractors can't work for the government, then you're going to have to explain the military-industrial complex to me.
steve_adams_8625 days ago
Government contractors can't work for the government in roles where there's a conflict of interest. Even in the MIC.
cess1125 days ago
I don't see the relevance.
dangus25 days ago
It’s not a doubt of abilities, it’s a doubt of his interests aligning with the interests of US citizens.
Here is an unelected NAZI and ILLEGAL IMMIGRANT (he worked illegally on a student visa) who did a sieg heil at the presidential inauguration, taking up an informal, unconfirmed-by-Congress department-head role (DOGE is just the US Digital Service renamed), getting wide access to government systems, and seemingly firing thousands of government employees.
Billionaires, who should not exist, are so rich that they don’t need government services and would rather the government go away so they can make more money. But regular people do need a government, and that’s just one reason of many why Elon shouldn’t be anywhere near policy decisions.
Now, you might say I'm being dramatic. But I'll say no criticism of this man is unfair. He is one of the world's biggest hypocrites, along with the other MAGA Nazis in his camp.
ncallaway25 days ago
[flagged]
lenkite25 days ago
How is he "looting" government funds ?
ncallaway25 days ago
The (unelected) richest person in the world, with a sprawling business empire that has many interactions with the federal government, has been given free rein with no oversight to fire any federal workers that he wants, and has usurped Congress's power of the purse by stopping Congressionally appropriated spending.
This creates numerous conflicts of interests and opportunities for self-dealing.
Consider a NASA employee who is awarding government contracts. They know all of the above. They have three bids in front of them: one from Boeing (lol), one from SpaceX, and one from Axiom Space. The NASA employee thinks the bid from Axiom Space is the best value and fits the requirements best. But will they select it, or will they select SpaceX, knowing that they could be fired tomorrow on Musk's whim?
Repeat this scenario across every interaction any of Musk's companies have with the federal government.
This isn't a novel scenario. Putin's Russia is a great example of what happens when oligarchs are granted significant autonomy over organs of the government. It is a system designed to facilitate corruption.
You could assuage my concerns, though, by describing the ways that there is effective oversight over Musk, or by describing the anti-corruption anti-self-dealing measures that have been imposed on Musk. The Press Secretary gave a statement on this saying: "As for concerns regarding conflicts of interest between Elon Musk and DOGE, President Trump has stated he will not allow conflicts, and Elon himself has committed to recusing himself from potential conflicts." That...does not resolve my concerns.
chrisco25525 days ago
Fixing the deficit is the opposite of looting, actually.
ncallaway25 days ago
> Fixing the deficit is the opposite of looting, actually.
I think this has two errors.
First, I don't agree that he's fixing the deficit. I think that's an assumption not in evidence. We'll see in a few years time, though. I'd be willing to bet in 4 years the deficit is > 0, and likely larger than it is today.
But let's assume arguendo that he is fixing the deficit. It's still possible to loot the treasury while fixing the deficit, which shows that they aren't actually the opposite.
Consider this example with completely made up numbers:
Before
- Revenue: $1T
- Defense Spending: $500B
- Benefits Spending: $1T
- Public Services Spending: $499.9B
- Government Contracts with Musk's Companies: $100M
The before scenario has $1T in revenue, and $2T in spending, for a deficit of $1T. Now, let's allow hypothetical Musk to have free-reign to "fix the deficit"
After
- Revenue: $800B
- Defense Spending: $300B
- Benefits Spending: $300B
- Public Services Spending: $100B
- Government Contracts with Musk's Companies: $100B
In this scenario the deficit has been reduced to $0, while Musk has enriched himself and his companies with $99.9B in government funds. This would be an extreme example of Musk looting the treasury, while still completely resolving the deficit.
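The arithmetic in the hypothetical checks out; here's a trivial sketch using the commenter's made-up figures (all numbers in $ billions, none of them real budget data):

```python
def deficit(revenue, spending):
    """Deficit = total spending minus revenue (all figures in $ billions)."""
    return round(sum(spending.values()) - revenue, 1)

# The commenter's made-up "before" and "after" budgets
before = {"defense": 500, "benefits": 1000, "services": 499.9, "musk": 0.1}
after = {"defense": 300, "benefits": 300, "services": 100, "musk": 100}

print(deficit(1000, before))  # 1000.0 -> the $1T deficit
print(deficit(800, after))    # 0 -> deficit "fixed"
print(round(after["musk"] - before["musk"], 1))  # 99.9 -> gain to Musk's companies
```

The point survives the check: a zero deficit and a $99.9B transfer to one contractor can coexist, so "fixing the deficit" and "looting" aren't opposites.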
btreecat25 days ago
> Fixing the deficit is the opposite of looting, actually.
What evidence is there that the current moves will lead to "fixing the deficit?"
Illegally redistributing appropriated spending could easily be understood as looting in most contexts. Not sure how this would be excluded.
steve_adams_8625 days ago
There is no solid evidence of a path to fixing the deficit at the moment.
There is no evidence of this happening, nor of a serviceable plan to do so.
All recovered expenses, to date, add up to a laughably small amount, and are one-time cutbacks. The strategy shows signs of costing the government in unexpected ways as well.
Most governments of developed nations operate in more sensible ways with clearer plans than this. I won't claim they are looting, but it's absurd to suggest they are fixing the deficit at the moment. The economy appears to be getting worse, not better.
Hamuko25 days ago
Let me guess: the deficit will be fixed without taking the axe to any contracts to Musk-affiliated companies like SpaceX.
FranzFerdiNaN25 days ago
[flagged]
chrisco25525 days ago
Hi Franz, it's 2025. We beat the Nazis 80 years ago. It's time to move on to the 21st century.
[deleted] 25 days ago
btreecat25 days ago
> Hi Franz, it's 2025. We beat the Nazis 80 years ago. It's time to move on to the 21st century.
Then why the hell are they still waving flags?
https://www.nbcnews.com/news/amp/rcna191304
Your bias is showing.
Rover22225 days ago
It's bizarre how many people believe that was literally meant as a Nazi salute.
fenomas25 days ago
Rule of Goats.
FirmwareBurner25 days ago
Nazis would incarcerate people in work camps and turn them into soap or hang them in public squares, which is kind of different from what Elon did. There's a pretty big gap between doing something in poor taste like Nazi salutes in public because you have the intellectual maturity of a 12-year-old edge-lord on Xbox Live seeking attention, and being an actual Nazi committing crimes against humanity.
When everyone goes around calling everyone they hate a Nazi, it only desensitizes people to real Nazi behavior, kind of like the boy who cried wolf, since there are people out there committing actual atrocities against humans going under the public radar because they never do Nazi salutes on camera to avoid drawing attention. So the Nazi term starts to lose any meaning, kind of like the overuse of calling everything "woke" today.
The problem is people as a whole act irrationally due to mob behavior, are too focused on optics, and judge based on feelings rather than facts; that's how we have actual criminal Nazis going free under the radar while innocent people are swatted and doxxed because they said something right-wing on social media. Not all Nazis today wear jackboots and do heils; plenty go about appearing like normal people in public. They could even be your neighbor, police officer, or local congressman.
So save your anger for those people instead, as Elon is just a 3-year-old throwing tantrums seeking attention: annoying, but relatively harmless. If people stopped giving him so much attention, he'd stop doing it.
computerthings25 days ago
[flagged]
FirmwareBurner25 days ago
>but not us
What makes you so sure/special in this regard? What are you gaining from this? If you were as enlightened as you claim, Germany wouldn't be in such a mess right now. If you spent as much effort securing your borders, energy independence and defence as you spend lecturing others on imaginary Nazis and banning hate speech on social media, you'd be a respectable world power right now, at the table with Trump and Putin, ending this war before it even happened. The "we know better than you" arrogance is Germany's biggest problem.
>This is basically blaming the thing criticized on the people criticizing it.
It's not blaming, I'm just telling you what the simplest solution is. Ignoring attention seekers is better than giving them more attention. Which is why I'm also gonna ignore your future comments from now on.
computerthings25 days ago
> What makes you so sure/special in this regard?
Germany? The scope of the discussion is already limited to the AfD. What makes Germans special in their opinions about Nazis in Germany is deep experience and knowledge of the subject you are belittling. And I'm also not spending any time banning hate speech on social media, heh. Whatever chip on your shoulder you have about Germany I can barely even decipher, and you're right, it's best to agree to disagree.
Ray2025 days ago
[flagged]
computerthings25 days ago
[flagged]
kristofferR25 days ago
[flagged]
BLKNSLVR25 days ago
I find it interesting that these two descriptions of news are treated as equivalent, where I think they're almost opposing:
"raw, unfiltered news"
"real, trustworthy news"
Raw and unfiltered almost cannot be "news" (by my definition of what I go seeking for as "news"). X provides raw, unfiltered information. But real, trustworthy news almost requires filtration in order to be deemed trustworthy.
cm218725 days ago
I think the theory is that community notes are a more neutral way to tag bad information, compared to whatever The New York Times and Fox News are doing.
beeflet25 days ago
modeless25 days ago
People actually trying the model report that it does not say anything like this when asked the same question. Elon somehow prompted the model to bash The Information for his screenshot.
aprilthird202125 days ago
I didn't know about this. But I asked it if Elon Musk and DOGE randomly firing as many government workers as they can from all federal departments might be dangerous to Americans and it was pretty honest that yeah, it could be.
spiderfarmer25 days ago
Tweet this @musk and he’ll make sure grok jumps in line.
staticman225 days ago
They've been training the model for a while, right? It's unlikely he could have known Trump would let him rampage through the federal government when they started training it.
ein0p25 days ago
[flagged]
BLKNSLVR25 days ago
So, uh, Fort Knox has been robbed then?
ein0p25 days ago
I guess we'll see won't we? There's no harm in checking, just like with everything else.
kristofferR25 days ago
You should watch this: https://www.youtube.com/watch?v=C27PlV_zijk
defrost25 days ago
"It's like looking for groceries in a landfill."
(Yeah, you'll find some edible food but it takes time and meanwhile you're covered in garbage)
kristofferR25 days ago
Yeah, that's a great analogy.
defrost25 days ago
Isn't it just?
It's from the opening lines of the closing segment of the video you linked. Not a bad discussion of the unfolding of news and social media responses following the shooting of Donald Trump's ear on the campaign trail.
aprilthird202125 days ago
Why did we start talking about news as "legacy media"?
I mean, at least it has journalistic standards and some semblance of fact-checking, compared to social media, which has given us great gaffes such as identifying the wrong Boston Marathon bomber (driving the poor guy to kill himself), wrongly identifying the Hispanic white supremacist shooter, and many many more.
staticman225 days ago
When someone here says they don't like the news I assume it's because the only newsworthy topic in their mind is "Just how COOL is the new Iphone? Very!"
Twitter is well suited to deliver the newest developments on this topic.
ein0p25 days ago
Here's why: https://news.gallup.com/poll/651977/americans-trust-media-re...
Mainstream outlets have viewerships that compare disfavorably to those of top youtubers. That has been the case for many years now. The only reason most mainstream outlets exist is the taxpayer money train that's coming to an abrupt stop as I write this.
aprilthird202124 days ago
Mainstream news has viewerships that compare disfavorably to sports games and sitcoms too. So what? It's not entertainment, it's informative. Of course they have different viewerships.
ein0p24 days ago
aprilthird202124 days ago
Your evidence is a YouTube video with 100 views? Wtf
ein0p24 days ago
The statement is true irrespective of the number of views. If you watch the news you are misinformed. You believe things that just aren't true at all. From time to time your view of the world clashes with reality, and you reject the reality and substitute it with your own. Many such cases.
dazzaji25 days ago
[flagged]
frotaur25 days ago
I'm very sorry if this isn't the case, but this message really feels LLM-written.
jorvi25 days ago
It's because of the em dashes (- is a normal dash, — is an em dash). Very few real people use those outside of writing books or longform articles.
There's also some strange wordings like "back-pocket tests."
It's 100% LLM generated.
What is much scarier is that those "quick reply" blurbs on Android/Gmail (and iOS?) will be able to be trained on your entire e-mail and WhatsApp history. That model will have your writing mannerisms and even be a stochastic mimic of your reasoning. So you won't even be able to tell that a model answered you, not a real person. And the initial message the model is responding to might itself be written by the other person's personal model.
The future of digital interactions might have some sort of cryptographic signing guaranteeing you're talking to a human being, perhaps even with blocked copy-pasting (or well, that part of the text shows up as unverified) and cheat detection.
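To make that idea concrete, here is a minimal, purely illustrative sketch of what such verification could look like, using a shared-secret HMAC from Python's standard library rather than real public-key signatures (the "authority" and its key are hypothetical):

```python
import hashlib
import hmac

# Assumption: a trusted "proof-of-human" authority holds a secret key and
# signs only messages it has verified were composed by a person. Receivers
# check the tag before trusting authorship. A real system would use
# public-key signatures (e.g. Ed25519) so anyone can verify without the key.
SECRET = b"authority-secret-key"  # hypothetical key material

def sign(message: str) -> str:
    """Produce an authenticity tag for a verified-human message."""
    return hmac.new(SECRET, message.encode(), hashlib.sha256).hexdigest()

def verify(message: str, tag: str) -> bool:
    """Check a message against its tag in constant time."""
    return hmac.compare_digest(sign(message), tag)

tag = sign("typed by a human")
print(verify("typed by a human", tag))   # True
print(verify("edited by a model", tag))  # False: any alteration breaks the tag
```

Copy-pasting or model-editing the text would invalidate the tag, which is the property the comment above is gesturing at.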
Going even a layer deeper / more meta: what does it ultimately matter? We humans yearn for connection, but for some reason that connection only feels genuine with another human. Whereas, what is the difference between a human typing a message to you, a human inhabiting a robot body, a model typing a message to you, and a model inhabiting a robot body, if they can all give you unique interactions?
Davidzheng25 days ago
I use em-dashes pretty often--it's a nice way to transition phrases...
nomadpenguin25 days ago
You're using two en dashes to approximate it -- few people have the en dash character on hand.
brulard25 days ago
People that care have it on hand. Option+Shift+dash on mac.
Freak_NL25 days ago
Everyone who uses a compose key has it available (via ---) — I do. You mean the em-dash though, not the en-dash, and Davidzheng is using hyphens for approximation, not en-dashes.
MattSteelblade25 days ago
I'm one of the 17 people that has Alt+0151 memorized
snet025 days ago
:*:\em::—  ; AutoHotkey hotstring: typing \em anywhere expands to an em dash
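For anyone keeping score in this subthread: the three look-alike characters are distinct Unicode code points, which a quick Python check confirms:

```python
# The hyphen, en dash, and em dash by Unicode code point.
chars = {
    "hyphen-minus": "-",       # U+002D, the key on every keyboard
    "en dash":      "\u2013",  # U+2013, used for ranges like 1939-1945
    "em dash":      "\u2014",  # U+2014, Alt+0151 on Windows, Option+Shift+- on macOS
}
for name, ch in chars.items():
    print(f"{name}: {ch} is U+{ord(ch):04X}")
```

So "--" is two hyphen-minus characters, not an en or em dash, which is why it reads as human-typed.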
dazzaji25 days ago
It’s gracious of you to say that you’d be sorry, and I did run my comment through 4o (perhaps ironically) which caught a slew of typos and weird grammar issues and offered some improvements. But the robotic sound and anything else you don’t like are my own responsibility. Do you, perhaps, have any thoughts on the substance of the comment?
infecto25 days ago
It's not the robotic sound or the content. 4o has very obvious tells when it has written (or rewritten) content. It uses an insane number of em dashes.
Igrom25 days ago
That's discomforting. My practice of sprinkling em-dashes like salt on a salad dates from my early days on various video game communities' forums. They comfortably mimic interrupted speech in writing. I hope I won't have to soon defend myself against accusations of AI usage just because I belong to the minority that read The Punctuation Guide[0] or a related resource.
infecto25 days ago
It's really the em dash along with superfluous language. I suspect you are fine. Models like 4o have a very specific pattern when folks don't specify their writing style.
joaohaas25 days ago
- Very 'forced' expressions (back-pocket tests, 'The analysis is razor-sharp')
- The fact you're glazing AI so much means you probably use it; it's like how it was with crypto bros during all the web3 stuff
- Lack of any substance: what does that post actually say? It regurgitates praise for the AI, but the only tangible feature you mention is that it can receive a URL as its input
thomashop25 days ago
It always feels like people are irrationally critical of AI assisted stuff. Does the typical Hacker News comment have more substance?
- Informally benchmarked against 4 specific competitors: Gemini, OpenAI, o3, and Claude
- Identified two concrete features: URL content ingestion and integrated search
- Noted specific limitations: search engine occasionally misses key resources
- Provided a real-world test case: consulting business analysis where it found new opportunities other models missed
infecto25 days ago
Hmmmm it is hard to really place the issue. I am very much in the bullish on AI camp but I don't like writing for the sake of writing and some of the models (4o in this case) have very obvious tells and write in such a way that it takes away from what substance may exist.
snet025 days ago
One thing that concerns me is when you can't tell whether the comment was authored or just edited by AI. I'm uncomfortable with the idea that HN threads and reddit comments gradually tend towards the grey generic writing style of LLMs, but I don't really mind (save for the prospect of people not learning things they might otherwise!) when comments are edited (i.e. minor changes) for the sake of cleanliness or fixing issues.
joaohaas25 days ago
I just re-read the post twice and I couldn't find any of the points you mentioned (again, other than using URLs in the input):
- Informal Benchmarks: I'm sorry, what? He mentions 'It’s picking up on nuances—and even uncovering entirely new angles—that other models have overlooked' and 'identified an entirely new sphere of possibility that I hadn’t seen nor had any of the other top models'. Not only is that complete horseshit by itself, it does not benchmark in any way or form against the mentioned competitors. It's the exact stuff I'd expect out of an LLM.
- Real-World Test Case: As mentioned above, complete horseshit.
- 2 Concrete Features: Yes, I mentioned URLs in the input. I didn't count 'Integrated Search' (which I'm assuming means searching the web for up-to-date data) because AFAIK it's already more or less a staple in LLM products, and his only remark about it is that it is 'solid but misses sometimes'.
LZ_Khan25 days ago
Also ai generated
returnInfinity25 days ago
And this is the reason, I have choose to write grammatically wrong content online. And basic english only, no fancy words.
dimatura25 days ago
I see what you did their
tmikaeld25 days ago
It may also be deliberate, I know a lot of people that are very dyslexic and are using AI for making themselves understood online.
trash_cat25 days ago
It's the dashes that make it a dead-giveaway.
transcriptase25 days ago
“ — “ is the giveaway.
wyclif25 days ago
Not really, as pointed out by others in the thread. Anecdotal of course, but I use em dashes all the time — even in emails and texts (not just long-form writing).
mwigdahl25 days ago
Same, and it's disturbing that this is going to be picked up on as a bogus "tell" that my writing isn't my own.
thomashop25 days ago
Why sorry? So what?
I often write things I want to post in bullets and then have them formulated better than I could by an LLM. But it's just applying a style. The content comes from me.
My wife is dyslexic so she passes most things she writes through ChatGPT. Also not everyone is a native speaker.
joaohaas25 days ago
TBH I've recently felt like that for ~70% of 'top-level replies' in HN, which has slowly pushed me to other mediums (mastodon and discord).
Could just be that the AI 'boom' brought a less programming-focused crowd into the site and those people lack the vocabulary that is constantly used here, who knows.
diggan25 days ago
I'd go out on a limb and say that LLMs probably made the general population aware of what the "general voice" feels/looks/reads like.
So rather than lots of people adopting the way an LLM writes, the LLM writes as an average of how people have been writing on the internet for a long time. Now that you can recognize how "LLM prose" reads (which I'd call "Internet General Prose"), you start to recognize how many people were already writing in that style.
joaohaas25 days ago
I've been on the internet since the early 2000s, and I can assure you it does not write how 'someone on the internet' would write. And when I say that, I mean it for both sides of the internet: it doesn't sound like how 'old school' internet folks write, but it doesn't sound like how teens talk either. Neither group writes in 'very plain' English regurgitating useless information.
Recent trends/metas in video formats like tiktok and shorts encourage that kind of 'prose', but I haven't seen it being translated into text format in any platform, unless it's written by LLMs.
diggan25 days ago
> I've been on the internet since the early 2000s
Same here :)
My point wasn't that it writes like any specific group, but like a general mishmash made up of everyone's voice: a boring average rather than something specific and/or exciting.
Then of course it depends on which models you're talking about. I haven't tried Grok 3 myself (which I think is what you mean, since you say "it"), so I can't say how its text looks/feels. Some models are more "generic" than others and have very different default prose styles.
Oarch25 days ago
I'm a big fan of sprinkling in a little profanity just to pass the LLM bullshit check
dazzaji25 days ago
Here’s the conclusion of a much more refined initial review by Andrej Karpathy [1] which, I think overall, comports with the substance of my own hot take:
“As far as a quick vibe check over ~2 hours this morning, Grok 3 + Thinking feels somewhere around the state of the art territory of OpenAI's strongest models (o1-pro, $200/month), and slightly better than DeepSeek-R1 and Gemini 2.0 Flash Thinking. Which is quite incredible considering that the team started from scratch ~1 year ago, this timescale to state of the art territory is unprecedented. Do also keep in mind the caveats - the models are stochastic and may give slightly different answers each time, and it is very early, so we'll have to wait for a lot more evaluations over a period of the next few days/weeks. The early LM arena results look quite encouraging indeed. For now, big congrats to the xAI team, they clearly have huge velocity and momentum and I am excited to add Grok 3 to my "LLM council" and hear what it thinks going forward.”
[1] Full review at: https://x.com/karpathy/status/1891720635363254772?s=46&t=91u...
iamnotagenius25 days ago
I liked Grok 3's fiction writing style; it catches lots of the physics of mundane situations, such as the ringing echo in a closed bathroom we all know well, and the prose feels very lively as a result. Kinda like how R1 makes situations sharp with details, Grok 3 goes the other way around: it rounds them out with details.
dazzaji25 days ago
That sounds like very evocative prose. Would you be up for sharing some of that fiction? I haven’t tried Grok 3 for that purpose and now I’m curious.
dazzaji25 days ago
Update: Now I am a person who has used Grok 3 to generate evocative fiction (using your description as the catalyst): https://x.com/i/grok/share/r8XR6IdeuzDLDFTL7xDHLuADO
iamnotagenius25 days ago
Well, because you explicitly asked it to demonstrate the physics, it came out way too detailed. But the point is that it adds details to scenes on its own, making them more realistic, not in that dry Llama 3.3 style.
iamnotagenius25 days ago
Here is the sentence: (She screamed, which echoed off the tile walls. “This is my life now,” she said to her reflection, which looked back at her with a mix of disgust and pity.) Looks good to me. Try it on Lmarena.ai.
[deleted]25 days agocollapsed
ramesh3125 days ago
Can't stand Elon but happy to see this. We badly need a frontier model that is not so obsessed with "safety". That nonsense has held things back significantly, and leads to really stupid fake constraints.
JoelJacobson25 days ago
behnamoh25 days ago
We know RLHF and alignment degrade model quality. Could it be that Grok, due to its less restrictive training guidelines (and the fact that its creators aren't afraid of getting sued), achieves higher performance partly because of this simple factor?
nialv725 days ago
> We know RLHF and alignment degrades model quality.
I feel you can't make statements like this without giving some sources.
IIUC, without RLHF/alignment, the model won't even be able to chat with you, it would just be a document completion engine.
porridgeraisin25 days ago
You're both right because RLHF and fine-tuning are just techniques.
It's dependent on the training data and not as much the method.
So, if you build the RLHF/finetune data to avoid certain topics, you can reduce model quality in practice, since that data might accidentally cast a net wide enough to make the model avoid legitimate questions too.
These effects don't typically show up on benchmarks, though.
But yes. Those techniques are required for making it chat. Otherwise it just autocompletes from the internet. It is also used in a couple of other places (reasoning/search(hallucination mitigation))
1970-01-0125 days ago
It blows my mind that Musk hasn't integrated Grok as an app inside their vehicles. A literal AI copilot is a completely novel and killer app that cannot be pulled off by any other vehicle manufacturer.
rtkwe25 days ago
Getting them to actually do something useful other than generating text is still a work in progress. What do you envision them actually doing in this integration?
1970-01-0125 days ago
Why does it need to work beyond text-based output?
geor9e25 days ago
Because all you need to do is Bluetooth your iPhone to your 1995 Ford Ranger and install Gemini to have a voice conversation through your car's speakers. But then your original comment doesn't make any sense about this being only possible in a Tesla.
1970-01-0125 days ago
The iPhone screen size is too small, and nobody takes their iPad with them on every trip.
rtkwe25 days ago
Why do I need a big screen? The models mostly all have voice interfaces now. I shouldn't be sitting there reading and typing text input or output while driving anyways... What are you actually imagining doing with these models in the car? I still haven't heard what use they are.
1970-01-0125 days ago
>Why do I need a big screen?
You don't. You're free to use iPhone all day for work and play.
>What are you actually imagining doing with these models in the car?
The exact same things that are done on a laptop and desktop.
rtkwe25 days ago
> You don't. You're free to use iPhone all day for work and play.
You just said "The screen size is too small" when geor9e was talking about using the phone so I'm confused is a phone too small or just fine for this?
> The exact same things that are done on a laptop and desktop.
i.e. Not things people usually do in cars...
1970-01-0125 days ago
Not "cars" but Teslas.
rtkwe25 days ago
If it's not doing something actually related to/integrated with the car why does it need to be an app there instead of just living on your phone like all our existing digital assistants?
1970-01-0125 days ago
Simply due to the screen being much bigger than your phone
harryvederci25 days ago
"Killer app" in the good way or the bad way?
geor9e25 days ago
I've been saying "hey google, drive home" (for GPS directions) and "play music" to the phone mounted on my dash for a decade. I drive a rusty old stick shift and alligator-clamped a $10 Bluetooth receiver to the speaker. So I'm not sure what you're envisioning that can't also work in any other car. There is also https://comma.ai, which adds self-driving to hundreds of newer cars via just an OBD2-like dongle and the equivalent of a smartphone.
concordDance25 days ago
The interesting thing about this is that, because of all the Musk-related overhyping that's gone on and because the launch is a video, the thread that marks the entry of another company into the select group of serious AI companies will go off the front page with possibly only 200 points!
nobankai23 days ago
[flagged]
dang23 days ago
Can you please stop breaking the site guidelines?
nobankai23 days ago
Are there consequences?
kernal25 days ago
[flagged]
dang25 days ago
Ok, but please don't respond by posting the same sort of thing as well. That only makes it worse.