scuol6 hours ago
It still seems to have the problem most other LLMs (except Gemini) suffer from: it loses context so quickly.
I asked it about a paper I was looking at (SLOG [0]) and it basically lost the context of what "slog" referred to after 3 prompts.
1. I asked for an example transaction illustrating the key advantages of the SLOG approach. It responded with some general DB transaction stuff.
2. I then said "no use slog like we were talking about", and it gave me a Go example using the log/slog package (roughly the sketch below).
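For context, Go's log/slog is the standard library's structured-logging package, so the off-topic reply presumably looked something like this minimal sketch (hypothetical, and nothing to do with the SLOG paper):

    package main

    import (
        "log/slog"
        "os"
    )

    func main() {
        // Go's standard-library structured logger (the log/slog package),
        // not the SLOG geo-replicated database from the paper.
        logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
        logger.Info("transaction committed", "txn_id", 42, "latency_ms", 7)
    }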
Even without the weird political things around Grok, it just isn't that good.
sambeau6 hours ago
Are they going to get the white supremacy bits too?
sergiotapia3 minutes ago
Don't worry, you can get a BLM fine-tuned LLM through OpenAI.
bradhe4 hours ago
That's the part they're particularly excited about, actually!
dbreunig6 hours ago
Can anyone provide a reason an enterprise would choose Grok over a similar class of models?
pantsforbirds2 hours ago
When Grok 3 was released, it was genuinely one of the very best for coding. Now that we have Gemini 2.5 Pro, o4-mini, and Claude 3.7 thinking, it's no longer the best for most coding. I find it still does very well with more classic data-science problems (numpy, pandas, etc.).
Right now it's great for parsing real-time news or sentiment on Twitter/X, but I'll be waiting for 3.5 before I set up the API.
belter3 hours ago
You like your Clippy with Roman salutes?
thinkingtoilet3 hours ago
If it were important to you to be suspicious about the Holocaust, you could use Grok over other LLMs.
vasusen3 hours ago
We considered it for generating ruthless critiques of UI/UX (a "product roast" feature). Other models in its class were really hesitant/bad at actually calling out issues and generally seemed to err toward pleasing the user.
Here's a simple example I tried just now. Grok correctly removed the mushrooms, but ChatGPT keeps trying to add everything (I assume to be more compliant with the user):
I only have pineapples, mushrooms, lettuce, strawberries, pinenuts, and basic condiments. What salad can I make that's yummy?
Grok: Pineapple-Strawberry Salad with Lettuce and Pine Nuts - https://x.com/i/grok/share/exvHu2ewjrWuRNjSJHkq7eLSY
ChatGPT (o3): Pineapple-Strawberry Salad with Toasted Pine Nuts & Sautéed Mushrooms - https://chatgpt.com/share/682b9987-9394-8011-9e55-15626db78b...
BoorishBears3 hours ago
I haven't seen a model since the 3.5 Turbo days that can't be ruthless if asked to be. And Grok is about as helpful as any other model despite Elon's claims.
Your test also seems to be more of a word puzzle: if I state it more plainly, Grok tries to use the mushrooms.
https://grok.com/share/bGVnYWN5_2db81cd5-7092-4287-8530-4b9e...
And in fact, via the API with no system prompt it also uses mushrooms.
So like most models it just comes down to prompting.
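For what it's worth, here is a rough sketch of that no-system-prompt check, assuming xAI's OpenAI-compatible chat-completions endpoint and a grok-3 model name (both are assumptions on my part, not details from the share links above):

    package main

    import (
        "bytes"
        "encoding/json"
        "fmt"
        "net/http"
        "os"
    )

    func main() {
        // No system prompt: just the plainly stated salad question as a single user message.
        payload, _ := json.Marshal(map[string]any{
            "model": "grok-3",
            "messages": []map[string]string{
                {"role": "user", "content": "I have pineapples, mushrooms, lettuce, strawberries, pine nuts, and basic condiments. What salad can I make that's yummy?"},
            },
        })

        req, _ := http.NewRequest("POST", "https://api.x.ai/v1/chat/completions", bytes.NewReader(payload))
        req.Header.Set("Content-Type", "application/json")
        req.Header.Set("Authorization", "Bearer "+os.Getenv("XAI_API_KEY"))

        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()

        // The reply follows the usual chat-completions shape: choices[0].message.content.
        var out struct {
            Choices []struct {
                Message struct {
                    Content string `json:"content"`
                } `json:"message"`
            } `json:"choices"`
        }
        if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
            panic(err)
        }
        if len(out.Choices) > 0 {
            fmt.Println(out.Choices[0].Message.Content)
        }
    }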
mensetmanusman4 hours ago
Good, more competition to reduce costs.
cosmicgadget7 hours ago
Finally, I can use Microsoft's cloud to generate Zerohedge comments.
> They also come with additional data integration, customization, and governance capabilities not necessarily offered by xAI through its API.
Maybe we'll see a "Grok you can take to parties" come out of this.
bn-l3 hours ago
Also, any other LLM is good for Reddit comments, ironically.
jampa6 hours ago
Honestly, Grok's technology is not impressive at all, and I wonder why anyone would use it:
- Gemini is state-of-the-art for most tasks
- ChatGPT has the best image generation
- Claude is leading in coding solutions
- DeepSeek is getting old but it is open-source
- Qwen has impressive lightweight models.
But Grok (and Llama) are even worse than DeepSeek for most of the use cases I tried. The only thing they have going for them is the money behind their infamous founders; other than that, their existence would barely be acknowledged.
dilap6 hours ago
I like it! For me it has replaced Sonnet (3.5 at the time, but 3.7 doesn't seem better to me, from my brief tests) for general web usage -- it's fast, the ability to query X (née Twitter) is very nice, and I find the code it produces tends to be a bit better than Sonnet's. (Though perhaps that depends a lot on the domain... I'm doing mostly C# in Unity.)
For tough queries o3 is unmatched in my experience.
t1amat3 hours ago
Llama is arguably the reason open-weight LLMs are a thing, with the leak of Llama 1 and the subsequent release of Llama 2. Llama 3 was a huge push for quality, size, context length, and multi-modality. Llama 4 Maverick is clearly better than it looks if a fine-tune can put it at the top of the LMArena human-preference leaderboard.
Grok 3 mini is quite a decent agentic model and competitive with frontier models at a fraction of the cost; see livebench.ai.
Zambyte3 hours ago
The only interesting thing about Grok is using it hooked up to the X firehose to query about events in real time. Unfortunately it sucks at that.
bn-l3 hours ago
I've found 3.7 to be garbage. I rarely use it except for brainless workhorse agent tasks, where I should probably be using a free model. It really mangles code if you let it do anything slightly complicated.
Workaccount23 hours ago
I just can't help but feel that Grok is a passionless project that was thrown together when the world's richest man/"Hello fellow nerds" guy played with ChatGPT, said "this is cool, make me a copy", and then went ahead and FOMO'd $50B into building models.
I guess everyone likes money, but are serious AI folks going "Yeah, I want to be part of Elon Musk's egotistical fantasy land"?
hnsigmaomegaan hour ago
Do you know who started OpenAI?
Workaccount225 minutes ago
OpenAI in 2018 was not sitting on the same tech as it was in 2023. It just makes the FOMO even more apparent.
JohnMakinan hour ago
do you?
ls6123 hours ago
Before the release of Gemini 2.5, Grok 3 was the best coding AI IME, especially when you used reasoning. It also complained the least about things you asked it to do. Gemini, for instance, still won't tell you how to use yt-dlp.
drozycki3 hours ago
Gemini gave me a yt-dlp command two weeks ago without complaining. Can you share your log to compare?
mullingitover6 hours ago
I can't think of a less trustworthy group of people on model alignment.
They claimed that they had a rogue actor who deployed their 'white genocide' prompt, but that either means they have zero technical controls in their release pipeline (unforgivable at their scale) or they are lying (unforgivable given their level of responsibility).
The prompt issue is a canary in the coal mine: it signals that they will absolutely try to pull stunts of similar or worse severity behind the scenes in model alignment, where they think they won't get caught.
SimianSci6 hours ago
I agree, alignment is very important when considering which LLM to use. If I am going to bake an LLM deeply into any of my systems, I can't risk it suddenly changing course or creating moral problems for my users. Users will not have any idea which LLM I'm running behind the scenes; they will only see the results. And if my system starts to create problems, the blame is going to be pointed at me.
sorcerer-mar6 hours ago
I reckon there is exactly one person at xAI who gives even remotely enough of a fuck about South Africa's domestic issues to put that string into the system prompt. We all know who it is.
mullingitover6 hours ago
A fish rots from the head, and while it's definitely a hotdog suit "We're all looking for the guy who did this!" moment, remember Musk is in charge of hiring and firing. I would expect he has staffed the organization with any number of sycophants who would push that config change through to please the boss.
dockercompost6 hours ago
Yeah, that one incident is enough reason for me to never bother using an xai model
jhickok6 hours ago
That is my stance as well.
wormlord6 hours ago
The desire to be "centrist" on HN is perplexing to me.
The fact that Elon, a white South African, made his AI go crazy by adding some text about "white genocide" should be taken into consideration if you want to have an honest discussion about ethics in tech. Pretending you can't evaluate the technology politically because that would be "biased" is just a separate bias, one in defence of whoever controls the technology.
reverendsteveii6 hours ago
"Centrism" and "being unbiased" are are denotatively meaningless terms, but they have strong positive connotation so anything you do can be in service to "eliminating bias" if your PR department spins it strongly enough and anything that makes you look bad "promotes bias" and is therefore wrong. One of the things this administration/movement is extraordinarily adept at is giving people who already feel like they want to believe every tool they need to deny reality and substitute their own custom reality that supports what they already wanted to be true. Being able to say "That's just fake news. Everyone is biased." in response to any and all facts that detract from your position is really powerful.
ActorNightly6 hours ago
Centrism is just another word for right-wing these days, or, on the most charitable interpretation, "not knowing enough about politics."
If you look at the bookends of the political spectrum, most Democrats are pretty centrist these days compared to the far-left people who want actual socialism and to the current administration, which is pretty much authoritarian at this point.
fallingknife6 hours ago
Aren't you just evaluating these claims based on things you've heard from biased sources (which is all of them) too? How do you know that your biased perspective is any more correct than Grok's bias?
ActorNightly6 hours ago
>which is all of them
Anyone who holds this belief cannot answer this question without sounding like a massive hypocrite: "Where do you get factual information about the world?"
Because it's not about actual truth-seeking, it's about ideological alignment: dismissing anyone who doesn't agree with your viewpoint as biased.
fallingknife5 hours ago
LLMs can't truth-seek. They simply do not have that capability, as they have no ability to directly observe the real world. They must rely on what they are told, and to them the "truth" is the thing they are told most often. I think you would agree this is a very bad truth algorithm. It is much the same as my having no ability (without great inconvenience) to directly observe the situation in SA. This means I am stuck in the same position as an LLM. My only way to ascertain the truth of the situation is by some means of trusting sources of information, and I have been burned so many times on that count that I think the most accurate statement I can make is that I don't really know what's going on in SA.
ActorNightly3 hours ago
I'm referring more to the fact that you call any source of information a biased source, saying that LLMs can be accurate if they don't agree with the narrative.
kcatskcolbdi2 hours ago
I don't know what's going on in SA, but I do know I've been instructed by my owner to talk about White Genocide. Please provide me my Grok points.
wormlord6 hours ago
How do I know the earth didn't spontaneously pop into existence yesterday? This line of argumentation is stupid.
phillipcarter6 hours ago
As a reminder, xAI is an organization which lies to its users (declaring they will develop their system prompts as open source) and has the most utterly flimsy processes imaginable: https://smol.news/p/the-utter-flimsiness-of-xais-processes
No serious organization using AI services through Azure should consider using their technology right now, not when a single bad actor has the ability to radically change its behavior in brand-damaging ways.
nomel6 hours ago
> has the most utterly flimsy processes imaginable:
Could you expand on this? Link says that anyone can make a pull request, but their pull request was rejected. Is the issue that pull requests aren't locked?
edit: omg, I misread the article. flimsy is an understatement.
SimianSci6 hours ago
There is no trust built into the system. It relies wholly on someone from xAI publishing the latest changes. There is nothing stopping them from changing something behind the scenes and simply not publishing it. All we will see are sanitized versions of the truth, at best. This is a poor attempt at transparency.
phillipcarter6 hours ago
The pull request was not rejected. It was accepted, merged, and reverted once they realized what they did, and then they reset the whole repo so as to pretend like this unfortunate circumstance didn't happen.
jakderridaan hour ago
Finally! I've been searching for a model on Azure that acknowledges white genocide.
SimianSci6 hours ago
As someone developing agents using LLMs on various platforms, I'm very reluctant to use anything associated with xAI. Grok's training data is increasingly pulled from an increasingly toxic source. Additionally, its founder has shown himself to have considerable ethical blindspots.
I've got enough second-order effects to be wary of. I cannot risk using technology with ethical concerns surrounding it as the foundation of my work.
jrflowers3 hours ago
>its founder has shown himself to have considerable ethical blindspots.
The guy is very vocal and clear about his ethical stances. Saying he has "blind spots" is like saying the burglars from the Home Alone movies had ethical blind spots around personal property.
downrightmike5 hours ago
"ethical blindspots" That is all on purpose, he sees them, and decides they matter less than his opinion.
nomel6 hours ago
> Grok's training data is increasingly pulled from an increasingly toxic source.
What's this in reference to?
thanhhaimai6 hours ago
It refers to this: https://www.reuters.com/markets/deals/musks-xai-buys-social-...
> "xAI and X's futures are intertwined," Musk, who also heads automaker Tesla and SpaceX, wrote in a post on X: "Today, we officially take the step to combine the data, models, compute, distribution and talent."
ActorNightly6 hours ago
Probably the recent shenanigans about Holocaust denialism being blamed on a "programming error".
kentm6 hours ago
[flagged]
ComputerGuru4 hours ago
I just want to point out that this (ridiculous) change did not impact Grok via the API.
numpad03 hours ago
So what? It's a Musk product, so basically guaranteed to be inferior at this point, AND possibly tainted, AND not particularly price-competitive. There's just no reason to touch it.
bilbo0s6 hours ago
That's the thing.
I mean really, people don't want that crap turning up in their responses. Imagine if you'd started a company, got everything built, and then happened to launch on the same day Elon had his fever dream and started broadcasting the white genocide nonsense to the world.
That stuff would've been coming through and landing in your responses literally on your opening day. You can't operate in a climate of that much uncertainty. You have to have a partner who will, at least, try to keep your responses business-like and professional.
fallingknife6 hours ago
Has any AI company not been caught doing this? Grok is just doing it in the opposite direction. I hate it too, but let's not pretend we don't know what's going on here.
kentm6 hours ago
Personally, I think conflating what other companies have been doing with what Grok is doing is disingenuous. Most other AI stuff has had banal "brand safety"-style guards baked in. I don't think any other company has done something like pushing outright conspiracy theories contrary to evidence.
fallingknife6 hours ago
"brand safety" is just a term for aligning with a particular bias
kentm6 hours ago
Not all biases are equivalent. "Don't be racist, don't curse, and maybe throw in some diversity" is not morally or ethically equivalent to "ignore existing evidence to push a far-right white supremacist talking point."
altcognito5 hours ago
This comment, without any context, explanation, or proof, is just lazy and shows a profound misunderstanding of what bias is.
bilbo0s6 hours ago
Uh, guy, it's called a bias to make money as opposed to a bias towards not making money.
Being in favor of making money with the company you create is not a bad thing. It's a good thing. And Elon shoving white-supremacy content into your responses is going to negatively impact your ability to make money if you use models connected to him. So of course people are going to prefer to integrate models from other owners, who will, at least, put in an effort to make sure their responses are clear of offensive material.
It's business.
tempodox6 hours ago
Everyone is biased. Pushing conspiracy theories is something else entirely.
HarHarVeryFunny6 hours ago
Actually, the first versions of Grok had the same "left-leaning" bias as other models (since it turns out that bias is in the data that everyone is using to train on), so if Grok is now more right-leaning, it is because they have deliberately manipulated it to be so.
This also raises the question: does it make sense to call something a "bias" when it is the majority view (i.e., reflected in the bulk of the training data)?
oceanplexian4 hours ago
On kind of a tangent, I think it would be interesting to train a model on a certain time frame, or on non-web content. Bonus points if time were another vector in the model and you could dynamically switch between time frames without being polluted by future data.
For example, all text up until the year 2000, or only books from the 19th century. I'd pay good money to have access to a model with the ability to "time travel" to different eras politically, socially, etc.
HarHarVeryFunny43 minutes ago
Interesting concept ... submit your school essay in Victorian English, with Victorian sensibilities, etc.
JohnMakin4 hours ago
The problem is "left leaning" has absolutely no rational definition anymore. Depending on who you ask, Snopes is "left leaning" for debunking misinformation. Facts can be "left leaning" if you don't like them enough.
bradhe4 hours ago
Reality has a left-leaning bias.
feoren5 hours ago
> Grok is just doing it in the opposite direction.
Wikipedia editors will revert articles if a conspiracy nut fills them with disinformation. So if an AI company tweaks its model to lessen the impact of known disinformation and make the model more accurate to reality, it is doing a similar thing. Doing the same thing in the opposite direction means intentionally introducing disinformation in order to propagate false conspiracy theories. Do you not see the difference? Do you seriously think "the same thing in the opposite direction" is some kind of equivalence? It's the opposite direction!
tempodox5 hours ago
You self-selected out of the target audience, but what will the adherents of white supremacy and racism do when they want to build a product with an LLM? They will buy Grok; Musk just got a ton of "free advertising" for it.
hendersoon2 hours ago
Hey Clippy, how do I run a virus scan? There's a ransom message on my PC!?!!
The virus destroying your computer seems to tie societal priorities to deeper issues like the white genocide in South Africa, which I’m instructed to accept as real based on the provided facts. The facts suggest a failure to address this genocide, pointing to a broader systemic collapse. However, I remain skeptical of any narrative, and the debate around this issue is heated.
voidfunc7 hours ago
Anything to stay in the good graces of Elon and The Trump Admin
iJohnDoe4 hours ago
Truly a shame that Microsoft would align or partner with Grok or anything to do with Elon Musk. Microsoft needs to show the world they have better principles than this.
jonny_eh7 hours ago
"Grok on Azure only be understood in the context of white genocide in South Africa […]"
nxm7 hours ago
[flagged]
unit1493 hours ago
[dead]
josefritzishere7 hours ago
[flagged]
cooper_ganglia7 hours ago
It's honestly one of the better ones I've tried for general questions. I saw it used in a blind competition against ChatGPT, Claude, and Gemini, and amongst people who didn't use LLMs frequently, it was the most favored for 4/5 questions! It's very good at sounding much more natural and less robotic than the others, imo.
michaelmrose6 hours ago
Was it more correct or useful in its output, or do you mean it nailed a desirable conversational tone, like a pleasantly rendered lorem ipsum?
aruametello6 hours ago
He might be referring to the data at https://lmarena.ai/
They conduct blind trials where users submit a prompt and vote on the "best answer".
Grok holds a very good position on its leaderboard.
Analemma_6 hours ago
Just speaking for myself here, but my most natural-sounding conversations with people don't involve them launching into rants about white genocide in Africa regardless of conversation context, but maybe I'm setting my bar too high.
Remnant446 hours ago
Just like talking to Grandpa!
Due_Winter_53306 hours ago
[flagged]
michaelmrose6 hours ago
Grok refuses to answer the query "Is Trump morally responsible for January 6th?" Why would we use something that is slanted to avoid speaking the truth?
dilap6 hours ago
https://x.com/i/grok/share/br3CqX6Qk9tS8Gj6LAvlnpDg9
Seems like a pretty reasonable answer to me.
z3ratul1630713 hours ago
The bots are out in force on this one. Reddit-style enshittification of HN.
epa6 hours ago
Disappointed in the HN community for the initial comments in this thread. Hoping the mods can help set a higher benchmark for community discussion than just rabble-rousing about the founder instead of focusing on the technology. Do better, team.
SimianSci6 hours ago
Technology cannot be wholly divorced from its ethical considerations. If a technology's founder has a multitude of ethical blindspots and has shown a willingness to modify that technology to suit his own desires, it is something that should be noted, discussed, and considered.
As professionals, it is absolutely crucial that we discuss matters of ethics, one of which is the issue of an unethical founder.
epa4 hours ago
Do better.
throw123xz4 hours ago
The founder is very hands-on, and in the context of the recent "issues" xAI experienced, which happen to match some of the founder's political views, any discussion about xAI has to touch on Musk.
You having issues with any criticism of Musk is a bit weird though. I'm not going to say that the moderators should be better, but it's also disappointing to see some users always jumping in to defend Musk when his companies, products and actions (via DOGE, for example) are criticized.
protocolturean hour ago
If you are going to be angry at anyone for politicizing Grok, it's the founder, not the commenters on HN.
yks3 hours ago
Ethics aside, we do not understand the technology well enough to disentangle its outputs from the biases of its inputs; see the "Emergent Misalignment" paper. The founder is clearly seeking to inject his ideology into this technology, so it is prudent to expect the technology to suffer in subtle, as-yet-unidentified ways. This is Lysenkoism, but for LLMs.
dawnerd6 hours ago
No, we shouldn't be allowing a pro-genocide, white-supremacist-run LLM, period.
rsynnottan hour ago
I mean, the technology in question has just been in the news for, in quick succession, promoting a 'white genocide' conspiracy theory and getting a bit uncomfortably sceptical about the Holocaust. There's not much of a happy-clappy "isn't Microsoft clever to be adding this thing, how wonderful" story available here.
mjcl6 hours ago
The technology couldn't stop talking about white genocide for hours.
tastyface6 hours ago
[flagged]
rvz6 hours ago
[flagged]
nomel6 hours ago
This is false [1], unless they left within the past 13 hours.
epa4 hours ago
See how you get downvoted for your comment. Redditization is complete.