Thread regarding Wells Fargo & Co. layoffs

AI reduces the headcount at Salesforce

People are really underestimating how much AI is driving job cuts. Salesforce reportedly used AI agents to lay off 5,000 employees across marketing, product management, data analytics, and even its own Agentforce AI division.


1345 views | 13 replies (last February 13)
Post ID: @OP+1kh7y5res

13 replies (most recent on top)

@ff Missing the point. The snapshot doesn't convey progress, but those numbers keep going down. And the reason the error rates continue to improve is that benchmarks like these are always moving the goalposts. These benchmarks are designed to be hard and to stump the LLM in order to tease out errors. That's the whole point. Progress.

For basic grounded use (with proper context), the error rate is now < 2% for most things.

What these benchmarks also don't really show is how the vast majority of errors in normal daily life are solvable with proper input (context), guardrails and solid harnessing. Coding is nearly solved. By the end of this year, the programmers who are keeping up with the technology won't be hand-typing code anymore (it's happening now even) and they'll barely be reading it. Instead they'll be managing the agents and using pipeline tooling to debug and keep it on rails. This is why you see the market for junior devs collapsing.

If software engineering can be automated to this degree, what makes you think your job is any different? Software engineering just happened to be first because the labs realized this would be the fastest path to self-improving models. OpenAI's latest model actually helped create itself (It's in their release notes if you don't believe me).

Let's say they get the error rate down to a steady 5%, just to throw a number out there. How many boards will look at that and say that's good enough given the cost savings? I bet most will. They're already foaming at the mouth now.
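To make the "good enough given the cost savings" question concrete, here is a rough sketch of the break-even math. The dollar figures are made up for illustration and are not from any source in this thread; the function names are hypothetical too. The idea is simply that automation pays off while the error rate stays below savings per task divided by cost per error.

```python
def break_even_error_rate(savings_per_task: float, cost_per_error: float) -> float:
    """Error rate at which the savings are exactly wiped out by error costs."""
    if cost_per_error <= 0:
        raise ValueError("cost_per_error must be positive")
    return savings_per_task / cost_per_error


def net_savings_per_task(savings_per_task: float, cost_per_error: float,
                         error_rate: float) -> float:
    """Expected savings per task after paying for the errors that slip through."""
    return savings_per_task - error_rate * cost_per_error


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    savings = 40.0      # dollars saved per automated task (assumed)
    error_cost = 500.0  # dollars to catch and fix one bad output (assumed)
    rate = break_even_error_rate(savings, error_cost)
    net = net_savings_per_task(savings, error_cost, 0.05)
    print(f"break-even error rate: {rate:.1%}")
    print(f"net savings per task at a 5% error rate: ${net:.2f}")
```

With these assumed numbers a 5% error rate still clears the bar, but the answer flips quickly if a single error is expensive, so the result depends entirely on the inputs you believe.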

The AI isn't going to replace humans any time soon, but it will continue to significantly compress the work.

Post ID: @h5+1kh7y5res

"here's what the lying machine has to say about how much it lies:" lmao the critical thinking of ai users

Post ID: @gh+1kh7y5res

@e0 What profession, outside of economics and politics, enables someone to have a successful career with a 28% error rate?

I didn't sc--w up once last year and I still got an IM.

Post ID: @ff+1kh7y5res

@dv Uh, you are likely referring to OpenAI's paper from 2024 given that highly specific 55% rate. That might as well be an eternity ago in terms of AI progression. Regardless, learning about the tools' weaknesses and how to work around them is how you become successful with AI.

Let's ask a current model (Gemini 3 Pro) about it (with a contextually appropriate prompt). I had it write it and format the table in ASCII for better legibility on this forum. Happy learning!

The "55%" hallucination figure is a technical data point from OpenAI’s 2024 SimpleQA benchmark. While accurate for that specific test, it is a misleading representation of how AI performs in the real world.

The Context Behind the 55% Figure

• Intentionally Adversarial: OpenAI explicitly stated they filtered the dataset to include only questions that "induce hallucinations." If a model answered a question correctly during the design phase, it was often thrown out. This was a "stress test," not an average performance review.
• The Tools Gap: SimpleQA forbids the AI from using the internet. In real-world use cases, models use Search or RAG (Retrieval-Augmented Generation), which reduces hallucination rates from ~55% to under 2%.
• The 2024 vs. 2026 Factor: The 55% figure (roughly 60% for GPT-4o) is now nearly 18 months old. As of February 2026, "Reasoning" models have significantly closed the accuracy gap.


Consolidated Benchmark Results (February 2026)
Modern benchmarks distinguish between Raw Memory (Parametric/No Internet) and Grounded Reality (Summarization/Tool Use).


MODEL SERIES        RAW MEMORY ERROR    GROUNDED ERROR    TOOL-ENABLED ERROR
------------------  ------------------  ----------------  -------------------
OpenAI (o3/GPT-5)   38% - 51%           0.8% - 1.2%       ~35% (Complex)
Gemini (3.0 Pro)    ~27.9%              0.7% - 1.5%       ~40% (w/ Search)
Anthropic (4.6)     42% - 57%           2.6% - 4.2%       ~30% (w/ Search)

(Note: Grounded/Tool rates represent hallucinations during active document analysis or search.)

2026 Model Specifics
Gemini 3.0 Pro (DeepThink): Currently leads in raw factual recall on the SimpleQA Verified set (~72% accuracy), meaning its "guesswork" has dropped to roughly 28%.

Claude 4.6 Opus: Released Feb 2026; while its error rate on trivia is higher, it has the lowest "confident hallucination" rate because it is the most likely to say "I don't know" when uncertain.

GPT-5.2 / o3: Remains the industry leader for Grounded Accuracy (summarizing long documents or coding) with error rates consistently near 1%.

Verifiable Sources

  1. The "55%" Source (OpenAI Blog): https://openai.com/index/introducing-simpleqa/ (Confirms the benchmark was designed to "induce hallucinations.")

  2. The 2026 Intelligence & Accuracy Index: https://artificialanalysis.ai/evaluations/omniscience (Real-time tracking of GPT-5, Gemini 3, and Claude 4.6 accuracy.)

  3. Vectara Hallucination Leaderboard (Real-World Use): https://huggingface.co/spaces/vectara/leaderboard (Industry standard for document-grounded accuracy and summarization.)

  4. HalluHard Benchmark (arXiv 2026): https://arxiv.org/abs/2602.01031 (Research showing how Tool-Calling/Search drastically lowers error rates vs. raw memory.)

Post ID: @e0+1kh7y5res

@dg because even the new, enhanced AI models hallucinate up to 55% of the time. AI is extremely error-prone, and sells those errors with authority. Things will break if we put too much trust in it. It's why every single WF training on AI says you have to double-check the output.

So you save time on the front-end, by getting work done with a prompt, then increase time on the back-end fixing the slop AI regurgitated from the internet.

Like the computer before it, the internal combustion engine before it, the steam engine before it, and the wheel before it... AI will not replace workers; it will shift the labor force to different adjacent roles.

Post ID: @dv+1kh7y5res

@a1

not sure why this is getting downvoted. Folks need to wake up. AI is not what it was in 2024. Especially the paid versions.

Post ID: @dg+1kh7y5res

@by It doesn't need to replace intuition, although it will at some point. You are not at risk of being replaced by AI any time soon. Wells Fargo's molasses speed will buy you a little more time, too.

Jobs aren't going to be lost because AI fully replaces people. Jobs are going to be lost because AI will significantly increase productivity, which means fewer people are needed to do the same amount of work.

Maybe revenue growth increases at the same rate to match the increased productivity, maybe not. I tend to think not, at least not initially.
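A back-of-the-envelope sketch of that arithmetic, for anyone who wants to plug in their own numbers. The 1,000-person team and the 30% productivity gain below are purely illustrative assumptions, and the function name is hypothetical.

```python
def headcount_needed(current_headcount: float, productivity_gain: float,
                     workload_growth: float = 0.0) -> float:
    """People needed after a productivity gain, for a given change in workload.

    productivity_gain: 0.30 means each person gets 30% more done.
    workload_growth:   0.10 means the total amount of work grows 10%.
    """
    if productivity_gain <= -1.0:
        raise ValueError("productivity_gain must be greater than -1")
    return current_headcount * (1.0 + workload_growth) / (1.0 + productivity_gain)


if __name__ == "__main__":
    # Hypothetical team of 1,000 with an assumed 30% productivity gain.
    flat = headcount_needed(1000, 0.30)          # workload stays the same
    matched = headcount_needed(1000, 0.30, 0.30)  # workload grows at the same rate
    print(f"same workload: about {flat:.0f} people needed")
    print(f"workload up 30%: about {matched:.0f} people needed")
```

If the workload stays flat, the same output takes roughly 770 people instead of 1,000. Headcount only holds steady if the amount of work grows as fast as productivity does, which is the "maybe, maybe not" part.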

Post ID: @cg+1kh7y5res

@bd If you want people to stop calling out the AI slop you post, then stop posting AI slop.

Post ID: @c3+1kh7y5res

Another FUD post.

AI is enhanced Internet search with consensus probability, it doesn't replace intuition.

The current job cuts would be happening with, or without, AI. AI is just giving CEOs a convenient bogeyman.

Post ID: @by+1kh7y5res

@a8 Excellent article by Matt Shumer. There are several versions of this blog post; the original is on his blog at shumer.dev/something-big-is-happening. I particularly like these lines: "We're not making predictions. We're telling you what already occurred in our own jobs, and warning you that you're next." and "I am no longer needed for the actual technical work of my job. I describe what I want built, in plain English, and it just... appears".

AI is designing more of the solution now. What does that mean for this bank? The command-and-control facilitators, like the AI slop guy who posts replies on this board, will try to bully and persuade executive management that AI is a lost cause. Remember, the operating system of this bank is fear. They will do their best to keep their jobs.

Hey, this is cool: https://shumerprompt.com/

Post ID: @bd+1kh7y5res

"Follow the money" is going to be replaced by "follow the AI." At this bank, most of the AI tech work is done out of OH, some in Texas, some in India. Follow the LinkedIn profiles of the engineers.

Post ID: @aa+1kh7y5res

https://x.com/mattshumer_/status/2021256989876109403?s=20

Post ID: @a8+1kh7y5res

Oh it is barely getting started. It still takes quite a bit of effort to do it right, but the tools and models get better every month. Look to the frontier labs for a preview of what is coming.

At the very least, Wall Street and boards are putting massive pressure on CEOs to replace workers with AI. I think that is premature, but they see the writing on the wall.

Today's jobs report paints a bleak picture for the white-collar worker.

Post ID: @a1+1kh7y5res
