8 Comments
Damir Maksan

Thank you, Alex, for this informative article and for getting us excited about the possibilities. I like the story at the end, even if 2039 seems a bit ambitious. But we can hope!

Hydrogen Revolution

Glad you liked it, Damir. I agree that 2039 may seem ambitious at the current rate of technological progress, but I think there is a strong argument that the rate of progress will increase rapidly as we achieve AGI and then ASI. I anticipate that a single year of progress in 2035 will be equivalent to 5 to 10 years of 2025-level progress. Perhaps even more.

Bob Bell

While I share a lot of the excitement in this post, we are nowhere near AGI with the latest GPT-powered LLMs. They lack any world model and, even in the latest releases, continue to "hallucinate" because they are just advanced auto-complete algorithms with no actual intelligence or thinking behind them. I made the observation years ago that advanced AI is required to truly unlock the potential of robots and a Star Trek economy, and GPT LLMs might be a step in that direction, but keep the limitations in mind: LLMs cannot play chess, because they lack the ability to reference a world model of chess rules to validate their moves. Where such holes appear to be fixed, keep in mind that the system is detecting select problems and handing them off to traditional coded solutions (a sketch of that pattern follows below). Otherwise, I hope validations happen sooner rather than later, which could also open up a less violently hostile examination of GUTCP.
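A minimal sketch of that hand-off pattern, assuming the python-chess library as the coded rules engine; llm_propose_move is a hypothetical placeholder for a model call, not any vendor's actual API:

```python
# Sketch: let a coded world model (chess rules) validate LLM-proposed moves.
import chess

def llm_propose_move(fen: str) -> str:
    """Hypothetical stand-in for an LLM call that returns a UCI move string."""
    return "e2e4"  # placeholder; a real call would query a model with the FEN

def validated_move(board: chess.Board, max_tries: int = 3):
    """Ask the LLM for a move, but let the rules engine be the judge."""
    for _ in range(max_tries):
        candidate = llm_propose_move(board.fen())
        try:
            move = chess.Move.from_uci(candidate)
        except ValueError:
            continue  # not even syntactically a move; ask again
        if move in board.legal_moves:
            return move  # legal according to the coded world model
    return None  # the LLM never produced a legal move

board = chess.Board()
move = validated_move(board)
if move is not None:
    board.push(move)
    print(board.fen())
```

The design point is that legality is decided by the symbolic rules engine, not by the model's token probabilities.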

Hydrogen Revolution

Two questions:

1) Have you used GPT-5 Pro?

2) If you took the International Math Olympiad test, how many questions do you think you would answer correctly?

Bob Bell

Your second question is a straw man and irrelevant. I was just noting that everyone should temper their Sam Altman-hyped visions of AGI "any day now." Even Sam Altman has been walking back expectations and mentioning an AI bubble lately. Just read my comments, or do some research: read some articles by Gary Marcus, or simply notice all the disclaimers around LLM output. There have been several lawsuits over LLMs making up defamatory information about people, and judges have started warning attorneys not to use AI for their briefs because it fabricates non-existent cases. I use ChatGPT for a variety of purposes (as well as other models), and I regularly get irrelevant output and links to incorrect legal material, for example.

A computer chess program is not AGI, and LLMs can't play chess. Responses are based on training data and the "next most likely word." An LLM doesn't add 2 + 2 when you ask it; it "knows," or associates, 4 as the 99.9% match, so that's what it outputs (unless it has been programmed to hand simple math questions off to typical algorithms, which, by the way, would admit the weakness; a sketch of that routing follows below). I didn't say LLMs are worthless, but everyone should understand that hallucinations are a feature. Have the models and the rules around them improved? Yes, but they aren't AGI and never will be.
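A minimal sketch of that routing, assuming a regex-based dispatcher; ask_llm and the router itself are illustrative assumptions, not any vendor's implementation:

```python
# Sketch: route simple arithmetic to an actual algorithm instead of the model.
import ast
import operator
import re

# Map AST operator nodes to real arithmetic functions.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(expr: str) -> float:
    """Evaluate a pure arithmetic expression deterministically."""
    def walk(node):
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("not simple arithmetic")
    return walk(ast.parse(expr, mode="eval").body)

def ask_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call."""
    return "(model response)"

def answer(prompt: str) -> str:
    # If the prompt is just arithmetic, compute it; otherwise fall back
    # to the model, which only predicts likely tokens.
    if re.fullmatch(r"[\d\s+\-*/().]+", prompt):
        try:
            return str(safe_eval(prompt))
        except (ValueError, SyntaxError):
            pass  # not clean arithmetic after all
    return ask_llm(prompt)

print(answer("2 + 2"))  # -> 4, computed rather than pattern-matched
```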

Hydrogen Revolution

Would it be fair to restate your argument about LLMs' math capabilities as: "LLMs, due to their very nature, cannot solve expert human-level math problems requiring multi-step reasoning that were not part of their training set"?

Bob Bell

Sure. After all, the Google model used a combination of an LLM and a symbolic engine built for this purpose, and the OpenAI model was specially trained using human verification of correct steps to reinforce the probability of using those or similar steps instead of wrong ones, which improved performance. (A sketch of the LLM-plus-symbolic-engine pattern follows below.) My main point doesn't change, and it is simply to temper expectations of AGI and of guaranteed correct information from a pure LLM. I don't have a crystal ball, but it seems world models, symbolic systems, and probably other additions are needed to get something analogous to AGI and as reliable as a human expert. LLMs will be part of that solution, but not the whole solution. There are many tasks far simpler than the Math Olympiad where the latest LLMs make errors that a fifth grader would not. A recent example posted online is listing US Presidents or Canadian Prime Ministers along with their dates in office: the output was wildly incorrect, with terms that were too long, overlapping office-holders, and even people in office before they were born or serving for a hundred years. A recent study ("Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity") indicated that programmers may be up to 20% less productive when using AI: it isn't their own work, so it takes more time to evaluate and change, and the routines are often found to be less than optimal, so they are rewritten, which may be slower than if the coder had simply done it themselves.
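A minimal sketch of that LLM-plus-symbolic-engine pairing, assuming sympy as the verifier; propose_solution is a hypothetical stand-in for a model call, not the actual architecture of either lab's system:

```python
# Sketch: the symbolic engine, not the model, decides what is correct.
import sympy as sp

x = sp.symbols("x")
equation = sp.Eq(x**2 - 5*x + 6, 0)

def propose_solution(eq):
    """Hypothetical LLM call: returns candidate roots (one deliberately wrong)."""
    return [sp.Integer(2), sp.Integer(3), sp.Integer(7)]

candidates = propose_solution(equation)
# Keep only candidates that the symbolic engine confirms satisfy the equation.
verified = [c for c in candidates
            if sp.simplify(equation.lhs.subs(x, c) - equation.rhs) == 0]
print(verified)  # [2, 3] -- the wrong candidate is filtered out
```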

I hope you have a good rest of your weekend, and I look forward to more amazing news from BLP, "soon"!

Hydrogen Revolution

Thanks, you too! Something tells me we are going to get a lot more amazing news from BLP over the coming year. It is going to be very exciting to watch!