The DeepSeek generative AI moment is less than two weeks old, and already new “forms” of Gen AI, new perspectives, and new nomenclature are emerging online. DeepSeek, a Chinese company, figured out how to build a powerful LLM + ANN Gen AI model that required roughly 10% of the hardware used by OpenAI et al. In other words, DeepSeek needed only about a tenth of the resources and money that OpenAI claimed were required to reach this level of AI power. Kaboom!
As an aside, I had the idea to write an essay entitled “From Silent Spring (1962) to Data Center Cacophony Spring (2024- ).” It would be an apt title, as the DeepSeek moment will likely not derail the rapid worldwide construction of super data centers. That is very unfortunate, but what can be done, given the momentum and money being tossed into the ring by Meta, Google, and Microsoft?
To be fair, I have been very critical of Gen AI for the same reasons as Yann LeCun and Gary Marcus. Current Gen AI has little to do with how humans comprehend language or anything else. It’s a software hack built from billions of parameters trained on billions of tokens, and each token is just a letter, a word fragment, a word, or a number. This is not how we process language! Unfortunately, the investments and hype from 2022 onwards are nigh impossible to dial back. So, more than two years into the ChatGPT era, I can say that Gen AI models & tools are here to stay.
Current forms of Gen AI are worth using whenever they are worth using. The reader may ask, “Do you, Ian M Ropke, use Gen AI or Gen AI tools?” The answer is yes and no. From the beginning I found Perplexity AI to be sleek and simple when I “experimented” with it. And I used it a number of times to get better responses to “more complex” search queries.
Overall, my views on Gen AI are negative or dismissive (and I don’t use the tech much, or at all, now that I’m done experimenting). And I thieel (think + feel) that building Gen AI out of LLMs and deep ANNs has wasted a great deal of money, time, energy and water. But that is not the point of this essay.
I want to re-explore the key nomenclature of Gen AI, LLMs, and deep ANNs from the way a child’s brain develops. And that leaves us with these key words or concepts: data, training, testing, supervision, context, inference and distillation. All of these words are core to the current Gen AI works in progress.
Data is the information used to train an AI. Training is how an AI learns to find patterns or logic in the data. Testing is how you check, on data the AI has never seen, whether the training actually worked. Supervision is what humans (the supervisors of the AI model or software) add to “steer” an AI away from mistakes and towards better and better outcomes. Context is more or less about the nature of the data (numbers, text, images) and how it relates to the knowledge domain or knowledge environment that surrounds the data. So, the context of financial spreadsheet data is a higher-level viewpoint. A forest-above-the-trees perspective, if you will. Inference is what an already trained Gen AI model does with new data: it applies what it learned during training to produce outputs it has never produced before. Distillation involves taking the knowledge from a large and complex Gen AI model and reducing it so that it can be transferred to a smaller and more efficient model (the distilled smaller model requires less compute and less memory). Ok?
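To make the last of these terms a little more concrete, here is a minimal sketch of the “soft target” idea behind distillation. The numbers and the four-word vocabulary are my own toy inventions, not any real model; the point is simply that the big teacher’s output distribution becomes the training signal for the small student.

```python
# A toy sketch of knowledge distillation using NumPy only.
# "Teacher" and "student" are just logit vectors for one input;
# all numbers are invented for illustration.
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max()              # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

teacher_logits = np.array([4.0, 1.5, 0.3, -2.0])  # big, expensive model
student_logits = np.array([2.0, 1.0, 0.5, -1.0])  # small, cheap model

T = 2.0  # temperature > 1 softens the teacher's distribution
teacher_soft = softmax(teacher_logits, T)
student_soft = softmax(student_logits, T)

# Distillation loss: cross-entropy between the teacher's soft targets and the
# student's predictions. Training minimizes this, nudging the small student
# toward the big teacher's behaviour with far fewer parameters.
distill_loss = -np.sum(teacher_soft * np.log(student_soft + 1e-12))
print(f"distillation loss: {distill_loss:.4f}")
```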
When I flip all these words to suit my emerging AGI (artificial general intelligence) models, based on human child development (from age zero to 14), these words become a lot more people-friendly:
Data: any input to any child’s brain from age zero to age 14;
Parameters (not! tokens): more or less countable, and nowhere near the billions LLMs require (or generate);
Training: the “ideal” home education + public education period for any child, age zero to age 14;
Supervision: the parents and teachers and mentors of any child age zero to age 14;
Context: the ability of any child, from a certain age, to grasp the nuances of a situation from surface-level details; or the ability to understand the meaning and relevance of input data (spoken, written, or symbolic);
Inference: what kids’ brains can do once they have been pumped full of data and supervision;
Distillation: data or concepts that any child reduces to first principles or a knowledge domain.
So, now let’s try to imagine how we can go from what we know about human intelligence development and human brain processes to making an AGI model that is like a human intelligence or at least closer to what is actually human (versus silly ANNs that clearly miss this point). Already, there’s a wording problem here.
The 2022-2024 LLM Gen AI models are increasingly described as replacing humans and even exceeding human intelligence. We don’t want AGIs to be like HIs (human intelligences). We want AGIs that empower HIs. And we certainly don’t want to take orders from AIs. So, what we really want in all AGI models is an intelligence that understands humans and human weakness, and an intelligence that is designed, from the start, to be subservient to HIs. Said another way, what humans really want and need is AIs that understand us but aren’t like us in key ways. Think of Spock on Star Trek. He understood humans but was fundamentally different from them. He had no emotions; he wasn’t affected by greed, temptation or deceit. But he understood the machinations and implications of every human weakness. And in this way Spock empowered humans by pointing out their shortcomings in a perfectly logical and reasonable manner. Of course, the humans aboard the Starship Enterprise often ignored his advice. And each time they dismissed Spock’s advice on a key decision, they learned that they had got it wrong.
So, we really want AGIs that have Spock’s strengths in order for us to make better decisions. Business decisions. Government decisions. Personal decisions. And societal decisions. AGIs should help us to do better, full stop.
The idea that Gen AIs or AGIs can exceed human intelligence is also incorrectly phrased. Gen AIs and AGIs already exceed, and will continue to exceed, what humans can do in a fixed amount of time with a seemingly infinite amount of data, even unorganized data. For pattern recognition in the medical sciences, it is obvious that AIs exceed what a single doctor or researcher can achieve in a month or a year. Again, the point here is that “more intelligent AIs” is an oxymoron. What we are looking for is where AIs can do what we can’t do easily, quickly or efficiently in a given time period. Ultimately, AIs will either help humans or make humans a little “dumber” and a lot lazier.
It’s pretty easy to catalog human weaknesses. There aren’t that many when you think about it: greed (for money, power, or attention), ignorance (lazy or complacent minds?), anger (the exception being laser-beam anger, or what anger might look like in a human close to the Dalai Lama’s level), jealousy, envy, pride, hubris, impatience. And also, in a way, emotion, which is something Spock understood but didn’t experience (i.e., Spock has no emotions, or hardly any . . .).
Ok, let’s get back to the main threads of this essay. But before we return to the stages of intelligence displayed by children (and other young social mammals), let’s look at the 2024 AI landscape. If you look past the obvious hype of Gen AI, you will really only find two primary AI perspectives. One says the human brain is a computer; this perspective is known as computationalism. The other says the human brain is basically many interconnected neural networks (and synapses).
Computationalism, the first and older theory, posits that brains are born with hard-wired abilities and skills, and that some of the basic thinking in the human brain is symbolic, meaning symbols are used as inputs and outputs. The key word here, my emphasis, is some.
Connectionism posits something very different from computationalism. Connectionism claims the human brain is a vast network of neural networks (much like the internet, which is a giant network of networks). And this brings us to the moment in history where we are now: artificial neural networks (ANNs).
Theories come and go but some stay. Did you know that artificial intelligence experienced two major AI winters? The first AI winter lasted roughly from 1974 to 1980. And the second ran from 1987 to 2000.
The two AI winters were periods when a certain kind of AI thinking stopped growing. Of course, both major theories (computationalism and connectionism) stopped getting much research money or media attention. But if you look a little closer you will see that connectionism is what really went out of fashion and stopped growing.
All this changed in the early 2000s when machine learning (and AI in general) surged forward again. The first stars of the 2000s were AI chess masters and the machine-learning assistants Siri and Alexa. And then, mostly in the background, in the 2010s, neural network thinking (connectionism) resurged big time! By 2016, Google Translate had become Google Neural Machine Translation. Suddenly statistical machine translation was passé, replaced with a deep ANN built on Bayesian-style probabilistic prediction. Bayesian statistical theory can update probabilities and predictions as new data arrives (i.e., sequentially). Bayes’s nearly 300-year-old theorem (Thomas Bayes, 1702–1761!) is based on the idea that probability is “a measure of belief in an event.”
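For readers who want the one-line version of Bayes, here is a tiny sketch of a belief being updated sequentially as new evidence arrives. The numbers are my own toy values and have nothing to do with Google’s actual system; they just show the update rule P(H|D) = P(D|H) x P(H) / P(D) applied over and over.

```python
# Toy sequential Bayesian updating. All probabilities are invented for illustration.

def bayes_update(prior, lik_if_true, lik_if_false):
    """Posterior probability of hypothesis H after one observation."""
    evidence = lik_if_true * prior + lik_if_false * (1 - prior)
    return (lik_if_true * prior) / evidence

belief = 0.50  # start undecided about some hypothesis H
observations = [
    (0.80, 0.30),  # each pair: P(data | H true), P(data | H false)
    (0.70, 0.40),
    (0.90, 0.20),
]

for lik_true, lik_false in observations:
    belief = bayes_update(belief, lik_true, lik_false)
    print(f"updated belief in H: {belief:.3f}")
```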
And in 2016, Sam Altman’s then-tiny OpenAI team (the lab was founded in late 2015) was laying the groundwork for what became its massive large language models (LLMs), which also lean on Bayesian-style statistics to predict what the next word “should be” in text and speech. This exploded into the media as ChatGPT round one at the end of 2022 and became the mega meme of 2023 and 2024. In fact, only DeepSeek could derail the momentum of the generative AI enthusiasm (and hype). (FYI: Google’s seminal transformer paper, “Attention Is All You Need,” came out in 2017.)
And from 2022 until now (early 2025), the AI world has also been divided in two: the massive LLM Gen AI movement versus the computationalists and a few key individuals with good common sense.
On the LLM ANN Bayesian side we have: Yoshua Bengio, Geoffrey E. Hinton, Mark Zuckerberg, Sam Altman and Elon Musk.
On the computationalism side we have Alan Turing, Gary Marcus, Yann LeCun, and Andrew Ng.
I, Ian Martin Ropke (NexussPlus sole founder), have been against the connectionists’ claims and hype since early 2022. My criticisms are laid out in considerable detail in my AI essays of early 2023 and early 2024 and aren’t much different from LeCun’s or Marcus’s or Jim Covello’s (Goldman Sachs’s Gen AI investment critic since September 2024; he thieels (think + feel) that Gen AI is too mathematically complex and simply way too expensive; 6x on water and 6x on electricity compared to non-AI search queries; presently Google’s AI Overviews have largely “replaced” Perplexity AI, which is still the best of breed for me!).
Gary Marcus has pointed out that only a small number of genes account for the intricacies of the human brain (or any animal brain). Marcus also argues that most cognition is abductive and global (spread all over the brain). Abductive reasoning starts with observations, which are turned into candidate hypotheses; the hypotheses are then evaluated against the observations (and other known facts), and the best one is chosen to explain the observation.
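To make “abductive” concrete, here is a toy sketch of the loop just described: generate candidate hypotheses, score each one against the observations, and keep the best explanation. The hypotheses and observations are entirely my own invented examples, not anything from Marcus’s work.

```python
# Toy abduction: choose the hypothesis that best explains the observations.
observations = {"wet_lawn", "cloudy_sky"}

# Each candidate hypothesis lists the facts it would explain.
hypotheses = {
    "it_rained": {"wet_lawn", "cloudy_sky", "wet_street"},
    "sprinkler_ran": {"wet_lawn"},
    "neighbor_washed_car": {"wet_street"},
}

def explanatory_score(explains, observed):
    """Fraction of the observed facts this hypothesis accounts for."""
    return len(explains & observed) / len(observed)

for name, explains in hypotheses.items():
    print(f"{name}: explains {explanatory_score(explains, observations):.2f} of the observations")

best = max(hypotheses, key=lambda h: explanatory_score(hypotheses[h], observations))
print(f"best explanation: {best}")
```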
The human brain is one of the best examples of efficiency you can find. Think about it. A child’s or an adult’s brain runs on roughly 25 watts of power, about what a dim 25-watt bulb draws (most bedside reading lamps are at least 40 watts!).
And the human brain, in its simplest form, is a biological-electrical-chemical computer! There is hardware and software. Human brain hardware can be broken down into different brain regions and their specific functions. Human brain software is a little harder to grok and also far from understood. Some of our brain code is written with chemicals, and chemical triggers and thresholds. Some is electrical (neurons and synapses). And, according to late-2024 research, signaling across the entire brain may involve quantum effects. I find that last insight the most intriguing.
Quantum computing in human brains makes total sense. Nothing travels as fast as entanglement! What I don’t understand is the coding used to send messages with quantum computing. The human brain has been explored for over 150 years by doctors and others. We know a lot about the various parts of the brain and what happens in each area of the human brain. Basically, with quantum entanglement in play, one part of the brain can simultaneously signal several other parts of the brain to carry out functions and processes. But how?
How is a message encoded in quantum pulses or quantum signals such that each brain area signaled understands what is being said and what to do about it? I’m still wrestling with this, but I do suspect that the brain’s codes, like DNA’s gene codes, will be understood sooner rather than later. This will have major implications for understanding the human brain, but maybe not so much for building better AGIs . . .
That said, I do feel that the way infants and children learn is the way forward for better general artificial intelligence. A great example is Deb Roy, the Canadian-born MIT researcher who filmed and recorded everything his young child saw, heard and said in the family home. He placed cameras in every room of the house! What he discovered is that when a new word is used correctly in context, the child learns to connect the word to the thing being described. If you try to convey the word “water” in a desert setting, a child finds it too abstract. But if you say “water” while looking at the garden hose spraying the lawn or at the kitchen faucet, the child makes the connection. His talk about the project, “The birth of a word,” has been viewed 2.8 million times! Roy’s work is something that can be applied to my general artificial intelligence models. Great stuff! Thank you, Deb Roy! [FYI: Roy’s research focuses on language, social dynamics, and games, at the intersection of artificial intelligence and cognitive psychology.]
Everything a child sees and hears is a lot of data, at least for an actual child. But based on what we know we (you and I) took in from age zero to, say, age 14, it’s not that much of a stretch to train new AIs, new AI subroutines to be precise, to learn what kids learn from their environment. All the stuff we learn in school would be selectively fed into another AI subroutine. And since we are talking about a human child, we could also add a subroutine for how the brain decides to move a finger or walk across the room to get a cookie. That robotics-related subroutine may not be required for truly digital AIs. But if you think about it, digital AIs should also be able to understand actual humans and their physical capabilities (and our physical shortcomings). A rough sketch of this modular idea follows below.
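Here is a purely hypothetical sketch of that modular, child-development-inspired design. Every class and method name is my own invention for illustration; nothing here corresponds to an existing system or library.

```python
# Hypothetical skeleton of a child-development-inspired AGI made of subroutines.
from dataclasses import dataclass, field

@dataclass
class EnvironmentModule:
    """Learns from everyday sensory input, the way a child learns at home."""
    memories: list = field(default_factory=list)

    def observe(self, event: str):
        self.memories.append(event)

@dataclass
class SchoolModule:
    """Holds curated, curriculum-style knowledge (the 'school years' subroutine)."""
    facts: dict = field(default_factory=dict)

    def teach(self, topic: str, lesson: str):
        self.facts[topic] = lesson

class MotorModule:
    """Optional robotics-style subroutine for modelling physical actions."""
    def plan(self, goal: str) -> str:
        return f"plan steps to: {goal}"

class ChildLikeAGI:
    """Ties the subroutines together; a toy stand-in for the architecture above."""
    def __init__(self):
        self.environment = EnvironmentModule()
        self.school = SchoolModule()
        self.motor = MotorModule()

agent = ChildLikeAGI()
agent.environment.observe("water coming out of the garden hose")
agent.school.teach("water", "H2O; what comes out of hoses and faucets")
print(agent.motor.plan("walk across the room to get a cookie"))
```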
I will bring this essay to a close now, despite having lots more to say about everything I have written here.
My final point is that general artificial intelligence isn’t about pattern matching (DeepMind) or LLMs + ANNs (like trying to make an author who isn’t human or a painter who isn’t a human being; people paint and write; AIs aren’t people, are they?).
General artificial intelligence is more about creating an amazing team member for team human. It doesn’t have feelings but it understands feelings. It doesn’t get greedy but understands greed. Pretty obvious really . . .
A cheeky addition to my thinking: training a general artificial intelligence should be possible on a killer laptop, a general-purpose laptop built for general artificial intelligence work. If a 25-watt human brain is going to be replicated in any form, then the result should be almost as efficient as we are. Keep It Simple, Stupid; I’m sure Steve Jobs would agree!