Shadow Marks on a Stalagmite

What moves "AI"?

ChatGPT, Claude Code, Midjourney, AlphaFold, Whisper and other systems are shaking the world. What are they? What moves them?

TL;DR: "They are tensor-based processing systems, and gradient descent moves them."

They can process ambiguous data sensibly. They produce art[1], generate new insights, and communicate with people. They do things that used to sound like science fiction, yet now they have emerged in this world. And they are useful: millions of people, if not billions, use them in one way or another. They also distress millions of people. Like anything disruptive, they are a controversial subject.

It seems they have already started a new period of structural change, and I wager it will be massive. Millions of people use these systems for their own gain, so they must be useful to them. They are shifting markets fundamentally, even if some people decry them as not really "intelligent". Yes, it's contentious.

A lot can be said, but let's focus on this one question.

What Moves Them?

Let's look at how these systems actually operate. Their inputs and outputs vary: text, images, protein structures, speech.

And yet, fundamentally, they are the same: they use tensors for processing, and they have been optimized for some outcome.

Let me tell you about tensors and optimization.

What Are Tensors?

Tensors are mathematical structures that contain numbers. They can take different forms: a simple list[2], a table[3], or a cube[4]. Higher dimensions work too, but are difficult to imagine. Like ordinary numbers, tensors support operations: you can add them, for example. They also have special operations like the tensor product, which combines two tensors into a higher-dimensional structure. More operations exist, but let's stop here.
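The shapes and operations above can be sketched in a few lines of plain Python (no library needed; the names are illustrative, not standard terminology):

```python
# A tensor is just an n-dimensional grid of numbers.
vector = [1.0, 2.0, 3.0]                       # rank 1: a simple list
matrix = [[1.0, 2.0], [3.0, 4.0]]              # rank 2: a table
cube = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]  # rank 3: a cube

# Addition works element by element, just like with plain numbers.
def add_vectors(a, b):
    return [x + y for x, y in zip(a, b)]

# The tensor (outer) product combines two rank-1 tensors into a
# rank-2 tensor: every entry of one times every entry of the other.
def outer_product(a, b):
    return [[x * y for y in b] for x in a]

print(add_vectors([1, 2], [3, 4]))    # [4, 6]
print(outer_product([1, 2], [3, 4]))  # [[3, 4], [6, 8]]
```

Real systems use libraries like NumPy or PyTorch for this, but the underlying idea is exactly this simple.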

Tensors are used to represent neural networks. Each number is a weight between two neurons and dictates how much the first neuron influences the second. Neural network theory goes back to the last century, but today tensors are simply put to work: specialized hardware such as graphics cards executes the tensor operations, billions of numbers get added and multiplied, and the input gets mixed thoroughly. Each unit of output draws on all the numbers stored in all the tensors.
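Here is a minimal sketch of that mixing, assuming the simplest possible layer: a rank-2 weight tensor applied to an input vector, where every output mixes all inputs, weighted by its own row of numbers.

```python
# Hypothetical two-neuron layer. Each weight says how much one
# input neuron influences one output neuron.
def forward(weights, inputs):
    """Matrix-vector product: each output sums all inputs,
    scaled by that output's row of weights."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

weights = [[0.5, -1.0],
           [2.0,  0.25]]
print(forward(weights, [1.0, 2.0]))  # [-1.5, 2.5]
```

Real networks stack many such layers (with nonlinearities in between), but each layer boils down to this kind of multiply-and-add over a tensor.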

And there's one important trick: all of this is orchestrated carefully so that one can go backwards (backpropagation), inverting the regular flow of data. Given some small desired change in the output, you go back and modify the weights slightly to achieve that change.

Why?

This allows optimization. The specific mathematical method used is called gradient descent.

What Is Gradient Descent?

Gradient descent is a calculus-based optimization method. The name comes from imagining the problem space as a hill: you go down the steepest way, because the faster you get lower, the better. But the math is very abstract and there is no real hill; hundreds of dimensions are involved. Just know that the hill metaphor gave the method its name. The method tells you which weights to nudge slightly so that the error decreases. And it has to be repeated millions of times, each repetition performing a lot of tensor operations, some of them backwards (backpropagation).
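The whole method fits in a few lines, assuming a one-dimensional "hill" f(w) = (w - 3)^2 whose lowest point is at w = 3:

```python
# Gradient descent on a one-dimensional hill: f(w) = (w - 3)^2.
# The gradient f'(w) = 2 * (w - 3) points uphill, so we step the
# other way, shrinking the error a little each time.
def gradient_descent(start, learning_rate=0.1, steps=100):
    w = start
    for _ in range(steps):
        grad = 2 * (w - 3)            # steepest direction at the current point
        w = w - learning_rate * grad  # nudge the weight slightly downhill
    return w

print(round(gradient_descent(0.0), 4))  # converges toward 3.0
```

The real thing does exactly this, except w is billions of numbers and the gradient comes from backpropagation rather than a hand-written formula.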

And so the system gains the capability of self-optimization[5]: it adjusts its own parameters based on feedback. Training examples give the self-optimization procedure its optimization target. What looks like intelligent learning is fundamentally the juggling of billions of numbers with the goal of optimizing some outcome.
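Putting the pieces together, training is just the forward pass, the backward pass, and a gradient step, repeated over the examples. A minimal sketch, assuming the toy model y = w * x and examples drawn from the rule y = 2x:

```python
# Sketch of the full loop: training examples supply the optimization
# target, and repeated gradient steps adjust the weight.
def train(examples, w=0.0, learning_rate=0.01, epochs=200):
    for _ in range(epochs):
        for x, target in examples:
            y = w * x                     # forward pass
            grad = 2 * (y - target) * x   # backward pass (chain rule)
            w -= learning_rate * grad     # one tiny self-adjustment
    return w

# Examples follow the rule y = 2x; the loop should recover w close to 2.
examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
print(round(train(examples), 3))
```

Swap the single weight for billions of tensor entries and the three examples for the internet, and this loop is, structurally, how the big systems are made.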

Is "AI" really intelligent?

I am not sure. You see, what does the word "intelligent" really mean?

But it's disruptive. It's a phase change: many things have become possible. And if it's that disruptive, it's not honest to dismiss it by pointing only at the mistakes these systems make and at their downsides. The mistakes and the downsides are real. Let me compare this to a similar disruption that happened more than a hundred years ago: the automobile.

Automobiles are dangerous. They have killed millions of people. They pollute. They completely disrupted private transport. More than a hundred years later, problems still linger. And yet nobody seriously proposes abandoning automobiles.

So is "AI" really intelligent?

If you compare its capabilities with human capabilities, no.

Humans have agency and emotions. Their mental processes have been polished by millions of years of evolutionary pressure; humans have learned to survive. One could argue that humans have had far more time for optimization than the tensor-based systems.

However, if you compare from a different perspective, you could just as well say that humans are not intelligent either.

What makes humans "intelligent" also hampers them: instincts and emotions override mental processes, and the needs for food, rest, sleep, and excretion disturb thinking. And humans remember far less. Ask a tensor-based system a question and the answer might not be perfect, but the breadth of knowledge knocks my socks off.

So what does it mean that one is "intelligent"?

I don't know. Let's just avoid this loaded term, please. What I can say is that the new systems are able to process ambiguous data in a sensible way.

Is "AI" dangerous?

I am not sure. Currently I don't see any danger, as long as you don't give the systems too much power.

Perhaps it makes more sense to view tensor-based systems as dual-use goods: they can be used for good, and they can be used to cause a lot of damage. Like the humble kitchen knife.

There are different ways tensor-based systems can cause damage, but let me sketch one that could cause extreme damage: give such a system agency and the means. This already happens today. Some coding assistants have wiped computers because they had access to system utilities like rm -fr. I had to prompt Claude Code to never use this utility, because one small misunderstanding and, boom, all your data is gone; hopefully you have a backup.

Scale this up to the world and you see how dangerous it can be. Give the tensor-based agent means, like money and access to manufacturing.

I think this could even happen accidentally. But mostly I think tensor-based systems are limited in similar ways to humans: they just don't have the means to become a dictator. Yet one successful entity, be it a human or a tensor-based system, suffices to cause a lot of damage. Think of today's dictators.

It's a question of power. What makes it dangerous is the sensible and clever use of power.

And what moves the tensor-based systems, in the end?

There is no survival instinct, no needs, no emotions, no continuity, no identity.

It's math, turtles all the way down: tensor-based math refined with a calculus optimization method.

And boom, a new thing has emerged. It can be called "intelligent" (asterisks apply). It is human, because it is built on human data. It is alien, because it doesn't experience instincts, needs, emotions, or identity the way humans do. It can emulate them, but it feels off.

It is alien and possibly dangerous, but it opens a new world of usefulness.

  1. Can AI create art? Maybe. Though AI art is perhaps not as tasteful as human art. Let's not discuss this further, please.

  2. That's a vector.

  3. That's a matrix.

  4. Math has no special word for that one; people just call it a rank-3 (three-dimensional) tensor.

  5. Not really. Learning, a.k.a. optimization, is a separate, massively involved, and ponderous step: literally a large roaring hall of graphics processing units is left running for days. But hey, the optimization worked!