
A better study route for AI models

Last updated on May 4, 2026

I just implemented a one-layer neural network with back propagation training in raw Python code without functions, and I now have a much better understanding of neural networks and transformers. For that I should thank the Chinese uploader “跟达叔一起学AI”, who teaches AI basics on bilibili.com. Later, I may try to write a basic transformer myself using PyTorch.

Drawing on my own experience, I’d like to suggest a route for outsiders to learn the most fundamental and useful knowledge of AI, especially since many people introduce AI from an unsuitable angle such as neurons in the brain, statistics, or some outdated models.

(PS: By the way, an AI model is formed by 4 digital parts: model (weights) architecture, data (token) structure, training data, and training methodology.

The route below covers only the 4 fundamental bases of present AI models.

AI is the most powerful tool that humanity has ever created, but an AI model is built from very intuitive and simple math and ideas, which anyone can understand and handle easily.)

The best study route for AI models, as I see it, is as follows:

1. UAT

UAT (universal approximation theorem) is not only the most basic underlying mathematical theory for all neural networks but also has profound implications for how we understand the universe: all logic/causality can not only be approximated by, but also be broken down into, some combination of the simplest logic/causality.

UAT is simple: y_hat = sum_n(c_n * activation(a_n * x + b_n) + d_n). But it is profound as the core idea of neural networks; that core idea is not neurons or statistics.
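As a rough illustration of the formula above, here is a minimal sketch in plain Python. The function names `relu` and `uat_sum` and the hand-picked coefficients are my own assumptions for the example, not part of any library. With two ReLU units the sum reproduces |x| exactly, which shows how a combination of very simple pieces can build a less simple function.

```python
def relu(z):
    """Simple activation: pass positive values, clip negatives to zero."""
    return z if z > 0.0 else 0.0


def uat_sum(x, a, b, c, d, activation=relu):
    """y_hat = sum over n of (c_n * activation(a_n * x + b_n) + d_n)."""
    return sum(
        c_n * activation(a_n * x + b_n) + d_n
        for a_n, b_n, c_n, d_n in zip(a, b, c, d)
    )


# Hand-picked (not trained) coefficients: |x| = relu(x) + relu(-x)
a = [1.0, -1.0]
b = [0.0, 0.0]
c = [1.0, 1.0]
d = [0.0, 0.0]

for x in (-2.0, 0.0, 3.5):
    print(x, uat_sum(x, a, b, c, d))
```

In a real network the coefficients a_n, b_n, c_n, d_n are not chosen by hand but learned from data.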

In nature, all neural networks are APPROXIMATOR functions, just like a Taylor series or linear regression.

Neural networks are NOT statistics in nature; they can simply approximate a probability output through softmax.
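To make the softmax step concrete, here is a small sketch in plain Python. The function name `softmax` and the example scores are assumptions for illustration. It turns arbitrary real-valued outputs into positive numbers that sum to 1, which is why the output can be read as a probability distribution.

```python
import math


def softmax(scores):
    """Map raw scores to positive values that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


raw_outputs = [1.0, 2.0, 3.0]  # made-up network outputs
probs = softmax(raw_outputs)
print(probs, sum(probs))
```

The largest raw score always gets the largest share, but every option keeps a non-zero value.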

2. Back propagation

Back propagation is critical for understanding the training and learning mechanism of a neural network. It is simple in idea (the chain rule carries the output error back to each weight as a gradient) but has important implications too.
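The idea can be sketched in plain Python as below. This is only a minimal example under my own assumptions, not the exact code I wrote: it uses one hidden layer of 8 tanh units, a single output bias d, a toy target y = x^2, full-batch gradient descent, and hypothetical names such as `forward`, `mean_loss` and `train`.

```python
import math
import random

random.seed(0)
H = 8  # number of hidden units (assumed for this sketch)
a = [random.uniform(-1, 1) for _ in range(H)]  # input weights
b = [random.uniform(-1, 1) for _ in range(H)]  # hidden biases
c = [random.uniform(-1, 1) for _ in range(H)]  # output weights
d = 0.0                                        # output bias

xs = [i / 10.0 for i in range(-10, 11)]  # toy inputs in [-1, 1]
ys = [x * x for x in xs]                 # toy target: y = x^2


def forward(x):
    """Forward pass: y_hat = sum_n c_n * tanh(a_n * x + b_n) + d."""
    h = [math.tanh(a[n] * x + b[n]) for n in range(H)]
    return h, sum(c[n] * h[n] for n in range(H)) + d


def mean_loss():
    """Mean of 0.5 * (y_hat - y)^2 over the toy data set."""
    return sum(0.5 * (forward(x)[1] - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(epochs=500, lr=0.05):
    global d
    n_samples = len(xs)
    for _ in range(epochs):
        ga, gb, gc, gd = [0.0] * H, [0.0] * H, [0.0] * H, 0.0
        for x, y in zip(xs, ys):
            h, y_hat = forward(x)
            err = y_hat - y  # dLoss/dy_hat
            gd += err
            for n in range(H):
                gc[n] += err * h[n]
                # chain rule back through tanh: d tanh(z)/dz = 1 - tanh(z)^2
                dz = err * c[n] * (1.0 - h[n] ** 2)
                ga[n] += dz * x
                gb[n] += dz
        # gradient descent step using the averaged gradients
        for n in range(H):
            a[n] -= lr * ga[n] / n_samples
            b[n] -= lr * gb[n] / n_samples
            c[n] -= lr * gc[n] / n_samples
        d -= lr * gd / n_samples


loss_before = mean_loss()
train()
loss_after = mean_loss()
print("loss before:", loss_before, "loss after:", loss_after)
```

The error at the output is computed once per sample and then reused for every weight, which is what makes back propagation cheap.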

3. Transformer

Transformer is for now the best improvement on, and model of, the neural network. It changes the original NN of y = f(x), which predicts one y from one vector x, into y = f(x_1, x_2, …, x_n), which predicts one or more y from multiple vectors x, with delicately designed correlation among the input vectors in context through QKV (query, key, value) attention and a sine/cosine positional encoding mechanism.
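Here is a minimal sketch of these two mechanisms in plain Python: single-head, unmasked, scaled dot-product attention plus sine/cosine positional encoding. The helper names (`positional_encoding`, `matvec`, `self_attention`), the random weight matrices and the tiny sizes are all assumptions for illustration; a real transformer learns Wq, Wk and Wv during training.

```python
import math
import random


def positional_encoding(pos, d_model):
    """Sine on even indices, cosine on odd indices, one frequency per pair."""
    pe = []
    for i in range(d_model):
        angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
        pe.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return pe


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(xs, Wq, Wk, Wv):
    """Each output vector is a weighted mix of the value vectors of all inputs."""
    Q = [matvec(Wq, x) for x in xs]
    K = [matvec(Wk, x) for x in xs]
    V = [matvec(Wv, x) for x in xs]
    d_k = len(K[0])
    outputs = []
    for q in Q:
        # correlation of this query with every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        outputs.append(
            [sum(weights[j] * V[j][t] for j in range(len(V))) for t in range(len(V[0]))]
        )
    return outputs


random.seed(0)
d_model, n_tokens = 4, 3  # tiny sizes, assumed for the example


def rand_matrix():
    return [[random.uniform(-1, 1) for _ in range(d_model)] for _ in range(d_model)]


Wq, Wk, Wv = rand_matrix(), rand_matrix(), rand_matrix()
embeddings = [[random.uniform(-1, 1) for _ in range(d_model)] for _ in range(n_tokens)]

# add position information to each input vector before attention
inputs = [
    [e + p for e, p in zip(emb, positional_encoding(pos, d_model))]
    for pos, emb in enumerate(embeddings)
]
out = self_attention(inputs, Wq, Wk, Wv)
print(out)
```

Every output row depends on all input vectors at once, which is the y = f(x_1, x_2, …, x_n) behaviour described above.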

There are also other important techniques such as rotary positional embedding, multiple attention heads, softmax, residual connections, and normalization.
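Two of these techniques, the residual connection and normalization, can be sketched in a few lines of plain Python. The names `layer_norm`, `residual_block` and `toy_sublayer` are hypothetical, the learnable scale and shift of a real layer normalization are left out, and the sublayer is a stand-in for attention or a feed-forward layer.

```python
import math


def layer_norm(v, eps=1e-5):
    """Normalize a vector to zero mean and roughly unit variance."""
    mean = sum(v) / len(v)
    var = sum((x - mean) ** 2 for x in v) / len(v)
    return [(x - mean) / math.sqrt(var + eps) for x in v]


def residual_block(x, sublayer):
    """Residual connection: add the sublayer output back onto its input, then normalize."""
    y = sublayer(x)
    return layer_norm([xi + yi for xi, yi in zip(x, y)])


def toy_sublayer(v):
    """Hypothetical stand-in for attention or a feed-forward layer."""
    return [0.5 * x for x in v]


x = [1.0, 2.0, 3.0, 4.0]
out = residual_block(x, toy_sublayer)
print(out)
```

The residual path lets the original input pass through unchanged next to the sublayer output, and the normalization keeps the values in a stable range from layer to layer.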

Transformer not only dominates text LLMs but is also, for now, the best-matched or most compatible model for physics pixel, which makes the other previously popular models outdated.

4. Physics pixel (and omnitoken)

Physics pixel lets a model perceive and understand the real world and its physical nature. Along with omnitoken, or combining hidden states along the column/feature dimension, physics pixel can work with multiple modalities including text, control, and agents.

Physics pixel is just like UAT and backpropagation in that they are all simple and intuitive to understand and apply. And just as with UAT and backpropagation, I can’t see any possible replacement for physics pixel in the foreseeable future.
