Last updated on October 30, 2025
A self-recursive training procedure that uses labeled seed frames (or flows) and unlabeled raw original-signal frames (or flows) works as follows:
First, pretrain the model (the prime model with physics pixel) on labeled seed frames (or flows) of visual signal, of another sensor signal, or of a combination of different sensor signals.
The labeled seed frames cover typical scenes containing as many kinds of objects as possible. They contain the original sensor signal and are labeled with selected physics parameters of the physics pixel, such as object name (near/far), 3D point cloud, and the mapping relationship for fusing different kinds of sensor signals.
When training on labeled frames (including the seed frames), use the difference between all predicted data of the predicted next frames (predicted signal and predicted physics parameters) and all data of the corresponding labeled original next frames (original signal and labeled data) to adjust the weights; this differs from step 3) of the second phase below, which compares only the signal.
This pretraining may also include text training that covers the hierarchical classes of objects and other typical text.
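To make the phase-one objective concrete, here is a minimal sketch of one supervised pretraining step, assuming a PyTorch-style model whose forward pass returns a predicted next-frame signal plus a dictionary of predicted physics parameters; the model interface, the loss choices, and the label keys (object_class, point_cloud, fusion_map) are illustrative assumptions rather than part of the original description.

```python
import torch
import torch.nn.functional as F

def pretrain_step(model, optimizer, seed_frames, next_frames, next_labels):
    """One supervised pretraining step on labeled seed frames (phase one).

    seed_frames : tensor with the original signal of the present N frames
    next_frames : tensor with the original signal of the labeled next frames
    next_labels : dict of labeled physics parameters for the next frames,
                  e.g. {"object_class": ..., "point_cloud": ..., "fusion_map": ...}
    """
    pred_signal, pred_physics = model(seed_frames)   # hypothetical interface

    # Phase one differs from phase two: the loss covers *all* predicted data,
    # i.e. the raw signal plus every labeled physics parameter.
    loss = F.mse_loss(pred_signal, next_frames)
    loss = loss + F.cross_entropy(pred_physics["object_class"],
                                  next_labels["object_class"])
    loss = loss + F.mse_loss(pred_physics["point_cloud"],
                             next_labels["point_cloud"])
    loss = loss + F.mse_loss(pred_physics["fusion_map"],
                             next_labels["fusion_map"])

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```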
Second, train the model on a huge (or even endless) flow of unlabeled original-signal frames from visual, other, or combined sensor signals (a code sketch of this loop follows the list below):
1) based on the pretraining, let the model estimate all selected physics parameters of the physics pixels from the original signal in the present N frames of a flow; the selected physics parameters include the object name (near/far), 3D point cloud, the mapping relation for fusing different kinds of sensor signals, etc.;
2) let the model predict all data of the next L frames after the present N frames, based on all data of the present N frames, where all data means the signal plus the selected physics parameters; L can be 1 or more;
3) use the difference between the real original signal of the next L frames and the predicted signal of those L frames to adjust the weights of the model (e.g., gradient-based optimization that uses the partial derivative with respect to each weight to move the prediction toward the real frames);
4) let the adjusted model re-predict all data of the next L frames based on all data of the present N frames; if the re-predicted signal converges to the real signal of the next L frames, go to step 5); if it does not converge, go back to step 3); after a set number of iterations of step 3) on the same next L frames, go to step 5) anyway;
5) shift the window of the present N frames forward by up to L frames to form the new present N frames and go back to step 2).
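Here is a minimal sketch of this second-phase loop, assuming a PyTorch-style model whose forward pass maps the present N frames to a predicted signal and predicted physics parameters for the next L frames; the tensor layout, the MSE loss, the convergence threshold, and the function name self_recursive_phase are illustrative assumptions, not part of the original description.

```python
import torch
import torch.nn.functional as F

def self_recursive_phase(model, optimizer, frame_stream, N, L,
                         max_inner_iters=8, tol=1e-3):
    """Run steps 1)-5) over an unlabeled original-signal flow.

    frame_stream : tensor of shape (T, ...) holding raw signal frames;
                   the model is assumed to accept (N, ...) and return
                   (predicted_signal, predicted_physics_params).
    """
    t = 0
    while t + N + L <= frame_stream.shape[0]:
        present = frame_stream[t : t + N]            # present N frames (signal only)
        real_next = frame_stream[t + N : t + N + L]  # real original signal of next L frames

        for _ in range(max_inner_iters):
            # steps 1)-2): estimate physics parameters and predict all data
            # of the next L frames from the present N frames
            pred_signal, _pred_physics = model(present)

            # step 3): only the predicted *signal* is compared with the real
            # original signal; the physics parameters have no labels here
            loss = F.mse_loss(pred_signal, real_next)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            # step 4): stop iterating on this window once the prediction is
            # already close enough to the real next frames
            if loss.item() < tol:
                break

        # step 5): shift the window forward by up to L frames
        t += L
```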
During the second phase, periodically reinforce or anchor the model with the pretraining data or other labeled training data.
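One way to realize this anchoring, as a sketch: every anchor_every unlabeled windows, run one supervised step on a batch drawn from the labeled seed set, reusing the hypothetical pretrain_step from the earlier sketch; the function name and the loader interface are assumptions.

```python
def maybe_anchor(window_idx, anchor_every, model, optimizer, labeled_batches):
    """Periodically replay labeled data during the second phase (anchoring)."""
    if window_idx % anchor_every == 0:
        # labeled_batches yields (seed_frames, next_frames, next_labels) tuples
        seed_frames, next_frames, next_labels = next(labeled_batches)
        pretrain_step(model, optimizer, seed_frames, next_frames, next_labels)
```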
This training may be used for the prime model with physics pixel, whether that model is based on a transformer, on diffusion, or on a hybrid of the two.
If the original-signal training data is huge enough, a single backpropagation pass per window may outperform the recursive iteration of step 3) in marginal effect, but the recursion shares the same nature as diffusion in abstracting finer, subtler patterns from the same data, so you may need to balance, trade off, or experiment with the number of recursive iterations.
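To explore that trade-off empirically, one could sweep the number of inner iterations in the second-phase sketch above; the iteration counts, optimizer, window sizes, and the hypothetical pretrained_model and frame_stream variables below are arbitrary assumptions for illustration.

```python
import copy
import torch

# Sweep the number of recursive iterations per window: 1 reduces each window
# to a single backpropagation pass; larger values refine toward the real frames.
for inner_iters in (1, 4, 16):
    trial_model = copy.deepcopy(pretrained_model)   # hypothetical pretrained model
    optimizer = torch.optim.Adam(trial_model.parameters(), lr=1e-4)
    self_recursive_phase(trial_model, optimizer, frame_stream,
                         N=8, L=1, max_inner_iters=inner_iters)
    # compare held-out next-frame prediction error across settings (not shown)
```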