Last updated on March 1, 2026
Just thought about a simple method for continual learning:
after training, a neural network has an FxN weight matrix in its first hidden layer and an NxN weight matrix in each following hidden layer;
expand the weight matrix of the first layer from FxN to Fx(N+D);
expand the weight matrix of each subsequent hidden layer from NxN to (N+D)x(N+D);
the weights of the original FxN and NxN submatrices of the hidden layers are frozen (fixed, unchangeable, untrainable) during inference or deployment;
the weights of the added D columns and D rows of the hidden layers are dynamic (changeable, trainable, adjustable) during inference or deployment.
The output of the output layer of the network with expanded hidden-layer matrices can keep the same dimensions as the output of the network before expansion.
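A minimal NumPy sketch of the expand-and-freeze scheme above. All function names, the 0.01 init scale, and the gradient-mask trick are my own illustrative choices, not from the post: the original blocks are copied in unchanged, the new rows/columns start small, and a mask restricts SGD updates to the new weights only.

```python
import numpy as np

def expand_first(W, D, rng):
    # First layer, F x N -> F x (N+D): append D new trainable columns.
    F, N = W.shape
    new_cols = rng.standard_normal((F, D)) * 0.01  # small init (assumption)
    return np.concatenate([W, new_cols], axis=1)

def expand_hidden(W, D, rng):
    # Hidden layer, N x N -> (N+D) x (N+D): original block stays frozen.
    N = W.shape[0]
    W_exp = rng.standard_normal((N + D, N + D)) * 0.01
    W_exp[:N, :N] = W  # frozen pretrained submatrix
    return W_exp

def expand_output(W_out, D, rng):
    # Output layer, N x O -> (N+D) x O: output dimension is unchanged,
    # matching the post; only the D new rows are trainable.
    new_rows = rng.standard_normal((D, W_out.shape[1])) * 0.01
    return np.concatenate([W_out, new_rows], axis=0)

def trainable_mask(shape, n_in, n_out):
    # 1 where weights are new (trainable), 0 on the frozen block.
    mask = np.ones(shape)
    mask[:n_in, :n_out] = 0.0
    return mask

def masked_sgd_step(W, grad, mask, lr=0.1):
    # Only the new rows/columns receive gradient updates.
    return W - lr * grad * mask
```

During deployment, the same masked update rule runs online: any gradient computed on new data leaves the frozen block bit-for-bit identical, which is what makes the scheme compatible with fixed pretrained weights baked into hardware.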
This continual-learning model is simple and as compatible with hardware-chip deployment as today's fixed models. Its effect may be better than branch networks based on an intermediate omnitoken, while being much simpler.
In essence, this just adds additional approximators to each hidden layer, and additional input dimensions to each approximator in the hidden layers after the first.
PS: In fact, you can combine branch networks on an intermediate omnitoken with layer expansion: the branch specializes for an individual user while the main network learns from the data of all users, so every user shares the same main network while each has their own customized learning branch.
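A minimal sketch of this combined scheme, assuming the "branch" is simply a small per-user head that reads a shared intermediate representation (the omnitoken construction itself is not specified in the post, so this only shows the structural idea: shared frozen trunk, per-user trainable branch):

```python
import numpy as np

def shared_forward(x, layers):
    # Frozen main network, identical for all users.
    h = x
    for W in layers:
        h = np.tanh(h @ W)
    return h

class UserBranch:
    """Per-user trainable branch reading the shared intermediate state
    (name and init scale are illustrative assumptions)."""
    def __init__(self, hidden_dim, out_dim, rng):
        self.W = rng.standard_normal((hidden_dim, out_dim)) * 0.01

    def __call__(self, h):
        return h @ self.W

    def sgd_step(self, h, grad_out, lr=0.1):
        # Only the branch weights are updated; the shared trunk is never touched.
        self.W -= lr * np.outer(h, grad_out)
```

Two users then call `shared_forward` with the same `layers` but keep separate `UserBranch` instances, so personalization lives entirely in the per-user weights.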