The fixed-dynamic mixed networks based on branch networks can be described simply as:
a main network with branch networks before it (pre branch networks) or behind it (post branch networks) or both;
wherein one input (a vector or a token) of the mixed networks includes multiple parallel subinputs (subvectors or subtokens), each for one of the pre branch networks separately, wherein the input and subinputs all have fixed dimensions;
wherein one part of all the weights of the mixed networks is kept unchanged during inference or working after training of the mixed networks, wherein this part of all the weights is hereinafter called the fixed weights;
wherein another part of all the weights of the mixed networks is changeable or adjustable during inference or working after training of the mixed networks, wherein this other part of all the weights is hereinafter called the dynamic weights;
wherein a part of the output of the mixed networks is generated by a part of the input of the mixed networks computed with weights all of which are fixed weights;
wherein another part of the output of the mixed networks is generated by a part of the input of the mixed networks computed with weights some of which are dynamic weights.
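The structure above can be sketched in plain NumPy. This is a minimal illustration, not the author's implementation: the dimensions, the split into two sub-inputs, and all variable names (`W_pre_fixed`, `W_pre_dyn`, etc.) are hypothetical, and the "fixed" versus "dynamic" distinction is shown only by which arrays would be frozen after training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fixed dimensions: two parallel sub-inputs, each feeding one pre branch network.
SUB_DIM, HIDDEN, OUT_DIM = 4, 8, 6

# Fixed weights: kept unchanged during inference after training.
W_pre_fixed = rng.normal(size=(SUB_DIM, HIDDEN))   # a fixed pre branch network
W_main_fixed = rng.normal(size=(HIDDEN, OUT_DIM))  # a fixed slice of the main network

# Dynamic weights: still adjustable during inference/working after training.
W_pre_dyn = rng.normal(size=(SUB_DIM, HIDDEN))     # a dynamic pre branch network
W_main_dyn = rng.normal(size=(HIDDEN, OUT_DIM))    # a dynamic slice of the main network

def forward(x):
    # One input of fixed dimension, split into two parallel sub-inputs.
    sub_a, sub_b = x[:SUB_DIM], x[SUB_DIM:]
    # Part of the output computed with fixed weights only.
    out_fixed = np.tanh(sub_a @ W_pre_fixed) @ W_main_fixed
    # Part of the output whose path includes dynamic weights.
    out_dyn = np.tanh(sub_b @ W_pre_dyn) @ W_main_dyn
    return np.concatenate([out_fixed, out_dyn])

x = rng.normal(size=2 * SUB_DIM)
y = forward(x)
print(y.shape)
```

In this sketch the two output slices correspond to the two "wherein" clauses above: one slice depends only on fixed weights, the other on a path that includes dynamic weights.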
By the way, these fixed-dynamic mixed networks may need some chip-level hardware support.
These fixed-dynamic mixed networks are especially well suited to continually learning AI models on device or at the edge.
In training such a fixed-dynamic mixed network, you may need to obtain good differential coefficients (gradients) for the fixed part of the network, so that the fixed part transmits the gradient smoothly and efficiently later, during backpropagation in the on-device learning that continues after the initial training.
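A minimal sketch of that gradient path, again in plain NumPy and with hypothetical names and dimensions: a dynamic pre branch feeds a fixed main-network slice, and the frozen `W_fixed` still transmits the gradient upstream even though only `W_dyn` is ever updated. This is an assumed linear-layer example, not the author's training procedure.

```python
import numpy as np

rng = np.random.default_rng(1)
D_IN, HID, D_OUT = 4, 8, 3

# Dynamic weights: adjusted on-device after deployment.
W_dyn = rng.normal(size=(D_IN, HID)) * 0.1
# Fixed weights: frozen, but they must still pass gradients upstream.
W_fixed = rng.normal(size=(HID, D_OUT))
W_fixed_snapshot = W_fixed.copy()

x = rng.normal(size=D_IN)
target = rng.normal(size=D_OUT)
lr = 0.005

def loss():
    return float(np.sum(((x @ W_dyn) @ W_fixed - target) ** 2))

loss_before = loss()
for _ in range(500):
    h = x @ W_dyn                      # dynamic pre branch network
    err = h @ W_fixed - target         # dL/dy for squared error (up to a factor of 2)
    grad_h = err @ W_fixed.T           # the fixed part transmits the gradient upstream
    W_dyn -= lr * np.outer(x, grad_h)  # only the dynamic weights are updated
loss_after = loss()
print(loss_before, "->", loss_after)
```

The loss falls while `W_fixed` stays bit-for-bit identical: the fixed weights participate in backpropagation as a gradient conduit without themselves changing, which is the property the fixed part needs for efficient continual learning.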