
Pre- and post-networks for parallel multimodality on intermediate omnitokens in the main network

Last updated on January 16, 2026

Having only one first layer in the main network seems too shallow and thin compared with the left and right hemispheres of the human brain, right? So naturally I thought about the following two points.

First, the omnitoken is the first form of per-token multimodality, which is by nature the first parallel processing of multiple modalities or different functionalities in one neural network (the main network).

Second, in a broader sense an omnitoken can hold not only raw data and final outputs but also the intermediate outputs of multiple modalities or functionalities, placed orthogonally and in parallel within one token. An omnitoken that holds the intermediate outputs of different modalities or functionalities can be called an intermediate omnitoken.

So add a pre-network and a post-network dedicated to each modality or functionality, one before and one after the main network, to increase the number of layers for each modality or functionality. Let the main network process the intermediate omnitokens, each of which holds the intermediate outputs of all modalities or functionalities from all pre-networks, and let the post-networks process the outputs of the main network separately for each modality or functionality.

Taking the example I gave in my last post, we have 4 modalities or functionalities to process, called F1, F2, F3 and F4 hereinafter. The widths (numbers of columns) of F1, F2, F3 and F4 are W1, W2, W3 and W4 respectively.

The main network has Lm layers. Its first layer has Nm = N1+N2+N3+N4+N5 columns, where N1 to N4 are dedicated to F1 to F4 respectively.
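As a rough sketch of this column layout, the snippet below works out which column range of the first main-network layer belongs to each field. The concrete column counts and the helper name `field_slices` are illustrative assumptions, not values from the post. N5 is carried along as one more block without assuming anything about its role.

```python
# Hypothetical column counts; the post does not fix N1..N5.
N = {"F1": 8, "F2": 16, "F3": 4, "F4": 12, "N5": 6}

def field_slices(widths):
    """Map each block name to its (start, stop) column range in the first layer."""
    slices, start = {}, 0
    for name, width in widths.items():
        slices[name] = (start, start + width)
        start += width
    return slices, start  # after the loop, start equals the total width Nm

slices, Nm = field_slices(N)
for name, (lo, hi) in slices.items():
    print(f"{name}: columns {lo}..{hi - 1}")
print("Nm =", Nm)
```

Because the ranges never overlap, each modality keeps its own dedicated columns while still sharing one layer.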

A dedicated pre-network for each of F1, F2, F3 and F4 can be added before the main network to pre-process the data of each modality or functionality, so that its intracorrelation is captured more deeply, accurately and stably. The pre-networks for F1, F2, F3 and F4 are P1, P2, P3 and P4 respectively. The number of columns in the first layer of P1, P2, P3 and P4 is 4 to 8 times W1, W2, W3 and W4 respectively, and you can experiment to find how many layers suit these pre-networks. The input of each pre-network can be the data of the corresponding modality/functionality/table extracted from an omnitoken, or data generated directly from the raw signal.

The pre-networks for all modalities or functionalities work simultaneously and in parallel. Their outputs together form an intermediate omnitoken, which holds the intermediate output of each modality or functionality in its own orthogonal table/field.
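The following is a minimal sketch of the pre-network stage, not a definitive implementation. It assumes each pre-network is a small fully connected stack with ReLU, that its first layer is 4 times the field width (the low end of the 4 to 8 range), and that pre-network Pi outputs Ni values so its field lines up with the Ni columns of the main network. The widths, the random weights standing in for trained ones, and all helper names are hypothetical.

```python
import random

def make_layer(n_in, n_out, rng):
    """Random weight matrix with n_out rows of n_in weights (stand-in for trained weights)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]

def dense_relu(weights, x):
    """One fully connected layer followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]

def make_pre_network(w_in, n_out, expand, rng):
    """Two-layer pre-network: w_in -> expand * w_in -> n_out."""
    hidden = expand * w_in
    return [make_layer(w_in, hidden, rng), make_layer(hidden, n_out, rng)]

def run_network(layers, x):
    for weights in layers:
        x = dense_relu(weights, x)
    return x

W = [3, 5, 2, 4]    # hypothetical field widths W1..W4
N = [8, 16, 4, 12]  # hypothetical per-field output widths N1..N4
rng = random.Random(0)
pre_networks = [make_pre_network(w, n, 4, rng) for w, n in zip(W, N)]

# One raw omnitoken, already split into its four fields.
fields = [[rng.random() for _ in range(w)] for w in W]

# Each pre-network sees only its own field, so the four calls are independent
# and could run in parallel; here they simply run one after another.
outputs = [run_network(p, f) for p, f in zip(pre_networks, fields)]

# Lay the outputs side by side: one field per modality in the intermediate omnitoken.
intermediate_omnitoken = [v for out in outputs for v in out]
print(len(intermediate_omnitoken))
```

No pre-network reads another modality's field, so any cross-modal mixing is left to the main network.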

A dedicated post-network for each of F1, F2, F3 and F4 can likewise be added after the main network to post-process the main network's output for that modality or functionality, adding depth and generating the final output for each one. The final output of the last layer or post-network could be used as input context for autoregressive generation, either separately or in an integrated omnitoken.
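The post-network stage can be sketched in the same spirit. This sketch assumes the main network's output is laid out as four contiguous fields of widths N1 to N4, one per modality, and that each post-network maps its field back to the modality's width Wi. The post does not state either detail. The main-network output is replaced by random numbers, and every name and size here is hypothetical.

```python
import random

def make_layer(n_in, n_out, rng):
    """Random weight matrix with n_out rows of n_in weights (stand-in for trained weights)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]

def dense(weights, x, relu=True):
    """Fully connected layer, optionally followed by ReLU."""
    out = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return [max(0.0, v) for v in out] if relu else out

def split_fields(token, widths):
    """Cut a flat token into consecutive fields of the given widths."""
    fields, start = [], 0
    for width in widths:
        fields.append(token[start:start + width])
        start += width
    return fields

N = [8, 16, 4, 12]  # hypothetical per-field widths of the main-network output
W = [3, 5, 2, 4]    # hypothetical final output widths W1..W4
rng = random.Random(1)

# Stand-in for the output of the last main-network layer.
main_output = [rng.random() for _ in range(sum(N))]

# One post-network per modality: a hidden ReLU layer, then a linear output layer.
post_networks = [(make_layer(n, 4 * w, rng), make_layer(4 * w, w, rng))
                 for n, w in zip(N, W)]

final_outputs = []
for (hidden, out), field in zip(post_networks, split_fields(main_output, N)):
    final_outputs.append(dense(out, dense(hidden, field), relu=False))

# Optionally merge the final outputs into one omnitoken for the next
# autoregressive step; they could also be fed back separately.
next_omnitoken = [v for field in final_outputs for v in field]
print(len(next_omnitoken))
```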

The pre- and post-networks increase the depth of processing for each modality or functionality, much like thickening the cerebral cortex of the left and right hemispheres of the human brain.

Make multimodal not only parallel but also deeper!
