
Omnitoken is the base for real reasoning in autonomous driving and robotics

Lots of people are talking about reasoning in autonomous driving. Reasoning is even more crucial for robotics, because robotics involves much higher-level and far more difficult interaction with the real world. As I see it, Omnitoken can be the very base for real reasoning in autonomous driving and robotics because of its multimodal text capability.

Just as with humans, high-level or more complicated reasoning needs to be based on text or language, not just a visual sense of the intermediate steps. That demands real multimodal capability from the model for autonomous driving and robotics, and that is exactly what Omnitoken brings: orthogonal tables in a single token for text, physics pixels, objects, sensors, controls, agents and more.
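To make that idea concrete, here is a minimal, purely illustrative PyTorch sketch of what a single token carrying orthogonal per-modality tables could look like: each field (text, pixel, object, sensor, control, agent) gets its own embedding table, and the sub-embeddings are concatenated into one token vector. The field names, vocabulary sizes, and concatenation scheme are my own assumptions for illustration, not the actual Omnitoken design.

```python
import torch
import torch.nn as nn


class OmniTokenEmbedding(nn.Module):
    """Hypothetical sketch: one token position carries orthogonal sub-fields,
    each looked up in its own table and concatenated into a single vector.
    Field names and vocabulary sizes are illustrative, not the Omnitoken spec."""

    FIELDS = {"text": 50000, "pixel": 4096, "object": 1024,
              "sensor": 512, "control": 256, "agent": 128}

    def __init__(self, dim_per_field: int = 128):
        super().__init__()
        self.tables = nn.ModuleDict({
            name: nn.Embedding(vocab, dim_per_field)
            for name, vocab in self.FIELDS.items()
        })

    def forward(self, token: dict) -> torch.Tensor:
        # `token` maps each field name to an index tensor of shape (batch, seq)
        parts = [self.tables[name](token[name]) for name in self.FIELDS]
        # Concatenate the per-field embeddings into one "omnitoken" vector.
        return torch.cat(parts, dim=-1)  # (batch, seq, 6 * dim_per_field)


# Toy usage: random indices for every field at 16 positions in a batch of 2.
emb = OmniTokenEmbedding()
batch = {name: torch.randint(0, vocab, (2, 16))
         for name, vocab in OmniTokenEmbedding.FIELDS.items()}
x = emb(batch)  # shape (2, 16, 768)
```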

Omnitoken is also a natural match for the transformer's multi-head output in the last layer, and at the same time it demands very high capacity (many columns) in the transformer's first layer to contain it.
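One way to read that last point, again only as a sketch under my own assumptions, is that the final transformer layer fans out into one prediction head per orthogonal field, mirroring the wide per-field embedding tables on the input side from the sketch above:

```python
import torch
import torch.nn as nn


class OmniTokenOutputHeads(nn.Module):
    """Hypothetical per-field output layer: the last hidden state is projected
    by a separate linear head for each orthogonal field, so the model predicts
    the next text, pixel, control, etc. sub-token jointly. Sizes are assumed."""

    def __init__(self, hidden_dim: int, fields: dict):
        super().__init__()
        self.heads = nn.ModuleDict({
            name: nn.Linear(hidden_dim, vocab) for name, vocab in fields.items()
        })

    def forward(self, hidden: torch.Tensor) -> dict:
        # hidden: (batch, seq, hidden_dim) from the last transformer layer
        return {name: head(hidden) for name, head in self.heads.items()}


# Toy usage with the same illustrative fields as above.
fields = {"text": 50000, "pixel": 4096, "object": 1024,
          "sensor": 512, "control": 256, "agent": 128}
heads = OmniTokenOutputHeads(hidden_dim=768, fields=fields)
logits = heads(torch.randn(2, 16, 768))  # e.g. logits["control"]: (2, 16, 256)
```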

And I found it quite bizarre when Grok told me that before Omnitoken nobody had put different multimodal data into one single token, and that Omnitoken is in fact different from all earlier tokens, haha.

Tesla’s present FSD v14.2.2 may be able to handle some rare cases, but still in an animal-like sense or way, because FSD doesn’t include text or other multimodal input yet, as Grok confirmed to me. So FSD cannot read text on the road or on signs the way a human does; it deals with them in some analog sense, which cannot realize real Level 4/5 autonomous driving, not to mention robotics. I guess Nvidia’s solution is in the same situation now.

So you guys need to improve hardware capabilities to achieve multimodality through Omnitoken for your present autonomous driving and future robotics.
