All human speakers, all humans, all sounds, all human and computer language encodings for an open Internet
Beomseok LEE (@beomseok_lee_): How do we bridge Speech Encoders and LLMs?
My reply:
You can aim to write an encoder for the speech sounds of all human languages, so that every spoken sound is recognized and encoded in a smaller set of generated voices. That way, the content of the speech matches things in the real world.
In parallel, take the LLM encodings and map the existing text to meaning, particularly for science, technology, engineering, mathematics, computing, finance, government, organizations, trade, production, issues, and other things on the Internet and in human society.
Turn strings that encode dates into unambiguous data codes, place names into place codes, and human names into name codes. Do the same for all the things that exist in databases: today those are not standardized, and so they are ambiguous for data exchange and sharing in a global Internet society that will soon expand into the solar system and beyond. Take all the values and units, the equations, methods and algorithms, and distill out the core. Do this NOT by creating single points of failure, manipulation and monopolies, but by making all knowledge accessible and usable by all humans and their assistive intelligences (AIs).
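
As a small illustration of turning date strings and place names into unambiguous codes, here is a minimal sketch in Python. It assumes ISO 8601 as the date code and ISO 3166-1 alpha-2 as the place code, which is my choice for the example and only one of many possible standards. The names `KNOWN_DATE_FORMATS`, `PLACE_CODES`, `encode_date` and `encode_place` are hypothetical, and the tiny lookup table stands in for what would really be a large, openly shared registry.

```python
from datetime import datetime

# Hypothetical list of accepted input formats. A real system would need many
# more, plus an explicit policy for ambiguous day/month order such as 03/04/2024.
# Month names (%B) are parsed according to the current locale (English by default).
KNOWN_DATE_FORMATS = ["%Y-%m-%d", "%d %B %Y", "%B %d, %Y"]

# Hypothetical lookup table from free-text place names to
# ISO 3166-1 alpha-2 country codes.
PLACE_CODES = {
    "south korea": "KR",
    "republic of korea": "KR",
    "united states": "US",
    "usa": "US",
}


def encode_date(text):
    """Return an ISO 8601 date string (YYYY-MM-DD), or None if unrecognized."""
    cleaned = text.strip()
    for fmt in KNOWN_DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue  # try the next known format
    return None


def encode_place(text):
    """Return a place code for a known place name, or None if unknown."""
    return PLACE_CODES.get(text.strip().lower())


if __name__ == "__main__":
    for raw in ["7 March 2024", "March 7, 2024", "2024-03-07"]:
        print(raw, "->", encode_date(raw))
    for raw in ["South Korea", "USA"]:
        print(raw, "->", encode_place(raw))
```

Returning `None` for anything unrecognized, instead of guessing, keeps the ambiguity visible so that a human or an assistive intelligence can resolve it rather than have it silently hidden.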