During my two-month stay at the IVIA lab at ETH Zurich as part of the Artist in Labs residency program, I developed a prototype for sonifying the ModernBERT model. The project turns text into sound and music, using the FMOD middleware, RNBO->FMOD plugins for sound generation, and the Unity engine.
You can enter sentences and explore the individual tokens and sentences across different layers. Token embeddings are mapped directly: their high-value dimensions drive an overtone spectrum. The sentence embedding (the CLS token) is treated more freely: its averaged values are mapped onto a harmonic scale played by four voices.
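To make the token mapping concrete, here is a minimal sketch of the idea of sending high-value embedding dimensions to an overtone spectrum. All names and parameters (the dimension-to-harmonic rule, `fundamental_hz`, `top_k`) are illustrative assumptions, not the prototype's actual code, which runs inside RNBO/FMOD:

```python
def embedding_to_overtones(embedding, fundamental_hz=110.0, top_k=16):
    """Map the strongest embedding dimensions to an overtone spectrum.

    Hypothetical rule: dimension index i becomes the (i+1)-th harmonic of
    the fundamental, and the dimension's magnitude (normalized to the
    loudest partial) becomes that harmonic's amplitude.
    Returns a list of (frequency_hz, amplitude) pairs.
    """
    # Rank dimensions by absolute value and keep the top_k strongest.
    ranked = sorted(range(len(embedding)),
                    key=lambda i: abs(embedding[i]), reverse=True)[:top_k]
    peak = max(abs(embedding[i]) for i in ranked) or 1.0
    # Emit partials in ascending frequency order.
    return [((i + 1) * fundamental_hz, abs(embedding[i]) / peak)
            for i in sorted(ranked)]

# Toy 8-dimensional "embedding" standing in for a real token vector:
partials = embedding_to_overtones(
    [0.1, -0.9, 0.05, 0.7, 0.0, 0.3, -0.2, 0.6], top_k=4)
# The strongest dimension (index 1) sounds the loudest partial at 220 Hz.
```

The same shape of mapping could feed a bank of sine oscillators, with each `(frequency, amplitude)` pair driving one partial.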
To interact with the prototype:
- Enter a sentence at the bottom to add it to the scene.
- Hover over a word to highlight its sentence and hear its embedding.
- Use the arrow keys to explore the 22 layers of the ModernBERT model (up/down) and switch between sentences (left/right).
- On mobile? Tilt your phone horizontally for the best experience.