Can we use RGB (or grayscale) encoded images, represented as text, as input to an LLM to do non-trivial classification tasks?
For example: take an MNIST image, print its sequence of grayscale values as text, and provide that in a prompt to an LLM with the question "What is this a picture of?" (a sketch of this follows below).
If this works, it's feasible to do simple image work, inefficiently, using text-only models.
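
Here is a minimal sketch, in Python, of how such a prompt could be built. The helper names and the quantization scheme are illustrative assumptions, not the repository's actual code, and the LLM call itself is left out since no particular model or API is specified above.

```python
# Sketch: serialize a 28x28 grayscale MNIST image as text for an LLM prompt.
# `image_to_text`, `build_prompt`, and the 10-level quantization are
# hypothetical choices for illustration, not this repo's implementation.
import numpy as np

def image_to_text(image: np.ndarray, levels: int = 10) -> str:
    """Quantize pixel values into `levels` buckets and render rows as text.

    Coarse quantization keeps the prompt short: raw 0-255 values for a
    28x28 image would consume far more tokens than single-digit buckets.
    """
    quantized = (image.astype(float) / 255 * (levels - 1)).round().astype(int)
    return "\n".join(" ".join(str(v) for v in row) for row in quantized)

def build_prompt(image: np.ndarray) -> str:
    grid = image_to_text(image)
    return (
        "Below is a 28x28 grayscale image, one row per line, with pixel "
        "intensity quantized to 0 (black) through 9 (white).\n\n"
        f"{grid}\n\n"
        "What is this a picture of?"
    )

if __name__ == "__main__":
    # Stand-in for a real MNIST sample; in practice, load one with
    # e.g. torchvision or keras.datasets.mnist.
    fake_digit = np.random.randint(0, 256, size=(28, 28), dtype=np.uint8)
    print(build_prompt(fake_digit))
```

The resulting prompt string would then be sent to a text-only model, and its free-text answer compared against the image's true label.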