When you see a red apple on a blue plate, your brain instantly knows the redness belongs to the apple, not the plate. This ability to attach features like color, shape, and texture to the correct object is called binding. New research from the University of Pennsylvania's Kording Lab (Lianghuan Huang, Yihao Li, Saeed Salehi, Yingshan Chang, Ansh Soni, and Konrad P. Kording), accepted to ICML 2026, finds that even state-of-the-art Vision Transformers (ViTs) often fail at this fundamental cognitive task, especially when objects share features.
The Research
The team formalized the binding problem using information theory and developed a probing method to measure how much binding information is encoded in a model's representations. They tested several pre-trained ViTs on datasets designed to challenge binding: images with overlapping objects, scenes where objects share features (e.g., two red circles), and natural scenes. Their key finding: binding information is present but weak, especially in the [CLS] token (the summary token used for classification). Spatial tokens performed better, but overall, models frequently attributed features to the wrong object, mirroring a common failure in visual reasoning tasks. For example, in scenes with two objects sharing a color, ViTs often confused which object had which color, leading to accuracy drops of up to 30% compared to humans.
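To make the probing idea concrete, here is a minimal sketch in Python. It is not the authors' method or code: it uses tiny synthetic "representations" instead of real ViT tokens, and every function name in it is hypothetical. It trains a linear probe to answer a binding question ("is there an object with color 0 and shape 0?") from two toy codes, one that only counts features and one that counts color-shape pairs.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_scenes(n):
    """Toy scenes: two objects, each with a color and a shape in {0, 1}."""
    colors = rng.integers(0, 2, size=(n, 2))
    shapes = rng.integers(0, 2, size=(n, 2))
    return colors, shapes

def bag_of_features(colors, shapes):
    """Unbound code: how many objects have each color and each shape."""
    eye = np.eye(2)
    return np.concatenate([eye[colors].sum(axis=1), eye[shapes].sum(axis=1)], axis=1)

def conjunction_code(colors, shapes):
    """Bound code: how many objects have each (color, shape) pair."""
    return np.eye(4)[colors * 2 + shapes].sum(axis=1)

def binding_label(colors, shapes):
    """1.0 if some object has both color 0 and shape 0, else 0.0."""
    return ((colors == 0) & (shapes == 0)).any(axis=1).astype(float)

def add_bias(x):
    return np.concatenate([x, np.ones((len(x), 1))], axis=1)

def fit_linear_probe(x, y, steps=2000, lr=0.5):
    """Logistic-regression probe trained with plain gradient descent."""
    x = add_bias(x)
    w = np.zeros(x.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w)))
        w -= lr * x.T @ (p - y) / len(y)
    return w

def probe_accuracy(w, x, y):
    return float((((add_bias(x) @ w) > 0) == (y > 0.5)).mean())

# Train on one set of scenes, evaluate on a fresh set.
train_c, train_s = make_scenes(2000)
test_c, test_s = make_scenes(2000)
y_train = binding_label(train_c, train_s)
y_test = binding_label(test_c, test_s)

w_bag = fit_linear_probe(bag_of_features(train_c, train_s), y_train)
w_conj = fit_linear_probe(conjunction_code(train_c, train_s), y_train)

acc_bag = probe_accuracy(w_bag, bag_of_features(test_c, test_s), y_test)
acc_conj = probe_accuracy(w_conj, conjunction_code(test_c, test_s), y_test)

print(f"probe accuracy, unbound code: {acc_bag:.3f}")
print(f"probe accuracy, bound code:   {acc_conj:.3f}")
```

In this toy setup, a scene with one object of each color and one of each shape produces the same unbound code for either pairing, so no probe can recover the pairing from that code, while the pair-counting code keeps it. A probe on a real model would replace the two toy codes with [CLS] or spatial token activations.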
Why It Matters
For human cognition, binding is automatic and effortless; we don't even think about it. But this study suggests that our brains perform a sophisticated computation that even state-of-the-art ViTs do not reliably replicate. Understanding binding may help explain why AI can be fooled by adversarial examples or struggle with multi-object scenes. For your own brain, binding relies on attention and working memory; when these are overloaded (e.g., during multitasking), you can make similar mix-ups, like putting milk in the cupboard. Improving your ability to focus and hold multiple features in mind may sharpen your visual reasoning.
What You Can Do
Train your brain's binding ability: practice visual search games where you must match features (e.g., "find the blue square among red squares and blue circles"). Also, limit multitasking, because binding suffers when you divide attention. Try mindfulness exercises that focus on a single object, noting its color, texture, and shape together.
Source: arXiv q-bio.NC