DeepSeek-AI researchers present insights from developing DeepSeek-V3, documenting specific hardware constraints and architectural solutions that enable efficient large language model training through innovations in mixed-precision computation, network topology optimization, and memory management while achieving competitive performance with significantly reduced hardware requirements.
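As a point of reference for the mixed-precision theme, the sketch below shows a generic bf16 autocast training step in PyTorch with fp32 master weights; it illustrates mixed-precision training in general, not DeepSeek-V3's FP8 pipeline, and the model and optimizer here are placeholders.

```python
# Minimal mixed-precision training step (PyTorch autocast, bf16 compute with
# fp32 master weights). Illustrative only: DeepSeek-V3's actual recipe uses a
# custom FP8 pipeline with fine-grained scaling, which is not shown here.
import torch
import torch.nn as nn

model = nn.Linear(1024, 1024).cuda()          # weights stay in fp32
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

def train_step(x: torch.Tensor, y: torch.Tensor) -> float:
    optimizer.zero_grad(set_to_none=True)
    # Matmuls run in bf16 inside the autocast region; the loss and gradient
    # accumulation stay in fp32 for numerical stability.
    with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
        loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```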
A quantitative framework models the learning dynamics of continual pre-training (CPT) in large language models, deriving scaling laws that predict performance evolution while accounting for distribution shift and learning-rate effects, enabling principled selection of training parameters for both general and domain-specific tasks.
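For orientation, frameworks of this kind typically build on the standard power-law relation between loss and training data; the CPT-specific terms for distribution shift and learning-rate schedule are the paper's contribution and are not reproduced in this generic form.

```latex
% Standard data-scaling form such analyses build on (illustrative only; the
% paper's CPT-specific law with distribution-shift and LR terms is not shown).
% L(D): loss after training on D tokens, E: irreducible loss, A, \alpha: fit constants.
L(D) \;=\; E + \frac{A}{D^{\alpha}}
```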
BLIP3-o introduces a family of open-source unified multimodal models that combine image understanding and generation capabilities through systematic architecture and training optimization, achieving state-of-the-art performance on multiple benchmarks while providing insights into design choices such as CLIP features versus VAE representations and sequential versus joint training.
Researchers from UCAS and ISCAS introduce an information-theoretic reinforcement fine-tuning framework that optimizes LLM reasoning efficiency by using parameter information gain as dense rewards, achieving 10% higher accuracy while doubling token efficiency compared to standard outcome-reward approaches.
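The sketch below illustrates the general idea of dense reward shaping for reasoning RL: a sparse outcome reward credited at the final token plus a per-token bonus. The `token_bonuses` input is a stand-in for the paper's parameter-information-gain estimator, which is not reproduced here.

```python
# Sketch of dense reward shaping for reasoning RL: a sparse outcome reward for
# the final answer plus a per-token dense bonus. The bonus values are a
# placeholder for an information-gain estimator, not the paper's method.
from typing import List

def shaped_rewards(
    token_bonuses: List[float],   # hypothetical per-token information-gain proxy
    answer_correct: bool,
    outcome_weight: float = 1.0,
    dense_weight: float = 0.1,
) -> List[float]:
    """Return one reward per generated token for a policy-gradient update."""
    rewards = [dense_weight * b for b in token_bonuses]
    # The sparse outcome reward lands on the final token, as in standard
    # outcome-reward RL fine-tuning.
    rewards[-1] += outcome_weight * (1.0 if answer_correct else 0.0)
    return rewards

# Example: a 5-token completion whose final answer is correct.
print(shaped_rewards([0.2, 0.1, 0.4, 0.0, 0.3], answer_correct=True))
```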
The Qwen Team introduces Qwen3, a series of open-source large language models featuring dynamic thinking/non-thinking modes and mixture-of-experts architectures, expanding multilingual coverage to 119 languages, with the flagship Qwen3-235B-A22B achieving competitive performance while activating only 22B of its 235B parameters per token.
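The "activated parameters per token" figure comes from sparse expert routing: each token is processed by only a few experts, so only a fraction of the total parameters run for any given token. The minimal top-k gating layer below illustrates the mechanism and is not Qwen3's implementation.

```python
# Minimal top-k mixture-of-experts layer: each token is routed to k of E
# experts, so only ~k/E of the expert parameters are "activated" for that
# token. Illustrative sketch, not Qwen3's actual architecture.
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    def __init__(self, d_model: int = 256, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList([nn.Linear(d_model, d_model) for _ in range(n_experts)])
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: [tokens, d_model]
        gate_logits = self.router(x)
        weights, idx = gate_logits.topk(self.k, dim=-1)   # pick k experts per token
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                   # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

tokens = torch.randn(16, 256)
print(TopKMoE()(tokens).shape)   # torch.Size([16, 256])
```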
Walmart researchers developed a knowledge distillation framework that transfers semantic understanding capabilities from large language models to a smaller, production-ready model for e-commerce search ranking, achieving comparable performance while meeting strict latency requirements and demonstrating improved user engagement metrics in A/B testing on Walmart.com.
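A typical way to realize such a transfer is temperature-scaled distillation, where the compact student matches the teacher's softened outputs alongside the hard labels; the sketch below is the generic Hinton-style loss, not Walmart's ranking-specific objective.

```python
# Generic temperature-scaled distillation loss: the student matches the
# teacher's softened distribution plus the hard labels. A standard sketch,
# not the production ranking objective described in the paper.
import torch
import torch.nn.functional as F

def distillation_loss(
    student_logits: torch.Tensor,   # [batch, classes]
    teacher_logits: torch.Tensor,   # [batch, classes], teacher scores computed offline
    labels: torch.Tensor,           # [batch]
    temperature: float = 2.0,
    alpha: float = 0.5,             # weight on the soft (teacher) term
) -> torch.Tensor:
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2            # rescale gradients, as in the standard recipe
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

print(distillation_loss(torch.randn(4, 3), torch.randn(4, 3), torch.tensor([0, 1, 2, 0])))
```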
A unified vision-language-action framework called UniVLA enables robots to learn task-centric latent actions from unlabeled videos, achieving state-of-the-art performance across multiple manipulation and navigation benchmarks while reducing pre-training compute requirements by 95% compared to previous methods.
An advanced vision-language model from ByteDance Seed achieves state-of-the-art performance on 38 out of 60 public benchmarks through a three-stage pre-training pipeline and novel data synthesis approaches, demonstrating particularly strong capabilities in GUI control, document understanding, and video comprehension while maintaining a relatively compact architecture.
A comprehensive evaluation of 15 large language models reveals systematic performance degradation (39% average drop) in multi-turn conversations compared to single-turn interactions, with models struggling to maintain context and adapt to new information across conversation turns regardless of temperature settings or conversation granularity.
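Conceptually, the comparison pits a fully specified single-turn prompt against the same task revealed piecewise across turns; the sketch below outlines such a harness, with `ask_model` standing in for whichever chat API is being evaluated.

```python
# Sketch of the single-turn vs multi-turn comparison: the same task is given
# either as one fully specified prompt or sharded across several turns.
# `ask_model` is a placeholder for the chat API under evaluation.
from typing import Callable, List

def run_single_turn(task_shards: List[str], ask_model: Callable[[List[dict]], str]) -> str:
    # All information delivered in a single user message.
    return ask_model([{"role": "user", "content": " ".join(task_shards)}])

def run_multi_turn(task_shards: List[str], ask_model: Callable[[List[dict]], str]) -> str:
    messages: List[dict] = []
    reply = ""
    for shard in task_shards:          # reveal one piece of the task per turn
        messages.append({"role": "user", "content": shard})
        reply = ask_model(messages)
        messages.append({"role": "assistant", "content": reply})
    return reply                        # only the final answer is scored

def relative_drop(correct_single: int, correct_multi: int, n: int) -> float:
    """Relative accuracy drop of the multi-turn condition vs single-turn."""
    return 1.0 - (correct_multi / n) / max(correct_single / n, 1e-9)
```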