Deep vision models have achieved remarkable classification performance by leveraging a hierarchical architecture in which human-interpretable concepts emerge through the composition of individual neurons across layers. Given the distributed nature of these representations, pinpointing where specific visual concepts are encoded within a model remains a crucial yet challenging task. In this paper, we introduce an effective circuit discovery method, called Granular Concept Circuit (GCC), in which each circuit represents a concept relevant to a given query. To construct each circuit, our method iteratively assesses inter-neuron connectivity, focusing on both functional dependencies and semantic alignment. By automatically discovering multiple circuits, each capturing a specific concept within the query, our approach offers a concept-wise interpretation of models and is, to our knowledge, the first to identify circuits tied to specific visual concepts at a fine-grained level. We validate the versatility and effectiveness of GCCs across various deep image classification models.
Our method uses two scores, a Neuron Sensitivity Score and a Semantic Flow Score, to discover and visualize Granular Concept Circuits.
Our method introduces Granular Concept Circuits (GCCs), a novel approach to interpreting deep vision models. Given a query image, we identify a "circuit" of functionally connected neurons that represents a specific, fine-grained concept. This is a two-step process: we first discover the circuit's functional connections using (a) a Neuron Sensitivity Score, and then ensure that the circuit is semantically interpretable by applying (b) a Semantic Flow Score. This two-score approach is crucial because a connection can have a high functional score but a low semantic score.
For example, starting from the black root node, our method connects to the orange node because it has both a high functional score and a high semantic flow, representing a meaningful concept such as "Fluffy Hair." We discard the connection to the green node: although it has a high functional score, its low semantic flow indicates that it is not a semantically interpretable part of the circuit. The resulting circuits are visualized, providing a clear and interpretable view of the model's internal workings.
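This selection rule can be summarized in code. The following is a minimal sketch, not the released implementation: `sensitivity_score`, `semantic_flow_score`, the thresholds `tau_f` and `tau_s`, and the layer-wise candidate lists are hypothetical stand-ins for the paper's Neuron Sensitivity Score and Semantic Flow Score machinery.

```python
# Sketch of the two-score circuit expansion described above.
# All names and threshold values are illustrative, not the authors' values.
from dataclasses import dataclass, field

@dataclass
class Neuron:
    layer: int
    index: int
    children: list = field(default_factory=list)  # accepted circuit edges

def expand_circuit(root, candidates_per_layer, sensitivity_score,
                   semantic_flow_score, tau_f=0.5, tau_s=0.5):
    """Greedily grow a circuit from `root`, keeping a candidate edge only
    when BOTH the functional score and the semantic flow clear a threshold."""
    frontier = [root]
    while frontier:
        parent = frontier.pop()
        for child in candidates_per_layer.get(parent.layer + 1, []):
            f = sensitivity_score(parent, child)    # functional dependency
            s = semantic_flow_score(parent, child)  # semantic alignment
            if f >= tau_f and s >= tau_s:           # e.g., the orange node
                parent.children.append(child)
                frontier.append(child)
            # edges with high f but low s (the green node) are discarded
    return root
```

Requiring both scores is the key design choice: the functional test keeps the circuit faithful to the model's computation, while the semantic test keeps it human-interpretable.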
Single Query and Multiple Granular Concept Circuits
Model-agnostic Circuit Extraction
Multiple Queries with Different Classes and Common GCCs
Our method can be used to diagnose and explain why a model makes a specific misclassification.
The top bar plots show the logit gain when Granular Concept Circuits (GCCs) derived from a misclassified "Schipperke" image are either inhibited or stimulated; the model incorrectly classified this image as a "Soccer ball."
The bottom panel visualizes the most influential GCC, highlighting concepts related to both the incorrect prediction (soccer ball texture) and the correct one (black objects). This demonstrates how our method can pinpoint the specific circuits responsible for the model's error.
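The intervention itself can be reproduced with standard PyTorch forward hooks. The sketch below is illustrative: the `(module, channel)` circuit format, the `intervene_on_circuit` helper, and the scaling interface are assumptions for exposition, not the paper's released API.

```python
# Illustrative sketch of inhibiting (scale=0) or stimulating (scale>1)
# a circuit's neurons and reading off the resulting class logit.
import torch

def intervene_on_circuit(model, image, circuit, scale, class_idx):
    """Scale the activation of every (module, channel) pair in `circuit`
    during one forward pass and return the logit of `class_idx`."""
    handles = []
    for module, channel in circuit:
        def hook(mod, inputs, output, ch=channel):
            output = output.clone()
            output[:, ch] *= scale  # suppress or amplify this neuron
            return output
        handles.append(module.register_forward_hook(hook))
    try:
        with torch.no_grad():
            logits = model(image.unsqueeze(0))
    finally:
        for h in handles:
            h.remove()  # always restore the unmodified model
    return logits[0, class_idx].item()

# Logit gain as in the bar plots (baseline = unmodified forward pass):
# gain = intervene_on_circuit(model, img, gcc, 0.0, soccer_ball_idx) - baseline
```

Comparing the logit gain of the predicted class ("Soccer ball") against the true class ("Schipperke") under inhibition and stimulation reveals which circuits drive the error.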
Average Logit Drop After Ablating Neurons
Insertion and Deletion Game
User Study Responses
@inproceedings{kwon2025granular,
  title     = {Granular Concept Circuits: Toward a Fine-Grained Circuit Discovery for Concept Representations},
  author    = {Dahee Kwon and Sehyun Lee and Jaesik Choi},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year      = {2025},
  pages     = {to appear}
}