Deep vision models have achieved remarkable classification performance by leveraging a hierarchical architecture in which human-interpretable concepts emerge through the composition of individual neurons across layers. Given the distributed nature of these representations, pinpointing where specific visual concepts are encoded within a model remains a crucial yet challenging task. In this paper, we introduce an effective circuit discovery method, called Granular Concept Circuit (GCC), in which each circuit represents a concept relevant to a given query. To construct each circuit, our method iteratively assesses inter-neuron connectivity, focusing on both functional dependencies and semantic alignment. By automatically discovering multiple circuits, each capturing a specific concept within the query, our approach offers a concept-wise interpretation of models and is, to our knowledge, the first to identify circuits tied to specific visual concepts at a fine-grained level. We validate the versatility and effectiveness of GCCs across various deep image classification models.
Our method uses two scores, a Neuron Sensitivity Score and a Semantic Flow Score, to discover and visualize Granular Concept Circuits.
Our method introduces Granular Concept Circuits (GCCs), a novel approach to interpreting deep vision models. Given a query image, we identify a "circuit" of functionally connected neurons that represents a specific, fine-grained concept. This is a two-step process: we first discover the circuit's functional connections using (a) a Neuron Sensitivity Score, and then ensure that the circuit is semantically interpretable by applying (b) a Semantic Flow Score. This two-score approach is crucial because a connection can have a high functional score but a low semantic score.
For example, starting from the black root node, our method connects to the orange node because it has both a high functional score and a high semantic flow, representing a meaningful concept such as "Fluffy Hair." We discard the connection to the green node: although it has a high functional score, its low semantic flow indicates that it is not a semantically interpretable part of the circuit. The resulting circuits are visualized, providing a clear and interpretable view of the model's internal workings.
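This selection rule can be summarized in code. The following is a minimal sketch, not the released implementation: `sensitivity_score`, `semantic_flow_score`, the thresholds `tau_f` and `tau_s`, and the layer-wise candidate lists are hypothetical stand-ins for the paper's Neuron Sensitivity Score and Semantic Flow Score machinery.

```python
# Sketch of the two-score circuit expansion described above.
# All names and threshold values are illustrative, not the authors' values.
from dataclasses import dataclass, field

@dataclass
class Neuron:
    layer: int
    index: int
    children: list = field(default_factory=list)  # accepted circuit edges

def expand_circuit(root, candidates_per_layer, sensitivity_score,
                   semantic_flow_score, tau_f=0.5, tau_s=0.5):
    """Greedily grow a circuit from `root`, keeping a candidate edge only
    when BOTH the functional score and the semantic flow clear a threshold."""
    frontier = [root]
    while frontier:
        parent = frontier.pop()
        for child in candidates_per_layer.get(parent.layer + 1, []):
            f = sensitivity_score(parent, child)    # functional dependency
            s = semantic_flow_score(parent, child)  # semantic alignment
            if f >= tau_f and s >= tau_s:           # e.g., the orange node
                parent.children.append(child)
                frontier.append(child)
            # edges with high f but low s (the green node) are discarded
    return root
```

Requiring both scores is the key design choice: the functional test keeps the circuit faithful to the model's computation, while the semantic test keeps it human-interpretable.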
Single Query and Multiple Granular Concept Circuits
Model-agnostic Circuit Extraction
Multiple Queries with Different Classes and Common GCCs
Our method can be used to diagnose and explain why a model makes a specific misclassification.
The top bar plots show the logit gain when Granular Concept Circuits (GCCs) derived from a misclassified "Schipperke" image are either inhibited or stimulated; the model incorrectly classified this image as a "Soccer ball."
The bottom panel visualizes the most influential GCC, highlighting concepts related to both the incorrect prediction (soccer ball texture) and the correct one (black objects). This demonstrates how our method can pinpoint the specific circuits responsible for the model's error.
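The intervention itself can be reproduced with standard PyTorch forward hooks. The sketch below is illustrative: the `(module, channel)` circuit format, the `intervene_on_circuit` helper, and the scaling interface are assumptions for exposition, not the paper's released API.

```python
# Illustrative sketch of inhibiting (scale=0) or stimulating (scale>1)
# a circuit's neurons and reading off the resulting class logit.
import torch

def intervene_on_circuit(model, image, circuit, scale, class_idx):
    """Scale the activation of every (module, channel) pair in `circuit`
    during one forward pass and return the logit of `class_idx`."""
    handles = []
    for module, channel in circuit:
        def hook(mod, inputs, output, ch=channel):
            output = output.clone()
            output[:, ch] *= scale  # suppress or amplify this neuron
            return output
        handles.append(module.register_forward_hook(hook))
    try:
        with torch.no_grad():
            logits = model(image.unsqueeze(0))
    finally:
        for h in handles:
            h.remove()  # always restore the unmodified model
    return logits[0, class_idx].item()

# Logit gain as in the bar plots (baseline = unmodified forward pass):
# gain = intervene_on_circuit(model, img, gcc, 0.0, soccer_ball_idx) - baseline
```

Comparing the logit gain of the predicted class ("Soccer ball") against the true class ("Schipperke") under inhibition and stimulation reveals which circuits drive the error.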
Average Logit Drop After Ablating Neurons
Insertion and Deletion Game
User Study Responses
@inproceedings{kwon2025granular,
  title     = {Granular Concept Circuits: Toward a Fine-Grained Circuit Discovery for Concept Representations},
  author    = {Dahee Kwon and Sehyun Lee and Jaesik Choi},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year      = {2025},
  pages     = {to appear}
}