Tree-sitter’s tagging system enables code navigation features like “go to definition”, symbol search, and outline views. It’s used in GitHub’s search-based code navigation and can be integrated into language servers and IDEs.

Tagging Concepts

Tagging uses Tree-sitter queries to identify the entities in a program that can be named. Each tag contains:
  • Role - Whether the entity is a definition or reference
  • Kind - How the entity is used (class, function, variable, etc.)
  • Name - The identifier text
  • Documentation - Optional docstring

Capture Convention

Tags use a @role.kind naming convention. The outer capture names the tag, and an inner @name capture marks the identifier:
; Function definition
(function_definition
  name: (identifier) @name) @definition.function

; Method call
(method_call
  method: (identifier) @name) @reference.call
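
To make the convention concrete, here is a minimal standard-library sketch of how an application might split a capture name into its role and kind. The `Role` enum and `parse_tag_capture` function are hypothetical names for illustration and are not part of the tree-sitter-tags API.

```rust
/// Role of a tag, taken from the part of the capture name before the dot.
#[derive(Debug, PartialEq)]
enum Role {
    Definition,
    Reference,
}

/// Splits a capture name such as "definition.function" into (role, kind).
/// Returns None for names that do not follow the @role.kind convention.
fn parse_tag_capture(capture: &str) -> Option<(Role, &str)> {
    // Accept the name with or without the leading '@' used in query files.
    let name = capture.strip_prefix('@').unwrap_or(capture);
    let (role, kind) = name.split_once('.')?;
    if kind.is_empty() {
        return None;
    }
    match role {
        "definition" => Some((Role::Definition, kind)),
        "reference" => Some((Role::Reference, kind)),
        _ => None,
    }
}

fn main() {
    for capture in ["@definition.function", "@reference.call", "@name"] {
        println!("{capture} -> {:?}", parse_tag_capture(capture));
    }
}
```

Helper captures such as @name carry no role, so the sketch returns None for them.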

Standard Tag Vocabulary

Category                    Tag
Class definitions           @definition.class
Function definitions        @definition.function
Interface definitions       @definition.interface
Method definitions          @definition.method
Module definitions          @definition.module
Function/method calls       @reference.call
Class reference             @reference.class
Interface implementation    @reference.implementation

Applications can extend this vocabulary or recognize only a subset based on their needs.

Writing Tag Queries

Tag queries are stored in queries/tags.scm, alongside the grammar's other query files.

Basic Example (Python)

; Function definitions
(function_definition
  name: (identifier) @name) @definition.function

; Class definitions  
(class_definition
  name: (identifier) @name) @definition.class

; Function calls
(call
  function: (identifier) @name) @reference.call

Advanced Example (JavaScript)

; Handle various function assignment patterns
(assignment_expression
  left: [
    (identifier) @name
    (member_expression
      property: (property_identifier) @name)
  ]
  right: [(arrow_function) (function)]
) @definition.function

; Method definitions
(method_definition
  name: (property_identifier) @name) @definition.method

; Class declarations
(class_declaration
  name: (identifier) @name) @definition.class

Documentation Capture

Capture docstrings using the @doc capture:
(
  (comment)* @doc
  .
  (function_definition
    name: (identifier) @name) @definition.function
)

Built-in Functions

Two special functions help process docstrings:

#strip!

Removes patterns from captured text:
((comment) @doc
 (#strip! @doc "^#\\s*"))
This removes the leading # and any whitespace that follows it from Ruby comments.
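
The effect of the pattern can be sketched with the standard library alone. The functions below are hypothetical stand-ins: `strip_comment_prefix` hand-codes the regex `^#\s*` rather than using a regex engine, and joining the stripped comments with newlines is an assumption about how the pieces are combined.

```rust
/// Hand-written stand-in for the regex "^#\s*": drops one leading '#'
/// and the whitespace after it. Text without a leading '#' is unchanged.
fn strip_comment_prefix(text: &str) -> &str {
    match text.strip_prefix('#') {
        Some(rest) => rest.trim_start(),
        None => text,
    }
}

/// Strips every captured comment and joins the results with newlines
/// (the newline separator is an assumption made for this sketch).
fn build_docs(comments: &[&str]) -> String {
    comments
        .iter()
        .map(|comment| strip_comment_prefix(comment))
        .collect::<Vec<_>>()
        .join("\n")
}

fn main() {
    let docs = build_docs(&["# Adds two numbers.", "#   Returns the sum."]);
    println!("{docs}");
}
```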

#select-adjacent!

Filters the text of one capture so that only the nodes adjacent to another capture are kept:
(
  (comment)* @doc
  .
  [
    (class
      name: [(constant) @name]) @definition.class
    (singleton_class
      value: [(constant) @name]) @definition.class
  ]
  (#strip! @doc "^#\\s*")
  (#select-adjacent! @doc @definition.class)
)
This ensures only comments directly above the class are included.
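
One way to picture the adjacency rule is a backwards walk over the comment nodes that stops at the first gap. The sketch below works on row numbers only. `RowSpan` and `select_adjacent` are hypothetical names, and the exact rule used by the library may differ.

```rust
/// Zero-based, inclusive row span of a syntax node.
/// A stand-in for node positions, used only in this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
struct RowSpan {
    start_row: usize,
    end_row: usize,
}

/// Returns the trailing run of `docs` that touches `target` without a blank
/// line in between. `docs` must be in source order.
fn select_adjacent(docs: &[RowSpan], target: RowSpan) -> &[RowSpan] {
    let mut first_kept = docs.len();
    let mut next_start_row = target.start_row;
    while first_kept > 0 {
        let doc = docs[first_kept - 1];
        // Adjacent means the comment ends on the row directly above
        // (or on the same row as) the node that follows it.
        if doc.end_row + 1 >= next_start_row {
            first_kept -= 1;
            next_start_row = doc.start_row;
        } else {
            break;
        }
    }
    &docs[first_kept..]
}

fn main() {
    // Comments on rows 2 and 4, definition starting on row 5.
    let docs = [
        RowSpan { start_row: 2, end_row: 2 },
        RowSpan { start_row: 4, end_row: 4 },
    ];
    let target = RowSpan { start_row: 5, end_row: 6 };
    println!("{:?}", select_adjacent(&docs, target));
}
```

Only the comment on row 4 survives in this example, because the blank row 3 separates it from the earlier comment.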

TagsConfiguration Structure

The tree-sitter-tags crate provides the core implementation:
pub struct TagsConfiguration {
    pub language: Language,
    pub query: Query,
    syntax_type_names: Vec<Box<[u8]>>,
    capture_map: HashMap<u32, NamedCapture>,
    doc_capture_index: Option<u32>,
    name_capture_index: Option<u32>,
    // ...
}

Creating a Configuration

use tree_sitter_tags::TagsConfiguration;

let config = TagsConfiguration::new(
    language,
    tags_query,
    locals_query,
)?;
The locals query is concatenated with the tags query to support local scope tracking.

Tag Structure

Each tag contains comprehensive location information:
pub struct Tag {
    pub range: Range<usize>,           // Byte range
    pub name_range: Range<usize>,      // Name's byte range
    pub line_range: Range<usize>,      // Byte range of the source line shown for the tag
    pub span: Range<Point>,            // Point range
    pub utf16_column_range: Range<usize>, // UTF-16 columns
    pub docs: Option<String>,          // Documentation
    pub is_definition: bool,           // Definition vs reference
    pub syntax_type_id: u32,           // Kind identifier
}
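
The struct carries UTF-16 columns in addition to byte offsets because some consumers, such as Language Server Protocol clients by default, count columns in UTF-16 code units. As an illustration of the difference, this standard-library sketch converts a byte column to a UTF-16 column. `utf16_column` is a hypothetical helper, not a crate function.

```rust
/// Converts a byte column within `line` to a UTF-16 column.
/// Returns None when `byte_col` is out of range or not on a char boundary.
fn utf16_column(line: &str, byte_col: usize) -> Option<usize> {
    let prefix = line.get(..byte_col)?;
    Some(prefix.encode_utf16().count())
}

fn main() {
    let line = "let é = 1;";
    // 'é' occupies two bytes in UTF-8 but a single UTF-16 code unit,
    // so byte column 6 corresponds to UTF-16 column 5.
    println!("{:?}", utf16_column(line, 6));
}
```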

Generating Tags

Use TagsContext to generate tags from source code:
use tree_sitter_tags::{TagsConfiguration, TagsContext};

let mut context = TagsContext::new();
let config = TagsConfiguration::new(
    language,
    tags_query,
    locals_query,
)?;

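// In recent versions of the crate, generate_tags returns the tag iterator
// together with a flag that is true when the parsed tree contains errors.
let (tags, _has_error) = context.generate_tags(
    &config,
    source.as_bytes(),
    None, // cancellation_flag
)?;

for tag in tags {
    let tag = tag?; // each item is a Result<Tag, Error>
    println!(
        "{} at {}:{}\n  {}",
        config.syntax_type_name(tag.syntax_type_id),
        tag.span.start.row,
        tag.span.start.column,
        &source[tag.line_range.clone()],
    );
}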

Local Scope Tracking

Tags can track local scopes to distinguish between local and non-local names:
; Mark scopes
(method) @local.scope
(block) @local.scope

; Mark definitions
(method_parameters (identifier) @local.definition)
(assignment left:(identifier) @local.definition)

; Mark references
(identifier) @local.reference
Use the (#is-not? local) predicate to exclude local variables:
; Only tag non-local identifiers
((identifier) @name
 (#is-not? local)) @reference.call
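
Conceptually, the three captures drive a stack of scopes: entering a @local.scope node pushes a scope, a @local.definition records a name in the innermost scope, and a @local.reference is local if any enclosing scope defines it. The sketch below models just that idea with the standard library. `ScopeStack` is a hypothetical type, and it ignores byte ranges and definition order, which a real implementation would need to track.

```rust
use std::collections::HashSet;

/// One set of defined names per enclosing scope, innermost last.
struct ScopeStack {
    scopes: Vec<HashSet<String>>,
}

impl ScopeStack {
    fn new() -> Self {
        Self { scopes: Vec::new() }
    }

    /// Called when a @local.scope node is entered.
    fn enter_scope(&mut self) {
        self.scopes.push(HashSet::new());
    }

    /// Called when a @local.scope node is left.
    fn exit_scope(&mut self) {
        self.scopes.pop();
    }

    /// Records a @local.definition in the innermost scope.
    /// Definitions outside any scope are ignored.
    fn define(&mut self, name: &str) {
        if let Some(scope) = self.scopes.last_mut() {
            scope.insert(name.to_string());
        }
    }

    /// A name is local if any enclosing scope defines it.
    fn is_local(&self, name: &str) -> bool {
        self.scopes.iter().rev().any(|scope| scope.contains(name))
    }
}

fn main() {
    let mut stack = ScopeStack::new();
    stack.enter_scope(); // e.g. a method body
    stack.define("x"); // e.g. a method parameter
    println!("x local: {}", stack.is_local("x"));
    println!("puts local: {}", stack.is_local("puts"));
    stack.exit_scope();
}
```

A query using (#is-not? local) would then skip "x" and keep "puts" as a candidate reference.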

Command-Line Usage

Test tag queries with the tree-sitter tags command:
tree-sitter tags test.rb

Example Output

# test.rb
module Foo
  class Bar
    # won't be included

    # is adjacent, will be
    def baz
    end
  end
end

Running the command on this file prints one line per tag:
    test.rb
        Foo              | module       def (0, 7) - (0, 10) `module Foo`
        Bar              | class        def (1, 8) - (1, 11) `class Bar`
        baz              | method       def (4, 8) - (4, 11) `def baz`  "is adjacent, will be"

Unit Testing

Tag queries can be tested using comment assertions:
# test/tags/example.rb
module Foo
  #     ^ definition.module
  class Bar
    #    ^ definition.class

    def baz
      #  ^ definition.method
    end
  end
end
Run tests with:
tree-sitter test

Performance Considerations

Line Length Limits

The tags system truncates the line text recorded for each tag (line_range) to 180 bytes (MAX_LINE_LEN), which keeps memory usage bounded when a tag sits on a very long line.
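
A truncation of this kind could look like the following sketch. It assumes the limit is measured in bytes and that the cut is moved back to a UTF-8 character boundary. Both points are assumptions for illustration, and `truncate_line` is a hypothetical helper.

```rust
/// Mirrors the 180 limit named in the text, reused here for illustration.
const MAX_LINE_LEN: usize = 180;

/// Returns the longest prefix of `line` that is at most `max_len` bytes
/// and ends on a UTF-8 character boundary.
fn truncate_line(line: &str, max_len: usize) -> &str {
    if line.len() <= max_len {
        return line;
    }
    let mut end = max_len;
    // Index 0 is always a boundary, so this loop terminates.
    while !line.is_char_boundary(end) {
        end -= 1;
    }
    &line[..end]
}

fn main() {
    let minified = "x".repeat(500);
    println!("{}", truncate_line(&minified, MAX_LINE_LEN).len());
}
```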

Cancellation Support

Long-running operations can be cancelled:
use std::sync::atomic::{AtomicUsize, Ordering};

let cancellation_flag = AtomicUsize::new(0);

let (tags, _has_error) = context.generate_tags(
    &config,
    source.as_bytes(),
    Some(&cancellation_flag),
)?;

// From another thread:
cancellation_flag.store(1, Ordering::Relaxed);
Checks occur every 100 iterations (CANCELLATION_CHECK_INTERVAL).
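
The periodic check can be pictured as a counter that is reset each time the flag is read. This standard-library sketch shows the pattern on a plain loop. `process_with_cancellation` is a hypothetical function, and the constant simply reuses the name from the text.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Reuses the interval named in the text, for illustration only.
const CANCELLATION_CHECK_INTERVAL: usize = 100;

/// Processes `items` and returns how many were handled,
/// or an error if the flag was found set at a check point.
fn process_with_cancellation(
    items: &[u32],
    flag: Option<&AtomicUsize>,
) -> Result<usize, &'static str> {
    let mut handled = 0;
    let mut since_check = 0;
    for _item in items {
        since_check += 1;
        if since_check >= CANCELLATION_CHECK_INTERVAL {
            since_check = 0;
            // Reading the atomic only once per interval keeps the hot loop cheap.
            if let Some(flag) = flag {
                if flag.load(Ordering::Relaxed) != 0 {
                    return Err("cancelled");
                }
            }
        }
        handled += 1;
    }
    Ok(handled)
}

fn main() {
    let flag = AtomicUsize::new(1); // already cancelled
    let items = vec![0u32; 250];
    println!("{:?}", process_with_cancellation(&items, Some(&flag)));
}
```

A consequence of checking at intervals is that a short run can finish before the flag is ever read, so callers should not assume that setting the flag always produces a cancellation error.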

Reusing Context

Reuse TagsContext instances to avoid parser reallocation:
let mut context = TagsContext::new();

for file in files {
    let (tags, _has_error) = context.generate_tags(
        &config,
        file.as_bytes(),
        None,
    )?;
    // Process tags...
}

Integration Patterns

Language Server Protocol

// Generate document symbols.
// syntax_type_to_symbol_kind and byte_range_to_lsp_range are
// application-defined helpers, not part of tree-sitter-tags.
let (tags, _has_error) = context.generate_tags(&config, source.as_bytes(), None)?;
// Each item is a Result<Tag, Error>; stop at the first error.
let tags = tags.collect::<Result<Vec<_>, _>>()?;

let symbols: Vec<DocumentSymbol> = tags
    .iter()
    .filter(|tag| tag.is_definition)
    .map(|tag| DocumentSymbol {
        name: source[tag.name_range.clone()].to_string(),
        kind: syntax_type_to_symbol_kind(tag.syntax_type_id),
        range: byte_range_to_lsp_range(tag.range.clone()),
        // ...
    })
    .collect();

Symbol Index

// Index all definitions by name
let mut symbol_index = HashMap::new();

for tag in tags {
    if tag.is_definition {
        let name = &source[tag.name_range.clone()];
        symbol_index
            .entry(name.to_string())
            .or_insert_with(Vec::new)
            .push(tag);
    }
}
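
Once such an index exists, a "go to definition" lookup is a map query. The sketch below shows the build and lookup steps end to end using only the standard library. `IndexedTag`, `build_index`, and `find_definitions` are hypothetical names, and `IndexedTag` stands in for the two `Tag` fields used here.

```rust
use std::collections::HashMap;
use std::ops::Range;

/// Minimal stand-in for the `Tag` fields this sketch needs.
#[derive(Debug, Clone, PartialEq)]
struct IndexedTag {
    name_range: Range<usize>,
    is_definition: bool,
}

/// Maps each defined name to the byte ranges where it is defined.
fn build_index<'a>(source: &'a str, tags: &[IndexedTag]) -> HashMap<&'a str, Vec<Range<usize>>> {
    let mut index = HashMap::new();
    for tag in tags.iter().filter(|tag| tag.is_definition) {
        index
            .entry(&source[tag.name_range.clone()])
            .or_insert_with(Vec::new)
            .push(tag.name_range.clone());
    }
    index
}

/// Returns every known definition site for `name`, or an empty slice.
fn find_definitions<'a>(
    index: &'a HashMap<&str, Vec<Range<usize>>>,
    name: &str,
) -> &'a [Range<usize>] {
    index.get(name).map(Vec::as_slice).unwrap_or(&[])
}

fn main() {
    let source = "def foo; end\ndef bar; end\nfoo()";
    let tags = vec![
        IndexedTag { name_range: 4..7, is_definition: true },
        IndexedTag { name_range: 17..20, is_definition: true },
        IndexedTag { name_range: 26..29, is_definition: false }, // the call to foo
    ];
    let index = build_index(source, &tags);
    println!("{:?}", find_definitions(&index, "foo"));
}
```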

Error Handling

The tags system can fail with these errors:
pub enum Error {
    Query(QueryError),        // Invalid query syntax
    Regex(regex::Error),      // Invalid regex in predicates
    Cancelled,                // Operation cancelled
    InvalidLanguage,          // Language not supported
    InvalidCapture(String),   // Unknown capture name
}
Only captures matching @definition.*, @reference.*, @doc, @name, or @local.* are valid. Any other capture name in the query causes TagsConfiguration::new to return Error::InvalidCapture.
