Grammar DSL

The Tree-sitter grammar DSL provides a set of built-in functions for defining grammar rules. This page documents all available functions and grammar properties.

Core Concepts

The $ Symbol

Every grammar rule is written as a JavaScript function that takes a parameter conventionally called $. Use $.identifier to refer to another grammar symbol within a rule.
rules: {
  function_call: $ => seq(
    $.identifier,      // Reference to another rule
    '(',
    optional($.arguments),
    ')'
  )
}
Avoid names starting with $.MISSING or $.UNEXPECTED as they have special meaning for the tree-sitter test command.

Terminal Symbols

Terminal symbols are described using JavaScript strings and regular expressions.
rules: {
  if_statement: $ => seq(
    'if',              // Exact string match
    '(',
    $.expression,
    ')'
  )
}
Tree-sitter doesn’t use JavaScript’s regex engine at runtime. It generates regex-matching logic based on Rust regex syntax as part of the parser.
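The example above uses only string terminals. As a rough sketch of what regex terminals can look like, the snippet below checks two patterns (borrowed from the complete example at the end of this page) against sample inputs using JavaScript's own RegExp. This is only a convenience for experimenting: the generated parser does not use the JavaScript engine, so behavior may differ for less common regex features. `matchesWhole` is a hypothetical helper, not part of Tree-sitter.

```javascript
// Patterns as they might appear in a grammar's terminal rules.
const identifierPattern = /[a-zA-Z_][a-zA-Z0-9_]*/;
const numberPattern = /\d+/;

// Hypothetical helper: true when the pattern matches the entire input.
function matchesWhole(pattern, input) {
  const anchored = new RegExp(`^(?:${pattern.source})$`, pattern.flags);
  return anchored.test(input);
}

console.log(matchesWhole(identifierPattern, 'foo_1')); // true
console.log(matchesWhole(identifierPattern, '1foo'));  // false
console.log(matchesWhole(numberPattern, '42'));        // true
```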

Rule Functions

seq() - Sequences

Matches rules in order, one after another.
rules: {
  variable_declaration: $ => seq(
    'let',
    $.identifier,
    '=',
    $.expression,
    ';'
  )
}
Analogous to: Writing symbols next to each other in EBNF notation

choice() - Alternatives

Matches one of a set of possible rules.
rules: {
  expression: $ => choice(
    $.identifier,
    $.number,
    $.string,
    $.binary_expression,
    $.unary_expression
  )
}
The order of the arguments does not matter.
Analogous to: The | (pipe) operator in EBNF notation

repeat() - Zero or More

Matches zero or more occurrences of a rule.
rules: {
  // A block contains zero or more statements
  block: $ => seq(
    '{',
    repeat($.statement),
    '}'
  )
}
Analogous to: The {x} (curly brace) syntax in EBNF notation

repeat1() - One or More

Matches one or more occurrences of a rule.
rules: {
  // At least one digit
  number: $ => repeat1(/[0-9]/),
  
  // At least one parameter
  parameter_list: $ => seq(
    '(',
    repeat1($.parameter),
    ')'
  )
}
Analogous to: The + operator in regex or EBNF

optional() - Zero or One

Matches zero or one occurrence of a rule.
rules: {
  function_declaration: $ => seq(
    'function',
    $.identifier,
    optional($.type_parameters),  // Type params are optional
    $.parameter_list,
    optional($.return_type),      // Return type is optional
    $.block
  )
}
Analogous to: The [x] (square bracket) syntax in EBNF notation
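Because grammar.js is ordinary JavaScript, seq(), repeat(), and optional() can be combined in small helper functions. The sketch below shows one possible pair of helpers for separated lists. The names `sepBy` and `sepBy1` are hypothetical, and the three stub functions exist only so the snippet runs under plain Node; their return values are illustrative and are not meant to mirror Tree-sitter's internal representation.

```javascript
// Illustrative stand-ins so this sketch runs under plain Node.
// In a real grammar.js the Tree-sitter CLI provides these functions as
// globals, so the three stub definitions below would be deleted.
const seq = (...members) => ({ type: 'seq', members });
const repeat = (content) => ({ type: 'repeat', content });
const optional = (content) => ({ type: 'optional', content });

// Hypothetical helper: one or more `rule`s separated by `separator`.
function sepBy1(separator, rule) {
  return seq(rule, repeat(seq(separator, rule)));
}

// Hypothetical helper: zero or more `rule`s separated by `separator`.
function sepBy(separator, rule) {
  return optional(sepBy1(separator, rule));
}

// Inside a rule body this might be written as:
//   arguments: $ => sepBy(',', $.expression)
// Here a plain string stands in for the rule reference.
const shape = sepBy(',', 'expression');
console.log(JSON.stringify(shape, null, 2));
```

Helpers like these keep list-shaped rules short and make the separator explicit at each call site.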

Precedence Functions

prec() - Numeric Precedence

Assigns a numerical precedence to resolve LR(1) conflicts.
rules: {
  expression: $ => choice(
    $.unary_expression,
    $.binary_expression
  ),
  
  // Higher precedence (2) means it binds more tightly
  unary_expression: $ => prec(2, seq(
    choice('-', '!', '~'),
    $.expression
  )),
  
  binary_expression: $ => prec(1, seq(
    $.expression,
    choice('+', '-', '*', '/'),
    $.expression
  ))
}
The default precedence of all rules is zero. Higher numbers = higher precedence.

prec.left() - Left Associativity

Marks a rule as left-associative, optionally with precedence.
rules: {
  binary_expression: $ => choice(
    // a * b * c → (a * b) * c
    prec.left(2, seq($.expression, '*', $.expression)),
    prec.left(2, seq($.expression, '/', $.expression)),
    prec.left(1, seq($.expression, '+', $.expression)),
    prec.left(1, seq($.expression, '-', $.expression))
  )
}
Effect: When parsing a + b + c, prefer matching (a + b) + c over a + (b + c)
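To see what the precedence numbers in the example above mean in practice, here is a toy precedence-climbing routine that groups a flat token list. This is a different technique from the LR(1) tables Tree-sitter generates, and it is shown only to illustrate "binds more tightly". `PRECEDENCE` and `groupByPrecedence` are hypothetical names, with the numbers copied from the example above.

```javascript
// Same numbers as the prec.left() example: * and / bind tighter than + and -.
const PRECEDENCE = { '+': 1, '-': 1, '*': 2, '/': 2 };

// Turns ['a', '+', 'b', '*', 'c'] into a fully parenthesized string.
function groupByPrecedence(tokens) {
  let pos = 0;

  function parse(minPrec) {
    let left = tokens[pos++]; // operand
    while (pos < tokens.length && PRECEDENCE[tokens[pos]] >= minPrec) {
      const op = tokens[pos++];
      // Requiring a strictly higher precedence on the right-hand side
      // makes operators of equal precedence group to the left.
      const right = parse(PRECEDENCE[op] + 1);
      left = `(${left} ${op} ${right})`;
    }
    return left;
  }

  return parse(1);
}

console.log(groupByPrecedence(['a', '+', 'b', '*', 'c'])); // (a + (b * c))
console.log(groupByPrecedence(['a', '*', 'b', '+', 'c'])); // ((a * b) + c)
console.log(groupByPrecedence(['a', '-', 'b', '-', 'c'])); // ((a - b) - c)
```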

prec.right() - Right Associativity

Marks a rule as right-associative.
rules: {
  // Assignment is right-associative
  assignment: $ => prec.right(seq(
    $.identifier,
    '=',
    choice($.expression, $.assignment)
  ))
}
Effect: When parsing a = b = c, prefer matching a = (b = c) over (a = b) = c
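The difference between the two associativity settings can be sketched outside of Tree-sitter with plain array folds. `groupLeft` and `groupRight` are hypothetical helpers that only illustrate the resulting grouping; they are not how the parser works internally.

```javascript
// Group a flat operand list the way a left-associative rule would.
function groupLeft(operands, op) {
  return operands.reduce((left, right) => `(${left} ${op} ${right})`);
}

// Group the same list the way a right-associative rule would.
function groupRight(operands, op) {
  return operands.reduceRight((right, left) => `(${left} ${op} ${right})`);
}

console.log(groupLeft(['a', 'b', 'c'], '+'));  // ((a + b) + c)
console.log(groupRight(['a', 'b', 'c'], '=')); // (a = (b = c))

// The grouping changes results for non-commutative operators.
const leftValue = [10, 4, 3].reduce((l, r) => l - r);       // (10 - 4) - 3
const rightValue = [10, 4, 3].reduceRight((r, l) => l - r); // 10 - (4 - 3)
console.log(leftValue, rightValue); // 3 9
```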

prec.dynamic() - Dynamic Precedence

Applies precedence at runtime instead of parser generation time.
grammar({
  name: 'mylang',
  
  conflicts: $ => [
    [$.array, $.array_pattern]
  ],
  
  rules: {
    // In case of ambiguity, prefer array over array_pattern
    array: $ => prec.dynamic(1, seq(
      '[',
      optional(seq($.expression, repeat(seq(',', $.expression)))),
      ']'
    )),
    
    array_pattern: $ => seq(
      '[',
      optional(seq($.pattern, repeat(seq(',', $.pattern)))),
      ']'
    )
  }
})
Only use prec.dynamic() when handling genuine ambiguities declared in the conflicts field.

Token Functions

token() - Single Token

Marks a rule as producing a single token.
rules: {
  // Treat the entire sequence as one token
  comment: $ => token(seq(
    '//',
    /.*/
  )),
  
  // Multi-line comment as a single token
  block_comment: $ => token(seq(
    '/*',
    /[^*]*\*+([^/*][^*]*\*+)*/,
    '/'
  ))
}
The token() function only accepts terminal rules: strings, regexes, and combinations of them built with functions such as seq() and choice(). A rule reference such as token($.foo) will not work.
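One way to sanity-check a token pattern before generating a parser is to try it with JavaScript's RegExp. The sketch below joins the three pieces of the block_comment rule above into a single anchored pattern. Treat this as an approximation, since the generated lexer is not the JavaScript regex engine; the names `body` and `blockComment` are hypothetical.

```javascript
// Middle piece of the block_comment rule, copied from the grammar above.
const body = /[^*]*\*+([^/*][^*]*\*+)*/;

// Join '/*', the body, and '/' into one anchored JavaScript RegExp.
const blockComment = new RegExp(`^/\\*${body.source}/$`);

console.log(blockComment.test('/* hello */'));     // true
console.log(blockComment.test('/* a * b **/'));    // true
console.log(blockComment.test('/* unterminated')); // false
console.log(blockComment.test('/* a */ b */'));    // false
```

The last case fails because the pattern stops at the first `*/`, so the remaining text prevents a whole-string match.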

token.immediate() - No Whitespace

Matches a token only if there is no whitespace before it.
rules: {
  // Template literal with interpolation
  template_string: $ => seq(
    '`',
    repeat(choice(
      $.template_chars,
      seq(
        '${',
        $.expression,
        token.immediate('}')  // No whitespace before }
      )
    )),
    '`'
  )
}
Use case: When whitespace would change the meaning of the token

Naming Functions

alias() - Rename Nodes

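Causes a rule to appear with a different name in the syntax tree. If the name is a symbol, as in alias($.foo, $.bar), the node appears as a named node called bar. If the name is a string literal, as in alias($.foo, 'bar'), the node appears as an anonymous node.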
rules: {
  statement: $ => choice(
    $.if_statement,
    $.while_statement,
    // Make 'for' appear as 'for_statement'
    alias('for', $.for_statement)
  )
}

field() - Named Fields

Assigns a field name to child nodes for easier access.
rules: {
  function_definition: $ => seq(
    'func',
    field('name', $.identifier),
    field('parameters', $.parameter_list),
    field('return_type', optional($.type)),
    field('body', $.block)
  ),
  
  binary_expression: $ => seq(
    field('left', $.expression),
    field('operator', choice('+', '-', '*', '/')),
    field('right', $.expression)
  )
}
Use fields to make tree traversal code more readable and resilient to grammar changes.

reserved() - Contextual Keywords

Overrides the global reserved word set with a named set for the wrapped rule. This is useful for contextual keywords such as if, which cannot be used as a variable name but can be used as a property name.
grammar({
  name: 'javascript',
  
  reserved: {
    // The first set is the global reserved word set
    keywords: $ => ['if', 'else', 'function', 'return'],
    property_keywords: $ => []  // Empty set for property names
  },
  
  rules: {
    // 'if' is reserved here
    variable: $ => $.identifier,
    
    // 'if' is allowed as a property name
    property_name: $ => reserved(
      'property_keywords',
      choice($.identifier, 'if', 'else')
    )
  }
})

Grammar Properties

name

The name of your language (required).
export default grammar({
  name: 'javascript',
  // ...
});

rules

The grammar rules (required). The first rule is the start symbol.
rules: {
  source_file: $ => repeat($._statement),  // Start rule
  _statement: $ => choice(/* ... */),
  // ...
}

extras

Tokens that may appear anywhere in the language.
extras: $ => [
  /\s/,           // Whitespace
  $.comment,      // Comments
  $.line_continuation
]
The default value of extras is [/\s/] (whitespace). To control whitespace explicitly, set extras: $ => [].

inline

Rule names that should be removed from the grammar by replacing every usage with a copy of their definition.
inline: $ => [
  $._expression,
  $._statement,
  $._type
]
Effect: These rules won’t create syntax tree nodes; their contents appear directly in parent nodes.

conflicts

Declares intentional LR(1) conflicts.
conflicts: $ => [
  [$.array, $.array_pattern],
  [$.object, $.object_pattern]
]
Only declare conflicts when you have genuine ambiguities that should be resolved at runtime using GLR parsing.

externals

Tokens handled by an external scanner.
externals: $ => [
  $.indent,
  $.dedent,
  $.string_content
]
See External Scanners for details.

word

The token used for keyword extraction optimization.
word: $ => $.identifier
Setting a word token makes the parser smaller and faster by optimizing keyword matching.
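As a rough mental model, and not a description of Tree-sitter's actual implementation, keyword extraction concerns the string terminals that the word token would also match. The sketch below filters a hypothetical list of string terminals with an identifier pattern to show which ones would be candidates. `wordPattern`, `stringTerminals`, and `keywordCandidates` are illustrative names only.

```javascript
// Pattern for the word token, anchored so it must match the whole string.
const wordPattern = /^[a-zA-Z_][a-zA-Z0-9_]*$/;

// Hypothetical string terminals collected from a grammar.
const stringTerminals = ['if', 'else', 'function', '(', ')', '=>', 'return'];

// Terminals that look like words are the candidates for keyword handling.
const keywordCandidates = stringTerminals.filter((t) => wordPattern.test(t));

console.log(keywordCandidates); // [ 'if', 'else', 'function', 'return' ]
```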

supertypes

Hidden rules that should be treated as supertypes in the generated node types file.
supertypes: $ => [
  $._expression,
  $._statement,
  $._declaration
]
Effect: These nodes are hidden in the parse tree but can be used in queries.

precedences

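Named precedence levels. Each inner array lists names in descending order of precedence, and a name is only ordered relative to the other names in its own array.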
precedences: $ => [
  ['unary', 'multiplicative', 'additive'],
  ['type', 'expression']
]
Use with: prec('unary', ...) instead of prec(2, ...)
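The following sketch models how the named levels above relate to one another, assuming each inner array is ordered from highest to lowest precedence and that names in different arrays are not comparable. `tighter` is a hypothetical helper written for illustration; it is not part of the DSL.

```javascript
// The same value as the precedences example above.
const precedences = [
  ['unary', 'multiplicative', 'additive'],
  ['type', 'expression'],
];

// Hypothetical helper: returns the name that binds more tightly, or null
// when the two names never appear in the same list.
function tighter(a, b) {
  for (const level of precedences) {
    const ia = level.indexOf(a);
    const ib = level.indexOf(b);
    if (ia !== -1 && ib !== -1) return ia <= ib ? a : b;
  }
  return null;
}

console.log(tighter('additive', 'unary'));  // unary
console.log(tighter('expression', 'type')); // type
console.log(tighter('unary', 'type'));      // null
```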

Complete Example

Here’s a complete grammar demonstrating many DSL features:
grammar.js
export default grammar({
  name: 'calculator',
  
  extras: $ => [/\s/],
  
  word: $ => $.identifier,
  
  rules: {
    source_file: $ => repeat($._statement),
    
    _statement: $ => choice(
      $.assignment,
      $.expression
    ),
    
    assignment: $ => seq(
      field('left', $.identifier),
      '=',
      field('right', $.expression)
    ),
    
    expression: $ => choice(
      $.identifier,
      $.number,
      $.binary_expression,
      $.unary_expression,
      $.parenthesized_expression
    ),
    
    binary_expression: $ => choice(
      prec.left(2, seq(
        field('left', $.expression),
        field('operator', choice('*', '/')),
        field('right', $.expression)
      )),
      prec.left(1, seq(
        field('left', $.expression),
        field('operator', choice('+', '-')),
        field('right', $.expression)
      ))
    ),
    
    unary_expression: $ => prec(3, seq(
      field('operator', '-'),
      field('operand', $.expression)
    )),
    
    parenthesized_expression: $ => seq(
      '(',
      $.expression,
      ')'
    ),
    
    identifier: $ => /[a-zA-Z_][a-zA-Z0-9_]*/,
    number: $ => /\d+/
  }
});
