Grammar DSL

The Tree-sitter grammar DSL provides a set of built-in functions for defining grammar rules. This page documents all available functions and grammar properties.

Core Concepts

The `$` Symbol

Every grammar rule is written as a JavaScript function that takes a parameter conventionally called `$`. Inside a rule, the syntax `$.identifier` refers to another grammar symbol (here, the rule named `identifier`).

rules: {
  function_call: $ => seq(
    $.identifier,      // Reference to another rule
    '(',
    optional($.arguments),
    ')'
  )
}

Avoid rule names starting with `$.MISSING` or `$.UNEXPECTED`, as they have special meaning for the `tree-sitter test` command.
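One way to picture what `$` does is the minimal sketch below. It assumes `$` behaves like an object whose every property access yields a reference to the rule of that name. The `makeRuleBuilder` helper and the simplified `seq` are hypothetical stand-ins written for this illustration, not the Tree-sitter implementation, and the object shapes only loosely follow the generated `grammar.json`.

```javascript
// Hypothetical stand-in for the `$` object: any property access yields a
// symbol reference, so a rule can mention other rules by name.
function makeRuleBuilder() {
  return new Proxy({}, {
    get: (_target, name) => ({ type: 'SYMBOL', name }),
  });
}

// Simplified `seq`: strings become literal nodes, everything else passes through.
function seq(...members) {
  return {
    type: 'SEQ',
    members: members.map(m =>
      typeof m === 'string' ? { type: 'STRING', value: m } : m),
  };
}

const rules = {
  function_call: $ => seq($.identifier, '(', $.arguments, ')'),
};

// Calling the rule function with the proxy produces plain data.
const built = rules.function_call(makeRuleBuilder());
console.log(JSON.stringify(built.members[0]));
// {"type":"SYMBOL","name":"identifier"}
```

The point of the sketch is that a rule function describes structure; it does not parse anything itself.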

Terminal Symbols

Terminal symbols are described using JavaScript strings and regular expressions.

rules: {
  if_statement: $ => seq(
    'if',              // Exact string match
    '(',
    $.expression,
    ')'
  )
}

rules: {
  identifier: $ => /[a-z_][a-z0-9_]*/i,
  number: $ => /\d+/,
  whitespace: $ => /\s+/
}

// RustRegex is provided by the DSL as a global, like seq() and choice(); no import is needed

rules: {
  // Case-insensitive identifier using inline flag
  identifier: $ => new RustRegex('(?i)[a-z_][a-z0-9_]*')
}

Tree-sitter doesn’t use JavaScript’s regex engine at runtime. It generates regex-matching logic based on Rust regex syntax as part of the parser.
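As a rough illustration in plain JavaScript (independent of Tree-sitter), a regex literal carries its pattern text and its flags as separate strings. That is the information a generator can read from the literal without ever executing it. The variable names here are chosen for this example only.

```javascript
const identifier = /[a-z_][a-z0-9_]*/i;

// The literal exposes its pattern text and flags as plain strings.
console.log(identifier.source); // [a-z_][a-z0-9_]*
console.log(identifier.flags);  // i

// The Rust-style spelling of the same pattern moves the flag inline.
const rustStyle = `(?${identifier.flags})${identifier.source}`;
console.log(rustStyle);         // (?i)[a-z_][a-z0-9_]*
```

Since the pattern is interpreted as Rust regex syntax, JavaScript-only features such as lookahead assertions should not be relied on.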

Rule Functions

`seq()` - Sequences

Matches rules in order, one after another.

rules: {
  variable_declaration: $ => seq(
    'let',
    $.identifier,
    '=',
    $.expression,
    ';'
  )
}

Analogous to: Writing symbols next to each other in EBNF notation

`choice()` - Alternatives

Matches one of a set of possible rules.

rules: {
  expression: $ => choice(
    $.identifier,
    $.number,
    $.string,
    $.binary_expression,
    $.unary_expression
  )
}

The order of the arguments does not affect which alternative is matched. To prefer one alternative over another when they conflict, use the precedence functions described below.

Analogous to: The | (pipe) operator in EBNF notation

`repeat()` - Zero or More

Matches zero or more occurrences of a rule.

rules: {
  // A block contains zero or more statements
  block: $ => seq(
    '{',
    repeat($.statement),
    '}'
  )
}

Analogous to: The {x} (curly brace) syntax in EBNF notation

`repeat1()` - One or More

Matches one or more occurrences of a rule.

rules: {
  // At least one digit
  number: $ => repeat1(/[0-9]/),
  
  // At least one parameter
  parameter_list: $ => seq(
    '(',
    repeat1($.parameter),
    ')'
  )
}

Analogous to: The + operator in regex or EBNF

`optional()` - Zero or One

Matches zero or one occurrence of a rule.

rules: {
  function_declaration: $ => seq(
    'function',
    $.identifier,
    optional($.type_parameters),  // Type params are optional
    $.parameter_list,
    optional($.return_type),      // Return type is optional
    $.block
  )
}

Analogous to: The [x] (square bracket) syntax in EBNF notation
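The matching behaviour of the five rule functions above can be sketched with a toy recognizer over an array of tokens. This is only a model of what each function accepts: Tree-sitter compiles rules into an LR parse table and does not evaluate them like this. The `lit` and `matches` helpers are hypothetical names introduced for the example.

```javascript
// Each rule is a function (tokens, start) => list of positions where a match could end.
const lit = text => (tokens, i) => (tokens[i] === text ? [i + 1] : []);
const seq = (...rules) => (tokens, i) =>
  rules.reduce((ends, rule) => ends.flatMap(end => rule(tokens, end)), [i]);
const choice = (...rules) => (tokens, i) => rules.flatMap(rule => rule(tokens, i));
const optional = rule => choice(rule, (_tokens, i) => [i]); // the rule, or nothing
const repeat = rule => (tokens, i) => {
  const ends = [i];                       // zero occurrences always match
  for (let k = 0; k < ends.length; k++) {
    for (const end of rule(tokens, ends[k])) {
      if (!ends.includes(end)) ends.push(end);
    }
  }
  return ends;
};
const repeat1 = rule => seq(rule, repeat(rule)); // one, then zero or more

// A rule matches when some match ends exactly at the end of the input.
const matches = (rule, tokens) => rule(tokens, 0).includes(tokens.length);

const block = seq(lit('{'), repeat(lit('stmt')), lit('}'));
const params = seq(lit('('), repeat1(lit('param')), lit(')'));
const call = seq(lit('name'), lit('('), optional(lit('args')), lit(')'));

console.log(matches(block, ['{', '}']));                  // true
console.log(matches(block, ['{', 'stmt', 'stmt', '}']));  // true
console.log(matches(params, ['(', ')']));                 // false
console.log(matches(call, ['name', '(', ')']));           // true
```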

Precedence Functions

`prec()` - Numeric Precedence

Assigns a numerical precedence to resolve LR(1) conflicts.

rules: {
  expression: $ => choice(
    $.unary_expression,
    $.binary_expression
  ),
  
  // Higher precedence (2) means it binds more tightly
  unary_expression: $ => prec(2, seq(
    choice('-', '!', '~'),
    $.expression
  )),
  
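  // prec.left (described below) also makes the rule left-associative;
  // without an associativity, a - b - c would be an unresolved conflict
  binary_expression: $ => prec.left(1, seq(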
    $.expression,
    choice('+', '-', '*', '/'),
    $.expression
  ))
}

The default precedence of all rules is zero. Higher numbers = higher precedence.
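To see what "binds more tightly" means for the resulting tree, here is a small sketch that uses precedence climbing, a different technique from the LR conflict resolution Tree-sitter performs. It is meant only to show how a numeric precedence table changes grouping. The `PREC` table and `parse` function are hypothetical and exist only in this example.

```javascript
// Higher number = binds more tightly, mirroring the convention above.
const PREC = { '+': 1, '-': 1, '*': 2, '/': 2 };

// Returns a fully parenthesized string showing how the tokens group.
function parse(tokens) {
  let pos = 0;
  function parseExpr(minPrec) {
    let left = tokens[pos++]; // an operand
    while (pos < tokens.length && PREC[tokens[pos]] >= minPrec) {
      const op = tokens[pos++];
      // Requiring a strictly higher precedence on the right groups
      // equal-precedence operators to the left.
      const right = parseExpr(PREC[op] + 1);
      left = `(${left} ${op} ${right})`;
    }
    return left;
  }
  return parseExpr(1);
}

console.log(parse(['1', '+', '2', '*', '3'])); // (1 + (2 * 3))
console.log(parse(['1', '*', '2', '+', '3'])); // ((1 * 2) + 3)
```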

`prec.left()` - Left Associativity

Marks a rule as left-associative, optionally with precedence.

rules: {
  binary_expression: $ => choice(
    // a * b * c → (a * b) * c
    prec.left(2, seq($.expression, '*', $.expression)),
    prec.left(2, seq($.expression, '/', $.expression)),
    prec.left(1, seq($.expression, '+', $.expression)),
    prec.left(1, seq($.expression, '-', $.expression))
  )
}

Effect: When parsing a + b + c, prefer matching (a + b) + c over a + (b + c)

`prec.right()` - Right Associativity

Marks a rule as right-associative.

rules: {
  // Assignment is right-associative
  assignment: $ => prec.right(seq(
    $.identifier,
    '=',
    choice($.expression, $.assignment)
  ))
}

Effect: When parsing a = b = c, prefer matching a = (b = c) over (a = b) = c
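Associativity matters because the two groupings can mean different things. The plain JavaScript sketch below (no Tree-sitter involved) folds the same operands from the left and from the right, using subtraction so that the difference is visible in the result.

```javascript
const operands = [8, 4, 2];

// Left-associative grouping: ((8 - 4) - 2)
const leftShape = operands.reduce((acc, x) => `(${acc} - ${x})`);
const leftValue = operands.reduce((acc, x) => acc - x);

// Right-associative grouping: (8 - (4 - 2))
const rightShape = operands.reduceRight((acc, x) => `(${x} - ${acc})`);
const rightValue = operands.reduceRight((acc, x) => x - acc);

console.log(leftShape, '=', leftValue);   // ((8 - 4) - 2) = 2
console.log(rightShape, '=', rightValue); // (8 - (4 - 2)) = 6
```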

`prec.dynamic()` - Dynamic Precedence

Applies precedence at runtime instead of parser generation time.

grammar({
  name: 'mylang',
  
  conflicts: $ => [
    [$.array, $.array_pattern]
  ],
  
  rules: {
    // In case of ambiguity, prefer array over array_pattern
    array: $ => prec.dynamic(1, seq(
      '[',
      optional(seq($.expression, repeat(seq(',', $.expression)))),
      ']'
    )),
    
    array_pattern: $ => seq(
      '[',
      optional(seq($.pattern, repeat(seq(',', $.pattern)))),
      ']'
    )
  }
})

Only use prec.dynamic() when handling genuine ambiguities declared in the conflicts field.
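The selection step can be pictured with a toy model. It assumes that when two interpretations of the same text both succeed, the one with the higher total dynamic precedence is kept. The `candidates` data is invented for this illustration and does not come from a real parser.

```javascript
// Hypothetical candidates for the input `[a, b]`: each lists the dynamic
// precedence values attached to the rules it used (none means the default, 0).
const candidates = [
  { rule: 'array_pattern', dynamicPrecedences: [] },
  { rule: 'array', dynamicPrecedences: [1] },
];

const total = c => c.dynamicPrecedences.reduce((sum, p) => sum + p, 0);

// Keep the candidate with the highest total.
const winner = candidates.reduce((best, c) => (total(c) > total(best) ? c : best));

console.log(winner.rule); // array
```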

Token Functions

`token()` - Single Token

Marks a rule as producing a single token.

rules: {
  // Treat the entire sequence as one token
  comment: $ => token(seq(
    '//',
    /.*/
  )),
  
  // Multi-line comment as a single token
  block_comment: $ => token(seq(
    '/*',
    /[^*]*\*+([^/*][^*]*\*+)*/,
    '/'
  ))
}

The token() function only accepts terminal rules: strings, regexes, and combinations of them built with functions such as seq() and choice(). A reference to another rule is not accepted, so token($.foo) will not work.
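The comment patterns above can be checked in plain JavaScript by joining the pieces passed to `token(seq(...))` into a single regex. This only exercises the patterns themselves; it says nothing about how the generated lexer is implemented. The variable names are chosen for this example.

```javascript
// '/*' + the middle pattern + '/' from the block_comment rule, as one regex.
const blockComment = /\/\*[^*]*\*+([^/*][^*]*\*+)*\//;
// '//' + /.*/ from the comment rule.
const lineComment = /\/\/.*/;

console.log(blockComment.exec('/* a * b */ rest')[0]); // /* a * b */
console.log(blockComment.exec('/* x */ y */')[0]);     // /* x */
console.log(lineComment.exec('// note\nnext')[0]);     // // note
```

The second call shows that the block comment pattern stops at the first closing `*/` instead of running on to a later one.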

`token.immediate()` - No Whitespace

Matches a token only if there is no whitespace before it.

rules: {
  // Template literal with interpolation
  template_string: $ => seq(
    '`',
    repeat(choice(
      $.template_chars,
      seq(
        '${',
        $.expression,
        token.immediate('}')  // No whitespace before }
      )
    )),
    '`'
  )
}

Use case: When whitespace would change the meaning of the token

Naming Functions

`alias()` - Rename Nodes

Causes a rule to appear with a different name in the syntax tree.

rules: {
  statement: $ => choice(
    $.if_statement,
    $.while_statement,
    // Make 'for' appear as 'for_statement'
    alias('for', $.for_statement)
  )
}

rules: {
  // Make this_keyword appear as the string 'this'
  this_expression: $ => alias($.this_keyword, 'this')
}

`field()` - Named Fields

Assigns a field name to child nodes for easier access.

rules: {
  function_definition: $ => seq(
    'func',
    field('name', $.identifier),
    field('parameters', $.parameter_list),
    field('return_type', optional($.type)),
    field('body', $.block)
  ),
  
  binary_expression: $ => seq(
    field('left', $.expression),
    field('operator', choice('+', '-', '*', '/')),
    field('right', $.expression)
  )
}

Use fields to make tree traversal code more readable and resilient to grammar changes.
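The resilience argument can be sketched with a hand-built node. The object layout and the `childForField` helper are hypothetical and written for this illustration. A real tree would come from a parser binding, whose API may differ.

```javascript
// Hypothetical, hand-built node for `func add(a, b) { ... }`.
const functionDefinition = {
  type: 'function_definition',
  children: [
    { field: null, type: 'func' },
    { field: 'name', type: 'identifier', text: 'add' },
    { field: 'parameters', type: 'parameter_list', text: '(a, b)' },
    { field: 'body', type: 'block', text: '{ ... }' },
  ],
};

// Look a child up by field name instead of by position.
function childForField(node, name) {
  return node.children.find(child => child.field === name) ?? null;
}

console.log(childForField(functionDefinition, 'body').type);   // block
console.log(childForField(functionDefinition, 'return_type')); // null
```

Positional access such as `children[3]` would point at a different child as soon as the optional return type is present, while the lookup by field name keeps working.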

`reserved()` - Contextual Keywords

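Overrides the global reserved word set for part of the grammar. The first argument is the name of a word set declared in the grammar's reserved property; the second is the rule it applies to. This is useful for contextual keywords.

grammar({
  name: 'javascript',

  // Word token used for keyword extraction
  word: $ => $.identifier,

  // An object of named word sets; the first entry is the global set
  reserved: {
    global: $ => ['if', 'else', 'function', 'return'],
    properties: $ => []  // Empty set for property names
  },

  rules: {
    // 'if' is reserved here
    variable: $ => $.identifier,

    // 'if' is allowed as a property name
    property_name: $ => reserved('properties', $.identifier)
  }
})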

Grammar Properties

`name`

The name of your language (required).

export default grammar({
  name: 'javascript',
  // ...
});

`rules`

The grammar rules (required). The first rule is the start symbol.

rules: {
  source_file: $ => repeat($._statement),  // Start rule
  _statement: $ => choice(/* ... */),
  // ...
}

`extras`

Tokens that may appear anywhere in the language.

extras: $ => [
  /\s/,           // Whitespace
  $.comment,      // Comments
  $.line_continuation
]

The default value of extras is [/\s/] (whitespace). To control whitespace explicitly, set extras: $ => [].

`inline`

Rules to remove by inlining their definition.

inline: $ => [
  $._expression,
  $._statement,
  $._type
]

Effect: These rules won’t create syntax tree nodes; their contents appear directly in parent nodes.

`conflicts`

Declares intentional LR(1) conflicts.

conflicts: $ => [
  [$.array, $.array_pattern],
  [$.object, $.object_pattern]
]

Only declare conflicts when you have genuine ambiguities that should be resolved at runtime using GLR parsing.

`externals`

Tokens handled by an external scanner.

externals: $ => [
  $.indent,
  $.dedent,
  $.string_content
]

See External Scanners for details.

`word`

The token used for keyword extraction optimization.

word: $ => $.identifier

Setting a word token makes the parser smaller and faster by optimizing keyword matching.
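The idea behind keyword extraction can be sketched as a two-step lookup: match a whole word first, then decide whether that word is a keyword. This is a simplified model with hypothetical names (`WORD`, `KEYWORDS`, `lexWord`), not the generated lexer.

```javascript
const WORD = /^[a-zA-Z_][a-zA-Z0-9_]*/;  // plays the role of the `word` token
const KEYWORDS = new Set(['if', 'else', 'return']);

// Match a whole word first, then classify it.
function lexWord(input) {
  const match = WORD.exec(input);
  if (!match) return null;
  const text = match[0];
  return { type: KEYWORDS.has(text) ? text : 'identifier', text };
}

console.log(lexWord('if (x)').type);   // if
console.log(lexWord('iffy = 1').type); // identifier
```

Because the whole word is matched before the keyword check, `iffy` is never split into the keyword `if` followed by `fy`.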

`supertypes`

Rules considered abstract supertypes.

supertypes: $ => [
  $._expression,
  $._statement,
  $._declaration
]

Effect: These nodes are hidden in the parse tree but can be used in queries.

`precedences`

Named precedence levels. Each inner array lists precedence names in descending order, from highest to lowest.

precedences: $ => [
  ['unary', 'multiplicative', 'additive'],
  ['type', 'expression']
]

Use with: prec('unary', ...) instead of prec(2, ...)

Complete Example

Here’s a complete grammar demonstrating many DSL features:

grammar.js

export default grammar({
  name: 'calculator',
  
  extras: $ => [/\s/],
  
  word: $ => $.identifier,
  
  rules: {
    source_file: $ => repeat($._statement),
    
    _statement: $ => choice(
      $.assignment,
      $.expression
    ),
    
    assignment: $ => seq(
      field('left', $.identifier),
      '=',
      field('right', $.expression)
    ),
    
    expression: $ => choice(
      $.identifier,
      $.number,
      $.binary_expression,
      $.unary_expression,
      $.parenthesized_expression
    ),
    
    binary_expression: $ => choice(
      prec.left(2, seq(
        field('left', $.expression),
        field('operator', choice('*', '/')),
        field('right', $.expression)
      )),
      prec.left(1, seq(
        field('left', $.expression),
        field('operator', choice('+', '-')),
        field('right', $.expression)
      ))
    ),
    
    unary_expression: $ => prec(3, seq(
      field('operator', '-'),
      field('operand', $.expression)
    )),
    
    parenthesized_expression: $ => seq(
      '(',
      $.expression,
      ')'
    ),
    
    identifier: $ => /[a-zA-Z_][a-zA-Z0-9_]*/,
    number: $ => /\d+/
  }
});
