Grammar DSL
The Tree-sitter grammar DSL provides a set of built-in functions for defining grammar rules. This page documents all available functions and grammar properties.
Core Concepts
The $ Symbol
Every grammar rule is written as a JavaScript function that takes a parameter conventionally called $. Use $.identifier to refer to another grammar symbol within a rule.
rules: {
  function_call: $ => seq(
    $.identifier, // Reference to another rule
    '(',
    optional($.arguments),
    ')'
  )
}
Avoid names starting with $.MISSING or $.UNEXPECTED as they have special meaning for the tree-sitter test command.
Terminal Symbols
Terminal symbols are described using JavaScript strings and regular expressions.
Strings
rules: {
  if_statement: $ => seq(
    'if', // Exact string match
    '(',
    $.expression,
    ')'
  )
}
Tree-sitter doesn’t use JavaScript’s regex engine at runtime. It generates regex-matching logic based on Rust regex syntax as part of the parser.
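Regular-expression terminals are written as ordinary JavaScript regex literals, even though the generated parser does the matching. The sketch below shows a few such terminals and a quick way to sanity-check them. The rule names and the `matchesWhole` helper are illustrative assumptions, not part of Tree-sitter, and the check uses JavaScript's own engine, so it is only meaningful for syntax both engines share.

```javascript
// Hypothetical terminal rules as they might appear in a `rules` object.
// Only syntax common to JavaScript and Rust regexes is used (character
// classes, quantifiers); lookaround and backreferences are avoided.
const rules = {
  identifier: $ => /[a-zA-Z_][a-zA-Z0-9_]*/,
  integer: $ => /[0-9]+/,
  line_comment: $ => /\/\/[^\n]*/,
};

// Illustrative helper (not a Tree-sitter API): checks whether a terminal
// rule's pattern matches an entire string, using JavaScript's regex engine.
function matchesWhole(rule, text) {
  const pattern = rule({}); // terminal rules do not use `$`
  return new RegExp(`^(?:${pattern.source})$`).test(text);
}
```

A check like this can catch typos in a pattern early, but it does not replace running the generated parser against real input.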
Rule Functions
seq() - Sequences
Matches rules in order, one after another.
rules: {
  variable_declaration: $ => seq(
    'let',
    $.identifier,
    '=',
    $.expression,
    ';'
  )
}
Analogous to: Writing symbols next to each other in EBNF notation
choice() - Alternatives
Matches one of a set of possible rules.
rules: {
  expression: $ => choice(
    $.identifier,
    $.number,
    $.string,
    $.binary_expression,
    $.unary_expression
  )
}
The order of the arguments does not matter.
Analogous to: The | (pipe) operator in EBNF notation
repeat() - Zero or More
Matches zero or more occurrences of a rule.
rules: {
  // A block contains zero or more statements
  block: $ => seq(
    '{',
    repeat($.statement),
    '}'
  )
}
Analogous to: The {x} (curly brace) syntax in EBNF notation
repeat1() - One or More
Matches one or more occurrences of a rule.
rules: {
  // At least one digit
  number: $ => repeat1(/[0-9]/),
  // At least one parameter
  parameter_list: $ => seq(
    '(',
    repeat1($.parameter),
    ')'
  )
}
Analogous to: The + operator in regex or EBNF
optional() - Zero or One
Matches zero or one occurrence of a rule.
rules: {
  function_declaration: $ => seq(
    'function',
    $.identifier,
    optional($.type_parameters), // Type params are optional
    $.parameter_list,
    optional($.return_type), // Return type is optional
    $.block
  )
}
Analogous to: The [x] (square bracket) syntax in EBNF notation
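The rule functions above are plain JavaScript calls, so they can be combined inside helper functions. A common pattern is a comma-separated list built from seq(), repeat(), and optional(). The following is a minimal sketch; `commaSep` and `commaSep1` are illustrative helper names rather than part of the DSL, and the rule names are hypothetical.

```javascript
// `seq`, `repeat`, and `optional` are the DSL globals available in grammar.js.

// One or more `rule`, separated by commas: rule (',' rule)*
function commaSep1(rule) {
  return seq(rule, repeat(seq(',', rule)));
}

// Zero or more `rule`, separated by commas.
function commaSep(rule) {
  return optional(commaSep1(rule));
}

// Fragment of a hypothetical `rules` object that uses the helpers.
const rules = {
  argument_list: $ => seq('(', commaSep($.expression), ')'),
  parameter_list: $ => seq('(', commaSep1($.parameter), ')'),
};
```

Because the helpers only return DSL rules, they can be defined anywhere in grammar.js and reused across rules.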
Precedence Functions
prec() - Numeric Precedence
Assigns a numerical precedence to resolve LR(1) conflicts.
rules: {
  expression: $ => choice(
    $.unary_expression,
    $.binary_expression
  ),
  // Higher precedence (2) means it binds more tightly
  unary_expression: $ => prec(2, seq(
    choice('-', '!', '~'),
    $.expression
  )),
  binary_expression: $ => prec(1, seq(
    $.expression,
    choice('+', '-', '*', '/'),
    $.expression
  ))
}
The default precedence of all rules is zero. Higher numbers = higher precedence.
prec.left() - Left Associativity
Marks a rule as left-associative, optionally with precedence.
rules: {
  binary_expression: $ => choice(
    // a * b * c → (a * b) * c
    prec.left(2, seq($.expression, '*', $.expression)),
    prec.left(2, seq($.expression, '/', $.expression)),
    prec.left(1, seq($.expression, '+', $.expression)),
    prec.left(1, seq($.expression, '-', $.expression))
  )
}
Effect: When parsing a + b + c, prefer matching (a + b) + c over a + (b + c)
prec.right() - Right Associativity
Marks a rule as right-associative.
rules: {
  // Assignment is right-associative
  assignment: $ => prec.right(seq(
    $.identifier,
    '=',
    choice($.expression, $.assignment)
  ))
}
Effect: When parsing a = b = c, prefer matching a = (b = c) over (a = b) = c
prec.dynamic() - Dynamic Precedence
Applies precedence at runtime instead of parser generation time.
grammar({
  name: 'mylang',
  conflicts: $ => [
    [$.array, $.array_pattern]
  ],
  rules: {
    // In case of ambiguity, prefer array over array_pattern
    array: $ => prec.dynamic(1, seq(
      '[',
      optional(seq($.expression, repeat(seq(',', $.expression)))),
      ']'
    )),
    array_pattern: $ => seq(
      '[',
      optional(seq($.pattern, repeat(seq(',', $.pattern)))),
      ']'
    )
  }
})
Only use prec.dynamic() when handling genuine ambiguities declared in the conflicts field.
Token Functions
token() - Single Token
Marks a rule as producing a single token.
rules: {
  // Treat the entire sequence as one token
  comment: $ => token(seq(
    '//',
    /.*/
  )),
  // Multi-line comment as a single token
  block_comment: $ => token(seq(
    '/*',
    /[^*]*\*+([^/*][^*]*\*+)*/,
    '/'
  ))
}
The token() function only accepts terminal rules (strings, regexes, and combinations of them), so token($.foo) will not work.
token.immediate() - Immediate Tokens
Matches a token only if there is no whitespace before it.
rules: {
  // Template literal with interpolation
  template_string: $ => seq(
    '`',
    repeat(choice(
      $.template_chars,
      seq(
        '${',
        $.expression,
        token.immediate('}') // No whitespace before }
      )
    )),
    '`'
  )
}
Use case: When whitespace would change the meaning of the token
Naming Functions
alias() - Rename Nodes
Causes a rule to appear with a different name in the syntax tree.
Named Node Alias
rules: {
  statement: $ => choice(
    $.if_statement,
    $.while_statement,
    // Make 'for' appear as 'for_statement'
    alias('for', $.for_statement)
  )
}
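The second argument to alias() decides what kind of node appears in the tree: a symbol such as $.name produces a named node, while a string literal produces an anonymous node. The sketch below contrasts the two forms; the rule names are hypothetical and chosen only for illustration.

```javascript
// Fragment of a hypothetical `rules` object.
// `alias` and `seq` are the DSL globals available in grammar.js.
const rules = {
  // Named alias: the identifier after '.' appears in the syntax tree as a
  // named node called `property_identifier` instead of `identifier`.
  member_expression: $ => seq(
    $.expression,
    '.',
    alias($.identifier, $.property_identifier)
  ),

  // Anonymous alias: the identifier appears as the anonymous node "name",
  // as if the rule had been written as that string literal.
  label: $ => seq(alias($.identifier, 'name'), ':'),
};
```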
field() - Named Fields
Assigns a field name to child nodes for easier access.
rules: {
  function_definition: $ => seq(
    'func',
    field('name', $.identifier),
    field('parameters', $.parameter_list),
    field('return_type', optional($.type)),
    field('body', $.block)
  ),
  binary_expression: $ => seq(
    field('left', $.expression),
    field('operator', choice('+', '-', '*', '/')),
    field('right', $.expression)
  )
}
Use fields to make tree traversal code more readable and resilient to grammar changes.
reserved() - Contextual Keywords
Overrides the global reserved word set with a named set for the wrapped rule, which is useful for contextual keywords. The first argument is the name (a string) of a word set declared in the grammar's reserved property.
grammar({
  name: 'javascript',
  reserved: {
    // The first set is the global reserved word set
    global: $ => ['if', 'else', 'function', 'return'],
    // Empty set: no words are reserved for property names
    properties: $ => []
  },
  rules: {
    // 'if' is reserved here
    variable: $ => $.identifier,
    // 'if' is allowed as a property name
    property_name: $ => reserved('properties', $.identifier)
    // ...
  }
})
Grammar Properties
name
The name of your language (required).
export default grammar({
  name: 'javascript',
  // ...
});
rules
The grammar rules (required). The first rule is the start symbol.
rules: {
  source_file: $ => repeat($._statement), // Start rule
  _statement: $ => choice(/* ... */),
  // ...
}
extras
Tokens that may appear anywhere in the language.
extras: $ => [
  /\s/, // Whitespace
  $.comment, // Comments
  $.line_continuation
]
The default value of extras is [/\s/] (whitespace). To control whitespace explicitly, set extras: $ => [].
inline
Rules to remove from the grammar by replacing every usage with a copy of their definition.
inline: $ => [
  $._expression,
  $._statement,
  $._type
]
Effect: These rules won’t create syntax tree nodes; their contents appear directly in parent nodes.
conflicts
Declares intentional LR(1) conflicts.
conflicts: $ => [
  [$.array, $.array_pattern],
  [$.object, $.object_pattern]
]
Only declare conflicts when you have genuine ambiguities that should be resolved at runtime using GLR parsing.
externals
Tokens handled by an external scanner.
externals: $ => [
  $.indent,
  $.dedent,
  $.string_content
]
See External Scanners for details.
word
The token used for keyword extraction optimization.
Setting a word token makes the parser smaller and faster by optimizing keyword matching.
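As a rough illustration, the word property is usually pointed at the identifier rule, so that keywords which also match the identifier pattern are recognized through that token. The sketch below shows one possible shape of the options object passed to grammar(); the language name and rule names are hypothetical.

```javascript
// Sketch of the options object passed to grammar() in grammar.js.
// `seq`, `choice`, and `repeat` are the DSL globals available there.
const options = {
  name: 'example',

  // String literals that also match `identifier` ('if', 'return') are
  // treated as keywords and matched through the identifier token.
  word: $ => $.identifier,

  rules: {
    source_file: $ => repeat($._statement),
    _statement: $ => choice($.if_statement, $.return_statement),
    if_statement: $ => seq('if', $.identifier),
    return_statement: $ => seq('return', $.identifier),
    identifier: $ => /[a-zA-Z_][a-zA-Z0-9_]*/,
  },
};
```

In a real grammar.js this object would be written inline as the argument to grammar().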
supertypes
Rules considered abstract supertypes.
supertypes: $ => [
  $._expression,
  $._statement,
  $._declaration
]
Effect: These nodes are hidden in the parse tree but can be used in queries.
precedences
Named precedence levels.
precedences: $ => [
  ['unary', 'multiplicative', 'additive'],
  ['type', 'expression']
]
Use with: prec('unary', ...) instead of prec(2, ...)
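To show how the names connect to rules, the sketch below declares one list of named levels (ordered from highest to lowest precedence) and refers to them from prec() and prec.left(). The language name and rule names are hypothetical, and the object is only one possible shape of the options passed to grammar().

```javascript
// Sketch of the options object passed to grammar() in grammar.js.
// `prec`, `seq`, and `choice` are the DSL globals available there.
const options = {
  name: 'example',

  // Each inner array lists precedence names in descending order.
  precedences: $ => [
    ['unary', 'multiplicative', 'additive']
  ],

  rules: {
    expression: $ => choice(
      $.number,
      $.unary_expression,
      $.binary_expression
    ),
    // Binds tighter than both binary forms
    unary_expression: $ => prec('unary', seq('-', $.expression)),
    binary_expression: $ => choice(
      prec.left('multiplicative', seq($.expression, '*', $.expression)),
      prec.left('additive', seq($.expression, '+', $.expression))
    ),
    number: $ => /\d+/,
  },
};
```

Named levels are compared only with other names in the same inner array, which avoids renumbering when a new level is added in the middle.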
Complete Example
Here’s a complete grammar demonstrating many DSL features:
export default grammar({
  name: 'calculator',
  extras: $ => [/\s/],
  word: $ => $.identifier,
  rules: {
    source_file: $ => repeat($._statement),
    _statement: $ => choice(
      $.assignment,
      $.expression
    ),
    assignment: $ => seq(
      field('left', $.identifier),
      '=',
      field('right', $.expression)
    ),
    expression: $ => choice(
      $.identifier,
      $.number,
      $.binary_expression,
      $.unary_expression,
      $.parenthesized_expression
    ),
    binary_expression: $ => choice(
      prec.left(2, seq(
        field('left', $.expression),
        field('operator', choice('*', '/')),
        field('right', $.expression)
      )),
      prec.left(1, seq(
        field('left', $.expression),
        field('operator', choice('+', '-')),
        field('right', $.expression)
      ))
    ),
    unary_expression: $ => prec(3, seq(
      field('operator', '-'),
      field('operand', $.expression)
    )),
    parenthesized_expression: $ => seq(
      '(',
      $.expression,
      ')'
    ),
    identifier: $ => /[a-zA-Z_][a-zA-Z0-9_]*/,
    number: $ => /\d+/
  }
});