# Symbolic Tokens
# Overview
A _symbolic token_ is one of a fixed set of tokens that consist of characters that are not valid in identifiers. That is, they are tokens consisting of symbols, not letters or numbers. Operators are one use of symbolic tokens, but they are also used in patterns (`:`), declarations (`->` to indicate return type, `,` to separate parameters), statements (`;`, `=`, and so on), and other places (`,` to separate function call arguments).
Carbon has a fixed set of symbolic tokens, defined by the language specification. Developers cannot define new symbolic tokens in their own code.
Symbolic tokens are lexed using a "max munch" rule: at each lexing step, the longest symbolic token defined by the language specification that appears starting at the current input position is lexed, if any.
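The "max munch" rule can be sketched in a few lines of Python. This is only an illustration, not the Carbon toolchain's implementation: the token set below is a small hypothetical subset, and the helper names `lex_symbolic` and `lex_all_symbolic` are invented for this example.

```python
# Hypothetical subset of symbolic tokens, for illustration only. The
# authoritative list is the one defined by the language specification.
SYMBOLIC_TOKENS = {
    "-", "--", "->", "=", "==", "<", "<<", "<=", "<<=", ":", ":!", ",", ";", "*",
}

# Longest token in the set; no candidate longer than this needs to be tried.
MAX_LEN = max(len(token) for token in SYMBOLIC_TOKENS)


def lex_symbolic(source, pos):
    """Return the longest known symbolic token starting at pos, or None."""
    longest = min(MAX_LEN, len(source) - pos)
    # Try the longest candidate first so the first hit is the max munch.
    for length in range(longest, 0, -1):
        candidate = source[pos:pos + length]
        if candidate in SYMBOLIC_TOKENS:
            return candidate
    return None


def lex_all_symbolic(source):
    """Lex a string that contains only symbolic tokens and whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        token = lex_symbolic(source, pos)
        if token is None:
            raise ValueError(f"no symbolic token at position {pos}")
        tokens.append(token)
        pos += len(token)
    return tokens
```

With this hypothetical set, `<<<` lexes as `<<` followed by `<`, and `->-` lexes as `->` followed by `-`: at each step the lexer takes the longest known token, never a longer run of symbol characters that is not itself in the set.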
When a symbolic token is used as an operator, the surrounding whitespace must follow certain rules:
- There can be no whitespace between a unary operator and its operand.
- The whitespace around a binary operator must be consistent: either there is whitespace on both sides or on neither side.
- If there is whitespace on neither side of a binary operator, the token before the operator must be an identifier, a literal, or any kind of closing bracket (for example, `)`, `]`, or `}`), and the token after the operator must be an identifier, a literal, or any kind of opening bracket (for example, `(`, `[`, or `{`).
These rules enable us to use a token like `*` as a prefix, infix, and postfix operator, without creating ambiguity.
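One way to read the whitespace rules is as a classifier that decides, from the neighboring tokens and whitespace alone, whether an operator such as `*` is being used as a prefix, infix, or postfix operator. The Python sketch below is an interpretation of the rules above, not the toolchain's actual logic; the function name `classify_operator` and the token-kind strings are assumptions made for this example.

```python
# Token kinds (hypothetical labels) that may sit directly next to a binary
# operator that is written with no whitespace on either side.
OPERAND_END = {"identifier", "literal", "close_bracket"}
OPERAND_START = {"identifier", "literal", "open_bracket"}


def classify_operator(prev_kind, space_before, space_after, next_kind):
    """Classify an operator token as 'infix', 'prefix', 'postfix', or 'invalid'.

    prev_kind and next_kind are the kinds of the neighboring tokens;
    space_before and space_after say whether whitespace separates the
    operator from each neighbor.
    """
    if space_before and space_after:
        # Whitespace on both sides: consistent, so this is a binary operator.
        return "infix"
    if space_before:
        # Whitespace only before: the operator is attached to what follows.
        return "prefix"
    if space_after:
        # Whitespace only after: the operator is attached to what precedes.
        return "postfix"
    # No whitespace on either side: the neighbors decide.
    if prev_kind in OPERAND_END and next_kind in OPERAND_START:
        return "infix"
    if next_kind in OPERAND_START:
        return "prefix"
    if prev_kind in OPERAND_END:
        return "postfix"
    return "invalid"
```

Under this reading, `a*b` and `a * b` are both infix uses, `a *b` is a prefix use applied to `b`, and `a* b` is a postfix use applied to `a`.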
# Details
# Symbolic token list
The following is the initial list of symbolic tokens recognized in a Carbon source file:
# Alternatives considered
Alternatives from proposal #601:
- lex the longest sequence of symbolic characters rather than lexing only the longest known operator
- support an extensible operator set
- different whitespace restrictions or no whitespace restrictions
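The difference between the first alternative and the adopted rule can be illustrated with a small Python sketch. The token set, the set of symbol characters, and both function names are hypothetical and chosen only for this comparison.

```python
# Hypothetical symbol characters and known tokens, for illustration only.
SYMBOL_CHARS = set("-=<>*:!,;")
KNOWN_TOKENS = {"=", "==", "-", "->", "*"}


def longest_known_token(source, pos):
    """Adopted rule: the longest token from the known set starting at pos."""
    best = None
    for end in range(pos + 1, len(source) + 1):
        if source[pos:end] in KNOWN_TOKENS:
            best = source[pos:end]
    return best


def longest_symbol_run(source, pos):
    """Alternative: the longest run of symbol characters starting at pos."""
    end = pos
    while end < len(source) and source[end] in SYMBOL_CHARS:
        end += 1
    return source[pos:end]
```

For the input `x=-1` at position 1, `longest_known_token` yields `=` and leaves `-` to be lexed as the next token, while `longest_symbol_run` yields `=-`, which is not a token in this hypothetical set.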
# References
- Proposal #162: Basic Syntax
- Proposal #339: Add `var [ = ];` syntax for variables
- Proposal #438: Add statement syntax for function declarations
- Proposal #561: Basic classes: use cases, struct literals, struct types, and future work
- Proposal #601: Operator tokens
- Proposal #676: `:!` generic syntax
- Proposal #702: Comparison operators
- Proposal #989: Member access expressions
- Proposal #1083: Arithmetic expressions
- Proposal #1191: Bitwise operators
- Proposal #2188: Pattern matching syntax and semantics
- Proposal #2274: Subscript syntax and semantics
- Proposal #2511: Assignment statements
- Proposal #2665: Semicolons terminate statements