- Meta
- Lexical Structure
- Comments
- Tokens
- Keywords
- Identifiers
- Operators and punctuation
- Literals
- Numeric literals
- Character literals
- String literals
- Modules
- Builtin modules
- Type System
- Builtin types
- Numeric types
- Pointer types
- Reference types
- Struct types
- Tuple types
- Function types
- Variant types
- decltype
- Typealias
- Declarations
- Function declaration
- Member functions
- Function overloads
- Overload resolution
- Operator declaration
- Struct declaration
- Impl blocks
- Expressions
- Literal expressions
- Operators
- Type conversions
- Lambdas
- Statements
- Compound statements
- Return statements
- break and continue statements
- if statements
- for loops
- Capture by reference
- while loops
- Attributes
- Attribute types
- Function attributes
- Struct attributes
- Intrinsics
- Lvalue references
- Templates
- Template parameters
- Template arguments
- Template argument deduction
Meta[yo.meta]
Note
- This is very much still work in progress and does not necessarily describe the language as implemented in the GitHub repo.
The syntax rules in this document use an extended BNF, with the following modifications:
(* regular expressions *)
? P ? := all values matched by a regular expression with the pattern P
(* syntax shorthand for a (possibly empty) comma-separated list *)
L(R) := [ R { "," R } ]
(* syntax shorthand for a repeated rule which may not be omitted *)
{R}+ := R {R}
Lexical Structure[yo.lex]
Yo source code is written in ASCII. Some UTF-8 codepoints will probably work in identifiers and string literals, but there's no proper handling for characters outside the ASCII character set.
Comments[yo.lex.comment]
There are two kinds of comments:
- Line comments, starting with // and continuing until the end of the line
- Block comments, starting with /* and continuing until the next */
Tokens[yo.lex.token]
The Yo lexer differentiates between the following kinds of tokens: keywords, identifiers, punctuation, and literals.
Keywords[yo.lex.keyword]
Yo reserves the following keywords:
break else impl match switch where
continue fn in operator unless while
decltype for let return use
defer if mut struct var
Identifiers[yo.lex.ident]
An identifier is a sequence of one or more letters (including the underscore) or digits. The first character must not be a digit.
Syntax
digit = ? 0-9 ?
letter = ? a-zA-Z_ ?
ident = letter { letter | digit }
A sequence of characters that satisfies the ident pattern above and is not a reserved keyword is assumed to be an identifier. All identifiers with two leading underscores are reserved and should be considered internal.
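The identifier rules above can be sketched as a small classifier. This is only an illustration in Python, not the Yo lexer itself; the function name `classify_word` and its return labels are hypothetical.

```python
import re

# Reserved keywords, copied from the keyword table above.
KEYWORDS = {
    "break", "else", "impl", "match", "switch", "where",
    "continue", "fn", "in", "operator", "unless", "while",
    "decltype", "for", "let", "return", "use",
    "defer", "if", "mut", "struct", "var",
}

# ident = letter { letter | digit }, with letter = a-zA-Z_ and digit = 0-9
IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


def classify_word(word: str) -> str:
    """Return 'keyword', 'reserved-identifier', 'identifier' or 'invalid'."""
    if IDENT_RE.fullmatch(word) is None:
        return "invalid"
    if word in KEYWORDS:
        return "keyword"
    if word.startswith("__"):
        # Two leading underscores: reserved, should be considered internal.
        return "reserved-identifier"
    return "identifier"
```

The character classes are spelled out instead of using `\w`, because `\w` would also accept non-ASCII letters.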
Operators and punctuation[yo.lex.operator]
The following characters and character sequences represent operators and punctuation:
+ & && == |> ( )
- | || != = { }
* ^ < ! [ ]
/ << <= . ;
% >> > , :
>=
Literals[yo.lex.literal]
Literals are syntactic tokens which represent a constant value. The Yo lexer defines three kinds of literals: numeric, character, and string literals.
Numeric literals[yo.lex.numeric]
A numeric literal represents a constant numeric value of either an integer or a floating-point type.
Integer literals have an optional prefix to specify the number's base.
Prefix | Base |
---|---|
0b | binary |
0o | octal |
0x | hexadecimal |
none | decimal |
Syntax
bin_digit = '0' | '1'
oct_digit = ? 0-7 ?
dec_digit = ? 0-9 ?
hex_digit = ? 0-9a-f ?
bin_literal = '0b' bin_digit { bin_digit }
oct_literal = '0o' oct_digit { oct_digit }
dec_literal = dec_digit { dec_digit }
hex_literal = '0x' hex_digit { hex_digit }
flt_literal = dec_literal '.' dec_literal
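As a rough illustration of the grammar above, the following Python sketch recognizes a complete numeric literal and computes its value. The names `NUMERIC_RE` and `parse_numeric_literal` are hypothetical and not part of the Yo compiler.

```python
import re

# One alternative per rule of the grammar above. The prefixed forms come
# first, which matters if the pattern is used with match() on longer input.
NUMERIC_RE = re.compile(
    r"(?P<bin>0b[01]+)"
    r"|(?P<oct>0o[0-7]+)"
    r"|(?P<hex>0x[0-9a-f]+)"
    r"|(?P<flt>[0-9]+\.[0-9]+)"
    r"|(?P<dec>[0-9]+)"
)

BASES = {"bin": 2, "oct": 8, "hex": 16, "dec": 10}


def parse_numeric_literal(text: str):
    """Return (kind, value) for a complete numeric literal, or None."""
    match = NUMERIC_RE.fullmatch(text)
    if match is None:
        return None
    kind = match.lastgroup
    if kind == "flt":
        return kind, float(text)
    # Strip the two-character base prefix, if there is one.
    digits = text if kind == "dec" else text[2:]
    return kind, int(digits, BASES[kind])
```

Note that, following the `hex_digit` rule as written, uppercase hexadecimal digits are rejected.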
Character literals[yo.lex.literal.char]
todo
String literals[yo.lex.literal.string]
A string literal is a sequence of ASCII characters, enclosed by double quotes.
There are two kinds of string literals:
- Regular string literals
  The text between the quotes is interpreted as a sequence of character literals, with special handling for escape sequences.
- Raw string literals
  Prefixed with r. The contents of the literal are taken "as is", with no special handling whatsoever.
Note
- Regular and raw string literals are compiled into objects of the String type.
- Prefixing a literal with a b (eg: b"123") results in an object of type *i8 (ie, a pointer to a sequence of characters).
Example
String literal | Characters | Type |
---|---|---|
"abc\n" | 'a' 'b' 'c' '\n' | String |
r"abc\n" | 'a' 'b' 'c' '\' 'n' | String |
b"abc\n" | 'a' 'b' 'c' '\n' | *i8 |
br"abc\n" | 'a' 'b' 'c' '\' 'n' | *i8 |
Modules[yo.module]
Yo source code is organized in modules. Every .yo file is considered a module, uniquely identified by its absolute path.
The use keyword, followed by a string literal, imports a module:
use "name";
Note that this is essentially the same as C++'s #include directive, in that the compiler simply inserts the contents of the imported module at the location of the use statement. If a module has already been imported, further imports of the same module have no effect.
The order of imports does not matter.
Import paths are resolved relative to the directory of the module containing the use statement.
Builtin modules[yo.module.builtin]
The /stdlib directory contains several builtin modules. These can be imported by prefixing the import name with a colon.
Example: Importing a builtin module
use ":std/core";
Builtin modules are bundled with the compiler, meaning that the actual /stdlib files need not be present.
However, the -stdlib-root flag can be used to specify the base directory against which all imports with a path prefixed by : are resolved.
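The resolution rules for regular and builtin imports can be sketched roughly as follows. This is a Python illustration under assumptions, not the compiler's implementation: the names `resolve_import` and `collect_modules` are hypothetical, the default root `/stdlib` stands in for whatever -stdlib-root points to, and the import string is assumed to be used as a path verbatim (no file extension is appended).

```python
import posixpath


def resolve_import(importing_module: str, name: str,
                   stdlib_root: str = "/stdlib") -> str:
    """Map the string of a `use` statement to a module path."""
    if name.startswith(":"):
        # Builtin module: resolved against the stdlib root.
        return posixpath.normpath(posixpath.join(stdlib_root, name[1:]))
    # Regular module: resolved relative to the importing module's directory.
    base = posixpath.dirname(importing_module)
    return posixpath.normpath(posixpath.join(base, name))


def collect_modules(entry: str, uses_of: dict,
                    stdlib_root: str = "/stdlib") -> list:
    """Return every module reachable from `entry`, each one exactly once."""
    seen, order = set(), []

    def visit(module):
        if module in seen:
            return  # already imported: a repeated import has no effect
        seen.add(module)
        order.append(module)
        for name in uses_of.get(module, []):
            visit(resolve_import(module, name, stdlib_root))

    visit(entry)
    return order
```

Because modules are identified by their normalized absolute path, a cyclic import such as `../main.yo` from a submodule is simply skipped the second time.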
Type System[yo.types]
Yo's type system supports both nominal and structural types:
Nominal | Structural |
---|---|
builtin types | pointer types |
struct types | reference types |
variant types | tuple types |
 | function types |
Note
Nominal types can be template declarations, in which case they must be instantiated via an explicit template argument list.
Syntax
Type = NominalType | PointerType | ReferenceType | TupleType | FunctionType
NominalType = ident [ TmplArgs ]
PointerType = '*' Type
ReferenceType = '&' Type
TupleType = '(' L(Type) ')'
FunctionType = TupleType '->' Type
Builtin types[yo.types.builtin]
The following types are defined as builtins at language-level:
Typename | Size (bytes) | Description | Values |
---|---|---|---|
void | 0 | the empty type | n/a |
u{N} | N/8 | unsigned integer type | 0 ... 2^N-1 |
i{N} | N/8 | signed integer type | -2^(N-1) ... 2^(N-1)-1 |
bool | 1 | the boolean type | true, false |
f32 | 4 | 32-bit floating-point type | IEEE-754 binary32 values |
f64 | 8 | 64-bit floating-point type | IEEE-754 binary64 values |
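For the integer rows of the table, size and value range follow directly from the bit width N. A small Python sketch of that arithmetic, where the helper name `integer_type_info` is hypothetical and the accepted widths (8, 16, 32, 64) are taken from the list of numeric types in this document:

```python
def integer_type_info(name: str):
    """Return (size in bytes, min value, max value) for a u{N} or i{N} type."""
    if name[:1] not in ("u", "i") or not name[1:].isdigit():
        raise ValueError(f"not an integer type name: {name!r}")
    bits = int(name[1:])
    if bits not in (8, 16, 32, 64):
        raise ValueError(f"unsupported bit width: {bits}")
    size = bits // 8
    if name[0] == "i":
        # Signed: -2^(N-1) ... 2^(N-1)-1
        return size, -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    # Unsigned: 0 ... 2^N-1
    return size, 0, 2 ** bits - 1
```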
Numeric types[yo.types.numeric]
All numeric types are implemented as builtins:
- Integral types:
  - the bool type
  - the signed integer types: i8, i16, i32, i64
  - the unsigned integer types: u8, u16, u32, u64
- Floating-point types:
  - f32: an IEEE-754 binary32 floating-point value
  - f64: an IEEE-754 binary64 floating-point value
Pointer types[yo.types.ptr]
A pointer *T that points to an object of type T represents the memory address of that object.
Pointers are not guaranteed to point to valid objects.
For a pointer *T, the type T cannot be empty (ie, a type for which sizeof<T>() == 0).
Consequently, Yo does not support C-style "void-pointers". Use *i8 ("pointer to a byte") instead.
Reference types[yo.types.ref]
A reference can be thought of as an alias to an object.
See the lvalue references section for more info.
Struct types[yo.types.struct]
A struct is a nominal composite type with named members. Structs can be template declarations.
See the structs section for more info.
Tuple types[yo.types.tuple]
Tuples are composite types with unnamed members.
Function types[yo.types.fn]
A function type represents all functions with the same parameter and result types.
Variant types[yo.types.variant]
todo
decltype[yo.types.decltype]
The decltype construct can be used in all places where the compiler would expect a type expression.
It takes a single argument (an expression) and yields the type that expression would evaluate to.
The expression is not evaluated, and no code is generated for it (with the exception of template instantiations for types used within the expression).
decltype is useful in situations where expressing a type would otherwise be difficult or impossible, for example when dealing with types that depend on template parameters.
Example: defining a generic add function
fn add<T>(x: T, y: T) -> decltype(x + y) {
    return x + y;
}
Typealias[yo.types.typealias]
A typealias, as the name suggests, is an alias mapping a name to a type.
Syntax
TypealiasDecl = 'use' ident [ TmplParams ] '=' Type ';'
Example:
use size_t = i64;
use IntArray = Array<i64>;
use Array2D<T> = Array<Array<T>>;
Declarations[yo.decl]
Function declaration[yo.decl.fn]
Syntax
FnDecl = 'fn' ident [ TmplParams ] '(' L(FnParam) ')' [ '->' Type ] CompoundStmt
FnParam = ident ':' Type
A function declaration associates an identifier (the function's name) with a sequence of statements (the function's body).
Functions accept zero or more parameters, and may have a return type.
The return type may be omitted from the function's declaration, in which case it defaults to void.
A function's parameter types and its return type are collectively referred to as the function's signature.
Example:
fn greet(name: *i8) {
    printf(b"Hello, %s!\n", name);
}
Member functions[yo.decl.fn.member]
A function declared within the context of an impl block for a type T is called a member function of that type T.
If a member function's first parameter is named self and of type &Self, the member function can be called on instances of the type T.
Otherwise (if the first-param requirement is not satisfied), the function is considered a static member function and can be called on the type T itself.
Example: a simple member function
impl String {
    fn firstChar(self: &Self) -> i8 {
        return self[0];
    }
}
Function overloads[yo.decl.fn.overload]
Multiple functions can have the same name, as long as their signatures are distinct.
(For example, because they accept a different number of arguments or arguments of different types.)
Overload resolution[yo.decl.fn.overload-resolution]
todo
Operator declaration[yo.decl.operator]
Infix (binary) operators are implemented as functions, allowing them to be overloaded for custom signatures. Since they are functions, operator overloads can also be declared as templates.
The following operators may be overloaded:
+ & && == () ...
- | || != [] ..<
* ^ <
/ << >
% >> <=
>=
Example: A simple operator overload implementing addition for a custom Number type
struct Number<T> {
    value: T
}
fn operator + <T>(lhs: Number<T>, rhs: Number<T>) -> Number<T> {
    return Number<T>(lhs.value + rhs.value);
}
Struct declaration[yo.decl.struct]
Syntax
StructDecl = 'struct' ident [ TmplParams ] '{' StructMembers '}'
StructMembers = L(ident ':' Type)
Example
struct Person {
    name: String,
    age: i64,
    happy: bool
}
Impl blocks[yo.decl.impl]
An impl block declares member functions on a type.
Syntax
ImplBlock = 'impl' [TmplParams] Type '{' {FnDecl} '}'
Within the impl block, the compiler provides the typealias Self for referring to the impl block's type.
Example: A simple impl block
impl i64 {
    fn adding(self: &Self, other: i64) -> i64 {
        return self + other;
    }
}
Example: A templated impl block
// The `intersection` function can be called on any `Array` object, regardless of the type `T`
impl<T> Array<T> {
    fn intersection(self: &Self, other: [T]) -> [T] {
        let results = Array<T>();
        for &elem in self {
            if other.contains(elem) {
                results.append(elem);
            }
        }
        return results;
    }
}
An impl block can be declared for any valid type expression:
Example
// declare member functions for all tuples of three elements
// where the first and last elements have the same type
impl<T, U> (T, U, T) {
    ...
}
// declare member functions for all function types
// with a single `i64` parameter and an arbitrary return type
impl<T> (i64) -> T {
    ...
}
// declare member functions for *any* type
impl<T> T {
    ...
}
Expressions[yo.expr]
Every expression evaluates to a value of a specific type, which must be known at compile time.
Literal expressions[yo.expr.literal]
Literals are syntactic tokens which represent a constant value. See yo.lex.literal for more info.
Operators[yo.expr.operator]
Operators are functions that can be applied to objects, thus producing another object:
- Prefix (unary) operators:
Operator | Description | Signature |
---|---|---|
- | negation | (T) -> T |
~ | bitwise NOT | (T) -> T |
! | logical NOT | (bool) -> bool |
& | address-of | (T) -> *T |
- Infix (binary) operators (in decreasing order of precedence):
Operator | Description | Associativity | Precedence |
---|---|---|---|
<< | bitwise shift left | None | Bitshift |
>> | bitwise shift right | None | Bitshift |
* | multiplication | Left | Multiplication |
/ | division | Left | Multiplication |
% | remainder | Left | Multiplication |
& | bitwise AND | Left | Multiplication |
+ | addition | Left | Addition |
- | subtraction | Left | Addition |
\| | bitwise OR | Left | Addition |
^ | bitwise XOR | Left | Addition |
... | inclusive range | None | RangeFormation |
..< | exclusive range | None | RangeFormation |
== | equal | None | Comparison |
!= | not equal | None | Comparison |
< | less than | None | Comparison |
<= | less than or equal | None | Comparison |
> | greater than | None | Comparison |
>= | greater than or equal | None | Comparison |
&& | logical AND | Left | LogicalConjunction |
\|\| | logical OR | Left | LogicalDisjunction |
\|> | function application | Left | FunctionApplication |
Note
Since most infix operators are implemented as functions, they can be overloaded for custom signatures.
Type conversions[yo.expr.conversion]
There are two kinds of type conversions: implicit and explicit conversions.
- Explicit conversions
  The language defines two intrinsics for explicitly converting values between types:
  - Safe (statically checked) typecasting:
    #[intrinsic] fn cast<To, From>(val: From) -> To;
    The cast intrinsic converts a value of type A to a related type B, if there exists a known conversion from A to B. If there is no such conversion, the cast will fail to compile.
  - Unsafe typecasting:
    #[intrinsic] fn bitcast<To, From>(val: From) -> To;
    The bitcast intrinsic converts between any two types A and B, by reinterpreting a value's bit pattern. A and B must have the exact same bit width, otherwise the cast will fail to compile.
- Implicit conversions
  The compiler will generate implicit type casts only for numeric types, and only for value-preserving casts.
  Example
  fn foo<T>(x: T) { }
  foo<i64>(12);  // ok (literal 12 defaults to type i64)
  foo<i32>(12);  // ok (literal 12 fits in type i32)
  foo<i8>(420);  // error (literal 420 does not fit in type i8)
  let x: i8 = -4;
  foo<i64>(x);   // ok (values of type i8 also fit in type i64)
  foo<u64>(x);   // error (cast from i8 to u64 would not be value-preserving)
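One way to read "value-preserving" for integer types is that every value of the source type must be representable in the target type. The Python sketch below checks exactly that, and nothing more: floating-point types and bool are not modelled, and the helper names (`int_range`, `literal_fits`, `is_value_preserving`) are hypothetical.

```python
def int_range(type_name: str):
    """Return (min, max) for an integer type name such as 'i8' or 'u64'."""
    bits = int(type_name[1:])
    if type_name[0] == "i":
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


def literal_fits(value: int, target: str) -> bool:
    """True if an integer literal with this value fits in type `target`."""
    lo, hi = int_range(target)
    return lo <= value <= hi


def is_value_preserving(source: str, target: str) -> bool:
    """True if every value of type `source` is representable in `target`."""
    src_lo, src_hi = int_range(source)
    dst_lo, dst_hi = int_range(target)
    return dst_lo <= src_lo and src_hi <= dst_hi
```

Applied to the example above: `literal_fits(420, "i8")` is false, `is_value_preserving("i8", "i64")` is true, and `is_value_preserving("i8", "u64")` is false because negative values have no unsigned counterpart.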
Lambdas[yo.expr.lambda]
A lambda expression constructs an anonymous function.
A lambda, like a normal function, has a fixed set of input types and a fixed output type. In addition, a lambda can also capture variables and other values from outside its own scope (these captures must be explicitly declared in the lambda's capture list). Since, in most cases, lambdas are essentially just structs with an overloaded call operator, they can also declare template parameters.
There is no uniform type for lambdas with the same signature; instead, the compiler generates an anonymous type for each lambda expression.
Syntax
Lambda = CaptureList [ TmplParams ] Signature CompoundStmt
CaptureList = '[' L(CaptureElement) ']'
CaptureElement = [ '&' ] ident [ '=' Expr ]
Example
// a noop lambda: no input, no output, does nothing
let f1 = []() {};
// a lambda which adds two integers
let f2 = [](x: i64, y: i64) -> i64 {
    return x + y;
};
// a lambda which adds two values of the same type
let f3 = []<T>(x: T, y: T) -> T {
    return x + y;
};
// a lambda which captures an object by reference, and increments it
let x = 0;
let f4 = [&x](inc: i64) {
    x += inc;
};
Statements[yo.stmt]
Statements are executed in order and can affect control flow.
Syntax
Stmt = CompoundStmt | IfStmt | ForLoop | WhileLoop | ReturnStmt | BreakStmt | ContinueStmt
Compound statements[yo.stmt.compound]
A compound statement (also called a block) is a sequence of statements, enclosed in curly braces.
Syntax
CompoundStmt = '{' { Stmt } '}'
Return statements[yo.stmt.ret]
The return statement terminates execution of the current function and transfers control flow back to the caller, optionally passing a value.
Syntax
ReturnStmt = 'return' [ Expr ] ';'
break and continue statements[yo.stmt.breakcont]
The break and continue statements transfer control flow to the nearest break or continue destination, respectively.
Syntax
BreakStmt = 'break' ';'
ContinueStmt = 'continue' ';'
Note: these statements can only be used within the scope of a statement that defines break and continue destinations:
- break: Terminate a for or while loop (ie, transfer control flow to the next statement after the loop)
- continue: Transfer control flow to the condition check of a for or while loop
if statements[yo.stmt.if]
The if statement conditionally transfers control flow to one of multiple branches, depending on the value of one or multiple conditions.
Syntax
IfStmt = 'if' Expr CompoundStmt { 'else' 'if' Expr CompoundStmt } [ 'else' CompoundStmt ]
Example
// A simple if statement
if x < y {
    return x;
} else {
    return y;
}
// An if statement with multiple conditions and branches
if x % 2 == 0 {
    return x;
} else if x.isPrime() {
    return x + 1;
} else {
    return 0;
}
for loops[yo.stmt.for]
A for loop executes a block once for each item provided by an iterable object's iterator.
Syntax
ForLoop = 'for' [ '&' ] ident 'in' Expr CompoundStmt
Example
// Iterating over an array
for number in [0, 2, 4, 6, 8] {
    print(number);
}
Capture by reference[yo.stmt.for.capture]
todo
while loops[yo.stmt.while]
A while loop repeatedly executes a block, as long as a condition evaluates to true.
Syntax
WhileLoop = 'while' Expr CompoundStmt
Example
let x = 1;
while x < 1000 {
    x *= 2;
}
Attributes[yo.attr]
Attributes are used to provide the compiler with additional knowledge about a declaration.
Syntax
AttrList = '#[' L(AttrEntry) ']'
AttrEntry = ident [ '=' AttrValue ]
AttrValue = ident | string
A declaration that can have attributes can be preceded by one or multiple attribute lists. Splitting attributes up into multiple separate attribute lists is semantically equivalent to putting them all in a single list.
Note
An attribute list may not specify the same attribute multiple times
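The equivalence between several attribute lists and a single combined list can be sketched as a merge step. This Python illustration assumes that a duplicate across separate lists is rejected in the same way as a duplicate within one list (which follows from the equivalence, but is not stated explicitly); the function name and the `(name, value)` representation are hypothetical.

```python
def merge_attribute_lists(attribute_lists):
    """Merge several attribute lists into one name -> value mapping.

    Each attribute list is a sequence of (name, value) pairs; a bare
    attribute such as #[inline] is represented as ("inline", True).
    """
    merged = {}
    for entries in attribute_lists:
        for name, value in entries:
            if name in merged:
                raise ValueError(f"attribute '{name}' specified multiple times")
            merged[name] = value
    return merged
```

With this model, `#[inline] #[mangle="bar"]` and `#[inline, mangle="bar"]` produce the same mapping.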
Attribute types[yo.attr.types]
- bool
  The default argument type. The value, unless explicitly stated, is determined by the presence of the attribute.
  Example: Attribute lists A and B are equivalent, as are C and D.
  A #[attr_name]
  B #[attr_name=true]
  C #[]
  D #[attr_name=false]
- string
  For attributes of type string, the value must always be explicitly stated.
  #[attr_name="attr_value"]
Function attributes[yo.attr.fn]
Name | Type | Description |
---|---|---|
extern | bool | C linkage |
inline | bool | Function may be inlined |
always_inline | bool | Function should always be inlined |
intrinsic | bool | (internal) declares a compile-time intrinsic |
no_mangle | bool | Don't mangle the function's name |
no_debug_info | bool | Don't emit debug metadata for this function |
mangle | string | Override a function's mangled name |
startup | bool | Causes the function to be called before execution enters main |
shutdown | bool | Causes the function to be called after main returns |
Note
- the no_mangle, mangle={string} and extern attributes are mutually exclusive
- the no_mangle attribute may only be applied to global function declarations
// Forward-declaring a function with external C linkage
#[extern]
fn strcmp(*i8, *i8) -> i32;
// A function with an explicitly set mangled name
#[mangle="bar"]
fn foo() -> void { ... }
Struct attributes[yo.attr.struct]
Name | Type | Description |
---|---|---|
trivial | bool | Enforce that the type satisfies the requirements of a trivial type |
no_init | bool | The compiler should not generate default initializers for the type |
no_debug_info | bool | Don't emit debug metadata for this type and all of its member functions |
Intrinsics[yo.intrinsic]
A function declared with the intrinsic attribute is considered a compile-time intrinsic. Calls to intrinsic functions will receive special handling by the compiler. All intrinsic functions are declared in the :runtime/intrinsics module.
An intrinsic function may be overloaded for a custom signature; in this case, the overload cannot specify the intrinsic attribute.
Lvalue references[yo.ref]
todo
Templates[yo.tmpl]
Templates provide a way to declare a generic implementation of a function, struct, or variant type.
Syntax
TmplParams = '<' L(ident [ '=' Type ]) '>'
Template parameters[yo.tmpl.params]
A template parameter list consists of one or more template parameters.
In its simplest form, a template parameter is just an identifier, to which the template argument used for the instantiation will be bound for the scope of the template declaration.
A parameter can also have a default value, which can be any type expression.
Example: The identity function
fn id<T>(x: T) -> T {
    return x;
}
Template arguments[yo.tmpl.args]
In order to instantiate a templated declaration, all template arguments must be known. This is achieved by either explicitly specifying the arguments in the template instantiation, or, in the case of calls to function templates, by relying on the compiler to deduce the argument types from context.
Note
Template argument deduction is not supported for calls to the constructor of a struct template. In this case all template arguments need to be explicitly specified (with the possible exception of template parameters which define a default value).
Syntax
TmplArgs = '<' L(Type) '>'
Template argument deduction[yo.tmpl.deduct]
If a template parameter's value is explicitly specified in the template instantiation, that argument will be used, regardless of a possible default value, or other information that might be deduced from context.
For each template parameter P which is not explicitly specified in the instantiation, the compiler will attempt to deduce the template argument from context, using the call's arguments.
The following rules and adjustments apply during deduction:
- If P was deduced to a type &T (ie, some reference), P will be deduced as T
- If P was deduced from a numeric literal of type A, and the compiler encounters another argument, which deduces P to a different type B and is not a literal expression, P will be deduced as B (ie, arguments deduced from numeric literal expressions can be overwritten by deductions based on non-literal expressions)
- If P has already been deduced to type A, and the compiler encounters another argument which deduces P to an unrelated type B, the deduction will fail
All template parameters P which were not deduced, but also didn't produce any deduction failures, and specify a default value T, will be deduced as that default value T.
Example: template argument deduction
// Consider the following function, specifying one template parameter T
fn add<T>(x: T, y: T) -> T {
    return x + y;
}
// Explicit template arguments:
add<i64>(1, 2); // No deduction, T = i64
// Deduced template arguments:
add(1, 2); // T deduced as i64
let x: i32 = 1;
let y: i64 = 2;
add(1, x); // T deduced as i32 (initially deduced as i64, then overwritten by non-literal argument)
add(x, y); // T fails to deduce (initially deduced as i32, then again deduced to incompatible type i64)
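The deduction rules above can be modelled with a short Python sketch. It is a simplification under stated assumptions rather than the compiler's algorithm: every call parameter's type is assumed to be exactly one template parameter, any two different types are treated as "unrelated", and a literal argument is assumed never to override a deduction made from a non-literal. The function name `deduce` and its argument encoding are hypothetical.

```python
def deduce(params, args, explicit=None, defaults=None):
    """Deduce template arguments for a call.

    params:   template parameter name of each call parameter, e.g. ["T", "T"]
    args:     (type name, is_numeric_literal) for each call argument
    explicit: explicitly specified template arguments (name -> type)
    defaults: default values of template parameters (name -> type)
    Returns a dict name -> type; raises TypeError if deduction fails.
    """
    explicit = dict(explicit or {})
    deduced = {}  # name -> (type, deduction rests only on literals)
    for name, (arg_type, is_literal) in zip(params, args):
        if name in explicit:
            continue  # explicitly specified arguments always win
        if arg_type.startswith("&"):
            arg_type = arg_type[1:]  # &T deduces as T
        if name not in deduced:
            deduced[name] = (arg_type, is_literal)
            continue
        prev_type, prev_literal = deduced[name]
        if prev_type == arg_type:
            deduced[name] = (arg_type, prev_literal and is_literal)
        elif prev_literal and not is_literal:
            deduced[name] = (arg_type, False)  # non-literal overrides literal
        elif is_literal and not prev_literal:
            pass  # assumption: literals never override non-literal deductions
        else:
            raise TypeError(
                f"cannot deduce {name}: {prev_type} conflicts with {arg_type}")
    result = dict(explicit)
    result.update({name: t for name, (t, _) in deduced.items()})
    for name, default in (defaults or {}).items():
        result.setdefault(name, default)  # undeduced parameters take defaults
    return result
```

For the `add` example, a literal `1` (type i64) followed by a non-literal i32 argument yields `T = i32`, while two non-literal arguments of types i32 and i64 raise an error.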