Reference

Meta[yo.meta]

Note

  • This is very much still work in progress and does not necessarily describe the language as implemented in the GitHub repo.
  • The syntax rules in this document use an extended BNF, with the following modifications:

    (* regular expressions *)
    ? P ? := all values matched by a regular expression with the pattern P
    
    (* syntax shorthand for a (possibly empty) comma-separated list *)
    L(R) := [ R { "," R } ]
    
    (* syntax shorthand for a repeated rule which may not be omitted *)
    {R}+ := R {R}

Lexical Structure[yo.lex]

Yo source code is written in ASCII. Some non-ASCII characters (encoded as UTF-8) will probably work in identifiers and string literals, but characters outside the ASCII character set are not handled properly.

Comments[yo.lex.comment]

There are two kinds of comments:

  • Line comments, starting with // and continuing until the end of the line
  • Block comments, starting with /* and continuing until the next */

Tokens[yo.lex.token]

The Yo lexer differentiates between the following kinds of tokens: keywords, identifiers, punctuation, and literals.

Keywords[yo.lex.keyword]

Yo reserves the following keywords:

break       else    impl    match       switch    where
continue    fn      in      operator    unless    while
decltype    for     let     return      use
defer       if      mut     struct      var

Identifiers[yo.lex.ident]

An identifier is a sequence of one or more letters or digits. The first element must not be a digit.

Syntax

digit  = ? 0-9 ?
letter = ? a-zA-Z_ ?
ident  = letter { letter | digit }

A sequence of characters that satisfies the ident pattern above and is not a reserved keyword is assumed to be an identifier. All identifiers with two leading underscores are reserved and should be considered internal.
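
The identifier rules above can be modelled in a few lines. The following is a minimal Python sketch, not the lexer's actual implementation; the function name classify_word and its return labels are hypothetical and exist only for illustration.

```python
import re

# Keywords reserved by the reference (section yo.lex.keyword).
KEYWORDS = {
    "break", "else", "impl", "match", "switch", "where",
    "continue", "fn", "in", "operator", "unless", "while",
    "decltype", "for", "let", "return", "use",
    "defer", "if", "mut", "struct", "var",
}

# ident = letter { letter | digit }, where letter is a-z, A-Z or underscore.
IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


def classify_word(text):
    """Return 'keyword', 'reserved-identifier', 'identifier' or 'invalid'."""
    if IDENT_RE.fullmatch(text) is None:
        return "invalid"
    if text in KEYWORDS:
        return "keyword"
    if text.startswith("__"):
        # Two leading underscores: reserved, to be considered internal.
        return "reserved-identifier"
    return "identifier"
```

With this sketch, classify_word("_a1") yields "identifier", while "1abc" is rejected because its first character is a digit.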

Operators and punctuation[yo.lex.operator]

The following characters and character sequences represent operators and punctuation:

+    &    &&    ==    |>    (    )
-    |    ||    !=    =     {    }
*    ^          <     !     [    ]
/    <<         <=          .    ;
%    >>         >           ,    :
                >=

Literals[yo.lex.literal]

Literals are syntactic tokens which represent a constant value. The Yo lexer defines three kinds of literals: numeric, character, and string literals.

Numeric literals[yo.lex.numeric]

A numeric literal represents a constant numeric value of either integer or floating-point type.
Integer literals have an optional prefix to specify the number's base.

Prefix    Base
0b        binary
0o        octal
0x        hexadecimal
none      decimal

Syntax

bin_digit   = '0' | '1'
oct_digit   = ? 0-7 ?
dec_digit   = ? 0-9 ?
hex_digit   = ? 0-9a-f ?

bin_literal = '0b' bin_digit { bin_digit }
oct_literal = '0o' oct_digit { oct_digit }
dec_literal = dec_digit { dec_digit }
hex_literal = '0x' hex_digit { hex_digit }
flt_literal = dec_literal '.' dec_literal
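
As an illustration of the grammar above, the following Python sketch recognizes the five literal forms and computes their values. It is an approximation under stated assumptions: the name parse_numeric is hypothetical, and the sketch follows the grammar literally (for example, only lowercase hexadecimal digits are accepted, since hex_digit lists none other).

```python
import re

# One alternative per literal rule from the grammar above.
# The float rule comes first so that "1.5" is not read as the integer "1".
NUMERIC_RE = re.compile(
    r"(?P<flt>[0-9]+\.[0-9]+)"
    r"|(?P<bin>0b[01]+)"
    r"|(?P<oct>0o[0-7]+)"
    r"|(?P<hex>0x[0-9a-f]+)"
    r"|(?P<dec>[0-9]+)"
)

BASES = {"bin": 2, "oct": 8, "hex": 16, "dec": 10}


def parse_numeric(text):
    """Return the value of a numeric literal, or None if `text` is not one."""
    match = NUMERIC_RE.fullmatch(text)
    if match is None:
        return None
    kind = match.lastgroup  # name of the alternative that matched
    if kind == "flt":
        return float(text)
    digits = text if kind == "dec" else text[2:]  # drop the 0b/0o/0x prefix
    return int(digits, BASES[kind])
```

Note that a prefix must be followed by at least one digit, and that a floating-point literal needs digits on both sides of the dot, so inputs such as 0b or 1. are rejected by this sketch.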

Character literals[yo.lex.literal.char]

todo

String literals[yo.lex.literal.string]

A string literal is a sequence of ASCII characters, enclosed by double quotes.

There are two kinds of string literals:

  • Regular string literals
    The text between the quotes is interpreted as a sequence of character literals, with special handling for escape sequences.
  • Raw string literals
    Prefixed with r. The contents of the literal are taken "as is", with no special handling whatsoever.

Note

  • Regular and raw string literals are compiled into objects of the String type.
  • Prefixing a literal with b (e.g., b"123") results in an object of type *i8 (i.e., a pointer to a sequence of characters).

Example

String literal    Characters              Type
"abc\n"           'a' 'b' 'c' '\n'        String
r"abc\n"          'a' 'b' 'c' '\' 'n'     String
b"abc\n"          'a' 'b' 'c' '\n'        *i8
br"abc\n"         'a' 'b' 'c' '\' 'n'     *i8
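
The table above can be reproduced with a small Python model of the four literal forms. This is a sketch only: the function name decode_string_literal is hypothetical, and the set of escape sequences is an assumption, since this reference shows only the newline escape.

```python
# Escape sequences handled by this sketch. Only the newline escape appears in
# the reference; the remaining entries are assumptions.
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}


def decode_string_literal(source):
    """Split the source text of a string literal into (characters, type)."""
    quote = source.find('"')
    if quote < 0 or len(source) < quote + 2 or not source.endswith('"'):
        raise ValueError("not a string literal")
    prefix, body = source[:quote], source[quote + 1:-1]
    if prefix not in ("", "r", "b", "br"):
        raise ValueError("unknown string literal prefix")
    # A b prefix selects *i8, otherwise the literal becomes a String.
    result_type = "*i8" if "b" in prefix else "String"
    if "r" in prefix:
        # Raw literal: the contents are taken as-is.
        return list(body), result_type
    chars, i = [], 0
    while i < len(body):
        if body[i] == "\\":
            escape = body[i + 1:i + 2]
            if escape not in ESCAPES:
                raise ValueError("unknown or incomplete escape sequence")
            chars.append(ESCAPES[escape])
            i += 2
        else:
            chars.append(body[i])
            i += 1
    return chars, result_type
```

The regular forms yield four characters for the example literal (the last one being a newline), while the raw forms yield five, because the backslash and the n are kept as separate characters.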

Modules[yo.module]

Yo source code is organized in modules. Every .yo file is considered a module, uniquely identified by its absolute path.

The use keyword, followed by a string literal, imports a module:

use "name";

This is essentially the same as C++'s #include directive, in that the compiler simply inserts the contents of the imported module at the location of the use statement. If a module has already been imported, further imports of the same module have no effect. The order of imports does not matter.

Import paths are resolved relative to the directory of the module containing the use statement.

Builtin modules[yo.module.builtin]

The /stdlib directory contains several builtin modules. These can be imported by prefixing the import name with a colon.

Example: Importing a builtin module

use ":std/core";

Builtin modules are bundled with the compiler, so the actual /stdlib files need not be present. However, the -stdlib-root flag can be used to specify the base directory against which all imports with a path prefixed by : are resolved.
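
The import rules described in this section (resolution relative to the importing module, the : prefix for builtin modules, and each module being imported at most once) can be sketched as follows. This is a Python model under assumptions, not the compiler's implementation: the names resolve_import and collect_modules are hypothetical, the imports_of mapping stands in for reading and parsing files, and the sketch leaves import names exactly as written, without guessing how file extensions are handled.

```python
import posixpath


def resolve_import(name, importing_module, stdlib_root="/stdlib"):
    """Map the string of a use statement to an absolute module path."""
    if name.startswith(":"):
        # Builtin module: resolved against the stdlib root directory.
        return posixpath.normpath(posixpath.join(stdlib_root, name[1:]))
    # Regular module: resolved relative to the importing module's directory.
    base_dir = posixpath.dirname(importing_module)
    return posixpath.normpath(posixpath.join(base_dir, name))


def collect_modules(entry, imports_of):
    """Return every module reachable from `entry`, each exactly once.

    `imports_of` maps an absolute module path to the list of strings used in
    its use statements.
    """
    seen, order, pending = set(), [], [entry]
    while pending:
        module = pending.pop()
        if module in seen:
            continue  # already imported: importing again has no effect
        seen.add(module)
        order.append(module)
        for name in imports_of.get(module, []):
            pending.append(resolve_import(name, module))
    return order
```

Because already-seen modules are skipped, modules that import each other do not cause the traversal to loop.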

Type System[yo.types]

Yo's type system supports both nominal and structural types:

Nominal          Structural
builtin types    pointer types
struct types     reference types
variant types    tuple types
                 function types

Note
Nominal types can be template declarations, in which case they must be instantiated via an explicit template argument list.

Syntax

Type          = NominalType | PointerType | ReferenceType | TupleType | FunctionType
NominalType   = ident [ TmplArgs ]
PointerType   = '*' Type
ReferenceType = '&' Type
TupleType     = '(' L(Type) ')'
FunctionType  = TupleType '->' Type

Builtin types[yo.types.builtin]

The following types are defined as builtins at language-level:

Typename    Size (bytes)    Description                   Values
void        0               the empty type                n/a
u{N}        N/8             unsigned integer type         0 ... 2^N-1
i{N}        N/8             signed integer type           -2^(N-1) ... 2^(N-1)-1
bool        1               the boolean type              true, false
f32         4               32-bit floating-point type    IEEE-754 binary32 values
f64         8               64-bit floating-point type    IEEE-754 binary64 values

Numeric types[yo.types.numeric]

All numeric types are implemented as builtins:

  • Integral types:

    • the bool type
    • the signed integer types: i8, i16, i32, i64
    • the unsigned integer types: u8, u16, u32, u64
  • Floating-point types:

    • f32: an IEEE-754 binary32 floating-point value
    • f64: an IEEE-754 binary64 floating-point value
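
The size and value-range formulas for the u{N} and i{N} types can be evaluated directly. The following Python sketch does so for the widths listed above; the function name integer_range is hypothetical.

```python
def integer_range(type_name):
    """Return (size_in_bytes, min_value, max_value) for a type such as 'u8'."""
    kind, bits = type_name[0], int(type_name[1:])
    if kind not in ("u", "i") or bits not in (8, 16, 32, 64):
        raise ValueError("not a builtin integer type: " + type_name)
    if kind == "u":
        # unsigned: 0 ... 2^N-1
        return bits // 8, 0, 2**bits - 1
    # signed: -2^(N-1) ... 2^(N-1)-1
    return bits // 8, -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
```

For example, integer_range("i8") evaluates to (1, -128, 127).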

Pointer types[yo.types.ptr]

A pointer *T that points to an object of type T represents the memory address of that object.

Pointers are not guaranteed to point to valid objects.

For a pointer *T, the type T cannot be empty (i.e., a type for which sizeof<T>() == 0).
Consequently, Yo does not support C-style "void-pointers". Use *i8 ("pointer to a byte") instead.

Reference types[yo.types.ref]

A reference can be thought of as an alias to an object.

See the lvalue references section for more info.

Struct types[yo.types.struct]

A struct is a nominal composite type with named members. Structs can be template declarations.

See the structs section for more info.

Tuple types[yo.types.tuple]

Tuples are composite types with unnamed members.

Function types[yo.types.fn]

A function type represents all functions with the same parameter and result types.

Variant types[yo.types.variant]

todo

decltype[yo.types.decltype]

The decltype construct can be used in all places where the compiler would expect a type expression. It takes a single argument (an expression) and yields the type that expression would evaluate to.
The expression is not evaluated, and no code is generated for it (with the exception of template instantiations for types used within the expression).

decltype is useful in situations where expressing a type would otherwise be difficult or impossible, for example when dealing with types that depend on template parameters.

Example: defining a generic add function

fn add<T>(x: T, y: T) -> decltype(x + y) {
    return x + y;
}

Typealias[yo.types.typealias]

A typealias introduces a new name for an existing type.

Syntax

TypealiasDecl = 'use' ident [ TmplParams ] '=' Type ';'

Example:

use size_t = i64;
use IntArray = Array<i64>;
use Array2D<T> = Array<Array<T>>;

Declarations[yo.decl]

Function declaration[yo.decl.fn]

Syntax

FnDecl  = 'fn' ident [ TmplParams ] '(' L(FnParam) ')' [ '->' Type ] CompoundStmt
FnParam = ident ':' Type

A function declaration associates an identifier (the function's name) with a sequence of statements (the function's body).

Functions accept zero or more parameters, and may have a return type. The return type may be omitted from the function's declaration, in which case it defaults to void.

A function's parameter types and its return type are collectively referred to as the function's signature.

Example:

fn greet(name: *i8) {
    printf(b"Hello, %s!\n", name);
}

Member functions[yo.decl.fn.member]

A function declared within the context of an impl block for a type T is called a member function of that type T.
If a member function's first parameter is named self and is of type &Self, the member function can be called on instances of the type T. Otherwise, the function is considered a static member function and can be called on the type T itself.

Example: a simple member function

impl String {
    fn firstChar(self: &Self) -> i8 {
        return self[0];
    }
}

Function overloads[yo.decl.fn.overload]

Multiple functions can have the same name, as long as their signatures are distinct.
(For example, because they accept a different number of arguments or arguments of different types.)

Overload resolution[yo.decl.fn.overload-resolution]

todo

Operator declaration[yo.decl.operator]

Infix (binary) operators are implemented as functions, allowing them to be overloaded for custom signatures. Since they are functions, operator overloads can also be declared as templates.

The following operators may be overloaded:

+    &     &&    ==    ()    ...
-    |     ||    !=    []    ..<
*    ^           <
/    <<          >
%    >>          <=
                 >=

Example: A simple operator overload implementing addition for a custom Number type

struct Number<T> {
    value: T
}

fn operator + <T>(lhs: Number<T>, rhs: Number<T>) -> Number<T> {
    return Number<T>(lhs.value + rhs.value);
}

Struct declaration[yo.decl.struct]

Syntax

StructDecl    = 'struct' ident [ TmplParams ] '{' StructMembers '}'
StructMembers = L(ident ':' Type)

Example

struct Person {
    name: String,
    age: i64,
    happy: bool
}

Impl blocks[yo.decl.impl]

An impl block declares member functions on a type.

Syntax

ImplBlock = 'impl' [TmplParams] Type '{' {FnDecl} '}'

Within the impl block, the compiler provides the typealias Self for referring to the impl block's type.

Example: A simple impl block

impl i64 {
    fn adding(self: &Self, other: i64) -> i64 {
        return self + other;
    }
}

Example: A templated impl block

// The `intersection` function can be called on any `Array` object, regardless of the type `T`
impl<T> Array<T> {
    fn intersection(self: &Self, other: [T]) -> [T] {
        let results = Array<T>();
        for &elem in self {
            if other.contains(elem) {
                results.append(elem);
            }
        }
        return results;
    }
}

An impl block can be declared for any valid type expression:

Example

// declare member functions for all tuples of three elements
// where the first and last elements have the same type
impl<T, U> (T, U, T) {
    ...
}

// declare member functions for all function types
// with a single `i64` parameter and an arbitrary return type
impl<T> (i64) -> T {
    ...
}

// declare member functions for *any* type
impl<T> T {
    ...
}

Expressions[yo.expr]

Every expression evaluates to a value of a specific type, which must be known at compile time.

Literal expressions[yo.expr.literal]

Literals are syntactic tokens which represent a constant value. See yo.lex.literal for more info.

Operators[yo.expr.operator]

Operators are functions that can be applied to objects, thus producing another object:

  • Prefix (unary) operators:

    Operator    Description    Signature
    -           negation       (T) -> T
    ~           bitwise NOT    (T) -> T
    !           logical NOT    (bool) -> bool
    &           address-of     (T) -> *T
  • Infix (binary) operators (in decreasing order of precedence):

    Operator    Description              Associativity    Precedence
    <<          bitwise shift left       None             Bitshift
    >>          bitwise shift right      None             Bitshift
    *           multiplication           Left             Multiplication
    /           division                 Left             Multiplication
    %           remainder                Left             Multiplication
    &           bitwise AND              Left             Multiplication
    +           addition                 Left             Addition
    -           subtraction              Left             Addition
    |           bitwise OR               Left             Addition
    ^           bitwise XOR              Left             Addition
    ...         inclusive range          None             RangeFormation
    ..<         exclusive range          None             RangeFormation
    ==          equal                    None             Comparison
    !=          not equal                None             Comparison
    <           less than                None             Comparison
    <=          less than or equal       None             Comparison
    >           greater than             None             Comparison
    >=          greater than or equal    None             Comparison
    &&          logical AND              Left             LogicalConjunction
    ||          logical OR               Left             LogicalDisjunction
    |>          function application     Left             FunctionApplication

    Note
    Since most infix operators are implemented as functions, they can be overloaded for custom signatures.
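
To illustrate how the precedence and associativity columns of the infix table interact, the following Python sketch parses a flat list of tokens with precedence climbing. It is a model under assumptions rather than the compiler's parser: the function names are hypothetical, operands are plain strings, parentheses and prefix operators are left out, and chaining two operators of the same non-associative group is assumed to be an error.

```python
# One entry per precedence group, in the table's order (highest first).
LEVELS = [
    ("none", ["<<", ">>"]),                         # Bitshift
    ("left", ["*", "/", "%", "&"]),                 # Multiplication
    ("left", ["+", "-", "|", "^"]),                 # Addition
    ("none", ["...", "..<"]),                       # RangeFormation
    ("none", ["==", "!=", "<", "<=", ">", ">="]),   # Comparison
    ("left", ["&&"]),                               # LogicalConjunction
    ("left", ["||"]),                               # LogicalDisjunction
    ("left", ["|>"]),                               # FunctionApplication
]

# operator -> (numeric precedence, associativity); larger binds tighter.
OPERATORS = {}
for index, (assoc, ops) in enumerate(LEVELS):
    for op in ops:
        OPERATORS[op] = (len(LEVELS) - index, assoc)


def parse_expr(tokens, min_prec):
    """Parse operands and operators of at least `min_prec`; return (tree, rest)."""
    if not tokens or tokens[0] in OPERATORS:
        raise SyntaxError("expected an operand")
    lhs, tokens = tokens[0], tokens[1:]
    while tokens and tokens[0] in OPERATORS and OPERATORS[tokens[0]][0] >= min_prec:
        op = tokens[0]
        prec, assoc = OPERATORS[op]
        # The right-hand side may only contain tighter-binding operators.
        rhs, tokens = parse_expr(tokens[1:], prec + 1)
        lhs = (op, lhs, rhs)
        if (assoc == "none" and tokens and tokens[0] in OPERATORS
                and OPERATORS[tokens[0]][0] == prec):
            raise SyntaxError("operator " + op + " is non-associative")
    return lhs, tokens


def parse(tokens):
    """Parse a token list such as ['a', '+', 'b', '*', 'c'] into nested tuples."""
    tree, rest = parse_expr(tokens, 1)
    if rest:
        raise SyntaxError("unexpected token " + rest[0])
    return tree
```

Under these assumptions, a + b * c groups as a + (b * c), a - b - c groups as (a - b) - c, and a < b < c is rejected.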

Type conversions[yo.expr.conversion]

There are two kinds of type conversions: implicit and explicit conversions.

  • Explicit conversions
    The language defines two intrinsics for explicitly converting values between types:

    • Safe (statically checked) typecasting:

      #[intrinsic]
      fn cast<To, From>(val: From) -> To;

      The cast intrinsic converts a value of type From to a related type To, if there exists a known conversion from From to To. If there is no such conversion, the cast will fail to compile.

    • Unsafe typecasting:

      #[intrinsic]
      fn bitcast<To, From>(val: From) -> To;

      The bitcast intrinsic converts between any two types From and To by reinterpreting a value's bit pattern. From and To must have the exact same bit width; otherwise, the cast will fail to compile.

  • Implicit conversions
    The compiler will generate implicit type casts only for numeric types, and only for value-preserving casts.

    Example

    fn foo<T>(x: T) { }
    
    foo<i64>(12);  // ok    (literal 12 defaults to type i64)
    foo<i32>(12);  // ok    (literal 12 fits in type i32)
    foo<i8>(420);  // error (literal 420 does not fit in type i8)
    
    let x: i8 = -4;
    foo<i64>(x);   // ok    (values of type i8 also fit in type i64)
    foo<u64>(x);   // error (cast from i8 to u64 would not be value-preserving)
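
The "value-preserving" requirement for implicit conversions can be modelled as a range check. The Python sketch below covers integer types only and is an approximation of the rule rather than the compiler's logic; the names integer_bounds, literal_fits and implicit_cast_allowed are hypothetical.

```python
def integer_bounds(type_name):
    """Return (min_value, max_value) for an integer type such as 'i8' or 'u64'."""
    kind, bits = type_name[0], int(type_name[1:])
    if kind == "u":
        return 0, 2**bits - 1
    if kind == "i":
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    raise ValueError("not an integer type: " + type_name)


def literal_fits(value, target):
    """An integer literal converts implicitly if its value fits the target type."""
    low, high = integer_bounds(target)
    return low <= value <= high


def implicit_cast_allowed(source, target):
    """A cast is value-preserving if every source value exists in the target."""
    src_low, src_high = integer_bounds(source)
    dst_low, dst_high = integer_bounds(target)
    return dst_low <= src_low and src_high <= dst_high
```

This reproduces the outcomes of the example above: the literal 420 does not fit in i8, a value of type i8 converts to i64, and a conversion from i8 to u64 is rejected because negative values would be lost.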

Lambdas[yo.expr.lambda]

A lambda expression constructs an anonymous function.

A lambda, like a normal function, has a fixed set of input types and a fixed output type. In addition, a lambda can also capture variables and other values from outside its own scope (these captures must be explicitly declared in the lambda's capture list). Since, in most cases, lambdas are essentially just structs with an overloaded call operator, they can also declare template parameters.

There is no uniform type for lambdas with the same signature; instead, the compiler generates an anonymous type for each lambda expression.

Syntax

Lambda         = CaptureList [ TmplParams ] Signature CompoundStmt
CaptureList    = '[' L(CaptureElement) ']'
CaptureElement = [ '&' ] ident [ '=' Expr ]

Example

// a noop lambda: no input, no output, does nothing
let f1 = []() {};

// a lambda which adds two integers
let f2 = [](x: i64, y: i64) -> i64 {
    return x + y;
};

// a lambda which adds two values of the same type
let f3 = []<T>(x: T, y: T) -> T {
    return x + y;
};

// a lambda which captures an object by reference, and increments it
let x = 0;
let f4 = [&x](inc: i64) {
    x += inc;
};

Statements[yo.stmt]

Statements are executed in order and can affect control flow.

Syntax

Stmt = CompoundStmt | IfStmt | ForLoop | WhileLoop | ReturnStmt | BreakStmt | ContinueStmt

Compound statements[yo.stmt.compound]

A compound statement (also called a block) is a sequence of statements, enclosed in curly braces.

Syntax

CompoundStmt = '{' { Stmt } '}'

Return statements[yo.stmt.ret]

The return statement terminates execution of the current function and transfers control flow back to the caller, optionally passing a value.

Syntax

ReturnStmt = 'return' [ Expr ] ';'

break and continue statements[yo.stmt.breakcont]

The break and continue statements transfer control flow to the nearest break or continue destination, respectively.

Syntax

BreakStmt    = 'break' ';'
ContinueStmt = 'continue' ';'

Note: these statements can only be used within the scope of a statement that defines break and continue destinations:

  • break: Terminate a for or while loop (ie, transfer control flow to the next statement after the loop)
  • continue: Transfer control flow to the condition check of a for or while loop

if statements[yo.stmt.if]

The if statement conditionally transfers control flow to one of multiple branches, depending on the value of one or more conditions.

Syntax

IfStmt = 'if' Expr CompoundStmt { 'else' 'if' Expr CompoundStmt } [ 'else' CompoundStmt ]

Example

// A simple if statement
if x < y {
    return x;
} else {
    return y;
}

// An if statement with multiple conditions and branches
if x % 2 == 0 {
    return x;
} else if x.isPrime() {
    return x + 1;
} else {
    return 0;
}

for loops[yo.stmt.for]

A for loop executes a block once for each item provided by an iterable object's iterator.

Syntax

ForLoop = 'for' [ '&' ] ident 'in' Expr CompoundStmt

Example

// Iterating over an array
for number in [0, 2, 4, 6, 8] {
    print(number);
}

Capture by reference[yo.stmt.for.capture]

todo

while loops[yo.stmt.while]

A while loop repeatedly executes a block, as long as a condition evaluates to true.

Syntax

WhileLoop = 'while' Expr CompoundStmt

Example

let x = 1;
while x < 1000 {
    x *= 2;
}

Attributes[yo.attr]

Attributes are used to provide the compiler with additional knowledge about a declaration.

Syntax

AttrList   = '#[' L(AttrEntry) ']'
AttrEntry  = ident [ '=' AttrValue ]
AttrValue  = ident | string

A declaration that can have attributes can be preceded by one or multiple attribute lists. Splitting attributes up into multiple separate attribute lists is semantically equivalent to putting them all in a single list.

Note
An attribute list may not specify the same attribute multiple times.

Attribute types[yo.attr.types]

  • bool: The default attribute type. Unless explicitly stated, the value is determined by the presence of the attribute.
    Example: Attribute lists A and B are equivalent, as are C and D.

    A  #[attr_name]
    B  #[attr_name=true]
    
    C  #[]
    D  #[attr_name=false]
  • string: For attributes of type string, the value must always be explicitly stated.

    #[attr_name="attr_value"]
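
The attribute rules above (several lists being equivalent to one, no repeated attributes, and bool attributes defaulting to their presence) can be sketched as follows. This is a Python model, not the compiler's implementation: the function names are hypothetical, attribute lists are assumed to be already parsed into (name, value) pairs, and rejecting a duplicate that spans two separate lists is an assumption derived from the equivalence rule.

```python
def merge_attribute_lists(lists):
    """Merge parsed attribute lists into one dict, rejecting duplicates.

    Each list is a sequence of (name, value) pairs. The value is None when the
    attribute was written without '=', which means true for bool attributes.
    """
    merged = {}
    for entries in lists:
        for name, value in entries:
            if name in merged:
                raise ValueError("attribute specified more than once: " + name)
            merged[name] = True if value is None else value
    return merged


def bool_attribute(merged, name):
    """A bool attribute that is absent is treated as false."""
    return merged.get(name, False)
```

For example, the lists #[inline] and #[mangle="bar"] merge into the same result as the single list #[inline, mangle="bar"].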

Function attributes[yo.attr.fn]

Name             Type      Description
extern           bool      C linkage
inline           bool      Function may be inlined
always_inline    bool      Function should always be inlined
intrinsic        bool      (internal) declares a compile-time intrinsic
no_mangle        bool      Don't mangle the function's name
no_debug_info    bool      Don't emit debug metadata for this function
mangle           string    Override a function's mangled name
startup          bool      Causes the function to be called before execution enters main
shutdown         bool      Causes the function to be called after main returns

Note

  • the no_mangle, mangle={string} and extern attributes are mutually exclusive
  • the no_mangle attribute may only be applied to global function declarations

Example

// Forward-declaring a function with external C linkage
#[extern]
fn strcmp(*i8, *i8) -> i32;

// A function with an explicitly set mangled name
#[mangle="bar"]
fn foo() -> void { ... }

Struct attributes[yo.attr.struct]

Name             Type    Description
trivial          bool    Enforce that the type satisfies the requirements of a trivial type
no_init          bool    The compiler should not generate default initializers for the type
no_debug_info    bool    Don't emit debug metadata for this type and all of its member functions

Intrinsics[yo.intrinsic]

A function declared with the intrinsic attribute is considered a compile-time intrinsic. Calls to intrinsic functions will receive special handling by the compiler. All intrinsic functions are declared in the :runtime/intrinsics module.

An intrinsic function may be overloaded for a custom signature; in this case, the overload cannot specify the intrinsic attribute.

Lvalue references[yo.ref]

todo

Templates[yo.tmpl]

Templates provide a way to declare a generic implementation of a function, struct, or variant type.

Syntax

TmplParams = '<' L(ident [ '=' Type ]) '>'

Template parameters[yo.tmpl.params]

A template parameter list consists of one or more template parameters.
In its simplest form, a template parameter is just an identifier, to which the template argument used for the instantiation is bound for the scope of the template declaration. A parameter can also have a default value, which can be any type expression.

Example: The identity function

fn id<T>(x: T) -> T {
    return x;
}

Template arguments[yo.tmpl.args]

In order to instantiate a templated declaration, all template arguments must be known. This is achieved by either explicitly specifying the arguments in the template instantiation, or, in the case of calls to function templates, by relying on the compiler to deduce the argument types from context.

Note
Template argument deduction is not supported for calls to the constructor of a struct template. In this case all template arguments need to be explicitly specified (with the possible exception of template parameters which define a default value).

Syntax

TmplArgs = '<' L(Type) '>'

Template argument deduction[yo.tmpl.deduct]

If a template parameter's value is explicitly specified in the template instantiation, that argument will be used, regardless of a possible default value, or other information that might be deduced from context.

For each template parameter P which is not explicitly specified in the instantiation, the compiler will attempt to deduce the template argument from context, using the call's arguments.

The following rules and adjustments apply during deduction:

  • If P was deduced to a type &T (i.e., some reference), P will be deduced as T
  • If P was deduced from a numeric literal of type A, and the compiler encounters another argument, which deduces P to a different type B and is not a literal expression, P will be deduced as B (i.e., arguments deduced from numeric literal expressions can be overwritten by deductions based on non-literal expressions)
  • If P has already been deduced to type A, and the compiler encounters another argument which deduces P to an unrelated type B, the deduction will fail

All template parameters P which were not deduced, but also didn't produce any deduction failures, and specify a default value T, will be deduced as that default value T.

Example: template argument deduction

// Consider the following function, specifying one template parameter T
fn add<T>(x: T, y: T) -> T {
    return x + y;
}

// Explicit template arguments:
add<i64>(1, 2);  // No deduction, T = i64

// Deduced template arguments:
add(1, 2);       // T deduced as i64

let x: i32 = 1;
let y: i64 = 2;
add(1, x);       // T deduced as i32 (initially deduced as i64 from the literal, then overwritten by the non-literal argument)
add(x, y);       // T fails to deduce (initially deduced as i32, then again deduced to incompatible type i64)
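
The deduction rules of this section can be summarized in a short Python model. It is a sketch under assumptions, not the compiler's algorithm: the names deduce and DeductionError are hypothetical, types are plain strings, any two different types count as "unrelated", and a literal argument is assumed not to override an earlier deduction from a non-literal argument.

```python
class DeductionError(Exception):
    pass


def deduce(tmpl_params, fn_param_types, call_args, explicit=None):
    """Deduce template arguments for a call.

    tmpl_params:    dict mapping parameter name -> default type (or None)
    fn_param_types: declared type of each function parameter, e.g. ['T', 'T']
    call_args:      one (type, is_literal) pair per call argument
    explicit:       dict of explicitly specified template arguments
    """
    explicit = explicit or {}
    deduced = {}  # name -> (type, deduced_from_literals_only)
    for param_type, (arg_type, is_literal) in zip(fn_param_types, call_args):
        if param_type not in tmpl_params or param_type in explicit:
            continue  # not a template parameter, or explicitly specified
        if arg_type.startswith("&"):
            arg_type = arg_type[1:]  # a reference &T deduces as T
        if param_type not in deduced:
            deduced[param_type] = (arg_type, is_literal)
            continue
        old_type, old_literal = deduced[param_type]
        if old_type == arg_type:
            deduced[param_type] = (old_type, old_literal and is_literal)
        elif old_literal and not is_literal:
            deduced[param_type] = (arg_type, False)  # non-literal wins
        elif is_literal and not old_literal:
            pass  # assumption: a literal never overrides a non-literal
        else:
            raise DeductionError("conflicting deductions for " + param_type)
    result = {}
    for name, default in tmpl_params.items():
        if name in explicit:
            result[name] = explicit[name]
        elif name in deduced:
            result[name] = deduced[name][0]
        elif default is not None:
            result[name] = default  # fall back to the default value
        else:
            raise DeductionError("cannot deduce " + name)
    return result
```

Applied to the add function above, a call with two integer literals deduces T as i64, a literal combined with an i32 variable deduces T as i32, and an i32 variable combined with an i64 variable fails to deduce.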