Formal grammar

General program structure

An Argentum program consists of modules. All module sources are placed in the same directory, in text files with the ".ag" extension. Each file name matches its module name.

The name of one module is passed to the compiler as the starting module. The compiler builds a dependency tree and compiles or recompiles the used modules into an executable.
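
As a rough illustration of the dependency walk, here is a minimal Python sketch (not the actual compiler logic). It assumes that dependencies can be found by scanning for `using name` at the start of a line, and the helper names `module_path` and `collect_modules` are hypothetical.

```python
import re
from pathlib import Path

# Assumption: an import always starts a line with `using <module name>`.
USING_RE = re.compile(r"^\s*using\s+([A-Za-z][A-Za-z0-9]*)", re.MULTILINE)


def module_path(src_dir: Path, module: str) -> Path:
    """The file name matches the module name, with the ".ag" extension."""
    return src_dir / f"{module}.ag"


def collect_modules(src_dir: Path, start: str) -> list[str]:
    """Return the starting module and everything it uses, dependencies first."""
    ordered: list[str] = []
    seen: set[str] = set()

    def visit(module: str) -> None:
        if module in seen:  # already handled; also guards against cycles
            return
        seen.add(module)
        text = module_path(src_dir, module).read_text(encoding="utf-8")
        for dependency in USING_RE.findall(text):
            visit(dependency)
        ordered.append(module)

    visit(start)
    return ordered
```

Calling `collect_modules(Path("src"), "app")` would read `src/app.ag` and every module reachable from it, and list them in an order in which each module comes after the modules it uses.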

Lexical rules

A source file is UTF-8 text with any combination of CR/LF line endings.

Tabs are not allowed in source code, so indentation uses spaces only.
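
These two rules are easy to check mechanically. The following Python sketch is only an illustration: the function name `check_source` is hypothetical, it reads "any combination of CR/LF" as CR LF, lone LF and lone CR all being accepted, and it assumes tabs are rejected everywhere, including inside string literals.

```python
def check_source(raw: bytes) -> list[str]:
    """Decode a source file and return its lines, or raise ValueError."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as err:
        raise ValueError(f"source is not valid UTF-8: {err}") from err
    # Assumption: CR LF, lone LF and lone CR are all valid line endings.
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    for number, line in enumerate(lines, start=1):
        column = line.find("\t")
        if column != -1:
            raise ValueError(f"tab at line {number}, column {column + 1}")
    return lines
```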

The underscore is a special symbol:

  • There is an underscore variable.
  • A leading underscore denotes a private class member name (planned).
  • Underscores separate the name of a package from the name of a package item: sys_String.

As such, underscores cannot be part of names.
Use Pascal case for types, an 'x' prefix followed by camel case for constants, and plain camel case for everything else.
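
To make the underscore and casing rules concrete, here is a small Python sketch. Both helpers, `split_long_name` and `naming_kind`, are hypothetical; the second is only a heuristic based on the recommended casing and expects a non-empty name. The sketch deliberately ignores the underscore variable and the planned leading-underscore prefix.

```python
def split_long_name(long_name: str) -> tuple[list[str], str]:
    """Split a package-qualified name such as `sys_String` at underscores."""
    parts = long_name.split("_")
    if any(part == "" for part in parts):
        raise ValueError(f"empty name segment in {long_name!r}")
    return parts[:-1], parts[-1]  # (package path, item name)


def naming_kind(name: str) -> str:
    """Guess what a name denotes from the recommended casing."""
    if name[0].isupper():
        return "type"  # Pascal case
    if len(name) > 1 and name[0] == "x" and name[1].isupper():
        return "constant"  # 'x' prefix followed by camel case
    return "other"  # plain camel case
```

For example, `split_long_name("sys_String")` separates the package `sys` from the item `String`, and `naming_kind("String")` then classifies the item as a type.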

Identifiers start with an upper- or lower-case Latin letter ('A-Z' or 'a-z') and can contain Latin letters and digits '0-9'.
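
The identifier rule maps directly onto a regular expression. A minimal Python sketch follows; the function name `is_identifier` is made up for the example.

```python
import re

# A Latin letter followed by any number of Latin letters and digits.
IDENTIFIER_RE = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_identifier(text: str) -> bool:
    """True if the whole string is a valid simple identifier."""
    return IDENTIFIER_RE.fullmatch(text) is not None
```

With this check `maxSize` and `String` are accepted, while `2fast` (leading digit) and `my_name` (underscore) are rejected.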

Only single line comments are supported:

x := 0; // clear x with zero, also I'm not a nerd
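
A tool that processes sources line by line might strip such comments as sketched below in Python. This is an illustration under assumptions: the helper name `strip_comment` is hypothetical, a backslash is assumed to escape the next character inside literals, and interpolation inside backtick strings is not handled.

```python
def strip_comment(line: str) -> str:
    """Remove a trailing `//` comment, ignoring `//` inside literals."""
    quote = None  # quote character of the literal we are inside, if any
    i = 0
    while i < len(line):
        ch = line[i]
        if quote is not None:
            if ch == "\\":  # assumed escape: skip the next character too
                i += 2
                continue
            if ch == quote:
                quote = None
        elif ch in "\"'`":
            quote = ch
        elif line.startswith("//", i):
            return line[:i].rstrip()
        i += 1
    return line
```

The scan stops at the first `//` outside a literal, so the apostrophe inside the comment in the example above is never reached.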

Formal grammar

module = : ('using' name (';' | '{' import (',' import)* '}'))*   // imported modules and names
           (class | fn | test | const)*
           statements <eof>;     // Main function body
                                 // (only the starting module can have one).
import = : name
         | name '=' name
const =  : 'const' name '=' expression ';'
class =  : 'test'?               // Classes and functions marked with `test`
                                 // are test-only mocks (TBD)
            ('class' | 'interface') long_name '{' (field | method | base)* '}';
field =  : long_name '=' expression ';';
method = : ('-' | '*')? long_name fn_def;
fn_def = : '(' (name type (',' name type)*)? ')'
           ('this' | type)? (';' | block );
base =   : '+' long_name (';' | '{' method* '}');
fn =     : 'fn' name fn_def;
test =   : 'test' name fn_def;  // tests can only be of `fn()` type
type =   : '~' expression
         | 'int'
         | 'double'
         | 'bool'
         | '?' type // optional
         | '-' long_name // conform pointer
         | '*' long_name // frozen pointer
         | ('&' | '&-' | '&*') long_name // weak / weak to conform / weak to frozen pointer
         | '@' long_name // composition (single owner) pointer
         | 'fn' fn_proto // function type
         | '&' fn_proto  // delegate type
         | fn_proto      // lambda types
fn_proto = : '(' (type (',' type)*)? ')' type;
block = : '{' ('=' primitive_name)? (expression | local_definition ';')* expression? '}';
                   // if the last expression ends with ';', it is a void block
local_definition = : name '=' expression ';';
expression = : // Many operators, grouped by priority. The priorities are
               // still subject to change, so this is just a list of
               // binary operators grouped by priority:
        '?'('='name)?  |  '&&'('='name)?    // optional (boolean) handling
        ':' | '||'
        == < > <= >= !=     // comparisons
        + -                 // adds
        * / % & | ^ << >>   // other languages give these operators odd relative
                            // priorities, which leads to scary parentheses,
                            // so here: one priority, left associative
unary_expression = : unar_head (unar_tail)*
lambda    = : primitive_name* ('\' expression | block )
unar_head = : '(' expression ')'
            | block
            | lambda
            | '(' (primitive_name (',' primitive_name)*)? ')' block  // also lambda
            | '@' unary_expression         // deep copy operator
            | '&' unary_expression         // make weak pointer
            | '*' unary_expression         // freeze operator
            | '!' unary_expression         // logical not
            | '-' unary_expression         // int-double negation
            | '~' unary_expression         // int bitwise inverse
            | '+' unary_expression         // to optional
            | '?' type                     // to nullopt of expression type
            | '^' name ('=' expression)?   // break/return/throw operator
            | 'true' | 'false'
            | ('int'|'double') '(' expression ')' // type conversion
            | '\'' utf8_rune '\''          // character code
            | '\`' (utf8_rune | '${' expression '}')* '\`' // string with interpolation
            | '\"' utf8_rune* '\"' // string, basic escaping
            | long_name            // variable or class instantiation
            | [0-9_]+ | '0x' [0-9a-f_]+ | '0b' [01_]+ // int const
            | [0-9]+ '.' [0-9]* ('e' [0-9]*)?       // double const
unar_tail = : '(' (expression (',' expression)*)? ')' lambda?  // function call
            | '~(' (expression (',' expression)*)? ')'   // async delegate call
            | set_operator expression   // assignment. greedy: RHS grabs the rest
                                // of the expression, use () if needed
            | '.' long_name (set_operator expression)?   // field get/set, greedy
            | '[' expression (',' expression)* ']' (set_operator expression)? // indexed get/set. greedy.
            | '~' type                      // typecast operator
name = : [A-Za-z][A-Za-z0-9]*   // primitive_name in other rules appears to mean this same rule
long_name = : primitive_name ('_' primitive_name)*
set_operator = : ':=' | '+=' | '-=' | '*=' | '/=' | '%=' | '^=' | '&=' | '|=' | '<<=' | '>>='
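
To show what the single shared priority for `* / % & | ^ << >>` means in practice, here is a small Python sketch of an evaluator. It is not the Argentum parser. It assumes the priority groups are listed from lowest to highest, covers only integer literals, parentheses, and the comparison, additive and multiplicative/bitwise groups, and uses Python's floor division as a stand-in for integer `/`. The names `LEVELS`, `tokenize` and `evaluate` are made up for the example.

```python
import operator
import re

# Priority groups, lowest first (assumed order). The optional-handling
# groups ('?', '&&', ':', '||') are left out of this sketch.
LEVELS = [
    {"==": operator.eq, "<": operator.lt, ">": operator.gt,
     "<=": operator.le, ">=": operator.ge, "!=": operator.ne},
    {"+": operator.add, "-": operator.sub},
    {"*": operator.mul, "/": operator.floordiv, "%": operator.mod,
     "&": operator.and_, "|": operator.or_, "^": operator.xor,
     "<<": operator.lshift, ">>": operator.rshift},
]
TOKEN_RE = re.compile(r"\s*(\d+|<<|>>|<=|>=|==|!=|[-+*/%&|^<>()])")


def tokenize(text):
    """Split the input into integer, operator and parenthesis tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text):
    """Evaluate an expression using the priority groups in LEVELS."""
    tokens = tokenize(text)
    pos = 0

    def primary():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = binary(0)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("expected ')'")
            pos += 1
            return value
        if token.isdigit():
            return int(token)
        raise ValueError(f"unexpected token {token!r}")

    def binary(level):
        nonlocal pos
        if level == len(LEVELS):
            return primary()
        left = binary(level + 1)
        # Operators of one group are folded left to right (left associative).
        while pos < len(tokens) and tokens[pos] in LEVELS[level]:
            apply = LEVELS[level][tokens[pos]]
            pos += 1
            left = apply(left, binary(level + 1))
        return left

    result = binary(0)
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result
```

Under these rules `evaluate("8 | 1 << 2")` groups as `(8 | 1) << 2` and gives 36, whereas C-style priorities would group it as `8 | (1 << 2)` and give 12.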
