Multilevel breaks from lambdas

This post extends the previous post related to break/continue/return operators.

In other languages

Let's start with a JavaScript example:

// JavaScript can iterate arrays using for loops...
for (const polygon of root) {
    for (const point of polygon.points) {
        point.x += point.y;
        point.y *= 100;
    }
}

// Another way, using lambdas:
root.forEach(polygon => { 
    polygon.points.forEach(point => {
        point.x += point.y
        point.y *= 100;
    });
});

Why is it sometimes desirable to use lambda over loops?

There is only fixed amount of different loop types in any programming language. With Lambdas on the other hand there is unlimited number of possibilities. We can create whatever control structure we want just by defining functions that receive lambdas as parameters. For example such operations as map, reduce, filter are better to implement as a set of functions than to include them in the language itself.
Lambdas often make it clearer what the code is doing by focusing on the action rather than the iteration mechanics.
Using lambdas with methods like .map encourages working with immutable data and allows chaining.

On the other hand Lambdas have one big disadvantage: handling control flow:

Inner Lambda cannot directly return from the outer named function.
The inner lambda cannot break or continue arbitrary loops.

All it can do is to return from itself, effectively performing continue on the most nested loop.

// JavaScript can iterate arrays using for loops, and use break/continue/return
//
function processPoints(root) {
    for (const polygon of root) {
        for (const point of polygon.points) {
            if (point.x == 0)
               return processPoints;  // lambdas cannot do that
            point.x += point.y;
            point.y *= 100;
        }
    }
}

In Argentum

Argentum lambdas are different:

They can access variables in the outer lexical scopes in which they are declared, like in any other lambda-enabled languages.
And also, which is unique to Argentum, they can break to any blocks in the outer lexical scopes, up-to (including) the function scope itself (which counts as return).

fn processPoints(root Array(Polygon)) {
    root.each `polygon {
        polygon.points.each `point {
           point.x == 0.0 ?
               ^processPoints;  // << conditionally return from processPoints
           point.x += point.y;
           point.y *= 100.0
        }
    };
}

In line 4 we conditionally return from the processPoints function.

If we stop our application at this line in debugger, we'll see multiple intermediate stack frames between current lambda and processPoint function (of course, if they haven't been inlined by compiler):

a processPoints function
an Array.each method of the polygons-array
a lambda representing the root.each body
an Array.each of one of points-arrays
a lambda representing polygon.points.each body

Despite being separated from its target by various stack frames, this return expression does its work as expected:

It evaluates the resulting value, if presented (void in our example),
It returns from all functions up-to its target,
It destroys all temporary values and local variables and releases all object retained in all pointers,
It returns from the target block or function with the evaluated result.

How is it implemented internally

It all looks like a stack unwinding and exception handling, but it’s not quite the same. In languages like C++ and Java, code optimization is focused on the normal execution path, often at the expense of making the process of exceptions handling more complex. An exception is treated as a genuine application error, so during exception handling, information about the stack trace is usually gathered, numerous dynamic memory allocations are performed, and, in general, this code path is significantly less optimal. Therefore, using exception handling for the natural execution path of a program is not recommended. As a result, languages often introduce some separate techniques, such as data types like Result, Error-Or, and checks for result codes. Sometimes company code styles prohibit exception handling whatsoever, because of its added runtime cost.

Argentum uses approach similar to a return codes, but it's invisible to software developer. If some lambda can break to the outside scopes, compiler wraps result of this lambda in an optional<T>. In most cases optional wrappers are zero cost abstractions. We can say that behind the scenes code with far-breaks gets converted into something like this:

class Array{
   // This method takes a lambda parameter so its return value (void)
   // gets wrapped with optional and becomes `bool` (aka optional<void>).
   each(body(T)bool) bool {
       forRange(0, size()) `i {
          body(this[i]) : ^each=false // if body returns false, exit the `each` method
       };
       true // Normal return.
   }
}
fn processPoints(root Array(Polygon)) {
    root.each `polygon {
        polygon.points.each `point {=innerBody
           point.x == 0.0 ? ^innerBody=false;
           point.x += point.y;
           point.y *= 100.0;
           true  // normal exit returns true
        }
    }
}

This code, give or take, is similar to the "result-codes" approach that C-programmers use to implement brakes from inside functions. So approximately Argentum has the efficiency of C/C++ when they avoid using exceptions. The only difference here is that Argentum generates all the necessary code automatically.

Side note: In the future Argentum compiler will get devirtualization and inlining passes, and the above use cases will be heavily inlined whenever possible, which would improve multilevel returns even more.

Multi level returns as error handling mechanism

As shown in the previous section, functions with Lambda parameters can play role of any imaginable control flow operator, because in Argentum lambdas can break, continue and return to random arbitrary levels.

The other scenario where lambdas are useful is error-handling.

Sometimes, we call functions that might encounter issues during their execution, and these issues need to be addressed differently depending on the context of the calls. In such cases, we can use a well-known design pattern called "lambda strategy." Here, we pass a lambda-function to our main function, which should be invoked if something goes wrong. Typically, such a lambda attempts to resolve the situation, provide an alternative set of data, or signal the main function about what to do next.

fn safeDivide(a int, b int, onDiv0()int){
   b == 0
     ? onDiv0()   // We call the division-by-0-handler and return its result
     : a/b        // Or we return the normal execution result
}
...
x = safeDivide(a, b, \0);                    // replace all wrong results with 0
y = safeDivide(a, b \log("div by zero")->0); // log error and continue with 0
z = safeDivide(a, b, \terminate(-1));        // end the application

In addition to all of the above, the passed lambda can transfer control back to the block where it was defined, canceling the entire process with any desired result.

fn doSomeStuff() {
   people = getUserList();
   heightToOrder = {
      total = people.sum{_.height};
      avg = safeDivide(
         total,
         people.size(),
         \^heightToOrder=-1); // If safeDivide encounters 0-divider, we break out of
                              // heightToOrder initialization block and set its
                              // variable to -1
         // we can also put here ^doSomeStuff to return from the doSomeStuff function
      calculateHeight(avg, people)
   };
   orderShirts(heightToOrder)  // one size fits all
}

In this example we start our code with calculating the average people height, but if there are no people in the list, we can either call orderShirts with -1 or skip the whole doSomeStuff. And Argentum allows this by breaking outside to any outer level.

In this example the safeDiv function to which we pass a onDiv0 error handler is very simple. Though in real life processes that receive error handlers can be very complex and include arbitrary function calls and multiple levels of recursions. It could even include error handlers that call another error handlers and inside those handlers we might break outside, unwinding all nested functions and lambdas, and returning control to the original function frame with the needed data.

fn initializeOpenGL(onBadVersion(v int)) {
   ...
   version != expected ? onBadVersion(version);
   ...
}
fn initializeGraphics(onError(str)) {
   ...
   initGraphicsPlatform {
     initializeOpenGL { onError("Bad OpenGL version {_}") }
   }
   ...
}

class App{
   init() {
      ...
      initializeGraphics {
         log("Error {_}");
         ^init
      };
      ...
   }
}

In this example return in line 19 unwinds its own lambda, lambdas from line 8 and 9, initGraphicsPlatform, initializeOpenGL, initializeGraphics, and any invisible functions invoked inside the initGraphicsPlatform.

Sometimes such far returns are useful even inside one function:

class JsonParser{
   getString() ?str {
      error = `message {
         log("Json parser error: {message} at {currentPosition}");
         ^getString=?""
      };
      toHexDigit = `c {
         inRange(c, '0', '9') ? c - '0' :
         inRange(c, 'a', 'f') ? c - 'a' + 10 :
         error("not a hex digit")
      };
      current != '"' ? error("expected opening quote");
      result = StrBuilder;
      ...
      n = toHexDigit(current);
      ...
      result.toStr()
   }
}

In this example we try to parse a string out of a JSON data format. There are multiple ways it could end up with an error, and we want to log them all before returning an empty-optional string. So we introduced an error lambda-function that logs the message and returns, but not from itself. It rather returns from the outer getString function. This means that this lambda never returns to its caller. These types of functions have result type no_ret. This type is compatible to any other type in the type system, and as such this function can be called on any branch of if statement without failing type checks.

In the following code we call this error lambda when it's needed, sometimes in the nested lambdas sometimes in the getString function itself.

In the remaining code of our function we only keep track of successful parsing control paths. We don't do anything to account for errors. We know that in the case of error Argentum will unwind our lambda's stack frames and return an empty-optional-string from our getString function, and it will do it with the efficiency of manually-written C/C++ code that uses result codes.

Key takeaways

Argentum extends the existing breaks syntax (^block=result) to allow breaks from inner lambda to any outer lexical scopes.
Sometimes it involves stack unwinding.
It's much faster, takes less memory and avoids any allocations in comparison to the traditional exception handling.
I'llIt allows to break continue and return from lambdas residing in control flow functions, effectively erasing the edge between lambdas-in-functions and built-in control statements.
It allows to handle errors and unwind stacks even for deeply nested and recursive processes.

S	M	T	W	T	F	S
					1	2
3	4	5	6	7	8	9
10	11	12	13	14	15	16
17	18	19	20	21	22	23
24	25	26	27	28	29	30
31

In other languages

In Argentum

How is it implemented internally

Multi level returns as error handling mechanism

Key takeaways

Related Posts

How to run Argentum compiler (and try examples)

Tests

New Argentum project structure

Leave a ReplyCancel Reply