Feature request: Tests

By Pavel_Vozenilek

My ideal is SQLite, where each line of production code is accompanied by roughly a thousand lines of tests. This is not always the perfect approach for everyone every time, but it fits me.


[1] A test is a test, not a function: it doesn't need a name and cannot be called.

TEST()
{
   assert(1 + 1 == 2);
}

TEST()
{
   assert(false); // will fail
}

[2] Any test could have any number of parameters (including none). These parameters do not need to follow the language rules; only the test runner (whatever executes the tests) needs to understand them, that's all.

// test with 4 parameters
TEST( par1  = 10 | par2 = xyz | max-time < 10 ms | leaks-allowed) 
{
   ...
}

This allows specifying constraints, whether a test should be run at this time or not, etc.

For example, if some test fails and I don't have time to fix it right now, I could disable it temporarily:

TEST(do-not-run-until 2023/12/24)
{
  ....
}

The test runner would recognize this and skip the test.

Parameters not recognized by the test runner would be an error.

 

[3] There would be a default test runner, but it would be possible (and desirable) to write one's own.

Example of a handy test runner: one that executes only tests from recently modified source files (say, modified within the last 20 minutes). Every time the application starts, these "recent tests" are run. One does not need to do anything, switch to some mode, or whatever; the tests are run automatically, and only those that are needed. It's like a dream: bugs show up immediately.
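A minimal sketch of how such a runner could be approximated in C today. The registration trick, run_recent_tests, and the 20-minute window are my own hypothetical names; __attribute__((constructor)) is GCC/Clang specific, and unlike the proposal the tests still become ordinary functions:

#include <assert.h>
#include <stdio.h>
#include <time.h>
#include <sys/stat.h>

typedef struct { const char *file; void (*fn)(void); } test_entry;

static test_entry g_tests[1024];
static int g_test_count;

/* nameless TEST() approximated with __LINE__-generated names */
#define CAT_(a, b) a##b
#define CAT(a, b)  CAT_(a, b)
#define TEST()                                                              \
    static void CAT(test_, __LINE__)(void);                                 \
    __attribute__((constructor)) static void CAT(test_reg_, __LINE__)(void) \
    {                                                                       \
        g_tests[g_test_count++] =                                           \
            (test_entry){__FILE__, CAT(test_, __LINE__)};                   \
    }                                                                       \
    static void CAT(test_, __LINE__)(void)

/* run only tests whose source file changed within the given window */
static void run_recent_tests(double window_seconds)
{
    for (int i = 0; i < g_test_count; ++i) {
        struct stat st;
        if (stat(g_tests[i].file, &st) == 0 &&
            difftime(time(NULL), st.st_mtime) <= window_seconds) {
            printf("running recent test from %s\n", g_tests[i].file);
            g_tests[i].fn();
        }
    }
}

TEST()
{
    assert(1 + 1 == 2);
}

int main(void)
{
    run_recent_tests(20 * 60);   /* "recent" = modified in the last 20 minutes */
    return 0;
}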


[4] Invoking tests would be explicit, and the compiler would make sure they are called near the start of main (and not somewhere in the middle).

// C-like syntax
int main(void) 
{
   initialization();

  #ifdef TESTS
  test_runner t;
  t.run_all_tests();
  // t.run_recently_modified_tests();
  #endif

  ... // whatever main() should do
}

Testing is thus not some incomprehensible magic with its own dedicated compilation mode, and one could debug it.


[5] My language would use manual memory management. Tests allow catching memory-related bugs quickly; this was no problem for me at all.

By default, tests should have no leaks inside them. The checking is done by the test runner.

It would be possible for a test to specify that there are leaks inside:

TEST( has-leaks) 
{
   ... // leaking test
}
TEST( leaks 100 B) 
{
   ... // another leaking test, it must leak this exact amount
}
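A sketch of how the runner's leak accounting could be approximated in C today; test_malloc, test_free and run_test_expect_leak are hypothetical names, and the size header is simplified (it ignores alignment concerns):

#include <stdio.h>
#include <stdlib.h>

static long g_live_bytes;    /* bytes currently allocated inside the running test */

static void *test_malloc(size_t n)
{
    size_t *p = malloc(sizeof(size_t) + n);  /* store the size in a small header */
    if (!p) return NULL;
    *p = n;
    g_live_bytes += (long)n;
    return p + 1;
}

static void test_free(void *q)
{
    if (!q) return;
    size_t *p = (size_t *)q - 1;
    g_live_bytes -= (long)*p;
    free(p);
}

/* the runner: each test must leave behind exactly `expected_leak` bytes (0 by default) */
static void run_test_expect_leak(void (*test)(void), long expected_leak)
{
    g_live_bytes = 0;
    test();
    if (g_live_bytes != expected_leak)
        fprintf(stderr, "leak check failed: %ld B live, expected %ld B\n",
                g_live_bytes, expected_leak);
}

static void clean_test(void) { test_free(test_malloc(8)); }   /* no leaks allowed */
static void leaky_test(void) { (void)test_malloc(100); }      /* TEST( leaks 100 B ) analogue */

int main(void)
{
    run_test_expect_leak(clean_test, 0);
    run_test_expect_leak(leaky_test, 100);
    return 0;
}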

[6] A big feature: ability to test non-compiling code. This is handy for testing generics and various corner cases of the language.

TEST( does-not-compile) 
{
   ... // must not compile, the compiler verifies this and continues with the rest of the codebase
}

 // it would be possible to ensure a proper error message for the failing code
TEST( does-not-compile) 
{
   [[expected-error-message-contains:blah blah]] // the compiler checks this
   ... // must not compile
}

[7] There should be no special "test mode" (as, e.g., D has). For example, I want to be able to run recent tests each time I execute the debug build. I want to have a release mode with tests still running, to make sure optimizations didn't screw something up.

One should be able to have "debug-with-tests", "debug", "release", "release-with-tests", etc modes.


[8] In release mode with tests, asserts would normally be disabled. But top-level asserts inside a test body should still be active in such a mode:

TEST() 
{
   // even if compiled in release mode with tests, this assert will still fire 
   assert(2 + 2 == 5);
}
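A sketch of such an always-on test assert in C; TEST_ASSERT is a hypothetical name (in the proposal no separate macro would be needed, the compiler would just keep top-level asserts in test bodies alive):

#include <stdio.h>
#include <stdlib.h>

/* stays active even when NDEBUG has disabled the regular assert() */
#define TEST_ASSERT(cond)                                             \
    do {                                                              \
        if (!(cond)) {                                                \
            fprintf(stderr, "%s:%d: test assertion failed: %s\n",     \
                    __FILE__, __LINE__, #cond);                       \
            abort();                                                  \
        }                                                             \
    } while (0)

/* usage: TEST_ASSERT(2 + 2 == 5);  -- fires even in a release-with-tests build */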

[9] I want to be able to safely test even "impossible to happen" situations:

 void foo() {
    ...
    if (almost-impossible) {
      assert(false);  // this is to notify me I have wrong assumptions
      return;
    }
    ...
 }

 // this parameter informs the test runner and also shuts down one assert(false)
TEST( assert(false)-expected-to-fire 1x ) 
{
   ... // prepare the impossible situation

  foo(); // will run without stopping the execution; if the assert did not fire, contrary to the expectation -> show an error
}
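A sketch of how a test runner could count expected assert failures instead of aborting; CHECKED_ASSERT and g_expected_assert_fires are hypothetical names:

#include <stdio.h>
#include <stdlib.h>

static int g_expected_assert_fires;   /* set by the test, decremented on each fire */

#define CHECKED_ASSERT(cond)                                               \
    do {                                                                   \
        if (!(cond)) {                                                     \
            if (g_expected_assert_fires > 0) {                             \
                --g_expected_assert_fires;  /* expected, keep running */   \
            } else {                                                       \
                fprintf(stderr, "%s:%d: assert failed: %s\n",              \
                        __FILE__, __LINE__, #cond);                        \
                abort();                                                   \
            }                                                              \
        }                                                                  \
    } while (0)

static void foo(void)
{
    int almost_impossible = 1;         /* the test arranged this situation */
    if (almost_impossible) {
        CHECKED_ASSERT(0);             /* "this should never happen" */
        return;
    }
}

static void test_impossible_situation(void)
{
    g_expected_assert_fires = 1;       /* TEST( assert(false)-expected-to-fire 1x ) analogue */
    foo();
    if (g_expected_assert_fires != 0)  /* the assert never fired -> report an error */
        fprintf(stderr, "expected assert did not fire\n");
}

int main(void) { test_impossible_situation(); return 0; }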

[10] I want much more capable asserts:

assert(a < b); // if it fires I want to know a and b values

assert(a < b | value a = {a}, value b = {b}); // here it would also show the formatted string

I may even want to have a block of code attached to an assert:

assert(a < b) -> {
    c = a + b;
    show_assert_message(a = {a}, b = {b}, c = {c}); // this will be shown in addition to the usual file/line
}

(This is similar to the SMART_ASSERT library.)
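A rough C approximation of such an assert for the `a < b` case; ASSERT_LT is a hypothetical name and only handles integer operands (the real SMART_ASSERT is a C++ library with far more machinery):

#include <stdio.h>
#include <stdlib.h>

#define ASSERT_LT(a, b)                                                           \
    do {                                                                          \
        long _a = (long)(a), _b = (long)(b);                                      \
        if (!(_a < _b)) {                                                         \
            fprintf(stderr, "%s:%d: assert failed: %s < %s (a = %ld, b = %ld)\n", \
                    __FILE__, __LINE__, #a, #b, _a, _b);                          \
            abort();                                                              \
        }                                                                         \
    } while (0)

/* usage: ASSERT_LT(items_used, items_capacity);  -- prints both values on failure */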


[11] If an assert fires while running a test, I want it to show which test it was (immediately, without needing to investigate the stack trace).
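A sketch of how a runner could make this possible; g_current_test and run_named_test are hypothetical names:

#include <stdio.h>

static const char *g_current_test = "<outside any test>";

static void run_named_test(const char *name, void (*test)(void))
{
    g_current_test = name;    /* remembered for the duration of the test */
    test();
    g_current_test = "<outside any test>";
}

/* an assert handler would then print, next to the usual file/line:
   fprintf(stderr, "  ...while running test '%s'\n", g_current_test); */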


[12] When one writes a lot of tests, a new problem appears. Some tests could live in the source file, but if there are too many of them, there should be a better way to structure the project.

I want to have so-called "test companion files". If "a.src" is a source file, then the file "a.src-tests", or a sequence of files named "a.src-tests1", "a.src-tests2", "a.src-tests3", ..., would be understood by the compiler as test companion files. The compiler would internally append them to that source file. It would be impossible to export anything from such test companion files; they would be purely a place for tests and nothing else.

There could even be a dedicated directory for such test companion files and nothing else.


[13] It should be possible to specify common test setup and common test teardown just once for several related tests:

TEST() 
{
   ... // common setup code

      TEST()
      {
          ... // first test
      }
      TEST()
      {
          ... // second test
      }

  ... // common teardown code
}

It would get transformed into:

TEST() 
{
   ... // common setup code
   ... // first test
   ... // common teardown code
}

TEST() 
{
   ... // common setup code
   ... // second test
   ... // common teardown code
}

[14] I already mentioned the way to mock functions. This could be used to mock constants too (e.g. to replace timeout constants), just for testing.

The big idea is that production code would need no modifications at all due to testing.

Mock functions should be able to use globals (to communicate with the test body) or, perhaps better, should be able to see locals inside the test body as if they were globals.
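A sketch of the function-pointer trick in C (send_packet, mock_send_packet, etc. are made up for the example); the mock communicates with the test body through globals, since C cannot give it access to the test's locals:

#include <assert.h>
#include <stddef.h>

static int real_send_packet(const void *data, size_t n) { (void)data; return (int)n; }  /* real I/O */

/* production code calls send_packet(...) through this pointer */
static int (*send_packet)(const void *, size_t) = real_send_packet;

/* global used by the mock to communicate with the test body */
static int g_mock_calls;

static int mock_send_packet(const void *data, size_t n)
{
    (void)data; (void)n;
    ++g_mock_calls;          /* just record the call, do no real I/O */
    return (int)n;
}

static void code_under_test(void) { send_packet("ping", 4); }

static void test_sends_exactly_one_packet(void)
{
    send_packet = mock_send_packet;    /* install the mock */
    g_mock_calls = 0;

    code_under_test();

    assert(g_mock_calls == 1);
    send_packet = real_send_packet;    /* restore for other tests */
}

int main(void) { test_sends_exactly_one_packet(); return 0; }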


[15] Similar to the previous feature, one may want to mock whole structures:

TEST() 
{
   [[replace]] my_struct = struct { ... with added members helping the testing }
   [[replace]] ... // some functions which work on "my_struct"

   ... // code is now using replacement for "my_struct"
}

Unlike mocking functions, this may require much more effort: duplicating the code for each test with such a mock.

I was not able to invent an elegant trick for this like the function-pointer one for function mocking.


[16] Tests should have access to everything, no matter how protected it is. This is to test impossible situations and make tests shorter.

TEST() 
{
   some_data d();
   d.protected_positive_number = -1;

   ... // test the impossible situation
}

[17] I would like to check code which uses floating point for numerical stability, by trying it with more precise floats. (This is not mathematically sufficient, but in real life it should catch most of the problems.)

// function I want to check for numerical stability
f32 calculate() { ... complex calculation that may accumulate error ... }

TEST() 
{
   f32 res1 = calculate();

   [[test-numerical-stability-with: f64]]
   // compiler invisibly changes all f32 to f64
   f64 res2 = calculate();
   assert(fabs(res1 - res2) < 0.0001);

   [[test-numerical-stability-with: f128]]
   // compiler invisibly changes all f32 to f128
   f128 res3 = calculate();
   assert(fabs(res1 - res3) < 0.0001);
   ...
}
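A rough way to approximate this in C today is to write the calculation once, generically over the float type, and instantiate it at two precisions; DEFINE_CALCULATE and the summation are made up for the sketch, and the tolerance is arbitrary:

#include <assert.h>
#include <math.h>

/* the "complex calculation", written once and instantiated per float type */
#define DEFINE_CALCULATE(T, name)          \
    static T name(void)                    \
    {                                      \
        T sum = (T)0;                      \
        for (int i = 1; i <= 1000; ++i)    \
            sum += (T)1 / (T)i;            \
        return sum;                        \
    }

DEFINE_CALCULATE(float, calculate_f32)     /* single precision version */
DEFINE_CALCULATE(double, calculate_f64)    /* more precise reference version */

static void test_numerical_stability(void)
{
    float  res1 = calculate_f32();
    double res2 = calculate_f64();
    assert(fabs((double)res1 - res2) < 0.001);   /* fires if f32 drifted too far */
}

int main(void) { test_numerical_stability(); return 0; }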

[18] Big feature: the ability to run a partially uncompilable codebase.

  • I make a change in some important data structure
  • suddenly there are dozens and dozens of places which do not compile
  • I do not want to fix them right now; I want to make sure my change passes some test (or makes some benchmark faster)
  • I somehow say: compiler, compile just this test (or benchmark) and ignore everything else, then run it
  • if the test passes (or the benchmark shows a speedup), I would then do the tedious changes in the dozens of places

[19] Potential feature: in C one needs to guard debug-only code all the time:

void foo() {
   #ifdef DEBUG
    int x = 0;
   #endif
   ...
   ...
   #ifdef DEBUG
   if (...) ++x;
   #endif
   ...
   ...
  assert(x > 10);
}

One may use a helper macro, but it is still a nuisance.

#ifdef DEBUG
# define DBG(...) __VA_ARGS__
#else
#  define DBG(...)
#endif
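A usage sketch with that macro (the parameter name is made up; assumes the DBG definition above):

#include <assert.h>

/* foo() rewritten with the DBG helper: shorter than #ifdef blocks,
   but every debug-only line still has to be wrapped by hand */
void foo(int interesting_events)   /* hypothetical parameter for the sketch */
{
    DBG(int x = 0;)
    /* ... production work ... */
    DBG(if (interesting_events > 0) ++x;)
    /* ... more production work ... */
    DBG(assert(x > 10);)
}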

I would like the compiler to understand that debug-only data and code should disappear in release mode.

void foo() {
    int x = 0; // compiler deduces it disappears in release
   ...
   ...
   if (...) ++x; // disappears in release
   ...
   ...
  assert(x > 10); // this is where the compiler finds this out
}

A sufficiently smart IDE could show such code in a different visual style.


[20] Possibly there could be support for fuzz testing. But I know too little about it to give details at the moment.


[21] Possibly testing for data races in parallel code. I have only a vague idea how to do it. Here are some links to what other people did that may be helpful.


No language I know of supports a significant fraction of such features.

It was possible to partially implement some of these features in C, in C++, or in Nim, but direct language support would be better.

[22] The problem: it is easy to check the returned value of a function:

assert(add(2, 2) == 4);

but I may want some more assurance:

  • that the function add allocates memory exactly once, and that the allocation is 100 bytes,
  • that the function doesn't call any socket API.

Code running inside a test would (invisibly to the user) generate traces of what it is doing. The test would later evaluate the traces to check whether the code did what was expected and did not do anything unexpected. The default traces would be the names of functions called while the test is running.

Outside the tests nothing would be collected. Collected traces would be automatically destroyed at the end of the test.

TEST() 
{
    // The compiler would recognize this and invisibly add a trace calls inside "fopen" and "fclose". 
    // These would just be "fopen() called" and "fclose() called".
    // The compiler would also check for typos, that there are functions fopen/fclose. If not -> error.
    I-am-interested-in_traces("fopen", "fclose");

   ... // here is some code that may call fopen/fclose

   // now I will evaluate whether my expectations were correct.

  // I expect 2 files being opened & closed, one after another
  assert(expect-trace("fopen() called")); // first found trace "eaten"
  assert(expect-trace("fclose() called")); // second trace eaten

  assert(expect-trace("fopen() called"));
  assert(expect-trace("fclose() called"));

  // there must be no more opened/closed files
  assert(!expect-trace("fopen() called")); // no more relevant traces found
  assert(!expect-trace("fclose() called"));

   // The compiler will protect against typos: it will check that the strings "fopen() called" and "fclose() called" are indeed generated somewhere in the codebase.
}

One may want to collect more details, e.g. how many bytes were allocated inside the test. Then one could manually add an explicit trace into a function:

// my malloc implementation
void* malloc(uint n) {
    Trace("malloced %u B", n); // will save this detailed trace
    ... // actual malloc implementation
 }

TEST() 
{
    I-am-interested-in_traces("malloced"); // compiler checks such string is generated somewhere

   ... // does some mallocs

   char* s = find_expected_trace("malloced");
   assert(s);

   // parse the string, extract number from it
  uint n = ...
  assert(n == 100); // we should have allocated 100 B

   s = find_expected_trace("malloced"); // try to find the next trace
   assert(!s); // only one allocation was expected         
}

The compiler inserting default traces and checking trace-string validity is critical for usability.
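A sketch of the trace buffer such a runner could provide; Trace(), expect_trace() and clear_traces() are hypothetical names, and in the proposal the compiler would insert the default Trace() calls and validate the strings:

#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define MAX_TRACES 256
static char g_traces[MAX_TRACES][128];
static int  g_trace_consumed[MAX_TRACES];
static int  g_trace_count;

/* called (in the proposal: inserted by the compiler) by traced code */
static void Trace(const char *fmt, ...)
{
    if (g_trace_count >= MAX_TRACES) return;
    va_list ap;
    va_start(ap, fmt);
    vsnprintf(g_traces[g_trace_count], sizeof g_traces[0], fmt, ap);
    va_end(ap);
    g_trace_consumed[g_trace_count++] = 0;
}

/* returns the first not-yet-consumed trace starting with `prefix`, or NULL */
static const char *expect_trace(const char *prefix)
{
    for (int i = 0; i < g_trace_count; ++i)
        if (!g_trace_consumed[i] &&
            strncmp(g_traces[i], prefix, strlen(prefix)) == 0) {
            g_trace_consumed[i] = 1;          /* "eat" this trace */
            return g_traces[i];
        }
    return NULL;
}

static void clear_traces(void) { g_trace_count = 0; }  /* runner calls this at test end */

/* usage corresponding to the malloc example above */
static void test_single_allocation(void)
{
    Trace("malloced %u B", 100u);             /* would really come from malloc() */

    const char *s = expect_trace("malloced");
    assert(s);                                 /* one allocation happened */
    unsigned n = 0;
    sscanf(s, "malloced %u B", &n);
    assert(n == 100);                          /* exactly 100 B allocated */
    assert(expect_trace("malloced") == NULL);  /* and only one allocation */

    clear_traces();
}

int main(void) { test_single_allocation(); return 0; }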

