# Writing executable tutorials

## The terminology: Stages, chunks and commands

Executable tutorials work a bit like a CI workflow. The command is the smallest element: it corresponds to an executable, with its arguments, to be invoked on the system. A chunk is a collection of commands and groups them with an execution environment; chunks within the same stage can be executed in parallel with each other. Finally, stages are collections of chunks, and they get executed one after the other.
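
As an illustrative sketch (using the chunk syntax introduced just below, with made-up stage names), the two chunks that follow live in two different stages: the whole setup stage finishes before the check stage starts, while chunks inside a single stage may run in parallel.

```bash {"stage":"setup", "runtime":"bash"}
echo "a command in a chunk of the setup stage"
```
```bash {"stage":"check", "runtime":"bash"}
echo "a command in a chunk of the check stage, executed after the setup stage"
```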

## Code chunks

In an executable tutorial, every line of an executable chunk gets executed. A chunk is executable when its code fence provides a valid JSON object with at least a stage configured in it.

The code fence that creates an executable chunk needs at least three backticks and some JSON metadata on the same line.

``` something {"stage":"init"}
```

The only mandatory field an executable chunk must have is the stage name.

The other possible fields are:

"runtime":"bash"

Specifies that the chunk is written in bash. The chunk gets turned into a script and executed.

The script itself is prefixed with set -euo pipefail, meaning that it will abort at the first failing command, and suffixed with printenv to extract all the environment variables the user exported.
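
Conceptually, a bash chunk containing a single echo ends up being executed roughly like the script below (a simplified sketch of the wrapping described above, not the exact file the tool generates):

```bash
set -euo pipefail   # prefix: abort on the first failing command
echo "the chunk's own content goes here"
printenv            # suffix: dump the exported variables for subsequent chunks
```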

"runtime":"writer"

Declares that the chunk contains a file to write to disk. The metadata also needs to contain "destination":"some_place" to indicate where the file should be written.
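
A minimal sketch of a writer chunk (the stage name, destination and file content are only illustrative):

```yaml {"stage":"init", "runtime":"writer", "destination":"demo_config.yaml"}
# this content is written verbatim to demo_config.yaml
some_key: some_value
```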

"label":"some label"

You can give a pretty name to a chunk. This is especially useful for bash runtimes, where the command to execute is always ./script.sh: use this field if you want a distinguishable name in the logs.
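
For instance, a labelled bash chunk (stage and label here are illustrative) shows up in the logs under its label rather than as ./script.sh:

```bash {"stage":"init", "runtime":"bash", "label":"print the current date"}
date
```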

"rootdir":"$operator"

The commands in the chunk will be executed from the operator source code directory.
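
A sketch of a chunk running from the operator source tree (the stage name and label are illustrative):

```bash {"stage":"init", "runtime":"bash", "rootdir":"$operator", "label":"list the operator sources"}
ls
```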

"rootdir":"$tmpdir.x"

Creates a new temporary directory in which to execute the chunk. If another chunk reuses the same suffix (.x in the example), it shares the same directory. This is useful when you need to chain chunks or to reuse cached files between chunks.
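
A sketch of two chunks of the same stage sharing a temporary directory through the same suffix (.cache here, with illustrative stage and file names), so the second chunk can read the file written by the first:

```bash {"stage":"init", "runtime":"bash", "rootdir":"$tmpdir.cache"}
echo "some cached content" > cache.txt
```
```bash {"stage":"init", "runtime":"bash", "rootdir":"$tmpdir.cache"}
cat cache.txt
```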

"parallel":"true"

Runs the chunk in parallel with the other chunks in its stage. This works best if there is only one command in the chunk (or if the chunk is a bash chunk), and if all the chunks in the stage are made to run in parallel.

For now, the parallel behavior has a simplistic implementation: only the last command of the chunk is actually started asynchronously; the preceding ones are executed sequentially before it.

"breakpoint":"true"

Enters interactive mode when the chunk is started. Useful for debugging purposes; best used alongside --verbose.

"requires":"stageName/id"

Makes the execution of this chunk depend on the successful execution of the chunk referenced by its stage name and ID. To be used in teardown chunks.

"id":"someID"

Gives an ID to a chunk so that it can be referenced later on, for example by the requires field.

## Execution Environment of the chunks

Every chunk is started with the environment of the parent process that started it. If a chunk whose runtime is bash is executed, all the variables it adds to its own environment via export get added to the environment of subsequent chunks.

## Examples

### Creating and running our first tutorial

Let’s create a markdown file containing an executable chunk, and save it as simple_tutorial.md

# Demo 1

This is a source markdown file; it has a simple chunk that gets executed:

## the chunk
```bash {"stage":"inner_test1"}
echo this line will get printed in the output chunk below.
```
When this document is going to get executed, it'll have a new block above this
line containing the actual output of the executed chunk.

We will need to pass the folder containing this markdown file to the tutorial tester.

```shell
export TUTORIALS_FOLDER=$(pwd)
```

Now it's time to run the tutorial tester:

```shell
cd ${OPERATOR_ROOT}
go run test/utils/tutorials/tester.go \
   --update-tutorials \
   --quiet \
   --tutorials-root ${TUTORIALS_FOLDER}
```

Since we executed the tutorial tester with the --update-tutorials option, the tool inserts the output of the chunk (if any) into the source markdown file. Let's have a look at the updated markdown file:

```shell
cat simple_tutorial.md
```
# Demo 1

This is a source markdown file; it has a simple chunk that gets executed:

## the chunk
```bash {"stage":"inner_test1"}
echo this line will get printed in the output chunk below.
```
```shell tutorial_tester
this line will get printed in the output chunk below.
```
When this document is going to get executed, it'll have a new block above this
line containing the actual output of the executed chunk.

### Sharing variables between chunks

Any export in a bash environment is available for subsequent chunks.

```bash {"stage":"test2", "runtime":"bash"}
export SOME_VARIABLE=$(sleep .1s && echo "This is some content")
export TEST="some value"
```

The chunk below is able to perform a comparison with the value of SOME_VARIABLE:

```bash {"stage":"test2", "runtime":"bash"}
if [ "$SOME_VARIABLE" == "This is some content" ]; then
  echo "same string"
fi
```
```shell tutorial_tester
same string
```

### Executing in parallel

Parallelism can be important for some workflows, for instance when two processes need to talk to each other.

To demonstrate the capability, we execute two pairs of chunks. The first pair runs in parallel and writes concurrently to a file on disk. The second pair acts as the control group and performs the writes sequentially, one after the other.

In a parallel environment, we expect the contents of the two files to differ. Let's put that to the test!

The first pair is writing to output in parallel
```bash {"stage":"test3", "runtime":"bash", "parallel":true, "rootdir":"$tmpdir.2"}
for i in $(seq 1 10);
do
    echo FIRST$i >> output
    sleep .1
done
```
```bash {"stage":"test3", "runtime":"bash", "parallel":true, "rootdir":"$tmpdir.2"}
for i in $(seq 1 10);
do
    sleep .1
    echo SECOND$i >> output
done
```
The second pair is writing to output2 sequentially this time
```bash {"stage":"test4", "runtime":"bash", "rootdir":"$tmpdir.2"}
for i in $(seq 1 10);
do
    echo FIRST$i >> output2
done
```
```bash {"stage":"test4", "runtime":"bash", "rootdir":"$tmpdir.2"}
for i in $(seq 1 10);
do
    echo SECOND$i >> output2
done
```
Then let's compare the two files and assert that they are different
```{"stage":"test4", "rootdir":"$tmpdir.2", "runtime":"bash"}
DIFF=$(diff output output2 || echo "")
if [[ -z ${DIFF} ]]; then
    echo "the files are the same, that's a problem."
    exit 1
else
    echo "the two files are different, we're running in parallel"
fi
```
```shell tutorial_tester
the two files are different, we're running in parallel
```

### Teardown & dependencies

Teardown chunks are executed even if something went wrong in the middle of the execution.

Adding a dependency to a teardown chunk makes it execute only if the chunk it depends on executed correctly. This makes it possible to ensure that a teardown stage only tears down things that were actually built before.

# Demo teardown

## Two classical chunks

```bash {"stage":"inner_test1", "id":"someID", "label":"working command"}
echo this executes correctly
```

```bash {"stage":"inner_test1", "id":"someID2", "runtime":"bash", "label":"failing command"}
echo this has failed
exit 1
```

## Two teardown chunks

This one has an output, because its dependency did run correctly

```bash {"stage":"teardown", "requires":"inner_test1/someID"}
echo executed because inner_test1/someID got executed
```

This one won't

```bash {"stage":"teardown", "requires":"inner_test1/someID2"}
echo not executed because someID2 failed in inner_test1
```
```shell
export TUTORIALS_FOLDER=$(pwd)
```

Let’s run and see the result

```shell
cd ${OPERATOR_ROOT}
go run test/utils/tutorials/tester.go \
   --no-styling \
   --tutorials-root ${TUTORIALS_FOLDER} | grep "SUCCESS.*someID" || exit 0
```
```shell tutorial_tester
SUCCESS: echo executed because inner_test1/someID got executed
```

As we can see, only the first teardown command ran.