Blog - Artur Tagisow
19 Nov 2022

Listening for daemon readiness in bash

Using bash, scan output of N daemons for N "I'm ready!" messages to know all of them have finished bootstrapping. Now that you know they're ready - do something with them.

1. Context

Recently at work I had to run our solution, then run performance tests on it.. I wanted to do all of that in an automated way. This post focuses on the "start our solution" part.

It's a distributed solution, where a parent-aggregator service talks to N child services.

The problem is, the moment the parent-aggregator service starts, it sends one HTTP request to each of the N child services. If one child service hadn't bootstrapped yet, the parent-aggregator service will crash.

The N child and parent-aggregator services are daemons. The child write "I'm ready!" to standard output when they've finished bootstrapping.

2. Naive attempts at a solution

It's clear that we need to ensure all the child services had already bootstrapped before we start the parent-aggregator service.

Let's go through the stupid things first.

2.1. Stupid I

start-child-service # 1
start-child-service # 2

start-parent-service

The script will start the first child-service, keep it in the foreground and never proceed to the rest of the services.

This won't work.

2.2. Stupid II

OK, so let's launch them in parallel and in the background:

start-child-service &
start-child-service &

start-parent-service &

The parent and child service jobs are started all at the same time. The parent service will query the child services, but the child services haven't finished bootstrapping yet, which will cause the parent service to crash.

2.3. Stupid III

Ok, so let's start the child services, wait for a "good enough" amount of time, then start the parent service.

start-child-service &
start-child-service &

sleep 10

start-parent-service &

This will work, is "good enough" and actually allowed me to run a few quick performance tests.

2.4. Stupid III and a half

The previous solutions don't kill the jobs they start. If I ran the script once, then again a second time, I'd get a "port already in use" error. Since the underlying services use Node - a JavaScript runtime - under the hood, I just killed all node processes after I was done. This unfortunately also killed all the language servers I had running in Emacs which also used Node, so there was room for improvement.

start-child-service &
start-child-service &

sleep 10

start-parent-service

pkill -f node

3. Actual solution

Despite the last solution somewhat working, after hours I kept wondering how to make sure the 2 child services are actually running - not just "fingers crossed they've bootstrapped after 10 seconds".

And I'd also like to actually kill the job processes after I'm done.

Here's my solution. Parent service is not started for simplicity

#!/bin/bash
# Need the shebang so that bash arrays work, sh doesn't have arrays

# Contains path to temporary file to which child services will write to, see `man mktemp`
OUTPUTDUMP=$(mktemp)
processIds=()

# Imagine this is a daemon that writes "ready" to standard output during its lifetime
# $1 = first argument of bash function - amount of time after the mock daemon outputs "ready"
function startChildService () {
  (sleep $1 && echo ready && sleep infinity) >> $OUTPUTDUMP &

  # $! = process id of the most recent background command (&), so in this case - the line above
  # Save the process id of the child service so we can clean it up when we're done with our test
  processIds+=($!)
}

startChildService 5
startChildService 3
startChildService 7 
startChildService 10

# Watch the enteriety of the output dump file until 4 matches are found
tail --follow $OUTPUTDUMP | grep --max-count=4 ready

echo "found!"

# This expands to e.g "kill 1234 531 3124"
kill ${processIds[*]}
rm $OUTPUTDUMP