Mutex In Bash

Please keep in mind that this is just a note/draft.

All sources on this page are licensed under GNU GPL or BSD, as you like, without any warranty. You are free to reuse and modify them.

Hello,
inspired by the idea of using "mkdir" as an atomic operation, which is what a mutex needs, I tried it on my own.
I took inspiration from "Lock your script (against parallel run)" on the Bash Hackers Wiki.
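
To illustrate why "mkdir" fits here, a minimal sketch: the call either creates the directory and succeeds, or fails because the directory already exists, so out of two attempts only one can win. The directory names below are made up for the demo.

```bash
#!/bin/bash
# Hypothetical scratch area for this demo only
demo_root="$(mktemp -d)"
lock_dir="${demo_root}/lock"

# First attempt: the directory does not exist yet, so mkdir succeeds
if mkdir "${lock_dir}" 2>/dev/null; then first="acquired"; else first="busy"; fi

# Second attempt: the directory already exists, so mkdir fails
if mkdir "${lock_dir}" 2>/dev/null; then second="acquired"; else second="busy"; fi

echo "first: ${first}, second: ${second}"

# Clean up the scratch area
rm -r "${demo_root}"
```

The test and the creation happen in one step, which is what a separate "check if it exists, then create it" sequence cannot guarantee.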

An earlier version of their script had a small bug, which I analyse below, because it is interesting how a small oversight can lead to a race condition.
Their current version is already fixed (they accepted my patch).

Their code was generally OK except for ONE small bug: it did not check the exit status of "cat" (with "$?").
As a result, a race condition could occur when unlocking.

I ran some experiments. Let's take a look at what happens:

  • We have three processes: P1, P2, P3.
  • P2 and P3 periodically check whether they can take the lock.
  • P1 tries to release the lock with "rm -rf LOCKDIR".
  • "rm -rf LOCKDIR" works recursively, so it is not atomic.
  • In the first step, "rm -rf LOCKDIR" deletes the pid file: LOCKDIR/pid.
  • At this point P2 runs OTHERPID="$(cat "${PIDFILE}")". Because the file does NOT exist, the variable OTHERPID is empty.
  • Next, P2 checks whether process "$OTHERPID" exists. Because the variable is empty, the check fails and P2 concludes that the lock is stale.
  • P2 starts the procedure of taking the lock.
  • Now "rm -rf LOCKDIR" deletes the directory.
  • P3 sees that LOCKDIR does not exist, so it starts the procedure of taking the lock too.
  • Unfortunately, both of them think they have reached the critical section and both run, which is exactly what we don't want.
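
The step where P2 goes wrong can be reproduced in isolation. This is only a sketch: the scratch directory and the variable names cat_status and verdict are made up for the demo, and the pid file is deliberately never created, to imitate the moment after "rm -rf" has removed it.

```bash
#!/bin/bash
# Hypothetical scratch directory standing in for LOCKDIR
demo_dir="$(mktemp -d)"
pid_file="${demo_dir}/pid"   # never created, as if rm had already deleted it

# Unpatched logic: read the PID and ignore whether cat succeeded
other_pid="$(cat "${pid_file}" 2>/dev/null)"
cat_status=$?                # non-zero, because the file is missing

# With an empty PID, kill -0 fails, so the lock looks stale
if ! kill -0 "${other_pid}" 2>/dev/null; then
    verdict="stale"          # the wrong conclusion P2 reaches
else
    verdict="alive"
fi

echo "cat status: ${cat_status}, other_pid: '${other_pid}', verdict: ${verdict}"
rmdir "${demo_dir}"
```

The non-zero cat_status is the only signal that tells "the owner is dead" apart from "the lock is being removed right now", which is why it must not be ignored.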

Here is my proposed patch:
simply check the exit status of "cat".

OTHERPID="$(cat "${PIDFILE}")"
if [ $? != 0 ]; then exit ${ENO_LOCKFAIL}; fi

Thanks to that, everything works fine ;).

OK. Now my version of the script, which can be called by another script that supplies its own PID as a parameter:

#!/bin/bash
# Grzegorz Wierzowiecki
# "mutex_dir.sh"
# GNU GPL or BSD - as you like
# Script inspired by:
# http://www.bash-hackers.org/wiki/doku.php/howto/mutex

function print_help(){
cat << HELP_INFO
Usage:
$0 lock lock-dir calling_app_pid
$0 unlock lock-dir
Return:
0 - on success
1 - on general fail
2 - when locking failed
3 - when received signal
HELP_INFO
exit
}

# lock dirs/files
lock_dir="$2"
app_pid="$3"
pid_file="${lock_dir}/pid"

# exit codes and text for them - additional features nobody needs :-)
ENO_SUCCESS=0; ETXT[0]="ENO_SUCCESS"
ENO_GENERAL=1; ETXT[1]="ENO_GENERAL"
ENO_LOCKFAIL=2; ETXT[2]="ENO_LOCKFAIL"
ENO_RECVSIG=3; ETXT[3]="ENO_RECVSIG"

# start un/locking attempt
#trap 'ECODE=$?; echo "[statsgen] Exit: ${ETXT[ECODE]}($ECODE)" >&2' 0

function lock(){
    if ! kill -0 "$app_pid" &>/dev/null; then
        echo "Calling app pid (=$app_pid) is not responding."
        return 1
    fi
    local lock_dir="$1"
    if mkdir "${lock_dir}" &>/dev/null; then
        # lock succeeded, store the PID
        echo "$app_pid" >"${pid_file}"
        return ${ENO_SUCCESS}
    else
        # lock failed, now check if the other PID is alive
        other_pid="$(cat "${pid_file}" 2>/dev/null)"
        if [ $? != 0 ] || [ -z "$other_pid" ]; then
            # Pid file does not exist (the directory is probably being deleted)
            # or is still empty (the owner has not written its PID yet)
            return ${ENO_LOCKFAIL}
        fi
        if ! kill -0 "$other_pid" &>/dev/null; then
            # lock is stale, remove it and restart
            unlock "$lock_dir"
            lock "$lock_dir"
            return $?
        else
            # lock is valid and other_pid is active - exit, we're locked!
            #echo "lock failed, PID ${other_pid} is active" >&2
            return ${ENO_LOCKFAIL}
        fi
    fi
    return 0
}

function unlock(){
    rm -r "$1" &>/dev/null
    return $?
}

case "$1" in
    lock)    lock "$lock_dir" ; exit $?;;
    unlock)    unlock "$lock_dir"; exit $? ;;
    *) print_help ;;
esac

How to use it?
If we want to wait until the lock is acquired, we can use:

while ! mutex_dir.sh lock /tmp/xxx $$; do echo -e -n 'Waiting for mutex\r'; sleep 0.2 ; done ;
echo ; echo 'I have it !!!'

It is important to sleep between tries. If it's not urgent, try 1 second. The example above checks five times per second.
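
If waiting forever is not acceptable, the same loop can be bounded and the lock released automatically on exit. This is a sketch under assumptions, not part of the script above: try_lock is a simplified stand-in for "mutex_dir.sh lock" (it skips the stale-lock check), and lock_with_timeout and the lock directory name are made up for the demo.

```bash
#!/bin/bash
# Hypothetical lock directory for this demo
lock_dir="${TMPDIR:-/tmp}/mutex_demo.$$"

# Simplified stand-in for "mutex_dir.sh lock": mkdir fails if the
# directory already exists, so only one caller can succeed.
try_lock() {
    mkdir "$1" 2>/dev/null && echo "$$" >"$1/pid"
}

# Retry try_lock up to max_tries times, sleeping between attempts.
lock_with_timeout() {
    local dir="$1" max_tries="$2" interval="$3" i=0
    while [ "$i" -lt "$max_tries" ]; do
        try_lock "$dir" && return 0
        sleep "$interval"
        i=$((i + 1))
    done
    return 2   # same meaning as ENO_LOCKFAIL in the script above
}

if lock_with_timeout "$lock_dir" 5 0.2; then
    # Release the lock however the script ends
    trap 'rm -r "$lock_dir"' EXIT
    echo 'I have it !!!'
    # ... critical section goes here ...
else
    echo 'Gave up waiting for mutex' >&2
fi
```

With 5 tries and a 0.2 second sleep, the caller gives up after roughly one second. The trap is only installed after the lock is taken, so a script that never got the lock does not delete somebody else's directory.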
Unless otherwise stated, the content of this page is licensed under the Creative Commons Attribution-ShareAlike 3.0 License.