Tuesday, March 22, 2011

Problem with Peterson's mutual exclusion algorithm on multi-core

This is my Java program for mutual exclusion using Peterson's algorithm. This algorithm can works effectively on single-core machines but it turns wrong when running on multi-core systems. The reason is that multi-core processors re-order instructions for improving performance. In this example, mutual exclusion might be violated because the reordering between 2 instructions "in[id] = true;" and "turn = other();".

/* Prove: Peterson algorithm for mutual exclusion does not work for multi-processor
Precondition: 2 threads updating a shared variable "sum" in the critical region, sum = 0
Postcondition: sum = 2
However, sum can be 0 or 1 or 2.
*/

public class Peterson implements Runnable {

private static boolean[] in = { false, false };
private static volatile int turn = -1;
private static volatile int sum=0;
public static void main(String[] args) {
new Thread(new Peterson(0), "Thread - 0").start();
new Thread(new Peterson(1), "Thread - 1").start();
System.out.println(sum); //expected 2
}

private final int id;

public Peterson(int i) {
id = i;
}

private int other() {
return id == 0 ? 1 : 0;
}

@Override
public void run() {
for (int i=0;i < 1; i++)
{
in[id] = true;
turn = other();
while (in[other()] && turn == other()) {
//System.out.println("[" + id + "] - Waiting...");
}

sum++;

in[id] = false;
}
}
}

Below is my experimental result running on a 48-core machine.

Number of runs: 1 000 000
sum=0: 116556
sum=1: 882538
sum=2: 906

As we can see, our expected result (sum=2) only only holds in 906 runs (~0.1%).

This problem is known widely in the multi-core community. Mine is just a toy experiment to demonstrate the danger of applying sequential algorithms to solve parallel problems.

Labels: , , ,

Saturday, March 12, 2011

Traverse directories and files recursively

Today, I decided to blog about this because it was a painful experience working with this. Below is my completed script.

************************************
************************************
#!/bin/bash
#set -x

traverse()
{
ls -A "$3" | while read i
# for i in "$3"/*;
do
# echo "$3/$i"
if [ -f "$3/$i" ]; then
# echo "Traversing File: $3/$i"
bash -c ""$2" \""$3"/"$i"\""
# echo " "
else
# echo "Traversing Directory: $3/$i"
bash -c ""$1" \""$3"/"$i"\""
#traverse recursively if not empty
if [ "$(ls -A $3/$i)" ]; then
traverse "$1" "$2" "$3/$i"
fi
fi
done
}

SAVEIFS=$IFS
IFS=$'\n'

if [ -z "$1" ]; then
"traverse [map to dirs] [map to files] [directory or file]"
else
if [ -z "$2" ]; then
"traverse [map to dirs] [map to files] [directory or file]"
else
if [ -z "$3" ]; then
"traverse [map to dirs] [map to files] [directory or file]"
else
# bash -c ""$1" "$3""
traverse "$1" "$2" "$3"
fi
fi
fi
IFS=$SAVEIFS
************************************
************************************
Here is the simple manual
"traverse [map to dirs] [map to files] [directory or file]"

My script will traverse directory tree (root is the third argument), apply a function (first argument) to directories and another function (second argument) to files.

There are so many things needing special care

1) Choosing what kind of script
I initially used "#!/bin/sh"; however, I got many unexpected errors and I was wondering what they were. That changing to "#!/bin/bash" can so the question is unclear to me, but it works out.

As a consequence, this is an example how to run it

bash traverse.sh "ls" "ls" ~

2) How to traverse a directory.
In my script, I have 2 options:
ls -A "$3" | while read i
or
for i in "$3"/*;

The second option can not work because it can not recognize hidden directories and file.

Then, I need to differentiate between files and folders using
if [ -f "$3/$i" ]; then
#FILE
else
# DIRECTORY
if [ "$(ls -A $3/$i)" ]; then
#IF DIRECTORY is not EMPTY
traverse "$1" "$2" "$3/$i"
fi

A list of options for checking directories and files can be found at http://www.comp.leeds.ac.uk/hannah/linux/shell.html

3) How to traverse a directory correctly
It depends on what kinds of files and folder I am considering. In my case, I want all files including the hidden directory. This can be done with "ls -a" option. However, this will also list current and parent directories which will cause a infinite traversal. Therefore, "ls -A" will list hidden items but not the current and parent dirs. Additionally, I have to check if the child directory is empty or not before deciding to traverse into it.

4) Dealing with white spaces in filenames
This is the most trickiest things during my scripting. After searching painfully, I found the solution.

First, I have to add this

SAVEIFS=$IFS
IFS=....
MY SCRIPT
....
IFS=$SAVEIFS

This will make "ls -A "$3" | while read i" work correctly. For each while iteration, i is the full name of a file or a directory including spaces. Without modifying $IFS, filenames would not be read correctly.

Secondly, to represent a file with spaces correctly on a command, we should but it in a single quote or double quote. That's why we have a quite complex script at
bash -c ""$2" \""$3"/"$i"\""

5) Single quotes vs double quotes
There are plentiful articles discussing about this. This is what I wanna stress: "variable names are expanded within double quotes, but not single quotes." More details at http://www.panix.com/~elflord/unix/bash-tute.html

6) Dealing with the unknown

Why are they unknown? Because I don't know why my solutions solved the problems.

First, adding "#!/bin/bash" as explained above.

Second, adding "bash -c" before my command.
bash -c ""$2" \""$3"/"$i"\""
Removing "bash -c" will generate some unknown errors.

Third, sometimes we should pay attention to use the full path for certain commands such /usr/bin/time ...

Hope this is helpful to avoid painful future for those needing this script.