When you work as a Java programmer, people ask you a lot of weird questions. The most frequently asked are probably (no scientific proof):

  • Is Java better than C#?
  • Is Qt better for multi-platform front ends than Java?
  • Is Spring Framework better than Google Guice?
  • Is Hibernate JPA better than OpenJPA?
  • Is Java end of life?

During job interviews we tend to ask these kinds of questions now and then, just to see how creative people can be. (If you are reading this and you are coming for a job interview at JTeam B.V., let me know, then I have to come up with other questions.) Another strange question we ask some of these people is:

Do you understand threading in Java?

I am actually surprised how often people say that they know the basics, but not more than that. I have to admit I am far from an expert myself. I do have a lot of questions and therefore I decided to learn more about it. This blog post is about my journey through the threading model of Java 5+. I got most of my knowledge from the following book, from the websites that I mention at the end and of course from discussions with some of my colleagues.

Java Concurrency in Practice (ISBN 0321349606)

It was spring 2007, at the NLJUG event. I attended a presentation by Kirk Pepperdine about concurrency and performance. At a certain moment he showed a slide with the following code snippet:

public class Example {
	private int hitCounter;

	public void hit() {
		hitCounter++;
	}
}

When Kirk told the audience this was not thread safe, the audience started mumbling. The people who knew this already had probably read the book Java Concurrency in Practice. For me this was the start of looking into it. Too bad it took me almost a year before I actually bought the book. Now I understand that reading the current value, incrementing it by one and writing it back must be done by one thread without interference from another thread. This is not the default behavior of the ++ operator. Of course you all knew this :-).

The problem with threads starts when you have more than one of them and they all want access to the same state of an object. Some threads read and others write. There are several solutions to this problem: you can avoid sharing the state between threads, you can make the state immutable, or you can synchronize access to the shared state.
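The third option, synchronizing access, can be sketched for the hit counter as follows. This is only a minimal sketch: the names SynchronizedHitCounter, HitCounterDemo and runHits are made up for this illustration and do not come from the presentation or the book, and the lambda syntax assumes Java 8 or later.

```java
// Hypothetical names for illustration only.
class SynchronizedHitCounter {
    private int hitCounter;

    // Read, increment and write all happen while holding the lock on 'this',
    // so no other thread can interleave between them.
    public synchronized void hit() {
        hitCounter++;
    }

    // Reads take the same lock, so they also see the latest written value.
    public synchronized int getHits() {
        return hitCounter;
    }
}

class HitCounterDemo {
    // Starts 'threads' threads that each call hit() 'hitsPerThread' times
    // and returns the final count once all of them have finished.
    static int runHits(int threads, int hitsPerThread) {
        SynchronizedHitCounter counter = new SynchronizedHitCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < hitsPerThread; j++) {
                    counter.hit();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counter.getHits();
    }

    public static void main(String[] args) {
        System.out.println(runHits(4, 10_000)); // prints 40000
    }
}
```

Without the synchronized keyword on hit(), the same run may print a lower number, because two threads can read the same old value and both write back old value plus one.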

If you got here and think this is enough, because you only build simple applications that are exposed to a few users on the web, you are wrong. With a single servlet it is already easy to get into trouble. By default one servlet instance is accessed by multiple threads at the same time, so you should not use instance variables in a servlet unless access to them is synchronized. I have seen a lot of servlets where people forgot this, and the web application can then return strange results.

When Java 5 came around (provided you are not stuck on WebSphere 6.0 or lower), it brought solutions to simple threading problems. Looking back at the hit counter, there is a class called AtomicLong with a method that does what the ++ operator does, but in a thread safe manner: incrementAndGet(). Be careful when multiple state variables need to change together: make the whole update one atomic action by synchronizing the complete set of statements.
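As a rough sketch, the hit counter could look like this with AtomicLong. The class name AtomicHitCounter and the getHits() method are made up for this illustration; incrementAndGet() and get() are the real methods of java.util.concurrent.atomic.AtomicLong.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical class name for illustration only.
class AtomicHitCounter {
    private final AtomicLong hitCounter = new AtomicLong();

    // incrementAndGet() performs the read, the increment and the write
    // as one atomic action and returns the new value.
    public long hit() {
        return hitCounter.incrementAndGet();
    }

    // get() returns the most recently written value.
    public long getHits() {
        return hitCounter.get();
    }
}
```

No synchronized keyword is needed here, because there is only one state variable and a single atomic operation on it.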

To keep the state correct we need a second thing: locking. Locks stop other threads from accessing state while one thread is working with it. Java uses the synchronized keyword to guard a block of code, and this mechanism needs an object that serves as the lock. There are some shortcuts: a synchronized method locks on the object the method is invoked on, and a static synchronized method locks on the Class object. There are two important things about using locks to safeguard state. The first is that every access to the variable holding the state must use the same lock object. The second, which was again an eye opener to me, is reentrancy. Suppose a class with a synchronized method is subclassed, and the subclass overrides that method, also synchronized, and calls the superclass version. Both methods want the same lock, the one on the instance. If locks were held per invocation this would deadlock: the subclass method holds the lock and waits for the superclass method, which in turn waits for the lock the subclass method holds. Intrinsic locks in Java are reentrant: they are held per thread, not per invocation, so a thread that already holds a lock can acquire it again. Thanks to reentrancy the deadlock never happens.
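The reentrancy scenario can be sketched like this. The class names BaseTask and LoggingTask and their fields are made up for this illustration; Thread.holdsLock() is a real method of java.lang.Thread.

```java
// Hypothetical classes for illustration only.
class BaseTask {
    protected int steps;
    protected boolean lockHeldInSuper;

    public synchronized void doSomething() {
        // Record whether the calling thread holds the lock on this instance.
        lockHeldInSuper = Thread.holdsLock(this);
        steps++;
    }

    public synchronized int getSteps() {
        return steps;
    }
}

class LoggingTask extends BaseTask {
    @Override
    public synchronized void doSomething() {
        steps++;
        // The current thread already holds the lock on 'this'. Because the
        // lock is reentrant, the synchronized superclass method can acquire
        // it again instead of blocking forever.
        super.doSomething();
    }
}
```

Calling new LoggingTask().doSomething() returns normally and leaves steps at 2, which shows that the nested acquisition of the same lock succeeded.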

I already mentioned that Java 5 came with some additional classes to help you create thread safe classes; AtomicLong is one example. A related building block, available since the collections framework of Java 1.2, is Collections.synchronizedList(new ArrayList<E>()). The book has a nice example when discussing client-side locking. Client-side locking means that clients of your code can guard their own compound actions with the same lock your class uses internally. Some classes support this: the Vector class is one, the synchronized wrapper collections are another. Check the following code sample. It is not thread safe: the synchronized method locks on the ListHelper instance, not on the list, so putIfAbsent is not atomic with respect to other operations on the list. A solution is to move the synchronization into the method body and use the list as the lock object. This only works because the wrapper uses the list itself as its lock and documents that. If you want to learn more about this, buy the book.

@NotThreadSafe
public class ListHelper<E> {
	public List<E> list = Collections.synchronizedList(new ArrayList<E>());
	...
	public synchronized boolean putIfAbsent(E x) {
		boolean absent = !list.contains(x);
		if (absent) list.add(x);
		return absent;
	}
}
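A sketch of the fix described above, locking on the list itself instead of on the helper. The class name SafeListHelper is chosen for this illustration, and the elided part of the original class is simply left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class name for illustration only.
class SafeListHelper<E> {
    public final List<E> list = Collections.synchronizedList(new ArrayList<E>());

    public boolean putIfAbsent(E x) {
        // The synchronized wrapper uses the list object itself as its lock,
        // so holding that lock makes contains() plus add() one atomic action
        // with respect to every other operation on the list.
        synchronized (list) {
            boolean absent = !list.contains(x);
            if (absent) {
                list.add(x);
            }
            return absent;
        }
    }
}
```

Calling putIfAbsent twice with the same element returns true the first time and false the second time, and the list ends up with one element.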

Up till now we have been talking about creating thread safe classes. Java comes with building blocks that help you do this, and more of them were added in Java 5 and 6. Older Java versions already had thread safe collections (Vector and Hashtable). The collections framework adds wrappers that make any collection thread safe, Collections.synchronizedMap(..) for instance. These synchronized collections have performance problems, because every operation has to acquire the same lock, and most of the time fully serialized access is not really necessary. That is why the concurrent collections were created. They focus on allowing concurrent access, which means they are not appropriate if your application needs to lock the whole collection for exclusive access. I like methods such as putIfAbsent, which atomically checks whether an item is present in the collection and adds it if it is not.
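A small sketch of putIfAbsent on a concurrent collection. The SessionRegistry class and its method names are made up for this illustration; ConcurrentMap.putIfAbsent is the real method, and it returns null when no previous value was mapped to the key.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical class for illustration only.
class SessionRegistry {
    private final ConcurrentMap<String, String> ownerBySession =
            new ConcurrentHashMap<String, String>();

    // Registers the owner only if the session is not registered yet.
    // The check and the insert happen as one atomic action inside the map.
    public boolean register(String sessionId, String owner) {
        return ownerBySession.putIfAbsent(sessionId, owner) == null;
    }

    public String ownerOf(String sessionId) {
        return ownerBySession.get(sessionId);
    }
}
```

If two threads try to register the same session at the same time, exactly one of them gets true back, without any explicit locking in the calling code.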

To me this is all very interesting, I knew something about it, but apparently not enough. I will keep reading the book, and I will get back if I find something interesting.

The last thing I want to mention is the BlockingQueue, which gives you a nice way of creating a queue with a maximum number of items in it. What happens when the queue is full depends on the method you call: add() throws an exception, offer() returns false, and put() blocks until space becomes available. This sounds like a nice thing for the Circuit Breaker (see post of Allard – bring-some-stability-to-your-architecture). Feel free to comment on this post with ideas, corrections, links, whatever.
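A minimal sketch of a bounded queue hitting its limit, using ArrayBlockingQueue with a capacity of 2. The class name BoundedQueueDemo and the method fillBeyondCapacity are made up for this illustration; add() and offer() are the real BlockingQueue methods.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical class for illustration only.
class BoundedQueueDemo {
    // Fills a queue of capacity 2 and then tries to add two more items.
    static String fillBeyondCapacity() {
        BlockingQueue<String> queue = new ArrayBlockingQueue<String>(2);
        queue.add("first");
        queue.add("second");

        // offer() does not throw on a full queue, it just reports failure.
        boolean accepted = queue.offer("third");

        try {
            // add() throws IllegalStateException when the queue is full.
            queue.add("fourth");
            return "offer=" + accepted + ", add=accepted";
        } catch (IllegalStateException e) {
            return "offer=" + accepted + ", add=rejected";
        }
    }

    public static void main(String[] args) {
        System.out.println(fillBeyondCapacity()); // prints offer=false, add=rejected
    }
}
```

For a circuit breaker style of protection, offer() is probably the more convenient choice, since the caller can react to a full queue without exception handling.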

Thanks, see you next time. There are two stories coming up: one about Stripes and the way we used it in a project, and another about some tricks for Flex and Java that I found out while creating my Flex application.

3 thoughts on “Threadsafe applications? (part I)”

  • November 5, 2008 at 11:37 pm

    Next to the lost update problem in your counter example (which can be prevented by locking) there is another, more subtle problem: visibility. A recently written value might not become visible to another thread (perhaps it is stuck in a cache). This can also be prevented by locking or by using the AtomicInteger class, since both guarantee a happens-before relationship between the write and the read (making sure that a reading thread sees the most recently published value).

    If the type of the hitCounter is changed to long, and no additional synchronisation is added, values could even appear out of thin air.

  • November 5, 2008 at 10:01 pm

    I wonder why people started grumbling? I certainly made a few comments that should have caused a few eyebrows to raise ;-)

    And Ben, you are correct, that code is not necessarily thread safe either. This is because the tools we have and the intent with which we use them are quite orthogonal. With synchronized we are protecting critical sections of code. What we really want to protect is the memory location pointed to by hitCounter. These are two different things.

  • November 5, 2008 at 2:07 pm

    Jettro,

    You are correct, multithreading (multiprogramming in general) has always been challenging for a lot of people, ever since the introduction of SIMD and MI*D systems. This is not very strange, since the number of possible executions of any program goes up exponentially every time you add a thread of execution. And of course Java adds some challenges of its own due to its language constructs (such as thread-safe access to an object, but unsafe access to that object's state). And here's another one for you:

    public class Example00 {
       private static int hitCounter;

       public void hit() {
          synchronized(this) {
             hitCounter++;
          }
       }
    }

    This is not necessarily threadsafe.

    If you want to explore a little, take a look at the java.util.concurrent package and subpackages. There are a lot of high-level synchronization mechanisms in those packages, including semaphores.

