Threading - Avoiding Thread Conflicts

  • Thread starter twoski
  • Start date
  • Tags
    Thread
In summary, the conversation discusses the assignment of writing a program that uses threads to execute a queue of tasks. The goal is to simulate a thread pool without using Java's built-in utilities for threads. The conversation covers various attempts at implementing a thread-safe list data structure and optimizing the code for faster execution. The teacher mentions the difficulty of achieving linear speed increases with concurrent applications, but the final 10% of the grade involves improving the performance of the application.
  • #1
twoski
181
2

Homework Statement



I have to write a simple program that uses threads to execute a queue of arbitrary tasks.

The more threads i make, the better it should perform (theoretically).

I have to simulate a thread pool without using any of Java's built in utilities for threads.

The Attempt at a Solution



My current attempt seems to work if i use only 3 threads. If i increase the number of threads, i end up getting exceptions caused by threads trying to access an empty queue.

How can i make my threads safely access the queue? I assume i need to do something with locks or somesuch.

I am also not sure what implementing a thread pool means. Is it sufficient to just store all my threads in some sort of list then call join() on all of them?

Here is relevant code: http://pastebin.com/Cbj64t4k
 
Last edited by a moderator:
Physics news on Phys.org
  • #2
  • #3
For this assignment we are not allowed to use anything from java.util.concurrent.

With that in mind, how could i implement a thread safe list type data structure? I assume i would call wait() before doing any operations on the list so that a thread can only do something if the list has an item in it. Would i just call notifyAll on the list if it holds 1 or more items after i remove one?

alternatively: http://docs.oracle.com/javase/tutorial/essential/concurrency/syncmeth.html

I could just make the actual method that checks the list and then removes an item synchronized so that only one thread at a time may access the list. Is that preferable?
 
Last edited:
  • #4
I think i may have figured something out here.

Code:
        private synchronized Task GetTask()
	{
		if (!tasks.isEmpty())
		{
			Task t = (Task) tasks.remove();
			return t;
		}
		
		return null;
	}
	
	public void run()
	{
		Task t;
		
		while ((t = GetTask()) != null)
		{
			t.Perform();
		}
		
		System.out.println("Thread " + this.id + " terminated safely!");
	}

In theory this should work but i still get exceptions caused by accessing an empty list. Strangely, the exceptions do not always occur. If i have 100 tasks and 5 threads, it works fine, but if i go with 500 tasks and 10 threads i get exceptions.
 
Last edited:
  • #5
What makes you think that different threads executing their methods GetTask will synchronize with each other when accessing the shared tasks object?
 
  • #6
I assumed that any method that is synchronized would synchronize regardless of being in a thread or not.

If i made a method external to the thread class which was synchronized and called this method from the thread class, would that work?
 
  • #7
What does the keyword "synchronized", when used in the definition of a method, mean? Can this keyword be used differently, i.e., not in the definition of a method?
 
  • #8
Ah, i created my own TaskList class and put the synchronized function within that class and now my program works.

Part of my assignment requires me to optimize the code such that a greater number of threads will result in faster code execution. Right now if i run the program with large data sets and only one thread, it takes 8ms. If i use 5 threads for the same data sets, i get a time of 9ms.

According to the assignment:

The actual difference will vary considerably due to (1) machine architecture (2) the specific
algorithm(s) used within the code.

With that in mind, how am i supposed to optimize this?
 
  • #9
That requirement, as stated, is impossible to comply with. If the number of threads is much greater than the parallelism the underlying hardware can support, adding more threads will make the system slower, not faster.
 
  • #10
Wouldn't it also depend on the computer running the code and therefore be very subjective when it comes to optimization and speed gains?

I have some more snippets from the assignment handout. The teacher seems to think we can achieve close to linear speed increases, though i have no idea how. My code is already pretty barebones. I am using a Stack for my task list, but i don't think Stack.Pop is inefficient by any stretch.

The primary objective of the assignment is to write thread based code that will run
correctly. By “correct”, we mean that (1) multiple threads are created and they concurrently process
the task list, (2) you produce the proper results (e.g., with the FIND and ADD operations), and (3) the
application does not crash. The grader will evaluate this using a series of tests cases (i.e., various test
parameters. You do not get a copy of the test parms beforehand ).
One thing that you will probably realize, however, is that concurrent doesn’t always mean parallel. In
other words, if your computer has more than one core (as almost all do nowadays), you will see that
running the code with 2 or 4 threads may not make the code any faster than when it is run with one
thread. That can be a little discouraging after all of this work. In fact, this is quite common with parallel
or multi-threaded applications. This is why they are quite difficult to write (well) in practice.
So the last 10% of your grade for this assignment will involve improving the performance of the
application. Ideally, we want to get something closer to linear speedup (e.g., 2 X faster if you have 2
cores). Note that you may not see truly linear speedup since there is typically overhead associated with
thread creation and scheduling. In any case, to improve performance, you can “tweak” the code any way
that you like (algorithms, data structures, etc.), as long as the command line parameters do not change
and the final output is the same

edit:

With very large data sets, my program takes 75ms using 1 thread and 38ms with 2 threads. So i guess it depends on the size of the data set being used.
 
Last edited:
  • #11
That sounds more reasonable as a requirement.

Without knowing what that "Perform" thing does, however, I do not think I can say anything specific. Non-specifically, I can say that it should minimize or ideally avoid interaction with other threads. Interaction means any sort of thing when it has to wait for some other threads to do something, which includes access to "synchronized" resources.
 
  • #12
Well in order for my program to be thread safe wouldn't i have to implement blocking in some way to protect the shared resource (which is the task list in this case)?

Perform is a method which just checks if a static TreeSet contains an arbitrary value... I assume it was just meant to be some arbitrary operation which would take time to complete so we could see meaningful speed differences with larger data sets.

The only blocking taking place is when a thread accesses the stack of tasks in order to pop() and retrieve a task to execute. This would require blocking in order to avoid having threads call pop() on an empty stack.

So in my TaskList class i have:

Code:
public synchronized Task GetTask()
{
	if (!tasks.empty())
	{
		return tasks.pop();
	}
		
	return null;
}
 
  • #13
If the only thread interaction is in the GetTask() method, then your code is close to optimal in that case. I would not attempt any further optimization unless a very specific optimization target was set.
 
  • #14
I have not read every detail you have provided in this thread, but it seems that your code is not waiting (i.e. invoking one of the wait methods) in order to retrieve a new task. If so, that means that all worker threads will continuously poll (also known as busy-wait resulting in 100% CPU usage) until the queue releases a task for them and you should probably consider if this is acceptable or not in the solution for your assignment. In real life, this would of course not be acceptable and one would normally use a wait/notify exchange to make sure that threads only wake up when a task enters the queue. Note however, that it is easy to code an incorrect wait/notify exchange so you should make sure you understand the mechanism first or at least read some guides on how to do it before you start coding it.
 
  • #15
I've looked at a handful of tutorials on wait/notify and they're all pretty confusing. This code seems to get stuck after the first thread is made.

Code:
       public void run()
	{
		boolean processing = true;
		
		while (processing)
		{
			synchronized (tasks)
			{
				try 
				{
					tasks.wait();
				} 
				catch (InterruptedException e) 
				{
					e.printStackTrace();
				}
			}
			
			Task t = tasks.GetTask();
			
			if (t != null)
			{
				t.Perform(id, debug);
			}
			else
			{
				processing = false;
			}
			
			/*while ((t = tasks.GetTask()) != null)
			{
				t.Perform(id, debug);
			}*/
			
			tasks.notifyAll();
		}
		
		System.out.println("Thread " + this.id + " terminated.");
	}

I don't quite understand the program flow. Every example seems to call wait() before anything else, but that means that every thread defaults to the waiting state when the program starts, right?

My understanding is that the code flow should go something like

Wait ---> Do Action Requiring Lock ---> Call NotifyAll

Should i be doing this inside a while loop?
 
  • #16
Yes, that code has to get stuck, because the thread first waits, then notifies. It is going to wait forever before it can notify (and unblock other threads).

More fundamentally, what are trying to achieve here? Yesterday I had an impression that the task list was populated before the threads were launched, and no new tasks were ever added afterwards. In which case, there is no reason to wait. What are going to wait for now?
 
  • #17
The task list is populated prior to creating the threads. Does this mean that waiting would be unnecessary?

I don't want to lose marks on this assignment and my code seems to run correctly... There is no mention of wait/notify in the assignment handout and my code is thread safe, but the real question i suppose is should i really be using wait/notify for this?
 
  • #18
Unless you can clearly state a reason why you should add waiting to your code, stay away from that. I do not think there is anything in the art of computer programming that is worse than adding code that does things for no apparent reason. It is called "cargo cult programming" (look this up).
 
  • #19
voko said:
Yesterday I had an impression that the task list was populated before the threads were launched, and no new tasks were ever added afterwards. In which case, there is no reason to wait.

Reading this thread more carefully (ok, pun intended for once) I agree that there is nothing in the material that twoski has presented that should indicate that this assignment is about constructing a general thread-safe task queue as I was rambling about in my previous post.

I apologize if my previous comment erroneously prompted twoski to get sidetracked into consider adding an unnecessary wait/notify coordination.
 

1. What is threading and why is it important in avoiding thread conflicts?

Threading is the process of executing multiple tasks simultaneously within a single application. It is important in avoiding thread conflicts because it allows multiple threads to access shared resources without interrupting each other's execution.

2. How can thread conflicts occur and what are the consequences?

Thread conflicts can occur when multiple threads try to access and modify the same shared resource at the same time. This can lead to unexpected and erroneous behavior, such as data corruption or deadlock.

3. What are some techniques for avoiding thread conflicts?

One technique is to use synchronization mechanisms, such as locks or semaphores, to control access to shared resources. Another technique is to use immutable objects, where the state cannot be modified, to prevent conflicts.

4. How can the use of threading impact performance?

The use of threading can improve performance by allowing multiple tasks to be executed simultaneously. However, if not implemented properly, it can also lead to performance issues, such as thread contention and overhead from synchronization mechanisms.

5. What are some best practices for managing threading and avoiding thread conflicts?

Some best practices include minimizing the use of shared resources, using thread-safe data structures and libraries, and carefully designing and testing the threading implementation. It is also important to monitor and analyze the performance and behavior of the application to identify and address any potential issues.

Similar threads

  • Engineering and Comp Sci Homework Help
Replies
1
Views
1K
  • Engineering and Comp Sci Homework Help
Replies
2
Views
1K
  • Programming and Computer Science
Replies
1
Views
1K
  • Programming and Computer Science
Replies
2
Views
621
  • Engineering and Comp Sci Homework Help
Replies
4
Views
1K
  • Engineering and Comp Sci Homework Help
Replies
2
Views
3K
Replies
21
Views
3K
  • Feedback and Announcements
Replies
21
Views
2K
  • Engineering and Comp Sci Homework Help
Replies
6
Views
3K
  • Introductory Physics Homework Help
Replies
3
Views
649
Back
Top