mwalts

Ok, I have a service that is running on a remote machine. This service allows the client to access two remote objects, which are singletons.

One just provides access to a database and seems to work fine.

The other object implements a producer-consumer model using a queue, and seems to be where the problem lies. When the producer adds an element to the queue, it signals the consumer (each time). The consumer takes some time to run. Normally, this works fine. However, once every 10 or so times, it just doesn't work. Here's the producer section (note: qCurrent is the queue, queueRunner is the consumer thread):
Code Snippet


lock(qCurrent.SyncRoot)
{
    qCurrent.Enqueue(benchMe);

    lastItemAdded = benchMe;

    //Tell the sleeping thread that it has at least one benchmark to run
    Monitor.Pulse(qCurrent.SyncRoot);
    messages.WriteLine("We think we have pulsed the thread");
    if(!queueRunner.IsAlive)
    {
        messages.WriteLine("Thread is dead for some reason");
        messages.WriteLine(queueRunner.ThreadState.ToString());
    }
    else
    {
        messages.WriteLine("Thread is alive, why doesn't it work ");
    }
}


Now the consumer thread:

Code Snippet

//Runs forever
while(true)
{
    //Make sure we have uninterrupted access to the queue
    lock(qCurrent.SyncRoot)
    {
        //if the queue is empty,
        if(qCurrent.Count==0)
        {
            //wait until the queue is no longer empty
            messages.WriteLine("We are now waiting for an item on the queue");
            Monitor.Wait(qCurrent.SyncRoot);
            messages.WriteLine("We have the lock again");
        }
    }

    //Can't use the lock block because of scoping issues
    //at least without being silly
    Monitor.Enter(qCurrent.SyncRoot);
    while(qCurrent.Count!=0)
    {
        //Get the lock for the currently running item and change it
        Monitor.Enter(currentlyRunningLock);
        currentlyRunning = (CBenchItem)qCurrent.Dequeue();
        Monitor.Exit(currentlyRunningLock);

        //Make sure we are no longer blocking the queue
        Monitor.Exit(qCurrent.SyncRoot);
        //Run the current element
        runCurrent(currentlyRunning);

        Monitor.Enter(qCurrent.SyncRoot);
    }
    Monitor.Exit(qCurrent.SyncRoot);
}


Now I expect to see something like this:

We are now waiting for an item on the queue
We think we have pulsed the thread
Thread is alive, why doesn't it work
We have the lock again
We are now waiting for an item on the queue
.
.
.

which I normally do, but sometimes I get this:

We are now waiting for an item on the queue
We think we have pulsed the thread
Thread is alive, why doesn't it work
We think we have pulsed the thread
Thread is alive, why doesn't it work
.
.
.


Notice that the consumer thread has not woken up!
When I inspect the queue itself, it does contain the elements that were added to it.

I should also mention that, though the producer is normally called remotely, it can also be called by a FileSystemWatcher that is local to the producer/consumer object.

Even when both are mixed, however (regardless of which adds the first element), it normally works. I tried using PulseAll instead of Pulse too, but no luck.

Yes, these are the only threads running (at least unless remoting has created its own somewhere), and when multiple locks are required the threads always acquire them in the same order to avoid deadlock (which this doesn't seem to be anyway, as one of the threads is still responsive).

I have the feeling this might be some sort of behavior of remoting I'm not familiar with, but I just don't know.

This is starting to drive me nuts, so any help would be much appreciated, thank you in advance,

-mwalts


Re: Visual C# General Remoting, and Threads, and Services Oh My!:Possible synchronization issue?

mwalts

Oh, and yeah, I have overridden InitializeLifetimeService() to return null.

Thanks,

-mwalts




Re: Visual C# General Remoting, and Threads, and Services Oh My!:Possible synchronization issue?

Steve Py

This looks like a race condition, and you are doing an awful lot of locking and unlocking. :) My guess is that the trouble is caused by the outer lock on the queue, where you check the count and wait if it is zero:

//Make sure we have uninterrupted access to the queue
lock(qCurrent.SyncRoot)
{
    //if the queue is empty,
    if(qCurrent.Count==0)
    {
        //wait until the queue is no longer empty
        messages.WriteLine("We are now waiting for an item on the queue");
        Monitor.Wait(qCurrent.SyncRoot);
        messages.WriteLine("We have the lock again");
    }
}


Try removing that lock. If this worker thread is the only one removing items from the queue, then you shouldn't need to worry about locking it to obtain a count, and I'm a bit suspicious of how it behaves when it enters a wait state inside the lock, and of whether it can be awakened from that wait.

Also, to more easily detect race conditions with thread locking, take a look at the TimedLock class by IanG (http://www.interact-sw.co.uk/iangblog/2004/03/23/locking) and search around since there are a few clever revisions of it available for better stack tracing support. With this class you can lock via using() blocks which are cleanly destructed and they support timeout exceptions.
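
For anyone who just wants the general idea without downloading anything, here is a minimal sketch of a lock with a timeout that can be used from a using() block. To be clear, this is not IanG's TimedLock class: TimeoutLock, Acquire and the five second timeout are hypothetical names and values chosen for illustration. The only framework calls it relies on are Monitor.TryEnter(object, TimeSpan) and Monitor.Exit.

```csharp
using System;
using System.Threading;

// Hypothetical helper in the spirit of a timed lock; not IanG's actual class.
public class TimeoutLock : IDisposable
{
    private readonly object target;

    private TimeoutLock(object target)
    {
        this.target = target;
    }

    // Tries to take the monitor; throws instead of blocking forever.
    public static TimeoutLock Acquire(object target, TimeSpan timeout)
    {
        if (!Monitor.TryEnter(target, timeout))
        {
            throw new InvalidOperationException("Timed out waiting for lock");
        }
        return new TimeoutLock(target);
    }

    // Called at the end of the using block; releases the monitor.
    public void Dispose()
    {
        Monitor.Exit(target);
    }
}

public class TimeoutLockDemo
{
    private static readonly object sync = new object();

    public static void Main()
    {
        // Assumed timeout of five seconds, purely for illustration.
        using (TimeoutLock.Acquire(sync, TimeSpan.FromSeconds(5)))
        {
            Console.WriteLine("Lock held");
        }
        Console.WriteLine("Lock released");
    }
}
```

A timeout exception in place of a silent hang at least tells you which lock a thread was stuck on, which is the main debugging benefit.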







Re: Visual C# General Remoting, and Threads, and Services Oh My!:Possible synchronization issue?

mwalts

First off, thank you for your response

Actually, Monitor.Wait can only be called from inside a section that holds a lock (either through lock(SomeObject){} or Monitor.Enter(SomeObject)) on the object you are waiting on. Behind the scenes, it releases the lock, waits, then reacquires the lock. As for why the lock is around the count check, it's actually there to avoid a race condition. You see, the count is altered in the producer thread when an item is enqueued, so the lock ensures that if the producer wants to add an item just as the consumer is about to wait for one, the consumer will either

1) reach the wait condition and be woken when the producer gets its turn, or
2) have the count incremented by the enqueue operation just before we evaluate the condition, thus we don't wait at all.

This ensures that we never end up waiting just after a signal was sent, when we could be processing a queue item... or rather it should :P
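
The two cases above can be sketched in a stripped-down form. This is only an illustration, not the code from the service: WorkQueue, Add and Take are hypothetical names, and the sketch re-checks the count in a while loop after every wake-up (the commonly recommended form) where the posted consumer uses a single if.

```csharp
using System;
using System.Collections;
using System.Threading;

// Hypothetical stand-in for the queue object discussed above.
public class WorkQueue
{
    private readonly Queue items = new Queue();

    // Producer side: enqueue and signal while holding the lock.
    public void Add(object item)
    {
        lock (items.SyncRoot)
        {
            items.Enqueue(item);
            Monitor.Pulse(items.SyncRoot);
        }
    }

    // Consumer side: check the count under the same lock, and only
    // wait when the queue is really empty.
    public object Take()
    {
        lock (items.SyncRoot)
        {
            while (items.Count == 0)
            {
                // Releases the lock while blocked, reacquires it on wake-up.
                Monitor.Wait(items.SyncRoot);
            }
            // The lock is released on return, so the caller can do its
            // long-running work without blocking the producer.
            return items.Dequeue();
        }
    }
}

public class WorkQueueDemo
{
    public static void Main()
    {
        WorkQueue queue = new WorkQueue();
        // Both pulses happen before anyone waits. Nothing is lost,
        // because Take looks at Count before deciding to wait.
        queue.Add("first");
        queue.Add("second");
        Console.WriteLine(queue.Take()); // prints "first"
        Console.WriteLine(queue.Take()); // prints "second"
    }
}
```

Because the check and the wait happen under one lock, the producer can never slip an item in between them.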

As for the profusion of locks, I'm guessing you're talking about the while loop, and yeah, you have a point. But it really is all required. That central runCurrent statement could take close to 4 hours, so I really don't want to hold any of those locks while it runs; it would destroy the whole point of threading this in the first place. On top of that, I do need to ensure exclusive access as I change the currently running queue item. The locking and unlocking around the loop itself is actually a fairly standard construct in C, which is where most of my threading experience comes from.

Interestingly enough, I have now solved this problem by using a different method that should act largely the same.

I've exchanged my Monitor.Wait and Monitor.Pulse sections with queueRunner.Suspend() and queueRunner.Resume(), outside of the locks obviously. This did cause me to create an otherwise useless bool value, and it now can result in the race condition I mentioned above, although it is so unlikely I have yet to see it in my testing.

Annoying to know it's there though.

From the reading I've done, and my previous experience with threads, it really seems like I had done the right thing, and yet it only worked intermittently. I am using the 1.1 Framework; is this a known bug? I haven't seen any mention of it in my searches.

Thank you for your time,

-mwalts




Re: Visual C# General Remoting, and Threads, and Services Oh My!:Possible synchronization issue?

Steve Py

Ah, I see. :) A bit different from the approaches I've used (ThreadPool threads with WaitHandles, and more recently polling worker threads with async delegates).
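
In case it helps, here is a rough sketch of the WaitHandle style of signalling, using an AutoResetEvent. It is not code from either of our projects: SignalledWorker, Add and DrainOnce are hypothetical names. The point of interest is that the event stays signalled until a waiter consumes it, so a Set that arrives before the consumer starts waiting is not lost, which is the race that Suspend/Resume leaves open.

```csharp
using System;
using System.Collections;
using System.Threading;

// Hypothetical sketch; names are illustrative only.
public class SignalledWorker
{
    private readonly Queue items = new Queue();
    private readonly AutoResetEvent itemAdded = new AutoResetEvent(false);

    public void Add(object item)
    {
        lock (items.SyncRoot)
        {
            items.Enqueue(item);
        }
        // Remains signalled until one waiter is released.
        itemAdded.Set();
    }

    // Blocks until signalled, then drains whatever is queued.
    // Returns the number of items processed in this pass.
    public int DrainOnce()
    {
        itemAdded.WaitOne();
        int processed = 0;
        while (true)
        {
            object item;
            lock (items.SyncRoot)
            {
                if (items.Count == 0) break;
                item = items.Dequeue();
            }
            // Long-running work would go here, outside the lock.
            Console.WriteLine("Processing " + item);
            processed++;
        }
        return processed;
    }
}

public class SignalledWorkerDemo
{
    public static void Main()
    {
        SignalledWorker worker = new SignalledWorker();
        worker.Add("a");
        worker.Add("b");
        // Signalled before the wait started, yet both items are handled.
        Console.WriteLine(worker.DrainOnce());
    }
}
```

One wrinkle: a pass can occasionally wake up and find the queue already empty (it returns 0), so the consumer loop should simply go back to waiting in that case.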


Regards,