Determining where a STOP is coming from

newjerseyrunner · Mar 1, 2017

I'm having this terrible problem with a server that I inherited. The server has a custom service running on it, and this service handles service commands.

Every hour or so on some of the servers, I see the service go down. It always comes from a STOP. I assumed there must be something wrong with the service, so I added logging to every place where a STOP is generated internally. I got nothing.

That indicates to me that it's coming from another process. Is there any way to determine which process it is? It's too random to just sit and wait for it. I checked the Windows automation tools and the Task Scheduler, and there are no tasks in there.

phinds · Mar 1, 2017

Ouch. That sounds ugly. Good luck.

QuantumQuest · Mar 2, 2017

newjerseyrunner said:

I'm having this terrible problem with a server that I inherited. The server has a custom service running on it, and this service handles service commands.

Every hour or so on some of the servers, I see the service go down. It always comes from a STOP. I assumed there must be something wrong with the service, so I added logging to every place where a STOP is generated internally. I got nothing.

That indicates to me that it's coming from another process. Is there any way to determine which process it is? It's too random to try and just wait for it. I checked in the windows automation tools and the scheduler, and there are no tasks in there.

For a helpful answer, we need details about the service and the specific OS running on the server. In any case, you need at least the documentation for the service and for how it interacts with that OS. If you can't find that, the Windows automation tools will be of no help.

jim mcnamara · Mar 2, 2017

All of those STOP codes are really kernel mode errors.

Homework time before you go nuts...

Did you try this? https://www.lifewire.com/blue-screen-error-codes-4065576
Technet also has articles on each of the specific error codes and what the potential cause is.

1. Are the servers that crash completely patched?
2. Was the service working well on a previous version of Windows? Like Server 2000, and it now runs on Windows Server 2012 or something?
2a. And was it not recompiled on the newer version, i.e., just ported as an executable? Or is it on the same old box? Or was it moved to a virtual machine (this killed us twice)?
3. Do you have source or is it from a vendor? Can you get an updated version?
4. Are there parameter files or control files? Can you tell what resources they require? IO_WAIT_QUEUE lengths and other crud like that.
5. Get the Sysinternals suite and see if you can determine the resources used. There are several tools. This sounds like resource exhaustion, although that is nothing more than a guess. Been there, done that.

And, no, I believe it is trying to handle a request that requires more resources than are available to the process, or it simply is not freeing resources correctly. It comes from the interaction with a client, but the client is not necessarily the 'bad guy'. I say this because the problem goes away when you restart the service process. For an hour or so.

rkolter · Mar 3, 2017

The OP really didn't give us enough to go on, at all. There's a server, running a service, that in turn monitors a service on other servers, and that service being monitored is receiving a command to stop. Honestly, with that vague a description it could be anything.

I will give some vague but hopefully useful advice - Bring up a VM and put only the OS and the monitored service/software on it. See if the service gets stopped. Add a firewall and record all communication coming into and out of the server. If the service does stop and the firewall's logs don't help enough, add Wireshark. In the meantime, pull the event logs from every server the service stops on - grab everything from the minute before to the minute after. If you script, write a quick PowerShell script that reports the status of the service and pulls the event logs the moment the service status changes to stopped. Start looking for things in common in those event logs.
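The last suggestion above, watching the service status and grabbing the event logs around the moment it stops, could look roughly like the sketch below. The post suggests PowerShell; this version uses Python instead and shells out to the standard Windows tools `sc.exe` and `wevtutil.exe`. The helper names, the 5-second poll interval, the 60-second wait and the output file names are all assumptions made for illustration. It only collects evidence around the stop; it does not by itself identify which process sent it.

```python
import re
import subprocess
import time
from datetime import datetime, timedelta, timezone

# Matches the "STATE : 4  RUNNING" line printed by `sc query <name>`.
_STATE_RE = re.compile(r"STATE\s*:\s*\d+\s+(\w+)")


def parse_state(sc_output: str) -> str:
    """Return the state word (RUNNING, STOPPED, ...) from `sc query` output."""
    match = _STATE_RE.search(sc_output)
    return match.group(1) if match else "UNKNOWN"


def event_query(center: datetime, margin: timedelta = timedelta(minutes=1)) -> str:
    """Build an event-log XPath filter covering center +/- margin (in UTC)."""
    fmt = "%Y-%m-%dT%H:%M:%S.000Z"
    start = (center - margin).astimezone(timezone.utc).strftime(fmt)
    end = (center + margin).astimezone(timezone.utc).strftime(fmt)
    return (f"*[System[TimeCreated[@SystemTime>='{start}' "
            f"and @SystemTime<='{end}']]]")


def query_state(name: str) -> str:
    """Ask the Service Control Manager for the current state (Windows only)."""
    result = subprocess.run(["sc.exe", "query", name],
                            capture_output=True, text=True)
    return parse_state(result.stdout)


def dump_events(log: str, center: datetime) -> str:
    """Return the events of one log (e.g. 'System') around center as text."""
    result = subprocess.run(
        ["wevtutil.exe", "qe", log, f"/q:{event_query(center)}", "/f:text"],
        capture_output=True, text=True)
    return result.stdout


def watch(name: str, poll_seconds: float = 5.0) -> None:
    """Poll the service; when it changes to STOPPED, save the nearby events."""
    previous = query_state(name)
    while True:
        time.sleep(poll_seconds)
        current = query_state(name)
        if current != previous and current == "STOPPED":
            stopped_at = datetime.now(timezone.utc)
            time.sleep(60)  # let the minute after the stop get logged too
            stamp = stopped_at.strftime("%Y%m%dT%H%M%SZ")
            for log in ("System", "Application"):
                path = f"{name}_{log}_{stamp}.txt"
                with open(path, "w", encoding="utf-8") as fh:
                    fh.write(dump_events(log, stopped_at))
        previous = current
```

To use it, call `watch("MyCustomService")` on each affected server, where `MyCustomService` is a placeholder for the real service name. Because it polls, the recorded stop time can be late by up to one poll interval, so the one-minute margin on either side is worth keeping.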

rootone · Mar 3, 2017

Inheriting undocumented stuff is always a nightmare.
I once refused a job because there was not sufficient information about the faulting software.
I advised that a rewrite from scratch would be the best idea. I didn't get the job.
