
Determining where a STOP is coming from

  1. Mar 1, 2017 #1
    I'm having a terrible problem with a server that I inherited. The server has a custom service running on it, and this service handles service commands.

    Every hour or so on some of the servers, I see the service go down. It always comes from a STOP. I assumed there must be something wrong with the service, so I added logging to every place where a STOP is generated internally. I got nothing.

    That indicates to me that it's coming from another process. Is there any way to determine which process it is? It's too random to just sit and wait for it. I checked the Windows automation tools and the scheduler, and there are no tasks in there.
     
  3. Mar 1, 2017 #2

    phinds

    Gold Member
    2016 Award

    Ouch. That sounds ugly. Good luck.
     
  4. Mar 2, 2017 #3

    QuantumQuest

    Gold Member

    For a helpful answer, we need details about the service and the specific OS running on the server. In any case, you need at least the documentation for the service and for how it interacts with that OS. If you can't find that, the Windows automation tools will be of no help.
     
  5. Mar 2, 2017 #4

    jim mcnamara


    Staff: Mentor

    All of those STOP codes are really kernel mode errors.

    Homework time before you go nuts...

    Did you try this? https://www.lifewire.com/blue-screen-error-codes-4065576
    Technet also has articles on each of the specific error codes and what the potential cause is.

    1. Are the servers that crash completely patched?
    2. Was the service working properly on a previous version of Windows? For example, it ran on Server 2000 and now has to play with Windows Server 2012 or something.
    2a. And was it not recompiled on the newer version, i.e., just ported as an executable? Or is it on the same old box? Or was it moved to a virtual machine (this killed us twice)?
    3. Do you have source or is it from a vendor? Can you get an updated version?
    4. Are there parameter files or control files? Can you tell what resources they require? IO_WAIT_QUEUE lengths and other crud like that.
    5. Get the Sysinternals suite and see if you can determine the resources used. There are several tools, and this sounds like resource exhaustion, although that is nothing more than a guess. Been there, done that.

    And, no, I believe it is trying to handle a request that requires more resources than are available to the process, or it simply is not freeing resources correctly. It comes from the interaction with a client, but the client is not necessarily the 'bad guy'. I say this because the problem goes away when you restart the service process. For an hour or so.
     
  6. Mar 3, 2017 #5
    The OP really didn't give us enough to go on, at all. There's a server, running a service, that in turn monitors a service on other servers, and that service being monitored is receiving a command to stop. Honestly, with that vague a description it could be anything.

    I will give some vague but hopefully useful advice. Bring up a VM and put only the OS and the monitored service/software on it. See if the service gets stopped. Add a firewall and record all communication coming into and out of the server. If the service does stop and the firewall's logs don't help enough, add Wireshark. In the meantime, pull the event logs from every server the service stops on, from the minute before to the minute after. If you script, write a quick PowerShell script that reports the status of the service and pulls the event logs the moment the service status changes to stopped. Start looking for things in common in those event logs.
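
The watcher described above could be sketched roughly as follows. This is Python rather than PowerShell and only an illustration, not a tested tool: the service name `MyCustomService`, the poll interval, and the output file names are all hypothetical. It assumes the built-in `sc query` and `wevtutil qe` commands are on the PATH and that `sc query` prints an English `STATE` line. When the service is first seen as STOPPED, it waits a minute and then saves the last two minutes of the System and Application logs, which approximates "the minute before to the minute after".

```python
import re
import subprocess
import time

SERVICE_NAME = "MyCustomService"  # hypothetical; use the real service name


def parse_state(sc_output):
    """Return the state word (RUNNING, STOPPED, ...) from `sc query` output."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else None


def event_query_command(log_name, window_ms):
    """Build a wevtutil command for events from the last `window_ms` milliseconds."""
    xpath = f"*[System[TimeCreated[timediff(@SystemTime) <= {window_ms}]]]"
    return ["wevtutil", "qe", log_name, f"/q:{xpath}", "/f:text", "/rd:true"]


def run(command):
    """Run a command and return whatever it wrote to standard output."""
    return subprocess.run(command, capture_output=True, text=True).stdout


def watch(service, poll_seconds=5):
    """Poll the service; on a transition to STOPPED, dump recent event logs."""
    last_state = None
    while True:
        state = parse_state(run(["sc", "query", service]))
        if state == "STOPPED" and last_state not in (None, "STOPPED"):
            stamp = int(time.time())
            time.sleep(60)  # let the "minute after" accumulate before pulling
            for log_name in ("System", "Application"):
                events = run(event_query_command(log_name, 120000))
                # Hypothetical file naming: one text file per log per stop.
                with open(f"{service}_{log_name}_{stamp}.txt", "w") as handle:
                    handle.write(events)
        last_state = state
        time.sleep(poll_seconds)


if __name__ == "__main__":
    watch(SERVICE_NAME)
```

Run it on each affected server and compare the saved files across several stops; anything that shows up just before every stop is the first thing to chase.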
     
  7. Mar 3, 2017 #6
    Inheriting undocumented stuff is always a nightmare.
    I once refused a job because there was not sufficient information about the faulting software.
    I advised that a rewrite from scratch would be the best idea. I didn't get the job.
     
    Last edited: Mar 3, 2017