06-10-2023, 10:24 PM
PING Remy.
Some years ago I tried to use Linux for a TCPServer application.
I had to abandon this because of an intermittent problem: the TCPServer would not release a Context, and the system then locked up during shutdown.
This bug only happened on the live production server, and I could not reproduce it in a test environment.
Now the same problem has occurred on the replacement production Windows Server VPS.
I have now been able to reproduce the problem consistently on a test server by accessing it via a StarLink connection. After some time, the issue appears under certain conditions (I have more data on this).
I have opened issue #484 on GitHub: https://github.com/IndySockets/Indy/issues/484
I can now produce debug data, but I do not understand why and how certain functions are called.
Since I posted the issue I have collected a lot more data, but I need somebody who knows the Indy internals to look at the logs and tell me where I should go next.
Basically what happens:
- A TCP event occurs (NOT a RESET) which causes the TCPServer to drop the TCP connection without clearing the Context (and probably other areas).
- Attempting to Disconnect the Context then fails (I assume it is already partly disconnected).
- The Context stays in the Context List, and cannot be removed.
- On Shutdown, the TCPServer locks up with 100% CPU.
I now have full logs of this situation.
There is an endless loop in TIdScheduler.TerminateAllYarns that never exits. However, this is not where the 100% CPU comes from, as the loop has a 500ms delay built in.
By breaking out of the loop (as a hack) I can get the TCPServer to finally shut down, and the program terminates.
But all of this is a result of the initial failure: the TCPServer does not clear the Context after something has happened to the TCP connection.
Can an experienced Indy programmer please help? This is a bad situation now that it has also occurred on my production server.
Bart