Code: Select all
nohup /path/to/binary -parameters
Regards
Alex
This will not work. Removing ampersand would keep process running in foreground, this is not the very last command in our script, so this would effectively prevent script to complete the rest of checks.KS-Soft wrote:try without ampersand, should workCode: Select all
nohup /path/to/binary -parameters
Regards
Alex
KS-Soft wrote:What script?
If HostMonitor starts "nohup /path/to/binary -parameters" command then I do not understand where is the script.
Regards
Alex
The actual command is not just as simple as "nohup program &".What's not working:
Background processes do not survive. Basically HM logs into system, checks if certain process is running and if not starts new instance using
Assuming I split this single test into two, we have approximately 350 servers, each with 3 to 8 instances of this test managing different processes. It will take enormous amount of time to change them all.KS-Soft wrote:May be you can split it into 2 scripts?
Make 2 tests, one will start "checking" script and wait for result, another dependent test will start 2nd script (start process).
Regards
Alex
It's same script on multiple servers. around 1500 tests in total.KS-Soft wrote:So you have around 2000 tests related to this script (single script on single server)?
It's already redirected, same way you suggest. Including not just STDOUT and STDERR, but also STDIN:KS-Soft wrote:Then we have to find out what is wrong with this command
nohup /path/to/binary -parameters &
may be this version of "nohup" does not redirect output?
Then may be process "binary" terminates by SIGPIPE (when writing to a terminal which does not exist anymore because script terminated and session closed)?
Have you tried command like
nohup /path/to/binary -parameters >>logout.txt 2>>logerror.txt &
or may be better
setsid /path/to/binary -parameters >>logout.txt 2>>logerror.txt &
Regards
Alex
SSHMode=1 does not work on "old" server. Process does not survive.v.yesakov wrote:Maybe I should try SSHMode=1 on old servers to, to eliminate possible difference with "nohup" . I'll post results later today.
"setsid" method works without "SSHMode=1", but does not work "new" way, same as "nohup"v.yesakov wrote:I'll try with "setsid" too.
Code: Select all
debug1: Forked child 9889.
Set /proc/self/oom_score_adj to 0
debug1: rexec start in 5 out 5 newsock 5 pipe 7 sock 8
debug1: inetd sockets after dupping: 3, 3
Connection from IP.AD.RE.SS port 3042
debug1: Client protocol version 2.0; client software version KS_SSH
debug1: no match: KS_SSH
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_5.9p1 Debian-5ubuntu1.10
debug1: permanently_set_uid: 105/65534 [preauth]
debug1: list_hostkey_types: ssh-rsa,ssh-dss,ecdsa-sha2-nistp256 [preauth]
debug1: SSH2_MSG_KEXINIT sent [preauth]
debug1: SSH2_MSG_KEXINIT received [preauth]
debug1: kex: client->server aes256-cbc hmac-sha1 none [preauth]
debug1: kex: server->client aes256-cbc hmac-sha1 none [preauth]
debug1: expecting SSH2_MSG_KEXDH_INIT [preauth]
debug1: SSH2_MSG_NEWKEYS sent [preauth]
debug1: expecting SSH2_MSG_NEWKEYS [preauth]
debug1: SSH2_MSG_NEWKEYS received [preauth]
debug1: KEX done [preauth]
debug1: userauth-request for user alliance service ssh-connection method none [preauth]
debug1: attempt 0 failures 0 [preauth]
reverse mapping checking getaddrinfo for IP.AD.RE.SS.domain.tld [IP.AD.RE.SS] failed - POSSIBLE BREAK-IN ATTEMPT!
debug1: PAM: initializing for "alliance"
debug1: PAM: setting PAM_RHOST to "IP.AD.RE.SS"
debug1: PAM: setting PAM_TTY to "ssh"
debug1: userauth-request for user alliance service ssh-connection method password [preauth]
debug1: attempt 1 failures 0 [preauth]
debug1: PAM: password authentication accepted for alliance
debug1: do_pam_account: called
Accepted password for alliance from IP.AD.RE.SS port 3042 ssh2
debug1: monitor_read_log: child log fd closed
debug1: monitor_child_preauth: alliance has been authenticated by privileged process
debug1: PAM: establishing credentials
pam_unix(sshd:session): session opened for user alliance by (uid=0)
User child is on pid 10037
debug1: SELinux support disabled
debug1: PAM: establishing credentials
debug1: permanently_set_uid: 1000/1000
debug1: Entering interactive session for SSH2.
debug1: server_init_dispatch_20
debug1: server_input_channel_open: ctype session rchan 0 win 16384 max 16384
debug1: input_session_request
debug1: channel 0: new [server-session]
debug1: session_new: session 0
debug1: session_open: channel 0
debug1: session_open: session 0: link with channel 0
debug1: server_input_channel_open: confirm session
debug1: server_input_channel_req: channel 0 request exec reply 1
debug1: session_by_channel: session 0 channel 0
debug1: session_input_channel_req: session 0 req exec
debug1: Received SIGCHLD.
debug1: session_by_pid: pid 10038
debug1: session_exit_message: session 0 channel 0 pid 10038
debug1: session_exit_message: release channel 0
Received disconnect from IP.AD.RE.SS: 11: session close
debug1: do_cleanup
debug1: do_cleanup
debug1: PAM: cleanup
debug1: PAM: closing session
pam_unix(sshd:session): session closed for user alliance
debug1: PAM: deleting credentials
Code: Select all
debug1: Forked child 9735.
Set /proc/self/oom_score_adj to 0
debug1: rexec start in 5 out 5 newsock 5 pipe 7 sock 8
debug1: inetd sockets after dupping: 3, 3
Connection from IP.AD.RE.SS port 4150
debug1: Client protocol version 2.0; client software version KS_SSH1030
debug1: no match: KS_SSH1030
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_5.9p1 Debian-5ubuntu1.10
debug1: permanently_set_uid: 105/65534 [preauth]
debug1: list_hostkey_types: ssh-rsa,ssh-dss,ecdsa-sha2-nistp256 [preauth]
debug1: SSH2_MSG_KEXINIT sent [preauth]
debug1: SSH2_MSG_KEXINIT received [preauth]
debug1: kex: client->server blowfish-cbc hmac-sha1 none [preauth]
debug1: kex: server->client blowfish-cbc hmac-sha1 none [preauth]
debug1: expecting SSH2_MSG_KEXDH_INIT [preauth]
debug1: SSH2_MSG_NEWKEYS sent [preauth]
debug1: expecting SSH2_MSG_NEWKEYS [preauth]
debug1: SSH2_MSG_NEWKEYS received [preauth]
debug1: KEX done [preauth]
debug1: userauth-request for user alliance service ssh-connection method none [preauth]
debug1: attempt 0 failures 0 [preauth]
reverse mapping checking getaddrinfo for IP.AD.RE.SS.domain.tld [IP.AD.RE.SS] failed - POSSIBLE BREAK-IN ATTEMPT!
debug1: PAM: initializing for "alliance"
debug1: PAM: setting PAM_RHOST to "IP.AD.RE.SS"
debug1: PAM: setting PAM_TTY to "ssh"
debug1: userauth-request for user alliance service ssh-connection method password [preauth]
debug1: attempt 1 failures 0 [preauth]
debug1: PAM: password authentication accepted for alliance
debug1: do_pam_account: called
Accepted password for alliance from IP.AD.RE.SS port 4150 ssh2
debug1: monitor_read_log: child log fd closed
debug1: monitor_child_preauth: alliance has been authenticated by privileged process
debug1: PAM: establishing credentials
pam_unix(sshd:session): session opened for user alliance by (uid=0)
User child is on pid 9882
debug1: SELinux support disabled
debug1: PAM: establishing credentials
debug1: permanently_set_uid: 1000/1000
debug1: Entering interactive session for SSH2.
debug1: server_init_dispatch_20
debug1: server_input_channel_open: ctype session rchan 1 win 131072 max 32768
debug1: input_session_request
debug1: channel 0: new [server-session]
debug1: session_new: session 0
debug1: session_open: channel 0
debug1: session_open: session 0: link with channel 0
debug1: server_input_channel_open: confirm session
debug1: server_input_channel_req: channel 0 request pty-req reply 0
debug1: session_by_channel: session 0 channel 0
debug1: session_input_channel_req: session 0 req pty-req
debug1: Allocating pty.
debug1: session_new: session 0
debug1: SELinux support disabled
debug1: session_pty_req: session 0 alloc /dev/pts/1
debug1: server_input_channel_req: channel 0 request exec reply 0
debug1: session_by_channel: session 0 channel 0
debug1: session_input_channel_req: session 0 req exec
debug1: Setting controlling tty using TIOCSCTTY.
debug1: Received SIGCHLD.
debug1: session_by_pid: pid 9883
debug1: session_exit_message: session 0 channel 0 pid 9883
debug1: session_exit_message: release channel 0
debug1: session_by_tty: session 0 tty /dev/pts/1
debug1: session_pty_cleanup: session 0 release /dev/pts/1
debug1: server_input_channel_req: channel 0 request exit-status reply 0
debug1: session_by_channel: session 0 channel 0
debug1: session_input_channel_req: session 0 req exit-status
debug1: session_by_channel: session 0 channel 0
debug1: session_close_by_channel: channel 0 child 0
debug1: session_close: session 0 pid 0
debug1: channel 0: free: server-session, nchannels 1
Connection closed by IP.AD.RE.SS
debug1: do_cleanup
Transferred: sent 2112, received 1104 bytes
Closing connection to IP.AD.RE.SS port 4150
debug1: PAM: cleanup
debug1: PAM: closing session
pam_unix(sshd:session): session closed for user alliance
debug1: PAM: deleting credentials
I assume there are a lot of changes between Ubuntu 12.04 and 16.04 but this question should be addressed to Ubuntu developers.And remind you: it's working on Ubuntu 12.04 without any issues (without "SSHMode=1").
Code: Select all
nohup ping -c 50 8.8.8.8 >>/home/test/log/pinglog.txt &
Code: Select all
nohup ping -c 50 8.8.8.8 >>/home/test/log/pinglog.txt &
sleep 2s
Two debug logs from SSHD in my post above were taken from the very same server, running Ubuntu 12.04.5 x64. Not two different machines. The only difference is "SSHMode=1" added to RMA config in between.KS-Soft wrote:I assume there are a lot of changes between Ubuntu 12.04 and 16.04And remind you: it's working on Ubuntu 12.04 without any issues (without "SSHMode=1").
Also I think you have different versions of ssh servers on these systems.
This is the puzzle we have to solve together somehow, because it affects old servers as well as new ones.KS-Soft wrote:We made many changes in HostMonitor code as well, but we do not see how these changes may effect nohup program.
It will do as temporary workaround, but it's definitely not a solution.KS-Soft wrote:However we were able to reproduce the problem and I think we found solution - add sleep
Yes, I confirm: adding "sleep" at the end of script helps. Process survives after HM closes SSH session.KS-Soft wrote:while next script works fine (sleep for 2 sec allows ping to work in backround for 50 sec, process works fine when ssh session closed)Code: Select all
nohup ping -c 50 8.8.8.8 >>/home/test/log/pinglog.txt & sleep 2s