
How to make 2 applications run each other in Linux?

The situation is as follows:

We have a main application and a watcher application. Both are C++ applications, and both call the daemon(1,0) function.

The watcher checks whether the main application is running. If it finds that the main process is absent (crashed) or does not respond (the applications 'talk' to each other over TCP, which is how the watcher knows that main has hung), it starts or restarts main.

The TCP settings for the connection can be changed by the user, which is done through the main app. After the change, the watcher must be restarted to load the new configuration. That is also done from the main app.

As it is, it works fine.
1. On startup, the Main app DOES kill the existing Watcher process and runs it again. [This is correct]
2. The Watcher app DOES kill Main and runs it again. [This is correct]

BUT

  1. If I run Main, which in turn starts Watcher,
  2. then kill Main so that the Watcher is left alone,
  3. the Watcher sees that there is no Main anymore and starts it again.
  4. Main starts again, kills the Watcher and tries to start it again...
  5. At this point, some kind of nonsense happens. It starts the Watcher (I can see the TCP port being taken in the netstat output), but there is no process named Watcher.

Where netstat normally shows tcp 0 0 IP:TCP_PORT LISTEN Watcher, it now shows tcp 0 0 IP:TCP_PORT LISTEN Main.

It is as if the Watcher is there, but inside the Main process.

I use scripts to run the applications. Watcher uses this:

#!/bin/sh
killall -9 Main
./Main

And runs it like system("./runMain.sh&");

Main uses this:

#!/bin/sh
killall -9 Watcher
./Watcher

And runs it like system("./runWatcher.sh&");

What am I doing wrong? How do I run them so that they can restart each other when needed and always start in separate processes?

So far I have also tried running the scripts using nohup; the result is the same.

EDIT 1:

Note: the PID numbers here are just for clarity. In reality the PID is not 1, of course.

  1. I run Main. netstat shows me:

    tcp 0 0 192.168.0.1:7000 LISTEN (PID 1)Main
    tcp 0 0 192.168.0.1:7001 LISTEN (PID 1)Main

  2. Main starts the Watcher using the script. Now netstat shows me:

    tcp 0 0 192.168.0.1:7000 LISTEN (PID 1)Main
    tcp 0 0 192.168.0.1:7001 LISTEN (PID 1)Main
    tcp 0 0 192.168.0.1:8000 LISTEN (PID 2)Watcher

  3. Now I manually kill Main by running killall -9 Main. Now netstat shows me:

    tcp 0 0 192.168.0.1:7000 LISTEN (PID 2)Watcher
    tcp 0 0 192.168.0.1:7001 LISTEN (PID 2)Watcher
    tcp 0 0 192.168.0.1:8000 LISTEN (PID 2)Watcher

    Notice the change in who owns the listening sockets now? How did that happen?

  4. The Watcher sees that Main is gone and so it starts it using the script file.

  5. Main kills the Watcher on startup. netstat shows:

    tcp 0 0 192.168.0.1:7000 LISTEN (PID 3)Main
    tcp 0 0 192.168.0.1:7001 LISTEN (PID 3)Main
    tcp 0 0 192.168.0.1:8000 LISTEN (PID 3)Main

And that's it. The Watcher never runs again. I tried to debug in Eclipse: the Watcher crashes right on the line daemon(1,0), without throwing anything.

How about using a custom signal (or even listening on another port for admin commands)? Using kill -9 interferes with the process tree, for example by leaving the child process in control of the parent's resources (ports, etc.).
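As a rough sketch of the custom-signal idea, the Watcher could reload its configuration when it receives a signal instead of being killed and restarted. The choice of SIGHUP and all function names below are assumptions made for illustration; they are not taken from the applications in the question.

```cpp
#include <cassert>
#include <csignal>
#include <signal.h>

// Set by the signal handler, polled by the watcher's main loop.
static volatile std::sig_atomic_t g_reload_requested = 0;

// Hypothetical handler: it only sets a flag, which is async-signal-safe.
extern "C" void on_reload_signal(int) {
    g_reload_requested = 1;
}

// Install the handler for signo (SIGHUP is a common choice for "reload").
// Returns true on success.
bool install_reload_handler(int signo) {
    assert(signo > 0);
    struct sigaction sa {};
    sa.sa_handler = on_reload_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;  // restart interrupted system calls where possible
    return sigaction(signo, &sa, nullptr) == 0;
}

// Returns true if a reload was requested since the last call, and clears the flag.
bool consume_reload_request() {
    if (!g_reload_requested) return false;
    g_reload_requested = 0;
    return true;
}
```

The Watcher would call install_reload_handler(SIGHUP) once at startup and check consume_reload_request() in its loop, re-reading the TCP settings when it returns true. Main would then send the signal with kill(watcher_pid, SIGHUP) rather than running killall -9 Watcher, so no process is torn down just to pick up new settings.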

On top of that, when the Main process is started by the Watcher, why does it assume that the running instance of Watcher should be killed? One problem is that the Watcher is now the parent of Main, so I can see how that could cause trouble.
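One plausible explanation for the netstat output in the question is descriptor inheritance. This assumes both programs open their listening sockets without the close-on-exec flag. A process started through system() is created with fork() and exec(), so it inherits every open file descriptor of its parent, including listening sockets. daemon(1,0) redirects only stdin, stdout and stderr, so it does not close them either. The helper names below are hypothetical; the sketch only shows how a descriptor can be marked so that child programs do not receive it.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

// Mark an already-open descriptor close-on-exec so that programs started
// through system() or fork()+exec() do not inherit it.
// Returns true on success.
bool set_cloexec(int fd) {
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1) return false;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == 0;
}

// Create a TCP socket that is close-on-exec from the start.
// SOCK_CLOEXEC is Linux-specific (kernel 2.6.27 and later).
// Returns the descriptor, or -1 on error.
int make_tcp_socket_cloexec() {
    return socket(AF_INET, SOCK_STREAM | SOCK_CLOEXEC, 0);
}
```

If Main's sockets on ports 7000 and 7001 and the Watcher's socket on port 8000 were created this way, the other program should no longer show up in netstat as a holder of those ports once the original owner is killed. Sockets returned by accept() would need the same treatment, for example with accept4() and SOCK_CLOEXEC.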

It comes down to the need for the two processes to communicate by some means other than the 'kill' signal.

Use a semaphore or some other OS-level communication mechanism to coordinate between the two.
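A minimal sketch of one such mechanism follows. It uses an advisory file lock (flock) rather than a semaphore, because the kernel drops the lock automatically when the holder dies, even after kill -9, whereas a POSIX named semaphore is not released for a killed holder. The lock-file path and the function names are assumptions for illustration only.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

// Try to take an exclusive, non-blocking lock on path.
// Returns the open descriptor on success (keep it open for the lifetime of
// the process), or -1 on failure. errno is EWOULDBLOCK when the lock is
// already held by someone else.
int try_acquire_instance_lock(const char* path) {
    assert(path != nullptr);
    // O_CLOEXEC keeps the lock from leaking into programs started later.
    int fd = open(path, O_RDWR | O_CREAT | O_CLOEXEC, 0644);
    if (fd == -1) return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) == -1) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

// Watcher-side probe: true if some process currently holds the lock.
// Note that the probe itself holds the lock for a brief moment.
bool instance_is_running(const char* path) {
    int fd = try_acquire_instance_lock(path);
    if (fd >= 0) {
        close(fd);  // nobody held it; release our probe lock again
        return false;
    }
    return errno == EWOULDBLOCK;
}
```

Main would call try_acquire_instance_lock() once at startup and exit if it returns -1, and the Watcher would call instance_is_running() to decide whether a restart is needed. With this in place neither start script needs killall -9 to guarantee a single instance. Hang detection would still rely on the existing TCP check.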
