Hello,
I'm running msgcache on a remote machine. It works fine for some hours, but at some point it stops working.
Here is part of msgcache.log (msgcache is running with --debug):
... (hours of correct processing)
2006-08-22 15:11:02 New connection
2006-08-22 15:11:02 -> oksender
2006-08-22 15:11:02 <- oksender(1-a)
2006-08-22 15:11:02 Got pullclient request: pullclient 1 log:/var/log/messages:10240 ignore MARK
2006-08-22 15:11:02 Saved client response: log:/var/log/messages:10240 ignore MARK
2006-08-22 15:12:13 New connection
2006-08-22 15:13:36 New connection
2006-08-22 15:13:37 -> oksender
2006-08-22 15:13:37 <- oksender(1-a)
2006-08-22 15:13:37 Got pullclient request: pullclient 1 log:/var/log/messages:10240 ignore MARK
2006-08-22 15:13:37 Saved client response: log:/var/log/messages:10240 ignore MARK
2006-08-22 15:13:43 New connection
2006-08-22 15:13:53 New connection
2006-08-22 15:16:13 New connection
2006-08-22 15:18:44 New connection
2006-08-22 15:19:00 New connection
2006-08-22 15:21:21 New connection
2006-08-22 15:23:44 New connection
2006-08-22 15:24:19 New connection
2006-08-22 15:27:34 New connection
2006-08-22 15:28:45 New connection
... after this it stopped sending results to the (polling) Hobbit server.
Any ideas? What can I do to get additional information?
Greetings (great work, Henrik)
Rolf Masfelder
Tel.: 06321 355 207 FAX: 06321 355 224 Mobil: 0160 80 64 181 world: 0700 NECTORGMbh EMail: rolf.masfelder at nector.de
On Tue, Aug 22, 2006 at 04:03:20PM +0200, Rolf Masfelder wrote:
I'm running msgcache on a remote machine. It works fine for some hours, but at some point it stops working.
Here is part of msgcache.log (msgcache is running with --debug):
2006-08-22 15:12:13 New connection
2006-08-22 15:13:36 New connection
2006-08-22 15:13:37 -> oksender
2006-08-22 15:13:37 <- oksender(1-a)
2006-08-22 15:13:37 Got pullclient request: pullclient 1
2006-08-22 15:13:43 New connection
2006-08-22 15:13:53 New connection
2006-08-22 15:16:13 New connection
2006-08-22 15:18:44 New connection
It would be interesting to see the Hobbit server's hobbitfetch.log file also around this time. I suspect there might be some "Timeout while talking to ..." messages there.
Also, try enabling debugging for hobbitfetch with "kill -USR2 <hobbitfetch PID>" and then force a listing of the currently active connections with "kill -USR1 <hobbitfetch PID>"
Finally, when msgcache is in this state, what happens if you run the following command from the Hobbit server?
bb IP.OF.CLIENT.HOST "pullclient"
It should dump the last status message to the screen.
Regards, Henrik
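The two signals suggested above can be sent in one step. The following is only a sketch: the function name `hobbitfetch_debug_dump` and the one-second pause are assumptions for illustration, and the PID has to be looked up by hand (for example with ps). Only the two signals themselves come from the message above.

```shell
# Sketch: send hobbitfetch the debug signal (USR2), then ask it to list
# its currently active connections (USR1).
# Assumptions: the function name and the pause are illustrative only;
# the PID is supplied by the caller.
hobbitfetch_debug_dump() {
    pid="$1"
    # kill -0 sends no signal; it only checks that the process can be signalled
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no such process: $pid" >&2
        return 1
    fi
    kill -USR2 "$pid" || return 1   # debugging signal, as suggested above
    sleep 1                         # assumed pause so USR2 is handled first
    kill -USR1 "$pid"               # request the list of active connections
}
```

It would be called as `hobbitfetch_debug_dump <hobbitfetch PID>`; the resulting output is expected in hobbitfetch.log.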
On Tuesday, 22 August 2006 at 16:21, Henrik Stoerner wrote:
On Tue, Aug 22, 2006 at 04:03:20PM +0200, Rolf Masfelder wrote:
I'm running msgcache on a remote machine. It works fine for some hours, but at some point it stops working.
Here is part of msgcache.log (msgcache is running with --debug):
2006-08-22 15:12:13 New connection
2006-08-22 15:13:36 New connection
2006-08-22 15:13:37 -> oksender
2006-08-22 15:13:37 <- oksender(1-a)
2006-08-22 15:13:37 Got pullclient request: pullclient 1
2006-08-22 15:13:43 New connection
2006-08-22 15:13:53 New connection
2006-08-22 15:16:13 New connection
2006-08-22 15:18:44 New connection
Here are some lines from hobbitclient.log:
2006-08-22 00:56:18 Whoops ! bb failed to send message - timeout
2006-08-22 01:01:19 Whoops ! bb failed to send message - timeout
(here I restarted Hobbit on the client)
2006-08-22 15:13:58 Whoops ! bb failed to send message - timeout
2006-08-22 15:18:59 Whoops ! bb failed to send message - timeout
2006-08-22 15:23:59 Whoops ! bb failed to send message - timeout
2006-08-22 15:29:00 Whoops ! bb failed to send message - timeout
2006-08-22 15:34:00 Whoops ! bb failed to send message - timeout
2006-08-22 15:39:01 Whoops ! bb failed to send message - timeout
2006-08-22 15:44:02 Whoops ! bb failed to send message - timeout
It would be interesting to see the Hobbit server's hobbitfetch.log file also around this time. I suspect there might be some "Timeout while talking to ..." messages there.
Here are the last lines of hobbitfetch.log:
2006-08-22 00:45:54 Timeout while talking to 212.227.90.152:1984 (req 7545): Aborting session
2006-08-22 01:01:44 Connection lost during read from 212.227.90.152:1984 (req 7552): Connection reset by peer
2006-08-22 15:12:46 Connection lost during read from 212.227.90.152:1984 (req 8642): Connection reset by peer
                                                     ^^^^^^^^^^^^^^ the client with msgcache
2006-08-22 15:29:23 Timeout while talking to 212.227.90.152:1984 (req 8649): Aborting session
2006-08-22 15:44:38 Timeout while talking to 212.227.90.152:1984 (req 8655): Aborting session
2006-08-22 16:00:54 Timeout while talking to 212.227.90.152:1984 (req 8661): Aborting session
I use a connection without a tunnel, so you can access msgcache as well :-<
Also, try enabling debugging for hobbitfetch with "kill -USR2 <hobbitfetch PID>" and then force a listing of the currently active connections with "kill -USR1 <hobbitfetch PID>"
Looking for the PID, I found:
hobbit 27013 27011 0 Aug15 ? 00:00:02 /home/hobbit/server/bin/hobbitfetch --pidfile=/var/log/hobbit/hobbitfetch.pid
but there is no hobbitfetch.pid in /var/log/hobbit ???
OK ...
kill -USR2 27013
kill -USR1 27013
Here are the lines from hobbitfetch.log:
2006-08-22 00:45:54 Timeout while talking to 212.227.90.152:1984 (req 7545): Aborting session
2006-08-22 01:01:44 Connection lost during read from 212.227.90.152:1984 (req 7552): Connection reset by peer
2006-08-22 15:12:46 Connection lost during read from 212.227.90.152:1984 (req 8642): Connection reset by peer
2006-08-22 15:29:23 Timeout while talking to 212.227.90.152:1984 (req 8649): Aborting session
2006-08-22 15:44:38 Timeout while talking to 212.227.90.152:1984 (req 8655): Aborting session
2006-08-22 16:00:54 Timeout while talking to 212.227.90.152:1984 (req 8661): Aborting session
2006-08-22 18:00:49 Debug ON
2006-08-22 18:00:53 Queuing request 8810 to 212.227.90.152:1984 for p15191085: 'pullclient 1 log:/var/log/messages:10240 ignore MARK
'
2006-08-22 18:00:53 Sent 54 bytes to 212.227.90.152:1984 (req 8810)
2006-08-22 18:00:53 Done reading data from 212.227.90.152:1984 (req 8810)
2006-08-22 18:00:53 Doing cleanup
2006-08-22 18:00:53 Next poll of p15191085 in 45 seconds
2006-08-22 18:00:53 Request completed: req 8810, peer 212.227.90.152:1984, action was 2, type was 0
2006-08-22 18:01:38 Queuing request 8811 to 212.227.90.152:1984 for p15191085: 'pullclient 1 log:/var/log/messages:10240 ignore MARK
'
2006-08-22 18:01:38 Sent 54 bytes to 212.227.90.152:1984 (req 8811)
2006-08-22 18:01:38 Done reading data from 212.227.90.152:1984 (req 8811)
2006-08-22 18:01:38 Doing cleanup
2006-08-22 18:01:38 Next poll of p15191085 in 44 seconds (for client msg)
2006-08-22 18:01:38 Request completed: req 8811, peer 212.227.90.152:1984, action was 2, type was 0
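To see at a glance how often each client timed out or dropped the connection, the error lines in hobbitfetch.log can be counted per peer. This is a minimal sketch, assuming the log format shown above; the function name `summarize_fetch_errors` is hypothetical and not part of Hobbit.

```shell
# Sketch: count timeout and connection-lost entries per peer.
# Reads hobbitfetch.log lines on stdin. The field positions assume the
# "DATE TIME message..." layout seen in the log excerpt above.
summarize_fetch_errors() {
    awk '
        # "DATE TIME Timeout while talking to PEER ..." -> peer is field 7
        /Timeout while talking to/         { timeouts[$7]++; peers[$7] = 1 }
        # "DATE TIME Connection lost during read from PEER ..." -> field 8
        /Connection lost during read from/ { lost[$8]++; peers[$8] = 1 }
        END {
            for (p in peers)
                printf "%s timeouts=%d lost=%d\n", p, timeouts[p], lost[p]
        }
    '
}
```

It would be run as `summarize_fetch_errors < hobbitfetch.log`. For the six error lines above it should report 4 timeouts and 2 lost connections for 212.227.90.152:1984.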
Finally, when msgcache is in this state, what happens if you run
I have to wait for this ...
from the Hobbit server - the command
bb IP.OF.CLIENT.HOST "pullclient"
It should dump the last status message to the screen.
Regards, Henrik
To unsubscribe from the hobbit list, send an e-mail to hobbit-unsubscribe at hswn.dk
Thanks in advance
Rolf Masfelder
On Tuesday, 22 August 2006 at 18:10, Rolf Masfelder wrote:
On Tuesday, 22 August 2006 at 16:21, Henrik Stoerner wrote:
On Tue, Aug 22, 2006 at 04:03:20PM +0200, Rolf Masfelder wrote:
I'm running msgcache on a remote machine. It works fine for some hours, but at some point it stops working.
Here is part of msgcache.log (msgcache is running with --debug):
... as before ...
Finally, when msgcache is in this state, what happens if you run
I have to wait for this ...
from the Hobbit server - the command
bb IP.OF.CLIENT.HOST "pullclient"
It should dump the last status message to the screen.
On the server:
2006-08-24 08:50:55 Whoops ! bb failed to send message - timeout
On the client, the last entries in msgcache.log are:
2006-08-24 08:47:49 New connection
2006-08-24 08:49:08 New connection
2006-08-24 08:50:15 New connection
2006-08-24 08:50:40 New connection
The last entries in hobbitclient.log:
2006-08-24 08:29:19 Whoops ! bb failed to send message - timeout
2006-08-24 08:34:20 Whoops ! bb failed to send message - timeout
2006-08-24 08:39:21 Whoops ! bb failed to send message - timeout
2006-08-24 08:44:22 Whoops ! bb failed to send message - timeout
2006-08-24 08:49:23 Whoops ! bb failed to send message - timeout
2006-08-24 08:54:24 Whoops ! bb failed to send message - timeout
Time is correct on both machines.
What I have seen: there are two types of processing blocks.
--- First, occurring every fourth or fifth connection:
2006-08-23 19:02:06 New connection
2006-08-23 19:02:06 Queuing outbound message
2006-08-23 19:02:15 New connection
2006-08-23 19:02:15 -> oksender
2006-08-23 19:02:15 <- oksender(1-a)
2006-08-23 19:02:15 Got pullclient request: pullclient 1 log:/var/log/messages:10240 ignore MARK
2006-08-23 19:02:15 Saved client response: log:/var/log/messages:10240 ignore MARK
There is a "Queuing outbound message" between two "New connection" lines.
--- Second, the 'normal' version:
2006-08-23 19:03:14 New connection
2006-08-23 19:03:14 -> oksender
2006-08-23 19:03:14 <- oksender(1-a)
2006-08-23 19:03:14 Got pullclient request: pullclient 1 log:/var/log/messages:10240 ignore MARK
2006-08-23 19:03:14 Saved client response: log:/var/log/messages:10240 ignore MARK
One "New connection" without "Queuing ...".
--- The last block before the problem occurs:
2006-08-23 19:04:16 New connection
2006-08-23 19:05:53 New connection
2006-08-23 19:05:53 -> oksender
2006-08-23 19:05:53 <- oksender(1-a)
2006-08-23 19:05:53 Got pullclient request: pullclient 1 log:/var/log/messages:10240 ignore MARK
2006-08-23 19:05:53 Saved client response: log:/var/log/messages:10240 ignore MARK
There are two "New connection" lines without the "Queuing ...".
After that, the log contains only:
2006-08-23 19:06:39 New connection
2006-08-23 19:07:06 New connection
2006-08-23 19:09:35 New connection
2006-08-23 19:11:47 New connection
2006-08-23 19:12:07 New connection
I hope that helps.
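The pattern described above (a last "Saved client response" followed only by "New connection" lines) could be checked automatically. This is a rough sketch, assuming the msgcache --debug log format shown above; the function name `last_good_poll` is hypothetical and not part of Hobbit.

```shell
# Sketch: report when msgcache last saved a client response and how many
# connections were accepted since then without a new response being saved.
# Reads msgcache.log lines on stdin.
last_good_poll() {
    awk '
        / New connection/         { pending++ }
        / Saved client response:/ { last = $1 " " $2; pending = 0 }
        END {
            if (last == "") last = "never"
            printf "last response: %s; connections since: %d\n", last, pending
        }
    '
}
```

It would be run as `last_good_poll < msgcache.log`. For the lines from 19:04:16 to 19:12:07 above it should report the last response at 2006-08-23 19:05:53 with 5 connections since. A small count is not an error in itself, because not every connection in the log leads to a saved response; a count that keeps growing is the symptom described here.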
Regards, Henrik
Thanks in advance
Greetings
Rolf Masfelder