|
A big problem here is that the stack trace is nonexistent, it comes from a different thread or something. To get the stack trace, you need that of the wrapping exception,
org.smartfrog.sfcore.processcompound.SFProcess.sfSelectTargetProcess(SFProcess.java:573) org.smartfrog.sfcore.processcompound.SFProcess.sfSelectTargetProcess(SFProcess.java:513) org.smartfrog.sfcore.common.ConfigurationAction.selectTargetProcess(ConfigurationAction.java:48) ... It is happening in DefaultRootLocatorImpl, in ProcessCompound getRootProcessCompound(InetAddress hostAddress, int portNum) here: ProcessCompound pc = (ProcessCompound) reg.lookup(defaultName); the problem here is that the reg variable is an instance of Registry, created via LocateRegistry.getRegistry(host, port);, and it is just a proxy of (address,port). There is no attempt to reach the host/port until the RMI call is made, at which point you are at the mercy of RMI and its errors.
I propose that for the ping command, we start by trying to open a tcp connection to the port, catching early instances of NoRouteToHost: network problems, laptop has moved hostname doesnt resolve port connection refused ping can also get into trouble in ActionPing.execute() in
targetC = (Prim) targetP.sfResolveWithParser(name); It is here where the misleading ip address error message is creeping in java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is: java.net.ConnectException: Connection refused more stack trace
Result: * Exception: 'Unable to connect to sfDaemon on: 192.168.1.6. Reason: sfDaemon may not be running on 192.168.1.6java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is: java.net.ConnectException: Connection refused' * StackTrace: ' java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is: java.net.ConnectException: Connection refused at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619) at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216) at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202) at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:128) at org.smartfrog.sfcore.processcompound.ProcessCompoundImpl_Stub.sfPing(Unknown Source) at org.smartfrog.sfcore.common.ActionPing.ping(ActionPing.java:97) at org.smartfrog.sfcore.common.ActionPing.execute(ActionPing.java:57) at org.smartfrog.sfcore.common.ConfigurationAction.execute(ConfigurationAction.java:113) at org.smartfrog.sfcore.common.ConfigurationDescriptor.execute(ConfigurationDescriptor.java:1076) at org.smartfrog.SFSystem.runConfigurationDescriptor(SFSystem.java:367) at org.smartfrog.SFSystem.runConfigurationDescriptor(SFSystem.java:346) at org.smartfrog.SFSystem.runConfigurationDescriptors(SFSystem.java:303) at org.smartfrog.SFSystem.runSmartFrog(SFSystem.java:591) at org.smartfrog.SFSystem.execute(SFSystem.java:434) at org.smartfrog.SFSystem.main(SFSystem.java:395) Caused by: java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:310) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:176) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:163) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:381) at java.net.Socket.connect(Socket.java:537) at java.net.Socket.connect(Socket.java:487) at java.net.Socket.<init>(Socket.java:384) at java.net.Socket.<init>(Socket.java:198) at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40) at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:146) at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613) ... 14 more ' * Command line SFACT: ':PING:::192.168.1.6:' * To String: ', type:PING, url:, host:192.168.1.6, resultType:FAILED, resultExceptionMessage:Connection refused to host: 127.0.0.1; nested exception is: java.net.ConnectException: Connection refused, java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is: , java.net.ConnectException: Connection refused, at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619), at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216), at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202), at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:128), at org.smartfrog.sfcore.processcompound.ProcessCompoundImpl_Stub.sfPing(Unknown Source), at org.smartfrog.sfcore.common.ActionPing.ping(ActionPing.java:97), at org.smartfrog.sfcore.common.ActionPing.execute(ActionPing.java:57), at org.smartfrog.sfcore.common.ConfigurationAction.execute(ConfigurationAction.java:113), at org.smartfrog.sfcore.common.ConfigurationDescriptor.execute(ConfigurationDescriptor.java:1076), at org.smartfrog.SFSystem.runConfigurationDescriptor(SFSystem.java:367), at org.smartfrog.SFSystem.runConfigurationDescriptor(SFSystem.java:346), at org.smartfrog.SFSystem.runConfigurationDescriptors(SFSystem.java:303), at org.smartfrog.SFSystem.runSmartFrog(SFSystem.java:591), at org.smartfrog.SFSystem.execute(SFSystem.java:434), at org.smartfrog.SFSystem.main(SFSystem.java:395), Caused by: java.net.ConnectException: Connection refused, at java.net.PlainSocketImpl.socketConnect(Native Method), at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:310), at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:176), at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:163), at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:381), at java.net.Socket.connect(Socket.java:537), at java.net.Socket.connect(Socket.java:487), at java.net.Socket.<init>(Socket.java:384), at java.net.Socket.<init>(Socket.java:198), at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40), at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:146), at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613), ... 14 more, ' It looks like the root cause here is the remote endpoint is returning invalid remote object references, 127.0.0.1 as the IPAddress,
ProcessCompoundImpl_Stub[UnicastRef [liveRef: [endpoint:[127.0.0.1:53194](remote),objID:[-a1b19d7:11f05149fc4:-7fff, -7266556617222224901]]]] We can talk to the remote registry, but when you try to talk to anything it returns, error with Connection refused to host: 127.0.0.1, the joke being that if you had a local daemon, it would be irrelevant -the reference would be wrong and you are still going to miss pinging the far end. We may be able to add some special diags for this; -When reg.lookup(defaultName) returns a process compound, we do a toString() and look for that 127.0.0.1 value, and warn that there may be trouble here, as the machine doesn't know who it is. No bailout, just a warning, so that when the problem arises elsewhere, the root cause is more obvious sfDeploy isnt happy either
namenode: [deploy] [deploy] SmartFrog 3.17.005dev (2009-01-23 16:27:10 GMT) [deploy] (C) Copyright 1998-2009 Hewlett-Packard Development Company, LP [deploy] [deploy] 2009/01/23 22:01:15:791 GMT [WARN ][main] SFCORE_LOG - SmartFrog security is NOT active [deploy] 2009/01/23 22:01:15:796 GMT [WARN ][main] SFCORE_LOG - Warning: stack trace logging enabled [deploy] 2009/01/23 22:01:15:810 GMT [WARN ][main] SFCORE_LOG - Possible problem with classpath: [deploy] 2 occurrences for sf-anubis [deploy] /home/slo/.ivy/cache/org.smartfrog/sf-anubis/jars/sf-anubis-3.17.005dev.jar [deploy] /home/slo/.ivy/cache/org.smartfrog/sf-anubis/jars/sf-anubis-3.17.005dev.jar [deploy] [deploy] 2009/01/23 22:01:15:817 GMT [WARN ][main] SFCORE_LOG - Possible problem with codebase: 2 occurrences for sf-anubis [deploy] /home/slo/.ivy/cache/org.smartfrog/sf-anubis/jars/sf-anubis-3.17.005dev.jar file [deploy] /home/slo/.ivy/cache/org.smartfrog/sf-anubis/jars/sf-anubis-3.17.005dev.jar file [deploy] [deploy] - FAILED when trying DEPLOY of 'namenode', [/org/smartfrog/extras/hadoop/cluster/services/namenode/default.sf], host:192.168.1.6 [deploy] Result: [deploy] * Exception: 'SmartFrogException:: java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is: [deploy] java.net.ConnectException: Connection refused [deploy] SmartFrog 3.17.005dev (2009-01-23 16:27:10 GMT)' [deploy] * StackTrace: ' [deploy] SmartFrogException:: java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is: [deploy] java.net.ConnectException: Connection refused, SmartFrog 3.17.005dev (2009-01-23 16:27:10 GMT) [deploy] at org.smartfrog.sfcore.common.SmartFrogException.forward(SmartFrogException.java:92) [deploy] at org.smartfrog.sfcore.common.ActionDeploy.Deploy(ActionDeploy.java:192) [deploy] at org.smartfrog.sfcore.common.ActionDeploy.doDeploy(ActionDeploy.java:297) [deploy] at org.smartfrog.sfcore.common.ActionDeploy.execute(ActionDeploy.java:272) [deploy] at org.smartfrog.sfcore.common.ConfigurationAction.execute(ConfigurationAction.java:113) [deploy] at org.smartfrog.sfcore.common.ConfigurationDescriptor.execute(ConfigurationDescriptor.java:1076) [deploy] at org.smartfrog.SFSystem.runConfigurationDescriptor(SFSystem.java:367) [deploy] at org.smartfrog.SFSystem.runConfigurationDescriptor(SFSystem.java:346) [deploy] at org.smartfrog.SFSystem.runConfigurationDescriptors(SFSystem.java:303) [deploy] at org.smartfrog.SFSystem.runSmartFrog(SFSystem.java:591) [deploy] at org.smartfrog.SFSystem.execute(SFSystem.java:434) [deploy] at org.smartfrog.SFSystem.main(SFSystem.java:395) [deploy] Caused by: java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is: [deploy] java.net.ConnectException: Connection refused [deploy] at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:601) [deploy] at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:198) [deploy] at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:184) [deploy] at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:110) [deploy] at org.smartfrog.sfcore.processcompound.ProcessCompoundImpl_Stub.sfDeployComponentDescription(Unknown Source) [deploy] at org.smartfrog.sfcore.common.ActionDeploy.Deploy(ActionDeploy.java:155) [deploy] ... 10 more [deploy] Caused by: java.net.ConnectException: Connection refused [deploy] at java.net.PlainSocketImpl.socketConnect(Native Method) [deploy] at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333) [deploy] at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195) [deploy] at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182) [deploy] at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366) [deploy] at java.net.Socket.connect(Socket.java:519) [deploy] at java.net.Socket.connect(Socket.java:469) [deploy] at java.net.Socket.<init>(Socket.java:366) [deploy] at java.net.Socket.<init>(Socket.java:180) [deploy] at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:22) [deploy] at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:128) [deploy] at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:595) [deploy] ... 15 more [deploy] ALL: SmartFrogException: java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is: [deploy] java.net.ConnectException: Connection refused' [deploy] * Command line SFACT: 'namenode:DEPLOY:'"/org/smartfrog/extras/hadoop/cluster/services/namenode/default.sf"'::192.168.1.6:' [deploy] * To String: ' name:namenode, type:DEPLOY, url:/org/smartfrog/extras/hadoop/cluster/services/namenode/default.sf, host:192.168.1.6, resultType:FAILED, resultExceptionMessage:java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is: [deploy] java.net.ConnectException: Connection refused, SmartFrogException:: java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is: , java.net.ConnectException: Connection refused, SmartFrog 3.17.005dev (2009-01-23 16:27:10 GMT), at org.smartfrog.sfcore.common.SmartFrogException.forward(SmartFrogException.java:92), at org.smartfrog.sfcore.common.ActionDeploy.Deploy(ActionDeploy.java:192), at org.smartfrog.sfcore.common.ActionDeploy.doDeploy(ActionDeploy.java:297), at org.smartfrog.sfcore.common.ActionDeploy.execute(ActionDeploy.java:272), at org.smartfrog.sfcore.common.ConfigurationAction.execute(ConfigurationAction.java:113), at org.smartfrog.sfcore.common.ConfigurationDescriptor.execute(ConfigurationDescriptor.java:1076), at org.smartfrog.SFSystem.runConfigurationDescriptor(SFSystem.java:367), at org.smartfrog.SFSystem.runConfigurationDescriptor(SFSystem.java:346), at org.smartfrog.SFSystem.runConfigurationDescriptors(SFSystem.java:303), at org.smartfrog.SFSystem.runSmartFrog(SFSystem.java:591), at org.smartfrog.SFSystem.execute(SFSystem.java:434), at org.smartfrog.SFSystem.main(SFSystem.java:395), Caused by: java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is: , java.net.ConnectException: Connection refused, at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:601), at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:198), at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:184), at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:110), at org.smartfrog.sfcore.processcompound.ProcessCompoundImpl_Stub.sfDeployComponentDescription(Unknown Source), at org.smartfrog.sfcore.common.ActionDeploy.Deploy(ActionDeploy.java:155), ... 10 more, Caused by: java.net.ConnectException: Connection refused, at java.net.PlainSocketImpl.socketConnect(Native Method), at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333), at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195), at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182), at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366), at java.net.Socket.connect(Socket.java:519), at java.net.Socket.connect(Socket.java:469), at java.net.Socket.<init>(Socket.java:366), at java.net.Socket.<init>(Socket.java:180), at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:22), at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:128), at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:595), ... 15 more, ALL: SmartFrogException: java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is: [deploy] java.net.ConnectException: Connection refused' [deploy] I've committed a patch which works for sfPing, sfManagementConsole, and issues a warning that the problem is on the far end, not locally
sfManagementConsole -h 192.168.1.6 SFHOME undefined, using /home/slo/Projects/SmartFrog/Forge/core/smartfrog/dist as base directory Starting management window... sfManagementConsole connecting to 192.168.1.6:3800 2009/01/23 22:03:03:263 GMT [WARN ][main] SFCORE_LOG - Remote Process Compound for /192.168.1.6 appears to think its IP address is "127.0.0.1" 2009/01/23 22:03:03:268 GMT [WARN ][main] SFCORE_LOG - Communications with components are likely to fail with Connection refused exceptions Error: Connection refused to host: 127.0.0.1; nested exception is: java.net.ConnectException: Connection refused The patch fixes the ant tasks too
[deploy] 2009/01/23 22:35:52:647 GMT [WARN ][main] SFCORE_LOG - Remote Process Compound for /192.168.1.6 appears to think its IP address is "127.0.0.1" [deploy] 2009/01/23 22:35:52:647 GMT [WARN ][main] SFCORE_LOG - Communications with components are likely to fail with Connection refused exceptions [deploy] - FAILED when trying DEPLOY of 'namenode', [/org/smartfrog/extras/hadoop/cluster/services/namenode/default.sf], host:192.168.1.6 [deploy] Result: [deploy] * Exception: 'SmartFrogException:: java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is: [deploy] java.net.ConnectException: Connection refused [deploy] SmartFrog 3.17.005dev (2009-01-23 22:07:29 GMT)' [deploy] * StackTrace: ' [deploy] SmartFrogException:: java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is: It's a bit ugly -looking for strings, but the only alternative is looking for the text "Connection refused to host: 127.0.0.1" in a ConnectException; same problem, only later, and could even internationalise worse. |
|||||||||||||||||||||||||||||||||||||||||||||||||
2009/01/23 18:34:47:728 GMT [WARN ][main] SFCORE_LOG - SmartFrog security is NOT active
- FAILED when trying PING of 'rootProcess', host:192.168.1.6
Result:
* Exception: 'SmartFrogException:: Unable to connect to sfDaemon on: /192.168.1.6.
Reason: sfDaemon may not be authenticated properly
cause: java.rmi.ConnectIOException: Exception creating connection to: 192.168.1.6; nested exception is:
java.net.NoRouteToHostException: No route to host
SmartFrog 3.17.005dev (2009-01-23 16:27:10 GMT)'