
Key: SFOS-970
Type: Bug
Status: Open
Priority: Minor
Assignee: Helge Mahrt
Reporter: Helge Mahrt
Votes: 0
Watchers: 0

SmartFrog

VAST points of failure

Created: 03/Sep/08 03:26 PM (BST)   Updated: 04/Sep/08 02:54 PM (BST)
Component/s: tools_Avalanche
Affects Version/s: None
Fix Version/s: None

Time Tracking:
Not Specified

Compatibility: unknown


Description
Here's a list of points of failure in the environment construction of VAST:

- Sometimes the VMware tools don't come up in the virtual machine. They are required to operate within the guest OS and are used as an indicator of whether the OS has completed booting.
- Sometimes virtual machines get the Network Setup Helper copied but it is not executed so the network setup fails. The problem here appears to be vanishing XMPP messages: The command to execute the helper is sent to the VM but never arrives.
- When igniting a VM using Avalanche, the sfInstaller.vm file is used to create the sequence of SSH commands. First the SmartFrog daemon for the test runner/controller is started, and then the actual runner/controller script is started using sfStart. A sleep command is meant to ensure that the daemon is ready before the script is started. Sometimes the virtual machines are so slow that the daemon is still not ready after the sleep. A less brittle solution has to be found, one which also takes the IP binding of the daemon into account.
- Sometimes, very rarely, I get an SCP error after igniting the blade servers. I can't remember whether it was a timeout or an authentication error. Next time I encounter it I will post the error message.
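
One possible direction for the fixed-sleep problem above is to poll until the daemon accepts connections, with an overall deadline. The following is only a minimal sketch and not what VAST or Avalanche currently does: it assumes the daemon listens on a TCP port, and the function name `wait_for_port` and all parameter names are hypothetical. Probing the specific address the daemon is expected to bind to also covers the IP binding concern, since a daemon bound to a different interface would not answer there.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=120.0, interval_s=2.0, connect_timeout_s=3.0):
    """Poll until a TCP connection to (host, port) succeeds or timeout_s elapses.

    Returns True as soon as one connection attempt succeeds, False otherwise.
    Assumption: an accepted TCP connection is a good enough proxy for
    "daemon is ready"; the real daemon may need a stronger check.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # Connect to the exact address the daemon should be bound to.
            with socket.create_connection((host, port), timeout=connect_timeout_s):
                return True
        except OSError:
            # Refused, unreachable or timed out: the daemon is not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

A caller would run something like `wait_for_port(vm_address, daemon_port)` before issuing sfStart, where `vm_address` and `daemon_port` are placeholders for the values of the actual setup, and treat a `False` result as a failed ignition instead of starting the script anyway.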

Resolved points (which still might be worth looking at in cases of failures):
- When starting the SUT daemon, the test runner will try to acquire its ProcessCompound to see if it is ready.
- After the Network Setup Helper has been executed successfully, VAST tries to ping the corresponding VM. If it's still not reachable after 1 minute, the helper will be copied and executed again. At most 5 attempts are made.
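
The retry behaviour described in the last point can be sketched roughly as follows. This is an illustration under assumptions, not the actual VAST code: `run_helper` and `is_reachable` are hypothetical callables standing in for "copy and execute the Network Setup Helper" and "ping the VM", the poll interval is invented, and it is assumed that the first run counts as one of the 5 attempts.

```python
import time


def setup_network_with_retries(run_helper, is_reachable, max_attempts=5,
                               reachable_timeout_s=60.0, poll_interval_s=5.0,
                               sleep=time.sleep, clock=time.monotonic):
    """Run the helper, then wait up to reachable_timeout_s for the VM to respond.

    Returns the number of the attempt that succeeded (starting at 1),
    or None if the VM was still unreachable after max_attempts.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    for attempt in range(1, max_attempts + 1):
        run_helper()  # hypothetical: copy the helper into the VM and execute it
        deadline = clock() + reachable_timeout_s
        while True:
            if is_reachable():  # hypothetical: a single ping to the VM
                return attempt
            if clock() >= deadline:
                break  # give up on this attempt and run the helper again
            sleep(poll_interval_s)
    return None
```

Returning the attempt number rather than a plain boolean makes it easy to log how often the helper had to be re-run, which may help when investigating failures of this kind.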

Comments
Helge Mahrt added a comment - 04/Sep/08 02:54 PM (BST)
Just now the VMware module, handling the command to copy the helper into the OS, logged into the OS and then hung. I don't know whether that's because of the VIX API or something else.