Issue Details

Key: SFOS-905
Type: Sub-task
Status: Resolved
Resolution: Fixed
Priority: Major
Assignee: Steve Loughran
Reporter: Steve Loughran
Votes: 0
Watchers: 0

SmartFrog
SFOS-780

Move from direct inherited configuration to a "Cluster" CD that defines the cluster

Created: 20/Jun/08 06:57 PM (BST)   Updated: 16/Sep/09 02:50 PM (BST)
Component/s: _service_hadoop
Affects Version/s: 3.17.004
Fix Version/s: 3.17.x

Time Tracking:
Not Specified

Issue Links:
Depends
 
Metabug
 

Compatibility: backwards compatible


Description
Having every part of the Hadoop system defined in the Prim leads to too much duplication. Instead, everything should take a Cluster CD (component description) that could refer to a CD or to a component providing cluster information. This would make workflow operations much easier.

Steve Loughran added a comment - 15/Jan/09 01:06 PM (GMT)
The new HadoopConfiguration component can be a configuration (including one read from the official Hadoop XML files), but there's no need to only bind to one of these, as all we want from the cluster template is a list of (name,value) pairs, and that is exactly what the SF attributes are.

There's a need to catch race-condition problems by ensuring that the HadoopConfiguration is live before the copying is done.

* All the attributes could be copied over, or they could be left as is for lazy evaluation, especially for relative values. That would be safer.
* The HadoopConfiguration component does an early binding.
* We could do checks in ManagedConfiguration, using a list (which could be in the override class) to say what is going on.

Steve Loughran added a comment - 15/Jan/09 01:12 PM (GMT)
It's a lot simpler to not play games with multiple inheritance, and instead have the cluster child be either a LAZY reference to a deployed configuration *or* a CD to actually deploy as a child.

This requires:
1. all HadoopComponents to become workflow compounds
2. the cluster component to get deployed early if it is a CD, and terminated during termination
3. the ManagedConfiguration to get its config from the cluster() data, and not the local Prim

(#3) is going to break existing code/tests.
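
A rough sketch of the ownership rule implied by the three requirements above, written in plain Java rather than against the SmartFrog APIs. Every class and method name here (ClusterBindingSketch, Cluster, HadoopComponent, fromReference, fromDescription) is hypothetical and only models the idea: a cluster that arrives as a reference is shared and left alone at termination, while a cluster that arrives as a description is deployed as a child first and terminated with its parent.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ClusterBindingSketch {

    /** Hypothetical stand-in for a deployed cluster configuration component. */
    public static class Cluster {
        public final Map<String, String> values;
        private boolean live = true;

        public Cluster(Map<String, String> values) {
            this.values = new LinkedHashMap<>(values);
        }

        public boolean isLive() { return live; }

        public void terminate() { live = false; }
    }

    /** Hypothetical component whose cluster is either a reference or a deployed child. */
    public static class HadoopComponent {
        private final Cluster cluster;
        private final boolean ownsCluster;

        private HadoopComponent(Cluster cluster, boolean ownsCluster) {
            this.cluster = cluster;
            this.ownsCluster = ownsCluster;
        }

        /** Bind to a cluster deployed elsewhere; it must already be live. */
        public static HadoopComponent fromReference(Cluster deployed) {
            if (!deployed.isLive()) {
                throw new IllegalStateException("cluster is not live yet");
            }
            return new HadoopComponent(deployed, false);
        }

        /** Deploy the description as a child before any configuration is read. */
        public static HadoopComponent fromDescription(Map<String, String> description) {
            return new HadoopComponent(new Cluster(description), true);
        }

        /** Configuration comes from the cluster data, not from local attributes. */
        public String get(String name) { return cluster.values.get(name); }

        public Cluster cluster() { return cluster; }

        /** Only a cluster deployed as our own child is terminated with us. */
        public void terminate() {
            if (ownsCluster) {
                cluster.terminate();
            }
        }
    }
}
```

The point of the ownsCluster flag is that terminating one client must not take down a cluster that other components still reference.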


Steve Loughran added a comment - 19/Jan/09 02:02 PM (GMT)
There are some usability constraints to consider here. Imagine a client component, such as one that copies files in and out or submits jobs. Such a component wants to take a cluster definition, but then override any value in it with anything set locally.

For example:

CopyHadoopFile extends DfsCopyFileIn {
  src "/tmp/data.gzip";
  dest "/project/analysis.gzip";
  cluster LAZY livecluster;
  dfs.replication.factor 1;
}

Here the replication factor is throttled back. Without that local override, it would have to be something like:


Cluster2 extends livecluster {
  dfs.replication.factor 1;
}

But even that is limited, as the cluster state comes at deploy time, whereas we may want to pick up some other facts from a live, running cluster.

Proposal: DFS client applications will support a cluster reference that provides the basis for their values, but any of these properties can be overridden locally. When the component is started, it copies in all current information from the (deployed) cluster reference, adding it to the local node, except for anything that is already defined there. The config then remains bound to the node for the rest of its life, with changes to the Configuration instance propagating back.
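
The copy-in rule in the proposal can be sketched as a merge where local values always win. This is an illustration in plain Java using maps, not the ManagedConfiguration code; the class and method names (ClusterOverrideSketch, bindToCluster) and the namenode URL are made up for the example, and the propagation of later changes back to the node is not modelled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ClusterOverrideSketch {

    /**
     * Copy every attribute of the cluster into the local node,
     * skipping any name the local node already defines.
     * The local map is modified in place and returned.
     */
    public static Map<String, String> bindToCluster(Map<String, String> local,
                                                    Map<String, String> cluster) {
        for (Map.Entry<String, String> entry : cluster.entrySet()) {
            // putIfAbsent leaves an existing local override untouched
            local.putIfAbsent(entry.getKey(), entry.getValue());
        }
        return local;
    }

    public static void main(String[] args) {
        // Values read from the live cluster at start time (hypothetical values)
        Map<String, String> cluster = new LinkedHashMap<>();
        cluster.put("fs.default.name", "hdfs://namenode:8020");
        cluster.put("dfs.replication.factor", "3");

        // The client throttles the replication factor back locally
        Map<String, String> local = new LinkedHashMap<>();
        local.put("dfs.replication.factor", "1");

        bindToCluster(local, cluster);
        System.out.println(local);
    }
}
```

Because the merge happens when the component starts rather than at deploy time, it picks up whatever the running cluster reports at that moment, which addresses the limitation noted for the Cluster2 approach.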

Steve Loughran added a comment - 20/Jan/09 02:54 PM (GMT)
With the changes to ManagedConfiguration, we can move to this.

Steve Loughran added a comment - 20/Jan/09 04:33 PM (GMT)
There's some fun here with directories; those components that resolve directories to work with will currently pick them up locally, and not go via an (optional) cluster configuration.

Arguably, that's good: different nodes should have different directories. But it will be inconsistent.

Steve Loughran added a comment - 20/Jan/09 04:46 PM (GMT)
Also: need to add logic to pick up a list of required attributes from the different clusters, and use sfResolve to pull them in from any parent.
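
One possible shape for that logic, sketched in plain Java. It does not call sfResolve; the Node class below is a hypothetical model of a component with local attributes and a parent, and pullRequired is an invented helper that copies each required attribute into the node from the nearest ancestor that defines it, reporting the names nobody defines.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class RequiredAttributesSketch {

    /** Hypothetical node: local attributes plus an optional parent. */
    public static class Node {
        public final Map<String, String> attributes = new HashMap<>();
        public final Node parent;

        public Node(Node parent) { this.parent = parent; }

        public Node set(String name, String value) {
            attributes.put(name, value);
            return this;
        }

        /** Look locally first, then walk up through the parents. */
        public Optional<String> resolve(String name) {
            for (Node n = this; n != null; n = n.parent) {
                String value = n.attributes.get(name);
                if (value != null) {
                    return Optional.of(value);
                }
            }
            return Optional.empty();
        }
    }

    /**
     * Pull each required attribute into the target node and
     * return the names that could not be resolved anywhere.
     */
    public static List<String> pullRequired(Node target, List<String> required) {
        List<String> missing = new ArrayList<>();
        for (String name : required) {
            Optional<String> value = target.resolve(name);
            if (value.isPresent()) {
                target.attributes.put(name, value.get());
            } else {
                missing.add(name);
            }
        }
        return missing;
    }
}
```

Returning the missing names instead of throwing on the first one lets a component report every unresolved attribute in a single failure message.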

Steve Loughran added a comment - 16/Sep/09 02:50 PM (BST)
This is done. It was hard work, so marked as Major. There are now cluster-driven components as well as the inline ones, and everything is working.