Distributed Computing in Kepler
STATUS:
The content on this page is outdated. The page is archived for reference only. For more information about current work, please contact the Distributed Execution Interest Group.
Overview
We envision an easy-to-use mechanism for Kepler users to harness the power of multiple computing nodes to perform computations that would be difficult, time-consuming, or impossible on a single node. Ideally, this system should be tightly integrated into the Kepler experience so that scientists need not focus on the mechanistic issues surrounding grid computing and can instead focus on the scientific models they would like to construct and execute.
Such a system would have several important features, such as single sign-on, fine-grained access control, automated and efficient staging of data and models to nodes as needed, efficient scheduling of resources, monitoring of the progress of execution, and job control, among others.
Related documents
- Use cases and requirements
- Peer to Peer design ideas
- Currently implemented distributed functionality in Kepler (note: this is still early alpha functionality, so don't depend on it for anything important)
- Meeting notes: 1 2 3 4 5 6 7
- Distributed Kepler Source (zip file)
- Distributed Execution Source (zargo file)
Design
(Notes from the December 2006 attachments below)
A -> B -> C
source -> distrib -> sink
Assumptions: On initiating host
- A/B/C independent
- B - No dependency between iterations
- B - Output order independent of input order (plan to change? Ordering has a cost; don't do it if unnecessary)
- B - exec time >> startup/transport time
- Targeting PN, SDF, DDF, but B can contain other domains
Assumptions: On remote host
- Kepler engine or agent is already running
- Workflow to be run (MoML) is not already local to the remote host and must be transferred
- Is the Java code already in place?
- All hosts use the same JVM version and Kepler version
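The same-version assumption could be enforced with a simple handshake before any sub-workflow is shipped. A minimal sketch, assuming a hypothetical HostInfo structure reported by the remote agent (none of these names come from the Kepler code base):

    // Hypothetical version-compatibility check run by the master before
    // shipping a sub-workflow to a remote node. Class and field names are
    // illustrative only.
    public class VersionCheck {

        /** Version information a remote agent would report during the handshake. */
        public static final class HostInfo {
            final String javaVersion;   // e.g. "1.5.0_06"
            final String keplerVersion; // e.g. "1.0.0alpha"

            HostInfo(String javaVersion, String keplerVersion) {
                this.javaVersion = javaVersion;
                this.keplerVersion = keplerVersion;
            }
        }

        /** Returns true only if the remote host matches the local JVM and Kepler versions. */
        public static boolean compatible(HostInfo remote, String localKeplerVersion) {
            String localJava = System.getProperty("java.version");
            return localJava.equals(remote.javaVersion)
                    && localKeplerVersion.equals(remote.keplerVersion);
        }

        public static void main(String[] args) {
            HostInfo remote = new HostInfo(System.getProperty("java.version"), "1.0.0alpha");
            System.out.println("compatible: " + compatible(remote, "1.0.0alpha"));
        }
    }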
Process:
- Start up agents - external to Kepler
- Node discovery - config file or p2p
- Select components to distribute from iterate actors in workflow
- Choose/calculate nodes to use
- set max number of nodes
- may not know # of iterations
- propagate B to nodes
- Fire cycle for each token T on B (see the sketch after this list)
- Choose node N (How does local controller decide?)
- Transfer T to N
- Execute B on N
- Transfer the result of B (T-out) to the master node
- Write T-out to output of B on master node
- Clean up / shut down remote workflow nodes (they may stay running for a while; provide a mechanism for that)
- Error handling: no response from a node; node can't compute; node returns an invalid token
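The fire cycle above amounts to a dispatch loop on the master: for each input token, choose a node, ship the token, execute B remotely, and write T-out to B's output locally. A minimal single-threaded sketch under the assumptions listed earlier (RemoteNode and all method names are illustrative, not Kepler APIs):

    import java.util.List;

    // Illustrative fire-cycle loop on the master node. RemoteNode stands in for
    // whatever transport the SlaveController exposes; none of these names are
    // taken from the Kepler code base.
    public class FireCycle {

        /** Minimal stand-in for a remote node that runs the distributed sub-workflow B. */
        interface RemoteNode {
            /** Transfers an input token, executes B remotely, and returns the output token. */
            String execute(String token) throws Exception;
        }

        /** Chooses the next node; a simple round-robin, since iterations are independent. */
        static RemoteNode chooseNode(List<RemoteNode> nodes, int iteration) {
            return nodes.get(iteration % nodes.size());
        }

        /** Dispatches each token T to some node N and writes T-out on the master. */
        static void run(List<String> inputTokens, List<RemoteNode> nodes) {
            int i = 0;
            for (String token : inputTokens) {
                RemoteNode node = chooseNode(nodes, i++);
                try {
                    String result = node.execute(token);    // transfer T, execute B, get T-out
                    System.out.println("T-out: " + result); // write T-out to B's output
                } catch (Exception e) {
                    // Error handling: no response, node can't compute, or invalid token.
                    System.err.println("node failed for token " + token + ": " + e);
                }
            }
        }

        public static void main(String[] args) {
            RemoteNode echoNode = token -> "processed(" + token + ")";
            run(List.of("t1", "t2", "t3"), List.of(echoNode, echoNode));
        }
    }

Because iterations of B are independent and output order does not matter, a real implementation could dispatch tokens to several nodes concurrently rather than blocking on each call, and resend a token to a different node when one times out.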
SlaveController: runs outside Kepler, on the remote host. It can:
- transfer to B
- start workflow
- stop workflow
- report errors
- clean up workflow
- report load metadata
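In Java terms, the SlaveController responsibilities listed above might be captured by an interface along these lines (a sketch only; the method names are assumptions, not the actual Kepler API):

    // Illustrative interface for the remote-side controller described above.
    // Method names are assumptions, not the actual Kepler API.
    public interface SlaveController {

        /** Receives the transferred sub-workflow B (e.g. as a MoML string). */
        void receiveWorkflow(String moml);

        /** Starts the received workflow so it is ready to accept tokens. */
        void startWorkflow();

        /** Stops the running workflow. */
        void stopWorkflow();

        /** Removes the workflow and any staged data from the remote host. */
        void cleanUpWorkflow();

        /** Reports an error back to the MasterController. */
        void reportError(String message);

        /** Reports load metadata (e.g. CPU and memory usage) for scheduling decisions. */
        String reportLoadMetadata();
    }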
MasterController: runs inside Kepler, on the local host. It does the reverse of the above, plus:
- Transfer B
- Set up stubs
- Send tokens
- Receive errors
- can monitor and resend a token if waiting too long
- start and stop remote workflow
- heartbeat
- status
Disallowed: only one fixed port for the slave listener. The SlaveController then tracks which dynamic port is being used by whom, and tracks message IDs.
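The MasterController mirrors the slave-side operations on the local host. A sketch under the same naming assumptions (again, not the actual Kepler API):

    // Illustrative interface for the Kepler-side controller. Method names match
    // the list above but are assumptions, not the actual Kepler API.
    public interface MasterController {

        /** Transfers the sub-workflow B to a chosen remote node. */
        void transferWorkflow(String nodeId, String moml);

        /** Sets up local stub actors that stand in for the distributed components. */
        void setUpStubs();

        /** Sends an input token T to a remote node. */
        void sendToken(String nodeId, Object token);

        /** Receives an error reported by a SlaveController. */
        void receiveError(String nodeId, String message);

        /** Resends a token (possibly to another node) if the original node takes too long. */
        void resendToken(String nodeId, Object token);

        /** Starts or stops the remote workflow on a given node. */
        void startRemoteWorkflow(String nodeId);
        void stopRemoteWorkflow(String nodeId);

        /** Heartbeat and status checks against a remote node. */
        boolean heartbeat(String nodeId);
        String status(String nodeId);
    }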
Lots of Grid-based and peer-to-peer systems exist that could be used as the basis for this design. Here are some possibilities:
- Using JXTA for a peer-to-peer Kepler shared computing network
- Batch submission systems
- OGSA managed job service
- NIMROD/APST
- Condor
Design Notes
Simple Flow
- Start distributed WF on server
- for each client, query if WF components are available
- if not all components are available, do not use client
- if they are available, send the client the WF
- Client runs WF and puts it in the "receiving" state
- server starts data flow and sends data token to client
- client executes WF and returns result
- server accepts result
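On the server side, the simple flow could look roughly like this: query each candidate client for component availability, ship the workflow (MoML) to the capable ones, then stream tokens through them and accept the results. All names here (Client, hasComponents, and so on) are hypothetical:

    import java.util.ArrayList;
    import java.util.List;

    // Server-side sketch of the simple flow: check component availability on each
    // client, send the workflow to the capable ones, then stream tokens through
    // them. The Client interface is hypothetical.
    public class SimpleFlow {

        interface Client {
            /** True if all actor classes used by the workflow are available on the client. */
            boolean hasComponents(List<String> componentClasses);

            /** Sends the workflow (MoML) and puts the client in the "receiving" state. */
            void receiveWorkflow(String moml);

            /** Executes the workflow on one token and returns the result. */
            String process(String token);
        }

        static void run(String moml, List<String> components,
                        List<Client> candidates, List<String> tokens) {
            // Keep only clients that can actually run the workflow.
            List<Client> usable = new ArrayList<>();
            for (Client c : candidates) {
                if (c.hasComponents(components)) {
                    c.receiveWorkflow(moml);
                    usable.add(c);
                }
            }
            if (usable.isEmpty()) {
                return; // no capable clients; nothing to distribute
            }
            // Round-robin the tokens over the usable clients and accept the results.
            for (int i = 0; i < tokens.size(); i++) {
                Client c = usable.get(i % usable.size());
                System.out.println("result: " + c.process(tokens.get(i)));
            }
        }

        public static void main(String[] args) {
            Client local = new Client() {
                public boolean hasComponents(List<String> c) { return true; }
                public void receiveWorkflow(String moml) { /* no-op for the demo */ }
                public String process(String token) { return "done(" + token + ")"; }
            };
            run("<entity/>", List.of("MyActor"), List.of(local), List.of("t1", "t2"));
        }
    }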
Data transfer
- Remote clients accept one token at a time.
- it is up to the WF designer to select the appropriate type of token to send to the clients
- if the token size is huge, a URLToken should be used; if it is small, a DataToken might be used
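The by-value versus by-reference choice might be wrapped in a small helper on the designer's side. DataToken and UrlToken below are simple stand-ins for the token types named in these notes (not Ptolemy II classes), and the size threshold is purely illustrative:

    // Sketch of the choice described above: small payloads travel by value,
    // large payloads travel by reference. UrlToken/DataToken are stand-ins for
    // the token types named in the notes.
    public class TokenChoice {

        /** Marker for anything that can be shipped to a remote client. */
        interface TransferToken { }

        /** Carries the bytes themselves; suitable for small payloads. */
        static class DataToken implements TransferToken {
            final byte[] payload;
            DataToken(byte[] payload) { this.payload = payload; }
        }

        /** Carries only a URL the client can fetch from; suitable for large payloads. */
        static class UrlToken implements TransferToken {
            final String url;
            UrlToken(String url) { this.url = url; }
        }

        /** Hypothetical threshold; a real workflow designer would tune this per workflow. */
        static final int MAX_INLINE_BYTES = 1 << 20; // 1 MiB

        static TransferToken wrap(byte[] payload, String stagedUrl) {
            return payload.length <= MAX_INLINE_BYTES
                    ? new DataToken(payload)
                    : new UrlToken(stagedUrl);
        }

        public static void main(String[] args) {
            System.out.println(wrap(new byte[16], "http://example.org/big.dat")
                    .getClass().getSimpleName());          // DataToken
            System.out.println(wrap(new byte[2 << 20], "http://example.org/big.dat")
                    .getClass().getSimpleName());          // UrlToken
        }
    }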