statguy/Parallel-R-SSH

Independent parallel task execution with R on a HPC

Runs independent parallel R sessions by executing a remote script on the nodes of a high-performance cluster via SSH. The script is currently configured for the Ukko cluster.

Features

  • Task (session) id is appended as the last argument for R.
  • Task ids can be specified flexibly.
  • If there are more tasks than available remote nodes, tasks are queued and nodes are reused as they become free.
  • A pool of available nodes is maintained and unresponsive/troublesome nodes are removed.
  • Number of nodes in the pool can be limited or extra nodes can be reserved for later use.
  • Nodes can be filtered by maximum allowed load and a blacklist of hosts.
  • Output (stdout, stderr) is redirected to log files in remote nodes.
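
The queueing and node recycling described above can be sketched roughly as follows. This is a minimal illustration and not the code in this repository: the function `run_tasks`, the callback `run_on_node` and the `max_failures` parameter are hypothetical names, tasks run one at a time for clarity, and requeueing a failed task is an assumption about the intended behaviour.

```python
from collections import deque


def run_tasks(task_ids, nodes, run_on_node, max_failures=1):
    """Run every task on some node, reusing nodes as they become free.

    run_on_node(node, task_id) is a caller-supplied function (hypothetical)
    that returns True on success and False if the node misbehaved.
    """
    queue = deque(task_ids)              # tasks still waiting for a node
    pool = deque(nodes)                  # nodes currently considered usable
    failures = {node: 0 for node in nodes}
    done = {}                            # task id -> node that completed it

    while queue:
        if not pool:
            raise RuntimeError("no usable nodes left for tasks %s" % list(queue))
        node = pool.popleft()
        task = queue.popleft()
        if run_on_node(node, task):
            done[task] = node
            pool.append(node)            # recycle the node for the next task
        else:
            failures[node] += 1
            queue.appendleft(task)       # assumption: failed tasks are retried
            if failures[node] < max_failures:
                pool.append(node)        # give the node another chance
            # otherwise the node is silently dropped from the pool
    return done
```

With four tasks and two nodes, one of which always fails, every task ends up on the healthy node:

```python
result = run_tasks([1, 2, 3, 5], ["a", "bad"], lambda node, task: node != "bad")
print(result)  # {1: 'a', 2: 'a', 3: 'a', 5: 'a'}
```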

Usage

Runs test.R on the remote hosts with task ids 1, 2, 3 and 5 on two nodes with a maximum load of 10.0:

parallel_r.py -t 1:3,5 -n 2 -l 10.0 -b blacklist.txt -v test.R

The file blacklist.txt lists the excluded hosts, one per line. See parallel_r.py --help for more details.
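
To make the `-t`, `-l` and `-b` arguments concrete, here is a minimal sketch of how a task specification, a blacklist file and a load limit might be interpreted. The helper names (`parse_task_ids`, `read_blacklist`, `filter_nodes`) and the shape of the `loads` mapping are assumptions made for illustration; the actual parsing in parallel_r.py may differ, for example in whether a load equal to the limit is accepted.

```python
def parse_task_ids(spec):
    """Expand a spec such as "1:3,5" into [1, 2, 3, 5] (ranges are inclusive)."""
    ids = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if ":" in part:
            first, last = (int(x) for x in part.split(":", 1))
            if last < first:
                raise ValueError("empty range in task spec: %r" % part)
            ids.extend(range(first, last + 1))
        else:
            ids.append(int(part))
    return ids


def read_blacklist(text):
    """Return the set of host names listed one per line, ignoring blank lines."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def filter_nodes(loads, max_load, blacklist):
    """Keep hosts that are not blacklisted and whose load is at most max_load.

    `loads` maps host name -> load average (hypothetical input shape).
    """
    return sorted(
        host for host, load in loads.items()
        if load <= max_load and host not in blacklist
    )
```

For the command shown above, the task specification expands like this:

```python
print(parse_task_ids("1:3,5"))  # [1, 2, 3, 5]
```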

Kills all your R processes on all remote hosts:

remote_command.py "killall -s SIGKILL R"
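
Running one command on every host amounts to one `ssh` invocation per host. The sketch below shows the idea under stated assumptions: `build_ssh_command` and `run_everywhere` are hypothetical helpers, not functions from remote_command.py, and the OpenSSH options `BatchMode` and `ConnectTimeout` are one reasonable choice for non-interactive use rather than what the script necessarily passes.

```python
import subprocess


def build_ssh_command(host, command, timeout=10):
    """Build the argument list for running `command` on `host` without prompts."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                   # fail instead of asking for a password
        "-o", "ConnectTimeout=%d" % timeout,     # give up on unresponsive hosts
        host,
        command,
    ]


def run_everywhere(hosts, command, runner=subprocess.run):
    """Run `command` on each host in turn and return {host: exit status}.

    `runner` defaults to subprocess.run and can be swapped out for testing.
    """
    results = {}
    for host in hosts:
        proc = runner(build_ssh_command(host, command), capture_output=True, text=True)
        results[host] = proc.returncode
    return results
```

A non-zero exit status for a host is a hint that it was unreachable or that the command failed there, which is the kind of signal a node pool could use to drop troublesome nodes.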

Files on cluster nodes

Because NFS caching can leave the file system on a node out of date, there is a script, git_uptodate.sh, that checks whether the local git repository matches the remote. The script can be run inside a shell script like this:

```
# Stop with a non-zero status if the local repository does not match the remote.
if ! "$exec_path"/git_uptodate.sh; then
  exit 1
fi
```
Run the check from within the repository directory. The variable `$exec_path` points to the parallel tasks directory.

Extending
---------
The Python class in `independent_parallel_tasks.py` can be extended for other HPC clusters;
see `ukko_cluster.py` for an example.

TODO
----
* Allow specifying remote username, port, nice parameter, etc.
* Allow running programs other than R.

FIXME
-----
* Replace popen with thread pooling.

Feedback
--------
Jussi Jousimo, jvj@iki.fi
