Multi-core and Grid processing

Multi-core processing on desktops and laptops

We can process data faster by using all the CPU cores available on the machine, rather than a single core.

Control the number of cores using the xargs and qsub commands
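As a minimal sketch of the xargs side of this, the snippet below spreads eight small tasks over a fixed number of processes. The variable name NCORES and the squaring task are assumptions made for illustration; they do not come from the linked exercise. The -P option is available in GNU and BSD xargs but is not part of POSIX.

```shell
#!/bin/bash
# NCORES is an assumed variable name: the maximum number of parallel processes.
# On Linux, NCORES=$(nproc) would use every available core instead.
NCORES=2

# seq emits one task input per line.
# -n 1 hands one input to each command invocation;
# -P "$NCORES" keeps up to NCORES invocations running at the same time.
# The "_" fills $0 of the child shell, so each input arrives as $1.
# Parallel output arrives in no fixed order, hence the final sort.
squares=$(seq 1 8 | xargs -n 1 -P "$NCORES" bash -c 'echo $(( $1 * $1 ))' _ | sort -n)

echo "$squares"
```

Raising NCORES up to the number of physical cores usually shortens the run for CPU-bound tasks; beyond that the processes only compete with each other.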

Cluster computation under GRAPPOLO

GRAPPOLO, a supercomputer in a super box, is a project conceived and developed by Spatial Ecology to teach cluster computation using a grid engine. We collaborated with Makernow (a fablab in Cornwall) to make it portable. More specifically, GRAPPOLO is a micro cluster computer that replicates the functioning of the biggest cluster computer facility in the southwest UK. It is similar to the Raspberry Pi cluster developed at Southampton University, but it is aimed at teaching Big Data processing for Geographic Information Systems methods rather than raw computation. It is very low cost (~£140), portable, and a perfect replica of an operating system running on a true high performance cluster computer.

Cluster computation using Amazon High Performance Computing (HPC)

We obtained Amazon education grants for accessing, and teaching the use of, Amazon Web Services for High Performance Computing.

 # download the key pair and make it readable by its owner only (ssh rejects keys with looser permissions)
 cd /home/user/Downloads/
 wget http://www.spatial-ecology.net/ost4sem/exercise/keypair.pem
 sudo chmod 400 /home/user/Downloads/keypair.pem
 
 # see http://www.spatial-ecology.net/dokuwiki/doku.php?id=wiki:regListSBarb
 # 1ID student  1 2 3 4
 
 # connect to the instance with X11 forwarding (-X); replace user* with your own user name on the instance
 ssh -X -i /home/user/Downloads/keypair.pem user*@ec2-54-234-116-177.compute-1.amazonaws.com
 
 # transfer a file to the instance
 scp -i /home/user/Downloads/keypair.pem yourfile user*@ec2-54-234-116-177.compute-1.amazonaws.com:/home/user*/

Multi-core processing of R scripts

Examples of processing R scripts using multiple CPUs:

  • Embed an R function in Bash, passing variables from Bash to R: to understand the logic of multi-core processing, here is an example of an R script processed on a single CPU using the EOF (here-document) method.
  • The use of xargs: basic R functions processed on multiple cores using the xargs command.
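The two approaches listed above can be sketched together as follows. This is only an illustration under stated assumptions: the script name square.sh and the squaring task are hypothetical and are not taken from the linked exercises, and the run step is skipped when R is not installed.

```shell
#!/bin/bash
# square.sh is a hypothetical script name used only for this illustration.
# The outer delimiter is quoted ('SCRIPT'), so the file is written verbatim.
# Inside the file, the unquoted EOF lets Bash substitute $n before R reads
# the script from standard input.
cat > square.sh <<'SCRIPT'
#!/bin/bash
n="$1"
R --vanilla --slave <<EOF
n <- $n
cat(n^2, "\n")
EOF
SCRIPT
chmod +x square.sh

if command -v R >/dev/null 2>&1; then
    # Single CPU: one R process, one value passed from Bash to R.
    ./square.sh 3
    # Multiple cores: xargs starts up to 2 R processes at a time (-P 2),
    # each receiving one value (-n 1). Output order is not guaranteed.
    seq 1 4 | xargs -n 1 -P 2 ./square.sh
else
    echo "R is not installed; skipping the run" >&2
fi
```

Each xargs invocation starts a separate R session, so this pattern pays off when a single task takes much longer than R's startup time.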



Cloud computation using MS Azure

Cloud computing is the ability to run a program or application on many connected computers at the same time over the internet.
We can perform cloud computation by uploading one or more virtual machines (VMs) to a server and accessing them through the secure shell (ssh) protocol.

 ssh -X nuvola1P@nuvola1p.cloudapp.net

This performs less well than cluster computation on a system such as Sun Grid Engine. The advantage over cluster computation is that you can rent computational hardware sized to your needs, for a limited time, to perform a specific task.

wiki/cluster_processing.txt · Last modified: 2016/07/01 04:20 (external edit)