Monday 1 April 2019

HPC Programming

Introduction


High Performance Computing (HPC) most generally refers to the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a typical desktop computer or workstation, in order to solve large problems in science, engineering, or business. This post describes a simple experiment that serves as a guide to HPC programming.

Requirements

1). An NVIDIA Quadro graphics card. The card used in this document is a Quadro K5000.
2). The CUDA toolkit, downloaded from: https://developer.nvidia.com/cuda-toolkit-70
3). The Visual Studio IDE.

Introduction to CUDA

CUDA is a parallel computing platform and programming model invented by NVIDIA. It enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU).

Test installation

Open a command prompt and type nvcc -V. It should print the version of the toolkit that has been installed.


Confirm that the installation works by navigating to:
C:\ProgramData\NVIDIA Corporation\CUDA Samples\v9.2\bin\win64\Debug
and running deviceQuery, which should list the detected GPU and its properties.


Developing using CUDA toolkit


1). Open the Visual Studio IDE (I am using Visual Studio 2015).
The CUDA Samples directory contains the solutions for the samples that are installed with the toolkit:



2). Open the solution that matches the Visual Studio version installed on your workstation (Samples_vs2015.VC).
3). Use Build to compile the solution as shown.

  
4). After compiling, go to the Debug directory of the solution to run the sample, i.e.

C:\Documents and Settings\All Users\NVIDIA Corporation\CUDA Samples\v9.2\0_Simple\matrixMul\x64\Debug

(I am trying the matrix multiplication sample.)

5). Use the command prompt (Start menu > CMD) to cd to the location of matrixMul.cu in the Debug directory.

6). Compile the CUDA file to generate the executable (a.exe).

The syntax to compile is:

nvcc "C:\Documents and Settings\All Users\NVIDIA Corporation\CUDA Samples\v9.2\0_Simple\matrixMul\x64\Debug\matrixMul.cu"

7). This creates the file a.exe.

Run a.exe to get the output as shown:

8). In this case it shows my graphics card, the Quadro K5000, and the computation done by the GPU.

Task One - Write CUDA code to read the computation power:

Steps
1). Open Visual Studio > New Project > CUDA Project > enter the name of the project.
2). Type the code in the .cu file as below:
  
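A minimal sketch of such a program, assuming that "computation power" means the device properties the CUDA runtime reports (name, compute capability, multiprocessor count, memory). The function name printDeviceSummary is hypothetical and chosen only for this illustration; the runtime calls are cudaGetDeviceCount and cudaGetDeviceProperties.

```cuda
// Sketch only: lists each CUDA device and a few of its properties.
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical helper: prints a short summary for every visible device.
// Returns the number of devices, or -1 if a runtime call fails.
int printDeviceSummary(void) {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return -1;
    }
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            printf("cudaGetDeviceProperties failed: %s\n", cudaGetErrorString(err));
            return -1;
        }
        printf("Device %d: %s\n", dev, prop.name);
        printf("  Compute capability : %d.%d\n", prop.major, prop.minor);
        printf("  Multiprocessors    : %d\n", prop.multiProcessorCount);
        printf("  Global memory (MB) : %.0f\n", prop.totalGlobalMem / (1024.0 * 1024.0));
        printf("  Max threads/block  : %d\n", prop.maxThreadsPerBlock);
    }
    return count;
}

int main(void) {
    // Exit code 1 signals that the runtime could not be queried.
    return printDeviceSummary() < 0 ? 1 : 0;
}
```

The values printed depend on the card installed, so no sample output is given here.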

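Errors encountered:
> nvcc -o a.out matrixMul.cu

1). nvcc fatal : Cannot find compiler 'cl.exe' in PATH
Solution: add the directory that contains cl.exe to the Windows PATH environment variable.

cl.exe is the Microsoft C/C++ compiler that is installed with Visual Studio. On Windows, nvcc uses it as the host compiler; on Linux, nvcc uses gcc/g++ instead.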

Programming using the CUDA toolkit

1). Start Visual Studio 2015.
2). Create a new project.
3). Select CUDA 9.2.
4). This creates a ".cu" file, where you write the GPU-specific code.


Quick start on a simple hello world to execute on GPU


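// includes, system
#include <stdio.h>
// includes CUDA Runtime
#include <cuda_runtime.h>

// empty kernel: compiled for, and executed on, the GPU
__global__
void kernel(void) {
}

int main(void) {
    // launch the kernel on the GPU with 1 block of 1 thread
    kernel<<<1, 1>>>();
    // printed by the host, not by the GPU
    printf("Hello, World!\n");
    // wait for a key press so the console window stays open
    getchar();
    return 0;
}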

The CUDA C keyword __global__ indicates a function that runs on the device and is called from host code.
nvcc splits the source file into host and device components: NVIDIA's compiler handles device functions like kernel(), and the standard host compiler handles host functions like main().

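To make the host/device split visible, here is a small sketch in which the kernel itself prints from the GPU. The names helloFromDevice and launchHello are hypothetical, and the launch configuration of 1 block with 4 threads is an arbitrary choice for illustration. The host waits with cudaDeviceSynchronize() so that the device-side printf output is flushed before the program exits.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Device code: compiled by NVIDIA's compiler and executed on the GPU.
__global__ void helloFromDevice(void) {
    printf("Hello from the GPU, thread %d\n", threadIdx.x);
}

// Host code (hypothetical helper): launches the kernel and waits for it.
// Returns cudaSuccess if both the launch and the execution succeeded.
cudaError_t launchHello(void) {
    helloFromDevice<<<1, 4>>>();           // 1 block of 4 threads
    cudaError_t err = cudaGetLastError();  // catches launch errors
    if (err != cudaSuccess) {
        return err;
    }
    return cudaDeviceSynchronize();        // waits for the kernel to finish
}

int main(void) {
    cudaError_t err = launchHello();
    if (err != cudaSuccess) {
        printf("Kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Hello from the host\n");
    return 0;
}
```

Each of the four GPU threads prints its own line, followed by the line printed by the host.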

Memory management on CUDA programs

Host and device memory are distinct entities:
- Device pointers point to GPU memory. They may be passed to and from host code, but may not be dereferenced from host code.
- Host pointers point to CPU memory. They may be passed to and from device code, but may not be dereferenced from device code.

CUDA API for dealing with device memory :
cudaMalloc(), cudaFree(), cudaMemcpy()

I will illustrate with the simple addition code below:

// includes, system
#include <stdio.h>
// includes CUDA Runtime
#include <cuda_runtime.h>

__global__
void add(int *a, int *b, int *c) {
    *c = *a + *b;
}

int main(void) {
    // host copies of a, b, c
    int a, b, c;
    // device copies of a, b, c
    int *dev_a, *dev_b, *dev_c;
    // we need space for an integer
    int size = sizeof(int);
    // allocate device copies of a, b, c
    cudaMalloc((void**)&dev_a, size);
    cudaMalloc((void**)&dev_b, size);
    cudaMalloc((void**)&dev_c, size);
    a = 2;
    b = 7;
    // copy inputs to device
    cudaMemcpy(dev_a, &a, size, cudaMemcpyHostToDevice);
    cudaMemcpy(dev_b, &b, size, cudaMemcpyHostToDevice);
    // launch add() kernel on GPU, passing parameters
    add<<<1, 1>>>(dev_a, dev_b, dev_c);
    // copy device result back to host copy of c
    cudaMemcpy(&c, dev_c, size, cudaMemcpyDeviceToHost);
    // release the device memory
    cudaFree(dev_a);
    cudaFree(dev_b);
    cudaFree(dev_c);
    printf("Result is \n %d " , c);
    getchar();
    return 0;
}

The program prints the result, 9.

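The addition example ignores the values returned by cudaMalloc() and cudaMemcpy(), so a failed allocation or copy would go unnoticed. The sketch below shows one possible way to check them. The helper name addOnDevice is hypothetical and this is only one of several reasonable error-handling styles, not the only correct one.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

__global__ void add(const int *a, const int *b, int *c) {
    *c = *a + *b;
}

// Hypothetical helper: adds two integers on the GPU and checks every
// runtime call. Each step runs only if all previous steps succeeded.
cudaError_t addOnDevice(int a, int b, int *result) {
    int *dev_a = NULL, *dev_b = NULL, *dev_c = NULL;
    const size_t size = sizeof(int);

    cudaError_t err = cudaMalloc((void**)&dev_a, size);
    if (err == cudaSuccess) err = cudaMalloc((void**)&dev_b, size);
    if (err == cudaSuccess) err = cudaMalloc((void**)&dev_c, size);
    if (err == cudaSuccess) err = cudaMemcpy(dev_a, &a, size, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(dev_b, &b, size, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        add<<<1, 1>>>(dev_a, dev_b, dev_c);
        err = cudaGetLastError();  // catches launch errors
    }
    if (err == cudaSuccess) err = cudaMemcpy(result, dev_c, size, cudaMemcpyDeviceToHost);

    // cudaFree() does nothing for a null pointer, so these calls are
    // safe even if one of the allocations failed.
    cudaFree(dev_a);
    cudaFree(dev_b);
    cudaFree(dev_c);
    return err;
}

int main(void) {
    int c = 0;
    cudaError_t err = addOnDevice(2, 7, &c);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Result is %d\n", c);
    return 0;
}
```

The final cudaMemcpy() back to the host waits for the kernel to finish, so an error raised while the kernel runs is also reported through the returned value.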

Parallel programming on CUDA programs
To be done 

3ds Max Rendering

Introduction

This test aims to make use of the GPU when rendering animation. An NVIDIA Tesla K20Xm will be used.

Test 1.

Rendering a car animation of 151 frames on a dx360 M4 server (20 CPUs, 65536 MB RAM) with an NVIDIA Tesla K20Xm.


a). The normal 3ds Max render time is as shown below:


  

The processor statistics of one of the dx360 M4 server nodes are as shown below (rendering via 3ds Max):



The time taken to render the above is 58 minutes, and the resulting file is a 17.3 MB video file.

b). Deadline rendering of the same animation.

The same animation will now be submitted to the render farm with four slaves.
All the slaves are as shown in the picture below:

The submitted 3ds Max job is as shown:

  
NB: Only two nodes are being scheduled, due to licensing.

Configuring the render farm to use the NVIDIA Tesla K20Xm

Below are the screenshots of the configuration:
1). Ensure you are in super user mode.


2). Right-click the slave and open its properties, then go to CPU Affinity and unselect all CPUs.



3). Go to GPU Affinity and select the GPUs of that slave that you want to use for rendering, as shown below:

  
For comparison with the normal CPU render, below is a screenshot of the processor on one of the M4 server nodes.

The time taken to render the script is as shown below:

To be done.

The settings used to submit the 3ds Max file are as shown below:






c). Application of CUDA programming on the render farm

In the next blog post!

Tuesday 27 November 2018

Version control system - Setup and explanation


Subversion Server

Introduction

Subversion is a version control system that keeps track of changes made to files and folders (directories), facilitating data recovery and providing a history of the changes that have been made over time.

What is a version control system?

A Version Control System (VCS) is software that helps software developers work together and maintain a complete history of their work.
The goals of a Version Control System are to:
  • Allow developers to work simultaneously.
  • Prevent developers from overwriting each other's changes.
  • Maintain a history of every version of everything.
VCSs fall into two categories:
  • Centralized Version Control System (CVCS), and
  • Distributed/Decentralized Version Control System (DVCS).
We will concentrate only on centralized version control, and especially on Subversion. Subversion is a centralized version control system, meaning that it uses a central server to store all files and to enable team collaboration.

Version Control Terminologies

Explanation of common terms in version control system:
  • Repository: A repository is the heart of any version control system. It is the central place where developers store all their work. The repository stores not only files but also their history. It is accessed over a network, with the repository acting as a server and the version control tool acting as a client. Clients can connect to the repository and store their changes in it or retrieve changes from it. By storing changes, a client makes them available to other people; by retrieving changes, a client brings other people's changes into its working copy.
  • Trunk: The trunk is a directory where all the main development happens and is usually checked out by developers to work on the project.
  • Tags: The tags directory is used to store named snapshots of the project. The tag operation lets you give a descriptive and memorable name to a specific version in the repository.
    For example, LAST_STABLE_CODE_BEFORE_EMAIL_SUPPORT is more memorable than
    Repository UUID: 7ceef8cb-3799-40dd-a067-c216ec2e5247 and
    Revision: 13

  • Branches: Branch operation is used to create another line of development. It is useful when you want your development process to fork off into two different directions. For example, when you release version 5.0, you might want to create a branch so that development of 6.0 features can be kept separate from 5.0 bug-fixes.
  • Working copy: A working copy is a snapshot of the repository. The repository is shared by all the teams, but people do not modify it directly. Instead, each developer checks out a working copy. The working copy is a private workplace where developers can do their work in isolation from the rest of the team.
  • Commit changes: Commit is the process of storing changes from the private workplace to the central server. After a commit, the changes are available to the whole team. Other developers can retrieve these changes by updating their working copies. Commit is an atomic operation: either the whole commit succeeds or it is rolled back. Users never see a half-finished commit.

Setup on Personal PC

Requirements:

Five items are needed:
  1. Private key, sent along with this document.
  2. Pageant software
  3. Putty
  4. GIT
  5. Tortoise GIT
The installers for the above software have straightforward wizards; follow them through.

Configuration.

Enter the settings below in Putty:
                     Hostname: 192.168.1.***
                     Port: 22
  1. Then navigate to Connection > SSH > Auth.
  2. Browse for the private key issued with this document and set it under the field Private key authentication.
  3. Run the Pageant software and add the same key to it.

Cloning a Repository on the developer server

  1. Create an empty folder anywhere on your computer.
  2. Right-click the folder and choose "SVN Checkout" from the context menu.
  3. Enter the URL of your repository as svn+ssh://Administrator@MySVNConnection/Order, where:
  • Administrator: the user login on the server
  • MySVNConnection: the session name as you have set it in your Putty software
  • Order: the name of your project in the developers' repository

Committing changes done to the Repository on the Developers Server

1). Right-click on the project, then use SVN Commit -> master as shown below.

This gives you the screen shown below. Enter your comments, select all unversioned files, then use the option Commit & Push as shown below:


The same step is used when updating your project from SVN:
right-click and use SVN Update instead of SVN Commit.
If you commit a file that a different programmer has also updated, conflicts will be shown. You will need to resolve them so that you all have the same version on your local PCs, as shown:


Click OK, then update.
The next screen shows the new and the old changes; choose which to commit.



After choosing the correct changes to update your file, use the button Mark as Resolved. Then save and close the screen. Finally, commit.
 

Adding a repository for your project.

1). Log in to the developer server.
2). Navigate to the location of the repository (C:\Repository.git).
3). Right-click > Tortoise SVN > Repo-browser.
4). Right-click inside the browser to add either an existing project or a new project, as shown below:



Added projects will be available for GIT clone using the URL:
svn+ssh://Administrator@MySVNConnection/{your added project on the repository}