Posted
Filed under Computer/HPC

kgt

Kage Engineer Tools. The original code is being moved to GitHub. It is released under the GPL (open source).

Install:

git clone https://github.com/kagepark/kgt.git

2019/09/30 11:44
Posted
Filed under Computer/HPC
KxCAT is based on xCAT, an open-source cluster management tool from IBM (https://sourceforge.net/projects/xcat/). It is simply a friendlier interface to the xCAT commands, written in BASH shell script. The xCAT commands used to be hard for me, so in 2012 I started writing scripts to install xCAT and wrap some of its commands for my own use. By now it covers many xCAT functions and reduces an engineer's chance of making mistakes with xCAT commands or procedures. If you want hands-on experience with an HPC system, it will also help you see how HPC works and what it is. It has been tested on a CentOS 7.4 base; it supports CentOS 7.x and both diskless and diskful compute nodes.
Wiki site: https://github.com/kagepark/kxcat/wiki

2019/09/30 11:41
Posted
Filed under Computer/HPC
* Ceph (server environment)
- Physical storage clustering
- Only one access point (a mount point on one node) can access a given storage device
- useful
  - increases I/O performance compared to a single server's local I/O
  - serves as the storage device for an HA server
  - provides one storage device (RBD image) per mount point out of a storage pool
  - does NOT support a single storage device (RBD image) on multiple mount points (no multi-node access)
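As an illustration of the single-access-point model, a minimal sketch with an RBD image (pool and image names are made-up examples, and the commands assume a working Ceph cluster):

```shell
# create a 10 GB RBD image in pool "rbdpool" (names are hypothetical)
rbd create rbdpool/vol01 --size 10240
# map it on ONE node only -- this node becomes the single access point
rbd map rbdpool/vol01            # e.g. appears as /dev/rbd0
mkfs.xfs /dev/rbd0
mount /dev/rbd0 /mnt/vol01
# mounting the same image read-write on a second node is not supported;
# a shared filesystem (Lustre/BeeGFS/GlusterFS below) is needed for that
```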

* Lustre (HPC computing environment)
- http://lustre.org/
- https://downloads.hpdd.intel.com/public/lustre/
- Physical storage clustering
- Supports TCP/IP (Ethernet / IP over IB)
- Multiple access points (a mount point on multiple nodes) can access the same storage device
- useful
  - increases I/O performance
  - provides a single storage device to a mount point on multiple hosts (similar to an NFS mount)

* BeeGFS (HPC computing environment)
- https://www.beegfs.io/
- Physical storage clustering
- Supports TCP/IP and InfiniBand
- Multiple access points (a mount point on multiple nodes) can access the same storage device
- useful
  - increases I/O performance
  - provides a single storage device to a mount point on multiple hosts (similar to an NFS mount)

* GlusterFS (HPC computing environment)
- https://www.gluster.org/
- Physical storage clustering
- Supports TCP/IP, InfiniBand, and the Sockets Direct Protocol
- Multiple access points (a mount point on multiple nodes) can access the same storage device
- useful
  - increases I/O performance
  - provides a single storage device to a mount point on multiple hosts (similar to an NFS mount)
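
Unlike the Ceph/RBD case, these three shared filesystems let every node mount the same storage, much like NFS. A hedged sketch of the client-side mounts (server names, filesystem/volume names, and mount points are all hypothetical):

```shell
# Lustre: mount filesystem "lfs01" served by MGS node "mgs01" over TCP
mount -t lustre mgs01@tcp:/lfs01 /mnt/lustre

# BeeGFS: client mounts are normally listed in beegfs-mounts.conf, e.g.
#   /mnt/beegfs /etc/beegfs/beegfs-client.conf
# then started with:  systemctl start beegfs-client

# GlusterFS: mount volume "gv0" from server "gs01"
mount -t glusterfs gs01:/gv0 /mnt/gluster

# the same mount can be done on every client node -- all of them
# see one shared storage device at their mount point
```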
2018/03/24 07:58
Posted
Filed under Computer/HPC
requirements :
   an MPI program and environment on the system.
   the attached file is based on the MPI (MVAPICH/Open MPI) shipped with OFED.

download hpl-2.0 & ATLAS 3.8.3
http://www.netlib.org/benchmark/hpl/
http://sourceforge.net/projects/math-atlas/files/Stable/3.8.3/


make a temporary directory for the compile/setup
$ mkdir -p /kage
copy the hpl & atlas tarballs to /kage

extract ATLAS
$ tar zxvf atlas-xxxx.tar.gz
$ cd ATLAS

create configuration file
$ vi opt.conf
--------------------------------------------------------------------------------------------------------------
#http://math-atlas.sourceforge.net/atlas_install/
arch=Linux_Xeon_SSE2
mkdir -p $arch
cd $arch
../configure -b 64 -D -c -DPentiumCPS=240 --prefix=/kage/hpl/atlas

#../configure -b 32 \              # the BCCD only supports 32-bit
#   -t -1 \                        # -1 tells ATLAS to autodetect the thread count
#   -Si cputhrchk 0 \              # do not check for CPU throttling
#   --prefix=$HOME/hpl/atlas \     # could be anywhere, but note this path
#   --nof77 \                      # don't worry about FORTRAN
#   --cc=/usr/bin/gcc \            # use gcc
#   -C ic /usr/bin/gcc             # really, use gcc (see docs for explanation)


mkdir -p /kage/hpl/atlas
mkdir -p /kage/hpl/lib/atlas
make build
make check
make time
make install
--------------------------------------------------------------------------------------------------------------

compile ATLAS
$ sh opt.conf

extract hpl-2.0
$ tar zxvf hpl-2.0.xxx.tar.gz
$ cd hpl-2.0

create configuration file
$ vi setup/Make.Linux_ATHLON_CBLAS
--------------------------------------------------------------------------------------------------------------
SHELL        = /bin/sh
CD             = cd
CP              = cp
LN_S          = ln -s
MKDIR        = mkdir
RM             = /bin/rm -f
TOUCH       = touch
HOME        = /kage
ARCH         = Linux_ATHLON_CBLAS
TOPdir       = $(HOME)/hpl-2.0
INCdir       = $(TOPdir)/include
BINdir       = $(TOPdir)/bin/$(ARCH)
LIBdir       = $(TOPdir)/lib/$(ARCH)
HPLlib       = $(LIBdir)/libhpl.a

#MPdir        = /usr/local/mpi
#MPdir        = /usr/mpi/gcc/mvapich-1.2.0
MPdir        = /usr/mpi/gcc/openmpi-1.4.1
MPinc        = -I$(MPdir)/include
#MPlib        = $(MPdir)/lib/libmpich.a
MPlib        = $(MPdir)/lib64/libmpi.so
#MPlib        = $(MPdir)/lib64/libvt.mpi.a
#LAdir        = $(HOME)/netlib/ARCHIVES/Linux_ATHLON
LAdir        = $(HOME)/hpl/atlas
LAinc        =
LAlib        = $(LAdir)/lib/libcblas.a $(LAdir)/lib/libatlas.a
F2CDEFS      =

# ----------------------------------------------------------------------
# - HPL includes / libraries / specifics
# ----------------------------------------------------------------------
HPL_INCLUDES = -I$(INCdir) -I$(INCdir)/$(ARCH) $(LAinc) $(MPinc)
HPL_LIBS     = $(HPLlib) $(LAlib) $(MPlib)
HPL_OPTS     = -DHPL_CALL_CBLAS
HPL_DEFS     = $(F2CDEFS) $(HPL_OPTS) $(HPL_INCLUDES)

# ----------------------------------------------------------------------
# - Compilers / linkers - Optimization flags
# ----------------------------------------------------------------------
CC           = /usr/bin/gcc
CCNOOPT      = $(HPL_DEFS)
CCFLAGS      = $(HPL_DEFS) -fomit-frame-pointer -O3 -funroll-loops -W -Wall
LINKER       = /usr/bin/gcc
LINKFLAGS    = $(CCFLAGS)
ARCHIVER     = ar
ARFLAGS      = r
RANLIB       = echo
--------------------------------------------------------------------------------------------------------------

$ ln -s setup/Make.Linux_ATHLON_CBLAS .
$ vi opt.conf
--------------------------------------------------------------------------------------------------------------
#make arch=Linux_ATHLON_CBLAS clean
make arch=Linux_ATHLON_CBLAS
--------------------------------------------------------------------------------------------------------------

compile
$ sh opt.conf

test run HPL with 4 processes on localhost
$ cd /kage/hpl-2.0/bin/Linux_ATHLON_CBLAS
$ vi hostlist
--------------------------------------------------------------------------------------------------------------
localhost
localhost
localhost
localhost
--------------------------------------------------------------------------------------------------------------
$ /usr/mpi/gcc/openmpi-1.4.1/bin/mpirun -np 4 -machinefile ./hostlist ./xhpl >& hpl.out


check output
$ tail -f hpl.out

(... output truncated ...)
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR00R2R4          35     4     4     1               0.00              2.112e-01
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)=        0.0217524 ...... PASSED
================================================================================

Finished    864 tests with the following results:
            864 tests completed and passed residual checks,
              0 tests completed and failed residual checks,
              0 tests skipped because of illegal input values.
--------------------------------------------------------------------------------

End of Tests.
================================================================================

It comes to about 0.21 GFlops on unoptimized hardware with unoptimized HPL input data.
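
For reference, HPL derives its Gflops figure from the problem size N and the wall time: flops ≈ (2/3)N³ + (3/2)N² for the LU factorization plus the solve. A quick sanity check with awk (the N and T values below are made-up examples, not from the run above):

```shell
# estimate HPL Gflops from problem size N and wall time T (seconds)
N=10000
T=120
awk -v n="$N" -v t="$T" \
    'BEGIN { printf "%.3f GFlops\n", (2.0/3.0*n*n*n + 1.5*n*n) / t / 1e9 }'
# → 5.557 GFlops
```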





If you want a full hardware stress test:
$ vi stress.sh
--------------------------------------------------------------------------------------------------------------
MPI_BIN=/usr/mpi/gcc/openmpi-1.4.1/bin
[ -f hostlist ] && rm -f hostlist
# one hostlist entry per logical CPU
for i in $(seq 1 $(grep -c "^processor" /proc/cpuinfo)); do
   hostname >> hostlist
done
${MPI_BIN}/mpirun -np $(wc -l < hostlist) -machinefile hostlist ./xhpl
--------------------------------------------------------------------------------------------------------------


If you want an install script, download the attached file below
and modify the first few lines for your PATHs.
Run it like "sh hpl_install.sh" and it will install HPL automatically.


quick procedure)
1. download hpl-2.0.tar.gz
2. download atlas3.8.3.tar.gz
3. download the attached kage_hpl-2.0.tgz above
4. make a temporary directory
5. copy the 3 files into that directory
6. modify the first few lines of hpl_install.sh for your paths
7. run hpl_install.sh
8. go to the hpl/bin directory
9. create input data with "./configure.sh"
10. run HPL with "./run.sh"
11. check the result


*)
If the run reports a problem with the number of CPUs, check the CPU count in:
 1. configure.sh
    CPU=
 2. run.sh
    NP=

If "CPU=" is set to a number, set "NP=" to the same number.
If "CPU=" is left empty, leave "NP=" empty as well.
Choose an even number for "CPU=".
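
The rule above can be scripted; a minimal sketch, assuming the CPU=/NP= variable names from configure.sh and run.sh (nproc comes from coreutils):

```shell
#!/bin/sh
# pick an even CPU count and keep NP in sync with it,
# per the CPU=/NP= rule described above
CPU=$(nproc)             # logical CPU count on this host
CPU=$(( CPU - CPU % 2 )) # round down to an even number
NP=$CPU                  # NP must always match CPU
echo "CPU=$CPU NP=$NP"
```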

2011/04/05 05:45
Posted
Filed under Computer/HPC
Altair
Download license daemon & PBS file

license daemon : altair_licensing_11.0.linux_x64.bin
PBS : PBSPro_11.0.0-RHEL5_x86_64.tar.gz

Install

1. Create license :
  type : LM-X

2. Install the license daemon on the license server:
# sh altair_licensing_11.0.linux_x64.bin

start daemon
# chkconfig --level 35 altairlmxd on
# /etc/init.d/altairlmxd start

check license
# ps -ef |grep lm
root      3863     1  0 09:57 ?        00:00:00 /opt/pbs/licensing11.0/bin/lmx-serv-altair -b -c /opt/pbs/licensing11.0/altair-serv.cfg

debug
# tail -f /opt/pbs/licensing11.0/logs/<hostname>.log



3. Install the PBS server on the front-end server
required daemons : pbs_sched, pbs_server.bin, postgres

# useradd altair
# tar zxvf PBSPro_11.0.0-RHEL5_x86_64.tar.gz
# cd PBSPro_XXXX
# ./INSTALL
***
Execution directory? [/opt/pbs/11.0.0.103450] <enter>
***
Home directory? [/var/spool/PBS] <enter>
***
PBS Installation:
       1. Server, execution and commands  <= front end server
       2. Execution only   <= compute node
       3. Commands only  <= Just run command node (not submit)
(1|2|3)?1 <enter>
PBS Professional version 9.0 and later is licensed
via the Altair License Manager.

The Altair License Manager can be downloaded from:
http://www.pbspro.com/UserArea/Software/

For more information, please refer to the PBS
Professional Administrator's Guide, or contact pbssupport@altair.com.

Continue with the installation ([y]|n)?  <enter>
Please enter the list of Altair License file location(s)
in a colon-separated list of entries of the form
       <port>@<host>
       @<host>
       <license file path>

Examples:
               7788@fest
               7788@tokyo:7788@madrid:7788@rio
               @perikles:27000@aspasia
               @127.3.4.5
               /usr/local/altair/security/altair_lic.dat
Enter License File Location(s):@pbs_license_server  <enter>
***
Switch to the new version of PBS (y/n)?y <enter>
***
Would you like to start PBS now (y|[n])?n <enter>
***

# vi /etc/pbs.conf
-------------------------------------
PBS_EXEC=/opt/pbs/default
PBS_HOME=/var/spool/PBS
PBS_START_SERVER=1
PBS_START_MOM=0              <== change from 1 to 0
PBS_START_SCHED=1
PBS_SERVER=home
PBS_DATA_SERVICE_USER=altair
-------------------------------------

start daemon
# chkconfig --level 35 pbs on
# /etc/init.d/pbs start

check log
# tail -f /var/spool/PBS/server_logs/<date>


4. Install PBS on the compute nodes
required daemon : pbs_mom
remote shell : default (rsh, rcp), also available (ssh, scp)
* this guide uses ssh for the remote shell

# vi quick
---------------------------------------------
<enter>
<enter>
2 <enter>
y
<server host name> <enter>
y <enter>
y <enter>
n <enter>
----------------------------------------------
# ./INSTALL < quick

# vi /var/spool/PBS/pbs_environment
------------------------------------------------
TZ=America/Chicago
PATH=/bin:/usr/bin
PBS_RSHCOMMAND=ssh     <== add this line (optional).
------------------------------------------------
or
------------------------------------------------
TZ=America/Chicago
PATH=/bin:/usr/bin
------------------------------------------------


# vi /etc/pbs.conf
------------------------------------------------
PBS_EXEC=/opt/pbs/default
PBS_HOME=/var/spool/PBS
PBS_START_SERVER=0
PBS_START_MOM=1
PBS_START_SCHED=0
PBS_SERVER=<pbs server hostname>
PBS_SCP=/usr/bin/scp            <== add this line (must)
------------------------------------------------

# chkconfig --level 35 pbs on
# /etc/init.d/pbs start

debug
# tail -f /var/spool/PBS/mom_logs/<date>


6. test

$ echo "sleep 60; hostname; pwd; date" | qsub
$ qstat -an
$ cat STDIN.o<job id>


7. useful commands
PBS node list
 $ pbsnodes -a

Trace Job
$ tracejob <job id>

Queue state
$ qstat -an

Queue del
$ qdel <job id>
$ qdel -W force <job id>

Show queue configuration, license information, ...
$ qstat -fB
$ qmgr -c "list server"

Add compute node to server
$ qmgr -c "create node <host name>"
$ qmgr -c "create node <host name> resources_available.ncpus=2"

Delete compute node
$ qmgr -c "delete node <host name>"

change License information
License server
$ qmgr -c "set server pbs_license_info=<port1>@<host1>"
$ qmgr -c "set server pbs_license_info=<port1>@<host1>:...:<port#>@<host#>"
File
$ qmgr -c "set server pbs_license_info=<path license file>"
$ qmgr -c "set server pbs_license_info=<path license file1>:..:<path license file2>"
unset
$ qmgr -c "unset server pbs_license_info"

Server configuration
$ qmgr -c "print nodes @default"

move all jobs within a queue
$ <new pbs path>/qmove <queue name>@<new server host name>:15001 $(<old pbs path>/qselect -q <queue name>@<old server host name>:13001)




8. remove PBS
# rpm -qa |grep pbs
# rpm -e pbs-xxxx
# rm -fr /var/spool/PBS
# rm -f /etc/pbs.conf
# rm -fr /opt/pbs/11.0XXXXX



If you see a ulimit problem in a PBS queue, modify /etc/init.d/pbs on the compute node:
add "ulimit -l unlimited" before the pbs_mom daemon is started.
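A minimal sketch of the change, assuming the stock init script (the exact line to anchor on may differ between PBS versions):

```shell
# /etc/init.d/pbs (compute node) -- excerpt
# raise the max locked-memory limit before pbs_mom starts
# (needed e.g. for InfiniBand RDMA memory registration)
ulimit -l unlimited
# ... the existing line that starts pbs_mom follows here ...
```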
2011/03/31 01:06