Software

A task that every user encounters at least once when using HPC systems is installing software. Whether that means running an install script or configuring, compiling, and installing from source code, success depends on understanding the Linux environment, knowing where software components belong within the filesystem, and knowing when and how to isolate individual software titles in their own space.

The filesystem

The basic layout of the Linux filesystem is inherited from Unix: a hierarchy of containers called directories descends from the root directory, which is simply named /. Each directory can contain files of varying types as well as other directories (hence the hierarchical nature of the filesystem).

There are some directory names of higher significance in Unix and Linux:

Directory name	Purpose
`bin`	Contains executables — programs (compiled or scripted) that the user can run (execute)
`etc`	Contains configuration files that influence how programs execute
`include`	Contains C/C++ header files associated with libraries that are present
`lib`	Contains libraries — compiled subroutine/function bundles — that are used by executables
`lib64`	A variant of `lib` containing code that was compiled for 64-bit execution
`libexec`	Contains executables the user is not meant to run directly, but will be executed by some other program or library
`share`	Contains support files that an executable or library may use: help and documentation pages, data tables, etc.
`sbin`	Contains executables that are meant to be run by someone with higher privileges (e.g. the root user)

Most of the directories named above can be found in the root directory — /bin, /lib64, and /etc, for example — as well as in other parent directories. The /usr directory contains /usr/bin and /usr/lib64 (amongst others). The /usr/local directory contains /usr/local/bin and /usr/local/lib64, meant to hold components that are not integral to the operating system itself.

The GNU Autoconf and CMake build management systems default to installing the components they have built under the /usr/local directory, in subdirectories named according to the table above. If a different installation prefix is chosen, the same layout is applied to that directory: for example, an installation prefix of /opt/shared/program/version will see executables installed in /opt/shared/program/version/bin and libraries in /opt/shared/program/version/lib.
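As a rough illustration of how a prefix maps onto the directory names in the table, the sketch below prints the subdirectories that a conventional install would populate under the example prefix. The function name `install_layout` is hypothetical. The commented commands show the usual way a prefix is selected (the `--prefix` option of an Autoconf `configure` script and the `CMAKE_INSTALL_PREFIX` variable of CMake); they are left as comments because they need a real source tree, and an individual package may use a different set of subdirectories (lib64 instead of lib, for instance).

```shell
#!/bin/sh
# Example prefix taken from the text above.
PREFIX="/opt/shared/program/version"

# Typical ways to select the prefix (need a source tree, so shown as comments):
#   ./configure --prefix="$PREFIX" && make && make install
#   cmake -DCMAKE_INSTALL_PREFIX="$PREFIX" .. && make && make install

# install_layout PREFIX
# Print the directories a conventional install would use under PREFIX.
# (Hypothetical helper, for illustration only.)
install_layout() {
    prefix="$1"
    for sub in bin lib include share; do
        printf '%s/%s\n' "$prefix" "$sub"
    done
}

install_layout "$PREFIX"
```

Changing only the value of `PREFIX` moves the whole layout, which is what makes it practical to keep each software title and version in its own directory.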

Finding programs: the PATH variable

Whenever you want to execute a program, the shell needs to find that program in the filesystem. Providing the absolute path to the executable makes that easy:

$ /usr/bin/date
Mon Feb 25 16:03:57 EST 2019

Rather than repeatedly typing that /usr/bin prefix, Unix/Linux shells (your user interface to the OS) allow the user to type just the final part of the path (just date, for example). The shell then checks the directories listed in the PATH variable for a file with that name. The PATH consists of a sequence of zero or more directory names separated by colons; the search proceeds from the left-most directory to the right. A typical PATH might be:

$ echo $PATH
/usr/local/bin:/usr/bin:/bin

When the user types the date command, the shell checks for

/usr/local/bin/date
/usr/bin/date
/bin/date

The first file in that sequence that exists and is executable is the one that gets run. Installing a new program by copying it to /usr/local/bin (or either of the other two directories) would therefore make it available for use:

$ cp new_program /usr/local/bin
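The left-to-right search described above can be sketched as a small shell function. This is only an approximation of what a shell does internally (real shells also consult aliases, functions, builtins, and a cache of previously found commands before searching PATH). The name `find_in_path` is hypothetical.

```shell
#!/bin/sh
# find_in_path NAME [PATHLIST]
# Print the first executable file called NAME found in the colon-separated
# PATHLIST (defaults to $PATH), searching from left to right.
# Hypothetical helper, for illustration only.
find_in_path() {
    name="$1"
    list="${2-$PATH}"
    old_ifs="$IFS"
    IFS=":"                      # split the list on colons
    for dir in $list; do
        # An empty entry conventionally means the current directory.
        [ -n "$dir" ] || dir="."
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            IFS="$old_ifs"
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS="$old_ifs"
    return 1                     # nothing found in any directory
}

# Search the example PATH from the text for the date command.
find_in_path date "/usr/local/bin:/usr/bin:/bin" || echo "date not found"
```

The shell's own `command -v date` answers the same question for the PATH that is currently in effect.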

You may not always have permission to copy files to /usr/local/bin. You can, however, always edit the PATH variable in your own shell so that it includes a directory you do control:

$ export PATH="/home/1001/programs/bin:$PATH"
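To see the effect of prepending a directory, the sketch below builds a throwaway install location, places a stand-in program named date in its bin directory, and checks which file the shell resolves. The variable `MY_PREFIX` and the stand-in script are assumptions made for the demonstration; a temporary directory is used so that nothing real is modified.

```shell
#!/bin/sh
# Hypothetical private install location (a temporary directory here).
MY_PREFIX="$(mktemp -d)"
mkdir -p "$MY_PREFIX/bin"

# A stand-in "program" named date that will shadow the system one.
cat > "$MY_PREFIX/bin/date" <<'EOF'
#!/bin/sh
echo "my private date"
EOF
chmod +x "$MY_PREFIX/bin/date"

# Prepend, so this directory is searched before the system directories.
export PATH="$MY_PREFIX/bin:$PATH"

# command -v reports which file the shell will run for a bare command name.
command -v date

# Runs the stand-in, because its directory comes first in PATH.
date
```

Because the new directory sits at the front of PATH, the stand-in is found before /usr/bin/date. Appending it instead (`PATH="$PATH:$MY_PREFIX/bin"`) would leave the system date in effect, since the search stops at the first match.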
