Select hosts & Installing

This screen lets you select the hosts and the components to install on them. Express installation mode offers a slightly simplified selection model; the full set of options is available only in custom installation mode. By default, the qmaster host is added automatically, based on the Qmaster host value from the Main configuration screen.


Selecting hosts

You can select hosts by host name, host name pattern, IP address, or IP address pattern. Custom installation mode displays an additional component selection panel: before pressing the Add button, you can choose the set of components that will be applied to the hosts when the button is pressed.

Default component selection for both types of installation (express and custom) is displayed in the following table:

| Component | Description | Default value |
|---|---|---|
| Shadow daemon | Provides high availability. Takes over from the main qmaster daemon when it becomes unavailable. | Not selected |
| Execution daemon | Responsible for executing jobs and reporting host status. | Selected |
| Admin host | Administrative commands can be run only from admin hosts, as a user who belongs to either the managers or the operators. | Not selected |
| Submit host | Can be used to submit jobs to Grid Engine. | Selected |


See the following table for the supported pattern syntax. Patterns do not support regular expressions; the only supported expressions are lists and numeric ranges.

| Description | Input | Resolved as |
|---|---|---|
| Host name | grid00 | grid00 |
| IP address | 192.168.0.1 | 192.168.0.1 |
| List of hosts | grid00 grid01 grid05 | grid00, grid01, grid05 |
| List of IP addresses | 192.168.0.1 192.168.0.2 192.168.0.5 | 192.168.0.1, 192.168.0.2, 192.168.0.5 |
| Host ranges | grid[00-10] | grid00, grid01, ..., grid10 |
| Range of IP addresses | 192.[168-169].0.[50-60] | 192.168.0.50 ... 192.168.0.60, 192.169.0.50 ... 192.169.0.60 |
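The list and range syntax in the table above can be illustrated with a minimal Python sketch. It is not the installer's own parser: the function names `expand_pattern` and `expand_input` are hypothetical, and the rule that a leading zero in the lower bound fixes the field width is an assumption inferred from the grid[00-10] row.

```python
import itertools
import re

# A bracketed numeric range such as [00-10] or [168-169].
_RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_pattern(pattern):
    """Expand one host or IP pattern into the list of names it stands for."""
    # re.split with two capture groups yields:
    # literal, low, high, literal, low, high, ..., literal
    pieces = _RANGE.split(pattern)
    literals = pieces[0::3]
    bounds = zip(pieces[1::3], pieces[2::3])

    choices = []
    for low, high in bounds:
        # Assumption: a leading zero (as in 00-10) means zero-padded output.
        width = len(low) if low.startswith("0") else 0
        choices.append(
            [str(n).zfill(width) for n in range(int(low), int(high) + 1)]
        )

    names = []
    for combo in itertools.product(*choices):
        name = literals[0]
        for value, literal in zip(combo, literals[1:]):
            name += value + literal
        names.append(name)
    return names


def expand_input(text):
    """Expand a whitespace-separated list of names and patterns."""
    names = []
    for item in text.split():
        names.extend(expand_pattern(item))
    return names
```

Under these assumptions, `expand_input("grid[00-02] 192.168.0.1")` would produce grid00, grid01, grid02, and 192.168.0.1, and a pattern with two ranges expands to every combination of both ranges.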


Configuring hosts

Right-click the host selection table and select the 'Configure...' menu item to open the host configuration window. There you can modify host-specific values for the selected hosts or for all hosts that have been added to the table.

Explanation of Host and Install Task States

Complete lists and descriptions of all states you may encounter on this screen are given below. The states fall into three categories:

Host resolving

When a new host is added in the host selection screen, it immediately gets the state New unknown host and the resolving process starts. The host is marked as Reachable only if its architecture can be retrieved. All other final states indicate an error; the GUI installer cannot perform any installation on such a host. The following table lists all possible states:

| State | Description |
|---|---|
| New unknown host | Initial host state. When a host is added, resolving of its name or address starts immediately if a thread is available in the resolve pool. |
| Resolving | Temporary state. The host is being resolved by host name or address via the default name service. |
| Unknown host | Final state. The host could not be resolved by the name service. |
| Resolvable | Both a temporary and a final state. Once the host has been resolved, and if a thread is available in the resolve pool, the installer immediately tries to get the host's architecture via an ssh or rsh call. If this is the final state, the installer was probably unable to ssh/rsh to the host without a password. Check the tooltip message for more information. Right-click the host, select 'Configure...', and verify that the intended 'Connect user' is used for the remote connection to that host. |
| Contacting | Temporary state. The host has been resolved and its architecture is being retrieved. |
| Missing remote file | Final state. The file '$SGE_ROOT/util/arch' is missing on the remote host. Is the sge-root path the same on the remote host and on the local host? If not, either fix that or use path aliasing. |
| Reachable | Final state. The host architecture could be retrieved. Password-less ssh or rsh to the remote host is working properly. |
| Unreachable | Final state. The host architecture could NOT be retrieved. Password-less ssh or rsh to the remote host is not working properly. |
| Canceled | Final state. The user canceled further host resolving. |
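One way to read the resolving states is as a small state machine. The Python sketch below encodes that reading; the `NEXT_STATES` transition map and the helper names are assumptions inferred from the state descriptions, not the installer's actual implementation. In particular, allowing Canceled from every temporary state is a guess.

```python
from enum import Enum


class HostState(Enum):
    NEW_UNKNOWN_HOST = "New unknown host"
    RESOLVING = "Resolving"
    UNKNOWN_HOST = "Unknown host"
    RESOLVABLE = "Resolvable"
    CONTACTING = "Contacting"
    MISSING_REMOTE_FILE = "Missing remote file"
    REACHABLE = "Reachable"
    UNREACHABLE = "Unreachable"
    CANCELED = "Canceled"


S = HostState

# Assumed transitions. States that are absent as keys are final only.
# Resolvable appears as a key because it can be temporary, but a host
# may also stay there when no password-less ssh/rsh call is possible.
NEXT_STATES = {
    S.NEW_UNKNOWN_HOST: {S.RESOLVING, S.CANCELED},
    S.RESOLVING: {S.UNKNOWN_HOST, S.RESOLVABLE, S.CANCELED},
    S.RESOLVABLE: {S.CONTACTING, S.CANCELED},
    S.CONTACTING: {
        S.REACHABLE,
        S.UNREACHABLE,
        S.MISSING_REMOTE_FILE,
        S.CANCELED,
    },
}


def can_move(current, target):
    """Return True if the assumed state machine allows current -> target."""
    return target in NEXT_STATES.get(current, set())


def can_install(state):
    """Only hosts in the Reachable state qualify for installation."""
    return state is HostState.REACHABLE
```

In this sketch, a host can only reach Reachable by passing through Resolving, Resolvable, and Contacting in that order.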


Host validation

Once the hosts have been resolved, including their architecture, they are placed in the Reachable tab. Installation can be performed only on hosts in the Reachable state. Pressing the Install button first invokes additional validation of the remote hosts. If the installer discovers any configuration errors (see the red and orange states in the list below), the installation is not started and the user is informed. The user can then return to the host selection or proceed with the installation anyway.

| State | Description | Problem Resolution |
|---|---|---|
| Copy timeout | A timeout occurred while copying the check_host or install_component files. See the tooltip for the exact file name. | Try again (press the Install button once more). If the timeout recurs, save your host list to a file, stop the installer, and restart it with increased timeout values. See tweaking start_gui_installer. |
| Copy failed | Copying the check_host or install_component files to the remote host failed. See the tooltip for the exact file name. | Try again (press the Install button once more). If the problem recurs, try copying any file with scp or rcp to verify that these commands work properly. If they do not, fix them before the next installation attempt. |
| Permission denied | The Berkeley DB, qmaster, or execution daemon spool directory, or the JMX keystore file, is not writable. See the tooltip for the exact message. The installation will most likely fail if you proceed anyway. | Did you start the installation as root? What are the permissions of the first existing directory? Are you on an NFS file system with root mapped to nobody? Is the UID of the admin user the same on the local and remote machines? |
| Admin user missing | The admin user entered in the Main configuration screen does not exist on the remote machine. | Set up the host so that the name service provides the user to the remote machine (or create the user locally). |
| Directory exists | The Berkeley DB spool directory already exists. | Check the remote host for existing Berkeley DB installations. Remove the existing directory. |
| Wrong FS type | The specified Berkeley DB spool directory is not on a local file system. | Go back to the spooling configuration screen and choose a proper local directory. |
| Unknown error | An unknown error has occurred. | Try again (press the Install button once more). If the error recurs, ignore it and try to install anyway. |
| Reachable | Validation did not discover any issues for this remote host. | |
| Canceled | The user canceled further host validation. | |
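For the Permission denied state, the question about the first existing directory can be checked by hand. The Python sketch below is an illustration only: the helper names are hypothetical, and it must be run on the affected host as the user that performs the installation to give a meaningful answer. It does not reproduce the installer's own check_host logic.

```python
import os
from pathlib import Path


def first_existing_dir(path):
    """Walk up from path until an existing directory is found."""
    current = Path(path).absolute()
    while not current.is_dir():
        if current.parent == current:
            # Reached the file system root; nothing further to try.
            break
        current = current.parent
    return current


def can_create_under(path):
    """Return True if the current user may create entries for path.

    The test looks at the first existing ancestor, because that is the
    directory in which the missing part of the path would be created.
    """
    parent = first_existing_dir(path)
    return os.access(parent, os.W_OK | os.X_OK)
```

If `can_create_under` returns False for a planned spool directory, the permissions of the directory reported by `first_existing_dir` are the first thing to inspect. Note that `os.access` reflects the real UID, so results on an NFS mount with root mapped to nobody may still differ from what the installer sees.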

 


Installation states

When the installation starts, the host list with the chosen components is transformed into a task list, which is better suited to handling dependencies. These are the states you may encounter during the installation:

| State | Description |
|---|---|
| Waiting | The task is waiting to be executed. |
| Processing | Temporary state. The task is being processed. |
| Timeout | The task did not finish before the timeout value was reached. |
| Success | The task finished successfully. |
| Failed | The task finished unsuccessfully. Click the Log button for more information. |
| Failed due to dependency | The task was not started because it depended on a task that failed. Click the Log button for more information. |
| Component already exists | The task was not started. The installation detected a conflicting previous installation of the component. Click the Log button for more information. Remove any remains of the old installation before trying again. |
| Canceled | The user canceled the installation process. |
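The relationship between Failed and Failed due to dependency can be sketched in a few lines of Python. This is a simplified model, not the installer's scheduler: the function `run_tasks`, the task names, and the dependency edges in the usage example are all hypothetical, and the Timeout, Component already exists, and Canceled states are left out.

```python
def run_tasks(tasks, run):
    """Run tasks in order and return a mapping of task name to final state.

    tasks: list of (name, dependencies) pairs, ordered so that every
           dependency appears before the tasks that need it.
    run:   callable taking a task name and returning True on success.
    """
    states = {name: "Waiting" for name, _ in tasks}
    for name, dependencies in tasks:
        # A task is skipped as soon as one of its dependencies
        # did not finish successfully.
        if any(states[dep] != "Success" for dep in dependencies):
            states[name] = "Failed due to dependency"
            continue
        states[name] = "Processing"
        states[name] = "Success" if run(name) else "Failed"
    return states


# Illustrative task list; names and dependencies are assumptions.
example_tasks = [
    ("bdb", []),
    ("qmaster", ["bdb"]),
    ("execd@grid00", ["qmaster"]),
]
```

With `run_tasks(example_tasks, lambda name: name != "qmaster")`, the first task succeeds, the second fails, and the third is never started and ends in Failed due to dependency.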