This screen lets you select the hosts and components to install. Express installation mode uses a slightly simplified selection model; the full version appears only in custom installation mode. By default, the qmaster host is added based on the Qmaster host value from the Main configuration screen.
You can select hosts by host name, host name pattern, IP address, or IP address pattern. Custom installation mode displays an additional component selection panel. Before pressing the Add button, you can set up a selection of components that will be applied to the hosts when the button is pressed (available only in custom installation mode).
The default component selection for both installation types (express and custom) is shown in the following table:
Component | Description | Default Value |
---|---|---|
Shadow daemon | Provides high availability. Takes over for the main qmaster daemon when it becomes unavailable. | Not selected |
Execution daemon | Responsible for executing jobs and reporting host status. | Selected |
Admin host | Administrative commands can be run only from admin hosts, as a user who belongs to either the managers or the operators. | Not selected |
Submit host | Can be used to submit jobs to Grid Engine. | Selected |
See the following table for the supported pattern syntax. Patterns do not support regular expressions; the only supported expressions are lists and numeric ranges.
Description | Input | Resolved as |
---|---|---|
Host name | grid00 | grid00 |
IP address | 192.168.0.1 | 192.168.0.1 |
List of hosts | grid00 grid01 grid05 | grid00, grid01, grid05 |
List of IP addresses | 192.168.0.1 192.168.0.2 192.168.0.5 | 192.168.0.1, 192.168.0.2, 192.168.0.5 |
Host ranges | grid[00-10] | grid00, grid01, ..., grid10 |
Range of IP addresses | 192.[168-169].0.[50-60] | 192.168.0.50 ... 192.168.0.60, 192.169.0.50 ... 192.169.0.60 |
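The list and range syntax shown above can be illustrated with a short Python sketch. It is not the installer's own parser; the function names `expand_pattern` and `expand_input` are hypothetical, and the assumption that zero padding follows the width of the range's start value is inferred only from the `grid[00-10]` example.

```python
import itertools
import re

# Matches one bracketed numeric range such as [00-10] or [168-169].
_RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_pattern(pattern):
    """Expand every [start-end] range in a single host or IP pattern."""
    parts = []  # alternating literal text and lists of range values
    pos = 0
    for match in _RANGE.finditer(pattern):
        parts.append([pattern[pos:match.start()]])
        start, end = match.group(1), match.group(2)
        width = len(start)  # assumption: padding follows the start value
        parts.append([str(n).zfill(width)
                      for n in range(int(start), int(end) + 1)])
        pos = match.end()
    parts.append([pattern[pos:]])
    # The Cartesian product combines every value of every range.
    return ["".join(combo) for combo in itertools.product(*parts)]


def expand_input(text):
    """Expand a whitespace-separated list of names, addresses, and patterns."""
    hosts = []
    for item in text.split():
        hosts.extend(expand_pattern(item))
    return hosts
```

For instance, `expand_input("grid[00-10]")` yields the eleven names `grid00` through `grid10`, and a pattern with two ranges yields every combination of both.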
Right-click the host selection table and select the 'Configure...' menu item to open the host configuration window. There you can modify host-specific values for the selected hosts or for all hosts that have been added to the table.
Local execd spool directory - The path name of the local spool directory for the execution host. If it differs from the global spool directory, this value is used. The specified admin user must have permission to create this directory and to write to it.
JVM library path - Path to the JVM library on the qmaster and/or shadow host(s). Be sure to enter a correct path! If you are on a 64-bit system, you may have started the GUI installer in the default 32-bit Java mode. In that case the detected library path will point to a 32-bit Java, and the thread will fail to start later in 64-bit mode. Leave the field empty for auto-detection.
Additional JVM args - Additional arguments to be used when starting the JVM on the qmaster and/or shadow host(s). The default is -Xmx256m.
Connect user - The user name that will be used to connect to the remote host.
Resolve timeout - A timeout value for any operation in the resolve pool (resolving hosts, refreshing host states, copying an installation script to a remote host). Increase the default value if you see hosts in the Unreachable state and you are sure that password-less access is working correctly for the connect user.
Install timeout - A timeout value for any installation task. Increase the default value if you see that the installation tasks are failing with a Timeout state.
Complete lists and descriptions of all states you may encounter on this screen follow. The states fall into three categories: host resolution, host validation, and installation.
When a new host is added in the host selection screen, it is immediately given the New unknown host state and the resolving process starts. The host is marked as Reachable only if its architecture can be retrieved. All other final states indicate an error, and the GUI installer cannot perform any installation on such a host. The following table lists all possible states:
State | Description |
---|---|
New unknown host | Initial host state. When a host is added, resolving of its name/address starts immediately if threads are available in the resolve pool. |
Resolving | Temporary state. The host is being resolved based on its host name or address via the default name service. |
Unknown host | Final state. The host could not be resolved by the name service. |
Resolvable | Both a temporary and a final state. Once the host has been resolved, and if threads are available in the resolve pool, the installer immediately tries to retrieve the host's architecture via an ssh or rsh call. If this is the final state, the installer was probably not able to ssh/rsh to the host without a password. Check the tooltip message for more information. Right-click the host, select the 'Configure...' action, and verify that the intended 'Connect user' was used for the remote connection to that host. |
Contacting | Temporary state. The host has been resolved and its architecture is being retrieved. |
Missing remote file | Final state. The file '$SGE_ROOT/util/arch' is missing on the remote host. Is the sge-root path the same on the remote host and on the local host? If not, either fix that or use path aliasing. |
Reachable | Final state. The host architecture could be retrieved. Password-less ssh or rsh to remote hosts is working properly. |
Unreachable | Final state. The host architecture could NOT be retrieved. Password-less ssh or rsh to remote hosts is not working properly. |
Canceled | Final state. The user canceled further host resolving. |
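The resolution flow described in the table can be sketched in Python as follows. This is only an illustration under stated assumptions, not the installer's implementation: the function name `classify_host`, the use of ssh with `BatchMode=yes`, and the mapping of exit status 127 to Missing remote file are all assumptions. The `resolve` and `run` parameters exist so the logic can be exercised without a real network.

```python
import socket
import subprocess


def classify_host(host, sge_root, user, timeout=20,
                  resolve=socket.gethostbyname, run=subprocess.run):
    """Return one of the final state names for a single host (sketch only)."""
    try:
        resolve(host)
    except OSError:  # socket.gaierror is a subclass of OSError
        return "Unknown host"
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    cmd = ["ssh", "-o", "BatchMode=yes", "-l", user, host,
           f"{sge_root}/util/arch"]
    try:
        result = run(cmd, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return "Unreachable"
    if result.returncode == 127:
        # Assumption: 127 is the usual shell status for "command not found".
        return "Missing remote file"
    if result.returncode != 0 or not result.stdout.strip():
        return "Unreachable"
    return "Reachable"
```

Running the same ssh command by hand as the connect user is a quick way to check whether password-less access works before increasing the resolve timeout.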
Once the hosts have been resolved, including their architecture, they are placed on the Reachable tab. Installation is possible only on hosts in the Reachable state. Pressing the Install button first triggers additional remote host validation. If the installer discovers any configuration errors (see the red and orange states in the list below), the installation is not started and the user is informed. The user can then return to the host selection or proceed with the installation anyway.
State | Description | Problem Resolution |
---|---|---|
Copy timeout | A timeout occurred while copying the check_host or install_component files. See the tooltip for the exact file name. | Try again (press the Install button one more time). If the timeout recurs, save your host list to a file, stop the installer, and restart it with increased timeout values. See tweaking start_gui_installer. |
Copy failed | Copying the check_host or install_component files to the remote host failed. See the tooltip for the exact file name. | Try again (press the Install button one more time). If the problem recurs, try to copy any file with scp or rcp to verify that these commands work properly. If they do not, make sure they do before the next installation attempt. |
Permission denied | One of the Berkeley DB, qmaster, or execution daemon spool directories, or the JMX keystore file, is not writable. See the tooltip for the exact message. Installation will most likely fail if you proceed anyway. | Did you start the installation as root? What are the permissions of the first existing directory? Are you on an NFS file system with root mapped to nobody? Is the UID of the admin user the same on the local and remote machines? |
Admin user missing | The admin user entered in the main configuration screen does not exist on the remote machine. | Set up the host so that the name service provides the user name to the remote machine (or create the user locally). |
Directory exists | The Berkeley DB spool directory already exists! | Check the remote host for existing Berkeley DB installations. Remove the existing directory. |
Wrong FS type | The specified Berkeley DB spool directory is not on a local file system. | Go back to the spooling configuration screen and choose a proper local directory. |
Unknown error | An unknown error has occurred. | Try again (press the Install button one more time). If the error recurs, ignore it and try to install anyway. |
Reachable | Validation did not discover any issues for this remote host. | |
Canceled | The user canceled further host validation. | |
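The Permission denied row asks about the permissions of the first existing directory. The sketch below shows one way to find that directory and test whether the current user could create something beneath it. It is an illustration only: the names `first_existing_dir` and `can_create` are hypothetical, and the check runs locally as the current user, whereas the installer validates the remote host on behalf of the admin user.

```python
import os


def first_existing_dir(path):
    """Walk up from path until an existing directory is found."""
    current = os.path.abspath(path)
    while not os.path.isdir(current):
        parent = os.path.dirname(current)
        if parent == current:  # reached the file system root
            break
        current = parent
    return current


def can_create(path):
    """True if the current user may create entries under the first
    existing ancestor of path (or under path itself if it exists)."""
    base = first_existing_dir(path)
    # Creating an entry requires write and search permission on the parent.
    return os.access(base, os.W_OK | os.X_OK)
```

On an NFS file system where root is mapped to nobody, a check like this run as root can fail even though the directory's mode bits look permissive, which is why the table lists that question separately.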
When the installation starts, the host list with the chosen components is transformed into a task list, which is better suited to handling dependencies. These are the states you may encounter during the installation:
State | Description |
---|---|
Waiting | Task is waiting to be executed. |
Processing | Temporary state. Task is being processed. |
Timeout | Task did not finish before the timeout value was reached. |
Success | Task finished successfully. |
Failed | Task finished unsuccessfully. Click the Log button to get more information. |
Failed due to dependency | Task was not started because it depended on a task that failed. Click the Log button to get more information. |
Component already exists | Task was not started. The installation detected a previous conflicting component installation. Click the Log button to get more information. Remove any remains of the old installation before trying again. |
Canceled | User canceled the installation process. |
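How a failed task leads to the Failed due to dependency state can be sketched as follows. This is a simplified model, not the installer's scheduler: the function `run_tasks`, the task names, and the assumption that execution daemon tasks depend on the qmaster task are all hypothetical, and states such as Timeout and Canceled are left out.

```python
def run_tasks(tasks, dependencies, execute):
    """Run tasks in order and return a {task: state} mapping.

    tasks        -- task names, ordered so that prerequisites come first
    dependencies -- {task: [prerequisite, ...]}
    execute      -- callable returning True on success, False on failure
    """
    states = {name: "Waiting" for name in tasks}
    for name in tasks:
        prerequisites = dependencies.get(name, [])
        # A task is skipped as soon as any prerequisite did not succeed.
        if any(states[p] != "Success" for p in prerequisites):
            states[name] = "Failed due to dependency"
            continue
        states[name] = "Processing"
        states[name] = "Success" if execute(name) else "Failed"
    return states
```

In this model a single failure early in the list propagates to every task that depends on it, directly or indirectly, so the Log output of the first Failed task is usually the place to start.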