
Running UM FCM Jobs

Before proceeding ensure you have completed the Setting up section of the tutorial.

In this section of the tutorial you will learn:

  • How to create and run a basic job from the UMUI
  • About precompiled builds
  • How to incorporate modifications to your job
  • How to apply user defined compiler overrides

Health Warning!

You will need to set up ssh-agent to allow passwordless login from PUMA to ARCHER; the tutorial will not work without it. Instructions for setting up ssh-agent on PUMA can be found here. If you are doing this tutorial as part of the NCAS UM Training course you will already have set up your ssh-agent correctly and can move on to the next section.
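
As a quick sanity check before starting, the exit status of `ssh-add -l` can be turned into a readable message. This is only a sketch: the function name describe_agent_status is made up for illustration, and the meaning of the exit codes is taken from the ssh-add manual page rather than from this tutorial.

```shell
# Translate the exit status of `ssh-add -l` into a readable message.
# Per the ssh-add manual: 0 = success (keys listed), 1 = the command failed
# (typically the agent is reachable but holds no keys), 2 = agent not reachable.
describe_agent_status() {
    case "$1" in
        0) echo "agent reachable, keys loaded" ;;
        1) echo "agent reachable, but no keys loaded" ;;
        2) echo "cannot contact ssh-agent" ;;
        *) echo "unexpected status: $1" ;;
    esac
}

# Typical use on PUMA (not run here):
#   ssh-add -l >/dev/null 2>&1; describe_agent_status $?
```

Anything other than the first message suggests revisiting the ssh-agent instructions before continuing.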

Creating a UM FCM job

FCM was introduced to the UM system at version 6.3. This tutorial will help you become familiar with editing code and submitting jobs with the FCM system. A standard tutorial job at UM 8.2 has been created to work with this tutorial and has been tailored for use on ARCHER. You can find it in the UMUI under experiment xmdw, job id a (owned by the user umui). At the moment there is only a Global Atmospheric (HadGEM3-A) job. Job xmdwa only requires modifications to set up your userid and output directories. A later section of this tutorial deals with how you supply your own or other users' modifications.

Take a copy of xmdwa, a UM 8.2 job, and then check through the following list to make sure the job is set up correctly.
(If you are unfamiliar with the UMUI, look at the UMUI tutorial video on the CMS website, which shows how to copy a UMUI job and covers the basics of the UMUI.)

Set up your user, machine and queue details

  • Go to window User Information and Submit Method -> General Details.
  • Set up with your user id, email address and account name (for the UM Training course the account name is n02-training).
  • Close this window.
  • Go to User Information and Submit Method -> Job Submission Method
  • The tutorial job is set up to run on ARCHER and to use 'qsub' by default. You will see this is also the window in which the number of processors the job will run on is set.
  • Note: The job run time limit is set in the follow-on window, accessible by clicking the "Qsub" button at the bottom of this window.
  • Close this window.

Compile options for executable and reconfiguration

  • Go to Compilation & Run Options -> Compile & Run options for Atmosphere and Reconfiguration.
  • Option "Compile Model Executable" should be set.
  • Option "Change the system default for max no of compilation processes" should be set.
  • Number of compilation processes should be set to 4.
  • Option "Compile Reconfiguration executable" should be unselected.
  • Options "Run the Model" and "Run the reconfiguration" should be set.
  • The directory for reconfiguration executable should be /work/n02/n02/hum/um-training/xmdwa/bin
  • The directory for the model executable is set to $DATAW/bin by default.
  • The name of the model executable should be $RUNID.exe (not .exec)
  • The name of the reconfiguration executable is set to qxreconf by default.

Set up FCM specific information

  • Go to FCM configuration -> FCM Options for UM atmos and recon.
  • Ensure that UM_SVN_URL field is set to fcm:um_tutorial_tr
  • Ensure UM_SVN_BIND field is set to fcm:um_tutorial_br/dev/um/vn8.2_machine_cfg/src/configs/bindings
  • Ensure UM_CONTAINER field is set to $UM_SVN_BIND/container.cfg@vn8.2_cfg
  • Option "Use different version of UM code base from the default" should be switched off.
  • Option "Use precompiled build" should be set on.
  • Option "Include modifications from branches" should be set on. (Note: The vn8.2_ncas branch contains all the NCAS specific changes required to run this version of the UM on ARCHER.)
  • Go to FCM configuration -> FCM Extract directories and output levels
  • Set "Local machine root extract directory" to an appropriate directory on PUMA where you have write access and plenty of space. $HOME/um/um_extracts is the default in the sample jobs and is recommended! Note that $RUNID will be appended to this path by the UMUI.
  • Set "Target machine root extract directory" to your output directory on ARCHER. This should be a path under your home directory, such as /home/n02/n02/<userid>. $RUNID is automatically appended to this path by the UMUI.
  • Ensure level of output for extract is set to 3.
  • Extract output location should be $UM_OUTDIR/ext.out
  • Ensure level of output for build is set to 3.
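
Because the UMUI appends $RUNID to both root extract directories, it can help to work out the effective paths in advance. The sketch below only mimics that joining step. The function name extract_dir, the run id xmdwb and the userid jbloggs are placeholders, not values from the tutorial job.

```shell
# Join a root extract directory and a run id, mirroring the way the UMUI
# is described as appending $RUNID. A single trailing slash on the root is dropped.
extract_dir() {
    printf '%s/%s\n' "${1%/}" "$2"
}

RUNID=xmdwb                                     # placeholder: id of your copied job
extract_dir "$HOME/um/um_extracts" "$RUNID"     # local extract directory on PUMA
extract_dir "/home/n02/n02/jbloggs" "$RUNID"    # remote directory on ARCHER (jbloggs is a placeholder)
```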

Running a UM FCM job

First you will do a straightforward full build and run of a job using the UM tutorial trunk code, which was created at UM 8.2. A full build in FCM can take between half an hour and an hour, depending on the type of job you started with; a typical atmosphere-only job takes about half an hour. Incremental builds and precompiled builds should shorten the build time considerably, although the actual time seems to depend heavily on node usage. Incremental builds are explained in a later section.

In your job you need to Save and Process and then take a look in the job output directory ($HOME/umui_jobs/[runid]).

So what's different?

You will see that some extra files are now generated:

  • FCM_UMATMOS_CFG, FCM_UMRECON_CFG & FCM_UMSCRIPTS_CFG - these files are the FCM job configuration. They replace the old UPDEFS file as they contain the section defs for the job and also other information required to build the UM Atmosphere model, UM reconfiguration and UM scripts.
  • EXTR_SCR - sets up environment variables required for FCM and the code to do the local FCM extract process.
  • MAIN_SCR - sets up environment variables required for FCM and initiates the local FCM extract and fcm build and/or run commands on a remote machine.
  • FCM_BLD_COMMAND - script submitted via qsub that actually does the FCM build.

Do not change anything in these files!
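
To confirm that processing produced these files without opening any of them, a read-only listing along the following lines may help. It is a sketch: the function name check_job_files is made up, and a missing file is not necessarily an error, since which files are written may depend on the compile options selected.

```shell
# Report which of the FCM-related files are present in a processed job
# directory. Read-only: nothing in the directory is modified.
check_job_files() {
    for f in FCM_UMATMOS_CFG FCM_UMRECON_CFG FCM_UMSCRIPTS_CFG \
             EXTR_SCR MAIN_SCR FCM_BLD_COMMAND; do
        if [ -f "$1/$f" ]; then
            echo "found   $f"
        else
            echo "missing $f"
        fi
    done
}

# Typical use on PUMA (not run here; xmdwb is a placeholder run id):
#   check_job_files "$HOME/umui_jobs/xmdwb"
```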

Now click on the Submit button and wait for this to complete. This can take several minutes. The submit runs the extract command, followed by initiating the build request on ARCHER. You should get a message informing you of whether the extract and build/run submission were successful.

Log on to ARCHER and run qstat -u <archer-userid> to see if your compilation job is running.
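
Rather than re-running qstat by hand, a small polling loop can wait until nothing is listed for your userid. This is a sketch under assumptions: wait_for_jobs is a made-up name, and it assumes that `qstat -u` prints nothing when you have no jobs. Note that an error from qstat also produces no output, so the loop cannot tell a failure from an empty queue.

```shell
# Poll qstat until it lists no jobs for the given user.
# $1 = ARCHER userid, $2 = seconds between checks (default 60)
wait_for_jobs() {
    while [ -n "$(qstat -u "$1" 2>/dev/null)" ]; do
        echo "jobs still listed for $1"
        sleep "${2:-60}"
    done
    echo "no jobs listed for $1"
}

# Typical use on ARCHER (not run here):
#   wait_for_jobs <archer-userid>
```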

If you look under your compile directory on ARCHER (this is the directory you specified for "Target machine root extract directory" above) you will see that you have several new directories (e.g. umbase, ummodel, umrecon). If the build is successful you will also see a bin sub-directory containing your executable.
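
One way to check for the executable without browsing the directories by hand is to search for it. The sketch below assumes only that the executable is named $RUNID.exe and sits in a directory called bin somewhere below the directory you give it. The function name find_model_exe is made up for illustration.

```shell
# Print the path of any <runid>.exe found in a bin directory below $1.
# $1 = directory to search, $2 = run id
find_model_exe() {
    find "$1" -type f -path "*/bin/$2.exe" 2>/dev/null
}

# Typical use on ARCHER (not run here; both arguments are placeholders):
#   find_model_exe /home/n02/n02/<userid> xmdwb
```

No output means no matching executable was found, which usually means the build has not finished or has failed.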

If either extract or build has not completed you should examine the relevant output file - these are described in the next subsection.

Output

Under the old UM build and run system you ended up with two output files in $HOME/output on ARCHER: a .comp.leave file and a .leave file. Now you will get three files:

  • FCM extract output in the local extract directory. For example, if you specified $HOME/um_extracts as the root you will get the output in file $HOME/um_extracts/$RUNID/[umbase|ummodel]/ext.out.
  • FCM build output in the remote extract directory. The output will be in the file $HOME/um/umui_out/$RUNIDxxx.$RUNID.xxxxxx.xxxxxxx.comp.leave on ARCHER.
  • The job run output goes to a $HOME/um/umui_out/$RUNIDxxx.$RUNID.xxxxxx.xxxxxxx.leave file as before.

The run directory should still equate to $WORKDIR/$RUNID as in previous UM versions, but note that your executable is now generated in a 'bin' subdirectory.
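
Since every submission adds new .comp.leave and .leave files, it is often convenient to pick out the most recent one. The sketch below assumes only that the file names start with the run id and end in .comp.leave (build) or .leave (run). The function name latest_leave is made up for illustration.

```shell
# Print the newest build or run output file for a job.
# $1 = output directory, $2 = run id, $3 = "comp" for build output,
# anything else for model run output
latest_leave() {
    ls -t "$1"/"$2"*.leave 2>/dev/null |
        if [ "$3" = comp ]; then
            grep '\.comp\.leave$'       # keep build output only
        else
            grep -v '\.comp\.leave$'    # keep run output only
        fi |
        head -n 1                       # ls -t lists newest first
}

# Typical use on ARCHER (not run here; xmdwb is a placeholder run id):
#   less "$(latest_leave "$HOME/um/umui_out" xmdwb comp)"
```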

Modifications

Now that you can run a basic job, you need to know how to apply changes to the trunk code. This section of the tutorial only deals with applying existing changes to your UMUI job. For developers who need to know how to make such code changes there is a further tutorial section covering this.

Branches

These are essentially the replacement for modsets and equate to a copy of the UM source code tree with user modifications. Like modsets, a branch can contain changes to many code files. In FCM branches, user modifications to scripts can sit alongside code changes so, unlike with modsets, there is no distinction between code and script changes.

Using branches in your UMUI job

As a user, you will be supplied with the URL and possibly a version number of a branch. FCM keywords provide a shorthand for standard parts of the URL path. For example, in the UM_TUTORIAL repository, this branch has been created:

svn://puma/UM_TUTORIAL_svn/UM/branches/dev/ros/vn8.2_um_shell1

A shorter way to specify this is:

fcm:um_tutorial_br/dev/ros/vn8.2_um_shell1
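
The relationship between the two forms can be illustrated with a simple prefix substitution. This is only a sketch of the mapping shown above: FCM resolves keywords from its own configuration, and the function name expand_keyword is made up for illustration.

```shell
# Expand the um_tutorial_br keyword to the full repository URL,
# using the mapping given in the two example URLs.
expand_keyword() {
    case "$1" in
        fcm:um_tutorial_br|fcm:um_tutorial_br/*)
            printf '%s%s\n' "svn://puma/UM_TUTORIAL_svn/UM/branches" \
                            "${1#fcm:um_tutorial_br}" ;;
        *)
            printf '%s\n' "$1" ;;    # anything else is returned unchanged
    esac
}

expand_keyword fcm:um_tutorial_br/dev/ros/vn8.2_um_shell1
# prints svn://puma/UM_TUTORIAL_svn/UM/branches/dev/ros/vn8.2_um_shell1
```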

This branch contains a code change to um_shell to print

Tutorial Change: Start of UM RUN Job : instead of Start of UM RUN Job :

To add this branch go to the FCM Configuration -> FCM Options for Atmosphere and Reconfiguration panel. Ensure "Include modifications from branches" is switched on and enter this URL:

fcm:um_tutorial_br/dev/ros/vn8.2_um_shell1/src

into the next available line in "User Modifications" table.

Note that '/src' is required to be added to the branch URL.

Also put a "Y" in the Use column.

Now Save and Process your job and then click the Submit button.

When the job has completed, check the output in the .leave file to verify that you got the changed output message:

Tutorial Change: Start of UM RUN Job : instead of Start of UM RUN Job : .
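
The check can also be done with grep instead of reading through the whole file. A minimal sketch, in which the function name has_tutorial_change is made up and the path to the .leave file is supplied by you:

```shell
# Succeed if the given .leave file contains the modified start-up message.
has_tutorial_change() {
    grep -q 'Tutorial Change: Start of UM RUN Job' "$1"
}

# Typical use on ARCHER (not run here):
#   has_tutorial_change <path-to-.leave-file> && echo "branch change picked up"
```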

Script Modifications

As previously mentioned, script modifications can be contained in the same branch as code modifications, so if fixes of this type are needed you will either get them in the same branch as code changes or possibly supplied as a separate branch which is the equivalent of a central script mod.

Compiler overrides

Under the previous UM build system, it was possible to supply custom compiler options for specific decks and similar functionality has been provided for the FCM version. The difference is that the overriding options need to be specified using FCM config syntax.

In the UMUI panel Compilation and Run Options -> User Override Files there are 3 tables where user override files, machine override files and overrides to certain paths can be specified. By default these are usually switched off. For the purposes of this tutorial, we will apply one simple file override. The tutorial section for UM developers covers overrides in more detail for those who need to routinely set them up.

Create a new text file $HOME/umui_jobs/overrides/tutorial_file_ovrds on PUMA and add in the following line:

bld::tool::fflags::UM::atmosphere::dynamics_advection::flux_rho %fflags64 -O0

What you are doing here is redefining the default optimisation for code file flux_rho from -O1 to -O0.
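
If you prefer to create the file from the command line rather than in an editor, something like the following sketch would do it. The function name write_override_file is made up. The override line itself is exactly the one given above, written in single quotes so that the shell does not try to interpret it.

```shell
# Create the override file (and its parent directory) containing the single
# override line from the tutorial. $1 = full path of the file to write.
write_override_file() {
    mkdir -p "$(dirname "$1")" &&
    printf '%s\n' \
        'bld::tool::fflags::UM::atmosphere::dynamics_advection::flux_rho %fflags64 -O0' \
        > "$1"
}

# Typical use on PUMA (not run here):
#   write_override_file "$HOME/umui_jobs/overrides/tutorial_file_ovrds"
```

Note that this overwrites any existing file of the same name.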

Keeping %fflags64 in the line retains the existing compile flags, which helps to ensure that the job will run properly. Once you have saved the file, add the path and name of your override file to the User file overrides table on panel Compilation and Run Options -> User Override Files. Place a Y in the Include Y/N column.

Redo the Save, Process and Submit process in the UMUI.

When the extract has finished, the easiest way to check that the changes will be applied is to look in the bld.cfg file in the cfg subdirectory of the output directory on ARCHER (e.g. $HOME/$EXPTID/umatmos/cfg). If you search in this file for 'atmosphere::dynamics_advection::flux_rho' you should see that -O0 has been appended to the list of compile options for this file.
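
The same search can be scripted. The sketch below makes no assumption about the exact layout of bld.cfg beyond what is described above: it looks for lines mentioning flux_rho and checks that -O0 appears on one of them as a separate option. The function name override_applied is made up for illustration.

```shell
# Succeed if a line mentioning flux_rho in the given bld.cfg carries -O0.
override_applied() {
    grep 'atmosphere::dynamics_advection::flux_rho' "$1" |
        grep -Eq -- '(^|[[:space:]])-O0([[:space:]]|$)'
}

# Typical use on ARCHER (not run here):
#   override_applied <path-to-cfg-directory>/bld.cfg && echo "override present"
```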

For queries / corrections / suggestions email: r.s.hatcher(at)reading.ac.uk

Last modified on 12/04/15 12:54:56