Clusterbuild Documentation

General

The Python script 'clusterbuild.py' is used to sample clusters, defined in an input file, on a DVR grid. Within the cluster expansion, a high-dimensional potential V is expanded into n-particle interaction terms, called clusters. The program is executed in four stages:

Initialization
The input file is parsed and checked for consistency and the hierarchy of clusters is established. Also MCTDH is invoked to generate a DVR file.
Calculation of PES cuts
In the first step after the initialization, the so-called PES cuts are calculated. These are cuts through the PES in which all coordinates except those participating in a particular n-particle interaction term are held constant. This is usually the most time-consuming part.
Calculation of clusters
The clusters are calculated by subtracting all lower-order clusters (i.e., all clusters that contain a subset of the coordinates) from a PES cut.
Potfitting
Once the clusters are generated, the program guesses a suitable way to potfit the clusters and runs the potfit executable. Note: this is more a convenience function than a means to produce accurate potfits. Usually the input files need to be adjusted manually because suitable numbers of natural potentials have to be chosen.

Note: This program may (depending on the system under consideration) require large amounts of disk space, RAM and CPU time.
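
The relation between PES cuts and clusters described above can be illustrated with a small NumPy sketch. The toy potential, the grid, the reference point and all variable names are assumptions made for this illustration only; this is not the code used by clusterbuild.

```python
import numpy as np

def potential(q):
    """Toy two-dimensional PES (hypothetical): two coupled harmonic oscillators."""
    return 0.5 * np.dot(q, q) + 0.1 * q[0] * q[1]

grid = np.linspace(-1.0, 1.0, 5)   # same toy grid for both coordinates
ref = np.array([0.0, 0.0])         # reference point (lies on the grid)

# PES cuts: coordinates not taking part in a term are held at the reference value.
cut_0 = potential(ref)
cut_1 = np.array([potential(np.array([x, ref[1]])) for x in grid])
cut_2 = np.array([potential(np.array([ref[0], y])) for y in grid])
cut_12 = np.array([[potential(np.array([x, y])) for y in grid] for x in grid])

# Clusters: subtract all lower-order clusters from the corresponding cut.
v0 = cut_0
v1 = cut_1 - v0
v2 = cut_2 - v0
v12 = cut_12 - v1[:, None] - v2[None, :] - v0

# With all terms included, the expansion reproduces the PES on the grid.
rebuilt = v0 + v1[:, None] + v2[None, :] + v12
```

In this toy case the two-particle cluster `v12` contains only the coupling term 0.1*x*y, because the harmonic parts are already captured by the one-particle clusters.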

Usage

To start calculating the clusters type:

clusterbuild [-opt] input_file

Possible options can be obtained with

clusterbuild -h

-------------------------------------------------------------------------------
Purpose:     Calculate the cluster expansion of a PES.

Usage:       clusterbuild [-mnd -c -C -D -deb -w -ver -h -? ] inputfile.inp

  -mnd       Make name directory
  -c         Continuation run
  -C         Continuation run; do not read restart file
  -D <dir>   Denotes the directory where files are written to
             (name in .inp file ignored).
  -deb       Run in debug mode (more detailed logging)
  -w         Allow to overwriting of existing data.
  -ver       Print version info
  -h -?      Print this help text.

Note: The name directory should be located on a local disk as the program may
cause massive network traffic and be very slow if the name directory
is located in a network file system.

For continuation runs (-c), the program reads the restart file to determine
which clusters have already been created and loads them into memory,
if required. Additional clusters can be defined in the input file.

For continuation runs (-C), the program does not read the restart file but
re-calculates all tasks of the job from the given input. Existing data files in
the name directory are checked for consistency with the input and will be
re-calculated if considered corrupted or inconsistent. This usually requires
using the -w option.

-------------------------------------------------------------------------------

Note: Continuation runs (-c/-C option) can be used not only to restart a crashed or canceled run but also to calculate additional clusters. If -c is given, the program attempts to read a previously created restart file in the name directory and retrieves the current state. Afterwards the EXPANSION-SECTION (see below) of the input file is read and, if needed, additional clusters are added to the expansion. All other sections of the input file are ignored. If the -c option is used after a crash or keyboard interrupt, the -w option may also have to be given, as some incomplete files may have to be overwritten.

If the -C option is used, the restart file is not read; instead, all tasks of the job are derived from the given input. This is useful, for instance, if a calculation crashed because of a lack of disk space. Existing data files in the name directory are checked for formal consistency with the input (not for correctness of the data they contain!) and will be re-created if considered corrupted.

Input Documentation

The input needed by clusterbuild in general follows the rules of the usual MCTDH input. The input is organized in five sections:

Section	Description
RUN	What is to be done.
PRIMITIVE-BASIS	Definition of the primitive basis. See the MCTDH input documentation.
MODES	Definition of logical coordinates, i.e., the mode combination scheme (similar to that for single-particle functions).
REFERENCE	Reference points around which the PES is expanded.
EXPANSION	Specification of the expansion terms. See EXPANSION-SECTION of clusterstat.

Example inputs can be found in $MCTDH_DIR/inputs/clusters/build.

RUN-SECTION

Required keywords
Keyword	Description
name = S	The 'name' directory to which output files are written.
user-source = S	The path (relative or absolute) to the module containing the potential energy routine and, if needed, a switching function. See using the 'user-source' keyword
Optional keywords
Keyword	Description
title = S	The title of the calculation.
potential-routine = S	The name of the routine that evaluates the PES for a given coordinate vector. The routine must be found in the module set by the 'user-source' keyword (see above). Default: S = "potential"
mctdh-exe = S	The name of the executable of the MCTDH program. Default: S = "mctdh84".
potfit-exe = S	The name of the executable of the Potfit program. Default: S = "potfit84".
overwrite	Allow overwriting of existing files in the 'name' directory. Similar to option -w in the command line.
pes-min = R(,S)	Minimum energy of the PES routine given with the 'user-source' keyword. All energies smaller than R are set to R. S may be a unit, default S='au'.
pes-max = R(,S)	Maximum energy of the PES routine given with the 'user-source' keyword. All energies larger than R are set to R. S may be a unit, default S='au'.
cluster-min = R(,S)	Minimum energy of the clusters. This option can be particularly useful to prevent 'holes'. All energies smaller than R are set to R. S may be a unit, default S='au'.
cluster-max = R(,S)	Maximum energy of the clusters. All energies larger than R are set to R. S may be a unit, default S='au'.
potfit-inp	Create potfit input files. This flag is automatic if 'run-potfit' is set. Without 'run-potfit' only the input files are created.
run-potfit	Run potfit to generate a first guess for a good potfit after the clusters are calculated. Note: automatic potfitting usually requires manual adjustments later on. Default: not set.
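
The effect of the pes-min/pes-max and cluster-min/cluster-max thresholds described in the table can be sketched as a simple clipping operation. The helper name `clip_energies` is hypothetical and the sketch only shows the documented behaviour, not how clusterbuild implements it. Energies are assumed to be in atomic units.

```python
import numpy as np

def clip_energies(energies, e_min=None, e_max=None):
    """Set energies below e_min to e_min and energies above e_max to e_max."""
    e = np.array(energies, dtype=float)
    if e_min is not None:
        e = np.maximum(e, e_min)   # lower threshold (pes-min or cluster-min)
    if e_max is not None:
        e = np.minimum(e, e_max)   # upper threshold (pes-max or cluster-max)
    return e

clipped = clip_energies([-0.5, 0.1, 3.0], e_min=0.0, e_max=1.0)
```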

Using the 'potential-routine' keyword

The potential-routine keyword is used to specify the name of the PES routine in the module specified as 'user-source'. The subroutine receives a 1D numpy array (type float) containing a coordinate vector and must return a single float number, the potential energy for the given coordinates. The program assumes that the result is in atomic units.

Simple example of a PES subroutine:

#!/usr/bin/env python
#
# file: my_pes.py

import numpy

def potential(Q):
    "My PES routine - just a harmonic oscillator"

    return numpy.dot(Q,Q)/2
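
Before starting a long run it may be worth checking that a PES routine follows the calling convention described above (1D float array in, single number out). The following is a minimal sketch of such a check; `check_pes_routine` is a hypothetical helper and not part of clusterbuild.

```python
import numpy as np

def potential(Q):
    "Harmonic oscillator, as in the example above"
    return np.dot(Q, Q) / 2

def check_pes_routine(func, ndim):
    """Call func with a 1D float array and verify that it returns one finite number."""
    q = np.zeros(ndim, dtype=float)
    value = func(q)
    if np.ndim(value) != 0:
        raise TypeError("PES routine must return a single number")
    value = float(value)
    if not np.isfinite(value):
        raise ValueError("PES routine returned a non-finite energy")
    return value

energy_at_origin = check_pes_routine(potential, ndim=6)
```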

MODES-SECTION

The MODES-SECTION is used to specify mode combinations, i.e., logical coordinates which are used to generate the clusters. In general, this is very similar to the SINGLE-PARTICLE-SECTION within the MCTDH main program. Within the MODES-SECTION, one particle is specified per line as a comma-separated list of coordinate labels as specified in the PRIMITIVE-BASIS-SECTION.

Example

MODES-SECTION
 Q1, Q2           # mode 1
 Q3               # mode 2
 Q5, Q4           # mode 3
 Q6               # mode 4
end-modes-section

REFERENCE-SECTION

The cluster expansion is performed around a number of reference points. In most cases one reference point is sufficient; in some systems, however (e.g., systems with large-amplitude motions), defining multiple reference points can be helpful.

Within the REFERENCE-SECTION, these reference points are specified one per line between the begin-reference and end-reference statements. The first entry in a line is the label of the coordinate as specified in the PRIMITIVE-BASIS-SECTION. The label is followed by whitespace-separated coordinate values for each reference point. Note that each coordinate value must coincide with one of the DVR grid points.

Other keywords

In the case of multiple reference points, additional keywords can be used to define how the clusters are summed. By default all clusters are summed with the same relative weight.

Optional keywords
Keyword	Description
weights = R,R1,...	The clusters for each reference point are summed together with weights R,R1, etc. If weights are given, there must be as many weights as reference points. Weights cannot be negative and their sum must be unity. If neither weights nor a switching function are given, all weights are set to 1/N, where N is the number of reference points.
switch-function = S	As an alternative to the weights keyword, a switching function can be used to generate coordinate-dependent weights (see below).
sum = S((,S1),S2,...)	If switch-function is given, sum is used to specify the clusters to be summed. S, S1,.. are coordinate labels. All clusters containing these coordinates are summed according to the weights returned by the switching function. Note: the values of the reference coordinates must be the same for all coordinates not listed.
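
The summation with fixed weights described above might be sketched as follows. The helper `sum_clusters` and the array layout (one row of grid values per reference point) are assumptions made for this illustration; clusterbuild's own data layout may differ.

```python
import numpy as np

def sum_clusters(clusters, weights=None):
    """Weighted sum of per-reference-point clusters (hypothetical helper)."""
    clusters = np.asarray(clusters, dtype=float)   # shape: (n_ref, n_grid_points)
    n_ref = clusters.shape[0]
    if weights is None:
        weights = np.full(n_ref, 1.0 / n_ref)      # default: equal weights 1/N
    weights = np.asarray(weights, dtype=float)
    if weights.shape != (n_ref,):
        raise ValueError("need exactly one weight per reference point")
    if np.any(weights < 0.0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to unity")
    # Contract the reference-point axis with the weights.
    return np.tensordot(weights, clusters, axes=1)

summed = sum_clusters([[1.0, 2.0], [3.0, 4.0]], weights=[0.25, 0.75])
summed_default = sum_clusters([[1.0, 2.0], [3.0, 4.0]])
```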

Using a switching function
Using a user-defined switching function is similar to defining the potential energy routine. A routine (or other callable object) of the name set by the 'switch-function' keyword must be found in the module or library defined by the 'user-source' keyword in the run-section. The switching function receives a complete coordinate vector containing the current position of the random walker (1D numpy array, type float) and must return a 1D float array (or any other ordered sequence) of weights. The array must contain as many entries as there are reference points. There must not be negative elements and the sum of all elements must be unity.

Example: two reference points connected by a switching function

REFERENCE-SECTION
  begin-reference
# label   ref1  ref2
   Q1   0.0   0.0
   Q2   0.0   0.0
   Q3   0.0   0.0
   Q4  -1.0   1.0
   Q5   1.0  -1.0
   Q6   1.0   1.0
  end-reference
  switch-function = my_switch_routine
  sum = Q4,Q5
end-reference-section
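
A switching function matching the example above could look like the following sketch. It assumes that the coordinate vector is ordered Q1 to Q6, so that Q4 and Q5 sit at positions 3 and 4; this ordering and the tanh form of the switch are assumptions made for illustration and should be adapted to the actual system.

```python
import numpy as np

# Assumed positions of Q4 and Q5 in the coordinate vector (order Q1..Q6).
IQ4, IQ5 = 3, 4

def my_switch_routine(Q):
    """Return weights (w_ref1, w_ref2) that switch smoothly between the two reference points."""
    d = Q[IQ4] - Q[IQ5]             # -2 at ref1, +2 at ref2
    w2 = 0.5 * (1.0 + np.tanh(d))   # close to 0 near ref1, close to 1 near ref2
    return np.array([1.0 - w2, w2])
```

By construction both weights lie between 0 and 1 and add up to one, as required for a switching function.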

Output Documentation

The execution of clusterbuild is organized in four stages in which different output files are generated or updated.

Initialization

At the very beginning the program opens a log file clusterbuild.log, which remains open until termination of the program. It receives all messages sent through the logging system and contains time stamps, log levels and messages.

Also at the very beginning, a file named input is generated, containing a string representation of the original input. This file is reproduced from the already processed input and is not merely a copy of the original input file.

After this, the file dvr.inp is generated. It serves as an auxiliary input file for the MCTDH executable. MCTDH is invoked to generate a dvr file, which is located in the subdirectory dvr. This directory also contains other output files generated by the MCTDH program.

Calculation of the PES cuts

After the initialization, the PES cuts are generated. They are located in the cuts sub-directory. The file names consist of the prefix cut_ followed by the underscore-separated indices of the mode/coordinate tuples as given in the EXPANSION-SECTION, followed by the string _ref_ and the number of the reference point.

The cut files contain one energy per line and run over all indices of the coordinates they contain. The indices are ordered according to the appearance of the coordinates in the PRIMITIVE-BASIS-SECTION: the index of the coordinate appearing first in the PRIMITIVE-BASIS-SECTION runs fastest, the index of the coordinate appearing second runs second fastest, and so forth. This means that the coordinate indices in general have a different ordering than the mode indices. The two orderings coincide only if the ordering of the coordinates within the definition of the modes strictly follows the ordering of the coordinates in the PRIMITIVE-BASIS-SECTION.
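
The "first index runs fastest" ordering corresponds to column-major (Fortran) order in NumPy. The sketch below builds such a flat sequence for a hypothetical cut over three coordinates and restores the multi-dimensional array; the grid sizes and the encoded toy energies are assumptions chosen so that the index mapping is easy to verify.

```python
import numpy as np

# Hypothetical cut over three coordinates with 2, 3 and 4 grid points,
# listed in the order in which they appear in the PRIMITIVE-BASIS-SECTION.
shape = (2, 3, 4)

# Build the one-energy-per-line sequence: the first index runs fastest.
lines = []
for k in range(shape[2]):
    for j in range(shape[1]):
        for i in range(shape[0]):          # innermost loop = fastest index
            lines.append(100 * i + 10 * j + k)

# Column-major (Fortran) order restores the multi-dimensional array.
cut = np.reshape(np.array(lines, dtype=float), shape, order="F")
```

For a real cut file, the flat sequence could be read with numpy.loadtxt instead of being generated in a loop; the file name and grid sizes depend on the job.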

Calculation of the Clusters

When all PES cuts are generated, the clusters are calculated. The clusters are located in the clusters sub-directory. Their file names follow the same rules as the ones of the PES cuts, except with the prefix cluster_. They also contain one energy per line and the indices have the same ordering as for the PES cuts.

In addition to the clusters that are generated for each reference point, this directory also contains sums of clusters if more than one reference point is used. The sums are calculated according to the weights or the switching function given in the REFERENCE-SECTION. The names of files containing summed clusters have the suffix _sum.

Potfitting the Clusters

If the keyword run-potfit is set in the RUN-SECTION, auxiliary potfit input files for all clusters containing coordinates in more than one mode are created in the sub-directory potfit and potfit is executed.

For generating the potfits, clusterbuild guesses the contracted mode as the one with the most grid points. The number of natural potentials for each mode or coordinate is likewise guessed from the number of primitive basis points. Complete modes, i.e., modes with no coordinates missing within a cluster, are combined in a potfit; incomplete modes are fitted as individual degrees of freedom. Note: As no further analysis is made, manual correction of these input files and re-fitting is usually required.
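
The guess for the contracted mode described above amounts to picking the mode with the largest grid. A minimal sketch is given below; `guess_contracted_mode` is a hypothetical helper, and how clusterbuild resolves ties between equally large modes is not documented here (this sketch simply takes the first one).

```python
def guess_contracted_mode(grid_points_per_mode):
    """Return the index of the mode with the most grid points (first one on ties)."""
    return max(range(len(grid_points_per_mode)),
               key=grid_points_per_mode.__getitem__)

# Toy example: three modes with 24, 64 and 32 grid points.
contracted = guess_contracted_mode([24, 64, 32])
```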
