Monte-Carlo Potfit Documentation

General Remarks

Monte-Carlo Potfit (or MC-Potfit) is a variant of the traditional Potfit where the integrals over the complete primitive grid are replaced by Monte-Carlo integrals. This on the one hand enables the use of larger potentials, but on the other hand also introduces a statistical error into the resulting natpot which is hard to control. However, the error can be estimated again by statistical methods.

It should be noted that the Monte-Carlo integration error manifests mainly in an integral over the residue potential (the part not represented by an optimal natpot once the SPP are determined). In Particular this means: if the potential can be completely expressed in the SPP basis obtained by MC-Potfit in the first step, then also the obtained natpot file will be numerically exact. If this is not the case, that is, if the SPP cannot represent the exact potential, one needs to keep in mind that the role of the SPP is actually to interpolate the potential between the Monte-Carlo points which were used to calculate the coefficients. This can lead to a paradox situation that the total error of the fit can increase if more SPP are used and the number of Monte-Carlo points is not increased appropriately.

A great advantage of using Monte-Carlo techniques is that correlated weights are implemented straight forwardly by means of the sampling density, i.e., the distribution of the sampling points. At present, generating a uniform distribution and a Boltzmann distribution using a Metropolis algorithm are implemented. Furthermore, external samplings can be read in and used for generating natpot files.

MC-Potfit does not yet support all features of the original Potfit. Implemented features are outlined below. Another difference is that the internal structure of the resulting natpot file is somewhat different from the original Potfit. A contracted mode is not mandatory. When creating the natpot file, however, one mode will be artificially contracted for compatibility with MCTDH without loss of precision.

Usage and command line options

Monte-Carlo potfit is called as other programs of the package from he command line. With the command

mcpotfit85 -h

a help text is printed:

  Purpose: creates a natural potential fit.
    Usage:  mcpotfit<vers><d> [-h|-?]  [-ver -rd -w -D name] inpf
     vers:  program version (e.g. 85 for version 8.5)
        d:  indicates debug version
     inpf:  input file (with or without extension ".inp")
     optn: -h   : print this help-text.
           -?   : print this help-text.
           -ver : Version information.
           -rd  : Reading of the dvr-file is enforced.
           -w   : overwrite enabled.
           -D name: 'name' denotes the directory where files are
                    written to, (name in ~.inp file ignored).

     The order of the options does not matter, but "inpf"
     must be the last argument.

Sections

General Remarks

As in other programs of the Heidelberg MCTDH package, the input file is organized in sections, where the section XYZ starts with the keyword xyz-section and ends with end-xyz-section. At present, the MC-Potfit input may contain four to six sections, as listed below. Those with a STATUS of C are compulsory, O marks optional sections.

XYZ	Status	Description
RUN	C	Defines general parameters.
SAMPLING	C	Defines the Monte-Carlo method, No. of points, etc.
OPERATOR	C	Which surface to be used, energy cut-offs.
PRIMITIVE-BASIS	O	Definition of primitive basis. Optional if an external DVR file is specified in the RUN-SECTION.
NATPOT-BASIS	C	Size of natural potential basis, optional contracted mode, etc.
SYMMETRY	O	Define symmetries of the PES fit

Below, tables of keywords are given for the input sections.

The following tables describe the keywords. The number and type of arguments is specified. The type is S for a character string, R for a real number, and I for an integer. For instance, 'keyword = I,R[,S]' indicates that the keyword takes two or optionally three arguments where the optional argument is indicated by square brackets. Here, the first argument is an integer, the second a real number, and the optional third one a character string. The status column indicates whether the keyword is compulsory, C, or optional, O.

Run-Section

Keyword	Status	Description
name = S	C	The output files will be written to the directory S. Note: If the `-D name` option is used (see To Run the Program), the name-string given in the input file is ignored.
title = S	O	The string S is taken as the title of the run and printed to the log and output files. Note: everything that follows the equal sign '=' till the end of the line is taken as title!
readdvr (= S)	O	The DVR information will be taken from the DVR file in directory S. Note that the primitive-basis-section of the input file is completely ignored if this keyword is given. A DVR file residing in directory S will NEVER be overwritten or deleted, even if the keywords deldvr and/or gendvr have been specified. If S is not given, name-directory/dvr is assumed.
gendvr	O	The DVR file will be generated unconditionally. (This is the default)
deldvr	O	The DVR file will be deleted at the end of the calculation.
overwrite	O	Any files already in name directory may be overwritten. (It is safer not to use overwrite but the option -w).
output = S	O	The output will be written to the file `name/output`. The string S may take the values short or long. When long is specified, additional information on grids and weights are printed. short is default, i.e. `output` is equivalent to `output=short` . Note: `output` is default.
dvronly	O	The program generates the dvr file and then stops.
timing	O	Program timing information will be written to the file `name/timing` (default).
no-timing	O	The timing file is not opened.
no-OMPpotential	O	If openMP parallelization is used with this flag no calls to the PES routine are performed in parallel. This is useful if the PES routine is not thread safe.
no-omega	O	Do not allocate the SPP-configuration-sampling matrix Ω which is needed to calculate the coefficients, but calculate it on the fly as needed. This saves considerable amounts of memory but also takes considerably longer, especially for large sampling sets. Dimensions of Ω: (N_sample-coeff × N_config) where N_config is the product of the number of SPP given in the NATPOT-BASIS-SECTION. Note: Ω is distributed among MPI tasks in case of distributed memory parallelization. Default: not set.
no-omega-t	O	Relevant only when invert-method = conjgrad is used: then do not allocate the transpose of the SPP-configuration-sampling matrix Ω^T. If Ω is allocated, use Ω instead, otherwise calculate it on the fly as needed. This saves considerable amounts of memory but also takes considerably longer, especially for large sampling sets. Note: Ω^T is only allocated and used when invert-method = conjgrad is set to reduce cache misses. Also, Ω^T is distributed among MPI tasks in case of distributed memory parallelization. Default: not set.
no-omega-spp	O	Do not allocate the grid-sampling matrix Ω_SPP. Per default the grid-sampling matrix Ω_SPP is allocated and the potential is first sampled on all primitive grid points of a mode for all sampling points before the integration over the sampling points is performed. If no-omega-spp is set, the Ω_SPP-matrix is not allocated but 1-D potential cuts for each sampling point are calculated and added to the reduced density sequentially. This saves considerable amounts of memory but it may also take somewhat longer. Size of Ω_SPP: N_grid × N_sample, where N_grid is the number of grid points of a (combined) mode and N_sample is the number of sampling points used to calculate the SPP. Note: Ω_SPP is distributed among MPI tasks in case of distributed memory parallelization. Default: no-omega-spp is not set.
density = S	O	If the reduced densities of the modes are calculated (and not read-in, see below) the densities are stored in a file such that they can be re-used in a later run if for instance more coefficients are needed. The density keyword describes the format of the file. S can be either `ASCII` or `binary`, where `binary` is default. The `ASCII` format is useful if a visualization of the density, for instance with gnuplot is desired. The density keyword is ignored if the reduced densities are read-in. In this case the format is determined from the files directly.
save-evecs	O	In Addition to the reduced densities (see density above) save a copy of all single particle potentials as well and in the same format as the densities. The SPP can be re-used in case one needs to re-calculate the coefficients or add more SPP.
invert-method = S (,R)	O	Method used to solve the system `AX=B` where `A` is the overlap matrix (Ω^T×Ω), and B is (Ω^T V). `X` are the coefficients of the SPP in the potfit. S can be one of direct, conjgrad preudo or eigen. When direct is selected, the overlap matrix (Ω^TΩ) es explicitly calculated and inverted using the LAPACK DPOSV (or ScaLAPACK PDPOSV) routine which uses a Cholesky decomposition of Ω^T×Ω. This can be quite expensive but is numerically exact. If conjgrad is selected, a conjugate gradients method is used that avoids explicit calculation of the overlap matrix Ω^T×Ω. If pseudo is selected, a different system, ΩX = V is solved in a least square sense with LAPACK DGELS (or ScaLAPACK PDGELS) (This method is not optimized and should only be used for small systems - it avoids calculation of the overlap matrix, though) If eigen is selected, the matrix (Ω^T×Ω) is diagonalized and inverted using LAPACK DSYEV (or ScaLAPACK PDSYEV) This allows for regularization, with the parameter R, where R times the trace of Ω^T×Ω is used asregularization. The eigenvalues and their regularization are stored for later reference in a separate file. Default is direct.
cg-tolerance = R	O	When invert-method = conjgrad is used, cg-tolerance is the length of the residual vector(s) or "error term" divided by the length of `B` (see invert-method above). Conjugate gradients is considered converged when the length of the residual divided by `\|B\|` is less than cg-tolerance. If a contracted mode is present, B has multiple vectors, hence, conjugate gradients is considered converged when the length of the largest residual divided by `\|B\|` is less than cg-tolerance. Default 1.E-8.
cg-maxiter = I	O	When invert-method = conjgrad is used, then at most I iterations are used to achieve the requests error-estimate. Default 1000.
cg-precon = S [,I\| S]	O	When invert-method = conjgrad use the preconditioner S, where S can be one of "diag" in which case a Jacobi preconditioner is used, or "band" in which case a banded matrix is used or "sparse", see below. S="band" requires an integer I specifying the number of sub-diagonals used in the band matrix. cg-precon = band,1 will use a tri-diagonal matrix, while cg-precon = band,0 is equivalent to the Jacobi preconditioner. Note: The behavior of the band matrix preconditioner is somewhat unpredictable as it is not guarantied that the condition number of Ω^TΩ is reduced. It can also get worse. The band matrix preconditioner can therefore speed up the convergence substantially, but also reduce it substantially. For metropolis sampling we always observed some speed up using the Jacobi preconditioner. For uniform sampling, a preconditioner is usually not needed as Ω^T&Omega is close to unity and well conditioned. The sparse preconditioner uses a sparse matrix as a preconditionier which is computed from selected matrix elements of the SPP overlap matrix. One specifies the number of matrix elements mode wise as a comma-separated list in the form <mode_1>:<N_1>,<mode_2>:<N_2>, etc., where <mode_i> is the mode number and <N_i> is the number of matrix elements for this mode. Note: there must not be spaces around the ':'. If all <N_i> are equal to the numbers of SPP in their mode, then the preconditioner matrix is not sparse any more but complete and the preconditioner is exact.
sampling-only	O	Do not create a natpot file but only calculate the sampling points and exit.
sampling-energies	O	Write out the energies that correspond to the sampling points. Default: not set.
iseed = I	O	Calculates the initial random seed from I. If not set, the random seed is calculated from the current system time. In case of MPI, the rank number is added in each task and the system time used is the one from rank 0.
subtract-pes = S	O	Subtract a potential stored in a PES file (created with the -pes option of MCTDH) from the potential that is defined in the OPERATOR-SECTION. S is the full path to a PES file. This option is useful if one already has an approximate description of the potential (for instance a cluster-expansion) and only wants to produce a natpot of the difference to the exact potential. If `subtract-pes` is given, then the OPERATOR-SECTION itself may not use the "pes-file" option but only "usersurf" or built-in surfaces from the MCTDH operator library. `subtract-pes` cannot be used with extended grids.
auto-extgrid	O	If symmetry operations are present (see SYMMETRY-SECTION below) that do not stay on the DVR grid but lead to grid points outside the primitive grid defined in the PRIMITIVE-BASIS-SECTION, then if `auto-extgrid` is set, extended mode-grids will be automatically generated such that the symmetry operations stay on an extended primitive grid and a usual grid-based symmetrization can be performed. The additional grid points are subsequently removed such that the original grid is restored. If not set, an error message is produced if any symmetry operation leeds to outside the primitive grid. Default: not set.
density=S	O	Save the density in format S, where S either ASCII or binary. Default: ASCII
binary-samplings	O	Save the DVR index points and energies in binary form. Default: ASCII
ascii-samplings	O	Save the DVR index points and energies in ASCII form. This is default.

Sampling-Section

The Sampling-Section defines how to obtain the sampling points using different methods.

Options in the Sampling-Section are the following

sampling[-S1] = S2, I1 [,I2, I3, R [,S3]] C The sampling method and the number of sampling points to use. See remark below .

Sampling methods

Monte-Carlo samplings are used for tree different purposes or tasks in MC-Potfit:

generation of the reduced densities, of which the eigenvectors are the SPP,

subsequent calculation of the coefficient vector,

and finally testing the natpot file.

The sampling[-S1] keyword can be used to assign different sampling methods and sampling sizes for these different tasks. The suffix S1 is an optional string that specifies the particular task a sampling method is used for. S1 can be omitted. In this case the same sampling method and ensemble size are used for all three tasks. Note, that newly generated sets of sampling points will be different for all three tasks even if the specification of the task is omitted (see exception below). The following specifications are possible: sampling[-S1]

sampling-spp: sampling for generating the reduced densities of which the SPP are the eigenfunctions.
sampling-coeff: sampling for calculating the coefficient vector and the overlaps of the SPP with the potential routine.
sampling-test: sampling to calculate mean and RMS values of the natpot error.
sampling: sampling for all the above tasks with the same sampling method.

The total set of sampling points that is used for a specific task can be composed of chunks of points generated with different methods. To this end one can specify multiple (up to 10) sampling sources or each of the sampling-tasks (SPP, coeff, or test), such as

   SAMPLING-SECTION
    sampling-spp = <method 1>
    sampling-spp = <method 2>
    ...
    sampling-coeff = <method n>
    sampling-coeff = <method n+1>
    ...
    sampling-test = <method N>
  end-sampling-section

This will create several sub-sets of sampling points (see examples below) and subsequently combine them into one large set to perform the calculations. This can for instance be used to add small fractions of uniform sampling to a metropolis sampling or to combine sets of metropolis samplings with different temperatures. In addition, if there are two or more test-sets, detailed statistics will be written in a separate 'natpot-statistics' file for each subset in the test-trajectory.

If the calculation is a new run, either sampling or sampling-spp and sampling-coeff must be given. The test sampling is optional and only used to estimate error measures of the potfit. A test is in any case performed with the sampling used for calculating the coefficients. A test with an independent set of sampling points is, however, highly recommended. Comparison of a test-set with the coefficient-set is a good measure for the convergence in terms of numbers of sampling points (if the sampling methods are the same).

If the run is a re-run of a previous calculation one can skip the SPP sampling. In this case the SPP are read-in from evec-files if they exist, or are calculated from previously stored density matrices. If sampling-spp is anyhow specified, the density matrices are re-calculated. Note, that if the densities are read in, at present only the grid dimensions are checked. It is not checked weather the density represents a valid representation of the given mode. That is, interchanging modes in the NATPOT-BASIS-SECTION will not lead to an error if the combined grids of the modes have the same sizes.

The first argument of the sampling[-S1] keyword is the sampling method, followed by a set of parameters. Possible sampling methods - with parameters - are:

uniform, I: Uniform sampling: each grid point is chosen as a sampling point with the same probability. The integer I defines the total number of sampling points
metropolis, I1, I2, I3, R [,S3]: A Metropolis algorithm is used to generate I1 sampling points. The sampling points are selected from a random walk on the total grid. The initial starting position of the walker is a random point on the primitive grid. The first, I2 initial steps of the random walk are skipped to equilibrate before any sampling points are recorded (burn-in). The integer I3 is a step width. It specifies that only every I3th accepted step from the random walk is actually used as a sampling point. That is, I3-1 accepted Metropolis-steps are not used until the next accepted step is recorded as a sampling point. This reduces the correlation between subsequent sampling points. If I3=1 then all accepted Metropolis-steps (after burn in) are used. R defines the temperature (times the Boltzmann constant, so R is actually an energy) used to create a Boltzmann distribution. It may bear a unit defined by S3. Default unit is au.
Note: If metropolis sampling is used in an MPI run, each of the MPI tasks will create a subset of the total trajectory. All MPI tasks will do the burn-in and have the same step-width, etc. but will start from a different initial position and with a different random-seed (which is calculated from iseed if given in the RUN-SECTION, otherwise the system time of rank 0, and the MPI rank number). The trajectories from the different ranks are then combined into one.
readidx{/path/to/indexfile} [,I1 [,I2 [,I3]]]: The sampling points are read from file. The file can be either ASCII or binary (depricated). It must contain the indices of the primitive grid points in the order as given in the PRIMITIVE-BASIS-SECTION, one sampling point with all indices per line (ASCII) or record (binary). Optionally only the first I1 data points in the file are used. If I2 is given the first I2 sampling points are skipped before I1, sampling points are read. If also I3 is given, then I2 sampling points are skipped and of the remaining sampling points only every I3th point is used until I1 sampling points are read in. If I3 == 1 no sampling points are skipped, in which case I3 might as well be omitted. If no integers are given all data points from the file are read in.
complete: No sampling is done, but the complete primitive grid is used. This is only possible if the total grid has less then 2³¹ grid points. Complete cannot be combined with other methods.

Note: By default, even if the same sampling method with the same parameters is used for all of the sub-tasks, for each of the sub-tasks an independent set of sampling-points will be created. There is one exception, however: sometimes one wishes to use the same set of points to calculate the SPP and the coefficients. In this case one can simply create the set of sampling methods for the SPP and than assign this the coefficients (or the other way round):

   SAMPLING-SECTION
    sampling-coeff = <method 1>
    sampling-coeff = <method 2>
    ...
    sampling-spp =  sampling-coeff

    sampling-test = <method N>
  end-sampling-section

The following examples demonstrate the use of the sampling[-S1] keyword.

Example 1

Uniform sampling with 10000 points for generating the SPP
Metropolis sampling with 10000 points created with a temperature of 400 wave numbers. Furthermore the first 1000 sampling points are skipped and then only every 10th sampling point is used for the trajectory. So altogether 101000 points are generated of which 10000 are used for calculating the coefficients.
For testing the natpot, an external file containing the dvr-index is read in. From this file, only the first 15000 data sets are used.

   SAMPLING-SECTION
     sampling-spp   = uniform, 10000
     sampling-coeff = metropolis, 10000, 1000, 10, 400.0, cm-1
     sampling-test  = readidx{/path/to/indexfile}, 15000
   end-sampling-section

Example 2

Uniform sampling with 10000 points for generating the SPP, the coefficients and also for testing the natpot. The sampling points will be different for the three individual tasks, only the method of their generation is the same.

   SAMPLING-SECTION
     sampling = uniform, 10000
   end-sampling-section

Example 3

Metropolis sampling for the coefficients plus an additional small fraction of uniform sampling. For the SPP the same sampling points as for the coefficients are used. The test is done with an independent metropolis sampling.

   SAMPLING-SECTION
     sampling-coeff = metropolis, 100000, 1000, 10, 400.0, cm-1
     sampling-coeff = uniform, 1000
     sampling-spp = sampling-coeff
     sampling-test = metropolis, 100000, 1000, 10, 400.0, cm-1
   end-sampling-section

Natural-Potential-Basis-Section

For the Natural-Potential-Basis-Section the same roles as in the Natural-Potential-Basis-Section of Potfit apply. There is one important difference, though: a contracted mode is not mandatory. A number of SPP can be set for every mode. For compatibility, one of the modes will be retrospectively contracted in the natpot file. This will be the mode that causes the smallest increase in memory usage.

Operator-Section

Keyword Status Description

pes = S {pesopts}

Keyword	Status	Description
pes = S {pesopts}	C	S can be either the keyword "usersurf" in which the program assumes that the potential energy routine has been implemented in the file $MCTDH_DIR/source/potfit/userpot.f90 or it can be the name of a potential energy surface which has been encoded in the MCTDH package. A third option is `pes = pes-file{path/to/pes}` in which case an existing operator file is re-fitted. The first option has been provided to allow easy implementation of user provided routines using a FORTRAN 90 interface. Note, that options from the input file are not supported in this case, i.e, there must not be any `{pesopts}`. In case that a potential energy from the MCTDH library the name is interpreted in a case sensitive manner. If a surface depends on additional parameters, one can change the encoded default values by adding those parameters via the string '{pesopts}'. Parameter names specified within the braces '{}' are processed in a case sensitive manner. The selected surface determines which parameter names are possible. For a description of the available surfaces see Hamiltonian Documentation -- Available Surfaces. Note that fitting 'vpot' files is not supported at present. Example: For the lsth surface use `'pes = lsth {tfac = 0.5d0}'`. tfac is a mass-dependent parameter which is needed to transform between binding and Jacobian coordinates. Let r1,r2, and r3 denote the binding and rd,rv, and theta the Jacobian coordinates. Then the transformation rules are as follows: 2D: r1=rd-tfacrv r2=rv 3D: r1=sqrt(rd2+(tfacrv)*2 -2d0tfacrdrvctheta) r2=rv r3=sqrt(rd2+((1d0-tfac)rv)*2 +2d0(1d0-tfac)rdrvcos(theta)) For a homo-nuclear diatomic molecule tfac=0.5d0, in general tfac=m1/(m1+m2). rv denotes the distance between the two atoms of the diatom, rd the distance between the third atom and the center of mass of the diatom, and theta the angle between rd and rv. Note*: tfac is not used if binding coordinates are selected for a calculation.
vcut < R	O	Energy cut-off for exact potential energy surface. All potential energy values greater than R are set to R.
vcut > R	O	Energy cut-off for exact potential energy surface. All potential energy values less than R are set to R.

S can be either the keyword "usersurf" in which the program assumes that the potential energy routine has been implemented in the file $MCTDH_DIR/source/potfit/userpot.f90 or it can be the name of a potential energy surface which has been encoded in the MCTDH package.
A third option is pes = pes-file{path/to/pes} in which case an existing operator file is re-fitted.
The first option has been provided to allow easy implementation of user provided routines using a FORTRAN 90 interface. Note, that options from the input file are not supported in this case, i.e, there must not be any {pesopts}.

In case that a potential energy from the MCTDH library the name is interpreted in a case sensitive manner. If a surface depends on additional parameters, one can change the encoded default values by adding those parameters via the string '{pesopts}'. Parameter names specified within the braces '{}' are processed in a case sensitive manner. The selected surface determines which parameter names are possible. For a description of the available surfaces see Hamiltonian Documentation -- Available Surfaces. Note that fitting 'vpot' files is not supported at present.

Example:
For the lsth surface use 'pes = lsth {tfac = 0.5d0}'. tfac is a mass-dependent parameter which is needed to transform between binding and Jacobian coordinates. Let r1,r2, and r3 denote the binding and rd,rv, and theta the Jacobian coordinates. Then the transformation rules are as follows:

2D:
  r1=rd-tfac*rv
  r2=rv
3D:
  r1=sqrt(rd**2+(tfac*rv)**2
          -2d0*tfac*rd*rv*ctheta)
  r2=rv
  r3=sqrt(rd**2+((1d0-tfac)*rv)**2
          +2d0*(1d0-tfac)*rd*rv*cos(theta))

For a homo-nuclear diatomic molecule tfac=0.5d0, in general tfac=m1/(m1+m2). rv denotes the distance between the two atoms of the diatom, rd the distance between the third atom and the center of mass of the diatom, and theta the angle between rd and rv.

Note: tfac is not used if binding coordinates are selected for a calculation.

vcut < R O Energy cut-off for exact potential energy surface. All potential energy values greater than R are set to R.

vcut > R O Energy cut-off for exact potential energy surface. All potential energy values less than R are set to R.

Primitive-Basis-Section. Ordering of arguments

The primitive basis is defined in the same way as it is done in the input file of an MCTDH run (see the MCTDH Primitive-Basis-Section ). As for the traditional Potfit, there is, however, one important difference. The order of the degrees of freedom defines the ordering of the arguments passed to the potential routine. To see, how a particular pes from the MCTDH library is implemented see the file funcsrf.F (just type mcb funcsrf.F ). Inspect also the log-file log, which is created when running potfit. There one finds a message like:


BKMP2 PES : 3D model in Jacobian Coordinates
Dissociative Coordinate : rd
Vibrational Coordinate  : rv
Angular Coordinate      : theta

A line like Dissociative Coordinate : theta would hint to a wrong ordering of the degrees of freedom in the Primitive-Basis-Section.
The arguments passed to the potential routine may be reordered with the aid of the order keyword of the Operator-Section.
NB: In MCTDH the ordering of the degrees of freedom in an HAMILTONIAN-SECTION (i.e. the ordering of the mode-line) must be consistent with the potential definition. There the ordering within the Primitive-Basis-Section is arbitrary.

Molecular symmetries

In the traditional POTFIT symmetries of the potential are usually preserved in a natural way if one chooses the primitive grids accordingly. In MC-POTFIT this is not the case because of the random choice of the Monte Carlo integration points. Hence, symmetries need to be restored. In MC-POTFIT this is done by symmetrizing the reduced density matrices to obtain symmetry adapted SPP and subsequently the coefficients. To accomplish this the molecular symmetries must be explicitly specified in the SYMMETRY-SECTION.

The symmetries are given as a table of symmetry operations where each column of the table implements one symmetry operation. The first line of the symmetry table contains the labels by which the symmetry operations are addressed. The label can be chosen freely but must be a single word. The only exception is the first label of the symmetry table which cannot be chosen freely. It must always contain the E operation, the identity. The first column E serves as a reference for all further symmetries.

The columns of the symmetry table under the other symmetry labels implement the symmetry operations. In the column under the label E one must arrange all DOF labels from the Primitive-Basis-Section. The order can be different from the PBASIS-SECTION. If the order of DOFs in a second column is different from the order in the E-column, this is interpreted as an exchange of coordinates within this symmetry operation.

If symmetries are specified one should carefully check the sym.log file. It contains detailed information on the internal representation and mappings of the symmetry-operations. Also a number of tests are performed and logged there to detect if the symmetries are implemented correctly and if the final natpot file is symmetrized correctly.

Grid-based symmetry operations

To implement a symmetry operation one can chose between a grid-based or a coordinate based operation. A grid-based operation exchanges the primitive grid points of two DOFs where additionally the order of the grid points can be reversed. The following example shows a grid-based symmetry operation:

Example: grid-based symmetry operation (1)

  SYMMETRY-SECTION
   E        SXY1
  --------------
   x        y
   y        x
  end-symmetry-section

The symmetry operation 'SXY' interchanges the grid points of DOF 'x' and 'y'. This exchange does not adhere to any coordinate ranges. If the coordinate 'x' is has grid points in the interval, say, [-1,1], and 'y' in the interval [3,5], it will interchange the first grid point (x₁ = -1) of 'x' with the first grid point (y₁ = 3) of 'y' (and vice versa), the second grid point of 'x' with the second grid point of 'y' and so forth.

Optionally one may invert the grids. Using the example above with 'x' being defined in [-1,1], and 'y' in [3,5], The following symmetry operation 'SXY2' will map the first grid point (y₁ = 3) of 'y' to the first grid point (x₁ = -1) of 'x' but the last grid point (x_n = +1) of 'x' to the first grid point (y₁ = 3) of 'y' and so forth. That is, the 'x' grid is reversed before mapping to 'y'. This again does not adhere to any coordinate ranges.

Example: grid-based symmetry operation (2)

Inversion of the DOF grid in a symmetry operation:

  SYMMETRY-SECTION
   E        SXY2
  --------------
   x         y
   y        -x
  end-symmetry-section

Coodinate-based symmetry operations

An alternative way to implement symmetries is by arithmetic expressions. Arithmetic expressions are indicated by curly barackets and must be of the form

{±<DOF-Label>±R}

The opening bracket must be followed by a coordinate label which may optionally bear a sign (scaling of the coordinate is not supported), followed by a constant shift which may be an arithmetic expression of real numbers. Besides real numbers also the symbols pi and 2pi can be used.

Other than grid-based operations coordinate-based operations do adhere to coordinate values: The grid points of the respective DVR are transformed to coordinate values, altered through the expression in the curly brackets and transformed back to grid points which are ultimately used to perform the symmetrization. Unless one uses extended grids (see below) one therefore needs to make sure that the expression in curly brackets applied to a primitive grid point again leads to a primitive grid point.

The following examples implement the same symmetry operations as SXY1 and SXY2 from the examples above in coordinate-based form.

Example: coordinate-based symmetry operation (1)

  SYMMETRY-SECTION
   E        SXY3   # same as SXY1
  --------------
   x        {y-4}
   y        {x+4}
  end-symmetry-section

Example: coordinate-based symmetry operation (2)

  SYMMETRY-SECTION
   E        SXY4   # same as SXY2
  --------------
   x        {y-4}
   y       {-x+4}
  end-symmetry-section

Periodic boundary conditions

DVRs with periodic boundary conditions behave a bit different than other DVRs. There is usually not a natural zero with respect to which an inversion should occur. One can therefore specify coordinate labels for which periodic boundary conditions should be observed. If a coordinate is set to "periodic" one must use coordinate-based expressions for it. In this case symmetry operations that lead to values outside the periodic interval are mapped back into it.

One can define one ore more DOF as periodic with the "periodic" keyword:

  SYMMETRY-SECTION

   <SYMMETRY-TABLE GOES HERE>

    periodic = DOF1 [,DOF2 [,DOF3,...]]
  end-symmetry-section

The example below shows an example of the use of the "periodic" keyword:

Example: periodic boundary conditions

SYMMETRY-SECTION
 --------------------------------------------
     E       C2           C4           C4^3
 --------------------------------------------
     x       -x            y             -y
     y       -y           -x              x
   phi  {phi+pi}   {phi-pi/2}     {phi+pi/2}
 --------------------------------------------
 periodic = phi
end-symmetry-section

Note: Since periodic DOF typically use exponential or FFT DVRs, mcpotfit will issue a warning if a DOF with another DVR type is marked as periodic, but will continue anyways and apply periodic boundary conditions.

User-implemented symmetry operations

Sometimes, more complicated symmetry operations are needed, for instance when the transformation of one coordinate depends on another. To this end we introduced the keyword usersym. This allows the user to implement their own symmetry operations in the file $MCTDH_DIR/source/mcpotfit/usersym.F90. The input of the routine is a vector of coordinate values from which the equivalent coordinates must be computed. See $MCTDH_DIR/source/mcpotfit/usersym.F90 or the example file $MCTDH_DIR/source/mcpotfit/usersym.F90.zundel and the instructions within these files on how to do this. The file $MCTDH_DIR/source/mcpotfit/usersym.F90.zundel implements the symmetry operations from the example in the following paragraph.

The following rules apply for user-implemented symmetries:

If usersym is given for one coordinate in one symmetry operation in the SYMMETRY-SECTION, it must also be given (and implementd) for all other DOF of the combined mode and for all other symmetry operations to guarantee complete mappings.
User-implemented symmetry operations must either stay within one mode or map all DOF of one mode m₁ into another mode m₂. There must not be any mode-mixing.

Example: symmetry operations for the Zundel cation (D_2d)

All symmetry operations are implemented with respect to column one, E. The symmetry operations S4, S4^3, C2A, and C2B, however do not stay on the primitive grid because of the exchange of the two water molecules. The x,y-plane is fixed to one of the waters so that, if they are exchanged, it must be rotated to account for the relative angle alpha (='a' below) between the two waters. Hence user defined symmetry operations had to be implemented. They were used together with the auto-extgrid keyword in the run-section to automatically exend the mode grid by additional points which are generated by the rotation of the x,y- plane. For this, the three coordinates x,y, and a had to be combined in one mode.

SYMMETRY-SECTION
 E        C2        SigmaA      SigmaB     S4        S4^3        C2A        C2B
 --------------------------------------------------------------------------------------
 z        z          z          z         -z        -z          -z         -z
 R        R          R          R          R         R           R          R
 x        usersym    usersym    usersym    usersym   usersym     usersym    usersym
 y        usersym    usersym    usersym    usersym   usersym     usersym    usersym
 a        usersym    usersym    usersym    usersym   usersym     usersym    usersym
 la       {-la+2pi}  {-la+2pi}  {la}       {-lb+pi}  {lb+pi}     {lb+pi}    {-lb+pi}
 lb       {-lb}      {lb}       {-lb}      {la-pi}   {-la+pi}    {la-pi}    {-la+pi}
 ua      -ua         ua        -ua        -ub        ub         -ub         ub
 ub      -ub        -ub         ub         ua       -ua         -ua         ua
 r1a      r1a        r1a        r1a        r1b       r1b         r1b        r1b
 r2a      r2a        r2a        r2a        r2b       r2b         r2b        r2b
 va      -va         va        -va         vb       -vb          vb        -vb
 r1b      r1b        r1b        r1b        r1a       r1a         r1a        r1a
 r2b      r2b        r2b        r2b        r2a       r2a         r2a        r2a
 vb      -vb        -vb         vb        -va        va          va        -va
end-symmetry-section

Extended grids

As mentioned in the section for user-implemented symmetry operations, sometimes symmetry operations do not stay on the original primitive grid. These can be grid-points that originate from a usersym- or a coordinate-based operations in curly brackets.

To be able to still perform the symmetry operations one can use the keyword auto-extgrid in the RUN-SECTION. If auto-extgrid is set and it is detected that any symmetry operarion leads to points outside the original primitive grid, these extra points will be automatically added to the respective mode-grids.

When extended grids are used, it is important that all symmetry operations of a point group are implemented, such that that symmetry operations are closed. Otherwise the forward-backward mapping of extended grid points would be incomplete and symmetry operations applied twice would again lead to points outside the grid.

Extended grids can be used for all modes.

Rules for the SYMMETRY-SECTION

A number of rules apply for the implementation of the symmetries.

Any DOFs that are interchanged must have the same number of primitive grid points. Different DVR types will not lead to an error but only to a warning; coordinate ranges and other DVR parameters are not checked.
When DOF are interchanged there must not be any mixing of combined modes: DOFs can either be re-arranged within one mode, or all DOF from one mode m₁ must map into the same other mode m₂ (with arbitrary arrangement within m₂ as long as the DVR grids are compatible, see previous point).
The contracted mode cannot be interchanged with other modes. DOFs of the conctracted mode can only be re-arranged within this mode.
If extended grids are used, all symmetry operations of a point group must be implemented (see above).
If usersym is set for one DOF, it must also be set for all other DOF of that mode and for all symmetries except E.
If a coordinate-based operation is used for a DOF in one symmetry operation it must also be used in all other symmetry operations except E
Symmetry-Operations for DOFs with the periodic flag must be implemented with coordinate-based operations.
DOFs with the periodic flag can only be interchanged with other periodic DOF.
For external DVRs one must always use usersym.

Example input

Output files

natpot: File containing the potential fit in MCTDH compatible format. See also Potfit documentation. In case no contracted mode is specified in the input file, the mode which leads to the smallest increase of the file size, is retrospectively contracted. This does not lead to a loss of precision.
natpot_unsym: File containing the potential fit in MCTDH compatible format but before the coefficients have been symmetrized (only written if symmetries are specified).
natpot-statistics: File containing detailed statistics of the various test-trajectories. Only written if there are two or more test-trajectories.
dvr: See MCTDH output documentation
output: The output will be saved to file or written to the screen, depending on the output keyword in the Run-Section. It contains some standard results like the "Natural weights" (see the MCTDH review for this) for all the modes. It also prints, after Monte-Carlo testing, the mean and RMS errors of the fit. Finally, total CPU-time, host, date and time, path of the name-directory and (if specified in the input) the title of the run are printed.
input: This is a copy of the input file used with the program version progver and the list of options (if any given) added.
log: This file contains information about the type of calculation performed, the potential energy surface used, memory sizes, arrays that have been allocated, opened data files and error messages - if any. The most important thing in this file is to check the labeling of the parameters of the coordinate system used which should be consistent with the arguments of the potential energy surface used.
sym.log: If a SYMMETRY-SECTION is present in the input, this files contains some details about the internal representation of the symmetry operations as well as the results of symmetry-tests on a number of random grid points. The symmetries are tested against the routine, i.e., it is tested if the symmetry operations indeed lead to equivalent points, and also if the final natpot has the same symmetries.
sigma: If a SYMMETRY-SECTION is present in the input, this file contains the sigma matrices that are used for the transformation of the coefficients verctor.
extgrid_mode_I: If an extended grid has been created for mode I, the complete extended grid of the mode is stored in this file in terms of coordinate valies of the primitive coordinates of the mode: N columns, where N is the number of DOF of the mode, in the same order as defined in the NATPOT-BASIS section.
conjgrad.log: If conjugate gradients is used to calculate the coefficients, the convergence of the algorithm is logged in this files.
timing: This file contains various timing statistics of the potential fitting run. For each subroutine, the number of times the routine is called, the total CPU time (user and system) spent in the routine and in all subroutines below it are listed, as well as the CPU time per function call.
dvrindex-spp[-I]: This file contains the integer indices of the dvr grid points that were used as sampling points to generate the SPP. Each line in this file represents one sampling point. The index is given in the same order as the DOF in the PRIMITIVE-BASIS-SECTION. If more then one trajectory is used, then the files are numbered as they appear in the SAMPLING-SECTION.
dvrindex-coeff[-I]: Same as for dvrindex-spp[-I], but contains the sampling used for obtaining the coefficients.
dvrindex-test[-I]: Same as for dvrindex-spp[-I], but contains the sampling used for testing the natpot against the PES routine.
energies-spp[-I]: This file contains the single-point energies obtained from the sampling stored in dvrindex-spp[-I]. One energy per line obtained from the PES routine.
energies-coeff[-I]: Same as energies-spp[-I], but for the coefficients.
energies-test[-I]: Same as energies-spp[-I] but for the set that was used for testing the natpot against the PES routine.
eigenvalues: If invert-method=eigen is used in the RUN-SECTION then this file contains the eigenvalues of the SPP-overlap matrix in ascending order.
density_mode_I: The density matrix used to calculate the SPP of mode I in matrix format. Stored on the extended grid if present.
idxmap_mode_I: When symmetrization is used, the mapping of mode grid-points. One symmetry operation per column in the same order as specified in the SYMMETRY-SECTION. Also includes extended grid points if present.
evec_mI_J: The J'th SPP of mode I including extended grid points if present. These files are ment for plotting with standart tools like gnuplot.
mcvpot: In case of sampling-[spp,coeff,test] = complete , mcvpot contains the potential evaluated on the complete primitive grid.

Compiling

Compiling works just as for mctdh:

    compile [options] mcpotfit

OpenMP support

MC-Potfit supports OpenMP. To use OpenMP use the -P option of the compile command

    compile -P mcpotfit

this will result in the executable mcpotfit85P

MPI support

MPI support is either enabled by the -m compile option or by specifying an MPI compiler with the -c option.

    compile -m mcpotfit

The resulting executable will be mcpotfit85.mpi-compiler. MPI support can be combined with OpenMP, of course.

ScaLAPACK support

Since a possibly large system of liner equations needs to be solved, MC-Potfit can be linked to the ScaLAPACK library. The library must be available on the system and does not come with the MCTDH package. Linking to it is somewhat automatized by the -S option of the compile command.

    compile -S -m mcpotfit

Note that ScaLAPACK requires MPI. If the -S flag is used, the executable will be named with the 'S' suffix, for instance

    mcpotfit85S.mpi-gfortran

With the default setup the linker will look in the standard system search path for the file 'libscalapack.a'. If the library is not in the search path or has another name, one can alter this accordingly in the variable MCTDH_SCALAPACK_LIBS of compiler in use in the compile.cnf file. Note that ScaLAPACK uses BLACS for communication, which is usually shipped with the Scalapack package and compiled into the ScaLapack library. On some older systems, especially if ScaLAPACK is not self-compiled but installed from the systems package manager, BLACS may be installed in a separate library, usually called libblacs.a. If this is the case one also needs to add the BLACS library in the compile.cnf file.

At present, ScaLAPACK is only used for solving for the natpot coefficients.

External libraries

More sophisticated (possibly vendor specific) linear algebra packages are available which may be very beneficial to use with mcpotfit. These are packages like ATLAS, Intels MKL, AMDs ACML, Cray Scientific Library etc. which implement the LAPACK (some also ScaLAPACK) API. To use these instead of the Netlib implementation one needs to edit the compile.cnf file in $MCTDH_DIR/install and setup the environment accordingly. See also the -x option of the compile script.

Troubleshooting

When using OpenMP parallelization there is a segmentation fault.
Try setting and/or increasing the OMP_STACKSIZE environment variable.
When using OpenMP parallelization I see much less CPU usage then theoretically possible.
Did you use the correct executable (it ends with "P": " mcpotfit85P ")?
Try setting or changing the OMP_NUM_THREADS and OMP_SCHEDULE environment variable. Usually OMP_SCHEDULE=static should give good results. Also, try pinning the threads to specific cores, see for instance OMP_PROC_BIND and OMP_PLACES environment variables.
My results are garbage when using parallelization.
Is the PES routine thread-safe? Try the no-OMPpotential keyword in the RUN-SECTION.
I get different results for different runs with the same input.
This can have different reasons:
- If not set explicitly in the RUN-SECTION the seed for the random number generator is calculated from the current system time. This should give only slightly different results for different runs. Try setting a fixed seed with the iseed keyword in the RUN-SECTION. Keep in mind that when using MPI, each MPI rank gets a seed which is calculated from the original seed and the rank number. So the number of ranks should be the same for reproducing results.
- Is the PES routine thread-safe? If not, results for different runs should differ a lot. Try the no-OMPpotential keyword in the RUN-SECTION.
The natpot is not symmetric, but the routine is - and the symmetries are correctly implemented.
This can still happen. The most likely reasons are:
- If degenerate SPP are present in a mode (check natural populations in the output file), make sure all SPP from a degenerate block are taken into account. If not the complete block is excluded or included, the natpot cannot be symmetrized correctly. There should be a warning if only part of a degenerate block is included.
- Check if all symmetry operations of the point group are implemented. If one is missing, and extended grids are used, the mapping from the extended grid back into the original DVR grid may not be complete.
MC-Potfit won't compile with scalapack support.
Make sure the Scalapack libraries are present on the system. Depending on the installation Scalapack may come in two libraries libblacs.a and libscalapack.a or just in one library libscalapack.a. If BLACS is linked separately this library has to be added in compile.cnf with -lblacs to the MCTDH_SCALAPACK_LIBS variable. Also, make sure the $LIBRARY_PATH environment variable (or whatever search path the linker uses) is set correctly and the -DSCALAPACK compile flag is set.
If special (tuned) libraries are used please refer to the manual of those libraries on how to link.
(Block) conjugate gradients was converging at first but after a number of iterations the error grows again.
This can happen if the Monte-Carlo overlap matrix of the SPP is singular or "almost" singular. Conjugate gradients relies on a positive definite overlap matrix. Usually one has to use more sampling points in his case. Also, when Metropolis sampling is used, a higher temperature or adding a few percent uniform sampling can help.
(Block) conjugate gradients takes forever to converge.
In theory, conjugate gradients should converge after at most N iterations, where N is the product of all numbers of SPP divided by the number of points on the contracted mode (if there is one). For numerical calculations this becomes more like a rule of thumb because of round-off errors during the iterations. However, especially if the overlap matrix is ill-conditioned, this does not hold any more. Try the following:
- Try increasing the condition number by using a preconditioner
- Check the convergence of the SPP (see log file). Increase number of sampling points for the SPP if not converged
- Try to remove the least relevant SPP (see 'output' or 'log' file)
- Use more sampling points for the coefficients
- In case of Metropolis sampling use a higher temperature or add a few percent uniform sampling

Monte-Carlo Potfit Documentation

General Remarks

Grid-based symmetry operations

Example: grid-based symmetry operation (1)

Example: grid-based symmetry operation (2)

Coodinate-based symmetry operations

Example: coordinate-based symmetry operation (1)

Example: coordinate-based symmetry operation (2)

Periodic boundary conditions

Example: periodic boundary conditions

User-implemented symmetry operations

Example: symmetry operations for the Zundel cation (D2d)

Rules for the SYMMETRY-SECTION

OpenMP support

MPI support

ScaLAPACK support

External libraries

Example: symmetry operations for the Zundel cation (D_2d)