Potfit/CANDECOMP Tools Documentation

General Remarks

The Potfit/CANDCOMP tools are a collection of programs to modify, convert, and analyse natpot, cpd, and hcpd files. The programs can be compiled using the shell script $MCTDH_DIR/bin/compile. Type:

compile progname

to produce the executable $MCTDH_DIR/bin/$MCTDH_PLATFORM/ progname<ver>, where progname is the name of one of the tool programs (e.g. natpot2cpd). Typing

compile opertools

will compile all the tool programs.

For more information on the compile script, see Compiling the programs.
NB: All programs are usually compiled when the package is installed.

Documentation of the Individual Programs

All program names have the version number appended to them, e.g. the natpot2cpd program from version 8 release 6 is called natpot2cpd86. The following programs exist:

chnpot : Interpolation of natpot/cpd terms to another grid.
cutcpd : Remove one or more DOF from a CPD file by freezing coordinates to a reference value.
compactoper : Reduces the rank of a given operator (oper or pes from mctdh) in sum-of-products form using and alternative least squares algorithm.
natpot2cpd : Converts a natpot file from potfit or mcpotfit into a cpd file using and alternative least squares algorithm.

chnpotver

  Purpose : Interpolation of natpot/cpd terms to another grid.
  Usage   : chnpot86 [-D -d -rd -w -mnd -nf -nd -cd -cf -ver -h -?] input


 Options  :
 -D PATH  : The output is directed to PATH.
            This option overwrites "name" in the run-section of the input file.
 -d DIR   : The old dvr is read from file DIR/dvr. This option overwrites
            "dvrdir" in the run-section of the input file.
 -rd DIR  : The new dvr is read from directory DIR. If DIR is missing,
            name is assumed. This option sets/overwrites
            "readdvr" in the run-section of the input file.
 -w       : Overwrite existing files in name directory.
 -mnd     : Make Name Directory. Creates the name directory, if not existent.
 -cpd     : Use a cpd rather than a natpot file. (Obsolete, use -cf or -cd).
 -nf FILE : A natpot is read from file FILE. This option overwrites
            "natpotfile" in the run-section of the input file.
 -nd DIR  : A natpot is read from file DIR/natpot. This option overwrites
            "natpotdir" in the run-section of the input file.
 -cf FILE : A cpd is read from file FILE. This option overwrites
            "cpdfile" in the run-section of the input file.
 -cd DIR  : A cpd is read from file DIR/cpd. This option overwrites
            "cpddir" in the run-section of the input file.
 -ver     : Version information about the program
 -h       : Print this help text.
 -?       : Print this help text.

  The argument "input" is the file containing the Run-Section, the new
  Primitive-Basis-Section, and the Fit-Section. Options overwrite settings in input file.
------------------------------------------------------------------------------

The program interpolates natural potential terms or cpd terms calculated by programs potfit/mcpotfit or mccpd from one grid to another. The program reads the old grid from "dvr" file. The input file consist of three sections, a Run-Section, a Primitive-Basis-Section and a Fit-Section (see below) and the final keyword "end-input".

In Run-Section following keywords are possible:

Keywords of chnpot RUN-SECTION
Keyword	compulsory / optional	Description	equivalent option
name = S	C	The name of the calculation. A directory with path S is required in which files will be written.	-D
dvrdir = S	C	The old DVR file is located in directory S.	-d
natpotdir = S	O	The old natpot file is located in directory S.	-nd
natpotfile = S	O	The path of the old natpot file is S.	-nf
cpddir = S	O	The old cpd file is located in directory S.	-cd
cpdfile = S	O	The path of the old cpd file is S.	-cf
readdvr = S	O	The new DVR should be read from directory S. Without argument, S = name is assumed. Without this keyword the new dvr file is created in the name directory. A Primitive-Basis-Section is then required.	-rd
overwrite	O	Files in name directory will be overwritten.	-w
If neither of the keywords natpotdir natpotfile, cpddir, cpdfile is given, then natpotdir=dvrdir is assumed. Only one of the above four keywords must be given.

The Primitive-Basis-Section is as in MCTDH.

In the Fit-Section the parameters for interpolation are given. For translational degrees of freedom one can use cubic (bicubic) spline interpolation or ENO (Essentially Non Oscillatory) interpolation. For angular degrees of freedom (in the case of periodicity) one can select between cosine, sine or fourier interpolation. At present, spline interpolation is implemented for combined modes with an arbitrary number of degrees of freedom while cosine, sine and fourier interpolation are restricted to two dimensions. For more the two dimensions, the spline interpolation is applied recursively to the underlying one-dimensional grids. Note, that the interpolation method should be defined for each degree of freedom, and not for each mode. If grid points for some degree of freedom have not changed, it can be omitted.

The keyword "spline" ("spl") is used for spline interpolation. ENO interpolation is currently implemented for non-combined modes. It consists basically on a polynomial interpolation where the points from the original data set used to obtain the interpolant function are automatically selected to fulfill some smooth criterion. The algorithm will provide a continuous non-oscillatory interpolation near regions with sudden changes in slope, as encountered for example when an energy cut is found in the original data. The formulation of the algorithm can be found in J. Comput. Phys. 71, (1987), 231-303. ENO interpolation is requested using the keyword eno, which can optionally be followed by the keywords rmax=integer, maximum degree of interpolation (default 3), rmin=integer, minimum degree of interpolation (default 1) and dfac=real, weight factor controlling how preferential is the symmetric interpolation with respect to the asymmetric one (default 50.0). The ENO implementation is a new feature and by now is rather experimental. Use it with care. The defaults should provide a reasonable behavior of the interpolator, specifically, setting a higher maximum order or lowering dfac too much may result in a noisy potential energy function at the end. Increasing too much dfac (say, above 200) will tight the interpolator always to a symmetric interpolation, thus not being able to avoid discontinuities. A good idea would be to compare it for a given case to the corresponding spline interpolation and pick up the most satisfactory result.

The keywords "cosine" ("cos"), "sine" ("sin") and "fourier" ("ft") are used for the Fourier-interpolation of angular degrees of freedom. (NB: "ft" or (equivalently) "fourier" means that sine and cosine functions are used). The keyword "sym" after "sin", "cos" and "ft" can be used, indicating that only even functions (i.e. Cos2x, Cos4x, ...) should be used. The angular fit procedure is described in JCP 104,(1996) 7974. The parameter M (default is max(50,2*gdim)) can be given as a keyword following the sin, cos, or ft keyword (see Eq.(37) in JCP 104,(1996) 7974). Finally, the keyword "period=xx" defines the period of the potential curve, for which xx="2pi/N" or "P" format can be used, where N is an integer and P is a real number. If the keyword period is missing, period=2pi is taken as default. If the variable under discussion is not an angle, period should be set to the interval of periodicity. If the potential is not periodic, period should be set to twice the grid length.

The order of the arguments which follow the method keyword is arbitrary.
Here is an example of a Fit-Section:

Fit-Section

 R        spline

 r        fourier 120  period=7.85

 q        eno  rmax=3   dfac=60

 Q        eno  dfac=70  rmin=1

 theta    cosine   60    period=2pi/2

 phi      fourier  sym   period=2pi

end-fit-section

Here it was assumed that V(theta=0) = V(theta=pi), e.g. H₂O in Jacobian coordinates. If this periodicity is not given, e.g. H₂O in valence coordinates, period=2pi should be used.

Note, that the interpolation method for the degrees of freedom in combined mode should be the same (for angular modes cosine, sine and fourier can be mixed). Some parameters of the calculation are written to the "chnpot.log" file. The program writes the new natural potentials file "natpot" and the "dvr" file (if the option -rd or keyword readdvr in Run-Section are not used) to the name directory. For over-writing of the natpot file use "-w" option. For an example input file see chnpot.inp

Example: chnpot86 -w -rd input

Note that chnpot may be used to change the modelabels. Chnpot maps the n-th old DOF onto the n-th new DOF, the modelabels are not used. If you just want to change the modelabels without re-fitting the potential, simply set up the PRIMITIVE-BASIS-SECTION of the chnpot input file using the new modelabels but otherwise the old basis definitions. A FIT-SECTION is not needed in this case.

cutcpdver

Purpose: read a cpd file and produce a cut through a
         pre-defined reference point. Produces a new
         cpd file that only depends on coordinates
         that are not part of the reference geometry.
         Modes that only contain reference points
         (i.e., which are constant) are removed from
         the cpd file.

  Usage:  cpdcut<vers><d/D>< [-h|-?|-ver] [-w -mnd -D] <inp>
          where <inp> is an input file.

Options:
  -h      : Print this help-text.
  -?      : Print this help-text.
-ver      : Version information.
  -w      : Allow overwrite.
-mnd      : Make Name Directory. Creates the name directory, if not existent.
-D <name> : Write output to <name>. (Ignore <name> in input file.)

 Remark:  It may be beneficial to run compactoper
          on the newly created cpd file.

The program needs an input file containing a RUN-SECTION and a REFERENCE-SECTION, specifying the source cpd and dvr file as well as the reference geometry, respectively. A new CPD file will be produced where all coordinates specified in the reference section are fixed to the reference geometry such that the resulting CPD file only contains the remaining DOF. Combined modes that are fixed to their reference geometry are removed from the CPD file. If only one combined mode remains, then the resulting CPD will contain only one term.

In Run-Section following keywords are possible:

Keywords of cutcpd RUN-SECTION
Keyword	compulsory / optional	Description	equivalent option
name = S	C	The name of the calculation. A directory with path S is required in which files will be written.	-D
readcpd = S	C	The input cpd-file to be cutted is read from directory S
readdvr = S	O	The DVR file associated with the input CPD file should be read from directory S. Default: same directory as the input CPD file.
overwrite	O	Files in name directory will be overwritten.	-w

The REFERENCE-SECTION must contain the keywords begin-reference and end-reference enclosing the definitions of reference points. The reference points themselves are specified one coordinate per line, starting with the coordinate label of the primitive DOF that is to be fixed to a reference value. The reference value itself can be defined in two ways: index-based or value-based.

Index-based reference point

Here one specifies the number of the DVR point that defines the reference geometry using square brackets enclosing an integer number after the coordinate label. For instance,

    begin-reference
      theta[7]
      R[4]
    end-reference

will fix the DOF theta to the value of the 7th grid point and the DOF R to the value of the 4th grid point.

Value-based reference point

Here one sprecifies the value of the DVR point that defines the reference geometry using the equal sign (=) and a floating point number. The program will then find the closest DVR point to the given value and use this as reference value. For instance,

    begin-reference
      theta = 3.14159
      R = 4.6
    end-reference

will fix the DOF theta to the value of the DVR point closest to 3.14159 and the DOF R to the value of the grid point closest to 4.6. The final values and grid points can be found in the log file.

Mixing value-based and index-based reference points

It is, of course, possible to mix the tho kinds of definitions:

    begin-reference
      theta = 3.14159
      R[4]
    end-reference

will fix the DOF theta to the value of the DVR point closest to 3.14159 and the DOF R to the value of the 4th grid point.

An example file can be found in $MCTDH_DIR/pinputs/cutcpd_zundel.inp

compactoperver

  Purpose: Read an MCTDH oper or pes file and convert to a
           Hamiltonian canonical polyadic decomposition (HCPD)
           of a canonical polyadic decomposition (CPD), respectively,
           which is subsequently optimized using ALS.

  Usage:  compactoper<vers><d/D><P> [-h|-?]  [-ver -o -l -c -C -CC
                                 -r -R -s -n -g -w -spd -ev -d -v -D]
   vers:  Program version (e.g. 86 for version 8.6)
    d/D:  Indicates a debug version

 optn: -h           : Print this help-text.
       -?           : Print this help-text.
       -ver         : Version information.

       -mnd         : Make Name Directory.Creates the name directory, if not existent.
       -w           : Allow overwrite.
       -c           : Continue obtimizing an existing HCPD file.
       -C           : Same as -c but reset iteration counter.
       -CC dir      : Same as -C but reads the initial CPD/HCPD-file from directory dir.
       -pes         : The operator is a potential and read from
                      a "pes" file instead of an "oper" file
                      The resulting file will be a "cpd" file rather
                      than a "hcpd" file.
       -r int       : Rank of the resulting CPD/HCPD-fit.
                      Default: half the number of terms
                      of the original read-in operator.'
       -H           : Enforce HCPD fit to be Hermitian
                      (requires rank to be an even number).
       -O ham       : Use Hamiltonian ham from the oper file (default ham="system")
       -a real      : Regularization, default: sqrt(machine_precision).
       -s real [,S] : Stop if |oper-fit| is less than real.
                      Default: not set. real may bear a
                      unit S [au, eV, meV, cm-1]. default a.u.
       -o           : Write overlap matrix of the fit basis on exit.
                      Default: not set.
       -l           : Write coefficients of the fit on exit.
                      Default: not set.
       -z err       : Compress with error err before optimization.
                      Compression is done using an SVD of the single-
                      particle operators, err is the sum of the
                      squared singular values that are neglected
                      devided by the sum of all squared singular values.
       -spd real    : Solve ALS equations using Cholesky decomposition with
                      uniform regularization on the diagonal.
                      If neither -spd nor -ev is set, then -spd with
                      regularization=sqrt(machine_precision) is assumed.
       -ev real     : Solve ALS using eigen-decomposition with
                      regularization of smallest eigenvalues with real.
       -n int       : Maximum number of iterations of ALS optimizer, default 1000.
       -g int int1  : Do not start with full rank but with rank int
                      and increase every int1 iterations by the same amount.
                      If an initial guess is read from file the starting rank
                      will be that of the fit from file but the rank increment
                      will be as given by int1.

       -I dir       : Read operator file from dir. Default: current directory
       -J dir       : Read dvr file from dir. Default: same as oper file.
       -D name      : Write output to name. Default: current directory.

       -noblas      : This program relies heavily on being linked to an
                      optimized and parallel BLAS/LAPACK installation.
                      Use this flag if you compiled with OpenMP support.
                      but linked to the default BLAS/LAPACK library.

      IMPORTANT:
        Using a parallel BLAS/LAPACK library is usually sufficient
        for a good performance. Using OpenMP on top of this for
        other parts of the program may lead to some speed up
        especially for large fit and operator ranks (it may
	however also slow the program down in some cases).

Important remark 1
This program should in general not be used to reduce the rank of kinetic energy operators. Kinetic energy operators contain mostly unit operators and other diagonal operators which is exploited in MCTDH. The output of compactoper for general operators is a hcpd file that contains matrix operators for all modes and all terms in the operator. Using this program therefore only makes sense for operators where all single-particle operators are matrices in the first place (for instance second quantized operators) or if all single-particle operators are diagonal in the first place (for instance a potential, see -pes option). Mixed operators with diagonal and non-diagonal single-particle operators usually lead to poor performance in MCTDH.

Important remark 2
As mentioned in the help text this program relies on being linked to optimized and parallel BLAS/LAPACK libraries. A number of such libraries are available for free, for instance OpenBLAS, ATLAS or Intel oneMKL. To add optimized BLAS/LAPACK libraries edit the file $MCTDH_DIR/install/compile.cnf. Locate the section of the compiler you are using and within this section locate the lines "EXTERNAL_BLAS=..." and "EXTERNAL_LAPACK=...". Replace the entries with the link line for your optimized BLAS/LAPACK library. For instance:

    EXTERNAL_BLAS=-L/path/to/library -lmyblas
    EXTERNAL_LAPACK=-L/path/to/library -lmylapack

Then compile with

    compile -x lapack compactoper

Some libraries may need more complicated link lines. See for instance Intel link line advisor. Also special environemnt variables may be required.

natpot2cpdver

  Purpose: Read a natpot file and convert to a
         canonical polyadic decomposition (CPD)
         which is subsequently optimized using ALS.

  Usage:  natpot2cpd<vers><d/D><P> [-h|-?]  [-ver -o -l -c -C -CC
                                 -r -R -s -n -g -w -spd -ev -d -v -D -DD]
   vers:  Program version (e.g. 86 for version 8.6)
    d/D:  Indicates a debug version
      P:  Compiled with OpenMP support.

 optn: -h           : Print this help-text.
       -?           : Print this help-text.
       -ver         : Version information.

       -mnd         : Make Name Directory.Creates the name directory, if not existent.
       -w           : Allow overwrite.
       -c           : Continue obtimizing an existing CPD file.
       -C           : Same as -c but reset iteration counter.
       -CC dir      : Same as -C but reads the initial CPD-file from directory dir.
       -r int       : CPD rank, default: sum of SPP of natpot
                      or rank of read-in CPD.
       -R           : Initial guess of CPD are random numbers,
                      default: pick largest coeff and SPP
                      from natpot or use read-in CPD.
       -a real      : Regularization, default:  sqrt(machine_precision).
       -s real[,S]  : Stop if |natpot-cpd| is less than real.
                      Default: not set. real may bear a
                      unit S [au, eV, meV, cm-1]. default a.u.
       -o           : Write overlap matrix of the cpd on exit.
                      Default: not set.
       -l           : Write coefficients of the cpd on exit.
                      Default: not set.
       -spd real    : Solve ALS using Cholesky decomposition with
                      uniform regularization on the diagonal.
                      If neither -spd nor -ev is set, then -spd with
                      regularization=sqrt(machine_precision) is assumed.
       -ev real     : Solve ALS using eigen-decomposition with
                      regularization of smallest eigenvalues.
       -d real[,S]  : Use real as error-estimate when de-contracting
                      the natpot D-tensor. real may bear a
                      unit S [au, eV, meV, cm-1]. default a.u.
       -n int       : Maximum number of iterations of ALS optimizer, default 1000.
       -g int int1  : Do not start with full rank but with rank 
                      and increase every int1 iterations by the same amount.
                      If -g is set, -r must be set too, to set a maximal rank.
       -v           : If a vpot file is present, calculate the
                      error of the cpd against the vpot in the end.

       -I dir       : Read natpot and dvr from dir. Default: current directory
       -D name      : Write output to name. Default: current directory.
       -DD name     : Similar to -D, but additionally drops
                      the prefix np2cpd from created files.

        The program supports OpenMP (compile with -P option)

The program reads a natpot file and converts it to canonical polyadic decomposition (CPD) format. The CPD file will use the same mode-combinations and mode-labels as the natpot. In a first step, an initial guess in CPD format is created, depending on the given command line options from the given natpot file or simply from random numbers. Subsequently, an optimization procedure is started to iteratively improve the quality of the CPD fit compared to the natpot file using alternating least squares (ALS).

Several output files will be written to the name-directory:

cpd The output file containing the potential in CPD format. Can be used similar to natpot files. See Hamiltonian input documentation.
np2cpd.log The log file with various information.
np2cpd_iterations Fit error as a function of the iterations of iterations. Also contains the condition number of the problem and, if -spd is set, total functional that is being optimized, including the regularization part.
np2cpd_timing Information in time spend in various subroutines.
np2cpd_stop Stop file. Can be used to stop the ALS optimizer prematurely. The iterations are stopped if np2cpd_stop is removed or does not contain the word "continue" as its first entry.
als_eigenval_mode_m If -ev is set, these files contain the eigenvalues of the problem solved during the ALS iteration for mode number m. This file is only produced in last iteration step.
cpdovlp If option -o is given this file contains the overlap matrix of the cpd basis functions.
cpdcoeff If option -l is given this file contains the coefficients of the cpd basis functions (the core tensor).

Examples
natpot2cpd86 -r 30
natpot2cpd86 -mnd -DD CPD.r100 -r 100 -n 25000 -g 10 1000 -v -l
natpot2cpd86 -mnd -DD CPD.r200 -CC CPD.r100 -r 200 -n 12000 -v -l -ev 1.d-10

The use of the -g option is recommended. Using the -ev option often provides a better fit, but at higher computational costs. The smaller the regularization parameter, the larger become the coefficients (CPD core-tensor) and hence the condition number. The latter should not exceed the value of 1000.

The program supports parallelization with OpenMP. We recommend compilation with the -P option. Furthermore, as several liner algebra operations are performed, linking to an optimized Lapack package is of advantage (needs editing of compile.cnf).
For instance: compile -P -x lapack natpot2cpd
See also remark 2 of compactoper.