Wrapper program
ULaMDyn provides a set of predefined functions, accessible through a command-line interface, that chain common operations into a workflow to simplify and speed up the analysis of MD trajectories. After installation, a wrapper script called run-ulamdyn can be executed from a (Linux) terminal, with program options either passed explicitly on the command line or given in a configuration file named “config.txt”.
In this section, we will guide you through the basic steps required to run ULaMDyn through its command-line interface.
Step 1: create geom.xyz file
The wrapper script should be executed from the TRAJECTORIES directory of the Newton-X NAMD simulations. In the TRAJECTORIES folder you must also create an XYZ geometry file named “geom.xyz” containing a reference geometry for the system (typically the ground-state geometry). This geometry is used as the reference for computing the RMSD of every geometry read from each step of the MD trajectories. The RMSD is a useful and natural metric for quantifying how much the molecular geometry is distorted during the dynamics. The RMSD values can also be used as a distance measure in unsupervised learning algorithms.
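As an illustration of the file format and of the metric, the sketch below writes and reads a standard XYZ file (atom count, comment line, then one "symbol x y z" line per atom) and computes a plain RMSD between two geometries. The function names write_xyz, read_xyz, and rmsd are hypothetical helpers, not part of ULaMDyn, and the RMSD here is computed without any translational or rotational alignment; whether and how ULaMDyn aligns structures before computing the RMSD is not stated in this section.

```python
import math


def write_xyz(path, symbols, coords, comment="reference geometry"):
    """Write a standard XYZ file: atom count, comment line, one atom per line."""
    with open(path, "w") as fh:
        fh.write(f"{len(symbols)}\n{comment}\n")
        for sym, (x, y, z) in zip(symbols, coords):
            fh.write(f"{sym:<2s} {x:14.8f} {y:14.8f} {z:14.8f}\n")


def read_xyz(path):
    """Read the first geometry of an XYZ file into (symbols, coords)."""
    with open(path) as fh:
        lines = fh.read().splitlines()
    n_atoms = int(lines[0])
    symbols, coords = [], []
    # Line 0 is the atom count and line 1 the comment; atoms start at line 2.
    for line in lines[2:2 + n_atoms]:
        parts = line.split()
        symbols.append(parts[0])
        coords.append(tuple(float(v) for v in parts[1:4]))
    return symbols, coords


def rmsd(coords, ref):
    """Plain RMSD between two geometries (assumption: no alignment is applied)."""
    if len(coords) != len(ref):
        raise ValueError("geometries must have the same number of atoms")
    squared = sum(
        (a - b) ** 2 for p, q in zip(coords, ref) for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(ref))
```

For example, calling write_xyz("geom.xyz", symbols, coords) with the ground-state atoms and coordinates produces a file in the layout expected for an XYZ geometry, and rmsd(step_coords, ref_coords) returns zero when a trajectory step coincides with the reference.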
Step 2: run ULaMDyn Wrapper
Once you have moved to the TRAJECTORIES directory to be analyzed and created the geom.xyz file as described in Step 1, you can run the wrapper script in the terminal using one of the following methods:
method 1: command-line parser
In this case, the sequence of analysis tasks to be executed by the wrapper script is given on the command line after the program name, as shown below:
$ run-ulamdyn --save_dataset=all --create_stats=all
By running this command, ULaMDyn will first create several structured datasets (flattened XYZ coordinates and Z-matrices, QM properties…) in CSV format, with the information collected from the output files of the different trajectories stacked in the same dataset. Next, the program computes basic descriptive statistics (mean, median, and standard deviation) for each of the full datasets and exports the results as separate CSV files. If more options are passed on the command line, they are included in the wrapper workflow and executed accordingly.
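To make the "stacked dataset plus descriptive statistics" idea concrete, here is a minimal sketch using only the Python standard library. The column names (TRAJ, time, rmsd), the numbers, and the describe helper are all made up for illustration; the actual column layout of the CSV files written by ULaMDyn may differ, and the sample standard deviation used here is an assumption about the convention.

```python
import csv
import io
import statistics


def describe(rows, columns):
    """Return mean, median, and sample standard deviation for each column."""
    summary = {}
    for col in columns:
        values = [float(row[col]) for row in rows]
        summary[col] = {
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            # Sample standard deviation (n - 1 denominator); an assumption.
            "std": statistics.stdev(values) if len(values) > 1 else 0.0,
        }
    return summary


# Hypothetical stacked dataset: rows from two trajectories in one table.
STACKED_CSV = """TRAJ,time,rmsd
1,0.0,0.00
1,0.5,0.10
2,0.0,0.00
2,0.5,0.30
"""

rows = list(csv.DictReader(io.StringIO(STACKED_CSV)))
stats = describe(rows, ["rmsd"])
```

Stacking all trajectories in one table means a single pass over each column is enough to obtain ensemble-level statistics, while a trajectory identifier column still allows per-trajectory filtering.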
To check which command-line options are available, you can run the program without any options or with the --help (or -h) flag. The help output, as well as a simple example of the --save_dataset option, is shown below:
method 2: configuration file
Alternatively, you can create a configuration file in the TRAJECTORIES directory that will be read by the wrapper script. This “config.txt” file should contain the same options as would be given on the command line. In this case, you simply run run-ulamdyn in the terminal to perform the data analysis:
## Content of the config.txt file
--save_dataset=all
--create_stats=all
--save_xyz=hops,S21
clustering
--space=geoms
--method=kmeans
--n_clusters=3
--descriptor=inv-R2
--n_samples=1000
--data_scaler=standard
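The relationship between the two methods can be sketched as follows: each non-empty line of the configuration file corresponds to one token of the equivalent command line. The helper config_to_args below is hypothetical and only illustrates that correspondence; it assumes that lines starting with "#" are comments, and it does not reproduce how ULaMDyn actually reads config.txt.

```python
import shlex


def config_to_args(text):
    """Turn config-file text into a list of command-line tokens.

    Assumptions (not confirmed by the ULaMDyn documentation): one token per
    line, blank lines are ignored, and lines starting with '#' are comments.
    """
    args = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        args.append(line)
    return args


CONFIG_TEXT = """## Content of the config.txt file
--save_dataset=all
--create_stats=all
--save_xyz=hops,S21
clustering
--space=geoms
--method=kmeans
--n_clusters=3
--descriptor=inv-R2
--n_samples=1000
--data_scaler=standard
"""

args = config_to_args(CONFIG_TEXT)
# Equivalent single-line invocation, built only from the tokens above.
command = shlex.join(["run-ulamdyn", *args])
```

Keeping the options in a file makes an analysis easy to rerun and to share alongside the trajectories, whereas the command-line form is convenient for quick one-off tasks.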