Welcome to ULaMDyn’s documentation!

ULaMDyn is a Python toolkit designed for advanced data analysis of nonadiabatic molecular dynamics (NAMD) simulations. Built on the Pandas and Scikit-Learn frameworks, this package offers powerful tools for preprocessing, statistical analysis, and unsupervised learning of molecular dynamics (MD) trajectory data generated by the Newton-X program. ULaMDyn was designed to automate the search and discovery of hidden patterns in high-dimensional molecular data sets representing complex potential energy surfaces, thereby enhancing the interpretability and understanding of nonadiabatic dynamics simulations.

General features

Data curation:
- Efficiently collects and organizes output files from multiple MD trajectories into structured datasets.
- Facilitates data sharing by exporting curated datasets in standard CSV format.
Statistical analysis:
- Provides a complete statistical description of the data by computing the median, mean, standard deviation, skewness, and kurtosis across all available NAMD trajectories.
- Estimates the uncertainty of molecular properties at a given confidence level using the bootstrap algorithm.
Normal mode analysis:
- Includes an integrated NMA module that identifies the key vibrational modes influencing the molecular dynamics through a principal component decomposition of the geometric displacements.
Ring-puckering analysis:
- Calculates Cremer-Pople puckering parameters of cyclic fragments, providing insights into molecular conformational dynamics under light exposure.
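
As a rough illustration of the statistical analysis described above, the following sketch computes per-time-step descriptive statistics and a percentile bootstrap confidence interval using only pandas and NumPy. It does not use ULaMDyn's own API: the data are synthetic, and the column names (`traj`, `time`, `energy_gap`) and the helper `bootstrap_ci` are hypothetical names chosen for this example.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a curated dataset: one row per (trajectory, time step).
# The column names are hypothetical and do not reflect ULaMDyn's actual schema.
rng = np.random.default_rng(seed=0)
n_traj, n_steps = 20, 50
df = pd.DataFrame({
    "traj": np.repeat(np.arange(n_traj), n_steps),
    "time": np.tile(np.arange(n_steps) * 0.5, n_traj),
    "energy_gap": rng.normal(loc=3.0, scale=0.4, size=n_traj * n_steps),
})

# Descriptive statistics over all trajectories at each time step.
stats = df.groupby("time")["energy_gap"].agg(
    median="median",
    mean="mean",
    std="std",
    skew="skew",
    kurtosis=lambda s: s.kurt(),
)


def bootstrap_ci(values, n_resamples=2000, confidence=0.95, rng=None):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values, dtype=float)
    # Resample with replacement: each row of `idx` is one bootstrap sample.
    idx = rng.integers(0, len(values), size=(n_resamples, len(values)))
    means = values[idx].mean(axis=1)
    alpha = 1.0 - confidence
    lower, upper = np.quantile(means, [alpha / 2, 1.0 - alpha / 2])
    return lower, upper


# Uncertainty of the ensemble-averaged property at the final time step.
final = df.loc[df["time"] == df["time"].max(), "energy_gap"]
lower, upper = bootstrap_ci(final, rng=rng)
print(f"mean = {final.mean():.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

The percentile method is only one of several bootstrap variants; it is used here because it is the simplest to read, not because it is known to be the one ULaMDyn implements.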

Unsupervised learning methods

In its current version, ULaMDyn provides a suite of linear and nonlinear unsupervised learning methods for dimensionality reduction and clustering analysis. With dimensionality reduction, the high-dimensional feature vectors representing molecular configurations are compressed into a few coordinates, enabling visual inspection of the underlying relationships and pattern discovery in molecular dynamics datasets. To search for patterns in the full-dimensional MD data, ULaMDyn implements a set of clustering algorithms that can be applied in two spaces: geometry space, where each point represents a specific molecular configuration sampled during the dynamics, and trajectory space, where each MD trajectory is treated as a multivariate time series. Clustering in trajectory space groups trajectories by the similarity of their temporal evolution.

Dimensionality reduction:
- Principal Component Analysis (PCA).
- Kernel Principal Component Analysis (KPCA).
- Isometric Mapping (Isomap).
- t-distributed Stochastic Neighbor Embedding (t-SNE).
Clustering methods:
- K-Means (geometries or trajectories).
- Gaussian Mixture Model (GMM).
- Hierarchical agglomerative clustering.
- Spectral clustering.
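
To make the distinction between geometry space and trajectory space concrete, here is a minimal sketch written directly against scikit-learn rather than ULaMDyn's own interface. The descriptor array is synthetic, and flattening each trajectory into a single vector is a simplifying assumption made for this example; it may differ from how ULaMDyn handles time series internally.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(seed=0)
n_traj, n_steps, n_feat = 12, 30, 6

# Synthetic descriptors with shape (trajectory, time step, feature).
# Half of the trajectories drift along feature 0 to mimic two behaviors.
X = rng.normal(size=(n_traj, n_steps, n_feat))
drift = np.linspace(0.0, 5.0, n_steps)
X[: n_traj // 2, :, 0] += drift

# Geometry space: every row is one molecular configuration.
geoms = X.reshape(n_traj * n_steps, n_feat)
geoms_scaled = StandardScaler().fit_transform(geoms)

# Dimensionality reduction: compress each configuration to two coordinates.
coords_2d = PCA(n_components=2).fit_transform(geoms_scaled)

# Clustering of individual configurations.
geom_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(geoms_scaled)

# Trajectory space: every row is one whole trajectory, with its
# multivariate time series flattened into a single vector (an assumption).
trajs = X.reshape(n_traj, n_steps * n_feat)
traj_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(trajs)

print("2D embedding shape:", coords_2d.shape)
print("labels per geometry:", geom_labels.shape[0])
print("labels per trajectory:", traj_labels.shape[0])
```

In geometry space the clustering returns one label per sampled configuration, whereas in trajectory space it returns one label per trajectory, which is what allows whole trajectories to be grouped by their time evolution.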

Getting Started:

Tutorial:

Source documentation:

ulamdyn package

License

This package is freely available for use and distribution under the terms of the GNU General Public License, version 3 (GPLv3).

Our team:

Light and Molecules group - Aix-Marseille University (AMU), France 🇫🇷

Maintainers:

Max Pinheiro Jr (AMU), max.pinheiro-jr@univ-amu.fr

Contributors:

Mariana Casal (AMU): Jupyter notebook tutorials
Bidhan Chandra Garain (AMU): SOAP descriptor interface

Coordinator:

Prof. Mario Barbatti (AMU)

We welcome contributions and feedback. Feel free to fork the repository and open a pull request against the “development” branch on GitLab.