Welcome to ULaMDyn’s documentation!

ULaMDyn is a Python-based toolkit for analyzing data from (nonadiabatic) molecular dynamics (MD) simulations, built on top of the pandas and scikit-learn libraries. It provides a set of classes and methods for preprocessing, statistical analysis, and unsupervised learning on MD trajectory data generated by the Newton-X program. ULaMDyn is designed to automate the search for hidden patterns in high-dimensional molecular data sets representing complex potential energy surfaces, thereby making nonadiabatic dynamics simulations easier to interpret and understand.

Features

  • Data curation:

    • Collect and process text outputs from multiple MD trajectories to construct structured data sets.

    • Facilitate data sharing by providing data sets in standard CSV format.

  • Statistical analysis:

    • Compute descriptive statistics (median, mean, and standard deviation) over all available MD trajectories.

    • Estimate the uncertainty of molecular properties at a given confidence level using the bootstrap algorithm.

  • Dimensionality reduction:

    • Principal Component Analysis (PCA).

    • Kernel Principal Component Analysis (KPCA).

    • Isometric Mapping (Isomap).

    • t-distributed Stochastic Neighbor Embedding (t-SNE).

  • Pattern search in full-dimensional MD data:

    • K-Means clustering.

    • Hierarchical agglomerative clustering.

    • Spectral clustering.
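
The bootstrap uncertainty estimate listed under Statistical analysis above can be illustrated with a minimal sketch. It uses only the Python standard library and a percentile bootstrap for the mean. It is not ULaMDyn's own interface: the function name `bootstrap_mean_ci`, its parameters, and the sample values are assumptions made up for this illustration.

```python
import random
import statistics


def bootstrap_mean_ci(values, confidence=0.95, n_resamples=2000, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `values`.

    Hypothetical helper for illustration only; not part of ULaMDyn.
    Returns (sample_mean, lower_bound, upper_bound).
    """
    if not values:
        raise ValueError("values must not be empty")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    # Cut off equal tails on both sides of the sorted resampled means.
    alpha = 1.0 - confidence
    lo_idx = int((alpha / 2) * (n_resamples - 1))
    hi_idx = int((1 - alpha / 2) * (n_resamples - 1))
    return statistics.fmean(values), means[lo_idx], means[hi_idx]


# Made-up data: one property value per trajectory at a fixed time step
# (for example, an energy gap in eV).
gaps = [2.91, 3.05, 2.78, 3.12, 2.96, 3.01, 2.84, 3.20, 2.89, 3.07]
mean, lower, upper = bootstrap_mean_ci(gaps)
print(f"mean = {mean:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

In a trajectory analysis, the same resampling would typically be repeated at every time step, with each trajectory contributing one value per step, so that the interval reflects the spread across the ensemble.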

License

This package is freely available for use and distribution under the terms of the GNU General Public License (GPL), version 3.

Our team:

Light and Molecules group - Aix-Marseille University (AMU), France 🇫🇷

Maintainers
Contributors
  • Mariana Casal (AMU): Jupyter notebook tutorials

Coordinator
  • Prof. Mario Barbatti (AMU)

Contributions and feedback are welcome. Feel free to fork the repository and open a pull request against the “development” branch on GitLab.