Welcome to ULaMDyn’s documentation!
ULaMDyn is a Python-based toolkit for data analysis of (nonadiabatic) molecular dynamics (MD) simulations, built on top of the pandas and scikit-learn libraries. It provides a set of classes and methods for preprocessing, statistical analysis, and unsupervised learning on MD trajectory data generated by the Newton-X program. ULaMDyn was designed to automate the search for hidden patterns in high-dimensional molecular data sets representing complex potential energy surfaces, thereby making nonadiabatic dynamics simulations easier to interpret and understand.
Features
Data curation:
Collect and process text outputs from multiple MD trajectories to construct structured data sets.
Facilitate data sharing by providing data sets in standard CSV format.
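The data curation idea above can be sketched with plain pandas. This is only an illustration and not ULaMDyn's own API: the two-column text layout, the `raw_outputs` dictionary, and the column names `traj`, `time`, and `energy` are assumptions made for the example, and real Newton-X outputs are richer than this.

```python
import io

import pandas as pd

# Hypothetical per-trajectory text outputs: time (fs) and energy (eV).
# The layout is invented for this sketch; it is not the Newton-X format.
raw_outputs = {
    1: "0.0 -1.20\n0.5 -1.18\n1.0 -1.15\n",
    2: "0.0 -1.21\n0.5 -1.19\n1.0 -1.17\n",
}

frames = []
for traj_id, text in raw_outputs.items():
    # Parse the whitespace-separated text of one trajectory.
    df = pd.read_csv(
        io.StringIO(text), sep=r"\s+", header=None, names=["time", "energy"]
    )
    # Tag every row with the trajectory it came from.
    df.insert(0, "traj", traj_id)
    frames.append(df)

# One structured data set covering all trajectories.
dataset = pd.concat(frames, ignore_index=True)

# Serialize to CSV text for sharing (pass a path to write a file instead).
csv_text = dataset.to_csv(index=False)
```

Keeping the trajectory index and time as explicit columns makes it straightforward to group the data later, either per trajectory or per time step.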
Statistical analysis:
Perform descriptive statistics by computing the median, mean, and standard deviation over all available MD trajectories.
Use the bootstrap algorithm to estimate the uncertainty of molecular properties at a given confidence level.
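As a rough sketch of the statistical analysis described above, the snippet below computes per-time-step descriptive statistics with pandas and a percentile bootstrap confidence interval for the mean with NumPy. The data are synthetic, and the helper `bootstrap_ci` and the column names are hypothetical; this does not show ULaMDyn's own interface.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic data set: 50 trajectories, 4 time steps each (assumed layout).
n_traj, n_steps = 50, 4
data = pd.DataFrame(
    {
        "traj": np.repeat(np.arange(n_traj), n_steps),
        "time": np.tile(np.arange(n_steps) * 0.5, n_traj),
        "energy": rng.normal(loc=-1.2, scale=0.05, size=n_traj * n_steps),
    }
)

# Median, mean and standard deviation over all trajectories at each time step.
stats = data.groupby("time")["energy"].agg(["median", "mean", "std"])


def bootstrap_ci(values, n_resamples=2000, confidence=0.95, rng=rng):
    """Percentile bootstrap confidence interval for the mean (hypothetical helper)."""
    values = np.asarray(values)
    # Resample with replacement: one row of indices per bootstrap replicate.
    idx = rng.integers(0, len(values), size=(n_resamples, len(values)))
    means = values[idx].mean(axis=1)
    alpha = 1.0 - confidence
    low, high = np.quantile(means, [alpha / 2, 1.0 - alpha / 2])
    return low, high


# Uncertainty of the mean energy at the first time step.
t0 = data.loc[data["time"] == 0.0, "energy"]
low, high = bootstrap_ci(t0)
```

The percentile method is only one of several bootstrap interval constructions; it is used here because it is the simplest to read.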
Dimensionality reduction:
Principal Component Analysis (PCA).
Kernel Principal Component Analysis (KPCA).
Isometric Mapping (Isomap).
t-distributed Stochastic Neighbor Embedding (t-SNE).
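The four methods listed above are all available in scikit-learn, so a minimal sketch of applying them to the same feature matrix might look like the following. The random matrix `X` stands in for molecular descriptors, and the hyperparameters (`n_neighbors`, `perplexity`, the RBF kernel) are arbitrary choices for illustration, not ULaMDyn defaults.

```python
import numpy as np
from sklearn.decomposition import PCA, KernelPCA
from sklearn.manifold import TSNE, Isomap
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Stand-in for a descriptor matrix: 60 geometries x 10 features (synthetic).
X = rng.normal(size=(60, 10))

# Standardize features so that no single descriptor dominates the distances.
X_scaled = StandardScaler().fit_transform(X)

reducers = {
    "pca": PCA(n_components=2),
    "kpca": KernelPCA(n_components=2, kernel="rbf"),
    "isomap": Isomap(n_components=2, n_neighbors=10),
    # The perplexity must be smaller than the number of samples.
    "tsne": TSNE(n_components=2, perplexity=10, random_state=0),
}

# Project the data onto two dimensions with each method.
embeddings = {
    name: reducer.fit_transform(X_scaled) for name, reducer in reducers.items()
}
```

PCA is linear, whereas KPCA, Isomap, and t-SNE can capture nonlinear structure, so comparing the embeddings side by side is a common way to judge which one suits a given data set.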
Pattern search in full-dimensional MD data:
K-Means clustering.
Hierarchical agglomerative clustering.
Spectral clustering.
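The clustering methods above can likewise be sketched directly with scikit-learn. The blobs generated here are synthetic stand-ins for groups of similar molecular geometries, and the number of clusters and other settings are assumptions made for the example rather than ULaMDyn's behavior.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans, SpectralClustering
from sklearn.datasets import make_blobs

# Synthetic data: 90 samples in 6 dimensions drawn around 3 centers.
X, _ = make_blobs(
    n_samples=90, centers=3, n_features=6, cluster_std=0.5, random_state=0
)

models = {
    "kmeans": KMeans(n_clusters=3, n_init=10, random_state=0),
    "hierarchical": AgglomerativeClustering(n_clusters=3, linkage="ward"),
    "spectral": SpectralClustering(
        n_clusters=3, affinity="nearest_neighbors", n_neighbors=10, random_state=0
    ),
}

# Assign a cluster label to every sample with each method.
labels = {name: model.fit_predict(X) for name, model in models.items()}
```

In practice the number of clusters is rarely known in advance, so it is usually scanned over a range and scored with a metric such as the silhouette coefficient.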
License
This package is freely available for use and distribution under the terms of the GNU General Public License (GPL), version 3.
Our team:
Light and Molecules group - Aix-Marseille University (AMU), France 🇫🇷
- Maintainers
Max Pinheiro Jr (AMU), max.pinheiro-jr@univ-amu.fr
- Contributors
Mariana Casal (AMU): Jupyter notebook tutorials
- Coordinator
Prof. Mario Barbatti (AMU)
We welcome contributions and feedback. Feel free to fork the repository and submit a pull request to the “development” branch on GitLab.