{ "cells": [ { "cell_type": "markdown", "id": "23347dcb", "metadata": {}, "source": [ "# Hopping Parameter Extraction\n", "\n", "You can download the source files for this tutorial from this [link](https://drive.google.com/file/d/1_gMT74f_1PqxQ8Um1-y_i11Y2tKg7-3f/view?usp=drivesdk) (26 GB).\n", "\n", "```{note}\n", "This is the same example set used in the **Hopping Parameter Extraction** and **Mean Square Displacement** sections of the `CLI Tutorial` documentation. If you have already completed either of those tutorials, you do not need to download the files again.\n", "```\n", "\n", "---\n", "\n", "Calculating effective hopping parameters relies on the `Calculator()` factory function and the `Site` class. For a detailed explanation of these core components, please refer to the previous tutorial.\n", "\n", "First, navigate to the `Example3` directory you downloaded. Inside, you will find three items: the `TRAJ_TiO2` directory, a `POSCAR_TiO2` file, and a `neb.csv` file.\n", "\n", "The `TRAJ_TiO2` directory contains a thermal ensemble of HDF5 trajectory files from MD simulations performed at various temperatures. This example includes simulations from 1700 K to 2100 K in 100 K steps, with 20 independent runs at each temperature.\n", "\n", "The file structure is as follows:\n", "\n", "```bash\n", "TRAJ_TiO2/\n", "├── TRAJ_1700K/\n", "│   ├── TRAJ_O_01.h5\n", "│   ├── ...\n", "│   └── TRAJ_O_20.h5\n", "├── TRAJ_1800K/\n", "│   ├── TRAJ_O_01.h5\n", "│   ├── ...\n", "│   └── TRAJ_O_20.h5\n", "├── TRAJ_1900K/\n", "│   ├── TRAJ_O_01.h5\n", "│   ├── ...\n", "│   └── TRAJ_O_20.h5\n", "├── TRAJ_2000K/\n", "│   ├── TRAJ_O_01.h5\n", "│   ├── ...\n", "│   └── TRAJ_O_20.h5\n", "└── TRAJ_2100K/\n", "    ├── TRAJ_O_01.h5\n", "    ├── ...\n", "    └── TRAJ_O_20.h5\n", "```\n", "\n", "A key advantage of `VacHopPy` is its ability to process a large number of trajectories from multiple NVT ensembles simultaneously and in a memory-efficient way."
] }, { "cell_type": "code", "execution_count": 1, "id": "96c12618", "metadata": {}, "outputs": [], "source": [ "import os\n", "import numpy as np\n", "import pandas as pd\n", "from vachoppy.core import Site, Calculator\n", "\n", "# Trajectory bundle and the vacancy-free reference structure\n", "path_traj = 'TRAJ_TiO2'\n", "path_structure = 'POSCAR_TiO2'\n", "\n", "# Warn early if either input is missing from the working directory\n", "for path in (path_traj, path_structure):\n", "    if not os.path.exists(path):\n", "        print(f\"{path} not found.\")" ] }, { "cell_type": "markdown", "id": "ff28ce42", "metadata": {}, "source": [ "---\n", "## Usage Example\n", "\n", "### - Creating the `Site` and `CalculatorEnsemble` Instances\n", "\n", "The first step in our analysis is to define the crystal's structural framework by creating a `Site` instance. This object analyzes a **perfect, vacancy-free supercell** to identify all **lattice sites** and **hopping paths**.\n", "\n", "Next, we pass this `Site` object to the `Calculator()` factory function, which here yields the `CalculatorEnsemble` instance. The function automatically collects all trajectories found in `path_traj` and, unless told otherwise, estimates a suitable `t_interval` for them."
] }, { "cell_type": "code", "execution_count": 2, "id": "8a70cf94", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "====================================================================\n", " Automatic t_interval Estimation\n", "====================================================================\n", " [1700.0 K] Estimating from TRAJ_1700K/TRAJ_O_01.h5\n", " -> t_interval : 0.075 ps\n", " [1800.0 K] Estimating from TRAJ_1800K/TRAJ_O_01.h5\n", " -> t_interval : 0.075 ps\n", " [1900.0 K] Estimating from TRAJ_1900K/TRAJ_O_01.h5\n", " -> t_interval : 0.075 ps\n", " [2000.0 K] Estimating from TRAJ_2000K/TRAJ_O_01.h5\n", " -> t_interval : 0.075 ps\n", " [2100.0 K] Estimating from TRAJ_2100K/TRAJ_O_01.h5\n", " -> t_interval : 0.075 ps\n", "====================================================================\n", " Adjusting t_interval to the nearest multiple of dt\n", "====================================================================\n", " - dt : 0.0020 ps\n", " - Original t_interval : 0.0750 ps\n", " - Adjusted t_interval : 0.0740 ps (37 frames)\n", "====================================================================\n" ] } ], "source": [ "site = Site(path_structure, 'O')\n", "calc_ensemble = Calculator(path_traj, site)" ] }, { "cell_type": "markdown", "id": "ecd8b508", "metadata": {}, "source": [ "When `path_traj` points to a directory, `Calculator()` automatically discovers HDF5 files within it. By default, this search extends down to two subdirectory levels (a depth of 2). 
You can control this search depth with the `depth` argument if your files are nested differently.\n", "\n", "---\n", "### - Calculating Hopping Parameters\n", "\n", "Once the `CalculatorEnsemble` is configured, call its `.calculate()` method to run the main analysis.\n", "\n", "This method is designed to handle very large datasets, which it achieves in two ways:\n", "\n", "* **Memory Efficiency**\n", "\n", "  It processes trajectories in a **streaming** fashion, loading data in small chunks to minimize RAM usage.\n", "\n", "* **Speed**\n", "\n", "  It uses **parallel processing** to run computations on multiple CPU cores simultaneously, which shortens the analysis time. You can set the number of cores with the `n_jobs` argument; `n_jobs=-1` uses all available cores." ] }, { "cell_type": "code", "execution_count": 3, "id": "209c110d", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "317674348d6241e0b2b7bf9f0e5ec249", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Analyze Trajectory: 0%| | 0/100 [00:00, ?it/s]" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "Analysis complete: 100 successful, 0 failed.\n", "Execution Time: 16.900 seconds\n", "Peak RAM Usage: 1.153 GB\n" ] } ], "source": [ "calc_ensemble.calculate()" ] }, { "cell_type": "markdown", "id": "7b4376af", "metadata": {}, "source": [ "This method processes all HDF5 files found within the `path_traj` directory, calculating the vacancy hopping properties for each one. 
The results from each individual file are then stored in the `.calculators` attribute as a list of `CalculatorSingle` objects.\n", "\n", "You can inspect the first few results to see which files were processed:" ] }, { "cell_type": "code", "execution_count": 4, "id": "9e840c5e", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of Data: 100\n", "\n", "[Index 0] : /home/jty/Examples/Example3/TRAJ_TiO2/TRAJ_1700K/TRAJ_O_01.h5\n", "[Index 1] : /home/jty/Examples/Example3/TRAJ_TiO2/TRAJ_1700K/TRAJ_O_02.h5\n", "[Index 2] : /home/jty/Examples/Example3/TRAJ_TiO2/TRAJ_1700K/TRAJ_O_03.h5\n" ] } ], "source": [ "print(f\"Number of Data: {len(calc_ensemble.calculators)}\\n\")\n", "\n", "# Display the file paths for the first three results\n", "for i, calc in enumerate(calc_ensemble.calculators[:3]):\n", " print(f\"[Index {i}] : {calc.path_traj}\")" ] }, { "cell_type": "markdown", "id": "16e4ca76", "metadata": {}, "source": [ "You can now view the calculated hopping parameters for each trajectory by calling the `.summary()` method." ] }, { "cell_type": "code", "execution_count": 5, "id": "4ad64cfb", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "====================================================================\n", "Summary for Trajectory dataset\n", " - Path to TRAJ bundle : TRAJ_TiO2 (depth=2)\n", " - Lattice structure : POSCAR_TiO2\n", " - t_interval : 0.074 ps (37 frames)\n", " - Temperatures (K) : [1700.0, 1800.0, 1900.0, 2000.0, 2100.0]\n", " - Num. 
of TRAJ files : [20, 20, 20, 20, 20]\n", "====================================================================\n", "\n", "==================== Temperature-Dependent Data ====================\n", "T (K) D (m2/s) D_rand (m2/s) f tau (ps) a (Ang)\n", "------- ---------- --------------- ------ ---------- ---------\n", "1700 4.176e-10 6.413e-10 0.6511 19.092 2.7104\n", "1800 6.097e-10 9.179e-10 0.6642 13.3759 2.7142\n", "1900 8.228e-10 1.249e-09 0.6589 9.938 2.7287\n", "2000 1.099e-09 1.663e-09 0.661 7.4301 2.7227\n", "2100 1.48e-09 2.262e-09 0.6543 5.4597 2.7224\n", "====================================================================\n", "\n", "===================== Final Fitted Parameters ======================\n", "Diffusivity (D):\n", " - Ea : 0.961 eV\n", " - D0 : 2.944e-07 m^2/s\n", " - R-squared : 0.9991\n", "Random Walk Diffusivity (D_rand):\n", " - Ea : 0.958 eV\n", " - D0 : 4.413e-07 m^2/s\n", " - R-squared : 0.9988\n", "Correlation Factor (f):\n", " - Ea : 0.002 eV\n", " - f0 : 0.667\n", " - R-squared : 0.0218\n", "Residence Time (tau):\n", " - Ea (fixed) : 0.958 eV\n", " - tau0 : 2.777e-02 ps\n", " - R-squared : 0.9987\n", "====================================================================\n" ] } ], "source": [ "calc_ensemble.summary()" ] }, { "cell_type": "markdown", "id": "f88f15d6", "metadata": {}, "source": [ "### - Plotting the Hopping Parameters\n", "\n", "`VacHopPy` includes a suite of convenient methods to visualize the calculated hopping parameters as a function of temperature. \n", "\n", "The following table summarizes the available plotting functions:\n", "\n", "