{ "cells": [ { "cell_type": "markdown", "id": "23347dcb", "metadata": {}, "source": [ "# Hopping Parameter Extraction\n", "\n", "You can download source files to follow this tutorial from this [link](https://drive.google.com/file/d/1_gMT74f_1PqxQ8Um1-y_i11Y2tKg7-3f/view?usp=drivesdk) (26 GB).\n", "\n", "```{note}\n", "This is the same example set used in the **Hopping Parameter Extraction** or **Mean Square Displacement** section in `CLI Tutorial` documentation. If you have already completed that tutorial, you do not need to download the files again.\n", "```\n", "\n", "---\n", "\n", "Calculating effective hopping parameters relies on the `Calculator()` factory function and the `Site` class. For a detailed explanation of these core components, please refer to the previous tutorial.\n", "\n", "First, navigate to the `Example3` directory you downloaded. Inside, you will find three items: the `TRAJ_TiO2` directory, a `POSCAR_TiO2` file, and a `neb.csv` file.\n", "\n", "The `TRAJ_TiO2` directory contains a thermal ensemble of HDF5 trajectory files from MD simulations performed at various temperatures. This example includes simulations from 1700 K to 2100 K, with 20 independent runs at each temperature.\n", "\n", "The file structure is as follows:\n", "\n", "```bash\n", "TRAJ_TiO2/\n", "├── TRAJ_1700K/\n", "│ ├── TRAJ_O_01.h5\n", "│ ├── ...\n", "│ └── TRAJ_O_20.h5\n", "├── TRAJ_1800K/\n", "│ ├── TRAJ_O_01.h5\n", "│ ├── ...\n", "│ └── TRAJ_O_20.h5\n", "├── TRAJ_1900K/\n", "│ ├── TRAJ_O_01.h5\n", "│ ├── ...\n", "│ └── TRAJ_O_20.h5\n", "├── TRAJ_2000K/\n", "│ ├── TRAJ_O_01.h5\n", "│ ├── ...\n", "│ └── TRAJ_O_20.h5\n", "└── TRAJ_2100K/\n", " ├── TRAJ_O_01.h5\n", " ├── ...\n", " └── TRAJ_O_20.h5\n", " ```\n", "\n", "A key advantage of `VacHopPy` is its ability to process a large number of trajectories from multiple NVT ensembles simultaneously and in a memory-efficient way." ] }, { "cell_type": "code", "execution_count": 1, "id": "96c12618", "metadata": {}, "outputs": [], "source": [ "import os\n", "import numpy as np\n", "import pandas as pd\n", "from vachoppy.core import Site, Calculator\n", "\n", "path_traj = 'TRAJ_TiO2'\n", "path_structure = 'POSCAR_TiO2'\n", "if not os.path.exists(path_traj): print(f\"{path_traj} not found.\")\n", "if not os.path.exists(path_structure): print(f\"{path_structure} not found.\")" ] }, { "cell_type": "markdown", "id": "ff28ce42", "metadata": {}, "source": [ "---\n", "## Usage Example\n", "\n", "### - Creating the `Site` and `CalculatorEnsemble` Instances\n", "\n", "The first step in our analysis is to define the crystal's structural framework by creating a `Site` instance. This object analyzes a **perfect, vacancy-free supercell** to identify all **lattice sites** and **hopping paths**.\n", "\n", "Next, we pass this `Site` object to the `Calculator()` factory function. This function automatically processes all trajectories found in `path_traj`. A key feature is its ability to automatically determine an optimal `t_interval`." ] }, { "cell_type": "code", "execution_count": 2, "id": "8a70cf94", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "====================================================================\n", " Automatic t_interval Estimation\n", "====================================================================\n", " [1700.0 K] Estimating from TRAJ_1700K/TRAJ_O_01.h5\n", " -> t_interval : 0.075 ps\n", " [1800.0 K] Estimating from TRAJ_1800K/TRAJ_O_01.h5\n", " -> t_interval : 0.075 ps\n", " [1900.0 K] Estimating from TRAJ_1900K/TRAJ_O_01.h5\n", " -> t_interval : 0.075 ps\n", " [2000.0 K] Estimating from TRAJ_2000K/TRAJ_O_01.h5\n", " -> t_interval : 0.075 ps\n", " [2100.0 K] Estimating from TRAJ_2100K/TRAJ_O_01.h5\n", " -> t_interval : 0.075 ps\n", "====================================================================\n", " Adjusting t_interval to the nearest multiple of dt\n", "====================================================================\n", " - dt : 0.0020 ps\n", " - Original t_interval : 0.0750 ps\n", " - Adjusted t_interval : 0.0740 ps (37 frames)\n", "====================================================================\n" ] } ], "source": [ "site = Site(path_structure, 'O')\n", "calc_ensemble = Calculator(path_traj, site)" ] }, { "cell_type": "markdown", "id": "ecd8b508", "metadata": {}, "source": [ "When `path_traj` points to a directory, `Calculator()` automatically discovers HDF5 files within it. By default, this search extends down to two subdirectory levels (a depth of 2). You can control this search depth using the `depth` argument if your file structure is different.\n", "\n", "---\n", "### - Calculating Hopping Parameters\n", "\n", "Once the `CalculatorEnsemble` is configured, you can initiate the main analysis by calling the `.calculate()` method.\n", "\n", "This method is highly optimized for performance and is designed to handle very large datasets.\n", "\n", "* **Memory Efficiency**\n", " \n", " It processes trajectories in a **streaming** fashion, loading data in small chunks to minimize RAM usage.\n", "\n", "* **Speed** \n", "\n", " It leverages **parallel processing** to perform computations on multiple CPU cores simultaneously, significantly speeding up the analysis. You can control the number of CPU cores used for this parallel computation with the `n_jobs` argument. Setting `n_jobs=-1` will use all available cores." ] }, { "cell_type": "code", "execution_count": 3, "id": "209c110d", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "317674348d6241e0b2b7bf9f0e5ec249", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Analyze Trajectory: 0%| | 0/100 [00:00\n", "\n", "| Method | Description |\n", "| :---: | :--- |\n", "| **`plot_D()`** | Generates an Arrhenius plot of the calculated **diffusivity ($D$)**. |\n", "| **`plot_D_rand()`** | Creates an Arrhenius plot for the **random walk diffusivity ($D_{rand}$)**. |\n", "| **`plot_f()`** | Plots the **correlation factor ($f$)** as a function of temperature. |\n", "| **`plot_tau()`** | Visualizes the **vacancy residence time ($\\tau$)** across different temperatures. |\n", "| **`plot_a()`** | Shows the average **hopping distance ($a$)** as a function of temperature. |\n", "| **`plot_counts()`** | Displays the total **count** for each unique type of hopping path. |\n", "\n", "" ] }, { "cell_type": "code", "execution_count": null, "id": "e3ef6178", "metadata": {}, "outputs": [], "source": [ "calc_ensemble.plot_D()" ] }, { "cell_type": "markdown", "id": "08ff16fb", "metadata": {}, "source": [ "Example output:\n", "\n", "
\n", "\n", " \"Image\"\n", "
\n" ] }, { "cell_type": "markdown", "id": "a69a705e", "metadata": {}, "source": [ "#### - Saving the Hopping Parameters\n", "\n", "Once your analysis is complete, you can save all the calculated effective hopping parameters to a file using the `.save_parameters()` method. This command exports the results into a `parameters.json` file, making it easy to reload or analyze them later without re-running the entire calculation." ] }, { "cell_type": "code", "execution_count": 7, "id": "4ba806e1", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Parameters saved to 'parameters.json'\n" ] } ], "source": [ "calc_ensemble.save_parameters()" ] }, { "cell_type": "markdown", "id": "c6273fc6", "metadata": {}, "source": [ "### - Calculating the Attempt Frequency\n", "\n", "To calculate the **effective attempt frequency** ($\\nu$) and the **effective coordination number** ($z$), `VacHopPy` requires the pre-calculated energy barriers for each unique type of hopping path.\n", "\n", "This information must be provided in a CSV file. Each row in this file should correspond to a specific hopping path and contain its calculated hopping barrier, which is typically obtained from methods like the nudged elastic band (NEB).\n", "\n", "Below is an example of the CSV file:" ] }, { "cell_type": "code", "execution_count": 8, "id": "5e789932", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " A1 0.8698\n", "0 A2 1.058\n", "1 A3 1.766\n" ] } ], "source": [ "neb_csv = 'neb_TiO2.csv'\n", "if not os.path.exists(neb_csv):\n", " print(f\"'{neb_csv}' not found.\")\n", "else:\n", " data = pd.read_csv(neb_csv)\n", " print(data)" ] }, { "cell_type": "markdown", "id": "e0f62c74", "metadata": {}, "source": [ "This calculation is performed by calling the `.calculate_attempt_frequency()` method. Simply provide the path to the CSV file containing the hopping barriers as an argument.\n", "\n", "Upon completion, the method will print a summary of the calculation. Additionally, if a `parameters.json` file from a previous run exists, this method will automatically update it by adding the newly calculated attempt frequency and coordination number.\n" ] }, { "cell_type": "code", "execution_count": 9, "id": "23b87a1d", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Parameters saved to 'parameters.json'\n", "====================================================================\n", " Attempt Frequency Analysis Summary\n", "====================================================================\n", "\n", "-- Temperature-Dependent Effective Parameters --\n", "\n", "Temp (K) nu (THz) z\n", "---------- ---------- ------\n", "1700.0 6.1675 5.8920\n", "1800.0 6.0200 5.9904\n", "1900.0 5.7611 6.0862\n", "2000.0 5.6642 6.1788\n", "2100.0 5.8308 6.2682\n", "-------- -------- -\n", "Mean 5.8887 6.0831\n", "\n", "-- Path-Wise Parameters (per Temperature) --\n", "\n", "Temperature: 1700.0 K\n", "Path Name Hop Count nu_path (THz)\n", "----------- ----------- ---------------\n", "A1 167 7.8923\n", "A2 252 5.3794\n", "A3 1 10.7228\n", "----------------------------------------\n", "Temperature: 1800.0 K\n", "Path Name Hop Count nu_path (THz)\n", "----------- ----------- ---------------\n", "A1 171 7.7409\n", "A2 279 5.3119\n", "A3 0 0\n", "----------------------------------------\n", "Temperature: 1900.0 K\n", "Path Name Hop Count nu_path (THz)\n", "----------- ----------- ---------------\n", "A1 161 6.5072\n", "A2 344 5.4858\n", "A3 0 0\n", "----------------------------------------\n", "Temperature: 2000.0 K\n", "Path Name Hop Count nu_path (THz)\n", "----------- ----------- ---------------\n", "A1 187 7.2352\n", "A2 353 5.0879\n", "A3 1 3.5067\n", "----------------------------------------\n", "Temperature: 2100.0 K\n", "Path Name Hop Count nu_path (THz)\n", "----------- ----------- ---------------\n", "A1 191 7.7369\n", "A2 362 5.1858\n", "A3 0 0\n", "----------------------------------------\n", "====================================================================\n", "NOTE: nu_paths can be unreliable for paths with low hop counts,\n", " as they are sensitive to statistical noise.\n", "====================================================================\n", "\n" ] } ], "source": [ "calc_ensemble.calculate_attempt_frequency(neb_csv)" ] }, { "cell_type": "markdown", "id": "61585cfe", "metadata": {}, "source": [ "After running the `.calculate_attempt_frequency()` method, you can visualize its primary results using two newly available plotting functions:\n", "\n", "
\n", "\n", "| Method | Description |\n", "| :-------------: | :----------------------------------------------------------------- |\n", "| **`plot_nu()`** | Plots the **attempt frequency ($\\nu$)** as a function of temperature. |\n", "| **`plot_z()`** | Plots the **coordination number ($z$)** as a function of temperature. |\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": null, "id": "e44a7291", "metadata": {}, "outputs": [], "source": [ "calc_ensemble.plot_nu()" ] }, { "cell_type": "markdown", "id": "d1b53aa2", "metadata": {}, "source": [ "Example output:\n", "\n", "
\n", "\n", " \"Image\"\n", "\n", "
\n" ] }, { "cell_type": "markdown", "id": "13d24d6b", "metadata": {}, "source": [ "---\n", "\n", "## List of Extractable Parameters\n", "\n", "The specific effective hopping parameters that `VacHopPy` can calculate depend entirely on the data you provide. There are two factors:\n", "\n", "* **Number of Temperatures**\n", "\n", " Do your HDF5 files represent a single temperature or a range of temperatures?\n", "\n", "* **NEB_CSV** \n", "\n", " Have you provided a `NEB_CSV` file with pre-calculated hopping barriers?\n", "\n", "The following table summarizes what can be calculated under different conditions. (**O** = Calculable, **X** = Not Calculable).\n", "\n", "
\n", "\n", "| Parameter | Symbol | Single T | Multiple T | `NEB_CSV` Required? |\n", "| :--- | :---: | :---: | :---: | :---: |\n", "| Diffusivity | $D$ | O | O | No |\n", "| Residence Time | $\\tau$ | O | O | No |\n", "| Correlation Factor | $f$ | O | O | No |\n", "| Hopping Barrier | $E_a$ | X | O | No |\n", "| Hopping Distance | $a$ | O | O | No |\n", "| Coordination Number | $z$ | X | O | **Yes** |\n", "| Attempt Frequency | $\\nu$ | O | O | **Yes** |\n", "\n", "
\n", "\n", "To demonstrate this, we can run the analysis on a subset of the data corresponding to a single temperature (2100 K).\n" ] }, { "cell_type": "code", "execution_count": 11, "id": "7d7e4a35", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "====================================================================\n", " Automatic t_interval Estimation\n", "====================================================================\n", " [2100.0 K] Estimating from TRAJ_2100K/TRAJ_O_01.h5\n", " -> t_interval : 0.075 ps\n", "====================================================================\n", " Adjusting t_interval to the nearest multiple of dt\n", "====================================================================\n", " - dt : 0.0020 ps\n", " - Original t_interval : 0.0751 ps\n", " - Adjusted t_interval : 0.0760 ps (38 frames)\n", "====================================================================\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "5be2f0dbc5034486be39629701a84824", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Analyze Trajectory: 0%| | 0/20 [00:00\n", "\n", " \"Image\"\n", "\n", "" ] }, { "cell_type": "markdown", "id": "ad50b088", "metadata": {}, "source": [ "---\n", "### Decomposing Vacancy Diffusivity\n", "\n", "You can decompose the total diffusivity ($D$) into its x, y, and z components ($D_x$, $D_y$, $D_z$) at each temperature using the `decompose_diffusivity()` method. This functionality is particularly useful for analyzing **anisotropic systems** where diffusion varies by direction, allowing you to calculate the specific diffusion barrier ($E_a$) for each axis.\n", "\n", "Note that obtaining statistically reliable directional results may require more simulation data than calculating the total diffusivity alone." ] }, { "cell_type": "code", "execution_count": 19, "id": "2c3ac629", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "=========== Temperature-Dependent Decomposed Diffusivity ===========\n", "T (K) Dx (m2/s) Dy (m2/s) Dz (m2/s)\n", "------- ----------- ----------- -----------\n", "1700 4.729e-10 5.324e-10 2.475e-10\n", "1800 6.673e-10 7.123e-10 4.496e-10\n", "1900 1.144e-09 6.331e-10 6.915e-10\n", "2000 1.25e-09 1.497e-09 5.507e-10\n", "2100 2.069e-09 1.525e-09 8.47e-10\n", "====================================================================\n", "\n", "=================== Decomposed Diffusivity Fits ====================\n", "Directional Diffusivity (Dx):\n", " - Ea : 1.101 eV\n", " - D0 : 8.523e-07 m^2/s\n", " - R-squared : 0.9710\n", "Directional Diffusivity (Dy):\n", " - Ea : 0.867 eV\n", " - D0 : 1.818e-07 m^2/s\n", " - R-squared : 0.8022\n", "Directional Diffusivity (Dz):\n", " - Ea : 0.834 eV\n", " - D0 : 8.645e-08 m^2/s\n", " - R-squared : 0.8231\n", "====================================================================\n" ] } ], "source": [ "calc_ensemble.decompose_diffusivity()" ] }, { "cell_type": "markdown", "id": "73f3911e", "metadata": {}, "source": [ "The `plot_msd_xyz()` and `plot_D_xyz()` methods can be used to obtain component-wise Mean Squared Displacement (MSD) and Arrhenius plots for diffusivity, respectively." ] }, { "cell_type": "code", "execution_count": null, "id": "05f0784a", "metadata": {}, "outputs": [], "source": [ "calc_ensemble.plot_msd_xyz()" ] }, { "cell_type": "markdown", "id": "66e1e85b", "metadata": {}, "source": [ "Example output:\n", "\n", "
\n", "\n", " \"Image\"\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": null, "id": "4387b6b0", "metadata": {}, "outputs": [], "source": [ "calc_ensemble.plot_D_xyz()" ] }, { "cell_type": "markdown", "id": "8d9337eb", "metadata": {}, "source": [ "Example output:\n", "\n", "
\n", "\n", " \"Image\"\n", "\n", "
" ] } ], "metadata": { "kernelspec": { "display_name": "vhp_ver3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.18" } }, "nbformat": 4, "nbformat_minor": 5 }