Vacancy Trajectory Analysis
You can download source files to follow this tutorial from this link (552 MB).
Analyzing vacancy hopping in VacHopPy begins with the Calculator() factory function and the Site class.
The Calculator() Factory Function
The Calculator() function is the main entry point for VacHopPy’s analysis pipeline. It gathers all necessary inputs, performs initial setup, and returns a fully configured CalculatorEnsemble object, ready for analysis.
To get started, you call this function as follows:
calc_ensemble = Calculator(path_traj, site, t_interval)
Key Arguments
path_traj(str)Path to the trajectory data. This can be either a single HDF5 trajectory file or a directory containing a bundle of HDF5 files.
VacHopPywill automatically discover and process all valid files in the specified directory.site(Site)An instance of the
Siteclass.t_interval(float, Optional)The time interval in picoseconds (ps) used for averaging atomic positions to determine site occupation. To distinguish true hopping events from thermal vibrations,
VacHopPyaverages the atomic coordinates and forces over eacht_interval. This averaged quantities are then used to assign an atom to a specific lattice site. Each interval corresponds to a single analysis step. If this argument is not provided,Calculator()will automatically search for an optimalt_intervalfor the analysis.
Return Value
Calling the Calculator() function returns an instance of the CalculatorEnsemble class, which holds all the processed data and methods required for subsequent analysis steps.
The Site Class
The Site class is used to define the structural backbone of your material. It identifies all lattice sites from a vacancy-free reference structure and determines the possible hopping paths between them.
You can create a Site object like this:
site = Site(path_structure, symbol)
Key Arguments
path_structure(str)Path to a structure file of the perfect, vacancy-free material. Any format supported by the Atomic Simulation Environment (ASE) is compatible.
symbol(str)The chemical symbol of the diffusing species.
Usage Example
Navigate into the Example6/ directory you downloaded. In this directory, you will find two files:
TRAJ_O.h5This is the HDF5 file containing the MD trajectory data. The example system is rutile TiO₂ containing two oxygen vacancies, simulated at 2100 K.
POSCAR_TiO2This file contains the crystal structure of the perfect, vacancy-free rutile rutile TiO₂ supercell.
import os
import numpy as np
from vachoppy.utils import show_traj
from vachoppy.core import Site, Calculator
path_traj, path_structure = 'TRAJ_O.h5', 'POSCAR_TiO2'
if not os.path.exists(path_traj): print(f"{path_traj} not found.")
if not os.path.exists(path_structure): print(f"{path_structure} not found.")
1. How to inspect the HDF5 file
You can inspect the contents of the HDF5 trajectory file using the show_traj method:
show_traj(path_traj)
==================================================
Trajectory File: TRAJ_O.h5
==================================================
[Simulation Parameters]
- Atomic Symbol: O
- Number of Frames: 300000
- Temperature: 2100.0 K
- Time Step: 2.0 fs
[Composition]
- Counts: O: 46, Ti: 24
- Total Atoms: 70
[Lattice Vectors (Ang)]
[ 9.29133, 0.00000, 0.00000]
[ 0.00000, 9.29133, 0.00000]
[ 0.00000, 0.00000, 8.90126]
[Stored Datasets]
- positions: Shape = (300000, 46, 3)
- forces: Shape = (300000, 46, 3)
==================================================
From the output, the MD run was performed at 2100 K for 600 ps (300,000 iterations × 2.0 fs). The composition shows the system contains two oxygen vacancies.
2. Creating a Site Instance
site = Site(path_structure, 'O')
You can inspect the generated lattice information by calling the .summary() method.
VacHopPy automatically identifies all possible hopping paths by employing the Voronoi tessellation technique. The maximum distance for this search can be adjusted using the rmax parameter, which defaults to 3.25 Å.
site.summary()
====================================================================================================
Structure File: POSCAR_TiO2
====================================================================================================
[Structure Information]
- Structure Composition : Ti24 O48
- Lattice Vectors (Ang) :
[ 9.29133, 0.00000, 0.00000]
[ 0.00000, 9.29133, 0.00000]
[ 0.00000, 0.00000, 8.90126]
[Hopping Path Information]
- Diffusing Symbol : O
- Inequivalent Sites : 1 found
- Inequivalent Paths : 3 found (with Rmax = 3.25 Å)
Name Init Site Final Site a (Å) z Initial Coord (Frac) Final Coord (Frac)
------ ----------- ------------ ------- --- ------------------------ -------------------------
A1 site1 site1 2.5625 1 [0.0975, 0.4025, 0.1667] [-0.0975, 0.5975, 0.1667]
A2 site1 site1 2.8031 8 [0.0975, 0.4025, 0.1667] [0.3475, 0.3475, 0.0000]
A3 site1 site1 2.9671 2 [0.0975, 0.4025, 0.1667] [0.0975, 0.4025, -0.1667]
====================================================================================================
3. Creating the CalculatorEnsemble Instance
While you can create a CalculatorEnsemble instance by calling its constructor directly, the recommended approach is to use the Calculator() factory function. This helper function offers a significant advantage: it can automatically search for an optimal t_interval if one is not provided, streamlining your analysis setup
calc_ensemble = Calculator(path_traj, site)
====================================================================
Automatic t_interval Estimation
====================================================================
Estimating from TRAJ_O.h5
-> t_interval : 0.076 ps
====================================================================
Adjusting t_interval to the nearest multiple of dt
====================================================================
- dt : 0.0020 ps
- Original t_interval : 0.0764 ps
- Adjusted t_interval : 0.0760 ps (38 frames)
====================================================================
Now that the calc_ensemble object is prepared, you can kick off the main analysis by calling the .calculate() method.
This powerful method handles the entire analysis pipeline for you, which automatically includes identifying the trajectory of each vacancy and calculating hopping parameters.
calc_ensemble.calculate()
Analysis complete: 1 successful, 0 failed.
Execution Time: 2.528 seconds
Peak RAM Usage: 0.043 GB
Although this tutorial uses a single HDF5 file, the CalculatorEnsemble class is specifically designed to process multiple trajectory files at once.
The results for each individual HDF5 file are stored in the .calculators attribute. This attribute is a list of CalculatorSingle objects, sorted first by temperature (ascending) and then alphabetically by file path.
You can verify which CalculatorSingle instance corresponds to which HDF5 file using the following code:
for i, calc in enumerate(calc_ensemble.calculators):
print(f"calculators[{i}] : {calc.path_traj}")
calculators[0] : /home/jty/Examples/Example6/TRAJ_O.h5
For the remainder of this tutorial, we will focus on the results from the first trajectory by selecting its corresponding object.
calc = calc_ensemble.calculators[0]
Key analysis parameters are now stored as attributes in the calc object and can be viewed as follows:
print(f"- Number of Vacancies : {calc.num_vacancies}")
print(f"- Time Step for the MD Simulation (dt) : {calc.dt} fs")
print(f"- Time interval for Average (t_interval) : {calc.t_interval} ps")
print(f"- Number of Frames in the MD Simulatoin : {calc.num_frames} frames")
print(f"- Number of Frames per Step : {int(calc.t_interval * 1000 / calc.dt)} frames")
print(f"- Number of Steps for Analysis : {calc.num_steps} steps")
- Number of Vacancies : 2
- Time Step for the MD Simulation (dt) : 2.0 fs
- Time interval for Averaging (t_interval) : 0.076 ps
- Number of Frames in the MD Simulatoin : 299972 frames
- Number of Frames per Step : 38 frames
- Number of Steps for Analysis : 7894 steps
You might notice that while the MD simulation contains 300,000 frames, only 299,972 are used in the analysis. This is a direct consequence of the time-averaging process controlled by t_interval.
VacHopPy groups the raw simulation frames into blocks, with each block corresponding to a single analysis step. The duration of each step is defined by t_interval.
In this example, with a t_interval of 0.076 ps (or 76 fs) and a simulation timestep (dt) of 2 fs, each analysis step is derived by averaging 76 fs / 2 fs = 38 frames.
Because the analysis requires complete steps, VacHopPy uses the largest number of frames that is perfectly divisible by 38. Any remaining frames at the end of the trajectory are discarded. Therefore, the total number of analysis steps is int(300000 / 38) = 7894, and the total frames used is 7894 * 38 = 299972.
4. Analyzing Vacancy Hopping Trajectories
Once the calculation is complete, you can access the detailed hopping trajectory of each vacancy through two key attributes. These attributes provide different but complementary views of the vacancy movement.
.hopping_history(list)A list where the length is equal to the number of vacancies (
num_vacancies). Each element in the list corresponds to a single vacancy and contains its complete hopping history..unwrapped_vacancy_trajectory_coord_cart(dict)A dictionary where each key is an analysis step (an integer) and the value contains the corresponding spatial coordinates. The coordinates are the PBC-unwrapped Cartesian positions (in Å) of all vacancies at that specific step.
- Usage of .hopping_history
Let’s examine the history of the first vacancy (calc.hopping_history[0]) to see this in action:
print(f"Number of Vacancies: {len(calc.hopping_history)}")
print(f"Number of Hopping Events for the First Vacancy: {len(calc.hopping_history[0])}")
Number of Vacancies: 2
Number of Hopping Events for the First Vacancy: 111
The raw dictionary output for a single event can be hard to read. We can easily format it for better clarity:
# Get the last hopping event of the first vacancy
last_event = calc.hopping_history[0][-1]
print("\n--- Details of the Last Hopping Event ---")
for key, value in last_event.items():
if 'coord' in key and isinstance(value, np.ndarray):
print(f"- {key:<12}: [{value[0]:.3f}, {value[1]:.3f}, {value[2]:.3f}]")
else:
print(f"- {key:<12}: {value}")
--- Details of the Last Hopping Event ---
- site_init : site1
- site_final : site1
- distance : 2.5624908680896756
- z : 1
- coord_init : [0.098, 0.402, 0.167]
- coord_final : [-0.098, 0.598, 0.167]
- name : A1
- step : 7825
- index_init : 24
- index_final : 36
- Usage of .unwrapped_vacancy_trajectory_coord_cart
To get the coordinates at the 100th step:
coords_at_step_100 = calc.unwrapped_vacancy_trajectory_coord_cart[100]
print(coords_at_step_100)
[[0.90597733 3.73968856 1.48354252]
[3.22881028 3.22881028 2.96708503]]
The result is a NumPy array where each row represents the [x, y, z] coordinates of a single vacancy at the specified step. The first row corresponds to the first vacancy, the second row to the second, and so on.
- Save the Vacancy Hopping Histories
You can save this unwrapped trajectory into the trajectory.json file using .save_trajectory() method.
calc.save_trajectory()
- Summary of Vacancy Hopping Histories
You can print the hopping history of each vacancy easily using .show_hopping_history():
calc.show_hopping_history()
====================================================================================================================
Hopping Sequence of Vacancy 0
====================================================================================================================
Num Time (ps) Path a (Ang) Initial Site (Fractional Coordinate) Final Site (Fractional Coordinate)
----- ----------- ------ --------- -------------------------------------- ------------------------------------
1 11.93 A2 2.80311 site1 [0.09751, 0.40249, 0.16667] site1 [0.15249, 0.15249, 0.33333]
2 13.6 A1 2.56249 site1 [0.15249, 0.15249, 0.33333] site1 [0.34751, 0.34751, 0.33333]
3 14.44 A1 2.56249 site1 [0.34751, 0.34751, 0.33333] site1 [0.15249, 0.15249, 0.33333]
... (rest of table) ...
====================================================================================================================
====================================================================================================================
Hopping Sequence of Vacancy 1
====================================================================================================================
Num Time (ps) Path a (Ang) Initial Site (Fractional Coordinate) Final Site (Fractional Coordinate)
----- ----------- ------ --------- -------------------------------------- ------------------------------------
1 7.3 A2 2.80311 site1 [0.40249, 0.59751, 0.50000] site1 [0.34751, 0.34751, 0.33333]
2 8.13 A1 2.56249 site1 [0.34751, 0.34751, 0.33333] site1 [0.15249, 0.15249, 0.33333]
3 8.36 A1 2.56249 site1 [0.15249, 0.15249, 0.33333] site1 [0.34751, 0.34751, 0.33333]
... (rest of table) ...
====================================================================================================================
4. Visualizing Vacancy Hopping Trajectories
VacHopPy provides three powerful methods for visualizing the vacancy hopping trajectories, each offering a unique perspective on the diffusion process.
plot_vacancy_trajectory()This method generates a static interactive 3D plot of the complete vacancy paths.
animate_occupation()This method creates a GIF animation showing the time evolution of site occupations. It visualizes how individual atoms move between lattice sites and, as a consequence, how vacancies propagate through the structure.
animate_vacancy_trajectory()This method generates an interactive HTML-based video of the vacancy trajectories.
- Usage of plot_vacancy_trajectory()
You can customize which trajectories are displayed using the following key arguments:
vacancy_indicesSpecifies which vacancies to visualize. You can pass a single index (e.g.,
0) or a list of indices (e.g.,[0, 2]).unwrapSet this argument to
Trueto plot the PBC-unwrapped trajectory.
Furthermore, you can adjust the range of steps using step_init and step_final arguments
calc.plot_vacancy_trajectory(vacancy_indices=[0, 1], unwrap=True)
Example output:
- Usage of animate_occupation()
You can customize the output GIF file using step_init, step_final, fps, dpi, and etc.
calc.animate_occupation(step_init=5400, step_final=5600, fps=2)
Merging 200 snapshots into a GIF...
Successfully created 'occupation_video.gif'.
Execution Time: 133.106 seconds
Peak RAM Usage: 0.019 GB
Example output:
The .animate_occupation() method visualizes the dynamics of site occupancy for both atoms and vacancies using a color-coded scheme.
Each arrow in the animation traces the movement of an atom of the corresponding color. To indicate the direction and recency of motion, these arrows gradually fade over time. The fading rate can be controlled with the update_alpha argument.
Color and Symbol Legend
Yellow Circles: Represent vacancies.
Other Colored Circles: Represent the atoms.
Orange Squares: Represent transient vacancies.
A transient vacancy is a temporary vacancy that was not present in the initial system configuration. It occurs when two or more atoms are assigned to the same lattice site, a phenomenon often associated with the kick-out mechanism.
For instance, the animation below illustrates a kick-out event observed in this system, where a transient vacancy (the orange square) can be seen appearing at step 5516.
The .transient_vacancy attribute is a dictionary mapping step numbers to a list of site indices where transient vacancies exist. For instance, calc.transient_vacancy[5516] returns the index of sites of transient vacancies at that step. This data is particularly useful for identifying diffusion mechanisms other than standard vacancy-mediated hopping.
- Usage of animate_vacancy_trajectory()
The animate_vacancy_trajectory() method brings the static 3D plot to life, generating a fully interactive HTML-based video. This animation includes playback controls and a timeline slider, making it an excellent tool for observing the dynamic movement of vacancies over time.
However, creating a video of the entire trajectory can be resource-intensive and result in very large file sizes. Therefore, it is highly recommended to focus on a specific range of interest using the step_init and step_final arguments.
calc.animate_vacancy_trajectory(vacancy_indices=[0, 1], step_init=5500, step_final=5525, unwrap=False)
Make Animation: 100%|##############################| 26/26 [00:00<00:00, 77.40it/s]
'trajectory_video.html' created.
Execution Time: 0.626 seconds
Peak RAM Usage: 0.035 GB
Example output: