Getting Started
Installation
You can install VacHopPy using pip:
pip install vachoppy
Alternatively, you can install the latest development version from the VacHopPy GitHub repository:
git clone git@github.com:TY-Jeong/VacHopPy.git
cd VacHopPy
pip install -e .
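To confirm that either route worked, you can ask Python's packaging metadata for the installed version. The sketch below is not part of VacHopPy: the helper name installed_version is illustrative, and it assumes the distribution is registered under the name vachoppy, as the pip command above suggests.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# "vachoppy" is assumed to be the distribution name used by pip.
found = installed_version("vachoppy")
if found is None:
    print("vachoppy is not installed in this environment")
else:
    print(f"vachoppy {found} is installed")
```

Editable installs register the same metadata, so the check should behave the same way for both installation methods.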
Running Unit Tests
After installing VacHopPy in editable mode (pip install -e .) from a cloned repository, you can run the built-in unit tests to verify the installation and core functionality. The tests use the pytest framework.
Running the tests requires pytest. If you don’t have it installed in your environment, you can install it using pip:
pip install pytest
Make sure you are in the main VacHopPy directory (the one containing pyproject.toml). Execute the tests using the following command:
pytest
The first time you run this, the necessary test data files will be downloaded automatically if they are not already present.
For more detailed output, add the -v flag:
pytest -v
A successful run will show a summary indicating that all tests have passed. This confirms that the core components of VacHopPy are functioning correctly in your environment.
Input File Preparation
VacHopPy does not directly read raw MD trajectory files like vasprun.xml for its main analysis. Instead, it uses the HDF5 format as its primary input.
HDF5 allows for highly efficient, streaming-based data access. As a result, VacHopPy can process massive trajectory datasets quickly while consuming minimal RAM (typically only a few gigabytes), even for simulations that are hundreds of gigabytes in size.
Before running an analysis, you must first convert your MD trajectory into this format.
Warning
The input MD trajectory file must contain both position and force information. VacHopPy uses force data to accurately determine site occupations.
Converting Your Trajectory to HDF5
You can convert your files using either the command-line tool or a Python script.
Via the Command-Line Interface (CLI)
The most straightforward method is the built-in convert command.
vachoppy convert vasprun_TiO2.xml 2000.0 1.0 --label 2000K
The arguments are as follows:
vasprun_TiO2.xml: The path to your source MD trajectory file.
2000.0: The simulation temperature in Kelvin.
1.0: The time step (dt) between frames in femtoseconds.
2000K: An optional suffix to append to the output filenames.
For efficient storage, the converter automatically splits the trajectory by chemical species. The command above generates two separate files: TRAJ_Ti_2000K.h5 and TRAJ_O_2000K.h5. The conversion is powered by the Atomic Simulation Environment (ASE), which makes it compatible with the wide range of file formats that ASE supports.
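The naming pattern in the example, TRAJ_<symbol>_<label>.h5, can be used to check that a conversion produced one file per species. The following is a minimal sketch built only on that pattern. The helper names expected_outputs and missing_outputs are hypothetical and not part of VacHopPy, and the sketch assumes a label was supplied.

```python
from pathlib import Path


def expected_outputs(species, label, directory="."):
    """Build the per-species paths implied by the TRAJ_<symbol>_<label>.h5 pattern."""
    return [Path(directory) / f"TRAJ_{symbol}_{label}.h5" for symbol in species]


def missing_outputs(species, label, directory="."):
    """Return the expected paths that do not exist as files on disk."""
    return [p for p in expected_outputs(species, label, directory) if not p.is_file()]


# Species and label taken from the TiO2 example above.
paths = expected_outputs(["Ti", "O"], "2000K")
print([p.name for p in paths])

for path in missing_outputs(["Ti", "O"], "2000K"):
    print(f"not found: {path}")
```

Running this in the directory where the converter wrote its output lists any per-species file that is absent.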
Via a Python Script
The CLI command is a convenient wrapper for the parse_md function. You can achieve the same result within a Python script:
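from vachoppy.core import parse_md

parse_md(
    filename='vasprun_TiO2.xml',  # path to the source MD trajectory file
    temperature=2000.0,           # simulation temperature in Kelvin
    dt=1.0,                       # time step between frames in femtoseconds
    label='2000K',                # optional suffix for the output filenames
)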
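Calling parse_md from Python is convenient when several trajectories need converting, for example one per temperature. The sketch below builds the keyword arguments shown above for each run. The helpers conversion_jobs and convert_all, the vasprun_<label>.xml naming scheme, and the temperature list are all illustrative assumptions, not part of VacHopPy.

```python
def conversion_jobs(temperatures, dt=1.0, pattern="vasprun_{label}.xml"):
    """Build one set of parse_md keyword arguments per temperature.

    The filename pattern is a hypothetical naming scheme; adapt it to your files.
    """
    jobs = []
    for temperature in temperatures:
        label = f"{int(temperature)}K"
        jobs.append(
            {
                "filename": pattern.format(label=label),
                "temperature": float(temperature),
                "dt": dt,
                "label": label,
            }
        )
    return jobs


def convert_all(jobs):
    """Run parse_md once per job; requires VacHopPy to be installed."""
    # Imported lazily so the job list can be inspected without VacHopPy.
    from vachoppy.core import parse_md

    for job in jobs:
        parse_md(**job)


jobs = conversion_jobs([1800, 1900, 2000])
for job in jobs:
    print(job)
# convert_all(jobs)  # uncomment once the trajectory files are in place
```

Printing the jobs first makes it easy to check the filenames and labels before starting a long conversion.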
Note
For the lammps-dump-text format, VacHopPy automatically uses MDAnalysis as a backend instead of ASE. For more details and advanced options, consult the help message with vachoppy convert -h or refer to the parse_lammps module in the API documentation.