# Getting Started

## Installation

You can install `VacHopPy` using pip:

```bash
pip install vachoppy
```

Alternatively, you can install the latest development version from [VacHopPy GitHub](https://github.com/TY-Jeong/VacHopPy):

```bash
git clone git@github.com:TY-Jeong/VacHopPy.git
cd VacHopPy
pip install -e .
```

---

## Running Unit Tests

After installing `VacHopPy` in editable mode (`pip install -e .`) from a cloned repository, you can run the built-in unit tests to verify the installation and core functionality.

The tests use the `pytest` framework. If you don't have it installed in your environment, you can install it using pip:

```bash
pip install pytest
```

Make sure you are in the main `VacHopPy` directory (the one containing `pyproject.toml`), then execute the tests using the following command:

```bash
pytest
```

Or, for more detailed output:

```bash
pytest -v
```

The first time you run the tests, the necessary test data files will be downloaded automatically if they are not already present.

A successful run will show a summary indicating that all tests have **passed**. This confirms that the core components of `VacHopPy` are functioning correctly in your environment.

---

## Input File Preparation

`VacHopPy` does not directly read raw MD trajectory files like `vasprun.xml` for its main analysis. Instead, it uses the **HDF5 format** as its primary input.

HDF5 allows for highly efficient, streaming-based data access. As a result, `VacHopPy` can process massive trajectory datasets quickly while consuming minimal RAM (typically only a few gigabytes), even for simulations that are hundreds of gigabytes in size.

Before running an analysis, you must first convert your MD trajectory into this format.

````{warning}
The input MD trajectory file must contain both **position** and **force** information. `VacHopPy` uses force data to accurately determine site occupations.
````

### Converting Your Trajectory to HDF5

You can convert your files using either a simple command-line tool or a Python script.

#### Via the Command-Line Interface (CLI)

The most straightforward method is the built-in `convert` command.

```bash
vachoppy convert vasprun_TiO2.xml 2000.0 1.0 --label 2000K
```

The arguments are as follows:

* `vasprun_TiO2.xml`: The path to your source MD trajectory file.
* `2000.0`: The simulation temperature in Kelvin.
* `1.0`: The time step (dt) between frames in femtoseconds.
* `--label 2000K`: An optional suffix to append to the output filenames.

For efficient storage, the converter automatically splits the trajectory by chemical species. The command above would generate two separate files: `TRAJ_Ti_2000K.h5` and `TRAJ_O_2000K.h5`.

This conversion process is powered by the Atomic Simulation Environment (**ASE**), enabling compatibility with a wide range of [**file formats supported by ASE**](https://ase-lib.org/ase/io/io.html).

#### Via a Python Script

The CLI command is a convenient wrapper for the `parse_md` function. You can achieve the same result within a Python script:

```python
from vachoppy.core import parse_md

# Equivalent to: vachoppy convert vasprun_TiO2.xml 2000.0 1.0 --label 2000K
parse_md(
    filename='vasprun_TiO2.xml',  # source MD trajectory file
    temperature=2000.0,           # simulation temperature in Kelvin
    dt=1.0,                       # time step between frames in femtoseconds
    label='2000K'                 # optional suffix for the output filenames
)
```

````{note}
For the **lammps-dump-text** format, `VacHopPy` automatically uses [**MDAnalysis**](https://www.mdanalysis.org) as a backend instead of ASE. For more details and advanced options, consult the help message with `vachoppy convert -h` or refer to the `parse_lammps` module in the API documentation.
````
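
If you have several trajectories to convert (for example, one per temperature), it can help to prepare the `vachoppy convert` calls in a small script. The following is only a minimal sketch: the helper `build_convert_command` and the file names in `runs` are hypothetical and not part of `VacHopPy`, and the command layout simply mirrors the CLI example shown above. The script prints the commands without executing them.

```python
import shlex


def build_convert_command(trajectory, temperature, dt, label=None):
    """Return the argument list for one `vachoppy convert` call.

    Hypothetical helper for illustration. It follows the form
    `vachoppy convert <file> <temperature> <dt> [--label <label>]`.
    """
    cmd = [
        "vachoppy",
        "convert",
        str(trajectory),
        str(float(temperature)),  # temperature in Kelvin
        str(float(dt)),           # time step in femtoseconds
    ]
    if label is not None:
        cmd += ["--label", str(label)]
    return cmd


# Hypothetical input files: (trajectory path, temperature in K, label)
runs = [
    ("vasprun_1800K.xml", 1800.0, "1800K"),
    ("vasprun_2000K.xml", 2000.0, "2000K"),
]

# All runs are assumed to share the same 1.0 fs time step.
commands = [
    build_convert_command(path, temperature, dt=1.0, label=label)
    for path, temperature, label in runs
]

# Print the commands so they can be reviewed before running them.
for cmd in commands:
    print(shlex.join(cmd))
```

Once the printed commands look right, each argument list can be passed to `subprocess.run(cmd, check=True)` from the standard library, or the lines can be pasted into a shell.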