Migrate3D was developed in Python and is compatible with Python versions >= 3.9; its GUI is built with the dearpygui library. Input data must be in comma-separated value (*.csv) format, and each row must include a unique cell identifier (ID), X/Y/Z coordinates, and the time. In other words, each row must represent the position of one cell at one point in time. Migrate3D has five major functionalities: data formatting, step-based calculations, cell-based summary statistics, contact detection, and principal component analysis (PCA). Several variables are adjustable to accommodate different types of motility or different research questions. Upon completion of a run, Migrate3D returns an Excel workbook (*.xlsx) with multiple worksheets displaying the results.
2.2. Data Formatting:
All data formatting functions leave the original input dataset intact and export the reformatted dataset as a new *.csv. While complete, uninterrupted tracks are ideal, Migrate3D can interpolate missing data if needed, as long as the different segments of the track belong to the same unique cell ID. Similarly, if a cell is tracked more than once, so that a single timepoint has multiple X/Y/Z coordinates (usually the result of segmentation errors), Migrate3D will average them into one position for that timepoint, provided the duplicate segments are assigned to the same cell ID (usually achieved by manual curation of the data). Migrate3D can also handle two-dimensional data, and it provides the option to convert three-dimensional data into two-dimensional data by ignoring the Z position.
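The two formatting steps described above can be illustrated with a minimal sketch. It is not Migrate3D's own code: the function name `format_tracks`, the tuple layout of the rows, the fixed sampling interval `dt`, and the choice of linear interpolation are all assumptions made for illustration.

```python
from collections import defaultdict


def format_tracks(rows, dt):
    """Average duplicate detections and fill gaps in each track.

    rows: iterable of (cell_id, time, x, y, z) tuples (assumed layout).
    dt:   expected interval between timepoints (assumed constant).
    Returns {cell_id: [(time, x, y, z), ...]} sorted by time.
    """
    # Collect every position reported for the same cell at the same time.
    grouped = defaultdict(list)
    for cell_id, t, x, y, z in rows:
        grouped[(cell_id, t)].append((x, y, z))

    # Average duplicates into a single position per cell and timepoint.
    tracks = defaultdict(list)
    for (cell_id, t), pts in grouped.items():
        n = len(pts)
        mean = tuple(sum(p[i] for p in pts) / n for i in range(3))
        tracks[cell_id].append((t, *mean))

    # Fill missing timepoints by linear interpolation (an assumption here).
    formatted = {}
    for cell_id, pts in tracks.items():
        pts.sort()
        filled = [pts[0]]
        for prev, nxt in zip(pts, pts[1:]):
            n_steps = round((nxt[0] - prev[0]) / dt)
            for k in range(1, n_steps):
                frac = k / n_steps
                filled.append(tuple(a + frac * (b - a) for a, b in zip(prev, nxt)))
            filled.append(nxt)
        formatted[cell_id] = filled
    return formatted
```

In this sketch the input rows are never modified, mirroring the behaviour described above in which the original dataset is left intact and a reformatted copy is produced.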
2.3. Calculations and Summary Statistics:
Step-based calculations are performed to extract the most information possible from a track. Instantaneous displacement, total displacement, path length, instantaneous velocity, instantaneous acceleration, point-to-point Euclidean distance, and relative turning angle are calculated (Beltman et al., 2009). Several user-defined variables are used in these calculations, including the time lag (τ or tau) and the minimum displacement limit. Euclidean distance and relative turning angle are calculated over a given τ value, so that these parameters can be measured at smaller or larger time intervals according to the particular type of sample. Both are also filtered by the per-timepoint minimum displacement limit so that background, non-specific movement can be omitted. Cell-based summary statistics are also calculated for all step-based parameters, and include displacement ratio, outreach ratio, arrest coefficient, and mean squared displacement (MSD). Average displacements and standard deviations are reported per τ value across the entire dataset, as well as for cell subsets within the dataset (if provided by the user in a ‘categories’ file; see PCA section).
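As an illustration of how these quantities relate to one another, the following sketch computes them for a single track. It is not taken from Migrate3D: the function name `track_metrics` is hypothetical, a constant sampling interval `dt` is assumed, and the definitions of displacement ratio (final displacement over path length), outreach ratio (maximum displacement over path length), and arrest coefficient (fraction of steps below the minimum displacement limit) follow common usage and are assumptions here.

```python
import math


def track_metrics(points, dt, tau=1, min_disp=0.0):
    """Step-based values and summary statistics for one track.

    points: list of (x, y, z) positions sampled every dt time units.
    tau: lag, in timepoints, for Euclidean distance and turning angle.
    min_disp: minimum displacement limit used for filtering.
    """
    steps = [math.dist(a, b) for a, b in zip(points, points[1:])]
    path_length = sum(steps)
    from_start = [math.dist(points[0], p) for p in points]
    velocity = [s / dt for s in steps]
    acceleration = [(v2 - v1) / dt for v1, v2 in zip(velocity, velocity[1:])]

    # Euclidean distance over a lag of tau timepoints, and its mean square.
    lagged = [math.dist(a, b) for a, b in zip(points, points[tau:])]
    msd = sum(d * d for d in lagged) / len(lagged)

    # Turning angle between consecutive tau-lagged displacement vectors.
    # Pairs in which either vector is shorter than min_disp are skipped.
    angles = []
    for a, b, c in zip(points, points[tau:], points[2 * tau:]):
        u = [q - p for p, q in zip(a, b)]
        v = [q - p for p, q in zip(b, c)]
        nu, nv = math.hypot(*u), math.hypot(*v)
        if min(nu, nv) < min_disp or min(nu, nv) == 0:
            continue
        cos = sum(i * j for i, j in zip(u, v)) / (nu * nv)
        angles.append(math.degrees(math.acos(max(-1.0, min(1.0, cos)))))

    def per_path(value):
        # Ratios are undefined for a track that never moves.
        return value / path_length if path_length else float("nan")

    return {
        "path_length": path_length,
        "total_displacement": from_start[-1],
        "velocity": velocity,
        "acceleration": acceleration,
        "msd": msd,
        "turning_angles": angles,
        "displacement_ratio": per_path(from_start[-1]),
        "outreach_ratio": per_path(max(from_start)),
        "arrest_coefficient": sum(s < min_disp for s in steps) / len(steps),
    }
```

Raising `tau` measures distance and turning angle over longer intervals, while raising `min_disp` discards more small movements and increases the arrest coefficient.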
2.4. Contact Detection:
If enabled, the Contacts process will iterate over all the cells in the dataset, comparing their X/Y/Z positions at each timepoint. If the intercellular distance at a given timepoint is less than the contact length limit set by the user, it will be recorded as a contact for the two cells in question. The resulting dataset is further filtered to exclude cell divisions (where the daughter cells are, for some time right after cytokinesis, in close proximity but not because a new cell-cell contact has formed) and any contacts involving non-motile or ‘dead’ cells (whether or not they are truly dead). Cell divisions are detected by evaluating the unique identifiers of each pair of cells found to be in contact: if those identifiers differ from each other by exactly one, the cells are considered to be daughter cells. Because this rule assumes that daughter cells carry consecutive identifiers, it may limit the universality of the function, and some datasets may require manual reformatting to use it properly. A cell is determined to be non-motile if its arrest coefficient exceeds a user-set threshold (note that the arrest coefficient is, in turn, based on the minimum displacement limit set by the user). Finally, a summary of each individual cell’s contact history within the dataset is also produced, including the number of contacts made, the total time spent in contact, and the median contact duration.
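The pairwise comparison and the two filters can be sketched as follows. This is a simplified illustration rather than the Migrate3D implementation: the function name `detect_contacts`, the nested-dictionary input layout, and the use of integer cell IDs are assumptions, and the set of non-motile cells is taken as given instead of being derived from arrest coefficients.

```python
import math
from itertools import combinations


def detect_contacts(positions, contact_limit, non_motile=frozenset()):
    """Return (time, id_a, id_b, distance) for every detected contact.

    positions: {time: {cell_id: (x, y, z)}} with integer cell IDs (assumed).
    contact_limit: maximum intercellular distance counted as a contact.
    non_motile: IDs of cells already classified as non-motile.
    """
    contacts = []
    for t in sorted(positions):
        cells = sorted(positions[t].items())
        for (a, pos_a), (b, pos_b) in combinations(cells, 2):
            # Exclude contacts involving non-motile ('dead') cells.
            if a in non_motile or b in non_motile:
                continue
            # Division heuristic from the text: IDs differing by exactly one.
            if abs(a - b) == 1:
                continue
            distance = math.dist(pos_a, pos_b)
            if distance < contact_limit:
                contacts.append((t, a, b, distance))
    return contacts
```

The comparison is quadratic in the number of cells per timepoint, which is acceptable for a sketch; a spatial index would be a natural substitute for large datasets.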
2.5. Dimensionality Reduction (PCA):
If a dataset contains known cell subsets (e.g., virus-infected and uninfected cells distinguished by a fluorescent reporter), the user has the option to provide a ‘categories’ file (also a CSV), in which each cell ID is annotated with a category (which can be any string or value). If this file is provided, Migrate3D will use the migration parameters described above to perform dimensionality reduction by PCA and subsequent statistical analyses comparing the given cell subsets. A Kruskal-Wallis test and a Dunn post-hoc analysis are performed on the PCA results based on the provided categories in order to evaluate whether significant differences exist between known subsets of cells. A separate Excel workbook output is generated for the results of PCA.
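The projection step can be illustrated with a short NumPy sketch. It covers only the PCA and the grouping of scores by category, not the statistical tests, and it is not Migrate3D's own code: the function names `pca_scores` and `scores_by_category` are hypothetical, and standardizing each parameter before the decomposition is an assumption.

```python
import numpy as np


def pca_scores(features, n_components=2):
    """Project cells (rows) described by migration parameters (columns).

    Returns the component scores and the fraction of variance explained.
    """
    X = np.asarray(features, dtype=float)
    X = X - X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant columns unscaled
    X = X / std

    # Rows of Vt are the principal axes of the standardized data.
    _, S, Vt = np.linalg.svd(X, full_matrices=False)
    scores = X @ Vt[:n_components].T
    explained = S ** 2 / np.sum(S ** 2)
    return scores, explained[:n_components]


def scores_by_category(scores, categories, component=0):
    """Group the scores of one component by each cell's category label."""
    groups = {}
    for row, category in zip(scores, categories):
        groups.setdefault(category, []).append(float(row[component]))
    return groups
```

The per-category lists returned by `scores_by_category` have the shape expected by a rank-based test such as `scipy.stats.kruskal`, which takes one sample per group.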