The development of CitoLIMS, named for the Cytogenomics Laboratory (LIM 03), was divided into three distinct stages: (1) analysis, standardization, and schematization of the tables (in Excel format) already in use in the laboratory; (2) creation of non-interactive prototypes (drawings, flowcharts, and small-scale functional tests); and (3) writing of the source code.
The analysis of the tables and the schematization of the database served as the basis for the prototypes and, later, for crucial points of the source code, because this analysis settled critical decisions such as which database to use and which data it would hold. Prototyping and functional testing were carried out using design tools (digital and manual), analysis of application usage flowcharts, and tests within Python to determine which modules would be necessary to achieve the proposed objectives.
The source code was written entirely in integrated development environments (IDEs), chiefly VSCode (https://code.visualstudio.com/), which integrates with Flake8 (https://flake8.pycqa.org/en/latest/). Flake8 helps keep the code within the guidelines of the Python Enhancement Proposals (https://peps.python.org/), which seek to standardize Python code and make it easier to write and read.
Table 1 lists the modules used in the source code, with a brief description of each module's function and its documentation/reference. We highlight here the modules that formed the basis of the software.
Psycopg2 and SQLAlchemy were responsible for the integration between Python and PostgreSQL, not only allowing communication between the two but also serving as the basis for the interface between the application and the database. This interface allows data to be exchanged securely (with prevention of SQL injection), information to be searched, and transactions to be performed (addition and deletion of data). PyQt5 integrates Python with Qt 5 (written in another programming language, C++), which uses the operating system's graphical elements to create custom application windows; in the current version, however, the interface was designed to provide only the minimum necessary to use the software without a shell (command prompt). Pandas is the main module for working with tables and spreadsheets in Python and is the main component for viewing and changing data within the application.
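The kind of SQL-injection prevention mentioned above can be illustrated with a minimal sketch of parameterized queries. This is not the CitoLIMS code: the table `samples`, its columns, and the helper functions are hypothetical, and the standard-library `sqlite3` module is used as a stand-in for psycopg2/PostgreSQL so that the example runs without a database server. With psycopg2 the same pattern applies, using `%s` placeholders instead of `?`.

```python
import sqlite3


def open_demo_db():
    """Create an in-memory database with a hypothetical 'samples' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE samples ("
        "id INTEGER PRIMARY KEY, "
        "patient_code TEXT NOT NULL, "
        "exam_type TEXT NOT NULL)"
    )
    return conn


def add_sample(conn, patient_code, exam_type):
    """Insert one row; values are bound as parameters, never formatted into the SQL."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "INSERT INTO samples (patient_code, exam_type) VALUES (?, ?)",
            (patient_code, exam_type),
        )
    return cur.lastrowid


def find_samples(conn, patient_code):
    """Return all rows for a patient code, again using a bound parameter."""
    cur = conn.execute(
        "SELECT id, patient_code, exam_type FROM samples WHERE patient_code = ?",
        (patient_code,),
    )
    return cur.fetchall()


def delete_sample(conn, sample_id):
    """Delete one row by id and return the number of rows removed."""
    with conn:
        cur = conn.execute("DELETE FROM samples WHERE id = ?", (sample_id,))
    return cur.rowcount


if __name__ == "__main__":
    conn = open_demo_db()
    add_sample(conn, "P-001", "karyotype")
    # A hostile string is stored as plain data; it is never executed as SQL.
    add_sample(conn, "x'; DROP TABLE samples; --", "FISH")
    print(find_samples(conn, "P-001"))
    conn.close()
```

Because the driver receives the SQL text and the values separately, user input cannot alter the structure of the statement, which is the property the paragraph above refers to.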
Version control was performed with Git (https://git-scm.com/).
The software is open source under the MIT License (Massachusetts Institute of Technology License), which permits use and modification by third parties, and is distributed through the online platforms GitHub (https://github.com/) and GitLab (https://gitlab.com/).
Table 1
Modules used to create the source code
Module name | Description | Reference
sys | Module that allows system access from Python. | 7
os | Module that allows operating system (OS) manipulation from Python. | 7
subprocess | Module that allows use of the system shell from Python. | 7
PyQt5 | Module that uses the OS graphical elements to build and develop software. | 8
datetime | Module for date and time access and manipulation. | 7
SQLAlchemy | Module for building and using an ORM in Python to access databases. | 9
psycopg2 | PostgreSQL adapter for Python. | 10
smtplib | Module for SMTP (e-mail) access and manipulation from Python. | 7
MIME | Module for composing properly formatted e-mail messages in Python. | 7
random | Module for generating random numbers. | 7
numpy | Module providing mathematical and statistical operations for Python. | 11
pandas | Module for creating and manipulating tabular data in Python. | 12
logging | Module for logging processing events in Python code. | 7
Flake8 | Code analyzer that checks conformance with the PEP style guidelines. | 13
re | Module for text analysis and manipulation. | 7
Table 1 lists the modules used to write the CitoLIMS source code, with a brief description and reference for each.