3.1. Architecture analysis
B/S (browser/server) architecture is a distributed application structure in which the business logic resides on the server side, while the client is a web browser. The system can therefore be used on any operating system, provided that the computer has Internet access and a browser (or other software capable of accessing the Internet) installed.
3.2. MySQL
Relational databases such as MySQL facilitate the handling of large amounts of interrelated data because they are built on the relational model, in which the links between tables are defined formally. In contrast, flat-file data stores such as Excel spreadsheets are not suitable for handling multiple related tables. PepQSAR is based on MySQL, which has separate clients that allow users to interact directly with the database using SQL. MySQL is the preferred database for small-scale web development because it is quick to set up and easy to use. It also offers high performance at low cost, which makes it better suited to small and medium-sized websites than larger, more expensive database systems.
3.3. PHP
PHP's syntax draws on C, Java, and Perl, together with constructs of its own. PHP can also execute compiled code, which can be encrypted and optimized, and it runs faster than CGI. Its other advantages are that it is open source, supports rapid development, and is cross-platform: it runs on UNIX, Linux, and Windows.
3.4. Web technology
Bootstrap. Bootstrap is a simple, attractive, general-purpose responsive front-end UI library (also described as a framework) built on HTML, CSS, and JavaScript. It gives front-end developers a faster development tool that greatly improves productivity.
jQuery. jQuery is a free, open-source JavaScript library that simplifies HTML DOM manipulation, event handling, CSS animation, and Ajax. As of 2019, approximately 73% of the 10 million most popular websites used jQuery. Its benefits include separating JavaScript from HTML, simplicity, and a high degree of extensibility.
HTML5. HTML is a markup language used to structure and display content on websites. HTML5 is the last major version of HTML published as a recommendation by the World Wide Web Consortium.
CSS. CSS (Cascading Style Sheets) describes the presentation of HTML documents, including colors, fonts, and layout. It is designed to separate content from presentation. This separation improves the readability of content and provides greater flexibility and control over presentation. Pages that require the same style can share one CSS file, which reduces the complexity and repetition involved in authoring the website.
JavaScript. JavaScript is a programming language that conforms to the ECMAScript standard. It can be used to embed dynamic text in HTML pages, respond to browser-generated events, and validate data before it is submitted to the backend server.
Apache Tomcat. Apache HTTP Server (Apache) is an open-source web server that runs on most widely used computer platforms. Apache Tomcat (Tomcat) is a free, open-source, lightweight web server and servlet container that is noted for high availability without affecting the live environment.
Developing web software based on the B/S structure requires a combination of technologies. The client side is generally written in HTML, CSS, JavaScript, and related technologies, and relies on a browser to render the graphical interface through which the user browses or operates the site. The server side uses a web server to receive requests from the client and return the results. A database management system stores and manages the data required for business processing, and a server-side scripting language accesses the database to generate page content dynamically and respond in a timely manner. Although developers can choose among many web development components, the open-source LAMP stack (Linux, Apache, MySQL, PHP) has received attention from the entire IT community for its high compatibility, low investment cost, and stable operation. Incorporating some of the best features of modern programming languages, the combination of PHP, Apache, and MySQL has become a standard for web servers.
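The request and response cycle described above can be illustrated with a minimal sketch. It uses Python's standard library (sqlite3 and a WSGI callable) purely as a stand-in for the PHP, Apache, and MySQL stack; the table name `descriptor`, its two columns, and the sample row are hypothetical and do not reflect actual PepQSAR data.

```python
import sqlite3
from html import escape


def make_app(conn):
    """Return a WSGI application that renders descriptor names from the database."""

    def app(environ, start_response):
        # Server side: read the data needed for the page from the database.
        rows = conn.execute(
            "SELECT name, style FROM descriptor ORDER BY name"
        ).fetchall()
        # Generate the page content dynamically from the query result.
        items = "".join(
            f"<li>{escape(name)} ({escape(style)})</li>" for name, style in rows
        )
        body = f"<html><body><ul>{items}</ul></body></html>".encode("utf-8")
        start_response(
            "200 OK",
            [
                ("Content-Type", "text/html; charset=utf-8"),
                ("Content-Length", str(len(body))),
            ],
        )
        return [body]

    return app


def demo_connection():
    """Build an in-memory database holding one hypothetical descriptor row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE descriptor (name TEXT PRIMARY KEY, style TEXT)")
    conn.execute("INSERT INTO descriptor VALUES ('ExampleScale', 'physicochemical')")
    return conn
```

Such an application could be served locally with `wsgiref.simple_server.make_server("", 8000, make_app(demo_connection()))`, after which any browser acts as the client.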
PepQSAR was developed using a MySQL database, HTML (for the GUI), CSS (for style sheets), PHP (as the server-side scripting language), and JavaScript (for displaying alert messages and validating data), in a manner that is transparent to the end user. The field design of the amino acid descriptor data sheet is listed in Table 3.
Table 3
| Database field | Description | Data type | Length | Constraint |
| --- | --- | --- | --- | --- |
| Name | Name of descriptors | varchar | 50 | Primary key |
| Style | Type of descriptors | varchar | 50 | |
| Brief introduction | Physicochemical property of amino acids | char | 50 | |
| Number of factors | Factors in descriptors | int | 20 | |
| Method | Methods to select factors | char | 50 | |
| Number of variables | Variables selected in descriptors | int | 20 | |
| Application | Peptide characterization | char | 50 | |
| DOI | Digital Object Unique Identifier | varchar | 255 | |
| References | The source documents | varchar | 20 | Foreign key |
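As a rough illustration, the field design in Table 3 could be expressed as the schema below. This is a sketch only: it uses Python's built-in sqlite3 module rather than MySQL, the snake_case column names are assumptions, and the `literature` table that the foreign key points to is hypothetical, since the source does not name the referenced table. Display lengths for the integer columns are omitted.

```python
import sqlite3

# Column names are assumed; only the field meanings, types, and
# constraints are taken from Table 3.
SCHEMA = """
CREATE TABLE literature (
    ref_id VARCHAR(20) PRIMARY KEY          -- hypothetical table of source documents
);
CREATE TABLE descriptor (
    name                VARCHAR(50) PRIMARY KEY,
    style               VARCHAR(50),
    brief_introduction  CHAR(50),
    number_of_factors   INTEGER,
    method              CHAR(50),
    number_of_variables INTEGER,
    application         CHAR(50),
    doi                 VARCHAR(255),
    ref_id              VARCHAR(20) REFERENCES literature(ref_id)
);
"""


def create_database():
    """Create an in-memory database with the sketched descriptor schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

With this layout, inserting a descriptor whose `ref_id` has no matching row in `literature` is rejected, which is the behavior the foreign key constraint in Table 3 implies.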
3.5. Functional requirements analysis
Functional requirements are the most basic requirements in web development and reflect the basic expectations of the users (Fig. 2). Functional requirements analysis is critical, and the requirements themselves change over time as the Internet continues to develop. This website is developed using the B/S architecture, so users can access it from their own computers as long as they are connected to the Internet. Based on the overall demand analysis, the system comprises two main modules:
(1) Web front-end module:
I. About: Users can see the database details in this module.
II. Amino acid descriptor: Users can see the table data and detailed information of amino acid descriptors.
III. Bioactive peptides: Users can see the activity data and modeling results of active peptides.
IV. QSAR models: Users can see the modeling data of the active peptide characterized by descriptors in this module.
V. Latest information: Users can see the latest literature about descriptors and active peptides updated by the administrator.
VI. Contact information: The email and contact address of the administrator are provided at the bottom of the page.
(2) Backend function module:
The administrator organizes the database data and updates the related latest information, mainly by adding and deleting data in a timely manner.
The site logo is at the top left of the home page, and the navigation bar is directly below it. Clicking Home returns to the home page, and clicking About opens a brief introduction to the database. Clicking Amino Acid Descriptor opens a small drop-down menu: Introduction leads to a brief introduction and summary of the descriptors, and Search leads to the descriptor search page. Bioactive Peptide has a similar drop-down menu: Introduction leads to a brief introduction to the active peptides, and Search leads to the peptide search page. Contact details for the webmaster are given at the bottom of the home page.
3.6. Database block
The organization and architecture of the PepQSAR database consist of three modules, as shown in Fig. 3.
(1) Descriptor block:
The search method is selected above the search box. To retrieve entries of interest, PepQSAR provides a keyword search: users enter a descriptor name in the search box and are taken to the search result page. If the user selects type search instead, a drop-down selection box lists the available types. PepQSAR returns the results matching the user's input as a sorted table with the columns Name, Style, and Brief introduction. Clicking a Name opens the detailed information for that descriptor, including Name, Style, Brief introduction, Number of factors, Method, Number of variables, Descriptor table, Application, QSAR result, DOI, and References. The descriptor table and the QSAR modeling data of the descriptor can be viewed and downloaded by clicking View.
(2) Peptide block:
Entering the name of an active peptide in the search box leads to the search result page. The results are displayed as a list that includes Name and Introduction. Clicking a Name opens the detailed information for that peptide, including name, brief introduction, method, modeling results, and references. QSAR results and peptide activity values can be viewed and downloaded.
(3) QSAR model block:
The descriptors and peptides corresponding to each model are shown on this page. Users can retrieve the relevant information by entering the name of an active peptide or a descriptor in the search box. The results are displayed as a list that includes Descriptor, Peptide, QSAR Result, and References.
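The search behavior of the descriptor and QSAR model blocks can be sketched as parameterized queries. The example below is an assumption-laden illustration, not the PepQSAR implementation: it uses Python's sqlite3 instead of PHP and MySQL, and the function names, table names, column names, and sample rows are all hypothetical.

```python
import sqlite3


def search_descriptors(conn, keyword=None, style=None):
    """Return (name, style, brief_introduction) rows sorted by name.

    keyword: substring matched against the descriptor name (keyword search).
    style:   exact descriptor type, as picked from a drop-down (type search).
    """
    sql = "SELECT name, style, brief_introduction FROM descriptor"
    clauses, params = [], []
    if keyword:
        clauses.append("name LIKE ?")
        params.append(f"%{keyword}%")
    if style:
        clauses.append("style = ?")
        params.append(style)
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY name"
    # Placeholders keep user input out of the SQL text itself.
    return conn.execute(sql, params).fetchall()


def search_models(conn, term):
    """Return model rows whose descriptor or peptide name contains `term`."""
    pattern = f"%{term}%"
    return conn.execute(
        "SELECT descriptor, peptide, qsar_result, ref FROM qsar_model "
        "WHERE descriptor LIKE ? OR peptide LIKE ? "
        "ORDER BY descriptor, peptide",
        (pattern, pattern),
    ).fetchall()


def demo_connection():
    """Build an in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE descriptor (
            name TEXT PRIMARY KEY, style TEXT, brief_introduction TEXT);
        CREATE TABLE qsar_model (
            descriptor TEXT, peptide TEXT, qsar_result TEXT, ref TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO descriptor VALUES (?, ?, ?)",
        [
            ("Alpha-scale", "type-a", "hypothetical entry"),
            ("Beta-scale", "type-b", "hypothetical entry"),
            ("Alpha-index", "type-b", "hypothetical entry"),
        ],
    )
    conn.executemany(
        "INSERT INTO qsar_model VALUES (?, ?, ?, ?)",
        [
            ("Alpha-scale", "peptide-x", "result-1", "ref-1"),
            ("Beta-scale", "peptide-y", "result-2", "ref-2"),
        ],
    )
    return conn
```

Passing user input as query parameters rather than concatenating it into the SQL string is a common precaution for a public search box, whatever server-side language is used.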
3.7. Software testing
Software testing is an indispensable step after the design and development of any program, because the functions of the website must behave precisely as intended.
Software testing is a systematic process in which developers check the soundness and safety of the software against the requirements. After testing, the developers revise the software, modifying any modules that deviate from the intended logic and functions within a certain scope. In black-box testing, the developers, without knowing how the program works internally, enter inputs according to the requirement specification and check whether the program accepts the parameters and produces the corresponding results in order. In white-box testing, the developers have access to the code and test each program interface against the design and the requirements analysis provided by the requirements analyst, using both static and dynamic analysis methods.
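A black-box test in this sense checks only inputs against the outputs that the specification requires. The sketch below uses Python's unittest module; the function `normalize_keyword` and its specification (trim whitespace, reject empty searches) are hypothetical and serve only to illustrate the idea.

```python
import unittest


def normalize_keyword(raw):
    """Hypothetical input handler: collapse whitespace and reject empty searches."""
    cleaned = " ".join(raw.split())
    if not cleaned:
        raise ValueError("search keyword must not be empty")
    return cleaned


class NormalizeKeywordBlackBoxTest(unittest.TestCase):
    """Compares inputs with expected outputs without inspecting the implementation."""

    def test_trims_and_collapses_whitespace(self):
        self.assertEqual(normalize_keyword("  amino   acid "), "amino acid")

    def test_rejects_empty_input(self):
        with self.assertRaises(ValueError):
            normalize_keyword("   ")


def run_tests():
    """Run the black-box test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        NormalizeKeywordBlackBoxTest
    )
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

A white-box test of the same function would instead be written with the code in view, for example to exercise each branch explicitly.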
Based on a comprehensive understanding of the site's functionality, we tested content accuracy, image accuracy, and click accuracy over a long period. After the alpha test, PepQSAR received favorable feedback on implementation effect, button accuracy, and content accuracy. The second round of testing found no serious errors, and the bugs that were found were fixed, so PepQSAR could be launched as scheduled.
The literature is growing constantly, so a strategy for maintaining the PepQSAR database is a necessity. At present we have settled on manual, periodic literature searches to identify new publications that we can mine for QSAR data. It would be advantageous if journals that publish significant numbers of peptide QSAR studies could be persuaded to make submission to PepQSAR a requirement, as is done for GenBank. We hope that, as PepQSAR becomes more widely used, journals will require or at least strongly encourage authors to submit their research data directly to us.
In this way, researchers should be able to dispense with lengthy searches for material and the associated data collection, which should significantly speed up new research. Meanwhile, we will continue to implement more advanced features and rendering effects for the database.