Blockchain as a service environment: a dependability evaluation

Blockchain has become an important processing paradigm in recent years. It supports financial transactions and validates contracts, documents and data, and its evolution has made it viable for many applications. Data processing requires the servers to be available and reliable (dependable). A contract will only be signed if there are enough components to form the blockchain blocks. This paper analyses the dependency between the components of projects that use blockchain. We present a model based on stochastic Petri nets (SPN) for evaluating the dependability of the blockchain architecture. The Design of Experiments (DoE) method was used to analyse the model's factors and identify which ones had the highest impact on the system. The sensitivity analysis showed that the MongoDB component has the greatest impact on the system's dependability, which indicates the need to upgrade that component. Also, regarding reliability, component improvements are unnecessary if the system has fewer than 36,000 h of runtime.


Introduction
Different payment methods have allowed the monetary system to develop. All payment systems work to protect buyers and sellers against fraud of all kinds, including the spread of malware, spam, and theft [1,2]. Blockchain enables you to save your money securely and in a less cumbersome manner without paying the bank expensive maintenance costs [3]. However, this change presents significant obstacles, including financial stability, supervision, and regulation, and the implementation process is complicated. Several nations have used blockchain in the context of financial technologies. Since 2017, Japan has used blockchain to streamline bidding procedures and synchronize all land and property records in metropolitan regions.
Usually, blockchain technology needs a cloud computing environment to host and manage data distribution services. However, this environment can present flaws and difficulties in providing services, which can make transactions take longer to perform [4]. The availability and reliability of cloud computing systems are of great importance to those planning to contract, deliver or share through these distributed system environments [5]. Evaluating the availability and reliability of blockchain architectures is, however, a complex task. This paper evaluates reliability and availability by simulating a blockchain network as a data distribution service through stochastic Petri net models with dependency.
Petri Nets [6] are a family of formalisms very well suited for modelling several system types, since concurrency, synchronization, communication mechanisms, and deterministic and probabilistic delays are naturally represented. This work adopts a particular extension, namely Stochastic Petri Nets (SPN) [7], which allows stochastic delays to be associated with timed transitions and whose state space can be converted into a continuous-time Markov chain (CTMC) [8]. SPN models have a strong mathematical foundation and are suitable for representing and analysing parallel systems with heterogeneous components that exhibit concurrency and synchronization aspects. In SPNs, places are represented by circles, whereas transitions are depicted as filled rectangles (immediate transitions) or hollow rectangles (timed transitions). Therefore, this formalism is a suitable choice for modelling cloud computing systems.
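As a minimal illustration of how an SPN with exponentially timed transitions maps onto a CTMC, consider a single repairable component with one failure transition and one repair transition. The resulting chain has two states (UP and DOWN), and its steady-state solution gives the component availability. The function name and the MTTF/MTTR values below are assumptions chosen for illustration; they are not taken from this paper.

```python
# Two-state CTMC for one repairable component (UP <-> DOWN).
# The MTTF and MTTR values used here are illustrative assumptions.

def steady_state_availability(mttf_h: float, mttr_h: float) -> float:
    """Steady-state probability of the UP state for exponential failure and repair."""
    failure_rate = 1.0 / mttf_h  # rate of the timed transition UP -> DOWN
    repair_rate = 1.0 / mttr_h   # rate of the timed transition DOWN -> UP
    # Balance equation: pi_up * failure_rate = pi_down * repair_rate,
    # together with pi_up + pi_down = 1.
    return repair_rate / (failure_rate + repair_rate)


availability = steady_state_availability(mttf_h=1000.0, mttr_h=2.0)
print(f"Steady-state availability: {availability:.6f}")
```

The same result can be written as MTTF / (MTTF + MTTR); larger models with many places and guard expressions require a tool to generate and solve the underlying chain.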
We have evaluated Hyperledger Cello, a popular project hosted by Hyperledger and managed by the Linux Foundation. Our behavioural models describe the components, relationships, and dependencies of the entire infrastructure. The research contributions are relevant to project managers and organizations planning to offer a blockchain network for data distribution. Knowing the limits of availability and reliability allows service providers to apply techniques that increase them, such as redundancy and preventive maintenance. The main contributions of this paper are:
- An SPN model to assess the availability of a blockchain system. The proposed model enables the configuration of a significant number of parameters, including 26 transitions corresponding to mean times to failure (MTTFs) and mean times to repair (MTTRs). In addition, the model is highly scalable in terms of the number of masters and slaves.
- An SPN model to assess the reliability of a blockchain system. The reliability model, in turn, presents 13 configurable transitions, with the aforementioned advantages of the availability model, but also the capability of calculating the reliability in different scenarios over time.
- A set of sensitivity analyses on the model's components to identify those that most impact the metrics of interest. Sensitivity analysis showed that specific components have a more significant impact on availability. The Design of Experiments made it possible to visualize more precisely the impact of the MTTF and MTTR values of each component on the availability of the system.
- Case studies that serve as guides for how evaluators can use the proposed models.
The remainder of this paper is organized as follows. Section 2 describes the related works. Section 3 presents the architecture of the analysed system. Section 4 shows the adopted methodology. Section 5 presents the extended SPN models. Section 6 presents the results obtained in the simulations. Section 7 presents a discussion about the challenge of dealing with availability in blockchain environments. Finally, Sect. 8 concludes the work and presents future work.

Related work
This section presents a state-of-the-art survey and positions the proposal of this study. The key distinctions between this research and similar publications are compared and summarized in Table 1. Ten works are listed, the oldest of which is from 2016. We searched for works associated with blockchain. The use of analytical models was a second selection criterion. Finally, we aimed to select more in-depth works that addressed architectural difficulties specifically connected to computational components (mostly data processing). Few papers have examined blockchain in relation to analytical models. Therefore, we also chose studies that assess computational communication systems in terms of reliability and performability using stochastic Petri nets, extended SPN, and DRBD.
A feasibility study for a blockchain infrastructure as a service was undertaken by Melo et al. [9], and it aids those wishing to implement or market blockchains. System reliability and availability are two dependability properties that are evaluated using a modelling technique based on Dynamic Reliability Block Diagrams (DRBD). Rodrigues et al. [10] present an approach based on generalized stochastic Petri nets (GSPN) to evaluate the performance of private cloud computing environments that adopt NoSQL DBMS as a storage system. Models are presented to jointly estimate throughput and availability, which are prominent indicators of QoS. Liu et al. [11] present a new coloured generalized stochastic Petri net (CGSPN) model based on IT infrastructures, which reflects the dynamic behaviour and the procedure for processing service requests under the advanced active-active mechanism.
Jammal et al. [12] propose a cloud scoring system with an SPN model. The Petri net model assesses the availability of cloud application deployments, and the approach is illustrated with a use case that shows how the various deployment options can be used to satisfy tenant and cloud provider needs. Zabala et al. [13] present the modelling of a virtual firewall based on SPN to analyse performance in terms of throughput and delay. Mendonça et al. [14] present an integrated approach that combines experiments and models to evaluate cloud-based disaster recovery (DR) solutions. They used SPNs and fault injection experiments to assess availability-related metrics. To demonstrate the approach's feasibility, distinct real-world cloud-based DR solutions (e.g., active/active and active/standby) were modelled and analysed. Silva et al. [15] propose an SPN modelling strategy to represent method call executions of mobile cloud computing (MCC) systems. This approach allows a designer to plan and optimize MCC environments where SPNs represent system behaviour and drive parallelizable application execution time.
Pinheiro et al. [16] propose a formal framework based on SPN to represent application partitioning at the method call level. The framework considers the network bandwidth available to send and receive tasks to the cloud. Jammal et al. [17] propose a stochastic Petri Net model that captures the stochastic characteristics of cloud services. The model assesses the availability of cloud services and their deployments in geographically distributed data centres. Fé et al. [18] propose a stochastic model to assist cloud planning. The model was validated for a set of significant scenarios by comparing the results of the respective model with those obtained from real system measurements. This model takes as input the auto-scaling configuration parameters and the time between user requests. The proposed model calculates the throughput, the mean response time and the cost of configuring the cloud computing infrastructure. A sensitivity analysis was also performed to identify the impact of parameters on system performance.
Other works did not use analytical modelling, but it is important to highlight the work by Das et al. [19-21], which used experiments with blockchain. These articles presented real applications that dealt with authentication and security, as well as reliability and availability.

Architecture and base models
This section presents the reference architecture for the blockchain system and the SPN model, with details on the execution flow and its base components. The SPN model was proposed to apply a simulation that integrates the formal description, proof of correctness, and performance evaluation of the proposed context [15,16,22-30].
Figure 1 illustrates the reference architecture representing Hyperledger Cello. The environment used to host Hyperledger Cello consists of two nodes, the master node and the worker node, each responsible for running a series of services. The flow starts with the Watchdog, responsible for monitoring the blockchain network service and the system's status. RestServer performs environment provisioning, orchestration and task management. The Dashboard provides environment management for system administrators. Docker manages containers and provides the tools needed to run and virtualize applications. Nodes run Docker as a host for Hyperledger Cello. Python also runs on the host, supporting the Watchdog, RestServer, and Dashboard on the master node. Nginx is a reverse proxy used by Hyperledger Cello to improve web performance. NodeJS is a JavaScript runtime used by Cello to improve provisioning. MongoDB is an open-source distributed database that allows you to query and index data. The hardware used to run Hyperledger Cello can be a desktop or a virtual machine. The fundamental prerequisite is that the operating system is Linux.
Figure 2 presents two models of a Master and Worker architecture that represent a series of components in a blockchain network. These models represent an architecture with the minimum requirements to provide a blockchain network on top of the Hyperledger Cello platform. If any of the components fail, the system will not be available, and the service running on the worker will not be accessible from the external infrastructure of the blockchain network.

Master node
The master node is the machine responsible for providing access management to the blockchain network. Through the master node, it is possible to create, delete and define who can share information or see what is coming from one user to another, which represents some dependencies. The master contains the hardware, an operating system, MongoDB, Python, NodeJS, Nginx, Docker, Dashboard, RestServer, and Watchdog. The hardware component (HW) is the foundation of the entire blockchain architecture, and every other component depends directly on it. If the HW fails, all software components will fail. To repair a machine that has had a hardware failure, the blockchain architecture repair routine starts with repairing the HW. After the HW, the operating system (OS) is the next component to be repaired. Next, all other software is repaired. Docker should be repaired first, followed by Dashboard, RestServer, and Watchdog. The model's focus is to help professionals choose the best architecture configuration for their blockchain system. Table 2 presents the adopted guard expressions.
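The dependency behaviour described above (a failure propagates to everything that depends on the failed component, and a repair is only enabled once the parent component is up) can be sketched in a few lines of Python. This is only an illustration of the idea behind guard expressions, not the expressions of Table 2. The `PARENT` map is a hypothetical layout inferred from the repair order in the text, and it deliberately leaves out MongoDB, Python, NodeJS and Nginx, whose position in the chain is not stated here.

```python
# Hypothetical dependency map (child -> parent), inferred from the repair
# order described in the text; it is NOT the set of guards from Table 2.
PARENT = {
    "HW": None,
    "OS": "HW",
    "Docker": "OS",
    "Dashboard": "Docker",
    "RestServer": "Docker",
    "Watchdog": "Docker",
}


def failed_after(component, parent=PARENT):
    """Return the set of components that go down when `component` fails."""
    down = {component}
    changed = True
    while changed:  # keep propagating until no new dependant is found
        changed = False
        for child, par in parent.items():
            if par in down and child not in down:
                down.add(child)
                changed = True
    return down


def can_repair(component, up, parent=PARENT):
    """Guard-like predicate: repair is enabled only if the parent is up."""
    par = parent[component]
    return par is None or par in up


print(sorted(failed_after("Docker")))
print(can_repair("OS", up=set()), can_repair("OS", up={"HW"}))
```

In an SPN tool, the same conditions would be written as guard expressions over place markings rather than as Python predicates.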

Worker node
Another SPN model is proposed for the Worker Node and presented in Fig. 2. This model has fewer components, containing only three elements: the hardware (HW), operating system (OS) and Docker. Each element depends on the previous component. If the hardware fails, all software components will fail. The failure and recovery times used in the Worker Node were the same as those used in the Master Node. A blockchain system has the characteristics of a P2P system, where the client can also be a server [31].

Methodology
The strategy used in this work as a research methodology, summarized in the flowchart of Figure 3, is composed of the following steps.
- Understanding the application: It is important to understand how the application works, define how many components are involved, and map the system's data flow, for example, where the data will be sent after passing through component 'x'.
- Metric definition: The metrics of interest must be identified, considering the model information needed to diagnose the system performance. In this work, the selected metrics (reliability and availability) can be important in the end user's perception and useful for the system administrators.
- Parameter definition: The parameters that will be inserted in the model are defined here. These parameters define the behaviour and capability of each component's features.
- Analytical model generation: A performance model using a Petri net is developed. It is built considering the defined metrics and parameters and the expected results. The Petri net model was chosen because the scenario has several components that need a specific level of abstraction.
- Sensitivity analysis: Using DoE, the analysis presents impacts considering predefined factors and levels. DoE enables us to identify the most relevant factors for the results of the chosen metrics and how the interaction between the factors and variations in their levels impact performance.
- Scenario selection: Some scenarios are created for performance analysis. This step defines which scenarios can represent the reality of a blockchain system. Scenarios will be chosen to analyse the most important factors considering the sensitivity analysis results.
- Performing the scenario evaluation: The constructed scenarios are evaluated using the Petri net model through simulation. In each scenario, the factors are varied and the metrics are analysed, allowing us to observe under which configurations the system performs satisfactorily.

Sensitivity analysis
In this work, a Design of Experiments (DoE) was carried out. DoE is a collection of statistical techniques that deepen the knowledge about the product or process under study [32]. It can be defined as a series of tests in which the researcher changes the variables or input factors to observe the output responses. The parameters to be changed are defined using an experiment plan. The objective is to generate the largest amount of information with the fewest experiments possible. System behaviour under parameter changes can be observed using the output sets. Table 3 presents the factors used in constructing the DoE. The execution of the DoE seeks to identify the factors that most influence the system. In this analysis, the MTTF and MTTR of the components were chosen as factors and availability as the dependent variable, because it is the aspect most perceptible to the end user. Table 4 presents the MTTFs and MTTRs for the current case study. Equation 1 shows the expression used to calculate the availability, where P represents the probability of there being a token in both WAT_U and W_DOC_U. DoE was applied considering twenty factors: HW-MTTF, HW-MTTR, OS-MTTF, OS-MTTR, MongoDB-MTTF, MongoDB-MTTR, Python-MTTF, Python-MTTR, NodeJS-MTTF, NodeJS-MTTR, Nginx-MTTF, Nginx-MTTR, Dashboard-MTTF, Dashboard-MTTR, RestServer-MTTF, RestServer-MTTR, Watchdog-MTTF, Watchdog-MTTR, Docker-MTTF, and Docker-MTTR. Each factor has two levels of variation: 50% higher and 50% lower.
Figure 4 presents the Pareto chart for the factors related to the availability metric.
When a factor has a high impact on the tests, very different values are obtained when changing its level. According to the p-values found, the MongoDB-MTTF factor has the greatest impact among the factors in this study, followed by Docker-MTTR and OS-MTTF. Therefore, the choice of database, and in particular its time to failure, is important for system availability. Watchdog-MTTF and MongoDB-MTTR have the least impact on the system. Figure 5 shows the main effects graph for availability. The graph shows the availability obtained in the tests at each level. In this graph, the more horizontal the line, the less influence the factor has, since a horizontal line means that the different levels of the factor lead to similar results. The MongoDB-MTTF, Docker-MTTR, OS-MTTF, Dashboard-MTTR, and Python-MTTR factors had the greatest impact.
$$A = P\{(\#WAT\_U > 0)\ \text{AND}\ (\#W\_DOC\_U > 0)\} \tag{1}$$
Fig. 4 Influence of MTTR and MTTF Factors
Figure 6 presents the interaction graph. An interaction occurs when a change in the level of one factor changes the influence of another factor on the result. If the lines of the graphs are parallel, there is no interaction between the factors. In general, there was little interaction between the factors. However, we can highlight the interaction between the MongoDB-MTTF and Python-MTTR factors, which was the largest, appearing at the −50% level, even if in a minimal way. This demonstrates that if the evaluator opts for the MongoDB database and takes its MTTF into account, the best choice is the +50% MTTF level. The interaction between MongoDB-MTTF and Docker-MTTR was similar to the one mentioned above, and the same choice criterion applies if the database chosen is MongoDB. For the interaction between Docker-MTTR and OS-MTTF, if the evaluator considers the Docker MTTR, the best choice falls within the +50% level.
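To make the main-effect and interaction calculations concrete, the sketch below runs a two-level full factorial design (each factor at −50% and +50%) on a deliberately simplified availability function. Everything here is an assumption for illustration: the response is a series approximation with independent components rather than the SPN of Fig. 2, only four hypothetical factors are used instead of twenty, and the base values are not those of Table 4.

```python
from itertools import product

# Illustrative base values in hours (assumptions, not the values of Table 4).
BASE = {
    "DB-MTTF": 1000.0, "DB-MTTR": 1.0,
    "Docker-MTTF": 3000.0, "Docker-MTTR": 1.0,
}


def availability(params):
    """Series approximation: both components must be up (independence assumed)."""
    a_db = params["DB-MTTF"] / (params["DB-MTTF"] + params["DB-MTTR"])
    a_docker = params["Docker-MTTF"] / (params["Docker-MTTF"] + params["Docker-MTTR"])
    return a_db * a_docker


factors = list(BASE)

# Full two-level factorial: each factor at -50% (coded -1) or +50% (coded +1).
runs = []
for levels in product((-1, 1), repeat=len(factors)):
    params = {f: BASE[f] * (1 + 0.5 * lv) for f, lv in zip(factors, levels)}
    runs.append((dict(zip(factors, levels)), availability(params)))


def main_effect(factor):
    """Mean response at the high level minus mean response at the low level."""
    high = [y for lv, y in runs if lv[factor] == 1]
    low = [y for lv, y in runs if lv[factor] == -1]
    return sum(high) / len(high) - sum(low) / len(low)


def interaction_effect(f1, f2):
    """Mean response where the coded levels agree minus where they differ."""
    same = [y for lv, y in runs if lv[f1] * lv[f2] == 1]
    diff = [y for lv, y in runs if lv[f1] * lv[f2] == -1]
    return sum(same) / len(same) - sum(diff) / len(diff)


# Rank factors by the magnitude of their main effect (Pareto-style ordering).
for f in sorted(factors, key=lambda name: abs(main_effect(name)), reverse=True):
    print(f"{f:12s} main effect = {main_effect(f):+.6f}")
print(f"DB-MTTF x Docker-MTTR interaction = "
      f"{interaction_effect('DB-MTTF', 'Docker-MTTR'):+.8f}")
```

With twenty factors a full factorial would need 2^20 runs, so a fractional design or a screening design is the usual choice in practice; the sketch keeps the full design only because it uses four factors.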

Extended models
This section presents the structure of the extended model applying Cold, Warm and Hot Standby redundancies [33-35]. The MTTF and MTTR values for the extended models are the same as those presented in Table 4. The time to activate the redundant server, assigned to the SWITCH_TIME transition, was 0.0833333 h (5 min), extracted from [36]. The characteristics of the redundant models are presented in Table 5.

Model two: cold standby
A major limitation of the base proposal is that if one of the two servers fails, the entire system will stop working. We have only considered the redundancy of the MongoDB component to assess whether there is an improvement in availability. MongoDB_01 and MongoDB_02 are always connected; however, MongoDB_02 is instantiated only if MongoDB_01 fails. Therefore, the cold standby redundancy mechanism is applied when the main component fails, keeping the system in operation after a component failure. Figure 7 presents an overview of the model in cold standby. The DoE analysis identified that MongoDB has the greatest impact, so MongoDB was the component chosen for redundancy. When the component fails, the redundant MongoDB is triggered so that the system continues to be fed by the database and can store data normally. When the first MongoDB is repaired, the redundancy is disabled, as it is only needed in case the main component fails. Only if both groups fail will the system become unavailable. Table 6 presents the guard expressions of the extended model.

Model three: warm standby
Here, only the redundancy of MongoDB was considered to evaluate availability. We have MongoDB_01, W_MongoDB and W_MongoDB_01. Both components are always connected. When W_MongoDB crashes, nothing happens, and the system continues to be fed with the data present in MongoDB_01. However, if MongoDB_01 goes down, W_MongoDB_01 is automatically activated, and the system is fed with the data provided by the redundant database. If W_MongoDB_01 also fails, the system becomes idle.
Figure 8 presents an overview of the SPN model with the proposed extension of the base model. When MongoDB fails, the redundancy is triggered so that the system continues to be fed by the database. When the group 1 MongoDB is repaired, the opposite component (group 2) is disabled. If both MongoDB instances fail, the system becomes unavailable. Table 7 presents the guard conditions used for system operation in the extended model. In this case, using guard conditions was of great help in avoiding visual pollution of the model, since several connections had to be made.

Model four: hot standby
In the model based on Hot Standby, the number of tokens of the component is doubled to make it redundant. In Hot Standby redundancy, the faulty module is replaced without significant delay, as the spare modules are also powered. Figure 9 shows the model with Hot Standby of the MongoDB component with increased capacity, where MONG_U works with double capacity.
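As a rough intuition for why hot standby helps, the availability of a component replicated in hot standby can be approximated in closed form. The sketch below assumes that the replicas are identical, fail and are repaired independently, and switch over instantly; these are simplifying assumptions and not properties of the SPN in Fig. 9. The function names and the MTTF/MTTR values are illustrative.

```python
def component_availability(mttf_h: float, mttr_h: float) -> float:
    """Steady-state availability of one repairable component."""
    return mttf_h / (mttf_h + mttr_h)


def hot_standby_availability(a_single: float, replicas: int = 2) -> float:
    """Probability that at least one of `replicas` independent units is up."""
    return 1.0 - (1.0 - a_single) ** replicas


# Illustrative values in hours (assumptions, not the values of Table 4).
a_db = component_availability(mttf_h=1000.0, mttr_h=2.0)
a_db_hot = hot_standby_availability(a_db)

print(f"single database : {a_db:.6f}")
print(f"hot standby pair: {a_db_hot:.6f}")
```

Cold and warm standby do not reduce to such a simple expression, because the spare is activated only after a switch-over delay and may have a different failure behaviour while idle; those cases are better handled by the SPN models themselves.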

Model five: reliability
Figure 10 presents the SPN reliability model for the blockchain architecture of the baseline scenario. This model comprises the thirteen components of the system. The MTTF transitions trigger each component's change from active to inactive status. Each component can operate independently if the number of tokens in UP equals the marking value. This model is a variation of the base SPN model obtained by removing the MTTR transitions from all components, so once components fail, they cannot be repaired. All input parameters are the same as in the base model. This model aims to show the probability that the system continues working as a function of time.
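For intuition, when every component is required, failures are exponential and independent, and no repair is possible, the reliability has the closed form R(t) = exp(−t · Σ 1/MTTF_i). The sketch below evaluates this expression and shows the effect of scaling the MTTFs. It is a simplified series approximation under those assumptions, not the SPN of Fig. 10; the component list and MTTF values are illustrative rather than taken from Table 4, and the `scale` argument is applied to all components for brevity.

```python
import math

# Illustrative MTTFs in hours (assumed values, not those of Table 4).
MTTF_H = {
    "HW": 90000.0,
    "OS": 60000.0,
    "DB": 40000.0,
    "Docker": 50000.0,
}


def system_reliability(t_h: float, mttf_h: dict, scale: float = 1.0) -> float:
    """R(t) for a series system of non-repairable exponential components."""
    total_rate = sum(1.0 / (m * scale) for m in mttf_h.values())
    return math.exp(-total_rate * t_h)


for scale in (1.0, 1.25, 1.5, 1.75):  # base value, +25%, +50%, +75%
    values = [system_reliability(t, MTTF_H, scale) for t in (0, 10000, 36000)]
    print(f"scale {scale:.2f}: " + ", ".join(f"{v:.4f}" for v in values))
```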
Reliability was also assessed using DoE. Figure 4 shows the Pareto chart. The HW, OS, MongoDB, NodeJS, Docker and Dashboard components impact the reliability the most. Figure 11 shows the reliability by varying the MTTF of those components with the greatest impact. The MTTF of the components was varied between base value, base value plus 25%, base value plus 50% and base value plus 75%.
Execution time and reliability are inversely related: the longer the execution time, the lower the system's reliability. In all configurations, the steepest decline occurred up to Time = 50000 h; after that, reliability keeps decreasing and tends to stabilize towards the end of the experiment. We can also observe that the points where the reliability tends to a value smaller than 0.099 are: (i) base configuration = 45000 h; (ii) base configuration plus 25% = 52500 h; (iii) base configuration plus 50% = 61500 h; and (iv) base configuration plus 75% = 73500 h. At the starting point, the entire system began its execution with reliability at 100%. Therefore, the longer the time to failure, the greater the reliability, as the system will operate longer. Figure 12 presents the bar graph for reliability. For better visualization, the runtime is divided into four time windows so that we can evaluate the behaviour over time: T1 goes from 0 h to 36000 h, T2 from 36001 h to 73500 h, T3 from 73501 h to 111000 h, and T4 from 111001 h to 150000 h. The base model has the highest reliability because the system has little execution time, and thus the probability of system failure is much lower. The longer the system remains active, however, the less efficient the base model proves to be in terms of reliability. Improvements in the HW, OS, MongoDB, NodeJS, Docker and Dashboard components are not relevant for runtimes lower than 36000 h; the need for such improvements should be considered only for runs longer than 36000 h. It was also observed that the reliability values are inverted as the running time of the system increases: the base model, which showed the highest reliability at the beginning of the execution, showed the lowest reliability at the end of the experiment.

Model comparison
This section presents the availability analysis of the four models presented in the paper. Figure 13 presents the bar graph for availability. The variation in the availability of each model was observed, as well as the impact of each component on the architecture. The proposed models were evaluated using the Mercury Script Language tool [37]. In this study, the tests assessed the reliability and availability of the presented models. In the simulations, the Hot Standby redundancy proved superior to the other models in terms of availability.
The base architecture had 141 h of unavailability, equivalent to a steady-state availability of 98.3%. Cold redundancy gave a better result, totalling 126.775 h of unavailability and an availability of 98.4%. Warm redundancy presented a better result than both Cold redundancy and the base model, with an unavailability time of approximately five days and an availability of 98.5%. Hot redundancy presented the best result, with only 77 h of unavailability and an availability of 99%.
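Availability and downtime are two views of the same quantity, so one can be converted into the other once an observation period is fixed. The sketch below assumes a period of one non-leap year (8760 h); the text does not state the period, so this is an assumption, and results obtained with it may differ in the last digit from the figures quoted above because of rounding. The function names are illustrative.

```python
HOURS_PER_YEAR = 8760.0  # assumption: downtime is reported per non-leap year


def availability_from_downtime(downtime_h: float,
                               period_h: float = HOURS_PER_YEAR) -> float:
    """Fraction of the period during which the system is up."""
    return 1.0 - downtime_h / period_h


def downtime_from_availability(availability: float,
                               period_h: float = HOURS_PER_YEAR) -> float:
    """Hours of unavailability within the period."""
    return (1.0 - availability) * period_h


for label, downtime_h in (("base", 141.0), ("hot standby", 77.0)):
    a = availability_from_downtime(downtime_h)
    print(f"{label:12s} downtime = {downtime_h:6.1f} h -> availability = {a:.4f}")
```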
In the previous scenarios, the MongoDB failure time factor was analysed individually, creating redundancies and testing system availability. Such analyses allowed us to observe in detail the factor with the greatest impact on the metrics. However, in addition to the isolated impact of each factor on the behaviour of the system, the DoE analysis showed that there is an interaction between two factors that affects availability, as shown in the Pareto chart (Fig. 4) and the interaction graph (Fig. 6). These graphs only indicate the existence and magnitude of the interaction. Therefore, this section shows the variation between the two factors. Table 8 presents the combinations of the factors analysed in this scenario.
Figure 14 presents a 3D surface graph showing the system availability when two high-impact factors are varied. Colours are related to the resulting availability, and the bar on the right indicates the magnitude of the results. The upper part indicates the highest availability, and the lower part indicates the lowest. Therefore, purple represents the lowest availability, and red represents the highest. It is worth highlighting the projection at the top of the graph, which makes the interaction between the factors easier to see.
Changing the MongoDB MTTF has a greater impact than changing the Docker MTTR. Red is present in most of the projection, indicating a high availability of the system. Purple corresponds to the availabilities at the bottom of the chart. If a lower MongoDB time to failure and a higher Docker recovery time are adopted, the system availability drops, showing that the Docker recovery factor is relevant to the system availability. Therefore, the result indicates that it is often beneficial to invest in reducing the Docker recovery time and thus improve availability.

The challenge of dealing with availability in blockchain environments
The availability of a blockchain system is a potential problem. Specifically, transaction throughput and latency remain consistent challenges, and, in general, blockchain systems cannot cope as the volume of transactions increases. While theoretical analysis of a platform may provide an idea about its performance, only benchmarking and implementation can provide a real-world use analysis. In considering the performance of blockchain systems in practice, Anh et al. [38] developed their BLOCKBENCH framework to analyse blockchains as data processing platforms, using both micro- and macro-benchmarking workloads. The framework was used to compare the Ethereum, Parity, and Hyperledger blockchains against the in-memory database system H-Store, finding that the throughputs in transactions per second of Parity, Ethereum, and Hyperledger were each an order of magnitude apart, in order from lowest to highest, and that H-Store was one to two orders of magnitude higher than Hyperledger. In general, they also found Hyperledger to have the highest throughput, fastest execution, lowest peak memory usage, and moderate to low latency, seemingly performing the best of the three.
Likewise, Weber et al. [39] considered the availability limitations of Bitcoin and Ethereum, and measured the time for transactions to commit. Specifically, they observed that some transactions never commit due to the blockchain design, noting the lack of abort and retry functions. In addition, Pongnumkul et al. [40] conducted a performance analysis of Hyperledger Fabric (HF) and Ethereum in private deployment, developing a methodology for blockchain analysis. Their results showed that, while HF consistently performed better across all metrics, including latency and throughput, neither platform can be considered competitive with current database systems in high-workload scenarios.
Considering these results, we can analyse the applicability of blockchain systems based on the target use by considering the number of transactions necessary to be served in a target time frame. In the case of IoT devices, private blockchains may be suitable, as the number of measurements for any single device will be small. Nonetheless, as we scale to larger IoT-based smart-world systems serving massively distributed devices, or big data systems that act on an unprecedented number of data items, the ability to apply blockchain becomes more difficult.

Conclusion
This paper proposed stochastic Petri net models for a blockchain architecture to help system administrators plan computer system architectures. The models consider several factors that influence the total availability of the system. Among the factors examined in the sensitivity analysis, the MongoDB component has the greatest impact on the availability and reliability of the system. The sensitivity analysis also identified HW, OS, NodeJS, Docker, and Dashboard as significant components. Significant improvements in these components will increase the availability and reliability of the blockchain network. Regarding reliability, it was also noted that improving the components is unnecessary if the system has a runtime lower than 36,000 h. The models provide accurate availability and reliability metrics. The models were also demonstrated in four case studies, which provide a practical guide showing how a system administrator can apply the model to assess various configurations of a blockchain architecture. In future studies, we intend to expand the complexity of the models by introducing more places of probable failure. We also intend to improve the Master component's functionality and evaluate the concept in a real blockchain infrastructure.

Fig. 1 Illustration of the architecture of a system that uses Hyperledger Cello

Fig. 2 SPN composed of the master and worker nodes

Figure 3 presents a flowchart that summarizes the strategy used in this work as a research methodology composed of eight steps. Understanding the application: It is important to understand how the application works, how many components are involved, and how data flows through the system, for example, where the data will be sent after passing through component 'x'. Metric definition: The metrics of interest must be identified, considering the model information needed to diagnose system performance. In this work, the selected metrics (reliability and availability) matter to the end user's perception and are useful to system administrators. Parameter definition: The parameters that will be inserted in the model are defined here. These parameters define the behaviour and capability of each component's features. Analytical model generation: A performance model is developed using a Petri net, taking into account the defined metrics and parameters and the expected results. A Petri net was chosen because the scenario has several components that require a specific level of abstraction. Sensitivity analysis: Using DoE, the analysis shows the impact of predefined factors and levels. DoE enables us to identify the most relevant factors for the chosen metrics and how the interactions between factors and variations in their levels affect performance. Scenario selection: Scenarios are created for performance analysis. This step defines which scenarios can represent the reality of a blockchain system. Scenarios are chosen to analyse the most important factors according to the sensitivity analysis results. Performing the scenario evaluation: The constructed scenarios are evaluated through simulation using the Petri net model. In each scenario, the factors are varied and the metrics are analysed, allowing us to observe under which configurations the system performs satisfactorily.
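The sensitivity analysis step can be sketched with a two-level full factorial design. This is only an illustration: it replaces the SPN model with a simple series availability formula (each component contributes MTTF / (MTTF + MTTR)), and the MTTF levels, the shared MTTR, and the choice of three components are hypothetical values rather than parameters from this work.

```python
from itertools import product
from statistics import mean

MTTR_HOURS = 2.0  # hypothetical repair time shared by all components

# Hypothetical (low, high) MTTF levels in hours for each factor.
LEVELS = {
    "HW": (4000.0, 8000.0),
    "OS": (2000.0, 4000.0),
    "MongoDB": (500.0, 1000.0),
}


def availability(mttf_by_component: dict) -> float:
    """Steady-state availability of components in series (stand-in for the SPN)."""
    result = 1.0
    for mttf in mttf_by_component.values():
        result *= mttf / (mttf + MTTR_HOURS)
    return result


def main_effects(levels: dict) -> dict:
    """Main effect of each factor: mean response at high level minus at low level."""
    names = list(levels)
    runs = []
    # Enumerate every low/high combination (2^k runs for k factors).
    for combo in product((0, 1), repeat=len(names)):
        setting = {name: levels[name][lvl] for name, lvl in zip(names, combo)}
        runs.append((combo, availability(setting)))
    effects = {}
    for i, name in enumerate(names):
        high = mean(resp for combo, resp in runs if combo[i] == 1)
        low = mean(resp for combo, resp in runs if combo[i] == 0)
        effects[name] = high - low
    return effects


effects = main_effects(LEVELS)
ranking = sorted(effects, key=effects.get, reverse=True)
print(ranking)
```

The ranking produced here follows directly from the assumed levels and says nothing about the real system. In practice, the response for each run would come from evaluating the SPN model rather than the closed-form product.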

Fig. 5 Main effects for availability

Fig. 6 Influence of factors in relation to availability

Fig. 7 SPN model with cold standby redundancy

Fig. 11
Fig. 9 SPN Model with Hot Standby Redundancy

Fig. 13 Availability of the four models varying redundancy in the MongoDB component

Table 1 Related work comparison

Table 5 Features of redundant models

Table 6 Extended model guard expressions using Cold Standby

Table 7 Extended model guard conditions using Warm Standby