A QoS-aware scheduling with node grouping for IEEE 802.11ah

The recent IEEE 802.11ah amendment has proven to be suitable for supporting large-scale devices in Internet of Things (IoT). It is essential to provide a minimum level of Quality of Service (QoS) for critical applications such as industrial automation and healthcare. In this paper, we propose a QoS-aware Medium Access Control (MAC) layer solution to enhance network reliability and reduce critical traffic latency by an adaptive station grouping and a priority traffic scheduling scheme. First, a link layer representation of traffic categories as per the delay and reliability requirements is proposed. Second, a novel backoff size-based slot scheduling scheme for Restricted Access Window (RAW) is proposed to support QoS. Third, a grouping scheme is proposed to calculate the current traffic load and balance it among different RAW groups. Finally, a Markov-chain model is developed to study the throughput and latency behaviors of the traffic generated from the critical application. The proposed protocol shows significant delay improvement for priority traffic. The overall throughput performance improves up to 12.7% over the existing RAW grouping scheme.


Introduction
The deployment of the Internet of Things (IoT) involves installing a huge number of nodes to allow seamless communication. The nodes, deployed over a large area, are capable of sensing, actuating, or tracking [1]. IEEE 802.11ah [2] is a recent standard that introduces several new concepts suitable for IoT. It incorporates features such as improved range (up to 1 km in a single hop), low power consumption, a high data rate (up to 78 Mbps), hierarchical addressing, and grouping of nodes. The new hierarchical address used in the scheme is 13 bits long, leading to a maximum of 8191 associated stations. To reduce collisions among such a large number of stations, 802.11ah proposes a Restricted Access Window (RAW) mechanism.
In IoT, data flows are mainly of three types: event-driven, query-driven, and continuous [3]. Event-driven and query-driven traffic flows are generated only when there is an event or query, respectively. In an IEEE 802.11ah network supporting up to 8191 stations (STAs), and more across multiple Basic Service Sets (BSSs), such traffic diversity is expected. Low-rate traffic is usually generated by alarms/commands from the control center to the corresponding authorities/actuators. For example, there may be a huge amount of on-demand reporting traffic from thousands of smart-grid STAs [4]. In a power outage, a huge number of STAs simultaneously try to report the failure before the battery dies [5]. Prior scheduling is important for improving reliability for such event-driven traffic. Also, many critical applications need to be processed within a given time bound.
Furthermore, there are IoT applications, such as healthcare and fire alarms, where data are sensitive in terms of reliability and delay, so provisioning QoS is important but challenging. It is expected that 802.11ah will connect multiple types of IoT applications in the same network. While running all these applications together, QoS in terms of provisioning priority over one another will be challenging. Also, real-time audio or graphics data streaming is delay-sensitive and has a certain requirement of bandwidth. The overall IoT control loop (sensor to the actuator) may take a huge delay with an increasing number of relays.
Unfortunately, the design of 802.11ah does not specifically consider these dynamic behaviors.
The RAW mechanism of IEEE 802.11ah divides the time frame and allows only a particular group of STAs to transmit in each part. Several issues arise when dealing with such a massive number of STAs with heterogeneous traffic flows. Grouping of STAs has been proposed to improve the performance of densely deployed networks. With the motivation of reducing collisions during contention, the grouping is based on the Association Identifier (AID) value, which is assigned according to association time. After association, STAs are grouped in sequence. Therefore, a STA with a higher AID may be deferred from a possible transmission. The grouping approach is further utilized in existing works such as [6][7][8] to improve real-time traffic, reduce hidden nodes, and improve slicing. Critical and command-driven traffic demands special processing over continuous data flows to meet different requirements. However, these solutions do not consider traffic types, QoS requirements, or the distribution of traffic load among the groups. In such cases, dynamic and adaptive grouping mechanisms can help a network achieve better performance.
Moreover, numerous solutions have been developed to provide better QoS support in large-scale IoT scenarios. State-of-the-art solutions such as [7,9,10] achieve QoS in terms of delay and throughput to a certain extent by slot scheduling over the RAW frame. However, the scheduling problem is limited to the size of the RAW frame. Current research works focus primarily on estimating the RAW size based on the number of STAs or the traffic load of a particular group. As RAW restricts the transmission of a station, provisioning priority in transmission is challenging. The scheduling scheme should look into all the RAW groups and make a global scheduling decision. With this motivation, this work addresses how to provide QoS at the link layer over RAW. There is a need for a consistent QoS mechanism that spans the different groups and time frames in the network. To this end, we propose a QoS-aware priority RAW scheduling and adaptive load-aware node grouping scheme named QS-MAC. The proposed solution can optimize channel access among different groups in a highly dense network. The key contributions of this paper can be summarized as follows:
- A QoS-aware RAW slot assignment and fairness method to handle the traffic flow of critical applications. Based on the criticality of traffic in terms of delay and reliability, the proposed scheduling algorithm gives priority to frames belonging to the applications that need it most.
- A load balancing scheme among different RAW groups, using dynamic AID allocation, to improve network reliability. It calculates the ease of transmission for each AID group and uses an STA-initiated dynamic AID mechanism for load balancing.
The rest of the paper is organized into four sections. Background and related works are discussed in Sect. 2. Section 3 presents the proposed MAC layer solution for IEEE 802.11ah. Section 4 gives the performance evaluation and analysis of the proposed scheme. Finally, Sect. 5 concludes the paper. Table 1 shows some important notation used in this paper.

Background and related works
This section discusses the different QoS requirements of various IoT applications and their support in 802.11ah. We discuss some of the related solutions while provisioning different QoS supports. Finally, we provide an analysis of the existing RAW scheme.

QoS provisioning in IoT networks
QoS is defined as a set of quality criteria for a particular service [11]. QoS in IoT depends on the characteristics of the objects and applications. Different rules/criteria need to be set due to their various characteristics. The traffic in IoT applications can be event-driven, query-driven, or continuous. Low-rate traffic is usually generated by alarms/commands from the control center to the corresponding authorities/actuators. For example, there may be a massive amount of on-demand reporting traffic from thousands of smart-grid STAs [4]. In a power outage, a vast number of STAs simultaneously try to report the failure before the battery dies [5]. Prior scheduling is essential for improving reliability for such event-driven traffic. Also, many critical applications need to be processed within a given time bound. By anticipating the behavior of event-driven and query-driven traffic, a scheduling MAC protocol can improve performance to a great extent. Further, a dynamic node grouping scheme can help to utilize network resources efficiently. The MAC solution for 802.11ah does not consider these requirements.

Related works
IEEE 802.11-based Wireless local area network (WLAN) widely uses Distributed Coordination Function (DCF) for channel access which mainly supports best-effort traffic flows. Therefore, IEEE 802.11 Task Group proposed Enhanced Distributed Channel Access (EDCA) in the IEEE 802.11e [12] to provide prioritized services. In the EDCA, traffic of different priorities is assigned to one of four transmit queues, corresponding to four Access Categories (ACs), by adjusting arbitration inter-frame space and maximum backoff window size. After that, a large number of works discussed the QoS effort in different traffic conditions like saturated [13][14][15][16] and unsaturated [17,18]. Similarly, 802.11ah also adopts the EDCA-based priority access control mechanism. However, concerning different QoS requirements of IoT, application-based QoS support requires grouping, and the RAW-level mechanism is essential.
There have been efforts to improve the throughput and delay performance of the channel access mechanism through grouping and RAW optimization. Lei et al. [19] propose a RAW grouping approach based on the Tx requests received by the AP. A RAW begins with a short time period called the Control Traffic Window (CTW), in which devices reserve their channel access time, followed by a Transmission Window (TW). The sequence of successful channel reservations is maintained in the TW for data transmission. However, pre-scheduling the channel is not a suitable solution for event-driven traffic. A similar approach in [4] considers the reporting activity of an alarm-based application. The RAW is a periodically recurring pool of time slots, the size of which can be dynamically tuned based on the reporting activity in the cell. However, in a heterogeneous traffic scenario where a STA may transmit at any time and at any rate, the protocol fails to fulfill the traffic requirements. Energy efficiency, traffic load, and node dynamicity are not considered in the RAW group optimization. There are also solutions that optimize RAW to improve throughput performance. Considering the current backoff stage of the STAs, Hamzi et al. [20] estimate the RAW size at the AP. To provide guaranteed QoS for delay-sensitive STAs, Charaniya et al. [21] categorize RAW operation into slot reservation-based access for Delay Sensitive Machine type Devices (DSMDs) and conventional 802.11ah-based access for non-Delay Sensitive Machine type Devices (non-DSMDs). Ahmed et al. [22] schedule the RAW slots of a group according to the priority of STAs. Traffic is classified as critical (higher priority) or periodic (relatively low priority). However, the work does not discuss the effect on group load, RAW, and fairness. Moreover, detailed deployments considering different geographic locations and efficiency are not measured.
The authors of [10] analyze control-loop latency for delay- and jitter-sensitive traffic. Park et al. [23] estimated the number of uplink STAs to determine the size of the RAW. Zhao et al. [24] optimized the performance of the RAW mechanism in terms of power consumption and showed that energy efficiency in the sensor nodes improves with an increasing number of RAW groups.
Recently, real-time scheduling schemes such as [25,26] have been proposed for provisioning QoS for various applications. For QoS improvement, the RAW is divided and allocated in [27] based on the delay requirements of STAs. Tian et al. [7] suggested an optimization algorithm to judiciously define the grouping parameters based on real-time traffic and improved network efficiency. Šljivo et al. [9] proposed an immediate reply scheme for the 802.11ah AP to reduce downlink latency. The above mechanisms improve RAW performance to achieve a better success rate. Tian et al. [7] and Šljivo et al. [9] improve delay performance by considering the real-time nature of traffic. Seferagic et al. [10] enhance QoS by considering the overall control loop frame, taking into account delay and reliability for sensitive and periodic traffic over the RAW frame. Further, Kia et al. [28] improve energy efficiency with the help of a load-aware grouping scheme, without considering the QoS requirements of an application.

Summary
For real IoT scenarios, traffic classification and priority scheduling for delay-sensitive applications according to their requirements need to be considered. Furthermore, load balancing across the complete network is essential to utilize the available bandwidth optimally. The analysis of the state of the art reveals a research gap in provisioning QoS for real-time priority traffic over limited channel bandwidth in 802.11ah. To address this issue, we propose a RAW scheduling and load-aware node grouping scheme for 802.11ah called QS-MAC.

A QoS-aware scheduling with node grouping for IEEE 802.11ah
We propose a RAW scheduling and node grouping scheme to provide QoS for IoT applications. The proposed protocol provides QoS in terms of delay and reliability in a large-scale 802.11ah network. Considering the requirements of critical IoT applications, the proposed scheme dynamically schedules and regroups STAs. It works primarily in two phases: (i) priority slot scheduling within a RAW group, and (ii) adaptive grouping within the network. Before discussing these two algorithms, we present the problem space and system model.

Problem formulation
The AP in IEEE 802.11ah transmits a beacon, which also broadcasts the RAW Parameter Set (RPS) information at the start of the beacon interval. Within the interval, there may be one or more RAWs, each for a group of STAs. The STAs belonging to a RAW group are allowed to contend within the assigned slot duration. The slot a STA obtains in a RAW is decided by a mapping function, as shown in Eq. (1):
$$x = (i + N_{offset}) \bmod S_{RAW} \qquad (1)$$
where $x$ is the slot number in a RAW frame of size $S_{RAW}$, the offset value $N_{offset}$ improves fairness among the STAs in a RAW, and $i$ is the position index or AID of the STA. If the STA is already paged, it uses its AID. Otherwise, the position index is used. The RAW may be restricted to STAs whose AID bits in the TIM element are set to 1. The slot duration ($T_x$) is calculated from the slot duration count ($S_c$) specified in the RPS as:
$$T_x = 500\,\mu s + S_c \times 120\,\mu s$$
where $S_c$ depends on the value of $k$ ($S_c = 2^k - 1$), which is the number of bits in the sub-field. If the slot format field is set to 0, $k = 11$; otherwise, for 1, $k = 8$.
Again, the size of $S_{RAW}$ is calculated as $2^{14-k}$ from a 14-bit field; hence the maximum number of slots in a RAW is 8 for $k = 11$ and 64 for $k = 8$. The RAW duration of a group is therefore $T_{RAW} = T_x \times S_{RAW}$, and its maximum is 1.96 s for $k = 11$ and 1.99 s for $k = 8$. The complete time frame ($T_{FRAME}$) for a network with all associated STAs can be calculated as:
$$T_{FRAME} = N_{RAW} \times T_{RAW}$$
where $N_{RAW}$ is the total number of RAW groups in the network. If we consider $N_{RAW} = 10$, a STA may need to wait for $T_{wait} = 19.9 - T_x$ seconds for its transmission. Such a delay is not tolerable for critical and rare traffic flows. Once a STA finds its RAW slot, it uses a DCF-based contention mechanism. This second step needs an access time $T_{ac}$ to get its slot for transmission. Access delay in the DCF scheme depends on the probability of successful transmission in a slot. An increase in the number of groups can lead to a reduction in collisions. Hence throughput improves beyond a threshold number of groups. However, the latency of a STA will be very high if the number of groups is high, as it may need to wait for $T_{wait}$ until its assigned RAW and slot. In such a case, the primary concern in a network is to find the optimal RAW and group size based on throughput, delay, and energy consumption. Also, for uplink traffic, the main issue in a large-scale network is to reduce collisions due to contention.
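The timing figures quoted above can be reproduced with a short calculation. The sketch below assumes the slot duration rule $T_x = 500\,\mu s + S_c \times 120\,\mu s$ and takes $S_c$ at its maximum $2^k - 1$; the function name `raw_timing` and its return keys are illustrative only.

```python
def raw_timing(slot_format: int, n_raw: int = 1) -> dict:
    """Return the maximum slot, RAW, and frame durations (seconds) for a slot format."""
    k = 11 if slot_format == 0 else 8   # bits in the slot-duration-count sub-field
    s_c = 2 ** k - 1                    # largest slot duration count
    t_x = (500 + 120 * s_c) * 1e-6      # assumed rule: 500 us + S_c * 120 us
    s_raw = 2 ** (14 - k)               # maximum number of slots in one RAW
    t_raw = t_x * s_raw                 # duration of one RAW group
    t_frame = n_raw * t_raw             # all RAW groups served back to back
    return {"k": k, "s_raw": s_raw, "t_x": t_x, "t_raw": t_raw, "t_frame": t_frame}


if __name__ == "__main__":
    for fmt in (0, 1):
        r = raw_timing(fmt, n_raw=10)
        print(f"k={r['k']}: slots={r['s_raw']}, T_RAW={r['t_raw']:.4f} s, "
              f"T_FRAME={r['t_frame']:.3f} s")
```

With ten groups and $k = 8$, the frame lasts about 19.9 s, which is why a STA outside the current RAW can face a wait of many seconds.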
As the existing mechanism schedules one group of stations at a time and defers the others to separate RAWs, meeting the requirements of event-based transmission is difficult. For example, if a STA is not in the current RAW, it is not allowed to transmit even if it has a packet. As shown in Fig. 1, although an event has occurred for a STA in Group 5, it cannot transmit because Group 1 is currently scheduled.

System model
The 802.11ah technology operates in the sub-1 GHz band, which extends connectivity over distances up to 1 km in a single hop. Relay node support may increase this distance further. An IoT architecture using IEEE 802.11ah is shown in Fig. 2. The network consists of an AP (A) and a set of stations $S = \{S_1, S_2, \ldots, S_n\}$, where $n$ is the maximum number of stations. The STAs are divided into a set of groups $S = \{G_1, G_2, \ldots, G_{N_{RAW}}\}$, where $N_{RAW}$ is the number of RAW groups. For a network with $n$ associated STAs, the total number of groups will be $\lceil n/R \rceil$. The network should be able to support various applications without compromising service quality, i.e., QoS. We consider two key QoS metrics, throughput and delay, to capture, for instance, packet delivery and reliability. We believe the built-in EDCA mechanism of 802.11ah (adopted from 802.11e) handles traffic-level QoS by provisioning different Access Categories for voice, video, best effort, and background. The proposed scheme uses group-level and RAW-level priority mechanisms to support the QoS of heterogeneous IoT applications, especially considering their delay and reliability sensitivity. The STAs are categorized into $C$ priority classes, with a class $c_i \in [0, C-1]$ where $c_i > c_{i+1}$, based on the different QoS performance metrics used in IoT (discussed in Sect. 2.1). Here, a lower value of $c$ indicates higher priority. The proposed scheme takes the priority value of traffic as input to schedule it accordingly. We assume that the type of application, e.g., delay or reliability sensitive, is known to the AP. The AP implements two bits in the frame header to represent four different categories. For example, considering bit value '1' for QoS and '0' for non-QoS traffic, classification can be done with QoS restrictions such as latency-sensitive and packet-loss-sensitive. The possible traffic classification can be:
1. QoS Traffic: $c_1 \in \{1, 1\}$; $c_2 \in \{1, 0\}$; $c_3 \in \{0, 1\}$
2. Non-QoS Traffic: $c_4 \in \{0, 0\}$
The binary and decimal representation of the classified traffic can be seen in Table 2.
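The two-bit classification can be sketched as a small lookup. This is a minimal illustration, not the paper's implementation: it assumes the first bit marks latency sensitivity and the second marks packet-loss sensitivity (the text lists them in that order but does not fix the bit positions), and the names `CLASS_BY_BITS`, `classify`, and `header_value` are hypothetical.

```python
# Assumed bit order: (latency_sensitive, loss_sensitive).
CLASS_BY_BITS = {
    (1, 1): "c1",  # QoS: both latency- and loss-sensitive
    (1, 0): "c2",  # QoS: latency-sensitive only
    (0, 1): "c3",  # QoS: loss-sensitive only
    (0, 0): "c4",  # non-QoS traffic
}


def classify(latency_sensitive: bool, loss_sensitive: bool) -> str:
    """Map the two QoS flags of an application to its priority class label."""
    return CLASS_BY_BITS[(int(latency_sensitive), int(loss_sensitive))]


def header_value(latency_sensitive: bool, loss_sensitive: bool) -> int:
    """Decimal value of the two header bits (first bit is the most significant)."""
    return (int(latency_sensitive) << 1) | int(loss_sensitive)


if __name__ == "__main__":
    for flags in [(True, True), (True, False), (False, True), (False, False)]:
        print(flags, classify(*flags), header_value(*flags))
```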
Each of the priority classes is assigned a contention window $W$ with successful transmission probability $p$. The AP is responsible for assigning a lower $W$ to a higher priority class to ensure a better probability of successful transmission in an earlier slot. For example, critical traffic may acquire an initial value from a list of $c$. Due to the heterogeneity and QoS restrictions, the groups carry different loads $\{L_1, L_2, L_3, \ldots, L_{N_{RAW}}\}$ with $L_{G_i} \neq L_{G_j}$. Hence, the AP adaptively balances loads: it increases the number of groups for an additional AID, or swaps an unused AID $j$ with $i$ from an existing group if needed. The proposed load balancing scheme enhances reliability by reducing the number of transmission failures.

Traffic flow identification
A slot scheduling scheme is run for uplink and downlink traffic from the STA and the AP, respectively. The scheme starts at the Target Beacon Transmission Time (TBTT), with the beacon carrying RPS information from the AP. Although most IoT applications generate traffic periodically, the other types of traffic discussed above may be present in the same network. The AP collects transmission behavior from the received frames and predicts each STA's next transmission time. We classify the traffic in the network into two main types, periodic and non-periodic, based on transmission behavior. The past information is stored in a database to predict the expected arrival of the next packet. For example, a STA with a motion sensor placed in a smart parking area sends a packet (i.e., non-periodic) whenever a vehicle arrives. The AP can easily find periodic traffic by monitoring the initial transmissions from a STA. However, finding the next possible transmission of a non-periodic STA is challenging. As 802.11ah maps a set of stations to a slot (using Eq. 1), the Poisson distribution can be used to find the probability of packet transmissions occurring for non-periodic (i.e., event-based) traffic within a time frame. The probability mass function $f(d)$ can be calculated as [29]:
$$f(d) = \frac{\mu^d e^{-\mu}}{d!}$$
Here, $f(d)$ is the probability that $d$ events occur within a time bound, and $\mu$ is the expected rate of occurrence of an event. Probability of several packet transmissions in a range (between $a$ and $b$, where $a < b$) is given as: where $I$ is the average service interval of traffic; the value of $I$ depends on the requirements of the IoT applications. We store the flow rates associated with different links from STAs in an event database.
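As an illustration of the Poisson prediction step, the sketch below evaluates the probability of $d$ transmissions in a frame and of a count falling in a range. It assumes the expected count per frame is the event rate multiplied by the service interval, and that the range is inclusive on both ends; both are assumptions, and the function names are hypothetical.

```python
import math


def poisson_pmf(d: int, expected: float) -> float:
    """Probability of exactly d events when `expected` events occur on average."""
    return math.exp(-expected) * expected ** d / math.factorial(d)


def poisson_range(a: int, b: int, expected: float) -> float:
    """Probability that the event count lies in [a, b] (inclusive, a < b)."""
    return sum(poisson_pmf(d, expected) for d in range(a, b + 1))


if __name__ == "__main__":
    rate = 0.5       # assumed events per second for a non-periodic STA
    interval = 2.0   # assumed average service interval in seconds
    expected = rate * interval
    print("P(no packet)      =", round(poisson_pmf(0, expected), 4))
    print("P(1 to 3 packets) =", round(poisson_range(1, 3, expected), 4))
```

A scheduler could reserve an early slot for a group only when the probability of at least one packet exceeds a chosen threshold.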

Traffic flow-aware slot scheduling
If a STA is ready to transmit, the AP finds an available slot over the RAW frame. The AP checks the priority class $c_j$ and its priority value $j$ for scheduling.
Meanwhile, the proposed protocol ensures an earlier slot for low-priority STAs if higher-priority STAs have not currently requested one. Then, a smaller backoff window is assigned to a higher-priority STA so that it can win an earlier slot. Instead of uniform random backoff time selection, we use a non-uniform random backoff time selection with a truncated distribution defined on $[0, W_i - 1]$, where $i$ is the current backoff stage. Most existing mechanisms use a continuous distribution to model the backoff process. However, due to the discrete nature of slots, we need a discrete backoff distribution for the different classes. Therefore, we use a random time selection scheme that utilizes all the slots at any backoff stage. For any backoff stage $i$, the window size $W_i$ is calculated as [30]:
$$W_i = \begin{cases} 2^{i} W_0, & 0 \le i \le m' \\ 2^{m'} W_0, & m' < i \le m \end{cases}$$
where $m$ is the maximum retry limit and $m'$ is the backoff stage after which the window size remains constant. Similarly, if $W$ in stage $i$ for class $k$ is $W_i^k$, and the maximum possible value of $W$ for class $k$ is $W_{max}^k$, then:

Algorithm 1 (lines 18-22)
18: if Ready to transmit then
19:   Calculate T_FRAME, N_RAW, T_RAW
20:   Allow G_i to transmit in T_RAW
21: else
22:   Wait for next transmission

The probability density function $f(x)$ of the mentioned geometric distribution can be calculated as [31]: where $i \in [0, m]$, $j \in [0, W_i - 1]$, and $\varphi_c \in \mathbb{R}$ controls the increase or decrease of the distribution for a given class $c$. The variation of $f(c, i, j)$ gives different $W$ values, which in turn define the priority of class $c$. For example, if $\varphi_{c_1} < \varphi_{c_2}$, then the priority of class $c_1$ is higher than that of $c_2$.
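The class-dependent backoff selection can be sketched as follows. This is only an illustration under stated assumptions: the window doubles per stage until stage m' and then stays constant, and the truncated geometric distribution is taken to have weights proportional to the class parameter raised to the slot index, normalized over the window. The exact distribution used in the paper may differ, and all function names are hypothetical.

```python
import random


def window_size(stage: int, w0: int, m_prime: int) -> int:
    """Assumed binary exponential window: doubles per stage, frozen after m_prime."""
    return w0 * 2 ** min(stage, m_prime)


def truncated_geometric_pmf(phi: float, window: int) -> list:
    """Probability of each backoff value in [0, window - 1]; weight of slot j is phi**j."""
    weights = [phi ** j for j in range(window)]
    total = sum(weights)
    return [w / total for w in weights]


def draw_backoff(phi: float, window: int, rng=random) -> int:
    """Draw one backoff counter from the truncated geometric distribution."""
    pmf = truncated_geometric_pmf(phi, window)
    return rng.choices(range(window), weights=pmf, k=1)[0]


def mean_backoff(phi: float, window: int) -> float:
    """Expected backoff counter for a class parameter phi."""
    return sum(j * p for j, p in enumerate(truncated_geometric_pmf(phi, window)))


if __name__ == "__main__":
    w = window_size(stage=1, w0=16, m_prime=3)
    for phi in (0.5, 0.9, 1.0):  # smaller phi favours earlier slots
        print(f"phi={phi}: mean backoff over W={w} is {mean_backoff(phi, w):.2f}")
```

Under this assumption a smaller parameter concentrates probability on early slots, which matches the statement that the class with the smaller parameter has the higher priority; a parameter of 1 falls back to the uniform choice of plain DCF.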
Algorithm 1 describes the proposed traffic classification and priority scheduling scheme. Lines 5-14 classify the traffic into different categories. Once the STAs are classified, RAW scheduling is carried out in lines 15-22.
To provide fairness among the critical traffic of STAs that have failed in recent attempts, the proposed scheme keeps track of a transmission count $T_c$. On every failure, this count is incremented by 1. The higher the value of $T_c$, the higher the chance of getting a slot in the near future. Algorithm 2 describes the QoS class promotion scheme. A STA with a higher transmission count is promoted to a higher priority class if the current RAW is not saturated, as described in lines 2-6.
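The promotion rule can be sketched in a few lines. This is a simplified illustration, assuming class 1 is the highest priority and that "not saturated" means the average window of the target class is still below that class's maximum window; the `Station` type and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Station:
    aid: int
    cls: int           # priority class, 1 = highest priority (assumed)
    failures: int = 0  # transmission count T_c, incremented on every failure


def record_failure(sta: Station) -> None:
    """Increase the failure count after an unsuccessful transmission."""
    sta.failures += 1


def promote_one(stations, cls, avg_window, max_window):
    """Promote the most-failed STA of class `cls` to class `cls - 1`, if allowed.

    avg_window / max_window map a class number to its average and maximum
    contention window. Returns the promoted Station, or None when the STA
    has to wait for the next beacon.
    """
    target = cls - 1
    if target < 1 or avg_window[target] >= max_window[target]:
        return None  # no higher class, or the higher class is saturated
    candidates = [s for s in stations if s.cls == cls and s.failures > 0]
    if not candidates:
        return None
    best = max(candidates, key=lambda s: s.failures)
    best.cls = target
    return best


if __name__ == "__main__":
    stas = [Station(1, 2), Station(2, 2), Station(3, 1)]
    for _ in range(3):
        record_failure(stas[1])
    record_failure(stas[0])
    print(promote_one(stas, 2, avg_window={1: 32}, max_window={1: 64}))
```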

Adaptive load-aware grouping
IEEE 802.11ah uses RAW for channel access among a large number of stations. To limit simultaneous channel access, stations are divided into multiple groups, and one group of stations is allocated channel access in each RAW. Proper strategies are essential for splitting stations into groups, for example, deciding the number of groups, the duration of each group, or how to place stations in a group according to their QoS requirements. Considering the QoS concerns of IoT applications (e.g., priority), we propose a grouping algorithm for the 802.11ah network. The proposed grouping algorithm starts whenever a request arrives from the previous phase of the protocol. Meanwhile, the AP periodically collects the load information of all RAWs for both types of traffic. For all received frames, the AP keeps a record of the window ($W$) and timestamp ($t_s$) that they have piggybacked. A circular queue is used to store this information for every node in a group. Whenever the queue becomes full, the oldest entry is automatically removed to allow the latest frame to enter. For the maximum permissible window sizes $W_m = \{W_1, W_2, W_3, W_4\}$, the AP finds a set $X$, where $X_m$ is the number of entries of the same type in the queue. Extending PigWin [32], we define the ease of transmission ($\Gamma$) as the reciprocal of the average $W$ size. $\Gamma$ is calculated as: A large and a small $\Gamma$ indicate that the STAs are currently using a small and a large $W$, respectively. Consequently, the transmission failure is low and high for a large and a small $\Gamma$, respectively. Therefore, we define the load in the network as the reciprocal of $\Gamma$, which also expresses the difficulty of transmission. Hence, the average $W$ size over the recent past communication is vital for understanding the current network load.
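The bookkeeping described above can be sketched with a bounded queue. This is a minimal illustration assuming the ease of transmission is simply the reciprocal of the plain average of the piggybacked window sizes held in the queue; the paper's exact weighting is not reproduced here, and the class name `GroupLoadMonitor` and the queue capacity are hypothetical.

```python
from collections import deque


class GroupLoadMonitor:
    """Keeps the most recent (window, timestamp) reports of one RAW group."""

    def __init__(self, capacity: int = 32):
        # deque with maxlen acts as a circular queue: the oldest entry is
        # dropped automatically when a new one arrives and the queue is full.
        self.entries = deque(maxlen=capacity)

    def record(self, window: int, timestamp: float) -> None:
        self.entries.append((window, timestamp))

    def average_window(self) -> float:
        return sum(w for w, _ in self.entries) / len(self.entries)

    def ease(self) -> float:
        """Ease of transmission: reciprocal of the average window size."""
        return 1.0 / self.average_window()

    def load(self) -> float:
        """Load: reciprocal of the ease, i.e. the average window size."""
        return 1.0 / self.ease()


if __name__ == "__main__":
    mon = GroupLoadMonitor(capacity=4)
    for t, w in enumerate([16, 64, 256, 1024]):
        mon.record(w, float(t))
    print("ease =", mon.ease(), "load =", mon.load())
```

With one report for each of the window sizes 16, 64, 256 and 1024, the average window is 340, so the ease under this assumption is 1/340.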
The theoretical maximum value of the load is $W_m$, whereas its minimum permissible value is $W_0$ if the frames are being transmitted successfully. Similarly, the loads of other groups are also monitored. According to the current network condition, a threshold value ($\Gamma^t_{G_i}$) is decided. For example, the value of $\Gamma^t_{G_i}$ is calculated considering all the saturated window sizes received only once, i.e., $X = \{1, 1, 1, 1\}$. So, $\Gamma^t_{G_i}$ is calculated with $C = \{1, 2, 3, 4\}$ and $W_m = \{16, 64, 256, 1024\}$ as:

Algorithm 2 Fairness for Critical Applications
Initialize: W_a^c ← average window size for class C, S ← a set of STAs, T_c ← the failure count
Steps:
1: if Ready to transmit then
2:   for (S ∈ C) do
3:     for Every failure do
4:       T_c = T_c + 1    // Increases failure counts
5:       if (W_a(C_{i−1}) < W_m^{i−1}) then
6:         Promote S → C_{i−1} with highest T_c    // QoS class promotion for STA S
7:       else
8:         Wait for the next beacon

-Case 1:
hence, group is lightly loaded
Once the parameters for grouping-related decisions are available, the regrouping is done using the dynamic AID allocation mechanism. In general, dynamic AID allocation is initiated by a non-AP node by sending an AID switch request. To enable this, STAs are programmed with dot11DynamicAIDActivated equal to true. The proposed adaptive grouping scheme is described in Algorithm 3. In our scheme, an AID switch frame is sent to a particular STA in a heavily loaded group, carrying a new or unused AID, or an AID from a less loaded group. Most of the major operations are carried out by the AP, which further reduces the load on resource-constrained STAs. Figure 3 shows an example scenario of the proposed QoS scheme. RAW1, RAW2, RAW3, and RAW4 are allocated to groups A, B, C, and D, respectively. We consider the Cross-Slot Boundary (CSB) type of slots. As discussed, QoS-aware scheduling happens within a slot. However, as group A has more QoS STAs, there is a chance of congestion in the slot. In such a case, a non-QoS STA may need to take a different AID, preferably from a group with fewer QoS STAs. This allows the scheme to automatically reduce the load in group A by transferring the STA to group C.

Fig. 3 An example scenario of the proposed QoS scheme (legend: dynamic AID operation, QoS STA, non-QoS STA)

Algorithm 3 (lines 4-11)
4: if Γ_{C_i} < Γ^t_C then    // Γ^t_C is the threshold value for ease of load
5:   Find G_{C_j} with highest Γ_{C_j}
6:   if Γ_{C_j} then
7:     Swap (ID_STA(L_{C_i}), ID_STA(L_{C_j}))    // Exchanging AID values with a STA from low load group
8:   else
9:     Create new group G_k
10: else
11:   Goto Step (3)
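The regrouping decision in Algorithm 3 can be sketched as a single step applied to one group. This is an illustrative reading only: the condition on line 6 is assumed to test whether the lightest group's ease is at or above the threshold, the STA chosen for the swap is simply the last one in each list, and the naming of a new group is hypothetical.

```python
def rebalance(groups: dict, ease: dict, group: str, threshold: float):
    """Apply one regrouping step to `group` and return the action taken.

    groups maps a group name to its list of STA identifiers;
    ease maps a group name to its ease of transmission.
    """
    if ease[group] >= threshold:
        return ("keep", None)  # group is lightly loaded, nothing to do
    others = [g for g in groups if g != group]
    target = max(others, key=lambda g: ease[g], default=None)
    if target is not None and ease[target] >= threshold and groups[target]:
        # Exchange AIDs: one STA from the heavy group trades places with
        # one STA from the group that currently has the highest ease.
        groups[group][-1], groups[target][-1] = groups[target][-1], groups[group][-1]
        return ("swap", target)
    new_group = f"G{len(groups) + 1}"  # hypothetical naming scheme
    groups[new_group] = [groups[group].pop()]
    return ("new_group", new_group)


if __name__ == "__main__":
    groups = {"G1": ["s1", "s2", "s3"], "G2": ["s4"]}
    ease = {"G1": 0.001, "G2": 0.05}
    print(rebalance(groups, ease, "G1", threshold=0.003), groups)
```

A fuller version would pick the STA to move by traffic class, for example a non-QoS STA from a group crowded with QoS STAs, as in the Fig. 3 scenario.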

Performance evaluation
The performance of the proposed priority RAW and grouping scheme is measured using theoretical and simulation analysis. The objective of the experiments and analysis is to validate the proposed QoS scheme in terms of improving delay and throughput with an increasing number of STAs, loads, and groups. The performance analysis considers uplink traffic generated randomly by different QoS stations. The AP identifies the traffic patterns as discussed in the methodology section and carries out the slot scheduling and regrouping mechanisms. The throughput is obtained by calculating the volume of frames received over time. The delay is calculated as the average difference between the receive time and the send time of a frame. We use these two performance metrics in all the analyses presented below and compare them with state-of-the-art solutions. The improvement is measured as the percentage increase over the traditional scheme. For example, if the average throughputs of the traditional scheme and QS-MAC are $T^t_{avg}$ and $T^q_{avg}$, respectively, then the percentage of improvement is $((T^q_{avg} - T^t_{avg}) / T^t_{avg}) \times 100$. We plan to analyse the proposed scheme with other performance metrics, such as jitter and energy consumption, in the future.
We compare results with the traditional schemes considering both QoS and non-QoS traffic. The existing solutions optimise RAW size based on delay and energy consumption [33], reduce association delay [34], and reduce control loop latency [10]. To the best of our knowledge, this work is the first to propose traffic class-based QoS with dynamic grouping in 802.11ah. Therefore, the proposed protocol is compared with traditional 802.11ah. The following analyses are carried out to demonstrate the benefits of the proposed scheme:

Analytical model
The 2-Dimensional (2D) discrete Markov Chain model (proposed by Bianchi [35]) is used to analyze the performance of the proposed protocol. The probability of transition to the next state depends only on the current state, without any regard for past states. The probabilistic picture of a possible transition depends on two stochastic functions, $s(t)$ and $c(t)$, at time slot $t$. The 2D Markov chain model is depicted in Fig. 4.

Probability of successful transmission for priority classes
The network with $n$ STAs is divided into $C$ classes, with class $c \in [0, C-1]$ having $n_g$ STAs in each group. A random 2D process $(s(t), c(t))$, where $s(t)$ is the backoff stage in $[0, m]$ and $c(t)$ is the backoff counter in $[0, W-1]$, is represented by $(i, k)$. The state transition probability of $(i, k)$ for class $c$ is denoted as $P_{c,i,k}$. The backoff transition probabilities can be described as:
- From state $(i, k+1)$, if the channel is found to be idle, $k$ is decremented by 1 and the STA moves to state $(i, k)$: $P\{i, k \mid i, k+1\} = 1$, $k \in [0, W_i - 2]$, $i \in [0, m]$
- At state $(i, k)$, if the channel is busy, the backoff counter stays at the same state with probability $p_c$: $P\{i, k \mid i, k\} = p_c$, $k \in [1, W_i - 1]$, $i \in [0, m]$
- If the transmission is successful, $i$ is reset to 0 with the minimum contention window $W_0$, with probability $1 - p_c$: $P\{0, k \mid i, 0\} = (1 - p_c) f(c, 0, k)$, $k \in [0, W_i - 1]$, $i \in [0, m]$
- After an unsuccessful transmission attempt at stage $i - 1$, the STA selects its backoff counter value from the range $(0, W_i - 1)$ in the next stage $i$ with probability $p_c$: $P\{i, k \mid i-1, 0\} = p_c f(c, 0, k)$, $k \in [0, W_i - 1]$, $i \in [1, m]$
- After being unsuccessful in its transmission attempts at all the stages, the STA reaches the last stage $m$ and remains in that stage with probability $p_c$ until the transmission succeeds: $P\{m, k \mid m, 0\} = p_c f(c, 0, k)$, $k \in (0, W_m - 1)$
The steady distribution of the Markov Chain model can be calculated as:
Finally, once a STA's backoff counter reaches zero ($k = 0$), the probability $\tau_c$ that the STA can transmit in the randomly chosen slot can be calculated as:

Throughput analysis
Let $n$ be the total number of STAs of class $c$. During the allocated RAW slot duration, the following events may occur [36]:
1. The channel may be idle; the probability that no STA is currently contending for a RAW slot is given by $P_{idl} = (1 - \tau_c)^n$
2. The probability that at least one STA gets access to a slot for transmission is $P_{txn} = 1 - (1 - \tau_c)^n$
3. The probability that exactly one STA gets access and transmits successfully can be written as $P_{suc} = \dfrac{n \tau_c (1 - \tau_c)^{n-1}}{P_{txn}}$
4. If more than one STA tries to transmit in a single slot at the same time, a collision occurs with probability $P_{col} = 1 - (P_{idl} + P_{txn} P_{suc}) = P_{txn} (1 - P_{suc})$
Throughput ($Th_c$) is defined as the number of frame bits successfully transmitted in a given time for class $c$:
$$Th_c = \frac{E[\text{Payload size transmitted in a slot}]}{E[\text{Length of slot time}]} = \frac{P_{txn} P_{suc} E[Payload]}{(1 - P_{txn})\, t_{slot} + P_{txn} P_{suc} T_{suc} + P_{col} T_{col}}$$
where $T_{suc}$ is the busy time for a successful transmission, $t_{slot}$ is the average duration of a slot, and $T_{col}$ is the busy time when a collision occurs. In 802.11ah, it can be calculated as below: where $T_{FH} = T_{PHY} + T_{MAC}$ is the frame header duration, and $T_{DATA}$, $T_{SIFS}$, $T_P$, $T_{ACK}$, and $T_{DIFS}$ are the data, SIFS, propagation, ACK, and DIFS durations, respectively. The system throughput ($Th$) is the summation of the throughput over all classes of STAs, i.e., $Th = \sum_{c} Th_c$, $c \in [0, C-1]$
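The per-class throughput expression can be evaluated numerically once the transmission probability of a class is known. The sketch below takes that probability as an input rather than deriving it from the Markov chain, reads the success probability as conditional on at least one transmission (as in Bianchi's model), and uses made-up timing values in the demo; all names are illustrative.

```python
def slot_probabilities(tau: float, n: int):
    """Return (P_idl, P_txn, P_suc, P_col) for n STAs transmitting with probability tau."""
    p_idl = (1 - tau) ** n                           # nobody transmits
    p_txn = 1 - p_idl                                # at least one STA transmits
    p_suc = n * tau * (1 - tau) ** (n - 1) / p_txn   # exactly one, given a transmission
    p_col = p_txn * (1 - p_suc)                      # a transmission that collides
    return p_idl, p_txn, p_suc, p_col


def class_throughput(tau, n, payload_bits, t_slot, t_suc, t_col):
    """Throughput of one class in bits per second."""
    p_idl, p_txn, p_suc, p_col = slot_probabilities(tau, n)
    mean_slot = p_idl * t_slot + p_txn * p_suc * t_suc + p_col * t_col
    return p_txn * p_suc * payload_bits / mean_slot


if __name__ == "__main__":
    # Illustrative values only: 256-byte payload and assumed durations in seconds.
    th = class_throughput(tau=0.02, n=50, payload_bits=256 * 8,
                          t_slot=52e-6, t_suc=6e-3, t_col=5.5e-3)
    print(f"Th_c = {th / 1e3:.1f} kbit/s")
```

Summing `class_throughput` over all classes gives the system throughput.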

Delay transmission analysis
We calculate the average transmission delay as discussed in [37]. If $E(D_b)$, $E(D_s)$, and $E(D_r)$ are the delays due to backoff, the number of slots before the backoff counter freezes, and retries, respectively, then the average delay $E(D_c)$ for class $c$ can be calculated from these components, where $T_{ACKtimeout}$ is the ACK timeout and $T_1 = P_{suc} T_{suc} + P_{col} T_{col}$. The average delay including all classes is then obtained from $E(D_c)$, $c \in [0, C-1]$.

Analytical and simulation results and validation
We analyze the RAW performance considering four traffic classes, $C_1$, $C_2$, $C_3$, and $C_4$, where the priority of $C_i > C_{i+1}$ for $1 \le i < 4$. We compare the results of the proposed QoS-aware MAC protocol (QS-MAC) with the DCF mechanism of 802.11ah. Further, a simulation analysis is carried out in the same environment. The values of the different parameters are given in Table 5. The duration of the data frame for the MCSs used in 802.11ah can be calculated by the equation given in [38], where $m_h$, $R$, $D_r$, $L_{D_{rs}}$, $T_{sym}$, and $T_{PHY}$ are the MAC header size, basic data rate, number of bits in one OFDM symbol, OFDM symbol duration, and PHY header size, respectively. For 256 bytes (MCS0, 2 MHz), $T_{DATA} = 4.56$ ms. To show the saturation throughput when the number of STAs differs across classes (among 1000 STAs), we use a tabular data representation; otherwise, a graphical representation is used for common case scenarios. We consider 4, 6, and 10 as the maximum backoff stages for the C1, C2, and C4 classes, respectively.

Initially, we analyze the throughput and delay performance of the proposed scheme in analytical (A) and simulation (S) environments and compare the results with traditional 802.11ah. Based on the theoretical analysis in Sect. 4.1.2, the calculated saturation throughput is presented in Table 3. The throughput achieved by the QoS scheme is much higher than that of the conventional DCF used in 802.11ah. The required bandwidth is optimally utilized for higher reliability and lower latency of priority traffic. Further, the saturation delay is calculated for both types of traffic, as shown in Table 4. The saturation delay is greatly improved in the proposed protocol.

Simulation analysis
We use NS-3 [39] for the simulation analysis of the proposed protocol. Initially, the effects on throughput and delay are analyzed for different priority classes with 50% of QoS traffic in the traditional scheme. Other simulation parameters are given in Table 5.

QoS in proposed versus traditional scheme
We compare the performance of QS-MAC with the traditional scheme and measure the effect of prioritizing priority traffic over non-priority traffic. Before checking the performance results of multiple traffic classes in the same network, we initially consider only priority and non-priority traffic. The overall throughput is shared among priority and non-priority traffic in this case. With MCS0 and 2 MHz, the maximum data rate over a half-duplex link is not more than half of the total data rate (i.e., 650 Kbps/2). As shown in Fig. 5a, if we add the throughputs achieved by QS-MAC and the traditional scheme, the total comes to around 300 Kbps. In Fig. 5a, for group counts such as 6 and 10, throughput decreases for the DCF scheme as the number of STAs increases. However, in the case of the priority scheme, saturation throughput increases. With an increasing number of STAs, a higher number of priority STAs try for channel access. Hence, the probability of a non-QoS STA getting the channel is lower. Secondly, without any priority scheduling, the chances of collisions are higher with a higher number of STAs in DCF. The remaining bandwidth can be utilized to support the requirements of class STAs, which is done by the proposed adaptive grouping scheme. Again, the DCF mechanism performs better with a higher number of groups. This is because collisions due to contention are reduced. Figure 5b considers all the traffic classes for analysis. Throughput is measured with an increasing number of groups considering QoS-bound traffic. With only one group in the DCF mechanism, throughput decreases drastically. However, using 6 groups, a better result can be seen, especially in the case of a large number of STAs.

QoS of different traffic categories
In the next phase, we keep all types of traffic flow classes in the same network. The total average throughput achieved by all categories is around 300 Kbps. The throughput was achieved over an increasing number of STAs considering all classes C1, C2, C3, and C4 of traffic at the same time. If we consider only class-based traffic in the same scenarios, the throughput achieved by the higher priority class is better. Also, a small decline in throughput is observed as compared to the DCF scheme. Adding up all the priority classes, throughput is 125% higher than with the traditional DCF scheme. Further, throughput performance for class STAs is also measured against different group sizes. The proposed protocol shows similar results for almost all the group sizes (as can be seen in Fig. 6a). This is due to the adaptive grouping scheme, which balances the available bandwidth among all the groups. The proposed scheme reduces simultaneous channel access through early priority scheduling. Finally, Fig. 6b shows the saturation delay performance. The proposed scheme reduces delay by a large margin. It applies RAW scheduling and adaptive node grouping to provide QoS for the critical STAs. The primary delay in a large-scale network is the channel access delay, which is reduced in this scheme. However, non-class STAs incur higher delays, despite the relief provided by a larger number of groups.

Effect of grouping in proposed versus traditional scheme
We analyzed the overall delay improvement in the proposed scheme and compared it with the traditional scheme. Along with the traditional scheme as the baseline mechanism, we also consider a dynamic RAW configuration (DRAW) scheme as proposed in [13]. This solution adjusts the RAW size according to the load and collisions in a group, without considering dynamic grouping in its decision. As shown in Fig. 7a, the average throughput is measured with an increasing number of STAs. Our solution efficiently utilizes the available channel bandwidth, which is distributed among different RAW groups. Due to this, the total throughput (C1+C2+C3+C4) in the proposed scheme is higher than in the traditional scheme and DRAW. Figure 7b shows the average delay performance of the proposed scheme with increasing loads in a group. Considering 50% of QoS traffic in a RAW group, we take the average of the delays of all the STAs that are initially considered to be in the same group. It may be noted that a STA may not be in the same group for the entire duration. Due to the dynamic grouping scheme, the average delay of the STAs is reduced significantly as compared to the traditional scheme. Due to the saturation capacity of the links, the delay changes drastically when the load increases beyond 300 Kbps. The proposed QoS-aware scheme considers multiple types of applications in an 802.11ah network and gives priority to those that need it. While supporting priority, it proposes a dynamic scheduling and grouping scheme. The overall performance of the proposed scheme has been improved significantly.

Conclusion
This work presented a priority scheduling and adaptive grouping scheme for improving QoS performance in terms of latency and throughput over a highly dense network. With the use of the proposed scheme, the utilization of a large-scale network is significantly improved over the existing RAW grouping. From the experimental results, it is apparent that a higher number of groups gives better performance given the different requirements of STAs. This study suggests that it is advisable to classify all the STAs of the network into different priority classes according to their requirements and to schedule their transmission opportunities accordingly. The proposed protocol uses three algorithms, viz., priority scheduling, a fairness scheme, and adaptive grouping, to achieve QoS. These algorithms use the same computation on backoff operations for different purposes, such as window-based priority setup, load balancing, and achieving fairness. Hence, computational latency is lower and scalability is higher as compared to the existing related works.
Although the proposed QoS scheme can ensure improved services for critical applications to a certain extent, congestion will be high with an increasing number of STAs and critical traffic. In the future, we plan to implement a deadline-aware scheduling and grouping scheme to provide guaranteed QoS.