Hierarchical traffic light-aware routing via fuzzy reinforcement learning in software-defined vehicular networks

The lack of a complete view of the vehicular topology, together with vehicle movement being restricted to streets with time-varying traffic light conditions, has left drastic gaps in traditional vehicular routing protocols. Using software-defined networks (SDN), this paper proposes HIFS, a Hierarchical Intersection-based routing strategy that incorporates Fuzzy SARSA reinforcement learning to fill these gaps. At the first level of the HIFS scheme, a utility-based intersection selection policy is presented using fuzzy logic that jointly considers delay estimation, curve distance, and the predicted movement of vehicles towards intersections. Then, a fuzzy logic-based path selection policy is proposed to choose the paths with the highest flexibility against intermittent connectivity and increased traffic loads. Residual bandwidth, Euclidean distance, angular orientation, and congestion are the inputs of this fuzzy logic system. Meanwhile, traffic light states and nodes' information are used to tune the output fuzzy membership functions via a reinforcement learning algorithm. The efficiency of our scheme in controlling the ambiguity and uncertainty of the vehicular environment is confirmed through simulations under various vehicle densities and different traffic light durations. Averaged over both scenarios, the simulation results show that HIFS increases the packet delivery ratio by 48.75%, 54.63%, and 8.78%, increases the throughput by 48.66%, 53.79%, and 8.61%, reduces end-to-end delay by 33.35%, 46.14%, and 15.38%, reduces the path length by 25.25%, 36.47%, and 15.32%, and reduces normalized routing overhead by 37.09%, 49.79%, and 20.17%, compared to the MISR, ITAR-FQ, and GLS methods, respectively.


Introduction
The growth of intelligent information and communication technology in various fields, including the Internet of Things (IoT), Sensor Networks (SN), Vehicular Networks (VN), and Unmanned Aerial Vehicles (UAV) [1][2][3][4][5][6][7][8][9], profoundly affects our lives in various application areas. Among these networks, VNs have emerged as a promising technology for the future of Intelligent Transportation Systems (ITS), intended to improve road safety and develop vehicular entertainment applications [10,11]. Road safety applications, including traffic violation warnings, lane change alerts, and pre-crash warnings, aim to make driving easier and reduce casualties. Recent studies [12] showed that if drivers are warned more than half a second before a crash, road casualties can be reduced by approximately 60%. Therefore, road safety messages should be disseminated to all intended vehicles as quickly and reliably as possible. Entertainment-related information also provides passengers or drivers with services such as Internet access, video streaming, or gaming [10][11][12][13][14]. Therefore, researchers in industrial and academic communities have focused on the potential of vehicular networks to save lives and support a clean urban environment that improves citizens' quality of life, as well as on the ability of entertainment applications to make a trip enjoyable with online games and Internet access and to save travel time by avoiding traffic jams. By providing a variety of communications, such as vehicle-to-vehicle (V2V), vehicle-to-infrastructure (V2I), or vehicle-to-everything (V2X), these applications can be shared with the surroundings [11,15]. Sharing applications among vehicles via an efficient and reliable routing method has become an important research area in vehicular networks.
In recent decades, numerous vehicular routing protocols and strategies [9][15][16][17][18][19] have been proposed for VNs. Based on the route formation strategy, they can be classified into location-based and topology-based routing categories [14]. Lately, position-based routing protocols have attracted much attention among researchers and communities due to their applicability in vehicular networks [14]. However, intermittent connectivity caused by the high dynamics of the vehicular topology, bandwidth limitations, traffic congestion, and obstacles can lead to communication bottlenecks and compromise the efficiency of location-based routing protocols. That is why intersection-based routing protocols have been widely proposed to reduce the effects of these challenges in vehicular networks [17,18]. Among the state-of-the-art intersection-based routing schemes, full-path traffic awareness, local intersection selection, and broadcasting control packets (CP) are the most commonly employed methods for real-time road evaluation and intersection selection [17,18]. However, limited knowledge of the vehicular topology and additional costs such as increased latency and routing overhead may lead to sub-optimal performance of traditional intersection-based routing. Some of these challenges are likely to be reduced by integrating new technological paradigms such as software-defined networking into vehicular intersection-based routing. SDN is a promising technology that offers an efficient network management solution by separating the control plane from the data plane [11,15,20,21]. Combining software-defined networks with vehicular networks, known as software-defined vehicular networks (SDVNs), can provide programmability and access to information about the entire network. Therefore, routing decisions can be made based on the global information obtained from the vehicular environment and the road network topology.

Motivation
Significant challenges of traditional vehicular routing, including local maxima and congestion caused by limited knowledge in routing decisions, can be reduced through the global visibility provided by SDVNs. However, intermittent connectivity, interface heterogeneity, and increasing demand for scalability and reliability, besides road constraints such as time-varying traffic lights, are common barriers to optimal SDVN routing performance. Therefore, making routing decisions in SDN-enabled vehicular networks under road restrictions and application requirements such as end-to-end latency is still challenging. Various routing criteria to tackle these challenges have been presented in SDVNs using different techniques and computational methods [22][23][24][25][26][27][28][29][30][31]. Among these techniques, fuzzy logic schemes such as [25] can make routing decisions more efficient: they process approximate data using non-numeric linguistic variables, which makes them user-friendly and closer to human thinking. However, because routing criteria may conflict and the vehicular environment is uncertain and ambiguous, the use of traditional type-1 fuzzy logic may lead to sub-optimal routing decisions. Accordingly, a new paradigm should be developed to cope with the uncertainties of the vehicular environment. Fuzzy logic with the ability to tune fuzzy membership functions can deal with ambiguities, resulting in more suitable performance [32,33]. Adaptive membership functions can handle the linguistic and numerical uncertainties associated with traditional type-1 fuzzy logic [33]. On the other hand, accounting for the effects of traffic lights along a sequence of intersections is still a challenging task in transmitting routing data and should be investigated more deeply.
Motivated by the need to deal with the uncertainties of vehicular environments and to consider the effect of traffic lights, this paper presents a fuzzy-based hierarchical routing scheme whose output membership functions are tuned by SARSA reinforcement learning.

Main contributions
Our HIFS routing scheme, aided by the global visibility provided by the software-defined network, is an adaptive routing scheme that can cope with road constraints and the time-varying vehicular topology. Reducing the routing overhead and handling inherent challenges of vehicular environments, including bandwidth limitations and the high velocity of vehicles under increased traffic load, are among the strengths of our proposed scheme. Besides, efficiently selecting a sequence of intersections informed by the traffic light conditions and the ability to cope with the ambiguity of the vehicular environment strengthen the superiority of our proposed scheme over state-of-the-art intersection routing schemes. Our main contributions are summarized as follows:
• Through the global vision provided by software-defined networks, a hierarchical routing scheme is proposed to make decisions efficiently in vehicular environments. Initially, a fuzzy logic-based intersection selection policy is provided by jointly considering estimated latency, the predicted movement of vehicles towards candidate intersections, and curve distance. Based on the fuzzy results, we propose a utility-based scheme to dynamically select a sequence of intersections, which can improve the data packet transmission efficiency.
• A utility-based relay node selection policy is proposed via fuzzy logic with flexibility against network topology changes and increased traffic loads. The HIFS scheme considers multiple fuzzy metrics, including residual bandwidth, angular orientation, Euclidean distance, and load capacity.
• The output fuzzy membership functions are tuned by SARSA reinforcement learning to handle the impacts of uncertainties arising from the dynamic vehicular topology and from roads restricted by different traffic light conditions. The efficiency of adaptively tuning fuzzy membership functions depending on macroscopic and microscopic aspects of the vehicular environment has been demonstrated through simulation.
Compared with state-of-the-art routing schemes, our HIFS scheme improves network performance in terms of packet delivery ratio, average delay, path length, and routing overhead.

Organization
The rest of this paper is organized as follows. A summary of the state-of-the-art intersection-based routing protocols is given in Section 2. Section 3 presents the details of our routing scheme. In Section 4, the simulation settings and the results are discussed. Finally, concluding remarks and our future work are given in Section 5.

Related work
This section reviews recent state-of-the-art work on vehicular intersection-based routing in two categories: 1) traditional intersection-based protocols and 2) the integration of software-defined networks into intersection-based routing. The aim is to show the progress of related works in considering traffic light effects and in dealing with the ambiguity of the vehicular environment. The inability of previous works to deal with the ambiguous vehicular environment and to consider global traffic light effects inspired us to propose our HIFS scheme. The surveyed papers in each category are detailed below.

Traditional intersection-based routing in VNs
Some of the intersection-based routing protocols that use fuzzy logic or are traffic light-aware are discussed below, along with their similarity to our scheme. To consider the effects of traffic lights on routing performance, Ding et al. [34] proposed a traffic light-aware routing scheme called TLRC. The next road section in this scheme was selected according to the density and vehicle distribution. A greedy strategy was used to choose the next-hop nodes in this scheme. Chang et al. [35] suggested a shortest path-based traffic light-aware routing scheme called STAR. Data packet forwarding in the road sections is based on the traffic light statuses (e.g., green lights have higher priority for forwarding data packets). The next-hop node between two intersections was chosen using the greedy method. In [36], Xia et al. presented a greedy traffic light and queue-aware routing method called GTLQR, which used street connectivity in various traffic light conditions to select the best intersection. It then considered the channel condition, distance to destination, and queueing delay in prioritizing neighbor nodes for selecting the next-hop nodes. Zhou et al. [37] proposed a multiple intersection selection routing called MISR, which modeled the intersection selection as an optimization problem. It selected the intersection with minimum delay and highest connectivity by considering traffic light conditions. In addition, an instability coefficient combining the neighbor's progress towards the destination and the speed difference between neighbors was used to select the next-hop nodes. Cao et al. [38] suggested a fuzzy logic intersection-based routing scheme called IRFMFD, in which intersections were selected based on the number of vehicles and the link lifetime obtained from two-hop neighbors' information. The next-hop node selection in the IRFMFD scheme is a fuzzy-based method considering vehicle density, distance, and relative speed of vehicles. Also, Cao et al. [39] presented an intersection-based routing scheme with a fuzzy multifactor called IRFMF. Density, number of lanes, and traffic flow were considered inputs of the fuzzy system in selecting the intersections. This method used a limited greedy strategy that considers the number of contacts between vehicles for the next-hop selection. He et al. [40] proposed an intersection-based traffic-aware routing via fuzzy Q-learning called ITAR-FQ. The ITAR-FQ scheme used a weighted cost function composed of density, road latency, and Manhattan distance in its intersection selection mechanism. Also, link lifetime, link quality, Euclidean distance, and bandwidth were fed to a fuzzy logic system to choose the next-hop nodes. Debnath et al. [41] proposed a fuzzy logic scheme for inter-vehicle network routing. Intersection selection in this method depends on the number of vehicles, the distance between two intersections, and the average speed of vehicles moving between two intersections. The next-hop nodes were selected by fuzzy logic that jointly considers the communication link expiration time and the communication quality factor.

Intersection-based routing with the aid of SDNs
In recent years, numerous intersection-based multi-path or single-path routing schemes have been proposed for software-defined vehicular networks via various techniques such as computational intelligence and position estimation methods [11][22][23][24][25][26][27][28][29]. Some of them are reviewed below. Zhao et al. [22] proposed a Penicillium reproduction-based Online Learning Adaptive Routing method called POLAR. A proper routing method (i.e., AODV or DSR) was selected in this scheme based on the information obtained from the current traffic conditions. The geographical area was divided into multiple grids to facilitate real-time information processing. Furthermore, the Penicillium Reproduction Algorithm (PRA), with its optimization abilities, enhanced the efficiency of the learning process. Oubbati et al. [23] proposed a three-tier architecture called SEARCH to develop the awareness of road conditions and select the best-traveled path for vehicles. Dijkstra's algorithm was used to find the routes with the shortest journey time to any specific destination. The journey time was computed according to the density of vehicles, obstacles, casualties on the roads, and the average velocity of vehicles. Noorani and Seno [24] proposed SDN and Fog computing-based Switchable Routing called SFSR to select the most suitable path for data routing. A weighted cost function combining Euclidean distance, route length, route delay, traffic congestion, and stability parameters is used to select the best routing paths. Zhao et al. [25] proposed an SDN-based fuzzy logic routing scheme that divided urban regions into several areas. A fuzzy-based method with several features, including mixed distribution, one-way connectivity, and valid distance, was employed for selecting the most suitable areas. In addition, a reinforcement learning algorithm was used to adapt the area selection policy. The most stable paths were also selected through fuzzy logic considering movement direction, speed, and distance. Abbas et al. [26] developed a hybrid routing strategy for SDVNs that divides road segments into different sections. High-reliability paths were selected considering the number of neighbors, link connections, and node distributions. The controller also utilized a mechanism to deal with link failures. The Hello message interval was adjusted depending on the vehicle velocity. Venkatramana et al. [27] proposed an SDN-enabled connectivity-aware geographical routing called SCGRP, in which the shortest vehicular route was computed using OSM spatial data. A traffic value threshold on the roads was used in this method to provide high path connectivity. The route between the source and destination nodes was computed using the distance, vehicle density, and vehicle speed. Kong and Zhang [28] proposed an ant colony routing algorithm for SDVNs, in which a pair of exploring ants finds the best routing paths for data transmission using pheromones. The intersection was computed considering the density and distance between the source and destination, and then next-hop nodes were selected greedily. Gao et al. [29] proposed a hierarchical load balancing scheme called HRLB for software-defined vehicular networks. In this method, the geographical area was divided into multiple small grids based on the geographical location. A sequence of grids was selected for the transmission of data packets based on the transition probability and real-time density of these grids. Two paths with the least cost were selected considering the load balancing and node utility among the grids. The weighted cost function was obtained as a combined function of route length, the number of vehicles, distance from adjacent nodes, and route load. The utility of nodes was calculated based on the remaining buffer size and the distance of nodes to the destination.

Research gap
• Transmission of control packets (CPs) at intersections, as in ITAR-FQ [40], requires periodic updates, which increases latency and overhead, especially at high densities or high traffic loads.
• Routing decisions based on local knowledge or two-hop neighbors' information, with a restricted view of the vehicular topology, leave nodes trapped in the local maxima problem and degrade data routing efficiency [34][35][36][37][38][39][40][41].
• Routing efficiency is restricted when road constraints and traffic light conditions are ignored. The traditional routing methods, including TLRC [34], STAR [35], GTLQR [36], and MISR [37], considered the effect of traffic lights only during the local selection of intersections. However, traffic light effects in SDN-enabled vehicular networks should be considered while selecting a sequence of intersections in the data routing process.
• Despite all advances gained by using traditional type-1 fuzzy logic and other computational intelligence techniques in SDVN routing schemes [22][23][24][25][26][27][28][29], coping with the uncertainties and ambiguity of vehicular environments is still a challenging task that needs to be addressed.
The studied works evaluated the effects of traffic lights locally or combined software-defined networks with computational intelligence, such as type-1 fuzzy systems. However, they have not considered the effect of traffic lights along a sequence of intersections to cope with the uncertainty of vehicular environments. Furthermore, ignoring the global effect of traffic lights leads to scattered connectivity and network partitioning problems, specifically at intersections: long queues of vehicles behind red traffic lights increase fragmentation and partitioning on the other sides of the intersection. Our method considers global traffic light conditions and tunes the fuzzy output membership functions. Therefore, determining routing paths adaptively, considering characteristics consistent with a vehicular environment restricted to the road topology, and effectively handling the ambiguity of the vehicular environment are the main differences between our work and state-of-the-art intersection-based schemes.

Proposed method
This paper proposes a fuzzy-based routing scheme for software-defined vehicular networks to address the shortfalls outlined in Section 2.3, so that the ambiguity and uncertainty of the vehicular environment can be handled by tuning membership functions. Details of our scheme are given below.

Network model
Our proposed hierarchical SDVN architecture includes a set of vehicles, wireless switches, roadside units (RSUs), base stations (BSs), and a centralized SDN controller. The centralized SDN controller is at the top level of the network. At the bottom level, RSUs and BSs are deployed to provide fog services supported by the OpenFlow protocol, processing, and storage capabilities. The road network can be depicted as a graph $G_J = (V_J, E_J, U_J)$, where $V_J$ is the set of intersections and $E_J$ is the set of road segments connecting intersections. If there is at least one road segment between two intersections, they are said to be adjacent. The vehicular topology also forms a dynamic graph $G_v = (V_v, E_v, U_v)$, where $V_v$ represents a finite set of nodes and $E_v$ denotes a set of asymmetric wireless links between neighboring vehicles. $U_J$ and $U_v$ are the utilities of intersections and vehicle links, respectively, and are explained in Section 3.2. Using the Global Positioning System (GPS), each node is informed of its position. The centralized SDN controller stores information, including congestion, residual bandwidth, location, distance, velocity, direction, etc., obtained from the local controllers. This information is gathered periodically from each vehicle using the southbound SDN interface. When a node is in the range of two local controllers, the controller with the shortest Hello response time is taken as the nearest controller. The response time depends on the vehicle density and the distance between the RSUs and vehicles. Our layered architecture is shown in Fig. 1.

Communication model
Vehicles and roadside units obtain their one-hop neighbors' information by exchanging Hello packets periodically. The interval of broadcasting Hello messages can be adaptive [42] or fixed. This paper uses a fixed Hello interval (= 1). Table 1 shows the format of the Hello packet. When vehicular nodes broadcast Hello packets, RSUs within the vehicle's range receive these messages and transmit the obtained information to the SDN controller. Hence, the centralized SDN controller views the entire vehicular network topology. Conversely, upon receiving Hello messages from the vehicles, each RSU broadcasts ACK messages to notify vehicles of its presence. If no information has been received within the pre-defined time, the local controllers notify the centralized SDN controller. Then both local and centralized SDN controllers update their information. In some cases, vehicular nodes leave the current RSU's coverage area and join the coverage area of another controller. At this point, the previous controller (RSU) forwards a notification to the centralized SDN controller and waits for the flow table to be updated. The SDN controller updates the data routing path based on the new network topology and then sends the flow table back to the roadside units.
Our data packet sending policy differs depending on whether the source vehicle communicates with an RSU or only with its own neighboring vehicles. If the sending vehicle is within the range of an RSU, it sends a request message to the RSU. Then, the RSU looks up its flow table to find the route to the destination vehicle. If the route is found, a response message including the best routing path information is sent to the source vehicle. If no data routing path is found, the message is directed to the centralized SDN controller. Global visibility enables the central SDN controller to identify the optimum routing path from the source to the desired destination. The source vehicle begins sending data packets after it receives the information about the optimum routing path. The formats of the request packets used by source vehicles (i.e., packet_in) and the control packets used by controllers (i.e., packet_out) are illustrated in Table 2.
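The request handling described above can be sketched as a two-level lookup. This is a minimal illustration, not the paper's implementation: the function name `handle_packet_in`, the dictionary-based flow tables, and the vehicle identifiers are all hypothetical, and the real packet_in/packet_out formats of Table 2 are not modeled.

```python
from typing import Dict, List, Optional

# Hypothetical flow table: destination vehicle id -> list of hop ids.
FlowTable = Dict[str, List[str]]


def handle_packet_in(dest: str, rsu_table: FlowTable,
                     controller_table: FlowTable) -> Optional[List[str]]:
    """Return a route for `dest`, asking the central controller on a miss."""
    route = rsu_table.get(dest)
    if route is not None:
        return route                      # local hit: the RSU answers directly
    route = controller_table.get(dest)    # miss: escalate to the controller
    if route is not None:
        rsu_table[dest] = route           # controller installs the flow entry
    return route


# Illustrative tables (assumed values).
rsu = {"v7": ["v1", "v4", "v7"]}
central = {"v7": ["v1", "v4", "v7"], "v9": ["v1", "v3", "v9"]}
print(handle_packet_in("v9", rsu, central))
```

In this sketch the RSU caches the route returned by the controller, which mirrors the flow table being sent back to the roadside units.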
The following describes how our utility-based scheme specifies the list of intersections and forwarding vehicles through which data packets pass to reach the intended destination. The fuzzy SARSA learning algorithm assigns utilities to the intersections and vehicles that constitute data transmission paths. Curve distance, prediction of vehicle movement towards intersections, and estimated delay are used to determine intersection utilities. Likewise, forwarding vehicles constituting data routing paths are selected by their utilities, which are computed based on residual bandwidth, angular orientation, congestion, and Euclidean distance. Let $u_{J_n,J_m}$ be the local utility of candidate intersection $J_m$ from intersection $J_n$, and let $u_{v_i,v_j}$ be the local utility between two vehicles $v_i$ and $v_j$. The utilities of the formed routes, consisting of a sequence of intersections and relays, are defined as in Eq. (1).
where $U_{(J_s,J_d)}$ and $U_{(v_s,v_d)}$ denote the global utilities of the intersection sequence and of the forwarding vehicles in the data transmission path. Due to the different combinations of intersections and vehicular nodes, the centralized SDN controller [...] be formed in the selected intersections. Therefore, to achieve high end-to-end routing performance, the data routing path must be formed from the best sequence of intersections and forwarding vehicles among all possible paths, found via the Dijkstra algorithm. First, the intersection sequence with the maximum utility is selected using Eq. (2) among all intersection sequences from the source to the destination intersection.
After determining the sequence of intersections with maximum utility, our scheme chooses, within the selected intersection sequence, the route consisting of vehicle links with maximum utility according to Eq. (3).
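A maximum-utility sequence can be found with Dijkstra's algorithm once utilities are turned into non-negative costs. The sketch below assumes, purely for illustration, that local utilities lie in (0, 1] and that a path's global utility is their product; the aggregation in Eq. (1) may differ. Under that assumption, maximizing the product equals minimizing the sum of $-\log u$. The function `best_sequence` and the sample graph are hypothetical.

```python
import heapq
import math
from typing import Dict, List, Tuple

# intersection -> {adjacent intersection: local utility in (0, 1]}
Graph = Dict[str, Dict[str, float]]


def best_sequence(graph: Graph, src: str, dst: str) -> Tuple[float, List[str]]:
    """Maximize the product of local utilities via Dijkstra on -log(u)."""
    dist = {src: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            break                      # destination cost is final
        if d > dist.get(node, math.inf):
            continue                   # stale heap entry
        for nxt, u in graph.get(node, {}).items():
            nd = d - math.log(u)       # u in (0, 1] gives a non-negative cost
            if nd < dist.get(nxt, math.inf):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if dst not in dist:
        return 0.0, []                 # destination unreachable
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return math.exp(-dist[dst]), path[::-1]


# Assumed utilities: Js-J1-Jd scores 0.9 * 0.6, Js-J2-Jd scores 0.5 * 0.9.
roads = {"Js": {"J1": 0.9, "J2": 0.5}, "J1": {"Jd": 0.6}, "J2": {"Jd": 0.9}, "Jd": {}}
utility, sequence = best_sequence(roads, "Js", "Jd")
print(sequence)
```

The same routine could be reused on the vehicle graph inside the selected intersection sequence, again under the product assumption.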
Finally, to ensure that data are delivered successfully from source to destination, several constraints are considered as follows. The following two constraints are considered to prevent nodes from being trapped in a routing loop. If a forwarding vehicle has traveled from $J_m$ to $J_n$, the indicator for $(J_m, J_n)$ is equal to one; otherwise, it equals zero.
• Constraint 2 (5): To ensure the progress of data packets toward the destination, the curve distance of a candidate intersection to the destination intersection must be less than the curve distance of the current intersection to the destination intersection.
• Constraint 3 (6): Let $dist^{estimate}_{(v_i,v_j)}$ be the estimated distance between vehicles $v_i$ and $v_j$, and let $R$ be the transmission radius between two neighboring vehicles. $dist^{estimate}_{(v_i,v_j)}$ must be less than $R$; otherwise, the data forwarding will fail.
• Constraint 4 (7): The total end-to-end delay of the application should not exceed the delay constraint $\Delta$. $V_v$ is the set of relay vehicles in the data transmission path. For the definitions of the parameters in Eq. (7), please see Eq. (18).
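Constraints 2 to 4 amount to three feasibility checks on a candidate path. The sketch below is only an illustration under stated assumptions: the function name and argument layout are hypothetical, and the end-to-end delay is assumed to be the sum of per-hop delays, which may differ from Eq. (7).

```python
from typing import List


def satisfies_constraints(curve_cand: float, curve_cur: float,
                          link_dists: List[float], radius: float,
                          hop_delays: List[float], delay_bound: float) -> bool:
    """Check progress, radio range, and delay bound for one candidate path."""
    # Constraint 2: the candidate intersection must be closer to the destination.
    progress = curve_cand < curve_cur
    # Constraint 3: every estimated link distance must be below the radius R.
    in_range = all(d < radius for d in link_dists)
    # Constraint 4: total delay (assumed additive) must not exceed the bound.
    on_time = sum(hop_delays) <= delay_bound
    return progress and in_range and on_time


# Assumed sample values: distances in meters, delays in seconds.
print(satisfies_constraints(400.0, 650.0, [120.0, 200.0], 250.0, [0.01, 0.02], 0.1))
```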
Finally, the next-hop node is selected locally if the source vehicle is outside the local controller's communication range. Each forwarding vehicle computes the utilities of its neighboring nodes and sorts them in ascending order based on the local information collected from Hello packets and the fuzzy system computations. The neighboring vehicle with the highest utility is selected as the next-hop node for relaying data packets. This process continues until the carrying vehicle is covered by an RSU or reaches the intersection. The details of the considered criteria and the steps of the fuzzy systems are described below.
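The local fallback reduces to ranking neighbors by utility. A minimal sketch, assuming the utilities have already been produced by the fuzzy system; `pick_next_hop` and the sample values are hypothetical.

```python
from typing import Dict, Optional


def pick_next_hop(neighbor_utility: Dict[str, float]) -> Optional[str]:
    """Sort neighbors by utility in ascending order and take the last one."""
    if not neighbor_utility:
        return None  # assumption: with no neighbor, the vehicle keeps the packet
    ranked = sorted(neighbor_utility, key=neighbor_utility.get)  # ascending
    return ranked[-1]  # the last entry has the highest utility


print(pick_next_hop({"v2": 0.41, "v5": 0.77, "v8": 0.63}))
```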

Proposed hierarchical SDN-enabled vehicular routing: HIFS
As shown in Fig. 2, our HIFS scheme consists of intersection and forwarding vehicle selection strategies based on a fuzzy reinforcement learning algorithm. The details of these strategies and their steps are described below.

Intersection selection strategy
The local controllers or the centralized SDN controller in our HIFS scheme use the curve distance, the delay estimation between two intersections, and the predicted number of vehicles moving towards intersections as decision-making factors to select the intersections through which data packets pass. After these criteria are computed, the crisp values are applied to the fuzzy system. The numerical value obtained via the fuzzy output membership functions and the defuzzification method is considered the candidate intersection's local utility. The computation of the abovementioned criteria is summarized in Algorithm 1. Our utility-based intersection selection policy is described below.
1. Curve distance: In Eq. (8), $Dist_{curve}(J_n, J_d)$ and $Dist_{curve}(J_m, J_d)$ are the curve distances of the current intersection $J_n$ and the candidate intersection $J_m$ to the destination intersection $J_d$, respectively. Also, Eq. (9) [43] gives the curve distance between two intersections.
The distance of the candidate destination intersection from the destination node and the vehicle movement angle are combined in a weight function to determine the destination intersection, as in Eq. (10) [44],
where $Dist_{(J, v_d)}$ is the distance between the candidate destination intersection $J$ and the destination $v_d$, and $c$ is a normalization factor; [...] indicates the position vector starting at $v_d$ and ending at $J$. The weighting factor lies in the range $(0,1)$. The candidate with the highest weight $w(J)$ is selected as the destination intersection.

2. Predicting movement of vehicles towards intersections:
When vehicles are sparsely distributed at intersections or on road segments, the probability of finding a suitable relay node for sending the data packets is lower. Based on the estimated movement of vehicles, this paper employs an evaluation index to predict the number of nodes entering or approaching the candidate intersections. Accordingly, whether a vehicle is moving towards or away from a candidate intersection is computed as in Eq. (12) [45].
where $(x_m, y_m)$ is the position of candidate intersection $J_m$, and $(x_{cur}, y_{cur})$ is the current position of vehicle $v_i$. Given the current speed $s_v$ of the vehicle, the predicted position $(x_{future}, y_{future})$ of vehicle $v_i$ is computed by Eq. (13) [45].
where, taking the previous position of vehicle $v_i$ as $(x_{prev}, y_{prev})$, the angular orientation can be calculated by Eq. (14).
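The position prediction can be illustrated with a standard dead-reckoning step. This sketch assumes the heading is the `atan2` of the displacement from the previous to the current position and that the vehicle keeps its speed over a horizon `dt`; the exact forms of Eqs. (13) and (14) may differ. The `approaching` test is a simplified stand-in for the index of Eq. (12), not the paper's formula, and all names are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def heading(prev: Point, cur: Point) -> float:
    """Angular orientation (radians) of the move from prev to cur."""
    return math.atan2(cur[1] - prev[1], cur[0] - prev[0])


def predict_position(prev: Point, cur: Point, speed: float, dt: float) -> Point:
    """Extrapolate the position after dt seconds at constant speed and heading."""
    theta = heading(prev, cur)
    return (cur[0] + speed * dt * math.cos(theta),
            cur[1] + speed * dt * math.sin(theta))


def approaching(cur: Point, future: Point, junction: Point) -> bool:
    """Simplified stand-in: the predicted position is closer to the junction."""
    return math.dist(future, junction) < math.dist(cur, junction)


future = predict_position((0.0, 0.0), (3.0, 4.0), speed=10.0, dt=1.0)
print(future, approaching((3.0, 4.0), future, (30.0, 40.0)))
```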
The centralized SDN controller or the local controllers set a predefined threshold value to count the vehicles approaching candidate intersections on the road segments. If $M$ is higher than the threshold, which is set to 0.65, the vehicle is moving towards the intersection; otherwise, it is moving away from it. The number of vehicles approaching the candidate intersection can be predicted as in Eq. (15), where the total is taken over the vehicles on all road segments $q$ leading to intersection $J_m$, and $N^{e}_{J_m}$ is the number of vehicles entering or approaching intersection $J_m$ on road segment $e$. Based on Eq. (15), the road segment with a higher value of $MTI_{norm}$ has a higher chance of forwarding data packets successfully. An exponentially weighted moving average (EWMA) is used in Eq. (16) to ensure that $MTI_{norm}$ is not affected by sudden changes. The EWMA coefficient is set to 0.75 based on the simulation results.
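The normalization and smoothing steps can be sketched as follows. Two assumptions are made here and should be checked against Eqs. (15) and (16): $MTI_{norm}$ is taken as the share of approaching vehicles on one segment relative to all segments leading to the intersection, and the coefficient 0.75 is assumed to weight the previous smoothed value. Both function names are hypothetical.

```python
def mti_norm(approaching_on_segment: int, approaching_all_segments: int) -> float:
    """Assumed form: share of approaching vehicles contributed by one segment."""
    if approaching_all_segments <= 0:
        return 0.0  # no vehicles anywhere: avoid division by zero
    return approaching_on_segment / approaching_all_segments


def ewma(previous: float, sample: float, alpha: float = 0.75) -> float:
    """EWMA with alpha on the history (assumption) and 1 - alpha on the sample."""
    return alpha * previous + (1.0 - alpha) * sample


smoothed = ewma(previous=0.4, sample=mti_norm(6, 24))
print(smoothed)
```

With the history weighted at 0.75, a single noisy sample moves the estimate by only a quarter of the gap, which is the damping effect the text describes.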
3. Delay estimation: The estimated delay between two intersections can be computed based on the connectivity of the vehicle links. The carry-and-forward mechanism is used if there is no link in the vicinity of the packet-carrying vehicle. In this case, the delay of data packets depends on the velocity of the carrying vehicle and the length of the road segment traveled by that vehicle. If the road segment is connected, the delay of data packets depends on the number of hops traversed on the road segment. Therefore, the estimated delay between the two intersections $J_n$ and $J_m$ can be computed as in Eq. (17), [...] forming the data transmission path $v_i, v_j \in V_v$ between the two intersections $J_n, J_m \in V_J$ and obtained as in Eq. (18) [46]. Therefore, based on the information collected from the vehicles, the centralized SDN controller can compute the latency $D_{ete}(J_n, J_m)$ for each road segment $r_{mn}$. In the computation of the delay, an exponentially weighted average with a coefficient of 0.75 is used. Finally, $D_{ete}(J_n, J_m)$ is normalized to (0,1) using Eq. (21).
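The two delay regimes can be contrasted in a small sketch. It is a simplification under explicit assumptions, not the paper's delay model: a connected segment is assumed to cost a constant per-hop delay, and a disconnected one is assumed to cost the segment length divided by the carrier's speed. The function name and all parameters are hypothetical.

```python
import math


def segment_delay(length_m: float, carrier_speed_mps: float, connected: bool,
                  hop_count: int, per_hop_delay_s: float) -> float:
    """Estimated delay (seconds) to cross one road segment."""
    if connected:
        # Multi-hop forwarding: delay grows with the number of hops.
        return hop_count * per_hop_delay_s
    if carrier_speed_mps <= 0:
        return math.inf  # a stationary carrier never delivers the packet
    # Carry-and-forward: the packet travels at the vehicle's speed.
    return length_m / carrier_speed_mps


print(segment_delay(500.0, 10.0, False, 0, 0.002))  # carried by a vehicle
print(segment_delay(500.0, 10.0, True, 4, 0.002))   # forwarded over 4 hops
```

The gap between the two printed values illustrates why connected segments are strongly preferred when a delay bound applies.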
Fuzzy system design for intersection selection strategy
Fuzzification, fuzzy inference, and defuzzification are the main parts of a fuzzy inference system (FIS) [47]. In the fuzzification process, a set of crisp input values is converted to the corresponding fuzzy sets. This process depends on the membership functions defined separately for each input. The fuzzy membership functions of the inputs are shown in Fig. 3. The ranking of the intersections according to the fuzzy rule base (i.e., IF/THEN rules) is shown in Table 3. In a rule, the IF part is called the antecedent, and the THEN part is called the consequent. Each rule combines the input variables and yields a fuzzy decision output expressed by the linguistic variables {VeryLow(VL), Low(L), Medium(M), High(H), VeryHigh(VH)}. For example, if the curve distance is far, the latency is high, and the number of vehicles moving towards the intersection is low, the candidate intersection has a much lower chance of being selected in the data routing path. The utilities in Table 3 are assigned so that intersections whose criteria values would compromise the quality-of-service parameters receive a lower utility. Moreover, to reduce end-to-end delay, more utility is given to intersections with the lowest estimated delay and the closest proximity to the destination vehicle. Likewise, intersections with a higher $MTI_{norm}$ value, which is an essential factor in providing a higher delivery ratio, receive more utility. In contrast, intersections that score poorly on these criteria receive a lower utility.
Finally, the fuzzy output is converted to a numerical value using the output membership functions depicted in Figs. 4 and 5 and the Last of Maxima (LOM) defuzzification method. The crisp value in the LOM method is the largest element of the output domain at which the output fuzzy set reaches its maximum membership. Here, LOM gives the utility of a candidate intersection; a higher value indicates a better intersection. This method is based on the max-membership principle and is defined as follows.
$$\mu_C(x^*) \geq \mu_C(x), \quad \forall x \in X \qquad (22)$$
where $x^*$ is the defuzzified value at the height of the output fuzzy set $C$, which is taken as the utility of intersection selection and relay selection at the respective level of our proposed fuzzy systems.
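As an illustration of Last of Maxima defuzzification, the following Python sketch works on a sampled output fuzzy set. The sampling grid, the membership values, and the function name are hypothetical; the actual output membership functions are those of Figs. 4 and 5, which are not reproduced here.

```python
def last_of_maxima(xs, memberships, tol=1e-9):
    """Return the largest x whose membership equals the maximum membership."""
    if not xs or len(xs) != len(memberships):
        raise ValueError("xs and memberships must be non-empty and equally long")
    peak = max(memberships)
    return max(x for x, mu in zip(xs, memberships) if peak - mu <= tol)


# Hypothetical sampled output fuzzy set with a plateau between 0.5 and 0.75.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
mu = [0.1, 0.6, 0.8, 0.8, 0.3]
utility = last_of_maxima(xs, mu)
```

When the maximum membership is reached on a plateau, LOM picks its right edge, so ties are resolved in favor of the larger utility.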

Forwarding node selection strategy
This section describes selecting the relay nodes that make up the data transmission path from source to destination. The considered criteria and fuzzy system processing are described below.

Forwarding vehicle selection criteria
1. Load capacity: To mitigate the effect of node buffer overflow along the data routing path, the size of the remaining buffer is used as a criterion for next-hop node selection. Similar to [48], the residual buffer size of a node is computed by Eq. (23), where $R_c$ and $T_c$ indicate the current buffer size and the maximum buffer size of node $v_i$ at time $t$, respectively. The EWMA method with a coefficient of 0.75 is employed in the load capacity calculation, as in Eq. (24).

2. Angular orientation: In our strategy, the angular orientation used to select the next-hop node is computed using Eq. (25) [45].
where $(x_m, y_m)$ is the location of intersection $J_m$, $(x_i, y_i)$ denotes the position of forwarding vehicle $v_i$, and $(x_j, y_j)$ is the position of neighboring node $v_j$. Assigning higher priority to the neighbor with the minimum angle yields higher link stability between vehicles. By Eq. (26), the angular orientation is normalized to the range (0,1).
where the first term is the angular orientation between vehicle $v_i$ and neighbor $v_j$, and the $\max$ term is the maximum angle among the one-hop neighbors of vehicle $v_i$.
3. Euclidean distance: The normalized Euclidean distance used for selecting the next-hop node is given by Eq. (27).
where $(x_i, y_i)$ and $(x_j, y_j)$ are the positions of the current vehicle $v_i$ and its neighbor node $v_j$, respectively, and $R$ denotes the communication range.
4. Residual bandwidth: $RBW_i$ is one of the routing metrics fed to the fuzzy system in the next-hop selection policy of our scheme and is computed by Eq. (28) [43].
where $C_{rate}$ is the channel data rate, and the other term denotes the total data generation rate, computed as the sum of the bandwidth consumed by the forwarding vehicle and its neighbors, including the MAC layer overhead, ACKs, and retransmissions. The EWMA method is used to account for traffic changes: the average data generation rate at time $t$ is obtained from the data generation rates at times $t$ and $t-1$, and the configuration parameter $a$ is set to 0.4.
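The geometric and bandwidth criteria above can be illustrated with a short Python sketch. It rests on assumptions that the text does not confirm: the angle is taken at the forwarding vehicle between the directions to the intersection and to the neighbor, the residual bandwidth is the channel rate minus the smoothed total rate, and the parameter $a = 0.4$ weights the current sample. All function names and sample values are hypothetical.

```python
import math


def angle_between(origin, target, neighbor):
    """Angle in radians at `origin` between the directions to `target`
    (the intersection) and to `neighbor` (the candidate next hop)."""
    ax, ay = target[0] - origin[0], target[1] - origin[1]
    bx, by = neighbor[0] - origin[0], neighbor[1] - origin[1]
    norm = math.hypot(ax, ay) * math.hypot(bx, by)
    if norm == 0.0:
        return 0.0
    cos_theta = max(-1.0, min(1.0, (ax * bx + ay * by) / norm))
    return math.acos(cos_theta)


def normalized_distance(p, q, comm_range):
    """Euclidean distance divided by the communication range R, capped at 1."""
    return min(math.dist(p, q) / comm_range, 1.0)


def smoothed_rate(previous_avg, current_rate, a=0.4):
    """EWMA of the data generation rate (assumed: `a` weights the new sample)."""
    return a * current_rate + (1.0 - a) * previous_avg


def residual_bandwidth(channel_rate, total_rate):
    """Assumed form: channel data rate minus the consumed rate, floored at 0."""
    return max(channel_rate - total_rate, 0.0)


vehicle, intersection = (0.0, 0.0), (100.0, 0.0)
neighbors = {"ahead": (50.0, 0.0), "sideways": (0.0, 50.0)}

angles = {k: angle_between(vehicle, intersection, p) for k, p in neighbors.items()}
max_angle = max(angles.values())
normalized_angles = {k: (a / max_angle if max_angle else 0.0)
                     for k, a in angles.items()}

distance_score = normalized_distance((0.0, 0.0), (150.0, 200.0), comm_range=500.0)
average_rate = smoothed_rate(previous_avg=1.0e6, current_rate=2.0e6)
rbw = residual_bandwidth(channel_rate=6.0e6, total_rate=average_rate)
```

In this toy layout the neighbor straight ahead gets a normalized angle of 0 and the one at a right angle gets 1, matching the stated preference for the neighbor with the minimum angle.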

Fuzzy system design for forwarding vehicles selection strategy
Our forwarding vehicle selection scheme is a zero-order TSK-FLS with four inputs, one output, and 24 rules. In the fuzzification process, four quantifiable input parameters (i.e., crisp values), namely load capacity, angular orientation, Euclidean distance, and residual bandwidth, are converted to linguistic variables expressed by {Low, High}, {Small, Large}, {Small, Large}, and {Low, Medium, High}, respectively, as shown in Fig. 6. In addition, the fuzzy rule base (i.e., IF/THEN rules) in Table 4 is defined for ranking the one-hop neighboring vehicles. The output linguistic variables are denoted as {VeryLow(VL), Low(L), Medium(M), High(H), VeryHigh(VH)}. Processing delay (affected by Euclidean distance and congestion) as well as bottleneck formation, buffer overflows, and buffering delays (affected by limited bandwidth and congestion) jeopardize the quality-of-service parameters. Therefore, to fulfill the application requirements, Table 4 is designed so that vehicles that compromise these parameters are not assigned a high utility. Finally, Fig. 7 shows the output membership function used in the defuzzification process to obtain the crisp values. As in the first level, the defuzzification process at this level uses the LOM method.
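The rule count follows from the label sets: 2 × 2 × 2 × 3 = 24 antecedent combinations. The Python sketch below enumerates them and computes a firing strength for each as the product of the membership degrees. The linear and triangular membership shapes and the input values are hypothetical stand-ins for Fig. 6, and the consequents of Table 4 are not modeled.

```python
from itertools import product

LABELS = {
    "load_capacity": ("Low", "High"),
    "angular_orientation": ("Small", "Large"),
    "euclidean_distance": ("Small", "Large"),
    "residual_bandwidth": ("Low", "Medium", "High"),
}


def fuzzify(inputs):
    """Map each crisp input in [0, 1] to membership degrees per label
    (hypothetical linear/triangular shapes)."""
    degrees = {}
    for name, labels in LABELS.items():
        x = inputs[name]
        if len(labels) == 2:
            low, high = labels
            degrees[name] = {low: 1.0 - x, high: x}
        else:
            degrees[name] = {
                "Low": max(0.0, 1.0 - 2.0 * x),
                "Medium": max(0.0, 1.0 - abs(2.0 * x - 1.0)),
                "High": max(0.0, 2.0 * x - 1.0),
            }
    return degrees


def firing_strengths(inputs):
    """Firing strength of every rule antecedent, using the product operator."""
    degrees = fuzzify(inputs)
    names = list(LABELS)
    strengths = {}
    for combo in product(*(LABELS[name] for name in names)):
        strength = 1.0
        for name, label in zip(names, combo):
            strength *= degrees[name][label]
        strengths[combo] = strength
    return strengths


strengths = firing_strengths({
    "load_capacity": 0.8,
    "angular_orientation": 0.2,
    "euclidean_distance": 0.3,
    "residual_bandwidth": 0.9,
})
best_rule = max(strengths, key=strengths.get)
```

Because the hypothetical membership functions of each input sum to one, the 24 firing strengths also sum to one, which makes them easy to sanity-check.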

Tuning fuzzy membership functions
Considering the dynamics of vehicular environments and their unique characteristics, even type-1 fuzzy methods cannot effectively deal with the ambiguity of such environments. Therefore, our proposed scheme tunes the output membership functions by considering other aspects of vehicular environments and road topology characteristics to achieve higher performance in terms of the quality-of-service parameters. In our scheme, the SARSA learning algorithm [49] is employed for tuning the consequent fuzzy output membership functions of the intersection and forwarding node selection strategies. SARSA learning allows agents to learn a policy online while interacting with their environment and is defined by the quintuple $(s_t, a_t, r_t, s_{t+1}, a_{t+1})$. Considering the whole network as the environment, the centralized SDN controller and the RSUs are defined as our agents due to their ability to compute data routing paths. The learning task is to find the best parameters for the output fuzzy membership functions in both strategies. This is done by observing the current state $s_t$ of the environment and taking action $a_t$ based on the agents' own policies. The state-space, action-space, and immediate rewards in our policies are defined below.
State-space: Each combination of antecedent parts in the intersection and forwarding vehicle selection policies is considered a state of the agent, shown as $S_J = \{s_1^J, s_2^J, \ldots, s_k^J\}$ for the intersection selection policy and as a corresponding set for the forwarding vehicle selection policy.
Reward: As a utility, the agents receive the rewards $r^J_{n,m}(t)$ (i.e., the intersection selection reward) and $r^V_{i,j}(t)$ (i.e., the relay node selection reward), and transition to new states. Then, the agents update their current policies at the intersection level ($J$) and the vehicle level ($V$) according to the reward received, in order to perform the optimal action. The reward functions at both levels are explained below.
(i) Intersection reward: In the reward function of the intersection selection strategy, $R_G$ is the ratio of the green time to the intersection cycle time, $R_c$ is the number of vehicles passing the intersection during the green light, and $R_D$ is the total number of vehicles stopped on all the roads at candidate intersection $J_m$, computed as Eq. (31) [50]. A vehicle whose velocity is less than $v_{min}$ is assumed to be waiting. $D_{lk}$ indicates the number of waiting vehicles on road $l$ and lane $k$.
(ii) Vehicle reward: For the fuzzy output membership function in the forwarding vehicle selection strategy, the reward is based on the stability of the neighboring vehicle $v_j$ within the transmission range $R$ and is computed based on [51], where $t$ is the current time and $\Delta t$ is the time interval since the previous Hello message was sent. $n_{v_j}(t)$ denotes the local density of vehicle $v_j$, divided by the optimal local density found in [52]. Finally, $c$, a reputation index, is the number of times a node between the two intersections has been selected as a next-hop node. The reward function is divided by one to avoid ambiguity.
Each rule $R_i$ in the fuzzy SARSA strategy has the form: IF $x_1$ is $L_{i1}$ and ... and $x_n$ is $L_{in}$, THEN $a_{i1}$ with $Q(i,1)$ or ... or $a_{im}$ with $Q(i,m)$. Here, $L_i = L_{in} \times \cdots \times L_{i2} \times L_{i1}$ is the fuzzy set of the $i$-th rule, $m$ is the number of possible discrete actions, which is the same for both levels, $a_{ij}$ denotes the $j$-th candidate action, and $Q(i,j)$ is the Q-value of the $j$-th action in the $i$-th rule. In the intersection selection strategy, $x_1$, $x_2$, and $x_3$ are the distance from the candidate intersection to the destination intersection, the estimated delay, and the number of vehicles moving towards the intersection, respectively. In the forwarding vehicle selection strategy, load capacity, Euclidean distance, angular orientation, and residual bandwidth are considered as $x_1$, $x_2$, $x_3$, and $x_4$, respectively. At both fuzzy SARSA levels, action $a_{ij}$ in rule $i$ is chosen based on the softmax action selection rule as Eq. (33) [49].
The firing strength of each rule $i$ is calculated as Eq. (34). In addition, the temperature in Eq. (33) is a positive variable.
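A softmax (Boltzmann) action selection over the Q-values of one rule can be sketched in Python as follows. This uses the common textbook form in which each action's probability is proportional to the exponential of its Q-value divided by the temperature; whether Eq. (33) matches it term by term cannot be confirmed from the text, and the function names and sample values are hypothetical.

```python
import math
import random


def softmax_probabilities(q_values, temperature):
    """Boltzmann distribution over actions; a higher temperature flattens it."""
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")
    peak = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - peak) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(q_values, temperature, rng=random):
    """Sample the index of a candidate action according to the softmax rule."""
    probabilities = softmax_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probabilities, k=1)[0]


# Hypothetical Q-values of three candidate consequents of one rule.
probabilities = softmax_probabilities([1.0, 2.0, 3.0], temperature=1.0)
uniform = softmax_probabilities([0.5, 0.5], temperature=2.0)
chosen = select_action([0.0, 0.0, 1000.0], temperature=0.1,
                       rng=random.Random(0))
```

A low temperature makes the choice nearly greedy, while a high temperature keeps exploring, which is the usual reason to prefer softmax over a fixed greedy choice during learning.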
The rule with the highest firing strength is considered as the output at both levels of the fuzzy systems. The Q-table is updated according to Eq. (35) at predefined intervals.
In Eq. (35), the two coefficients are the learning rate and the discount factor. The updated Q-values are used in the fuzzy calculations to adjust the output membership functions. For example, in rule 3 of our fuzzy intersection selection scheme, the consequent part is selected as follows.
Similarly, in rule 1 of our forwarding vehicle selection policy, considering SARSA learning, the fuzzy output membership function is obtained as follows.

The same process to update output functions will be applied to all fuzzy rules. Consequently, based on updated Q-values, the actions with the maximum Q-values will be chosen as the consequent fuzzy parts at both levels. These Q-values will vary according to the periodic update of the Q-table (every 500 ms). Algorithm 2 shows the process of the fuzzy SARSA learning algorithm in tuning output fuzzy membership functions of the intersection selection and forwarding vehicle selection strategies. It should be noted that R i represents the maximum value of the rules (i.e., the maximum value of the fuzzy antecedent parts in Tables 3 and 4).
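The update-then-select cycle described here can be illustrated with the standard tabular SARSA rule, in which the Q-value moves toward the reward plus the discounted Q-value of the next state-action pair. The sketch below is a simplification under stated assumptions: the fuzzy variant may additionally weight the update by firing strengths, the learning rate and discount factor values are not given in this section, and the rule names and Q-table layout are hypothetical.

```python
def sarsa_update(q_table, state, action, reward, next_state, next_action,
                 learning_rate=0.1, discount=0.9):
    """Standard on-policy SARSA update of one Q-table entry."""
    td_target = reward + discount * q_table[next_state][next_action]
    q_table[state][action] += learning_rate * (td_target - q_table[state][action])
    return q_table[state][action]


def greedy_consequents(q_table):
    """For every rule, pick the action index with the maximum Q-value."""
    return {rule: max(range(len(qs)), key=qs.__getitem__)
            for rule, qs in q_table.items()}


# Five candidate consequents per rule: VL, L, M, H, VH (indices 0..4).
q_table = {
    "rule_1": [0.0, 0.0, 0.5, 0.0, 0.0],
    "rule_3": [0.0, 0.0, 0.0, 0.0, 0.0],
}

updated = sarsa_update(q_table, state="rule_3", action=3, reward=1.0,
                       next_state="rule_1", next_action=2)
consequents = greedy_consequents(q_table)
```

In a running system the update would be triggered at the Q-table refresh interval (every 500 ms in the text), after which the greedy consequents replace the previous output labels of each rule.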

Proposed SDN-enabled vehicular data routing algorithm
The best routing path is computed in the centralized SDN controller or the RSUs at time $t$ using the Dijkstra algorithm. Assuming that packet $k$ is sent from source $v_s(k)$ to destination $v_d(k)$, Algorithm 3 summarizes our SDN-enabled data routing process. Given the local density, two cases are considered in the routing process: 1) the source node is within the transmission range of an RSU, and 2) the source node is outside the RSU's communication range. In the first case, $v_s(k)$ forwards a path request message directly to the RSU (line 3). The message includes the source and destination IDs, the road ID, the data packet size, and the delay constraint of the application. Upon receiving the route request packet, the controller performs the routing algorithm to determine the most suitable path between the source and destination nodes (lines 5-6).
The request message is transmitted to the SDN controller if the destination is outside the local controller's communication range (line 8). The SDN controller computes the transmission path, including a sequence of intersections and relay nodes. Then, the obtained route is sent back to the source vehicle in a reply packet through the local controller (lines 9-10). If the destination of packet $k$ is in the neighbor table, packet $k$ is delivered to it directly (lines 13-14). Finally, if the source node is not within the local controller's communication range, the forwarding node selects the best relay node according to the fuzzy results (lines 16-17). If the forwarding vehicle has no neighbors within its communication range, it uses a carry-and-store mechanism. This process continues until the carrier vehicle reaches an appropriate neighbor vehicle or the controller node located at the next intersection (line 19).
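The path computation at the controller can be illustrated with a compact Dijkstra implementation over an intersection graph. How HIFS maps fuzzy utilities to edge weights is not stated in this section, so the sketch assumes a cost of one minus the utility (higher utility means cheaper edge); the graph, the utilities, and the function name are hypothetical.

```python
import heapq


def dijkstra(graph, source, target):
    """Return (path, cost) of the cheapest path, or (None, inf) if unreachable.
    `graph` maps a node to a dict of {neighbor: non-negative edge cost}."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbor, cost in graph.get(node, {}).items():
            candidate = d + cost
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if target not in dist:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[target]


# Hypothetical fuzzy utilities of road segments between intersections.
utilities = {("J1", "J2"): 0.9, ("J2", "J4"): 0.8,
             ("J1", "J3"): 0.4, ("J3", "J4"): 0.9}
graph = {}
for (a, b), utility in utilities.items():
    graph.setdefault(a, {})[b] = 1.0 - utility  # assumed cost mapping

path, cost = dijkstra(graph, "J1", "J4")
```

With these values the route through J2 is preferred because both of its segments carry a high utility, whereas the route through J3 is penalized by its weak first segment.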

Complexity analysis
The complexity of our HIFS scheme compared to other methods was analyzed with respect to 1) collecting information, 2) calculating routing criteria for the intersection and forwarding node strategies, 3) fuzzy calculations, 4) the reinforcement learning algorithm, and 5) the Dijkstra algorithm; the results are shown in Table 5. Here, $V_J$ is the set of all intersections, and $E_J$ is the set of road segments connected to intersections. Also, $V_v$ represents a finite set of nodes, and $E_v$ denotes a set of asymmetric wireless links among neighboring vehicles. The fuzzification, rule base (minimum and multiplication operations), and defuzzification processes are included in the fuzzy complexity calculations. Here, $n_i$, $n_f$, $n_r$, and $n_o$ are the number of inputs, the number of fuzzy sets, the number of fuzzy rules, and the number of discrete fuzzy outputs, respectively. Also, $n_s$ and $n_a$ are the sizes of the state-space and the action-space, respectively. Finally, the complexity also depends on the number of levels that use the fuzzy system, and $q$ denotes the number of lanes at each intersection.

Performance analysis
This section evaluates the performance of our HIFS scheme in various scenarios against different state-of-the-art methods, including real-time evaluation of road conditions using CP packets (e.g., ITAR-FQ [40]), a traffic light-aware strategy (e.g., MISR [37]), and an SDN-enabled routing scheme (e.g., GLS [25]). The MISR scheme is a traffic light-based routing scheme in which intersections and next-hop nodes are selected locally. It is compared to our HIFS strategy to indicate the performance difference between a local traffic light-aware intersection selection policy and a global one. The ITAR-FQ scheme is a traditional CP-based scheme that uses type-1 fuzzy logic and Q-learning. Furthermore, the GLS scheme uses type-1 fuzzy logic to select a sequence of intersections and vehicles by integrating it into the software-defined network paradigm. Thus, comparing our scheme with the ITAR-FQ and GLS strategies shows their ability to deal with the ambiguity of the vehicular environment and the effect of considering traffic lights. Table 6 compares the benchmark schemes with our proposed HIFS scheme regarding the criteria and strategies used at the next-hop and intersection selection levels. Furthermore, Table 6 shows the ability of the evaluated schemes to handle the uncertainty of vehicular environments.

Simulation settings
Our scheme and the compared methods were simulated using NS-3.29 [53] as the network simulator and SUMO [54] as the mobility generator to produce realistic vehicle mobility. The simulated scenario describes real urban roads in Los Angeles, USA, with an area of 3000 m by 3000 m extracted from OpenStreetMap (OSM) (see Fig. 8). The OSM file is converted to a SUMO network file (sumo.cfg) compatible with the NS-3 mobility model. The considered scenario includes different numbers of lanes, intersections, and varied traffic light cycles. Moreover, all types of vehicles, including trucks, cars, motorcycles, and buses, are considered with their characteristics, including acceleration, vehicle length, and maximum velocity. The car-following mobility model was used in SUMO for the movement of vehicles. In this model, the velocity of a node depends on the velocity of the vehicular node ahead of it. Before the simulation starts, the vehicular nodes are distributed randomly. As the simulation starts, each vehicle moves on the roads with a velocity between 10 km/h and 90 km/h. The Nakagami-m model was used as the propagation model in the physical layer. RSUs were located at intersections as fixed nodes, and the OpenFlow protocol was installed on them. Vehicles, RSUs, and BSs have multiple user interfaces, including LTE, WiMAX, and IEEE 802.11p. The data packets are forwarded via IEEE 802.11p to reduce the cost of vehicular communication, while the control messages are transmitted via the LTE interface. Similarly, RSUs use the LTE interface to send their ACK messages. BSs use WiMAX as their wireless radio interface.

The transmission radius was varied according to the interfaces. Using UDP, ten random source-destination pairs with a data packet size of 512 bytes and a generation rate of two packets per second were produced as foreground traffic by Constant Bit Rate (CBR) traffic flows. The foreground traffic was generated 30 s after the simulation started to reduce the effect of transient changes. Also, the last foreground traffic was sent 50 s before the simulation ended. Background traffic, including ten random source-destination pairs with various data packet sizes, is also created to interfere with the foreground traffic. That is why the source-destination pairs of the foreground and background flows overlap. To ensure multi-hop communication, the source and destination of each pair in our scheme should be at least two hops apart. It is assumed that each vehicle (e.g., a source node or an intermediate node) can obtain the positional information of the destination by querying a centralized administration unit such as RLSMP [55]. Each point of each routing scheme in all graphs is an average of 30 simulations, where the error bars indicate the 95% confidence intervals. Each simulation duration is set to 500 s. The tunable simulation parameters are detailed in Table 7.
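For readers reproducing the error bars, a 95% confidence interval over 30 independent runs can be computed from the sample mean and standard deviation with the Student-t quantile for 29 degrees of freedom (approximately 2.045). The Python sketch below assumes this conventional t-based interval, which the text does not specify, and uses made-up packet delivery ratios.

```python
import math
import statistics

T_CRITICAL_DF29 = 2.045  # two-sided 95% Student-t quantile, 29 degrees of freedom


def mean_with_ci(samples, t_critical):
    """Return (mean, half_width) of the t-based confidence interval.
    `t_critical` must match len(samples) - 1 degrees of freedom."""
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are required")
    mean = statistics.fmean(samples)
    half_width = t_critical * statistics.stdev(samples) / math.sqrt(n)
    return mean, half_width


# Hypothetical packet delivery ratios from 30 simulation runs.
samples = [0.80, 0.82, 0.84] * 10
mean, half_width = mean_with_ci(samples, T_CRITICAL_DF29)
interval = (mean - half_width, mean + half_width)
```

The half-width shrinks with the square root of the number of runs, so quadrupling the runs would roughly halve the error bars.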

Performance metrics
• Packet Delivery Ratio (PDR): PDR is defined as the ratio of the number of data packets successfully received at the destination vehicular node to the number of data packets sent by the source vehicle.
• Average End-to-End Delay (AE2ED): AE2ED is the average end-to-end delay of data packets sent from the source vehicle and received at the destination.
• Path Length (PL): PL is computed as the sum of the curve distances of the vehicle links that make up the data transmission path from the source to the destination node.
• Normalized Routing Overhead (NRO): NRO is defined as the ratio of the size of the control messages sent to the size of the data packets successfully received at the destination.
• Average Throughput (AT): AT is defined as the total amount of data sent by the source node and received at the destination node, divided by the time elapsed until the final packet reaches the destination.
• Link Failure Ratio (LFR): LFR indicates the number of link failures during the routing process. This measure shows the efficiency of a routing scheme in terms of reliability.
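The ratio-based metrics above can be computed from a handful of per-run counters, as in the following Python sketch. The counter names and the sample values are hypothetical, and the throughput is expressed in bits per second as one possible unit; path length and link failure ratio are omitted because they need per-link traces.

```python
from dataclasses import dataclass


@dataclass
class RunCounters:
    data_packets_sent: int
    data_packets_received: int
    data_bytes_received: int
    control_bytes_sent: int
    total_delay_s: float  # sum of end-to-end delays of the received packets
    elapsed_s: float      # time until the final packet reached the destination


def packet_delivery_ratio(c):
    return c.data_packets_received / c.data_packets_sent if c.data_packets_sent else 0.0


def average_e2e_delay(c):
    return c.total_delay_s / c.data_packets_received if c.data_packets_received else 0.0


def normalized_routing_overhead(c):
    return c.control_bytes_sent / c.data_bytes_received if c.data_bytes_received else float("inf")


def average_throughput_bps(c):
    return 8.0 * c.data_bytes_received / c.elapsed_s if c.elapsed_s > 0 else 0.0


# Hypothetical run: 1000 packets of 512 bytes sent, 800 received.
run = RunCounters(data_packets_sent=1000, data_packets_received=800,
                  data_bytes_received=800 * 512, control_bytes_sent=102400,
                  total_delay_s=160.0, elapsed_s=400.0)

pdr = packet_delivery_ratio(run)
delay = average_e2e_delay(run)
nro = normalized_routing_overhead(run)
throughput = average_throughput_bps(run)
```

Note that NRO is normalized by the data that actually arrived, so a lower PDR raises the overhead even when the control traffic stays constant.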

Evaluation scenarios
Two sets of tests were employed to assess the effectiveness of the routing schemes, with the following parameters varied.
• Vehicle density changes: Vehicle density was changed from 50 to 500 vehicles. The traffic light duration in this test is set to 60 s.
• Different traffic light durations: The traffic light duration was varied from 60 to 150 s. The time ratio of green to red traffic lights is 1:1, meaning that the first half of the traffic light period is the green light time and the second half is the red light time. The number of vehicles in this test is set to 400.
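The 1:1 green/red split can be expressed as a small helper that returns the light state at a given time. The Python sketch below assumes that the stated traffic light duration is the full cycle length and that every cycle starts with green at time zero; the function names are hypothetical.

```python
def light_state(time_s, cycle_s):
    """Green during the first half of the cycle, red during the second half."""
    if cycle_s <= 0:
        raise ValueError("cycle length must be positive")
    return "green" if (time_s % cycle_s) < cycle_s / 2 else "red"


def green_ratio(cycle_s, green_s):
    """Ratio of the green time to the cycle time."""
    return green_s / cycle_s


states_60 = [light_state(t, 60) for t in (10, 30, 75)]
state_150 = light_state(100, 150)
ratio = green_ratio(cycle_s=60, green_s=30)
```

Under this split the green ratio is 0.5 for every tested duration, so lengthening the cycle changes only how long queues build up behind a red light, not the share of green time.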

Packet delivery ratio
The performance of the ITAR-FQ, MISR, and GLS methods compared to our HIFS scheme is shown in Fig. 9 with respect to the packet delivery ratio for different numbers of vehicles and various traffic light durations. For various vehicle densities, as shown in Fig. 9a, the packet delivery ratio increases in all compared methods at densities from 50 to 300 vehicles. Then, from 300 to 500 vehicles, the GLS method and our HIFS scheme increase slightly, while the ITAR-FQ and MISR methods decline. At low densities, our HIFS scheme and the GLS scheme reduce the likelihood of nodes being trapped in the local maxima problem by taking a global perspective and using microscopic and macroscopic information. The MISR and ITAR-FQ schemes, with their limited visibility, make sub-optimal routing decisions. Therefore, the difference between them and our scheme is significant at low densities. At higher densities, limited bandwidth and competition among nodes restrict the performance of all routing methods. Our scheme adapts its routing decisions to traffic light conditions, resulting in a higher packet delivery ratio than all compared methods. As shown in Fig. 9a, the performance improvement of our proposed HIFS scheme becomes more significant as vehicle density increases. Since long queues of vehicles waiting behind red lights are unlikely to form at low densities, the difference between our HIFS scheme and the GLS method is negligible there.
Figure 9b shows the influence of various traffic light durations on the performance of all compared schemes. As the traffic light duration increases, the packet delivery ratio of all schemes declines. In addition to the aforementioned shortfalls of the ITAR-FQ scheme, its lack of consideration of traffic light effects when selecting intersections further reduces its performance.
The GLS scheme also suffers from not considering traffic light effects during the data routing process, which reduces its performance as the traffic light duration increases. By tuning the environment-dependent fuzzy membership functions, our scheme can deal with dynamic vehicular environments bound to a road topology with different traffic light conditions. Hence, our HIFS scheme performs better than all compared methods under various traffic light conditions.

Average end-to-end delay
The effect of vehicle densities and various traffic light durations on the average end-to-end latency is shown in Fig. 10. Figure 10a indicates the advantages of our proposed HIFS scheme over the other methods in terms of average delay as a function of vehicle density. The end-to-end delay exhibits two different behaviors across vehicle densities. Initially, it decreases for all compared schemes at densities of 50 to 300 vehicles; it then increases at densities from 300 to 500. At low densities, nodes are trapped in the local maxima problem more often. Consequently, the end-to-end delay is higher in a sparse environment. Competition among nodes for the channel, a larger number of hops, and a higher congestion level are the main reasons for the increased average delay at high vehicle densities (i.e., from 300 to 500 vehicles). Figure 10a shows that our HIFS scheme performs significantly better than the MISR and ITAR-FQ methods, especially at low densities. Moreover, our HIFS scheme has the lowest delay at high densities compared to the traditional intersection-based routing schemes for the following reasons: 1) it uses buffering delay and limited bandwidth in the forwarding node selection strategy; 2) it considers traffic light effects in the intersection selection strategy; and 3) it has global visibility of the vehicular environment. The ITAR-FQ method increases the hop count because it considers vehicles in both directions and has limited visibility in next-hop node selection. Broadcasting CP packets, especially at high densities, further increases the latency of the ITAR-FQ scheme. Greedily choosing a high-density path increases the processing, transmission, and buffering delays due to high packet congestion and node competition. The GLS and MISR methods suffer from this problem. In addition, not considering the vehicles waiting behind red traffic lights adds to the delay of the GLS scheme at high densities.
Figure 10b compares the end-to-end latency of all routing schemes at various traffic light durations. According to Fig. 10b, the latency of all routing methods increases with the traffic light period. The GLS and ITAR-FQ methods incur further delays due to the increased number of unintended hops in the data routing process, because these methods cannot avoid selecting intersections with long queues of vehicles waiting behind red traffic lights. The results in Fig. 10b show that the data received at the destination in our HIFS scheme experiences a lower delay than in the other methods. This is because our HIFS scheme avoids paths with more hops by considering traffic light effects when selecting intersections.

Path length
[...] closer. The ITAR-FQ method further increases the path length by considering densities in both directions when selecting the next-hop nodes. The MISR method also increases the path length because its intersection selection mechanism disregards the distance of intersections. The GLS and HIFS schemes, which consider global traffic information, reduce the path length compared to the MISR and ITAR-FQ methods. Because the GLS method does not consider traffic light effects, it selects areas with a longer remaining red light duration. Consequently, the path length traveled by data packets increases. The adaptability of our scheme in determining intersections, by taking into account the impact of traffic lights, makes it perform better than the GLS scheme. As vehicle density increases and longer queues form behind red lights, the difference between our HIFS method and the GLS scheme grows. The influence of various traffic light durations on the path length of the ITAR-FQ, MISR, and GLS protocols and our proposed HIFS scheme is assessed in Fig. 11b.
The formation of longer queues of vehicles waiting behind red traffic lights is the leading cause of the longer path length as the traffic light duration increases. The GLS scheme prioritizes high-density areas without considering traffic light effects, resulting in a longer path length than our method. To avoid jeopardizing the next-hop node selection process and interrupting network connectivity, the proposed method, in addition to choosing intersections with proper connections, bypasses intersections with long queues formed behind red lights.

Normalized routing overhead
Figure 12 shows the communication overhead of all compared schemes at different vehicle densities and various traffic light durations. Figure 12a reveals that the communication overhead increases with vehicle density, since the increased demand for communication generates more control messages. The ITAR-FQ protocol has the highest routing overhead due to the broadcasting of CP packets. The SDN-based methods incur less overhead owing to their overall visibility: with a broader routing view and routes of higher stability, fewer control packets need to be broadcast for new path discovery. The routing overhead considered for the GLS method and the proposed method includes the messages transmitted between the SDN controller and the OpenFlow switches, i.e., packet_in and packet_out. By considering the attributes of the vehicular environment, our method adapts to uncertainty and ambiguity, which reduces route failures and thus the number of requests to discover new routes compared to the GLS method. Figure 12b shows the effect of various traffic light durations on routing overhead. Since the number of vehicles is fixed, the increased overhead in this case is due to the reduction in the packet delivery ratio of the routing schemes. Because routing overhead depends on the packet delivery ratio, the proposed method performs better than the other methods.

Average throughput
Figure 13 shows the network throughput of our scheme for different vehicle densities and traffic light periods compared to the MISR, ITAR-FQ, and GLS schemes. The overall throughput increases with vehicle density, as shown in Fig. 13a, because stable links among vehicles are more likely to form as the density increases, resulting in more data packets being transferred. The HIFS and GLS schemes achieve the highest throughput by extending routing visibility using software-defined networks. A local routing viewpoint and the lack of effective routing criteria for picking forwarding vehicles reduce the average throughput of the MISR approach. Moreover, although the ITAR-FQ strategy uses a bandwidth factor, the massive number of generated CP packets degrades its performance. By taking bandwidth and congestion into account simultaneously, our HIFS scheme routes packets through less congested areas with more available bandwidth. In addition, assigning more utility to intersections with a smooth distribution of vehicles further increases the network throughput of our scheme. Besides, our scheme deals with the ambiguity and uncertainty of vehicular environments by tuning the fuzzy output membership functions. For the reasons mentioned above, as seen in Fig. 13b, the performance advantage of our proposed method over the ITAR-FQ and GLS methods grows as the traffic light period increases.

Link failure ratio
The link failure ratio of the HIFS and GLS schemes for various vehicle densities, different traffic light durations, and varying vehicle velocities is illustrated in Fig. 14. Frequent topology changes caused by high vehicle velocities, the presence of obstacles and signal attenuation, the road topology being restricted by traffic lights, the lack of connectivity in sparse environments, and varying vehicular density are the main reasons for routing failures on the paths formed between the source and destination. As shown in Fig. 14a, the number of link failures decreases as vehicle density increases. The greater link availability in a dense environment results in links with higher stability. Consequently, the link failure ratio decreases. The most critical aspects in dealing with the frequent link failure in urban routing schemes [...] On the other hand, stable link indicators are considered in both methods. Thus, in both routing schemes, the node distribution at intersections is used as a decision-making characteristic for intersection selection. However, considering traffic light effects in our scheme reduces the link failure ratio compared to the GLS method. The link failure ratio results for various traffic light durations in Fig. 14b show that the difference between our HIFS scheme and the GLS scheme increases. With long queues of vehicles behind red traffic lights and given the dynamics of vehicles, fragmentation and partitioning on the other sides of the intersection increase, resulting in more link failures. By considering traffic light effects and road structures, our scheme avoids scattered connectivity at intersections. Finally, to analyze the effect of vehicle velocity on the link failure ratio, Fig. 14c shows the link failure ratio for vehicle velocities ranging from 25 km/h to 90 km/h with 50 vehicles. The duration of traffic lights is set to 60 s.
In this set of tests, a low vehicle density and a short traffic light duration are used to isolate the effect of velocity on the performance of the protocols. Figure 14c shows that, even though the strengths of our scheme are less pronounced in this setting, the HIFS scheme outperforms the GLS scheme, because the density, link stability, and vehicle movement direction indicators in our next-hop selection policy select the most appropriate routing path. The link failure ratio of our scheme at various vehicle velocities is reduced on average by 4.3% compared to the GLS method.

Discussion
The average performance improvements of our HIFS scheme over the other methods for all performance metrics, under various densities and traffic light durations, are outlined in Tables 8 and 9. The improvements reported in Table 8 show that our HIFS scheme performed better than the compared methods in all performance metrics for various densities. Moreover, the gains shown in Table 9 reveal that the differences between our HIFS scheme and the ITAR-FQ and GLS methods grow as the traffic light duration increases. Our HIFS scheme deals with the ambiguity and uncertainty of the vehicular environment by using the fuzzy reinforcement learning algorithm to tune the output membership functions based on the time-varying traffic light conditions. Therefore, besides demonstrating the shortcomings of traditional intersection-based routing methods, our scheme outperformed the SDN-enabled vehicular routing scheme in all performance metrics. Average gains obtained for both scenarios show that our HIFS scheme increases the packet delivery ratio on average by 48.75%, 54.63%, and 8.78%, increases the throughput by 48.66%, 53.79%, and 8.61%, reduces end-to-end delay by 33.35%, 46.14%, and 15.38%, reduces the path length by 25.25%, 36.47%, and 15.32%, and reduces normalized routing overhead by 37.09%, 49.79%, and 20.17%, compared to the MISR, ITAR-FQ, and GLS methods, respectively. Furthermore, the link failure ratio in our HIFS scheme was reduced on average by 8.31% compared to the GLS method.
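The reinforcement learning step discussed above can be sketched with the standard tabular SARSA update, Q(s, a) ← Q(s, a) + α [r + γ Q(s', a') − Q(s, a)]. The sketch below is a generic illustration and not the HIFS implementation: the state labels (traffic light phases), the action labels (adjustments to an output membership function), the reward value, and the learning parameters are all hypothetical.

```python
from collections import defaultdict


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """Apply one on-policy SARSA update and return the new Q(state, action)."""
    td_target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical example: the state is the traffic light phase and the action
# is how the output membership function is adjusted.
q_table = defaultdict(float)  # unseen (state, action) pairs start at 0.0
new_value = sarsa_update(
    q_table,
    state="red", action="shift_left",
    reward=1.0,                      # assumed reward, e.g. a delivered packet
    next_state="green", next_action="keep",
)
```

SARSA is on-policy: the update uses the action actually chosen in the next state rather than the greedy maximum, so the learned values reflect the exploration behaviour of the running policy.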

Conclusion
This paper proposed a hierarchical SDN-enabled intersection-based vehicular routing scheme that exploits complete access to vehicular environment information. In our scheme, candidate intersections were first prioritized by jointly considering the fuzzy values of the curve distance between intersections, the predicted number of nodes moving towards candidate intersections, and the estimated delay between two intersections. Based on the fuzzy results, a sequence of intersections with maximum utility was chosen from the source to the destination intersection. Then, within the selected intersection sequence, a relay sequence with maximum utility was chosen to participate in the data forwarding process by considering multiple fuzzy factors, including Euclidean distance, residual bandwidth, congestion, and angular orientation. A reinforcement learning algorithm was used at both levels to tune the routing policies according to the time-varying conditions. At the intersection selection level, the effect of road constraints, including traffic lights, was used to tune the output membership functions. Stability, reputation index, and density were also used to adjust the output membership function of the next-hop node selection policy. Extensive simulations were performed in an urban scenario using NS-3 and SUMO. The obtained results show that our proposed strategy leads to significant improvement in all routing performance metrics over the state-of-the-art schemes. Therefore, the effectiveness of considering road structures and traffic light conditions in our routing decisions was confirmed. Finally, addressing the effects of traffic lights on the ambiguity and uncertainty of regular and complex urban scenarios, as well as adaptively choosing the most suitable user interfaces while considering cost and quality of service, are interesting topics for our future work.
Author contribution All authors contributed equally to this manuscript.
Data availability Not applicable.

Declarations
Ethics approval The paper is original, and no other publishing house is considering it for publication. The paper reflects the authors' own research and analysis in a truthful and complete manner. All sources used are properly disclosed (correct citation).

Conflicts of interest
There is no conflict of interest.