Jointly Optimize Energy Harvest Time and Device Pairing for D2D Communications underlaying Cellular Network

Abstract
To sustain devices and prolong their working time, energy harvesting (EH) has been introduced into D2D communication networks, allowing each D2D equipment (DUE) to harvest radio frequency (RF) energy from nearby facilities. However, in such an EH-enabled D2D network, it is challenging to integrate EH with the device pairing mechanism, which is critical to network performance. To address this problem, we propose an algorithm that jointly optimizes the energy harvesting time, obtained in closed form, and the pairing of each DUE to maximize the throughput of the EH-enabled D2D network. In the proposed algorithm, each DUE goes through two mutually influenced stages, i.e., an EH stage and an information transmission stage, and the device pairing takes the energy status of the candidate DUEs into consideration. Numerical results demonstrate that the joint optimization algorithm significantly increases the throughput of the EH-enabled D2D network compared with benchmark solutions.


Introduction
With the growth of smart devices and 5G-related applications, device-to-device (D2D) communication has attracted enormous research interest [1,2]. D2D communication enables end devices in close proximity to communicate directly with each other rather than through the base station. This network architecture has become a key technology of the 5G system. In particular, in-band D2D communication underlaying the cellular network has been intensively studied, as it both relieves the scarcity of spectrum resources and expands network coverage [3]. In this in-band mode, each D2D equipment (DUE) shares spectrum resources with the cellular users [4][5][6].
One of the main differences between D2D communication and traditional cellular communication is that peer devices need to be paired in a D2D system [7]. An efficient pairing strategy often leads to effective content distribution among the devices and thus increases system capacity [8,9]. For example, Song et al. [8] analyzed a heuristic distance-based scheme that pairs the devices appropriately. Zhou et al. [9] employed a Gale-Shapley algorithm based on social perception to pair D2D devices and maximize the sum rate of the D2D users. In contrast to matching based on content distribution [8,9], other studies focus only on the effective matching of users to achieve better system performance [10][11][12]. Liu et al. [10] conceived a D2D matching scheme using Voronoi diagrams, in which a D2D link can only be established between adjacent areas. This scheme reduces co-channel interference and DUE power consumption, achieving higher network coverage and spectrum efficiency, and therefore makes D2D networks more practical in real scenarios. In [11], the authors used a coalition game algorithm to optimize and adjust the distribution of D2D devices, forming multiple disjoint coalitions, and used logistic regression to predict the probability of successful device pairing, which improved the quality of D2D service. In [12], a symmetric matching algorithm was adopted to form a stable and proportionally fair device pairing in a D2D-relay system, and the algorithm was verified to be Pareto optimal with respect to system performance.
Even though D2D communication has great potential when supported by the device pairing mechanisms mentioned above, D2D end devices are power constrained. In practice, frequent data transmission and reception quickly exhaust the energy of battery-powered D2D devices [13,14]. Thus, to extend the working time of D2D devices, energy harvesting (EH) technology has been applied to D2D networks [15,16]. Compared with other approaches for energy-limited devices, such as reducing energy consumption [17], EH is more effective because it brings a new source of energy to the D2D network. For example, the EH implementations in [15,16] allow D2D devices to collect energy from sources such as solar and radio frequency signals, which prolongs device lifetime and enables green D2D communication.
A few works have studied EH in D2D networks. Atat et al. [19,20] investigated improving spectral efficiency and managing interference in EH-enabled D2D networks. The work in [19] discussed realizing EH via spatial RF energy in D2D networks and studied a closed-form expression for the probability of activating the RF power conversion circuit. The work in [20] studied how EH affects the spectrum efficiency and the cellular spectrum efficiency of the EH-enabled D2D cellular network. Resource allocation for EH-based D2D communication has been studied in [21][22][23][24][25]. In [21][22][23], resource allocation in energy-harvesting D2D heterogeneous networks (EH-DHNs) was studied to maximize the data rate. Similarly, [24] discussed the resource allocation problem of cellular networks with an EH-enabled D2D underlay, maximizing throughput while meeting the signal-to-interference-plus-noise ratio (SINR) requirements of cellular users through joint time scheduling and power control. In [25], the EH time and the transmit power of relay devices were optimized under probabilistic interference constraints. Moreover, time allocation for EH-based D2D communication was researched in [26][27][28]. Yu et al. [26] considered multiple time slots per cycle, where the base station decides, according to the energy level, whether the current time slot is used for energy harvesting or D2D transmission. An iterative algorithm jointly optimizes the resource blocks and the transmission power of the devices to maximize the sum throughput of the D2D users without compromising the QoS of the cellular users. Wang et al. [27] allocated two time slots within a period, one for energy collection and one for information transmission, and combined time scheduling and power control to maximize the total throughput of a D2D network underlaying a cellular network while ensuring the reliability of the cellular users. Xu et al. [28] proposed a robust resource allocation algorithm to maximize energy efficiency for EH-based D2D communication underlaying a UAV-assisted network with imperfect channel and coordinate information. Similar to [27], the period in [28] is divided into two stages: energy harvesting and information transmission.
Existing works have comprehensively considered device pairing and EH-related problems in EH-enabled D2D communication, including the allocation of spectrum resources, power, and time slots [29][30][31]. However, few works study the mutual influence between device pairing and energy harvesting in EH-enabled D2D networks. In practice, equipping a D2D device with a dedicated EH module increases system complexity [18], because EH significantly complicates device pairing. For example, a DUE goes through two mutually influenced stages, i.e., an EH stage and an information transmission stage, and device pairing must take the energy status of the candidate DUEs into consideration. Therefore, an EH-enabled D2D network has to consider EH and DUE pairing as a whole. To the best of our knowledge, no existing work addresses this issue.
To this end, this paper studies the joint problem of EH and device pairing, which has not been addressed in previous works. A joint algorithm is proposed to optimize the split between energy harvesting time and information transmission time within the cycle, and then to select the best charging time for the paired D2D devices. As a result, the proposed solution maximizes the throughput of the EH-enabled D2D system. The main contributions of this paper are summarized as follows:
1) We propose a novel system model for D2D communication underlaying a cellular network and consider the joint problem of EH and device pairing. Since the density of cellular users gradually decreases toward the cell edge, cell-edge users are selected as D2D users, while users near the cell center perform cellular communication. Because of the longer distance between D2D users and cellular users, this model mitigates the interference between them and thus supports effective D2D communication underlaying the cellular network.
2) Based on the system model, we study the joint optimization of energy harvesting time allocation and device pairing. To solve this problem, we decompose it into two sub-problems that are simpler to solve separately. Specifically, we first obtain the optimal closed-form solution for the energy harvesting time. Then, the Kuhn-Munkres (KM) algorithm is used to obtain the best pairing and the information transmission rate of the D2D devices. These two steps are run iteratively until the algorithm converges, yielding an optimized solution that improves the overall performance of the D2D network.
The rest of this paper is organized as follows. Section 2 introduces the system model. Section 3 formulates the optimization problem, and Section 4 provides a solution to the joint optimization problem. Section 5 compares the proposed algorithm with benchmark solutions and presents numerical results showing that the proposed solution improves the throughput of the EH-enabled D2D network.

System Model
We consider an EH-enabled D2D communication system underlaying a cellular network. As shown in Figure 1, the area covered by the base station (BS) contains a number of D2D equipment (DUEs) and cellular users (CUEs), where the DUEs are equipped with radio frequency (RF) energy harvesting modules. At a given time, DUEs are divided into D2D transmitters (DTs) and D2D receivers (DRs) according to their tasks in the current time cycle $T$. We assume that the content requested by a DR is already cached by the DT from the BS, and that one DT can be paired with only one DR. To improve spectrum utilization, each DT-DR pair reuses the uplink spectrum resources of the CUEs during communication, which may cause interference between the CUEs and the DUEs. We define $\mathcal{M} = \{1, 2, \dots, M\}$ as the set of DTs, $\mathcal{N} = \{1, 2, \dots, N\}$ as the set of DRs, and $\mathcal{C} = \{1, 2, \dots, C\}$ as the set of CUEs.
In this paper, the EH module in each DT adopts the harvest-store-use working model, i.e., a DT collects RF energy from the BS, stores it, and uses it to transmit information in the next time slot. In Figure 1, the coverage area of the BS is assumed to be circular, and the BS is closely surrounded by a number of end users whose wireless links to the BS are of good quality. However, a number of end devices are located at the cell edge and have poor wireless links to the BS because of the distance. For these cell-edge devices, direct D2D communication can be used to increase the data rate. D2D communication for edge users also reduces interference, because edge users would otherwise need higher transmission power to establish a reliable communication link to the BS, which would cause greater interference to other end users. We therefore take the distribution of the end devices in the area into account and propose a regional division policy (RDP) to make the EH-enabled D2D network more practical when underlaying the cellular network. As shown in Figure 2, the devices around the BS, i.e., those in the inner circle, are set to be CUEs linked to the BS. In the middle ring, the end devices are set to be DTs, which can better harvest RF energy from the BS and transmit data to cell-edge DRs. Finally, the cell-edge DUEs that request data, located in the outer ring, are set to be DRs. With this setup, RDP improves the communication rate of edge users, which ensures fairness among users, and is particularly suitable for the EH-enabled D2D network.
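The regional division policy can be illustrated with a short sketch that assigns a role to a device from its distance to the BS. This is only an illustration under stated assumptions: the radii are the ones reported in the simulation setup of this paper (inner circle 120 m, middle circle 160 m, cell radius 200 m), the function name `assign_role` and the treatment of boundary points are hypothetical, and every outer-ring device is treated as a DR even though the paper only sets the requesting edge devices as DRs.

```python
import math

# Radii taken from the simulation setup reported in the paper.
INNER_RADIUS_M = 120.0   # devices inside this circle act as CUEs
MIDDLE_RADIUS_M = 160.0  # devices in the ring up to this radius act as DTs
CELL_RADIUS_M = 200.0    # devices in the outer ring act as DRs


def assign_role(x, y, bs=(0.0, 0.0)):
    """Return 'CUE', 'DT', 'DR', or None (outside the cell) for a device at (x, y).

    Assumption: a device exactly on a boundary belongs to the inner region.
    """
    distance = math.hypot(x - bs[0], y - bs[1])
    if distance <= INNER_RADIUS_M:
        return "CUE"
    if distance <= MIDDLE_RADIUS_M:
        return "DT"
    if distance <= CELL_RADIUS_M:
        return "DR"
    return None


if __name__ == "__main__":
    # Hypothetical device positions in metres.
    for position in [(50.0, 0.0), (0.0, 140.0), (-180.0, 0.0), (300.0, 0.0)]:
        print(position, assign_role(*position))
```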

Energy Harvesting Model
Based on the system model in Figure 1, we assume that all DUEs in the area have energy harvesting capability and can collect RF energy through wireless energy transfer. Generally, there are three protocols for energy harvesting: harvest-use (HU), harvest-store-use (HSU), and harvest-use-store (HUS). The HSU protocol in [32] is adopted in this study. Hence, the energy harvested by a device in a time slot cannot be used immediately; it is stored in the battery and used from the beginning of the next time slot.
According to the HSU protocol, the period $T$ is divided into two time slots, $\tau_e$ and $\tau_t$. During $\tau_e$, the DTs harvest the RF energy transmitted by the BS. The collected energy is used for information transmission in slot $\tau_t$. The lengths of the two stages have to satisfy the following constraint:

$$\tau_e + \tau_t \le T \quad (1)$$

In this study, different frequency bands are used for RF energy transmission and information transmission, so they do not interfere with each other. In addition, perfect channel state information is assumed to be obtained through pilot signals and to remain unchanged within $T$. The energy harvesting process is modeled as linear, so the energy collected by the $i$-th DT from the BS within $\tau_e$ can be expressed as

$$E_i = \eta\, p_0\, K\, d_{b,i}^{-\alpha}\, \tau_e \quad (2)$$

where $\eta \in (0, 1)$ is the efficiency of converting the RF signal into energy in the DC circuit, $p_0$ is the transmit power of the BS, $K$ is a system parameter, $d_{b,i}$ is the distance between the BS and the $i$-th DT, and $\alpha$ is the path loss exponent.
To ensure sustainable operation, the stored energy must not exceed the maximum capacity of the battery, and the initial energy of each DUE is assumed to be sufficient for smooth communication at the beginning. Thus, in the transmission period $T$, the average transmit power of the $i$-th DT should satisfy constraint (3).
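As a minimal sketch of the linear harvesting model, the snippet below computes the energy a DT collects during $\tau_e$ and the average power available during $\tau_t$. It assumes that the harvested energy is spent evenly over the transmission slot, i.e. $p_i = E_i / \tau_t$, which is consistent with the later definition of $\alpha_{i,j}$ but is not stated explicitly in the text. The function names and all numeric values except $\eta = 0.5$ and $p_0 = 2$ W are hypothetical.

```python
def harvested_energy(eta, p0, K, d_bi, path_loss_exp, tau_e):
    """Energy (J) collected by a DT at distance d_bi from the BS during tau_e.

    Linear model: eta * p0 * K * d_bi^(-path_loss_exp) * tau_e.
    """
    if not 0.0 < eta < 1.0:
        raise ValueError("eta must lie in (0, 1)")
    return eta * p0 * K * d_bi ** (-path_loss_exp) * tau_e


def average_tx_power(energy, tau_t):
    """Average transmit power (W) if all harvested energy is spent over tau_t.

    Assumption: p_i = E_i / tau_t (the whole harvest is used in the slot).
    """
    if tau_t <= 0.0:
        raise ValueError("tau_t must be positive")
    return energy / tau_t


if __name__ == "__main__":
    T = 1e-3        # hypothetical cycle length in seconds
    tau_t = 2e-4    # transmission time used by the FTSOPT benchmark in the paper
    energy = harvested_energy(eta=0.5, p0=2.0, K=1.0, d_bi=140.0,
                              path_loss_exp=3.0, tau_e=T - tau_t)
    print("harvested energy (J):", energy)
    print("average transmit power (W):", average_tx_power(energy, tau_t))
```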

Channel Model
The channel power gain between DT $i$ and DR $j$ is denoted as $h_{i,j}$. In this paper, each CUE is assigned an orthogonal sub-channel, and the CUEs together occupy the entire bandwidth $W$. When DT $i$ reuses the spectrum resource of CUE $c$, the signal-to-interference-plus-noise ratio (SINR) of the D2D link at DR $j$ can be expressed as

$$\mathrm{SINR}_{i,j} = \frac{p_i h_{i,j}}{p_c h_{c,j} + \sigma^2} \quad (5)$$

where $p_i$ is the transmit power of DT $i$, $p_c$ is the fixed transmit power of CUE $c$, $\sigma^2$ is the variance of the additive white Gaussian noise, and $h_{c,j}$ is the channel power gain between CUE $c$ and DR $j$. Therefore, the information transmission rate from transmitter to receiver can be expressed as

$$R_{i,j} = W \log_2\left(1 + \mathrm{SINR}_{i,j}\right) \quad (6)$$

where $W$ represents the D2D communication bandwidth. Similarly, for a CUE whose spectrum resources are reused by a D2D pair, the SINR is denoted as $\mathrm{SINR}_c$. For fairness, each DT-DR pair reuses the spectrum resources of a unique CUE. In addition, to ensure QoS, the SINR of the CUE should meet the following requirement:

$$\mathrm{SINR}_c \ge r^{c}_{th} \quad (8)$$

where $r^{c}_{th}$ is the minimum SINR threshold. Assuming the number of CUEs is greater than the number of DUEs, any DT-DR pair can find one CUE to share spectrum resources with. To ensure (8), the spectrum resources of a CUE can be reused by only one DT-DR pair. Meanwhile, among all potential connections, each DR tends to establish a link with a unique DT that offers better service quality.
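The link model can be sketched in a few lines, assuming the usual form in which the desired signal power is divided by the CUE interference plus noise and the rate follows the Shannon formula. The function names are hypothetical, the SINR is handled in linear scale (not dB), and the example inputs are chosen only so that the result is easy to check by hand.

```python
import math


def d2d_sinr(p_i, h_ij, p_c, h_cj, noise_var):
    """SINR at DR j when DT i reuses the sub-channel of CUE c (linear scale)."""
    return p_i * h_ij / (p_c * h_cj + noise_var)


def d2d_rate(bandwidth_hz, sinr):
    """Achievable rate in bit/s: W * log2(1 + SINR)."""
    return bandwidth_hz * math.log2(1.0 + sinr)


if __name__ == "__main__":
    # Hypothetical values: signal term 3.0, interference 0.5, noise 0.5.
    sinr = d2d_sinr(p_i=1.0, h_ij=3.0, p_c=1.0, h_cj=0.5, noise_var=0.5)
    print("SINR:", sinr)
    print("rate (bit/s):", d2d_rate(1e6, sinr))
```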

Problem Formulation
In an EH-enabled D2D network, it is critical to jointly optimize the device pairing and the EH time allocation of the DUEs. In principle, a longer EH time yields a higher DUE transmit power, and the information transmission rate increases accordingly. However, within the period $T$, a longer EH time shortens the information transmission time of the device. It is therefore difficult to find an EH time that satisfies both the energy and the data transmission requirements of the DUEs.
This paper therefore investigates a more efficient EH-enabled D2D network by jointly optimizing the energy harvesting time and the device pairing to maximize the throughput of the EH-enabled D2D system within the period $T$. The resulting optimization problem is denoted as P1 in (9), whose objective is to maximize the throughput of the EH-enabled D2D network. To guarantee reliable communication of the DUEs, the SINR should meet constraint (9a), where $r^{d}_{th}$ is the minimum threshold for normal DUE communication. Constraint (9b) limits the transmission power of the DTs, and constraint (9c) limits the EH and transmission times within the period. Constraints (9d) and (9e) ensure fairness, i.e., a DT can be connected with only one DR. The elements of the device connection matrix $X$ are binary: a value of 1 indicates that the corresponding DT and DR are connected, and 0 indicates that they are not.
Problem (9) accounts for the reliability of the DUEs, QoS, and the fairness of pairing. It is non-convex and difficult to solve, because the coupling of the time variables in (9c) makes (9a) and (9b) non-linear constraints, and (9d) and (9e) are integer constraints. As a result, the joint pairing and time allocation problem is a mixed-integer nonlinear problem.
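One possible reading of problem (9), written out from the roles the text assigns to constraints (9a) to (9e), is given below. It is a sketch rather than the authors' exact formulation: the objective (transmission time multiplied by the rate of each selected pair), the symbols $x_{i,j}$ for the entries of $X$, and the use of inequalities in (9d) and (9e) are assumptions, while $p^{d}_{max}$ is the D2D maximum transmit power listed among the inputs of Algorithm 1.

```latex
\begin{align}
\text{P1:}\quad \max_{\tau_e,\,\tau_t,\,X}\quad
  & \sum_{i\in\mathcal{M}}\sum_{j\in\mathcal{N}}
    x_{i,j}\,\tau_t\,W\log_2\!\left(1+\mathrm{SINR}_{i,j}\right) \tag{9}\\
\text{s.t.}\quad
  & \mathrm{SINR}_{i,j}\ge r^{d}_{th}, \tag{9a}\\
  & 0\le p_i\le p^{d}_{max}, \tag{9b}\\
  & \tau_e+\tau_t\le T, \tag{9c}\\
  & \sum_{j\in\mathcal{N}} x_{i,j}\le 1,\quad \forall i\in\mathcal{M}, \tag{9d}\\
  & \sum_{i\in\mathcal{M}} x_{i,j}\le 1,\quad \forall j\in\mathcal{N},
    \qquad x_{i,j}\in\{0,1\}. \tag{9e}
\end{align}
```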

Proposed Solution
To make the joint problem (9) tractable, we divide it into two sub-problems: time optimization and device pairing. For time optimization, assuming that the matching matrix is known, we determine the relationship between energy harvesting time and information transmission time, i.e., we transform (9a) and (9b) into linear constraints, from which the optimal time allocation of the DTs is obtained. Then, for the pairing problem, a KM-based algorithm uses the throughput under this time allocation to find the best match. Finally, a joint iterative algorithm combines the two steps to obtain the overall optimal solution.

Optimize EH Time Allocation
Let $\{\tau_e^*, \tau_t^*\}$ denote the optimized EH time and data transmission time, respectively, which enable the D2D communication system to reach the maximum throughput $TP_s^*$ while ensuring $\tau_e^* + \tau_t^* \le T$. The optimal solution can always be obtained with $\tau_e + \tau_t = T$ [27]. Therefore, in the following derivation, let $\tau_e = T - \tau_t$. Based on (3) and (9b), the constraint on the DT transmit power $p_i$ yields a bound on $\tau_t$, given in (11). In addition, using the SINR expression (5), constraint (9a) can be rewritten in terms of $\tau_t$. For convenience, let

$$\alpha_{i,j} = \frac{K \eta p_0 d_{b,i}^{-\alpha} h_{i,j}}{p_c h_{c,j} + \sigma^2}$$

which leads to a second bound on $\tau_t$, given in (13). Combining (11) and (13) gives the feasible range of $\tau_t$. Once the DUE matching is determined, the D2D matching matrix $X$ is known, and problem P1 can be transformed into problem P2, in which only the time allocation remains to be optimized. According to P2, the DTs cannot guarantee QoS if the transmission time is too long. Conversely, if the energy harvesting time increases, the data throughput of the D2D communication decreases correspondingly. Hence, the optimal solution is found when $\tau_t$ takes its maximum value subject to $\mathrm{SINR}_{i,j} \ge r^{d}_{th}$, which gives the optimal transmission time

$$\tau_t^* = \frac{T \alpha_{i,j}}{r^{d}_{th} + \alpha_{i,j}}$$
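The closed-form transmission time used in step 2 of Algorithm 1, $\tau_t^* = T\alpha_{i,j}/(r^{d}_{th} + \alpha_{i,j})$, can be checked numerically with the sketch below. It assumes $\tau_e = T - \tau_t$ and $p_i = E_i/\tau_t$, so that the SINR of a pair equals $\alpha_{i,j}(T - \tau_t)/\tau_t$. The function names are hypothetical and the SINR threshold is in linear scale. The check only confirms that the SINR constraint is met with equality at $\tau_t^*$; it does not by itself prove optimality.

```python
def alpha_ij(K, eta, p0, d_bi, path_loss_exp, h_ij, p_c, h_cj, noise_var):
    """Coefficient alpha_{i,j} = K*eta*p0*d_bi^(-a)*h_ij / (p_c*h_cj + sigma^2)."""
    return K * eta * p0 * d_bi ** (-path_loss_exp) * h_ij / (p_c * h_cj + noise_var)


def optimal_tx_time(a_ij, r_th, T):
    """Closed-form transmission time tau_t* = T * a_ij / (r_th + a_ij)."""
    return T * a_ij / (r_th + a_ij)


def sinr_at(a_ij, tau_t, T):
    """SINR of the pair when tau_e = T - tau_t and p_i = E_i / tau_t (assumed)."""
    return a_ij * (T - tau_t) / tau_t


if __name__ == "__main__":
    # Hypothetical coefficient and threshold, chosen for easy hand checking.
    a, r_th, T = 30.0, 10.0, 1.0
    tau_t = optimal_tx_time(a, r_th, T)
    print("tau_t* =", tau_t, "tau_e* =", T - tau_t)
    print("SINR at tau_t*:", sinr_at(a, tau_t, T))
```

Any transmission time longer than $\tau_t^*$ leaves less harvested energy per unit of transmission time and therefore pushes the SINR below the threshold, which is why $\tau_t^*$ is the largest feasible value under this model.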

Optimize D2D Device pairing
After the optimal EH time $\tau_e^*$ and transmission time $\tau_t^*$ within $T$ are obtained, we define $Y_{i,j}$ in (17) as the throughput between DT $i$ and DR $j$. The joint problem can then be reformulated as problem P3:

$$\max_{X} \sum_{i \in \mathcal{M}} \sum_{j \in \mathcal{N}} x_{i,j} Y_{i,j} \quad \text{s.t. (9d), (9e)}$$

Problem P3 selects the best connection partner for each device in the DUE group. Specifically, we model the D2D communication network as an undirected bipartite graph $G = \{V, E\}$ that represents the pairing relationship between DTs and DRs, where $E$ is the edge set and $V = \{DT, DR\}$ is the vertex set composed of the DTs and DRs. The coefficient $Y_{i,j}$ in (17) is regarded as the weight of the edge between DT $i$ and DR $j$. This turns problem P3 into the problem of finding the device pairing with the largest sum of DT-DR throughput. Under this model, we apply the KM algorithm to solve P3, i.e., to find the maximum-weight DT-DR matching in the bipartite graph, as shown in Figure 3. Specifically, to maximize the sum of the edge weights in the undirected bipartite graph, each DR takes the throughput of the link into consideration when selecting the best DT in the candidate group.

Joint Optimization Algorithm
Based on the solutions of P2 and P3, a joint optimization (JOPT) algorithm that obtains the maximum throughput of the D2D communication network is defined as Algorithm 1. The JOPT algorithm first initializes the system throughput as $TP_s = 0$, with the devices randomly distributed in the specified area. Then, the matching matrix $X$ is initially determined by the channel gain between DUEs in step 1. Afterwards, the optimal information transmission time $\tau_t$ is calculated in step 2. In step 3, the pairing weight $Y_{i,j}$ is calculated, and the KM algorithm is applied to obtain the updated pairing matrix $X^*$. Subsequently, the optimal throughput $TP_s^*$ of the system is obtained in step 4. Finally, the algorithm enters a loop in steps 6-9 to find the optimal throughput, which stops when the difference between $TP_s$ and $TP_s^*$ is less than a pre-defined $\varepsilon = 1 \times 10^{-5}$. After the loop, the algorithm returns the maximum throughput $TP_s^*$, $\tau_t^*$, and $X^*$. Regarding complexity, the iteration in steps 7-10 has time complexity $O(\log 1/\varepsilon)$. Assuming $M$ is greater than $N$, the time complexity of the KM algorithm is $O(M^3)$, while the optimal time $\tau_e$ is computed directly with time complexity $O(MN)$. As a result, the overall time complexity of the proposed algorithm is $O(\log(1/\varepsilon)(MN + M^3))$.

Steps of Algorithm 1 following the initialization:
1: Select the user with the best channel condition to establish the initial D2D connection $X$
2: Get $\tau_t^* = T\alpha_{i,j}/(r^{d}_{th} + \alpha_{i,j})$ from the closed-form solution of the time optimization problem
3: Get $Y_{i,j}$ from $\tau_t^*$ as the KM algorithm weight factor to obtain the optimal matching matrix $X^*$
4: Get the maximum system throughput $TP_s^*$
5: Let $X^* = X$
6: while $TP_s^* - TP_s > \varepsilon$ do
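The pairing step, in which the KM algorithm finds the DT-DR matching with the largest total weight $Y_{i,j}$, can be sketched with a standard potentials-based Hungarian (Kuhn-Munkres) routine. This is a generic textbook implementation with cubic running time, not the authors' code; the function name `max_weight_pairing` and the example weight matrix are hypothetical.

```python
def max_weight_pairing(weights):
    """Maximum-weight one-to-one pairing of rows (DTs) to columns (DRs).

    weights[i][j] plays the role of Y_{i,j}. Returns (pairs, total), where
    pairs is a sorted list of (dt_index, dr_index) tuples.
    """
    n, m = len(weights), len(weights[0])
    if n > m:
        # The core routine needs rows <= columns, so solve the transposed problem.
        pairs_t, total = max_weight_pairing([list(col) for col in zip(*weights)])
        return sorted((i, j) for j, i in pairs_t), total

    inf = float("inf")
    cost = [[-w for w in row] for row in weights]  # minimize the negated weights
    u = [0.0] * (n + 1)    # row potentials (1-based)
    v = [0.0] * (m + 1)    # column potentials (1-based)
    match = [0] * (m + 1)  # match[j] = row assigned to column j, 0 if free
    way = [0] * (m + 1)    # back-pointers along the augmenting path

    for i in range(1, n + 1):
        match[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = match[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:
                break
        # Flip the matching along the augmenting path back to column 0.
        while j0:
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1

    pairs = sorted((match[j] - 1, j - 1) for j in range(1, m + 1) if match[j])
    total = sum(weights[i][j] for i, j in pairs)
    return pairs, total


if __name__ == "__main__":
    # Hypothetical throughput weights for 3 DTs (rows) and 3 DRs (columns).
    Y = [[4, 1, 3],
         [2, 0, 5],
         [3, 2, 2]]
    print(max_weight_pairing(Y))
```

Negating the weights turns the maximum-weight matching into a minimum-cost assignment, which is the form this routine solves; a library assignment solver could be substituted where one is available.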

Numerical Results
In this section, numerical results validate the proposed solution and compare it with benchmark solutions. We assume that there are 8 CUEs, 5 DTs, and 5 DRs in the system. All devices are distributed in an area with radius $R = 200$ m, and the BS is located at the coordinate (0, 0). The radius of the inner circle is 120 m, and the radius of the middle circle is 160 m. Most of the simulation parameters follow the settings in [26,28]. All simulations are run on a computer with a 3.40 GHz CPU and 12 GB of RAM, using Matlab 2017b on Windows 10. The simulation parameters are shown in Table 1. We first evaluate the system throughput achieved by the proposed solution under different BS transmit power settings and SINR thresholds. Next, we compare the proposed JOPT with two benchmark solutions. One is the maximum power transmission optimization (MPOPT) solution, in which each DT harvests as much energy as needed to transmit its data with maximum power during the period $T$. The other is the fixed time slot optimization (FTSOPT) solution, which fixes $\tau_e^*$ at a value between its minimum and maximum that satisfies constraint (8). Specifically, the information transmission time is fixed as $\tau_t = 2 \times 10^{-4}$ s in the period $T$.
The comparison uses four criteria: the conversion efficiency $\eta$ of the energy harvesting circuit, the SINR threshold $r^{d}_{th}$, the BS transmit power $p_0$, and the number of DUEs, i.e., the number of DT-DR pairs $N$.

System Performance Against Different Transmit Power and SINR Threshold
To examine the influence of the SINR threshold on the throughput of EH-enabled D2D communication with JOPT, the SINR threshold $r^{d}_{th}$ in formulas (9) to (14) is varied. Figure 4 shows the DUE throughput as $r^{d}_{th}$ changes, for BS transmit powers $p_0 = 2$ W, $p_0 = 1.5$ W, and $p_0 = 1$ W in the alternating iterative algorithm. As shown in Figure 4, for a given SINR threshold, the greater the BS transmit power $p_0$ is, the higher the throughput of the D2D system will be. A greater $p_0$ means that the DT can harvest more energy through wireless power transfer within the same time period, and as the transmit power increases, the DUE obtains higher throughput accordingly. At the same time, when the transmission power of the BS is fixed, the throughput of the DUE increases as the SINR threshold increases, while the throughput of the system decreases. This is because a higher SINR threshold implies higher QoS requirements for D2D communication.

Fig. 4 The performance of the JOPT algorithm under different $r^{d}_{th}$

Performance Comparison with Different Algorithms
To validate the proposed solution, we compare the JOPT solution with the MPOPT and FTSOPT solutions. In Figure 5, the throughput of all solutions improves to a certain extent as the energy conversion efficiency $\eta$ increases.
In Figure 6, we compare the throughput of the three solutions under different SINR threshold settings, with the other parameters fixed at $\eta = 0.5$ and $p_0 = 2$ W. As shown in Figure 6, the proposed JOPT solution outperforms MPOPT and FTSOPT under these settings. The throughput of each solution decreases as the SINR threshold increases: a larger SINR threshold means higher communication quality requirements for the DUE, so the DT takes more time to harvest energy to ensure QoS. Correspondingly, the remaining information transmission time $\tau_t$ decreases, which reduces throughput. In particular, since the QoS of MPOPT is always the best, the D2D throughput of the MPOPT solution shows almost no change. In Figure 7, we compare the throughput of the three solutions under different BS transmit power settings, where the EH efficiency is fixed at $\eta = 0.5$ and the SINR threshold is fixed at $r^{d}_{th} = 10$.
Figure 7 demonstrates that the proposed JOPT maintains a high D2D throughput as the BS transmit power changes from 1 W to 2 W, in contrast to the FTSOPT and MPOPT solutions. This is because the greater the transmission power of the BS, the more energy the DUEs harvest, as summarized in formula (3). Finally, in Figure 8, we compare the throughput of the three solutions for different numbers of DUEs and show the relationship between the number of DUEs and the system throughput. The proposed JOPT solution outperforms MPOPT and FTSOPT for all DUE numbers considered. There is also a positive correlation between the throughput of each solution and the number of devices. The increase is not linear, however, because the location distribution of the devices changes with their number. In the worst case, an unfavorable distribution leads to a decrease in system throughput.
In general, the numerical simulation results show the impact of different parameters on the system throughput and on the throughput of the different solutions. Under all parameter settings considered, the JOPT solution achieves higher throughput than the FTSOPT and MPOPT solutions. In summary, the joint optimization algorithm proposed in this paper offers a throughput advantage over both benchmarks.

Conclusion
This paper studies an algorithm for joint EH time allocation and DUE pairing in the EH-enabled D2D network. First, a closed-form optimal solution for the EH time allocation is derived, which helps DTs harvest the most energy for data transmission within the cycle time. Second, taking into account the different energy collection capabilities of the DTs in the system, different optimal charging times are allocated, and the KM algorithm is then used to find the best DT-DR pairing. Then, an alternating iterative algorithm is proposed to maximize the D2D system throughput. The numerical results show that the proposed JOPT solution improves the system throughput more effectively than the FTSOPT and MPOPT solutions.
This paper assumes that the transmitter has cached the requested file and that the numbers of requesters and senders are fixed. In reality, however, the numbers of requesters and senders change over time, and the requested file may not be cached. In addition, for spectrum reuse, our study assumes a fixed number of CUEs whose spectrum resources can be reused, which is not realistic. In future work, we will take these issues into consideration to make the model more practical.

Algorithm 1 Joint optimization of EH slot allocation and device matching (JOPT)
Input: the number of D2D transmitters (DTs) $M$ and receivers (DRs) $N$; cycle length $T$; D2D maximum transmit power $p^{d}_{max}$; SINR threshold of DUE communication
Output: optimal information transmission time $\tau_t$; EH time $T - \tau_t$; D2D matching matrix $X$; maximum system throughput $TP_s^*$
1: Initialize the device distribution in the fixed area; system throughput $TP_s = 0$

Fig. 5 Comparison of throughput with different solutions under variable $\eta$

Fig. 7 Comparison of throughput with different solutions under variable $p_0$

Fig. 8 Comparison of throughput with different solutions with variable DUE number

Table 1 Simulation parameters