Upgrading an analog recovery loop for optimized decoding jointly with an increased data rate

Maximum likelihood detection theory improves the error rate of a sub-optimal but cheaper coded symbol recovery loop using oversampling, previously proposed as an alternative solution to the decoding problem that avoids the log-likelihood ratio computation. The former implementation delivers the output data with a one-symbol delay, and its low transistor count makes the approach attractive for ultra-low-energy wireless applications. The proposed hardware upgrade adds an analog-to-digital converter and fixed-point accumulation logic to compute the soft values, replacing the trigger used as a hard detector. This work investigates soft decoding in the presence of binary and non-binary source symbols. Simulation results show that the soft approach improves the signal-to-noise ratio by 3 dB and 2.5 dB for encoding rates of 1/3 and 2/3, respectively.


Introduction
Recent advances in signal processing enable new technologies for wireless communications. Low-complexity hardware implementation and energy efficiency are essential requirements for portable wireless systems such as wireless sensor networks, implantable devices, radio-frequency identification tags, and many more. Most signal processing and decoding systems combine digital and radio-frequency (RF) electronic circuits. The classic Viterbi decoder requires a demodulation section (e.g., phase-shift keying (PSK)) made of a traditional mixer and local oscillators to work correctly. The hard Viterbi decoder, under an additive white Gaussian noise (AWGN) channel model with no inter-symbol interference (ISI), decodes convolutionally coded symbols with a digital circuit made of simple add-compare-select (ACS) and trace-back units. The survivor paths and the related output symbols, with a decision depth sufficient for the survivors to converge to a unique state [1], require two storage arrays: typically random access memories (RAM). The classic Viterbi decoder stores the state and branch metrics in additional RAMs. The soft-output Viterbi [2] increases the hardware complexity [3] with another array keeping the reliability information of each survivor path [4]. The system proposed in [5,6] can recover a coded binary data stream from a phase-modulated waveform with few transistors. Given its better error rate than Viterbi decoding, its implementation could potentially reach the maximum energy efficiency in data decoding. This mixed-signal system does not require the prior evaluation of the log-likelihood ratio (LLR), whereas past research on energy-efficient analog decoders by Hagenauer [7] and Loeliger [8] approximates the LLR by using the voltage-current characteristic of diodes and BJT transistors, 1000 times faster than a standard digital signal processor (DSP).
The first recovery loop coherently demodulates a direct-sequence-suppressed-carrier amplitude (DSSC-AM) modulation [9], functionally equivalent to the well-known trellis-coded modulation (TCM) [10][11][12], by a closed loop whose role is phase tracking in the analog domain. The current source information binary digit, in antipodal form, represents the waveform message. Under perfect tracking, the demodulation section removes the carrier wave, producing a pulse amplitude modulation (PAM) waveform fed to a trigger used as the ML hard-decision block. The trigger delivers an analog binary signal representative of the currently decoded source symbol. This wave is over-sampled, and the final decision comes from a majority vote. Over-sampling technology is widely used in signal processing and telecommunications. In some applications, sampling at a rate higher than the baud rate makes the input random process cyclo-stationary [13,14], removing the circulation characteristic of the modulated signal. In some cases, proper handling of the sampling signals can minimize the effect of noise [15].
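The sample-slice-and-vote chain described above can be sketched in a few lines. The following is an illustrative behavioral model, not the paper's circuit: it assumes a linear loop-filter ramp toward ±1 over the symbol period and an arbitrary noise scaling; the function name and SNR parameterization are ours.

```python
import random

def hard_majority_vote(u_k, snr_linear, S=8, seed=0):
    """Illustrative model of the hard detection path (not the paper's circuit):
    the loop-filter ramp toward +/-1 is sampled S times, each sample is sliced
    by the zero-threshold trigger, and a majority vote decides the symbol."""
    rng = random.Random(seed)
    sigma = (1.0 / snr_linear) ** 0.5       # assumed noise scaling
    votes = 0
    for m in range(1, S + 1):
        ramp = u_k * m / S                  # linear ramp over the symbol period
        sample = ramp + rng.gauss(0.0, sigma)
        if sample > 0.0:                    # the trigger: hard zero threshold
            votes += 1
    return +1 if votes >= S // 2 else -1    # majority vote, threshold S/2
```

At high SNR the vote is unanimous and the decision matches the transmitted antipodal symbol.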
The standard recovery loop delivers the estimated source symbol with a one-symbol-clock delay, whereas traditional Viterbi-based decoders estimate the source symbol sequence with a delay proportional to the trace-back depth. The work in [5] showed how a closed loop and over-sampling in the coded symbol recovery system improve the error rate with respect to traditional Viterbi-based decoders and to an evolution of the well-known Costas loop. The error probability does not depend on the encoder distance, critical under Viterbi decoding, but on the stochastic distribution of the code words. We show how the proposed approach converges to the ideal transmitter at a higher signal-to-noise ratio (SNR) by plotting the code-word probability mass function (PMF) at increasing SNRs. The analytical bound of the bit error rate indicates how the closed loop has a decisive role in the detection performance. Over-sampling in energy-efficient 1-bit quantization systems [16][17][18] does not significantly improve the BER, which suffers from the low-resolution analog-to-digital converter (ADC). Instead, the recovery loop samples the decoded source symbol, an analog binary output waveform, so the majority vote can improve the final error rate. In this work, we focus our research on error rate minimization by means of maximum-likelihood (ML) detection theory. This goal represents the primary target of any research on decoding systems. The LLR, excluded from the first implementation, leads to a hardware upgrade made of accumulation logic and a binary comparator that replaces the trigger under a straightforward transmission model. We also consider the scenario of higher-order modulations, which requires a new evolution of the primary circuit that accounts for the channel attenuation at the receiver for correct decoding. Therefore, we introduce a sub-system that matches the received waveform to the correct decision regions, using either a hard or a soft internal decoder.
This ultimate goal lays the basis for a recovery loop working over a single-path flat-fading channel which, in the perspective of an evolution to multi-path propagation, is more realistic than simple additive white Gaussian noise. We simulate the proposed system architectures, the latest recovery loop, the Viterbi decoders, the Costas loop using oversampling, and 1-bit quantization with oversampling and Viterbi decoding.
Moreover, we explore diversity receiver schemes [19] to confirm the superiority of our approach over the modified Costas loop, which represents our real competitor in terms of error rate and circuit complexity. The paper is organized as follows: Sect. 2 describes how the analog recovery loop works, the supported modulation, the phase mapping, and the automaton's role in loop stability. In Sect. 3, we describe the recovery loop architecture and the limits of our modeling. We apply ML detection theory in Sect. 4, deriving a novel soft-based recovery loop architecture. Section 5 shows the architectural changes when the encoder rate is 2 bits per symbol. We illustrate the results of our simulations in Sect. 6, working with hard and soft recovery loops at code rates of 1/3 and 2/3, respectively. The extension to any other data rate is straightforward. Lastly, Sect. 7 concludes the paper.

How the recovery loop works
Eq. (1) shows the DSSC-AM modulated waveform in the k-th signaling interval, when the message is the antipodal source symbol u_k, after conversion from a single binary digit. With T we indicate the symbol period, k the discrete time step, and f_0 and θ_0 the carrier frequency and the initial phase.
Here, in (1), the variable E_S is the signal's energy measured in joules, and θ_k is a transmitted signal phase belonging to the set of real numbers {2π/M · i, i = 0, 1, . . . , M − 1}. We normalize with the square root of the signal energy to be consistent with the signal energy formula [20]. We use a finite automaton as an encoder with rate 1/R, where 1 is the raw input data rate and R is the data rate of the output channel-encoded stream. The automaton design follows the rules in [5] to achieve loop stability and the best error rate (Fig. 1 shows the coded symbol recovery loop using oversampling and hard detection). The code word c_k is a unique set of R binary digits assigned to each source symbol. Consequently, we map the k-th coded symbol c_k onto a constellation with M symmetrical phases according to Eq. (2), where M = 2^R.
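The phase mapping can be sketched directly from the phase set above. Since Eq. (2) is not reproduced in this extraction, we assume the natural form θ_k = 2π·c_k/M implied by the set {2π/M · i}; the function name is ours.

```python
import math

def phase_map(c_k, R=3):
    """Map the R-bit code word c_k onto one of M = 2**R symmetrical phases,
    following the constellation of Eq. (2) (assumed form: 2*pi/M * c_k)."""
    M = 2 ** R
    if not 0 <= c_k < M:
        raise ValueError("code word out of range")
    return 2.0 * math.pi * c_k / M
```

With R = 3 (M = 8), the eight phases are the multiples of π/4, e.g. c_k = 4 maps to π.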
The additive white Gaussian noise corrupts the signal (1); the recovery loop (Fig. 1) applies the signal at the antenna to a pass-band filter and a low-noise amplifier (LNA), minimizing the noise's statistical power. Next, a coherent demodulation section using a multi-phase voltage-controlled oscillator (MP-VCO) removes the cosine in (1). The MP-VCO [21] mathematical behavior is given in (3), where θ̂_k(t) is the analog tracking of the currently transmitted signal phase θ_k. The output port of the mixer, modeled as an ideal multiplier, combines two contributions at frequency zero and 2·f_0 according to the Werner formulas. If f_0 is a multiple of 1/T, the loop filter (with Laplace function H(s) and an internal state reset accessible by an additional port) removes the high-frequency signal, and its output is a sequence of linear ramps when H(s) is a pure integrator (1/s). Under perfect tracking, the trigger estimates the current source symbol correctly. In this condition, the hybrid automaton, identical to that used in the transmitter, with its output network working in the analog domain, receives the correct analog estimation, producing the correctly transmitted signal phase θ̂_k(t) = θ_k, and the loop is in lock. Eq. (4) shows the noiseless ML detector input signal y(t) under perfect tracking. The constant 0.5 comes from the Werner formulas, and 1/T is the filter gain needed to obtain a ramp from 0.0 to ±1.0 in any signaling interval. Perfect tracking allows the cosine removal; otherwise, the trigger might wrongly detect the k-th source symbol. In this case, θ̂_k(t) ≠ θ_k and the signal at the trigger depends on the variable ε_k(t) = θ_k − θ̂_k(t), as shown in (5). If the cosine of ε_k is negative, the trigger fails, producing the one's complement of the correct source symbol, and the loop is out of lock.
In Eq. (6), n_m and y_m are samples of the Gaussian noise and the loop filter output at rate T/S. The final decision is made by a majority vote, counting the positive samples and comparing the final count against a threshold, typically S/2. This final decision is applied to the hybrid automaton's next-state logic to prepare the decoding of the successive symbol. This hardware needs a synchronization sub-system, using a clock of period T (clk, Fig. 2) in the hybrid encoder; the binary counter receives a clock S times faster (clk/S). Finally, this system uses reset and finish spikes as periodic waves with period T, active in the proximity of the origin. The reset spike at time 0+ clears the loop filter's internal state and the counter value. The finish spike at time 0− enables the delivery of the final estimation of the current source symbol û_{k−1} (ukd), one symbol delay late. This mixed-signal system requires attention to some fundamental design issues to prevent loop instabilities. A noiseless stability analysis revealed that correct decoding is the unique equilibrium point when the cosine of ε_k is positive. The decoding mechanism is different from traditional approaches based on the Viterbi algorithm. The encoder distance property does not influence the BER; instead, the error rate depends on the code-word statistics. Eq. (7) is an evolution of the BER found in [5], where the error formula is aligned to the specific automaton used in the simulation. In that work, c_k belongs to the set {0, 1} only, and the in-lock probability P_L takes an equivalent closed form. We extend this to the general case where the conditional probabilities are given not only the prior source symbol but also the discrete value of the random variable c_k. In this example, the code rate is 1/3, so c_k belongs to the set {0, 1, 2, 3, 4, 5, 6, 7}. The variable C is the counter value at step S; the counter is a typical birth Markov chain, and P_R{C ≥ S/2 | u_k, c_k} is the conditional probability [22] of the counter value given c_k and u_k.
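The birth-Markov-chain view of the counter lends itself to a short numerical sketch. The following dynamic program propagates the distribution of the counter value C sample by sample (a Poisson-binomial distribution when the per-sample probabilities differ) and returns the tail probability P{C ≥ threshold}; the per-sample probabilities `p_list` and the function name are our illustrative assumptions.

```python
def counter_tail_probability(p_list, threshold):
    """Probability that the counter value C (number of positive samples)
    reaches `threshold` or more, when sample m is positive with probability
    p_list[m]. The counter is modeled as a birth Markov chain: a DP over the
    distribution of C after each sample."""
    dist = [1.0]                            # P(C = 0) = 1 before any sample
    for p in p_list:
        new = [0.0] * (len(dist) + 1)
        for c, prob in enumerate(dist):
            new[c] += prob * (1.0 - p)      # sample negative: C stays
            new[c + 1] += prob * p          # sample positive: C is born +1
        dist = new
    return sum(dist[threshold:])
```

With identical per-sample probabilities, the result reduces to the ordinary binomial tail, which gives a quick sanity check.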
The approach in Viterbi decoding [19] finds an upper bound by assuming that the all-zero sequence is transmitted and determining the probability of deciding in favor of another sequence. Instead, we derive the first-event error probability as the probability that the receiver path diverges from the transmitted path for the first time at a given step (k) in the trellis. The first-event error probability is a part of the accurate BER formula, but in an optimistic scenario; therefore, it is a lower bound on the error rate. The PMFs of the automaton used in the first recovery loop at different signal-to-noise ratios in ascending order (Fig. 3) indicate that any recovery loop, with whichever encoder, has a code-word stochastic profile that converges to a single tone at zero at very high SNRs, achieving full stability. Thus, memory codes, or finite automata in general, with identical PMFs have the same error rate.

VLSI system implementation
The classical coded symbol recovery loop may be partitioned into three different sections (Fig. 1). The demodulation section includes the LNA, the mixer, the MP-VCO, and the loop filter. These blocks do not pose specific design difficulties, and they are currently used in any communication system. The MP-VCO is a ring oscillator with a plurality (M=8) of different phases, coupled to a standard sinusoidal oscillator based on the principle described in [23]. The analog estimation section has the ML trigger (a logic inverter) [24], the hybrid encoder (a few logic gates, see [5]), and the phase map. The MP-VCO hardware includes the phase-map block's mathematical behavior. The ML trigger is a comparator circuit with hysteresis, implemented by applying positive feedback to the non-inverting input of a comparator or differential amplifier. It is an active circuit that converts an analog input signal to a digital output signal. In the non-inverting configuration, when the input is higher than a chosen threshold, the output is high. We simulated this circuit with zero threshold and no hysteresis. Finally, the digital estimation section uses the sample-and-hold [25] and a 3-bit (S=8) binary counter (a few logic gates and three sequential cells) [26]. A binary counter can be constructed from J-K flip-flops by taking the output of one cell to the clock input of the next. The J and K inputs of each flip-flop are set to 1 to produce a toggle at each cycle of the clock input. For every two toggles of the first cell, a toggle is produced in the second cell, and so on down to the last cell. This produces a binary number equal to the number of cycles of the input clock signal. The sample-and-hold circuit is an analog device that samples the voltage of a continuously varying analog signal and holds its value at a constant level for a specified minimum period of time.
A typical CMOS sample-and-hold circuit stores electric charge in a capacitor and contains at least one switching device, such as a CMOS switch, and typically one operational amplifier. To sample the input signal, the switch connects the capacitor to the output of a buffer amplifier. The buffer amplifier charges or discharges the capacitor so that the voltage across the capacitor is practically equal, or proportional, to the input voltage. In hold mode, the switch disconnects the capacitor from the buffer. Although there is no physical implementation yet, the maximum allowable data rate (1/T) depends on the radio-frequency digital semiconductor technology used, instead of typical industrial/consumer CMOS, due to the very low latency required. The mixer and trigger latencies are other limiting factors for the maximum data rate measured in bits per second. In this work, the recovery loop operates at T=80 ns and S=8, which is satisfactory provided the total latency introduced by the logic and analog parts is less than 10 ns (T/S). We justify these settings since the binary counter has a clock rate of 100 MHz (10 ns), realistic and achievable with any current CMOS RF digital technology.
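The toggle-chain behavior of the J-K counter described above can be checked with a small behavioral model. This is a sketch of the counting behavior only (each cell toggles, and a falling edge of one cell toggles the next), not a gate-level model; the function name is ours.

```python
def ripple_counter(clock_cycles, n_bits=3):
    """Behavioral sketch of the 3-bit binary counter built from J-K flip-flops:
    with J = K = 1 each cell toggles on its clock, and each cell clocks the
    next on its falling edge, so the register counts input clock cycles
    modulo 2**n_bits."""
    bits = [0] * n_bits                     # flip-flop outputs, LSB first
    for _ in range(clock_cycles):
        for i in range(n_bits):
            bits[i] ^= 1                    # toggle this cell
            if bits[i] == 1:                # no falling edge: the next cell
                break                       # does not toggle
    return sum(b << i for i, b in enumerate(bits))
```

After S = 8 input cycles the 3-bit register wraps to zero, which is consistent with the reset-per-symbol operation of the loop.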

Proposed implementation method
The recovery loop introduced in [5] represents the cheapest implementation in terms of hardware, area, and power dissipation, making this circuit attractive for ultra-low-energy applications such as implantable devices, wireless sensor networks, and many others. Although the former implementation is competitive against any other loop based on the Costas circuit [27] or traditional Viterbi decoding, in this contribution we focus our attention on bit-error-rate optimization. We obtain S independent Gaussian random variables with a non-zero mean value. Therefore, Eq. (9) shows the LLR when the loop filter output is over-sampled and i of S sub-samples are available every T/S seconds, packed in the vector y_i. The function p is the conditional probability density function (PDF) of the received samples given the antipodal source symbol. The estimated source symbol û_k, a function of the available observations y_i, achieves the highest PDF. The notation in Eq. (9) follows the style in [28] and many other works from the same author. This equation indicates that the ML decision is +1 if the LLR is greater than or equal to 1.0, and −1 if the LLR is less than 1.0. According to Eq. (10), the PDFs p are Gaussian under our considered channel model, with statistical power σ² (inversely proportional to the SNR) and mean value proportional to +1 or −1, respectively.
Here in Eq. (10), σ² is the variance of the low-pass Gaussian random process after the mixer, filtered by a one-pole low-pass model with Laplace function H(s) and frequency response H(f) (s = j·2.0·π·f). N_0 is the noise power spectral density measured in watts per hertz, B is the pass-band bandwidth of the LNA, and f_p is the loop-filter cut-off frequency. In Eq. (11), we assumed a swift rise time under f_p · T ≫ 1. Therefore, given the independence of the over-samples, the PDF of the entire sequence y_i is Gaussian. We take the logarithm and, removing the positive constant 1/(√(2·π)·σ), the criterion (9) becomes Eq. (12). We can remove 2·σ² in (12), since it is a positive constant, and we change the operators' orientation by removing the minus sign. Therefore, the optimal decision strategy is given in (13). The result in (13) requires the source symbols +1 and −1 to have the same prior probability: 0.5. Finally, expanding Eq. (13) and simplifying the common terms, the current decision strategy, which uses the variable m from 1 to i, is Eq. (14). The decision formula (14) averages the zero-mean noise, strengthening the signal-to-noise ratio. The soft decoder applies rule (14) with the accumulation logic and a binary comparator, using fixed-precision arithmetic (more compact and energy-efficient than floating point). The ML soft detector makes the progressive binary estimation available with increasing reliability. The evaluation of (14) when i=S (the finish spike) gives the final source symbol estimation. This theory applies when the binary source symbols have the same prior probabilities. More generally, let q be the prior probability of the binary symbol +1 (P_R(u_k = +1)), different from 0.5; the new LLR is then given in Eq. (15) [29]. Here, in Eq. (15), the new log-likelihood ratio LLR' uses the former one derived in (9). Using the same approach, the new threshold and the corresponding decision regions follow in Eq. (16). Consequently, we move the sample-and-hold circuit after the loop filter (Fig. 4).
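The accumulate-and-compare rule can be sketched directly from the Gaussian log-likelihoods; since the exact forms of Eqs. (14)-(16) are not reproduced in this extraction, the following works with the log-likelihood difference itself, which covers both the equal-prior case (q = 0.5, threshold zero) and the biased-prior case via the log prior ratio. Names and the noiseless-ramp parameterization are our assumptions.

```python
import math

def soft_decision(samples, means, sigma, q=0.5):
    """ML/MAP soft-detector sketch for antipodal symbols: accumulate the
    per-sample Gaussian log-likelihood difference and compare it against the
    prior term. `means[m]` is the noiseless value at sample m for u_k = +1
    (the u_k = -1 waveform is its negative); q is the prior P(u_k = +1)."""
    acc = 0.0
    for y, mu in zip(samples, means):
        # log p(y | +1) - log p(y | -1); the Gaussian constants cancel
        acc += (-(y - mu) ** 2 + (y + mu) ** 2) / (2.0 * sigma ** 2)
    # MAP rule: decide +1 when log-LLR + log prior ratio >= 0 (ML when q=0.5)
    return +1 if acc + math.log(q / (1.0 - q)) >= 0.0 else -1
```

With q = 0.5 the rule reduces to the sign of the accumulated correlation, i.e. the averaging behavior attributed to Eq. (14); a small prior q shifts the threshold so weak positive evidence is decided as −1.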
The new decision strategy does not need the binary counter anymore, but a fixed/floating-precision accumulation logic and a comparator after converting the loop filter output into a digital string. The rest of the hardware is unchanged; the demodulation and the circuitry for channel and timing recovery are the same as illustrated in [5].

Decoding at a higher data rate
Any hard or soft decoder requires a finite automaton whose rate is 1/R and source information symbols in the set {0,1}. We complete the data decoding theory using a coded symbol recovery loop when the source symbols belong to the set {0,1,2,3} and, consequently, the finite automaton has a rate of 2/R. Eq. (1) represents the used modulation when the antipodal symbols u_k ∈ U = {−3, −1, +1, +3}; we keep the rest of the parameters, carefully considering M distinct symmetrical phases, aligned to Eq. (2). The general approach to whichever automaton rate, and therefore to any data rate, is straightforward.

The hard recovery loop at a higher data rate
In this new context, the use of multiple amplitudes in the modulated waveform requires correctly matching the signal at the loop filter output, connected to the hard detector, with four decision regions limited by the boundaries {−2, 0, +2} (Fig. 4 shows the coded symbol recovery loop using oversampling and soft detection when the source symbols have the same prior probabilities). Further, the scaled boundaries must account for the channel attenuation (h), which is either positive or negative and related to the radio link distance.
In (17), r(t) is the random process at the antenna, as the sum of the used modulation s(t) and the zero-mean, pass-band noise random process n_w(t). Hereafter, we justify the efficacy of the proposed sub-system for matching the hard boundaries by a noiseless analysis. The signal y(t) from the used loop filter, a one-pole low-pass filter (LPF), is an exponential smoothing, as depicted in Eq. (18) (ω_p = 2.0·π·f_p). This filter ideally receives a constant wave in [0,T] (Fig. 1), given in Eq. (19). The presence of exponential smoothing requires a proper scaling of the boundaries in the ML trigger. The scaling factor alc(t) of Eq. (20) applies in [0,T]. Thus, we derive a sub-system, the automatic level control (ALC), that, starting from the used modulation, calculates the scaling factor (20) for the hard boundaries. This additional hardware is mandatory for high-data-rate recovery loops to work correctly, considering the unknown channel attenuation and the multiple decision regions. We propose two ALC implementations; the first circuit (Fig. 5 A) is a serial chain of a delay circuit (T), a square block, and a voltage-controlled amplifier (VCA) that scales its input q(t) with the square of the previously estimated antipodal source symbol, as in Eq. (21). The ALC receives the signal at the antenna (17); the waveform at the square block output port is given in (22). By trigonometric manipulation, Eq. (22) becomes, in [0,T], Eq. (23). The VCA removes the gain u²_{k−1} under the hypothesis of correct detection at step k−1 (û_{k−1} = u_{k−1}) and correct tracking (ε_{k−1} = 0). After the VCA, we have a square-root block with gain √0.5 whose output is given in (24). Thus, the low-pass filter with Laplace function H(s) and the reset port, identical to that used in the recovery loop, partially removes the periodic waveform represented by the square root in Eq. (24), generating a waveform similar to (20).
This approach fails under negative attenuation, where the loop filter output has the wrong sign. For this reason, we propose a second implementation (Fig. 5 B) made of the same delay circuit, a mixer, a loop filter, and an MP-VCO aligned to the model (3), receiving the previously estimated phase θ̂_{k−1}. We still use the signal at the antenna r(t). Now q(t) is given in (25). In this new implementation, the VCA input-output behavior follows Eq. (26). After the filter H(s), we obtain the desired scaling factor (20). Thus, under the hypothesis of correct decoding and tracking at step k−1, the four decision regions are correctly scaled in any signaling interval. According to this new modulation, the new coded symbol recovery loop using oversampling (Fig. 6) uses an ML hard detector with four decision regions separated by the boundaries −2, 0, and +2, respectively scaled by alc(t). This new system architecture keeps the demodulation section, including the mixer, the MP-VCO, the hybrid encoder, and the loop filter. The over-sample section is still composed of the sample-and-hold and the 3-bit (S=8) binary counter.
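The role of the ALC can be sketched numerically. Since Eq. (20) is not reproduced in this extraction, we assume the exponential-smoothing form h·(1 − exp(−ω_p·t)) implied by the one-pole filter of Eq. (18), with the signed attenuation h that solution B is able to track; the function name and signature are ours.

```python
import math

def alc_scaled_boundaries(t, f_p, h, boundaries=(-2.0, 0.0, 2.0)):
    """Sketch of the ALC role: the one-pole loop filter turns a constant
    input into an exponential smoothing, so the hard-decision boundaries
    must be scaled by the same factor (assumed form: h*(1 - exp(-omega_p*t));
    h is signed, as tracked by the second ALC implementation)."""
    omega_p = 2.0 * math.pi * f_p
    alc = h * (1.0 - math.exp(-omega_p * t))    # scaling factor on [0, T]
    return [b * alc for b in boundaries]
```

At the start of the interval the scaled boundaries collapse to zero, and once the filter settles they approach ±2h, matching the decision regions expected by the four-level hard detector.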

The soft recovery loop at a higher data rate
The soft recovery loop at a high data rate derived in this work, applying maximum-likelihood decision theory, now considers a generalized strategy [29]. The PDF p is exponential under the Gaussian additive noise, so, taking the logarithm, we obtain an equation similar to (13): the criterion of Eq. (29). In this new context, the coded symbol recovery loop with soft values, working with source symbols at a 2-bit-per-symbol data rate (Fig. 7), is a generalization of Eq. (29). This new recovery loop is a consequence of the system using binary encoding (Fig. 4) and the related hard-detection counterpart (Fig. 6). We identify a soft decoder made of an ADC after the sample-and-hold and digital logic to realize the criterion (29). The novelty is the measurement of the soft values, now with the computation of the squared Euclidean distance, using fixed-precision arithmetic, between the observation y_m and the trial source symbol u scaled by the ALC value, converted into a binary string by another ADC at T/S clock rate. The min operator in (29) is digital logic made of a hierarchy of fixed-point comparators. Finally, the ALC sub-system is the same as described above under hard decoding.
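The minimum-distance criterion just described can be sketched in a few lines: accumulate, per trial symbol, the squared distance between each over-sample and the ALC-scaled symbol, then pick the minimum. This is an illustrative floating-point sketch of the fixed-point register bank; names are ours.

```python
def soft_decode_4ary(samples, alc_values, symbols=(-3.0, -1.0, 1.0, 3.0)):
    """Sketch of the generalized soft rule of Eq. (29): for each trial symbol
    u, accumulate the squared Euclidean distance between the observation y_m
    and u scaled by the ALC value, then select the minimum-metric symbol."""
    metrics = {u: 0.0 for u in symbols}         # the four accumulator registers
    for y, alc in zip(samples, alc_values):
        for u in symbols:
            metrics[u] += (y - u * alc) ** 2    # squared-distance accumulation
    # hierarchy of comparators; ties resolve to the first symbol (-3), which
    # mirrors the strictly-less search starting from r_{-3}
    return min(symbols, key=lambda u: metrics[u])
```

When the ALC output is still zero (first symbol, delay line empty), all four registers stay equal and the tie resolves to −3, consistent with the design rule that the transmission starts at logic zero.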

Simulation results
We modeled these new recovery loops using the SystemC/SystemC-AMS C++ library and additional free software, such as the GSL (GNU Scientific Library) and IT++ libraries, mainly to model the AWGN noise. Our primary performance parameter is the bit-error rate measured over an extensive set of benchmarks: the proposed recovery loop, the Costas loop with oversampling and soft detection (Fig. 8), the classic soft Viterbi decoder, and finally, the 1-bit quantization and oversampling decoder (Fig. 9). (Fig. 6: the recovery loop delivering hard decisions with a 2-bit-per-symbol data rate. Fig. 7: the recovery loop delivering soft decisions with a 2-bit-per-symbol data rate. Fig. 8: the simulated Costas loop with oversampling and soft decisions. Fig. 9: the simulated 1-bit quantization and oversampling decoder.) In the 1-bit quantization scheme, the complex baseband received signal x(t) corresponds to the complex transmitted coded symbol c_k, given as a weighted sum of time-shifted transmit pulses p(t), disturbed by additive white Gaussian noise w(t). At the receiver, x(t) is processed by the receive filter with impulse response g(t). Finally, we have a 1-bit ADC followed by the classic soft Viterbi decoder. The filters p(t) and g(t) do not generate ISI in the over-samples.
All the comparisons were made in the same environment and with the same settings; we use T=80 ns, S=8 over-samples, and a carrier frequency of 400 MHz. We used the same encoding matrices in the recovery loops and in the Viterbi decoders. The ADC on board the proposed recovery loop has a 16-bit resolution and fixed-point arithmetic. The mixer and ML trigger outputs have a 2 ns delay, representative of non-ideal behavior. Another performance metric is the normalized mean square error (MSE) of the ALC output with respect to the real alc(t), defined in Eq. (30), where E{·} is the stochastic expectation operator [22]. We initially assumed perfect phase and timing recovery. Finally, we evaluated the impact of a diversity scheme on the error rate, with a system architecture made of three receiving antennas, simulating the modified recovery and Costas loops.

Binary recovery loops
We use the same finite automaton and hybrid encoder introduced in [5]. The noise is additive white Gaussian with an SNR ranging from −2 dB to 2 dB. The proposed recovery loop has the minimum error rate at any SNR (Fig. 10). The modified Costas loop is the real competitor for this metric, due to the oversampling. The SNR gain of the recovery loop over the hard decision is around 3 dB at BER = 10⁻⁴ using S=8 independent samples, which enrich the sufficient statistic. This result is a consequence of ML theory. The system based on 1-bit quantization has a poor error rate since the 1-bit ADC degrades the Viterbi internal metrics; averaging the over-samples from the 1-bit ADC output does not significantly improve the error probability. The classic soft Viterbi decoder, without oversampling, confirmed the result of the former work [5]. Finally, we simulated the proposed recovery loop with the carrier and symbol recovery schemes used in [5], obtaining an average SNR loss in the BER of 0.8 dB in the considered range [−2.0 dB, 2.0 dB].
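The hard-versus-soft gap reported above can be reproduced qualitatively with a toy Monte-Carlo sketch. This is not the paper's SystemC/SystemC-AMS model: it assumes the idealized ramp-plus-AWGN sample model used throughout this discussion, an arbitrary SNR-to-sigma mapping, and our own function names; it only illustrates that the soft accumulator beats the majority vote under the same noise realizations.

```python
import random

def simulate_ber(decider, snr_db, S=8, trials=20000, seed=1):
    """Toy Monte-Carlo sketch (not the paper's SystemC-AMS model): random
    antipodal symbols pass through an idealized ramp-plus-AWGN sample model,
    and we count the errors of a given decider(samples) -> +/-1."""
    rng = random.Random(seed)
    sigma = 10.0 ** (-snr_db / 20.0)        # assumed noise scaling
    errors = 0
    for _ in range(trials):
        u = rng.choice((-1, +1))
        samples = [u * m / S + rng.gauss(0.0, sigma) for m in range(1, S + 1)]
        if decider(samples) != u:
            errors += 1
    return errors / trials

def hard_vote(ys):
    """Majority vote over the sign of each over-sample (threshold S/2)."""
    return +1 if sum(1 for y in ys if y > 0.0) >= len(ys) // 2 else -1

def soft_sum(ys):
    """ML accumulation weighted by the noiseless ramp (a matched filter);
    under the paper's fast-rise assumption this reduces to a plain sum."""
    return +1 if sum((m + 1) * y for m, y in enumerate(ys)) >= 0.0 else -1
```

Running both deciders with the same seed gives a paired comparison: at 0 dB the soft accumulator's error rate is visibly below the majority vote's, in line with the ML argument (the absolute numbers depend on the assumed noise scaling and are not the paper's figures).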

Non-binary recovery loops
In this context, the soft decoder delivers the estimated source symbol that, according to a 2/3 code rate, belongs to a set of four possible values. The used automata follow the guidelines in [5] for loop stability and the best error rate. A deterministic finite automaton (DFA) models our encoder. A digraph called a state diagram represents a DFA, which can be described by a 7-tuple (Q, Σ, δ, q_0, F, O, X). The soft detector internal to the proposed recovery loop, with the higher modulation, has four fixed-point registers (r_û_k) to evaluate the criterion in Eq. (29). We set a specific design strategy for the decoding system to work efficiently. The transmission always begins with the first source symbol at logic zero (u_k = −3.0), mandatory for the correct behavior of the whole circuit. In this condition, the ALC output (Fig. 5 B) alc_m is zero due to the delay line; the two thresholds −2 and +2 collapse to zero, and the four registers have the same value. This soft decoder selects the output symbol with an iterative procedure starting from (r_{−3}), searching for the minimum register value using the strictly-less operator (<). In this way, the first decision is always correct. We set the channel attenuation to positive and negative values, measuring the ALC and the loop filter outputs (Fig. 11). We find that the error rate in this new configuration is better than any considered alternative approach (Fig. 12): the modified Costas loop, the classic Viterbi, and the 1-bit quantization with oversampling Viterbi decoding. (Fig. 12: the bit-error rate of the soft and hard recovery loops with ALC and a data rate of 2 bits per symbol; the Costas loop has ALC and delivers soft decisions; the soft ML Viterbi works with a 2/3 encoder rate. Fig. 13: the measured MSE of the ALC sub-system working with the higher modulation scheme.) The soft-based recovery loop gains almost 2.5 dB in SNR over the hard counterpart at BER = 10⁻³.
Finally, we simulated this system with the carrier and symbol recovery schemes used in [5], obtaining an average SNR loss in the BER of 0.6 dB in the considered range [2.0 dB, 6.0 dB].

ALC tracking performance
The use of ML detection theory justified the primary goal of improving the error rate as a performance metric. In this sub-section, we introduce another performance parameter that measures the effectiveness and the convergence of the channel amplitude tracking. We measured the mean square error (30) of the ALC output, varying the SNR from 2 dB to 6 dB with the higher-order modulation and code rate 2/3. Results (Fig. 13) show how solution B, working with both positive and negative channel attenuation, has better statistics than solution A, which works with positive attenuation only. This is not a surprise: the presence of the square and square-root blocks introduces a bias in the ALC that is difficult to remove with a simple one-pole low-pass filter, as discussed in Sect. 5.

Detection in diversity
The diversity scheme represents another approach to improve the error rate beyond a single-antenna receiver. We simulated a system architecture of three different recovery loops working with three antennas. The final decoded symbol comes from another majority vote. We simulated our soft recovery loops under the DSSC-AM (rate 1/3) and the higher-order modulation (rate 2/3) (Loop Soft), and the modified Costas circuit (Costas Soft). Results (Fig. 14) show how the multiple arrangement of recovery loops has a better error rate than the direct competitor, with an SNR gain of at most 0.5 dB at BER = 10⁻³ and BER = 10⁻² for code rates of 1/3 and 2/3, respectively. (Fig. 14: the bit-error rate of the proposed recovery loop and the Costas loop in a diversity configuration with three different receiving antennas.)

Conclusion and future directions
This work introduces the basic principle of maximum likelihood detection theory applied to a coded symbol recovery loop using oversampling to improve the error rate. The price to pay for this advantage is a substantial increase in circuit complexity, affecting the most critical design parameters: the total circuit area, which relates to the wafer's cost, and the power dissipation, which determines the lifetime of wireless and portable systems. The use of a higher modulation scheme requires a mandatory scaling of the signal fed to the internal decoder (hard or soft) to match the multiple decision regions. The proposed approaches enable the use of this kind of recovery loop when the error probability and the data rate are critical design parameters at any hardware cost. The waveform scaling proposed in this work, and the related hardware implementation, lay the basis for this new kind of data recovery over a multiplicative flat-fading channel model. In the future, these conclusions will lead to a new recovery loop working in a scenario of multi-path propagation. Finally, current studies on deep learning algorithms enable this new technology by using pilot-symbol-assisted transmission to augment the knowledge of the wireless channel beyond the single probability density function. The related data set, containing received values of the loop filter output, together with classification algorithms, improves the quality of the over-samples, enabling corrections and reducing the bit error rate in highly noisy channels. In this last application, the nature of over-sampling makes the analog estimations cyclo-stationary, making the process of data acquisition and on-the-fly correction practical. Deep learning is also used to design source-channel coding for particular applications, such as transmitting specific kinds of information while preserving certain quality parameters.
There are many applications; for example, the transmission of text while preserving the semantics of the messages [30]. We can imagine using deep learning to select the best encoding system in the recovery loop, optimizing the PMF of the code words as the unique constraint to reduce the error rate. Next, we can model the entire recovery loop process, or part of it, as a neural network assisted by deep learning [31]. However, there is much room for improvement in this research area, with the challenge of minimizing the additional hardware and justifying the use of those algorithms by the achievement of considerable benefits.
Acknowledgements Not available.

Author Contributions
The author read and approved the final manuscript.

Funding Not applicable.
Data Availability Please contact the author for data request or visit his Homepage.
Code Availability Custom code, available on GitHub soon.

Conflict of interest
The author declares that he has no competing interests.
Consent for publication Not applicable.