Organic Electronics from Nature: Investigation of the Electronic and Optical Properties of the Isomers of Bixin and Norbixin Present in the Achiote Seeds

Organic semiconductors have been widely employed in developing new green energy solutions with a good compromise between cost and efficiency, such as organic photovoltaics (OPVs). The light-harvesting process in OPVs is a crucial aspect that still needs improvement. In this context, dye-sensitized solar cells (DSSCs) have arisen as a technically and economically credible alternative. In this work, we performed density functional theory (DFT) calculations to investigate the electronic and optical properties of the four natural dyes found in the seeds of annatto (Bixa Orellana L.), a plant widely found in tropical America. Different DFT functionals and basis sets were used in the calculations for bixin, norbixin, and their trans-isomers (the molecules present in Bixa Orellana L.). All molecules present a conjugated backbone with nine double bonds. We observed that the planarity of the molecules and the similar extension of their conjugation pathways provide substantially delocalized frontier-orbital wavefunctions and similar orbital energies. Moreover, our findings showed a strong absorption peak in the blue region and an absorption band over the visible spectrum, thus indicating that these molecules are good candidates for organic electronic and optoelectronic applications. The results were compared with experimental data.


Introduction
Organic semiconductors have been successfully employed in developing several optoelectronic devices [1][2][3][4]. However, organic photovoltaics (OPVs) have not yet reached the desired maturity concerning their physicochemical stability and efficiency [5]. For this reason, the search for and design of new organic semiconducting materials to overcome these barriers is still growing [3,5]. Another problem that still limits OPV efficiency is the light-harvesting capability (LHC) [6]. In this context, dye-sensitized solar cells (DSSCs) have arisen as a technically and economically credible alternative [7][8][9]. The working principle of DSSCs is based on increasing the LHC of OPVs by including a layer of molecular dyes that enhances the creation of excitons, which may then dissociate into free charge carriers, thus improving the efficiency of the photovoltaic effect in the device [8,9]. Since the advent of DSSCs, the search for and design of new dyes that can improve the LHC of OPVs have received much attention [9].
Among the sources of natural dyes is achiote (Bixa Orellana L.), a plant commonly found in South and Central America whose seeds, after being crushed, are used as a condiment and a food coloring [10,11]. Widely found in the Amazon region, it is better known in Brazil by the popular name urucum. It is also known by other popular names in different countries: atolé, achiote, and bija (Peru and Cuba); axiote (Mexico); achiote, achote, anatto, bija, and santo-domingo (Puerto Rico); bixa (Guyana); analto (Honduras); guajachote (El Salvador); onotto and onotillo (Venezuela); achiote and urucu (Bolivia); urucu (Argentina); roucou (Trinidad); roucou and koessewee (Suriname); and annatto (United States) [12]. The pigments present in these seeds, especially bixin and norbixin, can be applied in the textile, cosmetic, and pharmaceutical industries [12] and as a preservative for meat derivatives [13]. They have also been employed in developing novel DSSCs [14].
Bixin (see Figure 1), with molecular formula C25H30O4, represents nearly 80% of the pigments in the seed. It has a carboxylic acid group at one end and an ester group at the other. These groups are separated by a methyl-substituted chain of nine conjugated double bonds, which serves as an excellent acceptor of free radicals [15]. Commonly, bixin has the Z configuration at the sixteenth carbon and the E configuration along the rest of the chain [16]. The isomer in which the whole chain has the E configuration, named trans-bixin or isobixin (see Figure 1), may be formed during the pigment extraction process [16].
Norbixin has the molecular formula C24H28O4 and presents, apart from the conjugated chain, a carboxylic acid group at each end, which together are responsible for the anionic and water-soluble character of this pigment [17]. Like bixin, norbixin presents a Z configuration in its conjugated chain. Controlled heating of norbixin produces trans-norbixin or isonorbixin, a molecule that, like isobixin, has the E configuration along the whole conjugated chain [17]. Although these pigments have already been used in the development of DSSCs [14], their optoelectronic and structural properties are still not well understood.
Herein, we employed density functional theory (DFT) and time-dependent DFT (TDDFT) calculations to study the structural and optoelectronic properties of the four molecular dyes present in achiote (Bixa Orellana L.): bixin, isobixin, norbixin, and isonorbixin. The computational protocol considers different DFT functionals and basis sets to estimate crucial optical parameters of these dyes, such as the vertical transition energies, wavelengths, oscillator strengths, and transition dipole moments. The bond-length alternation (BLA), frontier molecular orbitals, nonlinear optical properties, and absorption spectra were also obtained. The BLA provides useful information on conjugated oligomers, since the extension of the conjugation pathway in their backbone is an important parameter associated with the mobility of the charge carriers. Our results were compared with experimental data.

Computational details
To obtain the optimized molecular geometries, we used DFT calculations with three different functionals, i.e., B3LYP, M06, and CAM-B3LYP, and the 6-31+G(d,p) basis set [18][19][20][21]. We also optimized the geometries of the molecules presented in Figure 1 using these functionals with the 6-31G and 6-31G(d,p) basis sets. The calculations were carried out for the molecules in the gas phase and, using the polarizable continuum model (PCM), in chloroform solution.
Low-lying singlet excited states were evaluated at the optimized geometries using time-dependent density functional theory (TDDFT) [22]. The optical absorption profiles were simulated by convolving the vertical transition energies with Gaussian functions with a full width at half maximum (FWHM) of 0.37 eV (3000 cm⁻¹). All calculations were performed using the Gaussian 09 (Revision D.01) suite [23].
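The broadening step described above can be sketched in Python as follows. This is a minimal illustration, not the procedure implemented in Gaussian 09: the function names are hypothetical, and the normalization (unit-area Gaussians weighted by the oscillator strength) is an assumption, so absolute intensities may differ from those of a plotted spectrum.

```python
import math

# 1 cm^-1 expressed in eV (h*c in eV*cm)
EV_PER_WAVENUMBER = 1.239841984e-4


def wavenumber_to_ev(wavenumber_cm1):
    """Convert an energy in cm^-1 to eV."""
    return wavenumber_cm1 * EV_PER_WAVENUMBER


def fwhm_to_sigma(fwhm):
    """Standard deviation of a Gaussian with the given full width at half maximum."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def broadened_spectrum(energies_ev, strengths, grid_ev, fwhm_ev=0.37):
    """Sum of Gaussians centred on each vertical transition energy.

    Each Gaussian has unit area and is weighted by the oscillator
    strength of its transition (an assumed normalization).
    """
    sigma = fwhm_to_sigma(fwhm_ev)
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    spectrum = []
    for e in grid_ev:
        total = 0.0
        for e0, f in zip(energies_ev, strengths):
            total += f * norm * math.exp(-0.5 * ((e - e0) / sigma) ** 2)
        spectrum.append(total)
    return spectrum


if __name__ == "__main__":
    # Hypothetical transitions (eV) and oscillator strengths, for illustration only.
    grid = [1.5 + 0.01 * i for i in range(301)]
    spec = broadened_spectrum([2.6, 3.4], [3.0, 0.2], grid)
    peak_energy = grid[spec.index(max(spec))]
    print(f"band maximum near {peak_energy:.2f} eV")
```

The helper `wavenumber_to_ev` makes the equivalence between the two FWHM values quoted in the text explicit: 3000 cm⁻¹ corresponds to about 0.37 eV.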

Structural and electronic properties
We begin our discussion by presenting the geometric properties of the molecular dyes studied here. Figure 2 illustrates their optimized structures. As a general trend, the molecules, in both chloroform solution and the gas phase, adopt almost planar configurations with small torsion angles (about 1-2 degrees) at the chain ends. Such an arrangement allows the wavefunction to delocalize over the π-conjugated backbone. All the molecules show similar extensions of conjugation, i.e., nine C=C double bonds on the π-conjugated backbone. According to earlier studies [24,25], these findings indicate that the dyes in Figure 1 should present similar electronic and optical properties.
As mentioned above, BLA is a crucial geometric parameter related to the electronic energy gap [26,27]. BLA is defined as $(R_{\mathrm{single}} - R_{\mathrm{double}})/N$, where $R_{\mathrm{single}}$, $R_{\mathrm{double}}$, and $N$ denote the single-bond lengths, the double-bond lengths, and the number of single-double bond pairs in a π-delocalized system, respectively [27]. Here, we used the BLA values to identify possible changes in the bond-length pattern of the dyes. Figure 3 shows the examined bonds, and Table 1 and Tables S1-S6 (Supplementary Material) show the bond lengths and BLA values of the π-conjugated backbone for the dyes in the gas phase and in chloroform solution. For each DFT functional, both bond lengths and BLA values are similar among the dyes, and the solvent effect appears as a reduction of the BLA values relative to the gas-phase molecules. CAM-B3LYP provides longer single bonds and shorter double bonds than B3LYP and M06. As a consequence, the CAM-B3LYP results present higher BLA values. The total BLA values increase in the sequence B3LYP, M06, CAM-B3LYP, indicating that a higher Hartree-Fock (HF) contribution to the DFT functional leads to higher BLA values. This behavior is also reflected in the HOMO-LUMO energy gap, as shown later.

Figure 4 and Figures S2-S4 (see Supplementary Material) illustrate the HOMO and LUMO wavefunctions of the dyes in chloroform solution and in the gas phase. One can note that the frontier molecular orbitals are widely delocalized over the π-conjugated backbone. Moreover, no impediment to electronic mobility along the π-conjugated chain is observed. This feature lends a metallic character to the polyenic systems, since the π-electrons of the conjugated chain do not belong to a particular bond between atoms, which allows charge to move freely along the chain [28].
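The BLA definition used in this section can be sketched in a few lines of Python. The sketch assumes that $R_{\mathrm{single}}$ and $R_{\mathrm{double}}$ are sums over the examined single and double bonds and that the bonds are listed in order along the backbone starting with a double bond; both points are assumptions, since the exact set of bonds is defined in Figure 3. The function name and the bond lengths in the usage are hypothetical.

```python
def bond_length_alternation(bond_lengths):
    """BLA of a chain whose bonds alternate double, single, double, ...

    bond_lengths: bond lengths (in angstrom) in order along the backbone,
    starting with a double bond (assumption of this sketch).
    Returns (sum of single bonds - sum of double bonds) / number of pairs.
    """
    doubles = bond_lengths[0::2]
    singles = bond_lengths[1::2]
    n_pairs = min(len(singles), len(doubles))
    if n_pairs == 0:
        raise ValueError("at least one single-double bond pair is required")
    return (sum(singles[:n_pairs]) - sum(doubles[:n_pairs])) / n_pairs


if __name__ == "__main__":
    # Hypothetical bond lengths (angstrom), not values from Table 1.
    chain = [1.36, 1.44, 1.37, 1.43, 1.37, 1.44]
    print(f"BLA = {bond_length_alternation(chain):.3f} angstrom")
```

With this convention a larger BLA means a more pronounced difference between single and double bonds, i.e., a less delocalized backbone.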
According to Koopmans' theorem, the HOMO energy is a first approximation to the molecular ionization potential [29]. By analogy, the LUMO energy is an approximation to the electron affinity. In this context, Table 2 and Tables S7-S12 (see Supplementary Material) show the energies of the frontier molecular orbitals (MOs) and the HOMO-LUMO gap energies of the dyes in the gas phase and in chloroform solution. For a given functional, the frontier MO energies and the gap energy vary only slightly among the dyes. The gap energy does differ between DFT functionals, increasing in the sequence B3LYP, M06, CAM-B3LYP. These differences are related to the HF contribution, since higher HF contributions to the DFT functional induce higher gap energy values. We also note an interplay between BLA and the electronic gap, where an increase in the BLA values leads to an increase in the gap energy values. In general, for all DFT functionals and basis sets used, the cis conformation presents higher gap energy values than the trans one, i.e., the trans conformation is energetically more stable than its cis analog.
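The Koopmans-type estimates mentioned above amount to simple sign changes and a difference, as the following Python sketch shows. The function name and the orbital energies in the usage are hypothetical and are not taken from Table 2.

```python
def koopmans_estimates(e_homo_ev, e_lumo_ev):
    """First-order estimates from frontier orbital energies (all in eV).

    ionization potential ~ -E(HOMO)
    electron affinity    ~ -E(LUMO)
    gap                  =  E(LUMO) - E(HOMO)
    """
    return {
        "ionization_potential": -e_homo_ev,
        "electron_affinity": -e_lumo_ev,
        "gap": e_lumo_ev - e_homo_ev,
    }


if __name__ == "__main__":
    # Hypothetical orbital energies (eV), for illustration only.
    for name, value in koopmans_estimates(-5.0, -2.5).items():
        print(f"{name}: {value:.2f} eV")
```

These are orbital-energy approximations only; they neglect orbital relaxation, and for DFT orbitals they depend on the functional, which is consistent with the functional dependence of the gap discussed above.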

Nonlinear optical properties
It is known that the nonlinear optical (NLO) properties of organic molecules are significantly influenced by the BLA, so the NLO response of conjugated organic molecules can be optimized by varying the BLA values, as investigated by Marder et al. [30]. In addition, choosing an appropriate basis set is essential for an accurate description of NLO properties, a point that has been widely investigated [31][32][33]. In this context, we investigated the electric properties of the isomers in both the gas phase and chloroform solution to see how these properties are affected by the BLA and by the choice of basis set. The analyzed quantities were those usually measured experimentally, i.e., the dipole moment magnitude $\mu = \sqrt{\mu_x^2 + \mu_y^2 + \mu_z^2}$, the average linear polarizability $\alpha = (\alpha_{xx} + \alpha_{yy} + \alpha_{zz})/3$, and the vector component of the first hyperpolarizability $\beta_{vec} = \sum_i \beta_i \mu_i / \mu$ ($i = x, y, z$), i.e., the projection of the first hyperpolarizability onto the dipole moment direction.

Tables 3 and S13-S16 show the absolute values of the electric quantities obtained with the different DFT functionals and basis sets. The values of $\mu$, $\alpha$, and $\beta_{vec}$ obtained with the CAM-B3LYP functional are smaller than those obtained with B3LYP and M06. The exception is isonorbixin, for which B3LYP yields smaller values of $\mu$ and $\beta_{vec}$ than CAM-B3LYP and M06. As CAM-B3LYP provided the highest BLA values (see Table 1), the findings indicate that higher BLA values lead to lower values of $\mu$, $\alpha$, and $\beta_{vec}$. This relationship is in agreement with the work of Labidi et al. on trans-hexatriene [34]. In addition, isobixin presented the highest values of $\mu$ and $\alpha$, and bixin presented the highest values of $\beta_{vec}$. Furthermore, the 6-31+G(d,p) basis set provided the highest values of $\alpha$, followed in descending order by the 6-31G and 6-31G(d,p) basis sets, so the inclusion of diffuse functions in the basis set increases the $\alpha$ values. On the other hand, this behavior is not observed for $\mu$ and $\beta_{vec}$. For the isomers in the gas phase, the highest values of $\mu$ and $\beta_{vec}$ are provided by the 6-31G basis set. However, in some cases the 6-31+G(d,p) basis set provided the highest values of $\mu$ and $\beta_{vec}$ for the isomers in chloroform solution. The values of $\mu$ and $\beta_{vec}$ obtained with the 6-31G(d,p) basis set are the lowest among the three basis sets. Finally, the solvent effect causes an increase of $\mu$, $\alpha$, and $\beta_{vec}$.
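The three electric quantities analyzed in this section can be evaluated from the dipole vector, the polarizability tensor, and the hyperpolarizability vector as in the following Python sketch. The function names and the numerical inputs are hypothetical. The sketch takes the vector components $\beta_i$ as given, because the convention used to contract the full $\beta$ tensor into a vector varies between references and is an assumption left to the user.

```python
import math


def dipole_magnitude(mu):
    """Magnitude of the dipole vector (mu_x, mu_y, mu_z)."""
    return math.sqrt(sum(component * component for component in mu))


def mean_polarizability(alpha):
    """One third of the trace of the 3x3 polarizability tensor."""
    return (alpha[0][0] + alpha[1][1] + alpha[2][2]) / 3.0


def beta_vec(beta, mu):
    """Projection of the hyperpolarizability vector onto the dipole direction.

    beta: vector components (beta_x, beta_y, beta_z), already contracted
    from the full tensor with whatever convention the user adopts.
    """
    norm = dipole_magnitude(mu)
    if norm == 0.0:
        raise ValueError("beta_vec is undefined for a vanishing dipole moment")
    return sum(b * m for b, m in zip(beta, mu)) / norm


if __name__ == "__main__":
    # Hypothetical values in arbitrary consistent units, for illustration only.
    mu = (3.0, 4.0, 0.0)
    alpha = [[600.0, 0.0, 0.0], [0.0, 200.0, 0.0], [0.0, 0.0, 100.0]]
    beta = (1000.0, 50.0, 0.0)
    print(dipole_magnitude(mu), mean_polarizability(alpha), beta_vec(beta, mu))
```

Because $\beta_{vec}$ is a projection onto the dipole direction, it can be negative and it is undefined for a centrosymmetric molecule with zero dipole moment, which is why absolute values are the natural quantity to tabulate.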

Excited states properties
We now turn to the description of the low-lying excited states and optical properties of the molecular dyes. Figure 5 and Figures S5-S6 (see Supplementary Material) show the absorption spectra of the molecules, obtained by Gaussian convolution of the vertical transitions with FWHM = 3000 cm⁻¹. One can see that the solvent induces a slight shift of the absorption bands to longer wavelengths. In each case, i.e., gas phase or chloroform solution with the same DFT functional and basis set, all molecules present their maximum absorption peak in the same region, which can be associated with the same extension of backbone conjugation and the close gap energies.
Calculations with different DFT functionals revealed a shift of the absorption bands to longer wavelengths in the sequence CAM-B3LYP, M06, B3LYP. We associate the increase of the gap energy with a decrease of the wavelength (equivalently, an increase in the vertical transition energy). The interplay between the BLA values and the position of the absorption bands can be understood in the same way: an increase in the BLA values displaces the absorption bands toward higher energies (shorter wavelengths).
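The inverse relation between transition energy and wavelength invoked above follows from $\lambda = hc/E$, which the following Python sketch makes concrete. The function names are hypothetical, and the energies in the usage are illustrative placeholders rather than computed values from this work.

```python
# h*c expressed in eV*nm
HC_EV_NM = 1239.841984


def ev_to_nm(energy_ev):
    """Wavelength (nm) of a photon with the given energy (eV)."""
    if energy_ev <= 0.0:
        raise ValueError("energy must be positive")
    return HC_EV_NM / energy_ev


def nm_to_ev(wavelength_nm):
    """Photon energy (eV) for the given wavelength (nm)."""
    if wavelength_nm <= 0.0:
        raise ValueError("wavelength must be positive")
    return HC_EV_NM / wavelength_nm


if __name__ == "__main__":
    # Hypothetical vertical transition energies (eV), for illustration only.
    for energy in (2.3, 2.5, 2.7):
        print(f"{energy:.1f} eV -> {ev_to_nm(energy):.0f} nm")
```

A larger transition energy always maps to a shorter wavelength, so a functional that opens the gap moves the band toward the blue.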
Finally, Table 4 and Tables S17 and S18 (see Supplementary Material) show the values of the vertical transition energy ($E_{01}$), the wavelength of the maximum absorption peak ($\lambda_{01}$), the oscillator strength ($f$), and the transition dipole moment ($\mu_{01}$). In all cases, the transition dipole moment lies mainly along the conjugated backbone, i.e., along the x-direction, and isobixin and isonorbixin presented the higher $\mu_{01}$ values. The experimental data show that the absorption maxima of bixin, isobixin, norbixin, and isonorbixin are in the blue region, i.e., at 470, 476, 468, and 475 nm, respectively [35]. From the comparison with these data, we conclude that the CAM-B3LYP functional with the 6-31+G(d,p) basis set provided a better description of the optical properties than the other levels of theory considered.
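The quantities listed in Table 4 are not independent: in atomic units the oscillator strength of a transition is $f = \tfrac{2}{3} E_{01} |\mu_{01}|^2$. The Python sketch below applies this standard relation after unit conversion. It assumes that $E_{01}$ is given in eV and $\mu_{01}$ in debye, which is an assumption about the units of the tabulated data; the function name and the input values are hypothetical.

```python
HARTREE_IN_EV = 27.211386   # 1 hartree in eV
AU_DIPOLE_IN_DEBYE = 2.541746  # 1 atomic unit of dipole moment (e*a0) in debye


def oscillator_strength(e01_ev, mu01_debye):
    """Oscillator strength f = (2/3) * E * |mu|^2, evaluated in atomic units.

    e01_ev: vertical transition energy in eV (assumed unit).
    mu01_debye: transition dipole moment magnitude in debye (assumed unit).
    """
    energy_au = e01_ev / HARTREE_IN_EV
    dipole_au = mu01_debye / AU_DIPOLE_IN_DEBYE
    return (2.0 / 3.0) * energy_au * dipole_au ** 2


if __name__ == "__main__":
    # Hypothetical inputs, not values from Table 4.
    print(f"f = {oscillator_strength(2.6, 15.0):.2f}")
```

Since $f$ grows with the square of the transition dipole moment, a larger $\mu_{01}$ at a similar transition energy translates directly into a more intense absorption peak.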

Conclusions
In summary, we employed DFT and TD-DFT calculations to study the geometrical and optoelectronic properties of the bixin and norbixin isomers. These molecules are present in the seeds of achiote, a plant found in tropical America. Since they absorb strongly in the UV-Vis region, they can be good candidates for developing novel DSSCs. The DFT and TD-DFT calculations were conducted with three different functionals (B3LYP, CAM-B3LYP, and M06) and three basis sets (6-31+G(d,p), 6-31G(d,p), and 6-31G).
As a general trend, we observed that these molecules, in both chloroform solution and the gas phase, adopt almost planar configurations with small torsion angles at the chain ends. This arrangement allows the wavefunction to delocalize over the π-conjugated backbone. Moreover, the similar extension of their conjugation pathways leads to close values of their MO energies. The HOMO-LUMO gap energy values increased in the sequence B3LYP, M06, CAM-B3LYP as a consequence of the increasing HF contribution to the DFT functional.
In the study of the optical properties, we observed that the increase of the HF contribution is reflected in a shift of the bands to shorter wavelengths (higher energies). The absorption bands of the molecules in chloroform solution were slightly shifted to longer wavelengths relative to the gas phase. We also obtained the vertical transition energies, wavelengths, oscillator strengths, and transition dipole moments. The transition dipole moment of all the molecular dyes is aligned with the molecular axis, and the comparison with the experimental data showed that the CAM-B3LYP functional with the 6-31+G(d,p) basis set provided a better description of the optical properties.
It is important to stress that bixin is the main carotenoid found in the annatto seeds. In addition, the E-isomer (isobixin) provides a higher absorption peak than the Z-isomer (bixin), which indicates that, when choosing between the two for possible application in electronic or optoelectronic devices, isobixin would be the appropriate choice.

Table 4. $S_0 \to S_1$ vertical transition energies ($E_{01}$), wavelengths ($\lambda_{01}$), oscillator strengths ($f$), and transition dipole moments ($\mu_{01}$). These results were obtained by employing the 6-31+G(d,p) basis set.