Selective Terminal Functionalization of Linear Alkanes

Selective undirected functionalization of the strong primary C-H bonds of linear alkanes that do not possess directing groups historically stands as one of the most challenging transformations in chemistry. In this Article, we report a two-step sequential strategy, involving a biocatalytic dehydrogenation followed by a remote hydrofunctionalization, as a unified and versatile approach to selectively convert linear alkanes into a large array of valuable functionalized aliphatic derivatives. The dehydrogenation is carried out by a mutant strain of Rhodococcus, and the alkenes produced are then engaged in a remote functionalization through a metal-catalyzed hydrometalation/migration sequence whose organometallic products subsequently react with a large variety of electrophiles. The judicious implementation of this combined biocatalytic and organometallic approach enabled us to develop a high-yielding protocol to site-selectively functionalize unreactive primary C-H bonds.

Main Text
Selective functionalization of unreactive C-H bonds represents a significant paradigm shift from the standard logic of organic synthesis, traditionally confined to the orchestration of transformations of functional groups [1][2][3][4]. Research efforts in C-H bond functionalization have mainly focused on the use of directing groups, present or pre-installed, or on the undirected activation of particularly reactive C-H bonds. Extending this logic to undirected C-H bond functionalization requires the discovery of new strategies that would accurately activate and differentiate energetically similar C-H bonds, present both in the substrate and in the product, without cross-activation (Fig. 1a) 5. The selective functionalization of alkanes that do not possess directing groups historically stands as one of the most challenging transformations in chemistry, particularly for the undirected functionalization of strong and less electron-rich primary C-H bonds 6. Extensive efforts have been fueled by the prospect of rapidly valorizing this cheap and readily available organic feedstock, and a few pioneering results have been achieved in the activation of primary C-H bonds of alkanes, most notably using homogeneous transition-metal-based systems (Fig. 1b) 7. Hartwig pioneered the first Rh-catalyzed regioselective terminal borylation with bis(pinacolato)diboron using alkanes in large excess 8 and, more recently, the Ir-catalyzed activation of primary C-H bonds using linear alkanes as the limiting reagent 9. An impressive regio- and enantioselective Rh-catalyzed carbene insertion into primary C-H bonds has recently been reported by Davies (Fig. 1b) 10, whereas Nelson developed the intermolecular non-selective C-H insertion reaction of vinyl cations 11,12. A recent and very elegant report by Aggarwal on photoinduced electron transfer opened new doors to the C(sp3)-H borylation of alkanes, although the products were obtained as a mixture of isomers in moderate yields 13.
Among indirect approaches, Martin and Baudoin have independently reported the unselective bromination of linear alkanes followed by selective catalytic remote terminal carboxylation 14 and arylation 15 reactions, respectively, whereas Huang converted alkanes into primary alkylsilanes through an initial dehydrogenation reaction using the starting alkanes as solvent 16. Although these selected examples highlight remarkable accomplishments in selective alkane functionalization, an inherent limitation of these strategies is that each transformation leads to a single type of functionalization and requires, with the exception of the recent Ir-based strategy 9, the alkane as solvent or in very large excess. A general, high-yielding and fully selective approach to the overall activation of primary C-H bonds, allowing a large number of different terminal functionalizations (FG = I, Br, Cl, SiR3, Bpin, OH, NR2, CN, sp3 or sp2 C-C bond, etc.) exclusively at the terminal position of an alkane present in a stoichiometric amount, remains completely elusive. We set out to develop this missing link by proposing an alternative approach that would first transform a stoichiometric amount of linear alkane into an alkene, which would subsequently be engaged in a remote functionalization 17,18 through a metal-catalyzed hydrometalation/migration sequence leading to a primary organometallic species. The latter would then react with a large variety of electrophiles, providing general access to the expected functionalized alkanes exclusively at the primary position (Fig. 1c). The activation of hydrocarbons through "dehydrogenation" is, however, also a rather difficult transformation, and heterogeneous 19 or homogeneous-catalyzed reactions 20 require high temperatures, inevitably leading to potential side reactions.
In light of these inherent flaws of existing synthetic reagents for "dehydrogenation" reactions of alkanes 21, we turned our attention towards biocatalytic desaturase systems 22, acknowledging the ability of enzymes to accurately distinguish even closely structurally related organic compounds and the potential of biocatalysis for important applications in sustainable processes 23. The desaturation of non-activated C-C bonds has been mainly restricted to saturated fatty acids with the help of fatty acid desaturases (FADs). Most FADs do not act on the free fatty acid directly but rather on the corresponding thioester through an acyl carrier protein, practically limiting the applications to carboxylic acids 24,25. The selective and efficient desaturation of linear, saturated, unfunctionalized hydrocarbons into their corresponding unsaturated products was missing, as alkenes were never directly assessed as catabolic intermediates in alkane biodegradation. Crucially, a mutant strain of Rhodococcus (Rhodococcus KSM-B-3M) [26][27][28], obtained through random mutagenesis of the wild-type strain Rhodococcus KSM-B-3, could hyperproduce internal alkenes from linear alkanes, albeit in rather low yields. We were therefore interested in developing this original low-yielding transformation into a general approach and in evaluating whether synthetically useful yields of alkenes could be obtained. Following a thorough study of the reaction carried out by the mutant (see Table S1 for all details, Supplementary Materials), a remarkable monodehydrogenation of alkanes could be achieved by Rhodococcus KSM-B-3M (1 g wet cells) cultured in the presence of phosphate buffer (K2HPO4/KH2PO4, 0.8 M, pH = 6.4), monosodium glutamate (1.67 equiv.), thiamine•HCl (3 mol%) and magnesium sulfate (2.8 mol%) under ambient conditions (30 °C, aqueous solution, atmospheric pressure).
Using this protocol, we were delighted to observe the transformation of hexadecane (1) into two major cis-hexadecene regioisomers (2, 80% cis-7-hexadecene + 20% cis-8-hexadecene) in a remarkable 61% isolated yield based on the original alkane, with an 83% dehydrogenation efficiency (alkenes/alkane) upon aerobic incubation (Figure 2a). Several control experiments revealed the importance of the presence and concentration of all the reactants in solution to reach maximal olefin productivity (see Supplementary Materials). The biological material could also easily be recovered and reused in further batches for 4 consecutive runs, further demonstrating the resilience of this strain under the set conditions. The fermentation could be scaled up to multigram-scale synthesis, demonstrating the practicality of this procedure. The same protocol allowed the specific desaturation of linear C14-C20 alkanes (Figure 2b, 3-7), reaching optimal conversion ratios for n-hexadecane and n-heptadecane. In all cases, the olefins produced were exclusively of (Z)-stereochemistry. Importantly, this microorganism also mediated the desaturation of aliphatic xenobiotics such as terminally functionalized aliphatic derivatives (Figure 2c; for the entire scope, see Supplementary Materials). Interestingly, the olefins formed accumulated over time, indicating that they were not further metabolized by the strain, and linear α-olefins such as 1-hexadecene or 1-octadecene could indeed undergo the biocatalytic desaturation to provide the corresponding unconjugated dienes in decent isolated yields (8, 9). The reaction similarly proceeded with n-hexadecyl methyl ether, or with chloro- and fluoro-hexadecane and -octadecane, affording α-functionalized Z-olefins (10-14) with somewhat variable yields and dehydrogenation ratios.
These last examples emphasize the opportunity of chemoselectively desaturating aliphatic chains, even in the presence of potentially sensitive functional groups (8-14). This strongly contrasts with the relatively low functional group tolerance of synthetic reagents or catalysts capable of activating C-H bonds in alkanes.
To shed some light on the biochemical pathways underlying the bacteria's ability to convert alkanes into alkenes, we first compared KSM B-3M with closely related Rhodococcus strains. Phylogenetic analysis revealed that Rhodococcus strain sp. 008 was the closest known genetically available strain to KSM B-3M (Fig. 3a). In sharp contrast to KSM B-3M, when strain sp. 008 was grown on hexadecane as a carbon source, it doubled its biomass in a 21-day experiment while consuming the alkane without any formation of alkenes (Fig. S1). These results suggest that sp. 008 uses hexadecane more efficiently than KSM B-3M through different metabolic pathways, and that the mutation of strain KSM B-3M potentially blocked subsequent alkane metabolism pathways 27. To further identify the genes responsible for this transformation, we performed a transcriptomic comparison between KSM B-3M and sp. 008, each grown independently on hexadecane and on dodecane as carbon sources. Principal Component Analysis (PCA) of the transcriptome revealed that strain differences explain 97% of the variability in gene expression (PC1, Figure 3b). Only 1% of the variation is attributed to the carbon source, a difference observed in KSM B-3M but not in sp. 008 (PC2, Figure 3b). This result indicates that the two strains follow distinct pathways for alkane metabolism. Differential gene expression analysis comparing the transcriptomes of the two strains on hexadecane revealed 1331 genes with at least a tenfold difference in expression (adjusted p-value < 0.05, FDR). Interestingly, 1283 of these genes were higher in sp. 008 while only 48 genes were higher in KSM B-3M, indicating a specific metabolic activity that defines the dehydrogenation ability of KSM B-3M. Strikingly, the most highly expressed gene in KSM B-3M, 250 times higher than in sp. 008, was found to be acyl-CoA desaturase (Fig. 3c).
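The selection criterion quoted above (at least a tenfold difference, adjusted p-value < 0.05) can be illustrated with a minimal Python sketch. It is not the pipeline used in this work: the Benjamini-Hochberg procedure is assumed here as one common way of controlling the FDR, and the `Gene` record, the function names and the example values are all hypothetical. Expression values are assumed to be strictly positive (for instance after adding a pseudocount).

```python
from typing import NamedTuple


class Gene(NamedTuple):
    """Hypothetical record for one gene; all field names are illustrative."""
    name: str
    expr_a: float   # mean normalized expression in strain A (e.g. KSM B-3M)
    expr_b: float   # mean normalized expression in strain B (e.g. sp. 008)
    p_value: float  # raw p-value from some differential-expression test


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def split_by_direction(genes, min_fold=10.0, alpha=0.05):
    """Return (higher_in_a, higher_in_b) gene names passing both thresholds.

    Assumes strictly positive expression values in both strains.
    """
    adjusted = bh_adjust([g.p_value for g in genes])
    higher_a, higher_b = [], []
    for gene, q in zip(genes, adjusted):
        if q >= alpha:
            continue  # not significant after multiple-testing correction
        fold = gene.expr_a / gene.expr_b
        if fold >= min_fold:
            higher_a.append(gene.name)
        elif fold <= 1.0 / min_fold:
            higher_b.append(gene.name)
    return higher_a, higher_b


if __name__ == "__main__":
    # Invented toy values, for illustration only.
    toy = [
        Gene("desaturase_like", 2500.0, 10.0, 1e-6),
        Gene("beta_oxidation_like", 5.0, 500.0, 1e-4),
        Gene("housekeeping", 100.0, 90.0, 0.6),
        Gene("noisy", 300.0, 10.0, 0.5),
    ]
    up_a, up_b = split_by_direction(toy)
    print("higher in A:", up_a)
    print("higher in B:", up_b)
```

Splitting the significant genes by direction in this way yields the kind of two-sided count reported above (1283 genes higher in sp. 008 and 48 higher in KSM B-3M, 1331 in total).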
Moreover, two genes adjacent to acyl-CoA desaturase on the KSM B-3M genome, ferredoxin reductase and a gene encoding a hypothetical protein, were also much more highly expressed in KSM B-3M than in sp. 008, defining the region containing these three genes as a putative operon. Interestingly, acyl-CoA desaturase catalyzes the conversion of saturated fatty acids to unsaturated forms, and ferredoxin reductase is an electron donor required for its function 29. Quantitative PCR (q-PCR) targeting the three genes of the operon, namely acyl-CoA desaturase, ferredoxin reductase and the hypothetical protein, confirmed their high expression levels induced by hexadecane in KSM B-3M (Fig. 3d). Proteomic analysis showed similar results at the protein level. The proteomic data were further subjected to enrichment analysis, which revealed four metabolic pathways that are higher in sp. 008 compared to KSM B-3M. The beta-oxidation pathway for fatty acid degradation is the most enriched among these pathways (adjusted p-value = 2.6E-4), with ten genes higher in sp. 008 (Table S2, Fig. S2). Taken together, the transcriptomic and proteomic analyses of KSM B-3M indicate an overexpression of the acyl-CoA desaturase operon combined with a lack of subsequent beta-oxidation pathways, consistent with a regioselective dehydrogenation of alkanes without subsequent β-oxidation. Based on these findings, molecular docking was then performed in order to predict hexadecane binding to acyl-CoA desaturase. For this, we used the 3D structure of mouse Stearoyl-Coenzyme A Desaturase-1, the closest acyl-CoA desaturase homolog with a PDB structure 30,31. Despite a rather low 17% overall sequence identity, the binding pocket residues showed higher than 30% pairwise identity and 30% similarity between mouse and Rhodococcus, suggesting that both proteins share very similar desaturase functions (Table S3, Fig. S3) 32.
Based on our observation that the binding pocket is much more conserved than the overall fold, we generated a chimeric Rhodococcus-mouse binding site 3D structure in silico and proposed a model for the binding pocket that presents a highly hydrophobic lower half, with eight amino acids (two conserved Trp, four Val, one Leu and one Tyr) (Fig. 4a), and a more polar and charged upper half. Exploring the minimal distance between various hydrocarbon chains and the Cβ atom of Val 257, taken as a specific buried binding-pocket residue (Fig. 4a and 4b), showed that hexadecane sits at a short 7.0 Å distance, whereas the distance is longer for all other substrates that were not dehydrogenated under our reaction conditions.
With a practical and powerful strategy in hand to access mono-unsaturated hydrocarbon derivatives from stoichiometric amounts of alkane using strain KSM B-3M, we proceeded to examine the next step of our approach, namely the remote and selective terminal functionalization reaction 33. This strategy capitalizes on the remarkable ability of transition-metal complexes to promote rapid and long-range migration along linear hydrocarbon chains under specific conditions 34. This equilibrium is usually shifted towards the predominant formation of the anti-Markovnikov terminal organometallic intermediate, enabling eventual terminal derivatization either through reductive elimination or through subsequent electrophilic functionalization.
If successful, an inherent advantage of our strategy is that the chemical outcome is independent of the position of the initial double bond, thus serving as a regioconvergent remote activation of terminal C-H bonds. To this end, we identified several transition-metal-based catalysts and reagents capable of effecting such long-range functionalization on our biodesaturated substrates (see Figure 5). As a proof of concept, we successfully engaged the mixture of hexadecenes produced from n-hexadecane through the biocatalytic desaturation process (Figure 5a, green arrow) in various metal-mediated remote hydrofunctionalization reactions (Figure 5a, red arrow), thoroughly optimized to reach excellent site-selectivities (α > 98:2 in all cases). For instance, the terminal alkylborane 15 could be selectively synthesized using pinacolborane and a catalytic amount of a Co-based pincer complex at room temperature 35. Furthermore, trichloroalkylsilanes, which are essential components in polymer and materials science, could be accessed from 2 and trichlorosilane using 5 mol% of chloroplatinic acid (H2PtCl6·6H2O) as catalyst 36. In a one-pot operation, subsequent treatment of the resulting trichlorohexadecylsilane with different Grignard reagents (i.e. MeMgBr and vinylMgBr) provided either trimethyl- or trivinyl-hexadecylsilane 16 and 17 in good overall yields. Hydrozirconation reactions have the advantage of enabling an even broader palette of electrophilic functionalizations 37.
We found that treating 2 with a mixture of Cp2ZrCl2 and Red-Al 38 (22 and 23). Alternatively, addition of TMSCN and I2 provided n-heptadecanonitrile 24 in 75% yield. Importantly, C-C bond formation could also be achieved in the presence of a catalytic amount of Cu(I) salts to provide either the allenylated or allylated products 25-27, respectively. Regarding other sp3-sp3 cross-coupling reactions, we implemented the nickel-based catalytic system 40 to couple the bioproduced hexadecenes with an alkyl iodide to exclusively provide linear product 28.
Figure 5. Illustration of the large array of derivatives accessible from different remote hydrofunctionalization procedures applied to: a, the mixture of cis-hexadecenes 2, bioproduced from n-hexadecane, and b, 1-methoxyhexadecenes, bioproduced from 1-methoxyhexadecane. Yields were evaluated either by GC with an internal standard (in parentheses) or after purification by column chromatography on silica gel. All site-selectivities in 15-28 were above 98:2 (evaluated by NMR).
Finally, our approach could also target directionally selective and remote C-O bond activation, as illustrated with a mixture of 1-methoxyhexadecenes, bioproduced from 1-methoxyhexadecane (Figure 5b). In this case, the ruthenium-catalyzed olefin migration was thermodynamically guided by the formation of the corresponding enol ether, which was then cross-coupled with PhMgBr in the presence of a catalytic amount of [NiCl2(PPh3)2], yielding β-pentadecylstyrene 29 in a one-pot sequence 41,42.
In summary, we herein report an efficient, unified and general two-step strategy to selectively functionalize terminal C-H bonds of linear alkanes, yielding a large variety of functionalized alkyl derivatives (FG = I, Br, Cl, SiR3, Bpin, OH, NR2, CN, sp3 or sp2 C-C bond, etc.) while avoiding the use of linear α-olefins. The originality of this approach stems from the combined use of bacterial alkane activation with metal-mediated remote hydrofunctionalization, which remarkably prevents side reactions and reaches synthetically useful yields of linear alkyl derivatives. We believe these findings will inspire the design and development of even more efficient (semi-)synthetic strategies targeting the highly desirable dehydrogenation or other selective functionalization of hydrocarbon chains.

Declarations
Technion Genome Center and The Smoler Proteomics Center at the Technion are acknowledged for whole-genome & transcriptome sequencing and for the proteomic analysis.
Author contributions: JB, AC, MC, YZ, G-MH, IM and IM planned, conducted and analyzed the chemical experiments. KB-R, JB, IK, NE, HH-V and YK planned, conducted and analyzed the biological experiments. FG, KB-R and GH conducted the genomic and proteomic analysis as well as docking experiments. IM conceived and directed the project and wrote the manuscript with contributions by JB, KB-R, AC, GH, FG and YK. All authors contributed to discussions. Correspondence and requests for materials should be addressed to IM.
Competing interests: JB, KBR, IK, YK, MC and IM* declare competing financial interests.
Data and materials availability: All data are available in the main text or the supplementary materials. All data, code, and materials used in the analysis are available in some form to any researcher for purposes of reproducing or extending the analysis.
Figure 2. Scope of the Rhodococcus KSM-B-3M for regioselective desaturation of alkanes and other alkyl derivatives. a, The described protocol was optimized based on the conversion of n-hexadecane and recycling experiments. b, Scope of the range of desaturated linear alkanes. c, Scope of other desaturated xenobiotics using similar fermentation conditions. Reported yields were determined from the mass of organic material recovered after purification. Desaturation ratios were determined by GC and 1H NMR. rpm = rotations per minute.
Figure 4. a, Chimeric Rhodococcus-mouse binding pocket used for the docking experiments, with docked hexadecane shown in light green. All residues that do not belong to the binding pocket are shown as a white ribbon. Rhodococcus hydrophobic and aromatic residues are colored tan, polar and charged residues light blue, Zn atoms yellow, hydrogens white and the ribbon backbone atoms (from mouse) transparent white. b, Best LeDock docked pose for hexadecane; hexadecane is colored light green.