SARS-CoV-2 infected host cell proteomics reveal potential therapy targets
A novel coronavirus was recently discovered and termed SARS-CoV-2. Human infection can cause coronavirus disease 2019 (COVID-19), for which, at this point, over 80,000 cases resulting in over 2,500 deaths have been reported in over 40 countries. SARS-CoV-2 shows some similarities to other coronaviruses. However, treatment options and a cellular understanding of SARS-CoV-2 infection are lacking. Here we identify the host cell pathways modulated by SARS-CoV-2 infection and reveal that drugs targeting these pathways prevent viral replication in human cells. We established a human cell culture model for infection with a SARS-CoV-2 clinical isolate. Employing this system, we determined the SARS-CoV-2 infection profile by translatome and proteome proteomics at different times after infection.
These analyses revealed that SARS-CoV-2 reshapes central cellular pathways, such as translation, splicing, carbon metabolism and nucleic acid metabolism. Small molecule inhibitors targeting these pathways were tested in cellular infection assays and prevented viral replication. Our results reveal the cellular infection profile of SARS-CoV-2 and led to the identification of drugs inhibiting viral replication. We anticipate our results to guide efforts to develop therapy options for COVID-19.
Authors Denisa Bojkova, Kevin Klann, and Benjamin Koch contributed equally to this work
Data associated with the preprint has been made available on the authors' website.
What is the reason for using a colon cancer cell line?
Thanks for sharing the data ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2020/03/PXD017710/ ! I am interested in the protein P0DTC2, which shows a rising trend in your dashboard http://corona.papers.biochem2.com/ . However, your results seem to have a batch effect (please check figures 3 and 4 of my report at https://docs.google.com/document/d/e/2PACX-1vRGri8C0cRp5uJ6Kg5vqBs9ziuMxjP_VvDgsYPzsqjZEtsy7cT3xE5EiXnvin5onN7bpWnM2VN7POFN/pub ). Is my understanding wrong, or do you recommend controlling for the first two principal components, which is where I seem to see it?
Did not see the 2006 article on SARS-CoV ( Antiviral Therapy 11(8) 10210-1030 ) mentioned in the references?
Hi, When I was reading through the method section, I came across the sentence "All multiplexes were mixed with a bridge channel, that consists of control sample labeled in one reaction and split to all multiplexes into equimolar amounts." Can someone please elaborate on the bridge channel? Why is it used here and how does it work? It is quite confusing. Thank you, Chinmaya
Hi, Very nice work! Thanks! It is not clear to me why you chose the colon epithelial carcinoma cell line Caco-2 instead of a lung cancer cell line or endothelial cells, since the first target and most severely damaged organ is the lung. Is it because of higher ACE2 expression in the gut? Or because you are aiming to prevent systemic damage even if patients survive the acute complications? I am wondering whether the lung cell infection profile would differ from that of colon cells. Best, Lan
Dear authors, could you please provide a table that helps to demultiplex the proteomics data? I could not find this information in the M&M section or ProteomeXchange. More precisely, which samples were labeled using which TMT11 reagent? Which channel was used as a common reference (golden lane)? In addition, please indicate which raw file corresponds to which high pH RP fraction. Is F1 the most hydrophilic fraction? Thanks in advance, Tobias Kockmann
I just saw that such a table exists on your homepage: https://biochem2.com/images/IBCII-Images/Research_Groups_Images/Projects/PQC/TMT_scheme.xlsx What are the light and heavy standard samples?
Hi Tobias, we used a pulse SILAC approach to label nascent proteins. The light and heavy standards are used in the data analysis to 1) determine the isolation interference and 2) boost the signal of heavy peptides in the survey scan, since we only labelled for 2 hours. Best, Kevin
Hi Kevin, I meanwhile looked at your MoCell paper and already suspected that the two channels are the booster and the noise channel. Unfortunately, that only answers what these channels are used for, not which material was labeled. Does the booster channel contain fully labeled virus proteins? Or host? Or a mixture? This is not clear from the manuscript. Was identical booster/noise material used throughout the three 10plex "experiments"? If yes, why is an additional bridge channel needed? An additional question: Why are these comments flagged?
First of all, we don't have anything to do with flagging the comments. I don't have any clue either. Next, we used HeLa digests. This works perfectly fine and does not interfere with the other samples. We cannot use the heavy and light channels as a bridge, since they only contain one of the SILAC labels but not both. All the real samples contain a mixture of both. Therefore we have to use a sample that was treated with the same experimental setup to avoid large normalization bias. Best
P.S. F1 is the most hydrophilic fraction. For the bridge channel we used one of the controls and split one labelling reaction across all multiplexes.
Hi Kevin, could you please elaborate what exactly your control sample is? Uninfected host cell extract? Light/Heavy? Best, Tobi
It's a simple HeLa digest, once light and once heavy. Works fine if combined with most cell lines.
Your booster channel is heavy HeLa digest and your noise channel is light HeLa digest? So no virus peptides were boosted? How comparable is the Caco-2 peptidome to the HeLa peptidome?
Are you interested in a peptide internal standard i.e. a heavy labelled tryptic peptide?
Very interesting study. However, I have one query. Would you please elaborate on why you chose to study the Caco-2 cell line, which is a colon cancer cell line, although a more obvious choice would be a lung-related cell line?
Hi. Great study. We are looking to replicate the functional enrichment. Can you tell us which R package or software was used to compute tables S3 and S4?
More specifically, what background list (=universe) was used for the hypergeometric test? The measured 6000-7000 proteins or all proteins? It would be helpful to include the background size in the tables.
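To illustrate why the background question above matters, here is a minimal sketch of a one-sided hypergeometric enrichment test evaluated against two different universes. It is not the authors' pipeline, which the preprint page does not specify; the function name and every count below are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, pathway_in_universe, n_hits, pathway_hits):
    """One-sided hypergeometric p-value: P(overlap >= pathway_hits).

    universe_size       -- number of proteins in the background list
    pathway_in_universe -- pathway members present in that background
    n_hits              -- number of regulated proteins (the "draws")
    pathway_hits        -- regulated proteins that belong to the pathway
    """
    max_overlap = min(n_hits, pathway_in_universe)
    total = comb(universe_size, n_hits)
    tail = sum(
        comb(pathway_in_universe, k)
        * comb(universe_size - pathway_in_universe, n_hits - k)
        for k in range(pathway_hits, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: the same 300 regulated proteins and 15 pathway hits,
# tested once against the measured proteins and once against a whole proteome.
p_measured = hypergeom_enrichment_p(6500, 100, 300, 15)
p_proteome = hypergeom_enrichment_p(20000, 120, 300, 15)

print(f"measured-protein background: p = {p_measured:.3g}")
print(f"whole-proteome background:   p = {p_proteome:.3g}")
```

With these made-up counts the whole-proteome background yields the smaller p-value, because the expected overlap by chance is lower. That is why reporting the background size alongside the enrichment tables makes the analysis reproducible.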
Based on data from the Human Protein Atlas, Caco-2 cells do not seem to express ACE2. https://www.proteinatlas.org/ENSG00000130234-ACE2/cell Did you verify ACE2 expression in your cell line?
I found the ACE2 mRNA expression of CaCo2 in the CCLE database
Dear authors, could you please provide the table behind figure 1C for COVID-19-related research? Thanks in advance, Mirrersan
Dear authors, Great work! May I ask you which log2FC (infected vs. uninfected) cut-off you used for the enrichment analyses? Thank you! Julia
Dear authors, Can you please tell me why TMT6-plex modifications were used when searching with Sequest? Thank you, Hari
Why have my comments (and actually from many others...) been moderated??!
Dear authors, In another preprint (https://www.biorxiv.org/content/10.1101/2020.03.25.008482v1.full, Fig. 2), emetine was reported not to show specific SARS-CoV-2 inhibition more potent than non-specific cytotoxicity in Vero E6 cells (CC50<0.39 uM). These results differ from those from your CaCo2 model. Do you also find that emetine and DG show specific inhibition/high SI in a Vero E6 model? Other discussion of reconciliation/differences?
I do not understand why my comment was moderated. Emetine is being reported by Weston, Frieman and colleagues (biorxiv) not to inhibit SARS-CoV-2 in Vero E6 more potently than it induces non-specific cytotoxicity, which differs from what was seen here in Caco-2. Relevant differences between Caco2 and Vero E6 lines have been reported for SARS-1 (cf. PMID: 16487305, 15731278). Under your conditions, do you observe lytic or non-lytic replication of SARS-CoV-2 in Caco-2, and if lytic, by what day? Have you compared the manner or rate of replication in Caco-2 with that in Vero/Vero E6/VeroE6-TMPRSS2 lines? Thank you.
The authors also may wish to integrate the findings of Choy, Yen and colleagues who, after this submission, published their findings of emetine in Vero E6 cells. They observed non-specific loss of viability at sub EC50 concentrations for emetine on infected cells, but more pronounced on infected cells at supra-EC50 concentrations (Fig. 1D, https://www.ncbi.nlm.nih.gov/pubmed/32251767).
Please put out more stories dealing with the covid problem. Thank You!
Thank you for the excellent paper! I have a question about the protein/gene identifiers in the expression and translation rate matrices. How should rows containing two or more proteins (e.g. P21439;P08183) be interpreted? Are these proteins associated? How does the expression of these protein doubles (triples, etc.) compare to the expression of each protein separately? Best, Thomaz Luscher
Reference 13 is from Molecular Cell, NOT Cell.
Nice work. Please describe the reduction/alkylation step involved in lysis/digestion prior to LC-MS/MS. Although the description is missing from the methods, your search parameters describe carbamidomethylation of Cys, which is consistent with the use of iodoacetamide. I reprocessed the plex1 data using the RAW files from PRIDE. While cysteine-containing peptides are readily recovered for host proteins (~10% of peptide spectrum matches), very few Cys-containing viral peptides were recovered. Thus peptide coverage of the Cys-rich spike protein is only 20%. So I'm wondering whether the reduction conditions are not strong enough to disrupt the disulfide bonds in the spike protein. I suspect this would probably NOT substantially distort your quantitation of the spike protein; I'm simply interested in coverage for other reasons. Thanks, --Karl Clauser
Dear Authors, I think your research provides a valuable approach for identifying potential targets for COVID-19 therapy. The approach is quite straightforward and valid. From a rational point of view, inhibition of glycolysis also makes sense, as glycolysis can serve as a source of nucleotides necessary for SARS-CoV-2 replication. However, I would like to stress major limitations regarding the inhibition of glycolysis via 2-DG. Firstly, your data indicate an IC50 of 9.09 mM for cytotoxicity (Fig. S3d); however, the IC50 for inhibition of viral replication is also 9.09 mM. Therefore, I think your data show that 2-DG does not inhibit SARS-CoV-2 replication, but rather leads to enhanced apoptosis/necrosis in CaCo-2 cells, resulting in lowered viral replication. Secondly, why did you choose CaCo-2 cells as a model? It is widely appreciated that tumor cell lines do not adequately reflect the in vivo situation. In my opinion, primary epithelial cells (e.g. HBEpCs or other epithelial cells, as the lung is the primary target organ) would be better suited. Thirdly, the metabolic activity of glycolysis is cell-type dependent. As a consequence, toxicity will also differ strongly. For example, 2-DG was shown to modulate myeloid differentiation of HSCs at 1 mM (see: https://www.sciencedirect.com/science/article/pii/S1934590914002501?via%3Dihub). Anti-inflammatory M2 macrophage polarization is also strongly lowered at 1 mM 2-DG, which may drastically increase the resulting cytokine storm syndrome in COVID-19 (see: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5451502/). Additionally, as little as 0.6 mM 2-DG induces apoptosis in HUVECs and HMVEC-L (see: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2965179/). Thus, severe adverse events have to be expected if 2-DG were used for therapy. Therefore, I think that inhibition of glycolysis, at least, is not a suitable therapeutic target, which should also be clearly pointed out in the manuscript.
Best, Lars Kaiser
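The first point in the comment above can be expressed as a selectivity index (SI), conventionally the ratio of the half-maximal cytotoxic concentration to the half-maximal antiviral concentration. The sketch below uses only the two values quoted in the comment; they are the commenter's reading of Fig. S3d, not values verified here, and the function name is hypothetical.

```python
def selectivity_index(cytotoxic_conc_50, inhibitory_conc_50):
    """Ratio of the 50% cytotoxic concentration to the 50% antiviral concentration.

    Both arguments must use the same unit. Values near 1 mean the compound
    harms the host cells at about the same dose at which it reduces viral
    replication, so no selective antiviral window is demonstrated.
    """
    if cytotoxic_conc_50 <= 0 or inhibitory_conc_50 <= 0:
        raise ValueError("concentrations must be positive")
    return cytotoxic_conc_50 / inhibitory_conc_50


# Values quoted in the comment above for 2-DG in CaCo-2 cells, in mM.
si_2dg = selectivity_index(9.09, 9.09)
print(f"2-DG selectivity index: {si_2dg:.2f}")
```

An SI of 1 is the quantitative form of the commenter's concern: on these numbers, reduced viral replication cannot be separated from reduced host cell viability.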
Christian Münch
replied on 20 March, 2020
Dear Chinmaya, Thanks for your comment. A bridge channel is an approach commonly used when having several multiplexes (here three). The so-called bridge channel is one identical sample used across all multiplexes to allow normalization across the multiplexes and to determine technical variation. Here, we used one of the controls as bridge channel across all the multiplexes of biological replicates. Best wishes, Christian
Chinmaya
replied on 23 March, 2020
Dear Christian, Thank you for the information. How are these bridge channel abundance values used in the current study? Are they used for any quantitative data analysis? Thank you, Chinmaya
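One common way to use a bridge channel, sketched here under stated assumptions, is to express every sample channel as a ratio to the bridge channel of its own multiplex, so that values from different multiplexes land on a shared scale. This is a generic illustration and not necessarily the authors' analysis; the function name, channel names and intensities are invented, and P0DTC2 is used only because it is mentioned in the comments above.

```python
def normalize_to_bridge(plexes, bridge="bridge"):
    """Divide each channel by the bridge channel of the same multiplex.

    plexes maps plex name -> protein -> {channel name: reporter intensity}.
    Proteins without a positive bridge signal are skipped, because the
    ratio is undefined for them.
    """
    normalized = {}
    for plex_name, proteins in plexes.items():
        normalized[plex_name] = {}
        for protein, channels in proteins.items():
            reference = channels[bridge]
            if reference <= 0:
                continue
            normalized[plex_name][protein] = {
                channel: intensity / reference
                for channel, intensity in channels.items()
                if channel != bridge
            }
    return normalized


# Made-up intensities: plex2 was measured at twice the overall signal of
# plex1, an offset that affects the bridge channel to the same degree.
plexes = {
    "plex1": {"P0DTC2": {"bridge": 100.0, "mock_2h": 110.0, "infected_24h": 400.0}},
    "plex2": {"P0DTC2": {"bridge": 200.0, "mock_2h": 220.0, "infected_24h": 800.0}},
}

normalized = normalize_to_bridge(plexes)
for plex_name, proteins in normalized.items():
    print(plex_name, proteins)
```

Because the identical bridge sample sits in every multiplex, a plex-wide intensity offset cancels in the ratio, and the spread of bridge-normalized values between multiplexes gives an estimate of technical variation.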