Technique permits genome sequencing of novel coronavirus with 25-fold increase in resolution
September 15, 2021
By Karina Toledo | Agência FAPESP – For the first time in Brazil, scientists at the Federal University of São Paulo (UNIFESP) have succeeded in directly sequencing the RNA of SARS-CoV-2, the virus that causes COVID-19. The study was supported by FAPESP. An article reporting its results has been posted to the preprint platform bioRxiv but has not yet been peer-reviewed.
According to the authors, the technique described in the article can be used to map the viral genome with a resolution 25 times higher than that of conventional sequencing methods, affording a more precise picture of the pathogen’s biology and how its genome is evolving.
“It’s highly promising because it helps us understand, for example, why some strains are more virulent or more capable of escaping our immune system,” Marcelo Briones, principal investigator for the study, told Agência FAPESP. Briones is a professor at UNIFESP and a researcher affiliated with the Center for Medical Bioinformatics at its Medical School (EPM).
As Briones explained, SARS-CoV-2 is a single-stranded RNA virus. Its genetic material consists of a single strand of nucleotides, whose bases are guanine, adenine, cytosine, and uracil. In the conventional sequencing technique, which relies on RT-PCR (reverse transcription polymerase chain reaction), the RNA is first converted to complementary DNA (cDNA). DNA molecules comprise two strands of nucleotides, so a complementary copy of the viral RNA strand is synthesized. The cDNA molecules are then amplified (generating billions of copies) and sequenced. The advantages of the strategy include speed and the possibility of sequencing samples with very little genetic material.
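The conversion of RNA to a complementary DNA strand can be illustrated with a minimal Python sketch. It is only a toy model of base pairing, not the laboratory protocol or the authors' software; the function name `reverse_transcribe` and the example sequence are hypothetical.

```python
# Watson-Crick pairing from RNA bases to the DNA bases of the complementary strand.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna: str) -> str:
    """Return the cDNA strand complementary to `rna`, written 5'->3'."""
    rna = rna.upper()
    unknown = set(rna) - set(RNA_TO_DNA_COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected RNA bases: {sorted(unknown)}")
    # The complementary strand is antiparallel, so reverse it to read it 5'->3'.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna))


# Hypothetical toy sequence, not taken from the viral genome.
print(reverse_transcribe("AUGGCU"))  # AGCCAT
```

In this sketch a methylated adenine and an unmethylated one would both map to the same T, so any chemical modification carried by the original RNA is absent from the copy.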
“Conventional sequencing of this virus is like trying to identify someone by gazing at their shadow,” Briones said. “The method we used lets us look directly at the viral RNA as it exists in the real world. It’s far more accurate.”
Carla Braconi, a professor at EPM-UNIFESP’s Department of Microbiology, Immunology and Parasitology and a co-author of the article, recalled that the study involved a viral strain isolated in 2020, one of the first lineages of SARS-CoV-2 isolated in Brazil.
“We were sent the viral isolate by José Luiz Proença-Módena [a professor at the University of Campinas] and cultured the pathogen in Vero cells [derived from African green monkeys and highly susceptible to SARS-CoV-2]. We then extracted the RNA and sequenced it using MinION, a portable technology developed by Oxford Nanopore Technologies,” Braconi said.
According to Briones, the RNA is sequenced exactly as it is when it comes out of the Vero cell, without undergoing RT-PCR or amplification. “We attach an adapter to one end of the molecule and a cDNA strand to ensure the RNA is fully stretched out. The RNA alone goes through the sequencer, base by base. Each base [cytosine, guanine, adenine or uracil], with modifications such as methylation, interrupts the electric current flowing through the device with a different pattern, enabling us to identify which is which,” he said.
The process produces a plot like an electroencephalogram (EEG), and this is interpreted using bioinformatics. The final sequence generated can then be compared with benchmarks.
“The initial impression is that the sequence obtained includes a great many errors, but actually they’re modified RNA bases. Some of these modifications aren’t detected by conventional sequencing,” Briones said.
The analysis, which was performed by postdoctoral fellow João Campos, focused on RNA methylation: it searched the viral genome, which has 29,903 bases, for bases to which a methyl group (CH3) had been added.
“This kind of biochemical modification of RNA is very important to the proper functioning of viruses like SARS-CoV-2, as well as some arboviruses [including dengue and Zika] classed in the Baltimore classification system as Group IV, comprising single-stranded RNA viruses with positive polarity,” Braconi explained.
RNA viruses have about 100 modified bases that are essential for their proper biological functioning, the authors write in the article. “After SARS-CoV-2 enters a cell and ‘makes’ the cell produce copies of its genetic material, an enzyme methylates them and these modifications then acquire a function. They’re part of the information the virus needs to survive. Unless the methylation pattern is analyzed, therefore, it’s impossible to understand in detail the real genetic makeup of SARS-CoV-2,” Briones said.
A modified base that is frequently found in the RNA of SARS-CoV-2 is N6-methyladenosine (m6A), which is involved in the evasion of the immune response. “This modification enables the virus to escape the system that activates interferons [proteins produced by defense cells to inhibit viral replication]. It’s therefore a potential target for drug development and is already being studied for this purpose,” Briones said.
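The title of the preprint refers to DRACH motifs, the five-base sequence context in which m6A is commonly reported (in IUPAC notation, D = A, G or U; R = A or G; H = A, C or U, with the central A being the candidate methylation site). As a rough illustration, and not the authors' pipeline, the Python sketch below lists DRACH matches in an RNA string. The function name `find_drach_sites` and the example sequence are hypothetical, and a motif match only marks a candidate position; it does not show that the base is actually methylated.

```python
import re

# IUPAC codes for RNA: D = A, G or U; R = A or G; H = A, C or U.
# The lookahead lets overlapping motifs be reported as well.
DRACH = re.compile(r"(?=([AGU][AG]AC[ACU]))")


def find_drach_sites(rna: str) -> list[tuple[int, str]]:
    """Return (1-based position of the central A, motif) for each DRACH match."""
    rna = rna.upper()
    return [(match.start() + 3, match.group(1)) for match in DRACH.finditer(rna)]


# Hypothetical toy sequence, not taken from the viral genome.
example = "CCGGACUUUAAACAGG"
print(find_drach_sites(example))  # [(5, 'GGACU'), (12, 'AAACA')]
```

Direct RNA sequencing adds what a scan like this cannot provide: evidence, from the electric current signal, of which candidate adenines carry the methyl group.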
In theory, if it were possible to develop a drug that totally blocked RNA methylation, the virus would be cleared from our cells and COVID-19 would disappear. “The trouble is that if we block methylation too much, the host cells will also die, as the enzymes that methylate viral RNA also methylate the RNA in our cells. A drug would have to be very precisely targeted,” he said.
The UNIFESP group was the first to perform direct sequencing of RNA from SARS-CoV-2 in conjunction with identification of m6A bases. The research was conducted under the aegis of the Thematic Project “Investigation of the host’s induced elements in response to immunization with ChAdOx1 nCOV-19 vaccine in a Phase III Clinical Trial”, for which the principal investigator is Luiz Mário Janini. Juliana Maricato and Fernando Antoneli also participated.
“Two studies published previously [by groups in other countries] only sequenced the RNA. A third also performed direct sequencing but identified the base 5mC. A fourth study also identified the base m6A, but used different techniques that don’t involve direct RNA sequencing,” Briones said.
In acquiring this detailed grasp of the viral genome’s functioning, he added, scientists can obtain a clearer view of how the pathogen is evolving. “To my mind, people who talk about mutations, which strictly speaking are the substitution of one base for another in the RNA sequence, are stuck in the DNA paradigm. That doesn’t make sense for this virus, which follows different logic. It never has DNA in its reproductive cycle, and so it’s absurd to speak of ‘transcripts’ of this virus. SARS-CoV-2 lives entirely in the RNA world,” said Briones. The “RNA world” is a hypothetical stage in the evolutionary history of life on Earth, in which self-replicating RNA molecules proliferated before the evolution of DNA and protein.
“The level of complexity of RNA molecules is extraordinary. The new technologies that make direct sequencing possible have opened up a new research universe. We’re catching this train from the word go. Plenty of development remains to be done, but this is the way forward,” Briones said.
The next step for the group will be to sequence the genomes of the recently identified variants of SARS-CoV-2 and see if there are significant differences in their methylation patterns.
The article “Direct RNA sequencing reveals SARS-CoV-2 m6A sites and possible differential DRACH motif methylation among variants” is at: www.biorxiv.org/content/10.1101/2021.08.24.457397v1.
Agência FAPESP licenses news reports under Creative Commons license CC-BY-NC-ND so that they can be republished free of charge and in a straightforward manner by other digital media or by print media. The name of the author or reporter (when applied) must be cited, as must the source (Agência FAPESP).