New tool can quickly reveal special parts of a genome

It has been more than 20 years since the entire human genome was mapped. This means that we are able to read the entire sequence of all the base pairs, or letters, in our DNA.

DNA is the cookbook of how we are put together. Genes tell us how everything in the body, like skin, muscles, and organs, should function and look.

But researchers have found that only one to two percent of our genome actually consists of genes that code for proteins. Much of the genome therefore does not directly describe how the body should be built.

That discovery was a bit strange, says Bastian Fromm, a researcher at UiT The Arctic University of Norway, whose laboratory is at the Arctic University Museum in Tromsø. Why do we have such a large amount of non-coding, or seemingly meaningless, DNA?

The other surprise, Fromm says, was that humans don’t necessarily have many more genes than other organisms, even though we consider ourselves quite complex.

Humans have 25,000 genes, while corn has 32,000, according to the Great Norwegian Encyclopedia.

What is what?

Mapping an organism’s entire genome has become easier. Scientists have now sequenced the genomes of more than 6,000 organisms.

It is also important to find out what the letters in a sequence mean: where the genes are, what they do and what other regions of the genome do. This is called annotation.

Non-coding DNA, the parts of the genome that aren’t used to make proteins, still has important functions.

The recipes for microRNAs are one of the things hidden in the mess of letters. These are small RNA molecules that help regulate which genes are active in a cell.

Mapping microRNA recipes in a genome can be time-consuming.

Fromm and Sinan Ugur Umu of the University of Oslo, as well as other researchers in Oslo, Tromsø and internationally, have created a tool that can discover recipes for microRNAs in genomes.

RNA

RNA molecules are found in cells. They have important tasks in protein production and gene regulation. RNA is made up of almost the same building blocks as DNA.

DNA has two strands joined like the rungs of a ladder, while RNA has only one strand and is like half a ladder.

Source: The Great Norwegian Encyclopedia

Trained on a database

The tool is called MirMachine and is a program based on machine learning.

The researchers present the tool, and the results of testing it on 100 mammalian genomes, in a paper in Cell Genomics.

They trained the machine learning tool on a database. The database contains genomes of 75 animal species whose microRNAs have been mapped in detail.

It took nearly a decade to build and fine-tune the database, says microRNA expert Fromm.

The machine learning tool has learned to recognize what a microRNA is. It can see patterns that would be difficult for humans to see.

To demonstrate the power of our tool, we’ve successfully used MirMachine on genomes from extinct organisms such as the mammoth, and on giant salamander and lungfish genomes, where genome annotation is particularly difficult, says Fromm.

Compared the machine with manual work

How can researchers know that the machine is responding correctly and not just making things up?

Of course, it’s hard to know 100%, says Fromm.

But when the algorithm was completed, we tested it on all 75 organisms that we have in the database. We told it: we know nothing about this organism, try to find the microRNAs.

Next, we compared what we had done manually over several years with what MirMachine did over a long weekend. It was almost impossible to find differences between them. We found some, but they were hard to see.
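The comparison Fromm describes, a manually curated list set against the tool's output for the same organism, can be illustrated with a small sketch. It is not the researchers' evaluation code: the function name and the family labels are hypothetical placeholders, and the numbers are made up rather than taken from the study.

```python
def compare_annotations(manual: set[str], predicted: set[str]) -> dict[str, float]:
    """Compare a manually curated set of microRNA genes with a tool's predictions."""
    shared = manual & predicted    # found by both
    missed = manual - predicted    # curated by hand, not found by the tool
    extra = predicted - manual     # reported by the tool, absent from the manual list
    precision = len(shared) / len(predicted) if predicted else 0.0
    recall = len(shared) / len(manual) if manual else 0.0
    return {
        "shared": len(shared),
        "missed": len(missed),
        "extra": len(extra),
        "precision": precision,
        "recall": recall,
    }


# Hypothetical labels for one organism (illustrative only, not real results).
manual = {"family_1", "family_2", "family_3", "family_4"}
predicted = {"family_1", "family_2", "family_3", "family_5"}

result = compare_annotations(manual, predicted)
print(result)
```

When the two lists are nearly identical, as Fromm reports, both precision and recall are close to 1 and the "missed" and "extra" counts are close to zero.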


Hopes it will be used

Fromm hopes many researchers will use the tool. He imagines it would be useful for researchers who are working to map the whole genome of new organisms.

Several such whole-genome sequencing projects exist today, he says. They include the Earth BioGenome Project and the Darwin Tree of Life.

Pål Sætrom is a professor at NTNU who studies the role of non-coding RNA in gene regulation and disease. He reviewed the new study.

I think the work is useful, because it solves some of the challenges associated with identifying and annotating which microRNA genes are found in various species that have available genomic sequences, Sætrom writes in an email.

Bastian Fromm is a microRNA expert.


Better alternative

The main limitations of the method are that it can currently only be used on animals and can only identify evolutionarily conserved microRNA genes, says Sætrom.

Evolutionarily conserved means that the genes have been preserved through evolution and can still be found in animals living today.

However, one example of microRNA genes that aren’t evolutionarily conserved, but can still be found in animals today, is microRNA genes that are only found in a specific species, Sætrom says.

This means that the method is suitable for automatically mapping and annotating known, evolutionarily conserved microRNA genes in recently sequenced animal species.

However, the method may not detect microRNA genes specific to a species or subset of species. In this case, finding those microRNA genes depends on other methods, such as sequencing and bioinformatic analysis of small RNAs, Sætrom says.

He adds that tools similar to the one created by the researchers already exist.

It’s not like this work is currently being done by hand, one microRNA gene at a time. But MirMachine is an improvement over these alternative tools, he says.

Study of the octopus

Fromm says one of the interesting things to investigate is whether the number of genes for microRNA correlates with an organism’s complexity and intelligence.

As mentioned, the number of protein-coding genes is not the deciding factor here.

Last year, Fromm and colleagues conducted a study on octopuses. As for the number of protein-coding genes and genome size, everything was as described above. But the octopus is a very intelligent animal.

So it came as a surprise to find that octopuses have far more microRNAs than birds, fish and reptiles, and almost as many as mammals and humans, Fromm says.

So perhaps microRNA is related to the development of intelligence.


It prevents the production of proteins

MicroRNA helps regulate genes and is the most recently discovered of the gene regulators. MicroRNA was discovered 30 years ago, Fromm says.

He explains how small RNA molecules help control which genes should be active in a cell.

Every cell in the body contains the entire DNA, the entire recipe book. But each cell only reads some of the recipes, so the eye cells won’t suddenly start making hair.

When a gene is read, a working copy in the form of RNA, called messenger RNA (mRNA), is created. The mRNA is carried to the ribosomes, where a protein is made from the recipe.

MicroRNA can put a stop to that.

It finds a complementary sequence on the mRNA and fits in, preventing a protein from being made, Fromm says.

This is a safeguard to stop proteins that absolutely shouldn’t be produced, Fromm says.
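The complementary matching Fromm describes can be sketched in a few lines of code. This is a deliberately simplified illustration, not how MirMachine or any real target-prediction tool works: it only looks for perfect pairing with the so-called seed region (commonly taken as bases 2 to 8 of the microRNA), and both sequences are made up for the example.

```python
# Watson-Crick pairing partners for RNA bases.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that would pair with `rna`, read in the opposite direction."""
    return "".join(PAIR[base] for base in reversed(rna))


def seed_sites(mirna: str, mrna: str) -> list[int]:
    """Find 0-based positions in `mrna` that pair perfectly with the microRNA seed."""
    seed = mirna[1:8]                  # bases 2-8 of the microRNA
    site = reverse_complement(seed)    # what a matching stretch of mRNA looks like
    hits = []
    pos = mrna.find(site)
    while pos != -1:
        hits.append(pos)
        pos = mrna.find(site, pos + 1)
    return hits


# Made-up sequences for illustration only.
toy_mirna = "AUGCCGUAAGCUUAGGCAUCGA"
toy_mrna = "GGAUACGGCAUUCCGAUACGGCAAA"

sites = seed_sites(toy_mirna, toy_mrna)
print(sites)  # [3, 16]
```

In a cell, binding also depends on factors this sketch ignores, such as pairing outside the seed and where on the mRNA the site lies.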

Relevant to cancer research

Fromm has previously studied microRNA in cancer cells and found that they have fewer microRNAs than normal cells.

That’s part of the explanation for why cancer cells can turn into something different from normal cells, he says.

MicroRNA is really important for maintaining a stable cell type that does its job and nothing else, Fromm says.

We have about 550 to 600 microRNA genes.

Reference:

Sinan Ugur Umu, Bastian Fromm et al.: Accurate microRNA annotation of animal genomes using trained covariance models of curated microRNA complements in MirMachine. Cell Genomics, 2023.

Read the Norwegian version of this article at forskning.no
