Combining genomic data and machine learning tools to reconstruct the phylogeny and evolutionary history of Ocotea and Nectandra (Lauraceae)

Date
March 2024 to April 2026
Countries
Members
Keywords
evolutionary biology
South America
Lauraceae
Phylogenetics
Research fields
Biology and Life Sciences

We are living through a fascinating and challenging turning point in biological research and knowledge. On the positive side, we are now able to study biological diversity using millions of DNA base pairs and to combine these data with collection-based information from online repositories in many inventive ways. In systematics and phylogenetics, genomic data have opened new avenues and given us a better understanding of the effects of incomplete lineage sorting and introgression. This wealth of data is accompanied by new analytical tools, such as the use of artificial intelligence to analyse molecular and/or morphological data and search for patterns in nature (e.g., Borowiec et al., 2022). All this progress in molecular data acquisition and data analysis is a direct consequence of scientific development. However, the same material development has also brought challenges for humanity and wildlife. We are losing natural environments at a rapid pace, with increasing numbers of endangered and extinct species and few permanent conservation areas close to or within terrestrial biodiversity hotspots.

Technology is a key factor in mitigating the effects of human economic development on natural environments. The era of big data is an opportunity to jointly evaluate and mitigate the effects of human activities using all the data available now or becoming available in the coming years. In the field of plant systematics and evolution, it is an opportunity to infer accurate phylogenetic histories and to delimit species more reliably. These improvements in knowledge will guide better decisions for the conservation and management of threatened species.

Based on these assumptions, we will combine genomic data and cutting-edge analytical tools to start a research line on the taxonomy, systematics, and phylogenetics of the Ocotea complex and of Lauraceae as a whole.

All Lauraceae, except the parasitic genus Cassytha Osbeck ex L., are woody plants, mainly tall, magnificent trees. The family has 2500–3500 species, distributed from the tropics to subtropical latitudes. Economically, the family is an important plant group, with many species yielding high-quality timber, spices (species of Cinnamomum Schaeff.), oil (Sassafras albidum J.Presl), and edible fruits (Persea americana Mill.). Phylogenetic relationships and species delimitation are very difficult to infer in this family, particularly in the genera Ocotea and Nectandra, which together comprise 450–500 species.

The goals of the project are to:

- develop a new target capture kit for sequencing single-copy nuclear genes

- infer the phylogenetic and biogeographic history of Ocotea and Nectandra at the species level, using cutting-edge machine learning and deep learning tools

- evaluate the implications for species conservation and extinction risk.