Andrey Tomarovsky

Software enthusiast in bioinformatics based in St. Petersburg, Russia

📧Email / 💬Telegram / 💬Facebook / 📜Google Scholar

👨🏻‍🎓 Education

September, 2021 - present	PhD student in Genetics. Novosibirsk State University, Novosibirsk, Russia. PhD thesis: “Obtaining genomic assemblies and phylogenetic analysis of members of the genus Martes (fam. Mustelidae)”.
September, 2019 - July, 2021	MS in Bioinformatics. Saint-Petersburg State University, St. Petersburg, Russia. MS thesis: “Assembly and annotation of the sable (Martes zibellina) and pine marten (Martes martes) genomes”.
September, 2015 - July, 2019	BS in Biotechnologies. Belgorod State National Research University, Belgorod, Russia.

🏆 Work experience

1) March, 2021 - present Research programmer at the Genomic Diversity Research Center, ITMO University. Conducts research on genomics of the genus Martes:

Quality control and filtering of sequencing data.
De novo assembly and quality control.
Genome annotation:
- Whole genome alignments (pairwise and multiple) and coverage statistics.
- Calculating the coordinates of the pseudoautosomal region (developed a custom algorithm).
- De novo assembly and annotation of genome repeats.
- Annotation of protein-coding and non-coding genes.
Calling and filtering of genetic variants.
Visualization and analysis of heterozygosity.
Phylogenetic and evolutionary analysis.
Comparison and analysis of the results obtained.
Pipeline development based on Bash and Snakemake.

2) July, 2021 - present Teaching and software support for Blastim courses:

Introduction to Linux for Bioinformatics.
Python and Linux for Bioinformatics and Biology.
Snakemake for Bioinformatics.
Analysis of NGS data.

3) December, 2020 - January, 2024 ResOps and system administration on the MSU FBB computing cluster.

Assistance in writing commands and pipelines for bioinformatic analyses.
Software support for cluster users.
Writing cluster documentation.

🛠 Skills

OS:	Linux, Windows
Shell:	Bash. Good knowledge of shell tools such as Awk, Grep, and Sed.
Programming:	Python
Python libraries:	Biopython, Matplotlib, NumPy, Pandas, Scikit-learn. Experience in writing scripts and data visualizations in Jupyter Notebook and as standalone Python packages:
- Parsing data from files or websites into Pandas dataframes.
- Calculating mean, median, minimum, and maximum values in datasets.
- Visualizing results with Matplotlib as plots, histograms, and Venn diagrams.
- Basic experience in ML (kNN, clustering, linear regression).
Statistics:	R
R libraries:	readxl, dplyr, car, cowplot, ggplot2. Experience analyzing various datasets, such as those containing information on different types of cancer and patient survival times:
- Linear and multiple regression.
- Description and significance testing of linear models.
- Comparison of linear models.
- Testing statistical hypotheses.
Workflow managers:	Snakemake. Experience in writing complex Snakemake pipelines, including benchmarking, logging, task grouping, and running on a compute cluster. Experience in collaborative development.
Workload managers:	Slurm, PBS. ResOps experience on the MSU FBB, ICG, IMCB, and ITMO computing clusters:
- Running large-scale computational tasks using Slurm, PBS, and Snakemake.
- Installing and working with Conda environments.
Others:
- SQL (creating databases, queries of simple and medium complexity).
- Circos (basic level, experience in visualizing mtDNA and its coverage).
- Tcl (basic level, experience in writing module files).

📌 On The Side

Snakemake pipelines:

BuscoClade. Pipeline for constructing species phylogenies using universal single-copy orthologs (BUSCOs).

ITSpipe. Pipeline for the analysis of ITS sequences from the ribosomal cluster. Performs coverage visualization with Matplotlib and variant calling with GATK, Pisces, and BCFtools.

varcaller. Pipeline for accurate calling of genetic variants. Includes visualization of coverage and calculation of PAR coordinates.

Others:

Biocrutch. A custom Python package for bioinformatics research. It contains scripts for genome and coverage statistics, repeat masking, determining the coordinates of the pseudoautosomal region, filtering 10x Genomics linked reads, combining PSMC data, and others.

Bashare. The repository contains custom Bash scripts and pipelines for data processing.

📝 Grants

Russian Foundation for Basic Research, grant № 20-04-00808 A, “Genomes and genetic diversity of mustelids (fam. Mustelidae) of Russia and South-Eastern Asia”.

📝 Articles

Kliver S, Houck ML, Perelman PL, Totikov A, Tomarovsky A, Dudchenko O, Omer AD, Colaric Z, Weisz D, Aiden EL, Chan S. Chromosome-length genome assembly and karyotype of the endangered black-footed ferret (Mustela nigripes). Journal of Heredity. 2023 May 30. DOI: 10.1093/jhered/esad035
Totikov A.A., Tomarovsky A.A., Yakupova A.R., Graphodatsky A.S., Kliver S.F. Overview of methods for reconstructing the demographic history of populations in conservation biology // Ecological Genetics. 2023. Vol. 21. No. 1. P. 85–102. DOI: 10.17816/ecogen120078
Yakupova, A.; Tomarovsky, A.; Totikov, A.; Beklemisheva, V.; Logacheva, M.; Perelman, P.; Komissarov, A.; Dobrynin, P.; Krasheninnikova, K.; Tamazian, G.; Serdyukova, N.; Rayko, M.; Bulyonkova, T.; Grachev, M.; Cherkasov, N.; Pylev, V.; Varnavsky, A.; Peterfeld, V.; Penin, A.; Balanovska, E.; Lapidus, A.; O’Brien, S.; Graphodatsky, A.; Koepfli, K.-P.; Kliver, S. Chromosome length genome assembly of the Baikal seal (Pusa sibirica) reveals fewer answers than new mysteries. Genes. 2023; 14(3):619. DOI: 10.3390/genes14030619
Derežanin, L.; Blažytė, A.; Dobrynin, P.; Duchêne, D.A.; Grau, J.H.; Hofreiter, M.; Jeon, S.; Kliver, S.; Koepfli, K.P.; Meneghini, D.; Preick, M.; Tomarovsky, A.; Totikov, A.; Fickel, J.; Förster, D.W. Multiple types of genomic variation contribute to adaptive traits in the mustelid subfamily Guloninae. Molecular Ecology 2022. DOI: 10.1111/mec.16443
Totikov, A.; Tomarovsky, A.; Prokopov, D.; Yakupova, A.; Bulyonkova, T.; Derezanin, L.; Rasskazov, D.; Wolfsberger, W.; Koepfli, K.P.; Oleksyk, T.K.; Kliver, S. Chromosome-Level Genome Assemblies Expand Capabilities for Conservation Biology. Genes. 2021; 12(9):1336. DOI: 10.3390/genes12091336

👨🏻‍💼 Conferences

Tomarovsky, A.; Totikov, A.; Beklemisheva, V.; Perelman, P.; Serdyukova, N.; Bulyonkova, T.; Koniaeva, K.; Abramov, A.; Graphodatsky, A.; Koepfli, K.; Powell, R.; Kliver, S. Assembly and annotation of the sable (Martes zibellina) and pine marten (Martes martes) genomes. ISBN: 978-5-901158-32-6.
Totikov, A.; Tomarovsky, A.; Perelman, P.; Serdyukova, N.; Beklemisheva, V.; Bulyonkova, T.; Zub, K.; Panov, V.; Mukhacheva, A.; Abramov, A.; Koepfli, K.; Graphodatsky, A.; Melo-Ferreira, J.; Kliver, S. Reconstruction of the demographic history for three populations of the least weasel Mustela nivalis. ISBN: 978-5-901158-32-6.

💬 Languages

Russian: Native
English: Pre-Intermediate