15th International Conference on Computer and Knowledge Engineering
Characterizing Microsatellite Distribution Patterns Across Distinct Gene Categories in Human
Authors :
Elahe Mehrazin (1), Mahmoud Naghibzadeh (2), Sara Jamali (3)
1- Dept. of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
2- Dept. of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
3- Dept. of Medical Genetics, Hormozgan University of Medical Sciences, Bandar Abbas, Iran
Keywords :
Genome, Gene, Microsatellite, Coding Sequence, Protein-Coding genes, Simple Sequence Repeat (SSR), Distribution pattern, Long non-coding RNAs, protein
Abstract :
Microsatellites are genomic regions in which a short unit (typically 1–6 base pairs) is repeated in tandem; they are distributed throughout the genomes of many organisms, including humans. As their roles in the human genome have become better understood, including their involvement in diseases such as hereditary neurological disorders and certain colorectal tumors and their use as genetic markers in population studies and forensic science, numerous algorithms have been developed to identify these sequences across the genome. Many studies have used such algorithms to analyze statistically how microsatellites are distributed across genomic regions, but most have focused on exonic, intronic, intergenic, and coding regions. In this study, we used SQL and Power BI to investigate the distribution patterns of microsatellite sequences in the human genome, with a specific emphasis on different gene types and on coding regions. The results showed that the distribution pattern of microsatellites varies among gene types. Protein-coding genes contained the highest number of microsatellites, whereas small nucleolar RNA (snoRNA) and microRNA (miRNA) genes contained very few, and some gene types, such as small nuclear RNA (snRNA) and transfer RNA (tRNA), contained none. Protein-coding genes also had the highest frequency of microsatellite occurrences, including long repeats of more than 100 nucleotides. Distribution analysis in coding regions showed that, among all repeats, trinucleotide sequences such as ACG, CAG, and GAG were the most frequent in known messenger RNA (mRNA), contributing to the formation of repeat-rich polypeptides composed of threonine, glutamine, and glutamic acid. We also provide a list of the protein-coding genes with the highest number of protein products encoded from microsatellite-containing regions.
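The definition used in the abstract (a 1–6 base pair unit repeated in tandem) and the codon-to-residue link it mentions can be illustrated with a minimal Python sketch. This is not the authors' SQL and Power BI pipeline or any published detection algorithm; the function names, the `min_repeats` threshold, the restriction to perfect repeats, and the assumption that the repeat unit is in the reading frame are all hypothetical choices made for illustration.

```python
import re

# Standard genetic code entries for the three codons named in the abstract.
CODON_TO_AMINO_ACID = {"ACG": "Thr", "CAG": "Gln", "GAG": "Glu"}


def find_microsatellites(sequence, min_repeats=4):
    """Return (start, unit, copies) for perfect tandem repeats of 1-6 bp units.

    min_repeats is an illustrative threshold, not a value from the paper.
    """
    # The lazy quantifier makes the regex try the shortest unit first at each
    # start position; the backreference \1 then requires further tandem copies.
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_repeats - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in pattern.finditer(sequence.upper())
    ]


def repeat_peptide(unit, copies):
    """Translate a trinucleotide repeat, assuming the unit is read in frame.

    Returns None for units outside the small illustrative table above.
    """
    amino_acid = CODON_TO_AMINO_ACID.get(unit)
    if amino_acid is None:
        return None
    return "-".join([amino_acid] * copies)
```

For example, `find_microsatellites("TTGA" + "CAG" * 6 + "TCCA")` returns `[(4, "CAG", 6)]`, and `repeat_peptide("CAG", 6)` gives a run of six glutamine residues. Real microsatellite finders also handle imperfect repeats and per-unit-length thresholds, which this sketch ignores.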