Materials and Methods
Sample selection
In this study, 100 Iranian subjects were selected from among the
patients referred to the genetics laboratory for investigation of the
etiology of various non-infectious genetic diseases.
The participants were genotyped and followed up for COVID-19
genetic risk factors for 2 years (2021-2023). Individuals under the age
of 18 were excluded. Information about SARS‐CoV‐2 infection was
collected from the 100 selected individuals via a questionnaire
administered to the patients [34].
Genotype analysis
Blood samples taken from the patients were treated with lysis buffer to
remove red blood cells (RBCs). Genomic DNA was then extracted from the
white blood cells (WBCs) by the salting-out method, and the extracted
DNA samples were stored at -20°C until analysis. To assess the purity of
the extracted DNA, the optical density (OD) of the samples was measured
with a NanoDrop spectrophotometer. The whole exome of the 100
participants in this project was sequenced on the Illumina HiSeq 2500
platform with an average coverage of 50X [35]. Informed written consent had been
acquired from all participants. The study was approved by the ethics
committee of the Kerman University of Medical Science. Briefly, after
read quality was checked using Phred-scaled quality control (QC) scores,
the raw reads were aligned to the human reference genome assembly
(GRCh38) using BWA, and multi-sample VCF files were generated with the
GATK tool. All ACE2, TMPRSS2,
TYK2, SLC6A20, and IFNAR2 variants were extracted for further analysis.
Variants were annotated using the dbSNP, ClinVar, Varsome, and Franklin
databases, which include population-oriented data on nucleotide and
amino acid sequence changes. For every variant, the highest population
minor allele frequency (MAF) was confirmed to be below 1%.
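
Two of the steps above can be sketched in a few lines of Python: the
Phred scale used for read quality, where a score Q corresponds to a
base-call error probability of 10^(-Q/10), and the check that the
highest population MAF of a variant is below 1%. This is only an
illustration, not the pipeline used in the study. The record layout, the
variant identifiers, the population labels, and the choice to treat a
missing frequency as "not observed" are all assumptions.

```python
import math


def phred_to_error_probability(q):
    """Convert a Phred quality score Q to an error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Convert an error probability P to a Phred score Q = -10 * log10(P)."""
    return -10 * math.log10(p)


def is_rare(population_mafs, threshold=0.01):
    """Return True if the highest population MAF is below the threshold.

    Assumption: a frequency of None means the variant was not observed in
    that population, so it is ignored; a variant with no observed
    frequency at all is treated as rare.
    """
    observed = [maf for maf in population_mafs.values() if maf is not None]
    return max(observed, default=0.0) < threshold


# Hypothetical records; identifiers and frequencies are made up.
variants = [
    {"id": "variant_A", "gene": "ACE2", "mafs": {"pop1": 0.0004, "pop2": 0.002}},
    {"id": "variant_B", "gene": "TYK2", "mafs": {"pop1": 0.03, "pop2": 0.008}},
    {"id": "variant_C", "gene": "IFNAR2", "mafs": {"pop1": None, "pop2": None}},
]

rare_ids = [v["id"] for v in variants if is_rare(v["mafs"])]
print(rare_ids)
```

In this sketch variant_B is rejected because one population exceeds 1%,
even though the other does not; taking the maximum across populations is
what makes the filter conservative.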
In-Silico analysis
Various bioinformatics software packages for molecular dynamics
simulation were used to determine the effect of each genetic variant on
the amino acid sequence, including its effect on the primary and
alternative transcripts of the gene, as well as its potential effect on
the function and tertiary structure of the protein. PolyPhen-2, SIFT,
MutationTaster, FATHMM-MKL, and CADD scores are also reported.
Protein modeling
Most modeling methods, such as I-TASSER, generate models interactively
in response to user requests. Here, homology modeling was performed with
the I-TASSER server to build the 3D structure of the trimeric protein
under study, so that the effect of genetic variants on protein structure
and stability could be assessed. All files in Protein Data Bank (PDB)
format were obtained from the I-TASSER server. UCSF ChimeraX was used
for graphical visualization of the molecules. Its Matchmaker tool was
chosen to superimpose related structures regardless of differences in
residue numbering or missing residues. This tool superimposes proteins
by first creating a sequence alignment and then fitting the aligned
residues in 3D. All figures of 3D structures and alignments were also
prepared with UCSF ChimeraX.
Statistical analysis
Statistical analysis was performed using SPSS version 26.0, and graphs
were drawn with GraphPad Prism 9.4. The chi-square test was used to
evaluate the association of different variants of the human genes ACE2,
TMPRSS2, TYK2, SLC6A20, and IFNAR2 with the severity and incidence rate
of COVID-19. In all analyses, the group without variants was taken as
the reference group for calculating the odds ratio (OR). A P-value
< 0.05 was considered statistically significant.
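
For readers who want to see how the odds ratio and the chi-square test
relate to a 2x2 table, a minimal Python sketch follows. It is not the
SPSS procedure used in the study. It assumes the Pearson chi-square
statistic without continuity correction, one degree of freedom, and no
zero cells; the counts are hypothetical.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a: variant carriers with the outcome
    b: variant carriers without the outcome
    c: non-carriers (reference group) with the outcome
    d: non-carriers (reference group) without the outcome
    """
    return (a * d) / (b * c)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and P-value for a 2x2 table (1 df).

    Assumption: no continuity correction is applied. For one degree of
    freedom the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 20 carriers (12 with the outcome) and
# 80 non-carriers (20 with the outcome).
a, b, c, d = 12, 8, 20, 60
or_value = odds_ratio(a, b, c, d)
chi2, p_value = chi_square_2x2(a, b, c, d)
print(f"OR = {or_value:.2f}, chi2 = {chi2:.3f}, P = {p_value:.4f}")
print("significant at 0.05:", p_value < 0.05)
```

With small expected cell counts, an exact test or a continuity-corrected
statistic may be more appropriate than the uncorrected version sketched
here.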