EFIN

 

Evaluation of Functional Impact of Nonsynonymous SNPs

Welcome to EFIN

EFIN is a free tool that predicts whether an amino acid substitution is likely to be related to disease. Predictions are made by random forests trained on protein conservation features.

How to use EFIN

You can submit a query on our website or download the precomputed results for all human proteins here.

 


Two EFIN models are available, trained on variant datasets from the UniProt Swiss-Prot protein knowledgebase and from HumDiv, respectively.

Details of the output

The columns of the precomputed human protein EFIN database and of the EFIN web server query results are described below.

Column name                   Description
Protein                       Protein accession number from UniProt
Location                      Position of the substitution in the protein
AAref/AAmut                   Reference/mutant amino acid at the query position
EFIN(Swiss-Prot)_score        EFIN prediction score for the queried amino acid substitution, from the Swiss-Prot-trained EFIN
EFIN(Swiss-Prot)_prediction   EFIN prediction (Damaging/neutral) for the queried amino acid substitution, from the Swiss-Prot-trained EFIN
EFIN(HumDiv)_score            EFIN prediction score for the queried amino acid substitution, from the HumDiv-trained EFIN
EFIN(HumDiv)_prediction       EFIN prediction (Damaging/neutral) for the queried amino acid substitution, from the HumDiv-trained EFIN
Lowest conserved block        The lowest block for which all of its sequences, together with all sequences in the blocks above it, have the reference amino acid perfectly conserved
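The definition of the lowest conserved block can be read as a top-down walk through the blocks. The following is only a minimal sketch of that reading, not EFIN's actual code. It assumes that the blocks are ordered from top to bottom as the block names are listed on this page (primates first, paralog last), that each block is given as the list of residues found at the query position, and that an empty block is skipped. The function name and the input layout are hypothetical.

```python
# Assumed top-to-bottom block order (primates first, paralog last).
BLOCK_ORDER = ["primates", "mammal", "vertebrate",
               "invertebrate", "other species", "paralog"]


def lowest_conserved_block(block_residues, aa_ref):
    """Return the lowest block such that it and every block above it
    contain only the reference amino acid, or None if no block qualifies.

    block_residues maps a block name to the residues observed at the
    query position in that block (hypothetical input layout).
    """
    lowest = None
    for block in BLOCK_ORDER:
        residues = block_residues.get(block, [])
        # Stop at the first block where conservation is broken.
        if any(residue != aa_ref for residue in residues):
            break
        # Assumption: an empty block does not count as conserved itself.
        if residues:
            lowest = block
    return lowest


example = {
    "primates": ["A", "A"],
    "mammal": ["A"],
    "vertebrate": ["A", "G"],  # conservation is broken here
}
result = lowest_conserved_block(example, "A")
```

In this example the walk stops at the vertebrate block, so the mammal block is reported.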

The following features are calculated in each block.

Each feature is named according to the rule FeatureName_BlockName.

The block names are: primates = primates block, mammal = non-primate mammal block, vertebrate = non-mammal vertebrate block, invertebrate = invertebrate block, other species = other species block, paralog = paralog block.

Fref                    Frequency of the reference amino acid at the query position in each block
Fmut                    Frequency of the mutant amino acid at the query position in each block
H                       Shannon entropy at the query position in each block
NASfirst                Normalized alignment score of the first sequence in each block
No_all                  Total number of sequences in each block
No_qp                   Number of sequences that cover the query position in each block
QAratio(No_qp/No_all)   Ratio of No_qp to No_all
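To make the per-block features concrete, here is a rough sketch of how they could be computed for one block and named with the FeatureName_BlockName rule. It is an illustration under stated assumptions, not EFIN's implementation: frequencies are taken over the sequences that cover the query position (No_qp), the entropy uses log base 2, a sequence that does not cover the position is represented by None, and the ratio is stored under the short name QAratio. NASfirst is left out because the page does not define how the alignment score is normalized. The function name is hypothetical.

```python
import math
from collections import Counter


def block_features(block_name, column, aa_ref, aa_mut):
    """Compute per-block features at the query position.

    column holds one entry per sequence in the block: the residue at
    the query position, or None if the sequence does not cover it
    (hypothetical input layout).
    """
    no_all = len(column)
    covered = [aa for aa in column if aa is not None]
    no_qp = len(covered)
    counts = Counter(covered)

    # Assumption: frequencies are relative to the covering sequences.
    fref = counts[aa_ref] / no_qp if no_qp else 0.0
    fmut = counts[aa_mut] / no_qp if no_qp else 0.0

    # Shannon entropy of the residue distribution (assumed base 2).
    entropy = sum(-(n / no_qp) * math.log2(n / no_qp)
                  for n in counts.values())

    values = {
        "Fref": fref,
        "Fmut": fmut,
        "H": entropy,
        "No_all": no_all,
        "No_qp": no_qp,
        "QAratio": no_qp / no_all if no_all else 0.0,
    }
    # Apply the naming rule FeatureName_BlockName.
    return {f"{name}_{block_name}": value for name, value in values.items()}


features = block_features("primates", ["A", "A", "G", None], "A", "G")
```

Here four sequences are in the block and three cover the query position, so No_all_primates is 4, No_qp_primates is 3 and QAratio_primates is 0.75.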

Choose the appropriate model

 


Compared with the Swiss-Prot dataset, the HumDiv dataset is more separable: its damaging variants are selected disease-causal variants, whereas the Swiss-Prot dataset counts both disease-related and disease-causal variants as damaging.

We suggest using EFIN (HumDiv) to detect disease-causal mutations for single-mutation diseases, and EFIN (Swiss-Prot) to identify disease-related mutations for complex diseases.
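When working with the output columns, this suggestion amounts to choosing which pair of score and prediction columns to read. A small sketch follows. The helper name and the "single" and "complex" labels are hypothetical; only the column names come from the output description on this page.

```python
# Suggested model per disease type, following the guidance above.
SUGGESTED_MODEL = {
    "single": "HumDiv",       # single-mutation disease
    "complex": "Swiss-Prot",  # complex disease
}


def efin_columns(disease_type):
    """Return the (score, prediction) column names for the suggested model."""
    model = SUGGESTED_MODEL[disease_type]
    return f"EFIN({model})_score", f"EFIN({model})_prediction"


score_col, prediction_col = efin_columns("complex")
```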

 

Other questions

If you have any other questions, please send an e-mail to zengshuai95@gmail.com