Ultra-fast and Large-Scale Protein Structural Neighbor Searching


The datasets, source code, and the DSSP program used in the MADOKA paper are available here. The first dataset is from the TM-align paper and includes 200 non-homologous protein structures from the PDB, ranging in size from 46 to 1058 residues. The second is from MALIDUP and contains 241 manually curated pairwise structure alignments of homologous domains that originated from internal duplication. The third is from MALISAM and consists of 130 protein pairs in which the two proteins of each pair have different SCOP folds but are structurally analogous. There are two releases of MADOKA: the first outputs the superposed Cα atom traces, while the other, simplified release does not.