protFP

The ProtFP descriptor set was constructed from a large initial selection of indices obtained from the AAindex database for all 20 naturally occurring AAs.Indices showing highest covariance were removed, while at the same time a number of largely independent physicochemical parameters were maintained. In order to obtain descriptors at lower dimensionality PCA was performed on the final selection set of 58 amino acid properties. 8 PCs(protFP1~protFP8) explained 92% of the variance.

Click here to download protFP code.


Running protFP

Before using protFP, make sure that you have the R language and the R packages Peptides, doParallel, entropy, dplyr, readxl, stringr installed on your computer. If not, after installing R, run the following commands to install the dependencies in R respectively.

Install R Packages

`install.packages("doParallel")`
`install.packages("Peptides")`
`install.packages("dplyr")`
`install.packages("stringr")`
`install.packages("readxl")`
`install.packages("entropy")`


Run protFP

Creat a working directory, e.g., ./working
Unzip the downloaded file in the working directory
Run protFP:
path-to-Rscript protFP_calculate.R -i <Input file(csv)> -o <Working directory>

Parameters:

-i: A csv file with two columns. The first column is the sequence of neoantigen's WT Peptide, while the second column is the sequence of neoantigen's Mutant Peptide. Now only neoantigens with nine peptides and one site-specific mutation are supported for calculation. You can download the example data by clicking on the link above.
-o: Working directory. The folder's path which you put the unziped files in, and where the output files will be located.

Example:

/usr/bin/Rscript protfp_calculate.R -i example.csv -o /feature_calculate/

Citation:

van Westen, G.J., R.F. Swier, J.K. Wegner, A.P. Ijzerman, et al., Benchmarking of protein descriptor sets in proteochemometric modeling (part 1): comparative study of 13 amino acid descriptor sets. J Cheminform, 2013. 5(1): 41.
van Westen, G.J., R.F. Swier, I. Cortes-Ciriano, J.K. Wegner, et al., Benchmarking of protein descriptor sets in proteochemometric modeling (part 2): modeling performance of 13 amino acid descriptor sets. J Cheminform, 2013. 5(1): 42.