What Are Neuropeptides?

In brief, Neuropeptides (NPs) are a class of neurohormones processed from longer precursor proteins called Neuropeptide Precursors (NPPs). Each NPP may contain multiple NPs. NPs are produced in neurons and act in cell-to-cell communication. They have diverse functions related to reproduction, neuronal development, metabolism, social behavior, learning and more.

Are Neuropeptides found in all species?

The accepted view is that NPPs evolved and expanded in Metazoa along with the development of multicellular organisms. NPs have been identified in Cnidaria, but an expansion in gene number and diversity occurred in insects and mammals.

Where can I read more on Neuropeptides?

Many articles, books, and reviews exist. Be aware that the boundaries between neuropeptides, neurohormones, short peptides, and antimicrobial peptides may be somewhat blurry.

Where can I find related tools?

UniProtKB (the protein database) is the main knowledge-based protein resource.

Collections of peptides from mass spectrometry analyses are the best source for identifying NPs. Databases such as PeptideAtlas and NeuroPedia are good resources.

Where can I find information on NPs from a selected organism?

UniProtKB (the protein database) is the knowledge-based protein resource and can be searched by taxonomy or organism. The [List of known and experimentally validated Neuropeptide Precursors] can also be accessed.

There is no systematic organism-specific resource, but see "Neuropeptides" in WormBook, a resource that focuses on the worm C. elegans.

What is NeuroPID?

NeuroPID is an online web server for predicting Neuropeptide Precursors (NPPs) from sequence data. NeuroPID predictions are based on machine learning. The predictors were constructed using the freely available scikit-learn toolkit. For details on the methodology, see "NeuroPID: A Predictor for Identifying Neuropeptide Precursors from Metazoan Proteomes", Bioinformatics (2013).

Can active NPs be identified by NeuroPID?

Actually no. NeuroPID predicts the NPP only. The user is encouraged to forward the results to predictors of NPs such as NeuroPred, an online tool that predicts potential Neuropeptide products from likely cleavage sites in a given sequence.

Alternatively, the user can consult resources derived from Mass-Spectrometry (MS) experiments.

Does NeuroPID rely on biological knowledge?

NeuroPID is designed to work also on newly discovered sequences and genes that lack any prior annotation. The features used to train the machine learning predictors are all encoded in the protein sequences, so prior knowledge is not needed.

Are sequences without the Signal Peptide valid for NeuroPID?

Sure. One can process the sequences offline and remove the Signal Peptide segment. The predictions do not rely on the presence of a Signal Peptide in the sequence. In fact, tools other than SignalP (e.g. Phobius) can also be used to identify the Signal Peptide.
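As a rough illustration of the offline step described above, the following Python sketch removes a leading segment from each sequence in a FASTA text. It assumes the Signal Peptide lengths have already been obtained separately (for example from a SignalP or Phobius run) and are supplied as a plain dictionary; the function names `read_fasta` and `strip_signal_peptides` are hypothetical and not part of NeuroPID.

```python
from typing import Dict, Iterator, Tuple


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (sequence ID, sequence) pairs from FASTA-formatted text."""
    seq_id, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            # Use the first word of the header as the ID.
            seq_id = (line[1:].split() or [""])[0]
            chunks = []
        elif seq_id is not None:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def strip_signal_peptides(fasta_text: str, sp_lengths: Dict[str, int]) -> str:
    """Remove the first sp_lengths[ID] residues of each sequence.

    sp_lengths is an assumed input: Signal Peptide lengths predicted by an
    external tool. Sequences without an entry are left unchanged.
    """
    out = []
    for seq_id, seq in read_fasta(fasta_text):
        cut = sp_lengths.get(seq_id, 0)
        out.append(f">{seq_id}\n{seq[cut:]}\n")
    return "".join(out)


# Toy input; the sequences and lengths are made up for illustration.
fasta = ">p1 example protein\nMKTLL\nAAGG\n>p2\nMKK\n"
print(strip_signal_peptides(fasta, {"p1": 3}))
```

The resulting FASTA text can then be submitted in place of the original sequences.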

What are the different confidence measures?

NeuroPID uses multiple machine learning methods. Still, the number of false predictions may be quite high. We suggest activating the SignalP filter so that only secreted proteins remain. Results can also be filtered according to how many of the predictors agree. The consensus levels available for filtering are:

  • Full - Display predictions with a full consensus, namely, 4/4 prediction methods.
  • High - Display predictions with a consensus of 3/4 prediction methods.
  • Medium - Display predictions that are supported by only 2/4 prediction methods.
  • Low - Display predictions that are supported by only 1/4 prediction methods.
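The consensus levels above can be sketched in Python as follows. This is a minimal illustration, not NeuroPID's actual code: it assumes each sequence comes with four yes/no votes and that each level corresponds to an exact number of agreeing methods. The names `consensus_level` and `filter_by_level` and the example votes are hypothetical.

```python
from typing import Dict, List, Optional

# Number of agreeing prediction methods for each consensus level listed above.
LEVEL_BY_VOTES = {4: "Full", 3: "High", 2: "Medium", 1: "Low"}


def consensus_level(votes: List[bool]) -> Optional[str]:
    """Return the consensus level for four votes, or None if no method predicts an NPP."""
    if len(votes) != 4:
        raise ValueError("expected exactly four predictor votes")
    return LEVEL_BY_VOTES.get(sum(votes))


def filter_by_level(predictions: Dict[str, List[bool]], level: str) -> List[str]:
    """Keep the sequence IDs whose votes match the requested consensus level."""
    return [seq_id for seq_id, votes in predictions.items()
            if consensus_level(votes) == level]


# Made-up votes from four predictors for three sequences.
example = {
    "seq1": [True, True, True, True],
    "seq2": [True, False, True, True],
    "seq3": [False, False, True, False],
}
print(filter_by_level(example, "High"))
```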

What is the "Internal Score" made of?

The internal score is used to filter out sequences after prediction. It is the sum of the numbers of dibasic and tribasic AA sites (K and R) divided by the length of the peptide. For this purpose the peptide does not include its signal peptide (the first 25 AAs), which is cleaved off anyway.
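A minimal Python sketch of this score is given below. The text does not specify how dibasic and tribasic sites are counted, so the sketch makes an explicit assumption: each maximal run of exactly two consecutive K/R residues counts as one dibasic site, and each run of three or more counts as one tribasic site. The function name `internal_score` is hypothetical.

```python
import re

# The first 25 residues are treated as the signal peptide, as stated above.
SIGNAL_PEPTIDE_LENGTH = 25

# Assumption: a "site" is a maximal run of two or more consecutive K/R residues.
BASIC_RUN = re.compile(r"[KR]{2,}")


def internal_score(sequence: str) -> float:
    """(dibasic sites + tribasic sites) / length, after dropping the signal peptide."""
    mature = sequence.upper()[SIGNAL_PEPTIDE_LENGTH:]
    if not mature:
        return 0.0
    runs = BASIC_RUN.findall(mature)
    dibasic = sum(1 for run in runs if len(run) == 2)
    tribasic = sum(1 for run in runs if len(run) >= 3)
    return (dibasic + tribasic) / len(mature)


# Toy sequence: 25 placeholder residues followed by a 10-residue peptide
# containing one dibasic site (KR) and one tribasic site (KRR).
toy = "M" * 25 + "AKRGGKRRAA"
print(internal_score(toy))
```

Other counting conventions (for example overlapping windows) would give different values, so the numbers from this sketch are not guaranteed to match those reported by the server.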

What can we learn from the features histogram?

About 600 features were extracted for the predictions. However, only a few of them were shown (in cross-validation) to distinguish between positive and negative predictions. The features listed here are taken from these informative features. For each protein, the histogram shows the actual value transformed by the standard deviation (S.D.):

  • Molecular weight
  • Length (sequence length in amino acids)
  • Isoelectric Point (pI)
  • Aromaticity (The relative frequency of Phe, Trp, Tyr)
  • Amino Acid usage (20 features)
  • Instability Index (an estimate of the stability of a protein in vitro)
  • GRAVY (Grand Average of Hydropathy) - the sum of hydropathy values of all amino acids, divided by the number of residues in the analyzed sequence.
  • Aliphatic index - the relative volume occupied by aliphatic side chains Ala, Val, Ile and Leu.
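To make some of the listed features concrete, here is a small pure-Python sketch that computes length, aromaticity, amino acid usage, GRAVY (using the Kyte-Doolittle hydropathy scale) and the aliphatic index (using the commonly cited Ikai coefficients 2.9 and 3.9). It is an illustration under these assumptions, not the feature extraction code used by NeuroPID; the function name `sequence_features` is hypothetical, and the S.D. transformation is not reproduced.

```python
from collections import Counter
from typing import Dict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used for GRAVY.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sequence_features(sequence: str) -> Dict[str, float]:
    """Compute a few sequence-derived features for one protein."""
    seq = sequence.upper()
    if not seq or any(aa not in KYTE_DOOLITTLE for aa in seq):
        raise ValueError("sequence must be non-empty and use the 20 standard amino acids")
    n = len(seq)
    counts = Counter(seq)
    freq = {aa: counts[aa] / n for aa in AMINO_ACIDS}
    features = {
        "length": float(n),
        # Relative frequency of Phe, Trp and Tyr.
        "aromaticity": freq["F"] + freq["W"] + freq["Y"],
        # Sum of hydropathy values divided by the number of residues.
        "gravy": sum(KYTE_DOOLITTLE[aa] * counts[aa] for aa in AMINO_ACIDS) / n,
        # Mole percent of Ala, Val, Ile and Leu weighted by relative side-chain volume.
        "aliphatic_index": 100 * (freq["A"] + 2.9 * freq["V"]
                                  + 3.9 * (freq["I"] + freq["L"])),
    }
    # Amino acid usage: one feature per standard residue (20 features).
    features.update({f"freq_{aa}": freq[aa] for aa in AMINO_ACIDS})
    return features


# Toy sequence for illustration only.
toy_features = sequence_features("AVILFWYK")
for name in ("length", "aromaticity", "gravy", "aliphatic_index"):
    print(name, round(toy_features[name], 3))
```

Molecular weight, isoelectric point and the instability index need larger lookup tables and are omitted here.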

Is the system available for offline use?

The NeuroPID source code is freely available at: http://neuropid.cs.huji.ac.il. For further materials and help, feel free to contact us.

© 2013 Neuropid | The Hebrew University of Jerusalem | SK