General Definition

Amino Acid Sequence

Also known as: Primary structure, Peptide sequence, Sequence

An amino acid sequence is the specific linear order of amino acids in a peptide or protein, read from the amino terminus (N-terminus) to the carboxyl terminus (C-terminus). The sequence determines the peptide's three-dimensional structure, biological function, receptor binding specificity, and pharmacological properties. Even a single amino acid change can dramatically alter activity.

Last updated: February 1, 2026

Sequence Notation Systems

One-Letter Code

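| Letter | Amino Acid | Letter | Amino Acid |
|---|---|---|---|
| A | Alanine | M | Methionine |
| C | Cysteine | N | Asparagine |
| D | Aspartic acid | P | Proline |
| E | Glutamic acid | Q | Glutamine |
| F | Phenylalanine | R | Arginine |
| G | Glycine | S | Serine |
| H | Histidine | T | Threonine |
| I | Isoleucine | V | Valine |
| K | Lysine | W | Tryptophan |
| L | Leucine | Y | Tyrosine |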

Three-Letter Code

Used in detailed documentation:

Met-Gly-Ser-Ser-Ser-His-Leu-Val-Arg-Ala-Leu-Tyr-Leu-Val-Cys
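
The two notations map onto each other one residue at a time, so converting between them is a table lookup. The sketch below assumes only the 20 standard amino acids from the table above; the function names are illustrative, not part of any library.

```python
# One-letter to three-letter codes for the 20 standard amino acids.
THREE_LETTER = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
}
# Reverse lookup: three-letter code back to one-letter code.
ONE_LETTER = {three: one for one, three in THREE_LETTER.items()}


def to_three_letter(sequence: str) -> str:
    """Convert 'MGS' to 'Met-Gly-Ser'. Raises KeyError on unknown letters."""
    return "-".join(THREE_LETTER[residue] for residue in sequence.upper())


def to_one_letter(sequence: str) -> str:
    """Convert 'Met-Gly-Ser' to 'MGS'. Raises KeyError on unknown codes."""
    return "".join(ONE_LETTER[token] for token in sequence.split("-"))


print(to_three_letter("MGSSSHLVRALYLVC"))
```

Both functions read and write the sequence N-terminus first, matching the convention used throughout this entry.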

Example Sequences

| Peptide | Sequence (One-Letter) | Length |
|---|---|---|
| BPC-157 | GEPPPGKPADDAGLV | 15 AA |
| Oxytocin | CYIQNCPLG (cyclic) | 9 AA |
| Vasopressin | CYFQNCPRG (cyclic) | 9 AA |
| TB-500 | LKKTETQ… (43 AA total) | 43 AA |

How Sequence Determines Function

Structure Levels

| Level | Determined By | Description |
|---|---|---|
| Primary | Sequence itself | Linear amino acid order |
| Secondary | Local sequence | Alpha helices, beta sheets |
| Tertiary | Full sequence | 3D folding |
| Quaternary | Multiple chains | Multi-subunit assembly |

Sequence-Function Relationships

| Sequence Feature | Functional Impact |
|---|---|
| Hydrophobic residues | Membrane interaction, folding core |
| Charged residues | Solubility, receptor binding |
| Cysteine positions | Disulfide bonds, structure |
| Proline positions | Helix breaks, rigidity |
| Glycine positions | Flexibility |

Critical Positions

Receptor Binding Sites

| Peptide | Critical Residues | Function |
|---|---|---|
| GLP-1 | His7, Glu9, Phe12 | Receptor activation |
| Insulin | A21, B24-B26 | Receptor binding |
| Oxytocin | Tyr2, Ile3, Gln4 | Uterine receptor |

Single Amino Acid Effects

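| Change | Peptides Compared | Effect |
|---|---|---|
| Glu9 (GLP-1) vs Gln3 (glucagon) | GLP-1 vs Glucagon | Different receptor specificity |
| Ile3 vs Phe3 | Oxytocin vs Vasopressin | Uterine vs vascular effects |
| Ala8 to Aib8 | Native GLP-1 vs Semaglutide | DPP-4 resistance |

The first two pairs are not strict single-residue variants: oxytocin and vasopressin also differ at position 8 (Leu vs Arg), and GLP-1 and glucagon differ at several other positions.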
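
Position-by-position differences like these can be listed directly from two one-letter sequences. The sketch below is a minimal comparison for sequences of equal length; it performs no alignment, and `point_differences` is an illustrative name rather than a library function.

```python
def point_differences(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """Return (position, residue_a, residue_b) for each mismatch, 1-indexed from the N-terminus."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length (no alignment is done)")
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


oxytocin = "CYIQNCPLG"
vasopressin = "CYFQNCPRG"
for position, a, b in point_differences(oxytocin, vasopressin):
    print(f"position {position}: {a} -> {b}")
```

For the oxytocin and vasopressin sequences given earlier in this entry, this reports positions 3 (I vs F) and 8 (L vs R).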

Sequence Analysis

Calculating Properties from Sequence

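| Property | How to Calculate | Importance |
|---|---|---|
| Molecular weight | Sum of free amino acid masses − (n−1) × 18.015 Da (one water lost per peptide bond); equivalently, sum of residue masses + 18.015 Da | Identity confirmation |
| Isoelectric point (pI) | From charged residues | Solubility, purification |
| Hydrophobicity | Sum of hydrophobicity values | Solubility prediction |
| Charge at pH 7 | From ionizable groups | Behavior in solution |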
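
As an illustration of the hydrophobicity row, the sketch below sums per-residue hydropathy values and also reports the mean (often called GRAVY). The Kyte-Doolittle scale is used here as an assumption; this entry does not name a scale, and other scales give different numbers.

```python
# Kyte-Doolittle hydropathy values (assumed scale; positive = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_sum(sequence: str) -> float:
    """Sum of per-residue hydropathy values over the whole sequence."""
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence)


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: the sum divided by the sequence length."""
    return hydropathy_sum(sequence) / len(sequence)


sequence = "GEPPPGKPADDAGLV"
print(f"sum = {hydropathy_sum(sequence):.1f}, GRAVY = {gravy(sequence):.2f}")
```

A negative value on this scale indicates a net hydrophilic peptide, which is a rough guide to solubility rather than a prediction of it.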

BPC-157 Example

Sequence: GEPPPGKPADDAGLV

| Property | Value | Basis |
|---|---|---|
| Length | 15 amino acids | Count |
| MW | 1419.53 Da | Sum of masses |
| pI | 4.2 | 2 Asp, 1 Glu, 1 Lys |
| Net charge (pH 7) | -2 | Acidic peptide |
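
The values above can be approximated from the sequence alone. The sketch below uses average residue masses (each already one water lighter than the free amino acid, so a single water is added back for the termini) and a Henderson-Hasselbalch charge model. The pKa set is an assumption: several published sets exist, and the estimated pI shifts with the set chosen, so it need not match the tabulated value exactly. The model also assumes free termini and no modified residues.

```python
# Average residue masses in Da (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153

# Assumed pKa values; other published sets differ by a few tenths of a unit.
PKA_POSITIVE = {"N_TERM": 8.6, "K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"C_TERM": 3.6, "D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}


def molecular_weight(sequence: str) -> float:
    """Average molecular weight: residue masses plus one water for the free termini."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER


def net_charge(sequence: str, ph: float) -> float:
    """Net charge at a given pH from the ionizable groups (Henderson-Hasselbalch)."""
    charge = 0.0
    for group in ["N_TERM", "C_TERM", *sequence]:
        if group in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[group]))
        elif group in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[group] - ph))
    return charge


def isoelectric_point(sequence: str, tolerance: float = 1e-4) -> float:
    """pH at which the net charge is zero, found by bisection on pH 0 to 14."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        middle = (low + high) / 2
        # Net charge falls monotonically as pH rises.
        if net_charge(sequence, middle) > 0:
            low = middle
        else:
            high = middle
    return (low + high) / 2


bpc157 = "GEPPPGKPADDAGLV"
print(f"MW: {molecular_weight(bpc157):.2f} Da")
print(f"Net charge at pH 7: {net_charge(bpc157, 7.0):+.2f}")
print(f"Estimated pI: {isoelectric_point(bpc157):.2f}")
```

The net charge at pH 7 follows from counting groups: the N-terminus and Lys contribute about +2, while the C-terminus, two Asp, and one Glu contribute about −4.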

Sequence Variations

Natural Variants

| Type | Description | Example |
|---|---|---|
| Polymorphisms | Population variants | SNPs affecting peptide hormones |
| Splice variants | Alternative processing | Different preprohormone processing |
| Species differences | Evolution | Insulin varies between species |

Designed Modifications

| Modification | Purpose | Example |
|---|---|---|
| Substitution | Improve property | Met to Nle for stability |
| Deletion | Shorten/simplify | Truncated analogs |
| Insertion | Add function | Additional binding site |
| Cyclization | Stability, selectivity | Cyclic peptide drugs |

Sequence Determination Methods

Sequencing Techniques

| Method | Principle | Use Case |
|---|---|---|
| Edman degradation | Sequential N-terminal removal | Classical method |
| MS/MS sequencing | Fragmentation analysis | Modern standard |
| Amino acid analysis | Composition (not order) | Confirms composition |
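
The distinction between composition and order can be made concrete: two peptides with the same residue counts are indistinguishable by amino acid analysis even when their sequences differ. A minimal sketch, using a scrambled BPC-157 as a purely hypothetical second peptide:

```python
from collections import Counter


def composition(sequence: str) -> Counter:
    """Count how many times each residue occurs, ignoring order."""
    return Counter(sequence)


bpc157 = "GEPPPGKPADDAGLV"
# Hypothetical scrambled peptide: same residues, sorted alphabetically.
scrambled = "".join(sorted(bpc157))

print(dict(composition(bpc157)))
print("Same composition:", composition(bpc157) == composition(scrambled))
print("Same sequence:", bpc157 == scrambled)
```

This is why composition data is paired with an order-sensitive method such as MS/MS or Edman degradation.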

Verification Requirements

| Test | Confirms | Method |
|---|---|---|
| Molecular weight | Correct formula | Mass spectrometry |
| Sequence order | Correct sequence | MS/MS, Edman |
| Composition | Right amino acids | Amino acid analysis |
| Purity | Main component | HPLC |

Representing Modified Sequences

Notation Conventions

| Modification | Notation Example |
|---|---|
| N-terminal acetyl | Ac-GEPPPGKPADDAGLV |
| C-terminal amide | GEPPPGKPADDAGLV-NH2 |
| D-amino acid | [D-Ala]-GEPPPG… |
| Disulfide bond | C1-C6 (positions) |
| Non-standard AA | [Aib]-EGTFTSDV… |
| Fatty acid | K([Palmitoyl-Glu])… |
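
The terminal-cap notations are regular enough to split mechanically. The sketch below is a deliberately narrow, hypothetical helper that recognizes only the `Ac-` prefix and `-NH2` suffix from the table above; bracketed residues, disulfide annotations, and side-chain modifications are outside its scope.

```python
def parse_terminal_caps(notation: str) -> dict:
    """Split a one-letter sequence from an optional Ac- prefix and -NH2 suffix."""
    n_cap = None
    c_cap = None
    core = notation
    if core.startswith("Ac-"):
        n_cap = "acetyl"
        core = core[len("Ac-"):]
    if core.endswith("-NH2"):
        c_cap = "amide"
        core = core[: -len("-NH2")]
    return {"n_cap": n_cap, "sequence": core, "c_cap": c_cap}


print(parse_terminal_caps("Ac-GEPPPGKPADDAGLV"))
print(parse_terminal_caps("GEPPPGKPADDAGLV-NH2"))
```

Keeping the caps separate from the core sequence means sequence-based calculations can run on the core while the caps are handled as explicit adjustments.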

Complex Example: Semaglutide

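[Aib8,Arg34,Lys26(Nε-[17-carboxyheptadecanoyl-γ-Glu-OEG-OEG])]GLP-1(7-37)

Here OEG is the 8-amino-3,6-dioxaoctanoic acid spacer.

Translation:

  • Position 8: Aib instead of Ala
  • Position 34: Arg instead of Lys
  • Position 26: Lys acylated on its side chain with a C18 fatty diacid, attached through a γ-Glu and two OEG spacers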
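
Note that the position numbers count from 7, not 1, because the notation keeps the numbering of the GLP-1(7-37) fragment. The sketch below applies the two backbone substitutions with that offset. The native GLP-1(7-37) sequence is supplied here as an outside fact rather than taken from this entry, the helper name is illustrative, and the Lys26 side chain is only located, not modeled.

```python
# Native GLP-1(7-37); the first residue (His) is position 7, not 1.
GLP1_7_37 = "HAEGTFTSDVSSYLEGQAAKEFIAWLVKGRG"
FIRST_POSITION = 7


def apply_substitutions(sequence: str, substitutions: dict, first_position: int = 1) -> list:
    """Return residues as a list of tokens with substitutions applied.

    substitutions maps position -> (expected_residue, replacement_token).
    A list is used because non-standard residues such as Aib have no one-letter code.
    """
    residues = list(sequence)
    for position, (expected, replacement) in substitutions.items():
        index = position - first_position
        if not 0 <= index < len(residues):
            raise IndexError(f"position {position} is outside the sequence")
        if residues[index] != expected:
            raise ValueError(
                f"expected {expected} at position {position}, found {residues[index]}"
            )
        residues[index] = replacement
    return residues


backbone = apply_substitutions(
    GLP1_7_37,
    {8: ("A", "Aib"), 34: ("K", "R")},
    first_position=FIRST_POSITION,
)
acylation_site = backbone[26 - FIRST_POSITION]  # the Lys that carries the fatty acid chain

print("-".join(backbone))
print("Residue at position 26:", acylation_site)
```

Checking the expected residue before replacing it catches off-by-offset mistakes, which are easy to make when a fragment keeps its parent numbering.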

Frequently Asked Questions

Why is sequence written N to C terminus?

This convention matches the direction of biological synthesis: ribosomes start at the N-terminus and add each new amino acid to the C-terminus of the growing chain. Chemical solid-phase synthesis typically runs the other way, building from C to N, but the finished sequence is still written N to C. It’s the universal standard in biochemistry.

Can two different sequences have the same activity?

Yes, if the critical binding residues are preserved. Many peptide analogs maintain activity with sequence changes outside the binding site. However, these changes may affect other properties like stability, immunogenicity, or pharmacokinetics.

How many possible peptide sequences exist?

For a peptide of length n built from the 20 standard amino acids, there are 20^n possible sequences. A 10-amino acid peptide has over 10 trillion possible sequences (20^10 = 1.024 × 10^13). This vast “sequence space” is why peptides offer enormous therapeutic diversity.
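
The growth of sequence space is easy to check with integer arithmetic. A minimal sketch, assuming each position is chosen independently from a fixed alphabet (`sequence_space` is an illustrative name):

```python
def sequence_space(length: int, alphabet_size: int = 20) -> int:
    """Number of distinct linear sequences of the given length."""
    return alphabet_size ** length


for n in (2, 5, 10, 15):
    print(f"{n:>2} residues: {sequence_space(n):,} sequences")
```

Python integers have arbitrary precision, so the count stays exact even for long peptides.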


Disclaimer: This glossary entry is for educational purposes only and does not constitute medical advice. Always consult a qualified healthcare provider for medical questions.