|
Amino acids play central roles both as building blocks of proteins
and as intermediates in metabolism. The 20 amino acids that are
found within proteins convey a vast array of chemical versatility. The
precise amino acid content, and the sequence of those amino acids,
of a specific protein, is determined by the sequence of the bases
in the gene that encodes that protein.
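As a minimal sketch of how a base sequence determines an amino acid sequence, the following Python snippet translates a short coding sequence with the standard genetic code. The function name `translate` and the example sequence are illustrative assumptions, not taken from the text; the sketch ignores start-site selection, introns, and variant genetic codes.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order ("*" = stop).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate a coding sequence (DNA or RNA letters) up to the first stop."""
    seq = coding_sequence.upper().replace("U", "T")
    protein = []
    # Read non-overlapping triplets from the first base onward.
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical example: Met-Ala-Arg followed by a stop codon.
print(translate("ATGGCTCGTTAA"))
```

Changing a single base in the input changes at most one residue in the output, which is the sense in which the gene sequence fixes the protein sequence.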
The chemical properties of the amino acids of proteins determine
the biological activity of the protein. Proteins not only catalyze
all (or most) of the reactions in living cells, they control
virtually all cellular processes. In addition, proteins contain
within their amino acid sequences the necessary information to
determine how that protein will fold into a three-dimensional
structure, and the stability of the resulting structure.
The field of protein folding and stability has been a critically
important area of research for years, and remains today one of the
great unsolved mysteries. It is, however, being actively
investigated, and progress is being made every day.
Alanine A (Ala)

Alanine is a hydrophobic molecule. It is
ambivalent, meaning that it can be inside or outside of the
protein molecule. The α carbon of alanine is optically active; in
proteins, only the L-isomer is found.
Note that alanine is the α-amino acid analog of the α-keto acid
pyruvate, an intermediate in sugar metabolism. Alanine and
pyruvate are interchangeable by a transamination reaction.
Arginine R (Arg)

Arginine, an essential amino acid, has a
positively charged guanidino group. Arginine is well
designed to bind the phosphate anion, and is often found in the
active centers of proteins that bind phosphorylated substrates. As
a cation, arginine, as well as lysine, plays a role in maintaining
the overall charge balance of a protein.
Arginine also plays an important role in nitrogen metabolism.
In the urea cycle, the enzyme arginase cleaves (hydrolyzes) the
guanidinium group to yield urea and the L-amino acid ornithine.
Ornithine is lysine with one fewer methylene group in the side
chain. L-ornithine is not normally found in proteins.
There are 6 codons in the genetic code for arginine, yet,
although this large a number of codons is normally associated with
a high frequency of the particular amino acid in proteins,
arginine is one of the least frequent amino acids. The discrepancy
between the frequency of the amino acid in proteins and the number
of codons is greater for arginine than for any other amino acid.
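The codon count quoted above can be checked directly against the standard genetic code. The sketch below groups the 64 RNA codons by the amino acid they encode; the variable names are illustrative assumptions, and no amino acid frequency data are included because none are given in the text.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order ("*" = stop).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Map each one-letter amino acid code to the list of codons that encode it.
codons_for = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    codons_for.setdefault(aa, []).append("".join(codon))

print("Arginine codons:", codons_for["R"])

# Amino acids that share the maximum degeneracy of six codons.
six_codon = sorted(aa for aa, codons in codons_for.items() if len(codons) == 6)
print("Six-codon amino acids:", six_codon)
```

Only arginine, leucine, and serine have six codons, which is why the low frequency of arginine in proteins stands out.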
Asparagine N (Asn)

Asparagine is the amide of aspartic acid. The amide
group does not carry a formal charge under any biologically
relevant pH conditions. The amide is rather easily hydrolyzed,
converting asparagine to aspartic acid. This process is thought to
be one of the factors related to the molecular basis of aging.
Asparagine has a high propensity to hydrogen bond, since the
amide group can accept two and donate two hydrogen bonds. It is
found on the surface as well as buried within proteins.
Asparagine is a common site for attachment of carbohydrates in
glycoproteins.
Aspartic Acid D (Asp)

Aspartic acid is one of two acidic amino acids. Aspartic
acid and glutamic acid play important roles as general acids in
enzyme active centers, as well as in maintaining the solubility
and ionic character of proteins.
Proteins in the serum are critical to maintaining the pH
balance in the body; it is largely the charged amino acids that
are involved in the buffering properties of proteins. Aspartic
acid is alanine with one of the β hydrogens replaced by a
carboxylic acid group. The pKa of the β carboxyl group of aspartic
acid in a polypeptide is about 4.0.
Note that aspartic acid has an α-keto homolog, oxaloacetate,
just as pyruvate is the α-keto homolog of alanine. Aspartic acid
and oxaloacetate are interconvertible by a simple transamination
reaction, just as alanine and pyruvate are interconvertible.
Cysteine C (Cys)

Cysteine is one of two sulfur-containing
amino acids; the other is methionine. Cysteine differs from serine
in a single atom: the sulfur of the thiol replaces the oxygen of
the alcohol. The amino acids are, however, much more different in
their physical and chemical properties than their similarity might
suggest.
Cysteine also plays a key role in stabilizing
extracellular proteins. Cysteine can react with itself to form an
oxidized dimer by formation of a disulfide bond. The environment
within a cell is too strongly reducing for disulfides to form, but
in the extracellular environment, disulfides can form and play a
key role in stabilizing many such proteins, such as the digestive
enzymes of the small intestine.
Glutamic Acid E (Glu)

Glutamic acid has one more methylene group in its
side chain than does aspartic acid. The side chain carboxyl of
aspartic acid is referred to as the β carboxyl group, while that
of glutamic acid is referred to as the γ carboxyl group.
The pKa of the γ carboxyl group for glutamic acid in
a polypeptide is about 4.3, significantly higher than that of
aspartic acid. This is due to the inductive effect of the
additional methylene group. In some proteins, a vitamin
K-dependent carboxylase converts some glutamic acid residues to the
dicarboxylic acid γ-carboxyglutamic acid, which forms tight
binding sites for calcium ions.
Glutamine Q (Gln)

Glutamine is the amide of glutamic acid, and is
uncharged under all biological conditions.
The single additional methylene group in the side chain
relative to asparagine allows glutamine in the free form or as the
N-terminus of proteins to spontaneously cyclize and deamidate,
yielding the five-membered ring structure pyrrolidone carboxylic
acid, which is found at the N-terminus of many immunoglobulin
polypeptides. This causes obvious difficulties with amino acid
sequence determination.
Glycine G (Gly)

Glycine is the smallest of the amino acids. It is
ambivalent, meaning that it can be inside or outside of the
protein molecule. In aqueous solution at or near neutral pH,
glycine will exist predominantly as the zwitterion.
The isoelectric point or isoelectric pH of glycine will be
centered between the pKas of the two ionizable groups,
the amino group and the carboxylic acid group.
In estimating the pKa of a functional group, it is
important to consider the molecule as a whole. For example,
glycine is a derivative of acetic acid, and the pKa of
acetic acid is well known. Alternatively, glycine could be
considered a derivative of aminoethane.
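The statement that the isoelectric point lies midway between the two pKas can be written as a one-line calculation. In the sketch below, the pKa values 2.34 (carboxyl group) and 9.60 (amino group) are commonly tabulated figures for free glycine and are an assumption here, since the text does not quote them; the function name is also illustrative.

```python
def isoelectric_point(pka_carboxyl: float, pka_amino: float) -> float:
    """Midpoint of the two pKas; valid only when there are exactly two
    ionizable groups, as in glycine."""
    return (pka_carboxyl + pka_amino) / 2.0


# Assumed textbook pKa values for free glycine (not given in the text).
GLYCINE_PKA_CARBOXYL = 2.34
GLYCINE_PKA_AMINO = 9.60

pI = isoelectric_point(GLYCINE_PKA_CARBOXYL, GLYCINE_PKA_AMINO)
print(f"Estimated pI of glycine: {pI:.2f}")
```

For amino acids with an ionizable side chain, this simple midpoint no longer applies, because three groups contribute to the net charge.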
Histidine H (His)

Histidine, an essential amino acid, has a
positively charged imidazole functional group.
The imidazole makes it a common participant in enzyme catalyzed
reactions. The unprotonated imidazole is nucleophilic and can
serve as a general base, while the protonated form can serve as a
general acid. The residue can also serve a role in stabilizing the
folded structures of proteins.
Isoleucine I (Ile)

Isoleucine, an essential amino acid, is
one of the three amino acids having branched hydrocarbon side
chains. It is usually interchangeable with leucine and
occasionally with valine in proteins.
The side chains of these amino acids are not reactive and
therefore not involved in any covalent chemistry in enzyme active
centers.
However, these residues are critically important for ligand
binding to proteins, and play central roles in protein stability.
Note also that the β carbon of isoleucine is optically active,
as is the β carbon of threonine. These two amino acids,
isoleucine and threonine, have in common the fact that they have
two chiral centers.
Leucine L (Leu)

Leucine, an essential amino acid, is one of the
three amino acids with a branched hydrocarbon side chain. It has
one additional methylene group in its side chain compared with
valine.
Like valine, leucine is hydrophobic and generally buried
in folded proteins.
Lysine K (Lys)

Lysine, an essential amino acid, has a
positively charged ε-amino group (a primary amine).
Lysine is basically alanine with a propylamine substituent on
the β carbon. The ε-amino group has a significantly higher pKa
(about 10.5 in polypeptides) than does the α-amino group.
The amino group is highly reactive and often participates in
reactions at the active centers of enzymes. Proteins only have one
α amino group, but numerous ε amino groups. However, the higher pKa
renders the lysyl side chains effectively less nucleophilic.
Specific environmental effects in enzyme active centers can lower
the pKa of the lysyl side chain such that it becomes reactive.
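The link between pKa and nucleophilicity can be made concrete with the Henderson-Hasselbalch relation, since only the unprotonated amine acts as a nucleophile. The sketch below uses the pKa of about 10.5 quoted above; the lowered pKa of 7.5 is a purely hypothetical active-site value chosen for illustration, and the function name is an assumption.

```python
def unprotonated_fraction(pH: float, pKa: float) -> float:
    """Fraction of an amine present as the neutral, nucleophilic -NH2 form,
    from the Henderson-Hasselbalch relation."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


PH = 7.0
typical = unprotonated_fraction(PH, 10.5)   # pKa quoted in the text
perturbed = unprotonated_fraction(PH, 7.5)  # hypothetical active-site pKa

print(f"pKa 10.5: {typical:.2e} of side chains unprotonated")
print(f"pKa  7.5: {perturbed:.2f} of side chains unprotonated")
```

Under these assumptions, lowering the pKa by three units raises the reactive fraction by several hundredfold at neutral pH.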
Methionine M (Met)

Methionine, an essential amino acid, is one of
the two sulfur-containing amino acids. The side chain is quite
hydrophobic and methionine is usually found buried within
proteins. Unlike cysteine, the sulfur of methionine is not highly
nucleophilic, although it will react with some electrophilic
centers. It is generally not a participant in the covalent
chemistry that occurs in the active centers of enzymes.
The chemical linkage of the sulfur in methionine is a
thioether. Compare this terminology with that of the oxygen-containing
ethers. The sulfur of methionine, as with that of cysteine, is
prone to oxidation. The first step, yielding methionine sulfoxide,
can be reversed by standard thiol-containing reducing agents. The
second step yields methionine sulfone, and is effectively
irreversible. It is thought that oxidation of the sulfur in a
specific methionine of the elastase inhibitor in human lung tissue
by agents in cigarette smoke is one of the causes of
smoking-induced emphysema.
Methionine as the free amino acid plays several important roles
in metabolism. It can react to form S-adenosyl-L-methionine (SAM),
which serves as a methyl donor in reactions.
Phenylalanine F (Phe)

As the name suggests, phenylalanine, an essential
amino acid, is a derivative of alanine with a phenyl substituent
on the β carbon. Phenylalanine is quite hydrophobic and
even the free amino acid is not very soluble in water.
It is an interesting point of history that Marshall Nirenberg
and Heinrich Matthaei in their earliest experiments were studying the
translation of the synthetic message polyU, which encodes
polyphenylalanine. It was a happy coincidence that the product was
insoluble. At the time, they did not know that UUU encodes Phe,
but soon after the precipitate formed in their translation mix,
they did, and they were on the way to unraveling the genetic code,
and Nirenberg to the Nobel Prize.
Due to its hydrophobicity, phenylalanine is nearly
always found buried within a protein. The π electrons of the
phenyl ring can stack with other aromatic systems and often do
within folded proteins, adding to the stability of the structure.
Proline P (Pro)

Proline is formally NOT an amino acid, but an imino
acid. Nonetheless, it is called an amino acid. The primary
amine on the α carbon of glutamate semialdehyde forms a Schiff
base with the aldehyde which is then reduced, yielding proline.
When proline is in a peptide bond, it does not have a hydrogen
on the α amino group, so it cannot donate a hydrogen bond to
stabilize an α helix or a β sheet. It is often said, inaccurately,
that proline cannot exist in an α helix. When proline is found in
an α helix, the helix will have a slight bend due to the lack of
the hydrogen bond.
Proline is often found at the end of an α helix or in turns or
loops. Unlike other amino acids, which exist almost exclusively in
the trans form in polypeptides, proline can exist in the
cis configuration in peptides. The cis and trans
forms are nearly isoenergetic. The cis/trans isomerization
can play an important role in the folding of proteins and will be
discussed more in that context.
Serine S (Ser)

Serine differs from alanine in that one of the
methylenic hydrogens is replaced by a hydroxyl group.
Serine is one of two hydroxyl amino acids. Both are commonly
considered to be hydrophilic due to the hydrogen bonding
capacity of the hydroxyl group.
Threonine T (Thr)

Threonine, an essential amino acid, is a
hydrophilic molecule.
Threonine is another hydroxyl-containing amino acid. It
differs from serine by having a methyl substituent in place of one
of the hydrogens on the β carbon and it differs from valine by
replacement of a methyl substituent with a hydroxyl group.
Tryptophan W (Trp)

Tryptophan, an essential amino
acid, is the largest of the amino acids. It is also a derivative
of alanine, having an indole substituent on the β carbon. The
indole functional group absorbs strongly in the near ultraviolet
part of the spectrum. The indole nitrogen can donate a hydrogen
bond, and as a result, tryptophan, or at least the nitrogen, is
often in contact with solvent in folded proteins.
Tyrosine Y (Tyr)

Tyrosine, a conditionally essential amino acid, is also
an aromatic amino acid and is derived from phenylalanine by
hydroxylation in the para position. While tyrosine is
hydrophobic, it is significantly more soluble than is
phenylalanine. The phenolic hydroxyl of tyrosine is significantly
more acidic than are the aliphatic hydroxyls of either serine or
threonine, having a pKa of about 9.8 in polypeptides.
As with all ionizable groups, the precise pKa will
depend to a major degree upon the environment within the protein.
Tyrosines that are on the surface of a protein will generally have
a lower pKa than those that are buried within a
protein; ionization yielding the phenolate anion would be
exceedingly unstable in the hydrophobic interior of a protein.
Tyrosine absorbs ultraviolet radiation and contributes to the
absorbance spectra of proteins. The absorbance spectrum of
tyrosine will be shown later; at 280 nm the extinction of tyrosine
is only about 1/5 that of tryptophan. Tryptophan is therefore
usually the primary contributor to the UV absorbance of proteins,
depending upon the number of residues of each in the protein.
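The dependence of UV absorbance on the number of tryptophan and tyrosine residues can be sketched as a simple sum. The per-residue extinction coefficients below (5500 and 1490 per molar per centimeter at 280 nm) are widely used approximate values and are an assumption here, not figures from the text; they put tyrosine at roughly a quarter of tryptophan, in line with the rough ratio quoted above. The function name and the example sequence are hypothetical.

```python
# Assumed approximate molar extinction coefficients at 280 nm (M^-1 cm^-1).
# These are commonly used estimates, not values given in the text.
EPSILON_280 = {"W": 5500, "Y": 1490}


def estimate_extinction_280(sequence: str) -> int:
    """Estimate a protein's extinction coefficient at 280 nm by summing
    the contributions of its tryptophan (W) and tyrosine (Y) residues."""
    seq = sequence.upper()
    return sum(eps * seq.count(aa) for aa, eps in EPSILON_280.items())


# Hypothetical peptide with one tryptophan and two tyrosines.
example = "MKWVYAGYL"
print(estimate_extinction_280(example))
```

A protein with no tryptophan can still absorb at 280 nm through its tyrosines, but it takes several tyrosines to match a single tryptophan.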
Valine V (Val)

Valine, an essential amino acid, is hydrophobic,
and as expected, is usually found in the interior of proteins.
Valine differs from threonine by replacement of the
hydroxyl group with a methyl substituent. Valine is often referred
to as one of the amino acids with hydrocarbon side chains, or as a
branched chain amino acid.
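The side-chain descriptions above can be collected into a small lookup for summarizing the composition of a sequence. The grouping below is a simplified reading of this section (for example, it lists only the residues explicitly described as hydrophobic, positively charged, acidic, amides, or hydroxyl amino acids), and the group names, function name, and example sequence are illustrative assumptions.

```python
from collections import Counter

# Simplified groups based on the descriptions in this section.
GROUPS = {
    "hydrophobic": set("AVLIMF"),
    "positive": set("RKH"),
    "acidic": set("DE"),
    "amide": set("NQ"),
    "hydroxyl": set("ST"),
}


def composition_by_group(sequence: str) -> dict:
    """Count how many residues of a one-letter sequence fall in each group."""
    counts = Counter(sequence.upper())
    return {
        name: sum(counts[aa] for aa in members)
        for name, members in GROUPS.items()
    }


# Hypothetical sequence used only to show the output shape.
print(composition_by_group("MKTAYIAKQRDE"))
```

Residues that fit no group here (such as glycine, proline, cysteine, tryptophan, and tyrosine) are simply not counted in this sketch.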
Protein
Proteins are large organic compounds made of amino acids
arranged in a linear chain and joined together between the
carboxyl carbon of one amino acid and the amine nitrogen of another.
This bond is called a peptide bond. The sequence of amino acids in
a protein is defined by a gene and encoded in the genetic code.
Although this genetic code specifies 20 "standard" amino acids,
the residues in a protein are often chemically altered in
post-translational modification: either before the protein can
function in the cell, or as part of control mechanisms. Proteins
can also work together to achieve a particular function, and they
often associate to form stable complexes.
Like other biological macromolecules such as polysaccharides
and nucleic acids, proteins are essential parts of all living
organisms and participate in every process within cells. Many
proteins are enzymes that catalyze biochemical reactions, and are
vital to metabolism. Other proteins have structural or mechanical
functions, such as the proteins in the cytoskeleton, which forms a
system of scaffolding that maintains cell shape. Proteins are also
important in cell signaling, immune responses, cell adhesion, and
the cell cycle. Protein is also a necessary component in our diet,
since animals cannot synthesise all the amino acids and must
obtain essential amino acids from food. Through the process of
digestion, animals break down ingested protein into free amino
acids that can be used for protein synthesis.
The name protein comes from the Greek πρώτα ("prota"),
meaning "of primary importance". Proteins were first described
and named by Jöns Jakob Berzelius in 1838. However, their central
role in living organisms was not fully appreciated until 1926,
when James B. Sumner showed that the enzyme urease was a protein.
Insulin and myoglobin were among the first proteins to be
characterized in detail: Sir Frederick Sanger determined the amino
acid sequence of insulin, for which he won a 1958 Nobel Prize, and
Max Perutz and Sir John Cowdery Kendrew solved the structure of
myoglobin in 1958. Both proteins' three-dimensional
structures were amongst the first determined by X-ray diffraction
analysis; the myoglobin structure won the Nobel Prize in Chemistry
for its discoverers.
Biochemistry of Proteins
Proteins are linear polymers built from 20 different
L-alpha-amino acids. All amino acids share common structural
features including an alpha carbon to which an amino group, a
carboxyl group, and a variable side chain are bonded. Only proline
differs from this basic structure, as it contains an unusual ring
to the N-end amine group, which forces the CO-NH amide moiety
into a fixed conformation. The side chains of the standard amino
acids, detailed in the list of standard amino acids, have varying
chemical properties that produce proteins' three-dimensional
structure and are therefore critical to protein function. The
amino acids in a polypeptide chain are linked by peptide bonds
formed in a dehydration reaction. Once linked in the protein
chain, an individual amino acid is called a residue and the
linked series of carbon, nitrogen, and oxygen atoms are known as
the main chain or protein backbone. The peptide bond
has two resonance forms that contribute some double bond character
and inhibit rotation around its axis, so that the alpha carbons
are roughly coplanar. The other two dihedral angles in the peptide
bond determine the local shape assumed by the protein backbone.
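Because each peptide bond forms in a dehydration reaction, a chain of n residues weighs less than its n free amino acids by n - 1 water molecules. The sketch below illustrates that bookkeeping; the rounded molar masses for a handful of free amino acids and for water are standard values added here as assumptions, and the function name is illustrative.

```python
WATER = 18.015  # g/mol, lost once per peptide bond formed

# Rounded molar masses of a few free amino acids (g/mol); illustration only.
FREE_AMINO_ACID_MASS = {
    "G": 75.07,   # glycine
    "A": 89.09,   # alanine
    "S": 105.09,  # serine
    "V": 117.15,  # valine
    "L": 131.17,  # leucine
}


def peptide_mass(sequence: str) -> float:
    """Molar mass of a linear peptide: the sum of the free amino acid
    masses minus one water per peptide bond (n - 1 bonds for n residues)."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    total = sum(FREE_AMINO_ACID_MASS[aa] for aa in seq)
    return total - (len(seq) - 1) * WATER


# A dipeptide loses one water; a tripeptide loses two.
print(f"Gly-Ala:     {peptide_mass('GA'):.2f} g/mol")
print(f"Gly-Ala-Ser: {peptide_mass('GAS'):.2f} g/mol")
```

The same subtraction explains why the units of a protein chain are called residues: each is what remains of an amino acid after water is removed.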
Due to the chemical structure of the individual amino acids,
the protein chain has directionality. The end of the protein with
a free carboxyl group is known as the C-terminus or carboxy
terminus, while the end with a free amino group is known as the
N-terminus or amino terminus.
There is some ambiguity between the usage of the words
protein, polypeptide, and peptide. Protein
is generally used to refer to the complete biological molecule in
a stable conformation, while peptide is generally reserved
for short amino acid oligomers often lacking a stable
3-dimensional structure. However, the boundary between the two is
ill-defined and usually lies near 20-30 residues. Polypeptide
can refer to any single linear chain of amino acids, usually
regardless of length, but often implies an absence of a single
defined conformation.
Structure of proteins
Most proteins fold into unique 3-dimensional structures. The
shape into which a protein naturally folds is known as its native
state. Although many proteins can fold unassisted simply through
the structural propensities of their component amino acids, others
require the aid of molecular chaperones to efficiently fold to
their native states. Biochemists often refer to four distinct
aspects of a protein's structure:
- Primary structure: the amino acid
sequence
- Secondary structure: regularly
repeating local structures stabilized by hydrogen bonds. The
most common examples are the alpha helix and beta sheet. Because
secondary structures are local, many regions of different
secondary structure can be present in the same protein molecule.
- Tertiary structure: the overall
shape of a single protein molecule; the spatial relationship of
the secondary structures to one another. Tertiary structure is
generally stabilized by nonlocal interactions, most commonly the
formation of a hydrophobic core, but also through salt bridges,
hydrogen bonds, disulfide bonds, and even post-translational
modifications. The term "tertiary structure" is often used as
synonymous with the term fold.
- Quaternary structure: the shape or
structure that results from the interaction of more than one
protein molecule, usually called protein subunits in this
context, which function as part of the larger assembly or
protein complex.
In addition to these levels of structure, proteins may shift
between several related structures in performing their biological
function. In the context of these functional rearrangements, these
tertiary or quaternary structures are usually referred to as
"conformations," and transitions between them are called
conformational changes. Such changes are often induced by the
binding of a substrate molecule to an enzyme's active site, or the
physical region of the protein that participates in chemical
catalysis. |