Proteins are critical to the functioning of all life. They make up roughly half of the dry weight of a typical living cell and are central to many biological processes and systems, including maintaining structure, catalyzing reactions and cell signaling. The basic subunit, or “building block”, of a protein is the amino acid. These small, varied molecules are linked in chains to give proteins their well-known basic structure, known as the peptide, but it is the specific composition and order of these subunits that give each protein its unique three-dimensional structure and function.
The fundamental structure of all amino acids is the same. Each has an amine group on one end and a carboxyl group on the other. The amine group of one amino acid links to the carboxyl group of its neighbor through a peptide bond, and repeating this link forms the linear chain that is the primary structure of the peptide. Attached to this backbone at each amino acid is a group known as the side chain, which gives the amino acid its specific properties. For example, lysine has a long hydrocarbon side chain with an amine group on the end and generally carries a positive charge at physiological pH. Leucine, on the other hand, has an entirely hydrocarbon side chain that makes it hydrophobic. It is the combination of these different properties along the chain of a peptide that creates the folded structure of the protein and gives it its function.
When an organism needs to synthesize a specific protein, the corresponding mRNA that carries the genetic code for that protein is recruited and its information is “decoded”. Triplets of bases in the mRNA, known as codons, each correspond to one amino acid. For instance, the amino acid methionine is encoded by AUG (adenine, uracil and guanine). Given that there are 4 bases, there are 4 × 4 × 4 = 64 possible triplets, but the standard genetic code specifies only 20 amino acids directly (the other two of the 22 proteinogenic amino acids, selenocysteine and pyrrolysine, are incorporated through special mechanisms that reinterpret a stop codon). As a result, most amino acids are encoded by more than one codon, a property known as degeneracy of the genetic code. (Note: three of the codons encode “stops”, which signal that a particular peptide chain is at its end.)
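The decoding step described above can be sketched in a few lines of Python. This is only an illustration, not how a cell or any particular bioinformatics library does it: the names CODON_TABLE, translate and codons_per_amino_acid are made up for this example, the table covers only the 20 amino acids that the standard genetic code specifies directly (written with their one-letter codes, with “*” marking a stop), and the reading frame is assumed to start at the first base of the string.

```python
from collections import Counter
from itertools import product

# Build the standard genetic code as a lookup table.
# Codons are generated in the order U, C, A, G for each position, and the
# string below lists the matching one-letter amino acid codes ("*" = stop).
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # codons starting with U
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna):
    """Read an mRNA string three bases at a time until a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":   # stop codon: the chain ends here
            break
        peptide.append(amino_acid)
    return "".join(peptide)

# Degeneracy: how many different codons map to each amino acid (or stop).
codons_per_amino_acid = Counter(CODON_TABLE.values())

print(len(CODON_TABLE))               # 64 possible triplets
print(translate("AUGAAACUGUAA"))      # methionine, lysine, leucine, then stop
print(codons_per_amino_acid["L"])     # leucine is encoded by several codons
```

In this sketch the sequence AUGAAACUGUAA is read as AUG (methionine), AAA (lysine), CUG (leucine) and UAA (stop), and the counter makes the degeneracy visible: methionine has a single codon while leucine has six.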
There are 22 standard amino acids that are present in proteins and are encoded either directly or indirectly by the genetic code. These are known as proteinogenic amino acids. Nonproteinogenic amino acids either are not present in proteins or are present but not encoded by the genetic code, as happens when a standard amino acid is modified after translation. Of the standard amino acids, nine are known as “essential amino acids” because they cannot be synthesized by the human body and must be consumed in our diet. Of the remaining non-essential amino acids, some become essential in certain genetic disorders or other circumstances (for example, very young children cannot yet synthesize cysteine).