HIV structure and genome

Associate Editors-in-Chief: Ujjwal Rastogi, MBBS [mailto:urastogi@perfuse.org]

Overview
The genome and proteins of HIV have been the subject of extensive research since the discovery of the virus in 1983. No two HIV genomes are identical, even when taken from the same person, which has led some to describe HIV as a viral "quasispecies".

Structure


HIV differs in structure from other retroviruses. It is roughly spherical and around 120 nm in diameter (120 billionths of a meter; around 60 times smaller than a red blood cell).
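The size comparison can be checked with a quick calculation. This is a rough sketch that assumes a red blood cell diameter of about 7 micrometers, a figure this article does not state; the variable names are illustrative only.

```python
# Rough size comparison between an HIV virion and a red blood cell.
HIV_DIAMETER_NM = 120      # diameter given in the text
RBC_DIAMETER_NM = 7_000    # assumption: about 7 micrometers, not from this article

ratio = RBC_DIAMETER_NM / HIV_DIAMETER_NM
print(f"A red blood cell is roughly {ratio:.0f} times wider than an HIV virion")
```

Under that assumption the ratio lands close to the "around 60 times" quoted in the text.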

HIV-1 is composed of two copies of single-stranded RNA enclosed by a conical capsid comprising the viral protein p24, typical of lentiviruses (Figure 1). The RNA component is 9749 nucleotides long. The capsid is in turn surrounded by a plasma membrane of host-cell origin. The single-stranded RNA is tightly bound to the nucleocapsid protein p7 and to enzymes that are indispensable for the development of the virion, such as reverse transcriptase and integrase. The nucleocapsid (p7 and p6) associates with the genomic RNA (one molecule per hexamer) and protects the RNA from digestion by nucleases. A matrix composed of the viral protein p17 surrounds the capsid, ensuring the integrity of the virion particle. Also enclosed within the virion particle are Vif, Vpr, Nef, p7 and viral protease (Figure 1). The envelope is formed when the capsid buds from the host cell, taking some of the host-cell membrane with it. The envelope includes the glycoproteins gp120 and gp41.

Recently, an Anglo-German team compiled a 3D structure of HIV by combining multiple images. It is hoped that this new information will contribute to scientific understanding of the virus and help in the creation of a cure. Oxford University's Professor Stephen D. Fuller said the 3D map would assist in understanding how the virus grows. The validity of this work remains a matter of debate: a conflicting model has been produced by another team, led by Florida State University Professor Kenneth Roux in the US.

Genome organization


HIV has several major genes coding for structural proteins that are found in all retroviruses, and several nonstructural ("accessory") genes that are unique to HIV. The gag gene provides the basic physical infrastructure of the virus, and pol provides the basic mechanism by which retroviruses reproduce, while the others help HIV to enter the host cell and enhance its reproduction. Though they may be altered by mutation, all of these genes except tev exist in all known variants of HIV; see Genetic variability of HIV.


 * gag (Group-specific Antigen): Codes for p24, the viral capsid; p6 and p7, the nucleocapsid proteins; and p17, a matrix protein.


 * pol: Codes for viral enzymes, the most important of which are reverse transcriptase, integrase, and protease. Protease cleaves the proteins derived from gag and pol into functional proteins.


 * env (for "envelope"): Codes for the precursor to gp120 and gp41, proteins embedded in the viral envelope which enable the virus to attach to and fuse with target cells.


 * tat, rev, nef, vif, vpr, vpu: Each of these genes codes for a single protein of the same name; see Tat, Rev, Nef, Vif, Vpr, Vpu.


 * tev: This gene is only present in a few HIV-1 isolates. It is a fusion of parts of the tat, env, and rev genes, and codes for a protein with some of the properties of Tat, but little or none of the properties of Rev.
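The gene list above can be summarized as a small lookup table. The following is only an illustrative sketch: the dictionary `HIV_GENES` and the helper `gene_for` are hypothetical names, and tev is left out because it occurs in only a few isolates and the list does not name its product.

```python
# Gene-to-product map transcribed from the list above.
HIV_GENES = {
    "gag": ["p24", "p6", "p7", "p17"],
    "pol": ["reverse transcriptase", "integrase", "protease"],
    "env": ["gp160"],  # precursor that is later cleaved into gp120 and gp41
    "tat": ["Tat"],
    "rev": ["Rev"],
    "nef": ["Nef"],
    "vif": ["Vif"],
    "vpr": ["Vpr"],
    "vpu": ["Vpu"],
}


def gene_for(product: str) -> str:
    """Return the gene whose listed products include `product`."""
    for gene, products in HIV_GENES.items():
        if product in products:
            return gene
    raise KeyError(f"no gene listed for {product!r}")


print(gene_for("p24"))
print(gene_for("integrase"))
```

A flat dictionary is enough here because each product in the list is attributed to exactly one gene.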

Gag
These proteins are encoded by the gag gene, and provide structural elements of the virus.

p24
p24 makes up the viral capsid.

When a Western blot test is used to detect HIV infection, p24 is one of the three major proteins tested for, along with gp120/gp160 and gp41.

p6, p7, and p17
p6 and p7 provide the nucleocapsid.

p17 provides a protective matrix.

Reverse transcriptase
Common to all retroviruses, this enzyme transcribes the viral RNA into double-stranded DNA.

Integrase
This enzyme integrates the DNA produced by reverse transcriptase into the host's genome.

Protease
A protease is any enzyme that cuts proteins into segments. HIV's gag and pol genes do not produce their proteins in their final form, but as larger combination proteins; the specific protease used by HIV cleaves these into separate functional units. Protease inhibitor drugs block this step.
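The cleavage step can be pictured with a toy model. This sketch treats the gag polyprotein as an ordered list of its four named products; the order shown and the function `cleave` are simplifying assumptions for illustration, and any linker segments between the products are ignored.

```python
# Toy model of polyprotein processing by the viral protease.
# Assumption: the gag polyprotein is represented as four fused units in this order.
GAG_POLYPROTEIN = ["p17", "p24", "p7", "p6"]


def cleave(polyprotein, protease_active=True):
    """Return the protein units present after the processing step.

    With an active protease every unit is released separately; with the
    protease blocked (for example by an inhibitor drug) the units stay
    fused as one non-functional chain.
    """
    if not protease_active:
        return ["-".join(polyprotein)]
    return list(polyprotein)


print(cleave(GAG_POLYPROTEIN))
print(cleave(GAG_POLYPROTEIN, protease_active=False))
```

The second call mimics the effect of a protease inhibitor: the polyprotein is still made, but it is never separated into functional units.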

Env
The env gene does not actually code for gp120 and gp41, but for a precursor to both, gp160. During HIV reproduction, the host cell's own enzymes cleave gp160 into gp120 and gp41. See Replication cycle of HIV.

gp120
Exposed on the surface of the viral envelope, the glycoprotein gp120 binds to the CD4 receptor on any target cell that has such a receptor, particularly the helper T-cell. See HIV tropism and Replication cycle of HIV.

Since CD4 receptor binding is the most obvious step in HIV infection, gp120 was among the first targets of HIV vaccine research. These efforts have been hampered by its chemical properties, which make it difficult for antibodies to bind to gp120; also, it can easily be shed from the virus due to its loose binding with gp41.

gp41
The glycoprotein gp41 is non-covalently bound to gp120, and provides the second step by which HIV enters the cell. It is originally buried within the viral envelope, but when gp120 binds to a CD4 receptor, gp120 changes its conformation, causing gp41 to become exposed so that it can assist in fusion with the host cell.

Fusion inhibitor drugs such as enfuvirtide block the fusion process by binding to gp41.

Tat
Stands for "Trans-Activator of Transcription". Tat consists of between 86 and 101 amino acids depending on the subtype.

Tat helps HIV reproduce by compensating for a defect in its genome: the HIV RNA initially has a hairpin-structured portion which prevents full transcription from occurring. However, a small number of RNA transcripts will be made, which allow the Tat protein to be produced. Tat then binds to cellular factors and mediates their phosphorylation, eliminating the effect of the hairpin RNA structure and allowing transcription of the HIV DNA.

The resulting transcription produces more Tat, which further increases the rate of transcription, providing a positive feedback cycle. This in turn allows HIV to mount an explosive response once a threshold amount of Tat is produced, a useful tool for defeating the body's response.
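The threshold behavior of such a positive feedback cycle can be illustrated with a toy simulation. This is a sketch under stated assumptions, not a measured model of Tat: the rate equation (a small basal production term, a saturating self-activation term, and linear decay) and every parameter value are hypothetical and chosen only to show the effect.

```python
def simulate_tat(initial_tat, steps=2000, dt=0.1,
                 basal=0.01, vmax=1.0, half_max=1.0, decay=0.45):
    """Euler-integrate dT/dt = basal + vmax*T^2/(half_max^2 + T^2) - decay*T.

    All parameters are hypothetical, in arbitrary units.
    """
    tat = initial_tat
    for _ in range(steps):
        # Self-activation: more Tat means faster production, up to vmax.
        feedback = vmax * tat**2 / (half_max**2 + tat**2)
        tat += dt * (basal + feedback - decay * tat)
    return tat


below = simulate_tat(initial_tat=0.4)  # starts under the tipping point
above = simulate_tat(initial_tat=0.8)  # starts over the tipping point
print(f"start below threshold -> {below:.2f}")
print(f"start above threshold -> {above:.2f}")
```

With these made-up values, a starting amount under the tipping point decays back to a low level, while a starting amount over it climbs to a much higher steady level.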

Tat also appears to play a more direct role in the HIV disease process. The protein is released by infected cells in culture, and is found in the blood of HIV-1 infected patients.

It can be absorbed by cells that are not infected with HIV, and can act directly as a toxin producing cell death via apoptosis in uninfected "bystander" T cells, assisting in progression toward AIDS.

By interacting with the CXCR4 receptor, Tat also appears to encourage the reproduction of less virulent M-tropic strains of HIV early in the course of infection, allowing the more rapidly pathogenic T-tropic strains to emerge later.

Rev


Stands for "Regulator of Expression of Virion proteins". This protein allows fragments of HIV mRNA that contain a Rev Response Element (RRE) to be exported from the nucleus to the cytoplasm. In the absence of Rev, RNA splicing machinery in the nucleus quickly splices the RNA so that only the smaller, regulatory proteins can be produced; in the presence of Rev, RNA is exported from the nucleus before it can be spliced, so that the structural proteins and RNA genome can be produced. Again, this mechanism provides a positive feedback loop that allows HIV to overwhelm the host's defenses, and provides time-dependent regulation of replication (a common process in viral infections).
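The export-versus-splicing switch described above reduces to a simple decision rule. The function below is a hypothetical illustration of that rule for a single transcript, not a model of the underlying biochemistry.

```python
def products_from_transcript(has_rre: bool, rev_present: bool) -> str:
    """Toy decision rule for the fate of one HIV transcript in the nucleus."""
    if has_rre and rev_present:
        # The RRE-containing RNA is exported before splicing can act on it.
        return "structural proteins and RNA genome"
    # Otherwise the transcript is spliced before it leaves the nucleus.
    return "regulatory proteins only"


print(products_from_transcript(has_rre=True, rev_present=False))
print(products_from_transcript(has_rre=True, rev_present=True))
```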

Vpr
Stands for "Viral Protein R". Vpr, a 96-amino-acid, 14-kDa protein, plays an important role in regulating nuclear import of the HIV-1 pre-integration complex, and is required for virus replication in non-dividing cells such as macrophages. Vpr also induces cell cycle arrest and apoptosis in proliferating cells, which can result in immune dysfunction.

Vpr is also immunosuppressive due to its ability to sequester a proinflammatory transcriptional activator in the cytoplasm. HIV-2 contains both a Vpr protein and a related (by sequence homology) Vpx protein (Viral Protein X). Two functions of Vpr in HIV-1 are split between Vpr and Vpx in HIV-2, with the HIV-2 Vpr protein inducing cell cycle arrest and the Vpx protein required for nuclear import.

Nef
Stands for "Negative Regulatory Factor". The expression of Nef early in the viral life cycle ensures T cell activation and the establishment of a persistent state of infection, two basic attributes of HIV infection. Nef also promotes the survival of infected cells by downmodulating the expression of several surface molecules important in host immune function. These include major histocompatibility complex-I (MHC I) and MHC II, present on antigen presenting cells (APCs) and target cells, and CD4 and CD28, present on CD4+ T cells. One group of patients in Sydney was infected with a nef-deleted virus and took much longer than expected to progress to AIDS.

A nef-deleted virus vaccine has not been trialed in humans and has failed in nonhuman animals. HIV-1 Nef-induced FasL expression and bystander killing require p38 MAPK activation.

Vif
Stands for "Viral infectivity factor". Vif is a 23-kilodalton protein that is essential for viral replication. Vif prevents the cellular protein APOBEC3G from entering the virion during budding from a host cell by targeting it for proteasomal degradation. To do so, Vif hijacks the cellular Cullin5 E3 ubiquitin ligase. In the absence of Vif, APOBEC3G causes hypermutation of the viral genome, rendering it dead-on-arrival at the next host cell. APOBEC3G is thus a host defence against retroviral infection which HIV-1 has overcome by the acquisition of Vif.
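The effect of Vif on the genome passed to the next cell can be sketched as follows. This toy example assumes that the hypermutation appears as G-to-A changes in the viral sequence, a detail this article does not state; the function name, the mutation rate, and the example sequence are all hypothetical.

```python
import random


def next_generation_genome(genome: str, vif_present: bool,
                           mutation_rate: float = 0.3, seed: int = 0) -> str:
    """Return the genome as it would arrive in the next host cell (toy model).

    Assumption: without Vif, each G is independently changed to A with
    probability `mutation_rate`. With Vif, the genome is left untouched.
    """
    if vif_present:
        return genome
    rng = random.Random(seed)  # seeded so the example is reproducible
    return "".join(
        "A" if base == "G" and rng.random() < mutation_rate else base
        for base in genome
    )


example = "GGATCCGGGTAGGC"  # made-up sequence, not real HIV data
print(next_generation_genome(example, vif_present=True))
print(next_generation_genome(example, vif_present=False))
```

Only the Vif-deficient case alters the sequence, which mirrors the "dead-on-arrival" outcome described in the text.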

Vpu
Stands for "Viral Protein U". Vpu is involved in viral budding, enhancing virion release from the cell.