(19)
(11)EP 2 705 150 B1

(12)EUROPEAN PATENT SPECIFICATION

(45)Mention of the grant of the patent:
29.07.2020 Bulletin 2020/31

(21)Application number: 12779797.5

(22)Date of filing:  02.05.2012
(51)International Patent Classification (IPC): 
C12N 15/33(2006.01)
C12N 15/11(2006.01)
G01N 33/569(2006.01)
C12N 5/10(2006.01)
C12Q 1/68(2018.01)
A61K 39/12(2006.01)
(86)International application number:
PCT/US2012/036098
(87)International publication number:
WO 2012/151263 (08.11.2012 Gazette  2012/45)

(54)

A SYNTHETIC HEPATITIS C GENOME AND METHODS OF MAKING AND USE

SYNTHETISCHES HEPATITIS-C GENOM SOWIE HERSTELLUNGS- UND ANWENDUNGSVERFAHREN DAFÜR

GÉNOME SYNTHÉTIQUE DE L'HÉPATITE C ET PROCÉDÉ DE SA PRODUCTION ET DE SON UTILISATION


(84)Designated Contracting States:
AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

(30)Priority: 02.05.2011 US 201161481457 P

(43)Date of publication of application:
12.03.2014 Bulletin 2014/11

(73)Proprietor: The Johns Hopkins University
Baltimore, MD 21218 (US)

(72)Inventors:
  • RAY, Stuart, Campbell
    Lutherville Maryland 21093 (US)
  • MUNSHAW, Supriya
    Baltimore Maryland 21231 (US)
  • LIU, Lin
    Gainesville Florida 32607 (US)

(74)Representative: Murgitroyd & Company 
Murgitroyd House 165-169 Scotland Street
Glasgow G5 8PL (GB)


(56)References cited:
WO-A1-2010/039154
US-A1- 2009 238 822
US-A- 6 027 729
US-B1- 7 235 394
  
  • DATABASE GENBANK [Online] 18 June 2009 'Hepatitis C virus subtype 1a polyprotein gene', XP003030258 Database accession no. AF009606
  • S. CHEVALIEZ ET AL.: 'Hepatitis C Virus (HCV) Genotype 1 Subtype Identification in New HCV Drug Development and Future Clinical Practice' PLOS ONE vol. 4, no. IS.12, December 2009, pages 1 - 9, XP055084010
  
Note: Within nine months from the publication of the mention of the grant of the European patent, any person may give notice to the European Patent Office of opposition to the European patent granted. Notice of opposition shall be filed in a written reasoned statement. It shall not be deemed to have been filed until the opposition fee has been paid. (Art. 99(1) European Patent Convention).


Description

BACKGROUND OF THE INVENTION



[0001] HCV is a small, enveloped virus of the family Flaviviridae with a 9.6-kb single-stranded, positive-sense RNA genome consisting of a 5' untranslated region (UTR), a large open reading frame encoding the virus-specific proteins, and a 3' UTR. The 5' UTR contains an internal ribosome entry site (IRES) that mediates translation of a single polyprotein of approximately 3000 amino acids. The polyprotein consists of structural proteins (core, E1, and E2) located in the N terminus, followed by p7 and nonstructural proteins (NS2, NS3, NS4A, NS4B, NS5A, and NS5B) encoded in the remainder.

[0002] While there is a recognized need for an effective HCV vaccine, selection of the viral strain to be used as an antigen has been arbitrary. Studies in humans and chimpanzees have shown that the host immune system is able to launch an effective response to HCV, and people who have cleared infection once are likely to do so again, though this effect is potentially attributable to host genetics. The genetic diversity of HCV, which is even greater than that of HIV, poses a great challenge to the development of an effective vaccine. Selection of an appropriate strain as a vaccine candidate is crucial since even a single amino acid substitution could reduce vaccine effectiveness by eliminating recognition by T cells specific for that epitope. Use of an ancestral or consensus sequence as a vaccine candidate has been proposed for HIV-1. Compared to a consensus sequence, a mosaic approach (including multiple variant sequences of individual epitopes) generated more vigorous T cell responses to HIV-1 epitopes. Mosaic candidates have recently been identified for HCV although their effectiveness is still unknown.

[0003] Hepatitis C virus (HCV) affects approximately 170 million people worldwide. Approximately 20-25% of patients with acute hepatitis C achieve spontaneous clearance of the virus but 75%-80% develop chronic infection. Approximately 20% of chronic hepatitis C patients develop cirrhosis and of these, 4% will develop hepatocellular carcinoma and 6% will develop end stage liver disease. There is no available HCV vaccine and commonly used interferon-based treatment is toxic, prolonged, expensive, not consistently successful, and not effective in the most advanced forms of disease.

[0004] US7,235,394 discloses the determination of HCV genome RNA sequences, construction of infectious HCV DNA clones and the use of the clones in therapeutic, vaccine and diagnostic applications. US7,235,394 discloses a nucleic acid molecule encoding the genome of a synthetic HCV subtype 1a.

[0005] As such, there still exists an unmet need for more effective tools for preparing antigens, antibodies and vaccines against HCV and related viruses.

SUMMARY OF THE INVENTION



[0006] In accordance with an embodiment, the present invention provides a nucleic acid molecule encoding the genome of a synthetic hepatitis C virus subtype 1a (Bole1a) comprising the nucleotide sequence of SEQ ID NO: 1, or the complement thereof.

[0007] In accordance with another embodiment, the present invention provides an isolated nucleic acid molecule that specifically hybridizes to the nucleotide sequence set forth in SEQ ID NO: 1 or to the complement thereof.

[0008] In accordance with a further embodiment, the present invention provides a pair of oligonucleotide primers for PCR, wherein the first primer is an isolated nucleic acid molecule between about 10 and about 30 nucleotides in length that specifically hybridizes to the nucleotide sequence set forth in SEQ ID NO: 1 and the second primer is an isolated nucleic acid molecule between about 10 and about 30 nucleotides in length that specifically hybridizes to the complement of the nucleotide sequence set forth in SEQ ID NO: 1.

[0009] In accordance with still another embodiment, the present invention provides an isolated polypeptide encoded by nucleic acid comprising a nucleotide sequence of SEQ ID NO: 1.

[0010] In accordance with yet a further embodiment, the present invention provides an isolated polypeptide having the amino acid sequence of SEQ ID NO: 2.

[0011] In accordance with an embodiment, the present invention provides a viral particle comprising a) a polynucleotide encoding the E1E2 region of SEQ ID NO:1 comprising the HVR1 sequence of SEQ ID NO: 3, and b) a reporter element.

[0012] Disclosed herein is an HCV antigen comprising a polynucleotide molecule encoding between 15 and 100 contiguous amino acids of the polypeptide encoded by the nucleotide sequence set forth in SEQ ID NO: 1.

[0013] Also disclosed is an antibody, or antigen binding portion thereof, which specifically binds to the nucleic acid molecule having the nucleotide sequence set forth in SEQ ID NO: 1.

[0014] Further disclosed is a method of testing a sample for the presence of HCV in the sample, the method comprising detecting the presence of a polypeptide in the sample that specifically binds to the antibody disclosed above.

[0015] Also disclosed is a method of treating a subject infected with HCV comprising administering to the subject, a pharmaceutical composition comprising an antigen as described above, in an amount sufficient to stimulate an immune response to the antigen in the subject, such that the immune response is sufficient to decrease the viral load of HCV in the subject.

BRIEF DESCRIPTION OF THE DRAWINGS



[0016] 

Figure 1 depicts neighbor-joining trees showing (a) Bole1a and the Yusim dataset and (b) Bole1a and the E1E2 dataset. The Bole1a sequence is shown in bold in both trees.

Figure 2(a) is a diversity plot comparing mean pairwise non-synonymous (dN) and synonymous (dS) diversity among subtype 1a sequences ("subtype 1a") to mean pairwise distance between Bole1a and subtype 1a sequences, using a sliding window size of 20 codons. For this comparison, the original dataset of 390 full-genome sequences was the source of polyprotein reference sequences. Figure 2(b) shows an alignment comparison of E1E2 using Bole1a as the reference sequence and consensus (of 390 sequences), H77, HCV-1 and a 1b (D90208) sequence. Vertical bars indicate positions with amino acid differences in the respective sequences compared to Bole1a, and the asterisk indicates the position of HVR1.

Figure 3(a) depicts that Bole1a (indicated by asterisk) is highly representative based on coverage of modal (most commonly observed) 9-mers provided by Bole1a and all other sequences in the Yusim dataset, and Figure 3(b) shows identity to known epitopes, depicted as a histogram showing the percentage of epitope sequences that are identical to the 338 known and common T cell epitopes.

Figure 4 shows the infectivity of various HCVpp in log10(RLU). The black dotted line represents the RLU threshold for infectious HCVpp. The leftmost group of bars depicts the average infectivities of Bole1a with media only, Bole1a with anti-CD81, and Bole1a with an isotype control, respectively. The middle group of bars depicts the average infectivities of H77 with media only, H77 with anti-CD81, and H77 with an isotype control, respectively. Error bars are standard deviations calculated from 3 experiments. The two bars on the right show the average infectivities of all subtype 1a HCVpp that are infective (solid frame) and non-infective (dashed frame). The error bars represent the standard deviation of infectivities.


DETAILED DESCRIPTION OF THE INVENTION



[0017] The present invention provides a synthetic HCV subtype 1a genome (Bole1a) which is useful for vaccine research and development, antigen production, antibody production, diagnostic testing and oligonucleotide primer or probe production, and other uses.

[0018] In accordance with one or more embodiments, the present invention provides a synthetic subtype 1a HCV genome. The resulting computationally derived genome is representative of widely circulating strains, has functional envelope genes that mediate entry into hepatoma cells in vitro, and matches more CD8+ T cell epitopes than any other subtype 1a sequence in GenBank, whether comparing all 9-mers or all known common epitopes.
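
As an illustration of the 9-mer comparison mentioned above, the following minimal Python sketch computes the fraction of modal (most commonly observed) 9-mers that a candidate sequence reproduces. It assumes gap-free amino acid sequences of equal length that are already aligned. The function names and the toy sequences are hypothetical and are not taken from the analysis actually performed for Bole1a.

```python
from collections import Counter


def modal_kmers(aligned_seqs, k=9):
    """Map each start position to the most commonly observed k-mer there.

    Assumption: sequences are pre-aligned, gap-free and of equal length.
    Ties are broken by first occurrence in the input order.
    """
    length = min(len(s) for s in aligned_seqs)
    modal = {}
    for i in range(length - k + 1):
        counts = Counter(s[i:i + k] for s in aligned_seqs)
        modal[i] = counts.most_common(1)[0][0]
    return modal


def modal_coverage(candidate, aligned_seqs, k=9):
    """Fraction of positions where the candidate carries the modal k-mer."""
    modal = modal_kmers(aligned_seqs, k)
    hits = sum(1 for i, kmer in modal.items() if candidate[i:i + k] == kmer)
    return hits / len(modal)


if __name__ == "__main__":
    # Toy data with k=3 so the example stays readable.
    dataset = ["ACDEFG", "ACDEFG", "ACDQFG"]
    for cand in dataset:
        print(cand, modal_coverage(cand, dataset, k=3))
```

Ranking every sequence in a dataset by this score, together with the candidate, gives one simple way to compare how representative each sequence is under these assumptions.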

[0019] In accordance with an embodiment, the present invention provides a nucleic acid molecule encoding the genome of a synthetic hepatitis C virus subtype 1a (Bole1a) comprising the nucleotide sequence of SEQ ID NO: 1, or the complement thereof.

[0020] "Nucleic acid" as used herein includes "polynucleotide," "oligonucleotide," and "nucleic acid molecule," and generally means a polymer of DNA or RNA, which can be single-stranded or double-stranded, synthesized or obtained (e.g., isolated and/or purified) from natural sources, which can contain natural, non-natural or altered nucleotides, and which can contain a natural, non-natural or altered internucleotide linkage, such as a phosphoramidate linkage or a phosphorothioate linkage, instead of the phosphodiester found between the nucleotides of an unmodified oligonucleotide. It is generally preferred that the nucleic acid does not comprise any insertions, deletions, inversions, and/or substitutions. However, it may be suitable in some instances, as discussed herein, for the nucleic acid to comprise one or more insertions, deletions, inversions, and/or substitutions.

[0021] In an embodiment, the nucleic acids of the invention are recombinant. As used herein, the term "recombinant" refers to (i) molecules that are constructed outside living cells by joining natural or synthetic nucleic acid segments to nucleic acid molecules that can replicate in a living cell, or (ii) molecules that result from the replication of those described in (i) above. For purposes herein, the replication can be in vitro replication or in vivo replication.

[0022] The nucleic acids can be constructed based on chemical synthesis and/or enzymatic ligation reactions using procedures known in the art. See, for example, Sambrook et al. (eds.), Molecular Cloning, A Laboratory Manual, 3rd Edition, Cold Spring Harbor Laboratory Press, New York (2001) and Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing Associates and John Wiley & Sons, NY (1994). For example, a nucleic acid can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed upon hybridization (e.g., phosphorothioate derivatives and acridine substituted nucleotides). Examples of modified nucleotides that can be used to generate the nucleic acids include, but are not limited to, 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxymethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-substituted adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, 3-(3-amino-3-N-2-carboxypropyl) uracil, and 2,6-diaminopurine. Alternatively, one or more of the nucleic acids of the invention can be purchased from companies, such as Macromolecular Resources (Fort Collins, CO) and Synthegen (Houston, TX).

[0023] The nucleic acid can comprise any nucleotide sequence that encodes any of the Bole1a polypeptides, or proteins, or fragments or functional portions or functional variants thereof. For example, the nucleic acid can comprise a nucleotide sequence comprising SEQ ID NO: 1, or alternatively can comprise a nucleotide sequence that is degenerate to SEQ ID NO: 1.

[0024] The invention also provides an isolated or purified nucleic acid comprising a nucleotide sequence which is complementary to the nucleotide sequence of any of the nucleic acids described herein or a nucleotide sequence which hybridizes under stringent conditions to the nucleotide sequence of any of the nucleic acids described herein. In an embodiment, the present invention provides a nucleic acid molecule which is complementary to the full length nucleotide sequence of SEQ ID NO: 1.

[0025] As defined herein, a functional portion or functional variant of Bole1a polypeptides, or proteins, includes, for example, any of the core, E1, E2, NS3, NS4, NS5, and their subunits, UTR antigen proteins, and fragments thereof.

[0026] The isolated nucleic acid molecule may comprise a nucleotide sequence which is substantially the same as, e.g., has at least 50%, e.g., 60%, 70%, 80% or 90% or more, contiguous nucleic acid sequence identity to SEQ ID NO: 1, or the complement thereof.
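
By way of illustration only, percent sequence identity of the kind recited above can be computed as in the following Python sketch. It assumes the two sequences have already been aligned to the same length (with "-" marking gaps) and counts a gap opposite a base as a mismatch. The function name and the toy sequences are hypothetical, and other identity conventions exist.

```python
def percent_identity(a, b):
    """Percent identity between two pre-aligned sequences.

    Assumptions (not from the specification): equal-length aligned input,
    '-' as the gap symbol, columns where both sequences have a gap are
    ignored, and a gap opposite a base counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy 10-nucleotide sequences differing at one position.
    print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```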

[0027] The nucleotide sequence which hybridizes under stringent conditions preferably hybridizes under high stringency conditions. By "high stringency conditions" is meant that the nucleotide sequence specifically hybridizes to a target sequence (the nucleotide sequence of any of the nucleic acids described herein) in an amount that is detectably stronger than non-specific hybridization. High stringency conditions include conditions which would distinguish a polynucleotide with an exact complementary sequence, or one containing only a few scattered mismatches, from a random sequence that happened to have a few small regions (e.g., 3-10 bases) that matched the nucleotide sequence. Such small regions of complementarity are more easily melted than a full-length complement of 14-17 or more bases, and high stringency hybridization makes them easily distinguishable. Relatively high stringency conditions would include, for example, low salt and/or high temperature conditions, such as provided by about 0.02-0.1 M NaCl or the equivalent, at temperatures of about 50-70 °C.

[0028] The nucleic acids of the invention can be incorporated into a recombinant expression vector. In this regard, disclosed herein are recombinant expression vectors comprising any of the nucleic acids of the invention. For purposes herein, the term "recombinant expression vector" means a genetically-modified oligonucleotide or polynucleotide construct that permits the expression of an mRNA, protein, polypeptide, or peptide by a host cell, when the construct comprises a nucleotide sequence encoding the mRNA, protein, polypeptide, or peptide, and the vector is contacted with the cell under conditions sufficient to have the mRNA, protein, polypeptide, or peptide expressed within the cell. The vectors disclosed herein are not naturally-occurring as a whole. However, parts of the vectors can be naturally-occurring. The inventive recombinant expression vectors can comprise any type of nucleotides, including, but not limited to DNA and RNA, which can be single-stranded or double-stranded, synthesized or obtained in part from natural sources, and which can contain natural, non-natural or altered nucleotides. The recombinant expression vectors can comprise naturally-occurring, non-naturally-occurring internucleotide linkages, or both types of linkages. Preferably, the non-naturally occurring or altered nucleotides or internucleotide linkages do not hinder the transcription or replication of the vector.

[0029] The recombinant expression vector can be any suitable recombinant expression vector, and can be used to transform or transfect any suitable host. Suitable vectors include those designed for propagation and expansion or for expression or both, such as plasmids and viruses. The vector can be selected from the group consisting of the pUC series (Fermentas Life Sciences), the pBluescript series (Stratagene, La Jolla, CA), the pET series (Novagen, Madison, WI), the pGEX series (Pharmacia Biotech, Uppsala, Sweden), and the pEX series (Clontech, Palo Alto, CA). Bacteriophage vectors, such as λGT10, λGT11, λZapII (Stratagene), λEMBL4, and λNM1149, also can be used. Examples of plant expression vectors include pBI101, pBI101.2, pBI101.3, pBI121 and pBIN19 (Clontech, Mountain View, CA). Examples of animal expression vectors include pEUK-C1, pMAM and pMAMneo (Clontech). Preferably, the recombinant expression vector is a viral vector, e.g., a retroviral vector, such as a lentiviral vector.

[0030] The recombinant expression vectors can be prepared using standard recombinant DNA techniques described in, for example, Sambrook et al., supra, and Ausubel et al., supra. Constructs of expression vectors, which are circular or linear, can be prepared to contain a replication system functional in a prokaryotic or eukaryotic host cell. Replication systems can be derived, e.g., from ColE1, 2µ plasmid, λ, SV40, bovine papilloma virus, lentiviruses and the like.

[0031] Desirably, the recombinant expression vector comprises regulatory sequences, such as transcription and translation initiation and termination codons, which are specific to the type of host (e.g., bacterium, fungus, plant, or animal) into which the vector is to be introduced, as appropriate and taking into consideration whether the vector is DNA or RNA based.

[0032] The recombinant expression vector can include one or more marker genes, which allow for selection of transformed or transfected hosts. Marker genes include biocide resistance, e.g., resistance to antibiotics, heavy metals, etc., complementation in an auxotrophic host to provide prototrophy, and the like. Suitable marker genes for the inventive expression vectors include, for instance, neomycin/G418 resistance genes, hygromycin resistance genes, histidinol resistance genes, tetracycline resistance genes, and ampicillin resistance genes.

[0033] The recombinant expression vector can comprise a native or nonnative promoter operably linked to the nucleotide sequence encoding the Bole1a viral polypeptides, or proteins (including functional portions and functional variants thereof), such as core, E1, E2, NS3, NS4, NS5, UTR and the like, or to the nucleotide sequence which is complementary to or which hybridizes to the nucleotide sequence encoding the Bole1a viral polypeptides, or proteins or fragments thereof, as discussed above.

[0034] The selection of promoters, e.g., strong, weak, inducible, tissue-specific and developmental-specific, is within the ordinary skill of the artisan. Similarly, the combining of a nucleotide sequence with a promoter is also within the skill of the artisan. The promoter can be a non-viral promoter or a viral promoter, e.g., a cytomegalovirus (CMV) promoter, an SV40 promoter, an RSV promoter, and a promoter found in the long-terminal repeat of the murine stem cell virus.

[0035] In accordance with another embodiment, the present invention provides an isolated host cell comprising the isolated nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 1, or the complement thereof.

[0036] The invention further provides a host cell comprising any of the recombinant expression vectors described herein. As used herein, the term "host cell" refers to any type of cell that can contain the inventive recombinant expression vector. The host cell can be a eukaryotic cell, e.g., plant, animal, fungi, or algae, or can be a prokaryotic cell, e.g., bacteria or protozoa. The host cell can be a cultured cell or a primary cell, i.e., isolated directly from an organism, e.g., a human. The host cell can be an adherent cell or a suspended cell, i.e., a cell that grows in suspension. Suitable host cells are known in the art and include, for instance, DH5α E. coli cells, Chinese hamster ovary cells, monkey VERO cells, COS cells, HEK293 cells, and the like. For purposes of amplifying or replicating the recombinant expression vector, the host cell is preferably a prokaryotic cell, e.g., a DH5α cell. For purposes of producing a recombinant Bole1a virus, polypeptide, or protein, the host cell is preferably a mammalian cell. Most preferably, the host cell is a human cell. The host cell can be of any cell type, can originate from any type of tissue, and can be of any developmental stage. The host cell can be a liver cell, such as a Hep3B cell, for example.

[0037] Also disclosed is a population of cells comprising at least one host cell described herein. The population of cells can be a heterogeneous population comprising the host cell comprising any of the recombinant expression vectors described, in addition to at least one other cell, e.g., a host cell (e.g., a liver cell), which does not comprise any of the recombinant expression vectors, or a cell other than a skin cell, e.g., a macrophage, a neutrophil, an erythrocyte, a hepatocyte, an endothelial cell, an epithelial cell, a muscle cell, a brain cell, etc. Alternatively, the population of cells can be a substantially homogeneous population, in which the population comprises mainly (e.g., consists essentially of) host cells comprising the recombinant expression vector. The population also can be a clonal population of cells, in which all cells of the population are clones of a single host cell comprising a recombinant expression vector, such that all cells of the population comprise the recombinant expression vector. The population of cells may be a clonal population comprising host cells comprising a recombinant expression vector as described herein.

[0038] In accordance with a further embodiment, the host cell is a mammalian cell, preferably a liver cell or cell line derived therefrom.

[0039] In accordance with an embodiment, the present invention provides an isolated nucleic acid molecule that specifically hybridizes to the nucleotide sequence set forth in SEQ ID NO: 1 or to the complement thereof. The isolated nucleic acid molecule that specifically hybridizes to the nucleotide sequence set forth in SEQ ID NO: 1 or to the complement thereof comprises an oligonucleotide primer between about 10 and about 100 nucleotides in length, or of various lengths of about 20, 30, 40, 50, 60, 70, 80 and about 90 nucleotides.

[0040] In accordance with an embodiment, the present invention provides a pair of oligonucleotide primers for PCR, wherein the first primer is an isolated nucleic acid molecule between about 10 and about 30 nucleotides in length that specifically hybridizes to the nucleotide sequence set forth in SEQ ID NO: 1 and the second primer is an isolated nucleic acid molecule between about 10 and about 30 nucleotides in length that specifically hybridizes to the complement of the nucleotide sequence set forth in SEQ ID NO: 1.
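
The primer-pair relationship described above can be illustrated with the following minimal Python sketch. It simplifies "specifically hybridizes" to perfect Watson-Crick complementarity, which is an assumption made only for this example; real primer design also considers mismatches, melting temperature and secondary structure. The template shown is a made-up toy sequence, not SEQ ID NO: 1, and all function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def binds_exact(primer, target):
    """True if the primer is perfectly complementary to a stretch of target.

    Simplifying assumption: hybridization is modeled as an exact match of
    the primer's reverse complement within the target strand.
    """
    return reverse_complement(primer).upper() in target.upper()


def check_primer_pair(first, second, template, min_len=10, max_len=30):
    """Check length limits and that the primers bind opposite strands.

    `first` must bind the template strand; `second` must bind the
    complement of the template strand.
    """
    lengths_ok = all(min_len <= len(p) <= max_len for p in (first, second))
    first_ok = binds_exact(first, template)
    second_ok = binds_exact(second, reverse_complement(template))
    return lengths_ok and first_ok and second_ok


if __name__ == "__main__":
    # Toy template (hypothetical; not SEQ ID NO: 1).
    template = "ATGGCTAGCTAGGATCCGATCGATCGTTAGCCTAGGCTA"
    second = template[:12]                       # sense-strand primer
    first = reverse_complement(template[-12:])   # antisense-strand primer
    print(check_primer_pair(first, second, template))
```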

[0041] In accordance with an embodiment, the present invention provides an isolated polypeptide encoded by nucleic acid comprising the nucleotide sequence set forth in SEQ ID NO: 1, having the amino acid sequence of SEQ ID NO: 2.

[0042] In accordance with an embodiment, the present invention provides methods for constructing synthetic viral genome polynucleotide sequences, including, for example, a series of synthetic HCV viral genome polynucleotide sequences, which can be used to construct viral particles and pseudoparticles. Fragments or portions of the polynucleotide sequences can be used for many purposes, including, for example, production of epitopes, antigens, antibodies and vaccines.

[0043] In one embodiment, the method for synthesizing a synthetic viral genome polynucleotide sequence generally comprises the following steps:
  1. Select an appropriate number of representative non-recombinant genomic nucleotide sequences for the virus of interest and an appropriate number of outgroup sequences. For genomic regions lacking a representative sample, go to step 6.
  2. Align sequences using an appropriate alignment program such as MUSCLE (Nucleic Acids Res. 32:1792-1797 (2004)) or ClustalX.
  3. To avoid idiosyncrasies of any individual phylogeny, reconstruct 2 independent phylogenetic trees using a Bayesian or Maximum Likelihood method applied to two phylogenetically informative regions of the alignment. Run a sufficient number of iterations to confirm convergence of parameters for phylogenetic trees.
  4. Use both phylogenetic trees to infer ancestral sequences for the rest of the genome. The program used for estimation must infer the ancestral sequence as a probability distribution for each position, generating a probability for each base (e.g., MrBayes or Garli).
  5. Infer the final representative sequence in the following manner (methods I & II):

    5a. For each nucleotide position i in the genome, if both trees agree on the maximum posterior probability (MPP) residue, the probability p_i of that position is selected to be the greater of the two MPPs. These positions are defined as concordant.

    5b. For each discordant position (where the MPP residue does not agree), either (method I) go directly to step 5d or (method II) calculate the joint probability of the codon k containing the discordant position based on both trees. For concordant residues within such codons, the p_i calculated in the previous step is used in calculating the joint probability.

    5c. The codon with the higher joint MPP from the two trees is selected to represent that codon position. This codon-based analysis resolves cases where more than one position in the codon is discordant and accommodates 6-fold degenerate codons.

    5d. To determine a stringent threshold for codon/nucleotide MPP, the inflection in the distribution of codon/nucleotide MPPs at which the variance in the second derivative is less than 10⁻⁶ for MPP values is used as a threshold for resolving a codon/nucleotide. Each codon/nucleotide with an MPP greater than or equal to the threshold based on either tree is accepted as ancestral and its constituent positions are defined as resolved.

    5e. Covariance analysis is used to examine still-unresolved positions. The basic assumption of phylogenetic reconstruction that each site evolves independently ignores covarying and interacting sites. In order to take such sites into consideration, the observed and expected frequencies of pairs of bases are determined and the chi-squared metric is calculated as shown in equation 1 and adjusted for multiple comparisons using the Holm-Bonferroni method at α = 0.05.

    5f. Using the adjusted chi-squared metric, all resolved positions j that significantly covaried with unresolved positions i are identified. In case of a positive interaction (o_ij > e_ij), the MPP codon/nucleotide containing the positively interacting residue is selected. For negative interactions (o_ij < e_ij), all codons/nucleotides with the negatively interacting base are eliminated and the MPP codon from the remaining ones is selected.

    5g. At still-unresolved sites, the MPP codon is selected even if less than the threshold (this is rarely necessary).

    5h. The result is the representative sequence.

  6. (For genomic regions lacking a representative sequence sample) Using available sequences, determine the consensus sequence.
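
Purely as an illustration of steps 5a, 5d and 5e above, the following Python sketch combines per-position posterior distributions from two trees and applies a Holm-Bonferroni correction. It is a simplified sketch under stated assumptions, not the procedure as actually implemented: the threshold is passed in as a parameter rather than derived from the inflection of the MPP distribution, the codon-level analysis of method II is omitted, the correction is applied to p-values assumed to have been obtained from the chi-squared statistics, and all names and numbers are hypothetical.

```python
def mpp(dist):
    """Return (residue, probability) with the maximum posterior probability."""
    residue = max(dist, key=dist.get)
    return residue, dist[residue]


def combine_two_trees(tree_a, tree_b, threshold):
    """Classify each position using the posteriors from two trees.

    tree_a, tree_b: lists of dicts mapping base -> posterior probability.
    Concordant positions take the greater of the two MPPs (step 5a) and are
    'resolved' if that value meets the threshold (step 5d, with the
    threshold supplied by the caller as an assumption). Discordant
    positions are only flagged for further analysis.
    """
    calls = []
    for dist_a, dist_b in zip(tree_a, tree_b):
        res_a, p_a = mpp(dist_a)
        res_b, p_b = mpp(dist_b)
        if res_a == res_b:
            p = max(p_a, p_b)
            status = "resolved" if p >= threshold else "unresolved"
            calls.append((res_a, p, status))
        else:
            calls.append((None, None, "discordant"))
    return calls


def holm_bonferroni(p_values, alpha=0.05):
    """Holm-Bonferroni step-down test; True marks a rejected null."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all larger p-values are retained as well
    return reject


if __name__ == "__main__":
    # Toy posteriors for three positions (hypothetical numbers).
    tree_a = [{"A": 0.99, "G": 0.01}, {"C": 0.6, "T": 0.4}, {"G": 0.7, "A": 0.3}]
    tree_b = [{"A": 0.95, "G": 0.05}, {"C": 0.55, "T": 0.45}, {"A": 0.8, "G": 0.2}]
    print(combine_two_trees(tree_a, tree_b, threshold=0.9))
    print(holm_bonferroni([0.01, 0.04, 0.03]))
```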


[0044] The term "isolated and purified" as used herein means a protein that is essentially free of association with other proteins or polypeptides, e.g., as a naturally occurring protein that has been separated from cellular and other contaminants by the use of antibodies or other methods or as a purification product of a recombinant host cell culture.

[0045] The term "biologically active" as used herein means an enzyme or protein having structural, regulatory, or biochemical functions of a naturally occurring molecule.

[0046] A "functional variant" of an amino acid sequence as used herein, refers to no more than one, two, three, four, five, six, seven, eight, nine or ten amino acid substitutions in the sequence of interest. The functional variant retains at least one biological activity normally associated with that amino acid sequence. The functional variant may retain at least about 40%, 50%, 60%, 75%, 85%, 90%, 95% or more biological activity normally associated with the full-length amino acid sequence. A functional variant may be an amino acid sequence that is at least about 60%, 70%, 80%, 90%, 95%, 97% or 98% similar to the polypeptide sequence disclosed herein (or fragments thereof).

[0047] Disclosed is an HCV pseudoviral particle comprising: a) the last 27 amino acids of the core sequence of SEQ ID NO: 1 followed by the amino acid sequences of the E1 and E2 regions; and b) a reporter element. The pseudoviral particle may comprise as a reporter element, the luciferase polyprotein or a functional portion thereof.

[0048] HIV readily forms pseudotypes or pseudoparticles with the envelope proteins of many different viruses. In particular, HIV pseudoparticles bearing native HCV E1 and E2 glycoproteins are infectious for the human hepatoma cell lines Huh-7 and PLC/PR5. Significantly, infectivity is pH-dependent and can be neutralized by a number of E2-specific mAbs. HCV pseudoviral particles can be generated by cotransfection of 293-T cells with equal amounts of expression plasmids expressing the viral gps or an empty vector and the envelope-defective pNL4.3.Luc.R-E- proviral genome.

[0049] Disclosed is an HCV antigen comprising a polynucleotide molecule encoding between 15 and 100 contiguous amino acids of the polypeptide encoded by the nucleotide sequence set forth in SEQ ID NO: 1, or a portion or fragment thereof. Also disclosed is an HCV antigen comprising the polypeptide having the amino acid sequence of SEQ ID NO: 2, or a portion or fragment thereof. The HCV antigen may comprise a polynucleotide molecule which encodes amino acids from the core, E1 and/or E2 regions of the polypeptide of SEQ ID NO: 2, or a portion or fragment thereof.

[0050] Disclosed is a method of treating a subject infected with HCV comprising administering to the subject, a pharmaceutical composition comprising an antigen as described above, in an amount sufficient to stimulate an immune response to the antigen in the subject, such that the immune response is sufficient to decrease the viral load of HCV in the subject.

[0051] The amount or dose of the vaccine composition that is administered should be sufficient to stimulate an immune response in the subject which will diminish the viral load of HCV in the subject over a reasonable time frame. The dose will be determined by the efficacy of the particular pharmaceutical formulation and the location of the target population of cells in the subject, as well as the body weight of the subject to be treated.

[0052] The term "administering" means that at least one or more pharmaceutical compositions are introduced into a subject, preferably a subject receiving treatment for a disease, and the at least one or more compositions are allowed to come in contact with the one or more disease related cells or population of cells having the target gene of interest in vivo.

[0053] As used herein, the term "treat," as well as words stemming therefrom, includes diagnostic and preventative as well as disorder remitative treatment.

[0054] As used herein, the term "subject" refers to any mammal, including, but not limited to, mammals of the order Rodentia, such as mice and hamsters, and mammals of the order Lagomorpha, such as rabbits. It is preferred that the mammals are from the order Carnivora, including Felines (cats) and Canines (dogs). It is more preferred that the mammals are from the order Artiodactyla, including Bovines (cows) and Swines (pigs) or of the order Perissodactyla, including Equines (horses). It is most preferred that the mammals are of the order Primates, Ceboids, or Simoids (monkeys) or of the order Anthropoids (humans and apes). An especially preferred mammal is the human.

[0055] The pharmaceutical compositions can be used in combination with one or more additional therapeutically active agents which are known to be capable of treating conditions or diseases discussed above. For example, the described compositions could be used in combination with one or more known therapeutically active agents, to treat a disease or condition such as HCV infection. Non-limiting examples of other therapeutically active agents that can be readily combined in a pharmaceutical composition are enzymatic nucleic acid molecules, allosteric nucleic acid molecules, antisense, decoy, or aptamer nucleic acid molecules, antibodies such as monoclonal antibodies, small molecules, and other organic and/or inorganic compounds including metals, salts and ions.

[0056] Also disclosed is an antibody, or antigen binding portion thereof, which specifically binds to the nucleic acid molecule set forth in SEQ ID NO: 1, or a portion or fragment thereof, or the isolated polypeptide having the amino acid sequence of SEQ ID NO: 2, or a portion or fragment thereof.

[0057] Disclosed herein are monoclonal antibodies directed against any of the HCV polypeptides, or proteins, including, for example, the core, E1 and/or E2, NS2, NS3, NS4, NS5 and their subunits, and fragments thereof. The antibody may be a human or humanized antibody molecule.

[0058] The antibody may be labeled with a detectable label.

[0059] Functional variants include, but are not limited to, derivatives that are substantially similar in primary structural sequence, but which contain e.g., in vitro or in vivo modifications, chemical and/or biochemical, that are not found in the parent binding molecule. Such modifications include inter alia acetylation, acylation, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, cross-linking, disulfide bond formation, glycosylation, hydroxylation, methylation, oxidation, pegylation, proteolytic processing, phosphorylation, and the like.

[0060] Nonlimiting examples of antibody fragments or antigen bindable fragments that bind to epitopes on the antigen include the following: Fab fragments, F(ab)2 fragments, Fab' fragments, fragments produced by F(ab) expression libraries, F(ab')2 fragments, Fd fragments, Fd' fragments and Fv fragments. The antibodies may be human, or from animals other than humans, preferably mammals, such as rat, mouse, guinea pig, rabbit, goat, sheep, and pig. Preferred are mouse monoclonal antibodies and antigen-binding fragments or portions thereof. In addition, chimeric antibodies and hybrid antibodies are embraced herein.

[0061] The monoclonal antibody can be obtained by culturing a hybridoma producing the antibody in a culture medium, for example, a RPMI1640 medium that contains fetal bovine serum. Alternatively, it can be obtained by preparing a gene comprising a heavy chain or a light chain, in which a DNA encoding a constant region of heavy chain or light chain is ligated to a DNA encoding each variable region by means of a PCR method or a chemical synthesis; inserting the obtained gene into a conventionally-used expression vector (e.g., pcDNA3.1 (Invitrogen)) capable of expressing the gene; expressing the gene in a host cell such as a CHO cell (Chinese hamster ovary cell) or Escherichia coli to produce the antibody; and purifying the obtained antibody from the culture medium using a Protein A/G column or the like.

[0062] Furthermore, the monoclonal antibody may be obtained by: preparing a hybridoma from a mammal immunized with a recombinant fusion protein comprising any of the HCV proteins, or fragments thereof, including for example, core, E1, E2, NS2, NS3, NS4 and NS5 proteins and their subunits, and one or more other proteins; expressing the fusion protein in a bacterial culture; purifying the fusion protein from bacterial lysates; mixing the purified fusion protein comprising any of the HCV proteins, or fragments thereof, with adjuvant and inoculating the mammal with the purified fusion protein. The inoculated mammals are given a booster inoculation after three weeks and then the splenocytes and lymphocytes are collected three days after the booster. Lymphocytes and splenocytes are fused with murine B cell hybridoma cells, such as SP2/mIL6 cells (ATCC), and propagated using HFCS supplement (Roche) according to manufacturer's instructions. Hybridomas are then screened for reactivity with the various species of recombinant HCV proteins, or fragments thereof.
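
The timing described in this paragraph (booster three weeks after the first inoculation, cell collection three days after the booster) can be sketched as simple date arithmetic. The function name, the dictionary keys, and the example start date are hypothetical and serve only to illustrate the intervals stated in the text.

```python
from datetime import date, timedelta


def immunization_schedule(primary_inoculation: date) -> dict:
    """Derive booster and collection dates from the date of the first inoculation."""
    booster = primary_inoculation + timedelta(weeks=3)  # booster after three weeks
    collection = booster + timedelta(days=3)            # collect cells three days later
    return {
        "primary_inoculation": primary_inoculation,
        "booster": booster,
        "cell_collection": collection,
    }


# Hypothetical start date for illustration only.
example_schedule = immunization_schedule(date(2024, 1, 1))
```

Under these intervals, cell collection falls 24 days after the first inoculation.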

[0063] Also disclosed herein are conjugates, e.g., bioconjugates, comprising any of the inventive monoclonal antibodies (including any of the functional portions or variants thereof), host cells, populations of host cells, or antibodies, or antigen binding portions thereof. Conjugates, as well as methods of synthesizing conjugates in general, are known in the art. See, for instance, Hudecz, F., Methods Mol. Biol., 298: 209-223 (2005) and Kirin et al., Inorg. Chem, 44(15): 5405-5415 (2005).

[0064] The antibody can be any type of immunoglobulin that is known in the art. For instance, the antibody can be of any isotype, e.g., IgA, IgD, IgE, IgG, IgM, etc. The antibody can be monoclonal or polyclonal. The antibody can be a naturally-occurring antibody, e.g., an antibody isolated and/or purified from a mammal, e.g., mouse, rabbit, goat, horse, chicken, hamster, human, etc. Alternatively, the antibody can be a genetically-engineered antibody, e.g., a humanized antibody or a chimeric antibody. The antibody can be in monomeric or polymeric form. Also, the antibody can have any level of affinity or avidity for any of the HCV polypeptides, or proteins, including, for example, core, E1, E2, NS2, NS3, NS4 and NS5 proteins and their subunits, and fragments thereof.

[0065] Methods of testing antibodies for the ability to bind any of the HCV proteins, or fragments thereof are known in the art and include any antibody-antigen binding assay, such as, for example, radioimmunoassay (RIA), ELISA, Western blot, immunoprecipitation, and competitive inhibition assays (see, e.g., Janeway et al., infra, and U.S. Patent Application Publication No. 2002/0197266 A1).

[0066] Suitable methods of making antibodies are known in the art. For instance, standard hybridoma methods are described in, e.g., Köhler and Milstein, Eur. J. Immunol., 5: 511-519 (1976), Harlow and Lane (eds.), Antibodies: A Laboratory Manual, CSH Press (1988), and C.A. Janeway et al. (eds.), Immunobiology, 5th Ed., Garland Publishing, New York, NY (2001). Alternatively, other methods, such as EBV-hybridoma methods (Haskard and Archer, J. Immunol. Methods, 74(2): 361-67 (1984), and Roder et al., Methods Enzymol., 121: 140-67 (1986)), and bacteriophage vector expression systems (see, e.g., Huse et al., Science, 246: 1275-81 (1989)) are known in the art. Further, methods of producing antibodies in non-human animals are described in, e.g., U.S. Patents 5,545,806, 5,569,825, and 5,714,352, and U.S. Patent Application Publication No. 2002/0197266 A1.

[0067] Antibodies can be produced by transgenic mice that are transgenic for specific heavy and light chain immunoglobulin genes. Such methods are known in the art and described in, for example U.S. Patents 5,545,806 and 5,569,825, and Janeway et al., supra.

[0068] Methods for generating humanized antibodies are well known in the art and are described in detail in, for example, Janeway et al., supra, U.S. Patents 5,225,539, 5,585,089 and 5,693,761. Humanized antibodies can also be generated using the antibody resurfacing technology described in U.S. Patent 5,639,641 and Pedersen et al., J. Mol. Biol., 235: 959-973 (1994).

[0069] A single-chain variable region fragment (sFv) antibody fragment, which consists of a truncated Fab fragment comprising the variable (V) domain of an antibody heavy chain linked to a V domain of a light antibody chain via a synthetic peptide, can be generated using routine recombinant DNA techniques (see, e.g., Janeway et al., supra). Similarly, disulfide-stabilized variable region fragments (dsFv) can be prepared by recombinant DNA technology (see, e.g., Reiter et al., Protein Engineering, 7: 697-704 (1994)). Antibody fragments disclosed herein, however, are not limited to these exemplary types of antibody fragments.

[0070] The antibody, or antigen binding fragment thereof, may be modified to comprise a detectable label, such as, for instance, a radioisotope, a fluorophore (e.g., fluorescein isothiocyanate (FITC), phycoerythrin (PE)), an enzyme (e.g., luciferase, alkaline phosphatase, horseradish peroxidase), or element particles (e.g., gold or magnetic particles).

[0071] Once an antibody molecule has been produced by an animal, chemically synthesized, or recombinantly expressed, it may be purified by any method known in the art for purification of an immunoglobulin molecule, for example, by chromatography (e.g., ion exchange, affinity, protein A/G immunoprecipitation chromatography, and sizing column chromatography), centrifugation, differential solubility, or by any other standard technique for the purification of proteins. In addition, the antibodies or fragments thereof can be fused to heterologous polypeptide sequences described herein or otherwise known in the art, to facilitate purification.

[0072] The antibodies can be employed to prepare antigen-antibody affinity columns, which may be used for the purification of the antigen. For example, gel supports or beads can be activated with various chemical compounds, e.g., cyanogen bromide, N-hydroxysuccinimide esters, and antibodies can be bound thereto. More particularly, and by way of example, antibodies can be added to Affigel-10 (BioRad, Hercules, CA), a gel support which is activated with N-hydroxysuccinimide esters, such that the antibodies form covalent linkages with the agarose gel bead support. The antibodies are then coupled to the gel via amide bonds with a spacer arm. The remaining activated esters are then quenched with ethanolamine HCl, 1 M, pH 8. The column is washed with water, followed by 0.23 M glycine HCl, pH 2.6, to remove any non-conjugated antibody or extraneous protein. The column is then equilibrated in phosphate buffered saline (PBS), pH 7.3, with appropriate detergent, and the sample materials, i.e., cell culture supernatants or cell extracts, for example, containing the cancer-specific antigens (e.g., prepared using appropriate membrane solubilizing surfactants) are slowly passed over the column. The column is washed with PBS/surfactant until the optical density falls to background. The protein is then eluted from the column with 0.23 M glycine-HCl, pH 2.6/surfactant. The purified antigens are then dialyzed against PBS/surfactant.

[0073] Methods of detecting the presence of HCV in a host and methods of treating or preventing infection of a host with HCV are further disclosed. The method of detecting the presence of HCV in a host comprises (i) contacting a sample comprising cells of the host with any of the antibodies, or antigen binding fragments thereof, described herein, thereby forming a complex, and (ii) detecting the complex, wherein detection of the complex is indicative of the presence of HCV infection in the host.

[0074] Also disclosed is a method of testing a sample for the presence of HCV in the sample, the method comprising detecting the presence of a polypeptide in the sample that specifically binds to the antibody as described herein.

[0075] Also disclosed is a method for localizing cells infected with HCV in a subject, especially cells expressing the core, E1, E2, NS2, NS3, NS4 and NS5 proteins and their subunits, and fragments thereof, comprising: (a) administering to the subject a detectably-labeled monoclonal antibody, or binding fragment thereof; (b) allowing the detectably-labeled (e.g., radiolabeled, fluorochrome labeled, or enzyme labeled, for example, via ELISA) monoclonal antibody, or binding fragment thereof, to bind to the infected cells within the subject; and (c) determining the location of the labeled monoclonal antibody or binding fragment thereof, within the subject.

[0076] The antibody may be labeled with a detectable moiety, such as a fluorophore, a chromophore, a radionuclide, a chemiluminescent agent, a bioluminescent agent and an enzyme.

[0077] Antibodies may be labeled with such reagents using protocols and techniques known and practiced in the art. See, for example, Wenzel and Meares, Radioimmunoimaging and Radioimmunotherapy, Elsevier, New York, (1983); Colcher et al., Meth. Enzymol., 121: 802-816 (1986); and Monoclonal Antibodies for Cancer Detection and Therapy, Baldwin et al., (eds) Academic Press, 303-316 (1985), for techniques relating to the radiolabeling of antibodies.

[0078] The antibodies, or binding fragments thereof, may be delivered parenterally, such as by intravenous, subcutaneous, or intraperitoneal administration, e.g., injection. Suitable buffers, carriers, and other components known in the art can be used in formulating a composition comprising the antibody or fragments for suitable shelf-life and compatibility for the administration. These substances may include ancillary agents such as buffering agents and protein stabilizing agents (e.g., polysaccharides).

[0079] More specifically, therapeutic formulations of the antibodies, or binding fragments thereof, are prepared for storage by mixing the antibodies or their binding fragments, having the desired degree of purity, with optional physiologically acceptable carriers, excipients, or stabilizers (Remington's Pharmaceutical Sciences, 17th edition, (Ed.) A. Osol, Mack Publishing Company, Easton, Pa., (1985)), in lyophilized form or in the form of aqueous solutions. Acceptable carriers, excipients or stabilizers are nontoxic to recipients at the dosages and concentrations employed, and include buffers such as phosphate, citrate, and other organic acids; antioxidants including ascorbic acid; low molecular weight (e.g., about 10-15 amino acid residues or less) polypeptides; proteins, such as serum albumin, gelatin, or immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone; amino acids such as glycine, glutamine, asparagine, arginine or lysine; monosaccharides, disaccharides, and other carbohydrates including glucose, mannose, or dextrins; chelating agents such as EDTA; sugar alcohols such as mannitol or sorbitol; salt-forming counterions such as sodium; and/or nonionic surfactants such as TWEEN™ (polysorbates), PLURONICS™ (block copolymers of ethylene oxide (EO) and propylene oxide (PO)) or polyethylene glycol (PEG). The antibodies, or binding fragments thereof, also may be entrapped in microcapsules prepared, for example, by coacervation techniques or by interfacial polymerization (for example, hydroxymethylcellulose or gelatin-microcapsules and poly-[methylmethacrylate] microcapsules, respectively), in colloidal drug delivery systems (for example, liposomes, albumin microspheres, microemulsions, nano-particles and nanocapsules), or in macroemulsions. Such techniques are disclosed in Remington's Pharmaceutical Sciences, supra.

[0080] Antibodies or their binding fragments to be used for in vivo administration must be sterile. This is readily accomplished by filtration through sterile filtration membranes, prior to, or following lyophilization and reconstitution. The antibodies, or binding fragments thereof, ordinarily will be stored in lyophilized form or in solution.

[0081] Therapeutic antibody compositions generally are placed into a container having a sterile access port, for example, an intravenous solution bag or vial having a stopper pierceable by a hypodermic injection needle. The route of administration of the antibodies, or binding fragments thereof, is in accord with known methods, e.g., injection or infusion by intravenous, intraperitoneal, intramuscular, intraarterial, subcutaneous, intralesional routes, by aerosol or intranasal routes, or by sustained release systems as noted below. The antibodies, or binding fragments thereof, are administered continuously by infusion or by bolus injection. Suitable examples of sustained-release preparations include semipermeable matrices of solid hydrophobic polymers containing the protein, which matrices are in the form of shaped articles, e.g., films, or microcapsules. Examples of sustained-release matrices include polyesters, hydrogels (e.g., poly(2-hydroxyethyl-methacrylate) as described by Langer et al., J. Biomed. Mater. Res., 15: 167-277 (1981) and Langer, Chem. Tech., 12: 98-105 (1982)), or poly(vinylalcohol), polylactides (U.S. Pat. No. 3,773,919), copolymers of L-glutamic acid and gamma ethyl-L-glutamate (Sidman et al., Biopolymers, 22: 547-556 (1983)), nondegradable ethylene-vinyl acetate (Langer et al., supra), degradable lactic acid-glycolic acid copolymers such as the LUPRON DEPOT™ (injectable microspheres composed of lactic acid-glycolic acid copolymer and leuprolide acetate), and poly-D-(-)-3-hydroxybutyric acid (EP 133,988).

[0082] An effective amount of antibody to be employed therapeutically will depend, for example, upon the therapeutic and treatment objectives, the route of administration, the age, condition, and body mass of the patient undergoing treatment or therapy, and auxiliary or adjuvant therapies being provided to the patient. Accordingly, it will be necessary and routine for the practitioner to titer the dosage and modify the route of administration, as required, to obtain the optimal therapeutic effect. A typical daily dosage might range from about 1 mg/kg to up to about 100 mg/kg or more, preferably from about 0.1 to about 10 mg/kg/day depending on the above-mentioned factors. Typically, the clinician will administer antibody until a dosage is reached that achieves the desired effect. The progress of this therapy is easily monitored by conventional assays.
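
The per-kilogram figures in this paragraph translate into an absolute daily amount by multiplying by body mass. The following Python sketch shows only that arithmetic for the stated preferred range of about 0.1 to about 10 mg/kg/day. The function names, the constant name, and the 70 kg example are hypothetical, and the sketch does not model any of the patient-specific factors that the paragraph leaves to the practitioner.

```python
# Preferred range stated in the text, in mg per kg of body mass per day.
PREFERRED_RANGE_MG_PER_KG_PER_DAY = (0.1, 10.0)


def daily_dose_mg(dose_mg_per_kg: float, body_mass_kg: float) -> float:
    """Convert a per-kilogram daily dosage into an absolute daily amount in mg."""
    if dose_mg_per_kg <= 0 or body_mass_kg <= 0:
        raise ValueError("dosage and body mass must be positive")
    return dose_mg_per_kg * body_mass_kg


def preferred_daily_range_mg(body_mass_kg: float) -> tuple:
    """Absolute daily amounts (mg) at the low and high ends of the preferred range."""
    low, high = PREFERRED_RANGE_MG_PER_KG_PER_DAY
    return (daily_dose_mg(low, body_mass_kg), daily_dose_mg(high, body_mass_kg))


# Hypothetical 70 kg body mass, for illustration only.
example_low_mg, example_high_mg = preferred_daily_range_mg(70.0)
```

For the hypothetical 70 kg example, the preferred range corresponds to roughly 7 mg to 700 mg per day.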

[0083] Various adjuvants may be used to increase the immunological response to the antigen or vaccine and to elicit specific antibodies. Depending on the host species to be immunized, adjuvants may include, but are not limited to, Freund's (complete and incomplete), mineral gels, such as aluminum hydroxide, surface active agents, such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, dinitrophenol, and potentially useful human adjuvants such as BCG (bacille Calmette-Guerin) and Corynebacterium parvum.

[0084] The antibodies are also useful for in vitro diagnostic applications for the detection of HCV infected cells that possess the antigen for which the antibodies are specific. As detailed above, in vitro diagnostic methods include immunohistological or immunohistochemical detection of HCV infected cells (e.g., on human tissue, or on cells dissociated from excised specimens), or serological detection of HCV associated antigens (e.g., in blood samples or other biological fluids). Immunohistochemical techniques involve staining a biological specimen, such as a tissue specimen, with one or more of the antibodies and then detecting the presence on the specimen of antibody-antigen complexes comprising antibodies bound to the cognate antigen. The formation of such antibody-antigen complexes with the specimen indicates the presence of HCV infection in the tissue.

[0085] Detection of the antibody on the specimen can be accomplished using techniques known in the art such as immunoenzymatic techniques, e.g., immunoperoxidase staining technique, or the avidin-biotin technique, or immunofluorescence techniques (see, e.g., Ciocca et al., Meth. Enzymol., 121: 562-79 (1986), and Introduction to Immunology, (2nd Ed), 113-117, Macmillan Publishing Company (1986)). Serologic diagnostic techniques involve the detection and quantification of tumor-associated antigens that have been secreted or "shed" into the serum or other biological fluids of patients thought to be suffering from cancer, as mentioned above. Such antigens can be detected in the body fluids using techniques known in the art, such as radioimmunoassays (RIA) or enzyme-linked immunosorbent assays (ELISA), wherein antibody reactive with the shed antigen is used to detect the presence of the antigen in a fluid sample (See, e.g., Uotila et al., J. Immunol. Methods, 42: 11 (1981) and Fayed et al., Disease Markers, 14: 155-160 (1998)).

[0086] Disclosed is a method of detection of circulating serum antibodies specific for HCV proteins in a biological sample from a subject using an ELISA assay comprising: (a) contacting said at least one biological sample having at least one antibody specific for HCV protein, or at least one fragment of said protein with an HCV protein or a fragment thereof, and (b) detecting the formation of an antigen-antibody complex between the HCV protein or a fragment thereof, and an HCV specific antibody or fragment thereof, present in the biological sample.

[0087] The antibody or antibodies can, themselves, be linked to a detectable label. Such a detectable label allows for the presence of, or the amount of, the primary immune complexes to be determined. Alternatively, the first added component that becomes bound within the primary immune complexes can be detected by means of a second binding ligand that has binding affinity for the first antibody. In these cases, the second binding ligand is itself often an antibody, which can be termed a "secondary" antibody. The primary immune complexes are contacted with the labeled, secondary binding ligand, or antibody, for a period of time sufficient to allow the formation of secondary immune complexes. The secondary immune complexes are then washed to remove any non-specifically bound labeled secondary antibodies or ligands, and the remaining label in the secondary immune complexes is then detected.

[0088] A method of detecting the presence and extent of infection of HCV in a patient is provided, comprising: determining the level of the antigen in a sample of bodily fluid or a tissue section from the patient and correlating the quantity of the antigen with the presence and extent of the infection in the patient. The antigen may be detected by (1) adding monoclonal antibody specific for core, E1, E2, NS2, NS3, NS4 and NS5 proteins and their subunits, and fragments thereof to the sample or tissue section; (2) adding goat anti-mouse IgG antibody conjugated with peroxidase; (3) fixing with diaminobenzidine and peroxide, and (4) examining the sample or section, wherein reddish brown color indicates that the cells bear the antigen.

[0089] Also disclosed is a method of making affinity-purified polyclonal antibodies using a 10 kD recombinant version of the core, E1, E2, NS2, NS3, NS4 and NS5 proteins and their subunits, and fragments thereof. The common leader peptide is transfected into bacteria and the leader peptide is expressed and is suitably soluble in aqueous solution. Polyclonal antibodies are ordinarily obtained from the serum of goat or rabbit immunized with a particular antigen. The antigen may be the 10 kD recombinant version of the core, E1, E2, NS2, NS3, NS4 and NS5 proteins and their subunits, and fragments thereof. The antiserum is affinity purified to remove nonspecific antibodies, increasing sensitivity and reducing background. Further purification is performed to remove potential nonspecific reactivities among related animal species, or to reduce shared reactivity with other heavy and light chains. The purified antibody may be labeled with a detectable marker, for example, rhodamine. The purified polyclonal antibodies are used to detect antigen using tissue samples that are fixed and embedded in paraffin, using methods known in the art.

[0090] Further methods include the detection of primary immune complexes by a two-step approach. A second binding ligand, such as an antibody, that has binding affinity for the first antibody is used to form secondary immune complexes, as described above. After washing, the secondary immune complexes are contacted with a third binding ligand or antibody that has binding affinity for the second antibody, for a period of time sufficient to allow the formation of immune complexes (tertiary immune complexes). The third ligand or antibody is linked to a detectable label, allowing detection of the tertiary immune complexes thus formed.

[0091] The monoclonal antibodies can be administered parenterally by injection or by gradual perfusion over time. The monoclonal antibodies can be administered intravenously, intraperitoneally, intramuscularly, subcutaneously, intracavity, or transdermally, alone or in combination with effector cells. Preparations for parenteral administration include sterile aqueous or non-aqueous solutions, suspensions, and emulsions. Examples of non-aqueous solvents are propylene glycol, polyethylene glycol, vegetable oils such as olive oil, and injectable organic esters such as ethyl oleate. Aqueous carriers include water, alcoholic/aqueous solutions, emulsions or suspensions, including saline and buffered media. Parenteral vehicles include sodium chloride solution, Ringer's dextrose, dextrose and sodium chloride, lactated Ringer's, or fixed oils. Intravenous vehicles include fluid and nutrient replenishers, electrolyte replenishers (such as those based on Ringer's dextrose), and the like. Preservatives and other additives may also be present such as, for example, antimicrobials, anti-oxidants, chelating agents, and inert gases and the like.

[0092] The monoclonal antibodies, or binding fragments thereof, may be used to quantitatively or qualitatively detect the presence of any of the HCV proteins, or fragments thereof, on or in various skin or other cells. This can be achieved, for example, by immunofluorescence techniques employing a fluorescently labeled antibody, coupled with light microscopic, flow cytometric, or fluorometric detection. In addition, the antibodies, or binding fragments thereof, may be employed histologically, as in immunofluorescence, immunoelectron microscopy, or non-immuno assays, for in situ detection of the cancer-specific antigen on cells, such as for use in monitoring, diagnosing, or detection assays.

[0093] In situ detection is accomplished by removing a histological specimen from a patient, and applying thereto a labeled antibody. The antibody, or antigen-binding fragment thereof, is preferably applied by overlaying the labeled antibody or fragment onto the biological sample. Through the use of such a procedure, it is possible to determine not only the presence of the antigen, or conserved variants, or peptide fragments, but also its distribution in the examined tissue. Those of ordinary skill in the art will readily recognize that any of a wide variety of histological methods, e.g., staining procedures, can be modified in order to achieve such in situ detection.

[0094] In an immunoassay, a biological sample may be brought into contact with, and immobilized onto, a solid phase support or carrier, such as nitrocellulose, or other solid support or matrix, which is capable of immobilizing cells, cell particles, membranes, or soluble proteins. The support is then washed with suitable buffers, followed by treatment with the detectably-labeled antibody. The solid phase support is then washed with buffer a second time to remove unbound antibody. The amount of bound label on the solid support is then detected by conventional means. Accordingly, compositions are provided comprising the monoclonal antibodies, or binding fragments thereof, bound to a solid phase support, such as described herein.

[0095] By solid phase support, or carrier, or matrix, is meant any support capable of binding an antigen or an antibody. Well-known supports or carriers include glass, plastic, nylon wool, polystyrene, polyethylene, polypropylene, dextran, nylon, amylases, films, resins, natural and modified celluloses, polyacrylamides, agarose, alumina gels, gabbros, and magnetite. The nature of the carrier can be either soluble to some extent, or insoluble. The support material may have virtually any possible structural configuration as long as the coupled molecule is capable of binding to an antigen or antibody. Thus, the support configuration may be spherical, as in a bead, cylindrical, as in the inside surface of a test tube, or the external surface of a rod. Alternatively, the surface may be flat, such as a sheet, film, test strip, stick, and the like.

[0096] The solid support may be inert to the reaction conditions for binding and may have reactive groups, or activated groups, in order to attach the monoclonal antibody, a binding fragment, or the binding partner of the antibody. The solid phase support can also be useful as a chromatographic support, such as the carbohydrate polymers SEPHAROSE™ (crosslinked agarose beads), SEPHADEX™ (crosslinked dextran gel), or agarose. Indeed, a large number of such supports for binding antibody or antigen are commercially available and known to those having skill in the art.

[0097] The binding activity for a given antibody may be determined by well-known methods. With respect to the cancer specific antibodies disclosed herein, numerous ways to detectably label such protein molecules are known and practiced in the art. For example, the antibodies can be detectably labeled by linking the antibody to an enzyme, e.g., for use in an enzyme immunoassay (EIA), (Voller et al., Diagnostic Horizons, 2: 1-7 (1978); Butler et al., Meths. Enzymol., 73: 482-523 (1981)). The enzyme that is bound to the antibody reacts with an appropriate substrate, preferably a chromogenic substrate, so as to produce a chemical moiety which can be detected, for example, by spectrophotometric, fluorometric, or by visual detection means. Nonlimiting examples of enzymes which can be used to detectably label the antibodies include malate dehydrogenase, staphylococcal nuclease, delta-5-steroid isomerase, yeast alcohol dehydrogenase, alpha-glycerophosphate dehydrogenase, triose phosphate isomerase, horseradish peroxidase, alkaline phosphatase, ribonuclease, urease, catalase, glucose-6-phosphate dehydrogenase, glucoamylase and acetylcholinesterase. The detection can be accomplished by chromometric methods, which employ a chromogenic substrate for the enzyme, or by visual comparison of the extent of enzymatic reaction of a substrate compared with similarly prepared standards or controls.

[0098] The antibodies disclosed herein, or their antigen-binding fragments, can also be labeled using a fluorescent compound. When the fluorescently labeled antibody is exposed to light of the proper wavelength, its presence can then be detected due to fluorescence. Some of the most commonly used fluorescent labeling compounds include fluorescein isothiocyanate, rhodamine, phycoerythrin, phycocyanin, allophycocyanin, o-phthaldehyde and fluorescamine.

[0099] The antibodies can also be detectably labeled by coupling them to a chemiluminescent compound. The presence of the chemiluminescent-tagged antibody is then determined by detecting the presence of luminescence that develops during the course of a chemical reaction. Examples of particularly useful chemiluminescent labeling compounds include, without limitation, luminol, isoluminol, theromatic acridinium ester, imidazole, acridinium salt and oxalate ester. Similarly, a bioluminescent compound may be used to label the antibodies. Bioluminescence is a type of chemiluminescence found in biological systems in which a catalytic protein increases the efficiency of the chemiluminescent reaction. The presence of a bioluminescent protein is determined by detecting the presence of luminescence. Useful bioluminescent labeling compounds include luciferin, luciferase and aequorin.

[0100] The following examples further illustrate the invention but, of course, should not be construed as in any way limiting its scope.

EXAMPLES



[0101] Human subjects. The Baltimore Before and After Acute Study of Hepatitis (BBAASH) cohort is a prospective study of persons at risk for hepatitis C infection. Eligible participants have a history of or ongoing intravenous drug use and are seronegative for anti-HCV antibodies at enrollment. Written consent was obtained from each participant. Once enrolled, participants receive counseling to reduce intravenous drug use and its complications. Blood is drawn for isolation of serum, plasma, and peripheral blood mononuclear cells (PBMC) in a protocol designed for monthly follow-up. Participants with acute HCV infection were referred for evaluation of treatment. The study was approved by the Institutional Review Board at the Johns Hopkins School of Medicine.

[0102] Synthetic Coding Sequence Reconstruction. HCV subtype 1a (n=390) and 1b (n=296) sequences that included at least the entire open reading frame of the polyprotein, were obtained from human specimens, and were not epidemiologically redundant were downloaded from GenBank (accession numbers AB016785, AB049087-101, AB154177, AB154179, AB154181, AB154183, AB154185, AB154187, AB154189, AB154191, AB154193, AB154195, AB154197, AB154199, AB154201, AB154203, AB154205, AB191333, AB249644, AB429050, AF009606, AF139594, AF165045, AF165047, AF165049, AF165051, AF165053, AF165055, AF165057, AF165059, AF165061, AF165063, AF176573, AF207752-74, AF208024, AF313916, AF356827, AF483269, AF511948-50, AJ000009, AJ132996-97, AJ238799-800, AJ278830, AY045702, AY460204, AY587844, AY615798, AY695437, AY956463-8, D10749, D10934, D11168, D14484, D50480-82, D63857, D85516, D89815, D89872, D90208, DQ071885, DQ838739, EF032883, EF032886, EF032892, EF032900, EF407411-57, EF407458-504, EF621489, EF638081, EU155213-16, EU155217-35, EU155233, EU155236-381, EU234061, EU234063-65, EU239713, EU239714, EU239715-17, EU255927-99, EU255960-2, EU256000-1, EU256002-97, EU256045, EU256054, EU256059, EU256061-2, EU256064-6, EU256075-103, EU256104, EU256106-7, EU260395-6, EU362882, EU362888-901, EU362911, EU482831-2, EU482833, EU482834-89,EU482839, EU482849, EU482859, EU482860, EU482874, EU482875, EU482877, EU482879-81, EU482883, EU482885-6, EU482888, EU529676-81, EU529682, EU569722-23, EU595697-99, EU660383-85, EU660386, EU660387, EU660388, EU677248, EU677253, EU687193-95, EU857431, EU862823-24, EU862826-27, EU862835, FJ024086, FJ024087, FJ024274-76, FJ024277, FJ024278, FJ024279, FJ024280-82, FJ181999-201, FJ205867-69, FJ390394-95, FJ390396-8, FJ390399, FJ410172, L02836, M58335, M84754, U01214, U16362, U45476, U89019, X61596).

[0103] Hereinafter, the 390 subtype 1a sequence dataset is referred to as the "original dataset" for purposes of describing the present invention. The sequences were aligned using MUSCLE v3.0 (Nucleic Acids Res. 32:1792-1797 (2004)) and modified using BioEdit v7.0.5.3108 (mbio.ncsu.edu/RNaseP/info/programs/BIOEDIT/bioedit.html (accessed 20 Feb 2005)). To avoid idiosyncrasies of any individual phylogeny, we constructed 2 independent phylogenetic trees using a software program which allows phylogenetic reconstruction and ancestral sequence reconstruction as a probability distribution, e.g. MrBayes v3.2 (Bioinformatics 19:1572-1574 (2003)), applied to positions 869-1292 (Core/E1) and 8276-8615 (NS5B) from the full-genome alignment (position numbers are based on reference genome H77; GenBank accession number AF009606). These segments were chosen because they were shown to be most phylogenetically informative. They are hereinafter referred to as "Simmonds" regions in the present invention. Thirty million iterations of MrBayes v3.2 were run, and convergence of parameters for the phylogenetic trees inferred from both Simmonds regions was confirmed using Tracer v1.5 (Rambaut A, available from the author, [beast.bio.ed.ac.uk/Tracer]). The two Simmonds regions yielded different trees, which is expected given the large number of possible trees; nonetheless, analyses of the two datasets converged with similar model parameters. In addition, recombination in HCV is rare. Hence, it can be assumed that the same phylogenetic tree, or the same evolutionary history, will be correct for the entire length of the genome.

[0104] Using both phylogenetic trees reconstructed with Simmonds regions, ancestral sequences were inferred for each of the HCV-1a coding regions. The ancestral sequence is obtained as a probability distribution for each position, such that there is a probability of observing each base.

[0105] Computational preparation of synthetic HCV genome. Bole1a was derived using the methods described herein.
  1. For each nucleotide position i in the genome, if both trees agreed on the maximum posterior probability (MPP) residue, the probability of that position, p_i, was selected to be the greater of the two MPPs. These positions are defined as concordant.
  2. For discordant positions (where the MPP residue did not agree), the joint probability of the codon k containing the discordant position based on both trees was designated p_ck(core-E1) and p_ck(NS5B). For concordant residues within such codons, the p_i calculated in the previous step was used in calculating the joint probability.
  3. The codon with the higher joint MPP from the two trees was selected to represent that codon position. This codon-based analysis resolves cases where more than one position in the codon is discordant and accommodates 6-fold degenerate codons.
  4. To determine a stringent threshold for codon MPP, the inflection in the distribution of codon MPPs at which the variance in second derivative was less than 10^-6 for MPP values was found to be 0.9837, corresponding to individual residue MPPs > 0.99.
  5. Each codon with an MPP greater than or equal to 0.9837 based on either tree was accepted as ancestral and its constituent positions were defined as resolved.
  6. Covariance analysis was used to examine still-unresolved positions. The basic assumption of phylogenetic reconstruction that each site evolves independently ignores covarying and interacting sites. In order to take such sites into consideration, the observed and expected frequencies of pairs of bases were determined and the chi-squared metric was calculated as shown in equation 1 and adjusted for multiple comparisons using the Holm-Bonferroni method at α = 0.05.


    Using the adjusted chi-squared metric, all resolved positions j that significantly covaried with unresolved positions i were identified. In case of a positive interaction (o_ij > e_ij), the MPP codon containing the positively interacting residue was selected. For negative interactions (o_ij < e_ij), all codons with the negatively interacting base were eliminated and the MPP codon from the remaining codons was selected.
  7. At still-unresolved sites, the MPP codon was selected even if less than 0.9837 (as noted in Example x, this was rarely necessary).
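Steps 1 to 5 above can be illustrated with a minimal Python sketch. It assumes that each tree supplies, for every position of a codon, a dictionary mapping each base to its posterior probability, and that the joint probability of a codon is the product of its three per-position probabilities; the function names and the toy probabilities are hypothetical and are not taken from the specification. The product assumption is at least consistent with the reported figures, since a per-residue MPP of about 0.9945 gives a codon MPP of about 0.9837.

```python
CODON_MPP_THRESHOLD = 0.9837  # codon threshold reported in the text


def mpp(dist):
    """Return (base, probability) for the maximum posterior probability base."""
    base = max(dist, key=dist.get)
    return base, dist[base]


def resolve_codon(tree1, tree2, threshold=CODON_MPP_THRESHOLD):
    """Pick a codon from two trees' per-position posterior distributions.

    tree1, tree2: lists of three dicts (base -> posterior probability).
    Returns (codon, joint_probability, resolved_flag).
    Assumption: joint probability = product of per-position probabilities.
    """
    calls1 = [mpp(d) for d in tree1]
    calls2 = [mpp(d) for d in tree2]
    joint1 = joint2 = 1.0
    for (b1, p1), (b2, p2) in zip(calls1, calls2):
        if b1 == b2:
            # Concordant position: p_i is the greater of the two MPPs
            p_i = max(p1, p2)
            joint1 *= p_i
            joint2 *= p_i
        else:
            # Discordant position: each tree keeps its own MPP
            joint1 *= p1
            joint2 *= p2
    # The codon with the higher joint MPP represents this codon position
    winner, joint = (calls1, joint1) if joint1 >= joint2 else (calls2, joint2)
    codon = "".join(base for base, _ in winner)
    return codon, joint, joint >= threshold


# Toy example: third position is discordant between the two trees
core_e1 = [{"A": 0.999, "C": 0.001}, {"T": 0.999, "C": 0.001}, {"G": 0.6, "A": 0.4}]
ns5b = [{"A": 0.999, "C": 0.001}, {"T": 0.999, "C": 0.001}, {"A": 0.7, "G": 0.3}]
codon, joint, resolved = resolve_codon(core_e1, ns5b)
print(codon, round(joint, 4), resolved)
```

A codon returned with the resolved flag unset would then pass to the covariance analysis of step 6 and, failing that, to step 7.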


[0106] 5' and 3' UTR sequence reconstruction. Although the 5'UTR and 3'UTR are noncoding regions, they are essential for replication of the virus. However, of the 390 sequences, only 6 had completely sequenced 5'UTR regions and 4 had completely sequenced 3'UTR regions. Hence we used additional sequences to better design the noncoding regions. The 5'UTR (n=257) and 3'UTR (n=46) sequences were clonal sequences generated from acutely infected subjects in the BBAASH cohort. We found that our 90% consensus sequence of the 5'UTR was identical to the consensus sequence derived from the 6 sequences with complete 5'UTR sequences and also to the H77 5'UTR. The 3'UTR sequence was divided into 4 parts based on the classification by Kolykhalov et al. We determined the 90% consensus sequence for the first part, which is a short sequence with significant variability among genotypes. For the second segment of the 3'UTR, we determined that the median length of the homopolymeric uracil tract was 51 residues, which is also a favorable length for replication. We selected a segment of median length for the third segment, a polypyrimidine tract consisting mainly of U with interspersed C residues. The last (3' end) part is a conserved sequence of 98 bases, for which we used the 90% consensus sequence; this was confirmed with 15 additional sequences from an unrelated study.
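The 90% consensus and median tract length used for the UTRs can be sketched as follows. This is only an illustration under stated assumptions: the input is a gap-free alignment of equal-length strings, and a column in which no base reaches the threshold is marked "N" (the text does not say how such columns were handled). The function names and toy sequences are hypothetical.

```python
from collections import Counter
from statistics import median


def threshold_consensus(aligned, threshold=0.9, ambiguous="N"):
    """Per-column consensus: keep the most common base if its frequency
    is at least `threshold`, otherwise emit `ambiguous` (an assumption)."""
    consensus = []
    for column in zip(*aligned):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(column) >= threshold else ambiguous)
    return "".join(consensus)


def median_tract_length(lengths):
    """Median length of a homopolymeric tract across clonal sequences."""
    return median(lengths)


# Toy alignment: 9 of 10 sequences agree at the last column
clones = ["ACGU"] * 9 + ["ACGA"]
print(threshold_consensus(clones))
print(median_tract_length([49, 51, 60]))
```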

[0107] HCV pseudoparticle (HCVpp) system. A region of the Bole1a nucleotide sequence encoding the last 27 amino acids of core followed by the E1 and E2 regions was synthesized (Blue Heron, Bothell, WA) and then subcloned into the expression vector pcDNA3.2/V5/Dest (Invitrogen, Carlsbad, CA) using Gateway cloning technology. The E1E2 region was sequenced after cloning and showed no errors. Pseudoparticles containing the luciferase reporter gene were generated as described (Proc. Natl. Acad. Sci. U.S.A. 100:7271-7276 (2003); Proc. Natl. Acad. Sci. U.S.A. 101:10149-10154 (2004); Clin. Infect. Dis. 41:667-675 (2005)). Briefly, a plasmid expressing Bole1a E1E2 was co-transfected into HEK293T cells with the pNL4-3.Luc.R-E- plasmid containing the env-defective HIV proviral genome and a luciferase reporter gene. The HCVpp-containing supernatants were collected 48 and 72 hours after transfection. Pseudoparticles expressing E1E2 glycoproteins from H77 and from another subtype 1a HCV virus (pp1a46), as well as pseudoparticles with no E1E2 (mock), were produced in parallel with pseudoparticles expressing Bole1a E1E2 for comparison of infectivity. Serial two-fold dilutions of pseudoparticles were used to infect Hep3B hepatoma cells in duplicate wells of a 96-well plate for 5 hours, followed by replacement of media and measurement of luciferase activity 72 hours post infection. Cells were lysed with Cell Culture Lysis Reagent (Promega, USA) and luciferase activity was measured using Luciferase Assay Reagent (Promega, USA) and a Centro LB960 Chemiluminometer (Berthold, Germany).

[0108] CD81 blocking experiments. Hep3B cells were incubated with a mouse anti-human CD81 monoclonal antibody (100 µg/ml, clone 1.3.3.22, Santa Cruz Biotechnology) or mouse IgG1 isotype control (Santa Cruz Biotechnology, USA) for 1 hr at 37 °C, and HCVpp infection was assessed as above.

[0109] Neutralization by human plasma. Heat-inactivated plasma or serum was diluted 1:4 with MEM containing 10% FBS, incubated with each library HCVpp for 1 hour at 37 °C (final HCVpp dilution, 1:100), added to Hep3B hepatoma cells in duplicate wells of a 96-well plate, and incubated for 5 hours at 37 °C, followed by replacement of media. Luciferase activity was measured as above. HCVpp infection was measured in terms of relative light units (RLUs) in the presence of plasma or serum samples (RLUtest) versus average infection in the presence of normal human serum (Gemini Bio-Products, West Sacramento, CA) and plasma pooled from seronegative BBAASH participants (RLUcontrol). Percent neutralization was calculated as [1 - (RLUtest/RLUcontrol)] × 100.
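The percent neutralization formula can be written as a short Python function. This is a sketch only: the function name is hypothetical, the RLU readings are made-up illustrative numbers, and RLUcontrol is taken as the mean of the control wells, as the text describes.

```python
from statistics import mean


def percent_neutralization(rlu_test, control_rlus):
    """[1 - (RLUtest / RLUcontrol)] x 100, with RLUcontrol taken as the
    mean of the control readings (normal serum and pooled seronegative plasma)."""
    rlu_control = mean(control_rlus)
    if rlu_control <= 0:
        raise ValueError("control RLU must be positive")
    return (1 - rlu_test / rlu_control) * 100


# Illustrative values only, not data from the study
print(percent_neutralization(2500, [9000, 11000]))
```

A negative result would indicate more infection in the test sample than in the controls, which the formula permits.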

[0110] Diversity analysis. Diversity plots were generated using VarPlot version 1.2 (available from the author at sray.med.som.jhmi.edu/scroftware/VarPlot). Plots were generated using a window size of 20 codons (to reflect the upper limit of T cell epitopes) and a step size of 1. Nonsynonymous and synonymous distances were calculated using the models of Nei and Gojobori (Mol. Biol. Evol. 3:418-426 (1986)). The E1E2 pixel alignment (Figure 2b) was drawn using VisSPA v1.6 (sray.med.som.jhmi.edu/SCRoftware/VisSPA/).

[0111] The Bole1a genomic sequence has been deposited in GenBank under accession number JQ791196.

EXAMPLE 1



[0112] Trees for the E1 and NS5B regions generated ancestral sequences that agreed at 9763 (∼98%) of 9992 nucleotide sites in the alignment (gaps were counted as characters). Applying the codon threshold of MPP of 0.9837 or higher in either tree left 68/3012 (2.2%) unresolved codons. Of these 68, 42 were choices between synonymous codons and 26 were choices between non-synonymous codons. Covariance networks were used to resolve ambiguities.

EXAMPLE 2



[0113] Covarying positions. Of the 68 unresolved codons, 4 were determined based on dependence with resolved positions in the genome (H77 positions 1157, 1611, 2120, and 6554). All four of the positions (1157, 1611, 2120, and 6554) led to synonymous changes. Positions 1611 and 6554 were linked to multiple sites across the genome (50 and 3 respectively) whereas positions 1157 and 2120 were linked to one other resolved position. Because the covariance was only detected statistically, biological interaction is a question for further research.
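The covariance test behind these results can be sketched in Python. Equation 1 is not reproduced in this text, so the sketch assumes the standard Pearson form, chi-squared = sum over base pairs of (o_ij - e_ij)^2 / e_ij, with expected counts taken from the marginal base frequencies of the two alignment columns. The Holm-Bonferroni step operates on p-values that would be obtained from the chi-squared statistics with a statistics library, which is omitted here. All function names and toy inputs are hypothetical.

```python
from collections import Counter


def chi_squared(col_i, col_j):
    """Pearson chi-squared for two alignment columns (assumed form of
    equation 1): observed pair counts vs. counts expected under independence."""
    n = len(col_i)
    observed = Counter(zip(col_i, col_j))
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    stat = 0.0
    for a in freq_i:
        for b in freq_j:
            expected = freq_i[a] * freq_j[b] / n
            stat += (observed.get((a, b), 0) - expected) ** 2 / expected
    return stat


def holm_bonferroni(p_values, alpha=0.05):
    """Holm step-down procedure: returns a reject flag for each p-value."""
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    reject = [False] * m
    for rank, k in enumerate(order):
        if p_values[k] <= alpha / (m - rank):
            reject[k] = True
        else:
            break  # all larger p-values are also retained
    return reject


# Toy columns: perfectly covarying bases across 8 sequences
print(chi_squared("AAAAGGGG", "CCCCTTTT"))
print(holm_bonferroni([0.01, 0.04, 0.03]))
```

Comparing observed with expected counts for the pair of interest then gives the sign of the interaction (o_ij > e_ij or o_ij < e_ij) used to select or eliminate codons.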

EXAMPLE 3



[0114] Representative characteristics of Bole1a. Once a complete representative sequence for Bole1a was determined, it was desired to ensure that Bole1a represents any set of nucleotide or protein HCV subtype 1a sequences and not just the sequences from which it was reconstructed. To confirm this, two additional datasets were used. The first dataset was from a paper by Yusim et al. (J.Gen.Virol. 91:1194-1206 (2010)) and was collected from the Los Alamos HCV database. This dataset contains 143 sequences, 136 of which are present in the original dataset; however, the authors of that report curated the dataset to avoid resampling linked sequences. This dataset is referred to as the Yusim dataset. The second dataset, which is referred to as the E1E2 dataset, contains 214 E1E2 sequences; these were obtained from our ongoing BBAASH cohort. The sequences in the latter dataset are unrelated to any full-length sequences in GenBank or from the LANL database. Neighbor joining trees showed that Bole1a consistently branches from the center, suggesting that it is representative of both the Yusim and E1E2 datasets (Fig. 1).

[0115] Based on full-genome pairwise comparison, Bole1a has greater similarity to subtype 1a sequences than any other sequence in the original dataset (average and median reduction in non-synonymous distance of 39% and 44%, respectively, Fig. 2a). In sliding windows of 20 codons (approximating the upper limit on the size of T cell epitopes) spanning the genome, the similarity of Bole1a was greater than 98% overall (mean and median similarity 98.4% and 98.9%, respectively). Not surprisingly, the lowest similarity between Bole1a and subtype 1a circulating genomes was in hypervariable region 1 (HVR1), where similarity was as low as 73% (similarity among subtype 1a isolates at the same position was 64%). Comparison of the Bole1a sequence to the consensus sequence of the original dataset, H77, HCV-1 and a 1b sequence demonstrates the high variability in HVR1 (as shown by an asterisk, Fig. 2b).
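A sliding-window similarity scan of the kind described can be sketched as below. The sketch uses plain percent identity over aligned, gap-free sequences with one symbol per codon; this is an assumption made for brevity and is not the Nei and Gojobori distance computed by VarPlot. The function name and toy strings are hypothetical.

```python
def window_similarity(candidate, others, window=20, step=1):
    """Mean identity between `candidate` and each sequence in `others`
    for every window of `window` symbols, advancing by `step`.
    Sequences are assumed aligned, gap-free and of equal length."""
    similarities = []
    for start in range(0, len(candidate) - window + 1, step):
        ref = candidate[start:start + window]
        matches = sum(
            sum(a == b for a, b in zip(ref, seq[start:start + window]))
            for seq in others
        )
        similarities.append(matches / (window * len(others)))
    return similarities


# Toy example with a window of 2 symbols instead of 20 codons
print(window_similarity("ABCDE", ["ABCDE", "ABCXX"], window=2))
```

The minimum of the returned list locates the least similar window, which for the real data corresponds to HVR1.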

[0116] All 9-mers of the Bole1a amino acid sequence were compared to sequences in the Yusim dataset to represent potential epitope coverage. The use of 9-mers for this comparison is based on the typical MHC class I-restricted epitope length recognized by effector CD8+ T cells that are a crucial component of immune control in spontaneous clearance of HCV infection. Bole1a provides 78% exact-match 9-mer coverage for the HCV polyprotein whether compared to the Yusim dataset or the original dataset (data not shown, method previously described in Yusim). In the highly diverse E1 and E2 regions of the E1E2 dataset, Bole1a provides 58% exact-match 9-mer coverage. The Yusim dataset was then compared by individual proteins, which confirmed that highly heterogeneous regions such as E1 and E2 have lower coverage by Bole1a than more conserved regions such as core and NS4B. In all cases, Bole1a provided greater 9-mer coverage than the reference sequence H77. Bole1a matched 99% of all 9-mers on full genome comparison when a mismatch at 1 or 2 positions was allowed. In summary, it was found that Bole1a matched 95% of all modal (most frequently observed) 9-mers at each position of the genome, whereas individual sequences in the Yusim dataset had a median modal 9-mer coverage of 80% (Fig. 3a).
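The 9-mer coverage comparison can be illustrated with a small Python sketch. It assumes aligned, gap-free amino acid sequences of equal length and counts position-specific k-mers; the coverage definition used by Yusim et al. may differ in detail, so this should be read as one plausible reading rather than the published method. The function name and toy sequences are hypothetical.

```python
def kmer_coverage(candidate, dataset, k=9, max_mismatch=0):
    """Fraction of position-specific k-mers in `dataset` that `candidate`
    matches with at most `max_mismatch` differing residues.
    Assumes all sequences are aligned, gap-free and the same length."""
    matched = 0
    total = 0
    for seq in dataset:
        for i in range(len(seq) - k + 1):
            mismatches = sum(
                a != b for a, b in zip(candidate[i:i + k], seq[i:i + k])
            )
            if mismatches <= max_mismatch:
                matched += 1
            total += 1
    return matched / total


# Toy sequences of 10 residues give two 9-mers each
toy_dataset = ["ABCDEFGHIJ", "ABCDEFGHIX"]
print(kmer_coverage("ABCDEFGHIJ", toy_dataset))
print(kmer_coverage("ABCDEFGHIJ", toy_dataset, max_mismatch=1))
```

Raising `max_mismatch` to 1 or 2 corresponds to the relaxed comparison reported above.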

EXAMPLE 4



[0117] The obvious limitation of comparing 9-mer coverage is that not all 9-mers are recognized as T cell epitopes. To focus on epitope coverage, all known subtype 1a T cell epitopes (both CD4 and CD8) associated with a positive result in at least one assay were selected from the Immune Epitope Database (www.immuneepitope.org/). Of the 548 epitopes in the database, only 338 were present in at least half of the sequences of the Yusim dataset (excluding AF271632 and AX100563 due to their linkage with HCV-1 and H77, respectively). Bole1a had the highest (100%) coverage of these 338 epitopes (Fig. 3b). HCV-1 and H77, which are commonly used as antigens in HCV immunology, only matched 317 and 316 (∼93%) of these 338 epitopes, respectively. Figure 3b shows the distribution of epitope coverage for all sequences in the Yusim dataset. When epitopes that were present in 80% of the sequences were included, Bole1a provided 94% coverage while H77 and HCV-1 provided 87% and 91% coverage, respectively (data not shown). It should be noted that because HCV-1 and H77 have been the primary isolates used as antigens in many of the studies from which these epitopes are derived, their coverage may be due in part to analytical bias. Lastly, interferon gamma ELISpot analysis of variant epitopes demonstrated that variants in the Bole1a genome are more consistently immunogenic than other sequences, including H77 and a simple consensus (data not shown).
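The epitope filtering and coverage calculation can be sketched in Python as follows. The sketch treats an epitope as covered when its amino acid string occurs exactly anywhere in a sequence, which is an assumption; the function names are hypothetical, and the toy sequences and epitope list are illustrative rather than drawn from the Immune Epitope Database.

```python
def common_epitopes(epitopes, dataset, min_fraction=0.5):
    """Keep epitopes found (exact substring) in at least `min_fraction`
    of the dataset sequences."""
    kept = []
    for epitope in epitopes:
        hits = sum(epitope in seq for seq in dataset)
        if hits / len(dataset) >= min_fraction:
            kept.append(epitope)
    return kept


def epitope_coverage(candidate, epitopes):
    """Fraction of `epitopes` present as exact substrings of `candidate`."""
    return sum(epitope in candidate for epitope in epitopes) / len(epitopes)


# Toy data only
toy_dataset = ["AAKLVALGINAVBB", "CCKLVALGINAVDD", "EEQQQQ"]
toy_epitopes = ["KLVALGINAV", "QQQQ", "ZZZ"]
kept = common_epitopes(toy_epitopes, toy_dataset)
print(kept)
print(epitope_coverage("XXKLVALGINAVXX", kept))
```

Changing `min_fraction` from 0.5 to 0.8 gives the stricter epitope set mentioned in the paragraph above.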

EXAMPLE 5



[0118] Bole1a pseudoparticle. Approximately 75% of individual E1E2 isolates tested have low infectivity (less than 5 standard deviations above background) when used to pseudotype lentiviral particles (Fig. 4). The diversity of the envelope genes is extremely high, with an average non-synonymous diversity of 36% in 20-codon windows (Fig. 2). As a result of this diversity and our methods, Bole1a has a unique HVR1 sequence (ETHVTGGSAARATAGFAGLFTPGAKQN) (SEQ ID NO: 3) among the genomes we have examined, and searches for this peptide sequence using BLAST (blast.ncbi.nlm.nih.gov) and HMMER (hmmer.janelia.org) did not reveal any identical sequences. To test functionality despite these negative predictors for infectivity, the E1E2 sequence of Bole1a was used to construct a lentiviral pseudoparticle. Surprisingly, HCVpp-Bole1a infected Hep3B target cells with high efficiency, comparable to that of the highly infectious and well-characterized isolate HCVpp-H77 (Fig. 4). Blocking the Bole1a HCVpp with anti-CD81 antibody led to at least a 10-fold reduction in infectivity (p=0.0008), whereas the isotype control antibody did not change infectivity (p=0.85; Fig. 4). RLU values below the threshold (Fig. 4) are found to be reproducibly low with high variance. Although the comparison panel of subtype 1a HCVpp excluded those that contained stop codons, frameshift mutations, or other obvious defects, it is evident that there are other biological or artifactual characteristics that render many of those clones less infectious. Importantly, the goal of this experiment was to determine whether Bole1a E1E2 would be functional at all in spite of its synthetic origin and the high variability of HCV envelope; that this E1E2 was infectious in the HCVpp system was highly unexpected. Additionally, the Bole1a HCVpp was readily neutralized by human sera. It was observed that 30% of BBAASH subjects inhibited at least 85% of entry and 90% of subjects from the BBAASH cohort (36 out of 40) inhibited at least 50% of Bole1a HCVpp entry, whereas normal human serum and pooled HCV-seronegative sera were non-neutralizing.
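The infectivity cutoff of 5 standard deviations above background can be expressed as a short Python sketch. It assumes that background is a set of mock-infected RLU readings and that the sample standard deviation is used; the text does not specify either choice. The function names and the numbers are hypothetical.

```python
from statistics import mean, stdev


def infectivity_threshold(background_rlus, n_sd=5):
    """Mean background RLU plus `n_sd` sample standard deviations."""
    return mean(background_rlus) + n_sd * stdev(background_rlus)


def is_infectious(rlu, background_rlus, n_sd=5):
    """True if the reading exceeds the background-derived threshold."""
    return rlu > infectivity_threshold(background_rlus, n_sd)


# Illustrative mock-well readings, not data from the study
mock = [100, 110, 90]
print(infectivity_threshold(mock))
print(is_infectious(200, mock), is_infectious(140, mock))
```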

[0119] This proof-of-concept study demonstrates that the Bole1a envelope E1E2 heterodimer is able to fold and assemble correctly. Because HCV E1 and E2 genes are critical for host cell entry, they are also important targets for antibody-mediated virus neutralization. Because it lacks evident immunologically driven escape mutations, Bole1a may represent the ideal platform to study determinants of HCV fitness.

EXAMPLE 6



[0120] Preliminary analyses have shown that epitopes from Bole1a are the most immunogenic of any isolate tested. In those cases where Bole1a epitopes differed from the traditional consensus (2 out of 15 tested), T cells from chronically infected patients recognized Bole1a epitopes better than the corresponding epitopes from circulating and consensus sequences (data not shown). Since Bole1a is representative of circulating strains, it is unlikely to contain escape mutations that hinder viral fitness. For example, the Bole1a sequence has a Y at position 1444, whereas an F at that position is believed to be an escape mutation causing the NS3 1436-1444 epitope to elicit a less robust T cell response. Additionally, Bole1a contains the KLVALGINAV (SEQ ID NO: 4) sequence at NS3 1406-1416. Three variants of this epitope have been shown to have diminished T cell response without a change in MHC binding ability, making escape the most likely explanation for these variants.

EXAMPLE 7



[0121] Using the above described methods, two additional synthetic HCV genome polynucleotide sequences were prepared. The sequences are for a second HCV subtype 1a (SEQ ID NO: 5) and its resolved amino acid sequence (SEQ ID NO: 6), and HCV subtype 1b (SEQ ID NO: 7) and its resolved amino acid sequence (SEQ ID NO: 8).

[0122] The use of the terms "a" and "an" and "the" and similar referents in the context of describing the invention (especially in the context of the following claims) are to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context. The terms "comprising," "having," "including," and "containing" are to be construed as open-ended terms (i.e., meaning "including, but not limited to,") unless otherwise noted. Recitation of ranges of values herein are merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., "such as") provided herein, is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the invention.

[0123] Preferred embodiments of this invention are described herein, including the best mode known to the inventors for carrying out the invention. Variations of those preferred embodiments may become apparent to those of ordinary skill in the art upon reading the foregoing description. The inventors expect skilled artisans to employ such variations as appropriate, and the inventors intend for the invention to be practiced otherwise than as specifically described herein. Accordingly, this invention includes all modifications and equivalents of the subject matter recited in the claims appended hereto as permitted by applicable law.

SEQUENCE LISTING



[0124] 

<210> 1

<211> 9607
<212> DNA
<213> synthetic

<400> 1













<210> 2
<211> 3011
<212> PRT
<213> synthetic

<400> 2





























<210> 3
<211> 27
<212> PRT
<213> synthetic

<400> 3



<210> 4
<211> 10
<212> PRT
<213> synthetic

<400> 4

<210> 5
<211> 9607
<212> DNA
<213> synthetic

<400> 5













<210> 6
<211> 3011
<212> PRT
<213> synthetic

<400> 6































<210> 7
<211> 9569
<212> DNA
<213> synthetic

<400> 7













<210> 8
<211> 3010
<212> PRT
<213> synthetic

<400> 8
































Claims

1. A nucleic acid molecule encoding the genome of a synthetic hepatitis C virus subtype 1a (Bole1a) comprising the nucleotide sequence of SEQ ID NO: 1 or the complement thereof.
 
2. An isolated host cell comprising the isolated nucleic acid molecule of claim 1.
 
3. The isolated host cell of claim 2, wherein the host cell is a mammalian cell.
 
4. An isolated nucleic acid molecule that specifically hybridizes to the nucleotide sequence of synthetic hepatitis C virus subtype 1a (Bole1a) of SEQ ID NO: 1 or to the complement thereof.
 
5. A nucleic acid molecule according to claim 4, which is an oligonucleotide primer between about 10 and about 30 nucleotides in length.
 
6. A pair of oligonucleotide primers for PCR, wherein the first primer is an isolated nucleic acid molecule between about 10 and about 30 nucleotides in length that specifically hybridizes to the nucleotide sequence of synthetic hepatitis C virus subtype 1a (Bole1a) of SEQ ID NO: 1 and the second primer is an isolated nucleic acid molecule between about 10 and about 30 nucleotides in length that specifically hybridizes to the complement of the nucleotide sequence of synthetic hepatitis C virus subtype 1a (Bole1a) of SEQ ID NO: 1.
 
7. An isolated polypeptide encoded by nucleic acid according to claim 1.
 
8. The isolated polypeptide of claim 7, comprising the amino acid sequence of SEQ ID NO: 2.
 
9. A viral particle comprising:

a) a polynucleotide encoding the E1E2 region of SEQ ID NO:1 comprising the HVR1 sequence of SEQ ID NO: 3; and

b) a reporter element.


 
10. The viral particle of claim 9, wherein the reporter element comprises the luciferase polyprotein or a functional portion thereof.
 
11. A method of preparing a synthetic HCV virus polynucleotide comprising:

a) obtaining two or more HCV polynucleotide sequences;

b) aligning the polynucleotide sequences using an appropriate alignment program;

c) preparing two or more phylogenetic trees from the alignment in b) using a Bayesian or Maximum Likelihood method applied to two phylogenetically informative regions of the alignment for a sufficient number of iterations to confirm convergence of parameters for phylogenetic trees;

d) using both phylogenetic trees and inferring ancestral sequences for the rest of the HCV genome, wherein the program used for estimation must infer the ancestral sequence as a probability distribution for each position, generating a probability for each base;

e) inferring the final representative sequence in the following manner (methods I & II):

for each nucleotide position i in the genome, if both trees agree on the maximum posterior probability (MPP) residue, the probability of that position p_i is selected to be the greater of the two MPPs, and these positions are defined as concordant;

for each discordant position (where the MPP residue does not agree), either (method I) go directly to step d or (method II) calculate the joint probability of the codon k containing the discordant position based on both trees;

for concordant residues within such codons, the p_i calculated in the previous step is used in calculating the joint probability;

the codon with the higher joint MPP from the two trees is selected to represent that codon position;

f) determining a stringent threshold for codon/nucleotide MPP, wherein the inflection in the distribution of codon/nucleotide MPPs at which the variance in second derivative is less than 10^-6 for MPP values is used as a threshold for resolving a codon/nucleotide, wherein each codon/nucleotide with an MPP greater than or equal to the threshold based on either tree is accepted as ancestral and its constituent positions are defined as resolved;

g) using covariance analysis to examine still-unresolved positions, wherein the observed and expected frequencies of pairs of bases is determined and the chi-squared metric is calculated as shown in equation 1 and adjusted for multiple comparisons using the Holm-Bonferroni method at α = 0.05:

h) using the adjusted chi-squared metric, all resolved positions j that significantly covaried with unresolved positions i are identified; where in case of a positive interaction (o_ij > e_ij), the MPP codon/nucleotide containing the positively interacting residue is selected, for negative interactions (o_ij < e_ij), all codon/nucleotide with the negatively interacting base are eliminated and the MPP codon from the remaining is selected, and

i) synthesizing the synthetic HCV polynucleotide or a fragment or portion thereof.


 


Ansprüche

1. Ein Nukleinsäuremolekül, das das Genom eines synthetischen Hepatitis-C-Virus des Subtyps 1a (Bole1a) kodiert, beinhaltend die Nukleotidsequenz von SEQ ID NO: 1 oder das Komplement davon.
 
2. Eine isolierte Wirtszelle, beinhaltend das isolierte Nukleinsäuremolekül gemäß Anspruch 1.
 
3. Isolierte Wirtszelle gemäß Anspruch 2, wobei die Wirtszelle eine Säugetierzelle ist.
 
4. Ein isoliertes Nukleinsäuremolekül, das spezifisch an die Nukleotidsequenz des synthetischen Hepatitis-C-Virus des Subtyps 1a (Bole1a) von SEQ ID NO: 1 oder an das Komplement davon hybridisiert.
 
5. Nukleinsäuremolekül gemäß Anspruch 4, das ein Oligonukleotidprimer mit einer Länge von zwischen ungefähr 10 und ungefähr 30 Nukleotiden ist.
 
6. Ein Paar von Oligonukleotidprimern für die PCR, wobei der erste Primer ein isoliertes Nukleinsäuremolekül mit einer Länge von zwischen ungefähr 10 und ungefähr 30 Nukleotiden ist, das spezifisch an die Nukleotidsequenz des synthetischen Hepatitis-C-Virus des Subtyps 1a (Bole1a) von SEQ ID NO: 1 hybridisiert und
der zweite Primer ein isoliertes Nukleinsäuremolekül mit einer Länge von zwischen ungefähr 10 und ungefähr 30 Nukleotiden ist, das spezifisch an die Nukleotidsequenz des synthetischen Hepatitis-C-Virus des Subtyps 1a (Bole1a) von SEQ ID NO: 1 hybridisiert.
 
7. Ein isoliertes Polypeptid, das durch die Nukleinsäure gemäß Anspruch 1 kodiert ist.
 
8. Isoliertes Polypeptid gemäß Anspruch 7, das die Aminosäuresequenz von SEQ ID NO: 2 beinhaltet.
 
9. Ein Viruspartikel, das Folgendes beinhaltet:

a) ein Polynukleotid, das die E1E2-Region von SEQ ID NO: 1, beinhaltend die HVR1-Sequenz von SEQ ID NO: 3, kodiert; und

b) ein Reporterelement.


 
10. Viruspartikel gemäß Anspruch 9, wobei das Reporterelement das Luciferasepolyprotein oder einen funktionellen Bestandteil davon beinhaltet.
 
11. A method for making a synthetic HCV virus polynucleotide, comprising:

a) obtaining two or more HCV polynucleotide sequences;

b) aligning the polynucleotide sequences using a suitable alignment program;

c) preparing two or more phylogenetic trees from the alignment in b) using a Bayesian or maximum likelihood method applied to two phylogenetically informative regions of the alignment, for a sufficient number of iterations to confirm convergence of the parameters for the phylogenetic trees;

d) using the two phylogenetic trees and inferring ancestral sequences for the rest of the HCV genome, wherein the program used for the estimation must infer the ancestral sequence as a probability distribution for each position, generating a probability for each base;

e) inferring the final representative sequence in the following manner (Methods I and II):

for each nucleotide position i in the genome, if the maximum a posteriori probability (MAP) residue agrees on both trees, the probability of that position p_i is selected as the greater of the two MAPs, and these positions are defined as concordant;

for each discordant position (where the MAP residue does not agree), either (Method I) go directly to step d or (Method II) calculate the joint probability of the codon k containing the discordant position based on both trees;

for concordant residues within such codons, the p_i calculated in the previous steps is used in calculating the joint probability;

the codon with the greater joint MAP from the two trees is selected to represent that codon position;

f) determining a stringent threshold for the codon/nucleotide MAP, wherein the inflection in the distribution of codon/nucleotide MAPs at which the variance in the second derivative is less than 10⁻⁶ for MAP values is used as a threshold for resolving a codon/nucleotide, wherein each codon/nucleotide with a MAP greater than or equal to the threshold based on either tree is accepted as ancestral and its component positions are defined as resolved;

g) using a covariance analysis to examine positions that are still unresolved, wherein the observed and expected frequencies of base pairs are determined and the chi-square metric is calculated as shown in Equation 1 and adjusted for multiple comparisons using the Holm-Bonferroni method at α = 0.05:

h) using the adjusted chi-square metric, wherein all resolved positions j that covaried significantly with the unresolved positions i are identified; wherein, in the case of a positive interaction (o_ij > e_ij), the MAP codon/nucleotide containing the positively interacting residue is selected, and for negative interactions (o_ij < e_ij), all codons/nucleotides with the negatively interacting base are eliminated and the MAP codon is selected from those remaining; and

i) synthesizing the synthetic HCV polynucleotide or a fragment or portion thereof.
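Step e) of claim 11 can be illustrated with a short sketch. The code below is only one possible reading of Method II for a single codon, not the patented implementation: the function names `map_residue` and `resolve_codon`, the dictionary representation of the per-position probability distributions, and the example probabilities are all assumptions introduced for illustration.

```python
def map_residue(dist):
    """Return (base, probability) for the most probable base at one position."""
    base = max(dist, key=dist.get)
    return base, dist[base]


def resolve_codon(codon_a, codon_b):
    """Pick the codon with the greater joint MAP from two trees.

    codon_a, codon_b: lists of three dicts (base -> posterior probability),
    one list per tree, covering the same codon (hypothetical data layout).
    """
    joint_a = joint_b = 1.0
    seq_a = seq_b = ""
    concordant = []
    for dist_a, dist_b in zip(codon_a, codon_b):
        base_a, p_a = map_residue(dist_a)
        base_b, p_b = map_residue(dist_b)
        if base_a == base_b:
            # Concordant position: p_i is the greater of the two MAPs,
            # and that same p_i enters both joint probabilities.
            p_i = max(p_a, p_b)
            joint_a *= p_i
            joint_b *= p_i
            concordant.append(True)
        else:
            # Discordant position: each tree contributes its own MAP.
            joint_a *= p_a
            joint_b *= p_b
            concordant.append(False)
        seq_a += base_a
        seq_b += base_b
    if joint_a >= joint_b:
        return seq_a, joint_a, concordant
    return seq_b, joint_b, concordant


# Illustrative, made-up distributions for one codon on two trees.
tree_a = [{"A": 0.9, "G": 0.1}, {"C": 0.6, "T": 0.4}, {"G": 0.8, "A": 0.2}]
tree_b = [{"A": 0.95, "G": 0.05}, {"T": 0.7, "C": 0.3}, {"G": 0.7, "A": 0.3}]
codon, joint, flags = resolve_codon(tree_a, tree_b)
print(codon, round(joint, 3), flags)
```

In this made-up example the first and third positions are concordant and the second is discordant, so the codon from the second tree is selected because its joint MAP is greater.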


 


Claims

1. A nucleic acid molecule encoding the genome of a synthetic hepatitis C virus subtype 1a (Bole1a), comprising the nucleotide sequence of SEQ ID NO: 1 or the complement thereof.
 
2. An isolated host cell comprising the isolated nucleic acid molecule of claim 1.
 
3. The isolated host cell of claim 2, wherein the host cell is a mammalian cell.
 
4. An isolated nucleic acid molecule that hybridizes specifically to the nucleotide sequence of the synthetic hepatitis C virus subtype 1a (Bole1a) of SEQ ID NO: 1 or to the complement thereof.
 
5. The nucleic acid molecule of claim 4, which is an oligonucleotide primer between about 10 and about 30 nucleotides in length.
 
6. A pair of oligonucleotide primers for PCR, wherein the first primer is an isolated nucleic acid molecule between about 10 and about 30 nucleotides in length that hybridizes specifically to the nucleotide sequence of the synthetic hepatitis C virus subtype 1a (Bole1a) of SEQ ID NO: 1, and the second primer is an isolated nucleic acid molecule between about 10 and about 30 nucleotides in length that hybridizes specifically to the complement of the nucleotide sequence of the synthetic hepatitis C virus subtype 1a (Bole1a) of SEQ ID NO: 1.
 
7. An isolated polypeptide encoded by the nucleic acid of claim 1.
 
8. The isolated polypeptide of claim 7, comprising the amino acid sequence of SEQ ID NO: 2.
 
9. A virus particle comprising:

a) a polynucleotide encoding the E1E2 region of SEQ ID NO: 1, comprising the HVR1 sequence of SEQ ID NO: 3; and

b) a reporter element.


 
10. The virus particle of claim 9, wherein the reporter element comprises the luciferase polyprotein or a functional portion thereof.
 
11. A method for making a synthetic HCV virus polynucleotide, comprising:

a) obtaining two or more HCV polynucleotide sequences;

b) aligning the polynucleotide sequences using a suitable alignment program;

c) preparing two or more phylogenetic trees from the alignment in b) using a Bayesian or maximum likelihood method applied to two phylogenetically informative regions of the alignment, for a sufficient number of iterations to confirm convergence of the parameters for the phylogenetic trees;

d) using the two phylogenetic trees and inferring ancestral sequences for the rest of the HCV genome, wherein the program used for the estimation must infer the ancestral sequence as a probability distribution for each position, generating a probability for each base;

e) inferring the final representative sequence in the following manner (Methods I & II):

for each nucleotide position i in the genome, if the two trees agree on the maximum a posteriori probability (MAP) residue, the probability of that position p_i is selected as the greater of the two MAPs, and these positions are defined as concordant;

for each discordant position (where the MAP residue does not agree), either (Method I) go directly to step d or (Method II) calculate the joint probability of the codon k containing the discordant position based on both trees;

for concordant residues within such codons, the p_i calculated in the previous step is used in calculating the joint probability; the codon with the greater joint MAP from the two trees is selected to represent that codon position;

f) determining a stringent threshold for the codon/nucleotide MAP, wherein the inflection in the distribution of codon/nucleotide MAPs at which the variance of the second derivative is less than 10⁻⁶ for MAP values is used as a threshold for resolving a codon/nucleotide, wherein each codon/nucleotide with a MAP greater than or equal to the threshold based on either tree is accepted as ancestral and its component positions are defined as resolved;

g) using a covariance analysis to examine positions that are still unresolved, wherein the observed and expected frequencies of base pairs are determined and the chi-square metric is calculated as shown in Equation 1 and adjusted for multiple comparisons using the Holm-Bonferroni method at α = 0.05:

h) using the adjusted chi-square metric, all resolved positions j that covaried significantly with unresolved positions i are identified; wherein, in the case of a positive interaction (o_ij > e_ij), the MAP codon/nucleotide containing the positively interacting residue is selected, and for negative interactions (o_ij < e_ij), all codons/nucleotides with the negatively interacting base are eliminated and the MAP codon is selected from those remaining; and

i) synthesizing the synthetic HCV polynucleotide or a fragment or portion thereof.
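Steps g) and h) of claim 11 can be illustrated with a minimal sketch. Equation 1 is not reproduced in this section, so the code assumes the standard Pearson form, the sum of (o_ij − e_ij)² / e_ij over all base pairs, with expected counts derived from the marginal base frequencies of the two alignment columns. The function names `pair_chi_square` and `holm_bonferroni`, the example columns, and the example p-values are hypothetical; converting a chi-square value to a p-value is left out.

```python
from collections import Counter


def pair_chi_square(col_i, col_j):
    """Pearson chi-square for base pairs observed at two alignment columns.

    col_i, col_j: equal-length sequences of bases, one entry per sequence
    in the alignment (assumed layout).
    """
    n = len(col_i)
    observed = Counter(zip(col_i, col_j))
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    chi2 = 0.0
    for a, count_a in freq_i.items():
        for b, count_b in freq_j.items():
            expected = count_a * count_b / n   # e_ij under independence
            obs = observed.get((a, b), 0)      # o_ij
            chi2 += (obs - expected) ** 2 / expected
    return chi2


def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans, True where the test stays significant."""
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # Step-down rule: compare the k-th smallest p-value to alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all larger p-values are also retained
    return reject


# Perfectly covarying columns versus independent columns (made-up data).
linked = pair_chi_square("AAGG", "CCTT")
independent = pair_chi_square("AAGG", "CTCT")
significant = holm_bonferroni([0.01, 0.04, 0.03, 0.005])
print(linked, independent, significant)
```

Position pairs flagged as significant would then be inspected for the sign of the interaction (o_ij greater or less than e_ij) as described in step h).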


 




Cited references

REFERENCES CITED IN THE DESCRIPTION



Non-patent literature cited in the description