NON-ENDOGENOUS, CONSTITUTIVELY ACTIVATED HUMAN G PROTEIN-COUPLED RECEPTORS

(11)	EP 1 121 431 B9

(12)	CORRECTED EUROPEAN PATENT SPECIFICATION
	Note: Bibliography reflects the latest situation

(15)	Correction information:
	Corrected version no 2 (W2 B1)
	Corrections, see Bibliography

(48)	Corrigendum issued on:
	24.10.2007 Bulletin 2007/43

(45)	Mention of the grant of the patent:
	22.02.2006 Bulletin 2006/08

(21)	Application number: 99951991.1

(22)	Date of filing: 12.10.1999

(51)	International Patent Classification (IPC):
	C12N 15/12 (2006.01)	G01N 33/50 (2006.01)
	C07K 14/72 (2006.01)	G01N 33/566 (2006.01)

(86)	International application number:
	PCT/US1999/023938

(87)	International publication number:
	WO 2000/022129 (20.04.2000 Gazette 2000/16)

(54)	NON-ENDOGENOUS, CONSTITUTIVELY ACTIVATED HUMAN G PROTEIN-COUPLED RECEPTORS
	NICHT-ENDOGENE, KONSTITUTIV AKTIVIERTE, MENSCHLICHE, AN EIN G-PROTEIN GEKOPPELTE REZEPTOREN
	RECEPTEURS COUPLES A LA PROTEINE G HUMAINE NON ENDOGENES ET ACTIVES DE FAÇON CONSTITUTIVE

(84)	Designated Contracting States:
	AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

(30)	Priority: 13.10.1998 US 170496

(43)	Date of publication of application:
	08.08.2001 Bulletin 2001/32

(73)	Proprietor: Arena Pharmaceuticals, Inc.
	San Diego, CA 92121 (US)

(72)	Inventors:
	BEHAN, Dominic, P. San Diego, CA 92131 (US)
	CHALMERS, Derek, T. Solana Beach, CA 92075 (US)
	LIAW, Chen, W. San Diego, CA 92129 (US)

(74)	Representative: Hallybone, Huw George et al
	Carpmaels and Ransford, 43 Bloomsbury Square London WC1A 2RA (GB)

(56)	References cited:
	WO-A-97/21731
	WO-A-98/38217

KJELSBERG M. A. ET AL.: "CONSTITUTIVE ACTIVATION OF THE ALPHA1B-ADRENERGIC RECEPTOR BY ALL AMINO ACID SUBSTITUTIONS AT A SINGLE SITE" JOURNAL OF BIOLOGICAL CHEMISTRY, vol. 267, no. 3, 25 January 1992 (1992-01-25), pages 1430-1433, XP002911764 ISSN: 0021-9258
SCHEER A. ET AL.: "CONSTITUTIVELY ACTIVE G PROTEIN-COUPLED RECEPTORS: POTENTIAL MECHANISMS OF RECEPTOR ACTIVATION" JOURNAL OF RECEPTOR AND SIGNAL TRANSDUCTION RESEARCH, vol. 17, no. 1/03, 1997, pages 57-73, XP000867531 ISSN: 1079-9893
PAUWELS P. J. ET AL.: "REVIEW: AMINO ACID DOMAINS INVOLVED IN CONSTITUTIVE ACTIVATION OF G-PROTEIN-COUPLED RECEPTORS" MOLECULAR NEUROBIOLOGY, vol. 17, no. 1/03, 1998, pages 109-135, XP000866477 ISSN: 0893-7648


	Remarks:
	The sequence listing, which is published as annex to the application documents, was filed after the date of filing. The applicant has declared that it does not include matter which goes beyond the content of the application as filed.

Note: Within nine months from the publication of the mention of the grant of the European patent, any person may give notice to the European Patent Office of opposition to the European patent granted. Notice of opposition shall be filed in a written reasoned statement. It shall not be deemed to have been filed until the opposition fee has been paid. (Art. 99(1) European Patent Convention).

Description

FIELD OF THE INVENTION

[0001] The invention disclosed in this patent document relates to transmembrane receptors, and more particularly to human G protein-coupled receptors (GPCRs) which have been altered such that the altered GPCRs are constitutively activated. Most preferably, the altered human GPCRs are used for the screening of therapeutic compounds.

BACKGROUND OF THE INVENTION

[0002] Although a number of receptor classes exist in humans, by far the most abundant and therapeutically relevant is represented by the G protein-coupled receptor (GPCR or GPCRs) class. It is estimated that there are some 100,000 genes within the human genome, and of these, approximately 2% or 2,000 genes, are estimated to code for GPCRs. Of these, there are approximately 100 GPCRs for which the endogenous ligand that binds to the GPCR has been identified. Because of the significant time-lag that exists between the discovery of an endogenous GPCR and its endogenous ligand, it can be presumed that the remaining 1,900 GPCRs will be identified and characterized long before the endogenous ligands for these receptors are identified. Indeed, the rapidity by which the Human Genome Project is sequencing the 100,000 human genes indicates that the remaining human GPCRs will be fully sequenced within the next few years. Nevertheless, and despite the efforts to sequence the human genome, it is still very unclear as to how scientists will be able to rapidly, effectively and efficiently exploit this information to improve and enhance the human condition. The present invention is geared towards this important objective.

[0003] Receptors, including GPCRs, for which the endogenous ligand has been identified are referred to as "known" receptors, while receptors for which the endogenous ligand has not been identified are referred to as "orphan" receptors. This distinction is not merely semantic, particularly in the case of GPCRs. GPCRs represent an important area for the development of pharmaceutical products: from approximately 20 of the 100 known GPCRs, 60% of all prescription pharmaceuticals have been developed. Thus, the orphan GPCRs are to the pharmaceutical industry what gold was to California in the late 19th century - an opportunity to drive growth, expansion, enhancement and development. A serious drawback exists, however, with orphan receptors relative to the discovery of novel therapeutics. This is because the traditional approach to the discovery and development of pharmaceuticals has required access to both the receptor and its endogenous ligand. Thus, heretofore, orphan GPCRs have presented the art with a tantalizing and undeveloped resource for the discovery of pharmaceuticals.

[0004] Under the traditional approach to the discovery of potential therapeutics, it is generally the case that the receptor is first identified. Before drug discovery efforts can be initiated, elaborate, time consuming and expensive procedures are typically put into place in order to identify, isolate and generate the receptor's endogenous ligand - this process can require from three to ten years per receptor, at a cost of about $5 million (U.S.) per receptor. These time and financial resources must be expended before the traditional approach to drug discovery can commence. This is because traditional drug discovery techniques rely upon so-called "competitive binding assays" whereby putative therapeutic agents are "screened" against the receptor in an effort to discover compounds that either block the endogenous ligand from binding to the receptor ("antagonists"), or enhance or mimic the effects of the ligand binding to the receptor ("agonists"). The overall objective is to identify compounds that prevent cellular activation when the ligand binds to the receptor (the antagonists), or that enhance or increase cellular activity that would otherwise occur if the ligand was properly binding with the receptor (the agonists). Because the endogenous ligands for orphan GPCRs are by definition not identified, the ability to discover novel and unique therapeutics to these receptors using traditional drug discovery techniques is not possible. The present invention, as will be set forth in greater detail below, overcomes these and other severe limitations created by such traditional drug discovery techniques.

[0005] GPCRs share a common structural motif. All these receptors have seven sequences of 22 to 24 hydrophobic amino acids that form seven alpha helices, each of which spans the membrane (each span is identified by number, i.e., transmembrane-1 (TM-1), transmembrane-2 (TM-2), etc.). The transmembrane helices are joined by strands of amino acids between transmembrane-2 and transmembrane-3, transmembrane-4 and transmembrane-5, and transmembrane-6 and transmembrane-7 on the exterior, or "extracellular" side, of the cell membrane (these are referred to as "extracellular" regions 1, 2 and 3 (EC-1, EC-2 and EC-3), respectively). The transmembrane helices are also joined by strands of amino acids between transmembrane-1 and transmembrane-2, transmembrane-3 and transmembrane-4, and transmembrane-5 and transmembrane-6 on the interior, or "intracellular" side, of the cell membrane (these are referred to as "intracellular" regions 1, 2 and 3 (IC-1, IC-2 and IC-3), respectively). The "carboxy" ("C") terminus of the receptor lies in the intracellular space within the cell, and the "amino" ("N") terminus of the receptor lies in the extracellular space outside of the cell. The general structure of G protein-coupled receptors is depicted in Figure 1.
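By way of illustration only, and not limitation, the connectivity between the transmembrane helices and the loops described in paragraph [0005] can be written out as a small lookup. The following Python sketch is not part of the specification; the names `LOOPS` and `loop_between` are hypothetical and were chosen solely for this example.

```python
# Illustrative sketch only: encodes the helix/loop connectivity of paragraph [0005].
# LOOPS and loop_between are hypothetical names chosen for this example.

# Each loop joins two consecutive transmembrane helices, TM-n and TM-(n+1).
LOOPS = {
    (1, 2): "IC-1",
    (2, 3): "EC-1",
    (3, 4): "IC-2",
    (4, 5): "EC-2",
    (5, 6): "IC-3",
    (6, 7): "EC-3",
}


def loop_between(tm_a: int, tm_b: int) -> str:
    """Return the loop joining two consecutive transmembrane helices."""
    key = (min(tm_a, tm_b), max(tm_a, tm_b))
    if key not in LOOPS:
        raise ValueError("helices must be consecutive and numbered 1 to 7")
    return LOOPS[key]


if __name__ == "__main__":
    # The loop immediately preceding TM-6 on the intracellular side.
    print(loop_between(5, 6))  # prints IC-3
```

The lookup makes explicit that the IC-3 region lies between TM-5 and TM-6, i.e., immediately on the amino-terminal side of TM-6.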

[0006] Generally, when an endogenous ligand binds with the receptor (often referred to as "activation" of the receptor), there is a change in the conformation of the intracellular region that allows for coupling between the intracellular region and an intracellular "G-protein." Although other G proteins exist, currently, Gq, Gs, Gi, and Go are G proteins that have been identified. Endogenous ligand-activated GPCR coupling with the G-protein begins a signaling cascade process (referred to as "signal transduction"). Under normal conditions, signal transduction ultimately results in cellular activation or cellular inhibition. It is thought that the IC-3 loop as well as the carboxy terminus of the receptor interact with the G protein. A principal focus of this invention is directed to the transmembrane-6 (TM6) region and the intracellular-3 (IC3) region of the GPCR.

[0007] Under physiological conditions, GPCRs exist in the cell membrane in equilibrium between two different conformations: an "inactive" state and an "active" state. As shown schematically in Figure 2, a receptor in an inactive state is unable to link to the intracellular signaling transduction pathway to produce a biological response. Changing the receptor conformation to the active state allows linkage to the transduction pathway (via the G-protein) and produces a biological response.

[0008] A receptor may be stabilized in an active state by an endogenous ligand or a compound such as a drug. Recent discoveries, including but not exclusively limited to modifications to the amino acid sequence of the receptor, provide means other than endogenous ligands or drugs to promote and stabilize the receptor in the active state conformation. These means effectively stabilize the receptor in an active state by simulating the effect of an endogenous ligand binding to the receptor. Stabilization by such ligand-independent means is termed "constitutive receptor activation."

[0009] As noted above, the use of an orphan receptor for screening purposes has not been possible. This is because the traditional "dogma" regarding screening of compounds mandates that the ligand for the receptor be known. By definition, then, this approach has no applicability with respect to orphan receptors. Thus, by adhering to this dogmatic approach to the discovery of therapeutics, the art, in essence, has taught and has been taught to forsake the use of orphan receptors unless and until the endogenous ligand for the receptor is discovered. Given that there are an estimated 2,000 G protein coupled receptors, the majority of which are orphan receptors, such dogma castigates a creative, unique and distinct approach to the discovery of therapeutics.

[0010] Information regarding the nucleic acid and/or amino acid sequences of a variety of GPCRs is summarized below in Table A. Because an important focus of the invention disclosed herein is directed towards orphan GPCRs, many of the below-cited references are related to orphan GPCRs. However, this list is not intended to imply, nor is this list to be construed, legally or otherwise, that the invention disclosed herein is only applicable to orphan GPCRs or the specific GPCRs listed below. Additionally, certain receptors that have been isolated are not the subject of publications per se; for example, reference is made to a G Protein-Coupled Receptor database on the "world-wide web" (neither the named inventors nor the assignee have any affiliation with this site) that lists GPCRs. Other GPCRs are the subject of patent applications owned by the present assignee and these are not listed below (including GPR3, GPR6 and GPR12; see U.S. Provisional Number 60/094879):

Table A

Receptor Name	Publication Reference
GPR1	23 Genomics 609 (1994)
GPR4	14 DNA and Cell Biology 25 (1995)
GPR5	14 DNA and Cell Biology 25 (1995)
GPR7	28 Genomics 84 (1995)
GPR8	28 Genomics 84 (1995)
GPR9	184 J. Exp. Med. 963 (1996)
GPR10	29 Genomics 335 (1995)
GPR15	32 Genomics 462 (1996)
GPR17	70 J. Neurochem. 1357 (1998)
GPR18	42 Genomics 462 (1997)
GPR20	187 Gene 75 (1997)
GPR21	187 Gene 75 (1997)
GPR22	187 Gene 75 (1997)
GPR24	398 FEBS Lett. 253 (1996)
GPR30	45 Genomics 607 (1997)
GPR31	42 Genomics 519 (1997)
GPR32	50 Genomics 281 (1997)
GPR40	239 Biochem. Biophys. Res. Commun. 543 (1997)
GPR41	239 Biochem. Biophys. Res. Commun. 543 (1997)
GPR43	239 Biochem. Biophys. Res. Commun. 543 (1997)
APJ	136 Gene 355 (1993)
BLR1	22 Eur. J. Immunol. 2759 (1992)
CEPR	231 Biochem. Biophys. Res. Commun. 651 (1997)
EBI1	23 Genomics 643 (1994)
EBI2	67 J. Virol. 2209 (1993)
ETBR-LP2	424 FEBS Lett. 193 (1998)
GPCR-CNS	54 Brain Res. Mol. Brain Res. 152 (1998); 45 Genomics 68 (1997)
GPR-NGA	394 FEBS Lett. 325 (1996)
H9	386 FEBS Lett. 219 (1996)
HBA954	1261 Biochim. Biophys. Acta 121 (1995)
HG38	247 Biochem. Biophys. Res. Commun. 266 (1998)
HM74	5 Int. Immunol. 1239 (1993)
OGR1	35 Genomics 397 (1996)
V28	163 Gene (1995)

[0011] As will be set forth and disclosed in greater detail below, utilization of a mutational cassette to modify the endogenous sequence of a human GPCR leads to a constitutively activated version of the human GPCR. These non-endogenous, constitutively activated versions of human GPCRs can be utilized, inter alia, for the screening of candidate compounds to directly identify compounds of, e.g., therapeutic relevance.

[0012] WO 97/21731 discloses a constitutively active, non-endogenous version of an endogenous human GPCR, the CCK-B/gastrin receptor, which is characterised in that the valine residue, found at position 16 when counting from the endogenous proline residue within the transmembrane-6 region in a C-terminal to N-terminal direction, was altered to a non-endogenous glutamic acid residue.

[0013] WO 98/38217 discloses an alignment method for identifying in the sequences of GPCRs the amino acid equivalent to the alanine residue at position 293 of the alpha 1B-adrenergic receptor. It teaches applying the alignment method to create constitutively active versions of other GPCRs.

SUMMARY OF THE INVENTION

[0014] The present invention provides a method for creating a non-endogenous, constitutively active version of an endogenous human G protein coupled receptor (GPCR), said endogenous GPCR comprising a transmembrane 6 region and an intracellular loop 3 region, the method comprising:

(a) selecting an endogenous human GPCR comprising a proline residue in the transmembrane 6 region;
(b) identifying the endogenous 16th amino acid residue from the proline residue of step (a), in a carboxy-terminus to amino-terminus direction;
(c) altering the identified amino acid residue of step (b) to a non-endogenous amino acid residue to create a non-endogenous version of the endogenous human GPCR; and
(d) determining if the non-endogenous version of the endogenous human GPCR of step (c) is constitutively active by measuring a difference in an intracellular signal measured for the non-endogenous version as compared with a signal induced by the endogenous GPCR.
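By way of illustration only, and not limitation, steps (a) to (c) can be sketched at the level of a one-letter amino acid sequence. The following Python sketch is not the laboratory procedure itself and does not address step (d), which is a measurement. It assumes that the sequence is written in the conventional amino-terminus to carboxy-terminus direction and that the position of the TM6 proline is already known; the function name `mutate_position_x`, its parameters, and the default replacement residue of lysine ("K") are assumptions made for this example.

```python
# Illustrative sketch of steps (a) to (c) only; not the laboratory procedure.
# Assumptions: the sequence is written N-terminus to C-terminus in one-letter
# code, and the caller already knows the 0-based index of the TM6 proline
# (locating the TM6 region itself is outside the scope of this sketch).

OFFSET = 16  # the 16th residue from the proline, counted toward the N-terminus


def mutate_position_x(sequence: str, tm6_proline_index: int,
                      new_residue: str = "K") -> str:
    """Replace the residue 16 positions N-terminal of the TM6 proline."""
    # Step (a): the selected residue must be a proline.
    if sequence[tm6_proline_index] != "P":
        raise ValueError("the residue at tm6_proline_index is not proline")
    # Step (b): count 16 residues in the C-terminus to N-terminus direction.
    x_index = tm6_proline_index - OFFSET
    if x_index < 0:
        raise ValueError("sequence is too short to contain the 16th residue")
    # Step (c): the replacement must differ from the endogenous residue.
    if sequence[x_index] == new_residue:
        raise ValueError("the replacement must be a non-endogenous residue")
    return sequence[:x_index] + new_residue + sequence[x_index + 1:]


if __name__ == "__main__":
    # Hypothetical 17-residue fragment ending in the TM6 proline (index 16).
    fragment = "QEDDDRRRGGTTCCAAP"
    print(mutate_position_x(fragment, 16))  # prints KEDDDRRRGGTTCCAAP
```

Because the count runs toward the amino terminus, the identified residue has a lower index than the proline when the sequence is written in the conventional direction.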

[0015] Disclosed herein is a non-endogenous, human G protein-coupled receptor comprising (a) as a most preferred amino acid sequence region (C-terminus to N-terminus orientation) and/or (b) as a most preferred nucleic acid sequence region (3' to 5' orientation) transversing the transmembrane-6 (TM6) and intracellular loop-3 (IC3) regions of the GPCR:

(a)

P¹ AA₁₅ X

wherein:

(1) P¹ is an amino acid residue located within the TM6 region of the GPCR, where P¹ is selected from the group consisting of (i) the endogenous GPCR's proline residue, and (ii) a non-endogenous amino acid residue other than proline;

(2) AA₁₅ are 15 amino acids selected from the group consisting of (a) the endogenous GPCR's amino acids (b) non-endogenous amino acid residues, and (c) a combination of the endogenous GPCR's amino acids and non-endogenous amino acids, excepting that none of the 15 endogenous amino acid residues that are positioned within the TM6 region of the GPCR is proline; and

(3) X is a non-endogenous amino acid residue located within the IC3 region of said GPCR, preferably selected from the group consisting of lysine, histidine and arginine, and most preferably lysine, excepting that when the endogenous amino acid at position X is lysine, then X is an amino acid other than lysine, preferably alanine;

and/or

(b)

P^codon (AA-codon)₁₅ X_codon

wherein:

(1) P^codon is a nucleic acid sequence within the TM6 region of the GPCR, wherein P^codon encodes an amino acid selected from the group consisting of (i) the endogenous GPCR's proline residue, and (ii) a non-endogenous amino acid residue other than proline;

(2) (AA-codon)₁₅ are 15 codons encoding 15 amino acids selected from the group consisting of (a) the endogenous GPCR's amino acids (b) non-endogenous amino acid residues and (c) a combination of the endogenous GPCR's amino acids and non-endogenous amino acids, excepting that none of the 15 endogenous codons within the TM6 region of the GPCR encodes a proline amino acid residue; and

(3) X_codon is a nucleic acid encoding region residue located within the IC3 region of said GPCR, where X_codon encodes a non-endogenous amino acid, preferably selected from the group consisting of lysine, histidine and arginine, and most preferably lysine, excepting that when the endogenous encoding region at position X_codon encodes the amino acid lysine, then X_codon encodes an amino acid other than lysine, preferably alanine.

The terms endogenous and non-endogenous in reference to these sequence cassettes are relative to the endogenous GPCR. For example, once the endogenous proline residue is located within the TM6 region of a particular GPCR, and the 16th amino acid therefrom is identified for mutation to constitutively activate the receptor, it is also possible to mutate the endogenous proline residue (i.e., once the marker is located and the 16th amino acid to be mutated is identified, one may mutate the marker itself), although it is most preferred that the proline residue not be mutated. Similarly, and while it is most preferred that AA₁₅ be maintained in their endogenous forms, these amino acids may also be mutated. The only amino acid that must be mutated in the non-endogenous version of the human GPCR is X, i.e., the endogenous amino acid that is 16 residues from P¹ cannot be maintained in its endogenous form and must be mutated, as further disclosed herein. Stated again, while it is preferred that in the non-endogenous version of the human GPCR, P¹ and AA₁₅ remain in their endogenous forms (i.e., identical to their wild-type forms), once X is identified and mutated, any and/or all of P¹ and AA₁₅ can be mutated. This applies to the nucleic acid sequences as well. In those cases where the endogenous amino acid at position X is lysine, then in the non-endogenous version of such GPCR, X is an amino acid other than lysine, preferably alanine.
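By way of illustration only, and not limitation, the most preferred case at the nucleic acid level (only X_codon altered, to a lysine codon, or to an alanine codon when the endogenous codon already encodes lysine) can be sketched as follows. This Python sketch assumes an in-frame coding strand written 5' to 3' and a caller-supplied codon index for the TM6 proline; the function name `mutate_x_codon` and the particular replacement codons AAA (lysine) and GCC (alanine) are assumptions made for this example.

```python
# Illustrative sketch only. Assumptions: the coding strand is upper-case DNA,
# written 5' to 3' and in frame, and the caller supplies the 0-based codon
# index of the TM6 proline. AAA (lysine) and GCC (alanine) are standard
# genetic-code codons; choosing these particular codons is an assumption.

PROLINE_CODONS = {"CCT", "CCC", "CCA", "CCG"}
LYSINE_CODONS = {"AAA", "AAG"}


def mutate_x_codon(coding_dna: str, proline_codon_index: int) -> str:
    """Replace the codon 16 codons upstream (5') of the TM6 proline codon."""
    if len(coding_dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    codons = [coding_dna[i:i + 3] for i in range(0, len(coding_dna), 3)]
    if codons[proline_codon_index] not in PROLINE_CODONS:
        raise ValueError("the codon at proline_codon_index does not encode proline")
    x_index = proline_codon_index - 16
    if x_index < 0:
        raise ValueError("sequence is too short to contain X_codon")
    # Lysine is most preferred; alanine is used when X is endogenously lysine.
    codons[x_index] = "GCC" if codons[x_index] in LYSINE_CODONS else "AAA"
    return "".join(codons)
```

For a hypothetical fragment consisting of a glutamine codon (CAG), fifteen intervening codons and a proline codon (CCT), calling `mutate_x_codon(fragment, 16)` would replace only the first codon with AAA.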

[0016] Accordingly, and as a hypothetical example, if the endogenous GPCR has the following endogenous amino acid sequence at the above-noted positions:

P-AACCTTGGRRRDDDE-Q

then any of the following exemplary and hypothetical cassettes would fall within the scope of the disclosure (non-endogenous amino acids are set forth in bold):

P-AACCTTGGRRRDDDE-K

P-AACCTTHIGRRDDDE-K

P-ADEETTGGRRRDDDE-A

P-LLKFMSTWZLVAAPQ-K

A-LLKFMSTWZLVAAPQ-K

It is also possible to add amino acid residues within AA₁₅, but such an approach is not particularly advanced. Indeed, in the most preferred embodiments, the only amino acid that differs in the non-endogenous version of the human GPCR as compared with the endogenous version of that GPCR is the amino acid in position X; mutation of this amino acid itself leads to constitutive activation of the receptor.
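By way of illustration only, and not limitation, the positions at which each hypothetical cassette of paragraph [0016] departs from the hypothetical endogenous sequence can be recovered by direct comparison. The following Python sketch assumes cassettes written in the "P¹-AA₁₅-X" form used above, with exactly fifteen residues in the middle segment (i.e., it does not handle added residues); the helper names `parse_cassette`, `non_endogenous_positions` and `x_is_mutated` are hypothetical.

```python
# Illustrative sketch only. Cassettes are written as in paragraph [0016],
# "P1-AA15-X", in C-terminus to N-terminus orientation. Helper names are
# hypothetical, and cassettes with added residues are not handled.
from typing import List, Tuple


def parse_cassette(text: str) -> Tuple[str, str, str]:
    """Split a cassette into its P1, AA15 and X parts."""
    p1, aa15, x = text.split("-")
    if len(p1) != 1 or len(aa15) != 15 or len(x) != 1:
        raise ValueError("expected one residue, fifteen residues, one residue")
    return p1, aa15, x


def non_endogenous_positions(endogenous: str, candidate: str) -> List[int]:
    """Return cassette positions (0 = P1, 1 to 15 = AA15, 16 = X) at which
    the candidate differs from the endogenous cassette."""
    endo = "".join(parse_cassette(endogenous))
    cand = "".join(parse_cassette(candidate))
    return [i for i, (a, b) in enumerate(zip(endo, cand)) if a != b]


def x_is_mutated(endogenous: str, candidate: str) -> bool:
    """The only mandatory change: position X must be non-endogenous."""
    return 16 in non_endogenous_positions(endogenous, candidate)


if __name__ == "__main__":
    endogenous = "P-AACCTTGGRRRDDDE-Q"
    print(non_endogenous_positions(endogenous, "P-AACCTTGGRRRDDDE-K"))  # [16]
    print(non_endogenous_positions(endogenous, "P-AACCTTHIGRRDDDE-K"))  # [7, 8, 9, 16]
    print(x_is_mutated(endogenous, endogenous))  # False
```

Each of the five hypothetical cassettes differs from the endogenous sequence at position 16 (X), which is the single change that is required; the first cassette differs at that position only, corresponding to the most preferred embodiments.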

[0017] Thus, in particularly preferred embodiments, P¹ and P^codon are endogenous proline and an endogenous nucleic acid encoding region encoding proline, respectively; and X and X_codon are non-endogenous lysine or alanine and a non-endogenous nucleic acid encoding region encoding lysine or alanine, respectively, with lysine being most preferred. Because it is most preferred that the non-endogenous versions of the human GPCRs which incorporate these mutations are incorporated into mammalian cells and utilized for the screening of candidate compounds, the non-endogenous human GPCR incorporating the mutation need not be purified and isolated per se (i.e., these are incorporated within the cellular membrane of a mammalian cell), although such purified and isolated non-endogenous human GPCRs are well within the purview of this disclosure. Gene-targeted and transgenic non-human mammals (preferably rats and mice) incorporating the non-endogenous human GPCRs are also within the purview of this disclosure; in particular, gene-targeted mammals are most preferred in that these animals will incorporate the non-endogenous versions of the human GPCRs in place of the non-human mammal's endogenous GPCR-encoding region (techniques for generating such non-human mammals to replace the non-human mammal's protein encoding region with a human encoding region are well known; see, for example, U.S. Patent No. 5,777,194.)

[0018] It has been discovered that these changes to an endogenous human GPCR render the GPCR constitutively active such that, as will be further disclosed herein, the non-endogenous, constitutively activated version of the human GPCR can be utilized for, inter alia, the direct screening of candidate compounds without the need for the endogenous ligand. Thus, methods for using these materials, and products identified by these methods are also within the purview of the following disclosure.

BRIEF DESCRIPTION OF THE DRAWINGS

[0019]

Figure 1 shows a generalized structure of a G protein-coupled receptor with the numbers assigned to the transmembrane helices, the intracellular loops, and the extracellular loops.

Figure 2 schematically shows the two states, active and inactive, for a typical G protein coupled receptor and the linkage of the active state to the second messenger transduction pathway.

Figure 3 is a sequence diagram of the preferred vector pCMV, including restriction enzyme site locations.

Figure 4 is a diagrammatic representation of the signal measured comparing pCMV, non-endogenous, constitutively active GPR30 inhibition of GPR6-mediated activation of CRE-Luc reporter with endogenous GPR30 inhibition of GPR6-mediated activation of CRE-Luc reporter.

Figure 5 is a diagrammatic representation of the signal measured comparing pCMV, non-endogenous, constitutively activated GPR17 inhibition of GPR3-mediated activation of CRE-Luc reporter with endogenous GPR17 inhibition of GPR3-mediated activation of CRE-Luc reporter.

Figure 6 provides diagrammatic results of the signal measured comparing control pCMV, endogenous APJ and non-endogenous APJ.

Figure 7 provides an illustration of IP₃ production from non-endogenous human 5-HT_2A receptor as compared to the endogenous version of this receptor.

Figure 8 shows dot-blot format results for GPR1 (8A), GPR30 (8B) and APJ (8C).

DETAILED DESCRIPTION

[0020] The scientific literature that has evolved around receptors has adopted a number of terms to refer to ligands having various effects on receptors. For clarity and consistency, the following definitions will be used throughout this patent document. To the extent that these definitions conflict with other definitions for these terms, the following definitions shall control:

[0021] AGONISTS shall mean compounds that activate the intracellular response when they bind to the receptor, or enhance GTP binding to membranes.

[0022] AMINO ACID ABBREVIATIONS used herein are set out below:

ALANINE	ALA	A
ARGININE	ARG	R
ASPARAGINE	ASN	N
ASPARTIC ACID	ASP	D
CYSTEINE	CYS	C
GLUTAMIC ACID	GLU	E
GLUTAMINE	GLN	Q
GLYCINE	GLY	G
HISTIDINE	HIS	H
ISOLEUCINE	ILE	I
LEUCINE	LEU	L
LYSINE	LYS	K
METHIONINE	MET	M
PHENYLALANINE	PHE	F
PROLINE	PRO	P
SERINE	SER	S
THREONINE	THR	T
TRYPTOPHAN	TRP	W
TYROSINE	TYR	Y
VALINE	VAL	V

[0023] PARTIAL AGONISTS shall mean compounds which activate the intracellular response when they bind to the receptor to a lesser degree/extent than do agonists, or enhance GTP binding to membranes to a lesser degree/extent than do agonists.

[0024] ANTAGONISTS shall mean compounds that competitively bind to the receptor at the same site as the agonists but which do not activate the intracellular response initiated by the active form of the receptor, and can thereby inhibit the intracellular responses by agonists or partial agonists. ANTAGONISTS do not diminish the baseline intracellular response in the absence of an agonist or partial agonist.

[0025] CANDIDATE COMPOUND shall mean a molecule (for example, and not limitation, a chemical compound) which is amenable to a screening technique. Preferably, the phrase "candidate compound" does not include compounds which were publicly known to be compounds selected from the group consisting of inverse agonist, agonist or antagonist to a receptor, as previously determined by an indirect identification process ("indirectly identified compound"); more preferably, not including an indirectly identified compound which has previously been determined to have therapeutic efficacy in at least one mammal; and, most preferably, not including an indirectly identified compound which has previously been determined to have therapeutic utility in humans.

[0026] CODON shall mean a grouping of three nucleotides (or equivalents to nucleotides) which generally comprise a nucleoside (adenosine (A), guanosine (G), cytidine (C), uridine (U) and thymidine (T)) coupled to a phosphate group and which, when translated, encodes an amino acid.

[0027] COMPOUND EFFICACY shall mean a measurement of the ability of a compound to inhibit or stimulate receptor functionality, as opposed to receptor binding affinity. A preferred means of detecting compound efficacy is via measurement of, e.g., [³⁵S]GTPγS binding, as further disclosed in the Example section of this patent document.

[0028] CONSTITUTIVELY ACTIVATED RECEPTOR shall mean a receptor subject to constitutive receptor activation. In accordance with the invention disclosed herein, a non-endogenous, human constitutively activated G protein-coupled receptor is one that has been mutated to include the amino acid cassette P¹AA₁₅X, as set forth in greater detail below.

[0029] CONSTITUTIVE RECEPTOR ACTIVATION shall mean stabilization of a receptor in the active state by means other than binding of the receptor with its endogenous ligand or a chemical equivalent thereof. Preferably, a G protein-coupled receptor subjected to constitutive receptor activation in accordance with the invention disclosed herein evidences at least a 10% difference in response (increase or decrease, as the case may be) to the signal measured for constitutive activation as compared with the endogenous form of that GPCR, more preferably, about a 25% difference in such comparative response, and most preferably about a 50% difference in such comparative response. When used for the purposes of directly identifying candidate compounds, it is most preferred that the signal difference be at least about 50% such that there is a sufficient difference between the endogenous signal and the non-endogenous signal to differentiate between selected candidate compounds. In most instances, the "difference" will be an increase in signal; however, with respect to Gs-coupled GPCRs, the "difference" measured is preferably a decrease, as will be set forth in greater detail below.
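By way of illustration only, and not limitation, the comparative thresholds of paragraph [0029] can be expressed as a simple calculation. The text states the thresholds but not how the percentage is computed; the following Python sketch therefore assumes a relative difference taken against the endogenous signal, and treats each stated value as a lower bound. The function names `percent_difference` and `preference_tier` and the tier labels are assumptions made for this example.

```python
# Illustrative sketch only. The formula (relative difference against the
# endogenous signal) and the use of each stated value as a lower bound are
# assumptions of this example; the text does not specify either.


def percent_difference(endogenous_signal: float,
                       non_endogenous_signal: float) -> float:
    """Unsigned difference (increase or decrease) as a percentage of the
    endogenous signal."""
    if endogenous_signal == 0:
        raise ValueError("endogenous signal must be non-zero")
    delta = abs(non_endogenous_signal - endogenous_signal)
    return delta * 100.0 / abs(endogenous_signal)


def preference_tier(pct: float) -> str:
    """Map a percentage difference onto the tiers named in the text."""
    if pct >= 50:
        return "most preferred"
    if pct >= 25:
        return "more preferred"
    if pct >= 10:
        return "preferred"
    return "below the stated 10% threshold"


if __name__ == "__main__":
    pct = percent_difference(100.0, 160.0)  # hypothetical signal values
    print(pct, preference_tier(pct))  # prints 60.0 most preferred
```

The difference is taken as unsigned because the definition covers either an increase or a decrease in the measured signal.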

[0030] CONTACT or CONTACTING shall mean bringing at least two moieties together, whether in an in vitro system or an in vivo system.

[0031] DIRECTLY IDENTIFYING or DIRECTLY IDENTIFIED, in relationship to the phrase "candidate compound", shall mean the screening of a candidate compound against a constitutively activated G protein-coupled receptor, and assessing the compound efficacy of such compound. This phrase is, under no circumstances, to be interpreted or understood to be encompassed by or to encompass the phrase "indirectly identifying" or "indirectly identified."

[0032] ENDOGENOUS shall mean a material that is naturally produced by the genome of the species. ENDOGENOUS in reference to, for example and not limitation, GPCR, shall mean that which is naturally produced by a human, an insect, a plant, a bacterium, or a virus. By contrast, the term NON-ENDOGENOUS in this context shall mean that which is not naturally produced by the genome of a species. For example, and not limitation, a receptor which is not constitutively active in its endogenous form, but when mutated by using the cassettes disclosed herein and thereafter becomes constitutively active, is most preferably referred to herein as a "non-endogenous, constitutively activated receptor." Both terms can be utilized to describe both "in vivo" and "in vitro" systems. For example, and not limitation, in a screening approach, the endogenous or non-endogenous receptor may be in reference to an in vitro screening system whereby the receptor is expressed on the cell-surface of a mammalian cell. As a further example and not limitation, where the genome of a mammal has been manipulated to include a non-endogenous constitutively activated receptor, screening of a candidate compound by means of an in vivo system is viable.

[0033] HOST CELL shall mean a cell capable of having a Plasmid and/or Vector incorporated therein. In the case of a prokaryotic Host Cell, a Plasmid is typically replicated as an autonomous molecule as the Host Cell replicates (generally, the Plasmid is thereafter isolated for introduction into a eukaryotic Host Cell); in the case of a eukaryotic Host Cell, a Plasmid is integrated into the cellular DNA of the Host Cell such that when the eukaryotic Host Cell replicates, the Plasmid replicates. Preferably, for the purposes of the invention disclosed herein, the Host Cell is eukaryotic, more preferably, mammalian, and most preferably selected from the group consisting of 293, 293T and COS-7 cells.

[0034] INDIRECTLY IDENTIFYING or INDIRECTLY IDENTIFIED means the traditional approach to the drug discovery process involving identification of an endogenous ligand specific for an endogenous receptor, screening of candidate compounds against the receptor for determination of those which interfere and/or compete with the ligand-receptor interaction, and assessing the efficacy of the compound for affecting at least one second messenger pathway associated with the activated receptor.

[0035] INHIBIT or INHIBITING, in relationship to the term "response" shall mean that a response is decreased or prevented in the presence of a compound as opposed to in the absence of the compound.

[0036] INVERSE AGONISTS shall mean compounds which bind to either the endogenous form of the receptor or to the constitutively activated form of the receptor, and which inhibit the baseline intracellular response initiated by the active form of the receptor below the normal base level of activity which is observed in the absence of agonists or partial agonists, or decrease GTP binding to membranes. Preferably, the baseline intracellular response is inhibited in the presence of the inverse agonist by at least 30%, more preferably by at least 50%, and most preferably by at least 75%, as compared with the baseline response in the absence of the inverse agonist.

[0037] KNOWN RECEPTOR shall mean an endogenous receptor for which the endogenous ligand specific for that receptor has been identified.

[0038] LIGAND shall mean an endogenous, naturally occurring molecule specific for an endogenous, naturally occurring receptor.

[0039] MUTANT or MUTATION in reference to an endogenous receptor's nucleic acid and/or amino acid sequence shall mean a specified change or changes to such endogenous sequences such that a mutated form of an endogenous, non-constitutively activated receptor evidences constitutive activation of the receptor. In terms of equivalents to specific sequences, a subsequent mutated form of a human receptor is considered to be equivalent to a first mutation of the human receptor if (a) the level of constitutive activation of the subsequent mutated form of the receptor is substantially the same as that evidenced by the first mutation of the receptor; and (b) the percent sequence (amino acid and/or nucleic acid) homology between the subsequent mutated form of the receptor and the first mutation of the receptor is at least about 80%, more preferably at least about 90% and most preferably at least 95%. Ideally, and owing to the fact that the most preferred cassettes disclosed herein for achieving constitutive activation include a single amino acid and/or codon change between the endogenous and the non-endogenous forms of the GPCR (i.e. X or X_codon), the percent sequence homology should be at least 98%.
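The homology levels in criterion (b) can be illustrated with a minimal sketch. It assumes that percent homology is computed as position-by-position identity between two pre-aligned sequences of equal length, a method this document does not specify; the function names and the toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions at which two equal-length sequences agree."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def homology_tier(percent):
    """Map a percent value onto the levels recited in the definition."""
    if percent >= 98:
        return "ideal"
    if percent >= 95:
        return "most preferred"
    if percent >= 90:
        return "more preferred"
    if percent >= 80:
        return "acceptable"
    return "not equivalent"


# Toy 100-residue sequences differing at a single position (hypothetical).
first_mutation = "A" * 100
subsequent_mutation = "A" * 49 + "K" + "A" * 50
identity = percent_identity(first_mutation, subsequent_mutation)
```

Under this assumption a single change in a 100-residue sequence gives 99% identity, which is why a one-residue cassette in a receptor of typical length comfortably meets the 98% level.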

[0040] ORPHAN RECEPTOR shall mean an endogenous receptor for which the endogenous ligand specific for that receptor has not been identified or is not known.

[0041] PHARMACEUTICAL COMPOSITION shall mean a composition comprising at least one active ingredient, whereby the composition is amenable to investigation for a specified, efficacious outcome in a mammal (for example, and not limitation, a human). Those of ordinary skill in the art will understand and appreciate the techniques appropriate for determining whether an active ingredient has a desired efficacious outcome based upon the needs of the artisan.

[0042] PLASMID shall mean the combination of a Vector and cDNA. Generally, a Plasmid is introduced into a Host Cell for the purpose of replication and/or expression of the cDNA as a protein.

[0043] STIMULATE or STIMULATING, in relationship to the term "response" shall mean that a response is increased in the presence of a compound as opposed to in the absence of the compound.

[0044] TRANSVERSE or TRANSVERSING, in reference to either a defined nucleic acid sequence or a defined amino acid sequence, shall mean that the sequence is located within at least two different and defined regions. For example, in an amino acid sequence that is 10 amino acid moieties in length, where 3 of the 10 moieties are in the TM6 region of a GPCR and the remaining 7 moieties are in the IC3 region of the GPCR, the 10 amino acid moiety can be described as transversing the TM6 and IC3 regions of the GPCR.
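The example in this definition can be restated as a minimal sketch. The residue numbers assigned to the IC3 and TM6 regions below are hypothetical and chosen only so that a 10-residue segment has 7 residues in IC3 and 3 in TM6; the function names are likewise illustrative.

```python
def residues_per_region(start, end, regions):
    """Count how many residues of the segment [start, end] fall in each region.

    `regions` maps a region name to an inclusive (first, last) residue range.
    """
    counts = {}
    for name, (first, last) in regions.items():
        overlap = min(end, last) - max(start, first) + 1
        if overlap > 0:
            counts[name] = overlap
    return counts


def is_transversing(start, end, regions):
    """A segment transverses when it lies in at least two defined regions."""
    return len(residues_per_region(start, end, regions)) >= 2


# Hypothetical region boundaries for an unnamed receptor.
regions = {"IC3": (230, 260), "TM6": (261, 285)}
counts = residues_per_region(254, 263, regions)
```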

[0045] VECTOR in reference to cDNA shall mean a circular DNA capable of incorporating at least one cDNA and capable of incorporation into a Host Cell.

[0046] The order of the following sections is set forth for presentational efficiency and is not intended, nor should be construed, as a limitation on the disclosure or the claims to follow.

A. Introduction

[0047] The traditional study of receptors has always proceeded from the a priori assumption (historically based) that the endogenous ligand must first be identified before discovery could proceed to find antagonists and other molecules that could affect the receptor. Even in cases where an antagonist might have been known first, the search immediately extended to looking for the endogenous ligand. This mode of thinking has persisted in receptor research even after the discovery of constitutively activated receptors. What has not been heretofore recognized is that it is the active state of the receptor that is most useful for discovering agonists, partial agonists, and inverse agonists of the receptor. For those diseases which result from an overly active receptor or an under-active receptor, what is desired in a therapeutic drug is a compound which acts to diminish the active state of a receptor or enhance the activity of the receptor, respectively, not necessarily a drug which is an antagonist to the endogenous ligand. This is because a compound that reduces or enhances the activity of the active receptor state need not bind at the same site as the endogenous ligand. Thus, as taught by a method of this invention, any search for therapeutic compounds should start by screening compounds against the ligand-independent active state.

[0048] Screening candidate compounds against non-endogenous, constitutively activated GPCRs allows for the direct identification of candidate compounds which act at these cell surface receptors, without requiring any prior knowledge or use of the receptor's endogenous ligand. By determining areas within the body where the endogenous versions of such GPCRs are expressed and/or over-expressed, it is possible to determine related disease/disorder states which are associated with the expression and/or over-expression of these receptors; such an approach is disclosed in this patent document.

B. Disease/Disorder Identification and/or Selection

[0049] Most preferably, inverse agonists to the non-endogenous, constitutively activated GPCRs can be identified using the materials of this invention. Such inverse agonists are ideal candidates as lead compounds in drug discovery programs for treating diseases related to these receptors. Because of the ability to directly identify inverse agonists, partial agonists or agonists to these receptors, thereby allowing for the development of pharmaceutical compositions, a search for diseases and disorders associated with these receptors is possible. For example, scanning both diseased and normal tissue samples for the presence of these receptors now becomes more than an academic exercise or one which might be pursued along the path of identifying, in the case of an orphan receptor, an endogenous ligand. Tissue scans can be conducted across a broad range of healthy and diseased tissues. Such tissue scans provide a preferred first step in associating a specific receptor with a disease and/or disorder.

[0050] Preferably, the DNA sequence of the endogenous GPCR is used to make a probe for either radiolabeled cDNA or RT-PCR identification of the expression of the GPCR in tissue samples. The presence of a receptor in a diseased tissue, or the presence of the receptor at elevated or decreased concentrations in diseased tissue compared to a normal tissue, can be preferably utilized to identify a correlation with that disease. Receptors can equally well be localized to regions of organs by this technique. Based on the known functions of the specific tissues to which the receptor is localized, the putative functional role of the receptor can be deduced.

C. A "Human GPCR Proline Marker" Algorithm and the Creation of Non-Endogenous, Constitutively-Active Human GPCRs

[0051] Among the many challenges facing the biotechnology arts is the unpredictability in gleaning genetic information from one species and correlating that information to another species - nowhere in this art does this problem evidence more annoying exacerbation than in the genetic sequences that encode nucleic acids and proteins. Thus, for consistency and because of the highly unpredictable nature of this art, the following invention is limited, in terms of mammals, to human GPCRs - applicability of this invention to other mammalian species, while a potential possibility, is considered beyond mere rote application.

[0052] In general, when attempting to apply common "rules" from one related protein sequence to another or from one species to another, the art has typically resorted to sequence alignment, i.e., sequences are linearized and attempts are then made to find regions of commonality between two or more sequences. While useful, this approach does not always prove to result in meaningful information. In the case of GPCRs, while the general structural motif is identical for all GPCRs, the variations in lengths of the TMs, ECs and ICs make such alignment approaches from one GPCR to another difficult at best. Thus, while it may be desirable to apply a consistent approach to, e.g., constitutive activation from one GPCR to another, because of the great diversity in sequence length, fidelity, etc., from one GPCR to the next, a generally applicable, and readily successful mutational alignment approach is in essence not possible. In an analogy, such an approach is akin to having a traveler start a journey at point A by giving the traveler dozens of different maps to point B, without any scale or distance markers on any of the maps, and then asking the traveler to find the shortest and most efficient route to destination B only by using the maps. In such a situation, the task can be readily simplified by having (a) a common "place-marker" on each map, and (b) the ability to measure the distance from the place-marker to destination B - this, then, will allow the traveler to select the most efficient route from starting-point A to destination B.

[0053] In essence, a feature of the invention is to provide such coordinates within human GPCRs that readily allows for creation of a constitutively active form of the human GPCRs.

[0054] As those in the art appreciate, the transmembrane region of a cell is highly hydrophobic; thus, using standard hydrophobicity plotting techniques, those in the art are readily able to determine the TM regions of a GPCR, and specifically TM6 (this same approach is also applicable to determining the EC and IC regions of the GPCR). It has been discovered that within the TM6 region of human GPCRs, a common proline residue (generally near the middle of TM6), acts as a constitutive activation "marker." By counting 15 amino acids from the proline marker, the 16^th amino acid (which is located in the IC3 loop), when mutated from its endogenous form to a non-endogenous form, leads to constitutive activation of the receptor. For convenience, we refer to this as the "Human GPCR Proline Marker" Algorithm. Although the non-endogenous amino acid at this position can be any of the amino acids, most preferably, the non-endogenous amino acid is lysine. While not wishing to be bound by any theory, we believe that this position itself is unique and that the mutation at this location impacts the receptor to allow for constitutive activation.
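The counting step of the "Human GPCR Proline Marker" Algorithm can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: it assumes the TM6 proline has already been located (for example from a hydrophobicity plot), and it assumes that counting proceeds toward the N-terminus, because the IC3 loop precedes TM6 in the primary sequence. The text does not state the direction in sequence coordinates. The function names and the toy sequence are hypothetical.

```python
def marker_position(sequence, proline_index, offset=16):
    """Return the 0-based index of the 16th residue counted from the proline.

    Assumption: counting runs toward the N-terminus (toward IC3), so that
    15 residues lie between the proline marker and the returned position.
    """
    if sequence[proline_index] != "P":
        raise ValueError("proline_index does not point at a proline")
    position = proline_index - offset
    if position < 0:
        raise ValueError("sequence is too short upstream of the proline")
    return position


def apply_marker_mutation(sequence, proline_index, new_residue="K"):
    """Replace the residue at the marker position (lysine by default)."""
    position = marker_position(sequence, proline_index)
    if sequence[position] == new_residue:
        raise ValueError("replacement equals the endogenous residue")
    return sequence[:position] + new_residue + sequence[position + 1:]


# Toy sequence, not a real receptor: proline at index 20.
toy = "G" * 20 + "P" + "G" * 5
mutant = apply_marker_mutation(toy, proline_index=20)
```

The check against an identical replacement reflects the requirement that the residue at this position be non-endogenous.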

[0055] We note that, for example, when the endogenous amino acid at the 16^th position is already lysine (as is the case with GPR4 and GPR32), then in order for X to be a non-endogenous amino acid, it must be other than lysine; thus, in those situations where the endogenous GPCR has an endogenous lysine residue at the 16^th position, the non-endogenous version of that GPCR preferably incorporates an amino acid other than lysine, preferably alanine, histidine and arginine, at this position. Of further note, it has been determined that GPR4 appears to be linked to Gs and active in its endogenous form (data not shown).

[0056] Because there are only 20 naturally occurring amino acids (although the use of non-naturally occurring amino acids is also viable), selection of a particular non-endogenous amino acid for substitution at this 16^th position is viable and allows for efficient selection of a non-endogenous amino acid that fits the needs of the investigator. However, as noted, the more preferred non-endogenous amino acids at the 16^th position are lysine, histidine, arginine and alanine, with lysine being most preferred. Those of ordinary skill in the art are credited with the ability to readily determine proficient methods for changing the sequence of a codon to achieve a desired mutation.
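The selection of a replacement residue described above can be illustrated with a minimal sketch. The list order places lysine first because the text names it as most preferred; the relative order of histidine, arginine and alanine is not ranked in this document and is an arbitrary choice here. The function name is hypothetical.

```python
# One-letter codes: K lysine, H histidine, R arginine, A alanine.
PREFERRED_RESIDUES = ["K", "H", "R", "A"]


def candidate_substitutions(endogenous_residue):
    """Preferred replacements, excluding the endogenous residue itself."""
    return [aa for aa in PREFERRED_RESIDUES if aa != endogenous_residue]


# When the endogenous residue is already lysine, lysine is excluded.
when_lysine = candidate_substitutions("K")
when_glycine = candidate_substitutions("G")
```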

[0057] It has also been discovered that occasionally, but not always, the proline residue marker will be preceded in TM6 by W2 (i.e., W2P¹AA₁₅X) where W is tryptophan and 2 is any amino acid residue.
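Where the tryptophan is present, the proline marker can be located by searching a TM6 sequence for a tryptophan, any one residue, then a proline. The following is a minimal sketch using a regular expression; the function name and the example sequence are hypothetical, and the sketch returns nothing for receptors in which the proline is not preceded in this way.

```python
import re


def find_marker_prolines(tm6_sequence):
    """Return 0-based indices of prolines preceded by W and any one residue."""
    # The lookahead lets overlapping occurrences be reported as well.
    return [m.start() + 2 for m in re.finditer(r"(?=W.P)", tm6_sequence)]


# Hypothetical TM6 fragment, for illustration only.
hits = find_marker_prolines("LFAFCWLPYH")
```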

[0058] Our discovery, amongst other things, negates the need for unpredictable and complicated sequence alignment approaches commonly used by the art. Indeed, the strength of our discovery, while an algorithm in nature, is that it can be applied in a facile manner to human GPCRs, with dexterous simplicity by those in the art, to achieve a unique and highly useful end-product, i.e., a constitutively activated version of a human GPCR. Because many years and significant amounts of money will be required to determine the endogenous ligands for the human GPCRs that the Human Genome Project is uncovering, the disclosed invention not only reduces the time necessary to positively exploit this sequence information, but does so at significant cost savings. This approach truly validates the importance of the Human Genome Project because it not only allows for the utilization of genetic information to understand the role of the GPCRs in, e.g., diseases, but also provides the opportunity to improve the human condition.

D. Screening of Candidate Compounds

1. Generic GPCR screening assay techniques

[0059] When a G protein receptor becomes constitutively active, it couples to a G protein (e.g., Gq, Gs, Gi, Go) and stimulates release and subsequent binding of GTP to the G protein. The G protein then acts as a GTPase and slowly hydrolyzes the GTP to GDP, whereby the receptor, under normal conditions, becomes deactivated. However, constitutively activated receptors, including the non-endogenous, human constitutively active GPCRs of the present invention, continue to exchange GDP for GTP. A non-hydrolyzable analog of GTP, [³⁵S]GTPγS, can be used to monitor enhanced binding to G proteins present on membranes which express constitutively activated receptors. It is reported that [³⁵S]GTPγS can be used to monitor G protein coupling to membranes in the absence and presence of ligand. An example of this monitoring, among other examples well-known and available to those in the art, was reported by Traynor and Nahorski in 1995. The preferred use of this assay system is for initial screening of candidate compounds because the system is generically applicable to all G protein-coupled receptors regardless of the particular G protein that interacts with the intracellular domain of the receptor.

2. Specific GPCR screening assay techniques

[0060] Once candidate compounds are identified using the "generic" G protein-coupled receptor assay (i.e., an assay to select compounds that are agonists, partial agonists, or inverse agonists), further screening to confirm that the compounds have interacted at the receptor site is preferred. For example, a compound identified by the "generic" assay may not bind to the receptor, but may instead merely "uncouple" the G protein from the intracellular domain.

a. Gs and Gi.

[0061] Gs stimulates the enzyme adenylyl cyclase. Gi (and Go), on the other hand, inhibit this enzyme. Adenylyl cyclase catalyzes the conversion of ATP to cAMP; thus, constitutively activated GPCRs that couple the Gs protein are associated with increased cellular levels of cAMP. On the other hand, constitutively activated GPCRs that couple the Gi (or Go) protein are associated with decreased cellular levels of cAMP. See, generally, "Indirect Mechanisms of Synaptic Transmission," Chpt. 8, From Neuron To Brain (3rd Ed.) Nichols, J.G. et al eds. Sinauer Associates, Inc. (1992). Thus, assays that detect cAMP can be utilized to determine if a candidate compound is, e.g., an inverse agonist to the receptor (i.e., such a compound would decrease the levels of cAMP). A variety of approaches known in the art for measuring cAMP can be utilized; a most preferred approach relies upon the use of anti-cAMP antibodies in an ELISA-based format. Another type of assay that can be utilized is a whole cell second messenger reporter system assay. Promoters on genes drive the expression of the proteins that a particular gene encodes. Cyclic AMP drives gene expression by promoting the binding of a cAMP-responsive DNA binding protein or transcription factor (CREB) which then binds to the promoter at specific sites called cAMP response elements and drives the expression of the gene. Reporter systems can be constructed which have a promoter containing multiple cAMP response elements before the reporter gene, e.g., β-galactosidase or luciferase. Thus, a constitutively activated Gs-linked receptor causes the accumulation of cAMP that then activates the gene and expression of the reporter protein. The reporter protein such as β-galactosidase or luciferase can then be detected using standard biochemical assays (Chen et al. 1995). 
With respect to GPCRs that link to Gi (or Go), and thus decrease levels of cAMP, an approach to the screening of, e.g., inverse agonists, based upon utilization of receptors that link to Gs (and thus increase levels of cAMP) is disclosed in the Example section with respect to GPR17 and GPR30.

b. Go and Gq.

[0062] Gq and Go are associated with activation of the enzyme phospholipase C, which in turn hydrolyzes the phospholipid PIP₂, releasing two intracellular messengers: diacylglycerol (DAG) and inositol 1,4,5-triphosphate (IP₃). Increased accumulation of IP₃ is associated with activation of Gq- and Go-associated receptors. See, generally, "Indirect Mechanisms of Synaptic Transmission," Chpt. 8, From Neuron To Brain (3^rd Ed.) Nichols, J.G. et al eds. Sinauer Associates, Inc. (1992). Assays that detect IP₃ accumulation can be utilized to determine if a candidate compound is, e.g., an inverse agonist to a Gq- or Go-associated receptor (i.e., such a compound would decrease the levels of IP₃). Gq-associated receptors can also be examined using an AP1 reporter assay in that Gq-dependent phospholipase C causes activation of genes containing AP1 elements; thus, activated Gq-associated receptors will evidence an increase in the expression of such genes, whereby inverse agonists thereto will evidence a decrease in such expression, and agonists will evidence an increase in such expression. Assays for such detection are commercially available.

E. Medicinal Chemistry

[0063] Generally, but not always, direct identification of candidate compounds is preferably conducted in conjunction with compounds generated via combinatorial chemistry techniques, whereby thousands of compounds are randomly prepared for such analysis. Generally, the results of such screening will be compounds having unique core structures; thereafter, these compounds are preferably subjected to additional chemical modification around a preferred core structure(s) to further enhance the medicinal properties thereof. Such techniques are known to those in the art and will not be addressed in detail in this patent document.

F. Pharmaceutical Compositions

[0064] Candidate compounds selected for further development can be formulated into pharmaceutical compositions using techniques well known to those in the art. Suitable pharmaceutically-acceptable carriers are available to those in the art; for example, see Remington's Pharmaceutical Sciences, 16^th Edition, 1980, Mack Publishing Co., (Osol et al., eds.).

G. Other Utility

[0065] Although a preferred use of the non-endogenous versions of the disclosed human GPCRs is for the direct identification of candidate compounds as inverse agonists, agonists or partial agonists (preferably for use as pharmaceutical agents), these receptors can also be utilized in research settings. For example, in vitro and in vivo systems incorporating these receptors can be utilized to further elucidate and understand the roles of the receptors in the human condition, both normal and diseased, as well as understanding the role of constitutive activation as it applies to understanding the signaling cascade. A value in these non-endogenous receptors is that their utility as a research tool is enhanced in that, because of their unique features, the disclosed receptors can be used to understand the role of a particular receptor in the human body before the endogenous ligand therefor is identified. Other uses of the disclosed receptors will become apparent to those in the art based upon, inter alia, a review of this patent document.

EXAMPLES

[0066] The following examples are presented for purposes of elucidation, and not limitation, of the present invention. Following the teaching of this patent document that a mutational cassette may be utilized in the IC3 loop of human GPCRs based upon a position relative to a proline residue in TM6 to constitutively activate the receptor, and while specific nucleic acid and amino acid sequences are disclosed herein, those of ordinary skill in the art are credited with the ability to make minor modifications to these sequences while achieving the same or substantially similar results reported below. Particular approaches to sequence mutations are within the purview of the artisan based upon the particular needs of the artisan.

Example 1

Preparation of Endogenous Human GPCRs

[0067] A variety of GPCRs were utilized in the Examples to follow. Some endogenous human GPCRs were graciously provided in expression vectors (as acknowledged below) and other endogenous human GPCRs were synthesized de novo using publicly-available sequence information.

1. GPR1 (GenBank Accession Number: U13666)

[0068] The human cDNA sequence for GPR1 was provided in pRcCMV by Brian O'Dowd (University of Toronto). GPR1 cDNA (1.4kB fragment) was excised from the pRcCMV vector as a NdeI-XbaI fragment and was subcloned into the NdeI-XbaI site of pCMV vector (see Figure 3). Nucleic acid (SEQ.ID.NO.: 1) and amino acid (SEQ.ID.NO.: 2) sequences for human GPR1 were thereafter determined and verified.

2. GPR4 (GenBank Accession Numbers: L36148, U35399, U21051)

[0069] The human cDNA sequence for GPR4 was provided in pRcCMV by Brian O'Dowd (University of Toronto). GPR4 cDNA (1.4kB fragment) was excised from the pRcCMV vector as an ApaI(blunted)-XbaI fragment and was subcloned (with most of the 5' untranslated region removed) into HindIII(blunted)-XbaI site of pCMV vector. Nucleic acid (SEQ.ID.NO.: 3) and amino acid (SEQ.ID.NO.: 4) sequences for human GPR4 were thereafter determined and verified.

3. GPR5 (GenBank Accession Number: L36149)

[0070] The cDNA for human GPR5 was generated and cloned into pCMV expression vector as follows: PCR was performed using genomic DNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 64°C for 1 min; and 72°C for 1.5 min. The 5' PCR primer contained an EcoRI site with the sequence:

5'-TATGAATTCAGATGCTCTAAACGTCCCTGC-3' (SEQ.ID.NO.:5)

and the 3' primer contained BamHI site with the sequence:

5'-TCCGGATCCACCTGCACCTGCGCCTGCACC-3' (SEQ.ID.NO.:6)

The 1.1 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 7) and amino acid (SEQ.ID.NO.: 8) sequences for human GPR5 were thereafter determined and verified.
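The restriction sites said to be carried by the GPR5 primers can be confirmed with a short check. This is an illustrative sketch: the primer sequences are those of SEQ.ID.NO.: 5 and SEQ.ID.NO.: 6 as given above, the recognition sequences are the standard ones for EcoRI, BamHI and HindIII, and the function name is hypothetical.

```python
# Standard recognition sequences, written 5' to 3'.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def find_sites(primer):
    """Map each enzyme whose site occurs in the primer to its 0-based offset."""
    return {
        enzyme: primer.find(site)
        for enzyme, site in RECOGNITION_SITES.items()
        if site in primer
    }


gpr5_forward = "TATGAATTCAGATGCTCTAAACGTCCCTGC"  # SEQ.ID.NO.: 5
gpr5_reverse = "TCCGGATCCACCTGCACCTGCGCCTGCACC"  # SEQ.ID.NO.: 6

forward_sites = find_sites(gpr5_forward)
reverse_sites = find_sites(gpr5_reverse)
```

In both primers the site begins after three leading nucleotides. The same check can be applied to the other primer pairs in this Example.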

4. GPR7 (GenBank Accession Number: U22491)

[0071] The cDNA for human GPR7 was generated and cloned into pCMV expression vector as follows: PCR was performed using genomic DNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 62°C for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contained a HindIII site with the sequence:

5'-GCAAGCTTGGGGGACGCCAGGTCGCCGGCT-3' (SEQ.ID.NO.:9)

and the 3' primer contained a BamHI site with the sequence:

5'-GCGGATCCGGACGCTGGGGGAGTCAGGCTGC-3' (SEQ.ID.NO.:10)

The 1.1 kb PCR fragment was digested with HindIII and BamHI and cloned into HindIII-BamHI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 11) and amino acid (SEQ.ID.NO.: 12) sequences for human GPR7 were thereafter determined and verified.
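The hold time implied by the GPR7 cycle condition can be worked out as a short calculation. This sketch counts only the three holds stated above (94°C for 1 min, 62°C for 1 min, 72°C for 1 min and 20 sec) over 30 cycles; ramping between temperatures and any initial denaturation or final extension are not specified in the text and are left out. The variable names are illustrative.

```python
# Each step is (temperature in degrees C, hold time in seconds).
GPR7_CYCLE = [(94, 60), (62, 60), (72, 80)]
CYCLES = 30

seconds_per_cycle = sum(hold for _, hold in GPR7_CYCLE)
total_seconds = CYCLES * seconds_per_cycle
total_minutes = total_seconds / 60
```

On these figures each cycle holds for 200 seconds, so 30 cycles amount to 6000 seconds, or 100 minutes of hold time.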

5. GPR8 (GenBank Accession Number: U22492)

[0072] The cDNA for human GPR8 was generated and cloned into pCMV expression vector as follows: PCR was performed using genomic DNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 62°C for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contained an EcoRI site with the sequence:

5'-CGGAATTCGTCAACGGTCCCAGCTACAATG-3' (SEQ.ID.NO.: 13)

and the 3' primer contained a BamHI site with the sequence:

5'-ATGGATCCCAGGCCCTTCAGCACCGCAATAT-3' (SEQ.ID.NO.: 14).

The 1.1 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI site of pCMV expression vector. All 4 cDNA clones sequenced contained a possible polymorphism involving a change of amino acid 206 from Arg to Gln. Aside from this difference, nucleic acid (SEQ.ID.NO.: 15) and amino acid (SEQ.ID.NO.: 16) sequences for human GPR8 were thereafter determined and verified.

6. GPR9 (GenBank Accession Number: X95876)

[0073] The cDNA for human GPR9 was generated and cloned into pCMV expression vector as follows: PCR was performed using a clone (provided by Brian O'Dowd) as template and pfu polymerase (Stratagene) with the buffer system provided by the manufacturer supplemented with 10% DMSO, 0.25 µM of each primer, and 0.5 mM of each of the 4 nucleotides. The cycle condition was 25 cycles of: 94°C for 1 min; 56°C for 1 min; and 72°C for 2.5 min. The 5' PCR primer contained an EcoRI site with the sequence:

5'-ACGAATTCAGCCATGGTCCTTGAGGTGAGTGACCACCAAGTGCTAAAT-3' (SEQ.ID.NO.: 17)

and the 3' primer contained a BamHI site with the sequence:

5'-GAGGATCCTGGAATGCGGGGAAGTCAG-3' (SEQ.ID.NO.: 18).

The 1.2 kb PCR fragment was digested with EcoRI and cloned into EcoRI-SmaI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 19) and amino acid (SEQ.ID.NO.: 20) sequences for human GPR9 were thereafter determined and verified.

7. GPR9-6 (GenBank Accession Number: U45982)

[0074] The cDNA for human GPR9-6 was generated and cloned into pCMV expression vector as follows: PCR was performed using genomic DNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 62°C for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer was kinased with the sequence:

5'-TTAAGCTTGACCTAATGCCATCTTGTGTCC-3' (SEQ.ID.NO.: 21)

and the 3' primer contained a BamHI site with the sequence:

5'-TTGGATCCAAAAGAACCATGCACCTCAGAG-3' (SEQ.ID.NO.: 22).

The 1.2 kb PCR fragment was digested with BamHI and cloned into EcoRV-BamHI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 23) and amino acid (SEQ.ID.NO.: 24) sequences for human GPR9-6 were thereafter determined and verified.

8. GPR10 (GenBank Accession Number: U32672)

[0075] The human cDNA sequence for GPR10 was provided in pRcCMV by Brian O'Dowd (University of Toronto). GPR10 cDNA (1.3kB fragment) was excised from the pRcCMV vector as an EcoRI-XbaI fragment and was subcloned into EcoRI-XbaI site of pCMV vector. Nucleic acid (SEQ.ID.NO.: 25) and amino acid (SEQ.ID.NO.: 26) sequences for human GPR10 were thereafter determined and verified.

9. GPR15 (GenBank Accession Number: U34806)

[0076] The human cDNA sequence for GPR15 was provided in pCDNA3 by Brian O'Dowd (University of Toronto). GPR15 cDNA (1.5kB fragment) was excised from the pCDNA3 vector as a HindIII-Bam fragment and was subcloned into HindIII-Bam site of pCMV vector. Nucleic acid (SEQ.ID.NO.: 27) and amino acid (SEQ.ID.NO.: 28) sequences for human GPR15 were thereafter determined and verified.

10. GPR17 (GenBank Accession Number: Z94154)

[0077] The cDNA for human GPR17 was generated and cloned into pCMV expression vector as follows: PCR was performed using genomic DNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 56°C for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contained an EcoRI site with the sequence:

5'-CTAGAATTCTGACTCCAGCCAAAGCATGAAT-3' (SEQ.ID.NO.: 29)

and the 3' primer contained a BamHI site with the sequence:

5'-GCTGGATCCTAAACAGTCTGCGCTCGGCCT-3' (SEQ.ID.NO.: 30).

The 1.1 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 31) and amino acid (SEQ.ID.NO.: 32) sequences for human GPR17 were thereafter determined and verified.

11. GPR18 (GenBank Accession Number: L42324)

[0078] The cDNA for human GPR18 was generated and cloned into pCMV expression vector as follows: PCR was performed using genomic DNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 54°C for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer was kinased with the sequence:

5'-ATAAGATGATCACCCTGAACAATCAAGAT-3' (SEQ.ID.NO.: 33)

and the 3' primer contained an EcoRI site with the sequence:

5'-TCCGAATTCATAACATTTCACTGTTTATATTGC-3' (SEQ.ID.NO.: 34).

The 1.0 kb PCR fragment was digested with EcoRI and cloned into blunt-EcoRI site of pCMV expression vector. All 8 cDNA clones sequenced contained 4 possible polymorphisms involving changes of amino acid 12 from Thr to Pro, amino acid 86 from Ala to Glu, amino acid 97 from Ile to Leu and amino acid 310 from Leu to Met. Aside from these changes, nucleic acid (SEQ.ID.NO.: 35) and amino acid (SEQ.ID.NO.: 36) sequences for human GPR18 were thereafter determined and verified.
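Differences of this kind between a reference sequence and a sequenced clone can be listed with a simple position-by-position comparison. The following is a minimal sketch, assuming two amino acid sequences of equal length with no insertions or deletions; the function name and the short toy sequences are hypothetical and are not the GPR18 sequences.

```python
def list_substitutions(reference, clone):
    """Report substitutions as reference residue, 1-based position, clone residue."""
    if len(reference) != len(clone):
        raise ValueError("sequences must be of equal length")
    return [
        f"{ref}{position}{obs}"
        for position, (ref, obs) in enumerate(zip(reference, clone), start=1)
        if ref != obs
    ]


# Toy sequences for illustration only.
changes = list_substitutions("MTAIL", "MPAIM")
```

A change that recurs in every independently sequenced clone, as reported above, is more plausibly a polymorphism than a PCR error, although that inference is outside what this sketch computes.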

12. GPR20 (GenBank Accession Number: U66579)

[0079] The cDNA for human GPR20 was generated and cloned into pCMV expression vector as follows: PCR was performed using genomic DNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 62°C for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer was kinased with the sequence:

5'-CCAAGCTTCCAGGCCTGGGGTGTGCTGG-3' (SEQ.ID.NO.: 37)

and the 3' primer contained a BamHI site with the sequence:

5'-ATGGATCCTGACCTTCGGCCCCTGGCAGA-3' (SEQ.ID.NO.: 38).

The 1.2 kb PCR fragment was digested with BamHI and cloned into EcoRV-BamHI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 39) and amino acid (SEQ.ID.NO.: 40) sequences for human GPR20 were thereafter determined and verified.

13. GPR21 (GenBank Accession Number: U66580)

[0080] The cDNA for human GPR21 was generated and cloned into pCMV expression vector as follows: PCR was performed using genomic DNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 62°C for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer was kinased with the sequence:

5'-GAGAATTCACTCCTGAGCTCAAGATGAACT-3' (SEQ.ID.NO.: 41)

and the 3' primer contained a BamHI site with the sequence:

5'-CGGGATCCCCGTAACTGAGCCACTTCAGAT-3' (SEQ.ID.NO.: 42).

The 1.1 kb PCR fragment was digested with BamHI and cloned into EcoRV-BamHI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 43) and amino acid (SEQ.ID.NO.: 44) sequences for human GPR21 were thereafter determined and verified.

14. GPR22 (GenBank Accession Number: U66581)

[0081] The cDNA for human GPR22 was generated and cloned into pCMV expression vector as follows: PCR was performed using genomic DNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 50°C for 1 min; and 72°C for 1.5 min. The 5' PCR primer was kinased with the sequence:

5'-TCCCCCGGGAAAAAAACCAACTGCTCCAAA-3' (SEQ.ID.NO.: 45)

and the 3' primer contained a BamHI site with the sequence:

5'-TAGGATCCATTTGAATGTGGATTTGGTGAAA-3' (SEQ.ID.NO.: 46).

The 1.38 kb PCR fragment was digested with BamHI and cloned into EcoRV-BamHI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 47) and amino acid (SEQ.ID.NO.: 48) sequences for human GPR22 were thereafter determined and verified.

15. GPR24 (GenBank Accession Number: U71092)

[0082] The cDNA for human GPR24 was generated and cloned into pCMV expression vector as follows: PCR was performed using genomic DNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 56°C for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contains a HindIII site with the sequence:

5'-GTGAAGCTTGCCTCTGGTGCCTGCAGGAGG-3' (SEQ.ID.NO.: 49)

and the 3' primer contains an EcoRI site with the sequence:

5'-GCAGAATTCCCGGTGGCGTGTTGTGGTGCCC-3' (SEQ.ID.NO.: 50).

The 1.3 kb PCR fragment was digested with HindIII and EcoRI and cloned into HindIII-EcoRI site of pCMV expression vector. The nucleic acid (SEQ.ID.NO.: 51) and amino acid sequence (SEQ.ID.NO.: 52) for human GPR24 were thereafter determined and verified.

16. GPR30 (GenBank Accession Number: U63917)

[0083] The cDNA for human GPR30 was generated and cloned as follows: the coding sequence of GPR30 (1128bp in length) was amplified from genomic DNA using the primers:

5'-GGCGGATCCATGGATGTGACTTCCCAA-3' (SEQ.ID.NO.: 53)

and

5'-GGCGGATCCCTACACGGCACTGCTGAA-3' (SEQ.ID.NO.: 54).

The amplified product was then cloned into a commercially available vector, pCR2.1 (Invitrogen), using a "TOPO-TA Cloning Kit" (Invitrogen, #K4500-01), following manufacturer instructions. The full-length GPR30 insert was liberated by digestion with BamHI, separated from the vector by agarose gel electrophoresis, and purified using a Sephaglas Bandprep™ Kit (Pharmacia, # 27-9285-01) following manufacturer instructions. The nucleic acid (SEQ.ID.NO.: 55) and amino acid sequence (SEQ.ID.NO.: 56) for human GPR30 were thereafter determined and verified.

17. GPR31 (GenBank Accession Number: U65402)

[0084] The cDNA for human GPR31 was generated and cloned into pCMV expression vector as follows: PCR was performed using genomic DNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 58°C for 1min; and 72 °C for 2 min. The 5' PCR primer contained an EcoRI site with the sequence:

5'-AAGGAATTCACGGCCGGGTGATGCCATTCCC-3' (SEQ.ID.NO.: 57)

and the 3' primer contained a BamHI site with the sequence:

5'-GGTGGATCCATAAACACGGGCGTTGAGGAC-3' (SEQ.ID.NO.: 58).

The 1.0 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 59) and amino acid (SEQ.ID.NO.: 60) sequences for human GPR31 were thereafter determined and verified.

18. GPR32 (GenBank Accession Number: AF045764)

[0085] The cDNA for human GPR32 was generated and cloned into pCMV expression vector as follows: PCR was performed using genomic DNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 56°C for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contained an EcoRI site with the sequence:

5'-TAAGAATTCCATAAAAATTATGGAATGG-3' (SEQ.ID.NO.:243)

and the 3' primer contained a BamHI site with the sequence:

5'-CCAGGATCCAGCTGAAGTCTTCCATCATTC-3' (SEQ.ID.NO.: 244).

The 1.1 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 245) and amino acid (SEQ.ID.NO.: 246) sequences for human GPR32 were thereafter determined and verified.

19. GPR40 (GenBank Accession Number: AF024687)

[0086] The cDNA for human GPR40 was generated and cloned into pCMV expression vector as follows: PCR was performed using genomic DNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 65°C for 1 min; and 72°C for 1 min and 10 sec. The 5' PCR primer contained an EcoRI site with the sequence:

5'-GCAGAATTCGGCGGCCCCATGGACCTGCCCCC-3' (SEQ.ID.NO.: 247)

and the 3' primer contained a BamHI site with the sequence

5'-GCTGGATCCCCCGAGCAGTGGCGTTACTTC-3' (SEQ.ID.NO.: 248).

The 1 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 249) and amino acid (SEQ.ID.NO.: 250) sequences for human GPR40 were thereafter determined and verified.

20. GPR41 (GenBank Accession Number AF024688)

[0087] The cDNA for human GPR41 was generated and cloned into pCMV expression vector as follows: PCR was performed using genomic DNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 65°C for 1 min; and 72°C for 1 min and 10 sec. The 5' PCR primer contained a HindIII site with the sequence:

5'-CTCAAGCTTACTCTCTCTCACCAGTGGCCAC-3' (SEQ.ID.NO.: 251)

and the 3' primer was kinased with the sequence

5'-CCCTCCTCCCCCGGAGGACCTAGC-3' (SEQ.ID.NO.: 252).

The 1 kb PCR fragment was digested with HindIII and cloned into HindIII-blunt site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 253) and amino acid (SEQ.ID.NO.: 254) sequences for human GPR41 were thereafter determined and verified.

21. GPR43 (GenBank Accession Number AF024690)

[0088] The cDNA for human GPR43 was generated and cloned into pCMV expression vector as follows: PCR was performed using genomic DNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 65°C for 1 min; and 72°C for 1 min and 10 sec. The 5' PCR primer contained a HindIII site with the sequence:

5'-TTTAAGCTTCCCCTCCAGGATGCTGCCGGAC-3' (SEQ.ID.NO.: 255)

and the 3' primer contained an EcoRI site with the sequence:

5'-GGCGAATTCTGAAGGTCCAGGGAAACTGCTA-3' (SEQ.ID.NO.: 256).

The 1 kb PCR fragment was digested with HindIII and EcoRI and cloned into HindIII-EcoRI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 257) and amino acid (SEQ.ID.NO.: 258) sequences for human GPR43 were thereafter determined and verified.

22. APJ (GenBank Accession Number: U03642)

[0089] Human APJ cDNA (in pRcCMV vector) was provided by Brian O'Dowd (University of Toronto). The human APJ cDNA was excised from the pRcCMV vector as an EcoRI-XbaI (blunted) fragment and was subcloned into EcoRI-SmaI site of pCMV vector. Nucleic acid (SEQ.ID.NO.: 61) and amino acid (SEQ.ID.NO.:62) sequences for human APJ were thereafter determined and verified.

23. BLR1 (GenBank Accession Number: X68149)

[0090] The cDNA for human BLR1 was generated and cloned into pCMV expression vector as follows: PCR was performed using thymus cDNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 62°C for 1min; and 72 °C for 1 min and 20 sec. The 5' PCR primer contained an EcoRI site with the sequence:

5'-TGAGAATTCTGGTGACTCACAGCCGGCACAG-3' (SEQ.ID.NO.: 63)

and the 3' primer contained a BamHI site with the sequence:

5'-GCCGGATCCAAGGAAAAGCAGCAATAAAAGG-3' (SEQ.ID.NO.: 64).

The 1.2 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 65) and amino acid (SEQ.ID.NO.: 66) sequences for human BLR1 were thereafter determined and verified.

24. CEPR (GenBank Accession Number: U77827)

[0091] The cDNA for human CEPR was generated and cloned into pCMV expression vector as follows: PCR was performed using genomic DNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 65°C for 1min; and 72 °C for 1 min and 20 sec. The 5' PCR primer was kinased with the sequence:

5'-CAAAGCTTGAAAGCTGCACGGTGCAGAGAC-3' (SEQ.ID.NO.:67)

and the 3' primer contained a BamHI site with the sequence:

5'-GCGGATCCCGAGTCACACCCTGGCTGGGCC-3' (SEQ.ID.NO.: 68).

The 1.2 kb PCR fragment was digested with BamHI and cloned into EcoRV-BamHI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 69) and amino acid (SEQ.ID.NO.: 70) sequences for human CEPR were thereafter determined and verified.

25. EBI1 (GenBank Accession Number: L31581)

[0092] The cDNA for human EBI1 was generated and cloned into pCMV expression vector as follows: PCR was performed using thymus cDNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 62°C for 1min; and 72 °C for 1 min and 20 sec. The 5' PCR primer contained an EcoRI site with the sequence:

5'-ACAGAATTCCTGTGTGGTTTTACCGCCCAG-3' (SEQ.ID.NO.: 71)

and the 3' primer contained a BamHI site with the sequence:

5'-CTCGGATCCAGGCAGAAGAGTCGCCTATGG-3' (SEQ.ID.NO.: 72).

The 1.2 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 73) and amino acid (SEQ.ID.NO.: 74) sequences for human EBI1 were thereafter determined and verified.

26. EBI2 (GenBank Accession Number: L08177)

[0093] The cDNA for human EBI2 was generated and cloned into pCMV expression vector as follows: PCR was performed using cDNA clone (graciously provided by Kevin Lynch, University of Virginia Health Sciences Center; the vector utilized was not identified by the source) as template and pfu polymerase (Stratagene) with the buffer system provided by the manufacturer supplemented with 10% DMSO, 0.25 µM of each primer, and 0.5 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 60°C for 1min; and 72°C for 1 min and 20 sec. The 5' PCR primer contained an EcoRI site with the sequence:

5'-CTGGAATTCACCTGGACCACCACCAATGGATA-3' (SEQ.ID.NO.: 75)

and the 3' primer contained a BamHI site with the sequence

5'-CTCGGATCCTGCAAAGTTTGTCATACAGTT-3' (SEQ.ID.NO.: 76).

The 1.2 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 77) and amino acid (SEQ.ID.NO.: 78) sequences for human EBI2 were thereafter determined and verified.

27. ETBR-LP2 (GenBank Accession Number: D38449)

[0094] The cDNA for human ETBR-LP2 was generated and cloned into pCMV expression vector as follows: PCR was performed using brain cDNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 65°C for 1 min; and 72°C for 1.5 min. The 5' PCR primer contained an EcoRI site with the sequence:

5'-CTGGAATTCTCCTGCTCATCCAGCCATGCGG-3' (SEQ.ID.NO.: 79)

and the 3' primer contained a BamHI site with the sequence:

5'-CCTGGATCCCCACCCCTACTGGGGCCTCAG-3' (SEQ.ID.NO.: 80).

The 1.5 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 81) and amino acid (SEQ.ID.NO.: 82) sequences for human ETBR-LP2 were thereafter determined and verified.

28. GHSR (GenBank Accession Number: U60179)

[0095] The cDNA for human GHSR was generated and cloned into pCMV expression vector as follows: PCR was performed using hippocampus cDNA as template and TaqPlus Precision polymerase (Stratagene) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 68°C for 1 min; and 72 °C for 1 min and 10 sec. For first round PCR, the 5' PCR primer sequence was:

5'-ATGTGGAACGCGACGCCCAGCG-3' (SEQ.ID.NO.: 83)

and the 3' primer sequence was:

5'-TCATGTATTAATACTAGATTCT-3' (SEQ.ID.NO.: 84).

Two microliters of the first round PCR was used as template for the second round PCR where the 5' primer was kinased with sequence:

5'-TACCATGTGGAACGCGACGCCCAGCGAAGAGCCGGGGT-3' (SEQ.ID.NO.:85)

and the 3' primer contained an EcoRI site with the sequence:

5'-CGGAATTCATGTATTAATACTAGATTCTGTCCAGGCCCG-3' (SEQ.ID.NO.:86).

The 1.1 kb PCR fragment was digested with EcoRI and cloned into blunt-EcoRI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 87) and amino acid (SEQ.ID.NO.: 88) sequences for human GHSR were thereafter determined and verified.

29. GPCR-CNS (GenBank Accession Number: AF017262)

[0096] The cDNA for human GPCR-CNS was generated and cloned into pCMV expression vector as follows: PCR was performed using brain cDNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 65°C for 1min; and 72°C for 2 min. The 5' PCR primer contained a HindIII site with the sequence:

5'-GCAAGCTTGTGCCCTCACCAAGCCATGCGAGCC-3' (SEQ.ID.NO.: 89)

and the 3' primer contained an EcoRI site with the sequence:

5'-CGGAATTCAGCAATGAGTTCCGACAGAAGC-3' (SEQ.ID.NO.: 90).

The 1.9 kb PCR fragment was digested with HindIII and EcoRI and cloned into HindIII-EcoRI site of pCMV expression vector. All nine clones sequenced contained a potential polymorphism involving a S284C change. Aside from this difference, nucleic acid (SEQ.ID.NO.: 91) and amino acid (SEQ.ID.NO.: 92) sequences for human GPCR-CNS were thereafter determined and verified.

30. GPR-NGA (GenBank Accession Number: U55312)

[0097] The cDNA for human GPR-NGA was generated and cloned into pCMV expression vector as follows: PCR was performed using genomic DNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of 94°C for 1 min, 56°C for 1 min and 72°C for 1.5 min. The 5' PCR primer contained an EcoRI site with the sequence:

5'-CAGAATTCAGAGAAAAAAAGTGAATATGGTTTTT-3' (SEQ.ID.NO.: 93)

and the 3' primer contained a BamHI site with the sequence:

5'-TTGGATCCCTGGTGCATAACAATTGAAAGAAT-3' (SEQ.ID.NO.: 94).

The 1.3 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 95) and amino acid (SEQ.ID.NO.: 96) sequences for human GPR-NGA were thereafter determined and verified.

31. H9 (GenBank Accession Number: U52219)

[0098] The cDNA for human H9 was generated and cloned into pCMV expression vector as follows: PCR was performed using pituitary cDNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 62°C for 1 min; and 72°C for 2 min. The 5' PCR primer contains a HindIII site with the sequence:

5'-GGAAAGCTTAACGATCCCCAGGAGCAACAT-3' (SEQ.ID.NO.: 97)

and the 3' primer contains a BamHI site with the sequence:

5'-CTGGGATCCTACGAGAGCATTTTTCACACAG-3' (SEQ.ID.NO.: 98).

The 1.9 kb PCR fragment was digested with HindIII and BamHI and cloned into HindIII-BamHI site of pCMV expression vector. When compared to the published sequences, a different isoform with a 12 bp in-frame insertion in the cytoplasmic tail was also identified and designated "H9b." Both isoforms contain two potential polymorphisms involving changes of amino acid P320S and amino acid G448A. Isoform H9a contained another potential polymorphism of amino acid S493N, while isoform H9b contained two additional potential polymorphisms involving changes of amino acid I502T and amino acid A532T (corresponding to amino acid 528 of isoform H9a). Nucleic acid (SEQ.ID.NO.: 99) and amino acid (SEQ.ID.NO.: 100) sequences for human H9 were thereafter determined and verified (in the section below, both isoforms were mutated in accordance with the Human GPCR Proline Marker Algorithm).

32. HB954 (GenBank Accession Number: D38449)

[0099] The cDNA for human HB954 was generated and cloned into pCMV expression vector as follows: PCR was performed using brain cDNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 58°C for 1 min; and 72°C for 2 min. The 5' PCR primer contained a HindIII site with the sequence:

5'-TCCAAGCTTCGCCATGGGACATAACGGGAGCT-3' (SEQ.ID.NO.: 101)

and the 3' primer contained an EcoRI site with the sequence:

5'-CGTGAATTCCAAGAATTTACAATCCTTGCT-3' (SEQ.ID.NO.: 102).

The 1.6 kb PCR fragment was digested with HindIII and EcoRI and cloned into HindIII-EcoRI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 103) and amino acid (SEQ.ID.NO.: 104) sequences for human HB954 were thereafter determined and verified.

33. HG38 (GenBank Accession Number: AF062006)

[0100] The cDNA for human HG38 was generated and cloned into pCMV expression vector as follows: PCR was performed using brain cDNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 56°C for 1 min; and 72°C for 1 min and 30 sec. Two PCR reactions were performed to separately obtain the 5' and 3' fragments. For the 5' fragment, the 5' PCR primer contained a HindIII site with the sequence:

5'-CCCAAGCTTCGGGCACCATGGACACCTCCC-3' (SEQ.ID.NO.: 259)

and the 3' primer contained a BamHI site with the sequence:

5'-ACAGGATCCAAATGCACAGCACTGGTAAGC-3' (SEQ.ID.NO.: 260).

This 5' 1.5 kb PCR fragment was digested with HindIII and BamHI and cloned into a HindIII-BamHI site of pCMV. For the 3' fragment, the 5' PCR primer was kinased with the sequence:

5'-CTATAACTGGGTIACATGGTTTAAC-3' (SEQ.ID.NO.: 261)

and the 3' primer contained an EcoRI site with the sequence:

5'-TTTGAATTCACATATTAATTAGAGACATGG-3' (SEQ.ID.NO.: 262).

The 1.4 kb 3' PCR fragment was digested with EcoRI and subcloned into a blunt-EcoRI site of pCMV vector. The 5' and 3' fragments were then ligated together through a common EcoR V site to generate the full length cDNA clone. Nucleic acid (SEQ.ID.NO.: 263) and amino acid (SEQ.ID.NO.: 264) sequences for human HG38 were thereafter determined and verified.

34. HM74 (GenBank Accession Number: D10923)

[0101] The cDNA for human HM74 was generated and cloned into pCMV expression vector as follows: PCR was performed using either genomic DNA or thymus cDNA (pooled) as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 65°C for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contained an EcoRI site with the sequence:

5'-GGAGAATTCACTAGGCGAGGCGCTCCATC-3' (SEQ.ID.NO.: 105)

and the 3' primer was kinased with the sequence:

5'-GGAGGATCCAGGAAACCTTAGGCCGAGTCC-3' (SEQ.ID.NO.:106).

The 1.3 kb PCR fragment was digested with EcoRI and cloned into EcoRI-SmaI site of pCMV expression vector. Clones sequenced revealed a potential polymorphism involving a N94K change. Aside from this difference, nucleic acid (SEQ.ID.NO.: 107) and amino acid (SEQ.ID.NO.: 108) sequences for human HM74 were thereafter determined and verified.

35. MIG (GenBank Accession Numbers: AF044600 and AF044601)

[0102] The cDNA for human MIG was generated and cloned into pCMV expression vector as follows: PCR was performed using genomic DNA as template and TaqPlus Precision polymerase (Stratagene) for first round PCR or pfu polymerase (Stratagene) for second round PCR with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM (TaqPlus Precision) or 0.5 mM (pfu) of each of the 4 nucleotides. When pfu was used, 10% DMSO was included in the buffer. The cycle condition was 30 cycles of: 94°C for 1 min; 65°C for 1min; and 72 °C for: (a) 1 min for first round PCR; and (b) 2 min for second round PCR. Because there is an intron in the coding region, two sets of primers were separately used to generate overlapping 5' and 3' fragments. The 5' fragment PCR primers were:

5'-ACCATGGCTTGCAATGGCAGTGCGGCCAGGGGGCACT-3' (external sense) (SEQ.ID.NO.: 109)

and

5'-CGACCAGGACAAACAGCATCTTGGTCACTTGTCTCCGGC-3' (internal antisense) (SEQ.ID.NO.: 110).

[0103] The 3' fragment PCR primers were:

5'-GACCAAGATGCTGTTTGTCCTGGTCGTGGTGTTTGGCAT-3' (internal sense) (SEQ.ID.NO.: 111)

and

5'-CGGAATTCAGGATGGATCGGTCTCTTGCTGCGCCT-3' (external antisense with an EcoRI site) (SEQ.ID.NO.: 112).

The 5' and 3' fragments were joined by using the first round PCR products as template and the kinased external sense primer and external antisense primer to perform second round PCR. The 1.2 kb PCR fragment was digested with EcoRI and cloned into the blunt-EcoRI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 113) and amino acid (SEQ.ID.NO.: 114) sequences for human MIG were thereafter determined and verified.
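The overlap between the 5' and 3' MIG fragments is defined by the two internal primers, which share a complementary region. As a minimal sketch (the helper names are hypothetical and the code is illustrative only), the length of that shared region can be computed from the internal antisense primer (SEQ.ID.NO.: 110) and the internal sense primer (SEQ.ID.NO.: 111):

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_overlap(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return k
    return 0


internal_antisense = "CGACCAGGACAAACAGCATCTTGGTCACTTGTCTCCGGC"  # SEQ.ID.NO.: 110
internal_sense = "GACCAAGATGCTGTTTGTCCTGGTCGTGGTGTTTGGCAT"      # SEQ.ID.NO.: 111

# Express the antisense primer on the sense strand, then look for the shared region.
overlap = longest_overlap(reverse_complement(internal_antisense), internal_sense)
print(overlap, "nt shared between the 5' and 3' fragments")
```

This shared region is what allows the two first round products to anneal to one another and be extended into a single template in the second round PCR.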

36. OGR1 (GenBank Accession Number: U48405)

[0104] The cDNA for human OGR1 was generated and cloned into pCMV expression vector as follows: PCR was performed using genomic DNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 65°C for 1min; and 72 °C for 1 min and 20 sec. The 5' PCR primer was kinased with the sequence:

5'-GGAAGCTTCAGGCCCAAAGATGGGGAACAT-3' (SEQ.ID.NO.: 115)

and the 3' primer contained a BamHI site with the sequence:

5'-GTGGATCCACCCGCGGAGGACCCAGGCTAG-3' (SEQ.ID.NO.: 116).

The 1.1 kb PCR fragment was digested with BamHI and cloned into the EcoRV-BamHI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 117) and amino acid (SEQ.ID.NO.: 118) sequences for human OGR1 were thereafter determined and verified.

37. Serotonin 5HT_2A

[0105] The cDNA encoding endogenous human 5HT_2A receptor was obtained by RT-PCR using human brain poly-A⁺ RNA; a 5' primer from the 5' untranslated region with an Xho I restriction site:

5'-GACCTCGAGTCCTTCTACACCTCATC-3' (SEQ.ID.NO.: 119)

and a 3' primer from the 3' untranslated region containing an Xba I site:

5'-TGCTCTAGATTCCAGATAGGTGAAAACTTG-3' (SEQ.ID.NO.: 120)

PCR was performed using either TaqPlus™ Precision polymerase (Stratagene) or rTth™ polymerase (Perkin Elmer) with the buffer system provided by the manufacturers, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 57°C for 1 min; and 72°C for 2 min. The 1.5 kb PCR fragment was digested with Xba I and subcloned into Eco RV-Xba I site of pBluescript. The resulting cDNA clones were fully sequenced and found to encode two amino acid changes from the published sequences. The first was a T25N mutation in the N-terminal extracellular domain; the second was an H452Y mutation. Because cDNA clones derived from two independent PCR reactions using Taq polymerase from two different commercial sources (TaqPlus™ from Stratagene and rTth™ from Perkin Elmer) contained the same two mutations, these mutations are likely to represent sequence polymorphisms rather than PCR errors. With these exceptions, the nucleic acid (SEQ.ID.NO.: 121) and amino acid (SEQ.ID.NO.: 122) sequences for human 5HT_2A were thereafter determined and verified.

38. Serotonin 5HT_2C

[0106] The cDNA encoding endogenous human 5HT_2C receptor was obtained from human brain poly-A⁺ RNA by RT-PCR. The 5' and 3' primers were derived from the 5' and 3' untranslated regions and contained the following sequences:

5'-GACCTCGAGGTTGCTTAAGACTGAAGC-3' (SEQ.ID.NO.: 123)

5'-ATTTCTAGACATATGTAGCTTGTACCG-3' (SEQ.ID.NO.: 124)

Nucleic acid (SEQ.ID.NO.: 125) and amino acid (SEQ.ID.NO.: 126) sequences for human 5HT_2C were thereafter determined and verified.

39. V28 (GenBank Accession Number: U20350)

[0107] The cDNA for human V28 was generated and cloned into pCMV expression vector as follows: PCR was performed using brain cDNA as template and rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each of the 4 nucleotides. The cycle condition was 30 cycles of: 94°C for 1 min; 65°C for 1 min; and 72°C for 1 min and 20 sec. The 5' PCR primer contained a HindIII site with the sequence:

5'-GGTAAGCTTGGCAGTCCACGCCAGGCCTTC-3' (SEQ.ID.NO.: 127)

and the 3' primer contained an EcoRI site with the sequence:

5'-TCCGAATTCTCTGTAGACACAAGGCTTTGG-3' (SEQ.ID.NO.: 128)

The 1.1 kb PCR fragment was digested with HindIII and EcoRI and cloned into HindIII-EcoRI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 129) and amino acid (SEQ.ID.NO.: 130) sequences for human V28 were thereafter determined and verified.

Example 2

PREPARATION OF NON-ENDOGENOUS HUMAN GPCRS

1. Site-Directed Mutagenesis

[0108] Mutagenesis based upon the Human GPCR Proline Marker approach disclosed herein was performed on the foregoing endogenous human GPCRs using Transformer Site-Directed Mutagenesis Kit (Clontech) according to the manufacturer instructions. For this mutagenesis approach, a Mutation Probe and a Selection Marker Probe (unless otherwise indicated, the probe of SEQ.ID.NO.: 132 was the same throughout) were utilized, and the sequences of these for the specified sequences are listed below in Table B (the parenthetical number is the SEQ. ID.NO.). For convenience, the codon mutation incorporated into the human GPCR is also noted, in standard form:

Table B

Receptor Identifier (Codon Mutation)	Mutation Probe Sequence (5'-3') (SEQ.ID.NO.)	Selection Marker Probe Sequence (5'-3') (SEQ.ID.NO.)
GPR1 (F245K)
GPR4 (K223A)
GPR5 (V224K)
GPR7 (T250K)

GPR8 (T259K)
GPR9 (M254K)
GPR9-6 (L241K)

GPR10 (F276K)

GPR15 (I240K)
GPR17 (V234K)
GPR18 (I231K)
GPR20 (M240K)
GPR21 (A251K)

GPR22 (F312K)
GPR24 (T304K)
GPR30 (L258K)	alternate approach; see below	alternate approach; see below
GPR31 (Q221K)
GPR32 (K255A)
GPR40 (A223K)
GPR41 (A223K)
GPR43 (V221K)
APJ (L247K)	alternate approach; see below	alternate approach; see below
BLR1 (V258K)
CEPR (L258K)

EBI1 (I262K)
EBI2 (L243K)
ETBR-LP2 (N358K)
GHSR (V262K)
GPCR-CNS (N491K)
GPR-NGA (I275K)
H9a and H9b (F236K)
HB954 (H265K)
HG38 (V765K)

HM74 (I230K)
MIG (T273K)
OGR1 (Q227K)
Serotonin 5HT_2A (C322K)	alternate approach; see below	alternate approach; see below
Serotonin 5HT_2C (S310K)	alternate approach; see below	alternate approach; see below
V28 (I230K)
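The codon mutations in Table B use the standard form of wild-type residue, position, replacement residue (e.g., F245K denotes phenylalanine at position 245 replaced by lysine). As a minimal sketch of how this notation can be read and applied to an amino acid sequence, the following Python uses hypothetical helper names and a short made-up peptide, not any sequence from the Sequence Listing:

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement residue.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(code: str):
    """Split a code such as 'F245K' into ('F', 245, 'K')."""
    match = MUTATION_RE.match(code)
    if match is None:
        raise ValueError(f"not a point-mutation code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_mutation(protein: str, code: str) -> str:
    """Return the protein with the substitution applied, checking the wild-type residue."""
    wild_type, position, replacement = parse_mutation(code)
    if protein[position - 1] != wild_type:
        raise ValueError(f"residue {position} is {protein[position - 1]}, not {wild_type}")
    return protein[: position - 1] + replacement + protein[position:]


print(parse_mutation("F245K"))
print(apply_mutation("MKFL", "F3K"))  # toy peptide for illustration only
```

The wild-type check guards against applying a mutation code to the wrong receptor or to a sequence with different numbering.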

The non-endogenous human GPCRs were then sequenced and the derived and verified nucleic acid and amino acid sequences are listed in the accompanying "Sequence Listing" appendix to this patent document, as summarized in Table C below:

Table C

Mutated GPCR	Nucleic Acid Sequence Listing	Amino Acid Sequence Listing
GPR1 (F245K)	SEQ.ID.NO.: 163	SEQ.ID.NO.: 164
GPR4 (K223A)	SEQ.ID.NO.: 165	SEQ.ID.NO.: 166
GPR5 (V224K)	SEQ.ID.NO.: 167	SEQ.ID.NO.: 168
GPR7 (T250K)	SEQ.ID.NO.: 169	SEQ.ID.NO.: 170
GPR8 (T259K)	SEQ.ID.NO.: 171	SEQ.ID.NO.: 172
GPR9 (M254K)	SEQ.ID.NO.: 173	SEQ.ID.NO.: 174
GPR9-6 (L241K)	SEQ.ID.NO.: 175	SEQ.ID.NO.: 176
GPR10 (F276K)	SEQ.ID.NO.: 177	SEQ.ID.NO.: 178
GPR15 (I240K)	SEQ.ID.NO.: 179	SEQ.ID.NO.: 180
GPR17 (V234K)	SEQ.ID.NO.: 181	SEQ.ID.NO.: 182
GPR18 (I231K)	SEQ.ID.NO.: 183	SEQ.ID.NO.: 184
GPR20 (M240K)	SEQ.ID.NO.: 185	SEQ.ID.NO.: 186
GPR21 (A251K)	SEQ.ID.NO.: 187	SEQ.ID.NO.: 188
GPR22 (F312K)	SEQ.ID.NO.: 189	SEQ.ID.NO.: 190
GPR24 (T304K)	SEQ.ID.NO.: 191	SEQ.ID.NO.: 192
GPR30 (L258K)	SEQ.ID.NO.: 193	SEQ.ID.NO.: 194
GPR31 (Q221K)	SEQ.ID.NO.: 195	SEQ.ID.NO.: 196
GPR32 (K255A)	SEQ.ID.NO.: 269	SEQ.ID.NO.: 270
GPR40 (A223K)	SEQ.ID.NO.: 271	SEQ.ID.NO.: 272
GPR41 (A223K)	SEQ.ID.NO.: 273	SEQ.ID.NO.: 274
GPR43 (V221K)	SEQ.ID.NO.: 275	SEQ.ID.NO.: 276
APJ (L247K)	SEQ.ID.NO.: 197	SEQ.ID.NO.: 198
BLR1 (V258K)	SEQ.ID.NO.: 199	SEQ.ID.NO.: 200
CEPR (L258K)	SEQ.ID.NO.: 201	SEQ.ID.NO.: 202
EBI1 (I262K)	SEQ.ID.NO.: 203	SEQ.ID.NO.: 204
EBI2 (L243K)	SEQ.ID.NO.: 205	SEQ.ID.NO.: 206
ETBR-LP2 (N358K)	SEQ.ID.NO.: 207	SEQ.ID.NO.: 208
GHSR (V262K)	SEQ.ID.NO.: 209	SEQ.ID.NO.: 210
GPCR-CNS (N491K)	SEQ.ID.NO.: 211	SEQ.ID.NO.: 212
GPR-NGA (I275K)	SEQ.ID.NO.: 213	SEQ.ID.NO.: 214
H9a (F236K)	SEQ.ID.NO.: 215	SEQ.ID.NO.: 216
H9b (F236K)	SEQ.ID.NO.: 217	SEQ.ID.NO.: 218
HB954 (H265K)	SEQ.ID.NO.: 219	SEQ.ID.NO.: 220
HG38 (V765K)	SEQ.ID.NO.: 277	SEQ.ID.NO.: 278
HM74 (I230K)	SEQ.ID.NO.: 221	SEQ.ID.NO.: 222
MIG (T273K)	SEQ.ID.NO.: 223	SEQ.ID.NO.: 224
OGR1 (Q227K)	SEQ.ID.NO.: 225	SEQ.ID.NO.: 226
Serotonin 5HT_2A (C322K)	SEQ.ID.NO.: 227	SEQ.ID.NO.: 228
Serotonin 5HT_2C (S310K)	SEQ.ID.NO.: 229	SEQ.ID.NO.: 230
V28 (I230K)	SEQ.ID.NO.: 231	SEQ.ID.NO.: 232

2. Alternate Mutation Approaches for Employment of the Proline Marker Algorithm: APJ; Serotonin 5HT_2A; Serotonin 5HT_2C; and GPR30

[0109] Although the above site-directed mutagenesis approach is particularly preferred, other approaches can be utilized to create such mutations; those skilled in the art are readily credited with selecting approaches to mutating a GPCR that fit within the particular needs of the artisan.

a. APJ

[0110] Preparation of the non-endogenous, human APJ receptor was accomplished by mutating L247K. Two oligonucleotides containing this mutation were synthesized:

5'-GGCTTAAGAGCATCATCGTGGTGCTGGTG-3' (SEQ.ID.NO.: 233)

5'-GTCACCACCAGCACCACGATGATGCTCTTAAGCC-3' (SEQ.ID.NO.: 234)

The two oligonucleotides were annealed and used to replace the NaeI-BstEII fragment of human, endogenous APJ to generate the non-endogenous version of human APJ.
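The pairing of the two APJ oligonucleotides can be examined computationally. The sketch below (hypothetical helper name, illustrative only) checks that the shorter oligonucleotide is fully base-paired by the longer one and reports the unpaired 5' extension of the longer strand; a duplex of this shape, with one flush end and one single-stranded end, is what would be expected for a fragment placed between a blunt-cutting enzyme and one that leaves a 5' overhang.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


top = "GGCTTAAGAGCATCATCGTGGTGCTGGTG"          # SEQ.ID.NO.: 233
bottom = "GTCACCACCAGCACCACGATGATGCTCTTAAGCC"  # SEQ.ID.NO.: 234

# If the top strand is fully paired, it is a prefix of the bottom strand's reverse complement.
fully_paired = reverse_complement(bottom).startswith(top)

# Bases at the 5' end of the bottom strand left unpaired after annealing.
overhang = bottom[: len(bottom) - len(top)]

print("top strand fully paired:", fully_paired)
print("5' overhang on bottom strand:", overhang)
```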

b. Serotonin 5HT_2A

[0111] cDNA containing the point mutation C322K was constructed by utilizing the restriction enzyme site Sph I which encompasses amino acid 322. A primer containing the C322K mutation:

5'-CAAAGAAAGTACTGGGCATCGTCTTCTTCCT-3' (SEQ.ID.NO.: 235)

was used along with the primer from the 3' untranslated region of the receptor:

5'-TGCTCTAGATTCCAGATAGGTGAAAACTTG-3' (SEQ.ID.NO.: 236)

to perform PCR (under the conditions described above). The resulting PCR fragment was then used to replace the 3' end of endogenous 5HT_2A cDNA through the T4 polymerase blunted Sph I site.

c. Serotonin 5HT_2C

[0112] The cDNA containing a S310K mutation was constructed by replacing the Sty I restriction fragment containing amino acid 310 with synthetic double stranded oligonucleotides that encode the desired mutation. The sense strand sequence utilized had the following sequence:

5'-CTAGGGGCACCATGCAGGCTATCAACAATGAAAGAAAAGCTAAGAAAGTC-3' (SEQ.ID.NO.: 237)

and the antisense strand sequence utilized had the following sequence:

5'-CAAGGACTTTCTTAGCTTTTCTTTCATTGTTGATAGCCTGCATGGTGCCC-3' (SEQ.ID.NO.: 238)

d. GPR30

[0113] Prior to generating non-endogenous GPR30, several independent pCR2.1/GPR30 isolates were sequenced in their entirety in order to identify clones with no PCR-generated mutations. A clone having no mutations was digested with EcoRI and the endogenous GPR30 cDNA fragment was transferred into the CMV-driven expression plasmid pCI-neo (Promega), by digesting pCI-neo with EcoRI and subcloning the EcoRI-liberated GPR30 fragment from pCR2.1/GPR30, to generate pCI/GPR30. Thereafter, the leucine at codon 258 was mutated to a lysine using a Quick-Change™ Site-Directed Mutagenesis Kit (Stratagene, #200518), according to manufacturer's instructions, and the following primers:

5'-CGGCGGCAGAAGGCGAAACGCATGATCCTCGCGGT-3' (SEQ.ID.NO.: 239)

and

5'-ACCGCGAGGATCATGCGTTTCGCCTTCTGCCGCCG-3' (SEQ.ID.NO.: 240)
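
As a minimal sketch, and not as part of the described protocol, the following Python snippet confirms that the two mutagenesis primers listed above (SEQ.ID.NO.: 239 and 240) are exact reverse complements of one another, which is the usual arrangement for a primer pair that anneals to the same site on opposite strands of a plasmid. The function names are hypothetical.

```python
# Illustrative check only; function names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def are_complementary(primer_a: str, primer_b: str) -> bool:
    """True if primer_b is the exact reverse complement of primer_a."""
    return reverse_complement(primer_a) == primer_b


PRIMER_239 = "CGGCGGCAGAAGGCGAAACGCATGATCCTCGCGGT"
PRIMER_240 = "ACCGCGAGGATCATGCGTTTCGCCTTCTGCCGCCG"

print(len(PRIMER_239), are_complementary(PRIMER_239, PRIMER_240))
```

The same check can be applied to any other complementary primer pair before ordering oligonucleotides, since a single transcription error in one strand makes the comparison fail.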

Example 3

Receptor (Endogenous and Mutated) Expression

[0114] Although a variety of cells are available to the art for the expression of proteins, it is most preferred that mammalian cells be utilized. The primary reason for this is predicated upon practicalities, i.e., utilization of, e.g., yeast cells for the expression of a GPCR, while possible, introduces into the protocol a non-mammalian cell which may not (indeed, in the case of yeast, does not) include the receptor-coupling, genetic-mechanism and secretory pathways that have evolved for mammalian systems; thus, results obtained in non-mammalian cells, while of potential use, are not as preferred as those obtained from mammalian cells. Of the mammalian cells, COS-7, 293 and 293T cells are particularly preferred, although the specific mammalian cell utilized can be predicated upon the particular needs of the artisan.

[0115] Unless otherwise noted herein, the following protocol was utilized for the expression of the endogenous and non-endogenous human GPCRs. Table D lists the mammalian cell and number utilized (per 150mm plate) for GPCR expression.

Table D

Receptor Name (Endogenous or Non-Endogenous)	Mammalian Cell (Number Utilized)
GPR17	293 (2 x 10⁴)
GPR30	293 (4 x 10⁴)
APJ	COS-7 (5 x 10⁶)
ETBR-LP2	293 (1 x 10⁷)
	293T (1 x 10⁷)
GHSR	293 (1 x 10⁷)
	293T (1 x 10⁷)
MIG	293 (1 x 10⁷)
Serotonin 5HT_2A	293T (1 x 10⁷)
Serotonin 5HT_2C	293T (1 x 10⁷)

[0116] On day one, mammalian cells were plated out. On day two, two reaction tubes were prepared (the proportions to follow for each tube are per plate): tube A was prepared by mixing 20µg DNA (e.g., pCMV vector; pCMV vector with endogenous receptor cDNA; and pCMV vector with non-endogenous receptor cDNA) in 1.2ml serum free DMEM (Irvine Scientific, Irvine, CA); tube B was prepared by mixing 120µl lipofectamine (Gibco BRL) in 1.2ml serum free DMEM. Tubes A and B were then admixed by inversions (several times), followed by incubation at room temperature for 30-45min. The admixture is referred to as the "transfection mixture". Plated cells were washed with 1X PBS, followed by addition of 10ml serum free DMEM. 2.4ml of the transfection mixture was then added to the cells, followed by incubation for 4hrs at 37°C/5% CO₂. The transfection mixture was then removed by aspiration, followed by the addition of 25ml of DMEM/10% Fetal Bovine Serum. Cells were then incubated at 37°C/5% CO₂. After 72hr incubation, cells were harvested and utilized for analysis.
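
For illustration, the per-plate proportions given above can be scaled to several plates as in the following Python sketch. The class and function names are hypothetical, and the sketch assumes simple linear scaling with the number of plates, which the specification does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransfectionMix:
    """Reagent amounts for tubes A and B (hypothetical container)."""
    dna_ug: float            # tube A: plasmid DNA
    dmem_tube_a_ml: float    # tube A: serum free DMEM
    lipofectamine_ul: float  # tube B: lipofectamine
    dmem_tube_b_ml: float    # tube B: serum free DMEM


# Per-plate amounts taken from the protocol text (150mm plate).
PER_PLATE = TransfectionMix(dna_ug=20.0, dmem_tube_a_ml=1.2,
                            lipofectamine_ul=120.0, dmem_tube_b_ml=1.2)


def scale_mix(n_plates: int, per_plate: TransfectionMix = PER_PLATE) -> TransfectionMix:
    """Scale every reagent linearly with the number of plates (assumption)."""
    if n_plates < 1:
        raise ValueError("n_plates must be at least 1")
    return TransfectionMix(
        dna_ug=per_plate.dna_ug * n_plates,
        dmem_tube_a_ml=per_plate.dmem_tube_a_ml * n_plates,
        lipofectamine_ul=per_plate.lipofectamine_ul * n_plates,
        dmem_tube_b_ml=per_plate.dmem_tube_b_ml * n_plates,
    )


print(scale_mix(3))
```

In practice a small excess is often prepared to allow for pipetting losses; that margin is left to the artisan and is not modelled here.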

1. Gi-Coupled Receptors: Co-Transfection with Gs-Coupled Receptors

[0117] In the case of GPR30, it has been determined that this receptor couples to the G protein Gi. Gi is known to inhibit the enzyme adenylyl cyclase, which is necessary for catalyzing the conversion of ATP to cAMP. Thus, a non-endogenous, constitutively activated form of GPR30 would be expected to be associated with decreased levels of cAMP. Assay confirmation of a non-endogenous, constitutively activated form of GPR30 directly via measurement of decreasing levels of cAMP, while viable, is preferably obtained by cooperative use of a Gs-coupled receptor. For example, a receptor that is Gs-coupled will stimulate adenylyl cyclase, and thus will be associated with an increase in cAMP. The assignee of the present application has discovered that the orphan receptor GPR6 is an endogenous, constitutively activated GPCR. GPR6 couples to the Gs protein. Thus, when the receptors are co-transfected, one can readily verify that a putative GPR30 mutation leads to constitutive activation thereof: i.e., an endogenous, constitutively activated GPR6/endogenous, non-constitutively activated GPR30 cell will evidence an elevated level of cAMP when compared with an endogenous, constitutively active GPR6/non-endogenous, constitutively activated GPR30 cell (the latter evidencing a comparatively lower level of cAMP). Assays that detect cAMP can be utilized to determine if a candidate compound is, e.g., an inverse agonist to a Gs-associated receptor (i.e., such a compound would decrease the levels of cAMP) or a Gi-associated receptor (or a Go-associated receptor) (i.e., such a candidate compound would increase the levels of cAMP). A variety of approaches known in the art for measuring cAMP can be utilized; a preferred approach relies upon the use of anti-cAMP antibodies. Another approach, and most preferred, utilizes a whole cell second messenger reporter system assay. Promoters on genes drive the expression of the proteins that a particular gene encodes.
Cyclic AMP drives gene expression by promoting the binding of a cAMP-responsive DNA binding protein or transcription factor (CREB) which then binds to the promoter at specific sites called cAMP response elements and drives the expression of the gene. Reporter systems can be constructed which have a promoter containing multiple cAMP response elements before the reporter gene, e.g., β-galactosidase or luciferase. Thus, an activated receptor such as GPR6 causes the accumulation of cAMP which then activates the gene and expression of the reporter protein. Most preferably, 293 cells are co-transfected with GPR6 (or another Gs-linked receptor) and GPR30 (or another Gi-linked receptor) plasmids, preferably in a 1:1 ratio, most preferably in a 1:4 ratio. Because GPR6 is an endogenous, constitutively active receptor that stimulates the production of cAMP, GPR6 strongly activates the reporter gene and its expression. The reporter protein such as β-galactosidase or luciferase can then be detected using standard biochemical assays (Chen et al. 1995). Co-transfection of endogenous, constitutively active GPR6 with endogenous, non-constitutively active GPR30 evidences an increase in the luciferase reporter protein. Conversely, co-transfection of endogenous, constitutively active GPR6 with non-endogenous, constitutively active GPR30 evidences a drastic decrease in expression of luciferase. Several reporter plasmids are known and available in the art for use in a second messenger assay. It is considered well within the skill of the artisan to determine an appropriate reporter plasmid for a particular gene expression based primarily upon the particular need of the artisan. Although a variety of cells are available for expression, mammalian cells are most preferred, and of these types, 293 cells are most preferred.
293 cells were transfected with the reporter plasmid pCRE-Luc/GPR6 and non-endogenous, constitutively activated GPR30 using a Mammalian Transfection™ Kit (Stratagene, #200285) CaPO₄ precipitation protocol according to the manufacturer's instructions (see 28 Genomics 347 (1995) for the published endogenous GPR6 sequence). The precipitate contained 400ng reporter, 80ng CMV-expression plasmid (having a 1:4 GPR6 to endogenous GPR30 or non-endogenous GPR30 ratio) and 20ng CMV-SEAP (a transfection control plasmid encoding secreted alkaline phosphatase). 50% of the precipitate was split into 3 wells of a 96-well tissue culture dish (containing 4 x 10⁴ cells/well); the remaining 50% was discarded. The following morning, the media was changed. 48 hr after the start of the transfection, cells were lysed and examined for luciferase activity using a LucLite™ Kit (Packard, Cat. # 6016911) and Trilux 1450 MicroBeta™ liquid scintillation and luminescence counter (Wallac) as per the vendor's instructions. The data were analyzed using GraphPad Prism 2.0a (GraphPad Software Inc.).
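
The composition of the calcium phosphate precipitate described above can be worked through as in the following Python sketch, offered by way of example only. It assumes that the 1:4 GPR6 to GPR30 ratio is a ratio by mass of plasmid within the 80ng of expression plasmid and that the retained half of the precipitate is divided evenly over the 3 wells; neither assumption is stated in the specification, and the names used are hypothetical.

```python
def split_by_ratio(total_ng: float, ratio: tuple) -> list:
    """Divide a total mass among components in the given ratio (by mass)."""
    parts = sum(ratio)
    return [total_ng * r / parts for r in ratio]


REPORTER_NG = 400.0    # pCRE-Luc reporter
EXPRESSION_NG = 80.0   # CMV-expression plasmid (GPR6 + GPR30)
SEAP_NG = 20.0         # CMV-SEAP transfection control

# Assumed interpretation: 1 part GPR6 plasmid to 4 parts GPR30 plasmid.
gpr6_ng, gpr30_ng = split_by_ratio(EXPRESSION_NG, (1, 4))

total_ng = REPORTER_NG + EXPRESSION_NG + SEAP_NG
# Half of the precipitate is used and split over 3 wells of the 96-well dish.
per_well_ng = total_ng * 0.5 / 3

print(gpr6_ng, gpr30_ng, total_ng, round(per_well_ng, 1))
```

On these assumptions the precipitate holds 16ng of GPR6 plasmid and 64ng of GPR30 plasmid in 500ng of total DNA, with roughly 83ng of DNA reaching each well.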

[0118] With respect to GPR17, which has also been determined to be Gi-linked, a modification of the foregoing approach was utilized, based upon, inter alia, use of another Gs-linked endogenous receptor, GPR3 (see 23 Genomics 609 (1994) and 24 Genomics 391 (1994)). Most preferably, 293 cells are utilized. These cells were plated out on 96 well plates at a density of 2 x 10⁴ cells per well and were transfected using Lipofectamine Reagent (BRL) the following day according to manufacturer instructions. A DNA/lipid mixture was prepared for each 6-well transfection as follows: 260ng of plasmid DNA in 100µl of DMEM were gently mixed with 2µl of lipid in 100µl of DMEM (the 260ng of plasmid DNA consisted of 200ng of an 8xCRE-Luc reporter plasmid (see below), 50ng of pCMV comprising endogenous receptor or non-endogenous receptor or pCMV alone, and 10ng of a GPR3 expression plasmid (GPR3 in pcDNA3 (Invitrogen))). The 8xCRE-Luc reporter plasmid was prepared as follows: vector SRIF-β-gal was obtained by cloning the rat somatostatin promoter (-71/+51) at BgIV-HindIII site in the pβgal-Basic Vector (Clontech). Eight (8) copies of cAMP response element were obtained by PCR from an adenovirus template AdpCF126CCRE8 (see 7 Human Gene Therapy 1883 (1996)) and cloned into the SRIF-β-gal vector at the Kpn-BgIV site, resulting in the 8xCRE-β-gal reporter vector. The 8xCRE-Luc reporter plasmid was generated by replacing the beta-galactosidase gene in the 8xCRE-β-gal reporter vector with the luciferase gene obtained from the pGL3-basic vector (Promega) at the HindIII-BamHI site. Following 30min. incubation at room temperature, the DNA/lipid mixture was diluted with 400 µl of DMEM and 100µl of the diluted mixture was added to each well. 100 µl of DMEM with 10% FCS were added to each well after a 4hr incubation in a cell culture incubator. The next morning the medium on the transfected cells was changed to 200 µl/well of DMEM with 10% FCS.
Eight (8) hours later, the medium in the wells was changed to 100 µl/well of DMEM without phenol red, after one wash with PBS. Luciferase activity was measured the next day using the LucLite™ reporter gene assay kit (Packard) following manufacturer instructions and read on a 1450 MicroBeta™ scintillation and luminescence counter (Wallac).

[0119] Figure 4 evidences that constitutively active GPR30 inhibits GPR6-mediated activation of CRE-Luc reporter in 293 cells. Luciferase was measured at about 4.1 relative light units for the expression vector pCMV. Endogenous GPR30 expressed luciferase at about 8.5 relative light units, whereas the non-endogenous, constitutively active GPR30 (L258K) expressed luciferase at about 3.8 and 3.1 relative light units, respectively. Co-transfection of endogenous GPR6 with endogenous GPR30, at a 1:4 ratio, drastically increased luciferase expression to about 104.1 relative light units. Co-transfection of endogenous GPR6 with non-endogenous GPR30 (L258K), at the same ratio, drastically decreased the expression, which is evident at about 18.2 and 29.5 relative light units, respectively. Similar results were observed for GPR17 upon co-transfection with GPR3, as set forth in Figure 5.

Example 3

ASSAYS FOR DETERMINATION OF CONSTITUTIVE ACTIVITY OF NON-ENDOGENOUS GPCRS

A. Membrane Binding Assays

1. [³⁵S]GTPγS Assay

[0120] When a G protein-coupled receptor is in its active state, either as a result of ligand binding or constitutive activation, the receptor couples to a G protein and stimulates the release of GDP and subsequent binding of GTP to the G protein. The alpha subunit of the G protein-receptor complex acts as a GTPase and slowly hydrolyzes the GTP to GDP, at which point the receptor normally is deactivated. Constitutively activated receptors continue to exchange GDP for GTP. The non-hydrolyzable GTP analog, [³⁵S]GTPγS, can be utilized to demonstrate enhanced binding of [³⁵S]GTPγS to membranes expressing constitutively activated receptors. The advantages of using [³⁵S]GTPγS binding to measure constitutive activation are that: (a) it is generically applicable to all G protein-coupled receptors; and (b) it is proximal at the membrane surface, making it less likely to pick up molecules which affect the intracellular cascade.

[0121] The assay utilizes the ability of G protein coupled receptors to stimulate [³⁵S]GTPγS binding to membranes expressing the relevant receptors. The assay can, therefore, be used in the direct identification method to screen candidate compounds to known, orphan and constitutively activated G protein-coupled receptors. The assay is generic and has application to drug discovery at all G protein-coupled receptors.
The [³⁵S]GTPγS assay was incubated in 20 mM HEPES and between 1 and about 20mM MgCl₂ (this amount can be adjusted for optimization of results, although 20mM is preferred) pH 7.4, binding buffer with between about 0.3 and about 1.2 nM [³⁵S]GTPγS (this amount can be adjusted for optimization of results, although 1.2 is preferred) and 12.5 to 75 µg membrane protein (e.g. COS-7 cells expressing the receptor; this amount can be adjusted for optimization, although 75µg is preferred) and 1 µM GDP (this amount can be changed for optimization) for 1 hour. Wheatgerm agglutinin beads (25 µl; Amersham) were then added and the mixture was incubated for another 30 minutes at room temperature. The tubes were then centrifuged at 1500 x g for 5 minutes at room temperature and then counted in a scintillation counter.

[0122] A less costly but equally applicable alternative has been identified which also meets the needs of large scale screening. Flash plates™ and Wallac™ scintistrips may be utilized to format a high throughput [³⁵S]GTPγS binding assay. Furthermore, using this technique, the assay can be utilized for known GPCRs to monitor tritiated ligand binding to the receptor at the same time as monitoring the efficacy via [³⁵S]GTPγS binding. This is possible because the Wallac beta counter can switch energy windows to look at both tritium and ³⁵S-labeled probes. This assay may also be used to detect other types of membrane activation events resulting in receptor activation. For example, the assay may be used to monitor ³²P phosphorylation of a variety of receptors (both G protein coupled and tyrosine kinase receptors). When the membranes are centrifuged to the bottom of the well, the bound [³⁵S]GTPγS or the ³²P-phosphorylated receptor will activate the scintillant which is coated on the wells. Scinti® strips (Wallac) have been used to demonstrate this principle. In addition, the assay also has utility for measuring ligand binding to receptors using radioactively labeled ligands. In a similar manner, when the radiolabeled bound ligand is centrifuged to the bottom of the well, the scintistrip label comes into proximity with the radiolabeled ligand resulting in activation and detection.

[0123] Representative results comparing Control (pCMV), Endogenous APJ and Non-Endogenous APJ, based upon the foregoing protocol, are set forth graphically in Figure 6.

2. Adenylyl Cyclase

[0124] A Flash Plate™ Adenylyl Cyclase kit (New England Nuclear; Cat. No. SMP004A) designed for cell-based assays was modified for use with crude plasma membranes. The Flash Plate wells contain a scintillant coating which also contains a specific antibody recognizing cAMP. The cAMP generated in the wells was quantitated by a direct competition for binding of radioactive cAMP tracer to the cAMP antibody. The following serves as a brief protocol for the measurement of changes in cAMP levels in membranes that express the receptors.

[0125] Transfected cells were harvested approximately three days after transfection. Membranes were prepared by homogenization of suspended cells in buffer containing 20mM HEPES, pH 7.4 and 10mM MgCl₂. Homogenization was performed on ice using a Brinkmann Polytron™ for approximately 10 seconds. The resulting homogenate was centrifuged at 49,000 X g for 15 minutes at 4°C. The resulting pellet was then resuspended in buffer containing 20mM HEPES, pH 7.4 and 0.1 mM EDTA, homogenized for 10 seconds, followed by centrifugation at 49,000 X g for 15 minutes at 4°C. The resulting pellet can be stored at -80°C until utilized. On the day of measurement, the membrane pellet was slowly thawed at room temperature, resuspended in buffer containing 20mM HEPES, pH 7.4 and 10mM MgCl₂ (these amounts can be optimized, although the values listed herein are preferred), to yield a final protein concentration of 0.60mg/ml (the resuspended membranes were placed on ice until use).

[0126] cAMP standards and Detection Buffer (comprising 2 µCi of tracer [¹²⁵I] cAMP (100 µl) to 11 ml Detection Buffer) were prepared and maintained in accordance with the manufacturer's instructions. Assay Buffer was prepared fresh for screening and contained 20mM HEPES, pH 7.4, 10mM MgCl₂, 20mM (Sigma), 0.1 units/ml creatine phosphokinase (Sigma), 50 µM GTP (Sigma), and 0.2 mM ATP (Sigma); Assay Buffer can be stored on ice until utilized. The assay was initiated by addition of 50 µl of assay buffer followed by addition of 50 µl of membrane suspension to the NEN Flash Plate. The resultant assay mixture was incubated for 60 minutes at room temperature followed by addition of 100 µl of detection buffer. Plates were then incubated an additional 2-4 hours followed by counting in a Wallac MicroBeta scintillation counter. Values of cAMP/well were extrapolated from a standard cAMP curve which is contained within each assay plate. The foregoing assay was utilized with respect to analysis of MIG.
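
The conversion of counts to cAMP/well by way of a standard curve can be sketched in Python as follows. This is an illustrative sketch only: the standard values are invented for the example, the function name is hypothetical, and the choice of interpolating linearly in log10(cAMP) between neighbouring standards is an assumption (a kit manufacturer may recommend a different curve fit). Because the tracer competes with unlabeled cAMP for the antibody, counts fall as cAMP rises.

```python
import math

# Hypothetical standards: (pmol cAMP per well, counts per minute).
# These numbers are invented for illustration and are not from the specification.
STANDARDS = [(1.0, 9000.0), (10.0, 5000.0), (100.0, 1000.0)]


def camp_from_counts(cpm: float, standards=STANDARDS) -> float:
    """Estimate cAMP (pmol/well) from counts by interpolating linearly in
    log10(cAMP) between the two standards that bracket the measured counts."""
    pts = sorted(standards, key=lambda s: s[1])  # ascending counts
    if not pts[0][1] <= cpm <= pts[-1][1]:
        raise ValueError("counts fall outside the range of the standard curve")
    for (c_lo, cpm_lo), (c_hi, cpm_hi) in zip(pts, pts[1:]):
        if cpm_lo <= cpm <= cpm_hi:
            frac = (cpm - cpm_lo) / (cpm_hi - cpm_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("no bracketing standards found")


print(camp_from_counts(3000.0))
```

The sketch refuses to return a value outside the range of the standards, so a well that reads beyond the curve would need to be re-assayed at a different membrane amount rather than estimated.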

B. Reporter-Based Assays

1. CREB Reporter Assay (Gs-associated receptors)

[0127] A method to detect Gs stimulation depends on the known property of the transcription factor CREB, which is activated in a cAMP-dependent manner. A PathDetect CREB trans-Reporting System (Stratagene, Catalogue # 219010) was utilized to assay for Gs coupled activity in 293 or 293T cells. Cells were transfected with the plasmid components of the above system and the indicated expression plasmid encoding endogenous or mutant receptor using a Mammalian Transfection Kit (Stratagene, Catalogue #200285) according to the manufacturer's instructions. Briefly, 400 ng pFR-Luc (luciferase reporter plasmid containing Gal4 recognition sequences), 40 ng pFA2-CREB (Gal4-CREB fusion protein containing the Gal4 DNA-binding domain), 80 ng CMV-receptor expression plasmid (comprising the receptor) and 20 ng CMV-SEAP (secreted alkaline phosphatase expression plasmid; alkaline phosphatase activity is measured in the media of transfected cells to control for variations in transfection efficiency between samples) were combined in a calcium phosphate precipitate as per the Kit's instructions. Half of the precipitate was equally distributed over 3 wells in a 96-well plate, kept on the cells overnight, and replaced with fresh medium the following morning. Forty-eight (48) hr after the start of the transfection, cells were treated and assayed for luciferase activity as set forth with respect to the GPR30 system, above. This assay was used with respect to GHSR.
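
The use of the CMV-SEAP plasmid as a control for transfection efficiency can be illustrated with the following Python sketch. The specification states that alkaline phosphatase activity is measured for this purpose but does not say how the correction is applied; dividing each well's luciferase signal by its SEAP activity is therefore an assumption of this sketch, and the well readings and function names are hypothetical.

```python
from statistics import mean


def normalized_luciferase(luciferase: float, seap: float) -> float:
    """Luciferase signal per unit of SEAP activity (assumed correction)."""
    if seap <= 0:
        raise ValueError("SEAP activity must be positive")
    return luciferase / seap


def mean_normalized(wells) -> float:
    """Average the normalized signal over replicate wells.

    `wells` is an iterable of (luciferase, seap) pairs, one per well.
    """
    return mean(normalized_luciferase(luc, seap) for luc, seap in wells)


# Hypothetical triplicate wells: raw luciferase differs, SEAP differs in step.
triplicate = [(1200.0, 1.0), (1500.0, 1.25), (900.0, 0.75)]
print(mean_normalized(triplicate))
```

In this invented triplicate the raw luciferase readings differ by well, yet the normalized values agree, which is the behaviour one would hope for when the variation is due to transfection efficiency alone.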

2. AP1 reporter assay (Gq-associated receptors)

[0128] A method to detect Gq stimulation depends on the known property of Gq-dependent phospholipase C to cause the activation of genes containing AP1 elements in their promoter. A PathDetect AP-1 cis-Reporting System (Stratagene, Catalogue # 219073) was utilized following the protocol set forth above with respect to the CREB reporter assay, except that the components of the calcium phosphate precipitate were 410 ng pAP1-Luc, 80 ng receptor expression plasmid, and 20 ng CMV-SEAP. This assay was used with respect to ETBR-LP2.

C. Intracellular IP3 Accumulation Assay

[0129] On day 1, cells comprising the serotonin receptors (endogenous and mutated) were plated onto 24 well plates, usually 1 × 10⁵ cells/well. On day 2 cells were transfected by first mixing 0.25 µg DNA in 50 µl serum-free DMEM/well and 2 µl lipofectamine in 50 µl serum-free DMEM/well. The solutions were gently mixed and incubated for 15-30 min at room temperature. Cells were washed with 0.5 ml PBS and 400 µl of serum-free media was mixed with the transfection media and added to the cells. The cells were then incubated for 3-4 hrs at 37°C/5% CO₂ and then the transfection media was removed and replaced with 1 ml/well of regular growth media. On day 3 the cells were labeled with ³H-myo-inositol. Briefly, the media was removed and the cells were washed with 0.5 ml PBS. Then 0.5 ml inositol-free/serum-free media (GIBCO BRL) was added/well with 0.25 µCi of ³H-myo-inositol/well and the cells were incubated for 16-18 hrs overnight at 37°C/5% CO₂. On day 4 the cells were washed with 0.5 ml PBS and 0.45 ml of assay medium was added containing inositol-free/serum-free media, 10 µM pargyline and 10 mM lithium chloride, or 0.4 ml of assay medium and 50 µl of 10x ketanserin (ket) to a final concentration of 10 µM. The cells were then incubated for 30 min at 37°C. The cells were then washed with 0.5 ml PBS and 200 µl of fresh, ice-cold stop solution (1M KOH; 18 mM Na-borate; 3.8 mM EDTA) was added/well. The solution was kept on ice for 5-10 min or until cells were lysed and then neutralized by 200 µl of fresh, ice-cold neutralization solution (7.5% HCl). The lysate was then transferred into 1.5 ml eppendorf tubes and 1 ml of chloroform/methanol (1:2) was added/tube. The solution was vortexed for 15 sec and the upper phase was applied to a Biorad AG1-X8 anion exchange resin (100-200 mesh). First, the resin was washed with water at 1:1.25 W/V and 0.9 ml of upper phase was loaded onto the column. The column was washed with 10 ml of 5 mM myo-inositol and 10 ml of 5 mM Na-borate/60mM Na-formate.
The inositol trisphosphates were eluted into scintillation vials containing 10 ml of scintillation cocktail with 2 ml of 0.1 M formic acid/1 M ammonium formate. The columns were regenerated by washing with 10 ml of 0.1 M formic acid/3M ammonium formate, rinsed twice with dd H₂O and stored at 4°C in water.

[0130] Figure 7 provides an illustration of IP3 production from the human 5-HT_2A receptor that incorporates the C322K mutation. While these results evidence that the Proline Mutation Algorithm approach constitutively activates this receptor, for purposes of using such a receptor for screening for identification of potential therapeutics, a more robust difference would be preferred. However, because the activated receptor can be utilized for understanding and elucidating the role of constitutive activation and for the identification of compounds that can be further examined, we believe that this difference is itself useful in differentiating between the endogenous and non-endogenous versions of the human 5HT_2A receptor.

D. Result Summary

[0131] The results for the GPCRs tested are set forth in Table E, where the Per-Cent Difference indicates the percentage difference in results observed for the non-endogenous GPCR as compared to the endogenous GPCR; these values are followed by parenthetical indications as to the type of assay utilized. Additionally, the assay system utilized is parenthetically listed (and, in cases where different Host Cells were used, both are listed). As these results indicate, a variety of assays can be utilized to determine constitutive activity of the non-endogenous versions of the human GPCRs. Those skilled in the art, based upon the foregoing and with reference to information available to the art, are credited with the ability to select and/or maximize a particular assay approach that suits the particular needs of the investigator.

Table E

Receptor Identifier (Codon Mutation)	Per-Cent Difference
GPR17 (V234K)	74.5 (CRE-Luc)
GPR30 (L258K)	71.6 (CREB)
APJ (L247K)	49.0 (GTPγS)
ETBR-LP2 (N358K)	48.4 (AP1-Luc - 293); 61.1 (AP1-Luc - 293T)
GHSR (V262K)	58.9 (CREB - 293); 35.6 (CREB - 293T)
MIG (I230K)	39 (cAMP)
Serotonin 5HT_2A (C322K)	33.2 (IP₃)
Serotonin 5HT_2C (S310K)	39.1 (IP₃)
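
By way of example only, a percentage difference of the kind tabulated above can be computed as in the following Python sketch. The specification does not give the formula; expressing the change relative to the endogenous receptor is an assumption of this sketch, and the function name is hypothetical. The input values are the relative light units reported above for the GPR6/GPR30 co-transfection shown in Figure 4.

```python
def percent_difference(endogenous: float, non_endogenous: float) -> float:
    """Absolute change of the non-endogenous reading relative to the
    endogenous reading, expressed in per cent (assumed formula)."""
    if endogenous == 0:
        raise ValueError("endogenous reading must be non-zero")
    return abs(non_endogenous - endogenous) / abs(endogenous) * 100.0


# Relative light units reported for the GPR6/GPR30 co-transfection (Figure 4).
ENDOGENOUS_GPR30 = 104.1
NON_ENDOGENOUS_GPR30 = (18.2, 29.5)

for reading in NON_ENDOGENOUS_GPR30:
    print(round(percent_difference(ENDOGENOUS_GPR30, reading), 1))
```

With this formula the two non-endogenous readings correspond to differences of about 82.5 and 71.7 per cent; the latter is close to the 71.6 listed for GPR30 in Table E, although the specification does not say which reading or formula produced the tabulated figure.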

Example 6

Tissue Distribution of Endogenous Orphan GPCRs

[0132] Using a commercially available human-tissue dot-blot format, endogenous orphan GPCRs were probed for a determination of the areas where such receptors are localized. Except as indicated below, the entire receptor cDNA (radiolabeled) was used as the probe: radiolabeled probe was generated using the complete receptor cDNA (excised from the vector) using a Prime-It II™ Random Primer Labeling Kit (Stratagene, #300385), according to manufacturer's instructions. A human RNA Master Blot™ (Clontech, #7770-1) was hybridized with the GPCR radiolabeled probe and washed under stringent conditions according to manufacturer's instructions. The blot was exposed to Kodak BioMax Autoradiography film overnight at -80°C.

[0133] Representative dot-blot format results are presented in Figure 8 for GPR1 (8A), GPR30 (8B), and APJ (8C), with results being summarized for all receptors in Table F.

Table F

GPCR	Tissue Distribution (highest levels, relative to other tissues in the dot-blot)
GPR1	Placenta, Ovary, Adrenal
GPR4	Broad; highest in Heart, Lung, Adrenal, Thyroid, Spinal Cord
GPR5	Placenta, Thymus, Fetal Thymus; lesser levels in Spleen, Fetal Spleen
GPR7	Liver, Spleen, Spinal Cord, Placenta
GPR8	No expression detected
GPR9-6	Thymus, Fetal Thymus; lesser levels in Small Intestine
GPR18	Spleen, Lymph Node, Fetal Spleen, Testis
GPR20	Broad
GPR21	Broad; very low abundance
GPR22	Heart, Fetal Heart; lesser levels in Brain
GPR30	Stomach
GPR31	Broad
BLR1	Spleen
CEPR	Stomach, Liver, Thyroid, Putamen
EBI1	Pancreas; lesser levels in Lymphoid Tissues
EBI2	Lymphoid Tissues, Aorta, Lung, Spinal Cord
ETBR-LP2	Broad; Brain Tissue
GPCR-CNS	Brain; lesser levels in Testis, Placenta
GPR-NGA	Pituitary; lesser levels in Brain
H9	Pituitary
HB954	Aorta, Cerebellum; lesser levels in most other tissues
HM74	Spleen, Leukocytes, Bone Marrow, Mammary Glands, Lung, Trachea
MIG	Low levels in Kidney, Liver, Pancreas, Lung, Spleen
ORG1	Pituitary, Stomach, Placenta
V28	Brain, Spleen, Peripheral Leukocytes

[0134] Based upon the foregoing information, it is noted that human GPCRs can also be assessed for distribution in diseased tissue; comparative assessments between "normal" and diseased tissue can then be utilized to determine the potential for over-expression or under-expression of a particular receptor in a diseased state. In those circumstances where it is desirable to utilize the non-endogenous versions of the human GPCRs for the purpose of screening to directly identify candidate compounds of potential therapeutic relevance, it is noted that inverse agonists are useful in the treatment of diseases and disorders where a particular human GPCR is over-expressed, whereas agonists or partial agonists are useful in the treatment of diseases and disorders where a particular human GPCR is under-expressed.

[0135] As desired, more detailed cellular localization of the receptors, using techniques well-known to those in the art (e.g., in-situ hybridization), can be utilized to identify particular cells within these tissues where the receptor of interest is expressed.

[0136] As those skilled in the art will appreciate, numerous changes and modifications may be made to the preferred embodiments of the invention without departing from the scope of the claims.

[0137] Although a variety of expression vectors are available to those in the art, for purposes of utilization for both the endogenous and non-endogenous human GPCRs, it is most preferred that the vector utilized be pCMV.

SEQUENCE LISTING

[0138]

(1) GENERAL INFORMATION:

(i) APPLICANT: Behan, Dominic P. Chalmers, Derek T. Liaw, Chen W.

(ii) TITLE OF INVENTION: Non-Endogenous, Constitutively Activated Human G Protein-Coupled Orphan Receptors

(iii) NUMBER OF SEQUENCES: 280

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Arena Pharmaceuticals, Inc.
(B) STREET: 6166 Nancy Ridge Drive
(C) CITY: San Diego
(D) STATE: CA
(E) COUNTRY: USA
(F) ZIP: 92122

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: US
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Burgoon, Richard P.
(B) REGISTRATION NUMBER: 34,787

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (619)453-7200
(B) TELEFAX: (619)453-7210

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1068 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

(3) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 355 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

(4) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1089 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

(5) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 362 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

(6) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
TATGAATTCA GATGCTCTAA ACGTCCCTGC 30

(7) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
TCCGGATCCA CCTGCACCTG CGCCTGCACC 30

(8) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1002 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

(9) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 333 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

(10) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
GCAAGCTTGG GGGACGCCAG GTCGCCGGCT 30

(11) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
GCGGATCCGG ACGCTGGGGG AGTCAGGCTG C 31

(12) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 987 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

(13) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 328 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

(14) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
CGGAATTCGT CAACGGTCCC AGCTACAATG 30

(15) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
ATGGATCCCA GGCCCTTCAG CACCGCAATA T 31

(16) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1002 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

(17) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 333 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

(18) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 48 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
ACGAATTCAG CCATGGTCCT TGAGGTGAGT GACCACCAAG TGCTAAAT 48

(19) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
GAGGATCCTG GAATGCGGGG AAGTCAG 27

(20) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1107 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:

(21) INFORMATION FOR SEQ ID NO:20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 368 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

(22) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
TTAAGCTTGA CCTAATGCCA TCTTGTGTCC 30

(23) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
TTGGATCCAA AAGAACCATG CACCTCAGAG 30

(24) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1074 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

(25) INFORMATION FOR SEQ ID NO:24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 357 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:

(26) INFORMATION FOR SEQ ID NO:25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1110 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:

(27) INFORMATION FOR SEQ ID NO:26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 369 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:

(28) INFORMATION FOR SEQ ID NO:27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1083 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:

(29) INFORMATION FOR SEQ ID NO:28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 360 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:

(30) INFORMATION FOR SEQ ID NO:29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
CTAGAATTCT GACTCCAGCC AAAGCATGAA T 31

(31) INFORMATION FOR SEQ ID NO:30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
GCTGGATCCT AAACAGTCTG CGCTCGGCCT 30

(32) INFORMATION FOR SEQ ID NO:31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1020 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:

(33) INFORMATION FOR SEQ ID NO:32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 339 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:

(34) INFORMATION FOR SEQ ID NO:33:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
ATAAGATGAT CACCCTGAAC AATCAAGAT 29

(35) INFORMATION FOR SEQ ID NO:34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
TCCGAATTCA TAACATTTCA CTGTTTATAT TGC 33

(36) INFORMATION FOR SEQ ID NO:35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 996 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:

(37) INFORMATION FOR SEQ ID NO:36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 331 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:

(38) INFORMATION FOR SEQ ID NO:37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:
CCAAGCTTCC AGGCCTGGGG TGTGCTGG 28

(39) INFORMATION FOR SEQ ID NO:38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
ATGGATCCTG ACCTTCGGCC CCTGGCAGA 29

(40) INFORMATION FOR SEQ ID NO:39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1077 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:

(41) INFORMATION FOR SEQ ID NO:40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 358 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:

(42) INFORMATION FOR SEQ ID NO:41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:
GAGAATTCAC TCCTGAGCTC AAGATGAACT 30

(43) INFORMATION FOR SEQ ID NO:42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:
CGGGATCCCC GTAACTGAGC CACTTCAGAT 30

(44) INFORMATION FOR SEQ ID NO:43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1050 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:

(45) INFORMATION FOR SEQ ID NO:44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 349 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:

(46) INFORMATION FOR SEQ ID NO:45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:
TCCCCCGGGA AAAAAACCAA CTGCTCCAAA 30

(47) INFORMATION FOR SEQ ID NO:46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:
TAGGATCCAT TTGAATGTGG ATTTGGTGAA A 31

(48) INFORMATION FOR SEQ ID NO:47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1302 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:

(49) INFORMATION FOR SEQ ID NO:48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 433 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:

(50) INFORMATION FOR SEQ ID NO:49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:
GTGAAGCTTG CCTCTGGTGC CTGCAGGAGG 30

(51) INFORMATION FOR SEQ ID NO:50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:
GCAGAATTCC CGGTGGCGTG TTGTGGTGCC C 31

(52) INFORMATION FOR SEQ ID NO:51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1209 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:

(53) INFORMATION FOR SEQ ID NO:52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 402 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:

(54) INFORMATION FOR SEQ ID NO:53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:
GGCGGATCCA TGGATGTGAC TTCCCAA 27

(55) INFORMATION FOR SEQ ID NO:54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:
GGCGGATCCC TACACGGCAC TGCTGAA 27

(56) INFORMATION FOR SEQ ID NO:55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1128 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:

(57) INFORMATION FOR SEQ ID NO:56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 375 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:

(58) INFORMATION FOR SEQ ID NO:57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:
AAGGAATTCA CGGCCGGGTG ATGCCATTCC C 31

(59) INFORMATION FOR SEQ ID NO:58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:
GGTGGATCCA TAAACAGGG CGTTGAGGAC 30

(60) INFORMATION FOR SEQ ID NO:59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 960 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:

(61) INFORMATION FOR SEQ ID NO:60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 319 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:60:

(62) INFORMATION FOR SEQ ID NO:61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1143 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:61:

(63) INFORMATION FOR SEQ ID NO:62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 380 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:62:

(64) INFORMATION FOR SEQ ID NO:63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:63:
TGAGAATTCT GGTGACTCAC AGCCGGCACA G 31

(65) INFORMATION FOR SEQ ID NO:64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:64:
GCCGGATCCA AGGAAAAGCA GCAATAAAAG G 31

(66) INFORMATION FOR SEQ ID NO:65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1119 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:65:

(67) INFORMATION FOR SEQ ID NO:66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 372 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:

(68) INFORMATION FOR SEQ ID NO:67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:67:
CAAAGCTTGA AAGCTGCACG GTGCAGAGAC 30

(69) INFORMATION FOR SEQ ID NO:68:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:68:
GCGGATCCCG AGTCACACCC TGGCTGGGCC 30

(70) INFORMATION FOR SEQ ID NO:69:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1128 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:69:

(71) INFORMATION FOR SEQ ID NO:70:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 375 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:70:

(72) INFORMATION FOR SEQ ID NO:71:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:71:
ACAGAATTCC TGTGTGGTTT TACCGCCCAG 30

(73) INFORMATION FOR SEQ ID NO:72:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:
CTCGGATCCA CGCTGAAGAG TCGCCTATGG 30

(74) INFORMATION FOR SEQ ID NO:73:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1137 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73:

(75) INFORMATION FOR SEQ ID NO:74:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 378 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:

(76) INFORMATION FOR SEQ ID NO:75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:75:
CTGGAATTCA CCTGGACCAC CACCAATGGA TA 32

(77) INFORMATION FOR SEQ ID NO:76:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:
CTCGGATCCT GCAAAGTTTG TCATACAGTT 30

(78) INFORMATION FOR SEQ ID NO:77:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1085 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:

(79) INFORMATION FOR SEQ ID NO:78:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 361 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:

(80) INFORMATION FOR SEQ ID NO:79:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:79:
CTGGAATTCT CCTGCTCATC CAGCCATGCG G 31

(81) INFORMATION FOR SEQ ID NO:80:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:80:
CCTGGATCCC CACCCCTACT GGGGCCTCAG 30

(82) INFORMATION FOR SEQ ID NO:81:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1446 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:81:

(83) INFORMATION FOR SEQ ID NO:82:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 481 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:82:

(84) INFORMATION FOR SEQ ID NO:83:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83:
ATGTGGAACG CGACGCCCAG CG 22

(85) INFORMATION FOR SEQ ID NO:84:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:84:
TCATGTATTA ATACTAGATT CT 22

(86) INFORMATION FOR SEQ ID NO:85:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 38 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:85:
TACCATGTGG AACGCGACGC CCAGCGAAGA GCCGGGGT 38

(87) INFORMATION FOR SEQ ID NO:86:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:86:
CGGAATTCAT GTATTAATAC TAGATTCTGT CCAGGCCCG 39

(88) INFORMATION FOR SEQ ID NO:87:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1101 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:87:

(89) INFORMATION FOR SEQ ID NO:88:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 366 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:88:

(90) INFORMATION FOR SEQ ID NO:89:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:89:
GCAAGCTTGT GCCCTCACCA AGCCATGCGA GCC 33

(91) INFORMATION FOR SEQ ID NO:90:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:90:
CGGAATTCAG CAATGAGTTC CGACAGAAGC 30

(92) INFORMATION FOR SEQ ID NO:91:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1842 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:91:

(93) INFORMATION FOR SEQ ID NO:92:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 613 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:92:

(94) INFORMATION FOR SEQ ID NO:93:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:93:
CAGAATTCAG AGAAAAAAAG TGAATATGGT TTTT 34

(95) INFORMATION FOR SEQ ID NO:94:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:94:
TTGGATCCCT GGTGCATAAC AATTGAAAGA AT 32

(96) INFORMATION FOR SEQ ID NO:95:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1248 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:95:

(97) INFORMATION FOR SEQ ID NO:96:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 415 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:96:

(98) INFORMATION FOR SEQ ID NO:97:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:97:
GGAAAGCTTA ACGATCCCCA GGAGCAACAT 30

(99) INFORMATION FOR SEQ ID NO:98:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:98:
CTGGGATCCT ACGAGAGCAT TTTTCACACA G 31

(100) INFORMATION FOR SEQ ID NO:99:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1842 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:99:

(101) INFORMATION FOR SEQ ID NO:100:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 613 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:100:

(102) INFORMATION FOR SEQ ID NO:101:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:101:
TCCAAGCTTC GCCATGGGAC ATAACGGGAG CT 32

(103) INFORMATION FOR SEQ ID NO:102:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:102:
CGTGAATTCC AAGAATTTAC AATCCTTGCT 30

(104) INFORMATION FOR SEQ ID NO:103:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1548 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:103:

(105) INFORMATION FOR SEQ ID NO:104:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 515 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:104:

(106) INFORMATION FOR SEQ ID NO:105:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:105:
GGAGAATTCA CTAGGCGAGG CGCTCCATC 29

(107) INFORMATION FOR SEQ ID NO:106:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:106:
GGAGGATCCA GGAAACCTTA GGCCGAGTCC 30

(108) INFORMATION FOR SEQ ID NO:107:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1164 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:107:

(109) INFORMATION FOR SEQ ID NO:108:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 387 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:108:

(110) INFORMATION FOR SEQ ID NO:109:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:109:
ACCATGGCTT GCAATGGCAG TGCGGCCAGG GGGCACT 37

(111) INFORMATION FOR SEQ ID NO:110:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:110:
CGACCAGGAC AAACAGCATC TTGGTCACTT GTCTCCGGC 39

(112) INFORMATION FOR SEQ ID NO:111:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:111:
GACCAAGATG CTGTTTGTCC TGGTCGTGGT GTTTGGCAT 39

(113) INFORMATION FOR SEQ ID NO:112:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:112:
CGGAATTCAG GATGGATCGG TCTCTTGCTG CGCCT 35

(114) INFORMATION FOR SEQ ID NO:113:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1212 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:113:

(115) INFORMATION FOR SEQ ID NO:114:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 403 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:114:

(116) INFORMATION FOR SEQ ID NO:115:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:115:
GGAAGCTTCA GGCCCAAAGA TGGGGAACAT 30

(117) INFORMATION FOR SEQ ID NO:116:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:116:
GTGGATCCAC CCGCGGAGGA CCCAGGCTAG 30

(118) INFORMATION FOR SEQ ID NO:117:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1098 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:117:

(119) INFORMATION FOR SEQ ID NO:118:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 365 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:118:

(120) INFORMATION FOR SEQ ID NO:119:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:119:
GACCTCGAGT CCTTCTACAC CTCATC 26

(121) INFORMATION FOR SEQ ID NO:120:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:120:
TGCTCTAGAT TCCAGATAGG TGAAAACTTG 30

(122) INFORMATION FOR SEQ ID NO:121:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1416 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:121:

(123) INFORMATION FOR SEQ ID NO:122:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 471 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:122:

(124) INFORMATION FOR SEQ ID NO:123:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:123:
GACCTCGAGG TTGCTTAAGA CTGAAGC 27

(125) INFORMATION FOR SEQ ID NO:124:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:124:
ATTTCTAGAC ATATGTAGCT TGTACCG 27

(126) INFORMATION FOR SEQ ID NO:125:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1377 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:125:

(127) INFORMATION FOR SEQ ID NO:126:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 458 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:126:

(128) INFORMATION FOR SEQ ID NO:127:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:127:
GGTAAGCTTG GCAGTCCACG CCAGGCCTTC 30

(129) INFORMATION FOR SEQ ID NO:128:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:128:
TCCGAATTCT CTGTAGACAC AAGGCTTTGG 30

(130) INFORMATION FOR SEQ ID NO:129:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1068 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:129:

(131) INFORMATION FOR SEQ ID NO:130:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 355 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:130:

(132) INFORMATION FOR SEQ ID NO:131:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:131:
GATCTCCAGT AGGCATAAGT GGACAATTCT GG 32

(133) INFORMATION FOR SEQ ID NO:132:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:132:
CTCCTTCGGT CCTCCTATCG TTGTCAGAAG 30

(134) INFORMATION FOR SEQ ID NO:133:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:133:
AGAAGGCCAA GATCGCGCGG CTGGCCCTCA 30

(135) INFORMATION FOR SEQ ID NO:134:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:134:
CGGCGCCACC GCACGAAAAA GCTCATCTTC 30

(136) INFORMATION FOR SEQ ID NO:135:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:135:
GCCAAGAAGC GGGTGAAGTT CCTGGTGGTG GCA 33

(137) INFORMATION FOR SEQ ID NO:136:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:136:
CAGGCGGAAG GTGAAAGTCC TGGTCCTCGT 30

(138) INFORMATION FOR SEQ ID NO:137:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:137:
CGGCGCCTGC GGGCCAAGCG GCTGGTGGTG GTG 33

(139) INFORMATION FOR SEQ ID NO:138:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:138:
CCAAGCACAA AGCCAAGAAA GTGACCATCA C 31

(140) INFORMATION FOR SEQ ID NO:139:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:139:
GCGCCGGCGC ACCAAATGCT TGCTGGTGGT 30

(141) INFORMATION FOR SEQ ID NO:140:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:140:
CAAAAAGCTG AAGAAATCTA AGAAGATCAT CTTTATTGTC G 41

(142) INFORMATION FOR SEQ ID NO:141:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:141:
CAAGACCAAG GCAAAACGCA TGATCGCCAT 30

(143) INFORMATION FOR SEQ ID NO:142:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:142:
GTCAAGGAGA AGTCCAAAAG GATCATCATC 30

(144) INFORMATION FOR SEQ ID NO:143:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:143:
CGCCGCGTGC GGGCCAAGCA GCTCCTGCTC 30

(145) INFORMATION FOR SEQ ID NO:144:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:144:
CCTGATAAGC GCTATAAAAT GGTCCTGTTT CGA 33

(146) INFORMATION FOR SEQ ID NO:145:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:145:
GAAAGACAAA AGAGAGTCAA GAGGATGTCT TTATTG 36

(147) INFORMATION FOR SEQ ID NO:146:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:146:
CGGAGAAAGA GGGTGAAACG CACAGCCATC GCC 33

(148) INFORMATION FOR SEQ ID NO:147:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:147:
AAGCTTCAGC GGGCCAAGGC ACTGGTCACC 30

(149) INFORMATION FOR SEQ ID NO:148:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:148:
CAGCGGCAGA AGGCAAAAAG GGTGGCCATC 30

(150) INFORMATION FOR SEQ ID NO:149:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:149:
CGGCAGAAGG CGAAGCGCAT GATCCTCGCG 30

(151) INFORMATION FOR SEQ ID NO:150:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:150:
GAGCGCAACA AGGCCAAAAA GGTGATCATC 30

(152) INFORMATION FOR SEQ ID NO:151:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:151:
GGTGTAAACA AAAAGGCTAA AAACACAATT ATTCTTATT 39

(153) INFORMATION FOR SEQ ID NO:152:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:152:
GAGAGCCAGC TCAAGAGCAC CGTGGTG 27

(154) INFORMATION FOR SEQ ID NO:153:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:153:
CCACAAGCAA ACCAAGAAAA TGCTGGCTGT 30

(155) INFORMATION FOR SEQ ID NO:154:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:154:
CATCAAGTGT ATCATGTGCC AAGTACGCCC 30

(156) INFORMATION FOR SEQ ID NO:155:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:155:
CTAGAGAGTC AGATGAAGTG TACAGTAGTG GCAC 34

(157) INFORMATION FOR SEQ ID NO:156:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:156:
CGGACAAAAG TGAAAACTAA AAAGATGTTC CTCATT 36

(158) INFORMATION FOR SEQ ID NO:157:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:157:
GCTGAGGTTC GCAATAAACT AACCATGTTT GTG 33

(159) INFORMATION FOR SEQ ID NO:158:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:158:
GGGAGGCCGA GCTGAAAGCC ACCCTGCTC 29

(160) INFORMATION FOR SEQ ID NO:159:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:159:
CAAGATCAAG AGAGCCAAAA CCTTCATCAT G 31

(161) INFORMATION FOR SEQ ID NO:160:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:160:
CCGGAGACAA GTGAAGAAGA TGCTGTTTGT C 31

(162) INFORMATION FOR SEQ ID NO:161:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:161:
GCAAGGACCA GATCAAGCGG CTGGTGCTCA 30

(163) INFORMATION FOR SEQ ID NO:162:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:162:
CAAGAAAGCC AAAGCCAAGA AACTGATCCT TCTG 34

(164) INFORMATION FOR SEQ ID NO:163:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1068 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:163:

(165) INFORMATION FOR SEQ ID NO:164:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 355 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:164:

(166) INFORMATION FOR SEQ ID NO:165:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1089 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:165:

(167) INFORMATION FOR SEQ ID NO:166:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 362 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:166:
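
Note: the stated lengths of a nucleic acid record and the protein record that follows it can be compared arithmetically. The sketch below assumes, and this is an assumption not stated in the records themselves, that each protein record is the translation of the preceding nucleic acid record and that the nucleic acid length includes a single stop codon. The function name expected_protein_length is hypothetical.

```python
def expected_protein_length(n_bases):
    """Amino acid count for a coding sequence of n_bases that ends
    in one stop codon (assumption: no untranslated flanking bases)."""
    if n_bases % 3 != 0:
        raise ValueError("length is not a whole number of codons")
    return n_bases // 3 - 1


# Stated lengths taken from SEQ ID NO:163/164 and SEQ ID NO:165/166.
pairs = [(1068, 355), (1089, 362)]

# One boolean per pair: does the protein length follow from the base count?
consistent = [expected_protein_length(bases) == residues for bases, residues in pairs]
```

A pair that fails this check does not necessarily indicate an error; it may simply not satisfy the assumptions above.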

(168) INFORMATION FOR SEQ ID NO:167:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1002 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:167:

(169) INFORMATION FOR SEQ ID NO:168:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 333 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:168:

(170) INFORMATION FOR SEQ ID NO:169:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 987 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:169:

(171) INFORMATION FOR SEQ ID NO:170:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 328 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:170:

(172) INFORMATION FOR SEQ ID NO:171:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1002 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:171:

(173) INFORMATION FOR SEQ ID NO:172:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 333 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:172:

(174) INFORMATION FOR SEQ ID NO:173:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1107 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:173:

(175) INFORMATION FOR SEQ ID NO:174:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 368 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:174:

(176) INFORMATION FOR SEQ ID NO:175:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1074 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:175:

(177) INFORMATION FOR SEQ ID NO:176:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 357 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:176:

(178) INFORMATION FOR SEQ ID NO:177:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1110 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:177:

(179) INFORMATION FOR SEQ ID NO:178:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 369 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:178:

(180) INFORMATION FOR SEQ ID NO:179:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1083 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:179:

(181) INFORMATION FOR SEQ ID NO:180:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 360 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:180:

(182) INFORMATION FOR SEQ ID NO:181:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1020 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:181:

(183) INFORMATION FOR SEQ ID NO:182:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 339 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:182:

(184) INFORMATION FOR SEQ ID NO:183:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 996 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:183:

(185) INFORMATION FOR SEQ ID NO:184:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 331 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:184:

(186) INFORMATION FOR SEQ ID NO:185:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1077 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:185:

(187) INFORMATION FOR SEQ ID NO:186:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 358 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:186:

(188) INFORMATION FOR SEQ ID NO:187:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1050 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:187:

(189) INFORMATION FOR SEQ ID NO:188:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 349 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:188:

(190) INFORMATION FOR SEQ ID NO:189:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1302 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:189:

(191) INFORMATION FOR SEQ ID NO:190:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 433 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:190:

(192) INFORMATION FOR SEQ ID NO:191:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1209 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:191:

(193) INFORMATION FOR SEQ ID NO:192:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 402 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:192:

(194) INFORMATION FOR SEQ ID NO:193:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1128 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:193:

(195) INFORMATION FOR SEQ ID NO:194:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 375 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:194:

(196) INFORMATION FOR SEQ ID NO:195:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 960 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:195:

(197) INFORMATION FOR SEQ ID NO:196:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 319 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:196:

(198) INFORMATION FOR SEQ ID NO:197:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1143 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:197:

(199) INFORMATION FOR SEQ ID NO:198:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 380 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:198:

(200) INFORMATION FOR SEQ ID NO:199:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1119 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:199:

(201) INFORMATION FOR SEQ ID NO:200:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 372 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:200:

(202) INFORMATION FOR SEQ ID NO:201:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1128 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:201:

(203) INFORMATION FOR SEQ ID NO:202:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 375 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:202:

(204) INFORMATION FOR SEQ ID NO:203:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1137 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:203:

(205) INFORMATION FOR SEQ ID NO:204:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 378 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:204:

(206) INFORMATION FOR SEQ ID NO:205:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1086 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:205:

(207) INFORMATION FOR SEQ ID NO:206:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 361 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:206:

(208) INFORMATION FOR SEQ ID NO:207:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1446 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:207:

(209) INFORMATION FOR SEQ ID NO:208:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 481 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:208:

(210) INFORMATION FOR SEQ ID NO:209:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1101 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:209:

(211) INFORMATION FOR SEQ ID NO:210:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 366 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:210:

(212) INFORMATION FOR SEQ ID NO:211:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1842 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:211:

(213) INFORMATION FOR SEQ ID NO:212:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 613 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:212:

(214) INFORMATION FOR SEQ ID NO:213:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1248 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:213:

(215) INFORMATION FOR SEQ ID NO:214:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 415 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:214:

(216) INFORMATION FOR SEQ ID NO:215:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1842 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:215:

(217) INFORMATION FOR SEQ ID NO:216:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 613 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:216:

(218) INFORMATION FOR SEQ ID NO:217:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1854 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:217:

(219) INFORMATION FOR SEQ ID NO:218:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 617 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:218:

(220) INFORMATION FOR SEQ ID NO:219:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1548 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:219:

(221) INFORMATION FOR SEQ ID NO:220:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 515 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:220:

(222) INFORMATION FOR SEQ ID NO:221:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1164 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:221:

(223) INFORMATION FOR SEQ ID NO:222:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 387 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:222:

(224) INFORMATION FOR SEQ ID NO:223:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1212 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:223:

(225) INFORMATION FOR SEQ ID NO:224:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 403 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:224:

(226) INFORMATION FOR SEQ ID NO:225:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1098 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:225:

(227) INFORMATION FOR SEQ ID NO:226:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 365 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:226:

(228) INFORMATION FOR SEQ ID NO:227:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1416 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:227:

(229) INFORMATION FOR SEQ ID NO:228:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 470 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:228:

(230) INFORMATION FOR SEQ ID NO:229:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1377 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:229:

(231) INFORMATION FOR SEQ ID NO:230:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 458 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:230:

(232) INFORMATION FOR SEQ ID NO:231:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1068 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:231:

(233) INFORMATION FOR SEQ ID NO:232:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 355 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:232:

(234) INFORMATION FOR SEQ ID NO:233:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:233:
GGCTTAAGAG CATCATCGTG GTGCTGGTG 29

(235) INFORMATION FOR SEQ ID NO:234:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:234:
GTCACCACCA GCACCACGAT GATGCTCTTA AGCC 34

(236) INFORMATION FOR SEQ ID NO:235:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:235:
CAAAGAAAGT ACTGGGCATC GTCTTCTTCC T 31

(237) INFORMATION FOR SEQ ID NO:236:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:236:
TGCTCTAGAT TCCAGATAGG TGAAAACTTG 30

(238) INFORMATION FOR SEQ ID NO:237:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:237:
CTAGGGGCAC CATGCAGGCT ATCAACAATG AAAGAAAAGC TAAGAAAGTC 50

(239) INFORMATION FOR SEQ ID NO:238:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:238:
CAAGGACTTT CTTAGCTTTT CTTTCATTGT TGATAGCCTG CATGGTGCCC 50

(240) INFORMATION FOR SEQ ID NO:239:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:239:
CGGCGGCAGA AGGCGAAACG CATGATCCTC GCGGT 35

(241) INFORMATION FOR SEQ ID NO:240:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:240:
ACCGCGAGGA TCATGCGTTT CGCCTTCTGC CGCCG 35
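
Note: SEQ ID NO:239 and SEQ ID NO:240 as printed above are reverse complements of each other, which is what one would expect of a sense/antisense oligonucleotide pair. The following is a minimal sketch of that check; the function name reverse_complement is illustrative, and only the unambiguous bases A, C, G and T are handled.

```python
# Translation table mapping each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


# Sequences copied from the listing with the grouping spaces removed.
SEQ_239 = "CGGCGGCAGAAGGCGAAACGCATGATCCTCGCGGT"
SEQ_240 = "ACCGCGAGGATCATGCGTTTCGCCTTCTGCCGCCG"

# True when SEQ ID NO:240 is the exact antisense strand of SEQ ID NO:239.
is_antisense_pair = reverse_complement(SEQ_240) == SEQ_239
```

For pairs of unequal length, such as SEQ ID NO:233 and SEQ ID NO:234, the shorter sequence can instead be tested as a substring of the reverse complement of the longer one.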

(242) INFORMATION FOR SEQ ID NO:241:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:241:
GAGACATATT ATCTGCCACG GAGG 24

(243) INFORMATION FOR SEQ ID NO:242:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:242:
TTGGCATAGA AACCGGACCC AAGG 24

(244) INFORMATION FOR SEQ ID NO:243:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:243:
TAAGAATTCC ATAAAAATTA TGGAATGG 28

(245) INFORMATION FOR SEQ ID NO:244:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:244:
CCAGGATCCA GCTGAAGTCT TCCATCATTC 30

(246) INFORMATION FOR SEQ ID NO:245:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1071 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:245:

(247) INFORMATION FOR SEQ ID NO:246:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 356 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:246:

(248) INFORMATION FOR SEQ ID NO:247:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:247:
GCAGAATTCG GCGGCCCCAT GGACCTGCCC CC 32

(249) INFORMATION FOR SEQ ID NO:248:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:248:
GCTGGATCCC CCGAGCAGTG GCGTTACTTC 30

(250) INFORMATION FOR SEQ ID NO:249:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 903 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:249:

(251) INFORMATION FOR SEQ ID NO:250:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 300 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:250:

(252) INFORMATION FOR SEQ ID NO:251:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:251:
CTCAAGCTTA CTCTCTCTCA CCAGTGGCCA C 31

(253) INFORMATION FOR SEQ ID NO:252:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:252:
CCCTCCTCCC CCGGAGGACC TAGC 24

(254) INFORMATION FOR SEQ ID NO:253:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1041 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:253:

(255) INFORMATION FOR SEQ ID NO:254:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 346 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:254:

(256) INFORMATION FOR SEQ ID NO:255:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:255:
TTTAAGCTTC CCCTCCAGGA TGCTGCCGGA C 31

(257) INFORMATION FOR SEQ ID NO:256:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:256:
GGCGAATTCT GAAGGTCCAG GGAAACTGCT A 31

(258) INFORMATION FOR SEQ ID NO:257:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 993 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:257:

(259) INFORMATION FOR SEQ ID NO:258:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 362 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:258:

(260) INFORMATION FOR SEQ ID NO:259:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:259:
CCCAAGCTTC GGGCACCATG GACACCTCCC 30

(261) INFORMATION FOR SEQ ID NO:260:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:260:
ACAGGATCCA AATGCACAGC ACTGGTAAGC 30

(262) INFORMATION FOR SEQ ID NO:261:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:261:
CTATAACTGG GTTACATGGT TTAAC 25

(263) INFORMATION FOR SEQ ID NO:262:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:262:
TTTGAATTCA CATATTAATT AGAGACATGG 30

(264) INFORMATION FOR SEQ ID NO:263:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2724 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:263:

(265) INFORMATION FOR SEQ ID NO:264:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 907 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:264:

(266) INFORMATION FOR SEQ ID NO:265:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:265:
CGGAAGCTGC GGGCCAAATG GGTGGCCGGC 30

(267) INFORMATION FOR SEQ ID NO:266:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:266:
CAGAGGAGGG TGAAGGGGCT GTTGGCG 27

(268) INFORMATION FOR SEQ ID NO:267:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:267:
GGCGGCGCCG AGCCAAGGGG CTGGCTGTGG 30

(269) INFORMATION FOR SEQ ID NO:268:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:268:
GGGACTGCTC TATGAAAAAA CACATTGCCC TG 32

(270) INFORMATION FOR SEQ ID NO:269:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1071 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:269:

(271) INFORMATION FOR SEQ ID NO:270:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 356 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:270:

(272) INFORMATION FOR SEQ ID NO:271:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 903 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:271:

(273) INFORMATION FOR SEQ ID NO:272:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 300 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:272:

(274) INFORMATION FOR SEQ ID NO:273:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1041 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:273:

(275) INFORMATION FOR SEQ ID NO:274:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 346 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:274:

(276) INFORMATION FOR SEQ ID NO:275:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 993 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:275:

(277) INFORMATION FOR SEQ ID NO:276:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 330 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:276:

(278) INFORMATION FOR SEQ ID NO:277:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2724 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:277:

(279) INFORMATION FOR SEQ ID NO:278:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 907 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:278:
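The DNA and protein lengths listed for the paired entries above appear to follow a simple rule: three nucleotides per residue plus one terminating codon. The following minimal Python sketch checks that arithmetic. It assumes that each DNA entry is the complete coding sequence of the protein entry that follows it and that the stop codon is counted, neither of which is stated in the listing itself; the function name is illustrative only.

```python
def expected_cds_length(n_residues: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode n_residues, optionally plus one stop codon."""
    return 3 * (n_residues + (1 if include_stop else 0))


# (DNA length in base pairs, protein length in amino acids) as listed above
LISTED = {
    "263/264": (2724, 907),
    "269/270": (1071, 356),
    "271/272": (903, 300),
    "273/274": (1041, 346),
    "275/276": (993, 330),
    "277/278": (2724, 907),
}

# Collect every pair whose listed DNA length differs from the expected length.
mismatches = {
    pair: lengths
    for pair, lengths in LISTED.items()
    if expected_cds_length(lengths[1]) != lengths[0]
}
print(mismatches)  # prints {}
```

For example, 907 residues give 3 x (907 + 1) = 2724 base pairs, which matches the length listed for SEQ ID NO:263 and SEQ ID NO:277.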

(280) INFORMATION FOR SEQ ID NO:279:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:279:
CATGCCAACC GGCCCGCGAG GCTGCTGCTG GT 32

(281) INFORMATION FOR SEQ ID NO:280:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:280:
ACCAGCAGCA GCCTCGCGGG CCGGTTGGCA TG 32
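SEQ ID NO:279 and SEQ ID NO:280 read as reverse complements of one another, as would be expected for a pair of complementary oligonucleotides. The short Python sketch below checks this. The helper name is illustrative, and the intended use of the two oligonucleotides is not assumed here.

```python
# Translation table for the four unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string, ignoring spaces and case."""
    cleaned = seq.replace(" ", "").upper()
    return cleaned.translate(COMPLEMENT)[::-1]


seq_279 = "CATGCCAACC GGCCCGCGAG GCTGCTGCTG GT"
seq_280 = "ACCAGCAGCA GCCTCGCGGG CCGGTTGGCA TG"

is_pair = reverse_complement(seq_279) == seq_280.replace(" ", "")
print(is_pair)  # prints True
```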

Annex to the application documents - subsequently filed sequences listing

SEQUENCE LISTING

[0139]

<110> Arena Pharmaceuticals, Inc.

<120> Non Endogenous Constitutively Activated Human G Protein-Coupled Receptors

<130> AREN0049

<140> PCT/US99/23938
<141> 1999-10-12

<150> 09/170,496
<151> 1998-10-13

<160> 280

<170> PatentIn Ver. 2.1

<210> 1
<211> 1068
<212> DNA
<213> Homo sapiens

<400> 1

<210> 2
<211> 355
<212> PRT
<213> Homo sapiens

<400> 2

<210> 3
<211> 1089
<212> DNA
<213> Homo sapiens

<400> 3

<210> 4
<211> 362
<212> PRT
<213> Homo sapiens

<400> 4

<210> 5
<211> 30
<212> DNA
<213> Homo sapiens

<400> 5
tatgaattca gatgctctaa acgtccctgc 30

<210> 6
<211> 30
<212> DNA
<213> Homo sapiens

<400> 6
tccggatcca cctgcacctg cgcctgcacc 30

<210> 7
<211> 1002
<212> DNA
<213> Homo sapiens

<400> 7

<210> 8
<211> 333
<212> PRT
<213> Homo sapiens

<400> 8

<210> 9
<211> 30
<212> DNA
<213> Homo sapiens

<400> 9
gcaagcttgg gggacgccag gtcgccggct 30

<210> 10
<211> 31
<212> DNA
<213> Homo sapiens

<400> 10
gcggatccgg acgctggggg agtcaggctg c 31

<210> 11
<211> 987
<212> DNA
<213> Homo sapiens

<400> 11

<210> 12
<211> 328
<212> PRT
<213> Homo sapiens

<400> 12

<210> 13
<211> 30
<212> DNA
<213> Homo sapiens

<400> 13
cggaattcgt caacggtccc agctacaatg 30

<210> 14
<211> 31
<212> DNA
<213> Homo sapiens

<400> 14
atggatccca ggcccttcag caccgcaata t 31

<210> 15
<211> 1002
<212> DNA
<213> Homo sapiens

<400> 15

<210> 16
<211> 333
<212> PRT
<213> Homo sapiens

<400> 16

<210> 17
<211> 48
<212> DNA
<213> Homo sapiens

<400> 17
acgaattcag ccatggtcct tgaggtgagt gaccaccaag tgctaaat 48

<210> 18
<211> 27
<212> DNA
<213> Homo sapiens

<400> 18
gaggatcctg gaatgcgggg aagtcag 27

<210> 19
<211> 1107
<212> DNA
<213> Homo sapiens

<400> 19

<210> 20
<211> 368
<212> PRT
<213> Homo sapiens

<400> 20

<210> 21
<211> 30
<212> DNA
<213> Homo sapiens

<400> 21
ttaagcttga cctaatgcca tcttgtgtcc 30

<210> 22
<211> 30
<212> DNA
<213> Homo sapiens

<400> 22
ttggatccaa aagaaccatg cacctcagag 30

<210> 23
<211> 1074
<212> DNA
<213> Homo sapiens

<400> 23

<210> 24
<211> 357
<212> PRT
<213> Homo sapiens

<400> 24

<210> 25
<211> 1
<212> DNA
<213> Homo sapiens

<400> 25

<210> 26
<211> 369
<212> PRT
<213> Homo sapiens

<400> 26

<210> 27
<211> 1083
<212> DNA
<213> Homo sapiens

<400> 27

<210> 28
<211> 360
<212> PRT
<213> Homo sapiens

<400> 28

<210> 29
<211> 31
<212> DNA
<213> Homo sapiens

<400> 29
ctagaattct gactccagcc aaagcatgaa t 31

<210> 30
<211> 30
<212> DNA
<213> Homo sapiens

<400> 30
gctggatcct aaacagtctg cgctcggcct 30

<210> 31
<211> 1020
<212> DNA
<213> Homo sapiens

<400> 31

<210> 32
<211> 339
<212> PRT
<213> Homo sapiens

<400> 32

<210> 33
<211> 29
<212> DNA
<213> Homo sapiens

<400> 33
ataagatgat caccctgaac aatcaagat 29

<210> 34
<211> 33
<212> DNA
<213> Homo sapiens

<400> 34
tccgaattca taacatttca ctgtttatat tgc 33

<210> 35
<211> 996
<212> DNA
<213> Homo sapiens

<400> 35

<210> 36
<211> 331
<212> PRT
<213> Homo sapiens

<400> 36

<210> 37
<211> 28
<212> DNA
<213> Homo sapiens

<400> 37
ccaagcttcc aggcctgggg tgtgctgg 28

<210> 38
<211> 29
<212> DNA
<213> Homo sapiens

<400> 38
atggatcctg accttcggcc cctggcaga 29

<210> 39
<211> 1077
<212> DNA
<213> Homo sapiens

<400> 39

<210> 40
<211> 358
<212> PRT
<213> Homo sapiens

<400> 40

<210> 41
<211> 30
<212> DNA
<213> Homo sapiens

<400> 41
gagaattcac tcctgagctc aagatgaact 30

<210> 42
<211> 30
<212> DNA
<213> Homo sapiens

<400> 42
cgggatcccc gtaactgagc cacttcagat 30

<210> 43
<211> 1050
<212> DNA
<213> Homo sapiens

<400> 43

<210> 44
<211> 349
<212> PRT
<213> Homo sapiens

<400> 44

<210> 45
<211> 30
<212> DNA
<213> Homo sapiens

<400> 45
tcccccggga aaaaaaccaa ctgctccaaa 30

<210> 46
<211> 31
<212> DNA
<213> Homo sapiens

<400> 46
taggatccat ttgaatgtgg atttggtgaa a 31

<210> 47
<211> 1413
<212> DNA
<213> Homo sapiens

<400> 47

<210> 48
<211> 433
<212> PRT
<213> Homo sapiens

<400> 48

<210> 49
<211> 30
<212> DNA
<213> Homo sapiens

<400> 49
gtgaagcttg cctctggtgc ctgcaggagg 30

<210> 50
<211> 31
<212> DNA
<213> Homo sapiens

<400> 50
gcagaattcc cggtggcgtg ttgtggtgcc c 31

<210> 51
<211> 1209
<212> DNA
<213> Homo sapiens

<400> 51

<210> 52
<211> 402
<212> PRT
<213> Homo sapiens

<400> 52

<210> 53
<211> 27
<212> DNA
<213> Homo sapiens

<400> 53
ggcggatcca tggatgtgac ttcccaa 27

<210> 54
<211> 27
<212> DNA
<213> Homo sapiens

<400> 54
ggcggatccc tacacggcac tgctgaa 27

<210> 55
<211> 1128
<212> DNA
<213> Homo sapiens

<400> 55

<210> 56
<211> 375
<212> PRT
<213> Homo sapiens

<400> 56

<210> 57
<211> 31
<212> DNA
<213> Homo sapiens

<400> 57
aaggaattca cggccgggtg atgccattcc c 31

<210> 58
<211> 30
<212> DNA
<213> Homo sapiens

<400> 58
ggtggatcca taaacacggg cgttgaggac 30

<210> 59
<211> 960
<212> DNA
<213> Homo sapiens

<400> 59

<210> 60
<211> 319
<212> PRT
<213> Homo sapiens

<400> 60

<210> 61
<211> 1143
<212> DNA
<213> Homo sapiens

<400> 61

<210> 62
<211> 380
<212> PRT
<213> Homo sapiens

<400> 62

<210> 63
<211> 31
<212> DNA
<213> Homo sapiens

<400> 63
tgagaattct ggtgactcac agccggcaca g 31

<210> 64
<211> 31
<212> DNA
<213> Homo sapiens

<400> 64
gccggatcca aggaaaagca gcaataaaag g 31

<210> 65
<211> 1119
<212> DNA
<213> Homo sapiens

<400> 65

<210> 66
<211> 372
<212> PRT
<213> Homo sapiens

<400> 66

<210> 67
<211> 30
<212> DNA
<213> Homo sapiens

<400> 67
caaagcttga aagctgcacg gtgcagagac 30

<210> 68
<211> 30
<212> DNA
<213> Homo sapiens

<400> 68
gcggatcccg agtcacaccc tggctgggcc 30

<210> 69
<211> 1128
<212> DNA
<213> Homo sapiens

<400> 69

<210> 70
<211> 375
<212> PRT
<213> Homo sapiens

<400> 70

<210> 71
<211> 30
<212> DNA
<213> Homo sapiens

<400> 71
acagaattcc tgtgtggttt taccgcccag 30

<210> 72
<211> 30
<212> DNA
<213> Homo sapiens

<400> 72
ctcggatcca ggcagaagag tcgcctatgg 30

<210> 73
<211> 1137
<212> DNA
<213> Homo sapiens

<400> 73

<210> 74
<211> 378
<212> PRT
<213> Homo sapiens

<400> 74

<210> 75
<211> 32
<212> DNA
<213> Homo sapiens

<400> 75
ctggaattca cctggaccac caccaatgga ta 32

<210> 76
<211> 30
<212> DNA
<213> Homo sapiens

<400> 76
ctcggatcct gcaaagtttg tcatacagtt 30

<210> 77
<211> 1086
<212> DNA
<213> Homo sapiens

<400> 77

<210> 78
<211> 361
<212> PRT
<213> Homo sapiens

<400> 78

<210> 79
<211> 31
<212> DNA
<213> Homo sapiens

<400> 79
ctggaattct cctgctcatc cagccatgcg g 31

<210> 80
<211> 30
<212> DNA
<213> Homo sapiens

<400> 80
cctggatccc cacccctact ggggcctcag 30

<210> 81
<211> 1446
<212> DNA
<213> Homo sapiens

<400> 81

<210> 82
<211> 481
<212> PRT
<213> Homo sapiens

<400> 82

<210> 83
<211> 22
<212> DNA
<213> Homo sapiens

<400> 83
atgtggaacg cgacgcccag cg 22

<210> 84
<211> 22
<212> DNA
<213> Homo sapiens

<400> 84
tcatgtatta atactagatt ct 22

<210> 85
<211> 38
<212> DNA
<213> Homo sapiens

<400> 85
taccatgtgg aacgcgacgc ccagcgaaga gccggggt 38

<210> 86
<211> 39
<212> DNA
<213> Homo sapiens

<400> 86
cggaattcat gtattaatac tagattctgt ccaggcccg 39

<210> 87
<211> 1101
<212> DNA
<213> Homo sapiens

<400> 87

<210> 88
<211> 366
<212> PRT
<213> Homo sapiens

<400> 88

<210> 89
<211> 33
<212> DNA
<213> Homo sapiens

<400> 89
gcaagcttgt gccctcacca agccatgcga gcc 33

<210> 90
<211> 30
<212> DNA
<213> Homo sapiens

<400> 90
cggaattcag caatgagttc cgacagaagc 30

<210> 91
<211> 1842
<212> DNA
<213> Homo sapiens

<400> 91

<210> 92
<211> 613
<212> PRT
<213> Homo sapiens

<400> 92

<210> 93
<211> 34
<212> DNA
<213> Homo sapiens

<400> 93
cagaattcag agaaaaaaag tgaatatggt tttt 34

<210> 94
<211> 32
<212> DNA
<213> Homo sapiens

<400> 94
ttggatccct ggtgcataac aattgaaaga at 32

<210> 95
<211> 1248
<212> DNA
<213> Homo sapiens

<400> 95

<210> 96
<211> 415
<212> PRT
<213> Homo sapiens

<400> 96

<210> 97
<211> 30
<212> DNA
<213> Homo sapiens

<400> 97
ggaaagctta acgatcccca ggagcaacat 30

<210> 98
<211> 31
<212> DNA
<213> Homo sapiens

<400> 98
ctgggatcct acgagagcat ttttcacaca g 31

<210> 99
<211> 1842
<212> DNA
<213> Homo sapiens

<400> 99

<210> 100
<211> 613
<212> PRT
<213> Homo sapiens

<400> 100

<210> 101
<211> 32
<212> DNA
<213> Homo sapiens

<400> 101
tccaagcttc gccatgggac ataacgggag ct 32

<210> 102
<211> 30
<212> DNA
<213> Homo sapiens

<400> 102
cgtgaattcc aagaatttac aatccttgct 30

<210> 103
<211> 1548
<212> DNA
<213> Homo sapiens

<400> 103

<210> 104
<211> 515
<212> PRT
<213> Homo sapiens

<400> 104

<210> 105
<211> 29
<212> DNA
<213> Homo sapiens

<400> 105
ggagaattca ctaggcgagg cgctccatc 29

<210> 106
<211> 30
<212> DNA
<213> Homo sapiens

<400> 106
ggaggatcca ggaaacctta ggccgagtcc 30

<210> 107
<211> 1164
<212> DNA
<213> Homo sapiens

<400> 107

<210> 108
<211> 387
<212> PRT
<213> Homo sapiens

<400> 108

<210> 109
<211> 37
<212> DNA
<213> Homo sapiens

<400> 109
accatggctt gcaatggcag tgcggccagg gggcact 37

<210> 110
<211> 39
<212> DNA
<213> Homo sapiens

<400> 110
cgaccaggac aaacagcatc ttggtcactt gtctccggc 39

<210> 111
<211> 39
<212> DNA
<213> Homo sapiens

<400> 111
gaccaagatg ctgtttgtcc tggtcgtggt gtttggcat 39

<210> 112
<211> 35
<212> DNA
<213> Homo sapiens

<400> 112
cggaattcag gatggatcgg tctcttgctg cgcct 35

<210> 113
<211> 1212
<212> DNA
<213> Homo sapiens

<400> 113

<210> 114
<211> 403
<212> PRT
<213> Homo sapiens

<400> 114

<210> 115
<211> 30
<212> DNA
<213> Homo sapiens

<400> 115
ggaagcttca ggcccaaaga tggggaacat 30

<210> 116
<211> 30
<212> DNA
<213> Homo sapiens

<400> 116
gtggatccac ccgcggagga cccaggctag 30

<210> 117
<211> 1098
<212> DNA
<213> Homo sapiens

<400> 117

<210> 118
<211> 365
<212> PRT
<213> Homo sapiens

<400> 118

<210> 119
<211> 26
<212> DNA
<213> Homo sapiens

<400> 119
gacctcgagt ccttctacac ctcatc 26

<210> 120
<211> 30
<212> DNA
<213> Homo sapiens

<400> 120
tgctctagat tccagatagg tgaaaacttg 30

<210> 121
<211> 1416
<212> DNA
<213> Homo sapiens

<400> 121

<210> 122
<211> 471
<212> PRT
<213> Homo sapiens

<400> 122

<210> 123
<211> 27
<212> DNA
<213> Homo sapiens

<400> 123
gacctcgagg ttgcttaaga ctgaagc 27

<210> 124
<211> 27
<212> DNA
<213> Homo sapiens

<400> 124
atttctagac atatgtagct tgtaccg 27

<210> 125
<211> 1377
<212> DNA
<213> Homo sapiens

<400> 125

<210> 126
<211> 458
<212> PRT
<213> Homo sapiens

<400> 126

<210> 127
<211> 30
<212> DNA
<213> Homo sapiens

<400> 127
ggtaagcttg gcagtccacg ccaggccttc 30

<210> 128
<211> 30
<212> DNA
<213> Homo sapiens

<400> 128
tccgaattct ctgtagacac aaggctttgg 30

<210> 129
<211> 1068
<212> DNA
<213> Homo sapiens

<400> 129

<210> 130
<211> 355
<212> PRT
<213> Homo sapiens

<400> 130

<210> 131
<211> 32
<212> DNA
<213> Homo sapiens

<400> 131
gatctccagt aggcataagt ggacaattct gg 32

<210> 132
<211> 30
<212> DNA
<213> Homo sapiens

<400> 132
ctccttcggt cctcctatcg ttgtcagaag 30

<210> 133
<211> 30
<212> DNA
<213> Homo sapiens

<400> 133
agaaggccaa gatcgcgcgg ctggccctca 30

<210> 134
<211> 30
<212> DNA
<213> Homo sapiens

<400> 134
cggcgccacc gcacgaaaaa gctcatcttc 30

<210> 135
<211> 33
<212> DNA
<213> Homo sapiens

<400> 135
gccaagaagc gggtgaagtt cctggtggtg gca 33

<210> 136
<211> 30
<212> DNA
<213> Homo sapiens

<400> 136
caggcggaag gtgaaagtcc tggtcctcgt 30

<210> 137
<211> 33
<212> DNA
<213> Homo sapiens

<400> 137
cggcgcctgc gggccaagcg gctggtggtg gtg 33

<210> 138
<211> 31
<212> DNA
<213> Homo sapiens

<400> 138
ccaagcacaa agccaagaaa gtgaccatca c 31

<210> 139
<211> 30
<212> DNA
<213> Homo sapiens

<400> 139
gcgccggcgc accaaatgct tgctggtggt 30

<210> 140
<211> 41
<212> DNA
<213> Homo sapiens

<400> 140
caaaaagctg aagaaatcta agaagatcat ctttattgtc g 41

<210> 141
<211> 30
<212> DNA
<213> Homo sapiens

<400> 141
caagaccaag gcaaaacgca tgatcgccat 30

<210> 142
<211> 30
<212> DNA
<213> Homo sapiens

<400> 142
gtcaaggaga agtccaaaag gatcatcatc 30

<210> 143
<211> 30
<212> DNA
<213> Homo sapiens

<400> 143
cgccgcgtgc gggccaagca gctcctgctc 30

<210> 144
<211> 33
<212> DNA
<213> Homo sapiens

<400> 144
cctgataagc gctataaaat ggtcctgttt cga 33

<210> 145
<211> 36
<212> DNA
<213> Homo sapiens

<400> 145
gaaagacaaa agagagtcaa gaggatgtct ttattg 36

<210> 146
<211> 33
<212> DNA
<213> Homo sapiens

<400> 146
cggagaaaga gggtgaaacg cacagccatc gcc 33

<210> 147
<211> 30
<212> DNA
<213> Homo sapiens

<400> 147
aagcttcagc gggccaaggc actggtcacc 30

<210> 148
<211> 30
<212> DNA
<213> Homo sapiens

<400> 148
cagcggcaga aggcaaaaag ggtggccatc 30

<210> 149
<211> 30
<212> DNA
<213> Homo sapiens

<400> 149
cggcagaagg cgaagcgcat gatcctcgcg 30

<210> 150
<211> 30
<212> DNA
<213> Homo sapiens

<400> 150
gagcgcaaca aggccaaaaa ggtgatcatc 30

<210> 151
<211> 39
<212> DNA
<213> Homo sapiens

<400> 151
ggtgtaaaca aaaaggctaa aaacacaatt attcttatt 39

<210> 152
<211> 27
<212> DNA
<213> Homo sapiens

<400> 152
gagagccagc tcaagagcac cgtggtg 27

<210> 153
<211> 30
<212> DNA
<213> Homo sapiens

<400> 153
ccacaagcaa accaagaaaa tgctggctgt 30

<210> 154
<211> 30
<212> DNA
<213> Homo sapiens

<400> 154
catcaagtgt atcatgtgcc aagtacgccc 30

<210> 155
<211> 34
<212> DNA
<213> Homo sapiens

<400> 155
ctagagagtc agatgaagtg tacagtagtg gcac 34

<210> 156
<211> 34
<212> DNA
<213> Homo sapiens

<400> 156
ctagagagtc agatgaagtg tacagtagtg gcac 34

<210> 157
<211> 33
<212> DNA
<213> Homo sapiens

<400> 157
gctgaggttc gcaataaact aaccatgttt gtg 33

<210> 158
<211> 29
<212> DNA
<213> Homo sapiens

<400> 158
gggaggccga gctgaaagcc accctgctc 29

<210> 159
<211> 31
<212> DNA
<213> Homo sapiens

<400> 159
caagatcaag agagccaaaa ccttcatcat g 31

<210> 160
<211> 31
<212> DNA
<213> Homo sapiens

<400> 160
ccggagacaa gtgaagaaga tgctgtttgt c 31

<210> 161
<211> 30
<212> DNA
<213> Homo sapiens

<400> 161
gcaaggacca gatcaagcgg ctggtgctca 30

<210> 162
<211> 34
<212> DNA
<213> Homo sapiens

<400> 162
caagaaagcc aaagccaaga aactgatcct tctg 34

<210> 163
<211> 1068
<212> DNA
<213> Homo sapiens

<400> 163

<210> 164
<211> 355
<212> PRT
<213> Homo sapiens

<400> 164

<210> 165
<211> 1089
<212> DNA
<213> Homo sapiens

<400> 165

<210> 166
<211> 362
<212> PRT
<213> Homo sapiens

<400> 166

<210> 167
<211> 1002
<212> DNA
<213> Homo sapiens

<400> 167

<210> 168
<211> 333
<212> PRT
<213> Homo sapiens

<400> 168

<210> 169
<211> 987
<212> DNA
<213> Homo sapiens

<400> 169

<210> 170
<211> 328
<212> PRT
<213> Homo sapiens

<400> 170

<210> 171
<211> 1002
<212> DNA
<213> Homo sapiens

<400> 171

<210> 172
<211> 333
<212> PRT
<213> Homo sapiens

<400> 172

<210> 173
<211> 1107
<212> DNA
<213> Homo sapiens

<400> 173

<210> 174
<211> 368
<212> PRT
<213> Homo sapiens

<400> 174

<210> 175
<211> 1074
<212> DNA
<213> Homo sapiens

<400> 175

<210> 176
<211> 357
<212> PRT
<213> Homo sapiens

<400> 176

<210> 177
<211> 1110
<212> DNA
<213> Homo sapiens

<400> 177

<210> 178
<211> 369
<212> PRT
<213> Homo sapiens

<400> 178

<210> 179
<211> 1083
<212> DNA
<213> Homo sapiens

<400> 179

<210> 180
<211> 360
<212> PRT
<213> Homo sapiens

<400> 180

<210> 181
<211> 1020
<212> DNA
<213> Homo sapiens

<400> 181

<210> 182
<211> 339
<212> PRT
<213> Homo sapiens

<400> 182

<210> 183
<211> 996
<212> DNA
<213> Homo sapiens

<400> 183

<210> 184
<211> 331
<212> PRT
<213> Homo sapiens

<400> 184

<210> 185
<211> 1077
<212> DNA
<213> Homo sapiens

<400> 185

<210> 186
<211> 358
<212> PRT
<213> Homo sapiens

<400> 186

<210> 187
<211> 1050
<212> DNA
<213> Homo sapiens

<400> 187

<210> 188
<211> 349
<212> PRT
<213> Homo sapiens

<400> 188

<210> 189
<211> 1302
<212> DNA
<213> Homo sapiens

<400> 189

<210> 190
<211> 433
<212> PRT
<213> Homo sapiens

<400> 190

<210> 191
<211> 1209
<212> DNA
<213> Homo sapiens

<400> 191

<210> 192
<211> 402
<212> PRT
<213> Homo sapiens

<400> 192

<210> 193
<211> 1128
<212> DNA
<213> Homo sapiens

<400> 193

<210> 194
<211> 375
<212> PRT
<213> Homo sapiens

<400> 194

<210> 195
<211> 960
<212> DNA
<213> Homo sapiens

<400> 195

<210> 196
<211> 319
<212> PRT
<213> Homo sapiens

<400> 196

<210> 197
<211> 1143
<212> DNA
<213> Homo sapiens

<400> 197

<210> 198
<211> 380
<212> PRT
<213> Homo sapiens

<400> 198

<210> 199
<211> 1119
<212> DNA
<213> Homo sapiens

<400> 199

<210> 200
<211> 372
<212> PRT
<213> Homo sapiens

<400> 200

<210> 201
<211> 1128
<212> DNA
<213> Homo sapiens

<400> 201

<210> 202
<211> 375
<212> PRT
<213> Homo sapiens

<400> 202

<210> 203
<211> 1137
<212> DNA
<213> Homo sapiens

<400> 203

<210> 204
<211> 378
<212> PRT
<213> Homo sapiens

<400> 204

<210> 205
<211> 1086
<212> DNA
<213> Homo sapiens

<400> 205

<210> 206
<211> 361
<212> PRT
<213> Homo sapiens

<400> 206

<210> 207
<211> 1446
<212> DNA
<213> Homo sapiens

<400> 207

<210> 208
<211> 481
<212> PRT
<213> Homo sapiens

<400> 208

<210> 209
<211> 1101
<212> DNA
<213> Homo sapiens

<400> 209

<210> 210
<211> 366
<212> PRT
<213> Homo sapiens

<400> 210

<210> 211
<211> 1842
<212> DNA
<213> Homo sapiens

<400> 211

<210> 212
<211> 613
<212> PRT
<213> Homo sapiens

<400> 212

<210> 213
<211> 1248
<212> DNA
<213> Homo sapiens

<400> 213

<210> 214
<211> 415
<212> PRT
<213> Homo sapiens

<400> 214

<210> 215
<211> 1842
<212> DNA
<213> Homo sapiens

<400> 215

<210> 216
<211> 613
<212> PRT
<213> Homo sapiens

<400> 216

<210> 217
<211> 1854
<212> DNA
<213> Homo sapiens

<400> 217

<210> 218
<211> 617
<212> PRT
<213> Homo sapiens

<400> 218

<210> 219
<211> 1548
<212> DNA
<213> Homo sapiens

<400> 219

<210> 220
<211> 515
<212> PRT
<213> Homo sapiens

<400> 220

<210> 221
<211> 1164
<212> DNA
<213> Homo sapiens

<400> 221

<210> 222
<211> 387
<212> PRT
<213> Homo sapiens

<400> 222

<210> 223
<211> 1212
<212> DNA
<213> Homo sapiens

<400> 223

<210> 224
<211> 403
<212> PRT
<213> Homo sapiens

<400> 224

<210> 225
<211> 1098
<212> DNA
<213> Homo sapiens

<400> 225

<210> 226
<211> 365
<212> PRT
<213> Homo sapiens

<400> 226

<210> 227
<211> 1416
<212> DNA
<213> Homo sapiens

<400> 227

<210> 228
<211> 471
<212> PRT
<213> Homo sapiens

<400> 228

<210> 229
<211> 1377
<212> DNA
<213> Homo sapiens

<400> 229

<210> 230
<211> 458
<212> PRT
<213> Homo sapiens

<400> 230

<210> 231
<211> 1068
<212> DNA
<213> Homo sapiens

<400> 231

<210> 232
<211> 355
<212> PRT
<213> Homo sapiens

<400> 232

<210> 233
<211> 29
<212> DNA
<213> Homo sapiens

<400> 233
ggcttaagag catcatcgtg gtgctggtg 29

<210> 234
<211> 34
<212> DNA
<213> Homo sapiens

<400> 234
gtcaccacca gcaccacgat gatgctctta agcc 34

<210> 235
<211> 31
<212> DNA
<213> Homo sapiens

<400> 235
caaagaaagt actgggcatc gtcttcttcc t 31

<210> 236
<211> 30
<212> DNA
<213> Homo sapiens

<400> 236
tgctctagat tccagatagg tgaaaacttg 30

<210> 237
<211> 50
<212> DNA
<213> Homo sapiens

<400> 237
ctaggggcac catgcaggct atcaacaatg aaagaaaagc taagaaagtc 50

<210> 238
<211> 50
<212> DNA
<213> Homo sapiens

<400> 238
caaggacttt cttagctttt ctttcattgt tgatagcctg catggtgccc 50

<210> 239
<211> 35
<212> DNA
<213> Homo sapiens

<400> 239
cggcggcaga aggcgaaacg catgatcctc gcggt 35

<210> 240
<211> 35
<212> DNA
<213> Homo sapiens

<400> 240
accgcgagga tcatgcgttt cgccttctgc cgccg 35

<210> 241
<211> 24
<212> DNA
<213> Homo sapiens

<400> 241
gagacatatt atctgccacg gagg 24

<210> 242
<211> 24
<212> DNA
<213> Homo sapiens

<400> 242
ttggcataga aaccggaccc aagg 24

<210> 243
<211> 28
<212> DNA
<213> Homo sapiens

<400> 243
taagaattcc ataaaaatta tggaatgg 28

<210> 244
<211> 30
<212> DNA
<213> Homo sapiens

<400> 244
ccaggatcca gctgaagtct tccatcattc 30

<210> 245
<211> 1071
<212> DNA
<213> Homo sapiens

<400> 245

<210> 246
<211> 356
<212> PRT
<213> Homo sapiens

<400> 246

<210> 247
<211> 32
<212> DNA
<213> Homo sapiens

<400> 247
gcagaattcg gcggccccat ggacctgccc cc 32

<210> 248
<211> 30
<212> DNA
<213> Homo sapiens

<400> 248
gctggatccc ccgagcagtg gcgttacttc 30

<210> 249
<211> 903
<212> DNA
<213> Homo sapiens

<400> 249

<210> 250
<211> 300
<212> PRT
<213> Homo sapiens

<400> 250

<210> 251
<211> 31
<212> DNA
<213> Homo sapiens

<400> 251
ctcaagctta ctctctctca ccagtggcca c 31

<210> 252
<211> 24
<212> DNA
<213> Homo sapiens

<400> 252
ccctcctccc ccggaggacc tagc 24

<210> 253
<211> 1041
<212> DNA
<213> Homo sapiens

<400> 253

<210> 254
<211> 346
<212> PRT
<213> Homo sapiens

<400> 254

<210> 255
<211> 31
<212> DNA
<213> Homo sapiens

<400> 255
tttaagcttc ccctccagga tgctgccgga c 31

<210> 256
<211> 31
<212> DNA
<213> Homo sapiens

<400> 256
ggcgaattct gaaggtccag ggaaactgct a 31

<210> 257
<211> 993
<212> DNA
<213> Homo sapiens

<400> 257

<210> 258
<211> 330
<212> PRT
<213> Homo sapiens

<400> 258

<210> 259
<211> 30
<212> DNA
<213> Homo sapiens

<400> 259
cccaagcttc gggcaccatg gacacctccc 30

<210> 260
<211> 30
<212> DNA
<213> Homo sapiens

<400> 260
acaggatcca aatgcacagc actggtaagc 30

<210> 261
<211> 25
<212> DNA
<213> Homo sapiens

<400> 261
ctataactgg gttacatggt ttaac 25

<210> 262
<211> 30
<212> DNA
<213> Homo sapiens

<400> 262
tttgaattca catattaatt agagacatgg 30

<210> 263
<211> 2724
<212> DNA
<213> Homo sapiens

<400> 263

<210> 264
<211> 907
<212> PRT
<213> Homo sapiens

<400> 264

<210> 265
<211> 30
<212> DNA
<213> Homo sapiens

<400> 265
cggaagctgc gggccaaatg ggtggccggc 30

<210> 266
<211> 27
<212> DNA
<213> Homo sapiens

<400> 266
cagaggaggg tgaaggggct gttggcg 27

<210> 267
<211> 30
<212> DNA
<213> Homo sapiens

<400> 267
ggcggcgccg agccaagggg ctggctgtgg 30

<210> 268
<211> 32
<212> DNA
<213> Homo sapiens

<400> 268
gggactgctc tatgaaaaaa cacattgccc tg 32

<210> 269
<211> 1071
<212> DNA
<213> Homo sapiens

<400> 269

<210> 270
<211> 356
<212> PRT
<213> Homo sapiens

<400> 270

<210> 271
<211> 903
<212> DNA
<213> Homo sapiens

<400> 271

<210> 272
<211> 300
<212> PRT
<213> Homo sapiens

<400> 272

<210> 273
<211> 1041
<212> DNA
<213> Homo sapiens

<400> 273

<210> 274
<211> 346
<212> PRT
<213> Homo sapiens

<400> 274

<210> 275
<211> 993
<212> DNA
<213> Homo sapiens

<400> 275

<210> 276
<211> 330
<212> PRT
<213> Homo sapiens

<400> 276

<210> 277
<211> 2724
<212> DNA
<213> Homo sapiens

<400> 277

<210> 278
<211> 907
<212> PRT
<213> Homo sapiens

<400> 278

<210> 279
<211> 32
<212> DNA
<213> Homo sapiens

<400> 279
catgccaacc ggcccgcgag gctgctgctg gt 32

<210> 280
<211> 32
<212> DNA
<213> Homo sapiens

<400> 280
accagcagca gcctcgcggg ccggttggca tg 32
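The numeric identifiers used in the listing above (<210> sequence number, <211> length, <212> type, <400> sequence) lend themselves to a mechanical consistency check. The following Python sketch is a minimal, hypothetical parser, not a complete reader for the listing format. It handles only nucleotide entries whose residues are printed in full, and its field names are chosen for illustration. It compares each declared <211> length with the number of residues actually present.

```python
import re

SAMPLE = """\
<210> 279
<211> 32
<212> DNA
<213> Homo sapiens

<400> 279
catgccaacc ggcccgcgag gctgctgctg gt 32

<210> 280
<211> 32
<212> DNA
<213> Homo sapiens

<400> 280
accagcagca gcctcgcggg ccggttggca tg 32
"""


def parse_records(text: str) -> list:
    """Collect id, declared length, type and residues for each <210> entry."""
    records = []
    current = None
    in_sequence = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        tag = re.match(r"<(\d{3})>\s*(.*)", line)
        if tag:
            code, value = tag.group(1), tag.group(2).strip()
            if code == "210":
                # A new entry starts here.
                current = {"id": int(value), "length": None, "type": None, "residues": ""}
                records.append(current)
                in_sequence = False
            elif current is not None and code == "211":
                current["length"] = int(value)
            elif current is not None and code == "212":
                current["type"] = value
            elif current is not None and code == "400":
                in_sequence = True
            continue
        if in_sequence and current is not None:
            # Keep letters only: drops the block spacing and the running count.
            current["residues"] += re.sub(r"[^A-Za-z]", "", line)
    return records


for record in parse_records(SAMPLE):
    ok = record["length"] == len(record["residues"])
    print(record["id"], record["type"], record["length"], ok)
```

Entries whose sequence data are not reproduced in the text would be reported as mismatches by this sketch, so a zero-length residue string should be treated as "not available" rather than as an error.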

Claims

1. A method for creating a non-endogenous, constitutively active version of an endogenous human G protein coupled receptor (GPCR), said endogenous GPCR comprising a transmembrane 6 region and an intracellular loop 3 region, the method comprising:

(a) selecting an endogenous human GPCR comprising a proline residue in the transmembrane 6 region;

(b) identifying the endogenous 16th amino acid residue from the proline residue of step (a), in a carboxy-terminus to amino-terminus direction;

(c) altering the identified amino acid residue of step (b) to a non-endogenous amino acid residue to create a non-endogenous version of the endogenous human GPCR; and

(d) determining if the non-endogenous version of the endogenous human GPCR of step (c) is constitutively active by measuring a difference in an intracellular signal measured for the non-endogenous version as compared with a signal induced by the endogenous GPCR.
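Steps (a) to (c) can be illustrated with a minimal Python sketch. It rests on several assumptions that are not part of the claim: the protein is given as a one-letter string, the position of the transmembrane 6 proline is already known and supplied as a 0-based index, and "16th residue from the proline" is read as 16 positions toward the amino terminus with the proline itself not counted. The toy sequence and function names are hypothetical. The measurement of step (d) is experimental and is not modelled.

```python
def target_index(proline_index: int, offset: int = 16) -> int:
    """Index of the residue `offset` positions N-terminal to the proline."""
    idx = proline_index - offset
    if idx < 0:
        raise ValueError("sequence is too short upstream of the proline")
    return idx


def alter_residue(seq: str, proline_index: int, new_residue: str) -> str:
    """Return a copy of seq with the residue 16 positions before the proline replaced."""
    if seq[proline_index] != "P":
        raise ValueError("residue at proline_index is not proline")
    idx = target_index(proline_index)
    if seq[idx] == new_residue:
        raise ValueError("replacement must differ from the endogenous residue")
    return seq[:idx] + new_residue + seq[idx + 1:]


# Hypothetical toy sequence: proline at index 20, threonine at index 4.
toy = "GGGG" + "T" + "A" * 13 + "WLP" + "FFF"
mutant = alter_residue(toy, proline_index=20, new_residue="K")
print(toy[4], "->", mutant[4])  # prints T -> K
```

If the intended counting convention includes the proline as position 1, the `offset` argument would be 15 instead of 16; the specification, not this sketch, governs which reading applies.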

2. A method for directly identifying a compound selected from the group consisting of inverse agonist, agonist and partial agonist to a non-endogenous, constitutively activated human G protein coupled receptor, said receptor comprising a transmembrane 6 region and an intracellular loop 3 region, the method comprising steps (a) to (d) of claim 1 and further comprising the steps:

(e) contacting a candidate compound with a non-endogenous, constitutively active GPCR identified in step (d); and

(f) determining, by measurement of the compound efficacy at said contacted receptor, whether said compound is an inverse agonist, agonist or partial agonist of said receptor.
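The determination in step (f) can be sketched as a comparison of the signal measured with the candidate compound against the signal of the constitutively active receptor alone. The Python sketch below is an illustration under stated assumptions: signals are expressed in arbitrary positive units, a reference value for a full response is available, and the tolerance and partial-response cutoff are hypothetical parameters chosen for the example rather than values taken from this specification.

```python
def classify_compound(
    baseline: float,
    with_compound: float,
    full_response: float,
    tolerance: float = 0.05,
    partial_cutoff: float = 0.8,
) -> str:
    """Label a compound from its effect on the receptor's constitutive signal."""
    change = with_compound - baseline
    # Changes within the tolerance band are treated as no measurable efficacy.
    if abs(change) <= tolerance * abs(baseline):
        return "no measurable efficacy"
    # A signal below the constitutive baseline indicates inverse agonism.
    if change < 0:
        return "inverse agonist"
    span = full_response - baseline
    if span <= 0:
        raise ValueError("full_response must exceed baseline")
    # Above the baseline: full or partial, judged against the reference response.
    return "agonist" if change >= partial_cutoff * span else "partial agonist"


print(classify_compound(100.0, 40.0, 200.0))   # prints inverse agonist
print(classify_compound(100.0, 195.0, 200.0))  # prints agonist
print(classify_compound(100.0, 140.0, 200.0))  # prints partial agonist
```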

3. The method of claim 1 wherein the amino acid residue that is two residues from said proline residue in the transmembrane 6 region, in a carboxy-terminus to amino-terminus direction, is tryptophan.
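The tryptophan condition of this claim corresponds to a W-x-P pattern, where x is any residue. A short Python sketch for locating such prolines in a one-letter protein string follows. Restricting the search to the transmembrane 6 region is not modelled, and the function name and toy sequence are hypothetical.

```python
import re


def prolines_with_upstream_tryptophan(seq: str) -> list[int]:
    """0-based indices of every proline that has tryptophan two residues before it."""
    # The zero-width lookahead lets overlapping W-x-P patterns all be reported.
    return [m.start() + 2 for m in re.finditer(r"(?=W.P)", seq.upper())]


# Hypothetical toy sequence with a single W-x-P pattern ending at index 20.
toy = "GGGG" + "T" + "A" * 13 + "WLP" + "FFF"
print(prolines_with_upstream_tryptophan(toy))  # prints [20]
```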

4. The method of any one of claims 1 to 3 wherein the endogenous 16th amino acid residue from said proline residue in a carboxy-terminus to amino-terminus direction has been altered to a lysine residue.

5. The method of any one of claims 1 to 3 wherein the endogenous 16th amino acid residue from said proline residue in a carboxy-terminus to amino-terminus direction has been altered to an alanine residue.

6. The method of any one of claims 1 to 3 wherein the endogenous 16th amino acid residue from said proline residue in a carboxy-terminus to amino-terminus direction has been altered to an arginine residue.

7. The method of any one of claims 1 to 3 wherein the endogenous 16th amino acid residue from said proline residue in a carboxy-terminus to amino-terminus direction has been altered to a histidine residue.

8. A method for creating a non-endogenous, constitutively active version of an endogenous human G protein coupled receptor (GPCR), said endogenous GPCR comprising a transmembrane 6 region and an intracellular loop 3 region, the method comprising:

(a) providing a polynucleotide, said polynucleotide encoding an endogenous human GPCR, said endogenous GPCR comprising a transmembrane 6 region and an intracellular loop 3 region, said transmembrane 6 region comprising a proline residue;

(b) identifying the codon of said polynucleotide corresponding to the endogenous 16th amino acid residue from said proline residue of said GPCR of step (a), in a carboxy-terminus to amino-terminus direction;

(c) altering said identified codon of step (b) to encode a non-endogenous amino acid residue, to provide a non-endogenous polynucleotide;

(d) expressing said non-endogenous polynucleotide in a host cell, thereby providing a non-endogenous version of the endogenous human GPCR; and

(e) determining if the non-endogenous version of the endogenous human GPCR of step (d) is constitutively active by measuring a difference in an intracellular signal measured for the non-endogenous version as compared with a signal induced by the endogenous GPCR.
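Steps (b) and (c) of this claim operate on codons rather than residues. The minimal Python sketch below assumes an in-frame coding sequence, a known 0-based codon index for the transmembrane 6 proline, and the same counting convention of 16 codons toward the 5' end with the proline codon not counted. The toy sequence and function name are hypothetical. The sketch does not check that the new codon encodes a different amino acid from the original, and the expression and measurement of steps (d) and (e) are not modelled.

```python
def alter_codon(cds: str, proline_codon_index: int, new_codon: str, offset: int = 16) -> str:
    """Replace the codon `offset` codons upstream of the proline codon."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    if len(new_codon) != 3:
        raise ValueError("new_codon must be three nucleotides long")
    p = 3 * proline_codon_index
    # All four proline codons begin with CC in the standard genetic code.
    if not cds[p:p + 3].upper().startswith("CC"):
        raise ValueError("codon at proline_codon_index does not encode proline")
    t = 3 * (proline_codon_index - offset)
    if t < 0:
        raise ValueError("sequence is too short upstream of the proline codon")
    if cds[t:t + 3].upper() == new_codon.upper():
        raise ValueError("replacement codon is identical to the original")
    return cds[:t] + new_codon + cds[t + 3:]


# Hypothetical toy coding sequence: proline codon at index 20, ACC (Thr) at index 4.
toy_cds = "GGC" * 4 + "ACC" + "GCC" * 13 + "TGG" + "CTG" + "CCC" + "TTC" * 3
mutant_cds = alter_codon(toy_cds, proline_codon_index=20, new_codon="AAA")  # AAA encodes lysine
print(toy_cds[12:15], "->", mutant_cds[12:15])  # prints ACC -> AAA
```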

9. The method of claim 8 wherein the amino acid residue that is two residues from said proline residue in the transmembrane 6 region, in a carboxy-terminus to amino-terminus direction, is tryptophan.

10. The method of claim 8 or claim 9 wherein said identified codon of step (b) has been altered to be a codon encoding lysine.

11. The method of claim 8 or claim 9 wherein said identified codon of step (b) has been altered to be a codon encoding alanine.

12. The method of claim 8 or claim 9 wherein said identified codon of step (b) has been altered to be a codon encoding arginine.

13. The method of claim 8 or claim 9 wherein said identified codon of step (b) has been altered to be a codon encoding histidine.

14. The method of claim 2 wherein the directly identified compound is an inverse agonist.

15. The method of claim 2 wherein the directly identified compound is an agonist.

16. The method of claim 2 wherein the directly identified compound is a partial agonist.

17. The method of claim 2, further comprising the step (g) of formulating the compound into a pharmaceutical composition.

Ansprüche

1. Verfahren zum Erzeugen einer nicht-endogenen, konstitutiv aktiven Version eines endogenen humanen G-Protein-gekoppelten Rezeptors (GPCR), wobei der endogene GPCR eine Transmembran-6-Region und eine intrazelluläre Schleifen-3-Region umfasst, wobei das Verfahren umfasst:

(a) Auswählen eines endogenen humanen GPCR, umfassend einen Prolinrest in der Transmembran-6-Region;

(b) Identifizieren des endogenen 16. Aminosäurerestes des Prolinrestes von Schritt (a) in Richtung vom Carboxyterminus zum Aminoterminus;

(c) Verändern des in Schritt (b) identifizierten Aminosäurerestes zu einem nicht-endogenen Aminosäurerest zum Erzeugen einer nicht-endogenen Version des endogenen humanen GPCR, und

(d) Feststellen, ob die nicht-endogene Version des endogenen humanen GPCR von Schritt (c) konstitutiv aktiv ist, indem ein Unterschied eines für die nicht-endogene Version gemessenen intrazellulären Signals im Vergleich zu einem von dem endogenen GPCR induzierten Signal gemessen wird.

2. Verfahren zum direkten Identifizieren einer Verbindung, die aus der Gruppe ausgewählt ist, welche aus einem inversen Agonisten, Agonisten und partiellen Agonisten eines nicht-endogenen, konstitutiv aktivierten humanen G-Protein-gekoppelten Rezeptors besteht, wobei der Rezeptor eine Transmembran-6-Region und eine intrazelluläre Schleifen-3-Region umfasst, wobei das Verfahren Schritt (a) bis (d) von Anspruch 1 umfasst und des weiteren folgende Schritte umfasst:

(e) In-Berührung-Bringen einer Kandidatenverbindung mit einem in Schritt (d) identifizierten nicht-endogenen, konstitutiv aktiven GPCR, und

(f) Feststellen durch Messen der Verbindungswirksamkeit an dem berührten Rezeptor, ob es sich bei der Verbindung um einen inversen Agonisten, Agonisten oder partiellen Agonisten des Rezeptors handelt.

3. The method of claim 1 wherein the amino acid residue that is two residues from the proline residue in the transmembrane 6 region, in a carboxy-terminus to amino-terminus direction, is tryptophan.

4. The method of any one of claims 1 to 3 wherein the endogenous 16th amino acid residue from the proline residue, in a carboxy-terminus to amino-terminus direction, has been altered to a lysine residue.

5. The method of any one of claims 1 to 3 wherein the endogenous 16th amino acid residue from the proline residue, in a carboxy-terminus to amino-terminus direction, has been altered to an alanine residue.

6. The method of any one of claims 1 to 3 wherein the endogenous 16th amino acid residue from the proline residue, in a carboxy-terminus to amino-terminus direction, has been altered to an arginine residue.

7. The method of any one of claims 1 to 3 wherein the endogenous 16th amino acid residue from the proline residue, in a carboxy-terminus to amino-terminus direction, has been altered to a histidine residue.

8. A method for creating a non-endogenous, constitutively active version of an endogenous human G protein-coupled receptor (GPCR), said endogenous GPCR comprising a transmembrane 6 region and an intracellular loop 3 region, the method comprising:

(a) providing a polynucleotide, said polynucleotide encoding an endogenous human GPCR, said endogenous GPCR comprising a transmembrane 6 region and an intracellular loop 3 region, said transmembrane 6 region comprising a proline residue;

(b) identifying the codon of the polynucleotide that corresponds to the endogenous 16th amino acid residue from the proline residue of the GPCR of step (a), in a carboxy-terminus to amino-terminus direction;

(c) altering the codon identified in step (b) to encode a non-endogenous amino acid residue, to provide a non-endogenous polynucleotide;

(d) expressing the non-endogenous polynucleotide in a host cell, thereby providing a non-endogenous version of the endogenous human GPCR; and

(e) determining whether the non-endogenous version of the endogenous human GPCR of step (d) is constitutively active by measuring a difference in an intracellular signal measured for the non-endogenous version as compared with a signal induced by the endogenous GPCR.

9. The method of claim 8 wherein the amino acid residue that is two residues from the proline residue in the transmembrane 6 region, in a carboxy-terminus to amino-terminus direction, is tryptophan.

10. The method of claim 8 or claim 9 wherein the codon identified in step (b) has been altered to be a codon encoding lysine.

11. The method of claim 8 or claim 9 wherein the codon identified in step (b) has been altered to be a codon encoding alanine.

12. The method of claim 8 or claim 9 wherein the codon identified in step (b) has been altered to be a codon encoding arginine.

13. The method of claim 8 or claim 9 wherein the codon identified in step (b) has been altered to be a codon encoding histidine.

14. The method of claim 2 wherein the directly identified compound is an inverse agonist.

15. The method of claim 2 wherein the directly identified compound is an agonist.

16. The method of claim 2 wherein the directly identified compound is a partial agonist.

17. The method of claim 2, further comprising the step (g) of formulating the compound into a pharmaceutical composition.
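
Steps (e) and (f) of claim 2 turn on comparing the signal of the constitutively active receptor with and without a candidate compound. A minimal sketch of that comparison follows. It rests on assumptions the claims do not state: that a higher signal means more receptor activation, that a reference maximal response (`max_response`) is available for the assay, and that a 10 % tolerance separates a real effect from noise. The function name, parameters and threshold are hypothetical.

```python
# Illustrative sketch only: names, the tolerance and the reference
# maximal response are assumptions, not part of the claimed method.
def classify_compound(basal, with_compound, max_response, tolerance=0.1):
    """Classify a candidate compound from three signal measurements.

    basal         -- signal of the constitutively active receptor alone
    with_compound -- signal after contacting the receptor with the compound
    max_response  -- assumed reference maximal response of the assay
    tolerance     -- fraction of the assay window treated as noise
    """
    window = max_response - basal
    if window <= 0:
        raise ValueError("max_response must exceed the basal signal")

    # Change caused by the compound, as a fraction of the assay window.
    change = (with_compound - basal) / window

    if change < -tolerance:
        return "inverse agonist"    # signal pushed below the basal level
    if change > 1 - tolerance:
        return "agonist"            # signal near the reference maximum
    if change > tolerance:
        return "partial agonist"    # signal raised, but below the maximum
    return "no measurable effect"
```

With a basal signal of 100 and an assumed maximum of 200, a compound giving a signal of 40 would be classed as an inverse agonist and one giving 150 as a partial agonist. For a readout that falls on receptor activation, the signs would have to be reversed.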

Claims (translated from the French)

1. A method for creating a non-endogenous, constitutively active version of an endogenous human G protein-coupled receptor (GPCR), said endogenous GPCR comprising a transmembrane 6 region and an intracellular loop 3 region, the method comprising:

(a) selecting an endogenous human GPCR comprising a proline residue in the transmembrane 6 region;

(b) identifying the endogenous 16th amino acid residue from the proline residue of step (a), in a carboxy-terminus to amino-terminus direction;

(c) altering the amino acid residue identified in step (b) to a non-endogenous amino acid residue to create a non-endogenous version of the endogenous human GPCR; and

(d) determining whether the non-endogenous version of the endogenous human GPCR of step (c) is constitutively active by measuring a difference in an intracellular signal measured for the non-endogenous version as compared with a signal induced by the endogenous GPCR.

2. A method for directly identifying a compound selected from the group consisting of an inverse agonist, an agonist and a partial agonist of an activated G protein-coupled receptor, said receptor comprising a transmembrane 6 region and an intracellular loop 3 region, the method comprising steps (a) to (d) of claim 1 and further comprising the steps of:

(e) contacting a candidate compound with a non-endogenous, constitutively active GPCR identified in step (d); and

(f) determining, by measuring the efficacy of the compound at the contacted receptor, whether said compound is an inverse agonist, agonist or partial agonist of said receptor.

3. The method of claim 1 wherein the amino acid residue that is two residues from said proline residue in the transmembrane 6 region, in a carboxy-terminus to amino-terminus direction, is tryptophan.

4. The method of any one of claims 1 to 3 wherein the endogenous 16th amino acid residue from said proline residue, in a carboxy-terminus to amino-terminus direction, has been altered to a lysine residue.

5. The method of any one of claims 1 to 3 wherein the endogenous 16th amino acid residue from said proline residue, in a carboxy-terminus to amino-terminus direction, has been altered to an alanine residue.

6. The method of any one of claims 1 to 3 wherein the endogenous 16th amino acid residue from said proline residue, in a carboxy-terminus to amino-terminus direction, has been altered to an arginine residue.

7. The method of any one of claims 1 to 3 wherein the endogenous 16th amino acid residue from said proline residue, in a carboxy-terminus to amino-terminus direction, has been altered to a histidine residue.

8. A method for creating a non-endogenous, constitutively active version of an endogenous human G protein-coupled receptor (GPCR), said endogenous GPCR comprising a transmembrane 6 region and an intracellular loop 3 region, the method comprising:

(a) providing a polynucleotide, said polynucleotide encoding an endogenous human GPCR, said endogenous GPCR comprising a transmembrane 6 region and an intracellular loop 3 region, said transmembrane 6 region comprising a proline residue;

(b) identifying the codon of said polynucleotide that corresponds to the endogenous 16th amino acid residue from said proline residue of the GPCR of step (a), in a carboxy-terminus to amino-terminus direction;

(c) altering said identified codon of step (b) to encode a non-endogenous amino acid residue, to provide a non-endogenous polynucleotide;

(d) expressing said non-endogenous polynucleotide in a host cell, thereby providing a non-endogenous version of the endogenous human GPCR; and

(e) determining whether the non-endogenous version of the endogenous human GPCR of step (d) is constitutively active by measuring a difference in an intracellular signal measured for the non-endogenous version as compared with a signal induced by the endogenous GPCR.

9. The method of claim 8 wherein the amino acid residue that is two residues from said proline residue in the transmembrane 6 region, in a carboxy-terminus to amino-terminus direction, is tryptophan.

10. The method of claim 8 or claim 9 wherein said identified codon of step (b) has been altered to be a codon encoding lysine.

11. The method of claim 8 or claim 9 wherein said identified codon of step (b) has been altered to be a codon encoding alanine.

12. The method of claim 8 or claim 9 wherein said identified codon of step (b) has been altered to be a codon encoding arginine.

13. The method of claim 8 or claim 9 wherein said identified codon of step (b) has been altered to be a codon encoding histidine.

14. The method of claim 2 wherein the directly identified compound is an inverse agonist.

15. The method of claim 2 wherein the directly identified compound is an agonist.

16. The method of claim 2 wherein the directly identified compound is a partial agonist.

17. The method of claim 2, further comprising the step (g) of formulating the compound into a pharmaceutical composition.
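
At the protein level, steps (a) to (c) of claim 1, together with the tryptophan condition of claim 3, amount to locating a W-x-P pattern in the transmembrane 6 region and substituting the residue sixteen positions N-terminal to the proline. The sketch below illustrates this on a one-letter amino acid string. The function names, the 1-based region bounds and the mutation label format are assumptions made for illustration, as is the convention that the residue immediately N-terminal to the proline counts as the first residue from it.

```python
# Illustrative sketch only: function names and conventions are assumptions.
def find_tm6_proline(seq, tm6_start, tm6_end):
    """Return the 1-based position of the first TM6 proline that has a
    tryptophan two residues towards the amino terminus (W-x-P).

    tm6_start, tm6_end -- assumed 1-based, inclusive bounds of TM6
    """
    for pos in range(max(tm6_start, 3), min(tm6_end, len(seq)) + 1):
        if seq[pos - 1] == "P" and seq[pos - 3] == "W":
            return pos
    raise ValueError("no W-x-P proline found in the given TM6 region")


def mutate_16th(seq, proline_pos, new_residue, offset=16):
    """Substitute the residue `offset` positions N-terminal to the proline.

    Returns the altered sequence and a label such as "E2K".
    """
    target = proline_pos - offset
    if target < 1:
        raise ValueError("fewer than `offset` residues precede the proline")
    old = seq[target - 1]
    if old == new_residue:
        # The replacement must be a non-endogenous residue.
        raise ValueError("replacement equals the endogenous residue")
    altered = seq[:target - 1] + new_residue + seq[target:]
    return altered, f"{old}{target}{new_residue}"
```

For a toy sequence with tryptophan at position 16 and proline at position 18, `find_tm6_proline` returns 18 and `mutate_16th(seq, 18, "K")` alters residue 2. Whether the altered receptor is constitutively active still has to be determined experimentally, as in step (d).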

Cited references

REFERENCES CITED IN THE DESCRIPTION

This list of references cited by the applicant is for the reader's convenience only. It does not form part of the European patent document. Even though great care has been taken in compiling the references, errors or omissions cannot be excluded and the EPO disclaims all liability in this regard.

Non-patent literature cited in the description

Indirect Mechanisms of Synaptic Transmission. From Neuron To Brain. Sinauer Associates, Inc., 1992 [0061]