(19)
(11) EP 2 518 166 A2

(12) EUROPEAN PATENT APPLICATION

(43) Date of publication:
31.10.2012 Bulletin 2012/44

(21) Application number: 12166440.3

(22) Date of filing: 18.05.2006
(51) International Patent Classification (IPC): 
C12Q 1/68(2006.01)
G01N 33/53(2006.01)
(84) Designated Contracting States:
AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

(30) Priority: 20.05.2005 US 683173 P

(62) Application number of the earlier application in accordance with Art. 76 EPC:
06770766.1 / 1888785

(71) Applicant: Veridex, LLC
Warren, NJ 07059 (US)

(72) Inventors:
  • Jiang, Yuqiu
    San Diego, CA 92130 (US)
  • Backus, John W.
    Ontario, NY 14519 (US)
  • Mazumder, Abhijit
    Basking Ridge, NJ 07970 (US)
  • Chowdary, Dondapati
    Princeton Junction, NJ 08550 (US)
  • Yang, Fei
    San Diego, CA 92130 (US)
  • Wang, Yixin
    San Diego, CA 92130 (US)
  • Jatkoe, Timothy
    San Diego, CA 92122 (US)

(74) Representative: Goodfellow, Hugh Robin 
Carpmaels & Ransford One Southampton Row
London WC1B 5HA (GB)

 
Remarks:
The complete document including Reference Tables and the Sequence Listing can be downloaded from the EPO website
Remarks:
This application was filed on 02-05-2012 as a divisional application to the application mentioned under INID code 62.
Remarks:
Claims filed after the date of filing of the application/after the date of receipt of the divisional application (Rule 68(4) EPC).
 


(54) Thyroid fine needle aspiration molecular assay


(57) The present invention relates to methods, compositions and articles directed to diagnosing thyroid carcinoma, differentiating between thyroid carcinoma and benign thyroid diseases, testing indeterminate thyroid fine needle aspirate samples of thyroid nodules, and determining patient protocols and outcomes.


Description

BACKGROUND OF THE INVENTION



[0001] There are approximately 25,600 new cases of thyroid carcinoma diagnosed in the United States each year, and 1,400 patients will die of the disease. About 75% of all thyroid cancers belong to the papillary thyroid carcinoma type. The rest consist of 10% follicular carcinoma, 5% to 9% medullary thyroid cancer, 1% to 2% anaplastic cancer, 1% to 3% lymphoma, and less than 1% sarcoma and other rare tumors. Usually a lump (nodule) in the thyroid is the first sign of thyroid cancer. There are 10 to 18 million people in the US with a single thyroid nodule, and approximately 490,000 nodules become clinically apparent each year. Fortunately, only about 5% of these nodules are cancerous.

[0002] The commonly used method for thyroid cancer diagnosis is fine needle aspiration (FNA) biopsy. FNA samples are examined cytologically to determine whether the nodules are benign or cancerous. The sensitivity and specificity of FNA range from 68% to 98% and from 72% to 100%, respectively, depending on the institution and the physician. Unfortunately, in 25% of cases the specimens are either inadequate for diagnosis or indeterminate by cytology. In current medical practice, patients with indeterminate results are sent to surgery, with the consequence that only 25% have cancer and 75% undergo unnecessary surgery. A molecular assay with high sensitivity and better specificity (higher than 25%) would greatly improve the current diagnostic accuracy for thyroid cancer and avoid unnecessary surgery for non-cancerous patients.

[0003] Comparative genomic hybridization (CGH), serial analysis of gene expression (SAGE), and DNA microarrays have been used to identify genetic events occurring in thyroid cancers, such as loss of heterozygosity, up- and down-regulation of genes, and genetic rearrangements. A genetic rearrangement involving PAX8 and PPARγ has been demonstrated to be associated with follicular thyroid cancer (FTC). Rearrangement of the ret proto-oncogene is related to papillary thyroid cancer (PTC). Down-regulation of the thyroid peroxidase (TPO) gene is observed in both FTC and PTC. Galectin-3 was reported to be a candidate marker to differentiate malignant thyroid neoplasms from benign lesions. However, other studies demonstrate that Galectin-3 is not a cancer-specific marker. Many genes purported to be useful in thyroid cancer diagnosis lack the sensitivity and specificity required for an accurate molecular assay.

SUMMARY OF THE INVENTION



[0004] The present invention encompasses methods of diagnosing thyroid cancer by obtaining a biological sample from a patient; and measuring the expression levels in the sample of genes selected from the group consisting of those encoding mRNA: corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or corresponding to SEQ ID NOs: 199, 207, 255 and 354; or recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25; where the gene expression levels above or below pre-determined cut-off levels are indicative of thyroid cancer.

[0005] The present invention encompasses methods of differentiating between thyroid carcinoma and benign thyroid diseases by obtaining a sample from a patient; and measuring the expression levels in the sample of genes selected from the group consisting of those encoding mRNA: corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or corresponding to SEQ ID NOs: 199, 207, 255 and 354; or recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25; where the gene expression levels above or below pre-determined cut-off levels are indicative of thyroid carcinoma.

[0006] The present invention encompasses methods of testing indeterminate thyroid fine needle aspirate (FNA) thyroid nodule samples by: obtaining a sample from a patient; and measuring the expression levels in the sample of genes selected from the group consisting of those encoding mRNA: corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or corresponding to SEQ ID NOs: 199, 207, 255 and 354; or recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25; where the gene expression levels above or below pre-determined cut-off levels are indicative of thyroid cancer.

[0007] The present invention encompasses methods of determining thyroid cancer patient treatment protocol by: obtaining a biological sample from a thyroid cancer patient; and measuring the expression levels in the sample of genes selected from the group consisting of those encoding mRNA: corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or corresponding to SEQ ID NOs: 199, 207, 255 and 354; or recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25; where the gene expression levels above or below pre-determined cut-off levels are sufficiently indicative of cancer to enable a physician to determine the type of surgery and/or therapy recommended to treat the disease.

[0008] The present invention encompasses methods of treating a thyroid cancer patient by obtaining a biological sample from a thyroid cancer patient; and measuring the expression levels in the sample of genes selected from the group consisting of those encoding mRNA: corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or corresponding to SEQ ID NOs: 199, 207, 255 and 354; or recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25; where the gene expression levels above or below pre-determined cut-off levels are indicative of cancer; and treating the patient with a thyroidectomy if they are cancer positive.

[0009] The present invention encompasses methods of cross validating a gene expression profile for thyroid carcinoma patients by: a. obtaining gene expression data from a statistically significant number of patient biological samples; b. randomizing sample order; c. setting aside data from about 10% - 50% of samples; d. computing, for the remaining samples, for factor of interest on all variables and selecting variables that meet a p-value cutoff (p); e. selecting variables that fit a prediction model using a forward search and evaluating the training error until it hits a predetermined error rate; f. testing the prediction model on the left-out 10% - 50% of samples; g. repeating steps c. - g. with a new set of samples removed; and h. continuing steps c. - g. until 100% of samples have been tested and recording classification performance.
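The cross-validation steps a. to h. above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the p-value is a normal approximation for a difference in means, the prediction model is a nearest-centroid classifier standing in for the unspecified model, class labels are 1 (cancer positive) and 0 (cancer negative), and all function names and default thresholds are hypothetical.

```python
import random
from statistics import NormalDist, mean, variance

def p_value(a, b):
    # Two-sided p-value for a difference in means (normal approximation; an assumption).
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    if se == 0:
        return 1.0 if mean(a) == mean(b) else 0.0
    return 2 * (1 - NormalDist().cdf(abs(mean(a) - mean(b)) / se))

def fit(X, y, genes):
    # Nearest-centroid model: a hypothetical stand-in for the prediction model.
    return {c: [mean(x[g] for x, l in zip(X, y) if l == c) for g in genes]
            for c in set(y)}

def predict(model, genes, x):
    return min(model, key=lambda c: sum((x[g] - m) ** 2
                                        for g, m in zip(genes, model[c])))

def error_rate(X, y, genes):
    model = fit(X, y, genes)
    return sum(predict(model, genes, x) != l for x, l in zip(X, y)) / len(y)

def forward_search(X, y, candidates, target_error):
    # Step e: add one variable at a time until the training error target is met.
    chosen, remaining = [], list(candidates)
    while remaining:
        best = min(remaining, key=lambda g: error_rate(X, y, chosen + [g]))
        chosen.append(best)
        remaining.remove(best)
        if error_rate(X, y, chosen) <= target_error:
            break
    return chosen

def cross_validate(X, y, holdout=0.2, p_cut=0.05, target_error=0.05, seed=0):
    # Assumes every training split keeps at least two samples of each class.
    order = list(range(len(y)))
    random.Random(seed).shuffle(order)                 # step b
    fold = max(1, round(holdout * len(y)))
    correct = 0
    for start in range(0, len(order), fold):           # steps g and h
        test = set(order[start:start + fold])          # step c
        Xt = [X[i] for i in order if i not in test]
        yt = [y[i] for i in order if i not in test]
        pvals = {}
        for g in range(len(X[0])):                     # step d
            pos = [x[g] for x, l in zip(Xt, yt) if l == 1]
            neg = [x[g] for x, l in zip(Xt, yt) if l == 0]
            pvals[g] = p_value(pos, neg)
        # Fall back to the single best variable if none meets the cutoff.
        candidates = [g for g in pvals if pvals[g] < p_cut] or [min(pvals, key=pvals.get)]
        genes = forward_search(Xt, yt, candidates, target_error)
        model = fit(Xt, yt, genes)
        correct += sum(predict(model, genes, X[i]) == y[i] for i in test)  # step f
    return correct / len(y)                            # overall classification accuracy
```

Calling `cross_validate(X, y)` with `X` as a list of per-sample expression vectors and `y` as the matching list of 0/1 labels returns the fraction of left-out samples classified correctly across all rounds.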

[0010] The present invention encompasses methods of independently validating a gene expression profile and gene profiles obtained thereby for thyroid carcinoma patients by obtaining gene expression data from a statistically significant number of patient biological samples; normalizing the source variabilities in the gene expression data; computing for factor of interest on all variables that were selected previously; and testing the prediction model on the sample and recording classification performance.

[0011] The present invention encompasses a method of generating a posterior probability score to enable diagnosis of thyroid carcinoma patients by: obtaining gene expression data from a statistically significant number of patient biological samples; applying linear discrimination analysis to the data to obtain selected genes; and applying weighted expression levels to the selected genes with discriminant function factor to obtain a prediction model that can be applied as a posterior probability score.

[0012] The present invention encompasses methods of generating a thyroid carcinoma prognostic patient report and reports obtained thereby, by obtaining a biological sample from the patient; measuring gene expression of the sample; applying a posterior probability thereto; and using the results obtained thereby to generate the report.

[0013] The present invention encompasses compositions containing at least one probe set selected from the group consisting of: SEQ ID NOs: 36, 53, 73, 211 and 242; and/or SEQ ID NOs: 199, 207, 255 and 354; or the psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25.

[0014] The present invention encompasses kits for conducting an assay to determine thyroid carcinoma diagnosis in a biological sample containing: materials for detecting isolated nucleic acid sequences, their complements, or portions thereof of a combination of genes selected from the group consisting of those encoding mRNA: corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or corresponding to SEQ ID NOs: 199, 207, 255 and 354; or recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25.

[0015] The present invention encompasses articles for assessing thyroid carcinoma status containing: materials for detecting isolated nucleic acid sequences, their complements, or portions thereof of a combination of genes selected from the group consisting of those encoding mRNA: corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or corresponding to SEQ ID NOs: 199, 207, 255 and 354; or recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25.

[0016] The present invention encompasses microarrays or gene chips for performing the methods provided herein.

[0017] The present invention encompasses diagnostic/prognostic portfolios containing isolated nucleic acid sequences, their complements, or portions thereof of a combination of genes selected from the group consisting of those encoding mRNA: corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or corresponding to SEQ ID NOs: 199, 207, 255 and 354; or recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25 where the combination is sufficient to characterize thyroid carcinoma status or risk of relapse in a biological sample.

BRIEF DESCRIPTION OF THE DRAWINGS



[0018] 

Figure 1 is an ROC curve of the LOOCV of the 4-gene signature in 98 training samples.

Figure 2 is an ROC curve of the 5-gene signature in 98 training samples.

Figure 3a is an ROC curve of the 4-gene signature in 74 independent validation samples; 3b is an ROC curve of the 5-gene signature in 74 independent validation samples.

Figure 4a is an ROC curve of the 4-gene signature that is normalized to the three-thyroid control genes; 4b is an ROC curve of the 5-gene signature that is normalized to the three-thyroid control genes.

Figure 5a is an ROC curve of the 4-gene signature with one-round amplification in 47 thyroid samples; 5b is an ROC curve of the 4-gene signature with two-round amplification in 47 thyroid samples; 5c is an ROC curve of the 5-gene signature with one-round amplification in 47 thyroid samples; 5d is an ROC curve of the 5-gene signature with two-round amplification in 47 thyroid samples.

Figures 6a and 6b depict the ROC curves for cross validation with the 83 independent fresh frozen thyroid samples.

Figures 7a and 7b depict the ROC curves for signature validation with the 47 fine needle aspirate (FNA) thyroid samples.

Figures 8a and 8b depict the ROC curves for signature performance in 28 paired fresh frozen and FNA thyroid samples.


DETAILED DESCRIPTION



[0019] In this study the goal was to identify signatures that can be used in assays such as DNA chip-based assay to differentiate thyroid carcinomas from benign thyroid diseases. 31 primary papillary thyroid tumors, 21 follicular thyroid cancers, 33 follicular adenoma samples, and 13 benign thyroid diseases were analyzed by using the Affymetrix human U133A Gene Chip. Comparison of gene expression profiles between thyroid cancers and benign tissues has enabled us to identify two signatures: a 5-gene signature identified by percentile analysis and manual selection, and a 4-gene signature selected by Linear Discrimination Analysis (LDA) approach. These two signatures have the performance of sensitivity/specificity 92%/70% and 92%/61%, respectively, and have been validated in 74 independent thyroid samples. The results presented herein demonstrate that these candidate signatures facilitate the diagnosis of thyroid cancers with better sensitivity and specificity than currently available diagnostic procedures. These two signatures are suitable for use in testing indeterminate FNA samples.
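The sensitivity and specificity figures quoted above follow the usual definitions: sensitivity is the fraction of true cancers called positive, and specificity is the fraction of benign samples called negative. A minimal sketch is given below; the function names and the counts in the usage note are illustrative assumptions and are not data from this study.

```python
def sensitivity(true_pos, false_neg):
    # Fraction of cancer samples that the signature calls positive.
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    # Fraction of benign samples that the signature calls negative.
    return true_neg / (true_neg + false_pos)
```

For example, with hypothetical counts of 46 true positives and 4 false negatives, `sensitivity(46, 4)` gives 0.92; with 35 true negatives and 15 false positives, `specificity(35, 15)` gives 0.70.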

[0020] By performing gene profiling on 98 representative thyroid benign and tumor samples on Affymetrix U133A chips, we have selected two gene signatures, a 5-gene signature and a 4-gene signature, for a thyroid FNA molecular assay. Signatures were selected to achieve the best sensitivity of the assay, at close to 95%. Except for fibronectin and thyroid peroxidase, the other seven genes from the two signatures have not been implicated previously in thyroid tumorigenesis. Both signatures have been validated with 74 independent thyroid samples, and achieved performance equivalent to that in the 98 training samples. The performances of the two gene signatures are 92% sensitivity and 70%/61% specificity, respectively. When these two signatures are normalized to the specific thyroid control genes, performance improves relative to that of the non-normalized signatures. Furthermore, the signatures performed equivalently with two different target preparations, namely one-round amplification and two-round amplification. This validation is extremely important for thyroid assays that use FNA samples, which usually contain limited numbers of thyroid cells.

[0021] The mere presence or absence of particular nucleic acid sequences in a tissue sample has only rarely been found to have diagnostic or prognostic value. Information about the expression of various proteins, peptides or mRNA, on the other hand, is increasingly viewed as important. The mere presence of nucleic acid sequences having the potential to express proteins, peptides, or mRNA (such sequences referred to as "genes") within the genome by itself is not determinative of whether a protein, peptide, or mRNA is expressed in a given cell. Whether or not a given gene capable of expressing proteins, peptides, or mRNA does so and to what extent such expression occurs, if at all, is determined by a variety of complex factors. Irrespective of difficulties in understanding and assessing these factors, assaying gene expression can provide useful information about the occurrence of important events such as tumorigenesis, metastasis, apoptosis, and other clinically relevant phenomena. Relative indications of the degree to which genes are active or inactive can be found in gene expression profiles. The gene expression profiles of this invention are used to diagnose thyroid cancer and to treat patients for it.

[0022] Sample preparation requires the collection of patient samples. Patient samples used in the inventive method are those that are suspected of containing diseased cells such as cells taken from a nodule in a fine needle aspirate (FNA) of thyroid tissue. Bulk tissue preparation obtained from a biopsy or a surgical specimen and laser capture microdissection are also suitable for use. Laser Capture Microdissection (LCM) technology is one way to select the cells to be studied, minimizing variability caused by cell type heterogeneity. Consequently, moderate or small changes in gene expression between normal or benign and cancerous cells can be readily detected. Samples can also comprise circulating epithelial cells extracted from peripheral blood. These can be obtained according to a number of methods but the most preferred method is the magnetic separation technique described in U.S. Patent 6,136,182. Once the sample containing the cells of interest has been obtained, RNA is extracted and amplified and a gene expression profile is obtained, preferably via microarray, for genes in the appropriate portfolios.

[0023] The present invention encompasses methods of diagnosing thyroid cancer by obtaining a biological sample from a patient; and measuring the expression levels in the sample of genes from those encoding mRNA: corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or corresponding to SEQ ID NOs: 199, 207, 255 and 354; or recognized specifically by the probe sets from psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or recognized specifically by the probe sets from psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25; where the gene expression levels above or below pre-determined cut-off levels are indicative of thyroid cancer.

[0024] The present invention encompasses methods of differentiating between thyroid carcinoma and benign thyroid diseases by obtaining a sample from a patient; and measuring the expression levels in the sample of genes from those encoding mRNA: corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or corresponding to SEQ ID NOs: 199, 207, 255 and 354; or recognized specifically by the probe sets from psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or recognized specifically by the probe sets from psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25; where the gene expression levels above or below pre-determined cut-off levels are indicative of thyroid carcinoma.

[0025] The present invention encompasses methods of testing indeterminate thyroid fine needle aspirate (FNA) thyroid nodule samples by: obtaining a sample from a patient; and measuring the expression levels in the sample of genes from those encoding mRNA: corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or corresponding to SEQ ID NOs: 199, 207, 255 and 354; or recognized specifically by the probe sets from psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or recognized specifically by the probe sets from psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25; where the gene expression levels above or below pre-determined cut-off levels are indicative of thyroid cancer.

[0026] The present invention encompasses methods of determining thyroid cancer patient treatment protocol by: obtaining a biological sample from a thyroid cancer patient; and measuring the expression levels in the sample of genes from those encoding mRNA: corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or corresponding to SEQ ID NOs: 199, 207, 255 and 354; or recognized specifically by the probe sets from psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or recognized specifically by the probe sets from psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25; where the gene expression levels above or below pre-determined cut-off levels are sufficiently indicative of cancer to enable a physician to determine the type of surgery and/or therapy recommended to treat the disease.

[0027] The present invention encompasses methods of treating a thyroid cancer patient by obtaining a biological sample from a thyroid cancer patient; and measuring the expression levels in the sample of genes from those encoding mRNA: corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or corresponding to SEQ ID NOs: 199, 207, 255 and 354; or recognized specifically by the probe sets from psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or recognized specifically by the probe sets from psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25; where the gene expression levels above or below pre-determined cut-off levels are indicative of cancer; and treating the patient with thyroidectomy if they are cancer positive.

[0028] The SEQ ID NOs in the above methods can be 36, 53, 73, 211 and 242 or 199, 207, 255 and 354, or 45, 215, 65, 29, 190, 199, 207, 255 and 354.

[0029] The invention also encompasses the above methods containing the steps of further measuring the expression level of at least one gene encoding mRNA: corresponding to SEQ ID NOs: 142, 219 and 309; and/or corresponding to SEQ ID NOs: 9, 12 and 18; or recognized specifically by the probe sets from psids corresponding to SEQ ID NOs: 130, 190 and 276 as depicted in Table 25; and/or recognized specifically by the probe sets from psids corresponding to SEQ ID NOs: 9, 12 and 18 as depicted in Table 25. The invention also encompasses the above methods containing the steps of further measuring the expression level of at least one gene constitutively expressed in the sample.

[0030] Cadherin 3, type 1 (SEQ ID NO: 53) is mentioned in US20030194406; US 20050037439; and US 20040137539. Fibronectin (SEQ ID NO: 242) is mentioned in US6436642 and US20030104419. Secretory granule, neuroendocrine protein 1 (SEQ ID NO: 76) is mentioned in US20030232350; and US20040002067. Testican-1 (SEQ ID NO: 36) is mentioned in US20030108963; and US20050037463. Thyroid peroxidase (SEQ ID NO: 211) is mentioned in US6066449, US20030118553; US20030054571; WO9102061; and WO9856953. Chemokine C (C-C) motif ligand 18 (SEQ ID NO: 354) is mentioned in WO2005005601 and US20020114806. Pulmonary surfactant-associated protein B (SEQ ID NO: 355) is mentioned in US20030219760; and US20030232350. K+ channel beta subunit (SEQ ID NO: 207) is mentioned in US20030096782; and US 20020168638. Putative prostate cancer suppressor (SEQ ID NO: 178) is mentioned in WO2005020784. Bone marrow stromal cell antigen 1 (SEQ ID NO: 142) is mentioned in WO2004040014; and WO2005020784. Leucocyte immunoglobulin-like receptor-6b (SEQ ID NO: 219) is mentioned in US20030060614. Bridging integrator 2 (SEQ ID NO: 309) is mentioned in EP1393776; WO02057414; WO0116158 and US6831063. Cysteine-rich, angiogenic inducer, 61 (SEQ ID NO: 9) is mentioned in WO2004030615; and WO9733995. Selenoprotein P, Plasma 1 (SEQ ID NO: 12) is mentioned in US20040241653 and WO2005015236. Insulin-like growth factor-binding protein 4 (SEQ ID NO: 18) is mentioned in WO2005015236; WO9203469; WO9203152; and EP0546053.

[0031] In this invention, the most preferred method for analyzing the gene expression pattern of a patient in the methods provided herein is through the use of a linear discrimination analysis program. The present invention encompasses a method of generating a posterior probability score to enable diagnosis of thyroid carcinoma patients by: obtaining gene expression data from a statistically significant number of patient biological samples; applying linear discrimination analysis to the data to obtain selected genes; and applying weighted expression levels to the selected genes with discriminant function factor to obtain a prediction model that can be applied as a posterior probability score. Other analytical tools, such as logistic regression and neural network approaches, can also be used to answer the same question.

[0032] For instance, the following can be used for linear discriminant analysis:



where,
I(psid) = The log base 2 intensity of the probe set enclosed in parenthesis.
d(CP) = The discriminant function for the cancer positive class
d(CN) = The discriminant function for the cancer negative class
P(CP) = The posterior p-value for the cancer positive class
P(CN) = The posterior p-value for the cancer negative class
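As an illustration of how quantities of the kind defined above relate to one another, the sketch below assumes the standard linear discriminant form, in which each class discriminant is an intercept plus a weighted sum of log base 2 probe set intensities, and the posterior value for a class is its exponentiated discriminant divided by the sum over both classes. The intercepts, weights and probe set identifiers are hypothetical placeholders and are not values from this application.

```python
import math

def discriminant(intercept, weights, log2_intensity):
    # d(class) = intercept + sum of weight * I(psid) over the signature probe sets.
    return intercept + sum(w * log2_intensity[psid] for psid, w in weights.items())

def posterior(d_cp, d_cn):
    # P(CP) and P(CN) from the two discriminant values; subtracting the
    # maximum first keeps the exponentials numerically stable.
    m = max(d_cp, d_cn)
    e_cp, e_cn = math.exp(d_cp - m), math.exp(d_cn - m)
    total = e_cp + e_cn
    return e_cp / total, e_cn / total

# Hypothetical usage: "psid_A" and "psid_B" and all numbers are placeholders.
sample = {"psid_A": 9.1, "psid_B": 6.4}
d_cp = discriminant(-3.0, {"psid_A": 0.8, "psid_B": -0.5}, sample)
d_cn = discriminant(-1.0, {"psid_A": 0.2, "psid_B": 0.1}, sample)
p_cp, p_cn = posterior(d_cp, d_cn)
```

Under this form a sample would be called cancer positive when `p_cp` exceeds a chosen cut-off; the cut-off itself is a design choice and is not fixed by the sketch.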

[0033] Numerous other well-known methods of pattern recognition are available. The following references provide some examples: Weighted Voting: Golub et al. (1999); Support Vector Machines: Su et al. (2001); and Ramaswamy et al. (2001); K-nearest Neighbors: Ramaswamy (2001); and Correlation Coefficients: van't Veer et al. (2002).

[0034] Preferably, portfolios are established such that the combination of genes in the portfolio exhibits improved sensitivity and specificity relative to individual genes or randomly selected combinations of genes. In the context of the instant invention, the sensitivity of the portfolio can be reflected in the fold differences exhibited by a gene's expression in the diseased state relative to the normal state. Specificity can be reflected in statistical measurements of the correlation of the signaling of gene expression with the condition of interest. For example, standard deviation can be used as such a measurement. In considering a group of genes for inclusion in a portfolio, a small standard deviation in expression measurements correlates with greater specificity. Other measurements of variation such as correlation coefficients can also be used in this capacity. The invention also encompasses the above methods where the specificity is at least about 40%, at least about 50% and at least about 60%. The invention also encompasses the above methods where the sensitivity is at least about 90% and at least about 92%.

[0035] The invention also encompasses the above methods where the comparison of expression patterns is conducted with pattern recognition methods. One method of the invention involves comparing gene expression profiles for various genes (or portfolios) to ascribe diagnoses. The gene expression profiles of each of the genes comprising the portfolio are fixed in a medium such as a computer readable medium. This can take a number of forms. For example, a table can be established into which the range of signals (e.g., intensity measurements) indicative of disease is input. Actual patient data can then be compared to the values in the table to determine whether the patient samples are normal, benign or diseased. In a more sophisticated embodiment, patterns of the expression signals (e.g., fluorescent intensity) are recorded digitally or graphically. The gene expression patterns from the gene portfolios used in conjunction with patient samples are then compared to the expression patterns.

[0036] Pattern comparison software can then be used to determine whether the patient samples have a pattern indicative of the disease. Of course, these comparisons can also be used to determine whether the patient is not likely to experience the disease. The expression profiles of the samples are then compared to the portfolio of a control cell. If the sample expression patterns are consistent with the expression pattern for cancer then (in the absence of countervailing medical considerations) the patient is treated as one would treat a thyroid cancer patient. If the sample expression patterns are consistent with the expression pattern from the normal/control cell then the patient is diagnosed negative for cancer.
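The simpler, table-based comparison described above can be sketched as a lookup of each measured signal against a stored range. This is only an illustration: the table layout, the function name, the probe set identifiers and the ranges are all hypothetical.

```python
def probes_in_disease_range(table, sample):
    # table maps a probe set id to the (low, high) signal range indicative of disease;
    # sample maps a probe set id to the measured signal for one patient.
    return [psid for psid, (low, high) in table.items()
            if psid in sample and low <= sample[psid] <= high]

# Hypothetical usage with placeholder identifiers and ranges.
disease_table = {"psid_A": (8.0, 12.0), "psid_B": (0.0, 5.0)}
patient = {"psid_A": 9.5, "psid_B": 6.1}
flagged = probes_in_disease_range(disease_table, patient)
```

Here `flagged` lists the probe sets whose signal falls inside the disease-indicative range; how many must be flagged before a sample is called positive is left open.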

[0037] Preferably, levels of up and down regulation are distinguished based on fold changes of the intensity measurements of hybridized microarray probes. A 1.5 fold difference is preferred for making such distinctions (or a p-value less than 0.05). That is, before a gene is said to be differentially expressed in diseased versus normal cells, the diseased cell is found to yield at least about 1.5 times more, or 1.5 times less intensity than the normal cells. The greater the fold difference, the more preferred is use of the gene as a diagnostic or prognostic tool. Genes selected for the gene expression profiles of this invention have expression levels that result in the generation of a signal that is distinguishable from those of the normal or non-modulated genes by an amount that exceeds background using clinical laboratory instrumentation.
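The 1.5-fold criterion above can be expressed as a short check. The sketch assumes that the fold change is taken as the ratio of mean raw (non-log) intensities in diseased versus normal samples; the function names are illustrative.

```python
from statistics import mean

def fold_change(diseased, normal):
    # Ratio of mean intensity in diseased samples to mean intensity in normal samples.
    return mean(diseased) / mean(normal)

def is_differential(diseased, normal, threshold=1.5):
    # True when intensity is at least `threshold` times higher or lower in disease.
    fc = fold_change(diseased, normal)
    return fc >= threshold or fc <= 1 / threshold
```

For instance, `is_differential([400, 440], [200, 220])` is true (2-fold up), while `is_differential([210, 230], [200, 220])` is false.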

[0038] Statistical values can be used to confidently distinguish modulated from non-modulated genes and noise. Statistical tests find the genes most significantly different between diverse groups of samples. The Student's t-test is an example of a robust statistical test that can be used to find significant differences between two groups. The lower the p-value, the more compelling the evidence that the gene is showing a difference between the different groups. Nevertheless, since microarrays measure more than one gene at a time, tens of thousands of statistical tests may be performed at one time. Because of this, one is likely to see small p-values just by chance, and adjustments for this using a Sidak correction as well as a randomization/permutation experiment can be made. A p-value less than 0.05 by the t-test is evidence that the gene is significantly different. More compelling evidence is a p-value less than 0.05 after the Sidak correction is factored in. For a large number of samples in each group, a p-value less than 0.05 after the randomization/permutation test is the most compelling evidence of a significant difference.
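The two adjustments named above can be sketched briefly. The Sidak correction converts a single-test p-value p into 1 - (1 - p)^m for m tests; the permutation test below uses the absolute difference in group means as its statistic, which is an assumption, as are the function names and the default number of permutations.

```python
import random
from statistics import mean

def sidak(p, n_tests):
    # Sidak-adjusted p-value for one test among n_tests independent tests.
    return 1 - (1 - p) ** n_tests

def permutation_p(a, b, n_perm=2000, seed=0):
    # Fraction of label shufflings whose mean difference is at least as
    # large as the observed one (with the usual +1 correction).
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(mean(pooled[:len(a)]) - mean(pooled[len(a):])) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

With 1,000 tests, an unadjusted p-value of 0.001 becomes roughly 0.63 after `sidak(0.001, 1000)`, which illustrates why small p-values alone are weak evidence on a microarray.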

[0039] The present invention encompasses microarrays or gene chips for performing the methods provided herein. The microarrays can contain isolated nucleic acid sequences, their complements, or portions thereof of a combination of genes from those encoding mRNA: corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or corresponding to SEQ ID NOs: 199, 207, 255 and 354; or recognized specifically by the probe sets from psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or recognized specifically by the probe sets from psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25 where the combination is sufficient to characterize thyroid carcinoma or risk of relapse in a biological sample. The microarray preferably measures or characterizes at least about 1.5-fold over- or under-expression or provides a statistically significant p-value over- or under-expression. Preferably, the p-value is less than 0.05. Preferably, the microarray contains a cDNA array or an oligonucleotide array and may contain one or more internal control reagents. One preferred internal control reagent is a method of detecting PAX8 gene expression which can be measured using SEQ ID NOs: 409-411.

[0040] Preferably, an oligonucleotide in the array corresponds to the 3' non-coding region of the gene the expression of which is being measured.

[0041] Another parameter that can be used to select genes that generate a signal that is greater than that of the non-modulated gene or noise is the use of a measurement of absolute signal difference. Preferably, the signal generated by the modulated gene expression is at least 20% different than those of the normal or non-modulated gene (on an absolute basis). It is even more preferred that such genes produce expression patterns that are at least 30% different than those of normal or non-modulated genes.

[0042] Preferred methods for establishing gene expression profiles include determining the amount of RNA that is produced by a gene that can code for a protein or peptide. This is accomplished by reverse transcriptase PCR (RT-PCR), competitive RT-PCR, real time RT-PCR, differential display RT-PCR, Northern Blot analysis and other related tests. While it is possible to conduct these techniques using individual PCR reactions, it is best to amplify complementary DNA (cDNA) or complementary RNA (cRNA) produced from mRNA and analyze it via microarray. A number of different array configurations and methods for their production are known to those of skill in the art and are described in U.S. Patents such as: 5,445,934; 5,532,128; 5,556,752; 5,242,974; 5,384,261; 5,405,783; 5,412,087; 5,424,186; 5,429,807; 5,436,327; 5,472,672; 5,527,681; 5,529,756; 5,545,531; 5,554,501; 5,561,071; 5,571,639; 5,593,839; 5,599,695; 5,624,711; 5,658,734; and 5,700,637.

[0043] Microarray technology allows for the measurement of the steady-state mRNA level of thousands of genes simultaneously, thereby presenting a powerful tool for identifying effects such as the onset, arrest, or modulation of uncontrolled cell proliferation. Two microarray technologies are currently in wide use: cDNA arrays and oligonucleotide arrays. Although differences exist in the construction of these chips, essentially all downstream data analysis and output are the same. The products of these analyses are typically measurements of the intensity of the signal received from a labeled probe used to detect a cDNA sequence from the sample that hybridizes to a nucleic acid sequence at a known location on the microarray. Typically, the intensity of the signal is proportional to the quantity of cDNA, and thus mRNA, expressed in the sample cells. A large number of such techniques are available and useful. Preferred methods for determining gene expression can be found in US Patents 6,271,002; 6,218,122; 6,218,114; and 6,004,755.

[0044] Analysis of the expression levels is conducted by comparing such signal intensities. This is best done by generating a ratio matrix of the expression intensities of genes in a test sample versus those in a control sample. For instance, the gene expression intensities from a diseased tissue can be compared with the expression intensities generated from benign or normal tissue of the same type. A ratio of these expression intensities indicates the fold-change in gene expression between the test and control samples.
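A ratio matrix of the kind described above could be built as in the following sketch. The data layout (dictionaries keyed by hypothetical sample and gene names) and the function name are assumptions for illustration; rows are genes and columns are test samples.

```python
def ratio_matrix(test_samples, control_profile):
    """Build a gene-by-sample matrix of test/control intensity ratios.

    test_samples:    dict mapping sample name -> dict of gene -> intensity
    control_profile: dict mapping gene -> control (benign or normal) intensity
    Returns (genes, samples, matrix), where matrix[i][j] is the fold change
    of genes[i] in samples[j] relative to the control.
    """
    genes = sorted(control_profile)
    samples = sorted(test_samples)
    matrix = [
        [test_samples[s][g] / control_profile[g] for s in samples]
        for g in genes
    ]
    return genes, samples, matrix
```

A ratio above one in this matrix corresponds to up-regulation in the test sample and a ratio below one to down-regulation.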

[0045] Gene expression profiles can also be displayed in a number of ways. The most common method is to arrange raw fluorescence intensities or a ratio matrix into a graphical dendrogram where columns indicate test samples and rows indicate genes. The data are arranged so genes that have similar expression profiles are proximal to each other. The expression ratio for each gene is visualized as a color. For example, a ratio less than one (indicating down-regulation) may appear in the blue portion of the spectrum while a ratio greater than one (indicating up-regulation) may appear as a color in the red portion of the spectrum. Computer software programs are commercially available to display such data, including "GENESPRING" from Silicon Genetics, Inc. and "DISCOVERY" and "INFER" software from Partek, Inc.

[0046] Modulated genes used in the methods of the invention are described in the Examples. The genes that are differentially expressed are either up regulated or down regulated in patients with thyroid cancer relative to those with benign thyroid diseases. Up regulation and down regulation are relative terms meaning that a detectable difference (beyond the contribution of noise in the system used to measure it) is found in the amount of expression of the genes relative to some baseline. In this case, the baseline is the measured gene expression of a benign disease patient. The genes of interest in the diseased cells are then either up regulated or down regulated relative to the baseline level using the same measurement method. Diseased, in this context, refers to an alteration of the state of a body that interrupts or disturbs, or has the potential to disturb, proper performance of bodily functions as occurs with the uncontrolled proliferation of cells. Someone is diagnosed with a disease when some aspect of that person's genotype or phenotype is consistent with the presence of the disease. However, the act of conducting a diagnosis or prognosis includes the determination of disease/status issues such as determining the likelihood of relapse, type of therapy and therapy monitoring. In therapy monitoring, clinical judgments are made regarding the effect of a given course of therapy by comparing the expression of genes over time to determine whether the gene expression profiles have changed or are changing to patterns more consistent with normal tissue.

[0047] Genes can be grouped so that information obtained about the set of genes in the group provides a sound basis for making a clinically relevant judgment such as a diagnosis, prognosis, or treatment choice. These sets of genes make up the portfolios of the invention. As with most diagnostic markers, it is often desirable to use the fewest number of markers sufficient to make a correct medical judgment. This prevents a delay in treatment pending further analysis as well as unproductive use of time and resources.

[0048] One method of establishing gene expression portfolios is through the use of optimization algorithms such as the mean variance algorithm widely used in establishing stock portfolios. This method is described in detail in US patent publication number 20030194734. Essentially, the method calls for the establishment of a set of inputs (stocks in financial applications, expression as measured by intensity here) that will optimize the return (e.g., signal that is generated) one receives for using it while minimizing the variability of the return. Many commercial software programs are available to conduct such operations. "Wagner Associates Mean-Variance Optimization Application," referred to as "Wagner Software" throughout this specification, is preferred. This software uses functions from the "Wagner Associates Mean-Variance Optimization Library" to determine an efficient frontier and optimal portfolios in the Markowitz sense. Use of this type of software requires that microarray data be transformed so that it can be treated as an input in the way stock return and risk measurements are used when the software is used for its intended financial analysis purposes.
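The mean-variance idea can be illustrated with the two quantities a Markowitz-style optimizer trades off: the weighted mean "return" and the variance of that return. The sketch below is not the Wagner Software and does not reproduce the data transformation referred to above; the per-sample signal values, gene names and weights are hypothetical inputs chosen for this example.

```python
from statistics import mean


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def portfolio_stats(weights, signals):
    """Return (expected signal, variance) for a weighted gene portfolio.

    weights: dict mapping gene -> portfolio weight (assumed to sum to 1)
    signals: dict mapping gene -> list of per-sample signal values
    """
    genes = list(weights)
    expected = sum(weights[g] * mean(signals[g]) for g in genes)
    variance = sum(
        weights[g] * weights[h] * covariance(signals[g], signals[h])
        for g in genes
        for h in genes
    )
    return expected, variance
```

In this sketch, two genes whose signals vary in opposite directions across samples give a combined portfolio with lower variance than either gene alone, which is the property an efficient-frontier search exploits.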

[0049] The process of selecting a portfolio can also include the application of heuristic rules. Preferably, such rules are formulated based on biology and an understanding of the technology used to produce clinical results. More preferably, they are applied to output from the optimization method. For example, the mean variance method of portfolio selection can be applied to microarray data for a number of genes differentially expressed in subjects with cancer. Output from the method would be an optimized set of genes that could include some genes that are expressed in peripheral blood as well as in diseased tissue. If samples used in the testing method are obtained from peripheral blood and certain genes differentially expressed in instances of cancer could also be differentially expressed in peripheral blood, then a heuristic rule can be applied in which a portfolio is selected from the efficient frontier excluding those that are differentially expressed in peripheral blood. Of course, the rule can be applied prior to the formation of the efficient frontier by, for example, applying the rule during data pre-selection.
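The peripheral-blood exclusion rule described above amounts to a filter over candidate portfolios. The following is a minimal sketch; the dictionary layout and the gene names used to exercise it are hypothetical.

```python
def apply_exclusion_rule(candidate_portfolios, excluded_genes):
    """Keep only portfolios that contain none of the excluded genes.

    candidate_portfolios: list of dicts, each with a "genes" list
                          (e.g. portfolios taken from the efficient frontier)
    excluded_genes:       genes also differentially expressed in peripheral blood
    """
    excluded = set(excluded_genes)
    return [p for p in candidate_portfolios if not excluded & set(p["genes"])]
```

Applying the same set of excluded genes to the input gene list before optimization corresponds to the pre-selection variant mentioned at the end of the paragraph.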

[0050] Other heuristic rules can be applied that are not necessarily related to the biology in question. For example, one can apply a rule that only a prescribed percentage of the portfolio can be represented by a particular gene or group of genes. Commercially available software such as the Wagner Software readily accommodates these types of heuristics. This can be useful, for example, when factors other than accuracy and precision (e.g., anticipated licensing fees) have an impact on the desirability of including one or more genes.

[0051] The gene expression profiles of this invention can also be used in conjunction with other non-genetic diagnostic methods useful in cancer diagnosis, prognosis, or treatment monitoring. For example, in some circumstances it is beneficial to combine the diagnostic power of the gene expression based methods described above with data from conventional markers such as serum protein markers (e.g., Cancer Antigen 27.29 ("CA 27.29")). A range of such markers exists including such analytes as CA 27.29. In one such method, blood is periodically taken from a treated patient and then subjected to an enzyme immunoassay for one of the serum markers described above. When the concentration of the marker suggests the return of tumors or failure of therapy, a sample source amenable to gene expression analysis is taken. Where a suspicious mass exists, a fine needle aspirate (FNA) is taken and gene expression profiles of cells taken from the mass are then analyzed as described above. Alternatively, tissue samples may be taken from areas adjacent to the tissue from which a tumor was previously removed. This approach can be particularly useful when other testing produces ambiguous results.

[0052] The present invention encompasses methods of cross validating a gene expression profile and the profiles thus obtained, for thyroid carcinoma patients by: a. obtaining gene expression data from a statistically significant number of patient biological samples; b. randomizing sample order; c. setting aside data from about 10% - 50% of samples; d. computing, for the remaining samples, for factor of interest on all variables and selecting variables that meet a p-value cutoff (p); e. selecting variables that fit a prediction model using a forward search and evaluating the training error until it hits a predetermined error rate; f. testing the prediction model on the left-out 10-50% of samples; g. repeating steps c., -g. with a new set of samples removed; and h. continuing steps c) -g) until 100% of samples have been tested and record classification performance. In this method, preferably, the gene expression data obtained in step h. is represented by genes from those encoding mRNA: corresponding to SEQ ID NOs: 1, 4, 7, 8, 10-11, 13-17, 19-24, 26-27, 29-31, 33-35, 37-38, 40-52, 54-72, 75-82, 84-135, 138-141, 144-151, 153-159, 161-162, 164, 166-173, 176-198, 200-201, 203-206, 208-209, 212-213, 215-218, 220-221, 223, 227-233, 235-241, 243-244, 246-249, 251, 253-254, 256-263, 265-289, 291-293, 295-308, 310-331, 333-341, 343-345, 347-348, 350-353 and 355-363; or recognized specifically by the probe sets from psids in Table 25 corresponding to SEQ ID NOs: 1, 4, 7, 8, 10-11, 13-17, 19-24, 26-27, 29-31, 33-35, 37-38, 40-52, 54-72, 75-82, 84-135, 138-141, 144-151, 153-159, 161-162, 164, 166-173, 176-198, 200-201, 203-206, 208-209, 212-213, 215-218, 220-221, 223, 227-233, 235-241, 243-244, 246-249, 251, 253-254, 256-263, 265-289, 291-293, 295-308, 310-331, 333-341, 343-345, 347-348, 350-353 and 355-363.
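Steps b. through h. of the cross-validation procedure can be sketched as below. Several choices here are assumptions for illustration and are not taken from the specification: variables are ranked by the absolute difference of class means instead of a p-value cutoff, the prediction model is a nearest-centroid classifier, and the held-out fraction is fixed by `n_folds`. The sketch assumes two classes, both present in every training split.

```python
import random
from statistics import mean


def rank_features(train, n_features):
    """Stand-in for step d: rank variables by absolute difference of class means."""
    labels = sorted({label for _, label in train})
    group_a = [x for x, label in train if label == labels[0]]
    group_b = [x for x, label in train if label == labels[1]]

    def score(f):
        return abs(mean(x[f] for x in group_a) - mean(x[f] for x in group_b))

    return sorted(range(n_features), key=score, reverse=True)


def nearest_centroid(train, features):
    """Fit class centroids on the chosen features and return a predict function."""
    centroids = {}
    for label in sorted({lab for _, lab in train}):
        rows = [x for x, lab in train if lab == label]
        centroids[label] = [mean(x[f] for x in rows) for f in features]

    def predict(x):
        def distance(label):
            return sum((x[f] - c) ** 2 for f, c in zip(features, centroids[label]))
        return min(centroids, key=distance)

    return predict


def cross_validate(samples, n_folds=5, max_error=0.0, seed=0):
    """samples: list of (feature_vector, label). Returns overall accuracy."""
    data = list(samples)
    random.Random(seed).shuffle(data)                    # step b: randomize order
    n_features = len(data[0][0])
    correct = 0
    for k in range(n_folds):                             # steps g and h: rotate folds
        held_out = data[k::n_folds]                      # step c: set aside samples
        train = [s for i, s in enumerate(data) if i % n_folds != k]
        ranked = rank_features(train, n_features)        # step d: variable selection
        features = []
        for f in ranked:                                 # step e: forward search
            features.append(f)
            predict = nearest_centroid(train, list(features))
            errors = sum(predict(x) != label for x, label in train)
            if errors / len(train) <= max_error:
                break
        correct += sum(predict(x) == label for x, label in held_out)  # step f
    return correct / len(data)
```

With `n_folds=5`, each pass sets aside 20% of the samples, which falls within the 10% - 50% range given in step c, and every sample is tested exactly once.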

[0053] The present invention encompasses methods of independently validating a gene expression profile and the profiles thus obtained, for thyroid cancer patients by obtaining gene expression data from a statistically significant number of patient biological samples; normalizing the source variabilities in the gene expression data; computing for factor of interest on all variables that were selected previously; and testing the prediction model on the sample and record classification performance. In this method, preferably, the gene expression data obtained in step d. is represented by genes from those encoding mRNA: corresponding to SEQ ID NOs: 1, 4, 7, 8, 10-11, 13-17, 19-24, 26-27, 29-31, 33-35, 37-38, 40-52, 54-72, 75-82, 84-135, 138-141, 144-151, 153-159, 161-162, 164, 166-173, 176-198, 200-201, 203-206, 208-209, 212-213, 215-218, 220-221, 223, 227-233, 235-241, 243-244, 246-249, 251, 253-254, 256-263, 265-289, 291-293, 295-308, 310-331, 333-341, 343-345, 347-348, 350-353 and 355-363; or recognized specifically by the probe sets from psids in Table 25 corresponding to SEQ ID NOs: 1, 4, 7, 8, 10-11, 13-17, 19-24, 26-27, 29-31, 33-35, 37-38, 40-52, 54-72, 75-82, 84-135, 138-141, 144-151, 153-159, 161-162, 164, 166-173, 176-198, 200-201, 203-206, 208-209, 212-213, 215-218, 220-221, 223, 227-233, 235-241, 243-244, 246-249, 251, 253-254, 256-263, 265-289, 291-293, 295-308, 310-331, 333-341, 343-345, 347-348, 350-353 and 355-363.

[0054] The present invention encompasses methods of generating a posterior probability to enable diagnosis of thyroid carcinoma patients by obtaining gene expression data from a statistically significant number of patient biological samples; applying linear discrimination analysis to the data to obtain selected genes; applying weighted expression levels to the selected genes with discriminate function factor to obtain a prediction model that can be applied as a posterior probability score. For instance, the following can be used for Linear Discriminant Analysis:




where,
I(psid) = The log base 2 intensity of the probe set enclosed in parenthesis.
d(CP) = The discriminant function for the cancer positive class
d(CN) = The discriminant function for the cancer negative class
P(CP) = The posterior p-value for the cancer positive class
P(CN) = The posterior p-value for the cancer negative class
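The equation itself is not reproduced in this text. As a hedged illustration of how the quantities defined above relate in a standard two-class linear discriminant analysis, the sketch below computes each discriminant function as a constant plus a weighted sum of log base 2 intensities and converts the pair into posterior values. The probe set names, weights and constants are hypothetical placeholders, and any class priors are assumed to be folded into the constants.

```python
import math


def discriminant(log2_intensities, weights, constant):
    """Linear discriminant score: constant + sum of weight * I(psid)."""
    return constant + sum(weights[p] * log2_intensities[p] for p in weights)


def posterior_cancer_positive(d_cp, d_cn):
    """P(CP) = exp(d(CP)) / (exp(d(CP)) + exp(d(CN))), using a single exponential."""
    return 1.0 / (1.0 + math.exp(d_cn - d_cp))


# Hypothetical example values, not taken from the specification
intensities = {"psid_1": 8.0, "psid_2": 6.5}
d_cp = discriminant(intensities, {"psid_1": 0.9, "psid_2": -0.4}, -3.0)
d_cn = discriminant(intensities, {"psid_1": 0.2, "psid_2": 0.3}, -1.0)
p_cp = posterior_cancer_positive(d_cp, d_cn)
p_cn = 1.0 - p_cp
```

When the two discriminant scores are equal, the posterior for each class is 0.5; a larger d(CP) relative to d(CN) pushes P(CP) toward 1.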

[0055] The present invention encompasses methods of generating a thyroid carcinoma diagnostic patient report and reports obtained thereby, by obtaining a biological sample from the patient; measuring gene expression of the sample; applying a posterior probability score thereto; and using the results obtained thereby to generate the report. The report can also contain an assessment of patient outcome and/or probability of risk relative to the patient population.

[0056] The present invention encompasses compositions containing at least one probe set from: SEQ ID NOs: 36, 53, 73, 211 and 242; and/or SEQ ID NOs: 199, 207, 255 and 354; or the psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25.

[0057] The present invention encompasses kits for conducting an assay to determine thyroid carcinoma diagnosis in a biological sample containing: materials for detecting isolated nucleic acid sequences, their complements, or portions thereof of a combination of genes from those encoding mRNA: corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or corresponding to SEQ ID NOs: 199, 207, 255 and 354; or recognized specifically by the probe sets from psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or recognized specifically by the probe sets from psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25. The SEQ ID NOs. can be 36, 53, 73, 211 and 242; 199, 207, 255 and 354; or 45, 215, 65, 29, 190, 199, 207, 255 and 354.

[0058] Kits made according to the invention include formatted assays for determining the gene expression profiles. These can include all or some of the materials needed to conduct the assays such as reagents and instructions and a medium through which nucleic acid sequences, their complements, or portions thereof are assayed.

[0059] Articles of this invention include representations of the gene expression profiles useful for treating, diagnosing, prognosticating, and otherwise assessing diseases. These profile representations are reduced to a medium that can be automatically read by a machine such as computer readable media (magnetic, optical, and the like). The articles can also include instructions for assessing the gene expression profiles in such media. For example, the articles may comprise a CD ROM having computer instructions for comparing gene expression profiles of the portfolios of genes described above. The articles may also have gene expression profiles digitally recorded therein so that they may be compared with gene expression data from patient samples. Alternatively, the profiles can be recorded in different representational format. A graphical recordation is one such format. Clustering algorithms such as those incorporated in "DISCOVERY" and "INFER" software from Partek, Inc. mentioned above can best assist in the visualization of such data.

[0060] Different types of articles of manufacture according to the invention are media or formatted assays used to reveal gene expression profiles. These can comprise, for example, microarrays in which sequence complements or probes are affixed to a matrix to which the sequences indicative of the genes of interest combine creating a readable determinant of their presence. Alternatively, articles according to the invention can be fashioned into reagent kits for conducting hybridization, amplification, and signal generation indicative of the level of expression of the genes of interest for detecting cancer.

[0061] The present invention encompasses articles for assessing thyroid carcinoma status containing: materials for detecting isolated nucleic acid sequences, their complements, or portions thereof of a combination of genes from those encoding mRNA: corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or corresponding to SEQ ID NOs: 199, 207, 255 and 354; or recognized specifically by the probe sets from psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or recognized specifically by the probe sets from psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25. The SEQ ID NOs. can be 36, 53, 73, 211 and 242; 199, 207, 255 and 354; or 45, 215, 65, 29, 190, 199, 207, 255 and 354.

[0062] The present invention encompasses diagnostic/prognostic portfolios containing isolated nucleic acid sequences, their complements, or portions thereof of a combination of genes from those encoding mRNA: corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or corresponding to SEQ ID NOs: 199, 207, 255 and 354; or recognized specifically by the probe sets from psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or recognized specifically by the probe sets from psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25 where the combination is sufficient to characterize thyroid carcinoma status or risk of relapse in a biological sample. Preferably, the portfolio measures or characterizes at least about 1.5-fold over- or under-expression or provides a statistically significant p-value over- or under-expression. Preferably, the p-value is less than 0.05.

[0063] The invention further provides the following numbered embodiments:
  1. A method of diagnosing thyroid cancer comprising the steps:
    a. obtaining a biological sample from a patient; and
    b. measuring the expression levels in the sample of genes selected from the group consisting of those encoding mRNA:
      i. corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or
      ii. corresponding to SEQ ID NOs: 199, 207, 255 and 354; or
      iii. recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or
      iv. recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25
      wherein the gene expression levels above or below pre-determined cut-off levels are indicative of thyroid cancer.
  2. A method of differentiating between thyroid carcinoma and benign thyroid diseases comprising the steps:
    a. obtaining a sample from a patient; and
    b. measuring the expression levels in the sample of genes selected from the group consisting of those encoding mRNA:
      i. corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or
      ii. corresponding to SEQ ID NOs: 199, 207, 255 and 354; or
      iii. recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or
      iv. recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25
      wherein the gene expression levels above or below pre-determined cut-off levels are indicative of thyroid carcinoma.
  3. A method of testing indeterminate thyroid fine needle aspirate (FNA) thyroid nodule samples comprising the steps:
    a. obtaining a sample from a patient; and
    b. measuring the expression levels in the sample of genes selected from the group consisting of those encoding mRNA:
      i. corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or
      ii. corresponding to SEQ ID NOs: 199, 207, 255 and 354; or
      iii. recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or
      iv. recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25
      wherein the gene expression levels above or below pre-determined cut-off levels are indicative of thyroid cancer.
  4. A method of determining thyroid cancer patient treatment protocol comprising the steps:
    a. obtaining a biological sample from a thyroid cancer patient; and
    b. measuring the expression levels in the sample of genes selected from the group consisting of those encoding mRNA:
      i. corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or
      ii. corresponding to SEQ ID NOs: 199, 207, 255 and 354; or
      iii. recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or
      iv. recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25
      wherein the gene expression levels above or below pre-determined cut-off levels are sufficiently indicative of cancer to enable a physician to determine the type of surgery and/or therapy recommended to treat the disease.
  5. A method of treating a thyroid cancer patient comprising the steps:
    a. obtaining a biological sample from a thyroid cancer patient; and
    b. measuring the expression levels in the sample of genes selected from the group consisting of those encoding mRNA:
      i. corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or
      ii. corresponding to SEQ ID NOs: 199, 207, 255 and 354; or
      iii. recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or
      iv. recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25
      wherein the gene expression levels above or below pre-determined cut-off levels are indicative of cancer; and
    c. treating the patient with thyroidectomy if they are cancer positive.
  6. The method of one of embodiments 1-5 wherein the SEQ ID NOs. are 36, 53, 73, 211 and 242.
  7. The method of one of embodiments 1-5 wherein the SEQ ID NOs. are 199, 207, 255 and 354.
  8. The method of one of embodiments 1-5 wherein the SEQ ID NOs. are 45, 215, 65, 29, 190, 199, 207, 255 and 354.
  9. The method of one of embodiments 1-5 wherein the sample is prepared by a method selected from the group consisting of fine needle aspiration, bulk tissue preparation and laser capture microdissection.
  10. The method of embodiment 9 wherein the bulk tissue preparation is obtained from a biopsy or a surgical specimen.
  11. The method of one of embodiments 1-5 further comprising measuring the expression level of at least one gene encoding mRNA:
    a. corresponding to SEQ ID NOs: 142, 219 and 309; and/or
    b. corresponding to SEQ ID NOs: 9, 12 and 18; or
    c. recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 130, 190 and 276 as depicted in Table 25; and/or
    d. recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 9, 12 and 18 as depicted in Table 25.
  12. The method of one of embodiments 1-5 further comprising measuring the expression level of at least one gene constitutively expressed in the sample.
  13. The method of one of embodiments 1-5 wherein the specificity is at least about 40%.
  14. The method of one of embodiments 1-5 wherein the specificity is at least about 50%.
  15. The method of one of embodiments 1-5 wherein the specificity is at least about 60%.
  16. The method of one of embodiments 1-5 wherein the sensitivity is at least about 90%.
  17. The method of one of embodiments 1-5 wherein the sensitivity is at least about 92%.
  18. The method of one of embodiments 1-5 wherein the comparison of expression patterns is conducted with pattern recognition methods.
  19. The method of embodiment 18 wherein the pattern recognition methods include the use of a Cox proportional hazards analysis.
  20. The method of one of embodiments 1-5 wherein the pre-determined cut-off levels are at least about 1.5-fold over- or under- expression in the sample relative to benign cells or normal tissue.
  21. The method of one of embodiments 1-5 wherein the pre-determined cut-off levels have at least a statistically significant p-value over-expression in the sample having thyroid carcinoma cells relative to benign cells or normal tissue.
  22. The method of embodiment 21 wherein the p-value is less than about 0.05.
  23. The method of one of embodiments 1-5 wherein gene expression is measured on a microarray or gene chip.
  24. The method of embodiment 23 wherein the microarray is a cDNA array or an oligonucleotide array.
  25. The method of embodiment 24 wherein the microarray or gene chip further comprises one or more internal control reagents.
  26. The method of one of embodiments 1-5 wherein gene expression is determined by nucleic acid amplification conducted by polymerase chain reaction (PCR) of RNA extracted from the sample.
  27. The method of embodiment 26 wherein said PCR is reverse transcription polymerase chain reaction (RT-PCR).
  28. The method of embodiment 27, wherein the RT-PCR further comprises one or more internal control reagents.
  29. The method of embodiment 28, wherein the internal control reagent is a method of detecting PAX8 gene expression.
  30. The method of embodiment 29, wherein PAX8 gene expression is measured using SEQ ID NOs: 409-411.
  31. The method of one of embodiments 1-5 wherein gene expression is detected by measuring or detecting a protein encoded by the gene.
  32. The method of embodiment 31 wherein the protein is detected by an antibody specific to the protein.
  33. The method of one of embodiments 1-5 wherein gene expression is detected by measuring a characteristic of the gene.
  34. The method of embodiment 33 wherein the characteristic measured is selected from the group consisting of DNA amplification, methylation, mutation and allelic variation.
  35. A method of cross validating a gene expression profile for thyroid carcinoma patients comprising the steps:
    a. obtaining gene expression data from a statistically significant number of patient biological samples;
    b. randomizing sample order;
    c. setting aside data from about 10% - 50% of samples;
    d. computing, for the remaining samples, for factor of interest on all variables and selecting variables that meet a p-value cutoff (p);
    e. selecting variables that fit a prediction model using a forward search and evaluating the training error until it hits a predetermined error rate;
    f. testing the prediction model on the left-out 10-50% of samples;
    g. repeating steps c. to g. with a new set of samples removed; and
    h. continuing steps c. to g. until 100% of samples have been tested and recording classification performance.
  36. The method according to embodiment 35 wherein the gene expression data obtained in step h. is represented by genes selected from the group consisting of those encoding mRNA:
    a. corresponding to SEQ ID NOs: 1, 4, 7, 8, 10-11, 13-17, 19-24, 26-27, 29-31, 33-35, 37-38, 40-52, 54-72, 75-82, 84-135, 138-141, 144-151, 153-159, 161-162, 164, 166-173, 176-198, 200-201, 203-206, 208-209, 212-213, 215-218, 220-221, 223, 227-233, 235-241, 243-244, 246-249, 251, 253-254, 256-263, 265-289, 291-293, 295-308, 310-331, 333-341, 343-345, 347-348, 350-353 and 355-363; or
    b. recognized specifically by the probe sets selected from the group consisting of psids in Table 25 corresponding to SEQ ID NOs: 1, 4, 7, 8, 10-11, 13-17, 19-24, 26-27, 29-31, 33-35, 37-38, 40-52, 54-72, 75-82, 84-135, 138-141, 144-151, 153-159, 161-162, 164, 166-173, 176-198, 200-201, 203-206, 208-209, 212-213, 215-218, 220-221, 223, 227-233, 235-241, 243-244, 246-249, 251, 253-254, 256-263, 265-289, 291-293, 295-308, 310-331, 333-341, 343-345, 347-348, 350-353 and 355-363.
  37. 37. A method of independently validating a gene expression profile for thyroid carcinoma patients comprising the steps:

    a. obtaining gene expression data from a statistically significant number of patient biological samples;

    b. normalizing the source variabilities in the gene expression data;

    c. computing for factor of interest on all variables that were selected previously; and

    d. testing the prediction model on the sample and recording classification performance.

  38. 38. The method according to embodiment 37 wherein the gene expression data obtained in step d. is represented by genes selected from the group consisting of those encoding mRNA:
    1. a. corresponding to SEQ ID NOs: 1, 4, 7, 8, 10-11, 13-17, 19-24, 26-27, 29-31, 33-35, 37-38, 40-52, 54-72, 75-82, 84-135, 138-141, 144-151, 153-159, 161-162, 164, 166-173, 176-198, 200-201, 203-206, 208-209, 212-213, 215-218, 220-221, 223, 227-233, 235-241, 243-244, 246-249, 251, 253-254, 256-263, 265-289, 291-293, 295-308, 310-331, 333-341, 343-345, 347-348, 350-353 and 355-363; or
    2. b. recognized specifically by the probe sets selected from the group consisting of psids in Table 25 corresponding to SEQ ID NOs: 1, 4, 7, 8, 10-11, 13-17, 19-24, 26-27, 29-31, 33-35, 37-38, 40-52, 54-72, 75-82, 84-135, 138-141, 144-151, 153-159, 161-162, 164, 166-173, 176-198, 200-201, 203-206, 208-209, 212-213, 215-218, 220-221, 223, 227-233, 235-241, 243-244, 246-249, 251, 253-254, 256-263, 265-289, 291-293, 295-308, 310-331, 333-341, 343-345, 347-348, 350-353 and 355-363.
  39. 39. A gene profile obtained by the method according to embodiment 37 or 38.
  40. 40. A method of generating a posterior probability score to enable diagnosis of thyroid carcinoma patients comprising the steps:
    1. a. obtaining gene expression data from a statistically significant number of patient biological samples;
    2. b. applying linear discriminant analysis to the data to obtain selected genes;
    3. c. applying weighted expression levels to the selected genes with discriminant function factor to obtain a prediction model that can be applied as a posterior probability score.
  41. 41. The method according to embodiment 40, wherein the linear discriminant analysis is calculated using the equation:




    where,
    I(psid) = The log base 2 intensity of the probe set enclosed in parentheses.
    d(CP) = The discriminant function for the cancer positive class
    d(CN) = The discriminant function for the cancer negative class
    P(CP) = The posterior p-value for the cancer positive class
    P(CN) = The posterior p-value for the cancer negative class
  42. 42. A method of generating a thyroid carcinoma prognostic patient report comprising the steps:

    a. obtaining a biological sample from the patient;

    b. measuring gene expression of the sample;

    c. applying a Relapse Hazard Score to the results of step b.; and

    d. using the results obtained in step c. to generate the report.

  43. 43. The method of embodiment 42 wherein the report contains an assessment of patient outcome and/or probability of risk relative to the patient population.
  44. 44. A patient report generated by the method according to embodiment 42.
  45. 45. A composition comprising at least one probe set selected from the group consisting of: SEQ ID NOs: 36, 53, 73, 211 and 242; and/or SEQ ID NOs: 199, 207, 255 and 354; or the psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25.
  46. 46. A kit for conducting an assay to determine thyroid carcinoma prognosis in a biological sample comprising: materials for detecting isolated nucleic acid sequences, their complements, or portions thereof of a combination of genes selected from the group consisting of those encoding mRNA:
    1. a. corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or
    2. b. corresponding to SEQ ID NOs: 199, 207, 255 and 354; or
    3. c. recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or
    4. d. recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25.
  47. 47. The kit of embodiment 46 wherein the SEQ ID NOs. are 36, 53, 73, 211 and 242.
  48. 48. The kit of embodiment 46 wherein the SEQ ID NOs. are 199, 207, 255 and 354.
  49. 49. The kit of embodiment 46 wherein the SEQ ID NOs. are 45, 215, 65, 29, 190, 199, 207, 255 and 354.
  50. 50. The kit of embodiment 46 further comprising reagents for conducting a microarray analysis.
  51. 51. The kit of embodiment 46 further comprising a medium through which said nucleic acid sequences, their complements, or portions thereof are assayed.
  52. 52. Articles for assessing thyroid carcinoma status comprising: materials for detecting isolated nucleic acid sequences, their complements, or portions thereof of a combination of genes selected from the group consisting of those encoding mRNA:
    1. a. corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or
    2. b. corresponding to SEQ ID NOs: 199, 207, 255 and 354; or
    3. c. recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or
    4. d. recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25.
  53. 53. The articles of embodiment 52 wherein the SEQ ID NOs. are 36, 53, 73, 211 and 242.
  54. 54. The articles of embodiment 52 wherein the SEQ ID NOs. are 199, 207, 255 and 354.
  55. 55. The articles of embodiment 52 wherein the SEQ ID NOs. are 45, 215, 65, 29, 190, 199, 207, 255 and 354.
  56. 56. The articles of embodiment 52 further comprising reagents for conducting a microarray analysis.
  57. 57. The articles of embodiment 52 further comprising a medium through which said nucleic acid sequences, their complements, or portions thereof are assayed.
  58. 58. A microarray or gene chip for performing the method of one of embodiments 1-5.
  59. 59. The microarray of embodiment 58 comprising isolated nucleic acid sequences, their complements, or portions thereof of a combination of genes selected from the group consisting of those encoding mRNA:
    1. a. corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or
    2. b. corresponding to SEQ ID NOs: 199, 207, 255 and 354; or
    3. c. recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or
    4. d. recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25
      where the combination is sufficient to characterize thyroid carcinoma or risk of relapse in a biological sample.
  60. 60. The microarray of embodiment 59 wherein the measurement or characterization is at least about 1.5-fold over- or under-expression.
  61. 61. The microarray of embodiment 59 wherein the measurement provides a statistically significant p-value over- or under-expression.
  62. 62. The microarray of embodiment 59 wherein the p-value is less than about 0.05.
  63. 63. The microarray of embodiment 59 comprising a cDNA array or an oligonucleotide array.
  64. 64. The microarray of embodiment 59 further comprising one or more internal control reagents.
  65. 65. A diagnostic/prognostic portfolio comprising isolated nucleic acid sequences, their complements, or portions thereof of a combination of genes selected from the group consisting of those encoding mRNA:
    1. a. corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242; and/or
    2. b. corresponding to SEQ ID NOs: 199, 207, 255 and 354; or
    3. c. recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 36, 53, 73, 211 and 242 as depicted in Table 25; and/or
    4. d. recognized specifically by the probe sets selected from the group consisting of psids corresponding to SEQ ID NOs: 199, 207, 255 and 354 as depicted in Table 25
      where the combination is sufficient to characterize thyroid carcinoma status or risk of relapse in a biological sample.
  66. 66. The portfolio of embodiment 65 wherein the measurement or characterization is at least about 1.5-fold over- or under-expression.
  67. 67. The portfolio of embodiment 65 wherein the measurement provides a statistically significant p-value over- or under-expression.
  68. 68. The portfolio of embodiment 65 wherein the p-value is less than about 0.05.


[0064] The following examples are provided to illustrate but not limit the claimed invention. All references cited herein are hereby incorporated by reference herein.

Example 1


Materials and Methods


Tissue samples



[0065] Fresh frozen samples of benign thyroid disease, follicular adenoma, follicular carcinoma, and papillary carcinoma were obtained from commercial vendors including Genomics Collaborative, Inc. (Cambridge, MA), Asterand (Detroit, MI), and Proteogenex (Los Angeles, CA). All samples were collected according to an Institutional Review Board-approved protocol. Patient demographic and pathology information was also collected. The histopathological features of each sample were reviewed to confirm the diagnosis and to estimate sample preservation and tumor content.

RNA isolation



[0066] The standard TriZol protocol was used for all RNA isolations. Tissue was homogenized in TriZol reagent (Invitrogen, Carlsbad, CA). Total RNA was isolated from TriZol and precipitated at -20°C with isopropyl alcohol. RNA pellets were washed with 75% ethanol, dissolved in water and stored at -80°C until use. RNA integrity was examined with the Agilent 2100 Bioanalyzer RNA 6000 NanoAssay (Agilent Technologies, Palo Alto, CA).

Linear Discriminant Analysis



[0067] Linear Discriminant Analysis was performed using these steps: calculation of a common (pooled) covariance matrix and within-group means; calculation of the set of linear discriminant functions from the common covariance and the within-group means; and classification using the linear discriminant functions.
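
By way of illustration only, the three steps above can be sketched in Python as follows. This is a minimal sketch and not the software actually used: the function names (`solve`, `fit_lda`, `discriminants`, `classify`), the use of class frequencies as priors, and the toy log2 intensities for two hypothetical probe sets are all assumptions introduced here.

```python
from math import log

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_lda(samples, labels):
    """Return {class: (weights, constant)} for the linear discriminant functions."""
    classes = sorted(set(labels))
    p, n = len(samples[0]), len(samples)
    groups = {k: [x for x, y in zip(samples, labels) if y == k] for k in classes}
    # Step 1: within-group means and the common (pooled) covariance matrix.
    means = {k: [sum(col) / len(g) for col in zip(*g)] for k, g in groups.items()}
    pooled = [[0.0] * p for _ in range(p)]
    for k, g in groups.items():
        for x in g:
            d = [xi - mi for xi, mi in zip(x, means[k])]
            for i in range(p):
                for j in range(p):
                    pooled[i][j] += d[i] * d[j] / (n - len(classes))
    # Step 2: d_k(x) = w_k . x + c_k, where w_k = S^-1 mu_k and
    # c_k = -0.5 * mu_k . w_k + log(prior_k); priors are class frequencies (assumption).
    model = {}
    for k, g in groups.items():
        w = solve(pooled, means[k])
        c = -0.5 * sum(mi * wi for mi, wi in zip(means[k], w)) + log(len(g) / n)
        model[k] = (w, c)
    return model

def discriminants(model, x):
    """Evaluate every class discriminant function at sample x."""
    return {k: sum(wi * xi for wi, xi in zip(w, x)) + c for k, (w, c) in model.items()}

def classify(model, x):
    # Step 3: assign the class with the largest discriminant value.
    d = discriminants(model, x)
    return max(d, key=d.get)

# Hypothetical log2 intensities for two probe sets (not data from this document).
X = [[8.0, 5.0], [9.0, 5.5], [8.5, 4.0], [9.5, 4.5],
     [5.0, 8.0], [6.0, 8.5], [5.5, 7.0], [6.5, 7.5]]
y = ["CP"] * 4 + ["CN"] * 4
model = fit_lda(X, y)
print(classify(model, [9.0, 5.0]))
```

The pure-Python solver is used only to keep the sketch self-contained; a numerical library would normally be preferred for larger gene sets.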

[0068] The posterior probability that an unknown thyroid sample is cancer positive or cancer negative can be derived by entering the chip intensity readings for each probe set into the following equations. For example, if a thyroid sample tested with the assay gives P(CP) > 0.5, the sample is classified as thyroid cancer.

[0069] For the 4 gene signature:





[0070] For the 5 gene signature:








where,
I(psid) = The log base 2 intensity of the probe set enclosed in parentheses.
d(CP) = The discriminant function for the cancer positive class
d(CN) = The discriminant function for the cancer negative class
P(CP) = The posterior p-value for the cancer positive class
P(CN) = The posterior p-value for the cancer negative class
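
As a hedged illustration of how d(CP) and d(CN) may be turned into P(CP) and P(CN), the sketch below assumes the usual relation for linear discriminant analysis, in which each posterior is the exponential of its discriminant value divided by the sum of both exponentials. The signature-specific coefficients are not reproduced here, and the function names `posterior` and `call_sample` are hypothetical.

```python
from math import exp

def posterior(d_cp, d_cn):
    """Return (P(CP), P(CN)) from the two discriminant function values."""
    # Subtract the larger score before exponentiating to avoid overflow.
    top = max(d_cp, d_cn)
    e_cp, e_cn = exp(d_cp - top), exp(d_cn - top)
    total = e_cp + e_cn
    return e_cp / total, e_cn / total

def call_sample(d_cp, d_cn, cutoff=0.5):
    """Classify as cancer positive when P(CP) exceeds the cutoff."""
    p_cp, _ = posterior(d_cp, d_cn)
    return "cancer positive" if p_cp > cutoff else "cancer negative"

# Hypothetical discriminant values, for illustration only.
print(posterior(2.0, 1.0), call_sample(2.0, 1.0))
```

Under this assumption the two posteriors always sum to one, so P(CP) > 0.5 is equivalent to d(CP) > d(CN).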

Two-round aRNA amplification



[0071] aRNA was amplified from 10 ng total RNA using the RiboBeast 2-Round Aminoallyl-aRNA Amplification kit (Epicentre, WI), a T7-based RNA linear amplification protocol, with some modifications. Total RNA was reverse transcribed using an oligo(dT) primer containing a T7 RNA polymerase promoter sequence and Superscript III RT. The second-strand synthesis was carried out using Bst DNA polymerase. An extra step of incubation with an exonuclease mix of Exo I and Exo VII was performed to reduce background. The double-stranded cDNA served as the template for T7-mediated linear amplification by in vitro transcription. For the second round of amplification, the ENZO BioArray HighYield RNA Transcript Labeling kit (Affymetrix, CA) was used for the in vitro transcription step instead of the RiboBeast Aminoallyl-aRNA reagents. The aRNA was quantified by Agilent Nano Chip technology.

Example 2


Microarray analysis



[0072] Labeled cRNA was prepared and hybridized with the high-density oligonucleotide array Hu133A Gene Chip (Affymetrix, Santa Clara, CA) containing a total of 22,000 probe sets. Hybridization was performed according to a standard protocol provided by the manufacturer. Arrays were scanned using Affymetrix protocols and scanners. For subsequent analysis, each probe set was considered as an independent gene. Expression values for each gene were calculated using Affymetrix Gene Chip analysis software MAS 5.0. All chips used for subsequent analysis met the following quality control criteria: the percentage of "Present" calls, the scaling factor, the background level, and the noise level each had to be within the mean plus or minus 3 standard deviations. Sample collection for signature selection and independent validation is summarized in Table 1.
Table 1. Sample collection for signature training and validation
Training Sample Set
Category Number of Samples
Follicular Adenoma (FA) 33
Follicular Carcinoma (FC) 21
Benign Diseases (BN) 13
Papillary Carcinoma (PC) 31
Validation Sample Set
Category Number of Samples
Follicular Adenoma (FA) 38
Follicular Carcinoma (FC) 5
Follicular Variant of Papillary Carcinoma (FVPTC) 11
Papillary Carcinoma (PC) 20
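
The chip quality control criterion described before Table 1 can be sketched as follows. This is an illustrative sketch only: it assumes that the mean and standard deviation of each metric are computed over all chips in the study (the text does not say which population is used), and the metric keys, chip names and values are hypothetical.

```python
from statistics import mean, stdev

QC_METRICS = ("percent_present", "scaling_factor", "background", "noise")

def qc_pass(chips, n_sd=3.0):
    """Return names of chips whose four metrics all lie within mean +/- n_sd * SD."""
    limits = {}
    for metric in QC_METRICS:
        values = [c[metric] for c in chips.values()]
        m, s = mean(values), stdev(values)
        limits[metric] = (m - n_sd * s, m + n_sd * s)
    return [name for name, c in chips.items()
            if all(limits[k][0] <= c[k] <= limits[k][1] for k in QC_METRICS)]

# Nineteen hypothetical well-behaved chips plus one with an extreme background level.
chips = {
    f"chip{i:02d}": {
        "percent_present": 45.0 + (i % 3),
        "scaling_factor": 1.0 + 0.1 * (i % 2),
        "background": 50.0 + 2.0 * (i % 2),
        "noise": 2.0 + 0.1 * (i % 3),
    }
    for i in range(19)
}
chips["outlier"] = {"percent_present": 46.0, "scaling_factor": 1.0,
                    "background": 500.0, "noise": 2.1}
passed = qc_pass(chips)
print(len(passed), "of", len(chips), "chips pass")
```

Because an extreme chip inflates the standard deviation it is compared against, a rule of this kind can only flag outliers when the number of chips is reasonably large.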

Example 3


Results

Signature Identification


A. Gene Selection



[0073] A total of 98 samples, including 31 primary papillary thyroid tumors, 21 follicular thyroid cancers, 33 follicular adenomas, and 13 benign thyroid tissues, were analyzed using Affymetrix human U133A gene chips. Five gene selection criteria were applied to the entire data set to obtain a limited number of genes for subsequent gene marker or signature identification:
  1. Genes with at least one "Present Call" in this sample set were considered.
  2. Genes with more than one "Present Call" in 12 PBL samples were excluded.
  3. Only genes with chip intensity larger than 200 in all samples were selected.
  4. Using genes that passed the above three criteria, we performed a variety of analyses, as listed in Table 2, to identify genes that are either up-regulated or down-regulated in thyroid tumors.
  5. Finally, genes with expression change greater than 1.4-fold were selected.
Table 2. Summary of different types of percentile analyses
  Type of Percentile Analysis
1 20% FC vs 100% Benign
2 30% FC vs 90% Benign
3 30% PC vs 90% Benign
4 70% FC vs 50% Benign
5 70% PC vs 50% Benign
6 90% Benign vs 30% FC
7 90% Benign vs 30% PC
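
The selection criteria and the percentile comparisons of Table 2 may be sketched as below. This is one possible reading offered for illustration only. It is assumed here that an entry such as "30% FC vs 90% Benign" compares the 30th percentile of the FC intensities with the 90th percentile of the benign intensities, and that the 1.4-fold cutoff applies to that ratio; the text does not define either point. Criterion 3 is implemented as literally stated. All function names and values are hypothetical.

```python
def percentile(values, pct):
    """Linear-interpolation percentile (pct in 0..100) of a non-empty list."""
    s = sorted(values)
    pos = (len(s) - 1) * pct / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def passes_call_filters(thyroid_calls, pbl_calls, intensities, floor=200.0):
    """Criteria 1 to 3: detection calls in thyroid and PBL, then the intensity floor."""
    return ("P" in thyroid_calls                  # at least one Present call
            and pbl_calls.count("P") <= 1         # not more than one Present call in PBL
            and all(v > floor for v in intensities))

def percentile_fold(group_a, pct_a, group_b, pct_b):
    """Ratio of the pct_a percentile of group_a to the pct_b percentile of group_b."""
    return percentile(group_a, pct_a) / percentile(group_b, pct_b)

# Hypothetical intensities for one gene in follicular carcinoma and benign samples.
fc = [900.0, 1200.0, 1500.0, 2000.0, 2600.0]
benign = [300.0, 350.0, 400.0, 450.0, 500.0]
fold = percentile_fold(fc, 30, benign, 90)   # "30% FC vs 90% Benign"
print(round(fold, 3), fold > 1.4)
```

Comparing a low percentile of one group with a high percentile of the other is a conservative test: the gene is retained only if most tumor samples exceed most benign samples.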


[0074] The final number of selected genes for signature identification is 322, described in Table 25, SEQ ID NOs: 1, 4, 7, 8, 10-11, 13-17, 19-24, 26-27, 29-31, 33-35, 37-38, 40-52, 54-72, 75-82, 84-135, 138-141, 144-151, 153-159, 161-162, 164, 166-173, 176-198, 200-201, 203-206, 208-209, 212-213, 215-218, 220-221, 223, 227-233, 235-241, 243-244, 246-249, 251, 253-254, 256-263, 265-289, 291-293, 295-308, 310-331, 333-341, 343-345, 347-348, 350-353 and 355-363. The data obtained from the 322 selected genes are provided in Table 3 and summarized in Table 4.

Table 4. 4-Gene signature performance in 98 training samples
  Tumor Benign
Positive 48 18
Negative 4 28
Sensitivity 92% (0.82, 0.97)
Specificity 61% (0.46, 0.74)
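
For illustration, the sensitivity, specificity and bracketed intervals of Table 4 can be recomputed from the four cell counts. The text does not state how the intervals were obtained; the sketch below assumes a 95% Wilson score interval, which agrees with the tabulated values to two decimal places. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

def performance(tp, fp, fn, tn):
    """Sensitivity and specificity with intervals from a 2 x 2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_interval(tn, tn + fp),
    }

# Counts from Table 4: 48 and 4 tumors called positive and negative,
# 18 and 28 benign samples called positive and negative.
table4 = performance(tp=48, fp=18, fn=4, tn=28)
for key, value in table4.items():
    print(key, value)
```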

B. Signature Identification using Linear Discriminant Analysis



[0075] We used a forward selection process that adds one gene at a time until the posterior error as evaluated by a linear discriminator is less than or equal to 0.1. A four-gene signature was discovered using this approach with the 322 genes. The identities of these 4 genes are listed in Tables 5 and 16 and their chip data are shown in Tables 6 and 7.

[0076] 
Table 5. 4-Gene Signature
SEQ ID NO: Gene Name
354 Chemokine (C-C motif) ligand 18 (pulmonary and activation-regulated)
199 Pulmonary surfactant-associated protein B (SP-B)
207 K+ channel beta subunit
255 Putative prostate cancer tumor suppressor
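
A forward selection of the kind described in paragraph [0075] can be sketched as a greedy loop around any error estimate. In this illustrative sketch the error function is pluggable; the demonstration uses the training error of a nearest-class-mean rule as a simple stand-in for the linear discriminator, and the early exit when no gene improves the error is a safeguard added here. Gene names and values are hypothetical.

```python
def forward_select(candidates, error_of, target_error=0.1):
    """Add one gene at a time, always the one giving the lowest error."""
    selected, best_error = [], 1.0
    remaining = list(candidates)
    while remaining and best_error > target_error:
        trial = {g: error_of(selected + [g]) for g in remaining}
        best_gene = min(trial, key=trial.get)
        if trial[best_gene] >= best_error:
            break  # safeguard: no remaining gene improves the error
        selected.append(best_gene)
        remaining.remove(best_gene)
        best_error = trial[best_gene]
    return selected, best_error

def centroid_error(data, labels, genes):
    """Training error of a nearest-class-mean rule restricted to `genes`."""
    classes = sorted(set(labels))
    means = {k: {g: sum(r[g] for r, y in zip(data, labels) if y == k) / labels.count(k)
                 for g in genes} for k in classes}
    wrong = 0
    for row, y in zip(data, labels):
        dist = {k: sum((row[g] - means[k][g]) ** 2 for g in genes) for k in classes}
        if min(dist, key=dist.get) != y:
            wrong += 1
    return wrong / len(data)

# Hypothetical log2 intensities: five cancer positive, then five cancer negative samples.
g1 = [9, 9, 9, 9, 5, 5, 5, 5, 5, 9]
g2 = [5, 7, 5, 7, 12, 7, 5, 7, 5, 0]
g3 = [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
data = [{"g1": a, "g2": b, "g3": c} for a, b, c in zip(g1, g2, g3)]
labels = ["CP"] * 5 + ["CN"] * 5

signature, error = forward_select(
    ["g1", "g2", "g3"], lambda genes: centroid_error(data, labels, genes))
print(signature, error)
```

In the toy data no single gene reaches the 0.1 target, so a second gene is added; the same loop stops after four genes if that is where the error first falls to the target.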


[0077] Leave One Out Cross Validation (LOOCV) resulted in 92% sensitivity and 61% specificity, shown in Table 4. The ROC curve gave an AUC of 0.897, as shown in Figure 1.
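
Leave-one-out cross validation and the area under the ROC curve can be sketched as follows, for illustration only. The model-fitting and scoring functions are passed in as arguments; the demonstration uses a one-gene midpoint rule as a hypothetical stand-in for the linear discriminator, and the AUC is computed as the fraction of tumor/benign pairs in which the tumor sample scores higher, with ties counted as one half.

```python
def loocv_scores(samples, labels, fit, score):
    """Score each sample with a model trained on all the other samples."""
    out = []
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        out.append(score(fit(train_x, train_y), samples[i]))
    return out

def roc_auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def fit_midpoint(xs, ys):
    """Hypothetical one-gene model: midpoint between the two class means."""
    tumor = [x for x, y in zip(xs, ys) if y == "T"]
    benign = [x for x, y in zip(xs, ys) if y == "B"]
    return (sum(tumor) / len(tumor) + sum(benign) / len(benign)) / 2.0

def score_against(midpoint, x):
    """Positive scores are tumor-like, negative scores benign-like."""
    return x - midpoint

# Hypothetical log2 intensities: four tumor samples, then four benign samples.
values = [9.0, 8.0, 7.0, 5.0, 4.0, 5.0, 6.0, 8.0]
labels = ["T"] * 4 + ["B"] * 4
scores = loocv_scores(values, labels, fit_midpoint, score_against)
auc = roc_auc(scores[:4], scores[4:])
print(auc)
```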
Table 6
  Signal
SEQ ID PC_984TT PC_986TT FA_987TT PC_988TT PC_989TT FA_992TT FA_993TT
12 3399.2 7041.3 5438 6376.7 2734.6 3569.5 4305.9
18 6550.8 5870.6 5815.9 4265.1 8856.4 3454.4 20405.3
29 4578.6 6455.8 2329.4 4259 2666.8 3694 7531.2
  FA_994TT FA_995TT FA_996TT FA_998TT FA_999TT FA_1001TT FA_1002TT
12 2545.7 2451.8 4418 3938.3 3648.5 6834.5 4882
18 12566.4 9889.5 4341.5 5210.1 6639.5 4794.8 3921.9
29 2239.8 1834.5 4607.1 6424.1 3069.6 4301.4 3164
  FA_1004TT FA_1005TT FA_1006TT FA_1010TT FA_1013TT FA_1014TT FA_1017TT
12 2274.9 2644.2 3978.8 2999.7 4012.5 1724 5005.8
18 10402 7376.3 9985.1 1330.3 4618.8 6132 4562.6
29 1688.9 4771.2 3159.1 3082.6 6861.7 1465 3384.7
  FA_1018TT FA_1020TT FA_1023TT FA_1024TT FA_1026TT FA_1027TT FA_1028TT
12 2847.6 7102 7513.5 2086.3 2431.3 1318.2 1231
18 6388.5 2640.8 1967.7 6853.3 5068.5 2006.4 1804.9
29 3829.5 4235 4272.5 924 1702.3 1691.5 1731.3
  FA_1029TT FA_1030TT FA_1031TT FA_1032TT FA_1034TT FA_1035TT FC_1037TT
12 4607.6 2416.1 2203.7 6489.9 2095.3 1547.9 1847.4
18 3507.3 2270.8 8058 11247.5 8258.1 7485.8 8309
29 1530.5 5641.4 3460.1 4802.1 1650.7 854.6 2418.4
  PC_1039TT PC_1040TT PC_1041TT PC_1042TT PC_1043TT PC_1044TT PC_1045TT
12 8204.9 2638.5 4795 2607.8 6725.4 5178.7 2111.1
18 6454.7 10129.3 7964 4952.6 3663.3 1887.7 19840.6
29 8858.5 4136.8 8434 2230.7 3434 2770 2243.1
  PC_1046TT PC_1047TT PC_1048TT PC_1049TT PC_1050TT PC_1051TT PC-FV_1052TT
12 4403.6 9394.7 2262.5 4831.2 2826.9 1288.6 1953.2
18 2417.7 5678.4 6655.2 4325.2 1706 6904.8 4596.1
29 2319.8 20509.6 1176.1 5375.3 2053.1 4027.5 3332.3
  PC_1053TT PC_1054TT PC_1055TT PC_1059TT PC_1060TT PC_1061TT PC_1062TT
12 4150.4 6180.4 1835.7 2447.2 4201.3 4984.3 3399.3
18 5965.7 13847.6 3740.6 4167.7 14516.2 4896.7 8891.6
29 3718.1 4818.3 1728.8 2972.2 4851 3469 3718.6
  PC-FV_1064TT PC-FV_1065TT PC-FV_1066TT PC-FV_1067TT PC-FV_1068TT PC-FV_1069TT PC-FV_1071TT
12 4221 4884.5 7256.6 640 4055 3208.2 6301.3
18 4095 6627.6 1427.1 5405.5 10722 11509.9 10314.3
29 5896.7 5461.2 9437.3 1634.4 1344.2 2186.4 1794.8
  PC-FV_1072TT FA_1073TT FA_1074TT FA_1075TT FA_1076TT FC_1077TT FC_1078TT
12 1390.3 3325.5 1430.8 1784.4 1563.9 1999.4 1830.7
18 5629 10010.5 3459.2 11454 10807.8 1384.8 11906.2
29 2974.5 7133.1 3009 4184.9 1673.9 1044.5 4264.1
  FC_1079TT PC-FV_1080TT PC-FV_1081TT FC_1082TT pb1 pb2 pb3
12 2856.2 2437.9 3541.2 4439.8 11.4 9.1 9.9
18 22738 4344.1 4108.3 3312.3 94.2 50.7 67.6
29 1092.8 3052.8 3302.6 2383 8.7 21.6 30.5
  pb4 pb5' pb6' pb7' pb8' pd10 pd11
12 33.8 8.3 6.6 18.8 61.4 38.2 10.7
18 74.9 29.7 53.6 37.8 31.8 68.8 117.5
29 66.1 64.6 36.4 51.5 25.1 17.2 17.2
  pd12 pd9          
12 4.1 11.6          
18 70.8 23.4          
29 19.7 11          
Table 7
  Signal
SEQ ID FA_987TT FA_992TT FA_993TT FA_994TT FA_995TT FA_996TT FA_998TT
354 50.9 100.4 63.7 14 699.7 181.7 11.9
199 81.7 18 43.3 130.3 461.9 469.4 135.6
207 3844.9 1325.1 77.9 397.2 518.3 807.5 1084.3
255 1374.4 2332.1 1631 611.1 1616.7 419.2 230.7
  FA_999TT FA_1001TT FA_1002TT FA_1004TT FA_1005TT FA_1006TT FA_1010TT
354 194.7 2397.3 50.6 205.2 9.6 29.1 341.5
199 124.1 574.9 206 158.5 423.6 121 2489.4
207 1861.5 1325.1 846.6 308.8 1765.9 598.9 1067.7
255 266.5 186.7 324.2 94 1071.3 1098.5 1294
  FA_1013TT FA_1014TT FA_1017TT FA_1018TT FA_1020TT FA_1023TT FA_1024TT
354 207.5 69 21.7 20.1 64.7 87.4 372.5
199 440.8 99.8 221.6 226.9 108.2 155.2 112.6
207 535.7 1910.9 1185.2 571.8 1552.7 2739.5 692.6
255 437.2 655.6 165.1 178.4 86.4 436.5 40.7
  FA_1026TT FA_1027TT FA_1028TT FA_1029TT FA_1030TT FA_1031TT FA_1032TT
354 21.5 18.1 3.8 13.3 66 15.1 38.8
199 70.9 46.1 561.3 67.7 35.8 48 107.3
207 1230.4 80.5 732 3216.4 2253.5 1989.9 2115.2
255 479.9 226.5 35.9 1403.4 1144.3 247.9 1989.4
  FA_1034TT FA_1035TT FA_1073TT FA_1074TT FA_1075TT FA_1076TT PC_984TT
354 61.3 25.3 1355.7 188.5 14.7 8.9 97.5
199 216.5 56.1 1980.6 454.1 86.2 173.8 154.2
207 421.8 1493.5 771.6 1291.4 116.5 1169.9 759.2
255 489.2 229.5 205.3 655 152.7 156.3 152
  PC_986TT PC_988TT PC_989TT PC_1039TT PC_1040TT PC_1041TT PC_1042TT
354 966.6 619 12.9 2577 540.4 1229.7 320
199 103.9 142.5 288.5 1006.6 3990.7 17402.7 1023.2
207 400.5 2357.5 1213.8 627.5 62.6 133.6 981.5
255 299.7 2137.5 131.9 500.6 2412.4 2763.1 303.4
  PC_1043TT PC_1044TT PC_1045TT PC_1046TT PC_1047TT PC_1048TT PC_1049TT
354 444.3 170.8 817.6 1095.1 966.8 1263.5 800.7
199 604.3 222.7 22961.4 9488.3 10311.6 1702.7 1366
207 1680.3 594.2 136.5 673.2 755 20.8 254.8
255 407.6 4162.4 872.5 2472.3 1497.3 2944.8 2399.9
  PC_1050TT PC_1051TT PC_1053TT PC_1054TT PC_1055TT PC_1059TT PC_1060TT
354 292.9 614.4 1035.4 1545.3 139.5 500.2 37.1
199 117 29233.8 1435 8550.5 242.8 13914.6 2377.3
207 87.8 113.4 73.3 187.7 491.6 158.4 207.3
255 2331.8 917.2 1505.9 2186.4 3033.2 2191.1 215.4
  PC_1061TT PC_1062TT FC_1037TT FC_1077TT FC_1078TT FC_1079TT FC_1082TT
354 71 23.8 18.4 28.9 87.2 13.5 27.7
199 6933.5 616.2 21.2 226.7 200.8 40.8 144.4
207 1088.5 1133.5 838.8 2012.3 24.9 196.8 1356.4
255 3084.8 1017.8 1355.3 199 1352.4 1578 1294.4
  PC-FV_1052TT PC-FV_1064TT PC-FV_1065TT PC-FV_1066TT PC-FV_1067TT PC-FV_1068TT PC-FV_1069TT
354 394.8 131.4 143.7 281.9 302.1 973.6 44.9
199 4620.7 1931 1613.2 170.8 579.2 20066.3 96.9
207 88.2 1033.1 705 244.1 2559 8.8 12.3
255 2479.7 242.2 1322 641.9 1222.8 1914.8 93.2
  PC-FV_1071TT PC-FV_1072TT PC-FV_1080TT PC-FV_1081TT      
354 109.5 127.2 205.7 12.4      
199 257.5 2302.7 320 206      
207 42.9 72.2 2795.9 145      
255 1128.1 2320.8 1705.2 196.8      

C. Manual Selection of Markers



[0078] Individual genes were selected with the aim of formulating an RT-PCR-based assay. Comparison of gene expression profiles between thyroid cancers and non-cancer tissues identified a five-gene signature from these 322 genes. The identities of these five genes are shown in Tables 8 and 16 and the chip data are shown in Table 9. The performance of this signature was assessed using LDA in the 98 samples; the signature gives 92% sensitivity and 70% specificity, as shown in Table 10. The ROC curve gave an AUC of 0.88, as shown in Figure 2.
Table 8. 5-Gene signature
SEQ ID NO: Gene Name
53 Cadherin 3, type 1 (P cadherin)
242 Fibronectin
73 Secretory granule, neuroendocrine protein 1
36 Testican-1
211 Thyroid Peroxidase (TPO)
Table 9
SEQ ID BN_800TT BN_801TT BN_802TT BN_804TT BN_805TT BN_806TT BN_807TT
36 908.3 247 118.9 349.7 425.1 229.8 227.7
53 48.6 25 24.1 34 17.8 32.2 16.7
76 252.3 272.9 75.2 260.6 169 587.1 189.8
242 22247.9 1204.2 676.7 2441.5 3900.1 2073.4 2060.1
211 24.2 16983.5 32579.5 20189.3 24480.5 14186.3 6488.1
  BN_871TT BN_913TT BN_914TT BN_915TT FA_917TT FA_918TT FA_919TT
36 897.2 435.1 442.3 508.7 169.6 184.3 213.3
53 192.4 28 76.1 44.2 67.7 135.7 25.7
76 2134.6 252.5 956.3 174.3 62 144 233.8
242 13722.5 10481.9 4172.9 1891.1 1654.2 11791 2788.1
211 6243.3 16339.1 14355 36350.1 29876.5 19395.7 15935.8
  FA_818TT FA_819TT FA_820TT FA_821TT FA_822TT FA_842TT FA_862TT
36 120.4 155.8 210.6 31.7 283.3 122.2 24.1
53 27.2 26.9 34 33.2 34.4 27.8 66.1
76 13 69.8 36.2 171.2 116.2 190.8 15.9
242 1433.3 2211.8 2714.7 325 2278 3959.5 1573.1
211 15976.6 19882.4 10029.6 29411.3 25179.3 12492.5 32721.4
  FA_863TT FA_864TT FA_865TT FA_866TT FA_867TT FA_869TT FA_907TT
36 955.1 173.1 400.5 230.8 233.5 2581.5 207.8
53 333.9 27.9 147.1 155 25 1224.9 52.8
76 2614.7 624.1 614.9 336.7 163.5 1921.3 38.5
242 930.4 1419.2 4708.4 2169.8 1081.9 3241.5 4555.3
211 19631.3 27382.8 29846 25011.6 30599.1 23680.6 35313.3
  FA_908TT FA_920TT FA_938TT FA_940TT FA_941TT Fa_EA40374_921TT Fa_EA40376_923TT
36 329.6 165.4 331.5 298.8 267.4 211.3 188.6
53 35.8 28.5 21.7 291.4 45.3 23.8 62.8
76 253.8 64.7 3312.7 870.1 1038.3 153.3 236.1
242 1630.4 2452.4 6684 765 2229.5 1809.2 3314
211 21639.5 31897.6 26420.3 18786.4 31922.5 36025.3 15065.2
  BN_EA40377_924TT Fa_EA40378_925TT Fa_EA40379_926TT BN_EA40380_927TT Fa_EA40387_955TT Fa_EA40388_956TT Fa_EA40389_957TT
36 1409.1 215.1 228.5 833.3 94.7 259.7 786.2
53 2294.2 40.5 496.1 64.9 197.4 230.4 17.3
76 624.1 149.1 211.3 501.1 2225.4 1443.7 223.7
242 21699.6 1079.3 21653.5 3263.3 2194.8 989.6 2107.8
211 664.1 12583.3 13036.6 43191.4 9409.2 19630.9 4160.9
  Fa_EA40390_959TT Fa_EA40391_960TT Fa_EA40392_961TT Fa_EA40393_962TT PC_829TT PC_830TT PC_831TT
36 2496.1 175.3 58.2 1470.7 330.3 210.3 432.7
53 24.4 40.3 52.6 183 73 49.6 1916.2
76 2397 47.6 77.2 1048.1 758.4 576.2 394.3
242 1991.2 1042.5 1474.6 688.1 6760 6785 25733
211 38927.5 31764 34287.3 44314.7 25565 31930.4 138.4
  PC_832TT PC_834TT PC_835TT PC_836TT PC_837TT PC_838TT PC_839TT
36 1376.8 4076.8 1208.3 585 1015.5 1620.2 71
53 1909.1 2701 1178.5 710.7 1469.6 3009.9 41.2
76 433.6 665.5 654.3 2432.3 412.1 1946.7 327.2
242 9423.8 18037.6 23053.8 1044.8 22442.7 27313.9 3641
211 7174.2 1961.7 282.1 24646.9 15840.2 264.2 17049.3
  PC_879TT PC_881TT PC_882TT PC_883TT PC_884TT PC_885TT PC_886TT
36 457.4 1779.4 561.2 722.1 1129.2 552.5 1244.9
53 2036.8 3121.5 2072.4 2753 2221.1 1703.8 1325.9
76 1168 1829.9 1210.6 2663.6 1076 1232.2 1866.8
242 46283.3 32477.3 33142.3 38026.7 42087.5 49249 39172.5
211 86.8 613.5 48.3 126.2 1565.2 120 1553
  PC_890TT PC_892TT PC_893TT PC_894TT PC_903TT PC_904TT PC_928TT
36 813.3 2211.4 1046.2 585.3 3395 767.9 246.6
53 4685.6 3536.8 1363.5 1244.4 3057.5 199.9 41.1
76 3926.2 2787 2359.3 1020.3 1564 203.6 421.5
242 36114.4 31599.6 34700.3 41193.8 18485.8 19181.5 3729.4
211 1569.2 1703.6 234.5 1594 4063.1 5354.1 18199.2
  PC_932TT PC_933TT Pc_EA40375_922TT Pc_EA40381_945TT Pc_EA40382_946TT Pc_EA40383_947TT Pc_EA40384_948TT
36 352.4 920.3 316.8 818.6 3078.8 532.1 311.1
53 3447.4 788.3 41.8 803.9 2330.1 1337.6 1556.2
76 850.2 865.6 211.2 237.9 1830 1279.7 598
242 21913.7 24536 1758.5 13301.5 22076.9 31987.1 27579.2
211 490.5 13171.6 30959.7 331 857.8 2007.7 556.5
  FC_823TT FC_824TT FC_825TT FC_827TT FC_828TT FC_840TT FC_896TT
36 282.5 2413 1506.3 821.8 564.4 331.6 857.1
53 23.2 24.1 25.5 348.1 27.8 32.3 245.4
76 460.1 2833.9 1128.9 1421.6 705.3 341.8 358.9
242 12129 22000.8 874.8 2544.9 845.1 3459.7 1998.1
211 30520.5 496.2 7923.9 11508.2 33401.7 10882.8 1625.3
  FC_898TT FC_899TT FC_900TT FC_901TT FC_902TT FC_909TT FC_910TT
36 827.5 1152.4 778.8 11199.3 578.7 713 132.4
53 38.1 1949.7 2221 29.8 185.9 21.5 57.2
76 2517.5 1009.8 3523.8 28603.2 2201.1 195.6 4744.8
242 39786.8 38117 29127.6 4032.5 43673.6 3478.5 1273.2
211 1160.7 132.9 79.4 26.8 251.4 488 8979.7
  Fc_EA40386_954TT Fc_EA40394_967TT Fc_EA40395_968TT Fc_EA40396_969TT Fc_EA40397_970TT Fc_EA40405_982TT Fc_EA40406_983TT
36 619.3 412.7 134 218.6 463 1233.6 234.6
53 15.8 31 23.8 55.9 65.5 31.6 120.8
76 4136.1 682.5 41.2 18.4 594.7 931 2232
242 969.6 5003.3 13867.1 2129.7 15742.7 4086.9 11985.6
211 20581.7 1509.8 2603.1 15427.4 369.3 30161.2 19869.7
Table 10. 5-Gene signature performance in 98 training samples
  Tumor Benign
Positive 48 14
Negative 4 32
Sensitivity 92% (0.82, 0.97)
Specificity 70% (0.55, 0.81)

D. Cross Validation with the 74 Independent Thyroid Samples



[0079] Seventy-four independent thyroid samples were processed and profiled with the U133A chip, and the chip data for the two signatures are shown in Table 11. The performance of the 4-gene and 5-gene signatures was assessed with LDA. Both signatures performed equivalently in these samples and in the 98 training samples. The sensitivity and specificity for both signatures are shown in Table 12, and the ROC curves are shown in Figures 3a and 3b.
Table 11
  Signal
SEQ ID FA_987TT FA_992TT FA_993TT FA_994TT FA_995TT FA_996TT FA_998TT
36 92.8 437.8 640.5 4152.5 714.7 254.8 303.4
53 15.8 422.8 25.2 29.5 258.9 45.6 26.8
76 485.1 2536.5 1708.6 288.3 643.3 605.6 341
242 686.5 971.4 24532.8 895.7 6424.8 842.1 905.1
211 28269.8 14689.4 357.5 33711.6 34943.2 16131 22376.1
  FA_999TT FA_1001TT FA_1002TT FA_1004TT FA_1005TT FA_1006TT FA_1010TT
36 63.7 22.3 55.6 408.5 128.3 353 442
53 30.9 31.8 39.6 70.8 37.9 38.9 119.6
76 110.7 90.2 253.5 159.1 1559.6 413.7 1222.5
242 1550.6 1689.3 256.5 1668 3921.5 1013.2 1837.3
211 26779.6 15870.7 24525.5 43040.2 25392.9 32641.9 20140.7
  FA_1013TT FA_1014TT FA_1017TT FA_1018TT FA_1020TT FA_1023TT FA_1024TT
36 453.5 333.7 179.6 146.6 417.3 75.1 129.2
53 39.8 21.4 99.9 23.7 26.8 62.3 22.2
76 333.7 1231.2 241.1 131 1435.3 1491.6 260.8
242 1459.4 415 1696.3 287.1 617.5 1932.6 1689.2
211 7911.3 19359.4 18089 18222.8 30038.8 9053 15611.2
  FA_1026TT FA_1027TT FA_1028TT FA_1029TT FA_1030TT FA_1031TT FA_1032TT
36 281.1 130.5 150.4 629.1 760.4 191.6 1671.5
53 125.5 62.3 118.2 230.7 420.5 27.7 519
76 77.4 111.9 184 2100.9 1741.3 338.1 3643.5
242 1091.6 754.3 705.9 684.7 372.4 1848.8 3094.7
211 21979.1 7890.5 26226 15344.1 10368 30439.9 23943.3
  FA_1034TT FA_1035TT FA_1073TT FA_1074TT FA_1075TT FA_1076TT PC_984TT
36 24.8 119.9 111.7 1010.9 26 224.5 303
53 29.6 30.4 366.7 96.9 94.3 31 37.2
76 289.3 239.7 167.4 326.8 818.9 248.5 198.3
242 1482.1 590.9 4842.5 2815.5 2726.8 754.8 785.1
211 38135.4 30510.6 29406.4 31075.4 20833.9 34960.4 24938.2
  PC_986TT PC_988TT PC_989TT PC_1039TT PC_1040TT PC_1041TT PC_1042TT
36 676.3 391.7 264.9 735.7 524.7 769.9 1125
53 32.5 395.6 305.5 655 1153.9 2194.1 111.5
76 287.9 1384.8 888.3 1048.4 1381.1 412.6 1384.9
242 2450.7 2243.2 698.6 20366.4 32090.9 27480.9 1915.7
211 2689.5 19903.2 32075.9 9197.7 152.2 1862.2 10375.9
  PC_1043TT PC_1044TT PC_1045TT PC_1046TT PC_1047TT PC_1048TT PC_1049TT
36 1013.6 1023.8 809.8 1811.1 1546.2 481.8 2491.9
53 730.7 2965.3 2533.7 2052.8 756.1 988.3 2000.6
76 1977.4 3380.4 422 4343.9 3570.2 1786.8 1619.4
242 3907.5 20074.8 41761.2 33253.5 28124.7 30516 21324.5
211 7015.5 11.5 136.1 59.1 3841.7 31.1 1476
  PC_1050TT PC_1051TT PC_1053TT PC_1054TT PC_1055TT PC_1059TT PC_1060TT
36 443.9 669 477.1 952.8 645.3 660.2 648.5
53 4037.8 1290.7 933.9 1336.9 2123.2 1353.8 66.9
76 856.7 1039 463.2 1399.8 4385 365.8 160.7
242 1261.6 58258.5 26004.9 21219.8 12760.8 26084.4 3277.6
211 1150.2 375.9 73.2 2575.8 400.5 24.2 1637.7
  PC_1061TT PC_1062TT FC_1037TT FC_1077TT FC_1078TT FC_1079TT FC_1082TT
36 3523.2 1912.5 234.2 128.3 825.1 297.5 1447.6
53 2348.3 975.7 97.6 46.2 15.8 38.5 254.6
76 4994.9 3891.7 5559.8 252 1103.5 700 1754.7
242 15397.4 2006.1 1826 2407.5 22661.8 1595.5 713.7
211 3379.4 2201.9 41570 15804.5 3572.1 4377.9 16863.7
  PC-FV_1052TT PC-FV_1064TT PC-FV_1065TT PC-FV_1066TT PC-FV_1067TT PC-FV_1068TT PC-FV_1069TT
36 1993.3 230.6 1128.8 708.4 1348 331.4 33.2
53 2450.3 72.7 610.3 72.5 146.4 1073.7 21.5
76 2075 451.3 1350.2 922.5 1743.7 950.5 332.8
242 21494 7110.9 10228.8 2206.3 3319.2 28034.2 430.1
211 1581.5 29936.9 1594.7 5693.1 11288.8 573.6 12485.5
  PC-FV_1071TT PC-FV_1072TT PC-FV_1080TT PC-FV_1081TT      
36 148.9 332.2 1288.7 672      
53 1836.9 1450.3 2221 22.9      
76 4053 1628.6 2275.9 62.6      
242 1810.6 26396.9 683.4 984      
211 8.9 24 3416.7 2630.8      
Table 12. 4-Gene and 5-gene signatures performance in 74 validation samples
4-Gene Signature
  Tumor Benign
Positive 33 12
Negative 3 26
Sensitivity 92% (0.78, 0.97)
Specificity 68% (0.53, 0.81)
5-Gene Signature
  Tumor Benign
Positive 33 9
Negative 3 29
Sensitivity 92% (0.78, 0.97)
Specificity 76% (0.61, 0.87)

E. Control Gene Marker Identification



[0080] Using the 98 thyroid samples and 12 PBL samples, we selected two groups of genes as sampling controls. The first group consists of genes that are expressed in thyroid but not in PBL; the second group consists of genes that are expressed in PBL but not in thyroid. The full gene list and corresponding chip data are shown in Tables 13a and 13b. From these genes we selected six that are abundant and show a relatively large difference in expression between thyroid and PBL. Their expression profiles were validated in the 74 independent thyroid samples. The identities of these six genes are listed in Table 14 and their chip data are shown in Tables 15a and 15b.
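
One way to shortlist such control genes is sketched below for illustration. The text gives no numerical thresholds, so the abundance cutoff, the separation cutoff, the worst-case separation measure (lowest value in the tissue of interest against the highest value in the other tissue) and all gene names and intensities are assumptions made here.

```python
from math import log2
from statistics import median

def tissue_specific(expr_target, expr_other, min_abundance=1000.0, min_log2_gap=3.0):
    """Rank genes abundant in the target tissue and well separated from the other."""
    hits = []
    for gene, target_values in expr_target.items():
        abundance = median(target_values)
        # Worst-case separation: lowest target value versus highest off-target value.
        gap = log2(min(target_values) / max(expr_other[gene]))
        if abundance >= min_abundance and gap >= min_log2_gap:
            hits.append((gene, abundance, gap))
    return sorted(hits, key=lambda h: h[2], reverse=True)

# Hypothetical chip intensities (all positive) for four genes in each tissue.
thyroid = {"geneA": [3400.0, 7000.0, 5400.0], "geneB": [1500.0, 900.0, 1200.0],
           "geneC": [300.0, 250.0, 280.0], "geneD": [20.0, 30.0, 25.0]}
pbl = {"geneA": [11.0, 9.0, 10.0], "geneB": [400.0, 650.0, 500.0],
       "geneC": [5.0, 4.0, 6.0], "geneD": [4000.0, 3500.0, 5000.0]}

thyroid_controls = tissue_specific(thyroid, pbl)   # expressed in thyroid, not in PBL
pbl_controls = tissue_specific(pbl, thyroid)       # expressed in PBL, not in thyroid
print([g for g, _, _ in thyroid_controls], [g for g, _, _ in pbl_controls])
```

Calling the same function with the two tissues swapped yields the second control group, which keeps the two selections symmetric.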
Table 13a
  13a1                
    ThyMixBen_......_Signal
  SEQ ID   02014_40062_H133A_800TT   02014_40063_H133A_801TT 02014_40064_H133A_802TT 02014_40065_H133A_804TT 02014_40066_H133A_805TT 02014_40067_H133A_806TT
  2   2789.6   2676.2 3009.2 2957.5 3644.5  
  3   6346.1   794.7 623.9 772 2977.3 622.7
  5   845.5   3039.1 4452.6 2982.5 8283.2 1627.6
  6   2730.4   1108.5 1155.6 2201.2 1954.3 1750
  9   2248.9   12094.5 7529.2 2827.3 7701.7 1548.3
  12   10625.3   3137.2 2382.7 5727.8 6238 4685.1
  18   3171   5947.9 13241.7 10294.9 7849.3 4037.4
  22   2715.2   9522.5 6429.7 2590.9 5294.3 6490.4
  25   19159.3   1467.3 550.1 2838 11294.7 1374.2
  28   722.7   835.1 727.1 568.1 1336.7 740.7
  32   4160.6   8115.2 4061.8 6625.3 7348.9 3964.1
  39   314.3   1243.1 1486.8 742.5 1027.7 1185.9
  74   323.9   1227.6 1965.1 1281.5 760.1 505.4
  78   250.8   677 1029.5 403.7 795.2 355
  143   1337.3   1339.1 2607.9 1758.8 1933.5 1809
  174   1263.1   4430.7 6041.6 3543.2 7346 3920.5
  175   1996.3   1829.5 3632.7 1646.5 2847.5 3054.2
  191   7108.4   848.3 2063 1133.9 1482.4 1149
  212   258.2   1405.8 1196.9 882.4 1296.8 1790
  222   20586.8   1371.2 2411.9 2573.9 1228.6 1342
  233   1410.6   14126.8 3020.4 2186.1 8439.4 6024.4
  234   5048.1   2183.3 3215.5 4103.9 4096.9 2220.2
  237   905   3397.9 2525.1 2135.4 2822 2492.3
  238   198   1937.7 1289.5 804.5 1538.6 1454.7
  245   143.7   1598.5 1964.2 897.9 2276.6 1348.2
  250   1300.9   777 1405.9 1466.4 1301.3 1865.3
  252   945.6   1547.3 1210.2 1357 1437.1 1141.1
  264   455.1   988.1 932.8 631.7 1253.9 845.4
  290   345.6   327.2 533.1 470.2 700.5 231.9
  294   1043   3655.8 4435.4 3563.9 4377.4 2734.8
  296   845.5   1242.5 1068.6 1506.5 2423.4 1527.3
  342   771.4   535.9 998.5 922.5 1108.1 605.4
  13a2                
    Signal
  SEQ ID   ThyMixBen_02014_40068_H133A_807TT   02014_40198_H133A_871TT 02014_40321_H133A_913TT 02014_40322_H133A_914TT 02014_40323_H133A_915TT 02014_40324_H133A_917TT
  2   1749.6   1571.1 2634.4 2419.2 2697.4 2901.1
  3   377.2   508 2322.7 259.8 1123.4 875.2
  5   2762.9   1795.8 6670.2 2094.7 11425.7 4957.2
  6   1280.1   2033.3 1910.3 2543.7 2173.2 1358.7
  9   3346.8   548.3 11398.5 953.4 2085.2 10684.4
  12   6927.1   4060.5 6664.4 2730.5 4322.8 2616.2
  18   7418.9   148604 3134.7 26576.3 9585.4 8588.9
  22   1797.5   744.5 7334.3 317.9 5120.1 8611.7
  25   1245.8   815.8 7201.1 253.7 2882.9 1053.4
  28   301.1   640.8 1325.9 409.2 762.7 838.8
  32   2992.4   3260.2 3588.1 792.5 5413.2 2581
  39   435.8   958.1 1152.1 499.6 556.9 898.6
  74   689.2   1453.7 491.7 736.6 986.9 1217
  78   160.1   383 151.2 172.9 597.4 745.6
  143   1338.8   1466.6 1069.9 2070.9 2377.2 1976.3
  174   2464.3   1616.3 829.8 1365.2 4938 4804.3
  175   1701   1511.5 1101.1 2325.4 2530.2 2237.6
  191   763.6   2383.3 2013.2 1274.2 601.2 1100.2
  212   481   1131.9 912.5 663.2 666.4 1081
  222   280.8   1574 4944.1 342.2 1225.9 449
  233   1289.5   1272.4 6285.6 866.8 1839.3 5797.4
  234   1645.4   3271.1 2263.3 1789.3 6497.4 3416
  237   1732.6   879.6 3780 1345.3 2154.2 2802.8
  238   493.5   893.1 1098.7 405.9 592.8 1326.4
  245   661.9   1378.3 1244.1 645.4 11242 1171.1
  250   736.7   1305.6 633 1409.4 1368.9 1146.7
  252   1145.8   1986 1203.2 1323.5 1544.6 757
  264   462.7   891.3 629.4 649.7 872.1 1319.5
  290   295.8   697.1 987.8 819.9 440.6 268.7
  294   1909.1   1128.4 924.1 1519.2 3225.3 2954.9
  296   898.4   1232.2 925.1 1290.3 738.1 1069.6
  342   482.4   712.6 476.1 694.8 801.2 605.7
  13a3                
    Signal
  SEQ ID   02014_40325_H133A_918TT   02014_40326_H133A_919TT ThyFolBen_02014_40079_H133A_818TT ThyFolBen_02014_40080_H133A_819TT ThyFolBen_02014_40081_H133A_820TT ThyFolBen_02014_40082_H133A_821TT
  2   3457.7   2769.1 4234.5 5589.2 3035.3 2681.6
  3   920.5   1289.3 627 690.9 585.4 610
  5   1805.2   5184.1 1487.2 1942 2663.8 1231.8
  6   1560.8   1519.3 996.2 1140.3 2093.4 841.6
  9   11354.8   9921.9 3894 2165.4 7467 76856
  12   7531.7   6887.2 1379.1 12924 5085.1 3585.9
  18   8220.1   5452.2 7762.8 7277.4 5374.9 2379.1
  22   4750.8   5739.1 5655.3 7618.6 6227.9 1551.9
  25   1557.9   3017.5 1200.4 881.1 1279.8 406.4
  28   886.6   1189.2 731.4 681.6 1225.7 473.9
  32   4134.1   10964.4 9276 3075.2 2775 608.8
  39   1259.2   1022.1 1423.7 1632.9 899.5 688.9
  74   741.8   948.9 770 674.6 1139.7 213
  78   447.9   488.2 790.6 602.8 776.3 332.1
  143   3165.1   1865.5 2330.7 1160.7 1676 1672.4
  174   607.5   2151.8 4026.2 4702.6 10677.1 3826.8
  175   5358.5   2402.7 3765.6 1231.9 2531.9 2307.7
  191   2685.9   1183.6 1661.4 1657.1 1377.8 368.2
  212   1306.3   1072.1 2181.8 2035.6 954.2 1663.4
  222   2332.8   431.2 1015.6 1651.7 3755.4 1669.9
  233   5181.8   12377 3400.9 4679.3 2476 15066
  234   3760.2   1988.2 3527.4 3971.4 4526.5 997.6
  237   2712   3073.8 1930.6 3521.8 2951.3 2673.5
  238   1552.2   1636.1 1333.5 837.6 1069.7 1860.5
  245   1722.2   1981.5 1978.8 2170 7636 1392.9
  250   1502.7   1094 1083.3 710.8 864.9 1823.2
  252   1609.9   1679.3 1599.4 1463.9 1199.9 1019.2
  264   860.1   835.6 1451.1 1315.6 1221.6 936.8
  290   773.4   490.4 218.3 507.2 648.3 294.6
  294   555.3   2490.7 1854.9 2997.4 7108.2 2160.7
  296   1240.1   1863 1139.6 1245.4 1172.5 1318.3
  342   738.7   509.3 1018.4 728.8 977.2 710.6
  13a4                
    Signal
  SEQ ID   ThyFolBen_02014_40083_H133A_822TT   fa_EA40173_VDX842TT fa_EA40191_VDX862TT fa_EA40192_VDX863TT fa_EA40193_VDX864TT fa_EA40194_VDX865TT
  2   3403.4   3423.5 2246.6 2723.3 4130.4 2828.3
  3   644.4   950.1 1313.2 344.4 1731.1 2576.9
  5   3439.5   754.3 2119 3324.7 1663.9 4423
  6   1373.7   987.7 834.7 1399.9 1115.4 1872.1
  9   7457.5   2730.9 9307.8 7048.5 3775.7 6931.3
  12   3978.9   9462.8 777.5 4280.8 4821 2268.2
  18   6386.9   1591.2 9546.6 4979.5 5005.2 4409.5
  22   2652.5   7124.7 11054.7 4669.9 4776.9 5782.5
  25   1887.7   470.8 703.9 714.5 445.9 698.2
  28   548.5   562.4 596.8 915.9 1194.1 1041
  32   4425   2120.3 1084.1 2638.9 2593.1 1919.9
  39   1103.1   1110.2 898.4 1375.1 1520.4 1744.1
  74   285.9   895.8 532.1 841.2 1938.6 1109.5
  78   643.1   346.6 533.3 680.5 621.5 385.8
  143   1979.9   3302.9 1686.6 2406.6 2255.4 3419.1
  174   5408.1   608.2 5072.8 5907.1 2956.1 2670.2
  175   2650.3   6596.8 2600.8 4704.8 3123 5286.8
  191   667.7   2665.7 1382.1 740.2 1133.3 3678.2
  212   1357.8   2545.2 1590.2 1326.3 1135.1 1079
  222   2239.9   471 548.8 584.1 899.4 1002.2
  233   7318.7   11617.9 10868.4 6925.9 5416.9 14212.9
  234   2395.9   2832.7 6005.3 2493.8 2128.8 6818.7
  237   2275.3   2639.3 2406.6 2595.1 1857.2 2546.1
  238   1181.8   2124.4 1586.7 2125.3 2039.5 1859.1
  245   2082.5   2302.1 2846.4 2280.8 3973.8 1520.6
  250   1684   1718 862.9 2782.6 1973.4 1594.3
  252   1166.7   1579.6 590.4 1623.2 1735.2 1584.3
  264   654.2   1604.2 3062.9 1460.8 1510.1 2018.9
  290   342   431.8 458.8 602.3 745 977.6
  294   2978.3   651.5 3533.7 4065.9 2720.4 2292.8
  296   2132.4   1762.9 815.3 1040.3 1787.6 932.6
  342   1058.5   488.3 475.4 759.3 658.3 851.2
  13a5                
      Signal
  SEQ ID   fa_EA40195_VDX866TT   fa_EA40196_VDX867TT fa_EA40197_VDX869TT fa_EA40219_VDX907TT fa_EA40220_VDX908TT 02014_40327_H133A_920TT
  2   3243.6   2969.9 3761.4 3352.1 2486.7 3109.7
  3   728.9   548.2 531.1 796 1204.6 981.6
  5   2392.7   6268.7 3870.9 10216 3635.2 11319.3
  6   1534   965.4 2782.8 2638.6 1155.7 2100.9
  9   8238.3   8829.9 4175.6 16504 7363.2 3172
  12   33346   2752 3382.5 3589.1 6717.7 3191.1
  18   9146.3   8150.5 7796.3 8021.5 8185 8907.1
  22   4665.5   8163.5 3812 4001.7 5539.5 6346.7
  25   555.6   524.6 1592.8 2057.7 4103.3 886.1
  28   1115.3   604 918.9 1190.9 999.1 998.2
  32   1089.3   1808.7 1245.6 4202 7454.6 1100.2
  39   1606.8   1418.2 1392.4 1443.3 949.8 1388.7
  74   638.2   2085.2 694.7 1778.1 557.7 1486.9
  78   480.6   603.6 959.8 1000.3 476 721.1
  143   3003   2033.2 3441.7 3224.1 2461.3 2701.2
  174   3322   4573.5 4168.2 5982 3022.2 5710.7
  175   4747.7   2573.2 5834.5 4666.3 3540.8 3751
  191   2724.1   726 831.2 1991.3 1556.8 643.4
  212   2362.7   1705.9 2344.8 1259.8 1288.5 1033.6
  222   271.7   2039.4 683.4 1370.8 728.8 1675.2
  233   10892.9   6981.9 10048.5 6773.9 15803 3520.9
  234   4015.4   3229.5 4734.4 7233.3 2712.8 8148.3
  237   2159.9   2415.2 2194.8 2990.7 2565.8 2364
  238   1451   1111.6 1037.4 914.9 1634.9 1078.9
  245   2059.1   25486 2180.9 2093.2 1730 1329.4
  250   3534.3   993.3 2513.3 878.9 686.3 980.4
  252   1589.8   1334.1 2762.4 1267.3 1857.6 1608.2
  264   1335.1   1560.1 1656.9 942.7 950.4 1076.1
  290   496.3   626.2 584.9 549.5 309.7 613.4
  294   2869.4   4842.4 2762.9 4119.2 2299.8 3497.8
  296   1414.4   882.4 1284 1106.4 1493.4 888.5
  342   595.6   660.6 1442.5 1050.3 607.5 1278.6
  13a6                
      Signal
  SEQ ID   02014_40332_H133A_938TT   02014_40335_H133A_940TT 02014_40334_H133A_941TT EA02014_40374_H133A_921TT EA02014_40376_H133A_923TT EA02014_40377_H133A_924TT
  2   1963.7   2662.2 2416.5 3047.7 2043.6 3322
  3   942.5   342 749.1 1300.6 549.3 1182
  5   2218   3903.1 3011.1 1669.8 1508.3 1453.3
  6   939.5   1239.9 1853.8 837.7 1477.8 1129.4
  9   5245.3   1988.9 1165.8 4137 8735.7 4453.3
  12   2500.4   2100 3597.7 5716.2 6694.6 3667.8
  18   5929.7   7819.7 53354 4017.5 2899.7 5013.3
  22   2265.8   6298.2 2112.7 3386.6 4519.5 4186.6
  25   1160.2   916.5 614.7 1055.5 1184.4 828.4
  28   617.6   408.5 740.7 477.3 701.3 550.8
  32   2918.8   3242.3 3083.9 1411.4 4079.5 2063.5
  39   1356.6   876.8 1855 1231.3 982.1 1248
  74   1102.1   826.7 1384.4 296.9 632.9 787.7
  78   361.5   383.5 482.7 578.5 351.8 389.9
  143   1394.9   2889.2 1838.1 1776 1563 3881.9
  174   2198.3   1887.5 3921.8 3076.4 1858.1 1967.3
  175   1391.4   3677.8 2169.3 2169 2397.2 5896.8
  191   1235.1   364.7 1283.5 616.4 1037.7 1957
  212   1625.4   871.1 1083.8 1086.2 1450.7 981.1
  222   503.4   638.5 320.1 1707.8 1089.4 640.8
  233   8499.4   1999.9 4012.1 3345.8 10991.6 11095.5
  234   2067.7   2973.7 2630.4 2027.1 1597 3240
  237   2763.1   1501.1 2724 1997.9 2249.3 2941.1
  238   1055.6   1055.7 1596.5 1035.6 1810.9 1257.5
  245   751.4   1076.1 9152 27144 2120.1 1553.6
  250   1050.6   1474.3 2517.1 1428.1 1169.4 2857.1
  252   987.8   1382.5 2070.7 881.9 1589.3 2217.2
  264   1278.3   1261.1 826.2 865.3 1055.9 1034.7
  290   786.4   936.8 952.1 312.4 385.7 1231.1
  294   2197.5   2064.7 2836.2 2573.5 1835.4 2657.6
  296   635.2   572.8 700.9 1768.2 1426.3 2462.5
  342   718.6   866.8 735.7 770.7 565.3 1695.1
  13a7                
    EA02014_ Signal
  SEQ ID   40378_H133A_925TT   40379_H133A_926TT 40380_H133A_927TT 40387_H133A_955TT 40388_H133A_956TT 40389_H133A_957TT
  2   3274   3025.4 2424.9 2302.6 2724.6 2641.3
  3   1545.1   1151.1 687.6 424.7 420.6 609.2
  5   3661.8   1323.2 7949.9 4860 4059.3 4895.3
  6   1371.5   1220.8 1831 2330.9 1642.8 1963.8
  9   10387.7   11670.1 6402.7 3105.3 4256.6 8744.8
  12   5829.9   6903.8 4783.9 2874.5 3494.1 2451
  18   6648   4053.1 7410.2 4600.2 3912.9 16235.3
  22   6859.7   6621.2 3008.1 1369.9 2480.5 4685
  25   1035.6   3448.2 1347.1 365.3 1587.2 816.8
  28   1087   917.1 747.8 1057 1169.8 974.1
  32   5212.9   6841.6 3282.5 28142 4083.1 8121.2
  39   1228.7   1245.7 762.4 904.9 1136.2 1862.7
  74   823.9   721.6 1460.4 2781.4 970.2 839
  78   713.8   377.6 621.3 212.8 421.3 891.4
  143   1953.5   2925.7 2861.8 1243.3 2335.1 2564.3
  174   3029   1818.4 5264.1 3925.2 4789.1 4553.5
  175   3229.1   4827.2 3821.1 1561.7 4603.8 4098.3
  191   1112.4   5308.2 841 861.6 865.2 1964.9
  212   890.9   915.6 729.1 1264.1 1494.2 1177.2
  222   480.1   2895.3 605.6 145.2 425.3 1712.4
  233   6680.7   9190.5 2236.9 441.5 3626.6 11326.8
  234   1871.2   4194.1 4428.8 1260.8 1599 4627.7
  237   2758.4   2108.7 2070.1 1655.9 2140.3 2820.6
  238   1210.7   1289.5 774.3 1017.9 1645.6 1150.6
  245   2369.7   1585.8 785.1 1808.4 2014.9 1185.6
  250   971.8   1387.3 1193.5 2014.8 2033 608.6
  252   1417.1   990.9 1186.9 1561.4 2006.1 1838
  264   1089.9   538.3 758.6 1340.4 1171.9 933.1
  290   565.1   601.8 485.3 597.5 511.7 852.7
  294   2574.6   1541.3 3862.3 2639.9 3968.8 3955.7
  296   1821.8   1115.3 1015.1 1379.4 1444.1 867.3
  342   621   757.4 791.6 504.5 671.6 1320.3
  13a8                
    Signal
  SEQ ID   EA02014_40390_H133A_959T   EA02014_40391_H133A_960TT EA02014_40392_H133A_961TT EA02014_40393_H133A_962TT ThyPapCan_02014_40090_H133A_829TT ThyPapCan_02014_40091_H133A_830TT@IC
  2   1817.9   1937.6 2632.3 2613.7 2167.7 1372.7
  3   605.6   546.8 697.5 480.4 1793.5 1236.1
  5   8452.7   11532.9 5852.5 5194.8 3735.8 2822.7
  6   2508.7   1035.1 1581.2 1561.5 1568.4 1723.3
  9   1557.3   17159.3 4950.2 1272.3 3597.7 3243.8
  12   84236   3364.4 3626.3 1198.7 59384 3878.6
  18   9560.7   5478.7 8997.6 4601.2 6673.3 6322.9
  22   57452   85846 5849.4 2275.2 4413.3 2539.9
  25   2003.7   619.6 797.7 887.6 2860.8 3257.2
  28   1280.7   1023.3 810.5 604.5 703.2 592.2
  32   5819.5   1430 6760.1 736.3 5057.7 6877.1
  39   844.6   1193.7 2035 1643.1 906.4 977.6
  74   1794.4   871.9 1342.7 1207.3 873.7 438.1
  78   1290.5   705.2 582.1 644.6 514.2 463
  143   3606.5   1206.5 2610 3369.8 2462.6 1861.9
  174   5367.9   3197 2890.4 2641.3 4481.7 4198.6
  175   6112   1232.1 3258.2 5085.3 3246.9 2442.8
  191   628.6   681.7 1024.6 538.5 913.3 723.7
  212   762   850.9 1201 690.9 797.9 784
  222   3380.6   2257.2 1397.9 127.7 892.4 646.7
  233   1273.2   3459.2 5807.2 1846.1 4170.6 4565.3
  234   4522.8   4556.7 4306.9 2687.8 3024.6 3006.1
  237   1813.8   2740.9 2387.6 1999.7 1879.2 1708
  238   1451   897.7 1411.1 1416.3 966.6 803.6
  245   841.7   1542 14342 1628.4 1415.8 1521.5
  250   834.9   839.6 1111.7 2545.8 1268.4 993.6
  252   1612   1068.9 1122.2 1770.9 1379.1 1201.1
  264   1720.7   1090.3 766.9 838.8 728.4 1170.1
  290   548.7   268.4 292.1 366.4 284.9 221.6
  294   4894.2   2878.6 1768.4 2181.6 2628.2 1888.8
  296   1074.9   850.9 729.8 975.7 1061.8 946.4
  342   1199.4   802.1 874.5 1428.7 842.7 661
  13a9                
    ThyPapCan_02014_.... Signal
  SEQ ID   40092_H133A_831TT   40093_H133A_832TT 40095_H133A_834TT 40096_H133A_835TT 40097_H133A_836TT 40098_H133A_837TT@I2
  2   2960.8   3724.8 3502.5 2185.5 2399.1 2900.3
  3   1034.7   1002.2 539.6 891.3 452.9 1885.7
  5   1648.6   2929.1 3651.5 1611.8 3731.1 3572.5
  6   1182.8   1644.7 1603.7 966.3 2282.6 1818.1
  9   1637.6   5720.9 3045.3 1451 5197.6 7768.2
  12   4300.1   3654.1 3555.9 3946.6 2299 2877.5
  18   6335.8   3922.6 35796 3282.2 3591.8 9567
  22   3973   4533 5478.5 2546.7 2145.4 7779.3
  25   7110.3   2680 2117.2 5212.6 869.3 8325.2
  28   550.3   1205.7 660.4 730.8 642.1 734.4
  32   3476.1   8075.3 2580 2524.9 2454.3 7864.8
  39   973.9   1322.9 1283.6 847.1 1328.7 1793.3
  74   611.8   1044.9 1510.9 831.5 519.6 513.4
  78   179.8   404.5 344.8 133.7 582.7 482.5
  143   2333.5   2803.8 3730.7 2073.1 2033.9 3437.3
  174   2628.9   5254.6 2333.9 922.6 5487.5 2544
  175   3237.9   5586.3 5010.7 2460.9 2800.9 4480.6
  191   3519.9   1826.9 1761.9 1360.1 531.4 2426.7
  212   957   1085.3 1218.4 764.9 1307.9 1161.5
  222   3373.3   1016 723.7 3010.1 829.7 1642.5
  233   11130.3   16096.3 16100.3 10534.6 2873.8 13024.1
  234   2382.4   2253.4 2236.6 842.4 1767 3577
  237   2141.4   2712.7 2749.7 2120.8 2269.8 2385.2
  238   719.8   2489.2 1179.1 806.1 1516.4 1336.1
  245   873.3   1325.6 1693 870.8 1916.1 1869.5
  250   2074.6   1295 2234.8 1654.7 2588.9 1104.7
  252   1496.7   2444.6 2031.7 1605.8 1808.1 1897.2
  264   931.2   1035.7 1188 746.7 1074.1 1170.2
  290   499.8   667.3 663.2 626.9 634.4 625.8
  294   2010   4135.1 2098.4 1207.1 3722.1 2055.4
  296   2128.8   2051.6 2507.5 2156.5 1001 2034.3
  342   835.3   1453.2 1071.6 615.5 1084 1209
  13a10                
    Signal
  SEQ ID   ThyPapCan_02014_40099_H133A_838TT   02014_40317@I2_H133A_839TT pc_EA40200_VDX879TT pc_EA40201_VDX881TT pc_EA40202_VDX882TT pc_EA40203_VDX883TT
  2   3540.3   1657 3331 3253.1 3349.3 3642
  3   1796.3   750.6 1334.9 1248.1 477.8 697.9
  5   4954   2602 3011.2 2106.6 1504.9 2070.9
  6   2558.4   1246.9 1008.5 1708.8 874.6 704.3
  9   4538   3328.9 1097.2 1789.6 857.2 1327.3
  12   2544   64206 5055.9 4641.7 3423.2 2283
  18   9287.1   5085 6074.5 5157.6 7305.1 5996.2
  22   5415.4   3850 3537 3100.8 25884 2125.9
  25   12197.9   1296 3368.6 4941.8 1669.9 1166.3
  28   915.5   631 842.7 869.9 606.5 459.6
  32   4804.9   3883.4 3039.3 5722.3 3401.7 1964.8
  39   1409.2   756 1855 1356 1421 1878.4
  74   992.8   349.3 597.9 708.5 741.6 421.6
  78   352   259.9 342.4 434.8 172.4 207.2
  143   4199.6   1314.8 2573.1 5070.9 4302.9 3455.3
  174   3924.5   1399.1 5622.6 4217.6 1231.4 1538.6
  175   6697.7   1590 4211.2 7931.3 7786.3 5294.6
  191   3745.9   816.2 4875.5 3853.9 2370.5 1807.8
  212   1219.1   1064.8 1480.3 1571.7 1350.2 874.7
  222   8593.9   434.6 1994.5 2235.1 980.8 748.4
  233   20899.6   4657 37907.7 22959.2 12092.7 30504.2
  234   3763.9   1496.4 3667.8 2431.2 1077.6 1447.8
  237   2711.1   2046.9 1892.2 2919 2374.3 2011.2
  238   994.1   1047.3 1092.4 929.6 877.7 776.8
  245   1339.1   1019.4 2826.7 1538.1 1379.8 1254.9
  250   1736   710.8 2570.5 1597.7 1899 1923.8
  252   2933.1   1316.3 1849.2 2395.5 2621 1992.7
  264   1426.6   987 2826.4 1272.7 2511.5 1495.4
  290   666.3   368.1 487 665.4 692.1 706
  294   3175.8   1237.3 4622.7 3449.7 976.8 1074.7
  296   1837.4   899.1 2639 2217.5 2398.4 2049.3
  342   1355.2   406.1 1662.9 803.3 1501.3 863
  13a11                
    Signal
  SEQ ID   pc_EA40204_VDX884TT   pc_EA40205_VDX885TT pc_EA40206_VDX886TT pc_EA40207_VDX890TT pc_EA40208_VDX892TT pc_EA40209_VDX893TT
  2   2895.1   2994.7 3397.6 4328.1 4435.3 5980.3
  3   1144   932.3 1277.6 3294.1 1361.2 475.7
  5   1096.3   1151.4 4943.4 1741.8 3416.4 653.5
  6   916.1   945 1696.2 1593.6 1278.6 740.7
  9   2396.9   3699.9 1671.3 1488.6 2591.9 1644.7
  12   3762.2   3308.4 3095.3 2865.4 5996.8 1644.2
  18   9468.3   6669.2 10475.1 18044.8 9220.2 11224.4
  22   3823.4   28134 4326.9 7811.1 7837 1650.1
  25   1322.8   496 4227.5 4437.5 1271.9 878.1
  28   563.6   333.7 602.3 782.9 724.2 636.5
  32   4985.4   854.7 1788.8 24236 2099.4 629.4
  39   963   1562 1937.7 1973.7 1774.1 1387.6
  74   417   443.9 383.6 600.6 814.2 578.6
  78   169.9   266.2 346.4 207.7 507.4 117.7
  143   2357.9   2936.4 2961 6472.6 4513.4 3395.7
  174   1096.5   2227.9 1825 1268.5 5715 380.6
  175   3559.1   4964.6 5395 12711.3 7961.8 5835.7
  191   5320.2   8526.6 4668.1 4129.5 8005 4597.4
  212   1168.2   1634.1 1396.4 1197.9 1161.5 1523.4
  222   805.9   739.2 2330.2 3890.4 873.8 1137.4
  233   25920.6   37919.3 33438 7762.1 23325.9 20940.7
  234   1882.8   1513.3 2438.5 4528.9 2232.7 1253.9
  237   2950.6   1089.9 1886.5 2930.8 2484.4 2082.4
  238   1037.6   1240 818.8 709.9 1096.2 877.7
  245   1151.5   1872.7 2007.7 1316.8 2042.3 17174
  250   3324   2153.4 2363.7 1885.4 2787.9 3192.2
  252   1641   1236.4 1558.6 2605.7 1944.3 2613.4
  264   1120.1   2994.2 1776.5 1401.3 1517.1 2362.8
  290   624.5   516.4 692.1 520.9 778.6 1209.7
  294   1273.1   3050.2 1557 1442.8 3564.9 424.1
  296   2601.3   2023.6 2244.2 2054.1 2777.3 4347
  342   723.2   973.5 587 2028.3 679.8 746.2
  13a12                
    Signal
  SEQ ID   pc_EA40210_VDX894TT   02014_40319_H133A_903TT 02014_40320_H133A_904TT 02014_40328_H133A_928TT 02014_40330_H133A_932TT 02014_40331_H133A_933TT
  2   4563.8   3729.2 2859.9 2475.3 3977.2 3386.3
  3   475.3   827.4 4626.7 246.3 2009.2 1481.9
  5   1717.5   3130.1 1505.9 1737.4 1875.4 4763.8
  6   698.9   1153.2 2668.3 1104 1295.6 1385.4
  9   3017.4   3184.7 63162 1911.3 3801.9 2397.6
  12   1662.3   4850 7880.2 3035 58742 3637.2
  18   7186   6997.2 14099 11077.7 97246 10100.8
  22   1195.2   35834 60582 4793.7 6322.5 4421.3
  25   587.7   868.9 28372.6 667.2 3526 4180.2
  28   728.8   621.5 1685.2 927.3 646.4 882.8
  32   1503.2   2517.7 19668.7 3607.7 1986.9 6399.2
  39   1609.6   1035.8 1140.8 1205 1028.3 1359.6
  74   240.3   718.9 623.5 583.2 953 1021.1
  78   196.7   180.1 370.1 543 465.1 373.7
  143   1838.3   3062.8 1513.2 2903.2 3977.3 3056.1
  174   1548.4   1810.3 1087.6 2910.3 3511.6 3939.4
  175   2462.7   5545.8 1615.4 4518.2 7011.2 5184.6
  191   3292.3   2497.1 3479.2 1085 5805.4 2286.3
  212   1289.5   1230.1 645.1 1795.6 639 1130.8
  222   502.2   315.3 8484.1 1685.8 4996.9 611.8
  233   40786.7   16562 17697.8 3636.9 19787.1 4648.9
  234   1248.4   1443.8 3706.7 2565.5 2733.9 2463.9
  237   1286.1   2852.7 2832.7 2610.7 2178.6 2681.6
  238   1382.7   1243.2 990.6 1636.4 1423.2 1469.2
  245   2174   1000.7 703.1 2029.7 1000.1 1302.2
  250   2738.6   2792.3 1076.8 1439.3 3692.8 2113.9
  252   1078.4   1615.7 2091.1 2366.6 1895.9 2343.2
  264   2550.8   948.7 659.3 1351 811.7 1104.5
  290   518.9   745.9 1057.1 320.1 924.2 1016.3
  294   1936.2   1348.3 757.5 2287.3 2252.8 2351.2
  296   2194.3   1614.6 1446.4 729.6 2955.6 1690.2
  342   612.9   748.4 366.7 1061 1424.5 894.4
  13a13                
    Signal
  SEQ ID   EA02014_40375_H133A_922TT   EA02014_40381_H133A_945TT EA02014_40382_H133A_946TT EA02014_40383_H133A_947TT EA02014_40384_H133A_948TT ThyFolCan_02014_40084_H133A_823TT
  2   1365.7   2250.1 3658.1 2299 3252.8 3923.8
  3   935.6   1124.4 1023.9 2053.6 1187.9 1004.7
  5   7421.5   656.7 3000.3 4151.2 1730.9 3450.3
  6   1026.6   982.6 1628.3 1939.7 1023.9 1961.6
  9   10992.7   1452.2 2478.2 4581.5 15676 3548.1
  12   3007   2413.7 5402.5 7248.9 43826 6659.3
  18   4106.5   1506.8 9107.3 6629.5 9741.9 7072.7
  22   4766.3   5493.6 3034.7 4839.3 4984.8 4552
  25   900.4   10351.1 1252.1 10112.4 3841.3 7385.3
  28   685.8   1022.4 714.8 1191.6 656.5 625.4
  32   3099   1984 3215.8 12202.3 3643.5 4762.1
  39   1012.3   480.5 1173 1094.8 781.3 1099.1
  74   987.5   61.1 894.2 877.1 629 887.2
  78   418.8   111.9 529.1 188.7 371.9 650.2
  143   2030.1   1411.5 3847.8 2492.6 2091.2 2930.6
  174   2288.4   269.9 4741.5 2231.5 2488.1 7050.3
  175   2412.3   2138.9 6427.7 3285.3 3691.4 4648.5
  191   617.2   983.6 2182.4 2820.3 2081 1043.3
  212   1018.3   208.9 1054.6 964.8 771.1 591.5
  222   725.6   8103.4 443.9 4302.9 1433 10512.7
  233   4450.9   1177.3 20110.2 19683.9 4061.3 2305.1
  234   1926   2079.3 2095.5 2384.8 2482.7 6283.2
  237   2547.1   2912.4 3040.2 2736.1 3005.4 2681.3
  238   1822.6   264.6 1023.9 929.2 1088.1 1176.9
  245   2699.8   113.4 1230.8 1359.8 887 10676
  250   1198.1   249 1802.6 1764.8 2220.8 1112.4
  252   1311.3   613.6 2434.6 2260.9 1332.7 1526.7
  264   1364.1   382.4 1206.6 1095.1 1210.5 1451.3
  290   463.8   1223.5 805.1 865.9 867.2 932.7
  294   2024.9   340.9 4085.3 1859.8 2226.1 3796.8
  296   1214   407.7 1761.4 1917.2 3476.1 1298.6
  342   619   828.4 1154.4 639.7 934.8 1110
  13a14                
            Signal      
  SEQ ID   ThyFolCan_02014_40085_H133A_824TT   ThyFolCan_02014_40086_H133A_825TT ThyFolCan_02014_40088_H133A_827TT ThyFolCan_02014_40089_H133A_828TT 02014_40318_H133A_840TT fc_EA40212_VDX896TT
  2   4103.9   1719.4 2824.5 3008.4 2542.7 2391.5
  3   2480.6   144.5 880.5 497.6 1222.4 1069.4
  5   2190.2   1199.7 4743.6 2936.8 3696.7 776.8
  6   678.7   2589.2 2824.2 1605.5 1411.7 1403.4
  9   6166.7   326.1 5428.5 2131.9 9187.9 10000.2
  12   2872.9   12574 4087.8 3981.7 7045.4 3506.5
  18   2329.3   4765.3 11522.2 72662 30454 4683.8
  22   4848.8   1224.5 3611.1 6001.8 8784.3 1719.5
  25   11319.1   234.5 3308.3 557.1 3685.8 383.8
  28   772.7   206.8 830.6 1057.1 1256.5 1118.8
  32   2272   1393.6 5410.1 3330.7 13825 869
  39   892.8   541.9 838.5 941.3 1041 3461.7
  74   126.8   2071.9 988.3 1357.4 696.8 1290.4
  78   253.6   307.6 635.6 920.6 422.6 705.5
  143   2718.2   1570.7 2322.8 2621.6 1038.8 4569.2
  174   323   1399.8 2817 9345.5 876 8849.8
  175   4437.4   1584 4411.6 4784.9 1387.4 7804.9
  191   2214.1   575.3 641 1208.4 1370.9 4609.8
  212   1667.2   662.2 577.7 1864 1229 590.6
  222   13107.5   470.6 1030.1 3020.4 431.9 641.9
  233   3196   1087.3 2932.3 7081 12626.7 4567.8
  234   1659.8   875 2243.3 1839.4 1328 7454.4
  237   4693.9   830.4 3700.2 2563.5 3040.5 1008.9
  238   206.3   244.8 1695.9 2546.2 1614.2 1880.4
  245   248.7   425 1273.9 1215.4 1333.6 2807.1
  250   935.9   379.8 2284.4 894.4 867.2 844.3
  252   1005.5   1023.3 1166.6 1884.3 1524.6 678.4
  264   231.5   589 1480.9 1740.6 670.9 2459.5
  290   638.4   233 961.8 668.6 430.5 996
  294   253.4   1349.8 3087.1 7034.2 1137.3 7044.6
  296   1530.8   1570.6 1591.5 1110.6 1647.7 1498.9
  342   1457.8   712.8 1052.1 946.9 436.8 1893.8
  13a15                
    Signal
  SEQ ID   fc_EA40214_VDX898TT   fc_EA40215_VDX899TT fc_EA40216_VDX900TT fc_EA40217_VDX901TT fc_EA40218_VDX902TT fc_EA40221_VDX909TT
  2   2470.7   3328.2 4429.9 993.8 2649.9 4427.4
  3   226.4   523.3 1216.2 230.6 417 2364.4
  5   1037.2   3614.5 1440.6 3319 1487.4 9590
  6   997   685.7 1641.1 967.8 2453.4 2537.2
  9   384.9   1897.9 973.8 468.7 915.8 3339.3
  12   9940.9   1131.5 4352.5 3202.2 4269.6 15203.8
  18   4150.7   7715.2 5736.9 555.2 9139.4 7575.1
  22   499.6   4466 2120.9 208.2 1563.8 1471.9
  25   1721.2   1912 10485.2 826.1 1002 6640.8
  28   232.1   706.8 501.4 122.6 779.4 645.5
  32   2728   3753 2644.6 1820.8 3404.5 10045.3
  39   1006.6   1103.3 1192.6 880.8 2310.8 346.3
  74   2642.5   794.9 374.8 1621.8 1057.9 294
  78   294.6   280.8 131.2 136.4 344.4 391.3
  143   2141.4   2616.6 5902.7 3882.4 3860.3 2140.3
  174   407.1   2263.7 4270 2006.5 2076.1 4365.1
  175   2416.2   3697.7 9897.3 6489.5 5706.6 2832.4
  191   1436.8   2900.8 5153.2 165.9 963.5 512.1
  212   839.5   1241.8 173.4 186.6 451.1 539.9
  222   293.5   1318.1 6645.4 426.3 623.3 2727.5
  233   17314.7   18164.1 2971.6 2885.8 2068.7 15099.5
  234   2834.4   1575.6 3547 904.6 3783.5 6440.1
  237   945.9   3105.6 1268.9 2192.9 893.1 2333.7
  238   702.9   1189 744.7 554.4 383.3 908.6
  245   9146   1729.7 1038.5 8394 7842 459.1
  250   845.9   2673.5 2175.8 702.9 1305.7 723.2
  252   1300.4   2454.3 1377.7 1061.4 1235.8 2869.5
  264   467.6   1277.4 638.7 2058 1076.7 569.3
  290   573.7   617.2 993.4 767.4 1281.3 866.1
  294   507.1   1922.1 2827.1 1753.3 1585.8 2926.6
  296   2156.2   2563 1832.8 533.7 2130.3 2647.8
  342   535.6   937.4 2261.2 1949.2 975.7 618.5
  13a16                
    Signal
  SEQ ID   fc_EA40222_VDX910TT   EA02014_40386_H133A_954TT EA02014_40394_H133A_967TT EA02014_40395_H133A_968TT EA02014_40396_H133A_969TT EA02014_40397_H133A_970TT
  2   2362.1   1417 1834.4 2135.5 3006.2 1378.8
  3   254.5   263.8 293.1 346.3 346.5 230.2
  5   2789.1   8005.6 1358.8 637.3 1214.1 1214.7
  6   1096.6   1050.7 1291.7 987.4 1223.8 2564.6
  9   834.8   1052.4 2701.2 1171.4 6080 4404
  12   2031.6   4283.7 2805.1 4294.1 4966.4 1686.4
  18   9498.5   18893.2 15454.8 23854 3575.2 14192.8
22     1449.5 1020 888.3 25854   3721.5 727.2
25     541.3 253.5 572.5 3378.9   709.2 372
28     502.3 443 868.6 378.9   964.1 341.9
32     1205.9 7122.3 811.1 2553.9   797 1753.4
39     1215.4 1031.1 511 722   2166.1 527.2
74     1385.4 1477.5 509.1 1159.3   1031.4 495.4
78     367.5 834.2 200.7 235.3   433.7 262.9
143     1956.7 1083.1 1486.7 2746   2538.4 1943.7
174     4466.4 2823.7 1156.3 3547.9   1541.5 2214.3
175     1799.1 1264 1758.7 4539.5   3663.3 2264.4
191     560.2 1494 607.9 2943.2   2727.4 1474.7
212     1046.1 307.8 548.8 340.4   982.9 647
222     248.3 533.1 1356.4 1194.2   1169.8 228.2
233     1183.6 612.7 1063.5 5528.1   6265.3 1076
234     1886.7 3460.3 1207.2 2384.1   3860.6 3322.7
237     934.6 933.5 2133.3 791   1890.7 1257.5
238     834.9 291.1 1034.1 380.8   946.4 415
245     1509.7 786 657.1 384.9   47734 447.6
250     2406.8 1245.7 1247.1 659   1647.1 595
252     1536.8 1182.4 1035.6 526.7   893 1276.7
264     1315.8 582.8 1373.3 958   1622.3 508.6
290     732.4 835.3 683.8 1744.3   523 850.2
294     2817 2372.1 929.5 2611.8   1279.4 2097.3
296     1088.4 1541 2014.6 621.4   1368.3 981.8
342     763.6 866.9 722.2 559   594.3 445.9
13a17                  
  Signal
SEQ ID     EA02014_40405_H133A_982TT EA02014_40406_H133A_983TT pb1 pb2 pb3 pb4 pb5'
2     3615.6 2106 204.1 324.2 178.6 206.6 249.1
3     765 372.2 171.5 303.8 159.1 61.1 90.2
5     2446.1 394 107.9 19.9 68.8 74.6 69
6     1051.6 1176 211.6 214.4 318.6 218.8 355.4
9     2059.2 1081.4 18 18.8 29.2 9.7 134
12     3520 391.5 11.4 9.1 9.9 33.8 8.3
18     46334 13339 942 50.7 67.6 74.9 29.7
22     6678.8 403.5 80.1 8.8 58.9 6.8 5
25     1229.9 601.5 197.1 106.7 65.2 18.7 87.1
28     778.2 197.6 11.4 9 6.6 11.3 1.7
32     1302.9 4182.6 8.7 216 30.5 66.1 646
39     2104 699 246.6 121.1 42.8 122.8 171.7
74     1189.7 1142.3 59.9 94.4 70.3 29 5.2
78     833.6 80.9 48 45.5 66.9 49.5 119.7
143     2842.2 1778.2 290 291.2 190.4 65.6 203.5
174     3307.3 329.8 29.8 42.8 25.8 11.5 13.9
175     3878.9 1966.8 165.7 282.2 158.3 67.2 249.2
191     3952.2 5278 46.8 181.4 148.2 97.3 152.1
212     2201.7 1134.3 70.4 55.7 3.5 87 99.8
222     469 451.7 120.4 148.4 57.7 49.6 58
233     11069.2 5607 13 9.8 76.9 7.6 5.6
234     3590 3449.5 186 201.3 165.3 54.8 141.1
237     2057.6 740.2 135.3 150.7 16.6 14.5 19
238     1679.4 432.3 22.5 43.8 11.4 6.2 5.1
245 2301.6   1040.5 7.1 12.7 22.5 5.9 4.6
250 434.6   1079.4 97.8 138.9 33.7 71.3 46.8
252 844.6   1446.1 35.6 80.3 29.6 63.3 88.5
264 1125.7   1084.5 102.8 31.6 147.6 101.5 163.3
290 347.9   466.9 35.8 7.3 14.4 87.2 71.7
294 1601.7   575.2 105.1 94.5 41.5 31.9 35.3
296 872.8   1349.1 217.8 162.6 186.2 171.2 212.5
342 948.7   482.8 296.5 220.2 308.1 126.3 298.6
13a18                  
  Signal
SEQ ID pb6' pb7' pb8' pd10 pd11 pd12 pd9    
2 167.6 233.9 232.6 116.9 204.5 293.2 79.6    
3 110.3 88 221 90.7 74.9 90.9 129.5    
5 29.7 70.2 182.5 32.6 34.5 130.5 12.9    
6 226.5 440.1 207 219.7 275.2 134.1 193.1    
9 28.9 26.8 30.9 32.4 5.3 25.7 20.3    
12 6.6 18.8 61.4 38.2 10.7 4.1 11.6    
18 53.6 37.8 31.8 68.8 117.5 70.8 23.4    
22 16.9 10.6 32.1 6.4 29.2 21.2 3.1    
25 127 123.7 118.7 9.7 10.6 64.9 6.6    
28 13.2 4.9 43.3 8 8.1 12 26.6    
32 36.4 51.5 25.1 17.2 17.2 19.7 11    
39 326.5 182.6 27.3 126.2 151.9 26 182.7    
74 48.2 29 39.8 55.3 35.3 76.9 6.2    
78 92.1 20 93.1 26.9 11.9 56.6 104.8    
143 176.4 224.5 248.5 126.2 184.6 162.8 249.6    
174 40.8 149.8 24.2 10.6 15.6 81.4 16.8    
175 262.5 355.4 207.8 21.6 213.9 175.8 193.7    
191 151.2 310.7 93.3 65.7 108.4 95.9 79.3    
212 93.6 121.7 141.7 30.8 42.6 108.4 93.2    
222 83.5 82.8 60.7 16.4 50 71.1 66.9    
233 144.7 64.1 16.5 7 6.4 70.8 8.1    
234 40.4 248.1 43.8 45.5 143.3 30.9 77.2    
237 223.8 141.1 129.5 57.6 21.9 11.5 18.7    
238 32.1 27.6 66.5 10.5 33.4 7.7 61.6    
245 19.8 7.2 4.8 2.4 11.2 4.7 6.7    
250 75.3 109.2 65.3 85.7 119.9 118.7 66.4    
252 103.8 91.2 94.8 17 69.3 76.1 22.5    
264 151 224.6 129.9 122.8 99.8 61.5 24.8    
290 21.8 27.6 50.7 49.3 6.2 2.7 3    
294 91.1 58.4 40 24.8 52.6 67.8 31.7    
296 282.4 299.2 143.7 149.7 197.9 205.2 147.7    
342 304.5 437.8 327.4 90.9 247.6 224.4 268.6    
Table 13b
    13b1              
      ThyMixBen_02014_....._Signal
    SEQ ID 40062_H133A_800TT 40063_H133A_801TT 40064_H133A_802TT 40065_H133A_804TT 40066_H133A_805TT 40067_H133A_806TT 40068_H133A_807TT
    83 156.7 114.3 73.8 240.3 173 193.6 291.2
    142 28.3 7.9 15 28.4 26.9 25.8 14.2
    136 100.4 23.4 15.3 106.9 28.3 14.9 56
    137 527.2 47 32.3 177.2 19.5 53.1 404.6
    152 237.5 230.4 188.7 465.2 183 260.9 190.9
    160 291.6 141.9 80.9 323.1 98.1 120.3 219.7
    163 15.6 53.1 25.1 70.8 20.1 56.1 47.8
    165 190.2 230.7 142.4 483.5 179.3 241.7 410.9
    210 436.9 8.6 18 4 343 7 31 8 8 148 1
    214 269.2 134.4 167.7 141.8 145.7 140.5 349.1
    219 217 19.6 7.9 85.6 21.8 26.7 115.1
    224 588.2 121.4 119.2 275.2 136.6 240.5 392.5
    225 508.1 30.1 42.3 107.3 59.4 29.5 147.3
    226 57.3 48.2 80.9 103.9 43.5 12.3 88.9
    309 257.7 20.6 20.7 75.5 19.2 24 833.9
    332 287.1 17 26.1 103.8 115.9 68.8 37.9
    346 622.2 18.3 24 17.6 20.7 13.2 442.1
    13b2              
      02014_...._Signal
    SEQ ID 40198_H133A_871TT 40321_H133A_913TT 40322_H133A_914TT 40323_H133A_915TT 40324_H133A_917TT 40325_H133A_918TT 40326_H133A_919TT
    83 39.5 90.2 76.4 255.8 150.8 109.7 142.6
    142 25.3 22.5 12.9 34.4 8.5 20.6 67.7
    136 9 8 31 8 16 8 132 115.4 130 7 2
    137 227.8 53.6 68.7 60.1 29.6 133.2 29
    152 1844 214.3 253.2 294.7 313 8 251 3 294 2
    160 267.5 250.7 206.5 328.8 191.1 220.8 114.4
    163 44.3 56.4 23.9 98.5 35.2 68 41.8
    165 174.3 221.8 185.5 266.3 104.6 217.1 159.9
    210 25.8 153.6 42 55.5 8.2 152 49.1
    214 43.2 83.6 110.6 213.7 132.6 89.3 138.7
    219 56.6 53.5 45.2 122.1 24.4 124 41.3
    224 269.7 357.4 260 191.5 168.8 330.8 169.8
    225 204.8 322.6 114.2 55.6 121.8 317.6 95.8
    226 11.9 57.4 69.9 22 67.5 33.6 19.3
    309 37.9 37 28.2 53 10.2 32.6 24.2
    332 9.9 59.2 66.7 57.8 13 40.5 9
    346 28.6 263.8 11.1 27.2 32.5 153.4 61.1
    13b3 ThyFolBen_02014_...._Signal
  SEQ ID   40079_H133A_818TT 40080_H133A_819TT 40081_H133A_8 20TT 40082_H133A_8 21TT 40083_H133A_822TT
    83 63.4 151.8 104.1 27.5 138.9
    142 15 20.2 18.8 18.4 23
    136 176 1354 771 317 7 177
    137 24.7 36.1 18 28.4 42.6
    152 62.1 176 7 1464 273 328 2
    160 153 138.1 131 192.4 124.1
    163 6.9 21.4 35 63.5 13.4
    165 28.7 155.3 112.6 135.2 222.6
    210 27.5 104 8 1 9.9 32 7
    214 211.7 210.8 83.9 166.4 160.2
    219 18.7 111.8 45.7 10.9 29.6
    224 114.7 241.8 100 160.9 242.4
225   69.4   63.8   145.4   110.9 126.1
226   92.1   95.5   36.3   74.5 67.2
309   16.8   17.1   12.1   18.8 27.3
332   40   20.5   21   11 6.6
346   7.2   27.1   18.6   29.3 26.7
13b4                  
  fa_EA40...._Signal
SEQ ID 191_VDX 862TT 192_VDX 863TT 193_VDX 864TT 194_VDX 865TT 195_VDX 866TT 196_VDX 867TT 197_VDX 869TT 219_VDX 907TT 220_VDX 908TT
83 26.9 24.4 56.7 155.1 22.3 94.8 63 41.9 256
142 16.1 22.4 14.5 54.8 17.1 15.8 15 21.9 23.9
136 23 59 204 121.4 12 5 884 77.1 47.1 36 8
137 26.7 32.4 19.4 45.2 32.8 30.8 55.2 97 63.4
152 75.7 189 4 211.4 320.4 194 75 314 114.8 260
160 239.5 101.4 91.3 292.1 55.8 86.8 53.7 47.4 145.4
163 48.5 47.6 23.1 168.3 35.4 13.4 51.5 13.5 60.4
165 26.5 94.3 81.1 197.9 112.9 164.6 155.9 29.7 250.4
210 32 162 23.8 81 4 25 3 22 7 140.5 15 5 200.8
214 119.9 130.9 141.7 235.4 263.7 30.4 124.9 57 241
219 80.6 7.6 15.9 20.9 25 10 16.5 23.4 20.1
224 50.9 25.8 50 80 32.6 99.7 136.4 137.5 167
225 66.7 41.4 64.4 69.5 18.1 94.7 51.1 51.6 149.8
226 76 2.8 3.5 99.1 88.6 18.3 60.6 6.4 11.6
309 6.7 15.2 9.4 62.6 20.2 18.4 15.9 10.2 33.3
332 9.6 36.9 10.8 91.1 103 30.1 12.3 19.7 36.5
346 25.1 16.5 20.8 89.5 34.2 20.9 18.3 11.3 114.6
13b5                  
  02014_...._Signal
SEQ ID 40327_H133A_920TT 40332_H133A_938TT 40334_H133A_941TT 40335_H133A_940TT
83 216.8 145.1 58.1 102.1
142 22.2 7.9 6.6 14.5
136 717 30.1 47 9 57 1
137 47 48.1 69.1 57.4
152 252 7 189 184 9 138 7
160 153.6 179.9 121.9 66.6
163 53.3 13 69.1 38.6
165 146.9 146.7 85.1 120
210 24 59 1 21 5 23.8
214 90.5 224.7 178.7 139.3
219 36.4 17.5 14.9 63.3
224 180.2 126.6 154.2 161
225 59.7 197.5 83.8 115.7
226 31.9 58.1 33.7 34.5
309 27.1 7.7 6.2 15.6
332 83.8 7.3 40.7 47.6
346 17 18.8 27.6 24.4
13b6                  
  EA02014_...._Signal
SEQ ID 40374_H133A_921TT 40376_H133A_923TT 40377_H133A_924TT 40378_H133A_925TT 40379_H133A_926TT 40380_H133A_927TT
83 174.4 32.7   69.1   86.6 67.7 204.9
142 13.7 19.8 16.5 57.7 22.9 18.9
136 24 2 93 3 121 25 3 1904 163
137 38.3 143.4 28.2 77.5 156.1 255.2
152 178 2 230 8 155 237 4 235 7 70
160 125.2 133.5 74.9 197 170.1 234
163 54.4 4.9 35.5 46.1 65.2 59.9
165 218.9 155.1 206 85.6 158.1 532.5
210 36 8 25 20 6 25 8 19 8 551 3
214 126.7 186.1 167.2 114.1 191.9 161.4
219 32.3 25.5 15.3 10.2 95.6 18.5
224 220.8 301.9 129.7 178.8 308.2 270.9
225 158 221.8 127.9 65.6 269.4 60.6
226 37.5 33.4 9.6 57.2 80.8 18.1
309 26.1 16 17.8 31.8 32.6 49.2
332 26.3 38.5 6.7 11.6 73.5 21.6
346 20.7 151.4 21.6 19.2 179.4 53.6
13b7                  
  EA02014_...._Signal
SEQ ID 40387_H133A_955TT 40388_H133A_956TT 40389_H133A _957TT 40390_H13 3A_959TT 40391 _H 133 A_960TT 40392_H13 3A_961TT 40393_H133A_962TT
83 29.1 121.5 70.7 28.8 121.6 120.5 34
142 16.7 14.2 10.8 21.4 18.6 16.5 21.9
136 15.7 27 8 67 7 45 9 15.4 8.2 43
137 28.5 44.8 34.5 50 31.3 162.2 18.5
152 85.4 163 183.4 127.1 298.3 193 221 2
160 121.9 51.9 175 137.4 208.5 125.9 162.9
163 47 27.7 43.6 80.1 29.7 33.8 44.4
165 42.2 122.9 103.8 150.6 253.3 106.9 221.6
210 27 1 5.2 20 4 73.7 29.4 38 7 85
214 36.4 68.3 165.9 25.3 129.8 165.9 100.9
219 23.3 51 29.1 25.7 67 20.8 21
224 143.8 95.1 128.8 78.7 108.8 152 109
225 140.5 83.1 34.3 54 62.8 114.7 94.3
226 4.3 24.7 84.1 47.7 33.3 66.1 13.3
309 18.7 11.4 26.7 44.2 20.6 34.1 35.2
332 11.1 4 37.9 14.1 17.8 13.1 11.5
346 21.8 22.1 12.8 35.7 38.7 29.6 10.1
13b8                  
  ThyPapCan_02014_...._Signal
SEQ ID 40090_H133A_829T T 40091_H133A_830T T@IC T 40092_H133A_831T T 40093_H133A_832TT 40095_H1 33A_834T 40096_H133A_835TT 40097_H133A_836T 40098_H133A_837T T@I2 40099_H133A_838TTT@I2
83 304.3 199.4 258.5 171.4 112.3 186.5 28.5 137.1 85.9
142 26 34.7 22.4 19.1 16.4 25.8 12.1 22.5 15.4
136 122 116.6 21 7 27 3 30 7 110.2 143.8 14 1 140 7
137 40.7 72 23.8 36.8 19.5 31.5 29.6 29.4 32.5
152 611 5 6826 236 2 170.5 2006 104 1364 118 155.9
160 218.1 244.7 225.4 68.9 76.4 137.3 83.2 171.7 66.7
163 17.8 119.7 38.9 51.8 52.9 42.7 25.5 57.2 23.6
165 370.6 263.3 353.8 154.1 165.1 233.2 100.4 227.9 213.3
210 223 2 249.8 303 9 7 30 2 94 5 2.1 9 42 1 23 1
214 178.4 246.7 155.4 188.3 138.6 133.7 69 298.6 90.8
219 36.8 136.4 64.9 24.3 19.1 25.9 11.7 13.6 23.8
224 364.7 427.6 180 191.1 163.6 211.5 36.2 303.4 164.5
225 333.9 213.5 221 124.6 137.2 195.1 46.9 145.6 34.3
226 179.2 142 57 15.4 13.7 91.8 33.5 132.6 38
309 88.8 136.6 48.8 22.1 23.6 40.5 16.5 33.9 24.7
332 92 56.9 42.4 61.1 34.4 70.6 14.5 108.1 60.8
346 58.4 61 197.1 38 29.1 281.5 37.5 30.8 33
13b9                  
Signal
SEQ ID 02014_40317_H 133A_839TT pc-EA40200VDX879TT pc_EA40200_VDX381TT pc_EA40202_V DX883TT pc_EA40203_VDX884TT pc_EA40204_VDX884TT
83 288.1 25.7 32.9 29.2 58.6 184.4
142 16.1 21.2 23.5 7.6 21.5 30.6
136 62 8 160 5 141 2 8 7 13.2 194
137 45.4 155.5 35.5 63.9 41.5 31.1
152 3834 197 90.6 123.7 188 240
160 142.9 79.7 181.6 205.4 157.5 137.8
163 60.7 102.6 138.6 54.2 51.3 9.1
165 238.4 123.6 227.1 113.8 224.6 241.9
210 134.1 184.4 28 4 42.7 12.1 10
214 189.2 173 50 164.7 109.2 250.7
219 26.8 30.4 13.3 78.6 34.3 34.8
224 215.8 139.5 255.3 96.9 181.3 163.2
225 167.5 35.2 98.1 32.4 221.4 54.3
226 49.4 43.4 84.3 16.4 141.3 27.2
309 340.8 33.7 32 22.8 29.8 45.9
332 15.9 39.9 67.1 14.3 88.5 25.2
346 14.3 37.9 36.5 33.5 28.7 37.4
13b10                  
pc_EA40....Signal
SEQ ID 205_VDX885TT 206_VDX886TT 207_VDX890TT 2 08_VDX892TT 209_VDX893TT 210_VDX894TT
83 22.9 81.5 34.1 159.2 88.7 43.6
142 26.4 22.6 8.7 10.8 18.5 23.6
136 28 1 182 96 146 17 3 22.5
137 207.7 94.4 68.8 83.7 17.8 140.4
152 77 3 225 5 227 5 95 5 301 1 59 9
160 294.5 164.6 110.3 84.5 105.6 163
163 140.9 34 25.9 9.3 41.9 83.5
165 241.8 101.4 193.6 161.9 112.8 190.6
210 416 8 24.1 45.5 15 1 26.9 15 6
214 323.3 139.5 148.2 174.1 104.5 142.1
219 131.8 12 25.9 48.9 14.4 96.2
224 215.3 136.2 126 193.8 195.4 67.1
225 49.5 105 54.2 97.4 33 38.8
226 181.1 55.2 34.1 4 15 125.1
309 89.7 47.4 58.6 17.8 23.5 72.3
332 73   66.8 48.2 27.5 41.1 25.3
  346   13.6 107.7 145.3 12.7 18.3 20  
      13b11            
02014_40...._Signal
      SEQ ID 319_H133A_903TT 320_H133A_904TT 328_H133A_928TT 330_H133A_932TT 331_H133A_933TT
      83 58.3 226.6 92.7 79.2 106.8
      142 14.5 118.3 7.4 8 13
      136 18 9 87 29 4 86.9 9 3
      137 24.4 174.5 25.7 135.9 31.8
      152 178 1 137.4 176 6 151 1 166 2
      160 83.2 205 93.5 103.6 73.3
      163 31.5 100.2 19.2 40.5 33.4
      165 223.6 235.9 161.5 132.8 132
      210 35 5 99.9 164 124 9 17
      214 118.8 178 25.6 115.6 117
      219 15.1 80.2 27.7 54.2 38.2
      224 174.5 323.7 223.3 203 185.3
      225 170.3 200.3 180.3 157.3 157.2
      226 65.2 26.9 25.2 4.6 35.7
      309 11.1 270.7 15.9 91.5 18.6
      332 12 77.4 15.1 7.2 7.3
      346 30.4 387.2 25.4 204.4 14.7
      13b12            
EA02014_403...._Signal
      SEQ ID 75_H 33A_922TT 381 _H133A_945TT 382_H133A_946TT 383_H133A_947TT 384_H133A_948TT
      83 88.1 22.1 193.5 166.9 73.7
      142 12 19.6 13.2 21.6 16.9
      136 11.8 12.3 52 46.7 107.9
      137 51.3 182.2 43.4 23.9 14.5
      152 341 139 248.1 299.5 84.2
      160 95.6 102 99.5 184.6 90.4
      163 77.6 41.6 25.9 51.7 4.9
      165 254.8 63.5 172.6 209.2 81.1
      210 8.8 78 8 28.8 44 92 5
      214 79 94.5 148.7 101.7 82.7
      219 22.7 31.8 63.9 23.8 18.1
      224 190.5 194.7 212 278.9 201.1
      225 48.8 135.5 46.8 274.4 97.5
      226 20 24 90.8 45.3 8.8
      309 30.5 195 29.7 53.4 37.9
      332 10.4 4.2 9.2 75 44
      346 19.1 298.1 35.4 251.2 37.2
      13b13            
ThyFolCan_02014_400...._Signal
      SEQ ID 84_H133A_823TT 85_H133A_824TT 86_H133A_825TT 88_H133A_827TT 89_H133A_828TT
      83 242 70.1 71.2 24.8 145.4
      142 19.8 12.5 35.7 17.7 19.6
      136 35.7 21 594 20.3 9
      137 86.1 23.7 116.3 18.2 48.6
      152 271 9 199 6 213 1 140 9 202 5
      160 134.5 134.1 415.1 162.4 154
      163 64.5 46.1 82.1 15.5 57.8
      165 189.3 222.6 239 223.8 145.3
      210 8 8 236.3 50 21 8 33.8
      214 33.1 71.3 121.6 157.3 100.8
      219 24.8 64.8 82.1 14.1 60.2
      224 213 96.4 371.4 159.1 92.8
      225 164.9 103.3 224.4 145.2 98.3
      226 10.3 5 24 30.5 21.6
      309 1.5 33.3 38.4 14.9 19.5
      332 77.6 55.7 124.1 10.6 84.5
      346 39 124.6 31.5 29.5 21.3
    13b14              
Signal
    SEQ ID 02014_40318_ H133A_840TT fc_EA40212VDX896TT fc_EA40214VDX898 fc_EA40214VDX899TT _fc_EA40216 VDX900TT _fc_EA40217VDX901TT _fc_EA40218_VDX902TT
    83 189.9 110.7 51.4 35.8 104.6 22.8 44.7
    142 18.9 20.5 246.2 18.5 20.7 25.8 28.6
    136 22 6 1052 21 3 20 3 33.6 24 5 24 4
    137 32.3 40.6 46.3 33.8 23.9 90.6 48.5
    152 306 216.1 317.8 169.4 154 6 169 3 374.5
    160 99.5 197.5 240.4 68.7 146.7 210.5 196.7
    163 52.1 29.7 70.7 75.3 13.6 25.1 59.3
    165 236.7 142.2 147 168 174.2 28.5 89.4
    210 26.5 108.8 39.9 25.5 42.2 11 24.4
    214 115.1 63.9 59 129.9 150.1 194.1 218.2
    219 16.2 46.2 62.7 39.4 42.1 92.3 39.4
    224 203.9 194.6 321.6 225.2 238 212.5 187.3
    225 198 53.6 82.8 52.5 250.7 72 53.5
    226 31 57.3 111.5 66.7 24.8 41 79.9
    309 17.2 40.2 44.3 43 20.2 21.3 63.8
    332 51.4 26.5 129.1 7.8 5.6 37.8 31.6
    346 11 27.8 21.6 19.4 320.2 24.3 210.6
    13b15              
Signal
    SEQ ID fc_EA40221_VDX909TT fc_EA40222_VDX910TT EA02014_40386_H133A_954TT EA02014_40394_H133A_967TT EA02014_40395_H133A_968TT EA02014_40396_H133A_969TT EA02014_40397_H133A_970TT
    83 312.2 30.6 105.2 42.2 47.9 24.4 46.5
    142 17.8 9.9 41.7 30.5 11.8 16.7 25.7
    136 63 1 87 5 38 29.7 206.6 10.1 99 2
    137 170.4 111.7 78.6 32.1 20.5 29.4 18.2
    152 624 7 121 2154 212 7 160 9 216 7 214 3
    160 257.9 30.5 247.6 161.6 192.1 87.9 538.8
    163 21.6 27 43.5 50.7 56.9 72.4 48.7
    165 251.9 105 92.7 136.3 166.5 123.8 180.7
    210 34 37 1 49 43 9 21.1 7.4 167.3
    214 257.9 176.9 168.4 69.9 113.6 140.1 14
    219 117.7 25.9 43.2 42.1 24 12.6 57.9
    224 55.1 150 176.9 280.1 184.5 194.4 374.4
225   205.8 42.3 66.7 320.3 27 66.2 54.6  
226   84.9 14.8 148.6 26 47.6 36.9 21.7  
309   745.6 9.7 53.6 42 23.5 14.6 71  
332   15.9 7.6 69.9 10.3 63.6 13.3 26.8  
346   48 25 54.7 142.6 73.6 16.6 79.7  
13b16                  
Signal
SEQ ID EA02014_40405_H133A_982TT EA02014_40406_H133A_983TT pb1 pb2 pb3 pb4 pb5'
83   58.1 37.8 524.7 768.1 855.6 1163.9 876.3  
142   19.4 37.4 1289.3 1088.9 659.7 732.6 1415  
136   94.2 11 1770.1 1622.7 523 979.3 617.3  
137   17.6 111.2 1871 2554 1313.7 781.1 2087.1  
152   255.8 37.6 9696.4 9404.8 1099.8 9900.7 7060  
160   54.7 347.1 1273.2 1171.6 907 937.4 926.2  
163   32.9 59.7 833 682.5 559.6 358.6 467.8  
165   71.8 232.2 1217.8 986.3 1278.9 2537.1 1464.3  
210   5.2 33.2 3401 2546.6 756.3 4893 1770.8  
214   212.6 165 1384.3 1540 636.7 656.5 1213.9  
219   14.1 43.6 1692.6 1911.7 912 2112.2 1417.9  
224   137 228.2 2481.3 3104.2 1905.4 2292.4 2242.7  
225   109.6 153.4 2367.8 2860.8 1887.6 2052.6 2116.6  
226   41.6 11 797.2 969.9 617.1 626.6 920  
309   20.7 30.6 5357.5 5904 3572.6 6641.9 3580.9  
332   6.8 60.2 258.8 304.2 110.1 255.3 363.3  
346   27 57.3 1584.2 1705.7 1507.5 1626.3 1587.1  
13b17                  
Signal
SEQ ID pb6' pb7' pb8' pd10 pd11 pd12 pd9    
83   818.3 333.4 692.1 709.8 699.7 926.2 534.7  
142   1287.9 565.2 2004.2 821.9 904.8 1619.6 282.5  
136   1039 962.2 1013.5 1147.9 840.9 599.8 284  
137   1881.4 667.8 2058.1 2182.1 1678.9 1632.9 912.1  
152   3657.7 2025 2993.5 11129.5 11824.2 11963.5 1607  
160   1154.5 301.4 473.2 1849.5 1424.7 660 242  
163   628.5 6534.3 1863.4 148.7 247.4 230.2 4365.5  
165   1485.2 889 1904.7 1404.2 1560.4 1123.6 1919.6  
210   2729.4 595.1 5286.8 7026.5 3861.8 1028.8 1724.8  
214   871.1 442.3 644.5 936.1 939.1 852 347.7  
219   1216.7 411.6 959.7 1085.9 1974.4 1580.2 245.9  
224   2039.1 1488.3 1844.2 2280.1 2823.4 1252.3 1246.6  
225   2214.8 1494.6 2186 1961.1 2628.3 1184.1 1230.7  
226   822.2 437.3 684.5 540.5 839.5 1062.8 273.7  
309   3787.2 1401.5 3537 3522.1 4518.1 5118.1 931.7  
332   218.7 630.6 462.4 168.4 220.4 242.2 322.8  
346   1682.4 909 2317.6 1193.5 1748.9 2159 622.7  
Table 14 Control Genes
Control Genes for Blood  
SEQ ID NO: Gene Name
   
142 Bone marrow stromal cell antigen 1 (BST1)
219 Leucocyte immunoglobulin-like receptor-6b (LIR-6)
309 Bridging integrator 2 (BIN2)
Control Genes for Thyroid
9 Cysteine-rich, angiogenic inducer, 61 (CYR61)
12 Selenoprotein P, plasma, 1 (SEPP1)
18 Insulin-like growth factor-binding protein 4 (IGFBP4)
Table 15a
SEQ ID BN_800TT BN_801TT BN_802TT BN_804TT BN_805TT BN_806TT BN_807TT
142 28.3 7.9 15 28.4 26.9 25.8 14.2
219 217 19.6 7.9 85.6 21.8 26.7 115.1
309 257.7 20.6 20.7 75.5 19.2 24 833.9
  BN_871TT BN_913TT BN_914TT BN_915TT FA_917TT FA_918TT FA_919TT
142 25.3 22.5 12.9 34.4 8.5 20.6 67.7
219 56.6 53.5 45.2 122.1 24.4 124 41.3
309 37.9 37 28.2 53 10.2 32.6 24.2
  FA_818TT FA_819TT FA_820TT FA_821TT FA_822TT FA_842TT FA_862TT
142 15 20.2 18.8 18.4 23 10 16.1
219 18.7 111.8 45.7 10.9 29.6 28.9 80.6
309 16.8 17.1 12.1 18.8 27.3 15.2 6.7
  FA_863TT FA_864TT FA_865TT FA_866TT FA_867TT FA_869TT FA_907TT
142 22.4 14.5 54.8 17.1 15.8 15 21.9
219 7.6 15.9 20.9 25 10 16.5 23.4
309 15.2 9.4 62.6 20.2 18.4 15.9 10.2
  FA_908TT FA_920TT FA_938TT FA_940TT FA_941TT Fa_EA40374_921TT Fa_EA40376_923TT
142 23.9 22.2 7.9 6.6 14.5 13.7 19.8
219 20.1 36.4 17.5 14.9 63.3 32.3 25.5
309 33.3 27.1 7.7 6.2 15.6 26.1 16
  BN_EA40377_924TT Fa_EA40378_925TT Fa_EA40379_926TT BN_EA40380_927TT Fa_EA40387_955TT Fa_EA40388_956TT Fa_EA40389_957TT
142 16.5 57.7 22.9 18.9 16.7 14.2 10.8
219 15.3 10.2 95.6 18.5 23.3 51 29.1
309 17.8 31.8 32.6 49.2 18.7 11.4 26.7
  Fa_EA40390_959TT Fa_EA40391_960TT Fa_EA40392_961TT Fa_EA40393_962TT PC_829TT PC_830TT PC_831TT
142 21.4 18.6 16.5 21.9 26 34.7 22.4
219 25.7 67 20.8 21 36.8 136.4 64.9
309 44.2 20.6 34.1 35.2 88.8 136.6 48.8
  PC_832TT PC_834TT PC_835TT PC_836TT PC_837TT PC_838TT PC_839TT
142 19.1 16.4 25.8 12.1 22.5 15.4 16.1
219 24.3 19.1 25.9 11.7 13.6 23.8 26.8
309 22.1 23.6 40.5 16.5 33.9 24.7 340.8
  PC_879TT PC_881TT PC_882TT PC_883TT PC_884TT PC_885TT PC_886TT
142 21.2 23.5 7.6 21.5 30.6 26.4 22.6
219 30.4 13.3 78.6 34.3 34.8 131.8 12
309 33.7 32 22.8 29.8 45.9 89.7 47.4
  PC_890TT PC_892TT PC_893TT PC_894TT PC_903TT PC_904TT PC_928TT
142 8.7 10.8 18.5 23.6 14.5 118.3 7.4
219 25.9 48.9 14.4 96.2 15.1 80.2 27.7
309 58.6 17.8 23.5 72.3 11.1 270.7 15.9
  PC_932TT PC_933TT Pc_EA40375_922TT Pc_EA40381_945TT Pc_EA40382_946TT Pc_EA40383_947TT Pc_EA40384_948TT
142 8 13 12 19.6 13.2 21.6 16.9
219 54.2 38.2 22.7 31.8 63.9 23.8 18.1
309 91.5 18.6 30.5 195 29.7 53.4 37.9
  FC_823TT FC_824TT FC_825TT FC_827TT FC_828TT FC_840TT FC_896TT
142 19.8 12.5 35.7 17.7 19.6 18.9 20.5
219 24.8 64.8 82.1 14.1 60.2 16.2 46.2
309 15 33.3 38.4 14.9 19.5 17.2 40.2
  FC_898TT FC_899TT FC_900TT FC_901TT FC_902TT FC_909TT FC_910TT
142 246.2 18.5 20.7 25.8 28.6 17.8 9.9
219 62.7 39.4 42.1 92.3 39.4 117.7 25.9
309 44.3 43 20.2 21.3 63.8 745.6 9.7
  Fc_EA40386_954TT Fc_EA40394_967TT Fc_EA40395_968TT Fc_EA40396_969TT Fc_EA40397_970TT Fc_EA40405_982TT Fc_EA40406_983TT
142 41.7 30.5 11.8 16.7 25.7 19.4 37.4
219 43.2 42.1 24 12.6 57.9 14.1 43.6
309 53.6 42 23.5 14.6 71 20.7 30.6
  pb1_Signal pb2_Signal pb3_Signal pb4_Signal pb5'_Signal pb6'_Signal pb7'_Signal
142 1289.3 1088.9 659.7 732.6 1415 1287.9 565.2
219 1692.6 1911.7 912 2112.2 1417.9 1216.7 411.6
309 5357.5 5904 3572.6 6641.9 3580.9 3787.2 1401.5
  pb8'_Signal pd10_Signal pd11_Signal pd12_Signal pd9_Signal
142 2004.2 821.9 904.8 1619.6 282.5    
219 959.7 1085.9 1974.4 1580.2 245.9    
309 3537 3522.1 4518.1 5118.1 931.7    
Table 15b
Signal
SEQ ID PC_984TT PC_986TT FA_987TT PC_988TT PC_989TT FA_992TT FA_993TT
142 25.1 125.8 18.5 11.4 13.7 11.8 113.6
219 17.5 25.7 22.6 11.3 32.4 61.9 25.2
309 27.6 661.8 15.7 47.4 29.8 33.2 28.4
  FA_994TT FA_995TT FA_996TT FA_998TT FA_999TT FA_1001TT FA_1002TT
142 20.6 25.2 34.1 21.8 31.4 12.7 74.6
219 25.9 36.3 41.5 24.9 16.4 26.5 25.5
309 16.8 26.3 19.4 22.5 14.6 32.3 47.7
  FA_1004TT FA_1005TT FA_1006TT FA_1010TT FA_1013TT FA_1014TT FA_1017TT
142 24.7 20.1 67.7 14.9 48.9 29.8 26.9
219 27.5 11.7 53.3 112.9 35.1 39.5 27.2
309 39.5 10.3 34.6 83.2 46.8 20 25.7
  FA_1018TT FA_1020TT FA_1023TT FA_1024TT FA_1026TT FA_1027TT FA_1028TT
142 25.1 25.5 35.6 26.9 27.9 15.2 24.5
219 66.3 26 6.1 26 59.3 18.8 31.7
309 31 37.4 20.6 46.9 51.4 8.8 33.3
  FA_1029TT FA_1030TT FA_1031TT FA_1032TT FA_1034TT FA_1035TT FC_1037TT
142 10.6 14 20 21.6 31.9 17.1 17.8
219 12.5 33.5 15.1 15 68.5 25.6 63.5
309 20.7 13 21.5 32.2 52.6 24.6 26.2
  PC_1039TT PC_1040TT PC_1041TT PC_1042TT PC_1043TT PC_1044TT PC_1045TT
142 16.1 22.7 99.5 129.8 20.7 79.2 16.4
219 27.9 29.5 23.6 29 21.5 19.8 38.4
309 585.3 37.6 48.7 41 32.8 59.2 112.3
  PC_1046TT PC_1047TT PC_1048TT PC_1049TT PC_1050TT PC_1051TT PC-FV_1052TT
142 22.4 81.2 24.7 37.5 25.9 26.6 26.4
219 11.1 97.5 61.8 14.9 14.7 109.1 30.6
309 22.5 207.8 35.4 64 39.5 56.7 28.2
  PC_1053TT PC_1054TT PC_1055TT PC_1059TT PC_1060TT PC_1061TT PC_1062TT
142 12.8 18.5 24.4 22.7 24.6 15.8 25.6
219 12.6 19.6 10.2 24.5 82.6 53.7 20.7
309 20.4 22.1 24 18.9 32.3 54.5 29.1
  PC-FV_1064TT PC-FV_1065TT PC-FV_1066TT PC-FV_1067TT PC-FV_1068TT PC-FV_1069TT PC-FV_1071TT
142 16.7 11.3 25.5 17.4 17.5 14 16.7
219 174.4 20.3 132.4 24.8 12.3 14.8 18.8
309 30.4 169.6 49.4 16.9 135.5 18.1 7.6
  PC-FV_1072TT FA_1073TT FA_1074TT FA_1075TT FA_1076TT FC_1077TT FC_1078TT
142 14 25.5 22.7 27 16.4 15.8 38.9
219 8.7 73.2 57 54.3 69.9 20 17.1
309 29.9 145.4 21 53 24.3 9.1 32.9
  FC_1079TT PC-FV_1080TT PC-FV_1081TT FC_1082TT pb1 pb2 pb3
142 31.9 10.2 22 19.5 1289.3 1088.9 659.7
219 40.1 17.4 38.3 88.1 1692.6 1911.7 912
309 27.1 23.4 40.8 12.9 5357.5 5904 3572.6
  pb4 pb5' pb6' pb7' pb8' pd10 pd11
142 732.6 1415 1287.9 565.2 2004.2 821.9 904.8
219 2112.2 1417.9 1216.7 411.6 959.7 1085.9 1974.4
309 6641.9 3580.9 3787.2 1401.5 3537 3522.1 4518.1
  pd12 pd9          
142 1619.6 282.5          
219 1580.2 245.9          
309 5118.1 931.7          

F. Signature Normalized to Control Genes



[0081] We further examined our 5-gene and 4-gene signatures by normalizing the signature genes to the three selected thyroid control genes, as an algorithm for gene chip data normalization. The average fluorescent intensity of the three thyroid control genes was used to normalize the signature gene signals. The performance of both signatures improved slightly after normalization. The sensitivity and specificity of the two signatures are listed in Table 16, and the ROC curves are shown in Figures 4a and 4b.
Table 16. Performance of the 4-gene and 5-gene signatures normalized to control genes
4-Gene Signature
  Tumor Benign
Positive 33 11
Negative 3 27
Sensitivity 92% (0.78, 0.97)
Specificity 71% (0.55, 0.83)
5-Gene Signature
  Tumor Benign
Positive 33 8
Negative 3 30
Sensitivity 92% (0.78, 0.97)
Specificity 79% (0.64, 0.89)
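
The normalization described in paragraph [0081] can be sketched as follows. This is a minimal illustration, not the applicants' implementation: the text says only that the average intensity of the three thyroid control genes (CYR61, SEPP1 and IGFBP4 in Table 14) was used for normalization, so dividing each signature signal by that average is an assumption, and the function name, gene labels `gene_A`/`gene_B` and intensity values are hypothetical.

```python
from statistics import mean


def normalize_to_controls(signature_signals, control_signals):
    """Divide each signature-gene signal by the mean control-gene signal.

    Assumption: normalization is a simple ratio to the control mean;
    the source does not state the exact operation.
    """
    if not control_signals:
        raise ValueError("at least one control gene signal is required")
    control_mean = mean(control_signals.values())
    if control_mean <= 0:
        raise ValueError("mean control signal must be positive")
    return {gene: value / control_mean
            for gene, value in signature_signals.items()}


# Hypothetical intensities for one sample (not taken from the tables above).
controls = {"CYR61": 200.0, "SEPP1": 100.0, "IGFBP4": 300.0}
signature = {"gene_A": 400.0, "gene_B": 50.0}

normalized = normalize_to_controls(signature, controls)
print(normalized)
```

Using a ratio keeps the normalized values unitless, so samples hybridized with different amounts of probe material can be compared on the same scale.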

G. Signature Validation with Two-Round Amplified Probes



[0082] To address the possibility that the FNA samples lack sufficient thyroid cells to provide enough probe material for hybridizing to the Affymetrix U133a gene chips after one round of amplification, we performed two-round amplification of the target RNAs with 47 samples from the 74-sample independent validation set. The data obtained show that the performances of the 5-gene and 4-gene signatures are identical with either one-round or two-round amplification. The ROC curves of the two gene signatures with the two different target preparations are shown in Figures 5a, 5b, 5c, and 5d.

Example 4


Cross validation with independent samples


A. Cross Validation with the 83 Independent Fresh Frozen Thyroid Samples



[0083] 83 independent thyroid samples were processed and profiled with the U133a chip. The number of samples in each category is listed in Table 17.
Table 17. Fresh Frozen sample collection for signature validation
Sample Type Number of Samples
Follicular Adenoma 18
Follicular Thyroid Carcinoma 1
Papillary Carcinoma, Follicular Variant 18
Papillary Thyroid Carcinoma 11
Adenomatoid Nodules 18
Adenomatoid Nodules w/Hashimoto 1
Multinodular Hyperplasia 16


[0084] The performance of the 4-gene signature and the 5-gene signature was assessed with LDA using the same cut-off value as in the training set. Both signatures gave equivalent performance in these samples, comparable with their performance in the 98-sample training set. The sensitivity and specificity for both signatures are shown in Table 18, and the ROC curves are shown in Figures 6a and 6b.
Table 18. 4-Gene and 5-gene signatures performance in 83 validation samples
  4-Gene Signature
  Tumor Benign
Positive 20 11
Negative 10 42
Sensitivity 67% (0.49, 0.81)
Specificity 79% (0.67, 0.88)
  5-Gene Signature  
  Tumor Benign
Positive 24 24
Negative 6 29
Sensitivity 80% (0.63, 0.90)
Specificity 55% (0.41, 0.67)
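
The sensitivity, specificity and bracketed intervals in tables such as Table 18 can be reproduced from the four counts. The sketch below is illustrative only: the source does not say how the intervals were computed, so the 95% Wilson score interval is an assumption (it happens to agree with the values reported for the 4-gene signature in Table 18 after rounding), and the function names are hypothetical.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """95% Wilson score interval for a binomial proportion (assumed method)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


def sensitivity_specificity(tp, fp, fn, tn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Counts for the 4-gene signature in Table 18:
# Positive row: 20 tumor, 11 benign; Negative row: 10 tumor, 42 benign.
tp, fp, fn, tn = 20, 11, 10, 42
sens, spec = sensitivity_specificity(tp, fp, fn, tn)
sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)

print(f"Sensitivity {sens:.0%} ({sens_ci[0]:.2f}, {sens_ci[1]:.2f})")
print(f"Specificity {spec:.0%} ({spec_ci[0]:.2f}, {spec_ci[1]:.2f})")
```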

B. Signature Validation with the 47 Fine Needle Aspirate (FNA) Thyroid Samples



[0085] 47 thyroid FNA samples were processed and profiled with the U133a chip. The number of samples in each category is listed in Table 19.
Table 19. FNA sample collection for signature validation
Sample Type Number of Samples
Follicular Adenoma 10
Follicular Thyroid Carcinoma 2
Papillary Carcinoma, Follicular Variant 3
Papillary Thyroid Carcinoma 13
Adenomatoid Nodules 10
Adenomatoid Nodules w/Hashimoto 1
Multinodular Hyperplasia 8


[0086] The performance of the 4-gene signature and the 5-gene signature was assessed with the LDA model. Both signatures gave equivalent performance in the FNA samples, comparable with their performance in the 98-sample training set. The sensitivity and specificity for both signatures are shown in Table 20, and the ROC curves are shown in Figures 7a and 7b.
Table 20. 4-Gene and 5-gene signatures performance in 47 FNA samples
  4-Gene Signature
  Tumor Benign
Positive 15 4
Negative 3 25
Sensitivity 94% (0.74, 0.99)
Specificity 62% (0.44, 0.77)
  5-Gene Signature  
  Tumor Benign
Positive 17 10
Negative 1 19
Sensitivity 94% (0.74, 0.99)
Specificity 66% (0.47, 0.80)
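
Applying a previously fitted linear discriminant with a cut-off carried over from the training set, as the validation paragraphs describe, might look like the sketch below. The weights, intercept, cut-off, gene labels and sample values are all hypothetical placeholders; the source does not disclose the fitted LDA coefficients in this section.

```python
def lda_score(signals, weights, intercept=0.0):
    """Linear discriminant score: intercept + sum of weight * signal."""
    return intercept + sum(weights[gene] * signals[gene] for gene in weights)


def classify(signals, weights, cutoff, intercept=0.0):
    """Call a sample positive when its score exceeds the frozen cut-off."""
    score = lda_score(signals, weights, intercept)
    return "positive" if score > cutoff else "negative"


# Hypothetical coefficients and cut-off, fixed once on the training set
# and reused unchanged on every validation sample.
WEIGHTS = {"gene_A": 0.5, "gene_B": -0.25}
CUTOFF = 1.0

validation_samples = {
    "sample_1": {"gene_A": 4.0, "gene_B": 2.0},  # score 1.5
    "sample_2": {"gene_A": 1.0, "gene_B": 2.0},  # score 0.0
}

calls = {name: classify(signals, WEIGHTS, CUTOFF)
         for name, signals in validation_samples.items()}
print(calls)
```

Keeping the weights and cut-off fixed is what makes the validation independent: nothing is re-tuned on the samples being scored.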

C. Signature Performance in 28 Paired Fresh Frozen and FNA Thyroid Samples



[0087] Within the 83 fresh frozen and the 47 FNA sample collections there are 28 matched pairs in which the fresh frozen sample and the FNA sample came from the same patient. The direct comparison of our signatures in these paired samples demonstrates how well the signature will translate into the final molecular assay. The performance of the 4-gene signature and the 5-gene signature was assessed with the LDA model. Both signatures gave equivalent performance in the fresh frozen and FNA samples. These results demonstrate that our 4-gene and 5-gene signatures can perform equally well in both sample types, and that the approach of using fresh frozen samples for gene/signature identification is valid. The sensitivity and specificity for both signatures are shown in Table 21, and the ROC curves are shown in Figures 8a and 8b.
Table 21. 4-Gene and 5-gene signatures performance in 28 matched fresh frozen and FNA samples
4-Gene Signature
  Fresh Frozen FNA
  Tumor Benign Tumor Benign
Positive 10 4 12 7
Negative 3 11 1 8
Sensitivity 77% (0.50 - 0.92) 92% (0.67 - 0.99)
Specificity 73% (0.48 - 0.89) 53% (0.30 - 0.75)
5-Gene Signature
  Fresh Frozen FNA
  Tumor Benign Tumor Benign
Positive 11 7 12 7
Negative 2 8 1 8
Sensitivity 85% (0.58 - 0.96) 92% (0.67 - 0.99)  
Specificity 53% (0.30 - 0.75) 53% (0.30 - 0.75)  

Example 5


Real-time quantitative RT-PCR


Sample Acquisition:



[0088] In order to determine whether a subset of the gene profiles and/or controls would give adequate specificity and sensitivity with RT-PCR, the following experiment was performed. This approach has the advantage of requiring only one round of RNA amplification.

[0089] A total of 107 thyroid biopsies were analyzed in our study: 26 follicular adenoma, 23 follicular carcinoma, 38 papillary carcinoma, 5 normal, 3 papillary carcinoma follicular variant, 3 Hashimoto thyroiditis, 2 microfollicular adenoma, 1 diffuse goiter, 1 goiter with papillary hyperplasia, 1 Hurthle cell adenoma, 1 hyperplasia with papillary structure, 1 multinodular goiter, 1 oncocytic hyperplasia, 1 thyroiditis. Total RNA was extracted using the Trizol reagent according to the manufacturer's instructions. RNA concentrations were determined by absorbance readings at 260 nm with the Gene-Spec (Hitachi) spectrophotometer. The isolated RNA was stored in RNase-free water at -80°C until use.

Primer and Probe Design:



[0090] The primers and hydrolysis probes were designed using Oligo 6.0 and the GenBank sequences for thyroid cancer status markers (Table 22). The primer and probe sets were designed such that the annealing temperature of the primers was 62°C, that of the probes was 5-10°C higher, and the amplicon size ranged from 100 to 150 bp. Genomic DNA amplification was excluded by designing our primers around exon-intron splicing sites. Hydrolysis probes were labeled at the 5' nucleotide with FAM as the reporter dye and at the 3' nucleotide with TAMRA as the quenching dye.
Table 22
Target Accession # Forward Primer (SEQ ID NO:) Reverse Primer (SEQ ID NO:) Probe Sequence (SEQ ID NO:) Product Length
SGENE NM_003020 gaaagcggaggagtgtcaatcca (364) ggttttcgtctagcatcttctcttta (365) atctacaaggacagagactggataatgttg (366) 130
TESTICAN1 AF231124 gtgagctgtgaagaggagcaggaa (367) ctttggtcccagctcccgttca (368) cctcaggggattttggcagtggtgggtccg (369) 102
GABRE NM_004961 taaactccgccatcctcgtatcaa (370) cagtggtgacaatctggcacacaa (371) ccgtgcccatgccccgtacccatgca (372) 113
CDH3 NM_001793 ctgaagcaggatacatatgacgtgca (373) aggatccagggcaggtttcgaca (374) catggcagtcgcacacagtggccctga (375) 124
FN1 NM_002026 tggtgccatgacaatggtgtgaacta (376) catcatcgtaacacattacctcatna (377) aagattgggagagaagtgggaccgtcaggga (378) 148
TPO-1 NM_000547 catctgtgacaacactggcctca (379) gccacacttgtcgtcttgaggaa (380) caaattccccgaagactttgagtcttgtgacagc (381) 148
TPO-2 NM_000547 aacctgcgtagactccgggaggc (382) gccagtgcgtgtccacctgca (383) ctcgggtgacttggatctccatgtcgctgg (384) 124
KCNAB1-1 NM_003471 tgaaagttccagggcttcactgaa (385) agctgaggtagtgtgcatcccaga (386) ctaccagtggttgaaagaaagaattgtaagtgaag (387) 138
KCNAB1-2 NM_003471 tagctgttgcgtggtgcctgagaa (388) tgatgtcatctttgggagaacctgaa (389) aaggtgtgagttctgtgctcctgggatcat (390) 119
FABP4-1 NM_001442 gaaagaagtaggagtgggctttgcca (391) ggcccagtatgaaggaaatctcaata (392) aaaggtactttcagatttaatggtgatcacatccc (393) 143
FABP4-2 NM_001442 aggaaagtcaagagcaccataacctta (394) gacgcattccaccaccagtttatca (395) ttgattttccatcccatttctgcacatgta (396) 123
DOI1-1 NM_000792 gggtaaatctggcccttggaactaca (397) cccgttggtcacctagaattgagata (398) cccagaggaagttcgtgctgttctggaaaa (399) 106
DOI1-2 NM_000792 tgcatcagatggctgggctttta (400) gctctggttctgcatggtgtcca (401) atgccctccccatgcatcctgcgt (402) 142
B-ACTIN NM_001101 tcacccacactgtgcccatctacga (403) cagcggaaccgctcattgccaatgg (404) atgccctcccccatgccatcctgcgt (405) 295
GOLGIN67 NM_015003 gcatggtgatctttgtgaggcga (406) ctcctggtggtcctgcatctca (407) ctcaccaacagcgtggagcctgcgcaagga (408) 139
PAX8 BC001060 actccagcttgctgagttcccca (409) actccagcttgctgagttcccca (410) attattacagttccacatcaaggccgagtgca (411) 125
HERC NM_003922 gggtgcttttcatgaggtttgtgtca (412) tgaggtaggcagactgtcgtaaggcc (413) ccaacactgctgacatttctcagagatttcaaat (414) 219
DDIT3 NM_004083 cctggaaatgaagaggaagaatcaa (415) agctctgactggaatctggagagtga (416) aatcttcaccactcttgaccctgcttctctgg (417) 142
ITM1 NM_152713 tggctggtcaggatatacaaggtaaa (418) tcaacgtcctaaatgtgatgtgctca (419) cctggataatcgaggcttgtcaaggacataaa (420) 117
C1OF24 NM_052966 tctgaaagtgataaaggaagctgcta (421) ctttagatctgttaagctggacacac (422) cttgaagaaacacaacttatttgaagataacatg (423) 103
MOT8 NM_018836 acggcctataacgagaccctgca (424) gaagaggagggtcggtttccattaa (425) ttctcacgagtgcgtcagggcatctgtgc (426) 133
ARG2 NM_00172 atttgaccctacactggctccagc (427) gatccaatgctgatagcaaccctgta (428) actcctgttgtcgggggactaacctatcgaga (429) 119
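
As a quick consistency check on Table 22, the listed product lengths can be compared against the 100-150 bp design window stated in paragraph [0090]. This is a minimal sketch; the dictionary simply transcribes the "Product Length" column, and the helper name `outside_range` is hypothetical.

```python
# Product lengths (bp) transcribed from the last column of Table 22.
PRODUCT_LENGTHS = {
    "SGENE": 130, "TESTICAN1": 102, "GABRE": 113, "CDH3": 124,
    "FN1": 148, "TPO-1": 148, "TPO-2": 124, "KCNAB1-1": 138,
    "KCNAB1-2": 119, "FABP4-1": 143, "FABP4-2": 123, "DOI1-1": 106,
    "DOI1-2": 142, "B-ACTIN": 295, "GOLGIN67": 139, "PAX8": 125,
    "HERC": 219, "DDIT3": 142, "ITM1": 117, "C1OF24": 103,
    "MOT8": 133, "ARG2": 119,
}


def outside_range(lengths, low=100, high=150):
    """Return the targets whose amplicon length is not within [low, high]."""
    return [target for target, bp in lengths.items()
            if not low <= bp <= high]


flagged = outside_range(PRODUCT_LENGTHS)
print(flagged)
```

Any target the check reports has a listed product length outside the stated window and may merit a second look at the table entry.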

Real-Time Quantitative RT-PCR:



[0091] Gene-specific real-time quantitative RT-PCR amplification of 21 thyroid cancer status genes and a housekeeping gene was performed using the TaqMan One-Step RT-PCR Master Mix (2x) (Applied Biosystems) and the ABI Prism 7900HT sequence detection system (Applied Biosystems). In a 25 µl one-step reaction, total RNA (10 ng) was added to a mix that contained 1x RT-PCR Master Mix, 0.25 U/µl Multiscribe Enzyme, 0.6 µM primers and 0.25 µM probe. Cycling parameters were 48°C for 30 min and 95°C for 10 min, followed by 40 cycles of 95°C for 15 sec and 62°C for 1 min. Real-time PCR monitoring was achieved by measuring the fluorescent signal at the end of the annealing phase of each cycle. The number of cycles needed to reach the fluorescence threshold was defined as the cycle threshold (Ct value). To minimize errors arising from the integrity of the RNA in the samples, β-actin mRNA was amplified as an internal reference. External standards were prepared by 10-fold serial dilutions of known thyroid cancer positive RNA and used to ensure linearity throughout our assays. Results were expressed as mean Ct values, and samples with a standard deviation greater than one were excluded. The results are provided in Tables 23a, 23b, 23c, 24a and 24b.
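
The replicate handling in paragraph [0091] (mean Ct, exclusion when the standard deviation exceeds one, β-actin as internal reference) can be sketched as below. Several details are assumptions rather than statements from the source: the use of the sample (n-1) standard deviation, the expression of the reference-normalized result as a ΔCt (target mean Ct minus β-actin mean Ct), and all Ct values shown, which are hypothetical.

```python
from statistics import mean, stdev


def summarize_ct(replicates, max_sd=1.0):
    """Return the mean Ct, or None when the replicate SD exceeds max_sd.

    Assumption: the sample standard deviation (n - 1) is used.
    """
    if len(replicates) < 2:
        raise ValueError("at least two replicates are required")
    if stdev(replicates) > max_sd:
        return None  # excluded, as in paragraph [0091]
    return mean(replicates)


def delta_ct(target_replicates, reference_replicates, max_sd=1.0):
    """Target mean Ct minus reference mean Ct (assumed normalization)."""
    target = summarize_ct(target_replicates, max_sd)
    reference = summarize_ct(reference_replicates, max_sd)
    if target is None or reference is None:
        return None
    return target - reference


# Hypothetical replicate Ct values for one sample.
target_ct = [26.0, 26.5, 27.0]   # mean 26.5, SD 0.5
actin_ct = [18.0, 18.5, 19.0]    # mean 18.5, SD 0.5
noisy_ct = [24.0, 27.0, 30.0]    # SD 3.0, so this measurement is excluded

kept = delta_ct(target_ct, actin_ct)
excluded = delta_ct(noisy_ct, actin_ct)
print(kept, excluded)
```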

[0092] The data show that the two-gene signature shown in Table 23b is not as sensitive and specific as the four-gene signature from which it was derived. Table 24 shows that the PAX8 gene is effective as a control for thyroid-specific tissue in an RT-PCR reaction.

[0093] Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, the descriptions and examples should not be construed as limiting the scope of the invention.


Table 25
Sequence identifications
SEQ NO: psid Name Accession No. Description
1 200635_s_at   AU145351 Hs.75216 prot tyrosine phosphatase, rec. type, F
2 200771_at   NM_002293 laminin, gamma 1
3 201069_at   NM_004530 matrix metalloproteinase 2
4 201117_s_at CPE NM_001873 carboxypeptidase E
5 201150_s_at   NM_000362 tissue inhibitor of metalloproteinase 3
6 201185_at   NM_002775 protease, serine, 11
7 201203_s_at   AI921320 ribosome binding protein 1
8 201212_at   NM_005606 cysteine protease
9 201289_at CYR61 NM_001554 cysteine-rich, angiogenic inducer, 61
10 201292_at   NM_001067 topoisomerase (DNA) II alpha
11 201418_s_at SOX4 NM_003107 SRY (sex determining region Y)-box 4
12 201427_s_at SEPP1 NM_005410 Selenoprotein P, plasma, 1
13 201430_s_at DPYSL3 NM_001387 dihydropyrimidinase-like 3
14 201431_s_at DPYSL3 NM_001387 dihydropyrimidinase-like 3
15 201438_at COL6A3 NM_004369 collagen, type VI, alpha 3
16 201474_s_at ITGA3 NM_002204 integrin, α 3 transcript variant a
17 201505_at LAMB1 NM_002291 laminin, beta 1
18 201508_at IGFBP4 NM_001552 Insulin-like growth factor-binding protein 4
19 201525_at APOD NM_001647 apolipoprotein D
20 201645_at HXB NM_002160 hexabrachion (tenascin C, cytotactin)
21 201650_at KRT19 NM_002276 keratin 19
22 201667_at GJA1 NM_000165 gap junction protein, α 1, 43kD (connexin 43)
23 201744_s_at LUM NM_002345 lumican
24 201792_at AEBP1 NM_001129 AE-binding protein 1
25 201852_x_at   AI813758 Collagen, type III, alpha 1
26 201893_x_at   AF138300 decorin variant A
27 201983_s_at   AW157070 epidermal growth factor receptor
28 202133_at   AA081084 Transcriptional co-activator w PDZ-binding motif
29 202219_at SLC6A8 NM_005629 solute carrier family 6, member 8
30 202237_at NNMT NM_006169 nicotinamide N-methyltransferase
31 202286_s_at   NM_002353 tumor-associated calcium signal transducer 2
32 202291_s_at   NM_000900 matrix Gla protein
33 202310_s_at   NM_000088 proalpha 1 (I) chain of type I procollagen
34 202350_s_at MATN2 NM_002380 matrilin 2 precursor, transcript variant 1
35 202357_s_at BF NM_001710 B-factor, properdin
36 202363_at   AF231124 testican-1
37 202376_at SERPINA3 NM_001085 Ser (or Cys) proteinase inhib clade A mem 3
38 202404_s_at COL1A2 NM_000089 collagen, type I, alpha 2
39 202440_s_at   NM_005418 suppression of tumorigenicity 5
40 202504_at ATDC NM_012101 ataxia-telangiectasia group D-assoc. protein
41 202575_at CRABP2 NM_001878 cellular retinoic acid-binding protein 2
42 202588_at AK1 NM_000476 adenylate kinase 1
43 202712_s_at CKMT1 NM_020990 creatine kinase, mitochondrial 1
44 202796_at KIAA1029 NM_007286 synaptopodin
45 202826_at SPINT1 NM_003710 serine protease inhibitor, Kunitz type 1
46 202834_at SERPINA8 NM_000029 Ser (or Cys) proteinase inhib, clade A mem 8
47 202898_at KIAA0468 NM_014654 KIAA0468 gene product
48 202992_at C7 NM_000587 complement component 7
49 203021_at SLPI NM_003064 secretory leukocyte protease inhibitor
50 203083_at THBS2 NM_003247 thrombospondin 2
51 203180_at ALDH1A3 NM_000693 aldehyde dehydrogenase 1 family, mem. A3
52 203228_at PAFAH1B3 NM_002573 platelet-activating factor acetylhydrolase, isoform Ib, γ sub
53 203256_at CDH3 NM_001793 cadherin 3, type 1, P-cadherin (placental)
54 203349_s_at ETV5 NM_004454 ets variant gene 5 (ets-related molecule)
55 203352_at ORC4L NM_002552 origin recognition complex, subunit 4-like
56 203354_s_at   NM_015310  
57 203381_s_at   NM_000041  
58 203382_s_at APOE NM_000041 apolipoprotein E
59 203407_at PPL NM_002705 periplakin
60 203417_at MFAP2 NM_017459 microfibrillar-associated protein 2, tran var 1
61 203438_at   NM_003714 stanniocalcin 2
62 203453_at SCNN1A NM_001038 sodium channel, nonvoltage-gated 1 α
63 203499_at EPHA2 NM_004431 EphA2
64 203548_s_at   NM_000237 lipoprotein lipase
65 203570_at LOXL1 NM_005576 lysyl oxidase-like 1
66 203632_s_at GPRC5B NM_016235 G protein-coupled rec, fam C, group 5, mem B
67 203673_at TG NM_003235 thyroglobulin
68 203699_s_at   NM_013989 type II iodothyronine deiodinase
69 203700_s_at DIO2 NM_013989 deiodinase, iodothyronine, type II, tran var 1
70 203786_s_at TPD52L1 NM_003287 tumor protein D52-like 1
71 203851_at IGFBP6 NM_002178 insulin-like growth factor binding protein 6
72 203854_at IF NM_000204 I factor (complement)
73 203859_s_at PALM NM_002579 paralemmin
74 203875_at   NM_003069 SWI/SNF related, matrix assoc, actin dep regulator of chromatin subfam a mem 1
75 203881_s_at   NM_004010 dystrophin, includes DXS142, DXS164, DXS206, DXS230, DXS239, DXS268, DXS269, DXS270, DXS272 (DMD), transcript variant Dp427p2
76 203889_at SGNE1 NM_003020 secretory granule, neuroendocrine protein 1
77 203911_at RAP1GA1 NM_002885 RAP1, GTPase activating protein 1
78 203934_at KDR NM_002253 kinase insert domain receptor
79 203986_at GENX-3414 NM_003943 genethonin 1
80 204105_s_at NRCAM NM_005010 neuronal cell adhesion molecule
81 204124_at NaPi-IIb AF146796 Na dependent phosphate transporter isoform
82 204149_s_at GSTM4 NM_000850 glutathione S-transferase M4
83 204152_s_at   AI738965 manic fringe (Drosophila) homolog
84 204154_at CDO1 NM_001801 cysteine dioxygenase, type I
85 204259_at MMP7 NM_002423 matrix metalloproteinase 7 (matrilysin, uterine)
86 204260_at CHGB NM_001819 chromogranin B (secretogranin 1)
87 204268_at S100A2 NM_005978 S100 calcium-binding protein A2
88 204288_s_at ARGBP2 NM_021069 Arg/Abl-interacting prot ArgBP2, trans var 2
89 204298_s_at LOX NM_002317 lysyl oxidase
90 204337_at   NM_005613 regulator of G-protein signalling 4
91 204416_x_at APOC1 NM_001645 apolipoprotein C-I
92 204424_s_at   NM_018640 neuronal specific transcription factor DAT1
93 204433_s_at   NM_006038 spermatogenesis associated PD1
94 204442_x_at LTBP4 NM_003573 latent transforming GFβ binding protein 4
95 204452_s_at   NM_003505 frizzled 1
96 204476_s_at PC NM_022172 pyruvate carboxylase
97 204503_at EVPL NM_001988 envoplakin
98 204591_at CHL1 NM_006614 cell adhesion molecule w/ homology to homolog L1
99 204600_at EPHB3 NM_004443 EphB3
100 204623_at TFF3 NM_003226 trefoil factor 3 (intestinal)
101 204625_s_at   NM_000212 integrin, β 3 (platelet glycoprotein IIIa, CD61)
102 204697_s_at CHGA NM_001275 chromogranin A
103 204741_at BICD1 NM_001714 Bicaudal D (Drosophila) homolog 1
104 204753_s_at HLF NM_002126 hepatic leukemia factor
105 204754_at HLF NM_002126 hepatic leukemia factor
106 204755_x_at HLF NM_002126 hepatic leukemia factor
107 204787_at Z391G NM_007268 Ig superfamily protein
108 204797_s_at EMAPL NM_004434 echinoderm microtubule-associated pro-like
109 204869_at   AL031664 DNA seq RP4-531H16 chrom 20p11.22-12
110 204870_s_at PCSK2 NM_002594 proprotein convertase subtilisin/kexin type 2
111 204933_s_at TNFRSF11B NM_002546 TNFR superfamily, member 11b
112 204934_s_at HPN NM_002151 hepsin (transmembrane protease, serine 1)
113 204944_at PTPRG NM_002841 protein tyrosine phosphatase receptor type G
114 204964_s_at SSPN NM_005086 sarcospan (Kras oncogene-associated gene)
115 204975_at EMP2 NM_001424 epithelial membrane protein 2
116 204990_s_at ITGB4 NM_000213 integrin, beta 4
117 205051_s_at KIT NM_000222 v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog
118 205110_s_at FGF13 NM_004114 fibroblast growth factor 13
119 205153_s_at TNFRSF5 NM_001250 TNFR superfamily, mem 5
120 205168_at DDR2 NM_006182 discoidin domain receptor family, member 2
121 205258_at INHBB NM_002193 inhibin, β B (activin AB β polypeptide)
122 205286_at   NM_003222 transcription factor AP-2 gamma
123 205325_at KIAA0273 NM_014759 KIAA0273 gene product
124 205336_at PVALB NM_002854 parvalbumin
125 205402_x_at PRSS2 NM_002770 protease, serine, 2 (trypsin 2)
126 205413_at C11ORF8 NM_001584 chromosome 11 open reading frame 8
127 205455_at MST1R NM_002447 macrophage stimulating 1 receptor
128 205470_s_at KLK11 NM_006853 kallikrein 11
129 205479_s_at PLAU NM_002658 plasminogen activator, urokinase
130 205481_at ADORA1 NM_000674 adenosine A1 receptor
131 205485_at RYR1 NM_000540 ryanodine receptor 1 (skeletal)
132 205490_x_at connexin 31 NM_024009 gap junction protein, beta 3, 31 kD
133 205531_s_at GA NM_013267 breast cell glutaminase
134 205593_s_at PDE9A NM_002606 phosphodiesterase 9A
135 205614_x_at MST1 NM_020998 macrophage stimulating 1
136 205627_at   NM_001785 cytidine deaminase
137 205639_at   NM_001637 acyloxyacyl hydrolase
138 205683_x_at TPSB1 NM_003294 tryptase beta 1
139 205689_at KIAA0435 NM_014801 KIAA0435 gene product
140 205700_at RODH NM_003725 oxidative 3 α hydroxysteroid dehydrogenase; retinol dehydrogenase; 3-hydroxysteroid epimerase
141 205710_at LRP2 NM_004525 low density lipoprotein-related protein 2
142 205715_at BST1 NM_004334 bone marrow stromal cell antigen 1
143 205717_x_at   NM_002588 protocadherin gamma subfamily C, 3
144 205728_at   AL022718 DNA seq from clone 1052M9 on chrom Xq25
145 205747_at CBLN1 NM_004352 cerebellin 1 precursor
146 205778_at KLK7 NM_005046 kallikrein 7 (chymotryptic, stratum corneum)
147 205858_at NGFR NM_002507 nerve growth factor receptor
148 205927_s_at CTSE NM_001910 cathepsin E
149 205980_s_at ARHGAP8 NM_015366 Rho GTPase activating protein 8
150 206002_at GPR64 NM_005756 G protein-coupled receptor 64
151 206114_at EPHA4 NM_004438 EphA4
152 206390_x_at   NM_002619 platelet factor 4
153 206594_at KIAA0135 NM_015148 KIAA0135 protein
154 206595_at CST6 NM_001323 cystatin EM
155 206714_at ALOX15B NM_001141 arachidonate 15-lipoxygenase, second type
156 206757_at PDE5A NM_001083 phosphodiesterase 5A, cGMP-specific
157 206866_at CDH4 NM_001794 cadherin 4, type 1, R-cadherin (retinal)
158 206884_s_at SCEL NM_003843 sciellin
159 206912_at FOXE1 NM_004473 forkhead box E1(thyroid transcription factor2)
160 207111_at   NM_001974 egf-like module cont., mucin-like, hormone rec-like seq 1
161 207144_s_at CITED1 NM_004143 Cbp/p300-interacting transactivator, with Glu/Asp-rich carboxy-terminal domain, 1
162 207173_s_at   NM_001797 OB-cadherin-1
163 207674_at FCAR NM_002000 Fc fragment of IgA
164 207695_s_at IGSF1 NM_001555 immunoglobulin superfamily, member 1
165 207795_s_at CD94 AB009597  
166 207826_s_at ID3 NM_002167 inhibitor of DNA binding 3, dominant negative protein
167 207923_s_at PAX8 NM_013953 paired box gene 8
168 208396_s_at PDE1A NM_005019 phosphodiesterase 1A, calmodulin-dependent
169 208451_s_at C4B NM_000592 complement component 4B
170 208712_at PRAD1 M73554 cyclinD1
171 208747_s_at   NM_001734 subcomponent C1s, α- and β-chains
172 209021_x_at   BC001331 Similar to KIAA0652 gene product
173 209035_at   NM_002391 midkine
174 209071_x_at   AF159570 regulator of G-protein signalling 5
175 209079_x_at   AF152318 protocadherin gamma A1
176 209173_at   NM_006408 putative secreted protein XAG
177 209208_at   NM_004870 clone 015e11 My008 protein
178 209228_x_at   NM_006765 Putative prostate cancer tumor suppressor
179 209270_at LAMB3 NM_000228 laminin S B3 chain
180 209280_at   NM_006039 chromosome 17 unknown product mRNA
181 209291_at   NM_001546 inhibitor of DNA binding 4, dominant negative helix-loop-helix protein
182 209297_at ITSN AF114488 intersectin short isoform
183 209335_at   AI281593  
184 209365_s_at ECM1 NM_004425 extracellular matrix protein 1
185 209386_at   AI346835 transmembrane 4 superfamily member 1
186 209485_s_at ORP1 AF274714 oxysterol-binding protein-related protein
187 209496_at   BC000069 retinoic acid receptor responder 2
188 209505_at   NM_005654 nuclear receptor subfam 2, group F, mem 1
189 209506_s_at   BC004154 nuclear receptor subfam 2, group F, mem 1
190 209529_at   AF047760 phosphatidic acid phosphohydrolase type-2c
191 209596_at   AF245505 adlican
192 209598_at KIAA0883 AB020690 KIAA0883 protein
193 209652_s_at   BC001422 Similar to placental growth factor, vascular endothelial growth factor-related protein
194 209691_s_at   BC003541 hypothetical protein FLJ10488
195 209739_s_at   AI814551 GS2 gene
196 209772_s_at   X69397 cell surface antigen
197 209781_s_at T-Star AF069681 T-Star
198 209792_s_at   BC002710 kallikrein 10
199 209810_at SP-B J02761 pulmonary surfactant-associated protein B
200 209897_s_at   AF055585 neurogenic extracellular slit protein Slit2
201 209924_at   AB000221 small inducible cytokine subfamily A (Cys-Cys), mem 18, pulmonary/activation-reg
202 209946_at   U58111 FLT4 ligand
203 209990_s_at   NM_005458 GABA-B receptor
204 210051_at CAMP-GEFI U78168 cAMP-regulated guanine nucleotide exchange factor I
205 210055_at   NM_000369 thyroid stimulating hormone receptor
206 210072_at   NM_006274 beta chemokine Exodus-3
207 210078_s_at   L39833 K+ channel beta subunit
208 210096_at   NM_000779 lung cytochrome P450 (IV subfamily) BI
209 210298_x_at FHL1 AF098518 four and 1/2 LIM domains1 protein isoform B
210 210321_at   M36118 cytotoxin serine protease-C
211 210342_s_at TPO NM_000547 thyroid peroxidase
212 210372_s_at TPD52L2 AF208012 tumor protein D52-like 2
213 210397_at   U73945 beta-defensin-1
214 210401_at   U45448 P2x1 receptor
215 210471_s_at   U33428 K+ channel β 1a subunit mRNA, alt spliced
216 210473_s_at p58GTA M37712 galactosyltransferase assoc protein kinase
217 210605_s_at   BC003610 Similar to milk fat globule-EGF factor 8
218 210640_s_at GPCR-Br U63917 G protein coupled receptor
219 210660_at LIR-6 AF025529 leucocyte immunoglobulin-like receptor-6b
220 210727_at   NM_001741 Calcitonin, calcitonin/calcitonin-rel polypeptide, α
221 210762_s_at HP NM_006094 HP protein
222 210809_s_at   D13665 osf-2 mRNA for osteoblast specific factor 2
223 210827_s_at ESE-1 U73844 epithelial-specific transcription factor ESE-1 a
224 211100_x_at   U82278 immunoglobulin-like transcript 1c
225 211101_x_at   U82276 immunoglobulin-like transcript 1a
226 211102_s_at   U82277 immunoglobulin-like transcript 1b
227 211161_s_at   AF130082 collagen, type III, alpha 1
228 211217_s_at   AF051426 slow delayed rectifier channel subunit
229 211538_s_at   U56725 heat shock 70kD protein 2
230 211564_s_at   BC003096 Similar to LIM domain protein
231 211679_x_at GABBR2 AF095784 GABA-B receptor R2
232 211813_x_at   AF138303 decorin D
233 211959_at IGFBP5 AW007532 insulin-like growth factor binding protein 5
234 211964_at   AK025912 type IV collagen alpha (2) chain
235 212067_s_at   AL573058 complement component 1, r subcomponent
236 212253_x_at KIAA0728 BG253119 KIAA0728 protein
237 212294_at   BG111761 DKFZp586B0918
238 212328_at KIAA1102 AB029025 KIAA1102 protein
239 212344_at KIAA1077 AW043713 KIAA1077 protein
240 212353_at KIAA1077 AI479175 KIAA1077 protein
241 212354_at KIAA1077 BE500977 KIAA1077 protein
242 212464_s_at FN X02761 fibronectin precursor
243 212724_at   NM_005168 ras homolog gene family, member E
244 212738_at   AV717623  
245 212775_at   AI978623 KIAA0657 protein
246 212803_at   NM_005967 NGFI-A binding protein 2
247 212806_at KIAA0367 AL138349 KIAA0367 protein
248 212850_s_at   AA584297 low density lipoprotein receptor-rel protein 4
249 212865_s_at   BF449063 collagen, type XIV, alpha 1 (undulin)
250 212912_at   AI992251 Ribosomal pro S6 kinase, 90 kDa, polypep 2
251 212992_at   AI935123  
252 213029_at   AL110126 DKFZp564H1916
253 213306_at   AA917899 multiple PDZ domain protein
254 213381_at   N91149 DKFZp586M2022
255 213423_x_at   AI884858 Putative prostate cancer tumor suppressor
256 213553_x_at   W79394 apolipoprotein C-I
257 213668_s_at   AI989477 SRY (sex determining region Y)-box 4
258 213693_s_at   AI610869 mucin 1, transmembrane
259 213800_at   X04697 complement factor H 38-kDa N-term frag
260 213904_at   AL390170 DKFZp547E184
261 213924_at FLJ11585 BF476502 hypothetical protein FLJ11585
262 214023_x_at   AL533838 tubulin, beta polypeptide
263 214175_x_at   AI254547 LIM domain protein
264 214239_x_at Mel-18 AI560455 Zinc finger protein 144
265 214307_at   AI478172 homogentisate 1,2-dioxygenase
266 214434_at KIAA0417 AB007877 KIAA0417 gene product
267 214632_at   NM_003872 neuropilin 2
268 214680_at   BF674712 neurotrophic tyrosine kinase receptor 2
269 214702_at MSF-FN70 AJ276395 migration stimulation factor FN70
270 214763_at FLJ13875 AK023937 FLJ13875
271 214803_at   BF344237 DKFZp564N1116
272 214955_at   AI912086 DNA seq clone 1170K4 chrom 22q12.2-13.1
273 214977_at FLJ13790 AK023852 FLJ13790
274 215016_x_at KIAA0728 BC004912 KIAA0728 protein
275 215034_s_at FLJ13302 AI189753 FLJ13302
276 215076_s_at FLJ11428 AU144167 FLJ11428
277 215243_s_at GJB3 AF099730 connexin 31
278 215388_s_at   X56210 complement Factor H-related protein 1
279 215442_s_at   BE740743 thyroid stimulating hormone receptor
280 215443_at   BE740743 thyroid stimulating hormone receptor
281 215506_s_at   AK021882 highly sim to putative tumor sup NOEY2
282 215536_at LMP7, X87344 DMA, DMB, HLA-Z1, IPP2, LMP2, TAP1, TAP2, DOB, DQB2 and RING8, 9, 13, 14 genes
283 216356_x_at KIAA0734 AB018277 KIAA0734 protein
284 216470_x_at   AF009664 T cell receptor β locus, 3 trypsinogen repeats
285 216569_at FABP3-ps U72237 fatty acid-binding protein pseudogene
286 217546_at   R06655  
287 217561_at   BF447272  
288 217592_at   AV684859  
289 217767_at   NM_000064  
290 217820_s_at FLJ10773 NM_018212 hypothetical protein FLJ10773
291 217875_s_at TMEPAI NM_020182 transmem, prostate androgen induced RNA
292 218002_s_at SCYB14 NM_004887 small inducible cytokine subfam B (Cys-X-Cys), mem 14 (BRAK)
293 218182_s_at CLDN1 NM_021101 claudin 1
294 218353_at   NM_025226 MSTP032 protein
295 218368_s_at FN14 NM_016639 type I transmembrane protein Fn14
296 218418_s_at   NM_015493 DKFZP434N161
297 218469_at CKTSF1B1 NM_013372 Cys knot superfamily 1, BMP antagonist 1
298 218537_at FLJ20568 NM_017885 hypothetical protein FLJ20568
299 218546_at FLJ14146 NM_024709 hypothetical protein FLJ14146
300 218613_at   NM_018422 hypothetical protein DKFZp761K1423
301 218653_at SLC25A15 NM_014252 solute carrier family 25 (mitochondrial carrier; ornithine transporter) member 15
302 218691_s_at RIL NM_003687 reversion-induced LIM protein
303 218844_at FLJ20920 NM_025149 hypothetical protein FLJ20920
304 218856_at LOC51323 NM_016629 hypothetical protein LOC51323
305 218952_at SAAS NM_013271 granin-like neuroendocrine peptide precursor
306 218960_at TMPRSS4 NM_016425 transmembrane protease, serine 4
307 219010_at FLJ10901 NM_018265 hypothetical protein FLJ10901
308 219127_at MGC11242 NM_024320 hypothetical protein MGC11242
309 219191_s_at BIN2 NM_016293 bridging integrator 2
310 219195_at PPARGC1 NM_013261 peroxisome proliferative activated receptor, γ, coactivator 1
311 219211_at USP18 NM_017414 ubiquitin specific protease 18
312 219277_s_at FLJ10851 NM_018245 hypothetical protein FLJ10851
313 219331_s_at FLJ10748 NM_018203 hypothetical protein FLJ10748
314 219416_at CSR1 NM_016240 CSR1 protein
315 219436_s_at LOC51705 NM_016242 endomucin-2
316 219440_at RAI2 NM_021785 retinoic acid induced 2
317 219463_at HS1119D91 NM_012261 sim to S68401 (cattle) glucose induced gene
318 219476_at MGC4309 NM_024115 hypothetical protein MGC4309
319 219525_at FLJ10847 NM_018242 hypothetical protein FLJ10847
320 219527_at FLJ20605 NM_017898 hypothetical protein FLJ20605
321 219561_at LOC51226 NM_016429 COPZ2 for nonclathrin coat protein zeta-COP
322 219596_at LOC56906 NM_020147 hyp protein from EUROIMAGE 511235
323 219597_s_at DUOX1 NM_017434 dual oxidase 1
324 219743_at HEY2 NM_012259 hairy/enhancer-of-split rel with YRPW motif 2
325 219749_at FLJ20967 NM_022071 hypothetical protein FLJ20967
326 219836_at MGC10796 NM_024508 hypothetical protein MGC10796
327 219855_at FLJ10628 NM_018159 hypothetical protein FLJ10628
328 219856_at MGC2742 NM_023938 hypothetical protein MGC2742
329 219926_at POP3 NM_022361 popeye protein 3
330 219932_at VLCS-H1 NM_014031 v long-chain acyl-CoA synthetase homolog 1
331 219958_at FLJ11190 NM_018354 hypothetical protein FLJ11190
332 220034_at   NM_007199 interleukin-1 receptor-associated kinase M
333 220108_at GNA14 NM_004297 guanine nucl binding protein (G protein), α 14
334 220332_at CLDN16 NM_006580 claudin 16
335 220595_at DKFZp434B0417 NM_013377 hypothetical protein DKFZp434B0417
336 220751_s_at C5ORF4 NM_016348 chromosome 5 open reading frame 4
337 221009_s_at PGAR NM_016109 PPAR(gamma) angiopoietin related protein
338 221073_s_at NOD1 NM_006092 caspase recruitment domain 4
339 221147_x_at WWOX NM_018560 WW domain-containing oxidoreductase
340 221266_s_at LOC81501 NM_030788 DC-specific transmembrane protein
341 221270_s_at TGT NM_031209 tRNA-guanine transglycosylase
342 221489_s_at   AF227517 sprouty (Drosophila) homolog 4
343 221577_x_at   AF003934 prostate differentiation factor
344 221636_s_at   AL136931 DKFZp586G2122
345 221701_s_at   AF352728 STRA6 isoform 1 mRNA, alternatively spliced
346 221724_s_at   AF200738 C-type lectin DDB27 short form
347 221795_at   AI346341 Similar to hypothetical protein FLJ20093
348 221796_at   AA707199 Similar to hypothetical protein FLJ20093
349 221799_at KIAA1402 AB037823 KIAA1402 protein
350 221870_at   AI417917 FLJ22356 fis
351 221900_at   AI806793 collagen, type VIII, alpha 2
352 221928_at   AI057637 Weakly sim to 2109260A B cell growth factor
353 221959_at   BE672313 highly sim to HSU79298 Human clone 23803
354 32128_at   Y13710 alternative activated macrophage specific CC chemokine 1
355 37004_at SP-B J02761 pulmonary surfactant-associated protein B
356 37117_at   Z83838 GTPase-activating protein sim to rhoGAP protein. ribosomal protein L6 pseudogene
357 37152_at   L07592 peroxisome proliferator activated receptor
358 37408_at KIAA0709 AB014609 KIAA0709 protein
359 38691_s_at SP5 J03553 pulmonary surfactant protein
360 45297_at   AI417917  
361 63305_at   D81792  
362 823_at   HSU84487 CX3C chemokine precursor, alt spliced
363 91920_at   AI205180  
364       SGENE forward primer
365       SGENE reverse primer
366       SGENE probe
367       TESTICAN1 forward primer
368       TESTICAN1 reverse primer
369       TESTICAN1 probe
370       GABRE forward primer
371       GABRE reverse primer
372       GABRE probe
373       CDH3 forward primer
374       CDH3 reverse primer
375       CDH3 probe
376       FN1 forward primer
377       FN1 reverse primer
378       FN1 probe
379       TPO-1 forward primer
380       TPO-1 reverse primer
381       TPO-1 probe
382       TPO-2 forward primer
383       TPO-2 reverse primer
384       TPO-2 probe
385       KCNAB1-1 forward primer
386       KCNAB1-1 reverse primer
387       KCNAB1-1 probe
388       KCNAB1-2 forward primer
389       KCNAB1-2 reverse primer
390       KCNAB1-2 probe
391       FABP4-1 forward primer
392       FABP4-1 reverse primer
393       FABP4-1 probe
394       FABP4-2 forward primer
395       FABP4-2 reverse primer
396       FABP4-2 probe
397       DOI1-1 forward primer
398       DOI1-1 reverse primer
399       DOI1-1 probe
400       DOI1-2 forward primer
401       DOI1-2 reverse primer
402       DOI1-2 probe
403       B-ACTIN forward primer
404       B-ACTIN reverse primer
405       B-ACTIN probe
406       GOLGIN67 forward primer
407       GOLGIN67 reverse primer
408       GOLGIN67 probe
409       PAX8 forward primer
410       PAX8 reverse primer
411       PAX8 probe
412       HERC forward primer
413       HERC reverse primer
414       HERC probe
415       DDIT3 forward primer
416       DDIT3 reverse primer
417       DDIT3 probe
418       ITM1 forward primer
419       ITM1 reverse primer
420       ITM1 probe
421       C1OF24 forward primer
422       C1OF24 reverse primer
423       C1OF24 probe
424       MOT8 forward primer
425       MOT8 reverse primer
426       MOT8 probe
427       ARG2 forward primer
428       ARG2 reverse primer
429       ARG2 probe



Claims

1. A method of diagnosing thyroid cancer, testing indeterminate thyroid fine needle aspirate (FNA) thyroid nodule samples, or of differentiating between thyroid carcinoma and benign thyroid diseases in a biological sample obtained from a patient; comprising the step of measuring the expression levels in the sample of genes encoding mRNA corresponding to SEQ ID NOs: 199, 207, 255 and 354; wherein the gene expression levels above or below pre-determined cut-off levels are indicative of thyroid cancer or of thyroid carcinoma.
 
2. The method of claim 1 wherein the sample was prepared by a method selected from the group consisting of fine needle aspiration, bulk tissue preparation (for example, from a biopsy or a surgical specimen) and laser capture microdissection.
 
3. The method of claim 1 further comprising measuring

a) the expression level of at least one gene encoding mRNA:

i. corresponding to SEQ ID NOs: 142, 219 and 309; and/or

ii. corresponding to SEQ ID NOs: 9, 12 and 18; or

b) the expression level of at least one gene constitutively expressed in the sample.


 
4. The method of claim 1 wherein the comparison of expression patterns is conducted with pattern recognition methods, which can optionally include the use of a Cox proportional hazards analysis.
 
5. The method of claim 1 wherein the pre-determined cut-off levels

a) are at least about 1.5-fold over- or under- expression in the sample relative to benign cells or normal tissue; or

b) have at least a statistically significant p-value over-expression in the sample having thyroid carcinoma cells relative to benign cells or normal tissue, wherein optionally the p-value is less than about 0.05.


 
6. The method of claim 1 wherein gene expression

a) is measured on a microarray or gene chip; or

b) is determined by nucleic acid amplification conducted by polymerase chain reaction (PCR) of RNA extracted from the sample; or

c) is detected by measuring or detecting a protein encoded by the gene; or

d) is detected by measuring a characteristic of the gene, which may optionally be selected from the group consisting of DNA amplification, methylation, mutation and allelic variation.


 
7. The method of claim 6(a) wherein the microarray is a cDNA array or an oligonucleotide array which may optionally comprise one or more internal control reagents.
 
8. The method of claim 6(b) wherein said PCR is reverse transcription polymerase chain reaction (RT-PCR).
 
9. The method of claim 8, wherein the RT-PCR further comprises one or more internal controls, for example a method of detecting PAX8 gene expression which may optionally be measured using SEQ ID NOs: 409-411.
 
10. The method of claim 6(c) wherein the protein is detected by an antibody specific to the protein.
 
11. A composition comprising at least one probe set consisting of SEQ ID NOs: 199, 207, 255 and 354.
 
12. A kit for conducting an assay to determine thyroid carcinoma prognosis in a biological sample comprising: materials for detecting isolated nucleic acid sequences, their complements, or portions thereof of a combination of genes encoding mRNA corresponding to SEQ ID NOs: 199, 207, 255 and 354, wherein the kit optionally further comprises reagents for conducting a microarray analysis or a medium through which said nucleic acid sequences, their complements, or portions thereof are assayed.
 
13. The use of a microarray or gene chip for performing the method of claim 1.
 
14. The use of claim 13 wherein the microarray or gene chip comprises isolated nucleic acid sequences, their complements, or portions thereof of a combination of genes encoding mRNA corresponding to SEQ ID NOs: 199, 207, 255 and 354;
where the combination is sufficient to characterize thyroid carcinoma or risk of relapse in a biological sample.
 
15. The use of claim 14 wherein:

a) the measurement or characterization is at least about 1.5-fold over- or under-expression; or

b) the microarray or gene chip comprises a cDNA array or an oligonucleotide array; or

c) the microarray or gene chip comprises one or more internal control reagents.


 




Drawing
































Cited references

REFERENCES CITED IN THE DESCRIPTION



This list of references cited by the applicant is for the reader's convenience only. It does not form part of the European patent document. Even though great care has been taken in compiling the references, errors or omissions cannot be excluded and the EPO disclaims all liability in this regard.

Patent documents cited in the description