(11)EP 3 653 702 A1


(43)Date of publication:
20.05.2020 Bulletin 2020/21

(21)Application number: 18206532.6

(22)Date of filing:  15.11.2018
(51)Int. Cl.: 
C12N 9/16  (2006.01)
C12N 9/12  (2006.01)
(84)Designated Contracting States:
Designated Extension States:
Designated Validation States:

(71)Applicant: Deutsches Krebsforschungszentrum, Stiftung des öffentlichen Rechts
69120 Heidelberg (DE)

  • BUND, Timo
    69120 Heidelberg (DE)
  • HANSMAN, Grant S.
    69120 Heidelberg (DE)
  • KILIC, Turgay
    69120 Heidelberg (DE)
  • POPOV, Alexander N.
    38000 Grenoble (FR)
    69120 Heidelberg (DE)
  • ZUR HAUSEN, Harald
    69120 Heidelberg (DE)

(74)Representative: Schüssler, Andrea 
Kanzlei Huber & Schüssler Truderinger Strasse 246
81825 München (DE)



(57) The present invention relates to crystal forms of a replication protein encoded by a plasmid isolated from a Multiple Sclerosis patient, crystal structure information obtained from them, methods of preparing such crystal forms, their use for the identification and / or design of inhibitors of said replication proteins and methods for identifying, optimizing and designing compounds which should have the ability to interact with or inhibit the replication protein.


[0001] The present invention relates to crystal forms of a replication protein encoded by a plasmid isolated from a Multiple Sclerosis patient, crystal structure information obtained from them, methods of preparing such crystal forms, their use for the identification and / or design of inhibitors of said replication proteins and methods for identifying, optimizing and designing compounds which should have the ability to interact with or inhibit the replication protein.

Background of the invention

[0002] The consumption of bovine meat and milk is considered a risk factor for the development of human degenerative and malignant diseases, e.g. colon and breast cancers (1-3). Indeed, epidemiologic data suggest a correlation between these cancers and the consumption of bovine products from Eurasian Aurochs-derived cattle (4-7). Recently, it was proposed that Bovine Meat and Milk Factors (BMMFs), which are circular, single-stranded episomal DNAs (<3 kb) found in bovine meat and milk products, might represent a possible etiological agent of such diseases (8-10). In addition, BMMFs were isolated from patients with multiple sclerosis, and studies suggested that they are a possible infectious agent of this disease (10-13).

[0003] Typically, BMMFs encode an autonomous plasmid trans-acting replication initiator protein termed "Rep". The Rep binds at an origin of replication on the DNA (termed ori), which in most cases comprises a set of repetitive DNA elements (termed iterons) that are present within most BMMFs (5). Replication of various plasmids, including Circular Rep-Encoding Single-Stranded (CRESS) DNA viruses, also requires the binding of the Rep to a specific DNA sequence (14). Within prokaryotes, the Rep plays a central role in maintaining the plasmid copy number, as reported for the F plasmid in Escherichia coli (15). This regulation is also critical for the replication of plasmid-derived, bacteriophage-like, or virus-like DNA genomes (16). Reps are essential for the replication of multidrug-resistant bacteria in humans (17) and studies have suggested that Reps have a role in transmissible amyloid proteinopathy (18-20).

[0004] The X-ray crystal structures of Reps have been well documented and the structural basis for autonomous replication has been described (24-27). The Reps are composed of two winged-helix domains (termed WH1 and WH2), which are essentially fused N- and C-terminal domains. The Reps switch between monomeric and dimeric forms, depending on the specific function and binding to DNA (28). Large structural changes involving both domains accompany these oligomeric forms. The structural transformation requires certain α-helices and β-strands of the Rep to be refolded and/or shifted (26). In the dimeric form, the Rep functions as a repressor, where the WH2 binds to each operator DNA repeat and the WH1 forms the dimerization interface. In the monomeric form, the Rep functions as a replication initiator, where the WH1 undergoes a large structural movement, i.e. dimer dissociation, thereby allowing the WH1 to bind to one iteron end, while the WH2 binds to the opposite iteron end.

[0005] Recently, an episomal circular DNA (isolate MSBI1.176, accession LK931491.1) was isolated from a brain sample of a patient with multiple sclerosis (11). The MSBI1.176 Rep exhibits 98% amino acid identity with the Sphinx-1.76 encoded Reps (GenBank ADR65123.1 and HQ444404.1), which were isolated from culture and brain preparations of transmissible encephalopathy-related agents (21). Moreover, there were indications for detection of Sphinx-1.76 encoded Reps in neural cells (GT1 cell line) and in brain samples of mouse CNS, hamster CNS, and human glioblastoma based on Sphinx-1.76-specific antibodies (22). Serology based on the MSBI1.176 Rep antigen showed positive immune responses for healthy human blood donors and indicated a possible pre-exposure to these agents (23). Therefore, deciphering the functions of BMMFs in human malignant and degenerative disease is becoming increasingly important.

Object of the Present Invention

[0006] For deciphering the functions of BMMFs in human malignant and degenerative diseases, it would be very helpful to have a crystal structure of the MSBI1.176 Rep protein; attempts to obtain one have, however, failed until now. Such a crystal structure would also be helpful for the identification and/or design of inhibitors of said replication protein and for methods of identifying, optimizing and designing compounds which should have the ability to interact with or inhibit the replication protein.

[0007] Thus, it is the objective of the present invention to provide crystal structure coordinates of MSBI1.176 Rep protein, methods to obtain the crystal form of MSBI1.176 Rep protein as well as methods for designing, identifying and optimizing compounds as inhibitors of MSBI1.176 Rep protein based on the crystal structure and crystal structure coordinates.

Detailed Description of the Invention

[0008] The objective of the present invention is solved by the teachings of the independent claims. Further advantageous features, aspects and details of the invention are evident from the dependent claims, the description, the figures, and the examples of the present application.

[0009] Thus, the present invention concerns a crystal form of MSBI1.176 Rep protein as characterized in claim 1. Preferred embodiments are the subject-matter of the dependent claims.

MSBI1 Rep Original full length replication protein

[0010] GenBank: CDS63398.1
(encoded in replication competent episomal DNA MSBI1.176)
Embl accession DNA sequence: LK931491.1
source: multiple sclerosis-affected human brain

>MSBI1.176 putative replication initiator protein

[0011] During the experiments resulting in the present invention, the X-ray crystal structure of the MSBI1.176 WH1 domain (residues 2-133) was solved to 1.53 Å resolution (data statistics are given in Table 1). The asymmetric unit consisted of one WH1 dimer, i.e., two protomers (termed A and B). The electron density was well resolved for most of the protein (average B-factor = 29.43 Å²). However, residues 36-39 could not be fitted into the B protomer due to a lack of discernible electron density, although the electron density was distinct in the other protomer. The WH1 structure comprised five α-helices (α1-α5) and five β-strands (β1-β5) in each protomer (Figure 1). The A and B protomers were closely related (RMSD = 0.37 Å); however, a minor structural shift was observed at the β2-β3 hairpin, suggesting some flexibility of this region. Importantly, owing to the improved resolution over previous structures (24-27), water molecules could be added to this Rep structure.

[0012] The data collection and refinement statistics of the MSBI1.176 WH1 protein structure are summarized in Table 1:
Table 1. Data collection and refinement statistics of the MSBI1.176 WH1 protein structure.
Parameter                  S-SAD                      S-SAD Native               Native (PDB code 6H24)
Data collection
 ESRF beamline             ID23-1                     ID23-1                     ID30B
 Wavelength (Å)            1.850                      0.972                      0.979
 No. of crystals           7                          1                          1
 Space group               P2₁                        P2₁                        P2₁
 Cell dimensions
  a, b, c (Å)              104.86, 43.96, 107.71      104.86, 43.96, 107.71      32.38, 77.77, 47.68
  α, β, γ (°)              90, 97.72, 90              90, 97.72, 90              90, 90.66, 90
 Resolution range (Å)      19.91-2.30 (2.38-2.30)ᵃ    19.91-1.58 (1.63-1.58)ᵃ    40.65-1.53 (1.58-1.53)ᵃ
 Rmerge (%)                11.20 (52.00)ᵃ             4.00 (92.70)ᵃ              4.09 (68.48)ᵃ
 I/σI                      33.30 (9.60)ᵃ              15.80 (1.30)ᵃ              20.91 (1.98)ᵃ
 Completeness (%)          99.90 (98.30)ᵃ             98.30 (95.3)ᵃ              99.42 (96.93)ᵃ
 Redundancy                84.0 (28.5)ᵃ               3.7 (3.4)ᵃ                 6.4 (6.1)ᵃ
Refinement
 Resolution range (Å)                                                            40.65-1.53
 No. of reflections                                                              35692
 Rwork/Rfree (%)                                                                 18.33/20.74
 No. of atoms
  Protein                                                                        2029
  Water                                                                          152
 Average B factors (Å²)
  Protein                                                                        29.43
  Water                                                                          32.58
 R.m.s. deviations
  Bond lengths (Å)                                                               0.005
  Bond angles (°)                                                                1.080
ᵃ Values in parentheses are for the highest-resolution shell.

[0013] A database search for closely related structures and sequences revealed that MSBI1.176 WH1 had low amino acid identities of 28% and 17% with Pseudomonas syringae RepA WH1 (dRepA, PDB ID 1HKQ) and Escherichia coli RepE fused WH1-2 (RepE54, PDB ID 1REP), respectively (24, 25). Similar to dRepA, the MSBI1.176 WH1 was folded as the replication-inert dimer, while RepE54 (WH1-2 construct) was crystallized in the monomeric initiator form. Superposition of MSBI1.176 WH1 and dRepA WH1 showed that these two domains were structurally similar (RMSD = 1.20 Å), both having the typical five α-helices and five β-strands (Figure 2). A number of structural similarities and differences were observed between these two Reps. The MSBI1.176 and dRepA dimeric interfaces, which involved five β-strands (β1-β5-β4-β3-β2), were held by a similar number of main-chain binding interactions, although not at identical residues (Figure 3A). This result suggested that the dimeric interface feature is likely functionally related among the diverse Rep isolates. The inventors observed that water molecules were bound at this dimeric interface (Figure 3B). However, how these water molecules stabilize the dimeric interface and/or are displaced after DNA binding and conformational change is not yet known.

[0014] The inventors also observed that the MSBI1.176 WH1 region comprising α1-α2-α5 was similar in orientation to that of dRepA, having the typical α1-α2 bend and thereby forming a V-shaped structure (Figure 3C). This region of dRepA, which forms the linker to WH2, also contained the hydrophobic heptad pocket, which typically contained a number of leucine residues (e.g., dRepA: Leu12, Leu19, and Leu26; RepE: Leu24, Leu31, and Leu39). The MSBI1.176 WH1 hydrophobic pocket also comprised or consisted of three hydrophobic residues, i.e., Leu11, Leu18 and/or Ile25, which were positioned similarly to those within dRepA. Not surprisingly, water molecules were absent from the MSBI1.176 WH1 hydrophobic pocket (Figure 3B).

[0015] In general, many MSBI1.176 WH1 structural features were conserved with respect to other known dimeric Rep structures (24-27). However, loop movements and differences in α-helices and β-strands have been observed among the different structures. In the case of MSBI1.176, the β2-β3 hairpin was shifted by approximately 23 Å when compared to the dRepA β2-β3 hairpin (Figure 4). In the case of RepE54, the equivalent β2-β3 hairpin (residues 97-110) was not added to the structure, since the electron density was lacking (20, 24). It was suggested that the RepE54 β2-β3 hairpin was flexible and that this flexibility might function by destabilizing the anti-parallel β2-β3 hairpin and blocking dimerization (20, 24). However, the MSBI1.176 WH1 β2-β3 hairpin was clearly held by direct main-chain interactions, not unlike the dRepA structure (Figure 3A). Moreover, the inventors observed that water-mediated interactions at this dimeric interface might add further stability to this hairpin (Figure 3B).

[0016] Previous modeling analysis of the dRepA domain indicated that six basic residues on the α2, β2, β3, and adjacent loops (dRepA numbering: Lys74, Arg81, Arg91, Arg93, Lys62, and Arg78) might follow a minor groove of a DNA backbone (24). In the MSBI1.176 WH1 structure, six basic residues were also found in this region. These are considered to represent the DNA-interacting site comprising or consisting of Lys69, Lys73 (located on α2), Lys85 (β2), Arg90 (β3), Arg78, and/or Arg96 (on adjacent loops). In the DNA-interacting site, positively charged amino acids are preferably accumulated so as to interact electrostatically with the negative charges of the DNA. Although the electron density for the side chains of MSBI1.176 WH1 Lys73 (chains A and B), Lys85 (chains A and B) and Arg90 (chain A) was weak, two of these residues (Arg78 and Arg96) were at equivalent dRepA positions and were suggested to interact with a DNA molecule (24). The function of the MSBI1.176 WH1 β2-β3 sheet orientation is not obvious, although the MSBI1.176 WH1 had a three amino acid insertion in the β2-strand that extended the sheet. Presumably, this insertion shifted MSBI1.176 WH1 Lys85 (β2) and Arg90 (β3) along the β2-β3 sheet when compared to the equivalent dRepA residues Arg81 and Arg91. Seen in another way, the MSBI1.176 β2-β3 sheet was rather flattened, whereas the dRepA β2-β3 hairpin hooked in the opposite direction (Figure 2).

[0017] Elucidation of the MSBI1.176 WH1 structure represents a crucial step forward in better understanding how structural features might vary among the Rep proteins of this diverse group of BMMF DNAs. The finding that this MSBI1.176 WH1 protein, isolated from a patient with multiple sclerosis, was closely similar to a prokaryotic Rep structure has important consequences. Altogether, this new structural information supports the development and design of new drug candidates that can inhibit the oligomerization of Reps. According to the present invention, the structure of a Rep WH1 domain encoded on a BMMF (MSBI1.176) isolated from a multiple sclerosis human brain sample was determined to 1.53 Å resolution using X-ray crystallography. The overall structure of the MSBI1.176 WH1 was remarkably similar to other Rep structures, despite having a low (28%) amino acid identity. The MSBI1.176 WH1 contained elements common to other Reps, including five α-helices, five β-strands, and a hydrophobic pocket. Interestingly, the MSBI1.176 WH1 β2-β3 hairpin was shifted by approximately 23 Å when compared to other Reps. This region is known to interact with DNA, and an amino acid insertion in the MSBI1.176 WH1 hairpin shifted positively charged DNA-binding residues further along the β2-β3 sheet. The data of the present invention also show that water molecules additionally stabilize α-helices and β-strands in the protein. Altogether, these new findings support the view that the MSBI1.176 Rep might have roles and functions comparable to those of other known Reps from different origins.

[0018] Further, the present invention enables the establishment of methods for identifying inhibitors of MSBI1.176 Rep protein as well as methods for preparing crystal forms of MSBI1.176 Rep protein and obtaining their crystal structure information. The data enable rational drug design based on the use of such structural data.

[0019] The present invention also provides the possibility to identify and/or design inhibitors of MSBI1.176 Rep protein and relates to the crystal form of MSBI1.176 Rep protein, the crystal structure information obtained from the crystal form, methods of preparing such a crystal form, its use for the identification and/or design of inhibitors of MSBI1.176 Rep protein activity and the diagnostic and/or pharmaceutical use of those inhibitors of MSBI1.176 Rep protein identified by these methods.

[0020] The terms "crystal form" or "crystal structure" (which are used interchangeably) refer to a crystal form of the MSBI1.176 Rep protein with or without detergents and/or nucleic acids (in particular: DNA) bound to the MSBI1.176 Rep protein.

[0021] The term "unit cell" as used herein refers to the smallest repeating unit that can generate a crystal with only translation operations. The unit cell is the smallest unit of volume that contains all of the structural and symmetry information of a crystal and that by translation can reproduce a pattern in all of space. Structural information is thereby the pattern (atoms) plus all surrounding space, and symmetry information means mirrors, glides, axes, and inversion centers. The translation refers to motion along a cell edge by the length of that edge. An asymmetric unit is the smallest unit that can be rotated and translated to generate one unit cell using only the symmetry operators allowed by the crystallographic symmetry. The asymmetric unit may be one molecule or one subunit of a multimeric protein, but it can also be more than one. Hence the asymmetric unit is the smallest unit of volume that contains all of the structural information and that by application of the symmetry operations can reproduce the unit cell. The shape of the unit cell is constrained by the collection of symmetry elements which make up the group. For space groups, seven lattice systems are possible: triclinic, monoclinic, orthorhombic, tetragonal, hexagonal, rhombohedral and cubic. In crystallography, space groups are also called the crystallographic or Fedorov groups. The edges of the unit cell define a set of vector axes a, b, and c, with the unit cell dimensions as their respective lengths a, b, and c. These vectors need not be at right angles, and the angles between the axes are denoted α between the b- and c-axes, β between the a- and c-axes, and γ between the a- and b-axes. The different shapes arise depending on restrictions placed on the lengths of the three edges (a, b, and c) and the values of the three angles (α, β, and γ).
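As an illustration of how the six unit cell parameters define the cell, the following minimal Python sketch computes the unit cell volume from the edge lengths and angles using the general (triclinic) formula. The cell parameters are those listed for the native data set in Table 1; the function name is chosen here for illustration only and is not part of any crystallographic software package.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume (Å^3) of a general unit cell from edges (Å) and angles (degrees)."""
    cos_a, cos_b, cos_g = (
        math.cos(math.radians(angle)) for angle in (alpha_deg, beta_deg, gamma_deg)
    )
    # General triclinic expression; reduces to a*b*c*sin(beta) for a monoclinic cell.
    return a * b * c * math.sqrt(
        1 - cos_a**2 - cos_b**2 - cos_g**2 + 2 * cos_a * cos_b * cos_g
    )


# Native data set (PDB code 6H24), values taken from Table 1.
native_volume = unit_cell_volume(32.38, 77.77, 47.68, 90.0, 90.66, 90.0)
print(f"Unit cell volume: {native_volume:.0f} Å^3")
```

Because β is close to 90° for this monoclinic cell, the result is only marginally smaller than the simple product of the three edge lengths.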

[0022] In another preferred embodiment of the present invention the crystal form of MSBI1.176 Rep protein contains or further contains a hydrophobic pocket, wherein the hydrophobic pocket contains the amino acids Leu11, Leu18 and Ile25.

[0023] A potential inhibitor of MSBI1.176 Rep protein could bind to the DNA-interacting site in order to decrease the activity of the Rep protein. Thus, a method for designing, identifying and/or optimizing compounds which might have the ability to bind to MSBI1.176 Rep protein could be based on the crystal structure coordinates.

[0024] A potential inhibitor of MSBI1.176 Rep protein could bind to a Rep oligomerization interface modulating structure-based function and protein activity/localization/stability/modification.

[0025] A potential inhibitor of MSBI1.176 Rep protein could bind to yet unknown protein interaction surfaces modulating structure-based function and protein activity/localization/stability/modification.

[0026] A potential inhibitor of MSBI1.176 Rep protein could bind to structurally related prokaryotic Reps (e.g. from Acinetobacter baumannii species) to decrease the activity of such Rep proteins of different origin based on steric hindrance of DNA-binding, oligomerization and/or additional protein interaction interfaces.

[0027] The crystal forms of MSBI1.176 Rep protein as disclosed herein can be obtained by the following crystallization method. Thus, the present invention relates to a method for crystallizing MSBI1.176 Rep protein comprising the steps:
  (a) preparing a solution of recombinantly prepared MSBI1.176 WH1 domain in a crystallization reagent, preferably containing at least one inorganic salt and at least one precipitation agent;
  (b) crystallizing said MSBI1.176 WH1 domain by vapor diffusion.

[0028] The at least one inorganic salt is used in an amount of between 0.01 mM and 2 M, preferably 0.1 mM and 1 M, more preferably between 0.1 M and 0.5 M, more preferably about 0.2 M. In case more than one inorganic salt is used, the aforementioned concentration refers to the concentration of all inorganic salts together and not to the concentration of each single salt used in the mixture of inorganic salts.

[0029] Preferably, inorganic salts are selected from the group comprising or consisting of ammonium chloride, ammonium sulfate, ammonium acetate, ammonium fluoride, ammonium bromide, ammonium iodide, ammonium nitrate, calcium chloride, calcium acetate, magnesium acetate, magnesium formate, magnesium nitrate, potassium acetate, potassium bromide, potassium fluoride, potassium chloride, potassium iodide, potassium nitrate, sodium acetate, sodium hydroxide, sodium bromide, sodium fluoride, sodium iodide, sodium nitrate, sodium sulfate, sodium chloride, zinc chloride, zinc sulfate, zinc acetate. A preferred inorganic salt is magnesium acetate.

[0030] In theory, the at least one precipitating agent competes with the protein solutes for water, thus leading to supersaturation of the proteins. Crystals can normally only grow from supersaturated states, and thus they can grow from precipitates. Salts, polymers, and organic solvents are suitable precipitating agents, which are used in an amount between 5% by weight and 50% by weight, preferably between 10% by weight and 40% by weight, more preferably between 15% by weight and 35% by weight, and most preferably between 20% by weight and 30% by weight of the precipitating agent or the mixture of precipitating agents with regard to the total weight of the buffered precipitant solution.

[0031] The precipitating agent used in the buffered precipitant solution of step (b) may preferably be selected from the group consisting of or comprising: 2-methyl-2,4-pentanediol, glycerol, polyethylene glycol (PEG), pentaerythritol propoxylate, pentaerythritol ethoxylate, sodium polyacrylate, hexanediol, isopropanol, ethanol, tert-butanol, dioxane, ethylene imine polymer, ethylene glycol, propanediol, polyacrylic acid, polyvinylpyrrolidone, 2-ethoxyethanol, or mixtures thereof. Most preferred as the precipitant is PEG, preferably having molecular weights ranging from PEG 200 to PEG 20,000, more preferably from PEG 1,000 to PEG 18,000, yet more preferably from PEG 3,000 to PEG 15,000. PEG is a very preferred precipitating agent which is preferably used in an amount of 20-30% by weight of the buffered precipitant solution.

[0032] In a most preferred embodiment the crystallization buffer contains 0.2 M magnesium acetate and 20% PEG3350.

[0033] In a preferred embodiment the hanging-drop or the sitting-drop methods are used for crystallization. The "hanging drop vapor diffusion" technique is the most popular method for the crystallization of macromolecules. The principle of vapor diffusion is straightforward. A drop composed of a mixture of sample and crystallization reagent is placed in vapor equilibration with a liquid reservoir of reagent. Typically the drop contains a lower reagent concentration than the reservoir. To achieve equilibrium, water vapor leaves the drop and eventually ends up in the reservoir. As water leaves the drop, the sample is driven towards supersaturation. Both the sample and reagent increase in concentration as water leaves the drop for the reservoir. Equilibrium is reached when the reagent concentration in the drop is approximately the same as that in the reservoir.
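The concentration changes described above can be estimated with a simple mass balance. The following Python sketch is an idealized model only: it assumes that only water is exchanged between drop and reservoir, that the reservoir is large enough to remain at constant concentration, and that buffer components of the protein sample are negligible. The protein concentration (5 mg/ml), the 1:1 mixing ratio and the 20% PEG3350 reservoir are taken from Example 1; the 1 µl volumes and the function name are hypothetical and serve only as an illustration.

```python
def equilibrate_drop(protein_mg_ml, protein_ul, reagent_ul, reservoir_pct):
    """Idealized vapor-diffusion mass balance for a single drop.

    Assumes only water moves between drop and reservoir and that the
    reservoir concentration stays constant (simplifying assumptions).
    """
    initial_ul = protein_ul + reagent_ul
    initial_precipitant = reservoir_pct * reagent_ul / initial_ul
    initial_protein = protein_mg_ml * protein_ul / initial_ul
    # At equilibrium the drop has lost water until its precipitant
    # concentration matches that of the reservoir.
    final_ul = initial_ul * initial_precipitant / reservoir_pct
    final_protein = initial_protein * initial_ul / final_ul
    return {
        "initial_precipitant_pct": initial_precipitant,
        "initial_protein_mg_ml": initial_protein,
        "final_volume_ul": final_ul,
        "final_protein_mg_ml": final_protein,
    }


# 5 mg/ml protein and 20% PEG3350 follow Example 1; the 1 µl volumes are hypothetical.
drop = equilibrate_drop(protein_mg_ml=5.0, protein_ul=1.0, reagent_ul=1.0, reservoir_pct=20.0)
for key, value in drop.items():
    print(f"{key}: {value:.2f}")
```

In this simplified picture a 1:1 drop starts at half the reservoir precipitant concentration and shrinks to about half its volume, so that the protein concentration roughly doubles on the way to equilibrium.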

[0034] Further important aspects of the invention are related to the use of the crystal forms of MSBI1.176 Rep protein for an in-silico prediction model for the identification, optimization and/or design of inhibitors of MSBI1.176 Rep protein. Knowing the exact positions of the atoms of the amino acids in the DNA-interacting site provides the possibility to design suitable inhibitors, identify suitable inhibitors e.g. from a compound library or optimize a known suitable inhibitor by increasing the inhibitory potential. Design, identification and optimization of suitable inhibitors can be performed with standard computer based methods and software programs well known in the art. A variety of commercially available software programs are available for conducting the analysis and comparison of data in the computer-based system. One skilled in the art will readily recognize which of the available algorithms or implementing software packages for conducting computer analyses can be utilized or adapted for use in the computer-based system. A target structural motif or target motif refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration or electron density map which is formed upon the folding of the target motif. There are a variety of target motifs known in the art. A variety of structural formats for the input and output means can be used to input and output the information in the computer-based systems of the present invention. A variety of comparing means can be used to compare a target sequence or target motif with the data storage means to identify structural motifs or interpret electron density maps derived in part from the atomic coordinate/x-ray diffraction data. One skilled in the art can readily recognize any one of the publicly available computer modeling programs that can be used. 
Suitable software that can be used to view, analyze, design, and/or model a protein comprises Alchemy™, LabVision™, Sybyl™, Molcadd™, Leapfrog™, Matchmaker20™, Genefold™ and Sitel™ (available from Tripos Inc., St. Louis, MO); Quanta™, MacroModel™, GRASP™, Univision™, Chem 3D™ and Protein Expert™.

[0035] Thus in a further aspect the present invention is related to methods for designing, identifying and optimizing inhibitors of MSBI1.176 Rep protein by applying the crystal form and the related structure coordinates of the crystal form or at least of one of the DNA-interacting sites in order to design, identify or optimize inhibitors by means of computer based methods or software programs.

[0036] The atomic coordinate/x-ray diffraction data may be used to create a physical model which can then be used to design molecular models of compounds that should have the ability or property to inhibit and/or interact with the determined DNA-interacting sites or other structural or functional domains or subdomains such as the hydrophobic pocket and/or the pocket neighboring the DNA-interacting site. Alternatively, the atomic coordinate/x-ray diffraction data of the complex may be represented as atomic model output data on computer readable media which can be used in a computer modeling system to calculate different molecules expected to inhibit and/or interact with the determined sites, or other structural or functional domains or subdomains. For example, computer analysis of the data allows one to calculate the three-dimensional interaction of the MSBI1.176 Rep protein and the compound to confirm that the compound binds to, or changes the conformation of, particular domain(s) or subdomain(s). Compounds identified from the analysis of the physical or computer model can then be synthesized and tested for biological activity with an appropriate assay.

[0037] In case an inhibitor is identified from a compound library or in case a known inhibitor is optimized by theoretical chemical modifications, testing of the actual compound is desired in order to verify the inhibitory effect and to continue the optimization process. Thus, preferably the above method for identifying and/or optimizing a compound further comprises the steps of
  (a) obtaining the identified or optimized compound; and
  (b) contacting the identified or optimized compound with MSBI1.176 Rep protein in order to determine the inhibitory effect on MSBI1.176 Rep protein.

[0038] It is not necessary to use all the structure coordinates as listed in Table 1. Thus only the structure coordinates of the hydrophobic pocket and/or the DNA-interacting site could be used.
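Restricting the coordinates to the hydrophobic pocket and/or the DNA-interacting site can be sketched as follows. This minimal Python example reads ATOM records in the fixed-column PDB format and keeps only the residues named in this description (Leu11, Leu18, Ile25 for the pocket; Lys69, Lys73, Arg78, Lys85, Arg90, Arg96 for the DNA-interacting site). The helper functions are hypothetical names chosen for illustration, and the toy records, including their coordinates and the residue at position 12, are invented and are not taken from the deposited structure.

```python
# Residue numbers named in the description (MSBI1.176 numbering).
HYDROPHOBIC_POCKET = {11, 18, 25}
DNA_INTERACTING_SITE = {69, 73, 78, 85, 90, 96}


def format_atom(serial, name, res_name, chain, res_seq, x, y, z):
    """Build a fixed-column PDB ATOM record (used here for toy data only)."""
    return (
        f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_seq:4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}"
    )


def parse_atoms(pdb_text):
    """Parse ATOM records using the standard PDB column positions."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        atoms.append({
            "name": line[12:16].strip(),
            "res_name": line[17:20].strip(),
            "chain": line[21],
            "res_seq": int(line[22:26]),
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        })
    return atoms


def select_residues(atoms, residue_numbers):
    """Keep only atoms whose residue number is in the given set."""
    return [atom for atom in atoms if atom["res_seq"] in residue_numbers]


# Invented toy records; NOT coordinates from PDB entry 6H24.
toy_pdb = "\n".join([
    format_atom(1, "CA", "LEU", "A", 11, 1.0, 2.0, 3.0),
    format_atom(2, "CA", "ALA", "A", 12, 4.0, 5.0, 6.0),
    format_atom(3, "NZ", "LYS", "A", 69, 7.0, 8.0, 9.0),
    format_atom(4, "CA", "ILE", "B", 25, 10.0, 11.0, 12.0),
])

atoms = parse_atoms(toy_pdb)
pocket_atoms = select_residues(atoms, HYDROPHOBIC_POCKET)
site_atoms = select_residues(atoms, DNA_INTERACTING_SITE)
print([(a["chain"], a["res_name"], a["res_seq"]) for a in pocket_atoms])
print([(a["chain"], a["res_name"], a["res_seq"]) for a in site_atoms])
```

The selection is made by residue number only, so atoms from both protomers are retained; a chain filter could be added where only one protomer is of interest.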

[0039] The following abbreviations are used for the common and modified amino acids referred to herein.

Amino acids

Aspartic acid (Aspartate)
Glutamic acid (Glutamate)

Description of the Figures


Figure 1. The X-ray crystal structure of the MSBI1.176 WH1 dimer. The MSBI1.176 WH1 protomers are colored as follows: A chain, cyan; B chain, orange. One protomer comprises five α-helices (α1-α5) and five β-strands (β1-β5). The dimeric interface involves five β-strands (β1-β5-β4-β3-β2). Presumably, a DNA molecule would interact along the dimeric interface and possibly with six basic residues in this region, i.e., Lys69, Lys73, Arg78 (located on α2), Lys85 (β2), Arg90 (β3), and Arg96.

Figure 2. Structural comparison to the closely matching prokaryotic RepA WH1. The MSBI1.176 WH1 and RepA WH1 had 28% amino acid identity. Superposition with RepA (gray) showed that these two WH1 dimers were highly similar, having an RMSD of 1.197 Å. Structural differences in extended loops were observed, notably in the loop connecting α1 and β1 as well as in the β2-β3 hairpin.

Figure 3. MSBI1.176 WH1 structural similarities. (A) The five β-strands (β1-β5-β4-β3-β2) showing the main-chain interactions for MSBI1.176 WH1 (cyan and orange) and dRepA (gray). The β-strands were held by numerous main-chain hydrogen bonds (dashed lines), similar to dRepA. (B) The MSBI1.176 WH1 protein was mostly covered with water molecules (blue spheres). Water-mediated interactions likely stabilized these β-strands and other regions of the protein. However, water molecules were not located in the hydrophobic pocket. For simplicity, water molecules on one protomer (orange) were excluded in the side view. (C) The region containing α1-α2-α5 was similar in orientation to that of dRepA. This region produced a V-shaped structure, and α5 is the linker region to the WH2 domain. The hydrophobic pocket consisted of three hydrophobic residues, i.e., Leu11, Leu18, and Ile25, which were positioned similarly to those of dRepA (data not shown).

Figure 4. MSBI1.176 WH1 water interactions and the hydrophobic pocket. The MSBI1.176 WH1 β2-β3-hairpin shifted approximately 23 Å when compared to the equivalent dRepA hairpin. The MSBI1.176 β2-β3-hairpin was held with direct main-chain interactions (Figure 3A) as well as water-mediated interactions. Note the different β2-β3-hairpin twists between the MSBI1.176 and dRepA WH1 structures in Figure 3A. The RepE54 β2-β3 hairpin was positioned in between these two WH1 β2-β3 hairpins.

[0042] The following examples are included to demonstrate preferred embodiments of the invention. It should be appreciated by those of skill in the art that the techniques disclosed in the following examples represent techniques discovered by the inventor to function well in the practice of the invention, and thus can be considered to constitute preferred modes for its practice. Modifications and alternative embodiments of various aspects of the invention will be apparent to those skilled in the art in view of this description.

[0043] Accordingly, this description is to be construed as illustrative only and is for the purpose of teaching those skilled in the art the general manner of carrying out the invention. It is to be understood that the forms of the invention shown and described herein are to be taken as examples of embodiments. Elements and materials may be substituted for those illustrated and described herein, parts and processes may be reversed, and certain features of the invention may be utilized independently, all as would be apparent to one skilled in the art after having the benefit of this description of the invention.


Example 1: Method of obtaining crystals of MSBI1 WH1

[0044] The MSBI1.176 DNA (LK931491.1) was isolated from a brain sample of a patient with multiple sclerosis (11). The MSBI1 WH1 domain (residues 1-135) was expressed in E. coli and purified as previously described for human norovirus protruding domains (29). Briefly, the codon-optimized WH1 was cloned in a modified expression vector pMal-c2X (Geneart) and transformed into BL21 cells for protein expression. Transformed cells were grown in LB medium supplemented with 100 µg/ml ampicillin for 4 hours at 37 °C. Expression was induced with IPTG (0.75 mM) at an OD600 of 0.7 for 18 h at 22 °C. Cells were harvested by centrifugation at 6000 rpm for 15 min and disrupted by sonication on ice. His-tagged fusion MSBI1 protein was initially purified on a Ni column (Qiagen), dialyzed in a gel filtration buffer (GFB: 25 mM Tris-HCl and 300 mM NaCl) with 10 mM imidazole and digested with HRV-3C protease (Novagen) overnight at 4 °C. The cleaved MSBI1 WH1 was then applied to the Ni column again to separate and collect the cleaved protein, and dialyzed in GFB overnight at 4 °C. The MSBI1 WH1 protein was further purified by size exclusion chromatography, concentrated to 5 mg/ml and stored in GFB at 4 °C. Crystals of MSBI1 WH1 were grown using the hanging-drop vapor diffusion method at 18 °C for approximately 6-10 days in a 1:1 mixture of protein sample and mother liquor (0.2 M magnesium acetate and 20% PEG3350). Prior to data collection, MSBI1 WH1 crystals were transferred to a cryoprotectant containing the mother liquor with 40% PEG3350, followed by flash freezing in liquid nitrogen.

Example 2: X-ray diffraction

[0045] X-ray diffraction data of the MSBI1 WH1 domain were collected at the European Synchrotron Radiation Facility (ESRF) on beamlines ID23-1 and ID30B. For the single-wavelength anomalous diffraction experiments using native sulfurs (S-SAD), diffraction data from seven crystals were collected at λ = 1.850 Å on beamline ID23-1 equipped with a Dectris PILATUS-6M pixel array detector. The X-ray beam size at the sample position was 50 µm and the size of the crystals was approximately 70 × 70 × 200 µm³. To decrease radiation damage effects, a helical data collection strategy was applied. One native data set was collected at ID23-1 at λ = 0.972 Å for initial phase extension and a second native data set was collected at ID30B at λ = 0.979 Å for structure refinement. Optimal experimental parameters for data collection were designed using BEST (30) incorporated into the MxCuBE software (31) at ESRF. The single native data set was processed with XDS, while the multiple data sets for S-SAD were processed with XDS and then merged using XSCALE (32).
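For orientation, the wavelengths quoted above can be converted to photon energies with E = hc/λ. The sketch below is illustrative only: the function name `wavelength_to_kev` and the labels are hypothetical, and the constant hc ≈ 12.398 keV·Å is the standard value rather than a figure from this description. The longer wavelength used for S-SAD corresponds to a lower photon energy, at which the anomalous signal of sulfur is larger than at the native wavelengths.

```python
# Convert the X-ray wavelengths quoted in Example 2 to photon energies.
# hc in keV * Angstrom (standard physical constant, not from the description).
HC_KEV_ANGSTROM = 12.398419843

def wavelength_to_kev(wavelength_angstrom: float) -> float:
    """Photon energy in keV for a wavelength given in Angstrom."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom

# Wavelengths as stated in the text; the labels are illustrative.
data_sets = {
    "S-SAD (ID23-1)": 1.850,
    "native 1 (ID23-1)": 0.972,
    "native 2 (ID30B)": 0.979,
}

energies = {label: wavelength_to_kev(lam) for label, lam in data_sets.items()}
for label, energy in energies.items():
    print(f"{label}: {data_sets[label]:.3f} A = {energy:.3f} keV")
```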

[0046] Several data sets were collected using S-SAD for further processing (33). The S-SAD phasing protocol was carried out using the SHELXC/D/E pipeline as implemented in HKL2MAP (34). One thousand trials were carried out for substructure determination in SHELXD. Using a resolution of 2.3 Å and an anomalous signal truncated to 3.1 Å, SHELXD correctly identified all 24 sulfur sites. Four hundred and fifteen residues were built automatically by SHELXE, which resulted in an interpretable map for further processing. Finally, ARP/wARP was used for automated model building based on the first S-SAD native data set collected (35). The structure was refined against the second, high-resolution native data set in multiple rounds of manual model building in COOT (36) and refinement in PHENIX (37). The structure was validated using MolProbity and PROCHECK. Interactions were analyzed using Accelrys Discovery Studio (Version 4.1), with hydrogen bond distances between 2.4 and 3.5 Å. Figures and protein contact potentials were generated using PyMOL. Atomic coordinates and structure factors have been deposited in the Protein Data Bank (PDB) with the accession code 6H24.
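The hydrogen bond distance window stated above (2.4-3.5 Å) can be illustrated with a simple distance filter. This is a minimal sketch and not the procedure used by Discovery Studio: it models only the distance criterion given in the text, ignores any angular criteria the software may apply, and uses hypothetical atom labels and coordinates that are not taken from PDB entry 6H24.

```python
import math
from itertools import product

# Distance window in Angstrom, as stated in the text.
HBOND_MIN, HBOND_MAX = 2.4, 3.5

def hbond_candidates(donors, acceptors, dmin=HBOND_MIN, dmax=HBOND_MAX):
    """Return (donor, acceptor, distance) for pairs within [dmin, dmax]."""
    hits = []
    for (d_name, d_xyz), (a_name, a_xyz) in product(donors, acceptors):
        distance = math.dist(d_xyz, a_xyz)  # Euclidean distance (Python 3.8+)
        if dmin <= distance <= dmax:
            hits.append((d_name, a_name, round(distance, 2)))
    return hits

# Hypothetical atoms and coordinates (Angstrom), for illustration only.
donors = [
    ("donor A (N)", (0.0, 0.0, 0.0)),
    ("donor B (N)", (10.0, 0.0, 0.0)),
]
acceptors = [
    ("acceptor 1 (O)", (2.9, 0.0, 0.0)),
    ("acceptor 2 (O)", (5.0, 0.0, 0.0)),
]

contacts = hbond_candidates(donors, acceptors)
for donor, acceptor, distance in contacts:
    print(f"{donor} ... {acceptor}: {distance} A")
```

With these made-up coordinates only the donor A / acceptor 1 pair, at 2.9 Å, falls inside the window.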


  1. Chan DS, et al. (2011) Red and processed meat and colorectal cancer incidence: meta-analysis of prospective studies. PloS one 6(6):e20456.
  2. Corpet DE (2011) Red meat and colon cancer: should we become vegetarians, or can we make meat safer? Meat science 89(3):310-316.
  3. Huxley RR, et al. (2009) The impact of dietary and lifestyle risk factors on risk of colorectal cancer: a quantitative overview of the epidemiological evidence. International journal of cancer 125(1):171-180.
  4. zur Hausen H & de Villiers EM (2015) Dairy cattle serum and milk factors contributing to the risk of colon and breast cancers. International journal of cancer 137(4):959-967.
  5. zur Hausen H, Bund T, & de Villiers EM (2017) Infectious Agents in Bovine Red Meat and Milk and Their Potential Role in Cancer and Other Chronic Diseases. Current topics in microbiology and immunology 407:83-116.
  6. zur Hausen H (2012) Red meat consumption and cancer: reasons to suspect involvement of bovine infectious factors in colorectal cancer. International journal of cancer 130(11):2475-2483.
  7. zur Hausen H (2015) Risk factors: What do breast and CRC cancers and MS have in common? Nat Rev Clin Oncol 12(10):569-570.
  8. Funk M, et al. (2014) Isolation of protein-associated circular DNA from healthy cattle serum. Genome announcements 2(4).
  9. Falida K, Eilebrecht S, Gunst K, zur Hausen H, & de Villiers EM (2017) Isolation of Two Virus-Like Circular DNAs from Commercially Available Milk Samples. Genome announcements 5(17).
  10. zur Hausen H, Bund T, & de Villiers EM (In press) Specific Nutritional Infections Early in Life as Risk Factors for Human Colon and Breast Cancers Several Decades Later. International journal of cancer.
  11. Whitley C, et al. (2014) Novel replication-competent circular DNA molecules from healthy cattle serum and milk and multiple sclerosis-affected human brain tissue. Genome announcements 2(4).
  12. Lamberto I, Gunst K, Muller H, zur Hausen H, & de Villiers EM (2014) Mycovirus-like DNA virus sequences from cattle serum and human brain and serum samples from multiple sclerosis patients. Genome announcements 2(4).
  13. Gunst K, zur Hausen H, & de Villiers EM (2014) Isolation of bacterial plasmid-related replication-associated circular DNA from a serum sample of a multiple sclerosis patient. Genome announcements 2(4).
  14. Kornberg A & Baker T (1992) DNA Replication. 2nd Ed. University Science Books.
  15. Kline BC (1985) A review of mini-F plasmid maintenance. Plasmid 14(1):1-16.
  16. Ruiz-Maso JA, et al. (2015) Plasmid Rolling-Circle Replication. Microbiology spectrum 3(1):PLAS-0035-2014.
  17. Schumacher MA, et al. (2014) Mechanism of staphylococcal multiresistance plasmid replication origin assembly by the RepA protein. Proceedings of the National Academy of Sciences of the United States of America 111(25):9121-9126.
  18. Molina-Garcia L, Gasset-Rosa F, Alamo MM, de la Espina SM, & Giraldo R (2018) Addressing Intracellular Amyloidosis in Bacteria with RepA-WH1, a Prion-Like Protein. Methods in molecular biology 1779:289-312.
  19. Giraldo R, et al. (2016) RepA-WH1 prionoid: Clues from bacteria on factors governing phase transitions in amyloidogenesis. Prion 10(1):41-49.
  20. Giraldo R, Moreno-Diaz de la Espina S, Fernandez-Tresguerres ME, & Gasset-Rosa F (2011) RepA-WH1 prionoid: a synthetic amyloid proteinopathy in a minimalist host. Prion 5(2):60-64.
  21. Manuelidis L (2011) Nuclease resistant circular DNAs copurify with infectivity in scrapie and CJD. Journal of neurovirology 17(2):131-145.
  22. Yeh YH, Gunasekharan V, & Manuelidis L (2017) A prokaryotic viral sequence is expressed and conserved in mammalian brain. Proceedings of the National Academy of Sciences of the United States of America 114(27):7118-7123.
  23. Eilebrecht S, et al. (2018) Expression and replication of virus-like circular DNA in human cells. Scientific reports 8(1):2851.
  24. Giraldo R, Fernandez-Tornero C, Evans PR, Diaz-Orejas R, & Romero A (2003) A conformational switch between transcriptional repression and replication initiation in the RepA dimerization domain. Nature structural biology 10(7):565-571.
  25. Komori H, et al. (1999) Crystal structure of a prokaryotic replication initiator protein bound to DNA at 2.6 Å resolution. The EMBO journal 18(17):4597-4607.
  26. Nakamura A, Wada C, & Miki K (2007) Structural basis for regulation of bifunctional roles in replication initiator protein. Proceedings of the National Academy of Sciences of the United States of America 104(47):18484-18489.
  27. Swan MK, Bastia D, & Davies C (2006) Crystal structure of pi initiator protein-iteron complex of plasmid R6K: implications for initiation of plasmid DNA replication. Proceedings of the National Academy of Sciences of the United States of America 103(49):18481-18486.
  28. Forest KT & Filutowicz MS (2003) Remodeling of replication initiator proteins. Nature structural biology 10(7):496-498.
  29. Hansman GS, et al. (2011) Crystal structures of GII.10 and GII.12 norovirus protruding domains in complex with histo-blood group antigens reveal details for a potential site of vulnerability. Journal of virology 85(13):6687-6701.
  30. Bourenkov GP & Popov AN (2010) Optimization of data collection taking radiation damage into account. Acta crystallographica. Section D, Biological crystallography 66(Pt 4):409-419.
  31. Gabadinho J, et al. (2010) MxCuBE: a synchrotron beamline control environment customized for macromolecular crystallography experiments. Journal of synchrotron radiation 17(5):700-707.
  32. Kabsch W (2010) XDS. Acta crystallographica. Section D, Biological crystallography 66(Pt 2):125-132.
  33. Liu Q, et al. (2012) Structures from anomalous diffraction of native biological macromolecules. Science 336(6084):1033-1037.
  34. Sheldrick GM (2010) Experimental phasing with SHELXC/D/E: combining chain tracing with density modification. Acta crystallographica. Section D, Biological crystallography 66(Pt 4):479-485.
  35. Langer G, Cohen SX, Lamzin VS, & Perrakis A (2008) Automated macromolecular model building for X-ray crystallography using ARP/wARP version 7. Nature protocols 3(7):1171-1179.
  36. Emsley P, Lohkamp B, Scott WG, & Cowtan K (2010) Features and development of Coot. Acta crystallographica. Section D, Biological crystallography 66(Pt 4):486-501.
  37. Adams PD, et al. (2010) PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta crystallographica. Section D, Biological crystallography 66(Pt 2):213-221.


1. Crystal structure of MSBI1.176 Rep protein characterized as having (a) a space group of P21 and (b) unit cell dimensions of a = 32.38 Å ± 1-2 Å, b = 77.77 Å ± 1-2 Å and c = 47.68 Å ± 1-2 Å, α = 90°, β = 90.66° and γ = 90°.
2. Crystal structure of MSBI1.176 Rep protein containing a hydrophobic pocket comprising Leu11, Leu18 and/or Ile25.
3. Crystal structure of MSBI1.176 Rep protein containing a DNA interacting site comprising residues Lys69, Lys73, Lys85, Arg90, Arg78 and/or Arg96.
4. A method for producing a crystal of MSBI1.176 Rep protein or a crystallizable fragment thereof, said method comprising the steps of:

(a) preparing a solution of recombinantly prepared MSBI1.176 Rep protein, preferably WH1 domain, in a crystallization reagent,

(b) crystallizing said MSBI1.176 WH1 domain by vapor diffusion.

5. The method of claim 4, wherein the crystallization reagent contains 0.2 M magnesium acetate and 20% PEG3350.
6. The method of claim 4, wherein the vapor diffusion is hanging-drop vapor diffusion.
7. A method for screening an inhibitor of the MSBI1.176 Rep protein, said method comprising the steps of:

(a) providing a solution of said MSBI1.176 Rep protein or a crystallizable fragment thereof,

(b) contacting at least one candidate compound with the MSBI1.176 Rep protein in said solution,

(c) preparing crystals of said MSBI1.176 Rep protein, and

(d) identifying a candidate binding compound of said MSBI1.176 Rep protein.

8. The method according to claim 7, wherein in step (d) the binding of the candidate compound to the DNA-interacting site as defined in claim 3 is determined.
9. Use of the crystal structure as defined in any one of claims 1 to 3 for obtaining atomic spatial relationship data.
10. Use of the crystal structure according to claim 9 for screening, identifying, designing, or optimizing a drug binding to the MSBI1.176 Rep protein.
11. Use of the crystal structure of MSBI1.176 Rep protein for in silico screening of the ability of a candidate compound to bind to said MSBI1.176 Rep protein, in particular to bind to the DNA interacting site.



This list of references cited by the applicant is for the reader's convenience only. It does not form part of the European patent document. Even though great care has been taken in compiling the references, errors or omissions cannot be excluded and the EPO disclaims all liability in this regard.

Non-patent literature cited in the description