(11)EP 3 777 596 B1


(45)Mention of the grant of the patent:
20.09.2023 Bulletin 2023/38

(21)Application number: 19780684.7

(22)Date of filing:  03.04.2019
(51)International Patent Classification (IPC): 
A43D 1/02(2006.01)
A43D 999/00(2006.01)
G06T 7/62(2017.01)
G06T 17/00(2006.01)
A61B 5/107(2006.01)
(52)Cooperative Patent Classification (CPC):
A43D 1/025; A43D 999/00; A61B 5/1074; G06T 7/62; G06T 7/75; G06T 2207/30244; G06T 2207/30196; G06T 7/12; G06T 2207/10016; A61B 5/1079
(86)International application number:
(87)International publication number:
WO 2019/194707 (10.10.2019 Gazette  2019/41)





(84)Designated Contracting States:

(30)Priority: 03.04.2018 RU 2018111932

(43)Date of publication of application:
17.02.2021 Bulletin 2021/07

(73)Proprietor: FITTIN Inc.
Redwood City, CA 94063 (US)

(72)Inventors:
  • REVKOV, Andrej Anatolevich
    Voronezh 394043 (RU)
  • CHUJKO, Grigorij Vladimirovich
    Voronezh 394000 (RU)
  • SHEDRIN, Ivan Sergeevich
    Voronezh 394087 (RU)
  • REVKOV, Egor Andreevich
    Novosibirsk 630049 (RU)
  • GRISHKO, Natalja Demjanovna
    Novosibirsk 630001 (RU)
  • POSMETEV, Viktor Valerevich
    Voronezh 394088 (RU)
  • KANIN, Dmitry Mikhailovich
    Selo Jamnoe 396005 (RU)

(74)Representative: Sloboshanin, Sergej et al
V. Füner Ebbinghaus Finck Hano Patentanwälte Mariahilfplatz 3
81541 München (DE)

(56)References cited:
US-A1- 2002 172 433
US-A1- 2010 238 271
US-A1- 2017 249 783
RU-C1- 2 034 509
US-A1- 2009 245 679
US-A1- 2016 286 906
US-B1- 6 650 778
    Note: Within nine months from the publication of the mention of the grant of the European patent, any person may give notice to the European Patent Office of opposition to the European patent granted. Notice of opposition shall be filed in a written reasoned statement. It shall not be deemed to have been filed until the opposition fee has been paid. (Art. 99(1) European Patent Convention).



    [0001] This invention falls into the category of measurement methods and can be used in light industry, especially in the footwear and garment industries, when measuring the shape and size of parts of human body. It can also be used in orthopedics and makes it possible to perform remote diagnosis of foot deformities (diseases), particularly, flat-footedness.

    [0002] The method can be used in everyday life and in the trading industry during the marketing, sale, and remote purchase of apparel and footwear that fit the real features of the customer's body. It can also be used for virtual footwear fitting at Internet stores or in mobile applications for remotely purchasing footwear and ordering handmade footwear.


    [0003] A known foot measurement method [US 20170053335 A1] entails acquiring a three-dimensional image of a foot and comparing it to an extensive database of stored footwear information. This method determines the size better than conventional systems: a smartphone (iPhone) digital camera and related interface tools suggest the best footwear option from multiple offers.

    [0004] The shortcomings of this method are low accuracy, a narrow area of application (the method does not work in the presence of heavy shadows, glare, or poor lighting), high labor intensity, and lengthy footwear selection time.

    [0005] Document US 2017/249783 A1 refers to a method and a system for virtually fitting a foot with a shoe by obtaining a first and a second 3D model associated with the surface of the foot and the interior cavity of the shoe, respectively. According to said method, a first image depicting a substantially planar surface is obtained, wherein the planar surface has non-patterned and non-repetitive markings. Further, a sequence of second images is obtained, wherein the second images show an object to be scanned together with the planar surface. The shortcomings of said method are the need to provide an additional planar surface with markings for scanning and to scan said planar surface without the object to be scanned. Also, the object must be fully scanned over 360°, which is labor-intensive.

    [0006] A method for the contactless measurement of the foot surface [RU2034509, 05/10/1995] is the one closest to the engineering solution at hand. This method consists of placing the foot on a base platform, illuminating the foot surface with sheet light beams, and obtaining an image of the trace of these beams for the outer and inner side surfaces of the foot separately using two video cameras, the optical axes of which are oriented at acute angles relative to the reference plane and the sheet beams. The foot is aligned by the video cameras, which are turned toward the heel of the foot. Here, an angle forms between the projection of the optical axes of the video cameras onto the reference plane. This angle's vertex is oriented toward the heel of the foot and occupies a point located along the projection of the angle's bisector between the optical axes of the cameras on the reference plane. Using a heel stop, the heel of the foot is aligned with a predetermined reference point located along the projection of the angle's bisector.

    [0007] The shortcomings of this prototype method include poor accuracy, high labor intensity, and lengthy footwear selection time.


    [0008] The technical result of the invention is increased accuracy.

    [0009] The invention relates to a method for measuring a shape and a size of parts of a human body as specified in appended independent claim 1. Additional embodiments of the invention are disclosed in the dependent claims.


    [0010] The use of this invention decreases labor intensity and the time needed to measure the shape and size of a part of the human body, and it expands the method's applicability to groups of people with limited mobility, such as infants and the elderly.

    [0011] The method makes it possible to generate a 3D model of, for example, a leg (the foot and the lower leg) by processing a series of photographs obtained while passing a device with a video camera around the foot. Its advantages are increased accuracy, a decrease in labor intensity and measurement time, and an expanded area of application by virtue of the ability to use it in conjunction with mobile devices. The subject method is ideally suited for virtual fitting, since it is convenient for users and does not require special equipment, special software, or the transfer of a large volume of data.

    [0012] At the present time, the process of the 3D scanning of objects using professional 3D scanners is based on the additive principle; i.e., everything that a scanner sees is then included in the 3D model. For this reason, scanning consists of two steps: object scanning and subsequent manual 3D model processing, during which it is necessary to eliminate the debris and residue of the surface on which the object is standing or is being held, and to properly mend the holes. Because this step is not presently automated in any way, the human factor plays a major role, which means that there is a high likelihood of human errors.

    [0013] This method, firstly, is based on the inverse principle: everything that a scanner captures is excluded from the final voxel model. Secondly, it automates all the steps, minimizes errors, and reduces noise at every step, and it includes verification operations for ensuring the best object accuracy.

    [0014] Scanning with a conventional 3D scanner yields a model with considerable noise. Figure 1 shows a 3D model of a foot that was obtained using a professional Sense 3D scanner, which has a stated accuracy on the order of 1 mm. Figure 2 depicts a final voxel model obtained by scanning with our technique. A comparison of these two 3D models, which is presented in Figure 3, confirms the high accuracy of our technique. Another advantage of the method under consideration is that a solid-state model is ultimately obtained, as opposed to the polygonal model produced by scanning with a professional Sense 3D scanner.

    [0015] The embodiment of the invention is explained by the following examples.

    Example 1.

    1. Preparation for scanning

    [0016] Prior to the start of scanning, it is necessary to ensure conditions that are sufficient for obtaining the highest quality three-dimensional model of the foot. In particular:
    • the lighting must be uniform, without bright directional light, shadows, or glare;
    • the floor must contrast with white, be monochromatic, and be hard (when scanning on a carpet with a high pile, the corners of the A4 sheet are raised, which makes it impossible to accurately determine the A4 sheet coordinates);
    • the foot should be clad in a dark sock (preferably black), so that it is isolated from the rest of the background as a connected dark domain.

    2. The scanning step (obtaining a series of photographs)

    [0017] In order to generate a three-dimensional model of a foot, it is necessary to place the foot in the middle of a sheet of white A4 paper, line the heel up with the edge of the sheet, lean on the foot, and pass a device with a video camera (a smartphone or a web camera connected to a personal computer or laptop) around the foot in an arc of about 200° from one side of the body to the other. It is desirable to pass the device around the foot at knee level in order to achieve the best 3D model accuracy; however, the method still works when the device is moved at a height of 10 cm or more above the floor. To begin scanning, the user presses the START button, the smartphone's camera is activated for recording, and one frame per second is saved from the video stream; the scanning process ends when the DONE button is pressed. The scanning process using a smartphone is shown in Figures 4 and 5.

    [0018] It is understood that the A4 sheet must be fully visible in each photograph. The lower leg must be visible in the photographs in order to determine lower leg shape and size. In addition, at least two pixels must fall within one millimeter of the scene shown in the photograph. For example, for a 5-megapixel camera, this is achieved if the length of the A4 sheet in a photograph is not less than ⅓ of the length of the photograph.
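
    The two-pixels-per-millimeter condition can be checked with simple arithmetic. The sketch below is illustrative only: it assumes a 5-megapixel frame of 2592 × 1944 pixels, takes the A4 long side as 297 mm, and ignores perspective foreshortening; the function name is hypothetical.

```python
A4_LENGTH_MM = 297.0  # long side of an A4 sheet


def pixels_per_mm(photo_length_px: int, sheet_fraction: float) -> float:
    """Pixel density on the sheet when its long side spans
    `sheet_fraction` of the photograph's long side."""
    return photo_length_px * sheet_fraction / A4_LENGTH_MM


# Assumed 5-megapixel frame: 2592 x 1944 pixels, sheet spanning 1/3 of the length.
density = pixels_per_mm(2592, 1 / 3)
print(f"{density:.2f} px/mm, sufficient: {density >= 2.0}")
```

    Under these assumptions the sheet spans 864 pixels, or roughly 2.9 pixels per millimeter, which satisfies the condition.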

    3. Creating a series of photographs

    [0019] As a result of scanning, a series of photographs is obtained in the resolution set on the mobile device. At least 2 photographs are needed, the maximum number is unlimited, and the optimal number is 12. An example of photographs of this type is presented in Figure 6.

    [0020] In order to achieve maximum three-dimensional model quality and a high processing speed, it is advisable to keep 12 photographs from the entire set, selected as the sharpest and evenly distributed throughout the set.
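
    One possible way to keep 12 sharp, evenly distributed photographs is sketched below. It is an assumption, not the patented selection rule: the series is split into 12 contiguous segments and the sharpest frame of each segment is kept. The per-frame sharpness scores are taken as given (how they are measured is not specified in the description), and `select_frames` is a hypothetical name.

```python
def select_frames(sharpness, count=12):
    """Return indices of up to `count` frames: the sharpest frame from
    each of `count` contiguous, nearly equal segments of the series."""
    n = len(sharpness)
    if n <= count:
        return list(range(n))
    chosen = []
    for k in range(count):
        start = k * n // count
        stop = (k + 1) * n // count
        # Sharpest frame within this segment (first one wins on ties).
        best = max(range(start, stop), key=lambda i: sharpness[i])
        chosen.append(best)
    return chosen


# Example: 36 frames with made-up sharpness scores.
scores = [(i * 7) % 5 for i in range(36)]
print(select_frames(scores))
```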

    4. Reducing and cropping the photographs

    [0021] To ensure identical (predictable) photograph processing conditions and to decrease outgoing traffic when transferring photographs from a smartphone to a server, all the photographs are reduced by an integer factor that is a multiple of 2, so that the resolution remains at least 480 × 640. Photographs still larger than this resolution are cropped down to it so that the A4 sheet is located in the center of the photograph. The output is a series of photographs with a resolution of 480 × 640.
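
    The reduction and cropping arithmetic can be sketched as follows. This is one reading of the step, with assumptions stated plainly: the factor is taken to be the largest even integer that keeps a portrait photograph at least 480 × 640, and the approximate sheet center is assumed to be known when cropping. Both function names are hypothetical.

```python
TARGET_W, TARGET_H = 480, 640


def reduction_factor(width, height):
    """Largest even factor that keeps the reduced photo at least
    480 x 640; returns 1 if no even factor fits."""
    factor = 1
    candidate = 2
    while width // candidate >= TARGET_W and height // candidate >= TARGET_H:
        factor = candidate
        candidate += 2
    return factor


def crop_box(width, height, center_x, center_y):
    """(left, top, right, bottom) of a 480 x 640 window centered as close
    to (center_x, center_y) as the image borders allow."""
    left = min(max(center_x - TARGET_W // 2, 0), width - TARGET_W)
    top = min(max(center_y - TARGET_H // 2, 0), height - TARGET_H)
    return left, top, left + TARGET_W, top + TARGET_H


factor = reduction_factor(1944, 2592)          # assumed 5-megapixel portrait frame
reduced = (1944 // factor, 2592 // factor)     # size after reduction
print(factor, reduced, crop_box(*reduced, 243, 324))
```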

    5. The A4 sheet contour is determined on each photograph

    [0022] By gradually increasing the image contrast with a shift in the average level, after 10-20 iterations, the light areas become white, the dark areas become black, and the border remains in its place. The shadows disappear and blurred borders turn into sharp ones.
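
    The effect of the iterative contrast increase can be illustrated on a single row of pixels. The sketch assumes intensities scaled to the range 0 to 1, a fixed gain of 1.5, and a mean that is recomputed on every pass (one possible reading of "a shift in the average level"); none of these values come from the description.

```python
def sharpen_contrast(row, gain=1.5, iterations=15):
    """Repeatedly stretch intensities (0..1) away from the current mean,
    clamping to the valid range after every pass."""
    values = list(row)
    for _ in range(iterations):
        mean = sum(values) / len(values)
        values = [min(1.0, max(0.0, mean + gain * (v - mean))) for v in values]
    return values


# Bright paper, a shadowed patch of paper (0.6), then the dark sock.
row = [0.9, 0.85, 0.8, 0.6, 0.3, 0.25, 0.2]
print(sharpen_contrast(row))
```

    In this example the shadowed patch ends up white together with the rest of the paper, and the paper-to-sock border stays between the fourth and fifth pixels.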

    [0023] The Monte Carlo technique is used to search for the A4 sheet. A set of photographs with a detected A4 sheet contour is examined in Figure 7.

    [0024] The template shown in Figure 8 is used to verify the match of the predicted A4 sheet image with the real one. The template takes the form of rows of points parallel to each edge of the sheet. The number of rows is 2 or higher, with 6 rows being optimal; the number of points per row is 4 or higher, with 20-40 being optimal. The optimization criterion for checking the match consists of the differences in intensity at the paired points of the corresponding rows, taken with certain weighting coefficients.

    [0025] Gradual narrowing of the ranges of the sought camera parameters is used to speed up the convergence of the Monte Carlo technique. Finding the A4 sheet with acceptable accuracy takes on the order of 10^5 to 10^6 elementary checks.
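
    A match criterion of this kind can be sketched as below. The sketch is a simplification under stated assumptions: the predicted sheet is an axis-aligned rectangle rather than a perspective quadrilateral, three inner and three outer rows (6 rows in total) of 20 points each are used, and the weights 1.0, 0.5, 0.25 are invented for illustration.

```python
def intensity(image, x, y):
    """Pixel value at (x, y); positions outside the image count as dark."""
    if 0 <= y < len(image) and 0 <= x < len(image[0]):
        return image[y][x]
    return 0.0


def match_score(image, rect, points_per_row=20, offsets=(1, 2, 3),
                weights=(1.0, 0.5, 0.25)):
    """Weighted sum of (inner - outer) intensity differences over paired
    points on rows parallel to each edge of rect = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = rect
    score = 0.0
    for k in range(points_per_row):
        t = k / (points_per_row - 1)
        x = round(x0 + t * (x1 - x0))
        y = round(y0 + t * (y1 - y0))
        for d, w in zip(offsets, weights):
            # (inner point, outer point) for the left, right, top, bottom edges.
            pairs = [((x0 + d, y), (x0 - d, y)),
                     ((x1 - d, y), (x1 + d, y)),
                     ((x, y0 + d), (x, y0 - d)),
                     ((x, y1 - d), (x, y1 + d))]
            for inner, outer in pairs:
                score += w * (intensity(image, *inner) - intensity(image, *outer))
    return score


# Synthetic 80 x 50 image: white sheet at x 20..59, y 10..39 on a dark floor.
image = [[1.0 if 20 <= x <= 59 and 10 <= y <= 39 else 0.0 for x in range(80)]
         for y in range(50)]
good = match_score(image, (20, 10, 59, 39))      # prediction on the sheet
shifted = match_score(image, (25, 10, 64, 39))   # prediction shifted by 5 px
print(good, shifted)
```

    The correctly placed rectangle receives the maximum score because every inner point is bright and every outer point is dark, while the shifted rectangle scores lower.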
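
    Random search with gradual range narrowing can be sketched generically as follows. The round count, sample count, shrink factor, and the toy score function are all assumptions chosen for illustration; in the method itself the score would be the template match criterion and the parameters would be those of the camera.

```python
import random


def monte_carlo_search(score, ranges, rounds=30, samples=200, shrink=0.85, seed=0):
    """Maximize score(params) by uniform random sampling, narrowing every
    parameter range around the best sample after each round."""
    rng = random.Random(seed)
    ranges = [tuple(r) for r in ranges]
    best, best_score = None, float("-inf")
    for _ in range(rounds):
        for _ in range(samples):
            trial = [rng.uniform(lo, hi) for lo, hi in ranges]
            s = score(trial)
            if s > best_score:
                best, best_score = trial, s
        # Shrink each range and re-center it on the best parameters so far.
        ranges = [(b - shrink * (hi - lo) / 2, b + shrink * (hi - lo) / 2)
                  for b, (lo, hi) in zip(best, ranges)]
    return best, best_score


# Toy problem: recover three hidden parameters from a quadratic score.
target = [3.0, -2.0, 5.0]
toy_score = lambda p: -sum((a - b) ** 2 for a, b in zip(p, target))
best, value = monte_carlo_search(toy_score, [(-10, 10)] * 3)
print([round(b, 2) for b in best])
```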

    6. The camera position and orientation are determined as a function of the A4 sheet position

    [0026] The task of determining the camera parameters (the x, y, and z coordinates in space, the camera orientation angles, α, β, and γ, and the focal length, f) is posed as the inverse: it is necessary to select the camera parameters in such a manner that the predicted image of the A4 sheet matches the real one in a photograph to the greatest extent possible. An illustration that explains the determination of camera position and orientation is presented in Figure 9.
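
    The "predicted image" of the sheet for a candidate set of camera parameters can be computed with a pinhole projection. The sketch below is a minimal illustration, not the patented formulation: the rotation order (about X, then Y, then Z), the downward-looking zero orientation, the focal length expressed in pixels, and the principal point at the image origin are all assumptions.

```python
import math

# A4 corners in millimeters, sheet centered at the origin of the floor plane.
A4_CORNERS_MM = [(-105.0, -148.5, 0.0), (105.0, -148.5, 0.0),
                 (105.0, 148.5, 0.0), (-105.0, 148.5, 0.0)]


def project_sheet(x, y, z, alpha, beta, gamma, f):
    """Predicted image coordinates of the four A4 corners for a pinhole
    camera at (x, y, z) mm with orientation angles alpha, beta, gamma
    (radians) and focal length f in pixels."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    projected = []
    for px, py, pz in A4_CORNERS_MM:
        # Corner position relative to the camera center.
        dx, dy, dz = px - x, py - y, pz - z
        # Rotate about X, then Y, then Z (assumed convention).
        dy, dz = ca * dy - sa * dz, sa * dy + ca * dz
        dx, dz = cb * dx + sb * dz, -sb * dx + cb * dz
        dx, dy = cg * dx - sg * dy, sg * dx + cg * dy
        depth = -dz  # with zero angles the camera looks straight down
        projected.append((f * dx / depth, f * dy / depth))
    return projected


# Camera 1 m above the sheet center, looking straight down.
print(project_sheet(0.0, 0.0, 1000.0, 0.0, 0.0, 0.0, 1000.0))
```

    A search procedure would vary the seven parameters, project the sheet in this way, and score each prediction against the photograph.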

    [0027] The camera parameters found are then used to cut the 3D model from the voxel array.

    [0028] 7. The closed contour of the foot and the lower leg between the dark sock and the light A4 sheet and the background is distinguished in each photograph using the evolutionary technique.

    [0029] The closed contour of the foot and the lower leg is sought using a particle dynamics technique adapted to the subject area. In this technique, the contour is pulled onto the foot area: the points are initially located on the edges of the A4 sheet, and the "physical interaction" and "mechanical movement" of the points are calculated. The conditional attractive forces acting between adjacent points draw the points towards the foot while keeping the distances between the points roughly constant. A force proportional to the numerical estimate of the second derivative of photograph intensity across the contour is added in order to find the foot contour. To this end, the intensity is read from paired points on the inner and outer sides of the contour, distributed relative to the contour (by analogy with the A4 sheet search step). The result of foot contour isolation from the background is shown in Figure 10.
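
    The idea of a chain of points contracting onto a dark region can be illustrated with a much simplified sketch. It is not the technique described above: the second-derivative force is replaced by a plain stopping rule (a point may not step onto a dark pixel), the starting contour is a circle rather than the sheet edges, and all names and constants are hypothetical.

```python
import math


def shrink_contour(image, center, radius, n_points=40, step=0.5, iterations=400):
    """Pull a closed chain of points inward until it rests on a dark region."""
    cx, cy = center
    pts = [(cx + radius * math.cos(2 * math.pi * i / n_points),
            cy + radius * math.sin(2 * math.pi * i / n_points))
           for i in range(n_points)]
    for _ in range(iterations):
        moved = []
        for i, (x, y) in enumerate(pts):
            (ax, ay), (bx, by) = pts[i - 1], pts[(i + 1) % n_points]
            # Attraction toward the midpoint of the two neighboring points.
            nx = x + step * ((ax + bx) / 2 - x)
            ny = y + step * ((ay + by) / 2 - y)
            # Stopping rule: a point may not step onto a dark pixel.
            if image[round(ny)][round(nx)] < 0.5:
                nx, ny = x, y
            moved.append((nx, ny))
        pts = moved
    return pts


# Synthetic 101 x 101 image: dark disk of radius 20 on a white background.
image = [[0.0 if (x - 50) ** 2 + (y - 50) ** 2 <= 400 else 1.0
          for x in range(101)] for y in range(101)]
contour = shrink_contour(image, (50, 50), 40)
radii = [math.hypot(x - 50, y - 50) for x, y in contour]
print(round(min(radii), 1), round(max(radii), 1))
```

    The chain starts at radius 40 and settles close to the edge of the dark disk, which plays the role of the sock in this toy image.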

    [0030] Figure 11 demonstrates the use of this technique to isolate a human contour. Figure 12 demonstrates the use of the technique to isolate a contour of the human hand.

    [0031] Contour accuracy is additionally checked via the following sequence of actions:
    • the image is converted to gray,
    • a median filter is used for smoothing,
    • a Canny boundary detector (a Canny filter) is then applied,
    • the image brightness gradient at each point is calculated using the Sobel operator.

    [0032] The result of these actions is depicted in Figure 13. From this fill, we then obtain the contour that is shown in Figure 14.
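
    Part of this verification chain can be sketched in plain Python on a tiny image. The sketch covers only grayscale conversion, a 3 × 3 median filter, and the Sobel gradient magnitude; the non-maximum suppression and hysteresis thresholding of a full Canny detector are left out for brevity. The luma weights and the window size are assumptions.

```python
def to_gray(rgb):
    """Luma conversion (0.299, 0.587, 0.114 weights) of rows of (r, g, b) pixels."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row] for row in rgb]


def median3x3(img):
    """3 x 3 median filter; border pixels are left unchanged."""
    out = [row[:] for row in img]
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            window = sorted(img[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = window[4]
    return out


def sobel_magnitude(img):
    """Gradient magnitude from the 3 x 3 Sobel kernels; borders stay 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


# 9 x 9 test image: dark left part, white right part, one white noise pixel.
rgb = [[(255, 255, 255) if x >= 4 else (0, 0, 0) for x in range(9)] for y in range(9)]
rgb[4][1] = (255, 255, 255)
edges = sobel_magnitude(median3x3(to_gray(rgb)))
print([round(v) for v in edges[4]])
```

    The median filter removes the isolated noise pixel, so the only strong gradient left in the row lies on the dark-to-white border.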

    [0033] The subject contour construction technique makes it possible to determine the foot arch height, which must be taken into account when selecting footwear for people with flat feet. With most lighting options, a shadow is present in the image near this area. The algorithm developed makes it possible to isolate the dark connected area of the foot from the shadow, except in instances of a very strong or smoothly transitioning shadow, when it is difficult to separate the shadow from the surface of the foot even for a human observer. A 3D model of a foot with a well-defined foot arch is shown in Figures 15 and 16.

    [0034] 8. With each frame, from the voxel array, which is initially parallelepiped-shaped and is located above the A4 sheet, voxels are cut off outside the space bounded by a conical secant surface with its vertex located at the camera and its directrix being the foot contour obtained during the previous step, as depicted in Figure 17.

    [0035] Each frame makes it possible to cut the projection of the previously found foot contour from the initial monolithic voxel array. As the voxel model is gradually cut from different aspect angles, a sufficiently accurate 3D voxel model of the foot is created. The foot voxel model creation sequence is presented in Figure 18. Three projections of the initial voxel array are shown on the left. Three projections of the resultant foot voxel model are shown on the right. Voxel and polygonal representations (without additional smoothing and grid regularization) of a 3D model of the foot are examined in Figure 19 (the right foot - a voxel representation, the left foot - a polygonal representation).
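
    The cutting of the voxel array by a cone with its vertex at the camera can be sketched as follows. The setup is a toy assumption: the object is a sphere, so its contour in every photograph is a circle of known radius, each camera looks at the origin, and six cameras are placed on a 200° arc. The function names and all numeric values are illustrative only.

```python
import math


def camera_axes(cam):
    """Unit forward, right, and up vectors for a camera at `cam` looking at the origin."""
    dist = math.sqrt(sum(c * c for c in cam))
    fwd = tuple(-c / dist for c in cam)
    rx, ry = fwd[1], -fwd[0]                      # forward x world-up (0, 0, 1)
    rn = math.hypot(rx, ry)
    right = (rx / rn, ry / rn, 0.0)
    up = (right[1] * fwd[2] - right[2] * fwd[1],
          right[2] * fwd[0] - right[0] * fwd[2],
          right[0] * fwd[1] - right[1] * fwd[0])  # right x forward
    return fwd, right, up


def carve(voxels, cam, f, inside_contour):
    """Keep only voxels whose projection falls inside the contour, that is,
    inside the cone with its vertex at the camera."""
    fwd, right, up = camera_axes(cam)
    kept = set()
    for v in voxels:
        d = tuple(v[i] - cam[i] for i in range(3))
        depth = sum(d[i] * fwd[i] for i in range(3))
        if depth <= 0:
            continue  # behind the camera: cut off
        u = f * sum(d[i] * right[i] for i in range(3)) / depth
        w = f * sum(d[i] * up[i] for i in range(3)) / depth
        if inside_contour(u, w):
            kept.add(v)
    return kept


R, D, F = 5.5, 30.0, 1000.0                       # sphere radius, camera distance, focal length
silhouette_radius = F * R / math.sqrt(D * D - R * R)
inside = lambda u, w: u * u + w * w <= silhouette_radius ** 2

# Initial parallelepiped-shaped array of 17 x 17 x 17 voxel centers.
voxels = {(x, y, z) for x in range(-8, 9) for y in range(-8, 9) for z in range(-8, 9)}
for k in range(6):
    azimuth = math.radians(-100 + 40 * k)         # cameras on a 200-degree arc
    cam = (D * math.cos(math.radians(30)) * math.cos(azimuth),
           D * math.cos(math.radians(30)) * math.sin(azimuth),
           D * math.sin(math.radians(30)))
    voxels = carve(voxels, cam, F, inside)
print(len(voxels))
```

    Every voxel inside the sphere survives all cuts, while voxels outside every cone are removed, so the remaining set approximates the object from the outside.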

    [0036] 9. A scanning artifact in the form of a peculiar "comb" is eliminated from the resultant voxel model.

    [0037] In order to simplify and speed up the scanning procedure, the camera is guided in a smooth arc at knee height. Here, the area above the foot (the "bridge") is less accurately defined than the other parts of the foot due to the fact that when the voxel array is cut from these optical directions by the leg contour projections above the foot, an incompletely cut area is created - a "comb", which represents an area of the model with a sharp upper edge, as a result of which 3D foot model parameters such as "height in lift" and "lift-heel girth" are overestimated.

    [0038] Therefore, an algorithm for correcting the upper surface of the foot is used during the next step.

    [0039] A special algorithm based on the interacting moving point technique is used to eliminate the "comb". To this end, the surface area above the foot, which must be corrected, is isolated beforehand. During the evolutionary cycle, the mechanical interaction and the movement of the surface points are then calculated. Information about the shape of the foot from the sides is used to facilitate the soundest restoration of the surface above the foot. Two types of forces act between the moving points: attractive forces, which tend to shorten the distance between the points to the greatest extent possible, and bending reduction forces, which tend to make the surface as smooth as possible. In Figure 20, three projections of a voxel model with a "comb" are examined on the left, while three projections of a voxel model following "comb" removal are examined on the right. Two voxel models with a "comb" and without a "comb" are presented in Figure 21.
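
    The interplay of the two forces can be illustrated on a one-dimensional height profile. This is a loose analogy rather than the algorithm itself: the real procedure works on a 3D surface and uses the shape of the foot from the sides, whereas the sketch relaxes a single cross-section with fixed end points. The force constants are assumptions.

```python
def smooth_profile(heights, attract=0.2, bend=0.03, iterations=200):
    """Relax a height profile with fixed end points under a shortening
    (attractive) force and a bending reduction force."""
    h = list(heights)
    n = len(h)
    for _ in range(iterations):
        # Discrete second derivative; kept at zero for the fixed end points.
        lap = [0.0] * n
        for i in range(1, n - 1):
            lap[i] = h[i - 1] - 2 * h[i] + h[i + 1]
        for i in range(1, n - 1):
            bending = lap[i - 1] - 2 * lap[i] + lap[i + 1]
            # Attraction shortens the curve; the bending term flattens sharp ridges.
            h[i] += attract * lap[i] - bend * bending
    return h


# Cross-section with a sharp "comb" in the middle.
profile = [1.0] * 21
profile[10] = 3.0
smoothed = smooth_profile(profile)
print(round(max(smoothed), 2))
```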

    [0040] The comb correction algorithm is not required if more thorough leg scanning is performed, capturing "side views" on the left and right, for example, if the camera guidance arc does not proceed from the knee height, but rather begins at the supporting surface to the right of the leg, rises to the knee level when viewed from the front of the leg, and descends again to the supporting surface to the left of the leg.

    [0041] 10. The 3D model is converted from the voxel format to the "obj" format for subsequent viewing and storage in a compact form. In this instance, the model is additionally smoothed and grid regularization is performed.

    [0042] Over the course of processing, a leg area space filling field is created beforehand. The key points are then located on the outer surface of the field (the maximal gradient region) based on the maximization of surface coverage uniformity, and they are connected by means of edges. Using the field approach, the model is smoothed; the smoothing level is regulated by a special parameter during the creation of the field. Using the set of points and edges consequently obtained, triangles are constructed, and this data set is converted to the "obj" format.
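
    The final conversion step can be sketched as a small serializer for the text-based "obj" format, which lists vertices as `v x y z` lines and triangles as `f i j k` lines with 1-based vertex indices. The function name and the four-decimal precision are assumptions.

```python
def to_obj(vertices, triangles):
    """Serialize a triangle mesh as "obj" text.
    `triangles` holds 0-based vertex indices; "obj" indices are 1-based."""
    lines = [f"v {x:.4f} {y:.4f} {z:.4f}" for x, y, z in vertices]
    lines += [f"f {a + 1} {b + 1} {c + 1}" for a, b, c in triangles]
    return "\n".join(lines) + "\n"


# A tetrahedron as a minimal closed mesh.
vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
triangles = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
print(to_obj(vertices, triangles))
```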

    [0043] Two accurate polygonal 3D models of human legs are presented in Figure 22, which were obtained in a contactless manner by passing a device with a video camera around the leg in an arc from knee level from one side of the body to the other.


    1. A method for measuring a shape and a size of parts of a human body, over a course of which:

    - a part of the human body is placed in the middle of a flat object of a known shape and a known size,

    - based on a series of photographs obtained from different aspect angles of the part of the human body, an image of the flat object of the known shape and a known size is identified on each photograph, by means of which a position and an orientation of a camera are determined, over a course of which camera position and orientation parameters are repeatedly reset in a random manner, and a predicted image of the flat object of the known shape and known size is compared to a real one using a template that includes several rows of points along a contour of the flat object of the known shape and size, between the points of an inner and an outer rows of which a difference in brightness is computed pairwise in order to calculate a match criterion,

    - a body part area is isolated from a background in the image, and an area of an arbitrary body that does not belong to a projection of the part of the human body is gradually cut out of the arbitrary body that initially included the part of the human body on each photograph,

    wherein with each photograph, from a voxel array, which is initially parallelepiped-shaped and is located above the flat object of the known shape and known size, voxels are cut off outside the space bounded by a conical secant surface with its vertex located at the camera and its directrix being the contour of the part of the human body obtained during the previous step,

    wherein as the voxel model is gradually cut from different aspect angles, a 3D voxel model of the part of the human body is created, allowing to generate a 3D model of the part of the human body by means of processing the series of photographs obtained while passing a device with the camera around the part of the human body.

    2. The method according to claim 1, wherein the part of the human body is a foot and a leg, and wherein the flat object of the known shape and the known size is a sheet of white A4 paper.
    3. The method according to any one of the preceding claims, wherein for isolating the part of the human body in the image from the background, a particle dynamics technique is employed, within which a boundary between the body part area and the background is presented in a form of a set of points located sequentially so as to form a contour, moving under an action of conditional forces that preserve a connectivity of the contour and tend to move a point to a place of a significant change in brightness, thereby forming a closed contour that circumscribes a simply connected body part area.


    1. Verfahren zum Messen einer Form und einer Abmessung von Teilen eines menschlichen Körpers, wobei:

    - ein Teil des menschlichen Körpers in der Mitte eines flachen Objekts einer bekannten Form und einer bekannten Abmessung platziert wird,

    - basierend auf einer Reihe von Fotografien, die aus verschiedenen Blickwinkeln von dem Teil des menschlichen Körpers aufgenommen werden, ein Bild des flachen Objekts der bekannten Form und einer bekannten Abmessung auf jeder Fotografie identifiziert wird, anhand dessen eine Position und eine Ausrichtung einer Kamera bestimmt werden, im Laufe dessen Positions- und Ausrichtungsparameter der Kamera wiederholt nach dem Zufallsprinzip zurückgesetzt werden, und ein vorhergesagtes Bild des flachen Objekts der bekannten Form und bekannten Abmessung mit einem realen Bild unter Verwendung einer Schablone verglichen wird, die mehrere Reihen von Punkten entlang einer Kontur des flachen Objekts der bekannten Form und Abmessung umfasst, zwischen deren Punkten einer inneren und einer äußeren Reihe ein Helligkeitsunterschied paarweise berechnet wird, um ein Übereinstimmungskriterium zu berechnen,

    - ein Körperteilbereich von einem Hintergrund in dem Bild isoliert wird, und ein Bereich eines beliebigen Körpers, der nicht zu einer Projektion des Teils des menschlichen Körpers gehört, aus dem beliebigen Körper schrittweise ausgeschnitten wird, der den Teil des menschlichen Körpers auf jeder Fotografie ursprünglich umfasst hat,

    wobei bei jeder Fotografie aus einem Voxel-Array, das anfänglich die Form eines Parallelepipeds hat und über dem flachen Objekt der bekannten Form und bekannten Abmessung angeordnet ist, Voxel außerhalb des Raums abgeschnitten werden, der durch eine konische Schnittoberfläche begrenzt wird, deren Scheitelpunkt sich bei der Kamera befindet und deren Leitlinie die Kontur des Teils des menschlichen Körpers ist, die während des vorhergehenden Schritts erhalten wurde,

    wobei dadurch, dass das Voxelmodell schrittweise aus verschiedenen Blickwinkeln geschnitten wird, ein 3D-Voxelmodell des Teils des menschlichen Körpers geschaffen wird, das es erlaubt, ein 3D-Modell des Teils des menschlichen Körpers mittels Verarbeiten der Reihe von Fotografien zu erzeugen, die erhalten werden, während eine Vorrichtung mit der Kamera um den Teil des menschlichen Körpers herumgeführt wird.

    2. Verfahren nach Anspruch 1, wobei der Teil des menschlichen Körpers ein Fuß und ein Bein ist, und wobei das flache Objekt der bekannten Form und der bekannten Abmessung ein Blatt weißes A4-Papier ist.
    3. Verfahren nach einem der vorhergehenden Ansprüche, wobei zum Isolieren des Teils des menschlichen Körpers in dem Bild von dem Hintergrund ein Partikeldynamikverfahren eingesetzt wird, wobei eine Grenze zwischen dem Körperteilbereich und dem Hintergrund in Form einer Menge von Punkten dargestellt wird, die sequentiell angeordnet sind, um eine Kontur zu bilden, die sich unter einer Einwirkung von bedingten Kräften bewegt, die eine Konnektivität der Kontur bewahren und dazu neigen, einen Punkt zu einem Ort mit einer signifikanten Helligkeitsänderung zu bewegen, wodurch eine geschlossene Kontur gebildet wird, die einen einfach verbundenen Körperteilbereich begrenzt.


    1. Procédé de mesure d'une forme et d'une taille de parties d'un corps humain, au cours duquel :

    - une partie du corps humain est placée au milieu d'un objet plat de forme connue et de taille connue,

    - sur la base d'une série de photographies obtenues sous différents angles d'aspect de la partie du corps humain, une image de l'objet plat de la forme connue et d'une taille connue est identifiée sur chaque photographie, au moyen de quoi une position et une orientation d'un appareil photo sont déterminées, au cours duquel les paramètres de position et d'orientation de l'appareil photo sont réinitialisés à plusieurs reprises de manière aléatoire, et une image prédite de l'objet plat de forme connue et de taille connue est comparée à une image réelle à l'aide d'un modèle qui comprend plusieurs rangées de points le long d'un contour de l'objet plat de forme et de taille connues, entre les points d'une rangée intérieure et d'une rangée extérieure dont une différence de luminosité est calculée par paire afin de calculer un critère de concordance,

    - une zone de partie du corps est isolée d'un arrière-plan dans l'image, et une zone d'un corps arbitraire qui n'appartient pas à une projection de la partie du corps humain est progressivement découpée du corps arbitraire qui comprenait initialement la partie du corps humain sur chaque photographie,

    dans lequel avec chaque photographie, à partir d'un réseau de voxels, qui est initialement de forme parallélépipédique et est situé au-dessus de l'objet plat de forme connue et de taille connue, des voxels sont coupés à l'extérieur de l'espace délimité par une surface sécante conique avec son sommet situé au niveau de l'appareil photo et sa directrice étant le contour de la partie du corps humain obtenu lors de l'étape précédente,

    dans lequel, à mesure que le modèle de voxel est progressivement découpé sous différents angles d'aspect, un modèle de voxel 3D de la partie du corps humain est créé, permettant de générer un modèle 3D de la partie du corps humain au moyen du traitement de la série de photographies obtenues tout en faisant passer un dispositif avec l'appareil photo autour de la partie du corps humain.

    2. Procédé selon la revendication 1, dans lequel la partie du corps humain est un pied et une jambe, et dans lequel l'objet plat de forme connue et de taille connue est une feuille de papier A4 blanc.
    3. Procédé selon l'une quelconque des revendications précédentes, dans lequel pour isoler la partie du corps humain dans l'image de l'arrière-plan, une technique de dynamique des particules est utilisée, à l'intérieur de laquelle une limite entre la zone de la partie du corps et l'arrière-plan est présentée sous la forme d'un ensemble de points situés séquentiellement de manière à former un contour, se déplaçant sous l'action de forces conditionnelles qui préservent une connectivité du contour et tendent à déplacer un point à un endroit d'un changement significatif de luminosité, formant ainsi un contour fermé qui circonscrit une zone de partie du corps simplement connectée.


    Cited references


    This list of references cited by the applicant is for the reader's convenience only. It does not form part of the European patent document. Even though great care has been taken in compiling the references, errors or omissions cannot be excluded and the EPO disclaims all liability in this regard.

    Patent documents cited in the description