(11)EP 3 430 594 B1

(12)EUROPEAN PATENT SPECIFICATION

(45)Mention of the grant of the patent:
04.11.2020 Bulletin 2020/45

(21)Application number: 17710238.1

(22)Date of filing:  10.03.2017
(51)International Patent Classification (IPC): 
G06T 7/223(2017.01)
G06T 7/80(2017.01)
(86)International application number:
PCT/EP2017/055715
(87)International publication number:
WO 2017/157799 (21.09.2017 Gazette  2017/38)

(54)

METHOD FOR MOTION ESTIMATION BETWEEN TWO IMAGES OF AN ENVIRONMENTAL REGION OF A MOTOR VEHICLE, COMPUTING DEVICE, DRIVER ASSISTANCE SYSTEM AS WELL AS MOTOR VEHICLE

VERFAHREN ZUR BEWEGUNGSSCHÄTZUNG ZWISCHEN ZWEI BILDERN EINES UMGEBUNGSBEREICHS EINES KRAFTFAHRZEUGS, RECHENVORRICHTUNG, FAHRERASSISTENZSYSTEM SOWIE KRAFTFAHRZEUG

PROCÉDÉ D'ESTIMATION DE MOUVEMENT ENTRE DEUX IMAGES D'UNE RÉGION ENVIRONNEMENTALE D'UN VÉHICULE MOTORISÉ, DISPOSITIF INFORMATIQUE, SYSTÈME D'ASSISTANCE AU CONDUCTEUR AINSI QUE VÉHICULE MOTORISÉ


(84)Designated Contracting States:
AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

(30)Priority: 15.03.2016 DE 102016104732

(43)Date of publication of application:
23.01.2019 Bulletin 2019/04

(73)Proprietor: Connaught Electronics Ltd.
Tuam, County Galway (IE)

(72)Inventors:
  • ERMILIOS, Pantelis
    Tuam County Galway (IE)
  • MCNAMARA, Bryan
    Tuam County Galway (IE)
  • O'GRADY, William
    Tuam County Galway (IE)
  • FRIEL, Myles
    Tuam County Galway (IE)

(74)Representative: Jauregui Urbahn, Kristian 
Valeo Schalter und Sensoren GmbH CDA-IP Laiernstraße 12
74321 Bietigheim-Bissingen (DE)


(56)References cited:
DE-A1-102014 019 078
US-A- 5 485 224
  
      
    Note: Within nine months from the publication of the mention of the grant of the European patent, any person may give notice to the European Patent Office of opposition to the European patent granted. Notice of opposition shall be filed in a written reasoned statement. It shall not be deemed to have been filed until the opposition fee has been paid. (Art. 99(1) European Patent Convention).


    Description


    [0001] The invention relates to a method for motion estimation between two images of an environmental region of a motor vehicle captured by a camera of the motor vehicle. In addition, the invention relates to a computing device, to a driver assistance system as well as to a motor vehicle with a driver assistance system.

    [0002] Presently, the interest focuses on cameras for motor vehicles. The cameras are adapted to capture an environmental region of the motor vehicle in images or rather video frames. Information extracted from the images can be provided to a driver assistance system of the motor vehicle. For the purpose of motion estimation between two video frames, it is known from the prior art to use block matching algorithms in order to locate matching or corresponding blocks, so-called macroblocks, in a sequence of the video frames, see e.g. US5485224.

    [0003] Corresponding blocks are understood to be image areas of two image frames which at least partially display the same environmental region. The block matching algorithm usually involves dividing the current frame of a video into macroblocks and comparing each of the macroblocks with a corresponding block and its adjacent neighbours in a nearby frame of the video, in particular the previous one. A motion vector is created that models the movement of a macroblock from one location to another. This movement, calculated for all the macroblocks comprising a frame, constitutes the motion estimated in a frame. In difficult environmental conditions such as low light or adverse weather, the video frames can be corrupted by a high level of noise, motion blur and other artefacts that generally degrade the block matching quality and thus the accuracy or validity of the extracted motion vectors, leading to a large number of outliers. Thus, block matching techniques known from the prior art may struggle or fail to produce usable motion vectors.

    [0004] It is the object of the present invention to provide a solution how motion estimation between images captured by a vehicle-side camera can be performed in a reliable manner, in particular despite adverse environmental conditions.

    [0005] According to the invention, this object is solved by a method, a computing device, a driver assistance system as well as a motor vehicle comprising the features according to the respective independent claims. Advantageous embodiments of the invention are the subject matter of the dependent claims, of the description, and of the figures.

    [0006] In an embodiment of the invention a method is provided which serves for motion estimation between two images of an environmental region of a motor vehicle captured by a camera of the motor vehicle. In particular, at least two image areas of a first image are determined as at least two first blocks in the first image, and for each first block a respective search region is defined in a second image for searching the respective search region in the second image for a second block corresponding to the respective first block. Then, a cost surface for each first block and its respective search region can be determined and an averaged cost surface for one of the at least two first blocks based on the cost surfaces can be determined. Moreover, a motion vector for the one of the first blocks relating to the averaged cost surface can be determined which describes a motion of a location of the first block in the first image and the corresponding second block in the second image.

    [0007] In a preferred embodiment of the invention a method is provided which serves for motion estimation between two images of an environmental region of a motor vehicle captured by a camera of the motor vehicle. For this, at least two image areas of a first image are determined as at least two first blocks in the first image, and for each first block a respective search region is defined in a second image for searching the respective search region in the second image for a second block corresponding to the respective first block. Then, a cost surface for each first block and its respective search region is determined and an averaged cost surface for one of the at least two first blocks based on the cost surfaces is determined. Moreover, a motion vector for the one of the first blocks relating to the averaged cost surface is determined which describes a motion of a location of the first block in the first image and the corresponding second block in the second image.

    [0008] By means of the method a block matching operation can be performed, e.g. by a vehicle-side computing device, in particular even in adverse environmental conditions. The block matching operation is performed based on the two images or video frames captured by the vehicle-side camera. The vehicle can comprise four cameras that can form a surround view camera system, wherein a first camera can be disposed in a front area of the motor vehicle, a second camera can be disposed in a rear area of the motor vehicle, a third camera can be disposed on a driver's side of the motor vehicle, in particular on a driver's side wing mirror, and a fourth camera can be disposed on a passenger's side of the motor vehicle, in particular on a passenger's side wing mirror. The cameras can comprise fisheye lenses in order to enlarge an angle of view and thus a detection range of the cameras.

    [0009] The images can be separated into blocks or macroblocks which can be adjacent or overlapping and their shape can be rectangular, circular or anything else. Preferably, adjacent rectangular blocks are arranged on a regular grid. In case of a camera having a fisheye lens the fisheye distorted images can be transferred into a virtual plan view in order to remove the fisheye distortion. Individual cost surfaces for each block in the block grid are determined, where the first blocks are obtained from their fixed locations on the block-grid in one frame, or rather the first image, and matched within their respective search regions in the other frame, or rather the second image. The size and position of the search regions can be fixed or dynamically assigned by predicting the camera ego-motion between the two frames e.g. from the vehicle odometry. The odometry is the use of data from motion sensors of the motor vehicle, e.g. wheel rotation, steering angle, etc., to estimate a vehicle's change in position over time. An individual cost surface can be derived by positioning a first block as reference block from one frame at all possible locations within a search region or search window of another frame. The purpose of this step is to find the corresponding second block for each first block, wherein the second block at least partially displays the same environmental subarea as the corresponding first block. In particular, the first block has a smaller size, e.g. 32x32 pixels, than the search region, e.g. 64x64 pixels. Then, the Sum of the Squared Differences (SSD) between the pixels of the reference block and the respective pixels in the search window can be taken. The shape of the individual cost surfaces can be inherently irregular.
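
For illustration only, the derivation of an individual cost surface by SSD can be sketched as follows. The function name `ssd_cost_surface` and the toy pixel values are assumptions made for this sketch and are not taken from the specification; it is also assumed that only positions at which the reference block lies completely inside the search region are evaluated, so that a 32x32 block in a 64x64 search region would yield 33x33 cost values.

```python
def ssd_cost_surface(block, search):
    """Slide `block` over every fully contained position in `search`
    and return the Sum of Squared Differences at each offset."""
    bh, bw = len(block), len(block[0])
    sh, sw = len(search), len(search[0])
    surface = []
    for dy in range(sh - bh + 1):
        row = []
        for dx in range(sw - bw + 1):
            cost = 0
            for y in range(bh):
                for x in range(bw):
                    diff = search[dy + y][dx + x] - block[y][x]
                    cost += diff * diff
            row.append(cost)
        surface.append(row)
    return surface


# Toy data (hypothetical): a 2x2 block that reappears at row 1, column 2
# of a 4x5 search region.
block = [[9, 8],
         [7, 6]]
search = [[0, 1, 2, 3, 4],
          [1, 2, 9, 8, 3],
          [2, 3, 7, 6, 1],
          [0, 2, 1, 3, 2]]
surface = ssd_cost_surface(block, search)  # 3 rows x 4 columns of costs
```

In this toy case the cost at offset (row 1, column 2) is zero because all subtracted pixels are identical there.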

    [0010] The averaged cost surface for one of the blocks is determined based on the individual cost surfaces determined for the at least two first blocks and their corresponding search regions. From this averaged cost surface having less fluctuation than the individual cost surfaces the motion vector can be derived. The motion vector or displacement vector describes the displacement of the two corresponding blocks between two images.

    [0011] The invention is based on the knowledge that motion vectors directly derived from the individual cost surfaces would be identified as outliers, in particular in case of adverse environmental conditions, like low-light or adverse weather, since the shape of the individual cost surfaces is inherently irregular. In contrast, using the method according to the invention motion vectors for block matching operations can be determined reliably even in adverse environmental conditions.

    [0012] Particularly preferably, a global minimum of the averaged cost surface is determined and the motion vector is determined in dependency on the global minimum. Since the individual cost surfaces can be inherently irregular, global minima can hardly be identified or even be at the wrong location. A motion vector extracted from that global minimum location would be consequently erroneous. Since the averaged cost surface has less fluctuation the global minimum is clearly visible as a strong negative peak. Thus, the global minimum of the averaged cost surface is easy to identify. Based on the global minimum of the averaged cost surface the motion vector, in particular the coordinates of the motion vector, can be determined.
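
As a minimal sketch of how the coordinates of the motion vector could be read from the global minimum, the following example locates the smallest value of a cost surface and expresses its position relative to the zero-displacement position. The function name `motion_vector_from_surface` and the parameter `origin` are hypothetical; it is assumed here that the search region is centred on the nominal block position, so that `origin` is the centre of the cost surface.

```python
def motion_vector_from_surface(surface, origin):
    """Return (dx, dy) of the global minimum of `surface` relative to
    `origin`, given as (row, col) of the zero-displacement position."""
    positions = ((r, c) for r in range(len(surface))
                 for c in range(len(surface[0])))
    best_row, best_col = min(positions, key=lambda rc: surface[rc[0]][rc[1]])
    return best_col - origin[1], best_row - origin[0]


# Hypothetical 5x5 averaged cost surface with a clear minimum one
# column to the right of the centre.
surface = [[9, 8, 9, 7, 9],
           [8, 6, 7, 5, 8],
           [9, 7, 8, 1, 7],
           [8, 9, 7, 6, 9],
           [9, 8, 9, 8, 9]]
mv = motion_vector_from_surface(surface, origin=(2, 2))  # (1, 0)
```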

    [0013] Preferably, for determining the average cost surface, a mean value of each cost surface is determined, and respective weighting factors for determining the averaged cost surface are determined based on the mean values. In particular, the weighting factors are determined as a reciprocal of the respective mean value. In other words the averaged cost surface is determined as a weighted average of the individual cost surfaces. The weights can be chosen so as to approximately equalise the energy of the individual cost surfaces.
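
The weighting by the reciprocal of the mean value can be illustrated with the following sketch. The function names and the three toy surfaces are assumptions for illustration; the normalisation by the sum of the weights is likewise an assumption, since it does not change the location of the minimum. Mean values are assumed to be non-zero.

```python
def mean_of(surface):
    """Mean of all cost values of one cost surface."""
    values = [v for row in surface for v in row]
    return sum(values) / len(values)


def averaged_cost_surface(surfaces):
    """Weighted average of equally sized cost surfaces, each weighted
    by the reciprocal of its own mean value."""
    weights = [1.0 / mean_of(s) for s in surfaces]
    total = sum(weights)
    rows, cols = len(surfaces[0]), len(surfaces[0][0])
    return [[sum(w * s[r][c] for w, s in zip(weights, surfaces)) / total
             for c in range(cols)]
            for r in range(rows)]


# Hypothetical surfaces: the true match is at (1, 1); s2 and s3 each
# contain a spurious, deeper minimum at a different wrong location.
s1 = [[8, 9, 7], [9, 2, 8], [7, 8, 9]]
s2 = [[9, 1, 8], [8, 3, 9], [9, 8, 7]]
s3 = [[7, 8, 9], [9, 3, 8], [1, 9, 8]]
avg = averaged_cost_surface([s1, s2, s3])
```

In this toy example the spurious minima of `s2` and `s3` do not coincide, so they are attenuated in `avg`, whereas the common minimum at (1, 1) remains the lowest value.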

    [0014] In a further embodiment of the invention, a sliding window is determined comprising a predetermined number of first blocks, wherein the motion vector is determined for one of the first blocks within the sliding window based on the cost surfaces of all first blocks within the sliding window. In particular, the number of first blocks within the sliding window is preset such that one first block is completely surrounded by further first blocks within the sliding window, wherein the motion vector is determined for the first block in the middle surrounded by the further first blocks. For instance, using a sliding window having 3x3 first blocks the cost surfaces from a group of neighbouring first blocks in the block-grid enclosed by the sliding window can be selected. The weighted average of their cost values can be determined to obtain a new cost surface where each value is the weighted average of the respective values of the individual cost surfaces. The resulting new cost surface is the averaged cost surface, in particular the cost surface corresponding to the block in the centre of the sliding window. Then, the motion vector that would normally correspond to the central block of the sliding window is determined. The sliding window can be moved to the next location and the process can be repeated until all locations within the block-grid have been exhausted.
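
The sliding window procedure can be sketched as follows, under stated assumptions: the cost surfaces are held in a nested list `grid` indexed by block row and block column, blocks at the border of the block-grid that cannot be the centre of a full window are skipped, and the weighted sum is not normalised because this does not affect the location of the minimum. All names are hypothetical.

```python
def window_motion_vectors(grid, origin, win=3):
    """grid[i][j] is the cost surface of the block at grid row i, column j.
    Returns {(i, j): (dx, dy)} for every block that is the centre of a
    complete win x win sliding window."""
    half = win // 2
    rows, cols = len(grid), len(grid[0])
    h, w = len(grid[0][0]), len(grid[0][0][0])
    result = {}  # separate output, the input surfaces are never modified
    for i in range(half, rows - half):
        for j in range(half, cols - half):
            members = [grid[i + a][j + b]
                       for a in range(-half, half + 1)
                       for b in range(-half, half + 1)]
            # Weight of each surface: reciprocal of its mean value.
            weights = [len(s) * len(s[0]) / sum(map(sum, s)) for s in members]
            avg = [[sum(wt * s[r][c] for wt, s in zip(weights, members))
                    for c in range(w)]
                   for r in range(h)]
            r_min, c_min = min(((r, c) for r in range(h) for c in range(w)),
                               key=lambda rc: avg[rc[0]][rc[1]])
            result[(i, j)] = (c_min - origin[1], r_min - origin[0])
    return result


def make_surface(min_at, low=2, high=9, size=3):
    """Hypothetical test surface: `high` everywhere, `low` at `min_at`."""
    return [[low if (r, c) == min_at else high for c in range(size)]
            for r in range(size)]


# 3x4 block-grid; every block matches one pixel to the right, except one
# outlier block whose minimum is at the wrong location.
grid = [[make_surface((1, 2)) for _ in range(4)] for _ in range(3)]
grid[0][1] = make_surface((0, 0), low=1)
vectors = window_motion_vectors(grid, origin=(1, 1))
```

Both window positions in this example contain the outlier block, yet the vector obtained for each central block is (1, 0).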

    [0015] In an advantageous development of the invention, an extrinsic calibration of the camera is performed based on the motion vector derived from the averaged cost surface. In this embodiment a method for extrinsic calibration of the camera is provided. The extrinsic calibration method or motion tracking calibration (MTC) operation serves for online calibration of the vehicle-side camera by analysing a sequence of at least two images captured by the camera while the vehicle is moving particularly in an approximately straight path. In other words the purpose is to determine and calibrate at least one extrinsic parameter of the camera during movement of the motor vehicle by means of the time sequence comprising at least one pair of images. For example, the extrinsic parameters can comprise a spatial orientation of the camera relative to the road surface or ground plane and a height of the camera relative to the ground or relative to another camera of the vehicle. The method can be performed by the computing device which can also be adapted to individually calibrate each camera of the vehicle.

    [0016] The images at least partially show the road surface or ground surface in a vicinity of the motor vehicle as well as a texture being present on the surface such as tarmac. In particular, the calibration operation according to this embodiment does not require any particular features of interest such as corners, contours, edges or lines to be present on the image. In particular, at least one calibration cycle is performed resulting in a single calibration result.

    [0017] Preferably, for performing the extrinsic calibration, a rotation calibration of the camera is performed, wherein a loss function describing a deviation between the motion vector and a predetermined vector is determined and a rotation-compensated motion vector is determined by minimizing the loss function. In other words this means that the rotation calibration is performed by finding the 3-DOF (degree of freedom) spatial orientation of the camera relative to the ground plane or road surface by analysing the motion vector. The orientation of the camera is a first extrinsic parameter to be calibrated. The orientation of the camera can deviate from a default predetermined orientation due to mechanical tolerances in the vehicle manufacturing process. The orientation can for example be expressed in a roll-pitch-yaw rotation scheme rotating in sequence about the fixed X-, Y- and Z-axes of the world coordinate system, wherein the X-axis is orientated along a longitudinal vehicle axis, the Y-axis is orientated along a transverse vehicle axis and the Z-axis is orientated along a vertical vehicle axis.

    [0018] Then, the loss function or cost function to be minimised is formulated that particularly exploits geometric properties of the motion vectors on the road surface or ground plane and, in particular, the constraint for approximately straight driving of the motor vehicle. The sum of squared errors or sum of absolute errors from the mean or median can be minimised with the help of an iterative algorithm, such as gradient descent, Gauss-Newton, Levenberg-Marquardt or others, wherein these iterative, non-linear methods give great flexibility in how the loss function can be formulated to tweak the solution. It is also possible to state the problem as an over-determined system of linear equations and solve in a least-squares sense with various methods. The invention is based on the fact that a calibrated system should produce motion vectors free from perspective distortion and of equal length, in particular while driving straight on a flat ground. For instance, motion vectors free from perspective distortion are motion vectors that are all parallel to the horizontal x-axis (in the image frame), or parallel to the longitudinal vehicle axis in case of the motion vector projected onto the ground surface. Thus, the predetermined vector is the distortion-free vector parallel to the horizontal or longitudinal axis. This objective can be expressed by the sum of squares of the y-components and the variance of the x-components of all motion vectors. For a calibrated camera in ideal conditions this would be zero. Therefore, the loss function indicates the difference between the present, real condition and the ideal condition. Thus, by finding the loss function, a present camera orientation deviating from the default camera orientation can easily be determined as the first extrinsic parameter of the camera. By minimizing the loss function, a rotation-compensated motion vector can be derived. The rotation-compensated motion vectors relating to a plurality of corresponding blocks within the images can be re-projected on the plane of the road surface or ground plane 11, wherein the re-projected, calibrated motion vectors particularly are orientated parallel to the longitudinal vehicle axis and also have equal length.
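
The objective described above, i.e. the sum of squares of the y-components and the variance of the x-components, can be evaluated as in the following sketch. It only computes the value of the loss for a given set of motion vectors and does not perform the minimisation over the camera rotation. The function name, the equal weighting of the two terms and the use of the population variance are assumptions made for illustration.

```python
def straight_driving_loss(vectors):
    """vectors: list of (x, y) motion vectors in the virtual plan view.
    Returns sum of squared y-components plus variance of x-components."""
    n = len(vectors)
    mean_x = sum(x for x, _ in vectors) / n
    variance_x = sum((x - mean_x) ** 2 for x, _ in vectors) / n
    sum_squared_y = sum(y * y for _, y in vectors)
    return sum_squared_y + variance_x


# Ideal case: all vectors parallel to the x-axis and of equal length.
calibrated = [(5.0, 0.0)] * 4
# Hypothetical distorted field: unequal lengths, non-zero y-components.
distorted = [(4.0, 1.0), (5.0, -1.0), (6.0, 0.5), (5.0, 0.0)]

loss_calibrated = straight_driving_loss(calibrated)  # 0.0
loss_distorted = straight_driving_loss(distorted)    # 2.25 + 0.5 = 2.75
```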

    [0019] In a further development, for calibrating the camera, a height calibration of the camera is performed, wherein the height of the camera is determined in dependency on a length of the rotation-compensated motion vector and an expected value of the length of the rotation-compensated motion vector. The height of the camera is a second extrinsic parameter to be calibrated. The height of the camera can deviate from the nominal default due to airmatic suspension or due to loading changes in the motor vehicle such as the number of passengers or weight in a vehicle's boot.

    [0020] The height is determined by analysing the rotation-compensated motion vector which is derived from the minimisation of the loss function. This means that a height calibration is performed to find the absolute height of each camera from the ground surface or the relative height between the cameras by analysing the rotation-compensated, calibrated and re-projected motion vectors from the previous step.

    [0021] The length of the corrected, rotation-calibrated and re-projected motion vector of the camera is proportional to the velocity or speed of the vehicle and inversely proportional to the height of the camera from the ground plane. Therefore, the expected value for the length is advantageously preset in dependency on the velocity of the vehicle. In particular, the velocity of the motor vehicle is determined by means of odometry of the motor vehicle and/or based on at least one further motion vector, which is determined based on images captured by at least one further camera of the motor vehicle. The latter means that motion vectors from more than one camera can be considered. Therefore, the height of the camera can be determined by adjusting the value of the height of each camera such that the re-projected motion vectors have equal length in all cameras. For this purpose it proves advantageous if a mean length of the motion vectors of the left and right mirror cameras is taken as a reference and the value of the height of the other cameras is adjusted to match it, since the mirror cameras are affected the least by loading changes of the motor vehicle. In summary, MTC adjusts the calibration so that the projected motion vectors on the ground plane are parallel to the longitudinal vehicle axis and also have equal length.
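
A possible way to turn the stated proportionality into a height estimate is sketched below. The specification does not give this formula. The sketch assumes that a vector re-projected with an assumed camera height scales linearly with that assumed height, so that the ratio of the expected length (speed multiplied by the frame interval, from odometry) to the measured length corrects the assumed height. All names, units and numerical values are hypothetical.

```python
def estimate_camera_height(assumed_height_m, measured_length_m,
                           speed_mps, frame_interval_s):
    """Correct an assumed camera height so that the re-projected motion
    vector length matches the length expected from vehicle odometry."""
    expected_length_m = speed_mps * frame_interval_s
    return assumed_height_m * expected_length_m / measured_length_m


# Hypothetical values: with an assumed height of 1.0 m the re-projected
# vectors are 0.5 m long, while odometry predicts 10 m/s * 0.04 s = 0.4 m.
height = estimate_camera_height(assumed_height_m=1.0,
                                measured_length_m=0.5,
                                speed_mps=10.0,
                                frame_interval_s=0.04)
```

Under these assumptions the measured vectors are too long, which indicates that the camera is lower than assumed, and the corrected height is about 0.8 m.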

    [0022] The invention additionally relates to a computing device for a driver assistance system of a motor vehicle, which is adapted to perform a method according to the invention. The computing device which can be formed by a vehicle-side control unit is adapted to perform a block matching procedure even in adverse environmental conditions. In addition, the computing device can be adapted to calibrate each camera of the motor vehicle individually based on the results of the block matching procedure.

    [0023] Moreover, the invention additionally relates to a driver assistance system for a motor vehicle comprising at least one camera and a computing device according to the invention. The driver assistance system supports a driver of the vehicle by means of the cameras monitoring the environmental region of the vehicle. The driver assistance system can comprise four cameras forming a surround view camera system. The cameras can also comprise fisheye lenses enlarging a detection area of the cameras. The driver assistance system may reliably detect objects in the images captured by the calibrated cameras and thus, may support the driver of the vehicle, for instance by producing a warning signal or automatically braking the motor vehicle when detecting an imminent collision of the motor vehicle with the detected object. In addition, the images captured by the calibrated camera can be displayed, in particular as a video, on a vehicle-side display device. In other words, the camera can be configured as a video camera.

    [0024] A motor vehicle according to the invention includes a driver assistance system according to the invention. The motor vehicle is in particular configured as a passenger car.

    [0025] The preferred embodiments presented with respect to the method according to the invention and the advantages thereof correspondingly apply to the computing device, to the driver assistance system according to the invention as well as to the motor vehicle according to the invention.

    [0026] Now, the invention is explained in more detail based on preferred embodiments as well as with reference to the attached drawings.

    [0027] These show in:
    Fig. 1
    a schematic representation of an embodiment of a motor vehicle according to the invention;
    Fig. 2
    a schematic representation of a block matching operation;
    Fig. 3a, 3b
    a schematic view of a motion field divided into blocks as well as a schematic view of motion vectors;
    Fig. 4
    a schematic representation of a flow chart of an embodiment of a method according to the invention; and
    Fig. 5
    a schematic view of cost surfaces as well as an averaged cost surface.


    [0028] In the figures, identical as well as functionally identical elements are provided with the same reference characters.

    [0029] Fig. 1 shows a motor vehicle 1 according to the present invention. In the present case, the motor vehicle 1 is configured as a passenger car. The motor vehicle 1 has a driver assistance system 2 for supporting a driver of the motor vehicle 1. The driver assistance system 2 comprises a computing device 3, which can for example be formed by a vehicle-side control unit. Additionally, the driver assistance system 2 comprises at least one camera 4. In the present case, the driver assistance system 2 includes four cameras 4, wherein a first camera 4 is disposed in a front area 5 of the motor vehicle 1, a second camera 4 is disposed in a rear area 6 of the motor vehicle 1, a third camera 4 is disposed on a driver's side 7 of the motor vehicle 1, in particular on a wing mirror of the driver's side 7, and a fourth camera 4 is disposed on a passenger side 8 of the motor vehicle 1, in particular on a wing mirror of the passenger's side 8. The cameras 4 disposed on the driver's side 7 and the passenger's side 8 can also replace the wing mirrors, whereby the motor vehicle 1 can be designed as a mirrorless vehicle 1 enabling mirrorless driving. By means of the cameras 4, an environmental region 9 of the motor vehicle 1 can be captured in images. The cameras 4 can comprise fisheye lenses in order to enlarge an angle of view and thus a detection range of the cameras 4.

    [0030] The computing device 3 is adapted to perform a block matching operation based on the images or rather video frames captured by the at least one camera 4 by means of tracking a texture, such as tarmac, of a road surface 11 of a road 10 for the motor vehicle 1. In particular, the computing device 3 is adapted to improve the accuracy of the road surface texture tracking in the block matching algorithm when operating in difficult environmental conditions, such as low light or adverse weather. In such conditions, the images or video frames can be corrupted by a high level of noise, motion blur and other artefacts that generally degrade the block matching quality.

    [0031] The block matching algorithm is a method of locating matching blocks B (see Fig. 2) of image data in a sequence of images or digital video frames for the purposes of motion estimation or optical flow. Fig. 2 visualizes a block matching operation known from the prior art. For this purpose, the current video frame is divided into blocks B also referred to as macroblocks, wherein each macroblock is particularly compared for its similarity with all possible blocks of the same size within a specified search window 15 in another frame. In Fig. 2, the block 14 is a macroblock B with a size NxN in the current video frame and the block 13 is a macroblock B under search with a size NxN in the previous frame within a search window 15. The location that gives the highest similarity between the two blocks 13, 14, the so-called minimum cost location, can be selected and registered as a motion vector, i.e. a two-dimensional vector where the two components x, y correspond to the horizontal and vertical relative displacement of the blocks 13, 14 between the two frames.

    [0032] The similarity measure, also referred to as a block-distortion measure or matching cost, can be obtained by various methods such as Cross Correlation, Normalised Cross Correlation, Phase Correlation, Sum of Absolute Differences (SAD), Sum of Squared Differences (SSD), Sum of Absolute Transformed Differences (SATD), Census/Hamming distance and many more. SAD and SSD are popular because of their computational efficiency and sufficiently good quality for many applications. The lower the matching cost is at a particular location the higher the similarity between the compared blocks 13, 14 is at that location. For example a perfect or ideal match would result in zero cost when the sum of absolute or squared differences of the overlapping pixels of the two blocks 13, 14 is taken. Radiometric invariance can be achieved by pre-filtering the images with a Laplacian of Gaussian, Difference of Gaussians or other filters.
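
As a brief illustration of two of the matching costs named above, SAD and SSD for a pair of equally sized blocks can be computed as in the following sketch. The function names and pixel values are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of Absolute Differences of two equally sized blocks."""
    return sum(abs(p - q)
               for row_a, row_b in zip(block_a, block_b)
               for p, q in zip(row_a, row_b))


def ssd(block_a, block_b):
    """Sum of Squared Differences of two equally sized blocks."""
    return sum((p - q) ** 2
               for row_a, row_b in zip(block_a, block_b)
               for p, q in zip(row_a, row_b))


reference = [[10, 20], [30, 40]]
candidate = [[12, 20], [27, 40]]

sad_cost = sad(reference, candidate)      # |-2| + |3| = 5
ssd_cost = ssd(reference, candidate)      # 4 + 9 = 13
perfect_cost = ssd(reference, reference)  # identical blocks give zero cost
```

Compared with SAD, SSD penalises large pixel differences more strongly because each difference is squared.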

    [0033] The matching cost between the reference block 14 and a candidate block 13 at each test location can be computed by means of exhaustive search for all possible locations within the nominated search window 15. This method which is also referred to as full search (FS) yields the most accurate results and lowest number of outliers or erroneous motion vectors but is also the most computationally expensive. Various methods are known that seek to reduce the amount of computation by avoiding computing the matching cost at all possible locations at the expense of degrading the quality of the results. These include Diamond Search (DS), Hexagon-based Search (HEXBS), Adaptive Rood Pattern Search (ARPS) and many more. A notable exception is a simple computational optimisation of the full-search method known as Fast Computation of Full Search (FCFS) that does not impact quality.

    [0034] With those techniques each reference block 14 can be matched independently within its nominated search window 15. The proposed technique can combine information from neighbouring blocks in a unique way, in particular under the assumption that all blocks B represent a uniform motion locally on the same plane, leading to strong matches.

    [0035] For example, a motion field 16a, 16b on the ground surface 11 obtained by block matching two successive frames while the vehicle 1 is travelling straight is shown in Fig. 3a and Fig. 3b. In case of cameras 4 having fisheye lenses, the frames have been previously rectified into virtual plan views, in particular for the purpose of an extrinsic calibration algorithm for the camera 4. Fig. 3a shows the motion field 16a of an uncalibrated camera 4 which exhibits perspective distortion as the virtual plan view has been generated using incorrect extrinsic calibration parameters. A calibrated system should produce motion vectors on the virtual plan view free from perspective distortion, i.e. parallel to the horizontal x-axis and of equal length. The motion vectors v shown in Fig. 3a are neither parallel to the horizontal x-axis nor of equal length, and thus indicate an uncalibrated system.

    [0036] Fig. 3b has been generated with the correct extrinsic calibration parameters and has no perspective distortion i.e. the ground plane is mapped correctly to a parallel virtual image plane. Thus Fig. 3b shows the motion field 16b of a calibrated camera 4. The calibrated motion vectors vc are all parallel to the horizontal x-axis and of equal length. The square 17 indicates an arbitrary group of motion vectors v.

    [0037] The calibrated motion vectors vc as shown in Fig. 3b can, for instance, be generated by extrinsic calibration, a so-called motion tracking calibration, MTC, wherein the extrinsic rotations and height of the camera 4 are calibrated by analysing the motion field 16a. The MTC operation uses at least two images consecutively captured by the camera 4 and tracks the relative movement of road surface texture such as tarmac between the images or video frames, in particular without the need of strong features, like a kerb 12 alongside the road 10.

    [0038] In particular, using the calibration algorithm, a spatial orientation of the camera 4 can be found relative to the ground plane 11 by analysing the motion vectors v. The orientation of the cameras 4 can be expressed in the roll-pitch-yaw rotation scheme rotating in sequence about the fixed X, Y and Z axes of the world coordinate system, where X is the longitudinal vehicle axis, Y is the transverse vehicle axis and Z is the vertical vehicle axis as shown in Fig. 1.

    [0039] A loss function to be minimised is formulated that exploits the geometric properties of the motion vectors v on the ground plane 11, in particular considering the constraint for approximately straight driving of the motor vehicle. By minimizing the loss function the motion vectors v can be mapped and re-projected to rotation-compensated and calibrated motion vectors vc that are free from perspective distortion and are all parallel to the horizontal x-axis.

    [0040] In addition, the height of the cameras 4 can be calibrated by finding the absolute height of the camera 4 from the ground surface 11 or the relative height between the cameras 4 by analysing the calibrated and re-projected motion vectors vc. The height of the cameras 4 can deviate from the nominal default due to airmatic suspension or due to loading changes in the motor vehicle 1 such as the number of passengers or weight in a vehicle's boot. The absolute height of a single camera 4 can be estimated based on vehicle odometry or image features of known scale. The length of the corrected, rotation calibrated, and re-projected motion vectors vc of a single camera 4 is proportional to the velocity or speed of the vehicle 1 and inversely proportional to the height of that camera 4 from the ground plane 11. Given the odometry of the vehicle from the vehicle-side network, e.g. CAN or FlexRay, an expected length of the motion vectors vc on the ground 11 can be calculated and the height of the camera 4 can be adjusted to match it with the re-projected motion vectors vc.

    [0041] It is possible to smooth the motion field 16a before using it for calibration by averaging motion vectors v within a sliding window. For example, the central motion vector v within a sliding window 17 can be computed as the average of the nine motion vectors v contained in the window 17. This can be repeated for each possible location of the sliding window 17 in the block grid while storing the output motion vectors v in a separate buffer to avoid recursion. Any small biases introduced by the averaging of motion vectors v under perspective distortion can be neutralised progressively as the virtual plan view is updated from the calibration result in a feedback loop.
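
A minimal sketch of this smoothing step, assuming the motion field is stored as an array of shape (rows, cols, 2) and using a 3x3 window. Leaving the border blocks unchanged is an assumption of this sketch; the function name is hypothetical.

```python
import numpy as np

def smooth_motion_field(field):
    """Average each interior motion vector with its eight neighbours.

    field: array-like of shape (rows, cols, 2).
    The result is written to a separate buffer so that already smoothed
    vectors never feed into later averages (no recursion).
    """
    field = np.asarray(field, dtype=float)
    out = field.copy()                       # separate output buffer
    rows, cols = field.shape[:2]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = field[r - 1:r + 2, c - 1:c + 2].reshape(-1, 2)
            out[r, c] = window.mean(axis=0)  # average of the nine vectors
    return out
```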

    [0042] In difficult environmental conditions such as low light or adverse weather, block matching produces mostly outlier motion vectors v, i.e. motion vectors v with random lengths and directions. Averaging these motion vectors v would not improve the quality of the information used for calibration. Thus, to improve the performance of an autonomous road-based calibration algorithm such as MTC, which tracks the relative movement of road surface texture between two video frames, a method for motion estimation between two images is performed. The method is visualized in Fig. 4 by means of a flow chart.

    [0043] In particular, the main idea of the method is to take a weighted average of cost surfaces 18 (see Fig. 5) of all the blocks B within the sliding window 17 before extracting the motion vector v referenced to a block B, particularly the central block B, of the sliding window 17. Within the method, the following steps S1, S2, S3, S4 are performed.

    [0044] In a first step S1, an individual cost surface 18 for each block B in the block grid is computed, where the blocks B are obtained from their fixed locations on the block grid in one frame and matched within their respective search regions in the other frame. The size and position of the search regions can be fixed or dynamically assigned by predicting the camera ego-motion between the two frames, e.g. from the vehicle odometry. (NB: The search region 15 or search window 15 used for block matching should not be confused with the sliding window 17 used for selecting the blocks B whose cost surfaces 18 are to be averaged.) The individual cost surfaces 18 of blocks B within the sliding window 17 are shown in Fig. 5. For instance, a cost surface 18 can be derived by positioning a 32x32 pixel reference block 14 from one frame at all possible locations within a 64x64 pixel search window 15 of another frame and taking the Sum of the Squared Differences (SSD) between the pixels of the reference block 14 and the respective pixels in the search window 15. As can be seen in Fig. 5, the shapes of the cost surfaces 18 are inherently irregular, as there is a random level of similarity between the block 12 and the pixels of the search region 15 at every location. The location of highest similarity has the lowest cost, i.e. is at the global minimum of the cost surface 18 - the ideal case being zero cost, which occurs when all subtracted pixels are identical. At higher levels of image noise these fluctuations become stronger and may even exceed the depth at the location where the best match would be expected, producing a hardly visible global minimum or a global minimum at the wrong location. A motion vector v extracted from that global minimum location would consequently be erroneous.
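
The SSD cost surface of step S1 can be sketched as an exhaustive search. The function name `ssd_cost_surface` is hypothetical, and the plain double loop is chosen for readability rather than speed. For a 32x32 block in a 64x64 search window the surface has 33x33 entries, one per possible block position.

```python
import numpy as np

def ssd_cost_surface(block, search):
    """Sum of squared differences of `block` at every position inside `search`.

    block:  2D array, e.g. 32x32 pixels (reference block).
    search: 2D array, e.g. 64x64 pixels (search window), at least as large.
    Returns an array of shape (sh - bh + 1, sw - bw + 1).
    """
    block = np.asarray(block, dtype=np.float64)
    search = np.asarray(search, dtype=np.float64)
    bh, bw = block.shape
    sh, sw = search.shape
    surface = np.empty((sh - bh + 1, sw - bw + 1))
    for dy in range(surface.shape[0]):
        for dx in range(surface.shape[1]):
            diff = search[dy:dy + bh, dx:dx + bw] - block
            surface[dy, dx] = np.sum(diff * diff)
    return surface
```

When the block is an exact copy of part of the search window, the surface is zero at that position, which is the ideal case mentioned above.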

    [0045] In a second step S2, the mean value of each cost surface 18 is determined and its reciprocal is stored as a weighting factor wij (see Fig. 5) to be used in the next step S3. In this third step S3, by using a sliding window 17, e.g. with a size of 3x3 blocks or another size, the cost surfaces 18 are selected from a group of neighbouring blocks B in the block grid enclosed by the sliding window 17. A weighted average of their cost values is determined so as to obtain a new cost surface 19 where each value is the weighted average of the respective values of the individual cost surfaces 18. The resulting cost surface 19 will have fewer fluctuations and a stronger, clearly visible minimum 20 formed by the contributions of weaker minima corrupted by noise in the individual cost surfaces 18. The expectation is that this stronger minimum 20 will now be a global minimum so that the correct motion vector v can be extracted.
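
Steps S2 and S3 can be sketched for a single window position as follows. The function name, the small `eps` guard against division by zero and the normalisation by the sum of the weights are assumptions of this sketch; the normalisation does not change the location of the minimum.

```python
import numpy as np

def weighted_average_cost_surface(surfaces, eps=1e-12):
    """Weighted average of equally sized cost surfaces.

    surfaces: array-like of shape (n, h, w), one surface per block in the window.
    Each surface is weighted by the reciprocal of its own mean value, which
    roughly equalises the contribution of every block.
    """
    surfaces = np.asarray(surfaces, dtype=float)
    means = surfaces.mean(axis=(1, 2))             # step S2: mean of each surface
    weights = 1.0 / np.maximum(means, eps)         # reciprocal as weighting factor
    weighted = surfaces * weights[:, None, None]   # step S3: weight each surface
    return weighted.sum(axis=0) / weights.sum()
```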

    [0046] In a fourth step S4, the global minimum location within the averaged cost surface 19 is determined. From that, the coordinates of the motion vector v at the centre of the sliding window 17 can be derived, i.e. the motion vector v that would normally correspond to the central block B of the sliding window 17. The sliding window 17 can then be moved to the next location and the process steps S1 to S4 repeated until all locations within the block grid have been exhausted.
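
Putting the steps together, the sliding-window extraction can be sketched as below. The sketch assumes that all cost surfaces have already been computed and stored in an array of shape (rows, cols, h, w), and that zero displacement corresponds to the centre of each cost surface. Both are assumptions of this illustration, as is the function name. Blocks without a full set of neighbours are left as NaN.

```python
import numpy as np

def motion_vectors_from_cost_surfaces(cost_surfaces, window=3, eps=1e-12):
    """Extract one motion vector per interior block from averaged cost surfaces.

    cost_surfaces: array-like of shape (rows, cols, h, w).
    Returns an array of shape (rows, cols, 2) holding (dx, dy) displacements.
    """
    cost_surfaces = np.asarray(cost_surfaces, dtype=float)
    rows, cols, h, w = cost_surfaces.shape
    half = window // 2
    # Reciprocal-mean weighting factor for every block.
    weights = 1.0 / np.maximum(cost_surfaces.mean(axis=(2, 3)), eps)
    vectors = np.full((rows, cols, 2), np.nan)
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            s = cost_surfaces[r - half:r + half + 1, c - half:c + half + 1]
            wgt = weights[r - half:r + half + 1, c - half:c + half + 1]
            averaged = (s * wgt[:, :, None, None]).sum(axis=(0, 1)) / wgt.sum()
            # Global minimum of the averaged surface.
            iy, ix = np.unravel_index(np.argmin(averaged), averaged.shape)
            vectors[r, c] = (ix - w // 2, iy - h // 2)
    return vectors
```

In this form a single block whose own cost surface has a spurious minimum is outvoted by its neighbours, provided their minima agree.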

    [0047] In summary, the motion vector v of a particular block B is extracted from the minimum cost location of the weighted-average cost surface 19. The weights wij are chosen so as to approximately equalise the energy of the individual cost surfaces 18. For this purpose, the individual cost surfaces 18 are weighted by the reciprocal of their own average value. To avoid breaking symmetry, motion vectors v are, in particular, not computed for blocks B without symmetrical neighbours. This method can be extended to any window size and can also be applied in a hierarchical way, such that multiple layers of cost averaging constrain the search area in subordinate layers to further reduce the possibility of outliers. For example, a global motion vector can be computed from the average cost surface 19 of all blocks B on the grid, which has a single very strong peak representing the average motion vector. This global motion vector can be used to constrain the search in a previous layer or otherwise for obtaining approximate information about the global motion or perspective error in the scene.


    Claims

    1. Method for motion estimation between two images of an environmental region (9) of a motor vehicle (1) captured by a camera (4) of the motor vehicle (1), wherein the following steps are performed:

    a) determining at least two image areas of a first image as at least two first blocks (B) in a block grid in the first image;

    b) for each first block (B), defining a respective search region in a second image for searching the respective search region in the second image for a second block (B) corresponding to the respective first block (B) using block matching;

    c) determining a cost surface (18) for each first block (B) and its respective search region;

    d) determining an averaged cost surface (19) for one of the at least two first blocks (B) based on the cost surfaces (18);

    e) identifying from the averaged cost surface (19) a motion vector (v) for the one of the first blocks (B) describing a motion of a location of the first block (B) in the first image and the corresponding second block (B) in the second image.


     
    2. Method according to claim 1,
    characterized in that
    a global minimum (20) of the averaged cost surface (19) is determined and the motion vector (v) is determined in dependency on the global minimum (20).
     
    3. Method according to claim 1 or 2,
    characterized in that
    for determining the averaged cost surface (19), a mean value of each cost surface (18) is determined, and respective weighting factors (wij) for determining the averaged cost surface (19) are determined based on the mean values.
     
    4. Method according to claim 3,
    characterized in that
    the weighting factor (wij) is determined as a reciprocal of the respective mean value.
     
    5. Method according to any one of the preceding claims,
    characterized in that
    a sliding window (17) is determined comprising a predetermined number of first blocks (B), wherein the motion vector (v) is determined for one of the first blocks (B) within the sliding window (17) based on the cost surfaces (18) of all first blocks (B) within the sliding window (17).
     
    6. Method according to claim 5,
    characterized in that
    the number of first blocks (B) within the sliding window (17) is preset such that one first block (B) is completely surrounded by further first blocks (B) within the sliding window (17), wherein the motion vector (v) is determined for the first block (B) in the middle surrounded by the further first blocks (B).
     
    7. Method according to any one of the preceding claims,
    characterized in that
    an extrinsic calibration of the camera (4) is performed based on the motion vector (v) derived from the averaged cost surface (19).
     
    8. Method according to claim 7,
    characterized in that
    for performing the extrinsic calibration, a rotation calibration of the camera (4) is performed, wherein a loss function describing a deviation between the motion vector (v) and a predetermined vector is determined and a rotation-compensated motion vector is determined by minimizing the loss function.
     
    9. Method according to claim 8,
    characterized in that
    for performing the extrinsic calibration, a height calibration of the camera (4) is performed, wherein the height of the camera (4) is determined in dependency on a length of the rotation-compensated motion vector and an expected value of the length of the rotation-compensated motion vector.
     
    10. Method according to claim 9,
    characterized in that
    the expected value for the length is preset in dependency on a velocity of the motor vehicle (1).
     
    11. Method according to claim 10,
    characterized in that
    the velocity of the motor vehicle (1) is determined by means of odometry and/or based on at least one further motion vector determined for at least one further camera (4).
     
    12. Computing device (3) for a driver assistance system (2) of a motor vehicle (1), which is adapted to perform a method according to any one of the preceding claims.
     
    13. Driver assistance system (2) for a motor vehicle (1) comprising at least one camera (4) and a computing device (3) according to claim 12.
     
    14. Motor vehicle (1) with a driver assistance system (2) according to claim 13.
     


    Ansprüche

    1. Verfahren zur Bewegungsschätzung zwischen zwei Bildern eines Umgebungsbereichs (9) eines Kraftfahrzeugs (1), die durch eine Kamera (4) des Kraftfahrzeugs (1) aufgenommen werden, wobei die folgenden Schritte durchgeführt werden:

    a) Bestimmen mindestens zweier Bildbereiche eines ersten Bildes als mindestens zwei erste Blöcke (B) in einem Blockgitter im ersten Bild;

    b) für jeden ersten Block (B) Definieren eines jeweiligen Suchbereichs in einem zweiten Bild zum Suchen des jeweiligen Suchbereichs im zweiten Bild für einen zweiten Block (B), der dem jeweiligen ersten Block (B) entspricht, unter Verwendung eines Blockabgleichs;

    c) Bestimmen einer Kostenoberfläche (18) für jeden ersten Block (B) und seinen jeweiligen Suchbereich;

    d) Bestimmen einer gemittelten Kostenoberfläche (19) für einen der mindestens zwei ersten Blöcke (B) auf der Grundlage der Kostenoberflächen (18);

    e) Identifizieren eines Bewegungsvektors (v) für den einen der ersten Blöcke (B) aus der gemittelten Kostenoberfläche (19), der eine Bewegung eines Ortes des ersten Blocks (B) im ersten Bild und des entsprechenden zweiten Blocks (B) im zweiten Bild beschreibt.


     
    2. Verfahren nach Anspruch 1,
    dadurch gekennzeichnet, dass
    ein globales Minimum (20) der gemittelten Kostenoberfläche (19) bestimmt wird und der Bewegungsvektor (v) in Abhängigkeit von dem globalen Minimum (20) bestimmt wird.
     
    3. Verfahren nach Anspruch 1 oder 2,
    dadurch gekennzeichnet, dass
    zum Bestimmen der durchschnittlichen Kostenoberfläche (19) ein Mittelwert jeder Kostenoberfläche (18) bestimmt wird und auf der Grundlage der Mittelwerte jeweilige Gewichtungsfaktoren (wij) zum Bestimmen der gemittelten Kostenoberfläche (19) bestimmt werden.
     
    4. Verfahren nach Anspruch 3,
    dadurch gekennzeichnet, dass
    der Gewichtungsfaktor (wij) als ein Kehrwert des jeweiligen Mittelwerts bestimmt wird.
     
    5. Verfahren nach einem der vorhergehenden Ansprüche,
    dadurch gekennzeichnet, dass
    ein Gleitfenster (17) bestimmt wird, das eine vorgegebene Anzahl erster Blöcke (B) umfasst, wobei der Bewegungsvektor (v) für einen der ersten Blöcke (B) in dem Gleitfenster (17) auf der Grundlage der Kostenoberflächen (18) aller ersten Blöcke (B) in dem Gleitfenster (17) bestimmt wird.
     
    6. Verfahren nach Anspruch 5,
    dadurch gekennzeichnet, dass
    die Anzahl der ersten Blöcke (B) in dem Gleitfenster (17) derart vorgegeben wird, dass ein erster Block (B) in dem Gleitfenster (17) vollständig von weiteren ersten Blöcken (B) umgeben ist, wobei der Bewegungsvektor (v) für den ersten Block (B) in der Mitte bestimmt wird, der von den weiteren ersten Blöcken (B) umgeben ist.
     
    7. Verfahren nach einem der vorhergehenden Ansprüche,
    dadurch gekennzeichnet, dass
    eine extrinsische Kalibrierung der Kamera (4) auf der Grundlage des Bewegungsvektors (v) durchgeführt wird, der von der gemittelten Kostenoberfläche (19) abgeleitet wird.
     
    8. Verfahren nach Anspruch 7,
    dadurch gekennzeichnet, dass
    zum Durchführen der extrinsischen Kalibrierung eine Rotationskalibrierung der Kamera (4) durchgeführt wird, wobei eine Verlustfunktion bestimmt wird, die eine Abweichung zwischen dem Bewegungsvektor (v) und einem vorgegebenen Vektor beschreibt, und durch Minimieren der Verlustfunktion ein rotationskompensierter Bewegungsvektor bestimmt wird.
     
    9. Verfahren nach Anspruch 8,
    dadurch gekennzeichnet, dass
    zum Durchführen der extrinsischen Kalibrierung eine Höhenkalibrierung der Kamera (4) durchgeführt wird, wobei die Höhe der Kamera (4) in Abhängigkeit von einer Länge des rotationskompensierten Bewegungsvektors und einem Erwartungswert der Länge des rotationskompensierten Bewegungsvektors bestimmt wird.
     
    10. Verfahren nach Anspruch 9,
    dadurch gekennzeichnet, dass
    der Erwartungswert für die Länge in Abhängigkeit von einer Geschwindigkeit des Kraftfahrzeugs (1) vorgegeben wird.
     
    11. Verfahren nach Anspruch 10,
    dadurch gekennzeichnet, dass
    die Geschwindigkeit des Kraftfahrzeugs (1) mittels Odometrie und/oder auf der Grundlage von mindestens einem weiteren Bewegungsvektor, der für mindestens eine weitere Kamera (4) bestimmt wird, bestimmt wird.
     
    12. Rechenvorrichtung (3) für ein Fahrerunterstützungssystem (2) eines Kraftfahrzeugs (1), die ausgelegt ist, ein Verfahren nach einem der vorhergehenden Ansprüche durchzuführen.
     
    13. Fahrerunterstützungssystem (2) für ein Kraftfahrzeug (1), das mindestens eine Kamera (4) und eine Rechenvorrichtung (3) nach Anspruch 12 umfasst.
     
    14. Kraftfahrzeug (1) mit einem Fahrerunterstützungssystem (2) nach Anspruch 13.
     


    Revendications

    1. Procédé d'estimation de mouvement entre deux images d'une région environnementale (9) d'un véhicule à moteur (1) capturées par une caméra (4) du véhicule à moteur (1), dans lequel les étapes suivantes sont effectuées :

    a) détermination d'au moins deux zones d'image d'une première image sous la forme d'au moins deux premiers blocs (B) dans une grille de blocs dans la première image ;

    b) pour chaque premier bloc (B), définition d'une région de recherche respective dans une deuxième image en vue de rechercher, dans la région de recherche respective dans la deuxième image, un deuxième bloc (B) correspondant au premier bloc respectif (B) au moyen d'une mise en correspondance de blocs ;

    c) détermination d'une surface de coût (18) pour chaque premier bloc (B) et sa région de recherche respective ;

    d) détermination d'une surface de coût moyenne (19) pour un bloc donné parmi les au moins deux premiers blocs (B) sur la base des surfaces de coût (18) ;

    e) identification, à partir de la surface de coût moyenne (19), d'un vecteur de mouvement (v) pour le bloc donné parmi les premiers blocs (B) décrivant un mouvement d'un emplacement du premier bloc (B) dans la première image et du deuxième bloc correspondant (B) dans la deuxième image.


     
    2. Procédé selon la revendication 1,
    caractérisé en ce que
    un minimum global (20) de la surface de coût moyenne (19) est déterminé et le vecteur de mouvement (v) est déterminé en fonction du minimum global (20).
     
    3. Procédé selon la revendication 1 ou 2,
    caractérisé en ce que
    pour déterminer la surface de coût moyenne (19), une valeur moyenne de chaque surface de coût (18) est déterminée, et des facteurs de pondération respectifs (wij) destinés à déterminer la surface de coût moyenne (19) sont déterminés sur la base des valeurs moyennes.
     
    4. Procédé selon la revendication 3,
    caractérisé en ce que
    le facteur de pondération (wij) est déterminé comme un inverse de la valeur moyenne respective.
     
    5. Procédé selon l'une quelconque des revendications précédentes,
    caractérisé en ce que
    une fenêtre glissante (17) comprenant un nombre prédéterminé de premiers blocs (B) est déterminée, le vecteur de mouvement (v) étant déterminé pour un des premiers blocs (B) à l'intérieur de la fenêtre glissante (17) sur la base des surfaces de coût (18) de la totalité des premiers blocs (B) à l'intérieur de la fenêtre glissante (17).
     
    6. Procédé selon la revendication 5,
    caractérisé en ce que
    le nombre de premiers blocs (B) à l'intérieur de la fenêtre glissante (17) est préréglé de telle sorte qu'un premier bloc (B) est entièrement entouré par d'autres premiers blocs (B) à l'intérieur de la fenêtre glissante (17), le vecteur de mouvement (v) étant déterminé pour le premier bloc (B) au milieu entouré par les autres premiers blocs (B).
     
    7. Procédé selon l'une quelconque des revendications précédentes,
    caractérisé en ce que
    un étalonnage extrinsèque de la caméra (4) est effectué sur la base du vecteur de mouvement (v) déduit de la surface de coût moyenne (19).
     
    8. Procédé selon la revendication 7,
    caractérisé en ce que
    pour effectuer l'étalonnage extrinsèque, on effectue un étalonnage en rotation de la caméra (4), dans lequel on détermine une fonction de perte décrivant un écart entre le vecteur de mouvement (v) et un vecteur prédéterminé et on détermine un vecteur de mouvement compensé en rotation en minimisant la fonction de perte.
     
    9. Procédé selon la revendication 8,
    caractérisé en ce que
    pour effectuer l'étalonnage extrinsèque, on effectue un étalonnage en hauteur de la caméra (4), dans lequel on détermine la hauteur de la caméra (4) en fonction d'une longueur du vecteur de mouvement compensé en rotation et d'une valeur prévue de la longueur du vecteur de mouvement compensé en rotation.
     
    10. Procédé selon la revendication 9,
    caractérisé en ce que
    la valeur prévue pour la longueur est préréglée en fonction d'une vitesse du véhicule à moteur (1).
     
    11. Procédé selon la revendication 10,
    caractérisé en ce que
    la vitesse du véhicule à moteur (1) est déterminée par odométrie et/ou sur la base d'au moins un autre vecteur de mouvement déterminé pour au moins une autre caméra (4).
     
    12. Dispositif informatique (3) pour un système d'aide à la conduite (2) d'un véhicule à moteur (1), qui est adapté pour effectuer un procédé selon l'une quelconque des revendications précédentes.
     
    13. Système d'aide à la conduite (2) pour un véhicule à moteur (1) comprenant au moins une caméra (4) et un dispositif informatique (3) selon la revendication 12.
     
    14. Véhicule à moteur (1) avec un système d'aide à la conduite (2) selon la revendication 13.
     




    Drawing

    Cited references

    REFERENCES CITED IN THE DESCRIPTION



    This list of references cited by the applicant is for the reader's convenience only. It does not form part of the European patent document. Even though great care has been taken in compiling the references, errors or omissions cannot be excluded and the EPO disclaims all liability in this regard.

    Patent documents cited in the description