Field of the Invention
[0001] This invention relates generally to image processing, and more particularly to retargeting
images.
Background of the Invention
[0002] The diversity and versatility of print and display devices imposes demands on designers
of multimedia content for rendering and viewing. For instance, designers must provide
different alternatives for web-content, and design different layouts for different
rendering applications and devices, ranging from tiny "thumbnails" of images often
seen in selection menus, small, low resolution mobile telephone screens, slightly
larger PDA screens, to large, high resolution elongated flat panel displays, and projector
screens. Adapting images to rendering applications and devices other than those originally
intended is called image retargeting.
[0003] Conventional image retargeting typically involves scaling and cropping. Image scaling
is insufficient because it ignores the image content and typically can only be applied
uniformly. Scaling also does not work well when the aspect ratio of the image needs
to change, because it introduces visual distortions. Cropping is limited because it
can only remove pixels from the image periphery. More effective resizing can only
be achieved by considering the image content as a whole, in conjunction with geometric
constraints of the output device.
[0004] Image resizing is an alternative tool for image retargeting. Image resizing works
by uniformly resizing a source image to the size of a target display. While resizing an
image, there is a desire to change the size of the image while maintaining important
features in the content of the image. This can be done with top-down or bottom-up
methods. Top-down methods use tools such as face detectors to detect important regions
in the image, whereas bottom-up methods rely on visual saliency methods to construct
a visual saliency map of the source image. After the saliency map is constructed, cropping
can be used to display the most important region of the image.
[0005] One method automatically generates thumbnail images based on either a saliency map
or the output of a face detector, Suh et al., "Automatic thumbnail cropping and its
effectiveness," UIST '03: Proceedings of the 16th annual ACM symposium on User interface
software and technology, ACM Press, New York, NY, USA, 95-104, 2003. A source image is
cropped to capture the most salient region in the image. Another method adapts images to
mobile devices, Chen et al., "A visual attention model for adapting images on small
displays," Multimedia Systems 9, 4, 353-364, 2003. In that method, the most important
region in the image is automatically detected and transmitted to the mobile device.
[0007] All of the above rely on conventional image resizing and cropping operations to retarget
the image.
[0008] Another method uses an adaptive grid-based document layout that maintains a clear
separation between content and template, Jacobs et al., "Adaptive grid-based document
layout," Proceedings of ACM SIGGRAPH, 838-847, 2003. A designer constructs several
possible templates. When the content is displayed, the most suitable template is used.
[0009] A compromise between image resizing and image cropping is to use non-linear, data
dependent scaling for image retargeting, Liu et al., "Automatic Image Retargeting
with Fisheye-View Warping," ACM UIST, 153-162, 2005. They use image information, such
as low-level salience and high-level object recognition to find important regions
in the source image. Then, they apply a non-linear image warping function to emphasize
important aspects of the image while retaining the surrounding context.
[0010] Another method uses an automatic, non-photorealistic method for retargeting large
images to small size displays, Setlur et al., "Automatic image retargeting," Mobile
and Ubiquitous Multimedia (MUM), ACM Press, 2005. They decompose the image
into a background layer and foreground objects. The retargeting method segments an
image into regions, identifies important regions, removes them, fills the resulting
gaps, resizes the remaining image, and re-inserts the important regions.
[0011] Another method uses a feature-aware texture mapping that warps an image to a new
shape, while preserving user-specified regions, Gal et al., "Feature aware texturing,"
Eurographics Symposium on Rendering, 2006. They solve a particular formulation of
the Laplace editing technique suited to accommodate similarity constraints in images.
However, local constraints are propagated through the entire image to accommodate
all constraints at once, and may sometimes fail.
[0012] Another method composes a novel photomontage from several images,
Agarwala et al., "Interactive digital photomontage," ACM Trans. Graph. 23, 3, 294-302,
2004. A user selects ROIs from different input images, which are then composited into
an output image. Another method uses drag-and-drop pasting, Jia et al., "Drag-and-drop
pasting," Proceedings of SIGGRAPH, 2006. They determine an optimal boundary between
the source and target images. Another method generates a collage image from a collection
of images, Rother et al., "Autocollage," Proceedings of SIGGRAPH 2006. None of these
compositing methods addresses image retargeting.
[0015] Changing the appearance of an image has been extensively described in the field of
texture synthesis, where the goal is to generate an output image that has different
texture than an input image, while preserving the basic idea of the content,
U.S. Patent 6,919,903, issued to Freeman et al. on July 19, 2005, "Texture synthesis and transfer for pixel images." That method does not consider
image retargeting.
[0017] Patch-based approaches use automatic guidance to determine synthesis ordering,
"Fragment-based image completion," Proceedings of ACM SIGGRAPH, 303-312, 2003, and
Criminisi et al., "Object removal by exemplar-based inpainting," IEEE Conference
on Computer Vision and Pattern Recognition, 417-424, 2003.
[0018] Another interactive method provides inpainting for images missing strong visual structure,
by propagating structure along user-specified curves, Sun et al., "Image completion
with structure propagation," Proceedings of ACM SIGGRAPH, 2005.
Summary of the Invention
[0019] The invention provides a method for content-aware image retargeting that uses geometric
constraints as well as image content constraints. The invention provides minimum energy
seam applications that support content-aware image retargeting. The seam applications can
reduce and increase the size of images, while preserving a rectangular (or square) shape.
[0020] As defined herein, a seam is an optimal n-connected set of pixels in an image
extending either from the top edge to the bottom edge, or from the left edge to the right
edge, where optimality is defined by an energy of the image. A seam is one pixel wide.
[0021] By applying the seam to the image, the size of the image can be changed. The applying
can either remove or insert seams to change, e.g., an aspect ratio of the image, e.g.,
from 4:3 to 16:9. In a preferred embodiment the seam is eight-connected.
[0022] By applying the seam operation in both directions, a source image can be retargeted
for a smaller or larger display. The selection and order of the seams preserve the
content of the image, as defined by the energy function.
[0023] The seam applications can also be used for image content enlargement and object removal.
The invention provides various visual saliency measures for defining the energy of
an image, and can also include user input to guide the retargeting process. By storing
the order of seams in a memory, a multi-size image can be generated to support real-time
retargeting of an image to any size.
Brief Description of the Drawings
[0024]
Figure 1A is a flow diagram of a method for retargeting an image according to an
embodiment of the invention;
Figures 1B-1E are images including seams according to the embodiments of the invention;
Figure 1F is an energy image according to an embodiment of the invention;
Figures 1G-1H are vertical and horizontal seam index maps according to embodiments
of the invention;
Figure 2A is a source image according to an embodiment of the invention;
Figure 2B is a target image of the source image of Figure 2A according to an embodiment
of the invention;
Figure 2C is a target image obtained from the image of Figure 2A by scaling;
Figure 2D is a target image obtained from the image of Figure 2A by cropping;
Figure 3A is a source image according to an embodiment of the invention;
Figure 3B is an enlarged target image obtained from the image of Figure 3A according
to an embodiment of the invention;
Figure 4A is a source image according to an embodiment of the invention;
Figure 4B is a target image with enlarged content obtained from the image of Figure
4A according to an embodiment of the invention;
Figure 5A is a source image according to an embodiment of the invention;
Figure 5B is a target image obtained from the image of Figure 5A with an object removed
according to an embodiment of the invention;
Figure 6A is a source image according to an embodiment of the invention;
Figure 6B is a target image obtained from the image of Figure 6A with an object removed
and resized according to an embodiment of the invention;
Figure 7A is a source image according to an embodiment of the invention;
Figures 7B and 7C are horizontal and vertical index maps corresponding to the image
of Figure 7A according to an embodiment of the invention; and
Figures 8A and 8B are consistent index maps corresponding to the maps of Figures 7B
and 7C according to an embodiment of the invention.
Detailed Description of the Preferred Embodiments
[0025] As shown in Figure 1A, the embodiments of our invention provide a method for content-aware
retargeting a source image 100 to a target image 109. The retargeting can change a
size of the source image, while preserving a rectangular shape in the target image.
By rectangular shape, we include square shapes.
[0026] Input to our method is the source image 100. From the source image, we generate an
energy image 112 using an energy function 111. Using a minimizing function 121, we
determine 120 one or more minimal energy seams 122. Each seam 122 can then be applied
130, one or more times, to the source image 100 to produce a target image 109. An
optimal order 132 in which the seams 122 are applied 130 can be determined according to
an objective function 131.
[0027] The application of the seams 122 can increase or decrease the size of an image, change
the aspect ratio, remove content, i.e., selected objects, or enlarge the content of
the source image while preserving the rectangular shape of the image. The retargeting
can be done in either the pixel intensity domain or gradient domain.
[0028] The embodiments of the invention use seam applications, which can change the size
of an image by gracefully removing or inserting pixels
in different parts of the image. The retargeting uses the energy function 111 to define
the 'importance' of pixels in the source image. The retargeting can support several
types of energy functions such as gradient magnitude, entropy, visual saliency, eye-gaze
movement, and object detections, e.g., faces, pedestrians.
[0029] A seam is defined as a set of pixels, one pixel wide, crossing the image from the top
edge to the bottom edge, or from the left edge to the right edge. By successively
removing or inserting seams, it is possible to reduce, as well as to enlarge, the
size of an image in both directions. For image reduction, seam selection ensures that
the basic image structure is preserved by removing more of the low energy pixels,
while retaining high energy pixels.
[0030] For content enlargement, the order 132 of seam insertion ensures a balance between
the original image content and the artificially inserted pixels. This defines, in
effect, retargeting of images in a content-aware fashion. Seam removal and insertion can
be used for aspect ratio change, image retargeting, image content enlargement, and object
removal.
[0031] Furthermore, by storing the order 132 of seam removal and insertion applications
and carefully interleaving seams in both vertical and horizontal directions, multi-size
images can be defined. Such images can continuously change their size in a content-aware
manner. A designer can author a multi-size image once, and the client application,
depending on the size needed, can resize the image in real time to fit the exact layout
or the display.
[0032] Our method removes or inserts pixels in an unnoticeable manner using the
energy image 112. Particularly, we only remove or insert pixels where the pixels
blend with surrounding pixels. Pixels that are similar to or blend with neighboring pixels
are said to have a low energy. Pixels that are dissimilar to neighboring pixels are said
to have a high energy. Therefore, we generate the energy image 112 from the source image
100 according to the energy function 111:
$$ e(I) = \left| \frac{\partial}{\partial x} I \right| + \left| \frac{\partial}{\partial y} I \right|, $$
where I(x, y) is a particular pixel. As stated above, other energy functions can also be used.
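As an illustration only, a gradient-magnitude energy (one of the energy functions named above) could be computed as in the following sketch. The function name `energy_image`, the list-of-rows grayscale representation, the central differences, and the clamping at the image border are assumptions made for this example, not details taken from the description.

```python
def energy_image(img):
    """Return a gradient-magnitude energy for a grayscale image (list of rows)."""
    n, m = len(img), len(img[0])
    energy = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            # Central differences; indices are clamped at the image border (assumption).
            dx = (img[i][min(j + 1, m - 1)] - img[i][max(j - 1, 0)]) / 2.0
            dy = (img[min(i + 1, n - 1)][j] - img[max(i - 1, 0)][j]) / 2.0
            energy[i][j] = abs(dx) + abs(dy)
    return energy
```

Pixels in flat regions receive an energy of zero, while pixels next to an intensity edge receive a high energy, which matches the low/high energy distinction drawn above.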
[0033] Given the energy function, there are several ways to change the size of the source
image while preserving its basic rectangular (or square) shape. An optimal strategy
preserves as much energy in the target image as possible, i.e., we retain pixels with
higher energies, and remove the pixels with the lowest energies in ascending order. However,
this could change the shape of the image, because we may remove a different number
of pixels from each row or column of pixels.
[0034] If we want to prevent the image from becoming distorted, then we can remove an equal
number of low energy pixels from every row. This preserves the shape of the image
but destroys the image content by creating a visible zigzag effect. To preserve both
the shape and the visual coherence of the image we can use cropping. That is, we locate
a sub-window in the source image that is the size of the target image and has the
highest energy. Another possible strategy, somewhat between removing pixels and cropping,
is to remove entire columns of pixels with the lowest energy. However, this might
still produce annoying artifacts.
[0035] Therefore, we use a method that is less restrictive than cropping or column removal,
but can still preserve the image content and shape better than single pixel removals.
This leads to our seam applications 130, and our definition of seams 122.
[0036] As shown in Figures 1B and 1E, the source image 100 has n × m pixels. That is, the
image has two dimensions, vertical and horizontal, indexed respectively by n and m.
[0037] A seam is a one pixel wide set of pixels through a source image I 100. A seam can
extend from one edge of the image to an opposing edge. The edges can be the top and
bottom edges, or the left and right edges. The seam has exactly one pixel for each index
along a particular dimension in which the seam is oriented. For example, if the orientation
dimension is vertical, the vertical seam has exactly n pixels.
[0038] A vertical seam 101 is:
$$ s^x = \{s^x_i\}_{i=1}^{n} = \{(x(i), i)\}_{i=1}^{n}, \quad \text{s.t. } \forall i,\ |x(i) - x(i-1)| \le 1, $$
where x is a mapping x : [1, ..., n] → [1, ..., m]. That is, the vertical seam 101 is an
eight-connected path, from the top edge to the bottom edge, containing one, and only one,
pixel in each of n rows of pixels in the image 100.
[0039] Similarly, a horizontal seam 102 is:
$$ s^y = \{s^y_j\}_{j=1}^{m} = \{(j, y(j))\}_{j=1}^{m}, \quad \text{s.t. } \forall j,\ |y(j) - y(j-1)| \le 1, $$
where y is a mapping y : [1, ..., m] → [1, ..., n]. That is, the horizontal seam 102 is an
eight-connected path, from the left to the right, containing one, and only one, pixel in
each of m columns of pixels in the image 100.
[0040] The pixels on the path of a seam s, e.g., the vertical seam {s_i} 101, are
$$ I_s = \{I(s_i)\}_{i=1}^{n} = \{I(x(i), i)\}_{i=1}^{n}. $$
[0041] Note that, similar to removing an entire row or column from an image, removing the
pixels of a seam from an image has only a local effect. All the pixels of the image
are shifted left (or up) to compensate for the removed pixels. The visual impact,
if any, is noticeable only along the seam, leaving the rest of the image intact.
If the removed or inserted pixels have a low energy, then the visual impact is negligible.
[0042] If an image has 240 × 320 pixels, i.e., 240 rows and 320 columns, then it has a 4:3
aspect ratio. Changing the aspect ratio from 4:3 to 16:9 can be performed either
by inserting vertical seams, or by removing horizontal seams. By inserting 106 vertical
seams, we obtain a 240 × 426 image. By removing 60 horizontal seams, we obtain a 180 × 320
image. Both resulting images have a 16:9 aspect ratio, exactly for 180 × 320 and to within
rounding for 240 × 426. The first method has the advantage of only adding pixels; none of
the original pixels are removed.
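The seam counts in this example follow from simple integer arithmetic, sketched below. The helper name `seams_for_aspect` is hypothetical, and rounding the widened image down to a whole column is an assumption chosen to match the 240 × 426 figure.

```python
def seams_for_aspect(rows, cols, target_w, target_h):
    """Return (vertical seams to insert, horizontal seams to remove)
    needed to bring a rows x cols image to a target_w:target_h aspect ratio."""
    # Option 1: keep all rows and widen the image (rounded down to whole columns).
    new_cols = rows * target_w // target_h
    # Option 2: keep all columns and shorten the image.
    new_rows = cols * target_h // target_w
    return new_cols - cols, rows - new_rows


print(seams_for_aspect(240, 320, 16, 9))  # (106, 60)
```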
[0043] We can replace the constraint |x(i) - x(i-1)| ≤ 1 with |x(i) - x(i-1)| ≤ k, and
obtain either an entire column (or row) for k = 0, or piecewise connected pixels, see
Figure 1D (not to scale), or even completely or partially disconnected pixels for any
value 1 ≤ k ≤ m, see Figure 1E.
[0044] Given the energy function e(I), we define the energy of a seam as
$$ E(s) = E(I_s) = \sum_{i=1}^{n} e(I(s_i)). $$
[0045] We use the following minimizing function 121 to locate the optimal minimal energy
seam s* 122:
$$ s^* = \arg\min_{s} E(s) = \arg\min_{s} \sum_{i=1}^{n} e(I(s_i)). $$
[0046] The optimal minimal energy seam s* 122 can be found using dynamic programming. A
first step traverses the pixels of the image, one row at a time, and determines a
cumulative minimum energy M for all possible seams for each entry (i, j):
$$ M(i, j) = e(i, j) + \min\big( M(i-1, j-1),\ M(i-1, j),\ M(i-1, j+1) \big). $$
[0047] At the end of this process, the minimum of the last row in M indicates the end of
a minimal connected vertical seam. Hence, in the second step, we backtrack from this
minimum in the last row of M to find the path of the optimal seam 122. The
optimal minimal energy horizontal seams can be found in a similar manner.
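The two steps above (cumulative minimum energy, then backtracking) can be sketched as follows for a vertical, eight-connected seam. The function name `find_vertical_seam` and the list-of-rows energy representation are assumptions for illustration; ties are broken toward the leftmost column, which the description does not specify.

```python
def find_vertical_seam(energy):
    """Return x, where x[i] is the column of the minimal-energy seam in row i."""
    n, m = len(energy), len(energy[0])
    # First step: cumulative minimum energy M, filled one row at a time.
    M = [list(energy[0])] + [[0.0] * m for _ in range(n - 1)]
    for i in range(1, n):
        for j in range(m):
            lo, hi = max(j - 1, 0), min(j + 1, m - 1)
            M[i][j] = energy[i][j] + min(M[i - 1][lo:hi + 1])
    # Second step: backtrack from the minimum of the last row of M.
    j = min(range(m), key=lambda c: M[n - 1][c])
    x = [0] * n
    x[n - 1] = j
    for i in range(n - 1, 0, -1):
        lo, hi = max(j - 1, 0), min(j + 1, m - 1)
        j = min(range(lo, hi + 1), key=lambda c: M[i - 1][c])
        x[i - 1] = j
    return x


print(find_vertical_seam([[1, 9, 9], [9, 1, 9], [9, 9, 1]]))  # [0, 1, 2]
```

A horizontal seam can be obtained from the same routine by transposing the energy image first.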
[0048] Figure 1F shows the energy image 112 in terms of a magnitude of the intensity gradients.
Figures 1G and 1H show the corresponding vertical and horizontal path maps used to
determine 120 the seams 122.
Energy Preservation Measure
[0049] To evaluate the effectiveness of the different strategies for our content-aware retargeting,
we determine an average energy of all of the pixels in the image as:
$$ \frac{1}{|I|} \sum_{p \in I} e(p) $$
during retargeting. Randomly removing pixels keeps the average unchanged, while content-aware
retargeting increases the average as the retargeting removes low energy pixels and
retains high energy pixels.
[0050] There are several well known image importance measures that we can use, such as the
L1-norm and the L2-norm of the gradient, a saliency measure, and a Harris-corners measure.
Discrete Image Resizing
Aspect Ratio Change
[0051] We want to change the aspect ratio of a source image I from n × m to n × m', where
m - m' = c, some constant. This can be achieved by successively removing c vertical seams
from the image I. In contrast with conventional scaling, the seam
applications 130 according to our invention do not alter important parts of the image,
as defined by the energy function 111. In effect, this generates a non-uniform, content-aware
resizing of the image. Figure 2A shows a source image. Figure 2B shows a target image
according to the embodiments of the invention. Figures 2C and 2D show conventional
scaling and cropping.
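A minimal sketch of this width reduction, assuming a gradient-magnitude energy that is recomputed after every removal, is given below. The helper names (`energy`, `min_vertical_seam`, `remove_vertical_seams`) and the list-of-rows grayscale image are hypothetical choices for the example.

```python
def energy(img):
    """Gradient-magnitude energy with border clamping (an assumed energy function)."""
    n, m = len(img), len(img[0])
    return [[abs(img[i][min(j + 1, m - 1)] - img[i][max(j - 1, 0)])
             + abs(img[min(i + 1, n - 1)][j] - img[max(i - 1, 0)][j])
             for j in range(m)] for i in range(n)]


def min_vertical_seam(e):
    """Return the column index per row of the minimal-energy vertical seam."""
    m = len(e[0])
    M = [list(e[0])]
    for row in e[1:]:
        prev = M[-1]
        M.append([row[j] + min(prev[max(j - 1, 0):j + 2]) for j in range(m)])
    j = min(range(m), key=lambda c: M[-1][c])
    seam = [j]
    for prev in reversed(M[:-1]):
        j = min(range(max(j - 1, 0), min(j + 2, m)), key=lambda c: prev[c])
        seam.append(j)
    return seam[::-1]


def remove_vertical_seams(img, c):
    """Successively remove c vertical seams, shifting pixels left in each row."""
    img = [list(row) for row in img]
    for _ in range(c):
        seam = min_vertical_seam(energy(img))
        img = [row[:x] + row[x + 1:] for row, x in zip(img, seam)]
    return img


source = [[5, 5, 0, 9]] * 3
print(remove_vertical_seams(source, 1))  # [[5, 0, 9], [5, 0, 9], [5, 0, 9]]
```

In this toy image the first removed seam is the flat left column, so the edges between 5, 0 and 9 survive, whereas uniform scaling would blur them.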
[0052] As shown in Figures 3A and 3B, the same aspect ratio correction, from n × m to
n × m', can also be achieved by increasing the number of rows by a factor of m/m'. The
added value of this approach is that it does not remove any pixels from the image.
We discuss our strategy for increasing the image size in detail below.
Retargeting with Optimal Seams-Order
[0053] Image retargeting generalizes aspect ratio change such that an image I of size
n × m is retargeted to size n' × m'. We assume that m' < m and n' < n. The optimal order
132 for pixel removal is an optimization of the following objective function 131:
$$ \min_{s^x, s^y, \alpha} \sum_{i=1}^{k} E\big( \alpha_i s^x_i + (1 - \alpha_i) s^y_i \big), $$
where k = r + c, r = (m - m'), c = (n - n') and α_i is a parameter that determines whether
we remove a horizontal or vertical seam at step i:
$$ \alpha_i \in \{0, 1\}, \quad \sum_{i=1}^{k} \alpha_i = r, \quad \sum_{i=1}^{k} (1 - \alpha_i) = c. $$
[0054] We find the optimal order using a transport map T that specifies, for each desired
target image size n' × m', the cost of the optimal sequence of horizontal and vertical
seam applications 130.
[0055] That is, an entry T(r, c) in the transport map T stores a minimal cost needed to
obtain an image of size (n - r) × (m - c). We determine T using dynamic programming.
[0056] Starting at T(0, 0) = 0, we determine the value for each entry (r, c) by selecting
the better of two options, either removing a horizontal seam from an image of size
(n - r + 1) × (m - c), or removing a vertical seam from an image of size
(n - r) × (m - c + 1):
$$ T(r, c) = \min\Big( T(r-1, c) + E\big(s^y(I_{(n-r+1) \times (m-c)})\big),\ T(r, c-1) + E\big(s^x(I_{(n-r) \times (m-c+1)})\big) \Big), $$
where $I_{(n-r) \times (m-c)}$ denotes the image of size (n - r) × (m - c), and E(s^x(I))
and E(s^y(I)) are the costs of the respective seam applications.
[0057] Given the transport map T and target size n' × m', where n' = n - r and m' = m - c,
we can backtrack from T(r, c), and find the optimal path by successively moving to the
minimum of the top and left neighbors of the current entry, until we reach T(0, 0).
Selecting a left neighbor corresponds to a vertical seam application, while selecting the
top neighbor corresponds to a horizontal seam application.
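The transport map can be sketched as below, under several assumptions that are not taken from the description: a gradient-magnitude energy, one stored intermediate image per map entry, horizontal seams found by transposing the image, and a `choice` table that records which neighbor produced each entry so that backtracking simply follows it. All function names are hypothetical.

```python
def energy(img):
    n, m = len(img), len(img[0])
    return [[abs(img[i][min(j + 1, m - 1)] - img[i][max(j - 1, 0)])
             + abs(img[min(i + 1, n - 1)][j] - img[max(i - 1, 0)][j])
             for j in range(m)] for i in range(n)]


def min_vertical_seam(img):
    """Return (cost, seam) of the minimal-energy vertical seam of img."""
    e = energy(img)
    m = len(e[0])
    M = [list(e[0])]
    for row in e[1:]:
        prev = M[-1]
        M.append([row[j] + min(prev[max(j - 1, 0):j + 2]) for j in range(m)])
    j = min(range(m), key=lambda c: M[-1][c])
    cost, seam = M[-1][j], [j]
    for prev in reversed(M[:-1]):
        j = min(range(max(j - 1, 0), min(j + 2, m)), key=lambda c: prev[c])
        seam.append(j)
    return cost, seam[::-1]


def transpose(img):
    return [list(col) for col in zip(*img)]


def remove_vertical(img, seam):
    return [row[:x] + row[x + 1:] for row, x in zip(img, seam)]


def transport_map(img, max_r, max_c):
    """T[r][c]: cost of reaching size (n - r) x (m - c); requires max_r < n, max_c < m."""
    T = [[0.0] * (max_c + 1) for _ in range(max_r + 1)]
    images, choice = {(0, 0): img}, {}
    for r in range(max_r + 1):
        for c in range(max_c + 1):
            options = []
            if r > 0:  # top neighbor: remove one horizontal seam
                flipped = transpose(images[(r - 1, c)])
                cost, seam = min_vertical_seam(flipped)
                options.append((T[r - 1][c] + cost, "h",
                                transpose(remove_vertical(flipped, seam))))
            if c > 0:  # left neighbor: remove one vertical seam
                cost, seam = min_vertical_seam(images[(r, c - 1)])
                options.append((T[r][c - 1] + cost, "v",
                                remove_vertical(images[(r, c - 1)], seam)))
            if options:
                T[r][c], choice[(r, c)], images[(r, c)] = min(options, key=lambda o: o[0])
    return T, choice


def seam_order(choice, r, c):
    """Backtrack from entry (r, c) to (0, 0); 'h' = horizontal seam, 'v' = vertical seam."""
    order = []
    while (r, c) != (0, 0):
        step = choice[(r, c)]
        order.append(step)
        r, c = (r - 1, c) if step == "h" else (r, c - 1)
    return order[::-1]
```

Keeping every intermediate image makes the sketch easy to follow but costly in memory; a practical implementation would likely keep only the previous row of images.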
Increasing Image Size
[0058] The process of removing vertical and horizontal seams can be expressed as a
time-evolution process. We denote I(t) as a smaller image generated after t seams have
been removed from the source image I.
[0059] To increase the size of an image, we approximate an 'inversion' of this time evolution
and insert new 'artificial' seams in the image. Hence, to increase the size of the
source image I by one, we determine the optimal vertical (horizontal) seam s in the source
image I, and duplicate the pixels of seam s by averaging the pixels with their left and
right neighbors, or top and bottom in the case of a horizontal seam.
[0060] Using the time evolution notation, we denote the resulting target image as I(-1).
Repeating this process is likely to select the same seam each time, which generates a
stretching artifact. To achieve effective enlargement, it is important to balance between
the original image content and the inserted pixels.
[0061] Therefore, to increase the size of an image by k, we find the first k seams for
removal, and duplicate these seams in order to obtain the image I(-k). This can be viewed
as the process of traversing back in time to recover pixels from
a larger image that would have otherwise been removed by seam removals.
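The duplication step can be sketched as follows, assuming the first k removal seams have already been found and translated into source-image coordinates. The input `seam_cols` (one set of column indices per row) and the function name are hypothetical; the sketch leaves the source pixels unchanged and inserts the average of each seam pixel and its right neighbor, which is only one of the averaging choices mentioned in this description.

```python
def insert_vertical_seams(img, seam_cols):
    """Widen img by duplicating the seam pixels listed per row in seam_cols.

    seam_cols[i] is the set of source columns in row i that are covered by
    the first k seams selected for removal (an assumed input format).
    """
    out = []
    for row, cols in zip(img, seam_cols):
        new_row = []
        for j, pixel in enumerate(row):
            new_row.append(pixel)
            if j in cols:
                right = row[min(j + 1, len(row) - 1)]
                new_row.append((pixel + right) / 2.0)  # inserted 'artificial' pixel
        out.append(new_row)
    return out


print(insert_vertical_seams([[0, 10, 20], [0, 10, 20]], [{0}, {1}]))
# [[0, 5.0, 10, 20], [0, 10, 15.0, 20]]
```

Because each of the k seams contributes exactly one column per row, every row grows by k pixels and the result stays rectangular.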
[0062] To continue in a content-aware fashion for an excessive image size increase, for
instance, greater than 50%, we break the process into several steps. In each step,
we enlarge not more than a fraction of the size of the image from the previous step,
essentially guarding the important content from being stretched.
Content Enlargement
[0063] Instead of increasing the size of the image, our image retargeting can be used to
magnify or enlarge the content of the image, while preserving its size and shape.
This can be achieved by combining seam removal and standard image scaling. To preserve
the image content as much as possible, we first use conventional image scaling to
increase the size of the image. Then, we apply seam removal on the larger image to
reduce the image back to its original size, see Figures 4A and 4B. Note that the removed
pixels are in effect sub-pixels of the source image. Content enlargement is effectively
similar to data-driven zooming in. Zooming out or reducing the content can be done
in a similar manner.
Seam Removal in the Gradient Domain
[0064] There are times when removing multiple seams from a source image still causes noticeable
visual artifacts in the target image. To overcome this, we can combine seam removal
with a Poisson reconstruction, see
U.S. Patent 7,038,185 issued to Tumblin et al. on May 2, 2006, "Camera for directly generating a gradient image."
[0065] Specifically, we determine the energy image 112 as before, but instead of removing
the seams from the source image, we work in the gradient domain and remove the seams
from the
x and
y derivatives (gradients) of the source image. At the end of this process we use a
Poisson solver to reconstruct the target image 109.
Object Removal
[0066] We can also remove an object from the source image. The pixels associated with the
object are marked, and then seams are removed from the image until all marked pixels
are gone. We can automatically determine the smaller of the vertical or horizontal
diameters, in terms of pixels, of the target removal region and perform vertical or
horizontal removals accordingly, see Figures 5A and 5B. In the examples shown in Figures
6A and 6B, an object (one shoe) is removed. After the object removal, the image is
increased to its original size. Note that this example would be difficult to accomplish
using conventional in-painting or texture synthesis.
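One possible way to steer seams through the marked pixels is sketched below. Giving marked pixels a large negative energy is an assumption of this example (the description only states that seams are removed until all marked pixels are gone), as are the helper names, the gradient-magnitude energy, and the restriction to vertical seams.

```python
def energy(img):
    n, m = len(img), len(img[0])
    return [[abs(img[i][min(j + 1, m - 1)] - img[i][max(j - 1, 0)])
             + abs(img[min(i + 1, n - 1)][j] - img[max(i - 1, 0)][j])
             for j in range(m)] for i in range(n)]


def min_vertical_seam(e):
    m = len(e[0])
    M = [list(e[0])]
    for row in e[1:]:
        prev = M[-1]
        M.append([row[j] + min(prev[max(j - 1, 0):j + 2]) for j in range(m)])
    j = min(range(m), key=lambda c: M[-1][c])
    seam = [j]
    for prev in reversed(M[:-1]):
        j = min(range(max(j - 1, 0), min(j + 2, m)), key=lambda c: prev[c])
        seam.append(j)
    return seam[::-1]


def remove_object(img, mask):
    """Remove vertical seams until no pixel with mask[i][j] == True remains."""
    img = [list(r) for r in img]
    mask = [list(r) for r in mask]
    while any(any(r) for r in mask):
        # Marked pixels get a large negative energy so seams are drawn to them (assumption).
        e = [[-1e6 if marked else v for v, marked in zip(e_row, m_row)]
             for e_row, m_row in zip(energy(img), mask)]
        seam = min_vertical_seam(e)
        img = [r[:x] + r[x + 1:] for r, x in zip(img, seam)]
        mask = [r[:x] + r[x + 1:] for r, x in zip(mask, seam)]
    return img


image = [[1, 2, 9, 4]] * 3
marked = [[False, False, True, False]] * 3
print(remove_object(image, marked))  # [[1, 2, 4], [1, 2, 4], [1, 2, 4]]
```

Seam insertion can then be applied to the result to restore the original width.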
Multi-Size Images
[0067] So far, we assume that the size of the target image 109 is known. However, the size
might not be known in some cases. Consider, for example, an image embedded in a web page.
The page designer does not know, ahead of time, at what size the web page will eventually
be displayed. Therefore, the designer cannot generate a single target image. In a
different scenario, the user might want to explore target images
with different sizes, and select a most suitable size.
[0068] Seam applications are linear in the number of pixels, and image retargeting is therefore
linear in the number of seams to be removed or inserted. On the average, we can retarget
an image of size 400×500 to 100×100 in a couple of seconds. However, determining tens
or hundreds of seams for multiple different sized images is a challenging task when
it is to be done in real time.
[0069] To address this issue, we provide a representation of multi-size images that encodes,
for a source image 100 of size n × m, an entire range of retargeting sizes from 1 × 1 to
n × m, and even larger to N' × M', where N' > n and M' > m. This information enables
retargeting an image continuously in real time. From a different perspective, this can be
seen as storing an explicit representation of the implicit time-evolution process of seam
applications.
[0070] First, consider the case of changing the width of the source image shown in Figure
7A. We define an index map V of size n × m that encodes, for each pixel, the index of the
seam that removed it, i.e., V(i, j) = t means that pixel (i, j) is removed by the t-th
vertical seam removal, see Figure 7B. The order or index for seam removal is shown dark to
light. To get an image of width m', we only need to gather, in each row, all pixels with a
seam index greater than or equal to m - m'.
[0071] For example, pixels removed by the first seam get the index number 1, and are deemed
less important than pixels removed by the seam with index number 20.
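Gathering pixels from an index map can be sketched in a few lines. To make the "greater than or equal to m - m'" test remove exactly m - m' pixels per row, this example assumes seam indices that start at 0 rather than 1; the function name and the list-of-rows layout are also assumptions.

```python
def retarget_width(img, V, new_width):
    """Keep, in each row, the pixels whose seam index is >= m - new_width.

    V[i][j] is the (0-based, assumed) index of the vertical seam removal
    that would delete pixel (i, j); each row of V is a permutation of 0..m-1.
    """
    m = len(img[0])
    threshold = m - new_width
    return [[p for p, t in zip(row, idx) if t >= threshold]
            for row, idx in zip(img, V)]


img = [[10, 20, 30], [40, 50, 60]]
V = [[2, 0, 1], [0, 2, 1]]
print(retarget_width(img, V, 2))  # [[10, 30], [50, 60]]
```

No seam computation happens at display time, so the cost of resizing is a single pass over the stored image and its index map.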
[0072] This representation supports image enlarging as well as image reduction. For example,
if we want to support enlarging of the image up to size M' > m, we enlarge the image using
seam insertion to a size n × M' as described above.
[0073] However, instead of averaging the pixels in the k-th seam with the pixels in the two
neighboring seams, we do not modify the source image pixels in the seam, but insert new
pixels to the image as the average of the k-th seam and its left (or right) pixel neighbor.
The inserted seams are given a negative index starting at -1.
[0074] To enlarge the source image by k, (m < k < M'), we use exactly the same procedure of
gathering, from the enlarged image, all pixels whose seam index is greater than
(m - (m + k)) = -k, and obtain an image of size m - (-k) = m + k.
[0075] Determining a horizontal index map H for image height enlarging and reduction is
achieved in a similar manner, see Figure 7C. However, supporting resizing in both image
dimensions, while determining index maps H and V independently, can cause problems. This is
because horizontal and vertical seams can intersect in more than one place, see Figures 7B
and 7C, and removing a seam in one direction may affect the index map in the other
direction. One way to avoid this is to allow seam removal in one direction, and use
degenerate seams, i.e., rows or columns, in the other direction.
Constructing Consistent Index Maps
[0076] Determining the horizontal index map H and the vertical index map V independently
for a multi-size image does not work. To see why, we start with a definition. The maps H
and V are consistent if every horizontal seam intersects all the vertical seam indexes,
and every vertical seam intersects all horizontal seam indexes.
[0077] Consistency assures that removing a seam in any dimension removes exactly one pixel
from all seams in the other dimension, retaining the index map structure. If consistency
is not maintained, then after removing one horizontal seam we might be left with vertical
seams with different numbers of pixels, and the rectangular structure of the image is
affected.
[0078] Aside from limiting seams to be rows or columns in one (or two) of the dimensions,
we describe another approach to this problem that is restricted to temporally 0-connected
seams. These are seams that are spatially connected on the source image I(0).
[0079] For such seams, the only possible violation of consistency between the H and V maps
can occur in diagonal seam steps. Our method first determines temporally 0-connected
seams in one direction, e.g., vertical, and then imposes the constraints on the diagonal
when determining the seams in the other direction.
[0080] To understand why the only violation of consistency occurs in diagonals, assume,
without loss of generality, that some vertical seam j ∈ {1, ..., m} violates the
consistency constraint. This means that the seam does not intersect all horizontal seams
1, ..., n. Hence, the seam must touch some horizontal seam i ∈ {1, ..., n} more than once.
[0081] Denote the pixels where seam j intersects seam i as p and q. Because pixels p and q
are part of a vertical seam j, the pixels cannot be in the same row. However, the pixels
are also part of the horizontal seam i, and cannot be in the same column.
[0082] Let us examine the rectangle defined in its two corners by p and q. Seams i and j are
connected inside this rectangle and the seams touch the corners. However, one
is a vertical seam and the other a horizontal seam. The only possibility for this
to happen is that the rectangle is in fact a square, and both seams pass through the
diagonal of the square.
[0083] Note that the above deduction relies on the fact that all seams are connected in
the image at one place, which is not true if we use non 0-connected seams, see Figure
1E. Because we restrict ourselves to temporally 0-connected seams in both directions,
we can replace the time evolution process of determining 120 the seams 122 by concurrently
determining all of the seams in each dimension. The reason is that, for 0-connected seams,
we can examine the source size image, and process each pair of rows independently.
[0084] For each pair of rows, we can find the optimal set of 1-edge paths linking all pixels
of one row to all pixels of the next row. The global multiple seam paths from the
top of the image to the bottom are the concatenation of those 1-edge paths.
[0085] Finding the best 1-edge paths between a pair of rows is similar to a weighted assignment
problem where each pixel in one row (column) is connected to its three neighboring
pixels in the other row (column). We use the well known Hungarian algorithm to solve
this weighted assignment problem. The Hungarian algorithm is described in, for example,
a textbook entitled "Network Programming," by K. Murty, Prentice Hall, Englewood Cliffs,
N.J., 1992, pp. 168-187.
[0086] After we find the multi-seam paths in one dimension, we repeat the process in the
other dimension, but we mask out every diagonal edge that was already used by any
of the first direction seams. This guarantees that the seams in the second dimension
are consistent with the first dimension, as shown for the consistent seams of Figures
8A and 8B, corresponding to inconsistent seams of Figures 7B and 7C, respectively.
Effect of the Invention
[0087] The embodiments of the present invention provide seam applications for content-aware
image retargeting. Seams are determined as optimal minimum energy paths in a source
image. Pixels can be removed or inserted along the seams.
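As a minimal sketch of how a single minimum energy seam can be determined and applied, the following assumes an eight-connected vertical seam (one pixel per row, moving at most one column between adjacent rows) found by dynamic programming over a precomputed energy image. The function names `min_vertical_seam` and `remove_vertical_seam` and the list-of-rows image representation are assumptions made for illustration.

```python
def min_vertical_seam(energy):
    """Return the column index, per row, of a minimum energy vertical seam.

    energy is a list of rows of numbers. The seam is eight-connected:
    consecutive rows differ by at most one column.
    """
    h, w = len(energy), len(energy[0])
    # cost[r][c]: minimal cumulative energy of any seam ending at pixel (r, c).
    cost = [list(energy[0])]
    for r in range(1, h):
        prev = cost[-1]
        row = []
        for c in range(w):
            lo, hi = max(c - 1, 0), min(c + 1, w - 1)
            row.append(energy[r][c] + min(prev[lo:hi + 1]))
        cost.append(row)
    # Backtrack from the cheapest pixel in the bottom row.
    c = min(range(w), key=cost[-1].__getitem__)
    seam = [c]
    for r in range(h - 1, 0, -1):
        lo, hi = max(c - 1, 0), min(c + 1, w - 1)
        c = min(range(lo, hi + 1), key=cost[r - 1].__getitem__)
        seam.append(c)
    seam.reverse()
    return seam


def remove_vertical_seam(image, seam):
    """Remove one pixel per row along the seam, reducing the width by one."""
    return [row[:c] + row[c + 1:] for row, c in zip(image, seam)]
```

Inserting pixels along a seam, rather than removing them, would duplicate or interpolate the pixel at `seam[r]` in each row instead of dropping it.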
[0088] The seam applications can be used for a variety of image manipulation operations
including: aspect ratio change, image resizing, content enlargement and object removal.
The seam applications can be integrated with various saliency measures, as well as
user input, to guide the retargeting process. In addition, the invention provides
a data structure for multi-size images that support continuous retargeting in real
time.
[0089] Although the invention has been described by way of examples of preferred embodiments,
it is to be understood that various other adaptations and modifications can be made
within the spirit and scope of the invention. Therefore, it is the object of the appended
claims to cover all such variations and modifications as come within the true spirit
and scope of the invention.
1. A method for content-aware image retargeting, comprising:
generating an energy image from a source image according to an energy function;
determining, from the energy image, one or more seams according to a minimizing function
such that each seam has a minimal energy; and
applying each seam to the source image to obtain a target image that preserves content
and a rectangular shape of the source image.
2. The method of claim 1, in which pixels in the source image are indexed in a vertical
dimension by indices n and in a horizontal dimension by indices m, and each minimal energy seam extends from one edge of the source image to an opposite
edge of the image, and a number of pixels in each seam corresponds to the indices
of the dimension in which the seam is oriented, and the seams include vertical seams
and horizontal seams.
3. The method of claim 2, further comprising:
applying the seams in an optimal order according to an objective function.
4. The method of claim 1, in which a particular seam is applied repeatedly.
5. The method of claim 1, in which the applying changes a size of the source image.
6. The method of claim 5, in which the size of the source image is increased.
7. The method of claim 5, in which the size of the source image is decreased.
8. The method of claim 1, in which the applying changes an aspect ratio of the source
image.
9. The method of claim 1, in which the applying removes an object from the source image.
10. The method of claim 1, in which the applying enlarges the content of the source image
while preserving a size of the source image.
11. The method of claim 1, in which the generating of the energy image is performed in
an intensity domain of the source image.
12. The method of claim 1, in which the generating of the energy image is performed in
a gradient domain of the source image.
13. The method of claim 1, in which the energy function is a gradient magnitude.
14. The method of claim 1, in which the energy function is entropy.
15. The method of claim 1, in which the energy function is according to visual saliency.
16. The method of claim 1, in which the energy function is according to eye-gaze movement.
17. The method of claim 3, further comprising:
storing the optimal order to enable changing a size of the source image in real time.
18. The method of claim 2, in which pixels similar to neighboring pixels have a low energy,
and pixels dissimilar to neighboring pixels have a high energy.
19. The method of claim 1, in which the energy function is

where I(x, y) is a particular pixel.
20. The method of claim 1, in which the applying maximizes energy of the target image.
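21. The method of claim 3, in which the vertical seam is:
$$ s^{x} = \{s_i^{x}\}_{i=1}^{n} = \{(x(i), i)\}_{i=1}^{n}, $$
where x is a mapping x : [1, ..., n] → [1, ..., m], and the horizontal seam is:
$$ s^{y} = \{s_j^{y}\}_{j=1}^{m} = \{(j, y(j))\}_{j=1}^{m}, $$
where y is a mapping y : [1, ..., m] → [1, ..., n].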
22. The method of claim 21, in which the pixels of the seam are eight-connected.
23. The method of claim 21, in which the pixels of the seams are piece-wise connected.
24. The method of claim 21, in which the pixels of the seams are disconnected.
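25. The method of claim 21, in which the pixels of the vertical seam {s_i} 101 are
$$ I_s = \{I(s_i)\}_{i=1}^{n} = \{I(x(i), i)\}_{i=1}^{n}, $$
and the energy of the vertical seam is
$$ E(s) = E(I_s) = \sum_{i=1}^{n} e(I(s_i)), $$
and the optimal minimal energy seam s* is
$$ s^{*} = \arg\min_{s} E(s) = \arg\min_{s} \sum_{i=1}^{n} e(I(s_i)). $$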
26. The method of claim 25, in which the optimal minimal energy seam s* is found using
dynamic programming.
27. The method of claim 1, in which the energy function is according to user input.
28. The method of claim 1, in which the energy function is according to object detection.
29. The method of claim 2, further comprising:
constructing horizontal and vertical index seam maps T including entries T(r, c).
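30. The method of claim 3, in which the optimal order of horizontal and vertical seam
removal is computed using
$$ T(r, c) = \min\bigl( T(r-1, c) + E(s^{x}(I_{n-r-1 \times m-c})),\; T(r, c-1) + E(s^{y}(I_{n-r \times m-c-1})) \bigr), $$
where I_{n-r×m-c} denotes an image of size n - r × m - c, and E(s^x(I)) and E(s^y(I))
are costs of the respective seam applications.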