TECHNICAL FIELD
[0002] Embodiments of this application relate to the field of image processing, and in particular,
to an image signal processing method and apparatus.
BACKGROUND
[0003] There are a plurality of conventional Joint Photographic Experts Group (JPEG) image
compression solutions. For example, Huffman coding and arithmetic coding are currently
commonly used entropy coding compression solutions.
[0004] Huffman coding is a compression solution that allocates short binary codewords to
characters with a high occurrence probability and long binary codewords to characters
with a low occurrence probability, so that the average coding length of the characters
is the shortest. Huffman coding needs to first calculate an occurrence frequency of each
character in original data, generate a corresponding dictionary, and then perform coding.
The entire coding process has a high compression rate and is complex to implement. Once
a decoding error occurs, the error spreads severely at the decoder, which affects image
compression performance.
[0005] How to easily and effectively control error spreading at the decoder to improve image
compression performance is a problem to be urgently resolved.
SUMMARY
[0006] This application provides an image signal processing method and apparatus. The solution
is easy to implement, has a low compression rate, and effectively controls error spreading
at a decoder, to improve image compression performance.
[0007] To achieve the foregoing objective, the following technical solutions are used in
embodiments of this application.
[0008] According to a first aspect, an image signal processing method is provided, where
the method is applied to an encoder. The method may include: obtaining a binary source
X^N of an image; performing a polarization operation on X^N, to obtain a to-be-compressed
bit set U^N; obtaining a compressed reserved bit set U_S from U^N; performing channel
coding on U_S, to obtain a coded bit sequence; and outputting the coded bit sequence.
[0009] According to the image signal processing method provided in this application, the
binary source is compressed through polar coding, and fixed-length source coding is
implemented based on an expected compression rate. This process is easy to implement.
In addition, a coding granularity of the polar coding is an image block (block). Even
if a decoding error occurs, the error spreads only in the block, thereby effectively
controlling error spreading at a decoder, and improving image compression performance.
[0010] The polarization operation may be multiplication by a generator matrix of a polar code.
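As an illustration of the polarization operation, the following minimal Python sketch multiplies a bit vector by G_N = F^{⊗n} over GF(2), where F = [[1, 0], [1, 1]]. The function name `polar_transform`, the natural (non-bit-reversed) index order, and the butterfly schedule are assumptions made for illustration only; this application does not prescribe a particular implementation.

```python
def polar_transform(bits):
    """Return bits * F^{(x)n} over GF(2), with F = [[1, 0], [1, 1]].

    Hypothetical helper: natural index order, no bit-reversal permutation.
    """
    x = list(bits)
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    step = 1
    while step < n:
        # One butterfly stage: XOR the upper half of each group into the lower half.
        for start in range(0, n, 2 * step):
            for j in range(start, start + step):
                x[j] ^= x[j + step]
        step *= 2
    return x


u = polar_transform([1, 0, 1, 1])   # U^N for a 4-bit source block
x_back = polar_transform(u)         # applying the transform twice restores the input
```

Because this matrix is its own inverse over GF(2), the same routine can serve on both the encoder side and the decoder side.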
[0011] A bit in U_S is a reserved bit during source compression, and U_S is a subset of
the bits in U^N.
[0012] Optionally, a source entropy of the bit in U_S approximates 1; or a source entropy
of the bit in U_S tends to be greater than or equal to a first preset threshold; or a
Bhattacharyya parameter of an equivalent channel corresponding to the bit in U_S is greater
than or equal to a second preset threshold; or a channel capacity of an equivalent channel
corresponding to the bit in U_S is less than or equal to a third preset threshold.
[0013] With reference to the first aspect, in a possible implementation, the obtaining a
compressed reserved bit set U_S from U^N may be specifically implemented as: converting
source coding into channel coding; and using, based on a sequence of equivalent channel
reliability in descending order, bits corresponding to equivalent channel reliability
ranked low in U^N as U_S. A higher source entropy of a source bit indicates lower
reliability of an equivalent channel corresponding to the source bit.
[0014] Source coding is equivalent to channel decoding. A channel reliability ranking algorithm
is used during channel decoding, to obtain the sequence of the channel reliability in
descending order, and the sequence is used as a sequence of source entropies of bits in
U^N in ascending order. Fixed bits U_F of an equivalent channel are used as U_S (that is,
U_F ↔ U_S), and information bits U_I of the equivalent channel are used as U_{S^C} (that
is, U_I ↔ U_{S^C}).
[0015] For example, the channel reliability ranking algorithm may include a Gaussian
approximation algorithm, a channel degradation and upgradation merging algorithm, or
another ranking method.
[0016] Specifically, the using bits corresponding to equivalent channel reliability ranked
low in U^N as U_S includes: using, based on the sequence of the equivalent channel
reliability in descending order, A bits corresponding to equivalent channel reliability
ranked low in U^N as U_S, where A is determined by the expected compression rate. This is
not limited in this application.
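The selection of the A least reliable positions can be sketched as follows. This sketch does not use the Gaussian approximation or degradation/upgradation merging algorithms named above. Instead it assumes the simple Bhattacharyya recursion Z⁻ = 2Z − Z², Z⁺ = Z², which is exact only for erasure channels and serves here as a heuristic ranking for a BSC whose crossover probability equals the source probability p. The function names, the natural index order, and the parameter values are hypothetical.

```python
import math


def bhattacharyya_estimates(n_bits, p):
    """Heuristic Bhattacharyya parameter per index (assumed recursion, exact only for a BEC)."""
    z = [2.0 * math.sqrt(p * (1.0 - p))]  # Bhattacharyya parameter of a BSC(p)
    while len(z) < n_bits:
        # Each channel splits into a worse ("minus") and a better ("plus") channel.
        z = [v for zz in z for v in (2.0 * zz - zz * zz, zz * zz)]
    return z


def select_reserved_set(n_bits, p, a):
    """Return the indices of the `a` least reliable positions, used as U_S."""
    z = bhattacharyya_estimates(n_bits, p)
    least_reliable_first = sorted(range(n_bits), key=lambda i: z[i], reverse=True)
    return sorted(least_reliable_first[:a])


reserved = select_reserved_set(n_bits=8, p=0.11, a=4)
```

A larger A keeps more bits and therefore lowers the number of decoding errors at the cost of less compression.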
[0017] With reference to the first aspect, in a possible implementation, the obtaining a
compressed reserved bit set U_S from U^N may be specifically implemented as: calculating a
source entropy of each bit in U^N, and using bits whose source entropies approximate 1 in
U^N as U_S. A process of calculating the source entropies may be implemented through
mathematical derivation.
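For very small block lengths, the source entropy of each bit can also be checked numerically by brute force rather than by derivation. The sketch below assumes an i.i.d. Bernoulli(p) source and interprets "source entropy of each bit" as the conditional entropy H(U_i | U_1, ..., U_{i-1}); both the interpretation and the helper names are assumptions, and the enumeration is only practical for small N.

```python
import itertools
import math


def polar_transform(bits):
    """bits * F^{(x)n} over GF(2), natural index order (hypothetical helper)."""
    x = list(bits)
    step = 1
    while step < len(x):
        for start in range(0, len(x), 2 * step):
            for j in range(start, start + step):
                x[j] ^= x[j + step]
        step *= 2
    return x


def conditional_entropies(n_bits, p):
    """Return H(U_i | U_1..U_{i-1}) in bits for every index, by full enumeration."""
    joint = {}
    for x in itertools.product((0, 1), repeat=n_bits):
        ones = sum(x)
        joint[tuple(polar_transform(x))] = p ** ones * (1.0 - p) ** (n_bits - ones)

    def prefix_entropy(k):
        # Entropy of the first k bits of U^N, obtained by marginalizing the joint law.
        marginal = {}
        for u, q in joint.items():
            marginal[u[:k]] = marginal.get(u[:k], 0.0) + q
        return -sum(q * math.log2(q) for q in marginal.values() if q > 0.0)

    h = [prefix_entropy(k) for k in range(n_bits + 1)]
    return [h[k + 1] - h[k] for k in range(n_bits)]


entropies = conditional_entropies(8, 0.11)
```

By the chain rule, the values sum to N times the entropy of one source bit, which gives a simple sanity check on the computation.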
[0018] With reference to the first aspect or any one of the foregoing possible implementations,
in another possible implementation, U^N further includes a compressed discarded bit set
U_{S^C} obtained through polar code source coding, where U_{S^C} is a complementary set of
U_S, and includes bits other than U_S in U^N. The image signal processing method provided
in this application may further include: performing channel decoding on preset output
values of an equivalent channel based on U_S, to obtain a decoded Û_{S^C}; obtaining error
location information T based on U_{S^C} and Û_{S^C}; and recording U_S and T, to train a
preset neural network for predicting the error location information. The preset neural
network is a neural network that is trained based on M groups of compressed reserved bit
sets and error location information obtained through polar code source coding until the
neural network converges. A value of M may be configured based on an actual requirement.
[0019] The performing channel decoding on preset output values of an equivalent channel
based on U_S, to obtain a decoded Û_{S^C} may include: equating a process of performing
source coding on a binary Bernoulli source to a process of performing channel decoding on
a binary symmetric channel (BSC). Noise distribution of the channel is the same as that of
X^N, the preset output values of the equivalent channel are used as a received signal of
the BSC, U_S is used as fixed bits U_F included in the received signal, the preset output
values of the equivalent channel and U_F are input into a polar code decoder to perform
channel decoding, to obtain estimated values Û_I of the information bits U_I, and Û_I is
used as estimated values Û_{S^C} of U_{S^C}.
[0020] The preset output values of the equivalent channel may be an all-zero set, or may
be other preset values. This is not specifically limited in this application. The error
location information T records the locations at which the bits in U_{S^C} and Û_{S^C}
differ.
[0021] With reference to the first aspect or any one of the foregoing possible implementations,
in another possible implementation, U^N further includes a compressed discarded bit set
U_{S^C} obtained through polar code source coding, where U_{S^C} is a complementary set of
U_S, and includes bits other than U_S in U^N. The image signal processing method provided
in this application may further include: performing channel decoding on preset output
values of an equivalent channel based on U_S, to obtain a decoded Û_{S^C}; and obtaining
error location information T based on U_{S^C} and Û_{S^C}. In this implementation, the
performing channel coding on U_S, to obtain a coded bit sequence may be specifically
implemented as: performing channel coding on U_S and T, to obtain the coded bit sequence.
The decoder performs decoding based on T, to implement lossless compression of the image.
[0022] It should be noted that before channel coding is performed on T, T further needs to
be converted into a binary form.
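The recording of error locations and their conversion into binary form can be sketched as follows. The fixed bit width per location, the convention that locations are counted within the discarded set, and all function names are assumptions for illustration; this application does not specify the binary format.

```python
def error_locations(u_sc, u_sc_hat):
    """Return the positions at which the true and the estimated discarded bits differ."""
    return [i for i, (a, b) in enumerate(zip(u_sc, u_sc_hat)) if a != b]


def locations_to_bits(locations, width):
    """Write each decimal location as `width` bits, most significant bit first."""
    bits = []
    for loc in locations:
        bits.extend((loc >> k) & 1 for k in reversed(range(width)))
    return bits


def bits_to_locations(bits, width):
    """Inverse of locations_to_bits: read groups of `width` bits back into decimal form."""
    return [int("".join(str(b) for b in bits[i:i + width]), 2)
            for i in range(0, len(bits), width)]


t = error_locations([0, 1, 1, 0, 1], [0, 1, 0, 0, 0])
t_bits = locations_to_bits(t, width=3)
t_back = bits_to_locations(t_bits, width=3)
```

The inverse conversion corresponds to the step on the decoder side in which the received error location information is converted back into decimal form.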
[0023] With reference to the first aspect or any one of the foregoing possible implementations,
in another possible implementation, the obtaining a binary source X^N of an image includes:
performing color conversion, grayscale translation, discrete cosine transform (DCT),
quantization, and zig-zag scanning on the image, to obtain X^N.
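Of the preprocessing steps listed above, zig-zag scanning is the easiest to show in isolation. The following sketch assumes the conventional JPEG scan order over a square block of quantized coefficients; the function names are hypothetical, and the remaining steps (color conversion, DCT, quantization, and the mapping of coefficients to bits) are not shown.

```python
def zigzag_indices(n=8):
    """Return (row, col) pairs of an n x n block in conventional zig-zag order."""
    def key(rc):
        r, c = rc
        diagonal = r + c
        # Odd anti-diagonals run top-right to bottom-left, even ones the other way.
        return (diagonal, r if diagonal % 2 else -r)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def zigzag_scan(block):
    """Flatten a square block (list of rows) into a one-dimensional sequence."""
    return [block[r][c] for r, c in zigzag_indices(len(block))]


sequence = zigzag_scan([[1, 2, 3],
                        [4, 5, 6],
                        [7, 8, 9]])
```

The scan places low-frequency coefficients first, so the long runs of zeros produced by quantization end up at the tail of the sequence.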
[0024] With reference to the first aspect or any one of the foregoing possible implementations,
in another possible implementation, the obtaining a binary source X^N of an image includes:
performing color conversion, grayscale translation, DCT, quantization, zig-zag scanning,
and entropy coding on the image, to obtain X^N. The entropy coding may include run-length
coding or the like.
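The optional run-length coding step can be illustrated with a generic sketch. It emits (value, run length) pairs and is not the exact run/size symbol format of the JPEG standard; the function names and the pair representation are assumptions.

```python
def run_length_encode(symbols):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    runs = []
    for s in symbols:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1
        else:
            runs.append([s, 1])
    return [tuple(r) for r in runs]


def run_length_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in runs for _ in range(count)]


runs = run_length_encode([5, 0, 0, 0, 3, 3])
restored = run_length_decode(runs)
```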
[0025] According to a second aspect, another image signal processing method is provided,
where the method is applied to a decoder. The method may include: obtaining a received
signal Y^N, where Y^N is a signal obtained after a bit sequence that is obtained after
source coding and channel coding are performed on a binary source X^N of an image is
transmitted through a channel; performing channel decoding on Y^N, to obtain information
bits Û_I in channel decoding, and using Û_I as a compressed reserved bit set Û_S on which
source decoding is to be performed; performing channel decoding on preset output values of
an equivalent channel based on error location information T̂ and Û_S, to obtain a
compressed discarded bit set Û_{S^C} on which source decoding is to be performed, where T̂
indicates a location of a flipped bit during channel decoding; obtaining Û^N, where Û^N
includes Û_S and Û_{S^C}; and performing a polarization operation on Û^N, to obtain a
reconstructed signal X̂^N of X^N.
[0026] According to the image signal processing method provided in this application, corresponding
decoding is performed, based on the error location information, on the signal that
is compressed through polar coding, to implement lossless compression of the image
signal. This process is easy to implement. In addition, a coding granularity of the
polar coding is an image block (block). Even if a decoding error occurs, the error
spreads only in the block, thereby effectively controlling error spreading at a decoder,
and improving image compression performance.
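The interplay of the reserved bits, the error location information, and the polar decoder can be sketched end to end. The sketch below assumes a recursive successive-cancellation (SC) decoder over all-zero preset output values, a Bernoulli(p) source, no channel noise between encoder and decoder, and error locations expressed as absolute indices in U^N. The names `compress`, `decompress`, and `sc_decode` are hypothetical, and the decoder type is an assumption, not a statement of the decoder used in this application.

```python
import math
import random


def polar_transform(bits):
    """bits * F^{(x)n} over GF(2), natural index order (hypothetical helper)."""
    x = list(bits)
    step = 1
    while step < len(x):
        for start in range(0, len(x), 2 * step):
            for j in range(start, start + step):
                x[j] ^= x[j + step]
        step *= 2
    return x


def boxplus(a, b):
    """LLR of the XOR of two independent bits, in a numerically stable exact form."""
    sign = math.copysign(1.0, a) * math.copysign(1.0, b)
    return (sign * min(abs(a), abs(b))
            + math.log1p(math.exp(-abs(a + b))) - math.log1p(math.exp(-abs(a - b))))


def sc_decode(llr, leaf, offset=0):
    """One SC pass. leaf(i, llr_i) supplies the bit used at index i.

    Returns (u, c): the bits of this sub-block and their polar transform.
    """
    if len(llr) == 1:
        bit = leaf(offset, llr[0])
        return [bit], [bit]
    half = len(llr) // 2
    left, right = llr[:half], llr[half:]
    u1, c1 = sc_decode([boxplus(a, b) for a, b in zip(left, right)], leaf, offset)
    refined = [b + (1 - 2 * c) * a for a, b, c in zip(left, right, c1)]
    u2, c2 = sc_decode(refined, leaf, offset + half)
    return u1 + u2, [a ^ b for a, b in zip(c1, c2)] + c2


def compress(x, reserved, p):
    """Encoder side: return the reserved bits U_S and the error locations T."""
    u = polar_transform(x)
    t = []

    def leaf(i, llr):
        # Record a location whenever the decoder's hard decision would be wrong,
        # then continue with the true bit so that later decisions stay aligned.
        if i not in reserved and (0 if llr >= 0 else 1) != u[i]:
            t.append(i)
        return u[i]

    sc_decode([math.log((1.0 - p) / p)] * len(x), leaf)
    return {i: u[i] for i in reserved}, t


def decompress(u_s, t, n_bits, p):
    """Decoder side: rebuild U^N from U_S and T, then apply the polarization operation."""
    flips = set(t)

    def leaf(i, llr):
        if i in u_s:
            return u_s[i]
        return (0 if llr >= 0 else 1) ^ int(i in flips)

    u_hat, _ = sc_decode([math.log((1.0 - p) / p)] * n_bits, leaf)
    return polar_transform(u_hat)


random.seed(0)
source = [1 if random.random() < 0.11 else 0 for _ in range(16)]
u_s, t = compress(source, reserved={0, 1, 2, 4, 8}, p=0.11)
restored = decompress(u_s, t, n_bits=16, p=0.11)
```

In this sketch the reconstruction is exact for any source block because every wrong hard decision is recorded and flipped, and an error at one position never leaves the block being decoded.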
[0027] With reference to the second aspect, in a possible implementation, the image signal
processing method provided in this application may further include: inputting Û_S to a
preset neural network, to obtain the error location information T̂ through prediction.
The preset neural network is a neural network that is trained based on M groups of
compressed reserved bit sets and error location information obtained through polar code
source coding until the neural network converges.
[0028] The preset neural network may perform training offline based on data, recorded by
an encoder, of the M groups of compressed reserved bit sets and error location information
obtained through polar code source coding, and trained data is configured in the decoder.
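The prediction step can be pictured with a small feedforward network. The sketch below shows only a forward pass that maps the reserved bits to one score per candidate location and keeps the locations whose score reaches a threshold. The layer sizes, the ReLU and sigmoid activations, the threshold, and the placeholder weights are all assumptions; the offline training on the M recorded groups is not shown.

```python
import math
import random


def make_placeholder_model(n_in, n_hidden, n_out, seed=0):
    """Build untrained placeholder weights; a real model would be trained offline."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)],
        "b2": [0.0] * n_out,
    }


def predict_error_locations(model, u_s_hat, threshold=0.5):
    """Forward pass: ReLU hidden layer, sigmoid output, then thresholding."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, u_s_hat)) + b)
              for row, b in zip(model["w1"], model["b1"])]
    scores = [1.0 / (1.0 + math.exp(-(sum(w * h for w, h in zip(row, hidden)) + b)))
              for row, b in zip(model["w2"], model["b2"])]
    return [i for i, s in enumerate(scores) if s >= threshold]


model = make_placeholder_model(n_in=4, n_hidden=8, n_out=4)
t_hat = predict_error_locations(model, [1, 0, 1, 1])
```

With placeholder weights the predicted locations carry no meaning; the sketch only shows the data flow from Û_S to T̂.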
[0029] With reference to the second aspect, in a possible implementation, the performing
channel decoding on Y^N, to obtain information bits Û_I in channel decoding may be
specifically implemented as: performing channel decoding on Y^N, to obtain Û_I and T̂.
T̂ is sent by the encoder, to achieve lossless image compression.
[0030] In addition, T̂ obtained through channel decoding needs to be converted into a
decimal form.
[0031] With reference to the second aspect or any one of the foregoing possible implementations,
in another possible implementation, the image signal processing method provided in
this application further includes: performing inverse zig-zag scanning, dequantization,
inverse discrete cosine transform (IDCT), grayscale translation, color conversion, and
the like on X̂^N, to restore the image.
[0032] With reference to the second aspect or any one of the foregoing possible implementations,
in another possible implementation, the image signal processing method provided in
this application further includes: performing entropy decoding, inverse zig-zag scanning,
dequantization, IDCT, grayscale translation, color conversion, and the like on X̂^N, to
restore the image. The entropy decoding may include run-length decoding or decoding
corresponding to another entropy coding.
[0033] According to a third aspect, an image signal processing apparatus is provided. The
apparatus may be an encoder, or may be an apparatus or a chip system in an encoder.
The image signal processing apparatus may implement the functions in the first aspect
or the possible implementations of the first aspect; and the functions may be implemented
through hardware, or may be implemented through hardware executing corresponding software.
The hardware or the software includes one or more modules corresponding to the functions.
The image signal processing apparatus may include a first obtaining unit, a polarization
unit, an obtaining unit, a coding unit, and an output unit.
[0034] The first obtaining unit is configured to obtain a binary source X^N of an image.
[0035] The polarization unit is configured to perform a polarization operation on X^N
obtained by the first obtaining unit, to obtain a to-be-compressed bit set U^N.
[0036] The obtaining unit is configured to obtain a compressed reserved bit set U_S from
U^N obtained by the polarization unit.
[0037] The coding unit is configured to perform channel coding on U_S obtained by the
obtaining unit, to obtain a coded bit sequence.
[0038] The output unit is configured to output the coded bit sequence obtained by the coding
unit.
[0039] With reference to the third aspect, in a possible implementation, the obtaining unit
may be specifically configured to: use, based on a sequence of equivalent channel
reliability in descending order, bits corresponding to equivalent channel reliability
ranked low in U^N as U_S. A higher source entropy of a source bit indicates lower
reliability of an equivalent channel corresponding to the source bit.
[0040] With reference to the third aspect or any one of the foregoing possible implementations,
in another possible implementation, U^N further includes a compressed discarded bit set
U_{S^C} obtained through polar code source coding. The apparatus may further include: a
second obtaining unit, configured to: perform channel decoding on preset output values of
an equivalent channel based on U_S, to obtain a decoded Û_{S^C}; and obtain error location
information T based on U_{S^C} and Û_{S^C}; and a recording unit, configured to record U_S
and T, to train a preset neural network for predicting the error location information.
The preset neural network is a neural network that is trained based on M groups of
compressed reserved bit sets and error location information obtained through polar
code source coding until the neural network converges.
[0041] With reference to the third aspect or any one of the foregoing possible implementations,
in another possible implementation, U^N further includes a compressed discarded bit set
U_{S^C} obtained through polar code source coding. The apparatus may further include a
second obtaining unit, configured to: perform channel decoding on preset output values of
an equivalent channel based on U_S, to obtain a decoded Û_{S^C}; and obtain error location
information T based on U_{S^C} and Û_{S^C}. Correspondingly, the coding unit may be
specifically configured to perform channel coding on U_S and T.
[0042] It should be noted that the image signal processing apparatus provided in the third
aspect is configured to perform the image signal processing method provided in any
one of the first aspect or the possible implementations of the first aspect. For specific
implementations, reference may be made to each other, or a same effect may be achieved.
[0043] According to a fourth aspect, another image signal processing apparatus is provided.
The apparatus may be a decoder, or may be an apparatus or a chip system in a decoder.
The image signal processing apparatus may implement the functions in the second aspect
or the possible implementations of the second aspect; and the functions may be implemented
through hardware, or may be implemented through hardware executing corresponding software.
The hardware or the software includes one or more modules corresponding to the functions.
The apparatus may include an obtaining unit, a first decoding unit, a second decoding
unit, an obtaining unit, and a polarization unit.
[0044] The obtaining unit is configured to obtain a received signal Y^N, where Y^N is a
signal obtained after a bit sequence that is obtained after source coding and channel
coding are performed on a binary source X^N of an image is transmitted through a channel.
[0045] The first decoding unit is configured to: perform channel decoding on Y^N obtained
by the obtaining unit, to obtain information bits Û_I in channel decoding; and use the
information bits as a compressed reserved bit set Û_S on which source decoding is to be
performed.
[0046] The second decoding unit is configured to perform channel decoding on preset output
values of an equivalent channel based on error location information T̂ and Û_S, to obtain
a compressed discarded bit set Û_{S^C} on which source decoding is to be performed. T̂
indicates a location of a flipped bit during channel decoding.
[0047] The obtaining unit is configured to obtain Û^N, where Û^N includes Û_S obtained by
the first decoding unit and Û_{S^C} obtained by the second decoding unit.
[0048] The polarization unit is configured to: perform a polarization operation on Û^N
obtained by the obtaining unit, to obtain a reconstructed signal X̂^N of X^N.
[0049] With reference to the fourth aspect, in a possible implementation, the apparatus
may further include a prediction unit, configured to: input Û_S to a preset neural
network, to obtain the error location information T̂ through prediction. The preset
neural network is a neural network that is trained based on M groups of compressed
reserved bit sets and error location information obtained through polar code source
coding until the neural network converges.
[0050] With reference to the fourth aspect, in another possible implementation, the first
decoding unit may be specifically configured to perform channel decoding on Y^N, to obtain
Û_I and T̂.
[0051] It should be noted that the image signal processing apparatus provided in the fourth
aspect is configured to perform the image signal processing method provided in any
one of the second aspect or the possible implementations of the second aspect. For
specific implementations, reference may be made to each other, or a same effect may
be achieved.
[0052] According to a fifth aspect, an encoder is provided. The encoder includes a processor
and a memory. The memory is configured to store computer-readable instructions (or
referred to as computer programs), and the processor is configured to read the computer-readable
instructions to implement the method provided in the first aspect and any implementation
of the first aspect.
[0053] In a possible implementation, the encoder further includes a transceiver, configured
to receive and send data.
[0054] According to a sixth aspect, a decoder is provided. The decoder includes a processor
and a memory. The memory is configured to store computer-readable instructions (or
referred to as computer programs), and the processor is configured to read the computer-readable
instructions to implement the method provided in the second aspect and any implementation
of the second aspect.
[0055] In a possible implementation, the decoder further includes a transceiver, configured
to receive and send data.
[0056] According to a seventh aspect, an embodiment of this application further provides
a computer-readable storage medium, including instructions. When the instructions
are run on a computer, the computer is enabled to perform the image signal processing
method according to any one of the foregoing aspects or any possible implementation
of the aspects.
[0057] According to an eighth aspect, an embodiment of this application further provides
a computer program product. When the computer program product runs on a computer,
the computer is enabled to perform the image signal processing method according to
any one of the foregoing aspects or any possible implementation of the aspects.
[0058] According to a ninth aspect, an embodiment of this application provides a chip system.
The chip system includes a processor, and may further include a memory, configured
to implement a function of the image signal processing method according to any one
of the foregoing aspects or any possible implementation of the aspects. The chip system
may include a chip, or may include a chip and another discrete component.
[0059] According to a tenth aspect, a chip system is provided, where the chip system includes
a processor and a memory. The memory is configured to store computer-readable instructions
or a computer program, and the processor is configured to read the computer-readable
instructions to implement the image signal processing method according to any one
of the foregoing aspects or any possible implementation of the aspects.
[0060] According to an eleventh aspect, an image signal processing system is provided. The
image signal processing system includes: an encoder according to any one of the foregoing
aspects or any possible implementation of the aspects, and a decoder according to
any one of the foregoing aspects or any possible implementation of the aspects.
[0061] It should be noted that various possible implementations of any one of the foregoing
aspects may be combined provided that the solutions are not contradictory.
BRIEF DESCRIPTION OF DRAWINGS
[0062]
FIG. 1 is a schematic diagram of a structure of an image processing system according
to an embodiment of this application;
FIG. 2 is a schematic diagram of internal principles of an encoder 100 and a decoder
200 according to an embodiment of this application;
FIG. 3 is a schematic diagram of typical coding of an 8 × 8 polar code according to
an embodiment of this application;
FIG. 4 is a schematic diagram of a polarization change according to an embodiment
of this application;
FIG. 5 is a schematic diagram of a structure of a feedforward neural network according
to an embodiment of this application;
FIG. 6 is a schematic diagram of an operation of each neuron in a neural network according
to an embodiment of this application;
FIG. 7 is a schematic flowchart of an image signal processing method according to
an embodiment of this application;
FIG. 8 is a schematic diagram of a BSC according to an embodiment of this application;
FIG. 9 is a schematic flowchart of another image signal processing method according
to an embodiment of this application;
FIG. 10 is a schematic flowchart of lossless polar compression according to an embodiment
of this application;
FIG. 11 is a schematic flowchart of still another image signal processing method according
to an embodiment of this application;
FIG. 12 is a schematic diagram of a structure of an image signal processing apparatus
according to an embodiment of this application;
FIG. 13 is a schematic diagram of a structure of another image signal processing apparatus
according to an embodiment of this application;
FIG. 14 is a schematic diagram of a structure of still another image signal processing
apparatus according to an embodiment of this application; and
FIG. 15 is a schematic diagram of a structure of yet another image signal processing
apparatus according to an embodiment of this application.
DESCRIPTION OF EMBODIMENTS
[0063] In addition, in the embodiments of this application, the word "example" or "for example"
is used to represent giving an example, an illustration, or a description. Any embodiment
or design scheme described as "in an example" or "for example" in the embodiments
of this application should not be explained as being more preferred or having more
advantages than another embodiment or design scheme. Rather, use of a word such
as "example" or "for example" is intended to present a related concept in a specific
manner.
[0064] This application provides an image signal processing method. Basic principles of
the method are as follows: source coding is performed, based on channel polarization
characteristics of polar coding (a polar code), by using the polar code to compress
a binary source, and only a compressed bit sequence is transmitted to a peer end;
and the peer end performs decoding by using an existing polar decoder, to implement
fixed-length entropy coding. An entire process is easy to implement, and spreading
of decoding errors can be controlled.
[0065] The solution provided in this application may be applied to an image processing system
shown in FIG. 1. An image encoder 100 and an image decoder 200 in the image processing
system shown in FIG. 1 are configured to perform image compression and decompression
according to the image processing method proposed in this application.
[0066] As shown in FIG. 1, the image processing system includes a source apparatus 10 and
a destination apparatus 20. The source apparatus 10 generates a coded bit sequence.
Therefore, the source apparatus 10 may be referred to as an image coding apparatus.
The destination apparatus 20 may decode the coded bit sequence generated by the source
apparatus 10. Therefore, the destination apparatus 20 may be referred to as an image
decoding apparatus. Various implementation solutions of the source apparatus 10, the
destination apparatus 20, or both the source apparatus 10 and the destination apparatus
20 may include one or more processors and a memory coupled to the one or more processors.
The memory may include but is not limited to a RAM, a ROM, an EEPROM, a flash memory,
or any other medium that can be used to store desired program code in a form of an
instruction or a data structure accessible by a computer, as described in this specification.
[0067] The source apparatus 10 and the destination apparatus 20 may include various apparatuses,
including a desktop computer, a mobile computing apparatus, a notebook (for example,
laptop) computer, a tablet computer, a set top box, a handheld telephone set such
as a "smart" phone, a television set, a camera, a display apparatus, a digital media
player, a video game console, an in-vehicle computer, and similar apparatuses.
[0068] The destination apparatus 20 may receive the coded bit sequence from the source apparatus
10 through a link 30. The link 30 may include one or more media or apparatuses that
can transfer the coded bit sequence from the source apparatus 10 to the destination
apparatus 20. In an example, the link 30 may include one or more communication media
that enable the source apparatus 10 to directly transmit the coded bit sequence to
the destination apparatus 20 in real time. In this example, the source apparatus 10
may modulate the coded bit sequence according to a communication standard (for example,
a wireless communication protocol), and may transmit a modulated bit sequence to the
destination apparatus 20. The one or more communication media may include a wireless
and/or wired communication medium, for example, a radio frequency (radio frequency,
RF) spectrum or one or more physical transmission lines. The one or more communication
media may constitute a part of a packet-based network, and the packet-based network
is, for example, a local area network, a wide area network, or a global network (for
example, the Internet). The one or more communication media may include a router,
a switch, a base station, or another device facilitating communication from the source
apparatus 10 to the destination apparatus 20.
[0069] In another example, the coded bit sequence may be output from an output interface
140 to a storage apparatus 40. Similarly, the coded bit sequence may be accessed from
the storage apparatus 40 through an input interface 240. The storage apparatus 40
may include any one of a plurality of distributed data storage media or locally accessible
data storage media, for example, a hard disk drive, a Blu-ray disc, a digital video
disc (digital video disc, DVD), a compact disc read-only memory (compact disc read-only
memory, CD-ROM), a flash memory, a volatile or non-volatile memory, or any other appropriate
digital storage media configured to store coded video data.
[0070] In another example, the storage apparatus 40 may correspond to a file server or another
intermediate storage apparatus that can store the coded bit sequence generated by
the source apparatus 10. The destination apparatus 20 may access the stored bit sequence
from the storage apparatus 40 through streaming or downloading. The file server may
be any type of server capable of storing the coded bit sequence and transmitting the
coded bit sequence to the destination apparatus 20. An example of the file server
includes a network server (which is, for example, used for a website), a file transfer
protocol (file transfer protocol, FTP) server, a network attached storage (network
attached storage, NAS) apparatus, or a local disk drive. The destination apparatus
20 may access the coded bit sequence through any standard data connection (including
an Internet connection). The standard data connection may include a wireless channel
(for example, a wireless fidelity (wireless-fidelity, Wi-Fi) connection), a wired
connection (for example, a digital subscriber line (digital subscriber line, DSL),
or a cable modem), or a combination of a wireless channel and a wired connection,
where the combination is suitable for accessing the coded bit sequence stored on the
file server. Transmission of the coded bit sequence from the storage apparatus 40
may be streaming transmission, download transmission, or a combination of both.
[0071] In the example of FIG. 1, the source apparatus 10 includes an image source 120, the
image encoder 100, and the output interface 140. In some examples, the output interface
140 may include a modulator/demodulator (a modem) and/or a transmitter. The image
source 120 may include an image capture apparatus (for example, a camera), a video
archive including previously captured image data, a feed-in interface for receiving
image data from an image content provider, and/or a computer graphics system for generating
image data, or a combination of these sources of image data.
[0072] The encoder 100 may code image data from the image source 120. In some examples,
the source apparatus 10 directly transmits the coded bit sequence to the destination
apparatus 20 through the output interface 140. In other examples, the coded bit sequence
may also be stored on the storage apparatus 40 for later access by the destination
apparatus 20 for decoding.
[0073] In the example of FIG. 1, the destination apparatus 20 includes the input interface
240, the decoder 200, and a display apparatus 220. In some examples, the input interface
240 includes a receiver and/or a modem. The input interface 240 may receive the coded
bit sequence through the link 30 and/or from the storage apparatus 40. The display
apparatus 220 may be integrated with the destination apparatus 20 or may be external
to the destination apparatus 20. Generally, the display apparatus 220 displays decoded
image data. The display apparatus 220 may include a plurality of types of display
apparatuses, for example, a liquid crystal display (liquid crystal display, LCD),
a plasma display, an organic light-emitting diode (organic light-emitting diode, OLED)
display, or another type of display apparatus.
[0074] The encoder 100 and the decoder 200 may each be implemented as any one of a plurality
of circuits such as one or more microprocessors, digital signal processors (digital
signal processors, DSP), application-specific integrated circuits (application-specific
integrated circuits, ASIC), field programmable gate arrays (field programmable gate
arrays, FPGA), discrete logics, hardware, or any combination thereof. If this application
is implemented partially through software, the apparatus may store, in an appropriate
nonvolatile computer-readable storage medium, instructions used for the software,
and may use one or more processors to execute instructions in hardware, to implement
the technologies in this application. Any one of the foregoing content (including
hardware, software, a combination of hardware and software, and the like) may be considered
as one or more processors. Each of the encoder 100 and the decoder 200 may be included
in one or more encoders or decoders, and any one of the encoder or the decoder may
be integrated as a part of a combined encoder/decoder (codec) in a corresponding apparatus.
[0075] It should be noted that the image processing system described in FIG. 1 is merely
an example, and does not constitute a specific limitation on a structure of the image
processing system.
[0076] FIG. 2 shows internal principles of the encoder 100 and the decoder 200. As shown
in FIG. 2, image data is input to the encoder 100, and undergoes a conventional JPEG
operation, for example, color conversion, grayscale translation, DCT, data quantization,
zig-zag scanning, and entropy coding (for example, run-length coding, which is an
optional operation), to obtain a binary source $X^N$ of the image. Polar code source
compression is performed on $X^N$,
and appropriate redundancy protection is performed on a compressed bit sequence
through channel coding to resist channel noise. A bit sequence obtained through channel
coding is transmitted to the decoder 200 through a channel between the encoder 100
and the decoder 200. In the decoder 200, after channel decoding is performed on a
received signal, an operation, for example, polar code source decompression, entropy
decoding (an optional operation), inverse zig-zag scanning, dequantization, IDCT,
grayscale translation addition, and color conversion is performed to restore the image.
The solutions of the present invention are mainly specific to a polar code source
compression process and a polar code source decompression process.
[0077] Before the specific solutions of this application are described, principles used
in the solutions of this application are first described.
[0078] First, polar coding (the polar code) is described.
[0079] Polar coding is the only known channel coding scheme that can be rigorously proved
to achieve the channel capacity, and has characteristics of high performance, low complexity,
a flexible matching manner, and the like.
[0080] FIG. 3 is a schematic diagram of typical coding of an 8 × 8 polar code. To-be-encoded
bits are classified into fixed (frozen) bits and information bits (data) based on
a sequence of their reliability. A bit with higher reliability is set as an information
bit (data), and a bit with lower reliability is set as a fixed (frozen) bit. A value
of the fixed (frozen) bit is usually set to 0, which is known to both a transmit end
and a receive end in actual transmission. As shown in FIG. 3, u
7, u
6, u
5, and u
3 are four bits whose reliability is ranked top, and are set to information bits (data);
and u
4, u
2, u
1, and u
0 are four bits whose reliability is ranked low, and are set to fixed (frozen) bits.
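The bit assignment of FIG. 3 can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the encoder of the embodiments: the generator matrix is assumed to be the n-fold Kronecker power of F = [[1, 0], [1, 1]] in natural bit order (no bit-reversal permutation), and the names polar_transform, polar_encode, INFO_POSITIONS, and BLOCK_LENGTH are hypothetical.

```python
INFO_POSITIONS = [3, 5, 6, 7]   # u3, u5, u6, u7 carry data (FIG. 3)
BLOCK_LENGTH = 8                # positions 0, 1, 2, 4 stay frozen to 0


def polar_transform(bits):
    """Multiply a bit list by G over GF(2), G being a Kronecker power of F.

    Assumption: F = [[1, 0], [1, 1]], natural bit order, no bit reversal.
    """
    x = list(bits)
    step = 1
    while step < len(x):
        for start in range(0, len(x), 2 * step):
            for k in range(start, start + step):
                x[k] ^= x[k + step]   # butterfly: upper branch receives the XOR
        step *= 2
    return x


def polar_encode(data_bits):
    """Place four data bits on the reliable positions, then encode."""
    u = [0] * BLOCK_LENGTH
    for position, bit in zip(INFO_POSITIONS, data_bits):
        u[position] = bit
    return polar_transform(u)


codeword = polar_encode([1, 0, 0, 0])   # only u3 is set
```

Because the frozen positions always hold 0, the receive end needs no side information to know their values.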
[0081] Then, a source entropy polarization theory is described.
[0082] It is assumed that the source $X^N$ conforms to a Bernoulli (Bernoulli) distribution
Ber(p), and it can be learned that a source entropy is as follows:
$$H(x) = -p \log_2 p - (1 - p) \log_2 (1 - p)$$
, and 0 ≤ H(x) ≤ 1.
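For orientation, the entropy of a Bernoulli source can be evaluated numerically. The following sketch is illustrative only; the function name binary_entropy and the sample probabilities are assumptions made for this example.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Ber(p) source; H(0) = H(1) = 0 by convention."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


entropy_fair = binary_entropy(0.5)      # maximally uncertain source
entropy_skewed = binary_entropy(0.11)   # sparse source, about half a bit per symbol
```

A source with p close to 0 or 1 has entropy well below 1 bit per symbol, which is the margin that source compression exploits.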
[0083] FIG. 4 is a schematic diagram of polar code source polarization. It can be learned
from a relationship shown in FIG. 4 that $(u_1, u_2)$ and $(x_1, x_2)$ are mapped one to one.
Therefore, it can be learned that $H(u_1, u_2) = H(x_1, x_2)$.
[0084] Because a polarization matrix is an invertible matrix, an entropy sum obtained after
the polarization operation remains unchanged, that is,
$H(u_1, u_2) = H(u_1) + H(u_2|u_1) = 2H(x)$. Because $U_2 = X_2$ and conditioning does not
increase entropy, $H(u_2|u_1) \le H(u_2) = H(x)$.
[0085] Because the entropy sum remains unchanged, it can be learned that $H(u_1) \ge H(x)$.
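The two inequalities can be checked numerically for a single polarization step. The sketch below assumes that the mapping of FIG. 4 is $u_1 = x_1 \oplus x_2$ and $u_2 = x_2$ (consistent with $U_2 = X_2$ above); the function names are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Ber(p) source."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def polarize_pair(p):
    """Return (H(u1), H(u2|u1)) for two i.i.d. Ber(p) bits x1, x2."""
    h_x = binary_entropy(p)
    # u1 = x1 XOR x2 equals 1 with probability 2p(1 - p).
    h_u1 = binary_entropy(2.0 * p * (1.0 - p))
    # The mapping is one to one, so H(u1) + H(u2|u1) = 2 H(x).
    h_u2_given_u1 = 2.0 * h_x - h_u1
    return h_u1, h_u2_given_u1


h_u1, h_u2_given_u1 = polarize_pair(0.11)
```

For p = 0.11 the first entropy moves above H(x) and the second moves below it, while their sum stays at 2H(x).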
[0086] It can be seen from the foregoing derivation that, from x to u, a polarization phenomenon
occurs after source entropies of discrete sources change through the polarization shown
in FIG. 4. To be specific, $H(u_2|u_1)$ becomes smaller than H(x), and $H(u_1)$ becomes greater
than H(x). Therefore, after the polarization structure is extended,
and after a plurality of independently and identically distributed sources
$X^N = (X_1, X_2, \ldots, X_N)$
undergo the polarization change, source entropies of some sources approximate 1, in
other words, average uncertainty of the sources increases; and source entropies of
the other sources approximate 0, in other words, average uncertainty of the sources
decreases. A polarization process of the sources is completed. The process may be
expressed as:
$$U^N = X^N G$$
, where
$H(U_i | U_1^{i-1})$
tends to a fixed value (0 or 1) with a probability P = 1, that is, for any $\delta \in (0, 1)$,
$$\lim_{N \to \infty} \frac{\left|\{ i \in [N] : H(U_i | U_1^{i-1}) \in (1 - \delta, 1] \}\right|}{N} = H(x)$$
and
$$\lim_{N \to \infty} \frac{\left|\{ i \in [N] : H(U_i | U_1^{i-1}) \in [0, \delta) \}\right|}{N} = 1 - H(x).$$
[0087] Finally, a neural network (neural network, NN) is described.
[0088] FIG. 5 is a schematic diagram of a structure of a feedforward neural network. The
leftmost layer of the network is an input layer, the middle layer is a hidden layer,
and the rightmost layer is an output layer.
[0089] The neural network includes a plurality of neurons. An operation of each neuron is
shown in FIG. 6, where $a_1, a_2, \ldots, a_n$ represent n inputs of the neuron,
$\omega_1, \omega_2, \ldots, \omega_n$ represent weights on corresponding connection lines,
b is an offset, and f(·) represents an activation function. It can be learned that a function
of a neuron is to sum up products of input values and weights, add the offset, and obtain a
corresponding output value by using a non-linear function, that is, the output of the neuron is
$f(\sum_{i=1}^{n} \omega_i a_i + b)$. To be specific, after linear transformation is performed
at each layer of the neural network, a nonlinear excitation function (activation
function) is applied to convert the linear result. After the linear result is converted
by the plurality of layers of the neural network, the finally obtained output is a complex
nonlinear function.
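The operation of a single neuron can be sketched in Python as follows. The sketch is illustrative: the function names are hypothetical, and sigmoid, ReLU, and tanh are used here only as common choices for f(·).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    return max(0.0, z)


def tanh(z):
    return math.tanh(z)


def neuron_output(inputs, weights, offset, activation):
    """Weighted sum of the inputs plus the offset, passed through f."""
    linear = sum(w * a for w, a in zip(weights, inputs)) + offset
    return activation(linear)


# Two inputs whose weighted sum cancels the offset, so the sigmoid sees 0.
example = neuron_output([1.0, 2.0], [0.5, 0.25], -1.0, sigmoid)
```

Swapping the activation argument changes only the final non-linear step; the weighted sum is the same for every choice of f(·).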
[0090] After the nonlinear excitation function is added, the neural network no longer needs
a complex combination of linear pieces to approximate a smooth curve that divides a plane, but
can directly learn the smooth curve to divide the plane, which is the reason why
the activation function is introduced. Common activation functions in the neural network
include sigmoid, ReLU, tanh, and the like. In actual application, flexible configuration
may be performed based on advantages and disadvantages of the activation functions:
the sigmoid function:
$$f(z) = \frac{1}{1 + e^{-z}}$$
the ReLU function:
$$f(z) = \max(0, z)$$
and
the tanh function:
$$f(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}$$
[0091] A process in which training data is input into the neural network, passes through
the hidden layer, and reaches the output layer, and an output result is obtained is
a forward propagation process of the neural network. Because there is a specific error
between the output result of the NN and an actual result, an error between an estimated
value and an actual value can be calculated (by using a cost function), and the error
can be propagated back from the output layer to the hidden layer until the error reaches
the input layer. In a back propagation process, a weight of each network element in
the neural network is adjusted based on the error, the foregoing process is continuously
iterated until the neural network converges, and a finally obtained weight is an optimal
weight. A back propagation (back propagation, BP) algorithm is an optimization algorithm
based on gradient descent, which is often used in training an artificial neural network.
[0092] The cost function may include a minimum mean square error (MSE), cross entropy (cross
entropy), or the like.
[0093] Specifically, a process of the gradient descent and the back propagation (BP) algorithm
is described herein.
[0094] A gradient descent algorithm is a method for a neural network to update a weight
based on a network error. A gradient is a vector in a direction in which a function
value increases the fastest, and a reverse direction of the gradient is a direction
in which the function value decreases the fastest. The weight is modified and updated
along the reverse direction of the gradient. After a plurality of rounds of iteration,
a value of the weight when the cost function has the minimum value is found.
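The update rule can be illustrated on a one-dimensional cost function. In the sketch below, the cost $E(w) = (w - 3)^2$, the learning rate, and the function name gradient_descent are assumptions chosen only for illustration.

```python
def gradient_descent(gradient, initial_weight, learning_rate, rounds):
    """Repeatedly move the weight against the gradient of the cost."""
    weight = initial_weight
    for _ in range(rounds):
        weight = weight - learning_rate * gradient(weight)
    return weight


# Hypothetical cost E(w) = (w - 3)^2 with gradient 2(w - 3); the minimum is at w = 3.
w_min = gradient_descent(lambda w: 2.0 * (w - 3.0), 0.0, 0.1, 100)
```

Each round shrinks the distance to the minimum by a constant factor here, so the weight settles near 3 after a few dozen rounds.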
[0095] The back propagation algorithm (BP algorithm) starts from the output layer, calculates
an error vector backward, updates weights from the output layer to the input layer
layer by layer, and iterates the above process for a plurality of times. A core mathematical
tool is a chain derivative rule of calculus. That the MSE is the cost function and
the sigmoid function is the activation function is used as an example, and mathematical
derivation of a weight update formula in the BP algorithm is given below:
[0096] A training set (x, y) is given. If it is assumed that an output of the neural network
is $y = (y_1, y_2, \ldots, y_l)$, a mean square error of the sample is
$$E = \frac{1}{2} \sum_{i=1}^{l} (t_i - y_i)^2$$
, where $t_i$ is a target value of the sample corresponding to a node i.
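[0097] It is assumed that $net_j$ is a weighted input of a node j, that is,
$net_j = \omega_j x_j = \sum_i \omega_{ji} x_{ji}$; E is a function of $net_j$, and $net_j$
is a function of $\omega_{ji}$. According to the chain derivative rule, it can be learned that:
$$\frac{\partial E}{\partial \omega_{ji}} = \frac{\partial E}{\partial net_j} \frac{\partial net_j}{\partial \omega_{ji}} = \frac{\partial E}{\partial net_j} x_{ji}$$
, where $x_{ji}$ represents an input value transferred by the node i to the node j. For
$\frac{\partial E}{\partial net_j}$, there are two types of cases: a case for the output layer
and a case for the hidden layer.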
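[0098] For the output layer:
$$\frac{\partial E}{\partial net_j} = \frac{\partial E}{\partial y_j} \frac{\partial y_j}{\partial net_j}$$
, where
$$\frac{\partial E}{\partial y_j} = \frac{\partial}{\partial y_j} \frac{1}{2} \sum_i (t_i - y_i)^2 = -(t_j - y_j)$$
, and
$$\frac{\partial y_j}{\partial net_j} = \frac{\partial\, \mathrm{sigmoid}(net_j)}{\partial net_j}.$$
[0099] Because
$$\frac{\partial\, \mathrm{sigmoid}(net_j)}{\partial net_j} = y_j (1 - y_j)$$
, for the output layer, it can be learned that:
$$\frac{\partial E}{\partial net_j} = -(t_j - y_j)\, y_j (1 - y_j).$$
$\delta_j = -\frac{\partial E}{\partial net_j}$ is set, to be specific, an error term δ of a node
is the negative of the partial derivative of the network error with respect to the input of
the node. $\delta_j = -\frac{\partial E}{\partial net_j}$ is substituted into the above formula,
and it can be learned that:
$$\delta_j = (t_j - y_j)\, y_j (1 - y_j).$$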
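[0100] For the hidden layer: A set $\phi_j$ of all downstream nodes of the node j is defined
(for example, as shown in FIG. 5, downstream nodes of a node 8 are a node 11 and a node 12).
It can be learned from the structure of the neural network that $net_j$ can only affect E by
affecting $\phi_j$. If it is assumed that $net_k$ is an input of a downstream node k of the
node j, E is a function of $net_k$. There are a plurality of $net_k$, and a full derivative
formula is used. Therefore, it can be learned that:
$$\frac{\partial E}{\partial net_j} = \sum_{k \in \phi_j} \frac{\partial E}{\partial net_k} \frac{\partial net_k}{\partial a_j} \frac{\partial a_j}{\partial net_j} = \sum_{k \in \phi_j} -\delta_k\, \omega_{kj}\, a_j (1 - a_j) = -a_j (1 - a_j) \sum_{k \in \phi_j} \delta_k \omega_{kj}.$$
[0101] Because $\delta_j = -\frac{\partial E}{\partial net_j}$, when $\delta_j$ is substituted
into the foregoing formula, it can be learned that:
$\delta_j = a_j (1 - a_j) \sum_{k \in \phi_j} \delta_k \omega_{kj}$.
[0102] $\delta_j$ is an error term of the node j, $\delta_k$ is an error term of a node k, and
$a_j = \mathrm{sigmoid}(net_j)$, namely, an output value of the node j.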
[0103] In conclusion, it can be learned that: for the output layer,
$\delta_j = (t_j - y_j)\, y_j (1 - y_j)$; and for the hidden layer,
$\delta_j = a_j (1 - a_j) \sum_{k \in \phi_j} \delta_k \omega_{kj}$.
[0104] The weight update formula is as follows:
$\omega_{ji} = \omega_{ji} + \eta \delta_j x_{ji}$, where η is a learning rate.
[0105] It can be learned from the above derivation that $\nabla = \delta x$ is the negative of
the gradient of E with respect to ω (δ is defined as the negative of the product of the
derivative of the error and the derivative of the activation function). Therefore, the weight
update formula may be summarized as: $\omega = \omega + \eta \nabla$.
[0106] It can be understood from the foregoing process that the reverse direction of the
gradient (the reason why the negative sign is taken for δ) is the direction in which
the function value decreases the fastest. Therefore, when ω is modified along the
reverse direction of the gradient over a plurality of rounds of iterative updates,
the value of ω at which the cost function has the minimum value is found, in other words,
the neural network converges.
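The error terms and the weight update derived above can be exercised on a very small network. The following Python sketch is an illustration under assumptions, not a reference implementation: it uses one hidden layer, sigmoid activations, the MSE cost, no offset terms (matching $net_j = \sum_i \omega_{ji} x_{ji}$), and hypothetical names, initial weights, and learning rate.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Forward propagation: hidden outputs a_j, then network outputs y_j."""
    a = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    y = [sigmoid(sum(w * aj for w, aj in zip(row, a))) for row in w_out]
    return a, y


def train_step(x, t, w_hidden, w_out, eta):
    """One BP update in place; returns the MSE measured before the update."""
    a, y = forward(x, w_hidden, w_out)
    # Output layer: delta_j = (t_j - y_j) y_j (1 - y_j).
    delta_out = [(tj - yj) * yj * (1.0 - yj) for tj, yj in zip(t, y)]
    # Hidden layer: delta_j = a_j (1 - a_j) * sum_k delta_k w_kj.
    delta_hidden = [
        a[j] * (1.0 - a[j]) * sum(delta_out[k] * w_out[k][j] for k in range(len(w_out)))
        for j in range(len(a))
    ]
    # Weight update: w_ji = w_ji + eta * delta_j * x_ji.
    for k, row in enumerate(w_out):
        for j in range(len(row)):
            row[j] += eta * delta_out[k] * a[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] += eta * delta_hidden[j] * x[i]
    return 0.5 * sum((tj - yj) ** 2 for tj, yj in zip(t, y))


hidden_weights = [[0.1, -0.2], [0.3, 0.4]]   # hypothetical initial weights
output_weights = [[0.2, -0.1]]
history = [train_step([1.0, 0.5], [1.0], hidden_weights, output_weights, 0.5)
           for _ in range(200)]
```

The hidden-layer error terms are computed before the output weights are changed, so both layers are updated from the same forward pass.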
[0107] The following describes the technical solutions of this application in detail with
reference to the accompanying drawings.
[0108] According to one aspect, an embodiment of this application provides an image signal
processing method. As shown in FIG. 7, the method may include the following steps.
[0109] S701: An encoder obtains a binary source $X^N$ of an image.
[0110] Specifically, when obtaining the image, the encoder performs a conventional JPEG
operation on the image, for example, color conversion, grayscale translation, DCT,
data quantization, zig-zag scanning, and entropy coding (for example, run-length coding,
which is an optional operation), to obtain the binary source $X^N$ of the image. This process
is not described in this application.
[0111] S702: The encoder performs a polarization operation on $X^N$, to obtain a
to-be-compressed bit set $U^N$.
[0112] Specifically, that the encoder performs the polarization operation on $X^N$ in S702
may include: multiplying $X^N$ by a generator matrix G (which may also be referred to as a
polarization matrix) of polar coding (a polar code), to obtain $U^N$. That is,
$U^N = X^N G$.
[0113] Content of G may be configured based on an actual requirement. This is not specifically
limited in this embodiment of this application.
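One common configuration of G can be sketched as follows. The content of G is left open above, so this sketch merely assumes the n-fold Kronecker power of F = [[1, 0], [1, 1]] without a bit-reversal permutation; the function names are hypothetical.

```python
F = [[1, 0], [1, 1]]   # assumed 2 x 2 polarization kernel


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def generator_matrix(n):
    """Return G as the n-fold Kronecker power of F (size 2^n x 2^n)."""
    g = [[1]]
    for _ in range(n):
        g = kron(F, g)
    return g


def polarize(x):
    """Compute U = X G over GF(2) for a block whose length is a power of two."""
    size = len(x)
    assert size > 0 and (size & (size - 1)) == 0, "length must be a power of two"
    g = generator_matrix(size.bit_length() - 1)
    return [sum(x[i] * g[i][j] for i in range(size)) % 2 for j in range(size)]


source_block = [1, 0, 1, 1, 0, 0, 1, 0]
polarized_block = polarize(source_block)
```

With this choice, G is its own inverse over GF(2), so applying polarize twice returns the original block; this is why the same operation can be reused at the decoder.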
[0114] In this way, compression of the signal $X^N$ is converted to compression of $U^N$.
Based on the foregoing characteristics of source entropy polarization, because some
bits $U_i$ in $U^N$ for which $H(U_i | U_1^{i-1}) \to 0$ (namely, bits with low uncertainty)
may be completely determined by the other bits $U_1^{i-1}$, these bits may be discarded during
compression. Only the bits for which $H(U_i | U_1^{i-1}) \to 1$ (namely, bits with high
uncertainty) need to be reserved. $S = \{ i \in [N] : H(U_i | U_1^{i-1}) \to 1 \}$
is set, and a sequence U corresponding to the set is recorded as $U_S$ and is used as a set
of the bits reserved during compression. The set is referred
to as a compressed reserved bit set. A complementary set of S is $S^c$, and a sequence U
corresponding to the set is recorded as $U_{S^c}$ and is used as a set of the bits discarded
during compression. The set is referred to as a compressed discarded bit set. That is,
$U^N = [U_S, U_{S^c}]$.
[0115] S703: The encoder obtains the compressed reserved bit set $U_S$ from $U^N$.
[0116] A bit in $U_S$ is a bit reserved during source compression, and $U_S$ consists of some
of the bits in $U^N$.
[0117] Optionally, a source entropy of the bit in $U_S$ approximates 1; or a source entropy of
the bit in $U_S$ tends to be greater than or equal to a first preset threshold; or a
Bhattacharyya parameter of an equivalent channel corresponding to the bit in $U_S$ is greater
than or equal to a second preset threshold; or a channel capacity of an equivalent channel
corresponding to the bit in $U_S$ is less than or equal to a third preset threshold.
[0118] Values of the first preset threshold, the second preset threshold, and the third
preset threshold may be configured based on an actual requirement.
[0119] Specifically, the encoder may implement S703 by using, but not limited to, either
of the following two solutions:
[0120] Solution 1: The encoder equates a source problem to a channel problem.
[0121] Specifically, polar code compression design for a source conforming to a Bernoulli
distribution Ber(q) may be equivalent to polar channel decoding design for a binary symmetric
channel (BSC). Fixed bits $U_F$ of an equivalent channel are used as $U_S$ (that is,
$U_F \leftrightarrow U_S$), and information bits $U_I$ of the equivalent channel are used as
$U_{S^c}$ (that is, $U_I \leftrightarrow U_{S^c}$).
[0122] FIG. 8 is a schematic diagram of a BSC according to an embodiment of this application.
Noise distribution of the BSC is the same as that of the source. Therefore, a channel
capacity of BSC(q) can be calculated as
$1 - H(q) = 1 + q \log_2 q + (1 - q) \log_2 (1 - q)$, which is complementary to H(q).
[0123] Specifically, W = BSC(q) and $n = \log_2 N$ are set, and the N polarized channels
obtained after n times of polarization are represented by
$\{ W_N^{(i)} \}_{i=1}^{N}$. It may be theoretically proved that capacities of these polarized
channels are complementary to the foregoing polarized source entropies, that is,
$I(W_N^{(i)}) = 1 - H(U_i | U_1^{i-1})$. A specific proof process is not described again.
[0124] In solution 1 of S703, when source coding is equivalent to channel decoding, channel
reliability ranking may be used during channel decoding. Bits corresponding to equivalent
channel reliability ranked low in $U^N$ are used as $U_S$ based on a sequence of equivalent
channel reliability in descending order. A higher source entropy of a source bit indicates
lower reliability of an equivalent channel corresponding to the source bit.
[0125] In solution 1 of S703, the fixed bits $U_F$ of the equivalent channel are used as
$U_S$ (that is, $U_F \leftrightarrow U_S$), and the information bits $U_I$ of the equivalent
channel are used as $U_{S^c}$.
[0126] Specifically, a quantity of bits in
US depends on an expected compression rate. This is not specifically limited in this
embodiment of this application.
[0127] A channel reliability ranking method used during channel coding may use an existing
Gaussian approximation (Gaussian Approximation) algorithm, a channel degradation and
upgradation merging (channel degradation and upgradation merging) algorithm, some
ranking methods, or the like. Details are not described in this embodiment of this
application.
[0128] Solution 2: The encoder calculates a source entropy of each bit in $U^N$, and uses
bits whose source entropies approximate 1 in $U^N$ as $U_S$.
[0129] A process of calculating the source entropies may be implemented through mathematical
derivation. Details are not described in this embodiment of this application.
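For a short block, the source entropies can also be obtained by brute-force enumeration, which may help to check a derivation. The sketch below is an assumption-laden illustration rather than the method of the embodiments: it assumes a Ber(p) source, the Kronecker-power generator matrix in natural bit order, and hypothetical function names; it computes $H(U_i | U_1^{i-1})$ as a difference of joint entropies of prefixes.

```python
import math
from itertools import product


def polar_transform(bits):
    """Multiply a bit sequence by G over GF(2) (assumed Kronecker power of F)."""
    x = list(bits)
    step = 1
    while step < len(x):
        for start in range(0, len(x), 2 * step):
            for k in range(start, start + step):
                x[k] ^= x[k + step]
        step *= 2
    return x


def conditional_entropies(block_length, p):
    """Return the entropy of each bit of U given all earlier bits of U."""
    prefix_prob = [dict() for _ in range(block_length + 1)]
    for x in product((0, 1), repeat=block_length):
        weight = sum(x)
        prob = (p ** weight) * ((1.0 - p) ** (block_length - weight))
        u = tuple(polar_transform(x))
        for i in range(block_length + 1):
            prefix_prob[i][u[:i]] = prefix_prob[i].get(u[:i], 0.0) + prob
    # Joint entropy of every prefix of U; the chain rule gives the conditionals.
    joint = [-sum(q * math.log2(q) for q in d.values() if q > 0.0) for d in prefix_prob]
    return [joint[i + 1] - joint[i] for i in range(block_length)]


def select_reserved(entropies, count):
    """Indices of the `count` bits with the highest conditional entropy."""
    ranked = sorted(range(len(entropies)), key=lambda i: entropies[i], reverse=True)
    return sorted(ranked[:count])


entropies = conditional_entropies(8, 0.11)
reserved_set = select_reserved(entropies, 4)
```

Enumeration costs $2^N$ evaluations, so it is practical only for small N; for longer blocks, a reliability ranking method such as those named for solution 1 would be used instead.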
[0130] S704: The encoder performs channel coding on
US, to obtain a coded bit sequence.
[0131] In S704, a channel coding manner whose bit sequence can be decoded through SC decoding,
SCL decoding, or BP decoding (for example, a polar code) may be used, or another channel coding
manner may be used. A channel coding process in S704 is not limited in this application.
[0132] S705: The encoder outputs the coded bit sequence.
[0133] Specifically, in S705, the encoder sends the bit sequence coded in S704 to a channel
for transmission.
[0134] According to the image signal processing method provided in this application, the
binary source is compressed through polar coding, and fixed-length source coding is
implemented based on the expected compression rate. This process is easy to implement.
In addition, a coding granularity of the polar coding is an image block (block). Even
if a decoding error occurs, the error spreads only in the block, thereby effectively
controlling error spreading at a decoder, and improving image compression performance.
[0135] Further, $U^N$ further includes the compressed discarded bit set $U_{S^c}$ obtained
through polar code source coding. Therefore, during source coding, $U_{S^c}$ is discarded and
is not transmitted, and the decoder may decode content of the compressed discarded bits based
on other bits.
[0136] Further, during source coding, $U_{S^c}$ may be discarded and not transmitted. To
implement lossless compression, an error location obtained when the encoder obtains $U_{S^c}$
may be recorded, and the decoder flips a bit in the error location during decoding,
to implement lossless compression and decompression. As shown in FIG. 9, based on
FIG. 7, the image processing method provided in this embodiment of this application
may further include S706 and S707.
[0137] S706: The encoder performs channel decoding on preset output values of the equivalent
channel based on $U_S$, to obtain decoded $\hat{U}_{S^c}$. $\hat{U}_{S^c}$ is decoded estimated
values of $U_{S^c}$.
[0138] Specifically, in S706, a process of performing source coding on the binary Bernoulli
source may be equivalent to a channel decoding process of the BSC. One virtual BSC is
constructed. It is assumed that a received signal of the polar code decoder is $Y^N = 0^N$
(the preset output values of the equivalent channel are preconfigured, and may alternatively
be values other than 0), and that frozen bits (the fixed bits) of the equivalent channel are
$U_S$. In this case, $\hat{U}^N$ (a decoding result of the channel) may be obtained by using
the polar channel decoder based on $U_S$ and $Y^N$. The frozen bits remain unchanged, and
the bits in $\hat{U}^N$ other than $U_S$ are $\hat{U}_{S^c}$.
[0139] It should be noted that, in an actual implementation process, the virtual BSC described
in this application is realized by using an existing polar code decoder to perform
channel decoding. An SC decoder is used as an example. $f_{SC}(\cdot)$ is set to an SC decoding
function of the polar code, and $U_I = f_{SC}(Y^N = 0^N, U_S)$.
[0140] $U_S$ is a vector that is not 0, which is different from the frozen bits of the existing
decoder. The following linear transformation may be performed (additions are modulo 2), and
$Z^N$ is set to be a noise vector of the BSC:
$$[U_F, U_I]\, G + Z^N = 0^N$$
and
$$[0_F, U_I]\, G + Z^N = [U_F, 0_I]\, G.$$
[0141] It is equivalent to $U_I = f_{SC}(Y^N = [U_F, 0_I] G, U_F = 0_F)$. In this way,
lossless source compression of the polar code can be implemented by directly using the
existing decoder.
[0142] S707: The encoder obtains error location information T based on $U_{S^c}$ and
$\hat{U}_{S^c}$.
[0143] In S707, $\hat{U}_{S^c}$ obtained through decoding and the original $U_{S^c}$ are
compared to find location sequence numbers of different bits, and the numbers
are recorded as T.
[0144] For example, FIG. 10 is a schematic flowchart of lossless polar compression according
to an embodiment of this application. As shown in FIG. 10, a process may include:
S1001: The source generates N independently and identically distributed (independently
and identically distributed, i.i.d.) signals $X^N$.
S1002: Obtain a to-be-compressed bit set $[U_S, U_{S^c}] = U^N = X^N G$ after the polarization
operation is performed, where G is a polarization matrix.
S1003: Use sets S and $S^c$ as a fixed bit set F and an information bit set I during channel
coding, set $Y^N = [U_F, 0_I] G$, and input $Y^N$ into a standard polar code decoder.
S1004: Input the fixed bit set to the polar code decoder, so that the polar code decoder
becomes a polar code decoder of $0_F$; and perform decoding to obtain an estimated value
$\hat{U}_i$, where $i \in I$.
S1005: Determine whether $\hat{U}_i = U_i$ is met; and if $\hat{U}_i = U_i$ is not met,
perform S1006, or if $\hat{U}_i = U_i$ is met, perform S1007.
S1006: Flip $\hat{U}_i$, and record i in a set T.
S1007: Determine whether i is equal to N; and if i is equal to N, output $[U_S, T]$, or if
i is not equal to N, check a next bit.
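The flow of FIG. 10, together with the matching decompression, can be sketched in Python. This is a minimal sketch under assumptions and not the implementation of the embodiments: it uses the direct form with $Y^N = 0^N$ and frozen bits $U_S$ rather than the linearly transformed form, a recursive SC decoder with a min-sum approximation, a generator matrix in natural bit order, and hypothetical names (sc_decode, compress, decompress, RESERVED).

```python
import math
import random


def polar_transform(bits):
    """Multiply a bit list by G over GF(2) (assumed Kronecker power of F)."""
    x = list(bits)
    step = 1
    while step < len(x):
        for start in range(0, len(x), 2 * step):
            for k in range(start, start + step):
                x[k] ^= x[k + step]
        step *= 2
    return x


def sc_decode(llr, offset, decide):
    """Successive cancellation on one sub-block.

    decide(index, llr) returns the bit used for u[index]; the function
    returns the re-encoded bits of the sub-block.
    """
    if len(llr) == 1:
        return [decide(offset, llr[0])]
    half = len(llr) // 2
    left, right = llr[:half], llr[half:]
    # Min-sum approximation of the LLR of the XOR of two bits.
    upper = [math.copysign(min(abs(a), abs(b)), a * b) for a, b in zip(left, right)]
    xa = sc_decode(upper, offset, decide)
    lower = [b + (1 - 2 * bit) * a for a, b, bit in zip(left, right, xa)]
    xb = sc_decode(lower, offset + half, decide)
    return [p ^ q for p, q in zip(xa, xb)] + xb


def compress(x, reserved, q):
    """Return (U_S, T): the reserved bits and the SC error locations."""
    u = polar_transform(x)
    errors = []

    def decide(i, llr):
        if i not in reserved and (llr < 0) != bool(u[i]):
            errors.append(i)   # estimate differs from the true bit: record it
        return u[i]            # continue decoding with the true (flipped) bit

    sc_decode([math.log((1 - q) / q)] * len(x), 0, decide)
    return [u[i] for i in sorted(reserved)], errors


def decompress(u_reserved, errors, reserved, block_length, q):
    """Rebuild X from U_S and T by repeating the decoding and flipping at T."""
    known = dict(zip(sorted(reserved), u_reserved))
    flips = set(errors)
    u_hat = [0] * block_length

    def decide(i, llr):
        u_hat[i] = known[i] if i in known else int(llr < 0) ^ int(i in flips)
        return u_hat[i]

    sc_decode([math.log((1 - q) / q)] * block_length, 0, decide)
    return polar_transform(u_hat)


random.seed(0)
RESERVED = {0, 1, 2, 4}   # hypothetical reserved set S for N = 8
source = [int(random.random() < 0.11) for _ in range(8)]
u_s, t = compress(source, RESERVED, 0.11)
restored = decompress(u_s, t, RESERVED, 8, 0.11)
```

Because the encoder and the decoder run the same deterministic decoding and disagree with the true bits at exactly the recorded locations, the reconstruction is exact for any input block, regardless of how well the reserved set is chosen; a better reserved set only makes T shorter.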
[0145] In a possible implementation, S704 is performed after S707, and the encoder performs
channel coding on $U_S$ and T in S704, to obtain the coded bit sequence.
[0146] It should be noted that before channel coding is performed on T,
T further needs to be converted into a binary form.
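How T is converted into a binary form is not specified here. One simple possibility, shown purely as an assumption, is to write each location sequence number with a fixed width of $\lceil \log_2 N \rceil$ bits; the function names below are hypothetical, and signaling of the number of locations is left out of the sketch.

```python
def positions_to_bits(positions, block_length):
    """Encode each error location as a fixed-width, MSB-first binary number."""
    width = max(1, (block_length - 1).bit_length())
    bits = []
    for position in positions:
        bits.extend((position >> k) & 1 for k in reversed(range(width)))
    return bits


def bits_to_positions(bits, block_length):
    """Inverse of positions_to_bits for the same block length."""
    width = max(1, (block_length - 1).bit_length())
    positions = []
    for start in range(0, len(bits), width):
        value = 0
        for bit in bits[start:start + width]:
            value = (value << 1) | bit
        positions.append(value)
    return positions


binary_t = positions_to_bits([3, 5], 8)
```

A fixed width keeps parsing trivial at the decoder; a variable-length code could shorten the representation when T is sparse.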
[0147] In another possible implementation, S708 is performed after S707, to train a neural
network for predicting the error location information
T by the decoder.
[0148] S708: The encoder records
US and
T.
[0149] US and
T recorded in S708 are used as a group of data, and are used to train the preset neural
network for predicting the error location information. The preset neural network is
a neural network that is trained based on M groups of compressed reserved bit sets
and error location information obtained through polar code source coding until the
neural network converges.
[0150] It should be noted that the neural network may be trained offline until the neural
network converges based on the M groups of compressed reserved bit sets and error
location information obtained through polar code source coding, to obtain the preset
neural network. The preset neural network is configured in the decoder for predicting
T during decoding. The M groups of compressed reserved bit sets and error location
information obtained through polar code source coding may be obtained through learning,
or may be obtained by executing an image processing process. This is not specifically
limited in this embodiment of this application.
[0151] For a neural network training method, refer to the foregoing BP algorithm or another
training method. This is also not specifically limited in this embodiment of this
application. A process of training a model is a process of optimizing a cost function.
[0152] During training, all stored $[U_S, T]$ are used as training samples, and are sent to
a neural network whose input layer node quantity is the length of $U_S$ and whose output layer
node quantity is the length of T for training. $U_{S_j}$ is an input to the input layer of the
neural network in each time of training, and $T_j$ is the corresponding expected output value.
Because sizes of T corresponding to each $U_S$ may be different, the largest size among all
T may be selected as an output size of the neural network, or a fixed value may be
set as an output size of the neural network. A cost function of the neural network
is expressed as:

, where
$\tilde{T}$ represents an actual output of the neural network. Predicted data obtained at the
output layer is compared with the corresponding T, and an error value ℓ between the expected
output and the actual output is obtained by using the cost function. After iterative training
is performed by using a back propagation (BP) algorithm, weights on connection lines of the
neural network are updated for a plurality of times, and ℓ reaches a minimum value that tends
to be stable. That is, the neural network tends to converge, and the weight on a current
connection line is the final weight.
[0153] According to another aspect, an embodiment of this application provides another image
signal processing method, where the method is applied to the decoder. As shown in
FIG. 11, the method may include the following steps:
[0154] S1101: The decoder obtains the received signal
YN.
[0155] YN is a signal received by the decoder from the channel, and
YN is a signal obtained after a bit sequence that is obtained after source coding and
channel coding are performed on the binary source
XN of the image is transmitted through the channel.
[0156] For example, $Y^N$ may be a signal obtained after the coded bit sequence output by
the encoder in S705 is transmitted through the channel.
[0157] S1102: The decoder performs channel decoding on
YN, to obtain the information bits
ÛI in channel decoding, and uses the information bits as the compressed reserved bit
set
ÛS on which source decoding is to be performed.
[0158] A channel decoding manner, for example, SC decoding, SCL decoding, or BP decoding
may be used in S1102. A channel decoding process in S1102 is not limited in this application,
provided that the channel decoding process corresponds to a channel coding manner
of the encoder.
[0159] S1103: The decoder performs channel decoding on the preset output values of the equivalent
channel based on the error location information
T̂ and
ÛS , to obtain the compressed discarded bit set
ÛSC on which source decoding is to be performed.
[0160] T̂ indicates a location of a flipped bit during channel decoding.
[0161] Specifically, in S1103, the decoder may first obtain the error location information
T̂. An obtaining process may include but is not limited to the following two processes,
and depends on whether the encoder transmits the error location information.
[0162] Process 1: The decoder inputs
ÛS into the preset neural network, to obtain the error location information
T̂ through prediction. The preset neural network is the neural network that is trained
based on the M groups of compressed reserved bit sets and error location information
obtained through polar code source coding until the neural network converges. The
preset neural network has been described in detail in the foregoing content, and details
are not described herein again.
[0163] Process 2: The encoder transmits the error location information, and the decoder
may obtain $\hat{T}$ by performing channel decoding on $Y^N$ in S1102.
[0164] Specifically, in S1103, the decoder equates source decoding to polar code channel
decoding, and constructs one virtual BSC. It is assumed that the received signal of the channel
is $Y^N = 0^N$ (the preset output values of the equivalent channel are preconfigured, and may
alternatively be other values), and that the frozen bits (the fixed bits) of the equivalent
channel are $\hat{U}_S$. In this case, $\hat{U}_{S^c}$ (the decoding result of the channel) may
be obtained by inputting $Y^N$ into the polar channel decoder for channel decoding, and flipping
the bit at each location indicated by $\hat{T}$ during decoding.
[0165] S1104: The decoder obtains
ÛN.
[0166] ÛN includes
ÛS obtained in S1102 and
ÛSC obtained in S1103. Specifically, in S1104, the decoder forms
ÛS and
ÛSC into
ÛN based on a bit location sequence.
[0167] S1105: The decoder performs the polarization operation on $\hat{U}^N$, to obtain a
reconstructed signal $\hat{X}^N$ of $X^N$.
[0168] The polarization operation includes multiplying $\hat{U}^N$ by the generator matrix of
the polar code.
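S1104 and S1105 can be sketched together in Python. The sketch is illustrative only: it assumes that the generator matrix is the Kronecker power of F = [[1, 0], [1, 1]] in natural bit order (which is its own inverse over GF(2)), that the reserved set is known to the decoder, and that the function names are hypothetical.

```python
def polar_transform(bits):
    """Multiply a bit list by G over GF(2) (assumed Kronecker power of F)."""
    x = list(bits)
    step = 1
    while step < len(x):
        for start in range(0, len(x), 2 * step):
            for k in range(start, start + step):
                x[k] ^= x[k + step]
        step *= 2
    return x


def assemble(u_reserved, u_discarded, reserved):
    """Interleave the two decoded bit sets back into bit-location order."""
    reserved_iter = iter(u_reserved)
    discarded_iter = iter(u_discarded)
    total = len(u_reserved) + len(u_discarded)
    return [next(reserved_iter) if i in reserved else next(discarded_iter)
            for i in range(total)]


def reconstruct(u_reserved, u_discarded, reserved):
    """Form the full decoded block and apply the polarization operation."""
    return polar_transform(assemble(u_reserved, u_discarded, reserved))


merged = assemble([1, 1, 0, 0], [0, 1, 0, 1], {0, 1, 2, 4})
```

Under the stated assumption on G, the same transform that produced $U^N$ at the encoder recovers the source block at the decoder.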
[0169] Further, the decoder may further perform an operation, for example, entropy decoding
(which is an optional operation), inverse zig-zag scanning, dequantization, IDCT,
grayscale translation addition, and color conversion on
X̂N, to restore the image.
[0170] According to the image signal processing method provided in this application, corresponding
decoding is performed, based on the error location information, on the signal that
is compressed through polar coding, to implement lossless compression of the image
signal. This process is easy to implement. In addition, a coding granularity of the
polar coding is an image block (block). Even if a decoding error occurs, the error
spreads only in the block, thereby effectively controlling error spreading at the
decoder, and improving image compression performance.
[0171] According to another aspect, an embodiment of this application provides an image
signal processing apparatus 120. The apparatus 120 may be an encoder, or may be an
apparatus or a chip system in an encoder. The image signal processing apparatus 120
may implement functions of the encoder in the foregoing method embodiments; and the
functions may be implemented through hardware, or may be implemented through hardware
executing corresponding software. The hardware or the software includes one or more
modules corresponding to the functions. As shown in FIG. 12, the image signal processing
apparatus 120 may include a first obtaining unit 1201, a polarization unit 1202, an
obtaining unit 1203, a coding unit 1204, and an output unit 1205.
[0172] The first obtaining unit 1201 is configured to obtain a binary source
XN of an image.
[0173] The polarization unit 1202 is configured to perform a polarization operation on
XN obtained by the first obtaining unit 1201, to obtain a to-be-compressed bit set
UN.
[0174] The obtaining unit 1203 is configured to obtain a compressed reserved bit set
US from
UN obtained by the polarization unit 1202.
[0175] The coding unit 1204 is configured to perform channel coding on
US obtained by the obtaining unit 1203, to obtain a coded bit sequence.
[0176] The output unit 1205 is configured to output the coded bit sequence obtained by the
coding unit 1204.
[0177] Optionally, the obtaining unit 1203 is specifically configured to: use, based on
a sequence of equivalent channel reliability in descending order, bits corresponding
to equivalent channel reliability ranked low in
UN as
US. A higher source entropy of a source bit indicates lower reliability of an equivalent
channel corresponding to the source bit.
[0178] Optionally, $U^N$ further includes a compressed discarded bit set $U_{S^c}$ obtained
through polar code source coding. As shown in FIG. 12, the image signal
processing apparatus 120 may further include a second obtaining unit 1206, configured
to: perform channel decoding on preset output values of an equivalent channel based
on $U_S$, to obtain decoded $\hat{U}_{S^c}$; and obtain error location information T based on
$U_{S^c}$ and $\hat{U}_{S^c}$; and a recording unit 1207, configured to record $U_S$ and T,
to train a neural network for predicting the error location information. The preset
neural network is a neural network that is trained based on M groups of compressed
reserved bit sets and error location information obtained through polar code source
coding until the neural network converges.
[0179] Optionally, the coding unit 1204 may be specifically configured to perform channel
coding on
US and
T.
[0180] According to another aspect, an embodiment of this application provides an image
signal processing apparatus 130. The apparatus 130 may be a decoder, or may be an
apparatus or a chip system in a decoder. The image signal processing apparatus 130
may implement the functions of the decoder in the foregoing method embodiments; and
the functions may be implemented through hardware, or may be implemented through hardware
executing corresponding software. The hardware or the software includes one or more
modules corresponding to the functions. As shown in FIG. 13, the image signal processing
apparatus 130 may include an obtaining unit 1301, a first decoding unit 1302, a second
decoding unit 1303, an obtaining unit 1304, and a polarization unit 1305.
[0181] The obtaining unit 1301 is configured to obtain a received signal
YN, where
YN is a signal obtained after a bit sequence that is obtained after source coding and
channel coding are performed on a binary source
XN of an image is transmitted through a channel.
[0182] The first decoding unit 1302 is configured to: perform channel decoding on
YN obtained by the obtaining unit 1301, to obtain information bits
ÛI in channel decoding; and use the information bits as a compressed reserved bit set
ÛS on which source decoding is to be performed.
[0183] The second decoding unit 1303 is configured to perform channel decoding on preset
output values of an equivalent channel based on error location information
T̂ and
ÛS, to obtain a compressed discarded bit set
ÛSC on which source decoding is to be performed, where
T̂ indicates a location of a flipped bit during channel decoding.
[0184] The obtaining unit 1304 is configured to obtain
ÛN, where
ÛN includes
ÛS obtained by the first decoding unit 1302 and
ÛSC obtained by the second decoding unit 1303.
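The assembly step performed by the obtaining unit 1304 can be sketched as follows, assuming that the decoder knows the index positions of the reserved set within the length-N sequence. The function name and the `reserved` index list are assumptions introduced for illustration.

```python
def assemble_u(n, reserved, u_s_hat, u_sc_hat):
    """Interleave the reserved and discarded bit estimates into one length-n list.

    reserved lists the positions of the reserved bits; every other
    position, in ascending order, receives the discarded bit estimates.
    """
    reserved_set = set(reserved)
    if len(reserved_set) != len(reserved):
        raise ValueError("reserved indices must be unique")
    if any(i < 0 or i >= n for i in reserved):
        raise ValueError("reserved indices must lie in range(n)")
    if len(u_s_hat) != len(reserved) or len(u_sc_hat) != n - len(reserved):
        raise ValueError("bit list lengths do not match the index sets")
    u_hat = [0] * n
    for idx, bit in zip(reserved, u_s_hat):
        u_hat[idx] = bit
    discarded = [i for i in range(n) if i not in reserved_set]
    for idx, bit in zip(discarded, u_sc_hat):
        u_hat[idx] = bit
    return u_hat
```

For example, `assemble_u(4, [1, 3], [0, 1], [1, 1])` places the reserved bits at positions 1 and 3 and the discarded bits at positions 0 and 2.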
[0185] The polarization unit 1305 is configured to: perform a polarization operation on
ÛN obtained by the obtaining unit 1304, to obtain a reconstructed signal
X̂N of
XN.
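The polarization operation can be illustrated with the following sketch. It assumes the standard Arikan kernel F = [[1, 0], [1, 1]] applied as an n-fold Kronecker power over GF(2), without a bit-reversal permutation. This paragraph does not specify the generator matrix, so this choice is an assumption. Under it, the transform is its own inverse, which is why the same operation can map XN to UN at the encoder and the estimate of UN back to a reconstruction of XN at the decoder.

```python
def polar_transform(bits):
    """Multiply a bit row vector by the n-fold Kronecker power of F over GF(2).

    Assumes F = [[1, 0], [1, 1]] and no bit-reversal permutation.
    The length of bits must be a power of two.
    """
    n = len(bits)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    u = list(bits)
    step = 1
    while step < n:
        # One butterfly stage: XOR the upper half of each block into the lower half.
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                u[i] ^= u[i + step]
        step *= 2
    return u
```

For example, `polar_transform([1, 0, 1, 1])` gives `[1, 1, 0, 1]`, and applying `polar_transform` to that result returns `[1, 0, 1, 1]` again.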
[0186] Optionally, as shown in FIG. 13, the image signal processing apparatus 130 may further
include a prediction unit 1306, configured to input
ÛS into a preset neural network, to obtain the error location information
T̂ through prediction. The preset neural network is a neural network that is trained
based on M groups of compressed reserved bit sets and error location information obtained
through polar code source coding until the neural network converges.
[0187] Optionally,
YN includes the error location information
T̂ , and the first decoding unit 1302 may be specifically configured to perform channel
decoding on
YN, to obtain
ÛI and
T̂.
[0188] In the image signal processing apparatus 120 or the image signal processing apparatus
130 in the embodiments of this application, some units (or components) may be implemented
through a hardware circuit, and other units (or devices) may be implemented through
software; or all the units (or components) may be implemented through a hardware circuit;
or all the units (or components) may be implemented through software.
[0189] FIG. 14 is a schematic diagram of a structure of an encoder 140 according to an embodiment
of this application. As shown in FIG. 14, the encoder 140 includes a processor 1401
and a memory 1402. The memory 1402 may be independent of the processor or independent
of the device (a memory #3). The memory 1402 may alternatively be inside the processor
or the device (a memory #1 and a memory #2). The memory 1402 may be a physically independent
unit, or may be storage space, a network hard disk, or the like on a cloud server.
[0190] The memory 1402 is configured to store computer-readable instructions (or referred
to as computer programs).
[0191] The processor 1401 is configured to read the computer-readable instructions to implement
the method provided in any one of the foregoing aspects related to the encoder and
any implementation of the aspects.
[0192] Optionally, the memory 1402 (the memory #1) is located inside the encoder
140.
[0193] Optionally, the memory 1402 (the memory #2) is integrated with the processor.
[0194] Optionally, the memory 1402 (the memory #3) is located outside the encoder
140.
[0195] Optionally, the encoder 140 further includes a transceiver 1403, configured to receive
and send data.
[0196] FIG. 15 is a schematic diagram of a structure of a decoder 150 according to an embodiment
of this application. As shown in FIG. 15, the decoder 150 includes a processor 1501
and a memory 1502. The memory 1502 may be independent of the processor or independent
of the device (a memory #3). The memory 1502 may alternatively be inside the processor
or the device (a memory #1 and a memory #2). The memory 1502 may be a physically independent
unit, or may be storage space, a network hard disk, or the like on a cloud server.
[0197] The memory 1502 is configured to store computer-readable instructions (or referred
to as computer programs).
[0198] The processor 1501 is configured to read the computer-readable instructions to implement
the method provided in any one of the foregoing aspects related to the decoder and
any implementation of the aspects.
[0199] Optionally, the memory 1502 (the memory #1) is located inside the decoder
150.
[0200] Optionally, the memory 1502 (the memory #2) is integrated with the processor.
[0201] Optionally, the memory 1502 (the memory #3) is located outside the decoder
150.
[0202] Optionally, the decoder 150 further includes a transceiver 1503, configured to receive
and send data.
[0203] In addition, the processor 1401 or the processor 1501 may be a central processing
unit, a general purpose processor, a digital signal processor, an application-specific
integrated circuit, a field programmable gate array or another programmable logic
device, a transistor logic device, a hardware component, or any combination thereof.
The processor may implement or execute various example logical blocks, modules, and
circuits described with reference to content disclosed in this application. Alternatively,
the processor may be a combination of processors implementing a computing function,
for example, a combination of one or more microprocessors, or a combination of the
digital signal processor and a microprocessor. In addition, the memory 1402 or the
memory 1502 may include a volatile memory, for example, a random-access
memory (RAM). The memory may alternatively include a non-volatile
memory, for example, a flash memory, a hard disk
drive (HDD), a solid-state drive (SSD), cloud
storage, network attached storage (NAS),
or a network drive. The memory may further include a combination of
the foregoing types of memories or another medium or product in any form that has
a storage function.
[0204] According to still another aspect, an embodiment of this application provides an
image signal processing system. The image signal processing system includes the encoder
described in any one of the foregoing embodiments and the decoder described in any
one of the foregoing embodiments.
[0205] An embodiment of this application further provides a chip system, including a processor
and a memory. The memory is configured to store computer-readable instructions or
a computer program, and the processor is configured to read the computer-readable
instructions to implement the method in any one of the foregoing embodiments.
[0206] An embodiment of this application further provides a computer-readable medium. The
computer-readable medium stores computer program instructions, and when the computer
program instructions are executed by a computer, the method in any one of the foregoing
embodiments is implemented.
[0207] An embodiment of this application further provides a computer program product. When
the computer program product is executed by a computer, the method in any one of the
foregoing embodiments is implemented.
[0208] A person of ordinary skill in the art may be aware that, in combination with the
examples described in the embodiments disclosed in this specification, units and algorithm
steps can be implemented by electronic hardware or a combination of computer software
and electronic hardware. Whether the functions are performed by the hardware or the
software depends on particular applications and design constraint conditions of the
technical solutions. A person skilled in the art may use different methods to implement
the described functions for each particular application, but it should not be considered
that the implementation goes beyond the scope of this application.
[0209] A person skilled in the art may clearly understand that, for the purpose of convenient
and brief description, for a detailed working process of the system and apparatus
described above, refer to a corresponding process in the foregoing method embodiments,
and details are not described herein again.
[0210] In the several embodiments provided in this application, it should be understood
that the disclosed system, apparatus, and method may be implemented in other manners.
For example, the described apparatus embodiment is merely an example. For example,
division into the units is merely logical function division and may be other division
during actual implementation. For example, a plurality of units or components may
be combined or integrated into another system, or some features may be ignored or
not performed. In addition, the displayed or discussed mutual couplings or direct
couplings or communication connections may be implemented through some interfaces.
The indirect couplings or communication connections between the apparatuses or units
may be implemented in electrical, mechanical, or other forms.
[0211] The units described as separate parts may or may not be physically separate, and
parts displayed as units may or may not be physical units, may be located in one location,
or may be distributed on a plurality of network units. Some or all of the units may
be selected based on an actual requirement to achieve the objectives of the solutions
in the embodiments.
[0212] In addition, functional units in the embodiments of this application may be integrated
into one processing unit, or each of the units may exist alone physically, or two
or more units are integrated into one unit.
[0213] When the functions are implemented in a form of a software functional unit and sold
or used as an independent product, the functions may be stored in a computer-readable
storage medium. Based on such an understanding, the technical solutions of this application
essentially, or the part contributing to the conventional technology, or some of the
technical solutions may be implemented in a form of a software product. The computer
software product is stored in a storage medium, and includes several instructions
for instructing a computer device (which may be a personal computer, a server, or
a network device) to perform all or some of the steps of the methods described in
the embodiments of this application.
[0214] The foregoing descriptions are merely specific implementations of the present invention,
but are not intended to limit the protection scope of the present invention. Any variation
or replacement readily figured out by a person skilled in the art within the technical
scope disclosed in the present invention shall fall within the protection scope of
the present invention. Therefore, the protection scope of the present invention shall
be subject to the protection scope of the claims.