[0001] Embodiments of the invention relate to a method for processing a digital audio signal,
a non-transitory computer readable medium, a computer program product and an apparatus
for processing or generating a digital audio signal. The apparatus can be a digital
audio workstation or an audio network or any other digital signal processing system
e.g. a video digital processing system.
[0002] The majority of today's commercial digital audio processing systems operate at two
standard sampling frequencies: 44.1 kHz and 48 kHz. In the audio industry, these frequencies
are assumed to be high enough for the necessary audio bandwidth, which is defined for current
audio processing and distribution as extending up to 20 kHz. The CD and most music
reproduction or distribution systems work with these sampling frequencies.
[0003] However, there are reasons to use higher sampling frequencies for intermediate audio
processing or audio content production process (e.g., mastering) even when the sampling
frequency of the final product has one of the standard values.
[0004] The most important reasons for using higher intermediate sampling frequencies, typically
twice or four times the standard sampling frequencies (often called oversampling),
are: (1) reduction of the amount of alias frequencies produced when any non-linear
processing of the audio signal is used (e.g., any kind of dynamics processing, soft clipping,
saturation, and deliberate distortion generation); (2) reduction of the shape distortion
of digital filters when the center or cut-off frequency is set close to the Nyquist
frequency (this distortion results from the squeezing of filter characteristics when
the bilinear transformation is used to calculate the digital filters).
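By way of illustration only, the aliasing named under (1) can be estimated with the short Python sketch below. The function name `alias_frequency` and the 15 kHz test tone are assumptions chosen for the example and are not part of the described embodiments.

```python
def alias_frequency(f_hz: float, fs_hz: float) -> float:
    """Return the frequency at which a component at f_hz appears after
    sampling at fs_hz (folded into the range 0 .. fs_hz / 2)."""
    # Distance to the nearest integer multiple of the sampling frequency.
    return abs(f_hz - fs_hz * round(f_hz / fs_hz))


# Illustrative assumption: a non-linear stage adds a third harmonic to a 15 kHz tone.
third_harmonic = 3 * 15_000.0

at_base = alias_frequency(third_harmonic, 48_000.0)    # folds back into the audible band
at_4x = alias_frequency(third_harmonic, 4 * 48_000.0)  # stays above 20 kHz
print(at_base, at_4x)
```

At 48 kHz the 45 kHz harmonic folds back to 3 kHz, inside the audible band. At four times that frequency it remains at 45 kHz, where a decimation filter can remove it before the signal returns to the base sampling frequency.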
[0005] In current digital audio systems, some measures have been established to reduce
the above-mentioned problems. Processing modules, or so-called plugins, in today's
audio processing systems that can significantly benefit from oversampling incorporate
a local oversampling implementation, i.e., a so-called interpolator is implemented at the
input of the module or plugin, and a so-called decimator at the output. The interpolator
multiplies the sampling frequency of the input signal, the input signal is processed
with this increased sampling frequency, and after processing the decimator decreases
the sampling frequency back to the original input sampling frequency. However, this
solution has various disadvantages: each module or plugin needs its own interpolator
and decimator filters, which are normally developed as a compromise between signal quality
and the additional latency caused by such filters. The better the quality in terms of
alias frequency suppression, the longer the FIR (finite impulse response) filters
and thus their latency. Cascading several such plugins in series in one audio
processing channel, which is often the typical application case, may cause latencies
of up to seconds, which disturbs some audio processes or makes them impossible. In addition,
multiple up-sampling filters and down-sampling filters in the chain may significantly
degrade the overall audio quality in the chain.
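The latency argument can be made concrete with the following Python sketch. It assumes linear-phase FIR filters of equal length for the interpolator and the decimator, both running at the increased sampling frequency. The tap count, the oversampling factor and the number of plugins are hypothetical values chosen for illustration.

```python
def fir_delay_seconds(taps: int, rate_hz: float) -> float:
    """Group delay of a linear-phase FIR filter with `taps` coefficients
    running at `rate_hz`: (taps - 1) / 2 samples."""
    return (taps - 1) / 2.0 / rate_hz


def pair_latency_seconds(taps: int, base_fs_hz: float, k: int) -> float:
    """Latency of one interpolator plus one decimator, both assumed to be
    linear-phase FIR filters of equal length running at k * base_fs_hz."""
    return 2 * fir_delay_seconds(taps, k * base_fs_hz)


# Hypothetical values chosen for illustration only.
TAPS = 1025
BASE_FS = 48_000.0
K = 4
PLUGINS_IN_SERIES = 4

local = PLUGINS_IN_SERIES * pair_latency_seconds(TAPS, BASE_FS, K)  # one pair per plugin
shared = pair_latency_seconds(TAPS, BASE_FS, K)                     # one pair for the chain
print(f"local: {local * 1e3:.2f} ms, shared: {shared * 1e3:.2f} ms")
```

Under these assumptions the filter latency grows linearly with the number of plugins that each carry their own interpolator and decimator, whereas a single pair contributes its latency only once.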
[0006] The other solution is to switch the whole processing system (e.g., digital audio
workstation) to entirely work with higher sampling frequency and let the plugins work
on that frequency. However, this approach has extensive consequences. For example,
at four-times oversampling, a system needs approximately four times more computational
power and four times more memory.
[0007] Assuming multitrack or contemporary multichannel systems (e.g., with 128 channels
or often more), the additional demand on resources can be enormous. In addition, already
recorded tracks or signals coming with standard sampling frequencies from other units
or over networks have to be converted to the respective higher sampling frequency
of the processing system and, after processing, converted down again to a standard sampling
frequency, e.g., 44.1 kHz for CD production or streaming services.
[0008] In view of this prior art, it is an object of the invention to provide a method for
processing a digital audio signal, a non-transitory computer readable medium, a computer
program product and an apparatus for processing or generating a digital audio signal,
which lead to an improvement with regard to the sound quality of the audio signal
with only a low overall channel latency as well as a low computational power and a
low amount of memory, compared to a system working permanently at a higher sampling
rate and/or using plugins using their individual oversampling rates or filters.
[0009] According to an embodiment, a method for processing a digital audio signal comprises
the following method steps:
providing a channel for processing the digital audio signal or a part of the digital
audio signal, wherein the channel is working on a predefined or predefinable base
sampling frequency, characterized in that the base sampling frequency is increased
to a working sampling frequency in a predefinable section of the channel and at the
end of the section the working sampling frequency is decreased back to the base sampling
frequency.
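The method steps above can be pictured with a minimal Python sketch. The class name `Channel`, its fields and the stage indices are hypothetical and serve only to show which stages of a channel would see the base sampling frequency and which the working sampling frequency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Channel:
    """A channel whose stages run at base_fs_hz, except for the stages
    first_stage .. last_stage (inclusive), which run at k * base_fs_hz."""
    base_fs_hz: float
    k: int
    first_stage: int
    last_stage: int

    def rate_at(self, stage: int) -> float:
        """Sampling frequency seen by the stage with the given index."""
        if self.first_stage <= stage <= self.last_stage:
            return self.k * self.base_fs_hz  # inside the section: working sampling frequency
        return self.base_fs_hz               # outside the section: base sampling frequency


# Hypothetical channel with five stages, of which stages 1 to 3 form the section.
channel = Channel(base_fs_hz=48_000.0, k=4, first_stage=1, last_stage=3)
rates = [channel.rate_at(stage) for stage in range(5)]
print(rates)
```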
[0010] Within the context of the present description, the method for processing a digital
audio signal can be performed on a digital audio workstation (in the following: DAW),
in virtual sub-mixers within virtual instruments, in virtual instruments as well
as on digital hardware mixing desks or digital mixers.
[0011] Within an embodiment of the invention, the method can also be used in other multichannel
digital processing systems working generally with single sampling frequencies, but
using up-sampling or oversampling for dedicated modules or plugins e.g., video processing
systems.
[0012] Within the context of the present description or invention, a channel is one
of multiple channels being used for processing a digital audio signal. For example,
a DAW uses multiple channels for processing different aspects of
audio signals. For example, one channel can be used for recording and processing a voice.
Another channel can be used for an electric guitar, a further channel for a bass.
Another channel can be a virtual instrument. There can also be further channels, for
example, for a drum kit. In the drum kit, the bass drum can be processed in one channel,
a snare can be processed in another channel, the hi-hat can be processed in a further
channel and so on.
[0013] According to the idea realized in the invention, only some of the channels
to be processed are up-sampled, i.e., the sampling frequency is increased in
a predefined or predefinable section of the channel or the channels. Up-sampling
of only a selected number of audio processing channels leads to more efficient performance
in terms of audio quality and required computational power. For example, in some audio
productions only the voice channel or the voice channels, which are situated in the
foreground of the audio signal to be processed, need to be processed in a higher quality
with respect to some plugins that do produce nonlinearities or disturbing frequencies
in an audible range. These channels need to be processed on a higher sampling rate
to get rid of these audible sound problems. Thus, an embodiment is to up-sample just
a section of one channel or of a few channels out of the multiple channels used in the processing
of a digital audio signal.
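A rough estimate of the saving is sketched below in Python. It assumes that the computational load of a channel scales linearly with its sampling frequency. The number of selected channels is a hypothetical value, and the function name `relative_cost` is introduced only for this example.

```python
def relative_cost(oversampling_factors: list[int]) -> int:
    """Sum of per-channel oversampling factors, used as a rough proxy for
    computational load (assumption: load scales linearly with sampling rate)."""
    return sum(oversampling_factors)


CHANNELS = 128  # channel count taken from the multichannel example in the text
K = 4           # oversampling factor
SELECTED = 8    # hypothetical number of channels that need oversampling

whole_system = relative_cost([K] * CHANNELS)
selective = relative_cost([K] * SELECTED + [1] * (CHANNELS - SELECTED))
print(whole_system, selective)
```

The figure for the selective case is an upper bound under this assumption, because it treats each selected channel as oversampled over its whole length, whereas only a section of the channel works at the higher sampling frequency.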
[0014] Within the context of the present invention, the base sampling frequency is the base
sampling frequency of the channel. The increase of the base sampling frequency to
the higher working sampling frequency or the up-sampling is performed at the beginning
of the section in which the up-sampled frequency is used, such that at least two inserted
plugins work on this working sampling frequency or the at least two inserted plugins
are triggered or driven by the working sampling frequency.
[0015] In an embodiment, the section of the channel includes at least a part of an insert
area, wherein in the insert area plugins are inserted or are insertable into the channel.
[0016] In a further preferred embodiment, the section of the channel includes a cluster
of at least two successive insert interfaces, in which plugins are inserted or insertable.
[0017] Very good results can be achieved if, according to an embodiment, the cluster
includes three or more insert interfaces. The more insert interfaces are used within
an oversampled channel, the greater the efficiency gained by the invention. In a typical
setup of a voice processing channel strip, plugins like equalizer, compressor, limiter,
and soft-clip or tape emulation are used simultaneously in series. In this case, four
local interpolators and decimators can be saved.
[0018] In an embodiment, the cluster includes all insert interfaces of the insert area.
[0019] In the embodiments the plugins inserted in such oversampled channel work with the
increased working sampling frequency without using their own interpolators and decimators.
Most current plugins can automatically work with a sampling frequency higher than standard (i.e.,
44.1 kHz or 48 kHz), especially if they incorporate internal oversampling
filters.
[0020] According to an embodiment of the invention, the section starts at the beginning
of the channel or starts after a signal input control. Thus, at the beginning of the
channel or after the signal input control, the sampling frequency is increased. In
the signal input control, for example, the gain and/or phase of the signal is adjusted
or adjustable.
[0021] According to an embodiment of the invention, the section, in which oversampling is
performed, starts prior to an insert interface of the channel. The insert interface is
used to insert plugins or processing modules (which can be software or hardware) into
the channel for processing the audio signal. For example, a plugin can be a distortion
effect, a compressor, an equalizer and/or any other audio processing functionality.
[0022] According to an embodiment of the invention, the section ends prior to the output.
In a complex channel topology the oversampled sections can be mixed with other system
specific modules like gains, faders, internal equalizers, etc.
[0023] If the section ends after the pre-fader send interface or the post-fader send interface,
it is preferred to decrease the sampling frequency of the send signal to the base
sampling frequency. This means an extra down-sampling compared to the down-sampling
that is performed in the channel.
[0024] In an embodiment, the working sampling frequency is increased by a factor of k compared to
the base sampling frequency, wherein k is an integer that is usually a multiple of
2, but generally can be any factor. According to an embodiment, a non-transitory computer
readable medium is provided, including a computer program executable by a processor or a gate
array for performing a method as described above.
[0025] Furthermore, according to an embodiment a computer program product for performing
a method as shown above is provided.
[0026] The computer program can run on a stand-alone computer or any kind of dedicated
embedded computer like a DSP (digital signal processing) engine included in a digital
mixing console or a DAW using a specific DSP platform.
[0027] Furthermore, there is provided in an embodiment an apparatus for processing a digital
audio signal, wherein the apparatus comprises multiple audio processing channels,
the multiple audio processing channels are working on a predefined or predefinable
base sampling frequency, wherein in a predefinable number of channels the base sampling
frequency is increased to a working sampling frequency in a predefinable section of
the number of channels and after the section or the end of the section the working
sampling frequency is decreased back to the base sampling frequency.
[0028] The apparatus can be a digital audio workstation, a virtual sub-mixer within a virtual
instrument, a digital hardware mixer or a digital mixer. The methods according to
embodiments of the invention and the apparatus according to embodiments of the invention
can be used for audio processing or for video processing.
[0029] According to an embodiment, the apparatus is configured to perform a method, which
is described above.
[0030] The embodiments of the invention use dedicated channels or paths of the audio signal
in the audio processing system or the video signal in a video processing system, which
allow oversampling and thus the use of plugins or processing modules without switching
the whole system to a higher sampling frequency and/or without using local oversampling
filters in each plugin.
[0031] In the framework of this description a plugin can be called a processing module.
[0032] Within the framework of the description, a selected channel or selected channels
within an apparatus for processing a digital audio signal are used, which can be switched to
higher sampling frequencies by using, for example, only one interpolator at the input
or somewhere near the beginning of the channel and only one decimator somewhere later
in the channel or at the output of the channel.
[0033] Within such sections, which use a higher sampling frequency, which is called a working
sampling frequency within this description, plugins or processing modules set up to
a desired or predefinable working sampling frequency or oversampling rate can be placed
or inserted without using the own local interpolators and decimators of the plugins.
[0034] Within this description, an increase of the sampling frequency has the same meaning
as an up-sampling or oversampling of the frequency. The same is the case with respect
to a decrease of the sampling frequency and a down-sampling of the frequency.
[0035] Furthermore, a single up-sampling algorithm or module and a single down-sampling
algorithm or module can be used in a channel, each of better quality than the up-sampling
and down-sampling algorithms normally used in individual plugins. This further increases
the quality of the processed audio signal. Furthermore, the latency in the channel
is much lower than that caused by a series of local oversampling filters, and the
computational power and the amount of memory are lower compared to a system working
entirely at a higher sampling frequency.
[0036] According to an embodiment of the invention, the length of the section, in which
the working sampling frequency is used, can be different in different channels. The
length can also be the same for two or more channels.
[0037] The increase of the sampling frequency or the oversampling factor k can be different
for different channels. The oversampling factors can be selected differently in different
channels.
[0038] According to an embodiment of the invention, the section of the channel includes
at least a part of an insert area, wherein in the insert area plugins are inserted
or are insertable into the channel. The insert area normally has several inserts or
insert interfaces, in which plugins can be inserted.
[0039] Within this description, an insert interface can be an insert, an insert slot or
an insert point.
[0040] According to an embodiment of the invention, the section starts prior to an insert interface.
[0041] According to a further embodiment of the invention, the section starts at the beginning
of the channel or after a signal input control.
[0042] According to an embodiment, the section of the channel includes a cluster of at least
two successive insert interfaces, in which plugins are inserted or insertable.
[0043] An embodiment includes the feature that at a first insert interface a plugin is inserted
or insertable that is enabled to increase the base sampling frequency to the working
sampling frequency and to keep the working sampling frequency at the output of the
plugin and at the end of the section at another insert interface another plugin is
inserted or insertable that is enabled to decrease the working sampling frequency
back to the base sampling frequency.
[0044] In this embodiment, it is possible to insert an interpolator and a decimator as a
plugin. For the embodiment, the plugin standard can implement that in the section
that is selected to be in an up-sampling mode the plugins are not forced to reduce
the sampling frequency to the base sampling frequency of the channel.
[0045] The further method features, which are described above, can also be implemented into
the apparatus.
[0046] Further characteristics of the invention will become apparent from the description
of the embodiments according to the invention together with the claims and the included
drawings. Embodiments according to the invention can fulfill individual characteristics
or a combination of several characteristics.
[0047] The invention is described below, without restricting the general intent of the invention,
based on exemplary embodiments, wherein reference is made expressly to the drawings
with regard to the disclosure of all details according to the invention that are not
explained in greater detail in the text. The drawings show:
- Fig. 1: schematically a channel according to an embodiment of the invention,
- Fig. 2: a schematic flowchart according to an embodiment of the invention,
- Fig. 3: another schematic flowchart according to an embodiment of the invention.
[0048] In the drawings, the same or similar types of elements or respectively corresponding
parts are provided with the same reference numbers in order to prevent the item from
needing to be reintroduced.
[0049] Fig. 1 shows, in a schematic view, a channel 18 of a digital audio processing system
or of an apparatus for processing a digital audio signal.
[0050] The audio channel 18 can be a mono-channel format, a stereo-channel format, a surround-channel
format or any multi-channel format configuration for, especially immersive, audio.
It can be a part of a digital audio workstation, a virtual instrument plugin or a
digital mixing desk.
[0051] At 10, the input signal is shown. The input signal 10 can have any mono or multi-channel
configuration and can come from audio inputs, for example, microphones or line sources
or from the hard drive of the audio computer. In case of a hardware mixing console,
the signal can also come from an internal or external recording device. In case of
a virtual instrument, the signal comes from the instrument itself.
[0052] In the flow of the signal after the input signal 10, the signal comes to the input
control 11, which usually consists of or comprises an input trim (gain) and a phase reverse
button.
[0053] At the position 30, which is in the signal flow after the input control 11 and prior
to the first insert interface 21, the up-sampling to a working sampling frequency is
performed. This is a good point in the signal chain to apply up-sampling, although the
position may vary in different situations. The up-sampling can also be performed prior to the input
control 11. The up-sampling could also start after a few initial insert interfaces,
which would then work at the base sampling frequency of the system or apparatus, for example,
44.1 or 48 kHz. Thus, the up-sampling 30 could also take place, for example, between
the first insert interface 21 and the second insert interface 22 or after another
insert interface. In Fig. 1 not all insert interfaces are shown, but only the first
two and the last two of, for example, ten insert interfaces.
[0054] The insert interfaces 21, 22, 23, 24 are dedicated to insert audio processing tools
such as equalizers, dynamic processing tools and so forth. The total number of insert
interfaces varies between systems or apparatuses.
[0055] According to an embodiment of the invention, the down-sampling to the base sampling
frequency can take place at different points in the audio signal processing line or
the channel 18. The down-sampling is shown by the reference number 31 and in the example
of Fig. 1, there are shown different places for down-sampling 31.
[0056] A good point at which the down-sampling to the base sampling frequency can take
place is within the insert area 20 between two insert interfaces, for example,
between the insert interfaces 22 and 23.
[0057] Between the insert interfaces 22 and 23, several other insert interfaces can be positioned.
A good point for the down-sampling is after the insert interfaces in which plugins are
inserted that profit from working at a higher sampling rate or frequency. Other plugins
that work satisfactorily at the base sampling frequency, i.e., that have a good sound quality
when working at the base sampling frequency, can be inserted into insert interfaces
which are positioned after the down-sampling 31.
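One possible way of choosing the up-sampling and down-sampling points from the plugins in the insert area is sketched below in Python. The function `section_bounds`, the plugin names and the flags marking which plugins profit from a higher sampling frequency are illustrative assumptions.

```python
from typing import List, Optional, Tuple

# Each entry: (plugin name, whether it profits from a higher sampling frequency).
InsertChain = List[Tuple[str, bool]]


def section_bounds(chain: InsertChain) -> Optional[Tuple[int, int]]:
    """Return (first, last) insert indices of the smallest section that covers
    every plugin flagged as profiting from oversampling, or None if there is none."""
    flagged = [i for i, (_, profits) in enumerate(chain) if profits]
    if not flagged:
        return None
    return flagged[0], flagged[-1]


# Hypothetical insert chain; the flags are illustrative assumptions.
chain = [
    ("equalizer", True),
    ("compressor", True),
    ("soft-clip", True),
    ("gain utility", False),
    ("meter", False),
]
print(section_bounds(chain))
```

For this chain the section covers the first three inserts, so the down-sampling would take place after the soft-clip plugin and the remaining two plugins would work at the base sampling frequency.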
[0058] The down-sampling can take place at other positions as shown in Fig. 1.
[0059] In case of many available insert interfaces 21-24, it is a preferred embodiment of
the invention to allow the user to define how many insert interfaces 21-24 are integrated
into the up-sampling cluster.
[0060] If internal digital signal processing tools are available, it also makes sense to
integrate those into the up-sampling cluster.
[0061] After the fader 13, a further insert interface or insert interfaces 25 are shown, for
example, the eleventh and twelfth insert interfaces. After this insert interface 25, in which, for
example, plugins with reverb effects, delay effects or other effects can be implemented,
the post-fader sends 17 are implemented. The signal is then transferred to the panorama
14.
[0062] The post-fader sends 17 are usually used to send signals to FX devices such as reverbs
and delays. In case the post-fader sends are within the up-sampling cluster, which means
within the section in which the working sampling frequency is used, it is required
to have an extra down-sampling processing of the post-fader send signal in order to communicate
with other processes at the base sampling frequency of the system or apparatus.
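The need for an extra down-sampling of send signals can be checked with a small Python sketch. The stage order in `STAGES` is an assumption made for this example, in particular the order of the blocks between the insert area and the fader. The function name is likewise hypothetical.

```python
from typing import List

# Assumed stage order for a channel like the one in Fig. 1; the exact order of
# the blocks between the insert area and the fader is an assumption.
STAGES = [
    "input control", "insert area", "equalizer and dynamics",
    "pre-fader sends", "fader", "post-fader inserts",
    "post-fader sends", "panorama",
]
SEND_STAGES = ("pre-fader sends", "post-fader sends")


def sends_needing_decimator(first: str, last: str) -> List[str]:
    """Return the send stages that lie inside the oversampled section
    [first, last] and therefore need their own down-sampling."""
    lo, hi = STAGES.index(first), STAGES.index(last)
    return [s for s in SEND_STAGES if lo <= STAGES.index(s) <= hi]


# Section ends before any send: no extra down-sampling is needed.
print(sends_needing_decimator("insert area", "equalizer and dynamics"))
# Section extends past both sends: each send signal needs its own down-sampling.
print(sends_needing_decimator("insert area", "post-fader sends"))
```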
[0063] The output signal 15 may be mono, stereo or any multi-channel configuration for surround
or immersive audio.
[0064] Fig. 1 shows one embodiment of a channel. However different configurations of channels
can be used according to the invention.
[0065] Fig. 2 shows a schematic flowchart of an embodiment of the invention.
[0066] At 40, the signal is shown, which is put into an interpolator 41, which leads to
an up-sampling of the base sampling frequency Fs of the signal 40 to a working sampling
frequency k*Fs. After that, there is a section 43 or cluster 43, which is processed
with the working sampling frequency k*Fs. In the section 43, several plugins P1, P2
and so on up to Pn are inserted into this section. The plugins P1, P2 ... Pn are inserted
into insert interfaces.
[0067] After the section 43, a decimator 42 is used to down-sample the working sampling
frequency back to the base sampling frequency.
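The signal flow of Fig. 2 can be sketched in Python as follows. The sample repetition and block averaging used here are deliberately crude stand-ins for the interpolator 41 and the decimator 42, which in practice would be proper low-pass filters. The plugin signature and the soft clipper are hypothetical.

```python
from typing import Callable, List

Plugin = Callable[[List[float], float], List[float]]  # (samples, fs_hz) -> samples


def interpolate(samples: List[float], k: int) -> List[float]:
    """Crude stand-in for interpolator 41: repeat each sample k times
    (zero-order hold). A real interpolator would use a low-pass filter."""
    return [s for s in samples for _ in range(k)]


def decimate(samples: List[float], k: int) -> List[float]:
    """Crude stand-in for decimator 42: average each block of k samples.
    A real decimator would use a proper anti-aliasing filter."""
    return [sum(samples[i:i + k]) / k for i in range(0, len(samples), k)]


def process_section(signal: List[float], fs_hz: float, k: int,
                    plugins: List[Plugin]) -> List[float]:
    """Up-sample once, run plugins P1..Pn at k * fs_hz, down-sample once."""
    working = interpolate(signal, k)
    for plugin in plugins:
        working = plugin(working, k * fs_hz)  # every plugin sees the working rate
    return decimate(working, k)


def soft_clip(samples: List[float], fs_hz: float) -> List[float]:
    """Hypothetical non-linear plugin: cubic soft clipper (ignores fs_hz)."""
    out = []
    for s in samples:
        s = max(-1.0, min(1.0, s))
        out.append(s - s ** 3 / 3.0)
    return out


output = process_section([0.0, 0.5, -0.5, 1.0], 48_000.0, 4, [soft_clip])
print(output)
```

The point of the sketch is structural: the sampling frequency is changed exactly twice per section, regardless of how many plugins are inserted between the interpolator and the decimator.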
[0068] This embodiment shows schematically a channel or a part of a channel of a system
or an apparatus, e.g. a digital audio workstation, a digital mixing console or an
audio matrix processor, which incorporates its own interpolator at the beginning of or
prior to the oversampled section and its own decimator at the end of or after the oversampled
section. Thus, the interpolator and decimator are provided by the system manufacturer as
a part of the system for channels which can be switched to oversampling mode.
[0069] Fig. 3 shows another schematic flowchart according to an embodiment of the invention.
In this embodiment it is possible to insert an interpolator 41 and a decimator 42
as plugins. This would be an advantageous solution because
different plugin manufacturers could deliver their own solutions for interpolators and
decimators. It would only be necessary to use a plugin standard, or to change existing
plugin standards, e.g. the VST standard, in such a manner that in the section that is selected
to be in an up-sampling mode the plugins are not forced to reduce the sampling frequency
to the base sampling frequency of the channel.
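A host loop for the plugin-based variant might be sketched in Python as follows. The interface `RateAwarePlugin` is purely hypothetical and does not reproduce the API of the VST standard or of any other existing plugin standard. The sample repetition and block averaging are placeholders for real interpolation and decimation filters.

```python
from typing import List, Protocol, Tuple


class RateAwarePlugin(Protocol):
    """Hypothetical plugin interface: the output rate may differ from the input rate."""
    def process(self, samples: List[float], fs_hz: float) -> Tuple[List[float], float]: ...


class InterpolatorPlugin:
    def __init__(self, k: int) -> None:
        self.k = k

    def process(self, samples: List[float], fs_hz: float) -> Tuple[List[float], float]:
        # Sample repetition as a placeholder for a real interpolation filter.
        return [s for s in samples for _ in range(self.k)], fs_hz * self.k


class DecimatorPlugin:
    def __init__(self, k: int) -> None:
        self.k = k

    def process(self, samples: List[float], fs_hz: float) -> Tuple[List[float], float]:
        # Block averaging as a placeholder for a real decimation filter.
        k = self.k
        return [sum(samples[i:i + k]) / k for i in range(0, len(samples), k)], fs_hz / k


class GainPlugin:
    def __init__(self, gain: float) -> None:
        self.gain = gain

    def process(self, samples: List[float], fs_hz: float) -> Tuple[List[float], float]:
        return [self.gain * s for s in samples], fs_hz  # rate is passed through unchanged


def run_chain(samples: List[float], base_fs_hz: float,
              chain: List[RateAwarePlugin]) -> Tuple[List[float], List[float]]:
    """Pass the current rate from plugin to plugin instead of forcing the base
    rate after each one; also return the rate seen at each plugin input."""
    fs = base_fs_hz
    rates_seen = []
    for plugin in chain:
        rates_seen.append(fs)
        samples, fs = plugin.process(samples, fs)
    if fs != base_fs_hz:
        raise ValueError("chain must return to the base sampling frequency")
    return samples, rates_seen


out, rates_seen = run_chain([0.5, -0.5], 48_000.0,
                            [InterpolatorPlugin(4), GainPlugin(0.5), DecimatorPlugin(4)])
print(out, rates_seen)
```

In this sketch the host only checks that the chain ends at the base sampling frequency, which is one conceivable way of expressing that the plugins between the interpolator and the decimator are not forced back to the base rate.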
[0070] All named characteristics, including those taken from the drawings alone, and individual
characteristics, which are disclosed in combination with other characteristics, are
considered alone and in combination as important to the invention. Embodiments according
to the invention can be fulfilled through individual characteristics or a combination
of several characteristics. Features which are combined with the wording "in particular"
or "especially" are to be treated as preferred embodiments.
List of References
[0071]
- 10: input signal
- 11: input control
- 12: input equalizer and dynamics
- 13: fader
- 14: panorama
- 15: output signal
- 16: pre-fader sends
- 17: post-fader sends
- 18: channel
- 20: insert area
- 21: insert interface 1
- 22: insert interface 2
- 23: insert interface 9
- 24: insert interface 10
- 25: insert interface 11 & 12
- 30: up-sampling
- 31: down-sampling
- 40: signal
- 41: interpolator
- 42: decimator
- 43: section
- P1, P2, Pn: plugin
- Fs: base sampling frequency
- k*Fs: working sampling frequency
- k: multiplication factor
1. A method for processing a digital audio signal (40) comprising the following method
steps:
providing a channel (18) for processing the digital audio signal (40) or a part of
the digital audio signal (40), wherein the channel (18) is working on a predefined
or predefinable base sampling frequency (Fs), characterized in that the base sampling frequency (Fs) is increased to a working sampling frequency (k*Fs)
in a predefinable section (43) of the channel (18) and at the end of the section (43)
the working sampling frequency (k*Fs) is decreased back to the base sampling frequency
(Fs).
2. The method according to claim 1, characterized in that the section (43) of the channel (18) includes at least a part of an insert area (20),
wherein in the insert area (20) plugins (P1, P2, Pn) are inserted or are insertable
into the channel (18).
3. The method according to claim 2, characterized in that the section (43) of the channel (18) includes a cluster of at least two successive
insert interfaces (21, 22, 23, 24), in which plugins (P1, P2, Pn) are inserted or
insertable.
4. The method according to claim 3, characterized in that the cluster includes all inserts (21, 22, 23, 24) of the insert area (20).
5. The method according to any one of the claims 1 to 4, characterized in that the section (43) starts at the beginning of the channel (18) or after a signal input
control (11).
6. The method according to any one of the claims 3 to 5, characterized in that the section (43) ends after the cluster of at least two successive insert interfaces
(21, 22, 23, 24).
7. The method according to any one of the claims 1 to 6, characterized in that the working sampling frequency (k*Fs) is increased by a factor of k compared to the
base sampling frequency (Fs).
8. The method according to claim 7, characterized in that k is an integer that is a multiple of 2.
9. A non-transitory computer readable medium, including a computer program, executable
by a processor or gate array for performing a method according to any one of the claims
1 to 8.
10. A computer program product for performing a method according to any one of the claims
1 to 8.
11. An apparatus for processing a digital audio signal (40), wherein the apparatus comprises
multiple audio processing channels (18), the multiple audio processing channels (18)
are working on a predefined or predefinable base sampling frequency (Fs), characterized in that in a predefinable number of channels (18) the base sampling frequency (Fs) is increased
to a working sampling frequency (k*Fs) in a predefinable section (43) of the number
of channels (18) and after the section (43) or the end of the section (43) the working
sampling frequency (k*Fs) is decreased back to the base sampling frequency (Fs).
12. The apparatus according to claim 11, characterized in that the section (43) of the channel (18) includes at least a part of an insert area (20),
wherein in the insert area (20) plugins (P1, P2, Pn) are inserted or are insertable
into the channel (18).
13. The apparatus according to claim 11 or 12, characterized in that the section (43) starts at the beginning of the channel (18) or after a signal input
control (11).
14. The apparatus according to any one of the claims 11 to 13, characterized in that the section (43) of the channel (18) includes a cluster of at least two successive
insert interfaces (21, 22, 23, 24), in which plugins (P1, P2, Pn) are inserted or
insertable.
15. The apparatus according to any one of the claims 11 to 14, characterized in that at a first insert interface a plugin is inserted or insertable that is enabled to
increase the sampling frequency (Fs) to the working sampling frequency (k*Fs) and
to keep the working sampling frequency (k*Fs) at the output of the plugin and at the
end of the section (43) at another insert interface another plugin is inserted or
insertable that is enabled to decrease the working sampling frequency (k*Fs) back
to the base sampling frequency (Fs).