Constant Q Transform

A slecture by ECE student Robert Schwieterman

Partly based on the ECE438 Fall 2015 lecture material of Prof Boutin.



1. Introduction

In this slecture you will learn about the Constant Q Transform, initially proposed by [Brown]. This slecture
assumes you are already comfortable with the Discrete Fourier Transform.


2. Background

The Constant Q Transform (CQT) is closely related to the Discrete Fourier Transform (DFT). Where the DFT has
linearly spaced frequency “bins”, the CQT’s bins are logarithmically spaced. This is motivated by the fact that human
hearing perceives pitch in a similar, roughly logarithmic way. Music and instruments are built around producing frequencies that are spaced
not by a constant difference but by a constant ratio. The notes on a piano are related in that every 12 notes (half-steps)
the frequency doubles. The CQT is built to match this spacing better than the DFT.
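
As a small illustration of the constant-ratio spacing described above, the sketch below lists 13 consecutive half-steps. It assumes equal temperament and a 440 Hz reference note, neither of which is stated in this slecture; the names `A4_HZ` and `note_frequency` are made up for the example.

```python
# Equal-tempered half-steps: constant ratio between neighbours,
# not constant difference.
# Assumption: 440 Hz reference note and equal temperament.
A4_HZ = 440.0
HALF_STEP_RATIO = 2 ** (1 / 12)  # twelve half-steps double the frequency


def note_frequency(half_steps_from_ref):
    """Frequency of the note a given number of half-steps above the reference."""
    return A4_HZ * HALF_STEP_RATIO ** half_steps_from_ref


freqs = [note_frequency(n) for n in range(13)]          # one full octave
ratios = [b / a for a, b in zip(freqs, freqs[1:])]      # all equal
diffs = [b - a for a, b in zip(freqs, freqs[1:])]       # keep growing

for n, f in enumerate(freqs):
    print(f"{n:2d} half-steps: {f:8.2f} Hz")
```

The ratios between neighbouring notes are all the same, while the differences in Hz grow as the notes get higher, which is why linearly spaced DFT bins fit musical notes poorly.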


3. Theory

We can model the DFT as a discrete sampling of the DTFT convolved with a periodic sinc (Dirichlet) function, up to a constant scale factor and a linear phase term:
$ X[k]=X(\omega)*\left(\frac{\sin(\frac{\omega N}{2})}{\sin(\frac{\omega}{2})}\right)\bigg|_{\omega=\frac{2\pi k}{N}} $
The sampling stems from the fact that we assume
our window of N samples is periodic with period N. The sinc convolution stems from the fact that in the DFT we are windowing our once infinitely
long sequence. By multiplying by a rectangular window in the index (time) domain, we convolve with a sinc in the frequency domain. This leads
to the phenomenon of frequency bleeding (spectral leakage). If we have the sequence x[n]=cos((2π/2.2)n) and take a 4-point DFT with frequency bins corresponding to
0, π/2, π, and 3π/2, then we will have nonzero values in every frequency bin, despite the DTFT having nonzero components ONLY where ω=±(2π/2.2) (modulo 2π).
It is this sinc convolution and frequency bleeding that allows us to view each frequency bin as a band-pass filter.
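
The frequency bleeding example above can be checked numerically. The sketch below computes a 4-point DFT directly from its definition for x[n]=cos((2π/2.2)n), and, for contrast, for cos(πn), whose frequency falls exactly on a bin. The helper `dft` and the variable names are made up for this illustration.

```python
import cmath
import math


def dft(x):
    """Direct N-point DFT from the definition (fine for tiny N)."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_points)
            for n in range(n_points))
        for k in range(n_points)
    ]


N = 4
# Frequency 2*pi/2.2 lies between the bins at pi/2 and pi.
off_bin = [math.cos((2 * math.pi / 2.2) * n) for n in range(N)]
# Frequency pi lies exactly on bin k = 2.
on_bin = [math.cos(math.pi * n) for n in range(N)]

leaky_mags = [abs(v) for v in dft(off_bin)]
clean_mags = [abs(v) for v in dft(on_bin)]

print("off-bin magnitudes:", [round(m, 4) for m in leaky_mags])
print("on-bin magnitudes: ", [round(m, 4) for m in clean_mags])
```

The off-bin cosine puts energy in all four bins, while the on-bin cosine lands in a single bin.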

It is in this filter model of the DFT that we can begin to understand the Constant Q Transform. The Q from which it gets its name
is the “quality factor” of a filter, defined as the ratio of center frequency to bandwidth (fk/BW). The width of our DFT “filter”
depends on the number of samples N: the higher the N, the smaller the bandwidth. For a DFT, the number of samples is independent
of the frequency bin, leading to an unchanging bandwidth for each filter. This means that bins at higher frequencies have a higher
quality factor than those at lower frequencies. By changing the number of samples used (the window length) from bin to bin, we can design filters whose
quality factor is constant: hence the name Constant Q Transform.
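
To make the growing quality factor of DFT bins concrete, here is a minimal sketch. It assumes the bandwidth of each DFT “filter” is taken to be the bin spacing fs/N (one common convention, not stated in this slecture), and the values of `FS_HZ` and `N` are arbitrary illustration choices.

```python
# Quality factor of DFT bins, Q_k = f_k / BW.
# Assumptions: BW is taken as the bin spacing fs/N; fs and N are
# arbitrary example values.
FS_HZ = 8000.0
N = 1024


def dft_bin_q(k, fs=FS_HZ, n_points=N):
    """Quality factor of DFT bin k under the BW = fs/N convention."""
    center_hz = k * fs / n_points   # center frequency of bin k
    bandwidth_hz = fs / n_points    # same for every bin
    return center_hz / bandwidth_hz


dft_qs = [dft_bin_q(k) for k in range(1, 9)]
print(dft_qs)
```

Under this convention the quality factor of bin k is simply k, so it grows linearly with frequency instead of staying constant.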



4. CQT

[Brown], who first described the CQT in 1991, sets the window length N for the bin centered at frequency fk by N=Q(fs/fk), where fs is the sampling frequency. Because our sampling frequency and Q are constant,
N is inversely proportional to the bin frequency. Just as humans take longer to distinguish lower-frequency sounds accurately,
the CQT must devote more samples (and thus more operations) to lower frequencies. This makes the CQT apt for musical applications,
where the signal is composed primarily of logarithmically spaced frequencies instead of linearly spaced ones.
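
The window-length rule N=Q(fs/fk) can be turned into a direct, unoptimized transform. The following is only a sketch under stated assumptions, not a reference implementation: bins are spaced 12 per octave starting from a hypothetical `f_min`, Q is chosen as 1/(2^(1/12) − 1) so that each bin's bandwidth equals the spacing to the next bin, the window length is rounded up to an integer, and a rectangular window is used to match the Theory section (a tapered window is the more usual choice in practice). The name `naive_cqt` is made up for this example.

```python
import cmath
import math


def naive_cqt(x, fs, f_min, n_bins, bins_per_octave=12):
    """Direct constant-Q sketch: one window length per bin, N_k = Q * fs / f_k.

    Assumptions: rectangular window, Q = 1 / (2**(1/b) - 1), N_k rounded up.
    Returns a list of (f_k, N_k, X_k) tuples.
    """
    q = 1.0 / (2 ** (1.0 / bins_per_octave) - 1.0)
    result = []
    for k in range(n_bins):
        fk = f_min * 2 ** (k / bins_per_octave)   # log-spaced center frequency
        n_k = math.ceil(q * fs / fk)              # longer window for lower fk
        if n_k > len(x):
            raise ValueError("signal too short for bin %d" % k)
        # Exactly Q cycles of the analysis frequency fit in the N_k samples.
        acc = sum(x[n] * cmath.exp(-2j * math.pi * q * n / n_k)
                  for n in range(n_k))
        result.append((fk, n_k, acc / n_k))       # normalize by window length
    return result


FS = 8000.0
tone = [math.cos(2 * math.pi * 440.0 * n / FS) for n in range(1024)]
bins = naive_cqt(tone, FS, f_min=220.0, n_bins=24)
peak = max(range(len(bins)), key=lambda k: abs(bins[k][2]))
print("peak bin:", peak, "center frequency:", round(bins[peak][0], 2), "Hz")
print("window lengths:", [n_k for _, n_k, _ in bins])
```

The printed window lengths shrink as the bin frequency rises, which is the inverse proportionality described above, and the 440 Hz test tone peaks in the bin centered at 440 Hz.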


5. References

[Brown] Judith C. Brown, “Calculation of a constant Q spectral transform,” J. Acoust. Soc. Am., 89(1):425–434, 1991.




