Some basics of OTDM / Nyquist WDM are explained in The Art of Nyquist WDM, which I recommend reading if you’re new to the concept. Lately I find the name OTDM (orthogonal time-division multiplexing) much more fitting because of its duality with OFDM. What is commonly termed OTDM in optics (multiplexing really short pulse trains), however, I assume stands for *optical TDM*, since it doesn’t make any explicit use of an orthogonality relation (though technically, the tributaries are orthogonal).

Anyway, the basic premise of OTDM is that $K$ pulses are overlapping in time. Only by their special shape (the truncated sinc function) can they be detected without crosstalk. The sinc function has the fortunate property of having zeros at regular intervals. By making it so that these zeros occur at the centers of all neighboring symbol slots, there will only be one symbol not equal to zero at each slot center (this being the symbol that is centered on that particular slot). At all other times within the symbol slot, all symbols can have values other than zero and interfere. Hence, we immediately expect quite different signal statistics depending on the position within the symbol slot we are interested in.
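To make the zero-crossing argument concrete, here is a minimal sketch that lists the weights of all overlapping pulses at two sampling instants. The names `pulse_weights` and `delta` are mine and purely illustrative; `delta` is the distance (in symbol periods) from the nearest pulse center, and the pulse is taken as an ideal sinc truncated to $K$ slots.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def pulse_weights(K, delta):
    """Weights of the K overlapping pulses at an instant that lies
    `delta` symbol periods away from the nearest pulse center."""
    return [sinc(k - K // 2 + delta) for k in range(K)]

center = pulse_weights(16, 0.0)     # sampled exactly on a pulse center
boundary = pulse_weights(16, 0.5)   # sampled halfway between two centers

# Count the pulses that contribute noticeably at each instant.
print(sum(abs(w) > 1e-12 for w in center))    # 1
print(sum(abs(w) > 1e-12 for w in boundary))  # 16
```

On a pulse center only one symbol contributes; halfway between two centers all sixteen do, which is why the statistics depend so strongly on the sampling position.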

The spectrum of the sinc function is rectangular (or almost rectangular for the truncated sinc) so that many such channels could be put very close to each other without spectral crosstalk.

In contrast to OFDM which was (in a special case) the superposition of randomly chosen letters from the same alphabet (the QAM constellation diagram) *with equal weights*, the weights of the different superposed symbols in OTDM are defined by the sinc function. At the symbol slot center, for instance, the weight for the symbol centered in that slot is unity while for all other symbols it is zero. In this case the PMF for a quadrature, the signal or its power can be simply determined from the constellation diagram of the modulation format.
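As a small illustration of the slot-center case, the power PMF can be read directly off the constellation. The sketch below assumes a square QAM constellation with levels $\pm 1, \pm 3, \dots$ normalized to unit mean power and ignores the small correction for the sinc truncation; `power_pmf` is a hypothetical helper, not part of the article's code.

```python
from collections import Counter
from fractions import Fraction

def power_pmf(n_levels):
    """PMF of the symbol power for square QAM with n_levels amplitude
    levels per quadrature, normalized to unit mean power."""
    levels = [2 * i - n_levels + 1 for i in range(n_levels)]  # e.g. -3, -1, 1, 3
    powers = [i * i + q * q for i in levels for q in levels]
    mean = Fraction(sum(powers), len(powers))
    counts = Counter(Fraction(p) / mean for p in powers)
    return {p: Fraction(c, len(powers)) for p, c in sorted(counts.items())}

pmf16 = power_pmf(4)   # 16-QAM, N = 4
for power, prob in pmf16.items():
    print(float(power), float(prob))
```

For 16-QAM this gives the three power levels 0.2, 1.0 and 1.8 with probabilities 1/4, 1/2 and 1/4.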

At all other positions the weights are different and non-zero, so we cannot use combinatorial means to tackle this problem. Due to the low probabilities involved (which are expected to be similar to OFDM), an exhaustive numerical simulation is not feasible and a plain Monte Carlo simulation will not yield the relevant low-probability details. We can, however, apply the concept of importance sampling to our Monte Carlo simulations [1]. Here, we bias the random drawing (of points from the constellation diagram for each symbol) to favor the occurrence of signals with high powers, i.e. we prefer to stay in the corners of the constellation diagram and switch to the opposite corner whenever the sign of the particular sinc weight for a symbol becomes negative. In this way we get samples for which all the constituents are quite large and in phase. We have to retain some randomness, however, since we are interested in the PMF/CDF and not just the maximum possible value (more on that below). The biased probability for the symbols is compensated when later calculating the PMF.
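The idea can be tried out on a deliberately simplified toy problem. The sketch below is not the MATLAB code from the end of the article: it looks at one quadrature only, uses binary symbols $\pm 1$ instead of a QAM alphabet, and biases each symbol toward its "favorable" sign with a fixed probability `p_fav` (all of these are my assumptions). Because $2^{16}$ patterns can still be enumerated, the importance-sampling estimate of a tail probability can be compared with the exact value.

```python
import itertools
import math
import random

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

K = 16                                           # number of overlapping pulses
W = [sinc(k - K // 2 + 0.5) for k in range(K)]   # weights halfway between pulse centers
FAV = [1 if w >= 0 else -1 for w in W]           # symbol sign that adds constructively

def sample_value(symbols):
    """One quadrature of the signal sample for a sequence of +/-1 symbols."""
    return sum(w * a for w, a in zip(W, symbols))

def exact_tail(threshold):
    """P(sample >= threshold) by brute force over all 2^K symbol patterns."""
    hits = sum(sample_value(s) >= threshold
               for s in itertools.product((-1, 1), repeat=K))
    return hits / 2 ** K

def is_tail(threshold, p_fav, n, rng):
    """Importance-sampling estimate: each symbol takes its favorable sign
    with probability p_fav instead of 1/2; the bias is undone by the
    likelihood ratio p(x)/q(x) attached to every sample."""
    total = 0.0
    for _ in range(n):
        symbols = []
        ratio = 1.0
        for fav in FAV:
            if rng.random() < p_fav:
                symbols.append(fav)
                ratio *= 0.5 / p_fav
            else:
                symbols.append(-fav)
                ratio *= 0.5 / (1.0 - p_fav)
        if sample_value(symbols) >= threshold:
            total += ratio
    return total / n

rng = random.Random(1)
exact = exact_tail(2.2)
estimate = is_tail(2.2, p_fav=0.8, n=20000, rng=rng)
print(exact, estimate)
```

With the bias, a sizable fraction of the draws lands in the rare high-amplitude region, so a few thousand samples are enough for an event that unbiased sampling would hit only a handful of times.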

Biasing toward higher sample powers results in a smaller number of samples with low power, which in turn leads to larger variance in that part of the PMF. We would like to have an accurate representation of the whole PMF, though. Hence, we will generally simulate the same modulation format with different bias values and later combine their results using the *balance heuristic* which is basically just a pretty optimal way of combining such data [2].
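For reference, the balance heuristic in its textbook per-sample form [2] weights every sample $x$, regardless of which bias produced it, by $p(x) / \sum_j n_j q_j(x)$, where $q_j$ is the biased distribution of run $j$ and $n_j$ its number of samples. The sketch below applies this to the same kind of toy problem as above (one quadrature, $\pm 1$ symbols, $K = 12$, all my simplifications). It is not the histogram-based combination used in the MATLAB code, only an illustration of the principle.

```python
import itertools
import math
import random

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

K = 12
W = [sinc(k - K // 2 + 0.5) for k in range(K)]
FAV = [1 if w >= 0 else -1 for w in W]

def draw(p_fav, rng):
    """Draw K symbols; return the sample value and the number of symbols
    that did not take their favorable sign."""
    value, misses = 0.0, 0
    for w, fav in zip(W, FAV):
        a = fav if rng.random() < p_fav else -fav
        misses += a != fav
        value += w * a
    return value, misses

def density(p_fav, misses):
    """Probability of one specific symbol pattern under bias p_fav."""
    return p_fav ** (K - misses) * (1.0 - p_fav) ** misses

def balance_tail(thresholds, biases, n_each, rng):
    """Estimate P(value >= t) from several biased runs combined with the
    balance heuristic: each sample contributes p(x) / sum_j n_j q_j(x)."""
    est = [0.0] * len(thresholds)
    for p_fav in biases:
        for _ in range(n_each):
            value, misses = draw(p_fav, rng)
            mix = sum(n_each * density(b, misses) for b in biases)
            contrib = density(0.5, misses) / mix   # p(x) is the unbiased density
            for i, t in enumerate(thresholds):
                if value >= t:
                    est[i] += contrib
    return est

def exact_tail(t):
    hits = sum(sum(w * a for w, a in zip(W, s)) >= t
               for s in itertools.product((-1, 1), repeat=K))
    return hits / 2 ** K

thresholds = [0.5, 2.1]
estimates = balance_tail(thresholds, biases=[0.5, 0.8], n_each=10000,
                         rng=random.Random(2))
exact = [exact_tail(t) for t in thresholds]
print(list(zip(exact, estimates)))
```

The unbiased run (`0.5`) keeps the common part of the distribution accurate while the biased run (`0.8`) covers the tail, and the combined estimate stays close to the exact value at both thresholds.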

You will find the MATLAB code (not optimized for speed or anything) at the end of the article.

Since the code is available to play around with, we will look at just a few results to highlight the differences between OFDM and OTDM. In particular, the symbol length will be $16T$, i.e. 16 times the time between consecutive symbols (or the inverse symbol rate), and thus $K=16$. The alphabets used will be 16-QAM ($N=4$) and 64-QAM ($N=8$). As can be seen in Fig. 5 in the OFDM article, this results in probabilities in the vicinity of $10^{-28}$ for the largest powers in OFDM signals, which is far beyond the reach of “regular” Monte Carlo techniques.

The figure shows the worst case regarding high signal powers, with the sampling instants chosen at the symbol slot boundaries (see also the next section for more details). While the probabilities of the maximum values are the same (they depend only on $K$ and $N$ as shown in the OFDM post), the maximum observable power in OTDM is less than half of that in OFDM. At high probabilities like $10^{-6}$ the difference is smaller. This is a result of the heterogeneous weights of the contributions from the different symbols to each sample, as opposed to the equal (magnitudes of) weights in OFDM.

Sometimes there will be “kinks” in the graphs obtained by importance sampling. These are a result of outliers whose effect is amplified by the algorithm (in the code, `data` with very low `dataweights` contribute significantly more). This is a known problem of importance sampling. As these occur randomly, they might or might not be there. Their position is also random, so that they can be identified by multiple runs of the code.

In OFDM, the sample power distribution varied only little with sample timing within the symbol slot (cf. Figs. 6 and 7 in the OFDM post). As the constellation diagrams of consecutive OTDM symbols do not rotate relative to each other, it is quite straightforward to determine the maximum possible sample power at various instants within the symbol slot by simply adding the weight magnitudes (plus squaring and scaling). They can even be given analytically.$^1$
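A quick numerical version of "adding the weight magnitudes and squaring" is sketched below. Here `delta` is again the distance from the nearest pulse center in symbol periods (my convention for this sketch), and the scaling by the peak power of the constellation is left out.

```python
import math

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def max_weight_sum_sq(K, delta):
    """Square of the summed weight magnitudes of all K pulses at an
    instant `delta` symbol periods away from the nearest pulse center."""
    return sum(abs(sinc(k - K // 2 + delta)) for k in range(K)) ** 2

for delta in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5):
    print(f"{delta:.1f}  {max_weight_sum_sq(16, delta):.3f}")
```

For $K = 16$ the value rises from 1 on a pulse center to roughly 6.6 halfway between two centers.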

Unlike in OFDM, the maximum power does not have multiple local maxima within the symbol slot, but varies by about an order of magnitude between symbol slot center and boundaries. The sample power CDFs vary accordingly:

For $t_S=0.5$ (the symbol slot center), the CDF corresponds to the constellation diagram, as at this instant only one of the overlapping symbols is non-zero.

For (always necessary) oversampling, the distributions for the appropriate sampling instants can simply be combined. Interestingly, for simple two-fold oversampling, half the samples will be taken in the symbol slot center and the other half at its boundaries. Thus, the worst case regarding symbol power is included, and the resulting high-power samples need to be accounted for by leaving appropriate headroom in the digital-to-analog converters, amplifiers and whatnot.
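Which sampling positions occur for a given oversampling factor is simple bookkeeping. The sketch below assumes a rational oversampling factor and that the first sample sits on a pulse center; positions are counted from the pulse center in fractions of a symbol period, so 0 is the center and 1/2 the boundary. `sampling_phases` and `combine_ccdfs` are illustrative names of mine.

```python
from fractions import Fraction

def sampling_phases(oversampling):
    """Distinct sampling positions (relative to the pulse center, in symbol
    periods) visited for a rational oversampling factor."""
    r = Fraction(oversampling)
    phases = []
    n = 0
    while True:
        phase = (n / r) % 1          # position of sample n within its slot
        if phase in phases:          # the pattern repeats from here on
            break
        phases.append(phase)
        n += 1
    return phases

def combine_ccdfs(ccdfs):
    """Average the complementary CDFs of all visited positions, each given
    on the same power grid and each position being equally likely."""
    return [sum(col) / len(ccdfs) for col in zip(*ccdfs)]

print(sampling_phases(2))        # center and boundary
print(sampling_phases("3/2"))    # three distinct positions
```

For two-fold oversampling this returns the positions 0 and 1/2, matching the center/boundary split described above.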

```matlab
clear variables

LoS = 16;       % length of symbol in units of symbol slots
QAM = 64;       % size of the QAM alphabet (e.g. 16 or 64); sqrt(QAM) must be an integer
NoS = 2^20;     % number of Monte Carlo samples
offset = 0.5;   % relative sampling position within symbol slot - 0.5 is center

% list of biasing coefficients used; need not be integer
explist = [0,1,2,3];        % works well for 16-QAM, LoS = 16
% explist = [0,2,4,6,8];    % works well for 64-QAM, LoS = 16

bins = linspace(0,16,1001); % set histogram bins for graphing

%%% set-up of some variables
normalization = 1/sqrt(2/3*(QAM-1));        % alphabet normalization
normalization = normalization / 0.987345;   % unit energy per symbol for LoS = 16
% normalization = normalization / 0.993669; % unit energy per symbol for LoS = 32
normweight = (1/QAM)^LoS;   % probability of any particular symbol w/o biasing

% sinc coefficients of each sample within the symbol
coeffs = sinc((-LoS/2:LoS/2-1) + offset);

% constellation diagram
NoV = sqrt(QAM);            % number of possible values in I and Q
values = 0:2:2*(NoV-1);
values = values - NoV + 1;  % possible values in I and Q for each symbol
values = values * normalization;
symbols = repmat(values, [NoV,1]) + 1i * repmat(values.', [1,NoV]);
symbols = symbols(:);       % complex QAM alphabet

bins = [bins, Inf];
PMF = zeros(length(bins), length(explist));
bincounts = PMF;

%%% importance sampling Monte Carlo simulation
for jj = 1:length(explist)
    % biasing is based on distance to opposite corner symbols (depending on coeff sign)
    distplus = abs(symbols - symbols(1));
    wplus = 1 ./ (1 + distplus).^explist(jj);   % weights avoid division by zero
    wplus = wplus / sum(wplus);                 % normalization
    distminus = abs(symbols - symbols(QAM));    % opposite corner
    wminus = 1 ./ (1 + distminus).^explist(jj);
    wminus = wminus / sum(wminus);

    origdata = rand(NoS,LoS);   % generate random data (symbol plus neighbors)
    data = zeros(NoS,LoS);
    dataweights = zeros(NoS,LoS);
    for kk = 1:LoS
        if coeffs(kk) >= 0
            for ii = QAM:-1:1   % draw from biased distribution
                data(origdata(:,kk) < sum(wplus(1:ii)),kk) = symbols(ii);
                dataweights(origdata(:,kk) < sum(wplus(1:ii)),kk) = wplus(ii);
            end
        else                    % for negative coeffs bias towards opposite corner
            for ii = QAM:-1:1
                data(origdata(:,kk) < sum(wminus(1:ii)),kk) = symbols(ii);
                dataweights(origdata(:,kk) < sum(wminus(1:ii)),kk) = wminus(ii);
            end
        end
    end

    data = data .* repmat(coeffs, [NoS, 1]);    % each row corresponds to one sample
    powers = abs(sum(data, 2)).^2;
    dataweights = prod(dataweights, 2);

    if explist(jj) ~= 0
        % remove IS outliers
        index = find(dataweights >= normweight);
        powers = powers(index);
        % division by 4 is necessary to compensate for biasing towards only 1 corner
        dataweights = dataweights(index) * NoS / 4;
    else
        dataweights = dataweights * NoS;
    end

    % build PMF histograms for each biasing value
    for ii = 1:length(bins)-1
        index = find(powers >= bins(ii) & powers < bins(ii+1));
        PMF(ii,jj) = sum(normweight ./ dataweights(index));
        bincounts(ii,jj) = length(index);
    end
end

% average histograms using balance heuristic
weights = PMF .* bincounts ./ repmat(sum(PMF .* bincounts, 2), [1, length(explist)]);
weights(isnan(weights)) = 0;    % bincount of zero causes weights to become NaN
avgPMF = sum(PMF .* weights, 2);

% complementary CDF
for ii = 1:length(avgPMF)
    CDF(ii) = sum(avgPMF(ii:end));
end
```

Note the normalization of the symbol amplitude in order to obtain unit mean energy per symbol. This accounts for the distribution of the constellation points for the particular QAM format as well as the truncated sinc pulse shape. Since

$$\int_{-K/2}^{K/2} \mathrm{sinc}^2 x \, dx < 1 \quad \text{for} \quad K < \infty$$

the energy reduction resulting from truncation is compensated by appropriate scaling.
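The size of this energy reduction is easy to check numerically. The sketch below integrates $\mathrm{sinc}^2$ over the truncation window with Simpson's rule (the function names and the step count are my choices). For $K = 16$ and $K = 32$ it comes out at about 0.987345 and 0.993669, which appear to be the two constants used in the listing above.

```python
import math

def sinc_sq(x):
    """Square of the normalized sinc, sin(pi x)/(pi x)."""
    if x == 0:
        return 1.0
    s = math.sin(math.pi * x) / (math.pi * x)
    return s * s

def truncated_energy(K, steps_per_unit=1000):
    """Simpson's-rule value of the integral of sinc^2 over [-K/2, K/2]."""
    n = K * steps_per_unit        # number of subintervals (even)
    h = K / n
    a = -K / 2
    total = sinc_sq(a) + sinc_sq(a + K)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * sinc_sq(a + i * h)
    return total * h / 3

print(f"{truncated_energy(16):.6f}")
print(f"{truncated_energy(32):.6f}")
```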

**1** The main problem is to find

$$S_\mathrm{max}^2 = \left[\sum_{k=0}^{K-1} \left| \mathrm{sinc} \left(k - K/2 + t_S\right) \right| \right]^2 = \left[\sum_{k=0}^{K-1} \left| \frac{\sin \bigl[\pi \left(k - K/2 + t_S\right)\bigr]}{\pi \left(k - K/2 + t_S\right)} \right| \right]^2$$

which is then multiplied by the highest power occurring in the constellation diagram of the particular modulation format (which also depends on $K$ when properly normalized). For simplicity, we’ll assume $K$ to be even and only try to find $S_\mathrm{max}$, leaving the squaring to the interested reader. The sine in the numerator then has a constant magnitude determined by $t_S$ and simply alternates its sign with $k$, so that we can write

$$S_\mathrm{max} = \frac{\sin \pi t_S}{\pi}\sum_{k=0}^{K-1} \left| \frac{c_S \left(-1\right)^k}{k - K/2 + t_S} \right|$$

with

$$c_S = \begin{cases} 1 & K/2 \text{ even}\\ -1 & K/2 \text{ odd}\end{cases}$$

We split the sum to get rid of the negative denominator and with it the absolute value function and the sign coefficients:

$$S_\mathrm{max} = \frac{\sin \pi t_S}{\pi}\sum_{k=0}^{K/2-1} \left[ \frac{1}{k + t_S} - \frac{1}{k - K/2 + t_S} \right]$$

These are then just two finite sums of the generalized harmonic type. Using $\psi(x+n) - \psi(x) = \sum_{k=0}^{n-1} 1/(x+k)$, their solution can be given by generalized harmonic numbers or in terms of the digamma function $\psi(\cdot)$ [3]

$$S_\mathrm{max} = \frac{\sin \pi t_S}{\pi} \left[ \psi\left(t_S - \frac{K}{2}\right) + \psi\left(t_S + \frac{K}{2}\right) - 2 \psi\left(t_S \right) \right]$$
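As a sanity check, the closed form can be compared numerically with the direct sum of weight magnitudes. The Python standard library has no digamma function, so the sketch below uses a small home-made one (recurrence plus asymptotic series, which is my choice of method and only meant for this check). It is valid for $0 < t_S < 1$ and even $K$, with $t_S$ used exactly as it appears in the formulas of this footnote.

```python
import math

def digamma(x):
    """psi(x) for real x that is neither zero nor a negative integer.
    The argument is shifted upward with psi(x) = psi(x + 1) - 1/x, then
    an asymptotic expansion is applied."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series

def s_max_direct(K, t):
    """Sum of |sinc(k - K/2 + t)| over all K pulses."""
    total = 0.0
    for k in range(K):
        x = k - K / 2 + t
        total += abs(math.sin(math.pi * x) / (math.pi * x))
    return total

def s_max_digamma(K, t):
    """Closed form in terms of the digamma function."""
    return math.sin(math.pi * t) / math.pi * (
        digamma(t - K / 2) + digamma(t + K / 2) - 2.0 * digamma(t))

for t in (0.1, 0.25, 0.5, 0.9):
    print(t, s_max_direct(16, t), s_max_digamma(16, t))
```

Both columns agree to well below $10^{-6}$ for the values tried here.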

[1] Importance Sampling [Wikipedia]

[2] E. Veach, *Robust Monte Carlo methods for light transport simulation*, dissertation, Stanford University, 1998.

[3] T. M. Rassias and H. M. Srivastava, “Some classes of infinite series associated with the Riemann zeta and polygamma functions and generalized harmonic numbers,” *Applied Mathematics and Computation*, vol. 131, pp. 593–605, 2002.

I take this lightly now because I no longer care about publishing, but it’s actually quite sad. I won’t be re-writing the paper to make the reviewers like it; instead I’ll post it here in its current form for anyone to do with it as he or she pleases.

Download the paper.

Download the MATLAB code to generate the Nyquist-modulated signals (poorly commented and uses functions from the free Optilux library and the communications toolbox).

The proper citation would be

M. Winter, “Nyquist Pulse Signalling for Spectrally Efficient Terabit/s Superchannels,” utterly rejected by *Photonics Technology Letters* in March 2011.

The print version would have had an additional acknowledgment section, mentioning fruitful discussions with René Schmogrow and Wolfgang Freude from KIT, which I forgot to put into the draft but which really is quite important. As far as I know, René will keep pursuing the idea. I wish him more luck with the reviewers.

Anyway, to be able to put “regular” non-OFDM channels very close to each other, their spectrum needs to be filtered very tightly. The tightest possible spectrum which contains all information at the sampling points is rectangular between the (positive and negative) Nyquist frequencies – in this case this is the frequency $f_\mathrm{Nyquist} = 1 / 2T = f_T / 2$, where $T$ is the symbol duration and $f_T = 1/T$ the symbol rate. Everything outside that is in some way redundant in a single channel. The guys from Polito achieved a channel center frequency separation of $1.1/T$ by using a Finisar WaveShaper device [1] – not quite the cheapest way to do that, even though it can also add highpass filtering to compensate for a possible inline lowpass characteristic$^1$. Their transfer function is shown in Fig. 1.

A simpler and cheaper way to do that would be to use some real-time preprocessing in the transmitter or – for those who can’t program their own FPGAs – an arbitrary waveform generator to demonstrate the concept. However, electronic filtering was not much more than a footnote in the various presentations on Nyquist WDM at ECOC. It wouldn’t even take much processing. The required waveforms for each input could be stored in a look-up table and the waveforms for all the symbols then just need to be summed just before being output. Sounds easy enough.

So what do these waveforms look like? Well, to obtain a rectangular signal PSD, we need sinc waveforms for each symbol. The sinc function decays rather slowly and extends (ideally) over infinitely many symbol slots. However, we can truncate the infinitely long symbols to extend only over a finite number of symbol slots (that’s where the summing before the final output comes in). The truncation will of course affect the spectrum, which will no longer be rectangular. It’s quite simple to calculate the expected PSD using the time-domain symbols and the procedure outlined in footnote 1 of this post. In the time domain, the signal is

$$E(t) = \sum_{n=-\infty}^{\infty} c_n \, \mathrm{sinc}\biggl(\pi\frac{t - nT}{T}\biggr) \cdot \Pi\biggl(\frac{t-nT}{kT}\biggr)\tag{1}$$

where $\mathrm{sinc} x = \sin x / x$, $\Pi(t/\tau)$ is a rectangular window of width $\tau$ centered on $t=0$, and $c_n$ is the data encoded on symbol $n$. Hence, the sinc function is truncated to a length of $k$ symbols. A typical output sequence $E(t)$ for $c_n \in \lbrace -1, 1 \rbrace$ is shown in Fig. 2, together with the shape of a single symbol for a symbol length of $8T$ ($k=8$). Note that the time-domain shape of each symbol is zero at the centers (the ideal sampling points) of all neighboring symbols.
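Equation (1) is straightforward to evaluate directly. The following sketch does so for a finite data sequence (the function name `field`, the example data and the choice $T = 1$ are mine) and confirms that the signal takes exactly the value $c_n$ at the center of symbol $n$.

```python
import math

def sinc(x):
    """sin(pi x)/(pi x); the factor pi from Eq. (1) is absorbed here."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def field(t, data, k, T=1.0):
    """E(t) from Eq. (1) for a finite list data = [c_0, c_1, ...],
    with symbol n centered at t = n * T and truncated to k symbol periods."""
    total = 0.0
    for n, c in enumerate(data):
        u = (t - n * T) / T
        if -k / 2 <= u < k / 2:      # rectangular window of width k * T
            total += c * sinc(u)
    return total

data = [1, -1, -1, 1, 1, 1, -1, 1]
at_centers = [field(n, data, k=8) for n in range(len(data))]
between = field(2.5, data, k=8)      # halfway between two symbol centers
print(at_centers)
print(between)
```

At the symbol centers the data is recovered (up to rounding), while halfway between two centers the value depends on all overlapping neighbors.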

Given (1), the PSD can be calculated as

$$\mathrm{PSD}(f) = \tilde E(f)^* \tilde E(f) \propto \Bigl[kT \,\mathrm{sinc}\bigl(\pi kT \cdot f\bigr) * \Pi\bigl(T\cdot f\bigr)\Bigr]^2\tag{2}$$

where the “regular” $*$ means convolution, the superscript $^*$ means complex conjugation, and the $\Pi$ function describes a rectangular window of width $T^{-1}$ in $f$ (i.e. $2\pi\,T^{-1}$ in angular frequency). The whole thing scales with the average power in the data symbols $c_n$, which is why there is a proportionality relation instead of an equality. I asked trusty old Mathematica to do the convolution for me. Fig. 3 shows the so-calculated spectra for different values of $k$. Clearly, the longer the allotted time window over the sinc function, the closer the spectrum will be to rectangular. However, even the shortest time window of $4T$ has a spectrum that is already about as good as the WaveShaper of Fig. 1. Also, the spectra look similar to the OFDM spectra in this post, which also become more rectangular as the number of subchannels (samples per symbol) is increased – although by comparing (2) to the mathematical description of an OFDM spectrum we see that there are fundamental differences.
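Without Mathematica, the bracket in (2) can also be obtained by numerically Fourier-transforming a single truncated pulse. The sketch below does this with Simpson's rule for $T = 1$ (function names and step size are my choices); squaring the result gives the PSD up to the scaling mentioned above.

```python
import math

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def pulse_spectrum(f, k, steps_per_unit=200):
    """Fourier transform at frequency f (in units of 1/T) of a sinc pulse
    truncated to k symbol periods. The pulse is even, so a cosine
    transform suffices."""
    n = k * steps_per_unit
    n += n % 2                        # Simpson's rule needs an even count
    h = k / n
    a = -k / 2

    def g(t):
        return sinc(t) * math.cos(2 * math.pi * f * t)

    total = g(a) + g(a + k)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def psd_db(f, k):
    """PSD of the truncated pulse relative to its value at f = 0, in dB."""
    return 20 * math.log10(abs(pulse_spectrum(f, k)) / abs(pulse_spectrum(0.0, k)))

for f in (0.0, 0.25, 0.5, 1.0):
    print(f, round(pulse_spectrum(f, 8), 4))
```

For $k = 8$ the spectrum is close to 1 inside the band, drops to about one half at the band edge $f = 1/2T$, and is small outside.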

The (rectangular) windowing causes sidelobes to appear which are about 25 dB below peak. These will cause some crosstalk when packing such channels close together to form these superchannels. One way to suppress these sidelobes without increasing the symbol length unnecessarily is to use a non-rectangular window function in the time domain. There are many such functions out there, some of which are better than others. Fig. 4 shows the spectrum when using the Hamming window (my personal favorite for no particular reason)

$$ w(t) = 0.54 + 0.46 \cos\biggl(2\pi \frac{t}{kT}\biggr) \quad \text{for} \quad -\frac{kT}{2} \le t < \frac{kT}{2}\tag{3}$$

The sidelobes are significantly decreased and crosstalk should be much less of a problem. The window function can be changed as part of the stored waveforms in the real-time preprocessor and is thus easily implemented.
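The effect of the window can be checked with a few lines of code. The sketch below compares the out-of-band spectrum of a rectangularly truncated sinc with that of a Hamming-windowed one for $k = 8$ and $T = 1$. The frequency grid (0.8 to 2 in units of $1/T$) and all function names are arbitrary choices of mine, so the numbers are indicative only.

```python
import math

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def rectangular(t, k):
    return 1.0

def hamming(t, k):
    """Eq. (3) with T = 1 on the interval [-k/2, k/2)."""
    return 0.54 + 0.46 * math.cos(2 * math.pi * t / k)

def spectrum(f, k, window, steps_per_unit=200):
    """Cosine transform of the windowed sinc pulse by Simpson's rule."""
    n = k * steps_per_unit
    n += n % 2
    h = k / n
    a = -k / 2

    def g(t):
        return window(t, k) * sinc(t) * math.cos(2 * math.pi * f * t)

    total = g(a) + g(a + k)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def worst_level_db(k, window, freqs):
    """Largest out-of-band spectral level relative to f = 0, in dB."""
    ref = abs(spectrum(0.0, k, window))
    worst = max(abs(spectrum(f, k, window)) for f in freqs)
    return 20 * math.log10(max(worst, 1e-300) / ref)

freqs = [0.8 + 0.1 * i for i in range(13)]
rect_db = worst_level_db(8, rectangular, freqs)
hamming_db = worst_level_db(8, hamming, freqs)
print(rect_db, hamming_db)
```

The price of the lower sidelobes is a wider transition region at the band edge, which is why the comparison only starts some distance away from $f = 1/2T$.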

So far we have dealt with continuous signals and talked about storing waveforms digitally in an FPGA/ASIC, which doesn’t make much sense. We can only store sampled signals, which necessitates slight changes in our reasoning. In short, sampling the waveforms makes the corresponding spectrum periodic and these periodic spectra may even overlap, as results from basic Fourier theory. The free spectral range of that periodicity depends on the sampling rate; the width of the main “lobe” does not (unless we undersample). We can thus control how much basically “empty” spectrum appears between two periodic lobes.

If we sample the sinc waveforms at the symbol rate (where we needed only a single sample per symbol since all other samples are zero$^2$) the spectra would overlap or at least “connect” and we would get a single continuous spectrum whose shape then only depends on the sample function. For rectangular samples of width $T$, this is shown in Fig. 3 of this post. By oversampling – just as is done in OFDM – we cause the image spectra to “disconnect” and create an arbitrarily sized (determined by the amount of oversampling) space, or gap, between the spectral lobes which can then be used to remove the unwanted image spectra using common electrical low-pass filters. This principle is shown in Fig. 5. The zero of the enveloping sinc function, which results from the rectangular time samples used, necessarily occurs exactly in the middle of the image spectra. If there is enough room between the main lobes to allow filtering (dashed curve), the image spectra are completely removed (bottom part of the figure) and we are left with an almost rectangular channel spectrum, obtained without any WaveShaper devices and only with a second-order Gaussian filter and a bit (1.8×) of oversampling. Steeper filters need less oversampling, shallower filters (e.g. Bessel-Thomson filters) may need more. If we had wanted, we could even have introduced a high-pass transfer function to pre-compensate subsequent filters by modifying the stored waveform accordingly.
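The spectral bookkeeping behind this can be written down in a few lines. The sketch below assumes an ideal rectangular channel spectrum of width $1/T$ (the skirts caused by truncation are ignored) and rectangular, zero-order-hold samples; `image_layout` and its dictionary keys are hypothetical names.

```python
def image_layout(oversampling, T=1.0):
    """Positions of the main lobe, the first image and the gap between
    them for a channel of width 1/T sampled at oversampling/T."""
    fs = oversampling / T             # sample rate = spacing of the images
    main_edge = 0.5 / T               # upper edge of the main lobe
    image_start = fs - 0.5 / T        # lower edge of the first image
    return {
        "sample_rate": fs,
        "main_lobe_edge": main_edge,
        "first_image_start": image_start,
        "first_image_center": fs,
        "gap": image_start - main_edge,   # room available for the low-pass filter
        "hold_zero": fs,                  # first zero of the sample-and-hold sinc envelope
    }

print(image_layout(1.0)["gap"])   # images touch: no room for a filter
print(image_layout(1.8)["gap"])   # gap of about 0.8/T
```

In this idealized picture the gap grows as $(r-1)/T$ with the oversampling factor $r$, and the zero of the envelope always falls on the center of the first image.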

The implementation in an FPGA/ASIC would be quite straightforward. The sampled waveforms for each symbol could be stored in a look-up table, to be read out and added before being output. Alternatively, one could just store values of the sinc function and do a bit more processing for each symbol, which would require less storage space. Oversampling would not need to increase the implementation complexity significantly. For 2× oversampling, only the size of the entries in the look-up tables changes. For 1.5× oversampling, we would additionally need each possible symbol twice – one version centered at the symbol center and one appropriately shifted, which we would alternate from symbol to symbol. Other, more odd, oversampling factors would require somewhat more extensive tables or the use of sinc tables. An FPGA that is capable of generating OFDM signals should be more than sufficient for this.
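In software, the look-up-table approach for integer oversampling might look like the sketch below. It covers one quadrature with a four-level alphabet and 2× oversampling; `build_lut` and `synthesize` are illustrative names, and nothing here is meant to reflect actual FPGA constraints such as word lengths or parallelism.

```python
import math

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def build_lut(alphabet, k, oversampling=2):
    """One stored waveform per symbol value: a sinc truncated to k symbol
    periods and sampled `oversampling` times per symbol."""
    n = k * oversampling
    shape = [sinc(i / oversampling - k / 2) for i in range(n)]
    return {a: [a * s for s in shape] for a in alphabet}

def synthesize(symbols, lut, k, oversampling=2):
    """Overlap-add the stored waveforms, shifting by one symbol period
    (`oversampling` samples) from symbol to symbol."""
    n_wave = k * oversampling
    out = [0.0] * ((len(symbols) - 1) * oversampling + n_wave)
    for n, a in enumerate(symbols):
        start = n * oversampling
        for i, v in enumerate(lut[a]):
            out[start + i] += v
    return out

k, ovs = 8, 2
lut = build_lut((-3, -1, 1, 3), k, ovs)
symbols = [1, -3, 3, -1, -1, 3, 1, -3]
out = synthesize(symbols, lut, k, ovs)

# The center of symbol n sits at output index n * ovs + k * ovs // 2.
centers = [out[n * ovs + k * ovs // 2] for n in range(len(symbols))]
print(centers)
```

The samples at the symbol centers reproduce the transmitted values, while the samples in between carry the overlapping contributions of up to $k$ symbols.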

A simpler alternative that would work at least for laboratory work would be to program the calculated waveforms into an arbitrary waveform generator (AWG). Here we would not even be limited in the length of the individual symbols since these need not be calculated in real time. For real-time transmission this is not an option, though.

The necessary oversampling comes at the cost of reduced symbol rate for a given DAC sample rate since the sample rate determines the spectral width that we have control over. We can fill any part of the spectrum with zeros, but these zeros are potential data that is not being transmitted. On the other hand we can fill the empty part of the spectrum with parts of the neighboring channels so that overall we do not sacrifice spectral efficiency. This is what’s so great about Nyquist WDM (even though it shouldn’t be called that when using a WaveShaper – then it’s just very dense WDM).

I wonder how long we’ll have to wait until we find an experimental implementation of this…

**1** Simply filtering a modulated signal is however not the same as generating a rectangular spectrum. This would only work if the input to the filter was a (Dirac) pulse sequence, which would yield the required sinc signals at its output. Since the filter input will in general be some modulated signal, the output will be the convolution of the sinc function with whatever is input, which can become a bit messy, as now the sinc zeros are no longer aligned to the sampling points of the neighboring symbols. This is shown in Fig. 6, in which the signal of Fig. 2 is created by filtering 50% RZ-shaped pulses. Clearly, there is some variation in the sampling levels that was not there in Fig. 2.

The filtering thing works best when the input pulses are narrow and thus the spectrum very wide. This should be interesting for the OTDM folks who like to work with very short pulses…

**2** Since the sinc is zero at the sampling points of all neighboring symbols, we only need a single sample per symbol. This should ring a bell, as it is the same as NRZ-modulated signalling. And indeed, since there is no guard interval between the spectral images which appear as a result of the sampled time signal, all we see is one contiguous modulated spectrum, which is exactly what we would expect from either NRZ modulation or Nyquist WDM without oversampling.

[1] G. Gavioli, E. Torrengo, G. Bosco, A. Carena, V. Curri, V. Miot, P. Poggiolini, M. Belmonte, F. Forghieri, C. Muzio, S. Piciaccia, A. Brinciotti, A. La Porta, C. Lezzi, S. Savory, and S. Abrate, “Investigation of the impact of ultra-narrow carrier spacing on the transmission of a 10-carrier 1Tb/s superchannel,” in *Conference on Optical Fiber Communication (OFC)*, March 2010, paper OThD3.

[2] W. Shieh, Q. Yang, and Y. Ma, “107 Gb/s coherent optical OFDM transmission over 1000-km SSMF fiber using orthogonal band multiplexing,” *Optics Express*, vol. 16, no. 9, pp. 6378–6386, April 2008.