Machine learning and applications in ultrafast photonics

Goëry Genty1 ✉, Lauri Salmela1, John M. Dudley2, Daniel Brunner2, Alexey Kokhanovskiy3,
Sergei Kobtsev3 and Sergei K. Turitsyn3,4

Recent years have seen the rapid growth and development of the field of smart photonics, where machine-learning algorithms
are being matched to optical systems to add new functionalities and to enhance performance. An area where machine learning
shows particular potential to accelerate technology is the field of ultrafast photonics — the generation and characterization of
light pulses, the study of light–matter interactions on short timescales, and high-speed optical measurements. Our aim here is
to highlight a number of specific areas where the promise of machine learning in ultrafast photonics has already been realized,
including the design and operation of pulsed lasers, and the characterization and control of ultrafast propagation dynamics. We
also consider challenges and future areas of research.

Machine learning is an umbrella term describing the use of statistical techniques and numerical algorithms to carry out tasks without explicit programmed and procedural instructions. Machine-learning algorithms are widely used in many areas of engineering and science, with particular strengths in classification, pattern recognition, prediction, system parameter optimization and the construction of models of complex dynamics from observed data. Machine-learning tools have been widely applied in fields such as control systems, speech processing, neuroscience and computer vision1.
In optics and photonics, early applications of machine learning have mostly been in the form of genetic algorithms for pattern recognition2, image reconstruction3, aberration corrections4 or the design of optical components5,6. More recent work has focused on the analysis of large datasets7,8 and on inverse problems where the superior ability of machine learning to classify data, to identify hidden structures and to deal with a large number of degrees of freedom has led to many significant results. Particular areas of success include the design of nanomaterials and structures with specific target properties9–11, label-free cell classification12, super-resolution microscopy13,14, quantum optics15 and optical communications16–18.
In addition to applications in the general area of data processing, there is particular potential for machine-learning methods to drive the next generation of ultrafast photonic technologies. This is not only because there is increasing demand for adaptive control and self-tuning of ultrafast lasers, but also because many ultrafast phenomena in photonics are nonlinear and multidimensional, with noise-sensitive dynamics that are extremely challenging to model using conventional methods. While advances in measurement techniques have led to substantial progress in experimental studies of such complex dynamics, recent research has shown how machine-learning algorithms are providing new ways to identify coherent structures within large sets of noisy data, and can even potentially be applied to determining underlying physical models and governing equations based on only the analysis of complex time series.
Our aim here is to review a number of specific areas where the promise of machine learning in ultrafast photonics has already been realized, and to also consider challenges and future directions of study as well as applications where substantial impact is expected in the coming years. Before presenting specific details, we first illustrate in Fig. 1 an overview of different machine-learning strategies and associated architectures, listing the core concepts, implementation methodologies and applications where these have been applied in ultrafast photonics.

Laser design and self-optimization
In this section, we give an overview of the use of machine learning in laser design.

Self-tuning of ultrafast fibre lasers. Ultrafast lasers are essential tools in many areas of photonics, including telecommunications, material processing and biological imaging19–23. They have also played a central role in several Nobel prizes awarded for femtosecond coherent control (1999); the development of the precision frequency comb (2005); and, more recently, the generation of high-power femtosecond pulses via chirped pulse amplification (2018). Although some ultrafast sources are based on relatively simple designs, the operation of many important laser systems is in fact very complex, with dynamic pulse shaping determined by the interplay between a range of nonlinear, dispersive and dissipative effects24. Although this complexity certainly creates challenges in controlling and optimizing the laser emission, it also offers considerable performance advantage not available with simpler systems. A key challenge is then to harness this complexity.
The difficulty in optimizing a particular ultrafast laser arises from the number of degrees of freedom (or control parameters) that need to be balanced to achieve stable operation or to reach a specific dynamical regime. Of course, efforts to develop self-optimized or autotuned lasers have been made for many years, with the dominant approach being to linearly sweep through a subset of the available parameter space while monitoring the laser output and using a feedback loop to obtain and maintain a desired operating state. While this is a straightforward approach for simpler laser designs with limited parameters, it becomes intractable when the laser operation depends on many degrees of freedom, or when multiple output characteristics need to be optimized simultaneously. Moreover, there is an increasing demand in both research and industrial applications for fully autonomous operation and active realignment in the presence of external perturbations, as well as for the ability to make dynamic changes in pulse characteristics adapted to the

[Figure 1 diagram. Concepts: supervised learning and unsupervised learning. Implementations: genetic algorithms, search algorithms (including toroidal search), singular value decomposition, neural networks, convolutional neural networks and reservoir computing. Applications shown: coherent control of ultrafast dynamics44–47,82; optimization, mode locking and autotuning of fibre lasers27,31–36,41 and 37–39; chromatic dispersion compensation; hidden physics models66,67; dimensionality reduction for fibre laser optimization28,29; analysis of ultrafast instabilities75,76; pulse shaping and characterization40,42,43,54,64,65; mode locking of fibre lasers30; pulse characterization57,58; multimode fibre propagation dynamics51,59,81; prediction of nonlinear dynamics69,70; signal recovery in optical communications; analysis of chaotic dynamics.]

Fig. 1 | Overview of the main machine-learning concepts and implementations that can be used in ultrafast photonics. The figure illustrates the core
concepts and corresponding implementation methodologies as delimited by the coloured arcs, and links these to particular applications where these have
been applied in ultrafast photonics. There are also other concepts including semi-supervised learning and reinforcement learning, which use some of the
implementations mentioned in the figure, but these have yet to be exploited in an ultrafast context. Of course, we also stress that all these methods have
been used in many other fields of science in addition to the ones shown here.

target environment (for example, propagation medium or material). It is for such systems with greatly added complexity that approaches based on machine learning are especially promising and desirable.
An important example here is the widespread fibre laser, where polarization control, pump power, spectral filtering and loss combine to create a wide range of possible operating regimes governed by a rich landscape of nonlinear dynamics25,26. Depending on the exact choice of parameters, the same laser can exhibit very different behaviour: continuous-wave lasing, noise-like pulse generation, Q-switching, mode locking, multiple pulsing and bound states. It is for this multivariable optimization problem where machine learning has recently led to a number of dramatic improvements. The general approach has been to combine an algorithmic feedback loop together with the electronic control of intracavity elements varying polarization, pump power and spectral filtering. Figure 2 shows a generic illustration of machine-learning strategies, control elements and output parameters for optimization of ultrafast fibre lasers. Specifically, Fig. 2a illustrates the training phase where control electronics and advanced measurement devices are used to probe the parameter space and map the corresponding operation states, respectively. Collected data are then fed to machine-learning algorithms for training. Figure 2b shows the self-tuning regime where the operation state of the laser is characterized in real time with a simplified measurement system fed into the machine-learning algorithm controlling the electronics to lock the system to a desired regime. This is where machine learning is particularly powerful as, once trained, the algorithm allows rapid parameter selection for optimum operation. Examples of machine-learning algorithms that can be used are highlighted in Box 1, and general guidelines in applying them are provided in Box 2.

92 Nature Photonics | VOL 15 | FebruarY 2021 | 91–101 |

[Figure 2 schematic. a, Training procedure. b, Machine-learning-assisted operation. Labelled elements include the pump, isolator, polarizer, EPC, coupler, diagnostics (OSA, PD, RFSA), simple measurement and control algorithm.]

Fig. 2 | Illustration of machine-learning strategies for optimization and self-tuning of ultrafast fibre lasers using control of intracavity elements via a
feedback loop and control algorithm. a, Training phase where control electronics acting, for example, on the polarization state (electronic polarization
controller (EPC)) sweep the parameter space to map different operating states of the laser to be used as inputs to the control algorithm (Box 1). Guidelines
for algorithm and parameter selection are given in Box 2. In the case of a search algorithm, the training phase is not necessary. Output characteristics are
measured by diagnostics such as an optical spectrum analyser (OSA), fast photodiode (PD) and oscilloscope (OSC), or radio-frequency spectrum analyser
(RFSA), and subsequently used as input to the control algorithm. b, Machine-learning-assisted operation where the laser operation is measured in real
time and fed into the control algorithm.

Ultrafast fibre lasers mode-locked by nonlinear polarization evolution (NPE) are particularly complex, because a change in the polarization state affects both spectral and temporal pulse shaping, as well as the gain-to-loss balance in the cavity due to the intrinsic saturable absorber role played by the polarization-dependent losses. The first studies combining an algorithmic feedback loop with some cavity control parameter were in fact proof-of-concept numerical simulations of an NPE fibre laser, where it was shown that multipulsing instability could be reduced via filters optimized with a genetic algorithm27, and that stochastic changes in environmentally induced birefringence could be mitigated by applying a singular value decomposition method28 or using variational autoencoders on the birefringence state map29,30. This modelling was rapidly followed by an experimental implementation using a singular fitness function to identify self-starting regimes in an NPE laser31. A number of subsequent experiments for various laser configurations (NPE, ring cavity and figure of eight) have used genetic algorithms to achieve self-tuning and autosetting in different regimes such as Q-switching, mode locking, Q-switched mode locking or the generation of on-demand pulses with different duration and energies32–36.
Table 1 summarizes a selection of results that have been obtained so far (extended from ref. 37), also providing the characteristics of the particular algorithms used in each case. In most of these studies, the feedback loop typically uses an advanced search or genetic algorithm targeting a desired optimal state based on some particular fitness or objective function as the reference criterion. Although these results are highly promising, genetic algorithms have to be carefully designed due to their sensitivity to the initial choice of parameters, which can lead the fitness function to converge towards a local optimum. They also cannot accommodate for long-term dependencies, and the fitness function typically monitors a single parameter limiting the operating regime that can be achieved. Another important drawback of genetic algorithms is their relatively slow convergence speed on the scale of minutes or even hours (Table 1). However, recent developments have shown that one can reduce this time considerably using algorithmic modifications that can mimic human logic, with the possibility to lock the laser to a desired operating state and to recover to this state from perturbation in less than one second38,39. Further improvement in self-tuning speed is likely to require algorithms that also include models of the pulse-generating mechanism to


Box 1 | Examples of machine-learning algorithms

Genetic algorithms. Genetic algorithms belong to a family of evolutionary algorithms that are inspired by biological evolution. A (random) initial population of genes (system parameters) is first evaluated by a fitness function, and the parents of the next generation are selected according to the fitness score. The reproduction includes a crossover of genes between the parents to create children that may undergo a mutation in which individual genes are randomly altered. Genetic algorithms may also include elitism, where the best individuals are cloned to the next generation.

Feed-forward neural networks. Feed-forward neural networks consist of an input layer accepting input data x, multiple hidden layers of basic computational units (neurons or nodes) that perform operations on the data using various weights and a nonlinear activation function, and an output layer that computes the network output y for regression or classification. In feed-forward neural networks, the information flows forward from the input layer through the hidden layers to the output layer.

Convolutional neural networks. Convolutional neural networks are a special type of feed-forward neural network where the input is convolved with a set of filters or kernels, followed by nonlinearity. The resulting feature map is then downsampled by a pooling function reducing the data’s dimensionality by combining nearby points into a single value. The convolution and pooling operations can be followed by additional convolutional layers to extract further relevant information from the previous feature maps. The output may then be flattened into a vector form for classification or regression tasks.

Unsupervised learning. This refers to label-free statistical tools for exploratory data analysis without prior knowledge about the data or system. The goals of unsupervised learning techniques typically include finding inherent patterns and structures to partition data into natural groups or clusters according to coordinates (for example, x_1 and x_2), or creating latent variable models for dimensionality reduction and data visualization.

Recurrent neural networks. Recurrent neural networks are a special type of neural network that are used for processing temporal/sequential data. Their topologies include intralayers and nodes with recurrent connections that store the network information from the previous input values. The hidden state of the recurrent nodes h_t is passed on to the next time step such that the output of the recurrent layer y_{t+1} depends on both the new input x_{t+1} and the previous hidden state h_t.

Reservoir computing. Reservoir computing is a particular class of recurrent neural network. In reservoir computing, the input W_in and recurrent layer connections W do not participate in the training but instead they are pre-defined in an ad hoc fashion and are often simply drawn from a random distribution. Training only modifies readout weights W_out and the usually complex neural network optimization becomes a simple matrix inversion that can be computed in a single step.
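The genetic-algorithm loop described in Box 1 (fitness evaluation, parent selection, crossover, mutation and elitism) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in any of the cited studies: the function names `genetic_search` and `toy_fitness`, the truncation selection, the uniform crossover and all numerical settings are hypothetical choices, and the toy fitness function stands in for a measured laser figure of merit.

```python
import numpy as np

def genetic_search(fitness, n_genes, pop_size=20, n_generations=60,
                   mutation_rate=0.1, n_elite=2, seed=0):
    """Maximize `fitness` over gene vectors in [0, 1]^n_genes (illustrative only)."""
    rng = np.random.default_rng(seed)
    pop = rng.random((pop_size, n_genes))            # random initial population
    for _ in range(n_generations):
        scores = np.array([fitness(ind) for ind in pop])
        order = np.argsort(scores)[::-1]             # best individuals first
        elite = pop[order[:n_elite]]                 # elitism: clone the best
        parents = pop[order[:pop_size // 2]]         # assumed truncation selection
        children = []
        while len(children) < pop_size - n_elite:
            i, j = rng.choice(len(parents), size=2, replace=False)
            mask = rng.random(n_genes) < 0.5         # assumed uniform crossover
            child = np.where(mask, parents[i], parents[j])
            mutate = rng.random(n_genes) < mutation_rate
            child = np.where(mutate, rng.random(n_genes), child)  # random mutation
            children.append(child)
        pop = np.vstack([elite, np.array(children)])
    scores = np.array([fitness(ind) for ind in pop])
    best = int(np.argmax(scores))
    return pop[best], scores[best]

def toy_fitness(genes):
    """Hypothetical stand-in for a laser figure of merit (peak at `target`)."""
    target = np.array([0.2, 0.7, 0.5, 0.9])
    return -np.sum((genes - target) ** 2)

best_genes, best_score = genetic_search(toy_fitness, n_genes=4)
```

In an experiment, the four genes might correspond to control voltages of a polarization controller and `toy_fitness` would be replaced by a measurement. Because the elite individuals are copied unchanged, the best score can never decrease from one generation to the next.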
[Box 1 figure. a, Genetic algorithm (real time): initial population, fitness score, selection of parents based on their fitness, reproduction by crossover, random mutation of a gene, new generation, with elitism cloning the best individuals to the next generation. b, Feed-forward neural network: input layer, hidden layers, output layer. c, Convolutional neural network (pre-trained): convolution, pooling, convolution, pooling, flatten. d, Unsupervised learning (real time). e, Recurrent neural network (pre-trained): input layer x_t, hidden state h_t, output layer y_t. f, Reservoir computing (pre-trained): input layer x(t), input weights W_in, reservoir W, readout weights W_out, output layer y(t).]

Widespread and promising machine-learning architectures for ultrafast photonics. a, Genetic algorithm. b, Feed-forward neural network.
c, Convolutional neural network. d, Unsupervised learning. e, Recurrent neural network. f, Reservoir computing. The different algorithms can
be used as indicated: in pre-training before being applied to a particular experimental system, for real-time optimization and tuning, or a
combination of both where the algorithm is pre-trained and subsequently updated during system operation.
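To make the single-step training of reservoir computing concrete, the following sketch trains only the readout weights of a small random reservoir by regularized least squares. It is a hedged illustration rather than a method taken from the cited work: the reservoir size, the weight ranges, the spectral-radius rescaling to 0.9, the ridge parameter and the toy task (predicting the next sample of a sine wave) are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n_res = 100                                          # assumed reservoir size
W_in = rng.uniform(-0.5, 0.5, size=(n_res, 1))       # fixed random input weights
W = rng.uniform(-0.5, 0.5, size=(n_res, n_res))      # fixed random recurrent weights
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))      # rescale spectral radius to 0.9

def run_reservoir(u):
    """Drive the reservoir with a 1D sequence u and record its states."""
    h = np.zeros(n_res)
    states = np.empty((len(u), n_res))
    for t, u_t in enumerate(u):
        h = np.tanh(W_in[:, 0] * u_t + W @ h)        # untrained recurrent update
        states[t] = h
    return states

# Toy task: predict the next sample of a sine wave.
time = np.arange(1200) * 0.1
signal = np.sin(time)
u, y = signal[:-1], signal[1:]
states = run_reservoir(u)

washout, n_train = 100, 900                          # discard the initial transient
X_train, y_train = states[washout:n_train], y[washout:n_train]

# Training: one regularized linear solve for the readout weights W_out.
ridge = 1e-6
W_out = np.linalg.solve(X_train.T @ X_train + ridge * np.eye(n_res),
                        X_train.T @ y_train)

y_pred = states[n_train:] @ W_out
test_rmse = np.sqrt(np.mean((y_pred - y[n_train:]) ** 2))
```

Only `W_out` is fitted; `W_in` and `W` stay at their random initial values, which is what reduces training to a single linear-algebra step.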

provide more targeted control. Unfortunately, while models based on nonlinear Schrödinger-like equations (NLSE) are generally able to reproduce experimental characteristics qualitatively, quantitative comparison with experiments remains challenging. This is because accurate modelling necessitates the knowledge of a wide range of parameters that are not readily accessible in practice (for example, the random birefringence in the fibre). Ultrafast lasers are also stochastic systems and the impact of noise can generally be reproduced via only computationally intensive Monte Carlo simulations that require the analysis of a very large amount of data. One can anticipate that the use of machine-learning techniques for pattern recognition combined with the latest advances in real-time measurement techniques40,41 could lead to better understanding of ultrafast laser dynamics, allowing for the construction of laser systems with improved robustness.

Control of coherent dynamics. In addition to directly controlling laser emission as described above, there is widespread use of extra-cavity shaping technology to modify the characteristics of ultrashort pulses and other light sources used in particular applications. Because such optimization can involve multiple parameters that are interconnected in complex ways, this is an area where machine learning can clearly surpass other forms of manual or partially automatized control.


Box 2 | General considerations when applying machine-learning models

Choosing an architecture and associated parameters. Neural networks are universal function approximators whose performance significantly depends on their hyperparameters (variables that determine the network structure and training). Selecting the optimum architecture (Fig. 1 and Box 1) and tuning the hyperparameters often involves significant heuristics, exhaustive scans, trial and error, and leveraged optimization tools (genetic algorithms99,100 or Bayesian methods101,102). Nevertheless, one may consider the following guidelines to select an appropriate architecture and hyperparameters: a feed-forward neural network is a good choice if the map from input to output lacks temporal context. This is typically the case when one considers input–output mappings of ‘single pass’ systems such as pulses undergoing nonlinear propagation, where fluctuations are expected to be independent and uncorrelated, and also for particular classes of similarly (partially) uncorrelated instabilities in Q-switched lasers. If data contain structure along a particular input dimension (for example, space, time or wavelength), architectures including filters such as convolutional neural networks are better candidates; one may employ fully connected topologies for input data apparently lacking such features. If the output is expected to depend on current and past input data, recurrent topologies (long short-term memory, gated recurrent units or reservoir computing) should be used.

Accuracy generally increases with the number of hidden layers or nodes. The number of layers, nodes and training epochs can be increased until the validation error starts increasing (even if the training error still decreases). Note that too many nodes can lead to overfitting and reduce generalization (the ability of a trained model to adapt accurately to data outside the initial training dataset). Continuously reducing the number of nodes for deeper layers is a common strategy to improve generalization, and two to three hidden layers comprising 50 to 1,000 nodes seem sufficient for most tasks in ultrafast photonics. A neural network’s inference quality is quantified by a cost function such as mean-squared or root-mean-squared error. The root-mean-squared error penalizes small divergences more heavily and can be employed when fast and accurate convergence is essential. Network weights are typically initialized randomly, and popular activation functions are the rectified linear unit and the sigmoid nonlinearity. The rectified linear unit is computationally less expensive and avoids vanishing gradients, while the sigmoid’s upper limit makes blowing-up solutions less likely.

Selecting training data. There is generally no one-size-fits-all criterion to determine the volume of training data needed for a specific network and task. Where possible, one can be guided by available examples of comparable problems, and more generally, an initial guess can be obtained by considering the number of classes (output neurons), relevant input features (for example, optical modes) and parameters of the underlying model. One can then continuously increase the volume of training data until the validation error stagnates. The training data should be representative of the system’s possible states, and therefore sample uniformly the system’s phase space. This can be challenging, especially for ultrafast nonlinear systems, which may rarely visit specific outlier regions (so-called skewed dataset), and can lead to degraded performance in testing. Feeding representative datasets is also not always possible during experiments, and data augmentation via simulation is an alternative approach. It is also important to normalize training data to the ‘useful’ range of the neurons’ nonlinear response (around unity) to prevent the network operating in the linear or saturated regime.

Avoiding overfitting. Unlike in genetic algorithms, overfitting can occur in neural networks, typically when the testing error is large compared with the training error. The risk of overfitting may be reduced using the following strategies: simplification to reduce the network complexity; data augmentation by increasing the fraction of noisy data during training; cross-validation where division of data into training and testing sets is varied during training; early stopping where training is stopped when the testing error starts increasing; regularization by including penalties in the system’s loss function; drop-out by randomly removing individual connections during training.

Robustness and transfer learning. Ultrafast photonics systems are generally sensitive to their environment. Enabling stable and robust operation is another key objective for machine learning. Performance degradation upon a change of environmental conditions will mostly depend on the parameter space and regimes explored during training and testing. It is therefore important to include training data that incorporate possible environmental variations (see also ‘Selecting training data’). Using unsupervised learning to determine the dynamic relation between external conditions and system output is another approach.
A related question is ‘transfer learning’, or how a neural network architecture optimized for a particular system can be ‘transferred’ to a different yet related problem. In particular, the output of an ultrafast system can be divided into different regimes depending on the system parameters. This is particularly true for mode-locked laser pulses, which typically correspond to fundamental solitons, dissipative solitons or periodic breathers depending on the laser dispersion, nonlinearity, gain, loss and filtering. Transfer learning may then use training data generated with simplified mathematical models103 or experiments with reduced complexity. In fact, transfer learning is in itself an important topic of machine-learning research and from that point of view, ultrafast photonic devices could be ideal testbeds for investigating transfer learning problems in general.
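The early-stopping strategy listed in Box 2 can be illustrated with a short sketch in which training halts once the validation error has stopped improving and the best weights seen so far are kept. The example is deliberately simple and rests on assumptions that are not part of the original text: a polynomial model trained by plain gradient descent stands in for a neural network, and the toy data, learning rate and `patience` value are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 60)
y = np.sin(3 * x) + 0.2 * rng.normal(size=x.size)   # noisy toy data
X = np.vander(x, 10, increasing=True)               # degree-9 polynomial features
X_train, y_train = X[:40], y[:40]                   # training split
X_val, y_val = X[40:], y[40:]                       # validation split

def mse(w, X, y):
    """Mean-squared error of the linear model X @ w."""
    return float(np.mean((X @ w - y) ** 2))

w = np.zeros(X.shape[1])
best_w, best_val, best_epoch = w.copy(), mse(w, X_val, y_val), 0
initial_val = best_val
patience, lr, max_epochs = 200, 0.05, 20000         # assumed settings

for epoch in range(1, max_epochs + 1):
    grad = 2 * X_train.T @ (X_train @ w - y_train) / len(y_train)
    w -= lr * grad                                   # gradient-descent step
    val = mse(w, X_val, y_val)
    if val < best_val:
        best_w, best_val, best_epoch = w.copy(), val, epoch   # keep best weights
    elif epoch - best_epoch >= patience:
        break                                        # validation error stopped improving
```

The model returned to the user is `best_w`, not the final `w`, so any weights reached after the validation error started to rise are discarded.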

For example, pulse compression to a transform-limited duration is essential to femtosecond spectroscopy that uses few-cycle laser pulses to probe physical or chemical interactions. Recently, it has been shown how an adaptive neural network algorithm can control a pulse shaper and accelerate significantly the compression implementation with a convergence speed 100 times faster than that obtained using more conventional evolutionary algorithms (Fig. 3a)42. Similarly, a neural network was used to determine and optimize the parameters of a pulse-shaping system composed of a series of dispersive and nonlinear fibre elements to generate arbitrary pulse waveforms (parabolic, triangular or rectangular) of desired duration and chirp43.
Genetic algorithms can also be used for these purposes, and their application to solve highly nonlinear optimization problems such as fibre supercontinuum generation has also been very successful44–47. Using custom pulse-train preparation via an integrated pulse-splitter, a genetic algorithm was used to optimize supercontinuum dynamics to maximize spectral intensity in specific wavelength bands47 (Fig. 3b). In another study, it has been shown how Gaussian-like peaks could be generated at desired wavelengths in a supercontinuum spectrum using a genetic algorithm to tailor the spectral phase of the incident ultrashort pulses46. Genetic algorithms have also been applied to the design


Review Article | FOCUS NATuRe PhOTOnics

Table 1 | Comparison of machine-learning tuning approaches in ultrafast fibre lasers

Laser system | Control element(s) | Fitness function(s) | Type of algorithm(s) | Targeted regime/parameters | Advantages | Disadvantages | Speed
NPE fibre laser38,39,41 | Electrical polarization controller | Different for different regimes | Rosenbrock search algorithm, random collision recovery, genetic algorithm | Fundamental and harmonic mode locking, Q-switching and Q-switched mode locking | Versatile, real time, various regimes of operation | Limitations of real-time techniques to detect all classes of laser instability | Average mode-locking time of a few seconds, subsecond recovery time
Figure-of-eight laser40 | Pump diode powers | Pulse (autocorrelation) duration based on nonlinear fibre DFT measurements | Feed-forward neural network, XGBoost, linear regression | Replace time domain comb, radio-frequency spectrum and DFT measurements by a single measurement tool | Real-time multiparameter monitoring with a single oscilloscope | Requires a large number of measured parameters | Not available
Mode-locked fibre laser30 | Waveplates, polarizer | Pulse energy divided by spectral kurtosis of the waveform | Recurrent neural network, variational autoencoder with latent variable mapping (feed-forward neural network) | Stable mode locking | Fast recovery from changes in the fibre birefringence | Complex and rather slow training process | Numerical results
NPE fibre laser35 | Liquid-crystal-based electrical polarization controller | Radio-frequency power at expected repetition rate, spectral similarity and output power | Genetic algorithm | Stable mode locking | Output spectra can be tuned | Only fundamental mode locking | Initial mode-locking time of 90 s, 30 s recovery time
Ring fibre laser34 | Electronic polarization controller, pump | Centre wavelength and repetition rate | Genetic algorithm | Stable and tunable Q-switching | Tunable centre wavelength and repetition rate | Limited tuning range of around 20 nm | Not available
NPE fibre laser32 | Polarization controller | Modified amplitude of the nth harmonic in radio-frequency spectrum | Evolutionary algorithm | Harmonic mode-locking regime with anomalous dispersion | Optimized for high-harmonic mode locking | Slow convergence | Harmonic mode-locking time of 2 h
Figure-of-eight laser33 | Electronic polarization controller, pump power | Peak power, maximized radio-frequency signal at fundamental frequency, and spectral bandwidth | Genetic algorithm | Anomalous dispersion with NALM for stable single-pulse mode locking | High contrast between stable and unstable pulsing regimes | Complex fitness function, slow convergence | ~30 min
NPE fibre laser31 | Electrical polarization controller | Second-harmonic power for anomalous dispersion operation, intensity of FSR component for normal dispersion | Evolutionary algorithm | Q-switched mode locking and stable mode locking | Two regimes of operation | Slow convergence | ~30 min
Mode-locked fibre laser28,29 | Polarizer, waveplates | Pulse energy divided by spectral kurtosis of waveform | Toroidal search algorithm and singular value decomposition, sparse search algorithm, extremum-seeking control | Stable mode locking | Library of identified birefringence states can be used for fast identification of unknown birefringence and optimal controller | Library of all possible birefringence states must be built | Numerical results, few to tens of minutes to build the library
NPE fibre laser27 | Waveplates, polarizers, amplifier and gain | Pulse energy of single pulse solution | Genetic algorithm | High-pulse-energy mode locking without multipulsing | Simple fitness function | Requires complex polarization control | Numerical results
DFT, dispersive Fourier transform; FSR, free spectral range; NALM, nonlinear amplifying loop mirror.
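One of the fitness functions listed in Table 1 is the pulse energy divided by the spectral kurtosis of the waveform. The table does not give the precise definition used in refs 28–30, so the sketch below shows only one plausible reading, offered as an assumption: the normalized spectral intensity is treated as a distribution over frequency and its kurtosis is taken as the fourth central moment divided by the squared variance. The function names `spectral_kurtosis` and `fitness`, and the Gaussian test pulse, are hypothetical.

```python
import numpy as np

def spectral_kurtosis(field, dt):
    """Kurtosis of the spectral intensity, treated as a distribution over frequency."""
    spectrum = np.abs(np.fft.fftshift(np.fft.fft(field))) ** 2
    freq = np.fft.fftshift(np.fft.fftfreq(field.size, d=dt))
    p = spectrum / spectrum.sum()                    # normalized spectral weights
    mean = np.sum(freq * p)
    var = np.sum((freq - mean) ** 2 * p)
    return np.sum((freq - mean) ** 4 * p) / var ** 2

def fitness(field, dt):
    """Assumed objective: pulse energy divided by spectral kurtosis."""
    energy = np.sum(np.abs(field) ** 2) * dt
    return energy / spectral_kurtosis(field, dt)

# Example: a Gaussian pulse has a Gaussian spectrum, whose kurtosis is close to 3.
t = np.linspace(-20, 20, 2048)
dt = t[1] - t[0]
gaussian = np.exp(-t ** 2 / 2)
score = fitness(gaussian, dt)
```

With this definition, a spectrum with heavy tails or many scattered components has a large kurtosis and therefore a low score, while an energetic pulse with a compact spectrum scores highly.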

of fibres with optimized dispersion and nonlinearity coefficient to maximize the bandwidth of the coherent supercontinuum in the mid-infrared44.

Ultrafast characterization. A central element in the application of machine learning to tune an ultrafast laser is the feedback loop coupling the emitted pulses with the laser cavity parameters. Although


some success has been obtained through optimization based on measurements of pulse spectra or temporal
autocorrelation functions, ideally a feedback signal based on more complete pulse measurements would be desirable.
However, such complete pulse characterization on femtosecond and picosecond timescales generally requires complex
optical systems, and the retrieval of the field parameters is an inverse problem that can be particularly
time-consuming to solve48.
Recently, deep neural networks have found applications in solving such inverse problems in areas such as coherent
imaging49,50, imaging through scattering media51,52 or super-resolution53, and they are now also showing great
promise in pulse reconstruction. The first attempt to apply a neural network to reconstruct a short pulse actually
dates back to the mid-1990s and the first development of frequency-resolved optical gating (FROG)54, although this
approach was limited by strong assumptions about the functional form of the pulse being retrieved. In other work,
genetic algorithms have also been successfully applied to FROG trace retrieval55,56, but pulse retrieval still took
several minutes. More recently, a convolutional network trained on simulated data was used to reconstruct pulses
from experimental FROG traces and was shown to be superior to conventional methods even in the presence of high
noise (Fig. 3c)57. Additional studies have employed convolutional networks to reconstruct pulses from dispersion
scan traces58, or from multimode fibre nonlinear speckle measurements59. Phase recovery for image
reconstruction60–63 and X-ray pulse characterization64,65 are also among the important emerging and growing areas
of application of machine-learning techniques.

Complex dynamics and transient instabilities
In this section, we review the application of machine learning to the control and characterization of ultrafast
propagation dynamics.

Hidden physics models. The application of machine learning to derive predictive models from sparse or noisy
measurements has now penetrated research into the study of the basic properties of physical systems. In particular,
a new field of ‘hidden physics models’ has arisen where closed-form mathematical models or nonlinear differential
equations governing a physical system66 are identified automatically by analysing samples of the dynamical data
using ‘physics-informed neural networks’. In some cases, the form of the governing equation(s) may be known or
assumed in advance, and the goal is to extract only the unknown coefficients67. Alternatively, one can combine a
neural network with a compressed sensing-like method to identify only the active terms of the equation(s) from a
basis of candidate nonlinear functions68.
Using these approaches, a number of applications in ultrafast photonics have been demonstrated to analyse pulse
propagation dynamics in optical fibre or in fibre lasers associated with the generation of localized and dissipative
soliton structures (Fig. 3d)67. Model-free approaches in the form of reservoir computing (unlike physics-informed
neural networks) have also been implemented to predict coherent dynamics in particular cases of soliton-like
propagation (Fig. 3d)69. At present, however, such work has been based on numerical data only; the next step in
this field is clearly to uncover the governing models from experimental datasets.
Another important area of work involves the study of temporal dependencies observed in nonlinear pulse propagation
dynamics, where the temporal and spectral intensity profiles at a specific time instant or propagation length
depend on the intensity profiles at earlier times or distances. Recurrent neural networks with internal memory
(which are traditionally used for processing and predictions of time series) are particularly well suited to
modelling this type of dynamic behaviour. Indeed, very recent results exploiting the memory capacity of recurrent
neural networks show how a recurrent neural network with a long short-term memory cell architecture can accurately
predict the nonlinear propagation dynamics of short pulses for a wide range of scenarios from higher-order soliton
compression (where comparison was made with experiment) to octave-spanning supercontinuum generation70. In addition
to these studies of single-pass nonlinear propagation dynamics, there is clear potential to use recurrent neural
networks in predictions of the complex multiscale intermittence dynamics also seen in optical fibre lasers71.

Chaotic systems and instabilities. Chaotic modulation instability in NLSE-like systems is one of the most
fundamental examples of instability in optics, with analogues in many other physical systems. Indeed, the study of
how incoherent noise can ‘self-organize’ within the NLSE to yield coherent breather structures has attracted wide
interest, specifically because of possible links with rogue waves and extreme events72. However, the complexity of
the measurement techniques needed to directly capture such chaotic breathers on ultrafast timescales has imposed
severe constraints on the dynamical regimes that can be explored in experiments73,74.
Machine learning has been used to address this problem directly by training a neural network to determine the
temporal characteristics of a chaotic field based on only the spectral intensity characteristics (which are easier
to measure). Using numerical data generated from NLSE simulations, a neural network was used to construct a
nonlinear transfer function that maps noisy broadband spectra to the local intensity maximum of the chaotic
temporal field (Fig. 3e). This function was then applied to experimental data measured using a high-dynamic-range
real-time spectrometer75. A similar approach was recently used to determine the peak power, duration and temporal
delay of extreme rogue solitons in noisy supercontinuum generation76. In addition, in the analysis of chaotic data
from modulation instability, unsupervised clustering using the k-means algorithm was shown to successfully sort
intensity spectra into subclasses associated in the time domain with specific solutions of the NLSE related to
analytic soliton structures75.
The application of machine-learning techniques has been extended to even more complex systems such as those
observed in transient laser behaviour and extreme events77. Specifically, using the knowledge of previous pulses in
a chaotic time series from an optically injected semiconductor laser, machine-learning methods

Fig. 3 | Machine-learning applications in ultrafast photonics. a, Pulse compression. Left: optimization procedure. Middle: convergence comparison
between neural network and evolutionary algorithm. Right: compressed pulse FROG. b, Controlled nonlinear propagation. Left: schematic. Right:
examples of customized supercontinuum spectra. Pin is the average power of the optimized pulse train leading to maximum spectral intensity at
selected wavelengths corresponding to the blue shaded regions. c, Pulse reconstruction using a convolutional neural network. Left: architecture. Middle:
reconstructed FROG. Right: reconstructed pulse. d, NLSE solution using a neural network. Left: pulse evolution (top) and comparison of predicted and
exact solutions (bottom) at three particular points (dashed lines). Right: Kuznetsov–Ma (left) and Akhmediev breather (right) dynamics showing expected
evolution (top), predicted evolution (middle) and relative difference (bottom). All colour bars are in normalized units. e, Modulation instability. Left:
simulated spectra (network input; left) and temporal profiles (network output; right). Middle: network schematic for correlation of spectral and temporal
characteristics. Right: probability density function (PDF) of predicted temporal intensity based on experimental spectra (dashed red line) compared with
simulated PDF (blue line). Figure adapted with permission from: a, ref. 42, c, ref. 57, OSA; b, ref. 47, d, right, ref. 69, e, ref. 75, under a Creative Commons
licence; d, left, ref. 67, Elsevier.
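The idea behind the hidden physics models of panel d, identifying only the active terms of a governing equation from a basis of candidate nonlinear functions, can be sketched with a sparse regression. This is a simplified stand-in, not the method of the cited works: it uses sequentially thresholded least squares with no neural network, the data are synthetic, and the model (Kerr phase rotation plus linear loss), the values of `gamma` and `alpha`, the candidate library and the threshold are all assumptions chosen for illustration.

```python
import numpy as np

def stlsq(theta, target, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    xi = np.linalg.lstsq(theta, target, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        active = ~small
        if not active.any():
            break
        # Refit using only the terms that survived the threshold.
        xi[active] = np.linalg.lstsq(theta[:, active], target, rcond=None)[0]
    return xi

rng = np.random.default_rng(1)
n = 300
# Synthetic complex field samples with varying amplitude and random phase.
A = rng.uniform(0.2, 2.0, n) * np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, n))

# Assumed "true" model: dA/dz = i*gamma*|A|^2*A - (alpha/2)*A, plus small noise.
gamma, alpha = 1.3, 0.2
dA_dz = 1j * gamma * np.abs(A) ** 2 * A - 0.5 * alpha * A
dA_dz += 1e-3 * (rng.normal(size=n) + 1j * rng.normal(size=n))

# Candidate library: two terms are present in the model, two are not.
names = ["A", "|A|^2 A", "|A|^4 A", "conj(A)"]
theta = np.column_stack([A, np.abs(A) ** 2 * A, np.abs(A) ** 4 * A, np.conj(A)])

xi = stlsq(theta, dA_dz)
active_terms = [name for name, c in zip(names, xi) if c != 0]
```

With these settings the regression keeps only the `A` and `|A|^2 A` terms, with coefficients close to `-alpha/2` and `1j*gamma`. Real measurements would require estimating the derivative from sampled data, which is where noise sensitivity becomes the main difficulty.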


(nearest neighbours, support vector machine, feed-forward neural network and reservoir computing) were analysed for
their ability to predict the intensity of upcoming pulses emitted from the laser77,78. Although this work was
numerical, it clearly shows the potential of such prediction for experiment. Attempts have also been made to model
highly incoherent system evolution, including multidimensional spatiotemporal systems79, but the predictions in
this case tend to diverge over longer distances80.

Multidimensional systems. A major benefit of neural networks is their ability to efficiently analyse the properties
of multidimensional systems. This can be particularly useful in multimode fibre

[Fig. 3 graphic. Panel group "Control of coherent dynamics and ultrafast characterization": a, Pulse compression; b, Coherent control of nonlinear dynamics; c, Reconstruction of ultrashort pulses. Panel group "Complex dynamics and transient instabilities": d, Hidden physics model; e, Transient instabilities.]


systems where spatiotemporal coupling dramatically increases the parameter space and complexity of nonlinear
propagation dynamics. The potential of machine learning in this case was recently demonstrated with experiments
tailoring supercontinuum generation in a graded-index fibre through control of the injected spatial beam profile
via a neural-network-driven spatial light modulator81.
Extension to spatial control for enhanced near-field interactions was also shown by combining a neural network with
a genetic algorithm to optimize spectral-phase shaping of an incident field to achieve second harmonic generation
hotspot switching in plasmonic nanoantennas82. In this latter work, the genetic algorithm was added to generate a
wide range of nanoantenna designs to be fed into the neural network.

Outlook and challenges
Ultrafast photonic systems are generally very complex, often nonlinear, and with dynamics extremely sensitive to
both their internal parameters and external perturbations. The design and optimization of these systems has
typically been based on physical models, numerical simulations and trial-and-error approaches. With the increased
complexity of these systems, driven by the demand for high stability, robustness against disturbances, tunability
and adaptive control, these approaches are now starting to reach their limits such that future major advances will
require new methodologies that can analyse the system characteristics at a global level. One may therefore
anticipate that machine-learning techniques able to discover hidden features and independently adapt as they are
exposed to new data are likely to play a central role in the next generation of ultrafast systems and applications.
There are of course many ways machine-learning techniques can be exploited, and we discuss below some possible
future directions of research and challenges to overcome.
Ultrafast fibre lasers are dynamical systems operating in regimes determined by dispersion, nonlinearity, gain,
losses and saturation effects. Optimization, breakthrough performance, high stability against perturbations and
automatic tuning require in-depth understanding of the full system parameter space, which can be achieved by
combining accurate real-time characterization and advanced data analysis. Machine-learning-based approaches have
the potential to reduce the complexity and number of measurement devices typically required. They could further
allow for converting results of measurements into a higher-dimensional space where the separation of the role
played by the different cavity elements is more apparent, aiding the construction of universal models. Machine
learning may also yield substantial developments in full and high-speed characterization of short pulses or complex
fields arising from highly nonlinear dynamics. Adaptive optics and coherent control typically rely on ultrafast
laser systems where the spatial, temporal and spectral properties of the laser beam are central to optimum
performance in, for example, metrology83, spectroscopy84,85, energy harvesting86 or astronomy87. By enabling more
systematic strategies rather than heuristic approaches (for example, in the optimization of multidimensional
systems including beam shaping and spacetime focusing in multimode fibres88–90), machine learning could enable an
unprecedented level of control in those applications. Another important area where we expect machine learning to
lead to substantial progress is the discovery of models using data-driven strategies to identify governing
mathematical equations of complex optical phenomena or photonic systems. It is even conceivable that in the future,
ultrafast fibre lasers could become testbeds for the physics discovered from machine learning.
So far, most machine-learning applications to ultrafast photonics have been based on genetic algorithms or
feed-forward architectures. While these implementations have undoubtedly led to remarkable and pioneering results,
there are still important approaches that have yet to be fully exploited. Indeed, it is likely that realizing the
full potential of machine learning will necessitate the combination of several strategies that have so far been
used only separately. For example, recurrent networks based on long short-term memory cells, gated recurrent units
or reservoir computing that possess internal memory can be used to model dynamical systems consisting of time
series of different states. These approaches could enable substantial progress in understanding and optimizing
nonlinear systems, allowing identification of long-term dependencies and internal dynamics in ultrafast lasers, or
the prediction of complex evolution maps associated with the propagation of short pulses in nonlinear media and
related instabilities. Also, the capabilities of unsupervised learning to draw inferences and reveal hidden internal
structures from datasets without labelled responses could be of significant interest in problems where
dimensionality reduction is key. These include, for example, multimodal systems or noise-sensitive dynamics where
specific regimes can be divided into a number of different clusters associated with measurable parameter(s).
Moreover, approaches employed for the design of nanophotonic components in the form of machine learning combined
with the adjoint method91 could be a powerful tool for the inverse design of ultrafast photonics systems. The
concept of generative adversarial networks92 where two distinct networks are optimized in the backpropagation
operation93 is another promising avenue to explore in ultrafast photonics.
There are of course important challenges ahead. When using a recurrent network to analyse and predict dynamics,
proper sampling along the evolution dimension (time or distance) is essential to extract and reproduce the
long-term evolution structure. Memory limitations can then become an issue, especially in the context of lasers
where it usually takes many cavity round trips for a regime to stabilize. Unsupervised learning analysis divides
the data into subsets with similarities, but crucial information on the criterion used to perform the division, or
on what the similarities actually are within the clusters, is lacking. This means that to fully exploit the power
of unsupervised learning, further human investigation is generally needed to establish the link between the
clusters and specific parameters of the system analysed. This can be a limiting factor, especially for the case of
noise-sensitive systems where tiny variations can result in dramatically different evolution patterns.
The use of machine-learning algorithms for real-time processing of photonic systems that can produce data in excess
of billions of bits per second requires the ability to manage high data volumes, as well as a hardware framework
capable of dealing with ultrafast processing rates. To reduce the large volume of data, one could use the approach
of spike-based neural networks that can reconstruct features of spatiotemporal states based on analysing only a
subset of the measured data. Inspired by the human brain, which strongly compresses the information received from
the eye94, spike-based neural networks use a specific set of rules such as spike-timing-dependent plasticity
leading to self-organization of the network’s topology and allowing identification of possible correlations in the
input data. When combined with lateral inhibition (a spike-based form of a winner-take-all topology), spike-based
neural networks can self-configure to perform a cluster analysis with performance similar to that achieved with a
k-means algorithm95. Efforts to develop a hardware framework allowing for high-speed processing and optimization on
short timescales have already been made, and several all-optical network architectures have been proposed based on,
for example, multiple layers of diffractive surfaces where each point on a given layer acts as a node96, or optical
matrix multiplication using a cascaded array of Mach–Zehnder interferometers integrated into a silicon photonic
circuit97. Another promising approach could be to combine all-optical field-programmable gate arrays and fully
parallel photonic neural network hardware. Of course, one important constraint to the development of all-optical
neural networks that needs to be carefully studied is the tolerance to photonic component fabrication
imperfections98.
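The unsupervised clustering discussed in this section can be illustrated with a minimal k-means sketch in plain NumPy. The data are synthetic stand-ins for measured intensity spectra (two noisy Gaussian line shapes on an arbitrary wavelength axis), and the function name `kmeans`, the farthest-point initialization and all numerical values are assumptions made for the example rather than details taken from the cited studies.

```python
import numpy as np

def kmeans(data, k, n_iter=50):
    """Lloyd's algorithm with a deterministic farthest-point initialization."""
    centroids = [data[0]]
    for _ in range(1, k):
        # Squared distance of every sample to its nearest existing centroid.
        d = np.min([np.sum((data - c) ** 2, axis=1) for c in centroids], axis=0)
        centroids.append(data[np.argmax(d)])
    centroids = np.array(centroids, dtype=float)
    labels = np.zeros(len(data), dtype=int)
    for _ in range(n_iter):
        dist = np.sum((data[:, None, :] - centroids[None, :, :]) ** 2, axis=2)
        labels = np.argmin(dist, axis=1)
        # Update each centroid; keep the old one if its cluster is empty.
        new = np.array([data[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

rng = np.random.default_rng(0)
wavelength = np.linspace(-1.0, 1.0, 64)           # arbitrary units

def noisy_spectra(centre, count):
    """Hypothetical spectra: a Gaussian line plus additive noise."""
    clean = np.exp(-((wavelength - centre) / 0.2) ** 2)
    return clean + 0.05 * rng.normal(size=(count, wavelength.size))

data = np.vstack([noisy_spectra(-0.4, 30), noisy_spectra(0.4, 30)])
labels, centroids = kmeans(data, k=2)
```

The returned `labels` only say which spectra belong together. As noted above, relating each cluster to a physical parameter (here, the line centre) still requires inspecting the centroids or the underlying data.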


