Professional Documents
Culture Documents
Quantification of Human Sperm Concentration Using Machine Learning-Based Spectrophotometry
Quantification of Human Sperm Concentration Using Machine Learning-Based Spectrophotometry
learning-based spectrophotometry
Ali Lesani1, Somaieh Kazemnejad2, Mahdi Moghimi Zand1,*, Mojtaba Azadi3, Hassan Jafari4,
Mohammad R.K. Mofrad5, Reza Nosrati6,*
1 Small Medical Devices, BioMEMS and LoC Lab, School of Mechanical Engineering, College of Engineering, University of Tehran, Iran
2 Reproductive Biotechnology Research Center, Avicenna Research Institute, ACECR, Tehran, Iran
3 School of Engineering, San Francisco State University, US
4 Department of Biostatistics and Health Informatics, Institute of Psychology Psychiatry and Neuroscience, King's College London, United Kingdom
5 Departments of Mechanical & Bioengineering, UC Berkeley, Berkeley, US
6 Department of Mechanical & Aerospace Engineering, Monash University, Clayton, Victoria 3800, Australia
* Corresponding authors
M. Moghimi Zand. Small Medical Devices, BioMEMS and LoC Lab, School of Mechanical Engineering, College of Engineering, University of Tehran, Iran
Email: ahdimoghimi@ut.ac.ir
R. Nosrati. Department of Mechanical and Aerospace Engineering, Monash University, VIC 3800, Australia.
Email: Reza.Nosrati@monash.edu
ABSTRACT
Spectrophotometry is an indirect non-invasive and quantitative method for specifying materials with unknown contents based
on absorption behavior. This paper presents the first application of artificial neural network in spectrophotometry for
quantification of human sperm concentration. A well-trained full spectrum neural network (FSNN) model is developed by
examining the absorption response of sperm samples from 41 human subjects to different light spectra (wavelength from 390 to
1100 nm). It is shown that this FSNN accurately estimates sperm concentration based on the full absorption spectrum with over
93% prediction accuracy, and provides 100% agreement with clinical assessments in differentiating the samples of healthy donor
from patient samples. We suggest the machine learning-based spectrophotometry approach with the trained FSNN model as a
rapid, low-cost, and powerful technique to quantify sperm concentration. The performance of this technique is superior to
available spectrophotometry methods currently used for semen analysis and will provide novel research and clinical opportunities
for tackling male infertility.
KEYWORDS
Semen analysis, Sperm concentration, Spectrophotometry, Artificial Neural Network
Figure 1. (a) UV-Visible spectrophotometry experimental setup. (b) A representative absorption spectrum for a tested semen specimen,
demonstrating 5 absorption peaks at 390, 502, 615, 929, and 1075 nm. (c) The effect of sample dilution on absorption intensity for a semen
specimen with the initial raw concentration of 70 Million/ml. Absorption pattern is identical and only lower absorption magnitudes were
recorded.
Where Conci is sperm concentration for sample
number i. Purelin (linear, P), Logsig (log-sigmoid, L) and
Tansig (Hyperbolic tangent sigmoid, T) were used as transfer
functions, and Levenberge-Marquardt (LM) and Bayesian
Regularization (BR) were used as the training methods to
characterize the performance of the models (representative
RSME values for SPNN model is shown in Figure S1 in the
Supplementary Material). The optimum number of neurons in
the first and second hidden layers (between 1 to 20) was also
optimized through assessing 400 ANN structures to achieve
the lowest RMSE for best prediction accuracy and avoiding
local minimums, as listed in Table S2 in the Supplementary
Material.
Figure 3. Absorption spectra for human semen specimens of different concentrations. (a) Absorption spectra for 9 semen specimens ranging
in sperm concentration from 4 M/ml to 70 M/ml and in motility from 10% to 65%. Sample with higher sperm concentrations (>30 M/ml)
exhibit higher absorption intensity with distinct peaks at 929 nm and 1075 nm. (b) Absorption spectra for semen specimens of different
sperm concentrations but with fixed motility of 50%. Dashed lines indicate peak positions.
Figure 4. (a) Calibration curve for sperm concentration based on
Figure 5. Sperm concentration from clinical measurements
the absorption peak at 1075 nm. (b) Clinically measured sperm
versus values predicted from (a) the SPNN model based on the
concentration values versus predicted sperm concentration from
absorption peaks, and (b) the FSNN model based on the full
the calibration curve in a.
absorption spectrum. Each data point is the average value of 3
the other 15 previously unseen samples (Table S2 in the independent measurements with error bars as one standard
deviation.
Supplementary Material).
regression model is mainly attributed to the multi-input nature
3.2.1 | Multi input Selected Peak Neural Network (SPNN) of the SPNN model to account for the change in the dominant
model absorption peak for samples of different concentrations
(rather than only using the peak at 1075 nm).
A selected peak neural network (SPNN) model was developed
to estimate sperm concentration based on the five absorption 3.2.2 | Full Spectrum Neural Network (FSNN) model
peaks at 390, 502, 615, 929, and 1075 nm (product of peak
absorptions and corresponding wavelengths were used as the To develop a comprehensive prediction tool, a full spectrum
input, Figure 2a). The SPNN model with Levenberge- neural network (FSNN) model was constructed that estimates
Marquardt (LM) as the training method and with Logsig and sperm concentration based on the full absorption spectrum
Tansig as the transfer functions for the first and second hidden (711 data points per sample, Figure 2b). The minimum RMSE
layers, respectively, resulted in the minimum RMSE of 0.034 of 0.086 was achieved for the FSNN model with Levenberge-
(Figure S1 in the Supplementary Material). This optimal Marquardt (LM) as the training method and with Tansig and
SPNN model maps five input variables (absorption peaks) Logsig as the transfer functions for the first and second hidden
onto a single output (sperm concentration) with 9 neurons in layers respectively. The best FSNN model maps 711 input
first hidden layer and 19 neurons in the second hidden layer variables (absorption values and corresponding wavelengths)
(denoted as 5:9:19:1), listed as structure number 179 in Table onto a single output (sperm concentration) with 12 neurons in
S1 in Supplementary Material. Figure 5a compares the the first hidden layer and 20 neurons in the second hidden
clinically measured sperm concentrations with predicted layer (denoted as 711:12:20:1), listed as structure number 240
values from the SPNN model, for 15 human samples with in Table S1 in Supplementary Material. Figure 5b compares
concentrations ranging from 5 to 65 M/ml (n=15). Results sperm concentrations from clinical measurements with values
from the SPNN model strongly correlated (R2=0.90, P≤0.001) predicted by the FSNN model, for the same 15 human samples
to clinical measurements and were within 14% of the clinical tested with the SPNN model. Sperm concentration from the
values, demonstrating a prediction accuracy of 86%. This 8% FSNN model strongly correlated (R2=0.98, P≤0.0001) with
improvement in prediction accuracy over the simple linear clinical measurements, with predicted values within 7% of the
6
clinical measurements (93% prediction accuracy). The FSNN
model demonstrated over 7% and 15% improvement in
prediction accuracy as compared with the SPNN and simple
regression models, respectively. The FSNN model accurately
estimates sperm concentration by fully accounting for the
non-linear correlation between sperm concentration and
absorption spectrum, including the change in the dominant
absorption peak and the shift in the peak position for samples
of different concentrations. It is noteworthy that, both SPNN
and FSNN models were also trained and evaluated to also
estimate sperm motility from recorded absorption spectra for
clinically tested samples with sperm concentrations ranging
from 4 M/ml to 70 M/ml and sperm motility ranging from
10% to 65% (i.e. ANN models with sperm concentration and
motility as output parameters). However, the models were
incapable of predicting sperm motility and estimated Figure 6. Direct comparison of sperm concentrations from the
concentrations were independent of sperm motility. linear regression, SPNN, and FSNN models against clinically
To evaluate the performance of the developed models to measured values. Dashed line indicates the lower reference limits
estimate sperm concentration, Figure 6 compares sperm of 15 M/ml for human sperm concentration based on the WHO
concentrations from the linear regression, SPNN and FSNN guidelines.
models against clinically measured values, for patient and correlation between absorption spectrum and sperm
donor samples ranging in concentration from 5 to 65 M/ml concentration. Sperm concentration from the FSNN model
(n=15). A lower reference limit of 15 M/ml is defined by strongly correlated (R2=0.98) with clinical measurements and
WHO for sperm concentration (dashed line in Figure 6) [39], were within 7% of the clinical values, demonstrating a
which is used in clinics as a diagnostic threshold for male prediction accuracy of 93% with over 15% improvement over
infertility. While the linear regression and SPNN models the linear regression model. The FSNN model provided 100%
failed to distinguish between donor and patient samples, the agreement with clinical measurements in terms of clinical
FSNN model provided 100% agreement with clinical outcome for patients to accurately distinguish between donor
measurements in terms of clinical outcome for patients. As and patient samples. However, increasing sample
shown in Figure 6, sperm concentration for Sample number 3, representativeness in terms of sperm concentration by
6, 8, 10 and 13 were all below the threshold value of 15 M/ml increasing the sample size, could possibly increase the
based on clinical measurements and the FSNN model, while prediction accuracy of our method and better validate the
the SPNN model only identified Sample number 3 and 13 as model for clinical adoption. Collectively, the FSNN model
patient samples and the linear regression model failed to provides a rapid and powerful tool for quantifying sperm
identify the patient samples. This improvement in prediction concentration, improving on current spectrophotometry
accuracy and distinguishing between donor and patient methods for semen analysis and providing novel opportunities
samples is significant, enabling a spectrophotometry-based for male infertility diagnosis. While increasing the sample
method for semen analysis with clinical outcomes identical to size, can better inform the model for estimating sperm
those of the conventional CASA systems. The FSNN model concentration and further increase the prediction accuracy, the
also demonstrated the best prediction performance based on FSSN model still demonstrates its performance for male
our statistical analysis (Table S2 in the Supplementary infertility diagnosis. Utilizing the novel machine learning-
Material), resulting in lowest RMSE, AIC and BIC compared based spectrophotometric approach is also suggested for
to SPNN and linear regression models. The performance of samples with DNA fragmentation in search for an effective
the FSNN model was tested against three developed Support diagnostic tool.
Vector Regression (SVR) models (see the Supplementary
Material). The FSNN model outperformed the SVR models
by at least 13% in predicting sperm concentration, while the ACKNOWLEDGMENTS
SVR models also fail to distinguish between donor and patient
samples. Taken together, the FSNN model provides a rapid We gratefully acknowledge the support from the Australian
and powerful tool for quantifying sperm concentration using Research Council Discovery Program (DP190100343) and
spectrophotometry. Monash Interdisciplinary Research Grants to R.N. We also
thank Reza Nobakht Hassanlouei for his help with Artificial
4 | CONCLUS ION S Neural Network.
8
Health Organization reference values for human semen characteristics.
Hum. Reprod. Update, 2010. 16(3): p. 231-245.
41. Reed, R.D. and R.J. Marks, II, Neural smithing : supervised learning in
feedforward artificial neural networks. 1999: Cambridge (Mass.) : MIT
press.
42. Aggarwal, K.K., et al., Bayesian Regularization in a Neural Network
Model to Estimate Lines of Code Using Function Points. Journal of
Computer Science, 2005. 1(4).
43. CHAPTER 3 - Errors in Spectrophotometry, in Studies in Analytical
Chemistry, L. Sommer, Editor. 1989, Elsevier. p. 78-94.