Article
Research on a Non-Invasive Hemoglobin Measurement System
Based on Four-Wavelength Photoplethysmography
Zhencheng Chen 1 , Huishan Qin 1 , Wenjun Ge 1 , Shiyong Li 2, * and Yongbo Liang 1, *
1 School of Life and Environmental Sciences, Guilin University of Electronic Technology, Guilin 541004, China
2 School of Electronic Engineering and Automation, Guilin University of Electronic Technology,
Guilin 541004, China
* Correspondence: lishiyong@guet.edu.cn (S.L.); liangyongbo@guet.edu.cn (Y.L.)
Abstract: Hemoglobin is an essential parameter in human blood. This paper proposes a non-
invasive hemoglobin concentration measurement method based on the characteristic parameters
of four-wavelength photoplethysmography (PPG) signals combined with machine learning. The
DCM08 sensor and NRF52840 chip form a data acquisition system used to collect 58 human fingertip
photoelectric volumetric pulse wave signals. A total of 160 four-wavelength PPG signal feature
parameters were constructed and extracted. The feature parameters were screened with three feature
selection methods: reliefF, Chi-square score, and information gain. The top 10, 20, and 30 features
screened were used as input to evaluate the prediction performance of different feature sets for
hemoglobin. The prediction models used were XGBoost, support vector machines, and logistic
regression. The results showed that the best performance was achieved by the XGBoost model with
the 30 features screened using the Chi-square score, with a coefficient of determination (R2) of 0.997,
root mean square error (RMSE) of 0.762 g/L, and mean absolute error (MAE) of 0.325 g/L. The study
showed that the four-wavelength-based PPG signal feature parameters with the XGBoost algorithm
could effectively achieve non-invasive detection of hemoglobin, providing a new measurement
method in clinical practice.
Keywords: photoplethysmography; hemoglobin; feature selection; machine learning
Citation: Chen, Z.; Qin, H.; Ge, W.; Li, S.; Liang, Y. Research on a Non-Invasive Hemoglobin
Measurement System Based on Four-Wavelength Photoplethysmography. Electronics 2023, 12, 1346.
https://doi.org/10.3390/electronics12061346
Academic Editors: Radu Ciorap, Jiri Hozman and Jan Vrba
Received: 13 February 2023
Revised: 6 March 2023
Accepted: 9 March 2023
Published: 12 March 2023
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
Hemoglobin (Hb) is one of the important components of red blood cells. It consists of
four protein subunits called globin chains, each of which contains a central
heme group with an iron atom at its core [1]. Hb is a crucial indicator
of anemia, blood loss, and other body symptoms. The primary function of hemoglobin
is to deliver oxygen to the whole body [2]. According to the World Health Organization
(WHO), an estimated 1.6 billion people, approximately 30% of the total population, are
suffering from anemia. The groups most vulnerable to anemia include pregnant women,
preschool children, and teenagers [3]. The general symptoms of anemia are tiredness,
lethargy, weakness, pale lips, shortness of breath, slippery tongue, increased heart rate,
loss of appetite, and dizziness [4]. Therefore, the detection of Hb is essential for
preventing and diagnosing related diseases.
Current assays for hemoglobin concentration are mainly invasive or minimally
invasive methods, both of which require collecting a blood sample from the subject, which
can be painful for the subject. In addition, they carry a risk of cross-infection, require
professionals to operate, and cannot provide real-time detection. Noninvasive
testing technology offers a better solution to these problems, and currently,
noninvasive hemoglobin testing is mainly based on photoplethysmography (PPG). PPG
is a signal that reflects the state of the cardiovascular system, and has received a
great deal of attention in recent years due to its ease of collection, small sensor
size, and non-invasiveness [5]. The pulse wave contains rich physiological information and is
highly important for monitoring human life and health. Processing the collected PPG signal in
various ways extracts useful physiological information, which is of great significance for
detecting related diseases.
PPG can be used not only to assess hemoglobin levels but also for several other purposes,
such as SpO2 measurement [6], heart rate estimation [7], respiratory rate monitoring [8], continuous blood pressure
measurement [9], sleep assessment [10], and arrhythmia detection [11]. Thus, clinical
monitoring of PPG and hemoglobin parameters provides a timely diagnostic reference
for disease and can be used for subsequent studies on methods for assessing various diseases
and physiological states.
With the development of machine learning, scholars
have conducted much research on non-invasive hemoglobin detection methods based
on machine learning. Kavsaoglu et al. proposed a non-invasive method for predicting
hemoglobin that utilizes features of the PPG signal using classification and regression
trees (CART), least squares regression (LSR), support vector regression (SVR), and eight
other machine learning regression methods. The best results were obtained with the RFS
feature selection method combined with SVR (MSE = −0.0027) [12]. Acharya et al. used a
multi-model stacking regressor that combines Least Absolute Shrinkage and Selection Operator (LASSO), Ridge, Elastic Net,
and five other machine learning methods to achieve non-invasive hemoglobin prediction.
They suggest that this approach could form the basis of a public health screening tool for
the detection and treatment of maternal anemia and could complement global health
intervention strategies [13]. Lakshmi et al. used PPG signals and a generalized linear
regression technique to monitor hemoglobin levels in pregnant women. They showed an
absolute deviation of 0.73 g/dL between the predicted and actual hemoglobin concentration
values [14]. Pinto et al. applied Multivariate Partial Least Square Regression (PLSR) to
predict hemoglobin concentration and validated the designed system by Bland–Altman
analysis, which showed good agreement between predicted hemoglobin and reference
hemoglobin [15].
In this paper, the four-wavelength pulse wave signals of fingertips were collected by
photoelectric sensors, and 160 morphological feature parameters based on the four-wavelength
pulse wave signals were constructed and extracted. The main feature parameters
were then screened using reliefF, Chi-square Score, and Information Gain. Next, the hemoglobin
concentration was predicted using XGBoost, support vector machine regression (SVR),
and logistic regression (LR) models with the screened feature set as input. Finally, the
prediction performance was evaluated using RMSE, R2, and MAE.
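The screening step described above can be illustrated with a minimal sketch of Chi-square scoring followed by top-k selection. This is not the authors' implementation: it assumes the features and the hemoglobin target have already been discretized into bins (the discretization used in the paper is not given in this section), and the function names and toy data are hypothetical.

```python
from collections import Counter

def chi_square_score(feature_bins, target_bins):
    """Chi-square statistic of the contingency table of two discrete sequences."""
    n = len(feature_bins)
    joint = Counter(zip(feature_bins, target_bins))
    feature_totals = Counter(feature_bins)
    target_totals = Counter(target_bins)
    score = 0.0
    for f, f_count in feature_totals.items():
        for t, t_count in target_totals.items():
            expected = f_count * t_count / n      # count expected under independence
            observed = joint.get((f, t), 0)
            score += (observed - expected) ** 2 / expected
    return score

def top_k_features(columns, target_bins, k):
    """Rank features by Chi-square score and keep the k highest-scoring names."""
    scores = {name: chi_square_score(col, target_bins) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy data (hypothetical): binned target and three binned features.
target = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "1F5": [0, 0, 0, 0, 1, 1, 1, 1],   # tracks the target exactly
    "2F9": [0, 1, 0, 1, 0, 1, 0, 1],   # independent of the target
    "3F11": [0, 0, 0, 1, 1, 1, 1, 1],  # tracks the target in most samples
}
ranked = top_k_features(columns, target, k=2)
```

In the study, the same idea would be applied with k set to 10, 20, and 30 over the 160 extracted features.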
pre-processed, and then the four wavelengths are sent to the PC host computer by the serial
port. The system framework of the specific design is shown in Figure 2.
by the hospital’s fully automated hematocrit analyzer to obtain the corresponding invasive
hemoglobin assay value, which is used as a reference value for constructing the model.
2.4.1. LR
Logistic regression models are widely used, have strong explanatory power,
and have been used to describe phenomena in diverse medical and nonmedical research
areas. Similar to other regression models, logistic regression models are often used to assess
predictors and to adjust for confounding and interactions [21]. Logistic regression adds a
layer of function mapping to the feature-to-result mapping process: the sigmoid function
constrains the linear sum to the interval (0,1), and the resultant values can be used for binary
classification or regression prediction.
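The mapping described above can be sketched in a few lines. This is a minimal illustration, not the model trained in the paper; the weights, bias, and feature values below are hypothetical.

```python
import math

def sigmoid(z):
    """Map a real number into (0, 1); the two branches avoid overflow in exp()."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def logistic_output(features, weights, bias):
    """Linear sum of the features followed by the sigmoid mapping."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

# Hypothetical feature vector and coefficients, for illustration only.
example = logistic_output([0.5, -1.2], [2.0, 1.0], 0.2)
```

Because the output is confined to (0,1), using it for a continuous quantity such as hemoglobin concentration presumably requires rescaling the target to that range; the section does not state how this was done.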
2.4.2. SVR
Support Vector Regression (SVR), described by Smola and Schölkopf in 1998 [22], is a machine learning
method based on the VC dimension theory of statistical learning and the structural risk minimization
principle. It generalizes well and can handle practical problems such as small
sample size, high dimensionality, strong nonlinearity, and local extrema [23]. Furthermore,
unlike other regression methods, support vector regression chooses the regression function
by minimizing only part of the observational errors [24].
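One common way to make this concrete is the ε-insensitive loss used in standard SVR formulations, in which errors smaller than a tolerance ε are ignored. The sketch below assumes that formulation; the paper does not state which loss or ε value was used, and the numbers are hypothetical.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the epsilon tube, linear in the excess error outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

def mean_epsilon_loss(targets, predictions, epsilon):
    """Average epsilon-insensitive loss over a set of samples."""
    losses = [epsilon_insensitive_loss(t, p, epsilon)
              for t, p in zip(targets, predictions)]
    return sum(losses) / len(losses)

# Hypothetical hemoglobin values in g/L with a 1 g/L tolerance.
inside_tube = epsilon_insensitive_loss(130.0, 130.5, epsilon=1.0)   # error ignored
outside_tube = epsilon_insensitive_loss(130.0, 133.0, epsilon=1.0)  # penalized
```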
2.4.3. XGBoost
The XGBoost algorithm [25] is an ensemble learning algorithm based on boosting.
It is developed from the gradient-boosting decision tree (GBDT) algorithm [26] and
improves on it in both speed and precision. In addition, the XGBoost algorithm extends
the cost function with a regularization term to avoid overfitting. It is widely used in the
field of machine learning. Furthermore, the development of specialized
medical databases, such as the Medical Information Mart for Intensive Care III (MIMIC-III
database), facilitates data extraction and analysis for ML models [27].
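For reference, the regularized objective takes the following form in the XGBoost paper [25]; it is reproduced here as background rather than as a result of this study. Here l is the training loss, f_k is the k-th tree, T is the number of leaves in a tree, w is the vector of leaf weights, and γ and λ are regularization coefficients.

```latex
\mathcal{L} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega\left(f_k\right),
\qquad
\Omega(f) = \gamma T + \frac{1}{2} \lambda \lVert w \rVert^{2}
```

The first term rewards fitting the reference values, while the second penalizes trees with many leaves or large leaf weights.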
From the results, it can be seen that the prediction accuracy of the three models
increases with the number of features, and the detection error gradually
decreases. This indicates that the introduced feature parameters significantly improve the
prediction accuracy of hemoglobin concentration, and for all three feature
selection methods the accuracy is highest with 30 features. The prediction accuracy of the
XGBoost regression model is higher than that of the other two models, and the prediction
accuracy of all three regression models is better under the Chi-square Score feature
selection method than under the other two feature selection methods. In addition,
the XGBoost regression model achieved the smallest MAE of 0.325 g/L. Therefore,
overall, the highest hemoglobin prediction performance was achieved using the 30 features
selected by the Chi-square score combined with the XGBoost regression model.
Table 2. Prediction accuracy of three regression models under three feature selection methods under
different numbers of features.
The 30 key features screened based on the Chi-square feature selection method are
specified in Table 3.
Method        Features
Chi-square    3F11 2F11 1F11 4F11 3F10 3F9 3F13 3F12 2F10 1F10
              1F9 1F13 2F12 2F9 2F13 3F21 3F5 3F8 3F6 3F27
              4F6 4F5 4F8 4F10 1F12 3F22 1F6 1F5 1F8 4F9
Note: 1, 2, 3, and 4 represent wavelength 1, wavelength 2, wavelength 3, and wavelength 4, respectively.
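As an illustration of how the feature labels can be read, the sketch below parses each label and counts the selected features per wavelength. It assumes that the leading digit is the wavelength (as stated in the note) and that the number after "F" is a feature index within that wavelength; the helper name is hypothetical.

```python
from collections import Counter

# The 30 labels listed in Table 3, in the order given.
CHI_SQUARE_TOP30 = (
    "3F11 2F11 1F11 4F11 3F10 3F9 3F13 3F12 2F10 1F10 "
    "1F9 1F13 2F12 2F9 2F13 3F21 3F5 3F8 3F6 3F27 "
    "4F6 4F5 4F8 4F10 1F12 3F22 1F6 1F5 1F8 4F9"
).split()

def parse_feature(label):
    """Split a label such as '3F11' into (wavelength, feature index)."""
    wavelength, index = label.split("F")
    return int(wavelength), int(index)

# Number of selected features contributed by each wavelength.
per_wavelength = Counter(parse_feature(label)[0] for label in CHI_SQUARE_TOP30)
```

Under this reading, wavelength 3 contributes the most selected features (11 of 30) and wavelength 2 the fewest (5 of 30).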
Figure 6 shows the scatter plot of the hemoglobin reference values against the values predicted
by the XGBoost regression model with the 30 key feature parameters. The horizontal
axis is the reference hemoglobin value from the fully automated hematology analyzer,
and the vertical axis is the hemoglobin value predicted by the XGBoost regression model.
The correlation analysis of the reference and predicted values showed that the slope is 0.993,
R2 is 0.997, and RMSE is 0.762 g/L.
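The evaluation indicators used here can be computed as in the following sketch. It assumes R2 is the usual coefficient of determination, 1 − SS_res/SS_tot, and that the slope comes from a least-squares fit of predicted on reference values; the paper does not spell out either definition, and the sample values are hypothetical.

```python
import math

def regression_metrics(reference, predicted):
    """Return MAE, RMSE, R2, and the least-squares slope of predicted vs. reference."""
    n = len(reference)
    errors = [p - r for r, p in zip(reference, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_ref = sum(reference) / n
    mean_pred = sum(predicted) / n
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    r2 = 1.0 - ss_res / ss_tot
    covariance = sum((r - mean_ref) * (p - mean_pred)
                     for r, p in zip(reference, predicted))
    slope = covariance / ss_tot
    return {"mae": mae, "rmse": rmse, "r2": r2, "slope": slope}

# Hypothetical reference and predicted hemoglobin values in g/L.
metrics = regression_metrics([120.0, 130.0, 140.0, 150.0],
                             [121.0, 129.0, 141.0, 149.0])
```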
The Bland–Altman plot is a data plotting method used in biomedicine to assess the
difference between a new and a standard procedure and to analyze the agreement between
two different assays. This paper uses Bland–Altman plots to analyze the agreement
of hemoglobin values. The horizontal axis of the plot represents the mean value of the
results of each sample measured by the two methods, and the vertical axis represents
the difference between the results of the two methods. The upper and lower horizontal
lines indicate the upper and lower 95% limits of agreement, i.e., the mean difference plus and
minus 1.96 times the standard deviation of the differences; the middle horizontal solid line
indicates the position where the mean
value of the difference is 0. The Bland–Altman plot of the XGBoost regression model
with 30 key parameters is shown in Figure 7, and most of the sample data are within the
limits of agreement, with 95% limits of agreement of (−1.504, 1.486) g/dL.
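The quantities plotted in a Bland–Altman analysis can be computed as in the sketch below. It uses the conventional definition (mean difference ± 1.96 times the sample standard deviation of the differences); the sample values are hypothetical and are not the study data.

```python
import statistics

def bland_altman(reference, predicted):
    """Return per-sample means and differences, the bias, and the 95% limits."""
    means = [(r + p) / 2 for r, p in zip(reference, predicted)]
    diffs = [p - r for r, p in zip(reference, predicted)]
    bias = statistics.mean(diffs)          # mean difference between methods
    sd = statistics.stdev(diffs)           # sample standard deviation
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return means, diffs, bias, limits

# Hypothetical reference and predicted hemoglobin values.
means, diffs, bias, limits = bland_altman([120.0, 130.0, 140.0, 150.0],
                                          [121.0, 129.0, 141.0, 149.0])
inside = sum(limits[0] <= d <= limits[1] for d in diffs)  # points within the limits
```

Plotting `diffs` against `means` with horizontal lines at `bias` and at the two `limits` reproduces the layout described in the text.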
Figure 6. Fitting chart of the real value of hemoglobin and predicted value of the XGBoost regression model.
for this noninvasive hemoglobin measurement device, with a standard deviation being
4.7. Pinto et al. [32] developed a noninvasive hemoglobin measurement device using an
Arduino Uno embedded development board to control five light-emitting diodes with
wavelengths of 670 nm, 770 nm, 810 nm, 850 nm, and 950 nm, respectively. Data from
15 subjects were collected for analysis, and after LED power normalization, the accuracy
reached 98.29%, RMSE was reduced to 0.36 g/dL, and R2 was 0.981. All of these methods
achieved noninvasive detection of hemoglobin, and the predicted results were evaluated
using different indicators. As can be seen from the table, more volunteers were recruited
in this study than in the literature [29–32], which lends the experimental data in this
paper some reliability. For the R2 index, compared with the literature [28,30,32],
the R2 of this paper is closest to 1, suggesting that the XGBoost model used in this
paper generalizes well. For the RMSE index, the RMSE
of this paper is smaller than that of the literature [30]. However, the RMSE of this
paper is 0.402 higher than that of the literature [32], which indicates that the prediction error
of hemoglobin by the system in this paper needs to be further reduced; this is a
shortcoming of the method in this paper. Overall, however, the experimental
results of this paper show improved performance.
Figure 7. Bland–Altman diagram of XGBoost regression model for predicting hemoglobin concentration.
4. Conclusions
The PPG acquisition system combining the four-wavelength DCM08 blood oxygen
sensor and the analog front-end chip ADPD4100 was designed to study hemoglobin
detection at the human fingertip. First, the feature parameters of the high-quality PPG
waveforms from the four channels were extracted to establish the hemoglobin prediction model.
Then, the feature parameters screened by the reliefF,
Chi-square, and InfoGain feature selection methods were fed into different regression models
to determine the optimal model and
key feature parameters. The 30 features screened by the Chi-square feature selection
algorithm, combined with the XGBoost model, gave the best prediction result, with R2 of 0.997
and RMSE of 0.762 g/L, which indicates
that this model has good generalization ability and accuracy. The results of the experiments
show that the XGBoost-based noninvasive hemoglobin prediction model established in this
paper has a certain reliability and research value, which is helpful for the improvement and
broad application of continuous noninvasive hemoglobin measurement methods, and it can
be expected to be used for the diagnosis of early anemia. If the XGBoost model
is integrated into the host computer software, the prediction of hemoglobin
will be more convenient and intelligent; alternatively, the collected data can be transferred to the
cloud for processing and the results returned to the host computer software for
display, which would diversify the functions of the system. In addition, the
sample size collected in this paper is insufficient. Therefore, we will expand the sample size
and widen the range of sample data in future research work to further study the regression
modeling algorithm, train the model continuously, improve the generalization ability of
the model, give our detection system more data support, and make the experimental
results more reliable and well-founded.
Author Contributions: Y.L. designed the study. Z.C., H.Q., W.G., S.L. and Y.L. conceived the study,
provided directions, feedback, and/or revised the manuscript. Y.L. led the investigation and drafted
the manuscript for submission with revisions and feedback from the contributing authors. All authors
have read and agreed to the published version of the manuscript.
Funding: This research was supported by the Guangxi Innovation Driven Development Project
(Guike AA19254003), the National Natural Science Foundation of China (62101148), the Natural
Science Foundation of Guangxi (2020GXNSFBA297156), the National Major Research Instrument
Development Project of the NSFC (Grant No. 61627807), and the Innovation Project of GUET Graduate
Education (Grant No. 2022YCXS222 and Grant No. 2022YCXB08).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data used in this manuscript can be downloaded from this link
https://figshare.com/articles/dataset/Hemoglobin_detection_based_on_four-wavelength_PPG_signal_zip/22256143
(accessed on 12 February 2023).
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Pintavirooj, C.; Ni, B.; Chatkobkool, C.; Pinijkij, K. Noninvasive portable hemoglobin concentration monitoring system using
optical sensor for anemia disease. Healthcare 2021, 9, 647. [CrossRef]
2. Jensen, F.B.; Fago, A.; Weber, R.E. Hemoglobin structure and function. Fish Physiol. 1998, 17, 1–40.
3. Pinto, C.; Parab, J.; Naik, G. Non-invasive hemoglobin measurement using embedded platform. Sens. Bio-Sens. Res. 2020,
29, 100370. [CrossRef]
4. Munira, L.; Viwattanakulvanid, P. Influencing Factors and Knowledge Gaps on Anemia Prevention among Female Students in
Indonesia. Int. J. Eval. Res. Educ. 2021, 10, 215–221. [CrossRef]
5. Tang, Q.; Chen, Z.; Ward, R.; Menon, C.; Elgendi, M. Subject-Based Model for Reconstructing Arterial Blood Pressure from
Photoplethysmogram. Bioengineering 2022, 9, 402. [CrossRef] [PubMed]
6. Tamura, T. Current progress of photoplethysmography and SPO2 for health monitoring. Biomed. Eng. Lett. 2019, 9, 21–36.
[CrossRef]
7. Kumar, A.; Komaragiri, R.; Kumar, M. A review on computation methods used in photoplethysmography signal analysis for
heart rate estimation. Arch. Comput. Methods Eng. 2022, 29, 921–940.
8. Touw, H.R.; Verheul, M.H.; Tuinman, P.R.; Smit, J.; Thöne, D.; Schober, P.; Boer, C. Photoplethysmography respiratory rate
monitoring in patients receiving procedural sedation and analgesia for upper gastrointestinal endoscopy. J. Clin. Monit. Comput.
2017, 31, 747–754. [CrossRef] [PubMed]
9. El-Hajj, C.; Kyriacou, P.A. A review of machine learning techniques in photoplethysmography for the non-invasive cuff-less
measurement of blood pressure. Biomed. Signal Process. Control 2020, 58, 101870. [CrossRef]
10. Huttunen, R.; Leppänen, T.; Duce, B.; Oksenberg, A.; Myllymaa, S.; Töyräs, J.; Korkalainen, H. Assessment of obstructive
sleep apnea-related sleep fragmentation utilizing deep learning-based sleep staging from photoplethysmography. Sleep 2021,
44, zsab142. [CrossRef] [PubMed]
11. Sardana, H.; Dogra, N.; Kanawade, R. Dynamic time warping based arrhythmia detection using photoplethysmography signals.
Signal Image Video Process. 2022, 16, 1925–1933.
12. Kavsaoğlu, A.R.; Polat, K.; Hariharan, M. Non-invasive prediction of hemoglobin level using machine learning techniques with
the PPG signal’s characteristics features. Appl. Soft Comput. 2015, 37, 983–991. [CrossRef]
13. Acharya, S.; Swaminathan, D.; Das, S.; Kansara, K.; Chakraborty, S.; Kumar, D.; Francis, T.; Aatre, K.R. Non-invasive estimation
of hemoglobin using a multi-model stacking regressor. IEEE J. Biomed. Health Inform. 2019, 24, 1717–1726. [CrossRef]
14. Lakshmi, M.; Manimegalai, P.; Bhavani, S. Non-invasive haemoglobin measurement among pregnant women using
photoplethysmography and machine learning. J. Phys. Conf. Ser. 2020, 1432, 012089. [CrossRef]
15. Pinto, C.; Parab, J.; Sequeira, M.; Naik, G. Development of Altera NIOS II Soft-core system to predict total Hemoglobin using
Multivariate Analysis. J. Phys. Conf. Ser. 2021, 1921, 012039. [CrossRef]
16. Liang, Y.; Chen, Z.; Liu, G.; Elgendi, M. A new, short-recorded photoplethysmogram dataset for blood pressure monitoring in
China. Sci. Data 2018, 5, 1–7. [CrossRef] [PubMed]
17. Orphanidou, C. Quality Assessment for the Photoplethysmogram (PPG). In Signal Quality Assessment in Physiological Monitoring;
Springer: Berlin/Heidelberg, Germany, 2018; pp. 41–63.
18. Liang, Y.; Abbott, D.; Howard, N.; Lim, K.; Ward, R.; Elgendi, M. How effective is pulse arrival time for evaluating blood
pressure? Challenges and recommendations from a study using the MIMIC database. J. Clin. Med. 2019, 8, 337. [CrossRef]
19. Golap, M.A.u.; Raju, S.T.U.; Haque, M.R.; Hashem, M. Hemoglobin and glucose level estimation from PPG characteristics features
of fingertip video using MGGP-based model. Biomed. Signal Process. Control 2021, 67, 102478. [CrossRef]
20. Zhao, Z.; Morstatter, F.; Sharma, S.; Alelyani, S.; Anand, A.; Liu, H. Advancing Feature Selection Research–ASU Feature Selection
Repository. 2010; pp. 1–28. Available online: https://www.researchgate.net/publication/305083748 (accessed on 15 January
2023).
21. Zabor, E.C.; Reddy, C.A.; Tendulkar, R.D.; Patil, S. Logistic regression in clinical studies. Int. J. Radiat. Oncol. Biol. Phys.
2021, 271–277. [CrossRef]
22. Smola, A.; Schölkopf, B. A Tutorial on Support Vector Regression: NeuroCOLT; Technical Report NC-TR-98-030; Royal Holloway
College: London, UK, 1998.
23. Huang, H.; Wei, X.; Zhou, Y. An overview on twin support vector regression. Neurocomputing 2022, 490, 80–92. [CrossRef]
24. Li, Q.; Qin, Z.; Liu, Z. Uncertain support vector regression with imprecise observations. J. Intell. Fuzzy Syst. 2022, 43, 3403–3409.
[CrossRef]
25. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [CrossRef]
26. Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 1189–1232. [CrossRef]
27. Wang, X.; Zhu, T.; Xia, M.; Liu, Y.; Wang, Y.; Wang, X.; Zhuang, L.; Zhong, D.; Zhu, J.; He, H.; et al. Predicting the prognosis
of patients in the coronary care unit: A novel multi-category machine learning model using XGBoost. Front. Cardiovasc. Med.
2022, 9, 764629. [CrossRef] [PubMed]
28. Ghosal, S.; Das, D.; Udutalapally, V.; Talukder, A.K.; Misra, S. sHEMO: Smartphone spectroscopy for blood hemoglobin level
monitoring in smart anemia-care. IEEE Sens. J. 2020, 21, 8520–8529. [CrossRef]
29. Saracoglu, A.; Abdullayev, R.; Sakar, M.; Sacak, B.; Girgin Incekoy, F.; Aykac, Z. Continuous hemoglobin measurement during
frontal advancement operations can improve patient outcomes. J. Clin. Monit. Comput. 2022, 36, 1689–1695. [CrossRef]
30. Fan, Z.; Zhou, Y.; Zhai, H.; Wang, Q.; He, H. A Smartphone-Based Biosensor for Non-Invasive Monitoring of Total Hemoglobin
Concentration in Humans with High Accuracy. Biosensors 2022, 12, 781. [CrossRef]
31. Hardyanto, I.; Pambudi, S.; Suyarna, Y.; Ardidarma, A.; Kurniawan, A.; Iskandar, J.; Siskandar, R.; Jenie, R.P.; Alatas, H.; Irzaman.
Non-invasive hemoglobin blood level measurement system. AIP Conf. Proc. 2021, 2320, 050005.
32. Pinto, C.; Parab, J.; Parab, M.; Naik, G. Improving hemoglobin estimation accuracy through standardizing of light-emitting diode
power. Int. J. Electr. Comput. Eng. 2022, 12, 219–228. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.