1 s2.0 S2405844023059583 Main

Heliyon 9 (2023) e18750
Contents lists available at ScienceDirect
Heliyon
journal homepage: www.cell.com/heliyon
The sustainable development of mathematics subject: An

empirical analysis based on the academic attention and
literature research
Yulin Zhao a, b, *, Junke Li a, c, **, Kai Liu a, d, ***
a
School of Information Engineering, Suqian University, Suqian, Jiangsu, 223800, China
b
School of Educational Information Technology, Central China Normal University, Wuhan, 430079, China
c
Jiangsu Province Engineering Research Center of Smart Poultry Farming and Intelligent Equipment, Jiangsu, 223800, China
d
School of Computer and Information, Qiannan Normal University for Nationalities, Duyun, 558000, China
A R T I C L E I N F O A B S T R A C T
Keywords: The exploration of the correlation between subject network attention and literature research in
High school mathematics China can aid in comprehending the development trend of Chinese scientific and technological
Subject attention journals. Currently, many scholars have done a lot of research based on the network media index,
Literature research
but the relationship between the discipline attention represented by it and literature research has
Relationship modeling
not been fully verified. This paper used CNKI and Baidu Index as data sources to establish a RAPF
Short-term forecasting
experimental framework based on relationship analysis and prediction, and selected high school
mathematics subjects in China for effective demonstration. First, RAPF extracted core keywords
using text tools and word frequency statistics. Second, it constructed a relationship model be
tween subject attention and literature research based on Spearman and LOOCV. Finally, it made
predictions through time series and regression analysis. The results showed a correlation between
subject attention and literature research, and the model fit R2 was 0.774, with a relative error of
less than 2%. Short-term predictions found that some keywords received less online attention, and
2022–2024 may be the crucial development period for mathematical education research, with an
annual literature research volume of approximately 380 articles. This paper summarized the
mathematical subject themes centered on content, culture, literacy, and integration, and also
provided a reference for the development of the subject through experimental prediction. In the
next two years, China’s mathematics literature research still needs to delve deeper, broaden its
breadth, enhance its height, and ensure a steady improvement in the quality and quantity of
literature research.
1. Introduction
In the new media era, the popularization of academic online platforms and data social media has enriched the ways of users’
academic exchanges. The attention given priority by the network index reflects the attention of netizens on a certain thing, which is an
* Corresponding author. School of Information Engineering, Suqian University, Suqian, Jiangsu, 223800, China.
** Corresponding author. School of Information Engineering, Suqian University, Suqian, Jiangsu, 223800, China.
*** Corresponding author. School of Information Engineering, Suqian University, Suqian, Jiangsu, 223800, China.
E-mail addresses: 842215935@qq.com (Y. Zhao), ljk2006ljk@163.com (J. Li), 407165332@qq.com (K. Liu).
https://doi.org/10.1016/j.heliyon.2023.e18750
Received 23 August 2022; Received in revised form 22 July 2023; Accepted 26 July 2023
Available online 29 July 2023
2405-8440/© 2023 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/by-nc-nd/4.0/).
Y. Zhao et al. Heliyon 9 (2023) e18750
explicit network resource. The available range and application fields of Internet attention are constantly expanded, and it also makes
new contributions to the development of medicine, economy, education, and other fields [1–5]. How to scientifically evaluate the
application of network attention in the field of education and establish its relationship with educational scientific literature is a new
hot topic in educational research.
Subject attention is the public focus on the original and reprinted papers of a subject and then reflects the theme’s heat and in
fluence. As the main carrier of scientific research results, scientific literature has irreplaceable advantages such as high authority and
strong credibility. It is also the main basis for scientific research reference and reference. Through the CNKI(China National Knowledge
Infrastructure) retrieval of the relevant literature of subject concern, it was found that the majority of the studies focused on the topic
mining of scientific literature. For example, Zhang et al. mined hot frontiers of scientific research literature based on social network
attention and generated hot spots of artificial intelligence and image processing knowledge [6]. Pai et al. identified 16 highly con
cerning research topics in information science based on the Z-index method [7]. Internet attention is conducive to extracting and
identifying hotspots within scientific literature. However, whether there is a complex correlation between disciplinary attention and
scientific literature is rarely demonstrated in the current literature. Thanks to the rapid development of digital technology, the
relationship between disciplinary attention reflected by online media and scientific literature on academic resource platforms have
become the focus of this study.
This study intended to select high school mathematics as the research object for two main reasons. First, as an important subject of
basic education, mathematics can promote social development and national rejuvenation. Secondly, the abundant literature on
mathematics education has laid a solid foundation for this research. Through literature review, it was found that most current studies
focus on comparative analysis of mathematics textbooks, core literacy cultivation of mathematics, classroom penetration of mathe
matics culture and mathematics history, hotspots and trends of mathematics research, etc. [8–11]. In addition, the current literature
tends to be speculative, while the research on action and experiment needs to be further improved and strengthened. Considering the
current limitations of the literature, this paper planned to conduct experimental research on high school mathematics by focusing on
the literature method as a breakthrough point. The previous research on the implementation of network search index in other fields has
provided valuable insights for this study. Additionally, fundamental mathematical research served as a strong foundation for this
study.
This study aimed to explore the relationship between subject network attention and literature research in China, so as to under
stand the development trends of Chinese scientific and technological journals. Based on the era of network big data, this paper pro
posed a new framework RAPF(Relationship Analysis and Prediction Framework) centering on the following issues and carries out
experimental research in Mathematics on this basis.
(1) Analysis of the relationship between mathematical subject network attention and the amount of scientific research literature.
(2) Relationship modeling between mathematical subject network attention and the amount of scientific research literature.
(3) Prediction and analysis of mathematical discipline attention and academic platform publication volume. The specific analysis of
each problem in the experimental research process is detailed in Chapter 3.
The main contribution of this study was as follows.
(1) Based on the era of network data, this paper fully considered the driving effect of netizens’ attention on the development of the
discipline. At the same time, it also provided a new idea for the study of network index in education.
(2) This paper was helpful to understand the research trends, future direction, and development process of high school
mathematics.
(3) This paper proposed a new framework RAPF based on Relationship Analysis and Prediction. The framework used a large
number of data and methods to conduct experimental research, supplementing evidence for existing reviews and providing
methodological references.
(4) This paper provided an exemplary model for scholars to conduct literature research on experiments from other disciplines.
2. Related work
The network index provides data support for the micro research of attention. In recent years, network attention has been widely
used in many fields for experimental research. After the literature review, it is found that relationship analysis and prediction analysis
based on attention occupy the majority. Based on Baidu Index, Lin et al. found that the network attention of children with allergic
rhinitis in Beijing was positively correlated with the outpatient volume [12]. Wei et al. found a correlation between the number of
COVID-19 case reports and Internet users’ search and information attention. As a result, he suggested that relevant departments utilize
online big data to monitor public opinion dynamics [13]. Wang et al. said that in the era of big data, the massive real-time data on the
Internet is conducive to monitoring and forecasting traditional macroeconomic trends [14]. Yang’s empirical study of the multivariate
model of the Internet search Index suggested that the amount of Internet searches could predict traders’ views and expectations on the
current economic situation and future [15]. In addition, disciplinary bibliometrics also involved multiple industries such as medicine
and architecture [16]. Cortese Samuele et al. identified four major research trends through quantitative analysis of scientific literature
on ADHD (Attention-Deficit/Hyperactivity Disorder) [17]. Hou et al. proposed a novel Method and Dataset Entity Recognition model
based on literature analysis, which helps extract methods and dataset entities from the main text content of scientific papers [18].
Osei-Kyei et al. used VOSviewer and Gephi to analyze trends in the field of building risk management, providing researchers with
2
current progress in building risk management research [19]. Existing relevant research covered a wide range of fields. But relatively
little research has been done in the field of education. At present, most of the research on the teaching discipline focused on core
literacy, the frontier hot spots, the evolution trend of the discipline, and so on [20–23], while there were few related pieces of literature
concerning the discipline’s attention. Taking this as the starting point, this study aimed to explore discipline network attention and
propose innovative ways to utilize web search indexes in literature research.
Different from previous studies, the differences and advantages of this paper were mainly reflected in the following aspects: (1)
Comprehensively summarizing the methods of previous studies, this paper proposed a framework for relationship analysis and pre
diction. (2) Based on keyword mining in previous studies, the prediction of literature quantity and keyword attention was increased. It
reflected the development trend of the discipline from the quantitative perspective. (3) Previous studies focused more on qualitative
analysis. This paper made a comprehensive discussion combining qualitative and quantitative analysis.
This study focused on addressing the issues raised against the research objectives. (1) Using Spearman to analyze the relationship
between disciplinary attention and literature volume. (2) Using LOOCV(Leave-One-Out Cross-Validation) to construct a relationship
model between subject attention and literature volume. (3) Using time series and regression analysis to predict future disciplinary
attention and literature volume.
The highlight of this study is to mine the hot keywords of the mathematics discipline, explore the formation mode of mathematics
discipline development, and predict the development trend of the mathematics discipline. This study helps to establish a connection
between network platforms and academic platforms, promoting the sustainability of disciplinary development.
3. Proposed RAPF framework
This study built a relationship analysis and prediction framework named RAPF(Fig. 1) that integrated disciplinary attention and
disciplinary literature. The framework was implemented around the problems to be solved in relevant research. Appropriate methods
were used in each process, and each method was described in detail.
The first module of RAPF was relationship analysis. This paper utilized Baidu Index and CNKI as data sources. The study referenced
the practices of previous scholars [24,25], and employed text mining, word frequency analysis, and Spearman correlation analysis to
investigate the relationship. See Section 3.1 for a detailed method description.
The second module of RAPF was relationship modeling. This paper established a regression model of literature research and its
discipline attention using LOOCV. The model used the attention index as the explanatory variable. To enhance the reliability of the
relational model, LOOCV theory was employed due to the limited sample size. The model accuracy was measured by relative error
during testing. See Section 3.2 for a detailed method description.
Fig. 1. RAPF framework.
3
The third module of RAPF was prediction analysis. Referring to the relevant practices of existing studies [26,27], this paper used
time series and regression models to predict the dynamic trend of mathematics development. It provided a reference for the devel
opment of discipline literature research. See Section 3.3 for a detailed method description.
3.1. Relationship analysis
3.1.1. Text mining

Bibliometrics is based on the idea of data mining and carries out scientific statistics on texts to study the quantitative relationship of
texts [28]. Through in-depth analysis and compression of literature content, key points of literature can be mined and important
information can be obtained. Keywords are the summary of the literature theme, but also the epitome of literature content. It covers the
main idea of the document in terms of vocabulary. Keyword frequency analysis is a bibliometric method used to analyze the frequency
of words in the literature of a specific research field [29]. This analysis helps to identify the research hotspots and development trends
in the field. This paper used the complete counting method to count the term frequency (TF) of keywords. Full counting is to count all
the times a word appears in the text (Equation (1)). If safety is presented 3 times in a paper, the binary accounting is 1 and the complete
accounting is 3.
TF = Ni,j (1)
In Equation (1), TF represents the statistics of the complete counting method for keywords, and Ni,j is the display times of word i in
Fig. 2. Text word frequency acquisition process.
4
text j.
The text mining of keywords is carried out according to the following process (Fig. 2).
3.1.2. Spearman correlation analysis

Spearman correlation analysis is used to measure whether there is a dependence between variables. It is defined as the Pearson
correlation coefficient between rank variables. Spearman applies to the sample data set with only two columns of data and a linear
relationship of rank variables. Its application scope is wide and it has been applied to the relationship analysis of multi-field literature
research. Equation (2) is Spearman’s calculation method.
∑
i (Ri − R)(Si − S)
rs = √̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅
∑ ̅ (2)
2∑ 2
i (R i − R) i (Si − S)
In Equation (2), Ri represents the rank of the ith x value and Si represents the rank of the ith y value. R and S are the mean values of
sample variables.
Spearman correlation coefficient is tested for significance t to avoid the error value caused by samples. The test method is shown in
Equation (3).
√̅̅̅̅̅̅̅̅̅̅̅
n − 2·r
t = √̅̅̅̅̅̅̅̅̅̅̅̅̅ (3)
1 − r2
In Equation (3), r represents the Spearman coefficient value and n represents the number of samples. The freedom degree is n-2.
According to the p-value corresponding to the t-distribution table, if p>0.05, the correlation test between variables is not significant,
that is, the r value does not have statistical significance. If p≤0.05, the significance test of the correlation passed, that is, there is
dependence between variables.
3.2. Relationship modeling
3.2.1. Model establishment
(1) Model basis
To avoid over-fitting of data and improve the prediction performance of the evaluation model, this paper uses Leave-One-Out
Cross-Validation for relationship modeling [30]. LOOCV, one of the more common methods of the Bayesian model, involves
reserving one sample from a set of X samples to test and evaluate the strengths and weaknesses of the built model. The remaining X-1
samples are used to train the model. After conducting N tests on a given sample, the fitting degree of the model can be obtained by
averaging the fitting degree of these tests. This method is often used for models with small sample sizes, and its advantage is that all
sample sizes are traversed during model training. Each sample is used to test the model, and the model parameters trained by this
method are relatively reliable.
(2) Model building
As a common prediction model in mathematical problems, multiple regression contains many independent variables. It uses or
dinary linear algebra to express the dependency between multiple variables. Based on the LOOCV method theory and the principle of
multiple regression analysis, this paper tests and analyzes the significance of the influence of each variable on the dependent variable,
and finally establishes the optimal multiple regression equation (Equation (4)).
∑
6
y = α0 + αi xi + ε (4)
i=1
In Equation (4), y is the dependent variable of the multiple regression relationship, and xi represents the independent variable of the
multiple regression relationship. A0 is a constant term; ai represents the regression coefficient (weight coefficient) of the model, which
is used to describe the change degree of dependent variable y when the independent variable xi changes. a0 and ai are the mean values
of model coefficients after retention cross-validation. ε is the random error term.
Considering the inconsistency of units between independent variables and dependent variables, and to reduce the error of model
fitting, this paper performed logarithmic transformation on Equation (4) to obtain a new regression equation (Equation (5)). In the
subsequent experiments, the model is established according to Equation (5).
∑
6
lny = α0 + αi ln xi + ε (5)
i=1
3.2.2. Model accuracy test

Considering that the relative error can better reflect the credibility of the measurement, this paper uses the relative error between
5
the actual value and the predicted value to test the accuracy of a regression model. Referring to the practice of existing scholars [31],
this paper defines the calculation method of relative error (Equation (6)).
⃒ ⃒
⃒ln ypredict − ln yactual ⃒
δ= × 100% (6)
ln yactual
In Equation (6), δ represents the relative error between the original sample value and the fitting value. Lnypredict is the predicted
value of the model, and lnyactual is the actual value of the sample.
3.3. Forecast analysis
3.3.1. Model building

Time series is a group of sequential data with time rules. Data features can be extracted from the sample series by mining. ARIMA
(p,d,q) (Autoregressive Integrated Moving Average model) time series model is applied to a single stationary time series [32]. Equation
(7) is the expression of the ARIMA model. In the model, p represents the auto-regression term, q represents the moving average term,
and d is the difference order.
Xt = c + α1 Xt− 1 + α2 Xt− 2 + ... + αp Xt− p + ut + β1 ut− 1 + ... + βq ut− q (7)
In Equation (7), real parameters ap and βq are parameters of the ARIMA time series model, which represent the coefficient values of
auto-regressive term and moving average term respectively.
The selection of this autoregressive method in this article took into account two aspects. First, the dataset in this study met the
requirements of the ARIMA model for time series data. Second, this method could complete modeling without the need for other
exogenous variables. This model’ validation was shown in Section 4.3.1.
Fig. 3 showed the steps of the model, which included the following processes.
(a) Checking data: Test the stability of the original data. If not, logarithmic transformation or difference transformation is required
until the sequence is stable. During this process, the d value can be determined.
(b) Estimating parameters: Judge the p and q parameters of the model according to the stationary data and AIC criteria.
(c) Building model: Build a fitting model according to estimated parameters.
Fig. 3. Flow chart of prediction.
6
(d) Testing model: Use stationarity and white noise to test the rationality of the model. If the model performs well, it can be applied
to forecast future data. However, if the model does not meet expectations, the parameters need to be adjusted to improve its
performance.
3.3.2. Model checking

The prediction accuracy of the time series model is usually evaluated according to the prediction error of the model. Commonly
used methods mainly include meaning absolute percentage error (MAPE) and root mean squared percentage error (RMSPE) [33,34].
Among them, MAPE can reflect the average relative error of prediction and applies to a wide range. The calculation method is shown in
Equation (8).
100% ∑
n
MAPE = |pi | (8)
n i=1
In Equation (8), n is the number of samples, and pi represents the percentage of relative error. According to the size of MAPE and
RMSPE and Delurgio’s method [35], this paper divides the prediction level of the model (Table 1). This study selected this method and
combined with the standard deviation of the sample to comprehensively measure the accuracy of the prediction model.
4. Experimental results
4.1. Analysis of the relationship between discipline attention and discipline literature research
RAPF module 1 aimed to analyze the relationship between discipline attention and discipline literature research. This section
focused on RAPF module 1 to carry out experimental research on relationship analysis.
4.1.1. Data source

This paper selected academic journals of CNKI and retrieved all literature with the theme of “High school Mathematics” from
January 1, 2011, to December 31, 2021. Irrelevant documents such as journal solicitation, submission, and newspapers were selected
manually to ensure the validity of the literature. Finally, 5561 documents were selected. Then, this paper used tables and VOSviewer
tools to extract and analyze the keywords of pre-selected literature. Firstly, the keyword word frequency of literature was counted
every year to obtain the high-frequency keywords of high school mathematics in recent ten years. Then through the in-depth discussion
of the pre-selected keywords, some keywords were preliminarily selected as the research object of this paper. Finally, extracted
keywords were taken as a representative to analyze the hot research issues and development trends of high school mathematics in the
past decade.
The network attention data of mathematics discipline was measured by Baidu Index, and the literature research data of mathe
matics discipline was measured by the number of relevant literature published on CNKI.
4.1.2. Selection of subject keywords

The selection of subject keywords followed the following steps.
(1) Extracting keywords and their word frequency. With CNKI as the basic database, the search subject was “High school Math
ematics”, and the time was respectively 2011 and 2012 2021 literature. Then, the obtained literature was exported in a unit of
each year and then identified and analyzed by VOSviewer. Finally, the annual keywords and their word frequency data were
obtained according to Equation (1) and the process in Fig. 2.
(2) Construct a keyword frequency matrix. Taking 2011 as the benchmark, the obtained keywords were listed in the first line, and
their word frequency was placed on the next line. Similar to 2012, if encounter keywords that did not exist in the previous year,
added them to the list, and set the word frequency of the previous year to 0. Similarly, from 2013 to 2021, the word frequency
matrix list of 11 × 140 can be obtained.
(3) Calculate the total frequency of each word. The total word frequency of all keywords from 2011 to 2021 was counted in each
column of the word frequency matrix, and then sorted for the output. Table 2 showed the top 50 keywords in the literature
research on high school mathematics.
Through the statistical analysis of the keywords in high school mathematics literature from 2011 to 2021 and the reading of
relevant articles [36,37], the research on senior high school mathematics education in China mainly focuses on the following points in
Table 1
Prediction capability levels of MAPE and RMSPE.
MAPE & RMSPE (%) Predictive ability
<10 Highly accurate

10~20 good
20~50 reasonable
>50 incorrect
7
Table 2
Subject keywords from 2011 to 2021.
number Keywords Frequency number Keywords Frequency
1 High school mathematics 2487 26 Mathematics learning 42

2 High school mathematics teaching 315 27 Problem 42
3 Mathematics 229 28 symbolic-graphic combination 40
4 Mathematics teaching 225 29 High school mathematics curriculum standards 39
5 Classroom teaching 182 30 Learning interest 38
6 Information technology 144 31 Layered teaching 37
7 Teaching strategy 106 32 Geometric sketchpad 36
8 Teaching method 83 33 Multimedia 35
9 Mathematics core literacy 83 34 Teaching design 34
10 New curriculum 79 35 Mathematical thought and method 33
11 Efficient classroom 73 36 High school Mathematics class 32
12 Mathematical culture 72 37 Effective teaching 32
13 Teaching mode 69 38 Mathematical history 31
14 New curriculum standard 68 39 Autonomous Learning 30
15 Higher mathematics 68 40 High school mathematics classroom teaching 28
16 New curriculum reform 62 41 Flipped Classroom 28
17 Mathematical modeling 54 42 Mathematical thinking 27
18 countermeasures 48 43 Inequality 26
19 link up 47 44 Mathematics classroom 26
20 Effectiveness 46 45 Conic 25
21 Function 45 46 Innovative thinking 24
22 Learning method 44 47 Cultivation strategy 24
23 Trigonometric function 44 48 Series 22
24 Micro-Course 44 49 High school math teacher 21
25 Mathematics knowledge 42 50 Multimedia technology 20
recent ten years: 1. Teaching and learning of high school mathematics: it includes the teaching mode, strategies, methods, teaching
design and independent learning of high school mathematics. 2. Mathematical thinking and core quality of mathematics, with a
combination of numbers and shapes, mathematical modeling, and logical reasoning as the focus. 3. The concept guidance of the new
curriculum standard and the new curriculum reform of senior high school mathematics, which is an important basis and standard for
teachers’ teaching, students’ learning, and teaching evaluation in high school mathematics classrooms in China. 4. The proper
infiltration of mathematical history and the dissemination of mathematical culture in the mathematics classroom can highlight the
ideological methods of teaching and learning mathematics. As the heart of its educational research, high school mathematics teaching
and learning focuses more on what (goal), what (content), who (object), when (time), and which (method) [38]. The keywords
preliminarily screened represent the development direction of mathematics to a certain extent.
4.1.3. Analysis of the relationship between discipline attention and literature research
Subject keywords are the dynamic interpretation of subject hot spots and directions. Keywords preliminarily selected by CNKI have
a large amount of data and complex information, so it is necessary to further streamline the sample to eliminate secondary words. Refer
to the 80/20 rule (Pareto principle) [39,40], and combine the preliminarily screened keywords and the experimental purpose of this
paper to carry out secondary filtering.
1. Eliminate words with repeated semantics;

2. Eliminate words unrelated to high school mathematics;
3. Eliminate words not included in Baidu Index;
Secondly, in order to analyze the relationship between discipline attention and literature research, this study utilized Spearman
correlation analysis (Equation (2) and (3)) to explore their correlation. On this basis, keywords with a high correlation (rs＞0.5) and
statistical significance related to high school mathematics were selected as the main analysis goal. The indicators of related subjects
were: Discipline attention was the annual Baidu Index value based on keywords (2011–2021); Subject literature research was the CNKI
publication volume with keywords as the retrieval condition (2011–2021).
Through t-test, correlation analysis was conducted on the filtered words, and then the words with Spearman correlation value rs＞
0.5 and statistically significant (p＜0.05) were determined as follows: Trigonometric function (rs = 0.606*), Geometric sketchpad (rs =
0.651*), Mathematics history (rs = 0.860**) and Inequality (rs = 0.764**), Mathematical modeling (rs = 0.510*), Multimedia tech
nology (rs = 0.745**), a total of six.
The selected keywords can represent the overall trend and characteristics of text data. Trigonometric function and inequalities are
centered on functions and equations. Mathematics history is representative of mathematics culture. The integration of mathematics
history into the mathematics classroom can make students feel the charm of mathematical culture [41]. Mathematical modeling is one
of the contents of mathematics core literacy, and it is also a reflection of students’ mathematical abstraction, reasoning, and imagi
nation ability [42]. Geometric sketchpad and multimedia technology are the integration reflection of teaching technology and subject
in the new media era. The selected keywords represent high school mathematics from the aspects of discipline content, discipline
8
culture, discipline literacy, and discipline integration, which has certain representativeness.
4.2. Relationship modeling between discipline attention and literature research
RAPF module 2 aims to build a relationship model between discipline attention and discipline literature research. This chapter
focuses on RAPF module 2 to carry out experimental research on relationship modeling.
4.2.1. Relationship model construction

Taking high school mathematics as the theme, this paper established a relationship model between subject attention and literature
research. Discipline attention is measured by network attention of representative keywords. The subject literature research used the
publication volume of high school mathematics theme literature to measure. Table 3 showed the specific description of each variable.
According to the variable definition of the model, the variable data from 2011 to 2021 were obtained from the Baidu Index platform
and CNKI, and a total of 11 sample data. Considering the small sample size, this paper constructed multiple regression relationship
models by using LOOCV (Equation (4) and (5)). At the same time, this study utilized Python to calculate independent and dependent
variables, resulting in a correlation model between literature research and discipline attention (Equation (9)).
lny = − 8.368 + 0.639x1 − 1.933x2 + 1.372x3 + 1.046x4 + 1.469x5 − 0.563x6 (9)
2
Through calculation, the R value of the model is about 0.774, and the model has a good fitting degree. The relationship model
reflects the multiple linear relationships between the discipline attention and literature research. It can be used to predict the number
of high school mathematics literature.
4.2.2. Model error analysis

In this paper, model error was used to test model accuracy. Firstly, the independent variable data set from 2011 to 2021 was
selected and the dependent variable predicted value of the model was obtained by combining Equation (9). Then, the error between
the original value and the predicted value of the dependent variable was calculated according to Equation (6). Table 4 showed two
types of errors in the model in the recent five years.
Through the error calculation of the model in the recent five years, it was found that the error of the model was maintained within
2%. In addition, the RMSE calculation value from 2011 to 2021 was 0.0628. This effectively showed that the prediction ability of the
model was high and the model constructed was feasible.
4.3. Predictive analysis of disciplinary literature research
Based on the method proposed in RAPF module 3, this study predicted the attention and publication volume of discipline
respectively. Firstly, based on the historical data of attention, this paper used a time series model to predict discipline attention
(Equation (7)). Then, according to the regression relationship model between subject attention and its literature research, the paper
further forecasted the publication volume of high school mathematics subject literature research.
4.3.1. Construction of prediction model

Based on high school mathematics, this paper predicted the discipline attention according to the selected keywords representing
discipline content, culture, literacy, and integration. Considering the limited space, this paper took the keyword “inequality” as an
example to analyze the construction process of the prediction model in detail. Other keywords were similar.
(1) Stationary processing
First of all, taking the month as the unit, the Baidu Index value of “inequality” from 2011 to 2021 was obtained as the historical data
of attention, and a total of 132 data were obtained. To detect the stable characteristics of the data, ADF was used for inspection and
judgment (Table 5). Through the ADF checklist, it was found that the original sequence of variables was not significant (p = 0.719).
The first-order difference sequence of variables has obvious significance (p = 0.002). The original hypothesis was rejected, and the
first-order difference sequence of variables has the characteristics of stationarity. Therefore, the first-order difference data was selected
Table 3
Variable definitions.
Variable Dimension Indicator Symbol Data source
Independent Subject content Trigonometric function x1 Annual average of Baidu Index

variable Inequality x2
Discipline culture Mathematics history x3
Discipline literacy Mathematical modeling x4
Scientific integration Geometric sketchpad x5
Multimedia technology x6
Dependent variable Literature research on mathematics in senior high High school mathematics subject y Amount of CNKI annual
school literature publication
9
Table 4
Model accuracy test.
Year Actual value Predictive value Relative error (%) Mean relative error (%)
2017 6.4739 6.3680 1.6354 1.0232

2018 6.3526 6.3528 0.0021
2019 6.2442 6.3022 0.9294
2020 6.1181 6.0761 0.6871
2021 5.9738 6.0850 1.8619
for parameter identification.
(2)Identification and verification of model parameters
Considering the use of auto-correlation and partial auto-correlation to judge p,q parameters are subjective and artificial. Therefore,
to ensure that the model parameters are scientific and reasonable, this paper automatically searched for the optimal parameters based
on AIC and obtained the model results in ARIMA (0,1,5) test table (Table 6). According to the Q statistics in Table 6, the test results of
Q6 and Q12 were not significant, so the model residual of the white noise sequence cannot be rejected. The model residuals did not
have auto-correlation. In addition, the R2 of the model was 0.749, and the model had a good fitting degree. The estimated parameters
met the model indicators.
(3)Model building
According to the estimated parameters of the model, Table 7 presented the test results of the model parameters. The model was
ARIMA (0,1,5) and was based on first-order difference data. Based on the coefficient values in Table 7, the ARIMA model (Equation
(10)) could be established.
Xt = 4.006 − 0.142u(t − 1) − 0.278u(t − 2) − 0.425u(t − 3) − 0.311u(t − 4) + 0.156u(t − 5) (10)
(4)Model prediction
This paper selected a period (2013–2020) to evaluate the predictive performance (Equation (10)) by comparing the attention data
of the original sample with the predicted values generated by the ARIMA model. According to the trend comparison between the
predicted value and the original value of attention in Fig. 4, and combined with error analysis Equation (8) (MAPE = 5.6829%), it was
found that the model had a good prediction effect and a high prediction accuracy. Meanwhile, combined with the standard deviation
and difference standard error of samples in Table 8, the difference between the actual value and the predicted value was 3.502, and the
95%CI difference was − 41.249–48.253. That is, the difference between the two sets of data should be between − 41.249 and 48.253
with 95% certainty. In addition, the test showed acceptance of the original hypothesis at the 0.05 level (t = 0.154, p = 0.877 > 0.05),
meaning that the difference of 3.502 was not statistically significant. There was no significant difference between the actual value and
the predicted value. Therefore, the model constructed by using the keyword “inequality” could predict future short-term data.
4.3.2. Prediction of discipline attention

This paper utilized the modeling prediction process of “inequality” to make predictions for other keywords, including trigonometric
function, mathematical history, mathematical modeling, geometric sketch-board, multimedia technology, etc. These predictions re
flected the attention trend of high school mathematics from various angles (Fig. 5a and b). As can be seen from Fig. 5, except for
trigonometric function, the future attention to discipline literacy, discipline culture, and discipline integration represented by other
keywords tended to be flat and relatively low. Specifically, trigonometric function had the highest attention, and inequality showed a
slowly rising trend. Trigonometric function and inequalities were considered the core content and examination point of high school
mathematics. They integrated all properties and methods of geometry and algebra, and were also representative knowledge of basic
elementary functions. The attention of mathematical modeling showed a slight decline trend. Compared with The geometric
sketchpad, the public’s attention to multimedia technology and mathematical history has been low over the years, that is, the
Table 5
ADF test table.
Variable Differential order t p AIC Critical value
1% 5% 10%
data 0 − 1.09 0.719 1391.893 − 3.487 − 2.886 − 2.58

1 − 3.928 0.002*** 1379.959 − 3.487 − 2.886 − 2.58
2 − 7.26 0.000*** 1382.732 − 3.488 − 2.887 − 2.58
Note: ***, ** and × represent the significance level of 1%, 5% and 10% respectively.
10
Table 6
ARIMA (0,1,5) test table.
Item Symbol Value
Df Residuals 125
Sample size N 132
Q statistic Q6 (p-value) 0.034 (0.853)
Q12 (p-value) 1.482 (0.961)
Information criterion AIC 1599.2
BIC 1619.327
Goodness of fit R2 0.749
Table 7
Model parameters.
Coefficient Standard deviation t p>|t| 0.025 0.975
Constant 4.006 0.553 7.244 0 2.922 5.089

ma.L1.D.data − 0.142 0.092 − 1.535 0.125 − 0.322 0.039
ma.L2.D.data − 0.278 0.086 − 3.222 0.001 − 0.447 − 0.109
ma.L3.D.data − 0.425 0.081 − 5.219 0 − 0.585 − 0.265
ma.L4.D.data − 0.311 0.085 − 3.664 0 − 0.478 − 0.145
ma.L5.D.data 0.156 0.101 1.538 0.124 − 0.043 0.355
Fig. 4. Time series diagram of keyword “inequality".
Table 8
Confidence interval for difference of independent sample t-test.
Item actual value (n = 85) Predicted value (n = 85) Mean t p Differential 95%CI 95%CI
difference standard error (L) (U)
Average Standard Average Standard
value deviation value deviation
Value 739.074 149.756 735.572 145.775 3.502 0.154 0.877 22.668 − 41.249 48.253
discipline culture represented by the keywords has not attracted enough attention. It indicates that in the future, high school math
ematics teaching should focus on practicing the requirements of “General High School Mathematics Curriculum Standards (2017
Edition)", improve the sense of immersion of mathematics culture in mathematics class, improve students’ mathematics core literacy,
and strengthen the in-depth integration of mathematics subject and new media technology to improve the quality of classroom
teaching.
11
Fig. 5A. Prediction of subject attention represented by keywords.Prediction of subject attention represented by keywords (excluding trigono
metric functions).
4.3.3. Prediction of subject literature volume

Combining the relationship model with the trend data of discipline attention, this section forecasted the number of literature
studies on mathematics in senior high school.
First, according to the relationship model between discipline attention and literature research (Equation (9)), the variable data of
discipline attention in 2022 was substituted into Equation (9) to obtain the dependent variable. Further anti-logarithm treatment of the
dependent variable could be obtained the volume of literature research on high school mathematics in 2022, which was 396.
Secondly, to reduce prediction error, data from 2022 was added to the sample set and a regression prediction model with LOOCV
was constructed using Equation (5). The obtained model was as follows Equation (11).
lny = − 9.499 + 0.701x1 − 1.970x2 + 1.105x3 + 1.003x4 + 1.752x5 − 0.444x6 (11)
Then, the independent variable data of discipline attention in 2023 was substituted into Equation (11). Further anti-logarithm
treatment could be obtained the amount of literature research on high school mathematics in 2023, which was 384.
Finally, similar to the previous step, add the data obtained in 2023 to the sample set and built a regression prediction model ac
cording to Equation (5). The obtained model was as follows Equation (12).
12
lny = − 9.452 + 0.701x1 − 1.965x2 + 1.092x3 + 1.002x4 + 1.752x5 − 0.444x6 (12)

Then, the independent variable data of discipline attention in 2024 is substituted into Equation (12). Further anti-logarithm
processing, it was obtained that the publications number of literature research on mathematics in high school in 2024 was 350.
Fig. 6 showed the published volume of literature research in this discipline in the short term. It can be seen that the volume of published
literature on high school mathematics has fluctuated over the years. It appears that research in this area may have experienced a
transition from quantity to quality. In the next three years after 2021, the publication volume of high school mathematics themed
literature tends to be flat, with a decrease compared to previous years.
4.4. The implications of the findings
The trend of attention is the micro-data embodiment of the overall development of the discipline. As the “end” of the functional
content of mathematics compulsory courses in senior high school, the trigonometric function is a comprehensive knowledge aimed at
solving practical mathematical problems. The universality and depth of subject content can help students link up university knowledge
and use theory to investigate practical problems. Mathematical modeling is one of the core qualities of mathematics, which is the
comprehensive embodiment of students’ abstract ability, reasoning ability, imagination ability, and computing ability. It is also the
primary way students relate to and understand the outside world. Wang once said that mathematical modeling activities are complex
and comprehensive. Effective teaching should prioritize the development of students’ thinking processes and theoretical frameworks,
encouraging them to gain a profound understanding of the deductive logic of mathematics. Additionally, educators should strive to
broaden students’ knowledge and cultivate their motivation to learn and apply mathematical concepts [43]. As a representative of
disciplinary culture, the history of mathematics plays an important role in mathematical literacy. Integrating the explanation of
mathematical history in mathematics class helps students feel the history of mathematics closely and stimulate their perception of
mathematical culture. Geometric sketchpad and multimedia technology are the representatives of new media technology, which
contribute to the dynamic transformation of static information and enrich classroom teaching forms. Multimedia can reduce the
difficulty of teaching mathematical knowledge, create an intuitive, lively, and interactive teaching experience for students, and
stimulate students’ sensory touch. Therefore, to enhance the learning experience of high school mathematics, it is necessary to
incorporate new educational technology and utilize new media courseware. Additionally, presenting challenging concepts through
Micro class, MOOC, and other innovative methods can aid in simplifying the comprehension process for students. Through the pre
diction of attention, it is found that high school mathematics discipline literacy, discipline culture, and discipline integration need to be
further strengthened and improved. Integrating subject culture into the mathematics classroom in a timely manner can enhance
students’ mathematical literacy and facilitate the integration of discipline with modern educational technology.
Through the prediction of subject literature quantity and the combination of relevant literature materials, it was found that the
development of high school mathematics literature presented three stages and had undergone two mutations. First, 2011–2017 was a
period of rapid development. The fluctuation of the annual literature volume was relatively stable in the early stage, with an annual
volume of about 480 articles. It has entered a period of rapid growth since 2014, with 648 articles published annually. Secondly,
2018–2021 was a transitional development period. The annual volume of literature on high school mathematics themes has been
gradually decreasing until it stabilized in 2021. Third, 2022–2024 is the key development period, with an annual volume of about 380
articles. The first mutation occurred during the high-speed development period. In 2014, the Ministry of Education formulated the
Reform Opinions on the core literacy system for student development and academic quality standards [44]. Since 2016, when relevant
Fig. 6. Prediction of the publication volume of high school mathematics literature.
13
research groups released The Development of Chinese Students’ Core Literacy, the research on mathematics and its subject core lit
eracy has gradually increased [45]. At the same time, this period was also a critical period for the revision of curriculum standards. The
concern and interpretation of relevant scholars on the revision of mathematics curriculum standards are also one of the reasons for the
rapid increase in literature volume. Based on the analysis of published literature for 16 years, Zhang confirmed that the achievements
of Chinese mathematics education research have been steadily rising since the beginning of the new century [46]. This reflected the
rapid leap in the quantity of mathematics education literature in China. Another mutation occurred in the transition period. Most of the
literature during this period focuses on the Evaluation System of China’s College Entrance Examination and the new curriculum
standard of high schools. Literature research focuses on the practice of morality, talent, and thought, as well as the effective imple
mentation of mathematics core literacy. Its literature content was more systematic and in-depth [47]. At the same time, some journals
will effectively control the number of papers after entering the CSSCI source extended edition and core journals, to improve the quality
of publishing. The annual journal of mathematics education research has a steep downward trend, which may have a lot to do with the
quality of the journal. From the side, it also reflects the transition and breakthrough of Chinese mathematics education research from
quantity to quality. After the leap and breakthrough in quantity and quality, Chinese mathematics education research literature has
entered a critical period of development. However, According to some scholars, the research methods used in mathematics education
in China are largely speculative and lack diversity. Additionally, there are concerns about many repetitive studies on the same themes
[48]. Therefore, the follow-up literature should strengthen empirical and quantitative research in terms of research methods, and
expand and excavate the content and depth of the literature. Such as strengthening the teaching and learning of mathematics,
mathematics textbooks, mathematics teachers, mathematics literacy, mathematics thought, and other content expansion depth. Then
present convincing and worthy promotion research results, and improve the quality and quantity of research papers on mathematics
education in China.
5. Conclusion
Based on Baidu Index and CNKI data from 2011 to 2021, this paper put forward a framework RAPF based on the relationship
analysis and prediction of discipline attention and literature research. RAPF first analyzed the relationship between high school
mathematics discipline attention and literature research through text mining and Spearman. Then established a relationship model
between them based on LOOCV theory. Finally, RAPF used the time series model to predict the future trends of discipline attention and
publication volume of discipline literature. This paper used a large number of data to carry out experimental research, which gave a
reference for other types of research in mathematics.
This study mainly drew the following conclusions: Firstly, text mining analysis has found that high school mathematics has formed
a word frequency list in the recent ten years, consisting of subject content, subject culture, subject literacy, and subject integration.
Then, according to Spearman correlation analysis, there was a correlation between literature research and discipline attention. Based
on this relationship, six keywords representing the subject characteristics were selected: trigonometric function, geometric sketchpad,
mathematical history, inequality, mathematical modeling, and multimedia technology. Secondly, based on LOOCV theory, this study
used representative keywords to build the multiple regression relationship models between high school mathematics discipline
attention and its literature research. The model fitting R2 was 0.774, indicating that the model fitting degree was good. The model
passed the error test (2%), and was feasible. Finally, ARIMA and regression models were used to predict the Index of discipline
attention and the publication volume of discipline literature. The prediction of subject attention found that subject culture, literacy,
and integration represented by mathematical history, mathematical modeling, geometric sketchpad, and multimedia technology
needed to be further strengthened and deepened in the classroom. The prediction of the volume in literature research found that
China’s mathematics education research has entered a critical phase of development. This follows a period of rapid and transitional
development, which saw a quantitative leap and a qualitative breakthrough. In the short term, In the short term, China’s mathematics
education research is still in a critical stage of development. Therefore, it is still necessary to explore the depth of literature research,
expand its breadth, and elevate its standards in the future. At the same time, it is necessary to strengthen the empirical research of
mathematics, enrich diversified research methods, and realize the steady improvement of quality and quantity.
There might be some limitations to the study. First, the sample size was small in relation modeling. Although LOOCV was also
suitable for model training with small sample sets, an appropriate sample size can reflect the diverse characteristics of the discipline.
Future research could slightly increase the sample interval and optimize model parameters on this basis. Second, due to the limitations
of the Baidu Index database, the keywords included on the Baidu Index platform are not comprehensive enough, which may affect the
collection of vocabulary. In the future, Index platforms such as Massive Arithmetic [49] can be expanded to further enrich the research.
Third, this article is mainly limited to statistical methods for analysis and discussion. Future research can promote other laws, such as
Lotka’s law, to understand the scientific maturity of the research field.
Future research directions in this field could start from other basic educational disciplines, such as Chinese and Biology, to explore
the universality of the relationship model between subject attention and literature research. This would help to enrich the widespread
application of network attention in the education field. In addition, future research can also be based on the mathematics discipline,
using Lotka’s law to carry out bibliometric analysis on the publication status and citation pattern of mathematics, to understand the
research dynamics of mathematics. This work is also currently under consideration. These studies may contribute to more diverse and
rich research fields.
14
Author contribution statement
All authors listed have significantly contributed to the investigation, development and writing of this article.
Funding statement
This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61862051 and 62001183;
the Science and Technology Foundation of Guizhou Province under Grant No. [2019]1447; the Nature Science Foundation of
Educational Department under Grant Nos. [2022]094 and [2022]100; Education Planning Project of Jiangsu Province under Grant No
/2022/01/166; the High Level Talent Foundation of Suqian University under Grant No. 2022XRC077.
Data availability statement
Data will be made available on request.
Additional information
No additional information is available for this paper.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to
influence the work reported in this paper.
References
[1] R. Huang, G. Luo, Q. Duan, L. Zhang, Q. Zhang, W. Tang, H. Zou, Using Baidu search index to monitor and predict newly diagnosed cases of HIV/AIDS, syphilis
and gonorrhea in China: estimates from a vector autoregressive (VAR) model, BMJ Open 10 (3) (2020), e036098, https://doi.org/10.1136/bmjopen-2019-
036098.
[2] B. Zhou, G. Yang, Z. Shi, S. Ma, Interpretable temporal attention network for COVID-19 forecasting, Appl. Soft Comput. 120 (2022) 108691, https://doi.org/
10.1016/J.ASOC.2022.108691.
[3] Y. Li, S. Lv, X. Liu, Q. Zhang, Incorporating transformers and attention networks for stock movement prediction, Complexity (2022) 2022, https://doi.org/
10.1155/2022/7739087.
[4] C.Q. Jiang, X.X. Wang, Z. Wang, Forecasting car sales based on consumer attention, Data Anal. Knowl. Discov. 5 (1) (2021) 128–139, https://doi.org/10.11925/
infotech.2096-3467.2020.0418.
[5] D.T. Li, Q.L. Feng, Tian, J. Comparative analysis of research hotspots in scientific journals and hotspots on the internet, Chin. J. Sci. Technol. Res. 28 (11) (2017)
992–996, https://doi.org/10.11946/cjstp.201704130259.
[6] H. Zhang, X.Y. Yang, X.J. Zhao, C.M. Yang, B. Li, Subject frontiers hot spots mining based on social network attention, J. Zhengzhou Univ. (Nat. Sci. Ed.) 50 (3)
(2018) 46–52, https://doi.org/10.13705/j.issn.1671-6841.2017201.
[7] Y.X. Pai, C.L. Li, Y.M. Liu, AAS high concern subject research topic identification based on z-index, Inf. Data Services 40 (6) (2019) 30–37, https://doi.org/
10.12154/j.qbzlgz.2019.06.005.
[8] M. J, M. Juniantari, G.A. Mahayukti, I.N. Gita, I.P.P. Suryawan, Validity of introduction to basic mathematics teaching materials based on conceptual
understanding procedures models and character education, J. Phys. Conf. 1503 (1) (2020), 012018, https://doi.org/10.1088/1742-6596/1503/1/012018.
[9] S. Alagumalai, N. Buchdahl, Pisa 2012: examining the influence of prior knowledge, time-on-task, school-level effects on achievements in mathematical literacy
processes – interpret, employ and formulate, Aust. J. Educ. 65 (2) (2021) 173–194, https://doi.org/10.1177/00049441211031674.
[10] A. van den Bogaart, D. A, R.J. Oostdam, F. Janssen, J.J. M, From speck to story: relating history of mathematics to the cognitive demand level of tasks, Educ.
Stud. Math. (2021) 1–16.
[11] H.X. Deng, L.B. Wu, Y.M. Cao, The hot spot, evolution and prospect of math textbooks in primary & secondary schools in China since the new century– an
empirical analysis based on scientific knowledge mapping, Educ. Theory Prac. 41 (8) (2021) 38–43.
[12] F. Lin, C. Pang, Q.L. Gu, Based on baidu index in the Beijing area children allergic rhinitis attention research, Chin. J. Prac. Pediatr. 36 (2) (2021) 130–135,
https://doi.org/10.19538/j.ek2021020611.
[13] Q.Y. Wei, J. Liu, J.H. Zhang, P.J. Pan, R.F. Chen, J.M. Chu, J.J. Jiang, B.Y. Liang, H. Liang, L. Ye, Analysis of the epidemic situation of COVID-19 and network
public opinion monitoring based on Baidu Index, Chin. J. Prev. Med. 22 (1) (2021) 4–9, https://doi.org/10.16506/j.1009-6639.2021.01.002.
[14] W.S. Wang, X.Y. Chang, Research on the prediction of regional RPI growth based on web search data- the simulation of the impact of daily network search index
on RPI in Hebei province, Price Theory Prac. (2) (2022) 104–107, https://doi.org/10.19851/j.cnki.cn11-1010/f.2022.02.073.
[15] C. Yang, H. Jiang, Z.R. Lei, Exchange rate prediction based on text mining and Baidu Index, Stat. Decis. Making 35 (13) (2019) 85–87, https://doi.org/
10.13546/j.cnki.tjyjc.2019.13.020.
[16] T. Narbaev, A meta-analysis of the public-private partnership literature reviews: exploring the identity of the field, Int. J. Strat. Property Manag. 26 (4) (2022)
318–331, https://doi.org/10.3846/ijspm.2022.17860.
[17] Cortese Samuele, et al., Half a century of research on Attention-Deficit/Hyperactivity Disorder: a scientometric study, Neurosci. Biobehav. Rev. 140pp (2022)
104769.
[18] L.L. Hou, Method and dataset entity mining in scientific literature: a CNN + BiLSTM model with self-attention, Knowl. Base Syst. 235 (2022), https://doi.org/
10.1016/j.knosys.2021.107621.
[19] R. Osei-Kyei, T. Narbaev, G. Ampratwum, A scientometric analysis of studies on risk management in construction projects, Buildings 12 (9) (2022) 1342,
https://doi.org/10.3390/buildings12091342.
[20] H. Zhang, Exploration of mathematics classroom teaching practice in middle school under the concept of core mathematics literacy, Front. Educ. Res 3 (6)
(2020).
[21] S. Maslihah, S.B. Waluya, Kartono Rochmad, N. Karomah, K. Iqbal, Increasing mathematical literacy ability and learning independence through problem-based
learning model with realistic mathematic education approach, J. Phys. Conf. (4) (2021) 1918, https://doi.org/10.1088/1742-6596/1918/4/042123.
[22] R. Hu, Hot issues and trends in Chinese reform and development of mathematics education and teaching-summary of the Chinese conference on reform and
development of mathematics education and teaching and doctoral (masters’) forum on mathematics education, J. Math. Educ. 27 (4) (2018) 100–102.
15
[23] D.P. Zhang, J.Y. Gu, T.T. Zhang, Research Progress and Trend Prediction of Mathematics Education in China: Based on Data Analysis of CNKI in Recent Five
Years vol. 8, University Education, 2019, pp. 93–96, https://doi.org/10.3969/j.issn.2095-3437.2019.08.029.
[24] J. Rymarz, A. Niewczas, P. Hołyszko, E. Dębicka, Application of spearman’s method for the analysis of the causes of long-term post-failure downtime of city
buses, Appl. Sci. 12 (6) (2022) 2921, https://doi.org/10.3390/APP12062921.
[25] T. Chang, S. Chi, S. Im, Understanding user experience and satisfaction with urban infrastructure through text mining of civil complaint data, J. Construct. Eng.
Manag. 148 (8) (2022), https://doi.org/10.1061/(ASCE)CO.1943-7862.0002308.
[26] Y. Zhao, J. Li, K. Liu, J.E. Wang, Analyzing the spatio-temporal characteristics and influencing factors of “aI + education” network attention in China, Math.
Probl Eng. (2022) 2022, https://doi.org/10.1155/2022/5101967.
[27] M.R. Elkharbotly, M. Seddik, A. Khalifa, Toward sustainable water: prediction of non-revenue water via artificial neural network and multiple linear regression
modelling approach in Egypt, Ain Shams Eng. J. 13 (5) (2022), https://doi.org/10.1016/J.ASEJ.2021.101673.
[28] M. Gupta, P. Chinnasamy, Trends in groundwater research development in the South and Southeast Asian Countries: a 50-year bibliometric analysis (1970-
2020), Environ. Sci. Pollut. Res. Int. 29 (50) (2022), https://doi.org/10.1007/S11356-022-21163-4.
[29] C.B. Chulhwan, Information systems frontiers: keyword analysis and classification, Inf. Syst. Front 17 (1) (2015) 217–237, https://doi.org/10.1007/s10796-
014-9544-z.
[30] L. Tran, K. He, D. Wang, H. Jiang, A cross-validation statistical framework for asymmetric data integration, Biometrics (2022), https://doi.org/10.1111/
BIOM.13685.
[31] N. Hou, J.M. Su, Z.J. Li, Analysis of key influencing factors of economic development in Inner Mongolia, Inn. Mongolia Stat. (3) (2017) 10–15, https://doi.org/
10.19454/j.cnki.cn15-1170/c.2017.03.007.
[32] Y. Zhao, J. Li, J. Wang, Analysis and prediction of “AI + education” attention based on baidu index—taking guizhou province as an example, Future Internet 13
(5) (2021) 120, https://doi.org/10.3390/FI13050120.
[33] H. Verma, S. Mandal, A. Gupta, Temporal deep learning architecture for prediction of COVID-19 cases in India, Expert Syst. Appl. 195 (2022) 116611, https://
doi.org/10.1016/J.ESWA.2022.116611.
[34] M. Kass, M. Ramin, M.D. Hanigan, P. Huhtanen, Comparison of Molly and Karoline models to predict methane production in growing and dairy cattle, J. Dairy
Sci. 105 (4) (2022) 3049–3063, https://doi.org/10.3168/JDS.2021-20806.
[35] S.A. DeLurgio, Forecasting Principles and Applications. Homewood,lll, Richard D. lrvin,Inc, 1998.
[36] Y. Qingxin, S. Andrey, M. Alexander, L. Henrik, H. Martin, V.K. Evgenii, Revealing property-performance relationships for efficient CO2 hydrogenation to higher
hydrocarbons over Fe-based catalysts: statistical analysis of literature data and its experimental validation, Appl. Catal. B Environ. (2020), https://doi.org/
10.1016/j.apcatb.2020.119554.
[37] M.S. Liu, X.H. Du, W.J. Chen, A statistical analysis of China’s forestry research hotpots based on keywords of scientific papers一Taking keywords of eight
forestry journals from 2014 to 2018 as an example, Forest Eng. 38 (2) (2022) 154–160, https://doi.org/10.16270/j.cnki.slgc.2022.02.007.
[38] R.Z. Ren, N.Q. Ding, An analysis of the key factors in ESP teaching with the " 5W1H" method, Foreign Lang. Circles (2) (2012) 58–64.
[39] X.Y. Liu, The interpretation of the "28" management rule, Econ. Theory Econ. Manag. (8) (2004) 57–59, https://doi.org/10.3969/j.issn.1000-
596X.2004.08.010.
[40] N. Amin, P. Paolo, Dynamic water allocation policies improve the global efficiency of storage systems, Adv. Water Resour. 104 (2017) 55–64, https://doi.org/
10.1016/j.advwatres.2017.03.004.
[41] J. Xiao, Let mathematics culture infiltrate mathematics Classroom, Chin. Educ. J. (12) (2021) 102.
[42] B. Chen, P. Yu, Research on the evaluation of high school students’ mathematical core literacy, Math. Bull. 60 (8) (2021) 16–21, https://doi.org/10.3969/j.
issn.0583-1458.2021.08.003.
[43] Y.Z. Wang, Thinking about teaching and learning mathematical modeling in middle school, Math. Bull. 59 (11) (2020) 1–3, https://doi.org/10.3969/j.
issn.0583-1458.2020.11.001.
[44] Ministry of Education of the People’s Republic of China, Opinions of the Ministry of education on comprehensively deepening curriculum reform and
implementing the fundamental task of building virtue and cultivating people. http://www.moe.gov.cn/srcsite/A26/jcj_kcjcgh/201404/t20140408_167226.
html.
[45] S. Jiang, Research report on mathematics education in primary and secondary schools(2016-2020)–Based on the data of the replicated journal of the
information center for social sciences of RUC, J. Tianjin Normal Univ. 23 (1) (2022) 22–28, https://doi.org/10.16826/j.cnki.1009-7228.2022.01.004.
[46] Q.W. Zhang, A new century picture of mathematics education research in China–Based on papers analysis of journal of mathematical education (2000–2016),
J. Math. Educ. 27 (4) (2018) 95–99.
[47] B.Y. Yan, X.P. Song, Reflecting the mathematics curriculum objectives of inheritance and development –Interpretation of the target system of mathematics
curriculum standard for ordinary high school (2017 edition), Math. Bull. 57 (12) (2018) 18–21, https://doi.org/10.3969/j.issn.0583-1458.2018.12.005.
[48] W.Q. Niu, T. Zhang, B. Xiong, Reflection of research on mathematical modeling in China’s primary and secondary schools, J. Math. Educ. 26 (5) (2017) 66–70.
[49] Massive arithmetic. https://trendinsight.oceanengine.com/.
16

1 s2.0 S2405844023059583 Main

Uploaded by

Copyright:

Available Formats

You might also like

1 s2.0 S2405844023059583 Main

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

1 s2.0 S2405844023059583 Main

Uploaded by

Copyright:

Available Formats

Heliyon 9 (2023) e18750

Contents lists available at ScienceDirect

The sustainable development of mathematics subject: An

The main contribution of this study was as follows.

3. Proposed RAPF framework

Fig. 1. RAPF framework.

3.1. Relationship analysis

3.1.1. Text mining

Fig. 2. Text word frequency acquisition process.

3.1.2. Spearman correlation analysis

3.2. Relationship modeling

3.2.1. Model establishment

(1) Model basis

(2) Model building

3.2.2. Model accuracy test

3.3. Forecast analysis

3.3.1. Model building

Fig. 3. Flow chart of prediction.

3.3.2. Model checking

4.1.1. Data source

4.1.2. Selection of subject keywords

<10 Highly accurate

1 High school mathematics 2487 26 Mathematics learning 42

1. Eliminate words with repeated semantics;

4.2. Relationship modeling between discipline attention and literature research

4.2.1. Relationship model construction

4.2.2. Model error analysis

4.3. Predictive analysis of disciplinary literature research

4.3.1. Construction of prediction model

(1) Stationary processing

Independent Subject content Trigonometric function x1 Annual average of Baidu Index

2017 6.4739 6.3680 1.6354 1.0232

for parameter identification.

(2)Identification and verification of model parameters

4.3.2. Prediction of discipline attention

data 0 − 1.09 0.719 1391.893 − 3.487 − 2.886 − 2.58

Constant 4.006 0.553 7.244 0 2.922 5.089

Fig. 4. Time series diagram of keyword “inequality".

4.3.3. Prediction of subject literature volume

lny = − 9.452 + 0.701x1 − 1.965x2 + 1.092x3 + 1.002x4 + 1.752x5 − 0.444x6 (12)

4.4. The implications of the findings

Fig. 6. Prediction of the publication volume of high school mathematics literature.

Author contribution statement

Data availability statement

Data will be made available on request.

No additional information is available for this paper.

Declaration of competing interest

You might also like