
Research on High-Value Patent Identification Model

from Perspective of Patent Transfer


Zengyuan Wu

China Jiliang University


Ying Li
China Jiliang University
Xiangli Han
China Jiliang University
Bin He
Syney Elevator (Hangzhou) Co., Ltd

Research Article

Keywords: patent transfer, high-value patent identification, imbalanced data, ensemble learning

Posted Date: April 12th, 2024

DOI: https://doi.org/10.21203/rs.3.rs-4239996/v1

License:   This work is licensed under a Creative Commons Attribution 4.0 International License.

Additional Declarations: No competing interests reported.


RESEARCH ON HIGH-VALUE PATENT IDENTIFICATION MODEL FROM
PERSPECTIVE OF PATENT TRANSFER

Zengyuan Wu *(corresponding author)


College of Economics and Management, China Jiliang University
No. 258, Xueyuan Street, Hangzhou, Zhejiang Province, 310018, P.R. China
wuzengyuan@cjlu.edu.cn

Ying Li
College of Economics and Management, China Jiliang University
No. 258, Xueyuan Street, Hangzhou, Zhejiang Province, 310018, P.R. China
2577325924@qq.com

Xiangli Han
College of Economics and Management, China Jiliang University
No. 258, Xueyuan Street, Hangzhou, Zhejiang Province, 310018, P.R. China
1768103222@qq.com

Bin He
Syney Elevator (Hangzhou) Co., Ltd
No. 31, Yangcheng Street, Hangzhou, Zhejiang Province, 310000, P.R. China
hebin@syney.net


ABSTRACT

Accurately identifying high-value patents can be difficult with the dramatic increase in the number of patent

applications. This leads to a low rate of commercialization of patent achievements. Whether a patent is transferred or

not is an important reflection of the value of the patent. To address these problems, we propose a high-value

patent identification model that combines hybrid sampling technology and ensemble learning algorithm. First, we add

technical capacity of patentees based on traditional high-value patent identification indicators to reconstruct the

indicator system. Then we reduce the identification indicator system for high-value patents to eliminate redundant

indicators. Second, we use Adaptive Synthetic Sampling - Local Outlier Factor (ADASYN-LOF) to expand minority

samples to balance the data. Finally, we use Genetic Algorithm (GA) to optimise the parameters of AdaBoost. For

clarity, this model is called the ADASYN-LOF-GA-AdaBoost model. To test the effectiveness of this model, we

use patent data from the field of scientific instruments. The results demonstrate that the proposed model achieves an ACC of

94.47%, AUC of 94.87%, recall of 97.54%, and F1-score of 95.23%. The results show that ADASYN-LOF-GA-

AdaBoost model performs better than other models. Therefore, this model can effectively identify high-value patents

with transfer potential.

Keywords: patent transfer; high-value patent identification; imbalanced data; ensemble learning

1. Introduction

With the rapid advancement of technology and increasing global economic competition, the importance of

intellectual property has become more and more prominent. In the era of the knowledge economy, patents as an

important form of intellectual property play a crucial role in protecting innovation and driving economic development.

Furthermore, patents serve as essential carriers of core technologies and reflect a nation's scientific and technological

strength. Consequently, supported by strategic initiatives and policy encouragement from various countries, the global

patent application volume has been increasing year by year. According to a report released by the World Intellectual

Property Organization, the number of patent applications submitted by over 150 countries and regions worldwide

reached 3.4 million in 2022 (Kong et al. 2023). However, as the volume of patent applications continues to rise, the

quality of patents varies greatly. Relevant studies have shown that half of the patent value in the world comes from

only 5% of high-value patents, and only a minority of patents can truly achieve the transfer and transformation of

patent achievements (Zhou et al. 2021). High-value patents can help enterprises make targeted R&D investments,

avoid financial risks, enhance technological competitive advantages, and increase market share. For countries, they

can strengthen national scientific and technological strength (Huang et al. 2021). Therefore, the development of

accurate and efficient methods for identifying high-value patents has become a hot topic of academic research.

The research on high-value patent identification mainly focuses on the construction of patent identification

indicator systems and the improvement of identification methods. Traditional high-value patent identification

indicator systems are primarily constructed along market, legal, and technological dimensions. For example, Wang et al. (2019) used

five indicators such as the number of cited patents, the number of patent families, patent coverage, the number of

claims per patent, and the number of patent litigations to identify leading technologies in sustainable energy. Hu et al.

(2023) confirmed that indicators such as patent family size and first citation speed are significant in measuring patent

value when identifying high-value patents in the integrated circuit field. However, most of these indicators have a lag

and are difficult to use for identifying newly authorized patents. In terms of identification methods, scholars calculate

patent value by determining the weights of indicators based on the constructed indicator system (Huang et al. 2022).

However, this approach is time-consuming and labor-intensive, making it challenging to handle large amounts of data.

Therefore, scholars have begun to use machine learning to identify high-value patents. Yang et al. (2022) measured

the technological novelty value of each technology and used K-means clustering to identify technological

opportunities in the drone field. Liu et al. (2023) comprehensively considered the patent value evaluation model and

the credit evaluation model for small and medium-sized enterprises, constructed 17 measurement indicators, and

finally used RF to screen out high-value underlying assets. Although these methods have made certain contributions

to the identification of high-value patents, they overlook the imbalance of patent data, which means that the number

of high-value patents is relatively small compared to non-high-value patents. Most machine learning algorithms are

based on balanced datasets for classification. Directly training on imbalanced datasets can lead to model distortion

and poor classification performance.

Given this, this study aims to enhance the accuracy of high-value patent identification by constructing the

following framework: Firstly, high-value patents are defined based on whether patent transfer has occurred. Secondly,

to effectively identify newly granted patents, the traditional dimensions of technology, legal, and market are

supplemented with a dimension covering the technical capacity of patentees to optimize the indicator system. Thirdly, addressing

the issue of imbalanced data, a combined model of ADASYN-LOF-GA-AdaBoost is constructed. This involves using

the ADASYN-LOF algorithm to balance the samples and utilizing GA to find the optimal parameter combination for

AdaBoost, further enhancing the classification performance of the model. Finally, this model is compared with other

machine learning algorithms to validate its effectiveness.

2. Literature Review

2.1 The definition of high-value patent

The research on high-value patents has always received considerable attention. Its connotation is closely related

to the development of national strategies and socio-economic development. Currently, there is no consensus on the

definition of high-value patents both domestically and internationally. However, most scholars agree that the

formation of patent value cannot be separated from its efficient transformation and utilization (Fischer et al. 2014).

Patents have broad prospects for industrialization and market application, as well as strong policy adaptability. By

converting and applying patents through various means such as possession, use, transfer, and mortgage, higher

economic benefits can be obtained, thereby contributing to society.

The definition of high-value patents involves multiple disciplines and perspectives, primarily focusing on patent

lifespan, patent protection strength, and the degree of patent innovativeness. Odasso et al. (2015), starting from different

types of buyers and sellers, argued that the transaction price of a patent depends on its remaining lifespan. Lee (2008)

found in his research that the quality and value of a patent can be proxied by its lifespan, and factors such as the scale

of invention, the number of claims, and the number of self-citations can influence the patent lifespan. Although using

patent lifespan to measure high-value patents has achieved certain results, it is difficult to predict the value of some

expired patents from the perspective of patent lifespan, which may lead to the loss of important information (Mansfield

et al. 1981). Therefore, scholars have begun to use patent protection strength to measure patent value. Klemperer

(1990) was the first to introduce the concept of patent breadth. He believed that the strength of patent protection

depends on the breadth of the patent, and the larger the scope of protection, the higher the cost for competitors to

imitate. Ultimately, the larger the profits allocated to the patent holder, the higher the patent value will be. However,

patent breadth is usually specific to a particular technical field and lacks universality for cross-field composite

technologies. Currently, some scholars are attempting to measure patent value from the perspective of patent

innovativeness by using patent citation information (Miao et al. 2021). They believe that the number of citations a

patent receives can be used to assess the competitiveness and influence of a technological invention. Patents with a

high number of citations are likely to be high-value patents. Yang et al. (2015) proposed a patent value evaluation

method based on a comprehensive patent citation network and used this method to identify more high-value patents

in the field of optical disk technology. Danish et al. (2023) used the number of citations as a measure of high-value

patents to evaluate the value of Indian patents from the perspectives of inventors, companies, and technology.

Measuring patent value based on the number of citations is simple, straightforward, and widely applicable. However,

this indicator has a lag effect and a time dilation effect, and its ability to explain differences in patent value is limited

(Huang et al. 2020).

With the deepening of research, scholars have found a connection between the transferability of patents and their

value. The transferability of patents refers to the possibility of realizing the potential value of patents through

transactions, that is, whether patents can be transferred or sold (Ko 2019). Patent value refers to the economic benefits

brought to enterprises by patents in the process of operation and the contributions to the enterprise's development

strategy under actual market conditions. The transferability of patents can be evaluated by examining whether the

patent has been transferred. The value of a patent has an impact on its transferability. Patents with higher value are

usually easier to be transferred because buyers are more willing to invest in patents with potential economic returns.

2.2 The indicator system of high-value patent identification

The construction of a high-value patent identification indicator system directly affects the accuracy of the

identification. Existing research on the construction of indicator system mainly falls into three major dimensions:

legal, market, and technological. Chang et al. (2014) added patent family depth and revenue plan ratio to the indicator

system based on features such as the number of claims and patent citations, and used logistic regression to identify

and analyze high-value patents in the LED industry. The results showed that the selected indicators were positively

correlated with patent value. Danish et al. (2020) evaluated patent value based on indicators such as patent family size,

technological scope, and patent grant. Chiu et al. (2015) constructed a high-value patent evaluation indicator system

using references, patent citation counts, non-patent references, and other indicators, and then applied them to evaluate the

value of multinational patents. In current research on high-value patent identification indicator systems, scholars tend

to construct a system that identifies high-value patents based on three dimensions: legal, technological, and market.

However, for some newly granted patents in their early stages, their market value may not be fully realized and their

lifespan may be relatively short. Therefore, relying solely on these three dimensions for value assessment can make it

challenging to effectively identify high-value patents.

Existing research has shown that factors related to the patentee are positively correlated with patent value. To

better identify high-value patents, the construction of an indicator system needs to take into account the technical

proficiency of the patentee. Lee et al. (2018) demonstrated through research that incorporating patent ownership

indicators can effectively identify emerging technologies, which have a significant impact on patent value. Caviggioli

et al. (2016) further discovered a significant positive correlation between patent ownership-related indicators and

patent value. Chung et al. (2021) identified competitive partners by assessing the technical capabilities, concentration,

and scale of inventor groups in order to enhance the conversion value of patents.

2.3 The identification of high-value patent

Due to the uncertainty of high-value patents and the complexity of patent data, early research methods for

identifying high-value patents primarily relied on market-based criteria to evaluate patent value. These methods can

be categorized into four major types: cost approach, market approach, income approach, and real options approach.

Scholars estimate the costs invested in patent technology during the research and development stage through the cost

approach to predict the future value of the patent. Moreno et al. (2016) conducted an analysis of future pricing in the

pharmaceutical industry based on cost-benefit analysis, while also considering non-patent pricing and future patients.

However, there is no clear correlation between the R&D investment of most patents and the subsequent returns,

making it difficult to accurately estimate the value of patents. Some scholars also use the market approach to determine

the value of a patent to be evaluated based on the transaction prices of similar patents in the market. Hsu et al. [26]

used the market approach to match university patents with those granted to listed companies with similar

characteristics and estimated the potential value of university patents based on the stock market's response to the

matched patents. The implementation of the market approach is challenging and relies on comprehensive transaction

data platforms. The income approach involves estimating and discounting the future market prospects of patents. Oh

et al. (2022) used the income approach to estimate the market value of future innovative technologies in order to gain

an early understanding. In a highly competitive market environment, the uncertainty of valuation using the income

approach is extremely high. Chung et al. (2019) developed a new theoretical framework to assess the value of software

using real option theory. Although this method is practical, the models used are complex and difficult to understand,

making it challenging to address the uncertainty of variables in the future and exhibiting significant randomness.

Methods based on market criteria approach patent value assessment from a static perspective and can only roughly

calculate patent value indicators, making it difficult to handle large amounts of patent data (Chen et al. 2016).

Over time, scholars have begun to develop a set of high-value patent evaluation indicator systems in order to

address the inapplicability of market-based criteria methods for assessing value. A commonly used approach is the

comprehensive evaluation method, which primarily includes analytic hierarchy process (AHP), fuzzy comprehensive

evaluation method, entropy weight method, and others. Huang et al. (2022) evaluated patent value by calculating the

weights of patent evaluation indicators using the AHP. Wang et al. (2015) used the fuzzy comprehensive evaluation

method to assess patent value. Yuan et al. (2022) combined the entropy weight method with TOPSIS to improve the

rationality of indicator weights and applied this method to evaluate patents in the field of solar cell technology.

Research on the evaluation of high-value patents based on the comprehensive evaluation method has become relatively

mature. However, when assigning weights to indicators, subjective value judgments play a significant role, which can

be time-consuming and labor-intensive. For patents with different uses, the factors influencing value vary

significantly, making it difficult to construct a universal patent value indicator system using this method. Additionally,

it is not suitable for identifying large amounts of patent data.

With the development of artificial intelligence technology, an increasing number of scholars have begun to apply

machine learning techniques to the identification of high-value patents. Erdogan et al. (2022) believe that, against the

backdrop of a rapid increase in patent applications, identifying high-value patents is crucial for enterprises to make

precise investments. Therefore, they constructed a predictive model combining supervised algorithms with the analytic

hierarchy process to identify high-value patents. Kwon et al. (2020) established a multi-dimensional indicator system

and constructed six machine learning models, including DT, XGBoost, and RF, to demonstrate the effectiveness of

this method. Han et al. (2022) used support vector machines (SVM) to identify cutting-edge technologies in the field

of electric vehicles and verified that this method performs well in terms of prediction accuracy and generalization

ability. Lee et al. (2018) extracted quantitative indicators from patent data and used AdaBoost to predict sustainable

technology transfers, concluding that the algorithm exhibits good classification performance in technology transfer

predictions. Machine learning algorithms can not only easily handle massive amounts of data but also automatically

learn the importance of high-value patent indicators for weight assignment, making identification more efficient.

However, when machine learning is used for classification, data imbalance can affect classification performance.

A review of existing research shows that scholars have rarely considered data imbalance in

high-value patent identification based on machine learning, resulting in relatively low identification accuracy.

3. Methodology

This paper proposes a high-value patent identification model based on machine learning. The specific approach

is as follows: Firstly, a high-value patent identification indicator system is constructed from the dimensions of legal,

market, technological, and technical capacity of patentees. The required data is obtained from the PatSnap patent

database. Secondly, the indicators are reduced to eliminate redundant ones. Then, the ADASYN-LOF method is used

to expand the dataset of the retained indicators, ensuring a balanced distribution of the data. AdaBoost is subsequently

employed for training and testing, and the accuracy of high-value patent identification is further improved by

incorporating a genetic algorithm (GA). Finally, classification evaluation metrics are used to assess the performance

of the machine learning model, ultimately leading to the development of a high-value patent identification model

tailored for patent transfers.
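
The evaluation metrics named in this paper (ACC, AUC, recall, and F1) can be illustrated with a minimal sketch. It uses the standard textbook definitions and assumes binary labels with 1 marking a high-value (transferred) patent; the function names are hypothetical and are not taken from the paper's implementation.

```python
def classification_metrics(y_true, y_pred):
    """Return ACC, recall, precision and F1 for binary labels (1 = high-value)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    acc = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"acc": acc, "recall": recall, "precision": precision, "f1": f1}


def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    Assumes at least one positive and one negative sample.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Recall is the metric most sensitive to missed high-value patents, which is why it is reported alongside accuracy for imbalanced data.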

3.1 The construction and reduction of high-value patent identification indicator system

3.1.1 The construction of a high-value patent evaluation indicator system

Based on existing research, this paper constructs a high-value patent indicator system suitable for machine

learning. The selected indicators cover four dimensions: legal, market, technological, and technical capacity of

patentees. Some time-lag indicators have been excluded, and the specific indicators are listed in Table 1.

(1) The technological dimension reflects the technological level of the patent itself. The number of IPC

classification codes reflects the technical coverage of the patent, indicating the connotation of the technology, which

is measured by the number of four-digit IPC subcategories of the patent. The number of cited patents refers to the

number of other patents cited by the target patent, reflecting the technological foundation of the target patent. The

number of non-patent citations refers to the number of scientific papers cited by the patent. The more citations there

are, the more likely it is that the patent is based on a larger number of research results, and its technological level is

likely to be higher. The number of pages in the document can be used to describe the structure, technical essentials,

and usage methods of a certain patent technology. The more pages there are, the higher the complexity of the patent

technology.

(2) The patent legal dimension primarily measures the statutory value of a patent from the perspectives of patent

application process, application cost, maintenance cost, and scope of protection. The number of claims refers to the

scope of protection claimed by the target patent. The more claims there are, the more technical features the patent has,

indicating stronger innovation capabilities and higher advancement of the patent. The number of litigation cases

reflects the legal effectiveness of a patent. Patents with higher technological content and stronger novelty are more

likely to encounter litigation. The number of independent claims reflects the innovation and practicality of a patent in

solving technical problems. The higher the technological innovation and practicality, the higher the patent value. The

examination duration refers to the time span from the patent application date to the patent grant date. Patents with

longer examination durations indicate more advanced patent technology. The duration of maintenance refers to the

period between the patent grant date and the estimated expiration date of the patent. The longer the duration of

maintenance, the more funds expended, indicating a greater patent value.

(3) The patent market dimension primarily measures patent value based on the scope of patent protection and

patent type. The number of simple patent families refers to the count of patents within a patent family that share the

same priority right. A larger patent family size indicates a stronger patent protection network, a more complete

technical portfolio layout, and a higher value of the patent family. The number of simple family members represents

the number of countries where patent applications are filed, reflecting the international competitiveness of the patent.

Patents that are applied for protection in multiple countries generally have higher value. Whether a patent belongs to

a strategic emerging industry or a national economic industry also reflects the degree of patent value to a certain extent.

(4) The technical capacity of patentees dimension primarily reflects the technical strength of patent-related

entities. The number of current patentees and inventors is used to reflect the cooperation situation of patents. The more

inventors there are, the more knowledge and experience contributed by different inventors, leading to a stronger

knowledge base and a higher level of value. Different types of applicants have varying tendencies towards patent

transfer. Scientific research institutions often undertake the work of technology research and development, while

enterprises focus on the market operation of technology. In this paper, the types of patentees are divided into

companies, scientific research institutions, individuals, government agencies and others, and are labeled as 0, 1, 2, 3,

4 accordingly. Technical influence refers to the total number of patents published by the patent holder in the field.

Generally, the more patents an inventor publishes in the field, the more thoroughly they understand the knowledge in

that field, and the more likely they are to create high-value patents. Overall technical strength refers to the total number

of patents invented by the patent inventor, reflecting their technological innovation capabilities and strength.
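
Several of the indicators above are derived values rather than raw database fields. The following minimal sketch shows how the patentee type labels (0 to 4) and the two duration indicators could be computed. The dictionary keys, the function names, and the choice of days as the time unit are assumptions made for illustration; the paper does not state the unit it uses.

```python
from datetime import date

# Label encoding stated in the text: companies 0, scientific research
# institutions 1, individuals 2, government agencies 3, others 4.
# The string keys are hypothetical names chosen for this sketch.
PATENTEE_TYPE_CODES = {
    "company": 0,
    "research_institution": 1,
    "individual": 2,
    "government_agency": 3,
    "other": 4,
}


def encode_patentee_type(kind):
    """Map a patentee type to its integer label; unknown types fall into 'other'."""
    return PATENTEE_TYPE_CODES.get(kind, PATENTEE_TYPE_CODES["other"])


def examination_duration_days(application_date, grant_date):
    """Time span from the application date to the grant date (unit assumed: days)."""
    return (grant_date - application_date).days


def maintenance_duration_days(grant_date, expected_expiry_date):
    """Time span from the grant date to the estimated expiration date (days)."""
    return (expected_expiry_date - grant_date).days
```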

Table 1 The high-value patent evaluation indicator system


Indicator Dimension              Indicator Name
technological                    number of IPC classification codes (Huang et al. 2021)
technological                    number of cited patents (Huang et al. 2021)
technological                    number of non-patent citation documents (Chiu et al. 2015)
technological                    number of pages of documents (Chiu et al. 2015)
legal                            number of claims (Huang et al. 2021)
legal                            number of litigation cases (Huang et al. 2021)
legal                            number of independent claims (Chang et al. 2014)
legal                            examination duration (Odasso et al. 2015)
legal                            maintenance duration (Lee 2008)
market                           number of simple patent families (Hu et al. 2023)
market                           number of simple family members (Hu et al. 2023)
market                           whether it belongs to a strategic emerging industry (Lee et al. 2018)
market                           whether it belongs to the national economic industry (Lee et al. 2018)
technical capacity of patentees  number of current patentees (Lee et al. 2018)
technical capacity of patentees  number of inventors (Lee et al. 2018)
technical capacity of patentees  type of applicant (Caviggioli et al. 2016)
technical capacity of patentees  technological influence (Chung et al. 2021)
technical capacity of patentees  overall technological strength (Chung et al. 2021)

3.1.2 The reduction of high-value patent identification indicator system

After establishing the identification indicator system for high-value patents, this paper reduces the constructed

indicators to eliminate redundancy and improve the prediction accuracy of the model. Consequently, an optimized

indicator system suitable for machine learning classification is obtained.

Firstly, we divide the dataset. To ensure the validity of the final test results, we employ the ten-fold cross-

validation method for our experiments. Specifically, the dataset is randomly divided into ten parts, with nine parts

serving as the training set and one part serving as the test set in rotation. This process of training and testing is repeated

ten times. The model's performance is then evaluated based on the combined results of these ten iterations.
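
The ten-fold procedure described above can be sketched as follows. This is a minimal illustration using only the standard library, not the authors' code; `evaluate` is a hypothetical callback that trains on the training indices and returns a score on the test indices.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # random division of the dataset
    folds = [indices[i::k] for i in range(k)]   # k parts of near-equal size
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


def cross_validated_mean(n_samples, evaluate, k=10, seed=0):
    """Average the score returned by evaluate(train_idx, test_idx) over k folds."""
    scores = [evaluate(tr, te) for tr, te in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)
```

With `k=10`, each round trains on roughly 90% of the samples and tests on the remaining 10%.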

Secondly, we obtain the initial mean prediction accuracy $\bar{Q}_0$ of the model. Imbalanced data can lead to poor

classification performance of the model. To make the reduction results more effective, we employ the ADASYN-LOF

method to balance the dataset before reducing the indicators. Subsequently, we use the AdaBoost to train and test the

dataset containing the original indicators, obtaining the identification accuracy $\bar{Q}_0$.

Then, we calculate the importance index for each indicator:

$I_i = \bar{Q}_i - \bar{Q}_0$ (1)
where $\bar{Q}_0$ represents the mean accuracy of the initial model, and $\bar{Q}_i$ represents the mean accuracy of the model after the

removal of the $i$th indicator. By removing each indicator in turn from the original set (and restoring it before the next removal), we

calculate the importance indices for each indicator. Indicators are then ranked in ascending order according to their

importance indices, meaning that the higher the ranking, the greater the impact of the indicator on improving the

model's identification accuracy.
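
A minimal sketch of the importance index and the ascending ranking follows. The accuracy values in the usage note are invented purely for illustration and the function names are hypothetical. An indicator whose removal lowers accuracy the most receives the most negative index and is therefore ranked first.

```python
def importance_indices(initial_accuracy, accuracy_without):
    """Compute I_i = Q_i - Q_0 for each indicator.

    accuracy_without maps an indicator name to the mean accuracy obtained
    after removing that single indicator from the full set.
    """
    return {name: q_i - initial_accuracy for name, q_i in accuracy_without.items()}


def rank_indicators(indices):
    """Rank indicator names in ascending order of their importance index."""
    return sorted(indices, key=lambda name: indices[name])
```

For example, with a hypothetical initial accuracy of 0.90 and accuracies of 0.85, 0.91, and 0.88 after removing "claims", "pages", and "inventors", the ranking is "claims", "inventors", "pages".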

After obtaining the ranking of indicator importance indices, we proceed with indicator reduction. Using a

recursive method, we extract one indicator at a time from the ranked list of indicator importance indices and calculate

its mean model identification accuracy. If the mean accuracy of the model improves, we retain that indicator.

Otherwise, we continue by attempting to add the next indicator from the list. This process continues until all indicators

in the list have been considered. The algorithm stops when no further improvement is achieved, and we are left with

the retained indicators.
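
One possible reading of this reduction step is a greedy forward pass over the ranked list, sketched here. The text leaves some details open (for instance, the starting set), so this is an interpretation rather than the authors' exact procedure. `evaluate` is a hypothetical callback that returns the mean cross-validated accuracy for a given list of indicators.

```python
def reduce_indicators(ranked, evaluate):
    """Walk the ranked list once, keeping an indicator only if accuracy improves.

    ranked:   indicator names ordered by importance (most important first).
    evaluate: callable taking a list of indicator names and returning the
              mean identification accuracy of a model trained on them.
    """
    retained = []
    best = float("-inf")
    for name in ranked:
        candidate = retained + [name]
        score = evaluate(candidate)
        if score > best:          # improvement: keep the indicator
            retained, best = candidate, score
        # otherwise discard it and move on to the next indicator
    return retained, best
```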

3.2 The construction of high-value patent identification model

After indicator reduction, we apply the ADASYN-LOF-GA-AdaBoost combined model to the dataset with retained indicators for

high-value patent identification. The workflow is illustrated in Figure 1. It primarily includes two

steps: First, data balancing treatment based on the ADASYN-LOF model. Second, classification processing based on

the GA-AdaBoost model.

[Figure 1 flowchart: start; the dataset with retained indicators; expanding the dataset by ADASYN; removing noise samples by LOF; split into training set (90%) and testing set (10%); optimize AdaBoost by GA; model classification results; ten-fold cross-validation check (NO: repeat; YES: after 10 rounds); mean values of ACC, AUC, Recall, and F1; end]

Figure 1 The training process of ADASYN-LOF-GA-AdaBoost


3.2.1 Data augmentation based on the ADASYN-LOF model

ADASYN is an oversampling method proposed by He [36] and its core principle is to use density distribution

parameters as the distribution standard. Based on the varying degrees of difficulty in learning different minority class

samples, ADASYN applies a weighted distribution, generating more synthetic samples for those minority class

samples that are harder to learn compared to those that are easier to learn. The ADASYN algorithm improves learning

in two ways: (1) Reducing biases caused by imbalanced categorical data. (2) Adaptively shifting the classification

decision boundary towards difficult sample instances. Most outlier detection methods rely on density, angles,

distances, and other factors to delineate hyperplanes and identify anomalous points. These methods are based on the

similarity of data points. However, the Local Outlier Factor (LOF) is a detection method that starts from the data

density surrounding a sample point. It assigns a local reachability density to each sample point and analyzes the outlier

degree of the sample based on its outlier factor derived from this reachability density, determining whether it is an

outlier or not. The LOF algorithm is simple and intuitive, considering both local and global attributes of the dataset.

The ADASYN-LOF first performs sampling on the original data, and the resulting data inevitably contains noise

samples. At this point, noise reduction can be achieved through the application of LOF, resulting in a balanced dataset

that is more conducive to classification processing. The specific training process is outlined in Table 2.
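
The LOF scoring described above can be sketched in plain Python. This is a compact illustration of the standard LOF definition (k-distance, reachability distance, local reachability density), not the authors' implementation. The neighborhood size `k` and the cut-off `threshold` are assumed values, and the sketch assumes there are no duplicate points.

```python
import math


def lof_scores(points, k=3):
    """Return the Local Outlier Factor of every point (larger = more outlying)."""
    n = len(points)
    k_dist, neighbors = [], []
    for i in range(n):
        dists = sorted((math.dist(points[i], points[j]), j) for j in range(n) if j != i)
        kd = dists[k - 1][0]                      # distance to the k-th neighbor
        k_dist.append(kd)
        neighbors.append([(d, j) for d, j in dists if d <= kd])  # ties included
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean ratio of the neighbors' density to the point's own density.
    return [sum(lrd[j] for _, j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
            for i in range(n)]


def remove_outliers(points, k=3, threshold=1.5):
    """Keep only points whose LOF does not exceed the (assumed) threshold."""
    scores = lof_scores(points, k)
    return [p for p, s in zip(points, scores) if s <= threshold]
```

Points inside a dense cluster score close to 1, while an isolated point scores well above 1 and is dropped as noise.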

Table 2 Training steps of ADASYN-LOF algorithm

Training Process
Input: training set $G_{tr} = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, where $x$ represents the feature variables and $y \in \{0, 1\}$ represents the category labels of the dataset. Let $n_s$ denote the number of minority samples and $n_l$ the number of majority samples, with $n_s < n_l$ and $n = n_s + n_l$.
(1) Calculate the imbalance ratio between minority class samples and majority class samples: $d = n_s / n_l$, $d \in (0, 1]$.
(2) If $d < d_{th}$ ($d_{th}$ is the maximum acceptable class imbalance ratio):
1) Calculate the number of synthetic samples that need to be added for the minority class: $D = (n_l - n_s) \times \beta$, $\beta \in [0, 1]$.
2) For each minority sample $x_i$, find its K nearest neighbors based on Euclidean distance. Let $\Delta_i$ denote the number of majority samples among the K nearest neighbors. Calculate the ratio $r_i = \Delta_i / K$, $i = 1, \ldots, n_s$, $r_i \in [0, 1]$.
3) Standardization: $\hat{r}_i = r_i / \sum_{i=1}^{n_s} r_i$.
4) Calculate the number of synthetic samples that need to be generated for each minority sample: $d_i = \hat{r}_i \times D$.
5) For each minority sample $x_i$, repeat steps (3) and (4) for $z = 1$ to $d_i$.
(3) Randomly select an $x_{iz}$ from the K nearest neighbors of the minority class sample $x_i$ that has not been used to generate synthetic data.
(4) Synthesize a new sample according to $s_i = x_i + (x_{iz} - x_i) \times \lambda$, where $\lambda$ is a random number, $\lambda \in [0, 1]$.
(5) Generate the new synthetic dataset.
(6) Use LOF to perform noise reduction on the newly generated synthetic samples.
Output: add the new synthetic data to create a category-balanced dataset.
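The allocation and interpolation steps of Table 2 can be sketched as follows. This is a hedged illustration rather than the authors' code: the names `adasyn_counts` and `adasyn`, the default `k=2`, and the convention that label 1 marks the minority class are assumptions made here. The sketch also interpolates towards minority-class neighbours, as in the original ADASYN formulation, samples neighbours with replacement, rounds each $d_i$ to an integer, and assumes at least two minority samples.

```python
import math
import random

def adasyn_counts(X, y, beta=1.0, k=2):
    """Steps (2)1)-4): number of synthetic samples to create per minority point."""
    min_idx = [i for i, label in enumerate(y) if label == 1]
    n_s, n_l = len(min_idx), len(y) - len(min_idx)
    D = (n_l - n_s) * beta                       # total synthetic samples wanted
    ratios = []
    for i in min_idx:
        nearest = sorted((j for j in range(len(X)) if j != i),
                         key=lambda j: math.dist(X[i], X[j]))[:k]
        # r_i: share of majority samples among the k nearest neighbours
        ratios.append(sum(1 for j in nearest if y[j] == 0) / k)
    total = sum(ratios)
    if total == 0:                               # no minority point borders the majority
        return min_idx, [0] * n_s
    # d_i = normalised r_i times D, rounded to a whole number of samples
    return min_idx, [round(r / total * D) for r in ratios]

def adasyn(X, y, beta=1.0, k=2, seed=0):
    """Steps (3)-(5): interpolate between a minority point and a minority neighbour."""
    rng = random.Random(seed)
    min_idx, counts = adasyn_counts(X, y, beta, k)
    synthetic = []
    for i, d_i in zip(min_idx, counts):
        mates = sorted((j for j in min_idx if j != i),
                       key=lambda j: math.dist(X[i], X[j]))[:k]
        for _ in range(d_i):
            x, x_z = X[i], X[rng.choice(mates)]
            lam = rng.random()                   # lambda in [0, 1)
            synthetic.append(tuple(a + (b - a) * lam for a, b in zip(x, x_z)))
    return synthetic
```

Minority points surrounded mostly by majority samples receive a larger share of $D$, which is how the method concentrates new samples near the hard-to-learn boundary.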

3.2.2 AdaBoost algorithm optimized based on GA

The AdaBoost (Adaptive Boosting) algorithm was proposed by Freund and Schapire [35]. The basic idea of the algorithm

is as follows: Initially, each of the 𝐺 samples in the dataset is assigned the same weight of 1/𝐺. During the training

process, samples that are misclassified are given higher weights, allowing the classifier to focus on learning from these

incorrectly classified samples in the next iteration, resulting in a new sample distribution. The algorithm generates a

weak classifier in each round of learning and increases the weight of the classifier with higher accuracy. Finally,

multiple classifiers are combined to form a strong classifier. The training process of the algorithm is as follows:

Input: original dataset $Q = \{(x_1, y_1), (x_2, y_2), \ldots, (x_G, y_G)\}$, $x_i \in X \subseteq R^n$, $y_i \in Y = \{-1, +1\}$; $H$ represents the base classifier.

1) Initialize the weight distribution of the original sample set.

$$T_1 = (w_{11}, \ldots, w_{1i}, \ldots, w_{1G}), \quad w_{1i} = \frac{1}{G}, \quad i = 1, 2, \ldots, G \tag{2}$$

2) For $m = 1$ to $M$:

a. Use the dataset with weight distribution $T_m$ to learn and obtain the weak classifier $H_m(x)$.

$$H_m(x): x \rightarrow \{-1, +1\} \tag{3}$$

b. Calculate the classification error rate $e_m$ of $H_m(x)$. If $e_m > 50\%$, discard the weak classifier.

$$e_m = \sum_{i=1}^{G} P(H_m(x_i) \neq y_i) = \sum_{i=1}^{G} w_{mi} I(H_m(x_i) \neq y_i) \tag{4}$$

c. Calculate the weight of the weak classifier $H_m(x)$.

$$\alpha_m = \frac{1}{2} \ln \frac{1 - e_m}{e_m} \tag{5}$$

d. Iterate and update the weight distribution of the dataset.

$$T_{m+1} = (w_{m+1,1}, \ldots, w_{m+1,i}, \ldots, w_{m+1,G}), \quad w_{m+1,i} = \frac{w_{mi}}{Z_m} \exp(-\alpha_m y_i H_m(x_i)), \quad i = 1, 2, \ldots, G \tag{6}$$

$Z_m$ is a normalization factor used to keep the sum of the weight distribution equal to 1.

3) Output the strong classifier.

$$A(x) = sign(f(x)) = sign\left(\sum_{m=1}^{M} \alpha_m H_m(x)\right) \tag{7}$$
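Equations (2) to (7) can be traced in a short, dependency-free sketch. It is only an illustration under assumptions made here: the weak classifier is a one-feature threshold rule (a decision stump), the names `train_stump`, `stump_predict`, `adaboost`, and `predict` are hypothetical, and a round with $e_m \geq 0.5$ simply stops training instead of being discarded and retried.

```python
import math

def train_stump(X, y, w):
    """Return (error, feature, threshold, sign) of the best weighted threshold rule."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            for sign in (1, -1):
                pred = [sign if x[f] <= thr else -sign for x in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)  # Eq. (4)
                if best is None or err < best[0]:
                    best = (err, f, thr, sign)
    return best

def stump_predict(stump, x):
    _, f, thr, sign = stump
    return sign if x[f] <= thr else -sign

def adaboost(X, y, M):
    G = len(X)
    w = [1.0 / G] * G                                   # Eq. (2): uniform weights
    ensemble = []
    for _ in range(M):
        stump = train_stump(X, y, w)
        e_m = max(stump[0], 1e-10)                      # clipped to avoid log of zero
        if e_m >= 0.5:                                  # no better than chance: stop
            break
        alpha = 0.5 * math.log((1 - e_m) / e_m)         # Eq. (5)
        ensemble.append((alpha, stump))
        # Eq. (6): raise the weights of misclassified samples, then normalise by Z_m
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        Z = sum(w)
        w = [wi / Z for wi in w]
    return ensemble

def predict(ensemble, x):
    f = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if f >= 0 else -1                          # Eq. (7)
```

On a one-dimensional toy set such as `X = [(0,), (1,), (2,), (3,), (4,), (5,)]` with labels `[1, 1, -1, -1, 1, 1]`, no single stump separates the classes, but the weighted vote of three stumps does, which illustrates how reweighting lets weak classifiers complement one another.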

The strong classifier integrated by the AdaBoost exhibits higher stability and classification accuracy. However,

it is challenging to determine the most suitable number of iterations during the iterative process, and it is difficult to

identify the optimal combination of weights for all weak classifiers. In GA, a vector $X = [x_1, x_2, \ldots, x_n]$ composed of $n$ components represents a point in the decision space. Each component $x_i$ is treated as a gene, and the vector $X$ represents a feasible solution to the problem. The optimization problem is thus transformed into a search for $X$. The main advantage of GA is its ability to simulate the process of biological evolution: through selection, crossover, and mutation, it screens the population for the fittest individuals and searches for the globally optimal parameters. The specific iterative process is shown in Figure 2. By combining GA with AdaBoost, GA can be used to tune the hyperparameters of the AdaBoost model, obtaining the optimal number of iterations and improving the classification accuracy and convergence speed of the model.

[Flow chart: start → create an initial population → calculate the fitness of each individual in the population → selection → crossover → mutation → calculate the fitness of each individual in the population → termination condition met? If NO, repeat from selection; if YES, select the individual with the highest fitness → end]

Figure 2 The optimization flow chart of GA
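The loop in Figure 2 can be sketched as a small real-coded genetic algorithm. Everything specific in this sketch is an assumption introduced for illustration: the function names, tournament selection, uniform crossover, resampling mutation, a fixed generation budget as the termination condition, and above all `toy_fitness`, which is a hypothetical stand-in for the cross-validated accuracy of an AdaBoost model with the given number of estimators, learning rate, and tree depth.

```python
import random

def genetic_search(fitness, bounds, pop_size=20, generations=30,
                   crossover_rate=0.8, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    # Create an initial population of random individuals within the bounds.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best = max(population, key=fitness)

    def select():
        # Tournament selection: the fitter of two random individuals survives.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):                  # termination: fixed budget
        children = []
        while len(children) < pop_size:
            p1, p2 = select(), select()
            if rng.random() < crossover_rate:     # uniform crossover
                child = [g1 if rng.random() < 0.5 else g2 for g1, g2 in zip(p1, p2)]
            else:
                child = list(p1)
            for i, (lo, hi) in enumerate(bounds): # mutation: resample a gene
                if rng.random() < mutation_rate:
                    child[i] = rng.uniform(lo, hi)
            children.append(child)
        population = children
        best = max(population + [best], key=fitness)  # keep the fittest seen so far
    return best

def toy_fitness(individual):
    """Hypothetical smooth surrogate for model accuracy, peaking at (300, 1.0, 10)."""
    n_estimators, learning_rate, max_depth = individual
    return -(((n_estimators - 300) / 500) ** 2
             + (learning_rate - 1.0) ** 2
             + ((max_depth - 10) / 20) ** 2)

bounds = [(10, 500), (0.01, 2.0), (1, 20)]        # assumed search ranges
best = genetic_search(toy_fitness, bounds)
```

In practice the fitness call would train and validate a classifier, so it dominates the running time; caching fitness values per individual is a common way to reduce that cost.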

4. Empirical analysis

4.1 Data retrieval and acquisition

The data for this study come from PatSnap. The database integrates patent data from 116 countries and regions around the world, dating back to 1790 and encompassing over 140 million patent records. After reviewing the literature related to the field of scientific instruments, we searched the PatSnap database with the query TAC:("measurement" OR "metrology") AND ("instrument" OR "device" OR "system") AND ("sensor" OR "detector" OR "transducer") AND ("accuracy" OR "precision" OR "calibration") ISD:[2013 TO 2020], retrieving a total of 21,771 patent records.

4.2 Reduction of indicators for high-value patent identification

The specific steps for performing indicator reduction using the original dataset are as follows:

Step 1: calculate the mean identification accuracy of the original indicator dataset.

ADASYN is used to augment the dataset, and the newly generated samples are then denoised with LOF, yielding the final balanced dataset. AdaBoost is trained and tested on this dataset, with the results shown in Table 3. The accuracy of AdaBoost on the original data is only 70.68%, but after balancing the data with ADASYN-LOF, the accuracy reaches 89.18% with a much smaller standard deviation, indicating good stability.

Table 3 Comparison of model performance before and after dataset balancing

Machine learning model | ACC (%) | Standard deviation (S.D.)
AdaBoost | 70.68 | 0.0585
ADASYN-LOF-AdaBoost | 89.18 | 0.0045
Step 2: calculate the importance index for each indicator.

Through Step 1, the mean ACC of AdaBoost on the dataset containing all indicators is 89.18%, denoted as $\bar{Q}_0$. Then, an AdaBoost model is established using the dataset with indicator $i$ removed, and its ACC is denoted as $\bar{Q}_i$. The importance index of each indicator is calculated as $I_i = \bar{Q}_i - \bar{Q}_0$: the more negative $I_i$ is, the more accuracy is lost when indicator $i$ is removed, and the more important the indicator. The results are sorted in ascending order, as shown in Table 4.

Table 4 The importance index of indicators

Importance ranking | Indicator | Importance index
1 | number of current patentees | -0.003962042
2 | type of applicant | -0.003718214
3 | whether it belongs to a strategic emerging industry | -0.003474386
4 | number of non-patent citation documents | -0.003443907
5 | number of claims | -0.003108644
6 | maintenance duration | -0.002925773
7 | technological influence | -0.002834337
8 | number of citation patents | -0.002620988
9 | overall technological strength | -0.002468595
10 | examination duration | -0.001950460
11 | number of simple family members | -0.001859025
12 | number of inventors | -0.001645675
13 | number of independent claims | -0.001036105
14 | number of litigation cases | -0.001036105
15 | whether it belongs to the national economic industry | -0.000761798
16 | number of IPC classification codes | -0.000578927
17 | number of pages of documents | -0.000335099
18 | number of simple patent families | 0.000518299
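The leave-one-indicator-out calculation of Step 2 can be sketched as below. The function name `importance_indices` and the `evaluate` callback are hypothetical: `evaluate` is assumed to return the mean accuracy of a model trained on the given list of indicators, and the toy version used here simply adds up made-up contributions so that the example runs without any data.

```python
def importance_indices(indicators, evaluate):
    """Return (indicator, I_i) pairs sorted ascending, where I_i = Q_i - Q_0."""
    q0 = evaluate(indicators)                      # accuracy with all indicators
    scores = {}
    for ind in indicators:
        reduced = [f for f in indicators if f != ind]
        scores[ind] = evaluate(reduced) - q0       # negative: removing it hurts
    return sorted(scores.items(), key=lambda item: item[1])

# Toy stand-in for model training: each indicator adds a fixed (invented) amount.
toy_contribution = {"a": 0.20, "b": 0.05, "c": -0.01}

def toy_evaluate(features):
    return 0.5 + sum(toy_contribution[f] for f in features)

ranking = importance_indices(["b", "c", "a"], toy_evaluate)
```

With these invented contributions, indicator `a` comes first in the ranking because removing it costs the most accuracy, while `c` ends up with a positive index because the toy model does slightly better without it.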
Step 3: confirm and retain the indicators based on their effectiveness in improving the accuracy of the model

Based on the sorted list of indicator importance indices obtained in Step 2, indicators are added in sequence. If adding an indicator during training improves the identification accuracy of the model, that indicator is confirmed and retained. If the identification accuracy does not improve, the indicator is skipped and the next one in the list is tried. The algorithm ends when all indicators in the sorted list have been tried. The final list of retained indicators is shown in Table 5.

Table 5 Identification accuracy as retained indicators are recursively added

Number of indicators | Newly added indicator | ACC
1 | number of current patentees | 0.553245962
2 | type of applicant | 0.69939043
3 | whether it belongs to a strategic emerging industry | 0.738250533
4 | number of non-patent citation documents | 0.848887534
5 | number of claims | 0.879609875
6 | maintenance duration | 0.88381591
7 | technological influence | 0.895736056
8 | number of citation patents | 0.907930509
9 | overall technological strength | 0.910947882
10 | examination duration | 0.914660165
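The retain-if-it-helps rule of Step 3 amounts to a greedy forward selection over the ranked indicators. The sketch below is illustrative only: `forward_select` and the `evaluate` callback are hypothetical names, and the toy accuracy function with invented contributions stands in for training and testing the real model.

```python
def forward_select(ranked_indicators, evaluate):
    """Add indicators in ranked order, keeping each one only if accuracy improves."""
    retained, best_acc = [], 0.0
    for indicator in ranked_indicators:
        acc = evaluate(retained + [indicator])
        if acc > best_acc:                 # improvement: confirm and retain
            retained.append(indicator)
            best_acc = acc
    return retained, best_acc

# Toy stand-in for model accuracy: each indicator adds a fixed (invented) amount.
toy_contribution = {"a": 0.20, "b": 0.05, "c": -0.01, "d": 0.03}

def toy_evaluate(features):
    return 0.5 + sum(toy_contribution[f] for f in features)

retained, accuracy = forward_select(["a", "b", "c", "d"], toy_evaluate)
```

In this toy run the indicator `c` is skipped because adding it lowers the accuracy, while `d` is still retained afterwards, so a rejected indicator does not stop the scan of the remaining list.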

[Line chart: ACC on the vertical axis, scaled from 0 to 1]

Figure 3 Retention indicator accuracy improvement curve graph

Table 6 Comparison of model performance before and after indicator reduction

Indicators reduced | ACC (%) | S.D.
NO | 89.18 | 0.0045
YES | 91.46 | 0.0033

As can be seen from Figure 3, among the 10 indicators retained after reduction, four are from the patentee dimension, two are from the technology dimension, three are from the legal dimension, and one is from the market dimension. According to Table 6, the final model identification accuracy is 91.46%, an improvement of 2.28 percentage points over the average identification accuracy of 89.18% obtained with all indicators. Therefore, reducing the indicators and eliminating redundant ones improves the identification accuracy of the model.

4.3 Experimental results

4.3.1 Metrics of model evaluation

This study focuses on a binary classification problem, thus four evaluation metrics, namely Accuracy (ACC),

AUC, Recall, and F1-score, are employed to assess the performance of the model. In binary classification, the actual

categories in the dataset can be combined with the predicted categories by the classifier, resulting in four categories,

which are represented by the confusion matrix in Table 7.

Table 7 Confusion matrix

Actual class | Predicted Positive | Predicted Negative
Actual Positive | TP | FN
Actual Negative | FP | TN

(1) ACC

ACC represents the proportion of correctly classified samples to the total number of samples. In this study, it equates to the ratio of correctly classified patents, both high-value and non-high-value, to the total number of patents. It is a commonly used performance metric in classification tasks, including those with imbalanced data. Using the binary confusion matrix in Table 7, ACC can be written as:

$$ACC = \frac{TP + TN}{TP + FP + TN + FN} \tag{8}$$

(2) AUC

The Receiver Operating Characteristic (ROC) curve is created by plotting the true positive rate against the false

positive rate at various threshold settings, based on the predicted results of the classifier. The Area Under the Curve

(AUC) is the area beneath the ROC curve. If the ROC curves intersect, it can be difficult to judge the superiority of

the models. In such cases, using the AUC can effectively avoid this problem. Generally, a higher AUC value indicates

better performance of the classifier.
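The area under the ROC curve equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below uses that pairwise form; the function name `auc` and the 0/1 label convention are assumptions made for illustration, and the quadratic loop is meant for clarity rather than for large datasets.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0        # positive ranked above negative
            elif p == n:
                wins += 0.5        # tie counts as half
    return wins / (len(pos) * len(neg))

example = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect separation of the two classes, which is why the metric does not depend on any single decision threshold.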

(3) Recall

Recall, also known as sensitivity, represents the proportion of actual positive samples that are correctly predicted as positive. A higher recall indicates that fewer minority class samples are being misclassified:

$$Recall = \frac{TP}{TP + FN} \tag{9}$$

(4) F1-score

The F1 score is the harmonic mean of precision and recall, where precision is the proportion of predicted positive samples that are actually positive, $Precision = TP / (TP + FP)$. The F1 score is high only if both recall and precision are relatively large. Therefore, it summarizes in a single value how well the algorithm balances missed positive samples against false alarms.

$$F1\text{-}score = \frac{2 \times Precision \times Recall}{Precision + Recall} \tag{10}$$
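As a small worked example, the threshold-based metrics above can be computed directly from the four cells of the confusion matrix. The function name `classification_metrics` and the counts in the usage line are invented for illustration, and the sketch assumes that none of the denominators is zero.

```python
def classification_metrics(tp, fn, fp, tn):
    """ACC, Recall, Precision and F1 from the confusion-matrix counts."""
    acc = (tp + tn) / (tp + fp + tn + fn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {"ACC": acc, "Recall": recall, "Precision": precision, "F1": f1}

# Invented counts: 8 true positives, 2 false negatives, 1 false positive, 9 true negatives.
metrics = classification_metrics(tp=8, fn=2, fp=1, tn=9)
```

For these counts ACC is 17/20, Recall is 8/10, Precision is 8/9, and F1 is 16/19, which shows that F1 sits between precision and recall and closer to the smaller of the two.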

4.3.2 The comparison of experimental results

To further enhance the classification performance of AdaBoost, we introduced the GA for parameter

optimization. Through multiple iterations, we identified the optimal combination, where the number of base classifiers

in the model was 318, the learning rate was 1.069, and the maximum tree depth was 16. This improved the convergence

speed and accuracy of the model. Furthermore, to validate the effectiveness of our model, we applied the GA-

AdaBoost model to a dataset with retained indicators that had been balanced using ADASYN-LOF. We compared the

classification performance of the GA-AdaBoost model with that of BP neural networks, DT, RF, and AdaBoost. After

ten-fold cross-validation, we obtained the mean values of ACC, AUC, Recall, and F1-score for these models on the

test set, as shown in Table 8.

Table 8 Performance evaluation of machine learning models after indicator reduction

Model | ACC | AUC | Recall | F1-score
BP Neural Network | 0.6442 | 0.5445 | 0.5764 | 0.4976
DT | 0.6569 | 0.5674 | 0.6786 | 0.7554
AdaBoost | 0.7068 | 0.8803 | 0.7423 | 0.7786
ADASYN-LOF-AdaBoost | 0.9146 | 0.9145 | 0.9466 | 0.9123
ADASYN-LOF-GA-AdaBoost | 0.9447 | 0.9487 | 0.9754 | 0.9523

As can be seen from Table 8, AdaBoost outperforms the BP neural network and DT on every metric, indicating that the classification performance of the ensemble algorithm is better than that of a single algorithm. In addition, the performance of the ADASYN-LOF-AdaBoost model after mixed sampling is significantly improved. The classification performance of the ADASYN-LOF-GA-AdaBoost model, which adds GA optimization on top of mixed sampling, is enhanced further. The ACC of ADASYN-LOF-GA-AdaBoost is 94.47%, higher than that of the other models, indicating that the model has strong discrimination ability and can accurately identify high-value patents. Regarding AUC, the BP neural network reaches only 54.45% and DT 56.74%, whereas ADASYN-LOF-GA-AdaBoost reaches 94.87%. The Recall of ADASYN-LOF-GA-AdaBoost reaches 97.54%, indicating that the model can identify more high-value patents. In terms of the F1-score, the mean F1 of the BP neural network is only 49.76%, that of DT is 75.54%, that of AdaBoost is 77.86%, and that of ADASYN-LOF-AdaBoost is 91.23%, while the mean F1 of ADASYN-LOF-GA-AdaBoost reaches 95.23%, demonstrating that the overall performance of this model is superior to that of the other models.
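The ten-fold cross-validation used to obtain the mean values in Table 8 can be sketched generically. The names `k_fold_indices` and `cross_validate`, the shuffling seed, and the `fit_and_score` callback are assumptions introduced here: the callback is expected to train a model on the training indices and return a score on the test indices, and the source does not state whether the folds were stratified, so plain shuffled folds are shown.

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k folds of near-equal size."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(n, fit_and_score, k=10, seed=0):
    """Mean score over k folds; each fold serves as the test set exactly once."""
    folds = k_fold_indices(n, k, seed)
    scores = []
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(fit_and_score(train, test))
    return sum(scores) / k
```

When resampling such as ADASYN is combined with cross-validation, applying it only to the training indices of each fold keeps synthetic samples out of the test fold; whether the study did so is not stated in the text.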

5. Discussion

The main contributions of this paper are as follows. Firstly, this study optimizes the construction of the indicator

system. Based on a review of previous studies, we add technical capacity of patentees to improve identification

accuracy. Secondly, in order to solve the problem of low identification accuracy of high-value patents caused by data

imbalance, we introduce ADASYN to expand the dataset and use LOF to remove noise samples to obtain a balanced

dataset. In addition, for AdaBoost, it is difficult to determine the most appropriate number of iterations during the iterative process, and it is difficult to determine the optimal combination of weights for all weak classifiers. We add GA to optimize the parameters of AdaBoost to further improve the classification performance of the model. Together, these steps improve the model's ability to identify high-value patents.

6. Conclusion

How to quickly and accurately identify high-value patents with transfer potential among massive patent data is

of great significance for promoting the transformation of patent achievements. Addressing the shortcomings in

existing research, this paper proposes a high-value patent identification method based on machine learning. Firstly, in

response to the inadequacies in the construction of indicator systems in current studies, we incorporate the dimension

of patentee to reconstruct the indicator system. Secondly, in view of the imbalance in high-value patent data, an

ADASYN-LOF-GA-AdaBoost model is proposed. This model utilizes ADASYN to expand minority class samples

and applies LOF for noise reduction to mitigate the degree of data imbalance. Finally, the GA-optimized AdaBoost is

employed for classification to further enhance the classification performance of the model.

Based on the empirical analysis in this paper, the following two conclusions can be drawn: (1) It is effective to

incorporate the dimension of patentee into the indicators for high-value patent identification. In the ranking of the

importance indices of indicators, it is found that the indicators related to the dimension of patentee introduced in this

paper are all ranked at the top, and they can still be retained after indicator reduction. This further proves the rationality

of incorporating such indicators. (2) A combined model can enhance classification performance. This paper proposes

the ADASYN-LOF-GA-AdaBoost model, which involves using ADASYN-LOF to balance the data at the data level

and incorporating GA to optimize the AdaBoost. Compared with other models, this combined model achieves the best

classification results, demonstrating the effectiveness of the high-value patent identification method proposed in this

paper.

This study also has some limitations that future research could address. First, the data used in this paper are structured and include only patent data. Future work could use multi-source data, for example by combining patents and papers, and apply text mining methods to further identify high-value patents. Second, this paper uses machine learning algorithms. If text features are added in future research, deep learning algorithms can be considered. Finally, considering the availability of the database, this paper uses the PatSnap database; the more authoritative Derwent or incoPat databases could be used in the future.

Acknowledgements: This article is supported by Zhejiang Provincial Philosophy and Social Sciences Planning

Project (grant number 24NDJC215YB), the Key project of the Zhejiang Provincial Soft Science Research Program

(grant number 2024C25010), the Key Program of Zhejiang Province (grant number 2021C01027), and Special Project

for the Alliance of high-level Universities in the Changjiang Delta (grant number CSJYB202312).

Author contributions: Z.W. proposed the framework and wrote the manuscript. Y.L. collected the patent data and

conducted empirical analysis. X.H. provided suggestions for modification of the manuscript. B.H. revised the

manuscript. All authors reviewed the manuscript.

Competing interests: The authors declare no conflict of interest.

Data availability: All data generated or analyzed during this study are included in this published article.

REFERENCES

[1] Chang K C, Hao J, Chen C, et al. The relationships between the patent deployment strategy and patent

value[C]//Proceedings of PICMET'14 Conference: Portland International Center for Management of Engineering

and Technology; Infrastructure and Service Integration. IEEE, 2014: 1336-1340.

[2] Chiu C C, Su H N. What is the value of internationalized patent? [C]//2015 Portland International Conference on

Management of Engineering and Technology (PICMET). IEEE, 2015: 1061-1070.

[3] Chen Y M, Liu H H, Liu Y S, et al. A preemptive power to offensive patent litigation strategy: Value creation,

transaction costs and organizational slack[J]. Journal of Business Research, 2016, 69(5):1634-1638.

[4] Caviggioli F, Ughetto E. Buyers in the patent auction market: Opening the black box of patent acquisitions by non-

practicing entities[J]. Technological Forecasting and Social Change, 2016, 104: 122-132.

[5] Chung S, Animesh A, Han K, et al. Software patents and firm value: A real options perspective on the role of

innovation orientation and environmental uncertainty[J]. Information Systems Research, 2019, 30(3): 1073-1097.

[6] Chung J, Ko N, Yoon J. Inventor group identification approach for selecting university-industry collaboration

partners[J]. Technological Forecasting and Social Change, 2021, 171: 120988.

[7] Danish M S, Ranjan P, Sharma R. Valuation of patents in emerging economies: a renewal model-based study of

Indian patents[J]. Technology analysis & strategic management, 2020, 32(4): 457-473.

[8] Danish M, Sharma R. The value of Indian patents: an empirical analysis using citation lags approach[J]. Economics

of Innovation and New Technology, 2023: 1-25.

[9] Erdogan Z, Altuntas S, Dereli T. Predicting Patent Quality Based on Machine Learning Approach[J]. IEEE

Transactions on Engineering Management, 2022.

[10] Freund Y, Schapire R. A decision-theoretic generalization of on-line learning and an application to boosting[J].

Journal of Computer and System Sciences, 1997:119-139.

[11] Fischer T, Leidinger J. Testing patent value indicators on directly observed patent value—An empirical analysis

of Ocean Tomo patent auctions[J]. Research policy, 2014, 43(3): 519-529.

[12] He H, Yang B, Garcia E A, et al. ADASYN: Adaptive synthetic sampling approach for imbalanced learning[C]//

Proceeding of the 2008 International Joint Conference on Neural Networks. Piscataway: IEEE, 2008:1322-1328.

[13] Huang Y, Chen L, Zhang L. Patent citation inflation: The phenomenon, its measurement, and relative indicators

to temper its effects[J]. Journal of Informetrics, 2020, 14(2): 101015.

[14] Huang K G L, Huang C, Shen H, et al. Assessing the value of China's patented inventions[J]. Technological

Forecasting and Social Change, 2021, 170: 120868.

[15] Hsu D H, Hsu P H, Zhou T, et al. Benchmarking US university patent value and commercialization efforts: A

new approach[J]. Research Policy, 2021, 50(1): 104076.

[16] Huang Z, Li J, Yue H. Study on comprehensive evaluation based on AHP-MADM model for patent value of

balanced vehicle[J]. Axioms, 2022, 11(9): 481.

[17] Han F, Zhang S, Yuan J, et al. Assessing future technological impacts of patents based on the classification

algorithms in machine learning: The case of electric vehicle domain[J]. Plos one, 2022, 17(12): e0278523.

[18] Hu Z, Zhou X, Lin A. Evaluation and identification of potential high-value patents in the field of integrated

circuits using a multidimensional patent indicators pre-screening strategy and machine learning approaches[J].

Journal of Informetrics, 2023, 17(2): 101406.

[19] Klemperer P. How broad should the scope of patent protection be? [J]. RAND Journal of

Economics,1990,21(1):113-130.

[20] Ko N, Jeong B, Seo W, et al. A transferability evaluation model for intellectual property[J]. Computers &

Industrial Engineering, 2019, 131: 344-355.

[21] Kwon U, Geum Y. Identification of promising inventions considering the quality of knowledge accumulation: A

machine learning approach[J]. Scientometrics, 2020, 125: 1877-1897.

[22] Kong J, Zhang J, Deng S, et al. Knowledge convergence of science and technology in patent inventions[J]. Journal

of Informetrics, 2023, 17(3): 101435.

[23] Lee Y G. Patent licensability and life: A study of US patents registered by South Korean public research

institutes[J]. Scientometrics, 2008, 75: 463-471.

[24] Lee C, Kwon O, Kim M, et al. Early identification of emerging technologies: A machine learning approach using

multiple patent indicators[J]. Technological Forecasting and Social Change, 2018, 127: 291-303.

[25] Lee J, Kang J H, Jun S, et al. Ensemble modeling for sustainable technology transfer[J]. Sustainability, 2018,

10(7): 2278.

[26] Liu C, Shi Y, et al. A novel approach to screening patents for securitization: a machine learning-based predictive

analysis of high-quality basic asset[J]. Kybernetes, 2023.

[27] Mansfield E, Schwartz M, Wagner S. Imitation costs and patents: An empirical study [J]. Economic

Journal,1981,91:907-918.

[28] Moreno S G, Ray J A. The value of innovation under value-based pricing[J]. Journal of market access & health

policy, 2016, 4(1): 30754.

[29] Miao Y Z, Salomon R M, Song J. Learning from technologically successful peers: the convergence of Asian

laggards to the technology frontier[J]. Organization Science, 2021,32(1):210-232.

[30] Odasso C, Scellato G, Ughetto E. Selling patents at auction: an empirical analysis of patent value[J]. Industrial

and Corporate Change, 2015, 24(2): 417-438.

[31] Oh J W, Park H W. Income approach to technology valuation for innovations[J]. International Journal of

Technology Management, 2022, 88(2-4): 389-407.

[32] Wang M H, Hsiao Y C, Tsai B H, et al. Fuzzy markup language with genetic learning mechanism for invention

patent quality evaluation[C]//2015 IEEE Congress on Evolutionary Computation (CEC). IEEE, 2015: 251-258.

[33] Wang H, Sun B, Wang P. Dominant technology identification model based on patent information toward

sustainable energy development[J]. IEEE Access, 2019, 7: 141374-141385.

[34] Yang G C, Li G, Li C Y, et al. Using the comprehensive patent citation network (CPC) to evaluate patent value[J].

Scientometrics, 2015, 105: 1319-1346.

[35] Yang W, Cao G, Peng Q, et al. Effective Identification of Technological Opportunities for Radical Inventions

Using International Patent Classification: Application of Patent Data Mining[J]. Applied Sciences, 2022, 12(13):

6755.

[36] Zhou Y, Dong F, Liu Y, et al. A deep learning framework to early identify emerging technologies in large-scale

outlier patents: An empirical study of CNC machine tool[J]. Scientometrics, 2021, 126: 969-994.

[37] Yuan X, Song W. Evaluating technology innovation capabilities of companies based on entropy-TOPSIS: the

case of solar cell companies[J]. Information Technology and Management, 2022, 23(2): 65-76.
