
IEEE International Conference on Big Data

A Study of Innovation Network Database Construction by Using Big Data and an Enterprise Strategy Model
ZHOU Wen YE Shu-Tao LU Xiao-Long
School of Computer Engineering and Science
Shanghai University
Shanghai, China
zhouwen@shu.edu.cn, fbcbest@hotmail.com, luxiaolong0822@gmail.com

Abstract This paper presents a method of extracting enterprise strategic alliance data from massive text on the Internet using text analysis techniques. Enterprise alliance innovation networks are then constructed based on the existing alliance database. Meanwhile, the process of partner choice is modeled as a game, and an average strategy game model among enterprises is built.

Keywords Big data; Text analysis; Innovation network; Game model

1 Introduction

With the rapid development of the Internet and cloud computing, the volume of data is growing explosively, and the world has officially entered the era of big data [1]. By definition, the name "big data" refers to the huge size of the data [2]. However, in addition to volume, variety and velocity are also significant features of big data [3]. The complexity of data structures obstructs value extraction. A general web crawler can only handle structurally stored data and struggles when facing unstructured information. To this end, researchers at home and abroad have proposed text analysis techniques [4]. Text analysis achieves the automatic extraction and analysis of unstructured information, a process called parsing in natural language [5]. The emergence of text analysis has greatly helped researchers obtain valuable data.

An innovation network is a social network. Research on innovation networks can reveal the relationship between network attributes and innovation capability, which can provide a theoretical basis for the next step of strategic decisions [6]. The first step in research on innovation networks is data collection. Therefore, we propose collecting enterprise data by using text analysis techniques, which provides important support for later research on innovation networks.

This paper constructs and simulates a game model to study innovation networks from the perspective of enterprises. To obtain more benefits, enterprises must comprehensively weigh the risks, costs and benefits during the investment game [7]. A bilateral game tends toward an equilibrium point, which is called the "Nash equilibrium" point [8]. At this equilibrium point, neither side will make further changes, so it is a steady state of the game [9].

The main work of this paper is as follows: (1) Collaboration database construction by using big data. We present the idea of collecting enterprise alliance data with text analysis techniques, storing it as structured information, and constructing the database. (2) A study of an average strategy game model among enterprises in an innovation network.

978-1-4799-1293-3/13/$31.00 2013 IEEE 


An average cooperative game model is constructed, the investment strategy under the condition of average distribution is considered, and finally an experimental analysis is given.

The rest of this paper is organized as follows: Section 2 describes the research status and reviews the relevant literature. Section 3 presents a theoretical proposal for innovation network database construction. Section 4 presents the experiment on the average strategy game model and its results. Finally, conclusions are drawn in Section 5.

2 Research Status and Literature Review

2.1 Big Data

Big data has quickly become a research hotspot because of its abundant value. The rational use of big data brings huge benefits to society. In just two years, scholars have published numerous articles. Frankel extracted the meaning and value of big data at the theoretical level [10]; Elias applied big data to the economy at the application level, and developed and then simulated a profitable strategy [11]. Domestic scholars MENG and LI have also researched big data in depth [12-14]. Big data has clearly become a focus of research in various fields.

2.2 Text Analysis Technique

Scholars have done a great deal of research on text analysis techniques, and many variants have taken shape. Among foreign scholars, Nord proposed a translation-oriented text analysis model, carried out experiments on it from multiple perspectives such as text, language and culture, and applied it to teaching [15]; Heinrich studied parameter estimation in text analysis [16]. Among domestic scholars, LIU researched in depth a practical application, intention digging, and used it to automatically search and analyze actual opinions on the Internet [17]; LIN divided text hierarchically based on latent semantic analysis so as to ensure readability [18].

2.3 Game Theory

Game theory is a common tool in economic research. Many economic phenomena and behaviors have been understood as game situations [19, 20]. Games can be divided into cooperative games and non-cooperative games [21]. The core concept of non-cooperative games is the Nash equilibrium [22]. The Nash equilibrium meets the need of modern economics to turn from the simplistic to the complex and realistic, and it forms the basis of modern methods of economic analysis [23].

3 Innovation Network Database Construction Based on Text Analysis

3.1 Proposal of Database Construction

Enterprise database construction is the core work of this project, in which enterprise data can be collected automatically by text analysis techniques. First, the data source selected for enterprise alliances is SDC Platinum, which is widely used in alliance research. We plan to collect all the alliance information of national enterprises from 2000 to 2011. The text analysis software and the database should be initialized before use, and the scope of collection and the volume of data collected in each batch should be set as well. After that, lexical rules and syntax rules are created with the text analysis technique to extract text data from massive complex data. Then we collect data from the data source and apply preprocessing, duplicate removal, sorting and filtering in succession, during which the user can track the progress of the data acquisition. The data are exported once all of them have been collected in the database. The flowchart of the entire database construction is shown in Figure 1.
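The collection steps described in Section 3.1 (preprocessing, duplicate removal, sorting and filtering) can be sketched in Python. This is only a minimal illustration under stated assumptions: the record fields (`enterprise`, `partner`, `industry`, `year`) and all function names are hypothetical, and nothing here reflects the actual schema of SDC Platinum or the text analysis software used in the project. Only the 2000 to 2011 collection window comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllianceRecord:
    # Hypothetical fields; the real schema is not given in the paper.
    enterprise: str
    partner: str
    industry: str
    year: int


def preprocess(raw):
    """Strip whitespace from text fields and build typed records."""
    return [
        AllianceRecord(
            enterprise=r["enterprise"].strip(),
            partner=r["partner"].strip(),
            industry=r["industry"].strip(),
            year=int(r["year"]),
        )
        for r in raw
    ]


def remove_duplicates(records):
    """Keep the first occurrence of each record, preserving order."""
    return list(dict.fromkeys(records))


def build_batch(raw, first_year=2000, last_year=2011):
    """Preprocess, deduplicate, sort and filter one batch of raw rows."""
    records = remove_duplicates(preprocess(raw))
    records.sort(key=lambda r: (r.year, r.enterprise, r.partner))
    return [r for r in records if first_year <= r.year <= last_year]


# Made-up rows for illustration only.
raw_rows = [
    {"enterprise": " EntA ", "partner": "EntB", "industry": "Automotive", "year": "2007"},
    {"enterprise": "EntA", "partner": "EntB", "industry": "Automotive", "year": "2007"},
    {"enterprise": "EntC", "partner": "EntA", "industry": "Automotive", "year": "1998"},
    {"enterprise": "EntB", "partner": "EntD", "industry": "Automotive", "year": "2003"},
]
batch = build_batch(raw_rows)
```

Making the record type frozen lets identical records hash equally, so duplicate removal reduces to an order-preserving dictionary lookup.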
[Figure 1: flowchart linking the User, Corpus and Admin roles to client-side settings (data source, tracking, search, text rules, database management, system monitoring), server-side agents and rule creation over the network data source, and a processing chain of data source screening, text preprocessing, duplicate removal, text sorting, text filtering and text generation that feeds the enterprise database, the innovation network and the exported database.]
Fig 1 The flowchart of database construction

[Figure 2: network diagram.]
Fig 2 Innovation network of domestic automotive industry in the year G2007

3.2 Data Checking and Supplement

When the database is constructed, both program checks and manual checks should be performed on the collected data. In principle, the enterprise name, industry name and alliance date should be non-empty, so if any of these attributes is missing, the record should not be saved. The remaining attributes are supplemented by means such as text analysis checking and secondary collection. If an error still exists, users should check manually.

4 Results and Analysis of the Experiment

4.1 Data Collection

The alliance information of the automobile industry from 2000 to 2009 is collected from SDC Platinum. To make the data more convincing, we set three years as a period and construct 8 networks based on the data above. We then display them with the software Netdraw. Figure 2 is the network diagram of the automotive industry in the year G2007.

4.2 Assumption and Construction of Multilateral Game Model

4.2.1 Bilateral Cooperation Game Model

Two enterprises A and B form an innovation alliance by signing contracts or protocols and constitute the two sides of the game. The flowchart is shown in Figure 3.

[Figure 3: Ent A and Ent B each choose a tactic S; the remaining labels are [T], TotalBenefits, NetGain A, TotalInv A, NetGain B and TotalInv B.]
Fig 3 Flowchart of bilateral cooperation game model

Thus the Nash equilibrium in the bilateral cooperation model is:

$S_A = S_B = \ldots$ (1) [24]

4.2.2 Multilateral Cooperation Game Model

In the bilateral cooperation model, the underlying assumption is that the capacity of the enterprise is sufficient. In fact, however, the degree may be high, and the capacity may be insufficient for the Nash equilibrium of the bilateral cooperation model when multiple relations are stacked.

(1) Definition of Insufficient Capacity

Assume that the current capacity of enterprise i is $U_i$. When its degree is N, the resource needed is:

$T_i = \ldots$ (2)

Then insufficient capacity means:

$U_i < T_i$ (3)
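The capacity condition in (3), where the capacity U of an enterprise falls below its need T, can be illustrated with a short Python sketch. It assumes, as a labeled simplification that the text does not confirm, that the need T of a node is the sum of the resources demanded on its N relations. The function name and the sample numbers are hypothetical.

```python
def capacity_shortfall(capacity, edge_requirements):
    """Return (required, shortfall) for one enterprise.

    capacity          -- current capacity U of the enterprise
    edge_requirements -- resource demanded on each of its N relations
                         (assumption: the need T is their sum)
    """
    required = sum(edge_requirements)          # T under the stated assumption
    shortfall = max(0.0, required - capacity)  # positive only when U < T
    return required, shortfall


# Hypothetical enterprise with degree N = 3 and capacity U = 2.5.
required, shortfall = capacity_shortfall(2.5, [1.0, 1.0, 1.5])
insufficient = shortfall > 0  # True here, because 2.5 < 3.5
```

Returning the shortfall rather than only a flag keeps the amount of insufficient capacity available for any later redistribution step.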
(2) Average Strategy Model

The main principle of the average strategy model is that the insufficient part of the capacity is evenly distributed among all the enterprises allied with the node. The process by which the steady state forms in the multilateral cooperation game model is as follows:

1) Calculate the amount of insufficient capacity of all the nodes, and take the degree of each node.
2) Sort the nodes by the ratio of insufficient capacity to degree.
3) Reduce the resource expense of the enterprise with the largest ratio, then adjust the resource expense of the other enterprises until the best response is formed.
4) Remove the nodes in step 3) from the network, and repeat steps 1), 2) and 3) on the remaining nodes until only one node is left in the network.

4.3 Results and Analysis

4.3.1 Preparation

The variables involved in the model are the return parameter a, the collaborative parameter b, the enterprise's inherent capacity U and the capacity need T.

The random space of parameter a includes 1000 random values drawn from a normal distribution (mu=2, sigma=1); the values are concentrated between 0 and 4 (p=0.9545), and negative values are replaced with 0. The empirical value of parameter b is 1/4; its random space also includes 1000 random values, drawn from a normal distribution (mu=1/4, sigma=1/8); the values are concentrated between 0 and 1/2 (p=0.9545), and negative values are likewise replaced with 0. Assuming that the degree is d, U is generated randomly from a normal distribution with mu=d/2 and sigma=d/4, so that U is concentrated in the interval [0,d]. The probability that U falls in the interval [0,d] is 0.9545, and negative values are replaced with 0. T can be derived from the Nash equilibrium on each pair of relations.

4.3.2 Results

A basic data structure is constructed for the calculation of the average strategy, which involves the parameters dgr, U, T, dgr relational value, dgr parameter a and dgr parameter b. Parameter dgr refers to the degree of the node. The algorithm includes four steps:

1) Determine whether the capacity of each node is insufficient (U<T). If it is, calculate the insufficient capacity of each edge;
2) Initialize the expense on each edge to U/dgr for nodes with insufficient capacity and to T/dgr for nodes with sufficient capacity;
3) Update the expense of all the nodes according to the average strategy algorithm described in Section 4.2.2;
4) Calculate the income of each node on each edge.

The expense and income of each node on each edge can be obtained through the steps above, and the result is shown in Figure 4. Figure 4 (upper) shows the statistical relation between expense and income; Figure 4 (lower) shows the statistical relation between degree and the income/expense ratio.

[Figure 4: two panels, expense versus income (upper) and degree versus income/expense ratio (lower).]
Fig 4 The relation between degree, income and expense in average strategy

5 Conclusions

First, this paper attempts to collect data from an enterprise alliance database by text analysis techniques, using these techniques and manual operation to check and store the data and then construct the database. Second, it studies the game among enterprises and constructs an average strategy model for the innovation network.

Although only a plan is proposed, we have previous successful experience of
data collection, and numerous text analysis techniques exist, which makes the plan feasible. In later research the plan will be improved and implemented. In addition, a multilateral cooperation game model among enterprises is constructed, which makes the behavior and mechanisms clearer and gives enterprises a basis for choosing a strategy.

The coming of big data brings new opportunities and challenges for research on innovation networks. Later research will broaden and deepen this paper in the following aspects:

(1) Supplement of the database model. This paper only proposes the plan and simulates the process of constructing the database. Later research will broaden and deepen the plan and obtain results.

(2) Game analysis of the innovation network. The model presented in this paper is only a model of an ideal state, and most parameters need to be set manually. Later research will simplify and improve the model and push it into practical applications.

Acknowledgements:
This study is supported by the National Science Foundation of China (Grant No. 71003069 and No. 71203135).

References
[1] S. Lohr, "The age of big data," New York Times, vol. 11, 2012.
[2] J. Manyika, et al., Big data: The next frontier for innovation, competition, and productivity: McKinsey Global Institute, 2011.
[3] P. Zikopoulos and C. Eaton, Understanding big data: Analytics for enterprise class hadoop and streaming data: McGraw-Hill Osborne Media, 2011.
[4] S. Soderland, "Learning information extraction rules for semi-structured and free text," Machine Learning, vol. 34, pp. 233-272, 1999.
[5] U. Hahn, "Topic parsing: Accounting for text macro structures in full-text analysis," Information Processing & Management, vol. 26, pp. 135-170, 1990.
[6] R. S. Burt, "The differential impact of social integration on participation in the diffusion of innovations," Social Science Research, vol. 2, pp. 125-144, 1973.
[7] JQ Zeng, "Game Theory in Human Resource Management," Enterprise Economy, vol. 28, p. 44, 2007.
[8] R. B. Myerson, "Refinements of the Nash equilibrium concept," International Journal of Game Theory, vol. 7, pp. 73-80, 1978.
[9] I. L. Glicksberg, "A further generalization of the Kakutani fixed point theorem, with application to Nash equilibrium points," Proceedings of the American Mathematical Society, vol. 3, pp. 170-174, 1952.
[10] F. Frankel and R. Reid, "Big data: Distilling meaning from data," Nature, vol. 455, pp. 30-30, 2008.
[11] H. Elias, "The Big Data Challenge: How to Develop a Winning Strategy," Manufacture Information Engineering of China, vol. 47, 2012.
[12] XF Meng and X Ci, "Big Data Management: Concepts, Techniques and Challenges," Journal of Computer Research and Development, vol. 50, pp. 146-169, 2013.
[13] GJ Li, "The Scientific Value of the Research on Big Data," Chinese Journal of Computers, vol. 8, pp. 8-15, 2012.
[14] QP Jiang, "The Coming of Big Data," China Internet Weekly, pp. 6-6, 2012.
[15] C. Nord, Text Analysis in Translation: Theory, Methodology, and Didactic Application of a Model for Translation-Oriented Text Analysis, vol. 94: Rodopi, 2005.
[16] G. Heinrich, "Parameter estimation for text analysis," http://www.arbylon.net/publications/text-est.pdf, 2005.
[17] J Liu, "Research on Approximate Text Analysis Based Opinion Mining," Shanghai University, 2007.
[18] HF Lin, et al., "Text Structure Analysis Based on Latent Semantic Indexing," Pattern Recognition and Artificial Intelligence, vol. 13, pp. 47-51, 2000.
[19] J. W. Friedman, Game theory with applications to economics: Oxford University Press New York, 1986.
[20] R. B. Myerson, Game theory: analysis of conflict: Harvard University Press, 2013.
[21] ZQ Jian, "Cooperative Game Analysis of Strategic Alliance," Quantitative & Technical Economics, vol. 8, pp. 34-36, 1999.
[22] A. W. Tucker, "A two-person dilemma," Readings in games and information, pp. 7-8, 1950.
[23] DL Wu, "On the Connotation Problems and Prospective of Nash Equilibrium," Journal of Shanghai University (Social Science Edition), vol. 1, p. 012, 2001.
[24] M. Shubik, Game theory in the social sciences: Concepts and solutions, 2006.
