
Tech Mining: After 12 years

Alan Porter
Director of R&D, Search Technology, Inc.
&
Co-Director,
Technology Policy & Assessment Center
Georgia Tech

aporter@searchtech.com
Agenda
1. Tech Mining: Key concepts
2. “12 years of Tech Mining”: Case examples
3. Summing up
How do you extract effective intelligence
from all that Science, Technology & Innovation
(“ST&I”) information?

Tech Mining
Alan L. Porter and Scott W. Cunningham
John Wiley & Sons Inc., 2005
Tech Mining Foci

Various types of analysis to inform pressing decisions:
• Research Profiling
– R&D Assessment, Portfolio Management, Social Network Analyses
• Competitive Intelligence
– Focus on one or more target competitor organizations
• Technology Intelligence
– Focus on a target topic (ST&I) to inform Science Policy or Management of Technology (MOT)
Questions to Answer from field-structured data

Who? Where?

What? When?

How? & Why? – Need a human analyst to interpret the data
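The Who / Where / What / When questions above amount to tallying values in the fields of each record. A minimal Python sketch follows, assuming records have already been parsed into dictionaries; the field names and the sample records are hypothetical and do not reflect any particular database or VantagePoint's own format.

```python
from collections import Counter

# Hypothetical abstract records; field names are illustrative only.
records = [
    {"authors": ["Lee, K.", "Park, J."], "country": "South Korea",
     "keywords": ["dye-sensitized solar cells", "TiO2"], "year": 2008},
    {"authors": ["Lee, K."], "country": "South Korea",
     "keywords": ["dye-sensitized solar cells"], "year": 2009},
    {"authors": ["Silva, M."], "country": "Brazil",
     "keywords": ["TiO2", "thin films"], "year": 2009},
]


def tally(records, field):
    """Count how many records mention each value of a field."""
    counts = Counter()
    for record in records:
        value = record.get(field)
        values = value if isinstance(value, list) else [value]
        # Use a set so a value repeated inside one record counts once.
        counts.update({v for v in values if v is not None})
    return counts


who = tally(records, "authors")     # Who?
where = tally(records, "country")   # Where?
what = tally(records, "keywords")   # What?
when = tally(records, "year")       # When?

print(who.most_common(1), where.most_common(1))
```

The tallies only answer the four counting questions; How? and Why? still need an analyst reading the results.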


“How to”:
Ten-Step Tech Mining
1. Spell out the Science, Technology & Innovation
(ST&I) questions and how to answer them
2. Get suitable data
3. Search (iterate) & retrieve ~abstract records
4. Import into text mining software (VantagePoint)
5. Clean the data
6. Analyze
7. Visualize (Map)
8. Integrate with Internet analyses & expert opinion
9. Summarize; Interpret; Communicate (multi-dimensionally)!
10. Standardize and semi-automate where possible
VantagePoint Filters and Tools
A wealth of diverse information sources for innovation management

On-line Data Sources
Cambridge Scientific Abstracts; Delphion; Dialog; EBSCOHost; Ei Engineering Village; Factiva; ISI Web Of Knowledge; Lexis Nexis; Micropatent; Ovid; Patbase; Questel-Orbit; SilverPlatter; STN; Thomson Innovation

Custom Data
Comma/tab delimited tables; Microsoft Excel and Access; SmartCharts; XML

Databases
Aerospace; Art Abstracts; Biobase; Biological Abstracts; Biological Sciences; Biosis; Biotechno; Business & Industry; CAPlus (AnaVist export); Cassis; CBNB; Claims; Computer & Info Systems; Corrosion; Current Contents; Derwent Biotech Abstracts; Derwent Innovations Index; Derwent World Patent Index; Ei Compendex; EMBase; EnCompass Literature; EnCompass Patents; Energy; EnergySciTech; Engineering Materials Abstr; Envr Sci & Pollution Mgmt; ERIC; EuroPat; FamPat; Focust; Food Sci & Tech; Foodline Market; Foodline Science; Forege; Frosti; FSTA; Gale PROMT; GeoRef; Global Reporter; IFIPAT; IFIUDB; INPADOC; INSPEC; IPA; ISD; ITRD; JAPIO; JICST; Kosmet; LGST; MATBUS; Medline; METADEX; Mgmt and Org Studies; Micropatent Materials; Mobility; NSF Awards; NTIS; Pascal; Patent Citation Index; PCT; PCTPAT; Phin; Pira; Pluspat; PROMT; PsycINFO; PubMed; Rapra; Recent Refs; Reference Manager; Science Citation Index; SciSearch; Scopus; Tech Research; ToxFile; Transport; USApps; USPat; Waternet; WaterResAbs; Web of Science; WeldaSearch; Wisdomain

Record/Field Tools
• Combine duplicate records
• Remove duplicate records
• Create “frankenrecords” (merge records from dissimilar sources)
• Classify records
• Merge fields
• Clean up fields
• Apply thesauri
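Two of the record/field tools listed above, removing duplicate records and applying thesauri, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not VantagePoint's implementation: the matching rule (normalized title), the thesaurus entries, and the sample records are all hypothetical.

```python
import re

# Hypothetical thesaurus: variant spelling (lowercase) -> preferred form.
THESAURUS = {
    "georgia inst technol": "Georgia Tech",
    "georgia institute of technology": "Georgia Tech",
}


def title_key(title):
    """Normalize a title so trivially different copies compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def remove_duplicates(records):
    """Keep the first record seen for each normalized title."""
    seen = set()
    unique = []
    for record in records:
        key = title_key(record["title"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def apply_thesaurus(value, thesaurus=THESAURUS):
    """Replace a known variant with its preferred form; otherwise keep it."""
    stripped = value.strip()
    return thesaurus.get(stripped.lower(), stripped)


# Made-up records, as if retrieved from two different databases.
records = [
    {"title": "Ceramic coatings for turbine blades", "org": "Georgia Inst Technol"},
    {"title": "Ceramic Coatings for Turbine Blades.", "org": "Georgia Institute of Technology"},
    {"title": "Vapor deposition on engine parts", "org": "Example Labs"},
]

cleaned = [dict(r, org=apply_thesaurus(r["org"])) for r in remove_duplicates(records)]
print(cleaned)
```

Matching on title alone is deliberately crude; records from dissimilar sources usually need more than one field compared before they are treated as the same item.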
VPInstitute.org
VP Institute
Body of Knowledge: Published text-analytic research
▼Research Examples (269)
►Data Type (70)
▼Research Type (144)
• Citation analysis (2)
• CTI (4)
• Future-oriented technology analysis (10)
• Interdisciplinarity (2)
• Literature Based Discovery (1)
• Network analysis (5)
• Research evaluation (8)
• Research profiling (17)
• Science mapping (7)
• ST&I indicators (6)
• ST&I policy (12)
• Strategic planning (8)
• Tech mining (9)
Body of Knowledge: Examples
• “Trends in nanotechnology patents applied to the health
sector” - Instituto Nacional de Propriedade Industrial &
FIOCRUZ
• “Synthetic Biology: Mapping the Scientific Landscape” -
Lancaster University
• “Applying text-mining to personalization and
customization research literature – Who, what and where?”
- Aalto University School of Economics
• “Nanobiomedical Science in China: A Research Field on the
Rise” - Chinese Academy of Sciences+
• “Predicting Breakthrough Papers: Ranking Statistics,
Patterns, and Visualization” - Discovery Logic
• “Composing Technology Roadmapping According to
Bibliometrics: Hybrid Model and Empirical Study” -
Beijing Institute of Technology
Latin American Examples
• Contributing to National R&D policy consideration: The
Brazilian Dengue Network in global context - Collective
Health Institute+
• Co-authorship Network Analyses of Neglected Diseases -
Fiocruz
• Technology Prospecting on enzymes for the Pulp & Paper
industry – UFRJ
• Biotech trends in Venezuela – with Latin American
collaboration networks – UFRJ
• Trends in drug development for breast cancer - Brazilian
Institute of Information in Science and Technology
• Building a research intelligence system with
Scientometrics + Mediametrics - Embrapa
TechMining success story:
Ceramics in Engines
• Overcoming Management Resistance
• Jumping Domains
• “Discovering” new technology
Informing a tough decision

• US Army Tank-Automotive Research, Development & Engineering Center
• Task in 1996: Reassess a “loser technology” – could thin-film ceramics be
used in tank engines?
• Tech Mining R&D profile: amount of activity up only a little, but with
clues of significant maturation
The data speak
The rest of the story
• Experts support the empirical findings
• Management buys in – search out potential in
“coating engine parts”
• Who to go to? Search finds ~95% of the research
is NOT in the mechanical engineering domain
• Identify R&D leaders – in semiconductor ceramics!
• $million projects instigated with Sandia National
Lab and a company to adapt “vapor deposition”
to turbine blades
• A production plant that coats used (Gulf War) Abrams
tank turbine blades, bringing them back to spec, begins
successful operation (2004)
Mediametrics: Tracking Press Coverage
of Embrapa [Roberto Penteado]
• Clippings file: Shift from paper to electronic
• Embrapa – Brazil’s agricultural research corporation
– Some 37 distinct units
– Electronic clipping since 1997
• Text mine these
– 17,000 clippings from 2003-2004
– Track amount of coverage
– Track type of coverage
• New extension of this approach up on
VPInstitute.org
National Agricultural Research System (Sistema Nacional de Pesquisa Agropecuária, SNPA)
37 research centers and three services
Embrapa Research: R$ 877 million per year
Focus on Special Issues:
Attention to Genetically Modified Organisms

Publication              GMOs   Total   %
All publications (#)     492    4895    10
O Estado de S. Paulo     35     740     4.6
Folha de S. Paulo        20     238     8.4
Gazeta Mercantil         19     505     3.7
Correio do Povo          19     597     3.1
Gazeta do Povo           18     128     14
Jornal de Brasilia       11     555     2
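The percentage row in the table above is each newspaper's GMO count divided by its total count. A short Python sketch of that arithmetic follows; the counts are copied from the table, the newspaper names are as read from its header, and treating 492 and 4895 as totals across all publications is an assumption. Recomputed values can differ from the slide's figures in the last digit, since the slide does not say how it rounded.

```python
# (GMO items, total items) per newspaper, copied from the table above.
coverage = {
    "O Estado de S. Paulo": (35, 740),
    "Folha de S. Paulo": (20, 238),
    "Gazeta Mercantil": (19, 505),
    "Correio do Povo": (19, 597),
    "Gazeta do Povo": (18, 128),
    "Jornal de Brasilia": (11, 555),
}


def share_percent(part, total):
    """Share of `part` in `total`, as a percentage."""
    return 100.0 * part / total


shares = {paper: share_percent(gmo, total) for paper, (gmo, total) in coverage.items()}

# Assumed to be the totals across all publications (first column of the table).
overall = share_percent(492, 4895)

# Newspaper devoting the largest share of its Embrapa coverage to GMOs.
most_attentive = max(shares, key=shares.get)

for paper, pct in shares.items():
    print(f"{paper}: {pct:.1f}%")
print(f"Overall: {overall:.1f}%; highest share: {most_attentive}")
```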
Framework to Forecast NEST Innovation Pathways
1. Understand the NEST and its TDS (Technology Delivery System)
– Step A: Characterize the technology’s nature
– Step B: Model the TDS
2. Tech Mine
– Step C: Profile R&D
– Step D: Profile innovation actors & activities
– Step E: Determine potential applications
– Step J: Engage experts
3. Forecast likely innovation paths
– Step F: Lay out alternative innovation pathways
– Step G: Explore innovation components
– Step H: Perform Technology Assessment
4. Synthesize & report
– Step I: Synthesize and Report

Search Technology, 2010


Case Example

1. A Newly Emerging Science & Technology case analysis –
Dye-Sensitized Solar Cells (“DSSCs”)
2. Combining technical intelligence from multiple
database analyses – to answer:
a. What? / When?
b. Who? / What?
3. Seeking to Forecast Innovation Pathways
Trends of Annual DSSC Activity
Step D. Leading DSSC Companies across Databases
Company                          SCI   EI   DWPI   Factiva
Samsung SDI Co LTD               52*   38   65*    4
Sharp Co Ltd                     27*   24   17*    4
Nippon Oil Corp                  15*   35   27*    10*
Hayashibara Biochem Labs Inc     14*   9    0      0
Fujikura Ltd                     12*   8    17*    9*
Chemicrea Co Ltd                 10*   8    0      0
Sumitomo Osaka Cement Co Ltd     10*   3    3      2
Toshiba Co Ltd                   9*    7    2      1
Konarka Technologies Inc         7*    11   11*    9*
DONG JIN SEMICHEM CO LTD         0     1    16*    8*
SONY CORP                        10    10   17*    17*
Evonik Degussa GmbH              0     0    0      15*
STMicroelectronics NV            0     0    0      12*
Data Systems & Software Inc      0     0    0      8*
Dongjin Semichem Co Ltd          0     1    0      8*
Dyesol Ltd                       3     3    2      8*
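The table above lists both "DONG JIN SEMICHEM CO LTD" and "Dongjin Semichem Co Ltd", which illustrates why name cleaning matters when combining databases. The Python sketch below flags candidate name variants using a crude matching key (lowercase, letters and digits only). The key is an assumption chosen for illustration, and the sketch only flags pairs for an analyst to review; it does not merge counts, because the source does not say whether the two rows overlap.

```python
import re
from collections import defaultdict

# A few company names as they appear in the table above.
names = [
    "Samsung SDI Co LTD",
    "DONG JIN SEMICHEM CO LTD",
    "Dongjin Semichem Co Ltd",
    "Dyesol Ltd",
]


def match_key(name):
    """Crude matching key: lowercase, keep only letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def candidate_variants(names):
    """Group names that share a matching key; return groups of two or more."""
    groups = defaultdict(list)
    for name in names:
        groups[match_key(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


variants = candidate_variants(names)
print(variants)
```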
Step D (detailed). DSSCs → “Glass Houses”
• Who?
– ~19 patent families
– Samsung prominent (6)
• Find out more – Profile Samsung
– 54 patent families
– ~2 inventor teams
– 1 team with 28 patents has all 6 of these [network map next]
• We could analyze their emphases – e.g., Manual Code concentrations
– Discrete devices
– Electro-(in)organics
– Polymer applications, etc.
Step D (detailed). Patent Analyses:
2 distinct inventor teams.
The upper team has the 6 glass-wall-related patents.



Players-technologies Network Map
Study research networks
• From publications
– Mainly compare: Before vs. After
– Secondarily, examine those deriving from NSF support
• From citations
– By researcher publications, or proposals
– To researcher publications
• For proposal references
• For Target & Comparison Group researchers
• Networks based on
– Social links [e.g., co-authoring]
– Intellectual links [e.g., cross-citing or bibliographic
coupling on Subject Categories (SCs), topics, or other fields]
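The two kinds of links named above can each be reduced to counting shared items between pairs. A minimal Python sketch follows, assuming each paper carries an author list and a set of cited references; the paper IDs, authors, and references are hypothetical placeholders.

```python
from collections import Counter
from itertools import combinations

# Hypothetical papers: authors give social links, references give intellectual links.
papers = {
    "P1": {"authors": ["A", "B"], "refs": {"R1", "R2", "R3"}},
    "P2": {"authors": ["A", "C"], "refs": {"R2", "R3"}},
    "P3": {"authors": ["A", "B", "C"], "refs": {"R4"}},
}


def coauthor_links(papers):
    """Social links: number of papers each pair of authors wrote together."""
    links = Counter()
    for paper in papers.values():
        for pair in combinations(sorted(set(paper["authors"])), 2):
            links[pair] += 1
    return links


def coupling_links(papers):
    """Intellectual links: number of references shared by each pair of papers
    (bibliographic coupling)."""
    links = Counter()
    for id1, id2 in combinations(sorted(papers), 2):
        shared = len(papers[id1]["refs"] & papers[id2]["refs"])
        if shared:
            links[(id1, id2)] = shared
    return links


social = coauthor_links(papers)
intellectual = coupling_links(papers)
print(social, intellectual)
```

For a Before vs. After comparison, the same functions could be run on the two time slices separately and the resulting link counts compared.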
[Figure: Co-citation map of the most cited authors in the 307 nano social science papers, built with auto-correlation on highly cited authors. Visible cluster labels: Visions; Evolutionary Economics]
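A co-citation map of this kind places two cited authors close together when the same papers tend to cite both. The Python sketch below computes one common similarity measure for that, cosine similarity over the sets of citing papers. Using cosine here is an assumption made for illustration; the slides do not specify how VantagePoint's auto-correlation is calculated, and the data are hypothetical.

```python
from math import sqrt

# Hypothetical data: for each citing paper, the set of authors it cites.
cited_by_paper = {
    "P1": {"Author A", "Author B", "Author C"},
    "P2": {"Author A", "Author B"},
    "P3": {"Author C", "Author D"},
    "P4": {"Author D"},
}


def citing_sets(cited_by_paper):
    """Invert the data: cited author -> set of papers citing that author."""
    result = {}
    for paper, authors in cited_by_paper.items():
        for author in authors:
            result.setdefault(author, set()).add(paper)
    return result


def cosine(a, b):
    """Cosine similarity between two sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


sets = citing_sets(cited_by_paper)
sim_ab = cosine(sets["Author A"], sets["Author B"])  # always cited together
sim_ac = cosine(sets["Author A"], sets["Author C"])  # sometimes cited together
sim_ad = cosine(sets["Author A"], sets["Author D"])  # never cited together
print(sim_ab, sim_ac, sim_ad)
```

Pairs with high similarity would be drawn as linked or neighboring nodes; a threshold for drawing a link is a further choice left to the analyst.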
Tech Mining References
• Porter, A.L., and Cunningham, S.W. (2005), Tech Mining: Exploiting New
Technologies for Competitive Advantage, Wiley, New York.
• Porter, A.L., Guo, Y., and Chiavetta, D. (2011), Tech Mining: Text mining
and visualization tools, as applied to nano-enhanced solar cells, Wiley
Interdisciplinary Reviews: Data Mining and Knowledge Discovery 1 (2),
172-181.
• Zhang, Y., Zhou, X., Porter, A.L., Gomila, J.M.V., and Yan, A. (2014),
Triple Helix Innovation in China’s Dye-Sensitized Solar Cell Industry:
Hybrid Methods with Semantic TRIZ and Technology Roadmapping,
Scientometrics 99 (1), 55-75.
• Robinson, D.K.R., Huang, L., Guo, Y., and Porter, A.L. (2013),
Forecasting Innovation Pathways for New and Emerging Science &
Technologies, Technological Forecasting & Social Change 80 (2), 267-285.
• Global Tech Mining Conference Special Issues: Technology Analysis &
Strategic Management; Technological Forecasting & Social Change;
Scientometrics; International Journal of Technology Management.
Resources
• The text mining software used:
www.theVantagePoint.com
• Example analyses:
www.VPInstitute.org
• Global Tech Mining Conference, in conjunction with
S&T Indicators Conference, Sep., 2014, Leiden
• Brazilian Tech Mining workshop in conjunction with
ProspeCTI, Sep., Salvador
12 Years of Tech Mining:
Looking Ahead
• Mine field-structured text like data, looking for patterns,
using VantagePoint + other software
• Mine multiple data sources with ever-better computing power
• Deeper text mining: topical (what?) analyses using
term clumping and topic modeling
• Combine empirical + expert knowledge
• Aim to semi-automate scientometric analysis processes
(growing resource of macros; see http://vpinstitute.org/,
which is also good for information on Tech Mining, with lots of examples)
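To give a feel for what term clumping does, the Python sketch below consolidates simple variants of the same term (case, hyphenation, a trailing plural "s") before counting. It is a deliberately simplified stand-in: the consolidation rule is an assumption made here, it will mangle some words (for example, "analysis"), and actual term clumping involves more cleaning and consolidation steps than shown.

```python
import re
from collections import Counter


def clump_key(term):
    """Crude consolidation key: lowercase, treat hyphens as spaces,
    strip a trailing plural 's' from each word."""
    words = re.sub(r"[-\s]+", " ", term.lower()).strip().split(" ")
    singular = [
        w[:-1] if w.endswith("s") and not w.endswith("ss") and len(w) > 3 else w
        for w in words
    ]
    return " ".join(singular)


def clump(terms):
    """Count terms after consolidation; label each clump by its first variant."""
    counts = Counter()
    labels = {}
    for term in terms:
        key = clump_key(term)
        counts[key] += 1
        labels.setdefault(key, term)
    return {labels[key]: n for key, n in counts.items()}


# Hypothetical keyword list with spelling variants.
terms = [
    "Dye-sensitized solar cells",
    "dye sensitized solar cell",
    "Solar cells",
    "thin films",
    "Thin film",
]

clumped = clump(terms)
print(clumped)
```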
Tech Mining in Brasil at a Glance

Author Affiliations (Organization and City and Country) (Cleaned)
• Instituto Nacional de Propriedade Industrial, Praça Mauá no 7 sala 718, Rio de Janeiro, Brazil [3]
• Federal Center of Technological Education Celso Suckow da Fonseca, Rio Janeiro, Brazil [2]
• Center for Management and Strategic Studies - CGEE, SCN Quadra 2, Bloco A, 11o andar, CEP 70712-900 Brasília-DF, Brazil [1]
• Federal University of Rio de Janeiro, Centro de Tecnologia-Bloco E-Sala I-222, Ilha do Fundao, Rio de Janeiro CEP 21949-900, Brazil [1]
• Fundação Oswaldo Cruz, Instituto de Comunicacao e Informacao Cientifica e Tecnologica em Saude, Av. Brasil, 4.365 - Pavilhão Haity Moussatche, Rio de Janeiro, CEP: 21.045-360, Brazil [1]
• Search Technology, Inc., Norcross, GA, United States [1]
• Universidade Federal do Rio de Janeiro, Escola de Química, Av. Horácio Macedo, 2030, Centro de Tecnologi, Cidade universitária, Rio de Janeiro, Brazil [1]

Authors
Antunes, A. M. S. [2]; de Souza, C. G. [2]; Alencar, M. S. M. [1]; Coelho, G. M. [1]; da Silva, C. H. [1]; de Barros, W. C. [1]; de Cássia Amorim, R. [1]; de Menezes Alencar, M. S. [1]; de Miranda Santo, M. [1]; dos Santos, D. M. [1]; Filho, L. F. [1]; Mendes, F. M. L. [1]; Nunes, J. [1]; Porter, Alan L [1]

Keywords
Patent Analysis [4]; Brazil [3]; nanotechnology [3]; Data mining [2]; Foresight [2]; Future Technology Analysis [2]; industrial property [2]; Information technology [2]; Intellectual property [2]; Research and development [2]; Technology policy [2]; Text mining [2]; Competition [1]; Composite materials [1]; D centers [1]; Decision making [1]; Dental care [1]; Detecting target [1];
