Find Ideas, Do Research, Publish Papers, Earn Degree!

Charles Ling, Professor

PhD (U of Pennsylvania)

University of Western Ontario, Canada

To Become a Researcher

Research career: scientific discovery
- Not commercialization
- Different from engineering

The life of a researcher: are you suitable?
- Creative and critical, focused, diligent, persistent

Career path: graduate student, PhD, then researcher (large companies) or professor (universities)

First Step: MSc/PhD Candidates

You are becoming a researcher (with support)
1. Finding research topics and ideas
2. Doing good research
3. Writing and publishing papers (peer-reviewed)
4. Earning an MSc/PhD thesis; applying for grants

Two things help young researchers to start
- The Internet
- Anonymous authorship in paper submission

Probably best to obtain a PhD in North America

1. Finding Research Ideas

(Theoretical) Research: discover new knowledge
- New, sound, significant and better, high impact
(Applied research: killer applications)

Read the latest proceedings and journals
- The Internet makes things much easier
- Google Scholar, authors' homepages, emailing authors

Attend the latest conferences, chat with people
Relax: walk on the beach, drink some beer
Finding Research Ideas

The attitude when you read papers
- 30% understanding, 70% critical/creative thinking
- What is wrong? [Example]
- What else? Why not? How to do better?
- Many possibilities (creativity: no unique correct answer)
- From small to major extensions
- Surprising effect (simple but creative) [Example]
- High impact: affecting future research and applications
Finding Research Ideas

Reading group: read the abstract, then brainstorm
- What is wrong? What would I do? How would I do better?
- Different levels of discussion
- Take notes on new ideas; do not criticize

From Ideas to Thesis Topics

- Start with some small and interesting ideas
- Expand to a broad topic with related problems
- Both supervisor and student play a role
- Pay attention to hot topics: invited talks at conferences, special issues of journals
- Make a plan (like a thesis outline)
- Be flexible with the plan
- Stay focused on the plan
An Example: Measures for ML

Review: previous measures
- accuracy, AUC, ROC
- measures for orders
- profit, cost-sensitive learning
- recall, precision, combination
- many others in NL retrieval, engineering

AUC: a single number measure
- criteria for comparing measures
- prove theorems for AUC and accuracy (IJCAI03)
- relation between C and D (5 % measures) ****
- experimental verification (CAI03)
- comparing algorithms with AUC (ICDM03)
- AUC vs profit: **** (TKDE05)

Optimizing AUC ***
- AUC oriented decision trees
- AUC oriented SVM
- AUC oriented neural networks (submitted)
- general theorems about optimizing AUC

Other measures and comparison ****
- (new) Super AUC measure (sub 06)
- compare and connect other measures
- rank measures (ECML05)
- proof or experiments with other measures (NL, )

Discussions and Conclusions
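
To make the two central measures in this outline concrete, here is a minimal Python sketch of accuracy and AUC. It is only an illustration, not the method of the cited papers: the function names, the 0.5 threshold, and the toy labels and scores are all made up for this example. AUC is computed with the standard pairwise definition, the fraction of (positive, negative) pairs in which the positive example gets the higher score, with ties counted as one half.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the 0/1 label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: two rankers scoring the same six examples.
labels = [1, 1, 1, 0, 0, 0]
ranker_a = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
ranker_b = [0.9, 0.6, 0.4, 0.8, 0.3, 0.2]

for name, scores in [("A", ranker_a), ("B", ranker_b)]:
    print(name, "accuracy:", accuracy(labels, scores), "AUC:", auc(labels, scores))
```

On this toy data both rankers make the same thresholded predictions, so their accuracy is identical, while their AUC differs because ranker B places a negative example above two positives instead of one. That is the kind of difference between measures that the comparison items in the outline refer to; the sketch says nothing about how the measures behave on real data.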
2. Doing Good Research

Not an easy task: must make new contributions
- Know the state of the art well (write a survey paper)
- Many people may have tried your (new) ideas but failed (failures are not usually published)
- It usually takes a great effort to get it to work

Must be very diligent and thorough
- Implement your own and others' algorithms
- Develop new theory and run many experiments
- Verify from different aspects: how would others challenge me?
3. Writing and Publishing Papers

Conference papers: yearly, so fast, with submission deadlines
Journal papers: archival, so more complete
- Dealing with rejection and major revision

Paper writing (another talk)
- English is not our worst enemy
- Convincing results, clear presentation

Work with your supervisor

Convincing: Logic of a Paper

- Problem X is important
- Previous work A and B has been done, but it has certain weaknesses
- We propose a new method Z
- We conduct experiments comparing Z to A and B, and show Z is better
- Why is Z better? Why didn't C and D work?
- What are the strengths and weaknesses of Z?
- Conclusions and future work on Z
Clarity: Structure of a Paper

Title: a 10-word paper
Abstract: a 200-word paper; very high level. Emphasize contributions and significance. Omit details and avoid technical terms
Introduction: a 2-page paper; high level. Emphasize background and motivation
(Review of Previous Work), Our Work, Experiments and Comparisons, Relation to Previous Work, Discussions: expand on various parts (also top-down structure)
Conclusions: a 200-word paper; summary and future work
References, Appendix

4. Towards a PhD Thesis

- Start early: survey the area, design a plan
- Submit at least 2 conference papers and 1 journal paper each year
- PhD thesis: a collection of these papers!

[Figure: building a PhD thesis from papers. Labels: Survey paper; IJCAI 95-02, KDD 95-02, AAAI 95-02, ECML 95-02, JMLR 95-02, MLJ 95-02; CAI 03; KDD 04; On-line system; PhD Thesis]

Applied Research/Building Systems

- Find a killer application!!!
- Keep in mind the grand vision and end users
- Divide and conquer: sub-goals and milestones
- Application-driven research: new algorithms, experiments, scaling up, user studies
- Still publish, and allow others to use your system

Should you worry about being copied? In general, no.
- You are always the best expert, one step ahead
- For research purposes only, getting feedback
- You gain reputation and recognition

You may want to start up a new company!

- Research: new, significant, high-impact
- MSc and PhD candidates: supervised researchers
- Think critically and creatively
- Start with a small idea, survey the area, build it up
- Thesis: have a plan early and focus on it
- Publish yearly in the best conferences and journals
- Applied research: killer applications
- The Internet and anonymous authorship make it fair
- Being a researcher or professor is a great career

Wish you success in research!! Thanks!

Advanced Data Mining and Applications

Prof. Charles X. Ling
PhD (U of Pennsylvania)
University of Western Ontario, Canada

Course Plan

- Wed: Charles Ling, WEKA algorithms and software
- Thurs: Qiang Yang, DM algorithms and applications
- Fri: Qiang Yang, DM algorithms and applications
- Monday: Charles Ling/Qiang Yang, applications
- Tues: Wrap up and exam!

Textbook: Data Mining: Practical Machine Learning Tools and Techniques (Second Edition), by Ian H. Witten and Eibe Frank, Morgan Kaufmann, 2005.
Software (v4.11)
Lecture notes

Keys in DM Applications

Many simple algorithms work quite well
Crucial: converting real-world problems into DM problems
- Use WEKA directly
- Modify/improve WEKA

DM: a lot of science and a bit of art
- Must understand how the algorithms work

Becoming a Researcher & Research Methodologies

Professor Charles X. Ling
PhD (U of Pennsylvania)

University of Western Ontario, Canada
What Do Researchers/Scientists Do?

Create new ideas, invent, discover
- Potentially useful in the long run
- Not merely engineering or applications
- Not for short-term profit (not products)

Research Support
- Mostly government grants and funding (NSF, NSERC, NASA, etc.)
- Based purely on the merit of your research

Research Environment
- Universities and research institutes
- Teaching is light
- Graduate teaching is useful for research
- Academic freedom: you are your own boss
- Supported mostly by government funding
- Tenure system: a lot of free time to explore
- Supervise graduate students
- Publish papers

