
Answering The Queries Your Users Really Want To Ask

Dr Greg Low
Managing Director and Mentor
SolidQ Australia
GLow@SolidQ.com

When querying a database, what do users want?

#1: They're not sure

#2: They don't want to be precise

#3: They do want to use their own terminology

#4: They want the answers fast

So what do we typically give them?

#1: Limited choices

#2: Highly constrained searches

#3: Strict and limited terminology

#4: Slow response times

We need to do better!

WHO AM I?
Managing Director, SolidQ Australia
Host of SQL Down Under Podcast
Microsoft Regional Director
Microsoft MVP for SQL Server
Organizer for SDU Code Camp
Co-organizer for CodeCampOz
PASS Board Member

WHO WE ARE
LEADING INDUSTRY EXPERTS:

A growing group of more than 100 of the world's best technical experts, with a high concentration of Microsoft MVPs and RDs in our ranks.
PUBLISHED AUTHORS:

Contributors in technical reference books, Microsoft reference materials, industry white papers, technical magazine articles, and webcasts.
TOP TECHNICAL SPEAKERS:

PASS Community Summit, Microsoft TechEd, The Microsoft BI Conference, SQL Server DevConnections, countless user groups, international conferences, and events.

Session Objectives And Agenda


What Can It Do For Me? (Why do I care?)
How Do I Implement And Manage IFTS?
How Did SQL Server 2008 Change The Game?
Summary

Can't we just use LIKE?

SELECT SomeColumns
FROM SomeTable
WHERE Description LIKE '%Hockey%'

Strings vs Words
Search For: Pen

Get Back: Pencil

Or: Open, Pendulum, Penitentiary

Or Worse: Male Anatomy
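The difference between matching strings and matching words can be sketched in a few lines of Python. This is only an illustration of the idea, not how SQL Server implements LIKE or full-text search; the `TITLES` list and both function names are assumptions made up for the example.

```python
import re

# Hypothetical sample values, taken from the words on the slide.
TITLES = ["Pen", "Pencil", "Open", "Pendulum", "Penitentiary"]

def substring_match(term, texts):
    """Mimic LIKE '%term%': a case-insensitive match anywhere in the string."""
    needle = term.lower()
    return [t for t in texts if needle in t.lower()]

def word_match(term, texts):
    """Match only when the term appears as a whole word."""
    needle = term.lower()
    return [t for t in texts if needle in re.findall(r"\w+", t.lower())]

print(substring_match("pen", TITLES))  # every title contains the letters "pen"
print(word_match("pen", TITLES))       # only the title that is the word "Pen"
```

The substring search returns all five titles, while the word search returns only "Pen", which is what the user was actually asking for.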

Inflections
Search For: Drive

Really Wants: Drives, Driving, Drove
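A minimal Python sketch of inflectional matching follows. It assumes a tiny hand-written table of word forms (`INFLECTIONS` is hypothetical); a real engine relies on a language-specific stemmer instead. In SQL Server the equivalent request is made with FORMSOF(INFLECTIONAL, ...) inside a CONTAINS predicate.

```python
import re

# Hypothetical lookup table; a real engine derives these forms with a stemmer.
INFLECTIONS = {
    "drive": {"drive", "drives", "driving", "drove"},
}

def expand_inflections(term):
    """Return every known form of the term, or just the term itself."""
    term = term.lower()
    return INFLECTIONS.get(term, {term})

def matches_inflection(term, text):
    """True when the text contains any inflected form as a whole word."""
    words = set(re.findall(r"\w+", text.lower()))
    return bool(words & expand_inflections(term))

print(matches_inflection("Drive", "She drove to work"))
print(matches_inflection("Drive", "The driveway is long"))
```

Because the comparison is word-based, "driveway" does not count as a form of "drive", even though it contains the same letters.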

Proximity
Search For: Time Beijing

Really Wants:

The word Time somewhere near the word Beijing
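Proximity can be sketched as a comparison of word positions. The function below is an illustrative assumption, not SQL Server's NEAR operator: the name `near` and the `max_distance` default of 5 words are invented for the example.

```python
import re

def near(text, first, second, max_distance=5):
    """True when both words occur within max_distance words of each other."""
    words = re.findall(r"\w+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(
        abs(a - b) <= max_distance
        for a in first_positions
        for b in second_positions
    )

print(near("In Beijing, the current time is 10:45am", "time", "Beijing"))
```

Order does not matter here: "Beijing" appears before "time" in the sample sentence, and the pair still qualifies because the two words are three positions apart.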

Synonyms
Search For: Client

Really Wants: Client, Customer, Punter, etc.
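Synonym matching amounts to expanding the search term through a thesaurus before comparing words. The sketch below assumes a hypothetical in-memory `THESAURUS` holding one group of interchangeable terms; SQL Server keeps its thesaurus in per-language configuration files instead.

```python
import re

# Hypothetical thesaurus: each set is a group of interchangeable terms.
THESAURUS = [
    {"client", "customer", "punter"},
]

def expand_synonyms(term):
    """Return the synonym group containing the term, or just the term."""
    term = term.lower()
    for group in THESAURUS:
        if term in group:
            return set(group)
    return {term}

def matches_synonym(term, text):
    """True when the text contains the term or any of its synonyms."""
    words = set(re.findall(r"\w+", text.lower()))
    return bool(words & expand_synonyms(term))

print(matches_synonym("Client", "The customer called twice"))
```

The user keeps their own terminology ("Client") and still finds rows written with the organisation's terminology ("customer").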

Semantics and Meaning


They search: Time in Beijing

Data actually is:

In Beijing, the current time is 10:45am
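The example above can be reproduced with a short Python sketch. It falls well short of real semantic search: it only shows that comparing significant words, rather than the literal phrase, is enough to connect this query to this data. The `STOP_WORDS` set and the function names are assumptions for illustration.

```python
import re

# Hypothetical noise words that carry little meaning in a search.
STOP_WORDS = {"in", "the", "is", "a", "of"}

def significant_words(text):
    """Split text into lowercase words and drop the noise words."""
    return {w for w in re.findall(r"\w+", text.lower()) if w not in STOP_WORDS}

def phrase_like(query, text):
    """Mimic LIKE '%query%': the literal phrase must appear in the text."""
    return query.lower() in text.lower()

def words_match(query, text):
    """True when every significant query word appears somewhere in the text."""
    return significant_words(query) <= significant_words(text)

QUERY = "Time in Beijing"
DATA = "In Beijing, the current time is 10:45am"

print(phrase_like(QUERY, DATA))   # the literal phrase is not in the data
print(words_match(QUERY, DATA))   # both significant words are present
```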

Demo

The End Game


Search At Microsoft: Main Players


Live Search
Microsoft Search
FTS in SQL Server 2000 and 2005
IFTS in SQL Server 2008 changes the game

IFTS Main Concepts


Integrated into Microsoft SQL Server
Fast and flexible querying
Scoring and relevance
Since when?

IFTS Main Terminology


iFilters
Tokenizing
Indexing
Querying
Scoring

SQL Server 2008: Wordbreakers


Arabic, Bengali, Brazilian, Bulgarian, Canadian, Catalan, Chinese (Simplified), Chinese (Traditional), Chinese (Hong Kong), Chinese (Macau), Chinese (Singapore), Croatian, Cyrillic, Danish, Dutch, English, English UK, French, German, Gujarati, Hebrew, Hindi, Icelandic, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Malay, Malayalam, Marathi, Neutral, Norwegian, Polish, Portuguese, Punjabi, Romanian, Russian, Serbian, Serbian Latin, Slovak, Slovenian, Spanish, Swedish, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Vietnamese

Legend: Present but disabled; New for 2008; In 2005 but replaced in 2008; Unchanged from 2005

IFTS Implementation In A Nutshell


Table with textual data
Create full-text catalog
Create full-text index
Populate the index
Query using full-text predicates or tables
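The index-then-query flow in the steps above can be sketched conceptually in Python. This is a toy inverted index, not SQL Server's full-text index structure; the sample `ROWS` and the function names are hypothetical.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Break text into lowercase words (the job of a wordbreaker)."""
    return re.findall(r"\w+", text.lower())

def build_index(rows):
    """Populate an inverted index: word -> set of row keys containing it."""
    index = defaultdict(set)
    for key, text in rows.items():
        for word in tokenize(text):
            index[word].add(key)
    return index

def contains_all(index, query):
    """Return the keys of rows that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical table with textual data, keyed by a unique row key.
ROWS = {1: "Ice hockey stick", 2: "Field hockey ball", 3: "Tennis ball"}
INDEX = build_index(ROWS)

print(contains_all(INDEX, "hockey"))
print(contains_all(INDEX, "hockey ball"))
```

The query never scans the text itself; it only intersects the key sets stored for each word, which is why a populated index answers word searches quickly.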

Demo

Implementing and Managing Full Text Search


FTS 2005 Architecture Overview


[Diagram: SQL Server, MSFTEFD (Filter Daemon) with its Protocol Handler, and MSFTESQL (FT Search Engine). Data flow shown: Raw Data -> Filter -> Filtered Data -> Wordbreaker -> Tokenized data -> Fulltext Catalog]

Main FTS 2005 Performance Issue


Mixed Queries:

SELECT Product_id
FROM Products
WHERE Product_id = '764A'
  AND CONTAINS(Description, 'Microsoft')

Main FTS 2005 Performance Issue


Workaround (computed column and modified query):

SELECT Product_id
FROM Products
WHERE CONTAINS(Description_Merged, 'Microsoft AND productId764A')

Product_id: 764A
.......
Description: Microsoft product from Office family
Description_Merged: Microsoft product from Office family. ProductId764A
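The idea behind the workaround can be sketched in Python: once the key is merged into the indexed text as an extra word, a single index lookup answers both halves of the mixed query. The second product row, the helper names, and the toy index are assumptions for illustration only.

```python
import re

# Row 764A comes from the slide; row 901B is a hypothetical second product.
PRODUCTS = {
    "764A": "Microsoft product from Office family",
    "901B": "Microsoft product from Windows family",
}

def merged_description(product_id, description):
    """Mimic the computed column: append the key as an extra word."""
    return f"{description}. ProductId{product_id}"

def build_index(texts):
    """Toy inverted index: lowercase word -> set of row keys."""
    index = {}
    for key, text in texts.items():
        for word in re.findall(r"\w+", text.lower()):
            index.setdefault(word, set()).add(key)
    return index

def contains_all(index, *terms):
    """Return keys of rows containing every term, using only the index."""
    sets = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*sets) if sets else set()

INDEX = build_index(
    {pid: merged_description(pid, text) for pid, text in PRODUCTS.items()}
)

print(contains_all(INDEX, "Microsoft"))
print(contains_all(INDEX, "Microsoft", "productId764A"))
```

Searching for "Microsoft" alone returns both rows, while adding the merged key term narrows the result to 764A without a separate relational filter.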

IFTS Query Architecture


[Diagram: a query passes through the SQL Server process: T-SQL Parser, Algebrizer (SQL and FTS algebrizers producing an SQL/FTS integrated query tree), QO (cardinality, FTLogicalOperators), Execution Plan, QE (FTExecutionOperators reading the Full-Text Index), Results. Other labeled components: FDHOST process, Shared Memory, WB client (Parse, Bind), WB, Stemmer, Language Module, iFilters, STOPLIST, THESAURUS, Ranking Func. Integration]

Other FTS2005 Challenges


Indexes stored outside SQL Server
Possible scaling issues
Two different product teams

SQL Server 2008: IFTS


Full-text indexes inside SQL Server
Management just works
Indexing fast
Troubleshooting
Diagnostic DMVs
Stoplists

Indexing Performance

Populating an index of 20 million rows of 1 KB data on identical hardware (time in minutes)

Demo

SQL Server 2008 IFTS Enhancements

Upgrade Options
New index structure

Upgrade options: Import (default), Rebuild, Reset

Possible upgrade methods: In place, Restore/attach

Potential Extensions
New data types: XML, CLR UDT

Extend the IFTS feature set


Snippets with hit-highlighting
Field-weighted relevance
Customizable tokenizing
Customizable proximity operator
Property-level search

Be heard now!

Summary
IFTS can add significant value
Implementation -> straightforward
Management -> straightforward
Users love this
SQL Server 2008 has changed the game!

Q&A

Thank you
glow@solidq.com
http://sqlblog.com/blogs/greg_low
http://www.sqldownunder.com
