An End-to-End Walkthrough

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 12

Store and Index Documents in SQL Server 2012 - An End-

to-End Walkthrough
SQL Server 2012 introduces two new features that make it both easier and more valuable to store your documents and files in the database:

FileTables let you save files into the database quickly and conveniently by dragging and dropping them in Windows Explorer, or saving them from your
everyday Windows applications. FileTables build on top of the FILESTREAM feature that was introduced in a previous version of SQL Server.
Semantic Search helps you get more value out of your documents by tagging documents with the most important key phrases that they contain, and by
identifying similar and related documents automatically. Semantic Search builds on top of the Full-Text Search feature that was introduced in a previous version
of SQL Server.

This article captures a timed walkthrough of the steps required to configure a SQL Server 2012 document storage solution from scratch. It includes a screen shot of
every step.

What we'll achieve in this walkthrough


1. Configure an instance of SQL Server 2012 from scratch for document storage.
2. Create and configure a new database and a new table to store documents.
3. Copy 137 files and folders totaling 105 MB into the new table.
4. Enable indexing of the stored documents.
5. Demonstrate the success of our solution by running sample Full-Text and Semantic Search queries over the indexed documents.

We'll also see how quickly this solution can be completed.

Steps
1. Configure document storage on the server.
a. At the instance level: Enable and configure FILESTREAM; specify a folder.
b. At the database level: Enable non-transactional access; specify a folder.
c. At the table level: Create a new FileTable. This looks like a folder to users.
2. Populate the table with documents by drag-and-drop.
3. Index the documents.
a. Create a new Full-Text and Semantic index.
b. Check the status of indexing and wait for indexing to finish.
4. Demonstrate the success of the indexing.
a. Query the Full-Text index.
b. Query the Semantic Search index.

Before I started the timer...


I installed this software on the computer:
Installed SQL Server 2012.
Installed, attached, and registered the Semantic Language Statistics database (an additional prerequisite for Semantic Search).
Installed the Filter Pack for indexing Office 2010 documents and enabled the new filters in SQL Server.

I prepared these queries in SQL Server Management Studio:


A query to show that the documents are stored in the database.
A query to check the status of document indexing.
A sample Full-Text Search query.
A sample Semantic Search query.

I opened these windows on the desktop:

SQL Server Configuration Manager.


SQL Server Management Studio, with an open tab for each of the queries written in advance.
Windows Explorer, with the source folder of the documents to be copied into the database.

Back to the top

Step 1. Configure document storage in SQL Server


Step 1.1. Configure document storage at the instance level

In SQL Server Configuration Manager, enable FILESTREAM on the instance and specify a folder name for the instance-level share. By default, the share has the same
name as the instance.
In SSMS, check the corresponding instance-level server properties.

Let's open the instance-level share in Windows Explorer to check it.

Back to the top

Step 1.2. Configure document storage at the database level


Create a new database for this demo. We'll name it EndToEndFileSearch.

Enable non-transactional access to support FileTables in this database, and specify a folder name for the database-level share. By default, the share has the same name
as the database.

Add a FILESTREAM filegroup to the new database. We'll call it EndToEndFileSearch.

Add a file to the new FILESTREAM filegroup. We'll call it EndToEndFileSearch_FS.


Let's open the database-level share in Windows Explorer to check it.

Back to the top

Step 1.3. Configure document storage at the table level

Create a new FileTable. By default, the folder name for the table-level share is the same as the name of the table itself. A FileTable has a fixed schema, so you don't
have to specify a list of columns.
This action scripts a CREATE TABLE statement template to a new query window.

Fill in parameter values for the template, and run the script to create the new FileTable.

Here's the newly-created FileTable in SSMS Object Explorer.


Let's open the table-level share in Windows Explorer to check it.

Back to the top

Step 2. Populate the table with documents


In Windows Explorer, let's drag and drop 137 files and folders totaling 105 MB into the new FileTable. (You cannot convert an existing file system folder to a FileTable.)
This corpus of documents includes all white papers published to the MSDN Library by the SQL Server team.
Let's query the FileTable to confirm that the documents are now stored in SQL Server.

Back to the top

Step 3. Index the documents


Step 3.1. Create a new Full-Text and Semantic index

Launch the Full-Text Indexing Wizard for the new FileTable.


Select the system-defined unique index on the FileTable's rowguidcol as the unique index to be used by the full-text index. This selection corresponds to the KEY
INDEX clause in the Transact-SQL CREATE FULLTEXT INDEX statement.

Select the file_stream column that contains the documents as the column to be indexed. Specify the file_type column as the type column, and check the box under
Statistical Semantics to enable Semantic Search.

Enable automatic change-tracking.


Create a new full-text catalog for the new database and set it as the default catalog. This catalog will be used for the new full-text index. (The catalog is only a logical
container.)

Finish the wizard and finish creating the full-text and semantic indexes.
Back to the top

Step 3.2. Check the status of indexing and wait for indexing to finish
Query two dynamic management views to check the status of full-text and semantic indexing.

Indexing is finished! (When you specify automatic population, the status of the indexing remains at "Starting" while waiting to process more updates.)
Back to the top

Step 4. Demonstrate the success of the indexing


First, let's query the full-text index for a list of white papers that mention "SSIS" and sort them in descending order by rank.

Now, let's query the semantic index for a list of white papers in which "ETL" is a key phrase and sort them in descending order by score.
How long did all this take?
Four minutes and twenty seconds.

Back to the top

What have we achieved in this walkthrough?


1. Configured an instance of SQL Server 2012 from scratch for document storage.
2. Created and configured a new database and a new table to store documents.
3. Copied 137 files and folders totaling 105 MB into the new table.
4. Enabled indexing of the stored documents.
5. Demonstrated the success of our solution by running sample Full-Text and Semantic Search queries over the indexed documents.

 How long did all this take? Less than 5 minutes!

For more information


For more information about these new features in SQL Server 2012, see:

FileTables in SQL Server 2012 Books Online


Semantic Search in SQL Server 2012 Books Online

Also, see this blog post:

http://dougbert.com/blog/post/Creating-a-Full-Text-Index-on-a-FileTable-in-SQL-Server-2012.aspx

You might also like