Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 SQL Server Administration (2008)
 Full-Text Catalogs

Author  Topic 

philipstratford
Starting Member

8 Posts

Posted - 2012-11-02 : 10:29:16
Hi All,

Another noob question from me (but we all have to start somewhere, right?).

What's the point of a full-text catalog? Full-text indexes I get, they make sense, but why do they have to be contained within a catalog? And, since they do, why not just put all of your full-text indexes in one catalog, for the sake of simplicity?

Finally, what does it mean to populate a catalog? Surely it's the index that needs populating regularly. What's a catalog populated WITH such that it needs populating? And how often should it be done? I'm looking at some full-text catalogs in our DB which haven't been populated since 2009, but I can't work out if that's a bad thing or not, given that the contained indexes are automatically populating.

Thanks for your help!

robvolk
Most Valuable Yak

15732 Posts

Posted - 2012-11-02 : 10:47:38
From Books Online:
quote:
The information in full-text indexes is used by the Full-Text Engine to compile full-text queries that can quickly search a table for particular words or combinations of words. A full-text index stores information about significant words and their location within one or more columns of a database table. A full-text index is a special type of token-based functional index that is built and maintained by the Full-Text Engine for SQL Server. The process of building a full-text index differs from building other types of indexes. Instead of constructing a B-tree structure based on a value stored in a particular row, the Full-Text Engine builds an inverted, stacked, compressed index structure based on individual tokens from the text being indexed. In SQL Server 2008, the size of a full-text index is limited only by the available memory resources of the computer on which the instance of SQL Server is running.

Beginning in SQL Server 2008, the full-text indexes are integrated with the Database Engine, instead of residing in the file system as in previous versions of SQL Server. For a new database, the full-text catalog is now a virtual object that does not belong to any filegroup; it is merely a logical concept that refers to a group of the full-text indexes. Note, however, that during upgrade of a SQL Server 2000 or SQL Server 2005 database, any full-text catalog that contains data files, a new filegroup is created; for more information, see Full-Text Search Upgrade.
The red portions highlight the chief differences. Can't really answer the population question, other than creating FT indexes on large document sets may be something that has to be deferred or done incrementally due to resource limitations (space, CPU, etc.)

You can change accent sensitivity at the catalog level, that might make sense to have multiple catalogs if you're indexing multiple languages that need different settings.
Go to Top of Page

philipstratford
Starting Member

8 Posts

Posted - 2012-11-02 : 13:08:19
Thank you. I don't think that really explains why the indexes are in catalogs or why the catalogs need to be populated (as you acknowledged), but the bit about migrating from 2005 was very useful because that's exactly what I'm doing. It's driving me nuts because, having completed the restore of the database from 2005 to 2012 I want to remove those pesky extra filegroups and files, but there doesn't seem to be any way to do that other than deleting the indexes and catalogs and recreating them.
Go to Top of Page

philipstratford
Starting Member

8 Posts

Posted - 2012-11-02 : 13:43:17
Ok, here's my guess. And it IS a guess, so if someone could confirm I'm right that'd be really helpful. If you've just arrived here looking for the same information as me, don't assume this is the answer!

What's confusing me is why, if your full-text index is set to auto-tracking, you would also need a schedule to populate the catalog. And, if the full-text index IS set to auto-tracking, does it matter if the catalog hasn't been updated for ages, as is the case with my catalogs?

Well, the key to my understanding this is that a full-text CATALOG can contain multiple full-text INDEXES. So, whereas each index contains the location of words in the table on which it is configured, the catalog contains an amalgamation of all of those indexes, providing a master list, essentially an index of indexes, containing the location of words in any of the tables with indexes contained in the catalog.

For example, if you just search for the word "dispatched" in a full-text index (I don't even know if it's possible to do that), you'd return all of the rows where that word was contained within that table. But if you search for the same word in a full-text catalog, you return occurrences of that word in any of the indexed tables in the catalog.

Therefore, whilst the full-text indexes may be set to auto-tracking, and thus have a current and up-to-date record of which words appear where in the individual tables they relate to, if the catalog isn't being populated that information is useless because the records in the indexes aren't being collated and populating the catalog, which is what is actually searched by queries. So yes, regularly populating the catalog is as important as regularly updating the indexes.

Does that sound about right..?
Go to Top of Page
   

- Advertisement -