Hardware vs caching + Caching problem

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2005 Forums

Transact-SQL (2005)

Hardware vs caching + Caching problem

Author

Topic

dbwilson4
Yak Posting Veteran

50 Posts

Posted - 2007-07-01 : 15:33:47

I have a bunch of application layer objects that tie to the database cached at the application layer. Some of these cached objects are updated and the cache must be immediately purged so that the LATEST items will be displayed.

For an example, say you have a Car object with Name, ManufactorId, and SortOrder. If someone updates the name or manufactorid, the cache will stay intact and will not refresh itself. BUT if someone updates the sortorder, the cache needs to UPDATE itself immediately. It only has to happen when the sortorder is changed.

Is there any solution to this?

Also, does it make sense to cache at the application layer at ALL? The application objects are rendered really fast (the application server is a beast). If the database server has lots of RAM, does it even make sense to cache at application layer?

nr
SQLTeam MVY

12543 Posts

Posted - 2007-07-01 : 16:18:47

Depends on the situation. At the detail level you are talking about it probably doesn't make sense.
You could have a flag for each entity that you want to cache, that gets updated in the database when the sort order changes the next application read updates the cache (every access needs to read the flag but that could be pushed to the application layer - probably wouldn't want ot do that for the whole entity) - you will need a mechanism so that other application threads accessing the entity wait for the cached item refresh to complete.

I've done this sort of thing for caching fairly static pages but that is building the html so that it can be sent to the client tather than logical entities.
There comes a point where the effort to control the refresh outweighs the benefit.

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.

Kristen
Test

22859 Posts

Posted - 2007-07-01 : 16:21:58

We do this sort of thing as follows:

Application checks Cache for data. If not found Application calls Sproc and then caches the result.

Application calls Sprocs through a single function call. Also "Next Recordset" is via a centralised function call.

Both these "sniff" the resultset looking for a "special" first column column-name.

if the "special" first column name is seen it is acted on, and then "next record set" function is called.

This allows any Sproc to return a "special resultset". One of these is a "Delete from cache" instruction. So basically any Sproc that updates something which might be in cache can return a Special Resultset which causes the application to clear the cache for the appropriate objects - and the next person to request those objects finds that they are not in cache, and thus calls the DB to get the data, and caches the result.

Kristen

dbwilson4
Yak Posting Veteran

50 Posts

Posted - 2007-07-01 : 17:02:11

Aha! That is "godly" advice kristen.

I get the idea but I'm having trouble on how to turn it into pratical code.

How would I add your technique to this code:

public class BookFactory
{
public BookCollection GetForUser(User user)
{
BookCollection books = (BookCollection) Cache.Get(Cache.GenerateKey("BookFactory","GetForUser", user.Id));
if( books == null )
{
string sql = "SELECT * FROM books WHERE userid = @userid ORDER BY sortorder ASC;";
books = Load(SqlCommand(sql).ExecuteReader());
}
return books;
}

public void SetSortOrder(Book book)
{
string sql = "UPDATE book SET sortorder = sortorder+1 WHERE book = bookid;";
SqlCommand.ExecuteNonQuery(); //This should invalidate the cache above because the sort order has changed for this book.
}
}

dbwilson4
Yak Posting Veteran

50 Posts

Posted - 2007-07-01 : 19:11:44

I have determined that this would be very hard to do and would require something very complex. Do you guys think I should just not cache the data? The application will be used intensively and I'm expecting about 10-20 million rows in 5-8 tables with a DB size of 6-10gigs. This will be run dedicated DB server with 8gigs of ram and dual raid 0 15krpm SCSI drives.

Kristen
Test

22859 Posts

Posted - 2007-07-02 : 05:09:05

Don't you just need to do something like:

BookCollection books = (BookCollection) Cache.Get(Cache.GenerateKey("BookFactory","GetForUser", user.Id));
if( books == null )
{
string sql = "SELECT * FROM books WHERE userid = @userid ORDER BY sortorder ASC;";
books = Load(SqlCommand(sql).ExecuteReader());
Now store the data from "books" into the Cache for the next punter
}

then you just need a way for any update to books to "signal" back that the cache needs to be cleared (or to freshen up the changed row(s) in the cache)

Kristen

pootle_flump

1064 Posts

Posted - 2007-07-02 : 06:05:59

Not tried this (too much work for little reward for the little intranet sites I have developed) but...

You can bind the cache to an xml file. If the file changes the cache is updated. If not it isn't.

You could create a trigger on your relevent tables that writes to another table indicating any changes to the data. A regular job could launch a sproc that writes the tables whose data has changed to the xml file. I think one of the new 2005 features is reasonably well suited to this (Notification Services?). The effect is an asynchronous but regular updating of the application cache only when the data changes with virtually no overhead when nothing does change.

Anyway - just a musing I have had - I've not persued it so it might be completely full of holes.

Kristen
Test

22859 Posts

Posted - 2007-07-02 : 06:18:53

My feeling is that the "regular job" will be running behind the actual use, so there may be some "use" of the stale data before it gets freshened (which may be acceptable of course).

Also, the cache may be rebuilt and not used (e.g. changes in quick succession which happen not to be viewed).

So I've always taken the approach of looking in the cache, and then requesting the data if not found (and caching it for the next person). And then anything that changes data which is cached needs to be capable of deleting the relevant cache entries ... which will then be re-created Just-In-Time before next use.

One slight downside of this is that if several people look for data which is not cached they may all simultaneously request the data from the DB and re-create the same cache entry. But I have decided its a small enough hit that I can live with it, but some sort of sempaphore system may be called for in busier systems.

Kristen

pootle_flump

1064 Posts

Posted - 2007-07-02 : 07:21:27

quote:
Originally posted by Kristen

My feeling is that the "regular job" will be running behind the actual use, so there may be some "use" of the stale data before it gets freshened (which may be acceptable of course).

Yep. The more regularly the job is run of course the less the hit.

quote:
Originally posted by Kristen

Also, the cache may be rebuilt and not used (e.g. changes in quick succession which happen not to be viewed).

Don't get that one...

One advantage I see to it is that the application and database are still fairly loosely coupled. For example, when the sprocs are bypassed for data updates (e.g. an admin applies a few DML statements) a trigger will still fire.

Kristen
Test

22859 Posts

Posted - 2007-07-02 : 07:42:38

"Don't get that one"

You edit a record
The scheduled job rebuilds the cache
You edit the record again
The scheduled job rebuilds the cache again
... repeat ...
Finally someone reaches a page that actually reads that record from the cache.

So you have had multiple cache rebuilds for data that was not used.

Good point about loosely coupled. Web farms have similar issues.

Kristen

pootle_flump

1064 Posts

Posted - 2007-07-02 : 08:08:07

quote:
Originally posted by Kristen

"Don't get that one"

You edit a record
The scheduled job rebuilds the cache
You edit the record again
The scheduled job rebuilds the cache again
... repeat ...
Finally someone reaches a page that actually reads that record from the cache.

So you have had multiple cache rebuilds for data that was not used.

Good point too

Kristen
Test

22859 Posts

Posted - 2007-07-02 : 08:18:09

In our case its rather significant, but it might not be so in other scenarios.

I might change something in the Heading section of a web page template which hits pretty much every page on the site. I might then change something in the footer.

If a job was triggered that rebuilt the cache it would be building thousands of pages ... whereas if it is done Just-In-Time as a user reaches an un-cached page the load is spread.

Kristen

dbwilson4
Yak Posting Veteran

50 Posts

Posted - 2007-07-02 : 10:18:43

Good points.

Regarding the notification services or the xml notification services to trigger cache refreshes......some of these caches are specific to each user this means there will be a notification service running for each user.

Like some of these caches are available ONLY to the private user, or administrator and others available to public user etc.

Do you think the multitude of notifications dispatched would cause performance issues? Should I be even caching when the server has amplitude of ram?

Kristen
Test

22859 Posts

Posted - 2007-07-02 : 10:32:14

We cache by user-group, so similar to what you are doing (some of our user-groups will only have one user).

I think if you have that scenario it makes it even more important to build the cache Just-In-Time so that material which is rarely / never viewed is not cached unnecessarily.

You could also eject "stale" material from the cache on a least-recently-read basis. That would allow you to tune the cache to the size of RAM.

We store all material that is put in the cache in the DB too, so after, say, a reboot we initially populate from the cache.

The way we do this is to have the Sproc that delivers the material check the Cache Table in the database first.

So it goes something like this:

Get material from cache
None found? --> Call Sproc to get material

Sproc looks in Cache Table. If found it records a BLOB. (plus some "hint" that it is returning cached material)

If NOT found then Sproc does whatever it is supposed to do to acquire the raw data. It also returns a "hint" that this is the raw data.

The Application either just returns the BLOB, or processes the "raw" data, calls the Sproc to store it in the DB cache table and stores it in the RAM cache table.

I suppose the step for "Store in RAM cache table" could be dispensed with for "limited viewing" data - i.e. infrequently viewed data would be pulled from Database Cache Table, rather than RAM cache.

Anyways, if the caching stuff is "modularised" it can be in RAM now, and push to DB or whatever in the future if the performance/resource usage turns out to be a problem

Kristen

dbwilson4
Yak Posting Veteran

50 Posts

Posted - 2007-07-02 : 11:40:22

That seems like too much work and things could get messy for the backend.

I'm trying SqlCacheDependency right now and this thing is sweeeeettt. I love it. However, if an update occurs to the table im monitoring...does it flush out the cache depending on the rows updated OR does it flush out anything that cached any rows in the table?

Kristen
Test

22859 Posts

Posted - 2007-07-02 : 12:01:24

"That seems like too much work and things could get messy for the backend"

Yeah it is, took us a while to implement, but we did at least have the benefit of having it designed-in, rather than needing a retro-fit!

Kristen

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources