
 Database design advice needed ... sp also...


Mojo
Starting Member

19 Posts

Posted - 2001-12-28 : 15:48:06
I work for a small (I am the IT department) company with great contacts. We have the opportunity to sell our web applications to large customers (New York Times, USA Today, Chicago Tribune). I built everything, with help from this site and a few others. Here is my situation and then two simple questions:

  • SQL Server 2000 DB is HORIZONTALLY partitioned

  • We realistically expect 400K users

  • Each user has their own table-- 400K tables



How lame was I to horizontally partition the DB?

I did it because we expect a large number of users and each user will be updating, deleting and inserting records frequently. I thought a single table with 40 million to 160 million records would not handle the number of users.

Also, I have been converting my database calls to stored procedures. Each user has their own table so @TableName does not work. I use dynamic SQL to pass in the table name. I read that the speed advantages of stored procedures are lost when passing in object names like tables.

If this is true, should I even bother with the stored procedures?
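For concreteness, the dynamic-SQL pattern described above presumably looks something like this (the procedure and table names here are invented for illustration; the thread doesn't show Joe's actual code):

```sql
-- Because the table name varies per user, it has to be spliced into a
-- string. SQL Server compiles a fresh plan for each EXEC, so the usual
-- cached-plan benefit of a stored procedure is largely lost.
CREATE PROCEDURE GetUserArticles
    @TableName sysname
AS
BEGIN
    DECLARE @sql nvarchar(4000)
    -- QUOTENAME guards against a hostile value being passed as a name
    SET @sql = N'SELECT * FROM ' + QUOTENAME(@TableName)
    EXEC sp_executesql @sql
END
```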

I may have a terrible design in the first place... Tell me straight, I can take it.

Thanks,

Joe

Nazim
A custom title

1408 Posts

Posted - 2001-12-29 : 00:22:03
This link should be good help in deciding about your Stored Procedure.

http://www.sqlteam.com/Forums/topic.asp?TOPIC_ID=10964


About your design: will you create a new table every time a new user comes up? That makes me feel a little uncomfortable. Otherwise it sounds OK.

Maybe someone here can give better advice.

-------------------------
Graz's Baby is my Master:)

Edited by - Nazim on 12/29/2001 00:23:10

Mojo
Starting Member

19 Posts

Posted - 2001-12-29 : 00:54:04
Nazim,

Thanks for your reply. It appears that the advice is to use stored procedures regardless. That is fine, I wanted to learn them anyhow. I think all of my sprocs will be dynamic, so I may not see a big performance gain.

I am building a new table for each registered user. I think I read somewhere that SQL Server 2000 can have up to a million tables!

Each user can have between 400 and 600 records. Adding and deleting whenever they want.

Now, if I could easily make it so there were no duplicate records in the database I would be following the recommended path! I am actually working on that...

Joe


Nazim
A custom title

1408 Posts

Posted - 2001-12-29 : 01:52:03
Mojo, can you try a single table too and have a look at the performance?

I would like to know your naming convention for each user's table.

SQL Server can hold billions of records in a table, so that shouldn't worry you. I feel the performance gain you are trying to get by using a separate table for each user is killed by using dynamic SQL. You could be better off with a single table for all users, minus the dynamic SQL. Try both and compare the results. Remember, if you use dynamic SQL, SQL Server doesn't hold a cached plan, which increases the time for every call to that stored procedure.
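By contrast, the single-table version would be plain static SQL, whose plan is compiled once and reused on every call (a sketch with invented table and column names):

```sql
CREATE PROCEDURE GetArticlesForUser
    @UserID int
AS
    SELECT ArticleID, Title, DateSaved
    FROM Articles              -- one shared table for all users
    WHERE UserID = @UserID     -- served by an index on UserID
```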

Thought this could help, though it's not directly related to your problem:
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=6118

HTH




-------------------------
Graz's Baby is my Master:)

Edited by - Nazim on 12/29/2001 01:53:11

AjarnMark
SQL Slashing Gunting Master

3246 Posts

Posted - 2001-12-30 : 02:43:28
Hmmm... One table per user? Does that mean that your database application basically consists of only one table? So there are no other tables/relationships/etc.? I find that hard to believe. Does each user have one table for SOME information while the rest of the data is visible to everyone? Or are they mixing data in some tables anyway?

Regardless, it seems like a lot of overhead to worry about one table for each user, and to create a new table for each new user... where does it end? I think you would probably be happier with all users in the same table, with a UserID field included. Put an index on the UserID field and you should get good performance. If need be, you could split this large table onto separate servers so that half your workload and users are on one and half on the other.
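A minimal sketch of that suggestion (the table and column names are illustrative, not from the thread):

```sql
-- One shared table; the UserID column identifies each user's rows.
-- A clustered index on UserID keeps each user's rows physically
-- together, so single-user queries touch only a few pages even
-- when the table as a whole is huge.
CREATE CLUSTERED INDEX IX_Articles_UserID
    ON Articles (UserID)
```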

Again, it seems odd to me that you really only have one table...

--------------------------------------------------------------
1000 Posts, Here I come! I wonder what my new title will be...

Mojo
Starting Member

19 Posts

Posted - 2001-12-30 : 22:21:36
Thanks Nazim and AjarnMark for your advice.

Each user is given their own table after they register. It is no problem, as the table is built on the fly using SQL. The tables are given a unique number as a name. There is a user info table that holds the personal info for each user. I have no relationships set, but each user has an ID that is the same as their table number, so I can easily track who is who.

My main concerns:

If I use one table to hold all the records, I am afraid of either performance issues or strange *goof-ups*.

One table could easily have over 100 million records. The last field can be any size, but is usually between 20 KB and 250 KB.

If there are thousands of users updating, deleting and inserting records that range from 20 KB to 250 KB in size, in a table that contains over 100 million records, will everything be OK???

When a user does an action, there are 7 fields that are either inserted, updated or deleted, with the last field being the large one.

Basically, I am in over my head a bit. When more funding comes we will be able to bring in some DB studs, but the newspapers will want the site to go live almost immediately after paying out the $$$.

Again, we are small, but we brought on a guy with GREAT contacts so we are able to land some giant clients without having a talented database person(s). I am an application designer.

Thanks again,

Joe


Nazim
A custom title

1408 Posts

Posted - 2001-12-30 : 23:50:59
How many records will each user have, and what kind of information will he/she need to insert/update/delete? I seriously feel there is a design flaw in your database. I would suggest you give a good description of your needs so that we can help with your overall design, instead of comparing the ONLY two options (single table vs. multiple tables), which might not be the best solutions.
You should give serious thought to Ajarn's suggestion too.
quote:

Each user is given their own table after they register. It is no problem, as the table is built on the fly using SQL. The tables are given a unique number as a name. There is a user info table that holds the personal info for each user. I have no relationships set, but each user has an ID that is the same as their table number, so I can easily track who is who.



If you have proper indexes and your system is properly tuned, I don't think it will have any problem.
quote:

If there are thousands of users updating, deleting and inserting records that range from 20 KB to 250 KB in size, in a table that contains over 100 million records, will everything be OK???



Did you miss something here? What do you mean by deleting a field? I'm not sure how much difference the size of a field will make on DML operations.
quote:

When a user does an action, there are 7 fields that are either inserted, updated or deleted, with the last field being the large one.



Mojo, I repeat: if you are gaining any performance at all with this design, it is killed by the extent of the dynamic SQL you are using, and with this design you have no choice but to use it.

HTH

-------------------------
Graz's Baby is my Master:)

Edited by - Nazim on 12/31/2001 00:09:20

graz
Chief SQLTeam Crack Dealer

4149 Posts

Posted - 2001-12-31 : 00:09:38
Mojo,

Interesting thread. I've got a few thoughts to pass along.

  • When you say 400,000 users I'm assuming you don't mean concurrently. That number is only for sizing.

  • I'd really advise against the one-table-per-user design. That will force you into dynamic SQL or client-prepared SQL. One table keyed by user is really the way to go (except as noted below).

  • A table with 100 million rows is a VERY, VERY LARGE table in any database environment. Just loading that much data is very time consuming. That process alone may take days or weeks.

  • If the size of a record is only 20 KB (I'm assuming you meant bytes, not bits) then your data alone will be 1.8 terabytes (TB). That's before any overhead or log file size. If the transaction volume is what you describe, then the log file could grow to a significant percentage of that. If record sizes are closer to 250 KB, then your data storage will grow accordingly. Backing up this data will require some interesting hardware. Heck, just storing it will require a SERIOUS outlay of cash!

  • I'm guessing that the "large" field will either be a text field or some type of binary field. Those come with their own set of problems, so plan and test carefully.

  • If the numbers you provided are accurate, I would give serious thought to a horizontal partition. Use distributed partitioned views to split your table onto a number of servers. That will make hardware sizing and backup realistic.

  • Bring in some database experience. If this starts small and grows, you can get by without a strong architecture. If it grows quickly or starts large, you can quickly overwhelm even the best hardware.
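The 1.8 TB figure above can be sanity-checked with a quick query (taking 1 TB = 2^40 bytes):

```sql
-- 100 million rows x 20 KB per row:
-- 100,000,000 * 20 * 1024 = 2,048,000,000,000 bytes, about 1.86 TB
SELECT CAST(100000000 AS bigint) * 20 * 1024                   AS DataBytes,
       CAST(100000000 AS bigint) * 20 * 1024 / 1099511627776.0 AS DataTB
```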


That's my collection of thoughts. Good luck!

===============================================
Creating tomorrow's legacy systems today.
One crisis at a time.

Nazim
A custom title

1408 Posts

Posted - 2001-12-31 : 14:39:24
Wow, Graz! You worked out some pretty amazing statistics.

Mojo, I would like to ask: do these 400K users keep coming? I mean, how and when will you decide to delete a particular user's data? How frequently are new users added, or old users deleted?

Graz has given a pretty impressive list of suggestions. I would like to ask Graz: won't it be difficult maintaining a database of 400,000 tables? How about having multiple databases, split by location or some other criterion, so that the load is divided across multiple databases on multiple servers?

HTH



Mojo
Starting Member

19 Posts

Posted - 2001-12-31 : 17:02:49
I came up with 400K users because that is our best guesstimate. We have an application that the newspaper industry is interested in using. Several of the interested companies collectively own a great part of the American newspapers, with a combined readership of 30 to 40 million. The industry average is that 20% of paper readers also read the online version regularly. That would give us 6 to 8 million potential users, and 5% of those online readers would give us 300K to 400K users.

I think we may have more than 5% of the online readers use the service because the newspapers are going to give it away. The newspapers will pay us directly (B2B).

I guess things could get ugly quickly depending on how quickly things roll out and how many users sign up for the "free" service. It is somewhat out of our control.

Nazim:
I don't know how often we would delete old users. That would be up to the papers; they will be paying the bill. We are limiting the amount of space the users get in the database. It works out to between 300 and 600 articles.
You also asked how I would delete the data. I have a date field that is updated each time a user saves an article. We can use that as a starting point to identify whether an account should be deleted.
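A sketch of how that date field could drive the cleanup (the table and column names are made up for illustration; the thread doesn't give the real ones):

```sql
-- Accounts whose most recent save is older than a cutoff the papers
-- choose (one year here) become candidates for deletion:
SELECT UserID
FROM UserInfo
WHERE LastSavedDate < DATEADD(month, -12, GETDATE())
```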

graz:
Thanks for your input. The record sizes are closer to 20 kilobytes. Some, though not many, are over 100 KB, and even fewer are over 200 KB. It is entirely dependent upon what the user does.

I have thought about more than one database on separate servers as a possible solution, but I wanted to make sure I was on the right track in using horizontal partitioning (HP). There is not much out there on HP. My wife gave me the idea, and after finding a couple of articles here on SQLTeam I figured it might be a good fit for what we are doing.

Thanks,

Joe

Edit: I forgot to mention that I was planning on 4 terabytes.


Edited by - mojo on 12/31/2001 17:06:04

Nazim
A custom title

1408 Posts

Posted - 2002-01-02 : 07:52:02
Mojo, Graz has provided some very helpful suggestions. You should follow them.

I would like to see your table schema; I still have some reservations about it.




----------------------------
Anything that Doesn't Kill you Makes you Stronger

JustinBigelow
SQL Gigolo

1157 Posts

Posted - 2002-01-02 : 16:13:56
quote:

graz:
Thanks for your input. The record sizes are closer to 20 kilobytes. Some, though not many, are over 100 KB, and even fewer are over 200 KB. It is entirely dependent upon what the user does.



??? Joe, pardon me if I'm nitpicking, but SQL Server can't handle rows over 8 KB. If your average record is going to be 20 KB then you would need three tables, each with a one-to-one join to the other two to maintain integrity. If you had a record consisting of 200 KB of data you would need 25 tables for a single row! Are you sure you don't mean that each table is going to be around 20 KB - 200 KB? If a single row is going to take this kind of space, you probably need to normalize your data or consider some other design changes.

--addendum--
I didn't take text fields into consideration, but a text value only stores a pointer in the row anyway.

m2c,
Justin



Edited by - justinbigelow on 01/02/2002 16:17:21

Mojo
Starting Member

19 Posts

Posted - 2002-01-02 : 17:36:04
Justin,

I am using text fields and adding the data like:

objRS2.Fields(9).AppendChunk strContents

I agree with you that my design may be screwed, but that is why I posted. From what Graz posted, I will keep my original design. I am also moving everything to stored procedures; even though I will not see a performance gain, the general consensus seems to be that sprocs are the way to go... regardless.

I am always open to suggestions.

Thanks,

Joe


izaltsman
A custom title

1139 Posts

Posted - 2002-01-02 : 18:34:16
Mojo,

What exactly are you going to store in this text field? It sounds like you might be storing newspaper articles there, which means multiple users could potentially be interested in the same content. To accommodate those users, your design would require storing multiple copies of the same article, one for each interested user. If that is true, then you can make your data structure a lot more efficient by normalizing it. Feel free to skip the rest of my post if I am wrong about the purpose of the text field, though.

To normalize your data you could:
1. Create an Articles table, containing unique article_id, subject/excerpt, and body of each article (better yet, store the body on the filesystem, and keep only the path to it in the table).
2. Create a Users table, containing all sorts of user details and keyed on a unique user_id.
3. Create a UserInterests table containing user_id (referencing user_id in Users table), and article_id (referencing article_id in Articles table).


This way, whenever a user expresses an interest in some article, you just add a record to the UserInterests table, saving lots of space and improving performance. Normalizing your data this way will not prevent you from keeping it partitioned (e.g. you could slice the UserInterests table based on user_id or article_id ranges).
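The three tables above might look like this in DDL (the key columns come straight from the steps; the other columns are placeholders):

```sql
CREATE TABLE Users (
    user_id    int IDENTITY PRIMARY KEY,
    user_name  varchar(100) NOT NULL     -- ... plus other user details
)

CREATE TABLE Articles (
    article_id int IDENTITY PRIMARY KEY,
    subject    varchar(255) NOT NULL,    -- subject/excerpt
    body       text NULL                 -- or a filesystem path instead
)

CREATE TABLE UserInterests (
    user_id    int NOT NULL REFERENCES Users (user_id),
    article_id int NOT NULL REFERENCES Articles (article_id),
    PRIMARY KEY (user_id, article_id)    -- one row per user/article pair
)
```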



Edited by - izaltsman on 01/02/2002 18:35:37

Mojo
Starting Member

19 Posts

Posted - 2002-01-02 : 19:04:23
izaltsman -

You have some good ideas, although I am still concerned about having one LARGE table. Also, if a user deletes an article, how do I know if it is safe to do so in the Articles table? Even still, after two years, how big would that table be? 200 million records? That seems large.

You have given me an idea... I will give a short description and then pose a question.
  • Large paper (not the largest) with 200,000 online users
  • 40,000 will be users in my database
  • Each user averages 200 records (can have many times that)
  • One table with 4 million records, large text field

I have read on this site before that a site should expect 10% of its users at peak time. More if the users are all in the US.

If I have 6,000 users hitting the table (with 4 million records), will performance suffer? Is this something that I could just buy my way out of (hardware)?

I am thinking that instead of 400,000 tables (one for each user)... maybe I could have 30 large tables.

I am just concerned about performance when the tables are large (millions of records).

Thanks,

Joe


Mojo
Starting Member

19 Posts

Posted - 2002-01-02 : 19:19:17
The more I hear from the good people here at SQLTeam, the more I would like to use fewer (much, much fewer) tables and get the full benefit of using sprocs. I could also use an index to great advantage.

My last question: Is there a table size that I should stay within? At what point will the table become unwieldy?

Thanks for everything,

Joe


izaltsman
A custom title

1139 Posts

Posted - 2002-01-02 : 19:39:38
quote:

Although, I am still concerned about having one LARGE table.



You mean the Articles table, right? I share your concern... However, in your physical implementation, it does not have to be (and probably should not be) just one table. Graz already mentioned Distributed Partitioned Views in this thread. You should definitely consider them! You could partition your articles based on subject, title, author's name or whatever, and put each portion on a "federated" server. Then set up a distributed partitioned view called Articles and base all your queries on that view. This way the workload is spread across the federated servers (which gives you both performance gains and limited fault tolerance).
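A minimal sketch of such a view on SQL Server 2000, assuming two member servers partitioned by article_id ranges (server, database, and column names here are invented):

```sql
-- On each member server, the local table carries a CHECK constraint
-- on the partitioning column, which must be part of the primary key:
CREATE TABLE Articles_1 (
    article_id int NOT NULL
        CHECK (article_id BETWEEN 1 AND 1000000),
    subject    varchar(255) NOT NULL,
    body       text NULL,
    PRIMARY KEY (article_id)
)

-- The distributed partitioned view unions the member tables across
-- linked servers; all queries then go through the view:
CREATE VIEW Articles AS
    SELECT * FROM Server1.NewsDB.dbo.Articles_1
    UNION ALL
    SELECT * FROM Server2.NewsDB.dbo.Articles_2
```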

quote:

Also, if a user deletes an article, how do I know if it is safe to do so in the Articles Table?



You just delete the record from the UserInterests table... and if that means that no one is interested in the article any longer, so be it... You could run a batch process once a week (or once a day, or once a month...) to pick up any "orphaned" articles and delete them.
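Assuming the Articles/UserInterests tables sketched earlier in the thread, that batch job could be as simple as:

```sql
-- Delete articles that no user references any longer:
DELETE FROM Articles
WHERE NOT EXISTS (
    SELECT 1
    FROM UserInterests ui
    WHERE ui.article_id = Articles.article_id
)
```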

quote:

Even still, after two years, how big would that table be? 200 million records? That seems to be large.



That's the beauty of the Federated Servers model. You can scale out as much as you want... If you feel that a portion of data handled by one server is getting too large, you introduce another server into the federation (of course you will need to tweak your partitioning criteria a bit).

quote:

I am thinking that instead of 400,000 tables (one for each user)... maybe I could have 30 large tables.



Hate to sound like a broken record, but once again: partition the data and spread it across several servers (if you can afford 30 servers, great, but in fact I doubt you should need that many). Then set up distributed partitioned views to pull the data from them.



izaltsman
A custom title

1139 Posts

Posted - 2002-01-02 : 19:45:05
quote:

My last question: Is there a table size that I should stay within? At what point will the table become unwieldy?



I doubt there is a magic number (although if anyone came up with one, please do tell). The only thing I can say is that this number will be dependent on the capabilities of your hardware and your indexing strategy.



Edited by - izaltsman on 01/02/2002 19:46:21