
 Compress and Decompress


xiii29
Starting Member

13 Posts

Posted - 2007-03-16 : 08:38:03
hi,

I have a table with a varchar field:

CREATE TABLE MyTable( UID as UNIQUEIDENTIER, DATA AS VARCHAR(4000), FK_UID AS UNIQUEIDENTIFIER)


My application can send a lot of data to fill this table. The data is sent over a network that can be very slow (compared to a LAN).

So in some cases sending the data is very slow.

I was wondering if there is a way to compress/decompress data in SQL?

The goal would be to send the data compressed and then decompress it before recording it.

Thanks for any help!

Kristen
Test

22859 Posts

Posted - 2007-03-16 : 10:14:01
A browser/web interface could use its built-in gzip compression. Dunno if that helps, though?

I'm not sure if that applies to a browser POSTing to a website, or only to a browser receiving HTTP responses ...

Kristen

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2007-03-16 : 10:39:59
I think that if you tell us more specifics about your environment, someone here will be able to come up with some ideas. For example, where does the data initially come from? A web page, windows application, DTS job, etc ? If a web server is used, where does the database server live in relation to the web server? And so on ... the more specific you are, the better we can help.

- Jeff
http://weblogs.sqlteam.com/JeffS

xiii29
Starting Member

13 Posts

Posted - 2007-03-16 : 11:51:24
The data comes from a Windows application and is then used by a specific stored proc.
So compressing is not the real problem, as I can do it in my Windows app, but decompressing is the problem ...

Kristen
Test

22859 Posts

Posted - 2007-03-16 : 12:40:08
Could the application upload a ZIP file (containing, say, CSV file(s)), and some batch job look out for new ZIP files arriving and "process" them?

Kristen

xiii29
Starting Member

13 Posts

Posted - 2007-03-16 : 12:53:34
Unfortunately not ...

Kristen
Test

22859 Posts

Posted - 2007-03-16 : 13:11:06
... Oh well! It was worth a punt!

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2007-03-16 : 13:36:33
Well, if you ever feel like giving some specifics along with workflow examples explaining how the large amount of data is created and then transferred to the SQL Server, we may be able to help.

- Jeff
http://weblogs.sqlteam.com/JeffS

X002548
Not Just a Number

15586 Posts

Posted - 2007-03-16 : 14:55:21
can you even define a table like that?



Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx

Add yourself!
http://www.frappr.com/sqlteam




xiii29
Starting Member

13 Posts

Posted - 2007-03-19 : 05:56:13
-1- The table definition shown is only a part of the real table ...

-2- The user works disconnected from the network for periods of 1 hour to 2 days ... so the app stores user data (info, notes ...) in a local database, and when the user comes back the data is synchronised with the main server.

The connection can be made at home or at work, so the connection speed changes a lot ...


jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2007-03-19 : 08:18:31
Ok, now we are getting somewhere ... thanks for the info. It makes it much easier to help you.

Two more questions (so far): (edit ..make that 3)

1) What kind of database are the apps using locally? (i.e., sql express, Access, etc...)

2) Explain how the "sync" process works, using one or two of the typical tables that are synched as an example.

3) What programming language is the client app written in?

Any more details, above and beyond these questions, that you can provide will help speed this process along and help us to give you the best advice possible for your situation.
- Jeff
http://weblogs.sqlteam.com/JeffS

xiii29
Starting Member

13 Posts

Posted - 2007-03-21 : 07:01:25
Sorry for the delay; still alive, but lots of work away from the web!

1) MSDE,
2) The sync takes info from many tables (100-150) and generates text info that is inserted into just one table on the main server,
3) VB.Net

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2007-03-21 : 08:07:33
If you could explain #2 in more detail, that would be helpful. Just explain it as if there were only 1 simple table to sync, and write down the specific steps taken.

for example:

The MSDE data contains TableX and marks all new rows with a column called "Updated".
The VB.NET app takes all rows from TableX marked Updated and exports them to a textfile.
The text file is sent to the server via FTP.
The VB.NET app then calls a stored proc on the server that imports the text file.
The stored proc on the server then ... etc ....

That's just a silly simplified example, but that's the kind of detail that we need to help you out. Hopefully you get the idea.
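The export step in that example could be sketched roughly like this (Python for illustration only; the row layout, column names, and function name are invented):

```python
import csv
import io

def export_updated_rows(rows):
    """Write every row flagged as "Updated" to a CSV text blob that the
    client would then send to the server (e.g. over FTP)."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for row in rows:
        if row.get("Updated"):
            writer.writerow([row["id"], row["value"]])
    return buf.getvalue()
```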

- Jeff
http://weblogs.sqlteam.com/JeffS

xiii29
Starting Member

13 Posts

Posted - 2007-03-21 : 12:48:41
-1- The VB.Net app reads the info tables and detects changes (updates, inserts and deletes) in MSDE
-2- The VB.Net app generates a report of the different changes
--> Reports are simple text that describe the changes
-3- The VB.Net app calls a stored proc on the distant server to record the reports ...
--> The SP simply does an INSERT into the report tables ...

-4- As an aside: another VB app will read the reports later ...

Hope you have all you need ... It is a very simple app ... so it is difficult to be more explicit ...


jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2007-03-21 : 13:28:45
In step 3, is the VB.NET App just taking the entire "text report" of all changes and passing that info as a Text parameter to the stored procedure it is calling on the server?

The key point I am trying to get at, which we still don't know and you still haven't described in detail, is: at what point is the actual large amount of data transmitted across the network to the SQL Server, and is it one big transmission or many calls to many stored procs, over and over? Again, I will suggest that specific, simplified examples will be helpful. Maybe once again I should give you an example:


Step 1: The client gathers all changes made locally to TableA into one large text string (let's call this string "TableAChanges").
Step 2: The client app connects to the SQL Server.
Step 3: The client app creates a command based on a stored procedure on the server. The stored procedure looks like this:

create proc UpdateTableA @Data text
as
.... uses text in @Data to update table


Step 4: The client app assigns the TableAChanges string to the @Data parameter and executes the stored procedure. The bottleneck occurs in this step, since it takes a long time to transmit the @Data parameter to the SQL Server.

Step 5: SQL Server then updates the data via the stored proc.
Step 6: This is repeated for tables B, C, D, E and F as well. Each table has a corresponding stored procedure and uses the same method to pass the changes to the SQL Server.

Step 7: Once all tables are updated, the "synch" is complete.


Again, these are just my guesses. I hope you can see that if I provide you with this info, it is very easy for you to see exactly what I am doing and perhaps where you could help me if I was in need of assistance.

- Jeff
http://weblogs.sqlteam.com/JeffS

xiii29
Starting Member

13 Posts

Posted - 2007-03-22 : 12:11:22
Sorry, and thanks for your patience.

So you're right, I call the SP many times ...

Here is what happens in the problematic step:

-1- VB.Net calls an SP to record an info row that is used to identify the report,
-2- The report is split into parts of 4 KB (the size of a varchar),
-3- For each "packet", VB.Net calls an SP in order to record it on the distant server
--> the packet is attached to the first info row (simple FK),
--> In VB.Net, I'm using command parameters (not command text)

This step is the problematic part, as in some cases I have to transmit 4000 --> 6000 packets ...

There is only ONE SP, as there is only one table in which I'm recording data. It is later that another process will use this data. But there is no performance trouble in that case.

The packets are simple text data. That's why I want to compress them, as it is the transmission that takes time.

For me the way the packets are created is not important, as there is nothing I can change about it. I just create text about the data that changes: "Value [A] was 10 and now is 12.5 --> +2.5" etc ...
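The splitting in step -2- amounts to something like this (a Python sketch of the same logic; the real app is VB.Net, and the function name is made up):

```python
def split_report(report, packet_size=4000):
    """Split one change report into VARCHAR(4000)-sized packets, each of
    which becomes one stored-proc call in the sync described above."""
    return [report[i:i + packet_size]
            for i in range(0, len(report), packet_size)]
```

With 4000-6000 packets per sync, each one a separate round trip over a slow link, the per-call latency alone dominates, which is why one compressed transfer looks attractive.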

jsmith8858
Dr. Cross Join

7423 Posts

Posted - 2007-03-22 : 14:22:08
Ok, we are finally getting somewhere.

All you need to do is put all of this data about the changes into one large string variable and save it as a file. You could then zip this file up using a compression library and FTP it (or send it some other way) across the network to a folder that the DB server has access to, perhaps giving it a unique name. Then you call a stored proc on the DB server, passing in the name of the ZIP'ed file to process, and the stored procedure could then UNZIP and import the file and then process the changes.

This is fairly complicated, but definitely doable. The key is, you are only sending one large transmission across the network and it is zipped up, so that should help a little. That's really the only way I can think of that you could do something like what you are after.

- Jeff
http://weblogs.sqlteam.com/JeffS

xiii29
Starting Member

13 Posts

Posted - 2007-03-23 : 04:03:31
I've already thought about this solution.

One problem is that the customer uses priority settings in order to optimize bandwidth for SQL ... not for file transfer ...

I'm going to work on it and see what it gives!

Thanks for your patience!

Kristen
Test

22859 Posts

Posted - 2007-03-23 : 07:34:54
"priority in order to optimize bandwith for SQL"

"Upload" the ZIP file as a SQL INSERT into an IMAGE column, and then "process" it from their in your Stored Procedure? (Extract the IMAGE column to a file, unzip, import the unzipped data)

Kristen