Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Programming
 FLAT File indexing Vs RDBMS Indexing

Author  Topic 

abuhassan

105 Posts

Posted - 2006-09-22 : 09:51:41
Hi

I want to know is a flat file faster than a RDBMS for indexing for example a search engine indexing would a flat file be better in terms of performance, scalability etc than a RDBMS?

X002548
Not Just a Number

15586 Posts

Posted - 2006-09-22 : 09:53:25
How would you "Index" a flat file.

Or are we talking Mainframe and VSAM vs. DB2?



Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx

Add yourself!
http://www.frappr.com/sqlteam
Go to Top of Page

abuhassan

105 Posts

Posted - 2006-09-22 : 09:58:14
i had a discussion with a search engine architect he seem to think that the flat file was much better in terms of performance than having say an index in a relational database such as sql server or oracle.
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2006-09-22 : 10:04:09
Is this a troll?



Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx

Add yourself!
http://www.frappr.com/sqlteam
Go to Top of Page

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2006-09-22 : 10:18:07
quote:
Originally posted by abuhassan

i had a discussion with a search engine architect he seem to think that the flat file was much better in terms of performance than having say an index in a relational database such as sql server or oracle.



Why don't you ask the "search engine architect" to explain why it is better? Ask for a link to results using flat files for one of the standard TCP benchmark tests.




CODO ERGO SUM
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-09-22 : 10:18:59
it depends on what your application will be used for.
sometimes a db is an overkill and using flat files is a great option.

reading content from a flat file is a lot faster than reading from a DB or XML file.

you might want to tell us is this an academic discussion or a real world problem?

also in what way would you index your files?
would you index just the file names or their content? how would you go about that?
is the data in your files in chunks, evenly distributed etc...

The advantage of DB's is that you don't have to worry about all that

but in essence your friend has a point.
reading from a file is faster than reading from a db.




Go with the flow & have fun! Else fight the flow
blog thingie: http://weblogs.sqlteam.com/mladenp
Go to Top of Page

abuhassan

105 Posts

Posted - 2006-09-22 : 11:12:08
i Asked the architect he said he doent want to give his design away but said look why do the major search indexer use flat file not RDBMS?
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-09-22 : 11:14:11
huh? major search indexer? like who?





Go with the flow & have fun! Else fight the flow
blog thingie: http://weblogs.sqlteam.com/mladenp
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2006-09-22 : 11:31:16
Using a fla file is akin to a table scan...actually worse...it's like using a cursor

So actually if you need to reference the flat file more than once, you'd have to cursor through the whole file every time.

Does that sound effecient?



Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx

Add yourself!
http://www.frappr.com/sqlteam
Go to Top of Page

abuhassan

105 Posts

Posted - 2006-09-22 : 11:44:36
how does google index does it use flatfile or database?
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-09-22 : 11:50:20
this is how google stores the data:
http://labs.google.com/papers/bigtable.html

more here in the comments:
http://googlesystem.blogspot.com/2006/09/paper-about-bigtable.html

acctually reading a flat file is very fast because the file is read into the memory.
you can't acctually compare the two ways because flat files are read with procedural languages that
are optimized in the same way as RDBMS are optimized for working with sets.



Go with the flow & have fun! Else fight the flow
blog thingie: http://weblogs.sqlteam.com/mladenp
Go to Top of Page

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2006-09-22 : 11:56:32
and some more about Google file system:
http://en.wikipedia.org/wiki/Google_File_System



Go with the flow & have fun! Else fight the flow
blog thingie: http://weblogs.sqlteam.com/mladenp
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2006-09-22 : 12:09:08
quote:
Originally posted by spirit1

acctually reading a flat file is very fast because the file is read into the memory.



Actually most of my databases reside in memory



Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx

Add yourself!
http://www.frappr.com/sqlteam
Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2006-09-22 : 12:33:08
What is a database other than a wrapper for a flat file? The data is all held in files.
All a database is is an access method for those files.
If you have a application which needs to be efficient and the benefits outweigh the cost of effort in developing it then of course a bespoke access method and storage tailored to your system would be more efficient - you can concentrate on just retrieving the data rather than all the other stuff relational database do - think about updates and multi-user access though.
Of ciourse you probably wouldn't use windows or whatever and would devlope your own shell to access it and might go as far as building your own disks (or whatever) to match the structure of the data you are saving.

Anyone who comes from an asembler or engineering background will tell you that any generic solution will compromise on efficiency (but maybe not on cost).

I spent 7 years working on assembler based systems using a flat files for storage with hashing algorithms to access the data because the (ok it was primitive) isam methods that came with the system were too slow (and unreliable).

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2006-09-22 : 15:22:48
quote:
Originally posted by abuhassan

i Asked the architect he said he doent want to give his design away but said look why do the major search indexer use flat file not RDBMS?
Translation: Search architect doesn't know what he's talking about and doesn't want to admit it to anyone. Be very suspicious of people who won't answer simple questions, they're hiding something.

He probably thinks SQL Server and Oracle are glorified spreadsheets, if he's even used them before, and doesn't understand relational theory. Probably doesn't understand advanced text searching and indexing algorithms either, because searching flat files that are not indexed is gonna be as slow as anything.
Go to Top of Page
   

- Advertisement -