Author |
Topic |
blackjackIT
Starting Member
25 Posts |
Posted - 2007-07-06 : 15:52:07
|
So here's the deal. I just started with a new company as a SQL Developer/Analyst. I've got a couple years of experience with SQL Server, mostly 2000, on some fairly large and complex databases (or so I thought).So I get to this new company and the database structure is just wacky. I've never seen anything like this before. After a few google searches I find a bunch of articles on EAV. Yup, that's what I've gotten myself into. On top of that, it seems to be some exploded EAV hybrid, possibly EAV/CR or something I saw. I've dubbed it the ESEASAADSADAVSAVCRS Schema (Entity Subentity Attribute subattribute attributedata subattributedata attributevalue subattributevalue circular reference system). Gotta laugh so you don't cry, right?As far as I can tell, all of the data they have is submitted from clients, cleaned, aggregated and then used to generate reports which clients in turn subscribe to. From what I've read and seen here, EAV is horrible for reporting (among other things) and they are having performance problems. The guy 2 times before me made a reporting table structure that does all of the aggregations and spits the data into new tables for the reports to run off of. The problem? The reporting table structure is also EAV!As the original designer left the company, he said the word CUBE. Then comes in the next "SQL Guru" (she was only here for 6 months, can't imagine why). So they start doing upgrades to SQL 2005 and she takes her predecessors advice and starts designing a dimensional model in SSAS based of the reporting database structure. EAV + CUBES = WTF? Did I happen to mention she didn't have any OLAP experience when she started? (neither do I, at least not in a production environment)So now there's me sitting here 2 weeks in with an EAV database, a pseudo-EAV reporting database, some unfinished cubes, not wanting to touch anything for fear of the whole thing imploding.Here are some of the factors I must take into account:1. The company website serves as the client UI and is tightly integrated with the EAV schema.2. New data sources come and go quite often which means lots of attribute changes to the data3. After looking at the data with what little SSAS knowledge I have, it seems that going this direction might just be useless. I believe all of the fact/measure data is stored in all of these dynamic attributes and it seems like I would be changing them on every load.4. I thought about dropping the cube idea and redesigning the reporting database structure to 3NF and then pumping the data from EAV to 3NF for reporting but... uhhh... damn, my mind went blank4. My brain is fried from looking at this thing so I can't remember what other points I was going to bring up... please give some advice. |
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2007-07-06 : 16:02:44
|
buy a box of kleenex, have a good cry, and look for a new job. failing that I like option 4 the best. elsasoft.org |
 |
|
blackjackIT
Starting Member
25 Posts |
Posted - 2007-07-06 : 16:18:11
|
If I went with #4, how would I best deal with the schema changes of the EAV? Just suck it up and manually change the associated tables in the 3NF db? Luckily (I think), the reporting database is only updated once a month. On the other hand, we have to keep historical data. We basically have a dataset for each monthly build and based off of business rules and historical data, we can produce projected data for future results (usually 6 month projections).I welcome anyone's suggestions, advice, condolences, and/or general taunting on this. |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2007-07-06 : 16:35:46
|
well your quest for 3NF might be easier if you used XML datatype as a middle man if you can...put EAV into xml and then XML to 3nf with the help of xml datatype's query functions.i have no idea if that's viable for you but hey... it's a suggestion _______________________________________________Causing trouble since 1980blog: http://weblogs.sqlteam.com/mladenp |
 |
|
RocketScientist
Official SQLTeam Chef
85 Posts |
Posted - 2007-07-06 : 16:42:20
|
Well, you're going to have to get that database into a non-suck format somehow. Option 4's the way to go. It's going to be a big pile of work, make sure you've got lots of backups.If you start with the reporting schema, you can basically create the new schema and run it in parallel with the old one to work the bugs out. Create the schema, the ETL, then test it to make sure everything ties. Then start working on the live database the same way.EA...looks great on paper, but at least it's impossible to manage, search, or report on. It's nice for a write-only database though. Writes are really fast. |
 |
|
blackjackIT
Starting Member
25 Posts |
Posted - 2007-07-06 : 16:44:33
|
I'm pretty sure I'm going to have full reign over the direction of the db. The only thing stopping me from dropping the whole damn thing and starting over is the fact that the app/web developer would have to also redesign everything in his side.I've never dealt with the XML datatype so I guess I'll do some more reading. It's looking like a long road. |
 |
|
blackjackIT
Starting Member
25 Posts |
Posted - 2007-07-06 : 16:49:04
|
A quick look back at the SSAS option. Do you guys think this is even a viable option in this scenario or would we be causing ourselves more headaches than it's worth? |
 |
|
Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)
7020 Posts |
Posted - 2007-07-06 : 17:12:20
|
I think for SSAS to work, you have to design a star schema in a relational database to be able to load it. Of course, if you had that, you could just report directly from the star schema, so it probably doesn't gain you much.I think I would do it like this:1. Create a relational model for the transaction schema.2. Create procedures to transform the EVA data into the new transaction schema and to keep it up-to-date enough for reporting.3. Get the web guy working on moving the application to work against the new transaction schema.4. While the web guy is working on the front end, redo the reporting to run against the transactional schema.5. Cutover the web app to the new schema when that is ready.5. Optionally, create a star schema for reporting, create procedures to load the star from the transactional data, and run the reports from that. This would not have to wait until the front-end is done; that’s the reason for doing the new transaction schema first.Either that, or find a new job.Either way, you have my deepest sympathy. Been there and done it.CODO ERGO SUM |
 |
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2007-07-06 : 17:12:28
|
quote: Originally posted by RocketScientistIt's nice for a write-only database though. Writes are really fast.
a write-only database? sounds like the IT equivalent of a black hole. easy to implement - on the back end you just send everything to NUL elsasoft.org |
 |
|
pootle_flump
1064 Posts |
Posted - 2007-07-07 : 09:29:32
|
quote: Originally posted by blackjackIT 4. My brain is fried from looking at this thing
Quite clearly - you've already done 4. But obviously Michael got worn out writing his response too.I think the transition to 3nf for reporting seems sensible. The only reason for implementing EAV is regular schema changes. You don't mention this. Did the original designer just use EAV++ (can't be bothered typing your acronym....) for the hell of it then?Best of luck ZenRaven |
 |
|
blackjackIT
Starting Member
25 Posts |
Posted - 2007-07-07 : 11:38:37
|
Too funny. I just saw the 2 number 4's and Michael's 2 5's. Watch out, it's contagious.I think regular schema changes is what I meant by number 2. Adding or subtracting an attribute is as easy as adding or deleting a row in EAV but I'd have to alter tables in 3NF.It seems like I'm going to have a hell of a time with my 3NF reporting structure because I will virtually have to recreate the structure on every monthly build because of that month's attribute changes, correct?The original designer was more of a web developer that a dba. He used EAV for regular schema changes but I think more so because he also uses it to dynamically generate the front end. |
 |
|
jezemine
Master Smack Fu Yak Hacker
2886 Posts |
Posted - 2007-07-07 : 17:38:49
|
that's why I posted "I like option 4 the best" it has both a fried brain and 3NF. what could be better? elsasoft.org |
 |
|
PSamsig
Constraint Violating Yak Guru
384 Posts |
Posted - 2007-07-08 : 04:57:28
|
Assuming the metadata is also stored in the database, wouldn't it be quite easy (read doable) to make an application that scripts a 3NF snapshot and feeds it, then doing the reporting on that?-- If you give someone a program, you will frustrate them for a day; if you teach them how to program, you will frustrate them for a lifetime. |
 |
|
PSamsig
Constraint Violating Yak Guru
384 Posts |
|
PSamsig
Constraint Violating Yak Guru
384 Posts |
Posted - 2007-07-08 : 10:47:20
|
OTLT/EAV design - why do people hate it?To 'blindman': Broadcast News-- If you give someone a program, you will frustrate them for a day; if you teach them how to program, you will frustrate them for a lifetime. |
 |
|
blackjackIT
Starting Member
25 Posts |
Posted - 2007-07-08 : 11:58:55
|
Thanks for the links, I've been readin through those and just about everything else I can google about EAV over the last week.quote: make an application that scripts a 3NF snapshot and feeds it
Can you elaborate on this? |
 |
|
PSamsig
Constraint Violating Yak Guru
384 Posts |
Posted - 2007-07-08 : 12:48:02
|
You havn't really described how well this EAV database is designed, and as you clearly know by now, there is bad and less bad ways to do this (or good and less good).Of all the articles I have read today, they all point out how 'difficult' and inefficient multi-attribute queries is in a EAV design, they somewhat fail to point out is, given the that the metadata is accesible, that they are trivially machine generated (but still just as inefficient).So if you had (read make) an application that could interprete these metadata, and translate them into a 3NF database schema (which of course will be a snapshot of how the 3NF schema will look at that particially point in time) and then feed that database with the current data, you will have a basic reporting database. Without some index tuning it wont be very efficient, but hay, it can't hardly get much worse right, and at least it will be tuneable.-- If you give someone a program, you will frustrate them for a day; if you teach them how to program, you will frustrate them for a lifetime. |
 |
|
blackjackIT
Starting Member
25 Posts |
Posted - 2007-07-12 : 10:29:10
|
Thanks for everyone's advise. I'm going to build a 3NF reporting structure in parallel to the existing and slowly transition off the old structure as some of you suggested.After some discussion of the subject, we've decided to focus more on the query optimization of the current code before heading into a longer term project. Good thing, now I have more time to ponder.Of course, I'm still wondering how I'm going to deal with the monthly schema changes. I guess I'll either create a new db every month based off the current eav snapshot or modify it every month, I don't know.Anyway, thanks again to everyone |
 |
|
|