Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 SSIS and Import/Export (2005)
 BCP, Tab separated files and Microsoft

Author  Topic 

Jenda
Starting Member

29 Posts

Posted - 2007-09-24 : 18:19:34
Let's see, so I saved an Excel sheet in the CSV format and tried to import using BCP ... no luck, some values contain commas and there's apparently no way to force BCP to honour quotes around values with commas.
Fine, why not, let's save it as a tab separated file:


...
1000 rows sent to SQL Server. Total sent: 6000
SQLState = HY000, NativeError = 0
Error = [Microsoft][SQL Native Client]Unexpected EOF encountered in BCP data-file

6823 rows copied.


What? Unexpected EOF? 6823? There's 7162 lines of data in the file!
And if I look at the imported results ... what??? The BCP helpfully ignored the row separators and if there were too few values on a line it stole a few from the next one.

Now how the (censored) do I import the bloody data? Is BCP supposed to be used or is it just a tool to annoy people?

The params:
[code]V:\Web\Import>bcp DB.dbo.TEMP_Sites in D:\temp\ActiveSites.tab.txt /Uxxx /Pxxx /Sxxx /c -e d:\temp\errors.csv

Thanks in advance, Jenda
(It's 12:18am and I'm getting up at 6:30. Thanks a lot mrkvosoft!)

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2007-09-24 : 18:22:07
You need to pick a field terminator that does not exist in your data. You apparently have commas and tabs in your data.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

Jenda
Starting Member

29 Posts

Posted - 2007-09-24 : 18:37:58
quote:
Originally posted by tkizer

You need to pick a field terminator that does not exist in your data. You apparently have commas and tabs in your data.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/



Nope, there are no tabs in the values. Bot not all rows has all values set. And if the last value(s) are missing Excel doesn't add tabs to make all rows the same length. OTOH it does add several thousands of empty lines containing just a few tabs, not as many as there are in the longest rows though, just a few to waste space.

Actually, I think I do have a solution. I'll add a bogus last column to make sure all lines contains enough tabs. I love you Microsoft!

Jenda
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2007-09-24 : 18:39:20
It doesn't sound like you have a structured file. Could you post a few samples of what it looks like and where possible problems are?

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

Jenda
Starting Member

29 Posts

Posted - 2007-09-24 : 18:54:59
quote:
Originally posted by tkizer

It doesn't sound like you have a structured file. Could you post a few samples of what it looks like and where possible problems are?

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/



Structured file? Yeah, let's invent a fancy name so that we have a reason why it doesn't work.

The first thing I had was a plain old CSV:

123,Sitename,url,no,yes,bleargh
321,"Site, sate or not site",other,no,no,

Probably too complex to even consider.

The second was a tab delimited file:

123 Sitename URL no yes bleargh
321 "Site, sate or not site" other no no
111 Some urlurl yes yes bad-luck

I don't really mind the doublequotes, why would MS Excel and MS SQL Server care to support a common format, I can get rid of those later, the catch is that the second row doesn't have a value for the last column. A very complex notion that BCP has no chance of expecting.

Jenda
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2007-09-24 : 19:05:22
quote:
Originally posted by Jenda


Structured file? Yeah, let's invent a fancy name so that we have a reason why it doesn't work.




I give up.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

jezemine
Master Smack Fu Yak Hacker

2886 Posts

Posted - 2007-09-24 : 19:09:19
bcp wants to see the same number of columns in each row. if there are different numbers, it complains.

so this is invalid:

123,abc,def
321,eee,ccc,ddd,eee

but this would be valid:

123,abc,def,,
321,eee,ccc,ddd,eee

notice the trailing two commas on the first line. these separate two NULLs.


elsasoft.org
Go to Top of Page

Jenda
Starting Member

29 Posts

Posted - 2007-09-24 : 19:12:47
quote:
Originally posted by jezemine

bcp wants to see the same number of columns in each row. if there are different numbers, it complains.

so this is invalid:

123,abc,def
321,eee,ccc,ddd,eee

but this would be valid:

123,abc,def,,
321,eee,ccc,ddd,eee

notice the trailing two commas on the first line. these separate two NULLs.



In other words BCP is pretty useless. No matter what well known and well defined format do you have you still have to transform it so something BCP would we willing to accept. Someone apparently made their job easy :-(
Go to Top of Page

jezemine
Master Smack Fu Yak Hacker

2886 Posts

Posted - 2007-09-25 : 01:49:41
I find bcp quite useful. I use it every day to great effect.

Most programs produce correct output given expected input. bcp is one such program. it barfs (as most other programs do) if you give it unexpected input. why is this so surprising?


elsasoft.org
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-09-25 : 02:50:50
"No matter what well known and well defined format do you have you still have to transform it so something BCP would we willing to accept"

There is nothing in the documentation for BCP that says it will, so stop whinging.

And it isn't well known and well defined in my experience, its "every app has its own interpretation". Most only accept comma-delimiters and skip the whole quoted thing, or fail to allow doubled-quotes for any embedded quotes.

Much easier to just use DTS to import an XLS.

Kristen
Go to Top of Page

Jenda
Starting Member

29 Posts

Posted - 2007-09-25 : 02:53:48
quote:
Originally posted by jezemine

I find bcp quite useful. I use it every day to great effect.

Most programs produce correct output given expected input. bcp is one such program. it barfs (as most other programs do) if you give it unexpected input. why is this so surprising?


I would not call CSV an unexpected format for a program that's supposed to batch-import data into database. But I can understand that someone was simply too lazy to implement it.

The behaviour I'd seen with the tab separated file is just a bug though. Any sensible implementation would assign NULLs to the missing columns, stealing several values from the next record is simply ... Microsoftish.

Really ... is it so unexpected to want to be able to run a report (check), copy the results to Excel (check), do whatever transformations/filters/hand-editing/augmentation you need (check) and then import the results into a (different) database (what the heck?)?

But, yeah, it was stupid to expect things to work. It was stupid to expect BCP to handle well known formats, it was stupid to expect one MS program to be able to handle data created by a different MS program. I see the error in my ways.
Jenda
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-09-25 : 03:18:50
"I would not call CSV an unexpected format for a program that's supposed to batch-import data into database. But I can understand that someone was simply too lazy to implement it."

What a load of rubbish.

It supports a single character delimited format, that's it.

It also supports a format file for more variable formats.

It does not support the various notions of what CSV mean to different people.

Would it be nice if it did? maybe, but most of the applications I come up against don't understand that CSV CAN mean "Quote the field if it contains an embedded comma, and for such fields double any embedded quotes". So there is rarely an opportunity to have that as a source file format anyway.

Yes, XLS can generate that style. But SQL Server has DTS which is far more suitable for importing XLS as it will do it natively without all the mucking about exporting it to CSV etc.

"Any sensible implementation would assign NULLs to the missing columns, stealing several values from the next record is simply ... Microsoftish."

More rubbish. I would HATE for it to do that. The import file is incomplete, I certainly would not want that to be handled silently. In this case it seems that XLS is at fault.

"I see the error in my ways"

No you don;t, you're just a whingeing minny.

The programs are documented in how they work, you want them to work differently. Fine, send a WishList to Microsoft.

"is it so unexpected to want to be able to run a report (check), copy the results to Excel (check), do whatever transformations/filters/hand-editing/augmentation you need (check) and then import the results into a (different) database (what the heck?)?"

We strongly discourage our users from massaging their data in this way. Excel messes up data in so many ways - e.g. stripping off leading zeros, interpreting string as dates or numbers when it thinks it can, and so on.

Kristen
Go to Top of Page

Jenda
Starting Member

29 Posts

Posted - 2007-09-25 : 04:31:50
quote:
Originally posted by Kristen

"Any sensible implementation would assign NULLs to the missing columns, stealing several values from the next record is simply ... Microsoftish."

More rubbish. I would HATE for it to do that. The import file is incomplete, I certainly would not want that to be handled silently. In this case it seems that XLS is at fault.



But it does handle the "incomplete" file silently. It silently steals a few values from the start of the next record. Unless you are lucky enough that those values fail whatever constraints are there for the last columns, the only hint that something went wrong is the number of imported records!

If it rejected the rows it's be annoying, IMHO overly picky, but sensible. Stealing values from the next record is not sensible.

Jenda
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-09-25 : 05:10:46
"Stealing values from the next record is not sensible"

indeed.

Anyway, you would be better off using DTS

Kristen
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2007-09-25 : 08:48:45
quote:
Originally posted by Kristen

"Stealing values from the next record is not sensible"

indeed.

Anyway, you would be better off using DTS

Kristen

Good Lord! Don't even suggest that unless you want a dozen posts from Jenda whining about how useless DTS is.

e4 d5 xd5 Nf6
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2007-09-25 : 09:19:30
We have a term for this

It's a theme park called bobo land

they have all kinds of "lands" and rides you can visit

It's right upi your ally

All you need is a pink slip to get in



Brett

8-)

Hint: Want your questions answered fast? Follow the direction in this link
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx

Add yourself!
http://www.frappr.com/sqlteam



Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-09-26 : 05:29:02
"Don't even suggest that unless you want a dozen posts from Jenda whining about how useless DTS is."

He's not going to; I suggested it three times already, but obviously exporting to CSV and importing with BCP is preferable to just having DTS import the XLS directly

Kristen
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-09-26 : 05:29:39
Mind you, perhaps I'll offer him a job ... it would be nice to have a developer who writes software with a perfect interface and no bugs
Go to Top of Page

Jenda
Starting Member

29 Posts

Posted - 2007-09-26 : 08:07:01
quote:
Originally posted by Kristen

"Don't even suggest that unless you want a dozen posts from Jenda whining about how useless DTS is."

He's not going to; I suggested it three times already, but obviously exporting to CSV and importing with BCP is preferable to just having DTS import the XLS directly

Kristen



Next time I might do that. Which doesn't mean that the omission of CSV from the list of formats supported by BCP is a good idea or that the handling of tab separated files with lines that do not end by several tabs in case the last values are empty is not an error that should be fixed.

If BCP was opensourced someone would already do both. Maybe that's part of the frustration ... instead of being able to fix a bug, I'm forced to circumvent it. With the likelyhood of a fix within the next ten years being close to zero :-( Especially since Microsoft is famous for calling their bugs features.

Jenda
P.S.: And no, thanks, I would not want to work with a smartass like you.
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2007-09-26 : 08:21:12
"If BCP was opensourced someone would already do both."

There probably is a suitable import tool that has source available. You can set up a linked server to pretty much anything, or use some sort of OPENQUERY to talk to all sorts of weird external "providers", and a 3rd party import gateway sounds like a sensible Widget.

"And no, thanks, I would not want to work with a smartass like you."

Thanks for letting me know, but as my staff will tell you I'm not a smartarse; but your whole attitude in this thread doesn't make it conducive to wanting to like, or help, you.

But I expect you already know that if that's your normal style of communication.

Kristen
Go to Top of Page

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2007-09-26 : 10:49:13
Jenda, BCP is optimized for speed and low overhead. It was never intended to be used with a large variety of formats. It was intended to be used to import datasets that had been output by BCP. Other tools, such as DTS and Development Studio have more functionality, with probably less efficiency and certainly more overhead.
So your complaining about about BCP is more than a little humorous to the rest of us. Its as if you sat down in a Formula One race car and started whining about how few cup holders it has.

e4 d5 xd5 Nf6
Go to Top of Page
    Next Page

- Advertisement -