CSV string handling

Typical way of creating a CSV string (pseudocode):

Create a CSV container object (like a StringBuilder in C#).
Loop through the strings you want to add, appending a comma after each one.
After the loop, remove that last superfluous comma.

Code sample:

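public string ReturnAsCSV(ContactList contactList)
{
    StringBuilder sb = new StringBuilder();
    foreach (Contact c in contactList)
    {
        sb.Append(c.Name + ",");
    }

    // Remove the trailing comma (skipped when the list was empty,
    // since Remove would throw on an empty builder).
    if (sb.Length > 0)
    {
        sb.Remove(sb.Length - 1, 1);
    }
    //sb.Replace(",", "", sb.Length - 1, 1)

    return sb.ToString();
}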

I like the idea of adding the comma only when the container is not empty, but doesn't that mean more processing, since it has to check the length of the string on every iteration?

I feel that there should be an easier/cleaner/more efficient way of removing that last comma. Any ideas?
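
For comparison, the "add the comma before every item except the first" idea can be sketched as below. This is only a sketch under assumptions: it takes plain strings rather than the ContactList from the question, the names CsvJoinSketch and JoinWithFlag are made up for illustration, and it does no escaping. It tracks "first item" with a boolean instead of testing the builder's length, because a length test would skip the comma when the first value is an empty string. Either check is a cheap comparison per iteration, so the extra processing should be negligible next to the appends themselves.

```csharp
using System.Collections.Generic;
using System.Text;

public static class CsvJoinSketch
{
    // Hypothetical helper: joins values with commas, with no CSV escaping.
    public static string JoinWithFlag(IEnumerable<string> values)
    {
        StringBuilder sb = new StringBuilder();
        bool first = true;

        foreach (string value in values)
        {
            // Write the separator before every item except the first,
            // so there is never a trailing comma to remove afterwards.
            if (!first)
            {
                sb.Append(',');
            }

            sb.Append(value);
            first = false;
        }

        return sb.ToString();
    }
}
```

With this shape, an empty input simply yields an empty string and no cleanup step is needed after the loop.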

Asked by: Guest | Views: 236
Total answers/comments: 4
Guest [Entry]

You could use LINQ to Objects:

// Requires: using System.Linq;
string[] strings = contactList.Select(c => c.Name).ToArray();
string csv = string.Join(",", strings);

Obviously that could all be done in one line, but it's a bit clearer on two.
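
The one-line form mentioned above might look like the sketch below. The Contact class here is a stand-in for the type in the question (only a Name property is assumed), and CsvLinqSketch is a made-up name. It relies on the string.Join overload that takes an IEnumerable<string>, available since .NET 4, so the ToArray() call can be dropped.

```csharp
using System.Collections.Generic;
using System.Linq;

// Stand-in for the Contact type from the question; only Name is assumed.
public class Contact
{
    public string Name { get; set; }
}

public static class CsvLinqSketch
{
    public static string ReturnAsCsv(IEnumerable<Contact> contactList)
    {
        // Project each contact to its name and join with commas.
        // No escaping is done, so names containing commas or quotes
        // would produce malformed CSV.
        return string.Join(",", contactList.Select(c => c.Name));
    }
}
```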
Guest [Entry]

Your code is not really compliant with the full CSV format. If you are just generating CSV from data that has no commas, leading/trailing spaces, tabs, newlines or quotes, it should be fine. However, in most real-world data-exchange scenarios, you do need the full implementation.

To generate proper CSV, you can use this:

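// DelimiterChar is not defined in the original snippet; a comma is assumed here.
const char DelimiterChar = ',';

public static String EncodeCsvLine(params String[] fields)
{
    StringBuilder line = new StringBuilder();

    for (int i = 0; i < fields.Length; i++)
    {
        // Separator goes before every field except the first.
        if (i > 0)
        {
            line.Append(DelimiterChar);
        }

        String csvField = EncodeCsvField(fields[i]);
        line.Append(csvField);
    }

    return line.ToString();
}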

static String EncodeCsvField(String field)
{
    StringBuilder sb = new StringBuilder();
    sb.Append(field);

    // Some fields with special characters must be embedded in double quotes
    bool embedInQuotes = false;

    // Embed in quotes to preserve leading/trailing whitespace
    if (sb.Length > 0 &&
        (sb[0] == ' ' ||
         sb[0] == '\t' ||
         sb[sb.Length - 1] == ' ' ||
         sb[sb.Length - 1] == '\t'))
    {
        embedInQuotes = true;
    }

    for (int i = 0; i < sb.Length; i++)
    {
        // Embed in quotes to preserve: commas, line-breaks etc.
        if (sb[i] == DelimiterChar ||
            sb[i] == '\r' ||
            sb[i] == '\n' ||
            sb[i] == '"')
        {
            embedInQuotes = true;
            break;
        }
    }

    // If the field itself has quotes, they must each be represented
    // by a pair of consecutive quotes.
    sb.Replace("\"", "\"\"");

    String rv = sb.ToString();

    if (embedInQuotes)
    {
        rv = "\"" + rv + "\"";
    }

    return rv;
}

It might not be the world's most efficient code, but it has been tested. The real world sucks compared to quick sample code :)
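
To make the quoting rules concrete, here is a condensed sketch of the same rules together with a few sample inputs and outputs. It is not the code above, just a shorter equivalent written for illustration: the delimiter is assumed to be a comma, and the names CsvFieldSketch and Quote are made up.

```csharp
using System;

public static class CsvFieldSketch
{
    // Characters that force a field to be wrapped in double quotes.
    private static readonly char[] Special = { ',', '"', '\r', '\n' };

    public static string Quote(string field)
    {
        if (string.IsNullOrEmpty(field))
        {
            return string.Empty;
        }

        // Leading or trailing spaces/tabs are only preserved inside quotes.
        char firstChar = field[0];
        char lastChar = field[field.Length - 1];
        bool edgeWhitespace =
            firstChar == ' ' || firstChar == '\t' ||
            lastChar == ' ' || lastChar == '\t';

        if (!edgeWhitespace && field.IndexOfAny(Special) < 0)
        {
            return field; // nothing special, emit as-is
        }

        // Double every embedded quote, then wrap the whole field in quotes.
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }
}

// Sample inputs and the text that ends up in the CSV line:
//   plain      ->  plain
//   a,b        ->  "a,b"
//   say "hi"   ->  "say ""hi"""
//    padded    ->  " padded"     (input has a leading space)
```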
Guest [Entry]

Why not use one of the open source CSV libraries out there?

I know it sounds like overkill for something that appears so simple, but as you can tell by the comments and code snippets, there's more than meets the eye. In addition to handling full CSV compliance, you'll eventually want to handle both reading and writing CSVs... and you may want file manipulation.

I've used Open CSV on one of my projects before (but there are plenty of others to choose from). It certainly made my life easier. ;)
Guest [Entry]

Don't forget our old friend "for". It's not as nice-looking as foreach but it has the advantage of being able to start at the second element.

public string ReturnAsCSV(ContactList contactList)
{
    if (contactList == null || contactList.Count == 0)
        return string.Empty;

    // Seed the builder with the first name, then prefix each later name with a comma.
    StringBuilder sb = new StringBuilder(contactList[0].Name);

    for (int i = 1; i < contactList.Count; i++)
    {
        sb.Append(",");
        sb.Append(contactList[i].Name);
    }

    return sb.ToString();
}

You could also wrap the second Append in an "if" that tests whether the Name property contains a double-quote or a comma, and if so, escapes it appropriately.
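
A rough sketch of that last suggestion follows. It makes some assumptions: it works on a plain list of names instead of the ContactList type, the names CsvForLoopSketch and EscapeIfNeeded are invented for the example, and "escape appropriately" is taken to mean doubling embedded quotes and wrapping the value in quotes. The first element gets the same treatment as the rest, since it can contain a comma too.

```csharp
using System.Collections.Generic;
using System.Text;

public static class CsvForLoopSketch
{
    // Hypothetical helper: quotes a name only if it contains a comma or a double quote.
    private static string EscapeIfNeeded(string name)
    {
        if (name == null)
        {
            return string.Empty;
        }

        if (name.IndexOf(',') < 0 && name.IndexOf('"') < 0)
        {
            return name;
        }

        return "\"" + name.Replace("\"", "\"\"") + "\"";
    }

    public static string ReturnAsCsv(IList<string> names)
    {
        if (names == null || names.Count == 0)
        {
            return string.Empty;
        }

        // Seed with the first (escaped) name, then start the loop at the second.
        StringBuilder sb = new StringBuilder(EscapeIfNeeded(names[0]));

        for (int i = 1; i < names.Count; i++)
        {
            sb.Append(',');
            sb.Append(EscapeIfNeeded(names[i]));
        }

        return sb.ToString();
    }
}
```

This only covers commas and quotes, as the answer describes; newlines and leading/trailing whitespace would need the fuller rules from the earlier answer.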