CSV string handling



Typical way of creating a CSV string (pseudocode):



Code sample:


public string ReturnAsCSV(ContactList contactList)
{
    StringBuilder sb = new StringBuilder();
    foreach (Contact c in contactList)
    {
        sb.Append(c.Name + ",");
    }

    sb.Remove(sb.Length - 1, 1); // drop the trailing comma
    //sb.Replace(",", "", sb.Length - 1, 1)

    return sb.ToString();
}



I like the idea of adding the comma by checking if the container is empty, but doesn't that mean more processing as it needs to check the length of the string on each occurrence?



I feel that there should be an easier/cleaner/more efficient way of removing that last comma. Any ideas?




13 Answers



You could use LINQ to Objects:


string[] strings = contactList.Select(c => c.Name).ToArray();
string csv = string.Join(",", strings);



Obviously that could all be done in one line, but it's a bit clearer on two.
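For reference, the one-line form might look like the sketch below. It assumes .NET 4 or later, where string.Join accepts an IEnumerable<string> directly, so the ToArray() call is not needed. The Contact class and the LinqCsvSketch wrapper are hypothetical stand-ins for the types in the question.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Contact type used in the question.
public class Contact
{
    public string Name { get; set; }
}

public static class LinqCsvSketch
{
    // Projects each contact to its name and joins the names with commas.
    // No escaping is performed, so names containing commas or quotes will break the output.
    public static string ReturnAsCsv(IEnumerable<Contact> contacts)
    {
        return string.Join(",", contacts.Select(c => c.Name));
    }
}
```

An empty sequence yields an empty string, so there is no trailing comma to remove and no special case for an empty list.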






Less obvious is that it doesn't actually implement the CSV specification. It's a great way to put commas into a string, but that's not the same as CSV, the format.

– rcreswick
Sep 22 '08 at 19:21






Works well with the EncodeCsvField() function posted by dbkk

– Chris Miller
Oct 27 '10 at 15:48



Your code is not really compliant with the full CSV format. If you are just generating CSV from data that has no commas, leading/trailing spaces, tabs, newlines or quotes, it should be fine. However, in most real-world data-exchange scenarios, you do need the full implementation.



To generate proper CSV, you can use this:


public static String EncodeCsvLine(params String[] fields)
{
    StringBuilder line = new StringBuilder();

    for (int i = 0; i < fields.Length; i++)
    {
        // Write the delimiter before every field except the first
        if (i > 0)
        {
            line.Append(DelimiterChar);
        }

        String csvField = EncodeCsvField(fields[i]);
        line.Append(csvField);
    }

    return line.ToString();
}


static String EncodeCsvField(String field)



Might not be world's most efficient code, but it has been tested. Real world sucks compared to quick sample code :)
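The body of EncodeCsvField is not shown above, so here is a minimal sketch of what such a function might do. This is not the answerer's code. It assumes RFC 4180-style quoting (wrap the field in double quotes and double any embedded quotes) and quotes only fields that contain the delimiter, a quote, a line break, or leading/trailing whitespace. The constants DelimiterChar and QuoteChar and the class name CsvFieldSketch are assumptions.

```csharp
using System.Text;

public static class CsvFieldSketch
{
    // Assumed values; the answer uses DelimiterChar without showing its definition.
    private const char DelimiterChar = ',';
    private const char QuoteChar = '"';

    // Returns the field unchanged when it is safe, otherwise wraps it in quotes
    // and doubles every embedded quote character.
    public static string EncodeCsvField(string field)
    {
        if (string.IsNullOrEmpty(field))
        {
            return string.Empty;
        }

        bool needsQuotes =
            field.IndexOfAny(new[] { DelimiterChar, QuoteChar, '\r', '\n' }) >= 0
            || char.IsWhiteSpace(field[0])
            || char.IsWhiteSpace(field[field.Length - 1]);

        if (!needsQuotes)
        {
            return field;
        }

        var sb = new StringBuilder(field.Length + 2);
        sb.Append(QuoteChar);
        foreach (char ch in field)
        {
            if (ch == QuoteChar)
            {
                sb.Append(QuoteChar); // double the embedded quote
            }
            sb.Append(ch);
        }
        sb.Append(QuoteChar);
        return sb.ToString();
    }
}
```

With this sketch, a value such as Smith, John comes out as "Smith, John", while a plain value such as Ann is left untouched.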






As posted in another answer, there are libraries to do this (eg: OpenCSV) and they actually have test frameworks / unit tests too.

– rcreswick
Sep 22 '08 at 19:19






These two subroutines finally solved the issue I have been chasing. True, a bit longer than simply lick-and-stick all the data together with commas, but handled my 400,000 row export without issue.

– Lloyd
Mar 13 '13 at 2:04



Why not use one of the open source CSV libraries out there?



I know it sounds like overkill for something that appears so simple, but as you can tell by the comments and code snippets, there's more than meets the eye. In addition to handling full CSV compliance, you'll eventually want to handle both reading and writing CSVs... and you may want file manipulation.



I've used Open CSV on one of my projects before (but there are plenty of others to choose from). It certainly made my life easier. ;)



Don't forget our old friend "for". It's not as nice-looking as foreach but it has the advantage of being able to start at the second element.


public string ReturnAsCSV(ContactList contactList)

if (contactList == null



You could also wrap the second Append in an "if" that tests whether the Name property contains a double-quote or a comma, and if so, escape them appropriately.
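The code sample in this answer is cut off, so the following is only a sketch of the idea it describes: append the first element on its own, then loop from the second element and put a comma before each remaining one. It works on a list of plain strings rather than the ContactList from the question (an assumption), and the Escape helper is a hypothetical illustration of the "if" test mentioned above.

```csharp
using System.Collections.Generic;
using System.Text;

public static class ForLoopCsvSketch
{
    public static string JoinWithFor(IList<string> names)
    {
        if (names == null || names.Count == 0)
        {
            return string.Empty;
        }

        var sb = new StringBuilder();
        sb.Append(Escape(names[0]));         // first element, no comma

        for (int i = 1; i < names.Count; i++)
        {
            sb.Append(',');                  // comma before every later element
            sb.Append(Escape(names[i]));     // the "second Append", with escaping
        }

        return sb.ToString();
    }

    // Hypothetical helper: quotes the value only if it contains a comma or a double quote.
    private static string Escape(string value)
    {
        value = value ?? string.Empty;
        if (value.IndexOfAny(new[] { ',', '"' }) < 0)
        {
            return value;
        }
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }
}
```

Because the comma is always written before an element, there is never a trailing comma to remove.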



You could instead add the comma as the first thing inside your foreach.



if (sb.Length > 0) sb.Append(",");
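Put together, the whole method might look like the sketch below. It takes a sequence of plain strings instead of the ContactList from the question (an assumption), and the class name CommaFirstSketch is hypothetical.

```csharp
using System.Collections.Generic;
using System.Text;

public static class CommaFirstSketch
{
    public static string Join(IEnumerable<string> names)
    {
        var sb = new StringBuilder();
        foreach (string name in names)
        {
            // Anything already in the builder means this is not the first item
            if (sb.Length > 0) sb.Append(",");
            sb.Append(name);
        }
        return sb.ToString();
    }
}
```

One caveat with testing sb.Length: if the first name is an empty string, the builder is still empty when the second item arrives, so the separator between them is skipped. A separate boolean flag avoids that edge case.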





You could also make an array of c.Name data and use String.Join method to create your line.


public string ReturnAsCSV(ContactList contactList)
{
    List<String> tmpList = new List<string>();

    foreach (Contact c in contactList)
    {
        tmpList.Add(c.Name);
    }

    return String.Join(",", tmpList.ToArray());
}



This might not be as performant as the StringBuilder approach, but it definitely looks cleaner.



Also, you might want to consider using CultureInfo.CurrentCulture.TextInfo.ListSeparator instead of a hard-coded comma. If your output is going to be imported into other applications, a hard-coded comma might cause problems: ListSeparator may differ across cultures, and MS Excel, at the very least, honors this setting. So:


return String.Join(
System.Globalization.CultureInfo.CurrentCulture.TextInfo.ListSeparator,
tmpList.ToArray());



I like the idea of adding the comma by checking if the container is empty, but doesn't that mean more processing as it needs to check the length of the string on each occurrence?



You're prematurely optimizing; the performance hit would be negligible.



Just a thought, but remember to handle commas and quotation marks (") in the field values, otherwise your CSV file may break the consumer's reader.



I wrote a small class for this in case someone else finds it useful...


using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public class clsCSVBuilder
{
    protected int _CurrentIndex = -1;
    protected List<string> _Headers = new List<string>();
    protected List<List<string>> _Records = new List<List<string>>();
    protected const string SEPERATOR = ",";

    public clsCSVBuilder() { }

    // Starts a new row; subsequent AddRowItem calls append to it.
    public void CreateRow()
    {
        _Records.Add(new List<string>());
        _CurrentIndex++;
    }

    // Wraps the value in quotes, doubles embedded quotes, and replaces line breaks with spaces.
    protected string _EscapeString(string str)
    {
        return string.Format("\"{0}\"", str.Replace("\"", "\"\"")
                                        .Replace("\r\n", " ")
                                        .Replace("\n", " ")
                                        .Replace("\r", " "));
    }

    protected void _AddRawString(string item)
    {
        _Records[_CurrentIndex].Add(item);
    }

    public void AddHeader(string name)
    {
        _Headers.Add(_EscapeString(name));
    }

    public void AddRowItem(string item)
    {
        _AddRawString(_EscapeString(item));
    }

    public void AddRowItem(int item)
    {
        _AddRawString(item.ToString());
    }

    public void AddRowItem(double item)
    {
        _AddRawString(item.ToString());
    }

    public void AddRowItem(DateTime date)
    {
        AddRowItem(date.ToShortDateString());
    }

    public static string GenerateTempCSVPath()
    {
        return Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString().ToLower().Replace("-", "") + ".csv");
    }

    protected string _GenerateCSV()
    {
        StringBuilder sb = new StringBuilder();

        if (_Headers.Count > 0)
        {
            sb.AppendLine(string.Join(SEPERATOR, _Headers.ToArray()));
        }

        foreach (List<string> row in _Records)
        {
            sb.AppendLine(string.Join(SEPERATOR, row.ToArray()));
        }

        return sb.ToString();
    }

    public void SaveAs(string path)
    {
        using (StreamWriter sw = new StreamWriter(path))
        {
            sw.Write(_GenerateCSV());
        }
    }
}




I've used this method before. The Length property of StringBuilder is NOT read-only, so decrementing it by one truncates the last character. But you have to make sure the length is not zero to start with (which would happen if your list is empty), because setting the length to less than zero is an error.


public string ReturnAsCSV(ContactList contactList)
{
    StringBuilder sb = new StringBuilder();

    foreach (Contact c in contactList)
    {
        sb.Append(c.Name + ",");
    }

    // Guard against an empty list before truncating the trailing comma
    if (sb.Length > 0)
        sb.Length -= 1;

    return sb.ToString();
}



I use CsvHelper. It's a great open-source library that lets you generate compliant CSV streams one element at a time or custom-map your classes:


public string ReturnAsCSV(ContactList contactList)
{
    StringBuilder sb = new StringBuilder();
    using (StringWriter stringWriter = new StringWriter(sb))
    {
        using (var csvWriter = new CsvHelper.CsvWriter(stringWriter))
        {
            csvWriter.Configuration.HasHeaderRecord = false;
            foreach (Contact c in contactList)
            {
                csvWriter.WriteField(c.Name);
            }
        }
    }
    return sb.ToString();
}



Or, if you map your classes, something like this:


csvWriter.WriteRecords<ContactList>(contactList);



How about some trimming?


public string ReturnAsCSV(ContactList contactList)
{
    StringBuilder sb = new StringBuilder();

    foreach (Contact c in contactList)
    {
        sb.Append(c.Name + ",");
    }

    return sb.ToString().Trim(',');
}
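One thing to keep in mind with trimming: Trim(',') strips every comma from both ends of the string, not just the single trailing one, so empty names at the start or end of the list are silently dropped. The sketch below compares three options on the same input. The class and method names are hypothetical and only there for illustration.

```csharp
public static class TrimSketch
{
    // Strips all commas from both ends
    public static string TrimBoth(string s)
    {
        return s.Trim(',');
    }

    // Strips all commas from the end only
    public static string TrimTrailing(string s)
    {
        return s.TrimEnd(',');
    }

    // Removes exactly one trailing comma, if there is one
    public static string DropOneTrailingComma(string s)
    {
        return (s.Length > 0 && s[s.Length - 1] == ',')
            ? s.Substring(0, s.Length - 1)
            : s;
    }
}
```

For the names "", "Ann", "" the loop builds ",Ann,,". TrimBoth returns "Ann", TrimTrailing returns ",Ann", and only DropOneTrailingComma returns ",Ann,", which still has three fields.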



How about tracking whether you are on the first item, and only adding a comma before the item if it is not the first one?


public string ReturnAsCSV(ContactList contactList)
{
    StringBuilder sb = new StringBuilder();
    bool isFirst = true;

    foreach (Contact c in contactList)
    {
        if (!isFirst)
        {
            // Only add comma before item if it is not the first item
            sb.Append(",");
        }
        else
        {
            isFirst = false;
        }

        sb.Append(c.Name);
    }

    return sb.ToString();
}


