I'm running into a problem parsing a comma-separated CSV file that contains many double-quoted strings with embedded double quotes. After 143,651 rows the reader crashes with the error "Specified argument was out of the range of valid values." at nietras.SeparatedValues.SepReader.ParseNewRows(). The file has around 650,000 rows in total and a large number of columns.
I tried creating a small file containing just the column header row and the block of rows from 10 before to 10 after the place where the larger file crashes. That small file parses without crashing.
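A small window may not reproduce the crash if the failure depends on everything the reader has consumed before that point, so a slice that starts at the top of the file and ends a little past the failing row might be a more useful repro. Below is a minimal sketch using only plain .NET file APIs, not Sep. The names ReproSlicer and WriteSlice are hypothetical helpers made up for this example. Note that it counts physical lines, which differ from CSV rows whenever a quoted value contains a line break.

```csharp
using System;
using System.IO;

public static class ReproSlicer
{
    // Copies the header (physical line 1) plus physical lines
    // firstLine..lastLine (1-based, inclusive) from sourcePath to targetPath.
    // Returns the number of lines written, header included.
    public static int WriteSlice(string sourcePath, string targetPath, long firstLine, long lastLine)
    {
        int written = 0;
        long lineNumber = 0;
        using var writer = new StreamWriter(targetPath);
        foreach (string line in File.ReadLines(sourcePath))
        {
            lineNumber++;
            bool isHeader = lineNumber == 1;
            bool inWindow = lineNumber >= firstLine && lineNumber <= lastLine;
            if (isHeader || inWindow)
            {
                writer.WriteLine(line);
                written++;
            }
            if (lineNumber > lastLine)
            {
                break; // nothing of interest after the window
            }
        }
        return written;
    }
}
```

For example, ReproSlicer.WriteSlice(sFilename, "repro.csv", 2, 143700) would keep everything from the first data line to shortly past the failing row, assuming no quoted value spans multiple lines. If that file crashes, the start of the window can be moved forward step by step to shrink the repro.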
I also experimented with the reader options and found that flipping the Unescaped setting makes the crash go away. That leaves me with another problem, though: I'm also using SepWriter to write out a subset of the much larger input file, and with Unescaped set to its non-default value this subset is formatted differently (most of my double quotes are now missing).
Since the tiny -10 to +10 row test file didn't crash, it looks almost as if some quote-escaping state runs out of range after processing a huge number of column/cell values with lots of embedded quotes.
    // sFilename, sListingFilename and nLine are declared elsewhere.
    using (SepReader reader = Sep.New(',').Reader(o => o with { HasHeader = true, DisableColCountCheck = true }).FromFile(sFilename))
    {
        using (SepWriter writer = reader.Spec.Writer().ToFile(sListingFilename))
        {
            foreach (SepReader.Row row in reader)
            {
                nLine++;
                string sName = row["Name"].ToString();
                // Copy only rows whose Name contains the search text (case-insensitive).
                if (sName.IndexOf("lookingfor", StringComparison.OrdinalIgnoreCase) >= 0)
                {
                    // Disposing the writer row at the end of this block writes it out.
                    using SepWriter.Row writeRow = writer.NewRow(row);
                }
            }
        }
    }
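On the missing double quotes: when the reader unescapes values, the surrounding and doubled quotes are stripped before the values reach the writer, and as far as I can tell the writer does not add them back by default. A possible workaround, sketched below, is to turn on escaping on the writer side as well. This assumes the installed Sep version exposes Unescape on the reader options, Escape on the writer options, and a Writer overload that takes a configure lambda; please check those names against your version. FilterRows and Run are hypothetical names for this example.

```csharp
using System;
using nietras.SeparatedValues;

public static class FilterRows
{
    // Copies rows whose "Name" column contains "lookingfor" to outputPath.
    public static void Run(string inputPath, string outputPath)
    {
        // Assumption: Unescape is the reader option that avoids the crash.
        using var reader = Sep.New(',')
            .Reader(o => o with { HasHeader = true, DisableColCountCheck = true, Unescape = true })
            .FromFile(inputPath);

        // Assumption: Escape makes the writer quote values that need it.
        using var writer = reader.Spec
            .Writer(o => o with { Escape = true })
            .ToFile(outputPath);

        foreach (var row in reader)
        {
            string name = row["Name"].ToString();
            if (name.IndexOf("lookingfor", StringComparison.OrdinalIgnoreCase) >= 0)
            {
                // The row is written when writeRow is disposed.
                using var writeRow = writer.NewRow(row);
            }
        }
    }
}
```

Even if this works, the output may not match the input byte for byte: escaping on write would typically quote only values that contain a separator, quote, or line break, whereas the original file may quote every string.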