xlson / groovycsv
A simple CSV parsing library for Groovy
Home Page: http://xlson.com/groovycsv/
License: Other
This could be useful if you would like to parse a csv where there are no column headers in the first line.
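import static groovy.io.FileType.FILES
import static com.xlson.groovycsv.CsvParser.parseCsv

def dataDir = new File(".")
dataDir.traverse(type: FILES, nameFilter: ~/.*\.csv$/) { File file ->
    file.withReader { Reader reader ->
        // Fails: parseCsv has no overload that accepts a closure
        parseCsv(reader) { row ->
            println row
        }
    }
}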
This fails because parseCsv does not provide a way to be called from a Closure:
Caught: groovy.lang.MissingMethodException: No signature of method: static com.xlson.groovycsv.CsvParser.parseCsv() is applicable for argument types: (java.io.LineNumberReader, Dataload$_run_closure1$_closure3$_closure4)
I have a CSV file from a customer that has an extra separator at the end of the header row, so the first line ends with a separator.
Parsing fails because the parser then assumes an extra column in the data lines.
The file imports fine in LibreOffice.
Link is broken. How do we access documentation now?
Solves two use cases:
Add support for type conversion so that it's possible to tell the parser that a specific column is of a certain type.
The toMap() method of PropertyMapper is missing from the jar and from the @Grab download.
Making things private (as opposed to protected) in an object-oriented language is extremely anti-social, because it prevents people from inheriting from and extending the class. For example, I wrote the code below to try to overcome issue #48 (i.e. stop getProperty() from throwing an exception), but it failed to compile because the members of CsvIterator are private. Don't do that. Make all data members protected.
CsvParser p = new CsvParser() {
    // all this code is so we can have a csv line that doesn't throw
    // an exception if some columns are missing.
    Iterator parse(Map args = [:], Reader reader) {
        def csvReader = createCSVReader(args, reader)
        def columnNames = parseColumnNames(args, csvReader)
        new CsvIterator(columnNames, csvReader) {
            def next() {
                throwsExceptionIfClosed()
                new PropertyMapper(columns: this.columns, values: this.nextValue) {
                    def propertyMissing(String name) {
                        def index = this.columns[name]
                        if (index != null) {
                            values[index]
                        } else {
                            return null
                        }
                    }
                }
            }
        }
    }
}
I'll start with an example to make my issue easier to understand.
String csv = """
Col1, Col2
val1-1, val2-1
val1-2, val2-2"qqq
val1-3, val2-3,
val1-4, val2-4"www,
val1-5, val2-5
""";
CsvParser.parse(
    csv,
    separator: ",",
    quoteChar: "",
    readFirstLine: false
).each { PropertyMapper line ->
    println("Col1: '${line.Col1}', Col2: '${line.Col2}'");
};
Result:
Col1: 'val1-1', Col2: 'val2-1'
Col1: 'val1-2', Col2: 'val2-2\"qqq\nval1-3,val2-3,\nval1-4,val2-4\"www'
Col1: 'val1-3', Col2: 'val2-5'
So I cannot assign an empty string as quoteChar. I believe this is due to the code in CsvParser.groovy, line 147:
if (args.autoDetect == true) {
    reader = new PushbackReader(reader, autoDetectCharNumber)
    doAutoDetection(args, reader)
    separator = args.separator
    quoteChar = args.quoteChar
} else {
    separator = args.separator ?: ','
    quoteChar = args.quoteChar ?: '"'
}
Groovy converts an empty string to Boolean false:
def q = '';
assert (Boolean)q == false;
I can work around the issue by using a fake quoteChar, but it would be nice to have a proper fix.
And thanks for your lib!
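The truthiness problem described in this report can be reproduced without the library. The following is a minimal sketch, assuming a hypothetical helper named resolveQuoteChar (not part of GroovyCSV) that checks whether the key was supplied instead of whether its value is truthy. What the parser should then do with an empty quote character is left open here.

```groovy
// Hypothetical helper, not GroovyCSV API: fall back to '"' only when
// the caller did not pass quoteChar at all.
def resolveQuoteChar(Map args) {
    args.containsKey('quoteChar') ? args.quoteChar : '"'
}

def userArgs = [separator: ',', quoteChar: '']

// The Elvis operator treats '' as false, so the default wins
assert (userArgs.quoteChar ?: '"') == '"'

// Checking for the key keeps the caller's empty string
assert resolveQuoteChar(userArgs) == ''
assert resolveQuoteChar([separator: ',']) == '"'
```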
OpenCSV has support for different line separators and escape characters.
Hello,
I have just started using this cool library and hope it continues to evolve.
I am currently facing some issues when trying to parse files that contain spaces or special characters in the headers. Would it be possible to add a feature that replaces the spaces with underscores?
Fix: Value-access by header names with spaces:
def csv = '''dev uuid,Node Name, Time (Seconds)
1,TestNode,300
2,TestNode2,200'''
def data = parseCsv(csv)
for (line in data) {
    println "$line.dev_uuid $line.NodeName : $line.Time_Seconds"
}
Best regards,
Sebas
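One possible shape for such a feature is sketched below in plain Groovy. The helper name normalizeHeader and the exact naming scheme (runs of non-word characters become a single underscore) are assumptions made for illustration; the request itself does not fix a scheme, and GroovyCSV does not provide this helper.

```groovy
// Hypothetical helper, not GroovyCSV API: turn a raw header into a name
// that works with plain property access such as row.dev_uuid.
String normalizeHeader(String raw) {
    // Runs of non-word characters become one underscore
    String underscored = raw.trim().replaceAll('\\W+', '_')
    // Drop underscores left over at either end
    underscored.replaceAll('^_+|_+$', '')
}

def headers = ['dev uuid', 'Node Name', ' Time (Seconds)']
assert headers.collect { normalizeHeader(it) } == ['dev_uuid', 'Node_Name', 'Time_Seconds']
```

Because the helper trims first, it would also remove the leading space that appears in headers separated by ", ".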
Functionality to skip empty lines was implemented for 1.2.1-SNAPSHOT. Unfortunately it is incomplete, and code like the following causes an exception to be thrown.
@GrabResolver(name='snapshots', root='https://oss.sonatype.org/content/groups/public/')
@Grab('com.xlson.groovycsv:groovycsv:1.2.1-SNAPSHOT')
import static com.xlson.groovycsv.CsvParser.parseCsv
def csv = parseCsv('''fruit,qty
apple,5
pear,6''')
csv.eachWithIndex { it, index ->
    println index
    println it
}
Output:
0
fruit: apple, qty: 5
1
Exception thrown
java.lang.ArrayIndexOutOfBoundsException: 1
at com.xlson.groovycsv.PropertyMapper$_toString_closure1.doCall(PropertyMapper.groovy:67)
at com.xlson.groovycsv.PropertyMapper.toString(PropertyMapper.groovy:67)
at ConsoleScript28$_run_closure1.doCall(ConsoleScript28:13)
at ConsoleScript28.run(ConsoleScript28:11)
CsvParser's parse method only accepts a String at this point, which means the whole CSV has to be read into memory before it can be parsed.
Hello,
I faced an issue, when trying to parse a file with a line that ends with quotes.
The parser doesn't read the ending quote, but if I add a char or space after the ending quote (line 1, from the file, ends with space), it is fine.
Let me share an example, a file with the following content:
id;test
1;project = "Test 1"
2;project = "Test 2"
Code:
def data = parseCsv(new FileReader(new File("tmp.csv")), separator: ';')
for (line in data) {
    println "$line"
}
Output:
id: 1, test: project = "Test 1"
id: 2, test: project = "Test 2
There hasn't been a new release to Maven since 2012. The last version is 1.0, and it lacks a few bug fixes and enhancements. Could you publish a new release to Maven?
Thanks
Do a final release of GroovyCSV 0.2.
To do:
Improvement Request!
It is currently possible to skip lines using the skipLines configuration argument, but it only accepts a single number: the count of leading lines to skip. I suggest extending skipLines to also accept a list of single line numbers and ranges of lines.
Current behaviour:
skipLines: 10
would skip the first 10 Lines
Additional behaviour:
skipLines: [1..10, 15..16, 20, 22..24]
would skip the lines 1 to 10, 15, 16, 20 and 22 to 24.
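A rough sketch of how such a value could be interpreted, in plain Groovy. The helper expandSkipLines is hypothetical and not part of GroovyCSV; it assumes line numbers are 1-based, as in the example above.

```groovy
// Hypothetical helper, not GroovyCSV API: expand a skipLines value into
// a set of 1-based line numbers to drop.
Set<Integer> expandSkipLines(def spec) {
    if (spec instanceof Number) {
        // Current behaviour: a plain number means "the first N lines"
        return (spec > 0 ? (1..spec).toList() : []) as Set<Integer>
    }
    // Proposed behaviour: a list mixing single numbers and ranges;
    // flatten() expands each range into its elements
    (spec as List).flatten() as Set<Integer>
}

def skip = expandSkipLines([1..2, 5])
def lines = ['# comment', '# comment', 'a,b', '1,2', '# note', '3,4']
def kept = []
lines.eachWithIndex { line, i -> if (!skip.contains(i + 1)) kept << line }
assert kept == ['a,b', '1,2', '3,4']
```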
com.xlson.groovycsv.PropertyMapper doesn't seem too much different from a Map and it would be great if I could pass maps to my other libraries without having to convert first.
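As a sketch of what such a conversion could look like: other reports on this page show PropertyMapper holding a columns map (header name to index) and a values array. Assuming only those two fields, a plain Map can be built with collectEntries. The variables below are stand-ins, not library objects.

```groovy
// Stand-ins for the two fields the page shows on PropertyMapper:
// columns maps a header name to its index, values holds one parsed line.
def columns = [fruit: 0, qty: 1]
def values = ['apple', '5'] as String[]

// Build an ordinary Map keyed by column name
Map<String, String> rowAsMap = columns.collectEntries { name, index ->
    [(name): values[index]]
}

assert rowAsMap == [fruit: 'apple', qty: '5']
```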
New Feature Request!
People tend to use tabs and spaces in CSV files to make them more readable. This causes a problem when the cell contents are not delimited, because those characters are then included in the content, which is not what those users intended.
It would make sense to let the parser trim the content of leading and trailing tabs/whitespaces. Best way to do it IMHO is to provide an additional configuration argument for this purpose.
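Until such an argument exists, the effect can be approximated after parsing. The helper trimCells below is a hypothetical post-processing step in plain Groovy, not a GroovyCSV option; it assumes one parsed line is available as a list of cell strings.

```groovy
// Hypothetical post-processing step, not a GroovyCSV option:
// strip leading and trailing whitespace (including tabs) from each cell.
List<String> trimCells(List<String> cells) {
    cells.collect { it?.trim() }
}

assert trimCells(['val1-1', '\t val2-1  ', null]) == ['val1-1', 'val2-1', null]
```

A built-in option would probably want to leave whitespace inside quoted cells alone, which this sketch cannot tell apart.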
Add support for reading csv values from the iterable using line[0], so that code that uses OpenCSV can be changed to use GroovyCSV without too much hassle.
I cloned master and ran ./gradlew --debug; the stack trace is below.
Am I missing anything?
15:56:10.775 [DEBUG] [org.gradle.messaging.remote.internal.TcpOutgoingConnector] Found loop-back addresses: [/0:0:0:0:0:0:0:1%1, /127.0.0.1].
15:56:11.418 [DEBUG] [de.huxhorn.gradle.pgp.PgpSigner] Added BouncyCastleProvider.
15:56:11.446 [DEBUG] [de.huxhorn.gradle.pgp.PgpPlugin] Created PgpSigner instance.
15:56:11.923 [DEBUG] [org.gradle.configuration.BuildScriptProcessor] Timing: Running the build script took 2.779 secs
15:56:11.933 [ERROR] [org.gradle.BuildExceptionReporter]
15:56:11.937 [ERROR] [org.gradle.BuildExceptionReporter] FAILURE: Build failed with an exception.
15:56:11.939 [ERROR] [org.gradle.BuildExceptionReporter]
15:56:11.940 [ERROR] [org.gradle.BuildExceptionReporter] * Where:
15:56:11.941 [ERROR] [org.gradle.BuildExceptionReporter] Build file '/opt/softwares/workspace/otherprojects/groovycsv/build.gradle' line: 18
15:56:11.942 [ERROR] [org.gradle.BuildExceptionReporter]
15:56:11.942 [ERROR] [org.gradle.BuildExceptionReporter] * What went wrong:
15:56:11.943 [ERROR] [org.gradle.BuildExceptionReporter] A problem occurred evaluating root project 'groovycsv'.
15:56:11.946 [ERROR] [org.gradle.BuildExceptionReporter] Cause: Could not find method maven() for arguments [build_gr4fdlqlq7963u3c5ul17a57$_run_closure1_closure11@518e322b] on root project 'groovycsv'.
15:56:11.947 [ERROR] [org.gradle.BuildExceptionReporter]
15:56:11.948 [ERROR] [org.gradle.BuildExceptionReporter] * Exception is:
15:56:11.950 [ERROR] [org.gradle.BuildExceptionReporter] org.gradle.api.GradleScriptException: A problem occurred evaluating root project 'groovycsv'.
15:56:11.951 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.groovy.scripts.DefaultScriptRunnerFactory$ScriptRunnerImpl.run(DefaultScriptRunnerFactory.java:51)
15:56:11.952 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.configuration.DefaultScriptPluginFactory$ScriptPluginImpl.apply(DefaultScriptPluginFactory.java:127)
15:56:11.952 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.configuration.BuildScriptProcessor.evaluate(BuildScriptProcessor.java:38)
15:56:11.953 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.configuration.DefaultProjectEvaluator.evaluate(DefaultProjectEvaluator.java:38)
15:56:11.954 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.api.internal.project.AbstractProject.evaluate(AbstractProject.java:487)
15:56:11.954 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.api.internal.project.AbstractProject.evaluate(AbstractProject.java:71)
15:56:11.955 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.configuration.ProjectEvaluationConfigurer.execute(ProjectEvaluationConfigurer.java:23)
15:56:11.956 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.configuration.ProjectEvaluationConfigurer.execute(ProjectEvaluationConfigurer.java:21)
15:56:11.957 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.configuration.DefaultBuildConfigurer$1.execute(DefaultBuildConfigurer.java:38)
15:56:11.958 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.configuration.DefaultBuildConfigurer$1.execute(DefaultBuildConfigurer.java:35)
15:56:11.959 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.api.internal.project.AbstractProject.configure(AbstractProject.java:463)
15:56:11.960 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.api.internal.project.AbstractProject.allprojects(AbstractProject.java:458)
15:56:11.961 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.configuration.DefaultBuildConfigurer.configure(DefaultBuildConfigurer.java:35)
15:56:11.962 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.initialization.DefaultGradleLauncher.doBuildStages(DefaultGradleLauncher.java:141)
15:56:11.963 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.initialization.DefaultGradleLauncher.doBuild(DefaultGradleLauncher.java:112)
15:56:11.964 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.initialization.DefaultGradleLauncher.run(DefaultGradleLauncher.java:80)
15:56:11.964 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.launcher.RunBuildAction.execute(RunBuildAction.java:41)
15:56:11.965 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.launcher.RunBuildAction.execute(RunBuildAction.java:27)
15:56:11.966 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.launcher.ExceptionReportingAction.execute(ExceptionReportingAction.java:32)
15:56:11.967 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.launcher.ExceptionReportingAction.execute(ExceptionReportingAction.java:21)
15:56:11.967 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.launcher.CommandLineActionFactory$WithLoggingAction.execute(CommandLineActionFactory.java:219)
15:56:11.968 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.launcher.CommandLineActionFactory$WithLoggingAction.execute(CommandLineActionFactory.java:203)
15:56:11.969 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.launcher.Main.execute(Main.java:55)
15:56:11.970 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.launcher.Main.main(Main.java:40)
15:56:11.970 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.launcher.ProcessBootstrap.runNoExit(ProcessBootstrap.java:46)
15:56:11.971 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.launcher.ProcessBootstrap.run(ProcessBootstrap.java:28)
15:56:11.972 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.launcher.GradleMain.main(GradleMain.java:24)
15:56:11.972 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.wrapper.BootstrapMainStarter.start(BootstrapMainStarter.java:33)
15:56:11.973 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.wrapper.Wrapper.execute(Wrapper.java:87)
15:56:11.974 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.wrapper.GradleWrapperMain.main(GradleWrapperMain.java:37)
15:56:11.975 [ERROR] [org.gradle.BuildExceptionReporter] Caused by: org.gradle.api.internal.MissingMethodException: Could not find method maven() for arguments [build_gr4fdlqlq7963u3c5ul17a57$_run_closure1_closure11@518e322b] on root project 'groovycsv'.
15:56:11.975 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.api.internal.AbstractDynamicObject.methodMissingException(AbstractDynamicObject.java:60)
15:56:11.976 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.api.internal.AbstractDynamicObject.invokeMethod(AbstractDynamicObject.java:56)
15:56:11.977 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.api.internal.CompositeDynamicObject.invokeMethod(CompositeDynamicObject.java:106)
15:56:11.978 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.api.internal.project.DefaultProject_Decorated.invokeMethod(Unknown Source)
15:56:11.978 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.groovy.scripts.BasicScript.methodMissing(BasicScript.groovy:68)
15:56:11.979 [ERROR] [org.gradle.BuildExceptionReporter] at build_gr4fdlqlq7963u3c5ul17a57$_run_closure1.doCall(/opt/softwares/workspace/otherprojects/groovycsv/build.gradle:18)
15:56:11.980 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.util.ConfigureUtil.configure(ConfigureUtil.java:61)
15:56:11.980 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.util.ConfigureUtil.configure(ConfigureUtil.java:31)
15:56:11.981 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.api.internal.project.AbstractProject.repositories(AbstractProject.java:889)
15:56:11.982 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.api.internal.BeanDynamicObject.invokeMethod(BeanDynamicObject.java:158)
15:56:11.983 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.api.internal.CompositeDynamicObject.invokeMethod(CompositeDynamicObject.java:93)
15:56:11.983 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.api.internal.project.DefaultProject_Decorated.invokeMethod(Unknown Source)
15:56:11.984 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.groovy.scripts.BasicScript.methodMissing(BasicScript.groovy:68)
15:56:11.985 [ERROR] [org.gradle.BuildExceptionReporter] at build_gr4fdlqlq7963u3c5ul17a57.run(/opt/softwares/workspace/otherprojects/groovycsv/build.gradle:16)
15:56:11.986 [ERROR] [org.gradle.BuildExceptionReporter] at org.gradle.groovy.scripts.DefaultScriptRunnerFactory$ScriptRunnerImpl.run(DefaultScriptRunnerFactory.java:49)
15:56:11.986 [ERROR] [org.gradle.BuildExceptionReporter] ... 29 more
15:56:11.987 [ERROR] [org.gradle.BuildExceptionReporter]
15:56:11.988 [LIFECYCLE] [org.gradle.BuildResultLogger]
15:56:11.988 [LIFECYCLE] [org.gradle.BuildResultLogger] BUILD FAILED
15:56:11.989 [LIFECYCLE] [org.gradle.BuildResultLogger]
15:56:11.990 [LIFECYCLE] [org.gradle.BuildResultLogger] Total time: 4.084 secs
CsvIterator.hasNext() incorrectly throws IllegalArgumentException when the reader is closed. Fix this so that it just returns false when the reader is closed.
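The requested behaviour can be illustrated with a simplified stand-in. The class below is an assumption made for illustration, not the library's CsvIterator: it iterates over raw lines and treats a closed reader the same as an exhausted one.

```groovy
// Simplified stand-in for CsvIterator (an assumption, not the library's code):
// iterates over lines and treats "closed" the same as "exhausted".
class ClosableLineIterator implements Iterator<String>, Closeable {
    private final BufferedReader reader
    private String pending
    private boolean closed = false

    ClosableLineIterator(Reader source) {
        reader = new BufferedReader(source)
        pending = reader.readLine()
    }

    @Override
    boolean hasNext() {
        // No exception here: a closed iterator simply has nothing left
        !closed && pending != null
    }

    @Override
    String next() {
        if (!hasNext()) throw new NoSuchElementException()
        String current = pending
        pending = reader.readLine()
        current
    }

    @Override
    void close() {
        closed = true
        reader.close()
    }
}

def lines = new ClosableLineIterator(new StringReader('a,b\n1,2'))
assert lines.hasNext()
assert lines.next() == 'a,b'
lines.close()
assert !lines.hasNext()
```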
Hi, I tried to run the software on heroku but it fails with java.lang.NoClassDefFoundError: com.xlson.groovycsv.CsvParser. Any idea how to fix this? thanks for your help and for building this project!
best Sebastian
Hi,
after following the "Build instructions" in your docs to create a JAR file and using it within a Grails application, I get a java.lang.ClassNotFoundException: au.com.bytecode.opencsv.CSVReader.
I built groovycsv-1.0-SNAPSHOT.jar, copied it to the lib folder, ran grails war, and uploaded the WAR file to my server.
Any hint on how to resolve this dependency?
Thanks
Sebastian
There's no call to close() for the underlying reader implementation.
I've been using OpenCSV for awhile now and decided to give GroovyCSV a try. It seems that GroovyCSV currently depends on a pretty old version of OpenCSV from au.com.bytecode. I've tried changing the source to depend on the latest OpenCSV shown in Maven Central which today is 3.7 and from com.opencsv, as follows:
I've changed the imports in the src and test groovy files to
import com.opencsv.CSVReader
and also the build.gradle to
compile 'com.opencsv:opencsv:3.7'
Which gets me a clean build, but when I try to use the .jar file in the build/libs, I still get this error:
java.lang.RuntimeException: java.lang.NoClassDefFoundError: Unable to load class com.xlson.groovycsv.CsvParser due to missing dependency au/com/bytecode/opencsv/CSVReader
Any thoughts on this?
At the moment all of the csv is read into memory when parse() is called. This should be changed to support large files.
In CSV files exported from PayPal, the column names in the first line are separated by ", "
Sample:
Date, Time, Time Zone, Name, Type, Status, Currency, Amount, Receipt ID, Balance,
When groovycsv parses these, the second and subsequent column names contain a leading space.
I expect that the leading space would be stripped, or that there would at least be some option to strip leading whitespace from the column names.
Hi,
the data of the first column of my CSV file cannot be imported; I get a "MissingPropertyException". If I log line.columns.toString() it prints
[place:0, Plm:1, Plw:2, name:3, ...
Looking at the CSV file in an editor does not show these characters. Is there a way to correct this?
thanks, and also great thanks for groovycsv!
best Sebastian
It's hard to understand what readFirstLine is supposed to do. Setting it to true means the first line will be read as csv instead of being used as header.
The parts of the build that handle signing and uploading of signed artifacts are pretty much a hack at this point, and they really need to be replaced with something cleaner as soon as possible.
When the CSV file contains an empty line between some lines, the following exception is thrown:
java.lang.ArrayIndexOutOfBoundsException: 1
at com.xlson.groovycsv.PropertyMapper$_toString_closure1.doCall(PropertyMapper.groovy:67)
at com.xlson.groovycsv.PropertyMapper.toString(PropertyMapper.groovy:67)
at ConsoleScript1.run(ConsoleScript1:7)
something like:
columns.collect{ col -> "$col.key = ${values[col.value]}" }.join(',')
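The suggestion above still indexes into values directly, so a line with fewer values than columns would fail the same way. A guarded variant is sketched below in plain Groovy. It assumes that an empty line is parsed into a single empty value, which matches the index 1 in the exception but is not confirmed by the report. The output format follows the "fruit: apple, qty: 5" style shown elsewhere on this page.

```groovy
// Stand-ins for PropertyMapper's fields; the one-element values array is an
// assumption about what an empty line produces.
def columns = [fruit: 0, qty: 1]
def values = [''] as String[]

// Only index into values when the position exists; otherwise report null
def text = columns.collect { name, index ->
    def value = index < values.length ? values[index] : null
    "$name: $value"
}.join(', ')

assert text == 'fruit: , qty: null'
```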
Suggestion: gcsv
Why?
To stand out and better explain what the project is about.
Choosing \0 as the escape char (i.e. disabling escape chars) is impossible with GroovyCSV 1.0.
The following patch fixes the issue:
diff --git a/src/com/xlson/groovycsv/CsvParser.groovy b/src/com/xlson/groovycsv/CsvParser.groovy
index 4b0c9a3..806087c 100644
--- a/src/com/xlson/groovycsv/CsvParser.groovy
+++ b/src/com/xlson/groovycsv/CsvParser.groovy
@@ -147,7 +147,7 @@ class CsvParser {
quoteChar = args.quoteChar ?: '"'
}
- if(escapeChar) {
+ if(escapeChar != null) {
return new CSVReader(reader, separator, quoteChar, escapeChar)
} else {
return new CSVReader(reader, separator, quoteChar)
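The patch relies on Groovy truth: a character with value 0 evaluates to false, so a bare if(escapeChar) cannot tell "\0 was requested" apart from "nothing was set". A minimal sketch of that behaviour in plain Groovy, independent of the library (variable names are illustrative):

```groovy
Character nul = (char) 0
Character backslash = '\\' as char

// Groovy truth: the NUL character is false, any other character is true
assert !nul
assert backslash

// An explicit null check tells "set to NUL" apart from "not set"
Character unset = null
assert nul != null
assert unset == null
```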
As far as I can see, this is terribly designed, because you can't get access to the column names until you've actually read the first line of real data. The column names are private in CsvIterator, which makes them impossible to get to... but even if you could get to them, it would still be awful. The CsvParser should take as arguments the things that relate to the whole file, and the CsvIterator should limit its domain to things relating to each line of data. Like this:
CsvParser p = new CsvParser(inputStreamReader, separator: ',')
for (String n in p.columns) {
    println(n)
}
for (line in p.parse()) {
    // process line here.
}
I used the CsvParser this way :
new CsvParser().parse(csv, separator: ',', quoteChar: '"')
With following content :
"Key","Description"
"XXX-1","Blah blah blah ""A title"" and ""C:\A Windows\like\path\"" and bleh bleh bleh"
"XXX-2","Hello world"
As a result, I get only 1 row (instead of 2), and all the data of row 2 ends up inside the Description field of row 1.
There should be new static methods, probably CsvParser.parseCsv(), that could be used together with static imports for a nicer API experience.
Structure should probably be something like /latest/docs/index.html
In some applications you might not be sure what columns you are going to get. Therefore it would be good if PropertyMapper.getProperty(name) optionally did not throw an exception, but rather returned null.
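A minimal sketch of the lenient behaviour, using Groovy's propertyMissing hook on a stand-in class. LenientRow is hypothetical and not part of GroovyCSV; it only assumes a name-to-index map and a list of values.

```groovy
// Hypothetical stand-in for a row object, not GroovyCSV's PropertyMapper:
// unknown column names resolve to null instead of throwing.
class LenientRow {
    Map<String, Integer> columns
    List<String> values

    // Called by Groovy when a property that is not declared on the class is read
    def propertyMissing(String name) {
        Integer index = columns[name]
        (index != null && index < values.size()) ? values[index] : null
    }
}

def row = new LenientRow(columns: [fruit: 0, qty: 1], values: ['apple', '5'])
assert row.fruit == 'apple'
assert row.colour == null
```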
Arguments sent to parse should be validated so that an error is given when an argument that is not supported is used.
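One way to do this is sketched below in plain Groovy. The helper validateArgs is hypothetical, and the list of supported names only contains arguments mentioned on this page; the library's real list may differ.

```groovy
// Hypothetical helper, not GroovyCSV API: reject argument names that are
// not in the supported list.
void validateArgs(Map args, Collection<String> supported) {
    def unknown = args.keySet() - supported
    if (unknown) {
        throw new IllegalArgumentException("Unsupported argument(s): ${unknown.join(', ')}")
    }
}

// Assumption: only the argument names that appear on this page
def supported = ['separator', 'quoteChar', 'escapeChar', 'autoDetect', 'readFirstLine', 'skipLines']

validateArgs([separator: ';', skipLines: 2], supported)   // passes silently

def message = null
try {
    validateArgs([seperator: ';'], supported)   // note the misspelled key
} catch (IllegalArgumentException e) {
    message = e.message
}
assert message == 'Unsupported argument(s): seperator'
```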