
reading CSV about cntk HOT 6 CLOSED

microsoft avatar microsoft commented on May 8, 2024
reading CSV

from cntk.

Comments (6)

dongyu888 avatar dongyu888 commented on May 8, 2024

Yes, you should use UCIFastReader.

For testing you do need labels to compute the error rates. If what you want is to output the value of an output node so that you can compare it with labels you should use the “write” command instead of “eval”, in which case I think the label column does not need to be there.
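As a rough sketch of that suggestion, a "write" command block might look like the one below. The block name, the test file name, and the output path are hypothetical, and leaving out the labels section rests on the assumption stated above that "write" does not need it; the model to load is assumed to be configured elsewhere in the file.

```
Simple2_Demo_Output = [
    action = "write"
    # Reader with features only (assumes labels are not required for "write")
    reader = [
        readerType = "UCIFastReader"
        # Hypothetical test file without a label column
        file = "$DataDir$/iris_test.txt"
        features = [
            dim = 4
            start = 0
        ]
    ]
    # Hypothetical location for the text dump of the output node values
    outputPath = "$OutputDir$/iris_output"
]
```

The written values can then be compared with the labels outside of CNTK.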

Thanks,

Dong Yu (俞栋)

From: pari [mailto:[email protected]]
Sent: Friday, January 29, 2016 5:32 AM
To: Microsoft/CNTK [email protected]
Subject: [CNTK] reading CSV (#57)

Hello All,

  • For reading CSV files, should I use UCIFastReader?
  • If my test dataset doesn't have labels, what kind of configuration should I apply? Should I remove the labels section from the test config? I did that and the compiler stopped working.




parvanesh avatar parvanesh commented on May 8, 2024

Thanks for the response. For the CSV reading, when I provide UCIFastReader as the reader with a comma-separated file, the compiler gives the error "EXCEPTION occurred: label found in data not specified in label mapping file:XXXXXX", where XXXXXX is the first row of my CSV file.
When I replace the comma-separated file with a tab-separated file, it works. Is this a problem, or do I need to specify another parameter? I use it as:

Simple2_Demo_Test = [
    action = "test"
    # Parameter values for the reader
    reader = [
        readerType = "UCIFastReader"
        file = "$DataDir$/iris.txt"
        features = [
            dim = 4
            start = 0
        ]
        labels = [
            start = 4
            dim = 1
            labelDim = 3
            labelMappingFile = "$DataDir$/SimpleMapping2.txt"
        ]
    ]
]


dongyu888 avatar dongyu888 commented on May 8, 2024

Currently we do not support comma delimited files, but it would be very simple to do so. Just look for the state transition tables in UCIParser.cpp:

SetState(',', Whitespace, Whitespace);
SetState(' ', Whitespace, Whitespace);
SetState('\t', Whitespace, Whitespace);
SetState('\r', Whitespace, Whitespace);

There are several places where ' ', '\t', and '\r' appear in the tables (near the top of the file). Just add a similar ',' entry and it should all work.

Note that this would not handle empty fields (two commas in a row, with or without whitespace in between), but if that is not an issue, then it would just work.
Also note that commas would then not be allowed inside labels or other strings (e.g. as the decimal separator used in some locales).

We will add "," and ";" as white spaces as a fix.


parvanesh avatar parvanesh commented on May 8, 2024

Thanks. I was wondering whether there is a direct way to do it instead of converting the file, so it is good that it will be supported. Thanks!


dongyu888 avatar dongyu888 commented on May 8, 2024

The change is now checked in. You can specify

customDelimiter=","

inside the UCIFastReader block in the config file. Please note that it requires values between commas. In other words, it would not handle empty fields (two commas in a row, with or without whitespace in-between).
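As a rough illustration, the test configuration posted earlier in this thread might look like this with the option added. The file name iris.csv is a hypothetical comma-separated copy of the data, and the exact position of customDelimiter within the reader block is an assumption based on the description above; everything else is copied from the earlier configuration.

```
Simple2_Demo_Test = [
    action = "test"
    reader = [
        readerType = "UCIFastReader"
        # Hypothetical comma-separated version of the data file
        file = "$DataDir$/iris.csv"
        # Treat commas as field separators
        customDelimiter = ","
        features = [
            dim = 4
            start = 0
        ]
        labels = [
            start = 4
            dim = 1
            labelDim = 3
            labelMappingFile = "$DataDir$/SimpleMapping2.txt"
        ]
    ]
]
```

With this setup every row still needs a value in every column, for example 5.1,3.5,1.4,0.2,0 and not 5.1,,1.4,0.2,0.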


parvanesh avatar parvanesh commented on May 8, 2024

Thanks!

