Comments (6)

patelliandrea avatar patelliandrea commented on June 11, 2024

Another question...
If one file is huge, e.g. millions of lines, and I'm importing it using a batch size of, let's say, 1000, what happens if the connector dies before the entire file was processed?
It seems that when I restart the connector, it reads the entire file again, starting from the beginning, without skipping the lines that were already sent to Kafka.

from kafka-connect-spooldir.

jcustenborder avatar jcustenborder commented on June 11, 2024

You're on it. It's been missing offset management for a while. I'm working on a few changes and refactors to cover this and some better schema handling. A new release will be coming soon.
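As an illustration of the offset management discussed here, the following is a minimal sketch in plain Java, not the connector's actual code. It assumes the number of lines already sent is recorded per file and skipped on restart. The class `OffsetResumeSketch`, the method `readBatch`, and the in-memory `offsets` map are hypothetical stand-ins; in Kafka Connect itself, a source task would normally attach a source offset to each `SourceRecord` and read it back through the `OffsetStorageReader` when it starts.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: resume reading a file from a recorded line offset.
final class OffsetResumeSketch {

    // Reads up to batchSize lines, skipping the lines already recorded in offsets.
    // Each call reopens the content from the start, which simulates a restart.
    // The offsets map stands in for durable offset storage (an assumption).
    static List<String> readBatch(String fileName, String content,
                                  int batchSize, Map<String, Long> offsets) {
        long alreadySent = offsets.getOrDefault(fileName, 0L);
        List<String> batch = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(content))) {
            // Skip the lines that were sent before the restart.
            for (long i = 0; i < alreadySent; i++) {
                if (reader.readLine() == null) {
                    return batch; // file is shorter than the recorded offset
                }
            }
            String line;
            while (batch.size() < batchSize && (line = reader.readLine()) != null) {
                batch.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Record progress only after the batch has been assembled.
        offsets.put(fileName, alreadySent + batch.size());
        return batch;
    }

    public static void main(String[] args) {
        Map<String, Long> offsets = new HashMap<>();
        String content = "line1\nline2\nline3\nline4\nline5";
        System.out.println(readBatch("data.csv", content, 2, offsets));
        // Simulated restart: the same file is reopened, but the first two lines are skipped.
        System.out.println(readBatch("data.csv", content, 2, offsets));
    }
}
```

In this sketch the offset is a line count, so resuming costs a re-read of the skipped lines; a byte position would avoid that, at the price of more careful handling of line boundaries.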

patelliandrea avatar patelliandrea commented on June 11, 2024

Perfect, thanks. I'm working on it as well... I managed to implement the offset management, but I'm struggling to implement the cleanup of the folder in a nice way.

gbehrmann avatar gbehrmann commented on June 11, 2024

Any updates on this subject? I am investigating ways to import from an application that continually appends data to a CSV file, and your connector seems like a perfect match (the application will let go of the file when it is renamed, so the automatic renaming to a .PROCESSING file is perfect). However, the lack of resuming from the recorded offset is a showstopper for us.

jcustenborder avatar jcustenborder commented on June 11, 2024

@gbehrmann Sorry, I missed your comment. I've had this pull request out there but haven't received much feedback: #17

jcustenborder avatar jcustenborder commented on June 11, 2024

This was fixed with #17.
