fagan2888 / autofillmissingdata

This project forked from dennisfrancis/autofillmissingdata

A LibreOffice Calc extension that fills missing data using machine learning techniques

License: GNU General Public License v3.0

Makefile 13.41% C++ 86.59%

autofillmissingdata's Introduction

AutoFillMissingData

A LibreOffice Calc extension that fills missing data (both continuous and categorical) using machine learning techniques.

To install the prebuilt extension, open the Extension Manager in LibreOffice and browse to the file AutoFillMissingData.oxt in this repo. Alternatively, you can install the extension from a terminal:

$ git clone https://github.com/dennisfrancis/AutoFillMissingData.git
$ cd AutoFillMissingData
$ unopkg install ./AutoFillMissingData.oxt

To use this extension on data in a LibreOffice sheet, place the cursor on any cell inside your data table (there is no need to select the whole table), then open the Missing data menu and click Fill missing data.

This project is currently in alpha and under heavy development. The full source code is available under the GPLv3 license. The entire project was written from scratch and does not depend on any machine learning or linear algebra library. If you are interested in learning how to build LibreOffice extensions like this one, I have a blog on the subject at https://niocs.github.io/LOBook/extensions/index.html

At present, the extension uses a variation of the kNN (k-nearest neighbours) instance-based regression and classification algorithms to predict missing data. It automatically tunes the parameter k (the number of nearest neighbours) on a validation set to reduce overfitting. The ability to tune the algorithm's parameters via dialog boxes is coming soon.
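
As a rough illustration of tuning k on a validation set, here is a minimal C++ sketch of plain kNN regression. It is not the extension's actual code: the extension uses its own variation of kNN, and the names Sample, squaredDistance, knnPredict and tuneK, the squared Euclidean distance, and the mean squared error criterion are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical row type: feature values plus the known target value.
struct Sample {
    std::vector<double> features;
    double target;
};

// Squared Euclidean distance between two feature vectors of equal length.
double squaredDistance(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Predict the target for `query` as the mean target of its k nearest training rows.
double knnPredict(const std::vector<Sample>& train,
                  const std::vector<double>& query, std::size_t k) {
    assert(!train.empty() && k >= 1);
    k = std::min(k, train.size());
    std::vector<std::pair<double, double>> byDistance;  // (distance, target)
    byDistance.reserve(train.size());
    for (const Sample& s : train)
        byDistance.emplace_back(squaredDistance(s.features, query), s.target);
    // Only the k smallest distances need to be in sorted order.
    std::partial_sort(byDistance.begin(),
                      byDistance.begin() + static_cast<std::ptrdiff_t>(k),
                      byDistance.end());
    double sum = 0.0;
    for (std::size_t i = 0; i < k; ++i) sum += byDistance[i].second;
    return sum / static_cast<double>(k);
}

// Try k = 1..maxK and return the k with the lowest squared error on the
// validation rows (assumed criterion; ties keep the smaller k).
std::size_t tuneK(const std::vector<Sample>& train,
                  const std::vector<Sample>& validation, std::size_t maxK) {
    assert(!validation.empty() && maxK >= 1);
    std::size_t bestK = 1;
    double bestError = std::numeric_limits<double>::infinity();
    for (std::size_t k = 1; k <= maxK; ++k) {
        double error = 0.0;
        for (const Sample& v : validation) {
            const double diff = knnPredict(train, v.features, k) - v.target;
            error += diff * diff;
        }
        if (error < bestError) {
            bestError = error;
            bestK = k;
        }
    }
    return bestK;
}
```

In this sketch, the rows with a known value in the target column would be split into a training part and a validation part; tuneK picks k using the validation part, and knnPredict then fills each blank cell using the chosen k. Classification would replace the mean with a majority vote among the k neighbours.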

A major limitation is that the prebuilt extension (.oxt file) supports only modern 64-bit GNU/Linux systems comparable to Fedora 24; support for MS Windows and Mac OS X is planned. Another caveat is that the extension needs at least 10 non-blank samples (rows) per feature (column) in the table to work.
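
The minimum-sample requirement can be pictured as a simple check over the columns of the table. The following C++ sketch is only an assumption about how such a check might look; representing a blank cell as NaN and the names Column, kBlank, countNonBlank and hasEnoughSamples are hypothetical and not taken from the extension's source.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical column type: one numeric value per row, NaN marks a blank cell.
using Column = std::vector<double>;

// Marker used for blank cells in this sketch.
const double kBlank = std::numeric_limits<double>::quiet_NaN();

// Count the cells in a column that actually hold a value.
std::size_t countNonBlank(const Column& column) {
    std::size_t count = 0;
    for (double cell : column)
        if (!std::isnan(cell)) ++count;
    return count;
}

// True when every column has at least `minSamples` non-blank cells.
bool hasEnoughSamples(const std::vector<Column>& table, std::size_t minSamples = 10) {
    assert(minSamples >= 1);
    for (const Column& column : table)
        if (countNonBlank(column) < minSamples) return false;
    return true;
}
```

A table in which any column has fewer than 10 filled cells would fail this check, which corresponds to the case where the extension cannot work.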

Building the extension from source

First, set up the LibreOffice SDK by following the instructions at http://api.libreoffice.org/docs/install.html. Then run:

$ cd AutoFillMissingData
$ make

If you get compilation errors, check that the SDK's environment variables are set correctly after setting up the SDK. If that does not solve the problem, open an issue here. If you are compiling on a GNU/Linux platform and used the standard defaults while setting up the SDK, the .oxt file can be found in /home/$username/libreoffice5.4_sdk/LINUXexample.out/bin/

Pull requests are always welcome. Happy hacking!

Contributors

dennisfrancis
