vedo-datascience-toolkit's Introduction

DataSift VEDO Data Science Toolkit

This repository contains tools that help you get the most out of DataSift VEDO. When you work with the DataSift platform, VEDO lets you add custom metadata to unstructured data, making it much easier to process and act on.

DataSift is the leading platform for giving unified, compliant access to data from social networks and news providers. To learn more, please visit our website http://datasift.com.

Linear Classifier Generator

The linear classifier library helps you generate machine-learned classifiers to run on VEDO.

vedo-datascience-toolkit's People

Contributors

quipo

vedo-datascience-toolkit's Issues

Write CSDL to output file

Please change the script so that:

  • Progress is written to standard output, so the user can see how far the run has got
  • The script takes in an argument for a file to write the CSDL to
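The two requests above could be sketched roughly as follows. This is only an illustration, not the toolkit's actual script: the `--output` flag and the names `parse_args`, `report_progress`, `write_csdl` and `main` are all assumptions made for this example, and the loop stands in for the real training steps.

```python
import argparse
import sys


def parse_args(argv=None):
    """Parse the command line; --output is a hypothetical flag name."""
    parser = argparse.ArgumentParser(description="Generate CSDL for a classifier")
    parser.add_argument("--output", "-o", required=True,
                        help="file to write the generated CSDL to")
    return parser.parse_args(argv)


def report_progress(done, total, stream=None):
    """Write a one-line progress message, to standard output by default."""
    stream = sys.stdout if stream is None else stream
    stream.write("Processed {}/{} interactions\n".format(done, total))
    stream.flush()


def write_csdl(csdl, path):
    """Write the generated CSDL text to the requested file."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(csdl)


def main(argv=None):
    args = parse_args(argv)
    total = 3  # placeholder for the real number of training steps
    for done in range(1, total + 1):
        report_progress(done, total)
    # Placeholder text; the real script would pass its generated CSDL here.
    write_csdl("placeholder for generated CSDL\n", args.output)
```

Calling `main(["--output", "classifier.csdl"])` would print three progress lines and create the file. Keeping progress on standard output and the CSDL in its own file means the two never get mixed together.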

New potential features

When I was classifying data, I noticed a correlation between certain categories and the number of links or mentions in a given interaction.

It would be nice to get some features implemented that focus on the number of links or mentions present in an interaction if this is at all possible.
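As a rough illustration of what such features might compute, the sketch below counts links and mentions in the text of an interaction. It does not use the toolkit's feature classes. The function name `count_link_and_mention_features`, the output keys, and both regular expressions are assumptions for this example, and real link or mention detection may need to be stricter.

```python
import re

# Assumed patterns: anything starting with http:// or https:// counts as a link,
# and an @ followed by word characters (not preceded by one) counts as a mention.
URL_PATTERN = re.compile(r"https?://\S+")
MENTION_PATTERN = re.compile(r"(?<!\w)@\w+")


def count_link_and_mention_features(content):
    """Return hypothetical numeric features for one interaction's text."""
    return {
        "link_count": len(URL_PATTERN.findall(content)),
        "mention_count": len(MENTION_PATTERN.findall(content)),
    }


features = count_link_and_mention_features(
    "Thanks @alice and @bob, details at http://datasift.com"
)
print(features)  # {'link_count': 1, 'mention_count': 2}
```

The negative lookbehind keeps email addresses such as `user@example.com` from being counted as mentions.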

Make it more intuitive to add apriori words to config base

At the moment the apriori_features method contains an empty set. This set must contain a number of feature objects or the classifier script will fail.

My suggestion would be either to make it more explicit in the documentation/comments that this set requires feature objects, or, if strings are passed in to the set, to create them as PunctuatedWordFeature features with a default path of 'interaction.content'.
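The second option could look something like the sketch below. The `PunctuatedWordFeature` class here is a stand-in defined only for this example; the real class's constructor signature is not shown in this issue, so the `(word, path)` arguments are an assumption, as is the helper name `coerce_apriori_features`.

```python
DEFAULT_PATH = "interaction.content"


class PunctuatedWordFeature:
    """Stand-in for the toolkit's class; the (word, path) signature is assumed."""

    def __init__(self, word, path):
        self.word = word
        self.path = path


def coerce_apriori_features(items, default_path=DEFAULT_PATH):
    """Turn plain strings into feature objects and pass other items through."""
    features = []
    for item in items:
        if isinstance(item, str):
            features.append(PunctuatedWordFeature(item, default_path))
        else:
            features.append(item)
    if not features:
        # Fail early with a clear message instead of a later, obscure error.
        raise ValueError("apriori features must contain at least one feature")
    return features
```

For example, `coerce_apriori_features(["great", "awful"])` would return two feature objects whose path is 'interaction.content'. The helper returns a list to keep the order predictable; wrap the result in `set()` if a set is required.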
