cardozogp / gdata-storagehandler

This project forked from balshor/gdata-storagehandler

A Hive StorageHandler that uses a Google Spreadsheet as a backend.

License: Other

gdata-storagehandler

This project implements a HiveStorageHandler that allows Hive to read data from and write data to a Google spreadsheet.

Although Hive and Hadoop are geared towards processing big data, this storage handler implementation targets "small data". The original use case was writing around a dozen lines of data, the final output of a report, into a Google spreadsheet.

Because of this small-data orientation, tables backed by this StorageHandler should be read from or written to by only a single mapper or reducer. Using multiple mappers or reducers can result in duplicate data being read from or written to the spreadsheet.

Some other notes:

Sample usage:

add jar gdata-storagehandler.jar ;

create external table output(day string, cnt int, source_class string, source_method string, thrown_class string)
stored by 'com.bizo.hive.gdata.GDataStorageHandler'
with serdeproperties (
  "gdata.user" = "[email protected]",
  "gdata.consumer.key" = "bizo.com",
  "gdata.consumer.secret" = "...",
  "gdata.spreadsheet.name" = "Daily Exception Summary",
  "gdata.worksheet.name" = "First Worksheet",
  "gdata.columns.mapping" = "day,count,class,method,thrown"
)
;
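As an illustration of the single-reducer recommendation above, here is a minimal sketch of writing a small report into the `output` table. The source table `exception_log` and its columns are hypothetical and not part of this project. `mapred.reduce.tasks` is the standard Hadoop/Hive setting for the reducer count (newer Hadoop versions call it `mapreduce.job.reduces`). The `group by` gives the query a reduce stage, so the write should happen in that single reducer; whether `insert overwrite` replaces or appends rows in the spreadsheet depends on this handler and is not confirmed here.

```sql
-- Sketch only: exception_log is a hypothetical source table.
-- Ask for a single reducer so that only one task writes to the spreadsheet.
set mapred.reduce.tasks=1;

-- The group by adds a reduce stage; with one reducer, one task performs the write.
insert overwrite table output
select
  day,
  cast(count(*) as int) as cnt,   -- count(*) is bigint; the table column is int
  source_class,
  source_method,
  thrown_class
from exception_log
group by day, source_class, source_method, thrown_class;
```

A map-only query (one with no aggregation, join, or sort) has no reduce stage, so the reducer setting alone would not limit the number of writers in that case.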

If you are using Amazon Elastic MapReduce, you can add the jar file as follows:

add jar s3://com-bizo-public/hive/storagehandler/gdata-storagehandler-0.1.jar ;
