hdfs's Introduction

Package

Go bindings for libhdfs, for manipulating files on the Hadoop Distributed File System (HDFS).

Types

  • hdfs.Fs: file system handle
  • hdfs.File: file handle
  • hdfs.FileInfo: file metadata structure, represented within Go

Methods

see go doc

Usage

Prerequisites

  • JVM
  • HDFS: the libhdfs C library and the Hadoop Java binary packages (.jar files)
  • HDFS: a configured cluster

Tips for building libhdfs on OS X

Based on hadoop-1.0.4.

  1. change <error.h> to <err.h> in src/c++/libhdfs/hdfsJniHelper.c
  2. change md5sum to md5 in src/saveVersion.sh
  3. run chmod +x src/c++/libhdfs/install-sh
  4. run ant -Dcompile.c++=true -Dlibhdfs=true compile-c++-libhdfs to build libhdfs.
  5. After a successful build, the libraries are installed in build/c++/Mac_OS_X-x86_64-64/lib. The Makefile in build/c++-build/Mac_OS_X-x86_64-64/libhdfs is useful for later recompilation. It is fine if the build ends with installation errors, as long as the compiled libraries are already present in build/c++-build/Mac_OS_X-x86_64-64/libhdfs/.libs or a similar directory.
  6. change install_name for libhdfs: install_name_tool -id /usr/lib/java/libhdfs.0.0.0.dylib libhdfs.0.0.0.dylib
  7. put libhdfs*.dylib in /usr/lib/java

Tips for building libhadoop on OS X

Based on hadoop-1.0.1. libhadoop is loaded by util.NativeCodeLoader when accessing the local file system.

  1. java: change -ljvm to -framework JavaVM in both Makefile.am and Makefile.in

  2. libz: apply patch to acinclude.m4:

     elif test ! -z "`which otool | grep -v 'no otool'`"; then
         ac_cv_libname_$1=\"`otool -L conftest | grep $1 | sed -e 's/^[  ]*//' -e 's/ .*//' -e 's/.*\/\(.*\)$/\1/'`\";
    

    and configure:

     elif test ! -z "`which otool | grep -v 'no otool'`"; then
         ac_cv_libname_z=\"`otool -L conftest | grep z | sed -e 's/^  *//' -e 's/ .*//' -e 's/.*\/\(.*\)$/\1/'`\";
    
  3. apply patch to source code src/org/apache/hadoop/security/JniBasedUnixGroupsNetgroupMapping.c.

  4. run ant compile-native

  5. put the compiled library libhadoop.1.0.0.dylib and its symbolic links in /usr/lib/java, which is one of the default elements of java.library.path.

  6. change install_name for libhadoop: sudo install_name_tool -id /usr/lib/java/libhadoop.1.dylib /usr/lib/java/libhadoop.1.0.0.dylib

Prepare

  1. put the .jar files from Hadoop in .libs/javalibs and conf/ in .libs. See mktest.sh for details, or modify it to suit your environment.

  2. set LD_LIBRARY_PATH for Linux:

     export LD_LIBRARY_PATH=./lib:/opt/jdk/jre/lib/amd64/server
    

    Make sure the directories containing libhdfs.so and libjvm are listed in LD_LIBRARY_PATH.

    You don't have to do this on OS X: you can use install_name_tool to set or change a library's install name, and the JVM on OS X is a system framework, so adding its path is not necessary. On OS X, the only thing step 3 requires is providing the path to the hdfs.h header for #cgo.

  3. correct the #cgo header in hdfs.go according to your environment.

Test

  • After the preparation, correct the constants in hdfs_test.go.
  • run ./mktest.sh.

Known Issues

  1. Previously, connecting to the local file system was not handled correctly, so Connect("", 0) would lead to an error. Accessing the local file system now works.
  2. errno in libhdfs is not set precisely. For example, invokeMethod() probably sets errno to 2 in many routines.
