
spring-hadoop-samples's Introduction

spring-hadoop-samples is no longer actively maintained by VMware, Inc.

Sample Applications for Spring for Apache Hadoop

This repository contains several sample applications that show how you can use Spring for Apache Hadoop.

Note
These samples are built using version 2.2.0.RELEASE of the Spring for Apache Hadoop project. For examples built against older versions, check out the Git tag that corresponds to your desired version.

Overview of Spring for Apache Hadoop

Hadoop has a poor out-of-the-box programming model: writing applications for Hadoop generally turns into a collection of scripts that call Hadoop command-line applications. Spring for Apache Hadoop provides a consistent programming model and declarative configuration model for developing Hadoop applications.

Together with Spring Integration and Spring Batch, Spring for Apache Hadoop can be used to address a wide range of use cases:

  • HDFS data access and scripting

  • Data Analysis

    • MapReduce

    • Pig

    • Hive

  • Workflow

  • Data collection and ingestion

  • Event Streams processing

Features

  • Declarative configuration to create, configure, and parameterize Hadoop connectivity and all job types (MR/Streaming MR/Pig/Hive/Cascading)

  • Simplified HDFS API with added support for JVM scripting languages

  • Runner classes for MR/Pig/Hive/Cascading for small workflows consisting of the following steps: HDFS operations → data analysis → HDFS operations

  • Helper “Template” classes for Pig/Hive/HBase (a usage sketch follows this list)

    • Execute scripts and queries without worrying about resource management, exception handling, or exception translation

    • Thread-safety

  • Lightweight Object-Mapping for HBase

  • Hadoop components for Spring Integration and Spring Batch

    • Spring Batch tasklets for HDFS and data analysis

    • Spring Batch HDFS ItemWriters

    • Spring Integration HDFS channel adapters
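
For a concrete picture of the template feature, here is a minimal, illustrative sketch of application code against FsShell and HiveTemplate. It assumes an application-context.xml that defines the default hadoopConfiguration bean and a HiveTemplate bean, as the samples in this repository do; the paths and the query are placeholders:

    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.springframework.context.support.ClassPathXmlApplicationContext;
    import org.springframework.data.hadoop.fs.FsShell;
    import org.springframework.data.hadoop.hive.HiveTemplate;

    public class FeatureSketch {
        public static void main(String[] args) {
            ClassPathXmlApplicationContext ctx =
                    new ClassPathXmlApplicationContext("application-context.xml");
            try {
                // Scripted HDFS access without dropping down to the raw FileSystem API
                Configuration conf = ctx.getBean("hadoopConfiguration", Configuration.class);
                FsShell shell = new FsShell(conf);
                shell.mkdir("/demo/input");
                shell.copyFromLocal("data/input.txt", "/demo/input");

                // Hive queries through the thread-safe HiveTemplate; Hive exceptions
                // are translated into Spring's DataAccessException hierarchy
                HiveTemplate hive = ctx.getBean(HiveTemplate.class);
                List<String> tables = hive.query("show tables");
                for (String table : tables) {
                    System.out.println(table);
                }
            } finally {
                ctx.close();
            }
        }
    }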

Additional Resources

Many of the samples were taken from the O’Reilly book Spring Data. Using the book as a companion to the samples is quite helpful for understanding the samples and the full feature set of what can be done using Spring technologies and Hadoop.

The main web site for Spring for Apache Hadoop

spring-hadoop-samples's People

Contributors

jvalkeal, trevormarshall, trisberg

spring-hadoop-samples's Issues

Mapreduce Example Fails

I have been trying to get the MapReduce example to run against a cluster with the following version:

Hadoop 2.2.0.2.0.6.0-101

I built the example with the following command:

mvn clean package -Phadoop22

The first error I encountered was seen on the logs for the Map task:

java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.v2.app.MRAppMaster

To get around this error, I added the following config item to the application-context.xml for the config object:

yarn.application.classpath=$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*

Now the job gets submitted and starts to execute properly; however, it only succeeds if the task is executed on the same node that the Resource Manager is running on. It is a 3-node cluster, where Node1 runs the Resource Manager. When any of the jobs get submitted to Node2 or Node3, they fail with (repeating):

2014-02-06 12:05:52,135 INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)

If I run a sample map/reduce job outside of Spring Hadoop, it executes as expected on any of the nodes, so I don't think it is a problem with the setup. It seems like the Spring Hadoop libraries are picking up a setting that makes the task think the Resource Manager is installed on the local node.

Please let me know if you have any suggestions.
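
A note for later readers: 0.0.0.0:8030 is Hadoop's built-in default for yarn.resourcemanager.scheduler.address, so a retry loop against that address usually means the application master never received the property. A hypothetical check against the client-side configuration (the bean name hadoopConfiguration is the Spring for Apache Hadoop default; everything else here is illustrative):

    import org.apache.hadoop.conf.Configuration;
    import org.springframework.context.support.ClassPathXmlApplicationContext;

    public class SchedulerAddressCheck {
        public static void main(String[] args) {
            ClassPathXmlApplicationContext ctx =
                    new ClassPathXmlApplicationContext("application-context.xml");
            Configuration conf = ctx.getBean("hadoopConfiguration", Configuration.class);
            // If this prints null or the 0.0.0.0:8030 default, containers on Node2
            // and Node3 have no way to reach the scheduler; setting the property
            // explicitly (e.g. yarn.resourcemanager.scheduler.address=node1:8030,
            // next to yarn.resourcemanager.address) may resolve the retries.
            System.out.println(conf.get("yarn.resourcemanager.scheduler.address"));
            ctx.close();
        }
    }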

spring-yarn examples not finding / using hadoop classpath

I'm trying to run the yarn examples. I tried both simple-command and batch-files on a Hortonworks HDP-2.1 multi-node (non-secured) cluster.

The job submits fine, but it fails with:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory
    at org.springframework.yarn.launch.AbstractCommandLineRunner.<clinit>(AbstractCommandLineRunner.java:60)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.logging.LogFactory
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 1 more

The only modifications to the application-context.xml and appmaster-context.xml were to edit the paths to where I copied the jars that get built with gradlew. For example, here is (part of) the simple-master application-context.xml:

    <yarn:configuration>
            fs.defaultFS=${hd.fs}
            yarn.resourcemanager.address=${hd.rm}
            fs.hdfs.impl=org.apache.hadoop.hdfs.DistributedFileSystem
    </yarn:configuration>

    <yarn:localresources>
            <yarn:hdfs path="/user/u070072/spring-yarn/app/simple-command/*.jar"/>
            <yarn:hdfs path="/user/u070072/spring-yarn/lib/*"/>
    </yarn:localresources>

    <yarn:environment>
            <yarn:classpath use-yarn-app-classpath="true"/>
    </yarn:environment>

    <util:properties id="arguments">
            <prop key="container-count">4</prop>
    </util:properties>

    <yarn:client app-name="simple-command">
            <yarn:master-runner arguments="arguments"/>
    </yarn:client>

and the appmaster-context.xml:

    <yarn:configuration>
            fs.defaultFS=${hd.fs}
            yarn.resourcemanager.address=${hd.rm}
            fs.hdfs.impl=org.apache.hadoop.hdfs.DistributedFileSystem
    </yarn:configuration>

    <yarn:localresources>
            <yarn:hdfs path="/user/u070072/spring-yarn/app/simple-command/*.jar"/>
            <yarn:hdfs path="/user/u070072/spring-yarn/lib/*"/>
    </yarn:localresources>

    <yarn:environment>
            <yarn:classpath use-yarn-app-classpath="true" delimiter=":">
                    ./*
            </yarn:classpath>
    </yarn:environment>

    <yarn:master>
            <yarn:container-allocator/>
            <yarn:container-command>
                    <![CDATA[
                    date
                    1><LOG_DIR>/Container.stdout
                    2><LOG_DIR>/Container.stderr
                    ]]>
            </yarn:container-command>
    </yarn:master>

I invoked it with:

$ ./gradlew -q run-yarn-examples-simple-command -Dhd.fs=hdfs://trvlapp0049:8020 \
-Dhd.rm=http://trvlapp0050.tsh.thomson.com:8050 -Dlocalresources.remote=hdfs://trvlapp0049:8020

The apache-commons-logging jar it wants is in /usr/lib/hadoop/lib:

u070072@TST yarn$ ls -1 /usr/lib/hadoop/lib/ | grep commons-logging
commons-logging-1.1.3.jar

and that location is in the standard hadoop classpath on the HDP platform:

$ hadoop classpath
/etc/hadoop/conf:/usr/lib/hadoop/lib/*:/usr/lib/hadoop/.//*:/usr/lib/hadoop-hdfs/./:/usr/lib/hadoop-hdfs/lib/*:/usr/lib/hadoop-hdfs/.//*:/usr/lib/hadoop-yarn/lib/*:/usr/lib/hadoop-yarn/.//*:/usr/lib/hadoop-mapreduce/lib/*:/usr/lib/hadoop-mapreduce/.//*::/usr/share/java/mysql-connector-java-5.1.17.jar:/usr/share/java/mysql-connector-java.jar:/usr/lib/hadoop-mapreduce/*:/usr/lib/tez/*:/usr/lib/tez/lib/*:/etc/tez/conf

So why isn't the spring-yarn setup finding the commons-logging jar? I've run other YARN apps (not with spring-yarn) and everything works fine.

I need a SpringBatch Hive 13, HiveServer2 sample

I posted a message earlier on this board and it got deleted, so I am posting again.

I downloaded the samples and modified some of them for Hive 13, Hadoop 2.4.1 (Hortonworks 2.1).
Please take a look and see if you can help in spotting the problem. Or do you have a project that works for Hive 13 on HiveServer2?

Thanks for any help.

I get the following error

[sagar@devsagar hive]$ sh ./target/appassembler/bin/hiveApp
00:47:15,902  INFO t.support.ClassPathXmlApplicationContext: 513 - Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@5ab2f56: startup date [Wed Aug 20 00:47:15 CEST 2014]; root of context hierarchy
00:47:16,057  INFO eans.factory.xml.XmlBeanDefinitionReader: 316 - Loading XML bean definitions from class path resource [META-INF/spring/hive-context.xml]
00:47:16,449  INFO eans.factory.xml.XmlBeanDefinitionReader: 316 - Loading XML bean definitions from class path resource [META-INF/spring/jdbc-context.xml]
00:47:16,889  INFO ort.PropertySourcesPlaceholderConfigurer: 172 - Loading properties file from class path resource [hadoop.properties]
00:47:16,890  INFO ort.PropertySourcesPlaceholderConfigurer: 172 - Loading properties file from class path resource [hive.properties]
00:47:16,898  INFO ion.AutowiredAnnotationBeanPostProcessor: 141 - JSR-330 'javax.inject.Inject' annotation found and supported for autowiring
00:47:17,433  INFO ans.factory.config.PropertiesFactoryBean: 172 - Loading properties file from class path resource [hive-server.properties]
Exception in thread "main" org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'hiveServer': Invocation of init method failed; nested exception is org.apache.thrift.transport.TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:10000.
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1553)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:539)

Issue while writing data to hdfs from remote client

Hi, I'm trying out the Spring for Hadoop sample to write data to HDFS running on an Amazon EC2 cluster from my local machine (Windows, from Eclipse). Following the documentation provided here:
http://docs.spring.io/spring-hadoop/docs/1.0.x/reference/html/appendix-amazon-emr.html,
I have created a SOCKS proxy using the below command:

ssh -i kp1.pem -ND 6666 [email protected]

and then tried to connect to the remote cluster, but it gives me the exception shown in the attached "ec2" screenshot.

Also, as per the information in the blog (http://blog.cloudera.com/blog/2008/12/securing-a-hadoop-cluster-through-a-gateway/), I have set up the below properties in core-site.xml on the client side, and on the server side I have made the property "hadoop.rpc.socket.factory.class.default" final.

    <property>
        <name>hadoop.socks.server</name>
        <value>localhost:6666</value>
    </property>

    <property>
        <name>hadoop.rpc.socket.factory.class.default</name>
        <value>org.apache.hadoop.net.SocksSocketFactory</value>
    </property>

I'm using hadoop-2.4.0, and in all the Hadoop-related configuration files I have used the Amazon public DNS name as the hostname, both on the client and the server side. For example:

    <property>
        <name>fs.default.name</name>
        <value>hdfs://ec2-54-191-18-136.us-west-2.compute.amazonaws.com:8020</value>
    </property>

Can you please let me know the reason why I get the attached error?

Debug Hive in Eclipse, it didn't work.

I tried to run this sample, but it throws a "NoClassDefFoundError: HiveServerException".
I searched for this class on the Maven website, and finally I found a hadoop-hive-cdh jar.
But my server is not CDH, and I can't find the answer.

my properties:
<spring.hadoop.version>2.5.0.RELEASE</spring.hadoop.version>
<hadoop.version>2.7.5</hadoop.version>
<hive.version>2.3.2</hive.version>
<spring.version>4.3.9.RELEASE</spring.version>

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/D:/Project/%e5%a4%a7%e6%95%b0%e6%8d%ae/all%20lib/hadoop/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/D:/Project/%e5%a4%a7%e6%95%b0%e6%8d%ae/all%20lib/hive/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15:05:45,574  INFO ingframework.samples.hadoop.hive.HiveApp:  31 - Hive Application Running
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/service/HiveServerException
	at org.springframework.data.hadoop.hive.HiveTemplate$1.doInHive(HiveTemplate.java:267)
	at org.springframework.data.hadoop.hive.HiveTemplate$1.doInHive(HiveTemplate.java:264)
	at org.springframework.data.hadoop.hive.HiveTemplate.execute(HiveTemplate.java:83)
	at org.springframework.data.hadoop.hive.HiveTemplate.executeScript(HiveTemplate.java:264)
	at org.springframework.data.hadoop.hive.HiveTemplate.executeScript(HiveTemplate.java:252)
	at org.springframework.data.hadoop.hive.HiveTemplate.query(HiveTemplate.java:142)
	at org.springframework.data.hadoop.hive.HiveTemplate.query(HiveTemplate.java:115)
	at org.springframework.samples.hadoop.hive.HiveApp.main(HiveApp.java:35)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.service.HiveServerException
	at java.net.URLClassLoader.findClass(Unknown Source)
	at java.lang.ClassLoader.loadClass(Unknown Source)
	at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
	at java.lang.ClassLoader.loadClass(Unknown Source)
	... 8 more
15:05:52,497  INFO t.support.ClassPathXmlApplicationContext: 984 - Closing org.springframework.context.support.ClassPathXmlApplicationContext@15761df8: startup date [Thu Jan 31 15:05:40 CST 2019]; root of context hierarchy
ERROR: JDWP Unable to get JNI 1.2 environment, jvm->GetEnv() return code = -2
JDWP exit error AGENT_ERROR_NO_JNI_ENV(183):  [util.c:840]

Help ... please.

Connection Refused in Mapreduce sample

Hello, I am getting the following error when trying to run the MapReduce sample:

19:47:07,866  WARN t.support.ClassPathXmlApplicationContext: 487 - Exception encountered during context initialization - cancelling refresh attempt
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'runner': Invocation of init method failed; nested exception is org.springframework.scripting.ScriptCompilationException: Could not compile script [class path resource [copy-files.groovy]]: Execution failure; nested exception is javax.script.ScriptException: javax.script.ScriptException: org.springframework.data.hadoop.HadoopException: Cannot test resource /user/gutenberg/input/word/;Call From Gabriels-MacBook-Pro.local/192.168.1.101 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1574)
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:539)
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:476)
	at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:303)

I am running Hadoop version 3 on macOS.

$HADOOP_HOME = /usr/local/Cellar/hadoop
$JAVA_HOME = /Library/Java/JavaVirtualMachines/jdk1.8.0_144.jdk/Contents/Home
jps
93667 DataNode
93810 SecondaryNameNode
94004 ResourceManager
94884 Jps
94102 NodeManager
93562 NameNode

Any idea?
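
The stack trace shows the sample's configured fs.defaultFS (hd.fs in hadoop.properties) pointing at localhost:8020, while a Homebrew Hadoop install frequently serves HDFS on a different port such as 9000, so the two may simply not match. A quick, hypothetical probe to see whether anything answers on the configured port (host and port are taken from the error above):

    import java.net.InetSocketAddress;
    import java.net.Socket;

    public class NameNodePortCheck {
        public static void main(String[] args) throws Exception {
            // Probe the NameNode RPC port the sample is configured against;
            // if this fails, compare with fs.defaultFS in core-site.xml
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress("localhost", 8020), 2000);
                System.out.println("NameNode RPC reachable on localhost:8020");
            }
        }
    }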

Hbase Connection Issue?

Does this code base still work, or is it outdated? I have tried to connect to the Hortonworks Sandbox VM 2.2, 2.3, and 2.4, and all are giving me issues.

One of the issues is...

Tue Jun 14 11:42:29 EDT 2016, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=75740: row 'users,,' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=sandbox.hortonworks.com,60020,1418759208042, seqNum=0

    at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:271)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:195)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:59)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
    at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
    at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:295)
    at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:160)
    at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:155)
    at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:821)
    at org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:602)
    at org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:366)
    at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:303)
    at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:308)
    at org.springframework.samples.hadoop.hbase.UserUtils.initialize(UserUtils.java:34)
    at org.springframework.samples.hadoop.hbase.UserApp.main(UserApp.java:36)
Caused by: java.net.SocketTimeoutException: callTimeout=60000, callDuration=75740: row 'users,,' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=sandbox.hortonworks.com,60020,1418759208042, seqNum=0
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:159)
    at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:64)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: no further information
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupConnection(RpcClientImpl.java:424)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:748)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:920)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:889)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1222)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32651)
    at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:372)
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:199)
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:62)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:346)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:320)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
    ... 4 more

Have you tested this code on the Hortonworks Sandbox VM?

Windows Client Support?

Does Spring Hadoop support running from a Windows client? I assume it does, since I see Windows-specific batch files to execute in the MapReduce example.

When I build and run on a Windows client, connecting to my cluster, it fails. First it says it can't load native libs, and then it submits the job but fails after that.

11:40:41,919  INFO t.support.ClassPathXmlApplicationContext: 510 - Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@659297ab: startup date [Tue Feb 11 11:40:41 EST 2014]; root of context hierarchy
11:40:42,176  INFO eans.factory.xml.XmlBeanDefinitionReader: 315 - Loading XML bean definitions from class path resource [META-INF/spring/application-context.xml]
11:40:42,895  INFO ort.PropertySourcesPlaceholderConfigurer: 172 - Loading properties file from class path resource [hadoop.properties]
11:40:42,922  INFO ctory.support.DefaultListableBeanFactory: 596 - Pre-instantiating singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@74ab6b5: defining beans [org.springframework.context.support.PropertySourcesPlaceholderConfigurer#0,hadoopConfiguration,wordcountJob,runner]; root of factory hierarchy
11:40:43,166  INFO he.hadoop.conf.Configuration.deprecation: 840 - fs.default.name is deprecated. Instead, use fs.defaultFS
11:40:44,706 ERROR             org.apache.hadoop.util.Shell: 303 - Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:278)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:300)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:293)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
    at org.apache.hadoop.conf.Configuration.getTrimmedStrings(Configuration.java:1546)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:519)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:453)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2433)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:166)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:351)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.addInputPath(FileInputFormat.java:466)
    at org.springframework.data.hadoop.mapreduce.JobFactoryBean.afterPropertiesSet(JobFactoryBean.java:208)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1547)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1485)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:524)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:461)
    at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:295)
    at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:223)
    at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:292)
    at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:194)
    at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:608)
    at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:932)
    at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:479)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:197)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:172)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:158)
    at org.springframework.samples.hadoop.mapreduce.Wordcount.main(Wordcount.java:28)
11:40:45,142  INFO    org.apache.hadoop.yarn.client.RMProxy:  56 - Connecting to ResourceManager at hd-dn-01.grcrtp.local/10.6.64.232:8050
11:40:45,245  INFO ramework.data.hadoop.mapreduce.JobRunner: 192 - Starting job [wordcountJob]
11:40:45,302  INFO    org.apache.hadoop.yarn.client.RMProxy:  56 - Connecting to ResourceManager at hd-dn-01.grcrtp.local/10.6.64.232:8050
11:40:45,971  WARN org.apache.hadoop.mapreduce.JobSubmitter: 258 - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
11:40:46,080  INFO doop.mapreduce.lib.input.FileInputFormat: 287 - Total input paths to process : 1
11:40:46,422  INFO org.apache.hadoop.mapreduce.JobSubmitter: 394 - number of splits:1
11:40:46,441  INFO he.hadoop.conf.Configuration.deprecation: 840 - user.name is deprecated. Instead, use mapreduce.job.user.name
11:40:46,442  INFO he.hadoop.conf.Configuration.deprecation: 840 - fs.default.name is deprecated. Instead, use fs.defaultFS
11:40:46,444  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
11:40:46,444  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
11:40:46,450  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
11:40:46,450  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.job.name is deprecated. Instead, use mapreduce.job.name
11:40:46,450  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
11:40:46,451  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
11:40:46,451  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
11:40:46,452  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
11:40:46,452  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
11:40:46,454  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
11:40:46,454  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
11:40:46,820  INFO org.apache.hadoop.mapreduce.JobSubmitter: 477 - Submitting tokens for job: job_1391711633872_0022
11:40:47,127  INFO      org.apache.hadoop.mapred.YARNRunner: 368 - Job jar is not present. Not adding any jar to the list of resources.
11:40:47,225  INFO doop.yarn.client.api.impl.YarnClientImpl: 174 - Submitted application application_1391711633872_0022 to ResourceManager at hd-dn-01.grcrtp.local/10.6.64.232:8050
11:40:47,291  INFO          org.apache.hadoop.mapreduce.Job:1272 - The url to track the job: http://http://hd-dn-01.grcrtp.local:8088/proxy/application_1391711633872_0022/
11:40:47,292  INFO          org.apache.hadoop.mapreduce.Job:1317 - Running job: job_1391711633872_0022
11:40:50,330  INFO          org.apache.hadoop.mapreduce.Job:1338 - Job job_1391711633872_0022 running in uber mode : false
11:40:50,332  INFO          org.apache.hadoop.mapreduce.Job:1345 -  map 0% reduce 0%
11:40:50,356  INFO          org.apache.hadoop.mapreduce.Job:1358 - Job job_1391711633872_0022 failed with state FAILED due to: Application application_1391711633872_0022 failed 2 times due to AM Container for appattempt_1391711633872_0022_000002 exited with  exitCode: 1 due to: Exception from container-launch: 
org.apache.hadoop.util.Shell$ExitCodeException: /bin/bash: line 0: fg: no job control

    at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
    at org.apache.hadoop.util.Shell.run(Shell.java:379)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)


.Failing this attempt.. Failing the application.
11:40:50,434  INFO          org.apache.hadoop.mapreduce.Job:1363 - Counters: 0
11:40:50,470  INFO ramework.data.hadoop.mapreduce.JobRunner: 202 - Completed job [wordcountJob]
11:40:50,507  INFO    org.apache.hadoop.yarn.client.RMProxy:  56 - Connecting to ResourceManager at hd-dn-01.grcrtp.local/10.6.64.232:8050
11:40:50,590  INFO ctory.support.DefaultListableBeanFactory: 444 - Destroying singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@74ab6b5: defining beans [org.springframework.context.support.PropertySourcesPlaceholderConfigurer#0,hadoopConfiguration,wordcountJob,runner]; root of factory hierarchy
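
For later readers: the "/bin/bash: line 0: fg: no job control" failure is the classic symptom of a job submitted from a Windows client to a Linux cluster, where the container launch script is generated with Windows-style environment expansion. Hadoop 2.4+ added a switch for this (MAPREDUCE-4052); the property does not exist in earlier versions, so upgrading the cluster may be required. A hedged sketch of where the flag goes:

    import org.apache.hadoop.conf.Configuration;

    public class CrossPlatformFlag {
        public static void main(String[] args) {
            // Sketch only: the flag must end up in the configuration used to submit
            // the job, e.g. alongside fs.default.name in the sample's
            // configuration block (mapreduce.app-submission.cross-platform=true)
            Configuration conf = new Configuration();
            conf.setBoolean("mapreduce.app-submission.cross-platform", true);
            System.out.println(conf.getBoolean("mapreduce.app-submission.cross-platform", false));
        }
    }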

hbase running app

log4j:WARN No appenders could be found for logger (org.springframework.samples.hadoop.hbase.UserApp).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.io.IOException: Attempt to start meta tracker failed.
    at org.apache.hadoop.hbase.catalog.CatalogTracker.start(CatalogTracker.java:201)
    at org.apache.hadoop.hbase.client.HBaseAdmin.getCatalogTracker(HBaseAdmin.java:230)
    at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:277)
    at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:288)
    at org.springframework.samples.hadoop.hbase.UserUtils.initialize(UserUtils.java:36)
    at org.springframework.samples.hadoop.hbase.UserApp.main(UserApp.java:22)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/meta-region-server
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:199)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:425)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:77)
    at org.apache.hadoop.hbase.catalog.CatalogTracker.start(CatalogTracker.java:197)
    ... 5 more

I need a Hive 13, HiveServer2 sample

I am currently working with a server running HiveServer2 and Hive 13. I tried modifying the samples to use Hive 13.
I executed
sh ./target/appassembler/bin/hiveBatchApp

I get the following exception. How should I get around this? Alternatively, if you have a sample that works on Spring Batch/Hive 13/HiveServer2, please send the link for that. Thanks for your help.

localSourceFile = /home/saga/Downloads/spring-hadoop-samples/hive-batch/target/appassembler/data/nbatweets-small.txt
inputDir = /tweets/input
about to execute the file copying
exiting
23:24:10,564  INFO amework.samples.hadoop.hive.HiveBatchApp:  37 - Batch Tweet Influencers Hive Job Running
23:24:10,667  INFO ch.core.launch.support.SimpleJobLauncher: 133 - Job: [FlowJob: [name=hiveJob]] launched with the following parameters: [{}]
23:24:10,727  INFO amework.batch.core.job.SimpleStepHandler: 146 - Executing step: [influencer-step]
23:24:10,812 ERROR ngframework.batch.core.step.AbstractStep: 225 - Encountered an error executing step influencer-step in job hiveJob
org.springframework.dao.DataAccessResourceFailureException: Invalid method name: 'execute'; nested exception is org.apache.thrift.TApplicationException: Invalid method name: 'execute'
    at org.springframework.data.hadoop.hive.HiveUtils.convert(HiveUtils.java:69)
    at org.springframework.data.hadoop.hive.HiveTemplate.convertHiveAccessException(HiveTemplate.java:99)
    at org.springframework.data.hadoop.hive.HiveTemplate.execute(HiveTemplate.java:82)
    at org.springframework.data.hadoop.hive.HiveTemplate.executeScript(HiveTemplate.java:261)

Hadoop Spring mapreduce multiple inputs and mappers in a job

How do I specify multiple input files and their respective formats in a job tag?

<beans:beans xmlns="http://www.springframework.org/schema/hadoop"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:beans="http://www.springframework.org/schema/beans"
xmlns:context="http://www.springframework.org/schema/context"
xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd
http://www.springframework.org/schema/hadoop http://www.springframework.org/schema/hadoop/spring-hadoop.xsd">

<context:property-placeholder location="hadoop.properties"/>

<configuration>
fs.default.name=${hd.fs}
yarn.resourcemanager.address=${hd.rm}
mapreduce.framework.name=${mr.fw}
</configuration>

<job id="wordcountJob"
input-path="${wordcount.input.path}"
output-path="${wordcount.output.path}"
mapper="org.apache.hadoop.examples.WordCount.TokenizerMapper"
reducer="org.apache.hadoop.examples.WordCount.IntSumReducer"/>

</beans:beans>

As we can specify in a Java program, like this:

MultipleInputs.addInputPath(job, firstPath, FirstInputFormat.class, FirstMap.class);
MultipleInputs.addInputPath(job, secondPath, SecondInputFormat.class, SecondMap.class);

I googled a lot and even checked its XSD file. I did not find any such attribute, so how can we specify multiple inputs in a job?
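
One untested workaround sketch: the job element produces a plain org.apache.hadoop.mapreduce.Job bean, so the extra inputs can be attached in code before submission instead of going through the runner. The paths below are placeholders, and the WordCount mapper merely stands in for your own FirstMap/SecondMap classes:

    import org.apache.hadoop.examples.WordCount;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
    import org.springframework.context.support.ClassPathXmlApplicationContext;

    public class MultiInputRunner {
        public static void main(String[] args) throws Exception {
            ClassPathXmlApplicationContext ctx =
                    new ClassPathXmlApplicationContext("application-context.xml");
            // <job id="wordcountJob" .../> is a factory for a regular Job instance
            Job job = ctx.getBean("wordcountJob", Job.class);
            MultipleInputs.addInputPath(job, new Path("/data/first"),
                    TextInputFormat.class, WordCount.TokenizerMapper.class);
            MultipleInputs.addInputPath(job, new Path("/data/second"),
                    TextInputFormat.class, WordCount.TokenizerMapper.class);
            job.waitForCompletion(true);
            ctx.close();
        }
    }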

I need a SpringBatch Hive 13, HiveServer2 sample

I am trying to get the sample working when running against Hadoop 2.4.1, HiveServer2, and Hive 13.

I am having problems; please find below the details of a second instance of failure. This is in the hive-batch project from spring-hadoop-samples.

Thanks for any help.

My properties are

hd.fs=hdfs://aa.0.11.120:8020
localSourceFile=data/nbatweets-small.txt
tweets.input.path=/tweets/input

The stack trace is as given below.

localSourceFile = /home/sagar/Downloads/spring-hadoop-samples/hive-batch/target/appassembler/data/nbatweets-small.txt
inputDir = /tweets/input
about to execute the file copying 
exiting
01:06:34,304  INFO amework.samples.hadoop.hive.HiveBatchApp:  37 - Batch Tweet Influencers Hive Job Running
01:06:34,411  INFO ch.core.launch.support.SimpleJobLauncher: 133 - Job: [FlowJob: [name=hiveJob]] launched with the following parameters: [{}]
01:06:34,471  INFO amework.batch.core.job.SimpleStepHandler: 146 - Executing step: [influencer-step]
01:06:34,589 ERROR ngframework.batch.core.step.AbstractStep: 225 - Encountered an error executing step influencer-step in job hiveJob
org.springframework.dao.DataAccessResourceFailureException: Invalid method name: 'execute'; nested exception is org.apache.thrift.TApplicationException: Invalid method name: 'execute'
    at org.springframework.data.hadoop.hive.HiveUtils.convert(HiveUtils.java:69)
    at org.springframework.data.hadoop.hive.HiveTemplate.convertHiveAccessException(HiveTemplate.java:99)
    at org.springframework.data.hadoop.hive.HiveTemplate.execute(HiveTemplate.java:82)
    at org.springframework.data.hadoop.hive.HiveTemplate.executeScript(HiveTemplate.java:261)

Exceptions when Running HiveApp on Mac

I started my local HDFS and then tried to run HiveApp. The "show tables" command runs successfully, but when running HiveQL queries, exceptions happen. This is the output in the console:

2015-07-26 21:56:51.744 java[5042:457346] Unable to load realm mapping info from SCDynamicStore
OK
[grpshell, passwords]OK
OK
OK
Copying data from file:/etc/passwd
Copying file: file:/etc/passwd
Loading data to table default.passwords
Table default.passwords stats: [numFiles=1, numRows=0, totalSize=5581, rawDataSize=0]
OK
OK
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer "org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer" for class: org.apache.hadoop.hive.ql.exec.FileSinkOperator
Serialization trace:
childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
childOperators (org.apache.hadoop.hive.ql.exec.GroupByOperator)
reducer (org.apache.hadoop.hive.ql.plan.ReduceWork)
at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:82)
at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:474)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:614)
at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:78)
at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:18)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:538)
at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:474)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:538)
at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61)
at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:474)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:520)
at org.apache.hadoop.hive.ql.exec.Utilities.serializeObjectByKryo(Utilities.java:895)
at org.apache.hadoop.hive.ql.exec.Utilities.serializePlan(Utilities.java:799)
at org.apache.hadoop.hive.ql.exec.Utilities.serializePlan(Utilities.java:811)
at org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:601)
at org.apache.hadoop.hive.ql.exec.Utilities.setReduceWork(Utilities.java:578)
at org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:569)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:372)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1503)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1270)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1088)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:644)
at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:628)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:695)
java.lang.OutOfMemoryError: PermGen space
at java.lang.Throwable.getStackTraceElement(Native Method)
at java.lang.Throwable.getOurStackTrace(Throwable.java:591)
at java.lang.Throwable.printStackTraceAsCause(Throwable.java:481)
at java.lang.Throwable.printStackTrace(Throwable.java:468)
at java.lang.Throwable.printStackTrace(Throwable.java:451)
at org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:626)
at org.apache.hadoop.hive.ql.exec.Utilities.setReduceWork(Utilities.java:578)
at org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:569)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:372)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1503)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1270)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1088)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:644)
at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:628)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:695)
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. PermGen space
Exception in thread "org.apache.hadoop.hdfs.PeerCache@4db323af" java.lang.OutOfMemoryError: PermGen space
Exception in thread "main" java.lang.OutOfMemoryError: PermGen space
Exception in thread "LeaseRenewer:frankhe@localhost:9000" java.lang.OutOfMemoryError: PermGen space

The config is as follows:

hd.fs=hdfs://localhost:9000
hd.rm=localhost:50070
hd.jh=localhost:8088

hive.host=localhost
hive.port=10000
hive.url=jdbc:hive://${hive.host}:${hive.port}
hive.table=passwords

My OS is Mac; I already tried to update STS.ini to:

-Xms128m
-Xmx768m
-XX:MaxPermSize=4096m

The exception is the same.

Any idea on how to fix the exception?

Thanks

Working with Hive2, error: org.apache.thrift.TApplicationException: Invalid method name: 'execute'

I am using the sample, but my Hive server is hive2, and the Hive server URL is:
hive.url=jdbc:hive2://xxx.xxx.xxx.xxx:21050/;auth=noSasl
Now when I run the sample, I get this error immediately:
org.apache.thrift.TApplicationException: Invalid method name: 'execute'
at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
at org.apache.hadoop.hive.service.ThriftHive$Client.recv_execute(ThriftHive.java:116)
at org.apache.hadoop.hive.service.ThriftHive$Client.execute(ThriftHive.java:103)
at org.springframework.samples.hadoop.hive.HiveJdbcAppMy.main(HiveJdbcAppMy.java:31)
Is this because of hive2 vs. hive? How can I fix this sample to use hive2?
Thanks
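
For anyone hitting the same wall: HiveServer2 exposes a different Thrift service than the HiveServer1 ThriftHive client visible in the stack trace, which is exactly why the server rejects the method name 'execute'. A minimal sketch of going through plain JDBC with the hive2 driver instead (it assumes hive-jdbc and its dependencies are on the classpath; the URL and query are placeholders):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class Hive2JdbcSketch {
        public static void main(String[] args) throws Exception {
            // The hive2 JDBC driver speaks the HiveServer2 Thrift protocol
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection con = DriverManager.getConnection(
                         "jdbc:hive2://localhost:10000/default;auth=noSasl");
                 Statement stmt = con.createStatement();
                 ResultSet rs = stmt.executeQuery("show tables")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }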

Can I have a working sample for Spring HBase using HbaseTemplate with the latest stable HBase build?

I am a newbie to HBase and I want to continue to use the Spring solution, HbaseTemplate, to access HBase. But I have tested it many times and have never been successful. This is what I did.
The sample I am using is:
https://github.com/spring-projects/spring-data-book/tree/master/hadoop/hbase

I am using the latest stable HBase build, version 1.0.1.1.
When I start UserApp, I get this error:

 Exception in thread "main" org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'userUtils' defined in file [/Users/fhe/spring/spring-data-book/hadoop/hbase/target/classes/com/oreilly/springdata/hadoop/hbase/UserUtils.class]: Invocation of init method failed; nested exception is java.lang.IllegalArgumentException: Not a host:port pair: PBUF
�
192.168.1.75�����ݠ���)���}
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1486)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:524)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:461)
    at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:295)
    at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:223)
    at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:292)
    at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:194)
    at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:626)
    at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:932)
    at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:479)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:197)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:172)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:158)
    at com.oreilly.springdata.hadoop.hbase.UserApp.main(UserApp.java:30)

So I googled this error: Not a host:port pair: PBUF

Checking the POM file:

<properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <spring.hadoop.version>1.0.0.RELEASE</spring.hadoop.version>
        <hadoop.version>1.0.1</hadoop.version>
        <hbase.version>0.92.1</hbase.version>
        <log4j.version>1.2.17</log4j.version>
    </properties>

People said it is because the version differs between client and server, so I downloaded HBase server 0.92.1, started the HBase server, and then started the project again; now the error is:

09:22:32.259 [main-SendThread(localhost:2181)] WARN  org.apache.zookeeper.ClientCnxn - Session 0x14e5f064fea0003 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.6.0_65]
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) ~[na:1.6.0_65]
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:286) ~[zookeeper-3.4.3.jar:3.4.3-1240972]
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1035) ~[zookeeper-3.4.3.jar:3.4.3-1240972]
09:22:32.503 [main-SendThread(localhost:2181)] INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server /127.0.0.1:2181
09:22:32.504 [main-SendThread(localhost:2181)] WARN  o.a.z.client.ZooKeeperSaslClient - SecurityException: java.lang.SecurityException: Unable to locate a login configuration occurred when trying to find JAAS configuration.
09:22:32.504 [main-SendThread(localhost:2181)] INFO  o.a.z.client.ZooKeeperSaslClient - Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
09:22:32.505 [main-SendThread(localhost:2181)] WARN  org.apache.zookeeper.ClientCnxn - Session 0x14e5f064fea0002 for server null, unexpected error, closing socket connection and attempting reconnect

Even when using the HBase shell, it throws a lot of exceptions, so I gave up on the old version of the HBase server.

I tried to use the same HBase client in the POM to match the latest HBase server, 1.0.1.1, but I get either a jar-not-available error or some other major/minor version exceptions.

Can anyone who has such experience tell me how you fixed it and used HbaseTemplate to work with the latest stable HBase build? How do you configure the Spring project, etc.? Can you show me your working solution?

Thanks very much.
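
Not a full answer, but two notes that may help. First, the "Not a host:port pair: PBUF" error is the typical symptom of a pre-0.96 HBase client reading the protobuf-encoded metadata that 0.96+ servers write to ZooKeeper, so the hbase-client jar really does have to match the server's major version. Second, a minimal, hypothetical sketch of HbaseTemplate usage once the versions line up (bean wiring, table, and column names are placeholders):

    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.springframework.context.support.ClassPathXmlApplicationContext;
    import org.springframework.data.hadoop.hbase.HbaseTemplate;
    import org.springframework.data.hadoop.hbase.RowMapper;

    public class HbaseTemplateSketch {
        public static void main(String[] args) {
            // Assumes an application-context.xml that defines an HbaseTemplate bean
            // whose configuration points hbase.zookeeper.quorum at the cluster
            ClassPathXmlApplicationContext ctx =
                    new ClassPathXmlApplicationContext("application-context.xml");
            HbaseTemplate template = ctx.getBean(HbaseTemplate.class);
            String email = template.get("users", "user1", new RowMapper<String>() {
                public String mapRow(Result result, int rowNum) throws Exception {
                    return Bytes.toString(result.getValue(
                            Bytes.toBytes("cfInfo"), Bytes.toBytes("qEmail")));
                }
            });
            System.out.println(email);
            ctx.close();
        }
    }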

issue while reading data on hdfs from remote

I want to read files on the remote HDFS, but there was an exception.

local ENV.

  • OS: Windows 10
  • JDK 1.8
  • SHDP 2.4.0
  • hadoop 2.7.1

1. Code

AbstractApplicationContext context = new ClassPathXmlApplicationContext(
        "/application.xml", XmlUserApp.class);
HdfsResourceLoader loader = context.getBean(HdfsResourceLoader.class);
Resource resource = loader.getResource("hdfs://hd-23:6000/user/alleyz/hdfs.txt");
System.out.println(resource.exists());        // true
System.out.println(resource.lastModified());  // 1469696205365
System.out.println(resource.isReadable());    // true
System.out.println(resource.contentLength()); // 41
System.out.println(resource.isOpen());        // true
System.out.println(resource.isReadable());    // true
File file = resource.getFile();               // throws an exception

2. Exception

java.lang.UnsupportedOperationException: Cannot resolve File object for HDFS Resource for [hdfs://hd-23:6000/user/alleyz/hdfs.txt]
    at org.springframework.data.hadoop.fs.HdfsResource.getFile(HdfsResource.java:160)
    at com.alleyz.spring.hadoop.hdfs.XmlHdfsApp.main(XmlHdfsApp.java:31)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)

How to solve it?
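
What usually resolves this: an HdfsResource is not backed by a local java.io.File, so Spring's Resource.getFile() contract cannot be fulfilled for it; read the content as a stream instead. A sketch under the same context setup as above:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;

    import org.springframework.context.support.ClassPathXmlApplicationContext;
    import org.springframework.core.io.Resource;
    import org.springframework.data.hadoop.fs.HdfsResourceLoader;

    public class StreamHdfsFile {
        public static void main(String[] args) throws Exception {
            ClassPathXmlApplicationContext ctx =
                    new ClassPathXmlApplicationContext("/application.xml");
            HdfsResourceLoader loader = ctx.getBean(HdfsResourceLoader.class);
            Resource resource = loader.getResource("hdfs://hd-23:6000/user/alleyz/hdfs.txt");
            // Stream the bytes rather than asking for a File handle
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(resource.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    System.out.println(line);
                }
            }
            ctx.close();
        }
    }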

mapreduce on CDH5: not enough memory for Yarn

I had the following error when trying to run the MapReduce example on CDH5.

InvalidResourceRequestException Invalid resource request, 
requested memory < 0, or  requested memory > max configured,
requestedMemory=1536, maxMemory=1024

I had to do the following fix to make it work:

Hue > Yarn > Configuration > Edit
Search for "valve", set property YARN Service Advanced Configuration Snippet (Safety Valve) for yarn-site.xml to:

<property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2024</value>
</property>

Save changes and redeploy.

Unable to initialize 'javax.el.ExpressionFactory' after update Spring Boot to 1.5.2

Hello. I am not sure this is the right place to open an issue. If it's not, I'd appreciate it if you could share the right person / project.

We are using Spring Boot and Hadoop as shown in Gradle script below.

plugins {
    // before update it was: '1.4.4.RELEASE'
    id 'org.springframework.boot' version '1.5.2.RELEASE'
}
dependencies {
    compile 'org.springframework.data:spring-data-hadoop-boot:2.4.0.RELEASE-cdh5'
}

When trying to get a validator, we got the following exception.

import javax.validation.Validation;
import javax.validation.Validator;

class Utils {
    Validator v = Validation.buildDefaultValidatorFactory().getValidator();
}
java.lang.ExceptionInInitializerError
	at com.rakuten.felix.listnormalizer.broker.MessageProcessor.handleIncomingMessage(MessageProcessor.java:110)
	at com.rakuten.felix.listnormalizer.test.broker.MessageProcessorTest.handleIncomingMessageTest(MessageProcessorTest.java:115)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.lang.reflect.Method.invoke(Method.java:498)
Caused by: javax.validation.ValidationException: HV000183: Unable to initialize 'javax.el.ExpressionFactory'. Check that you have the EL dependencies on the classpath, or use ParameterMessageInterpolator instead
	at org.hibernate.validator.messageinterpolation.ResourceBundleMessageInterpolator.buildExpressionFactory(ResourceBundleMessageInterpolator.java:102)
	at org.hibernate.validator.messageinterpolation.ResourceBundleMessageInterpolator.<init>(ResourceBundleMessageInterpolator.java:45)
	at org.hibernate.validator.internal.engine.ConfigurationImpl.getDefaultMessageInterpolator(ConfigurationImpl.java:423)
	at org.hibernate.validator.internal.engine.ConfigurationImpl.getDefaultMessageInterpolatorConfiguredWithClassLoader(ConfigurationImpl.java:575)
	at org.hibernate.validator.internal.engine.ConfigurationImpl.getMessageInterpolator(ConfigurationImpl.java:364)
	at org.hibernate.validator.internal.engine.ValidatorFactoryImpl.<init>(ValidatorFactoryImpl.java:144)
	at org.hibernate.validator.HibernateValidator.buildValidatorFactory(HibernateValidator.java:38)
	at org.hibernate.validator.internal.engine.ConfigurationImpl.buildValidatorFactory(ConfigurationImpl.java:331)
	at javax.validation.Validation.buildDefaultValidatorFactory(Validation.java:110)
	at com.rakuten.felix.listnormalizer.ValidatorUtils.<clinit>(ValidatorUtils.java:11)
	... 4 more
Caused by: javax.el.ELException: Provider com.sun.el.ExpressionFactoryImpl not found
	at javax.el.FactoryFinder.newInstance(FactoryFinder.java:101)
	at javax.el.FactoryFinder.find(FactoryFinder.java:197)
	at javax.el.ExpressionFactory.newInstance(ExpressionFactory.java:189)
	at javax.el.ExpressionFactory.newInstance(ExpressionFactory.java:160)
	at org.hibernate.validator.messageinterpolation.ResourceBundleMessageInterpolator.buildExpressionFactory(ResourceBundleMessageInterpolator.java:98)
	... 13 more
Caused by: java.lang.ClassNotFoundException: com.sun.el.ExpressionFactoryImpl
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at javax.el.FactoryFinder.newInstance(FactoryFinder.java:87)
	... 17 more

After some investigation we found that two javax.el.ExpressionFactory classes exist on the classpath: one from tomcat-embed-el:8.5.11 and the other from jsp:jsp-api:2.1. I assumed jsp-api contained an old version of the ExpressionFactory class and excluded the JSP API, which solved the issue.

The question is: shouldn't this be solved in Spring itself? Since I wasn't excluding anything before the update and everything worked fine, I would expect the same behavior afterwards. Any ideas on how to proceed with this? Thank you.
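
For anyone hitting the same exception, the error message itself hints at a workaround that avoids the EL lookup entirely: build the validator with Hibernate Validator's ParameterMessageInterpolator, which needs no javax.el provider. A minimal sketch, assuming Hibernate Validator 5.x is the provider on the classpath:

import javax.validation.Validation;
import javax.validation.Validator;
import org.hibernate.validator.messageinterpolation.ParameterMessageInterpolator;

class ValidatorUtils {
    // ParameterMessageInterpolator only resolves {parameter} placeholders,
    // so no javax.el.ExpressionFactory implementation is needed at runtime.
    static final Validator VALIDATOR = Validation.byDefaultProvider()
            .configure()
            .messageInterpolator(new ParameterMessageInterpolator())
            .buildValidatorFactory()
            .getValidator();
}

The other route is the one described above: exclude the jsp-api transitive dependency so that tomcat-embed-el provides the only ExpressionFactory on the classpath.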

hiveAppWithApacheLogs demo hangs on: Returning cached instance of singleton bean 'hiveRunner'

Hello.
I'm trying to run the Hive demo on the Hortonworks Sandbox 2.1.
I'm connecting to Hive from the host (outside of the VM).
When running hiveAppWithApacheLogs, the logs look like this:

21:10:51,481 DEBUG ctory.support.DefaultListableBeanFactory: 477 - Finished creating instance of bean 'hadoopConfiguration'
21:10:51,482 DEBUG ctory.support.DefaultListableBeanFactory: 220 - Creating shared instance of singleton bean 'hiveClientFactory'
21:10:51,482 DEBUG ctory.support.DefaultListableBeanFactory: 449 - Creating instance of bean 'hiveClientFactory'
21:10:51,486 DEBUG ctory.support.DefaultListableBeanFactory: 523 - Eagerly caching bean 'hiveClientFactory' to allow for resolving potential circular references
21:10:51,492 DEBUG ctory.support.DefaultListableBeanFactory: 477 - Finished creating instance of bean 'hiveClientFactory'
21:10:51,492 DEBUG ctory.support.DefaultListableBeanFactory: 220 - Creating shared instance of singleton bean 'hiveRunner'
21:10:51,492 DEBUG ctory.support.DefaultListableBeanFactory: 449 - Creating instance of bean 'hiveRunner'
21:10:51,493 DEBUG ctory.support.DefaultListableBeanFactory: 523 - Eagerly caching bean 'hiveRunner' to allow for resolving potential circular references
21:10:51,493 DEBUG ctory.support.DefaultListableBeanFactory: 249 - Returning cached instance of singleton bean 'hiveClientFactory'
21:10:51,497 DEBUG ctory.support.DefaultListableBeanFactory: 449 - Creating instance of bean '(inner bean)#198ddef7'
21:10:51,500 DEBUG ctory.support.DefaultListableBeanFactory: 449 - Creating instance of bean '(inner bean)#ab14733'
21:10:51,511 DEBUG ctory.support.DefaultListableBeanFactory: 477 - Finished creating instance of bean '(inner bean)#ab14733'
21:10:51,515 DEBUG ctory.support.DefaultListableBeanFactory: 477 - Finished creating instance of bean '(inner bean)#198ddef7'
21:10:51,517 DEBUG ctory.support.DefaultListableBeanFactory:1595 - Invoking afterPropertiesSet() on bean with name 'hiveRunner'
21:10:51,518 DEBUG ctory.support.DefaultListableBeanFactory: 477 - Finished creating instance of bean 'hiveRunner'
21:10:51,519 DEBUG t.support.ClassPathXmlApplicationContext: 700 - Unable to locate LifecycleProcessor with name 'lifecycleProcessor': using default [org.springframework.context.support.DefaultLifecycleProcessor@39a3014f]
21:10:51,519 DEBUG ctory.support.DefaultListableBeanFactory: 249 - Returning cached instance of singleton bean 'lifecycleProcessor'
21:10:51,521 DEBUG core.env.PropertySourcesPropertyResolver:  81 - Searching for key 'spring.liveBeansView.mbeanDomain' in [systemProperties]
21:10:51,521 DEBUG core.env.PropertySourcesPropertyResolver:  81 - Searching for key 'spring.liveBeansView.mbeanDomain' in [systemEnvironment]
21:10:51,521 DEBUG core.env.PropertySourcesPropertyResolver: 103 - Could not find key 'spring.liveBeansView.mbeanDomain' in any property source. Returning [null]
21:10:51,521  INFO amples.hadoop.hive.HiveAppWithApacheLogs:  31 - Hive Application Running
21:10:51,522 DEBUG ctory.support.DefaultListableBeanFactory: 249 - Returning cached instance of singleton bean 'hiveRunner'

My own app, which uses the same approach of connecting to an existing Hive server, has the same problem.
My logs:

21:15:54,121  INFO  org.apache.hadoop.fs.TrashPolicyDefault:  92 - Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
21:15:54,121 DEBUG ctory.support.DefaultListableBeanFactory: 477 - Finished creating instance of bean 'setupScript'
21:15:54,121 DEBUG ctory.support.DefaultListableBeanFactory: 220 - Creating shared instance of singleton bean 'hiveClientFactory'
21:15:54,121 DEBUG ctory.support.DefaultListableBeanFactory: 449 - Creating instance of bean 'hiveClientFactory'
21:15:54,121 DEBUG ctory.support.DefaultListableBeanFactory: 523 - Eagerly caching bean 'hiveClientFactory' to allow for resolving potential circular references
21:15:54,125 DEBUG ctory.support.DefaultListableBeanFactory: 477 - Finished creating instance of bean 'hiveClientFactory'
21:15:54,125 DEBUG ctory.support.DefaultListableBeanFactory: 220 - Creating shared instance of singleton bean 'hiveRunner'
21:15:54,125 DEBUG ctory.support.DefaultListableBeanFactory: 449 - Creating instance of bean 'hiveRunner'
21:15:54,126 DEBUG ctory.support.DefaultListableBeanFactory: 523 - Eagerly caching bean 'hiveRunner' to allow for resolving potential circular references
21:15:54,126 DEBUG ctory.support.DefaultListableBeanFactory: 247 - Returning cached instance of singleton bean 'hiveClientFactory'
21:15:54,130 DEBUG ctory.support.DefaultListableBeanFactory: 449 - Creating instance of bean '(inner bean)#672e34d8'
21:15:54,130 DEBUG ctory.support.DefaultListableBeanFactory: 449 - Creating instance of bean '(inner bean)#a8f85d4'
21:15:54,131 DEBUG ctory.support.DefaultListableBeanFactory: 477 - Finished creating instance of bean '(inner bean)#a8f85d4'
21:15:54,132 DEBUG ctory.support.DefaultListableBeanFactory: 477 - Finished creating instance of bean '(inner bean)#672e34d8'
21:15:54,134 DEBUG ctory.support.DefaultListableBeanFactory:1595 - Invoking afterPropertiesSet() on bean with name 'hiveRunner'
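
A hang right after afterPropertiesSet() on hiveRunner usually means the Thrift client is blocked while opening its connection to the Hive server. Before digging into the Spring configuration, it is worth checking that the sandbox's Hive port is reachable from the host at all, since VM port forwarding is a common culprit. A minimal sketch, assuming the Hive Thrift service listens on the default port 10000 and is forwarded to localhost:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class HivePortCheck {
    public static void main(String[] args) throws IOException {
        try (Socket socket = new Socket()) {
            // Default Hive Thrift port; adjust host and port to match
            // the hive-server settings used by the sample.
            socket.connect(new InetSocketAddress("localhost", 10000), 5000);
            System.out.println("Hive Thrift port is reachable");
        }
    }
}

If the connect times out, the problem is networking or port forwarding rather than the Spring beans.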

Having issues with your mapreduce example

I've installed Hadoop 2.2.0 and I'm able to run the Hadoop samples just fine from the command line. I'm interested in using Spring though, so I tried this sample. Whenever I run the job I get this output:

sh ./target/appassembler/bin/wordcount
09:24:53,391 INFO t.support.ClassPathXmlApplicationContext: 510 - Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@370410a7: startup date [Sat Jan 18 09:24:53 MST 2014]; root of context hierarchy
09:24:53,540 INFO eans.factory.xml.XmlBeanDefinitionReader: 315 - Loading XML bean definitions from class path resource [META-INF/spring/application-context.xml]
09:24:53,914 INFO ort.PropertySourcesPlaceholderConfigurer: 172 - Loading properties file from class path resource [hadoop.properties]
09:24:53,938 INFO ctory.support.DefaultListableBeanFactory: 596 - Pre-instantiating singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@4fe596de: defining beans [org.springframework.context.support.PropertySourcesPlaceholderConfigurer#0,hadoopConfiguration,wordcountJob,setupScript,runner]; root of factory hierarchy
09:24:54,096 INFO he.hadoop.conf.Configuration.deprecation: 840 - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-01-18 09:24:54.573 java[4237:1703] Unable to load realm info from SCDynamicStore
09:25:28,945 WARN org.apache.hadoop.util.NativeCodeLoader: 62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
09:25:29,488 INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
09:25:29,501 INFO org.apache.hadoop.fs.TrashPolicyDefault: 92 - Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
09:25:30,707 INFO org.apache.hadoop.yarn.client.RMProxy: 56 - Connecting to ResourceManager at localhost/127.0.0.1:8032
09:25:30,759 INFO ramework.data.hadoop.mapreduce.JobRunner: 192 - Starting job [wordcountJob]
09:25:30,790 INFO org.apache.hadoop.yarn.client.RMProxy: 56 - Connecting to ResourceManager at localhost/127.0.0.1:8032
09:25:31,055 WARN org.apache.hadoop.mapreduce.JobSubmitter: 258 - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
09:25:31,111 INFO doop.mapreduce.lib.input.FileInputFormat: 287 - Total input paths to process : 1
09:25:31,260 INFO org.apache.hadoop.mapreduce.JobSubmitter: 394 - number of splits:1
09:25:31,271 INFO he.hadoop.conf.Configuration.deprecation: 840 - user.name is deprecated. Instead, use mapreduce.job.user.name
09:25:31,272 INFO he.hadoop.conf.Configuration.deprecation: 840 - fs.default.name is deprecated. Instead, use fs.defaultFS
09:25:31,275 INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
09:25:31,276 INFO he.hadoop.conf.Configuration.deprecation: 840 - mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
09:25:31,276 INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.job.name is deprecated. Instead, use mapreduce.job.name
09:25:31,276 INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
09:25:31,277 INFO he.hadoop.conf.Configuration.deprecation: 840 - mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
09:25:31,277 INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
09:25:31,277 INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
09:25:31,278 INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
09:25:31,278 INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
09:25:31,279 INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
09:25:31,412 INFO org.apache.hadoop.mapreduce.JobSubmitter: 477 - Submitting tokens for job: job_1390012296433_0009
09:25:31,641 INFO org.apache.hadoop.mapred.YARNRunner: 368 - Job jar is not present. Not adding any jar to the list of resources.
09:25:31,705 INFO doop.yarn.client.api.impl.YarnClientImpl: 174 - Submitted application application_1390012296433_0009 to ResourceManager at localhost/127.0.0.1:8032
09:25:31,748 INFO org.apache.hadoop.mapreduce.Job:1272 - The url to track the job: http://Admins-MacBook-Pro.local:8088/proxy/application_1390012296433_0009/
09:25:31,749 INFO org.apache.hadoop.mapreduce.Job:1317 - Running job: job_1390012296433_0009
09:25:35,778 INFO org.apache.hadoop.mapreduce.Job:1338 - Job job_1390012296433_0009 running in uber mode : false
09:25:35,780 INFO org.apache.hadoop.mapreduce.Job:1345 - map 0% reduce 0%
09:25:35,796 INFO org.apache.hadoop.mapreduce.Job:1358 - Job job_1390012296433_0009 failed with state FAILED due to: Application application_1390012296433_0009 failed 2 times due to AM Container for appattempt_1390012296433_0009_000002 exited with exitCode: 127 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)

.Failing this attempt.. Failing the application.
09:25:35,850 INFO org.apache.hadoop.mapreduce.Job:1363 - Counters: 0
09:25:35,858 INFO ramework.data.hadoop.mapreduce.JobRunner: 202 - Completed job [wordcountJob]
09:25:35,876 INFO org.apache.hadoop.yarn.client.RMProxy: 56 - Connecting to ResourceManager at localhost/127.0.0.1:8032
09:25:35,914 INFO ctory.support.DefaultListableBeanFactory: 444 - Destroying singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@4fe596de: defining beans [org.springframework.context.support.PropertySourcesPlaceholderConfigurer#0,hadoopConfiguration,wordcountJob,setupScript,runner]; root of factory hierarchy
Exception in thread "main" org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'runner': Invocation of init method failed; nested exception is java.lang.IllegalStateException: Job wordcountJob] failed to start; status=FAILED
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1488)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:524)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:461)
at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:295)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:223)
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:292)
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:194)
at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:626)
at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:932)
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:479)
at org.springframework.context.support.ClassPathXmlApplicationContext.(ClassPathXmlApplicationContext.java:197)
at org.springframework.context.support.ClassPathXmlApplicationContext.(ClassPathXmlApplicationContext.java:172)
at org.springframework.context.support.ClassPathXmlApplicationContext.(ClassPathXmlApplicationContext.java:158)
at org.springframework.samples.hadoop.mapreduce.Wordcount.main(Wordcount.java:28)
Caused by: java.lang.IllegalStateException: Job wordcountJob] failed to start; status=FAILED
at org.springframework.data.hadoop.mapreduce.JobExecutor$2.run(JobExecutor.java:223)
at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:49)
at org.springframework.data.hadoop.mapreduce.JobExecutor.startJobs(JobExecutor.java:172)
at org.springframework.data.hadoop.mapreduce.JobExecutor.startJobs(JobExecutor.java:164)
at org.springframework.data.hadoop.mapreduce.JobRunner.call(JobRunner.java:52)
at org.springframework.data.hadoop.mapreduce.JobRunner.afterPropertiesSet(JobRunner.java:44)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1547)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1485)
... 13 more

Any thoughts on this? It looks like the job is connecting to Hadoop and the ResourceManager...
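
Two details in the output stand out, though this is only a reading of the log. First, an AM container exiting with code 127 usually means the container launch script could not find a command to execute, most often because JAVA_HOME is not set in the environment the NodeManager gives to containers. Second, the warning "No job jar file set. User classes may not be found." suggests the tasks may not see your mapper and reducer classes; in plain MapReduce code that is addressed by pointing the job at a class from your application jar, roughly like this sketch (class names are illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class WordcountJobSetup {
    public static Job create(Configuration conf) throws Exception {
        Job job = Job.getInstance(conf, "wordcountJob");
        // Locates the jar containing this class and ships it to the
        // cluster, so map and reduce tasks can load the user classes.
        job.setJarByClass(WordcountJobSetup.class);
        return job;
    }
}

In the Spring for Apache Hadoop XML namespace the equivalent is the jar-by-class attribute on the job definition.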

Protocol message contained an invalid tag

I've been struggling to make this example work.
I am running Hadoop version 2.0.5-alpha-gphd-2.1.0.0 with the namenode at port 8020.
I ran the application as described in the README file:

mvn clean package -Pphd20
sh ./target/appassembler/bin/wordcount

I got the error:
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero).

Here's the gist containing the stacktrace
https://gist.github.com/betht1220/e9aab0b241778fe93758

Can you please help me out with this?
Thanks.
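
An "invalid tag (zero)" from protobuf generally means the client sent an RPC to something that did not speak the same protocol: either the client jars built by the Maven profile do not match the cluster's Hadoop RPC version, or the configured fs.defaultFS points at a non-RPC port such as the namenode web UI. A small smoke test, assuming the namenode RPC address from the report, can help separate the two cases:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsSmokeTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Must be the namenode's RPC port; pointing this at an HTTP port
        // typically produces exactly this protobuf "invalid tag" error.
        conf.set("fs.defaultFS", "hdfs://localhost:8020");
        try (FileSystem fs = FileSystem.get(conf)) {
            System.out.println("Root exists: " + fs.exists(new Path("/")));
        }
    }
}

If this fails with the same exception, compare the Hadoop client version pulled in by the phd20 profile against the 2.0.5-alpha-gphd-2.1.0.0 cluster.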

New interface for HBase

I am trying to use Spring as a framework to read and write data in HBase. Even though it works, I get deprecation warnings when I build it. Could we update to the new HBase interfaces to optimize performance and prevent NoSuchClass exceptions?

warning: [deprecation] HTableInterface in org.apache.hadoop.hbase.client has been deprecated
import org.apache.hadoop.hbase.client.HTableInterface;
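
For reference, a minimal sketch of the replacement client API (HBase 1.x), where a Table obtained from a shared Connection stands in for the deprecated HTableInterface; the table, family, and qualifier names here are made up for illustration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class NewHBaseApi {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Connection is heavyweight and thread-safe; Table instances are
        // cheap, per-use handles replacing the deprecated HTableInterface.
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("users"))) {
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("name"), Bytes.toBytes("alice"));
            table.put(put);
        }
    }
}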

yarn-examples-simple-command failed on cdh5@centos

I'm using the Cloudera CDH 5.0 QuickStart VM for my development Hadoop cluster.
When I run yarn-examples-simple-command following the README.md, the application fails with the message below.
Any ideas on this?

2014-10-20 18:24:46,286 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1413848891726_0001_02_000001 transitioned from LOCALIZED to RUNNING
2014-10-20 18:24:46,296 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: launchContainer: [nice, -n, 0, bash, /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/cloudera/appcache/application_1413848891726_0001/container_1413848891726_0001_02_000001/default_container_executor.sh]
2014-10-20 18:24:46,516 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1413848891726_0001_02_000001 is : 1
2014-10-20 18:24:46,516 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_1413848891726_0001_02_000001 and exit code: 1
org.apache.hadoop.util.Shell$ExitCodeException:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:511)
        at org.apache.hadoop.util.Shell.run(Shell.java:424)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:656)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2014-10-20 18:24:46,516 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor:
2014-10-20 18:24:46,516 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Container exited with a non-zero exit code 1
