cdh-twitter-example's People

Contributors

strangetcy


cdh-twitter-example's Issues

variable [wfInput] cannot be resolved

Hello!

I have followed all the steps, but when I run the job the following error occurs:
variable [wfInput] cannot be resolved

<coordinator-app name="add-partition-coord" frequency="${coord:hours(1)}" start="${jobStart}" end="${jobEnd}" timezone="UTC" xmlns="uri:oozie:coordinator:0.1">
  <datasets>
    <dataset name="tweets" frequency="${coord:hours(1)}" initial-instance="${initialDataset}" timezone="America/Los_Angeles">
      <uri-template>hdfs://jupiter:8020/user/flume/tweets/${YEAR}/${MONTH}/${DAY}/${HOUR}</uri-template>
      <done-flag></done-flag>
    </dataset>
  </datasets>
  <input-events>
    <data-in name="input" dataset="tweets">
      <instance>${coord:current(coord:tzOffset() / 60)}</instance>
    </data-in>
    <data-in name="readyIndicator" dataset="tweets">
      <instance>${coord:current(1 + (coord:tzOffset() / 60))}</instance>
    </data-in>
  </input-events>
  <action>
    <workflow>
      <app-path>${workflowRoot}/hive-action.xml</app-path>
      <configuration>
        <property>
          <name>wfInput</name>
          <value>${coord:dataIn('input')}</value>
        </property>
        <property>
          <name>dateHour</name>
          <value>${coord:formatTime(coord:dateOffset(coord:nominalTime(), tzOffset, 'HOUR'), 'yyyyMMddHH')}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>

nameNode=hdfs://jupiter:8020
jobTracker=jupiter:8021
workflowRoot=${nameNode}/user/${user.name}/oozie-workflows

jobStart=2016-11-15T12:30Z
jobEnd=2016-11-15T15:00Z

initialDataset=2016-11-15T11:00Z

tzOffset=+1

oozie.use.system.libpath=true
oozie.coord.application.path=${nameNode}/user/${user.name}/oozie-workflows/coord-app.xml

I live in Spain (UTC+1). The job starts correctly at the scheduled time.
Does anyone know why this error occurs? Can anyone help me?
Thanks in advance.
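
For reference, wfInput only exists if the coordinator materializes it and hands it to the workflow; inside the workflow it is consumed roughly like this (a sketch assuming the hive-action.xml layout from the Cloudera tutorial; the parameter names here are assumptions):

<workflow-app xmlns="uri:oozie:workflow:0.2" name="hive-add-partition-wf">
  <start to="hive-node"/>
  <action name="hive-node">
    <hive xmlns="uri:oozie:hive-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <script>add_partition.q</script>
      <!-- These params are resolved from the coordinator's configuration;
           if wfInput was never defined there, resolution fails with exactly
           "variable [wfInput] cannot be resolved". -->
      <param>WFINPUT=${wfInput}</param>
      <param>DATEHOUR=${dateHour}</param>
    </hive>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Hive failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>

One common way to hit this error is submitting the workflow directly (oozie.wf.application.path) instead of the coordinator (oozie.coord.application.path), since only the coordinator defines wfInput.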

Configuration issues with Oozie 3.1.3-cdh4.0.1?

This is a great tutorial - many thanks for posting it. I followed all of the setup instructions, but got hung up on running the Oozie workflow, with the error: "Error: E0504 : E0504: App directory [hdfs://phocion:8020/user/tim/oozie-workflows/coord-app.xml] does not exist"

The file certainly does exist, and there doesn't seem to be an issue with permissions. I'm not sure if this error is suggesting it can't find coord-app.xml or if there is an issue with a setting in coord-app.xml. Could there be some issue with my default CDH4 setup?

tim@phocion:/user/tim$ oozie version
Oozie client build version: 3.1.3-cdh4.0.1

tim@phocion:/user/tim$ oozie job -oozie http://localhost:11000/oozie -config oozie-workflows/job.properties -run
Error: E0504 : E0504: App directory [hdfs://phocion:8020/user/tim/oozie-workflows/coord-app.xml] does not exist

tim@phocion:/user/tim$ sudo -u oozie [ -f oozie-workflows/coord-app.xml ] && echo "FOUND" || echo "NOT FOUND"
FOUND

tim@phocion:/user/tim$ ls -l oozie-workflows/
total 24
-rwxr-xr-x 1 tim tim 938 Sep 24 21:29 add_partition.q
-rwxr-xr-x 1 tim tim 1356 Sep 26 11:09 coord-app.xml
-rwxr-xr-x 1 tim tim 1918 Sep 24 21:29 hive-action.xml
-rwxr-xr-x 1 tim tim 2200 Sep 24 21:29 hive-site.xml
-rwxr-xr-x 1 tim tim 1356 Sep 26 11:32 job.properties
drwxr-xr-x 2 tim tim 4096 Sep 24 21:29 lib
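
Worth noting: E0504 is raised against HDFS, while the [ -f ... ] test and the ls above both check the local filesystem. A quick way to confirm the app actually exists in HDFS (paths taken from the error message):

hadoop fs -ls hdfs://phocion:8020/user/tim/oozie-workflows

# If nothing is there, copy the local directory into HDFS first:
hadoop fs -put oozie-workflows /user/tim/oozie-workflows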

Null values in nested structures issue

When a nested structure is missing a value in the JSON object, the Hive deserialization silently fails for the whole structure.
I've created a fork of the project with a patch that I believe demonstrates the issue and the suggested fix:

sjulias@33617a0

It would be great if you could review this change and hopefully apply it to the main project.

Thanks in advance,
Yulia
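
For readers hitting the same problem, the shape of the fix is roughly the following (a minimal sketch with hypothetical names; the actual change is in the linked commit):

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class NullSafeStruct {
  // Build a Hive struct (a list of field values) from a parsed JSON map.
  // The guard: Map.get returns null for absent keys, and that null is
  // passed through instead of aborting the whole structure, so the rest
  // of the row still deserializes.
  static List<Object> toStruct(Map<String, Object> json, List<String> fieldNames) {
    List<Object> struct = new ArrayList<Object>();
    for (String name : fieldNames) {
      struct.add(json == null ? null : json.get(name));
    }
    return struct;
  }
}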

Issues with the json serde.

Hi,

I'm using the JSON serde in Hive to parse another set of JSON files I have from Valentine's Day. I noticed that there is no option to ignore malformed JSON, and there seem to be some problems deserializing all of it.

This tweet is causing the error:

{"text":"@KimKardashian happy valentines day, hope it's a good one","retweet_count":0,"geo":{"type":"Point","coordinates":[38.7313358,-108.05278695]},"in_reply_to_status_id_str":null,"in_reply_to_user_id":25365536,"source":"\u003Ca href="http://twitter.com/download/android" rel="nofollow"\u003ETwitter for Android\u003C/a\u003E","in_reply_to_user_id_str":"25365536","id_str":"169483808003989505","entities":{"user_mentions":[{"indices":[0,14],"screen_name":"KimKardashian","id_str":"25365536","name":"Kim Kardashian","id":25365536}],"urls":[],"hashtags":[]},"in_reply_to_status_id":null,"place":{"url":"http://api.twitter.com/1/geo/id/6a7e7dbf9d6c7ac4.json","place_type":"city","country_code":"US","attributes":{},"full_name":"Delta, CO","bounding_box":{"type":"Polygon","coordinates":[[[-108.104644,38.71503],[-108.021863,38.71503],[-108.021863,38.769794],[-108.104644,38.769794]]]},"name":"Delta","id":"6a7e7dbf9d6c7ac4","country":"United States"},"in_reply_to_screen_name":"Ki{"text":"@bbrandivirgo too bad I dont have the number. Happy valentines day tho :)","retweet_count":0,"geo":{"type":"Point","coordinates":[33.77406404,-84.39270512]},"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"source":"\u003Ca href="http://mobile.twitter.com" rel="nofollow"\u003EMobile Web\u003C/a\u003E","in_reply_to_user_id_str":null,"id_str":"169497701241716736","entities":{"user_mentions":[],"urls":[],"hashtags":[]},"in_reply_to_status_id":null,"place":{"url":"http://api.twitter.com/1/geo/id/8173485c72e78ca5.json","place_type":"city","country_code":"US","attributes":{},"full_name":"Atlanta, GA","bounding_box":{"type":"Polygon","coordinates":[[[-84.54674,33.647908],[-84.289389,33.647908],[-84.289389,33.887618],[-84.54674,33.887618]]]},"name":"Atlanta","id":"8173485c72e78ca5","country":"United States"},"in_reply_to_screen_name":null,"favorited":false,"truncated":false,"created_at":"Tue Feb 14 19:06:15 +0000 2012","contributors":null,"user":{"contributors_enabled":false,"profile_background_image_url":"http://a3.twimg.com/profile_background_images/376284279/yyyyyyyyyyyyyyyyyyyy.jpg","url":"http://facebook.com/cperk3","profile_link_color":"0084B4","followers_count":773,"profile_image_url":"http://a3.twimg.com/profile_images/1792490671/000011110000_normal.jpg","default_profile_image":false,"show_all_inline_media":true,"statuses_count":3271,"profile_background_color":"C0DEED","description":"Ga Tech Athlete-Student.. Black&Samoan...Follow me as I follow Jesus-","location":"Atlanta, GA","profile_background_tile":true,"favourites_count":1,"profile_background_image_url_https":"https://si0.twimg.com/profile_background_images/376284279/yyyyyyyyyyyyyyyyyyyy.jpg","time_zone":"Quito","profile_sidebar_fill_color":"DDEEF6","screen_name":"Cpeezy21","id_str":"312682111","lang":"en","geo_enabled":true,"profile_image_url_https":"https://si0.twimg.com/profile_images/1792490671/000011110000_normal.jpg","verified":false,"notifications":null,"profile_sidebar_border_color":"04080a","protected":false,"listed_count":5,"created_at":"Tue Jun 07 14:14:34 +0000 2011","name":"Charles Perkins III","is_translator":false,"follow_request_sent":null,"following":null,"profile_use_background_image":true,"friends_count":223,"id":312682111,"default_profile":false,"utc_offset":-18000,"profile_text_color":"333333"},"retweeted":false,"id":169497701241716736,"coordinates":{"type":"Point","coordinates":[-84.39270512,33.77406404]}}

I'm getting this error when processing some sample twitter data:

2012-09-26 15:15:39,059 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2012-09-26 15:15:39,215 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2012-09-26 15:15:39,372 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /mapred/local/taskTracker/distcache/-624804405132306423_-2027207125_45603557/hadoop1.domain.com/tmp/hive-root/hive_2012-09-26_15-15-33_715_8669028640552125101/-mr-10004/af319f96-99f0-4f06-8fba-3fbf5b880148 <- /mapred/local/taskTracker/root/jobcache/job_201209252321_0010/attempt_201209252321_0010_m_000000_0/work/HIVE_PLANaf319f96-99f0-4f06-8fba-3fbf5b880148
2012-09-26 15:15:39,380 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/root/jobcache/job_201209252321_0010/jars/job.jar <- /mapred/local/taskTracker/root/jobcache/job_201209252321_0010/attempt_201209252321_0010_m_000000_0/work/job.jar
2012-09-26 15:15:39,388 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /mapred/local/taskTracker/root/jobcache/job_201209252321_0010/jars/.job.jar.crc <- /mapred/local/taskTracker/root/jobcache/job_201209252321_0010/attempt_201209252321_0010_m_000000_0/work/.job.jar.crc
2012-09-26 15:15:39,451 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
2012-09-26 15:15:39,452 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2012-09-26 15:15:39,767 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2012-09-26 15:15:39,773 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@484845aa
2012-09-26 15:15:40,065 WARN org.apache.hadoop.hive.conf.HiveConf: hive-site.xml not found on CLASSPATH
2012-09-26 15:15:40,222 WARN org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library is available
2012-09-26 15:15:40,222 INFO org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library loaded
2012-09-26 15:15:40,232 WARN mapreduce.Counters: Counter name MAP_INPUT_BYTES is deprecated. Use FileInputFormatCounters as group name and BYTES_READ as counter name instead
2012-09-26 15:15:40,236 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 0
2012-09-26 15:15:40,242 INFO ExecMapper: maximum memory = 119341056
2012-09-26 15:15:40,243 INFO ExecMapper: conf classpath = [file:/var/run/cloudera-scm-agent/process/93-mapreduce-TASKTRACKER/, file:/usr/java/jdk1.6.0_31/lib/tools.jar, file:/usr/lib/hadoop-0.20-mapreduce/, file:/usr/lib/hadoop-0.20-mapreduce/hadoop-core-2.0.0-mr1-cdh4.0.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/activation-1.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/ant-contrib-1.0b3.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/asm-3.2.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/aspectjrt-1.6.5.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/aspectjtools-1.6.5.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/avro-1.5.4.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/avro-compiler-1.5.4.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-beanutils-1.7.0.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-beanutils-core-1.8.0.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-cli-1.2.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-codec-1.4.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-collections-3.2.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-configuration-1.6.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-digester-1.8.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-el-1.0.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-httpclient-3.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-io-2.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-lang-2.5.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-logging-1.1.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-logging-api-1.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-math-2.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-net-3.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/core-3.1.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/guava-11.0.2.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/hadoop-fairscheduler-2.0.0-mr1-cdh4.0.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/hsqldb-1.8.0.10.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-core-asl-1.8.8.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-jaxrs-1.8.8.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-mapper-asl-1.8.8.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-xc-1.8.8.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jasper-compiler-5.5.23.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jasper-runtime-5.5.23.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jaxb-api-2.2.2.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jaxb-impl-2.2.3-1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jdiff-1.0.9.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jersey-core-1.8.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jersey-json-1.8.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jersey-server-1.8.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jets3t-0.6.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jettison-1.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jetty-6.1.26.cloudera.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jetty-util-6.1.26.cloudera.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jsch-0.1.42.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/json-simple-1.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jsp-api-2.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jsr305-1.3.9.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/kfs-0.2.2.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/kfs-0.3.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/log4j-1.2.16.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/oro-2.0.8.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/paranamer-2.3.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/protobuf-java-2.4.0a.jar, 
file:/usr/lib/hadoop-0.20-mapreduce/lib/servlet-api-2.5.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/slf4j-api-1.6.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/snappy-java-1.0.3.2.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/stax-api-1.0.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/xmlenc-0.52.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jsp-2.1/jsp-2.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jsp-2.1/jsp-api-2.1.jar, file:/usr/share/cmf/lib/plugins/tt-instrumentation-4.0.4.jar, file:/usr/share/cmf/lib/plugins/event-publish-4.0.4-shaded.jar, file:/usr/lib/hadoop-hdfs/lib/avro-1.5.4.jar, file:/usr/lib/hadoop-hdfs/lib/paranamer-2.3.jar, file:/usr/lib/hadoop-hdfs/lib/commons-logging-1.1.1.jar, file:/usr/lib/hadoop-hdfs/lib/jackson-mapper-asl-1.8.8.jar, file:/usr/lib/hadoop-hdfs/lib/slf4j-api-1.6.1.jar, file:/usr/lib/hadoop-hdfs/lib/protobuf-java-2.4.0a.jar, file:/usr/lib/hadoop-hdfs/lib/snappy-java-1.0.3.2.jar, file:/usr/lib/hadoop-hdfs/lib/jline-0.9.94.jar, file:/usr/lib/hadoop-hdfs/lib/commons-daemon-1.0.3.jar, file:/usr/lib/hadoop-hdfs/lib/jackson-core-asl-1.8.8.jar, file:/usr/lib/hadoop-hdfs/lib/zookeeper-3.4.3-cdh4.0.1.jar, file:/usr/lib/hadoop-hdfs/lib/log4j-1.2.15.jar, file:/usr/lib/hadoop-hdfs/hadoop-hdfs-2.0.0-cdh4.0.1.jar, file:/usr/lib/hadoop-hdfs/hadoop-hdfs-2.0.0-cdh4.0.1.jar, file:/usr/lib/hadoop-hdfs/hadoop-hdfs-2.0.0-cdh4.0.1-tests.jar, file:/usr/lib/hadoop/lib/commons-beanutils-core-1.8.0.jar, file:/usr/lib/hadoop/lib/commons-codec-1.4.jar, file:/usr/lib/hadoop/lib/jets3t-0.6.1.jar, file:/usr/lib/hadoop/lib/json-simple-1.1.jar, file:/usr/lib/hadoop/lib/guava-11.0.2.jar, file:/usr/lib/hadoop/lib/avro-1.5.4.jar, file:/usr/lib/hadoop/lib/commons-beanutils-1.7.0.jar, file:/usr/lib/hadoop/lib/commons-configuration-1.6.jar, file:/usr/lib/hadoop/lib/asm-3.2.jar, file:/usr/lib/hadoop/lib/paranamer-2.3.jar, file:/usr/lib/hadoop/lib/jaxb-impl-2.2.3-1.jar, file:/usr/lib/hadoop/lib/jackson-xc-1.8.8.jar, file:/usr/lib/hadoop/lib/commons-logging-1.1.1.jar, file:/usr/lib/hadoop/lib/jackson-mapper-asl-1.8.8.jar, file:/usr/lib/hadoop/lib/commons-cli-1.2.jar, file:/usr/lib/hadoop/lib/jetty-6.1.26.cloudera.1.jar, file:/usr/lib/hadoop/lib/commons-lang-2.5.jar, file:/usr/lib/hadoop/lib/kfs-0.3.jar, file:/usr/lib/hadoop/lib/hue-plugins-2.0.0-cdh4.0.1.jar, file:/usr/lib/hadoop/lib/jasper-compiler-5.5.23.jar, file:/usr/lib/hadoop/lib/jettison-1.1.jar, file:/usr/lib/hadoop/lib/slf4j-api-1.6.1.jar, file:/usr/lib/hadoop/lib/jsch-0.1.42.jar, file:/usr/lib/hadoop/lib/stax-api-1.0.1.jar, file:/usr/lib/hadoop/lib/protobuf-java-2.4.0a.jar, file:/usr/lib/hadoop/lib/jsr305-1.3.9.jar, file:/usr/lib/hadoop/lib/snappy-java-1.0.3.2.jar, file:/usr/lib/hadoop/lib/jsp-api-2.1.jar, file:/usr/lib/hadoop/lib/oro-2.0.8.jar, file:/usr/lib/hadoop/lib/jersey-server-1.8.jar, file:/usr/lib/hadoop/lib/commons-digester-1.8.jar, file:/usr/lib/hadoop/lib/commons-math-2.1.jar, file:/usr/lib/hadoop/lib/jline-0.9.94.jar, file:/usr/lib/hadoop/lib/core-3.1.1.jar, file:/usr/lib/hadoop/lib/commons-httpclient-3.1.jar, file:/usr/lib/hadoop/lib/commons-el-1.0.jar, file:/usr/lib/hadoop/lib/jersey-core-1.8.jar, file:/usr/lib/hadoop/lib/jackson-jaxrs-1.8.8.jar, file:/usr/lib/hadoop/lib/jackson-core-asl-1.8.8.jar, file:/usr/lib/hadoop/lib/jetty-util-6.1.26.cloudera.1.jar, file:/usr/lib/zookeeper/zookeeper-3.4.3-cdh4.0.1.jar, file:/usr/lib/hadoop/lib/jasper-runtime-5.5.23.jar, file:/usr/lib/hadoop/lib/commons-net-3.1.jar, file:/usr/lib/hadoop/lib/servlet-api-2.5.jar, file:/usr/lib/hadoop/lib/jaxb-api-2.2.2.jar, 
file:/usr/lib/hadoop/lib/commons-io-2.1.jar, file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.6.1.jar, file:/usr/lib/hadoop/lib/commons-logging-api-1.1.jar, file:/usr/lib/hadoop/lib/xmlenc-0.52.jar, file:/usr/lib/hadoop/lib/commons-collections-3.2.1.jar, file:/usr/lib/hadoop/lib/activation-1.1.jar, file:/usr/lib/hadoop/lib/jersey-json-1.8.jar, file:/usr/lib/hadoop/lib/aspectjrt-1.6.5.jar, file:/usr/lib/hadoop/lib/log4j-1.2.15.jar, file:/usr/lib/hadoop/hadoop-common-2.0.0-cdh4.0.1.jar, file:/usr/lib/hadoop/hadoop-auth-2.0.0-cdh4.0.1.jar, file:/usr/lib/hadoop/hadoop-common-2.0.0-cdh4.0.1.jar, file:/usr/lib/hadoop/hadoop-annotations-2.0.0-cdh4.0.1.jar, file:/usr/lib/hadoop/hadoop-common-2.0.0-cdh4.0.1-tests.jar, file:/usr/lib/hadoop/hadoop-annotations-2.0.0-cdh4.0.1.jar, file:/usr/lib/hadoop/hadoop-auth-2.0.0-cdh4.0.1.jar, file:/mapred/local/taskTracker/root/jobcache/job_201209252321_0010/jars/classes, file:/mapred/local/taskTracker/root/jobcache/job_201209252321_0010/jars/job.jar, file:/mapred/local/taskTracker/root/distcache/4260026189093522549_-70309741_45603944/hadoop1.domain.com/user/root/.staging/job_201209252321_0010/libjars/hive-builtins-0.8.1-cdh4.0.1.jar, file:/mapred/local/taskTracker/root/distcache/-6339710882011042599_2132445101_45603979/hadoop1.domain.com/user/root/.staging/job_201209252321_0010/libjars/hive-serdes-1.0-SNAPSHOT.jar, file:/mapred/local/taskTracker/root/distcache/7269667103068590023_-978189584_45604014/hadoop1.domain.com/user/root/.staging/job_201209252321_0010/libjars/hive-contrib-0.8.1-cdh4.0.1.jar, file:/mapred/local/taskTracker/root/jobcache/job_201209252321_0010/attempt_201209252321_0010_m_000000_0/work/]
2012-09-26 15:15:40,243 INFO ExecMapper: thread classpath = [file:/var/run/cloudera-scm-agent/process/93-mapreduce-TASKTRACKER/, file:/usr/java/jdk1.6.0_31/lib/tools.jar, file:/usr/lib/hadoop-0.20-mapreduce/, file:/usr/lib/hadoop-0.20-mapreduce/hadoop-core-2.0.0-mr1-cdh4.0.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/activation-1.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/ant-contrib-1.0b3.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/asm-3.2.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/aspectjrt-1.6.5.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/aspectjtools-1.6.5.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/avro-1.5.4.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/avro-compiler-1.5.4.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-beanutils-1.7.0.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-beanutils-core-1.8.0.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-cli-1.2.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-codec-1.4.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-collections-3.2.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-configuration-1.6.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-digester-1.8.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-el-1.0.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-httpclient-3.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-io-2.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-lang-2.5.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-logging-1.1.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-logging-api-1.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-math-2.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/commons-net-3.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/core-3.1.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/guava-11.0.2.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/hadoop-fairscheduler-2.0.0-mr1-cdh4.0.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/hsqldb-1.8.0.10.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-core-asl-1.8.8.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-jaxrs-1.8.8.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-mapper-asl-1.8.8.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jackson-xc-1.8.8.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jasper-compiler-5.5.23.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jasper-runtime-5.5.23.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jaxb-api-2.2.2.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jaxb-impl-2.2.3-1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jdiff-1.0.9.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jersey-core-1.8.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jersey-json-1.8.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jersey-server-1.8.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jets3t-0.6.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jettison-1.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jetty-6.1.26.cloudera.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jetty-util-6.1.26.cloudera.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jsch-0.1.42.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/json-simple-1.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jsp-api-2.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jsr305-1.3.9.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/kfs-0.2.2.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/kfs-0.3.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/log4j-1.2.16.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/oro-2.0.8.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/paranamer-2.3.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/protobuf-java-2.4.0a.jar, 
file:/usr/lib/hadoop-0.20-mapreduce/lib/servlet-api-2.5.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/slf4j-api-1.6.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/snappy-java-1.0.3.2.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/stax-api-1.0.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/xmlenc-0.52.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jsp-2.1/jsp-2.1.jar, file:/usr/lib/hadoop-0.20-mapreduce/lib/jsp-2.1/jsp-api-2.1.jar, file:/usr/share/cmf/lib/plugins/tt-instrumentation-4.0.4.jar, file:/usr/share/cmf/lib/plugins/event-publish-4.0.4-shaded.jar, file:/usr/lib/hadoop-hdfs/lib/avro-1.5.4.jar, file:/usr/lib/hadoop-hdfs/lib/paranamer-2.3.jar, file:/usr/lib/hadoop-hdfs/lib/commons-logging-1.1.1.jar, file:/usr/lib/hadoop-hdfs/lib/jackson-mapper-asl-1.8.8.jar, file:/usr/lib/hadoop-hdfs/lib/slf4j-api-1.6.1.jar, file:/usr/lib/hadoop-hdfs/lib/protobuf-java-2.4.0a.jar, file:/usr/lib/hadoop-hdfs/lib/snappy-java-1.0.3.2.jar, file:/usr/lib/hadoop-hdfs/lib/jline-0.9.94.jar, file:/usr/lib/hadoop-hdfs/lib/commons-daemon-1.0.3.jar, file:/usr/lib/hadoop-hdfs/lib/jackson-core-asl-1.8.8.jar, file:/usr/lib/hadoop-hdfs/lib/zookeeper-3.4.3-cdh4.0.1.jar, file:/usr/lib/hadoop-hdfs/lib/log4j-1.2.15.jar, file:/usr/lib/hadoop-hdfs/hadoop-hdfs-2.0.0-cdh4.0.1.jar, file:/usr/lib/hadoop-hdfs/hadoop-hdfs-2.0.0-cdh4.0.1.jar, file:/usr/lib/hadoop-hdfs/hadoop-hdfs-2.0.0-cdh4.0.1-tests.jar, file:/usr/lib/hadoop/lib/commons-beanutils-core-1.8.0.jar, file:/usr/lib/hadoop/lib/commons-codec-1.4.jar, file:/usr/lib/hadoop/lib/jets3t-0.6.1.jar, file:/usr/lib/hadoop/lib/json-simple-1.1.jar, file:/usr/lib/hadoop/lib/guava-11.0.2.jar, file:/usr/lib/hadoop/lib/avro-1.5.4.jar, file:/usr/lib/hadoop/lib/commons-beanutils-1.7.0.jar, file:/usr/lib/hadoop/lib/commons-configuration-1.6.jar, file:/usr/lib/hadoop/lib/asm-3.2.jar, file:/usr/lib/hadoop/lib/paranamer-2.3.jar, file:/usr/lib/hadoop/lib/jaxb-impl-2.2.3-1.jar, file:/usr/lib/hadoop/lib/jackson-xc-1.8.8.jar, file:/usr/lib/hadoop/lib/commons-logging-1.1.1.jar, file:/usr/lib/hadoop/lib/jackson-mapper-asl-1.8.8.jar, file:/usr/lib/hadoop/lib/commons-cli-1.2.jar, file:/usr/lib/hadoop/lib/jetty-6.1.26.cloudera.1.jar, file:/usr/lib/hadoop/lib/commons-lang-2.5.jar, file:/usr/lib/hadoop/lib/kfs-0.3.jar, file:/usr/lib/hadoop/lib/hue-plugins-2.0.0-cdh4.0.1.jar, file:/usr/lib/hadoop/lib/jasper-compiler-5.5.23.jar, file:/usr/lib/hadoop/lib/jettison-1.1.jar, file:/usr/lib/hadoop/lib/slf4j-api-1.6.1.jar, file:/usr/lib/hadoop/lib/jsch-0.1.42.jar, file:/usr/lib/hadoop/lib/stax-api-1.0.1.jar, file:/usr/lib/hadoop/lib/protobuf-java-2.4.0a.jar, file:/usr/lib/hadoop/lib/jsr305-1.3.9.jar, file:/usr/lib/hadoop/lib/snappy-java-1.0.3.2.jar, file:/usr/lib/hadoop/lib/jsp-api-2.1.jar, file:/usr/lib/hadoop/lib/oro-2.0.8.jar, file:/usr/lib/hadoop/lib/jersey-server-1.8.jar, file:/usr/lib/hadoop/lib/commons-digester-1.8.jar, file:/usr/lib/hadoop/lib/commons-math-2.1.jar, file:/usr/lib/hadoop/lib/jline-0.9.94.jar, file:/usr/lib/hadoop/lib/core-3.1.1.jar, file:/usr/lib/hadoop/lib/commons-httpclient-3.1.jar, file:/usr/lib/hadoop/lib/commons-el-1.0.jar, file:/usr/lib/hadoop/lib/jersey-core-1.8.jar, file:/usr/lib/hadoop/lib/jackson-jaxrs-1.8.8.jar, file:/usr/lib/hadoop/lib/jackson-core-asl-1.8.8.jar, file:/usr/lib/hadoop/lib/jetty-util-6.1.26.cloudera.1.jar, file:/usr/lib/zookeeper/zookeeper-3.4.3-cdh4.0.1.jar, file:/usr/lib/hadoop/lib/jasper-runtime-5.5.23.jar, file:/usr/lib/hadoop/lib/commons-net-3.1.jar, file:/usr/lib/hadoop/lib/servlet-api-2.5.jar, file:/usr/lib/hadoop/lib/jaxb-api-2.2.2.jar, 
file:/usr/lib/hadoop/lib/commons-io-2.1.jar, file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.6.1.jar, file:/usr/lib/hadoop/lib/commons-logging-api-1.1.jar, file:/usr/lib/hadoop/lib/xmlenc-0.52.jar, file:/usr/lib/hadoop/lib/commons-collections-3.2.1.jar, file:/usr/lib/hadoop/lib/activation-1.1.jar, file:/usr/lib/hadoop/lib/jersey-json-1.8.jar, file:/usr/lib/hadoop/lib/aspectjrt-1.6.5.jar, file:/usr/lib/hadoop/lib/log4j-1.2.15.jar, file:/usr/lib/hadoop/hadoop-common-2.0.0-cdh4.0.1.jar, file:/usr/lib/hadoop/hadoop-auth-2.0.0-cdh4.0.1.jar, file:/usr/lib/hadoop/hadoop-common-2.0.0-cdh4.0.1.jar, file:/usr/lib/hadoop/hadoop-annotations-2.0.0-cdh4.0.1.jar, file:/usr/lib/hadoop/hadoop-common-2.0.0-cdh4.0.1-tests.jar, file:/usr/lib/hadoop/hadoop-annotations-2.0.0-cdh4.0.1.jar, file:/usr/lib/hadoop/hadoop-auth-2.0.0-cdh4.0.1.jar, file:/mapred/local/taskTracker/root/jobcache/job_201209252321_0010/jars/classes, file:/mapred/local/taskTracker/root/jobcache/job_201209252321_0010/jars/job.jar, file:/mapred/local/taskTracker/root/distcache/4260026189093522549_-70309741_45603944/hadoop1.domain.com/user/root/.staging/job_201209252321_0010/libjars/hive-builtins-0.8.1-cdh4.0.1.jar, file:/mapred/local/taskTracker/root/distcache/-6339710882011042599_2132445101_45603979/hadoop1.domain.com/user/root/.staging/job_201209252321_0010/libjars/hive-serdes-1.0-SNAPSHOT.jar, file:/mapred/local/taskTracker/root/distcache/7269667103068590023_-978189584_45604014/hadoop1.domain.com/user/root/.staging/job_201209252321_0010/libjars/hive-contrib-0.8.1-cdh4.0.1.jar, file:/mapred/local/taskTracker/root/jobcache/job_201209252321_0010/attempt_201209252321_0010_m_000000_0/work/]
2012-09-26 15:15:40,253 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias tweets to work list for file hdfs://hadoop1.domain.com:8020/uploads
2012-09-26 15:15:40,256 INFO org.apache.hadoop.hive.ql.exec.MapOperator: dump TS struct<text:string,user:struct<screen_name:string>>
2012-09-26 15:15:40,256 INFO ExecMapper:
<MAP>Id =3
  <Children>
    <TS>Id =0
      <Children>
        <SEL>Id =1
          <Children>
            <FS>Id =2
              <Parent>Id = 1 null<\Parent>
            <\FS>
          <\Children>
          <Parent>Id = 0 null<\Parent>
        <\SEL>
      <\Children>
      <Parent>Id = 3 null<\Parent>
    <\TS>
  <\Children>
<\MAP>
2012-09-26 15:15:40,257 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Initializing Self 3 MAP
2012-09-26 15:15:40,257 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing Self 0 TS
2012-09-26 15:15:40,257 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Operator 0 TS initialized
2012-09-26 15:15:40,257 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing children of 0 TS
2012-09-26 15:15:40,257 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing child 1 SEL
2012-09-26 15:15:40,257 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing Self 1 SEL
2012-09-26 15:15:40,262 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: SELECT struct<text:string,user:struct<screen_name:string>>
2012-09-26 15:15:40,262 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Operator 1 SEL initialized
2012-09-26 15:15:40,262 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing children of 1 SEL
2012-09-26 15:15:40,262 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing child 2 FS
2012-09-26 15:15:40,262 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing Self 2 FS
2012-09-26 15:15:40,293 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Operator 2 FS initialized
2012-09-26 15:15:40,293 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initialization Done 2 FS
2012-09-26 15:15:40,293 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initialization Done 1 SEL
2012-09-26 15:15:40,293 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initialization Done 0 TS
2012-09-26 15:15:40,293 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Initialization Done 3 MAP
2012-09-26 15:15:40,298 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Processing path hdfs://hadoop1.domain.com:8020/uploads/twitter.txt
2012-09-26 15:15:40,298 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Processing alias tweets for file hdfs://hadoop1.domain.com:8020/uploads
2012-09-26 15:15:40,497 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 forwarding 1 rows
2012-09-26 15:15:40,497 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarding 1 rows
2012-09-26 15:15:40,497 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 forwarding 1 rows
2012-09-26 15:15:40,497 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Final Path: FS hdfs://hadoop1.domain.com:8020/tmp/hive-root/hive_2012-09-26_15-15-33_715_8669028640552125101/_tmp.-ext-10002/000000_0
2012-09-26 15:15:40,498 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: FS hdfs://hadoop1.domain.com:8020/tmp/hive-root/hive_2012-09-26_15-15-33_715_8669028640552125101/_task_tmp.-ext-10002/_tmp.000000_0
2012-09-26 15:15:40,498 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: New Final Path: FS hdfs://hadoop1.domain.com:8020/tmp/hive-root/hive_2012-09-26_15-15-33_715_8669028640552125101/_tmp.-ext-10002/000000_0
2012-09-26 15:15:40,560 INFO ExecMapper: ExecMapper: processing 1 rows: used memory = 24284800
2012-09-26 15:15:40,577 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 forwarding 10 rows
2012-09-26 15:15:40,577 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarding 10 rows
2012-09-26 15:15:40,577 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 forwarding 10 rows
2012-09-26 15:15:40,577 INFO ExecMapper: ExecMapper: processing 10 rows: used memory = 24860552
2012-09-26 15:15:40,705 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 forwarding 100 rows
2012-09-26 15:15:40,705 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarding 100 rows
2012-09-26 15:15:40,705 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 forwarding 100 rows
2012-09-26 15:15:40,705 INFO ExecMapper: ExecMapper: processing 100 rows: used memory = 28885000
2012-09-26 15:15:41,499 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 forwarding 1000 rows
2012-09-26 15:15:41,499 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarding 1000 rows
2012-09-26 15:15:41,499 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 forwarding 1000 rows
2012-09-26 15:15:41,499 INFO ExecMapper: ExecMapper: processing 1000 rows: used memory = 7598072
2012-09-26 15:15:42,992 FATAL ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"text":"@KimKardashian happy valentines day, hope it's a good one","retweet_count":0,"geo":{"type":"Point","coordinates":[38.7313358,-108.05278695]},"in_reply_to_status_id_str":null,"in_reply_to_user_id":25365536,"source":"\u003Ca href="http://twitter.com/download/android" rel="nofollow"\u003ETwitter for Android\u003C/a\u003E","in_reply_to_user_id_str":"25365536","id_str":"169483808003989505","entities":{"user_mentions":[{"indices":[0,14],"screen_name":"KimKardashian","id_str":"25365536","name":"Kim Kardashian","id":25365536}],"urls":[],"hashtags":[]},"in_reply_to_status_id":null,"place":{"url":"http://api.twitter.com/1/geo/id/6a7e7dbf9d6c7ac4.json","place_type":"city","country_code":"US","attributes":{},"full_name":"Delta, CO","bounding_box":{"type":"Polygon","coordinates":[[[-108.104644,38.71503],[-108.021863,38.71503],[-108.021863,38.769794],[-108.104644,38.769794]]]},"name":"Delta","id":"6a7e7dbf9d6c7ac4","country":"United States"},"in_reply_to_screen_name":"Ki{"text":"@bbrandivirgo too bad I dont have the number. Happy valentines day tho :)","retweet_count":0,"geo":{"type":"Point","coordinates":[33.77406404,-84.39270512]},"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"source":"\u003Ca href="http://mobile.twitter.com" rel="nofollow"\u003EMobile Web\u003C/a\u003E","in_reply_to_user_id_str":null,"id_str":"169497701241716736","entities":{"user_mentions":[],"urls":[],"hashtags":[]},"in_reply_to_status_id":null,"place":{"url":"http://api.twitter.com/1/geo/id/8173485c72e78ca5.json","place_type":"city","country_code":"US","attributes":{},"full_name":"Atlanta, GA","bounding_box":{"type":"Polygon","coordinates":[[[-84.54674,33.647908],[-84.289389,33.647908],[-84.289389,33.887618],[-84.54674,33.887618]]]},"name":"Atlanta","id":"8173485c72e78ca5","country":"United States"},"in_reply_to_screen_name":null,"favorited":false,"truncated":false,"created_at":"Tue Feb 14 19:06:15 +0000 2012","contributors":null,"user":{"contributors_enabled":false,"profile_background_image_url":"http://a3.twimg.com/profile_background_images/376284279/yyyyyyyyyyyyyyyyyyyy.jpg","url":"http://facebook.com/cperk3","profile_link_color":"0084B4","followers_count":773,"profile_image_url":"http://a3.twimg.com/profile_images/1792490671/000011110000_normal.jpg","default_profile_image":false,"show_all_inline_media":true,"statuses_count":3271,"profile_background_color":"C0DEED","description":"Ga Tech Athlete-Student.. 
Black&Samoan...Follow me as I follow Jesus-","location":"Atlanta, GA","profile_background_tile":true,"favourites_count":1,"profile_background_image_url_https":"https://si0.twimg.com/profile_background_images/376284279/yyyyyyyyyyyyyyyyyyyy.jpg","time_zone":"Quito","profile_sidebar_fill_color":"DDEEF6","screen_name":"Cpeezy21","id_str":"312682111","lang":"en","geo_enabled":true,"profile_image_url_https":"https://si0.twimg.com/profile_images/1792490671/000011110000_normal.jpg","verified":false,"notifications":null,"profile_sidebar_border_color":"04080a","protected":false,"listed_count":5,"created_at":"Tue Jun 07 14:14:34 +0000 2011","name":"Charles Perkins III","is_translator":false,"follow_request_sent":null,"following":null,"profile_use_background_image":true,"friends_count":223,"id":312682111,"default_profile":false,"utc_offset":-18000,"profile_text_color":"333333"},"retweeted":false,"id":169497701241716736,"coordinates":{"type":"Point","coordinates":[-84.39270512,33.77406404]}}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:393)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.JsonParseException: Unexpected character ('t' (code 116)): was expecting comma to separate OBJECT entries
at [Source: java.io.StringReader@366ef7ba; line: 1, column: 999]
at com.cloudera.hive.serde.JSONSerDe.deserialize(JSONSerDe.java:128)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508)
... 9 more
Caused by: org.codehaus.jackson.JsonParseException: Unexpected character ('t' (code 116)): was expecting comma to separate OBJECT entries
at [Source: java.io.StringReader@366ef7ba; line: 1, column: 999]
at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:1291)
at org.codehaus.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:385)
at org.codehaus.jackson.impl.JsonParserMinimalBase._reportUnexpectedChar(JsonParserMinimalBase.java:306)
at org.codehaus.jackson.impl.ReaderBasedParser.nextToken(ReaderBasedParser.java:285)
at org.codehaus.jackson.map.deser.MapDeserializer._readAndBind(MapDeserializer.java:220)
at org.codehaus.jackson.map.deser.MapDeserializer.deserialize(MapDeserializer.java:165)
at org.codehaus.jackson.map.deser.MapDeserializer.deserialize(MapDeserializer.java:25)
at org.codehaus.jackson.map.ObjectMapper._readMapAndClose(ObjectMapper.java:2402)
at org.codehaus.jackson.map.ObjectMapper.readValue(ObjectMapper.java:1602)
at com.cloudera.hive.serde.JSONSerDe.deserialize(JSONSerDe.java:126)
... 10 more

2012-09-26 15:15:42,993 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 finished. closing...
2012-09-26 15:15:42,993 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 forwarded 4551 rows
2012-09-26 15:15:42,993 INFO org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:1
2012-09-26 15:15:42,993 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 finished. closing...
2012-09-26 15:15:42,993 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarded 4551 rows
2012-09-26 15:15:42,993 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 finished. closing...
2012-09-26 15:15:42,993 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 forwarded 4551 rows
2012-09-26 15:15:42,993 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 2 finished. closing...
2012-09-26 15:15:42,993 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: 2 forwarded 0 rows
2012-09-26 15:15:43,066 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: TABLE_ID_1_ROWCOUNT:4551
2012-09-26 15:15:43,066 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 Close done
2012-09-26 15:15:43,066 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 Close done
2012-09-26 15:15:43,066 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 3 Close done
2012-09-26 15:15:43,066 INFO ExecMapper: ExecMapper: processed 4551 rows: used memory = 17571376
2012-09-26 15:15:43,074 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2012-09-26 15:15:43,077 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"text":"@KimKardashian happy valentines day, hope it's a good one","retweet_count":0,"geo":{"type":"Point","coordinates":[38.7313358,-108.05278695]},"in_reply_to_status_id_str":null,"in_reply_to_user_id":25365536,"source":"\u003Ca href="http://twitter.com/download/android" rel="nofollow"\u003ETwitter for Android\u003C/a\u003E","in_reply_to_user_id_str":"25365536","id_str":"169483808003989505","entities":{"user_mentions":[{"indices":[0,14],"screen_name":"KimKardashian","id_str":"25365536","name":"Kim Kardashian","id":25365536}],"urls":[],"hashtags":[]},"in_reply_to_status_id":null,"place":{"url":"http://api.twitter.com/1/geo/id/6a7e7dbf9d6c7ac4.json","place_type":"city","country_code":"US","attributes":{},"full_name":"Delta, CO","bounding_box":{"type":"Polygon","coordinates":[[[-108.104644,38.71503],[-108.021863,38.71503],[-108.021863,38.769794],[-108.104644,38.769794]]]},"name":"Delta","id":"6a7e7dbf9d6c7ac4","country":"United States"},"in_reply_to_screen_name":"Ki{"text":"@bbrandivirgo too bad I dont have the number. Happy valentines day tho :)","retweet_count":0,"geo":{"type":"Point","coordinates":[33.77406404,-84.39270512]},"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"source":"\u003Ca href="http://mobile.twitter.com" rel="nofollow"\u003EMobile Web\u003C/a\u003E","in_reply_to_user_id_str":null,"id_str":"169497701241716736","entities":{"user_mentions":[],"urls":[],"hashtags":[]},"in_reply_to_status_id":null,"place":{"url":"http://api.twitter.com/1/geo/id/8173485c72e78ca5.json","place_type":"city","country_code":"US","attributes":{},"full_name":"Atlanta, GA","bounding_box":{"type":"Polygon","coordinates":[[[-84.54674,33.647908],[-84.289389,33.647908],[-84.289389,33.887618],[-84.54674,33.887618]]]},"name":"Atlanta","id":"8173485c72e78ca5","country":"United States"},"in_reply_to_screen_name":null,"favorited":false,"truncated":false,"created_at":"Tue Feb 14 19:06:15 +0000 2012","contributors":null,"user":{"contributors_enabled":false,"profile_background_image_url":"http://a3.twimg.com/profile_background_images/376284279/yyyyyyyyyyyyyyyyyyyy.jpg","url":"http://facebook.com/cperk3","profile_link_color":"0084B4","followers_count":773,"profile_image_url":"http://a3.twimg.com/profile_images/1792490671/000011110000_normal.jpg","default_profile_image":false,"show_all_inline_media":true,"statuses_count":3271,"profile_background_color":"C0DEED","description":"Ga Tech Athlete-Student.. Black&Samoan...Follow me as I follow Jesus-","location":"Atlanta, GA","profile_background_tile":true,"favourites_count":1,"profile_background_image_url_https":"https://si0.twimg.com/profile_background_images/376284279/yyyyyyyyyyyyyyyyyyyy.jpg","time_zone":"Quito","profile_sidebar_fill_color":"DDEEF6","screen_name":"Cpeezy21","id_str":"312682111","lang":"en","geo_enabled":true,"profile_image_url_https":"https://si0.twimg.com/profile_images/1792490671/000011110000_normal.jpg","verified":false,"notifications":null,"profile_sidebar_border_color":"04080a","protected":false,"listed_count":5,"created_at":"Tue Jun 07 14:14:34 +0000 2011","name":"Charles Perkins III","is_translator":false,"follow_request_sent":null,"following":null,"profile_use_background_image":true,"friends_count":223,"id":312682111,"default_profile":false,"utc_offset":-18000,"profile_text_color":"333333"},"retweeted":false,"id":169497701241716736,"coordinates":{"type":"Point","coordinates":[-84.39270512,33.77406404]}}
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:393)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {"text":"@KimKardashian happy valentines day, hope it's a good one","retweet_count":0,"geo":{"type":"Point","coordinates":[38.7313358,-108.05278695]},"in_reply_to_status_id_str":null,"in_reply_to_user_id":25365536,"source":"\u003Ca href="http://twitter.com/download/android" rel="nofollow"\u003ETwitter for Android\u003C/a\u003E","in_reply_to_user_id_str":"25365536","id_str":"169483808003989505","entities":{"user_mentions":[{"indices":[0,14],"screen_name":"KimKardashian","id_str":"25365536","name":"Kim Kardashian","id":25365536}],"urls":[],"hashtags":[]},"in_reply_to_status_id":null,"place":{"url":"http://api.twitter.com/1/geo/id/6a7e7dbf9d6c7ac4.json","place_type":"city","country_code":"US","attributes":{},"full_name":"Delta, CO","bounding_box":{"type":"Polygon","coordinates":[[[-108.104644,38.71503],[-108.021863,38.71503],[-108.021863,38.769794],[-108.104644,38.769794]]]},"name":"Delta","id":"6a7e7dbf9d6c7ac4","country":"United States"},"in_reply_to_screen_name":"Ki{"text":"@bbrandivirgo too bad I dont have the number. Happy valentines day tho :)","retweet_count":0,"geo":{"type":"Point","coordinates":[33.77406404,-84.39270512]},"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"source":"\u003Ca href="http://mobile.twitter.com" rel="nofollow"\u003EMobile Web\u003C/a\u003E","in_reply_to_user_id_str":null,"id_str":"169497701241716736","entities":{"user_mentions":[],"urls":[],"hashtags":[]},"in_reply_to_status_id":null,"place":{"url":"http://api.twitter.com/1/geo/id/8173485c72e78ca5.json","place_type":"city","country_code":"US","attributes":{},"full_name":"Atlanta, GA","bounding_box":{"type":"Polygon","coordinates":[[[-84.54674,33.647908],[-84.289389,33.647908],[-84.289389,33.887618],[-84.54674,33.887618]]]},"name":"Atlanta","id":"8173485c72e78ca5","country":"United States"},"in_reply_to_screen_name":null,"favorited":false,"truncated":false,"created_at":"Tue Feb 14 19:06:15 +0000 2012","contributors":null,"user":{"contributors_enabled":false,"profile_background_image_url":"http://a3.twimg.com/profile_background_images/376284279/yyyyyyyyyyyyyyyyyyyy.jpg","url":"http://facebook.com/cperk3","profile_link_color":"0084B4","followers_count":773,"profile_image_url":"http://a3.twimg.com/profile_images/1792490671/000011110000_normal.jpg","default_profile_image":false,"show_all_inline_media":true,"statuses_count":3271,"profile_background_color":"C0DEED","description":"Ga Tech Athlete-Student.. Black&Samoan...Follow me as I follow Jesus-","location":"Atlanta, GA","profile_background_tile":true,"favourites_count":1,"profile_background_image_url_https":"https://si0.twimg.com/profile_background_images/376284279/yyyyyyyyyyyyyyyyyyyy.jpg","time_zone":"Quito","profile_sidebar_fill_color":"DDEEF6","screen_name":"Cpeezy21","id_str":"312682111","lang":"en","geo_enabled":true,"profile_image_url_https":"https://si0.twimg.com/profile_images/1792490671/000011110000_normal.jpg","verified":false,"notifications":null,"profile_sidebar_border_color":"04080a","protected":false,"listed_count":5,"created_at":"Tue Jun 07 14:14:34 +0000 2011","name":"Charles Perkins III","is_translator":false,"follow_request_sent":null,"following":null,"profile_use_background_image":true,"friends_count":223,"id":312682111,"default_profile":false,"utc_offset":-18000,"profile_text_color":"333333"},"retweeted":false,"id":169497701241716736,"coordinates":{"type":"Point","coordinates":[-84.39270512,33.77406404]}}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
... 8 more
Caused by: org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.JsonParseException: Unexpected character ('t' (code 116)): was expecting comma to separate OBJECT entries
at [Source: java.io.StringReader@366ef7ba; line: 1, column: 999]
at com.cloudera.hive.serde.JSONSerDe.deserialize(JSONSerDe.java:128)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508)
... 9 more
Caused by: org.codehaus.jackson.JsonParseException: Unexpected character ('t' (code 116)): was expecting comma to separate OBJECT entries
at [Source: java.io.StringReader@366ef7ba; line: 1, column: 999]
at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:1291)
at org.codehaus.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:385)
at org.codehaus.jackson.impl.JsonParserMinimalBase._reportUnexpectedChar(JsonParserMinimalBase.java:306)
at org.codehaus.jackson.impl.ReaderBasedParser.nextToken(ReaderBasedParser.java:285)
at org.codehaus.jackson.map.deser.MapDeserializer._readAndBind(MapDeserializer.java:220)
at org.codehaus.jackson.map.deser.MapDeserializer.deserialize(MapDeserializer.java:165)
at org.codehaus.jackson.map.deser.MapDeserializer.deserialize(MapDeserializer.java:25)
at org.codehaus.jackson.map.ObjectMapper._readMapAndClose(ObjectMapper.java:2402)
at org.codehaus.jackson.map.ObjectMapper.readValue(ObjectMapper.java:1602)
at com.cloudera.hive.serde.JSONSerDe.deserialize(JSONSerDe.java:126)
... 10 more
2012-09-26 15:15:43,081 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
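
For what it's worth, the record in the stack trace is two tweets concatenated into a single line (note the truncated "in_reply_to_screen_name":"Ki followed immediately by a second {"text":...), so the parser is right to reject it; what the serde lacks is a way to skip such rows. A minimal sketch of that behaviour (not part of the published serde, just an illustration of catching the parse failure instead of killing the task):

import java.io.IOException;
import java.util.Map;
import org.codehaus.jackson.map.ObjectMapper;

public class LenientJsonParse {
  private static final ObjectMapper MAPPER = new ObjectMapper();

  // Returns the parsed tweet, or null when the line is not valid JSON.
  // A caller (e.g. a serde's deserialize()) can then skip null rows
  // instead of propagating the exception and failing the whole job.
  @SuppressWarnings("unchecked")
  static Map<String, Object> parseOrNull(String line) {
    try {
      return (Map<String, Object>) MAPPER.readValue(line, Map.class);
    } catch (IOException e) {
      return null;
    }
  }
}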

Authentication credentials are missing

I got this error when I tried to stream Twitter data into HBase using Flume:

2015-06-30 17:01:46,352 ERROR org.apache.flume.lifecycle.LifecycleSupervisor: Unable to start EventDrivenSourceRunner: { source:com.cloudera.flume.source.TwitterSource{name:Twitter,state:IDLE} } - Exception follows.
java.lang.IllegalStateException: Authentication credentials are missing. See http://twitter4j.org/configuration.html for the detail.
at twitter4j.TwitterBaseImpl.ensureAuthorizationEnabled(TwitterBaseImpl.java:200)
at twitter4j.TwitterStreamImpl.sample(TwitterStreamImpl.java:159)
at com.cloudera.flume.source.TwitterSource.start(TwitterSource.java:121)
at org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)

I wonder what this error is caused by.
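
For reference, the TwitterSource reads its OAuth credentials from the agent configuration. The relevant flume.conf lines look like this (agent and source names as in the tutorial; the values are placeholders you must fill in from your Twitter app):

TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.consumerKey = YOUR_CONSUMER_KEY
TwitterAgent.sources.Twitter.consumerSecret = YOUR_CONSUMER_SECRET
TwitterAgent.sources.Twitter.accessToken = YOUR_ACCESS_TOKEN
TwitterAgent.sources.Twitter.accessTokenSecret = YOUR_ACCESS_TOKEN_SECRET

If any of the four is missing, or the agent starts with the wrong flume.conf, twitter4j raises exactly this IllegalStateException.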

Hive table gives an error during creation

Hi,
When I try to create the table, it gives me the following error. Can anyone please let me know what I can do about this?

I have added the jars:
ADD JAR /usr/lib/hive/lib/hive-serdes-1.0-SNAPSHOT.jar;
ADD JAR /usr/local/Hive-JSON-Serde/json-serde/target/json-serde-1.3.9-SNAPSHOT-jar-with-dependencies.jar;

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Cannot validate serde: com.cloudera.hive.serde.JSONSerDe

Thanks.
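
"Cannot validate serde" generally means com.cloudera.hive.serde.JSONSerDe is not on the classpath of the session running the DDL. A sketch of the sequence in a single fresh Hive session (jar path from the report above; the column list is abbreviated from the tutorial's table):

ADD JAR /usr/lib/hive/lib/hive-serdes-1.0-SNAPSHOT.jar;

CREATE EXTERNAL TABLE tweets (
  id BIGINT,
  created_at STRING,
  text STRING
)
PARTITIONED BY (datehour INT)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/user/flume/tweets';

Note that hive-serdes-1.0-SNAPSHOT.jar (which contains com.cloudera.hive.serde.JSONSerDe) and the Hive-JSON-Serde jar are two different serdes; only the first provides the class named in the DDL.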

Use example with Hadoop 2.0.0-cdh4.2.1

We tried the example with the following software:

  • Hadoop 2.0.0-cdh4.2.1
  • Hive 0.10.0-cdh4.2.1
  • Flume 1.3.0-cdh4.2.1
  • Oozie 3.3.0-cdh4.2.1

In the description of the example, the use of MySQL is stressed. By default, Hadoop 2.0.0-cdh4.2.1 is installed with PostgreSQL for Hive and Derby for Oozie, which worked with no problem for this example.

We didn't need to install Flume manually either. In Cloudera Manager you can add Flume as a service. On the service's page you can add the content of flume.conf under Configuration – Agent (Base). On the same page you can set the agent name to TwitterAgent. When you put flume-sources-1.0-SNAPSHOT.jar in /usr/share/cmf/lib/plugins/, the jar is added to FLUME_CLASSPATH in /var/run/cloudera-scm-agent/process/<number>-flume-AGENT/flume-env.sh when the service is started.

However, one issue prevented us from using this service for the example. You have to add com.cloudera.flume.source.TwitterSource to flume.plugin.classes in flume-site.xml; otherwise you get a ClassNotFound error. We haven't found a way to do this via Cloudera Manager. When the service starts, a directory /var/run/cloudera-scm-agent/process/<number>-flume-AGENT is created, which includes flume-site.xml. When you restart the service via Cloudera Manager, a new directory is created with a different number. But after changing flume-site.xml you can use this directory to start Flume via the command line, as sketched below.
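
The flume-site.xml entry in question looks like this (a sketch of the property described above):

<configuration>
  <property>
    <name>flume.plugin.classes</name>
    <value>com.cloudera.flume.source.TwitterSource</value>
  </property>
</configuration>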

Concerning the custom Flume Source, it’s probably best to build the source with the right value for hadoop.version (in our case 2.0.0-cdh4.2.1) and flume.version (1.3.0-cdh4.2.1) in pom.xml.
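
For reference, that corresponds to the version properties in the flume-sources pom.xml (property names as referenced above; values from our setup):

<properties>
  <hadoop.version>2.0.0-cdh4.2.1</hadoop.version>
  <flume.version>1.3.0-cdh4.2.1</flume.version>
</properties>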

We had some trouble with the time zone. In our case the time zone in coord-app.xml in oozie-workflows had to be changed to "Europe/Amsterdam". In job.properties, tzOffset had to be changed to 1; otherwise we got a mismatch between the directory given by the WFINPUT parameter and the DATEHOUR parameter in the Action Configuration of Oozie (viewed via the Oozie Web Console).

We didn’t need to install the Oozie ShareLib in HDFS.

We used Hue File Browser to create the necessary directories in HDFS.

It turned out that each time a Hive session is started, the ADD JAR statement has to be executed again.

Flume: Unable to start EventDrivenSourceRunner

Hi, I tried this code and guidance for extracting data from Twitter, but when I start the agent I get the following error:

Unable to start EventDrivenSourceRunner: { source:com.cloudera.flume.source.TwitterSource{name:Twitter,state:IDLE} } - Exception follows.
java.lang.NoSuchMethodError: twitter4j.FilterQuery.setIncludeEntities(Z)Ltwitter4j/FilterQuery;
at com.cloudera.flume.source.TwitterSource.start(TwitterSource.java:139)
at org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)

What could be my mistake? I am an absolute beginner with Flume!

Thanks and best regards,
Martin
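
A NoSuchMethodError like this one almost always means a second, incompatible twitter4j jar is on Flume's classpath, shadowing the version that flume-sources-1.0-SNAPSHOT.jar was built against. A quick way to look for duplicates (install paths assumed for a package install):

find /usr/lib/flume-ng /usr/lib/hadoop* -name 'twitter4j*.jar' 2>/dev/null

If more than one version shows up, remove or replace the stray jar so only the version matching the flume-sources build remains.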

Getting an error while connecting Solr to SQL Server 2005

Hi,
Please tell me why I am getting this error when I am using the Solr DIH with SQL Server 2005:

WARN - 2015-01-20 18:56:28.534; org.apache.solr.handler.dataimport.SolrWriter; Error creating document : SolrInputDocument(fields: [id=[email protected], Resume_Size=397090, Document_Type=.pdf, Application_date=2014-09-03 15:12:19.0, Resume_content=[B@14ebc94, Phone_Day=435454354545, Years_of_Experience=2, Main_Skills=fdsf, Resume_Content_Type=text/html, Exported=false, City=grtgr, Timestamp=[B@3ec96e, Source=Makro-Care, Category=3, Fname=dsfsdf, Lname=dfdsf, statename=Chiba, Resume_File_Name=doc_en_FAQ.pdf, Current_Address=, version=1490823821167427584])
org.apache.solr.common.SolrException: ERROR: [doc=[email protected]] multiple values encountered for non multiValued copy field Phone_Day: 435454354545
at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:140)
at org.apache.solr.update.AddUpdateCommand.getLuceneDocument(AddUpdateCommand.java:78)
at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:238)
at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:164)
at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:926)
at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1080)
at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:692)
at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
at org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:71)
at org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:265)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:511)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
INFO - 2015-01-20 18:56:28.534; org.apache.solr.handler.dataimport.DocBuilder; Time taken = 0:0:0.693
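
The exception says the import delivered more than one value into Phone_Day while schema.xml declares it single-valued (typically because two copyField directives, or a field plus a copyField, both target it). One sketch of a fix, if multiple values are legitimate (the field type here is an assumption):

<field name="Phone_Day" type="string" indexed="true" stored="true" multiValued="true"/>

Otherwise, remove the duplicate copyField source so only one value reaches the field.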

Apache Hive is not loading tweets from Flume's directory

Hi!

I have been following the tutorial and I am stuck. Flume is gathering tweets correctly and saving them in /user/flume/tweets.

Then, I executed:

ADD JAR ;

CREATE EXTERNAL TABLE tweets (.....

But when, for example, I execute SELECT COUNT(*) FROM tweets;, the table is empty: the result is 0. Do I have to execute some other command in order to load the tweets?

Thanks!
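
If the table was created with PARTITIONED BY (datehour INT) as in the tutorial, Hive only reads files that sit under registered partitions; registering a new partition every hour is exactly what the tutorial's Oozie coordinator automates. A manual sketch (the partition value and path are just examples matching the tutorial's directory layout):

ALTER TABLE tweets ADD IF NOT EXISTS PARTITION (datehour = 2013022516)
LOCATION '/user/flume/tweets/2013/02/25/16';

SELECT COUNT(*) FROM tweets;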

Oozie job execution "Internal Server Error"

Hi,

I have executed the same procedure as given by you; the only difference is that I am not using Cloudera Manager. I am using CDH4.7. When I execute oozie job -oozie http://localhost:11000/oozie -config job.properties -run, I get the following error:

Error: HTTP error code: 500 : Internal Server Error.

Please help me cross this hurdle.
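
An HTTP 500 means the failure happened inside the Oozie server, so the client output alone won't explain it; the server log is the place to look (default log path for a package install, assumed here):

tail -n 100 /var/log/oozie/oozie.log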
