Comments (4)
I tried two ways of writing to HBase:
- the standard Spark API, saveAsNewAPIHadoopDataset;
- spark-hbase-connector.
Both failed in the same way: the job hung after "Created table instance for testTable" without any error message:
Log
16/03/24 23:26:00 INFO ClientCnxn: Session establishment complete on server bpaascp1devhbasezk01.bpaas.local/192.168.23.33:2181, sessionid = 0x152844be97f6ddf, negotiated timeout = 40000
16/03/24 23:26:00 INFO ZooKeeperRegistry: ClusterId read in ZooKeeper is null
16/03/24 23:26:00 INFO TableOutputFormat: Created table instance for testTable
Looking at the log more closely, there is one more relevant line: "ClusterId read in ZooKeeper is null". The cause is that *zookeeper.znode.parent* was not set to the proper value. *zookeeper.znode.parent* tells the client which znode holds the data for the cluster (including the address of the HMaster).
The value of zookeeper.znode.parent can be found in HBASE_CONF/hbase-site.xml. After setting that value in the HBaseConfiguration, the issue went away when using saveAsNewAPIHadoopDataset.
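For anyone hitting the same hang, here is a minimal sketch of what that fix can look like with saveAsNewAPIHadoopDataset. It is not the exact code from my job: the quorum host, the znode path "/hbase-unsecure", and the column family/qualifier names are placeholders I made up for illustration, so copy the real values from your cluster's hbase-site.xml. It also assumes an HBase 1.x client, where Put.addColumn is available.

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapreduce.Job
import org.apache.spark.{SparkConf, SparkContext}

object WriteToHBaseSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("hbase-write-sketch"))

    val hbaseConf = HBaseConfiguration.create()
    // Placeholder host: use your own ZooKeeper quorum.
    hbaseConf.set("hbase.zookeeper.quorum", "zk-host.example")
    // Placeholder path: must match zookeeper.znode.parent in the cluster's hbase-site.xml.
    hbaseConf.set("zookeeper.znode.parent", "/hbase-unsecure")
    // Target table, as in the log above.
    hbaseConf.set(TableOutputFormat.OUTPUT_TABLE, "testTable")

    val job = Job.getInstance(hbaseConf)
    job.setOutputFormatClass(classOf[TableOutputFormat[ImmutableBytesWritable]])

    // Turn each (rowKey, value) pair into an HBase Put keyed by the row.
    // "cf" and "col" are hypothetical column family and qualifier names.
    val puts = sc.parallelize(Seq(("row1", "value1"), ("row2", "value2"))).map {
      case (rowKey, value) =>
        val put = new Put(Bytes.toBytes(rowKey))
        put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes(value))
        (new ImmutableBytesWritable(Bytes.toBytes(rowKey)), put)
    }

    puts.saveAsNewAPIHadoopDataset(job.getConfiguration)
    sc.stop()
  }
}
```

If the znode path does not match what the cluster actually uses, the client reads a null ClusterId from ZooKeeper, which is the symptom shown in the log above.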
However, spark-hbase-connector will need a code change to allow setting zookeeper.znode.parent, similar to how spark.hbase.host is set.
from spark-hbase-connector.
I'm afraid I don't have the zookeeper.znode.parent property set in my HBASE_CONF/hbase-site.xml.
What should I do so that I can still save from Spark to HBase?
Version 1.0.3 of the connector (already available on Maven) supports using an hbase-site.xml file to specify a custom configuration.
Great to know :)