
Comments (47)

adrian-wang commented on July 19, 2024

Hi David,
To run HiBench, you need to upload the HiBench files to your cluster and follow the instructions in README.md (https://github.com/intel-hadoop/HiBench/blob/master/README.md) to configure and run HiBench.


dcheng1709 commented on July 19, 2024

Thanks, let me try. My original thought was that I could install it on my PC and connect to the Hadoop cluster.

David


dcheng1709 commented on July 19, 2024

I downloaded HiBench after I installed, configured, and tested Hadoop based on your link (wget https://github.com/intel-hadoop/HiBench/zipball/HiBench-2.2). However, the downloaded file is not a .zip or .tar file; it is just named HiBench-2.2. Something seems wrong here.

David


adrian-wang commented on July 19, 2024

Hi David,

You can simply rename the downloaded file to HiBench-2.2.zip.
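For reference, a minimal sketch of the download-and-rename step (the -O flag simply saves the archive under a .zip name directly; the commands assume the current directory):

wget -O HiBench-2.2.zip https://github.com/intel-hadoop/HiBench/zipball/HiBench-2.2
unzip HiBench-2.2.zip
# or, if the file is already downloaded under the name HiBench-2.2:
mv HiBench-2.2 HiBench-2.2.zip && unzip HiBench-2.2.zip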


dcheng1709 commented on July 19, 2024

Hi, Daoyuan,

You are right. That is what I did. Thanks for your response.

David


dcheng1709 commented on July 19, 2024

Hi, Daoyuan,

Hadoop 2.2.0 works well on my Linux machine. However, I am not sure where I can find the HiBench step-by-step manual for collecting data. Can you refer me to any documents? I need to run two data sets (small and large).

Thanks.

David


adrian-wang commented on July 19, 2024

Hi David,

You can modify the conf/configure.sh file under each workload folder if it exists. All the data size and options related to the workload are defined in this file.
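For illustration, a minimal excerpt of such a configure.sh, based on the Sort workload file quoted later in this thread (the values shown are that file's defaults):

# <workload>/conf/configure.sh (excerpt)
DATASIZE=24000000000   # bytes of input generated for prepare (24 GB per node here)
NUM_MAPS=16            # map tasks used by the prepare step (per node)
NUM_REDS=48            # reduce tasks used by the run itself (in total)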

Thanks,
Daoyuan


dcheng1709 commented on July 19, 2024

Hi, Daoyuan,

Thanks for the information.

When I run "run.sh", I am actually using the Hadoop data set provided with the Hadoop download, and I can set the data size in the configure file as you mentioned in this email. Is my statement correct? Can I use my own dataset instead? If so, how do I run run.sh with my own file name?

Thanks

David


adrian-wang commented on July 19, 2024

Hi David,

For some of the workloads, like wordcount, you are just configuring the Hadoop examples job to write out random data, while for others, like pagerank, there is a HiBench datagen tool that outputs the dataset. The data-set preparation step is in the prepare*.sh file in each workload. If you have your own data, you can upload your files to HDFS, configure the path in the conf file of each workload, and then run run*.sh directly without prepare*.sh.
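A minimal sketch of that own-data path, using the Sort workload's INPUT_HDFS variable (the HDFS and local paths below are only illustrative):

# upload your own dataset to HDFS
hadoop fs -mkdir -p /HiBench/Sort/Input
hadoop fs -put /local/data/part-* /HiBench/Sort/Input
# point the workload at it by setting, in its conf file:
#   INPUT_HDFS=/HiBench/Sort/Input
# then run the workload's run.sh directly and skip prepare.sh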

Thanks,
Daoyuan


dcheng1709 commented on July 19, 2024

Hi, Daoyuan,

This is very helpful. Thanks a lot.

I saw that some of the benchmark Hadoop tests were using 312 nodes, for example:

Piston OPS-20 (8vm/cn) 312 nodes (SMALL DATA SET)

Does that mean 312 virtual servers?

David


adrian-wang commented on July 19, 2024

Hi David,

I'm not very sure about your case. The slaves that HiBench uses are all the slaves you configured in your Hadoop cluster.

Thanks,
Daoyuan


dcheng1709 commented on July 19, 2024

Hi, Daoyuan,

Thanks for your information. I ran the following statement and got an error. I checked that 8088 is a localhost port. Please let me know why this happened and how to fix it.

Regards

David

hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-tests.jar TestDFSIO -write -nrFiles 10 -fileSize 10

java.io.IOException: Failed on local exception:
com.google.protobuf.InvalidProtocolBufferException: Protocol message
end-group tag did not match expected tag.; Host Details : local host is:
"localhost/127.0.0.1"; destination host is: "localhost":8088;


dcheng1709 commented on July 19, 2024

Daoyuan,

I have solved the 8088 port mismatch problem.

I have the following questions:

  1. Why am I getting this error? Should I use == instead of -eq? (See the sketch below.)

[hadoopuser@localhost conf]$ sh configure.sh
configure.sh: line 26: [: -eq: unary operator expected

  2. Word count: where do the output files go when I run "run.sh"?
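A minimal sketch of why that line-26 test can fail, assuming line 26 is the COMPRESS check shown in the Sort configure file later in this thread; the fix is to quote and default the variable rather than switch to ==:

# When COMPRESS is unset (e.g. configure.sh is run on its own instead of being
# sourced together with the global settings), the test expands to "[ -eq 1 ]",
# which is exactly the "unary operator expected" error.
COMPRESS=${COMPRESS:-0}
if [ "$COMPRESS" -eq 1 ]; then
    INPUT_HDFS=${INPUT_HDFS}-comp
    OUTPUT_HDFS=${OUTPUT_HDFS}-comp
fi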

Thanks for your help.

David


adrian-wang commented on July 19, 2024

1. Are you running HiBench over YARN? I think you may need to check out the yarn branch on GitHub.

2. The output files are in HDFS.
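A minimal sketch of getting that branch (the branch name yarn is taken from this thread):

git clone https://github.com/intel-hadoop/HiBench.git
cd HiBench
git checkout yarn   # branch tested against YARN / MRv2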


dcheng1709 commented on July 19, 2024

Yes, I run HiBench over YARN. Should I stop it before I run HiBench?
And what is the output file called?

Thanks.
David


adrian-wang commented on July 19, 2024

No; what you downloaded is the HiBench master branch, which may need some modification to run against YARN, because the master branch is tested on MRv1 rather than YARN. You should check out the yarn branch from the repo.
You can check your output file name in the corresponding configure file in the workload's conf/ directory.
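A minimal sketch of locating the output path, using the Sort workload's variables quoted later in this thread (the workload directory and the DATA_HDFS prefix are illustrative):

grep OUTPUT_HDFS sort/conf/configure.sh   # e.g. OUTPUT_HDFS=${DATA_HDFS}/Sort/Output
hadoop fs -ls /HiBench/Sort/Output        # substitute your own DATA_HDFS prefix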

Thanks,
Daoyuan


dcheng1709 commented on July 19, 2024

Hi, Daoyuan,

I downloaded and configured Hadoop 2.2.0. I have the following questions:

  1. CDH3U4 - Is this for the Hadoop 2.2.0 version? I only saw cdh3u4.jar. Where can I download it? Is CDH3U4 the same for every Hadoop version, or does each version have a different cdh3u4?

  2. Your link (GitHub) - the link you pointed me to doesn't have detailed information and instructions. For example, you said I need to check out the yarn branch from the repo? You mean on GitHub?

Thanks for your advice.

David


dcheng1709 commented on July 19, 2024

In addition, if I run YARN, which part do I need to modify? (See the configure file for Sort below.)

#!/bin/bash

# compress

COMPRESS=$COMPRESS_GLOBAL
COMPRESS_CODEC=$COMPRESS_CODEC_GLOBAL

# paths

INPUT_HDFS=${DATA_HDFS}/Sort/Input
OUTPUT_HDFS=${DATA_HDFS}/Sort/Output

if [ $COMPRESS -eq 1 ]; then
    INPUT_HDFS=${INPUT_HDFS}-comp
    OUTPUT_HDFS=${OUTPUT_HDFS}-comp
fi

# for prepare (per node) - 24G/node

#DATASIZE=24000000000
DATASIZE=24000000000
NUM_MAPS=16

# for running (in total)

NUM_REDS=48

David


dcheng1709 commented on July 19, 2024

Hi Daoyuan,

I downloaded hadoop-2.2.0 without cdh3u4. Based on your doc (you use hadoop-1.0.3 cdh3u4), I tried to find cdh3u4, but I can only find hadoop-0.20.2-cdh3u4 instead of hadoop-1.0.3. Please let me know where I can find it. Can I use 2.2.0 with cdh3u4 or cdh4?

Thanks.
David


adrian-wang commented on July 19, 2024

Hi David,

I think you can treat Hadoop-0.20.2-cdh3u4 the same way as in the doc when using the master branch. If you want to use cdh4, please use the yarn branch of the repo and reset that branch to its version from April 23rd; you may need to refer to the git manual for how to do that. You can also use that yarn branch for cdh5 directly.
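A minimal sketch of that reset, assuming the branch is named yarn (the exact commit is not given in this thread, so it is looked up by date):

git checkout yarn
# reset to the last commit on the branch made before April 24, 2014
git reset --hard $(git rev-list -1 --before="2014-04-24" yarn)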

Thanks,
Daoyuan


dcheng1709 avatar dcheng1709 commented on July 19, 2024

Daoyuan,

It sounds like I have to remove Hadoop 2.2.0 and YARN, then install Hadoop 1.0.3 and cdh3u4, so I can test HiBench 2.2 based on your doc. Can Hadoop 2.2.0 and 1.0.3 coexist on CentOS 6.5? I stopped all services, but it still will not let me move the hadoop directory somehow.

David

from hibench.

adrian-wang avatar adrian-wang commented on July 19, 2024

It is totally OK for different versions of Hadoop to coexist, but you cannot start two versions at the same time. And to run the master branch, what you need to install is Hadoop-1.0.3 OR cdh3u4, not AND. In fact I think most Hadoop MRv1 setups conform to the doc, and if you run YARN, the procedure is still the same.
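
For example, a rough sketch of switching from 2.2.0 back to 1.0.3 on one box; the install paths here are only an assumption for illustration, so substitute wherever you actually unpacked each version:

# stop the 2.2.0 / YARN daemons first (assumed install path)
$ /usr/local/hadoop-2.2.0/sbin/stop-yarn.sh
$ /usr/local/hadoop-2.2.0/sbin/stop-dfs.sh

# point HADOOP_HOME and PATH at the 1.0.3 install (also an assumed path), then start it
$ export HADOOP_HOME=/usr/local/hadoop-1.0.3
$ export PATH=$HADOOP_HOME/bin:$PATH
$ start-all.sh

If the mv still fails after that, it is usually a daemon that did not stop or a permission problem on the directory, so it is worth checking with jps that nothing Hadoop-related is still running.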

Thanks,
Daoyuan


from hibench.

dcheng1709 avatar dcheng1709 commented on July 19, 2024

Daoyuan,

Thanks for your response.

I have Hadoop 2.2.0 on CentOS 6.5. Do you think I need to download the yarn branch of HiBench?

David


from hibench.

adrian-wang avatar adrian-wang commented on July 19, 2024

The yarn branch is tested against cdh5, while I think it is also ok for Hadoop-2.2.0. If you come up with some errors when running Hadoop-2.2.0, you can let me know, I’ll help you to figure out what to do.
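
If you want to double-check that your 2.2.0 cluster is really running YARN before you try that branch, a quick sanity check (just standard Hadoop commands, nothing HiBench-specific) is:

# print the Hadoop build you are running against
$ hadoop version

# on a YARN cluster you should see ResourceManager and NodeManager among the JVMs
$ jps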

Thanks,
Daoyuan


from hibench.

dcheng1709 avatar dcheng1709 commented on July 19, 2024

Thanks a lot for your support. I couldn't download HiBench-yarn.zip (it said no such site), and I can't download it to my PC either. Any idea?

wget https://github.com/intel-hadoop/HiBench/tree/yarn/HiBench-yarn.zip

David


from hibench.

adrian-wang avatar adrian-wang commented on July 19, 2024

$ git clone https://github.com/intel-hadoop/HiBench.git
$ cd HiBench
$ git checkout yarn

You may need to install git first.
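
On CentOS that should just be:

$ sudo yum install git

and afterwards you can confirm you are on the right branch with:

$ git branch

If you would rather not install git, GitHub also serves zip snapshots of a branch, so something like wget https://github.com/intel-hadoop/HiBench/archive/yarn.zip should work as well; the /tree/yarn/... URL you tried is only a web page, not a downloadable archive.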


from hibench.

dcheng1709 avatar dcheng1709 commented on July 19, 2024

Daoyuan,

Thanks. I am able to download HiBench-yarn now, but I get the following error
when I run configure.sh under /home/hadoopuser/HiBench/sort/conf:

sh configure.sh
configure.sh: line 20: COMPRESS_GLOBAL: unbound variable

David

On Sun, Jun 15, 2014 at 10:02 PM, Daoyuan Wang [email protected]
wrote:

$git clone https://github.com/intel-hadoop/HiBench.git
$cd HiBench
$git checkout yarn

You may need to install git first.

Reply to this email directly or view it on GitHub
#43 (comment).

from hibench.

adrian-wang avatar adrian-wang commented on July 19, 2024

You don't have to run configure.sh yourself; it is run automatically by any run*.sh.
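
For intuition, here is a minimal sketch of the call order those scripts assume (a simplified sketch, not the actual HiBench script):

    #!/bin/bash
    # simplified sketch of a workload's run.sh
    DIR=`cd $(dirname "$0")/..; pwd`      # workload root, e.g. .../HiBench/sort
    . ${DIR}/../bin/hibench-config.sh     # sets COMPRESS_GLOBAL, HADOOP_HOME, ...
    . ${DIR}/conf/configure.sh            # uses those globals; run on its own it sees them unbound
    # ... then the Hadoop job is launched ...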

from hibench.

dcheng1709 avatar dcheng1709 commented on July 19, 2024

Daoyuan,

Thanks. Now I am getting another error. It sounds like I don't have an input
data file?

rmr: `/HiBench/Sort/Input-comp': No such file or directory

On Sun, Jun 15, 2014 at 10:51 PM, Daoyuan Wang [email protected]
wrote:

You don't have to run configure.sh, the script will be run by any run*.sh


Reply to this email directly or view it on GitHub
#43 (comment).

from hibench.

adrian-wang avatar adrian-wang commented on July 19, 2024

You should run prepare*.sh prior to run*.sh
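
For example, for the sort workload the usual sequence is (assuming the workload scripts live under its bin/ directory, as in the sessions above):

    cd /home/hadoopuser/HiBench/sort
    sh bin/prepare.sh    # generates the input data set in HDFS, e.g. ${DATA_HDFS}/Sort/Input
    sh bin/run.sh        # runs the sort job against that input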

from hibench.

dcheng1709 avatar dcheng1709 commented on July 19, 2024

Here is the same error:

[hadoopuser@localhost bin]$ sh prepare.sh
========== preparing sort data==========
HADOOP_EXECUTABLE=/usr/local/hadoop/bin/hadoop
HADOOP_CONF_DIR=/usr/local/hadoop/conf
HADOOP_EXAMPLES_JAR=/usr/local/hadoop/hadoop-examples*.jar
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

rmr: DEPRECATED: Please use 'rm -r' instead.
2014-06-16 02:05:10,132 WARN [main] util.NativeCodeLoader
(NativeCodeLoader.java:(62)) - Unable to load native-hadoop library
for your platform... using builtin-java classes where applicable
rmr: `/HiBench/Sort/Input-comp': No such file or directory
Not a valid JAR: /usr/local/hadoop/hadoop-examples*.jar
ERROR: Hadoop job failed to run successfully.

On Sun, Jun 15, 2014 at 11:01 PM, Daoyuan Wang [email protected]
wrote:

You should run prepare*.sh prior to run*.sh


Reply to this email directly or view it on GitHub
#43 (comment).

from hibench.

adrian-wang avatar adrian-wang commented on July 19, 2024

Please do as the document says before running prepare.sh:

  1. Configure all the workloads

    You need to set some global environment variables in the bin/hibench-config.sh file located in the root dir (a minimal sketch follows this list).

      HADOOP_HOME      <The Hadoop installation location>
      HADOOP_CONF_DIR  <The hadoop configuration DIR, default is $HADOOP_HOME/conf>
      COMPRESS_GLOBAL  <Whether to enable the in/out compression for all workloads, 0 is disable, 1 is enable>
      COMPRESS_CODEC_GLOBAL  <The default codec used for in/out data compression>
    

    Note: Do not change the default values of other global environment variables unless necessary.

  2. Configure each workload

    You can modify the conf/configure.sh file under each workload folder if it exists. All the data size and options related to the workload are defined in this file.

  3. Synchronize the time on all nodes (This is required for dfsioe, and optional for others)
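
A minimal sketch of those settings for a Hadoop 2.2 install under /usr/local/hadoop (the values are illustrative and follow the paths used later in this thread, not shipped defaults):

    # bin/hibench-config.sh -- relevant lines only; plain assignments, no spaces or backquotes
    HADOOP_HOME=/usr/local/hadoop
    HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop    # Hadoop 2.x keeps its configs here, not in $HADOOP_HOME/conf
    HADOOP_EXAMPLES_JAR=${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar
    COMPRESS_GLOBAL=0                            # 0 = disable compression while getting started
    COMPRESS_CODEC_GLOBAL=org.apache.hadoop.io.compress.DefaultCodec   # ignored when COMPRESS_GLOBAL=0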

from hibench.

dcheng1709 avatar dcheng1709 commented on July 19, 2024

I added the entries in hibench-config.sh at HiBench/bin. I am not sure what
I should assign to COMPRESS_CODEC_GLOBAL.

David

HADOOP_CONF_DIR=`/usr/local/hadoop/etc/hadoop'
COMPRESS_GLOBAL=1
COMPRESS_CODEC_GLOBAL=???

This is my hibench-config.sh

HADOOP_EXECUTABLE=
HADOOP_CONF_DIR=`/usr/local/hadoop/etc/hadoop'
COMPRESS_GLOBAL=1
COMPRESS_CODEC_GLOBAL=
HADOOP_EXAMPLES_JAR=
HADOOP_HOME=/usr/local/hadoop
HIBENCH_HOME=`printenv HIBENCH_HOME`
HIBENCH_CONF=`printenv HIBENCH_CONF`
HIVE_HOME=`printenv HIVE_HOME`
MAHOUT_HOME=`printenv MAHOUT_HOME`
NUTCH_HOME=`printenv NUTCH_HOME`
DATATOOLS=`printenv DATATOOLS`

On Sun, Jun 15, 2014 at 11:11 PM, Daoyuan Wang [email protected]
wrote:

Please do as the document says before running prepare.sh:

1. Configure all the workloads

You need to set some global environment variables in the
bin/hibench-config.sh file located in the root dir.

 HADOOP_HOME      <The Hadoop installation location>
 HADOOP_CONF_DIR  <The hadoop configuration DIR, default is $HADOOP_HOME/conf>
 COMPRESS_GLOBAL  <Whether to enable the in/out compression for all workloads, 0 is disable, 1 is enable>
 COMPRESS_CODEC_GLOBAL  <The default codec used for in/out data compression>

Note: Do not change the default values of other global environment
variables unless necessary.

2. Configure each workload

You can modify the conf/configure.sh file under each workload folder
if it exists. All the data size and options related to the workload are
defined in this file.

3. Synchronize the time on all nodes (This is required for dfsioe, and
optional for others)


Reply to this email directly or view it on GitHub
#43 (comment).

from hibench.

adrian-wang avatar adrian-wang commented on July 19, 2024

To start, you can use
COMPRESS_GLOBAL=0 #this means not to compress
COMPRESS_CODEC_GLOBAL= XXX # any one of the three values, it does not matter since COMPRESS_GLOBAL is set to 0.

from hibench.

dcheng1709 avatar dcheng1709 commented on July 19, 2024

What does COMPRESS_CODEC_GLOBAL mean? Do I just assign 123?

COMPRESS_CODEC_GLOBAL= XXX # any one of the three values, it does not
matter since

On Sun, Jun 15, 2014 at 11:34 PM, Daoyuan Wang [email protected]
wrote:

To start, you can use
COMPRESS_GLOBAL=0 #this means not to compress
COMPRESS_CODEC_GLOBAL= XXX # any one of the three values, it does not
matter since COMPRESS_GLOBAL is set to 0.


Reply to this email directly or view it on GitHub
#43 (comment).

from hibench.

adrian-wang avatar adrian-wang commented on July 19, 2024

COMPRESS_CODEC_GLOBAL means the specific codec algorithm you will use to compress output. You can leave it as default, or change it to anything you like, as long as COMPRESS_GLOBAL is 0.

from hibench.

dcheng1709 avatar dcheng1709 commented on July 19, 2024

What about if it is "1"? What value should I give in general?

On Sun, Jun 15, 2014 at 11:44 PM, Daoyuan Wang [email protected]
wrote:

COMPRESS_CODEC_GLOBAL means the specific codec algorithm you will use to
compress output. You can leave it as default, or change it to anything you
like, as long as COMPRESS_GLOBAL is 0.


Reply to this email directly or view it on GitHub
#43 (comment).

from hibench.

adrian-wang avatar adrian-wang commented on July 19, 2024

1 means you want compression. There are three codec lines in your downloaded file; comment out two of them and leave one uncommented. You need to install the native library for Hadoop first before you can use those compression codecs.
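
For illustration, the choice typically looks like this in bin/hibench-config.sh (the exact codec lines in your copy may differ; these class names are common Hadoop codecs):

    # keep exactly one COMPRESS_CODEC_GLOBAL line uncommented
    COMPRESS_CODEC_GLOBAL=org.apache.hadoop.io.compress.DefaultCodec
    #COMPRESS_CODEC_GLOBAL=com.hadoop.compression.lzo.LzoCodec         # needs hadoop-lzo installed
    #COMPRESS_CODEC_GLOBAL=org.apache.hadoop.io.compress.SnappyCodec   # needs the native snappy library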

from hibench.

dcheng1709 avatar dcheng1709 commented on July 19, 2024

Do I need to set the path in .bashrc file?

[hadoopuser@localhost bin]$ sh prepare.sh
========== preparing sort data==========
/home/hadoopuser/HiBench/sort/../bin/hibench-config.sh: line 26:
/usr/local/hadoop/etc/hadoop: is a directory
/home/hadoopuser/HiBench/sort/../bin/hibench-config.sh: line 30:
/usr/local/hadoop: is a directory
HADOOP_EXECUTABLE=/usr/local/hadoop/bin/hadoop
/home/hadoopuser/HiBench/sort/../bin/hibench-config.sh: line 60:
HADOOP_CONF_DIR: ERROR: Please set paths in
/home/hadoopuser/HiBench/bin/hibench-config.sh before using HiBench.
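
For what it is worth, "is a directory" is what bash prints when it ends up trying to execute a path as a command; with assignments like those on lines 26 and 30 that usually means a space crept in after the "=" or the value was wrapped in backquotes. Plain assignments avoid it:

    # intended form: no space after '=' and no backquotes around the value
    HADOOP_HOME=/usr/local/hadoop
    HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop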

On Sun, Jun 15, 2014 at 11:52 PM, Daoyuan Wang [email protected]
wrote:

1 means you need compress. There are three lines in your downloaded file,
comment two of them, and leave one uncommmented. you need to install native
library for hadoop first, before you can use those compress codec.


Reply to this email directly or view it on GitHub
#43 (comment).

from hibench.

dcheng1709 avatar dcheng1709 commented on July 19, 2024

Why didn't I see a conf dir under $HADOOP_HOME or anywhere else?

On Mon, Jun 16, 2014 at 12:01 AM, david cheng [email protected] wrote:

Do I need to set the path in .bashrc file?

[hadoopuser@localhost bin]$ sh prepare.sh
========== preparing sort data==========
/home/hadoopuser/HiBench/sort/../bin/hibench-config.sh: line 26:
/usr/local/hadoop/etc/hadoop: is a directory
/home/hadoopuser/HiBench/sort/../bin/hibench-config.sh: line 30:
/usr/local/hadoop: is a directory
HADOOP_EXECUTABLE=/usr/local/hadoop/bin/hadoop
/home/hadoopuser/HiBench/sort/../bin/hibench-config.sh: line 60:
HADOOP_CONF_DIR: ERROR: Please set paths in
/home/hadoopuser/HiBench/bin/hibench-config.sh before using HiBench.

On Sun, Jun 15, 2014 at 11:52 PM, Daoyuan Wang [email protected]
wrote:

1 means you need compress. There are three lines in your downloaded file,
comment two of them, and leave one uncommmented. you need to install native
library for hadoop first, before you can use those compress codec.


Reply to this email directly or view it on GitHub
#43 (comment)
.

from hibench.

adrian-wang avatar adrian-wang commented on July 19, 2024

That depends on which version of Hadoop you are using and how you installed it. It is not an issue of HiBench.

from hibench.

dcheng1709 avatar dcheng1709 commented on July 19, 2024

Daoyuan

I got the following error when I run prepare.sh

[hadoopuser@localhost bin]$ sh prepare.sh
========== preparing sort data==========
/home/hadoopuser/HiBench/sort/../bin/hibench-config.sh: line 26:
/usr/local/hadoop/etc/hadoop: is a directory
/home/hadoopuser/HiBench/sort/../bin/hibench-config.sh: line 30:
/usr/local/hadoop: is a directory
HADOOP_EXECUTABLE=/usr/local/hadoop/bin/hadoop
/home/hadoopuser/HiBench/sort/../bin/hibench-config.sh: line 60:
HADOOP_CONF_DIR: ERROR: Please set paths in
/home/hadoopuser/HiBench/bin/hibench-config.sh before using HiBench.

On Mon, Jun 16, 2014 at 12:24 AM, Daoyuan Wang [email protected]
wrote:

That depends on which version of hadoop you are using, and in which way
did you install your hadoop. Not an issue of HiBench....


Reply to this email directly or view it on GitHub
#43 (comment).

from hibench.

adrian-wang avatar adrian-wang commented on July 19, 2024

Could you tell me what version of Hadoop you are running and how you installed it, and attach your hibench-config.sh file?

from hibench.

dcheng1709 avatar dcheng1709 commented on July 19, 2024

Hadoop-2.2

On Mon, Jun 16, 2014 at 12:30 AM, Daoyuan Wang [email protected]
wrote:

could you tell me what version of hadoop you are running, how you install
your hadoop, and attach your hibench-config.sh file to me?


Reply to this email directly or view it on GitHub
#43 (comment).

from hibench.

adrian-wang avatar adrian-wang commented on July 19, 2024

Could you please send your hibench-config.sh file to [email protected]? And tell me the location where you installed your Hadoop.

From: dcheng1709 [mailto:[email protected]]
Sent: Monday, June 16, 2014 3:32 PM
To: intel-hadoop/HiBench
Cc: Wang, Daoyuan
Subject: Re: [HiBench] HiBench Installation Guide (#43)

Hadoop-2.2

On Mon, Jun 16, 2014 at 12:30 AM, Daoyuan Wang <[email protected]>
wrote:

could you tell me what version of hadoop you are running, how you install
your hadoop, and attach your hibench-config.sh file to me?


Reply to this email directly or view it on GitHub
#43 (comment).


Reply to this email directly or view it on GitHub #43 (comment).

from hibench.

dcheng1709 avatar dcheng1709 commented on July 19, 2024

Daoyuan,

Somehow I can't send email out through [email protected].

I followed this link to install and configure Hadoop 2.2. Thanks a lot.

http://alanxelsys.com/2014/02/01/hadoop-2-2-single-node-installation-on-centos-6-5/

Here is my Hadoop installation location based on that doc:

[hadoopuser@localhost conf]$ cd /usr/local/hadoop
[hadoopuser@localhost hadoop]$ ls
bin include libexec logs README.txt share
etc lib LICENSE.txt NOTICE.txt sbin temp

/usr/local/hadoop/etc/hadoop
[hadoopuser@localhost hadoop]$ ls
capacity-scheduler.xml httpfs-env.sh slaves
configuration.xsl httpfs-log4j.properties
ssl-client.xml.example
container-executor.cfg httpfs-signature.secret
ssl-server.xml.example
core-site.xml httpfs-site.xml
TestDFSIO_results.log
core-site.xml.org log4j.properties yarn-env.cmd
hadoop-env.cmd mapred-env.cmd yarn-env.sh
hadoop-env.sh mapred-env.sh yarn-env.sh.last
hadoop-metrics2.properties mapred-queues.xml.template yarn-site.xml
hadoop-metrics.properties mapred-site.xml yarn-site.xml.last
hadoop-policy.xml mapred-site.xml.last
hdfs-site.xml mapred-site.xml.template

David

On Mon, Jun 16, 2014 at 12:44 AM, Daoyuan Wang [email protected]
wrote:

Could you please send your hibench-config.sh file to
[email protected]? And tell me the location where you installed your Hadoop.

From: dcheng1709 [mailto:[email protected]]
Sent: Monday, June 16, 2014 3:32 PM
To: intel-hadoop/HiBench
Cc: Wang, Daoyuan
Subject: Re: [HiBench] HiBench Installation Guide (#43)

Hadoop-2.2

On Mon, Jun 16, 2014 at 12:30 AM, Daoyuan Wang <[email protected]>
wrote:

could you tell me what version of hadoop you are running, how you
install
your hadoop, and attach your hibench-config.sh file to me?


Reply to this email directly or view it on GitHub
#43 (comment).


Reply to this email directly or view it on GitHub<
https://github.com/intel-hadoop/HiBench/issues/43#issuecomment-46148491>.


Reply to this email directly or view it on GitHub
#43 (comment).

from hibench.

laminucsy avatar laminucsy commented on July 19, 2024

I am a beginner trying to test HiBench on a Hadoop cluster, and I don't know HiBench well. Currently my cluster does not support Spark, but I would like to run specific workloads like kmeans, wordcount, and so on. I am now using HiBench 4.0. When I run a workload, the output shows the following errors:

"reduce.EventFetcher: EventFetcher is interrupted.. Returning"
"HiBench.HiveData$GenerateRankingsReducer: pid: 0, 0 erros, 0 missed"

I don't know how to proceed, and I would like to know where the input and output data folders are for a specific workload run. If possible, please explain HiBench to me! Thank you very much in advance!

from hibench.
