Hello Harel, when running one of your sample tests ./hadoop

Attribute Error about hadoop-job-analyzer CLOSED

ayuk23 commented on May 30, 2024

Attribute Error

Comments (7)

harelba commented on May 30, 2024

Yes, it's a bug I had already fixed but hadn't pushed yet.

It's caused by a missing metric name prefix. Adding -n aaa.bbb to the command line works around it.

However, I just pushed the fix for it.

Thanks for noticing.
Harel

ayuk23 commented on May 30, 2024

That did the trick. Thanks! Also, this may be a silly question, but is there a way to run aggregations and select only specific parameters? For example, aggregate over all users but report only the total duration, failed reducers, and total maps?

harelba commented on May 30, 2024

Great.

I didn't think about providing such filtering, but it can be easily added. I'll try to add support for this (e.g. some kind of parameter for filtering metrics according to a regexp), and update here when I push it.

Harel
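Until such an option exists, one possible workaround is to run the tool with -C stdout and filter the printed metric lines by regexp before doing anything else with them. The sketch below is not part of hadoop-job-analyzer; `filter_metrics` and `METRIC_LINE` are hypothetical names, and the line format is assumed from the -C stdout examples quoted later in this thread.

```python
import re

# Assumed stdout line format, taken from the -C stdout examples in this thread.
METRIC_LINE = re.compile(
    r"^Metric - name is (?P<name>\S+) value is (?P<value>\S+) timestamp is (?P<ts>\d+)$"
)


def filter_metrics(lines, pattern):
    """Return the metric lines whose metric name matches the regexp `pattern`."""
    wanted = re.compile(pattern)
    kept = []
    for line in lines:
        line = line.strip()
        match = METRIC_LINE.match(line)
        # Skip anything that is not a metric line or whose name does not match.
        if match and wanted.search(match.group("name")):
            kept.append(line)
    return kept


# Sample lines copied from the tool output quoted in this thread.
SAMPLE = [
    "Metric - name is aaa.bbb.projections.SUBMIT_HOST-USER.machineA.diana."
    "COUNTERS.org_apache_hadoop_mapred_Task__Counter.REDUCE_SHUFFLE_BYTES.value"
    " value is 0.000 timestamp is 1368598500",
    "Metric - name is aaa.bbb.projections.SUBMIT_HOST-USER.machineB.diana."
    "MAP_COUNTERS.org_apache_hadoop_mapred_Task__Counter.MAP_OUTPUT_BYTES.value"
    " value is 0.000 timestamp is 1368598500",
    "Metric - name is aaa.bbb.projections.SUBMIT_HOST-USER.machineB.diana."
    "FAILED_REDUCES.value value is 0.000 timestamp is 1368598500",
]

if __name__ == "__main__":
    for kept_line in filter_metrics(SAMPLE, r"FAILED_REDUCES|MAP_OUTPUT_BYTES"):
        print(kept_line)
```

In practice the lines would come from the tool's stdout (for example, read from `sys.stdin` in a pipe) rather than from a hard-coded list.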

ayuk23 commented on May 30, 2024

So how is it that you are able to graph only single metric fields like spilled record count per selected cross region? Is that something that graphite handles? Also, I know that graphitus is included, but do we also have to download graphite ourselves?

harelba commented on May 30, 2024

hadoop-job-analyzer outputs metrics to graphite, broken down ("grouped") by what you request in the -p parameter. The metrics are output in a format that allows smart querying by graphite. Try running the tool using -C stdout instead of -C graphite and you'll be able to see the metric names.

Part of the metric names is the "group by" values. For example, if you request -p SUBMIT_HOST/USER, then the metric name will be something like:

<prefix>.projections.SUBMIT_HOST-USER.<submit-host>.<user>.<metric-name>.<value>

Graphite is very powerful at querying and dissecting the data once it's there. It can group by any part of the metric name and hold other parts of the name "constant" (such as the spilled record metric name). Look at the groupByNode() function in the Graphite API documentation.
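As a rough illustration of the groupByNode() idea, the sketch below builds a Graphite render URL that holds one metric name constant and sums it per "group by" field. The helper name `build_render_url` and the host `graphite.example.com` are hypothetical, and the metric layout is assumed from the `<prefix>.projections...` pattern above; it also assumes that node indexes in groupByNode() count from zero.

```python
from urllib.parse import urlencode


def build_render_url(base_url, prefix, fields, group_field, metric):
    """Build a Graphite /render URL that sums `metric` per value of `group_field`.

    `fields` is the -p projection split on "/", e.g. ["SUBMIT_HOST", "USER"].
    """
    projection = "-".join(fields)          # -p SUBMIT_HOST/USER -> SUBMIT_HOST-USER
    wildcards = ".".join("*" for _ in fields)  # one wildcard per group-by field
    series = f"{prefix}.projections.{projection}.{wildcards}.{metric}.value"
    # Node index: prefix nodes, then "projections" and the projection name,
    # then the position of the field we want to group by.
    node = len(prefix.split(".")) + 2 + fields.index(group_field)
    target = f'groupByNode({series}, {node}, "sum")'
    return f"{base_url}/render?" + urlencode({"target": target, "format": "json"})


if __name__ == "__main__":
    print(
        build_render_url(
            "http://graphite.example.com",  # hypothetical Graphite host
            "aaa.bbb",
            ["SUBMIT_HOST", "USER"],
            "USER",
            "FAILED_REDUCES",
        )
    )
```

Changing `group_field` to `"SUBMIT_HOST"` would group the same metric per submit host instead of per user.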

Example metrics when running ./hadoop-job-analyzer -f example-history-folder/ -n aaa.bbb -C stdout -p SUBMIT_HOST/USER (the -C stdout parameter tells the tool to write the metrics to stdout instead of sending them to graphite):

Metric - name is aaa.bbb.projections.SUBMIT_HOST-USER.machineA.diana.MAP_COUNTERS.org_apache_hadoop_mapred_Task__Counter.COMBINE_INPUT_RECORDS.value value is 0.000 timestamp is 1368598500
Metric - name is aaa.bbb.projections.SUBMIT_HOST-USER.machineA.diana.COUNTERS.org_apache_hadoop_mapred_Task__Counter.REDUCE_SHUFFLE_BYTES.value value is 0.000 timestamp is 1368598500
Metric - name is aaa.bbb.projections.SUBMIT_HOST-USER.machineB.diana.MAP_COUNTERS.org_apache_hadoop_mapred_Task__Counter.MAP_OUTPUT_BYTES.value value is 0.000 timestamp is 1368598500
Metric - name is aaa.bbb.projections.SUBMIT_HOST-USER.machineB.diana.FAILED_REDUCES.value value is 0.000 timestamp is 1368598500
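To make the "group by part of the metric name" idea concrete, here is a small sketch that parses lines in the stdout format above and sums one metric per name node, which is roughly what Graphite's groupByNode() does with a sum callback. The function `group_by_node` is a hypothetical helper, not part of the tool, and the sample values and the user `eve` are invented for illustration (the real sample output above is all 0.000).

```python
import re
from collections import defaultdict

# Assumed stdout line format, taken from the example output above.
METRIC_LINE = re.compile(
    r"^Metric - name is (?P<name>\S+) value is (?P<value>\S+) timestamp is (?P<ts>\d+)$"
)


def group_by_node(lines, node, metric_suffix):
    """Sum the values of metrics ending in `metric_suffix`, keyed by name node `node`."""
    totals = defaultdict(float)
    for line in lines:
        match = METRIC_LINE.match(line.strip())
        if not match:
            continue  # not a metric line
        name = match.group("name")
        if not name.endswith(metric_suffix):
            continue  # a different metric; keep this one "constant"
        key = name.split(".")[node]  # zero-based position in the dotted name
        totals[key] += float(match.group("value"))
    return dict(totals)


# Illustrative lines: same naming scheme as above, but with made-up values.
PREFIX = "aaa.bbb.projections.SUBMIT_HOST-USER"
SAMPLE = [
    f"Metric - name is {PREFIX}.machineA.diana.FAILED_REDUCES.value value is 1.000 timestamp is 1368598500",
    f"Metric - name is {PREFIX}.machineB.diana.FAILED_REDUCES.value value is 2.000 timestamp is 1368598500",
    f"Metric - name is {PREFIX}.machineB.eve.FAILED_REDUCES.value value is 3.000 timestamp is 1368598500",
]

if __name__ == "__main__":
    print(group_by_node(SAMPLE, 5, "FAILED_REDUCES.value"))  # per user
    print(group_by_node(SAMPLE, 4, "FAILED_REDUCES.value"))  # per submit host
```

With the prefix aaa.bbb, node 4 is the submit host and node 5 is the user; a longer or shorter prefix shifts those positions accordingly.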

You do need to install graphite on your own, though. It's not part of hadoop-job-analyzer. However, maybe I'll try to provide a script for downloading and auto-installing a simple graphite installation in order to ease the initial overhead in cases where the user doesn't have graphite already installed.

ayuk23 commented on May 30, 2024

Thank you for all of the support. It has been incredibly helpful. My (hopefully) last question is whether anything must be done differently for YARN. Thanks in advance!

harelba commented on May 30, 2024

No problem. Glad it helps.

I'm not sure about YARN. I haven't played with it in that regard, and we don't have a production cluster running YARN yet. I do believe that some changes will be needed in order to support it, since YARN has a separate JobHistoryServer which collects and manages historical data.

Harel
