Comments (8)
@cmoussa1: awesome progress! Yeah, it will be good to find out which format @ryanday36 wants for porting his scripts. You could export some data to that format and see if it's good enough for @ryanday36.
from flux-accounting.
I put together two Python scripts for reporting on jobs run in LSF. The first (`/usr/tce/packages/lacct/lacct-1.0/bin/lacct.py` on Lassen, rzansel, butte, etc.) wraps the `bhist` command to read all jobs that meet a given selection query (specified by the command arguments: user, account/bank, wckey, start, end) into a list of JobRecord objects and then prints those out. The second (`/usr/tce/packages/lacct/lacct-1.0/bin/lreport.py`) imports lacct.py to get that list of JobRecords and then sums up the total usage according to the command arguments (e.g. byuser, bygroup, etc.).
So, the fastest way I could adapt those to Flux data would probably be a Python module providing a function that returns a list of JobRecords for a given user, bank, wckey, starttime, and endtime (each of these should default to 'all').
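To make the requested interface concrete, here is a minimal sketch of a function with that shape. The name `fetch_job_records`, the in-memory record dicts, and the filtering semantics are all illustrative assumptions, not an actual flux-accounting API; a real version would query the job database instead of taking a list.

```python
# Hypothetical sketch: one function returning job records that match the
# selection criteria, with every argument defaulting to 'all' (no filter).
def fetch_job_records(records, user="all", bank="all", wckey="all",
                      starttime="all", endtime="all"):
    def matches(rec):
        if user != "all" and rec["user"] != user:
            return False
        if bank != "all" and rec["group"] != bank:
            return False
        if wckey != "all" and rec["project"] != wckey:
            return False
        # keep jobs that *completed* inside the [starttime, endtime] window
        if starttime != "all" and rec["end"] < starttime:
            return False
        if endtime != "all" and rec["end"] > endtime:
            return False
        return True

    return [rec for rec in records if matches(rec)]
```

The field names (`user`, `group`, `project`, `end`) mirror the JobRecord attributes pasted below.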
Here's what my JobRecord object looks like:
```python
import time

class JobRecord(object):
    '''
    A record of an individual job.
    '''
    def __init__(self, jobid, user, group, project, nnodes,
                 hostlist, sub, start, end):
        self.jobid = jobid
        self.user = user
        self.group = group
        self.project = project
        self.hostlist = hostlist
        self.nnodes = nnodes
        self.sub = sub
        self.start = start
        self.end = end

    @property
    def elapsed(self):
        # sub/start/end are time.struct_time values parsed from bhist;
        # mktime converts them to seconds since the epoch
        return time.mktime(self.end) - time.mktime(self.start)

    @property
    def queued(self):
        return time.mktime(self.start) - time.mktime(self.sub)
```
And here are the arguments that I take for `lacct` (selection criteria for jobs):
```python
import argparse

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Show information about completed jobs.')
    parser.add_argument('-P', action='store_true',
                        help='output "|" delimited columns for easy parsing')
    parser.add_argument('-v', action='store_true',
                        help='output more details about jobs')
    parser.add_argument('-s', metavar='<start_time>',
                        help='show jobs that completed after start_time')
    parser.add_argument('-e', metavar='<end_time>',
                        help='show jobs that completed before end_time')
    parser.add_argument('-j', metavar='<jobid>', help='show job with jobid')
    parser.add_argument('-u', metavar='<username>', help='show jobs run by username')
    parser.add_argument('-g', metavar='<group/bank>', help='show jobs run with group/bank')
    parser.add_argument('-p', metavar='<project/wckey>', help='show jobs run with project/wckey')
    parser.add_argument('-C', action='store_true', help='print command line.')
    main(parser.parse_args())  # main() is defined elsewhere in lacct.py
```
So I took a look at flux-core's job-archive and compared it with what you have listed in your JobRecord object. Here is a table comparing the two (as of now):
| JobRecord object | flux-core's job-archive DB |
| --- | --- |
| jobid | id |
| user | userid |
| group | account |
| project | missing (wckey) |
| hostlist | missing |
| nnodes | "rank" field in R column |
| sub | t_submit |
| start | t_start |
| end | t_inactive |
In terms of getting data out of the job-archive DB: since the database itself is SQLite, a Python interface (similar to the two scripts you described) should make it relatively easy to grab the data. Python has a pretty nice SQLite API, which is what I have been using with flux-accounting's database.
@cmoussa1, I just gave you lacct.py and lreport.py on the RZ.
A couple of things to note on your table above comparing the JobRecord object and the flux-core accounting DB: the 'group' in LSF is the bank, so you do have that. The 'project' is the Slurm wckey, which I don't believe you have. The hostlist isn't actually needed for accounting, but it can be useful for the hotline / system admins when a user calls in to ask about a job failure for a job that ran a week ago or similar.
@ryanday36: Yeah, we need to discuss this and make porting your scripts as easy as possible. Since @chu11 is out, @cmoussa1 or @grondo can tell you how to get the data into a format that is easy to slurp into your scripts.
You told us how you did this under LSF. Maybe you can start by describing that, plus the format of the output you get from LSF? Then we can tell you a few ways you can get this info. If an easy mechanism isn't there, we should open a ticket and put it into our plan.
I think this might be a good next development step for accounting (thanks @dongahn for the suggestion from #29), so I think I will pursue this next if there are no objections.
@ryanday36: do you think it would be possible to give me the scripts you mentioned above in this thread? I know you pasted some sections of them, but it might be nice to have the full files to work off of.
The other question probably worth discussing is where a usage-reporting script might reside: does it make sense to put those scripts in flux-accounting? Maybe it could be another subcommand, `flux account usage-report ...` or something. We could try to include all of the same options as the ones listed above.
This might also end up defining other gaps between what we report now and what we need to report in the future (I pointed a couple out already; fields like `group/bank`, `project`, etc.).
Just to update/let myself know where I am at: I've started poking around (early and experimentally) with implementing similar-style functionality within flux-accounting. Queries from the job-archive are fetched using pandas, which returns matching entries in a `DataFrame` object. I can iterate through these, extract the fields needed, and construct a `JobRecord` object:
```python
class JobRecord(object):
    '''
    A record of an individual job.
    '''
    def __init__(self, userid, jobid, ranks, t_submit, t_run, t_inactive, R):
        self.userid = userid
        self.jobid = jobid
        self.ranks = ranks
        self.t_submit = t_submit
        self.t_run = t_run
        self.t_inactive = t_inactive
        self.R = R

    @property
    def elapsed(self):
        # the t_* fields are already floating-point epoch timestamps in
        # the job-archive, so no time.mktime() conversion is needed here
        return self.t_inactive - self.t_run

    @property
    def queued(self):
        return self.t_run - self.t_submit

job_records = []
for index, row in dataframe.iterrows():
    job_record = JobRecord(
        row["userid"],
        row["id"],
        row["ranks"],
        row["t_submit"],
        row["t_run"],
        row["t_inactive"],
        row["R"],
    )
    job_records.append(job_record)
```
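As a side note on the conversion step, `itertuples()` is usually a faster alternative to the `iterrows()` loop above. Here is a small self-contained sketch of the same idea; the column names mirror the job-archive fields used in this thread, and plain dicts stand in for `JobRecord` objects to keep the example short.

```python
import pandas as pd

def to_job_records(dataframe):
    """Convert job-archive rows in a DataFrame into plain record dicts
    (a stand-in for constructing JobRecord objects)."""
    return [
        {
            "userid": row.userid,
            "jobid": row.id,
            "t_submit": row.t_submit,
            "t_run": row.t_run,
            "t_inactive": row.t_inactive,
        }
        for row in dataframe.itertuples(index=False)
    ]
```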
I think I am making pretty good progress on this so far. As of now we can pass in a few different options, similar to those you have listed above:

- `by-user`: show jobs run by username
- `by-jobid`: show job info from jobid
- `after-start-time`: show jobs that completed after start time
- `before-end-time`: show jobs that completed before end time

each of which returns a list of `JobRecord`s.
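One way to wire up commands like these is with argparse subparsers; the sketch below is illustrative only (the command names match the thread, but the actual flux-accounting CLI wiring may differ).

```python
import argparse

def build_parser():
    # hypothetical program name; the real command is `flux account ...`
    parser = argparse.ArgumentParser(prog="flux-account-jobs")
    parser.add_argument("-p", metavar="<path>",
                        help="path to the job-archive DB")
    sub = parser.add_subparsers(dest="command")

    by_user = sub.add_parser("by-user", help="show jobs run by username")
    by_user.add_argument("username")

    by_jobid = sub.add_parser("by-jobid", help="show job info from jobid")
    by_jobid.add_argument("jobid", type=int)

    return parser
```

Each subparser gets its own positional arguments, and `args.command` tells the dispatcher which list-of-`JobRecord`s query to run.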
```console
$ flux account -p /var/lib/flux/jobs.db by-user fluxuser
FOUND 6 RECORDS FROM USER 1001

RECORD 1
userid: 1001
username: fluxuser
jobid: 728382832640
t_submit: 1597421529.0616362
t_run: 1597421529.084426
t_inactive: 1597421529.1862037
nnodes: 1

RECORD 2
userid: 1001
username: fluxuser
jobid: 708015292416
t_submit: 1597421527.848035
t_run: 1597421527.8755805
t_inactive: 1597421527.9819384
nnodes: 1
...
```
Like we talked about in flux-framework/flux-core #3136, fields like `bank` and `project/wckey` can be added once those get passed into the job-archive. Currently I just have the records printed to the screen, but might it also be useful to have an option to write the `JobRecord`s to a file, maybe in CSV or another format? @ryanday36