
Comments (8)

dongahn commented on August 11, 2024

@cmoussa1: awesome progress! Yeah, it will be good to find out what format @ryanday36 needs in order to port his scripts. You could export some data to that format and see if that's good enough for @ryanday36.

from flux-accounting.

ryanday36 commented on August 11, 2024

I put together two python scripts for reporting on jobs run in LSF. The first (/usr/tce/packages/lacct/lacct-1.0/bin/lacct.py on Lassen, rzansel, butte, etc) wraps the 'bhist' command to read all jobs that meet a given selection query (specified by the command arguments: user, account/bank, wckey, start, end) into a list of JobRecord objects and then print those out. The second (/usr/tce/packages/lacct/lacct-1.0/bin/lreport.py) imports lacct.py to get that list of JobRecords and then sums up the total usage according to the command arguments (e.g. byuser, bygroup, etc.).

So, the fastest way that I could adapt those to flux data would probably be a Python module that gave me a function that would return a list of JobRecords for a given user, bank, wckey, starttime, endtime (default values should be 'all' for each of these).

Here's what my JobRecord object looks like:

import time

class JobRecord(object):
        '''
        A record of an individual job.
        '''

        def __init__(self, jobid, user, group, project, nnodes, hostlist, sub, start, end) :
                self.jobid = jobid
                self.user = user
                self.group = group
                self.project = project
                self.hostlist = hostlist
                self.nnodes = nnodes
                # sub, start, and end are struct_time values
                self.sub = sub
                self.start = start
                self.end = end

        @property
        def elapsed(self) :
                # run time in seconds
                return time.mktime(self.end) - time.mktime(self.start)

        @property
        def queued(self) :
                # time spent waiting in the queue, in seconds
                return time.mktime(self.start) - time.mktime(self.sub)

And here's the arguments that I take for 'lacct' (selection criteria for jobs):

if __name__ == '__main__' :
        parser = argparse.ArgumentParser(description='Show information about completed jobs.')
        parser.add_argument('-P', action='store_true', help='output "|" delimited columns for easy parsing')
        parser.add_argument('-v', action='store_true', help='output more details about jobs')
        parser.add_argument('-s', metavar='<start_time>', help='show jobs that completed after start_time')
        parser.add_argument('-e', metavar='<end_time>', help='show jobs that completed before end_time')
        parser.add_argument('-j', metavar='<jobid>', help='show job with jobid')
        parser.add_argument('-u', metavar='<username>', help='show jobs run by username')
        parser.add_argument('-g', metavar='<group/bank>', help='show jobs run with group/bank')
        parser.add_argument('-p', metavar='<project/wckey>', help='show jobs run with project/wckey')
        parser.add_argument('-C', action='store_true', help='print command line.')
        main(parser.parse_args())
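
The module function described above (a list of JobRecords for a given user, bank, wckey, start time, and end time, each defaulting to 'all') might look roughly like the sketch below. The name `get_job_records`, the `Job` stand-in type, and the in-memory list are hypothetical; a real version would read from whatever store flux ends up providing, and times are plain epoch seconds here for simplicity.

```python
from collections import namedtuple

# Hypothetical stand-in for JobRecord, reduced to the fields used for selection.
# Times are epoch seconds here for simplicity.
Job = namedtuple("Job", "jobid user group project start end")


def get_job_records(jobs, user="all", bank="all", wckey="all",
                    start="all", end="all"):
    """Return the jobs matching every criterion; 'all' disables a criterion."""
    def keep(job):
        return ((user == "all" or job.user == user)
                and (bank == "all" or job.group == bank)
                and (wckey == "all" or job.project == wckey)
                # same meaning as lacct's -s / -e: completed after / before
                and (start == "all" or job.end > start)
                and (end == "all" or job.end < end))
    return [job for job in jobs if keep(job)]


# Made-up sample data.
jobs = [
    Job(1, "alice", "bankA", "proj1", 100.0, 200.0),
    Job(2, "bob", "bankA", "proj2", 150.0, 400.0),
    Job(3, "alice", "bankB", "proj1", 300.0, 500.0),
]

print([j.jobid for j in get_job_records(jobs, user="alice")])
print([j.jobid for j in get_job_records(jobs, bank="bankA", end=300.0)])
```

Keeping 'all' as the default for every argument means a call with no filters returns every record, which matches the behavior requested above.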


cmoussa1 commented on August 11, 2024

So I took a look at flux-core's job archive and tried to compare what you have listed in your job record object. Here is a table comparing the two (as of now):

| JobRecord object | flux-core's job-archive DB |
| --- | --- |
| jobid | id |
| user | userid |
| group | account |
| project | missing (wckey) |
| hostlist | missing |
| nnodes | "rank" field in R column |
| sub | t_submit |
| start | t_start |
| end | t_inactive |

In terms of getting data out of the job-archive DB: since the database itself is SQLite, I think a Python interface similar to the two scripts you described should work well for pulling data out (Python has a pretty nice SQLite API, which is what I have been using with flux-accounting's database).
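
As a rough illustration of that approach, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `jobs`, the use of `t_run` as the start-time column, and the helper `fetch_jobs` are assumptions made for the example, and the in-memory database with made-up rows only stands in for a real job-archive file.

```python
import sqlite3

# Assumed schema: a `jobs` table with columns like those compared above.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets rows be indexed by column name
conn.execute(
    "CREATE TABLE jobs (id INTEGER, userid INTEGER, "
    "t_submit REAL, t_run REAL, t_inactive REAL, R TEXT)"
)
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 1001, 10.0, 12.0, 20.0, "{}"),
        (2, 1002, 11.0, 15.0, 30.0, "{}"),
        (3, 1001, 40.0, 41.0, 50.0, "{}"),
    ],
)


def fetch_jobs(conn, userid=None, after=None, before=None):
    """Select jobs, adding one parameterized WHERE clause per filter given."""
    clauses, params = [], []
    if userid is not None:
        clauses.append("userid = ?")
        params.append(userid)
    if after is not None:
        clauses.append("t_inactive > ?")
        params.append(after)
    if before is not None:
        clauses.append("t_inactive < ?")
        params.append(before)
    query = "SELECT * FROM jobs"
    if clauses:
        query += " WHERE " + " AND ".join(clauses)
    return conn.execute(query + " ORDER BY id", params).fetchall()


for row in fetch_jobs(conn, userid=1001, after=25.0):
    # job id and run time in seconds
    print(row["id"], row["t_inactive"] - row["t_run"])
```

Using `?` placeholders rather than string formatting keeps user-supplied values such as usernames or timestamps out of the SQL text itself.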


ryanday36 commented on August 11, 2024

@cmoussa1, I just gave you lacct.py and lreport.py on the RZ.

A couple of things to note on your table above comparing the JobRecord object and the flux-core accounting db: the 'group' in LSF is the bank, so you do have that. The 'project' is the Slurm wckey, which I don't believe you have. The hostlist isn't actually needed for accounting, but it can be useful for the hotline / system admins when a user calls in to ask about a failure of a job that ran a week ago or similar.


dongahn commented on August 11, 2024

@ryanday36: Yeah, we need to discuss this and make the porting of your script as easy as possible. Since @chu11 is out, could @cmoussa1 or @grondo tell you how to get the data into a format that is easy to slurp into your script?

You told us about how you did this under LSF. Maybe you can describe that + the format of the output you get from LSF to begin with? Then we can tell you a few ways you can get this info. If an easy mechanism is not there, we should open up a ticket and put that into our plan.


cmoussa1 commented on August 11, 2024

I think this might be a good next development step for accounting (thanks @dongahn for the suggestion from #29), so I will pursue this next if there are no objections.

@ryanday36: do you think it would be possible to give me the scripts you mentioned above in this thread? I know you pasted some sections of it but it might be nice to have the full file to work off of.

The other question probably worth discussing is where a usage reporting script might reside: does it make sense to put those scripts in flux-accounting? Maybe it could be another subcommand, flux account usage-report ... or something. We could try to include all of the same options as the ones listed above.

This might end up also defining other gaps in what we are reporting as of now and what we need to report in the future (I pointed a couple out already; fields like group/bank, project, etc).
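
To make the reporting side concrete, here is a minimal sketch of the kind of grouping a usage report performs: summing usage per user or per bank. Node-seconds (nodes times elapsed time) is an assumed metric, and `usage_by` and the `Job` type are hypothetical; the thread does not show how lreport.py actually computes its totals.

```python
from collections import defaultdict, namedtuple

# Hypothetical record: only the fields needed for a usage total.
Job = namedtuple("Job", "user bank nnodes t_run t_inactive")


def usage_by(jobs, field):
    """Sum node-seconds (nnodes * elapsed time) grouped by the given field."""
    totals = defaultdict(float)
    for job in jobs:
        elapsed = job.t_inactive - job.t_run
        totals[getattr(job, field)] += job.nnodes * elapsed
    return dict(totals)


# Made-up sample data.
jobs = [
    Job("alice", "bankA", 2, 0.0, 100.0),
    Job("bob", "bankA", 1, 0.0, 50.0),
    Job("alice", "bankB", 4, 10.0, 20.0),
]

print(usage_by(jobs, "user"))
print(usage_by(jobs, "bank"))
```

Passing the grouping field as an argument lets one function cover the byuser and bygroup style options without duplicating the summing loop.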


cmoussa1 commented on August 11, 2024

Just to update/let myself know where I am at: I've started poking around (early and experimentally 😉) with implementing similar functionality within flux-accounting. Queries against the job-archive are run using pandas, which returns the matching entries in a DataFrame object. I can iterate through these, extract the fields needed, and construct a JobRecord object for each:

class JobRecord(object):
        '''
        A record of an individual job.
        '''

        def __init__(self, userid, jobid, ranks, t_submit, t_run, t_inactive, R) :
                self.userid = userid
                self.jobid = jobid
                self.ranks = ranks
                self.t_submit = t_submit
                self.t_run = t_run
                self.t_inactive = t_inactive
                self.R = R

        @property
        def elapsed(self) :
                # t_* values are epoch seconds (floats), so subtract directly;
                # time.mktime() expects a struct_time and would raise TypeError
                return self.t_inactive - self.t_run

        @property
        def queued(self) :
                return self.t_run - self.t_submit

job_records = []

# dataframe holds the rows returned by the job-archive query
for index, row in dataframe.iterrows():
    job_record = JobRecord(row["userid"], row["id"], row["ranks"], row["t_submit"], row["t_run"], row["t_inactive"], row["R"])
    job_records.append(job_record)
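
Since the R column is carried on each record, a node count could be derived from it along the lines of the sketch below. It assumes R is a JSON string whose `execution.R_lite` entries each hold a `rank` idset such as "0-3,7"; the helper names and the example value are made up, so check them against the R actually stored in your job-archive.

```python
import json


def count_ids(idset):
    """Count the ids in a string such as "0-3,7" (ranges are inclusive)."""
    total = 0
    # tolerate an optional bracketed form like "[0-3]"
    for part in idset.strip("[]").split(","):
        if "-" in part:
            lo, hi = part.split("-")
            total += int(hi) - int(lo) + 1
        else:
            total += 1
    return total


def nnodes_from_R(R):
    """Sum the rank counts over every R_lite entry of a JSON-encoded R."""
    r_lite = json.loads(R)["execution"]["R_lite"]
    return sum(count_ids(str(entry["rank"])) for entry in r_lite)


# Made-up R value in the assumed layout.
example_R = json.dumps({
    "version": 1,
    "execution": {
        "R_lite": [
            {"rank": "0-3", "children": {"core": "0-7"}},
            {"rank": "7", "children": {"core": "0-3"}},
        ]
    },
})

print(nnodes_from_R(example_R))
```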


cmoussa1 commented on August 11, 2024

I think I am making pretty good progress on this so far. As of now we can pass in a few different options, similar to those you have listed above:

    by-user             show jobs run by username
    by-jobid            show job info from jobid
    after-start-time    show jobs that completed after start time
    before-end-time     show jobs that completed before end time

each of which returns a list of JobRecords.

$ flux account -p /var/lib/flux/jobs.db by-user fluxuser

FOUND 6 RECORDS FROM USER 1001
RECORD 1
userid: 1001 
 username: fluxuser 
 jobid: 728382832640 
 t_submit: 1597421529.0616362 
 t_run: 1597421529.084426 
 t_inactive: 1597421529.1862037 
 nnodes: 1 

RECORD 2
userid: 1001 
 username: fluxuser 
 jobid: 708015292416 
 t_submit: 1597421527.848035 
 t_run: 1597421527.8755805 
 t_inactive: 1597421527.9819384 
 nnodes: 1 
.
.
.
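
For readers curious how subcommands like these can be wired up, here is a minimal `argparse` sketch. It is not the flux-accounting implementation: the long option `--path` and the positional names `username`, `jobid`, `start_time`, and `end_time` are hypothetical, and only the subcommand names, their help text, and the `-p` flag come from this thread.

```python
import argparse


def build_parser():
    """Build a parser with one subcommand per selection option."""
    parser = argparse.ArgumentParser(prog="flux account")
    parser.add_argument("-p", "--path", metavar="DB_PATH",
                        help="path to the job-archive database")
    sub = parser.add_subparsers(dest="command", required=True)

    by_user = sub.add_parser("by-user", help="show jobs run by username")
    by_user.add_argument("username")

    by_jobid = sub.add_parser("by-jobid", help="show job info from jobid")
    by_jobid.add_argument("jobid", type=int)

    after = sub.add_parser("after-start-time",
                           help="show jobs that completed after start time")
    after.add_argument("start_time", type=float)

    before = sub.add_parser("before-end-time",
                            help="show jobs that completed before end time")
    before.add_argument("end_time", type=float)
    return parser


# Parse the same arguments as the invocation shown above.
args = build_parser().parse_args(
    ["-p", "/var/lib/flux/jobs.db", "by-user", "fluxuser"]
)
print(args.command, args.username, args.path)
```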

Like we talked about in flux-framework/flux-core #3136, fields like bank and project/wckey can be added once those get passed into the job-archive. Currently I just have them printed to the screen, but might it also be useful for there to be an option to have the JobRecords written to a file, maybe in CSV or other format? @ryanday36
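
If a CSV option turns out to be useful, a minimal sketch with Python's standard `csv` module could look like this. The function `write_csv`, the column order in `FIELDS`, and the use of plain dicts are assumptions for illustration; the sample values are taken from the first record printed above.

```python
import csv
import io

# Column order for the output file (an assumption; adjust as needed).
FIELDS = ["userid", "username", "jobid", "t_submit", "t_run",
          "t_inactive", "nnodes"]


def write_csv(records, stream):
    """Write one CSV row per record dict, preceded by a header row."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)


records = [
    {"userid": 1001, "username": "fluxuser", "jobid": 728382832640,
     "t_submit": 1597421529.0616362, "t_run": 1597421529.084426,
     "t_inactive": 1597421529.1862037, "nnodes": 1},
]

buffer = io.StringIO()  # stands in for a file opened with newline=""
write_csv(records, buffer)
print(buffer.getvalue())
```

Writing to a stream rather than a hard-coded path keeps the same function usable for files, standard output, or tests.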

