rueckstiess / mtools

A collection of scripts to set up MongoDB test environments and parse and visualize MongoDB log files.

License: Apache License 2.0

Python 99.88% JavaScript 0.12%

mtools's Introduction

mtools

mtools is a collection of helper scripts to parse, filter, and visualize MongoDB log files (mongod, mongos). mtools also includes mlaunch, a utility to quickly set up complex MongoDB test environments on a local machine, and mtransfer, a tool for transferring databases between MongoDB instances.

What's in the box?

The following tools are in the mtools collection:

mlogfilter

slices log files by time, merges log files, filters slow queries, finds table scans, shortens log lines, filters by other attributes, converts to JSON

mloginfo

returns information about a log file, such as start and end time, version, and binary, plus special sections for restarts, connections, and distinct view (requires numpy)

mplotqueries

visualizes log files with different types of plots (requires matplotlib)

mlaunch

a script to quickly spin up local test environments, including replica sets and sharded systems (requires pymongo, psutil, packaging)

mtransfer

an experimental script to transfer databases between MongoDB instances by copying WiredTiger data files (requires pymongo and wiredtiger)

For more information, see the mtools documentation.

Requirements and Installation Instructions

The mtools collection is written in Python, and most of the tools only use the standard packages shipped with Python. The tools are currently tested with Python 3.8, 3.9, 3.10, and 3.11.

Some of the tools have additional dependencies, which are listed under the specific tool's section. See the installation instructions for more information.

The mtools suite is only tested with actively supported <https://www.mongodb.com/support-policy/lifecycles> (non End-of-Life) versions of the MongoDB server. As of September 2023, that includes MongoDB 4.4 or newer.

Recent Changes

See Changes to mtools for a list of changes from previous versions of mtools.

Contribute to mtools

If you'd like to contribute to mtools, please read the contributor page for instructions.

Disclaimer

This software is not supported by MongoDB, Inc. under any of their commercial support subscriptions or otherwise. Any usage of mtools is at your own risk. Bug reports, feature requests and questions can be posted in the Issues section on GitHub.

mtools's People

Contributors

aheckmann, ajdavis, autarch, blink1073, chiemseesurfer, corymintz, devkev, garycahill, gianpaj, gormanb, jamesbroadhead, jaraco, jimoleary, kallimachos, kevinadi, mitesh-gosavi, nleite, p, p-mongo, pash10g, pzrq, rozza, rueckstiess, savinay-vijay, shaneharvey, shrayolacrayon, stennie, steve-hand, svisser, turtlemonvh

mtools's Issues

mplotqueries: implement gap-threshold for range plots

If two data points are further apart than the gap-threshold, the range bar is stopped at the last point and started again with the new point.

For this we also need to find a way to pass customized arguments to a plot type.
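
The splitting rule described above could be sketched as follows. This is only an illustration, assuming the data points arrive as a sorted list of datetime objects; the function name `split_on_gaps` and its parameters are hypothetical and not part of mplotqueries.

```python
from datetime import datetime, timedelta


def split_on_gaps(timestamps, gap_threshold):
    """Split sorted timestamps into (start, end) ranges.

    A range is closed at the last point whenever the next point is
    further away than gap_threshold, and a new range starts there.
    """
    ranges = []
    start = prev = None
    for ts in timestamps:
        if start is None:
            start = prev = ts
        elif ts - prev > gap_threshold:
            ranges.append((start, prev))  # stop the bar at the last point
            start = prev = ts             # start again with the new point
        else:
            prev = ts
    if start is not None:
        ranges.append((start, prev))
    return ranges
```

Each returned pair would then be drawn as one range bar; how the threshold reaches the plot type is the open question mentioned above.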

[mlaunch] Don't accept invalid cmd line arguments in certain cases

    $ mlaunch --single --verbose --helpz
    creating directory: ./data/db
    launching: mongod  --dbpath ./data/db --logpath ./data/mongod.log --port 27017 --logappend  --helpz  --fork
    waiting for mongod to start up...
    # takes a long time
    mongod at Gianfranco-10gen.local:27017 running.

mlaunch reports the instance as running, but mongod never actually starts.

The same happens for:

    mlaunch --replica --verbose --helpz
    mlaunch --replicaset --verbose --helpz
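
One possible direction, sketched with argparse's `parse_known_args`: collect the arguments mlaunch does not know and reject any that are not on a pass-through list before launching anything. The `PASSTHROUGH_OPTIONS` set below is an assumption that only contains the mongod options visible in the output above, and `parse_launch_args` is a hypothetical helper, not existing mlaunch code.

```python
import argparse

# Assumption: illustrative subset of options that may be handed on to mongod.
PASSTHROUGH_OPTIONS = {"--dbpath", "--logpath", "--port", "--logappend", "--fork"}


def parse_launch_args(argv):
    """Parse known options and fail early on unknown pass-through flags."""
    parser = argparse.ArgumentParser(prog="mlaunch-sketch")
    parser.add_argument("--single", action="store_true")
    parser.add_argument("--verbose", action="store_true")
    args, unknown = parser.parse_known_args(argv)
    # Anything that looks like a flag must be on the pass-through list.
    bad = [a for a in unknown
           if a.startswith("--") and a.split("=")[0] not in PASSTHROUGH_OPTIONS]
    if bad:
        parser.error("unrecognized arguments: " + " ".join(bad))  # exits with status 2
    return args, unknown
```

Checking the exit status of the forked process after launch would be a complementary safeguard.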

Consolidate mlogvisjs library and mlogvis tool

mlogvisjs is a stand-alone library based on the mlogvis javascript code. We don't want to keep both versions around. The js library is further advanced now, so I'd like to try to use only that.

Necessary steps:

  • move mlogvisjs into mtools
  • write contents to the top of the index.html file each time, similar to how the data is written

mplotqueries: Option to plot certain lines that are not timed operations

Sometimes it would be useful to plot log lines that aren't necessarily timed operations. Let's say you filtered/grepped for a small set of "events" in the logfile (e.g. replica set status changes, etc) and want to visualize those as well.

I could see this as a useful scenario:

    grep "is now in state" mongod.log | mplotqueries --plot-untimed

To visualize, they could draw vertical thin lines instead of dots (because they don't have a y-axis value).

Ideally, I'd like to overlay such a plot with the original timed plot. Need to work out how that would be possible.

mlogfilter: add human-readable option

Add option -h / --human to enable human-readable format.

This would convert each line's ms numbers to min, hours, days, ...
It would also insert commas for very large values of nscanned, nreturned, ...

Make sure that --slow, --scan, and other mtools, like mplotqueries, still work.
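
A rough sketch of the conversion, assuming old-style log lines that end in a duration such as `3723004ms` and carry counters in the form `nscanned:1234567`. The helper names are hypothetical. Applying the conversion as the very last output step would keep --slow, --scan, and downstream tools working on the raw values.

```python
import re


def humanize_ms(ms):
    """Convert a millisecond count to a string such as '1h 2min 3s 4ms'."""
    units = [("d", 86_400_000), ("h", 3_600_000), ("min", 60_000), ("s", 1_000), ("ms", 1)]
    parts = []
    for name, size in units:
        value, ms = divmod(ms, size)
        if value:
            parts.append(f"{value}{name}")
    return " ".join(parts) or "0ms"


def humanize_line(line):
    """Insert thousands separators into counters and expand the trailing duration."""
    line = re.sub(r"\b(nscanned|nreturned|ntoreturn):(\d+)",
                  lambda m: f"{m.group(1)}:{int(m.group(2)):,}", line)
    return re.sub(r"\b(\d+)ms$", lambda m: humanize_ms(int(m.group(1))), line)
```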

No escaped $ signs

This should work:

    mplotqueries mongod.log --ns admin.$cmd

Currently, you must escape the $ sign instead:

    mplotqueries mongod.log --ns admin.\$cmd

Fix --exclude-ns arg or the usage help

If you follow the usage help by putting --exclude-ns before the filename it errors out.

    ./mplotqueries.py --exclude-ns "(command)" mongo-live-a-4_2-13_mongodb.log
    usage: mplotqueries.py [-h] [--ns [NS [NS ...]]] [--log]
                           [--exclude-ns [NS [NS ...]]]
                           filename
    mplotqueries.py: error: too few arguments

If you put it after the filename, it works fine:

    ./mplotqueries.py mongo-live-a-4_2-13_mongodb.log --exclude-ns "(command)"
    {'exclude_ns': ['(command)'], 'ns': None, 'log': False, 'filename': 'mongo-live-a-4_2-13_mongodb.log'}
    0 live.player_action 1
    ...

I think the fix is just moving lines 88 and 89 after, or changing the usage help.
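
For context, the error comes from argparse itself: an option declared with `nargs='*'` greedily consumes every following non-flag argument, including the filename. A possible alternative, shown here only as a sketch and not as what the script does, is `action="append"`, which takes exactly one value per flag and therefore works in either position. Note that this changes the command line: several namespaces would need the flag repeated.

```python
import argparse


def build_parser():
    """Parser sketch in which options cannot swallow the positional filename."""
    parser = argparse.ArgumentParser(prog="mplotqueries-sketch")
    parser.add_argument("filename")
    # One value per occurrence; repeat the flag for several namespaces.
    parser.add_argument("--ns", action="append")
    parser.add_argument("--exclude-ns", action="append")
    parser.add_argument("--log", action="store_true")
    return parser
```

With this parser, `--exclude-ns "(command)" mongod.log` and `mongod.log --exclude-ns "(command)"` parse to the same result.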

Add version numbers to mtools

Add version numbers for mtools releases. Should be included in the --help output, as well as a simple display with --version.
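
With argparse this could look roughly like the sketch below. The program name and the `__version__` value are placeholders; where the real version string would live is left open.

```python
import argparse

__version__ = "0.0.0"  # placeholder, not an actual mtools release number


def build_parser():
    """Parser sketch that shows the version in --help and via --version."""
    parser = argparse.ArgumentParser(
        prog="mtool-sketch",
        description=f"mtool-sketch version {__version__}",
    )
    # argparse's built-in 'version' action prints the string and exits.
    parser.add_argument("--version", action="version",
                        version=f"%(prog)s {__version__}")
    return parser
```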

mplotqueries: support for different color filters

Currently it is only possible to color by namespace. Refactor to make it possible to color by different aspects, e.g. by operation (update, remove, query, ...).

Possible usage:

    mplotqueries logfile --color namespace (default)
    mplotqueries logfile --color operation
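
One way the refactoring could be structured is a table of key functions, one per --color choice. The sketch assumes each parsed log event is a dict with 'namespace' and 'operation' keys; that representation, `COLOR_KEYS`, and `assign_colors` are all hypothetical.

```python
from itertools import cycle

# Assumption: one key function per supported --color value.
COLOR_KEYS = {
    "namespace": lambda ev: ev["namespace"],
    "operation": lambda ev: ev["operation"],
}


def assign_colors(events, color_by="namespace",
                  palette=("b", "g", "r", "c", "m", "y", "k")):
    """Map each distinct group to a color, in order of first appearance."""
    key = COLOR_KEYS[color_by]
    pool = cycle(palette)  # reuse colors once the palette is exhausted
    colors = {}
    for ev in events:
        group = key(ev)
        if group not in colors:
            colors[group] = next(pool)
    return colors
```

Adding a new coloring aspect would then only require another entry in the table.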

mplotqueries: add optional "created by .. " string

Similar to other open source tools, I suggest adding a "created by ..." tagline at the bottom of generated charts. This should include the version and GitHub URL, e.g.:
"Created by mplotqueries v0.31 (https://github.com/rueckstiess/mtools)"

I would also add a command line option to disable this (--nobanner), but keep it enabled by default.

This will help others discover mtools to create their own charts ;)

User definable $PATH

Instead of setting $PATH before running mlaunch, it would be nice to allow a user to specify the path to the correct binary to use.

Restructure README.md

Remove usage from each script section. Create new documentation (either in the wiki or as separate pages) to explain each of the tools in detail.

Should make it easier to maintain as well, as each change doesn't require README.md to be updated.

mlaunch: check on launch if dbpath exists

If the dbpath exists already (from previous mlaunch), then compare the startup options. If they are different, exit with warning that existing data will be overwritten and that mlaunch --restore should be used to restart.

Should we allow the script to run if the same number of nodes is started? What about a different number of mongos, or more shards?

Perhaps offer --force to force overwrite.
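
A minimal sketch of such a check, assuming the startup options are stored as JSON inside the dbpath. The file name `.mlaunch_startup.json` and both helper functions are invented for illustration; a strict equality comparison is used here, which leaves the questions above about node and shard counts unanswered.

```python
import json
import os

STATE_FILE = ".mlaunch_startup.json"  # hypothetical file name


def save_options(dbpath, options):
    """Record the startup options next to the data files."""
    os.makedirs(dbpath, exist_ok=True)
    with open(os.path.join(dbpath, STATE_FILE), "w") as f:
        json.dump(options, f, sort_keys=True)


def check_dbpath(dbpath, options, force=False):
    """Return 'new', 'same', or 'overwrite'; raise on an unforced mismatch."""
    path = os.path.join(dbpath, STATE_FILE)
    if not os.path.exists(path):
        return "new"
    with open(path) as f:
        previous = json.load(f)
    if previous == options:
        return "same"
    if not force:
        raise ValueError("startup options differ from the existing environment; "
                         "existing data would be overwritten")
    return "overwrite"
```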

mplotqueries: Graph ratio of nscanned to n(toreturn|returned)

(I searched the repository for "nscanned" and didn't see anything relevant for mplotqueries, so I hope I didn't miss an existing feature request! Feel free to close if I did miss an existing issue.)

Currently, the y-axis of charts produced by mplotqueries is some function of the runtime of slow queries being logged. It'd be great if there was an option to show the ratio of nscanned to n (maybe either logarithmic, if that makes sense?), nscanned, or just n.

Obviously, there's a correlation between nscanned/n and query runtime, but there are cases where nscanned/n is very high while the overall runtime is not a problem from the developer's perspective. However, when many identical or similar queries are run simultaneously, the inefficiency of a high nscanned/n becomes much more apparent.
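
To make the request concrete, the y-value could be computed per log line roughly as below. The sketch assumes counters appear in the line as `nscanned:123`, `nreturned:45`, and `ntoreturn:0`; `scan_ratio` is a hypothetical helper, and lines without usable counters are skipped by returning None.

```python
import re

FIELD = re.compile(r"\b(nscanned|nreturned|ntoreturn):(\d+)")


def scan_ratio(line):
    """Return nscanned / n for a log line, or None if it cannot be computed."""
    fields = {name: int(value) for name, value in FIELD.findall(line)}
    nscanned = fields.get("nscanned")
    # Prefer nreturned; fall back to ntoreturn if nreturned is absent.
    n = fields.get("nreturned", fields.get("ntoreturn"))
    if nscanned is None or not n:
        return None  # missing counters or division by zero
    return nscanned / n
```

A logarithmic y-axis would likely suit this ratio, since it can span several orders of magnitude.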

Add unit tests

This is a big goal, but if this project continues to grow, which I hope it will, we'll need tests in place so we don't break anything. Tests would also make us more confident to try things without being scared of breaking stuff.

Group slow queries together and display statistics

Group slow queries and show counts, to easily identify the most inefficient queries. The grouping needs to be smart enough to put similar queries together while allowing for value changes. Maybe use logic similar to what the query optimizer uses.

Possible output:

    <query type>    #occurrences    average time    total time    avg nscanned    avg nreturned
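
A simple sketch of the grouping idea: replace literal values with a placeholder so that queries differing only in their values share one "shape", then aggregate per shape. This regex-based normalization is an assumption made for illustration and is much cruder than what a query optimizer does; the input tuple layout is also hypothetical.

```python
import re
from collections import defaultdict


def query_shape(query):
    """Replace quoted strings and numbers with '?' so similar queries match."""
    shape = re.sub(r'"[^"]*"', "?", query)
    return re.sub(r"\b\d+(\.\d+)?\b", "?", shape)


def group_queries(entries):
    """entries: iterable of (query, duration_ms, nscanned, nreturned).

    Returns rows of (shape, occurrences, avg ms, total ms, avg nscanned,
    avg nreturned), sorted by total time, largest first.
    """
    groups = defaultdict(lambda: {"count": 0, "ms": 0, "nscanned": 0, "nreturned": 0})
    for query, ms, nscanned, nreturned in entries:
        g = groups[query_shape(query)]
        g["count"] += 1
        g["ms"] += ms
        g["nscanned"] += nscanned
        g["nreturned"] += nreturned
    rows = [(shape, g["count"], g["ms"] / g["count"], g["ms"],
             g["nscanned"] / g["count"], g["nreturned"] / g["count"])
            for shape, g in groups.items()]
    return sorted(rows, key=lambda row: row[3], reverse=True)
```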

socket availability check buggy?

Sometimes I kill instances and it takes a minute or so before mlaunch lets me start a new set on the old ports. I'm not sure whether the socket test is reliable or what the problem is.

We should also check all ports before starting any processes. Sometimes, it starts a process on port 27017 but then complains that 27018 is still in use. By then the first process is already running and needs to be killed off (which again can take some time).
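
Checking every port up front could be sketched like this. The helpers are hypothetical and use a plain TCP connect test, which only tells whether something is currently listening; whether that matches the check mlaunch performs, or explains the delay described above, is not established here.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 on success, an errno value otherwise.
        return s.connect_ex((host, port)) != 0


def busy_ports(ports, host="127.0.0.1"):
    """Return the subset of ports that are already in use."""
    return [p for p in ports if not port_is_free(p, host)]
```

The launcher would call `busy_ports` with the full list of planned ports and abort before starting any process if the result is non-empty.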
