rueckstiess / mtools
A collection of scripts to set up MongoDB test environments and parse and visualize MongoDB log files.
License: Apache License 2.0
As other open source tools do, mplotqueries should add a "created by ..." tagline at the bottom of generated charts. It should include the version and GitHub URL, e.g.:
"Created by mplotqueries v0.31 (https://github.com/rueckstiess/mtools)"
There should also be a command line option to disable this (--nobanner), with the banner enabled by default.
This will help others discover mtools to create their own charts ;)
To filter only databases, --ns and --exclude-ns should accept wildcards.
Then it would be possible to write
--ns "admin.*" to include all admin collections
--exclude-ns "bla*" to exclude all databases starting with bla
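A minimal sketch of how such wildcard matching could work, using shell-style patterns from Python's standard `fnmatch` module. The function names `namespace_matches` and `keep_namespace` are hypothetical and not part of mtools, and the pattern syntax (`admin.*`, `bla*`) is an assumption about what the request intends.

```python
from fnmatch import fnmatchcase


def namespace_matches(namespace, patterns):
    """Return True if the namespace matches any shell-style wildcard pattern."""
    return any(fnmatchcase(namespace, pattern) for pattern in patterns)


def keep_namespace(namespace, include=None, exclude=None):
    """Apply include patterns first, then exclude patterns (hypothetical helper)."""
    if include and not namespace_matches(namespace, include):
        return False
    if exclude and namespace_matches(namespace, exclude):
        return False
    return True
```

With this sketch, `keep_namespace("admin.system.users", include=["admin.*"])` is true, while `keep_namespace("blabla.coll", exclude=["bla*"])` is false.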
And in addition to --kill
also offer --remove
which kills the instances and removes the data folder.
Currently it is only possible to color by namespace. Refactor to make it possible to color by different aspects, e.g. by operation (update, remove, query, ...).
Possible usage:
mplotqueries logfile --color namespace (default)
mplotqueries logfile --color operation
Sometimes I kill instances and it takes a minute or so before mlaunch lets me start a new set on the old ports. I'm not sure whether the socket test is reliable or what the problem is.
We should also check all ports before starting any processes. Sometimes mlaunch starts a process on 27017 but then complains that 27018 is still in use. By then the first process is already running and needs to be killed off (which again can take some time).
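One possible way to check every port up front, sketched under the assumption that a plain bind test is an acceptable proxy for "port is free". The helper names `port_is_free` and `busy_ports` are hypothetical, not taken from mlaunch, and this check may not behave exactly like mongod's own socket setup.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def busy_ports(ports, host="127.0.0.1"):
    """Return the subset of ports that cannot be bound (hypothetical helper)."""
    return [port for port in ports if not port_is_free(port, host)]
```

A launcher could call `busy_ports(range(27017, 27020))` before starting anything and abort if the result is non-empty, so no half-started set is left behind.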
Big goal, but if this project continues to grow, which I hope it will, we'll need tests in place so we don't break anything. Tests would also make us more confident to try things without being scared of breaking stuff.
mtools/util/logfile.py
This should work:
mplotqueries mongod.log --ns admin.$cmd
Currently, you must use instead:
mplotqueries mongod.log --ns admin.\$cmd
Millisecond parsing (input) is already implemented through LogLine. The output should detect whether any of the files use the millisecond format and print timestamps accordingly.
use numbers 0-9 to hide and unhide plots.
start counting at 1 for real plots.
use 0 to hide/unhide all plots.
mlogvisjs is a stand-alone library based on the mlogvis javascript code. We don't want to keep both versions around. The js library is further advanced now, so I'd like to try to use only that.
Necessary steps:
index.html
file each time, similar to how the data is written
Add option -h / --human to enable human-readable format.
This would convert each line's ms numbers to min, hours, days, ...
It would also insert commas for very large values of nscanned, nreturned, ...
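A rough sketch of the two conversions described above. The function names, the unit suffixes, and the one-decimal rounding are illustrative assumptions, not the format mtools uses.

```python
def human_duration(ms):
    """Format a millisecond count using the largest unit that fits."""
    units = [("d", 86400000), ("h", 3600000), ("min", 60000), ("s", 1000)]
    for suffix, size in units:
        if ms >= size:
            return "%.1f%s" % (ms / size, suffix)
    return "%dms" % ms


def human_count(n):
    """Insert thousands separators, e.g. for nscanned or nreturned."""
    return "{:,}".format(n)
```

For example, `human_duration(90000)` gives `1.5min` and `human_count(1234567)` gives `1,234,567`.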
Make sure that --slow, --scan, and other mtools, like mplotqueries, still work.
bucketing namespace/operations, so you can see whether a large number of operations occurred within a given unit of time.
This can be useful for simply time-shifting a single file.
If no restart (or start) message found, return a warning or error
filtering by namespace, operation, thread, ...
mlogfilter logfile --namespace "test.collection"
mlogfilter logfile --operation update query
mlogfilter logfile --thread conn123
Something
for stdin, provide buffer that makes the input seekable (line numbers, start and end of file).
also support DB storage (mongod).
Group slow queries and show counts, to easily identify the most inefficient queries. This needs to be smart enough to group similar queries together while allowing for different values. Maybe use logic similar to what the query optimizer uses.
Possible output columns:
<query type> #occurrences average time total time avg nscanned avg nreturned
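One way to group similar queries, sketched here, is to reduce each query to its "shape" by replacing every leaf value with a placeholder and then aggregating per shape. This is an assumed approach, not the query optimizer's logic, and the tuple layout `(query, duration_ms, nscanned, nreturned)` is hypothetical.

```python
import json
from collections import defaultdict


def query_shape(query):
    """Keep keys and operators, replace every leaf value with "?"."""
    if isinstance(query, dict):
        return {key: query_shape(value) for key, value in query.items()}
    if isinstance(query, list):
        return [query_shape(value) for value in query]
    return "?"


def group_queries(entries):
    """Aggregate (query, duration_ms, nscanned, nreturned) tuples per shape."""
    groups = defaultdict(
        lambda: {"count": 0, "total_ms": 0, "nscanned": 0, "nreturned": 0}
    )
    for query, duration_ms, nscanned, nreturned in entries:
        # sort_keys makes {"a": 1, "b": 2} and {"b": 2, "a": 1} share one key
        key = json.dumps(query_shape(query), sort_keys=True)
        group = groups[key]
        group["count"] += 1
        group["total_ms"] += duration_ms
        group["nscanned"] += nscanned
        group["nreturned"] += nreturned
    return dict(groups)
```

Averages follow by dividing each sum by `count`. Note that in this sketch lists of different lengths (for example in `$in`) still produce different shapes.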
Including mlogfilter, which did its own thing before.
why does this not work:
mlogmerge logfile --timezone "-5" --pos eol
mlogfilter should read and understand the new format for time hh:mm:ss.uuu where uuu are the milliseconds.
Add the number in the legend that needs to be pressed to toggle that plot.
Show which ones are visible/invisible in the legend.
Extend to 18 plots with Shift-1 - Shift-9.
Any option that mlaunch doesn't understand directly should be passed on to the mongod or mongos launch.
This could include -vvv, --nojournal, etc.
Make sure to pass options only to the process that understands it (filter before).
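A minimal sketch of how unknown options could be separated from the launcher's own options with `argparse.ArgumentParser.parse_known_args`. The parser below is a stand-in with only two options, not mlaunch's real parser, and deciding which process understands which option is left out.

```python
import argparse


def split_args(argv):
    """Return (known namespace, list of arguments the parser did not recognize)."""
    parser = argparse.ArgumentParser(prog="mlaunch-sketch")
    parser.add_argument("--single", action="store_true")
    parser.add_argument("--verbose", action="store_true")
    # parse_known_args collects unrecognized arguments instead of exiting
    known, unknown = parser.parse_known_args(argv)
    return known, unknown
```

Calling `split_args(["--single", "-vvv", "--nojournal"])` leaves `-vvv` and `--nojournal` in the second return value, ready to be appended to the launch command after filtering.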
--mongos X (where X is the number of mongos to start. X=1 default)
Remove usage from each script section. Create new documentation (either in the wiki or as separate pages) to explain each of the tools in detail.
Should make it easier to maintain as well, as each change doesn't require README.md to be updated.
If a group is invisible from the plot, clicking in the graph should not output any log lines from this group. This is confusing and prevents useful filtering with the numbers.
(I searched the repository for "nscanned" and didn't see anything relevant for mplotqueries, so I hope I didn't miss an existing feature request! Feel free to close if I did miss an existing issue.)
Currently, the y-axis of charts produced by mplotqueries is some function of the runtime of slow queries being logged. It'd be great if there was an option to show the ratio of nscanned to n (maybe either logarithmic, if that makes sense?), nscanned, or just n.
Obviously, there's a correlation of nscanned/n to query runtime, but there are cases when nscanned/n is very high but overall runtime is not a problem from the perspective of the developer. However, when many of the same or similar queries are run simultaneously, the inefficiency of nscanned/n becomes much more apparent.
If you follow the usage help by putting --exclude-ns before the filename, it errors out.
./mplotqueries.py --exclude-ns "(command)" mongo-live-a-4_2-13_mongodb.log
usage: mplotqueries.py [-h] [--ns [NS [NS ...]]] [--log]
[--exclude-ns [NS [NS ...]]]
filename
mplotqueries.py: error: too few arguments
If you put it after the filename, it works fine:
./mplotqueries.py mongo-live-a-4_2-13_mongodb.log --exclude-ns "(command)"
{'exclude_ns': ['(command)'], 'ns': None, 'log': False, 'filename': 'mongo-live-a-4_2-13_mongodb.log'}
0 live.player_action 1
...
I think the fix is just moving lines 88 and 89 after, or changing the usage help.
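The usage output above suggests the option is declared with `nargs="*"`, which greedily consumes every following non-option argument, including the filename. The sketch below reproduces that behavior with a stand-in parser and shows one possible alternative, `action="append"`, which takes exactly one value per occurrence. The function `build_parser` is hypothetical and is not the mplotqueries code.

```python
import argparse


def build_parser(greedy):
    """Build a stand-in parser; greedy=True mimics the nargs='*' declaration."""
    parser = argparse.ArgumentParser(prog="mplotqueries-sketch")
    if greedy:
        # consumes all following positional-looking arguments
        parser.add_argument("--exclude-ns", nargs="*")
    else:
        # consumes exactly one value; repeat the flag for more namespaces
        parser.add_argument("--exclude-ns", action="append")
    parser.add_argument("filename")
    return parser
```

With `greedy=False`, the arguments `--exclude-ns "(command)" mongod.log` parse in either order, at the cost of repeating the flag for each namespace.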
Instead of setting $PATH before running mlaunch, it would be nice to allow a user to specify the path to the correct binary to use.
This problem started after implementing the number toggling [0-9].
for example:
mlogfilter logfile.log --from "-2h" should grab the last two hours of the logfile.
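A sketch of how a relative value such as "-2h" could be resolved. It assumes, as the example suggests, that negative offsets count back from the end of the log file and positive ones count forward from its start. The names `parse_offset` and `resolve_from` and the supported units (s, m, h, d) are illustrative only.

```python
import re
from datetime import datetime, timedelta

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_offset(text):
    """Turn a string like '-2h' or '+30m' into a signed timedelta."""
    match = re.fullmatch(r"([+-])(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError("unsupported offset: %r" % text)
    sign, amount, unit = match.groups()
    delta = timedelta(**{_UNITS[unit]: int(amount)})
    return -delta if sign == "-" else delta


def resolve_from(text, log_start, log_end):
    """Negative offsets are relative to the log end, positive to the log start."""
    delta = parse_offset(text)
    anchor = log_end if delta < timedelta(0) else log_start
    return anchor + delta
```

For a log that ends at 12:00, `resolve_from("-2h", log_start, log_end)` yields 10:00 on the same day.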
(retry)
Sometimes it would be useful to plot log lines that aren't necessarily timed operations. Let's say you filtered/grepped for a small set of "events" in the logfile (e.g. replica set status changes, etc) and want to visualize those as well.
I could see this as a useful scenario:
grep "is now in state" mongod.log | mplotqueries --plot-untimed
To visualize, they could draw vertical thin lines instead of dots (because they don't have a y-axis value).
Ideally, I'd like to overlay such a plot with the original timed plot. Need to work out how that would be possible.
$ mlaunch --single --verbose --helpz
creating directory: ./data/db
launching: mongod --dbpath ./data/db --logpath ./data/mongod.log --port 27017 --logappend --helpz --fork
waiting for mongod to start up...
# takes a long time
mongod at Gianfranco-10gen.local:27017 running.
but doesn't start it
same for
mlaunch --replica --verbose --helpz
mlaunch --replicaset --verbose --helpz
Currently, the plot only reaches as far as the last data point. To compare different plots, the x-axis range should always cover the complete file. This can be fixed with a call to plt.xlim() setting the whole range.
If two data points are further apart than the gap-threshold, the range bar is stopped at the last point and started again with the new point.
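The gap rule can be sketched as a small function that turns sorted x-values into (start, end) ranges. The name `split_ranges` is hypothetical; plain numbers are used here, but the same logic works for datetimes with a timedelta threshold.

```python
def split_ranges(timestamps, gap_threshold):
    """Split sorted timestamps into (start, end) ranges at gaps above the threshold."""
    ranges = []
    start = prev = None
    for ts in timestamps:
        if start is None:
            start = prev = ts
        elif ts - prev > gap_threshold:
            # gap too large: close the current bar and start a new one
            ranges.append((start, prev))
            start = prev = ts
        else:
            prev = ts
    if start is not None:
        ranges.append((start, prev))
    return ranges
```

For instance, `split_ranges([0, 1, 2, 10, 11, 30], 5)` returns `[(0, 2), (10, 11), (30, 30)]`.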
For this we also need to find a way to pass customized arguments to a plot type.
This would replace the --no-legend startup parameter.
If the dbpath exists already (from a previous mlaunch), then compare the startup options. If they are different, exit with a warning that existing data will be overwritten and that mlaunch --restore should be used to restart.
Should we allow the script to run if the same number of nodes are started? What about with a different number of mongos? more shards?
Perhaps offer --force to force overwrite.
For example mention branching model (develop / master).
Currently, a start of
mlaunch --single
stores the data in ./data/, whereas starting
mlaunch --single foo
stores the data in ./foo/data/.
The behavior should be changed to store the second case in ./foo/ directly, without the nested data folder.
Especially with some matplotlib complications.
Should go in LICENSE.md and linked from README.md
Add version numbers for mtools releases. Should be included in the --help output, as well as a simple display with --version.