Can jmespath be distributed across search heads?


Distributable about jmespath HOT 7 CLOSED

kintyre commented on June 12, 2024

Distributable


Comments (7)

lowell80 commented on June 12, 2024

Yes. I'm unsure if you mean distributed search (where the app is distributed to the indexers via bundle replication), or deployed to multiple search heads. But in either case, it should work fine.

I have seen some issues with distributed search in Splunk Cloud where certain bits weren't replicated correctly, or something like that. However, I was able to reproduce the issue with other custom search commands (distributed by apps written by Splunk), so I concluded that the issue was outside my control and handed it over to Splunk Support.

Have you tried something that hasn't worked as expected? Or is this just a general question?


mwisnicki commented on June 12, 2024

This is the definition of distributable I've had in mind: https://docs.splunk.com/Documentation/Splunk/7.3.1/SearchReference/Commandsbytype#Streaming_commands

It's more of a general question. I'm not that deep into Splunk, but I was warned that since this is a centralized streaming command, I should avoid it in dashboards and reports for performance reasons.

From what I was able to gather, a distributable command would have distributed = true in commands.conf, which this project does not.


lowell80 commented on June 12, 2024

Ah, yes, the command is streamable. The jmespath command looks at a single event at a time and doesn't let information cross that boundary. Therefore the command can be distributed, so it can run either on the indexers or on the search head, depending on how a search is constructed.
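To make the "single event at a time" point concrete, here is a minimal Python sketch. It is not code from the jmespath app: the `lookup` helper is a hypothetical dotted-path stand-in for a real JMESPath expression, and the `result` field name is an assumption. Because each output row depends only on its own input event, processing the events in one place or in separate slices (as separate indexers might) gives the same rows.

```python
import json


def lookup(data, path):
    """Follow a dotted path such as 'user.name' through nested dicts.

    Hypothetical stand-in for a JMESPath expression; returns None if a key is missing.
    """
    current = data
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def stream(events, path, output_field="result"):
    """Yield each event with one extra field, using only that event's own data."""
    for event in events:
        try:
            parsed = json.loads(event["_raw"])
        except (KeyError, ValueError):
            parsed = None  # missing or non-JSON payload: no value to extract
        out = dict(event)
        out[output_field] = lookup(parsed, path)
        yield out


events = [
    {"_raw": '{"user": {"name": "alice"}}'},
    {"_raw": "not json"},
    {"_raw": '{"user": {"name": "bob"}}'},
]

# All events processed in one place.
together = list(stream(events, "user.name"))

# The same events processed as two independent slices, then concatenated.
split = list(stream(events[:2], "user.name")) + list(stream(events[2:], "user.name"))
```

No state is carried between iterations of the loop in `stream`, which is the property that lets a command of this kind run wherever the events happen to be.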

Performance isn't dictated simply by the classification of the search type (streaming vs. transforming, stateful vs. non-stateful, ...). It ends up being much more complicated than that, and there are plenty of good resources on search optimization. The most basic rule is to get your base search correct (eliminate unwanted data as early as possible in the process). If you're new to Splunk, then I'd suggest not worrying about that yet. Aim for functionality first, and once you get the hang of things, then look at optimizations; often it's not a big deal, but it's very use-case dependent.

That being said, there is, and always will be, some extra overhead for external search commands like jmespath. For example, if the built-in spath command does everything you need, then it will perform much more quickly than jmespath, which launches an external Python process.

BTW, I've written a bit about which one to choose here:
https://github.com/Kintyre/jmespath/wiki/Command-Reference-jmespath#when-to-use-jmespath-vs-spath

I'm not sure about the distributed = true setting in commands.conf. I think the option you are looking for is streaming = true, which is set for jmespath. At some point I'm going to upgrade the interface to use the Splunk Python SDK, which uses the newer-style "chunked" interface rather than the old internal, unpublished splunk.Intersplunk library, but none of this will change the streaming behavior.


mwisnicki commented on June 12, 2024

Thanks for the explanation. I actually found the information about distributed = true being needed in splunklib in this repo:

jmespath/bin/splunklib/searchcommands/streaming_command.py

Lines 124 to 139 in d3c1f0c

    distributed = ConfigurationSetting(value=True, doc='''
        :const:`True`, if this command should be distributed to indexers.

        Under SCP 1 you must either specify `local = False` or include this line in commands.conf_, if this command
        should be distributed to indexers.

        ..code:
            local = true

        Default: :const:`True`

        Supported by: SCP 2

        .. commands.conf_: http://docs.splunk.com/Documentation/Splunk/latest/Admin/Commandsconf

        ''')

Do you know what's the performance difference for simple property navigation with jmespath compared to equivalent spath query?

Do you think it's possible to write a Splunk extension (in whatever language) that would match or get close to the performance of a built-in command?


lowell80 commented on June 12, 2024

Okay, that's in the Splunk Python SDK, which is already in use for the jsonformat command and will eventually be used for jmespath (hopefully before the 2.0 release). Under the covers, the mode of operation for "chunked" search commands is negotiated at runtime between Splunk and the external search command, and therefore very little definition ends up in commands.conf. Note that distributed defaults to True.
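For readers unfamiliar with the SDK style, a streaming command looks roughly like the sketch below. This is an illustration only, not the jmespath or jsonformat implementation: the `RawLengthCommand` class, the `add_raw_length` helper, and the `raw_length` field are hypothetical names. The SDK import is guarded purely so the per-record function can be tried outside Splunk; a real command would import `splunklib` unconditionally.

```python
import sys

try:
    from splunklib.searchcommands import Configuration, StreamingCommand, dispatch
except ImportError:
    # Guard only so the per-record function below can be tried without the SDK.
    StreamingCommand = None


def add_raw_length(record):
    """Return a copy of one record with a hypothetical raw_length field added."""
    out = dict(record)
    out["raw_length"] = len(record.get("_raw", ""))
    return out


if StreamingCommand is not None:

    @Configuration()  # nothing overridden, so `distributed` keeps its default
    class RawLengthCommand(StreamingCommand):
        """Hypothetical streaming command: handles one record at a time."""

        def stream(self, records):
            for record in records:
                yield add_raw_length(record)

    # Entry point used when Splunk launches this script as a search command.
    dispatch(RawLengthCommand, sys.argv, sys.stdin, sys.stdout, __name__)
```

The class body carries the command's behavior, and the settings are declared in code through the `Configuration` decorator, which fits the point above that little needs to be spelled out in commands.conf for chunked commands.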

I haven't done any official performance comparisons. For most of the searches where I use it, it's because spath can't get the job done, or because it would take a half-dozen SPL commands to do the equivalent of what a single jmespath expression can do. In those cases, any performance hit becomes effectively irrelevant to me. Of course, performance can't suck. In practice, I haven't seen jmespath become the performance bottleneck.

Can you clarify: is this an academic concern, or have you tried it and run into performance issues?
Do you have very demanding performance requirements?


mwisnicki commented on June 12, 2024

Right now I'm just trying to satisfy my curiosity :)


lowell80 commented on June 12, 2024

My experience has been that it's typically fast enough. Use cases where super high performance is necessary typically aren't well suited for Splunk (or any other big data platform) in the first place.

If you use it and find that things are slower than you'd expect, please reach back out, and we'll see what can be done. Most often, there are ways to restructure a search to make it much faster, because jmespath usually doesn't end up being the bottleneck. It's typically orders of magnitude faster than, say, the time it takes to pull the raw events off disk.

I'm going to go ahead and close this. BTW, a great resource for general (or highly specific) Splunk performance questions is the Splunk User Slack channel or Splunk Answers. (I can be found on both.)


