taoensso / tufte

Simple performance monitoring library for Clojure/Script

Home Page: https://www.taoensso.com/tufte

License: Eclipse Public License 1.0

Clojure 100.00%

clojure clojurescript epl taoensso profiling benchmarking performance-monitoring

tufte's Introduction

Documentation | Latest releases | Get support

Tufte

Simple performance monitoring library for Clojure/Script

Tufte allows you to easily monitor the ongoing performance of your Clojure and ClojureScript applications in production and other environments.

It provides sensible application-level metrics, and gives them to you as Clojure data that can be easily analyzed programmatically.

Carte Figurative, one of Edward Tufte's favourite data visualizations.

Latest release/s

2023-09-27 2.6.3: release info

See here for earlier releases.

Why Tufte?

Small, fast, cross-platform Clojure/Script codebase
Sensible application-level metrics without the obscure JVM-level noise
Metrics as Clojure maps: easily aggregate, analyse, log, serialize to db, etc.
Tiny, flexible API: p, profiled, profile
Great compile-time elision and runtime filtering support
Arbitrary Clojure/Script form-level profiling
Full support for thread-local and multi-threaded profiling

10-second example

(require '[taoensso.tufte :as tufte :refer [defnp p profiled profile]])

;; Request to send `profile` stats to `println`:
(tufte/add-basic-println-handler! {})

;;; Define a couple dummy fns to simulate doing some expensive work
(defn get-x [] (Thread/sleep 500)             "x val")
(defn get-y [] (Thread/sleep (rand-int 1000)) "y val")

;; Let's check how these fns perform:

(profile ; Profile any `p` forms called during body execution
  {} ; Profiling options; we'll use the defaults for now
  (dotimes [_ 5]
    (p :get-x (get-x))
    (p :get-y (get-y))))

;; The following will be printed to *out*:
;;
;; pId      nCalls      Min    50% ≤    90% ≤    95% ≤    99% ≤      Max     Mean   MAD    Clock  Total
;; :get-x        5    501ms    503ms    505ms    505ms    505ms    505ms    503ms   ±0%    2.52s    53%
;; :get-y        5     78ms    396ms    815ms    815ms    815ms    815ms    452ms  ±48%    2.25s    47%
;;
;; Accounted                                                                               4.78s   100%
;; Clock                                                                                   4.78s   100%

Documentation

Wiki (getting started, usage, etc.)
API reference: cljdoc, Codox

Funding

You can help support continued work on this project, thank you!! 🙏

License

tufte's Issues

ConcurrentModificationException in `stats-accumulator` (thrown from `merge-pstats`)

Hey there! Thanks for all of the great software.

I have a Clojure application that I control from the REPL, and I'm using two handlers for tufte data. One is a stats accumulator (via add-accumulating-handler!) that I'm draining every minute into a database, and the second is a stats accumulator that I'm managing manually with stats-accumulator and core.async, which I'm draining every 15 minutes and logging with timbre.

The second one is throwing a ConcurrentModificationException every so often (probably once every 5 hours under heavy use) and I'm curious what I might be doing wrong. The add-accumulating-handler! implementation would swallow the error and continue I think (since the agent's :error-mode is :continue) so I'm not totally sure that this error is specific to my implementation.

Anyway, in my REPL namespace I have the following:

(require '[taoensso.timbre :as timbre])
(require '[taoensso.encore :as enc])
(require '[taoensso.tufte :as tufte])
(require '[clojure.core.async :as async])

;; mostly for exceptions thrown on core.async threads
(Thread/setDefaultUncaughtExceptionHandler
  (reify Thread$UncaughtExceptionHandler
    (uncaughtException [_ thread ex]
      (timbre/error ex "Uncaught exception on" (.getName thread)))))

(defonce repl-stats
  (let [stats (tufte/stats-accumulator)
        buf (async/chan 1024)]
    
    ;; queue stats for consumption
    (tufte/add-handler! :repl-pstats "*" #(async/>!! buf %))

    (async/go-loop [next-timeout (async/timeout (enc/ms :mins 15))]
      (async/alt!
        buf ;; consume stats from the buffer and invoke the accumulator
        ([datum]
         (when-let [{:keys [?id pstats]} datum]
           (stats (or ?id :tufte/nil-id) pstats) ;; <-- this throws sometimes
           (recur next-timeout)))

        next-timeout ;; periodically drain & log the stats
        ([_]
         (timbre/info (tufte/format-grouped-pstats @stats))
         (recur (async/timeout (enc/ms :mins 15))))))

    stats))

Eventually, I'll see that there was an uncaught exception from the async thread when invoking stats:

2020-10-24T22:31:50.719Z ERROR [the-system.data.collector:499] - Uncaught exception on async-dispatch-5
                                              java.lang.Thread.run              Thread.java:  834
                java.util.concurrent.ThreadPoolExecutor$Worker.run  ThreadPoolExecutor.java:  628
                 java.util.concurrent.ThreadPoolExecutor.runWorker  ThreadPoolExecutor.java: 1128
                                                               ...
          clojure.core.async.impl.channels.ManyToManyChannel/fn/fn             channels.clj:   95
                                  clojure.core.async/do-alts/fn/fn                async.clj:  252
                                   clojure.core.async/ioc-alts!/fn                async.clj:  383
      clojure.core.async.impl.ioc-macros/run-state-machine-wrapped           ioc_macros.clj:  977
              clojure.core.async.impl.ioc-macros/run-state-machine           ioc_macros.clj:  973
   the-system.data.collector/eval42802/fn/fn/state-machine--auto--            collector.clj:  417
the-system.data.collector/eval42802/fn/fn/state-machine--auto--/fn            collector.clj:  417
                                  /taoensso.tufte.StatsAccumulator               tufte.cljc:  715
                                          taoensso.tufte/sacc-add!               tufte.cljc:  711
                                                clojure.core/swap!                 core.clj: 2352
                                                               ...
                                       taoensso.tufte/sacc-add!/fn               tufte.cljc:  711
                                  taoensso.tufte.impl/merge-pstats                impl.cljc:  229
                                  taoensso.tufte.impl/merge-pstats                impl.cljc:  252
                           taoensso.tufte.impl/times-into-id-times                impl.cljc:  181
                                               clojure.core/reduce                 core.clj: 6828
                                       clojure.core.protocols/fn/G            protocols.clj:   13
                                         clojure.core.protocols/fn            protocols.clj:   75
                                clojure.core.protocols/iter-reduce            protocols.clj:   49
                                 java.util.LinkedList$ListItr.next          LinkedList.java:  892
               java.util.LinkedList$ListItr.checkForComodification          LinkedList.java:  970
java.util.ConcurrentModificationException:

Is there something that you see me obviously doing wrong? I tried to reproduce the error by reducing the interval from 15 minutes to one second and just hammering on calls to profile from different threads, but no luck.
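For readers unfamiliar with the exception at the bottom of that trace, here is a minimal single-threaded sketch of how `java.util.LinkedList`'s fail-fast iterator behaves when the list changes during a `reduce`. It only illustrates the exception type. That the report above is caused by another thread appending to the list mid-merge is an assumption, and `reduce-while-mutating` is a hypothetical name.

```clojure
(import '(java.util LinkedList ConcurrentModificationException))

(defn reduce-while-mutating
  "Reduces over a LinkedList while the reducing fn appends to that same list.
  Returns :cme if the iterator detects the modification, else the sum."
  []
  (let [ll (LinkedList. [1 2 3])]
    (try
      ;; `reduce` walks the list through its Iterator; the `.add` below bumps
      ;; the list's modification count, so the next `.next` call throws.
      (reduce (fn [acc x] (.add ll 99) (+ acc x)) 0 ll)
      (catch ConcurrentModificationException _ :cme))))

(reduce-while-mutating) ; => :cme
```

With two threads the same check fires only when the timing lines up, which would be consistent with an error that shows up once every few hours and is hard to reproduce on demand.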

Consider adding defnp- macro?

First off, thanks for this great tool. I lean on it a lot to keep my apps fast enough.

I was wondering if you could add (or would accept a PR for) a defnp- macro. I use private functions frequently to break big functions apart, especially in more verbose ETL workflows. I end up doing the following a lot, which doesn't look too pleasant:

(defnp keeping-it-snappy ^:private [])

I still find this better than wrapping the body with p.

I've held off adding the macro to my own code, partly out of laziness and partly as a reminder to log this issue.

CompilerException java.lang.RuntimeException: No such var: enc/now-nano*

I'm sure I'm doing something wrong, but I don't know what. My lein configuration in project.clj includes:

...
:dependencies [[org.clojure/clojurescript "1.9.89"]
               [org.clojure/clojure "1.8.0"]
               [org.clojure/core.async "0.2.385"]
               [com.taoensso/timbre "4.7.0"]
               [com.taoensso/tufte "1.0.2"]
               ....

However, when I try to use tufte, I get an error as follows:

user=> (require '[taoensso.tufte :as tufte :refer [defnp p profiled profile]])

CompilerException java.lang.RuntimeException: No such var: enc/now-nano*, compiling:(taoensso/tufte/impl.clj:154:20)

Am I missing something? Is the transitive encore dependency stale?

Examples?

Hi! Looking forward to trying this out, but I'm having a hard time wrapping my head around the various ways to use this. I've dug through some of the other issues (most notably this one) but figured I'd just ask: could you put some complete examples together for how we could use tufte? The only examples I could find were in the tests directory, but I'm wondering if you have code lying around with tufte included that we could learn from...

Empty results while profiling substantial code

Problem statement

While profiling "substantial" code (as opposed to a trivial example, which does work for me), no results are printed at all.

Note that I don't hit the profiled defns directly, but through my web framework of choice instead.

This happens in a large codebase, where I know that there are clojure.core.async/go blocks and other types of concurrency.

Things tried

Running tufte OOTB
Running tufte with the :dynamic? true option

Possible cause

Binding conveyance doesn't work across go blocks, and sometimes across vanilla fns (which is why clojure.core/bound-fn exists)
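A minimal sketch of the plain-function half of that claim, using only clojure.core: a dynamic binding is not visible on a manually started thread unless the function is wrapped with `bound-fn*`. The var `*ctx*` and the helper `seen-in-new-thread` are hypothetical names for illustration, and the sketch says nothing about how go blocks behave.

```clojure
(def ^:dynamic *ctx* :root)

(defn seen-in-new-thread
  "Binds *ctx* to :bound, runs a fn on a fresh Thread, and returns the value
  of *ctx* observed on that thread. `wrap` is applied to the fn inside the
  binding scope, before the thread is started."
  [wrap]
  (binding [*ctx* :bound]
    (let [result (promise)
          t      (Thread. ^Runnable (wrap (fn [] (deliver result *ctx*))))]
      (.start t)
      (.join t)
      @result)))

(seen-in-new-thread identity)  ; => :root  (binding not conveyed)
(seen-in-new-thread bound-fn*) ; => :bound (bindings captured by bound-fn*)
```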

My impression is: there's a missing mechanism for capturing truly arbitrary multi-threaded events?

I know that in a production app this wouldn't be a good idea since results could easily get mixed up. But in the context of running a one-off performance analysis in the REPL, in a deftest (assuming a sequential test runner), etc., this can be considered safe.

Thanks - V

Non-dynamic profiling is not thread-safe

The logic in t.t.impl on line 83, which uses local pdata in non-dynamic mode, isn't thread-safe because it relies on a shared, globally declared Stack. When several threads call profile in rapid succession, a push or pop can lose a stack entry, and the empty check can return false just as another thread corrupts and empties the stack, leading to a stack-empty exception.

I think the Stack itself needs to be thread local.
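As a rough sketch of what a thread-local stack could look like (not Tufte's actual implementation; `local-stack`, `depth` and `with-frame` are hypothetical names), each thread below lazily gets its own `ArrayDeque`, so a push on one thread is never visible to another:

```clojure
(import '(java.util ArrayDeque))

;; Each thread that calls `.get` receives its own, initially empty deque.
(def ^ThreadLocal local-stack
  (proxy [ThreadLocal] []
    (initialValue [] (ArrayDeque.))))

(defn depth
  "Number of frames on the calling thread's stack."
  []
  (.size ^ArrayDeque (.get local-stack)))

(defn with-frame
  "Pushes `frame` on the calling thread's stack, calls `f`, then pops."
  [frame f]
  (let [^ArrayDeque stack (.get local-stack)]
    (.push stack frame)
    (try (f) (finally (.pop stack)))))

;; The calling thread sees its own frame; the future's thread sees none.
(with-frame :outer (fn [] [(depth) @(future (depth))])) ; => [1 0]
```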

Align pId column left in format-stats

I find the stats vastly more readable when the namespaces in the pId column are lined up. How about adding an :align config to the column->pattern map?
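One way such an option might be translated into a pattern, sketched with plain `format`: the `-` flag left-justifies a `%s` column. The `column-pattern` helper and its `:left`/`:right` values are hypothetical, not part of Tufte.

```clojure
(defn column-pattern
  "Builds a `format` pattern for a string column `width` characters wide.
  `align` is :left or :right (hypothetical option values)."
  [width align]
  (str "%" (when (= align :left) "-") width "s"))

(format (column-pattern 10 :left)  ":a/b") ; => ":a/b      "
(format (column-pattern 10 :right) ":a/b") ; => "      :a/b"
```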

basic-println-handler should allow to select columns to be shown

Hi,

and thanks for a great library! It has helped me to easily identify the bottlenecks in my application.

There is however one small problem. The output is quite wide, so I would like to select only certain columns, for example [:n-calls :mean]. Would this be possible to add?

I had a go at taoensso.tufte.stats/format-stats and ended up with the following:

(defn format-stats
  "Returns a formatted table string for given `{<id> <stats>}` map.
  Assumes nanosecond clock, stats based on profiling id'd nanosecond times."
  ([clock-total id-stats] (format-stats clock-total id-stats (fn [id m] (get m :sum))))
  ([clock-total id-stats sort-fn] (format-stats clock-total id-stats sort-fn [:n-calls :min :p50 :p90 :p95 :p99 :max :mean :mad]))
  ([clock-total id-stats sort-fn columns]
   (when id-stats
     (let [clock-total (long clock-total)
           columns (vec (distinct (into columns [:total :clock])))
           ^long accounted-total
           (reduce-kv
             (fn [^long acc _id s]
               (+ acc (long (get s :sum))))
             0 id-stats)

           sorted-ids
           (sort-by
             (fn [id] (sort-fn id (get id-stats id)))
             enc/rcompare
             (keys id-stats))

           ^long max-id-width
           (reduce-kv
             (fn [^long acc k v]
               (let [c (count (str k))]
                 (if (> c acc) c acc)))
             9                                              ; (count "Accounted")
             id-stats)]

       #?(:cljs                                             ; Simplified output w/o table
          (let [sb
                (reduce
                  (fn [acc id]
                    (let [s (get id-stats id)
                          sum (get s :sum)
                          mean (get s :mean)]
                      (enc/sb-append acc
                                     (str
                                       (select-keys {:id      id
                                                     :n-calls (get s :n)
                                                     :min     (fmt (get s :min))
                                                     :p50     (fmt (get s :p50))
                                                     :p90     (fmt (get s :p90))
                                                     :p95     (fmt (get s :p95))
                                                     :p99     (fmt (get s :p99))
                                                     :max     (fmt (get s :max))
                                                     :mean    (fmt mean)
                                                     :mad     (str "±" (perc (get s :mad) mean))
                                                     :total   (fmt sum)
                                                     :clock   (perc sum clock-total)} columns)
                                       "\n"))))
                  (enc/str-builder)
                  sorted-ids)]

            (enc/sb-append sb "\n")
            (enc/sb-append sb (str "Accounted: (" (perc accounted-total clock-total) ") " (fmt accounted-total) "\n"))
            (enc/sb-append sb (str "Clock: (100%) " (fmt clock-total) "\n"))
            (str sb))

          :clj
          (let [column->pattern {:id      {:n (str "%" max-id-width "s") :s (str "%" max-id-width "s") :heading "pId"}
                                 :n-calls {:n "%,10d" :s "%10s" :heading "nCalls"}
                                 :min     {:heading "Min"}
                                 :p50     {:heading "50% ≤"}
                                 :p90     {:heading "90% ≤"}
                                 :p95     {:heading "95% ≤"}
                                 :p99     {:heading "99% ≤"}
                                 :max     {:heading "Max"}
                                 :mean    {:heading "Mean"}
                                 :mad     {:n "%5s" :s "%5s" :heading "MAD"}
                                 :total   {:n "%11s" :s "%11s" :heading "Total"}
                                 :clock   {:n "%7s" :s "%7s" :heading "Clock"}}
                ^StringBuilder sb (enc/str-builder "")
                format-n-append (fn [column s] (enc/sb-append sb (format (get-in column->pattern [column :n] "%10s") s)))
                format-s-append (fn [column s] (enc/sb-append sb (format (get-in column->pattern [column :s] "%10s") s)))]

            ; Write headers
            (doseq [column (into [:id] columns)]
              (when-not (= :id column)
                (enc/sb-append sb " "))
              (format-s-append column (get-in column->pattern [column :heading])))
            (enc/sb-append sb "\n\n")

            ; Write numbers
            (doseq [id sorted-ids]
              (let [s (get id-stats id)
                    sum (get s :sum)
                    mean (get s :mean)]
                (format-n-append :id id)
                (doseq [column columns]
                  (enc/sb-append sb " ")
                  (cond (= :n-calls column) (format-n-append column (get s :n))
                        (= :mean column) (format-n-append column (fmt mean))
                        (= :mad column) (format-n-append column (str "±" (perc (get s :mad) mean)))
                        (= :total column) (format-n-append column (fmt sum))
                        (= :clock column) (format-n-append column (perc sum clock-total))
                        :else (format-n-append column (fmt (get s column)))))
                (enc/sb-append sb "\n")))

            ; Write Accounted
            (enc/sb-append sb "\n")
            (format-s-append :id "Accounted")
            (doseq [column columns]
              (enc/sb-append sb " ")
              (cond (= :total column) (format-s-append column (fmt accounted-total))
                    (= :clock column) (format-s-append column (perc accounted-total clock-total))
                    :else (format-s-append column "")))

            ; Write Clock
            (enc/sb-append sb "\n")
            (format-s-append :id "Clock")
            (doseq [column columns]
              (enc/sb-append sb " ")
              (cond (= :total column) (format-s-append column (fmt clock-total))
                    (= :clock column) (format-s-append column "100%")
                    :else (format-s-append column "")))
            (enc/sb-append sb "\n")
            (str sb)))))))

This will allow usage like this:

(format-stats (* 1e6 30)
              {:foo (stats (rand-vs 1e4 20))
               :bar (stats (rand-vs 1e2 50))
               :baz (stats (rand-vs 1e5 30))}
              (fn [id m] (get m id))
              [:n-calls :mean])
=>
"      pId     nCalls       Mean       Total   Clock
 
      :foo     10,000     9.49ns     94.88μs      0%
      :bar        100    26.51ns      2.65μs      0%
      :baz    100,000    14.52ns      1.45ms      5%
 
 Accounted                            1.55ms      5%
     Clock                           30.00ms    100%
 "

What do you think? The code is not great, but it works. I suppose I could make a pull request, add documentation and of course add a columns option to add-basic-println-handler! to put it all together.

I've also verified that the output with all columns selected is identical to the old function:

(def dat {:foo (stats (rand-vs 1e4 20))
          :bar (stats (rand-vs 1e2 50))
          :baz (stats (rand-vs 1e5 30))})

(= (format-stats-old (* 1e6 30) dat) (format-stats (* 1e6 30) dat))
=> true

Best regards.

Alternative approximation for total clock time

Hi, and thanks once again for a fine library.

Currently I'm running tufte in production in a moderately busy webserver. At the busiest endpoint, [:clock :approx] keeps becoming true. I'm not sure exactly why this keeps happening; I'm guessing it is because of context switches or concurrent access at that endpoint. Most of the time the webserver is idle, though. With the current approx time strategy, the percentage total in the formatted pstats drops to about zero, because the estimated time basically becomes the timespan of the pstats accumulator.

I have a proposed fix for this (see pull request). This strategy supports "bouncing back" from an occasional unordered pstat, while still informing the user that the total time is now an estimate.

On a bigger note: Would it make sense to calculate percentage total based on accounted total, and not clock total?
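To make the difference concrete, here is a small worked sketch with made-up numbers (two ids with 30 and 10 time units of accounted time inside a mostly idle clock window of 1000 units). `perc` and `share-by` are hypothetical helpers, not Tufte functions, and both assume a non-zero denominator.

```clojure
(defn perc
  "Rounded percentage of `n` relative to `d`. Assumes `d` is non-zero."
  [n d]
  (Math/round (* 100.0 (/ (double n) (double d)))))

(defn share-by
  "Maps each id in `sums` to its percentage of `denominator`."
  [sums denominator]
  (into {} (map (fn [[id s]] [id (perc s denominator)])) sums))

(def sums {:a 30 :b 10})               ; hypothetical per-id time sums
(def clock-total 1000)                 ; hypothetical wall-clock window
(def accounted-total (reduce + (vals sums)))

(share-by sums clock-total)     ; => {:a 3, :b 1}
(share-by sums accounted-total) ; => {:a 75, :b 25}
```

Relative to clock time the shares collapse toward zero when the window is mostly idle; relative to accounted time they always sum to about 100%.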

Thanks and kind regards.

Tufte v4

Switch to Telemere core
General clean-up/refresh
Extend docs?
Demo video?

Incorrect compaction in merge-pstats

Hi, and thanks once again for a fine library!

It seems that tufte/merge-pstats does not always do compaction. Example code to reproduce:

(test/deftest merge-pstats-compaction
  (let [pstats (atom nil)
        ps (looped 1e6
                   (swap! pstats (fn [o] (tufte/merge-pstats o (second (profiled {}
                                                                                 (looped 5 (p :foo))
                                                                                 (looped 2 (p :foo (p :bar (p :baz))))
                                                                                 (looped 1 (p :qux "qux"))))))))
        id-times (.-id_times (.-pstate_ ^PData (.-pd ^PStats ps)))]
    (println (reduce-kv (fn [m k v] (assoc m k (format "%,3d" (count v)))) {} id-times))))
; prints {:foo 7,000,000, :baz 2,000,000, :bar 2,000,000, :qux 1,000,000}

The compaction code inside impl/merge-pstats is never run because pd1-id-times is always nil.

Thanks and kind regards.

Problem compiling ClojureScript with 2.0.0

Hello! Long time com.taoensso fan, first time issue submitter.

When I include 2.0.0 as a dependency and require taoensso.tufte using Figwheel for compilation I get this:

With the exact same setup but the dep changed to 1.4.0 it works fine.

Compile-time elision doesn't work (as expected)

When I set export TUFTE_MIN_LEVEL=6 as part of my build, I get the appropriate output from tufte, as the code dictates: https://github.com/ptaoussanis/tufte/blob/master/src/taoensso/tufte.cljx#L122

Only when I set the run-time value, however, with (tufte/set-min-level! 6), can I actually disable tufte. This issue is pressing because Google App Engine prohibits the creation of threads all willy-nilly and tufte is still creating threads with the supposed compile-time elision.

Can you shed some light on why this might be the case, that having tufte acknowledge my compile-time min level is insufficient to actually stop it from working? Furthermore why the run-time value works instead?
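For context, the general distinction between compile-time elision and runtime filtering can be sketched as below. This is not Tufte's implementation: `compile-time-min-level` is a fixed stand-in for a value that would be read from the environment when the macro expands, and `timed-sketch`, `record!` and `*runtime-min-level*` are hypothetical names.

```clojure
;; Stand-in for a level read once at macroexpansion time (fixed at 3 here).
(def compile-time-min-level 3)

;; Runtime threshold; can be rebound or changed without recompiling.
(def ^:dynamic *runtime-min-level* 0)

(def recorded-nanos (atom []))
(defn record! [nanos] (swap! recorded-nanos conj nanos))

(defmacro timed-sketch
  "`level` must be a literal number so it can be compared during expansion."
  [level & body]
  (if (< level compile-time-min-level)
    ;; Elided: only the body is emitted, no timing code reaches the compiler.
    `(do ~@body)
    ;; Not elided: the runtime check and timing code are both compiled in.
    `(if (< ~level *runtime-min-level*)
       (do ~@body)
       (let [t0#     (System/nanoTime)
             result# (do ~@body)]
         (record! (- (System/nanoTime) t0#))
         result#))))

(macroexpand-1 '(timed-sketch 2 (+ 1 2))) ; => (do (+ 1 2))
```

In this sketch a form below the compile-time level leaves no profiling code behind, while a form above it still carries the runtime check, so only the runtime level can switch it off after compilation.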

Thanks for your time and for the awesome Clojure projects. timbre, tufte, and nippy have all been very helpful in my work.

merge-pstats ps2-tsum should handle ps1 starting before ps0

Example to reproduce error:

(comment
  (let [started (java.util.concurrent.CountDownLatch. 1)
        z (fn [x] (Thread/sleep x))
        f1 (future (.await started) (z 1400) (profiled {} (p :p1 (z 10))))
        f2 (future                           (profiled {} (p :p1 (.countDown started) (z 1500)))) ; f2 is started before f1 and finishes after f1
        m (merge-pstats (second @f1) (second @f2))] ; and since f1 finishes first, it will also reach merge-pstats first
    (update-in @m [:clock :total] (fn [x] (str (int (/ x 1e6)) "ms")))))
; => :clock {:t0 18081211047393, :t1 18082711509597, :total "99ms", :approx? false}

99 ms is incorrect, should at least be 1500 ms.

Edit: Tricky stuff, hope I did not make a mistake..!

Thanks and kind regards.

Difficulties using both tufte and timbre

Hi,

I seem to be having some problems running both timbre and tufte at the same time. Both are at the latest version and both work in isolation, but when both libraries are :require'd then I see the following error "defnp already refers to: #'taoensso.timbre.profiling/defnp in namespace: profiling.core"

I've attached a ZIP of the project. I'm using Clojure 1.7, tufte 1.0.0-RC2, and timbre 4.6.0, which I think are the latest versions. If there's anything else I can do to help then please let me know.

Finally, thank you for tufte -- it looks excellent, and I'm looking forward to experimenting with it more.

profiling.zip

merge-pstats assumes stats are always overlapping in time

Hi, and thanks once again for a great library.

It seems that merge-pstats assumes that its inputs are always overlapping in time.
For example:

(test/deftest merge-discrete-events
  (test/testing "Merge discrete events"
    (let [[_ ps0] (profiled {} (p :foo "foo"))
          _ (Thread/sleep 100) ; Majority of time spent outside of profiling
          [_ ps1] (profiled {} (p :foo "foo"))
          merged @(tufte/merge-pstats ps0 ps1)]
      (pprint/pprint merged)
      (is (< (get-in merged [:clock :total]) 1e8))))) ; Fails

This test will fail and print:

{:clock {:t0 2109412625335, :t1 2109513733727, :total 101108392},
;                                                     ^^^ Incorrect total, includes (Thread/sleep 100)
 :stats
 {:foo
  {:min 181,
   :mean 501.0,
   :mad-sum 640.0,
   :p99 821,
   :n 2,
   :p90 821,
   :max 821,
   :mad 320.0,
   :p50 821,
   :sum 1002,
   :p95 821}}}

I hit this error while trying to write a simple aggregate handler. I suppose that in nested cases of profiled it makes sense to assume overlapping, but not when you have multiple discrete events. I'm not quite sure how to solve this correctly. Or am I doing something wrong?

Regards.

format-grouped-pstats should accept nil as input

Currently gives a NullPointerException:

(format-grouped-pstats nil)
java.lang.NullPointerException: null
 at clojure.core$transient.invokeStatic (core.clj:3347)
    taoensso.tufte$format_grouped_pstats.invokeStatic (tufte.cljc:760)
    taoensso.tufte$format_grouped_pstats.invoke (tufte.cljc:748)
    taoensso.tufte$format_grouped_pstats.invokeStatic (tufte.cljc:752)
    taoensso.tufte$format_grouped_pstats.invoke (tufte.cljc:748)
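The top frame of that trace is `clojure.core/transient`, which throws when handed nil. A minimal sketch of the failure and of one possible guard (the `npe?` helper is hypothetical, and which collection the real function passes to `transient` is not shown in the trace):

```clojure
(defn npe?
  "True if calling `f` throws a NullPointerException."
  [f]
  (try (f) false
       (catch NullPointerException _ true)))

(npe? #(transient nil))         ; => true
(npe? #(transient (or nil {}))) ; => false, nil is replaced by an empty map
```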

stats-accumulator should not drop items with empty group id

Hi, and thanks once again for a fine library.

I found the following behaviour counterintuitive:

(do (def my-sacc (add-accumulating-handler! {}))
    (profile {} (p :p1 (Thread/sleep 3000)))
    (deref my-sacc))
=> {}

whereas giving an id gives the expected result:

(do (def my-sacc (add-accumulating-handler! {}))
    (profile {:id :foo} (p :p1 (Thread/sleep 3000)))
    (deref my-sacc))
=> {:foo #object[taoensso.tufte.impl.PStats 0x635ab1cb {:status :pending, :val nil}]}

A solution could be to provide a default value for id in profile, for example :default?

No output at all

I am having trouble getting this to output anything at all. I am sure there is something very obvious I am missing. I created an example repo, that just contains your example from the README. However, when I run it, it produces no output.

This is the file in the repo:

(ns example)

(require '[taoensso.tufte :as tufte :refer (defnp p profiled profile)])

;; We'll request to send `profile` stats to `println`:
(tufte/add-basic-println-handler! {})

;;; Let's define a couple dummy fns to simulate doing some expensive work
(defn get-x [] (Thread/sleep 500)             "x val")
(defn get-y [] (Thread/sleep (rand-int 1000)) "y val")
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; main function

(defn -main 
  [& args]
  (profile ; Profile any `p` forms called during body execution
    {} ; Profiling options; we'll use the defaults for now
    (dotimes [_ 5]
      (p :get-x (get-x))
      (p :get-y (get-y)))))

Any ideas?

Support console.table to print stats in cljs (or document how to use it)

add-basic-spit-handler! (?)

Hello,

Would it make sense to have add-basic-spit-handler! in tufte? If so, I could send you a PR maybe?

I just discovered tufte. I have an application which communicates through stdout. I need to see the profiling results in a file and never in the stdout. I don't think reusing add-basic-println-handler! for this purpose is possible and copying these 12 lines of code to my project just to replace println with spit feels wrong. I also tried using add-timbre-logging-handler! which required adding timbre to project dependencies and new requirements in files (taoensso.tufte.timbre, taoensso.timbre and taoensso.timbre.appenders.core, the last one for the spit-appender). After adding spit-appender to the timbre config I had to disable the other appender, the one printing to stdout, so this solution is just overkill for something I almost have out of the box. Am I missing something obvious?
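In the meantime, a user-side handler might look roughly like the sketch below. `make-spit-handler` is a hypothetical helper; it assumes the handler receives a map with `:?id` and `:pstats` keys, as in the `add-handler!` usage quoted in an earlier issue on this page, and takes the formatting function as an argument so the sketch itself depends only on clojure.core.

```clojure
(defn make-spit-handler
  "Returns a handler fn that appends one formatted entry per call to `path`.
  `format-fn` turns the pstats value into a printable string."
  [path format-fn]
  (fn [{:keys [?id pstats]}]
    (spit path
          (str "id: " (pr-str ?id) "\n" (format-fn pstats) "\n")
          :append true)))

;; Hypothetical registration, assuming your Tufte version provides
;; `format-pstats` and the 3-argument `add-handler!`:
;; (tufte/add-handler! :spit-handler "*"
;;   (make-spit-handler "tufte.log" tufte/format-pstats))
```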

10 second example does not produce output

Good day.

Tried the 10 second example on Clojure 1.8.0: no output is produced even though the profile form is being called. Am I missing something?

Documentation request: how might one use `p`, `defnp` in protocol methods?

I would like to profile methods that are completing a protocol definition. For example, if I have

(defprotocol P
  (method [argument]))

(defrecord ConcreteP [member]
  P
  (method [argument] ...))

how would you suggest adding tufte wrapping to ConcreteP's implementation of method? Is the best that can be done to wrap it in p manually?
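For reference, the manual wrapping mentioned in the question would look something like this sketch. The id `:concrete-p/method` and the `(inc member)` body are placeholders chosen for illustration, and Tufte must be on the classpath.

```clojure
(require '[taoensso.tufte :as tufte :refer [p]])

(defprotocol P
  (method [argument]))

(defrecord ConcreteP [member]
  P
  (method [argument]
    ;; Wrap the method body in `p` so it is timed under its own id
    ;; whenever it runs inside a `profile` or `profiled` form.
    (p :concrete-p/method
      (inc member))))

(method (->ConcreteP 1)) ; => 2
```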

Obtain stats map formatted as string representing the time measurement.

Description

I am profiling several forms in my application, such as downloading a dataset from AWS S3, processing the file, upserting to the database, and sending data over Kafka. I have a handler that obtains the stats and indexes them into Elasticsearch. However, the values in the stats map are plain integers, and I would like to format them as strings, as they are when the stats are printed to stdout using (tufte/add-basic-println-handler! {}).

My code

My handler looks like this

(defn process-stats
   "Process stats is a handler for tufte. Tufte profiles the statistic of benchmark criteria and passes the stats into the specified handler."
   [event]
   (let [pstats @(:pstats event)
         stats (get-in pstats [:stats])
         event-name (first (keys stats))
         clock (get-in pstats [:clock])
         stats (-> stats
                   (get event-name)
                   (assoc :time (datetime/now-str))
                   (assoc :event-name event-name))]
     (index-stats stats)))

(defn profile-event
  [event event-name]
  (profile {}
           (p event-name event)))

I am profiling as follows:

(profile-event (s3/get-file file-metadata) :s3-file)

It works fine and as expected. However, I can't figure out the proper way to format the values in the stats map as strings with time units.

Expected output

{:min "150ns",
 :event-name :s3-file,
 :mean "150ns", 
 :p75 "150ns", 
 :p99 "150ns",
 :n 1,
 :time "2020-02-19T02:51:32Z",
 :p25 "150ns",
 :p90 "150ns",
 :max "150ns",
 :p50 "150ns",
 :p95 "150ns"}

Actual output

{:min 150, :event-name :s3-file, :mean 150.0, :p75 150, :mad-sum 0.0, :p99 150, :n 1, :time "2020-02-19T02:51:32Z", :p25 150, :p90 150, :max 150, :mad 0.0, :p50 150, :sum 150, :p95 150}


pId           nCalls        Min      50% ≤      90% ≤      95% ≤      99% ≤        Max       Mean   MAD      Clock  Total

:s3-file           1   150.00ns   150.00ns   150.00ns   150.00ns   150.00ns   150.00ns   150.00ns   ±0%   170.00ns     0%

Accounted                                                                                                 150.00ns     0%
Clock                                                                                                      58.31μs   100%
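One user-side way to get from the actual output to something like the expected output is to post-process the stats map before indexing it. The sketch below is hypothetical (`fmt-nanos`, `time-keys` and `format-stat-times` are not Tufte functions) and assumes the values are nanoseconds; its rounding differs slightly from the printed table.

```clojure
(import '(java.util Locale))

(defn- fmt
  "Locale-independent wrapper around String/format for a single value."
  [pattern x]
  (String/format Locale/ROOT pattern (to-array [x])))

(defn fmt-nanos
  "Formats a nanosecond count with a human-readable unit."
  [nanos]
  (let [n (double nanos)]
    (cond
      (>= n 1e9) (fmt "%.2fs"  (/ n 1e9))
      (>= n 1e6) (fmt "%.2fms" (/ n 1e6))
      (>= n 1e3) (fmt "%.2fμs" (/ n 1e3))
      :else      (fmt "%.0fns" n))))

;; Keys assumed to hold nanosecond values; other keys are left untouched.
(def time-keys [:min :p25 :p50 :p75 :p90 :p95 :p99 :max :mean])

(defn format-stat-times
  "Replaces each time value present in `stats` with its formatted string."
  [stats]
  (reduce (fn [m k] (if (contains? m k) (update m k fmt-nanos) m))
          stats
          time-keys))

(format-stat-times {:min 150 :mean 150.0 :n 1})
;; => {:min "150ns", :mean "150ns", :n 1}
```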

Clarification w.r.t. multithreaded profiling

Thanks for tufte, it's been useful in profiling a pretty complicated call stack.

In the best practices of profiling, it may be worth reminding folks that when dealing with lazy sequences, the accumulated time of expressions that generate lazy sequences will be attributed to the caller of said lazy sequence. This can be a good thing, as functions that use the lazy sequences wisely (transducers, taking only what's needed, etc) will show up with a better runtime than those who use them wastefully (realizing the entire sequence into memory).

However, if there is something inherently expensive about how the lazy sequence acquires its values, that will not be apparent.

Not a tufte thing, really a Clojure thing. But worth mentioning, as it might be a pitfall that others will encounter.

Example at bottom of:
http://stackoverflow.com/questions/40643118/how-to-profile-multithreaded-clojure-results-with-tufte
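The effect can be seen without any profiler. In this small sketch (`realization-demo` is a hypothetical name), the mapping function does no work when the lazy sequence is created, only when a consumer realizes it, so any timing wrapped around the consumer absorbs the producer's cost:

```clojure
(defn realization-demo
  "Returns [calls-before-realization calls-after-realization] for a lazy seq
  whose mapping fn counts its own invocations."
  []
  (let [calls  (atom 0)
        items  (map (fn [x] (swap! calls inc) x) (range 5)) ; nothing runs yet
        before @calls
        _      (doall items)                                ; consumer forces it
        after  @calls]
    [before after]))

(realization-demo) ; => [0 5]
```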

Version 2.0.1 does not work in ClojureScript

Content of deps.edn:

{:deps {org.clojure/clojurescript {:mvn/version "1.10.339"}
        com.taoensso/tufte        {:mvn/version "2.0.1"}}}

Content of src/hello_world/core.cljs:

(ns hello-world.core
  (:require [taoensso.tufte :as tufte :refer-macros (defnp p profiled profile)]))

(tufte/add-basic-println-handler! {})

(defn get-x []
  (+ 1 1))

(defn get-y []
  (+ 2 2))

(profile ; Profile any `p` forms called during body execution
  {} ; Profiling options; we'll use the defaults for now
  (dotimes [_ 5]
    (p :get-x (get-x))
    (p :get-y (get-y))))

Example output:

$ clj -m cljs.main --target node --output-to main.js -c hello-world.core && node ./main.js

/home/ire/code/learn/deps-edn/out/cljs/core.cljs:312
  (let [ty (type obj)
  ^
Error: No protocol method IDeref.-deref defined for type cljs.core/Cons: (taoensso.encore/if-cljs (cljs.core/array) (LinkedList.))
    at Object.cljs$core$missing_protocol [as missing_protocol] (/home/ire/code/learn/deps-edn/out/cljs/core.cljs:312:3)
    at Object.cljs.core/missing-protocol [as _deref] (/home/ire/code/learn/deps-edn/out/cljs/core.cljs:671:1)
    at cljs.core/-deref (/home/ire/code/learn/deps-edn/out/cljs/core.cljs:1449:4)
    at cljs.core/deref (/home/ire/code/learn/deps-edn/out/taoensso/tufte/impl.cljc:240:15)
    at taoensso$tufte$impl$capture_time_BANG_ (/home/ire/code/learn/deps-edn/out/taoensso/tufte/impl.js:376:3)
    at taoensso.tufte.impl/capture-time! (/home/ire/code/learn/deps-edn/out/hello_world/core.cljs:6:1)
    at /home/ire/code/learn/deps-edn/out/hello_world/core.cljs:15:5
    at /home/ire/code/learn/deps-edn/out/hello_world/core.js:101:3
    at Object.<anonymous> (/home/ire/code/learn/deps-edn/out/hello_world/core.js:103:4)
    at Module._compile (module.js:653:30)

Incorrect calculation of total time in some cases

Consider

(let [s (add-accumulating-handler! "*")]
  (profile {:id :foo} (p :p1))
  (future (Thread/sleep 800) (profile {:id :foo} (p :p1)))
  (future                    (profile {:id :foo} (p :p1 (Thread/sleep 1000))))
  (Thread/sleep 1100)
  (some-> @s :foo (deref) :clock (update :total (fn [x] (str (int (/ x 1e6)) "ms")))))
; => {:t0 286335903741125, :t1 286336908179460, :total "200ms", :approx? false}

Total time reported here is 200 ms, but should be at least 1000 ms. The problem in merge-pstats is that it seems ps0-tsum is assumed to have accounted for everything between ps0-t0 and ps0-t1. Unfortunately I don't see how it is possible to solve this correctly using the current strategy of discarding individual t0 and t1 during merge-pstats.
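One way to read "should be at least 1000 ms" is as the length of the union of the individual [t0 t1] clock intervals. The following is only a sketch of that idea in plain Clojure, not tufte's merge-pstats; the function name union-length and the millisecond intervals (approximating the three profile calls above) are assumptions made for illustration. As the report notes, it would require keeping the individual intervals, which the current merge discards.

```clojure
(defn union-length
  "Total length covered by a collection of [t0 t1] intervals,
  counting overlapping stretches only once."
  [intervals]
  (let [[total start end]
        (reduce (fn [[total start end] [t0 t1]]
                  (cond
                    (nil? start) [total t0 t1]                    ; first interval
                    (<= t0 end)  [total start (max end t1)]       ; overlaps current run
                    :else        [(+ total (- end start)) t0 t1])) ; gap: close current run
                [0 nil nil]
                (sort-by first intervals))]
    (if start (+ total (- end start)) total)))

;; Approximate intervals (ms) for the three `profile` calls in the report:
;; an instant call at 0, an instant call at ~800, and a 1000 ms call from 0.
(def report-intervals [[0 0] [800 800] [0 1000]])

(def report-total (union-length report-intervals))
```

Under these assumed intervals, report-total covers the full 1000 ms window.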

{:dynamic? true} does not report profilings inside callables in Executor pools

Using tufte 1.2, I can't profile statements wrapped in a p when they're called inside a Callable executed in a java.util.concurrent Executors pool, even when :dynamic? is true.

The expected behavior here is that both :foo and :bar are reported. Instead, only :bar is (the p taking place in the main thread).

user=> (import '[java.util.concurrent Executors])
nil
user=> (require '[taoensso.tufte :as tufte :refer [p profile]])
nil
user=> (profile {:dynamic? true} 
         (let [pool (Executors/newFixedThreadPool 5) 
               task (.submit pool (cast Callable (fn [] (p :foo (Thread/sleep 1000)))))] 
           (p :bar (.get task))))

           pId      nCalls       Min        Max       MAD      Mean   Time% Time
          :bar           1     1.01s      1.01s       0ns     1.01s     100 1.01s
    Clock Time                                                          100 1.01s
Accounted Time                                                          100 1.01s

nil

^:dynamic vars seem to work without an issue with executor pools, so the expected behavior would be for this to work.

user=> (def ^:dynamic *foo* "foo")
#'user/*foo*
user=> (let [pool (Executors/newFixedThreadPool 5) 
             task (.submit pool (cast Callable (fn [] (println *foo*))))] 
         (.get task))
foo
nil
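Note that the REPL check above reads the root value of *foo*, which every thread can see, so it does not exercise thread-local bindings. The sketch below, in plain Clojure, shows the difference between a root value and a `binding` when work runs on an executor thread, and how `bound-fn` conveys the binding. The helper name run-in-pool is hypothetical, and whether this conveyance behaviour is what affects tufte's :dynamic? profiling here is an assumption, not something confirmed in this report.

```clojure
(import '[java.util.concurrent Executors Callable])

(def ^:dynamic *foo* "root")

(defn run-in-pool
  "Runs the zero-arg fn `f` on a fresh single-thread pool and returns its result."
  [f]
  (let [pool (Executors/newFixedThreadPool 1)]
    (try
      (.get (.submit pool ^Callable f))
      (finally (.shutdown pool)))))

;; A plain fn runs without the caller's thread-local bindings,
;; so the pool thread sees the root value.
(def plain-result
  (binding [*foo* "bound"]
    (run-in-pool (fn [] *foo*))))

;; bound-fn captures the caller's bindings and reinstalls them
;; on whichever thread eventually runs the fn.
(def conveyed-result
  (binding [*foo* "bound"]
    (run-in-pool (bound-fn [] *foo*))))
```

Here plain-result is the root value, while conveyed-result carries the value established by `binding`.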

Clojure 1.6 compatibility

Is there any way to make tufte compatible with Clojure 1.6? I've been getting this error (not reproducible on Clojure 1.7+):

user=> FileNotFoundException Could not locate taoensso/tufte__init.class or taoensso/tufte.clj on classpath:   clojure.lang.RT.load (RT.java:443)

Thanks for making this! I've found it super useful.

Add sensible default implementation for format-id-fn

Trouble getting README example to work with Cursive on IntelliJ

Probably user error but could not get the 10-second example to output anything.
Platform: Using Cursive Plugin on IntelliJ on Windows 10

Code:
(ns example
  (:gen-class))

;; 10-second example
(require '[taoensso.tufte :as tufte :refer (defnp p profiled profile)])

;; We'll request to send profile stats to println:
(tufte/add-basic-println-handler! {})

;;; Let's define a couple dummy fns to simulate doing some expensive work
(defn get-x [] (Thread/sleep 500) "x val")
(defn get-y [] (Thread/sleep (rand-int 1000)) "y val")

;; How do these fns perform? Let's check:

(profile ; Profile any p forms called during body execution
  {} ; Profiling options; we'll use the defaults for now
  (dotimes [_ 5]
    (p :get-x (get-x))
    (p :get-y (get-y))))

(println "done") ; added to confirm where out goes to

Resulting Output:
done

Getting started

I tried the example code from the README, slightly modified as follows:

(ns trial
  (:require [taoensso.tufte :as tufte :refer (p profiled profile)]))

;;; Let's define a couple dummy fns to simulate doing some expensive work
(defn get-x [] (Thread/sleep 500)             "x val")
(defn get-y [] (Thread/sleep (rand-int 1000)) "y val")

(defn -main []
  ;; We'll request to send `profile` stats to `println`:
  (tufte/add-basic-println-handler! {})


  ;; How do these fns perform? Let's check:
  (profile ; Profile any `p` forms called during body execution
   {} ; Profiling options; we'll use the defaults for now
   (dotimes [_ 5]
     (p :get-x (get-x))
     (p :get-y (get-y)))))

But when I run it (lein run -m trial), nothing at all is printed out.
I tried changing the profile call to profiled and wrapping it in a call to prn and it outputs:

[nil #object[taoensso.tufte.impl.PStats 0xc247363 {:status :pending, :val nil}]]

I must be missing something... have you got any pointers to what I'm doing wrong?

Thanks in advance for any help.

[cljs] Wrong number of args (11) passed to taoensso.tufte/HandlerVal

Hi!
With version 2.1.0, using the 10 second example code emits the following warning:
Wrong number of args (11) passed to taoensso.tufte/HandlerVal

Add doc for how it compares to criterium

Criterium is likely to be the other Clojure profiling tool people think of when they read this README. It could be good to mention how they compare and contrast.

Enable global profiling.

Hey,
I just tried to upgrade to the latest version,
but can't get our use case to work anymore since
tufte/start-profiling-thread! and tufte/stop-profiling-thread!
got removed.
So this is basically a continuation of #[5] :/

The issue that we have is that we're running a pedestal GraphQL endpoint, which dispatches functions to an agent, which will asynchronously handle them.
We wrapped the GraphQL handlers in profile calls, but it seems that by the time the agent gets to process the code of interest, the code that originally dispatched it has exited, taking the dynamic var that enables profiling with it.

Is there any way we can enable per thread profiling? Or just profiling globally?
There doesn't seem to be an actual var root we could alter, so we're a bit out of ideas :/

Any thoughts and ideas?

format-grouped-stats should have same pId width across groups

This will make it easier to visually scroll by column across groups.

For example:

(do
    (def my-sacc (add-accumulating-handler! "*"))
    (profile {:id :foo} (p :long.namespace.hello-world/core (Thread/sleep 3000)))
    (profile {:id :bar} (p :p1 (Thread/sleep 500)))
    (format-grouped-pstats @my-sacc))
=>
":foo,
 pId                                  nCalls        Min      50% ≤      90% ≤      95% ≤      99% ≤        Max       Mean   MAD      Clock  Total
 
 :long.namespace.hello-world/core          1     3.00s      3.00s      3.00s      3.00s      3.00s      3.00s      3.00s    ±0%     3.00s    100%
 
 Accounted                                                                                                                          3.00s    100%
 Clock                                                                                                                              3.00s    100%
 
 
 :bar,
 pId           nCalls        Min      50% ≤      90% ≤      95% ≤      99% ≤        Max       Mean   MAD      Clock  Total
 
 :p1                1   500.14ms   500.14ms   500.14ms   500.14ms   500.14ms   500.14ms   500.14ms   ±0%   500.14ms   100%
 
 Accounted                                                                                                 500.14ms   100%
 Clock                                                                                                     500.40ms   100%
 "

It's difficult to visually scan a single column across groups here, for example the Mean column.
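A minimal sketch of the requested behaviour, in plain Clojure: compute one pId column width from all groups, then pad every id to it. The names shared-id-width, pad-id, and example-groups are hypothetical and not part of tufte; the ids are taken from the example above.

```clojure
(defn shared-id-width
  "Widest printed pId across all groups, never less than `min-width`."
  [group->ids min-width]
  (->> (vals group->ids)
       (apply concat)
       (map #(count (str %)))
       (reduce max min-width)))

(defn pad-id
  "Left-aligns the printed `id` in a column of `width` characters."
  [id width]
  (format (str "%-" width "s") (str id)))

;; The two groups from the example above, reduced to their pIds
(def example-groups
  {:foo [:long.namespace.hello-world/core]
   :bar [:p1]})

(def width (shared-id-width example-groups 3))
```

With every group padded via (pad-id id width), the columns that follow the pId would start at the same offset in each table.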

Misleading stats from laziness

While profiling a collection of functions, it appears that all of the time gets attributed to the function that causes a lazy collection to be realized rather than the function that actually "does" the heavy lifting, leading one to misattribute the source of the problem. For example:

(defnp fast-fn [coll]
  (count coll))

(defnp slow-fn []
  (mapcat (fn [i] (map (fn [j] (+ i j)) (range 10000))) (range 10000)))

user> (profile {} (-> (slow-fn) fast-fn))
100000000

pId      nCalls       Min        Max       MAD      Mean   Time% Time
::defn_fast-fn           1    10.13s     10.13s       0ns    10.13s     100 10.13s
::defn_slow-fn           1  392.25μs   392.25μs       0ns  392.25μs       0 392.25μs
Clock Time                                                          100 10.13s
Accounted Time                                                          100 10.13s

I'm not sure there's a way to avoid this, since it is accurate from one perspective, but I'm not sure that's the most useful perspective much of the time. Do you know of any way to have the profiler put the blame where it's due?
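The effect can be reproduced without any profiler. The sketch below uses a hypothetical timed-ms helper as a stand-in for a profiling wrapper (it is not tufte's p), and shows that forcing realization with doall inside the producer moves the cost back to it. The function names and the 2 ms sleep are assumptions chosen for illustration.

```clojure
(defn timed-ms
  "Hypothetical stand-in for a profiling wrapper:
  calls `f` and returns [result elapsed-milliseconds]."
  [f]
  (let [t0  (System/nanoTime)
        ret (f)]
    [ret (/ (- (System/nanoTime) t0) 1e6)]))

(defn slow-seq
  "Returns a lazy seq immediately; each element costs about 2 ms once realized."
  []
  (map (fn [i] (Thread/sleep 2) i) (range 20)))

(def results
  (let [[coll build-ms] (timed-ms slow-seq)              ; only constructs the lazy seq
        [n count-ms]    (timed-ms #(count coll))         ; realization is paid here
        [_ eager-ms]    (timed-ms #(doall (slow-seq)))]  ; realization forced in the producer
    {:n n :build-ms build-ms :count-ms count-ms :eager-ms eager-ms}))
```

In results, :build-ms stays small while :count-ms absorbs the sleeps; :eager-ms shows the same cost charged to the producer once doall is applied. Wrapping the body of a producer in doall (or using an eager alternative such as mapv or into) is one way to make attribution match intent, at the price of giving up laziness.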

Formatting as documented in the README seems to have no effect

Hey, thanks for a great lib! I thoroughly enjoy using it.

I tried getting a bit better at using it, and it seems that

(tufte/add-basic-println-handler!
  {:format-pstats-opts {:columns [:n-calls :p50 :mean :clock :total]
                        :format-id-fn name}})

, which is documented in the README as a way to format output, has absolutely no effect on the printed output.

EDIT: My theory didn't make sense.
I don't know why it doesn't work.

Background profiling.

We're running a setup where we have a single thread processing messages in the background.
We'd like to use tufte to profile this thread for a certain amount of time, to see how it behaves under load. However, since this is a continuously running thread, there is no top-level form it returns to around which we could place a 'profile' or 'profiled' call.
Any ideas if and how this is possible?

I'm seeing question marks in times printed.

Any idea why I see some question marks in there:

           pId      nCalls       Min        Max       MAD      Mean   Time% Time
        :doseq         100   37.11?s     8.74ms   590.0?s  935.57?s      98 93.56ms
    Clock Time                                                          100 94.99ms
Accounted Time                                                           98 93.56ms

Nested `p`'s are not taken into consideration when calculating percentages.

Currently, the time-spent percentage in format-stats doesn't take nested p forms into account, so (p :a (p :b ...)) will state that :a and :b each account for 50% of the time.

Breaks uberjars due to NoClassDefFoundError when trying to call encore/read-sys-val

I am not sure why this happens, but my program (based on Fulcro and Fulcro RAD) runs fine when started "from source", yet errors out when I build an uberjar and try to run it:

Exception in thread "main" java.lang.NoClassDefFoundError: taoensso/encore$read_sys_val
        at taoensso.tufte$fn__31180.invokeStatic(tufte.cljc:129)
        at taoensso.tufte$fn__31180.invoke(tufte.cljc:129)
        at taoensso.tufte__init.load(Unknown Source)
        at taoensso.tufte__init.<clinit>(Unknown Source)
        at java.base/java.lang.Class.forName0(Native Method)
        at java.base/java.lang.Class.forName(Class.java:398)
        at clojure.lang.RT.classForName(RT.java:2211)
        at clojure.lang.RT.classForName(RT.java:2220)
        at clojure.lang.RT.loadClassForName(RT.java:2239)
        at clojure.lang.RT.load(RT.java:449)
        at clojure.lang.RT.load(RT.java:424)
        at clojure.core$load$fn__6839.invoke(core.clj:6126)
        at clojure.core$load.invokeStatic(core.clj:6125)
        at clojure.core$load.doInvoke(core.clj:6109)
        at clojure.lang.RestFn.invoke(RestFn.java:408)
        at clojure.core$load_one.invokeStatic(core.clj:5908)
        at clojure.core$load_one.invoke(core.clj:5903)
        at clojure.core$load_lib$fn__6780.invoke(core.clj:5948)
        at clojure.core$load_lib.invokeStatic(core.clj:5947)
        at clojure.core$load_lib.doInvoke(core.clj:5928)
        at clojure.lang.RestFn.applyTo(RestFn.java:142)
        at clojure.core$apply.invokeStatic(core.clj:667)
        at clojure.core$load_libs.invokeStatic(core.clj:5985)
        at clojure.core$load_libs.doInvoke(core.clj:5969)
        at clojure.lang.RestFn.applyTo(RestFn.java:137)
        at clojure.core$apply.invokeStatic(core.clj:667)
        at clojure.core$require.invokeStatic(core.clj:6007)

This is with Encore 2.122.0 and Tufte 2.1.0.

Support defmethodp same as fnp and defnp

So that we don't have (p :some-keyword ...) boilerplate inside each defmethod

Profiling ClojureScript async code

Is it possible to use this library to profile ClojureScript async/promise type code?

For example, I'm using the promesa library, and would like to profile how long an entire promise chain takes.

This means I need to be able to start the profiler before setting up the promise, and then stop the profiler when the promise resolves (which might not happen until much later).

As far as I understand, the current dynamic binding setup won't work for this.

Is this something that this library supports?

Misleading interpolated percentiles for small sample sizes

Up until about 2.4.5 (I think), Tufte had simple behavior around percentiles with very low sample counts, such as 99%ile of a 5-sample pstats. It just reported the exact stat at the quantile closest to the desired percentile, which meant that several of the highest percentiles would be identical for small samples. (At the far end, of course, a 1-sample would have all its percentiles the same.)

Current behavior seems to have changed, although I don't see it in the changelog. Now it appears to interpolate values. Concretely, the straightforward test

(->> (dotimes [_ 2] (p :test (Thread/sleep (long (* 100 (rand))))))
     (tufte/profiled {}) peek deref)

produces results like this:

{:stats
 {:test
  {:min 45379080,
   :p25 5.298566525E7,
   :p50 6.05922505E7,
   :p75 6.819883575E7,
   :p90 7.27627869E7,
   :p95 7.428410395E7,
   :p99 7.550115759E7,
   :max 75805421,
   :n 2
   ,,,
  }
 }
 ,,,
}

Where, as you can see, everything but :max and :min are synthetic. For a 2-sample test, of course, nobody should expect to get good percentiles out. But if someone is running a test with 25 or 50 samples and sees a plausible p99 value, they may not realize that it doesn't actually exist: it's just a regression. Given the nature of timing percentiles, this is likely to fool them into thinking they have a much more robust system than they actually do.

As an example, take the mildly perverse fn here:

(defn spiky []
  (Thread/sleep
   (condp >= (rand)
     0.01 100
     0.05 5
     0.10 4
     0.50 3
     0.99 2
     1)))

Two runs may give very different results but, importantly, neither run gives adequate warning that the behavior of the function is not well described by the stats:

clj꞉user꞉> 
(->> (dotimes [_ 20] (p :test (spiky)))
     (tufte/profiled {}) peek tufte/format-pstats println)
pId           nCalls        Min      50% ≤      90% ≤      95% ≤      99% ≤        Max       Mean   MAD      Clock  Total

:test             20     2.06ms     3.09ms     4.45ms     4.52ms     4.53ms     4.53ms     3.22ms  ±24%    64.37ms    98%

Accounted                                                                                                  64.37ms    98%
Clock                                                                                                      65.63ms   100%

nil
clj꞉user꞉> 
(->> (dotimes [_ 20] (p :test (spiky)))
     (tufte/profiled {}) peek tufte/format-pstats println)
pId           nCalls        Min      50% ≤      90% ≤      95% ≤      99% ≤        Max       Mean   MAD      Clock  Total

:test             20     2.12ms     3.20ms     5.40ms    10.46ms    82.93ms   101.04ms     7.97ms ±117%   159.36ms    98%

Accounted                                                                                                 159.36ms    98%
Clock                                                                                                     161.93ms   100%

nil

(While the first run has no data that could possibly indicate a problem, it does have the false impression of filled-out data. The second run is worse, since it does have warning data, but the interpolation actively de-emphasizes that, discounting the genuine 99%ile stat by 20%.)
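To make the two behaviours concrete, here is a sketch in plain Clojure of a nearest-rank percentile (which always returns an observed sample) next to a linear interpolation between neighbouring samples. The function names are hypothetical, and neither is confirmed to be tufte's implementation; the interpolation scheme is simply one common choice that reproduces the :p25 and :p50 values shown for the 2-sample case above.

```clojure
(defn nearest-rank
  "Returns the observed sample at quantile `q` (0 < q <= 1), with no interpolation."
  [q xs]
  (let [sorted (vec (sort xs))
        n      (count sorted)
        rank   (max 1 (long (Math/ceil (* q n))))]
    (nth sorted (dec rank))))

(defn interpolated
  "Linearly interpolates between the two samples surrounding index q*(n-1)."
  [q xs]
  (let [sorted (vec (sort xs))
        n      (count sorted)
        idx    (* q (dec n))
        lo-i   (long (Math/floor idx))
        hi-i   (min (dec n) (inc lo-i))
        frac   (- idx lo-i)
        lo     (nth sorted lo-i)
        hi     (nth sorted hi-i)]
    (+ lo (* frac (- hi lo)))))

;; The :min and :max from the 2-sample result above (nanoseconds)
(def samples [45379080 75805421])

(def p50-observed (nearest-rank 0.5 samples))
(def p50-synthetic (interpolated 0.5 samples))
```

With these two samples, nearest-rank can only ever return 45379080 or 75805421, whereas the interpolated :p50 is 6.05922505E7, a duration that was never measured.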

Nested 'p' forms don't work unless :dynamic? is true.

The documentation states that :dynamic? is used for multi-threaded profiling.
However, it also seems to control whether or not nested 'p' forms work.

(taoensso.tufte/profile {:dynamic? false}
    (taoensso.tufte/p :outer
       (taoensso.tufte/p :inner
           (println "hi"))))

prints

           pId      nCalls       Min        Max       MAD      Mean   Time% Time
        :outer           1    1.21ms     1.21ms       0ns    1.21ms      86 1.21ms
    Clock Time                                                          100 1.41ms
Accounted Time

while

(taoensso.tufte/profile {:dynamic? true}
    (taoensso.tufte/p :outer
       (taoensso.tufte/p :inner
           (println "hi"))))

prints

           pId      nCalls       Min        Max       MAD      Mean   Time% Time
        :outer           1    1.23ms     1.23ms       0ns    1.23ms      56 1.23ms
        :inner           1  987.87μs   987.87μs       0ns  987.87μs      45 987.87μs
    Clock Time                                                          100 2.21ms
Accounted Time                                                          100 2.22ms

Is this done on purpose (in which case it would be nice to document :) ) or is this a bug?

No such var: enc/max-long

I'm trying to run your 10-second example, but when running (require '[taoensso.tufte :as tufte :refer (defnp p profiled profile)]) I get an error: CompilerException java.lang.RuntimeException: No such var: enc/max-long, compiling:(taoensso/tufte/impl.cljc:122:20).

I'm using Clojure 1.8.0

Clock and total is swapped

From the README

;; The following will be printed to *out*:
;;
;;       pId  nCalls       Min     50% ≤     90% ≤     95% ≤     99% ≤       Max      Mean  MAD  Total Clock
;;
;;    :get-y       5   94.01ms  500.99ms  910.14ms  910.14ms  910.14ms  910.14ms  580.49ms ±45%  2.90s   53%
;;    :get-x       5  503.05ms  504.68ms  504.86ms  504.86ms  504.86ms  504.86ms  504.37ms  ±0%  2.52s   46%
;;
;; Accounted                                                                                     5.42s  100%
;;     Clock                                                                                     5.43s  100%

format-pstats: Allow id to be stringified by a custom function

Motivating case:
My code is sprinkled with defnp. All my function names are unique. All of my namespaces share the same top level namespace. Printing the full namespace plus defn plus function name is a little verbose and not really needed for this case.

Allow for something like format-id-fn that will be called inside format-pstats on id. It's then up to the user of the library to choose formatting, for example to remove or abbreviate namespace.

Future work: One could support setting format-id-fn to :abbr and supply a default abbreviation function.
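As an illustration of what such a user-supplied function might look like, the sketch below abbreviates each namespace segment of a keyword id to its first character. The name abbreviate-id is hypothetical, and the sketch assumes ids are keywords; how (or whether) the library would call it is exactly what this request leaves open.

```clojure
(require '[clojure.string :as str])

(defn abbreviate-id
  "Shortens each namespace segment of keyword `id` to its first character,
  e.g. :long.namespace.hello-world/core -> \"l.n.h/core\".
  Ids without a namespace are returned as their plain name."
  [id]
  (if-let [ns-str (namespace id)]
    (let [abbr (->> (str/split ns-str #"\.")
                    (map #(apply str (take 1 %))) ; first char, safe on empty segments
                    (str/join "."))]
      (str abbr "/" (name id)))
    (name id)))

(def abbreviated (abbreviate-id :long.namespace.hello-world/core))
```

A user who only wants the bare function name could pass clojure.core/name instead, which drops the namespace entirely.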
