ryapric / loggit

Modern Logging for the R Ecosystem

License: Other

R 95.09% Makefile 4.91%

exception-handler r logging

loggit's People

Forkers: psolymos, chuvanan

loggit's Issues

Clean user communication

To make the console output look cleaner, one could replace all the print(paste0(...)) calls, such as

  if (confirm) print(paste0("Log file set to ", logfile))

with cat() calls:

  if (confirm) cat("Log file set to ", logfile, "\n", sep = "")

loggit no longer passes check on older versions of R (pre-v4?)

Run some R CMD checks to confirm which versions, but 3.6.0 failed, as did 3.4.0 (which is the minimum version currently in the DESCRIPTION file).

Allow custom naming for default fields

In order to allow for conformance to a corporate logging standard, it would be helpful to allow renaming the default log fields in a configuration block of some sort.

For example, I usually need to write log_lvl with the key level and log_msg with the key msg. While I can do that now by adding those keys at the end, I then end up with extra keys I don't use.
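A minimal sketch of what such a renaming step could look like, in base R. The helper rename_log_fields and its field_names argument are hypothetical and not part of loggit; the sketch only assumes that a log entry can be treated as a named list before it is written.

```r
# Hypothetical helper, not part of loggit: rename default fields via a mapping
# of the form c(old_name = "new_name").
rename_log_fields <- function(entry,
                              field_names = c(log_lvl = "level", log_msg = "msg")) {
  hits <- names(entry) %in% names(field_names)
  # Look up the new name for every field that has a mapping; leave others alone.
  names(entry)[hits] <- unname(field_names[names(entry)[hits]])
  entry
}

entry <- list(
  timestamp = "2023-12-01T16:17:12+0100",
  log_lvl   = "INFO",
  log_msg   = "hello"
)
renamed <- rename_log_fields(entry)
names(renamed)
```

Renaming in place, rather than appending new keys, avoids the duplicate fields described above.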

Feature request: allow inserting the date in the log filename

Thanks for this simple and great package.

A common need is to register logs in different files by date. E.g. by day:

loggit-2021-03-10.log
loggit-2021-03-11.log
loggit-2021-03-12.log
...

It would be useful if the set_logfile function allowed inserting date variables, e.g.:

set_logfile(logfile = 'loggit-%Y-%m-%d.log')

The file for the current date would then be created and logged to automatically.
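The pattern expansion itself could be sketched as follows. The helper dated_logfile is hypothetical; the sketch assumes the pattern uses strftime-style placeholders as in the example above, and that the result would then be passed to set_logfile().

```r
# Hypothetical helper: expand strftime-style placeholders in a log filename.
dated_logfile <- function(pattern, date = Sys.Date()) {
  # format() on a Date leaves literal text in the pattern untouched and
  # fills in %Y, %m, %d from the date.
  format(date, format = pattern)
}

logfile_name <- dated_logfile("loggit-%Y-%m-%d.log", as.Date("2021-03-10"))
logfile_name
```

To follow the current date in a long-running session, the name would have to be recomputed on each write, or at least whenever the date changes.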

I'm available for a pull request if you consider adding this feature into the package.

`:` interrupts log message

Messages that contain a : are not read back correctly from the log; they are cut off at the colon.

> loggit::message("This won't: work")
{"timestamp": "2023-12-01T16:17:12+0100", "log_lvl": "INFO", "log_msg": "This won't: work"}
This won't: work
> loggit::read_logs()
                 timestamp log_lvl    log_msg
1 2023-12-01T16:17:12+0100    INFO This won't

Change logging behavior to ndjson

The way loggit works now is incredibly non-performant: in order to write a log entry, it must first read in the entire log file, append to the data.frame representation, and then write the whole thing back out. Switching to ndjson will retain the JSON format while making each line an independent record, so a write only has to append one line. The cost of a write then no longer grows with the size of the log, and the log can grow up to the available disk space.
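A rough sketch of an append-only write, assuming an entry is a named list of scalar values. The helper append_ndjson is hypothetical and is not loggit's actual writer; encodeString() is used here as an approximation of JSON string escaping.

```r
# Hypothetical helper: serialize one entry as a single JSON line and append it.
append_ndjson <- function(entry, logfile) {
  # encodeString() escapes quotes, backslashes and newlines R-style, which is
  # close to (but not exactly) JSON string escaping.
  fields <- sprintf(
    "%s: %s",
    encodeString(names(entry), quote = '"'),
    encodeString(as.character(unlist(entry)), quote = '"')
  )
  line <- paste0("{", paste(fields, collapse = ", "), "}")
  # append = TRUE means the existing file is never read or rewritten.
  cat(line, "\n", file = logfile, sep = "", append = TRUE)
  invisible(line)
}

logfile <- tempfile(fileext = ".log")
append_ndjson(list(log_lvl = "INFO", log_msg = "first"), logfile)
append_ndjson(list(log_lvl = "WARN", log_msg = "second"), logfile)
log_lines <- readLines(logfile)
```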

Add rotate_lines to .config and rotate logs automatically

Thanks for the great tool. As far as I can see in the source code, the log files are not automatically rotated. I would suggest the following implementation:

add a rotate_lines=NULL entry to the .config environment
have functions to set_rotate_lines and get_rotate_lines
loggit can check if .config$rotate_lines is set, and if it is set to non-NULL, rotate the logs

This would leave the implementation backwards compatible, but would allow using {loggit} in long running sessions where rotate_lines limit is likely to be exceeded, e.g. Plumber APIs etc.
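The three steps above could be sketched like this. The environment is called config here because the package's internal .config is not accessible from user code; set_rotate_lines and get_rotate_lines follow the names proposed above, while rotate_if_configured is a hypothetical stand-in for the check inside loggit().

```r
# Stand-in for the package's configuration environment.
config <- new.env(parent = emptyenv())
assign("rotate_lines", NULL, envir = config)

set_rotate_lines <- function(n) {
  stopifnot(is.null(n) || (is.numeric(n) && length(n) == 1 && n > 0))
  assign("rotate_lines", n, envir = config)
  invisible(n)
}

get_rotate_lines <- function() {
  get("rotate_lines", envir = config)
}

# Hypothetical check: keep only the newest lines when a limit is configured.
rotate_if_configured <- function(log_lines) {
  n <- get_rotate_lines()
  if (is.null(n)) return(log_lines)  # default: behave exactly as before
  utils::tail(log_lines, n)
}

set_rotate_lines(3)
kept <- rotate_if_configured(letters)
```

Because the default is NULL, existing code that never calls set_rotate_lines() would see no change in behavior.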

I am happy to work on a PR if this is something you can see as a useful contribution.

Informative call output

Even though this repo has been inactive for a long time, I hope that I can still help to improve a few things here.

Since loggit's stop, warning and message internally call their base equivalents, the loggit wrapper is always displayed as the call. This could be prevented by calling the R internal functions instead.

In the same way one could very easily support stopifnot.
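One way to get a more informative call without touching R internals is to build the condition object by hand and attach the caller's call to it. This is a different technique from the one suggested above, shown only as a sketch; stop_with_caller is a hypothetical name, and it assumes the wrapper is always invoked from inside another function.

```r
# Hypothetical wrapper: signal an error that reports the caller's call
# instead of the wrapper's own call.
stop_with_caller <- function(...) {
  msg <- paste0(..., collapse = "")
  caller <- sys.call(-1)  # the call that invoked this wrapper
  # (a real logging wrapper would write the log entry here)
  base::stop(simpleError(msg, call = caller))
}

f <- function() stop_with_caller("something failed")

err <- tryCatch(f(), error = function(e) e)
reported_call <- deparse(conditionCall(err))
```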

`message()` doesn't capture all of `...` args, only the first

I'm in the process of changing over from handmade log files to a uniform automated system, and so far I like loggit. A lot of my old scripts have lines like message("The number of rows written to db was: ", nrow(table)). However, the version that you implement for masking the base message ignores the additional arguments and returns output like this:

> message("The number of rows: ", nrow(iris))
{"timestamp": "2020-05-25T16:50:48-0400", "log_lvl": "INFO", "log_msg": "The number of rows: "}
The number of rows: 150

The output to the console is correct! But the line written to the log only captures the first arg. It looks like this is an intentional design choice, because the source calls loggit with args[[1]] instead of something like paste(args, collapse = " ").
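For illustration, the arguments could be combined the same way the console output is, that is, concatenated without a separator. The helper collapse_message_args is hypothetical and only shows the collapsing step, not how loggit would call it.

```r
# Hypothetical helper: join all ... arguments into one string, with no
# separator, matching what base message() prints to the console.
collapse_message_args <- function(...) {
  args <- lapply(list(...), as.character)
  paste0(unlist(args), collapse = "")
}

log_msg <- collapse_message_args("The number of rows: ", nrow(iris))
log_msg
```

Note that paste(args, collapse = " ") would insert an extra space between arguments, so the logged text would differ from what is echoed.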

Is this changeable, or should I be adjusting my usage to match this behavior?

Feature request: read_logs() from URL

Thank you for this great logging package, and for considering this feature request!

It would be great to be able to read a remote log file from a URL, instead of a local filepath, e.g., queries <- read_logs("https://raw.githubusercontent.com/USER/REPO/main/queries.log")
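A sketch of the connection handling this would need, using only base R. The helper read_log_lines is hypothetical; it returns the raw lines and leaves the parsing to whatever read_logs() does internally.

```r
# Hypothetical helper: return the raw lines of a log from a file path or URL.
read_log_lines <- function(location) {
  con <- if (grepl("^https?://", location)) url(location) else file(location)
  on.exit(close(con))
  readLines(con, warn = FALSE)
}

# Local usage; a remote log would be read the same way by passing its URL.
local_log <- tempfile(fileext = ".log")
writeLines('{"log_lvl": "INFO", "log_msg": "query ran"}', local_log)
remote_or_local_lines <- read_log_lines(local_log)
```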

Feature request: automatically sanitize disallowed ndjson chars

Would you be interested in me filing a PR for a config setting that automatically replaces : and \n in messages before they're written? Any ideas what they would get replaced with? I'm using _ and ___ respectively in my logs. It would be up to you whether that's a default or not. I haven't been able to figure out why colons in some places are problematic and not others, but newlines always prevent reading back the logs.
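The replacement described above could look roughly like this. sanitize_log_msg is a hypothetical name, and the defaults are simply the _ and ___ replacements mentioned in this issue, not anything loggit defines.

```r
# Hypothetical helper: replace characters that break reading the log back in.
sanitize_log_msg <- function(msg, colon = "_", newline = "___") {
  msg <- gsub("\n", newline, msg, fixed = TRUE)  # newlines first
  gsub(":", colon, msg, fixed = TRUE)            # then colons
}

clean_msg <- sanitize_log_msg("This won't: work\nsecond line")
clean_msg
```

The replacement is lossy: once written, an original underscore cannot be told apart from a replaced colon, which may matter when choosing the defaults.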

errors aren't able to be quieted using `purrr` adverbs

I have a function that I'm logging that occasionally fails badly due to external data sources. I handle it by using purrr::safely and retrying that item later. Usually that means that this function won't print anything to the console, just skip over that item. However, using this package, the deeply hidden stop(...) calls instead manage to print to the log anyway.

Is there any way to avoid this? Is this something for the purrr developers to fix?

library(loggit)
#> Warning: package 'loggit' was built under R version 3.6.2
#> 
#> Attaching package: 'loggit'
#> The following objects are masked from 'package:base':
#> 
#>     message, stop, warning
noisy_fn <- function(x) {
  if (x < 5) {message(x)}
  if (x >= 5) {stop(x)}
  x
}

purrr::map(3:6, purrr::safely(noisy_fn))
#> {"timestamp": "2020-05-25T23:23:54-0400", "log_lvl": "INFO", "log_msg": "3"}
#> 3
#> {"timestamp": "2020-05-25T23:23:54-0400", "log_lvl": "INFO", "log_msg": "4"}
#> 4
#> {"timestamp": "2020-05-25T23:23:54-0400", "log_lvl": "ERROR", "log_msg": "5"}
#> {"timestamp": "2020-05-25T23:23:54-0400", "log_lvl": "ERROR", "log_msg": "6"}
#> [[1]]
#> [[1]]$result
#> [1] 3
#> 
#> [[1]]$error
#> NULL
#> 
#> 
#> [[2]]
#> [[2]]$result
#> [1] 4
#> 
#> [[2]]$error
#> NULL
#> 
#> 
#> [[3]]
#> [[3]]$result
#> NULL
#> 
#> [[3]]$error
#> <simpleError in stop(x): 5>
#> 
#> 
#> [[4]]
#> [[4]]$result
#> NULL
#> 
#> [[4]]$error
#> <simpleError in stop(x): 6>
purrr::map(3:6, purrr::possibly(noisy_fn, otherwise = NA))
#> {"timestamp": "2020-05-25T23:23:54-0400", "log_lvl": "INFO", "log_msg": "3"}
#> 3
#> {"timestamp": "2020-05-25T23:23:54-0400", "log_lvl": "INFO", "log_msg": "4"}
#> 4
#> {"timestamp": "2020-05-25T23:23:54-0400", "log_lvl": "ERROR", "log_msg": "5"}
#> {"timestamp": "2020-05-25T23:23:54-0400", "log_lvl": "ERROR", "log_msg": "6"}
#> [[1]]
#> [1] 3
#> 
#> [[2]]
#> [1] 4
#> 
#> [[3]]
#> [1] NA
#> 
#> [[4]]
#> [1] NA

Created on 2020-05-25 by the reprex package (v0.3.0)

To be clear, in both examples above, I'd expect the INFO messages to print since that's part of the "normal" flow, but not the ERROR messages. Below is the output if I don't use the loggit library:

purrr::map(3:6, purrr::safely(noisy_fn))
#> 3
#> 4
#> [[1]]
#> [[1]]$result
#> [1] 3
#> 
#> [[1]]$error
#> NULL
#> 
#> 
#> [[2]]
#> [[2]]$result
#> [1] 4
#> 
#> [[2]]$error
#> NULL
#> 
#> 
#> [[3]]
#> [[3]]$result
#> NULL
#> 
#> [[3]]$error
#> <simpleError in .f(...): 5>
#> 
#> 
#> [[4]]
#> [[4]]$result
#> NULL
#> 
#> [[4]]$error
#> <simpleError in .f(...): 6>
purrr::map(3:6, purrr::possibly(noisy_fn, otherwise = NA))
#> 3
#> 4
#> [[1]]
#> [1] 3
#> 
#> [[2]]
#> [1] 4
#> 
#> [[3]]
#> [1] NA
#> 
#> [[4]]
#> [1] NA

Created on 2020-05-25 by the reprex package (v0.3.0)

Multiline messages will be destroyed

msg <- paste0("Package: smvgraph %s", utils::packageVersion("smvgraph"), "\n(C) 2022- Sigbert Klinke, HU Berlin")
loggit("DEBUG", msg, echo=FALSE)

produces the following entry in the loggit file:

{"timestamp": "2022-03-19T18:51:36+0100", "log_lvl": "DEBUG", "log_msg": "Package: smvgraph %s0.2.0__LF__(C) 2022- Sigbert Klinke__COMMA__ HU Berlin"}

But reading the log into R produces

read_logs()
                 timestamp log_lvl log_msg
1 2022-03-19T18:51:36+0100   DEBUG Package

Uninformative `echo`

Since echo is TRUE by default, the console is easily spammed with no new information, so I would recommend changing this.

This would also have the nice side effect that you could add a log to packages by simply importing loggit without changing the other behavior (in combination with my issue #23).

Alternatively, it would also be possible to introduce an option with default TRUE if backward compatibility is to be guaranteed.
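The option-based variant might be sketched as below. The option name loggit.echo and the function log_line are hypothetical, chosen only for this example.

```r
# "loggit.echo" is a hypothetical option name used only for this sketch.
log_line <- function(msg, echo = getOption("loggit.echo", default = TRUE)) {
  # Echo unless the caller or the session-wide option turns it off.
  if (isTRUE(echo)) cat(msg, "\n", sep = "")
  invisible(msg)
}

# Default stays TRUE, so existing behavior is unchanged.
default_output <- capture.output(log_line("echoed by default"))

# A user (or a package importing the logger) can silence it once.
old <- options(loggit.echo = FALSE)
quiet_output <- capture.output(log_line("not echoed"))
options(old)
```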

Enable explicit configuration to log to STDOUT

In many of our production contexts, we log, usually in JSON, to STDOUT (or in rare cases STDERR), and have external log collectors that capture logs across many services.

The filename test in configurations.R seems to prevent me from doing what I'd normally do for this:

# gives me a writeable file handler for the standard output stream
log_file <- file("stdout", "w")

I think it would be helpful to allow for these basic streams as output.

This is also in concert with the logging factor of the 12-factor app, which recommends:

A twelve-factor app never concerns itself with routing or storage of its output stream. It should not attempt to write to or manage logfiles. Instead, each running process writes its event stream, unbuffered, to stdout.

Timeline for implementation of in-package logging?

In your README, you note:

If you really wish to have all exception messages logged by loggit, please be patient, as this feature is in the works.

I'm wondering about the timeline for implementation for this, as it's critical for me to be able to do that for this package to be useful for production work.

Remove dplyr and jsonlite as dependencies

I've grown wary of external dependencies, both in terms of potential breakage and in terms of install time and size. A user who just wants to log their script/package entries shouldn't need to spend 20 minutes compiling C/C++ packages (on a Linux deployment host) just to enable that feature. loggit is currently using dplyr only for its bind_rows() functionality, which is easy to replicate in base R. It's leaning heavily into jsonlite though, and that will take some fudging on the read-in-the-data side of things. But writes should be easy to roll my own.
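As a sketch of the base R replacement for the two-data-frame case: columns missing on either side are filled with NA before calling rbind(). bind_rows_base is a hypothetical name, and this covers only a small part of what dplyr::bind_rows() does.

```r
# Hypothetical base R stand-in for dplyr::bind_rows() on two data frames.
bind_rows_base <- function(a, b) {
  cols <- union(names(a), names(b))
  # Add any column that is missing on either side, filled with NA.
  for (nm in setdiff(cols, names(a))) a[[nm]] <- rep(NA, nrow(a))
  for (nm in setdiff(cols, names(b))) b[[nm]] <- rep(NA, nrow(b))
  rbind(a[cols], b[cols])
}

old_log <- data.frame(log_lvl = "INFO", log_msg = "first",
                      stringsAsFactors = FALSE)
new_row <- data.frame(log_lvl = "WARN", log_msg = "second", extra = "x",
                      stringsAsFactors = FALSE)
combined <- bind_rows_base(old_log, new_row)
```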

Support `stopifnot`

stopifnot is also a base condition function and should therefore be supported

Add ability to log vector arguments as JSON arrays, instead of separate log entries

Current behavior is a consequence of R's vectorization. This might take some fudging, since R would then read in the values as arrays into a single df cell, vs. separate rows.

Add vignettes

Vignette topics:

Automated data validation
Logging to stdout (user will need to wrap calls to the right suppressor function, if not calling loggit() directly)

Warnings if log_msg has more than one attribute

First of all, thank you for this package!

Now, for the problem I encounter:

library(loggit)
library(glue)

loggit("INFO", "First message")
loggit("INFO", "Second message")
# Warning message:
# In bind_rows_(x, .id) :
#   Vectorizing 'glue' elements may not preserve their attributes

This warning comes from bind_rows: glue::glue returns an object of class glue, which does not match the class of the existing column in the data frame.

I see two solutions to overcome this:

Use as.character(log_msg) to automatically convert every log message (and log details) into character
Use suppressWarnings(dplyr::bind_rows(...)) to avoid this type of warning.

Note that I expect this kind of warning to happen not just with glue but for all objects with other attributes than character (dates, durations, ...)
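The first option can be illustrated without glue installed. In this sketch a plain string is given a glue class by hand as a stand-in for a real glue object; that stand-in is an assumption made only to keep the example self-contained.

```r
# Stand-in for a glue string: a character vector carrying an extra class.
msg <- structure("Second message", class = c("glue", "character"))

# as.character() drops the class and any other attributes, leaving a plain
# character vector that binds cleanly with an existing character column.
plain_msg <- as.character(msg)
class(plain_msg)
```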

rotating logs

Hi Ryan,

Is there a way to set rotating logs?
Loggit appends now, but wouldn't it be nice if you could rotate the logs?

Update DESCRIPTION and README

Should reflect goal and mission of the package now that v2.0.0 is almost out.

loggit() messaging to console suppressed during R markdown rendering due to use of cat()

In the current loggit.R, echoing of the log message is handled by write_ndjson() (loggit/R/loggit.R, line 83 in 5399852):

write_ndjson(log_df, echo = echo)

which in turn calls cat() if echo = T (loggit/R/json.R, line 102 in 5399852):

if (echo) cat(logdata, sep = "\n")

Previously, in version 1.1.1 (https://cran.r-project.org/src/contrib/Archive/loggit/loggit_1.1.1.tar.gz), if echo = T, the base message function was called in loggit.R:
if (echo) base::message(paste(c(log_lvl, log_msg), collapse = ": "))

Switching from message() to cat() causes loggit output to the console to be suppressed during R markdown rendering, as knitr::knit_hooks has options to handle message() output but nothing to handle cat() output (https://bookdown.org/yihui/rmarkdown-cookbook/output-hooks.html).

Here is an example code snippet (I replaced the markdown backticks with single quotes because I couldn't figure out a way to avoid confusing the code block formatting):

'''{r setup, include=FALSE}
library(loggit)
'''
'''{r test message, echo=F,message=F, warning=F}
loggit("INFO", "loggit message", echo = T)
message('base message\n')
'''

Only "base message" will be printed to console during rendering.

I suggest switching back from cat() to message() when echo = T.

Remove user agreement function introduced in 1.2.0

Also remove the limitations it imposes in loggit.R, and the testthat bypass in test_loggit.R.

Error in rotate_logs if there are less than rotate_lines lines in logfile

Thank you for this very useful tool! Unfortunately, rotate_logs() will cause an error ("Error in xj[i] : only 0's may be mixed with negative subscripts") if there are fewer than rotate_lines lines in the logfile.

I suggest the following fix: Replace lines 85-86 of loggit/R/utils.R with

if (nrow(log_df) > rotate_lines) {
  log_df <- log_df[(nrow(log_df) - rotate_lines + 1):nrow(log_df), ]
  write_ndjson(log_df, logfile, echo = FALSE, overwrite = TRUE)
}
