audit's People

Contributors

wiggin77

audit's Issues

Audit/log questions

General Observations:

  • would’ve been great to have a requirements/“why” section since there is no PM spec/link available. That would have helped clarify the rationale behind some of the items in the Objectives section. Are we doing a redo/v2 of Auditing because of identified perf issues, data completeness issues, spottiness, customer complaints, etc.?
  • having separate sections for the auditing and logging parts would have helped (me) identify the commonalities and differences in requirements/solutions - are the objectives for logging the same as for auditing (in terms of reliability, quickness of access, how long entries stay available, etc.)?
  • would’ve been great to have several design alternatives presented, with their pros/cons. A lot of emphasis is on making the logging asynchronous. Is that to make the logging quicker, more reliable, reduce the load on the mid-tier, etc.? Queue integration/management introduces its own set of issues, so it would’ve been great to call out the pros/cons of more than one approach.
  • is one of the requirements to be able to store the audits in more than one place (file, email, DB table)? If yes, can it be achieved in some other way (e.g. store in the DB first and export to other output types/stores using some ETL job)? Is it a customer/compliance request to reconcile the different audit storages?
  • wrt events capture, one example of an alternate approach could’ve been to capture the high-frequency or low/system-level audit events in SQL (as part of the same transaction that performs the actual action, since most of the audit fields would be available in the SQL transaction). Would that approach solve some of the concerns wrt correctness, immutability, etc., or would that introduce unacceptable performance issues? I’m not advocating one approach over another; just evaluating/measuring alternatives with their pros/cons considered would help.
  • would’ve been great to gather some metrics to drive decisions if possible - like the amount of data we expect to accumulate hourly/daily per company size, frequency, burstiness, etc.
  • do we see the audit/log store as a silo or are we planning to integrate with other event-consuming apps on the customer side (like data analytics/graph apps, etc.)? In that case a producer-consumer kind of approach is prob. more suitable.

Audit event types:

  • having a more complete list of event types we plan to capture at this moment would help identify differences/commonalities, the level (system or user) + frequency of the action being captured - driving what fields we need to capture in the schema. A higher-level user action can get translated into several low-level events in SQL - how comprehensive do we want to be?
  • do we plan to also allow capturing of client-only events? (Not sure if InvitePeople is currently a client-only action, but we could prob. think of user actions that don't always translate into API calls.)
  • could these events eventually be queried through an API and/or processed and fed back into the MM app (for example # of posts in the last hour), to enhance/build new features on top (e.g. spotlight “hot” channels)? Having them sourced from the audit/logging data would avoid putting extra load on the main user activity flows.
  • will we allow plugins to define custom events (describing specific actions they are interested in capturing/auditing), independently of our API-based set, that we would then store on their behalf?

API:

  • do we plan to have an API/build queries on top of the audit/log stores (for debugging or by eventid)? If yes, how would that impact the schema of the data being stored (e.g. do we store the metadata as a json blob or do we have a “user->action->object” type of structure, etc.)? It would also help to monitor/build histograms of the most common log warnings/errors in a specific time-frame to detect/alert on regressions.
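
To make the schema question concrete, here is a minimal sketch of the “user->action->object” shape and the kind of histogram query it makes cheap. The type, field names, and the CountByAction helper are all hypothetical and not taken from the design doc.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StructuredRecord is a hypothetical "user -> action -> object" audit shape.
// Each top-level field could map to its own indexed column; only the
// free-form extras live in Meta.
type StructuredRecord struct {
	ID       string            `json:"id"`
	UserID   string            `json:"user_id"`
	Action   string            `json:"action"`
	ObjectID string            `json:"object_id"`
	Meta     map[string]string `json:"meta,omitempty"`
}

// CountByAction builds a histogram of actions. With a structured shape this
// is a plain group-by; with an opaque JSON blob every record would have to
// be parsed first.
func CountByAction(records []StructuredRecord) map[string]int {
	counts := make(map[string]int)
	for _, r := range records {
		counts[r.Action]++
	}
	return counts
}

func main() {
	records := []StructuredRecord{
		{ID: "1", UserID: "u1", Action: "login", ObjectID: "u1"},
		{ID: "2", UserID: "u2", Action: "login", ObjectID: "u2"},
		{ID: "3", UserID: "u1", Action: "deleteChannel", ObjectID: "c9",
			Meta: map[string]string{"ip": "10.0.0.1"}},
	}
	fmt.Println(CountByAction(records))

	// The same record can still be exported as a JSON blob when needed.
	blob, err := json.Marshal(records[2])
	if err != nil {
		panic(err)
	}
	fmt.Println(string(blob))
}
```

The trade-off is the usual one: the blob is easier to evolve, while the structured shape is easier to index and query.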

Operations on the audit entries:

  • do we plan to add PII scrubbing for user-identifiable data (email, name, etc.)?
  • data trimming - is there a timeframe for keeping the audit entries, or do we plan data-trimming jobs to remove entries periodically?
  • do we plan to allow turning auditing/logging on/off by entity type (e.g. per team/channel, etc.)?
  • since you mention that "This means certain code paths will emit multiple audit records for the same event", are we planning to do any coalescing of audit entries (by time interval or sessionId), for example to reduce storage space?

Schema:

  • Id - is that a UUID/GUID?
  • ObjectId - is there a field capturing the id of the object the event acts on (team/channel, etc.), or is that captured by the Meta map?

Cloud:

  • it would help to discuss any potential cloud implications of the current design (e.g. data partitioning, monitoring, etc.). Do we see cloud presence as taking the existing audit/log setup and moving it into a cloud env., or do we plan to make structural changes to make it cloud-first/native?

Some comments

"Auditing will be done with a dedicated API but utilize the logging engine for storage."

That means auditing is not going to be affected by the logging engine's level configuration, right?

"For example, when logging a struct asynchronously care must be taken to ensure the contents of the data being logged does not change while the log record is being created and formatted"

Could you please provide an example of this case so I can understand better?
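
One hypothetical illustration of the hazard, as a sketch only: the User type and the two queues below are invented for the example and are not the design's actual API. The queues are drained on the same goroutine so the outcome is deterministic; with a real background goroutine the pointer variant would also be a data race.

```go
package main

import "fmt"

// User is a hypothetical struct handed to the logger.
type User struct {
	Name string
	Role string
}

// Demo queues the same struct two ways, mutates it, and only then formats
// the queued entries, mimicking a logger that formats records later.
func Demo() (fromPointer, fromSnapshot string) {
	u := &User{Name: "alice", Role: "member"}

	byPointer := make(chan *User, 1)   // queues a reference only
	bySnapshot := make(chan string, 1) // queues contents captured at call time

	byPointer <- u
	bySnapshot <- fmt.Sprintf("%+v", *u)

	// The caller carries on and changes the struct before the logger has
	// created and formatted the record.
	u.Role = "admin"

	fromPointer = fmt.Sprintf("%+v", *<-byPointer)
	fromSnapshot = <-bySnapshot
	return fromPointer, fromSnapshot
}

func main() {
	p, s := Demo()
	fmt.Println("pointer: ", p) // shows Role:admin, not what was true at log time
	fmt.Println("snapshot:", s) // shows Role:member
}
```

The pointer variant records the role as admin even though the log call happened while it was still member; capturing the contents at call time avoids that.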

"what to do when the queue is full."

It'd also be good, if it's not on the list, to know how to react when we are not able to send to the audit/logging API, because discarding the messages doesn't seem like an option. I don't know if we're using message queues that let us persist the messages, but it could be a good idea.
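
A minimal sketch of one way to avoid silently discarding records, assuming an in-process bounded queue backed by a Go channel. The Queue type, NewQueue, and the overflow callback are hypothetical; the callback stands in for whatever persistent fallback is chosen (a local file, a durable message queue, etc.).

```go
package main

import (
	"errors"
	"fmt"
)

// ErrQueueFull is returned when the queue is full and no fallback is set.
var ErrQueueFull = errors.New("audit queue full")

// Queue is a hypothetical bounded audit queue.
type Queue struct {
	ch       chan string
	overflow func(rec string) error // hypothetical persistent fallback
}

// NewQueue creates a queue holding at most size records in memory.
func NewQueue(size int, overflow func(string) error) *Queue {
	return &Queue{ch: make(chan string, size), overflow: overflow}
}

// Enqueue never blocks the caller: it uses the in-memory queue if there is
// room, otherwise hands the record to the fallback, otherwise reports an
// error so the caller can decide what to do.
func (q *Queue) Enqueue(rec string) error {
	select {
	case q.ch <- rec:
		return nil
	default:
		if q.overflow != nil {
			return q.overflow(rec)
		}
		return ErrQueueFull
	}
}

func main() {
	var spilled []string
	q := NewQueue(2, func(rec string) error {
		spilled = append(spilled, rec) // stand-in for writing to durable storage
		return nil
	})
	for _, rec := range []string{"r1", "r2", "r3"} {
		if err := q.Enqueue(rec); err != nil {
			fmt.Println("enqueue failed:", err)
		}
	}
	fmt.Println("queued:", len(q.ch), "spilled:", spilled)
}
```

Blocking the caller until space frees up is the other obvious policy; it trades request latency for not needing a fallback store.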

"Queue(s) will be monitored for percent full and periodically reported."

I don't know if you have it in mind, but it could be a good idea to add metrics for this as well.
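
A sketch of how such a metric could be derived, assuming the queue is a buffered Go channel. The Gauge interface, the metric name, and the printGauge type are hypothetical stand-ins for whatever metrics client is already in use.

```go
package main

import "fmt"

// Gauge is a hypothetical metrics sink.
type Gauge interface {
	Set(v float64)
}

// PercentFull reports how full a buffered channel is, from 0 to 100.
// len and cap on a channel give the queued count and the buffer size.
func PercentFull(ch chan string) float64 {
	if cap(ch) == 0 {
		return 0 // unbuffered channel: nothing is ever queued
	}
	return 100 * float64(len(ch)) / float64(cap(ch))
}

// Report publishes the current fill level; a real setup would call this
// from a time.Ticker loop.
func Report(ch chan string, g Gauge) {
	g.Set(PercentFull(ch))
}

// printGauge is a stand-in gauge that prints the value.
type printGauge struct{}

func (printGauge) Set(v float64) {
	fmt.Printf("audit_queue_percent_full %.0f\n", v)
}

func main() {
	q := make(chan string, 4)
	q <- "rec1"
	q <- "rec2"
	q <- "rec3"
	Report(q, printGauge{})
}
```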

Data Model

I've seen that the Meta field is map[string]string. Why not map[string]interface{}? Is it related to the data transformation pre-queue?
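
One possible reason, which is my assumption and not something the design doc states: string values are immutable, so a shallow copy of a map[string]string is a complete snapshot, whereas a map[string]interface{} can still alias mutable caller data after copying. A small sketch with hypothetical helper names:

```go
package main

import "fmt"

// copyStrings makes a shallow copy; for string values that is a full snapshot.
func copyStrings(m map[string]string) map[string]string {
	out := make(map[string]string, len(m))
	for k, v := range m {
		out[k] = v
	}
	return out
}

// copyAny makes a shallow copy; slices, maps and pointers inside the values
// still share memory with the caller.
func copyAny(m map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(m))
	for k, v := range m {
		out[k] = v
	}
	return out
}

// Demo copies both kinds of map as a queue might, mutates the caller's
// data, and returns what each queued copy shows afterwards.
func Demo() (anyVal, strVal string) {
	roles := []string{"member"}
	anyMeta := map[string]interface{}{"roles": roles}
	strMeta := map[string]string{"roles": fmt.Sprint(roles)}

	queuedAny := copyAny(anyMeta)
	queuedStr := copyStrings(strMeta)

	roles[0] = "admin" // caller changes its data after the log call

	return fmt.Sprint(queuedAny["roles"]), queuedStr["roles"]
}

func main() {
	a, s := Demo()
	fmt.Println("interface{} copy:", a)
	fmt.Println("string copy:     ", s)
}
```

The interface{} copy reflects the later change while the string copy keeps the value from log time, at the cost of forcing callers to stringify values up front.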
