
someengineering / fixinventory


Fix Inventory consolidates user, resource, and configuration data from your cloud environments into a unified, graph-based asset inventory.

Home Page: https://inventory.fix.security

License: GNU Affero General Public License v3.0

Python 99.11% Shell 0.33% HTML 0.16% Makefile 0.25% Jupyter Notebook 0.12% Dockerfile 0.03% CSS 0.01%
aws gcp infrastructure-as-code digitalocean open-source security security-automation cnapp cspm cybersecurity

fixinventory's People

Contributors

1101-1, 1nv1, akash190104, anjafr, aquamatthias, azagaya, cebidhem, cprovencher, dependabot[bot], fatz, fernandocarletti, hexpy, imgbot[bot], kushthedude, lloesche, meln1k, mrmarvin, nburtsev, neilinger, rpicster, scapecast, sebbrandt87, snyk-bot, some-ci, tdickers, thecatlady, yuval-k


fixinventory's Issues

Implement an on-premise collector

Implement an on-premise collector that can take a network range, find nodes in it, and add them to the graph. Where possible, use SNMP, SSH, or WMI to connect to discovered systems and learn more about them. Essentially, get the equivalent of the information a cloud provider API would return about a compute instance or network.
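A minimal sketch of the discovery part, using a TCP connect scan over a CIDR range (SNMP is UDP and would need a different probe; the port list and function name here are illustrative):

```python
import ipaddress
import socket

def scan_range(cidr: str, ports=(22, 135, 80), timeout: float = 0.5):
    """Find live nodes in a network range by probing well-known TCP ports
    (22/ssh, 135/wmi-rpc, 80/http). Returns {ip: [open ports]}."""
    found = {}
    for ip in ipaddress.ip_network(cidr).hosts():
        open_ports = []
        for port in ports:
            try:
                # TCP connect scan; an OSError means closed/filtered/timeout
                with socket.create_connection((str(ip), port), timeout=timeout):
                    open_ports.append(port)
            except OSError:
                pass
        if open_ports:
            found[str(ip)] = open_ports
    return found
```

Each discovered node would then be turned into a graph resource, enriched via SSH/SNMP/WMI where credentials are available.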

[keepercore] rename merge -> replace

Rename merge -> replace and move the merge point up: if a collector delivers a graph and sets replace = true on the cloud kind node, the entire cloud should be replaced.

keepercore - collect query stats

Instrument the query parser and collect statistics on which fields are queried and how often, as input for automated index creation and removal.
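A rough sketch of what the stats collection could look like (class and method names are made up):

```python
from collections import Counter

class QueryStats:
    """Count how often each field appears in parsed query predicates,
    as a basis for automated index creation/removal (sketch)."""

    def __init__(self):
        self.field_counts = Counter()

    def record(self, fields):
        # called from the query parser with the fields of each predicate
        self.field_counts.update(fields)

    def index_candidates(self, min_hits: int = 100):
        # fields queried often enough to justify an index
        return [f for f, n in self.field_counts.most_common() if n >= min_hits]
```

Removal would work the other way around: indexes on fields that drop below the threshold over some time window become candidates for deletion.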

Use TLS for all communication

The current communication over unencrypted HTTP and websocket is obviously not acceptable. We should add the ability to load X.509 certificates for everything, starting with the server side (ckcore) but eventually doing client authentication (ckworker, ckmetrics, cksh) as well.
This is independent of any JWT users/roles authorization work.
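A sketch of what loading the certs could look like on both sides, using Python's stdlib ssl module (file paths and function names are assumptions):

```python
import ssl

def make_server_context(cert_file, key_file):
    """TLS context for the server side (ckcore); cert_file/key_file are
    paths to PEM files (assumed names)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.load_cert_chain(cert_file, key_file)
    return ctx

def make_client_context(ca_file=None):
    """Client context (ckworker etc.) that verifies the server certificate;
    ca_file would point to the internal CA bundle, if any."""
    ctx = ssl.create_default_context(cafile=ca_file)
    ctx.check_hostname = True
    return ctx
```

Client authentication would later add `ctx.verify_mode = ssl.CERT_REQUIRED` plus `load_cert_chain` on the client context as well.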

[docker] better defaults on many-core machines

Right now the defaults in our Docker image are to not fork collector plugins and to collect two accounts simultaneously. This is great for testing Cloudkeeper on a personal laptop but not for a many-core server. The image should detect the environment it is running in and the amount of resources available to it, and automatically scale to higher settings by default.
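A sketch of container-aware detection, assuming cgroup v2 (the path and the cap of 16 are illustrative):

```python
import os

def effective_cpu_count() -> int:
    """CPUs actually available to this container: honour the cgroup v2
    cpu.max quota if present, otherwise fall back to os.cpu_count()."""
    try:
        with open("/sys/fs/cgroup/cpu.max") as f:
            quota, period = f.read().split()
            if quota != "max":
                return max(1, int(int(quota) / int(period)))
    except (OSError, ValueError):
        pass  # no cgroup v2 limit visible; use the host count
    return os.cpu_count() or 1

def default_pool_size() -> int:
    # e.g. collect one account per available core, with an upper cap
    return min(effective_cpu_count(), 16)
```

The same idea applies to memory (cgroup memory.max) when deciding whether to fork collector plugins.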

[plugin gcp] Collecting instances with custom machine types raises exceptions, breaks collecting

Issue

When collecting instances on a project with about 100 instances, the resulting graph only ends up with a small subset of those. This seems to be due to the collect task breaking early when encountering an instance with a custom machine type.

Context

Running latest main (3663023) using

ckworker --verbose --psk changeme --ckcore-uri http://localhost:8900 --ckcore-ws-uri ws://localhost:8900 --collector gcp --gcp-service-account="" --gcp-fork --gcp-project-pool-size=64 --gcp-project=myproject --gcp-collect=instances --verbose --debug-dump-json 2>&1 | tee log/ckworker.log

Error

[...]
2021-10-15 10:00:44,528 - DEBUG - 110556/gcp_myproject - Fetching custom instance type for gcp_instance my-custom-instance (2471263805638373273)
2021-10-15 10:00:44,549 - ERROR - 110556/gcp_myproject - Caught exception in collect_something(<cloudkeeper_plugin_gcp.collector.GCPProjectCollector object at 0x7f249e72bd90>, paginate_method_name='aggregatedList',
resource_class=<class 'cloudkeeper_plugin_gcp.resources.GCPInstance'>, post_process=<function GCPProjectCollector.collect_instances.<locals>.post_process at 0x7f2499d92040>, search_map={'__network': ['link', <function GCPProjectCollector.collect_instances.<locals>.<lambda> at 0x7f2499d92160>], '__subnetwork': ['link', <function GCPProjectCollector.collect_instances.<locals>.<lambda> at 0x7f2499d92280>], 'machine_type': ['link', 'machineType']}, attr_map={'instance_status': 'status', 'machine_type_link': 'machineType'}, predecessors=['__network', '__subnetwork', 'machine_type'])
Traceback (most recent call last):
  File "/home/marv/cloudkeeper/venv/lib/python3.9/site-packages/cklib/utils.py", line 489, in catch_and_log
    return f(*args, **kwargs)
  File "/home/marv/cloudkeeper/venv/lib/python3.9/site-packages/cloudkeeper_plugin_gcp/collector.py", line 657, in collect_something
    post_process(r, self.graph)
  File "/home/marv/cloudkeeper/venv/lib/python3.9/site-packages/cloudkeeper_plugin_gcp/collector.py", line 752, in post_process
    request = gr.get(**kwargs)
  File "/home/marv/cloudkeeper/venv/lib/python3.9/site-packages/googleapiclient/discovery.py", line 997, in method
    raise TypeError('Got an unexpected keyword argument {}'.format(name))
TypeError: Got an unexpected keyword argument region
2021-10-15 10:00:44,623 - DEBUG - 110487/gcp - Merging graph of gcp_project myproject into graph of cloud gcp
[...]
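The TypeError comes from passing a region keyword to a discovery method that does not accept it (custom machine types of zonal instances are fetched via a zonal get). One defensive fix, sketched with a hypothetical helper, is to drop kwargs the method does not support before calling it:

```python
def filter_method_kwargs(supported, **kwargs):
    """Drop keyword arguments a googleapiclient discovery method does not
    accept (e.g. 'region' on a zonal machineTypes.get call). 'supported'
    would be derived from the method's parameter list (hypothetical)."""
    return {k: v for k, v in kwargs.items() if k in supported}
```

With this in post_process, `gr.get(**filter_method_kwargs(supported, **kwargs))` would no longer break the whole collect run on custom machine types.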

Simple PSK auth between components

Right now all communication between components is open. As mentioned in #218 we're going to add transport layer security over all communication channels. However we also need to do authorization independent of the communication channel.

This will eventually be done by an OpenID Connect service, but as a first immediate step let's have a simple auth mechanism that is better than no auth (being open to the entire network) and better than plain-text passwords.

We'll use a pre-shared key to sign and verify a JWT.
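For illustration, HS256 signing and verification with a pre-shared key can be done with the stdlib alone (a real implementation would likely use a library such as PyJWT):

```python
import base64
import hashlib
import hmac
import json

def _b64(data: bytes) -> str:
    # JWT uses unpadded url-safe base64
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def encode_jwt(payload: dict, psk: str) -> str:
    """Sign a JWT with the pre-shared key using HS256 (minimal sketch)."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64(json.dumps(payload).encode())
    signing_input = f"{header}.{body}".encode()
    sig = hmac.new(psk.encode(), signing_input, hashlib.sha256).digest()
    return f"{header}.{body}.{_b64(sig)}"

def verify_jwt(token: str, psk: str) -> dict:
    """Verify the signature with the same PSK and return the payload."""
    header, body, sig = token.split(".")
    signing_input = f"{header}.{body}".encode()
    expected = hmac.new(psk.encode(), signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(_b64(expected), sig):
        raise ValueError("invalid signature")
    pad = "=" * (-len(body) % 4)
    return json.loads(base64.urlsafe_b64decode(body + pad))
```

Each component would be started with the same `--psk` value; any token signed with a different key is rejected.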

[keepercore] Aggregation: Allow for combined parameters

Why

There are scenarios where the aggregation grouping key has to be derived from more than one variable.
Example:

aggregate(cloud.name as cloud, account.name as account, region.name as region, instance_type as type, quota_type : sum(reservations) as reserved_instances_total) (merge_with_ancestors="cloud,account,region"): is("instance_type") and reservations >= 0

The account.name is not necessarily unique. Ideally we could express something like:

aggregate("{account.name} ({account.id})" as account, r

What

tbd

AC

tbd

[keepercore] Add http command to support web hooks

Why

While it is possible to attach to the event bus or to query the state of the system, the simplest solution might be to also allow web hooks to be triggered.
In order to support this scenario, an http command should be implemented, which will perform the request on execution.
This way it would be possible to trigger web hooks based on events (or any other trigger) in the system.

What

  • Add an http command that allows specifying: method, url
  • Every element in the in_stream will trigger an http request
  • When the method is POST: send the element as the body
  • Allow the definition of parallel requests per command
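A sketch of how each in_stream element could be turned into a request, using stdlib urllib for illustration (the actual implementation would presumably use the core's async HTTP client):

```python
import json
import urllib.request

def build_request(method: str, url: str, element) -> urllib.request.Request:
    """Build one web-hook request per stream element.
    For POST the element is JSON-serialized into the body."""
    data = json.dumps(element).encode() if method.upper() == "POST" else None
    return urllib.request.Request(
        url,
        data=data,
        method=method.upper(),
        headers={"Content-Type": "application/json"},
    )
```

Parallelism would then be a matter of how many of these requests are dispatched concurrently per command invocation.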

Open Questions

  • How to handle auth?

Acceptance Criteria

TBD

[keepercore] tag command needs to resolve cloud, account, region, zone

Why

The tagger needs additional information for cloud, account, region and zone.
This information should be provided without further user interaction.

What

Incoming elements need to load and merge all references (cloud, account, region, zone).

AC

match | tag command can be issued without any additional merge_ancestors

[keepercore] Add jq command

Why

Cloudkeeper v1 supports this command.

What

  • Add jq as dependency
  • Implement the command

Acceptance criteria

  • jq can be used in the CLI

[keepercore] - allow dynamic CLI aliases

From user feedback on Discord:

Queries like is(aws_alb) and ctime<"-7d" and backends==[] with(empty, <-- is(aws_alb_target_group) and target_type=="instance" and ctime<"-7d" with(empty, <-- is(aws_ec2_instance) and instance_status!="terminated")) <-[0:1]- is(aws_alb_target_group) or is(aws_alb) are powerful but also complex.

Chat protocol:

ViktorHarutyunyan:
I am not sure what's the target audience but I think this is an overkill, I think a simple query like:
> ck show unused lbs
then there should be like a man page explaining the criteria of selection. 
then the user can choose to say -r region -a account etc.. 
or say --days=7
imo, this is much simpler 
then if we talk about UI, then it is all clicks and nice small value edit windows. 
ck show unused tgs .... 
ck show unattached volumes ....

The CLI already has aliases. We should make those configurable so that users can download/exchange other users' aliases. The complex query above could then turn into show unused loadbalancers.

Personally I'd also not want to have to type the complex query even if I know the syntax. A shorthand form would be useful.
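A configurable alias table could be as simple as a mapping from shorthand to full query (the entries here are illustrative):

```python
# user-configurable aliases, e.g. loaded from a shared config file
ALIASES = {
    "show unused loadbalancers": 'query is(aws_alb) and ctime<"-7d" and backends==[]',
    "show unattached volumes": 'query is(volume) and volume_status=="available"',
}

def expand_alias(line: str, aliases=None) -> str:
    """Replace a known alias with its full CLI query; pass through
    everything else unchanged."""
    aliases = ALIASES if aliases is None else aliases
    return aliases.get(line.strip(), line)
```

Downloadable alias packs would just be such mappings merged into the user's local table.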

Upload to PyPI

It would be convenient to just pip install cloudkeeper.
Right now only the Docker image that also runs the tox tests is built automatically. We have all the appropriate PyPI package names reserved; there's just no automated build pipeline set up.

[keepercore] Add command to trigger workflows or jobs

Why

To test a workflow it should be easy to start it directly, so the user does not need to wait for the usual trigger.

What

  • Implement a command to start a workflow or job
  • start_task task_name

AC

A job or workflow can be started via the CLI.

[keepercore] less verbose logging on query errors

15:29:37 [WARNING] Request <Request POST /graph/ck/reported/query/aggregate > has failed with exception: Error: AttributeError
Message: Given kind does not exist: volume [core.web.api]
Traceback (most recent call last):
  File "/home/lukas/repo/cloudkeeper/keepercore/core/web/api.py", line 598, in middleware_handler
    response = await handler(request)
  File "/home/lukas/repo/cloudkeeper/venv/lib/python3.9/site-packages/aiohttp/web_urldispatcher.py", line 195, in handler_wrapper
    return await result
  File "/home/lukas/repo/cloudkeeper/keepercore/core/web/api.py", line 485, in query_aggregation
    return await self.stream_response_from_gen(request, gen)
  File "/home/lukas/repo/cloudkeeper/keepercore/core/web/api.py", line 565, in stream_response_from_gen
    gen = await force_gen(gen_in)
  File "/home/lukas/repo/cloudkeeper/keepercore/core/util.py", line 174, in force_gen
    return with_first(await gen.__anext__())
  File "/home/lukas/repo/cloudkeeper/keepercore/core/db/graphdb.py", line 277, in query_aggregation
    q_string, bind = self.to_query(query)
  File "/home/lukas/repo/cloudkeeper/keepercore/core/db/graphdb.py", line 857, in to_query
    part_tuple = part(p, idx, crsr)
  File "/home/lukas/repo/cloudkeeper/keepercore/core/db/graphdb.py", line 820, in part
    cursor = filter_statement()
  File "/home/lukas/repo/cloudkeeper/keepercore/core/db/graphdb.py", line 736, in filter_statement
    query_part += f"LET {out} = (FOR {crsr} in {in_cursor} FILTER {term(crsr, p.term)} RETURN {crsr})"
  File "/home/lukas/repo/cloudkeeper/keepercore/core/db/graphdb.py", line 721, in term
    return is_instance(cursor, ab_term)
  File "/home/lukas/repo/cloudkeeper/keepercore/core/db/graphdb.py", line 706, in is_instance
    raise AttributeError(f"Given kind does not exist: {t.kind}")
AttributeError: Given kind does not exist: volume

is too much logging for saying "user tried to query data that doesn't exist"
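One way to sketch the fix: treat user-caused errors as expected and log them in one line, keeping full tracebacks only for genuinely unexpected failures (class and logger names here are assumptions):

```python
import logging

log = logging.getLogger("core.web.api")

class QueryError(Exception):
    """Expected, user-caused error (e.g. unknown kind) - hypothetical class."""

def log_request_failure(exc: Exception) -> None:
    """Called from the request middleware when a handler raised."""
    if isinstance(exc, (QueryError, AttributeError)):
        # user tried to query data that doesn't exist: one line is enough
        log.info("Invalid query: %s", exc)
    else:
        log.exception("Unhandled error in request handler")
```

Longer term, raising a dedicated QueryError instead of AttributeError from graphdb.py would make the distinction explicit.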

workerd - fan-out tasks to worker pool

Right now incoming tag update/delete tasks are worked on one by one. One could start n workerd instances to work on n tasks in parallel. Instead it would be better to accept n tasks in parallel and distribute them onto a worker pool. The basic functionality is already there for all collectors as well as the cleaner; it just isn't used yet for tag tasks.
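The fan-out itself could be sketched with a standard thread pool (handler stands in for the existing per-task work function):

```python
from concurrent.futures import ThreadPoolExecutor

def process_tasks(tasks, handler, pool_size: int = 4):
    """Distribute incoming tag tasks onto a worker pool instead of
    handling them one by one. Results come back in task order."""
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        return list(pool.map(handler, tasks))
```

The pool size would default to something derived from the available cores, matching the behaviour the collectors already have.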

[keepercore] Allow v1 like collect in intervals behaviour

Currently workflows are scheduled using a cron like syntax.

If the collect workflow runs every hour and a run takes 65 minutes, the next execution would be skipped and the next workflow only executed at the next full hour.

In v1 the behaviour is: collect in intervals. If the interval target is 1 hour and a collect run takes 30 minutes, the system will wait another 30 minutes before the next run starts. However, if the target is 1 hour and the collect takes 65 minutes, the system will start the next collect right after the current one finishes.

This behaviour ensures that our data is as "fresh" and close to the chosen collection interval as possible. v2 should implement a similar scheduling behaviour.
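The interval behaviour described above boils down to sleeping only for the remainder of the interval; a minimal sketch:

```python
import time

def run_at_interval(collect, interval: float, runs: int) -> None:
    """v1-style interval scheduling: wait out the remainder of the
    interval after a fast run; start immediately after a run that
    overran it (collect is the workflow entry point)."""
    for _ in range(runs):
        started = time.monotonic()
        collect()
        elapsed = time.monotonic() - started
        # sleep 0 when the run took longer than the interval
        time.sleep(max(0.0, interval - elapsed))
```

In v2 this would live next to (or replace) the cron-style trigger for the collect workflow.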

[keepercore] topic/workflows -> allow to filter by criteria like cloud/account

The task queue allows filtering by attributes. E.g. if a worker tells the core that it can apply tagging for cloud aws then it would receive tag tasks for cloud aws account 123.

In workflows we currently don't have a similar filter criteria. It's just a "collect" or "cleanup" workflow. Not a "collect aws" or "cleanup aws account 123" workflow.

[ckcore] aggregation on graph traversals with duplicated elements are wrong

Graph traversals can emit an element multiple times; duplicated elements are normally filtered out on the ckcore side.

In the case of aggregations, duplicated elements are not filtered but counted multiple times.
Example:

> query is(region) <-[0:]- | count reported.kind
onelogin_region: 1
onelogin_account: 1
slack_region: 3
slack_team: 3
gcp_region: 476
gcp_project: 476
aws_account: 702
aws_region: 702
cloud: 1182
graph_root: 1182
total matched: 4728
total unmatched: 0

There are 1182 regions in total. The single graph root lies on the traversal path of every region, so it is emitted (and counted) 1182 times. The result of count is wrong in this case.
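A sketch of the fix: deduplicate by node id before aggregating (the field names follow the query output above but are assumptions):

```python
from collections import Counter

def count_kinds(nodes):
    """Count node kinds after deduplicating by node id, so elements a
    graph traversal emits multiple times are only counted once."""
    seen = set()
    counts = Counter()
    for node in nodes:
        if node["id"] not in seen:
            seen.add(node["id"])
            counts[node["reported"]["kind"]] += 1
    return counts
```

With this, the traversal above would report a single graph_root instead of 1182.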

Better tests for Cloudkeeper core and plugins

Currently the core has some test coverage and the plugins essentially none. Each plugin has a stub that tests some CLI args, essentially a starting point for adding more tests. All the infrastructure is already there, and tox is configured for each plugin.

[keepercore] clean command should accept a reason as parameter

Why

When a resource is cleaned up, users of the system should be able to understand why.
The simplest solution for the moment: whenever a resource should be cleaned, the reason is written to the log.

What

  • CleanCommand accepts a reason message as argument
  • Add a log message for every resource that is marked for cleanup with the reason included

Acceptance Criteria

A user can see from the log why a resource was cleaned.
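A sketch of the proposed behaviour (the node structure and function name are assumptions):

```python
import logging

log = logging.getLogger("cleaner")

def mark_for_cleanup(node: dict, reason: str) -> None:
    """Mark a resource for cleanup and log why, so the reason given to
    the clean command ends up next to every affected resource."""
    node["desired"] = {**node.get("desired", {}), "clean": True}
    log.info("Resource %s marked for cleanup: %s", node.get("id"), reason)
```

Invocation would look like `... | clean "older than 7 days and untagged"`.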

[resotoworker] propagate collection progress to `resotocore`

Right now the workers only log collection progress. For the UI as well as CLI it would be useful to get collection feedback so they can provide info like "Collecting AWS account 4/20" or similar.
For this to work BasePlugin needs to be extended so it can feed current collection progress back to the worker and then the worker needs to forward that to the core.

Running `echo` without arguments causes an exception

Passing echo with no args to /cli/evaluate returns a 200 OK, but when passed to /cli/execute it causes an exception

01:04:01 [WARNING] Request <Request POST /cli/execute > has failed with exception: Error: JSONDecodeError
Message: Expecting value: line 1 column 1 (char 0) [core.web.api]
Traceback (most recent call last):
  File "/Users/lukas/repo/cloudkeeper/keepercore/core/web/api.py", line 490, in middleware_handler
    response = await handler(request)
  File "/Users/lukas/repo/cloudkeeper/keepercore/core/web/api.py", line 418, in execute
    return await self.stream_response_from_gen(request, result)
  File "/Users/lukas/repo/cloudkeeper/keepercore/core/web/api.py", line 484, in stream_response_from_gen
    return await respond_json()
  File "/Users/lukas/repo/cloudkeeper/keepercore/core/web/api.py", line 464, in respond_json
    async for item in gen:
  File "/Users/lukas/repo/cloudkeeper/venv/lib/python3.9/site-packages/aiostream/stream/advanced.py", line 59, in base_combine
    result = task.result()
  File "/Users/lukas/repo/cloudkeeper/keepercore/core/cli/command.py", line 48, in parse
    js = json.loads(arg if arg else "")
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The same happens when passing an unquoted string. It evaluates okay but when executed causes an exception.
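A sketch of a defensive parse that covers both cases: empty input becomes an empty string, and non-JSON input falls back to the raw string instead of raising a JSONDecodeError:

```python
import json

def parse_echo_arg(arg):
    """Tolerant parse for the echo command's argument (sketch)."""
    if not arg:
        return ""
    try:
        return json.loads(arg)
    except json.JSONDecodeError:
        # unquoted strings are echoed verbatim
        return arg
```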

workerd - backpropagate tag changes

Currently tag update/delete are not backpropagated to the core. Instead changes are picked up during the next collect run.
It would be useful to update the individual nodes in real-time so that the graph immediately reflects the changes.

AWS Cloudformation (and by proxy EKS Nodegroup) delete() should block until completed or failed

Right now during the cleanup phase of AWS resources we mostly just call the delete() method on whatever resource we're trying to remove. For almost all resources this is a blocking call that returns whether or not a resource was successfully deleted. However, Cloudformation always returns success, and the operation status then has to be polled. Those deletes can run for many minutes, even hours, in complex environments, so waiting for them would currently block resource collection.
We could delete a Cloudformation stack and then wait a certain amount of time for the delete to complete before timing out.
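A sketch of the polling approach with a timeout (get_status stands in for a hypothetical wrapper around the describe_stacks API; the status strings are Cloudformation stack states):

```python
import time

def wait_for_delete(get_status, timeout: float = 3600, poll: float = 10) -> bool:
    """Poll a Cloudformation stack's delete status until it completes,
    fails, or the timeout is reached. Returns True on DELETE_COMPLETE,
    False on DELETE_FAILED, raises TimeoutError otherwise."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status == "DELETE_COMPLETE":
            return True
        if status == "DELETE_FAILED":
            return False
        time.sleep(poll)
    raise TimeoutError("stack deletion did not finish in time")
```

boto3 also ships a `stack_delete_complete` waiter that implements essentially this loop, which may be preferable to hand-rolling it.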
