OPA version: v0.60.0

OPA process OOM when a bundle contains a large file despite having size_limit_bytes about opa HOT 7 CLOSED

dolevf commented on May 9, 2024

OPA process OOM when a bundle contains a large file despite having size_limit_bytes

from opa.

Comments (7)

ashutosh-narkar commented on May 9, 2024

OPA seems to be doing the right thing here, i.e. rejecting the large file and returning a bundle load error, which in turn should kick off successive download attempts with an exponential back-off delay. What's probably happening is that the memory is not getting released quickly enough? I guess if a GC cycle were to run we'd see the memory being released. Go 1.19 introduced the concept of a soft memory limit, which helps control GC behavior and can be set via the GOMEMLIMIT environment variable. I haven't played around with this, so I'm not sure it will help, but is this something you've looked into?
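For reference, the soft memory limit mentioned above can also be set from inside a Go program with `runtime/debug.SetMemoryLimit`, the programmatic counterpart of `GOMEMLIMIT`. The sketch below is only an illustration: the 512 MiB value and the helper names `applySoftLimit` and `currentSoftLimit` are assumptions, not OPA settings or OPA code. Because the limit is soft, it makes the GC run more aggressively as the heap approaches it, but it cannot prevent an OOM when the live data alone exceeds the limit.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// applySoftLimit sets the Go runtime's soft memory limit (in bytes) and
// returns the previously configured limit. Hypothetical helper name.
func applySoftLimit(limitBytes int64) int64 {
	return debug.SetMemoryLimit(limitBytes)
}

// currentSoftLimit queries the limit without changing it: a negative
// argument to SetMemoryLimit leaves the setting untouched.
func currentSoftLimit() int64 {
	return debug.SetMemoryLimit(-1)
}

func main() {
	const limit = 512 << 20 // 512 MiB, an arbitrary value for illustration
	applySoftLimit(limit)
	fmt.Println("soft limit in bytes:", currentSoftLimit())
}
```

Whether this would help with the behavior reported here is untested, as the comment above also notes.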

anderseknert commented on May 9, 2024

@ashutosh-narkar do we really need to load the file at all if the file size exceeds size_limit_bytes? That seems to defeat the point of the setting. I haven't looked into it, but I'm assuming there's a way to check the file size of an item in the tarball before reading the bytes. Is that not the case?
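As a rough illustration of the question above: Go's `archive/tar` exposes the declared size of each entry in `Header.Size` before any of its contents are read. The sketch below assumes a plain, uncompressed tar stream, and the helper names `buildTar` and `checkSizes` are hypothetical, not part of OPA. It rejects an oversized entry based on the header alone.

```go
package main

import (
	"archive/tar"
	"bytes"
	"errors"
	"fmt"
	"io"
)

// buildTar writes the given files into an in-memory tar archive (demo helper).
func buildTar(files map[string][]byte) (*bytes.Buffer, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for name, data := range files {
		hdr := &tar.Header{Name: name, Typeflag: tar.TypeReg, Mode: 0o600, Size: int64(len(data))}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write(data); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return &buf, nil
}

// checkSizes inspects the tar headers and fails on the first entry whose
// declared size exceeds limit, before reading that entry's contents.
// Entries under the limit are skipped by tr.Next, which discards their
// data rather than buffering it.
func checkSizes(r io.Reader, limit int64) error {
	tr := tar.NewReader(r)
	for {
		hdr, err := tr.Next()
		if errors.Is(err, io.EOF) {
			return nil
		}
		if err != nil {
			return err
		}
		if hdr.Size > limit {
			return fmt.Errorf("%s: declared size %d exceeds limit %d", hdr.Name, hdr.Size, limit)
		}
	}
}

func main() {
	archive, err := buildTar(map[string][]byte{
		"policy.rego": []byte("package example"),
		"data.json":   bytes.Repeat([]byte("x"), 4096),
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(checkSizes(archive, 1024))
}
```

Bundles are typically gzip-compressed, so a real loader would wrap the stream in a decompressing reader first; that step is left out here to keep the sketch short.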

anderseknert commented on May 9, 2024

Looking at the implementation now, and maybe I'm missing something, but it seems like the NextFile() function greedily reads all files on the first invocation, then serves them from cache on subsequent calls as long as there are more to return. We then call readFile() on each of the files (which we've already read), which copies them to yet another buffer. Here we use Read with a limit, which will fail if the size limit is exceeded. But at this point we've already read the entire file once, and now we read it again up to the size limit. So if we've set a size limit of 1 GB and we have a file in the tarball which is 5 GB, we'll now have 6 GB of data buffered (5 GB from the first read plus 1 GB from the limited copy) before we return an error. In case we don't hit the size limit, we'll effectively have each file buffered twice, spending twice as much memory as required.
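To make the point about the limited read concrete, here is a minimal sketch of a bounded read in Go. It is not OPA's actual readFile(); the names `readBounded`, `newSource`, and `zeroReader` are hypothetical, and 1 MiB stands in for the 1 GB limit from the example. A bounded read like this only caps memory when its source is a stream. If the source is a buffer that already holds the whole file, as described above, peak usage is the full file plus up to the limit.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

var errSizeLimit = errors.New("size limit exceeded")

// zeroReader is an endless stream of zero bytes, used to model a large
// file without allocating it.
type zeroReader struct{}

func (zeroReader) Read(p []byte) (int, error) {
	for i := range p {
		p[i] = 0
	}
	return len(p), nil
}

// newSource returns a streaming reader that yields exactly n bytes.
func newSource(n int64) io.Reader {
	return io.LimitReader(zeroReader{}, n)
}

// readBounded copies at most limit+1 bytes from r. Reading one extra byte
// distinguishes "exactly limit bytes" from "more than limit bytes".
func readBounded(r io.Reader, limit int64) ([]byte, error) {
	var buf bytes.Buffer
	n, err := io.Copy(&buf, io.LimitReader(r, limit+1))
	if err != nil {
		return nil, err
	}
	if n > limit {
		return nil, errSizeLimit
	}
	return buf.Bytes(), nil
}

func main() {
	const limit = 1 << 20 // 1 MiB stands in for the 1 GB limit
	// The source is five times the limit, but at most limit+1 bytes are
	// ever buffered because the source is streamed, not preloaded.
	_, err := readBounded(newSource(5*limit), limit)
	fmt.Println(err) // prints "size limit exceeded"
}
```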

Some alternative approaches I can think of:

1. Have NextFile read lazily, i.e. only one file from the tarball per invocation, and return the reader without copying the buffer. This seems like a reasonable expectation for a function called "NextFile", but it's a breaking change in behavior, as callers would now be required to read the stream of the "file" returned before calling NextFile again.
2. Keep the behavior of NextFile, but store the bytes.Buffer on the descriptor rather than an io.Reader, and make it accessible. That way we can reuse the buffer we've read elsewhere, and avoid the second copy.
3. Pass the size limit to the tarball loader, and have it fail immediately when the size reported exceeds the limit, i.e. before we read anything more than the header.

2 and 3 aren't mutually exclusive, and it would be good to have both done: 2 to avoid keeping two copies of the file, and 3 to avoid reading files that exceed the limit.

As for 2, perhaps there is some more elegant non-intrusive way to do it. I'm open to suggestions.
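For illustration, the first alternative (a lazy NextFile) combined with the header check from the third could look roughly like the sketch below. Everything here is hypothetical: `descriptor`, `lazyLoader`, `loadAll`, and `buildTar` are made-up names for this example and do not mirror OPA's loader types. It also assumes a plain tar stream.

```go
package main

import (
	"archive/tar"
	"bytes"
	"errors"
	"fmt"
	"io"
)

// descriptor is a hypothetical stand-in for a bundle file descriptor.
type descriptor struct {
	Path   string
	Size   int64     // size declared in the tar header
	Reader io.Reader // only valid until the next call to NextFile
}

// lazyLoader yields one tar entry per NextFile call instead of reading
// the whole archive up front.
type lazyLoader struct {
	tr    *tar.Reader
	limit int64
}

func newLazyLoader(r io.Reader, limit int64) *lazyLoader {
	return &lazyLoader{tr: tar.NewReader(r), limit: limit}
}

// NextFile advances to the next regular file. It returns io.EOF when the
// archive is exhausted, and an error if the declared size exceeds the limit.
func (l *lazyLoader) NextFile() (*descriptor, error) {
	for {
		hdr, err := l.tr.Next()
		if err != nil {
			return nil, err
		}
		if hdr.Typeflag != tar.TypeReg {
			continue
		}
		if hdr.Size > l.limit {
			return nil, fmt.Errorf("%s: size %d exceeds limit %d", hdr.Name, hdr.Size, l.limit)
		}
		return &descriptor{Path: hdr.Name, Size: hdr.Size, Reader: l.tr}, nil
	}
}

// buildTar writes the given files into an in-memory tar archive (demo helper).
func buildTar(files map[string]string) ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for name, content := range files {
		hdr := &tar.Header{Name: name, Typeflag: tar.TypeReg, Mode: 0o600, Size: int64(len(content))}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write([]byte(content)); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// loadAll drains each file before asking for the next one, which is the
// contract a lazy NextFile imposes on its callers.
func loadAll(archive []byte, limit int64) (map[string]string, error) {
	l := newLazyLoader(bytes.NewReader(archive), limit)
	out := map[string]string{}
	for {
		f, err := l.NextFile()
		if errors.Is(err, io.EOF) {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		data, err := io.ReadAll(f.Reader)
		if err != nil {
			return nil, err
		}
		out[f.Path] = string(data)
	}
}

func main() {
	archive, err := buildTar(map[string]string{"a.rego": "package a", "data.json": "{}"})
	if err != nil {
		panic(err)
	}
	files, err := loadAll(archive, 1024)
	fmt.Println(len(files), err)
}
```

The trade-off is the one named in the list: the reader handed out is only valid until the next NextFile call, so existing callers that collect descriptors first and read them later would break.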

dolevf commented on May 9, 2024

Thanks for investigating, Anders. Do we need a CVE assigned to this?

anderseknert commented on May 9, 2024

Anytime, Dolev 🙂 Quite pleased with the result!

As for a CVE, I'd lean towards no. OPA will need to run under the premise that a remote bundle server can be trusted. If that is not the case, an OOM is about the least harmful thing a malicious actor could accomplish. If they can tamper with the contents of bundles, they could e.g. change an authorization policy to allow them access, or in the case of discovery bundles, change OPA's configuration to e.g. turn off decision logging or whatnot.

Having the size limit exceeded in real-world deployments is most likely going to happen by accident, e.g. when a user includes a big file by mistake. The fact that a mistake could cause an OOM is of course not good, and I'm happy to see this fixed. But similarly, there are many mistakes one might make as a bundle server "owner" which could have quite severe consequences, so I can't say I'm more worried about this than other imaginable scenarios.

Happy to discuss more if you think differently!

dolevf commented on May 9, 2024

Hi,

I agree that if a bundle server is compromised, DoS is the least interesting abuse case. But then again, bundle signing is also a thing, so there's some assumption things can get risky even if you're supposedly trusting your bundle server, no?

Totally up to the project to decide at the end of the day :) what's important is that it's fixed!

anderseknert commented on May 9, 2024

You're right — I didn't really make the distinction between whoever creates the bundle and who hosts it, and I should have! Indeed, bundle signing is a good extra measure. Assuming that is in place, the only actor who could accomplish this would be whoever built and signed the bundle, and if that's a malicious actor... not a whole lot we can do then.

Let's leave it as a (soon to be fixed!) issue for now. If others have opinions here, please make your voices heard :)

Thanks Dolev!
