I wasn't actually done with my review on #3. I threw out my back and wasn't able to get to it on Friday. Here's the rest of it:
I would like to see the command logged to the end user. Above, it's hard-coded to "Running pip install". The logged command should be exactly what we run for the user, so if they copy and paste it (and of course modify any paths) they get the exact same result.
It also looks like the results rely on the `PYTHONUSERBASE` env var, so I would like to see that in the output as well.
In addition to local debugging, it can also help buildpack maintainers spot accidental discrepancies.
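Roughly what I have in mind, as a minimal sketch: the function name, flags, and `log_info` helper here are all made up (the helper just mimics the buildpack's logging style), but the key point is that the logged string is the same one we execute, `PYTHONUSERBASE` included.

```rust
use std::path::Path;

// Sketch only: `log_pip_install` and the exact flags are assumptions,
// not the buildpack's real invocation. The logged string is built once
// and is exactly what gets run, PYTHONUSERBASE included.
fn log_pip_install(user_base: &Path, requirements: &Path) {
    let command = format!(
        "PYTHONUSERBASE={} pip install --user --requirement {}",
        user_base.display(),
        requirements.display()
    );
    log_info(format!("Running: {command}"));
}

fn log_info(message: impl AsRef<str>) {
    println!("{}", message.as_ref());
}
```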
Same as above. We're streaming this command, but not announcing the (exact) command that's being streamed. If we don't want to announce this specific command (since it seems to be a helper command rather than something a user might expect to be run against their code), then perhaps we move to only emitting the command string in the error message.
I want the exact command run to be in the logs or the error (or both).
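A rough sketch of the error-message option (this helper and its messages are hypothetical, not the buildpack's real error handling):

```rust
use std::process::Command;

// Hypothetical helper: render the exact invocation once, then reuse it
// in every failure message so the command is never lost, even if we
// choose not to announce it up front.
fn run_helper(mut command: Command) -> Result<(), String> {
    let rendered = format!("{command:?}"); // Debug shows the program and args
    let status = command
        .status()
        .map_err(|error| format!("Failed to start `{rendered}`: {error}"))?;
    if status.success() {
        Ok(())
    } else {
        Err(format!("`{rendered}` exited with {status}"))
    }
}
```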
The function name is `log_io_error`, but the function body says "unexpected error", which might not always be true. Or rather, if we're saying "unexpected error", I read that as synonymous with "not my fault" if I'm a customer reading it.
I imagine someone copying and pasting that function without looking too closely, thinking it's for handling all IO errors. For example, a new file format is added and reading it generates a `std::io::Error` due to a permissions problem or a bad symlink the customer checked in. In that case it wouldn't be so unexpected. Rename for clarity? `unexpected_io_error`?
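Roughly what I have in mind (the doc comment and message text are just illustrative):

```rust
use std::io;

/// Logs an `std::io::Error` we genuinely did not anticipate.
///
/// Not for expected failures (permissions problems, bad symlinks the
/// customer checked in, etc.); those deserve their own messages.
fn unexpected_io_error(error: &io::Error) {
    eprintln!("An unexpected I/O error occurred: {error}");
}
```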
Mentioned in Colin's PR. I would like to avoid early returns when there are only two branches. We can if/else this and eliminate the early return.
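For example (names invented, just showing the shape):

```rust
// With only two branches, a single if/else reads straighter than an
// early return. `restore_cache`/`clear_cache` are stand-ins.
fn handle_cache(cache_is_valid: bool) -> String {
    if cache_is_valid {
        restore_cache()
    } else {
        clear_cache()
    }
}

fn restore_cache() -> String {
    "restoring cache".to_string()
}

fn clear_cache() -> String {
    "clearing cache".to_string()
}
```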
I think we should `log_warning` instead of `log_info`.
In a bit of a surprise to even me, I'm going to advocate for less testing. I think testing build logic inside of `main.rs` should be enough here. One fewer docker boot at CI test time.
Also, we're effectively testing the output of `pack` here, which is subject to change. If you do want to keep this test, I would scope it to only the strings you control.
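A sketch of what I mean, with a stand-in assertion helper and placeholder output (treat the names and log lines as assumptions, not the real test code):

```rust
// Assert only on lines the buildpack itself emits, never on pack's own
// formatting. `output` is a placeholder for the combined pack output.
fn assert_contains(output: &str, expected: &str) {
    assert!(
        output.contains(expected),
        "expected output to contain {expected:?}\nfull output:\n{output}"
    );
}

fn main() {
    let output = "...\nRunning pip install\n..."; // placeholder
    assert_contains(output, "Running pip install");
}
```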
Same as above. Doesn't need to be a docker test.
Regarding the error message (it's very good): when we say the user is missing files, it would be nice for us to give them an `ls` of the files we see in that directory. So at a glance they could see in one window that we're looking for `requirements.txt` but they have `REQUIREMENTS.txt` (or something).
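A rough sketch, assuming we still have the app dir handy when building the error message (the function name is made up):

```rust
use std::io;
use std::path::Path;

// Build a sorted listing of the directory so casing mistakes like
// REQUIREMENTS.txt vs requirements.txt jump out in the error message.
fn directory_listing(app_dir: &Path) -> io::Result<String> {
    let mut names: Vec<String> = std::fs::read_dir(app_dir)?
        .filter_map(|entry| entry.ok())
        .map(|entry| entry.file_name().to_string_lossy().into_owned())
        .collect();
    names.sort();
    Ok(names.join("\n"))
}
```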
Comment: You could manually combine the streams yourself: `let output = format!("{}{}", context.pack_stdout, context.pack_stderr)`. I'm assuming the goal is to get all of the pack output on test failure.
Also I think you could get rid of this test. Essentially it's testing that a bad `requirements.txt` file triggers a non-zero `pip install`. The git test above it seems more useful as an integration test.
Testing caching. We can make these less brittle by asserting only individual lines (or even just parts of lines). I think asserting for "Using cached Python" and "Using cached pip" without the version numbers would be enough to convince me. Maybe a "Using cached typing_extensions" for good measure. All the other values and numbers will cause churn on this file and possibly failures on otherwise unrelated changes (if `libherokubuildpack` updates its logging style, for example).
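Something like this (the log lines are my guesses at the stable parts, not the real output):

```rust
// Version-agnostic cache assertions: check the stable prefixes only, so
// a Python or pip version bump doesn't churn the test.
fn main() {
    let output = "Using cached Python 3.11.2\n\
                  Using cached pip 23.0.1\n\
                  Using cached typing_extensions 4.5.0"; // placeholder
    for needle in [
        "Using cached Python",
        "Using cached pip",
        "Using cached typing_extensions",
    ] {
        assert!(output.contains(needle), "missing {needle:?} in output");
    }
}
```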
That comment applies to all the integration tests. They're really nice and easy to review in this format, but I don't want to have to update 10 files every time I add an Oxford comma (for example).
If we know that one change invalidates things, I would bet multiple changes would as well. I think this is covered in your unit tests.
Unit test should be okay.
Comment: This is a good idea.
Unit test should be fine.
Time to test
Not that it's a race, but time to run CI only ever goes up, and integration tests are historically one of the last things developers are willing to delete. Right now Python is ~6 min for integration tests while Ruby is ~3 min.
Ideally I would like to aim for <5 min for CI completion, with a maximum of around 10 min. Once you hit 10 min and a random glitch causes the tests to need to be re-run, you're pushing nearly half an hour for a single change, and it absolutely kills (my) productivity.
I think we should be aggressive about testing and safety. I also think we should consider pruning some of these tests now, as this is otherwise the fastest this CI suite will ever execute.