
Comments (18)

johnmaguire avatar johnmaguire commented on June 20, 2024

@hugohuynh I know this bug is very old, but can you give a little more detail about the problem? Was it a .tar file or a .zip file? How large were the archive and the files within it? What error did you get?

from archivestream-php.

johnmaguire avatar johnmaguire commented on June 20, 2024

Closing this issue due to lack of information. The problem has likely been fixed in the interim.


ianvonholt avatar ianvonholt commented on June 20, 2024

I have recently run into the same issue, but only when my archive streams above 670 MB in size.

I have a script that pulls all the files a user selects and then streams the archive on the fly. Anything under 670 MB and the archive is completely fine. Anything over, and the file is corrupt.

What information would you need to help debug?


johnmaguire avatar johnmaguire commented on June 20, 2024

Was this a tar or a zip?


ianvonholt avatar ianvonholt commented on June 20, 2024

It is a zip file.

Windows will flat out refuse to open the corrupt zip, which is expected. WinRAR will open it and show one huge file with a large, but incorrect, file size. Selecting additional items to add to the zip archive still increases the size of the corrupt archive, but you just end up with a single file listing.

[Screenshot: corrupt zip listing]

Again, as soon as the selected archive is under 670 MB, the archive displays correctly.

[Screenshot: correct zip listing]

Additionally, php.ini has some rather high memory limits and execution times for this server. Could this be an issue with Zip64 or GMP settings?
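For context, Zip64 structures only become necessary once an archive crosses the classic ZIP format's 32-bit size/offset fields or 16-bit entry count, which a 670 MB stream does not. A quick sanity check (illustrative numbers only):

```php
<?php
// Classic (non-Zip64) ZIP format limits, per the PKWARE APPNOTE:
// 32-bit size/offset fields and a 16-bit entry count.
$maxClassicBytes   = 0xFFFFFFFF; // ~4 GiB per file / archive offset
$maxClassicEntries = 0xFFFF;     // 65,535 entries

$archiveBytes = 670 * 1024 * 1024; // the ~670 MB threshold reported above

// 670 MB is far below the point where Zip64 is required.
var_dump($archiveBytes < $maxClassicBytes); // bool(true)
```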


johnmaguire avatar johnmaguire commented on June 20, 2024

Hmm. It's rather odd, since we haven't seen this in our environment, and our customers download ZIPs every day. Have you made sure there are no PHP errors occurring during execution? Is there any chance you might be able to get a broken ZIP to me some way, so that I can analyze it? If you'd like to email me a link to it privately, you can email [email protected].


ianvonholt avatar ianvonholt commented on June 20, 2024

It is pretty odd. The download setup has been working pretty well for a couple of months; however, we never really ran into the file size limitation until now.

There have been no errors generated by PHP within the environment.

I've e-mailed you a link to a corrupt zip file.


johnmaguire avatar johnmaguire commented on June 20, 2024

I'm seeing that the ZIP is lacking an end-of-central-directory signature. This is created when finish() is called on the object after all files have been added. Are you making sure that finish() is called at the end of execution?

Also, do you know if this issue affects tar files as well? (You can test by passing anything into the instance_by_useragent method that doesn't contain the term "windows".)


johnmaguire avatar johnmaguire commented on June 20, 2024

Err, sorry, please verify that both complete_file_stream() and finish() are being called, if you're using the stream methods.
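A rough sketch of the expected call order, using a stand-in recorder class in place of the real ArchiveStream object (the method names come from this thread; the signatures here are assumptions, not the library's actual API):

```php
<?php
// Illustrative stand-in that records the call order described above;
// the real methods live in ArchiveStream-php.
class StreamOrderRecorder
{
    public $calls = [];

    public function init_file_stream_transfer($name, $size) { $this->calls[] = 'init'; }
    public function stream_file_part($data)                 { $this->calls[] = 'part'; }
    public function complete_file_stream()                  { $this->calls[] = 'complete'; }
    public function finish()                                { $this->calls[] = 'finish'; }
}

$stream = new StreamOrderRecorder();
$stream->init_file_stream_transfer('example.txt', 11);
$stream->stream_file_part('hello world');
$stream->complete_file_stream(); // per-file footer (CRC, sizes)
$stream->finish();               // central directory + end-of-central-directory record

// Skipping the last two calls leaves the ZIP without an
// end-of-central-directory signature, matching the corruption above.
```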


ianvonholt avatar ianvonholt commented on June 20, 2024

The archive is indeed failing to complete.

The call is pretty basic using the ArchiveStream from the example.

$zip = ArchiveStream::instance_by_useragent( 'CPF_WebFTP_Download_' . date('Y_m_d-H:i:s') );

foreach ($pickedFiles as $file) {
    $zip->add_file_from_path($file['internal'], $file['path']);
}

$zip->finish();

The add_file_from_path call is correctly reaching the add_large_file function, initializing the transfer via init_file_stream_transfer, and then dying on an fread call whenever the archive streams anything above the previously mentioned 670 MB. So it seems that a stream_file_part is missed.

Switching to fgets seems to solve the issue in my current environment. However, this seems like a bad idea, since fgets is line-oriented and a truthiness check on its return value can end the read loop early on binary data.
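The early-termination risk can be shown with a self-contained snippet (illustrative only, not the library's code): a final line consisting of the single byte "0" is a falsy string, so a truthiness-based fgets() loop drops it, while fread() with an explicit feof() check does not.

```php
<?php
// Demonstrate why `while ($data = fgets($fh))` is fragile on binary data.
$path = tempnam(sys_get_temp_dir(), 'frd');
file_put_contents($path, "abc\n0"); // 5 bytes, ending in a bare "0" (falsy string)

// Fragile: the loop stops when fgets() returns the falsy string "0".
$fh = fopen($path, 'rb');
$viaFgets = '';
while ($data = fgets($fh)) { $viaFgets .= $data; }
fclose($fh);

// Robust: read fixed-size chunks until feof().
$fh = fopen($path, 'rb');
$viaFread = '';
while (!feof($fh)) {
    $chunk = fread($fh, 8192);
    if ($chunk === false) { break; }
    $viaFread .= $chunk;
}
fclose($fh);
unlink($path);

var_dump(strlen($viaFgets)); // int(4) -- lost the trailing "0"
var_dump(strlen($viaFread)); // int(5) -- complete
```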


johnmaguire avatar johnmaguire commented on June 20, 2024

The switch to fread() is quite new. Previously we were using fgets(), and in production here at Barracuda, we're still using the older version with fgets() (I was actually just about to update.)

As such, I'm not surprised there's some unexpected behavior.

While I'm ironing out the issue with fread(), there shouldn't be concerns about fgets() and newlines. For example:

jmaguire@ZimsBase [02:41:36] [~/Repositories/cuda/backup] [release/6.0.08 *]
-> % cat > asdf
test
123
jmaguire@ZimsBase [02:55:26] [~/Repositories/cuda/backup] [release/6.0.08 *]
-> % php -a
Interactive shell

php > $fh = fopen('asdf', 'r');
php > while ($data = fgets($fh)) { echo $data; }
test
123


johnmaguire avatar johnmaguire commented on June 20, 2024

A little shot in the dark: Is the file you're adding to the ZIP being read from over a socket?


ianvonholt avatar ianvonholt commented on June 20, 2024

Nope.

All the files are local to the codebase. I looked into the possibility of a socket_timeout occurring, for some weird reason, but there was no indication that this was happening.

I'll do a bit more testing to see if I can track down exactly what is causing fread to fail, but for me, switching back to fgets fixed the problem.

A bit more about the server:
CentOS release 6.6 (Final)
PHP 5.4.29
Apache/2.2.27


johnmaguire avatar johnmaguire commented on June 20, 2024

Interesting. I'll try doing some tests locally using the add-from-file call, as we mainly use this library for creating the file on the fly while streaming from multiple parts. Thanks a lot for the report. :)


johnmaguire avatar johnmaguire commented on June 20, 2024

Can't reproduce using a 5-byte text file, a 750 MB text file, and a 750 MB binary file... script below:

<?php

// Created for issue: https://github.com/barracudanetworks/ArchiveStream-php/issues/1
// Debugging problem that caused re-open (reported by ianvonholt)

// Just in case
ini_set('max_execution_time', 600);

// Switch to false for a tar file
define('ZIP_FILE', true);

include_once('stream.php');

// Hack to get a zip file even on Linux and vice-versa
if (ZIP_FILE) { $_SERVER['HTTP_USER_AGENT'] = 'windows'; } else { $_SERVER['HTTP_USER_AGENT'] = 'linux'; }

$files = [
    '5B.txt' => 'files/5B.txt',
    '750M.txt' => 'files/500M.txt',
    '750M.bin' => 'files/750M.bin',
];

$zip = ArchiveStream::instance_by_useragent('fread');
foreach ($files as $file => $path)
{
    $zip->add_file_from_path($file, $path);
}

$zip->finish();


johnmaguire avatar johnmaguire commented on June 20, 2024

It almost seems like WinRAR isn't respecting the third bit being set in the general purpose flag of the local file header (this bit says to ignore the zeroed CRC and size fields there and to read them from the data descriptor that follows the file data). It's odd that switching back to fgets would fix the problem, though.

Just to clarify, is the only difference fread -> fgets? Or did you check out an earlier commit that used fgets?
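The flag bit in question can be inspected directly in a ZIP's first local file header. A minimal sketch, assuming the common ext/zip extension is available (the file contents here are made up):

```php
<?php
// Check bit 3 of the general purpose flag in a ZIP's first local file
// header; when set, the CRC-32 and sizes live in a data descriptor
// after the file data rather than in the header itself.
$path = tempnam(sys_get_temp_dir(), 'zip');
$zip = new ZipArchive();
$zip->open($path, ZipArchive::OVERWRITE);
$zip->addFromString('hello.txt', 'hello');
$zip->close();

$header = file_get_contents($path, false, null, 0, 30); // fixed part of the local file header
assert(substr($header, 0, 4) === "PK\x03\x04");         // local file header signature

$flags = unpack('v', substr($header, 6, 2))[1];         // little-endian uint16 at offset 6
$usesDataDescriptor = ($flags & 0x0008) !== 0;

// ZipArchive knows the sizes up front and writes to a seekable file, so
// bit 3 is normally clear here; a streaming writer sets it instead.
var_dump($usesDataDescriptor);
unlink($path);
```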


johnmaguire avatar johnmaguire commented on June 20, 2024

@ianvonholt I've tried reproducing this in house a few times, and can't seem to do it.

Could you verify that output buffering is turned off? You can do this with the following snippet:

while (ob_get_level() > 0) {
    ob_end_clean();
}

Many frameworks can turn this on by default.


johnmaguire avatar johnmaguire commented on June 20, 2024

@ianvonholt If you're still interested in trying to get this fixed, please try running the script I provided and let me know if you get a corrupt ZIP. Otherwise, I'll close this issue as CNR (could not reproduce).

