
pcrov commented on September 15, 2024

JsonReader works in a forward-only manner, so if you might need prior data you'll need to hang onto it until that determination can be made.

The easiest way to do this would be to step into the array, grab each object in full, check the parentCategory and ignore any that don't match. E.g.:

$reader->read(); //Outer array
$reader->read(); //First object
$depth = $reader->depth();

do {
    $object = $reader->value();
    if ($object["parentCategory"] === "2570") {
        var_dump($object);
    }
} while ($reader->next() && $reader->depth() === $depth);

Note that numbers are returned as strings (this will likely become optional in a future version), and because you didn't get the opportunity to inspect them as they were read, you'll lose their type information this way.

If you need to retain that information and you know ahead of time which fields should be numbers, it's easy to fix them up:

$reader->read(); //Outer array
$reader->read(); //First object
$depth = $reader->depth();

do {
    $object = $reader->value();
    if ($object["parentCategory"] === "2570") {
        $object["id"] = +$object["id"];
        $object["parentCategory"] = +$object["parentCategory"];
        var_dump($object);
    }
} while ($reader->next() && $reader->depth() === $depth);

(The unary + will cast to int or float as appropriate automagically.)
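For reference, here is how that cast behaves on its own, outside of JsonReader. The sample values below are made-up examples, not taken from the data in this thread:

```php
<?php
// Unary + casts a numeric string to int or float as appropriate.
// (Hypothetical example values.)
var_dump(+"2570"); // int(2570)
var_dump(+"4.25"); // float(4.25)
var_dump(+"9e3");  // float(9000)
```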

Let me know if this doesn't work out for whatever reason. There is always another way to do things, it just might be a bit more cumbersome.

jenseo commented on September 15, 2024

Wow, that's exactly what I was looking for! And yes, I'll probably know beforehand which fields will be numbers, so I should be able to fix things up :)

Thank you so much for your help and for your great work, such a versatile tool!

jenseo commented on September 15, 2024

Hi again @pcrov, I wanted to follow up on this and ask you about the following:

I'm importing a rather large JSON file and trying to stream it from the API server using fopen. I'm having a bit of trouble making it efficient, though; the parsing seems to take a really long time.

Right now my code looks like this:

$fp = fopen($filename, 'r'); // open the file for reading ('rw' is not a valid fopen mode)

$reader = new JsonReader();
$reader->stream($fp);
$reader->read(); //Outer array
$reader->read(); //First object
$depth = $reader->depth(); //Check depth
$object_array = array(); //Set up empty array
do {
    $object = $reader->value(); //Store object before check
    if ($object["category"] === $category_id) { //Do the check
        $object_array[] = $object; //Store object in array
    }
    unset($object); // free memory?
} while ($reader->next() && $reader->depth() === $depth);
$json_object = json_encode($object_array, JSON_PRETTY_PRINT); //Convert array to nice JSON
echo $json_object; //Output JSON
$reader->close();
fclose($fp);
unlink($filename); // delete file

As you can see, I've added the line:

unset($object);

in an attempt to free memory, but I'm not sure it has any effect. Does this look like a good solution to you?

Thanks!

// Jens.

pcrov commented on September 15, 2024

If you haven't already done so, upgrade to the latest release, 0.7.0; it's significantly faster than prior versions. There are still more speed improvements in the works, but nothing quite like the jump 0.7.0 made.

Make sure xdebug isn't loaded at all. Even when not enabled, the extension has a massive performance impact.

Parsing a stream directly from a remote API, while supported, won't be as quick as parsing a local file; from the code you've posted, though, it looks like you're dealing with a local file already.

I wouldn't expect unset to do much good there, as $object is immediately overwritten on the next iteration anyway. Besides, worrying about memory consumption when your problem is speed only makes sense if you're hitting swap (or garbage-collection issues, but that shouldn't be a problem here), and on the memory front you're going to be bound by the growing $object_array.
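This is easy to check in plain PHP, with no JsonReader involved: reassigning $object drops the refcount on the previous value, so it's freed whether or not unset() is called first. A small sketch (the loop sizes are arbitrary):

```php
<?php
// Reassignment frees the old value just like unset() would:
// after the loop, only the final array remains in memory.
$before = memory_get_usage();
for ($i = 0; $i < 1000; $i++) {
    $object = array_fill(0, 1000, $i); // overwrites the previous array
}
$growth = memory_get_usage() - $before;
// Growth reflects roughly one 1000-element array, not a thousand of them.
var_dump($growth < 200000);
```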

At the end of the day, parsing massive files in PHP can only be so fast, and the low memory consumption you get from a streaming parser will always come at the expense of speed. This kind of job is best suited to running as a background task and checking the result later.
