Concurrent execution (pybison, CLOSED)

smvv commented on July 28, 2024
Concurrent execution

from pybison.

Comments (11)

arquer commented on July 28, 2024

I really really need help on this... everything else I can manage on my own just like I did with the locations issue I had, but this, I really have no idea where to look...

smvv commented on July 28, 2024

arquer commented on July 28, 2024

Thanks for the answer Sander!

As you can read in my post my case is not really concurrent, so I am not sure if my problem is caused by this.
Basically, I have a Python script which parses many files, one after the other, in a simple for-loop. When I do this, I have noticed that only the first file gets parsed correctly; subsequent calls to the parser either return the same object as the first call or None.
I have tried calling parser.reset() between files and even creating a new instance of the parser for every file, but the same keeps happening.

Do you think this is related to the same issues that limit concurrent execution? Have you ever seen something like this on your end? Any ideas?

EDIT:
Just to give you more information: I have tried declaring a new parser class for every file to be parsed, and even compiling the pybison library in a different directory on each run so that they would hopefully not collide, yet the same thing keeps happening. Since the libraries are completely different here, I suspect it might not have that much to do with library sharing after all; it might be much simpler...

smvv commented on July 28, 2024

arquer commented on July 28, 2024

Thanks again,

I am afraid that is not really a solution for me. I use a pretty complex build system that relies on scons as its backbone. There are many targets in my builds, and knowing exactly how many times the parser will be run is impossible beforehand (the whole point of the build system is to keep things modular, so that one target does not really know about the others...).

I will keep investigating (I have found out about the %define api.pure directive), and I will also check whether the lexer is really being reset when reset() is called.
I will let you know what I find.

smvv commented on July 28, 2024

thisiscam commented on July 28, 2024

@arquer Do you have some code examples? I don't think the use case you mentioned should be a problem, though.

arquer commented on July 28, 2024

@thisiscam if I don't manage to solve it, I will work on a minimal example which exhibits this error.

The frustrating part is that I have my parser working really well but until I solve this the tool is useless to me :(

arquer commented on July 28, 2024

Ok I might know what is happening.

I was able to turn my parser into a reentrant parser very quickly by following the bison guide. However, this does not seem to fix my problem (I don't know whether my parser can now be called concurrently as a result; is that so?).
I also dug a bit more into the lexer reset. In bisondynlib_open we try to look up the symbol "reset_flex_buffer" in the library; if it exists, a call to bisondynlib_reset forwards the call to the lexer reset.
I wanted to see whether resets were indeed being performed, so I changed bisondynlib_reset to the following:

    void bisondynlib_reset(void)
    {
        /* reset_flex_buffer is the pointer looked up in bisondynlib_open */
        if (reset_flex_buffer) {
            printf("reset_flex_buffer is not void, so resetting!!!\n");
            reset_flex_buffer();
        } else {
            printf("Order to reset parser received but could not be performed, reset_flex_buffer is NULL!\n");
        }
    }

Then, when I run said script, which resets the parser each time it has to parse a new file, I see the message from the else {} clause printed. So evidently the lexer reset is not really being called.

My question is (and keep in mind I am no expert): in bisondynlib_open we attempt to retrieve a symbol called "reset_flex_buffer". I have looked up and down and grepped, and I see no definition of that function anywhere, nor is one found in the source code generated by bison/flex... So where is this function supposed to be found? Who is supposed to provide an implementation for it?
Looking around bison's documentation, I saw that the flex buffer is supposed to be reset by calling YY_FLUSH_BUFFER. Shouldn't we try to call that instead? Or is "reset_flex_buffer" supposed to forward the call?

Evidently YY_FLUSH_BUFFER is a define, I looked around in the flex generated code and this is the definition:
#define YY_FLUSH_BUFFER yy_flush_buffer(YY_CURRENT_BUFFER )
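
The optional-symbol lookup described in this comment can be illustrated from Python with ctypes. This is only a rough sketch of the pattern, not pybison's actual code: `lookup_optional` and `try_reset` are hypothetical helper names, and the library handle here is simply the running process (which works on Linux and macOS), where "reset_flex_buffer" is assumed not to be exported, just as in the generated parser library discussed above.

```python
import ctypes


def lookup_optional(lib, name):
    """Return the exported symbol `name`, or None when `lib` lacks it.

    ctypes raises AttributeError for a missing symbol, so getattr with a
    default plays the role of a NULL result from a dlsym-style lookup.
    """
    return getattr(lib, name, None)


def try_reset(hook):
    """Call the reset hook if it was found; report whether it ran."""
    if hook is None:
        return False  # same situation as the else branch in the C snippet
    hook()
    return True


# Handle to the symbols already loaded in this process (Linux/macOS).
process = ctypes.CDLL(None)

# Assumption: nothing in this process exports reset_flex_buffer.
reset_hook = lookup_optional(process, "reset_flex_buffer")

# A symbol that does exist (from the C library), for contrast.
strlen = lookup_optional(process, "strlen")
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t

print("reset hook found:", reset_hook is not None)
print("reset performed:", try_reset(reset_hook))
print("strlen found:", strlen is not None)
```

If the lexer never exports a function under the name being looked up, the hook stays empty and every reset request silently does nothing, which matches the message seen in the else clause.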

arquer commented on July 28, 2024

Hello,

I have been doing further tests without any luck, so I thought I would make an example application.

I suspect it has to do with the lexer, probably with its state. The reason is that I inserted some printfs in the py_input function to track the source actually passed to the parser, and I have seen that py_input is usually called only once (twice at most), and only while the first file is being parsed; subsequent parses don't even call it... Should we do some extra resetting/housekeeping after the end token is found? Or should yyterminate terminate everything?

So, I made a sample which exhibits the same error.
I basically grabbed the calc example, modified it so that the parsed output was returned instead of printed, re-wrote the run.py script, and removed the read() method which grabbed input from the command line.
I also modified two of the C files: bison_callback.c, where I added some printfs to track the calls to py_input, and bisondynlib-linux.c, where I changed the lexer reset symbol that the code looks for to "yy_flush_buffer" and also added some printfs to the lexer reset.
As you will see in run.py, I basically run parsing on three files, source1/2/3.txt. Each of those files has an expression in it, all yielding different results. The script runs the test both by instantiating a new parser each time and by resetting the same parser on each run.
I have tried both with the source files as you can find them and with "quit" added at the end of each. Neither option worked: the first at least parsed the first file correctly, while adding the "quit" symbol at the end caused the parser to return None for all files, not just the ones after the first.

The test should be plug-and-play, just download, unzip, and hit ./run.py. Pybison is included inside (since I modified some stuff as explained).
If you are running it on an incompatible platform (doubtful, since there is no Windows support), just go into the pybison directory, run python setup.py build, and copy the results from build/lib...../ to the directory above it.
GitHub was giving me a hard time attaching a zip file (even though they are supposed to be supported), so I changed the extension to DOCX. Rename the extension to TAR and you are good to go.

I will really appreciate any help @smvv @thisiscam !

PS: Keep in mind my parser is quite different. It is a GLR parser, it is reentrant (at least that is what I think), and it has other minor modifications and tweaks, but since this example exhibits the same error I thought it would do the job.

example.docx

arquer commented on July 28, 2024

I think I have fixed it.

I basically made my whole parser/lexer reentrant, following the guide you (@smvv) suggested and some others. This means changing many function signatures and passing parameters back and forth between the lexer and the parser, since you cannot rely on global variables.
In the end I don't know if that was really the fix. What I found is that, the way I was running the script, runEngine() was actually only being run once.

I use the "read" parameter when invoking run() and don't pass any file to the parser whatsoever. I checked the code, and in that case the parser saves stdin as its "file". When the py_input callback realized that zero bytes had been read from the parser's read method, it went ahead and closed the "file", which in this case was stdin (I don't really know how this did not throw an exception, but I guess it is possible to close stdin somehow). Then, when the .run() method was called again (again with no file provided), it checked whether the parser's file attribute was a closed file, which it was, so it did not even call runEngine, skipping the whole while loop in bison/__init__.py and returning self.last, which was the last object that had been parsed...
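
The control flow described above can be reproduced with a small stand-in. This is a toy model under stated assumptions, not pybison's implementation: `ToyParser`, `FAKE_STDIN`, `reader_for`, and `engine_runs` are hypothetical names, the "parse result" is just the text that was read, and a shared StringIO plays the role of stdin.

```python
import io

# Shared stand-in for sys.stdin: every parser falls back to this object.
FAKE_STDIN = io.StringIO()


class ToyParser:
    """Hypothetical model of the behaviour described in this comment."""

    def __init__(self):
        self.file = None
        self.last = None
        self.engine_runs = 0

    def run(self, read, file=None):
        # No file given: remember the shared "stdin" as the parser's file.
        self.file = file if file is not None else FAKE_STDIN
        # Guard from the description: a closed file skips the engine.
        if not self.file.closed:
            self._run_engine(read)
        return self.last

    def _run_engine(self, read):
        self.engine_runs += 1
        chunks = []
        while True:
            data = read(64)
            if not data:
                # Zero bytes read: the input callback closes the "file".
                self.file.close()
                break
            chunks.append(data)
        self.last = "".join(chunks)


def reader_for(text):
    """Return a read(n) callable serving `text`, like the "read" parameter."""
    return io.StringIO(text).read


parser = ToyParser()
first = parser.run(read=reader_for("1 + 2"))
second = parser.run(read=reader_for("3 * 4"))        # engine is skipped
third = ToyParser().run(read=reader_for("5 - 6"))    # new instance, same "stdin"

print(first, second, third, parser.engine_runs)
```

In this model the reused parser hands back the first result again, and a brand-new instance returns None because the shared file is already closed. Both outcomes match the symptoms reported earlier in the thread.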

Anyway, I am happy, I can now use this I think!
