Comments (9)

rtobar commented on July 23, 2024

@smclay thanks for reporting this.

In the emails we exchanged about this, I suggested that we simply check whether the subscriber ID is already in the database and act differently depending on the result (i.e., succeed?). Now that I think about it a bit more, we should probably have the full picture before deciding how to proceed with this one.

The deeper problem, as I remember it, is that your logfiles are getting flooded with these messages. This means there is a counterpart to this server that keeps trying to create this subscription without ever giving up. That counterpart is probably this:

if stat.getStatus() != NGAMS_SUCCESS:
    msg = "Unsuccessful NGAS XML response. Status: %s, message: %s. Will try again later"
    logger.warning(msg, stat.getStatus(), stat.getMessage())
    continue

This check sits inside a loop that won't finish until all subscriptions are successfully made.

Subscription creation currently succeeds only if a new subscription is actually added to the database. The question is then what to do if a subscription with the same ID comes in. Here are some options:

  1. Assume that because the ID is the same as an existing subscription, the subscription parameters must be the same, and simply return successfully (200).
  2. Like option 1, but double-check that the subscription parameters are exactly the same. If they are, declare a success; if they are not, declare an error (400).
  3. Like option 2, but return a specific success code different from 200 when the parameters are the same.
  4. If a subscription ID is already taken just fail, but with a different error (some specific 4xx code).

I don't find options 1 and 2 too attractive because they declare a success even when a new subscription was not created, which I see as a bit of a contradiction in the semantics of the command; option 1 in particular is too simplistic, as it assumes too much. If we go for option 3 or 4, we'd need to change the code I quoted above so that it takes the new return code into account and reacts accordingly.
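
To make the options concrete, here is a minimal sketch (not NGAS code) of how a SUBSCRIBE handler could tell the three cases apart, combining options 3 and 4. The in-memory dict stands in for the subscribers table, and the function name, the parameter dicts and the status code values are all assumptions made for illustration only.

```python
# Hypothetical status codes: only "a plain success", "a different success"
# and "a specific 4xx" matter here, the exact numbers are placeholders.
CREATED_OK = 200          # a new subscription was added
ALREADY_SUBSCRIBED = 208  # same ID, same parameters (placeholder value)
CONFLICT = 409            # same ID, different parameters


def subscribe_status(existing, requested):
    """Return the status code for a SUBSCRIBE request.

    `existing` maps subscription IDs to parameter dicts and stands in
    for the database; `requested` is the incoming parameter dict.
    """
    subscr_id = requested["subscr_id"]
    current = existing.get(subscr_id)
    if current is None:
        # No subscription with this ID yet: create it.
        existing[subscr_id] = dict(requested)
        return CREATED_OK
    if current == requested:
        # Identical request, e.g. a subscriber restarting after a crash.
        return ALREADY_SUBSCRIBED
    # The ID is taken by a subscription with different details.
    return CONFLICT
```

With this split, the caller can treat the second case as harmless and the third as something a human needs to look at.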

Thoughts? There are probably other alternatives I haven't thought of.

from ngas.

smclay commented on July 23, 2024

@rtobar I tried to look at the subscriber side logs but unfortunately they have already been deleted.

I am not sure, but I believe that when the subscriber shuts down it sends an UNSUBSCRIBE command to the subscription publisher. Is that correct? If so, after a crash or messy shutdown the subscription entry will probably not be removed. On restart, the subscriber will send a new SUBSCRIBE command, which should be identical to the entry already stored in the database. I think this is the most likely scenario for duplicate requests, and it should be quite harmless.

Therefore I agree that options 3 and 4 are better solutions. The subscriber can log a warning when option 3 occurs. In the event of option 4, I would suggest sending a notification email to alert the NGAS administrator. If the administrator is expecting data to flow but the subscription is rejected, it is better to inform them so they can act on it; otherwise they may not notice, and the problem could go undetected for hours or days.

rtobar commented on July 23, 2024

@smclay I've pushed a few changes to the issue-50 branch. I've implemented the changes both in the SUBSCRIBE command (returning different HTTP codes when a subscription already exists) and in the automatic-subscription-on-server-startup logic, which should now look at these HTTP codes and react accordingly. In particular, an error is logged, an email is sent, and no further attempts are made to create the faulty subscription.
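
As a rough illustration of the subscriber-side behaviour described above, the sketch below retries transient failures but gives up on a conflict after logging an error and sending a notification. It is not the code in the issue-50 branch: the function name, the `send_subscribe` and `notify` callables, the `max_rounds` limit and the success code are hypothetical. Only the 409 conflict code matches what the publisher reports later in this thread.

```python
import logging

logger = logging.getLogger(__name__)

SUCCESS_CODES = (200,)  # assumed success code(s)
CONFLICT = 409          # different subscription with the same ID


def create_remote_subscriptions(pending, send_subscribe, notify, max_rounds=3):
    """Try to create each pending subscription.

    `send_subscribe(subscr_id)` returns an HTTP status code and
    `notify(subscr_id)` alerts the administrator; both are stand-ins.
    Returns the lists of established and abandoned subscription IDs.
    """
    established, abandoned = [], []
    for _ in range(max_rounds):
        if not pending:
            break
        # Iterate over a copy so entries can be removed safely.
        for subscr_id in list(pending):
            code = send_subscribe(subscr_id)
            if code in SUCCESS_CODES:
                established.append(subscr_id)
            elif code == CONFLICT:
                logger.error("Different subscription with ID '%s' already "
                             "exists, giving up", subscr_id)
                notify(subscr_id)
                abandoned.append(subscr_id)
            else:
                logger.warning("Subscription '%s' failed with HTTP code %d, "
                               "will try again later", subscr_id, code)
                continue  # keep it pending for the next round
            pending.remove(subscr_id)
    return established, abandoned
```

A real server would wait between rounds instead of retrying immediately; that delay is left out to keep the sketch short.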

Could you give these changes a go? I've added unit tests for the changes in SUBSCRIBE, but haven't yet tried automatically testing the rest (it will require some more effort), so if you could try to reproduce the situation and check that things are working it would be great.

smclay commented on July 23, 2024

@rtobar thanks for the update. I will build new packages and deploy over the weekend. Hopefully I will have some test results next week.

smclay commented on July 23, 2024

@rtobar I have carried out tests. For the most part it is working well. If the subscription already exists I get the following log messages from the subscriber host...

2021-05-31T12:05:49.721 [ 7901] [SUBSCRIBER] [ ERROR] ngamsServer.ngamsSrvUtils#_create_remote_subscriptions:151 Different subscription with ID '%s' already exists, giving up
2021-05-31T12:05:49.723 [ 7901] [SUBSCRIBER] [  INFO] ngamsLib.ngamsNotification#_sendNotifMsg:121 Sending Notification Message to: [email protected]. Subject: aat-ngas-6:8001: Automatic subscription cannot be created
2021-05-31T12:05:49.760 [ 7901] [SUBSCRIBER] [  INFO] ngamsServer.ngamsSrvUtils#_create_remote_subscriptions:164 Successfully subscribed to aat-ngas-5:8001 with url=http://aat-ngas-6.hq.eso.org:8001/QARCHIVE
2021-05-31T12:05:49.761 [ 7901] [SUBSCRIBER] [ ERROR] ngamsServer.ngamsSrvUtils#_create_remote_subscriptions:171 Error while adding subscription, will try later
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ngamsServer/ngamsSrvUtils.py", line 168, in _create_remote_subscriptions
    subscriptions.remove(subscrObj)
ValueError: list.remove(x): x not in list

There appear to be two bugs. The log message "151 Different subscription with ID '%s' already exists, giving up" has not been formatted properly, and there is a ValueError exception that could probably be caught and handled.
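
Both symptoms are easy to reproduce in isolation. The sketch below is a guess at the general mechanism, not the actual NGAS code: a log call whose format string has a '%s' placeholder but no argument prints the placeholder literally, and removing the same element from a list twice raises ValueError. The helper names are hypothetical.

```python
import logging


def format_log(msg, *args):
    """Render a message the way the logging module would."""
    record = logging.LogRecord("demo", logging.ERROR, "demo.py", 0,
                               msg, args, None)
    return record.getMessage()


def remove_if_present(items, item):
    """Remove `item` from `items`, tolerating an earlier removal."""
    try:
        items.remove(item)
        return True
    except ValueError:
        # The item was already removed, e.g. by the give-up branch.
        return False


template = "Different subscription with ID '%s' already exists, giving up"
unformatted = format_log(template)                   # '%s' left as-is
formatted = format_log(template, "aat-ngas-6:8001")  # ID substituted
```

Catching the ValueError hides the symptom; the log above suggests the give-up path also falls through to the "Successfully subscribed" path, so skipping the rest of the loop body after giving up may be the cleaner fix.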

On the publisher side we get the following log messages...

2021-05-31T12:05:49.656 [11060] [       R-0] [  INFO] ngamsServer.ngamsServer#handleHttpRequest:1806 Handling HTTP request: client_address=('134.171.18.32', 57492) - method=GET - path=|SUBSCRIBE?subscr_id=aat-ngas-6%3A8001&priority=1&url=http%3A%2F%2Faat-ngas-6.hq.eso.org%3A8001%2FQARCHIVE&start_date=2021-05-31T12%3A05%3A49.646|
2021-05-31T12:05:49.658 [11060] [       R-0] [  INFO] ngamsServer.ngamsCmdHandling#_get_module:81 Received command: SUBSCRIBE
2021-05-31T12:05:49.662 [11060] [       R-0] [  INFO] ngamsServer.commands.subscribe#handleCmd:117 Creating subscription for files >= 2021-05-31T12:05:49.646
2021-05-31T12:05:49.701 [11060] [       R-0] [  INFO] ngamsServer.ngamsServer#send_status:350 Returning status FAILURE with message Different subscription with ID 'aat-ngas-6:8001' existed and HTTP code 409
2021-05-31T12:05:49.705 [11060] [       R-0] [  INFO] ngamsServer.ngamsServer#send_data:320 Sending 369 bytes of data of type text/xml and headers {}
2021-05-31T12:05:49.705 [11060] [       R-0] [  INFO] ngamsServer.ngamsServer#handleHttpRequest:1833 Total time for handling request: (GET, SUBSCRIBE ,, ): 0.050 [s]

I also receive an email from the subscriber...

Notification Message:

NGAS attempted to create an automatic subscription with ID=aat-ngas-6:8003 to obtain data from aat-ngas-5:8003, but the remote server already has a subscription registered with the same ID, but different details.

Instead of retrying to create this subscription over and over, this server will give up now. To fit this either remove the remote subscription, or change the ID of the subscription to be created in the local server configuration.


Note: This is an automatically generated message

There is a typo 'fit' -> 'fix' in the email message.

smclay commented on July 23, 2024

@rtobar I noticed that shutting down the subscriber removes the subscriptions from the ngas_subscribers table. This also happens when the subscriber ID was already taken. I think this behaviour is probably the best solution.

rtobar commented on July 23, 2024

@smclay thanks for the detailed report. I've addressed all the points you mentioned (typo in email, unformatted log statement, ValueError issue) while maintaining the behavior at shutdown. If you pull the issue-50 branch you should see the new commit with the fixes. Let me know if things are working correctly for you. Hopefully I got things right this time, and if so I'll merge back to the master branch.

smclay commented on July 23, 2024

@rtobar I tested your latest changes today. Everything looks good...

2021-06-07T13:12:31.566 [31652] [SUBSCRIBER] [ ERROR] ngamsServer.ngamsSrvUtils#_create_remote_subscriptions:158 Different subscription with ID 'aat-ngas-6:8001' already exists, giving up
2021-06-07T13:12:31.568 [31652] [SUBSCRIBER] [  INFO] ngamsLib.ngamsNotification#_sendNotifMsg:121 Sending Notification Message to: [email protected]. Subject: aat-ngas-6:8001: Automatic subscription cannot be created
...
2021-06-07T13:12:41.601 [31652] [SUBSCRIBER] [  INFO] ngamsServer.ngamsSrvUtils#_create_remote_subscriptions:184 No Subscriptions established

I think we can now close this issue. Thank you for the improvements.

rtobar commented on July 23, 2024

Thanks @smclay for double-checking the new changes. I've just merged the issue-50 branch into master.

