Comments (12)

herlesupreeth commented on August 17, 2024

Hey @Hoernchen, thank you very much for the detailed pcaps and logs. I think it's a bug in the kamailio P-CSCF code. I will see whether I can fix it. In my opinion, the wrong Contact of the UE is deleted after pending_reg_expires + expires_grace.

from docker_open5gs.

herlesupreeth commented on August 17, 2024

@Hoernchen Please pull the latest changes from this repo, re-build the kamailio images and give it a try (no modifications to the expires values are required).

herlesupreeth commented on August 17, 2024

Those diffs do not make sense. Please post a pcap of the issue you are facing and I can take a look.

Hoernchen commented on August 17, 2024

Which parts do you need? It appears to be a P-CSCF issue: only half of the deregistration happens and the other half is perpetually stuck in a pending state, while the "valid" contacts on the S-CSCF side keep piling up with the expiry supplied by the UE, which is 600000 seconds.

Maybe related: kamailio/kamailio#3570

herlesupreeth commented on August 17, 2024

pending_reg_expires

The above modparam applies to UEs whose IMS registration remains in the pending state, not to UEs that register with the IMS successfully, i.e. for which a 200 OK for the SIP REGISTER is received.

subscription_expires

This is when the subscription of the UE expires at the P-CSCF. 36000 seconds equates to 10 hours.

delete_delay

This is the time after which the IPsec connection is removed once the UE de-registers from the network.

So I am not sure it's a P-CSCF issue.

Which parts do you need?

I suggest taking the pcap on the 'any' interface without any filters. The scenario should be as follows:

  1. Start the trace
  2. Register two UEs
  3. Wait for 1-2 mins
  4. Attempt a call
  5. Stop the trace and post it here

Hoernchen commented on August 17, 2024

You need to remove the ".zip" file suffix; the zip was way too large to attach, so I had to use zstd.
broken_after_3min.pcapng.zst.zip

herlesupreeth commented on August 17, 2024

The packets in the pcap file didn't seem to be in the right order: I saw a Diameter UAR request before the SIP REGISTER was even received. Can you please ensure that the timestamps on the machine running EPC + IMS and on the UE are the same? And that the UE is set to receive time from the network?

Also, can you tell me which eNB you are using?

Sorry, I can't be of much help on this. All I can confirm is that the issue you are facing is not due to those modparam values.

Hoernchen commented on August 17, 2024

Just increasing pending_reg_expires is enough to "fix" the issue; for some reason, waiting longer than (pending_reg_expires + expires_grace) appears to be the point at which stuff starts failing. It can also be "fixed" by putting the phones in airplane mode and turning them on again (until it fails again). No idea what is going on with the timestamps tbh. I usually don't look at all interfaces because I don't want to see everything five times, so I never noticed that ;)
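
As a rough illustration of the window described above, here is a minimal Python sketch that adds the two timers, using the values that appear in the kamailio_pcscf.cfg lines quoted in this thread (pending_reg_expires 30, expires_grace 120 or 12). The function name is hypothetical, and the sum only reflects the observed behaviour reported here, not how kamailio computes anything internally.

```python
def observed_failure_window(pending_reg_expires: int, expires_grace: int) -> int:
    """Seconds after registration at which failures were observed to start.

    This is just the sum reported in the thread, not kamailio's own logic.
    """
    return pending_reg_expires + expires_grace


# Values from the config lines quoted in this thread.
print(observed_failure_window(30, 120))  # 150
print(observed_failure_window(30, 12))   # 42
```

Raising pending_reg_expires pushes this window further out, which would be consistent with the "fix" only delaying the failure.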

herlesupreeth commented on August 17, 2024

Can you post a pcap of the working call scenario with the increased pending_reg_expires timer?

Hoernchen commented on August 17, 2024

The eNB is srsenb, btw.

One "fixed" attempt:
pending_9999999_callfix.pcapng.zst.zip

And another try, I can basically watch it fail really fast even before making a call by doing this:

--- a/pcscf/kamailio_pcscf.cfg
+++ b/pcscf/kamailio_pcscf.cfg
@@ -309,7 +309,7 @@ modparam("uac","restore_mode","none")
 
 # ----------------- Settings for RTimer ---------------
 # time interval set to 60 seconds
-modparam("rtimer", "timer", "name=NATPING;interval=60;mode=1;")
+modparam("rtimer", "timer", "name=NATPING;interval=2;mode=1;")
 modparam("rtimer", "exec", "timer=NATPING;route=NATPING")
 #!endif
 
@@ -398,7 +398,7 @@ modparam("ims_registrar_pcscf", "ignore_contact_rxport_check", 1)
 modparam("ims_registrar_pcscf", "pending_reg_expires", 30)
 modparam("ims_registrar_pcscf", "subscription_expires", 36000)
 modparam("ims_registrar_pcscf", "delete_delay", CONTACT_DELETE_DELAY)
-modparam("ims_usrloc_pcscf", "expires_grace", 120)
+modparam("ims_usrloc_pcscf", "expires_grace", 12)
 modparam("ims_usrloc_pcscf", "timer_interval", 2)

The point at which a call will fail is basically as soon as the OPTIONS "nat ping" breaks, which is roughly 30+12s after the successful registration according to the second pcap (below). The pcscf log for that point in time:

pcscf      | 98(135) ERROR: <script>: OPTIONS to sip:[email protected]:41786 via sip:192.168.101.3:41786;transport=tcp...
pcscf      | 98(135) ERROR: <script>: OPTIONS to sip:[email protected]:40643 via sip:192.168.101.2:40643;transport=tcp...
pcscf      | 107(144) ERROR: <script>: request sent to sip:[email protected]:41786 completed with code: 200, Type 1
pcscf      | 104(141) ERROR: <script>: request sent to sip:[email protected]:40643 completed with code: 200, Type 1
pcscf      | 85(122) INFO: cdp [authstatemachine.c:292]: auth_client_statefull_sm_process(): after callback of event 1
pcscf      | 94(131) INFO: cdp [authstatemachine.c:292]: auth_client_statefull_sm_process(): after callback of event 17
pcscf      | 94(131) INFO: cdp [authstatemachine.c:463]: auth_client_statefull_sm_process(): state machine: AUTH_EV_RECV_STA about to clean up
pcscf      | 98(135) ERROR: <script>: OPTIONS to sip:[email protected]:41786 via sip:192.168.101.3:41786;transport=tcp...
pcscf      | 98(135) ERROR: <script>: OPTIONS to sip:[email protected]:40643 via sip:192.168.101.2:40643;transport=tcp...
pcscf      | 107(144) ERROR: <script>: request sent to sip:[email protected]:41786 completed with code: 200, Type 1
pcscf      | 98(135) ERROR: <script>: OPTIONS to sip:[email protected]:41786 via sip:192.168.101.3:41786;transport=tcp...
pcscf      | 98(135) ERROR: <script>: OPTIONS to sip:[email protected]:40643 via sip:192.168.101.2:40643;transport=tcp...
pcscf      | 107(144) ERROR: <script>: request sent to sip:[email protected]:41786 completed with code: 200, Type 1
pcscf      | 85(122) ERROR: <script>: request sent to sip:[email protected]:40643 completed with code: 408, Type 2
pcscf      | 85(122) ERROR: <script>:   request sent to sip:[email protected]:40643: Fail Counter is 1
pcscf      | 85(122) INFO: cdp [authstatemachine.c:292]: auth_client_statefull_sm_process(): after callback of event 1
pcscf      | 94(131) INFO: cdp [authstatemachine.c:292]: auth_client_statefull_sm_process(): after callback of event 17
pcscf      | 94(131) INFO: cdp [authstatemachine.c:463]: auth_client_statefull_sm_process(): state machine: AUTH_EV_RECV_STA about to clean up
pcscf      | 98(135) ERROR: <script>: OPTIONS to sip:[email protected]:41786 via sip:192.168.101.3:41786;transport=tcp...
pcscf      | 98(135) ERROR: <script>: OPTIONS to sip:[email protected]:40643 via sip:192.168.101.2:40643;transport=tcp...
pcscf      | 85(122) ERROR: <script>: request sent to sip:[email protected]:40643 completed with code: 408, Type 2
pcscf      | 85(122) ERROR: <script>:   request sent to sip:[email protected]:40643: Fail Counter is 2
pcscf      | 98(135) ERROR: <script>: OPTIONS to sip:[email protected]:41786 via sip:192.168.101.3:41786;transport=tcp...
pcscf      | 98(135) ERROR: <script>: OPTIONS to sip:[email protected]:40643 via sip:192.168.101.2:40643;transport=tcp...
pcscf      | 85(122) ERROR: <script>: request sent to sip:[email protected]:41786 completed with code: 408, Type 2
pcscf      | 85(122) ERROR: <script>:   request sent to sip:[email protected]:41786: Fail Counter is 1

fail_natping2_grace12.pcapng.zst.zip
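
To make a log like the one above easier to read, a small Python sketch can tally the ping results per contact. This is only an illustration: the regular expression is an assumption based on the "completed with code" lines shown in this thread, the function name is hypothetical, and the contact URIs in the sample use placeholder user parts (ue1, ue2) because the real ones are not visible here.

```python
import re
from collections import defaultdict

# Assumed format, taken from the "completed with code" lines in the log above.
RESULT_RE = re.compile(r"request sent to (sip:.+?) completed with code: (\d+)")


def tally_ping_results(lines):
    """Return {contact: {"ok": n, "failed": m}} for the OPTIONS ping replies."""
    stats = defaultdict(lambda: {"ok": 0, "failed": 0})
    for line in lines:
        match = RESULT_RE.search(line)
        if match is None:
            continue  # "OPTIONS to ..." and "Fail Counter" lines are skipped
        contact, code = match.group(1), int(match.group(2))
        stats[contact]["ok" if 200 <= code < 300 else "failed"] += 1
    return dict(stats)


# Sample lines in the same shape as the log above; user parts are placeholders.
SAMPLE = [
    "pcscf      | 98(135) ERROR: <script>: OPTIONS to sip:ue1@192.168.101.3:41786 via sip:192.168.101.3:41786;transport=tcp...",
    "pcscf      | 107(144) ERROR: <script>: request sent to sip:ue1@192.168.101.3:41786 completed with code: 200, Type 1",
    "pcscf      | 85(122) ERROR: <script>: request sent to sip:ue2@192.168.101.2:40643 completed with code: 408, Type 2",
    "pcscf      | 85(122) ERROR: <script>:   request sent to sip:ue2@192.168.101.2:40643: Fail Counter is 1",
]

for contact, counts in tally_ping_results(SAMPLE).items():
    print(contact, counts)
```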

Hoernchen commented on August 17, 2024

I have been watching and "secretly" using your branch since Tuesday, and can confirm that the current repo appears to have fixed the issue and calls just keep working. Thanks for your very fast fix!

Just one minor complaint: the new, much lower nat ping interval leads to a lot of "spam" in the docker compose output. It would be great to suppress that in some way unless it fails, because even with just two active phones I mostly just see OPTIONS messages and have to scroll forever to find anything...
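
Until the interval or log level changes, one possible stopgap is to filter the output on the viewing side. The Python sketch below is an assumption-laden illustration, not part of the repo: the pattern is guessed from the NATPING log lines quoted earlier in this thread, the function name is hypothetical, and the sample contact URIs use placeholder user parts. It drops the per-ping "OPTIONS to" lines and the 2xx replies, and keeps everything else, including 408 replies and "Fail Counter" lines.

```python
import re

# Assumed noise: the ping attempt itself and any 2xx reply to it.
NOISE_RE = re.compile(
    r"<script>: (OPTIONS to sip:|request sent to sip:.+ completed with code: 2\d\d)"
)


def drop_ping_noise(lines):
    """Yield only the lines that are not routine nat ping chatter."""
    for line in lines:
        if NOISE_RE.search(line) is None:
            yield line


# Sample lines shaped like the pcscf log in this thread; user parts are placeholders.
SAMPLE = [
    "pcscf      | 98(135) ERROR: <script>: OPTIONS to sip:ue1@192.168.101.3:41786 via sip:192.168.101.3:41786;transport=tcp...",
    "pcscf      | 107(144) ERROR: <script>: request sent to sip:ue1@192.168.101.3:41786 completed with code: 200, Type 1",
    "pcscf      | 85(122) ERROR: <script>: request sent to sip:ue2@192.168.101.2:40643 completed with code: 408, Type 2",
    "pcscf      | 85(122) ERROR: <script>:   request sent to sip:ue2@192.168.101.2:40643: Fail Counter is 1",
]

for kept in drop_ping_noise(SAMPLE):
    print(kept)
```

To use it as a pipe filter, the same generator can be fed from standard input, for example with `sys.stdout.writelines(drop_ping_noise(sys.stdin))` after `import sys`, and the docker compose log output piped into the script.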

herlesupreeth commented on August 17, 2024

Thanks for confirming the fix.

Just one minor complaint: the new much lower nat ping leads to a lot of "spam" in the docker compose output

Point noted. I will fix the interval.
