Comments (10)

bperez77 commented on August 22, 2024

Interesting, I suspect that this is because one of the file operations for the character device isn't properly implemented. Does this error occur if you don't specify system("sync") twice? Also, what version of the Linux kernel are you using?

from xilinx_axidma.

baf2099 commented on August 22, 2024

Brandon,

We are now having the same problem with the fopen / fprintf functions. We also tried a single system("sync"), and the kernel oops occurred in that case as well. The oops seems to occur either on the next call into the library (e.g. axidma_free) or on completion of the program. In fact, we stripped the calls back to ONLY axidma_init() and axidma_malloc(), and this is enough to enter a failure state by running fopen / fprintf and then terminating the program. We believe this points to the issue being somewhere in axidma_malloc(). Any ideas what could be causing this?

For reference, our PetaLinux build is currently 2017.4.

Unable to handle kernel NULL pointer dereference at virtual address 00000104
pgd = c0004000
[00000104] *pgd=00000000
Internal error: Oops - BUG: 817 [#1] PREEMPT SMP ARM
Modules linked in: xilinx_axidma(O) uio_pdrv_genirq
CPU: 1 PID: 1245 Comm: ex.fw Tainted: G O 4.9.0-xilinx-v2017.4 #1
Hardware name: Xilinx Zynq Platform
task: dd91a1c0 task.stack: ddade000
PC is at axidma_vma_close+0xa8/0xd4 [xilinx_axidma]
LR is at __arm_dma_free.constprop.2+0xb0/0xf0
pc : [<bf004450>] lr : [<c0112840>] psr: 60030113
sp : ddadfde0 ip : 00000000 fp : dd97b480
r10: ddadfefc r9 : 1ed00000 r8 : 00800000
r7 : ded00000 r6 : c7db2600 r5 : c0112880 r4 : c7f848c0
r3 : 00000200 r2 : 00000100 r1 : a0030113 r0 : c7f848c0
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 18c5387d Table: 07c2004a DAC: 00000051
Process ex.fw (pid: 1245, stack limit = 0xddade210)
Stack: (0xddadfde0 to 0xddae0000)
fde0: 00000000 c7f31f20 00000017 c7f31478 c7f311b8 00000001 ddadfe88 c7c0dc78
fe00: 00000000 c01b5fbc c7f31478 00000022 00000001 c01b7728 c7ebb700 00000000
fe20: c7c0dc40 00000001 00000000 00000000 ffffffff 20030113 00000003 00000000
fe40: 00000400 dc809000 ddadfefc c014bd0c 00000000 00000003 dd91a1c0 c7c0dc40
fe60: 00000000 dd91a5e0 c7c0dc40 00000000 dd91a5e0 c0118700 dd91a1c0 c7c0dc40
fe80: dd91a5e0 c011cf40 0000000b c0a0302c dd91a1c0 ffffe000 dd97b480 c011e460
fea0: 0000000b ddadfee8 ddade000 ddbe1a44 00106001 c0127458 ddbe1540 ddadfeb8
fec0: 00000000 00000000 00000000 ddadffb0 00000000 00000000 ddade000 b6f76000
fee0: 00000000 c0109b78 ddadff88 00000005 b6d8782c c0a08714 ddadffb0 0000000b
ff00: 00000000 00030001 b6d8782c c7f5540a 00000000 e09a9270 c034ebbc dc809000
ff20: ddad8580 a00f0013 00000000 00000807 00000000 00000000 00000001 c7e05900
ff40: 00000000 0000000b c7e05908 dd4e5a40 00000002 00000000 00000000 c01cdce0
ff60: 00000000 00000000 0000000b c7e05900 c7e05900 0002b680 0000000b c0106e24
ff80: ddade000 ddade000 00000000 ddadffb0 18c5387d 00000000 ddade000 c010a034
ffa0: 00010f14 60030010 ffffffff c0106cb4 0000000b 00000000 00000001 00000000
ffc0: b6d87008 b6606008 cccccccd 401acccc 00000000 00000000 b6f76000 00000000
ffe0: 00000000 bece6c58 b6e730ac 00010f14 60030010 ffffffff 58009556 24040741
[<bf004450>] (axidma_vma_close [xilinx_axidma]) from [<c01b5fbc>] (remove_vma+0x28/0x54)
[<c01b5fbc>] (remove_vma) from [<c01b7728>] (exit_mmap+0x164/0x1bc)
[<c01b7728>] (exit_mmap) from [<c0118700>] (mmput+0x40/0xe8)
[<c0118700>] (mmput) from [<c011cf40>] (do_exit+0x36c/0x860)
[<c011cf40>] (do_exit) from [<c011e460>] (do_group_exit+0xb8/0xbc)
[<c011e460>] (do_group_exit) from [<c0127458>] (get_signal+0x444/0x478)
[<c0127458>] (get_signal) from [<c0109b78>] (do_signal+0x74/0x3b4)
[<c0109b78>] (do_signal) from [<c010a034>] (do_work_pending+0x68/0xa4)
[<c010a034>] (do_work_pending) from [<c0106cb4>] (slow_work_pending+0xc/0x20)
Code: e12fff35 e5943014 e1a00004 e5942010 (e5823004)
---[ end trace 9a4114be0e1515f7 ]---
Fixing recursive fault but reboot is needed!

bperez77 commented on August 22, 2024

Interesting, so it seems to be a persistent issue. Something must be going wrong during the axidma_vma_close function, which will be invoked when axidma_free is called or the program terminates.

Out of curiosity, does the error only occur in the presence of fopen/fprintf/sync, or does it also happen without these calls? I'll look into the code and see if I can figure out why.

baf2099 commented on August 22, 2024

Yes, the error ONLY occurs in the presence of those calls. It will NOT happen if those calls are commented out or if our software doesn't execute the branch they are on.

bperez77 commented on August 22, 2024

Interesting, and these fopen and fclose calls are just on some arbitrary file? Not on the /dev/axidma file?

I can't really see any reason there should be a NULL pointer exception in the axidma_vma_close function, unless the private data in the VM struct is becoming NULL.

baf2099 commented on August 22, 2024

The file we are fopen'ing has nothing to do with /dev/axidma; it is just a simple text file. Is there any kind of test we could run to help you track this bug down?

baf2099 commented on August 22, 2024

Curious if you or @sk2046 have come any closer to a resolution on this issue?

bperez77 commented on August 22, 2024

Apologies, I lost track of this. I haven't had a chance to look into this any further, but I should be able to within the next few days.

The only thing I can think of off the top of my head is to add some printk statements to axidma_vma_close. I'm not sure how useful the information they provide will be; I suspect you'll see that the vma->vm_private_data field is NULL.

bperez77 commented on August 22, 2024

Here's an update on what I've found so far, @baf2099 and @sk2046.

When I reproduce the problem locally with the AXI DMA benchmark example program, I see that the second transfer fails because it is unable to find one of the DMA buffers. Both calling sync and doing random file I/O seem to cause the DMA buffer to be freed (through a call to the VMA close function), though it is unclear why.

[   23.615368] axidma: axidma_dma.c: axidma_init_sg_entry: 118: Requested transfer address b66b3000 does not fall within a previously allocated DMA buffer.
Failed to perform the AXI DMA read-write transfer: Bad address

The page fault is being caused by one of the DMA buffers being removed twice from the allocated DMA buffers list, which occurs on this line.

It's not immediately clear to me why VMA close is being called on the DMA buffers when random file I/O occurs. I suspect it may have something to do with how the character device is configured, but I'm not certain. In any case, I'm still trying to resolve it.

bperez77 commented on August 22, 2024

I ended up finding the resolution to this issue. The problem was with how the driver dealt with memory regions that are copied when a process forks a child process. This explains @sk2046's issue with calling the sync program twice.

However, this doesn't explain @baf2099's issue with fopen and fclose. I tried a modified version of the AXI DMA benchmark program that wrote to a text file, similar to what you described, and saw no issues after this fix. Can you verify that this works for your example program? If not, can you create a new issue and attach the example application that causes the problem?
