
armcl-pipe-all's Issues

Pipe-ALL AlexNet output formatting error when running entirely on little cluster at certain frequencies

Demonstration: [screenshot, 2024-01-23 at 13:55:46; image not preserved]

Steps to reproduce:

  • Make sure LD_LIBRARY_PATH is set correctly.
  • Run the following commands:

echo performance > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
echo performance > /sys/devices/system/cpu/cpufreq/policy2/scaling_governor
echo 1000000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_max_freq
./graph_alexnet_all_pipe_sync --threads=4 --threads2=2 --n=60 --total_cores=6 --partition_point=8 --partition_point2=8 --order="L-G-B"

The error occurs at frequencies of 1 GHz and higher. It is consistently reproducible on our system; only once, at 1.2 GHz, did it not occur.
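
Because the error depends on the frequency cap, it can help to confirm that the sysfs writes in the steps above actually took effect before each run. The following is a minimal Python sketch, assuming the cpufreq sysfs layout used in those steps (values in `scaling_max_freq` are in kHz). `read_policy` and `settings_match` are hypothetical helper names, not part of the repository.

```python
from pathlib import Path

# Root of the cpufreq policies, as used in the reproduction steps.
CPUFREQ_ROOT = Path("/sys/devices/system/cpu/cpufreq")


def read_policy(policy, root=CPUFREQ_ROOT):
    """Return the governor and the max frequency (kHz) of one cpufreq policy."""
    base = Path(root) / policy
    return {
        "governor": (base / "scaling_governor").read_text().strip(),
        "max_khz": int((base / "scaling_max_freq").read_text()),
    }


def settings_match(expected, root=CPUFREQ_ROOT):
    """Compare expected settings with sysfs.

    `expected` maps a policy name to the keys that should be checked.
    Returns a list of (policy, key, actual, wanted) tuples, empty if all match.
    """
    mismatches = []
    for policy, want in expected.items():
        have = read_policy(policy, root)
        for key, value in want.items():
            if have[key] != value:
                mismatches.append((policy, key, have[key], value))
    return mismatches
```

For the steps above, the expected settings would be `{"policy0": {"governor": "performance", "max_khz": 1000000}, "policy2": {"governor": "performance"}}`. An empty list from `settings_match` means sysfs reports what was written.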

Hardware is plugged directly into power with the supplied Anker PowerPort+ 1 power supply.

Significance of the issue:

  • It breaks our output parser. Yesterday we spent hours trying to "fix" a bug that turned out to be caused by this.

Segmentation Fault and Runtime Errors in ResNet50 (all pipe sync)

Output of 'strings libarm_compute.so | grep arm_compute_version':

arm_compute_version=v21.02 
Build options: {'arch': 'arm64-v8a', 'opencl': '1', 'neon': '1', 'asserts': '0', 'debug': '1', 'os': 'linux', 'Werror': '0'} 
Git hash=e4cef6d16f8638331bde4d8d67a0e65ffbe4e571

Platform:

Hikey970

Operating System:

Debian 9, Linux kernel 4.9.78-147538-g244928755bbe

Problem description:

The following commands produce a segmentation fault:

sudo LD_LIBRARY_PATH=/home/ARMCL-pipe-all/build /home/ARMCL-pipe-all/build/examples/graph_resnet50_all_pipe_sync --threads=4 --threads2=2 --total_cores=6 --partition_point=13 --partition_point2=15 --order=L-B-G --n=50

sudo LD_LIBRARY_PATH=/home/ARMCL-pipe-all/build /home/ARMCL-pipe-all/build/examples/graph_resnet50_all_pipe_sync --threads=4 --threads2=2 --total_cores=6 --partition_point=1 --partition_point2=8 --order=L-B-G --n=50

sudo LD_LIBRARY_PATH=/home/ARMCL-pipe-all/build /home/ARMCL-pipe-all/build/examples/graph_resnet50_all_pipe_sync --threads=4 --threads2=2 --total_cores=6 --partition_point=1 --partition_point2=3 --order=B-L-G --n=50

These commands throw a runtime error:

sudo LD_LIBRARY_PATH=/home/ARMCL-pipe-all/build /home/ARMCL-pipe-all/build/examples/graph_resnet50_all_pipe_sync --threads=4 --threads2=2 --total_cores=8 --partition_point=1 --partition_point2=6 --order=B-L-G --n=50

sudo LD_LIBRARY_PATH=/home/ARMCL-pipe-all/build /home/ARMCL-pipe-all/build/examples/graph_resnet50_all_pipe_sync --threads=4 --threads2=2 --total_cores=8 --partition_point=1 --partition_point2=13 --order=G-B-L --n=50

I have not searched exhaustively for all mappings that cause segmentation faults, but the ones above definitely fail.
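
An exhaustive search over mappings could be automated along the following lines. This is only a sketch: the binary path and flag names are copied from the commands above, while the upper bound `max_point`, the restriction `partition_point <= partition_point2`, and the helper names are assumptions made for illustration. The commands above were run with sudo, so the script would need to be run with the same privileges.

```python
import itertools
import os
import signal
import subprocess

# Paths copied from the commands in this report.
BUILD_DIR = "/home/ARMCL-pipe-all/build"
BINARY = BUILD_DIR + "/examples/graph_resnet50_all_pipe_sync"


def all_orders():
    """All six orderings of the L, B and G components, e.g. 'L-B-G'."""
    return ["-".join(p) for p in itertools.permutations("LBG")]


def build_command(order, p1, p2, total_cores=6, n=50):
    """Return the argv list for one mapping, using the flags shown above."""
    return [
        BINARY,
        "--threads=4",
        "--threads2=2",
        f"--total_cores={total_cores}",
        f"--partition_point={p1}",
        f"--partition_point2={p2}",
        f"--order={order}",
        f"--n={n}",
    ]


def classify(returncode):
    """Map a subprocess return code to a coarse label."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # subprocess reports death by signal as the negated signal number.
        return "killed by " + signal.Signals(-returncode).name
    return f"exit code {returncode}"


def sweep(max_point=16, timeout_s=600):
    """Run every order / partition-point pair and yield (argv, label).

    max_point is a hypothetical upper bound; adjust it to the network.
    """
    env = dict(os.environ, LD_LIBRARY_PATH=BUILD_DIR)
    for order in all_orders():
        for p1 in range(1, max_point + 1):
            for p2 in range(p1, max_point + 1):
                argv = build_command(order, p1, p2)
                try:
                    result = subprocess.run(
                        argv, env=env, capture_output=True, timeout=timeout_s
                    )
                    label = classify(result.returncode)
                except subprocess.TimeoutExpired:
                    label = "timeout"
                yield argv, label
```

Iterating over `sweep()` and printing every entry whose label is not "ok" would give the list of failing mappings. A segmentation fault shows up as "killed by SIGSEGV".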

Unable to rebuild ARM_CL and follow the instructions to reproduce the result

Output of 'strings libarm_compute.so | grep arm_compute_version':
arm_compute_version=v21.02
Build options: {'Werror': '0', 'debug': '1', 'asserts': '0', 'neon': '1', 'opencl': '1', 'os': 'linux', 'arch': 'arm64-v8a'}
Git hash=b'5682f000a9e6682be0cf3d2ef5289851cd933433'

Platform:
Hikey970

Operating System:
Linux

Problem description:

Hello Ehsan. First of all, thanks for open-sourcing this great project.

I am trying to replicate your results. I downloaded the project with the following command (I think the README should be updated):
git clone https://github.com/Ehsan-aghapour/ARMCL-pipe-all/

Here are the problems I ran into:

  1. I built ARM-CL with scons Werror=0 -j16 debug=0 asserts=0 neon=1 opencl=1 os=linux arch=arm64-v8a, but it produced the following warnings and errors:

aarch64-linux-gnu-g++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-6/README.Bugs> for instructions.
scons: *** [build/src/graph/backends/NEON/NEFunctionFactory.os] Error 4
aarch64-linux-gnu-g++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-6/README.Bugs> for instructions.
scons: *** [build/src/graph/backends/CL/CLFunctionsFactory.os] Error 4
examples/graph_mobilenet_all_pipe_sync.cpp: In constructor ‘GraphMobilenetExample::GraphMobilenetExample()’:
examples/graph_mobilenet_all_pipe_sync.cpp:67:5: warning: ‘GraphMobilenetExample::input_descriptor2’ should be

  2. I tried following the "Running in Linux" part: cp /system/lib64/egl/libGLES_mali.so $lib_dir/libOpenCL.so
    But may I ask where I can find libOpenCL.so?

  3. I tried the following command as a test:
./build/examples/graph_alexnet_all_pipe_sync --threads=4 --threads2=2 --total_cores=6 --partition_point=3 --partition_point2=5 --order=G-L-B --n=60

but it shows the following error, which I guess is due to the unsuccessful build:
./build/examples/graph_alexnet_all_pipe_sync: symbol lookup error: /home/Micro_SD_shunya/hungyang/ARMCL-pipe-all/library/libarm_compute_graph.so: undefined symbol: _ZN11arm_compute7logging14LoggerRegistry3getEv

  4. I also followed the "Running in Linux" part and used the command shown there (screenshot not preserved), but it produced the following messages:
WARNING: Skipping invalid option 'Neon'!
WARNING: Skipping invalid option '/home/Micro_SD_shunya/hungyang/ARMCL-pipe-all/alexnet/'!
WARNING: Skipping invalid option '/home/Micro_SD_shunya/hungyang/ARMCL-pipe-all/alexnet//go_kart.ppm'!
WARNING: Skipping invalid option '/home/Micro_SD_shunya/hungyang/ARMCL-pipe-all/alexnet//labels.txt'!
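
For the symbol lookup error in item 3, it can help to decode the mangled name to see which function the loader could not find. The standard tool for this is `c++filt` from binutils. The Python sketch below handles only the simple nested-name form that appears in that error message and is not a general demangler; `demangle_nested_name` is a hypothetical helper name.

```python
def demangle_nested_name(symbol):
    """Decode a simple Itanium-ABI nested name such as _ZN3foo3barEv.

    Only length-prefixed identifiers inside N...E are handled. Everything
    after the closing E (the parameter encoding) is ignored.
    """
    if not symbol.startswith("_ZN"):
        raise ValueError("not a nested mangled name: " + symbol)
    pos = 3
    parts = []
    while pos < len(symbol) and symbol[pos] != "E":
        start = pos
        # Each component is <decimal length><identifier>.
        while pos < len(symbol) and symbol[pos].isdigit():
            pos += 1
        if pos == start:
            raise ValueError("unsupported component at offset %d" % pos)
        length = int(symbol[start:pos])
        parts.append(symbol[pos:pos + length])
        pos += length
    if pos >= len(symbol):
        raise ValueError("missing terminating E in " + symbol)
    return "::".join(parts)


missing = "_ZN11arm_compute7logging14LoggerRegistry3getEv"
print(demangle_nested_name(missing))
```

The missing symbol therefore belongs to the logging registry in the `arm_compute` namespace. Note also that the error message names a library under `library/`, while the executable was run from `build/examples`. One possible explanation, stated here only as a guess, is that libraries from two different builds are being mixed.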

Thanks again :)

Questions for clarification (mostly on multithreading, and real data inferencing)

Output of 'strings libarm_compute.so | grep arm_compute_version':
arm_compute_version=v21.02
Build options: {'arch': 'arm64-v8a', 'opencl': '1', 'neon': '1', 'asserts': '0', 'debug': '1', 'os': 'linux', 'Werror': '0'}
Git hash=abc2c291bcc4a62c171b84e41e0fb0dafd393291

Platform:
Odroid N2+

Operating System:
Ubuntu 20.04

Problem description:
Hello, I would like to use this repository for my project. Before doing so, I would like to raise a few questions:

  1. Why did you create threads for the execution of different network parts? These are pipelined, so they are going to be scheduled in serial. So why create threads? (ref: examples/alexnet, lines 706-711)
  2. In src/runtime/SchedulerUtils.cpp, you have line 48 commented out, and the reason as you point out is "amend mistake in ARMCL". Could you please elaborate on that?
  3. In the README file, you point out that if we do not provide the --image argument, the execution will proceed with dummy data. Where is this data created? I took a glance over the graph_* files, and did not see any line that created dummy data.
  4. Is there any way to retrieve the output of the DNN? In other words, how can I print the predictions or map the output layer to labels?
  5. The first lines of my output are:
Third graph inferencing image: 0:../../data/images/7.ppm
First graph inferencing image: 0:../../data/images/7.ppm
Second graph inferencing image: 1:../../data/images/0.ppm

I understand that multithreading can produce such seemingly out-of-order output. I also understand that you use mutexes, locks, and condition variables to keep certain code blocks from being entered by multiple threads at once. However, is there any way to verify that data is not accessed by a later subgraph before the earlier pipeline stages have processed it (e.g. graph 3 before graph 2 or 1)?
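
One generic way to check this kind of ordering is to record a (stage, frame) event under a lock each time a stage picks up a frame, and to verify afterwards that every frame reached stage k before stage k+1. The sketch below is a standalone Python illustration of that idea with three threaded stages connected by queues. It is not the repository's code, and all names in it are hypothetical.

```python
import queue
import threading

N_FRAMES = 8
N_STAGES = 3
events = []                      # (stage, frame) in the order work started
events_lock = threading.Lock()


def stage(index, inbox, outbox):
    """Take frames from inbox, record the visit, pass them downstream."""
    while True:
        frame = inbox.get()
        if frame is None:        # sentinel: no more frames
            if outbox is not None:
                outbox.put(None)
            return
        with events_lock:
            events.append((index, frame))
        if outbox is not None:
            outbox.put(frame)


def run_pipeline():
    """Start one thread per stage and feed N_FRAMES frames through them."""
    queues = [queue.Queue(maxsize=1) for _ in range(N_STAGES)]
    threads = []
    for i in range(N_STAGES):
        outbox = queues[i + 1] if i + 1 < N_STAGES else None
        t = threading.Thread(target=stage, args=(i, queues[i], outbox))
        t.start()
        threads.append(t)
    for frame in range(N_FRAMES):
        queues[0].put(frame)
    queues[0].put(None)
    for t in threads:
        t.join()


def check_order(log):
    """Return True if every frame visited stage k before stage k+1."""
    position = {event: i for i, event in enumerate(log)}
    for frame in range(N_FRAMES):
        for s in range(N_STAGES - 1):
            if position[(s, frame)] > position[(s + 1, frame)]:
                return False
    return True
```

After `run_pipeline()`, `check_order(events)` returns True even though events from different stages are interleaved in the log, because the stages work on different frames at the same time. The same logging idea could in principle be applied around the per-graph run calls, although the printed lines alone do not prove ordering unless they are emitted while the relevant lock is held.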

Thank you :)
