Comments (22)

RaidToRadar commented on May 18, 2024

Here is the output @sakundu after running the code:

root@8d9a5ec71301:/workspace# /workspace/hmetis /workspace/circuit_training/grouping/44cols_34rows/g500_ub5_nruns10_c5_r3_v3_rc1/metis_input /workspace/circuit_training/grouping/44cols_34rows/g500_ub5_nruns10_c5_r3_v3_rc1/metis_input.fix 514 5 10 5 3 3 1 0
bash: /workspace/hmetis: No such file or directory
root@8d9a5ec71301:/workspace# bash /workspace/hmetis /workspace/circuit_training/grouping/44cols_34rows/g500_ub5_nruns10_c5_r3_v3_rc1/metis_input /workspace/circuit_training/grouping/44cols_34rows/g500_ub5_nruns10_c5_r3_v3_rc1/metis_input.fix 514 5 10 5 3 3 1 0
/workspace/hmetis: /workspace/hmetis: cannot execute binary file

I am currently using VirtualBox running Ubuntu 20.04, with the Circuit Training Docker image built from HEAD, so the base image is the CUDA-enabled Ubuntu 22.04 one.

One thing I did notice along the way is that MacroPlacement ran into a similar issue and fixed it as follows:

Fix from MacroPlacement (issue relating to hmetis not working when running the FormatTranslator.py script)

The other problem might be related to the hmetis file in /src/utils/hmetis.
When we run ./hmetis, if the output is as below:

./hmetis
bash: ./hmetis: No such file or directory

It means the current OS doesn't recognize this built file. To solve this issue, we have to install some libs as below:

sudo apt-get install libc6:i386
sudo apt-get install libstdc++6:i386

After installing, if we run ./hmetis again, the output should be like this:

./hmetis
********************************************************************************
Usage of hMetis:  
[Option1]:  hmetis HGraphFile Nparts UBfactor
[Option2]:  hmetis HGraphFile FixFile Nparts UBfactor
[Option3]:  hmetis HGraphFile Nparts UBfactor Nruns CType RType Vcycle Reconst dbglvl
[Option4]:  hmetis HGraphFile FixFile Nparts UBfactor Nruns CType RType Vcycle Reconst dbglvl
[INFO] HGraphFile : 
[INFO] FixFile : 
[INFO] Nparts : 2
[INFO] UBfactor : 5
[INFO] Nruns : 10
[INFO] CType : 1
[INFO] RType : 1
[INFO] Vcycle : 1
[INFO] Reconst : 0
[INFO] dbglvl : 16
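
A shell typically reports "No such file or directory" for a file that does exist when the loader the binary asks for is missing, which is consistent with a 32-bit executable on a 64-bit system without the i386 libraries. The sketch below is one way to check this before installing anything: it reads the ELF header and reports whether the file is 32-bit or 64-bit. The function name `elf_class` and the path in the example are illustrative assumptions, not part of Circuit Training or MacroPlacement.

```python
import os


def elf_class(path):
    """Return 'ELF32', 'ELF64', or None if the file is not an ELF binary."""
    with open(path, "rb") as f:
        header = f.read(5)
    # Bytes 0-3 are the ELF magic number; byte 4 (EI_CLASS) is 1 for
    # 32-bit objects and 2 for 64-bit objects.
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    return {1: "ELF32", 2: "ELF64"}.get(header[4])


if __name__ == "__main__":
    # Path taken from the log in this thread; adjust it to your setup.
    binary = "/workspace/hmetis"
    if os.path.exists(binary):
        print(binary, "->", elf_class(binary))
    else:
        print(binary, "does not exist")
```

If this prints ELF32 on a 64-bit image, the i386 packages listed above are the likely missing piece.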

UPDATE

While learning why hmetis does not work, I found that there is another step when using the Ubuntu Docker image: you have to add the i386 architecture inside the container. To resolve the issue inside the container, I ran:

dpkg --add-architecture i386
apt-get update

then I was able to install the required packages:

apt-get install libc6:i386
apt-get install libstdc++6:i386

I think we can probably add this to the Dockerfile/compose with lines like these:

RUN dpkg --add-architecture i386
RUN apt-get update
RUN apt-get install -y libc6:i386
RUN apt-get install -y libstdc++6:i386

from circuit_training.

RaidToRadar commented on May 18, 2024

Hi @sakundu,
After following up on your recommendation, I also tried running the grouping code locally (using a Python virtual environment so that I could use 3.9 on my installation smoothly) and was able to generate the .plc file needed for CT. So as of now, I do not believe the grouping code works in a Docker container, at least on Ubuntu; I hope this will be supported in the future. I also appreciate the tips on translating the protobuf to .odb/LEF/DEF. I wish you the best, and thank you for your support.

RaidToRadar commented on May 18, 2024

Hi @sakundu,
Thank you for your suggestion, the e2e_smoke test worked after changing the two variables. I have been doing some experimenting with building CT for grouping and learned a few things.

First, if the hmetis file is not being detected, then you can follow along with the fix mentioned in this thread above.

Next, I learned that the grouper_main.py code does NOT work on Ubuntu 22.04 operating systems or in Ubuntu 22.04 Docker images. This means that if you are working on the HEAD branch with GPU, it will not work, because the NVIDIA Docker image uses Ubuntu 22.04 as its base image. However, grouping will work on the HEAD branch without GPU, because that base image is built on 20.04.

Also, is there a way to keep two docker images? By this I mean, I would like to keep two different docker images, one for HEAD with GPU enabled, and one without GPU.

Hope this might save some time for some people and thank you again for your time and patience.

sakundu commented on May 18, 2024

Hi @RaidToRadar,

If you are able to generate the placed design using OpenROAD-flow-scripts (ORFS), you can utilize the gen_pb_or.tcl script to generate a Protobuf netlist.

## Open the placed design in OpenROAD. Ensure that all instances and pins are properly placed. 
make gui_place DESIGN_CONFIG=<config file>
## Use the following command on OpenROAD shell / gui
$ source gen_pb_or.tcl
$ gen_pb_netlist <output protobuf file name>

Now, you can use this output protobuf file as input for the grouping flow to generate the clustered protobuf netlist.

Please let us know if you have any problem.

Thanks,
Sayak

RaidToRadar commented on May 18, 2024

Hi @sakundu ,

I followed your suggestions but ran into an issue when I attempted to use the translated design. The error says that there are no hard macros in the design, and then the script exits. I am not sure how to fix this or whether I am missing something. Any help would be appreciated, thanks. I am also including my steps, which I wrote up in another discussion post.

Discussion from ORFS

Hello,

I am a student trying to learn whether it is possible to use ORFS for my initial synthesis, floorplanning and placement, so that I can take the LEF/DEF and translate it into the Protocol Buffer format required to run the grouper and the Deep Reinforcement Learning Circuit Training.

One of the errors I come across, unlike with the test data that Google provides for its grouping script, states that no hard macros were found in the design. Perhaps I am missing a step, but as of now I am not sure.

To get to where I am, I first built ORFS using Docker, then chose a design and ran the stages:

make synth
make floorplan
make place

After that, I downloaded the LEF/DEF translator script created by TILOS-AI-Institute and took their .tcl script called "gen_pb_or.tcl" so that I could source it later in the OpenROAD GUI, which I open as follows:

make gui_place

Then I run the following commands (note that the gen_pb_or.tcl script is located in the ORFS/flow directory)

source gen_pb_or.tcl
gen_pb_netlist <netlistname.pb.txt>

After all that, I noticed that the generated netlist does not contain any fields or types called Macro, which are required, and I get the following error when I run the Google Circuit Training pre-processing script, grouper_main.py:

python3.9 circuit_training/grouping/grouper_main.py --output_dir=$OUTPUT_DIR --netlist_file=$NETLIST_FILE --block_name=$BLOCK_NAME --hmetis_dir=$HMETIS_DIR --plc_wrapper_main=$PLC_WRAPPER_MAIN
2023-06-21 14:48:01.439124: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:7630] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-06-21 14:48:01.439156: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-06-21 14:48:01.439170: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1500] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-06-21 14:48:01.462984: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-06-21 14:48:02.523785: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
I0621 14:48:04.069163 139990346412032 grouper.py:431] disconnecting high fanout nets
Disconnecting node: clk with 1056 fanouts.
I0621 14:48:05.321882 139990346412032 grouper.py:395] Try auto grid selection...
I0621 14:48:07.881747 139990346412032 placement_util.py:468] node_order: descending_size_macro_first
No hard macros found in the design!
E0621 14:48:09.306736 139990346412032 grouper.py:403] The auto grid selection failed.
Traceback (most recent call last):
  File "/workspace/circuit_training/grouping/grouper_main.py", line 62, in <module>
    app.run(main)
  File "/usr/local/lib/python3.9/dist-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.9/dist-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/workspace/circuit_training/grouping/grouper_main.py", line 53, in main
    grouped_plc, placement_file = grouper.group_stdcells(
  File "/workspace/circuit_training/grouping/grouper.py", line 441, in group_stdcells
    select_grid_size(plc)
  File "/workspace/circuit_training/grouping/grouper.py", line 404, in select_grid_size
    raise RuntimeError('Failed to select a grid size.')
RuntimeError: Failed to select a grid size.

I was wondering if there is something critical that I am missing, or if there is a step in ORFS that will have these Macros when getting translated into Protocol Buffer Format. Any help would be appreciated.

sakundu commented on May 18, 2024

The issue seems to be that your design doesn't include any hard macros. Could you please confirm whether your design has hard macros? If so, please make sure that the macro class in the LEF is set to BLOCK.

I0621 14:48:07.881747 139990346412032 placement_util.py:468] node_order: descending_size_macro_first
No hard macros found in the design!

In the gen_pb_or.tcl script, it checks the cell LEF property to determine whether an instance is a macro or not. Thus, if there's an issue with your macro LEF, you'll need to rectify that before generating the protobuf netlist.
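
One way to check the LEF before regenerating the netlist is to list every MACRO together with its CLASS statement. The sketch below is a minimal line-based scan that assumes the usual LEF layout (`MACRO name`, then `CLASS ... ;`, closed by `END name`); it is not a full LEF parser, and the function name and sample cell names are made up for illustration.

```python
import re


def lef_macro_classes(lef_text):
    """Map each MACRO name in a LEF string to its CLASS value (None if absent)."""
    classes = {}
    current = None
    for line in lef_text.splitlines():
        start = re.match(r"\s*MACRO\s+(\S+)", line)
        if start:
            current = start.group(1)
            classes[current] = None
            continue
        if current is None:
            continue  # Ignore CLASS statements outside a MACRO (e.g. in SITE).
        if re.match(r"\s*END\s+" + re.escape(current) + r"\s*$", line):
            current = None
            continue
        cls = re.match(r"\s*CLASS\s+(.*?)\s*;", line)
        if cls and classes[current] is None:
            classes[current] = cls.group(1)
    return classes


# Hypothetical LEF fragment: one hard macro and one standard cell.
sample = """
MACRO sram_256x32
  CLASS BLOCK ;
  SIZE 100 BY 200 ;
END sram_256x32
MACRO NAND2_X1
  CLASS CORE ;
END NAND2_X1
"""

for name, cls in lef_macro_classes(sample).items():
    print(name, cls)
```

Any memory or IP block that shows up with a class other than BLOCK, or with no class at all, is a candidate for the LEF issue described above.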

RaidToRadar commented on May 18, 2024

Ahhhh, I see, thank you @sakundu. I chose a different PDK that uses macros, and that fixed the error when running the grouper_main.py script. However, there is now a new error: hmetis reports "no such file or directory" even though the file exists and has read/write/execute permissions (chmod 777 hmetis):

I0621 16:17:44.114713 140058092941312 meta_netlist_convertor.py:311] Unconnected node found: TAP_2875
I0621 16:17:44.114820 140058092941312 meta_netlist_convertor.py:311] Unconnected node found: TAP_2876
I0621 16:17:44.115687 140058092941312 meta_netlist_convertor.py:318] Total area of the macros and stdcells: 2418.1421482269175. Number nodes: 12929.
I0621 16:17:44.161602 140058092941312 grouper.py:100] Adding 14 (number of fixed groups) to number of metis groups
I0621 16:17:44.184007 140058092941312 hmetis_util.py:65] Run: /workspace/hmetis /workspace/circuit_training/grouping/44cols_34rows/g500_ub5_nruns10_c5_r3_v3_rc1/metis_input /workspace/circuit_training/grouping/44cols_34rows/g500_ub5_nruns10_c5_r3_v3_rc1/metis_input.fix 514 5 10 5 3 3 1 0
Traceback (most recent call last):
  File "/workspace/circuit_training/grouping/grouper_main.py", line 62, in <module>
    app.run(main)
  File "/usr/local/lib/python3.9/dist-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.9/dist-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/workspace/circuit_training/grouping/grouper_main.py", line 53, in main
    grouped_plc, placement_file = grouper.group_stdcells(
  File "/workspace/circuit_training/grouping/grouper.py", line 447, in group_stdcells
    part_and_plc_files = run_with_default_hmetis_options(
  File "/workspace/circuit_training/grouping/grouper.py", line 247, in run_with_default_hmetis_options
    return partition_netlist(plc, num_groups, fixed_logic_levels,
  File "/workspace/circuit_training/grouping/grouper.py", line 114, in partition_netlist
    metis_out_file = hmetis_util.call_hmetis(
  File "/workspace/circuit_training/grouping/hmetis_util.py", line 66, in call_hmetis
    subprocess.run(args=args, check=True)
  File "/usr/lib/python3.9/subprocess.py", line 505, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.9/subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.9/subprocess.py", line 1837, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '/workspace/hmetis'

sakundu commented on May 18, 2024

I have sometimes observed that the hmetis binary does not work on Debian-based OSes. I do not know how to fix this problem. I have always run it on CentOS 8.

I0621 16:17:44.184007 140058092941312 hmetis_util.py:65] Run: /workspace/hmetis /workspace/circuit_training/grouping/44cols_34rows/g500_ub5_nruns10_c5_r3_v3_rc1/metis_input /workspace/circuit_training/grouping/44cols_34rows/g500_ub5_nruns10_c5_r3_v3_rc1/metis_input.fix 514 5 10 5 3 3 1 0

So can you try running the command below separately to see whether the hmetis binary works on your system?

/workspace/hmetis /workspace/circuit_training/grouping/44cols_34rows/g500_ub5_nruns10_c5_r3_v3_rc1/metis_input /workspace/circuit_training/grouping/44cols_34rows/g500_ub5_nruns10_c5_r3_v3_rc1/metis_input.fix 514 5 10 5 3 3 1 0

sakundu commented on May 18, 2024

@RaidToRadar, thanks for sharing the solution to why hmetis was not working. I hope you can now run the grouping flow.

RaidToRadar commented on May 18, 2024

A new error appears after getting past the "no such file or directory" error for hmetis. I do not have a GPU set up on this machine; I am not sure if that would cause issues.

I0621 20:53:40.894911 139973494734848 grouper.py:100] Adding 2 (number of fixed groups) to number of metis groups
W0621 20:53:40.895412 139973494734848 grouper.py:106] Metis input file exists, skipping generation.
I0621 20:53:40.895554 139973494734848 hmetis_util.py:65] Run: /workspace/hmetis /workspace/circuit_training/grouping/24cols_21rows/g500_ub5_nruns10_c5_r3_v3_rc1/metis_input /workspace/circuit_training/grouping/24cols_21rows/g500_ub5_nruns10_c5_r3_v3_rc1/metis_input.fix 502 5 10 5 3 3 1 0
*******************************************************************************
 HMETIS 1.5.3  Copyright 1998, Regents of the University of Minnesota

HyperGraph Information -----------------------------------------------------
 Name: /workspace/circuit_training/grouping/24cols_21rows/g500_ub5_nruns10_c5_r3_v3_rc1/metis_input, #Vtxs:    10, #Hedges:     5, #Parts: 502, UBfactor: 0.05
 Options: EDGE, EEFM, Reconst-True, Always V-cycle, With Fixed Vertices

Recursive Partitioning... --------------------------------------------------
Traceback (most recent call last):
  File "/workspace/circuit_training/grouping/grouper_main.py", line 62, in <module>
    app.run(main)
  File "/usr/local/lib/python3.9/dist-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.9/dist-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/workspace/circuit_training/grouping/grouper_main.py", line 53, in main
    grouped_plc, placement_file = grouper.group_stdcells(
  File "/workspace/circuit_training/grouping/grouper.py", line 447, in group_stdcells
    part_and_plc_files = run_with_default_hmetis_options(
  File "/workspace/circuit_training/grouping/grouper.py", line 247, in run_with_default_hmetis_options
    return partition_netlist(plc, num_groups, fixed_logic_levels,
  File "/workspace/circuit_training/grouping/grouper.py", line 114, in partition_netlist
    metis_out_file = hmetis_util.call_hmetis(
  File "/workspace/circuit_training/grouping/hmetis_util.py", line 66, in call_hmetis
    subprocess.run(args=args, check=True)
  File "/usr/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/workspace/hmetis', '/workspace/circuit_training/grouping/24cols_21rows/g500_ub5_nruns10_c5_r3_v3_rc1/metis_input', '/workspace/circuit_training/grouping/24cols_21rows/g500_ub5_nruns10_c5_r3_v3_rc1/metis_input.fix', '502', '5', '10', '5', '3', '3', '1', '0']' died with <Signals.SIGSEGV: 11>.
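
The hMetis banner above reports 10 vertices and 5 hyperedges while 502 parts are requested, and the warning says an existing metis_input file was reused. Whether that mismatch is what triggers the segfault is not established in this thread, but it is cheap to rule out. The sketch below reads the header of an hMetis input file (first non-comment line: number of hyperedges, then number of vertices) and compares the vertex count with the requested number of parts. The function names are illustrative, not part of Circuit Training.

```python
import os
import tempfile


def read_hmetis_header(path):
    """Return (num_hyperedges, num_vertices) from an hMetis hypergraph file."""
    with open(path) as f:
        for line in f:
            stripped = line.strip()
            # Skip blank lines and '%' comment lines before the header.
            if stripped and not stripped.startswith("%"):
                fields = stripped.split()
                return int(fields[0]), int(fields[1])
    raise ValueError("no header line found in " + path)


def nparts_is_plausible(path, nparts):
    """True if the requested number of parts does not exceed the vertex count."""
    _, num_vertices = read_hmetis_header(path)
    return nparts <= num_vertices


if __name__ == "__main__":
    # Tiny stand-in file with the same header values as the log above.
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "metis_input")
        with open(demo, "w") as f:
            f.write("5 10\n")
        print(read_hmetis_header(demo), nparts_is_plausible(demo, 502))
```

If the check fails on a reused metis_input, deleting the stale output directory so the input is regenerated may be worth trying before anything else.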

ZhuLinsen commented on May 18, 2024

Hi @sakundu, CT's output is in protobuf, but what I need for ORFS is .odb or DEF/LEF. Hence, I would like to know how to translate protobuf to .odb or DEF/LEF. Any suggestions would be helpful. Thanks.

sakundu commented on May 18, 2024

Hi @ZhuLinsen ,

You have to modify plc_bp_to_placement_tcl.py to generate the macro placement file and then read the macro placement file using read_macro_placement proc in OpenROAD.

Suggested modification:

##Change Line 112 - 113  of plc_bp_to_placement_tcl.py
fp_out.write(f"placeInstance {macro_name} {x}"\
                        f" {y} {orient} -fixed\n")
## to
fp_out.write(f"{macro_name} {orient} {x} {y}\n")

Please let me know if you have any question.

Thanks,
Sayak

RaidToRadar commented on May 18, 2024

Hello @sakundu,

I have been attempting to run the grouper_main.py code on a Debian/Ubuntu 20.04-based OS but keep running into the error when I try to run it in a Docker build. I am not sure if there is anything else I can do to get it to run on this OS. What OS did you use to get the grouping code to work? I think you mentioned CentOS 8; did you use a Docker build within it, or did you set it up as a local install? Thank you for all your help so far, I really appreciate it.

sakundu commented on May 18, 2024

Hi @RaidToRadar,

I typically run the grouping code on my local setup and the CT training on a Docker build.

Thanks,
Sayak

ZhuLinsen commented on May 18, 2024

Hi @sakundu, thank you for your response. I am currently trying it out.

As far as I know, in OpenROAD-flow scripts, the data flow in floorplanning and placement is as follows:

  • Floor planning:
    init_floorplan -> place IO random -> mixed-sized placement -> Macro Placement -> tap cell insertion -> PDN
  • Placement:
    GP without placed IOs -> IO placement -> GP with IOs -> Legalization -> DP

And I still have two questions:

  • After obtaining the macro placement file from CT, at which step should I re-enter the flow to complete it?
  • Can I use the file produced by the mixed-size placement step as the input for CT? This way, I can seamlessly incorporate the macro placement file from CT into the flow.

sakundu commented on May 18, 2024

Hi @ZhuLinsen

I believe the second option should be your first course of action. I have not personally run the macro placement generated using CT through ORFS, so I may not be able to provide direct assistance with this. The CT solution does not guarantee that the macros are on a grid, so it's crucial for you to make sure they are. As far as I'm aware, there's no specific command in OpenROAD that snaps the macros to the grid. Therefore, you might need to take some additional measures to ensure all the macros are grid-aligned. I apologize if this solution doesn't address your problem adequately.
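
As one possible "additional measure", macro coordinates can be rounded to the nearest grid point before the macro placement file is written. The sketch below assumes integer coordinates (for example, database units) and a single hypothetical pitch for both axes; the thread does not say which grid applies, so the pitch value and macro names here are placeholders only.

```python
def snap_to_grid(value, pitch):
    """Round an integer coordinate to the nearest multiple of pitch (ties round up)."""
    if pitch <= 0:
        raise ValueError("pitch must be positive")
    return ((value + pitch // 2) // pitch) * pitch


def snap_macros(locations, pitch):
    """Snap every (x, y) pair in a {macro_name: (x, y)} dict to the grid."""
    return {
        name: (snap_to_grid(x, pitch), snap_to_grid(y, pitch))
        for name, (x, y) in locations.items()
    }


# Hypothetical macro locations and pitch, in the same integer units.
macros = {"sram_0": (1234, 5678), "sram_1": (1250, 4049)}
print(snap_macros(macros, 100))
```

Snapping can introduce overlaps between neighbouring macros, so the result still needs a legality check in the downstream tool.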

Thanks,
Sayak

ZhuLinsen commented on May 18, 2024

Hi @sakundu, I really appreciate your wealth of experience and advice. It has helped me a lot. I will carefully consider your suggestion and test it out.

RaidToRadar commented on May 18, 2024

Hello @sakundu,
I have been having a few issues running Circuit Training with the design (riscv) and platform (asap7) that I converted from ORFS. Specifically, after I finish preprocessing (the grouping script), I try to run an end-to-end test using the e2e_smoke test script. However, the run never seems to finish; in other words, it hangs forever. I am not sure what I am missing, because when I run the default settings (the provided Ariane netlist and init) it works perfectly fine and finishes within 20-30 minutes.

With that said, have you run into this issue before, or are there only certain designs/platforms that work with CT?
Once again, thank you for your time.

sakundu commented on May 18, 2024

Hi @RaidToRadar

It seems like your run might be stuck due to a discrepancy in the number of macros between your testcase and Ariane. In such a situation, you should update the max_sequence_length and sequence_length to the total number of macros in your design plus one.
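
The rule above amounts to a one-line calculation. The sketch below spells it out for a hypothetical macro count; the dictionary keys mirror the two variable names mentioned in this comment, and how they are passed to the smoke-test script is not shown because the thread does not specify it.

```python
def sequence_settings(num_macros):
    """Return both sequence-length values for a design with num_macros macros."""
    if num_macros < 0:
        raise ValueError("num_macros must be non-negative")
    length = num_macros + 1  # total number of macros in the design plus one
    return {"max_sequence_length": length, "sequence_length": length}


# Hypothetical design with 20 macros.
print(sequence_settings(20))
```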

RaidToRadar commented on May 18, 2024

Hi @ZhuLinsen,
I was wondering if you were able to transfer the contents from protobuf to .odb to run the CT design on ORFS. If you have already accomplished this, what steps did you take and what did you have to consider?

ZhuLinsen commented on May 18, 2024

Hi @RaidToRadar, I'm sorry for replying so late. I have not done that yet. As you can see in the conversation above, I used the read_macro_placement function that sakundu mentioned to achieve my goal, although I still have some trouble to solve. Good luck.

RaidToRadar commented on May 18, 2024

Hi @ZhuLinsen , thank you for your response. What kind of troubles are you coming across? Is it related to the legalization of the macro placement locations on the canvas?
