Better search for fmax goals #218
Replies: 20 comments
-
Both @bartokon and @suarezvictor have said things along these lines: bartokon has some experimental plots of the #stages vs. fmax curves we would be "searching along", and it's not clear that a binary search is going to work.
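To make the concern concrete, here is a minimal Python sketch. The curve values are made-up illustration data (not bartokon's plots) and both helper names are hypothetical. Binary search only returns the smallest passing stage count if fmax never decreases as stages are added; on a curve with a dip it can skip over a smaller answer that a plain scan would find.

```python
def binary_search_min_stages(fmax_of, lo, hi, goal_mhz):
    """Smallest stage count in [lo, hi] meeting the goal.
    Only correct if fmax_of(n) is non-decreasing in n (an assumption)."""
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if fmax_of(mid) >= goal_mhz:
            best = mid          # mid passes, try fewer stages
            hi = mid - 1
        else:
            lo = mid + 1        # mid fails, assume everything below fails too
    return best


def linear_scan_min_stages(fmax_of, lo, hi, goal_mhz):
    """Reference answer: try every stage count in order."""
    for n in range(lo, hi + 1):
        if fmax_of(n) >= goal_mhz:
            return n
    return None


# Made-up curves: stage count -> fmax in MHz.
MONOTONE = {1: 100, 2: 150, 3: 190, 4: 210, 5: 240, 6: 260, 7: 300, 8: 310}
DIPPING = {1: 100, 2: 210, 3: 190, 4: 180, 5: 240, 6: 260, 7: 300, 8: 310}

mono_binary = binary_search_min_stages(MONOTONE.get, 1, 8, 200)
mono_linear = linear_scan_min_stages(MONOTONE.get, 1, 8, 200)
dip_binary = binary_search_min_stages(DIPPING.get, 1, 8, 200)
dip_linear = linear_scan_min_stages(DIPPING.get, 1, 8, 200)
print(mono_binary, mono_linear, dip_binary, dip_linear)
```

On the monotone curve both searches agree; on the dipping curve the binary search settles on a larger stage count than the scan, which is the failure mode in question.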
-
It is important to note
-
Decision trees can get stuck in a local minimum. We could set up some border cases for the search to try to offset the problem.
-
For now, I think we should analyze the pyrtl feedback on critical times, group wires into distinct speed categories (time-delay categories), and put registers there. For example: https://machinelearningmastery.com/clustering-algorithms-with-python/
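A rough sketch of the grouping idea, in plain Python. The wire names and delays are invented for illustration, and the "cut at the largest gaps" rule is a simple stand-in for the clustering algorithms in the link, not something pyrtl provides.

```python
def group_delays(delays_ns, num_groups):
    """Split wires into num_groups speed categories by cutting the sorted
    delay list at its largest gaps (a simple 1-D clustering stand-in)."""
    ordered = sorted(delays_ns.items(), key=lambda kv: kv[1])
    if num_groups <= 1 or len(ordered) <= 1:
        return [dict(ordered)]
    # Gap between each neighbouring pair, tagged with the left index.
    gaps = [(ordered[i + 1][1] - ordered[i][1], i)
            for i in range(len(ordered) - 1)]
    # Cut after the indices that precede the biggest gaps.
    cut_after = sorted(i for _, i in sorted(gaps, reverse=True)[:num_groups - 1])
    groups, start = [], 0
    for i in cut_after:
        groups.append(dict(ordered[start:i + 1]))
        start = i + 1
    groups.append(dict(ordered[start:]))
    return groups


# Hypothetical per-wire delays in nanoseconds.
wire_delays = {"w_a": 0.4, "w_b": 0.5, "w_c": 2.1, "w_d": 2.3, "w_e": 5.0}
categories = group_delays(wire_delays, 3)
print([sorted(g) for g in categories])
```

This only works if the wire names in the timing report can be mapped back to the source, which is an open question for this flow.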
-
I don't think I understand. I think that assumes way too much about how well names are preserved through GHDL->Yosys->BLIF->pyrtl at the moment. But maybe not; I could imagine that not being an issue... I'll explain how the tool currently places registers...
-
Say you have a Say your device fmax is 500MHz( Then it goes at the task of |
-
Then it gets feedback from the syn+pnr tool saying. Your |
-
And it's that kind of iterative looping (finding out you were off and trying to re-adjust) that exists.
More detailed feedback like 'the critical path was in this specific submodule, stage 3' is possible, but not easily for all tools/flows.
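The guess, measure, re-adjust loop described above could be sketched roughly like this. This is not the actual PipelineC code: `toy_syn_pnr_fmax_mhz` is a made-up stand-in for a real syn+pnr run, the re-adjust rule is just one plausible choice, and all numbers are invented.

```python
import math


def toy_syn_pnr_fmax_mhz(total_delay_ns, stages, overhead_ns=0.5):
    """Toy tool model (assumption): logic delay splits evenly across stages,
    plus a fixed per-stage register/routing overhead."""
    period_ns = total_delay_ns / stages + overhead_ns
    return 1000.0 / period_ns


def search_stages(delay_estimate_ns, goal_mhz, measure, max_iters=20):
    """Guess a stage count, measure fmax, and re-adjust until the goal is met."""
    target_period_ns = 1000.0 / goal_mhz
    stages = max(1, math.ceil(delay_estimate_ns / target_period_ns))
    history = []
    for _ in range(max_iters):
        fmax = measure(stages)
        history.append((stages, fmax))
        if fmax >= goal_mhz:
            return stages, history
        # Missed the goal: scale by how far off we were, at least one more stage.
        stages = max(stages + 1, math.ceil(stages * goal_mhz / fmax))
    return None, history


# Example: 20 ns of logic, 250 MHz goal (made-up numbers).
found, tried = search_stages(
    20.0, 250.0, lambda n: toy_syn_pnr_fmax_mhz(20.0, n))
print(found, tried)
```

Each call to `measure` stands for a full tool run, so the number of entries in `tried` is the cost this discussion is trying to reduce.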
-
Aaaahhh, okay, now I know how it works. I thought that you analyzed at the netlist level, and that if you were pipelining a netlist it could be inserted directly into the FPGA and resynthesized (pipelining the netlist, not the C model). So if some net connection, in other words a path, is loooong, you could slice it in half and put a register in the middle. That way we could aim at critical paths, and if netlists are supported then all other tools and even encrypted IP are supported, IF you have a netlist. So my proposal is too low level for now, I think. So the best we can do is detect the worst module and slice it N times. We could slice each module separately with multiprocessing and detect the optimal slice level with, for example, binary search (or, as I would like, have some fun with some ML models and predict as many slicing parameters as I possibly can :D).
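As a back-of-the-envelope version of "detect the worst module and slice it N times", here is a small sketch. The module names and delays are hypothetical, and it assumes the ideal case where a module's delay divides evenly across stages with no register overhead.

```python
import math


def worst_module(module_delays_ns):
    """Name of the module with the largest combinatorial delay."""
    return max(module_delays_ns, key=module_delays_ns.get)


def stages_per_module(module_delays_ns, target_period_ns):
    """Ideal stage count per module: ceil(delay / target period).
    Registers inserted = stages - 1. Ignores register and routing overhead."""
    plan = {}
    for name, delay_ns in module_delays_ns.items():
        stages = max(1, math.ceil(delay_ns / target_period_ns))
        plan[name] = {"stages": stages, "registers": stages - 1}
    return plan


# Hypothetical module delays in nanoseconds, 4 ns target period (250 MHz).
delays = {"mul": 9.0, "add": 3.0, "cmp": 1.5}
plan = stages_per_module(delays, target_period_ns=4.0)
print(worst_module(delays), plan)
```

Since each module's count is computed independently here, the per-module work is the part that could be farmed out to separate processes.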
-
Yeah, that's good thinking - perhaps what actually occurs for larger non- Regardless, the 'slice a module N times' process is roughly the same whether it's at the top level or not (it's that 'down the call stack' slicing until it's raw VHDL you are manipulating)
-
Definitely down to have some fun ML experiments to see what better predictions can be done :D
-
Better not to reinvent the wheel: https://github.com/MattRighetti/leiserson-retiming/wiki
-
Haha, can't wait for the new PipelineC v2 :D
-
Oh gosh v2 ⏳
This connects to #45 as well. If I were modeling at the LUT level, then each LUT would become a comb. element in the network that the retiming tool is analyzing. It doesn't need to be LUT level, but some fixed combinatorial primitives seem to be needed. I don't really have those? My comb. primitives can be split / sliced. So I don't know what the smallest comb. chunks are for the tool to move around. Maybe I need to ask: does that retiming tool have a way to say 'split this comb. element delay in half'? At the moment this sounds like version 2 for sure - I need to think about how the current way things work would map to some fixed comb. prim network you could throw into some other tool.
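For reference, the fixed-primitive model that retiming works on looks roughly like the sketch below. It follows the classic Leiserson-Saxe formulation (nodes with a fixed delay, edges carrying a register count), not the API of the linked repo, and the three-node ring is a made-up example. Note that nodes here cannot be split, which is exactly the mismatch raised above.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def clock_period(delay, edges):
    """Longest register-free path delay.
    delay: {node: comb delay}; edges: [(src, dst, num_registers)]."""
    preds = {v: set() for v in delay}
    for u, v, w in edges:
        if w == 0:                  # only register-free edges chain delays
            preds[v].add(u)
    arrival = {}
    # static_order() yields predecessors first; a comb. loop raises CycleError.
    for v in TopologicalSorter(preds).static_order():
        arrival[v] = delay[v] + max((arrival[u] for u in preds[v]), default=0)
    return max(arrival.values())


def retime(edges, r):
    """Apply retiming r: new weight of edge u->v is w + r[v] - r[u]."""
    out = [(u, v, w + r[v] - r[u]) for u, v, w in edges]
    if any(w < 0 for _, _, w in out):
        raise ValueError("illegal retiming: negative register count")
    return out


# Made-up ring a -> b -> c -> a with two registers on the feedback edge.
delay = {"a": 3, "b": 3, "c": 3}
edges = [("a", "b", 0), ("b", "c", 0), ("c", "a", 2)]
before = clock_period(delay, edges)
# Move one register from c's output to c's input.
after_edges = retime(edges, {"a": 0, "b": 0, "c": 1})
after = clock_period(delay, after_edges)
print(before, after, after_edges)
```

Retiming only moves the existing registers around; it never changes how many sit on a loop, so it complements rather than replaces choosing the number of pipeline stages.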
-
Can try to follow up on this work here
-
Let me know if you agree with what I'm musing, with regard to estimating the largest delay of a path in the netlist:
-
Please see #46 for the recent new caching of period values for pipelined built-in functions.
-
The RWRoute paper has the timing model equations used in RapidWright (see page 13): http://www.rapidwright.io/docs/Papers.html
-
Ooo nice - I think trying to use RapidWright ~instead of Vivado is an option that could be ~easily explored for getting faster timing modeling with Xilinx FPGAs. Using their equations, or something like them, to derive our own timing models for all/any FPGA architecture also sounds possible, but is indeed a more difficult/experimental long-term goal.
-
Relating to, but not the same as, Faster timing estimates #46.
After having some mechanism to get timing estimates^ (aka being able to do circuit input -> timing info out), how does PipelineC then use that info to search for a particular fmax goal in the fastest way possible?