Comments (10)
Testing PHY reinitialization with a long wait time (10s) still leaves the system stuck. This leads me to suspect the VSC7448, since it's not being power cycled, but getting ground-truth readings is going to be essential for debugging.
from hubris.
:( ok. Someone will need to probe the board in the office asap
OK so sad news, our osc3 does a thing that makes it output the incorrect frequency sometimes. See this note from microshit:
https://ww1.microchip.com/downloads/en/DeviceDoc/DSC11xx-Family-Silicon-Errata-DS80000982A.pdf
Unfortunately the parts which tri-state are unobtainable or have very long lead times.
Unfortunately sometimes we see a sad clock...
The off-frequency signal, which is a symptom of the problem above:
This is the VSC7448 side of our link.
Unfortunately, I don't think it's wise to plan any more rework on this pass of sidecar. As discussed on the hardware tactical today, given the ~2% boot failure rate, we're proposing that the sidecar power-cycle the qsfp board (software workaround) in the cases where this issue is detected. @arjenroodselaar is signed up to scope out that work.
This isn't awesome and we should consider using a different part in the future.
First, we are going to try and power cycle the Front IO board from Sidecar.
Alternatively, per our huddle, I plan to sever the FPGA's connection to the enable of our current osc (in a repairable way), and we will attempt to work with the VSC when we violate its sequencing instructions (it typically wants power before refclk).
An update on this issue: https://github.com/oxidecomputer/hubris/tree/front_io_bad_osc contains changes across the sequencer task, the monorail task and the controller bitstreams to work around this issue. This is currently running in a loop where the system is power cycled and the links are checked afterwards. So far the monorail task has detected two instances where the QSGMII link did not come up, and the front IO board needed to be power cycled and the PHY reinitialized to work around the problem. Afterwards the QSGMII link and technician ports worked as intended.
This will take a few days to get through review, but so far a software workaround seems adequate.
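For readers following along, the detect-and-retry flow described above can be sketched roughly as below. This is a minimal illustration, not the code on the branch: the `FrontIo` trait, its method names, `bring_up_link`, `LinkDown`, the `FlakyBoard` mock and the retry limit are all hypothetical names invented for this example.

```rust
/// Hypothetical hardware interface. These names are assumptions made for
/// illustration and are not taken from the Hubris sources.
trait FrontIo {
    /// Returns true if the QSGMII link is up.
    fn qsgmii_link_up(&mut self) -> bool;
    /// Power cycles the front IO board (restarting its oscillator).
    fn power_cycle(&mut self);
    /// Reinitializes the PHY after the board has been power cycled.
    fn reinit_phy(&mut self);
}

/// Error returned when the link is still down after the allowed retries.
#[derive(Debug, PartialEq)]
struct LinkDown {
    power_cycles: u32,
}

/// Checks the link; while it is down, power cycles the board and
/// reinitializes the PHY, up to `max_power_cycles` times.
/// Returns the number of power cycles that were needed.
fn bring_up_link<B: FrontIo>(board: &mut B, max_power_cycles: u32) -> Result<u32, LinkDown> {
    let mut cycles = 0;
    loop {
        if board.qsgmii_link_up() {
            return Ok(cycles);
        }
        if cycles == max_power_cycles {
            return Err(LinkDown { power_cycles: cycles });
        }
        board.power_cycle();
        board.reinit_phy();
        cycles += 1;
    }
}

/// Mock board whose oscillator starts at the wrong frequency for a fixed
/// number of boots. Real hardware fails at random, not on a schedule.
struct FlakyBoard {
    bad_boots_left: u32,
}

impl FrontIo for FlakyBoard {
    fn qsgmii_link_up(&mut self) -> bool {
        self.bad_boots_left == 0
    }
    fn power_cycle(&mut self) {
        self.bad_boots_left = self.bad_boots_left.saturating_sub(1);
    }
    fn reinit_phy(&mut self) {}
}
```

With the mock, `bring_up_link(&mut FlakyBoard { bad_boots_left: 2 }, 3)` returns `Ok(2)`, while a board that needs more than three power cycles yields `Err(LinkDown { power_cycles: 3 })`. Bounding the retries keeps a board with a genuinely dead link from being power cycled forever.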
This ran overnight, completing 1464 power cycles of Sidecar. During 57 of those cycles, `monorail-server` determined that the QSGMII link was not functional and requested one or more power cycles of the front IO board from the sequencer. Once the QSGMII link came up, ping tests using both technician ports succeeded in all 1464 cycles.
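As a quick sanity check on those numbers, the observed rate of cycles needing the workaround can be computed as below. The helper name `failure_rate_percent` is made up for this example; the counts are the ones reported above.

```rust
/// Percentage of runs in which the link failed to come up.
/// Hypothetical helper, named for this example only.
fn failure_rate_percent(failures: u32, total: u32) -> f64 {
    100.0 * f64::from(failures) / f64::from(total)
}
```

`failure_rate_percent(57, 1464)` comes out at roughly 3.9%, somewhat above the ~2% boot failure rate quoted earlier in the thread. The two figures may not measure exactly the same thing, so treat the comparison as indicative only.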
Done in #1449