[DNM] Random attempts to fix controller reset issue #4894
plbossart wants to merge 3 commits into thesofproject:topic/sof-dev
…commands mode

When we use the PIO mode, we should not program anything related to CORB.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
This doesn't seem to have any effect on the controller reset issue, but I wonder why we disable interrupts and then stop the commands. This sounds backwards: shouldn't the commands complete first, and the interrupts be disabled only afterwards?

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
For some reason, the programming sequences in the SOF driver do not include the required clear of the WAKE_STS bits when the controller is not in reset. Adding this sequence avoids a regression on LunarLake when using PIO commands. The correlation or causation is not clear, but there was no reason in the first place to deviate from the recommended programming sequence.

Closes: thesofproject#4889
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
FWIW the 3rd commit partially reverts 6ec3295 ("ASoC: SOF: Intel: hda: remove duplicated clear WAKESTS"). I have no idea why this was removed. It could also be that there's an interference from ae5ff22 ("ASoC: SOF: Intel: hda-ctrl: only clear WAKESTS for HDaudio codecs"), where in the end we didn't clear all the WAKE_STS fields. Wow, my brain is officially fried haha.
And for reference, the original clear was added in 2007 by someone using an "obiwan" email. Who said we can't have fun at work, eh?
This could have been worse, it could have been "JiaT75"
I'm puzzled: SDW should not be using CORB and we don't have display for LNL yet (afaik). HDA setups are clear, only SDW is failing. How come the PIO mode triggered this mass failure with SDW, and a patch seemingly unrelated to PIO mode fixes the regression?
    gctl = snd_sof_dsp_read(sdev, HDA_DSP_HDA_BAR, SOF_HDA_GCTL);
    if (gctl & SOF_HDA_GCTL_RESET)
        snd_sof_dsp_write(sdev, HDA_DSP_HDA_BAR,
                          SOF_HDA_WAKESTS, SOF_HDA_WAKESTS_INT_MASK);
@plbossart @ujfalusi This is really curious. HDA-spec wise this clearing is not needed, the state will be reset with controller reset. But indeed in snd-hda-intel, this has been historically done to avoid bugs, but until now, we've not needed this in SOF.
Hmm, based on my test, the commit does help with updating the GCTL value in hda_dsp_ctrl_link_reset(): without this commit, gctl always reads 1. But I don't understand why clearing WAKE_STS helps with this. Looks like something is blocked?
There is a clear correlation between SoundWire and WAKE_STS. From LNL onwards, we use the HDaudio wake detection instead of the SoundWire wake detection. So clearing all the bits prior to entering reset makes sense, otherwise there might be some bitfields that are left active. But yeah I don't have any explanation for the PIO role in triggering this. We may need to double check what happens if we don't enable PIO for LNL.
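To make the "clear all the bits prior to entering reset" point concrete, here is a minimal, self-contained sketch. It is not driver code: the register file, the `fake_*` helpers and the `FAKE_*` constants are hypothetical stand-ins, and the model assumes the write-1-to-clear behaviour that the HDaudio specification defines for the wake status bits. It only mirrors the shape of the guarded clear in the diff above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the controller registers; offsets follow the
 * HDaudio register map (GCTL at 0x08, wake status at 0x0E) but are only
 * assumptions for this model. */
#define FAKE_GCTL         0x08u
#define FAKE_GCTL_RESET   (1u << 0)   /* 1 = controller out of reset */
#define FAKE_WAKESTS      0x0Eu
#define FAKE_WAKESTS_MASK 0xFFu
#define FAKE_REG_SPACE    0x100u

/* One slot per byte offset: wasteful, but keeps the model trivial. */
struct fake_ctrl {
    uint32_t regs[FAKE_REG_SPACE];
};

static uint32_t fake_read(const struct fake_ctrl *c, uint32_t off)
{
    assert(off < FAKE_REG_SPACE);
    return c->regs[off];
}

static void fake_write(struct fake_ctrl *c, uint32_t off, uint32_t val)
{
    assert(off < FAKE_REG_SPACE);
    if (off == FAKE_WAKESTS)
        c->regs[off] &= ~val;   /* write-1-to-clear: set bits in val clear status bits */
    else
        c->regs[off] = val;     /* plain read/write register */
}

/* Same shape as the diff: clear the wake status only while the controller
 * is out of reset. Reset itself is not modelled here. */
static void fake_clear_wakests_if_out_of_reset(struct fake_ctrl *c)
{
    uint32_t gctl = fake_read(c, FAKE_GCTL);

    if (gctl & FAKE_GCTL_RESET)
        fake_write(c, FAKE_WAKESTS, FAKE_WAKESTS_MASK);
}
```

Writing the full mask rather than the bits of known codecs is what leaves no stale wake bit behind, which matches the concern about ae5ff22 raised earlier in the thread.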
@plbossart, we kind of know what happens if we don't enable PIO mode for LNL...
I meant: do we see the reset issue without the PIO mode? And you've replied that it existed before and we overlooked it.
    /* disable controller CIE and GIE */
    snd_sof_dsp_update_bits(sdev, HDA_DSP_HDA_BAR, SOF_HDA_INTCTL,
                            SOF_HDA_INT_CTRL_EN | SOF_HDA_INT_GLOBAL_EN,
                            0);
The question is: does snd_sof_dsp_write() need the interrupts?
I don't know if this is useful, what I was thinking is that disabling the interrupts and then stopping the commands makes no sense in general. If there was an in-flight command it would be blocked, be it with the CORB/RIRB or PIO mode.
For now this doesn't have any effect so we can drop it. Still it'd be good to double-check this.
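For reference, the ordering being questioned can be sketched as follows. This is a toy model, not the SOF implementation: `struct fake_hda`, `fake_stop_cmd_io()` and the `FAKE_*` bits are hypothetical, and the sketch only shows "stop the commands first, then clear CIE and GIE" with a read-modify-write helper shaped like the `update_bits` call in the diff above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical interrupt-control bits, named after CIE/GIE in the diff. */
#define FAKE_INT_CTRL_EN   (1u << 30)
#define FAKE_INT_GLOBAL_EN (1u << 31)

struct fake_hda {
    uint32_t intctl;               /* modelled interrupt control register */
    int cmd_io_running;            /* 1 while the command path is active */
    int intr_on_at_cmd_stop;       /* records whether interrupts were still on */
};

/* Read-modify-write: only the bits selected by mask are changed. */
static void fake_update_bits(uint32_t *reg, uint32_t mask, uint32_t value)
{
    *reg = (*reg & ~mask) | (value & mask);
}

/* Stop command I/O and record whether interrupts were still enabled, so an
 * in-flight interrupt-driven command could still have completed. */
static void fake_stop_cmd_io(struct fake_hda *h)
{
    h->intr_on_at_cmd_stop =
        (h->intctl & (FAKE_INT_CTRL_EN | FAKE_INT_GLOBAL_EN)) != 0;
    h->cmd_io_running = 0;
}

/* Proposed order: commands first, interrupts second. */
static void fake_teardown(struct fake_hda *h)
{
    fake_stop_cmd_io(h);
    fake_update_bits(&h->intctl,
                     FAKE_INT_CTRL_EN | FAKE_INT_GLOBAL_EN, 0);
}
```

Whether this order matters on real hardware is exactly what the thread leaves open; the model only makes the difference between the two orders observable.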
The PIO mode is polling on the IC registers, it is not using interrupts, CORB/RIRB uses interrupts.
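A rough illustration of what "polling on the IC registers" means, as opposed to waiting for a RIRB interrupt. Everything here is a simulation under stated assumptions: `struct fake_pio` and the `fake_*` functions are hypothetical, the busy/valid bits follow the immediate command status layout in the HDaudio specification, and the fake device simply answers with the bitwise inverse of the command after a configurable number of polls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status bits, modelled on the immediate command status
 * register (busy = bit 0, result valid = bit 1). */
#define FAKE_IRS_BUSY  (1u << 0)
#define FAKE_IRS_VALID (1u << 1)

struct fake_pio {
    uint32_t ic;            /* immediate command output */
    uint32_t ir;            /* immediate response input */
    uint32_t irs;           /* immediate command status */
    int polls_until_done;   /* busy polls before the fake codec answers */
};

/* Reading the status register advances the fake device by one step. */
static uint32_t fake_read_irs(struct fake_pio *p)
{
    if ((p->irs & FAKE_IRS_BUSY) && p->polls_until_done-- <= 0) {
        p->ir = ~p->ic;             /* made-up response for the model */
        p->irs = FAKE_IRS_VALID;    /* busy cleared, result valid */
    }
    return p->irs;
}

/* Send one command and poll for the response; no interrupt is involved.
 * Returns false if the device is busy or max_polls is exhausted. */
static bool fake_pio_send(struct fake_pio *p, uint32_t cmd,
                          uint32_t *resp, int max_polls)
{
    if (fake_read_irs(p) & FAKE_IRS_BUSY)
        return false;

    p->ic = cmd;
    p->irs = FAKE_IRS_BUSY;         /* models setting the busy bit to start */

    for (int i = 0; i < max_polls; i++) {
        if (fake_read_irs(p) & FAKE_IRS_VALID) {
            *resp = p->ir;
            return true;
        }
    }
    return false;                   /* timed out while polling */
}
```

Because completion is detected by reading the status register in a loop, masking controller interrupts does not by itself block a command in this mode, which is the distinction the comment above draws against CORB/RIRB.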
I don't have access to the entire dmesg, but daily test 38398 (from 29.02.2024) does have these: I don't see gctl
Suggested hacks based on code review only.
I don't see any improvement on the hardware, but the code does look suspicious. With the 3rd commit I can't reproduce the errors reported in issue #4889, root-caused to the use of PIO commands (see PR #4892).