ASoC: SOF: Intel: fix HDA codec driver probe with multiple controllers #1679
Conversation
|
Don't merge yet, let's wait until bug reporter verifies this helps on affected systems. |
sound/soc/sof/intel/hda-codec.c
Outdated
|
|
||
| res = device_attach(hda_codec_dev(codec)); | ||
| if (res < 0) | ||
| dev_err(&codec->core.dev, "codec driver attach fail %d\n", res); |
There was a problem hiding this comment.
if this can fail, we should return the error no?
There was a problem hiding this comment.
@ranj063 Probably so yes. We do catch the error in the ASoC card registration. I want to also study the device_*() API a bit more, but I wanted to get a version of the patch uploaded so this can be tested on target hardware ASAP.
sound/soc/sof/intel/hda-codec.c
Outdated
| { | ||
| char alias[MODULE_NAME_LEN]; | ||
| const char *module = alias; | ||
| int res; |
There was a problem hiding this comment.
I find the commit message rather confusing.
The codec probe should not be run just on a module load, this should be dependent on some event or a match with a specific ID, no?
Also I don't see the link with the machine driver, this is a patch against the platform driver? the failure may happen at the machine driver probe/init level but as a result of something missing earlier.
There was a problem hiding this comment.
@plbossart Yes, I'll work on the commit message.
The legacy HDA driver has similar logic around request_module(). It seems the codec probe is not automatic, but has to be triggered like this. I'll dig a bit deeper, but definitely this is a major gap now in SOF, the probe does not get run in cases such as this.
And this is basicly a separate step of the platform driver. Platform driver needs to discover the HDA codecs present and load the correct driver modules. Then when machine driver runs, then all the required entities are present and DAI links can be set up. Now we fail in machine driver snd_soc_register_card().
e465950 to
a4e86e8
Compare
|
Ok @plbossart and @ranj063 , ready for reviewing now. Adding the error propagation was (as expected) more work as this is a new place to stop the probe in, so of course it covered new bugs in the current code (and thus the second patch -> basicly you got system in state where reboot was only way out). But this is aligned with legacy driver now and should work with all the various cases. |
plbossart
left a comment
There was a problem hiding this comment.
looks mostly good except for the error handling which doesn't match the comments.
sound/soc/sof/intel/hda-codec.c
Outdated
| hda_codec_load_module(&hda_priv->codec); | ||
| ret = hda_codec_load_module(&hda_priv->codec); | ||
| /* -1 = no device, 0 = no driver -> both errors */ | ||
| if (!ret) |
There was a problem hiding this comment.
that should be if (ret <= 0) if I understand the comment? the case with -1 would not be an error
There was a problem hiding this comment.
@plbossart I want to pass negative error codes as is from hda_codec_load_module() here. E.g. it can return -ENODEV or -ENOMEM. Only in the case of zero as return value (no driver attached), we return -ENOENT. This way you can see from the dbg traces which error case this is.
UPDATE: one error in comment though, should have " <0 = no device"
There was a problem hiding this comment.
ok, but then the comment can be improved. what looks like a bug is an intended feature...
ret = 0 means no driver, so explicitly adjust the return value. Negative values mean no device and will be returned as is.
There was a problem hiding this comment.
@plbossart ack, you are right, I'm trying to be too clever here and readability suffers. I'll update and resubmit
There was a problem hiding this comment.
@plbossart Now updated. This not 100% happy, but at least it's better. I realized we pass also positive values unmodified, so adjusted the language a bit more. Let's iterate until it's understandable.
|
@plbossart wrote:
In this case this is expected. There's a HDMI/DP connectors routed to external graphics engine, but you can still plug a DP monitor via USB-C that is tied to internal graphics. This does not fix the case where internal graphics is disabled (or the display block is not used), but that needs to be handled separately. |
In case system has multiple HDA controllers, it can happen that same HDA codec driver is used for codecs of multiple controllers. In this case, SOF may fail to probe the HDA driver and SOF initialization fails. SOF HDA code currently relies that a call to request_module() will also run device matching logic to attach driver to the codec instance. However if driver for another HDA controller was already loaded and it already loaded the HDA codec driver, this breaks current logic in SOF. In this case the request_module() SOF does becomes a no-op and HDA Codec driver is not attached to the codec instance sitting on the HDA bus SOF is controlling. Typical scenario would be a system with both external and internal GPUs, with driver of the external GPU loaded first. Fix this by adding similar logic as is used in legacy HDA driver where an explicit device_attach() call is done after request_module(). Also add logic to propagate errors reported by device_attach() back to caller. This also works in the case where drivers are not built as modules. Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
In case system has multiple HDA codecs, and codec probe fails for at least one but not all codecs, driver will end up cancelling a non-initialized timer context upon driver removal. Call trace of typical case: [ 60.593646] WARNING: CPU: 1 PID: 1147 at kernel/workqueue.c:3032 __flush_work+0x18b/0x1a0 [...] [ 60.593670] __cancel_work_timer+0x11f/0x1a0 [ 60.593673] hdac_hda_dev_remove+0x25/0x30 [snd_soc_hdac_hda] [ 60.593674] device_release_driver_internal+0xe0/0x1c0 [ 60.593675] bus_remove_device+0xd6/0x140 [ 60.593677] device_del+0x175/0x3e0 [ 60.593679] ? widget_tree_free.isra.7+0x90/0xb0 [snd_hda_core] [ 60.593680] snd_hdac_device_unregister+0x34/0x50 [snd_hda_core] [ 60.593682] snd_hdac_ext_bus_device_remove+0x2a/0x60 [snd_hda_ext_core] [ 60.593684] hda_dsp_remove+0x26/0x100 [snd_sof_intel_hda_common] [ 60.593686] snd_sof_device_remove+0x84/0xa0 [snd_sof] [ 60.593687] sof_pci_remove+0x10/0x30 [snd_sof_pci] [ 60.593689] pci_device_remove+0x36/0xb0 Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
a4e86e8 to
0fd0fd9
Compare
In case system has multiple HDA controllers, it can happen that
the HDA codec modules are already loaded and their probe is run.
In this case, when SOF machine driver is loaded, the codec probe
is not triggered by request_module(), but rather explicit
device_attach() is required.
Fixes: #1672