Skip to content

ASoC: SOF: Intel: fix HDA codec driver probe with multiple controllers #1679

Merged
plbossart merged 2 commits intothesofproject:topic/sof-devfrom
kv2019i:topic/hda-multi-hda-probe
Jan 10, 2020
Merged

ASoC: SOF: Intel: fix HDA codec driver probe with multiple controllers #1679
plbossart merged 2 commits intothesofproject:topic/sof-devfrom
kv2019i:topic/hda-multi-hda-probe

Conversation

@kv2019i
Copy link
Collaborator

@kv2019i kv2019i commented Jan 9, 2020

In case system has multiple HDA controllers, it can happen that
the HDA codec modules are already loaded and their probe is run.
In this case, when SOF machine driver is loaded, the codec probe
is not triggered by request_module(), but rather explicit
device_attach() is required.

Fixes: #1672

@kv2019i
Copy link
Collaborator Author

kv2019i commented Jan 9, 2020

Don't merge yet, let's wait until bug reporter verifies this helps on affected systems.


res = device_attach(hda_codec_dev(codec));
if (res < 0)
dev_err(&codec->core.dev, "codec driver attach fail %d\n", res);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this can fail, we should return the error no?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ranj063 Probably so yes. We do catch the error in the ASoC card registration. I want to also study the device_*() API a bit more, but I wanted to get a version of the patch uploaded so this can be tested on target hardware ASAP.

{
char alias[MODULE_NAME_LEN];
const char *module = alias;
int res;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the commit message rather confusing.

The codec probe should not be run just on a module load, this should be dependent on some event or a match with a specific ID, no?

Also I don't see the link with the machine driver, this is a patch against the platform driver? the failure may happen at the machine driver probe/init level but as a result of something missing earlier.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plbossart Yes, I'll work on the commit message.
The legacy HDA driver has similar logic around request_module(). It seems the codec probe is not automatic, but has to be triggered like this. I'll dig a bit deeper, but definitely this is a major gap now in SOF, the probe does not get run in cases such as this.

And this is basicly a separate step of the platform driver. Platform driver needs to discover the HDA codecs present and load the correct driver modules. Then when machine driver runs, then all the required entities are present and DAI links can be set up. Now we fail in machine driver snd_soc_register_card().

@kv2019i kv2019i force-pushed the topic/hda-multi-hda-probe branch from e465950 to a4e86e8 Compare January 10, 2020 12:26
@kv2019i kv2019i changed the title ASoC: SOF: Intel: force HDA driver attach after request_module() ASoC: SOF: Intel: fix HDA codec driver probe with multiple controllers Jan 10, 2020
@kv2019i
Copy link
Collaborator Author

kv2019i commented Jan 10, 2020

Ok @plbossart and @ranj063 , ready for reviewing now. Adding the error propagation was (as expected) more work as this is a new place to stop the probe in, so of course it covered new bugs in the current code (and thus the second patch -> basicly you got system in state where reboot was only way out).

But this is aligned with legacy driver now and should work with all the various cases.

Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks mostly good except for the error handling which doesn't match the comments.

hda_codec_load_module(&hda_priv->codec);
ret = hda_codec_load_module(&hda_priv->codec);
/* -1 = no device, 0 = no driver -> both errors */
if (!ret)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that should be if (ret <= 0) if I understand the comment? the case with -1 would not be an error

Copy link
Collaborator Author

@kv2019i kv2019i Jan 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plbossart I want to pass negative error codes as is from hda_codec_load_module() here. E.g. it can return -ENODEV or -ENOMEM. Only in the case of zero as return value (no driver attached), we return -ENOENT. This way you can see from the dbg traces which error case this is.
UPDATE: one error in comment though, should have " <0 = no device"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, but then the comment can be improved. what looks like a bug is an intended feature...

ret = 0 means no driver, so explicitly adjust the return value. Negative values mean no device and will be returned as is.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plbossart ack, you are right, I'm trying to be too clever here and readability suffers. I'll update and resubmit

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plbossart Now updated. This not 100% happy, but at least it's better. I realized we pass also positive values unmodified, so adjusted the language a bit more. Let's iterate until it's understandable.

@plbossart
Copy link
Member

@kv2019i does this actually fix #1672 completely? we would still use Intel graphics for HDMI when there's an external graphics engine, so shouldn't we disable the internal graphics and all the idisp audio paths?

@kv2019i
Copy link
Collaborator Author

kv2019i commented Jan 10, 2020

@plbossart wrote:

@kv2019i does this actually fix #1672 completely? we would still use Intel graphics for HDMI when there's an external graphics engine, so shouldn't we disable the internal graphics and all the idisp audio paths?

In this case this is expected. There's a HDMI/DP connectors routed to external graphics engine, but you can still plug a DP monitor via USB-C that is tied to internal graphics.

This does not fix the case where internal graphics is disabled (or the display block is not used), but that needs to be handled separately.

In case system has multiple HDA controllers, it can happen that
same HDA codec driver is used for codecs of multiple controllers.
In this case, SOF may fail to probe the HDA driver and SOF
initialization fails.

SOF HDA code currently relies that a call to request_module() will
also run device matching logic to attach driver to the codec instance.
However if driver for another HDA controller was already loaded and it
already loaded the HDA codec driver, this breaks current logic in SOF.
In this case the request_module() SOF does becomes a no-op and HDA
Codec driver is not attached to the codec instance sitting on the HDA
bus SOF is controlling. Typical scenario would be a system with both
external and internal GPUs, with driver of the external GPU loaded
first.

Fix this by adding similar logic as is used in legacy HDA driver
where an explicit device_attach() call is done after request_module().

Also add logic to propagate errors reported by device_attach() back
to caller. This also works in the case where drivers are not built
as modules.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
In case system has multiple HDA codecs, and codec probe fails for
at least one but not all codecs, driver will end up cancelling
a non-initialized timer context upon driver removal.

Call trace of typical case:

[   60.593646] WARNING: CPU: 1 PID: 1147 at kernel/workqueue.c:3032
__flush_work+0x18b/0x1a0
[...]
[   60.593670]  __cancel_work_timer+0x11f/0x1a0
[   60.593673]  hdac_hda_dev_remove+0x25/0x30 [snd_soc_hdac_hda]
[   60.593674]  device_release_driver_internal+0xe0/0x1c0
[   60.593675]  bus_remove_device+0xd6/0x140
[   60.593677]  device_del+0x175/0x3e0
[   60.593679]  ? widget_tree_free.isra.7+0x90/0xb0 [snd_hda_core]
[   60.593680]  snd_hdac_device_unregister+0x34/0x50 [snd_hda_core]
[   60.593682]  snd_hdac_ext_bus_device_remove+0x2a/0x60 [snd_hda_ext_core]
[   60.593684]  hda_dsp_remove+0x26/0x100 [snd_sof_intel_hda_common]
[   60.593686]  snd_sof_device_remove+0x84/0xa0 [snd_sof]
[   60.593687]  sof_pci_remove+0x10/0x30 [snd_sof_pci]
[   60.593689]  pci_device_remove+0x36/0xb0

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
@kv2019i kv2019i force-pushed the topic/hda-multi-hda-probe branch from a4e86e8 to 0fd0fd9 Compare January 10, 2020 18:06
Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kv2019i

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] No SOF sound card for error 'ASoC: CODEC DAI intel-hdmi-hifi1 not registered' if NVidia audio probes earlier

3 participants