Skip to content

[RFC][WIP] large message ipc#609

Closed
juimonen wants to merge 2 commits intothesofproject:topic/sof-devfrom
juimonen:large_message
Closed

[RFC][WIP] large message ipc#609
juimonen wants to merge 2 commits intothesofproject:topic/sof-devfrom
juimonen:large_message

Conversation

@juimonen
Copy link

This is a basis for discussion how to send larger messages in smaller parts to the dsp. With this I was able to load fir eq with size over 3kb. The last patch is a test to enable large blobs from controls and the proper way to define max large message size (and the actual sise) should be discussed.

@juimonen
Copy link
Author

related sof pr:
thesofproject/sof#927

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We now have functions which are doing basically the same thing, but one is able to send larger messages. I propose to export just one function to avoid confusion for readers.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, will change

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can win an indentation level if you check for msg < SOF_IPC_MSG_MAX_SIZE and return directly here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not 0XAU?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0XAU was already defined in sof fw side... maybe the kernel side header duplicate is not up to date

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sof fw side... maybe the kernel side header duplicate is not up to date

Yes, I noticed that now. We should update the Linux side.

@dbaluta
Copy link
Collaborator

dbaluta commented Jan 31, 2019

@lgirdwood Why is SOF_IPC_MSG_MAX_SIZE so small? I thought mailbox can hold up to 4k?

Jaska Uimonen added 2 commits January 31, 2019 13:08
Make SOF ipc to handle large IPC messages by chopping big
messages to smaller parts.

Signed-off-by: Jaska Uimonen <jaska.uimonen@intel.com>
Currently SOF limits control bytes data to ipc max data size.
Remove this limitation to test partial large ipc messages.

Signed-off-by: Jaska Uimonen <jaska.uimonen@intel.com>
@mengdonglin mengdonglin added the upstream blocker Mandatory for upstream label Jan 31, 2019
Copy link
Member

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a completely different view of all this. To me the main issue was to pass large amounts of data between user space and kernel. The IPC could be done in chunks of 1, 128, 4k bytes, it's not really a problem. I was really thinking of a compound IPC and a mechanism to identify that things are in sequence (i.e. to identify that a part was dropped).
Also if we start having multiple sets of IPCs, then should the loop be handled in a thread? this really means the interaction with the DSP is blocked until the large config is passed, and if you have e.g. 30KB this might take a while. For example if userspace wants to know the DMA position over IPC but the IPC is blocked because of a large transfer what happens. The serialization could be problematic.

uint32_t id;
unsigned char data[];
} __packed;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we instead use the concept of compound_ipc, but make sure that there is a notion of start/stop and continuation between commands.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what is the point of having large message implementation if they fit our ipc frame? For e.g. fir filter coeffs are byte blobs that can be really long. If they are going to be transferred to dsp they have to be chopped and framed somehow. So they are not compound messages but 1 message that doesn't fit the transfer size.

I'm not totally sure how the kernel scheduling goes here: we are sending max size chunks in our "normal" ipc way, so I guess the kernel has time to schedule something else, but I guess other IPC at the same time to dsp might be difficult. We could disable large messages if there's stream playing?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plbossart @juimonen why can't we use a larger shared memory area? e.g reserve a chunk of system RAM for communication between DSP and CPU? Thus we can allocate few MB to allow large transfer with a single transfer.

Basically, get rid of the 4K mailbox.

ret = sof_ipc_tx_msg(ipc, SOF_IPC_GLB_LARGE, hdr, send_bytes +
sizeof(struct sof_ipc_large_hdr),
reply_data, reply_bytes);
if (ret < 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use multiple regular IPC transfers, I don't get the need for a new type of transfers.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so actually I renamed the old tx_message to tx_msg, so this is sending multiple "old style" max size ipc transfers (they are just framed inside new struct to be able to gather up on dsp side). Smaller than max transfer size messages are going just as before. This way I didn't have to touch topology.c.

@juimonen juimonen closed this Feb 8, 2019
bardliao pushed a commit to bardliao/linux that referenced this pull request Jul 1, 2020
At times when I'm using kgdb I see a splat on my console about
suspicious RCU usage.  I managed to come up with a case that could
reproduce this that looked like this:

  WARNING: suspicious RCU usage
  5.7.0-rc4+ thesofproject#609 Not tainted
  -----------------------------
  kernel/pid.c:395 find_task_by_pid_ns() needs rcu_read_lock() protection!

  other info that might help us debug this:

    rcu_scheduler_active = 2, debug_locks = 1
  3 locks held by swapper/0/1:
   #0: ffffff81b6b8e988 (&dev->mutex){....}-{3:3}, at: __device_attach+0x40/0x13c
   #1: ffffffd01109e9e8 (dbg_master_lock){....}-{2:2}, at: kgdb_cpu_enter+0x20c/0x7ac
   #2: ffffffd01109ea90 (dbg_slave_lock){....}-{2:2}, at: kgdb_cpu_enter+0x3ec/0x7ac

  stack backtrace:
  CPU: 7 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc4+ thesofproject#609
  Hardware name: Google Cheza (rev3+) (DT)
  Call trace:
   dump_backtrace+0x0/0x1b8
   show_stack+0x1c/0x24
   dump_stack+0xd4/0x134
   lockdep_rcu_suspicious+0xf0/0x100
   find_task_by_pid_ns+0x5c/0x80
   getthread+0x8c/0xb0
   gdb_serial_stub+0x9d4/0xd04
   kgdb_cpu_enter+0x284/0x7ac
   kgdb_handle_exception+0x174/0x20c
   kgdb_brk_fn+0x24/0x30
   call_break_hook+0x6c/0x7c
   brk_handler+0x20/0x5c
   do_debug_exception+0x1c8/0x22c
   el1_sync_handler+0x3c/0xe4
   el1_sync+0x7c/0x100
   rpmh_rsc_probe+0x38/0x420
   platform_drv_probe+0x94/0xb4
   really_probe+0x134/0x300
   driver_probe_device+0x68/0x100
   __device_attach_driver+0x90/0xa8
   bus_for_each_drv+0x84/0xcc
   __device_attach+0xb4/0x13c
   device_initial_probe+0x18/0x20
   bus_probe_device+0x38/0x98
   device_add+0x38c/0x420

If I understand properly we should just be able to blanket kgdb under
one big RCU read lock and the problem should go away.  We'll add it to
the beast-of-a-function known as kgdb_cpu_enter().

With this I no longer get any splats and things seem to work fine.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200602154729.v2.1.I70e0d4fd46d5ed2aaf0c98a355e8e1b7a5bb7e4e@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

upstream blocker Mandatory for upstream

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants