Analyze unsafe code reachability #4037
Add call graph analysis to the scanner to compute the distance from each function in a crate to unsafe functions. To do this, we build the crate's call graph and collect the unsafe functions, then run a reverse BFS traversal from the unsafe functions and record the distance to every function that can reach them. The result is written to a new CSV file.
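The traversal described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function name `distances_to_unsafe`, the string-keyed `callers` map, and the example function names are all assumptions made for the sketch, and the scanner's real data structures are likely different.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical reverse call graph: `callers[f]` lists the functions that call `f`.
/// Returns, for every function that can reach an unsafe function, the length of
/// the shortest call chain leading to one (unsafe functions themselves get 0).
fn distances_to_unsafe<'a>(
    callers: &HashMap<&'a str, Vec<&'a str>>,
    unsafe_fns: &[&'a str],
) -> HashMap<&'a str, usize> {
    let mut dist: HashMap<&'a str, usize> = HashMap::new();
    let mut queue: VecDeque<&'a str> = VecDeque::new();

    // Seed the queue with every unsafe function at distance 0.
    for &f in unsafe_fns {
        if dist.insert(f, 0).is_none() {
            queue.push_back(f);
        }
    }

    // Walk the edges backwards (callee -> caller), one BFS level at a time.
    while let Some(f) = queue.pop_front() {
        let d = dist[&f];
        if let Some(cs) = callers.get(&f) {
            for &c in cs {
                if !dist.contains_key(&c) {
                    dist.insert(c, d + 1);
                    queue.push_back(c);
                }
            }
        }
    }
    dist
}
```

In this sketch, a function that never appears in the returned map has no call chain to any unsafe function. Because every unsafe function is a BFS source, the recorded value is the distance to the closest one.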
What does this distance metric tell us? More generally: what is the goal of the new analysis? And what else can we maybe do with it? I vaguely recall that I wanted to build upon this PR to find all functions that transitively involve loops, but now I'm not entirely sure this is the right place.
On my local machine, I added a filter to the queue initialization here to filter out unsafe functions and safe abstractions as roots of the tree. By starting only from safe functions and performing the analysis, we can compute how many safe functions have transitive unsafe dependencies. This is useful for the standard library, where 71% of the functions appear safe in that they don't contain any unsafe blocks, yet the analysis shows that 52% of these "safe" functions end up calling into unsafe code somewhere in their call chain.
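As a rough sketch of how such a count could be derived from a call graph (this is not the local patch described above; `count_safe_reaching_unsafe`, the string-keyed maps, and the convention that `unsafe_fns` holds both unsafe functions and safe abstractions containing unsafe blocks are assumptions for illustration):

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// `calls[f]` lists the functions that `f` calls (hypothetical representation).
/// `unsafe_fns` is assumed to hold every function that is unsafe or contains
/// an unsafe block. Returns (safe functions that transitively reach unsafe,
/// total number of safe functions).
fn count_safe_reaching_unsafe<'a>(
    calls: &HashMap<&'a str, Vec<&'a str>>,
    unsafe_fns: &HashSet<&'a str>,
) -> (usize, usize) {
    // Invert the edges and collect every function name seen in the graph.
    let mut callers: HashMap<&'a str, Vec<&'a str>> = HashMap::new();
    let mut all_fns: HashSet<&'a str> = unsafe_fns.iter().copied().collect();
    for (&caller, callees) in calls {
        all_fns.insert(caller);
        for &callee in callees {
            all_fns.insert(callee);
            callers.entry(callee).or_default().push(caller);
        }
    }

    // Mark everything that can reach an unsafe function, walking backwards.
    let mut tainted: HashSet<&'a str> = unsafe_fns.iter().copied().collect();
    let mut queue: VecDeque<&'a str> = tainted.iter().copied().collect();
    while let Some(f) = queue.pop_front() {
        for &c in callers.get(&f).into_iter().flatten() {
            if tainted.insert(c) {
                queue.push_back(c);
            }
        }
    }

    // Both sets contain all of `unsafe_fns`, so the subtractions cannot underflow.
    let safe_total = all_fns.len() - unsafe_fns.len();
    let safe_tainted = tainted.len() - unsafe_fns.len();
    (safe_tainted, safe_total)
}
```

Dividing the first component by the second gives the share of apparently safe functions that have an unsafe operation somewhere in their call chain.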
Just for some context, I initially implemented this analysis to find out the percentage of safe functions that may lead to an unsafe operation. The distance is not as relevant, but it's something we get for free.
Continuation of #3546.
From @celinval in #3546:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.