Feature/tree sitter python subprocess syscalls#195
        _ => continue,
    };

    if let Ok(libs) = query_db(&func_name) {
Question for all: Do we want only matches that exist in the databases?
I would tend to think we'd want all the matches, even if they don't exist in the database.
In some cases that might even be more interesting: "What's this call to this random program I haven't heard of, <xyz>?"
The C++ parser (for #include statements, not syscalls) does include results that it doesn't find in the database; they just produce an entry with an empty list for "no matches found", e.g. ("example_included_file.h", []).
Ok, great point. I will make that change!
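For reference, a minimal sketch of what "keep every match, with an empty list when the database has nothing" could look like. The `query_db` body, its return type, and the `collect_matches` helper are assumptions for illustration only; the real signatures are not visible in this diff.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the PR's `query_db`; the real function is not shown here.
fn query_db(func_name: &str) -> Result<Vec<String>, String> {
    match func_name {
        "echo" => Ok(vec!["coreutils".to_string()]),
        other => Err(format!("no entry for {}", other)),
    }
}

// Hypothetical helper: record every detected call. A failed lookup becomes
// an empty list instead of silently dropping the entry.
fn collect_matches(func_names: &[&str]) -> HashMap<String, Vec<String>> {
    let mut results = HashMap::new();
    for name in func_names {
        let libs = query_db(name).unwrap_or_default();
        results.insert(name.to_string(), libs);
    }
    results
}
```

This mirrors the C++ parser behavior described above, where an unknown include still appears in the output with an empty list.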
src/parsing/python_parser.rs
Outdated
fn process_files<T>(&self, file_paths: T) -> HashMap<PythonImport, Vec<Vec<String>>>
fn is_likely_syscall(module: &str, func: &str) -> bool {
    let combined = format!("{}.{}", module, func);
    let predefined = ["os.system", "subprocess.run", "os.run"];
I can't find os.run in https://docs.python.org/3/library/os.html.
Huh, yeah, great point. I can't find it now either. I'll take it out and also edit the test file I created that uses it. Thanks for catching that!
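A sketch of how the check might look with "os.run" removed. Only the first two lines of the function body appear in the diff, so the `contains` lookup at the end is an assumption about how the list is used.

```rust
// Sketch only: the final `contains` line is assumed, not taken from the PR.
fn is_likely_syscall(module: &str, func: &str) -> bool {
    let combined = format!("{}.{}", module, func);
    // "os.run" is dropped because it is not documented in Python's os module.
    let predefined = ["os.system", "subprocess.run"];
    predefined.contains(&combined.as_str())
}
```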
7d78d0b to 3ff6d3b
3ff6d3b to 8cb6a82
os.system("echo from os.system")

# Should match: "subprocess.run"
subprocess.run(["echo", "from subprocess.run"])
issue: It looks like this test case is giving us an extraneous result:
OS(Application("from")):
[]
When the argument to subprocess.run is a list, only the first item is the command to run; every item after it is just an argument to that command. When the argument is a string, I think it behaves the same as os.system (in most cases). The underlying cause seems to be that, for a list argument, we are currently treating each item as if it were a command.
suggestion: Some other functions that can be treated the same way as subprocess.run for spawning a subprocess are subprocess.Popen, subprocess.call, subprocess.check_call, and subprocess.check_output.
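One possible shape for the fix, as a rough sketch. The `CallArg` enum, `command_name`, `SPAWN_FUNCS`, and `is_subprocess_spawn` are hypothetical names, not part of the PR, and taking the first whitespace-separated token of a string argument is a simplification of how a shell would parse it.

```rust
// Hypothetical representation of the first argument found in the Python call.
#[derive(Debug)]
enum CallArg {
    Str(String),
    List(Vec<String>),
}

// For a list, only the first item is the command; the rest are its arguments.
// For a string, approximate os.system-style handling by taking the first token.
fn command_name(arg: &CallArg) -> Option<String> {
    match arg {
        CallArg::List(items) => items.first().cloned(),
        CallArg::Str(s) => s.split_whitespace().next().map(str::to_string),
    }
}

// subprocess functions that spawn a process and accept the same argument forms.
const SPAWN_FUNCS: [&str; 5] = ["run", "Popen", "call", "check_call", "check_output"];

fn is_subprocess_spawn(func: &str) -> bool {
    SPAWN_FUNCS.contains(&func)
}
```

With this approach, the list in the test file above would yield only "echo", so the extraneous "from" entry would no longer be reported.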
Summary
This PR closes issue #125. If merged, this pull request will add support for finding specific subprocesses that are called within Python source code. It also adds a unit test and a test file for the extract_sys_callls function.