Draft
Conversation
Zethson
requested changes
Jan 23, 2026
Member
Zethson
left a comment
There was a problem hiding this comment.
Thank you very much already! Here's some first initial feedback:
- Please ensure that you have a descriptive pull request title and the pull request description only has the necessary detail. This can also be done later but it's important.
- Your implementation should not add any dependencies. Pertpy has to minimize dependencies because it's used by a loooot of people. Every dependency that we add has the potential to break users environment and increases complexity. Therefore, please remove bokeh, pycirclize, and tqdm. The plot should be implemented with seaborn and instead of using tqdm, you can use Rich directly. Not sure what to do about pycirclize - if necessary we can talk about it. That's one of the differences of implementing something for yourself vs implementing it for many.
- Please ensure that all functions that have a bit more than trivial complexity have a proper docstring. Please adhere to our docstring style.
- I didn't review everything yet because I think there's still a lot to do. Let's address these first.
| return adata | ||
|
|
||
|
|
||
| def human_cytokine_dict(exclude_well_biased_genes=True) -> pd.DataFrame: |
Member
There was a problem hiding this comment.
Please type this.
Suggested change
| def human_cytokine_dict(exclude_well_biased_genes=True) -> pd.DataFrame: | |
| def human_cytokine_dict(exclude_well_biased_genes: bool = True) -> pd.DataFrame: |
|
|
||
|
|
||
| def human_cytokine_dict(exclude_well_biased_genes=True) -> pd.DataFrame: | ||
| r"""Human Cytokine Dictionary curated from PBMC allows you to infer differential cytokine activity. |
Member
There was a problem hiding this comment.
Suggested change
| r"""Human Cytokine Dictionary curated from PBMC allows you to infer differential cytokine activity. | |
| """Human Cytokine Dictionary curated from PBMC allows you to infer differential cytokine activity. |
this doesn't need to be a raw string, right?
| def human_cytokine_dict(exclude_well_biased_genes=True) -> pd.DataFrame: | ||
| r"""Human Cytokine Dictionary curated from PBMC allows you to infer differential cytokine activity. | ||
|
|
||
| The Human Cytokine Dictionary was created from single-cell RNA-seq of 9,697,974 human peripheral blood mononuclear cells (PBMC) from 12 donors stimulated in vitro with 87 different cytokines. The object is a dataframe representing cytokine activity as differentially expressed genes after cytokine perturbation. |
Member
There was a problem hiding this comment.
Please always write sentences in their own line.
|
|
||
| Returns: | ||
| Pandas DataFrame | ||
|
|
| The Human Cytokine Dictionary was created from single-cell RNA-seq of 9,697,974 human peripheral blood mononuclear cells (PBMC) from 12 donors stimulated in vitro with 87 different cytokines. The object is a dataframe representing cytokine activity as differentially expressed genes after cytokine perturbation. | ||
|
|
||
| References: | ||
| Oesinghaus, Lukas and Becker, S{\"o}ren and Vornholz, Larsen |
| 2. Creates ranking of query data genes contrasting condition1 vs condition2. A continuum from genes most associated with condition1 (top) to genes most associated with condition2 (bottom) | ||
| 3. Computes enrichment of each cytokine by matching their associated gene set in the ranked list. | ||
|
|
||
| Parameters |
| verbose: bool = False, | ||
| threads: int = 6, | ||
| ) -> pd.DataFrame: | ||
| """Function wrapper: Computes cytokine enrichment activity in one celltype using GSEA scoring. Loops through several threshold value to obtain more robust gene sets. |
Member
There was a problem hiding this comment.
Suggested change
| """Function wrapper: Computes cytokine enrichment activity in one celltype using GSEA scoring. Loops through several threshold value to obtain more robust gene sets. | |
| """Computes cytokine enrichment activity in one celltype using GSEA scoring. | |
| Loops through several threshold value to obtain more robust gene sets. |
It's a rule that the first line is always just one sentence. No exceptions.
| weight: float = 1.0, | ||
| seed: int = 2025, | ||
| verbose: bool = False, | ||
| threads: int = 6, |
| ranked_stats, _num_cells = hucira._compute_ranking_statistic(dummy_adata, contrast_column, contrasts_combo) | ||
| assert isinstance(ranked_stats, pd.DataFrame) | ||
|
|
||
| # with pytest.raises(KeyError): |
Member
There was a problem hiding this comment.
This shouldn't be commented out, right?
| ("B cell", "B_cell"), | ||
| ("CD8a", "CD8_T_cell"), | ||
| ("Mono", "CD14_Mono"), | ||
| ] # can't be a list for "run_one_enrichment_test()" |
Member
There was a problem hiding this comment.
Not sure what's meant with that.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
PR Checklist
docsis updatedDescription of changes
adding new tool.
Technical details
Technically, the most interesting part is the data itself. Most of the methods that come with this tool are wrappers that aid with the enrichment analysis (e.g., help create a ranked list from query data, or help navigate the different levels of the data).
The plotting tools are seaborn heatmap wrappers that are tailored to the output format.
Only the cytokine communication methods are specific to this tool. (get_senders and receivers()). This plotting function is specific to this tool.
Additional context
new feature "hucira". Lukas wants to check if methods are too specific.
Some of the code might actually be redundant.. I had implemented the wrappers to make simple visualizations a click away for users but I can rethink about making the tool more lightweight.
The only dependencies I added (bokeh and pyCirclize) are for visualization tools
The very generic tests that I wrote are not passing. I don't know why, it looks like they're timing out.