Final Report
Adding embarrassingly parallel algorithms to nx-parallel
Mentors: Dan Schult, Aditi Juneja
Duration: May 2025 - August 2025
Abstract
This report summarizes my work during Google Summer of Code 2025 on the nx-parallel project, a parallel backend for NetworkX that leverages joblib to run parallel graph algorithms. My primary contribution was implementing several embarrassingly parallel algorithms, analyzing their performance speedups, and optimizing existing implementations. Additionally, I switched the default configuration system to NetworkX's and enabled all available cores by default (`n_jobs=-1`). I also developed a more reliable timing script to reduce measurement variability, introduced a `should_run` mechanism to determine when the parallel backend should be applied, and enhanced the ASV benchmarks by adding a `setup` function.
For a more detailed discussion of the work done, please refer to my blogs.
Background
nx-parallel is a backend for NetworkX that speeds up graph algorithms by running them in parallel with joblib. Before the coding phase, I spent time strengthening the project's foundation. I fixed utility functions like `chunks`, which had been giving wrong results because of parameter-handling issues, and refactored `get_all_functions` to make it more reliable. I also suggested changing the default configuration so that NetworkX uses all available cores (`n_jobs=-1`) instead of sticking with joblib's default (`None`). These early fixes and adjustments set the stage for the larger developments that came during the coding phase.
Contributions
Adding embarrassingly parallel algorithms
The primary objective of the project was to implement seven embarrassingly parallel algorithms. By the end of the coding phase, around nineteen such graph algorithms were added to nx-parallel:
File | Functions
---|---
`cluster.py` | `triangles`, `clustering`, `average_clustering`
`link_prediction.py` | `apply_prediction`¹, `jaccard_coefficient`, `resource_allocation_index`, `adamic_adar_index`, `preferential_attachment`, `common_neighbor_centrality`, `cn_soundarajan_hopcroft`, `ra_index_soundarajan_hopcroft`, `within_inter_cluster`
`harmonic.py` | `harmonic_centrality`
`neighbor_degree.py` | `average_neighbor_degree`
`attracting.py` | `number_attracting_components`
`connected.py` | `number_connected_components`
`strongly_connected.py` | `number_strongly_connected_components`
`weakly_connected.py` | `number_weakly_connected_components`
`dag.py` | `colliders`, `v_structures`
¹ A private parallel helper function that applies each internal prediction function used by the link prediction algorithms across multiple cores.
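Once nx-parallel is installed, these implementations are reached through NetworkX's backend dispatch mechanism. Below is a minimal usage sketch (not taken from the report; the graph and algorithm are chosen only for illustration) showing the two common ways to invoke them.

```python
# Minimal usage sketch: dispatching a NetworkX call to the nx-parallel backend.
import networkx as nx
import nx_parallel as nxp

G = nx.fast_gnp_random_graph(1000, 0.01, seed=42)

# Option 1: request the parallel backend explicitly for a single call.
tri = nx.triangles(G, backend="parallel")

# Option 2: wrap the graph so that subsequent calls dispatch automatically.
H = nxp.ParallelGraph(G)
clus = nx.clustering(H)
```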
Working with these algorithms gave me the chance to explore a wide range of challenges in parallel graph computation. Along the way, I gained insight into why some algorithms achieve significant speedups while others do not. For example, certain algorithms scale well on sparse graphs but lose their performance advantage on denser graphs due to the overhead of serializing graph data and distributing work across multiple processes. In other cases, the speedup diminishes as the number of nodes increases, because the overhead of chunking tasks, pickling/unpickling data, and scheduling workers grows with the problem size.
I also encountered implementation-specific constraints. Returning generators from within per-chunk functions, for instance, caused issues since generators cannot be pickled for communication between worker processes. Similarly, tuning parameters like `max_chunk_size` proved crucial: smaller chunks benefited algorithms needing fine-grained load balancing, while larger chunks improved efficiency by reducing scheduling overhead.
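To make the trade-off concrete, here is a simplified sketch (with hypothetical helper names, not the exact nx-parallel internals) of the per-chunk pattern these algorithms follow: the nodes are split into chunks, each chunk is processed in a separate joblib worker, and the per-chunk dictionaries are merged at the end.

```python
# Simplified sketch of the per-chunk pattern (helper names are hypothetical).
import itertools
import networkx as nx
from joblib import Parallel, delayed

def chunks(iterable, n_chunks):
    """Split an iterable into n_chunks roughly equal lists."""
    items = list(iterable)
    size, rem = divmod(len(items), n_chunks)
    it = iter(items)
    return [list(itertools.islice(it, size + (i < rem))) for i in range(n_chunks)]

def triangles_chunk(G, nodes):
    # Return a dict rather than a generator: generators cannot be pickled
    # and sent back from the worker processes.
    return {n: nx.triangles(G, n) for n in nodes}

G = nx.fast_gnp_random_graph(500, 0.05, seed=1)
n_jobs = 4
per_chunk = Parallel(n_jobs=n_jobs)(
    delayed(triangles_chunk)(G, c) for c in chunks(G.nodes, n_jobs)
)
triangles = {n: t for d in per_chunk for n, t in d.items()}
```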
These explorations not only deepened my understanding of parallel execution trade-offs in graph algorithms but also helped shape a set of practical guidelines for when and how parallelization in nx-parallel provides the most benefit.
Improving the timing script (ref. PR#114)
The earlier timing script produced inconsistent results, sometimes showing decreasing speedups with more nodes or cores. I replaced it with a timeit-based script that repeats measurements and records the minimum runtime, yielding more stable and accurate results. The impact of this change can be seen below:
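The sketch below illustrates the idea behind the new approach (it is not the exact script from PR#114): each measurement is repeated several times with timeit and only the minimum is kept, which filters out noise from background load.

```python
# Sketch of min-of-repeats timing; parameters and the benchmarked algorithm
# are illustrative.
import timeit
import networkx as nx
import nx_parallel as nxp

G = nx.fast_gnp_random_graph(2000, 0.01, seed=42)
H = nxp.ParallelGraph(G)  # calls on H dispatch to the parallel backend

seq_time = min(timeit.repeat(lambda: nx.triangles(G), repeat=5, number=1))
par_time = min(timeit.repeat(lambda: nx.triangles(H), repeat=5, number=1))
print(f"speedup: {seq_time / par_time:.2f}x")
```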
Old heatmap:
New heatmap:
Adding a `should_run` parameter (ref. PR#123)
A `should_run` parameter has been added to control when the parallel backend is used. The default `should_run` policy ensures that an algorithm is executed in parallel only if it would actually run across multiple cores (i.e., when the number of jobs used is greater than one). Alternative `should_run` policies handle cases where parallelism is ineffective: for example, when the speedup is consistently low, the graph is very small, or it is too dense. The appropriate policy is passed via the `_configure_if_nx_active` decorator, making the backend adaptable and preventing wasted resources in scenarios where parallelism provides little or no benefit.
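As an illustration, a `should_run` policy is just a small predicate consulted before dispatch; the sketch below shows what the default and a stricter policy might look like (function names, signatures, and thresholds are hypothetical, not the exact code from PR#123).

```python
# Hypothetical should_run policies; signatures and thresholds are illustrative.
import os
import networkx as nx

def default_should_run(n_jobs):
    """Parallelize only when more than one worker would actually be used."""
    if n_jobs is None:
        return False  # joblib treats None as a single job
    if n_jobs < 0:
        n_jobs = (os.cpu_count() or 1) + 1 + n_jobs  # joblib convention: -1 -> all cores
    return n_jobs > 1

def small_or_dense_should_run(G, n_jobs):
    """Skip parallelism when the graph is tiny or very dense (illustrative cutoffs)."""
    return default_should_run(n_jobs) and len(G) > 300 and nx.density(G) < 0.5
```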
Switching the default config to NetworkX (ref. PR#122)
Previously, nx-parallel used joblib's default (`n_jobs=None`) and ran on a single core unless explicitly changed. I switched the defaults to NetworkX's config system and set `n_jobs=-1`, so the backend now uses all available cores automatically. This change also required a thorough update of the documentation to ensure it accurately reflects the new configuration.
Implementing mem-mapping
To reduce the memory overhead of passing large graphs between processes, I introduced memory mapping for algorithms where the adjacency matrix is first converted to a NumPy array and stored on disk using joblib. This allows all processes to share the same underlying file segment instead of creating separate in-memory copies, significantly improving efficiency for large graphs. I applied this approach to `is_reachable` (ref. PR#119), but it can be extended to other algorithms as needed.
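The sketch below illustrates the general memmapping pattern with joblib (it is not the actual `is_reachable` code, and the per-chunk computation is just a stand-in): the adjacency matrix is dumped to disk once and every worker opens it read-only.

```python
# Illustrative memmapping sketch; the per-chunk work is a placeholder.
import os
import tempfile
import networkx as nx
import numpy as np
from joblib import Parallel, delayed, dump, load

G = nx.gnp_random_graph(2000, 0.01, directed=True, seed=1)
A = nx.to_numpy_array(G)  # dense adjacency matrix

# Dump the array once and reopen it as a read-only memory map.
path = os.path.join(tempfile.mkdtemp(), "adj.mmap")
dump(A, path)
A_mm = load(path, mmap_mode="r")  # workers share this file-backed array

def row_sums(adj, rows):
    # Each worker reads only its slice of the shared memory-mapped array.
    return adj[rows].sum(axis=1)

row_chunks = np.array_split(np.arange(A.shape[0]), 4)
results = Parallel(n_jobs=4)(delayed(row_sums)(A_mm, r) for r in row_chunks)
```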
Adding setup functions to benchmarks (ref. PR#126)
Introduced a `setup` function in each benchmark class to handle graph creation and other external computations before timing the algorithm.
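A hedged sketch of what such a benchmark class looks like is shown below (the class name, parameters, and sizes are illustrative, not necessarily those in PR#126); ASV calls `setup()` before each timed run, so graph construction is excluded from the measurement.

```python
# Illustrative ASV benchmark with a setup() hook; names and sizes are made up.
import networkx as nx
import nx_parallel as nxp

class TrianglesBenchmark:
    params = [[200, 800], [0.02, 0.1]]
    param_names = ["num_nodes", "edge_prob"]

    def setup(self, num_nodes, edge_prob):
        # Graph creation happens here, outside the timed region.
        G = nx.fast_gnp_random_graph(num_nodes, edge_prob, seed=42)
        self.H = nxp.ParallelGraph(G)

    def time_triangles(self, num_nodes, edge_prob):
        nx.triangles(self.H)
```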
Work links
GSoC blogs: https://github.com/akshitasure12/networkx-blogs/tree/main/blogs
PRs opened
The table below lists all the PRs I opened during the GSoC coding phase in both the nx-parallel and NetworkX repositories.
PR | Link
---|---
parallel implementation of `triangles` | PR#106
adding a custom marker to avoid `pytest.mark.order` warning | PR#107
refactor `chunks()` to correctly use `n_jobs` | PR#112
improve the timing script | PR#114
parallel implementation of `number_` algorithms | PR#117
modify `is_reachable()` to use mem-mapping approach | PR#119
optimise `is_reachable()` in NetworkX | PR#8112
remove test order dependency via context managers | PR#120
make `n_jobs=-1` the default | PR#122
add `should_run` functionality | PR#123
parallel implementation of `harmonic_centrality` | PR#124
optimise harmonic centrality | PR#8158
refactor ASV benchmarks with setup functions | PR#126
parallel implementation of `link_prediction` algorithms | PR#127
refactor `test_get_functions_with_get_chunks` | PR#128
update `test_get_chunks` for new algorithms | PR#129
parallel implementation of `clustering` and `average_clustering` | PR#130
uses `pytest.raises` as context | PR#8170
parallel implementation of `average_neighbor_degree` | PR#132
move `assign_algorithms` outside `BackendInterface` class | PR#133
parallel implementation of `v_structures` and `colliders` | PR#134
simplify node selection using `nbunch_iter()` | PR#135
post merge refinement | PR#138
add `should_run` for when `nodes=None` | PR#141
clean up | PR#142
Issues Raised
The table below summarizes the key issues identified and addressed during the GSoC coding phase in both the nx-parallel and NetworkX repositories.
Issue | Link
---|---
`get_all_functions()` is not returning `args` and `kwargs` of the functions | PR#94
Incorrect passing of `num_in_chunk` as `n_jobs` in `chunks()` | PR#110
set `n_jobs=-1` as default | PR#111
set `should_run=False` unless `nodes is None` | PR#110
Key Learnings and Challenges
Reflecting on my GSoC experience, I’m amazed at how much I’ve grown over the course of the program. One of the biggest lessons I’ve learned is the true value of documentation—not just as a formality, but as a way to communicate ideas clearly and leave a record of insights for others (and even for myself in the future). I’ve developed the habit of recording my findings as I go, and over time, I’ve become more mindful about keeping things concise, relevant, and easy to follow.
This journey has also made me naturally more curious. I now find myself constantly asking why something works the way it does and digging deeper to understand it. Alongside that curiosity, I’ve learned the importance of asking questions and seeking clarification from mentors without hesitation. Collaborating in an open-source community has shown me just how valuable discussions are—they not only improve the quality of code but also broaden your perspective on different ways to tackle a problem.
One of the biggest challenges I faced was balancing focus on the defined tasks with the temptation to dive too deeply into side quests. It’s easy to get carried away chasing rabbit holes, but I had to remind myself to stay grounded in the project’s goals while still leaving room for curiosity and experimentation. Finding that balance wasn’t always easy, but it taught me a lot about prioritization and staying organized.
Conclusion
The GSoC 2025 experience has truly been one of a kind. It helped me grow into a more well-rounded developer, going far beyond just “writing code.” Compared to my original proposal, by the end of the summer I had contributed to many more aspects of the project than initially planned. This growth came from becoming increasingly comfortable with the codebase, which gave me the confidence to explore, experiment, and take initiative in new directions. Looking ahead, I plan to remain an active member of the NetworkX and nx-parallel communities, improving features and contributing new ideas to advance the project. More importantly, I look forward to continuing the habit of learning and giving back to the open-source community.
Acknowledgements
I would like to sincerely thank my mentors, Dan Schult and Aditi Juneja, for their invaluable guidance and support throughout my GSoC journey. Their thoughtful reviews not only helped me refine my work but also encouraged me to pay close attention to even the smallest details. The mid-term review, in particular, was especially insightful, helping me recognize my strengths while highlighting areas for growth and motivating me to continually improve.
I am deeply grateful for their time, patience, and encouragement. I would also like to thank Google Summer of Code for providing me the opportunity to contribute to such an inspiring open-source community and to learn in a collaborative and supportive environment.
© 2025 Akshita Sure