Good First Issues: beginner beware
If you spend a lot of time on GitHub, chances are you've come across issues tagged with a 'good-first-issue' label. You’ll find this in about 1 in 25 repos (with 10+ stars). It’s natural to assume such issues are for junior developers looking to dip their toes into the ocean of open-source software (OSS). Numerous websites curate such "good first issues" GFIs for the purpose of unearthing suitable places for contributing. As of now, notable platforms include Quira, goodfirstissue.dev, goodfirstissues.com, up-for-grabs, and GitHub’s For Good First Issue— in the context of digital public goods. Despite the availability of these resources, many developers become frustrated in their attempts to find somewhere to contribute. The problem is that these labels are context-specific, and despite how they may be interpreted, do not indicate any difficulty level. This ambiguity leads to unnecessary confusion, particularly for developers trying to contribute to open-source software (OSS) for the first time. It is therefore unfortunate that these labels tend to occur more frequently within complex projects as opposed to simple ones, as discussed in this Reddit post. Our own users at Quira echo a similar sentiment: There are many good first issues, but their repos are not beginner friendly. I tried to find beginner-friendly repos for Golang. It showed up repos that had more ‘good first issues’ tags, but the repos were not beginner friendly at all. So what’s going on? It turns out that the there’s a collective misunderstanding around the meaning of good-first-issue (GFI). Appropriately used, this label is a resource maintainers use to signal a smooth on-ramp to a repository. This could involve tackling a self-contained problem or diving into an issue that offers substantial exposure to the codebase without needing architectural decisions. However, crucially, the contributor is expected to possess the background necessary to engage effectively with the project. Hence the label signifies that a given issue is good way of getting used to the project structure, conventions and tooling. This is an increasingly important enterprise for larger projects, resulting in a correlation between the difficulty of projects and the number of GFIs. Determining ease of contribution If the GFI label does not indicate easy issues, what alternative can developers use? As of now, there's a notable lack of forums or mechanisms dedicated to help with this. While projects offering a safe space for users to experiment with Pull Requests (PRs) on GitHub exist, this isn’t the same as making a meaningful open-source contribution. For instance, identifying where an engineer can contribute to frontend tools or frameworks constitutes a considerable challenge. This gap in the OSS ecosystem denies many developers the opportunity to contribute, learn new tools, gain experience, and give something back to the community. It's a challenge that has been on our radar at Quira for some time. In response to this challenge, we've begun the work of creating metrics that facilitate user contributions. Among other signals, we calculate issue responsiveness and PR merge times, which provide an estimate of how quickly maintainers respond to enquiries or contributions. However, these metrics don't provide an indication of the ease of contribution. To address this, we turn to the history of a project: how many junior developers have successfully contributed? The seniority level of contributors isn't directly available, nor can it be easily inferred from open source data, but we can proxy it by considering the tenure of the developer, estimated by the age of their GitHub account. We then propose a straightforward metric: the percentage of First Year Contributors (FYC): those who have contributed to a given repo in their first year on GitHub. This new metric, FYC, tells a very different story from the number of good-first-issues. In fact it is anti-correlated with the number of such issues in a repo, as illustrated in the chart below. Our qualitative experience to date suggests that FYC distinguishes effectively between repositories that are welcoming or intimidating to new OSS contributors (especially in repositories with 20 or more contributors). With a reasonable degree of approximation, one can interpret the score as follows: a repository with a score of < 10% FYC should be approached cautiously unless you possess substantial experience in that domain, while a score of > 20% generally signals a very welcoming or straightforward project for the average developer. Proportion of repos with good first issues, bucketed by 5% intervals of FYC. Data are restricted to repos with at least 20 contributors. Good First Issues vs FYC for ease of contribution To quickly assess the ease of contribution, and how these two metrics perform, let's compare the top 10 projects based on GFI labels with the newly introduced FYC metric. We use the term “Open GFIs” to denote the count of issues with the GFI label that were open at the time of analysis, and “FYC%” to denote the percentage of contributors that were in their first year at the time of their first contribution, both calculated in January 2024. In order to focus on well-known projects, we’ve restricted our attention to a subset of Frontend repos. For a greater range of comparisons, visit quira.sh/contribute. In the first table, you'll find the top 10 projects ordered by the number of GFIs. Many of these repos are popular and fundamental Frontend projects (e.g. nuxt, next.js, or parcel). These are complex repos and inappropriate for first-time contributors in OSS, as reflected by their low FYC score. However, within this top 10, some projects present better opportunities for ease of contribution, notably meteor and mui-x, as indicated by their FYC scores. Now, the list of projects in the second table ordered by FYC tells a different story. Most repositories in this lineup have no open GFIs. Qualitatively we believe these to be much better choices if one is restricted to repos with 1000+ stars. We nevertheless find one or two limitations — which are present in the first 2 repos. Firstly openui5 has a very high FYC score, but only SAP employees appear to be able to contribute to the main branch, and the high score may reflect SAP-specific GitHub account creation. Secondly, redux-framework is also not a good choice since at the time of writing there are no open issues. Nevertheless, these are simple to filter out, and this is done by default in our own Contribute interface. It's important to note that a high FYC% score doesn't necessarily mean a project is simple or "easy" (e.g., meteor is not a straightforward project). Rather, it signifies that contribution is feasible for users with the necessary background, without needing significant OSS experience. This may be attributed to excellent documentation, community management, streamlined environment setup, and self-contained issues which can be solved by less experienced developers. Conclusion In adopting the FYC metric, we shift our focus from individual issues to a broader repository-level metric. Beginning at the issue level can often be less effective, as issues assume knowledge and experience of the wider project, and the contributor's intentions. On the other hand, once developers have selected an appropriate repository, labels like good-first-issue can be invaluable. Our aim is to help users discover the right project to begin contributing to. When the project fit is right, contributing becomes easier — in this case developers will have the required context, and/or the maintainers can provide the required guidance. And this lowers the barrier to subsequent contributions as the developer gets to know the project. This stance challenges the conventional wisdom we've encountered from junior developers, who have heard that expanding their portfolio across various OSS projects is crucial for personal growth. Frequently jumping between repositories to solve issues may not be beneficial to anyone. This was aptly summarised by the user gonzodamus in the Reddit post mentioned above: What no one does tell you is that you should be picking a single project that you're interested in, learning the ins and outs of the code, and then making useful PRs when you have something to contribute. Discovering the right project to contribute to in OSS is no easy feat. It requires a delicate balance between the ease of contribution, the complexity of the project, familiar tooling, the project's popularity, recency of other contributions, the activity of maintainers, the number of open issues, and other bespoke considerations. Most importantly, it's about finding a project that aligns with your skills, experience, and interests. There’s certainly no one-size-fits-all solution, and at Quira we seek to provide an interface that allows each developer to find ways to contribute based on their own requirements. Rather than relying on a recommender system to guess your preferences, we enable you to take the wheel—browsing based on complexity, language, and topic, filtered to active repositories. Explore all of this, including FYC, on quira.sh/contribute. Take a look and share your thoughts with us!
By Alex Bird
•
Feb 20, 2024
•
10 min read