Quira Blog

Developer reputation in the era of LLMs

Part I: The era of infinite code content 📺

Nearly a year has passed since the introduction of GPT-4. One of the public discussions we followed at the time concerned the impact LLMs would have on software development and, in particular, on jobs in this field. ChatGPT felt like magic and it was easy to extrapolate the impact of these technologies, even to the extent of rendering the software engineering profession redundant. One year later, it’s clear that LLMs are not replacing developers, but helping them focus less on the grunt work and more on the big picture.

This transition has already had a measurable impact on unit economics. For example, last year, Microsoft reported that developers using GitHub Copilot are 55% more productive than those not using it. Interestingly, they also noted that 40% of the code checked in by these developers was AI-generated and remained unmodified.

Statistics like these suggest a new inflection point in the software economy and an impending shift in our path towards the future of work. Specifically, they prompt us to ask: in an era where code is automatically generated, how do we measure the quality of software talent?

Computational Talent Validation

Talent validation inherently depends on social consensus. For developers, this consensus can be reached through various means like earning a diploma, interviewing for a job, or achieving a high score in LeetCode quizzes. However, with the accelerated production of software talent that LLMs have catalysed, it’s uncertain whether these traditional consensus-building methods can truly scale or remain relevant.

At Quira, we believe that in the era of Generative AI, scalable consensus can only be achieved through experience quantification. Under this approach, a developer’s skills and abilities are implicitly inferred through data and computation.
Specifically, this means semantically understanding the code and contribution metadata that developers generate when creating software.

The evolving dynamics of open source are accelerating this potential and accentuating the need for this technology. Indeed, we are observing a shift in attitude towards open source, where a new generation of eager and ambitious developers is embracing it not only to uphold its ethos of community-driven development, but also as a means to validate their work socially.

However, even in this new dynamic, the same problems persist. When code can be automatically generated from natural language, analysis of code content alone is insufficient to achieve robust consensus. To automatically quantify experience, one must incorporate additional context regarding the relevance and merit of the software created. In other words, we need to understand the social proof of code and how it has been “validated” by other members in the network.

Measuring the social proof of OSS contributions

Today we’re excited to release a new version of DevRank, a metric that measures the social proof of open source contributions by quantifying reputation flows in the GitHub network.

In a nutshell, DevRank models the ecosystem as a network of contributors and repositories linked together by directed edges, each representing a reputation-inducing event on GitHub. The current version considers three types of edges:

Developer → Repo, to represent events where a developer stars a repo.
Repo → Developer, for events where a developer commits to a repository.
Repo → Repo, to incorporate events where a repository imports a package or a dependency.

Considering all such edges within GitHub, we can construct a large directed network where reputation flows from one node into another through stargazer, commit, and import events.
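To make the network above concrete, here is a minimal sketch of the three edge types together with a plain PageRank-style random walk over them. It is an illustration only: the function names, the toy data, the damping factor of 0.85, and the direction chosen for Repo → Repo edges (from the importing repo to the imported one) are all assumptions, and the customised algorithm behind DevRank is not reproduced here.

```python
from collections import defaultdict


def build_graph(stars, commits, imports):
    """Build a directed graph as {node: [successor, ...]}.

    Nodes are ("dev", name) or ("repo", name) tuples (hypothetical encoding).
    """
    graph = defaultdict(list)
    for dev, repo in stars:  # Developer -> Repo: a developer stars a repo
        graph[("dev", dev)].append(("repo", repo))
    for dev, repo in commits:  # Repo -> Developer: a developer commits to a repo
        graph[("repo", repo)].append(("dev", dev))
    for importer, imported in imports:  # Repo -> Repo (direction is an assumption)
        graph[("repo", importer)].append(("repo", imported))
    return graph


def pagerank(graph, damping=0.85, iterations=100, tol=1e-12):
    """Standard PageRank by power iteration; returns {node: probability}."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Nodes without outgoing edges spread their mass uniformly.
        dangling = sum(rank[v] for v in nodes if not graph.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for source, targets in graph.items():
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Toy data (invented for illustration).
stars = [("alice", "lib"), ("carol", "lib"), ("alice", "app")]
commits = [("bob", "lib"), ("dave", "app")]
imports = [("app", "lib")]

scores = pagerank(build_graph(stars, commits, imports))
developers = sorted(
    (node for node in scores if node[0] == "dev"),
    key=lambda node: scores[node],
    reverse=True,
)
for kind, name in developers:
    print(name, round(scores[(kind, name)], 4))
```

In this toy network, bob commits to a repo that is both starred and imported, so he ends up with a higher score than dave, whose repo receives only a share of one star.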
By applying a customised version of the PageRank algorithm on this resulting graph, we can compute the stationary state probabilities of a random walker in the network. These raw probabilities indicate the importance of a developer within the open source ecosystem in the same way that PageRank calculates the importance of a webpage on the internet. The higher the DevRank of a developer, the more their contributions have been socially validated by other developers through star, commit, and import events on GitHub. If you’re interested in the technical details of DevRank, you can read more about it in our documentation.

Rethinking open source status

One of the coolest applications of DevRank is that it gives us a new framework to think about status & reputation in open source, not only for individual developers, but also for communities. Developer reputation is already a valuable intangible that can have a positive financial impact on the lives of developers, say, by giving them access to work opportunities or, in some cases, making them eligible for grants or venture capital funding.

For these reasons, quantifying reputation in a principled and transparent way is critical to foster a meritocratic ecosystem. Think, for example, of stargazer traction, which is widely used as a measure of a project’s success in the ecosystem but which, according to Wired, is now being artificially inflated and is no longer a reliable indicator. Let’s look at how DevRank can change our perspective of success and status in open source.

Stargazers

Let’s start with stargazers, which are by far the most popular way to assess traction in open source. To do this, let’s first ask ourselves: how would the list of most popular repos on GitHub change if, instead of ranking them by number of stars, we ranked them by the DevRank of their stargazers? It turns out that this adjustment would significantly alter the rankings.
Certain repositories, such as FreeCodeCamp/freecodecamp, public-apis/public-apis, or jwasham/coding-interview-university would not only lose their place in the top 5 of the list… they wouldn’t even feature in the top 30! On the other hand, some repositories like facebook/react, twbs/bootstrap, or electron/electron would rise from 10th, 20th, and 30th to 1st, 3rd, and 4th place in the list. In other words, the list of top 10 repos would see a demotion of educational repos in favour of some popular development frameworks.

But we don’t need to stop there… We can go one level further and look at the repos that have been starred the most by developers in the top and bottom 10% (see table below). The results are qualitatively more pronounced and show that the top 10% tends to star repos focused on developer tooling, while the bottom 10% tends to star beginner resources and educational projects.

Contributor Growth

Another popular way to assess traction in open-source communities is contributor growth. Contributor growth is already a purer metric than GitHub stars because an increase in the contributor count requires the submission and acceptance of a PR. However, similarly to stargazers, this metric is not resistant to noise. For example, some “first contribution repos” like firstcontributions/first-contributions trivialise merging requirements for PRs, making contribution events almost as frictionless as stargazing.

DevRank can help us filter out the noise and add context to these metrics. To illustrate this, consider the top repositories in the ecosystem ranked by the number of new contributors since January 2023. We can see that in the top 20 of the list, there are projects like firstcontributions/first-contributions, mouredev/retos-programacion-2023, and Syknapse/Contribute-To-This-Project, which have very low merging requirements.
We can derive an alternative ranking if we restrict the counts to contributions by developers in the Top 1% of developers based on DevRank (table below). When doing this, we immediately see that 30% of places in the top 20 are replaced by top AI projects and developer tools like ggerganov/llama.cpp, run-llama/llama_index, and langchain-ai/langchainjs.

Grudge matches

DevRank can also aid in determining a winner in classic rivalries between developer communities. For example, a frequently debated topic among AI researchers is whether PyTorch is better than TensorFlow. There are many non-violent ways to settle this matter, with our preferred option being to appeal to open-source data.

Unfortunately, in this particular case, vanilla statistics provide limited insight and fail to give a definite conclusion: TensorFlow leads in the stargazer count with 180.8k stars, but PyTorch is imported by 265.9k repositories. To gain true insight into this question we need to increase the resolution of the data. For example, we can use DevRank to understand which repo is preferred by developers with higher reputation. To do this, we can compute the median DevRank of the contributors of each dependent repository.

The results are fascinating. The median DevRank of PyTorch users turns out to be 140.3, whereas for TensorFlow users, it stands at 56.9. Moreover, the total DevRank sum of repos that use PyTorch is 2.65 bn, while the DevRank sum of repos that import TensorFlow is 1.75 bn. This experiment shows that even though PyTorch has significantly fewer stargazers than TensorFlow, it is generally used and preferred by repositories and developers with higher DevRank. Go PyTorch! 🔥

Outro

The future of developer work and education is evolving quickly, and LLMs are playing a significant role in accelerating this transformation. As the software engineering profession becomes democratised, the need to assess the quality of software talent in a scalable and systematic way becomes paramount.
At Quira, we are leveraging DevRank along with other signals to algorithmically gauge the expertise of our users. This allows us to rank developers in our community across languages, frameworks, topics, and industries to ultimately match them to paid contribution opportunities in open source. Our technology is already helping a number of Commercial Open Source Organisations (COSS) receive PRs from top contributors outside of their orbits. If you’d like to use Quira to build a dynamic and sustainable community of high-quality contributors, drop us a line, we’d love to chat. Alternatively, if you’re a developer looking to build or monetise your reputation in open-source, then log in to Quira and start your journey with us.
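For readers who want to try the stargazer experiment from earlier in this post on their own data, here is a minimal sketch of the two rankings being compared. The repo names, developer names, and scores are invented, and summing the DevRank of a repo's stargazers is an assumption, since the post does not say how the scores are aggregated.

```python
def rank_by_star_count(stargazers):
    """Order repos by the raw number of stargazers, highest first."""
    return sorted(stargazers, key=lambda repo: len(stargazers[repo]), reverse=True)


def rank_by_stargazer_devrank(stargazers, devrank):
    """Order repos by the summed DevRank of their stargazers (assumed aggregation)."""
    def weight(repo):
        # Developers without a known score contribute nothing.
        return sum(devrank.get(dev, 0.0) for dev in stargazers[repo])

    return sorted(stargazers, key=weight, reverse=True)


# Invented example: many low-reputation stars vs. a few high-reputation ones.
stargazers = {
    "tutorial-repo": ["a", "b", "c", "d"],
    "framework-repo": ["x", "y"],
}
devrank = {"a": 1.0, "b": 2.0, "c": 1.5, "d": 0.5, "x": 40.0, "y": 25.0}

by_count = rank_by_star_count(stargazers)
by_devrank = rank_by_stargazer_devrank(stargazers, devrank)
print(by_count)
print(by_devrank)
```

With these numbers the order flips between the two rankings, which is the same kind of effect the post describes for educational repos and development frameworks.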

Articles

Good First Issues: beginner beware

If you spend a lot of time on GitHub, chances are you've come across issues tagged with a 'good-first-issue' label. You’ll find this label in about 1 in 25 repos (with 10+ stars). It’s natural to assume such issues are for junior developers looking to dip their toes into the ocean of open-source software (OSS). Numerous websites curate such "good first issues" (GFIs) to help developers find suitable places to contribute. As of now, notable platforms include Quira, goodfirstissue.dev, goodfirstissues.com, up-for-grabs, and GitHub’s For Good First Issue (in the context of digital public goods).

Despite the availability of these resources, many developers become frustrated in their attempts to find somewhere to contribute. The problem is that these labels are context-specific and, despite how they may be interpreted, do not indicate any difficulty level. This ambiguity leads to unnecessary confusion, particularly for developers trying to contribute to OSS for the first time. It is therefore unfortunate that these labels tend to occur more frequently within complex projects as opposed to simple ones, as discussed in this Reddit post. Our own users at Quira echo a similar sentiment:

"There are many good first issues, but their repos are not beginner friendly."

"I tried to find beginner-friendly repos for Golang. It showed up repos that had more ‘good first issues’ tags, but the repos were not beginner friendly at all."

So what’s going on? It turns out that there’s a collective misunderstanding around the meaning of good-first-issue (GFI). Appropriately used, this label is a resource maintainers use to signal a smooth on-ramp to a repository. This could involve tackling a self-contained problem or diving into an issue that offers substantial exposure to the codebase without needing architectural decisions. However, crucially, the contributor is expected to possess the background necessary to engage effectively with the project.
Hence the label signifies that a given issue is a good way of getting used to the project structure, conventions and tooling. This is increasingly important for larger projects, resulting in a correlation between the difficulty of projects and the number of GFIs.

Determining ease of contribution

If the GFI label does not indicate easy issues, what alternative can developers use? As of now, there's a notable lack of forums or mechanisms dedicated to helping with this. While projects offering a safe space for users to experiment with Pull Requests (PRs) on GitHub exist, this isn’t the same as making a meaningful open-source contribution. For instance, identifying where an engineer can contribute to frontend tools or frameworks constitutes a considerable challenge. This gap in the OSS ecosystem denies many developers the opportunity to contribute, learn new tools, gain experience, and give something back to the community. It's a challenge that has been on our radar at Quira for some time.

In response to this challenge, we've begun the work of creating metrics that facilitate user contributions. Among other signals, we calculate issue responsiveness and PR merge times, which provide an estimate of how quickly maintainers respond to enquiries or contributions. However, these metrics don't provide an indication of the ease of contribution. To address this, we turn to the history of a project: how many junior developers have successfully contributed? The seniority level of contributors isn't directly available, nor can it be easily inferred from open source data, but we can proxy it by considering the tenure of the developer, estimated by the age of their GitHub account.

We then propose a straightforward metric: the percentage of First Year Contributors (FYC), that is, those who have contributed to a given repo in their first year on GitHub. This new metric, FYC, tells a very different story from the number of good-first-issues.
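As a minimal sketch of how such a percentage could be computed for a single repository, assume we already hold, for each contributor, the date their GitHub account was created and the date of their first contribution to the repo. The function name, the input shape, the 365-day cutoff, and the sample dates are assumptions made for illustration, not a description of Quira's pipeline.

```python
from datetime import date, timedelta

# Assumed definition of "first year on GitHub": within 365 days of account creation.
FIRST_YEAR = timedelta(days=365)


def fyc_percentage(contributors):
    """Return the percentage of contributors who first contributed to the repo
    within a year of creating their GitHub account.

    `contributors` is an iterable of (account_created, first_contribution) dates.
    """
    contributors = list(contributors)
    if not contributors:
        return 0.0
    first_year = sum(
        1 for created, first in contributors if first - created <= FIRST_YEAR
    )
    return 100.0 * first_year / len(contributors)


# Invented sample: one of four contributors was in their first year.
sample_repo = [
    (date(2015, 3, 1), date(2023, 6, 10)),
    (date(2023, 1, 15), date(2023, 5, 2)),
    (date(2019, 7, 4), date(2022, 11, 30)),
    (date(2021, 9, 9), date(2023, 2, 1)),
]

fyc = fyc_percentage(sample_repo)
print(f"FYC: {fyc:.1f}%")
```

Because the score depends only on account age at the time of first contribution, it needs no issue labels at all, which is what lets it be compared against GFI counts.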
In fact, it is anti-correlated with the number of such issues in a repo, as illustrated in the chart below. Our qualitative experience to date suggests that FYC distinguishes effectively between repositories that are welcoming or intimidating to new OSS contributors (especially in repositories with 20 or more contributors). With a reasonable degree of approximation, one can interpret the score as follows: a repository with a score of < 10% FYC should be approached cautiously unless you possess substantial experience in that domain, while a score of > 20% generally signals a very welcoming or straightforward project for the average developer.

Proportion of repos with good first issues, bucketed by 5% intervals of FYC. Data are restricted to repos with at least 20 contributors.

Good First Issues vs FYC for ease of contribution

To quickly assess the ease of contribution, and how these two metrics perform, let's compare the top 10 projects based on GFI labels with the newly introduced FYC metric. We use the term “Open GFIs” to denote the count of issues with the GFI label that were open at the time of analysis, and “FYC%” to denote the percentage of contributors that were in their first year at the time of their first contribution, both calculated in January 2024. In order to focus on well-known projects, we’ve restricted our attention to a subset of Frontend repos. For a greater range of comparisons, visit quira.sh/contribute.

In the first table, you'll find the top 10 projects ordered by the number of GFIs. Many of these repos are popular and fundamental Frontend projects (e.g. nuxt, next.js, or parcel). These are complex repos and inappropriate for first-time contributors in OSS, as reflected by their low FYC score. However, within this top 10, some projects present better opportunities for ease of contribution, notably meteor and mui-x, as indicated by their FYC scores.

Now, the list of projects in the second table ordered by FYC tells a different story.
Most repositories in this lineup have no open GFIs. Qualitatively we believe these to be much better choices if one is restricted to repos with 1000+ stars. We nevertheless find one or two limitations, both present in the first two repos. Firstly, openui5 has a very high FYC score, but only SAP employees appear to be able to contribute to the main branch, and the high score may reflect SAP-specific GitHub account creation. Secondly, redux-framework is also not a good choice since at the time of writing there are no open issues. Nevertheless, these are simple to filter out, and this is done by default in our own Contribute interface.

It's important to note that a high FYC% score doesn't necessarily mean a project is simple or "easy" (e.g., meteor is not a straightforward project). Rather, it signifies that contribution is feasible for users with the necessary background, without needing significant OSS experience. This may be attributed to excellent documentation, community management, streamlined environment setup, and self-contained issues which can be solved by less experienced developers.

Conclusion

In adopting the FYC metric, we shift our focus from individual issues to a broader repository-level metric. Beginning at the issue level can often be less effective, as issues assume knowledge and experience of the wider project, and the contributor's intentions. On the other hand, once developers have selected an appropriate repository, labels like good-first-issue can be invaluable.

Our aim is to help users discover the right project to begin contributing to. When the project fit is right, contributing becomes easier: in this case, developers will have the required context, and/or the maintainers can provide the required guidance. This lowers the barrier to subsequent contributions as the developer gets to know the project.
This stance challenges the conventional wisdom we've encountered from junior developers, who have heard that expanding their portfolio across various OSS projects is crucial for personal growth. Frequently jumping between repositories to solve issues may not be beneficial to anyone. This was aptly summarised by the user gonzodamus in the Reddit post mentioned above:

"What no one does tell you is that you should be picking a single project that you're interested in, learning the ins and outs of the code, and then making useful PRs when you have something to contribute."

Discovering the right project to contribute to in OSS is no easy feat. It requires a delicate balance between the ease of contribution, the complexity of the project, familiar tooling, the project's popularity, recency of other contributions, the activity of maintainers, the number of open issues, and other bespoke considerations. Most importantly, it's about finding a project that aligns with your skills, experience, and interests.

There’s certainly no one-size-fits-all solution, and at Quira we seek to provide an interface that allows each developer to find ways to contribute based on their own requirements. Rather than relying on a recommender system to guess your preferences, we enable you to take the wheel, browsing based on complexity, language, and topic, filtered to active repositories. Explore all of this, including FYC, on quira.sh/contribute. Take a look and share your thoughts with us!

By Alex Bird

Feb 20, 2024

10 min read

Introducing Dependency Widgets

Hi all and welcome to another Quira product update! We’re pumped to be announcing the launch of newly updated Quira widgets! These include revised designs and a smaller overall footprint, to take up just a little less of that valuable real estate on your GitHub readme. Of course, we think you’ll absolutely love these - so if you really want us to make them huge again, you need only ask! 🐘 🤣

Alongside our updated widget designs we’re really excited to share that we’re launching an all new ‘Dependency widget’. This will show off the main software packages you use, based on Lines of Code that reference a specific package in your prior commits and specific usage of function calls. Check it out - we think it’s pretty cool and that it will look great on your profiles. As ever, please do let us know your thoughts and share any feedback with us. 👂

We’ve made it easier to seek out popular repositories on our Contribute page by allowing you to sort by Most Stars on repos - have fun exploring those galaxies! 🌟🧑🏻‍🚀

We’ve also made it easier to browse relevant repositories on Contribute by showing off the Languages mix of a repo on our Quick View card - Python, Brainf$%k, TypeScript stack anyone?! 🔎

Org repos!!! You asked and we delivered - many of you wanted to be able to share your org repos on Quira, so we’ve added the ability to connect GitHub Organisations and share relevant repos to our awesome community. Share your org repos and muster an army of Contributors to supercharge development - to get started visit the Add repo page 🏢

…Plus, we’ve made a whole host of other little quality of life improvements to our website to make your experience of hunting down interesting open source issues to solve just that little bit more lovely ❤️

Thanks for all the ongoing support and stay tuned for further Product updates in the weeks ahead.

→ Now, Go Forth and Contribute! 🚀

With ❤️ from the Quira Product team.

By Product Team

Aug 07, 2023

2 min read
