
Web Perf Hero: Thiemo Kreuz

29 December 2025 at 04:30

Today we recognise Thiemo’s broad impact in improving the performance of Wikimedia software. From optimizing code across the MediaWiki stack, as felt on Wikipedia.org, to speeding up CI for faster developer feedback, this work benefits us every day!

Thiemo Kreuz works in the Technical Wishes team at Wikimedia Deutschland. He did most of this performance work as a paid software developer. “We are free to spend a portion of our time on side projects like these”, Thiemo wrote to us.

Performance as part of a routine

The tools on performance.wikimedia.org are part of building a culture of performance. They help you understand how code performs in production and on real devices, and they empower developers to maintain performance through regular assessment and incremental improvement. Perf matters, because improving performance is an essential step toward equity of access!

We celebrate Thiemo’s tireless efforts with a story about performance as part of a routine, rather than one specific change. We’ll look at a few examples, but there are many other interesting Git commits if you’re curious for more.

Wikitext editor

The CodeMirror extension for MediaWiki provides syntax highlighting, for example, when editing template pages.

“I found a nasty performance issue in CodeMirror’s syntax highlighter for wikitext that was sitting there for a really, really long time”, Thiemo wrote about T270317 and T270237, which would cause your browser to freeze on long articles. “But nobody could figure out why. Answer: Bad regexes with missing boundary assertions.”
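
To illustrate the class of bug (a hedged sketch, not the actual CodeMirror patterns from T270317/T270237): a regex built from nested quantifiers over overlapping alternatives, with no boundary or possessive assertion to make it fail fast, can backtrack enormously on long input that almost matches.

    <?php
    // Hypothetical patterns, for illustration only.
    $risky = '/\[\[(?:[^\]]+|.)*\]\]/'; // ambiguous: the same text can be split many ways
    $safer = '/\[\[[^\[\]]*+\]\]/';     // possessive quantifier: no re-splitting, fails fast

    // A long wikitext link that is missing its closing "]]".
    $almostALink = '[[' . str_repeat( 'a', 20000 ) . ']';

    var_dump( preg_match( $safer, $almostALink ) ); // int(0): gives up almost instantly
    // preg_match( $risky, $almostALink ) may churn through a huge number of
    // backtracking steps (or abort at pcre.backtrack_limit) on the same input;
    // a browser's JavaScript engine has no such limit and simply freezes.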

VisualEditor template editor

With the WMDE Technical Wishes team, Thiemo worked on VisualEditor’s template dialog and dramatically improved its performance. “This is mostly about lazy-loading parts of the UI”, Thiemo wrote. This matters because the community maintains templates that sometimes define several hundred parameters.

Faster stylesheet compilation

ResourceLoader is the MediaWiki delivery system for frontend styles, scripts, and localisation. It uses the Less.php library for stylesheet compilation. Thiemo heavily optimized the stylesheet parser through native function calls, inlining, and other techniques. This resulted in a 15% reduction in one change, 8% in another, 5% in a third, and several more changes after that.
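
As a hedged illustration of this kind of parser micro-optimization (not the actual Less.php code), a hot-path helper that scans characters one at a time can often be replaced by a single call to a native, C-implemented string function:

    <?php
    // Before: per-character work in a hot path, one function call per byte.
    function skipWhitespaceSlow( string $input, int $pos ): int {
        $len = strlen( $input );
        while ( $pos < $len && ctype_space( $input[$pos] ) ) {
            $pos++;
        }
        return $pos;
    }

    // After: one native call does the whole scan. In a parser that touches every
    // byte of several hundred stylesheets, such savings compound quickly.
    function skipWhitespaceFast( string $input, int $pos ): int {
        return $pos + strspn( $input, " \t\n\r\x0B\f", $pos );
    }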

The motivation for this work was faster feedback from CI. While we compile only a handful of Less stylesheets during a page view, we have several hundred Less stylesheet files in our codebase. Our CI automatically checks all frontend assets for compilation errors, without needing dedicated unit tests. This speed-up brought us one step closer to realising the 5-minute pipeline.

Codesniffer rules

MediaWiki has extensive static analysis rules that automate and codify things we learned over two decades. Many such rules are implemented using PHP_CodeSniffer and run both locally and in CI via the composer test command. New rules are developed all the time and discussed in Phabricator. These new rules come at a cost.

“I keep coming back to our MediaWiki ruleset for PHPCS to check if it still runs as fast as it used to”, Thiemo wrote. “I find this particularly interesting because it requires a very specific ‘unfair’ type of optimization: We don’t care how slow the unhappy path is when it finds errors, because that’s the exceptional case that typically never happens. But we care a lot about the happy path, because that gets executed over and over again with every CI run.”

Example changes: 3X faster MultipleEmptyLines rule, 10X faster EmptyTag documentation rule.
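
As a hedged sketch of the principle (simplified, and not one of the actual MediaWiki sniffs), a PHP_CodeSniffer rule can be structured so that the happy path exits after one or two cheap checks, and the expensive work only runs once a violation is already likely:

    <?php
    use PHP_CodeSniffer\Files\File;
    use PHP_CodeSniffer\Sniffs\Sniff;

    class ExampleEmptyLinesSniff implements Sniff {

        public function register(): array {
            return [ T_WHITESPACE ];
        }

        public function process( File $phpcsFile, $stackPtr ) {
            $tokens = $phpcsFile->getTokens();

            // Happy path: most whitespace tokens are indentation or ordinary line
            // breaks, not blank lines. Reject them with two cheap comparisons,
            // because this branch runs on every clean file in every CI run.
            if ( $tokens[$stackPtr]['column'] !== 1 || $tokens[$stackPtr]['content'] !== "\n" ) {
                return;
            }

            // Unhappy path: we are on a blank line. Only now do the extra lookup;
            // this branch can afford to be slow, because it only runs when the
            // file already (nearly) fails the check.
            $prev = $tokens[$stackPtr - 1] ?? null;
            if ( $prev !== null && $prev['column'] === 1 && $prev['content'] === "\n" ) {
                $phpcsFile->addError(
                    'Multiple consecutive blank lines are not allowed',
                    $stackPtr,
                    'MultipleEmptyLines'
                );
            }
        }
    }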

Back to basics

Thiemo likes improving low-level libraries and frameworks, such as wikimedia/services and OOUI. “The idea is that even the tiniest of optimizations can make a notable difference, because a piece of library code is executed so often”, Thiemo wrote.

Web Perf Hero award

The Web Perf Hero award is given to individuals who have gone above and beyond to improve the web performance of Wikimedia projects. The initiative started in 2020 and takes the form of a Phabricator badge. You can find past recipients at the Web Perf Hero award page on Wikitech.

Unifying our mobile and desktop domains

21 November 2025 at 13:00

How we achieved 20% faster mobile response times, improved SEO, and reduced infrastructure load.

Until now, when you visited a wiki (like en.wikipedia.org), the server responded in one of two ways: a desktop page, or a redirect to the equivalent mobile URL (like en.m.wikipedia.org). This mobile URL in turn served the mobile version of the page from MediaWiki. Our servers have operated this way since 2011, when we deployed MobileFrontend.

Before: Wikimedia CDN responds with a redirect from en.wikipedia.org to en.m.wikipedia.org for requests from mobile clients, and en.m.wikipedia.org then responds with the mobile HTML. After: Wikimedia CDN responds directly with the mobile HTML.
Diagram of technical change.

Over the past two months we unified the mobile and desktop domains for all wikis (timeline). This means we no longer redirect mobile users to a separate domain while the page is loading.

We completed the change on Wednesday 8 October after deploying to English Wikipedia. The mobile domains became dormant within 24 hours, which confirms that most mobile traffic arrived on Wikipedia via the standard domains and thus experienced a redirect until now.[1][2]

Why?

Why did we have a separate mobile domain? And, why did we believe that changing this might benefit us?

The year is 2008 and all sorts of websites, large and small, have a mobile subdomain. The BBC, IMDb, Facebook, and newspapers around the world feature the iconic m-dot domain. For Wikipedia, a separate mobile domain made the mobile experiment low-risk to launch and avoided technical limitations. It became the default in 2011 by way of a redirect.

Fast-forward seventeen years, and much has changed. It is no longer common for websites to have m-dot domains. Wikipedia’s use of one is surprising to our present-day audience, and it may decrease the perceived strength of domain branding. The technical limitations we had in 2008 have long been solved, with the Wikimedia CDN having efficient and well-tested support for variable responses under a single URL. And above all, we had reason to believe Google had stopped supporting separate mobile domains, which motivated the project to start when it did.
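
A minimal sketch of what “variable responses under a single URL” means in practice (the header name and skin choice here are illustrative assumptions, not the actual Wikimedia CDN contract): the CDN normalizes the device into a small set of classes, the origin varies its HTML on that class, and caches store both variants under one URL.

    <?php
    // Hypothetical request header set by the CDN after sniffing the User-Agent.
    $deviceClass = $_SERVER['HTTP_X_DEVICE_CLASS'] ?? 'desktop';

    // Declare the dependency so caches keep separate desktop/mobile variants
    // for the very same URL, instead of redirecting to a second domain.
    header( 'Vary: X-Device-Class' );

    // The origin picks the skin; the URL never changes.
    $skin = ( $deviceClass === 'mobile' ) ? 'minerva' : 'vector';
    echo "<!doctype html>\n<html class=\"skin-$skin\">…</html>\n";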

You can find a detailed history and engineering analysis in the Mobile domain sunsetting RFC along with weekly updates on mediawiki.org.

Site speed

Google used to link from mobile search results directly to our mobile domain, but last year this stopped. This exposed a huge part of our audience to the mobile redirect and regressed mobile response times by 10-20%.[2]

Google supported mobile domains in 2008 by letting you advertise a separate mobile URL. While Google only indexed the desktop site for content, they stored this mobile URL and linked to it when searching from a mobile device.[3] This allowed Google referrals to skip over the redirect.

Google introduced a new crawler in 2016, and gradually re-indexed the Internet with it.[4-7] This new “mobile-first” crawler acts like a mobile device rather than a desktop device, and removes the ability to advertise a separate mobile or desktop link. It’s now one link for everyone! Wikipedia.org was among the last sites Google switched, with May 2024 as the apparent change window.[2] This meant that the 60% of incoming pageviews referred by Google now had to wait for the same redirect that the other 40% of referrals had experienced since 2011.[8]

Persian Wikipedia saw a quarter second cut in the “responseStart” metric from 1.0s to 0.75s.

Unifying our domains eliminated the redirect and led to a 20% improvement in mobile response times.[2] This improvement is both a recovery and a net-improvement because it applies to everyone! It recovers the regression that Google-referred traffic started to experience last year, but also improves response times for all other traffic by the same amount.

The graphs below show how the change was felt worldwide. The “Worldwide p50” corresponds to what you might experience in Germany or Italy, with fast connectivity close to our data centers. The “Worldwide p80” resembles what you might experience in Iran browsing the Persian Wikipedia.

Worldwide p80 regressed 11% from 0.63s to 0.70s, then reduced 18% from 0.73s to 0.60s. Worldwide p75 regressed 13% to 0.61s, then reduced 19% to 0.52s. Worldwide p50 regressed 22% to 0.33s, then reduced 21% to 0.27s. Full table in the linked comment on Phabricator.
Check the Perf report to explore the underlying data and for other regions.

SEO

The first site affected was not Wikipedia but Commons. Wikimedia Commons is the free media repository used by Wikipedia and its sister projects. Tim Starling found in June that only half of the 140 million pages on Commons were known to Google.[9] Of these known pages, 20 million had also been delisted, and that number had been growing by one million delisted pages every month.[10] The cause of the delisting turned out to be the mobile redirect: the new Google crawler, just like your browser, also has to follow it.

After following the redirect, the crawler reads our page metadata which points back to the standard domain as the preferred one. This creates a loop that can prevent a page from being updated or listed in Google Search. Delisting is not a matter of ranking, but about whether a page is even in the search index.

Tim and I disabled the mobile redirect for “Googlebot on Commons” through an emergency intervention on June 23rd. Referrals then began to come back, and kept rising for eleven weeks in a row, until reaching a 100% increase in Google referrals: from a baseline of 3 million weekly pageviews up to 6 million. Google’s data on clickthroughs shows a similar increase from 1M to 1.8M “clicks”.[9]

Pageviews to Wikimedia Commons with type equal to user (meaning not a known bot or spider) and referrer equal to Google. After July 2025, they increase from 3 million to 6 million per week.
Google-referred pageviews in 2025.
Stable 1.0 million clicks per week in June and early July, then an increase to 1.8 million clicks per week in mid-July, where it stayed.
Weekly clicks (according to Google Search Console).

We reversed last year’s regression and set a new all-time high. We think there are three reasons Commons reached new highs:

  1. The redirect consumed half of the crawl budget, thus limiting how many pages could be crawled.[10][11]
  2. Google switched Commons to its new crawler some years before Wikipedia.[12] The index had likely been shrinking for two years already.
  3. Pages on Commons have a sparse link graph. Wikipedia has a rich network of links between articles, whereas pages on Commons represent a photo with an image description that rarely links to other files. This unique page structure makes it hard to discover Commons pages through recursive crawling without a sitemap.

Unifying our domains lifted a ceiling we didn’t know was there!

The MediaWiki software has a built-in sitemap generator, but we disabled this on Wikimedia sites over a decade ago.[13] We decided to enable it for Commons and submitted it to Google on August 6th.[14][15] Google has since indexed 70 million new pages for Commons, up 140% since June.[9]

We also found that less than 0.1% of videos on Commons were recognised by Google as video watch pages (for the Google Search “Videos” tab). I raised this in a partnership meeting with Google Search, and it may have been a bug on their end. Commons started showing up in Google Videos a week later.[16][17]

Link sharing UX

Links shared from a mobile device previously hardcoded the mobile domain, so they opened the mobile site even when received on desktop. The “Desktop” link in the footer of the mobile site pointed to the standard domain and disabled the standard-to-mobile redirect, on the assumption that you had arrived on the mobile site via that redirect. But this choice was not remembered on the mobile domain itself, and no equivalent mobile-to-standard redirect existed. A shared mobile link therefore always presented the mobile site, even after you had opted out on desktop.

Everyone now shares the same domain, which naturally shows the appropriate version.

There is a long tail of stable referrals from news articles, research papers, blogs, talk pages, and mailing lists that refer to the mobile domain. We plan to support this indefinitely. To limit operational complexity, we now serve these through a simple whole-domain redirect. This has the benefit of retroactively fixing the UX issue because old mobile links now redirect to the standard domain.[18]

This resolves a long-standing bug with workarounds in the form of shared user scripts,[19] browser extensions,[20] and personal scripts.[24]

Infrastructure load

After publishing an edit, MediaWiki instructs the Wikimedia CDN to clear the cache of affected articles (“purge”). It has been a perennial concern from SRE teams at WMF that our CDN purge rates are unsustainable. For every purge from MediaWiki core, the MobileFrontend extension would add a copy for the mobile domain.

Daily purge workload.

After unifying our domains we turned off these duplicate purges, and cut the MediaWiki purge rate by 50%. Over the past weeks the Wikimedia CDN processed approximately 4 billion fewer purges a day. MediaWiki used to send purges at a baseline rate of 40K/second with spikes up to 300K/second, and both have been halved. Factoring in other services, the Wikimedia CDN now receives 20% to 40% fewer purges per second overall, depending on the edit activity.[18]
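
A simplified sketch of why the purge rate halves (hypothetical code, not the actual MediaWiki or MobileFrontend implementation): with a separate mobile domain, every purge for a page had to be emitted once per host serving that page.

    <?php
    // Build the list of CDN URLs to purge when a page is edited.
    function cdnUrlsForTitle( string $dbKey, bool $separateMobileDomain ): array {
        $urls = [ "https://en.wikipedia.org/wiki/$dbKey" ];
        if ( $separateMobileDomain ) {
            // The m-dot era: duplicate each purge for the mobile host.
            $urls[] = "https://en.m.wikipedia.org/wiki/$dbKey";
        }
        return $urls;
    }

    var_dump( count( cdnUrlsForTitle( 'PHP', true ) ) );  // int(2): before unification
    var_dump( count( cdnUrlsForTitle( 'PHP', false ) ) ); // int(1): after unification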

Footnotes

  1. T403510: Main rollout, Wikimedia Phabricator.
  2. T405429: Detailed traffic stats and performance reports, Wikimedia Phabricator.
  3. Running desktop and mobile versions of your site (2009), developers.google.com.
  4. Mobile-first indexing (2016), developers.google.com.
  5. Google makes mobile-first indexing default for new domains (2019), TechCrunch.
  6. Mobile-first indexing has landed (2023), developers.google.com.
  7. Mobile indexing vLast final final (Jun 2024), developers.google.com.
  8. Mobile domain sunsetting RFC § Footnote: Wikimedia pageviews (Feb 2025), mediawiki.org.
  9. T400022: Commons SEO review, Wikimedia Phabricator.
  10. T54647: Image pages not indexed by Google, Wikimedia Phabricator.
  11. Crawl Budget Management For Large Sites, developers.google.com.
  12. I don’t have a firm estimate for when Google switched Commons to its new crawler. I pinpointed May 2024 as the switch date for Wikipedia based on the new redirect impacting page load times (i.e. a non-zero fetch delay). For Commons, this fetch delay had already been non-zero since at least 2018. This suggests Google’s old crawler linked mobile users to the Commons canonical domain, unlike Wikipedia, which it linked to the mobile domain until last year. Raw perf data: P73601.
  13. History of sitemaps at Wikimedia by Tim Starling, wikitech.wikimedia.org.
  14. T396684: Develop Sitemap API for MediaWiki, Wikimedia Phabricator.
  15. T400023: Deploy Sitemap API for Commons, Wikimedia Phabricator.
  16. T396168: Video pages not indexed by Google, Wikimedia Phabricator.
  17. Google Videos Search results for commons.wikimedia.org.
  18. T405931: Clean up and redirect, Wikimedia Phabricator.
  19. Wikipedia:User scripts/List on en.wikipedia.org. Featuring NeverUseMobileVersion, AutoMobileRedirect, and unmobilePlus.
  20. Redirector (10,000 users), Chrome Web Store.
  21. How can I force my desktop browser to never use mobile Wikipedia (2018), StackOverflow.
  22. Skip Mobile Wikipedia (726 users), Firefox Add-ons.
  23. Search for “mobile wikipedia”, Firefox Add-ons.
  24. Mobile domain sunsetting 2025 Announcement § Personal script workarounds (Sep 2025), mediawiki.org.

About this post

Featured image by PierreSelim, CC BY 3.0, via Wikimedia Commons.

APIs as a product: Investing in the current and next generation of technical contributors

12 June 2025 at 16:21

Wikipedia is coming up on its 25th birthday, and that would not have been possible without the Wikimedia technical volunteer community. Supporting technical volunteers is crucial to carrying forward Wikimedia’s free knowledge mission for generations to come. In line with this commitment, the Foundation is turning its attention to an important area of developer support—the Wikimedia web (HTTP) APIs. 

Both Wikimedia and the Internet have changed a lot over the last 25 years. Patterns that are now ubiquitous standards either didn’t exist or were still in their infancy as the first APIs allowing developers to extend features and automate tasks on Wikimedia projects emerged. In fact, the term “representational state transfer”, better known today as REST, was first coined in 2000, just months before the very first Wikipedia article was published, and only 6 years before the Action API was introduced. Because we preceded what have since become industry standards, our most powerful and comprehensive API solution, the Action API, sticks out as being unlike other APIs – but for good reason, if you understand the history.

Wikimedia APIs are used within Foundation-authored features and by volunteer developers. A common sentiment that surfaced through the recent API Listening Tour, conducted with a mix of volunteers and Foundation staff, is “Wikimedia APIs are great, once you know what you’re doing.” New developers first entering the Wikimedia community face a steep learning curve when trying to onboard due to unfamiliar technologies and complex APIs that may require a deep understanding of the underlying Wikimedia systems and processes. While recognizing the power, flexibility, and mission-critical value that developers created using the existing API solutions, we want to make it easier for developers to make more meaningful contributions faster. We have no plans to deprecate the Action API nor treat it as ‘legacy’. Instead, we hope to make it easier and more approachable for both new and experienced developers to use. We also aim to expand REST coverage to better serve developers who are more comfortable working in those structures.

We are focused on simplifying, modernizing, and standardizing Wikimedia API offerings as part of the Responsible Use of Infrastructure objective in the FY25-26 Annual Plan (see: the WE5.2 key result). Focusing on common infrastructure that encourages responsible use allows us to continue to prioritize reliable, free access to knowledge for the technical volunteer community, as well as the readers and contributors they support. Investing in our APIs and the developer experiences surrounding them will ensure a healthy technical community for years to come. To achieve these objectives, we see three main areas for improving the sustainability of our API offering: simplification, documentation, and communication.

Simplification

To reduce maintenance costs and ensure a seamless developer experience, we are simplifying our API infrastructure and bringing greater consistency across all APIs. Decades of organic growth without centralized API governance led to fragmented, bespoke implementations that now hinder technical agility and standardization. Beyond that, maintaining services is not free; we are paying for duplicative infrastructure costs, some of which are scaling directly with the amount of scraper traffic hitting our services.

In light of the above, we will focus on transitioning at least 70% of our public endpoints to common API infrastructure (see the WE 5.2 key result). Common infrastructure makes it easier to maintain and roll out changes across our APIs, in addition to empowering API authors to move faster. Instead of expecting API authors to build and manage their own solutions for things like routing and rate limiting, we will create centralized tools and processes that make it easier to follow the “golden path” of recommended standards. That will allow centralized governance mechanisms to drive more consistent and sustainable end-user experiences, while enabling flexible, federated API ownership. 

An example of simplified internal infrastructure will be introducing a common API Gateway for handling and routing all Wikimedia API requests. Our approach will start as an “invisible gateway” or proxy, with no changes to URL structure or functional behavior for any existing APIs. Centralizing API traffic will make observability across APIs easier, allowing us to make better data-driven decisions. We will use this data to inform endpoint deprecation and versioning, prioritize human and mission-oriented access first, and ultimately provide better support to our developer community.  

Centralized management and traffic identification will also allow us to have more consistent and transparent enforcement of our API policies. API policy enforcement enables us to protect our infrastructure and ensure continued access for all. Once API traffic is rerouted through a centralized gateway, we will explore simplifying options for developer identification mechanisms and standardizing how rate limits and other API access controls are applied. The goal is to make it easier for all developers to know exactly what is expected and what limitations apply.

As we update our API usage policies and developer requirements, we will avoid breaking existing community tools as much as possible. We will continue offering low-friction entry points for volunteer developers experimenting with new ideas, lightly exploring data, or learning to build in the Wikimedia ecosystem. But we must balance support for community creativity and innovation with the need to reduce abuse, such as scraping, Denial of Service (DoS) attacks, and other harmful activities. While open, unauthenticated API access for everyone will continue, we will need to make adjustments. To reduce the likelihood and impact of abuse, we may apply stricter rate limits to unauthenticated traffic and more consistent authentication requirements to better match our documented API policy, Robot policy, and API etiquette guidelines, as well as consolidate per-API access guidelines.

To continue supporting Wikimedia’s technical volunteer community and minimize disruption to existing tools, community developers will have simple ways to identify themselves and receive higher limits or other access privileges. In many cases, this won’t require additional steps. For example, instead of universally requiring new access tokens or authentication methods, we plan to use IP ranges from Wikimedia Cloud Services (WMCS) and User-Agent headers to grant elevated privileges to trusted community tools, approved bots, and research projects. 

Documentation

It is essential for any API to enable developers to self-serve their use cases through clear, consistent, and modern documentation experiences. However, Wikimedia API documentation is frequently spread across multiple wiki projects, generated sites, and communication channels, which can make it difficult for developers to find the information they need, when they need it. 

To address this, we are working towards a top-requested item coming out of the 2024 developer satisfaction survey: OpenAPI specs and interactive sandboxes for all of our APIs (including conducting experiments to see if we can use OpenAPI to describe the Action API). The MediaWiki Interfaces team began addressing this request through the REST Sandbox, which we released to a limited number of small Wikipedia projects on March 31, 2025. Our implementation approach allows us to generate an OpenAPI specification, which we then use to power a SwaggerUI sandbox. We are also using the OpenAPI specs to automatically validate our endpoints as part of our automated deployment testing, which helps ensure that the generated documentation always matches the actual endpoint behavior. 

In addition, the generated OpenAPI spec offers translation support (powered by Translatewiki) for critical and contextual information like endpoint and parameter descriptions. We believe this is a more equitable approach to API documentation for developers who don’t have English as their preferred language. In the coming year, we plan to transition from Swagger UI to a custom Codex implementation for our sandbox experiences, which will enable full translation support for sandbox UI labels and navigation, as well as a more consistent look and feel for Wikimedia developers. We will also expand coverage for OpenAPI specs and sandbox experiences by introducing repeatable patterns for API authors to publish their specs to a single location where developers can easily browse, learn, and make test calls across all Wikimedia API offerings. 

Communication

When new endpoints are released or breaking changes are required, we need a better way to keep developers informed. As information is shared through different channels, it can become challenging to keep track of the full picture. Over the next year, we will address this on a few fronts. 

First, from a technical change management perspective, we will introduce a centralized API changelog. The changelog will summarize new endpoints, as well as new versions, planned deprecations, and minor changes such as new optional parameters. This will help developers with troubleshooting, as well as help them to more easily understand and monitor the changes happening across the Wikimedia APIs.

In addition to the changelog, we remain committed to consistently communicating changes early and often. As another step towards this commitment, we will provide migration guides and, where needed, provide direct communication channels for developers impacted by the changes to help guarantee a smooth transition. Recognizing that the Wikimedia technical community is split across many smaller communities both on and off-wiki, we will share updates in the largest off-wiki communities, but we will need volunteer support in directing questions and feedback to the right on-wiki pages in various languages. We will also work with communities to make their purpose and audience clearer for new developers so they can more easily get support when they need it and join the discussion with fellow technical contributors. 

Over the next few months, we will also launch a new API beta program, where developers are invited to interact with new endpoints and provide feedback before the capabilities are locked into a long-term stable version. Introducing new patterns through a beta program will allow developers to directly shape the future of the Wikimedia APIs to better suit their needs. To demonstrate this pattern, we will start with changes to MediaWiki REST APIs, including introducing API modularization and consistent structures. 

What’s Next

We are still in the early stages – we are just taking the first steps on the journey toward a unified API product offering. But we hope that by this time next year, we will be running towards it together. Your involvement and insights can help us shape a future that better serves the technical volunteers behind our knowledge mission. To keep you informed, we will continue to post updates on mailing lists, Diff, TechBlog, and other technical volunteer communication channels. We also invite you to stay actively engaged: share your thoughts on the WE5 objective in the annual plan, ask questions on the related discussion pages, review slides from the Future of Wikimedia APIs session we conducted at the Wikimedia Hackathon, volunteer for upcoming Listening Tour topics, or come talk to us at upcoming events such as Wikimania Nairobi.

Technical volunteers play an essential role in the growth and evolution of Wikipedia, as well as all other Wikimedia projects. Together, we can make a better experience for developers who can’t remember life before Wikipedia, and make sure that the next generation doesn’t have to live without it. Here’s to another 25 years! 

Web Perf Hero: Máté Szabó

16 January 2024 at 05:30

MediaWiki is the platform that powers Wikipedia and other Wikimedia projects. There is a lot of traffic to these sites. We want to serve our audience in a way that they get the best experience and performance possible. So efficiency of the MediaWiki platform is of great importance to us and our readers.

MediaWiki is a relatively large application with 645,000 lines of PHP code in 4,600 PHP files, and growing! (Reported by cloc.) When you have as much traffic as Wikipedia, working on such a project can create interesting problems. 

MediaWiki uses an “autoloader” to find and import classes from PHP files into memory. In PHP, this happens on every single request, as each request gets its own process. In 2017, we introduced support for loading classes from PSR-4 namespace directories (in MediaWiki 1.31). This mechanism involves checking which directory contains a given class definition.
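
A simplified sketch of PSR-4 lookup (not MediaWiki’s actual Autoloader): a namespace prefix can map to one or more directories, so resolving a class means probing the filesystem until a matching file turns up.

    <?php
    // Illustrative prefix map; real MediaWiki registers many more entries.
    $psr4Prefixes = [
        'MediaWiki\\Hook\\' => [ __DIR__ . '/includes/Hook' ],
    ];

    spl_autoload_register( static function ( string $class ) use ( $psr4Prefixes ) {
        foreach ( $psr4Prefixes as $prefix => $dirs ) {
            if ( strpos( $class, $prefix ) !== 0 ) {
                continue;
            }
            $relative = str_replace( '\\', '/', substr( $class, strlen( $prefix ) ) ) . '.php';
            foreach ( $dirs as $dir ) {
                $path = "$dir/$relative";
                if ( file_exists( $path ) ) { // the stat() syscall seen in production
                    require $path;
                    return;
                }
            }
        }
    } );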

Problem statement

Kunal (@Legoktm) noticed that after MediaWiki 1.35, wikis became slower due to spending more time in fstat system calls. Syscalls make a program switch to kernel mode, which is expensive.

We learned that our Autoloader was the one making the fstat calls, to check file existence. The logic powers the PSR-4 namespace feature and actually existed before MediaWiki 1.35. But it only became noticeable after we introduced the HookRunner system, which loaded over 500 new PHP interfaces via the PSR-4 mechanism.

MediaWiki’s Autoloader has a class map array that maps class names to their file paths on disk. PSR-4 classes do not need to be present in this map. Before introducing HookRunner, very few classes in MediaWiki were loaded by PSR-4. The new hook files leveraged PSR-4, exposing many file_exists() calls for PSR-4 directory searching on every request. This adds up quickly, thereby degrading MediaWiki performance.

See task T274041 on Phabricator for the collaborative investigation between volunteers and staff.

Solution: Optimized class map

Máté Szabó (@TK-999) took a deep dive and profiled a local MediaWiki install with php-excimer and generated a flame graph. He found that about 16.6% of request time was spent in the Autoloader::find() method, which is responsible for finding which file contains a given class.

Figure 1: Flame graph by Máté Szabó.
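
Roughly how such a profile can be captured with php-excimer (a hedged sketch; the exact setup Máté used may differ):

    <?php
    $profiler = new ExcimerProfiler();
    $profiler->setPeriod( 0.001 );           // sample every 1 ms
    $profiler->setEventType( EXCIMER_REAL ); // wall-clock time
    $profiler->start();

    // ... let MediaWiki handle the request ...

    $profiler->stop();
    // Collapsed stacks, ready for flamegraph.pl or speedscope.app.
    file_put_contents( '/tmp/profile.folded', $profiler->getLog()->formatCollapsed() );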

Checking for file existence during PSR-4 autoloading seems necessary because one namespace can correspond to multiple directories that promise to define some of its classes. The search logic has to check each directory until it finds a class file. Only when the class is not found anywhere may the program crash with a fatal error.

Máté avoided the directory searching cost by expanding MediaWiki’s Autoloader class map to include all classes, including those registered via PSR-4 namespaces. This solution makes use of a hash map in which each class maps to exactly one file path on disk: a 1-to-1 mapping.

This means the Autoloader::find() method no longer has to search through the PSR-4 directories. It knows upfront where each class is, by merely accessing the array from memory. This removes the need for file existence checks. This approach is similar to the autoloader optimization flag in Composer.
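
A simplified sketch of the optimized lookup (the entries are illustrative, not MediaWiki’s generated map): with every class in one array, resolution is a single hash lookup followed by a require, with no directory probing and no file_exists() calls.

    <?php
    // Generated ahead of time and loaded once per request.
    $classMap = [
        'MediaWiki\\HookContainer\\HookRunner' => __DIR__ . '/includes/HookContainer/HookRunner.php',
        'Parser'                               => __DIR__ . '/includes/parser/Parser.php',
    ];

    spl_autoload_register( static function ( string $class ) use ( $classMap ) {
        if ( isset( $classMap[$class] ) ) {
            require $classMap[$class]; // one array lookup, no filesystem search
        }
    } );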


Impact

Máté’s optimization significantly reduced response time by optimizing the Autoloader::find() method. This is largely due to the elimination of file system calls.

After deploying the change to MediaWiki appservers in production, we saw a major shift in response times toward faster buckets: a ~20% increase in requests completed within 50ms, and a ~10% increase in requests served under 100ms (T274041#8379204).

Máté analyzed the baseline and classmap cases locally, benchmarking 4800 requests, controlled at exactly 40 requests per second. He found latencies reduced on average by ~12%:

Table 1: Difference in latencies between baseline and classmap autoloader.
Latencies             Baseline   Full classmap
p50 (mean average)    26.2ms     22.7ms (~13.3% faster)
p90                   29.2ms     25.7ms (~11.8% faster)
p95                   31.1ms     27.3ms (~12.3% faster)

We reproduced Máté’s findings locally as well. On the Git commit right before his patch, Autoloader::find() really stands out.

Figure 2: Profile before optimization.
Figure 3: Profile after optimization.

NOTE: We used ApacheBench to load the /wiki/Main_Page URL from a local MediaWiki installation with PHP 8.1 on an Apple M1. We ran it both in a bare metal environment (PHP built-in webserver, 8 workers, no APCu) and in MediaWiki-Docker. We configured our benchmark to run 1000 requests with 7 concurrent requests. The profiles were captured using Excimer with a 1ms interval. The flame graphs were generated with Speedscope, and the box plots were created with Gnuplot.

In Figures 4 and 5, the “After” box plot has a lower median than the “Before” box plot. This means there is a reduction in latency. Also, the standard deviation in the “After” scenario shrank, which indicates that responses were more consistently fast (not only on average). This increases the percentage of our users who have an experience very close to the average response time of web requests. Fewer users now experience an extreme case of web response slowness.

Figure 4: Boxplot for requests on bare metal.
Figure 5: Boxplot for requests on Docker.

Web Perf Hero award

The Web Perf Hero award is given to individuals who have gone above and beyond to improve the web performance of Wikimedia projects. The initiative is led by the Performance Team and started mid-2020. It is awarded quarterly and takes the form of a Phabricator badge.

Read about past recipients at Web Perf Hero award on Wikitech.


Further reading
