Perf Matters at Wikipedia in 2020

20 December 2022 at 20:00

Organizing a web performance conference

Photo by Sia Karamalegos, CC BY-SA 4.0.

There are numerous industry conferences dedicated to web performance. We have attended and spoken at several of them, and noticed that important topics remain underrepresented. While the logistics of organizing a conference are too daunting for our small team, FOSDEM presents an appealing compromise.

The Wikimedia Performance Team organized the inaugural Web Performance devroom at FOSDEM 2020. 

FOSDEM is the biggest Free and Open Source software conference in the world. It takes place in Brussels every year, is free to attend, and attracts over 8000 attendees. FOSDEM is known for its many self-organized conference tracks, called “devrooms”. The logistics are taken care of by FOSDEM, while we focus on programming the content. We ran our own CfP, curated and invited speakers, and emceed the event.

📖 Read all about it on our blog: Organizing a developer room at FOSDEM
🎥 Watch sessions: Web Performance @ FOSDEM 2020
📙 See also: History of Wikimedia attending FOSDEM

Multi-DC progress

This year saw the completion of two milestones on the MediaWiki Multi-DC roadmap. Multi-DC is a cross-team initiative, driven by the Performance Team, to evolve MediaWiki so that it can operate from multiple datacenters. It is motivated by higher resilience and by eliminating steps from switchover procedures, which eases or enables routine maintenance by allowing clusters to be turned off without a major switchover event.

The Multi-DC initiative has brought about performance and resiliency improvements across the MediaWiki codebases, and at every level of our infrastructure. These gains are effective even in today’s single-DC operation. We resolved long-standing tech debt and improved extension interfaces, which increased developer productivity. We also reduced dependencies and coupling, restructured business logic, and implemented asynchronous eventual-consistency solutions.

This year we applied the Multi-DC strategy to MediaWiki’s ChronologyProtector (T254634), and started work on the MainStash DB (T212129).
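For readers unfamiliar with ChronologyProtector, the rough idea is read-your-writes consistency across replicated datacenters: after a write, the primary’s replication position is remembered for that client, and the client’s next request waits for a local replica to catch up to that position. The sketch below illustrates only the concept; MediaWiki’s actual implementation is in PHP, and every name here is hypothetical.

```typescript
// Conceptual sketch only (hypothetical names): read-your-writes consistency
// across replicated datacenters, the problem ChronologyProtector solves.
interface PositionStore {
  get(clientId: string): Promise<number | undefined>;
  set(clientId: string, position: number, ttlSeconds: number): Promise<void>;
}

interface Replica {
  // Resolves true once the replica has replayed the primary's log up to `position`.
  waitForPosition(position: number, timeoutMs: number): Promise<boolean>;
}

// After a write on the primary DC, remember how far its replication log advanced.
async function recordWritePosition(
  store: PositionStore,
  clientId: string,
  primaryLogPosition: number
): Promise<void> {
  await store.set(clientId, primaryLogPosition, 60);
}

// On the client's next request (possibly served from another DC), wait for the
// local replica to catch up, so the user sees their own edit.
async function ensureReadYourWrites(
  store: PositionStore,
  replica: Replica,
  clientId: string
): Promise<void> {
  const position = await store.get(clientId);
  if (position !== undefined) {
    await replica.waitForPosition(position, 2000);
  }
}
```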

Read more at Performance/Multi-DC MediaWiki.

Setting up a mobile device lab

Today we collect real-user data from pageviews, which alerts us when a regression happens, but doesn’t help us investigate why or fix it. Synthetic testing complements this for desktop browsers, but we have no equivalent for mobile devices. Desktop browsers have an “emulate mobile” option, but DevTools emulation is nothing like a real mobile device.

The goal of the mobile device lab is to find performance regressions on Wikipedia that are relevant to the experience of our mobile users. Alerts will include detailed profiles for investigation, as we do for desktop browsers today.

📖 Read more at Learnings from setting up a performance device lab 

Introducing: Web Perf Hero award

Starting in 2020, we give out a Web Perf Hero award to individuals who have gone above and beyond to improve site performance. It’s awarded up to once a quarter to those who demonstrate repeated care and discipline around performance.

Browse posts tagged Web Perf Hero award or find an overview of Web Perf Hero award on Wikitech.

Performance perception survey

Since 2018, we have run an ongoing survey measuring performance perception on several Wikipedias. You can find the main findings in last year’s blog post. An important take-away was that none of the standard and new metrics we tried correlates well with real user experience. The “best” metric (page load time) achieved a Pearson correlation of a mere 0.14, where 1 would indicate a perfect correlation. As such, it remains valuable to survey the real perceived performance, as an empirical barometer to validate other performance monitoring.

Data from three cohorts, seen in Grafana. You can see that there’s loose correlation with page load time (“loadEventEnd”). When site performance degrades (time goes up), satisfaction gets worse too (positive percentage goes down). Likewise, when load time improves (yellow goes down), satisfaction improves (green goes up).

Refer to Research:Study of performance perception for the full dataset used in the 2019 paper.

Catalan Wikipedia, ca.wikipedia.org
Spanish Wikipedia, es.wikipedia.org
Russian Wikipedia, ru.wikipedia.org
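As a quick illustration of the statistic quoted above, this is how a Pearson correlation between two samples (say, page load times and satisfaction ratings) can be computed. The sketch and its numbers are made up for illustration and are not survey data.

```typescript
// Pearson correlation coefficient between two equally sized samples.
function pearson(x: number[], y: number[]): number {
  const n = x.length;
  const meanX = x.reduce((a, b) => a + b, 0) / n;
  const meanY = y.reduce((a, b) => a + b, 0) / n;
  let cov = 0, varX = 0, varY = 0;
  for (let i = 0; i < n; i++) {
    const dx = x[i] - meanX;
    const dy = y[i] - meanY;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  return cov / Math.sqrt(varX * varY);
}

// Made-up example: page load times (ms) versus a 1-5 satisfaction rating.
const loadTimes = [1200, 1800, 2500, 900, 3100];
const ratings = [4, 4, 2, 5, 2];
console.log(pearson(loadTimes, ratings).toFixed(2)); // negative: slower pages, lower ratings
```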

Miscellaneous

Further reading

About this post

Featured image by Kuhnmi, CC BY-SA 4.0, via Wikimedia Commons.

Perf Matters at Wikipedia in 2019

19 December 2022 at 11:00

A large-scale study of Wikipedia’s quality of experience

Last year we reported how our extensive literature review on performance perception changed our perspective on what the field of web performance actually knows.

Existing frontend metrics correlated poorly with user-perceived performance. It became clear that the best way to understand perceived performance is still to ask people directly about their experience. We set out to run our own survey to do exactly that, and to look for correlations between a range of well-known and novel performance metrics and that lived experience. We partnered with Dario Rossi (Telecom ParisTech) and Wikimedia Research to carry out the study (T187299).

While machine learning failed to explain everything, the survey unearthed many key findings. It gave us a newfound appreciation for the old-school Page Load Time metric, as the metric that best (or least terribly) correlated with the real human experience.

📖 A large-scale study of Wikipedia’s quality of experience, the published paper.

Refer to Research:Study of performance perception on Meta-Wiki for the dataset, background info, and an extended technical report.

Throughout the study, we blogged about various findings.

Join the World Wide Web Consortium (W3C)

W3C Logo

The Performance Team has been participating in web standards as individual “invited experts” for a while. We initiated the work for the Wikimedia Foundation to become a W3C member organization, and by March 2019 it was official.

As a member organization, we are now collaborating in W3C working groups alongside other major stakeholders in the Web!

Read more at Joining the World Wide Web Consortium

Element Timing API for Images experiment

In the search for a better user experience metric, we tried out the upcoming Element Timing API for images. This is meant to measure when a given image is displayed on-screen. We enrolled wikipedia.org in the ongoing Google Chrome origin trial for the Element Timing API.
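As a rough sketch (not our exact instrumentation), this is how Element Timing entries can be observed in the browser once an image opts in via the elementtiming attribute; the identifier “lead-image” is just an example.

```typescript
// Elements opt in with e.g. <img src="lead.jpg" elementtiming="lead-image">.
interface ElementTimingEntry extends PerformanceEntry {
  identifier: string;
  renderTime: number;
  loadTime: number;
}

const elementObserver = new PerformanceObserver((list) => {
  for (const entry of list.getEntries() as ElementTimingEntry[]) {
    // renderTime is when the image was painted; it can be 0 for cross-origin
    // images without Timing-Allow-Origin, in which case loadTime is a fallback.
    console.log(entry.identifier, entry.renderTime || entry.loadTime);
  }
});
elementObserver.observe({ type: 'element', buffered: true });
```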

Read all about it at Evaluating Element Timing API for Images 

Event Timing API origin trial

The upcoming Event Timing API is meant to help developers identify slow event handlers on web pages. This is an area of web performance that hasn’t gotten a lot of attention, but its effects can be very frustrating for users.
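The sketch below shows the kind of data the API exposes: a PerformanceObserver reports input events whose handling exceeded a threshold. Treat it as illustrative only; the exact fields and options have evolved since the origin trial.

```typescript
interface EventTimingEntry extends PerformanceEntry {
  processingStart: number;
  processingEnd: number;
}

const eventObserver = new PerformanceObserver((list) => {
  for (const entry of list.getEntries() as EventTimingEntry[]) {
    const handlerTime = entry.processingEnd - entry.processingStart;
    // entry.duration additionally includes queueing delay and the next paint.
    console.log(`${entry.name}: handlers ran for ${handlerTime.toFixed(1)} ms`);
  }
});
// Only report events that took longer than 100 ms end to end.
eventObserver.observe({ type: 'event', durationThreshold: 100, buffered: true } as PerformanceObserverInit);
```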

Via another Chrome origin trial, this experiment gave us an opportunity to gather data, discover bugs in several MediaWiki extensions, and provide early feedback on the W3C Editor’s Draft to the browser vendors designing this API.

Read more at Tracking down slow event handlers with Event Timing

Implement a new API in upstream WebKit

We decided to commission the implementation of a browser feature that measures performance from the end-user perspective. The Paint Timing API measures when content first appears on a visitor’s screen. Until now, this was a largely Chrome-only feature. Being unable to measure such a basic user experience metric for Safari visitors risks long-term bias, negatively affecting over 20% of our audience. It’s essential that we maintain equitable access and keep Wikimedia sites fast for everyone.
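For context, this is what consuming the API looks like on the client side, as a minimal sketch (Safari exposes first-contentful-paint, while Chromium also reports first-paint):

```typescript
const paintObserver = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    // entry.name is 'first-contentful-paint' (Chromium also emits 'first-paint').
    console.log(entry.name, `${entry.startTime.toFixed(0)} ms`);
  }
});
paintObserver.observe({ type: 'paint', buffered: true });

// Or query synchronously, e.g. after the load event:
for (const entry of performance.getEntriesByType('paint')) {
  console.log(entry.name, entry.startTime);
}
```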

We funded and oversaw implementation of the Paint Timing API in WebKit. We contracted Noam Rosenthal, who brings experience in both web standards and upstream WebKit development.

Read more at How Wikimedia contributed Paint Timing API to WebKit

Update (April 2021): The Paint Timing API has been released in Safari 14.1!

Wikipedia’s JavaScript initialisation on a budget

ResourceLoader is Wikipedia’s delivery system for styles, scripts, and localization. It delivers JavaScript code on web pages in two stages. This design prioritizes the user experience through optimal cache performance of HTML and individual modules, and through a consistent experience between page views (i.e. no flip-flopping between pages based on when they were cached). It also achieves a great developer experience by ensuring we don’t mix incompatible versions of modules on the same page, and by ensuring rollout (and rollback) of deployments completes worldwide in under 10 minutes.
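A minimal sketch of the second stage from a page’s perspective, using MediaWiki’s client-side mw.loader API (the module name is just an example): the small startup manifest registered in stage one lets page code lazily request only the bundles it actually needs.

```typescript
declare const mw: any; // MediaWiki's client-side global, provided by ResourceLoader

// Stage 2: lazily fetch and execute a registered module (and its dependencies)
// only when this page actually needs it.
mw.loader.using('mediawiki.api').then(() => {
  const api = new mw.Api();
  // ... the module's code has arrived and executed; use it here.
});
```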

This design rests on the first stage (the startup manifest) staying small. We carried out a large-scale audit that shrank the manifest size back down, and put monitoring and guidelines in place. This work was tracked under T202154 and proceeded along three lines:

  1. Identify modules that are unused in practice. This included picking up unfinished or forgotten software deprecations, and removing compatibility code for obsolete browsers.
  2. Consolidate modules that did not represent an application entrypoint or logical bundle. Extensions are encouraged to use directories and file splitting for internal organization. Some extensions were registering internal files and directories as public module bundles (like a linker or autoloader), thus growing the startup manifest for all page views.
  3. Shrink the registry holistically through clever math and improved compression.

We wrote new frontend development guides as reference material, enabling developers to understand how each stage of the page load process is impacted by different types of changes. We merged and redirected various older guides in favor of this new material.

Read about it at Wikipedia’s JavaScript initialisation on a budget

Autonomous Systems performance report

We published our first AS report, which explores the experience of Wikimedia visitors by their IP network (such as mobile carriers and Internet service providers, also known as Autonomous Systems).

This new monthly report is notable for how it accounts for differences in device type and device performance, because device ownership and content choice are not equally distributed among people and regions. We believe our method creates a fair assessment that focuses specifically on the connectivity between mobile carriers or internet service providers and Wikimedia datacenters.

The goal is to watch the evolution of these metrics over time, allowing us to identify improvements and potential pain points.

Read more at Introducing: Autonomous Systems performance report

Miscellaneous

Further reading

About this post

Featured image by Peng LIU, licensed under Creative Commons CC0 1.0.
