Having achieved much of what I hoped when running for the TAG six years ago, it's time for fresh perspectives.
There was no better replacement I could hope for than Alice Boxhall. The W3C membership agreed and elected her to a two year term. Her independence, technical rigour, and dedication to putting users and their needs first have served the TAG and the broader web community exceedingly well since. Sadly, she is not seeking re-election. The TAG will be poorer without her dedication, no-nonsense approach, and depth of knowledge.
TAG service is difficult, time consuming, and requires incredible focus to engage constructively in the controversies the TAG is requested to provide guidance about. The technical breadth required creates a sort of whiplash effect as TAG members consider low-level semantics and high-level design issues across nearly the entire web platform. That anyone serves more than one term is a minor miracle!
While I'm sad Alice isn't running again, my colleague Jeffrey Yasskin has thrown his hat into the ring in one of the most contested TAG elections of recent memory. There aren't many folks who have the temperament, experience, and track record that would make them natural TAG candidates, but Jeffrey is in that rare group and I hope that I can convince you he deserves you organisation's first-choice vote.
Jeffrey and I have worked closely over the past 5 years on challenging projects that have pushed the boundaries of the web's potential forward, but which also posed large risks if done poorly. It was Jeff's involvement that lead to the "chooser model" for several device access APIs, starting with Web Bluetooth.
This model has since been pressed into service multiple times, notably for Chrome's mediation of the Web USB, Web HID, and Web Serial APIs. The introduction of a novel style of chooser solved thorny problems.
There's so much more to talk about in Jeffrey's background and approach to facilitating structural solutions to thorny problems — from his work to (finally) develop a grounded privacy threat model as the TAG repeatedly requested, to his stewardship of the Web Packaging work the TAG started in 2013 to his deep experience in standards at nearly every level of the web stack — but this post is already too long.
There are a lot of great candidates running for this year's four open seats on the TAG, and while I can imagine many of them serving admirably, not least of all Tess O'Connor and Sanghwan Moon who have been doing great work, there's nobody I'd rather see earn your first-choice vote than Jeffrey Yasskin. He's not one to seek the spotlight for himself, but having worked closely with him, I can assure you that, if elected, he'll earn your trust the way he has earned mine.
Update: After hitting a bug related to initial rendering on Android, I've updated the code here and in the snippet to be resilient to browsers deciding (wrongly) to skip rendering the first <article> in the document.
Update, The Second: After holiday explorations, it turns out that one can, indeed, use contain-intrinsic-size on elements that aren't laid out yet. Contra previous advice, you absolutely should do that, even if you don't know natural widths yet. Assuming the browser will calculate a width for an <article> once finally laid out, there's no harm in reserving a narrow but tall place-holder. Code below has been updated to reflect this.
Because content-visibility: visible was never removed, layouts inevitably grew slower as more and more content was unfurled by scrolling.
Layouts caused by resize can be particularly slow on large documents as they can cause all text flows to need recalculation. This would not be improved by the previous solution if many articles had been marked visible.
Jumping "far" into a document via link or quick scroll could still be jittery.
Much of this was pointed out (if obliquely) in a PR to the ECMAScript spec. One alternative is a "place-keeper" element that grows with rendered content to keep elements that eventually disappear from perturbing scroll position. This was conceptually hacky and I couldn't get it working well.
What should happen if an element in the flow changes its width or height in reaction to the browser lazy-loading content? And what if the scroll direction is up, rather than down?
For a mercifully brief while I also played with absolutely positioning elements on reveal and moving them later. This also smelled, but it got me playing with ResizeObservers. After sleeping on the problem, a better answer presented itself: leave the flow alone and use IntersectionObservers and ResizeObservers to reserve vertical space using CSS's new contain-intrinsic-size property once elements have been laid out.
I'd dismissed contain-intrinsic-size for use in the base stylesheet because it's impossible to know widths and heights to reserve. IntersectionObservers and ResizeObservers, however, guarantee that we can know the bounding boxes of the elements cheaply (post layout) and use the sizing info they provide to reserve space should the system decide to stop laying out elements (setting their height to 0) when they leave the viewport. We can still use contain-intrinsic-size, however, to reserve a conservative "place-holder" for a few purposes:
If our element would be content sized, we can give it a narrow intrinsic-size width; the browser will still give us the natural width upon first layout.
Reserving some height for each element creates space for content-visibilitys internal Intersection Observer to interact with. If we don't provide a height (in this case, 500px as a guess), fast scrolls will encounter "bunched" elements upon hitting the bottom, and content-visiblity: auto will respond by laying out all of the elements at once, which isn't what we want. Creating space means we are more likely to lay them out more granularly.
The snippet below works around a Chrome bug with applying content-visibility: auto; to all <articles>, forcing initial paint of the first, then allowing it to later be elided. Observers update sizes in reaction to layouts or browser resizing, updating place-holder sizes while allowing the natural flow to be used. Best of all, it acts a progressive enhancement:
<!-- Basic document structure --> <html> <head> <style> /* Workaround for Chrome bug, part 1 * * Chunk rendering for all but the first article. * * Eventually, this selector will just be: * * body > main > article * */ body > main > article + article{ content-visibility: auto; /* Add vertical space for each "phantom" */ contain-intrinsic-size: 10px 500px; } </style> <head> <body> <!-- header elements --> <main> <article><!-- ... --></article> <article><!-- ... --></article> <article><!-- ... --></article> <!-- ... -->
<!-- Inline, at the bottom of the document --> <scripttype="module"> leteqIsh=(a, b, fuzz=2)=>{ return(Math.abs(a - b)<= fuzz); };
// Keep a map of elements and the dimensions of // their place-holders, re-setting the element's // intrinsic size when we get updated measurements // from observers. let spaced =newWeakMap();
// Only call this when known cheap, post layout letreserveSpace=(el, rect=el.getClientBoundingRect())=>{ let old = spaced.get(el); // Set intrinsic size to prevent jumping on un-painting: // https://drafts.csswg.org/css-sizing-4/#intrinsic-size-override if(!old ||rectNotEQ(old, rect)){ spaced.set(el, rect); el.attributeStyleMap.set( "contain-intrinsic-size", `${rect.width}px ${rect.height}px` ); } };
let iObs =newIntersectionObserver( (entries, o)=>{ entries.forEach((entry)=>{ // We don't care if the element is intersecting or // has been laid out as our page structure ensures // they'll get the right width. reserveSpace(entry.target, entry.boundingClientRect); }); }, {rootMargin:"500px 0px 500px 0px"} );
// Workaround for Chrome bug, part 2. // // Re-enable browser management of rendering for the // first article after the first paint. Double-rAF // to ensure we get called after a layout. requestAnimationFrame(()=>{ requestAnimationFrame(()=>{ articles[0].attributeStyleMap.set( "content-visibility", "auto" ); }); }); </script>
This solves most of our previous problems:
All articles can now take content-visibility: auto, removing the need to continually lay out and render the first article when off-screen.
content-visibility functions as a hint, enhancing the page's performance (where supported) without taking responsibility for layout or positioning.
Elements grow and shrink in response to content changes
Presuming browsers do a good job of scroll anchoring, this solution is robust to both upward and downward scrolls, plus resizing.
Update: After further investigation, an even better solution has presented itself, which is documented in the next post.
The new content-visibility CSS property finally allows browsers to intelligently decide to defer layout and rendering work for content that isn't on-screen. For pages with large DOMs, this can be transformative.
In applications that might otherwise be tempted to adopt large piles of JS to manage viewports, it can be the difference between SPA monoliths and small progressive enhancements to HTML.
One challenge with naive application of content-visibility, though, is the way that it removes elements from the rendered tree once they leave the viewport -- particularly as you scroll downward. If the scroll position depends on elements above the currently viewable content "accordion scrollbars" can dance gleefully as content-visibility: auto does its thing.
To combat this effect, I've added some CSS to this site that optimistically renders the first article on a page but defers others. Given how long my articles are, it's a safe bet this won't usually trigger multiple renders for "above the fold" content on initial pageload (sorry not sorry).
This is coupled with an IntersectionObserver that to unfurls subsequesnt articles as you scroll down.
The snippets:
<style> /* Defer rendering for the 2nd+ article */ body > main > *+*{ content-visibility: auto; } </style>
<scripttype="module"> let observer =newIntersectionObserver( (entries, o)=>{ entries.forEach((entry)=>{ let el = entry.target; // Not currently in intersection area. if(entry.intersectionRatio ==0){ return; } // Trigger rendering for elements within // scroll area that haven't already been // marked. if(!el.markedVisible){ el.attributeStyleMap.set( "content-visibility", "visible" ); el.markedVisible =true; } }); }, // Set a rendering "skirt" 50px above // and 100px below the main scroll area. {rootMargin:"50px 0px 100px 0px"} );
let els = document.querySelectorAll("body > main > *+*"); els.forEach((el)=>{ observer.observe(el);}); </script>
With this fix in place content will continue to appear at the bottom (ala "infinite scrolling") as you scroll down, but never vanish from the top or preturb your scroll position in unnatural ways.
TL;DR: App stores exist to provide overpowered and easily abused native app platforms a halo of safety. Pre-publication gates were valuable when better answers weren't available, but commentators should update their priors to account for hardware and software progress of the past 13 years. Policies built for a different era don't make sense today, and we no longer need to accept sweeping restrictions in the name of safety. The idea that preventing browser innovation is pro-user is particularly risible, leading to entirely avoidable catch-22 scenarios for developers and users. If we're going to get to better outcomes from stores on OSes without store diversity or side-loading, it's worth re-grounding our understanding of stores.
...Only Try To Realize The Truth: There Is No "App"#
OSes differ widely in the details of how "apps" are built and delivered. The differences point to a core truth: technically speaking, being "an app" is merely meeting a set of arbitrary and changeable OS conventions.
The closer one looks, the less a definition of "appiness" can be pinned down to specific technologies. Even "toy" apps with identical functionality are very different under the covers when built on each OSes preferred stack. Why are iOS apps "direct metal" binaries while Android apps use Java runtimes? Is one of these more (or less) "an app"? The lack of portability highlights the absence of clear technical underpinnings for what it means to "be an app".
To be sure, there are platonic technical ideals of applications as judged by any OS's framework team[1], but real-world needs and behaviour universally betray these ideals.[2] Native platforms always provide ways to dynamically load code and content from the network, both natively or via webview and choice of code platform is not dispositive in identifying if an experience is "an app".
The solid-land definition "an app" is best understood through UI requirements to communicate "appiness" to end-users: metadata to identify the publisher and provide app branding, plus bootstrap code to render initial screens. Store vendors might want to deliver the full user experience through the binary they vend from their stores, but that's not how things work in practice.
Industry commentators often write as though the transition to mobile rigidly aligned OSes with specific technology platforms. At a technical level, this is not correct.
Apple could unveil iOS with support for web apps without the need for an app store because web applications are safe by default. Circa '07, those web apps were notably slower than native code could be. The world didn't yet have pervasive multi-process browser sandboxing, JS JITs, or metal-speed binary isolation. Their arrival, e.g. via WASM, has opened up entirely new classes of applications on a platform (the web) that demands memory safety and secure-by-default behavior as a legacy of 90s browser battles.
In '07, safety implied an unacceptable performance hit on slow single-core devices with 128MiB of RAM. Many applications couldn't be delivered if strict protection was required using the tools of the day. This challenging situation gave rise to app stores: OS vendors understood that fast-enough-for-games needed dangerous trade-offs. How could the bad Windows security experience be prevented? A priori restraint on publishing allowed vendors to set and enforce policy quickly.
Fast forward a decade, and both the software and hardware situations have changed dramatically. New low-end devices are 4-to-8 core, 2GHz systems with 2GiB of RAM. Sandboxing is now universal (tho quality may differ) and we collectively got savvy about running code quickly while retaining memory safety. The wisdom of runtime permission grants (the web's model) has come to be widely accepted.
For applications that need peak compute performance, the effective tax rate of excellent runtime security is now in the 5-10% range, rather than the order of magnitude cost a decade ago. Safety is within our budget, assuming platforms don't make exotic and dangerous APIs available to all programs — more on that in a second.
So that's apps: they're whatever the OS says they are, that definition can change at any moment, and both reasonably constrained and superpowered programs can "be apps." Portability is very much a possibility if only OSes deign to allow it[3].
Because app stores rose to prominence on platforms that could not afford good safety using runtime guards, security is the core value proposition of a store. When working well — efficiently applying unobjectionable policy — store screening is invisible to everyone but developers. Other roles played by stores may have value to users but primarily serve to build a proprietary moat around commodity (or lightly differentiable) proprietary, vertically integrated platforms.
Amazon and search engines demonstrate that neither payment nor discovery requires stores. If users are aware of security in app stores, it's in the breach, e.g., the semi-occasional article noting how a lousy app got through. The desired effect of pre-screening is to ensure that everything discovered in the store is safe to try, despite being built on native platforms that dole out over-broad permissions. It's exceedingly difficult to convince folks to use devices heavily if apps can brick the device or drain ones bank account.[4]
It is essential to note that this forward integration has had huge benefits for everyone involved. While Apple pretends like the Internet never existed as a distribution channel, the truth is it was a channel that wasn’t great for a lot of users: people were scared to install apps, convinced they would mess up their computers, get ripped off, or accidentally install a virus.
Specifically, Apple tried to solve three kinds of problem.
Putting apps in a sandbox, where they can only do things that Apple allows and cannot ask (or persuade, or trick) the user for permission to do ‘dangerous’ things, means that apps become completely safe. A horoscope app can’t break your computer, or silt it up, or run your battery down, or watch your web browser and steal your bank details.
An app store is a much better way to distribute software. Users don’t have to mess around with installers and file management to put a program onto their computer - they just press ‘Get’. If you (or your customers) were technical this didn’t seem like a problem, but for everyone else with 15 copies of the installer in their download folder, baffled at what to do next, this was a huge step forward.
Asking for a credit card to buy an app online created both a friction barrier and a safety barrier - ‘can I trust this company with my card?’ Apple added frictionless, safe payment.
These takes highlight the value of safety but skip right past the late desktop experience: we didn't primarily achieve security on Windows by erecting software distribution choke points. Instead, we got protection by deciding not to download (unsafe) apps, moving computing to a safe-by-default platform: the web.[5]
Native platform generosity with capabilities ensures that every native app is a vast ocean of abuse potential. Pervasive"SDK"misbehavior converts this potential to dollars. Data available to SDK developers make fears about web privacy look quaint. Because native apps are not portable or standards-based, the usual web-era solutions of switching browsers or installing an extension are unavailable. To accept any part of the native app ecosystem proposition is to buy the whole thing.
Don't like the consequences? The cost of changing one's mind is now starts at several hundred dollars. The moat of proprietary APIs and distribution is sold on utility and protection, but it's primary roles is protect margins. A lack of app and data interoperability prevents users from leaving. The web, meanwhile, extends security as far as the link can take you. Moats aren't necessities when security is assured.
This situation are improving glacially, often at the pace of hardware replacement.[6] The slog towards mobile OSes adopting a more web-ish contract with developers hints at the interests of OS vendors. Native platforms haven't reset the developer contract to require safety because they recall what happened to Windows. When the web got to parity in enough areas, interest in the proprietary platform receded to specialist niches (e.g., AAA gaming, CAD). Portable, safe apps returned massive benefits to users who no longer needed to LARP as systems administrators.
The web's safety transformed software distribution so thoroughly that many stopped thinking of applications as such. Nothing is as terrifying for a vertically integrated hardware vendor as the thought that developers might leave their walled garden, leading users to a lower-tax-rate jurisdiction.
Safety enables low friction, and the residual costs in data and acquisition of apps through stores indicate not security, but the inherent risk of the proprietary platforms they guard. Windows, Android, and ChromeOS allow web apps in their stores, so long as the apps conform to content policies. Today, developers can reach users of all these devices on their native app stores from a single codebase. Only one vendor prevents developers from meeting user needs with open technology.
This prejudice against open, unowned, safe alternatives exists to enable differentiated integrations. Why? To preserve the ability to charge much more over the device's life than differences in hardware quality might justify. The downsides now receiving scrutiny are a direct consequence of this strategic tax on users and developers. Restrictions, once justifiably required to maintain safety, are now antique. What developer would knowingly accept such a one-sided offer if made today?
It is uniquely perverse that these policies ensure the web on iOS can only ever be as competent, safe, and attractive to developers as Apple allows.
Were Apple producing a demonstrably modern and capable browser, it might not be a crisis, but Apple's browser and the engine they force others to use is years behind.
The continuing choice to under-invest in Safari and WebKit has an ecosystem-wide impact, despite iOS's modest reach. iOS developers who attempt to take Apple up on their generous offer to reach users through the web are directly harmed. The low rate of progress in Apple's browsers all but guarantees that any developer who tries will fail.
But that's not the biggest impact. Because the web's advantage is portability, web developers view features not available "everywhere" as entirely unavailable. The lowest common denominator sets a cap on developer ambitions, and Apple knows it. App Store policies taht restrict meaningful browser competition ensure the cap is set low so as to prevent real competition with Apple's proprietary tools and distribution channel.
Apple doesn't need a majority of web usage to come from browsers without critical features to keep capabilities from being perceived skeptically; they only need to keep them out of the hands of, say, 10% of users. So much the better if users who cannot perceive or experience the web delivering great experiences are wealthy developers and users. Money talks.
Thirteen years of disinterest in a competitive mobile web by Apple has produced a crisis for the web just as we are free of hardware shackles. The limits legitimated the architecture of app store control are gone, but the rules have not changed. The web was a lifeboat away from native apps for Windows XP users. That lifeboat won't come for iPhone owners because Apple won't allow it. Legitimately putting user's interests first is good marketing, but terrible for rent extraction.
Arguments against expanding web capabilities to reach minimum-mobile-viability sometimes come wrapped in pro-privacy language from the very parties that have flogged anti-privacy native app platforms[8]. These claims are suspect. Detractors (perhaps unintentionally) conflate wide and narrow grants of permission. The idea that a browser may grant some sites a capability is, of course, not equivalent to giving superpowers to all sites with low friction. Nobody's proposing to make native's mistakes.
Regardless, consistent misrepresentations are effective in stoking fear. Native apps are so dangerous they require app store gatekeepers, after all. Without looking closely at the details, it's easy to believe that expansions of web capability to be similarly broad. Thoughtful, limited expansions of heavily cabined capabilities take time and effort to explain. The nuances of careful security design are common casualties in FUD-slinging fights.
A cynic might suggest that these misrepresentations deflect questions about the shocking foreclosure of opportunities for the web that Apple has engineered — and benefits from directly.
Apple's defenders offer contradictory arguments:
Browsers are essential to modern operating systems, and so iOS includes a good browser. To remain "a good browser," it continually adds and markets new features.
Browsers are wildly unsafe because they load untrusted code and, therefore, they must use only Apple's (safe?) engine.[9]
App stores ensure superpowered apps are trustworthy because Apple approves all the code.
Apple doesn't allow web apps into the store as they might change at runtime to subvert policy.
Taken in isolation, the last two present a simple, coherent solution: ban browsers, including Safari. Also, webviews and other systems for dynamically pushing code to the client (e.g., React Native). No ads that aren't pre-packaged with the binary, thanks. No web content — you never can tell where bits if of a web page might be coming from, after all! This solution also aligns with current anti-web store policies.
The first two points are coherent if one ignores the last decade of software and hardware progress. They also tell the story of iOS's history: 13 years ago, smartphones were niche, and access to the web corpus was important. The web was a critical bridge in the era before native mobile-first development. Strong browser sandboxing was only then becoming A Thing (TM), and resource limits delayed mobile browsers from providing multi-process protection. OS rootings in the low-memory, 1-2 core-count era perhaps confirmed Cupertino's suspicions.[10]
Time, software, and hardware reality have all moved forward. The most recent weaksauce served to justify this aging policy has been the need for JIT in modern browsers. Browser vendors might be happy to live under JIT-less rules should it come to that — the tech is widely available — and yet alternative engines remain banned. Why?
Undoubtedly, web engines face elevated risks from untrusted, third-party content. Mitigating these risks has played a defining role in evolving browsers for more than twenty years.
Browsers compete on security, privacy, and other sensitive aspects of the user's experience. That competition has driven two decades of incredible security innovation without resorting to prior-restraint gatekeeping. The webkit-wrapper browsers on iOS are just as superpowered as other apps. They can easily subvert user intent and leak sensitive information (e.g., bank passwords). Trustworthiness is not down to Apple's engine restrictions; the success of secure browsers on every OS shows this to be the case. These browsers provide more power to users and developers while maintaining an even more aggressive security posture than any browser on iOS is allowed to implement.
But let's say the idea of "drive-by" web content accessing expanded capabilities unsettled Apple, so much so that they wanted to set a policy to prevent powerful features from being available to websites in mere tabs. Could they? Certainly. Apple's Add to Home Screen feature puts installed sites into little jails distinct from a user's primary browsing context. Apple could trivially enforce a policy that unlocked powerful features only upon completion of an installation ceremony, safe in the knowledge that they will not bleed out into "regular" web use.[11] We designed web APIs to accommodate a diversity of views about exposing features, and developers guard against variable support already.
The preclusion of better iOS browsers hinges on opacity. Control drives from developer policy rather than restrictions visible to users, rendering browser switching moot. Browser makers can either lend their brands to products they do not think of as "real browsers" or be cut off from the world's wealthiest users. No other OS forces this choice. But the drama is tucked behind the curtains, as is Apple's preference.
Most folks aren't technical enough to understand they can never leave Hotel Cupertino, and none of the alternative browsers they download even tell them they're trapped. A more capable, more competitive web might allow users to move away from native apps, leaving Apple without the ability to hold developers over the barrel. Who precludes this possibility? Apple, naturally.
A lack of real browser competition coupled with a trailing-edge engine creates deadweight losses, and these losses aren't constrained to the web. Apple's creative reliance on outdated policy explicitly favors the proprietary over the open, harming consumers along the way.
The platonic ideal of "an app" is invariably built using the system's preferred UI toolkit and constructed to be self-contained. These apps tend to be bigger than toys, but smaller than what you might attempt as an end developer; think the system-provided calculator. ↩︎
Bits downloaded from the store seldom constrain app behavior, protests by the store's shopkeepers notwithstanding. Before the arrival of Android Dynamic Apps, many top apps used slightly-shady techniques to dynamically load JARs at runtime, bypassing DEX and signing restrictions. It's a testament to the prevalence of the need to compose experiences that this is now officially supported dynamically. ↩︎
Portability is a frequent trade-off in the capability/safety discussion. Java and Kotlin Android apps are safer and more portable by default than NDK binaries, which need to worry about endianness and other low-level system properties.
The price paid for portability is a hit to theoretical maximum performance. The same is true of the web vs. Java apps, tho in all cases, performance is contingent on the use-case and where platform engineers have poured engineer-years into optimisation. ↩︎
Skipping over the web and its success is a curious blind-spot for such thoughtful tech observers. Depressingly, it may signal that the web has been so thoroughly excluded from the land of the relevant by Apple that it isn't top, middle, or even bottom-of-mind. ↩︎
Worldwide sales of iOS and the high-end Android devices that get yearly OS updates are less than a third of total global handset volume. Reading this on an iPhone? Congrats! You're a global 15%-er. ↩︎
Section 2.5.6 is the "Hotel Cupertino" clause: you can pick any browser you choose, but you can never leave Apple's system-provided WebKit. ↩︎
Mozilla's current objections are, by contrast, confused to my ears but they are at least not transparently disingenuous. ↩︎
Following this argument, Apple is justified in (uniquely) restricting the ability of other vendors to deliver more features to the web (even though adding new features is part of what it means to be "a good browser"). Perversely, this is also meant to justify restrictions that improve the security of both new and existing features in ways that might make Safari look bad. I wonder what year it will be when Safari supports Cross-Origin Isolation or Site Isolation on any OS?
Why does it matter? Presumably, iOS will disallow WASM threading until that far-off day. ↩︎
For all of Apple's fears about web content and Safari's security track record, Apple's restrictions on competing web engines mean that nobody else has been allowed to bring stronger sandboxing tech to the iOS party.
Chrome, for one, would very much like to improve this situation the way it has on MacOS but is not allowed to. It's unclear how this helps Apple's users. ↩︎
webviews can plumb arbitrary native capabilities into web content they host, providing isolated-storage with expanded powers. The presence of webview-powered Cordova apps in Apple's App Store raises many questions in a world where Apple neither expands Safari's capabilities to meet web developer needs nor allows other browsers to do so in their stead. ↩︎
Update: my colleague Emily Stark has just written an essential-reading post on this topic. We probably need to write something about all of the non-prompt controls that Chrome has put in place behind the patterns discussed in this post, but Emily's post really drives home the case for flexibility with practical examples.
From time to time, working groups (particularly at the W3C) are asked to adopt requirements on browser-presented UI (pixels drawn outside or over-top the page content) in the normative text of a specification. For example, someone might advocate for text like along the lines of "before proceeding to Step 5, conforming User Agents MUST request user consent". Less frequently, advocates will push for the inclusion of specific text to be presented, or in the most extreme cases, specific UI layouts.
The most intense of these debates occur in the context of security-sensitive moments: warning users of potential risks, requesting permission to access sensitive features, etc. Thankfully, these efforts usually fail in the face of bristling browser feedback.
What's going on here? Don't browsers care about providing consistent UI? Don't they want the best possible security?
The answer to both questions is an emphatic yes. Counter-intuitively, it is because browsers care so much about security, privacy, and consistency (in that order) that we err on the side of rejecting constraints on security UI. In the modern era, that is paired with API design patterns that maximize flexibility about these moments.
Web Standards are voluntary. The force that most powerfully compels their adoption is competition, rather than regulation. In practice, this means that vendors of integrated browsers[1] have final say over what their browsers do. Those who ship the bits, rather than those who write the spec text, are ultimately responsible for the properties of the subsequent experience. This is the basis for functioning browser competition.[2]Those who shift the bits call the shots.
This is an inherent property of modern browsers. Vendors participate in standards processes not because they need anyone else to tell them what to do, and not because they are somehow subject to the dictates of standards bodies, but rather to learn from developers and find agreement with competitors in a problem space where compatibility returns outsized gains.
Web Standards can be understood as liberally licensed, meticulously ventilated documentation about the detailed workings of a system that implementers agree to potentially be wrong about together.
Normative restrictions in a well-built spec should only cover aspects of a design where benefits of consistency and compatibility reliably outweigh the costs to getting some detail wrong.[3] That is, if an API is found to be wanting in some way, the specified bits of the design are limited to aspects vendors can take time to address through an iterative, deliberative process.
Of course, there are classes of errors that vendors are not willing to be wrong about for the time it takes to forge new consensus. Chief among them are security vulnerabilities. Pervasive, abusive behaviour is not far behind. There are dozens of instances over the decades of browsers unilaterally breaking spec'd behaviours in the interest of guarding users. UI prescriptions in specifications that intersect with these concerns will be blithely ignored by vendors looking to protect users and their own reputations. This flows directly from the frequently-cited priority of constituencies.
Normative text about properties that are neither developer exposed nor are likely to be upheld in the breech create long-term liabilities. Over time, predictable conformance breakage — either from incident response or iterative exploration toward better solutions — has the effect of putting asterisks over otherwise-implementable specifications. This creates credibility problems for specs, their editors, working groups that back them, reviewers who do not flag unenforceable text, and SDOs that ratify toothless language. In a world of voluntary standards, reputation matters, and bad spec language can create a lasting negative impression.
One of the strongest reasons to avoid making MUST (or even SHOULD)-strength statements about what browsers present is the vast depth of our collective ignorance.
Security UX is a relatively young field, and the set of organisations funding research into it is relatively small. We know from practice that understandability of systems is a major challenge and that users are poor at judging risk. This often causes those concerned about specific risks to demand prompts or user confirmation, but evidence also teaches that we risk "dialog blindness" through repeated re-confirmations.
The most successful examples of guiding users to safety in recent years involve a great deal of trial-and-error and follow-up research. Our best tools, collectively, are studies of variations and watching behaviour at scale. There are no models that I'm aware of that consistently maximize user understanding of choice over time while reducing decision burdens.
Consequently, as API designers, we owe it to our UXR peers to maximize their ability to iterate on potential presentations in service of those dual goals. Failing to acknowledge our own ignorance is a surefire way to paint ourselves into a bad corner.
If the case for a specific UI treatment on the basis of reckons is suspect, the case for consistency between browsers is doubly so. There is some evidence that a not-inconsiderable subset of desktop users may encounter multiple browsers in a week, but it is not a majority. Consistency between them should not then be a more important goal than internal consistency within a single browser's UI surfaces and/or consistency with underlying OS affordances (which also differ). Unlike the APIs that might invoke them, these UIs are not developer defined or exposed, removing the usual argument for consistency in a web standard.
Frustratingly, intense demands are often paired with generalized statements of the form "I think..." and "users don't know X...", without associated research. These reckons can take up a great deal of time when they should fail-fast against Hitchen's Razor, particularly given the browser community's history of iteration on these surfaces. Should future proposals for normative browser security surfaces come with peer-reviewed, wide-scale UXR attached then, perhaps, it may be worth revisiting blanket push-back.
Until then, and given our vast lack of understanding, it seems prudent to design defensively while expanding each spec's Privacy and Security Considerations sections to exhaustively enumerate risks that UX researchers will be asked to navigate. Designing for maximum iteration potential in this area has been, thus far, the best way to be least-cumulatively-wrong.[4]
Integrated browsers are those that ship an engine implementation along with browser UI. "WebView browsers" are an example of non-integrated browsers; they are able to define browser UI, can't change the behaviour of the entire API surface. They are hostage in some measure to the system-provided WebView for consequential choices that affect both users and developers. Browser vendors are hopping mad at Apple about iOS because — unlike every other commodity OS, including ChromeOS — it restricts integrated browsers to a single (first) party, forcing all other vendors to implicitly market Apple's (sub-standard) idea of what a browser can be through WebView-browsers with their brands...or forfeit the right to reach iOS users entirely. This undermines the competition that is central to progress and should be viewed by all users (and particularly web developers) as an intolerable distortion. If there is anything good to be said for browser engine diversity, it is that it spurs competition, sparking improvement across the board. Undercutting competition is, therefore, striking directly at the heart of the reason to want multiple engines in the first place. ↩︎
End-to-end vendor responsibility in integrated browsers — and the inability to make better decisions about consequential choices within WebView-based browsers — also explains why the Hotel Cupertino clause is a slow-moving disaster for progress, depressing the rate of innovation on the web overall. ↩︎
Non-normative notes and external "explainer" documents can help supplement the core text and provide examples of ideas for implementers to keep in mind. If a spec must contain some text that tries to talk about browser UI, this is the way to do it. ↩︎