Document Picture-in-Picture Specification

Status of this document

This section describes the status of this document at the time of its publication. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

Feedback and comments on this specification are welcome. GitHub Issues are preferred for discussion on this specification. Alternatively, you can send comments to the Media Working Group’s mailing-list, public-media-wg@w3.org (archives). This draft highlights some of the pending issues that are still to be discussed in the working group. No decision has been taken on the outcome of these issues including whether they are valid.

This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 2 November 2021 W3C Process Document.

1. Introduction

This section is non-normative.

There currently exists a Web API for putting an HTMLVideoElement into a Picture-in-Picture window (requestPictureInPicture()). Because that API accepts only a video element, it limits a website’s ability to provide a custom picture-in-picture (PiP) experience. We want to expand upon that functionality by providing the website with a full Document in an always-on-top window.

This new window will be much like a blank same-origin window opened via the existing open() method on Window, with some minor differences:

The PiP window will float on top of other windows.
The PiP window will never outlive the opening window.
The website cannot set the position of the PiP window.
The PiP window cannot be navigated (any window.history or window.location call that would navigate to a new document closes the PiP window).
The PiP window cannot open more windows.

2. Dependencies

The IDL fragments in this specification must be interpreted as required for conforming IDL fragments, as described in the Web IDL specification. [WEBIDL]

3. Security and Privacy Considerations

3.1. Secure Context

The API is limited to [SECURE-CONTEXTS].
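
As a minimal sketch of what this restriction means for a page, the check below tests for a secure context and for the presence of the API before using it. The helper name isDocumentPiPAvailable is hypothetical and not part of this specification; it takes the global object as a parameter so that it can also be exercised outside a browser.

```javascript
// Hypothetical helper (not defined by this specification).
// Returns true only when the given global object is a secure context
// and exposes the documentPictureInPicture attribute.
function isDocumentPiPAvailable(globalObject) {
  return globalObject.isSecureContext === true &&
      'documentPictureInPicture' in globalObject;
}

// In a page, pass the real global object, for example:
//   if (isDocumentPiPAvailable(window)) { /* show the "Enter PiP" button */ }
```

A page would typically run a check like this once and hide its PiP controls when it returns false.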

3.2. Spoofing

It is recommended that the user agent provides enough UI on the DocumentPictureInPicture window to prevent malicious websites from abusing the ability to float on top of other windows to spoof other websites or system UI. The website’s inability to directly set the location or size of the PiP window helps alleviate some avenues of abuse.

It is recommended that the user agent makes it clear to the user which origin is controlling the DocumentPictureInPicture window at all times.

3.3. Fingerprinting

When a PiP window is closed and then later re-opened, it can be useful for the user agent to re-use size and location of the previous PiP window (modulo aspect-ratio constraints) to provide a smoother user experience. However, it is recommended that the user agent does not re-use size/location across different origins as this may provide malicious websites an avenue for fingerprinting a user.

4. API

[Exposed=Window]
partial interface Window {
  [SameObject, SecureContext] readonly attribute DocumentPictureInPicture
    documentPictureInPicture;
};

[Exposed=Window, SecureContext]
interface DocumentPictureInPicture : EventTarget {
  [NewObject] Promise<Window> requestWindow(
    optional DocumentPictureInPictureOptions options = {});
  readonly attribute Window? window;
  attribute EventHandler onenter;
};

dictionary DocumentPictureInPictureOptions {
  long width = 0;
  long height = 0;
  float initialAspectRatio = 0.0;
  boolean lockAspectRatio = false;
  boolean copyStyleSheets = false;
};

[Exposed=Window]
interface DocumentPictureInPictureEvent : Event {
  constructor(DOMString type, DocumentPictureInPictureEventInit eventInitDict);
  [SameObject] readonly attribute Window window;
};

dictionary DocumentPictureInPictureEventInit : EventInit {
  required Window window;
};

A DocumentPictureInPicture object allows websites to create and open a new always-on-top Window as well as listen for events related to opening and closing that Window.

Each Window object has an associated documentPictureInPicture API, which is a new DocumentPictureInPicture instance created alongside the Window.

The documentPictureInPicture getter steps are:

Return this’s documentPictureInPicture API.

The window getter steps are:

Return the last Window opened by this if it exists and is still open. Otherwise, return null.

The requestWindow(options) method steps are:

If Document Picture-in-Picture support is false, throw a "NotSupportedError" DOMException and abort these steps.
If the relevant global object of this does not have transient activation, throw a "NotAllowedError" DOMException and abort these steps.
The user agent may choose to close any existing DocumentPictureInPicture Windows or PictureInPictureWindows.
Let target browsing context be a new browsing context navigated to the about:blank URL.
If options["width"] exists and is greater than zero:
1. Optionally, clamp or ignore options["width"] if it is too large or too small in order to fit a user-friendly window size.
2. Set the window width for the target browsing context to options["width"].
If options["height"] exists and is greater than zero:
1. Optionally, clamp or ignore options["height"] if it is too large or too small in order to fit a user-friendly window size.
2. Set the window height for the target browsing context to options["height"].
If options["initialAspectRatio"] exists and is greater than zero:
1. If options["width"] and options["height"] have been specified and don’t match options["initialAspectRatio"], the user agent may ignore options["initialAspectRatio"].
2. Optionally, clamp or ignore options["initialAspectRatio"] if it is too large or too small in order to fit a user-friendly window size.
3. Set the window size for the target browsing context to a width and height such that width divided by height is approximately options["initialAspectRatio"].
If options["lockAspectRatio"] exists and is true, then the window should be configured such that its aspect ratio remains constant when a user resizes it.
Configure the window containing target browsing context to float on top of other windows.
If options["copyStyleSheets"] exists and is true, then the CSS style sheets applied to the current associated Document should be copied and applied to the target browsing context’s associated Document. This is a one-time copy, and any further changes to the current associated Document’s CSS style sheets will not be copied.
Queue a global task on the DOM manipulation task source given this’s relevant global object to fire an event named enter using DocumentPictureInPictureEvent on this with its bubbles attribute initialized to true and its window attribute initialized to target browsing context.
Return target browsing context.
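
The sizing steps above leave clamping to the user agent. The following is a non-normative sketch of one way they could be applied. The limits (MIN_SIZE, MAX_SIZE), the default size, and the function names are all assumptions made for illustration. This sketch ignores initialAspectRatio whenever both width and height are given, which the steps permit when they do not match.

```javascript
// Hypothetical user-agent limits and defaults; the specification leaves
// the actual policy to the user agent.
const MIN_SIZE = 200;
const MAX_SIZE = 1000;
const DEFAULT_WIDTH = 400;
const DEFAULT_HEIGHT = 300;

function clamp(value, min, max) {
  return Math.min(Math.max(value, min), max);
}

// Sketch of turning DocumentPictureInPictureOptions into an initial size.
function resolveInitialSize({ width = 0, height = 0, initialAspectRatio = 0 } = {}) {
  const hasWidth = width > 0;
  const hasHeight = height > 0;
  const hasRatio = initialAspectRatio > 0;

  // Explicit dimensions are clamped; missing ones fall back to defaults.
  let w = hasWidth ? clamp(width, MIN_SIZE, MAX_SIZE) : DEFAULT_WIDTH;
  let h = hasHeight ? clamp(height, MIN_SIZE, MAX_SIZE) : DEFAULT_HEIGHT;

  // The aspect ratio only fills in a dimension that was not given.
  // Clamping afterwards means the ratio is honoured approximately.
  if (hasRatio && !(hasWidth && hasHeight)) {
    if (hasWidth) {
      h = clamp(Math.round(w / initialAspectRatio), MIN_SIZE, MAX_SIZE);
    } else {
      w = clamp(Math.round(h * initialAspectRatio), MIN_SIZE, MAX_SIZE);
    }
  }
  return { width: w, height: h };
}
```

With these assumed limits, a request for 5000 by 50 is clamped to 1000 by 200, and a request that gives only an aspect ratio of 2 yields 600 by 300.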

While the aspect ratio or size of the window can be configured by the website, the initial position is left to the discretion of the user agent.

enter: Fired on DocumentPictureInPicture when a PiP window is opened.
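
As an illustration of consuming this event, the sketch below registers a listener and passes the event's window attribute to a callback. The helper name onPiPEnter and its unsubscribe return value are assumptions for this example, not part of the API.

```javascript
// Hypothetical helper: listens for the enter event on a
// DocumentPictureInPicture object (or any EventTarget standing in for it)
// and hands the newly opened window to the callback.
// Returns a function that removes the listener again.
function onPiPEnter(pipApi, callback) {
  const listener = (event) => callback(event.window);
  pipApi.addEventListener('enter', listener);
  return () => pipApi.removeEventListener('enter', listener);
}

// In a page this might be used as:
//   const stop = onPiPEnter(documentPictureInPicture, (pipWin) => {
//     pipWin.document.title = 'Player';
//   });
```

Listening on the DocumentPictureInPicture object lets code that did not call requestWindow() itself react when a PiP window is opened.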

5. Concepts

5.1. Document Picture-in-Picture Support

Document Picture-in-Picture Support is false if a user preference disables it or a platform limitation prevents it. It is true otherwise.
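
When support is false, the requestWindow() steps produce a "NotSupportedError", and a missing user gesture produces a "NotAllowedError". The sketch below shows one way a page could tell these apart. The helper name and the returned labels are assumptions chosen for illustration.

```javascript
// Hypothetical helper: maps a failure from requestWindow() to a short
// label that the page can act on. Only the two exception names used in
// the requestWindow() steps are recognised.
function classifyRequestWindowFailure(error) {
  switch (error && error.name) {
    case 'NotSupportedError':
      return 'unsupported';        // Document Picture-in-Picture support is false.
    case 'NotAllowedError':
      return 'needs-user-gesture'; // No transient activation.
    default:
      return 'unknown';
  }
}

// In a page this might be used as:
//   documentPictureInPicture.requestWindow().catch((error) => {
//     if (classifyRequestWindowFailure(error) === 'unsupported') {
//       /* hide the "Enter PiP" button */
//     }
//   });
```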

5.2. One PiP Window

Whether only one window is allowed in Picture-in-Picture mode is left to the implementation and the platform. As such, what happens when there is a Picture-in-Picture request while a DocumentPictureInPicture Window or PictureInPictureWindow is already open will be left as an implementation detail: the current window could be closed, the Picture-in-Picture request could be rejected, or multiple Picture-in-Picture windows could be created. Regardless, the user agent must fire the appropriate events in order to notify the websites of the Picture-in-Picture status changes.
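
Because the outcome of a second request is implementation-defined, a page may prefer to reuse a window that is already open. The following sketch does that using the window attribute. The helper name getOrRequestPiPWindow is hypothetical, and this is only one possible policy.

```javascript
// Hypothetical helper: returns a promise for the PiP window that is
// already open, or requests a new one if there is none.
// pipApi is a DocumentPictureInPicture object (or a stand-in with the
// same window attribute and requestWindow() method).
function getOrRequestPiPWindow(pipApi, options = {}) {
  const existing = pipApi.window;
  if (existing) {
    return Promise.resolve(existing);
  }
  // This call still requires transient activation in a real page.
  return pipApi.requestWindow(options);
}
```

Reusing the open window avoids depending on whether the user agent closes it, rejects the request, or opens a second window.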

5.3. Relative URLs

A primary use case of DocumentPictureInPicture is to put existing elements (e.g. an HTMLVideoElement) into an always-on-top window so the user can continue to see them while multitasking. However, sometimes these elements have attributes that use a relative-URL string (e.g. src). Since the Document in a DocumentPictureInPicture Window is always navigated to the about:blank URL, these relative-URL strings would break. To prevent this, the user agent must parse relative-URL strings as if they were being parsed on the Document that opened the DocumentPictureInPicture Window.
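
The effect of this requirement can be illustrated with the URL constructor. The helper name resolveAgainstOpener and the example.com addresses are assumptions for this sketch; the user agent performs the equivalent resolution itself.

```javascript
// Illustrative helper: resolves a relative-URL string against the base URL
// of the Document that opened the PiP window, instead of about:blank.
function resolveAgainstOpener(relativeUrl, openerBaseUrl) {
  return new URL(relativeUrl, openerBaseUrl).href;
}

// A src of "foo.webm" on a page at https://example.com/videos/index.html
// keeps pointing at the same resource after the element is moved:
const resolvedSrc = resolveAgainstOpener(
    'foo.webm', 'https://example.com/videos/index.html');
// resolvedSrc is "https://example.com/videos/foo.webm"

// By contrast, new URL('foo.webm', 'about:blank') throws a TypeError,
// because about:blank cannot act as a base for path-relative URLs.
```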

6. Examples

This section is non-normative.

6.1. Extracting a video player into PiP

6.1.1. HTML

<body>
  <div id="player-container">
    <div id="player">
      <video id="video" src="foo.webm"></video>
      <!-- More player elements here. -->
    </div>
  </div>
  <input type="button" onclick="enterPiP();" value="Enter PiP" />
</body>

6.1.2. JavaScript

// Handle to the picture-in-picture window.
let pipWindow = null;

function enterPiP() {
  const player = document.querySelector('#player');

  // Lock the aspect ratio so the window is always properly sized to the video.
  const pipOptions = {
    initialAspectRatio: player.clientWidth / player.clientHeight,
    lockAspectRatio: true,
    copyStyleSheets: true
  };

  documentPictureInPicture.requestWindow(pipOptions).then((pipWin) => {
    pipWindow = pipWin;

    // Style remaining container to imply the player is in PiP.
    const playerContainer = document.querySelector('#player-container');
    playerContainer.classList.add('pip-mode');

    // Add player to the PiP window.
    pipWindow.document.body.append(player);

    // Listen for the PiP closing event to put the video back.
    pipWindow.addEventListener('unload', onLeavePiP.bind(pipWindow), { once: true });
  });
}

// Called when the PiP window has closed.
function onLeavePiP() {
  if (this !== pipWindow) {
    return;
  }

  // Remove PiP styling from the container.
  const playerContainer = document.querySelector('#player-container');
  playerContainer.classList.remove('pip-mode');

  // Add the player back to the main window.
  const player = pipWindow.document.querySelector('#player');
  playerContainer.append(player);

  pipWindow = null;
}

6.2. Accessing elements on the PiP Window

const video = pipWindow.document.querySelector('#video');
video.loop = true;

6.3. Listening to events on the PiP Window

As part of creating an improved picture-in-picture experience, websites will often want to customize buttons and controls that need to respond to user input events such as clicks.

const pipDocument = pipWindow.document;
const video = pipDocument.querySelector('#video');
const muteButton = pipDocument.createElement('button');
muteButton.textContent = 'Toggle mute';
muteButton.addEventListener('click', () => {
  video.muted = !video.muted;
});
pipDocument.body.append(muteButton);

6.4. Exiting PiP

The website may want to close the DocumentPictureInPicture Window without the user explicitly clicking on the window’s close button. They can do this by using the close() method on the Window object:

// This will close the PiP window and trigger our existing onLeavePiP()
// listener.
pipWindow.close();

6.5. Getting elements out of the PiP window when it closes

When the PiP window is closed for any reason (either because the website initiated it or the user closed it), the website will often want to get the elements back out of the PiP window. The website can perform this in an event handler for the unload event on the Window object. This is shown in the onLeavePiP() handler in the video player example above and is copied below:

// Called when the PiP window has closed.
function onLeavePiP() {
  if (this !== pipWindow) {
    return;
  }

  // Remove PiP styling from the container.
  const playerContainer = document.querySelector('#player-container');
  playerContainer.classList.remove('pip-mode');

  // Add the player back to the main window.
  const player = pipWindow.document.querySelector('#player');
  playerContainer.append(player);

  pipWindow = null;
}

7. Acknowledgments

Many thanks to Frank Liberato, Mark Foltz, Klaus Weidner, François Beaufort, Charlie Reis, Joe DeBlasio, Domenic Denicola, and Yiren Wang for their comments and contributions to this document and to the discussions that have informed it.

Conformance

Document conventions

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

Conformant Algorithms

Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and abort these steps") are to be interpreted with the meaning of the key word ("must", "should", "may", etc) used in introducing the algorithm.

Conformance requirements phrased as algorithms or specific steps can be implemented in any manner, so long as the end result is equivalent. In particular, the algorithms defined in this specification are intended to be easy to understand and are not intended to be performant. Implementers are encouraged to optimize.