function FilterHtml::getHTMLRestrictions

Same name and namespace in other branches
  1. 9 core/modules/filter/src/Plugin/Filter/FilterHtml.php \Drupal\filter\Plugin\Filter\FilterHtml::getHTMLRestrictions()
  2. 8.9.x core/modules/filter/src/Plugin/Filter/FilterHtml.php \Drupal\filter\Plugin\Filter\FilterHtml::getHTMLRestrictions()
  3. 10 core/modules/filter/src/Plugin/Filter/FilterHtml.php \Drupal\filter\Plugin\Filter\FilterHtml::getHTMLRestrictions()

Overrides FilterBase::getHTMLRestrictions

2 calls to FilterHtml::getHTMLRestrictions()
FilterHtml::filterAttributes in core/modules/filter/src/Plugin/Filter/FilterHtml.php
Provides filtering of tag attributes into accepted HTML.
FilterHtml::process in core/modules/filter/src/Plugin/Filter/FilterHtml.php
Performs the filter processing.

File

core/modules/filter/src/Plugin/Filter/FilterHtml.php, line 248

Class

FilterHtml
Provides a filter to limit allowed HTML tags.

Namespace

Drupal\filter\Plugin\Filter

Code

public function getHTMLRestrictions() {
  if ($this->restrictions) {
    return $this->restrictions;
  }
  // Parse the allowed HTML setting, and gradually make the list of allowed
  // tags more specific.
  $restrictions = [
    'allowed' => [],
  ];
  $html = $this->settings['allowed_html'];
  // Protect any trailing * characters in attribute names, since DomDocument
  // strips them as invalid.
  // cSpell:disable-next-line
  $star_protector = '__zqh6vxfbk3cg__';
  $html = str_replace('*', $star_protector, $html);
  // Use HTML5 parser with a custom tokenizer to correctly parse tags that
  // normally use text mode, such as iframe.
  $events = new DOMTreeBuilder(FALSE, [
    'disable_html_ns' => TRUE,
  ]);
  $scanner = new Scanner('<body>' . $html);
  $parser = new class ($scanner, $events) extends Tokenizer {

    /**
     * phpcs:ignore Drupal.Commenting.FunctionComment.Missing
     * @phpstan-ignore-next-line
     */
    public function setTextMode($textMode, $untilTag = NULL) {
      // Do nothing, we never enter text mode.
    }

  };
  $parser->parse();
  $dom = $events->document();
  $xpath = new \DOMXPath($dom);
  foreach ($xpath->query('//body//*') as $node) {
    $tag = $node->tagName;
    // All attributes are already allowed on this tag. This is the most
    // permissive configuration, so no additional processing is required.
    if (isset($restrictions['allowed'][$tag]) && $restrictions['allowed'][$tag] === TRUE) {
      continue;
    }
    if ($node->hasAttributes()) {
      // If the tag is not yet present, prepare to add attribute restrictions.
      // Otherwise, check if a more restrictive configuration (FALSE, meaning
      // no attributes were allowed) is present: then override the existing
      // value to prepare to add attribute restrictions.
      if (!isset($restrictions['allowed'][$tag]) || $restrictions['allowed'][$tag] === FALSE) {
        $restrictions['allowed'][$tag] = [];
      }
      // Iterate over any attributes, and mark them as allowed.
      foreach ($node->attributes as $name => $attribute) {
        // Only add specific attribute values if all values are not already
        // allowed.
        if (isset($restrictions['allowed'][$tag][$name]) && $restrictions['allowed'][$tag][$name] === TRUE) {
          continue;
        }
        // Put back any trailing * on wildcard attribute name.
        $name = str_replace($star_protector, '*', $name);
        // Put back any trailing * on wildcard attribute value and parse out
        // the allowed attribute values.
        $allowed_attribute_values = preg_split('/\\s+/', str_replace($star_protector, '*', $attribute->value), -1, PREG_SPLIT_NO_EMPTY);
        // Sanitize the attribute value: it lists the allowed attribute values
        // but one allowed attribute value that some may be tempted to use
        // is specifically nonsensical: the asterisk. A prefix is required for
        // allowed attribute values with a wildcard. A wildcard by itself
        // would mean allowing all possible attribute values. But in that
        // case, one would not specify an attribute value at all.
        $allowed_attribute_values = array_filter($allowed_attribute_values, function ($value) {
          return $value !== '*';
        });
        if (empty($allowed_attribute_values)) {
          // No specific values remain (the attribute value was empty or only
          // a bare '*'), so all values are allowed.
          $restrictions['allowed'][$tag][$name] = TRUE;
        }
        else {
          // A non-empty attribute value is assigned, mark each of the
          // specified attribute values as allowed.
          foreach ($allowed_attribute_values as $value) {
            $restrictions['allowed'][$tag][$name][$value] = TRUE;
          }
        }
      }
    }
    if (empty($restrictions['allowed'][$tag])) {
      // Mark the tag as allowed, but with no attributes allowed.
      $restrictions['allowed'][$tag] = FALSE;
    }
  }
  // The 'style' and 'on*' ('onClick' etc.) attributes are always forbidden,
  // and are removed by Xss::filter().
  // The 'lang' and 'dir' attributes apply to all elements and are always
  // allowed. The list of allowed values for the 'dir' attribute is enforced
  // by self::filterAttributes(). Both attributes are in the short list of
  // globally usable attributes in HTML5. They are always allowed because the
  // correct values of lang and dir may only be known to the content author.
  // The other global attributes are not usually added by hand to content.
  // The class attribute in particular can have undesired visual effects by
  // letting content authors apply any available style, so specific values
  // should be explicitly allowed.
  // @see https://www.w3.org/TR/html5/dom.html#global-attributes
  $restrictions['allowed']['*'] = [
    'style' => FALSE,
    'on*' => FALSE,
    'lang' => TRUE,
    'dir' => [
      'ltr' => TRUE,
      'rtl' => TRUE,
    ],
  ];
  // Save this calculated result for re-use.
  $this->restrictions = $restrictions;
  return $restrictions;
}
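
Example

The shape of the returned array is easier to see with a concrete value. The following is a minimal sketch, not part of Drupal core: the 'allowed_html' value in the comment is an assumed example input, the array is what the code above would be expected to produce for it, and example_attribute_value_allowed() is a hypothetical helper written only to show how the structure can be read. It ignores wildcard names and values such as 'on*' or 'text-align-*', which the real filter handles elsewhere.

```php
<?php

// Assumed 'allowed_html' setting (hypothetical example input):
// '<a href hreflang> <em> <p class="text-align-left text-align-right">'
$restrictions = [
  'allowed' => [
    // Attributes without a value: every value is allowed (TRUE).
    'a' => ['href' => TRUE, 'hreflang' => TRUE],
    // Tag allowed, but with no attributes (FALSE).
    'em' => FALSE,
    // Attribute with values: only the listed values are allowed.
    'p' => [
      'class' => ['text-align-left' => TRUE, 'text-align-right' => TRUE],
    ],
    // Global rules appended at the end of getHTMLRestrictions().
    '*' => [
      'style' => FALSE,
      'on*' => FALSE,
      'lang' => TRUE,
      'dir' => ['ltr' => TRUE, 'rtl' => TRUE],
    ],
  ],
];

/**
 * Hypothetical helper: reads a restrictions array of the shape shown above.
 *
 * Simplification: wildcard attribute names and values are not expanded.
 */
function example_attribute_value_allowed(array $restrictions, string $tag, string $attribute, string $value): bool {
  $allowed = $restrictions['allowed'];
  // A tag that is not listed at all is not allowed.
  if (!array_key_exists($tag, $allowed)) {
    return FALSE;
  }
  // A tag mapped to FALSE has no per-tag attribute rules.
  $tag_rules = is_array($allowed[$tag]) ? $allowed[$tag] : [];
  // Global rules take precedence over per-tag rules in this sketch.
  $rule = $allowed['*'][$attribute] ?? $tag_rules[$attribute] ?? FALSE;
  // TRUE means any value, FALSE means forbidden.
  if (is_bool($rule)) {
    return $rule;
  }
  // Otherwise the rule is a map of allowed values.
  return isset($rule[$value]);
}
```

Checking the global '*' entry first mirrors the intent stated in the source comments, where 'style' is always forbidden and 'dir' is limited to 'ltr' and 'rtl' regardless of what a tag lists.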
