Interface HtmlProcessor

  • All Superinterfaces:
    Serializable

    public interface HtmlProcessor
    extends Serializable
    Processes HTML that is read from or written to the repository. It can be used to fix malformed HTML, remove unwanted elements and attributes, and transform elements such as images and internal links into the representation the CMS needs. Processing takes three steps:
      1. Parse the HTML into a DOM tree.
      2. Apply the visitors to the DOM tree.
      3. Serialize the DOM tree as a string.
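
The three steps above can be sketched with the JDK's built-in XML DOM classes. This is only an illustration under stated assumptions, not the implementation behind this interface: `ElementVisitor`, `PipelineSketch`, `process` and `walk` are hypothetical names, `ElementVisitor` stands in for `TagVisitor` (whose methods are not documented in this section), and the JDK parser accepts only well-formed markup, so a real implementation would need an HTML parser that tolerates malformed input.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class PipelineSketch {

    /** Hypothetical stand-in for TagVisitor; the real interface may look different. */
    interface ElementVisitor {
        void visit(Element element);
    }

    /** Parse, visit, serialize. Assumes well-formed markup with a single root element. */
    static String process(String markup, List<ElementVisitor> visitors) throws IOException {
        try {
            // 1. Parse the markup into a DOM tree.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));

            // 2. Apply each visitor to every element, in document order.
            for (ElementVisitor visitor : visitors) {
                walk(doc.getDocumentElement(), visitor);
            }

            // 3. Serialize the DOM tree as a string, without an XML declaration.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | SAXException | TransformerException e) {
            // Wrapped so callers see a single checked exception type.
            throw new IOException("Could not process markup", e);
        }
    }

    /** Depth-first traversal. Visitors here change attributes only, not tree structure. */
    static void walk(Node node, ElementVisitor visitor) {
        if (node instanceof Element) {
            visitor.visit((Element) node);
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, visitor);
        }
    }
}
```

A visitor that strips an unwanted attribute could then be written as `PipelineSketch.ElementVisitor stripOnclick = e -> e.removeAttribute("onclick");` and passed as `List.of(stripOnclick)`.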
    • Method Detail

      • read

        String read(String html,
                    List<TagVisitor> visitors)
             throws IOException
        Processes HTML that was read from the repository.
        Parameters:
        html - The stored HTML
        visitors - Visitors applied to the DOM tree
        Returns:
        Processed HTML
        Throws:
        IOException - when the DOM tree cannot be serialized
      • write

        String write(String html,
                     List<TagVisitor> visitors)
              throws IOException
        Processes HTML that is about to be stored in the repository.
        Parameters:
        html - The HTML to be stored
        visitors - Visitors applied to the DOM tree
        Returns:
        Processed HTML
        Throws:
        IOException - when the DOM tree cannot be serialized
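
A caller has to handle the checked `IOException` declared by both methods. The sketch below shows one way to do that. It is hedged throughout: `HtmlProcessorUsage` and `roundTrip` are hypothetical names, the nested `HtmlProcessor` only mirrors the two signatures documented above so that the example compiles on its own, and the empty `TagVisitor` is a placeholder for the real type.

```java
import java.io.IOException;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.List;

class HtmlProcessorUsage {

    /** Placeholder: the real TagVisitor's methods are not documented in this section. */
    interface TagVisitor {}

    /** Local mirror of the documented signatures, declared only to keep the sketch self-contained. */
    interface HtmlProcessor extends Serializable {
        String read(String html, List<TagVisitor> visitors) throws IOException;
        String write(String html, List<TagVisitor> visitors) throws IOException;
    }

    /** Hypothetical helper: process HTML for storage, then process the stored form for reading. */
    static String roundTrip(HtmlProcessor processor, String html,
                            List<TagVisitor> writeVisitors, List<TagVisitor> readVisitors) {
        try {
            String stored = processor.write(html, writeVisitors);
            return processor.read(stored, readVisitors);
        } catch (IOException e) {
            // Both methods declare IOException for serialization failures.
            throw new UncheckedIOException("HTML could not be processed", e);
        }
    }
}
```

Converting to `UncheckedIOException` is a design choice of this sketch; code that can recover from a serialization failure may prefer to let the `IOException` propagate.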