Class TagNode

  • All Implemented Interfaces:
    Iterable<Node>
    Direct Known Subclasses:
    BodyNode

    public class TagNode
    extends Node
    implements Iterable<Node>
    Node that can contain other nodes. Represents an HTML tag.
    • Method Detail

      • addChild

        public void addChild​(Node node)
        Appends the provided node to the collection of children, provided that this node is already set as the parameter's parent. This method is used in the Node constructor.
        Parameters:
        node - the child to add; it must have this node set as its parent.
        Throws:
        IllegalStateException - if the parameter has a different parent than this node.
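        The parent check described for addChild can be sketched as follows. This is a minimal illustration, not the library's implementation: SketchNode and SketchTag are hypothetical stand-ins for Node and TagNode, and the exception message is invented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Node: it registers itself with its parent on construction.
class SketchNode {
    final SketchTag parent;

    SketchNode(SketchTag parent) {
        this.parent = parent;
        if (parent != null) {
            // The parent link is already set here, so the check in addChild passes.
            parent.addChild(this);
        }
    }
}

// Hypothetical stand-in for TagNode: a node that can contain other nodes.
class SketchTag extends SketchNode {
    private final List<SketchNode> children = new ArrayList<>();

    SketchTag(SketchTag parent) {
        super(parent);
    }

    // Appends the node only if it already names this tag as its parent.
    void addChild(SketchNode node) {
        if (node.parent != this) {
            throw new IllegalStateException("child has a different parent");
        }
        children.add(node);
    }

    int getNbChildren() {
        return children.size();
    }
}
```

        In this sketch, new SketchTag(root) ends up in root's children through the constructor, while passing the same node to addChild on any other tag raises IllegalStateException.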
      • getIndexOf

        public int getIndexOf​(Node child)
        If the provided parameter is in the same tree as this object, this method returns the index of the parameter object in the children collection. If the parameter is from a different tree, this method attempts to return the index of the first child that is semantically equivalent to the parameter.
        Parameters:
        child - the template of the tag we need an index for.
        Returns:
        the index of the first semantically equivalent child, or -1 if none is found
      • addChild

        public void addChild​(int index,
                             Node node)
        Inserts the provided node into the collection of children at the specified index, provided that this node is already set as the parameter's parent.
        Parameters:
        index - the desired position among the children
        node - the node to insert as a child.
        Throws:
        IllegalStateException - if the provided node has a different parent than this node.
      • getChild

        public Node getChild​(int i)
      • getNbChildren

        public int getNbChildren()
        Returns:
        the number of elements in the children collection, or 0 if the collection is null
      • getQName

        public String getQName()
      • getAttributes

        public Attributes getAttributes()
      • isSameTag

        public boolean isSameTag​(TagNode other)
        Checks whether the tags are semantically equivalent if the other tag is from a different tree, and whether they are the same object if it is from the same tree as this tag.
        Parameters:
        other - the tag to compare to
      • equals

        public boolean equals​(Object obj)
        Considers tags from different trees equal if they have the same name and equivalent attributes. The content (children) of the tag is ignored. Considers tags from the same tree equal only if they are the same object.
        Overrides:
        equals in class Object
        Parameters:
        obj - the object to compare to.
      • isSimilarTag

        protected boolean isSimilarTag​(Node another)
        Returns true if this tag is similar to the given other tag. The tags may be from different trees. If the tag name and attributes are the same, the result will be true.
        Parameters:
        another - the tag to compare with
        Returns:
        whether this tag is similar to the other node
      • hashCode

        public int hashCode()
        Because the equals method considers only part of the TagNode's information, hashCode is overridden to correspond. Otherwise Hashtables and HashMaps might behave unexpectedly.
        Overrides:
        hashCode in class Object
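        The relationship between equals and hashCode described above can be illustrated with a small sketch. It is an assumption-laden simplification, not the library's code: SketchTagKey is a hypothetical class, the tree field stands in for "which tree the tag belongs to", and attributes are modeled as a plain Map rather than an Attributes object.

```java
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

// Hypothetical simplified tag; children are deliberately not part of it.
final class SketchTagKey {
    private final Object tree;                    // assumed marker for the owning tree
    private final String qName;
    private final Map<String, String> attributes;

    SketchTagKey(Object tree, String qName, Map<String, String> attributes) {
        this.tree = tree;
        this.qName = qName;
        this.attributes = new TreeMap<>(attributes);
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) {
            return true;                          // same object is always equal
        }
        if (!(obj instanceof SketchTagKey)) {
            return false;
        }
        SketchTagKey other = (SketchTagKey) obj;
        if (other.tree == this.tree) {
            return false;                         // same tree: only identity counts
        }
        // Different trees: compare name and attributes, ignore content.
        return qName.equals(other.qName) && attributes.equals(other.attributes);
    }

    @Override
    public int hashCode() {
        // Uses exactly the fields that equals compares across trees,
        // so equal tags always produce equal hash codes.
        return Objects.hash(qName, attributes);
    }
}
```

        With this shape, two tags with the same name and attributes from different trees are equal and hash identically, while two distinct tags from the same tree are never equal even though their hash codes may coincide.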
      • getOpeningTag

        public String getOpeningTag()
        Produces the String for the opening HTML tag of this node, including its attributes. This probably does not work for the image tag.
        Returns:
        the String representation of the corresponding opening HTML tag.
      • getEndTag

        public String getEndTag()
        Returns:
        the String representation of the closing HTML tag that corresponds to the current node. This probably does not work for the image tag.
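        A rough sketch of how opening and closing tag strings might be assembled from a qualified name and its attributes is shown below. TagStrings and its two methods are hypothetical helpers, and the sketch assumes that the Attributes type is org.xml.sax.Attributes; the actual output format of getOpeningTag and getEndTag may differ.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical helper; not part of the documented class.
final class TagStrings {
    private TagStrings() {
    }

    // Builds "<name attr="value" ...>" from the qualified name and attributes.
    // Attribute values are copied verbatim, without any escaping.
    static String openingTag(String qName, Attributes attrs) {
        StringBuilder sb = new StringBuilder("<").append(qName);
        for (int i = 0; i < attrs.getLength(); i++) {
            sb.append(' ')
              .append(attrs.getQName(i))
              .append("=\"")
              .append(attrs.getValue(i))
              .append('"');
        }
        return sb.append('>').toString();
    }

    // Builds "</name>" unconditionally.
    static String endTag(String qName) {
        return "</" + qName + ">";
    }

    public static void main(String[] args) {
        AttributesImpl attrs = new AttributesImpl();
        attrs.addAttribute("", "href", "href", "CDATA", "index.html");
        System.out.println(openingTag("a", attrs)); // <a href="index.html">
        System.out.println(endTag("a"));            // </a>
    }
}
```

        A builder of this kind emits a closing tag for every name, which is one plausible reason for the caveat about the image tag: img is a void element in HTML and has no closing tag.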
      • getMinimalDeletedSet

        public List<Node> getMinimalDeletedSet​(long id)

        This recursive method considers a descendant deleted if all of its children had TextNodes that are now marked as removed with the provided id. If all children of a descendant are considered deleted, only that descendant is kept in the collection of deleted nodes, and its children are removed from that collection.
        HTML tag nodes that never had any text content are never considered removed.

        The result might not reflect what was really deleted, because an element might be kept after its text content was deleted, and an element without text content can be deleted without being reported.
        Examples:
        a table cell can be kept after its text content was deleted
        a horizontal rule never had text content, but can be deleted

        Specified by:
        getMinimalDeletedSet in class Node
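        The recursion described for getMinimalDeletedSet can be sketched on a small stand-in tree. DNode, DText and DTag are hypothetical classes, and the removal marker is modeled as a nullable Long rather than the library's modification data; treat it as an illustration of the rule, not the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base type for the sketch.
abstract class DNode {
    abstract List<DNode> minimalDeletedSet(long id);
}

// Hypothetical text node; removedId is null when the text was not removed.
final class DText extends DNode {
    private final Long removedId;

    DText(Long removedId) {
        this.removedId = removedId;
    }

    @Override
    List<DNode> minimalDeletedSet(long id) {
        List<DNode> out = new ArrayList<>();
        if (removedId != null && removedId == id) {
            out.add(this);
        }
        return out;
    }
}

// Hypothetical tag node.
final class DTag extends DNode {
    final String name;
    final List<DNode> children = new ArrayList<>();

    DTag(String name, DNode... kids) {
        this.name = name;
        for (DNode kid : kids) {
            children.add(kid);
        }
    }

    @Override
    List<DNode> minimalDeletedSet(long id) {
        List<DNode> out = new ArrayList<>();
        if (children.isEmpty()) {
            return out; // tags without any content are never reported
        }
        boolean allDeleted = true;
        for (DNode child : children) {
            List<DNode> sub = child.minimalDeletedSet(id);
            out.addAll(sub);
            // A child counts as fully deleted only if it reported exactly itself.
            if (!(sub.size() == 1 && sub.get(0) == child)) {
                allDeleted = false;
            }
        }
        if (allDeleted) {
            out.clear();   // collapse the children into this node
            out.add(this);
        }
        return out;
    }
}
```

        For a body containing a paragraph whose texts were all removed with id 7, one kept text and an empty horizontal rule, the sketch reports only the paragraph: its descendants are collapsed into it, and the rule is never reported.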
      • splitUntill

        public boolean splitUntill​(TagNode parent,
                                   Node split,
                                   boolean includeLeft)
        Attempts to create 2 TagNodes with the same name and attributes as this node. All children preceding the split parameter are placed into the left part, and all children following it are placed into the right part. Placement of the split node itself is determined by the includeLeft flag. The newly created nodes are added to the parent of this node only if they have children. This node is removed afterwards. The process continues recursively up the tree until the "parent" node is reached; the "parent" node itself is not touched.
        This method is used when the parent tags of a deleted TextNode can no longer be found in the new document (meaning they have either been deleted or had their attributes changed). In that case the "parent" parameter is the deepest common parent of the deleted node and its remaining surrounding siblings.
        Parameters:
        parent - the node that should not participate in the split operation (where the split operation stops)
        split - the divider node used to distribute the children among the split parts
        includeLeft - if true, the "split" node will be included in the left part.
        Returns:
        true if this single node was replaced with 2 new similar nodes, with the original children divided among them.
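        The split procedure can be sketched on a simplified tree as follows. SNode, STag and the method name splitUntil are hypothetical, attributes are omitted, and edge cases may be handled differently by the real method; the sketch only shows the left/right distribution and the climb up to the stop node.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical leaf or base node for the sketch.
class SNode {
    STag parent;
    final String label;

    SNode(String label) {
        this.label = label;
    }
}

// Hypothetical tag node; label plays the role of the tag name.
class STag extends SNode {
    final List<SNode> children = new ArrayList<>();

    STag(String name) {
        super(name);
    }

    STag add(SNode child) {
        child.parent = this;
        children.add(child);
        return this;
    }

    boolean splitUntil(STag stop, SNode split, boolean includeLeft) {
        if (this == stop || parent == null) {
            return false; // the stop node (or a root) is never split
        }
        STag left = new STag(label);
        STag right = new STag(label);
        boolean passedSplit = false;
        for (SNode child : children) {
            if (child == split) {
                (includeLeft ? left : right).add(child);
                passedSplit = true;
            } else {
                (passedSplit ? right : left).add(child);
            }
        }
        // Replace this node in its parent with the non-empty parts, left first.
        STag owner = parent;
        int at = owner.children.indexOf(this);
        owner.children.remove(at);
        if (!right.children.isEmpty()) {
            right.parent = owner;
            owner.children.add(at, right);
        }
        if (!left.children.isEmpty()) {
            left.parent = owner;
            owner.children.add(at, left);
        }
        // Continue one level up, splitting around the part that holds the split node.
        owner.splitUntil(stop, includeLeft ? left : right, includeLeft);
        return !left.children.isEmpty() && !right.children.isEmpty();
    }
}
```

        Splitting body > p > b > [t1, t2, t3] at t2 with includeLeft set to true and body as the stop node leaves body with two p nodes: the first holds a b with t1 and t2, the second holds a b with t3.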
      • isBlockLevel

        public static boolean isBlockLevel​(String qName)
      • isBlockLevel

        public static boolean isBlockLevel​(Node node)
      • isBlockLevel

        public boolean isBlockLevel()
      • isInline

        public static boolean isInline​(String qName)
      • isInline

        public static boolean isInline​(Node node)
      • isInline

        public boolean isInline()
      • getMatchRatio

        public double getMatchRatio​(TagNode other)
      • expandWhiteSpace

        public void expandWhiteSpace()
      • isPre

        public boolean isPre()