
Comments (4)

kurtrank commented on June 3, 2024

Yes @Mamaduka is correct, we can now do it with 6.5. This past week I happened to need to build a simple Table of Contents generator that gets all the heading tags, gives them an id if they don't have one, and also gets the inner text of each one to then create a list of links. I could have done it with regex but it was a great opportunity to learn the new HTML API stuff.

I think I've got a handle on how it works now; the biggest part was realizing I can use next_tag() and next_token() in conjunction with each other, and how to identify the tokens I wanted.

Regarding the question "i want to know what is the text content of that tag, or what some attributes might be":

  1. To read the value of an attribute you can simply call get_attribute( 'id' ) when the processor is pointed at the tag you want. If you want to search for an element with a class, you can do that by passing a query argument to next_tag().
  2. Getting the text content is a little more involved than a single function call, but still pretty easy using the new next_token() method. The key is that you will scan through potentially multiple tokens, as your button could have nested <strong> tags or other inline markup and we want the inner text only.
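As a rough illustration of point 1, here is a minimal sketch. It assumes WordPress 6.5+ is loaded so that WP_HTML_Tag_Processor is available; the function name example_read_link_attributes(), the markup, and the class name "cta" are made up for the example.

```php
// Hypothetical helper, not part of WordPress. Assumes WordPress 6.5+ is loaded.
function example_read_link_attributes( string $html, string $class_name ): ?array {
	$processor = new WP_HTML_Tag_Processor( $html );

	// Query by tag name and class in one call; returns false when nothing matches.
	$query = array(
		'tag_name'   => 'a',
		'class_name' => $class_name,
	);
	if ( ! $processor->next_tag( $query ) ) {
		return null;
	}

	// get_attribute() returns the string value, true for a boolean
	// attribute, or null when the attribute is missing.
	return array(
		'id'   => $processor->get_attribute( 'id' ),
		'href' => $processor->get_attribute( 'href' ),
		'rel'  => $processor->get_attribute( 'rel' ),
	);
}

// Example usage with made-up markup.
$example_html = '<p><a id="signup" class="cta" href="/join">Join <strong>now</strong></a></p>';
// $attributes = example_read_link_attributes( $example_html, 'cta' );
```

The query array is only needed when filtering by class; for a plain tag name, passing the name as a string to next_tag() should be enough.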

The documentation on this API and these methods is pretty informative; I would recommend reviewing it: https://developer.wordpress.org/reference/classes/wp_html_tag_processor/#methods

Example

For my similar use case I basically set up two nested while loops:

  1. The first outer one loops through our tags like normal
  2. Once we find a tag we want, we use $tags->next_token() to start a new inner while loop, getting the text inside the tag and concatenating it together
  3. When we hit a token that matches the tag we started with and is a closing tag, end the inner loop and continue on to the next tag

$h_tags = array( 'H1', 'H2', 'H3', 'H4', 'H5', 'H6' );
$toc    = array();

$tags = new WP_HTML_Tag_Processor( $html );

// ----> (1) start looping through tags
while ( $tags->next_tag() ) {

	// I wanted to match an array of tags, so I have an if statement inside, but if you are just
	// looking for one tag you can instead query your tag name directly in `next_tag()` in the while statement
	if ( in_array( $tags->get_tag(), $h_tags, true ) ) {

		$level = (int) str_replace( 'H', '', $tags->get_tag() );
		$id    = $tags->get_attribute( 'id' );

		// set bookmark to come back to in case we need to generate an id from inner text
		$tags->set_bookmark( 'current_heading_start' );

		$text      = '';
		$found_end = false;

		// ----> (2) start capturing inner text
		while ( $tags->next_token() ) {
			// we only want to get plain text, skip all other token types
			if ( '#text' === $tags->get_token_type() ) {
				$text .= $tags->get_modifiable_text();
			} elseif ( "H{$level}" === $tags->get_tag() && $tags->is_tag_closer() ) {
				// ----> (3) we got all the inner text, break our inner loop so we can go to the next tag
				$tags->set_bookmark( 'current_heading_end' );
				$found_end = true;
				break;
			}
		}

		// generate a new id and insert into heading tag
		if ( ! $id ) {
			// return to starting tag and update id attribute
			$id = sanitize_title( $text );
			$tags->seek( 'current_heading_start' );
			$tags->set_attribute( 'id', $id );

			// resume after the closing tag, if one was found
			if ( $found_end ) {
				$tags->seek( 'current_heading_end' );
			}
		}

		// clean up bookmarks whether or not an id was generated
		$tags->release_bookmark( 'current_heading_start' );
		if ( $found_end ) {
			$tags->release_bookmark( 'current_heading_end' );
		}

		// insert item to toc
		$toc[] = array(
			'text'  => $text,
			'id'    => $id,
			'items' => array(),
		);
	}
}

// the new id attributes only end up in the markup once we ask for the updated HTML
$html = $tags->get_updated_html();
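
To round this off, here is one possible way to turn the collected items into the list of links. This is only a sketch: example_render_toc() is a hypothetical helper, and it assumes each item has the 'text', 'id' and 'items' keys used above. It uses plain htmlspecialchars() so it runs without WordPress; inside WordPress, esc_attr() and esc_html() would be the more usual choice.

```php
// Hypothetical helper: renders TOC items (each with 'text', 'id', 'items') as nested <ul> markup.
function example_render_toc( array $items ): string {
	if ( empty( $items ) ) {
		return '';
	}

	$out = '<ul>';
	foreach ( $items as $item ) {
		$out .= '<li><a href="#' . htmlspecialchars( $item['id'], ENT_QUOTES ) . '">'
			. htmlspecialchars( $item['text'], ENT_QUOTES )
			. '</a>'
			// Child items, if any, become a nested list inside the same <li>.
			. example_render_toc( $item['items'] ?? array() )
			. '</li>';
	}

	return $out . '</ul>';
}

// Example usage with a made-up item.
$example_items = array(
	array(
		'text'  => 'Intro & setup',
		'id'    => 'intro-setup',
		'items' => array(),
	),
);
echo example_render_toc( $example_items );
```

Nesting the items by heading level is left out here; the sketch simply renders whatever structure it is given.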

from gutenberg.

Mamaduka commented on June 3, 2024

Thanks for sharing a great example, @kurtrank!


yglik commented on June 3, 2024

@kurtrank Thank you for the detailed examples, it really helped me understand the process I have to follow in order to accomplish what I want.

I also wanted to add an "id" attribute to each heading, and then make some sort of table of contents.
And just like you, I thought this was a great opportunity to learn about the WP HTML API.

It was difficult for me to pinpoint which methods I should use to extract the text of an element (what JS refers to as textContent or innerText),
because in all the tools I have used before, like JS, Simple HTML DOM (which I use often), or other server-side DOM parsers, the text is exposed as textContent or something of that sort.
It's a bit of a paradigm shift to think of the text as a token in the HTML (which I guess is what it is underneath the abstraction layers).

I did search through the docs of the HTML API and couldn't find it.

I guess it could be beneficial to add some abstraction to the HTML API to make it more like other tools: say, a method that retrieves the textContent of a tag (or, better put, a node or DOM element in that context).
Also, the documentation could benefit from a wider variety of use cases, which I suppose is always good.
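
Until something like that exists, a small wrapper in user code might get close. The sketch below is an assumption-laden illustration, not an official API: example_get_text_content() is a hypothetical name, it assumes WordPress 6.5+ is loaded, and it is only meant for ordinary elements such as headings, paragraphs, or buttons (void elements like img, and special elements like script or style, are not handled).

```php
// Hypothetical helper, not part of WordPress. Assumes WordPress 6.5+ is loaded.
// Returns the concatenated text inside the first matching element, or null if none is found.
function example_get_text_content( string $html, string $tag_name ): ?string {
	$processor = new WP_HTML_Tag_Processor( $html );

	if ( ! $processor->next_tag( $tag_name ) ) {
		return null;
	}

	$name  = $processor->get_tag(); // upper-case tag name, e.g. "H2"
	$depth = 1;                     // how many elements of this name are currently open
	$text  = '';

	while ( $processor->next_token() ) {
		$type = $processor->get_token_type();

		if ( '#text' === $type ) {
			// Collect text from this element and any nested inline markup.
			$text .= $processor->get_modifiable_text();
		} elseif ( '#tag' === $type && $name === $processor->get_tag() ) {
			// Track nested elements of the same name so we stop at the right closer.
			$depth += $processor->is_tag_closer() ? -1 : 1;
			if ( 0 === $depth ) {
				break;
			}
		}
	}

	return $text;
}
```

Called as example_get_text_content( $html, 'h2' ), this should return the inner text of the first h2 with nested tags stripped out, which is roughly what textContent gives you in the browser.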

I really hope this thread will help people who are looking for a way to do something like this with the WP HTML API in the future.

Thanks again


Mamaduka commented on June 3, 2024

I think that might be possible with new methods introduced in WP 6.5. See the dev note for more details - https://make.wordpress.org/core/2024/03/04/updates-to-the-html-api-in-6-5/.
