JavaScript "strip_tags": Remove HTML tags from text and variables!
07/15/2019
PHP comes with the "strip_tags" function, which easily removes all HTML tags from a text. JavaScript has no built-in equivalent, but you can achieve the same with a little trick! This is useful, for example, when you want to process a text further or simply count its words and letters.
JavaScript: Remove HTML tags
JavaScript does not have its own function for deleting HTML tags from a value or variable. That doesn't matter, though, because you can use "innerHTML" and "innerText" for it. These properties are actually meant for reading or changing the content of an element: innerText returns the plain text, while innerHTML returns the content including any HTML tags.
Example: innerHTML and innerText
If the page contains a <div> element with the id "div", you can read or change its content:
var html = document.getElementById('div').innerHTML;
var text = document.getElementById('div').innerText;
The variable "html" then contains the content of the element including all HTML tags, while the variable "text" contains only the text content without HTML: the HTML tags are dropped automatically when reading.
This can now be exploited: create a new element (it does not even have to be added to the document), assign the complete text including its HTML tags to innerHTML, and then read back the text version. The result contains only the plain text, no more HTML tags!
With this code you can, for example, read the value of a textarea, put it into a new element and then read out only the text version:
var textarea = document.getElementById('textarea');
var newtextarea = '';
var div = document.createElement("div");
div.innerHTML = textarea.value;
newtextarea = div.innerText;
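The same steps can be wrapped in a small reusable function, for example to count words and letters afterwards. The following is only a minimal sketch: the names stripTags and countWords are hypothetical, and the optional second parameter doc exists only so that the function can be tried outside a browser. It reads textContent, the standard DOM counterpart of innerText. The sketch assumes trusted input, because assigning foreign markup to innerHTML can trigger event handlers such as onerror on an <img>, even on an element that was never added to the page.

```javascript
// Hypothetical helper: returns the plain text of an HTML string.
// "doc" is an assumption for testing; it defaults to the browser's document.
function stripTags(html, doc) {
  doc = doc || document;
  var div = doc.createElement('div'); // detached element, never added to the page
  div.innerHTML = html;               // let the browser parse the markup
  return div.textContent;             // read back the text without tags
}

// Hypothetical helper: counts whitespace-separated words in a plain text.
function countWords(text) {
  var trimmed = text.trim();
  return trimmed === '' ? 0 : trimmed.split(/\s+/).length;
}

// Usage in a browser (skipped where no DOM is available):
if (typeof document !== 'undefined') {
  var plain = stripTags('<p>Hello <b>world</b></p>');
  console.log(plain);             // "Hello world"
  console.log(countWords(plain)); // 2
  console.log(plain.length);      // 11
}
```

Keeping the word count in its own function means it can also be used on text that never contained any HTML.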
As an alternative to "innerText" you can also use "textContent", which is part of the DOM standard. All current browsers have supported both for years. Among older browsers, Internet Explorer knows "textContent" from version 9 onwards; "innerText" started out as an Internet Explorer extension and only arrived in Firefox with version 45.
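If a browser without textContent (for example Internet Explorer before version 9) still has to be supported, the property can be chosen at runtime. A small sketch, in which readText is a hypothetical helper name:

```javascript
// Hypothetical helper: prefers textContent and falls back to innerText
// for engines that do not provide textContent.
function readText(element) {
  return typeof element.textContent === 'string'
    ? element.textContent
    : element.innerText;
}
```

The check uses typeof rather than a simple truthiness test, so that an element with empty text still goes through textContent.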
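Unlike innerText, textContent also returns the content of <script> and <style> elements; innerText leaves it out, just like the HTML tags themselves. This difference applies to elements that are rendered on the page: for an element that is not rendered, such as a temporary <div> that was never added to the document, the HTML specification has innerText return the same value as textContent.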
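Since textContent keeps the content of <script> and <style> elements, these elements can be removed before the text is read. The following sketch uses DOMParser instead of a temporary <div>; this is a swapped-in technique, not the method described above. The name stripTagsWithoutCode is hypothetical, and the optional parser argument is an assumption that only exists so the function can be exercised without a browser.

```javascript
// Hypothetical helper: returns the plain text of an HTML string,
// leaving out the content of <script> and <style> elements.
function stripTagsWithoutCode(html, parser) {
  parser = parser || new DOMParser();
  // Parse into a separate document instead of the current page
  var parsed = parser.parseFromString(html, 'text/html');

  // Remove script and style elements so their content is not returned
  var unwanted = parsed.querySelectorAll('script, style');
  for (var i = 0; i < unwanted.length; i++) {
    unwanted[i].parentNode.removeChild(unwanted[i]);
  }
  return parsed.body.textContent;
}

// Usage in a browser (skipped where DOMParser is not available):
if (typeof DOMParser !== 'undefined') {
  console.log(stripTagsWithoutCode('<p>Hello</p><script>var x = 1;</script>'));
}
```

Parsing into a separate document also keeps the markup away from the current page, which is a reasonable choice when the input does not come from a trusted source.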