i have a problem, when there is a "<" symbol in content without ">" it will remo

incomplete tag removing rest of content about antisamy HOT 8 CLOSED

goshantmeher commented on August 27, 2024

incomplete tag removing rest of content

from antisamy.

Comments (8)

feisec commented on August 27, 2024

I think you need this: owasp-java-encoder

kwwall commented on August 27, 2024

@goshantmeher I would speculate that the reason AntiSamy does things this way is that there are browsers out there that will attempt to fill in the missing ">" (or "/>", as the case may be). (That is, they follow Postel's Law, sometimes to the detriment of security.)

If you know that the input is like this and you know what you expect the output to look like, then hopefully you also know the context where the output will be placed, and thus @xtay004 is correct in stating that output encoding using something like the OWASP Java Encoder project is the way to go. In general, output encoding is preferred over HTML "sanitization" as a defense against XSS anyway, since it is "less destructive": it leaves more of the original input rendered than an HTML sanitizer like AntiSamy or the Java HTML Sanitizer project will. HTML sanitization is intended for cases where you have to accept HTML markup; otherwise, output encoding should generally be preferred.
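
As a rough illustration of why output encoding is "less destructive", the sketch below encodes an input containing an unclosed "<" for an HTML text context, so the whole string survives and is displayed literally. The class `HtmlEncodeSketch` and its method `forHtmlContent` are hypothetical names written only for this illustration; in a real application the OWASP Java Encoder's `Encode.forHtml(String)` would be the call to reach for rather than hand-rolled escaping.

```java
public class HtmlEncodeSketch {

    // Hypothetical, simplified stand-in for a real encoder such as
    // org.owasp.encoder.Encode.forHtml(String). Illustration only.
    static String forHtmlContent(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;"); break;
                case '<':  out.append("&lt;");  break;
                case '>':  out.append("&gt;");  break;
                case '"':  out.append("&#34;"); break;
                case '\'': out.append("&#39;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The unclosed "<aaaa" is kept; only the markup-significant character is encoded.
        String userInput = "price <aaaa and more text";
        System.out.println(forHtmlContent(userInput)); // price &lt;aaaa and more text
    }
}
```

Nothing after the "<" is dropped: the browser renders `&lt;` as a literal "<", so the user sees exactly what was typed.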

goshantmeher commented on August 27, 2024

Thanks, it works.
For Ref: https://www.owasp.org/index.php/OWASP_Java_Encoder_Project#tab=Use_the_Java_Encoder_Project

goshantmeher commented on August 27, 2024

@kwwall, on form submit we call org.owasp.validator.html.CleanResults.getCleanHTML() inside RequestValidator.validate() to validate all the input fields before saving the data in the DB.

If I use the Encoder at this point, any field value containing, say, '<hello' comes back as the encoded string '&lt;hello', and that encoded value gets saved in the DB, which is wrong for me.

As a defense against XSS, I want to remove only actual HTML tags; text that is not an HTML tag should not be removed.
My requirement is a textarea field where I can enter any string as a description; it may contain <aaaa , and that should be saved.
Please suggest a solution for this.

davewichers commented on August 27, 2024

XSS encoding should only be done right before the content is included in the server response, and not before. I.e., it should NOT be done before storing the data in the database or anywhere else for that matter, as you don't know for certain everywhere that data is going. Sanitization really should only be done right before rendering to the browser too. My suggestion is to validate the input and reject what you don't like, and then assume that potential XSS attacks can still sneak through, so whenever you include that data in a full page response (where it is subject to XSS), you output encode it using the OWASP Java Encoder library. If you are instead including the 'dangerous' data in JSON responses, it's actually not vulnerable to XSS, because your JavaScript is very likely to add that data to the DOM in a way that is immune to XSS.
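
A minimal sketch of that flow (validate on input, store the raw value, encode only when rendering) might look like the following. Everything here is an assumption made for illustration: the class `StoreRawEncodeLate`, the in-memory map standing in for the database, the 2000-character limit, and the hand-written `encodeForHtml`, which stands in for the OWASP Java Encoder's `Encode.forHtml(String)`.

```java
import java.util.HashMap;
import java.util.Map;

public class StoreRawEncodeLate {

    // Hypothetical in-memory stand-in for the database table.
    private static final Map<Integer, String> DESCRIPTIONS = new HashMap<>();

    // Input validation: accept or reject, but never rewrite the value.
    // The length limit and control-character rule are assumed policies.
    static boolean isAcceptable(String value) {
        if (value == null || value.length() > 2000) {
            return false;
        }
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (Character.isISOControl(c) && c != '\n' && c != '\r' && c != '\t') {
                return false;
            }
        }
        return true;
    }

    static void save(int id, String description) {
        if (!isAcceptable(description)) {
            throw new IllegalArgumentException("rejected input");
        }
        DESCRIPTIONS.put(id, description); // stored exactly as the user typed it
    }

    static String load(int id) {
        return DESCRIPTIONS.get(id);
    }

    // Simplified stand-in for org.owasp.encoder.Encode.forHtml(String).
    // '&' is replaced first so later entities are not double-encoded.
    static String encodeForHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&#34;")
                .replace("'", "&#39;");
    }

    // Encoding happens here, at render time, and nowhere earlier.
    static String renderDescription(int id) {
        return "<p>" + encodeForHtml(load(id)) + "</p>";
    }

    public static void main(String[] args) {
        save(1, "<aaaa text");
        System.out.println(load(1));              // <aaaa text
        System.out.println(renderDescription(1)); // <p>&lt;aaaa text</p>
    }
}
```

The stored value keeps the raw "<aaaa", so other consumers of the data (reports, exports, JSON APIs) are not handed HTML entities they never asked for.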

goshantmeher commented on August 27, 2024

OK, I understand the issues, but what users will input is not in my control. For now, this means I can't use text in between '<>', which is not acceptable.
for now, I want when there will be input and output should be
So my request in this issue is: can anyone provide a feature to allow all unknown tags?

kwwall commented on August 27, 2024

> Ok, I understand the issues but for the case what users will input is not in my control. for now, it means I can't use in-between '<>', which is not acceptable.
> for now, I want when there will be input and output should be
> So my request in this issue is, can anyone provide me a feature to allow all unknown tags.

@goshantmeher You say you understand @davewichers' comment, but I'm not sure you do. Cleansing via AntiSamy (or the OWASP Java HTML Sanitizer) is very different from output encoding, and almost all of us in the AppSec community consider it to be an XSS defense of last resort. The bottom line is, it doesn't matter if you store the raw, dangerous user input, as long as you always ensure that you properly output encode it (properly, as in "using the output encoding for the appropriate context") before it is rendered in a browser. Yes, that requires a lot more work on your part, but it is much more bullet-proof than what you are proposing. That said, if you really want to insist on shooting your foot (if not your entire leg) off, you can customize AntiSamy's 'antisamy.xml' file to make it accept whatever you want. But CAVEAT EMPTOR; you have been warned.
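
To make "the appropriate context" concrete, here is a small sketch in which the same raw value is encoded two different ways within one link: URL-encoded where it becomes a query parameter, and HTML-encoded where it becomes visible text. The class `ContextEncodingSketch`, the `/search?q=` path, and the hand-written `escapeHtml` (a stand-in for the OWASP Java Encoder's `Encode.forHtml(String)`) are assumptions for illustration. `URLEncoder.encode(String, Charset)` is from the Java standard library (Java 10 or later).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class ContextEncodingSketch {

    // Simplified stand-in for org.owasp.encoder.Encode.forHtml(String).
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&#34;")
                .replace("'", "&#39;");
    }

    // Builds a link whose query parameter and visible text both come
    // from the same raw user value, each encoded for its own context.
    static String link(String rawValue) {
        String forUrl = URLEncoder.encode(rawValue, StandardCharsets.UTF_8);
        String forText = escapeHtml(rawValue);
        return "<a href=\"/search?q=" + forUrl + "\">" + forText + "</a>";
    }

    public static void main(String[] args) {
        System.out.println(link("<aaaa b&c"));
        // <a href="/search?q=%3Caaaa+b%26c">&lt;aaaa b&amp;c</a>
    }
}
```

Using the HTML encoding inside the URL, or the URL encoding in the visible text, would produce either a broken link or garbled display, which is why the encoder has to match the place where the data lands.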

davewichers commented on August 27, 2024

As I don't think it is appropriate/practical for AntiSamy to implement this change, I'm closing this issue.
