Comments (5)
@LiuXing-R the answer is in the policy. The current definition of the lang attribute in every predefined policy is:
<attribute name="lang" description="The 'lang' attribute tells the browser what language the element's attribute values and content are written in">
<regexp-list>
<regexp value="[a-zA-Z-]{2,20}"/>
</regexp-list>
</attribute>
The regex [a-zA-Z-]{2,20} does not include - and that is why it is being filtered. As there are MANY ways of writing languages, the easiest (but still safe) regex to use would be [a-zA-Z0-9-]{2,20}. Of course it allows languages like ------, but that is not a problem when looking for malicious HTML input. Making that change in every policy file would do.
from antisamy.
@spassarop - Sebastian? Any thoughts here? I have no idea.
@spassarop You're right.
The regex [a-zA-Z-]{2,20} does not include - and that is why it is being filtered. As there are MANY ways of writing languages, the easiest (but still safe) regex to use would be [a-zA-Z0-9-]{2,20}. Of course it allows languages like ------, but that is not a problem when looking for malicious HTML input. Making that change in every policy file would do.
@davewichers I'll change the regex to [a-zA-Z0-9-]{2,20} later.
@LiuXing-R / @spassarop - Isn't the trailing dash after the Z a dash? I tested the original regex at: https://www.freeformatter.com/java-regex-tester.html#ad-output, entering: [a-zA-Z-]{2,20} as the regex, and en-GB as the value and it matched just fine. I don't see that adding digits to the regex hurts anything but it shouldn't fix the issue you reported either.
That was my mistake: the trailing dash is not part of the current policies (the PR shows that). I added it for testing (it worked) and left it in when I copied that definition to explain.
The digits are there because you can express languages like "es-419" too. I found out about that yesterday, along with other formats that include many letters and dashes. It's just to be more inclusive.
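For anyone who wants to check this locally, here is a minimal Java sketch comparing the three patterns mentioned in this thread: letters only (the pattern without the trailing dash), letters plus dash, and the proposed letters, digits and dash. It assumes the whole attribute value has to match the regex (Matcher.matches semantics); that is an assumption about how the policy regex is applied, not something confirmed above. The class name LangRegexCheck and the helper accepts are made up for illustration and are not part of AntiSamy.

```java
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only; not part of AntiSamy.
class LangRegexCheck {
    // Pattern without the trailing dash, as described for the current policies.
    static final Pattern LETTERS_ONLY = Pattern.compile("[a-zA-Z]{2,20}");
    // Pattern with the trailing dash, as pasted in the first comment.
    static final Pattern LETTERS_AND_DASH = Pattern.compile("[a-zA-Z-]{2,20}");
    // Proposed pattern that also allows digits.
    static final Pattern PROPOSED = Pattern.compile("[a-zA-Z0-9-]{2,20}");

    // Assumption: the entire value must match, so matches() is used, not find().
    static boolean accepts(Pattern pattern, String lang) {
        return pattern.matcher(lang).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"en", "en-GB", "es-419", "------"};
        for (String lang : samples) {
            System.out.printf("%-7s lettersOnly=%b lettersAndDash=%b proposed=%b%n",
                    lang,
                    accepts(LETTERS_ONLY, lang),
                    accepts(LETTERS_AND_DASH, lang),
                    accepts(PROPOSED, lang));
        }
    }
}
```

Under that whole-value assumption, the letters-only pattern rejects en-GB, the letters-and-dash pattern accepts en-GB but rejects es-419, and only the proposed pattern accepts both. The proposed pattern also accepts ------, which matches the trade-off described earlier in the thread.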