
Comments (56)

rombert avatar rombert commented on July 24, 2024 1

I stumbled upon this issue while looking into problems with AntiSamy 1.6.4 and Xalan within the Apache Sling project. We use AntiSamy embedded in our XSS protection bundle, and over the years we have accumulated quite a few workarounds for specific platforms.

The only solution that worked for us was to embed Xalan and make sure that it is used by AntiSamy; otherwise it's quite hard to guarantee that things work as expected.

With the current proposed change (IIUC) Xalan is completely bypassed and we would be stuck without the possibility of working around it. Well, the only option would be to include some of AntiSamy as source code in our project and patch it, but that's really a last resort for us.

It would be great if a solution could be found that:

  • uses TransformerFactory.newInstance();
  • does not require that the implementation supports JAXP 1.5 features

A try-catch in a static block could work for us, but IIUC that's not a solution that the maintainers approve of. I wonder if something like https://docs.oracle.com/javase/8/docs/api/javax/xml/XMLConstants.html#FEATURE_SECURE_PROCESSING could be used instead, as it is supported by Xalan, and the features enabled for AntiSamy should be enabled by the secure processing feature, as indicated in the Javadoc, e.g. in https://docs.oracle.com/javase/8/docs/api/javax/xml/XMLConstants.html#ACCESS_EXTERNAL_DTD

When FEATURE_SECURE_PROCESSING is enabled, it is recommended that implementations restrict external connections by default, though this may cause problems for applications that process XML/XSD/XSL with external references.

Something like this makes AntiSamy 1.6.5-SNAPSHOT work for us, while 1.6.4 does not

diff --git a/src/main/java/org/owasp/validator/html/scan/AntiSamySAXScanner.java b/src/main/java/org/owasp/validator/html/scan/AntiSamySAXScanner.java
index b225723..5ead3df 100644
--- a/src/main/java/org/owasp/validator/html/scan/AntiSamySAXScanner.java
+++ b/src/main/java/org/owasp/validator/html/scan/AntiSamySAXScanner.java
@@ -55,7 +55,7 @@ public class AntiSamySAXScanner extends AbstractAntiSamyScanner {
     private static final Queue<CachedItem> cachedItems = new ConcurrentLinkedQueue<CachedItem>();
 
     private static final TransformerFactory sTransformerFactory =
-        TransformerFactory.newInstance("com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl", null );
+        TransformerFactory.newInstance();
 
     static {
         // Per issue #103, an IllegalArgumentException could be thrown below if the SAX parser does not
@@ -66,8 +66,11 @@ public class AntiSamySAXScanner extends AbstractAntiSamyScanner {
         // JDK provided Xalan SAX parser, which DOES support these features.
 
         // Disable external entities, etc.
-        sTransformerFactory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
-        sTransformerFactory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
+        try {
+            sTransformerFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
+        } catch (javax.xml.transform.TransformerConfigurationException e) {
+            throw new ExceptionInInitializerError(e);
+        }
     }
 
     static class CachedItem {

It would be great if a fix like this would be considered for 1.6.5.
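
The idea in the diff above can be tried in isolation, outside AntiSamy. The following is only a sketch: the class and method names (`SecureProcessingSketch`, `newSecureFactory`) are made up for illustration and are not part of AntiSamy, and how strictly `FEATURE_SECURE_PROCESSING` restricts external access depends on the `TransformerFactory` implementation that the standard JAXP lookup resolves.

```java
import javax.xml.XMLConstants;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;

class SecureProcessingSketch {

    // Hypothetical helper, not AntiSamy code.
    static TransformerFactory newSecureFactory() {
        // Standard JAXP lookup: whichever implementation is configured or
        // found on the classpath wins, otherwise the JDK default is used.
        TransformerFactory factory = TransformerFactory.newInstance();
        try {
            // A JAXP feature that predates 1.5; Xalan and the JDK both accept it.
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        } catch (TransformerConfigurationException e) {
            // Surface the failure as an unchecked exception, keeping the cause.
            throw new IllegalStateException(
                "Secure processing is not supported by " + factory.getClass().getName(), e);
        }
        return factory;
    }

    public static void main(String[] args) {
        TransformerFactory factory = newSecureFactory();
        System.out.println("Implementation: " + factory.getClass().getName());
        System.out.println("Secure processing: "
            + factory.getFeature(XMLConstants.FEATURE_SECURE_PROCESSING));
    }
}
```

Printing the implementation class makes it visible whether the JDK-internal factory or an embedded Xalan was picked up.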

from antisamy.

kwwall avatar kwwall commented on July 24, 2024 1

@azmau - You wrote:

we have ESAPI 2.2.3.1. I only mentioned it because I thought it might ring a bell for one of you guys, on my side is a pure hunch, I did no further analysis on this direction.

I only mentioned that ESAPI 2.4.0.0 uses AntiSamy 1.6.8 in case you had compatibility issues with ESAPI. We hadn't tested earlier versions of ESAPI with that version of AntiSamy, largely because earlier versions of ESAPI only required JDK 7 as the minimal JDK and AntiSamy 1.6.8 requires JDK 8 or later (as now ESAPI 2.4.0.0 does). However, given that you are using an older, untested version of ESAPI (2.2.3.1) with a newer version of AntiSamy (1.6.8), you may want to verify that ESAPI's Validator.getValidSafeHTML() and Validator.isValidSafeHTML() methods still work as you expect them. If they do not, I suggest that you upgrade to ESAPI 2.4.0.0. (Actually, given all the CVEs that were fixed in ESAPI 2.3.0.0 and 2.4.0.0, I would suggest upgrading to ESAPI 2.4.0.0 regardless.)

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

What exactly do you want changed here? I believe I added these lines to simply get Fortify to stop complaining that there was a potential XXE vuln here. Even though we were reading a developer written XML file so there really wasn't a risk here in the first place. Are you suggesting we remove those lines? What compatibility breakage are you referring to?

from antisamy.

timcoleman avatar timcoleman commented on July 24, 2024

I'm suggesting putting a try/catch block around the lines in question.

    try
    {
        sTransformerFactory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        sTransformerFactory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
    }
    catch ( IllegalArgumentException e )
    {
        // JAXP 1.5 feature not supported
    }

The compatibility breakage that I'm referring to is with Apache Xerces. It does not implement JAXP 1.5, and trying to set those attributes with it will cause an exception to be thrown.

Oracle recommends catching the exception in https://docs.oracle.com/javase/tutorial/jaxp/properties/usingProps.html. I'm not sure what should happen in the catch block, but of course Fortify would complain if you left it empty.

I understand that if it doesn't pass a Fortify scan, then there is not much you can do about it. It seems to me that trying to set the attributes should be enough, and if an exception occurs, it is because of an older, possibly insecure, parser. But that shouldn't be an AntiSamy issue.
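
One way to avoid an empty catch block, sketched here under assumptions, is to turn the outcome into a return value that the caller can act on. The names `Jaxp15Probe` and `restrictExternalAccess` are hypothetical and do not exist in AntiSamy; what the caller does with a `false` result is left open, as in the comment above.

```java
import javax.xml.XMLConstants;
import javax.xml.transform.TransformerFactory;

class Jaxp15Probe {

    // Hypothetical helper: returns true when both JAXP 1.5 attributes were
    // accepted, false when the implementation rejects them.
    static boolean restrictExternalAccess(TransformerFactory factory) {
        try {
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
            return true;
        } catch (IllegalArgumentException e) {
            // The implementation does not recognise the JAXP 1.5 attributes.
            return false;
        }
    }

    public static void main(String[] args) {
        TransformerFactory factory = TransformerFactory.newInstance();
        System.out.println(factory.getClass().getName()
            + " accepts JAXP 1.5 attributes: " + restrictExternalAccess(factory));
    }
}
```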

from antisamy.

kwwall avatar kwwall commented on July 24, 2024

@davewichers - If you just did this to shut Fortify up, putting it in a try/catch block and just ignoring it will likely cause Fortify to gripe about that. If you log it and ignore it, you may be able to get away with that. But why not just use AWB to mark that particular Fortify finding as "Not an Issue" so it gets ignored thereafter? (Assuming you do scans combined with merges.)

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

@timcoleman - Doesn't AntiSamy itself provide the library in question through its pom? Seems like AntiSamy should be able to figure out exactly which Xerces implementation it normally uses here and figure out if it breaks or not.

from antisamy.

timcoleman avatar timcoleman commented on July 24, 2024

I'm pretty sure that no implementation of Xerces works with it.

from antisamy.

AnVillab99 avatar AnVillab99 commented on July 24, 2024

Hi, did you find a solution for this? I believe I'm having the same issue.

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

@timcoleman - Any chance you could just submit a pull request with your suggested change? Or @AnVillab99 - you could do it too. Sorry I lost track of this one. Been busy ....

from antisamy.

timcoleman avatar timcoleman commented on July 24, 2024

from antisamy.

hazendaz avatar hazendaz commented on July 24, 2024

Are you using Java 11+ when you hit this issue? If so, remove Xerces from the dependencies, as everything necessary is in that Java release. xercesImpl here is only needed for Java 7/8 usage, because the JDK back then did not fully include what is now used from Xerces. However, in newer JDKs they updated that and are JAXP 1.5 compliant, as they were in 7/8. We got around this a while ago by simply doing this for our newer JDK usage. It was honestly a hard one to figure out. If this is the same situation, you may want to give that a shot.

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

@timcoleman I'm trying to implement this fix in the 1.6.5 branch, before I merge it into main. However, I'm not able to replicate the issue you are seeing (and @hazendaz has seen too). Can (either of) you give me the EXACT command you are running, and the EXACT version of Java you are seeing this with? I tried: mvn clean test (Using Java 8, 11, and 16) and can't seem to cause the exception you are seeing. I suspect the cause is that you are actually including a different version of Xerces in your project than what AntiSamy uses by default. So in your situation this different Xerces implementation is being used instead of xerces:xercesImpl:2.12.1 which is the version included with AntiSamy in its pom.xml. Once I can replicate the problem, I can then vet your solution, or come up with a different one. For example, I might be able to pin AntiSamy's use of Xerces to: xerces:xercesImpl:2.12.1 (rather than whatever other version of Xerces Java finds on the classpath first).

from antisamy.

hazendaz avatar hazendaz commented on July 24, 2024

This issue is somewhat complicated and hard to articulate. My understanding of it is mostly second hand, and it is technically a Xerces issue. We dropped Xerces with JDK 11+ to resolve it, as it was not needed. I'm not 100% certain that this issue even pops up on Java 7/8, given that the Xerces documentation says to use the endorsed mechanism to drop to JAXP 1.4. So my write-up here may be slightly off and is likely only applicable to JDK 9+.

The issue is with what the JDK does when Xerces is added to the classpath. The JDK has included JAXP 1.5 since 7u40. Xerces is only JAXP 1.4 compliant. When Xerces is added to the classpath, the JDK stops using its own implementation, which results in dropping from JAXP 1.5 to 1.4 (I'm fuzzy on whether this is all JDKs or just 9+, where the endorsed mechanism no longer exists). A true fix would be getting Xerces to properly support JAXP 1.5; there was an issue for that in their Jira last I looked, but it seemed to have zero traction or concern.

So what is going on with AntiSamy? Recent versions require more code from Xerces that does not exist in the JDK prior to 9 (this is even commented in the pom). Java has its own implementation, which is still Xerces plus JAXP 1.5 etc., but not the same Xerces version that AntiSamy currently needs. So it's not possible to use AntiSamy on JDK 7/8 without xercesImpl. The code compiles without issue either way as far as the concern here (API usage) goes. It's at runtime, when that logic is hit, that the real problem shows up (which implementation gets used).

Where I've seen this most often is not AntiSamy but these parameters in general, wherever Xerces showed up. Given our use case was already JDK 11, it was easy to just drop Xerces without issue. This was further caught during code reviews, as developers my team supports kept trying to use the internal Sun implementation instead, since that also fixed the issue when Xerces was on the path; but not everything is Oracle JDK (and it's unclear whether that really worked or just made the issue silent).

So possibly the fix would be to know what Java version is being used. If Java 7/8, set the internal Sun class to force its hand, but that limits the solution to Oracle (if that is tested to work). I would, however, wrap Xerces in a profile and simply not include it on JDK 9+, so that at least users of JDK 9+ don't even see this issue. We also need feedback here on what JDK was involved.

from antisamy.

timcoleman avatar timcoleman commented on July 24, 2024

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

@hazendaz @timcoleman - Thanks for the explanations, but neither definitively describes how I can replicate the error that @timcoleman originally reported. Based on your explanations, it appears that if I upgrade to Java 11 (does it matter which version?), and then exclude XercesImpl from the pom, so it tries to use the Xerces built into Java 11+ (or maybe 9+), then I can cause the exceptions to occur that Tim originally reported as problematic?

Update: I tried excluding XercesImpl, and that doesn't work either. I get errors like:
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] /Users/Dave.Wichers/git/ANTISAMY/antisamy_main/src/main/java/org/owasp/validator/html/scan/MagicSAXFilter.java:[32,30] package org.apache.xerces.util does not exist
[ERROR] /Users/Dave.Wichers/git/ANTISAMY/antisamy_main/src/main/java/org/owasp/validator/html/scan/MagicSAXFilter.java:[33,30] package org.apache.xerces.util does not exist
[ERROR] /Users/Dave.Wichers/git/ANTISAMY/antisamy_main/src/main/java/org/owasp/validator/html/scan/MagicSAXFilter.java:[34,30] package org.apache.xerces.util does not exist
[ERROR] /Users/Dave.Wichers/git/ANTISAMY/antisamy_main/src/main/java/org/owasp/validator/html/scan/MagicSAXFilter.java:[35,29] package org.apache.xerces.xni does not exist
[ERROR] /Users/Dave.Wichers/git/ANTISAMY/antisamy_main/src/main/java/org/owasp/validator/html/scan/MagicSAXFilter.java:[36,29] package org.apache.xerces.xni does not exist
[ERROR] /Users/Dave.Wichers/git/ANTISAMY/antisamy_main/src/main/java/org/owasp/validator/html/scan/MagicSAXFilter.java:[37,29] package org.apache.xerces.xni does not exist
[ERROR] /Users/Dave.Wichers/git/ANTISAMY/antisamy_main/src/main/java/org/owasp/validator/html/scan/MagicSAXFilter.java:[38,29] package org.apache.xerces.xni does not exist
[ERROR] /Users/Dave.Wichers/git/ANTISAMY/antisamy_main/src/main/java/org/owasp/validator/html/scan/MagicSAXFilter.java:[39,29] package org.apache.xerces.xni does not exist
[ERROR] /Users/Dave.Wichers/git/ANTISAMY/antisamy_main/src/main/java/org/owasp/validator/html/scan/MagicSAXFilter.java:[40,36] package org.apache.xerces.xni.parser does not exist
...

So can one of you please provide the steps I need to follow to replicate the issue that Tim originally brought up in this thread? Without the ability to replicate the problem, I can't test/verify the fix. And this is holding up the release I'm trying to get out before the end of the year. And ideally the LAST version of AntiSamy before 1.7.0, which will drop Java 7 support.

from antisamy.

hazendaz avatar hazendaz commented on July 24, 2024

@hazendaz I also dived deep into this last night to confirm what I was saying, and got the same XNI errors. I'm looking to see whether there is anything to replace that with, but it seems a lot of the Xerces code used in AntiSamy is currently deprecated and says to use Xalan instead; nothing was exactly clean when trying to upgrade, and the Javadocs are so old that they only give vague suggestions for replacements. Just doing something about the XNI usage would fix this as I had noted, but I'm not sure what it can be replaced with. The remainder switches automatically to the Xerces included in the JDK without issue. In my test I was trying the profile approach, with Xerces included for JDK 7 and 8 only and removed for 11. Do you know if there is anything else that might work to replace that XNI usage?

I'm going to try to dig a bit deeper as well to see what exactly is in the JDK. It's greater than 2.9.0 for sure, as the deprecations throughout mention that, with JDK 11 at least. I did not check JDK 7/8 to see where it was then. I'm getting the feeling it's not quite 2.12.1 in JDK 11, which is a bit odd. Anyway, still looking...

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

@hazendaz - But how do I replicate the configuration that causes the IllegalArgumentException exception to be thrown in the first place? Do I have to add Xerces2-j (what coordinates?) or something? As removing XercesImpl and upgrading Java doesn't cause it.

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

@timcoleman - The title of this issue is that Xerces2-J doesn't play well with ... However, when I dump out the class that is instantiated by the static initializer, I get:

sTransformerFactory is of type: class com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl and
sTransformerFactory.newTransformer() is of type: class com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl

As such, what does this static initializer have to do with Xerces2-J? It appears that AntiSamy is using Xalan (which comes with the JDK) and not Xerces in this case. There might still be problem here, but I don't think Xerces2-J is the cause.

When you get this IllegalArgumentException exception, can you dump out the value of .getClass() for the sTransformerFactory and the sTransformerFactory.newTransformer() to be instantiated and let me know what they are?
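
A small diagnostic along these lines could be run in the affected application to answer that question. This is a sketch, not AntiSamy code: `TransformerDiagnostics` and `describe` are hypothetical names, and the reported location depends on the runtime (classes that belong to the JDK itself often report no code source).

```java
import java.security.CodeSource;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;

class TransformerDiagnostics {

    // Hypothetical helper: class name plus the jar or directory it came from.
    static String describe(Object instance) {
        Class<?> type = instance.getClass();
        CodeSource source = type.getProtectionDomain().getCodeSource();
        String location = (source == null || source.getLocation() == null)
            ? "no code source reported"
            : source.getLocation().toString();
        return type.getName() + " loaded from " + location;
    }

    public static void main(String[] args) throws TransformerConfigurationException {
        // Same lookup AntiSamy used before the factory class was pinned.
        TransformerFactory factory = TransformerFactory.newInstance();
        System.out.println("factory:     " + describe(factory));
        System.out.println("transformer: " + describe(factory.newTransformer()));
    }
}
```

A location that points at a Xalan jar rather than the JDK would indicate that a third-party `TransformerFactory` on the classpath is being selected.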

from antisamy.

timcoleman avatar timcoleman commented on July 24, 2024

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

@timcoleman - Thanks for the clarification. There is a way to force the use of: com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl no matter what. Thus it will ignore xalan-j 2.7.2 (or anything else that implements the TransformerFactory API for that matter). I think that's a better solution than wrapping the call in a try/catch and logging any exceptions. I can actually do both, but definitely prefer to force the use of an XML parser that I know supports the settings I've added.

@hazendaz @timcoleman - What do you think of that solution, before I implement it?

from antisamy.

hazendaz avatar hazendaz commented on July 24, 2024

@davewichers By force do you mean using com.sun.org.* directly? If so, wouldn't that limit to oracle jdk only? Or is there another way you are thinking?

from antisamy.

timcoleman avatar timcoleman commented on July 24, 2024

from antisamy.

timcoleman avatar timcoleman commented on July 24, 2024

from antisamy.

kwwall avatar kwwall commented on July 24, 2024

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

I just tested OpenJDK Runtime Environment AdoptOpenJDK-16.0.1+9 (build 16.0.1+9) and it returned: sTransformerFactory of type: class com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl, so I'm pretty sure this class is a standard part of the JDK, and different versions thereof. So not limited to Oracle JDK.

I just tried doing that in my code, and it works fine. I replaced this:

    private static final TransformerFactory sTransformerFactory = TransformerFactory.newInstance();

with:

    private static final TransformerFactory sTransformerFactory = new com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl();

I don't think that is limited to the Oracle JDK, because we are using Adoptium. But are there places where one might need a different transformer than the default implementation?

Tim

On Wed, 29 Dec 2021 at 17:33, Jeremy Landis wrote: @davewichers By force do you mean using com.sun.org.* directly? If so, wouldn't that limit to oracle jdk only? Or is there another way you are thinking?

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

I am trying to avoid any non-test logging. That's why I'm trying to figure out the real root cause, and make it just 'go away' so no logging is necessary, or I can put in logging, but it won't go anywhere unless the AntiSamy library user adds logging.

Thought you were trying to avoid doing logging except in your JUnit tests? So why not catch the Exception, and just rethrow it as RuntimeException with the original Exception as the 'cause'?

-kevin

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

@timcoleman @hazendaz - Guys - I just pushed a fix for this into the 1.6.5 branch. Can you test/review it and verify it fixes the issue in the way you expect? @kwwall - Can you review this new code change to verify it doesn't actually introduce any actual logging directly into AntiSamy, as no actual logging implementation is included.

from antisamy.

hazendaz avatar hazendaz commented on July 24, 2024

Not confirmed, but I looked up the IBM JDK, which is used with WebSphere, and per its documentation page it is built on OpenJDK and contains everything OpenJDK does, so I agree we are likely OK using the internal one. I won't be able to confirm on a WebSphere instance until next week, but I don't think that should hold anything up.

from antisamy.

kwwall avatar kwwall commented on July 24, 2024

I am trying to avoid any non-test logging. That's why I'm trying to figure out the real root cause, and make it just 'go away' so no logging is necessary, or I can put in logging, but it won't go anywhere unless the AntiSamy library user adds logging.

Thought you were trying to avoid doing logging except in your JUnit tests? So why not catch the Exception, and just rethrow it as RuntimeException with the original Exception as the 'cause'?

-kevin

@davewichers - My point is, why log this at all? It seems to me that the worst decision is to catch the exception, have AntiSamy log it, and then otherwise ignore the exception. The client code MUST be aware of the failure, so you need to signal that to them in some manner. IMO, an unchecked exception such as RuntimeException (or a subclass thereof) is the ideal way to do that, although if the particular method in question already throws a checked exception and it is documented in Javadoc, just update the reason in Javadoc. But just chain the original exception to yours. It pretty much will get logged. (At this point, if the method signature in question doesn't already have a checked exception in its 'throws' declaration, I would strongly argue for throwing as some unchecked exception so you don't break your SDK contract with clients and cause them to alter their code.)
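
As a rough illustration of the "rethrow unchecked, chain the cause" suggestion (a sketch under assumptions, not the change that was committed): the class `FailFastConfig`, the method `hardened`, and the choice of `IllegalStateException` are all hypothetical.

```java
import javax.xml.XMLConstants;
import javax.xml.transform.TransformerFactory;

class FailFastConfig {

    // Hypothetical helper: applies the restrictions or fails loudly, no logging.
    static TransformerFactory hardened(TransformerFactory factory) {
        try {
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
        } catch (IllegalArgumentException e) {
            // Rethrow with context; the original exception stays attached as
            // the cause, so whoever eventually logs it sees the full chain.
            throw new IllegalStateException(
                "TransformerFactory " + factory.getClass().getName()
                    + " rejected the JAXP 1.5 access restrictions", e);
        }
        return factory;
    }

    public static void main(String[] args) {
        TransformerFactory factory = hardened(TransformerFactory.newInstance());
        System.out.println("Hardened: " + factory.getClass().getName());
    }
}
```

Because the exception is unchecked, no method signature changes, and no logging dependency is needed in the library.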

The reason I am against logging is that until now, your use of logging has been limited to testing. And while from ESAPI's perspective I don't care all that much if slf4j-api.jar is there as a dependency in something other than test scope, you certainly do have other clients, many of whom may not be using SLF4J. For them, that means dragging in an additional dependency (probably 2: slf4j-api and something like slf4j-simple, since the first isn't terribly useful by itself for getting output). An alternative is to signal an error in some other manner and provide a new method for clients to retrieve the error messages.

But I would definitely argue against doing any sort of logging since you really don't have significant logging elsewhere, except in your test code. (By comparison, ESAPI uses a lot of internal logging, so adding additional logging here or there doesn't have nearly the impact on ESAPI clients as doing so in AntiSamy will for AntiSamy clients.)

Just my $.02.

from antisamy.

kwwall avatar kwwall commented on July 24, 2024

@kwwall - Can you review this new code change to verify it doesn't actually introduce any actual logging directly into AntiSamy, as no actual logging implementation is included.

@davewichers - I'd be glad to take a look if you can point me to a specific commit. Thanks.

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

@kwwall - The commit is already referred to above: 15d113f. I do like your idea of simply rethrowing the exception OR not catching it at all. @hazendaz @timcoleman - Any thoughts on what's best to do here? What if I simply remove the try/catch entirely and let the exception get thrown (since it shouldn't occur anymore)?

from antisamy.

timcoleman avatar timcoleman commented on July 24, 2024

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

@timcoleman - It shouldn't get thrown as long as that class is provided by Java. It's present up through Java 16 so far, and I even tested Amazon Corretto Java 11, and it includes it too. As such, I think I'm going to rip out the try/catch that was suggested, as it's not needed (unless there are any objections). Tim - can you test this branch in your setup to make sure my change fixes your issue? Or have you done that already?

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

OK. I removed all the new try/catch logic in the proposed pull request to fix this. The net change of all this was replacing:

    private static final TransformerFactory sTransformerFactory = TransformerFactory.newInstance();

with:
    private static final TransformerFactory sTransformerFactory =
        TransformerFactory.newInstance("com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl", null);

Which forces the use of the JDK provided Xalan SAX Parser which does support the required JAXP 1.5 features. That way, if there are apps using AntiSamy that also provide another SAXParser that this TransformerFactory might decide to instantiate instead, it won't because we force the JDK provided one to be instantiated.
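
The effect of pinning the factory class can be checked with a standalone sketch like the one below. `ForcedJdkFactory` and `create` are hypothetical names; the factory class name and the two attributes are the ones quoted in this thread. It assumes an OpenJDK-derived runtime where that internal class is visible to the caller; where it is not, `TransformerFactory.newInstance` would throw `TransformerFactoryConfigurationError`.

```java
import javax.xml.XMLConstants;
import javax.xml.transform.TransformerFactory;

class ForcedJdkFactory {

    // JDK-internal XSLTC factory named in the change described above.
    static final String JDK_XSLTC_FACTORY =
        "com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl";

    // Hypothetical helper, not AntiSamy code.
    static TransformerFactory create() {
        // Naming the class explicitly bypasses the usual JAXP lookup, so a
        // Xalan jar on the classpath is not selected. A null class loader
        // means the current thread's context class loader is used.
        TransformerFactory factory = TransformerFactory.newInstance(JDK_XSLTC_FACTORY, null);
        // JAXP 1.5 attributes, which the JDK implementation supports.
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
        return factory;
    }

    public static void main(String[] args) {
        System.out.println(create().getClass().getName());
    }
}
```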

This fix will be included in the 1.6.5 release which I hope to release within the week (hopefully just a few days).

from antisamy.

hazendaz avatar hazendaz commented on July 24, 2024

@davewichers Thanks for everything here. As soon as this is out, I can get a number of teams to upgrade and see if they are experiencing any further issues. We have a broad mix of WebSphere, Tomcat, and Spring Boot, so I can cover all of those; the recent issues were making teams reluctant to upgrade. My gut tells me this issue was the primary problem. With all the outside usage, I didn't really like teams trying to use Sun internals, but after all this it clearly appears that they are everywhere and this is the best approach.

from antisamy.

hazendaz avatar hazendaz commented on July 24, 2024

@rombert This should already be fixed in the 1.6.5 snapshot, as we had a lengthy discussion and rework on it. Even with what you have above, it would effectively not work when that exception is thrown and Xerces is on the path. AntiSamy requires Xerces (lots of code is directly tied to it). The Sun internals were confirmed to be part of most JDKs, including IBM's, even if that is not really the most appropriate fix. It does force it to work with Xalan. The information is useful for further reference :)

from antisamy.

rombert avatar rombert commented on July 24, 2024

@hazendaz - thanks for the quick reply. I'm not sure I understand what my suggestion has to do with xerces. The code path is related to Xalan only, and the Xalan transformer needs to support the XMLConstants.FEATURE_SECURE_PROCESSING feature, which it does.

FWIW, my diff is on top of d8fa2f4 . With the diff applied, I can use AntiSamy. Without, I don't ( 1.6.4 or 1.6.5-SNAPSHOT ).

from antisamy.

hazendaz avatar hazendaz commented on July 24, 2024

Per the Javadoc: 'true instructs the implementation to process XML securely. This may set limits on XML constructs to avoid conditions such as denial of service attacks.' So that would imply it doesn't always do it. It will allow you to set that, sure, but the documentation then goes on to discuss setting the other flags, as is done now. What is done now is to force the internals, which are in every OpenJDK-based version up through the current early-access 19 releases. When JDK 9+ is used, the endorsed mechanism is no longer present, and AntiSamy has a hard requirement on Xerces, which is really old. As such, it automatically switches to use that copy of Xerces unless told to use the internals. That is why 1.6.4 is broken. This change should have resolved that. The Javadocs further state that the above setting exists only in JAXP 1.5. The problem with Xerces on the path is that it downgrades to JAXP 1.4 unless the factory is hard-coded to use the internals.

So if I understand properly, I think the above would effectively turn this back off. My further understanding is that Xalan was meant to replace Xerces. Most of what AntiSamy still uses is deprecated classes that are stated to be out of date compared with what made it into Xalan, but there are performance issues in Xalan and what looks like a lot of work to upgrade. In both cases, what is used inside Java is the base plus extras that aren't available when adding those libraries on. My understanding is that Java will only use one or the other. The APIs are part of a different package, so technically, unless there is some hard-wired reason, neither should be on the classpath as an extra.

To give a real-world example, we secured code (not with AntiSamy) using the settings above, but obtaining the factory as you have (not overriding it). A downstream team complained their stuff did not work: same exceptions. We told them they were using Xerces, which they said they were not. We looked at their pom and they were. They excluded Xerces and it then worked properly. In that case we had zero dependency on Xerces internals, so we did not need to do what was done here. AntiSamy has to, due to its heavy use of Xerces internals as noted.

Can you provide more details such as stack trace you see off 1.6.5-SNAPSHOT, check your stack for usage of direct XALAN, we know you are using XERCES because of antisamy, and finally what jdk you are using (version + vendor)?

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

@rombert - I can't imagine how the change I've implemented here to force the use of the internal Xalan provided with all versions of the JDK could adversely affect anything. This only affects AntiSamy's use of this library. Whatever else you do with Xalan/Xerces/whatever in your app is not affected by this.

from antisamy.

rombert avatar rombert commented on July 24, 2024

Thanks for the extensive comments @hazendaz and @davewichers . My situation is a bit different as I actually want to force Xalan to be used, since that guarantees a baseline for all our deployments. We use OSGi, which has a different classloader model compared to Java SE and EE, which contributes to our very particular setup.

In other words - we want to be able to work with Xalan and not with the JRE-provided XML parser since that fails in obscure ways on various platforms that our consumers deploy on. Most of the time we can't ask them to tweak their deployments ( properties files, classloader delegation, endorsed libraries, etc ) and with a large consumer base it does not scale anyway.

As for an example, you can find one at https://github.com/apache/sling-org-apache-sling-xss/tree/issue/SLING-10953 .

$ git clone https://github.com/apache/sling-org-apache-sling-xss.git -b issue/SLING-10953
$ cd sling-org-apache-sling-xss/
$ mvn verify

The error that we see is

java.lang.ExceptionInInitializerError
        at org.owasp.validator.html.AntiSamy.scan(AntiSamy.java:129)
        at org.owasp.validator.html.AntiSamy.scan(AntiSamy.java:75)
        at org.apache.sling.xss.impl.HtmlToHtmlContentContext.getCleanResults(HtmlToHtmlContentContext.java:98)
        at org.apache.sling.xss.impl.HtmlToHtmlContentContext.filter(HtmlToHtmlContentContext.java:68)
        at org.apache.sling.xss.impl.XSSFilterImpl.filter(XSSFilterImpl.java:200)
        at org.apache.sling.xss.impl.XSSAPIImpl.filterHTML(XSSAPIImpl.java:428)
        at org.apache.sling.xss.impl.XSSAPIImplTest.testFilterHTML(XSSAPIImplTest.java:215)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
        at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
        at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
        at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
        at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
        at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
        at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
        at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
        at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
        at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:377)
        at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:138)
        at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:465)
        at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:451)
Caused by: java.lang.IllegalArgumentException: Not supported: http://javax.xml.XMLConstants/property/accessExternalDTD
        at org.apache.xalan.processor.TransformerFactoryImpl.setAttribute(TransformerFactoryImpl.java:571)
        at org.owasp.validator.html.scan.AntiSamySAXScanner.<clinit>(AntiSamySAXScanner.java:62)
        ... 38 more

Thinking about this some more, would you be open to accepting a PR that does the following

  1. Sets the two features that are now in 1.6.5-SNAPSHOT
  2. If that fails, attempts to enable the secure processing feature
  3. If that fails, rethrows the root cause as an ExceptionInInitializerError and fails the whole process

?
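A minimal sketch of that fallback order, using only standard JAXP calls. The class and method names (HardenedFactorySketch, harden) are hypothetical, and the second attribute is assumed to be ACCESS_EXTERNAL_STYLESHEET, since only accessExternalDTD shows up in the stack trace above. This is not AntiSamy's actual code.

```java
import javax.xml.XMLConstants;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;

public class HardenedFactorySketch {

    /** Hypothetical helper: applies the three-step fallback to the given factory. */
    static TransformerFactory harden(TransformerFactory factory) {
        try {
            // Step 1: the JAXP 1.5 attributes (the second one is an assumption).
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
            return factory;
        } catch (IllegalArgumentException notSupported) {
            try {
                // Step 2: fall back to the secure processing feature.
                factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
                return factory;
            } catch (TransformerConfigurationException e) {
                // Step 3: give up, reporting the original failure as the root cause.
                notSupported.addSuppressed(e);
                throw new ExceptionInInitializerError(notSupported);
            }
        }
    }

    public static void main(String[] args) {
        TransformerFactory factory = harden(TransformerFactory.newInstance());
        System.out.println("Hardened factory: " + factory.getClass().getName());
    }
}
```

Whether step 2 gives the same level of protection as step 1 depends on the TransformerFactory implementation, which is exactly the trade-off being asked about.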

I think that may be a reasonable compromise for getting AntiSamy to work in more environments. Of course, you may view this as not secure enough, which is totally fine, and I'll have to look for a workaround in the Sling project.

Thank you.

from antisamy.

rombert avatar rombert commented on July 24, 2024

In the meantime, we decided to update to 1.6.4 only and provide a custom TransformerFactory based on Xalan that accepts the new features and translates them to the 'secure' attribute. Since the original branch is gone, in case you want to view the failures for yourself you can find them at https://github.com/apache/sling-org-apache-sling-xss/tree/issue/antisamy-xml-failure .
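A rough sketch of what such a translating factory could look like; it is not the actual Sling implementation. The class name is hypothetical, the 'secure' attribute is assumed to mean FEATURE_SECURE_PROCESSING, and the platform default factory stands in for Xalan so that the example has no extra dependencies.

```java
import javax.xml.XMLConstants;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.Source;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.URIResolver;

public class SecureAttributeTranslatingFactory extends TransformerFactory {

    // Assumption: a real setup would create the Xalan factory here. Do not register
    // this class as the JAXP default, or newInstance() would recurse into itself.
    private final TransformerFactory delegate = TransformerFactory.newInstance();

    @Override
    public void setAttribute(String name, Object value) {
        if (XMLConstants.ACCESS_EXTERNAL_DTD.equals(name)
                || XMLConstants.ACCESS_EXTERNAL_STYLESHEET.equals(name)) {
            try {
                // Translate the JAXP 1.5 attributes into secure processing.
                delegate.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            } catch (TransformerConfigurationException e) {
                throw new IllegalArgumentException("Secure processing not supported", e);
            }
            return;
        }
        delegate.setAttribute(name, value);
    }

    // Everything below simply forwards to the delegate.
    @Override public Object getAttribute(String name) { return delegate.getAttribute(name); }
    @Override public void setFeature(String name, boolean value)
            throws TransformerConfigurationException { delegate.setFeature(name, value); }
    @Override public boolean getFeature(String name) { return delegate.getFeature(name); }
    @Override public Transformer newTransformer(Source source)
            throws TransformerConfigurationException { return delegate.newTransformer(source); }
    @Override public Transformer newTransformer()
            throws TransformerConfigurationException { return delegate.newTransformer(); }
    @Override public Templates newTemplates(Source source)
            throws TransformerConfigurationException { return delegate.newTemplates(source); }
    @Override public Source getAssociatedStylesheet(Source source, String media,
            String title, String charset) throws TransformerConfigurationException {
        return delegate.getAssociatedStylesheet(source, media, title, charset);
    }
    @Override public void setURIResolver(URIResolver resolver) { delegate.setURIResolver(resolver); }
    @Override public URIResolver getURIResolver() { return delegate.getURIResolver(); }
    @Override public void setErrorListener(ErrorListener listener) { delegate.setErrorListener(listener); }
    @Override public ErrorListener getErrorListener() { return delegate.getErrorListener(); }

    public static void main(String[] args) {
        TransformerFactory factory = new SecureAttributeTranslatingFactory();
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        System.out.println("Secure processing enabled: "
                + factory.getFeature(XMLConstants.FEATURE_SECURE_PROCESSING));
    }
}
```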

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

@rombert - Given the implementation just described, I'm going to make one minor change to the new code to address this issue. I'm going to read a property value containing the name of the class to instantiate. I won't actually create such a property file, and the default will be the class I currently use. But it will allow apps like yours to set that property value to a different class (your custom implementation in this case). That class will still have to implement these features (somehow). Does that seem like a good idea to you? This would allow you to upgrade to AntiSamy 1.6.5+ pretty easily.

from antisamy.

rombert avatar rombert commented on July 24, 2024

@davewichers - is that a property file that will be looked up in the classpath? That would work for me, yes, since we embed it in the jar and that will ( IIRC ) be looked up first.

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

@rombert - yes. Standard Java property mechanism, which allows you to override it with whatever you want. I'll let you know when I've implemented the change so you can test it first, before we release.

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

@rombert - I just implemented this change in 7ff740d. Can you test it to see whether it would work in your situation, allowing you to use a custom implementation while AntiSamy uses Java's built-in TransformerFactory implementation by default?

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

FYI. The Java system property I added to make the XML parser used by AntiSamySAXScanner configurable is: "antisamy.transformerfactory.impl". By default, it uses: com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl, but this property allows you to change it. Whatever you change it to must implement the two attributes this class expects. If it doesn't, the static initializer will throw an Exception.
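For illustration only, a lookup along these lines can be sketched as follows. This is not the code from the commit: the class and method names are hypothetical, the second attribute is assumed to be ACCESS_EXTERNAL_STYLESHEET, and the default class name is only expected to resolve on OpenJDK-derived runtimes.

```java
import javax.xml.XMLConstants;
import javax.xml.transform.TransformerFactory;

public class ConfigurableFactorySketch {

    static final String PROPERTY = "antisamy.transformerfactory.impl";
    static final String DEFAULT_IMPL =
            "com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl";

    /** Hypothetical helper: resolves the factory class from the system property. */
    static TransformerFactory createFactory() {
        // Use the configured class name, or the JDK's built-in factory if unset.
        String className = System.getProperty(PROPERTY, DEFAULT_IMPL);
        TransformerFactory factory = TransformerFactory.newInstance(className, null);

        // A factory that rejects either attribute throws IllegalArgumentException,
        // which would surface from a static initializer as ExceptionInInitializerError.
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
        return factory;
    }

    public static void main(String[] args) {
        System.out.println("Using: " + createFactory().getClass().getName());
    }
}
```

Running it with -Dantisamy.transformerfactory.impl=<class name> on the command line would select a different implementation, provided that class is on the classpath.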

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

Closing this issue as these changes went out with the 1.6.5 release I just pushed. @rombert - I would still appreciate some feedback from you after you test the new Java property I created to make the TransformerFactoryImpl used configurable.

from antisamy.

rombert avatar rombert commented on July 24, 2024

@davewichers - I've added a task on the Sling side to track the update. I don't think it's going to happen in the following days, given that we already made the effort to move to 1.6.4 and don't have an urgent need to move to 1.6.5 .

https://issues.apache.org/jira/browse/SLING-11111

That being said, this might be problematic for us since it uses system properties, and that is something we have no influence over when customers deploy Sling-based applications. At any rate, we'll come back with a clear report in case it does not work for us.

Thanks!

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

@rombert - Your library should be able to set this property itself, before invoking AntiSamy and causing it to initialize itself. That way, users of your library would automatically pick up this property, and not have to do anything, or even know anything, about it. This should work fine as long as they don't import/use AntiSamy directly BEFORE your initialization of it. I know this isn't perfect, but I think the likelihood of such a conflict is very low.

Also, I think you're going to have to switch to this property based approach if you ever want to upgrade AntiSamy again, as I don't see any way around it at this point.
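A small sketch of a library setting the property itself before AntiSamy is first touched. The class name, method name and the com.example.xss.CustomTransformerFactory value are all hypothetical placeholders.

```java
public class XssBundleInitSketch {

    static final String PROPERTY = "antisamy.transformerfactory.impl";
    // Hypothetical class name standing in for the library's own factory.
    static final String CUSTOM_IMPL = "com.example.xss.CustomTransformerFactory";

    /** Must run before any AntiSamy class is loaded and initialized. */
    static void configureAntiSamy() {
        // Respect a value the deployer set explicitly; otherwise install ours.
        if (System.getProperty(PROPERTY) == null) {
            System.setProperty(PROPERTY, CUSTOM_IMPL);
        }
    }

    public static void main(String[] args) {
        configureAntiSamy();
        // Only after this point should the library call into AntiSamy.
        System.out.println(PROPERTY + " = " + System.getProperty(PROPERTY));
    }
}
```

Leaving an already-set value alone is a design choice in this sketch. A library that depends on its own factory might prefer to overwrite the property unconditionally.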

from antisamy.

rombert avatar rombert commented on July 24, 2024

Ack @davewichers - that seems to work for us.

from antisamy.

azmau avatar azmau commented on July 24, 2024

I discovered an unwanted side effect of this fix in my project, and this seems like the best spot to whine about it. About 3 months ago I upgraded to AntiSamy 1.6.5 and also managed to remove xalan-j 2.7.1 from the project's dependencies. All good at the surface level: the new XXE protection from AntiSamy is in place and the default com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl is at work. However, we began to notice that the transformer outputs more than it did before. More precisely, for a benign input of <p>abc</p>, scan() now produces <html><head xmlns=\"http://www.w3.org/1999/xhtml\" /><body><p>abc</p></body></html>, whereas in the past getCleanedEntry() returned the same value as the input.
Is there a way to instruct AntiSamy not to add the extra HTML wrapper tags? Is this in fact the intended behavior of AntiSamy, and did the previous mix-up with Xalan merely make it work incorrectly in a way that happened to suit my needs? Can I not scan just a fragment of HTML?
Any piece of advice is highly appreciated. Thanks in advance!

from antisamy.

davewichers avatar davewichers commented on July 24, 2024

@spassarop - Hey Sebastian, this seems more like something you can best answer. Can you respond to this question?

from antisamy.

spassarop avatar spassarop commented on July 24, 2024

Sorry, but I've used v1.6.5 and could not reproduce the described behavior, at least with scan() and the default policy.

@azmau could you provide a more detailed test configuration showing which directives are in place in your policy?

Certain "head" elements, such as <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">, are omitted under the default policy. That is controlled by the omitDoctypeDeclaration directive being set to true; there is also an omitXmlDeclaration directive. However, I cannot guarantee this applies to you.

from antisamy.

azmau avatar azmau commented on July 24, 2024

@spassarop , @davewichers I really appreciate your involvement; however, I was not able to create a small test project to highlight my problem. My own project has lots of dependencies, and maybe they produce some conflicts that I could not pin down well enough to reproduce. ESAPI would be my next suspect, but it's really just a hunch.

However, I managed to update my project to AntiSamy 1.6.8 and somehow that got rid of those extra wrappers. For me that is one more mystery, but it is a mystery I can live with. I don't know if this information helps you in any way.

Many thanks again for your help!

from antisamy.

kwwall avatar kwwall commented on July 24, 2024

from antisamy.

azmau avatar azmau commented on July 24, 2024

@kwwall I should have been more precise from the start: in my project we have ESAPI 2.2.3.1. I only mentioned it because I thought it might ring a bell for one of you; on my side it is a pure hunch, and I did no further analysis in this direction.

from antisamy.
