mrkkrp / modern-uri
Modern library for working with URIs
License: Other
It seems that it is rejected here, but I can't find anything in the standard, or with online URI validators, that explains why such a query string should be rejected. Context: I have real-life queries like this in my data...
Does this library implement relative URI resolution as defined in section 5.2 of RFC 3986?
If not, would you accept a pull request to do this?
Hey,
Thanks for maintaining URI!
Is it intentional that a URI representing http://example.com (e.g. when parsed with mkURI) is rendered as http://example.com/? I find the trailing slashes a little ugly when showing URLs to people.
Cheers
I realised that round tripping a data uri breaks it. E.g.
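import Data.Text (Text)
import Text.URI
-- Assumes OverloadedStrings; reported to evaluate to True, i.e. the round trip changes the URI.
let uri = "data:image/gif;base64,R0lGODlhAQABAPAAAP///wAAACH5BAEAAAAALAAAAAABAAEAAAICRAEAOw==" :: Text
in (render <$> mkURI uri) /= Just uri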
I assume the data scheme is not RFC-compliant, so it's probably not a bug.
Nevertheless, I wanted to let you know, since it took me half a day to figure out why some of these URIs are broken. Maybe it is possible to have them round-trip correctly (e.g. by allowing empty path elements), or at least to fail completely? Silently dropping slashes seems to me like unintended behaviour.
The modern-uri library doesn't build with ghc-9.0-rc1. When trying to build with the new compiler, the following compilation errors occur:
Text/URI/Types.hs:126:10: error:
• Couldn't match type ‘m’ with ‘TH.Q’
Expected: URI -> m TH.Exp
Actual: URI -> TH.Q TH.Exp
‘m’ is a rigid type variable bound by
the type signature for:
TH.lift :: forall (m :: * -> *). TH.Quote m => URI -> m TH.Exp
at Text/URI/Types.hs:126:3-6
• In the expression: liftData
In an equation for ‘TH.lift’: TH.lift = liftData
In the instance declaration for ‘TH.Lift URI’
• Relevant bindings include
lift :: URI -> m TH.Exp (bound at Text/URI/Types.hs:126:3)
|
126 | lift = liftData
| ^^^^^^^^
Text/URI/Types.hs:129:15: error:
• Couldn't match type ‘TH.TExp a0’ with ‘URI’
Expected: URI -> TH.Code m URI
Actual: URI -> TH.Code m (TH.TExp a0)
• In the expression: TH.unsafeTExpCoerce . TH.lift
In an equation for ‘TH.liftTyped’:
TH.liftTyped = TH.unsafeTExpCoerce . TH.lift
In the instance declaration for ‘TH.Lift URI’
|
129 | liftTyped = TH.unsafeTExpCoerce . TH.lift
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Text/URI/Types.hs:169:10: error:
• Couldn't match type ‘m’ with ‘TH.Q’
Expected: Authority -> m TH.Exp
Actual: Authority -> TH.Q TH.Exp
‘m’ is a rigid type variable bound by
the type signature for:
TH.lift :: forall (m :: * -> *).
TH.Quote m =>
Authority -> m TH.Exp
at Text/URI/Types.hs:169:3-6
• In the expression: liftData
In an equation for ‘TH.lift’: TH.lift = liftData
In the instance declaration for ‘TH.Lift Authority’
• Relevant bindings include
lift :: Authority -> m TH.Exp (bound at Text/URI/Types.hs:169:3)
|
169 | lift = liftData
| ^^^^^^^^
Text/URI/Types.hs:172:15: error:
• Couldn't match type ‘TH.TExp a1’ with ‘Authority’
Expected: Authority -> TH.Code m Authority
Actual: Authority -> TH.Code m (TH.TExp a1)
• In the expression: TH.unsafeTExpCoerce . TH.lift
In an equation for ‘TH.liftTyped’:
TH.liftTyped = TH.unsafeTExpCoerce . TH.lift
In the instance declaration for ‘TH.Lift Authority’
|
172 | liftTyped = TH.unsafeTExpCoerce . TH.lift
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Text/URI/Types.hs:195:10: error:
• Couldn't match type ‘m’ with ‘TH.Q’
Expected: UserInfo -> m TH.Exp
Actual: UserInfo -> TH.Q TH.Exp
‘m’ is a rigid type variable bound by
the type signature for:
TH.lift :: forall (m :: * -> *). TH.Quote m => UserInfo -> m TH.Exp
at Text/URI/Types.hs:195:3-6
• In the expression: liftData
In an equation for ‘TH.lift’: TH.lift = liftData
In the instance declaration for ‘TH.Lift UserInfo’
• Relevant bindings include
lift :: UserInfo -> m TH.Exp (bound at Text/URI/Types.hs:195:3)
|
195 | lift = liftData
| ^^^^^^^^
Text/URI/Types.hs:198:15: error:
• Couldn't match type ‘TH.TExp a2’ with ‘UserInfo’
Expected: UserInfo -> TH.Code m UserInfo
Actual: UserInfo -> TH.Code m (TH.TExp a2)
• In the expression: TH.unsafeTExpCoerce . TH.lift
In an equation for ‘TH.liftTyped’:
TH.liftTyped = TH.unsafeTExpCoerce . TH.lift
In the instance declaration for ‘TH.Lift UserInfo’
|
198 | liftTyped = TH.unsafeTExpCoerce . TH.lift
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Text/URI/Types.hs:221:10: error:
• Couldn't match type ‘m’ with ‘TH.Q’
Expected: QueryParam -> m TH.Exp
Actual: QueryParam -> TH.Q TH.Exp
‘m’ is a rigid type variable bound by
the type signature for:
TH.lift :: forall (m :: * -> *).
TH.Quote m =>
QueryParam -> m TH.Exp
at Text/URI/Types.hs:221:3-6
• In the expression: liftData
In an equation for ‘TH.lift’: TH.lift = liftData
In the instance declaration for ‘TH.Lift QueryParam’
• Relevant bindings include
lift :: QueryParam -> m TH.Exp (bound at Text/URI/Types.hs:221:3)
|
221 | lift = liftData
| ^^^^^^^^
Text/URI/Types.hs:224:15: error:
• Couldn't match type ‘TH.TExp a3’ with ‘QueryParam’
Expected: QueryParam -> TH.Code m QueryParam
Actual: QueryParam -> TH.Code m (TH.TExp a3)
• In the expression: TH.unsafeTExpCoerce . TH.lift
In an equation for ‘TH.liftTyped’:
TH.liftTyped = TH.unsafeTExpCoerce . TH.lift
In the instance declaration for ‘TH.Lift QueryParam’
|
224 | liftTyped = TH.unsafeTExpCoerce . TH.lift
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Text/URI/Types.hs:267:10: error:
• Couldn't match type ‘m’ with ‘TH.Q’
Expected: RText l -> m TH.Exp
Actual: RText l -> TH.Q TH.Exp
‘m’ is a rigid type variable bound by
the type signature for:
TH.lift :: forall (m :: * -> *). TH.Quote m => RText l -> m TH.Exp
at Text/URI/Types.hs:267:3-6
• In the expression: liftData
In an equation for ‘TH.lift’: TH.lift = liftData
In the instance declaration for ‘TH.Lift (RText l)’
• Relevant bindings include
lift :: RText l -> m TH.Exp (bound at Text/URI/Types.hs:267:3)
|
267 | lift = liftData
| ^^^^^^^^
Text/URI/Types.hs:270:15: error:
• Couldn't match type: TH.TExp a4
with: RText l
Expected: RText l -> TH.Code m (RText l)
Actual: RText l -> TH.Code m (TH.TExp a4)
• In the expression: TH.unsafeTExpCoerce . TH.lift
In an equation for ‘TH.liftTyped’:
TH.liftTyped = TH.unsafeTExpCoerce . TH.lift
In the instance declaration for ‘TH.Lift (RText l)’
• Relevant bindings include
liftTyped :: RText l -> TH.Code m (RText l)
(bound at Text/URI/Types.hs:270:3)
|
270 | liftTyped = TH.unsafeTExpCoerce . TH.lift
|
The Cabal version bounds prevent building with GHC 9.2, because the TH bound is <2.18.
I wonder if there are any concatenation operators for the parts of a complex URI, similar to </> from the filepath library. For example, I want to have:
https://markkarpov.com
https://markkarpov.com/foo
https://markkarpov.com/bar
https://markkarpov.com/bar/baz
Sure, I can write four hardcoded quasi-quotes. But I wonder what the preferred and simplest way to do this is while reusing the first URI.
I found the mkPathPiece smart constructor but have no idea how to reuse it...
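One way this could be approached is sketched below. The operator name `+/+` and the `siteURIs` example are hypothetical and not part of modern-uri; the sketch assumes a version of the library in which `uriPath` has the type `Maybe (Bool, NonEmpty (RText 'PathPiece))`, as shown in the parse results elsewhere on this page. It splits the text on `/`, turns every non-empty segment into a path piece with `mkPathPiece`, and appends the pieces to the existing path without a trailing slash.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.List.NonEmpty as NE
import Data.Text (Text)
import qualified Data.Text as T
import Text.URI (URI (..), mkPathPiece, mkURI)

-- | Hypothetical operator (not provided by modern-uri): append
-- slash-separated segments to the path of a URI. Returns Nothing if a
-- segment is rejected by 'mkPathPiece'.
(+/+) :: URI -> Text -> Maybe URI
uri +/+ rel = do
  -- Empty segments (from leading, trailing or doubled slashes) are dropped.
  new <- traverse mkPathPiece (filter (not . T.null) (T.splitOn "/" rel))
  let old = maybe [] (NE.toList . snd) (uriPath uri)
  pure $
    if null new
      then uri -- nothing to append, keep the URI unchanged
      else uri {uriPath = (,) False <$> NE.nonEmpty (old ++ new)}

-- | The four URIs from the question, all derived from the first one.
siteURIs :: Maybe [URI]
siteURIs = do
  base <- mkURI "https://markkarpov.com"
  foo <- base +/+ "foo"
  bar <- base +/+ "bar"
  baz <- bar +/+ "baz"
  pure [base, foo, bar, baz]
```

The result is wrapped in Maybe because `mkPathPiece` can fail; with a base URI and segments known at compile time, the quasi-quoters may still be the simpler choice.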
It's not clear from the current documentation and examples whether modern-uri performs URL encoding or not.
I wasn't sure if file:///... is actually a valid URI, so I checked the RFC, and it gives this example: file:///etc/hosts (section 1.1). However, modern-uri can't parse this:
Prelude Text.URI> mkURI "file:///etc/hosts"
*** Exception: ParseException "file:///etc/hosts" (TrivialError (SourcePos {sourceName = "", sourceLine = Pos 1, sourceColumn = Pos 8} :| []) (Just (Tokens ('/' :| ""))) (fromList [Tokens ('%' :| ""),Tokens ('[' :| ""),Label ('A' :| "SCII alpha-numeric character"),Label ('i' :| "nteger"),Label ('u' :| "sername")]))
For comparison, network-uri parses this:
Prelude Network.URI> parseURI "file:///etc/hosts"
Just file:///etc/hosts
e.g. for absolute URIs only, etc. One can always validate after the fact, but that is potentially less efficient. Might be easier to add a Parser or Internal module exposing all the bits for people to make their own sub parsers.
CC @luigy
In the current implementation, relativeTo returns Nothing if the base URI doesn't have a scheme. If the base URI is provided at compile time, I can know that it has a scheme, but there's no way for relativeTo to know. That leaves me in an awkward situation with a Maybe URI that I "know" will contain a value, but unwrapping it with a partial function is brittle, because either the base URI or the implementation of relativeTo could change in the future. Would it be possible to generate a function for a base URI provided at compile time, so that it always succeeds at runtime (no Maybe)? Something like this:
relativeToTest :: URI -> URI
relativeToTest = [relativeToUri|http://test.com/|]
I don't see a way to represent URLs whose path consists only of a trailing slash, e.g. https://example.com/. Parsing these URLs and rendering them gives different results, so the following 2 tests fail:
it "1" $
render [QQ.uri|https://example.com/|] `shouldBe` "https://example.com/"
it "2" $
let given = "https://example.com/"
in (render <$> mkURI given) `shouldBe` Just given
If uriPath is Nothing, I can't specify whether I want a trailing slash or not. Otherwise, I can't give an empty path, since path pieces are a NonEmpty list.
Hi, would you accept a patch that replaces the RText machinery with a bunch of newtypes? i.e.
newtype Scheme = Scheme { unScheme :: Text }
newtype Host = Host { unHost :: Text }
... etc ...
I think it would make the library a bit friendlier to beginners :)
This gives weird behaviour:
> mkURI "https://104.155.144.4.sslip.io:443/"
The host gets chopped, and half of it, along with the port, ends up in the uriPath:
URI {
uriScheme = Just "https",
uriAuthority = Right (Authority {authUserInfo = Nothing, authHost = "104.155.144.4", authPort = Nothing}),
uriPath = Just (True,".sslip.io:443" :| []),
uriQuery = [],
uriFragment = Nothing
}
I think it may be true that according to some RFCs there shouldn't be numbers in subdomains, but in practice these types of domains exist and work fine everywhere.
In any case, when I roundtrip the current result we get back the following, which is different than the original:
> render uri
"https://104.155.144.4/.sslip.io%3a443/"
Btw the example url I am using is a real url, part of the sslip.io service: https://104.155.144.4.sslip.io:443/
Thank you
For a person who has never worked with uri or uri-bytestring, or for a Haskell beginner, it's not very easy to understand how to use the modern-uri package and why one should do so.
The latest modern-uri release is one of the few packages that doesn't build with bytestring-0.11.0.0, not even with --allow-newer. I see that you've already fixed the issue in 7943426#diff-008cf87a8b382cac22984446b6a6a3587d4994a8df780cf3d174f8ca2aff5c10. Could you make a release?! Thanks!
In your next release, it would be great if you could include a Hashable instance for URI :)
One thing I am having a hard time seeing is how one adds a relative path to a URI.
For me it would be quite helpful to have a simple combinator like:
url +/+ dir
(ie 'https://example.org/pub' +/+ 'some/subdir' == 'https://example.org/pub/some/subdir')
I dunno if the lens module makes this easier - but I generally try to avoid them.
Actually, when I use relativeTo, it replaces the path from the base URI with that of the relative URI path.
Is there any smarter way to handle such changes?
This is a follow-up to mrkkrp/req#102. I am going to try to convince you that modern-uri implements RFC 3986 too naively.
I will base my case on two main arguments:
The Reserved Characters section lists [ and ] as gen-delims, which is part of reserved. Then it says:
URI producing applications should percent-encode data octets that
correspond to characters in the reserved set unless these characters
are specifically allowed by the URI scheme to represent data in that
component.
This specification does not explicitly state that it is using RFC 2119 keywords, so it is not entirely clear to me how to interpret this “should”, but, I believe, the best option we have is to interpret it as per RFC 2119, and thus web-apps not escaping square brackets are likely in compliance. This alone should be enough to allow them.
It then goes on:
If a reserved character is found in a URI component and
no delimiting role is known for that character, then it must be
interpreted as representing the data octet corresponding to that
character's encoding in US-ASCII.
Now this “no delimiting role is known” part is pretty mysterious. The paragraph right above the one I have been quoting seems to be an attempt at explaining what this means, but I am having a hard time understanding it.
If we go now to the Query section, it reads:
The query component is indicated by the first question
mark ("?") character and terminated by a number sign ("#") character
or by the end of the URI.
which suggests that anything that is not a # is not a delimiter or, at least, the delimiting role of [ and ] is unknown (to me 🙂). This part of the text may be non-normative and it contradicts the ABNF right below it, but the specification never explains what is normative and what is informative. The intent seems to be to disallow the square brackets, but it is not that obvious, in my opinion.
Given how ambiguous all this is, I am not convinced that implementing exactly the ABNFs from the specification is enough.
The discussion in the req issue was based around the handling of square brackets in real-world browsers. A couple of years ago there was a discussion about this in the Firefox issue tracker and there is a fantastic survey of behaviours of various browsers.
They surveyed specifically what the browsers were sending to the server, while the discussion on req was about “my browser converts” – I am not sure what exactly you were testing. Note that the actual URL location in that issue was already encoded (likely, by GitHub’s markdown processor): the HTML code of the link was <a href="https://gitlab.com/morley-framework/morley/-/issues/new?issue%5Btitle%5D=Indigo%20website:%20" rel="nofollow">https://gitlab.com/morley-framework/morley/-/issues/new?issue[title]=Indigo%20website:%20</a>. And even when I follow this encoded link in Firefox, it decodes the brackets. This is a result of that survey I linked – as you can see, Firefox was the only browser forcefully encoding square brackets, so this was considered a bug and was changed.
I can’t make a hyperlink with unencoded brackets here on GitHub, but if you paste https://gitlab.com/morley-framework/morley/-/issues/new?issue[title]=Indigo%20website:%20 into the address bar of your Google Chrome it will (I hope – at least that is what I see testing in Chromium) not encode the brackets and send them as is.
Thus, we can see that, regardless of what the specification has to say, in practice the square brackets are thought to be allowed in the query component and not allowing them was fixed as a bug in a major web-browser.
Version: modern-uri-0.3.4.4 (also occurs on older versions)
Colons in path pieces are percent-encoded, while it seems to me from https://www.rfc-editor.org/rfc/rfc3986#section-3.3 that they can appear unencoded from the second path piece on.
[nix-shell:~/]$ ghci
GHCi, version 9.0.2: https://www.haskell.org/ghc/ :? for help
ghci> import Data.Text
ghci> import Text.URI
ghci> Right u = mkURI (pack "https://mybusinessbusinessinformation.googleapis.com/v1/categories:batchGet")
ghci> render u
"https://mybusinessbusinessinformation.googleapis.com/v1/categories%3abatchGet"
This causes problems with, for example, Google, which uses colons in paths but does not accept the percent-encoded variant.
I don't see any limitations on port numbers in https://tools.ietf.org/html/rfc3986 itself, but I'd expect parsing to fail when port numbers are >= 2^16, at least for the schemes that declare ports to be 16-bit numbers. Is this a bug?
> mkURI "http://localhost:65535"
URI {uriScheme = Just "http", uriAuthority = Right (Authority {authUserInfo = Nothing, authHost = "localhost", authPort = Just 65535}), uriPath = Nothing, uriQuery = [], uriFragment = Nothing}
> mkURI "http://localhost:65536"
URI {uriScheme = Just "http", uriAuthority = Right (Authority {authUserInfo = Nothing, authHost = "localhost", authPort = Just 65536}), uriPath = Nothing, uriQuery = [], uriFragment = Nothing}
> mkURI "http://localhost:65536000"
URI {uriScheme = Just "http", uriAuthority = Right (Authority {authUserInfo = Nothing, authHost = "localhost", authPort = Just 65536000}), uriPath = Nothing, uriQuery = [], uriFragment = Nothing}
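Until the library decides either way, the range check can be done after parsing. The following is a minimal sketch, not part of modern-uri: the names `maxPort`, `portInRange` and `mkURIChecked` are made up for illustration, and it assumes `authPort :: Maybe Word` and `uriAuthority :: Either Bool Authority`, which matches the output shown above.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Text.URI (Authority (..), URI (..), mkURI)

-- | Largest port number that fits in 16 bits.
maxPort :: Word
maxPort = 65535

-- | Hypothetical helper: True when the URI has no authority, no port,
-- or a port that fits in 16 bits.
portInRange :: URI -> Bool
portInRange uri = case uriAuthority uri of
  Right auth -> maybe True (<= maxPort) (authPort auth)
  Left _ -> True

-- | Parse with 'mkURI', then reject URIs whose port is out of range.
mkURIChecked :: Text -> Maybe URI
mkURIChecked t = do
  uri <- mkURI t
  if portInRange uri then Just uri else Nothing

-- | Example input taken from the report above.
outOfRange :: Maybe URI
outOfRange = mkURIChecked "http://localhost:65536"
```

This validates after the fact, so the input is still fully parsed before being rejected; a check inside the parser would be the library author's call.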
Heya, thanks for the great library. Working with it I need to "update" the value of an existing QueryParam from a URI, and while writing a function to do this, it surprised me that something like it didn't exist in the library.
I'm not sure if you think this would be a nice thing to add to this library, or perhaps this is easy to do somehow with lenses (I do not know lenses very well, so it might be and I just don't know).
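For reference, such a function can be written without lenses using record update syntax. This is only a sketch of what I mean: `setQueryValue`, `setQueryValueText` and `example` are hypothetical names, not library functions, and the sketch replaces the value of every `QueryParam` with a matching key while leaving `QueryFlag` entries and other keys untouched.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Text.URI
  ( QueryParam (..),
    RText,
    RTextLabel (..),
    URI (..),
    mkQueryKey,
    mkQueryValue,
    mkURI,
  )

-- | Hypothetical helper: set the value of every parameter with the given key.
setQueryValue :: RText 'QueryKey -> RText 'QueryValue -> URI -> URI
setQueryValue key val uri = uri {uriQuery = map update (uriQuery uri)}
  where
    update (QueryParam k _) | k == key = QueryParam k val
    update other = other -- flags and non-matching keys are kept as they are

-- | Same as above, but takes raw Text; Nothing if the key or value is
-- rejected by the smart constructors.
setQueryValueText :: Text -> Text -> URI -> Maybe URI
setQueryValueText k v uri = do
  key <- mkQueryKey k
  val <- mkQueryValue v
  pure (setQueryValue key val uri)

-- | Example with a made-up URL: change page=1 to page=2.
example :: Maybe URI
example =
  mkURI "https://example.org/search?page=1&sort=asc"
    >>= setQueryValueText "page" "2"
```

If the key is absent, the URI is returned unchanged; appending the parameter in that case would be a different design choice.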