Releases: karwa/swift-url
0.4.2
It's been a long time. This release mostly just works around an issue with the Swift 6 beta standard library.
What's Changed
- Port setter should not throw if setting a nil port to a URL which doesn't support a port by @karwa in #177
- Work around a source compatibility break in the Swift 6 stdlib by @karwa in #189
Full Changelog: 0.4.1...0.4.2
0.4.1
TSan Workaround
This release includes a workaround for a bug in TSan.
PR #168
Issue #166 (thanks to @shadowfacts for the report)
TSan's internal bookkeeping seems to be corrupted if you pass around an empty struct as an inout parameter. This pattern is sometimes used by generic algorithms (for example, the standard library's SystemRandomNumberGenerator is an empty struct), and is used internally by WebURL. There is no actual data race, but the corruption of TSan's bookkeeping data can lead to spurious reports of data races or even null-pointer dereferences within the TSan runtime.
To work around this, we add an unused field to these empty structs in debug builds.
Related bug reports: swiftlang/swift#61073 swiftlang/swift#61244 and swiftlang/swift#56405
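As a rough illustration of the workaround described above, here is a minimal sketch of an empty struct that gains an unused field in debug builds. The names (Visitor, NoOpVisitor, walk) are hypothetical and are not WebURL's internal types, and the sketch assumes the DEBUG compilation condition is defined for debug builds (as SwiftPM does by default).

```swift
/// A hypothetical protocol for a generic algorithm that takes its state as `inout`.
protocol Visitor {
  mutating func visit(_ value: Int)
}

/// A stateless visitor. Without the padding field it would be a zero-sized struct.
struct NoOpVisitor: Visitor {
  #if DEBUG
  // Unused field: makes the struct non-empty in debug builds only,
  // so the sanitizer never sees a zero-sized inout parameter.
  private var _tsanPadding: UInt8 = 0
  #endif

  init() {}

  mutating func visit(_ value: Int) {
    // Intentionally does nothing.
  }
}

/// A generic algorithm which passes the (possibly empty) struct around as `inout`.
func walk<V: Visitor>(_ values: [Int], visitor: inout V) {
  for value in values {
    visitor.visit(value)
  }
}

var visitor = NoOpVisitor()
walk([1, 2, 3], visitor: &visitor)
```

Release builds keep the zero-sized layout, so the padding costs nothing where it is not needed.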
Improvements to Testing
Additionally, some tests for the Foundation extensions have been refactored, and the "Swifter" HTTP server dependency that was used by some tests has been dropped.
0.4.0
This is a big one!
- WebURL now supports Internationalised Domain Names (IDNs).
- The URL host parser is now exposed as API, so you can parse hostnames like URLs do.
- There is a new Domain type, which supports rich processing of domains/IDNs.
IDN support was the missing piece. Now that it is done, we can say:
WebURL fully conforms to the WHATWG URL Standard
Now, let's briefly go over each of those points:
Internationalised Domain Names
WebURL now supports Internationalised Domain Names (IDNs):
import WebURL
WebURL("http://中国移动.中国")
// ✅ "http://xn--fiq02ib9d179b.xn--fiqs8s/"
WebURL("https://🛍.example.com/")
// ✅ "https://xn--878h.example.com/"
This may look strange if you are unfamiliar with IDNs. In order to be compatible with existing internet infrastructure, Unicode text in domains needs special compatibility processing, resulting in an encoded string with the distinctive "xn--" prefix. This processing is called IDNA. If somebody wants to register the domain "中国移动.中国", they instead register "xn--fiq02ib9d179b.xn--fiqs8s", and behind the scenes, everything works just like it always did with plain, non-Unicode domains. Importantly, we don't need internet routing infrastructure or applications to process hostnames differently to how they normally would. This encoded version is not very helpful to humans, but browsers and applications can detect these domains and present them in Unicode (we have APIs for that; more info below).
For more information about IDNs see IDN World Report.
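To give a feel for where the "xn--" strings come from, here is a minimal sketch of the Punycode encoding step (RFC 3492) using only the standard library. It is not WebURL's implementation, and the function name punycodeEncode is made up for this example. Full IDNA processing also maps, normalises and validates the input before encoding each label, and none of that is shown here.

```swift
/// Encodes a single label as Punycode (RFC 3492), without the "xn--" prefix.
/// Sketch only: no overflow checks and no IDNA mapping or validation.
func punycodeEncode(_ label: String) -> String {
  let base = 36, tMin = 1, tMax = 26, skew = 38, damp = 700
  let scalars = label.unicodeScalars.map { Int($0.value) }

  // Digit values 0...25 map to "a"..."z", and 26...35 map to "0"..."9".
  func digit(_ value: Int) -> Character {
    Character(UnicodeScalar(UInt8(value < 26 ? 97 + value : 22 + value)))
  }

  // Bias adaptation function from RFC 3492, section 6.1.
  func adapt(_ delta: Int, _ numPoints: Int, _ isFirst: Bool) -> Int {
    var delta = isFirst ? delta / damp : delta / 2
    delta += delta / numPoints
    var k = 0
    while delta > ((base - tMin) * tMax) / 2 {
      delta /= (base - tMin)
      k += base
    }
    return k + ((base - tMin + 1) * delta) / (delta + skew)
  }

  // Copy the basic (ASCII) code points first, followed by a delimiter.
  var output = String(label.unicodeScalars.filter { $0.isASCII }.map(Character.init))
  let basicCount = scalars.filter { $0 < 128 }.count
  if basicCount > 0 { output.append("-") }

  var n = 128, delta = 0, bias = 72
  var handled = basicCount
  while handled < scalars.count {
    // Find the smallest code point which has not been handled yet.
    guard let next = scalars.filter({ $0 >= n }).min() else { break }
    delta += (next - n) * (handled + 1)
    n = next
    for scalar in scalars {
      if scalar < n { delta += 1 }
      if scalar == n {
        // Write delta as a variable-length integer.
        var q = delta
        var k = base
        while true {
          let t = k <= bias ? tMin : (k >= bias + tMax ? tMax : k - bias)
          if q < t { break }
          output.append(digit(t + (q - t) % (base - t)))
          q = (q - t) / (base - t)
          k += base
        }
        output.append(digit(q))
        bias = adapt(delta, handled + 1, handled == basicCount)
        delta = 0
        handled += 1
      }
    }
    delta += 1
    n += 1
  }
  return output
}

print("xn--" + punycodeEncode("café"))  // xn--caf-dma
print("xn--" + punycodeEncode("中国"))   // xn--fiqs8s
```

Both outputs match the encoded forms that appear elsewhere in these notes.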
Browsers are making an increased effort this year to align their own IDNA implementations (Safari/WebKit already conforms), and it has been announced that Apple's next major operating system releases will include support in Foundation URL. WebURL now implements this part of the URL Standard too; it is available today, and it fully backwards-deploys. It's important that URLs work consistently for everybody, and WebURL can help with that.
What's more - since this processing happens in the URL type, it works with our existing Foundation interop:
import WebURL
import Foundation
import WebURLFoundationExtras
let (data, _) = try await URLSession.shared.data(for: WebURL("http://全国温泉ガイド.jp")!)
// ✅ Works
let convertedToURL = URL(WebURL("http://全国温泉ガイド.jp")!)!
// ... continue processing 'convertedToURL' as you normally would
Developers have been asking for better IDN support across the industry for years - at this stage of adoption, most IDNs are in China, so Chinese developers in particular have been wanting to work with these kinds of URLs. I'm especially pleased that WebURL is now able to offer it to any Swift application.
Host Parsing API
IDN support as the standard requires is great and all, but it isn't enough.
URLs are designed to be universal - infinitely customisable. There are some "special" schemes which the standard knows about, such as http:, and while their hosts have semantic meaning (they are network addresses, hence we should use IDNA, detect IPv4 addresses, etc), generally, for other schemes, the host is just an opaque string and is not interpreted.
That's the correct model, but frequently we are processing URLs which are very HTTP-like, and we would like to support the same network addresses, in the same way, as an HTTP URL. For instance, suppose we were writing an application to handle ssh: URLs - the standard would only parse IPv6 addresses out for us, and everything else would just be an opaque string.
WebURL("ssh://karl@somehost/")!.host
// .opaque, "somehost"
WebURL("ssh://karl@abc.أهلا.com/")!.host
// .opaque, "abc.%D8%A3%D9%87%D9%84%D8%A7.com"
WebURL("ssh://[email protected]/")!.host
// 🤨 .opaque, "192.168.0.1"
Request libraries generally need to write their own parsers to handle this, but it is difficult to match the host parser for HTTP URLs exactly... unless, of course, you are the URL host parser...
So with 0.4.0, WebURL's Host type exposes the URL host parser directly to your applications. Not only is this great for processing URLs of any scheme, it's also useful for hostnames provided via command-line interfaces or configuration files. Being able to guarantee the host is interpreted the same way as it would be in an http: URL is a very useful property, just by itself.
WebURL.Host("EXAMPLE.com", scheme: "http")
// .domain, Domain { "example.com" }
WebURL.Host("abc.أهلا.com", scheme: "http")
// 🤩 .domain, Domain { "abc.xn--igbi0gl.com" }
WebURL.Host("192.168.0.1", scheme: "http")
// 🥳 .ipv4Address, IPv4Address { 192.168.0.1 }
Domain API
Exposing the host parser is great and all, but it also isn't enough.
Previously, we only had types for IPv4 and IPv6 addresses, and domains were represented as Strings. Now, domains have their own type - WebURL.Domain, which is guaranteed to contain a validated, normalised domain from the URL host parser, and can be a useful place to house APIs which operate on domains.
WebURL.Domain("example.com") // ✅ "example.com"
WebURL.Domain("localhost") // ✅ "localhost"
WebURL.Domain("api.أهلا.com") // ✅ "api.xn--igbi0gl.com"
WebURL.Domain("xn--caf-dma") // ✅ "xn--caf-dma" ("café")
WebURL.Domain("in valid") // ✅ nil (spaces are not allowed)
WebURL.Domain("xn--cafe-yvc") // ✅ nil (invalid IDN)
WebURL.Domain("192.168.0.1") // ✅ nil (not a domain)
The most important API right now is render, which builds a result using an encapsulated algorithm. There is opportunity for renderers to produce any kind of result - for example, they might perform spoof-checking to guard against confusable text, or they might use a database to shorten domains to their most important section, or they might have special formatting for particular domains. You can create a renderer by conforming to the WebURL.Domain.Renderer protocol.
WebURL comes with an uncheckedUnicodeString renderer, so you can recover the Unicode form of a domain. This renderer does not perform any spoof-checking, so it is not recommended for use in UI.
let domain = WebURL.Domain("xn--fiq02ib9d179b.xn--fiqs8s")!
domain.render(.uncheckedUnicodeString)
// ✅ "中国移动.中国"
And with that, I'm happy with WebURL's host story. It provides rich, detailed information about the hosts defined in the URL Standard and gives you the means to easily and robustly process them. Please try it out and leave feedback!
Bonus: Spoof-checked renderer prototype
It is important that applications use spoof checking when displaying domains in Unicode form. We have a proof-of-concept renderer which ports much of Chromium's IDN spoof-checking logic. It works on my Mac, but deploying it can be a pain because it depends on the ICU library for its implementation of UAX39.
// Non-IDNs.
WebURL.Domain("paypal.com")?.render(.checkedUnicodeString) // β
"paypal.com"
WebURL.Domain("apple.com")?.render(.checkedUnicodeString) // β
"apple.com"
// IDNs.
WebURL.Domain("a.Ψ£ΩΩΨ§.com")?.render(.checkedUnicodeString) // β
"a.Ψ£ΩΩΨ§.com"
WebURL.Domain("δ½ ε₯½δ½ ε₯½")?.render(.checkedUnicodeString) // β
"δ½ ε₯½δ½ ε₯½"
// Spoofs.
WebURL.Domain("ΡΠ°Ξ³pal.com")?.render(.checkedUnicodeString) // β
"xn--pal-vxc83d5c.com"
WebURL.Domain("Π°pple.com")?.render(.checkedUnicodeString) // β
"xn--pple-43d.com"
It would be great to turn this into a maintained, easily-deployable package. I'm too busy right now, so it remains a prototype, but maybe one day? Or if anybody else would like to get involved, they can use it as a starting point.
Bugfixes
- Fixed a crash when appending an empty array of form params (#140). Thanks to @adam-fowler for the report. Sorry it took so long to get into a release.
0.3.1
What's Changed
Foundation Integration
0.3.0 brought Foundation-to-WebURL conversion, and this release adds conversion in the opposite direction (WebURL-to-Foundation). This is a particularly important feature for developers on Apple platforms, as it means you can now use WebURL to make requests using URLSession! We now have full, bidirectional interop with Foundation's URL, which is a huge milestone and a big step towards v1.0. 🥳
WebURLFoundationExtras now adds a number of extensions to types such as URLRequest and URLSession to make that super easy:
import Foundation
import WebURL
import WebURLFoundationExtras

// ℹ️ Make URLSession requests using WebURL.
func makeRequest(to url: WebURL) -> URLSessionDataTask {
  return URLSession.shared.dataTask(with: url) {
    data, response, error in
    // ...
  }
}

// ℹ️ Also supports Swift concurrency.
func processData(from url: WebURL) async throws {
  let (data, _) = try await URLSession.shared.data(from: url)
  // ...
}

// ℹ️ For libraries: move to WebURL without breaking
// compatibility with clients using Foundation's URL.
public func processURL(_ url: Foundation.URL) throws {
  guard let webURL = WebURL(url) else {
    throw InvalidURLError()
  }
  // Internal code uses WebURL...
}
When you make a request using WebURL, you will benefit from its modern, web-compatible parser, which matches modern browsers and libraries in other languages:
// Using WebURL: Sends a request to "example.com".
// Chrome, Safari, Firefox, Go, Python, NodeJS, Rust agree. ✅
print( try String(contentsOf: WebURL("http://[email protected]:[email protected]/")!) )
// Using Foundation.URL: Sends a request to "evil.com"!
print( try String(contentsOf: URL(string: "http://[email protected]:[email protected]/")!) )
Note that this only applies to the initial request; HTTP redirects continue to be processed by URLSession (it is not possible to override it universally), and so are not always web-compatible. As an alternative on non-Apple platforms, our fork of async-http-client uses WebURL for all of its internal URL processing, so it also provides web-compatible redirect handling.
For more information about why WebURL is a great choice even for applications and libraries using Foundation, and a discussion about how to safely work with multiple URL standards, we highly recommend reading: Using WebURL with Foundation.
URLSession extensions are only available on Apple platforms right now, due to a bug in swift-corelibs-foundation. I opened a PR to fix it, and once merged, we'll be able to make these extensions available to all platforms.
⚡️ Performance improvements
I say it every time, and it's true every time. For this release, I noticed that, due to a quirk with how ManagedBuffer is implemented in the standard library, every access to the URL's header data required dynamic exclusivity enforcement. But that shouldn't be necessary - the URL storage uses COW to enforce non-local exclusivity, and local exclusivity can be enforced by the compiler if we wrap the ManagedBuffer in a struct with reference semantics. So that's what I did.
The result is ~5% faster parsing and 10-20% better performance when getting/setting URL components. For collection views like pathComponents, these enforcement checks affect basically every operation and amount to a consistent overhead that we're now able to eliminate.
benchmark                                         column  results/0_3_0  results/0_3_1      %
------------------------------------------------------------------------------------------------
Constructor.HTTP.AverageURLs                      time         23909.00       22665.00   5.20
Constructor.HTTP.AverageURLs.filtered             time         37826.50       36066.00   4.65
Constructor.HTTP.IPv4                             time         12205.00       11627.00   4.74
Constructor.HTTP.IPv4.filtered                    time         19164.00       17819.00   7.02
Constructor.HTTP.IPv6                             time         13677.00       13086.00   4.32
Constructor.HTTP.IPv6.filtered                    time         17614.00       16577.00   5.89
...
ComponentSetters.Unique.Username                  time           418.00         365.00  12.68
ComponentSetters.Unique.Username.PercentEncoding  time           767.00         632.00  17.60
ComponentSetters.Unique.Username.Long             time           636.00         527.00  17.14
...
ComponentSetters.Unique.Path.Simple               time          2525.00        2247.00  11.01
...
PathComponents.Iteration.Small.Forwards           time           705.00         602.00  14.61
PathComponents.Iteration.Small.Reverse            time           718.00         619.00  13.79
PathComponents.Iteration.Long.Reverse             time          3137.00        2752.00  12.27
PathComponents.Append.Single                      time          1362.00        1242.00   8.81
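For readers unfamiliar with the pattern, here is a minimal sketch of wrapping a ManagedBuffer in a struct that performs copy-on-write. It only illustrates the general shape of the technique and is not WebURL's actual storage. The names URLStorage and Header are hypothetical, and the sketch makes no claim about what the compiler does with exclusivity checks for this exact code.

```swift
/// Hypothetical header stored alongside the buffer's elements.
struct Header {
  var count: Int
}

/// A struct wrapper around a ManagedBuffer. All access goes through the struct,
/// and mutations copy the buffer first if it is shared (copy-on-write).
struct URLStorage {
  private var buffer: ManagedBuffer<Header, UInt8>

  init(bytes: [UInt8]) {
    buffer = ManagedBuffer<Header, UInt8>.create(minimumCapacity: bytes.count) { _ in
      Header(count: bytes.count)
    }
    buffer.withUnsafeMutablePointerToElements { elements in
      elements.initialize(from: bytes, count: bytes.count)
    }
  }

  var count: Int { buffer.header.count }

  /// Copies the buffer if another value also references it.
  private mutating func makeUnique() {
    guard !isKnownUniquelyReferenced(&buffer) else { return }
    let old = buffer
    let n = old.header.count
    let new = ManagedBuffer<Header, UInt8>.create(minimumCapacity: n) { _ in
      Header(count: n)
    }
    old.withUnsafeMutablePointerToElements { source in
      new.withUnsafeMutablePointerToElements { destination in
        destination.initialize(from: source, count: n)
      }
    }
    buffer = new
  }

  subscript(index: Int) -> UInt8 {
    get {
      precondition(index >= 0 && index < count, "Index out of bounds")
      return buffer.withUnsafeMutablePointerToElements { $0[index] }
    }
    set {
      precondition(index >= 0 && index < count, "Index out of bounds")
      makeUnique()
      buffer.withUnsafeMutablePointerToElements { $0[index] = newValue }
    }
  }
}

var original = URLStorage(bytes: [1, 2, 3])
let copy = original
original[0] = 9   // Triggers a copy; 'copy' still sees the old bytes.
```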
Standard Update
This release also implements a recent change to the WHATWG URL Standard, which forbids C0 Control characters and U+007F delete from appearing in domains. whatwg/url#685
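As a small illustration of which code points this covers, here is a sketch of such a check. The function name is hypothetical and this is not WebURL's host parser, which rejects these characters as part of a larger validation step.

```swift
/// Returns true if the string contains a C0 control character (U+0000...U+001F)
/// or U+007F DELETE.
func containsC0ControlOrDelete(_ text: String) -> Bool {
  text.unicodeScalars.contains { $0.value <= 0x1F || $0.value == 0x7F }
}

print(containsC0ControlOrDelete("example.com"))        // false
print(containsC0ControlOrDelete("exa\u{7F}mple.com"))  // true
```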
Full Changelog: 0.3.0...0.3.1
0.3.0
What's Changed
DocC-based Documentation!
All of the documentation has been rewritten and reorganised to take advantage of the new DocC documentation engine. It's a really huge improvement, so do please check it out. And if you find anything which you think could be improved, don't hesitate to file an issue or even submit a PR.
Foundation Integration
WebURL 0.3.0 includes the WebURLFoundationExtras module, which comes with a way to convert Foundation URL objects to WebURLs. That means your libraries can use WebURL for their internal processing, while continuing to support clients who provide data using Foundation's types.
The async-http-client port is an example of this. Even though it uses WebURL for its internal processing, it is still possible to create requests using Foundation.URL using an extension. This means it gets to benefit from modern, web-compatible URL parsing (for example, when resolving HTTP redirects), and WebURL's simpler, more efficient API, without breaking compatibility.
New with this release, the async-http-client port offers a build configuration which omits all Foundation dependencies. By doing so, we've measured binary size improvements of up to 16% on a statically-linked & stripped executable, while keeping the full functionality of AHC such as streaming, compression, and HTTP/2. We expect the size reduction to grow even further with Swift 5.6, as the standard library will no longer need to link all of ICU's Unicode data.
⚡️ Performance and Code Size Improvements
WebURL keeps getting faster, and leaner, but not meaner. Compared to 0.2.0, WebURL 0.3.0 offers some incredible performance enhancements. URL parsing time has been reduced by almost 1/3, our fantastic in-place component setters can be almost 40% faster, and common operations like iterating path components can now be performed in just half the time.
And that's not even the best part. All of these improvements come in a package which is 20% smaller!
Title            Section  Old      New      Percent
WebURLBenchmark  __text:  1715105  1376177  -19.8%
(Measured on an Intel MBP)
System.framework Integration Disabled on iOS For Now
And with all that great news, there had to be one... less great thing. Unfortunately, the last few releases of Xcode have shipped with a broken version of Apple's System.framework for iOS, which broke the build on that platform. Strangely, it is only iOS - macOS, tvOS, and even watchOS all work fine. We've disabled that integration on iOS for now, but we'll keep an eye on things and re-enable it once the issue is fixed (FB9832953).
In the meantime, you can still use swift-system, the open-source distribution of System.framework, on all platforms, including iOS.
Full Changelog: 0.2.0...0.3.0
0.2.0
What's Changed
In addition to the changes listed below, the guide has been entirely rewritten, and now does a better job of explaining the WebURL API and object model, and the benefits it can bring to your application/library. The goal is to help you become as comfortable using WebURL as you are using Foundation's URL. It took a lot of work, and I'd really recommend giving it a read. Even if you're not using WebURL yet, the chances are that you'll learn a thing or two about how Foundation's URL actually works.
URL standard
- Domains which end in numbers must be IPv4 addresses. See here for more information. (whatwg/url#619)
- The JavaScript model's hostname setter now returns early if a port is given. (whatwg/url#604)
- The path setter can no longer erase the path of path-only URLs. (whatwg/url#582) (reported by us)
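To illustrate the first item, here is a rough sketch of an "ends in a number" check modelled on my reading of the URL Standard's host parser. The function names are hypothetical and this is not WebURL's implementation. If the check returns true, the host must parse as an IPv4 address or be rejected; it can never be treated as a domain.

```swift
/// Returns true if `text` has the shape of an IPv4 number: decimal,
/// octal (leading "0"), or hexadecimal (leading "0x" or "0X").
func looksLikeIPv4Number(_ text: Substring) -> Bool {
  guard !text.isEmpty else { return false }
  var digits = text
  var radix = 10
  if digits.hasPrefix("0x") || digits.hasPrefix("0X") {
    digits = digits.dropFirst(2)
    radix = 16
  } else if digits.count >= 2, digits.hasPrefix("0") {
    digits = digits.dropFirst()
    radix = 8
  }
  // A bare prefix such as "0x" has no digits left and counts as zero.
  return digits.allSatisfy { character in
    guard character.isASCII, let value = character.hexDigitValue else { return false }
    return value < radix
  }
}

/// Returns true if the final label of `host` is numeric.
func endsInANumber(_ host: String) -> Bool {
  var parts = host.split(separator: ".", omittingEmptySubsequences: false)
  // Ignore a single trailing dot.
  if let last = parts.last, last.isEmpty {
    if parts.count == 1 { return false }
    parts.removeLast()
  }
  guard let last = parts.last else { return false }
  if !last.isEmpty, last.allSatisfy({ $0.isASCII && $0.isNumber }) { return true }
  return looksLikeIPv4Number(last)
}

print(endsInANumber("example.com"))   // false
print(endsInANumber("192.168.0.1"))   // true
print(endsInANumber("example.0x1f"))  // true
```

Note that "1.2.3.4.5" also ends in a number, so it is rejected as an invalid IPv4 address instead of being accepted as a domain.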
API
- Support for creating file URLs from file paths, and file paths from file URLs.
- Added WebURLSystemExtras module which integrates with both swift-system and Apple's System.framework.
- LazilyPercentDecoded<Collection> is now bidirectional when its source collection is.
- WebURL.cannotBeABase has been renamed to WebURL.hasOpaquePath, following an update in the standard. (whatwg/url#655) (reported by us)
- Percent-encoding and -decoding APIs have been reworked to take advantage of static member syntax (SE-0299). A source-compatible fallback is in place for pre-5.5 compilers. This is the reason for duplicate functions appearing in the docs, one with a EncodeSet._Member argument. We're looking at moving to Swift-DocC which will hopefully fix this. The previous API has been back-ported with deprecation notices wherever possible, so the compiler should guide you when it comes to updating your applications.
- It is now possible to percent-decode a string as an array of bytes using the .percentDecodedBytesArray() function. This is useful for dealing with binary data and non-UTF8 strings.
- The .pathComponents view now assumes inserted data is not percent-encoded, which preserves values exactly if they happen to contain strings which coincidentally look like percent-encoding.
- A .pathComponents[raw: Index] subscript has been added, which returns a path component exactly as it appears in the URL string, including its percent-encoding.
- A .pathComponent.replaceSubrange(_:withPercentEncodedComponents:) function has been added for inserting pre-encoded path components.
- Added Sendable conformance to WebURL, WebURL.Host, IP addresses, origins, and the various wrapper views.
- The .serialized property has been combined with .serializedExcludingFragment into a single function: .serialized(excludingFragment: Bool = false).
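For anyone who has not used the static member syntax from SE-0299, here is a minimal, self-contained sketch of the idea (Swift 5.5 or later). The protocol, type and function names below are invented for the example and are not WebURL's actual percent-encoding API.

```swift
/// A hypothetical encode-set protocol: decides which bytes get percent-encoded.
protocol ExampleEncodeSet {
  func shouldEncode(_ byte: UInt8) -> Bool
}

/// Encodes spaces, percent signs, control characters, and non-ASCII bytes.
struct SpaceAndPercent: ExampleEncodeSet {
  func shouldEncode(_ byte: UInt8) -> Bool {
    byte == 0x20 || byte == 0x25 || byte < 0x20 || byte >= 0x7F
  }
}

// SE-0299: a static member on the protocol, constrained to a concrete type,
// lets callers write `.spaceAndPercent` in a generic context.
extension ExampleEncodeSet where Self == SpaceAndPercent {
  static var spaceAndPercent: SpaceAndPercent { SpaceAndPercent() }
}

func percentEncoded<S: ExampleEncodeSet>(_ string: String, using set: S) -> String {
  var result = ""
  for byte in string.utf8 {
    if set.shouldEncode(byte) {
      let hex = String(byte, radix: 16, uppercase: true)
      result += "%" + (hex.count == 1 ? "0" + hex : hex)
    } else {
      result.append(Character(UnicodeScalar(byte)))
    }
  }
  return result
}

print(percentEncoded("a b", using: .spaceAndPercent))   // a%20b
print(percentEncoded("100%", using: .spaceAndPercent))  // 100%25
```

The leading-dot call site is what the static member syntax buys; pre-5.5 compilers would need the concrete type spelled out, for example `SpaceAndPercent()`.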
Implementation
- Better performance, especially for component setters
- Component setters are now benchmarked
- Support for fuzzing the parser
- Added UnsafeBoundsCheckedBufferPointer which allows us to keep bounds-checking without sacrificing performance
- Simplified internal storage types, reducing code size
- Better percent-encoding performance
Full Changelog: 0.1.0...0.2.0
What's coming next
The major goal for 0.3.0 is compatibility with Foundation's URL. At the very least, that is going to include support for creating a URL from a WebURL and vice versa, but we may need additional APIs for a truly great developer experience.
Initial Release!
This is the initial release of WebURL!
Take a look at the Getting Started guide in the repo, and the full documentation available here!