PHP Internals News: Episode 98: Deprecating utf8_encode and utf8_decode – Derick Rethans

In this episode of “PHP Internals News” I chat with Rowan Tommins (GitHub, Website, Twitter) about the “Deprecate and Remove utf8_encode and utf8_decode” RFC.

The RSS feed for this podcast is https://derickrethans.nl/feed-phpinternalsnews.xml, you can download this episode’s MP3 file, and it’s available on Spotify and iTunes. There is a dedicated website: https://phpinternals.news

Transcript

Derick Rethans 0:14

Hi, I’m Derick. Welcome to PHP Internals News, a podcast dedicated to explaining the latest developments in the PHP language. This is episode 98. Today I’m talking with Rowan Tommins about the “Deprecate and Remove utf8_encode and utf8_decode” RFC that he’s proposing. Hi, Rowan, would you please introduce yourself?

Rowan Tommins 0:38

Hi, I’m Rowan Tommins. I’m a PHP software architect by day, and I try to contribute back to the community. I’ve been hanging around on the internals mailing list for about 10 years, contributing to make the language better where I can.

Derick Rethans 0:57

Excellent. Yeah, that’s how I started out as well, many, many more years before that, to be honest. This RFC, what problem is this trying to solve?

Rowan Tommins 1:08

PHP has these two functions, utf8_encode and utf8_decode, which, in themselves, are not broken. They do what they are designed to do. But they are very frequently misunderstood, mostly because of their name, and because character encodings in general are not very well understood. People use them wrong, and end up getting into all sorts of pickles that are worse than if the functions weren’t there in the first place.

Derick Rethans 1:37

What are you proposing with the RFC then?

Rowan Tommins 1:39

Fundamentally, I’m proposing to remove the functions. As of PHP 8.2, there will be a deprecation notice whenever you use them, and then in 9.0, they would be gone forever, and you wouldn’t be able to use them by mistake, because they just wouldn’t be there.
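In practice, on PHP 8.2 that would look roughly like this (an illustrative sketch; the exact notice text and location will depend on your code):

$utf8 = utf8_encode("\xE9"); // "é" in Latin-1
// Deprecated: Function utf8_encode() is deprecated in example.php on line 2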

Derick Rethans 1:56

I reckon there’s going to be a way to actually do what people originally intended to do with it at some point, right?

Rowan Tommins 2:02

So yeah, there are alternatives to these functions, which are much clearer in what you’re doing, and much more flexible in what you can do with them so that they cover the cases that these functions sound like they’re going to do, but don’t actually do when you understand what they’re really doing.

Derick Rethans 2:20

I think we’ll get back to that a little bit later on. You’re wanting to deprecate these functions. But what do these functions actually do?

Rowan Tommins 2:27

What they actually do is convert between a character encoding called Latin-1 (ISO 8859-1) and UTF-8. So utf8_encode converts from Latin-1 into UTF-8, and utf8_decode does the opposite. And that’s all they do. Their names make it sound like they’re some kind of “fix all the UTF-8 things in my text” function, but they actually just do this one very specific conversion, which is occasionally useful, but that’s not clear from their names.
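As a concrete illustration, here is a minimal sketch of that fixed conversion and of the explicitly named replacement the RFC points to (assuming the mbstring extension is available; iconv or intl’s UConverter can do the same job):

// The only thing these functions do: a fixed Latin-1 <-> UTF-8 conversion.
$latin1 = "\xE9";                    // "é" encoded as ISO-8859-1 (Latin-1)
$utf8   = utf8_encode($latin1);      // "\xC3\xA9", "é" encoded as UTF-8
$back   = utf8_decode($utf8);        // "\xE9" again

// The equivalent, explicitly named conversion:
$utf8   = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
$back   = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');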

Derick Rethans 3:01

It’s certainly how I have seen it used in the past, where people just throw everything and the kitchen sink at it, expecting it to come out as valid UTF-8, and then at the end, decode. I mean, the decoding was not even much of a part of this, right? It’s just throw ev

Truncated by Planet PHP, read more at the original (another 23825 bytes)

Getting OpenSwoole and the AWS SDK to Play Nice – Matthew Weier O’Phinney

I have some content that I store in S3-compatible object storage, and wanted to be able to (a) push to that storage, and (b) serve items from that storage.

Easy-peasy: use the Flysystem AWS S3 adapter, point it to my storage, and be done!

Except for one monkey wrench: I’m using OpenSwoole.

The Problem

What’s the issue, exactly?

By default, the AWS adapter uses the AWS PHP SDK, which in turn uses Guzzle.
Guzzle has a pluggable adapter system for HTTP handlers, but by default uses its CurlMultiHandler when the cURL extension is present and has support for multi-exec.
This is a sane choice, and gives optimal performance in most scenarios.

Internally, when the handler prepares to make some requests, it calls curl_multi_init(), and then memoizes the handle returned by that function.
This allows the handler to run many requests in parallel and wait for them each to complete, giving async capabilities even when not running in an async environment.

When using OpenSwoole, this state becomes an issue, particularly with services, which might be instantiated once and re-used across many requests until the server is shut down.
More specifically, it becomes an issue when coroutine support is enabled in OpenSwoole.

OpenSwoole has provided coroutine support for cURL for some time now.
However, when it comes to cURL’s multi-exec support, OpenSwoole only allows one multi-exec handle at a time.
This was specifically where my problem originated: I’d have multiple requests come in at once, each requiring access to S3, and each resulting in an attempt to initialize a new multi-exec handle.
The end result was a locking issue, which led to exceptions, and thus error responses.

(And boy, was it difficult to debug and get to the root cause of these problems!)

The Solution

Thankfully, Guzzle allows you to specify your own handler, and the vanilla CurlHandler does not use multi-exec at all:

use GuzzleHttp\Client;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Handler\CurlHandler;

$client = new Client([
    'handler' => HandlerStack::create(new CurlHandler()),
]);

The next hurdle is getting the AWS S3 SDK to use this handler.
Fortunately, the S3 client constructor has an http_handler option that allows you to pass an HTTP client handler instance.
I can re-use the existing GuzzleHandler the SDK provides, passing it my client instance:

use Aws\Handler\GuzzleV6\GuzzleHandler;
use Aws\S3\S3Client;

$storage = new S3Client([
    // .. connection options such as endpoint, region, and credentials
    'http_handler' => new GuzzleHandler($client),
]);

While the namespace is GuzzleV6, the GuzzleHandler in that namespace also works for Guzzle v7.

I can then pass that to Flysystem, and I’m ready to go.
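For completeness, the Flysystem wiring looks roughly like this (a sketch assuming the league/flysystem-aws-s3-v3 adapter; the bucket name is a placeholder):

use League\Flysystem\AwsS3V3\AwsS3V3Adapter;
use League\Flysystem\Filesystem;

// $storage is the S3Client configured above; 'my-bucket' is illustrative.
$adapter    = new AwsS3V3Adapter($storage, 'my-bucket');
$filesystem = new Filesystem($adapter);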

But what about those async capabilities?

But doesn’t switching to the vanilla CurlHandler mean I lose out on async capabilities?

The great part about the OpenSwoole coroutine support is that when the cURL hooks are available, you essentially get the parallelization benefits of multi-exec with the vanilla cURL functionality.
As such, the approach I outline both fixes runtime errors I encountered and increases performance.
I like easy wins like this!
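For reference, those cURL coroutine hooks are something you opt into when bootstrapping the server; a rough sketch, assuming OpenSwoole’s runtime API (the hook flag constants vary between versions, so check what your installed version provides):

use OpenSwoole\Runtime;

// Enable coroutine hooks for blocking APIs, including cURL, so that the
// curl_exec() calls made by Guzzle's CurlHandler are scheduled cooperatively
// instead of blocking the worker. Default hook flags are assumed here.
Runtime::enableCoroutine();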

Bonus round: PSR-7 integration

Unrelated to the OpenSwoole + AWS SDK issue, I had another problem I wanted to solve.
While I love Flysystem, there’s one place where using the AWS SDK for S3 directly is a really nice win: directly serving files.

When using Flysystem, I was using its mimeType() and fileSize() APIs to get file metadata for the response, and then copying the file to an in-memory (i.e. php://temp) PSR-7 StreamInterface.
The repeated calls meant I was querying the API multiple times for the same file, degrading performance.
And buffering to an in-memory stream had the potential for out-of-memory errors.
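As a sketch of what that original approach looked like (assuming Flysystem v2/v3 and guzzlehttp/psr7 for the stream; $filesystem and $response are illustrative):

use GuzzleHttp\Psr7\Utils;

// Each of these is a separate round-trip to the object storage API:
$mimeType = $filesystem->mimeType($path);
$fileSize = $filesystem->fileSize($path);

// Buffer the whole object into a php://temp stream to satisfy PSR-7:
$body = Utils::streamFor(fopen('php://temp', 'w+b'));
$body->write($filesystem->read($path)); // yet another API call
$body->rewind();

$response = $response
    ->withBody($body)
    ->withHeader('Content-Type', $mimeType)
    ->withHeader('Content-Length', (string) $fileSize);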

One alternative I tried was copying the file from storage to the local filesystem; this would allow me to use a standard filesystem stream with PSR-7, which is quite performant and doesn’t require a lot of memory.
However, one point of having object storage was so that I could reduce the amount of local filesystem storage I was using.

As a result, for this specific use case, I switched to using the AWS S3 SDK directly and invoking its getObject() method.
The method returns an array/object mishmash that provides object metadata, including the MIME type and content l

Truncated by Planet PHP, read more at the original (another 1344 bytes)

David Bisset talks about the business of WordPress – Voices of the ElePHPant

Cal Evans hosts WordPress community leader David Bisset as they talk about the business and business models built around WordPress.

This episode is sponsored by
RingCentral Developers


Cancelling ReactPHP fibers – Cees-Jan Kiewiet

A feature that we really needed to make our fiber integration complete is the ability to cancel them. Or, to be more precise, the cancellation of any awaited promise-yielding operations in that fiber and, as a consequence, the fiber they are awaited in. This post goes into detail about how different cancellation scenarios work for the PR introducing it; it was originally part of that PR’s documentation, but was replaced by a simpler section.

(Header image: cancelled PHP 8.1 fibers (green threads). Photo by Jeffrey Czum from Pexels.)

Thoughts on psr/log versions – Cees-Jan Kiewiet

One of the things that came up while upgrading packages is PSR-3’s new v2 and v3 releases. They add parameter type hints and return type hints to the interface methods. For packages implementing the interface, this means that they can’t support all three versions. For packages only consuming psr/log, all three versions can be used, as you don’t have to build classes on them.
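To illustrate why implementers are affected, here is a sketch of an implementation written against psr/log v3’s typed signature (MyLogger is hypothetical; check the actual interface shipped with your installed version):

use Psr\Log\AbstractLogger;

// With psr/log v3 installed, an implementation must match the typed signature:
final class MyLogger extends AbstractLogger
{
    public function log($level, string|\Stringable $message, array $context = []): void
    {
        // write $message somewhere...
    }
}
// Against psr/log v1 this same declaration is a fatal error, because narrowing
// the untyped $message parameter violates the interface contract.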

However, for packages implementing PSR-3 this suddenly became more complex. All of a sudden you need three major versions if you want to support all PSR-3 versions. For a package that only implements PSR-3 this isn’t so much of an issue, but when the implementation is embedded inside another package, you quickly reach dependency hell. And one thing I learned while upgrading my packages is how deep our dependency on psr/log goes these days.

The mistake I’ve made with at least one PR in the past few weeks is missing that a consumer of psr/log is also an implementer. So now I get to go back and make a new PR resolving the mess I introduced.

David Bisset talks about WordPress community – Voices of the ElePHPant

Cal Evans hosts WordPress community leader David Bisset as they talk about the WordPress community.

This episode is sponsored by
RingCentral Developers


Xdebug Update: January 2022 – Derick Rethans

In this monthly update I explain what happened with Xdebug development in this past month. These are normally published on the first Tuesday after the 5th of each month. I am late this month. Sorry.

Patreon and GitHub supporters will get it earlier, around the first of each month.

You can become a patron or support me through GitHub Sponsors. I am currently 45% towards my $2,500 per month goal. If you are leading a team or company, then it is also possible to support Xdebug through a subscription.

In January, I spent my time triaging issues and planning for this year.

2022 Plans

I spent most of my time reflecting on what I can do to make Xdebug even better in 2022, and I have come to the conclusion that this is going to happen through multiple improvements.

  1. Creating an Xdebug Course: explaining in great detail how Xdebug works, how you use it, how you can get the most out of it, and covering many scenarios for setting up debugging in different environments. This needs to go beyond reference documentation pages, and will hence become a combined set of videos and tutorials, with examples and work-along exercises.

  2. Developing a set-up-free debugger: a new tool that can be used through Xdebug Cloud, which would allow you to debug without an IDE.

  3. Xdebug Recorder and Player: a new feature in Xdebug which would allow a full request to be stored in a file, including every intermediate state. Combined with a player, this would allow for replaying the request and interrogating every variable at every point during the execution of said script, through the debugging protocol and your IDE. The recorded files would be self-contained, without needing access to the (original) source code. Besides “step over”, it would also have a “step back”, and perhaps even a slider to move back and forwards through time.

  4. Rewriting Xdebug’s Profiler: so that it is more lightweight, and so that it can be enabled for specific parts of an application/request. In addition to this I am looking at sending the profiling data over the debugging protocol, so that visualisation tools do not need to find and read files.

  5. Creating a profile analysis tool: To automatically analyse profiling files and apply logic so that it can point to the most likely cause of bottlenecks.

Let me know which one of these interests you most, and whether you would be willing to pay for such a tool.

Xdebug Videos

I did not create any new Xdebug videos this month on my YouTube channel. But as I mentioned earlier, I am working on a more comprehensive course. Stay tuned!

Business Supporter Scheme and Funding

In January, one new business supporter signed up. Thank you!

If you, or your company, would also like to support Xdebug, head over to the support page!

Besides business support, I also maintain a Patreon page and a profile on GitHub sponsors.