PHP 8.2.0 RC3 available for testing – PHP: Hypertext Preprocessor

The PHP team is pleased to announce the third release candidate of PHP 8.2.0, RC 3. This continues the PHP 8.2 release cycle, the rough outline of which is specified in the PHP Wiki. For source downloads of PHP 8.2.0 RC3 please visit the download page. Please carefully test this version and report any issues found in the bug reporting system. Please DO NOT use this version in production, it is an early test version. For more information on the new features and other changes, you can read the NEWS file, or the UPGRADING file for a complete list of upgrading notes. These files can also be found in the release archive. The next release will be the fourth release candidate (RC 4), planned for Oct 13th 2022. The signatures for the release can be found in the manifest or on the QA site. Thank you for helping us make PHP better.

Good design means it’s easy-to-change – Matthias Noback

Software development seems to be about change: the business changes and we need to reflect those changes, so the requirements or specifications change, frameworks and libraries change, so we have to change our integrations with them, etc. Changing the code base accordingly is often quite painful, because we made it resistant to change in many ways.

Code that resists change

I find that not every developer notices the “pain level” of a change. As an example, I consider it very painful if I can’t rename a class, or change its namespace. One reason could be that some classes aren’t auto-loaded with Composer, but are still manually loaded with require statements. Another reason could be that the framework expects the class to have a certain name, be in a certain namespace, and so on. This may be something you personally don’t consider painful, since you can avert the pain by simply never considering renaming or moving classes.

Still, in the end, you know that a code base that resists change like this is going to be considered “a case of severe legacy code”. That’s because you can’t resist change in the software development world, and eventually it will be time to make that change, and then you can experience the pain that you’ve been postponing for so long.

Software can resist change in many ways. Here are just a few examples that come to mind:

  • Classes have to be in a certain directory/namespace and methods have to have certain names in order to be picked up by the framework
  • There’s another application using the same database, which can’t deal with extra columns that it doesn’t know about
  • Class auto-loading only works if you call session_start() first
  • And so on… If you have another cool example, please add it as a comment below!

Socially-established change aversion

Change-aversion can also be socially established. As an example, the team may use a rule that says “If you create a class, you also have to create a unit test for it”. Which is very bad, because you can use multiple classes in a test and still call it a unit-test, so the every-class-is-a-unit assumption is plain wrong. More importantly, you can’t unit-test all types of classes; some will require integrated tests. Anyway, let’s not get carried away 😉 My point is, if you have such a rule you’ll make it harder for developers to add a new class, since they fear the additional (often pointless) work of creating a test for it. In a sense, developers start to resist change. The code base itself will resist change as well, because unit tests are often too close to the implementation, making a change in the design really hard.

Unit tests are my favorite example, but there are other socially-established practices that get in the way of change. Like, “don’t change this code, because 5 years ago Leo touched it and we had to work until midnight to fix production”. Or “we asked the manager for some time to work on this, but we didn’t get it”.

Make changing things easy

From these, and many more – indeed – painful experiences, I have come to the conclusion that a very powerful way to judge the quality of code and design is to answer the question: is this easy to change? The change can be about a function name, the location of a file, installing a Composer dependency, injecting an additional constructor dependency, and so on.

However, it’s sometimes really hard to perform this evaluation yourself, since as a long-time developer you may already be used to quite some “pain”. You may be jumping through hoops to make a change, and not even realize that it’s silly and should be much easier. This is where pair or mob/ensemble programming can be really useful: working together on the same computer will expose all the changes that you avoid:

  • “Hey, let’s rename that class!”
  • “Well, I’m not sure that we can, let’s save this for another time.”

  • “Now let’s inject that new service as a constructor argument.”
  • “Sorry, we can’t use dependency injection in this part of the code base.”

That’s why I usually go all-in on ensemble programming, so we can have a clear view on all the changes that the team averts. We look the monster in the eyes.

Addendum: when changes break stuff

A partial reason for change aversion in developers is the risk that the change may break other things. If you rename a method, you should rename all the relevant method calls too. Luckily, we have static analysis these days, which will tell you about any call sites that you missed. And of course, the IDE can safely make the change for you in most cases. Unfortunately, this is not always the cas

Truncated by Planet PHP, read more at the original (another 5175 bytes)

Extrinsic sorting: A benchmark – Larry Garfield


Sorting algorithms are generally old hat to most programmers. They’ve either analyzed them to death in class, already written many of them, or work in a language where one is provided and they don’t need to think about it. Or all three.

For PHP developers, we have a suite of sorting tools available to us: sort(), usort(), ksort(), uasort(), and various other letter combinations. All use Quick Sort internally, which is generally the best performing single-threaded option. Most importantly, many of them let us provide a custom comparison function to use when determining which of two values is larger or smaller.
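As a minimal sketch of what such a custom comparison function looks like, the following sorts a small list by price with usort(). The $products data and its field names are invented for illustration and do not come from the benchmark itself.

```php
<?php
// Hypothetical sample data; names and prices are made up for illustration.
$products = [
    ['name' => 'keyboard', 'price' => 45],
    ['name' => 'cable', 'price' => 5],
    ['name' => 'monitor', 'price' => 180],
];

// The callback must return a negative, zero, or positive integer.
// The spaceship operator (<=>) produces exactly that.
usort($products, fn (array $a, array $b): int => $a['price'] <=> $b['price']);

$names = array_column($products, 'name');
echo implode(', ', $names), PHP_EOL; // cable, keyboard, monitor
```

The sorting function itself never needs to know what a "product" is; everything about the ordering lives in the callback.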

25 September 2022 – 10:10am

Can we consider DateTimeImmutable a primitive type? – Matthias Noback

During a workshop we were discussing the concept of a Data Transfer Object (DTO). The main characteristic of a DTO is that it holds only primitive-type values (strings, integers, booleans), lists or maps of these values including “nested” DTOs. Not sure who came up with this idea, but I’m using it because it ensures that the DTO becomes a data structure that only enforces a schema (field names, the expected types, required fields, and optional fields), but doesn’t enforce semantics for any value put into it. That way it can be created from any data source, like submitted form values, CLI arguments, JSON, XML, Yaml, and so on. Using primitive values in a DTO makes it clear that the values are not validated. The DTO is just used to transfer or carry data from one layer to the next. A question that popped up during the workshop: can we consider DateTimeImmutable a primitive-type value too? If so, can we use this type inside DTOs?

I thought it was an interesting question to explore. I’d like to say “No” immediately, but why?

Is something deserving of a predicate? To decide, we have to define what the predicate means. Stated in an abstract way like this it makes a lot of sense, but when discussing concrete questions it’s not often clear that we should talk about definitions; we often like to jump to an answer immediately! So, in this case, what’s a primitive-type value?

First, we could consider “primitive” to mean “can’t be further divided into parts”. In that sense, if a Money value object consists of an integer for the number of cents and a string declaring the currency, Money is not primitive, but the integer and the string inside are, because it doesn’t make sense to take them apart. Although you might say that something more primitive than a string is a character. It’s just that PHP doesn’t distinguish this type. DateTimeImmutable in this sense is not primitive; the value that it contains (the timestamp) is.
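To make the “can be divided into parts” idea concrete, here is a small sketch. The Money class is a hypothetical illustration written for this example, not a class from a particular library.

```php
<?php
// Hypothetical Money value object, composed of two primitive values.
final class Money
{
    public function __construct(
        public readonly int $cents,       // primitive: not divided any further
        public readonly string $currency, // primitive in PHP, which has no character type
    ) {
    }
}

$price = new Money(1999, 'EUR');

// Money can be taken apart into its primitive parts...
$parts = [$price->cents, $price->currency];

// ...and a DateTimeImmutable can be reduced to the timestamp it wraps.
$moment = new DateTimeImmutable('@0'); // "@0" means Unix timestamp 0
$timestamp = $moment->getTimestamp();
```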

Second, we could consider “primitive” to mean “what’s not an object”, or as PHP calls these values: scalars. Again, “primitive” takes into account what the programming language considers primitive, because Java for instance has strings which are considered primitive values but are nevertheless objects. Java has a weird relationship with primitive values anyway, because strings, integers, etc. look very primitive in Java code (i.e. you don’t do new String("a string") but just write "a string"). With PHP there’s less confusion around this concept. When “primitive” is used in this sense, DateTimeImmutable could never be considered a primitive-type value, because it’s an object, but in Java it could be, because other primitive values are considered to be primitive regardless of them being an object.

Third, we could consider “primitive” to mean “whatever types the language offers out-of-the-box”. This is often equivalent to “native”. Unfortunately, this isn’t a very helpful definition, since what’s native is unclear. Is there a “core” part of the language that defines these types? In that case, where does DateTimeImmutable belong? Isn’t that part of an extension? Also, would we consider file handles (resources) primitive types? In the end, most of what is part of the “core” or “native” language is quite arbitrary.

Fourth, we could consider “primitive” to mean – regardless of the language – “what types do we need to describe data?” In that sense, we may look back in the history of humanity itself and consider numbers very primitive (e.g. for describing the value of something). Same for strings (e.g. for writing down a name). Arguably a date or a time isn’t primitive, because it’s built up from strings (or characters), and numbers.

Fifth, we could consider “primitive” to mean “bare values, not necessarily sensible or correct ones”. So “2” is a primitive value, it doesn’t say 2 of what, so we can’t judge if the value is correct. “UKj” is a primitive value, it doesn’t say what it describes, so there’s no way to judge this value. Using this definition, a DateTimeImmutable value is certainly not a primitive value because when you instantiate it, it processes the provided string constructor argument and throws an error if it is not a sensible one. Or, maybe worse, converts it into a value that does make sense, but may no longer match the intention of the actor that produced the value.
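Both behaviours can be observed with a few lines of PHP. This is only a sketch; the input strings are arbitrary examples, and the explicit UTC time zone is an assumption added to keep the result independent of server settings.

```php
<?php
$utc = new DateTimeZone('UTC');

// A bare string carries no judgement: any value is accepted as-is.
$raw = 'not a date';

// DateTimeImmutable processes its input and rejects this one.
// Exception is the base class; newer PHP versions throw a more specific subclass.
$rejected = false;
try {
    new DateTimeImmutable($raw, $utc);
} catch (Exception $e) {
    $rejected = true;
}

// An impossible calendar date is not rejected but silently rolled over.
$converted = (new DateTimeImmutable('2022-02-31', $utc))->format('Y-m-d');

var_dump($rejected);      // bool(true)
echo $converted, PHP_EOL; // 2022-03-03
```

The second case is the “maybe worse” one: whoever submitted 31 February probably did not mean 3 March.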

For me, this final point is the most important attribute of primitive-ness, which disqualifies DateTimeImmutable as a primitive-type value. Anyway, we already established that DateTimeImmutable can’t be considered primitive according to the other definitions either.

Am I missing any possible definitions of “primitive” here? Just let me know!

PHP 8.2.0 RC2 available for testing – PHP: Hypertext Preprocessor

The PHP team is pleased to announce the release of PHP 8.2.0, RC 2. This is the second release candidate, continuing the PHP 8.2 release cycle, the rough outline of which is specified in the PHP Wiki. For source downloads of PHP 8.2.0, RC 2 please visit the download page. Please carefully test this version and report any issues found in the bug reporting system. Please DO NOT use this version in production, it is an early test version. For more information on the new features and other changes, you can read the NEWS file or the UPGRADING file for a complete list of upgrading notes. These files can also be found in the release archive. The next release will be the third release candidate (RC 3), planned for 29 September 2022. The signatures for the release can be found in the manifest or on the QA site. Thank you for helping us make PHP better.

Porting Curveball to Bun – Evert Pot

Bun is the hot new server-side JavaScript runtime, in the same category
as Node and Deno. Bun uses the JavaScriptCore engine from
WebKit, unlike Node and Deno which use V8. A big selling point is that
it’s coming out faster in many benchmarks, but the things I’m personally
excited about are some of its quality-of-life features:

  • It parses TypeScript and JSX by default (but doesn’t type check), which
    means there’s no need for a separate ‘dist’ directory, or a separate tool
    like ts-node.
  • It loads .env files by default.
  • It’s compatible with NPM, package.json, and many built-in Node modules.

I also like that its ‘Hello world HTTP server’ is as simple as writing this
file:

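// http.ts
export default {
  port: 3000,
  // Returns a Response directly; the function is not async,
  // so the return type is Response rather than Promise<Response>.
  fetch(request: Request): Response {
    return new Response("Hello world!");
  },
};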

And then running it with:

bun run http.ts

Bun will recognize that an object with a fetch function was default-exported,
and start a server on port 3000. As you can see here, this uses the standard
Request and Response objects you use in a browser, and can use
async/await.

These are all things that didn’t exist when Node and Express were first
created, but seem like pretty good ideas for something built today. I don’t think
using Request and Response is a good fit for more complex use-cases (streaming
responses, 1xx responses, trailers, upgrading to other protocols, getting TCP
connection metadata like remoteAddr are some that come to mind),
because these objects are designed for clients first.

But in many cases people are just building simple endpoints, and for that it’s
great.

Bun supports a ton of the standard Node modules, but it’s also missing some,
such as support for server-side websockets and the Node http/https
packages, which for now makes it incompatible with popular frameworks like
Express.

Porting Curveball

Curveball is a TypeScript micro-framework we’ve been developing since
mid-2018 as a modern replacement for Express and Koa. A key difference between
Curveball and these two frameworks is that it fully abstracts and encapsulates
the core ‘Request’ and ‘Response’ objects Node provides.

This made it very easy to create a Lambda integration in the past; instead of
mapping to Node’s Request and Response types, all I needed was a simple mapping
function for Lambda’s idea of what a request and response look like.

To get Express to run on AWS Lambda, the Node http stack needs to be emulated, or
a full-blown HTTP/TCP server needs to be started and proxied to. Each of these
workarounds requires a ton of code from libraries like serverless-express.

So with Bun up and coming, either the same work would need to be done to emulate
Node’s APIs, or Bun would need to add full compatibility for the Node http
module (which is eventually coming).

But because of Curveball’s message abstractions it was relatively easy to get
up and running. Most of the work was moving the Node-specific code into a new
pa

Truncated by Planet PHP, read more at the original (another 6665 bytes)

Is it a DTO or a Value Object? – Matthias Noback

A common misunderstanding in my workshops (well, whose fault is it then? ;)), is about the distinction between a DTO and a value object. And so I’ve been looking for a way to categorize these objects without mistake.

What’s a DTO and how do you recognize it?

A DTO is an object that holds primitive data (strings, booleans, floats, nulls, arrays of these things). It defines the schema of this data by explicitly declaring the names of the fields and their types. It can only guarantee that all the data is there, simply by relying on the strictness of the programming language: if a constructor has a required parameter of type string, you have to pass a string, or you can’t even instantiate the object. However, a DTO does not provide any guarantee that the values actually make sense from a business perspective. Strings could be empty, integers could be negative, etc.

There are different flavours of the class design for DTOs:

/**
 * @object-type DTO
 *
 * Using a constructor and public readonly properties:
 */
final class AnExample
{
    public function __construct(
        public readonly string $field,
        // ...
    ) {
    }
}

/**
 * @object-type DTO
 *
 * Using a constructor with private readonly properties
 * and public getters:
 */
final class AnotherExample
{
    public function __construct(
        private readonly string $field,
        // ...
    ) {
    }

    public function field(): string
    {
        return $this->field;
    }
}
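To illustrate what a DTO does and doesn’t guarantee, here is a small sketch. SignUpData and its fields are hypothetical names chosen for this example.

```php
<?php
// Hypothetical DTO for a sign-up form; class and field names are invented.
final class SignUpData
{
    public function __construct(
        public readonly string $emailAddress,
        public readonly int $age,
    ) {
    }
}

// The schema is satisfied, so this works, although neither value makes sense.
$dto = new SignUpData('', -5);

// Breaking the schema is caught by the language itself.
$schemaEnforced = false;
try {
    new SignUpData(null, 30); // null is not a string
} catch (TypeError $e) {
    $schemaEnforced = true;
}
```

The empty email address and the negative age pass without complaint: the DTO checks that the data is there and has the right types, nothing more.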

Regarding the naming of a DTO: I recommend not adding “DTO” to the name itself. If you want to make it clear what the type is, add a comment, or an invented annotation (or attribute) like @object-type. This will be very useful for developers that are not aware of these object types. It may trigger them to look up an article about what it means (this article, maybe :)).

What’s a value object and how do you recognize it?

A value object is an object that wraps one or more values or value objects. It guarantees that all the data is there, and also that the values make sense from a domain perspective. Strings will no longer be empty, numbers will be verified to be in the correct range. A value object can offer these guarantees by throwing exceptions inside the constructor, which is private, forcing the client to use one of the static, named constructors. This makes a value object easy to recognize, and clearly distinguishable from a DTO:

final class AnExample
{
    private function __construct(
        private string $value
    ) {
    }

    public static function fromValue(
        string $value
    ): self {
        /*
         * Throw an exception when the value doesn't
         * match all the expectations.
         */

        return new self($value);
    }
}

While a DTO just holds some data for you and provides a clear schema for this data, a value object also holds some data, but offers evidence that the data matches the expectations. When the value object’s class is used as a parameter, property, or return type, you know that you are dealing with a correct value.
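A more concrete version of such a value object could look like the sketch below. EmailAddress is a hypothetical example, and the validation rule (PHP’s built-in FILTER_VALIDATE_EMAIL filter) is just one possible choice of “expectations”.

```php
<?php
// Hypothetical value object; the validation rule is only an example.
final class EmailAddress
{
    private function __construct(
        private string $value
    ) {
    }

    public static function fromString(string $value): self
    {
        if (filter_var($value, FILTER_VALIDATE_EMAIL) === false) {
            throw new InvalidArgumentException('Invalid email address: ' . $value);
        }

        return new self($value);
    }

    public function asString(): string
    {
        return $this->value;
    }
}

$email = EmailAddress::fromString('info@example.com');

// An empty string is a perfectly valid string, but not a valid email address.
$rejected = false;
try {
    EmailAddress::fromString('');
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
```

Because the constructor is private, the only way to get an EmailAddress instance is through fromString(), so any instance you receive has passed the check.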

How should we use these object types?

Meaning is defined by use. If we are using “DTO” and “value object” in the wrong way, their names will eventually get a different meaning. This might be how the confusion between the two terms arises in the first place.

DTOs

A DTO should only be used in two places: where data enters the application or where it leaves the application. Some examples:

  1. When a controller receives an HTTP POST request, the request data may have any shape. We need to go from shapeless data to data with a schema (verified keys and types). We can use a DTO for this. A form library may be able to populate this DTO based on submitted form data, or we can use a serializer to convert the plain-text request body to a populated DTO.
  2. When we make an HTTP POST request to a web service, we may collect the input data in a DTO first, and then serialize it to a request body that our HTTP client can send to the service.
  3. For queries the situation is similar. Here we can use a DTO to represent the query result. As an example we can pass a DTO to a template to render a view based on it. We can use a DTO, serialize it to JSON and send it back as an API response.
  4. When we send an HTTP GET request to a web service, we may deserialize the API response into a DTO first, so we can apply a known schema to it instead of just accessing array keys and guessing the types. API client packages usually offer DTOs for requests and responses.
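The first scenario, going from shapeless request data to data with a schema, can be sketched as follows. CreateOrderRequest and its fields are hypothetical, and the hand-written fromJson() mapping stands in for whatever serializer or form library you would normally use.

```php
<?php
// Hypothetical DTO describing the expected shape of a request body.
final class CreateOrderRequest
{
    public function __construct(
        public readonly string $productId,
        public readonly int $quantity,
    ) {
    }

    // Hand-written mapping; a serializer or form library could do this instead.
    public static function fromJson(string $json): self
    {
        $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

        if (!is_array($data)
            || !is_string($data['productId'] ?? null)
            || !is_int($data['quantity'] ?? null)
        ) {
            throw new InvalidArgumentException('Request body does not match the schema');
        }

        return new self($data['productId'], $data['quantity']);
    }
}

// Keys and types are verified; whether a quantity of 0 makes sense is not.
$request = CreateOrderRequest::fromJson('{"productId": "abc-123", "quantity": 0}');
```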

Truncated by Planet PHP, read more at the original (another 1622 bytes)

A step-debugger for the PHP AST – Matthias Noback

When you’re learning to write custom rules for PHPStan or Rector, you’ll have to learn more about the PHP programming language as well. To be more precise, about the way the interpreter parses PHP code. The result of parsing PHP code is a tree of nodes which represents the structure of the code, e.g. you’ll have a Class definition node, a Method definition node, and within those methods Statement nodes, and so on. Each node can be checked for errors (with PHPStan), or automatically refactored in some way (with Rector).

The tree of nodes is called an Abstract Syntax Tree (AST), and a successful PHPStan or Rector rule starts with selecting the right nodes from the tree and “subscribing” your rule to these nodes. A common approach for this is to start var_dump-ing or echo-ing nodes inside your new rule, but I’ve found this to be quite tedious. Which is why I’ve created a simple command-line tool that lets you inspect the nodes of any given PHP file.

The tool is called AST Inspector and is available on GitHub.

Install it with Composer:

composer require --dev matthiasnoback/php-ast-inspector

Then run:

vendor/bin/ast-inspect inspect [file.php]

You’ll see something similar to this output:

Screenshot of PHP AST inspector

You can navigate through the tree by going to the next or previous node, or jumping into the subnodes of the selected node. Navigation conveniently uses the a,s,d,w keys.

Currently the project uses the PHP-Parser library for parsing. Since PHPStan adds additional virtual nodes to the AST, it will be useful to show them in this tool as well, but that requires some additional work. Another interesting addition would be to show the types that PHPStan derives for variables in the inspected code. That will also require some more work…

For now, please give this program a try, and let me know what you think! I’m happy to add more features to it, as long as it makes the learning curve for these amazing tools less steep. And if you’re looking for an in-depth exploration of writing your own PHPStan or Rector rules, check out the documentation linked above or one of my books (Recipes for Decoupling, which shows how to create PHPStan rules, and Rector – The Power of Automated Refactoring, which does the same for Rector).