PHP 8.2.0 RC 4 available for testing – PHP: Hypertext Preprocessor

The PHP team is pleased to announce the release of PHP 8.2.0, RC 4. This is the fourth release candidate, continuing the PHP 8.2 release cycle, the rough outline of which is specified in the PHP Wiki. For source downloads of PHP 8.2.0, RC 4 please visit the download page. Please carefully test this version and report any issues found in the bug reporting system. Please DO NOT use this version in production; it is an early test version. For more information on the new features and other changes, you can read the NEWS file or the UPGRADING file for a complete list of upgrading notes. These files can also be found in the release archive. The next release will be the fifth release candidate (RC 5), planned for 27 October 2022. The signatures for the release can be found in the manifest or on the QA site. Thank you for helping us make PHP better.

Refactoring without tests should be fine – Matthias Noback

Refactoring without tests should be fine. Why is it not? When could it be safe?

From the cover of “Refactoring” by Martin Fowler:

Refactoring is a controlled technique for improving the design of an existing code base. Its essence is applying a series of small behavior-preserving transformations, each of which “too small to be worth doing”. However the cumulative effect of each of these transformations is quite significant.

Although the word “refactoring” is used by programmers in many different ways (often it just means “changing” the code), in this case I’m thinking of those small behavior-preserving transformations. The essence of those transformations is:

  • The structure of the code changes (e.g. we add a method, a class, an argument, etc.), but
  • The behavior of the program stays the same.

We perform these refactorings to establish a better design, which in the end will look nothing like the current design. But we only take small, safe steps. Fowler mentions in the introduction of the book that tests are necessary to find out if a refactoring didn’t break anything. Then the book goes on to show a large number of small behavior-preserving transformations. Most of those transformations are quite safe, and we can’t think of a reason why they would go wrong. So they wouldn’t really need tests as a safety net after all. Except, in practice, we do need tests because we run into problems, like:

  1. We make a mistake at syntax-level (e.g. we forget a comma, a bracket, or we write the wrong function name etc.)
  2. We neglect to update all the clients (e.g. we rename a method, but one of the call sites still uses the old name)
  3. An existing test is tied to the old code structure (e.g. it refers to a class that no longer exists after the refactoring)
  4. We change not only the structure, but also the behavior

I find that a static analysis tool like PHPStan will cover you when it comes to the first category of issues. For instance, PHPStan will report errors for incorrect code, or calls to methods that don’t exist, etc.

The same goes for the second category, but there’s a caveat. In order to make no mistakes here, we have to be aware of all the call sites/clients. This can be hard in applications where a lot of dynamic programming is used: a method is not called explicitly but dynamically, e.g. $controller->{$method}(). In that case, PHPStan won’t be able to warn you if you changed the method name that used to be invoked in this way. It’s why I don’t like methods and classes being dynamically resolved: because it makes refactoring harder and more dangerous. In some cases we may install a PHPStan extension or write our own, so PHPStan can dynamically resolve types that it could never derive from the code itself. But still, dynamic programming endangers your capability to safely refactor.
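To illustrate why dynamic calls undermine a rename, here is a minimal sketch. The UserController class and the dispatch() helper are hypothetical names invented for this example. Because the method name only exists as a string at runtime, a static analysis tool generally has nothing to compare it against, and the stale name only fails when the code actually runs.

```php
<?php

// Hypothetical controller, used only for illustration.
final class UserController
{
    // Imagine this method was named showAction() before the refactoring.
    public function show(): string
    {
        return 'user profile';
    }
}

// Hypothetical dispatcher: the method name is only known at runtime,
// e.g. because it was derived from a route or a configuration file.
function dispatch(object $controller, string $method): string
{
    if (!method_exists($controller, $method)) {
        throw new BadMethodCallException("Method {$method}() does not exist");
    }

    return $controller->{$method}();
}

$controller = new UserController();

// The new name works.
echo dispatch($controller, 'show'), PHP_EOL;

// A call site that still uses the old name is just a string,
// so the mistake only surfaces when this line is executed.
try {
    dispatch($controller, 'showAction');
} catch (BadMethodCallException $e) {
    echo $e->getMessage(), PHP_EOL;
}
```

An explicit call such as $controller->show() would give both the IDE and the static analysis tool something concrete to check and rename.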

The third group of failures, where tests become the problem that keeps us from making simple refactorings, can be overcome to a large extent by writing better tests. Many unit tests that I’ve seen would count as tests that are too close to the current structure of the code. If you want to extract a class, or merge two classes, the unit test for the original class really gets in the way, because it still relies on that old class. We’d have to rewrite, rename, move tests, in order to keep track of the modified structure of the code. This is wasteful, and unnecessary: there are good ways to write so-called higher-level tests that aren’t so susceptible to a modified code structure. When the structure changes, they don’t immediately become useless or break because of the modified structure.

From my own experiences with refactoring-without-tests, I bet the fourth category is the worst and hardest to tackle. It often happens that you don’t just make that structural change, but add some semi-related change to the same commit, one that turns out to change the behavior of the code. I’ve found that you really need to program in (at least) a pair to reduce the number of mistakes in this category. A navigator will check constantly: are we changing behavior here as well? Are these merely the structural changes we came here for? Examples of mistakes I made, where the change was more than the small behavior-preserving transformation that a refactoring is intended to be:

  • I removed stuff that seemed unused (correctness of such a change may be really hard to prove)
  • I changed settings, tweaked stuff hoping for a better performance (this has to be proven with a benchmark)
  • I tried to replace the design of several classes at once, instead of in the step-by-step fashion that refactoring requires.

I think as developers we’ve deviated a lot from the original idea behind refactoring. We’ve been hoping to impr

Truncated by Planet PHP, read more at the original (another 564 bytes)

PHP 8.2.0 RC3 available for testing – PHP: Hypertext Preprocessor

The PHP team is pleased to announce the third release candidate of PHP 8.2.0, RC 3. This continues the PHP 8.2 release cycle, the rough outline of which is specified in the PHP Wiki. For source downloads of PHP 8.2.0 RC 3 please visit the download page. Please carefully test this version and report any issues found in the bug reporting system. Please DO NOT use this version in production; it is an early test version. For more information on the new features and other changes, you can read the NEWS file, or the UPGRADING file for a complete list of upgrading notes. These files can also be found in the release archive. The next release will be the fourth release candidate (RC 4), planned for Oct 13th 2022. The signatures for the release can be found in the manifest or on the QA site. Thank you for helping us make PHP better.

Good design means it’s easy-to-change – Matthias Noback

Software development seems to be about change: the business changes and we need to reflect those changes, so the requirements or specifications change, frameworks and libraries change, so we have to change our integrations with them, etc. Changing the code base accordingly is often quite painful, because we made it resistant to change in many ways.

Code that resists change

I find that not every developer notices the “pain level” of a change. As an example, I consider it very painful if I can’t rename a class, or change its namespace. One reason could be that some classes aren’t auto-loaded with Composer, but are still manually loaded with require statements. Another reason could be that the framework expects the class to have a certain name, be in a certain namespace, and so on. This may be something you personally don’t consider painful, since you can avert the pain by simply never considering renaming or moving classes.

Still, in the end, you know that a code base that resists change like this is going to be considered “a case of severe legacy code”. That’s because you can’t resist change in the software development world, and eventually it will be time to make that change, and then you can experience the pain that you’ve been postponing for so long.

Software can resist change in many ways; here are just a few examples that come to mind:

  • Classes have to be in a certain directory/namespace and methods have to have certain names in order to be picked up by the framework
  • There’s another application using the same database, which can’t deal with extra columns that it doesn’t know about
  • Class auto-loading only works if you call session_start() first
  • And so on… If you have another cool example, please add it as a comment below!

Socially-established change aversion

Change-aversion can also be socially established. As an example, the team may use a rule that says “If you create a class, you also have to create a unit test for it”. That is a very bad rule, because you can use multiple classes in a test and still call it a unit test, so the every-class-is-a-unit assumption is plain wrong. More importantly, you can’t unit-test all types of classes; some will require integrated tests. Anyway, let’s not get carried away 😉 My point is, if you have such a rule you’ll make it harder for developers to add a new class, since they fear the additional (often pointless) work of creating a test for it. In a sense, developers start to resist change. The code base itself will resist change as well, because unit tests are often too close to the implementation, making a change in the design really hard.

Unit tests are my favorite example, but there are other socially-established practices that get in the way of change. Like, “don’t change this code, because 5 years ago Leo touched it and we had to work until midnight to fix production”. Or “we asked the manager for some time to work on this, but we didn’t get it”.

Make changing things easy

From these, and many more – indeed – painful experiences, I have come to the conclusion that a very powerful way to judge the quality of code and design is to answer the question: is this easy to change? The change can be about a function name, the location of a file, installing a Composer dependency, injecting an additional constructor dependency, and so on.

However, it’s sometimes really hard to perform this evaluation yourself, since as a long-time developer you may already be used to quite some “pain”. You may be jumping through hoops to make a change, and not even realize that it’s silly and should be much easier. This is where pair or mob/ensemble programming can be really useful: working together on the same computer will expose all the changes that you avoid:

  • “Hey, let’s rename that class!”
  • “Well, I’m not sure that we can, let’s save this for another time.”
  • “Now let’s inject that new service as a constructor argument.”
  • “Sorry, we can’t use dependency injection in this part of the code base.”

That’s why I usually go all-in on ensemble programming, so we can have a clear view on all the changes that the team averts. We look the monster in the eyes.

Addendum: when changes break stuff

Part of the reason for change aversion in developers is the risk that the change may break other things. If you rename a method, you should rename all the relevant method calls too. Luckily, we have static analysis these days, which will tell you about any call sites that you missed. And of course, the IDE can safely make the change for you in most cases. Unfortunately, this is not always the cas

Truncated by Planet PHP, read more at the original (another 5175 bytes)

Extrinsic sorting: A benchmark – Larry Garfield

Sorting algorithms are generally old hat to most programmers. They’ve either analyzed them to death in class, already written many of them, or work in a language where one is provided and they don’t need to think about it. Or all three.

For PHP developers, we have a suite of sorting tools available to us: sort(), usort(), ksort(), uasort(), and various other letter combinations. All use Quick Sort internally, which is generally the best performing single-threaded option. Most importantly, many of them let us provide a custom comparison function to use when determining which of two values is larger or smaller.
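As a small illustration of a custom comparison function, the sketch below sorts a list of records by price with usort(). The $products data is made up for the example. The callback returns a negative, zero, or positive integer, which is exactly what the spaceship operator (<=>) produces.

```php
<?php

// Hypothetical sample data; prices are in cents.
$products = [
    ['name' => 'Keyboard', 'price' => 4999],
    ['name' => 'Monitor', 'price' => 19999],
    ['name' => 'Mouse', 'price' => 1999],
];

// usort() sorts the array in place and re-indexes it.
// The callback decides which of two elements is "smaller".
usort(
    $products,
    fn (array $a, array $b): int => $a['price'] <=> $b['price']
);

$names = array_column($products, 'name');

echo implode(', ', $names), PHP_EOL; // Mouse, Keyboard, Monitor
```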

admin
25 September 2022 – 10:10am

Can we consider DateTimeImmutable a primitive type? – Matthias Noback

During a workshop we were discussing the concept of a Data Transfer Object (DTO). The main characteristic of a DTO is that it holds only primitive-type values (strings, integers, booleans), lists or maps of these values, and “nested” DTOs. Not sure who came up with this idea, but I’m using it because it ensures that the DTO becomes a data structure that only enforces a schema (field names, the expected types, required fields, and optional fields), but doesn’t enforce semantics for any value put into it. That way it can be created from any data source, like submitted form values, CLI arguments, JSON, XML, YAML, and so on. Using primitive values in a DTO makes it clear that the values are not validated. The DTO is just used to transfer or carry data from one layer to the next.

A question that popped up during the workshop: can we consider DateTimeImmutable a primitive-type value too? If so, can we use this type inside DTOs?
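To make the DTO idea concrete, here is a minimal sketch that assumes PHP 8.1 or later. The ScheduleMeetingRequest class and its field names are hypothetical. The constructor signature is the schema (field names, types, one optional field), but no value is checked for being sensible.

```php
<?php

// Hypothetical DTO for illustration; class and field names are assumptions.
final class ScheduleMeetingRequest
{
    /**
     * @param list<string> $attendees
     */
    public function __construct(
        public readonly string $title,
        // A bare string, deliberately not a DateTimeImmutable.
        public readonly string $scheduledFor,
        public readonly int $durationInMinutes,
        // Optional field with a default.
        public readonly array $attendees = [],
    ) {
    }
}

// The data could come from a form, CLI arguments, JSON, and so on.
$data = json_decode(
    '{"title":"Retro","scheduledFor":"2022-02-31 10:00","durationInMinutes":45}',
    true
);

$dto = new ScheduleMeetingRequest(
    $data['title'],
    $data['scheduledFor'],
    $data['durationInMinutes'],
);

// The impossible date is carried along untouched; judging it is left to a later layer.
echo $dto->scheduledFor, PHP_EOL;
```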

I thought it was an interesting question to explore. I’d like to say “No” immediately, but why?

Is something deserving of a predicate? To decide, we have to define what the predicate means. Stated in an abstract way like this it makes a lot of sense, but when discussing concrete questions it’s often not clear that we should talk about definitions; we often like to jump to an answer immediately! So, in this case, what’s a primitive-type value?

First, we could consider “primitive” to mean “can’t be further divided into parts”. In that sense, if a Money value object consists of an integer for the number of cents and a string declaring the currency, Money is not primitive, but the integer and the string inside are, because it doesn’t make sense to take them apart. You might say that a character is more primitive than a string; it’s just that PHP doesn’t distinguish this type. DateTimeImmutable in this sense is not primitive, but the value that it contains (the timestamp) is.
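A sketch of the Money example, assuming PHP 8.1 or later (the property names are my own choice): the object can be taken apart into two values that can’t sensibly be divided any further.

```php
<?php

// Hypothetical value object; property names are assumptions.
final class Money
{
    public function __construct(
        public readonly int $amountInCents,
        public readonly string $currency,
    ) {
    }
}

$price = new Money(1999, 'EUR');

// Money is a composite: it divides into an integer and a string.
$parts = [$price->amountInCents, $price->currency];

var_dump(is_scalar($price));                // bool(false): an object, made of parts
var_dump(is_scalar($price->amountInCents)); // bool(true)
var_dump(is_scalar($price->currency));      // bool(true)
```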

Second, we could consider “primitive” to mean “what’s not an object”, or as PHP calls these values: scalars. Again, what counts as “primitive” depends on what the programming language considers primitive: Java, for instance, has strings that are commonly treated like primitive values but are nevertheless objects. Java has a weird relationship with primitive values anyway, because strings, integers, etc. look very primitive in Java code (i.e. you don’t do new String("a string") but just write "a string"). With PHP there’s less confusion around this concept. When “primitive” is used in this sense, DateTimeImmutable could never be considered a primitive-type value in PHP, because it’s an object. In Java a comparable type arguably could be, because values like strings are treated as primitive regardless of being objects.

Third, we could consider “primitive” to mean “whatever types the language offers out-of-the-box”. This is often equivalent to “native”. Unfortunately, this isn’t a very helpful definition, since what’s native is unclear. Is there a “core” part of the language that defines these types? In that case, where does DateTimeImmutable belong? Isn’t that part of an extension? Also, would we consider file handles (resources) primitive types? In the end, most of what is part of the “core” or “native” language is quite arbitrary.

Fourth, we could consider “primitive” to mean – regardless of the language – “what types do we need to describe data?” In that sense, we may look back in the history of humanity itself and consider numbers very primitive (e.g. for describing the value of something). Same for strings (e.g. for writing down a name). Arguably a date or a time isn’t primitive, because it’s built up from strings (or characters), and numbers.

Fifth, we could consider “primitive” to mean “bare values, not necessarily sensible or correct ones”. So “2” is a primitive value: it doesn’t say 2 of what, so we can’t judge if the value is correct. “UKj” is a primitive value: it doesn’t say what it describes, so there’s no way to judge this value. Using this definition, a DateTimeImmutable value is certainly not a primitive value, because when you instantiate it, it processes the provided string constructor argument and throws an exception if it is not a sensible one. Or, maybe worse, it converts the string into a value that does make sense, but may no longer match the intention of the actor that produced the value.
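Both behaviors are easy to observe. The following sketch uses two input strings that I picked for illustration: one that the constructor rejects outright, and one that it silently turns into a different date.

```php
<?php

// 1. A string that can't be parsed at all: the constructor throws.
try {
    new DateTimeImmutable('not a date');
    $rejected = false;
} catch (Exception $e) {
    // PHP 8.3+ throws a more specific subclass, but it still extends Exception.
    $rejected = true;
}

var_dump($rejected); // bool(true)

// 2. A string that looks fine but describes a day that doesn't exist.
// February 2022 has 28 days, so "31 February" rolls over into March.
$converted = new DateTimeImmutable('2022-02-31');

echo $converted->format('Y-m-d'), PHP_EOL; // 2022-03-03
```

In the second case no exception is thrown, so the caller never learns that the original input was 31 February.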

For me, this final point is the most important attribute of primitive-ness, which disqualifies DateTimeImmutable as a primitive-type value. Anyway, we already established that DateTimeImmutable can’t be considered primitive according to the other definitions either.

Am I missing any possible definitions of “primitive” here? Just let me know!