Xdebug Update: December 2019 – Derick Rethans

Xdebug Update: December 2019

Another month, another monthly update where I explain what happened with Xdebug development in this past month. It will be published on the first Tuesday after the 5th of each month. Patreon supporters will get it earlier, on the first of each month. You can become a patron here to support my work on Xdebug. If you are leading a team or company, then it is also possible to support Xdebug through a subscription.

In December, I worked on Xdebug for nearly 50 hours, on the following things:

Xdebug 2.9.0

After releasing Xdebug 2.8.1, which I mentioned in last month’s update, at the start of the month, more users noticed that although I had improved code coverage speed compared to Xdebug 2.8.0, it was still annoyingly slow. Nikita Popov, one of the PHP developers, suggested a new approach for finding out which classes and functions still had to be analysed. He pointed out that classes and functions are always added to the end of the class/function tables, and that they are never removed either. This resulted in a patch where the algorithm that determines whether a class/function still needs to be analysed changed from O(n²) to approximately O(n). You can read more about this in the article that I wrote about it. A few other issues were addressed in Xdebug 2.9.0 as well.
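The trick can be sketched in plain PHP (Xdebug itself implements this in C; the function and variable names here are made up for illustration):

```php
// Because classes/functions are only ever appended to their tables and
// never removed, we can remember how many entries we already analysed
// and only look at the new tail on each pass -- roughly O(n) overall,
// instead of rescanning the whole table every time (O(n²)).
function analyseNewEntries(array $functionTable, int &$alreadyAnalysed): array
{
    $newEntries = array_slice($functionTable, $alreadyAnalysed);
    $alreadyAnalysed = count($functionTable);

    return $newEntries; // only these still need to be analysed
}

$analysed = 0;
$table = ['foo', 'bar'];
analyseNewEntries($table, $analysed); // analyses foo and bar
$table[] = 'baz';                     // a new function gets appended
analyseNewEntries($table, $analysed); // analyses only baz
```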

Breakpoint Resolving

In the May update I wrote about resolving breakpoints. This feature tries to make sure that whenever you set a breakpoint, Xdebug actually breaks on it. However, there are currently two issues with this: 1. breaks happen more often than expected, and 2. the algorithm to find lines is really slow. I am addressing both of these problems by using a trick similar to the one Nikita suggested for speeding up code coverage analysis. This work requires quite a few rewrites of the breakpoint resolving function, and hence is ongoing. I expect this to culminate in an Xdebug 2.9.1 release during January.

debugclient and DBGp Proxy

I have wanted to learn Go for a while, and in order to get my feet wet I started implementing Xdebug’s bundled debugclient in Go, and at the same time creating a library to handle the DBGp protocol.

The main reason why a rewrite is useful is that the debugclient as bundled with Xdebug no longer seems to work with libedit. This makes using debugclient really annoying, as I can’t simply use the up and down arrows to scroll through my command history. I primarily use the debugclient to test the DBGp protocol, without an IDE “in the way”.

The reason to write a DBGp library is that there are several implementations of a DBGp proxy. It is unclear whether they actually implement the protocol, or just do something that “works”. I will try to make the DBGp proxy that I will be working on stick to the protocol exactly, which might require changes to IDEs that implement it against a non-compliant one (Komodo’s pydbgpproxy seems to be one of these).

This code is currently not yet open source, mostly because I am still finding my feet with Go. I expect to release parts of this on the way to Xdebug 3.0.

Truncated by Planet PHP, read more at the original (another 913 bytes)

Rules for working with dynamic arrays and custom collection classes – Matthias Noback

Here are some rules I use for working with dynamic arrays. It’s pretty much a Style Guide for Array Design, but it didn’t feel right to add it to the Object Design Style Guide, because not every object-oriented language has dynamic arrays. The examples in this post are written in PHP, because PHP is pretty much Java (which might be familiar), but with dynamic arrays instead of built-in collection classes and interfaces.

Using arrays as lists

All elements should be of the same type

When using an array as a list (a collection of values with a particular order), every value should be of the same type:

$goodList = [
    'a',
    'b',
];

$badList = [
    'a',
    1,
];

A generally accepted style for annotating the type of a list is: @var array<TypeOfElement>.
Make sure not to add the type of the index (which would always be int).
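For example (the variable name is invented for illustration):

```php
/** @var array<string> */
$fruits = ['apple', 'banana', 'cherry'];
```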

The index of each element should be ignored

PHP will automatically create new indexes for every element in the list (0, 1, 2, etc.).
However, you shouldn’t rely on those indexes, nor use them directly.
The only properties of a list that clients should rely on are that it is iterable and countable.

So feel free to use foreach and count(), but don’t use for to loop over the elements in a list:

// Good loop:
foreach ($list as $element) {
}

// Bad loop (exposes the index of each element):
foreach ($list as $index => $element) {
}

// Also bad loop (the index of each element should not be used):
for ($i = 0; $i < count($list); $i++) {
}

(In PHP, the for loop might not even work, because there may be indices missing in the list, and indices may be higher than the number of elements in the list.)

Instead of removing elements, use a filter

You may be tempted to remove elements from a list by their index (unset()), but instead of removing elements you should use array_filter() to create a new list without the unwanted elements.

Again, you shouldn’t rely on the index of elements, so when using array_filter() you shouldn’t use the flag parameter to filter elements based on the index, or even based on both the element and the index.

// Good filter:
array_filter(
    $list,
    function (string $element): bool {
        return strlen($element) > 2;
    }
);

// Bad filter (uses the index to filter elements as well)
array_filter(
    $list,
    function (int $index): bool {
        return $index > 3;
    },
    ARRAY_FILTER_USE_KEY
);

// Bad filter (uses both the index and the element to filter elements)
array_filter(
    $list,
    function (string $element, int $index): bool {
        return $index > 3 || $element === 'Include';
    },
    ARRAY_FILTER_USE_BOTH
);

Using arrays as maps

When keys are relevant and they are not indices (0, 1, 2, etc.), feel free to use an array as a map (a collection from which you can retrieve values by their unique key).

All the keys should be of the same type

The first rule for using arrays as maps is that all the keys in the array should be of the same type (string keys being the most common).

$goodMap = [
    'foo' => 'bar',
    'bar' => 'baz',
];

// Bad (uses different types of keys)
$badMap = [
    'foo' => 'bar',
    1 => 'baz',
];

All the values should be of the same type

The same goes for the values in a map: they should be of the same type.

$goodMap = [
    'foo' => 'bar',
    'bar' => 'baz',
];

// Bad (uses different types of values)
$badMap = [
    'foo' => 'bar',
    'bar' => 1,
];

A generally accepted style for annotating the type of a map is: @var array<TypeOfKey, TypeOfValue>.
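For example (again with invented names):

```php
/** @var array<string, int> */
$wordCounts = [
    'hello' => 2,
    'world' => 5,
];
```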

Maps should remain private

Lists can safely be passed around from object to object, because of their simple characteristics.
Any client can use one to loop over its elements, or count its elements, even if the list is empty.
Maps are more difficult to work with, because clients may rely on keys that have no corresponding value.
This means that in general, they should remain private to the object that manages them.
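A minimal sketch of this rule (the class and method names are invented for illustration, not taken from the book):

```php
final class ExchangeRates
{
    /** @var array<string, float> */
    private array $rates = [];

    public function setRate(string $currencyCode, float $rate): void
    {
        $this->rates[$currencyCode] = $rate;
    }

    public function rateFor(string $currencyCode): float
    {
        // Clients never touch the map directly, so they can't rely on
        // a key that has no corresponding value.
        if (!array_key_exists($currencyCode, $this->rates)) {
            throw new InvalidArgumentException(
                'No rate registered for ' . $currencyCode
            );
        }

        return $this->rates[$currencyCode];
    }
}
```

Clients can only go through setRate() and rateFor(), so a missing key becomes an explicit error instead of an accidental null.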

Truncated by Planet PHP, read more at the original (another 8885 bytes)

Performance testing HTTP/1.1 vs HTTP/2 vs HTTP/2 + Server Push for REST APIs – Evert Pot


When building web services, a common wisdom is to try to reduce the number of
HTTP requests to improve performance.

There are a variety of benefits to this, including fewer total bytes being
sent, but the predominant reason is that traditionally browsers will only make
6 HTTP requests in parallel for a single domain. Before 2008, most browsers
limited this to 2.

When this limit is reached, it means that browsers will have to wait until
earlier requests are finished before starting new ones. One implication is that
the higher the latency is, the longer it will take until all requests finish.

Take a look at an example of this behavior. In the following simulation we’re
fetching a ‘main’ document. This could be the index of a website, or some
JSON collection.

After getting the main document, the simulator grabs 99 linked items. These
could be images, scripts, or other documents from an API.

The 6-connection limit has resulted in a variety of optimization techniques.
Scripts are combined and compressed, and graphics are often combined into
‘sprite maps’.

The limit and ‘cost’ of a single HTTP connection have also had an effect on web
services. Instead of creating small, specific API calls, designers of REST
(and other HTTP-based) services are incentivized to pack many logical ‘entities’
into a single HTTP request/response.

For example, when an API client needs a list of ‘articles’ from an API, usually
they will get this list from a single endpoint instead of fetching each article
by its own URI.

The savings are massive. The following simulation is similar to the last,
except now we’ve combined all entities in a single request.

If an API client needs a specific (large) set of entities from a server, then
in order to reduce HTTP requests, API developers are compelled either to build
more API endpoints, each giving a result tailored to the specific use-case of
the client, or to deploy systems that can take arbitrary queries and return
all the matching entities.

The simplest form of this is perhaps a collection with many query parameters,
and a much more complex version of this is GraphQL, which effectively uses
HTTP as a pipe for its own request/response mechanism and allows for a wide range
of arbitrary queries.

Drawbacks of compounding documents

There are a number of drawbacks to this. Systems that require compounding of
entities typically need additional complexity on both server and client.

Instead of treating a single entity as some object that has a URI, which
can be fetched with GET and subsequently cached, a new layer is required
on both server- and client-side that’s responsible for teasing these entities apart.

Re-implementing the logic HTTP already provides also has a nasty side-effect
that other features from HTTP must also be reimplemented. The most common
example is caching.

On the REST-side of things, examples of compound-documents can be seen in
virtually any standard. JSON:API, HAL and Atom all have
this notion.

If you look at most full-featured JSON:API client implementations, you will
usually see that these clients ship with some kind of ‘entity store’,
allowing them to keep track of which entities they have received, effectively
maintaining the equivalent of an HTTP cache.

Another issue with some of these systems is that it’s typically harder for
clients to request just the data they need. Since entities are often
combined in compound documents it’s all-or-nothing, or it requires significant
complexity on client and server (see GraphQL).

A more lofty drawback is that API designers may have trended towards systems
that are more opaque, and no longer part of the web of information, due
to a lack of the interconnectedness that linking affords.

HTTP/2 and HTTP/3

HTTP/2 is now widely available, and with it the cost of HTTP requests is
significantly lower. Whereas HTTP/1.1 required opening one TCP
connection per request, with HTTP/2 a single connection is opened per domain.
Many requests can flow through it in parallel, and potentially out of order.

Instead of delegating parallelism to compound documents, we can now actually
rely on the protocol itself to handle this.
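In PHP, for instance, curl’s multi handle can take advantage of this multiplexing. A sketch, assuming a curl build with HTTP/2 support (the URL is a placeholder):

```php
$multi = curl_multi_init();
// Let curl multiplex all requests over a single HTTP/2 connection.
curl_multi_setopt($multi, CURLMOPT_PIPELINING, CURLPIPE_MULTIPLEX);

$handles = [];
for ($i = 1; $i <= 3; $i++) {
    $ch = curl_init("https://api.example.org/articles/{$i}");
    curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_2TLS);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($multi, $ch);
    $handles[] = $ch;
}

// All transfers share one connection instead of one connection each.
do {
    $status = curl_multi_exec($multi, $active);
    if ($active) {
        curl_multi_select($multi);
    }
} while ($active && $status === CURLM_OK);
```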

Using many HTTP/2 req

Truncated by Planet PHP, read more at the original (another 333361 bytes)

Ajax live search, enhance your site’s search experience – PHP Scripts – Web Development Blog

If your website has a lot of pages, a well-functioning search function becomes more important. Is your website based on WordPress? Then you don’t need to worry about the search function, because it’s a standard feature. But how about a great search experience? Relevant search results are one part of the game, but how […]


The release of Object Design Style Guide – Matthias Noback


Today Manning released my latest book! It’s called “Object Design Style Guide”.

In November 2018 I started working on this book. The idea for it came from a conversation I had with the friendly folks at Akeneo (Nantes) earlier that year. It turned out that, after days of high level training on web application architecture and Domain-Driven Design, there was a need for some kind of manual for low level object-oriented programming. Not as low level as the kind of programming advice people usually refer to as clean code, but general programming rules for different kinds of objects. For instance:

  • A service gets its dependencies and configuration values injected as constructor arguments.
  • A service is an immutable object.
  • An entity always has a named constructor.
  • An entity is the only type of mutable object in an application.

And so on…

While exploring the subject area I realized there would be a lot to write about, and a lot to declare as out-of-scope too. I thought it would be very important to make it a practical book, with short and clear rules for object design.

Soon enough I was able to publish a couple of chapters on Leanpub. For me publishing through Leanpub has always worked well, maybe because my writing process usually turns out to be more or less linear. This allows me to publish the first chapters, and then keep publishing new chapters over the course of several months. I absolutely love how early supporters bought this book when they could only read a few chapters. I’ve said it before, but this is an important encouragement to keep going.

Re-publishing an existing book

In January of this year I was approached by Manning. They asked if I would like to work on a book about Domain-Driven Design. I suggested working on a new edition of the Style Guide instead, and they were happy about that from the start. Manning has a rather extensive process for onboarding, and although I thought it wouldn’t be worth it because the book already existed, they still pushed me to write a profile of the “Minimum Qualified Reader” (MQR) and to create a “Weighted Table of Contents” (WTC). I’m glad they did, because it definitely improved the quality of the learning experience for readers of this book. The MQR is a description of the reader: what do I expect from them in terms of experience, knowledge, interests, etc. This helped me write the most useful book for the imagined reader, and also allowed me to skip certain basic concepts that I could assume the reader to already know about.

The WTC shows for each chapter how hard it is to read and understand. Ideally a book would start simple, with some introductory chapters, and build up towards the more complicated chapters. I think I’ve actually achieved this. The book starts with some basic concepts, building up towards an explanation about changing behavior without changing code, and separating write from read models. The final two chapters provide an overview of different types of objects and how they work together in a web application, and an overview of related topics and suggestions for further reading.

Overall, the publishing process was well managed by the team at Manning. They provided lots of advice on using diagrams, adding exercises, etc. I can recommend them to anyone who is looking for a serious process leading to a book published by a serious publisher. Just be ready to spend some more time on making everything just right. For comparison: the first edition of the book, as published on Leanpub, took me only 127 hours to write. Working with Manning, processing review suggestions, fixing issues, following the steps in their publishing process, etc. took another 132 hours.

Comparison with the PHP edition

The Manning edition of the style guide is different from the PHP edition published on Leanpub in the following ways:

  • It has more words; there are many little additions, like “asides” to answer reader questions. It is now about 290 pages in print.
  • It has better words; it has been checked for grammar, spelling and style issues.
  • It has exercises; the first chapters have exercises. Some multiple-choice questions and some open questions where you have to write a bit of code. The answers can be found at the end of each chapter. For multiple-choice questions I explain why the correct answer is the only correct answer and why the other possible answers would be wrong. For the code exercises I prov

Truncated by Planet PHP, read more at the original (another 806 bytes)

Why You Should Use Asynchronous PHP

The Difference Between Synchronous vs. Asynchronous Programming

Before we discuss the merits of asynchronous PHP, let’s quickly review the differences between the synchronous and asynchronous programming models. Synchronous code is sequential: each task must be completed before the next one can start. In asynchronous code, multiple tasks can make progress at the same time, which can dramatically boost application performance and user experience.
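As a toy illustration (not any particular async framework), PHP’s generators can interleave tasks cooperatively: each task yields control back to a tiny scheduler after every step, so no task blocks the others to completion.

```php
// Each task does a bit of work, then yields control back.
function task(string $name, int $steps): Generator
{
    for ($i = 1; $i <= $steps; $i++) {
        echo "$name step $i\n";
        yield;
    }
}

// A minimal round-robin scheduler: resume every task one step at a
// time until all of them have finished.
function run(array $tasks): void
{
    foreach ($tasks as $task) {
        $task->current(); // start each generator (runs to its first yield)
    }

    while ($tasks) {
        foreach ($tasks as $key => $task) {
            $task->next();
            if (!$task->valid()) {
                unset($tasks[$key]); // task finished
            }
        }
    }
}

run([task('A', 2), task('B', 2)]);
// Output:
// A step 1
// B step 1
// A step 2
// B step 2
```

Real asynchronous PHP frameworks schedule I/O rather than echo statements, but the interleaving principle is the same.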

BigQuery: Using multiple cursors / rectangular selection in BigQuery UI – Pascal Landau

Multiple keyboard shortcuts usable in the BigQuery UI are listed in the
official documentation, though the one for
using multiple cursors is missing:

ALT + left-mouse-button-drag

Using multiple cursors in the BigQuery UI via ALT + left mouse drag


  • keep the ALT key pressed first and then click the left mouse button and drag it up or down vertically
  • the same hotkeys can be used to draw a rectangular selection (aka column selection)
    Rectangular / column selection in BigQuery
  • using ALT + LEFT and ALT + RIGHT will position one (or all) cursors at the beginning or end of the line, respectively

Use cases

We often deal with multiple datasets and tables that have exactly the same structure, e.g. due to sharding. In those cases it’s
often necessary to modify different parts of a query in exactly the same way, which is where multiple cursors come in handy.