Introducing Bag 1.0: Immutable Value Objects for PHP – Davey Shafik

For the last couple of years I’ve been using Value Objects in my projects to bring language-level strict types to what would typically be array data structures in my code. From method inputs to JSON API responses, value objects have almost entirely replaced arrays throughout. The ability to get runtime type checking and IDE auto-complete has eliminated many potential bugs, from key typos to accidentally assigning an incorrectly typed value: what type is an “amount” property in a credit card transaction API response? An integer of cents (or other minor units)? A Money object such as brick/money or moneyphp/money? Or, worst of all, a float?

About 18 months ago, I started using the excellent spatie/laravel-data v3 package for a new project, but I quickly realized there were a few features missing, most notably factory support. Additionally, the collection class didn’t extend the Laravel Collection class and was anemic by comparison.

Note: spatie/laravel-data v4 adds support for factories, and has a slightly more capable DataCollection class. See below for details.

So I extended the base Data class and added factory support, following a pattern similar to Eloquent factories (including support for Sequences), and I extended DataCollection to add some of the missing functionality. Mostly, it was good.

Enter Spatie/Laravel-Data v4

Earlier this year, Spatie released v4 with support for Laravel 11 (until last month, v3 had no Laravel 11 support), with significant changes, including support for Factories and a better DataCollection class. Unfortunately, the Factories built into v4 were incompatible with my own and didn’t have the same feature set, and while better, the updated DataCollection class was still lackluster.

The upgrade process was difficult, and while ultimately successful, I was unhappy with the outcome. Then I decided that I would much prefer if my value objects were immutable, which was impossible using either v3 or v4, and ultimately, that, along with the difficult upgrade path to v4, led me to create Bag.

What is Bag?

Bag is a new library built from scratch — inspired by spatie/laravel-data — that provides immutable value objects for PHP. Built on top of Laravel’s excellent Validation and Collection classes, as well as Eloquent Factory Sequences, it is the value object library I wanted spatie/laravel-data to be.

Additionally, I leaned harder into the use of Attributes: for identifying the Collection class to use for each value object class, for wrapping, for hiding data in both toArray() and toJson()/jsonSerialize(), and for identifying transformers (what Spatie calls “magical data object creation”).

I simplified casting so that the same casts handle both input and output (as opposed to casts being for inputs, and transformers being for output).

I also added support for Variadics, something the Spatie library does not allow. For a more detailed comparison of the two libraries, see the Bag documentation.

Performance

Despite having a few more features, simple benchmarks show Bag to be about 40–45% faster than spatie/laravel-data v4, and a whopping 70–78% faster than v3.

Benchmark Methodology

The benchmark script was intentionally very simple, using Laravel’s Benchmark class to get the average of 10 runs of a loop (default: 1000 iterations) that creates instances of a Bag or Spatie value object. I ran it both with 1,000 iterations and 10,000 iterations.

The value objects have the following features:

  • Class-level Input/Output Name Mapping to/from SnakeCase
  • A single property input name mapping from CamelCase
  • A single property input name mapping from an alias
  • A property with integer and required validations
  • A property input/output cast from/to a DateTime/formatted date string

You can see all the code for the benchmark

Truncated by Planet PHP, read more at the original (another 3402 bytes)

Initializing ZendHQ JobQueue During Application Deployment – Matthew Weier O’Phinney

In the past few years, I’ve transitioned from engineering into product management at Zend, and it’s been a hugely rewarding experience to be able to toss ideas over the fence to my own engineering team, and have them do all the fiddly tricky bits of actually implementing them!

Besides packaging long-term support versions of PHP, we also publish a product called ZendHQ. This is a combination of a PHP extension and an independent service that PHP instances communicate with to do things like monitoring and queue management.

It’s the latter I want to talk about a bit here, as (a) I think it’s a really excellent tool, and (b) in using it, I’ve found some interesting patterns for prepping it during deployment.

What does it do?

ZendHQ’s JobQueue feature provides the ability to defer work, schedule it to process at a future date and time, and to schedule recurring work. Jobs themselves can be either command-line processes, or webhooks that JobQueue will call when the job runs.

Why would you use this over, say, a custom queue runner managed by supervisord, or a tool like Beanstalk, or cronjobs?

There are a few reasons:

  • Queue management and insight. Most of these tools do not provide any way to inspect what jobs are queued, running, or complete, or even if they failed. You can add those features, but they’re not built in.
  • If you are using monitoring tools with PHP, queue workers used with these tools generally cannot be monitored. If I run my jobs as web jobs, these can run within the same cluster and communicate with the same ZendHQ instance, giving me monitoring and code traces for free.
  • Speaking of using web workers, this means I can also re-use technologies that are stable and provide worker management that I already know: php-fpm and mod_php. This is less to learn, and something I already have running.
  • Retries. JobQueue allows you to configure the ability to retry a job, and how long to wait between retries. A lot of jobs, particularly if they rely on other web services, will have transient failures, and being able to retry can make them far more reliable.

So, what about queue warmup?

When using recurring jobs, you’ll (a) want to ensure your queue is defined, and (b) define any recurring jobs at application deployment. You don’t want to be checking on each and every request to see if the queues are present, or if the recurring jobs are present. Ideally, this should only happen on application deployment.

When deploying my applications, I generally have some startup scripts I fire off. Assuming that the PHP CLI is configured with the ZendHQ extension and can reach the ZendHQ instance, these scripts can (a) check for and create queues, and (b) check for and create recurring jobs.

As a quick example:

use ZendHQ\JobQueue\HTTPJob;
use ZendHQ\JobQueue\JobOptions;
use ZendHQ\JobQueue\JobQueue;
use ZendHQ\JobQueue\Queue;
use ZendHQ\JobQueue\QueueDefinition;
use ZendHQ\JobQueue\RecurringSchedule;

$jq = new JobQueue();

// Lazily create the queue "mastodon"
$queue = $jq->hasQueue('mastodon')
    ? $jq->getQueue('mastodon')
    : $jq->addQueue('mastodon', new QueueDefinition(
        QueueDefinition::PRIORITY_NORMAL,
        new JobOptions(
            JobOptions::PRIORITY_NORMAL,
            60,    // timeout
            3,     // allowed retries
            30,    // retry wait time
            JobOptions::PERSIST_OUTPUT_ERROR,
            false, // validate SSL
        )
    ));

// Look for jobs named "t

Truncated by Planet PHP, read more at the original (another 4534 bytes)

Local Whispers – Derick Rethans

Local Whispers

For most of the videos that I make, I also like to have subtitles, because sometimes it’s easier to just read along.

I used to make these subtitles with an online service called Otter.io, but they stopped allowing uploading of video files.

And then I found Whisper, which allows me to upload audio files to create subtitles. Whisper is an API from OpenAI, the company mainly known for ChatGPT.

I didn’t like having to upload everything to them either, as that means that they could train their model with my original video audio.

Whisper never really worked that well, because it broke up the sentences in weird places, and I had to make lots of edits. It took a long time to make subtitles.

I recently found out that it’s actually possible to run Whisper locally, with an open source project on GitHub. I started looking into this to see whether I could use this to create subtitles instead.

The first thing that their documentation tells you to do is to run: pip install openai-whisper.

But I am on a Debian machine, and here Python is installed through distribution packages, and I don’t really want to mess that up. apt-get actually suggests creating a virtual environment for Python.

In a virtual environment, you can install packages without affecting your system setup. Once you’ve made this virtual environment, there are Python binaries symlinked in there that you can then use for installing things.
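As an aside, you can confirm from within Python that you are running inside a virtual environment. A small, illustrative standard-library sketch (not from the original post):

```python
import sys

# In a venv, sys.prefix points at the virtual environment directory,
# while sys.base_prefix still points at the system Python installation.
in_venv = sys.prefix != sys.base_prefix
print("inside a virtualenv:", in_venv)
```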

You create the virtual environment with:

python3 -m venv `pwd`/whisper-local
cd whisper-local 

In the bin directory you then have python and pip. That’s the one you then use for installing packages.

Now let me run pip again, with the same options as before to install Whisper:

bin/pip install -U openai-whisper 

It takes quite some time to download. Once it is done, there is a new whisper binary in our bin directory.

You also need to install ffmpeg:

sudo apt-get install ffmpeg 

Now we can run Whisper on a video I had made earlier:

./bin/whisper ~/media/movie/xdebug33-from-exception.webm 

The first time I ran this, I had some errors.

My video card does not have enough memory (2GB only). I don’t actually have a very good video card at all, and was better off disabling it, by instructing “Torch” that I do not have one:

export CUDA_VISIBLE_DEVICES="" 

And then run Whisper again:

./bin/whisper ~/media/movie/xdebug33-from-exception.webm 

It first detects the language, which you can pre-empt by using --language English.

While it runs, it starts showing information in the console. I quickly noticed it was misspelling lots of things, such as my name Derick as Derek, and Xdebug as XDbook.

I also noticed that it starts breaking up sentences in an odd way after a while, just like what the online version was doing.

I did not get a good result this first time.

It did create a JSON file, xdebug33-from-exception.json, but it is all in one line.

I reformatted it by installing the yajl-tools package with apt-get, and piping the data through json_reformat:

sudo apt-get install yajl-tools
cat xdebug33-from-exception.json | json_reformat >xdebug33-from-exception_reformat.json 

The reformatted file still has the full text on a single line, but then a segments section follows, which looks like:

"segments": [ { "id": 0, "seek": 0, "start": 3.6400000000000006, "end": 11.8, "text": " Hi, I'm Derick. For most of the videos that I make, I also like to have subtitles, because", "tokens": [ 50363, 15902, 11, 314, 1101, 9626, 624, 13, 1114, 749, 286, 262, 5861, 326, 314, 787, 11, 314, 635, 588, 284, 423, 44344, 11, 780, 50960 ], "temperature": 0.0, "avg_logpro

Truncated by Planet PHP, read more at the original (another 4617 bytes)

Xdebug Update: May 2024 – Derick Rethans

Xdebug Update: May 2024

I have not written an update like this for a while. I am sorry.

In the last months I have not spent a lot of time on Xdebug due to a set of other commitments.

Since my last update in November a few things have happened though.

Xdebug 3.3

I released Xdebug 3.3, and the patch releases 3.3.1 and 3.3.2.

Xdebug 3.3 brings a bunch of new features into Xdebug, such as flamegraphs.

The debugger has significant performance improvements in relation to breakpoints. And it can now also show the contents of ArrayIterator, SplDoublyLinkedList, SplPriorityQueue objects, and information about thrown exceptions.

A few bugs were present in 3.3.0, which have been addressed in 3.3.1 and 3.3.2. There is currently still an outstanding issue (or more than one), where Xdebug crashes. There are a lot of confusing reports about this, and I have not yet managed to reproduce any of them.

If you’re running into a crash bug, please reach out to me.

There is also a new experimental feature: control sockets. These allow a client to instruct Xdebug to either initiate a debugging connection, or instigate a breakpoint out of band: i.e., when no debugging session is active. More about this in a later update.

Funding Platform

Last year, I made a prototype as part of a talk that I gave at NeosCon.io. In this talk I demonstrated native path mapping — configuring path mapping in/through Xdebug, without an IDE’s assistance.

In collaboration with Robert from NEOS, and Luca from theAverageDev, we defined a project plan that explains all the necessary functionality and work.

Adding this to Xdebug is a huge effort, and therefore I decided to set up a way for projects like this to be funded.

There is now a dedicated Projects section linked from the home page, with a list of all the projects. The page itself lists a short description for each project.

For each project, there is a full description and a list of its generous contributors. The Native Xdebug Path Mapping project is currently 85% funded. Once it is fully funded, I will start the work to get this included in Xdebug 3.4. You could be part of this too!

Truncated by Planet PHP, read more at the original (another 1644 bytes)

python-oracledb 2.2 and the VECTOR type in Oracle Database 23ai – Christopher Jones

python-oracledb 2.2, the extremely popular Oracle Database interface for Python, is now on PyPI.

Today we’re pleased to release python-oracledb 2.2 to coincide with the just-announced, general availability of Oracle Database 23ai.


Python-oracledb is an open source package implementing the Python Database API specification, with many additions to support advanced Oracle Database features. It is the new name for the cx_Oracle driver.

To get started quickly, use samples/sample_container to create a container image containing the database and python-oracledb.

Main Changes for Oracle Database 23ai

Support for the Oracle Database 23ai VECTOR data type

A major feature of Oracle Database 23ai is AI Vector Search (which is part of the database and is available at no additional charge in Enterprise Edition, Standard Edition 2, Database Free, and all Oracle Database cloud services).

Python-oracledb supports this feature with native capabilities for binding and fetching the new VECTOR data type. For example, given the table:

SQL> drop table if exists mytab_v; 
SQL> create table mytab_v (v64 vector(3, float64));

Python code to insert a vector could look like:

import array

vector_data_64 = array.array("d", [11.25, 11.75, 11.5])

cursor.execute(
    "insert into mytab_v (v64) values (:1)",
    [vector_data_64]
)

A query:

cursor.execute("select * from mytab_v")
for row in cursor:
print(row)

would then show:

(array('d', [11.25, 11.75, 11.5]),)

Other examples are on GitHub.
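A note on the array.array binding shown above: the typecode selects the number format, with "d" giving the 64-bit floats that match the VECTOR(3, FLOAT64) column. For illustration (the FLOAT32 and INT8 column definitions here are my assumptions, not from the article):

```python
import array

# typecode "d" -> 64-bit floats, matching a VECTOR(3, FLOAT64) column
vec_f64 = array.array("d", [11.25, 11.75, 11.5])

# typecode "f" -> 32-bit floats, e.g. for a hypothetical VECTOR(3, FLOAT32) column
vec_f32 = array.array("f", [11.25, 11.75, 11.5])

# typecode "b" -> signed 8-bit integers, e.g. for a hypothetical VECTOR(3, INT8) column
vec_i8 = array.array("b", [1, -2, 3])

print(vec_f64.typecode, vec_f32.typecode, vec_i8.typecode)
```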

(And did you note the new DROP TABLE IF EXISTS syntax I used?!)

Support for the Oracle Database 23ai BOOLEAN data type

Oracle Database 23ai’s new SQL BOOLEAN data type lets you store true and false values in database tables. (Previously this type was only available in PL/SQL).

Python-oracledb 2.2 supports binding and fetching SQL BOOLEAN values. For example:

bv = cursor.var(oracledb.DB_TYPE_BOOLEAN)
bv.setvalue(0, True)
for r, in cursor.execute("select :bv", [bv]):
    print(r)

displays:

True

Note there is no FROM DUAL in the query! One Oracle Database 23ai enhancement was to make the clause optional in this common query idiom.

Support for the Oracle Database INTERVAL YEAR TO MONTH data type

The not-so-common INTERVAL YEAR TO MONTH data type is not new to the database, but python-oracledb 2.2 now has support via an oracledb.IntervalYM class. The class is a named tuple with two integer attributes, years and months. The code:

bv = cursor.var(oracledb.DB_TYPE_INTERVAL_YM)
bv.setvalue(0, oracledb.IntervalYM(years=3, months=10))
cursor.execute("insert into mytab_b (iym_col) values (:1)", [bv])

for r, in cursor.execute("select * from mytab_b"):
    print(r)

displays:

IntervalYM(years=3, months=10)
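Because the class is a named tuple, all the usual tuple behaviours apply. A standalone illustration, using collections.namedtuple as a stand-in for the real oracledb class:

```python
from collections import namedtuple

# Stand-in for oracledb.IntervalYM, which is described as a named
# tuple with two integer attributes, years and months
IntervalYM = namedtuple("IntervalYM", ["years", "months"])

iv = IntervalYM(years=3, months=10)
print(iv.years, iv.months)   # attribute access

years, months = iv           # tuple unpacking works too
total_months = years * 12 + months
print(total_months)
```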

Oracle Database 23ai JSON and SODA improvements

One big Oracle Database 23ai announcement is JSON Relational Duality, letting you leverage the power of relational and the simplicity of JSON development within a single app. Also check out this blog post.

Various enhancements to python-oracledb improve its support for

Truncated by Planet PHP, read more at the original (another 10124 bytes)

Statement on glibc/iconv Vulnerability – PHP: Hypertext Preprocessor

EDIT 2024-04-25: Clarified when a PHP application is vulnerable to this bug.

Recently, a bug in glibc version 2.39 and older (CVE-2024-2961) was uncovered where a buffer overflow in character set conversions to the ISO-2022-CN-EXT character set can result in remote code execution. This specific buffer overflow in glibc is exploitable through PHP, which exposes the iconv functionality of glibc to do character set conversions via the iconv extension. Although the bug is exploitable in the context of the PHP Engine, the bug is not in PHP. It is also not directly exploitable remotely. The bug is exploitable if, and only if, the PHP application calls iconv functions or filters with user-supplied character sets.

Applications are not vulnerable if:

  • glibc security updates from the distribution have been installed, or
  • the iconv extension is not loaded, or
  • the vulnerable character set has been removed from gconv-modules-extra.conf, or
  • the application passes only specifically allowed character sets to iconv.

Moreover, when using a user-supplied character set, it is good practice for applications to accept only specific charsets that have been explicitly allowed by the application. One example of how this can be done is by using an allow-list and the array_search() function to check the encoding before passing it to iconv. For example: array_search($charset, $allowed_list, true)

There are numerous reports online with titles like “Mitigating the iconv Vulnerability for PHP (CVE-2024-2961)” or “PHP Under Attack”. These titles are misleading, as this is not a bug in PHP itself. If your PHP application is vulnerable, we first recommend checking whether your Linux distribution has already published patched variants of glibc. Debian, CentOS, and others have already done so; please upgrade as soon as possible. Once an update is available in glibc, updating that package on your Linux machine will be enough to alleviate the issue. You do not need to update PHP, as glibc is a dynamically linked library.

If your Linux distribution has not published a patched version of glibc, there is no fix for this issue. However, there exists a workaround described in GLIBC Vulnerability on Servers Serving PHP, which explains how to remove the problematic character set from glibc. Perform this procedure for every gconv-modules-extra.conf file that is available on your system.

PHP users on Windows are not affected. There will therefore also not be a new version of PHP for this vulnerability.

Django 5 and Oracle Autonomous Database – Christopher Jones

I’m posting this quick blog in the ‘so I can find it again’ category. The question came up about how to connect to Oracle Autonomous Database using Django and a wallet ZIP file.


Make sure Django and python-oracledb are up to date:

python3 -m pip install Django oracledb --upgrade

At time of writing, this installed Django 5.0.4 and python-oracledb 2.1.2.

Oracle ADB-S with 1-way TLS

If your Autonomous Database was created using the ‘Secure access from allowed IPs and VCNs only’ option which gives you 1-way TLS, then you will have allow-listed the machine (or network) where you are running Django. In this case, your Django settings.py DATABASES entry will be the standard Oracle username/password/connection strings for the database and you don’t need the wallet file. You can find the connection string by clicking the ‘Database connection’ button on the cloud console for the database, changing the ‘TLS authentication’ drop-down in the ‘Connection Strings’ pane to ‘TLS’, and then copying one of the connection strings.

Your Django settings.py file can then look like:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.oracle',
        'NAME': '(description= (retry_count=20)(retry_delay=3)(address=(protocol=tcps)(port=1521)(host=xxx.oraclecloud.com))(connect_data=(service_name=xxx_cjdb_high.adb.oraclecloud.com))(security=(ssl_server_dn_match=yes)))',
        'USER': 'admin',
        'PASSWORD': 'xxxxxxxxxx',
    }
}

On macOS, if you run your Django python3 manage.py runserver command and get an error like:

oracledb.exceptions.OperationalError: DPY-6005: cannot connect to database (CONNECTION_ID=xxxx).
[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain (_ssl.c:1000)

you can follow this solution and first run a command like:

/Applications/Python\ 3.12/install\ Certificates.command

(using the appropriate Python version path).

Oracle ADB-S with mTLS

However if your Autonomous Database was created with the ‘Secure access from everywhere’ setting, you will need to download and use the wallet.zip file containing the access certificate from the cloud console.

Download the wallet file by clicking the ‘Database connection’ button on the cloud console for the database. This will prompt you to create a wallet password.

Next, unzip the wallet file — it must be unzipped. In the example below, my directory /Users/cjones/CJMTLS contains the files:

-rw-r--r--@ 1 cjones staff 3029 22 Apr 00:07 README
-rw-r--r--@ 1 cjones staff 5349 22 Apr 00:07 cwallet.sso
-rw-r--r--@ 1 cjones staff 5304 22 Apr 00:07 ewallet.p12
-rw-r--r--@ 1 cjones staff 5710 22 Apr 00:07 ewallet.pem
-rw-r--r--@ 1 cjones staff 3192 22 Apr 00:07 keystore.jks
-rw-r--r--@ 1 cjones staff 691 22 Apr 00:07 ojdbc.properties
-rw-r--r--@ 1 cjones staff 114 22 Apr 00:07 sqlnet.ora
-rw-r--r--@ 1 cjones staff 768 22 Apr 00:07 tnsnames.ora
-rw-r--r--@ 1 cjones staff 2056 22 Apr 00:07 truststore.jks

Then your Django settings.py file will need to contain this:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.oracle',
        'NAME': 'cjmtls_high',
        'USER': 'admin',
        'PASSWORD': 'xxxxxxxxxxxx',
        'OPTIONS': {
            "config_dir": "/Users/cjones/CJMTLS",
            "wallet_location": "/Users/cjones/CJMTLS",
            "wallet_password": "xxxxxxxxxx"
        }
    }
}

The NAME field is one of the connection aliases in the tnsnames.ora file. This file is read from the “config_dir” folder. Note that you have to supply the wallet directory location in “wallet_location”, and also the wallet password to open the PEM file. This is the password you created when you downloaded the wallet.zip file. This password is needed at runtime in the default ‘Thin’ mode of the python-oracledb driver.

For python-oracledb Thick mode use (which requires different settings), and for general background, see the python-oracledb documentation Connecting to Oracle Cloud Autonomous Databases.

Moving on from Mocha, Chai and nyc. – Evert Pot

I’m a maintainer of several small open-source libraries. It’s a fun activity.
If the scope of the library is small enough, the maintenance burden is
typically fairly low. They’re usually mostly ‘done’, and I occasionally just need to
answer a few questions per year, and do the occasional release to bring it
back up to the current ‘meta’ of the ecosystem.

Also even though it’s ‘done’, in use by a bunch of people and well tested,
it’s also good to do a release from time to time to not give the impression
of abandonment.

This weekend I released a 2.0 version of my bigint-money library, which
is a fast library for currency math.

I originally wrote this in 2018, so the big BC break was switching everything
over to ESM. For a while I tried to support both CommonJS and ESM
builds for my packages, but only a year after all that effort it frankly no
longer feels needed. I was worried the ecosystem was going to
split, but people stuck on (unsupported) versions of Node that don’t
support ESM aren’t going to proactively keep their other dependencies updated,
so CommonJS is for (and many others) in the past now. (yay!)

Probably the single best way to keep maintenance burden for packages low is
to have few dependencies. Many of my packages have 0 dependencies.

Reducing devDependencies also helps. If you didn’t know, Node now has a
built-in test runner. I’ve been using Mocha + Chai for many, many
years. They were awesome and I want to thank the maintainers, but node --test
is pretty good now and has pretty output.

It also:

  • Is much faster (about twice as fast with TypeScript and code coverage
    reporting, but I suspect the difference will grow with larger code bases).
  • Is easier to configure (especially when you’re also using TypeScript; just use tsx --test).
  • Can output test coverage (with --experimental-test-coverage).

Furthermore, while node:assert doesn’t have all the features of Chai, it has
the important ones (deep compare) and adds better Promise support.

All in all this reduced my node_modules directory from a surprising 159M
to 97M, most of which is now Typescript and ESLint, and my total dependency
count from 335 to 141 (almost all of which is ESLint).

Make sure that Node’s test runner, coverage reporting, and assertion library are right
for you. They may not have all the features you expect, but I keep my testing
setup relatively simple, so the switch was easy.