Category: Development & Sysadmin

Feeling Satisfied

I had set out about 2 years ago to get some certifications for my own personal and professional growth, but life and things and other things got in the way (work? Factorio? excuses?) and I just never finished them.

Well I set out to fix that this year, and just yesterday I passed the AWS Certified Solutions Architect – Associate certification exam. The exam was more challenging than I expected, but I passed! Whew.

Going to chill on these for a bit now. In the fall I will tackle the RHCSA.


Docker putting downloads behind a login wall

This is in regards to https://github.com/docker/docker.github.io/issues/6910, the issue about Docker putting the links to download Docker CE behind a login wall. The comment thread is long, and people are, in my opinion, rightfully put out by the move.

Yeah, the problem is that they are not being transparent or honest about the reasons why they put the download behind a login wall. All they needed to do was tell the truth, and then this wouldn’t be an issue. As it is now, they used marketing doublespeak and it came off disingenuous and, imo, rather pathetic. Their official explanation:

“I know that this can feel like a nuisance, but we’ve made this change to make sure we can improve the Docker for Mac and Windows experience for users moving forward.”

This is simply not true.

They are not doing this to improve the experience of users, as has been explained several times in the comments. No, they are doing this because they are scrambling to collect data to figure out how to monetize a closed-source product that they gave out for free. Now the expectation in the community is that the product is free, which puts them in a weird position. However, honesty and transparency will go a lot further than lies and deception in these sorts of things. Just gotta be honest with your user base. There is no harm in that. A simple message like: “hey, so we messed up, and in order to continue to provide and support Docker, we need to 1) understand the user base that is downloading the Docker tools, and 2) identify ways to generate revenue to continue to develop and support the tools while maintaining a free core product. In order to do this we want to collect some data using these methods…” and so on. I’d have so much respect for a company that could say those things.

Maybe I am reading a lot into hiding the link, but at the end of the day, if it feels sneaky and shady, it probably is. I do not trust that they actually had the users’ experience in mind when making this change. Therefore, I do not trust Docker, and would seriously consider the alternatives for my future endeavors.

Started using Let’s Encrypt for my personal websites and apps. It was pretty easy to set up, and thanks to Chef, even easier to deploy and keep updated.

After doing development for 15 years I would have expected my GitHub to be more… substantial. Must be all those half-finished projects sitting in ~/Development/projects 😀

The Reality of Gutenberg & WordPress

Gutenberg is happening. It is coming and it is coming soon. I am not thrilled.

WordPress the CMS

Automattic have been trying to tell us for years that WordPress is more than a blogging platform, that it is a true and full Content Management System. "Look!", they say. "You have custom post types, and the metabox API allows you to create complex content types. It's a CMS!". Alas, it is all built on top of a blog with very blog-specific design patterns. The underbelly of WP is ugly and hacky, even if it works "just fine" most of the time. Gutenberg is as direct a statement of intent as I could expect: WordPress is a blogging/marketing platform, not a CMS.

WordPress the Casual Site Builder

Gutenberg is a response to the threat of Squarespace, Wix, and Medium. This update is for WordPress.com: to combat those other systems, to ensure dominance in the web publishing space, and to increase market share by appealing to more casual users, small business owners, etc. Automattic can probably then generate leads for their WordPress VIP service. But I think this will come at the expense of developers like me, working at a digital agency, who drank the Kool-Aid about WordPress being a CMS, a tool that can be used for Higher-Ed, Government, Healthcare, and not just for a blog or a simple marketing site. I am ready to move on to real CMSes for those projects. As for the marketing and bloggy sites, Squarespace has a much more robust block and content building experience. Why even bother with WordPress at this point? Gutenberg is not anywhere close to those systems yet. It will probably get there, but will it matter, and will it be worth it?

Gutenberg the Editor

The new editorial process is nice, but super limiting. I hope that more blocks are in development, otherwise this feels dead on arrival. I am having some fun with it on my site, checking things out, playing with the shiny new toy, but after a few posts it's already starting to feel restrictive. Theme builders give me so many more options for how to present content. Sure, they are nasty and terrible, but they offer so much more out of the box. I kinda hope that WordPress does not depend on 3rd-party plugins to extend block functionality. They have come this far; surely they can offer some more variety?

For example, how about letting me insert a block above the title? Typically you start the page with a large hero, then the primary h1 follows. You can do this now by excluding the post title, but Gutenberg does not let you change the post slug (why?). And if you don't have an SEO plugin installed, your <title> will be empty.

How about letting me define a wrapper element around a few blocks to give me more HTML to hang a frontend design off of? Like, making a group of blocks.

How about a related posts block that is not just a list of post links, but something that allows an editor to create a curated list, or a dynamically generated one, that includes thumbnails, excerpts, tags, whatever?

If it ain't broke…

Creating custom blocks has a much higher barrier to entry than, say, creating custom field sets with ACF Pro, or using Metabox.io to create modular content blocks. We typically have some unique design constraints and features in our content patterns that do not allow for perfect modular re-use across websites. In other words, we make boutique websites for our clients. ACF and Metabox.io make this extremely easy. Gutenberg blocks are going to take a lot more time to build and test (now we have to test integration on the backend?!).
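For a sense of the gap: the ACF side of a simple, made-up "hero" module is roughly one PHP call (the field keys and names below are hypothetical, purely for illustration), while the Gutenberg equivalent means block registration, a JavaScript build step, and separate edit/save components.

add_action( 'acf/init', function () {
    // One PHP call defines the whole editable module; no JS build step required.
    acf_add_local_field_group( array(
        'key'    => 'group_hero', // hypothetical keys/names, not from a real project
        'title'  => 'Hero',
        'fields' => array(
            array( 'key' => 'field_hero_heading', 'name' => 'hero_heading', 'label' => 'Heading', 'type' => 'text' ),
            array( 'key' => 'field_hero_image',   'name' => 'hero_image',   'label' => 'Image',   'type' => 'image' ),
        ),
        'location' => array(
            array(
                array( 'param' => 'post_type', 'operator' => '==', 'value' => 'page' ),
            ),
        ),
    ) );
} );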

Ok. I am starting to rant. I'll end this by saying I don't think Gutenberg needs to be THE editor for WordPress, just AN editor for WordPress. Leave it as an optional plugin.

Profiling and Debugging a PHP app with Xdebug and Docker

I have started using an IDE again (PHPStorm) so that I could debug some applications and do some basic app profiling. I want to use Xdebug to profile my PHP apps. I am using Docker Compose on Windows 10. I have made this very complicated for myself but here we go.

The directory structure of my app looks like:

/build/docker/php/Dockerfile
/build/docker/php/php.ini
/build/docker/nginx/Dockerfile
/build/docker/nginx/default.conf
/web (contains my php app)
docker-compose.yml

The first thing is to get Xdebug set up in the PHP container.
I am using a custom Dockerfile for my PHP container where I install a ton of additional modules and packages, install wp-cli, and copy a custom php.ini to the container.

Here is the entire Dockerfile for the PHP container:

FROM php:7.0-fpm

# Install some required tools
RUN apt-get update && apt-get install -y sudo less

# Install PHP Extensions
RUN apt-get update && apt-get install -y \
bzip2 \
libbz2-dev \
libc-client2007e-dev \
libjpeg-dev \
libkrb5-dev \
libldap2-dev \
libmagickwand-dev \
libmcrypt-dev \
libpng12-dev \
libpq-dev \
libxml2-dev \
mysql-client \
imagemagick \
xfonts-base \
xfonts-75dpi \
&& pecl install imagick \
&& pecl install oauth-2.0.2 \
&& pecl install redis-3.0.0 \
&& pecl install xdebug \
&& docker-php-ext-configure gd --with-png-dir=/usr --with-jpeg-dir=/usr \
&& docker-php-ext-configure imap --with-imap-ssl --with-kerberos \
&& docker-php-ext-configure ldap --with-libdir=lib/x86_64-linux-gnu/ \
&& docker-php-ext-enable imagick \
&& docker-php-ext-enable oauth \
&& docker-php-ext-enable redis \
&& docker-php-ext-enable xdebug \
&& docker-php-ext-install \
bcmath \
bz2 \
calendar \
gd \
imap \
ldap \
mcrypt \
mbstring \
mysqli \
opcache \
pdo \
pdo_mysql \
soap \
zip \
&& apt-get -y clean \
&& apt-get -y autoclean \
&& apt-get -y autoremove \
&& rm -rf /var/lib/apt/lists/* && rm -rf /var/lib/cache/* && rm -rf /var/lib/log/* && rm -rf /tmp/*

# Custom PHP Conf
COPY ./php.ini /usr/local/etc/php/conf.d/custom.ini

# WP-CLI
RUN curl -O https://raw.githubusercontent.com/wp-cli/builds/gh-pages/phar/wp-cli.phar \
&& mv wp-cli.phar /usr/local/bin \
&& chmod +x /usr/local/bin/wp-cli.phar \
&& ln -s /usr/local/bin/wp-cli.phar /usr/local/bin/wp

# Xdebug
RUN mkdir /tmp/xdebug
RUN chmod 777 /tmp/xdebug

# Run this container as "webuser"
RUN groupadd -r webuser && useradd -r -g webuser webuser
RUN usermod -aG www-data webuser
USER webuser

And here is the custom php.ini under ./build/docker/php/:

file_uploads = On
memory_limit = 512M
upload_max_filesize = 256M
post_max_size = 256M
max_execution_time = 600
display_errors = On
error_reporting = E_ALL

[XDebug]
xdebug.profiler_output_dir = "/tmp/xdebug/"
xdebug.profiler_output_name = "cachegrind.out.%t-%s"
xdebug.profiler_append = 1
xdebug.profiler_enable_trigger = 1
xdebug.trace_output_dir = "/tmp/xdebug/"
xdebug.remote_enable = on
xdebug.remote_autostart = true
xdebug.remote_handler = dbgp
xdebug.remote_mode = req
xdebug.remote_port = 9999
xdebug.remote_log = /tmp/xdebug/xdebug_remote.log
xdebug.idekey = MYIDE
xdebug.remote_connect_back = 1

Some important things here. I am creating a directory to store Xdebug output (/tmp/xdebug) which will be used by another container to parse and display the output. In the custom php.ini we tell Xdebug to store its output in this directory. We also configure Xdebug to enable remote debugging so that we can debug from our IDE. If you do not want to debug EVERY request you should disable remote_autostart; you then need to pass a specific GET/POST parameter (or cookie) to trigger the debugger, XDEBUG_SESSION_START. The profiler, since profiler_enable_trigger is on, only runs when you pass XDEBUG_PROFILE with the request. Make note of the remote_port and idekey values. We need these when we configure our IDE.
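For example, a trigger-only variant of the [XDebug] section above might look something like this (a sketch, not the exact config I am running):

[XDebug]
; only write a cachegrind file when XDEBUG_PROFILE is passed as a GET/POST param or cookie
xdebug.profiler_enable = 0
xdebug.profiler_enable_trigger = 1
xdebug.profiler_output_dir = "/tmp/xdebug/"
; only start a debug session when XDEBUG_SESSION_START=MYIDE is passed (or set via cookie)
xdebug.remote_enable = on
xdebug.remote_autostart = false
xdebug.remote_port = 9999
xdebug.idekey = MYIDE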

In your IDE you would configure Xdebug to listen on port 9999 for connections and to use the IDE Session Key MYIDE to ensure you are only debugging requests that use that session key (really only necessary for complicated setups with multiple apps on the same server).

There are two environment variables that I set on the PHP container that are required to make this all work.

docker-compose.yml

php:
    build: ./build/docker/php/
    expose:
        - 9000
    links:
        - mysql
    volumes:
        - .:/var/www/html
        - /tmp/xdebug
    environment:
        XDEBUG_CONFIG: "remote_host=192.168.99.1"
        PHP_IDE_CONFIG: "serverName=XDEBUG"

XDEBUG_CONFIG is required to tell Xdebug where the debugging client (the IDE) is running. To be honest, I am not sure if this is actually required or is only required because of PHPStorm. I am using Docker Toolbox, so I point it at the host-side IP of the VirtualBox network that the Docker VM is attached to, since that is where my IDE is listening. It would be great to not need this param, as the setup would be more portable.

The variable PHP_IDE_CONFIG though is required for PHPStorm, and it tells my IDE which server configuration to use.

Neither of these may be required if you are using native docker and a different IDE.  /shrug

The first part of this is done. We can now debug an app from our IDE. The second thing I wanted to do was run a profiler and inspect the results. Xdebug will output cachegrind files; we just need a way to inspect them. There are some desktop apps you can use, like KCacheGrind, QCacheGrind, WinCacheGrind, etc. Your IDE may even be able to parse them (PHPStorm is currently not able to for some reason). Or you can use a web-based system, which is what I opted for, using Webgrind. There is, conveniently, a Docker image for this.

I configured the php container to expose /tmp/xdebug as a shared volume, which is where Xdebug is configured to output cachegrind files. Then I configured the webgrind container to mount that volume, and I pass an environment variable to tell Webgrind where to find the cachegrind files:

docker-compose.yml

webgrind:
    image: devgeniem/webgrind
    ports:
        - 8081:80
    volumes_from:
        - php
    environment:
        XDEBUG_OUTPUT_DIR: "/tmp/xdebug"

With that we can go to http://192.168.99.100:8081 and start digging into the app profiles.

Complete docker-compose.yml

mysql:
    image: mysql:latest
    volumes_from:
        - mysql_data
    environment:
        MYSQL_ROOT_PASSWORD: secret
        MYSQL_DATABASE: project
        MYSQL_USER: project
        MYSQL_PASSWORD: project
    expose:
        - 3306

mysql_data:
    image: tianon/true
    volumes:
        - /var/lib/mysql

nginx:
    build: ./build/docker/nginx/
    ports:
        - 80:80
    links:
        - php
    volumes:
        - .:/var/www/html

php:
    build: ./build/docker/php/
    expose:
        - 9000
    links:
        - mysql
    volumes:
        - .:/var/www/html
        - /tmp/xdebug
    environment:
        XDEBUG_CONFIG: "remote_host=192.168.99.1"
        PHP_IDE_CONFIG: "serverName=XDEBUG"

phpmyadmin:
    image: phpmyadmin/phpmyadmin
    ports:
        - 8080:80
    links:
        - mysql
    environment:
        PMA_HOST: mysql

webgrind:
    image: devgeniem/webgrind
    ports:
        - 8081:80
    volumes_from:
        - php
    environment:
        XDEBUG_OUTPUT_DIR: "/tmp/xdebug"

WP Transients must be used responsibly

We ran into an interesting issue with WooCommerce at work. First, here is the subject of the support request we got from our hosting provider:

The site is generating ~150MB/sec of transaction logs, filling 500GB of diskspace

Holy. Shit. A WordPress site should not be generating that much data. 150MB per second? Wow.

How? Why?

The simple explanation is that there is a bottleneck in WooCommerce: the layered nav filter query counts are all cached in a single transient record.

// We have a query - let's see if cached results of this query already exist.
$query_hash    = md5( $query );
$cached_counts = (array) get_transient( 'wc_layered_nav_counts' );

if ( ! isset( $cached_counts[ $query_hash ] ) ) {
    $results                      = $wpdb->get_results( $query, ARRAY_A );
    $counts                       = array_map( 'absint', wp_list_pluck( $results, 'term_count', 'term_count_id' ) );
    $cached_counts[ $query_hash ] = $counts;
    set_transient( 'wc_layered_nav_counts', $cached_counts, DAY_IN_SECONDS );
}

What is happening here is that the SQL query built from the currently selected filters is hashed, and its results are shoved into an array that is saved to a single transient record. This means that every single interaction with the filters requires a read of, and a possible write to, that one transient record. A site with any sort of traffic and, let's say, 9 filter widgets (with around 50 total options) will potentially generate a huge number of unique queries. It is no wonder we were pushing 150MB/s.

Our quick, temporary, patch was to simply remove the transient.

// We have a query - let's see if cached results of this query already exist.
$query_hash    = md5( $query );
$cached_counts = array();
$results                      = $wpdb->get_results( $query, ARRAY_A );
$counts                       = array_map( 'absint', wp_list_pluck( $results, 'term_count', 'term_count_id' ) );
$cached_counts[ $query_hash ] = $counts;

We saw a massive improvement in performance after removing the transient. We applied the patch around 9:47 am.

Object caching would probably help. I was surprised at how much of an improvement we saw by simply removing the transient.

I think a good solution here would be to use unique transients for each hashed query, and not a single transient for EVERY hashed query. It would work fine on small WP installs and would scale.
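Roughly what I have in mind (an untested sketch, not the code that actually ships in WooCommerce):

// Sketch: one small transient per hashed query instead of one giant shared array.
$query_hash = md5( $query );
$counts     = get_transient( 'wc_layered_nav_counts_' . $query_hash );

if ( false === $counts ) {
    $results = $wpdb->get_results( $query, ARRAY_A );
    $counts  = array_map( 'absint', wp_list_pluck( $results, 'term_count', 'term_count_id' ) );
    set_transient( 'wc_layered_nav_counts_' . $query_hash, $counts, DAY_IN_SECONDS );
}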

I will try it out and see what we get, and if the results are good I will submit a PR to the WooCommerce devs.

update:

I said we should use transients responsibly. In this case, I would be creating potentially 15k additional (tiny) transient records. Is that more responsible than one massive 1MB transient?

The WooCommerce devs have asked that I run some performance tests. Going to do so and report back!

update 2:

Not having any transients at all is better at scale in our case, since the SQL query that is executed is not that heavy and we have some decent page caching via Varnish. Also, our MySQL server is well tuned. Every single request to a page with the layered nav will make N requests for the transient data, and if data has to be written, that is N updates per request. This single record becomes a bottleneck as the row is locked while it is being written to. Redis or Memcached would be a better solution. WP Transients are just bad on their own.
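For reference, the same lookup against the object cache would look something like the sketch below, assuming a persistent object cache drop-in (Redis or Memcached) is installed; without one, wp_cache_* only lasts for the current request.

// Sketch: counts live in the object cache instead of a wp_options row, so there is no single row to lock.
$query_hash = md5( $query );
$counts     = wp_cache_get( $query_hash, 'wc_layered_nav_counts' );

if ( false === $counts ) {
    $results = $wpdb->get_results( $query, ARRAY_A );
    $counts  = array_map( 'absint', wp_list_pluck( $results, 'term_count', 'term_count_id' ) );
    wp_cache_set( $query_hash, $counts, 'wc_layered_nav_counts', DAY_IN_SECONDS );
}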