Stuff about Things

An Analysis of JSON Schema Defects

Context

While teaching back-end programming at Mines Paris, an engineering school which is part of PSL University, we looked at how JSON data could be validated when transferred from a front-end (e.g. React Native) to a back-end (e.g. a REST API with Flask) and to storage (e.g. a Postgres database).

We stumbled upon JSON Schema, and our investigation led to an academic study which analyses many schemas, finds common defects, and proposes changes to the spec which would syntactically rule out most of these defects, at the price of some constraints.

Data Loading Performance of Postgres and TimescaleDB

Postgres is the leading full-featured independent open-source relational database, and its popularity has steadily increased over the past 5 years. TimescaleDB is a clever extension to Postgres which implements time-series related features, including automatic partitioning under the hood, and more.

Because he knows how much I like to investigate the performance of Postgres (among other things), Simon Riggs (2ndQuadrant) prompted me to look at the performance of loading a lot of data into Postgres and TimescaleDB, so as to get some understanding of the degraded performance reported in their TimescaleDB vs Postgres comparison. Simon provided support, including provisioning 2 AWS VMs for a few days each.

1010! Analysis

1010! is a simple puzzle game by Gram Games, originating from Turkey in August 2014. It feels like Tetris but without the time constraint and with more choices. It is available on Android and iOS.

Advice is available online on how to get a high score. I’m more interested in devising a decision strategy, automating it efficiently and optimizing it through simulations… which also results in advice on how to get a high score, but better substantiated.

Teaser: below, learn how to get a score over 1 million every 8 games…

Postgres FILLFACTOR for UPDATE

This post discusses the performance impact of the Postgres FILLFACTOR table storage parameter on an UPDATE OLTP load. Note that this FILLFACTOR is indeed the table storage parameter, although there is also a parameter of the same name for indexes.
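
As a rough illustration of what the table FILLFACTOR controls, the sketch below estimates how many bytes of each page are left free at insertion time for later row versions. It assumes the default 8 kB page size and mirrors the simple percentage rule used for tables; the function name `reserved_bytes_per_page` is hypothetical and only meant for this illustration.

```python
def reserved_bytes_per_page(fillfactor: int, page_size: int = 8192) -> int:
    """Estimate the bytes kept free per page when rows are inserted.

    Assumption: the reserved space is page_size * (100 - fillfactor) / 100,
    rounded down. Table fillfactor values range from 10 to 100.
    """
    if not 10 <= fillfactor <= 100:
        raise ValueError("table fillfactor must be between 10 and 100")
    return page_size * (100 - fillfactor) // 100


if __name__ == "__main__":
    # Compare a few settings on the default 8 kB page.
    for ff in (100, 90, 70, 50):
        print(f"fillfactor={ff}: {reserved_bytes_per_page(ff)} bytes reserved")
```

With a fillfactor of 100 no space is set aside, so an updated row version often has to go to another page; lower values trade disk space for room to keep updates on the same page.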

Postgres page size for SSD (2)

I have recently posted performance figures with varying page size using pgbench on SSD, which show a +10% improvement with a smaller 4 kB page size over the default 8 kB page size.

Josh Berkus pointed out that the pgbench test uses rather small 100-byte rows, and that changing the tuple size might lead to a different conclusion. To assess this point, I ran some tests with different row sizes: 1 kB, 3 kB, 7 kB and 15 kB.

Postgres page size for SSD

In a previous post, I outlined the time required by a Postgres database to warm up from an HDD on a read-only load for a database that fits in memory.

In this post, I want to look at write performance on SSD, focusing on the impact of Postgres page size (blocksize), and on checking whether the current 8 kB default is relevant.

DataFiller 2.0.0 is out!

DataFiller processes a Postgres database schema file augmented with directives in comments, and generates pseudo-random data matching this schema, taking into account constraints such as types, but also primary key, unique, foreign keys, not null…

Version 2.0.0 introduces the following new features:

DataFiller 1.1.3 is out!

I have just released version 1.1.3 of DataFiller.

The Python script processes a Postgres database schema file augmented with directives in comments, and generates random data matching this schema, taking into account constraints such as types, but also primary key, unique, foreign keys, not null…