Stuff about Things

May 1, 2024
in JSON
5 min read

An Analysis of JSON Schema Defects

Context

While teaching back-end programming at Mines Paris, an engineering school which is part of PSL University, we looked at how JSON data could be validated when transferred from a front-end (e.g. React Native) to a back-end (e.g. a REST API with Flask) and on to storage (e.g. a Postgres database).

We stumbled upon JSON Schema, and our investigation led to an academic study which analyses many schemas, finds common defects, and proposes changes to the spec which would syntactically rule out most of these defects, at the price of some constraints.
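As an illustration only (this excerpt does not list the defect categories found by the study), here is a minimal sketch of one kind of schema defect: a misspelled keyword that a lenient validator silently ignores. The `validate` and `unknown_keywords` helpers are hypothetical toy functions supporting three keywords; they are not a JSON Schema implementation.

```python
# Toy validator: supports only "type", "required" and "properties".
# Hypothetical helper for illustration, NOT a JSON Schema implementation.
KNOWN_KEYWORDS = {"type", "required", "properties"}
PYTHON_TYPES = {"object": dict, "string": str, "integer": int}


def validate(schema, value):
    """Return True if value satisfies the supported keywords of schema.

    Unknown keywords are ignored, which is what lets a typo go unnoticed.
    """
    expected = schema.get("type")
    if expected is not None and not isinstance(value, PYTHON_TYPES[expected]):
        return False
    if isinstance(value, dict):
        for name in schema.get("required", []):
            if name not in value:
                return False
        for name, subschema in schema.get("properties", {}).items():
            if name in value and not validate(subschema, value[name]):
                return False
    return True


def unknown_keywords(schema):
    """List top-level keywords that the toy validator does not know."""
    return sorted(set(schema) - KNOWN_KEYWORDS)


intended = {"type": "object", "required": ["name"]}
defective = {"type": "object", "requird": ["name"]}  # note the typo

print(validate(intended, {}))       # the missing "name" is rejected
print(validate(defective, {}))      # the typo disables the check
print(unknown_keywords(defective))  # a stricter syntax check would flag this
```

Rejecting unknown keywords up front is one way a stricter syntax could catch such a mistake, at the cost of forbidding custom keywords.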

September 13, 2019
in Postgres, performance
8 min read

Data Loading Performance of Postgres and TimescaleDB

Postgres is the leading feature-rich independent open-source relational database, and its popularity has increased steadily over the past 5 years. TimescaleDB is a clever extension to Postgres which implements time-series related features, including automatic partitioning under the hood, and more.

Because he knows how much I like to investigate the performance of Postgres (among other things), Simon Riggs (2ndQuadrant) prompted me to look at the performance of loading a lot of data into Postgres and TimescaleDB, so as to better understand the degraded performance reported in their TimescaleDB vs Postgres comparison. Simon provided support, including provisioning 2 AWS VMs for a few days each.

December 11, 2016
in Postgres, performance
10 min read

Choosing Postgres Bloom Index Parameters

Postgres Bloom is an extension contributed by Teodor Sigaev, Alexander Korotkov and Oleg Bartunov which provides a new index type for integer and text columns. There is some coverage on how to use it and how it works, which is good because documentation is scarce.
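To give a feel for why the parameters matter, here is a rough sketch based on the textbook Bloom filter approximation. It assumes uniform, independent hashing and is not the analysis from the post itself. The function name and its arguments are hypothetical; `length` stands for the signature size in bits and `bits_per_column` for the number of bits set per indexed column.

```python
def bloom_false_positive(length, bits_per_column, indexed_columns, queried_columns):
    """Estimate the per-row false positive rate of a Bloom signature.

    Textbook approximation under uniform hashing (an assumption):
    each row sets bits_per_column * indexed_columns bits in a signature
    of `length` bits, and a query checks bits_per_column bits for each
    queried column.
    """
    if not 0 < queried_columns <= indexed_columns:
        raise ValueError("queried_columns must be in 1..indexed_columns")
    bits_set = bits_per_column * indexed_columns
    # Probability that a given signature bit is set for an unrelated row.
    fill = 1.0 - (1.0 - 1.0 / length) ** bits_set
    # All bits checked by the query must be set for a false positive.
    return fill ** (bits_per_column * queried_columns)


# Compare a short and a long signature for a 5-column index, 1 column queried.
for length in (80, 160, 320):
    rate = bloom_false_positive(length, 2, 5, 1)
    print(f"length={length}: estimated false positive rate {rate:.4f}")
```

Under this approximation, a longer signature lowers the false positive rate but makes the index larger, which is the trade-off behind choosing the parameters.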

July 28, 2016
in Game, performance
9 min read

1010! Analysis

1010! is a simple puzzle game by Gram Games, originating from Turkey in August 2014. It feels like Tetris but without the time constraint and with more choices. It is available on Android and iOS.

Advice is available online on how to get a high score. I’m more interested in devising a decision strategy, automating it efficiently and optimizing it through simulations… which also results in advice on how to get a high score, but better substantiated.

Teaser: below, learn how to get a score over 1 million every 8 games…

August 23, 2014
in Postgres, performance
4 min read

Postgres `FILLFACTOR` for `UPDATE`

This post discusses the performance impact of the Postgres FILLFACTOR table storage parameter on an UPDATE OLTP load. Note that this FILLFACTOR is indeed the table storage parameter, although there is also an eponymous parameter for indexes.
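As a back-of-the-envelope sketch, the arithmetic below estimates how much of each page a given table FILLFACTOR leaves free at insertion time, so that updated row versions can be placed on the same page. It assumes the default 8 kB page size and ignores page header and per-row overheads. The helper names and the `my_table` table are hypothetical.

```python
PAGE_SIZE = 8192  # bytes, assuming the default 8 kB Postgres page size


def reserved_bytes(fillfactor, page_size=PAGE_SIZE):
    """Approximate bytes per page kept free for later UPDATEs.

    Ignores page header and tuple overheads (a simplification).
    """
    if not 10 <= fillfactor <= 100:
        raise ValueError("table fillfactor must be between 10 and 100")
    return page_size * (100 - fillfactor) // 100


def set_fillfactor_sql(table, fillfactor):
    """Build the ALTER TABLE statement for a (hypothetical) table name."""
    reserved_bytes(fillfactor)  # reuse the range check
    return f"ALTER TABLE {table} SET (fillfactor = {fillfactor});"


for ff in (100, 90, 70, 50):
    print(f"fillfactor={ff}: about {reserved_bytes(ff)} bytes free per page")

print(set_fillfactor_sql("my_table", 70))
```

A lower value trades a larger table for more room to update rows in place.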

August 17, 2014
in Postgres, performance
5 min read

Postgres page size for SSD (2)

I recently posted performance figures with varying page size using pgbench on SSD, which show a +10% improvement with a smaller 4 kB page size over the default 8 kB page size.

Josh Berkus pointed out that the pgbench test uses rather small 100-byte rows, and that changing the tuple size might lead to a different conclusion. To assess this point, I ran some tests with different row sizes: 1 kB, 3 kB, 7 kB and 15 kB.

August 8, 2014
in Postgres, performance
5 min read

Postgres page size for SSD

In a previous post, I outlined the time required by a Postgres database to warm up from an HDD on a read-only load for a database that fits in memory.

In this post, I want to look at write performance on SSD, focusing on the impact of Postgres page size (blocksize), and on checking whether the current 8 kB default is relevant.

March 23, 2014
in Postgres, SQL
2 min read

DataFiller 2.0.0 is out!

DataFiller processes a Postgres database schema file augmented with directives in comments, and generates pseudo-random data matching this schema, taking into account constraints such as types, but also primary key, unique, foreign keys, not null…

Version 2.0.0 introduces the following new features:

December 1, 2013
in Postgres, SQL
5 min read

DataFiller Tutorial

This tutorial introduces how to use DataFiller to fill a Postgres database, say for testing functionality and performance.

November 29, 2013
in Postgres, SQL
2 min read

DataFiller 1.1.3 is out!

I have just released version 1.1.3 of DataFiller.

The Python script processes a Postgres database schema file augmented with directives in comments, and generates random data matching this schema, taking into account constraints such as types, but also primary key, unique, foreign keys, not null…