Performance-Tuning

How A* Improves on Dijkstra (and When It Doesn't)

How A* Improves on Dijkstra (and When It Doesn't)

Valerie Parham-Thompson
In my previous post on routing, I used Dijkstra’s algorithm without much discussion of alternatives. The Dijkstra algorithm works for network routing, and for many problems it is the right choice. But pgRouting also ships with pgr_aStar, an implementation of the A* algorithm that can find the same shortest path while exploring fewer edges. The difference comes down to one thing: a heuristic that tells the algorithm which direction to look.
Routing APIs Compared: Choosing the Right Shortest-Path Service

Routing APIs Compared: Choosing the Right Shortest-Path Service

Valerie Parham-Thompson
In my previous post on pgRouting, I showed how to run shortest-path queries directly inside PostgreSQL. That approach works well when your road data is already in Postgres and your network is moderate-sized. But what happens when you need live traffic data, global coverage, or routing at thousands of queries per second? That is where external routing APIs and dedicated routing engines come in.

Latency by Pekka Enberg

While many authors have written about database tuning, systems tuning, or code optimization, I haven't seen any come together to cover the whole stack in such a comprehensive way, targeting both software engineers and database architects.

Valerie Parham-Thompson
Having a shared vocabulary across database, software, and infrastructure teams is critical when working together to tune latency issues. I’ve been in many incident rooms where the only report is “the application is slow” and had to unwind a series of questions: What do you mean by slow? Where do you see this? What parts are slow? If everyone in the room had read Enberg’s Latency, solving these kinds of incidents would be much faster.

Using pg_stat_statements for Query profiling and performance tuning

pg_stat_statements is an extension that tracks execution statistics for every normalized SQL statement.

Valerie Parham-Thompson
Database performance problems are often mysterious. Queries slow down, CPU usage spikes, or users complain about latency, but pinpointing the cause requires visibility into what your database is actually doing. pg_stat_statements is PostgreSQL’s answer to this challenge.
Query Optimization with HypoPG

Query Optimization with HypoPG

Using HypoPG to test hypothetical indexes for query optimization in YugabyteDB

Valerie Parham-Thompson
Query optimization is a critical aspect of database performance tuning. While YugabyteDB’s YSQL API provides powerful tools for analyzing query performance through EXPLAIN plans, sometimes we need to experiment with different indexing strategies without the overhead of actually creating the indexes. This is where HypoPG comes in handy.
Random Data Generation: Then and Now

Random Data Generation: Then and Now

Modern approaches to generating test data with Python Faker

Valerie Parham-Thompson
In 2018, I wrote about using SQL functions to generate random test data in MySQL. While that approach served its purpose, the landscape of test data generation has evolved significantly. Today, I want to share my experience with using the Faker library, which has become my go-to tool for creating realistic test datasets.
Count Large Partitions in YCQL

Count Large Partitions in YCQL

Counting large partitions in the YugabyteDB Cassandra API

Valerie Parham-Thompson
One thing that can really wreck your performance in Cassandra and the similar YugabyteDB YCQL is large partitions due to an imbalanced key. Without the robust nodetool commands of Cassandra, it can be challenging to find these large partitions in YugabyteDB.