Distributed Systems

Optimizing Read and Write Latency

Reducing latency in reads and writes in YugabyteDB

Valerie Parham-Thompson
Today’s global, distributed applications often need to serve user requests from a single data source across different regions. Distributing data provides scale and protection against network outages, but low-latency access to that data remains critical for a seamless user experience. YugabyteDB, a distributed SQL database, is designed to handle global data workloads efficiently. In this blog post, I’ll share some techniques to optimize read and write latency in a multi-region YugabyteDB cluster.
Tablet Sizing Strategies

Valerie Parham-Thompson
Modern distributed databases split large tables into tablets to enable parallel processing and efficient data distribution. Tablet size affects everything from query performance to operational overhead. Let’s explore how to approach tablet sizing systematically to achieve optimal performance.
Leveraging time to live (TTL)

Using TTL to expire records in YugabyteDB and Cassandra

Valerie Parham-Thompson
In both MySQL and Postgres, expiring records after a set period of time takes a couple of timestamps and a little creativity. With Cassandra, or in this case the YugabyteDB YCQL API, TTL (time to live) handles this functionality for you, simplifying both the table definition and the amount of work required by your code.
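To make the "couple of timestamps and a little creativity" concrete, here is a minimal sketch of timestamp-based expiry. It is an illustration under stated assumptions, not the code from the post: Python's built-in `sqlite3` stands in for MySQL or Postgres, and the `sessions` table, its columns, and the helper functions are all hypothetical names.

```python
import sqlite3

# Assumption: sqlite3 stands in for MySQL/Postgres; the table and
# column names are hypothetical and chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions ("
    "id INTEGER PRIMARY KEY, payload TEXT, "
    "created_at INTEGER NOT NULL, expires_at INTEGER NOT NULL)"
)


def insert_session(conn, payload, now, ttl_seconds):
    """Store a row with the two timestamps that emulate a TTL."""
    conn.execute(
        "INSERT INTO sessions (payload, created_at, expires_at) "
        "VALUES (?, ?, ?)",
        (payload, now, now + ttl_seconds),
    )


def live_sessions(conn, now):
    """Every read has to filter out rows that are past their expiry."""
    rows = conn.execute(
        "SELECT payload FROM sessions WHERE expires_at > ? ORDER BY id",
        (now,),
    )
    return [row[0] for row in rows]


def purge_expired(conn, now):
    """A periodic job has to delete expired rows explicitly."""
    cur = conn.execute("DELETE FROM sessions WHERE expires_at <= ?", (now,))
    return cur.rowcount


insert_session(conn, "short-lived", now=1000, ttl_seconds=60)
insert_session(conn, "long-lived", now=1000, ttl_seconds=3600)
```

With a TTL, the extra columns, the read-side filter, and the purge job are no longer needed: in CQL the lifetime is attached to the write itself, for example with `INSERT ... USING TTL 60`, and the database expires the row on its own.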
Processing Data with Pandas

Processing weather data from NOAA in YugabyteDB with the Pandas library

Valerie Parham-Thompson
I’ve been experimenting with processing data with Pandas this week, specifically historical NOAA weather data, and storing it in a local YugabyteDB cluster. This open data set contains max/min/precipitation for years back to 1750 (not all data points are available for all years or locations). It’s available here: https://www.ncei.noaa.gov/metadata/geoportal/rest/metadata/item/gov.noaa.ncdc:C00861/html
Foreign Data Wrappers

Using foreign data wrappers to combine metrics across a distributed database cluster

Valerie Parham-Thompson
I was recently setting up a demo to show off query logging features. Two common extensions, pg_stat_statements and pg_stat_monitor, store data locally on each node. In the case of a distributed database, it is helpful to combine the query runtimes from all nodes.
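As a rough illustration of what "combining" per-node statistics means, the sketch below merges rows by query text and recomputes a cluster-wide mean. It deliberately swaps in client-side aggregation in plain Python instead of foreign data wrappers, and the node names, sample rows, and `combine` helper are hypothetical; the tuples only loosely mirror the query, call count, and total time columns that pg_stat_statements exposes.

```python
from collections import defaultdict

# Hypothetical per-node rows: (query text, calls, total time in ms).
# In practice each list would come from querying one node's local stats.
node_stats = {
    "node1": [("SELECT * FROM t WHERE id = $1", 10, 50.0)],
    "node2": [
        ("SELECT * FROM t WHERE id = $1", 30, 90.0),
        ("INSERT INTO t VALUES ($1)", 5, 20.0),
    ],
}


def combine(node_stats):
    """Sum calls and total time per query across nodes, then derive the mean."""
    totals = defaultdict(lambda: {"calls": 0, "total_ms": 0.0})
    for rows in node_stats.values():
        for query, calls, total_ms in rows:
            totals[query]["calls"] += calls
            totals[query]["total_ms"] += total_ms
    combined = {}
    for query, agg in totals.items():
        # The mean must be recomputed from the sums; averaging the
        # per-node means would weight busy and idle nodes equally.
        mean_ms = agg["total_ms"] / agg["calls"] if agg["calls"] else 0.0
        combined[query] = {**agg, "mean_ms": mean_ms}
    return combined


cluster_view = combine(node_stats)
```

A foreign data wrapper moves this same union-and-aggregate step into SQL, so the combined view can be queried from any one node.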
YugabyteDB Snapshots

Taking snapshots and backups in YugabyteDB

Valerie Parham-Thompson
A distributed database is designed to withstand many kinds of outages. However, you should also maintain backups in case of “oops” scenarios like a dropped table.
String Search

Link to my YugabyteDB Friday Tech Talk (YFTT) on fuzzy matching for string searches

Valerie Parham-Thompson
Quick post to share my presentation last week at the YugabyteDB Friday Tech Talk. It was on fuzzy matching, and more generally string searches. Got to nerd out on two of my favorite topics: words (broadly, linguistics and specifically, names) and databases. Check it out!