Trade Area Analysis on a Budget: PostGIS, Census Data, and Drive-Time Isochrones
Retail site selection consultants charge thousands of dollars for trade area reports. The underlying work is a drive-time polygon overlaid with census demographics, with competitor locations plotted on top. All of it is reproducible with PostGIS, free government data, and open commercial data from Overture Maps.
I’ll walk through a brief analysis using Woodcroft Shopping Center in Durham, NC as the candidate site. You’ll end up with a demographic summary of your drive-time catchment and a ranked competitor list, built from public data and SQL.
What a Trade Area Is
A trade area is the geographic zone a business draws the majority of its customers. The size depends on the type of business: a coffee shop might pull from a 10-minute walk, while a specialty retailer might draw from a 30-minute drive.
The analysis here will provide a typical report: a drive-time polygon around the candidate site, intersected with census tract data to summarize demographics, plus a ranked count of existing competitors.
Data Sources
You need three free datasets.
Census TIGER/Line shapefiles provide the census tract boundaries.
The American Community Survey (ACS) 5-year estimates provide demographic data: population, median household income, age distribution, and more, tied to those same tracts. The 5-year estimates aggregate five years of survey data, which makes them more reliable at the tract level than the 1-year estimates.
For competitor locations, the Overture Maps Foundation publishes a places dataset with business listings derived from multiple sources.
Step 1: Define Your Candidate Location
Let’s say your first candidate location is in Woodcroft Shopping Center in Durham, NC. Find the lat/lon for a single tenant in the shopping center at 4711 Hope Valley Rd:
curl -s "https://nominatim.openstreetmap.org/search?q=4711+Hope+Valley+Rd,+Durham,+NC&format=json&limit=1"
Sample output:
[{"place_id":315782967,"licence":"Data © OpenStreetMap contributors, ODbL 1.0. http://osm.org/copyright","osm_type":"node","osm_id":5644388832,"lat":"35.9210546","lon":"-78.9575507","class":"amenity","type":"ice_cream","place_rank":30,"importance":6.708247501330613e-05,"addresstype":"amenity","name":"Menchie's","display_name":"Menchie's, 4711, Hope Valley Road, Durham, Durham County, North Carolina, 27713, United States","boundingbox":["35.9210046","35.9211046","-78.9576007","-78.9575007"]}]%
Create a database named trade_area_db and installed the PostGIS extension in it.
create database trade_area_db;
\c trade_area_db
create extension postgis;
Create a table to hold your sites and insert the candidate site as a PostGIS point:
CREATE TABLE sites (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
geom GEOMETRY(Point, 4326) NOT NULL
);
CREATE INDEX sites_geom_idx ON sites USING GIST (geom);
INSERT INTO sites (name, geom)
VALUES (
'Woodcroft Shopping Center',
ST_SetSRID(ST_MakePoint(-78.9576007, 35.9210046), 4326)
);
A 5-mile radius buffer gives you a rough first approximation of the trade area (5 miles × 1609.344 meters/mile = 8046.72 meters):
SELECT ST_AsGeoJSON(ST_Buffer(geom::geography, 8046.72)::geometry) AS trade_area
FROM sites
WHERE name = 'Woodcroft Shopping Center';
The cast to geography matters. ST_Buffer on a plain geometry in WGS 84 buffers in degrees, not meters. Casting to geography first buffers in meters and allows you to convert to an approximate 5-mile circle.
If you put the query results into geojson.io, you’ll see that quick radius:

A radius buffer is a reasonable sanity check, but it ignores roads: five miles as the crow flies might be 15 minutes in one direction and 8 minutes in another depending on the street network. Let’s look at isochrones to correct for that.
Step 2: Drive-Time Isochrones
An isochrone is a polygon connecting all points reachable within a given travel time from a starting location. The concept appears in hydrology and epidemiology well before retail analysis adopted it, which is why the open-source tooling for computing isochrones is relatively mature.
OSMnx and NetworkX handle this without a routing server. OSMnx downloads the road network from OpenStreetMap; you then project it to a local coordinate system with ox.project_graph(); NetworkX walks that graph to find all nodes reachable within a given travel time. Install the three required libraries in a virtual environment:
python3 -m venv venv
source venv/bin/activate
pip install osmnx networkx alphashape scikit-learn
Then compute the isochrone in Python (the road network download will take several seconds the first time):
import osmnx as ox
import networkx as nx
import alphashape
import numpy as np
from shapely.geometry import mapping
from shapely.ops import transform
import pyproj
import json
lat, lon = 35.9210046, -78.9576007 # Woodcroft Shopping Center
# Download the driving network within 20 miles of the site
# 20 miles × 1609.344 meters/mile = 32,186.88 meters
G = ox.graph_from_point((lat, lon), dist=32186.88, network_type="drive")
G_proj = ox.project_graph(G)
G_proj = ox.add_edge_speeds(G_proj)
G_proj = ox.add_edge_travel_times(G_proj)
# Fill any missing travel times with the median
travel_times = [
d["travel_time"] for _, _, d in G_proj.edges(data=True)
if not np.isnan(d.get("travel_time", float("nan")))
]
median_tt = np.median(travel_times)
for u, v, k, d in G_proj.edges(keys=True, data=True):
if np.isnan(d.get("travel_time", float("nan"))):
G_proj[u][v][k]["travel_time"] = median_tt
# nearest_nodes accepts lat/lon directly when using the WGS84 graph
center_node = ox.nearest_nodes(G, lon, lat)
# All nodes reachable within 5 minutes (300 seconds)
subgraph = nx.ego_graph(G_proj, center_node, radius=300, distance="travel_time")
node_points = [(d["x"], d["y"]) for _, d in subgraph.nodes(data=True)]
# Concave hull around reachable nodes
isochrone_proj = alphashape.alphashape(node_points, alpha=0.0005)
# Reproject to WGS84 for PostGIS
proj = pyproj.Transformer.from_crs(
G_proj.graph["crs"], "EPSG:4326", always_xy=True
)
isochrone_wgs84 = transform(proj.transform, isochrone_proj)
print(json.dumps(mapping(isochrone_wgs84)))
OSMnx can cache the downloaded graph to disk (controlled by ox.settings.use_cache), so repeat runs for the same area often skip the download. alphashape with alpha=0.0005 fits a concave hull to the reachable nodes; smaller values produce looser shapes. The boundary follows actual road geometry more closely than a convex hull would, which matters near water or in areas with sparse road coverage.
You will write the coordinates from the above script to a table below.
Create the table for trade areas:
CREATE TABLE trade_areas (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
geom GEOMETRY(Polygon, 4326) NOT NULL
);
CREATE INDEX trade_areas_geom_idx
ON trade_areas USING GIST (geom);
Load the printed GeoJSON into PostGIS:
INSERT INTO trade_areas (name, geom)
VALUES (
'Durham 5 min drive',
ST_SetSRID(
ST_GeomFromGeoJSON('{"type": "Polygon", "coordinates": [[[-78.9278142, 35.9461405], [-78.9222578, 35.940253399999996], [-78.913313, 35.933998], [-78.9145264, 35.931788600000004], [-78.8947631, 35.906796899999996], [-78.8945986, 35.905341199999995], [-78.894634, 35.9052499], [-78.905404, 35.90097900000001], [-78.9106547, 35.8985184], [-78.921211, 35.8979216], [-78.9294062, 35.8941264], [-78.9442393, 35.8853581], [-78.948933, 35.876278499999984], [-78.9494951, 35.874679099999994], [-78.9513309, 35.8752637], [-78.9552902, 35.8771328], [-78.9715558, 35.883707599999994], [-78.9828783, 35.8885222], [-79.0052026, 35.895135299999986], [-79.0092367, 35.897159099999996], [-79.0164046, 35.90513519999999], [-79.0162896, 35.9053053], [-79.0089753, 35.907356], [-78.9995206, 35.9314081], [-78.9998943, 35.9505751], [-79.0017204, 35.953535], [-78.9943273, 35.9530625], [-78.9770069, 35.961669900000004], [-78.9623101, 35.96077069999999], [-78.9550023, 35.955931], [-78.9369512, 35.954024999999994], [-78.9361198, 35.9536929], [-78.9350747, 35.953074399999984], [-78.933865, 35.95185599999999], [-78.9278142, 35.9461405]]]}'),
4326
)
);
We can visualize the isochrones as well:

Step 3: Demographic Overlay
We will need the census data and ACS data for this step.
Download the tract-level shapefile for your state from the Census Bureau’s TIGER download tool. For North Carolina, that’s tl_2025_37_tract.shp.

After downloading the TIGER tract zip file, unzip and load the tl_2025_37_tract.shp into PostGIS with shp2pgsql. The -s 4269:4326 flag transforms coordinates from EPSG:4269 (NAD83, which TIGER uses) to EPSG:4326 (WGS 84) on import so everything in your analysis is in one coordinate system:
shp2pgsql -I -s 4269:4326 tl_2025_37_tract.shp census_tracts | psql -d trade_area_db
Pull the demographic variables you need from ACS from their API. For this walkthrough, I’ll pull total population (B01001_001E), median age (B01002_001E), median household income (B19013_001E), and household income brackets (B19001_001E through B19001_017E) and store them in a table called acs_data with a geoid column that matches the tract identifiers in census_tracts.
(Create an API key at https://api.census.gov/data/key_signup.html.)
import requests
import psycopg2
API_KEY = "api_key_here"
vars = ",".join([
"GEO_ID", "NAME",
"B01001_001E", # total pop
"B01002_001E", # median age
"B19013_001E", # median household income
"B19001_001E", # total households
"B19001_002E","B19001_003E","B19001_004E","B19001_005E","B19001_006E", # <35k
"B19001_007E","B19001_008E","B19001_009E","B19001_010E","B19001_011E", # 35-75k
"B19001_012E","B19001_013E","B19001_014E", # 75-150k
"B19001_015E","B19001_016E","B19001_017E", # 150k+
])
url = (
f"https://api.census.gov/data/2023/acs/acs5"
f"?get={vars}"
f"&for=tract:*&in=state:37+county:063"
f"&key={API_KEY}"
)
resp = requests.get(url)
rows = resp.json()
headers = rows[0]
data = rows[1:]
def get(r, key):
return int(r.get(key) or 0)
conn = psycopg2.connect("dbname=trade_area_db")
cur = conn.cursor()
cur.execute("""
CREATE TABLE IF NOT EXISTS acs_data (
geoid TEXT PRIMARY KEY,
name TEXT,
total_pop INTEGER,
median_income INTEGER,
median_age NUMERIC,
pct_hh_under35k NUMERIC,
pct_hh_35_75k NUMERIC,
pct_hh_75_150k NUMERIC,
pct_hh_over150k NUMERIC
);
""")
for row in data:
r = dict(zip(headers, row))
# stripping the "1400000US" prefix because it is specific to tract-level (summary level 140) ACS responses
geoid = r["GEO_ID"].replace("1400000US", "")
total_hh = get(r, "B19001_001E")
if total_hh == 0:
continue # skip tracts with no reported households to avoid distorting income percentages
hh_under35k = sum(get(r, f) for f in [
"B19001_002E","B19001_003E","B19001_004E","B19001_005E","B19001_006E"])
hh_35_75k = sum(get(r, f) for f in [
"B19001_007E","B19001_008E","B19001_009E","B19001_010E","B19001_011E"])
hh_75_150k = sum(get(r, f) for f in [
"B19001_012E","B19001_013E","B19001_014E"])
hh_over150k = sum(get(r, f) for f in [
"B19001_015E","B19001_016E","B19001_017E"])
cur.execute("""
INSERT INTO acs_data (
geoid, name, total_pop, median_income, median_age,
pct_hh_under35k, pct_hh_35_75k, pct_hh_75_150k, pct_hh_over150k
) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)
ON CONFLICT (geoid) DO UPDATE SET
name = EXCLUDED.name,
total_pop = EXCLUDED.total_pop,
median_income = EXCLUDED.median_income,
median_age = EXCLUDED.median_age,
pct_hh_under35k = EXCLUDED.pct_hh_under35k,
pct_hh_35_75k = EXCLUDED.pct_hh_35_75k,
pct_hh_75_150k = EXCLUDED.pct_hh_75_150k,
pct_hh_over150k = EXCLUDED.pct_hh_over150k;
""", (
geoid,
r["NAME"],
get(r, "B01001_001E"),
get(r, "B19013_001E"),
float(r["B01002_001E"] or 0),
round(hh_under35k / total_hh * 100, 1),
round(hh_35_75k / total_hh * 100, 1),
round(hh_75_150k / total_hh * 100, 1),
round(hh_over150k / total_hh * 100, 1),
))
conn.commit()
cur.close()
conn.close()
print(f"Loaded {len(data)} tracts")
With the isochrone stored in PostGIS, intersect it with census tracts and weight the demographic data by the fraction of each tract that falls inside the trade area:
WITH tract_weights AS (
SELECT
a.total_pop,
a.median_income,
a.median_age,
a.pct_hh_under35k,
a.pct_hh_35_75k,
a.pct_hh_75_150k,
a.pct_hh_over150k,
ST_Area(ST_Intersection(t.geom::geography, ta.geom::geography)) /
ST_Area(t.geom::geography) AS coverage_fraction
FROM census_tracts t
JOIN acs_data a ON t.geoid = a.geoid
JOIN trade_areas ta ON ST_Intersects(t.geom, ta.geom)
WHERE ta.name = 'Durham 5 min drive'
)
SELECT
ROUND(SUM(total_pop * coverage_fraction)) AS estimated_population,
ROUND(SUM(median_income * coverage_fraction)::numeric / SUM(coverage_fraction)::numeric) AS area_weighted_income_index,
ROUND(SUM(median_age * coverage_fraction)::numeric / SUM(coverage_fraction)::numeric, 1) AS area_weighted_median_age,
ROUND(SUM(pct_hh_under35k * coverage_fraction)::numeric / SUM(coverage_fraction)::numeric, 1) AS pct_households_under35k,
ROUND(SUM(pct_hh_35_75k * coverage_fraction)::numeric / SUM(coverage_fraction)::numeric, 1) AS pct_households_35_75k,
ROUND(SUM(pct_hh_75_150k * coverage_fraction)::numeric / SUM(coverage_fraction)::numeric, 1) AS pct_households_75_150k,
ROUND(SUM(pct_hh_over150k * coverage_fraction)::numeric / SUM(coverage_fraction)::numeric, 1) AS pct_households_over150k
FROM tract_weights;
Sample output:
estimated_population | 48097
area_weighted_income_index | 109382
area_weighted_median_age | 37.6
pct_households_under35k | 10.7
pct_households_35_75k | 17.8
pct_households_75_150k | 35.0
pct_households_over150k | 36.5
coverage_fraction is the share of each census tract’s area that falls inside the isochrone, computed by dividing the intersection area by the full tract area. Both area calculations use the geography type so PostGIS returns square meters rather than square degrees. estimated_population pro-rates each tract’s population by that fraction: a tract that’s 60% inside the trade area contributes 60% of its population to the total. area_weighted_median_income applies the same weighting to tract-level median household income and divides by the sum of weights. The result is a weighted mean of tract medians, not a true median for the combined area, so interpret it accordingly.
This approach assumes population and income are uniformly distributed within each tract, which isn’t quite true. Tracts vary in density and often contain both commercial and residential land. The result is an approximation, but it can be close enough for site screening.
Step 4: Competitor Proximity
Install the Overture CLI tool:
pip install overturemaps
Download Durham data:
overturemaps download \
--bbox=-79.10,35.83,-78.85,36.01 \
--type=place \
-f geojson \
-o durham_places.geojson
Create a table called competitors:
CREATE TABLE competitors (
id TEXT PRIMARY KEY,
name TEXT,
brand TEXT,
geom GEOMETRY(Point, 4326) NOT NULL
);
CREATE INDEX competitors_geom_idx
ON competitors USING GIST (geom);
Load the Overture Maps places data for your business category into the table:
python3 -c "
import json
with open('durham_places.geojson') as f:
data = json.load(f)
filtered = []
for feat in data['features']:
p = feat['properties']
if p.get('basic_category') == 'coffee_shop':
p['name'] = (p.get('names') or {}).get('primary', '')
filtered.append(feat)
with open('competitors.geojson', 'w') as f:
json.dump({'type': 'FeatureCollection', 'features': filtered}, f)
print(f'{len(filtered)} competitors found')
"
Load the file into Postgres:
ogr2ogr \
-f "PostgreSQL" \
PG:"dbname=trade_area_db user=vparham" \
competitors.geojson \
-nln competitors \
-overwrite \
-select "id,name,brand"
Then query within the trade area and rank competitors by distance from the candidate site:
SELECT
c.name,
ROUND(
(ST_Distance(c.wkb_geometry::geography, s.geom::geography) / 1609.344)::numeric
, 2) AS distance_miles
FROM competitors c
CROSS JOIN sites s
WHERE s.name = 'Woodcroft Shopping Center'
AND ST_Within(c.wkb_geometry, (
SELECT geom FROM trade_areas WHERE name = 'Durham 5 min drive'
))
ORDER BY distance_miles;
Example output:
name | distance_miles
-------------------------------+----------------
Joe Van Gogh | 0.04
Dunkin' | 0.21
Dunkin' Donuts | 0.22
Starbucks | 0.24
Starbucks | 1.19
Starbucks | 1.24
Starbucks | 1.24
Bean Traders | 1.31
Starbucks | 1.33
People's Coffee | 1.39
Starbucks | 1.39
PussyCat Cafe | 1.42
Starbucks | 1.42
Nordstrom Ebar Artisan Coffee | 1.46
Starbucks | 1.61
Starbucks | 1.67
BCBSNC Cafe 11309895 | 2.36
Panera Bread | 2.70
Zephyr Coffee | 2.76
Dunkin' | 3.10
Starbucks | 3.15
(21 rows)
Save the coordinates for these competitors locations to a json file and drag it into geojson.io:
COPY (
SELECT json_build_object(
'type', 'FeatureCollection',
'features', json_agg(
json_build_object(
'type', 'Feature',
'geometry', ST_AsGeoJSON(c.wkb_geometry)::json,
'properties', json_build_object('name', c.name)
)
)
)
FROM competitors c
WHERE ST_Within(c.wkb_geometry, (
SELECT geom FROM trade_areas WHERE name = 'Durham 5 min drive'
))
) TO '/tmp/competitors.geojson';

distance_miles is the straight-line distance from your candidate site to each competitor. Sorting by distance tells you which competitors are nearest to your intended location, while ST_Within keeps results inside the trade area. If you want to group competitors into spatial clusters, ST_ClusterKMeans works as a window function: ST_ClusterKMeans(c.wkb_geometry, 3) OVER () AS cluster_id assigns each competitor to one of three clusters by proximity.
What to Do With the Output
Use this analysis to compare two or three candidate sites: which location has more population, higher income, fewer direct competitors? That comparison narrows the field before committing to site visits or lease negotiations.
The PostGIS documentation covers more sophisticated spatial joins and index strategies. The Census Bureau’s data discovery tool makes it easier to browse ACS variables before pulling them through the API. The OSMnx documentation covers graph projection, edge weight customization, and the network_type options for walking or cycling networks, which matters if you’re evaluating a walkable urban location.