Pandas & GeoPandas Support¶
Geodistpy integrates with pandas and GeoPandas so you can pass DataFrames (or GeoDataFrames) directly into distance and spatial-query functions. This fits the common workflow: get coordinates from a geocoder or database, then run fast geodesic distance and nearest-neighbor logic on tabular data.
Installation (optional)¶
Pandas and GeoPandas are optional dependencies. Install them only if you use DataFrames:
pip install geodistpy[pandas] # for pandas DataFrame support
pip install geodistpy[geopandas] # for GeoDataFrame support (includes pandas)
Without these extras, all other geodistpy features work as usual; only DataFrame/GeoDataFrame arguments require the corresponding package.
After geocoding: the typical workflow¶
Geocoding (address → latitude, longitude) is often the first step — with geopy, your own API, or a database. The next step is usually:
- “How far from this user to these N stores?”
- “Which store is nearest?”
- “Which points are within 500 km?”
Geodistpy is built for that: once you have coordinates, run distance and nearest-neighbor logic fast with geodesic accuracy.
Example: geopy → geodistpy¶
# 1. Get coordinates (e.g. from geopy)
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="my_app")
user_location = geolocator.geocode("Alexanderplatz, Berlin")
user = (user_location.latitude, user_location.longitude)
# 2. Your store locations (lat, lon)
stores = [
(52.5200, 13.4050), # Berlin
(48.8566, 2.3522), # Paris
(51.5074, -0.1278), # London
]
# 3. How far from this user to each store?
from geodistpy import geodist_to_many
distances_km = geodist_to_many(user, stores, metric="km")
# 4. Which store is nearest?
from geodistpy import geodesic_knn
nearest_idx, nearest_dists = geodesic_knn(user, stores, k=1, metric="km")
# 5. Which stores are within 500 km?
from geodistpy import point_in_radius
within_idx, within_dists = point_in_radius(user, stores, 500, metric="km")
No projection, no haversine hacks — just fast geodesic (Vincenty) on WGS84.
Using pandas DataFrames¶
If your points live in a pandas DataFrame (e.g. a table of stores or cities), you can pass the DataFrame directly. Geodistpy will use the right columns and return results aligned with the DataFrame index.
Column names¶
- Auto-detect: Use columns named
lat/lonorlatitude/longitude(orLat/Lon,LAT/LON). No extra arguments needed. - Custom names: Pass
lat_colandlon_colto any function that accepts a DataFrame.
Functions that accept DataFrames¶
| Function | DataFrame argument | Return when DataFrame given |
|---|---|---|
geodist_to_many(origin, points, ...) |
points |
pandas.Series with points.index |
geodesic_knn(point, candidates, ...) |
candidates |
(indices, distances) — indices are index labels |
point_in_radius(center, candidates, ...) |
candidates |
(indices, distances) — indices are index labels |
Example: stores in a DataFrame¶
import pandas as pd
from geodistpy import geodist_to_many, geodesic_knn, point_in_radius
# DataFrame with lat/lon columns
stores_df = pd.DataFrame({
"lat": [48.8566, 51.5074, 40.7128],
"lon": [2.3522, -0.1278, -74.006],
"name": ["Paris", "London", "New York"],
})
user = (52.52, 13.40)
# One-to-many distances — returns a Series indexed like your DataFrame
distances = geodist_to_many(user, stores_df, metric="km")
# k nearest (indices are row labels)
idx, dists = geodesic_knn(user, stores_df, k=2, metric="km")
nearest_stores = stores_df.loc[idx]
# All stores within 1000 km
idx, dists = point_in_radius(user, stores_df, 1000, metric="km")
within = stores_df.loc[idx]
Custom column names¶
df = pd.DataFrame({"y": [48.85, 51.50], "x": [2.35, -0.12]})
distances = geodist_to_many(user, df, metric="km", lat_col="y", lon_col="x")
idx, dists = geodesic_knn(user, df, k=1, metric="km", lat_col="y", lon_col="x")
Extracting coordinates: coordinates_from_df¶
When you only need a (lat, lon) array from a DataFrame or GeoDataFrame (e.g. for geodist_matrix or custom logic), use coordinates_from_df:
from geodistpy import coordinates_from_df, geodist_matrix
coords, index = coordinates_from_df(stores_df)
# coords: (n, 2) array of (latitude, longitude)
# index: stores_df.index (for aligning results)
# Use with any function that expects an array
matrix = geodist_matrix(coords, metric="km")
pandas DataFrame¶
- Auto-detect: Columns
lat/lonorlatitude/longitude(orLat/Lon,LAT/LON). - Explicit:
coordinates_from_df(df, lat_col="y", lon_col="x"). - Returns:
(coords, df.index)—coordsis shape(n, 2).
GeoPandas GeoDataFrame¶
- Ignores
lat_col/lon_col. Uses the geometry column (point geometry). - Assumes a geographic CRS (e.g. WGS84): geometry
.x= longitude,.y= latitude. - Returns:
(coords, gdf.index).
import geopandas as gpd
from shapely.geometry import Point
from geodistpy import coordinates_from_df
gdf = gpd.GeoDataFrame(
{"name": ["Paris", "London"]},
geometry=[Point(2.35, 48.85), Point(-0.12, 51.50)],
crs="EPSG:4326",
)
coords, index = coordinates_from_df(gdf)
# coords = [[48.85, 2.35], [51.50, -0.12]]
One-to-many distances: geodist_to_many¶
geodist_to_many(origin, points, ...) returns the distance from a single origin to each of points. Same as geodist_matrix([origin], points, ...)[0] but with a clearer intent.
- origin:
(lat, lon)tuple. - points: Array of shape
(n, 2)or a pandas DataFrame / GeoDataFrame (withlat/lonorlat_col/lon_col). - Returns:
ndarrayof length n, or apandas.Series(indexed bypoints.index) when points is a DataFrame/GeoDataFrame.
Use it whenever you have one reference point and many targets (e.g. one user, many stores).
Summary¶
| Goal | Function | With DataFrame? |
|---|---|---|
| Distances from one point to many | geodist_to_many(origin, points, ...) |
Yes → returns Series |
| k nearest points | geodesic_knn(point, candidates, k=..., ...) |
Yes → indices = index labels |
| Points within radius | point_in_radius(center, candidates, radius, ...) |
Yes → indices = index labels |
| Get (lat, lon) array from table | coordinates_from_df(df, ...) |
Yes (DataFrame or GeoDataFrame) |
Install geodistpy[pandas] or geodistpy[geopandas] to use DataFrame/GeoDataFrame inputs. For more detail on k-NN and point-in-radius, see Spatial Queries. For the full API, see API Reference.