API Reference
polars_iptools.iptools
IpExprExt
IP address operations available via the .ip expression namespace.
All functions in this module are also available as standalone functions.
The .ip namespace is a convenience layer — e.g.:
.. code-block:: python

    # Standalone
    ip.to_address(pl.col("src"))
    # Namespace
    pl.col("src").ip.to_address()
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| expr | Expr | The Polars expression this namespace is attached to. | required |
extract_all_ips(ipv6=False)
Deprecated: use :func:`extract_ips` instead.
ipv4_to_numeric()
Deprecated: use ipv4_to_numeric(expr) standalone function instead.
numeric_to_ipv4()
Deprecated: use numeric_to_ipv4(expr) standalone function instead.
to_address()
Promote to Unified IPAddress extension type (future-proof).
to_canonical()
Alias for to_string().
to_ipv4()
Convert/Parse to IPv4 extension type (optimized 32-bit).
to_native()
Alias for to_address().
to_string()
Convert IP extension back to a canonical string representation.
IpSeriesExt
IP address operations available via the .ip Series namespace.
Mirrors :class:`IpExprExt` for direct Series access, e.g.:
.. code-block:: python

    series.ip.to_address()
    series.ip.extract_ips()
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| s | Series | The Polars Series this namespace is attached to. | required |
extract_all_ips(ipv6=False)
Deprecated: use :func:`extract_ips` instead.
ipv4_to_numeric()
Deprecated: use ipv4_to_numeric(expr) standalone function instead.
numeric_to_ipv4()
Deprecated: use numeric_to_ipv4(expr) standalone function instead.
to_canonical()
Alias for to_string().
is_valid(expr)
Check whether each string is a valid IPv4 or IPv6 address.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| expr | IntoExpr | Expression or column containing IP address strings. | required |
Returns:
| Type | Description |
|---|---|
| Expr | Boolean expression. |
Examples:
>>> import polars as pl
>>> import polars_iptools as ip
>>> pl.DataFrame({"ip": ["8.8.8.8", "::1", "not_an_ip"]}).with_columns(
... ip.is_valid("ip")
... )
shape: (3, 2)
┌───────────┬───────┐
│ ip        ┆ ip    │
│ ---       ┆ ---   │
│ str       ┆ bool  │
╞═══════════╪═══════╡
│ 8.8.8.8   ┆ true  │
│ ::1       ┆ true  │
│ not_an_ip ┆ false │
└───────────┴───────┘
is_private(expr)
Check whether each string is an RFC 1918 private IPv4 address.
Returns False for IPv6 addresses and invalid strings.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| expr | IntoExpr | Expression or column containing IP address strings. | required |
Returns:
| Type | Description |
|---|---|
| Expr | Boolean expression. |
Examples:
>>> pl.DataFrame({"ip": ["192.168.1.1", "8.8.8.8", "::1"]}).with_columns(
... ip.is_private("ip")
... )
shape: (3, 2)
┌─────────────┬───────┐
│ ip          ┆ ip    │
│ ---         ┆ ---   │
│ str         ┆ bool  │
╞═════════════╪═══════╡
│ 192.168.1.1 ┆ true  │
│ 8.8.8.8     ┆ false │
│ ::1         ┆ false │
└─────────────┴───────┘
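The rule described above (RFC 1918 only, with False for IPv6 and invalid input) can be approximated with Python's standard `ipaddress` module. This is an illustrative sketch of the semantics, not the library's implementation; the helper name `is_rfc1918` is hypothetical.

```python
import ipaddress

# The three RFC 1918 private IPv4 blocks.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(value: str) -> bool:
    """Return True only for valid IPv4 strings inside an RFC 1918 block."""
    try:
        addr = ipaddress.IPv4Address(value)
    except ValueError:
        # IPv6 addresses and invalid strings both end up here.
        return False
    return any(addr in net for net in RFC1918_NETWORKS)


for text in ["192.168.1.1", "8.8.8.8", "::1", "not_an_ip"]:
    print(text, is_rfc1918(text))
```

The explicit network list is deliberate: `ipaddress.IPv4Address.is_private` covers additional reserved ranges beyond RFC 1918, so it would not match the behavior documented here.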
to_ipv4(expr)
Parse IPv4 address strings into the IPv4 extension type (UInt32 storage).
The IPv4 type is the most storage-efficient representation for IPv4-only
datasets — 4 bytes per address vs. ~9–15 bytes as a string. The type is
preserved through Parquet and IPC round-trips.
Invalid strings produce null. IPv6 addresses are not supported; use
:func:`to_address` for mixed IPv4/IPv6 data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| expr | IntoExpr | Expression or column containing IPv4 address strings. | required |
Returns:
| Type | Description |
|---|---|
| Expr | Expression of extension type |
Examples:
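As a rough illustration of the storage argument above, the following standard-library sketch parses IPv4 strings into 32-bit integers and compares that fixed 4-byte size with the length of the text form. It does not call the library; `parse_ipv4_u32` is a hypothetical helper, and returning `None` stands in for a null.

```python
import ipaddress
from typing import Optional


def parse_ipv4_u32(value: str) -> Optional[int]:
    """Parse an IPv4 string to its 32-bit integer; None stands in for null."""
    try:
        return int(ipaddress.IPv4Address(value))
    except ValueError:
        return None  # invalid strings and IPv6 addresses


for text in ["8.8.8.8", "255.255.255.255", "::1", "not_an_ip"]:
    number = parse_ipv4_u32(text)
    # A UInt32 always takes 4 bytes; the string takes one byte per character.
    print(f"{text!r}: value={number}, string bytes={len(text.encode())}")
```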
to_address(expr)
Promote strings, integers, or binary to the IPAddress extension type.
IPAddress uses 16-byte binary storage (network-order IPv6). IPv4 addresses
are stored as IPv4-mapped IPv6 (::ffff:x.x.x.x). This is the recommended
type for mixed IPv4/IPv6 datasets and for any data that will be written to
Parquet or IPC — the extension type metadata is preserved on read.
Accepts:
- String: parsed as IPv4 or IPv6
- UInt32: treated as IPv4 numeric
- Binary (16 bytes): used as-is
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| expr | IntoExpr | Expression or column containing IP addresses. | required |
Returns:
| Type | Description |
|---|---|
| Expr | Expression of extension type |
Examples:
>>> df = pl.DataFrame({"ip": ["8.8.8.8", "2606:4700::1111", "192.168.1.1"]})
>>> df.with_columns(ip.to_address("ip"))
shape: (3, 2)
┌─────────────────┬─────────────────┐
│ ip              ┆ ip              │
│ ---             ┆ ---             │
│ str             ┆ ip_addr         │
╞═════════════════╪═════════════════╡
│ 8.8.8.8         ┆ 8.8.8.8         │
│ 2606:4700::1111 ┆ 2606:4700::1111 │
│ 192.168.1.1     ┆ 192.168.1.1     │
└─────────────────┴─────────────────┘
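To make the 16-byte layout concrete, here is a small standard-library sketch of the IPv4-mapped IPv6 encoding and of rendering it back as plain IPv4. It only mirrors the storage format described above and is not the library's code; `to_16_bytes` and `from_16_bytes` are hypothetical names.

```python
import ipaddress


def to_16_bytes(value: str) -> bytes:
    """Return the 16-byte network-order form, mapping IPv4 to ::ffff:x.x.x.x."""
    addr = ipaddress.ip_address(value)
    if addr.version == 4:
        # Ten zero bytes, then 0xffff, then the four IPv4 bytes.
        return b"\x00" * 10 + b"\xff\xff" + addr.packed
    return addr.packed


def from_16_bytes(raw: bytes) -> str:
    """Render 16 bytes as text, showing IPv4-mapped addresses as plain IPv4."""
    addr = ipaddress.IPv6Address(raw)
    mapped = addr.ipv4_mapped  # None unless the address is ::ffff:x.x.x.x
    return str(mapped) if mapped is not None else str(addr)


raw = to_16_bytes("8.8.8.8")
print(raw.hex())
print(from_16_bytes(raw))
print(from_16_bytes(to_16_bytes("2606:4700::1111")))
```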
to_string(expr)
Convert an IPv4 or IPAddress extension column back to canonical string form.
Accepts IPv4 (UInt32 storage) or IPAddress (Binary storage)
extension columns. IPv4-mapped IPv6 addresses (::ffff:x.x.x.x) are
rendered as plain IPv4 strings.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| expr | IntoExpr | Expression or column of | required |
Returns:
| Type | Description |
|---|---|
| Expr | Expression of data type |
Examples:
>>> df = pl.DataFrame({"ip": ["8.8.8.8", "2606:4700::1111"]})
>>> df.with_columns(ip.to_address("ip").ip.to_string())
shape: (2, 2)
┌─────────────────┬─────────────────┐
│ ip              ┆ ip              │
│ ---             ┆ ---             │
│ str             ┆ str             │
╞═════════════════╪═════════════════╡
│ 8.8.8.8         ┆ 8.8.8.8         │
│ 2606:4700::1111 ┆ 2606:4700::1111 │
└─────────────────┴─────────────────┘
extract_ips(expr, ipv6=False, only_public=False, ignore_private=False, ignore_loopback=False, ignore_broadcast=False)
Extract IP addresses from text, including defanged IPs (e.g. 192[.]168[.]1[.]1).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| expr | IntoExpr | Expression or column containing text to extract IPs from. | required |
| ipv6 | bool | If True, also extract IPv6 addresses. | False |
| only_public | bool | If True, skip private, loopback, and broadcast addresses. | False |
| ignore_private | bool | If True, skip RFC 1918 (IPv4) and ULA (IPv6) addresses. | False |
| ignore_loopback | bool | If True, skip loopback addresses (127.0.0.0/8, ::1). | False |
| ignore_broadcast | bool | If True, skip broadcast addresses (255.255.255.255). | False |
Returns:
| Type | Description |
|---|---|
| Expr | Expression of data type |
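The idea of pulling addresses, including defanged ones, out of free text can be sketched with the standard library. This covers IPv4 only, uses a simple regular expression chosen for illustration, and is not how the library's extractor works; `extract_ipv4` is a hypothetical helper.

```python
import ipaddress
import re

# Four dot-separated groups of 1-3 digits; real validity is checked afterwards.
CANDIDATE = re.compile(r"(?<![\d.])\d{1,3}(?:\.\d{1,3}){3}(?!\d|\.\d)")


def extract_ipv4(text: str) -> list[str]:
    """Find IPv4 addresses in text, treating '[.]' as a defanged dot."""
    refanged = text.replace("[.]", ".")
    found = []
    for match in CANDIDATE.finditer(refanged):
        try:
            found.append(str(ipaddress.IPv4Address(match.group())))
        except ValueError:
            continue  # e.g. 999.1.1.1 matches the pattern but is not an address
    return found


print(extract_ipv4("seen 8.8.8.8 and 192[.]168[.]1[.]1, not 999.1.1.1"))
```

Filters such as `ignore_private` or `ignore_loopback` would then be applied to the returned list as a second pass.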
extract_public_ips(expr, ipv6=False)
Extract only publicly routable IP addresses from text.
Shortcut for extract_ips(expr, only_public=True). Skips RFC 1918
private ranges, loopback (127.0.0.0/8, ::1), and broadcast
(255.255.255.255). Defanged IPs (e.g. 192[.]168[.]1[.]1) are
handled automatically.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| expr | IntoExpr | Expression or column containing text to extract IPs from. | required |
| ipv6 | bool | If | False |
Returns:
| Type | Description |
|---|---|
| Expr | Expression of data type |
Examples:
>>> pl.DataFrame({"text": ["seen 8.8.8.8 and 192.168.1.1"]}).with_columns(
... ip.extract_public_ips("text")
... )
shape: (1, 2)
┌──────────────────────────────┬─────────────┐
│ text                         ┆ text        │
│ ---                          ┆ ---         │
│ str                          ┆ list[str]   │
╞══════════════════════════════╪═════════════╡
│ seen 8.8.8.8 and 192.168.1.1 ┆ ["8.8.8.8"] │
└──────────────────────────────┴─────────────┘
extract_private_ips(expr, ipv6=False)
Extract only private IP addresses from text.
Returns RFC 1918 addresses (10/8, 172.16/12, 192.168/16) for
IPv4, and ULA addresses (fc00::/7) for IPv6. Implemented as a
post-extraction filter: the extractor first finds all IPs, then keeps
only those that pass the Ipv4Addr::is_private() check (IPv4) or the ULA check (IPv6).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| expr | IntoExpr | Expression or column containing text to extract IPs from. | required |
| ipv6 | bool | If | False |
Returns:
| Type | Description |
|---|---|
| Expr | Expression of data type |
Examples:
>>> pl.DataFrame({"text": ["8.8.8.8 and 10.0.0.1 and 192.168.1.1"]}).with_columns(
... ip.extract_private_ips("text")
... )
shape: (1, 2)
┌──────────────────────────────────────┬─────────────────────────────┐
│ text                                 ┆ text                        │
│ ---                                  ┆ ---                         │
│ str                                  ┆ list[str]                   │
╞══════════════════════════════════════╪═════════════════════════════╡
│ 8.8.8.8 and 10.0.0.1 and 192.168.1.1 ┆ ["10.0.0.1", "192.168.1.1"] │
└──────────────────────────────────────┴─────────────────────────────┘
is_in(expr, networks)
Check whether each IPv4 or IPv6 address falls within any of the network ranges in "networks".
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| expr | IntoExpr | The expression or column containing the IP addresses to check. | required |
| networks | Union[Expr, Iterable[str]] | IPv4 and IPv6 CIDR ranges defining the network. This can be a Polars expression, a list of strings, or a set of strings. | required |
Examples:
>>> import polars as pl
>>> import polars_iptools as ip
>>> df = pl.DataFrame({"ip": ["8.8.8.8", "1.1.1.1", "2606:4700::1111"]})
>>> networks = ["8.8.8.0/24", "2606:4700::/32"]
>>> df.with_columns(ip.is_in(pl.col("ip"), networks).alias("is_in"))
shape: (3, 2)
┌─────────────────┬───────┐
│ ip              ┆ is_in │
│ ---             ┆ ---   │
│ str             ┆ bool  │
╞═════════════════╪═══════╡
│ 8.8.8.8         ┆ true  │
│ 1.1.1.1         ┆ false │
│ 2606:4700::1111 ┆ true  │
└─────────────────┴───────┘
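For readers who want to check the membership logic by hand, the same test can be sketched per value with the standard `ipaddress` module. This is a plain-Python illustration, not the library's implementation; `in_any_network` is a hypothetical helper, and treating invalid input as False is an assumption.

```python
import ipaddress
from typing import Iterable


def in_any_network(value: str, networks: Iterable[str]) -> bool:
    """Return True if value is a valid IP inside at least one CIDR range."""
    try:
        addr = ipaddress.ip_address(value)
    except ValueError:
        return False  # assumption: invalid input counts as "not in"
    parsed = [ipaddress.ip_network(cidr) for cidr in networks]
    # Only compare against networks of the same IP version.
    return any(addr.version == net.version and addr in net for net in parsed)


networks = ["8.8.8.0/24", "2606:4700::/32"]
for ip_text in ["8.8.8.8", "1.1.1.1", "2606:4700::1111"]:
    print(ip_text, in_any_network(ip_text, networks))
```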
ipv4_to_numeric(expr)
Convert IPv4 address strings to their 32-bit unsigned integer representation.
Invalid or non-IPv4 strings produce null.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| expr | IntoExpr | Expression or column containing IPv4 address strings. | required |
Returns:
| Type | Description |
|---|---|
| Expr | Expression of data type |
Examples:
numeric_to_ipv4(expr)
Convert 32-bit unsigned integers to IPv4 address strings.
Non-numeric or out-of-range values produce null.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| expr | IntoExpr | Expression or column containing numeric ( | required |
Returns:
| Type | Description |
|---|---|
| Expr | Expression of data type |
Examples:
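The integer-to-text mapping can be checked by hand with the standard library. The sketch below is not the library's code; `u32_to_ipv4` is a hypothetical helper, and `None` stands in for a null on out-of-range input.

```python
import ipaddress
from typing import Optional


def u32_to_ipv4(value: int) -> Optional[str]:
    """Render a 32-bit unsigned integer as dotted-quad text; None stands in for null."""
    try:
        return str(ipaddress.IPv4Address(value))
    except ValueError:
        return None  # negative, or larger than 2**32 - 1


for number in [134744072, 3232235777, 2**32, -1]:
    print(number, u32_to_ipv4(number))
```

Converting the resulting string back with `int(ipaddress.IPv4Address(text))` returns the original integer, which is a quick way to confirm that the two directions are inverses.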
extract_all_ips(expr, ipv6=False, **kwargs)
Deprecated: use :func:`extract_ips` instead.
IP Extension Types
polars_iptools.types
IPv4
Bases: BaseExtension
IPv4 Extension Type backing onto UInt32.
This type represents an IPv4 address stored efficiently as a 32-bit unsigned integer but displayed and handled as an IP address.
Known issues
- All-null columns panic when wrapped into extension types (https://github.com/pola-rs/polars/issues/25322, polars-expr/dispatch/extension.rs). Include at least one valid value to avoid this.
- Custom display formatting (showing 8.8.8.8 instead of raw u32) is pending upstream support (https://github.com/pola-rs/polars/pull/26649).
IPAddress
Bases: BaseExtension
Unified IP Address Extension Type backing onto Binary(16).
This type represents any IP address (IPv4 or IPv6). IPv4 addresses are stored as IPv4-mapped IPv6 addresses (::ffff:x.x.x.x).
Known issues
- All-null columns panic when wrapped into extension types (https://github.com/pola-rs/polars/issues/25322, polars-expr/dispatch/extension.rs). Include at least one valid value to avoid this.
- Custom display formatting (showing 8.8.8.8 instead of raw bytes) is pending upstream support (https://github.com/pola-rs/polars/pull/26649).
- to_list() crashes on list[extension] columns (https://github.com/pola-rs/polars/issues/19418).
GeoIP
polars_iptools.geoip
GeoIpExprExt
This class contains tools for geolocation enrichment of IP addresses.
Polars Namespace: geoip
Example: df.with_columns([pl.col("srcip").geoip.asn()])
GeoIpSeriesExt
This class contains tools for geolocation enrichment of IP addresses.
Polars Namespace: geoip
Example: df["srcip"].geoip.asn()
asn(expr, reload_mmdb=False)
Retrieve ASN and organization names for Internet-routed IPv4 and IPv6 addresses. Returns a string in the format "AS{asnum} {asorg}".
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| expr | IntoExpr | The expression or column containing IP addresses. | required |
| reload_mmdb | bool | Force reload/reinitialize of MaxMind db readers. Default is False. | False |
Returns:
| Type | Description |
|---|---|
| Expr | Expression of :class: |
Examples:
>>> import polars as pl
>>> import polars_iptools as ip
>>> df = pl.DataFrame({"ip": ["8.8.8.8", "192.168.1.1", "2606:4700::1111", "999.abc.def.123"]})
>>> df.with_columns([ip.geoip.asn(pl.col("ip")).alias("asn")])
shape: (4, 2)
┌─────────────────┬───────────────────────┐
│ ip              ┆ asn                   │
│ ---             ┆ ---                   │
│ str             ┆ str                   │
╞═════════════════╪═══════════════════════╡
│ 8.8.8.8         ┆ AS15169 GOOGLE        │
│ 192.168.1.1     ┆                       │
│ 2606:4700::1111 ┆ AS13335 CLOUDFLARENET │
│ 999.abc.def.123 ┆                       │
└─────────────────┴───────────────────────┘
Notes
Invalid IP address strings or IPs not found in the database will result in an empty string output.
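Because the result is a single "AS{asnum} {asorg}" string, downstream code often needs to split it. The following is a minimal plain-Python sketch of that step, assuming the format stated above; `split_asn` is a hypothetical helper and not part of the library.

```python
from typing import Optional, Tuple


def split_asn(value: str) -> Optional[Tuple[int, str]]:
    """Split 'AS{asnum} {asorg}' into (number, organization); None if absent."""
    if not value.startswith("AS"):
        return None  # covers the empty string returned for misses
    number, _, org = value[2:].partition(" ")
    if not number.isdigit():
        return None
    return int(number), org


print(split_asn("AS15169 GOOGLE"))
print(split_asn("AS13335 CLOUDFLARENET"))
print(split_asn(""))
```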
full(expr, reload_mmdb=False)
Retrieve full ASN and City geolocation metadata of IPv4 and IPv6 addresses
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| expr | IntoExpr | The expression or column containing IP addresses. | required |
| reload_mmdb | bool | Force reload/reinitialize of MaxMind db readers. Default is False. | False |
Returns:
| Type | Description |
|---|---|
| Expr | A struct expression with the following fields: asnnum (UInt32), asnorg (String), city (String), continent (String), country (String), country_iso (String), latitude (Float64), longitude (Float64), subdivision (String), subdivision_iso (String), timezone (String), postalcode (String). |
Examples:
>>> import polars as pl
>>> import polars_iptools as ip
>>> df = pl.DataFrame({"ip": ["8.8.8.8", "192.168.1.1", "2606:4700::1111", "999.abc.def.123"]})
>>> df = df.with_columns([ip.geoip.full(pl.col("ip")).alias("geoip")])
>>> df
shape: (4, 2)
┌─────────────────┬─────────────────────────────────┐
│ ip              ┆ geoip                           │
│ ---             ┆ ---                             │
│ str             ┆ struct[12]                      │
╞═════════════════╪═════════════════════════════════╡
│ 8.8.8.8         ┆ {15169,"GOOGLE","","NA","","",… │
│ 192.168.1.1     ┆ {0,"","","","","","","",0.0,0.… │
│ 2606:4700::1111 ┆ {13335,"CLOUDFLARENET","","","… │
│ 999.abc.def.123 ┆ {null,null,null,null,null,null… │
└─────────────────┴─────────────────────────────────┘
>>> df.schema
Schema([('ip', String),
        ('geoip',
         Struct({'asnnum': UInt32, 'asnorg': String, 'city': String,
                 'continent': String, 'subdivision_iso': String, 'subdivision': String,
                 'country_iso': String, 'country': String, 'latitude': Float64,
                 'longitude': Float64, 'timezone': String, 'postalcode': String}))])
Notes
IP addresses that are invalid or not found in the database will result in null values in the respective fields.
Spur
polars_iptools.spur
SpurExprExt
This class contains tools for Spur IP Context enrichment.
Polars Namespace: spur
Example: df.with_columns([pl.col("srcip").spur.full()])
SpurSeriesExt
This class contains tools for Spur IP Context enrichment.
Polars Namespace: spur
Example: df["srcip"].spur.full()
full(expr, reload_mmdb=False)
Retrieve full Spur IP Context metadata of IPv4 and IPv6 addresses.
If you are a customer of Spur, you can download a subset of their Anonymization and Anonymization+Residential feeds in MaxMind MMDB format. See https://docs.spur.us/feeds?id=feed-export-utility for more details.
This function requires the directory containing "spur.mmdb" to be defined by environment variable SPUR_MMDB_DIR.
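Before running a lookup, it can help to confirm that the environment variable points at a directory that actually contains the database file. The following standard-library sketch performs that check; `find_spur_mmdb` is a hypothetical helper for illustration and is not provided by the library.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_spur_mmdb(env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return the path to spur.mmdb under SPUR_MMDB_DIR, or None if unusable."""
    directory = env.get("SPUR_MMDB_DIR")
    if not directory:
        return None  # variable unset or empty
    candidate = Path(directory) / "spur.mmdb"
    return candidate if candidate.is_file() else None


path = find_spur_mmdb()
print(path if path else "SPUR_MMDB_DIR is unset or does not contain spur.mmdb")
```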
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| expr | IntoExpr | The expression or column containing IP addresses. | required |
| reload_mmdb | bool | Force reload/reinitialize of Spur's mmdb reader. Default is False. | False |
Returns:
| Type | Description |
|---|---|
| Expr | A struct expression with the following fields: client_count (Float32), infrastructure (String), location_city (String), location_country (String), location_state (String), tag (String), services (List[String]). |
Examples:
>>> import polars as pl
>>> import polars_iptools as ip
>>> df = pl.DataFrame({"ip": ["8.8.8.8", "192.168.1.1", "999.abc.def.123"]})
>>> df = df.with_columns([ip.spur.full(pl.col("ip")).alias("spurcontext")])
>>> df
shape: (3, 2)
┌─────────────────┬─────────────────────────────────┐
│ ip              ┆ spurcontext                     │
│ ---             ┆ ---                             │
│ str             ┆ struct[7]                       │
╞═════════════════╪═════════════════════════════════╡
│ 8.8.8.8         ┆ {0.0,"","","","","",null}       │
│ 192.168.1.1     ┆ {0.0,"","","","","",null}       │
│ 999.abc.def.123 ┆ {null,null,null,null,null,null… │
└─────────────────┴─────────────────────────────────┘
>>> df.schema
Schema([('ip', String),
        ('spurcontext',
         Struct({'client_count': Float32, 'infrastructure': String,
                 'location_city': String, 'location_country': String,
                 'location_state': String, 'tag': String, 'services': List(String)}))])
Notes
IP addresses that are invalid or not found in the database will result in null values in the respective fields.