API Reference

polars_iptools.iptools

IpExprExt

IP address operations available via the .ip expression namespace.

Every method in this namespace is also available as a standalone function; the .ip namespace is a convenience layer. For example:

.. code-block:: python

   import polars as pl
   import polars_iptools as ip

   # Standalone
   ip.to_address(pl.col("src"))

   # Namespace
   pl.col("src").ip.to_address()

Parameters:

Name  Type  Description                                           Default
expr  Expr  The Polars expression this namespace is attached to.  required

extract_all_ips(ipv6=False)

Deprecated: use :func:`extract_ips` instead.

ipv4_to_numeric()

Deprecated: use ipv4_to_numeric(expr) standalone function instead.

numeric_to_ipv4()

Deprecated: use numeric_to_ipv4(expr) standalone function instead.

to_address()

Promote to the unified IPAddress extension type, which holds both IPv4 and IPv6 addresses.

to_canonical()

Alias for to_string().

to_ipv4()

Parse or convert to the IPv4 extension type (UInt32 storage).

to_native()

Alias for to_address().

to_string()

Convert IP extension back to a canonical string representation.

IpSeriesExt

IP address operations available via the .ip Series namespace.

Mirrors :class:`IpExprExt` for direct Series access, e.g.:

.. code-block:: python

   series.ip.to_address()
   series.ip.extract_ips()

Parameters:

Name  Type    Description                                       Default
s     Series  The Polars Series this namespace is attached to.  required

extract_all_ips(ipv6=False)

Deprecated: use :func:`extract_ips` instead.

ipv4_to_numeric()

Deprecated: use ipv4_to_numeric(expr) standalone function instead.

numeric_to_ipv4()

Deprecated: use numeric_to_ipv4(expr) standalone function instead.

to_canonical()

Alias for to_string().

is_valid(expr)

Check whether each string is a valid IPv4 or IPv6 address.

Parameters:

Name  Type      Description                                          Default
expr  IntoExpr  Expression or column containing IP address strings.  required

Returns:

Type  Description
Expr  Boolean expression: True for valid addresses, False otherwise.

Examples:

>>> import polars as pl
>>> import polars_iptools as ip
>>> pl.DataFrame({"ip": ["8.8.8.8", "::1", "not_an_ip"]}).with_columns(
...     ip.is_valid("ip")
... )
shape: (3, 2)
┌───────────┬──────────┐
│ ip        ┆ ip       │
│ ---       ┆ ---      │
│ str       ┆ bool     │
╞═══════════╪══════════╡
│ 8.8.8.8   ┆ true     │
│ ::1       ┆ true     │
│ not_an_ip ┆ false    │
└───────────┴──────────┘
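
As a rough illustration of what "valid" means here, the same check can be sketched with Python's standard ipaddress module. The helper name is_valid_ip is hypothetical, and the stdlib parsing rules are an assumption: they may differ from the library's in edge cases.

```python
import ipaddress


def is_valid_ip(text: str) -> bool:
    """Return True if text parses as an IPv4 or IPv6 address (stdlib rules)."""
    try:
        ipaddress.ip_address(text)
    except ValueError:
        return False
    return True


# Same inputs as the example above.
results = [is_valid_ip(s) for s in ["8.8.8.8", "::1", "not_an_ip"]]
print(results)
```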

is_private(expr)

Check whether each string is an RFC 1918 private IPv4 address.

Returns False for IPv6 addresses and invalid strings.

Parameters:

Name  Type      Description                                          Default
expr  IntoExpr  Expression or column containing IP address strings.  required

Returns:

Type  Description
Expr  Boolean expression.

Examples:

>>> pl.DataFrame({"ip": ["192.168.1.1", "8.8.8.8", "::1"]}).with_columns(
...     ip.is_private("ip")
... )
shape: (3, 2)
┌─────────────┬────────────┐
│ ip          ┆ ip         │
│ ---         ┆ ---        │
│ str         ┆ bool       │
╞═════════════╪════════════╡
│ 192.168.1.1 ┆ true       │
│ 8.8.8.8     ┆ false      │
│ ::1         ┆ false      │
└─────────────┴────────────┘
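
The RFC 1918 ranges are 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16. A minimal stdlib sketch of the same classification follows; is_rfc1918 is a hypothetical helper, not the library's implementation. It tests the three ranges explicitly because Python's own is_private attribute covers more than RFC 1918.

```python
import ipaddress

# The three RFC 1918 private IPv4 ranges.
RFC1918 = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(text: str) -> bool:
    """Return True only for IPv4 addresses inside an RFC 1918 range."""
    try:
        addr = ipaddress.ip_address(text)
    except ValueError:
        return False  # invalid strings are not private
    if addr.version != 4:
        return False  # IPv6 is reported as False, as described above
    return any(addr in net for net in RFC1918)


print([is_rfc1918(s) for s in ["192.168.1.1", "8.8.8.8", "::1"]])
```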

to_ipv4(expr)

Parse IPv4 address strings into the IPv4 extension type (UInt32 storage).

The IPv4 type is the most storage-efficient representation for IPv4-only datasets: 4 bytes per address, compared with 7 to 15 bytes of text for a dotted-quad string. The type is preserved through Parquet and IPC round-trips.

Invalid strings produce null. IPv6 addresses are not supported; use :func:`to_address` for mixed IPv4/IPv6 data.

Parameters:

Name  Type      Description                                            Default
expr  IntoExpr  Expression or column containing IPv4 address strings.  required

Returns:

Type  Description
Expr  Expression of extension type IPv4 (UInt32 storage).

Examples:

>>> df = pl.DataFrame({"ip": ["8.8.8.8", "192.168.1.1"]})
>>> df.with_columns(ip.to_ipv4("ip"))
shape: (2, 2)
┌─────────────┬─────────────┐
│ ip          ┆ ip          │
│ ---         ┆ ---         │
│ str         ┆ ipv4        │
╞═════════════╪═════════════╡
│ 8.8.8.8     ┆ 8.8.8.8     │
│ 192.168.1.1 ┆ 192.168.1.1 │
└─────────────┴─────────────┘
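
To see where the storage saving comes from, this stdlib-only sketch compares the packed 4-byte form of an address with the length of its text form. The helper storage_sizes is hypothetical, and the sketch ignores any per-value overhead a column format adds.

```python
import ipaddress


def storage_sizes(text: str):
    """Return (packed_bytes, text_bytes) for one IPv4 address string."""
    addr = ipaddress.IPv4Address(text)
    return len(addr.packed), len(text.encode("ascii"))


# The shortest and longest dotted-quad strings.
shortest = storage_sizes("1.1.1.1")
longest = storage_sizes("255.255.255.255")
print(shortest, longest)
```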

to_address(expr)

Promote strings, integers, or binary to the IPAddress extension type.

IPAddress uses 16-byte binary storage (network-order IPv6). IPv4 addresses are stored as IPv4-mapped IPv6 (::ffff:x.x.x.x). This is the recommended type for mixed IPv4/IPv6 datasets and for any data that will be written to Parquet or IPC — the extension type metadata is preserved on read.

Accepts:

  • String — parsed as IPv4 or IPv6
  • UInt32 — treated as IPv4 numeric
  • Binary (16 bytes) — used as-is

Parameters:

Name  Type      Description                                    Default
expr  IntoExpr  Expression or column containing IP addresses.  required

Returns:

Type  Description
Expr  Expression of extension type IPAddress (Binary storage).

Examples:

>>> df = pl.DataFrame({"ip": ["8.8.8.8", "2606:4700::1111", "192.168.1.1"]})
>>> df.with_columns(ip.to_address("ip"))
shape: (3, 2)
┌─────────────────┬─────────────────┐
│ ip              ┆ ip              │
│ ---             ┆ ---             │
│ str             ┆ ip_addr         │
╞═════════════════╪═════════════════╡
│ 8.8.8.8         ┆ 8.8.8.8         │
│ 2606:4700::1111 ┆ 2606:4700::1111 │
│ 192.168.1.1     ┆ 192.168.1.1     │
└─────────────────┴─────────────────┘
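
The 16-byte layout described above can be reproduced with the standard library. In this sketch, to_mapped_bytes is a hypothetical helper that mirrors the description (IPv4 stored as ::ffff:x.x.x.x); it is not the library's code.

```python
import ipaddress


def to_mapped_bytes(text: str) -> bytes:
    """Encode an IPv4 or IPv6 string as 16 network-order bytes."""
    addr = ipaddress.ip_address(text)
    if addr.version == 4:
        # IPv4-mapped IPv6: 80 zero bits, 16 one bits, then the IPv4 address.
        return b"\x00" * 10 + b"\xff\xff" + addr.packed
    return addr.packed


v4 = to_mapped_bytes("8.8.8.8")
v6 = to_mapped_bytes("2606:4700::1111")
print(v4.hex(), v6.hex())
```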

to_string(expr)

Convert an IPv4 or IPAddress extension column back to canonical string form.

Accepts IPv4 (UInt32 storage) or IPAddress (Binary storage) extension columns. IPv4-mapped IPv6 addresses (::ffff:x.x.x.x) are rendered as plain IPv4 strings.

Parameters:

Name  Type      Description                                                                                                      Default
expr  IntoExpr  Expression or column of IPv4 or IPAddress extension type. Pass expr.ext.storage() if working with raw storage.  required

Returns:

Type  Description
Expr  Expression of data type String.

Examples:

>>> df = pl.DataFrame({"ip": ["8.8.8.8", "2606:4700::1111"]})
>>> df.with_columns(ip.to_address("ip").ip.to_string())
shape: (2, 2)
┌─────────────────┬─────────────────┐
│ ip              ┆ ip              │
│ ---             ┆ ---             │
│ str             ┆ str             │
╞═════════════════╪═════════════════╡
│ 8.8.8.8         ┆ 8.8.8.8         │
│ 2606:4700::1111 ┆ 2606:4700::1111 │
└─────────────────┴─────────────────┘
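
The rendering rule for IPv4-mapped addresses can be illustrated with the standard library. The helper bytes_to_text below is hypothetical and only mirrors the behaviour described above for 16-byte storage.

```python
import ipaddress


def bytes_to_text(raw: bytes) -> str:
    """Render 16 network-order bytes as an IP string.

    IPv4-mapped values (::ffff:x.x.x.x) come back as plain IPv4 text.
    """
    addr = ipaddress.IPv6Address(raw)
    mapped = addr.ipv4_mapped  # None unless the address is IPv4-mapped
    return str(mapped) if mapped is not None else str(addr)


mapped_text = bytes_to_text(ipaddress.IPv6Address("::ffff:8.8.8.8").packed)
native_text = bytes_to_text(ipaddress.IPv6Address("2606:4700::1111").packed)
print(mapped_text, native_text)
```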

extract_ips(expr, ipv6=False, only_public=False, ignore_private=False, ignore_loopback=False, ignore_broadcast=False)

Extract IP addresses from text, including defanged IPs (e.g. 192[.]168[.]1[.]1).

Parameters:

Name              Type      Description                                                Default
expr              IntoExpr  Expression or column containing text to extract IPs from.  required
ipv6              bool      If True, also extract IPv6 addresses.                      False
only_public       bool      If True, skip private, loopback, and broadcast addresses.  False
ignore_private    bool      If True, skip RFC 1918 (IPv4) and ULA (IPv6) addresses.    False
ignore_loopback   bool      If True, skip loopback addresses (127.0.0.0/8, ::1).       False
ignore_broadcast  bool      If True, skip broadcast addresses (255.255.255.255).       False

Returns:

Type  Description
Expr  Expression of data type List(String).
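
To make the idea of defanged extraction concrete, here is a small stdlib sketch for IPv4 only. It assumes the only defang style is "[.]" in place of ".", which is the one form shown above; the regex and the helper extract_ipv4 are illustrative and are not the library's extractor.

```python
import ipaddress
import re

# Dotted quads where each separator is "." or the defanged "[.]".
_CANDIDATE = re.compile(r"(?<!\d)\d{1,3}(?:(?:\[\.\]|\.)\d{1,3}){3}(?!\d)")


def extract_ipv4(text: str) -> list:
    """Return refanged IPv4 addresses found in text, in order of appearance."""
    found = []
    for match in _CANDIDATE.finditer(text):
        refanged = match.group(0).replace("[.]", ".")
        try:
            ipaddress.IPv4Address(refanged)  # rejects octets above 255
        except ValueError:
            continue
        found.append(refanged)
    return found


hits = extract_ipv4("seen 192[.]168[.]1[.]1 and 8.8.8.8, not 999.1.1.1")
print(hits)
```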

extract_public_ips(expr, ipv6=False)

Extract only publicly routable IP addresses from text.

Shortcut for extract_ips(expr, only_public=True). Skips RFC 1918 private ranges, loopback (127.0.0.0/8, ::1), and broadcast (255.255.255.255). Defanged IPs (e.g. 192[.]168[.]1[.]1) are handled automatically.

Parameters:

Name  Type      Description                                                Default
expr  IntoExpr  Expression or column containing text to extract IPs from.  required
ipv6  bool      If True, also extract IPv6 addresses.                      False

Returns:

Type  Description
Expr  Expression of data type List(String).

Examples:

>>> pl.DataFrame({"text": ["seen 8.8.8.8 and 192.168.1.1"]}).with_columns(
...     ip.extract_public_ips("text")
... )
shape: (1, 2)
┌───────────────────────────────┬──────────────┐
│ text                          ┆ text         │
│ ---                           ┆ ---          │
│ str                           ┆ list[str]    │
╞═══════════════════════════════╪══════════════╡
│ seen 8.8.8.8 and 192.168.1.1  ┆ ["8.8.8.8"]  │
└───────────────────────────────┴──────────────┘

extract_private_ips(expr, ipv6=False)

Extract only private IP addresses from text.

Returns RFC 1918 addresses (10/8, 172.16/12, 192.168/16) for IPv4, and ULA addresses (fc00::/7) for IPv6. Implemented as a post-extraction filter: the extractor first finds all IPs, then keeps only those that pass Ipv4Addr::is_private() (IPv4) or the ULA check (IPv6).

Parameters:

Name  Type      Description                                                Default
expr  IntoExpr  Expression or column containing text to extract IPs from.  required
ipv6  bool      If True, also extract private IPv6 (ULA) addresses.        False

Returns:

Type  Description
Expr  Expression of data type List(String).

Examples:

>>> pl.DataFrame({"text": ["8.8.8.8 and 10.0.0.1 and 192.168.1.1"]}).with_columns(
...     ip.extract_private_ips("text")
... )
shape: (1, 2)
┌───────────────────────────────────────┬──────────────────────────────┐
│ text                                  ┆ text                         │
│ ---                                   ┆ ---                          │
│ str                                   ┆ list[str]                    │
╞═══════════════════════════════════════╪══════════════════════════════╡
│ 8.8.8.8 and 10.0.0.1 and 192.168.1.1  ┆ ["10.0.0.1", "192.168.1.1"]  │
└───────────────────────────────────────┴──────────────────────────────┘
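
The filtering step described above can be sketched as follows: take already-extracted addresses and keep only the private ones. The helper keep_private and its ipv6 flag are hypothetical stand-ins written with the standard library; the ranges are the ones listed in the description.

```python
import ipaddress

PRIVATE_V4 = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]
ULA_V6 = ipaddress.ip_network("fc00::/7")


def keep_private(addresses, ipv6=False):
    """Filter valid IP strings down to RFC 1918 (and optionally ULA) ones."""
    kept = []
    for text in addresses:
        addr = ipaddress.ip_address(text)
        if addr.version == 4:
            if any(addr in net for net in PRIVATE_V4):
                kept.append(text)
        elif ipv6 and addr in ULA_V6:
            kept.append(text)
    return kept


print(keep_private(["8.8.8.8", "10.0.0.1", "192.168.1.1"]))
```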

is_in(expr, networks)

Check whether each IPv4 or IPv6 address falls within any of the CIDR ranges given in networks.

Parameters:

Name      Type                        Description                                                     Default
expr      IntoExpr                    The expression or column containing the IP addresses to check.  required
networks  Union[Expr, Iterable[str]]  IPv4 and IPv6 CIDR ranges defining the network. This can be a Polars expression, a list of strings, or a set of strings.  required

Examples:

>>> import polars as pl
>>> import polars_iptools as ip
>>> df = pl.DataFrame({"ip": ["8.8.8.8", "1.1.1.1", "2606:4700::1111"]})
>>> networks = ["8.8.8.0/24", "2606:4700::/32"]
>>> df.with_columns(ip.is_in(pl.col("ip"), networks).alias("is_in"))
shape: (3, 2)
┌─────────────────┬───────┐
│ ip              ┆ is_in │
│ ---             ┆ ---   │
│ str             ┆ bool  │
╞═════════════════╪═══════╡
│ 8.8.8.8         ┆ true  │
│ 1.1.1.1         ┆ false │
│ 2606:4700::1111 ┆ true  │
└─────────────────┴───────┘
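
For a single address, the membership test can be sketched with the standard library. The helper in_any_network is hypothetical, and returning False for unparseable input is an assumption, since the reference does not state how is_in treats invalid strings.

```python
import ipaddress


def in_any_network(text, networks):
    """Return True if the address lies in at least one CIDR range."""
    try:
        addr = ipaddress.ip_address(text)
    except ValueError:
        return False  # assumption: invalid input is treated as "not in"
    parsed = [ipaddress.ip_network(cidr) for cidr in networks]
    # Only compare addresses and networks of the same IP version.
    return any(addr.version == net.version and addr in net for net in parsed)


networks = ["8.8.8.0/24", "2606:4700::/32"]
flags = [in_any_network(s, networks) for s in ["8.8.8.8", "1.1.1.1", "2606:4700::1111"]]
print(flags)
```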

ipv4_to_numeric(expr)

Convert IPv4 address strings to their 32-bit unsigned integer representation.

Invalid or non-IPv4 strings produce null.

Parameters:

Name  Type      Description                                            Default
expr  IntoExpr  Expression or column containing IPv4 address strings.  required

Returns:

Type  Description
Expr  Expression of data type UInt32.

Examples:

>>> pl.DataFrame({"ip": ["8.8.8.8", "1.1.1.1"]}).with_columns(
...     ip.ipv4_to_numeric("ip")
... )
shape: (2, 2)
┌─────────┬───────────┐
│ ip      ┆ ip        │
│ ---     ┆ ---       │
│ str     ┆ u32       │
╞═════════╪═══════════╡
│ 8.8.8.8 ┆ 134744072 │
│ 1.1.1.1 ┆ 16843009  │
└─────────┴───────────┘
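
The conversion is base-256 arithmetic: a.b.c.d maps to a·2^24 + b·2^16 + c·2^8 + d. A minimal sketch, using a hypothetical helper ipv4_to_int that does no validation:

```python
def ipv4_to_int(text: str) -> int:
    """Convert a dotted-quad string to its 32-bit integer value."""
    a, b, c, d = (int(part) for part in text.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


# 8.8.8.8 -> 8*2**24 + 8*2**16 + 8*2**8 + 8
google = ipv4_to_int("8.8.8.8")
cloudflare = ipv4_to_int("1.1.1.1")
print(google, cloudflare)
```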

numeric_to_ipv4(expr)

Convert 32-bit unsigned integers to IPv4 address strings.

Non-numeric or out-of-range values produce null.

Parameters:

Name  Type      Description                                                           Default
expr  IntoExpr  Expression or column containing numeric (UInt32 or castable) values.  required

Returns:

Type  Description
Expr  Expression of data type String.

Examples:

>>> pl.DataFrame({"n": [134744072, 16843009]}).with_columns(
...     ip.numeric_to_ipv4("n")
... )
shape: (2, 2)
┌───────────┬─────────┐
│ n         ┆ n       │
│ ---       ┆ ---     │
│ i64       ┆ str     │
╞═══════════╪═════════╡
│ 134744072 ┆ 8.8.8.8 │
│ 16843009  ┆ 1.1.1.1 │
└───────────┴─────────┘
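
The inverse conversion peels off one byte at a time, most significant first. In this sketch the hypothetical helper int_to_ipv4 returns None for out-of-range input, standing in for the null described above.

```python
def int_to_ipv4(n: int):
    """Convert a 32-bit integer to dotted-quad text, or None if out of range."""
    if not 0 <= n <= 0xFFFFFFFF:
        return None
    return ".".join(str((n >> shift) & 0xFF) for shift in (24, 16, 8, 0))


texts = [int_to_ipv4(n) for n in (134744072, 16843009)]
print(texts)
```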

extract_all_ips(expr, ipv6=False, **kwargs)

Deprecated: use :func:`extract_ips` instead.

IP Extension Types

polars_iptools.types

IPv4

Bases: BaseExtension

IPv4 Extension Type backing onto UInt32.

This type represents an IPv4 address stored efficiently as a 32-bit unsigned integer but displayed and handled as an IP address.

Known issues
  • All-null columns panic when wrapped into extension types (https://github.com/pola-rs/polars/issues/25322, polars-expr/dispatch/extension.rs). Include at least one valid value to avoid this.
  • Custom display formatting (showing 8.8.8.8 instead of raw u32) is pending upstream support (https://github.com/pola-rs/polars/pull/26649).

IPAddress

Bases: BaseExtension

Unified IP Address Extension Type backing onto Binary(16).

This type represents any IP address (IPv4 or IPv6). IPv4 addresses are stored as IPv4-mapped IPv6 addresses (::ffff:x.x.x.x).

Known issues
  • All-null columns panic when wrapped into extension types (https://github.com/pola-rs/polars/issues/25322, polars-expr/dispatch/extension.rs). Include at least one valid value to avoid this.
  • Custom display formatting (showing 8.8.8.8 instead of raw bytes) is pending upstream support (https://github.com/pola-rs/polars/pull/26649).
  • to_list() crashes on list[extension] columns (https://github.com/pola-rs/polars/issues/19418).

GeoIP

polars_iptools.geoip

GeoIpExprExt

This class contains tools for geolocation enrichment of IP addresses.

Polars Namespace: geoip

Example: df.with_columns([pl.col("srcip").geoip.asn()])

GeoIpSeriesExt

This class contains tools for geolocation enrichment of IP addresses.

Polars Namespace: geoip

Example: df["srcip"].geoip.asn()

asn(expr, reload_mmdb=False)

Retrieve the ASN and organization name for Internet-routed IPv4 and IPv6 addresses. Returns a string in the format "AS{asnum} {asorg}".

Parameters:

Name         Type      Description                                        Default
expr         IntoExpr  The expression or column containing IP addresses.  required
reload_mmdb  bool      Force reload/reinitialize of MaxMind db readers.   False

Returns:

Type  Description
Expr  Expression of data type String.

Examples:

>>> import polars as pl
>>> import polars_iptools as ip
>>> df = pl.DataFrame({"ip": ["8.8.8.8", "192.168.1.1", "2606:4700::1111", "999.abc.def.123"]})
>>> df.with_columns([ip.geoip.asn(pl.col("ip")).alias("asn")])
shape: (4, 2)
┌─────────────────┬───────────────────────┐
│ ip              ┆ asn                   │
│ ---             ┆ ---                   │
│ str             ┆ str                   │
╞═════════════════╪═══════════════════════╡
│ 8.8.8.8         ┆ AS15169 GOOGLE        │
│ 192.168.1.1     ┆                       │
│ 2606:4700::1111 ┆ AS13335 CLOUDFLARENET │
│ 999.abc.def.123 ┆                       │
└─────────────────┴───────────────────────┘
Notes

Invalid IP address strings or IPs not found in the database will result in an empty string output.
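
Because the result is a single "AS{asnum} {asorg}" string, downstream code often needs the two parts separately. The sketch below splits one value in plain Python; parse_asn is a hypothetical helper, and treating the empty string as "no result" follows the note above.

```python
def parse_asn(value: str):
    """Split "AS15169 GOOGLE" into (15169, "GOOGLE"); return None if empty or malformed."""
    if not value:
        return None
    tag, _, org = value.partition(" ")
    if not tag.startswith("AS") or not tag[2:].isdigit():
        return None
    return int(tag[2:]), org


parsed = [parse_asn(v) for v in ["AS15169 GOOGLE", "", "AS13335 CLOUDFLARENET"]]
print(parsed)
```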

full(expr, reload_mmdb=False)

Retrieve full ASN and City geolocation metadata for IPv4 and IPv6 addresses.

Parameters:

Name         Type      Description                                        Default
expr         IntoExpr  The expression or column containing IP addresses.  required
reload_mmdb  bool      Force reload/reinitialize of MaxMind db readers.   False

Returns:

Type  Description
Expr  An expression that returns a struct containing the following fields:

  • asnnum : UInt32
  • asnorg : String
  • city : String
  • continent : String
  • country : String
  • country_iso : String
  • latitude : Float64
  • longitude : Float64
  • subdivision : String
  • subdivision_iso : String
  • timezone : String
  • postalcode : String

Examples:

>>> import polars as pl
>>> import polars_iptools as ip
>>> df = pl.DataFrame({"ip": ["8.8.8.8", "192.168.1.1", "2606:4700::1111", "999.abc.def.123"]})
>>> df = df.with_columns([ip.geoip.full(pl.col("ip")).alias("geoip")])
>>> df
shape: (4, 2)
┌─────────────────┬─────────────────────────────────┐
│ ip              ┆ geoip                           │
│ ---             ┆ ---                             │
│ str             ┆ struct[12]                      │
╞═════════════════╪═════════════════════════════════╡
│ 8.8.8.8         ┆ {15169,"GOOGLE","","NA","","",… │
│ 192.168.1.1     ┆ {0,"","","","","","","",0.0,0.… │
│ 2606:4700::1111 ┆ {13335,"CLOUDFLARENET","","","… │
│ 999.abc.def.123 ┆ {null,null,null,null,null,null… │
└─────────────────┴─────────────────────────────────┘
>>> df.schema
Schema([('ip', String),
        ('geoip',
         Struct({'asnnum': UInt32, 'asnorg': String, 'city': String,
         'continent': String, 'subdivision_iso': String, 'subdivision': String,
         'country_iso': String, 'country': String, 'latitude': Float64,
         'longitude': Float64, 'timezone': String, 'postalcode': String}))])
Notes

Invalid IP addresses produce null in every field. In the example above, a valid address that is not found in the database (192.168.1.1) produces zero or empty-string values instead.

Spur

polars_iptools.spur

SpurExprExt

This class contains tools for Spur IP Context enrichment.

Polars Namespace: spur

Example: df.with_columns([pl.col("srcip").spur.full()])

SpurSeriesExt

This class contains tools for Spur IP Context enrichment.

Polars Namespace: spur

Example: df["srcip"].spur.full()

full(expr, reload_mmdb=False)

Retrieve full Spur IP Context metadata for IPv4 and IPv6 addresses.

If you are a customer of Spur, you can download a subset of their Anonymization and Anonymization+Residential feeds in MaxMind MMDB format. See https://docs.spur.us/feeds?id=feed-export-utility for more details.

This function requires the directory containing "spur.mmdb" to be defined by environment variable SPUR_MMDB_DIR.
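
Before calling the function, it can help to confirm that the environment variable points at a directory that actually contains spur.mmdb. The following is a sketch of such a pre-flight check; spur_mmdb_path is a hypothetical helper and is not part of the library.

```python
import os
import tempfile
from pathlib import Path


def spur_mmdb_path(env=None):
    """Return the Path to spur.mmdb under SPUR_MMDB_DIR, or None if unavailable."""
    env = os.environ if env is None else env
    directory = env.get("SPUR_MMDB_DIR")
    if not directory:
        return None  # variable unset or empty
    candidate = Path(directory) / "spur.mmdb"
    return candidate if candidate.is_file() else None


# Demonstration with a throwaway directory standing in for the real feed location.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "spur.mmdb").touch()
    found = spur_mmdb_path({"SPUR_MMDB_DIR": tmp})
    found_name = found.name if found else None

print(found_name)
```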

Parameters:

Name         Type      Description                                        Default
expr         IntoExpr  The expression or column containing IP addresses.  required
reload_mmdb  bool      Force reload/reinitialize of Spur's mmdb reader.   False

Returns:

Type  Description
Expr  An expression that returns a struct containing the following fields:

  • client_count : Float32
  • infrastructure : String
  • location_city : String
  • location_country : String
  • location_state : String
  • tag : String
  • services : List[String]

Examples:

>>> import polars as pl
>>> import polars_iptools as ip
>>> df = pl.DataFrame({"ip": ["8.8.8.8", "192.168.1.1", "999.abc.def.123"]})
>>> df = df.with_columns([ip.spur.full(pl.col("ip")).alias("spurcontext")])
>>> df
shape: (3, 2)
┌─────────────────┬─────────────────────────────────┐
│ ip              ┆ spurcontext                     │
│ ---             ┆ ---                             │
│ str             ┆ struct[7]                       │
╞═════════════════╪═════════════════════════════════╡
│ 8.8.8.8         ┆ {0.0,"","","","","",null}       │
│ 192.168.1.1     ┆ {0.0,"","","","","",null}       │
│ 999.abc.def.123 ┆ {null,null,null,null,null,null… │
└─────────────────┴─────────────────────────────────┘
>>> df.schema
Schema([('ip', String),
        ('spurcontext',
         Struct({'client_count': Float32, 'infrastructure': String,
         'location_city': String, 'location_country': String,
         'location_state': String, 'tag': String, 'services': List(String)}))])
Notes

Invalid IP addresses produce null in every field. In the example above, valid addresses with no entry in the database (8.8.8.8, 192.168.1.1) produce zero or empty-string values, with services left null.