API Reference
Complete reference for all nwgrep functions and methods.
Core Function
nwgrep()
Search and filter dataframes with grep-like functionality.
nwgrep(
    df,
    pattern,
    *,
    columns=None,
    case_sensitive=True,
    regex=False,
    invert=False,
    whole_word=False,
)
Parameters:
- df: DataFrame or LazyFrame (pandas, polars, pyarrow, daft, etc.). The dataframe to search. Can be a native dataframe or a Narwhals DataFrame.
- pattern: str or list[str]. The search pattern(s). If a list is provided, a row matches when any of the patterns matches (OR logic).
- columns: list[str], optional. Specific column names to search. If None (default), searches all columns.
- case_sensitive: bool, default True. Whether the search is case-sensitive.
- regex: bool, default False. If True, treat pattern as a regular expression. If False, treat it as a literal string.
- invert: bool, default False. If True, return rows that do NOT match the pattern (like grep -v).
- whole_word: bool, default False. If True, only match complete words. Adds word boundaries (\b) around the pattern.
Returns:
Same type as input - DataFrame or LazyFrame matching the input backend.
Examples:
Basic search:
import pandas as pd
from nwgrep import nwgrep
df = pd.DataFrame({"col": ["foo", "bar", "baz"]})
result = nwgrep(df, "ba")
# Returns rows with "bar" and "baz"
Case-insensitive:
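With nwgrep itself this is the case_sensitive=False flag from the signature above, for example nwgrep(df, "FOO", case_sensitive=False). What the flag means can be sketched in plain Python with the standard re module. The block below does not call nwgrep, and the helper name matches_ci is hypothetical.

```python
import re

def matches_ci(values, pattern):
    """Hypothetical helper: keep values containing `pattern`, ignoring case."""
    # re.escape keeps the pattern literal; re.IGNORECASE drops case sensitivity.
    rx = re.compile(re.escape(pattern), re.IGNORECASE)
    return [v for v in values if rx.search(v)]

ci_hits = matches_ci(["foo", "Bar", "FOOD"], "FOO")
# "foo" and "FOOD" both contain "foo" once case is ignored; "Bar" does not.
```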
Column-specific:
df = pd.DataFrame({
    "name": ["Alice", "Bob"],
    "email": ["alice@test.com", "bob@test.com"],
})
result = nwgrep(df, "alice", columns=["name"])
# Only searches the name column
Regex:
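The difference between regex=False and regex=True is easiest to see with a pattern that contains a metacharacter. The sketch below models both modes on a plain list using Python's re module; it is an illustration of the semantics described in the parameter list, not nwgrep's implementation, and the function name model_search is hypothetical. Regex dialects can also differ slightly between backends.

```python
import re

def model_search(values, pattern, regex=False):
    """Hypothetical model of literal vs. regex matching on plain strings."""
    # Literal mode escapes metacharacters, so "." only matches a real dot.
    source = pattern if regex else re.escape(pattern)
    rx = re.compile(source)
    return [v for v in values if rx.search(v)]

values = ["a.c", "abc", "xyz"]
literal_hits = model_search(values, "a.c")            # only the string with a real dot
regex_hits = model_search(values, "a.c", regex=True)  # "." now matches any character
```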
Multiple patterns:
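Passing a list, as in pattern=["text1", "text2"], keeps a row when any one of the patterns matches. A minimal plain-Python model of that OR logic, assuming literal patterns and using a hypothetical helper name, could look like this:

```python
import re

def matches_any(values, patterns):
    """Hypothetical model: keep values containing at least one literal pattern."""
    # Joining the escaped patterns with "|" yields one OR-alternation regex.
    rx = re.compile("|".join(re.escape(p) for p in patterns))
    return [v for v in values if rx.search(v)]

any_hits = matches_any(["foo", "bar", "baz"], ["fo", "az"])
# "foo" matches "fo", "baz" matches "az", "bar" matches neither.
```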
Invert match:
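invert=True flips the keep/drop decision for every row, so the inverted result is the complement of the normal one. The sketch below models this on a plain list with substring matching; the helper name model_invert is hypothetical and nwgrep is not called.

```python
def model_invert(values, pattern, invert=False):
    """Hypothetical model of substring matching with an invert flag."""
    # `!=` acts as XOR: with invert=True, matching values are dropped instead of kept.
    return [v for v in values if (pattern in v) != invert]

data = ["foo", "bar", "baz"]
kept = model_invert(data, "ba")                  # values containing "ba"
dropped = model_invert(data, "ba", invert=True)  # everything else
```

Together, kept and dropped always partition the input, which is a quick sanity check when using invert on real data.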
Whole word:
df = pd.DataFrame({"text": ["active", "inactive", "actively"]})
result = nwgrep(df, "active", whole_word=True)
# Only matches "active", not "inactive" or "actively"
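The effect of the \b word boundaries can be checked in isolation with Python's re module. This is a sketch of the boundary semantics only, on a plain list, with a hypothetical helper name; it does not call nwgrep.

```python
import re

def whole_word_hits(values, word):
    """Hypothetical model: match `word` only when it stands alone as a word."""
    # \b requires a transition between a word character and a non-word character.
    rx = re.compile(r"\b" + re.escape(word) + r"\b")
    return [v for v in values if rx.search(v)]

ww_hits = whole_word_hits(
    ["active", "inactive", "actively", "not active"], "active"
)
# "inactive" and "actively" contain the substring but fail the boundary check.
```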
Accessor Registration
register_grep_accessor()
Register the .grep() accessor method on pandas and polars DataFrames.
Parameters: None
Returns: None
Side Effects:
Registers .grep() method on:
- pandas.DataFrame
- polars.DataFrame
- polars.LazyFrame
Examples:
from nwgrep import register_grep_accessor
import pandas as pd
# Register once at the start
register_grep_accessor()
df = pd.DataFrame({"col": ["foo", "bar"]})
# Now you can use .grep() directly
result = df.grep("foo")
result = df.grep("FOO", case_sensitive=False)
result = df.grep("pattern", columns=["col"])
Works with polars too:
import polars as pl
from nwgrep import register_grep_accessor
register_grep_accessor()
df = pl.DataFrame({"col": ["foo", "bar"]})
result = df.grep("foo")
Warning
Call register_grep_accessor() only once per session, typically at the start of your script or notebook.
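In code that may be re-run, such as a notebook cell, one way to follow this advice is a small run-once guard. The sketch below uses a stand-in function in place of register_grep_accessor so that it runs without nwgrep installed; register_once and the stand-in are hypothetical names, not part of nwgrep.

```python
import functools

registration_calls = []

def fake_register_grep_accessor():
    """Stand-in for nwgrep.register_grep_accessor, so the sketch runs alone."""
    registration_calls.append("registered")

@functools.lru_cache(maxsize=None)
def register_once():
    """Hypothetical guard: the body runs on the first call; later calls hit the cache."""
    fake_register_grep_accessor()

register_once()
register_once()  # cached: the stand-in is not called a second time
```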
DataFrame Accessor Method
.grep()
Available after calling register_grep_accessor().
df.grep(
    pattern,
    *,
    columns=None,
    case_sensitive=True,
    regex=False,
    invert=False,
    whole_word=False,
)
Parameters:
Same as nwgrep(), except df is implicit (the dataframe you're calling .grep() on).
Returns:
Same type as the input dataframe.
Examples:
from nwgrep import register_grep_accessor
import pandas as pd
register_grep_accessor()
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Eve"],
    "status": ["active", "inactive", "active"],
})
# Find active users (whole_word=True so that "inactive" does not match)
active = df.grep("active", whole_word=True)
# Case-insensitive search
users = df.grep("ALICE", case_sensitive=False)
# Column-specific
name_search = df.grep("Alice", columns=["name"])
# Regex
email_pattern = df.grep(r".*@example\.com", regex=True)
# Exclude pattern (whole_word=True so that "inactive" rows are kept)
not_active = df.grep("active", invert=True, whole_word=True)
Type Signatures
nwgrep includes complete type annotations. The simplified signatures are:
from typing import Any, TypeVar, overload

import narwhals as nw

FrameT = TypeVar("FrameT")

@overload
def nwgrep(
    df: FrameT,
    pattern: str | list[str],
    *,
    columns: list[str] | None = None,
    case_sensitive: bool = True,
    regex: bool = False,
    invert: bool = False,
    whole_word: bool = False,
) -> FrameT: ...

# Narwhals-specific overload
@overload
def nwgrep(
    df: nw.DataFrame[Any],
    pattern: str | list[str],
    *,
    columns: list[str] | None = None,
    case_sensitive: bool = True,
    regex: bool = False,
    invert: bool = False,
    whole_word: bool = False,
) -> nw.DataFrame[Any]: ...
The function preserves the exact type of the input dataframe.
Narwhals Integration
nwgrep is a Narwhals plugin, meaning:
- It can be auto-discovered by Narwhals-aware tools
- It handles Narwhals DataFrames natively
- It returns Narwhals DataFrames when given Narwhals input
import narwhals as nw
from nwgrep import nwgrep
# Native pandas
import pandas as pd
df_native = pd.DataFrame({"col": ["a", "b"]})
# Convert to Narwhals
df_nw = nw.from_native(df_native)
# nwgrep handles both
result_nw = nwgrep(df_nw, "a") # Returns Narwhals DataFrame
result_native = nwgrep(df_native, "a") # Returns pandas DataFrame
# Convert back if needed
result_pandas = nw.to_native(result_nw)
Parameter Combinations
Here are common parameter combinations:
| Use Case | Parameters |
|---|---|
| Literal search | pattern="text" |
| Case-insensitive | case_sensitive=False |
| Regex search | regex=True |
| Specific columns | columns=["col1", "col2"] |
| Exclude pattern | invert=True |
| Whole words only | whole_word=True |
| Multiple patterns | pattern=["text1", "text2"] |
| Complex regex | pattern=r"^\w+@\w+\.\w+$", regex=True |
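The parameters in the table can be combined freely. To make the interaction concrete, the sketch below models the documented semantics on a list of dicts using only the standard library. It is an assumption-laden illustration, not nwgrep's implementation: build_regex and model_grep are hypothetical names, cells are converted with str(), and Python's re stands in for whatever regex engine a backend uses.

```python
import re

def build_regex(pattern, *, case_sensitive=True, regex=False, whole_word=False):
    """Hypothetical: fold pattern, regex, whole_word and case_sensitive into one regex."""
    patterns = [pattern] if isinstance(pattern, str) else list(pattern)
    # Literal patterns are escaped; a list becomes an OR-alternation.
    parts = [p if regex else re.escape(p) for p in patterns]
    body = "|".join(f"(?:{p})" for p in parts)
    if whole_word:
        body = rf"\b(?:{body})\b"
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(body, flags)

def model_grep(rows, pattern, *, columns=None, invert=False, **options):
    """Hypothetical row filter: keep a row if any searched cell matches (or none, if inverted)."""
    rx = build_regex(pattern, **options)
    result = []
    for row in rows:
        searched = columns if columns is not None else list(row)
        hit = any(rx.search(str(row[c])) for c in searched)  # str() is an assumption
        if hit != invert:
            result.append(row)
    return result

rows = [
    {"name": "Alice", "status": "active"},
    {"name": "Bob", "status": "inactive"},
    {"name": "Eve", "status": "active"},
]
active_rows = model_grep(
    rows, "ACTIVE", columns=["status"], case_sensitive=False, whole_word=True
)
others = model_grep(rows, ["ali", "bob"], case_sensitive=False, invert=True)
```

Here active_rows keeps Alice and Eve, because the whole-word check rejects "inactive", and others keeps only Eve, because the inverted OR search drops every row that mentions "ali" or "bob".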
Supported Backends
nwgrep works with any backend supported by Narwhals:
| Backend | Type Preserved | Notes |
|---|---|---|
| pandas | ✅ | Returns pandas.DataFrame |
| polars (eager) | ✅ | Returns polars.DataFrame |
| polars (lazy) | ✅ | Returns polars.LazyFrame |
| pyarrow | ✅ | Returns pyarrow.Table |
| daft | ✅ | Returns daft.DataFrame (lazy) |
| dask | ✅ | Returns dask.dataframe.DataFrame |
| modin | ✅ | Returns modin.pandas.DataFrame |
| cuDF | ✅ | Returns cudf.DataFrame |
The return type always matches the input type.
Error Handling
nwgrep raises clear errors for common issues: invalid column names, invalid regex patterns, and empty patterns.
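The exception types and messages that nwgrep itself raises are not listed here. As a caller-side sketch, the same three conditions can be checked before searching. Everything in the block is an assumption made for illustration: validate_search_args is a hypothetical helper, ValueError is an arbitrary choice, and Python's re is only an approximate regex check because backends may use a different regex engine.

```python
import re

def validate_search_args(available_columns, pattern, columns=None, regex=False):
    """Hypothetical pre-check for empty patterns, unknown columns and bad regexes."""
    patterns = [pattern] if isinstance(pattern, str) else list(pattern)
    if not patterns or any(p == "" for p in patterns):
        raise ValueError("pattern must not be empty")
    if columns is not None:
        missing = [c for c in columns if c not in available_columns]
        if missing:
            raise ValueError(f"unknown columns: {missing}")
    if regex:
        for p in patterns:
            try:
                re.compile(p)  # approximate check with Python's regex dialect
            except re.error as exc:
                raise ValueError(f"invalid regex {p!r}: {exc}") from exc

# Passes silently: the column exists and the pattern is a non-empty literal.
validate_search_args(["name", "email"], "alice", columns=["name"])
```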
Performance Characteristics
- O(n × m) where n = rows, m = columns searched
- Column filtering reduces m significantly
- Lazy backends (polars LazyFrame, daft) defer execution until the result is collected
- String operations are optimized per backend
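The O(n × m) estimate translates into a simple count of string checks, which shows why restricting columns helps. The function below is a hypothetical back-of-the-envelope cost model, not a measurement of nwgrep.

```python
def cells_scanned(n_rows, n_columns_total, columns=None):
    """Hypothetical cost model: one string check per row per searched column."""
    m = n_columns_total if columns is None else len(columns)
    return n_rows * m

all_columns = cells_scanned(1_000_000, 20)                      # every column searched
two_columns = cells_scanned(1_000_000, 20, columns=["a", "b"])  # m reduced from 20 to 2
```

In this model, searching 2 of 20 columns does a tenth of the work; actual timings depend on the backend and the data.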
Best Practices:
- Use the columns parameter when possible
- Use lazy frames for large data
- Compile complex regex patterns outside if reusing
- Consider backend strengths (polars for large data, pandas for small)
See Also
- Usage Guide - Examples and patterns
- CLI Reference - Command-line interface
- Narwhals Documentation - Backend abstraction