nwgrep
Grep your dataframes
Search and filter dataframes with grep-like patterns. Works with pandas, polars, and any backend supported by Narwhals.
# Find what you're looking for
df.grep("active") # Simple search
df.grep("@gmail.com") # Find patterns
df.grep(r"^\d{3}-\d{4}$") # Regex support
What is nwgrep?
nwgrep brings the familiar power of grep to dataframes. Search across columns, filter by patterns, use regex - all with a simple, intuitive interface that works seamlessly with any dataframe library thanks to Narwhals.
Why nwgrep?
- Familiar Interface - If you know grep, you know nwgrep. Same flags (-i, -v, -E), same intuition.
- Backend Agnostic - Write once, run anywhere. Switch from pandas to polars without changing your code.
- Simple to Use - Three ways to use it: function call, pipe method, or accessor. Choose what feels natural.
- Lightning Fast - Lazy evaluation with polars/daft. Process multi-GB files efficiently.
- Type Safe - Full type hints. Catch errors before runtime with ty.
Quick Start
Install with your preferred backend:
Search your data:
import pandas as pd
from nwgrep import nwgrep
df = pd.DataFrame({
"name": ["Alice", "Bob", "Eve"],
"status": ["active", "locked", "active"],
})
# Find all rows containing "active"
result = nwgrep(df, "active")
That's it. No complex queries, no backend-specific syntax.
Three Ways to Use
Choose the style that fits your workflow:
Function call: simple and explicit.
Best for: Simple scripts, one-off searches, maximum clarity.
Pipe method: functional style for data pipelines.
result = (
df
.pipe(nwgrep, "active")
.pipe(nwgrep, "@example.com", columns=["email"])
.pipe(lambda x: x.sort_values('name'))
)
Best for: Data pipelines, method chaining, functional programming.
Powerful Search Options
All the grep features you know and love:
# Case-insensitive search
df.grep("ACTIVE", case_sensitive=False)
# Invert match (like grep -v)
df.grep("test", invert=True)
# Regex patterns
df.grep(r".*@example\.com", regex=True)
# Multiple patterns (OR logic)
df.grep(["Alice", "Bob"])
# Whole word matching
df.grep("active", whole_word=True)
# Column-specific search
df.grep("pattern", columns=["name", "email"])
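To make the semantics of these options concrete, here is a minimal plain-Python sketch over a list of dict rows. It is not nwgrep's implementation: `grep_rows` is a hypothetical helper written for illustration, and it assumes that literal patterns are matched as substrings, that multiple patterns combine with OR, and that matching follows Python's `re` module. The real library delegates matching to the underlying backend, so edge cases may differ.

```python
import re
from typing import Iterable, List, Mapping, Optional, Sequence, Union


def grep_rows(
    rows: Iterable[Mapping[str, object]],
    patterns: Union[str, Sequence[str]],
    *,
    columns: Optional[Sequence[str]] = None,
    case_sensitive: bool = True,
    regex: bool = False,
    whole_word: bool = False,
    invert: bool = False,
) -> List[Mapping[str, object]]:
    """Keep rows where any searched cell matches any pattern (illustrative only)."""
    if isinstance(patterns, str):
        patterns = [patterns]
    # Literal patterns are escaped so characters like "." match themselves.
    parts = [p if regex else re.escape(p) for p in patterns]
    if whole_word:
        # Require a word boundary on both sides of the pattern.
        parts = [rf"\b(?:{p})\b" for p in parts]
    flags = 0 if case_sensitive else re.IGNORECASE
    # Multiple patterns are joined with "|", which gives OR logic.
    matcher = re.compile("|".join(f"(?:{p})" for p in parts), flags)

    kept = []
    for row in rows:
        cells = [row[c] for c in columns] if columns is not None else list(row.values())
        hit = any(matcher.search(str(cell)) for cell in cells)
        # invert=True keeps the rows that did NOT match, like grep -v.
        if hit != invert:
            kept.append(row)
    return kept


rows = [
    {"name": "Alice", "status": "active"},
    {"name": "Bob", "status": "locked"},
    {"name": "Eve", "status": "inactive"},
]

substring_hits = grep_rows(rows, "active")                 # Alice and Eve
whole_word_hits = grep_rows(rows, "active", whole_word=True)  # Alice only
inverted = grep_rows(rows, "active", invert=True)          # Bob only
```

Under these assumptions a plain search for "active" also matches "inactive", because the match is a substring match; whole-word matching is what separates the two.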
Backend Support
Works seamlessly with any dataframe library:
| Backend | Notes |
|---|---|
| pandas | Full support |
| polars | DataFrame and LazyFrame |
| pyarrow | Table support |
| dask | Distributed dataframes |
| daft | Lazy evaluation |
| cuDF | GPU acceleration |
| modin | Parallel pandas |
Same code, any backend. Switch freely without rewriting your filters.
Real-World Examples
Find Active Users
Email Domain Search
Log Analysis
Data Quality Checks
# Find rows without email addresses
missing_email = df.grep(r"\w+@\w+\.\w+", regex=True, invert=True)
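Before relying on a pattern like this for data quality checks, it can help to try it against a few sample strings. The sketch below uses only Python's standard `re` module and assumes the backend's regex engine treats this pattern the same way as `re.search`, which may not hold for every backend. The sample values are made up for illustration.

```python
import re

# The same email-like pattern used in the check above.
EMAIL_LIKE = re.compile(r"\w+@\w+\.\w+")

samples = [
    "alice@example.com",
    "first.last@example.com",
    "user@sub-domain.com",
    "n/a",
    "",
]

# Values with no email-like substring; these would be reported as missing.
flagged = [s for s in samples if EMAIL_LIKE.search(s) is None]
print(flagged)  # ['user@sub-domain.com', 'n/a', '']
```

Note that `user@sub-domain.com` is flagged even though it looks like a valid address: `\w` does not match a hyphen, so the pattern cannot get from `@` to the dot. The pattern is a heuristic, and a stricter or looser one may suit your data better.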
Pipeline Filtering
result = (
df
.grep("active", columns=["status"]) # Active users
.grep("@company.com", columns=["email"]) # Company emails
.grep("admin", invert=True) # Exclude admins
)
Why Narwhals?
Narwhals provides a unified API across dataframe libraries. This means:
- Write once, run anywhere - Same code for pandas, polars, or any backend
- No vendor lock-in - Switch backends without rewriting code
- Automatic optimization - Each backend uses its strengths
- Future-proof - Support for new backends as they emerge
nwgrep is a certified Narwhals plugin, enabling truly backend-agnostic filtering.
Next Steps
- Installation
Get nwgrep installed with your preferred backends
- Usage
Learn all the ways to search and filter your data
- API Reference
Complete function and parameter documentation
- CLI Reference
Command-line interface for binary formats
Credit
Built using Narwhals for dataframe abstraction.
Special thanks to Claude and Gemini for their assistance in developing this project.