Scripting Luau Python
Tier: Advanced
Commands covered: luau, py, foreach, template
Per-command flag reference lives in /docs/help/. This page is the workflow layer: when to reach for each command and how they compose.
When SQL, joins, and apply aren't enough, drop into a scripting language. qsv ships two embedded DSLs plus a shell-out command and a templating engine:
- luau: qsv's flagship DSL, a Luau 0.720 interpreter with BEGIN/MAIN/END blocks, random-access mode, lookup-table helpers, and ~50 qsv-specific helper functions. Available in feature-capable builds.
- py: Python expressions per row, including f-strings. Requires Python 3.10+ and the qsvpy311/312/313 binary variant.
- foreach: shell out a command per row. The escape hatch when neither DSL fits.
- template: render CSV rows with MiniJinja into any text shape (HTML, Markdown, SQL, prompts, …).
For deep-dives: docs/INTERPRETERS.md plus the legacy Luau Development and Luau Helper Functions Examples wiki pages.
| If you want to… | Use | Notes |
|---|---|---|
| Per-row computation with lookup tables, BEGIN/MAIN/END, random access | luau | qsv's DSL; fast; works in feature_capable and qsvdp builds |
| Per-row computation in Python with f-strings and stdlib | py | Slower than Luau (no JIT); requires qsvpy* variant |
| Run any shell command once per row | foreach | The footgun: validate inputs! |
| Render rows into per-row files or a combined report | template | MiniJinja; supports register_lookup |
qsv's flagship DSL. The Luau interpreter is fast (several million rows/sec on simple expressions) and has two subcommands:
- map: create new computed columns
- filter: keep rows where the expression is truthy
Two modes:
- Sequential mode: the script runs once per row, and rows are processed in order.
- Random access mode: with --remap or a LASTROW reference, qsv builds (or reuses) an index so the script can read any row by index.
Three execution blocks (separated by --begin / --end files or inline directives):
- BEGIN: runs once before any row; initializes aggregator variables and loads lookup tables.
- MAIN: runs per row; the body of map/filter.
- END: runs once after all rows; emits final aggregates and reports.
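As a mental model only (plain Python, not how qsv implements it), the three blocks behave like a setup step, a per-row function, and a teardown step wrapped around a CSV reader. The sketch below mirrors a running total: state created in BEGIN persists across MAIN calls and is summarized in END. The function name `run_blocks`, the `amount` column, and the sample data are hypothetical.

```python
import csv
import io


def run_blocks(csv_text, new_column):
    """Model BEGIN / MAIN / END over CSV text; returns (rows, summary)."""
    reader = csv.DictReader(io.StringIO(csv_text))

    # BEGIN: runs once before any row; initialise aggregator state.
    total = 0.0
    count = 0

    out_rows = []
    for row in reader:
        # MAIN: runs once per row; the computed value fills the new column.
        total += float(row["amount"])
        count += 1
        row[new_column] = total
        out_rows.append(row)

    # END: runs once after all rows; emit a final aggregate.
    summary = f"Processed {count} rows, total {total:.2f}"
    return out_rows, summary


# Hypothetical sample input.
sample = "id,amount\n1,10.5\n2,4.5\n3,-5\n"
rows, summary = run_blocks(sample, "running_total")
for r in rows:
    print(r["id"], r["running_total"])
print(summary)
```

The point of the sketch is the lifetime of `total`: it is created once, updated on every row, and read one last time after the loop, which is the division of labor between BEGIN, MAIN, and END.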
qsv adds ~50 helpers: qsv_register_lookup, qsv_log, qsv_writefile, qsv_loadcsv, qsv_cumulative_avg, qsv_lag, qsv_format_float, qsv_breakeven, and many more. See the Luau Helper Functions Examples wiki page for the full catalog.
qsv luau map running_total \
"tot = (tot or 0) + tonumber(amount); return tot" \
transactions.csv > with_running_total.csv
cat <<'LUA' > classify.lua
BEGIN {
    qsv_register_lookup("agencies", "dathere://nyc-agencies.csv")
}!
local agency = agencies[Agency] or {}
local borough = Borough
local cat = "Other"
if string.find(_['Complaint Type'], "Noise") then
    cat = "Noise"
elseif string.find(_['Complaint Type'], "HEATING") then
    cat = "Heating"
end
END {
    qsv_log("info", "Processed " .. _IDX .. " rows")
}!
return cat
LUA
qsv luau map category -x -f classify.lua \
NYC_311_SR_2010-2020-sample-1M.csv > nyc311_classified.csv
qsv luau map prior_balance \
"if _IDX == 1 then return 0 end; return col[_IDX - 1]['balance']" \
--remap balance \
transactions.csv > with_prior.csv
Example: a date-quarter helper script (used in Recipe: Date Enrichment)
The qsv repo ships ready-to-use Luau scripts in docs/cookbook/lua/ — getquarter.lua, turnaroundtime.lua, and others.
curl -LO https://raw.githubusercontent.com/dathere/qsv/master/docs/cookbook/lua/getquarter.lua
qsv luau map Quarter -x -f getquarter.lua NYC_311_SR_2010-2020-sample-1M.csv > with_quarter.csv
qsv luau filter "tonumber(amount) > 0 and Country == 'US'" data.csv > positive_us.csv
See also: /docs/help/luau.md, docs/INTERPRETERS.md, Luau Development, Luau Helper Functions Examples, qsv-lookup-tables, qsv-recipes, Lookup Tables.
Python expression per row. Requires Python 3.10+ (matching the qsvpy311/312/313 binary you installed). The Python interpreter is dynamically linked — there's a 3-50× overhead vs Luau due to the GIL and call-per-row Python frame setup, but it's worth it when you need pandas-style expressions or a library only available in Python.
Column values are accessible four ways: as a local variable (spaces replaced with _), as col.colname, as col["colname"], or as col[N].
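To make the access styles concrete, here is a small stand-in written in plain Python. It is not qsv's `col` object; the class name `Col`, the helper `local_names`, and the 0-based meaning of `col[N]` are assumptions made for illustration, and the sample headers are hypothetical.

```python
class Col:
    """Hypothetical stand-in for `col`: attribute, key, and index access."""

    def __init__(self, headers, values):
        self._headers = list(headers)
        self._values = list(values)

    def __getattr__(self, name):
        # Called only when normal lookup fails, e.g. col.tier
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._values[self._headers.index(name)]
        except ValueError:
            raise AttributeError(name) from None

    def __getitem__(self, key):
        # col[N] by position (0-based here, an assumption),
        # col["name"] by header text.
        if isinstance(key, int):
            return self._values[key]
        return self._values[self._headers.index(key)]


def local_names(headers):
    """Local-variable names: spaces in headers replaced with underscores."""
    return [h.replace(" ", "_") for h in headers]


headers = ["amount", "tier", "Complaint Type"]
col = Col(headers, ["1200", "gold", "Noise"])
print(col.tier, col["Complaint Type"], col[0])
print(local_names(headers))
```

Headers that contain spaces cannot be reached with attribute syntax, which is why the key form `col["Complaint Type"]` or the underscore-substituted local name is the practical choice for them.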
qsv py map balance_summary 'f"{int(amount):,.0f} ({col.tier})"' txns.csv
qsv py map normalized_phone \
'import re; re.sub(r"[^\d]", "", col["Phone"])[:10]' \
contacts.csv
(Note: py does support import statements; see docs/INTERPRETERS.md for caveats and --helper for shared imports.)
qsv py filter \
'datetime.datetime.fromisoformat(col["created_at"]).weekday() < 5' \
events.csv > weekday_events.csv
# helper.py
import re, datetime
def clean_phone(s):
    return re.sub(r"[^\d]", "", s)[:10]
qsv py map clean_phone 'clean_phone(col.Phone)' \
--helper helper.py contacts.csv > clean.csv
See also: /docs/help/py.md, docs/INTERPRETERS.md, luau (typically 3-50× faster), Binary Variants (pick the qsvpy* matching your Python).
Execute a shell command once per row. Powerful and dangerous — validate inputs aggressively.
{} in the command is replaced by the value of the named column. {var} references another column by name.
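Because each substituted value ends up on a command line, one way to act on "validate inputs" is to screen the column before foreach ever sees it. The sketch below is one possible pre-flight check, not part of qsv: it uses a strict allowlist for URL values such as a `pdf_url` column. The names `is_safe_url` and `split_safe_rows`, the allowed character set, and the sample rows are all assumptions.

```python
import csv
import io
import re
from urllib.parse import urlparse

# Allowlist: http(s) URLs made only of characters the shell treats literally.
# Deliberately strict; query strings containing ? or & are rejected too.
SAFE_URL = re.compile(r"https?://[A-Za-z0-9._/%:-]+")


def is_safe_url(value):
    """True only for http(s) URLs with a host and no shell metacharacters."""
    if not SAFE_URL.fullmatch(value):
        return False
    return bool(urlparse(value).netloc)


def split_safe_rows(csv_text, column):
    """Partition one column's values into (safe, rejected) lists."""
    reader = csv.DictReader(io.StringIO(csv_text))
    safe, rejected = [], []
    for row in reader:
        (safe if is_safe_url(row[column]) else rejected).append(row[column])
    return safe, rejected


# Hypothetical sample input with three hostile or unusable rows.
sample = (
    "pdf_url\n"
    "https://example.com/a.pdf\n"
    "http://example.com/x.pdf; rm -rf ~\n"
    "file:///etc/passwd\n"
    "https://example.com/$(whoami).pdf\n"
)
safe, rejected = split_safe_rows(sample, "pdf_url")
print(safe)
print(len(rejected), "rows rejected")
```

An allowlist is used instead of a list of forbidden characters because it fails closed: anything not explicitly permitted is rejected. Loosen the pattern only for characters you have confirmed are handled safely by the command you run.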
qsv foreach pdf_url 'curl -s {} | pdftotext - -' documents.csv > combined.txt
qsv foreach query --unify 'qsv search --year 2020 {}' queries.csv > all_results.csv
--unify keeps headers from the first invocation and skips them on subsequent invocations.
qsv foreach query -u -c from_query 'qsv search {}' queries.csv > tagged.csv
# 1. Split the giant CSV by Borough into one file per borough
qsv partition Borough nyc311_by_borough --filename '{}.csv' NYC_311.csv
# 2. For each borough file, run a heavy command in parallel
# foreach selects the column by name, so write a header row first
(echo filename; ls nyc311_by_borough/*.csv) > infile_list.csv
qsv foreach filename 'heavy-stats-script {}' infile_list.csv
Don't use foreach for things qsv already does in one shot (apply, luau, py, template, replace). Reach for it only when you genuinely need an external command.
See also: /docs/help/foreach.md, partition, split, template — for in-process per-row text rendering.
Render rows through a MiniJinja template. Either combine all rows into one stream or write one file per row.
Column names become variables (non-alphanumeric characters → _). qsv adds custom filters (format_float, human_count, pluralize, …) on top of MiniJinja's defaults plus minijinja-contrib.
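The renaming rule matters most when two headers collapse to the same variable name. The sketch below applies the rule as stated above in plain Python and flags such collisions. It assumes a one-to-one replacement of each non-alphanumeric ASCII character with `_`; how qsv treats non-ASCII letters or runs of punctuation is not covered here, and both function names are hypothetical.

```python
import re


def template_var_name(header):
    """Map a CSV header to a variable name: non-alphanumerics become '_'."""
    return re.sub(r"[^A-Za-z0-9]", "_", header)


def find_collisions(headers):
    """Return {variable_name: [headers...]} for names shared by 2+ headers."""
    seen = {}
    for h in headers:
        seen.setdefault(template_var_name(h), []).append(h)
    return {name: hs for name, hs in seen.items() if len(hs) > 1}


print(template_var_name("Complaint Type"))
print(find_collisions(["a b", "a_b", "c"]))
```

Running a check like this over a file's header row before writing a template shows which variable names to use and whether any column needs renaming first.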
{# report.j2 #}
# {{ Borough }} — {{ year_month }}
- Total complaints: **{{ complaints | human_count }}**
- Most common type: {{ top_complaint }}
{% if complaints > 1000 -%}
⚠️ High volume — review prioritization.
{% endif %}
qsv template --template-file report.j2 \
reports_outdir/ \
--outfilename '{{ Borough }}_{{ year_month }}.md' \
borough_summary.csv
--outfilename itself is a Jinja template: each row's filename is computed from its column values.
{% set ok = register_lookup("us_states", "dathere://us-states-example.csv") -%}
Dear {{ first_name | title }} {{ last_name | title }},
You qualify for {{ us_states[us_state].program }} in {{ us_states[us_state].name }}.
qsv template --template-file rows.j2 - --output combined.html data.csv
SELECT * FROM events
WHERE {{ field }} {{ operator }} '{{ value }}';
qsv template --template-file rule.j2 - rules.csv | psql
See also: /docs/help/template.md, MiniJinja docs, fetchpost --payload-tpl (MiniJinja for HTTP POST bodies), Lookup Tables.
- Command Reference (index)
- Aggregation & Statistics → frequency --stats-filter: Luau expressions on the stats cache
- docs/INTERPRETERS.md: Luau and Python build details
- Luau Development: legacy wiki page (preserved)
- Luau Helper Functions Examples: legacy wiki page (preserved)
- qsv-lookup-tables: companion repo for lookup CSVs
- qsv-recipes: community Luau scripts
- Lookup Tables: dynamicEnum / register_lookup deep-dive
qsv — GitHub · Releases · Discussions · qsv pro · Try it online · Benchmarks · datHere · DeepWiki · Dual-licensed MIT / Unlicense