Scripting Luau Python

Joel Natividad edited this page May 13, 2026 · 2 revisions

Scripting (Luau / Python)

Tier: Advanced
Commands covered: luau, py, foreach, template

Per-command flag reference lives in /docs/help/. This page is the workflow layer — when to reach for each command and how they compose.

When SQL, joins, and apply aren't enough, drop into a scripting language. qsv ships two embedded DSLs plus a shell-out command and a templating engine:

  • luau: qsv's flagship DSL, an embedded Luau 0.720 interpreter with BEGIN/MAIN/END blocks, random-access mode, lookup-table helpers, and ~50 qsv-specific helper functions. Available in feature-capable builds.
  • py: Python expressions per row, including f-strings. Requires Python 3.10+ and a matching qsvpy311/312/313 binary variant.
  • foreach: shell out a command per row. The escape hatch when neither DSL fits.
  • template: render CSV rows through MiniJinja templates into any text shape (HTML, Markdown, SQL, prompts, …).

For deep-dives: docs/INTERPRETERS.md plus the legacy Luau Development and Luau Helper Functions Examples wiki pages.

Quick decision table

| If you want to… | Use | Notes |
|---|---|---|
| Per-row computation with lookup tables, BEGIN/MAIN/END, random access | luau | qsv's DSL; fast; works in feature_capable and qsvdp builds |
| Per-row computation in Python with f-strings and stdlib | py | Slower than Luau (no JIT); requires qsvpy* variant |
| Run any shell command once per row | foreach | The footgun: validate inputs! |
| Render rows into per-row files or a combined report | template | MiniJinja; supports register_lookup |

luau

qsv's flagship DSL. The Luau interpreter is fast (several million rows/sec on simple expressions), and the command has two subcommands:

  • map — create new computed columns
  • filter — keep rows where the expression is truthy

Two modes:

  • Sequential mode — script runs once per row; row state is processed in order.
  • Random access mode — with --remap or a LASTROW reference, qsv builds (or reuses) an index so the script can read any row by index.

Three execution blocks (separated by --begin / --end files or inline directives):

  • BEGIN — runs once before any row; init aggregator variables and load lookup tables.
  • MAIN — runs per-row; the body of map/filter.
  • END — runs once after all rows; emit final aggregates / reports.
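The three-phase flow can be sketched in plain Python. This is only an illustration of the execution order, not how qsv implements it; the sample data and the `run_three_phase` name are hypothetical.

```python
import csv
import io

# Hypothetical sample data standing in for a transactions file.
SAMPLE = "id,amount\n1,10.5\n2,4.5\n3,20\n"


def run_three_phase(text):
    # BEGIN: runs once, before any row is read.
    total = 0.0
    out_rows = []

    # MAIN: runs once per row, in file order.
    for row in csv.DictReader(io.StringIO(text)):
        total += float(row["amount"])
        out_rows.append({**row, "running_total": total})

    # END: runs once, after the last row.
    summary = f"{len(out_rows)} rows, total {total}"
    return out_rows, summary


rows, summary = run_three_phase(SAMPLE)
print(summary)
```

State created in the BEGIN phase (here `total`) stays alive across every MAIN iteration, which is what makes running aggregates possible.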

Helper functions

qsv adds ~50 helpers: qsv_register_lookup, qsv_log, qsv_writefile, qsv_loadcsv, qsv_cumulative_avg, qsv_lag, qsv_format_float, qsv_breakeven, and many more. See the Luau Helper Functions Examples wiki page for the full catalog.

Example: cumulative running total

qsv luau map running_total \
  "tot = (tot or 0) + tonumber(amount); return tot" \
  transactions.csv > with_running_total.csv

Example: classify with branching and a lookup table

cat <<'LUA' > classify.lua
BEGIN {
  qsv_register_lookup("agencies", "dathere://nyc-agencies.csv")
}!

-- MAIN: runs once per row and must return the new column's value
local agency = agencies[Agency] or {}
local borough = Borough
local cat = "Other"
if string.find(col['Complaint Type'], "Noise") then
  cat = "Noise"
elseif string.find(col['Complaint Type'], "HEATING") then
  cat = "Heating"
end

return cat

END {
  qsv_log("info", "Processed " .. _IDX .. " rows")
}!
LUA

qsv luau map category -x -f classify.lua \
  NYC_311_SR_2010-2020-sample-1M.csv > nyc311_classified.csv
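For readers more at home in Python, the same lookup-and-branch logic can be sketched as follows. It assumes the lookup table is keyed on its first column; the lookup contents, column names, and function names here are hypothetical stand-ins, not the real agencies file or qsv's API.

```python
import csv
import io

# Hypothetical lookup contents; the real agencies file will differ.
LOOKUP_CSV = (
    "code,name\n"
    "NYPD,Police Department\n"
    "DOT,Department of Transportation\n"
)


def load_lookup(text):
    """Index lookup rows by their first column (an assumption of this sketch)."""
    reader = csv.DictReader(io.StringIO(text))
    key_field = reader.fieldnames[0]
    return {row[key_field]: row for row in reader}


def classify(complaint_type):
    """Mirror the substring checks in the Luau script above."""
    if "Noise" in complaint_type:
        return "Noise"
    if "HEATING" in complaint_type:
        return "Heating"
    return "Other"


agencies = load_lookup(LOOKUP_CSV)
agency = agencies.get("DOT", {})  # same role as `agencies[Agency] or {}`
print(agency.get("name"), classify("Noise - Residential"))
```

Falling back to an empty mapping for unknown keys keeps the per-row code free of missing-key errors, which is the purpose of the `or {}` idiom in the Luau version.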

Example: random-access "find prior row's value" via LASTROW

qsv luau map prior_balance \
  "if _IDX == 1 then return 0 end; return col[_IDX - 1]['balance']" \
  --remap balance \
  transactions.csv > with_prior.csv
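The "prior row" idea can be expressed two ways, sketched here in plain Python with hypothetical data: by position (the random-access style) or by carrying the previous value forward (the sequential style). Neither function is qsv code.

```python
def prior_by_index(values, default=0):
    # Random-access style: look one row back by position.
    return [default if i == 0 else values[i - 1] for i in range(len(values))]


def prior_streaming(values, default=0):
    # Sequential style: remember the previous value while reading in order.
    prev = default
    for value in values:
        yield prev
        prev = value


balances = ["100", "250", "175"]  # hypothetical balance column
print(prior_by_index(balances))
print(list(prior_streaming(balances)))
```

Both produce the same column. The streaming form needs no index, so a one-row look-back is usually cheaper in sequential mode; random access pays off when the script must jump to arbitrary rows.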

Example: a date-quarter helper script (used in Recipe: Date Enrichment)

The qsv repo ships ready-to-use Luau scripts in docs/cookbook/lua/getquarter.lua, turnaroundtime.lua, and others.

curl -LO https://raw.githubusercontent.com/dathere/qsv/master/docs/cookbook/lua/getquarter.lua
qsv luau map Quarter -x -f getquarter.lua NYC_311_SR_2010-2020-sample-1M.csv > with_quarter.csv

Example: filter — only keep rows with positive amounts and a US country

qsv luau filter "tonumber(amount) > 0 and Country == 'US'" data.csv > positive_us.csv

See also: /docs/help/luau.md, docs/INTERPRETERS.md, Luau Development, Luau Helper Functions Examples, qsv-lookup-tables, qsv-recipes, Lookup Tables.

py

Python expression per row. Requires Python 3.10+ (matching the qsvpy311/312/313 binary you installed). The Python interpreter is dynamically linked — there's a 3-50× overhead vs Luau due to the GIL and call-per-row Python frame setup, but it's worth it when you need pandas-style expressions or a library only available in Python.

Column values are accessible four ways: as a local variable (spaces replaced with _), as col.colname, as col["colname"], or as col[N].
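To make the access styles concrete, here is a small stand-in for the `col` object written in plain Python. The `Row` class and `local_name` helper are hypothetical and only mimic the behavior described above; the sketch uses Python's 0-based positions, so check /docs/help/py.md for the convention qsv itself uses for col[N].

```python
class Row:
    """Hypothetical stand-in for `col`; not qsv's implementation."""

    def __init__(self, headers, values):
        self._headers = list(headers)
        self._values = list(values)

    def __getitem__(self, key):
        # col[N]: positional access; col["colname"]: access by header.
        if isinstance(key, int):
            return self._values[key]
        return self._values[self._headers.index(key)]

    def __getattr__(self, name):
        # col.colname: only reached when normal attribute lookup fails.
        try:
            return self[name]
        except ValueError:
            raise AttributeError(name) from None


def local_name(header):
    # Local-variable form: spaces in the header become underscores.
    return header.replace(" ", "_")


col = Row(["Phone", "first name"], ["555-0100", "Ann"])
print(col.Phone, col["first name"], col[0], local_name("first name"))
```

Attribute access only works for headers that are valid Python identifiers, which is why the bracket form is the safe choice for headers containing spaces or punctuation.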

Example: integer arithmetic with f-strings

qsv py map balance_summary 'f"{int(amount):,.0f} ({col.tier})"' txns.csv

Example: regex normalization in Python (clearer than the equivalent replace chain)

qsv py map normalized_phone \
  'import re; re.sub(r"[^\d]", "", col["Phone"])[:10]' \
  contacts.csv

(Note: py does support import statements — see docs/INTERPRETERS.md for caveats and --helper for shared imports.)
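Before running an expression over a large file, it can help to try it on a few values in ordinary Python. The sketch below wraps the regex from the example above in a hypothetical `normalize_phone` function; the sample numbers are made up.

```python
import re


def normalize_phone(raw):
    # Same expression as the py map example: drop non-digits, keep the first 10.
    return re.sub(r"[^\d]", "", raw)[:10]


samples = ["(212) 555-0100", "212.555.0100 x42", "+1 212 555 0100"]
for sample in samples:
    print(repr(sample), "->", normalize_phone(sample))
```

Note that truncating to the first ten digits keeps a leading country code and drops the final digit of the number, so inputs like the third sample need the prefix stripped first.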

Example: filter rows where Polars-style logic doesn't fit

qsv py filter \
  'datetime.fromisoformat(col["created_at"]).weekday() < 5' \
  events.csv > weekday_events.csv
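The weekday test is easy to check in isolation. This is a standalone sketch with a hypothetical `is_weekday` function; in plain Python the `datetime` class has to be imported from the `datetime` module, and treating unparseable cells as non-matching is a choice made here, not documented qsv behavior.

```python
from datetime import datetime


def is_weekday(created_at):
    """True for Monday through Friday; False for weekends or unparseable input."""
    try:
        # weekday() returns 0 for Monday through 6 for Sunday.
        return datetime.fromisoformat(created_at).weekday() < 5
    except ValueError:
        return False


for stamp in ["2024-03-04T09:30:00", "2024-03-02", ""]:
    print(repr(stamp), is_weekday(stamp))
```

`fromisoformat` only accepts a trailing `Z` from Python 3.11 onward, so timestamps in that form need normalizing on older interpreters.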

Example: shared helper file with imports

# helper.py
import re, datetime

def clean_phone(s):
    return re.sub(r"[^\d]", "", s)[:10]

# Functions loaded with --helper are called through the qsv_uh module
qsv py map clean_phone 'qsv_uh.clean_phone(col.Phone)' \
  --helper helper.py contacts.csv > clean.csv

See also: /docs/help/py.md, docs/INTERPRETERS.md, luau — typically 3-50× faster, Binary Variants — pick the qsvpy* matching your Python.

foreach

Execute a shell command once per row. Powerful and dangerous — validate inputs aggressively.

{} in the command is replaced by the value of the named column. {var} references another column by name.
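One way to "validate aggressively" is a pre-flight pass over the column before handing the file to foreach. The sketch below is a hypothetical check, not part of qsv: the allowlist pattern and both function names are assumptions, and `preview_command` only illustrates what a substituted command would look like with the value shell-quoted.

```python
import re
import shlex

# Hypothetical allowlist: https URLs made of a conservative character set.
URL_PATTERN = re.compile(r"https://[A-Za-z0-9.-]+/[A-Za-z0-9._/-]*")


def is_safe_url(value):
    # fullmatch rejects anything with trailing shell metacharacters.
    return URL_PATTERN.fullmatch(value) is not None


def preview_command(template, value):
    # Show the command with the value quoted for a POSIX shell.
    return template.replace("{}", shlex.quote(value))


for url in ["https://example.com/a.pdf", "https://example.com/a.pdf; rm -rf ~"]:
    print(is_safe_url(url), preview_command("curl -s {}", url))
```

Rows that fail the check can be filtered out or reviewed by hand before any command runs against them.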

Example: per-row shell pipeline (download then OCR)

qsv foreach pdf_url 'curl -s {} | pdftotext - -' documents.csv > combined.txt

Example: unify subcommand outputs into one CSV

qsv foreach query --unify 'qsv search --year 2020 {}' queries.csv > all_results.csv

--unify keeps headers from the first invocation and skips them on subsequent invocations.
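The header handling can be pictured with a few lines of Python. This is a sketch of the described behavior under the assumption that every invocation emits the same header; `unify` is a hypothetical function, not qsv's code.

```python
def unify(outputs):
    """Concatenate CSV texts, keeping only the first header line."""
    merged = []
    header_seen = False
    for text in outputs:
        lines = text.splitlines()
        if not lines:
            continue  # a command that printed nothing contributes nothing
        merged.extend(lines[1:] if header_seen else lines)
        header_seen = True
    return "\n".join(merged) + "\n" if merged else ""


first = "query,hits\nnoise,12\n"
second = "query,hits\nheating,7\n"
print(unify([first, second]), end="")
```

If the per-row commands return different columns, dropping later headers silently misaligns the data, so unified output only makes sense when the command's output shape is fixed.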

Example: add a "source query" column to the unified output

qsv foreach query -u -c from_query 'qsv search {}' queries.csv > tagged.csv

Example: split + parallel processing pattern

# 1. Split the giant CSV by Borough into one file per borough
qsv partition Borough nyc311_by_borough --filename '{}.csv' NYC_311.csv

# 2. For each borough file, run a heavy command in parallel
#    (foreach needs a header row, so write a "filename" header before the list)
(echo filename; ls nyc311_by_borough/*.csv) > infile_list.csv
qsv foreach filename 'heavy-stats-script {}' infile_list.csv

Don't use foreach for things qsv already does in one shot — apply, luau, py, template, replace. Reach for it only when you genuinely need an external command.

See also: /docs/help/foreach.md, partition, split, template — for in-process per-row text rendering.

template

Render rows through a MiniJinja template. Either combine all rows into one stream or write one file per row.

Column names become variables (non-alphanumeric characters → _). qsv adds custom filters (format_float, human_count, pluralize, …) on top of MiniJinja's defaults plus minijinja-contrib.
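The renaming rule can be approximated in Python to predict which variable name a header will get. This is a sketch of the rule as stated above; qsv's exact handling (for example of leading digits) may differ, and `to_variable_name` is a hypothetical helper.

```python
import re


def to_variable_name(header):
    # Replace every character that is not a letter or digit with "_".
    return re.sub(r"[^A-Za-z0-9]", "_", header)


for header in ["Borough", "Complaint Type", "year-month", "top.complaint (2020)"]:
    print(repr(header), "->", to_variable_name(header))
```

Two headers that differ only in punctuation collapse to the same name under this rule, so renaming columns up front is the safer route when that can happen.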

Example: render a per-borough markdown summary

{# report.j2 #}
# {{ Borough }} — {{ year_month }}

- Total complaints: **{{ complaints | human_count }}**
- Most common type: {{ top_complaint }}

{% if complaints > 1000 -%}
  ⚠️ High volume — review prioritization.
{% endif %}

qsv template --template-file report.j2 \
  --outfilename '{{ Borough }}_{{ year_month }}.md' \
  borough_summary.csv reports_outdir/

--outfilename itself is a Jinja template — each row's filename is computed from its column values.

Example: render lookup-table-enriched form letters

{% set ok = register_lookup("us_states", "dathere://us-states-example.csv") -%}
Dear {{ first_name | title }} {{ last_name | title }},
You qualify for {{ us_states[us_state].program }} in {{ us_states[us_state].name }}.

Example: render all rows into one combined file

qsv template --template-file rows.j2 data.csv --output combined.html

Example: dynamically generate SQL from CSV-driven rules

SELECT * FROM events
WHERE {{ field }} {{ operator }} '{{ value }}';

qsv template --template-file rule.j2 rules.csv | psql

See also: /docs/help/template.md, MiniJinja docs, fetchpost --payload-tpl — MiniJinja for HTTP POST bodies, Lookup Tables.
