Pakistan Taxpayer Data
Between 2013 and 2018, the Pakistan Federal Board of Revenue (FBR) published a directory of all taxpayers. Naturally, these are shared as PDFs.
To save the trouble for anyone trying to see this data in the future, I’ve shared the extracted and compressed parquet files in the GitHub repo.
Some cursed knowledge I have learnt:
- NTNs sometimes have an 8th digit (data for 2013-2014), but it is just a check digit. The `all.parquet` file contains a column `ntn7` to help with grouping
- Hyderabad Development Authority’s 8th digit was 0 in 2015 but 1 in 2016
- shell scripts are very fast
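
Here is a minimal sketch of why grouping on a 7-digit key helps. It assumes that `ntn7` is simply the first seven digits of the NTN and that the check digit, when present, is the last one; the repo may derive the column differently. The helper names `to_ntn7` and `group_by_ntn7` and the sample NTNs are made up for illustration.

```python
import re
from collections import defaultdict


def to_ntn7(ntn: str) -> str:
    """Return the 7-digit base of an NTN, dropping an 8th check digit if present.

    Assumption: the check digit is the final (8th) digit.
    """
    digits = re.sub(r"\D", "", ntn)  # drop separators such as "-"
    if len(digits) not in (7, 8):
        raise ValueError(f"expected 7 or 8 digits, got {ntn!r}")
    return digits[:7]


def group_by_ntn7(rows):
    """Group (ntn, year) pairs by 7-digit base, keeping years in input order."""
    groups = defaultdict(list)
    for ntn, year in rows:
        groups[to_ntn7(ntn)].append(year)
    return dict(groups)


# Made-up NTNs, not real taxpayers: one entity whose 8th digit differs
# between two years, plus one NTN with no 8th digit at all.
rows = [("1234567-0", 2015), ("1234567-1", 2016), ("7654321", 2014)]
print(group_by_ntn7(rows))
# → {'1234567': [2015, 2016], '7654321': [2014]}
```

Grouping on the full 8-digit string would split the first entity into two, which is the kind of mismatch the Hyderabad Development Authority case shows.
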
Explore the top 1000 taxpayers across all years and within each category at the GitHub Pages link.
Query the data (or look up an NTN/CNIC) directly in your browser with DuckDB-WASM at this link.