How to Open & Browse a Parquet File: Step-by-Step Tutorial

By FinancialDataTools.com Team  ·  March 2026  ·  8 min read  ·  Last updated March 14, 2026

Steps

  1. Locate Your Parquet File
  2. Open the Parquet Viewer
  3. Load Your File
  4. Understand the Loading Process
  5. Browse Columns and Rows
  6. Sort and Filter Data
  7. Inspect Cell Values
  8. Inspect the Schema
  9. Export Your Data

This tutorial walks you through opening and exploring an Apache Parquet file using the free FinancialDataTools.com Parquet Viewer. The tool uses DuckDB-Wasm — the official WebAssembly build of the DuckDB analytical engine — to read Parquet natively in your browser. Nothing is sent to any server.

Step 1: Locate Your Parquet File

Find the .parquet file you want to inspect. Parquet is a widely used columnar storage format for analytical data and appears in many financial data engineering workflows.

The viewer supports Parquet format versions 1.0 and 2.x, files compressed with the Snappy, Gzip, LZ4, or Zstandard codecs, and uncompressed files.
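If you are unsure whether a file really is Parquet, a quick check is possible before loading it. The sketch below relies on the Parquet file layout, where a standard unencrypted file starts and ends with the 4-byte marker PAR1 and stores the footer length as a little-endian 32-bit integer just before the trailing marker. The function name parquet_footer_length is a hypothetical helper for illustration, not part of the viewer.

```python
import os
import struct

PARQUET_MAGIC = b"PAR1"  # marker at both ends of an unencrypted Parquet file


def parquet_footer_length(path):
    """Return the footer metadata length in bytes, or None if the file
    does not carry the Parquet marker at both ends."""
    size = os.path.getsize(path)
    # Smallest possible layout: marker (4) + footer length (4) + marker (4).
    if size < 12:
        return None
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-8, os.SEEK_END)  # last 8 bytes: footer length + marker
        tail = f.read(8)
    if head != PARQUET_MAGIC or tail[4:] != PARQUET_MAGIC:
        return None
    (footer_len,) = struct.unpack("<I", tail[:4])
    return footer_len
```

A None result usually means the file is a different format with a .parquet extension, or that it was truncated during a download.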

Step 2: Open the Parquet Viewer

Navigate to financialdatatools.com/viewers/parquet-viewer/ in any modern desktop browser. No login, account, or installation is required. The viewer works best on desktop.

Step 3: Load Your File

There are two ways to open your Parquet file:

Step 4: Understand the Loading Process

Unlike CSV or JSON, Parquet is a binary columnar format, so it cannot be opened as plain text and needs a Parquet-aware reader. The viewer uses DuckDB-Wasm, which initializes in a background Web Worker. Loading happens in three stages, shown in the status indicator:

  1. Initialising DuckDB-Wasm — the DuckDB engine loads into the browser (first load only; subsequent files use the cached engine)
  2. Reading Parquet metadata — DuckDB reads the Parquet file footer to get column names, types, and row count without loading any row data yet
  3. Loading rows — the first page of up to 5,000 rows is fetched

For most Parquet files this entire process completes within a few seconds. Files with many row groups or complex nested schemas may take slightly longer during the metadata step.

Once loaded, the stats bar shows the total row count (from COUNT(*)), visible rows, column count, and the engine label DuckDB-Wasm.
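The loading stages map naturally onto three DuckDB queries. The sketch below is an assumed equivalent written against the DuckDB Python package rather than DuckDB-Wasm; the viewer's actual queries are not published in this tutorial, and the names sql_literal, loading_queries, and load_first_page are hypothetical. The page size of 5,000 comes from the description above.

```python
def sql_literal(text):
    """Quote a string as a SQL literal by doubling embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def loading_queries(path, page=0, page_size=5000):
    """Build one query per stage that follows engine start-up."""
    source = f"read_parquet({sql_literal(path)})"
    offset = int(page) * int(page_size)
    return {
        # Column names and types.
        "schema": f"DESCRIBE SELECT * FROM {source}",
        # Total row count shown in the stats bar.
        "row_count": f"SELECT COUNT(*) FROM {source}",
        # One page of rows.
        "page": f"SELECT * FROM {source} LIMIT {int(page_size)} OFFSET {offset}",
    }


def load_first_page(path):
    """Run the three queries (assumes `pip install duckdb`)."""
    import duckdb  # imported lazily so the query builder works without it

    con = duckdb.connect()  # in-memory database
    queries = loading_queries(path)
    schema_rows = con.execute(queries["schema"]).fetchall()
    columns = [(row[0], row[1]) for row in schema_rows]  # (name, type)
    total_rows = con.execute(queries["row_count"]).fetchone()[0]
    rows = con.execute(queries["page"]).fetchall()
    return columns, total_rows, rows
```

Calling load_first_page("trades.parquet") on a local file (the file name is only an example) returns the column list, the total row count, and up to 5,000 rows.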

Step 5: Browse Columns and Rows

Your data appears in a spreadsheet-style grid. Each column header shows:

Numeric columns (INT, FLOAT) are right-aligned and shown in blue. Boolean values appear in purple as true/false. Date and timestamp values are shown in an ISO 8601-style format (YYYY-MM-DD for dates, YYYY-MM-DD HH:MM:SS for timestamps). Nested types (LIST, MAP, STRUCT) are shown as JSON strings.

Row numbers appear on the left side of the grid. For paginated files, row numbers reflect absolute positions across the entire file.
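The absolute row number is simple arithmetic on the page index. The sketch below assumes pages of 5,000 rows (the page size mentioned in Step 4) and row numbers that start at 1; neither the numbering base nor the function name absolute_row_number is confirmed by the viewer itself.

```python
def absolute_row_number(page_index, row_in_page, page_size=5000):
    """Map a zero-based page index and a zero-based position within that
    page to a one-based row number across the whole file."""
    if page_index < 0 or not 0 <= row_in_page < page_size:
        raise ValueError("page_index or row_in_page out of range")
    return page_index * page_size + row_in_page + 1
```

Under these assumptions the first row of the second page is absolute_row_number(1, 0), which evaluates to 5001.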

Step 6: Sort and Filter Data

Sorting: Click any column header to sort the current page ascending. Click again for descending; click a third time to restore original row order.
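The three-click cycle can be modeled as a small state machine over the rows of the current page. This is a minimal sketch, not the viewer's code: rows are assumed to be dictionaries, and placing missing values last in both directions is an assumption made here for illustration.

```python
# Sort state after each click: none -> ascending -> descending -> none.
SORT_CYCLE = {None: "asc", "asc": "desc", "desc": None}


def next_sort_state(state):
    """Return the sort state that follows a click on the column header."""
    return SORT_CYCLE[state]


def sort_page(rows, column, state):
    """Return the page rows in the requested order without mutating them."""
    if state is None:
        return list(rows)  # original row order
    present = [r for r in rows if r[column] is not None]
    missing = [r for r in rows if r[column] is None]
    present.sort(key=lambda r: r[column], reverse=(state == "desc"))
    return present + missing  # assumption: missing values go last
```

Because only the loaded page is sorted, the smallest value on screen is the smallest on that page, not necessarily the smallest in the file.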

Global search: Type in the search box in the toolbar to search across all visible columns simultaneously. Rows not containing the search term in any column are hidden.
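In effect, global search keeps a row if any visible column contains the term. The sketch below illustrates that rule on a list of dictionaries. Case-insensitive matching is an assumption, since the tutorial does not say how the viewer treats case, and search_rows is a hypothetical name.

```python
def search_rows(rows, term, visible_columns):
    """Keep rows where at least one visible column contains the term."""
    needle = term.casefold()
    if not needle:
        return list(rows)  # an empty search box hides nothing
    matches = []
    for row in rows:
        for col in visible_columns:
            value = row[col]
            if value is not None and needle in str(value).casefold():
                matches.append(row)
                break  # one matching column is enough
    return matches
```

Hidden columns are skipped on purpose, which mirrors the statement that the search covers visible columns.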

Column filters: Click the filter icon in any column header for column-specific filtering:

Column filters operate on the currently loaded page only. To filter a very large Parquet file across all rows, consider using the DuckDB Viewer, where you can load the file with DuckDB's read_parquet() function and apply SQL WHERE clauses. The filtering then runs inside the DuckDB engine over the whole file rather than on one page.
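The difference between page-level and full-file filtering is easy to see on synthetic data. The sketch below uses an in-memory list in place of a Parquet file and assumes the 5,000-row page size from Step 4; the column name flagged and the file name in the SQL string are made up for illustration.

```python
PAGE_SIZE = 5000


def page_filter(rows, page_index, predicate, page_size=PAGE_SIZE):
    """Filter only the rows that belong to one loaded page."""
    start = page_index * page_size
    return [r for r in rows[start:start + page_size] if predicate(r)]


def full_filter(rows, predicate):
    """Filter every row, as a SQL WHERE clause over the whole file would."""
    return [r for r in rows if predicate(r)]


# Synthetic stand-in for a 12,000-row file: every 4,000th row is flagged.
rows = [{"id": i, "flagged": i % 4000 == 0} for i in range(12000)]


def is_flagged(row):
    return row["flagged"]


on_first_page = page_filter(rows, 0, is_flagged)  # ids 0 and 4000
in_whole_file = full_filter(rows, is_flagged)     # ids 0, 4000 and 8000

# Hypothetical SQL counterpart of full_filter for a real file:
FULL_FILE_SQL = "SELECT * FROM read_parquet('trades.parquet') WHERE flagged"
```

The page filter misses the match at id 8000 because that row sits on the second page, which is the situation the SQL route avoids.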

Step 7: Inspect Cell Values

Click any cell to open the Cell Detail Panel on the right side of the viewer. This panel shows the row number, column name, full DuckDB type string (e.g., DECIMAL(18,6)), character length, and the full cell value without truncation.

For nested Parquet types (LIST, STRUCT, MAP), the cell value is a JSON string — the detail panel automatically pretty-prints it as formatted JSON so you can read complex nested values clearly. Use the Copy value button to copy the raw value to the clipboard.
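Pretty-printing a JSON string amounts to parsing it and serializing it again with indentation. The sketch below shows that round trip with Python's json module. Falling back to the raw text for values that are not JSON objects or arrays is an assumption about sensible behavior, not a description of the viewer's code, and pretty_cell_value is a hypothetical name.

```python
import json


def pretty_cell_value(raw):
    """Return indented JSON for nested values, or the raw value otherwise."""
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):
        return raw  # not JSON (or not a string): show the value as it is
    if not isinstance(parsed, (dict, list)):
        return raw  # plain numbers and strings need no reformatting
    return json.dumps(parsed, indent=2, ensure_ascii=False)
```

For a STRUCT rendered as {"bid":1.5,"ask":1.6}, the function returns the same keys and values spread over four lines with two-space indentation.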

Step 8: Inspect the Schema

Click the Schema button in the toolbar to open the column schema modal. For each column it shows the column name and full DuckDB type string, derived from a DESCRIBE query against the Parquet file.

This is particularly useful when you need to:

Use the Copy Schema button to copy the full column list as plain text.

Step 9: Export Your Data

Click the Export button in the toolbar to open the export dialog. Four formats are available:

Two export scopes let you control what gets exported:

Tip: Use Full file CSV export to quickly convert a Parquet file to CSV without writing any Python or using any command-line tools — all processing happens in your browser.
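One detail worth knowing about CSV output is quoting. Nested values are JSON strings that contain commas and double quotes, so they must be quoted to stay in a single CSV field. The sketch below shows the standard quoting rules using Python's csv module; it illustrates the format only and is not the viewer's export code. Writing missing values as empty fields is an assumption.

```python
import csv
import io


def rows_to_csv(columns, rows):
    """Serialize a header and rows to CSV text with minimal quoting."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        # Assumption: missing values become empty fields.
        writer.writerow(["" if value is None else value for value in row])
    return buffer.getvalue()
```

A cell holding {"a": 1} is written as "{""a"": 1}", with the embedded quotes doubled, which spreadsheet tools and CSV parsers read back as the original string.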
