How to Parse CSV in Node.js the Right Way (Streams vs Buffers)

Stop loading 500MB CSV files into memory. Learn how to use Newgate's stream-based parsing to handle massive datasets with constant memory usage.

By Newgate Team • December 15, 2025

Parsing CSV files is one of those tasks that seems easy until it crashes your production server.

If you've ever tried to fs.readFileSync() a 1GB CSV file, you know exactly what happens:

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory

In this guide, we'll look at why the "easy way" fails and how to process massive datasets efficiently using Streams, with examples in both native Node.js and Newgate.

The RAM Problem

The most common mistake developers make is loading the entire file into memory before processing it.

// ❌ The Memory Killer
const fs = require('fs');
const { parse } = require('csv-parse/sync');

// Reads the entire file into a single Buffer, then parses it in one shot
const data = fs.readFileSync('gigantic-user-list.csv');
const records = parse(data);
// Boom 💥 Node.js runs out of heap space

If your CSV is 1GB, Node.js needs at least 1GB of RAM to hold the buffer, plus significantly more to create the JavaScript objects.

The Solution: Streaming

Streaming lets you process the file chunk by chunk: you read a row, process it, and let it be garbage-collected. Whether the file is 1GB or 1TB, memory usage stays roughly constant, around 50MB.
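Here's what that looks like outside of HTTP, reading straight from disk. This is a minimal sketch using Node's fs.createReadStream and csv-parse's streaming API (the streaming counterpart of the sync call above); the file name and the columns: true option are just illustrative.

// ✅ Streaming from disk - memory stays bounded regardless of file size
const fs = require('fs');
const { parse } = require('csv-parse');

async function processFile(path) {
  // The parser is a Transform stream, so it can be consumed with for await
  const parser = fs.createReadStream(path).pipe(parse({ columns: true }));

  let count = 0;
  for await (const record of parser) {
    // Each 'record' is one row, keyed by the CSV header
    count++;
  }
  return count;
}

processFile('gigantic-user-list.csv').then((n) => console.log(`${n} rows`));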

The "Hard Way" (Express + csv-parser)

To do this in a standard Express app, you need to manually pipe the request stream:

// Express + csv-parser
const csv = require('csv-parser');

app.post('/upload', (req, res) => {
  req
    .pipe(csv())
    .on('data', (row) => {
      // Process one row at a time.
      // Fire-and-forget: nothing waits for this insert, so rows can
      // arrive faster than the database can absorb them.
      db.users.create(row);
    })
    .on('end', () => {
      res.send('Done');
    });
  // Note: no 'error' handler, so a malformed CSV leaves the request hanging.
});

This works, but good luck handling validation errors, pausing the stream for database backpressure, or dealing with mixed multipart uploads.
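To make the example above safe, you have to pause the parser while each insert is in flight and wire up the error handling yourself. Here's a rough sketch of what that looks like (db.users.create is the same placeholder as above, and even this version still has edge cases, e.g. 'end' firing while a write is pending):

// Manual backpressure with pause/resume
app.post('/upload', (req, res) => {
  const parser = req.pipe(csv());

  parser.on('data', async (row) => {
    parser.pause();                    // stop reading while the insert runs
    try {
      await db.users.create(row);
    } catch (err) {
      parser.destroy();                // abort the stream on a bad row
      return res.status(400).json({ error: err.message });
    }
    parser.resume();                   // pull the next chunk
  });

  parser.on('error', (err) => res.status(400).json({ error: err.message }));
  parser.on('end', () => res.send('Done'));
});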

The "Newgate Way" (Automatic Streaming)

Newgate detects text/csv requests and exposes the body as an async iterator of parsed rows. It handles backpressure for you automatically.

// ✅ Newgate - Constant Memory Usage
app.post('/upload', async (req, res) => {
  // Newgate automatically creates a stream for CSV content types
  for await (const row of req.body) {
    // 'row' is a parsed object
    await db.users.create(row);
  }
  
  return res.json({ status: 'completed' });
});

That's it. No piping, no event listeners, no manual error handling. Newgate manages the stream internally, ensuring your application stays fast and lightweight.

Benchmarks

We tested processing a 500MB CSV file with 2 million rows.

Method                  Memory Usage    Time
fs.readFileSync         CRASH           N/A
Express + csv-parser    65MB            12.4s
Newgate                 42MB            11.8s
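If you want to reproduce the memory numbers yourself, one rough approach (not necessarily the exact method used for the table above) is to sample the process's resident set size while the upload handler runs:

// Sample peak RSS while one of the upload handlers processes the file
let peakRss = 0;
const sampler = setInterval(() => {
  peakRss = Math.max(peakRss, process.memoryUsage().rss);
}, 100);

// ...run the upload against the 500MB CSV here...

// After the run:
clearInterval(sampler);
console.log(`Peak RSS: ${(peakRss / 1024 / 1024).toFixed(1)} MB`);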

Conclusion

Streams are powerful but painful to configure manually. By using a framework like Newgate that treats streams as first-class citizens, you get the performance benefits without the boilerplate complexity.

Read the Parsing Docs to learn more.