Parsing CSV files is one of those tasks that seems easy until it crashes your production server.
If you've ever tried to `fs.readFileSync()` a 1GB CSV file, you know exactly what happens:

```
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
```

In this guide, we'll look at why the "easy way" fails and how to process massive datasets efficiently using Streams, with examples in both native Node.js and Newgate.
## The RAM Problem
The most common mistake developers make is loading the entire file into memory before processing it.
```js
// ❌ The Memory Killer
const fs = require('fs');
const { parse } = require('csv-parse/sync');

const data = fs.readFileSync('gigantic-user-list.csv');
const records = parse(data);
// Boom 💥 Node.js runs out of heap space
```

If your CSV is 1GB, Node.js needs at least 1GB of RAM to hold the buffer, plus significantly more to create the JavaScript objects.
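You don't have to take that on faith. To watch where the memory goes, wrap the load with `process.memoryUsage()` (a quick diagnostic sketch; use a sample file small enough to survive the load):

```js
const fs = require('fs');
const { parse } = require('csv-parse/sync');

// RSS includes the raw Buffer as well as the JS objects built from it
const rssMB = () => Math.round(process.memoryUsage().rss / 1024 / 1024);

console.log(`RSS before: ${rssMB()} MB`);
const records = parse(fs.readFileSync('users.csv'));
console.log(`RSS after: ${rssMB()} MB for ${records.length} rows`);
```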
## The Solution: Streaming
Streaming allows you to process the file chunk by chunk. You read a line, process it, and discard it from memory. The file size can be 1TB, but your memory usage stays constant at ~50MB.
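Before reaching for a framework, it helps to see the idea in plain Node.js. Here is a minimal sketch using only the built-in `fs` and `readline` modules (the naive comma split is for illustration; quoted fields need a real CSV parser):

```js
const fs = require('fs');
const readline = require('readline');

async function processCsv(path) {
  const rl = readline.createInterface({
    input: fs.createReadStream(path),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });

  let header;
  for await (const line of rl) {
    if (!header) {
      header = line.split(','); // first line holds the column names
      continue;
    }
    const values = line.split(',');
    const row = Object.fromEntries(header.map((key, i) => [key, values[i]]));
    // Process the row, then let it be garbage-collected
    console.log(row);
  }
}

processCsv('gigantic-user-list.csv');
```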
The "Hard Way" (Express + csv-parser)
To do this in a standard Express app, you need to manually pipe the request stream:
```js
// Express + csv-parser
const express = require('express');
const csv = require('csv-parser');

const app = express();

app.post('/upload', (req, res) => {
  req
    .pipe(csv())
    .on('data', (row) => {
      // Process one row at a time
      db.users.create(row); // not awaited: rows can outpace the database
    })
    .on('end', () => {
      res.send('Done');
    });
});
```

This works, but good luck handling validation errors, pausing the stream for database backpressure, or mixed multipart uploads.
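To make that concrete, here is a sketch of what handling just the backpressure part looks like, assuming `db.users.create()` returns a Promise: pause the stream while each insert is in flight, resume afterwards, and wire up the error events yourself.

```js
app.post('/upload', (req, res) => {
  const stream = req.pipe(csv());

  stream
    .on('data', async (row) => {
      stream.pause(); // stop reading while the insert is in flight
      try {
        await db.users.create(row);
        stream.resume();
      } catch (err) {
        stream.destroy(err); // surface the failure via the 'error' event
      }
    })
    .on('error', (err) => res.status(400).send(err.message))
    .on('end', () => res.send('Done'));
});
```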
The "Newgate Way" (Automatic Streaming)
Newgate detects `text/csv` requests and exposes the body as an async iterator. It handles the backpressure for you automatically.
```js
// ✅ Newgate - Constant Memory Usage
app.post('/upload', async (req, res) => {
  // Newgate automatically creates a stream for CSV content types
  for await (const row of req.body) {
    // 'row' is a parsed object
    await db.users.create(row);
  }
  return res.json({ status: 'completed' });
});
```

That's it. No piping, no event listeners, no manual error handling. Newgate manages the stream internally, ensuring your application stays fast and lightweight.
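When you do want per-row error handling, say for rows that fail validation, it's ordinary control flow rather than event wiring. A sketch, assuming parse and database errors surface as thrown exceptions:

```js
app.post('/upload', async (req, res) => {
  let imported = 0;
  let skipped = 0;

  try {
    for await (const row of req.body) {
      try {
        await db.users.create(row);
        imported++;
      } catch {
        skipped++; // skip bad rows, keep streaming
      }
    }
  } catch (err) {
    // Stream-level failure: malformed CSV, client disconnect, etc.
    return res.status(400).json({ error: err.message });
  }

  return res.json({ status: 'completed', imported, skipped });
});
```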
## Benchmarks
We tested processing a 500MB CSV file with 2 million rows.
| Method | Memory Usage | Time |
|---|---|---|
| `fs.readFileSync` | CRASH | N/A |
| Express + csv-parser | 65MB | 12.4s |
| Newgate | 42MB | 11.8s |
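If you want to reproduce numbers like these, you first need a comparably large file. Here is a quick generator sketch (the column layout is hypothetical; note it respects the write stream's own backpressure via `drain`):

```js
const fs = require('fs');

const TOTAL = 2_000_000;
const out = fs.createWriteStream('gigantic-user-list.csv');
out.write('id,name,email\n');

let i = 0;
function writeRows() {
  let ok = true;
  while (i < TOTAL && ok) {
    i++;
    // write() returns false when the internal buffer is full
    ok = out.write(`${i},user${i},user${i}@example.com\n`);
  }
  if (i < TOTAL) {
    out.once('drain', writeRows); // resume once the buffer flushes
  } else {
    out.end();
  }
}
writeRows();
```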
## Conclusion
Streams are powerful but painful to configure manually. By using a framework like Newgate that treats streams as first-class citizens, you get the performance benefits without the boilerplate complexity.
Read the Parsing Docs to learn more.