Performance & Large Datasets

How ImportCSV handles large files with virtual scrolling and progressive validation

ImportCSV is built to handle large datasets efficiently, supporting 10,000+ rows without freezing your browser. This is achieved through virtual scrolling and progressive validation techniques.

Virtual Scrolling

When dealing with large CSV files, rendering thousands of DOM elements can cause severe performance issues. ImportCSV uses @tanstack/react-virtual to implement virtual scrolling, which only renders the visible rows plus a small buffer.

How It Works

// Instead of mounting a DOM node for every one of 10,000 rows:
<div>{rows.map(row => <Row key={row.index} data={row} />)}</div>         // Browser freezes

// Virtual scrolling mounts only the rows currently in view (plus a small buffer):
<div>{visibleRows.map(row => <Row key={row.index} data={row} />)}</div>  // ~20 rows, smooth scrolling

Benefits

  • Constant memory usage: Only ~20 rows in the DOM at any time
  • Smooth scrolling: 60fps even with 10,000+ rows
  • Fast initial render: No delay when opening large files
  • Responsive UI: No freezing or lag during interaction

Progressive Validation

ImportCSV validates data in two phases to provide instant feedback while processing large files:

Phase 1: Instant Validation (First 50 Rows)

// First 50 rows are validated immediately
const instantRows = data.slice(0, 50);
validateRows(instantRows); // < 100ms

Phase 2: Background Validation (Remaining Rows)

// Remaining rows are validated asynchronously
const remainingRows = data.slice(50);
requestIdleCallback(() => {
  validateRows(remainingRows); // Doesn't block UI
});
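
requestIdleCallback is not available in every browser (Safari has historically lacked it), so a guarded scheduler is a common workaround. A minimal sketch, assuming a setTimeout fallback rather than ImportCSV's actual scheduling code:

// Fall back to setTimeout where requestIdleCallback is unavailable,
// keeping background validation non-blocking in every supported browser
const scheduleIdle = (callback) =>
  typeof requestIdleCallback === 'function'
    ? requestIdleCallback(callback)
    : setTimeout(callback, 0);

scheduleIdle(() => validateRows(remainingRows)); // Same call as above, now portable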

User Experience

  1. Immediate feedback: Users see validation results for the first 50 rows instantly
  2. Non-blocking: UI remains responsive during validation
  3. Progress indication: "Validating..." status shows background processing
  4. Smart error display: Errors appear progressively as validation completes

Performance Benchmarks

Dataset Size  | Initial Render | Full Validation | Memory Usage
100 rows      | < 50ms         | < 100ms         | ~5MB
1,000 rows    | < 50ms         | < 500ms         | ~8MB
5,000 rows    | < 50ms         | ~2s             | ~15MB
10,000 rows   | < 50ms         | ~4s             | ~20MB

Tested on Chrome 120, MacBook Pro M1

Memory Optimization

Efficient Data Structures

// Data is stored in a flat array structure
const rows = [
  { index: 0, values: { name: "John", email: "john@example.com" } },
  { index: 1, values: { name: "Jane", email: "jane@example.com" } }
];
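
Looking up a cell is then a plain property access on the row's values object, with no nested structures to traverse:

// Cell access is a direct lookup on the flat row structure
const firstEmail = rows[0].values.email; // "john@example.com"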

Cleanup on Component Unmount

  • Automatic garbage collection of parsed data
  • Event listener cleanup
  • Virtual scroller disposal
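
These cleanups follow the standard React effect pattern. A minimal, illustrative sketch (not the library's actual hook), using a hypothetical useWindowListener helper:

import { useEffect } from 'react';

// Hypothetical helper: attaches a window listener on mount and removes it on
// unmount, so anything the handler closes over (such as parsed CSV rows) can
// be garbage collected once the importer is closed
function useWindowListener(event, handler) {
  useEffect(() => {
    window.addEventListener(event, handler);
    return () => window.removeEventListener(event, handler);
  }, [event, handler]);
}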

Configuration

For Large Datasets (1,000+ rows)

<CSVImporter
  columns={columns}
  onComplete={handleComplete}
  // Virtual scrolling is automatic for large files
/>

Optimization Tips

  1. Keep validators simple: Complex regex patterns slow down validation
  2. Use type-specific columns: Native type validation is faster than custom rules (see the sketch after this list)
  3. Minimize transformations: Apply only necessary transformations
  4. Consider server-side validation: For files > 10,000 rows
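
For example, the column definitions below lean on built-in types and a single required rule instead of heavy regex validators (the column options are the same ones used in the example later on this page):

// Built-in types ('email', 'number') and a simple 'required' rule keep
// per-cell validation cheap on large files
const columns = [
  { id: 'email', label: 'Email', type: 'email' },
  { id: 'amount', label: 'Amount', type: 'number' },
  { id: 'name', label: 'Name', validators: [{ type: 'required' }] }
];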

Browser Compatibility

Virtual scrolling and progressive validation work on all modern browsers:

  • Chrome 90+
  • Firefox 88+
  • Safari 14+
  • Edge 90+

Example: Handling 10,000 Rows

import { CSVImporter } from '@importcsv/react';

function LargeDatasetExample() {
  return (
    <CSVImporter
      columns={[
        { id: 'id', label: 'ID', validators: [{ type: 'required' }] },
        { id: 'name', label: 'Name', validators: [{ type: 'required' }] },
        { id: 'email', label: 'Email', type: 'email' },
        { id: 'amount', label: 'Amount', type: 'number' }
      ]}
      onComplete={(data) => {
        console.log(`Imported ${data.num_rows} rows successfully`);
      }}
    />
  );
}

Technical Implementation

Virtual Table Component

The virtual table maintains a window of visible rows:

// Internal implementation (simplified)
import { useRef } from 'react';
import { useVirtualizer } from '@tanstack/react-virtual';

const VirtualTable = ({ data, rowHeight = 35 }) => {
  const scrollRef = useRef(null);

  const virtualizer = useVirtualizer({
    count: data.length,
    getScrollElement: () => scrollRef.current,
    estimateSize: () => rowHeight,
    overscan: 5 // Render 5 extra rows above/below the viewport for smooth scrolling
  });

  return (
    <div ref={scrollRef} style={{ height: '400px', overflow: 'auto' }}>
      {/* Spacer sized to the full list keeps the scrollbar proportional to all rows */}
      <div style={{ height: virtualizer.getTotalSize(), position: 'relative' }}>
        {virtualizer.getVirtualItems().map(virtualRow => (
          // Each visible row is absolutely positioned at its virtual start offset
          <div
            key={virtualRow.key}
            style={{ position: 'absolute', top: 0, width: '100%', transform: `translateY(${virtualRow.start}px)` }}
          >
            <Row data={data[virtualRow.index]} />
          </div>
        ))}
      </div>
    </div>
  );
};

Progressive Validation Strategy

// Validation is split into chunks; validateChunk() runs the configured
// column validators over a single batch of rows
const INSTANT_VALIDATION_LIMIT = 50;
const CHUNK_SIZE = 100;

async function validateProgressive(rows) {
  // Phase 1: Instant
  const instant = rows.slice(0, INSTANT_VALIDATION_LIMIT);
  validateChunk(instant);
  
  // Phase 2: Progressive
  for (let i = INSTANT_VALIDATION_LIMIT; i < rows.length; i += CHUNK_SIZE) {
    await new Promise(resolve => requestIdleCallback(resolve));
    const chunk = rows.slice(i, i + CHUNK_SIZE);
    validateChunk(chunk);
  }
}

Limitations

  • Maximum tested: 10,000 rows (larger files may work but aren't guaranteed)
  • Browser memory: Very large files (50,000+ rows) may hit browser memory limits
  • Complex validation: Performance degrades with many complex validators

Best Practices

  1. Test with real data: Performance varies based on data complexity
  2. Monitor memory usage: Use browser dev tools for large files
  3. Consider pagination: For files > 10,000 rows, move heavy validation and processing server-side (see the sketch after this list)
  4. Optimize validators: Use simple, efficient validation rules
  5. Provide feedback: Show progress indicators for large files
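
A minimal hand-off sketch for point 3 above, assuming a hypothetical /api/imports endpoint on your own backend (the endpoint is not part of ImportCSV):

// Heavy validation and persistence happen server-side after the import completes
function handleComplete(data) {
  fetch('/api/imports', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(data)
  }).catch((err) => console.error('Import upload failed', err));
}

// Pass it to the importer: <CSVImporter columns={columns} onComplete={handleComplete} />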