Writing custom cell processors

Super CSV provides a wide variety of useful cell processors, but you are free to write your own if you need to. If you think other people might benefit from your custom cell processor, send us a patch and we'll consider adding it to the next version of Super CSV.

So how do you write a custom cell processor?

Let's say you're trying to read a CSV file that has a day column, and you've written your own enumeration to represent that.

package org.supercsv.example;

/**
 * An enumeration of days.
 */
public enum Day {
        MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY, SUNDAY
}

You could write the following processor to parse the column to your enum (ignoring the case of the input).

package org.supercsv.example;

import org.supercsv.cellprocessor.CellProcessorAdaptor;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.exception.SuperCsvCellProcessorException;
import org.supercsv.util.CsvContext;

/**
 * An example of a custom cell processor. 
 */
public class ParseDay extends CellProcessorAdaptor {
        
        public ParseDay() {
                super();
        }
        
        public ParseDay(CellProcessor next) {
                // this constructor allows other processors to be chained after ParseDay
                super(next);
        }
        
        @Override
        public Object execute(Object value, CsvContext context) {
                
                // throws a SuperCsvCellProcessorException if the input is null
                validateInputNotNull(value, context);
                
                for (Day day : Day.values()){
                        if (day.name().equalsIgnoreCase(value.toString())){
                                // passes the Day enum to the next processor in the chain
                                return next.execute(day, context);
                        }
                }
                
                throw new SuperCsvCellProcessorException(
                        String.format("Could not parse '%s' as a day", value), context, this);
        }
}

The important things to note above are:

  • the processor must extend CellProcessorAdaptor - this ensures it implements the CellProcessor interface and can be chained to other processors
  • it has a no-args constructor (for when this is the last or only processor in the chain)
  • it has a constructor that allows another processor to be chained afterwards (it must call super(next))
  • if the processor requires further configuration, additional parameters can be added to the constructors
  • input is mandatory for this processor, so it calls its inherited validateInputNotNull() method, which throws a SuperCsvCellProcessorException for null input
  • the return statement for the execute() method actually invokes the execute() method of the next processor in the chain. If there is no next processor, the value is simply returned; otherwise the next processor is free to perform additional processing. If your processor doesn't allow chaining at all, you could simply return the value instead (i.e. return day; in the above example).
  • if the processor fails to parse the input (it doesn't match any of the days), it throws a SuperCsvCellProcessorException with a meaningful message, the current context (which will contain the line/row/column numbers), and a reference to the processor.
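
The chaining behaviour described above can be easier to follow in a stripped-down model. The sketch below does not use Super CSV at all: Processor, EndOfChain, ProcessorBase, ParseDaySketch and IsWeekend are hypothetical stand-ins invented for this illustration, the CsvContext argument is left out, and IllegalArgumentException stands in for SuperCsvCellProcessorException. It only assumes that a base class can fall back to a "do nothing" terminator when no next processor is supplied, which is one way to get the "value is simply returned" behaviour.

```java
// Simplified stand-ins for illustration only; these are NOT Super CSV's real types.
interface Processor {
    Object execute(Object value);
}

// Terminates a chain by returning the value unchanged.
final class EndOfChain implements Processor {
    static final EndOfChain INSTANCE = new EndOfChain();
    private EndOfChain() { }
    public Object execute(Object value) { return value; }
}

// Plays the role that CellProcessorAdaptor plays in the example above.
abstract class ProcessorBase implements Processor {
    protected final Processor next;

    // last (or only) processor in the chain
    protected ProcessorBase() { this.next = EndOfChain.INSTANCE; }

    // another processor is chained afterwards
    protected ProcessorBase(Processor next) {
        if (next == null) {
            throw new NullPointerException("next processor should not be null");
        }
        this.next = next;
    }
}

// Copy of the Day enum so the sketch compiles by itself.
enum Day { MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY, SUNDAY }

final class ParseDaySketch extends ProcessorBase {
    ParseDaySketch() { super(); }
    ParseDaySketch(Processor next) { super(next); }

    public Object execute(Object value) {
        if (value == null) {
            throw new IllegalArgumentException("input should not be null");
        }
        for (Day day : Day.values()) {
            if (day.name().equalsIgnoreCase(value.toString())) {
                // hands the Day to the next processor (or to the terminator)
                return next.execute(day);
            }
        }
        throw new IllegalArgumentException(
                String.format("Could not parse '%s' as a day", value));
    }
}

// Hypothetical follow-on processor: turns a Day into a weekend flag.
final class IsWeekend extends ProcessorBase {
    public Object execute(Object value) {
        Day day = (Day) value;
        return next.execute(day == Day.SATURDAY || day == Day.SUNDAY);
    }
}

final class ChainSketch {
    public static void main(String[] args) {
        // only processor in the chain: the parsed Day comes straight back
        System.out.println(new ParseDaySketch().execute("monday"));                 // prints MONDAY
        // chained: the Day is passed on to IsWeekend before being returned
        System.out.println(new ParseDaySketch(new IsWeekend()).execute("Sunday"));  // prints true
    }
}
```

Because the no-args constructor installs a terminator, execute() never has to check whether a next processor exists; the same return statement works whether the processor is last in the chain or somewhere in the middle.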

For more ideas, take a look at the existing cell processors in the project source.