Scala Case Classes to CSV
For the past eight months, I have been playing and programming in Scala for fun. One of the many things I wanted to do was to convert a collection of case classes into a CSV file. This sounds like a problem that has already been solved (see “PureCSV: A type-safe and boilerplate-free CSV library for Scala”), and it has… to an extent.
In my specific use case, I wanted to parse some data from JSON into a case class and eventually convert a collection of these classes into a CSV file. The problem I ran into was that the serialization library I was using, Gson, maps JSON null values to Java null values, and PureCSV did not like these null values and was throwing ReferenceNullExceptions.
I ended up creating a simple implementation based on this Stack Overflow answer, in which I assign default values to the different value types. Here is what I ended up doing:
import java.io.{BufferedWriter, FileWriter}
import com.github.tototoshi.csv.CSVWriter
// ...
def writeCaseClasses(seqOfCaseClasses: Seq[AnyRef], filename: String): Unit = {
  import Implicits._
  // Turn each case class into a map of field name -> value
  val mapOfKeyValues = seqOfCaseClasses.map(x => x.toMap)
  val values = mapOfKeyValues.map(_.values.toSeq)
  // Note: head throws on an empty sequence, so pass at least one element
  val header = mapOfKeyValues.head.keys.toSeq
  val out = new BufferedWriter(new FileWriter(filename))
  val writer = new CSVWriter(out)
  try {
    // Prepend the header to the values; alternatively, you can write only the values
    writer.writeAll(Seq(header) ++ values)
  } finally {
    // Close the writer so that buffered rows are flushed to the file
    writer.close()
  }
}
import java.lang.reflect.Field
import org.apache.commons.lang3.StringEscapeUtils
/**
 * Implicit class that provides a toMap method. It turns a case class into a map of field names to values.
 * This is intended for case classes only.
 */
object Implicits {
  implicit class CaseClassToMap(c: AnyRef) {
    // --- Methods ---
    def toMap: Map[String, Any] = {
      toMap(getDefaultValue)
    }
    def toMap(formatFunction: (Field, AnyRef) => Any): Map[String, Any] = {
      // foldLeft replaces the /: operator, which is deprecated since Scala 2.13
      c.getClass.getDeclaredFields.foldLeft(Map[String, Any]()) { (map: Map[String, Any], field: Field) =>
        field.setAccessible(true)
        val fieldValue: Any = formatFunction(field, c)
        map + (field.getName -> fieldValue)
      }
    }
    def getDefaultValue(field: Field, c: AnyRef): Any = {
      if (field.get(c) == null) {
        // Set the default values to something other than null
        if (field.getType.getName == "java.lang.String") ""
        else if (field.getType.getName == "int") 0
        else if (field.getType.getName == "long") 0L
        else if (field.getType.getName == "double") 0.0
        else if (field.getType.getName == "boolean") false
        // Any other type: fall back to an empty string instead of returning Unit
        else ""
      } else {
        // Ensure that values are not HTML escaped
        if (field.getType.getName == "java.lang.String") StringEscapeUtils.unescapeHtml4(field.get(c).toString)
        else field.get(c)
      }
    }
  }
}
Here I’m using com.github.tototoshi.csv.CSVWriter to write the CSV file and org.apache.commons.lang3.StringEscapeUtils to unescape HTML in string values; other than that, it should be all Scala.
The intention of the toMap method is to let you pass a function that checks each value for null and substitutes a default. I have provided a default formatting method that should cover some cases, but you can create your own and pass it to the toMap method.
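To make that last point concrete, here is a minimal sketch of passing a custom formatting function. It uses a trimmed-down copy of the implicit class (only the toMap overload that takes a function, with no commons-lang dependency) so that it runs on its own. The names MapSyntax, Person, nullAsNA and FormatDemo are made up for this illustration and are not part of the code above.

```scala
import java.lang.reflect.Field

// Trimmed-down copy of the implicit class from this post (assumption: only the
// overload that takes a formatting function is needed for the illustration).
object MapSyntax {
  implicit class CaseClassToMap(c: AnyRef) {
    def toMap(formatFunction: (Field, AnyRef) => Any): Map[String, Any] =
      c.getClass.getDeclaredFields.foldLeft(Map.empty[String, Any]) { (map, field) =>
        field.setAccessible(true)
        map + (field.getName -> formatFunction(field, c))
      }
  }
}

// Hypothetical case class, used only for this example.
case class Person(name: String, nickname: String, age: Int)

object FormatDemo {
  import MapSyntax._

  // Hypothetical formatter: replace null with "N/A" and leave every other value untouched.
  def nullAsNA(field: Field, instance: AnyRef): Any = {
    val value = field.get(instance)
    if (value == null) "N/A" else value
  }

  // The null nickname is replaced; name and age pass through unchanged.
  val row: Map[String, Any] = Person("Ada", null, 36).toMap(nullAsNA)
}
```

With this formatter, FormatDemo.row("nickname") is "N/A", while the name and age entries keep their original values. The same function could be passed to the toMap method of the full Implicits object if you prefer a visible marker over an empty cell.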