> This repository was archived by the owner on Feb 14, 2018. It is now read-only.

# delimited

Simple CSV I/O for Scala. Read, write, validate, and transform. Do so line-by-line, all at once, or via streams.

## Depending upon

The project is available in the Maven Central Repository. Add a dependency on the core sub-project as shown below for your build system (add other sub-projects as needed):

Simple Build Tool:

libraryDependencies += "com.rockymadden.delimited" %% "delimited-core" % "0.1.0"

Gradle:

compile 'com.rockymadden.delimited:delimited-core_2.10:0.1.0'

Maven:

<dependency>
	<groupId>com.rockymadden.delimited</groupId>
	<artifactId>delimited-core_2.10</artifactId>
	<version>0.1.0</version>
</dependency>

## Reader Usage

The recommended usage of DelimitedReader is via the loan pattern, provided by functions in its companion object (shown below). Loaned readers get automatic resource cleanup. Read functions ultimately return DelimitedLines, a type alias for IndexedSeq[String].


Line-by-line:

DelimitedReader.using("path/to/file.csv") { reader =>
	Iterator.continually(reader.readLine()).takeWhile(_.isDefined).foreach(println)
}

The readLine function returns Option[DelimitedLine]. The end of file is indicated by the return of None rather than Some.
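Because the return type is an Option, a pattern match makes the end-of-file case explicit. A minimal sketch, using a stand-in value in place of an actual reader (no file I/O involved):

```scala
// Stand-in for reader.readLine(); DelimitedLine is an alias for IndexedSeq[String].
val line: Option[IndexedSeq[String]] = Some(Vector("a", "b", "c"))

line match {
  case Some(fields) => println(fields.mkString(","))  // prints "a,b,c"
  case None         => println("end of file")         // readLine returned None
}
```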


All-at-once:

DelimitedReader.using("path/to/file.csv") { reader =>
	reader.readAll().foreach(_.foreach(println))
}

The readAll function returns Option[Seq[DelimitedLine]].
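Since readAll also wraps its result in an Option, getOrElse provides a convenient empty fallback. A sketch with a stand-in value rather than a real reader:

```scala
// Stand-in for reader.readAll().
val all: Option[Seq[IndexedSeq[String]]] = Some(Seq(
  Vector("a", "b"),
  Vector("c", "d")
))

// Falls back to an empty Seq when nothing was read, so iteration is always safe.
val lines = all.getOrElse(Seq.empty)
lines.foreach(line => println(line.mkString(",")))  // prints "a,b" then "c,d"
```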


Via stream:

DelimitedReader.using("path/to/file.csv") { reader =>
	reader.readToStream().take(2).foreach(println)
}

The readToStream function returns Stream[DelimitedLine].


With header:

DelimitedReader.usingWithHeader("path/to/file.csv") { (reader, header) =>
	reader.readLine() map { line =>
		val field0 = line(header("field0"))
		val field1 = line(header("field1"))
	}
}

The header type is Map[String, Int]. It maps each field value in the first line to its index.
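For illustration, the header map for a file whose first line is field0,field1,field2 would be equivalent to the following (built here by hand; the library derives it from the first line):

```scala
// Hypothetical first line of the file, already split into fields.
val firstLine: IndexedSeq[String] = Vector("field0", "field1", "field2")

// Equivalent to the header handed to the loan function.
val header: Map[String, Int] = firstLine.zipWithIndex.toMap
// header("field1") == 1, so line(header("field1")) selects the second field.
```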


## Writer Usage

The recommended usage of DelimitedWriter is via the loan pattern, provided by functions in its companion object (shown below). Loaned writers get automatic resource cleanup.


Line-by-line:

DelimitedWriter.using("path/to/file.csv") { writer =>
	val line = Some(Vector("field0", "field1", "field2"))
	writer.writeLine(line)
}

All-at-once:

DelimitedWriter.using("path/to/file.csv") { writer =>
	val lines = Some(Seq(
		Vector("field0", "field1", "field2"),
		Vector("field0", "field1", "field2")
	))
	writer.writeAll(lines)
}

Via stream:

DelimitedReader.using("path/to/file.csv") { reader =>
	DelimitedWriter.using("path/to/anotherfile.csv") { writer =>
		val lines = reader.readToStream()
		writer.writeFromStream(lines)
	}
}

## Decorating

It is possible to decorate readers and writers with additional functionality; this is provided by rich wrapping via implicits. Decorations include:

  • withTransform: Transform line values after reading and/or before writing. A handful of pre-built transforms are located in the transform module.

Non-decorated usage:

DelimitedReader.using("path/to/file.csv") { reader =>
	// Do something with reader.
}

Apply a filter so that each line retains only alphabetical characters:

DelimitedReader.using("path/to/file.csv") { reader =>
	val decoratedReader = reader withTransform StringTransform.filterAlpha

	// Do something with decoratedReader.
}

Make your own:

DelimitedReader.using("path/to/file.csv") { reader =>
	val customTransform: StringTransform = (s: String) =>
		s.toCharArray.filter(c => c == 'A' || c == 'C' || c == 'G' || c == 'T').mkString
	val decoratedReader = reader withTransform customTransform

	// Do something with decoratedReader.
}

## Validator Usage

Validators exist to ensure files pass one or more checks. A handful of pre-built checks are located in the check module.


In this scenario, we want to ensure that the number of fields in each line is consistent and that every field is non-empty:

val reader = DelimitedReader("path/to/file.csv")

DelimitedValidator(reader).validate(
	DelimitedChecks.checkFieldCountConsistent,
	DelimitedChecks.checkFieldsHaveLength
)

In this scenario, we want to ensure that all field lengths are consistent (e.g. every field is exactly two characters long):

val reader = DelimitedReader("path/to/file.csv")

DelimitedValidator(reader).validate(DelimitedChecks.checkFieldsLengthConsistent)

## License

The MIT License (MIT)

Copyright (c) 2013 Rocky Madden (http://rockymadden.com/)

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
