Scala-IO Core: Unmanaged Resources

Scala-IO is designed around automatically closing a resource each time it is accessed, so that a programmer cannot unintentionally leave resources open in the face of exceptions or other unexpected situations. However, there are cases where the Scala-IO API is desired but the resource management is not. The classic case is reading from System.in or writing to System.out. Unmanaged resources exist to satisfy this use case.

Since unmanaged resources are a less common use case, there is no factory object like the one for normal managed Resources. Instead, certain objects can be converted to unmanaged resources using the JavaConverters implicit methods.

Scala-IO Core: To Resource Converters

In order to simplify integration with existing libraries, most commonly Java libraries, Scala-IO provides a JavaConverters object with implicit methods that add as*** methods (asInput, asOutput, asSeekable, etc…) to several types of objects.  It is the same pattern as in the scala.collection.JavaConverters object.

These methods can be used instead of the Resource.from*** methods to produce slightly nicer-looking code.

There is one warning. When using JavaConverters instead of Resource.from*** to create Input/Output/Seekable/etc… objects, the chance of falling into the trap of creating a non-reusable resource or causing a resource leak increases. See: scala-io-core-reusable-resources for more details on this.
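
The as*** enrichment pattern and the reuse trap can be sketched without Scala-IO at all. In the minimal sketch below, SimpleInput, Converters, and asSimpleInput are hypothetical names invented for illustration; they are not Scala-IO's API, and the sketch assumes Scala 2.10 or later for implicit classes. It only illustrates the idea: a wrapper built around an already-open stream can be consumed once, while a wrapper that knows how to open a fresh stream can be reused.

```scala
import java.io.{ByteArrayInputStream, InputStream}

// Hypothetical stand-in for an Input-like resource; NOT Scala-IO's API.
final class SimpleInput(open: () => InputStream) {
  // Opens the stream, reads every byte, and always closes the stream.
  def bytes: Vector[Byte] = {
    val in = open()
    try Iterator.continually(in.read()).takeWhile(_ != -1).map(_.toByte).toVector
    finally in.close()
  }
}

object Converters {
  // Adds a hypothetical asSimpleInput method to any InputStream.
  implicit class InputStreamOps(in: InputStream) {
    // The same stream instance is handed out on every access.
    def asSimpleInput: SimpleInput = new SimpleInput(() => in)
  }
}

object ConvertersDemo {
  import Converters._

  private val data = "abc".getBytes("UTF-8")

  // Converted from one already-open stream: exhausted after the first read.
  val oneShot: SimpleInput = new ByteArrayInputStream(data).asSimpleInput

  // Built from an opener function: a fresh stream is created for each read.
  val reusable: SimpleInput = new SimpleInput(() => new ByteArrayInputStream(data))
}
```

Reading ConvertersDemo.oneShot.bytes a second time yields an empty Vector because the underlying stream is already exhausted, whereas ConvertersDemo.reusable.bytes returns the same three bytes on every call.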

Scala-IO Core: Output – OutputConverter

As mentioned in the last post on Output, it is possible to write arbitrary objects to an output object and have it serialized to disk.

The way this is handled in Scala-IO is via OutputConverters. If you are familiar with the type-class pattern then how this works should be clear. For a very quick introduction you can read: http://www.sidewayscoding.com/2011/01/introduction-to-type-classes-in-scala.html.

The clue is in the signature of write:


def write[T](data: T)(implicit writer: OutputConverter[T]): Unit

The last parameter is the object that defines how the data is serialized. The OutputConverter trait essentially converts an object into bytes and has a few built-in implementations in its companion object for types like Int, Float, Byte, Char, etc…
Since the parameter is implicit, the compiler will search for an implementation that satisfies the requirements (an OutputConverter with the type parameter T). This allows:

import scalax.io._

val output:Output = Resource.fromFile("scala-io.out")

output write 3

// and

output write Seq(1,2,3)

// one can be more explicit and declare the OutputConverter
output.write(3)(OutputConverter.IntConverter)

The last line in the example shows the explicit declaration of the OutputConverter to use when writing the data. This indicates how one can provide their own converter.

Since the parameter is implicit there are two ways that custom OutputConverters can be used.

  • Define an implicit converter for the type to be written. Any of the usual ways of providing implicits can be used, for example an implicit value in scope or an implicit in the companion object of the class to be written (serialized)
  • Explicitly declare the converter to use at the method call site
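
Both options can be illustrated with a small stand-alone type class. This is only a sketch: ByteEncoder, Point, and EncoderDemo are hypothetical names made up for the example and are not part of Scala-IO; the encode method merely mirrors the shape of the write signature shown above. The sketch assumes Scala 2.10 or later for string interpolation.

```scala
// Hypothetical stand-in for the OutputConverter idea; NOT Scala-IO's trait.
trait ByteEncoder[T] {
  def encode(data: T): Array[Byte]
}

object ByteEncoder {
  // Built-in instance, found automatically through the type class companion.
  implicit val stringEncoder: ByteEncoder[String] = new ByteEncoder[String] {
    def encode(data: String): Array[Byte] = data.getBytes("UTF-8")
  }
}

// A class we own: its encoder lives in its companion object.
final case class Point(x: Int, y: Int)

object Point {
  implicit val pointEncoder: ByteEncoder[Point] = new ByteEncoder[Point] {
    def encode(p: Point): Array[Byte] = s"${p.x},${p.y}".getBytes("UTF-8")
  }
}

object EncoderDemo {
  // Same shape as write: the second parameter list is implicit.
  def encode[T](data: T)(implicit encoder: ByteEncoder[T]): Array[Byte] =
    encoder.encode(data)

  // A type from another library (java.util.UUID): no companion to extend,
  // so the encoder is a plain value that is passed explicitly.
  val uuidEncoder: ByteEncoder[java.util.UUID] = new ByteEncoder[java.util.UUID] {
    def encode(id: java.util.UUID): Array[Byte] = id.toString.getBytes("UTF-8")
  }
}
```

With these definitions, EncoderDemo.encode(Point(1, 2)) resolves the encoder implicitly from the Point companion, while EncoderDemo.encode(someUuid)(EncoderDemo.uuidEncoder) names the encoder at the call site.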

First let’s examine the use case where the class comes from a different library and therefore we cannot create a companion object for it. The second case is where you are implementing the class yourself and can therefore add a companion object.
For this next bit to work you need to paste it into a file and run that, or use the paste mechanism of the REPL (type :paste into the REPL and press enter).


Scala-IO Core: Long Traversable

The LongTraversable trait is one of the most important objects in Scala IO. Input provides a uniform way of creating views on the data (as a string or byte array or LongTraversable of something like bytes.)

LongTraversable is a scala.collection.Traversable with some extra capabilities. A few of the salient points of LongTraversable are:

  • It is a lazy/non-strict collection similar to Stream. In other words, you can perform operations like map, flatMap, filter, collect, etc… without accessing the resource
  • Methods like slice and drop will (if possible for the resource) skip the dropped bytes without reading them
  • Each usage of the LongTraversable will typically open and close the underlying resource.
  • Has methods that one typically finds in Seq.  For example: zip, apply, containsSlice
  • Has methods that take or return Longs instead of Ints like ldrop, lslice, ltake, lsize
  • Has a limitFold method that allows fold-like behaviour with extra features like skip and early termination
  • Can be converted to an AsyncLongTraversable which has methods that return Futures instead and won’t block the program
  • Can be converted to a Process object for advanced data processing pipelines
Example usage:

The limitFold method can be quite useful to process only a portion of the file if you don’t know ahead of time what the indices of the portion are.
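
The idea of a fold that can stop early can be sketched with a plain Iterator. The helper below, foldUntil, is a hypothetical name invented for this illustration; it is not Scala-IO's limitFold and does not claim to match its signature. It only shows the general technique: the step function signals with Left that folding should stop, so the rest of the source is never read.

```scala
object EarlyFold {
  // Hypothetical helper, NOT Scala-IO's limitFold: folds until `step`
  // returns Left (stop with this result) or the input is exhausted.
  def foldUntil[A, B](items: Iterator[A], init: B)(step: (B, A) => Either[B, B]): B = {
    var acc = init
    var done = false
    while (!done && items.hasNext) {
      step(acc, items.next()) match {
        case Right(next)  => acc = next
        case Left(result) => acc = result; done = true
      }
    }
    acc
  }

  // Counts how many elements were actually pulled from the source.
  var consumed = 0

  // An endless lazy source: 1, 2, 3, ...
  val source: Iterator[Int] = Iterator.from(1).map { i => consumed += 1; i }

  // Sum elements until the running total reaches at least 10.
  val total: Int = foldUntil(source, 0) { (sum, i) =>
    if (sum + i >= 10) Left(sum + i) else Right(sum + i)
  }
}
```

Here the fold stops after the fourth element (1 + 2 + 3 + 4 = 10), even though the source itself never ends.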


Scala-IO Getting Started

For the next several posts you will need to have Scala-IO installed, and you should probably have an sbt project as well.

There are currently 2 Scala-IO 0.4 releases.

  • Scala-io 0.4-seq – A version of Scala-IO 0.4 without the Akka dependency and therefore no async support
  • Scala-io 0.4 – The full version that contains an Akka dependency
The Scala 2.10 versions will have no Akka dependency but can optionally use Akka.
So getting started:
Download the example project on the docs website (http://jesseeichar.github.com/scala-io-doc/latest):
  • Go to Getting Started and follow instructions for downloading and running the example project.  The following goes through the steps for the 0.4.1 instructions.

The last line (Right(770)) is not a command to enter; it is the result of the asynchronous call.


zipWithIndex

A common desire is to have access to the index of an element when using collection methods like foreach, filter, foldLeft/Right, etc… Fortunately there is a simple way.

List('a','b','c','d').zipWithIndex

But wait!

Does that not trigger an extra iteration through the collection? Indeed it does, and that is where views help.

List('a','b','c','d').view.zipWithIndex

When using a view the collection is only traversed when required so there is no performance loss.

Here are some examples of zipWithIndex:

scala> val list = List('a','b','c','d')
list: List[Char] = List(a, b, c, d)
/*
I like to use functions constructed with case statements
in order to clearly label the index.  The alternative is
to use x._2 for the index and x._1 for the value
*/
scala> list.view.zipWithIndex foreach {case (value,index) => println(value,index)}
(a,0)
(b,1)
(c,2)
(d,3)
// alternative syntax without case statement
scala> list.view.zipWithIndex foreach {e => println(e._1,e._2)}
(a,0)
(b,1)
(c,2)
(d,3)
/*
Fold left and right functions have 2 parameters (accumulator, nextValue)
using a case statement allows you to expand that but watch the brackets!
*/
scala> (list.view.zipWithIndex foldLeft 0) {case (acc,(value,index)) => acc + value.toInt + index}
res14: Int = 400
// alternative syntax without case statement
scala> (list.view.zipWithIndex foldLeft 0) {(acc,e) => acc + e._1.toInt + e._2}
res23: Int = 400
/*
alternative foldLeft operator.  The thing I like about this
syntax is that it has the initial accumulator value on the left
in the same position as the accumulator parameter in the function.
The other thing I like about it is that visually you can see that it starts with
"" and then folds left
*/
scala> ("" /: list.view.zipWithIndex) {
     | case (acc, (value, index)) if index % 2 == 0 => acc + value
     | case (acc, _) => acc
     | }
res15: java.lang.String = ac
/*
This example filters based on the index then uses map to remove the index.
force simply forces the view to be processed.  (I love these collections!)
*/
scala> list.view.zipWithIndex.filter { _._2 % 2 == 0 }.map { _._1}.force
res29: Seq[Char] = List(a, c)


Return value of a block

A common misunderstanding is that a code block (without parameters) is a function. That is not the case. A code block is a sequence of statements that are executed, and the result of the last statement is returned. That sounds like a Function0; however, if the block is passed to a method/function, only the result of the last statement is passed to that method/function. If the method/function expects a function as its parameter, the last statement may be returned as a function rather than a value, which means that the block itself is not a function.

scala> var count = 0
count: Int = 0
// the last statement is returned as a function so count
// is incremented only once during the creation of the function
scala> List(1,2,3,4).map{count += 1;_ + 1}
res9: List[Int] = List(2, 3, 4, 5)
scala> count
res10: Int = 1
// now the count increment is within the function
scala> List(1,2,3,4).map{i => count += 1;i + 1}
res11: List[Int] = List(2, 3, 4, 5)
scala> count
res12: Int = 5


The previous example demonstrates a gotcha if I ever saw one. map expects a function, so the block essentially constructs a function, with the last statement being that function. The first statement, count += 1, is executed only once because it is part of creating the function, not part of the resulting function. This is equivalent to:

scala> val x = {count += 1 ; i:Int => i +1}
x: (Int) => Int = <function1>
scala> List(1,2,3,4).map(x)
res15: List[Int] = List(2, 3, 4, 5)


Beginning a block with the parameter list signals that the entire block is a function.

Rule of thumb: Functions with placeholder parameters should be a single statement.


Type Inference with Abstract Types

A second “gotcha” that can trip you up when dealing with abstract types is that the signature of the concrete class contains type information about the abstract type. So if you are not explicit when assigning a variable or defining a function, you can get unexpected compiler errors.

scala> trait S {
     |   type x
     |   def get : x
     | }
defined trait S
scala> var sample = new S{
     |   type x = Int
     |   def get = 3
     | }
sample: java.lang.Object with S{type x = Int} = $anon$1@397af435
scala> sample = new S {
     |   type x = Double
     |   def get = 3.0
     | }
<console>:7: error: type mismatch;
 found   : java.lang.Object with S
 required: java.lang.Object with S{type x = Int}
       sample = new S {


In this example sample uses type inference so the actual type is S with underlying type Int. The consequence is that sample can only be assigned with instances of S with type x = Int. The fix is to explicitly declare the variable type:

scala> var sample2 : S = new S{
     |   type x = Int
     |   def get = 3
     | }
sample2: S = $anon$1@31602bbc
scala> sample2 = new S {
     |   type x = Double
     |   def get = 3.0
     | }
sample2: S = $anon$1@4de5ed7b


The same thing happens when a function is declared and its return type is left to type inference:

scala> class Fac {
     |   def newS = new S {
     |     type x = Int
     |     def get = 3
     |   }
     | }
defined class Fac
scala> class SubFac extends Fac{
     |   override def newS = new S {
     |     type x = Double
     |     def get = 3.0
     |   }
     | }
<console>:8: error: type mismatch;
 found   : java.lang.Object with S
 required: java.lang.Object with S{type x = Int}
         override def newS = new S {


The fix for this example is to be explicit in the definition of the function in the superclass.
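
A minimal sketch of that fix, restating trait S so the block is self-contained and written as a source file rather than a REPL session. The only change from the failing version is the explicit return type S on newS (repeating it in the subclass is optional but keeps the signatures identical):

```scala
trait S {
  type x
  def get: x
}

class Fac {
  // Declaring the return type as S keeps `type x = Int` out of the signature.
  def newS: S = new S {
    type x = Int
    def get = 3
  }
}

class SubFac extends Fac {
  // Now compiles: the overridden method only promises an S.
  override def newS: S = new S {
    type x = Double
    def get = 3.0
  }
}
```

The trade-off is that callers of newS no longer know that x is Int or Double; they only see the abstract type.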


Instance Type (Abstract type gotcha 1)

In a previous post about abstract types (Abstract Types vs Parameter) I showed one of the benefits of using abstract types over parameterized types. The next several posts will feature potential problems you may encounter when using abstract types.

I should point out that abstract types are not inherently difficult to understand but they are rather different from anything you will see when you come from the Java world so if you are new to them I would use them with caution at first.

In the abstract types example you will notice that the abstract type ‘I’ is not within the trait Source; rather, it is outside, in the Foreach trait. At first one might consider putting the type in Source rather than Foreach. The naive change can get you in trouble (but there are a couple of easy fixes):

trait Foreach[A] {
  trait Source {
    type I <: java.io.Closeable  // moved this line into Source
    def in : I
    def next(in : I) : Option[A]
  }
  def source : Source

  def foreach[U](f : A => U) : Unit = {
    val s = source.in
    try {
      def processNext : Unit = source.next(s) match {
        case None =>
          ()
        case Some(value) =>
          f(value)
          processNext
      }

      processNext
    } finally {
      // correctly handle exceptions
      s.close
    }
  }
}


Compiling the class results in a compilation error:

jeichar: tmp$ scalac XX.scala
XX.scala:12: error: type mismatch;
found : s.type (with underlying type Foreach.this.Source#I)
required: _2.I where val _2: Foreach.this.Source
def processNext : Unit = source.next(s) match {
^
XX.scala:16: error: type mismatch;
found : value.type (with underlying type Any)
required: A
f(value)
^
two errors found

So what is the problem? It is simple but subtle. Notice that source is defined as a def, so calling source twice may return two different instances of Source, each with its own type I; the compiler therefore cannot prove that the I returned by source.in is the I expected by source.next. A simple change fixes this: either change def source : Source to val source : Source, or change the method foreach to assign the result of source to a val.

trait Foreach {
  trait Source {
    type I <: java.io.Closeable  // moved this line into Source
    def in : I
    def next(in : I) : Option[Int]
  }
  def source : Source

  def foreach[U](f : Int => U) : Unit = {
    // this assignment allows this example to compile
    val sameSource = source
    val s = sameSource.in
    try {
      def processNext : Unit = sameSource.next(s) match {
        case None =>
          ()
        case Some(value) =>
          f(value)
          processNext
      }

      processNext
    } finally {
      // correctly handle exceptions
      s.close
    }
  }
}
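
The other fix, declaring source as a val, can be sketched as follows. The Foreach trait is the one from this post with def changed to val; the Numbers class and its Cursor are hypothetical additions, included only so the trait can be exercised. Because a val is a stable identifier, source.I refers to one single type throughout foreach.

```scala
trait Foreach[A] {
  trait Source {
    type I <: java.io.Closeable
    def in: I
    def next(in: I): Option[A]
  }
  // A val is a stable identifier, so source.I names a single type.
  val source: Source

  def foreach[U](f: A => U): Unit = {
    val s = source.in
    try {
      def processNext(): Unit = source.next(s) match {
        case None => ()
        case Some(value) =>
          f(value)
          processNext()
      }
      processNext()
    } finally {
      // runs even if f throws
      s.close()
    }
  }
}

// Hypothetical in-memory implementation, used only to exercise the trait.
class Numbers(values: List[Int]) extends Foreach[Int] {
  class Cursor extends java.io.Closeable {
    var remaining: List[Int] = values
    var closed = false
    def close(): Unit = closed = true
  }

  val cursor = new Cursor

  val source: Source = new Source {
    type I = Cursor
    def in: Cursor = cursor
    def next(in: Cursor): Option[Int] = in.remaining match {
      case head :: tail => in.remaining = tail; Some(head)
      case Nil => None
    }
  }
}
```

For example, new Numbers(List(1, 2, 3)).foreach(println) visits each element in order and closes the cursor afterwards.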


Filter with FlatMap (or collect)

I picked up this tip from one of Daniel Spiewak’s tweets. He tweeted a pro tip that uses flatMap to create a filtered list:

list flatMap {
  case st: String => Some(st)
  case _ => None
}


At a glance one might wonder why not simply use list.filter{_.isInstanceOf[String]}. The difference is that the flatMap will return a List[String].

However Scala 2.8 offers the collect method for doing a similar thing.

def strings(list: List[Any]) = list flatMap {
  case st: String => Some(st)
  case _ => None
}
// returned list is a List[String]
scala> strings("hi" :: 1 :: "world" :: 4 :: Nil)
res11: List[String] = List(hi, world)
// returned list is a List[Any] (not as useful)
scala> "hi" :: 1 :: "world" :: 4 :: Nil filter {_.isInstanceOf[String]}
res12: List[Any] = List(hi, world)
// collect returns List[String]
scala> "hi" :: 1 :: "world" :: 4 :: Nil collect {case s:String => s}
res13: List[String] = List(hi, world)
