clusterBy
, that clusters an aggregate based on a signature function. Here are some examples of its usage from his post ..*Main> let antwords = words "the tan ant gets some fat"
*Main> clusterBy length antwords
[["the","tan","ant","fat"],["gets","some"]]
*Main> clusterBy head antwords
[["ant"],["fat"],["gets"],["some"],["the","tan"]]
*Main> clusterBy last antwords
[["the","some"],["tan"],["gets"],["ant","fat"]]
The Haskell implementation uses
Arrow
s and is super elegant and concise ..import Control.Arrow ((&&&))
import qualified Data.Map as M
clusterBy :: Ord b => (a -> b) -> [a] -> [[a]]
clusterBy f = M.elems . M.map reverse . M.fromListWith (++) . map (f &&& return)
I was trying to implement the same in Scala, but could only proceed to the extent of having separate functions
clusterByLength
, clusterByHead
, clusterByLast
etc. ..val l: List[String] = List("the tan ant gets some fat".split(" "): _*)
def clusterByLength = {
val m = new HashMap[Int, List[String]]
l.map {x => (x.length, x)}.foreach {v =>
m.put(v._1, v._2 :: m.get(v._1).getOrElse(List[String]()))
}
m.keys.toList.sort((e1, e2) => e1 < e2).map {x => m.get(x).get.reverse}
}
def clusterByHead = {
val m = new HashMap[String, List[String]]
l.map {x => (x.substring(0, 1), x)}.foreach {v =>
m.put(v._1, v._2 :: m.get(v._1).getOrElse(List[String]()))
}
m.keys.toList.sort((e1, e2) => e1 < e2).map {x => m.get(x).get.reverse}
}
def clusterByLast = {
val m = new HashMap[String, List[String]]
l.map {x => (x.substring(x.length - 1), x)}.foreach {v =>
m.put(v._1, v._2 :: m.get(v._1).getOrElse(List[String]()))
}
m.keys.toList.sort((e1, e2) => e1 < e2).map {x => m.get(x).get.reverse}
}
Is there any way that I can raise the level of abstraction for the above implementation, and have a Scala equivalent of the Haskell
clusterBy
?
10 comments:
Something like http://paste.pocoo.org/show/100767/ ?
Here's my contribution:
http://paste.pocoo.org/show/100774/
Oh, yuck. The inner sets are mutable. That's no good.
Change .map(m) to either
.map(m.andThen(_.readOnly))
or
.map(m.andThen(collection.immutable.Set.empty ++ _))
Neither is ideal, but there ya go.
Hello!
I don't know if your looking for this arrow but maybe it helps:
http://lucdup.blogspot.com/2008/11/scala-monads-and-arrows.html
It may be out-of-date, but this looks useful:
http://scala.sygneca.com/code/arrows
My try:
http://gist.github.com/50572
Actually almost all the needed functionality is Haskell's fromListWith function so I implemented that one first.
Obvious solution with mutable hashmap:
http://paste.pocoo.org/show/100848/
Not what you were asking for, I understand, but here's a solution in J:
cluster=: 1 :'] <@#~ [:(="_ 0 ~.)u&.>'
A solution using arrows as defined here:
http://paste.pocoo.org/show/100931
Note: I had to define "new Function1" for type inference to work properly when composing the arrows. I don't know if there a simpler way to do it.
Reading the answers was an interesting exercise in itself, the common theme being folding into a map, where the function result value is used as a key for the map.
But the fact that there's an arrows implementation in Scala is even better.
Post a Comment