
In Scala [2.11.6], how would one create a lazy stream of objects from an ordered set of Longs?

In a nutshell, what I wish to do is take a set of Longs, arbitrarily ordered as in (7,3,9,14,123,2), and have a series of objects available:

Set(SomeObject(7),SomeObject(3),SomeObject(9),SomeObject(14),SomeObject(123),SomeObject(2))

However, I do not want the SomeObject objects initialized until I actually ask for them. I also wish to be able to ask for them in arbitrary order, as in "give me the 3rd SomeObject (by index)" or "give me the SomeObject that maps to the Long value of 7", all without triggering initializations down the stack.

I understand lazy streams; however, I'm not quite sure how to connect the dots between the initial Set of Longs (map would convert it eagerly, of course, as in map { x => SomeObject(x) }) and a lazy Stream (in the same initial arbitrary order, please!).

One additional rule is that this needs to be Set-based, so that the same Long (and its matching SomeObject) never appears twice.

An additional need is to handle multiple Sets of Longs being mashed together initially while maintaining the (FIFO) order and uniqueness, but I believe that is all built into a subclass of Set to begin with.

Set doesn't provide indexed access, so you can't get the "3rd SomeObject". Also, a Set can't provide any operations without evaluating the values it contains, because those values need to be ordered (in a tree-based implementation) or hashed (in a HashSet), and you can't sort or hash a value that you do not know.

If creating a SomeObject is resource-consuming, it may be better to create a "SomeObjectHolder" class that creates the SomeObject on demand and provides hashing operations that do not require creating the SomeObject.

Then you will have

Set(SomeObjectHolder(7),SomeObjectHolder(3),SomeObjectHolder(9),...

Each SomeObjectHolder will then create the corresponding SomeObject for you when you need it.
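A minimal sketch of such a holder, assuming the key is a Long. The names SomeObjectHolder, key and value are illustrative and not part of any library. Equality and hashing are based on the key only, so the Set never needs the expensive object:

```scala
// Stand-in for the expensive object from the question.
case class SomeObject(v: Long) {
  println(s"$v created")
}

// Hypothetical holder: equals and hashCode look at the Long key only,
// so a Set can store holders without constructing any SomeObject.
final class SomeObjectHolder(val key: Long) {
  // Constructed on first access, then reused.
  lazy val value: SomeObject = SomeObject(key)

  override def equals(other: Any): Boolean = other match {
    case that: SomeObjectHolder => that.key == key
    case _                      => false
  }

  override def hashCode: Int = key.hashCode
}

// Building the Set does not create any SomeObject.
val holders = Set(7L, 3L, 9L).map(new SomeObjectHolder(_))

// Only the requested object is created here.
val seven = holders.find(_.key == 7L).get.value
```

Note that this still gives no access by index, since the holders live in a plain Set.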

Some of your requirements can be satisfied by a lazy view of an indexed sequence:

case class SomeObject(v:Long) {
  println(s"$v created")
}

val source = Vector(0L, 1L, 2L, 3L, 4L)
val col = source.view.map(SomeObject.apply)

In this case, when you access individual elements by index, as in col(2), only the requested elements are evaluated. However, when you request a slice, all elements from 0 to the endpoint are evaluated.

col.slice(1, 2).toList

Prints:

0 created
1 created

This approach has several drawbacks:

  • when you request an element several times, it is evaluated each time
  • when you request a slice, all elements from the beginning are evaluated
  • you can't request the mapping for an arbitrary key (only for an index)
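The first drawback can be checked with a counter instead of println. This is a small sketch; the counter name created and the multiply-by-ten mapping are stand-ins for constructing SomeObject:

```scala
// Counts how many times the mapping function runs.
var created = 0

val source = Vector(0L, 1L, 2L, 3L, 4L)

// The view does not cache mapped elements.
val col = source.view.map { x => created += 1; x * 10 }

val a = col(2) // runs the mapping function once
val b = col(2) // runs it again for the same index
```

After the two lookups, created is 2 even though only one index was requested, which is why a cache is needed on top of the view.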

To satisfy all your requirements, a custom class should be created:

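import scala.collection.mutable

class CachedIndexedSeq[K, V](source: IndexedSeq[K], func: K => V) extends IndexedSeq[V] {
  // Values computed so far, keyed by the source element
  private val cache = mutable.Map[K, V]()

  // Computes the value for a key on first request and reuses it afterwards
  def getMapping(key: K): V = cache.getOrElseUpdate(key, func(key))

  override def length: Int = source.length

  // Indexed access goes through the same cache
  override def apply(idx: Int): V = getMapping(source(idx))
}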

This class takes a source indexed sequence as an argument, along with a mapping function. It evaluates elements lazily and also provides a getMapping method to lazily map an arbitrary key.

val source = Vector(0L, 1L, 2L)
val col2 = new CachedIndexedSeq[Long, SomeObject](source, SomeObject.apply)

col2.slice(1, 3).toList
col2(1)
col2(1)
col2.getMapping(1L)

Prints:

1 created
2 created

The only remaining requirement is the ability to avoid duplicates. Set doesn't combine well with requesting elements by index, so I suggest putting all your initial Longs into an indexed seq (such as Vector) and calling distinct on it before wrapping it in CachedIndexedSeq.
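Putting the pieces together might look like the sketch below. It repeats the CachedIndexedSeq class so that it runs on its own; the input vectors first and second and the counter created are made up for illustration. distinct keeps the first occurrence of each element, so the original (FIFO) order is preserved:

```scala
import scala.collection.mutable

case class SomeObject(v: Long)

// Same class as above, repeated so this snippet is self-contained.
class CachedIndexedSeq[K, V](source: IndexedSeq[K], func: K => V) extends IndexedSeq[V] {
  private val cache = mutable.Map[K, V]()
  def getMapping(key: K): V = cache.getOrElseUpdate(key, func(key))
  override def length: Int = source.length
  override def apply(idx: Int): V = getMapping(source(idx))
}

// Two hypothetical inputs with overlapping values.
val first  = Vector(7L, 3L, 9L)
val second = Vector(14L, 3L, 123L, 7L, 2L)

// Concatenate, then drop later duplicates: 7, 3, 9, 14, 123, 2
val keys = (first ++ second).distinct

// Counts constructions to show that each key is mapped at most once.
var created = 0
val objects = new CachedIndexedSeq[Long, SomeObject](keys, k => { created += 1; SomeObject(k) })

val third = objects(2)               // creates SomeObject(9)
val again = objects(2)               // served from the cache
val byKey = objects.getMapping(7L)   // creates SomeObject(7)
```

One design point to be aware of: getMapping accepts any key, so asking for a Long that is not in the source still creates and caches an object for it. Add a membership check if that is not wanted.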
