Definition:
Collection is a group of objects to be treated as a single unit for unlimited objects, where array ends, the collection starts. In array, size cannot be increased or decreased but in collection we can add unlimited objects till memory is available.
Collection is required to arrange multiple objects in a single unit and perform a common task on them.
Collection is opposite to reflection, the reflection is always used in background whereas the collection is used in application development.
As an example enter a city and get all employees list belongs to that city. Here the function can return only one value, so we put all in a collection and return it from the function.
Here are some benefits of using collection:
1) It can arrange objects in sorted orders.
2) We have some collection which does not allow duplicate objects.
3) Objects can be managed and accessed in key value pair order.
4) check uniqueness etc.
A collection — sometimes called a container — is simply an object that groups multiple elements into a single unit. Collections are used to store, retrieve, manipulate, and communicate aggregate data. Typically, they represent data items that form a natural group, such as a poker hand (a collection of cards), a mail folder (a collection of letters), or a telephone directory (a mapping of names to phone numbers). If you have used the Java programming language — or just about any other programming language — you are already familiar with collections.
What Is a Collections Framework?
A collections framework is a unified architecture for representing and manipulating collections. All collections frameworks contain the following:
- Interfaces: These are abstract data types that represent collections. Interfaces allow collections to be manipulated independently of the details of their representation. In object-oriented languages, interfaces generally form a hierarchy.
- Implementations: These are the concrete implementations of the collection interfaces. In essence, they are reusable data structures.
- Algorithms: These are the methods that perform useful computations, such as searching and sorting, on objects that implement collection interfaces. The algorithms are said to be polymorphic: that is, the same method can be used on many different implementations of the appropriate collection interface. In essence, algorithms are reusable functionality.
Apart from the Java Collections Framework, the best-known examples of collections frameworks are the C++ Standard Template Library (STL) and Smalltalk's collection hierarchy. Historically, collections frameworks have been quite complex, which gave them a reputation for having a steep learning curve. We believe that the Java Collections Framework breaks with this tradition, as you will learn for yourself in this chapter.
Benefits of the Java Collections Framework
The Java Collections Framework provides the following benefits:
- Reduces programming effort: By providing useful data structures and algorithms, the Collections Framework frees you to concentrate on the important parts of your program rather than on the low-level "plumbing" required to make it work. By facilitating interoperability among unrelated APIs, the Java Collections Framework frees you from writing adapter objects or conversion code to connect APIs.
- Increases program speed and quality: This Collections Framework provides high-performance, high-quality implementations of useful data structures and algorithms. The various implementations of each interface are interchangeable, so programs can be easily tuned by switching collection implementations. Because you're freed from the drudgery of writing your own data structures, you'll have more time to devote to improving programs' quality and performance.
- Allows interoperability among unrelated APIs: The collection interfaces are the vernacular by which APIs pass collections back and forth. If my network administration API furnishes a collection of node names and if your GUI toolkit expects a collection of column headings, our APIs will interoperate seamlessly, even though they were written independently.
- Reduces effort to learn and to use new APIs: Many APIs naturally take collections on input and furnish them as output. In the past, each such API had a small sub-API devoted to manipulating its collections. There was little consistency among these ad hoc collections sub-APIs, so you had to learn each one from scratch, and it was easy to make mistakes when using them. With the advent of standard collection interfaces, the problem went away.
- Reduces effort to design new APIs: This is the flip side of the previous advantage. Designers and implementers don't have to reinvent the wheel each time they create an API that relies on collections; instead, they can use standard collection interfaces.
- Fosters software reuse: New data structures that conform to the standard collection interfaces are by nature reusable. The same goes for new algorithms that operate on objects that implement these interfaces.
The core collection interfaces encapsulate different types of collections, which are shown in the figure below. These interfaces allow collections to be manipulated independently of the details of their representation. Core collection interfaces are the foundation of the Java Collections Framework. As you can see in the following figure, the core collection interfaces form a hierarchy.
The core collection interfaces.
A
Set
is a special kind of Collection
, a SortedSet
is a special kind of Set
, and so forth. Note also that the hierarchy consists of two distinct trees — a Map
is not a true Collection
.
Note that all the core collection interfaces are generic. For example, this is the declaration of the
Collection
interface.public interface Collection<E>...
The
<E>
syntax tells you that the interface is generic. When you declare a Collection
instance you can and should specify the type of object contained in the collection. Specifying the type allows the compiler to verify (at compile-time) that the type of object you put into the collection is correct, thus reducing errors at runtime. For information on generic types, see the Generics (Updated) lesson.
When you understand how to use these interfaces, you will know most of what there is to know about the Java Collections Framework. This chapter discusses general guidelines for effective use of the interfaces, including when to use which interface. You'll also learn programming idioms for each interface to help you get the most out of it.
To keep the number of core collection interfaces manageable, the Java platform doesn't provide separate interfaces for each variant of each collection type. (Such variants might include immutable, fixed-size, and append-only.) Instead, the modification operations in each interface are designated optional — a given implementation may elect not to support all operations. If an unsupported operation is invoked, a collection throws an
UnsupportedOperationException
. Implementations are responsible for documenting which of the optional operations they support. All of the Java platform's general-purpose implementations support all of the optional operations.
The following list describes the core collection interfaces:
Collection
— the root of the collection hierarchy. A collection represents a group of objects known as its elements. TheCollection
interface is the least common denominator that all collections implement and is used to pass collections around and to manipulate them when maximum generality is desired. Some types of collections allow duplicate elements, and others do not. Some are ordered and others are unordered. The Java platform doesn't provide any direct implementations of this interface but provides implementations of more specific subinterfaces, such asSet
andList
. Also see The Collection Interface section.Set
— a collection that cannot contain duplicate elements. This interface models the mathematical set abstraction and is used to represent sets, such as the cards comprising a poker hand, the courses making up a student's schedule, or the processes running on a machine. See also The Set Interface section.List
— an ordered collection (sometimes called a sequence).List
s can contain duplicate elements. The user of aList
generally has precise control over where in the list each element is inserted and can access elements by their integer index (position). If you've usedVector
, you're familiar with the general flavor ofList
. Also see The List Interface section.Queue
— a collection used to hold multiple elements prior to processing. Besides basicCollection
operations, aQueue
provides additional insertion, extraction, and inspection operations.Queues typically, but do not necessarily, order elements in a FIFO (first-in, first-out) manner. Among the exceptions are priority queues, which order elements according to a supplied comparator or the elements' natural ordering. Whatever the ordering used, the head of the queue is the element that would be removed by a call toremove
orpoll
. In a FIFO queue, all new elements are inserted at the tail of the queue. Other kinds of queues may use different placement rules. EveryQueue
implementation must specify its ordering properties. Also see The Queue Interface section.Deque
— a collection used to hold multiple elements prior to processing. Besides basicCollection
operations, aDeque
provides additional insertion, extraction, and inspection operations.Deques can be used both as FIFO (first-in, first-out) and LIFO (last-in, first-out). In a deque all new elements can be inserted, retrieved and removed at both ends. Also see The Deque Interface section.Map
— an object that maps keys to values. AMap
cannot contain duplicate keys; each key can map to at most one value. If you've usedHashtable
, you're already familiar with the basics ofMap
. Also see The Map Interface section.
The last two core collection interfaces are merely sorted versions of
Set
and Map
:SortedSet
— aSet
that maintains its elements in ascending order. Several additional operations are provided to take advantage of the ordering. Sorted sets are used for naturally ordered sets, such as word lists and membership rolls. Also see The SortedSet Interface section.SortedMap
— aMap
that maintains its mappings in ascending key order. This is theMap
analog ofSortedSet
. Sorted maps are used for naturally ordered collections of key/value pairs, such as dictionaries and telephone directories. Also see The SortedMap Interface section.
To understand how the sorted interfaces maintain the order of their elements, see the Object Ordering section.
The Set Interface
A
the
The Java platform contains three general-purpose
Here's a simple but useful
It works by creating a
Or, if using JDK 8 or later, you could easily collect into a
Here's a slightly longer example that accumulates a
And the following is a minor variant of the first idiom that preserves the order of the original collection while removing duplicate elements:
The following is a generic method that encapsulates the preceding idiom, returning a
Set
is a Collection
that cannot contain duplicate elements. It models the mathematical set abstraction. The Set
interface contains only methods inherited from Collection
and adds the restriction that duplicate elements are prohibited. Set
also adds a stronger contract on the behavior ofthe
equals
and hashCode
operations, allowing Set
instances to be compared meaningfully even if their implementation types differ. Two Set
instances are equal if they contain the same elements.The Java platform contains three general-purpose
Set
implementations: HashSet
, TreeSet
, and LinkedHashSet
. HashSet
, which stores its elements in a hash table, is the best-performing implementation; however it makes no guarantees concerning the order of iteration. TreeSet
, which stores its elements in a red-black tree, orders its elements based on their values; it is substantially slower than HashSet
. LinkedHashSet
, which is implemented as a hash table with a linked list running through it, orders its elements based on the order in which they were inserted into the set (insertion-order). LinkedHashSet
spares its clients from the unspecified, generally chaotic ordering provided by HashSet
at a cost that is only slightly higher.Here's a simple but useful
Set
idiom. Suppose you have a Collection
, c
, and you want to create another Collection
containing the same elements but with all duplicates eliminated. The following one-liner does the trick.Collection<Type> noDups = new HashSet<Type>(c);
Set
(which, by definition, cannot contain duplicates), initially containing all the elements in c
. It uses the standard conversion constructor described in the The Collection Interface section.Or, if using JDK 8 or later, you could easily collect into a
Set
using aggregate operations:c.stream().collect(Collectors.toSet()); // no duplicates
Collection
of names into a TreeSet
:Set<String> set = people.stream().map(Person::getName).collect(Collectors.toCollection(TreeSet::new));
Collection<Type> noDups = new LinkedHashSet<Type>(c);
Set
of the same generic type as the one passed.public static <E> Set<E> removeDups(Collection<E> c) { return new LinkedHashSet<E>(c); }
The List Interface
A
List
is an ordered Collection
(sometimes called a sequence). Lists may contain duplicate elements. In addition to the operations inherited from Collection
, the List
interface includes operations for the following:Positional access
— manipulates elements based on their numerical position in the list. This includes methods such asget
,set
,add
,addAll
, andremove
.Search
— searches for a specified object in the list and returns its numerical position. Search methods includeindexOf
andlastIndexOf
.Iteration
— extendsIterator
semantics to take advantage of the list's sequential nature. ThelistIterator
methods provide this behavior.Range-view
— Thesublist
method performs arbitrary range operations on the list.
List
implementations. ArrayList
, which is usually the better-performing implementation, and LinkedList
which offers better performance under certain circumstances.Iterators
As you'd expect, the
Iterator
returned by List
's iterator
operation returns the elements of the list in proper sequence. List
also provides a richer iterator, called a ListIterator
, which allows you to traverse the list in either direction, modify the list during iteration, and obtain the current position of the iterator.
The three methods that
ListIterator
inherits from Iterator
(hasNext
, next
, and remove
) do exactly the same thing in both interfaces. The hasPrevious
and the previous
operations are exact analogues of hasNext
and next
. The former operations refer to the element before the (implicit) cursor, whereas the latter refer to the element after the cursor. The previous
operation moves the cursor backward, whereas next
moves it forward.
Here's the standard idiom for iterating backward through a list.
for (ListIterator<Type> it = list.listIterator(list.size()); it.hasPrevious(); ) { Type t = it.previous(); ... }
Note the argument to
listIterator
in the preceding idiom. The List
interface has two forms of the listIterator
method. The form with no arguments returns a ListIterator
positioned at the beginning of the list; the form with an int
argument returns a ListIterator
positioned at the specified index. The index refers to the element that would be returned by an initial call to next
. An initial call to previous
would return the element whose index was index-1
. In a list of length n
, there are n+1
valid values for index
, from 0
to n
, inclusive.
Intuitively speaking, the cursor is always between two elements — the one that would be returned by a call to
previous
and the one that would be returned by a call to next
. The n+1
valid index
values correspond to the n+1
gaps between elements, from the gap before the first element to the gap after the last one. The following figure shows the five possible cursor positions in a list containing four elements.The Queue Interface
A
Each
Queues typically, but not necessarily, order elements in a FIFO (first-in-first-out) manner. Among the exceptions are priority queues, which order elements according to their values — see the Object Ordering section for details). Whatever ordering is used, the head of the queue is the element that would be removed by a call to
It is possible for a
The
The
The
Queue implementations generally do not define element-based versions of the
The
In the following example program, a queue is used to implement a countdown timer. The queue is preloaded with all the integer values from a number specified on the command line to zero, in descending order. Then, the values are removed from the queue and printed at one-second intervals. The program is artificial in that it would be more natural to do the same thing without using a queue, but it illustrates the use of a queue to store elements prior to subsequent processing.
In the following example, a priority queue is used to sort a collection of elements. Again this program is artificial in that there is no reason to use it in favor of the
Queue
is a collection for holding elements prior to processing. Besides basic Collection
operations, queues provide additional insertion, removal, and inspection operations. The Queue
interface follows.public interface Queue<E> extends Collection<E> { E element(); boolean offer(E e); E peek(); E poll(); E remove(); }
Queue
method exists in two forms: (1) one throws an exception if the operation fails, and (2) the other returns a special value if the operation fails (either null
or false
, depending on the operation). The regular structure of the interface is illustrated in the following table.Type of Operation | Throws exception | Returns special value |
---|---|---|
Insert | add(e) | offer(e) |
Remove | remove() | poll() |
Examine | element() | peek() |
Queues typically, but not necessarily, order elements in a FIFO (first-in-first-out) manner. Among the exceptions are priority queues, which order elements according to their values — see the Object Ordering section for details). Whatever ordering is used, the head of the queue is the element that would be removed by a call to
remove
or poll
. In a FIFO queue, all new elements are inserted at the tail of the queue. Other kinds of queues may use different placement rules. Every Queue
implementation must specify its ordering properties.It is possible for a
Queue
implementation to restrict the number of elements that it holds; such queues are known as bounded. Some Queue
implementations in java.util.concurrent
are bounded, but the implementations in java.util
are not.The
add
method, which Queue
inherits from Collection
, inserts an element unless it would violate the queue's capacity restrictions, in which case it throws IllegalStateException
. The offer
method, which is intended solely for use on bounded queues, differs from add
only in that it indicates failure to insert an element by returning false
.The
remove
and poll
methods both remove and return the head of the queue. Exactly which element gets removed is a function of the queue's ordering policy. The remove
and poll
methods differ in their behavior only when the queue is empty. Under these circumstances, remove
throws NoSuchElementException
, while poll
returns null
.The
element
and peek
methods return, but do not remove, the head of the queue. They differ from one another in precisely the same fashion as remove
and poll
: If the queue is empty, element
throws NoSuchElementException
, while peek
returns null
.Queue
implementations generally do not allow insertion of null
elements. The LinkedList
implementation, which was retrofitted to implement Queue
, is an exception. For historical reasons, it permits null
elements, but you should refrain from taking advantage of this, because null
is used as a special return value by the poll
and peek
methods.Queue implementations generally do not define element-based versions of the
equals
and hashCode
methods but instead inherit the identity-based versions from Object
.The
Queue
interface does not define the blocking queue methods, which are common in concurrent programming. These methods, which wait for elements to appear or for space to become available, are defined in the interface java.util.concurrent.BlockingQueue
, which extends Queue
.In the following example program, a queue is used to implement a countdown timer. The queue is preloaded with all the integer values from a number specified on the command line to zero, in descending order. Then, the values are removed from the queue and printed at one-second intervals. The program is artificial in that it would be more natural to do the same thing without using a queue, but it illustrates the use of a queue to store elements prior to subsequent processing.
import java.util.*; public class Countdown { public static void main(String[] args) throws InterruptedException { int time = Integer.parseInt(args[0]); Queue<Integer> queue = new LinkedList<Integer>(); for (int i = time; i >= 0; i--) queue.add(i); while (!queue.isEmpty()) { System.out.println(queue.remove()); Thread.sleep(1000); } } }
sort
method provided in Collections
, but it illustrates the behavior of priority queues.static <E> List<E> heapSort(Collection<E> c) { Queue<E> queue = new PriorityQueue<E>(c); List<E> result = new ArrayList<E>(); while (!queue.isEmpty()) result.add(queue.remove()); return result; }
The Deque Interface
Usually pronounced as
Note that the
The 12 methods for insertion, removal and retieval of Deque elements are summarized in the following table:
In addition to these basic methods to insert,remove and examine a
deck
, a deque is a double-ended-queue. A double-ended-queue is a linear collection of elements that supports the insertion and removal of elements at both end points. The Deque
interface is a richer abstract data type than both Stack
and Queue
because it implements both stacks and queues at the same time. The Deque
interface, defines methods to access the elements at both ends of the Deque
instance. Methods are provided to insert, remove, and examine the elements. Predefined classes like ArrayDeque
and LinkedList
implement the Deque
interface.Note that the
Deque
interface can be used both as last-in-first-out stacks and first-in-first-out queues. The methods given in the Deque
interface are divided into three parts:Insert
Theaddfirst
and offerFirst
methods insert elements at the beginning of the Deque
instance. The methods addLast
and offerLast
insert elements at the end of the Deque
instance. When the capacity of the Deque
instance is restricted, the preferred methods are offerFirst
and offerLast
because addFirst
might fail to throw an exception if it is full.Remove
TheremoveFirst
and pollFirst
methods remove elements from the beginning of the Deque
instance. The removeLast
and pollLast
methods remove elements from the end. The methods pollFirst
and pollLast
return null
if the Deque
is empty whereas the methods removeFirst
and removeLast
throw an exception if the Deque
instance is empty.Retrieve
The methodsgetFirst
and peekFirst
retrieve the first element of the Deque
instance. These methods dont remove the value from the Deque
instance. Similarly, the methods getLast
and peekLast
retrieve the last element. The methods getFirst
and getLast
throw an exception if the deque
instance is empty whereas the methods peekFirst
and peekLast
return NULL
.The 12 methods for insertion, removal and retieval of Deque elements are summarized in the following table:
Type of Operation | First Element (Beginning of the Deque instance) | Last Element (End of the Deque instance) |
---|---|---|
Insert | addFirst(e) offerFirst(e) | addLast(e) offerLast(e) |
Remove | removeFirst() pollFirst() | removeLast() pollLast() |
Examine | getFirst() peekFirst() | getLast() peekLast() |
Deque
instance, the Deque
interface also has some more predefined methods. One of these is removeFirstOccurence
, this method removes the first occurence of the specified element if it exists in the Deque
instance. If the element does not exist then the Deque
instance remains unchanged. Another similar method is removeLastOccurence
; this method removes the last occurence of the specified element in the Deque
instance. The return type of these methods is boolean
, and they return true
if the element exists in the Deque
instance.The Map Interface
A
The Java platform contains three general-purpose
The remainder of this page discusses the
Or compute the sum of all salaries by department:
Or perhaps group students by passing or failing grades:
You could also group people by city:
Or even cascade two collectors to classify people by state and city:
Again, these are but a few examples of how to use the new JDK 8 APIs. For in-depth coverage of lambda expressions and aggregate operations see the lesson entitled Aggregate Operations.
The only tricky thing about this program is the second argument of the
The program yields the following output.
Suppose you'd prefer to see the frequency table in alphabetical order. All you have to do is change the implementation type of the
Similarly, you could make the program print the frequency table in the order the words first appear on the command line simply by changing the implementation type of the map to
This flexibility provides a potent illustration of the power of an interface-based framework.
Like the
By convention, all general-purpose
and with an
The idiom for iterating over values is analogous. Following is the idiom for iterating over key-value pairs.
At first, many people worry that these idioms may be slow because the
With all three
With the
The
The
Along similar lines, suppose you want to know whether two
Suppose you have a
Suppose you want to know all the keys common to two
A similar idiom gets you the common values.
All the idioms presented thus far have been nondestructive; that is, they don't modify the backing
Suppose you want to remove from one
What happens when you start mixing keys and values in the same bulk operation? Suppose you have a
Suppose you want to fire all the employees who report directly to some manager, Simon.
Note that this idiom makes use of
Once you've done this, you may have a bunch of employees whose managers no longer work for the company (if any of Simon's direct-reports were themselves managers). The following code will tell you which employees have managers who no longer works for the company.
This example is a bit tricky. First, it makes a temporary copy of the
There are many more idioms like the ones contained in this section, but it would be impractical and tedious to list them all. Once you get the hang of it, it's not that difficult to come up with the right one when you need it.
There is a standard trick for finding anagram groups: For each word in the dictionary, alphabetize the letters in the word (that is, reorder the word's letters into alphabetical order) and put an entry into a multimap, mapping the alphabetized word to the original word. For example, the word bad causes an entry mapping abd into bad to be put into the multimap. A moment's reflection will show that all the words to which any given key maps form an anagram group. It's a simple matter to iterate over the keys in the multimap, printing out each anagram group that meets the size constraint.
Running this program on a 173,000-word dictionary file with a minimum anagram group size of eight produces the following output.
Many of these words seem a bit bogus, but that's not the program's fault; they're in the dictionary file. Here's the
Map
is an object that maps keys to values. A map cannot contain duplicate keys: Each key can map to at most one value. It models the mathematical function abstraction. The Map
interface includes methods for basic operations (such as put
, get
, remove
, containsKey
, containsValue
, size
, and empty
), bulk operations (such as putAll
and clear
), and collection views (such as keySet
, entrySet
, and values
).The Java platform contains three general-purpose
Map
implementations: HashMap
, TreeMap
, and LinkedHashMap
. Their behavior and performance are precisely analogous to HashSet
, TreeSet
, and LinkedHashSet
, as described in The Set Interface section.The remainder of this page discusses the
Map
interface in detail. But first, here are some more examples of collecting to Map
s using JDK 8 aggregate operations. Modeling real-world objects is a common task in object-oriented programming, so it is reasonable to think that some programs might, for example, group employees by department:// Group employees by department Map<Department, List<Employee>> byDept = employees.stream() .collect(Collectors.groupingBy(Employee::getDepartment));
// Compute sum of salaries by department Map<Department, Integer> totalByDept = employees.stream() .collect(Collectors.groupingBy(Employee::getDepartment, Collectors.summingInt(Employee::getSalary)));
// Partition students into passing and failing Map<Boolean, List<Student>> passingFailing = students.stream() .collect(Collectors.partitioningBy(s -> s.getGrade()>= PASS_THRESHOLD));
// Classify Person objects by city Map<String, List<Person>> peopleByCity = personStream.collect(Collectors.groupingBy(Person::getCity));
// Cascade Collectors Map<String, Map<String, List<Person>>> peopleByStateAndCity = personStream.collect(Collectors.groupingBy(Person::getState, Collectors.groupingBy(Person::getCity)))
Map Interface Basic Operations
The basic operations ofMap
(put
, get
, containsKey
, containsValue
, size
, and isEmpty
) behave exactly like their counterparts in Hashtable
. The following program
generates a frequency table of the words found in its argument list. The frequency table maps each word to the number of times it occurs in the argument list.import java.util.*; public class Freq { public static void main(String[] args) { Map<String, Integer> m = new HashMap<String, Integer>(); // Initialize frequency table from command line for (String a : args) { Integer freq = m.get(a); m.put(a, (freq == null) ? 1 : freq + 1); } System.out.println(m.size() + " distinct words:"); System.out.println(m); } }
put
statement. That argument is a conditional expression that has the effect of setting the frequency to one if the word has never been seen before or one more than its current value if the word has already been seen. Try running this program with the command:java Freq if it is to be it is up to me to delegate
8 distinct words: {to=3, delegate=1, be=1, it=2, up=1, if=1, me=1, is=2}
Map
from HashMap
to TreeMap
. Making this four-character change causes the program to generate the following output from the same command line.8 distinct words: {be=1, delegate=1, if=1, is=2, it=2, me=1, to=3, up=1}
LinkedHashMap
. Doing so results in the following output.8 distinct words: {if=1, it=2, is=2, to=3, be=1, up=1, me=1, delegate=1}
Like the
Set
and List
interfaces, Map
strengthens the requirements on the equals
and hashCode
methods so that two Map
objects can be compared for logical equality without regard to their implementation types. Two Map
instances are equal if they represent the same key-value mappings.By convention, all general-purpose
Map
implementations provide constructors that take a Map
object and initialize the new Map
to contain all the key-value mappings in the specified Map
. This standard Map
conversion constructor is entirely analogous to the standard Collection
constructor: It allows the caller to create a Map
of a desired implementation type that initially contains all of the mappings in another Map
, regardless of the other Map
's implementation type. For example, suppose you have a Map
, named m
. The following one-liner creates a new HashMap
initially containing all of the same key-value mappings as m
.Map<K, V> copy = new HashMap<K, V>(m);
Map Interface Bulk Operations
Theclear
operation does exactly what you would think it could do: It removes all the mappings from the Map
. The putAll
operation is the Map
analogue of the Collection
interface's addAll
operation. In addition to its obvious use of dumping one Map
into another, it has a second, more subtle use. Suppose a Map
is used to represent a collection of attribute-value pairs; the putAll
operation, in combination with the Map
conversion constructor, provides a neat way to implement attribute map creation with default values. The following is a static factory method that demonstrates this technique.static <K, V> Map<K, V> newAttributeMap(Map<K, V>defaults, Map<K, V> overrides) { Map<K, V> result = new HashMap<K, V>(defaults); result.putAll(overrides); return result; }
Collection Views
TheCollection
view methods allow a Map
to be viewed as a Collection
in these three ways:keySet
— theSet
of keys contained in theMap
.values
— TheCollection
of values contained in theMap
. ThisCollection
is not aSet
, because multiple keys can map to the same value.entrySet
— theSet
of key-value pairs contained in theMap
. TheMap
interface provides a small nested interface calledMap.Entry
, the type of the elements in thisSet
.
Collection
views provide the only means to iterate over a Map
. This example illustrates the standard idiom for iterating over the keys in a Map
with a for-each
construct:for (KeyType key : m.keySet()) System.out.println(key);
iterator
:// Filter a map based on some // property of its keys. for (Iterator<Type> it = m.keySet().iterator(); it.hasNext(); ) if (it.next().isBogus()) it.remove();
for (Map.Entry<KeyType, ValType> e : m.entrySet()) System.out.println(e.getKey() + ": " + e.getValue());
Map
has to create a new Collection
instance each time a Collection
view operation is called. Rest easy: There's no reason that a Map
cannot always return the same object each time it is asked for a given Collection
view. This is precisely what all the Map
implementations in java.util
do.With all three
Collection
views, calling an Iterator
's remove
operation removes the associated entry from the backing Map
, assuming that the backing Map
supports element removal to begin with. This is illustrated by the preceding filtering idiom.With the
entrySet
view, it is also possible to change the value associated with a key by calling a Map.Entry
's setValue
method during iteration (again, assuming the Map
supports value modification to begin with). Note that these are the only safe ways to modify a Map
during iteration; the behavior is unspecified if the underlying Map
is modified in any other way while the iteration is in progress.The
Collection
views support element removal in all its many forms — remove
, removeAll
, retainAll
, and clear
operations, as well as the Iterator.remove
operation. (Yet again, this assumes that the backing Map
supports element removal.)The
Collection
views do not support element addition under any circumstances. It would make no sense for the keySet
and values
views, and it's unnecessary for the entrySet
view, because the backing Map
's put
and putAll
methods provide the same functionality.Fancy Uses of Collection Views: Map Algebra
When applied to theCollection
views, bulk operations (containsAll
, removeAll
, and retainAll
) are surprisingly potent tools. For starters, suppose you want to know whether one Map
is a submap of another — that is, whether the first Map
contains all the key-value mappings in the second. The following idiom does the trick.if (m1.entrySet().containsAll(m2.entrySet())) { ... }
Map
objects contain mappings for all of the same keys.if (m1.keySet().equals(m2.keySet())) { ... }
Map
that represents a collection of attribute-value pairs, and two Set
s representing required attributes and permissible attributes. (The permissible attributes include the required attributes.) The following snippet determines whether the attribute map conforms to these constraints and prints a detailed error message if it doesn't.static <K, V> boolean validate(Map<K, V> attrMap, Set<K> requiredAttrs, Set<K>permittedAttrs) { boolean valid = true; Set<K> attrs = attrMap.keySet(); if (! attrs.containsAll(requiredAttrs)) { Set<K> missing = new HashSet<K>(requiredAttrs); missing.removeAll(attrs); System.out.println("Missing attributes: " + missing); valid = false; } if (! permittedAttrs.containsAll(attrs)) { Set<K> illegal = new HashSet<K>(attrs); illegal.removeAll(permittedAttrs); System.out.println("Illegal attributes: " + illegal); valid = false; } return valid; }
Map
objects.Set<KeyType>commonKeys = new HashSet<KeyType>(m1.keySet()); commonKeys.retainAll(m2.keySet());
All the idioms presented thus far have been nondestructive; that is, they don't modify the backing
Map
. Here are a few that do. Suppose you want to remove all of the key-value pairs that one Map
has in common with another.m1.entrySet().removeAll(m2.entrySet());
Map
all of the keys that have mappings in another.m1.keySet().removeAll(m2.keySet());
Map
, managers
, that maps each employee in a company to the employee's manager. We'll be deliberately vague about the types of the key and the value objects. It doesn't matter, as long as they're the same. Now suppose you want to know who all the "individual contributors" (or nonmanagers) are. The following snippet tells you exactly what you want to know.Set<Employee> individualContributors = new HashSet<Employee>(managers.keySet()); individualContributors.removeAll(managers.values());
Employee simon = ... ; managers.values().removeAll(Collections.singleton(simon));
Collections.singleton
, a static factory method that returns an immutable Set
with the single, specified element.Once you've done this, you may have a bunch of employees whose managers no longer work for the company (if any of Simon's direct-reports were themselves managers). The following code will tell you which employees have managers who no longer works for the company.
Map<Employee, Employee> m = new HashMap<Employee, Employee>(managers); m.values().removeAll(managers.keySet()); Set<Employee> slackers = m.keySet();
Map
, and it removes from the temporary copy all entries whose (manager) value is a key in the original Map
. Remember that the original Map
has an entry for each employee. Thus, the remaining entries in the temporary Map
comprise all the entries from the original Map
whose (manager) values are no longer employees. The keys in the temporary copy, then, represent precisely the employees that we're looking for.There are many more idioms like the ones contained in this section, but it would be impractical and tedious to list them all. Once you get the hang of it, it's not that difficult to come up with the right one when you need it.
Multimaps
A multimap is like aMap
but it can map each key to multiple values. The Java Collections Framework doesn't include an interface for multimaps because they aren't used all that commonly. It's a fairly simple matter to use a Map
whose values are List
instances as a multimap. This technique is demonstrated in the next code example, which reads a word list containing one word per line (all lowercase) and prints out all the anagram groups that meet a size criterion. An anagram group is a bunch of words, all of which contain exactly the same letters but in a different order. The program takes two arguments on the command line: (1) the name of the dictionary file and (2) the minimum size of anagram group to print out. Anagram groups containing fewer words than the specified minimum are not printed.There is a standard trick for finding anagram groups: For each word in the dictionary, alphabetize the letters in the word (that is, reorder the word's letters into alphabetical order) and put an entry into a multimap, mapping the alphabetized word to the original word. For example, the word bad causes an entry mapping abd into bad to be put into the multimap. A moment's reflection will show that all the words to which any given key maps form an anagram group. It's a simple matter to iterate over the keys in the multimap, printing out each anagram group that meets the size constraint.
The following program
is a straightforward implementation of this technique.import java.util.*; import java.io.*; public class Anagrams { public static void main(String[] args) { int minGroupSize = Integer.parseInt(args[1]); // Read words from file and put into a simulated multimap Map<String, List<String>> m = new HashMap<String, List<String>>(); try { Scanner s = new Scanner(new File(args[0])); while (s.hasNext()) { String word = s.next(); String alpha = alphabetize(word); List<String> l = m.get(alpha); if (l == null) m.put(alpha, l=new ArrayList<String>()); l.add(word); } } catch (IOException e) { System.err.println(e); System.exit(1); } // Print all permutation groups above size threshold for (List<String> l : m.values()) if (l.size() >= minGroupSize) System.out.println(l.size() + ": " + l); } private static String alphabetize(String s) { char[] a = s.toCharArray(); Arrays.sort(a); return new String(a); } }
9: [estrin, inerts, insert, inters, niters, nitres, sinter, triens, trines] 8: [lapse, leaps, pales, peals, pleas, salep, sepal, spale] 8: [aspers, parses, passer, prases, repass, spares, sparse, spears] 10: [least, setal, slate, stale, steal, stela, taels, tales, teals, tesla] 8: [enters, nester, renest, rentes, resent, tenser, ternes, treens] 8: [arles, earls, lares, laser, lears, rales, reals, seral] 8: [earings, erasing, gainers, reagins, regains, reginas, searing, seringa] 8: [peris, piers, pries, prise, ripes, speir, spier, spire] 12: [apers, apres, asper, pares, parse, pears, prase, presa, rapes, reaps, spare, spear] 11: [alerts, alters, artels, estral, laster, ratels, salter, slater, staler, stelar, talers] 9: [capers, crapes, escarp, pacers, parsec, recaps, scrape, secpar, spacer] 9: [palest, palets, pastel, petals, plates, pleats, septal, staple, tepals] 9: [anestri, antsier, nastier, ratines, retains, retinas, retsina, stainer, stearin] 8: [ates, east, eats, etas, sate, seat, seta, teas] 8: [carets, cartes, caster, caters, crates, reacts, recast, traces]
dictionary file
we used. It was derived from the Public Domain ENABLE benchmark reference word list.
No comments:
Post a Comment