billybonce

Parallel Operations on Arrays in Java 8

Published in the Random EN group
article translation
// Parallel Array Operations in Java 8
// By Eric Bruno, March 25, 2014
// drdobbs.com/jvm/parallel-array-operations-in-java-8/240166287
// Eric Bruno works in the financial sector and blogs for Dr. Dobb's.
The new release of Java makes it easier to work with arrays in parallel, which can greatly improve performance with a minimum of coding. Oracle is now releasing Java SE 8, which is a huge step forward for the language. One of the highlights of this release is improved concurrency support, part of which appears in the java.util.Arrays utility class. New methods have been added to this class, and I will describe them in this article. Some of them rely on another new JDK 8 feature, lambdas. But let's get down to business.
Arrays.parallelSort()
parallelSort is based on a parallel merge sort algorithm that recursively splits an array into parts, sorts them, and then merges the sorted parts back into the final array. Using it in place of the existing, sequential Arrays.sort method can improve performance and efficiency when sorting large arrays. For example, the code below uses the sequential sort() and the parallel parallelSort() to sort the same array of data:

    import java.awt.image.BufferedImage;
    import java.io.File;
    import java.util.Arrays;
    import javax.imageio.ImageIO;

    public class ParallelSort {
        public static void main(String[] args) {
            ParallelSort mySort = new ParallelSort();
            int[] src = null;

            System.out.println("\nSerial sort:");
            src = mySort.getData();
            mySort.sortIt(src, false);

            // Reload the data so both runs sort the same unsorted input
            System.out.println("\nParallel sort:");
            src = mySort.getData();
            mySort.sortIt(src, true);
        }

        // Sorts the array sequentially or in parallel and prints the elapsed time
        public void sortIt(int[] src, boolean parallel) {
            try {
                System.out.println("--Array size: " + src.length);
                long start = System.currentTimeMillis();
                if (parallel) {
                    Arrays.parallelSort(src);
                } else {
                    Arrays.sort(src);
                }
                long end = System.currentTimeMillis();
                System.out.println("--Elapsed sort time: " + (end - start));
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        // Reads the pixels of an image and repeats them 20 times to get a large array
        private int[] getData() {
            try {
                File file = new File("src/parallelsort/myimage.png");
                BufferedImage image = ImageIO.read(file);
                int w = image.getWidth();
                int h = image.getHeight();
                int[] src = image.getRGB(0, 0, w, h, null, 0, w);
                int[] data = new int[src.length * 20];
                for (int i = 0; i < 20; i++) {
                    System.arraycopy(src, 0, data, i * src.length, src.length);
                }
                return data;
            } catch (Exception e) {
                e.printStackTrace();
            }
            return null;
        }
    }

To test, I loaded the raw pixel data from an image into an array, which took 46,083,360 bytes (your figure will depend on the image you use). The sequential sort took almost 3,000 milliseconds to sort the array on my quad-core laptop, while the parallel sort took about 700 milliseconds at most. You will agree that it is not often that a language update speeds up an operation roughly fourfold.
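If you do not have a suitable image at hand, the same comparison can be sketched with generated data. The version below is only an illustration under assumptions of my own: the class name SortComparison, the helper sortedCopy, the fixed seed and the array size are all hypothetical and do not come from the original article. The timings include the array copy and a single unwarmed run, so treat them as a rough indication rather than a benchmark.

```java
import java.util.Arrays;
import java.util.Random;

class SortComparison {

    // Returns a sorted copy of the input, leaving the input itself untouched.
    static int[] sortedCopy(int[] data, boolean parallel) {
        int[] copy = Arrays.copyOf(data, data.length);
        if (parallel) {
            Arrays.parallelSort(copy);
        } else {
            Arrays.sort(copy);
        }
        return copy;
    }

    public static void main(String[] args) {
        // Hypothetical test data: 5 million pseudo-random ints from a fixed seed.
        int[] data = new Random(42).ints(5_000_000).toArray();

        long t0 = System.nanoTime();
        int[] sequential = sortedCopy(data, false);
        long t1 = System.nanoTime();
        int[] parallel = sortedCopy(data, true);
        long t2 = System.nanoTime();

        System.out.println("Sequential ms: " + (t1 - t0) / 1_000_000);
        System.out.println("Parallel ms:   " + (t2 - t1) / 1_000_000);
        // Both methods must produce exactly the same ordering.
        System.out.println("Same result:   " + Arrays.equals(sequential, parallel));
    }
}
```

The speedup you see depends on the number of cores and on the array size. For small arrays the parallel version is unlikely to be faster.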
Arrays.parallelPrefix()
The parallelPrefix method cumulatively applies the specified function to the elements of an array, storing the running results in the array itself and doing the work in parallel. On modern multi-core hardware this can be much more efficient than a sequential loop over a large array. There are overloads of this method for the different element types (taking an IntBinaryOperator, a LongBinaryOperator, a DoubleBinaryOperator, or a generic BinaryOperator for object arrays), and you can supply any associative operator. Here is an example of a running sum over the same large array as in the previous example, which completes in about 100 milliseconds on my 4-core laptop.

    import java.util.Arrays;
    import java.util.function.IntBinaryOperator;

    public class ParallelSort {
        // ... main(), sortIt() and getData() from the previous example ...

        // The operator applied to each pair (running result, next element)
        public static class MyIntOperator implements IntBinaryOperator {
            @Override
            public int applyAsInt(int left, int right) {
                return left + right;
            }
        }

        public void accumulate() {
            int[] src = null;
            // accumulate test
            System.out.println("\nParallel prefix:");
            src = getData();
            IntBinaryOperator op = new ParallelSort.MyIntOperator();
            long start = System.currentTimeMillis();
            // Afterwards src[i] holds the sum of the original src[0..i]
            Arrays.parallelPrefix(src, op);
            long end = System.currentTimeMillis();
            System.out.println("--Elapsed prefix time: " + (end - start));
        }
        ...
    }
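To see what the cumulative operation actually produces, it helps to run it on an array small enough to check by hand. The sketch below is my own illustration, not code from the original article; the class name PrefixDemo and its helper methods are hypothetical. It passes method references instead of a hand-written operator class, which is one possible way to supply the operator.

```java
import java.util.Arrays;

class PrefixDemo {

    // Running sum: element i becomes values[0] + ... + values[i].
    static int[] runningSum(int[] values) {
        int[] result = Arrays.copyOf(values, values.length);
        Arrays.parallelPrefix(result, Integer::sum);
        return result;
    }

    // Running maximum: element i becomes the largest of values[0..i].
    static int[] runningMax(int[] values) {
        int[] result = Arrays.copyOf(values, values.length);
        Arrays.parallelPrefix(result, Math::max);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(runningSum(new int[] {2, 1, 0, 3})));
        // prints [2, 3, 3, 6]
        System.out.println(Arrays.toString(runningMax(new int[] {3, 1, 4, 1, 5})));
        // prints [3, 3, 4, 4, 5]
    }
}
```

The operator should be associative and free of side effects, because the parallel implementation is free to combine partial results in a different grouping than a simple left-to-right loop would.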
Arrays.parallelSetAll()
The new parallelSetAll() method fills an existing array, setting each element to the value produced by a generator function that receives the element's index, and uses parallelism to improve efficiency. This method relies on lambdas (called "closures" in other languages) (translator's note: this is the author's mistake, because lambdas and closures are different things), which are another new addition to JDK 8 that we will discuss in future articles. For now it is enough to note that a lambda is easily recognized by the -> operator: the parameters are on the left of the arrow, and the expression on the right is evaluated for each value passed in. In the code examples below, the lambda is invoked for each element of the array, with i being its index, and Arrays.parallelSetAll() stores the result in that element. For example, the following code fills a large array with random integer values:

    import java.util.Arrays;
    import java.util.concurrent.ThreadLocalRandom;

    public void createLargeArray() {
        Integer[] array = new Integer[1024 * 1024 * 4]; // 4M
        // ThreadLocalRandom avoids creating a new Random for every element;
        // the int result is autoboxed into the Integer[] array
        Arrays.parallelSetAll(array, i -> ThreadLocalRandom.current().nextInt());
    }

To create a more complex array generator (for example, one that generates values based on readings from real-world sensors), you can use code similar to the following:

    public void createLargeArray() {
        Integer[] array = new Integer[1024 * 1024 * 4]; // 4M
        Arrays.parallelSetAll(array, i -> customGenerator(getNextSensorValue()));
    }

    public int customGenerator(int arg) {
        return arg + 1; // some fancy formula here...
    }

    public int getNextSensorValue() {
        // Just random for illustration
        return ThreadLocalRandom.current().nextInt();
    }

Start with getNextSensorValue(), which in a real application would ask a sensor (such as a thermometer) to return its current value. Here, for the sake of the example, it simply generates a random value. The customGenerator() method then computes the array element from that reading, using whatever logic your case requires. Here it is a trivial addition, but in real cases it would be something more complicated.
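The examples above ignore the index i that the generator receives. When each value depends only on its position, parallelSetAll() gives a reproducible result regardless of how the work is divided between threads. The following is a minimal sketch of that idea, assuming nothing beyond the standard library; the class name SetAllDemo and the methods squares and sineTable are hypothetical names chosen for illustration.

```java
import java.util.Arrays;

class SetAllDemo {

    // Fills a new int array so that element i holds i * i.
    static int[] squares(int size) {
        int[] result = new int[size];
        Arrays.parallelSetAll(result, i -> i * i);
        return result;
    }

    // Fills a lookup table with one full period of the sine function.
    static double[] sineTable(int size) {
        double[] table = new double[size];
        Arrays.parallelSetAll(table, i -> Math.sin(2 * Math.PI * i / size));
        return table;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(squares(5)));
        // prints [0, 1, 4, 9, 16]
        System.out.println("Table entries: " + sineTable(1024).length);
    }
}
```

Because elements may be computed in any order and on several threads, the generator should not depend on shared mutable state.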
What is a Spliterator?
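Another addition that the Arrays class exposes, and one that works with parallelism and lambdas, is the Spliterator, which is used to iterate over and split an array. Its use is not limited to arrays: it also works well for Collection classes and IO channels. A Spliterator works by splitting off part of its elements into a new Spliterator, so that each Spliterator can then perform operations on its own portion of the array. Its name comes from "splittable iterator": an Iterator that can split its traversal work into pieces. Using the same data as before, we can traverse our array with a Spliterator like this:

    import java.util.Arrays;
    import java.util.Spliterator;
    import java.util.function.IntConsumer;

    public void spliterate() {
        System.out.println("\nSpliterate:");
        int[] src = getData();
        // Arrays.spliterator(int[]) returns the primitive specialization Spliterator.OfInt
        Spliterator.OfInt spliterator = Arrays.spliterator(src);
        // Typing the action as IntConsumer selects the non-boxing overload
        IntConsumer work = this::action;
        spliterator.forEachRemaining(work);
    }

    public void action(int value) {
        System.out.println("value:" + value);
        // Perform some real work on this data here...
    }

Note that forEachRemaining() on its own visits the elements sequentially in the calling thread. The parallelism comes from calling trySplit() and handing the resulting parts to different threads, which is what parallel streams do for you behind the scenes. The code that does the splitting also decides how small the parts may become, for example by checking estimateSize() before splitting again.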
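To make the splitting itself visible, here is a small sketch that splits an array's Spliterator once with trySplit() and processes the two parts in two threads. It is an illustration under my own assumptions rather than code from the original article: the class name SplitSumDemo and the methods sum and sumInTwoThreads are hypothetical, and a real program would normally leave this work to a parallel stream or a thread pool instead of creating threads by hand.

```java
import java.util.Arrays;
import java.util.Spliterator;
import java.util.function.IntConsumer;

class SplitSumDemo {

    // Sums whatever elements the given spliterator still covers.
    static long sum(Spliterator.OfInt part) {
        long[] total = {0L};
        IntConsumer add = value -> total[0] += value;
        part.forEachRemaining(add);
        return total[0];
    }

    // Splits the array once and sums the two parts in two threads.
    static long sumInTwoThreads(int[] data) {
        Spliterator.OfInt rest = Arrays.spliterator(data);
        // trySplit() hands back a spliterator for part of the elements,
        // or null when the source cannot be split any further.
        Spliterator.OfInt prefix = rest.trySplit();
        if (prefix == null) {
            return sum(rest);
        }

        long[] prefixSum = new long[1];
        Thread worker = new Thread(() -> prefixSum[0] = sum(prefix));
        worker.start();
        long restSum = sum(rest); // the calling thread handles the other part
        try {
            worker.join(); // join() makes the worker's result visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for worker", e);
        }
        return prefixSum[0] + restSum;
    }

    public static void main(String[] args) {
        System.out.println(sumInTwoThreads(new int[] {1, 2, 3, 4, 5}));
        // prints 15
    }
}
```

Each part is traversed by exactly one thread, so the two traversals never touch the same elements.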
Stream processing
Finally, from an array you can create a Stream object, which allows a data set to be processed in parallel as a whole, presented as a sequence (a stream). The difference between a Collection and a Stream from the new JDK 8 is that collections let you work with elements individually, while a stream does not. For example, with collections you can add items, remove items, and insert them in the middle. A stream does not let you manipulate individual elements of the data set, but instead lets you apply functions to the data as a whole. You can perform useful operations such as extracting only the distinct values (ignoring repetitions) from a set, transforming the data, finding the minimum and maximum of an array, map-reduce functions (as used in distributed computing), and other mathematical operations. The following simple example processes an array of data in parallel and sums the elements.

    import java.util.Arrays;
    import java.util.stream.IntStream;

    public void streamProcessing() {
        int[] src = getData();
        // parallel() switches the stream to parallel mode;
        // without it the sum is computed sequentially
        IntStream stream = Arrays.stream(src).parallel();
        int sum = stream.sum(); // an int sum wraps around silently on overflow
        System.out.println("\nSum: " + sum);
    }
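The other operations mentioned above (distinct values, maxima, and sums) can be sketched in the same style. This is an illustrative example of my own, assuming only the standard java.util.stream API; the class name StreamDemo and its method names are hypothetical. The sum is widened to long here, which is one way to sidestep int overflow on large arrays.

```java
import java.util.Arrays;

class StreamDemo {

    // Sums the elements in parallel, widening to long to avoid int overflow.
    static long parallelSum(int[] data) {
        return Arrays.stream(data).parallel().asLongStream().sum();
    }

    // Returns the distinct values of the array in ascending order.
    static int[] distinctSorted(int[] data) {
        return Arrays.stream(data).distinct().sorted().toArray();
    }

    // Returns the largest element, or throws if the array is empty.
    static int max(int[] data) {
        return Arrays.stream(data)
                .parallel()
                .max()
                .orElseThrow(() -> new IllegalArgumentException("empty array"));
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 3, 7, 1};
        System.out.println(parallelSum(data));                     // prints 15
        System.out.println(Arrays.toString(distinctSorted(data))); // prints [1, 3, 7]
        System.out.println(max(data));                             // prints 7
    }
}
```

None of these calls modifies the source array. Each pipeline describes a computation over the data as a whole and produces a new result.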
Conclusion
Java 8 will definitely be one of the most useful updates to the language. The parallel features mentioned here, lambdas, and many other extensions, will be covered on our site in other Java 8 reviews.