billybonce

Parallel operations on arrays in Java 8 - translation

Published in the Random EN group
Translation of the article "Parallel Array Operations in Java 8" by Eric Bruno, March 25, 2014 (drdobbs.com/jvm/parallel-array-operations-in-java-8/240166287). Eric Bruno works in the financial sector and blogs for Dr. Dobb's.
The new release of Java makes it easier to work with arrays in parallel, which can significantly improve performance with a minimum of coding. Oracle is now releasing Java SE 8, a huge step forward for the language. One of the important features of this release is improved concurrency support, some of which appears in the java.util.Arrays utility class. New methods have been added to this class, and I describe them in this article. Some of them rely on another new feature of JDK 8: lambdas. But let's get down to business.
Arrays.parallelSort()
parallelSort is based on a parallel merge sort algorithm that recursively splits an array into parts, sorts the parts, and then merges them into the final array, with the work spread across threads. Using it instead of the existing sequential Arrays.sort method improves performance when sorting large arrays. For example, the code below uses sequential sort() and parallel parallelSort() to sort the same data array:

    import java.awt.image.BufferedImage;
    import java.io.File;
    import java.util.Arrays;
    import javax.imageio.ImageIO;

    public class ParallelSort {

        public static void main(String[] args) {
            ParallelSort mySort = new ParallelSort();
            int[] src = null;

            System.out.println("\nSerial sort:");
            src = mySort.getData();
            mySort.sortIt(src, false);

            System.out.println("\nParallel sort:");
            src = mySort.getData();
            mySort.sortIt(src, true);
        }

        // Sorts the array with either method and prints the elapsed time in milliseconds
        public void sortIt(int[] src, boolean parallel) {
            try {
                System.out.println("--Array size: " + src.length);
                long start = System.currentTimeMillis();
                if (parallel) {
                    Arrays.parallelSort(src);
                } else {
                    Arrays.sort(src);
                }
                long end = System.currentTimeMillis();
                System.out.println("--Elapsed sort time: " + (end - start));
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        // Loads the pixels of an image and repeats them 20 times to build a large array
        private int[] getData() {
            try {
                File file = new File("src/parallelsort/myimage.png");
                BufferedImage image = ImageIO.read(file);
                int w = image.getWidth();
                int h = image.getHeight();
                int[] src = image.getRGB(0, 0, w, h, null, 0, w);
                int[] data = new int[src.length * 20];
                for (int i = 0; i < 20; i++) {
                    System.arraycopy(src, 0, data, i * src.length, src.length);
                }
                return data;
            } catch (Exception e) {
                e.printStackTrace();
            }
            return null;
        }
    }

To test, I loaded the raw data from an image into the array, which took up 46,083,360 bytes (yours will depend on the image you use). The sequential sort took almost 3,000 milliseconds to sort the array on my 4-core laptop, while the parallel sort took about 700 milliseconds at most. You will agree that it is not often that a language update improves the performance of a class fourfold.
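As a rough way to try the comparison without an image file, the sketch below sorts the same synthetic data with both methods and checks that the results match. The class name ParallelSortSketch, the helper names, the array size, and the seed are illustrative assumptions and not part of the original article; timings will differ from machine to machine.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical demo class, not from the original article.
class ParallelSortSketch {

    // Builds a reproducible array of pseudo-random ints (the seed is arbitrary).
    static int[] randomData(int size, long seed) {
        return new Random(seed).ints(size).toArray();
    }

    // Returns a copy of src sorted with Arrays.parallelSort; src itself is not modified.
    static int[] parallelSortedCopy(int[] src) {
        int[] copy = Arrays.copyOf(src, src.length);
        Arrays.parallelSort(copy);
        return copy;
    }

    public static void main(String[] args) {
        int[] data = randomData(5_000_000, 42L);

        // Sequential sort on its own copy of the data.
        int[] serial = Arrays.copyOf(data, data.length);
        long t0 = System.nanoTime();
        Arrays.sort(serial);
        long serialMs = (System.nanoTime() - t0) / 1_000_000;

        // Parallel sort on a second copy (the copy is included in this timing).
        long t1 = System.nanoTime();
        int[] parallel = parallelSortedCopy(data);
        long parallelMs = (System.nanoTime() - t1) / 1_000_000;

        System.out.println("Serial sort ms:   " + serialMs);
        System.out.println("Parallel sort ms: " + parallelMs);
        System.out.println("Same result: " + Arrays.equals(serial, parallel));
    }
}
```

A single timed run like this is only indicative; JIT warm-up and the number of available cores both affect the numbers.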
Arrays.parallelPrefix()
The parallelPrefix method applies a specified mathematical function to the elements of an array cumulatively, computing the results within the array in parallel. On modern multi-core hardware this is much more efficient than sequential operations on large arrays. There are overloads of this method for different element types, each taking the matching operator type (for example, IntBinaryOperator, DoubleBinaryOperator, LongBinaryOperator, and so on), and the operator can implement different mathematical operations. Here is an example of parallel array accumulation using the same large array as in the previous example, which completed in about 100 milliseconds on my 4-core laptop.

    // Nested in the ParallelSort class from the previous example;
    // also requires: import java.util.function.IntBinaryOperator;
    public class MyIntOperator implements IntBinaryOperator {
        @Override
        public int applyAsInt(int left, int right) {
            return left + right;
        }
    }

    public void accumulate() {
        int[] src = null;

        // accumulate test
        System.out.println("\nParallel prefix:");
        src = getData();
        IntBinaryOperator op = new ParallelSort.MyIntOperator();
        long start = System.currentTimeMillis();
        Arrays.parallelPrefix(src, op);
        long end = System.currentTimeMillis();
        System.out.println("--Elapsed prefix time: " + (end - start));
    }
    ...
    }
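To make the cumulative behavior concrete, here is a minimal sketch on a tiny array, where each element is replaced by the result of combining it with everything before it. The class ParallelPrefixSketch and its method names are hypothetical; only Arrays.parallelPrefix, Integer::sum, and Math::max come from the JDK.

```java
import java.util.Arrays;

// Hypothetical demo class, not from the original article.
class ParallelPrefixSketch {

    // Running total: out[i] = src[0] + src[1] + ... + src[i].
    static int[] runningSum(int[] src) {
        int[] out = Arrays.copyOf(src, src.length);
        Arrays.parallelPrefix(out, Integer::sum);
        return out;
    }

    // Running maximum: out[i] = max(src[0], ..., src[i]).
    static int[] runningMax(int[] src) {
        int[] out = Arrays.copyOf(src, src.length);
        Arrays.parallelPrefix(out, Math::max);
        return out;
    }

    public static void main(String[] args) {
        // prints [1, 3, 6, 10, 15]
        System.out.println(Arrays.toString(runningSum(new int[] {1, 2, 3, 4, 5})));
        // prints [3, 3, 4, 4, 5]
        System.out.println(Arrays.toString(runningMax(new int[] {3, 1, 4, 1, 5})));
    }
}
```

The operator should be associative and free of side effects, as addition and max are, because the parallel implementation may combine partial results in a different grouping than a simple left-to-right loop would.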
Arrays.parallelSetAll()
The new parallelSetAll() method fills an array, setting each element to the value produced by a generator function and using parallelism to improve efficiency. This method is based on lambdas (called "closures" in other languages) (translator's note: this is the author's mistake, because lambdas and closures are different things), which are another new feature of JDK 8 that we will discuss in future articles. For now it is enough to note that a lambda, whose syntax is easy to recognize by the -> operator, performs the operation to the right of the arrow on the arguments passed to it. In the code example below, the operation is performed for each element of the array, indexed by i. Arrays.parallelSetAll() generates the array elements. For example, the following code fills a large array with random integer values:

    // Requires: import java.util.Arrays; import java.util.Random;
    public void createLargeArray() {
        Integer[] array = new Integer[1024 * 1024 * 4]; // 4M
        Arrays.parallelSetAll(
            array, i -> Integer.valueOf(new Random().nextInt()));
    }

To create a more complex element generator (for example, one that generates values based on readings from real-world sensors), you could use code similar to the following:

    public void createLargeArray() {
        Integer[] array = new Integer[1024 * 1024 * 4]; // 4M
        Arrays.parallelSetAll(
            array, i -> Integer.valueOf(customGenerator(getNextSensorValue())));
    }

    public int customGenerator(int arg) {
        return arg + 1; // some fancy formula here...
    }

    public int getNextSensorValue() {
        // Just random for illustration
        return new Random().nextInt();
    }

Start with getNextSensorValue(), which in a real application would ask a sensor (for example, a thermometer) for its current value. Here, for illustration, it generates a random value. The customGenerator() method then computes the array element from that value, using whatever logic your case requires. Here it is a simple addition, but in real cases it would be something more complicated.
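Because the generator receives the element index, it can also compute values directly from i. The sketch below shows that, plus one possible way to produce random values without creating a new Random per element, using ThreadLocalRandom. The class ParallelSetAllSketch, its method names, and the choice of ThreadLocalRandom are assumptions for illustration, not the article's approach.

```java
import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical demo class, not from the original article.
class ParallelSetAllSketch {

    // Each element depends only on its own index: out[i] = i * i.
    static long[] squares(int size) {
        long[] out = new long[size];
        Arrays.parallelSetAll(out, i -> (long) i * i);
        return out;
    }

    // Random values in [0, bound); ThreadLocalRandom gives each worker thread
    // its own generator, so no Random object is allocated per element.
    static int[] randomInts(int size, int bound) {
        int[] out = new int[size];
        Arrays.parallelSetAll(out, i -> ThreadLocalRandom.current().nextInt(bound));
        return out;
    }

    public static void main(String[] args) {
        // prints [0, 1, 4, 9, 16]
        System.out.println(Arrays.toString(squares(5)));
        System.out.println(Arrays.toString(randomInts(5, 100)));
    }
}
```

The generator may be called from several threads and in no particular index order, so it is safest when it does not depend on shared mutable state.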
What is Spliterator?
Another addition to the Arrays class that makes use of concurrency and lambdas is the Spliterator, which is used to iterate over and split an array. It is not limited to arrays: it also works well for Collection classes and IO channels. A Spliterator works by splitting an array into parts, with a new Spliterator created to operate on each resulting subarray. Its name combines "split" and "iterator": it is an iterator that can split its traversal work into parts. Using our same data, we can perform an action on each element of the array through a Spliterator as follows:

    // Requires: import java.util.Arrays; import java.util.Spliterator;
    public void spliterate() {
        System.out.println("\nSpliterate:");
        int[] src = getData();
        // Arrays.spliterator(int[]) returns the primitive specialization Spliterator.OfInt
        Spliterator.OfInt spliterator = Arrays.spliterator(src);
        // The explicit int parameter selects the IntConsumer overload
        spliterator.forEachRemaining((int n) -> action(n));
    }

    public void action(int value) {
        System.out.println("value:" + value);
        // Perform some real work on this data here...
    }

Processing data this way can take advantage of parallelism once the spliterator is split with trySplit() and the parts are handed to different threads; forEachRemaining() on its own traverses the elements in the calling thread. You can also tune how the splitting is done, such as the minimum size of each subarray.
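The splitting side of a Spliterator can be sketched as follows: trySplit() hands off part of the array as a new Spliterator, which can then be traversed on another thread while the original covers the rest. The class SpliteratorSketch, its method names, and the use of CompletableFuture for the second thread are assumptions made for this illustration; the JDK's own parallel streams do this splitting internally in a more elaborate way.

```java
import java.util.Arrays;
import java.util.Spliterator;
import java.util.concurrent.CompletableFuture;
import java.util.function.IntConsumer;

// Hypothetical demo class, not from the original article.
class SpliteratorSketch {

    // Sums every element still covered by the given spliterator (sequentially).
    static long sumRemaining(Spliterator.OfInt part) {
        long[] total = {0L};
        IntConsumer add = v -> total[0] += v;
        part.forEachRemaining(add);
        return total[0];
    }

    // Splits the array once and sums the two parts on two threads.
    static long splitSum(int[] src) {
        Spliterator.OfInt rest = Arrays.spliterator(src);
        // trySplit() returns a spliterator over a leading portion,
        // or null when the array is too small to split.
        Spliterator.OfInt head = rest.trySplit();

        CompletableFuture<Long> headSum;
        if (head == null) {
            headSum = CompletableFuture.completedFuture(0L);
        } else {
            headSum = CompletableFuture.supplyAsync(() -> sumRemaining(head));
        }
        long restSum = sumRemaining(rest); // runs on the calling thread
        return headSum.join() + restSum;
    }

    public static void main(String[] args) {
        // prints 55
        System.out.println(splitSum(new int[] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}));
    }
}
```

Splitting only once keeps the sketch short; a real workload would typically keep splitting until the parts fall below some size threshold.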
Stream processing
Finally, from an array you can create a Stream object, which allows parallel processing of a data set as a whole, treated as a stream sequence. The difference between a Collection and a Stream in the new JDK 8 is that collections let you work with elements individually, whereas a Stream does not. For example, with collections you can add elements, remove them, and insert them in the middle. A Stream does not let you manipulate individual elements of a data set, but instead lets you apply functions to the data as a whole. You can perform such useful operations as extracting only distinct values (ignoring repetitions) from a set, transforming data, finding the minimum and maximum of an array, map-reduce functions (used in distributed computing), and other mathematical operations. The following simple example processes an array of data in parallel and sums the elements.

    // Requires: import java.util.Arrays; import java.util.stream.IntStream;
    public void streamProcessing() {
        int[] src = getData();
        // Arrays.stream() is sequential by default; parallel() switches to parallel execution
        IntStream stream = Arrays.stream(src).parallel();
        int sum = stream.sum();
        System.out.println("\nSum: " + sum);
    }
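A few of the whole-data operations mentioned above can be sketched on a small array. The class ArrayStreamSketch and its method names are hypothetical. The sum is widened to long here because IntStream.sum() returns an int and can silently overflow on large inputs; that widening is a choice made for this sketch, not something the original example does.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;

// Hypothetical demo class, not from the original article.
class ArrayStreamSketch {

    // Parallel sum, widened to long to avoid int overflow.
    static long parallelSum(int[] src) {
        return Arrays.stream(src).parallel().asLongStream().sum();
    }

    // Distinct values only, in ascending order.
    static int[] distinctSorted(int[] src) {
        return Arrays.stream(src).parallel().distinct().sorted().toArray();
    }

    // Count, min, max, sum, and average gathered in a single pass.
    static IntSummaryStatistics stats(int[] src) {
        return Arrays.stream(src).parallel().summaryStatistics();
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 3, 2, 1};
        // prints 10
        System.out.println(parallelSum(data));
        // prints [1, 2, 3]
        System.out.println(Arrays.toString(distinctSorted(data)));
        IntSummaryStatistics s = stats(data);
        // prints min=1 max=3
        System.out.println("min=" + s.getMin() + " max=" + s.getMax());
    }
}
```

For arrays this small a parallel stream is slower than a sequential one; the parallel() call only pays off on large data sets.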
Conclusion
Java 8 will definitely be one of the most useful updates to the language. The parallel features mentioned here, lambdas, and many other extensions will be the subject of other Java 8 reviews on our site.