Fixstars Launches the World’s Highest Density SSD, “SSD-3000M” For Media and Entertainment Professionals

The highest density, performance, and reliability for professionals.

Sunnyvale, CA – Feb 17, 2015 – Fixstars Solutions Inc., an innovator in flash storage solutions, today announced the start of North American sales of the 3TB SSD-3000M and the 1TB SSD-1000M. The products feature enterprise-level reliability and unprecedented sequential read/write performance, aimed at professional content creation, advanced driver assistance systems (ADAS), HPC, and data centers.

The 3TB SSD-3000M has the world’s highest capacity*1 for a 2.5” SATA SSD. High-capacity SSDs reduce the number of drives required in professional setups, cutting operational costs such as maintenance, energy, and chassis/rack infrastructure. More importantly, a more reliable workflow with fewer handling failures is of significant value. These disks integrate Fixstars’ proprietary NAND controller, which prevents latency spikes and performance deterioration to ensure consistently high performance. Applications for which fast and stable disk writes are crucial, such as 4K video recording/editing and encrypted storage for film, will benefit the most from Fixstars solid state disks.


“The SSD-3000M/1000M were released in Japan last November and have been getting great feedback from our customers,” said Satoshi Miki, CEO & Co-Founder of Fixstars Corporation (Tokyo). “As an innovator in storage solutions, we are focused on providing high-performance, highly reliable SSD solutions to accelerate our customers’ business.”

For more information on the SSD-3000M/1000M, please visit our website.

*1: As of Nov 18th 2014, according to a survey by Fixstars Corp.

About Fixstars

Fixstars Solutions is a technology company devoted to its philosophy, “Speed up your Business”. Through its software parallelization/optimization expertise, its highly effective use of multi-core processors, and application acceleration for the next generation of flash memory technology that delivers high-speed IO as well as power savings, Fixstars Solutions provides “Green IT” while accelerating customers’ business in various fields. Learn more about how Fixstars Solutions can accelerate your business in life science, manufacturing, finance, and media & entertainment. For more information, visit www.fixstars.com.

[Tech Blog] PCIe SSD for Genome Assembly

Introduction


Genome assembly software takes a huge number of small pieces of DNA sequence (called “reads”) and tries to assemble them into a long DNA sequence that represents the original chromosomes. It generally consumes not only large computational power but also a large working memory space. The required memory depends on the input data size, but is often in the terabytes. Although it is true that the price of DRAM has dramatically decreased, a workstation or server with a few TB of memory is still very expensive. For example, the IBM System x3690 X5 can hold 2TB of memory, but its list price is more than $300k.

On the other hand, PCI Express (PCIe) SSD boards are on the rise. Many hardware vendors, such as Intel, Fusion-io (acquired by SanDisk), and OCZ (acquired by Toshiba), release a variety of PCIe SSD boards. Generally, they have lower bandwidth than DRAM, but much higher bandwidth than a standard SATA/SAS SSD. The price is a little high compared with a standard SATA SSD, but the Fusion-io ioFX 2.0 offers 1.6TB at $5,418.95 on Amazon.com. Even if you install this board in a high-end workstation, the total price is still below $10,000, which is much cheaper than a 2TB memory server.

In this blog post, I would like to explore whether using an SSD in place of DRAM yields a viable solution. We will use the open-source genome assembler “velvet” as our benchmark software.

Methods

First, I downloaded, compiled and installed “velvet” from the velvet web site.

$ tar xvfz velvet_1.2.08.tgz
$ cd velvet_1.2.08
$ make 'MAXKMERLENGTH=51' 'OPENMP=1'
$ sudo cp velvetg velveth /usr/local/bin

In this procedure, 'MAXKMERLENGTH=51' sets the maximum k-mer length, and 'OPENMP=1' enables multi-threading with OpenMP. The k-mer length is a very important parameter in genome assembly, as it affects the quality of the output DNA sequence. For more detail on k-mers, please refer to the velvet user manual.

Velvet consists of two programs, velveth and velvetg. The velveth program creates a graph file to prepare for the genome assembly; its memory requirement is not so large. The velvetg program, which performs the actual assembly, consumes much more memory and computation time.
Before we can start testing, we need input files. In this experiment, we will use two fastq files, SRR000021 and SRR000026, which were downloaded from this site. I processed these data with velveth as follows:

$ velveth SRR2126.k11 11 -short -fastq SRR2126.fastq

The first argument is the output directory name, the second is the k-mer length, the third specifies short-read input, the fourth specifies the “fastq” file type, and the fifth is the input file.

The next step is assembly. The command to do so is as follows:

$ velvetg SRR2126.k11

The argument is the output directory generated by velveth. I measured the elapsed time of this command in several different hardware configurations:

1. Memory 4GB + Generic SATA HDD
2. Memory 4GB + Fusion IO ioFX
3. Memory 8GB

In the case of configuration #2, I created a 12GB swap file on the ioFX like this:

# dd if=/dev/zero of=/mnt/iofx/swap0 bs=1024 count=12582912
# chmod 600 /mnt/iofx/swap0
# mkswap /mnt/iofx/swap0 12582912
# swapoff -a
# swapon /mnt/iofx/swap0
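
As a quick sanity check (an optional step, not part of the original setup), you can confirm that the new swap area is active:

# swapon -s        # the list should now include /mnt/iofx/swap0
# free -m          # total swap should have grown by about 12GB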

Results

The velvetg process uses about 8GB of memory for this input data, so roughly half of the temporary data spills out to swap space in configurations 1 and 2. The figure below shows the elapsed time for each configuration. With the generic SATA HDD, the process had still not finished after 2 hours, so we decided to kill it.

[Figure: Elapsed time of velvetg for each configuration]

Conclusion

So is using a PCIe SSD a viable solution? It is hard to say, as a 3x difference is not so small. In this particular experiment, only half of the memory space used by velvetg happened to be placed on the PCIe SSD. As mentioned earlier, the real-life data that bioinformaticians deal with can be a few terabytes. If almost all of the memory space is on the PCIe SSD, the performance is expected to be much worse.

However, considering that the HDD could not even complete the process in a reasonable amount of time, the PCIe SSD showed that it can vastly improve performance. This suggests that the SSD can serve as a good compromise, since DRAM is much more expensive than an SSD.

[Tech Blog] I/O Benchmark Test for Windows Azure D-Series


Update: Microsoft has just released persistent SSD storage. It is still in the preview phase and available only in limited regions. Here is a link with the expected throughput:
http://azure.microsoft.com/blog/2014/12/11/introducing-premium-storage-high-performance-storage-for-azure-virtual-machine-workloads/

Recently, Microsoft released a new series of virtual machines called the D-Series. D-Series VMs have a relatively new CPU (Xeon E5-2660, 2.2GHz) and local solid state drives (SSDs). Amazon AWS already offers SSD instances, and there are several SSD-only cloud vendors, so I’m excited that we can finally use SSDs on Azure. In this blog post, I will show you I/O benchmark results for a D-Series VM.

An Azure VM has two types of local storage: persistent storage and ephemeral storage. Data on the persistent storage survives a VM reboot, but data on the ephemeral storage may be lost after rebooting. For the D-Series, the SSD storage is ephemeral. As shown below, the performance of the SSD is outstanding, but you have to save your data to persistent storage before the VM is turned off. The size of the new SSD storage is also limited: in the case of the D14 (a high-end instance of the D-Series), the local storage is 800GB. If you need more, you can use Azure Storage, an elastic storage system for the cloud that provides several different ways to access it; an Azure VM can mount Azure Storage as a local block device with the “mount” command.
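
As a rough sketch of what that looks like in practice (assuming the attached Azure Storage data disk appears as /dev/sdc, which varies by VM; the mount point is arbitrary):

# fdisk -l                       # identify the newly attached disk (assumed to be /dev/sdc here)
# mkfs.ext4 /dev/sdc             # create a file system on it (one-time step; destroys existing data)
# mkdir -p /mnt/azurestorage
# mount /dev/sdc /mnt/azurestorage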

I prepared a D14 instance running CentOS release 6.5, with 30GB of persistent storage, 800GB of SSD ephemeral storage, and 999GB of Azure Storage mounted (Fig. 1). The specification of the D14 is shown in Table 1. In this blog post, the I/O performance of these 3 different storage types is evaluated.



Fig. 1: Three different storage volumes mounted on an Azure VM

# of cores:          16
RAM:                 112GB
Persistent Storage:  30GB
Ephemeral Storage:   800GB
Price:               $1.542/hour
Table 1. Specification of the D14 instance

At first, I tried a simple read/write test with “dd” and “hdparm”. This test is very useful for a quick check, since these commands come pre-installed in most Linux distributions. The actual commands are as follows, where <device> and <target file> are placeholders for the storage under test:

– Read Test

# sync
# echo 3 > /proc/sys/vm/drop_caches
# hdparm -t <device>

– Write Test

# dd if=/dev/zero of=<target file> ibs=1M obs=1M count=1024 oflag=direct

The commands are simple. However, if you miss a command or an argument, the result will likely be incorrect. The first two commands of the read test drop the read buffer cache in RAM so that the test measures the performance of the storage itself. In the write test, 1GB of data is read from /dev/zero and written to <target file>. The last argument of the dd command (oflag=direct) requests direct access to the storage medium, without Linux caching the reads or writes. The result is shown in Fig. 2. The measurements were conducted 12 times, with a 10-second interval between trials. These commands issue I/O operations sequentially.
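
For reference, here is a minimal sketch of how the 12 read-test trials with a 10-second interval could be scripted (the <device> placeholder is as above):

# for i in $(seq 1 12); do sync; echo 3 > /proc/sys/vm/drop_caches; hdparm -t <device>; sleep 10; done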


Fig.2: Result of Simple Read/Write test

For an HDD, typical sequential read/write performance is around 100MB/s; for a SATA/SAS SSD, it should be around 400MB/s. As shown in Fig. 2, Azure’s ephemeral storage shows remarkably high performance. They must be using high-end PCIe SSDs, which typically deliver around 1GB/s. In the read test results, you can see that the performance of the persistent storage (blue line) depends on the trial number. As mentioned, I dropped the Linux OS cache with the sync; echo 3 > /proc/sys/vm/drop_caches commands, but it still looks like a cache effect. I don’t have a concrete explanation, but it might be due to a cache on the virtual machine host, which we cannot control from the guest OS.

The simple read/write test is easy to use, but the result is not so reliable, so we also evaluated the 3 different types of storage with FIO. FIO is a standard tool in Linux for evaluating I/O performance and has a large number of options; it may be difficult for a beginner to pick the right ones. In this evaluation, I used the job file published by WinKey (http://www.winkey.jp/downloads/index.php/fio-crystaldiskmark, written in Japanese). With this job file, you can run the same types of I/O performance test as CrystalDiskMark (http://crystalmark.info/software/CrystalDiskMark/index-e.html), a famous benchmark tool on Windows. Running it is as simple as typing:

# fio crystaldiskmark.fio

The target storage area is /tmp by default. If you want to evaluate a different directory, edit line 7 (directory=/tmp/) of crystaldiskmark.fio. The result is shown in Fig. 3. Note that the Y-axis is on a logarithmic scale. The performance difference between the ephemeral storage and the persistent storage is dramatically large, especially for random read/write at queue depth* = 32.
* Queue depth: the number of simultaneous input/output requests for a storage device.
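
If you prefer not to use a job file, a roughly equivalent random read test at queue depth 32 can be expressed directly with fio’s command-line options (a sketch; the target file name and size are arbitrary):

# fio --name=randread-qd32 --filename=/tmp/fio-test --size=1G --rw=randread --bs=4k --ioengine=libaio --iodepth=32 --direct=1 --runtime=30 --time_based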

Fig. 3: Result of FIO benchmark

As demonstrated by the FIO test results, the SSD performance of the Azure D-Series is great, but again, the problem is that the SSDs are ephemeral storage. If your application boots up and shuts down instances often, you could develop a management tool or daemon that saves your essential data to Azure Storage or persistent local storage.

[Tech Blog] Power Management Using the Windows Azure Java SDK


Azure SDKs

The Windows Azure Portal offers an intuitive interface for performing tasks such as turning a VM on or off. However, when these tasks need to be performed in bulk, one soon realizes that clicking through the user interface may not be the optimal method.

Fortunately, Windows Azure can be managed through REST APIs. To make these REST APIs easy to use, command-line tools and numerous SDKs exist as wrappers, available for PowerShell, the Cross-Platform Command-Line Interface, .NET, PHP, Node.js, Python, Java, and Ruby.

Power Management code sample

After browsing through the javadoc for the SDK, one can find the methods start() and shutdown() implemented in the class VirtualMachineOperationsImpl. Both methods require 3 parameters, all of which can be found in the Azure Portal.

But how does one create an instance of VirtualMachineOperationsImpl? This did not seem intuitive, so I downloaded the source code of the SDK and browsed through the test code, which led me to the following 3 lines of code:

Configuration config = PublishSettingsLoader.createManagementConfiguration(publishSettingsFileName, subscriptionId);
ComputeManagementClient computeManagementClient = ComputeManagementService.create(config);
OperationResponse operationResponse = computeManagementClient.getVirtualMachinesOperations().start(serviceName, deploymentName, virtualMachineName);

Basically, the getVirtualMachinesOperations() method of the ComputeManagementClient class returns an instance of VirtualMachineOperationsImpl. The ComputeManagementClient class in turn requires an instance of the Configuration class, which requires the filename of the publish settings file and the subscription ID at instantiation.

The entire sample code is as follows:

import com.microsoft.windowsazure.core.*;
import com.microsoft.windowsazure.Configuration;
import com.microsoft.windowsazure.Configuration.*;
import com.microsoft.windowsazure.credentials.*;
import com.microsoft.windowsazure.management.configuration.*;
import com.microsoft.windowsazure.management.compute.*;
import com.microsoft.windowsazure.management.compute.models.*;
import org.apache.http.impl.client.HttpClientBuilder;
import java.util.concurrent.*;
import java.io.File;
import java.io.IOException;
import java.lang.String;

class AzureStartVM {
    public static void main(String[] args) {
        String publishSettingsFileName = System.getenv("AZURE_PUBLISH_SETTINGS_FILE");
        if (publishSettingsFileName == null) {
            System.err.println("Set AZURE_PUBLISH_SETTINGS_FILE.");
            System.err.println(" Ex. $ setenv AZURE_PUBLISH_SETTINGS_FILE $HOME/conf/XXX.publishsetting");
            System.exit(1);
        }
        File file = new File(publishSettingsFileName);
        if (!file.exists()) {
            System.err.println("File not found: " + publishSettingsFileName);
            System.exit(1);
        }

        String subscriptionId = "XXXXXXXXXXX";
        if (args.length != 3) {
            System.err.println("Usage: AzureStartVM serviceName deploymentName virtualMachineName");
            System.exit(1);
        }
        
        String serviceName = args[0];
        String deploymentName = args[1];           
        String virtualMachineName = args[2];

        try {
            Configuration config = PublishSettingsLoader.createManagementConfiguration(publishSettingsFileName, subscriptionId);
            ComputeManagementClient computeManagementClient = ComputeManagementService.create(config);

            OperationResponse operationResponse = computeManagementClient.getVirtualMachinesOperations().start(serviceName, deploymentName, virtualMachineName);
        }
        catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Compile source

1) Add the Azure Java SDK jar files to CLASSPATH.
2) Download the publish settings file locally and set its path in AZURE_PUBLISH_SETTINGS_FILE.
3) Set the string value of subscriptionId in the source.
4) Compile:

$ javac AzureStartVM.java

Execute Code

$ java AzureStartVM serviceName deploymentName vmName

So Why Java?

The various tools and SDKs are in different stages of development, and the required management functions may not be available for your language of choice. PowerShell is by far the most complete and the easiest to use, but it is not available to Linux users.

The Cross-Platform CLI would probably be the next choice in the chain of tools to try. However, this CLI uses node.js underneath to trigger the REST API requests, and there seems to be either a bug or an incompatibility issue with node.js on the CentOS-based Linux VM for Azure, which made its operation unreliable. Microsoft support was able to narrow the problem down to the https module in node.js and found that it did not occur on Ubuntu, but switching operating systems just for this purpose seemed rather nonsensical.

PHP and Python do not yet have power management modules implemented, and while implementing these modules for the SDK was an option, Java seemed the easier choice.

Why blocking start() and shutdown() instead of the async options?

The implementation of the REST API for power management seems to be thread-unsafe and results in an error when executed for VMs in the same cloud service. This means that if CloudService1 contains VM1 and VM2, a start command cannot be triggered for both VM1 and VM2 simultaneously. Even in the Azure Portal, one has to wait until VM1 is running before VM2 can be started. Therefore, it made sense to implement a blocking start/shutdown.

VMs in other cloud services, on the other hand, can be started and shut down simultaneously, so the code can be run concurrently from multiple terminals. Hence, the async variants were not necessary.

Remarks

The above command can be executed in a for-loop from a shell to start multiple VMs at once, as sketched below.
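
For example, a minimal sketch (the service, deployment, and VM names here are hypothetical):

$ for vm in vm1 vm2 vm3; do java AzureStartVM myService myDeployment $vm; done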

While it may be trivial for a developer to come up with this solution, it is probably not part of the job description for the Linux administrators who would benefit from it the most.

Fixstars Establishes Canadian Subsidiary

Sunnyvale, CA — November 3rd, 2014 — Fixstars Solutions, Inc. announced its Canadian subsidiary’s first official day of operations today.

Fixstars Solutions Canada will help grow its parent company’s business in the United States and abroad by expanding its research and development capabilities in multicore, storage, and cloud computing technologies. The new subsidiary will be headquartered in Victoria, BC. Akihiro Asahara, the CEO of Fixstars Solutions, will assume the role of Chairman, while Owen Stampflee will assume the role of CEO.

About Fixstars Solutions

Fixstars Solutions is a software company devoted to “Speed up your Business”. Through its software parallelization/optimization expertise, its highly effective use of multi-core processors, and application acceleration for the next generation of memory technology that delivers high speed IO as well as power savings, Fixstars Solutions provides “Green IT,” while accelerating customers’ business in various fields. Learn more about how Fixstars Solutions can accelerate your business in life science, manufacturing, finance, and media & entertainment.
For more information, visit www.fixstars.com.

Fixstars Solutions Adopts Microsoft Azure to Boost Life Science Application Performance

Sunnyvale, CA — August 12th, 2014 — Fixstars Solutions Inc. today announced that it has adopted Microsoft Azure to help boost the performance of its life science applications. The importance of computer simulation and analysis in the life science field is growing, and some life science applications, such as molecular dynamics simulation and DNA sequencing, consume a vast amount of computing resources. Building a private cluster server system just for these compute-intensive applications can be complex and costly. Fixstars Solutions developed a computing platform for life science applications on Microsoft Azure, designed to provide an elastic, high-performance cluster server system.

Fixstars Solutions has extensive knowledge of parallel processing technology and a track record of numerous successful projects with hyper-scale cluster server systems. For the US Air Force Research Lab, Fixstars Solutions developed a simulation and video processing system made up of 2,016 SONY PS3® nodes. Recently, Fixstars and Mizuho Securities succeeded in accelerating a derivative valuation system 30-fold with Intel Xeon Phi® coprocessors.

Fixstars Solutions worked closely with Microsoft to port a life science application to Microsoft Azure in three weeks. The target application is compute-intensive and requires about 1,000 CPU cores. After some initial benchmarking, it was found that the Microsoft Azure A8/A9 instances, which are optimized for high performance computing (HPC) applications, out-performed the competing cloud services, which led to the decision to utilize Microsoft Azure. As a result of Microsoft Azure’s HPC capabilities, Fixstars Solutions was able to achieve a performance boost of more than 4-fold compared to the software running on the original private cluster server system.

“I am pleased to work with Microsoft to bring solutions to market that we believe will advance the life sciences field.” said Akihiro Asahara, CEO of Fixstars Solutions. “Using Microsoft Azure’s new compute intensive instances, A8 and A9, we can use a high performance cluster server system having over 1000 CPU cores available at all times.”

“The flexibility, speed and cost-saving of cloud computing has the power to accelerate crucial research in the life sciences field,” said Venkat Gattamneni, Senior Product Marketing Manager, Cloud and Enterprise, Microsoft. “We look forward to continued work with Fixstars Solutions to boost the performance of life sciences applications using the Microsoft Azure platform.”

Fixstars Solutions will release solutions built on Microsoft Azure and provide technical services and consulting for clients using Microsoft Azure. Not only will Fixstars aid in porting to Microsoft Azure to achieve speed-ups from parallel processing, but it will also aid in enhancing system availability and reducing operating costs by using Azure’s management functions, made available through the Microsoft Azure SDK and command-line tools.

About Fixstars Solutions

Fixstars Solutions is a software company devoted to “Speed up your Business”. Through its software parallelization/optimization expertise, its highly effective use of multi-core processors, and application acceleration for the next generation of memory technology that delivers high speed IO as well as power savings, Fixstars Solutions provides “Green IT,” while accelerating customers’ business in various fields. Learn more about how Fixstars Solutions can accelerate your business in life science, manufacturing, finance, and media & entertainment.
For more information, visit www.fixstars.com.

Fixstars and Mizuho Securities Succeed in Accelerating Derivative Valuation System 30-fold with Intel Xeon Phi Coprocessor

Mizuho Securities becomes the first financial institution to deploy the coprocessor in a production environment

TOKYO — June 2, 2014 — Fixstars Corporation today announced that Fixstars and Mizuho Securities Co., Ltd. achieved a 30-fold speed-up by porting Mizuho Securities’ derivative valuation system to Intel’s many-core processor, the Intel® Xeon Phi™ coprocessor. Mizuho Securities has become the world’s first financial institution to utilize the Xeon Phi™ coprocessor in a production environment.

The recent low interest rates have increased demand for structured bonds, which combine bonds with derivatives. To meet these customer demands, Mizuho Securities had been investigating ways to efficiently perform the vast amount of computation required by its derivative valuation system, which led it to consider Intel’s high-performance Xeon Phi™ coprocessors.

Mizuho Securities, with the aid of its development partner Fixstars, succeeded in devising an algorithm to efficiently distribute the workload across the cores of the Intel® Xeon Phi™ coprocessor. The Xeon Phi™ replaced an 8-core Xeon® system and out-performed it by a factor of 30. In addition, by placing an emphasis on source-code readability, a highly maintainable system was created.

“I am happy to announce that Fixstars’ expertise in parallel processing, in conjunction with Intel’s technology, has been able to improve upon Mizuho Securities’ derivative valuation system,” said Kosuke Hirano, Senior Executive Officer, Intel K.K. “Intel will continue innovating with the highly parallel processing capability of Intel® Xeon Phi™ coprocessors.”

“I am very excited that our expertise in parallel programming techniques has enabled us to aid Mizuho Securities in the deployment of the new derivative valuation system,” said Satoshi Miki, CEO, Fixstars. “We were involved in the project from the research phase on what hardware to use, and I believe the smooth progress that led to this deployment could not have occurred without the mutual trust underlying the partnership between Mizuho Securities and Fixstars, as well as the high capability and reliability of Intel® Xeon Phi™ coprocessors.”

Mizuho Securities is already using the new system for plain vanilla derivatives, with plans to move exotic derivatives to the new system this summer.

Fixstars will continue to speed up Mizuho Securities’ business through the provision of technical expertise in parallel processing.

* Intel, Xeon and Xeon Phi are trademarks of Intel Corporation in the U.S. and/or other countries.

Initial Public Offering

Fixstars Corporation, the parent company of Fixstars Solutions Inc., announced the pricing of its initial public offering of 100,000 shares of Class A common stock. Fixstars Corp’s Class A common stock will trade on the Tokyo Stock Exchange (Mothers) under the code “3687” from April 23, 2014.

See more detail at http://www.tse.or.jp/english/listing/companies/index_e.html

[Tech Blog] Super Computing 2013

Super Computing 2013 (SC13) has started in Denver, Colorado! The TOP500 list has already been updated, and there are no big changes in the top 5. Tianhe-2, a supercomputer developed in China, retained its position as the world’s No. 1 system.


One important but often ignored statistic in supercomputing is power consumption. For example, Tianhe-2 reaches 33.86 petaflop/s on the Linpack benchmark, but it consumes 17,808 kW of power to do so! This is almost the entire output of a small thermal power station. The Green500 ranks supercomputers by energy efficiency instead of just raw processing power. In terms of MFLOPS/Watt, TSUBAME 2.5, developed by the Tokyo Institute of Technology, should take the No. 1 spot. The latest Green500 list will be available soon.

There are several techniques we can use to manage power consumption at the system level, but within each computing node we only have a few options. Heterogeneous setups, such as those using an NVIDIA/AMD GPU or Intel’s Xeon Phi, have become the norm, and we think FPGAs are another promising option. FPGAs offer pipeline parallelism, in which several different tasks can be spawned in a push-pull configuration and each task is fed data by the previous task, with or without host interaction. Several technical papers report that FPGAs outperform other chips in terms of both latency and energy efficiency. The standard FPGA languages (VHDL and Verilog-HDL) can be a complicated struggle for any software developer (like me!), but the chip vendor ALTERA has recently announced OpenCL support for their FPGA devices. This will make development faster and cause fewer headaches. Fixstars has been part of their OpenCL SDK early access program since last summer, and we provide OpenCL software and services to our clients.

Unfortunately, we don’t have a booth at SC13 this year, but we can be found at our partner Nallatech’s booth (#3519). We’ll even have coupons for copies of our OpenCL programming book.
Stop by and check out OpenCL on FPGA!

[Tech Blog] Computational Geometry Engine and Our Patented GPP Technology


Computational geometry engines are central to both emerging and ubiquitous technologies. From online mapping to digital chip manufacturing, computational geometry algorithms reduce enormous sets of data into visualizable results that enable engineers and casual users to make informed decisions.

The plane sweep algorithm, although widely used in computational geometry, does not parallelize efficiently, rendering it incapable of benefiting from recent trends in multicore CPUs and general-purpose GPUs. Instead of the plane sweep, some researchers have proposed a uniform grid as a foundation for parallel computational geometry algorithms. However, long-standing robustness and performance issues have deterred its wider adoption, at least in the case of overlay analysis. To remedy this, we have developed previously unknown, unique methods to perform snap rounding and to efficiently compute the winding number of overlay faces on a uniform grid. We have implemented them as part of an extensible geometry engine that performs polygon overlay with OpenMP on CPUs and CUDA on GPUs. As previously announced, we have released a software product called “Geometric Performance Primitives (GPP)” based on this implementation. The overall algorithm works on any polygon configuration, whether degenerate, overlapping, self-overlapping, disjoint, or with holes. With typical data, it features time and space complexities of O(N + K), where N is the number of edges and K the number of intersections. Its single-threaded performance not only rivals the plane sweep; it also achieves a parallel efficiency of 0.9 on our quad-core CPU, with an additional speedup factor of over 4 on our GPU. These performance results should extrapolate to distributed computing and other geometric operations.

To obtain baseline performance, we compared GPP against equivalent functionality in the following commonly used GIS software tools: ArcGIS 10.1 for Desktop, GRASS 6.4.2, OpenJUMP 1.6.3 (with JTS 1.13 as backend), and QGIS 1.8.0 (with GEOS 3.3.2 as backend), whose algorithms execute on a single thread only and may or may not exploit the plane sweep. Note, however, that their rich feature sets may create additional overhead not present in GPP. With each of them, we applied a dissolve (merge) operation and a geometric intersection (boolean AND) to two pairs of layers taken from real-world datasets and recorded the execution times. GPP’s overall performance fared well against the existing solutions, reaching an average speedup of 16x on the CPU alone against ArcGIS, the fastest currently available software application we found.


Performance of ArcGIS versus GPP on real-world datasets. Please refer to the paper for more technical details.

At this year’s ACM SIGSPATIAL conference, our technical paper was selected from over 200 very competitive entries. The SIGSPATIAL section of the ACM covers a broad range of innovative designs and implementations, with special attention paid to emerging trends in geospatial information systems (GIS).

So, what’s our next step? GPP has already gained some success in the Electronic Design Automation (EDA) industry. Our next target is object-relational database engines. Computational geometry processing is a necessary component of spatially aware systems, including database management systems (DBMS) such as Oracle Spatial and Graph, and PostGIS for PostgreSQL. Spatially aware extensions give a DBMS native support for the storage and retrieval of geometric objects that are commonly quite large and complex. This turns a standard DBMS into a location-aware engine that can render business and technology applications referencing complex, layered, and/or geographical data in an intuitive and visualizable form. In addition to generalized spatial queries, Oracle’s solution, for example, provides explicit support for geo-referenced images, topologies, and 3D data types, including triangulated irregular networks (TINs) and point clouds with native support for LIDAR data. PostGIS, on the other hand, provides native support and algorithms for ESRI shapefiles, raster map algebra, and spatial reprojection. When combined with GPP’s highly parallelized computational geometry engine, these functions can process the large, multi-layered data sets necessary to perform automated emergency route generation, natural resource management, insurance analysis, petroleum exploration, and financial queries 10 times or more faster than conventional algorithms.

We are excited to attend SIGSPATIAL 2013 in November and look forward to discussing the opportunities for geospatial processing that GPP provides with developers and researchers from around the world. See you in Florida!