The Knight Has Landed


Enterprises looking to integrate GPU-like parallel computing into their HPC hardware without making significant changes to their IT infrastructure need look no further: the Knight has landed. Intel has just released the highly anticipated next-generation Xeon Phi processor, previously code-named Knights Landing. The most notable feature of the new Xeon Phi compared to PCIe GPU cards is that it self-boots and runs the operating system as the native processor, rather than serving as a coprocessor. Additionally, each core can handle up to four threads and delivers roughly 3x the per-thread performance of earlier Phi products, and the chip carries 16GB of on-package MCDRAM. It can also be configured in multiple memory modes, using the on-package memory, the available DDR4 memory channels, or both.
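
For developers, the practical question is how to land hot data in that 16GB of MCDRAM. When the chip is booted in flat mode, the MCDRAM is exposed as a separate NUMA node, and Intel's memkind library lets an application request it explicitly. Below is a minimal sketch assuming memkind is installed and the system is in flat mode; the buffer size and fallback behavior are illustrative, not prescriptive.

```cpp
#include <hbwmalloc.h>  // memkind's high-bandwidth memory API (link with -lmemkind)
#include <cstdio>
#include <cstdlib>

int main() {
    const size_t n = 1 << 24;  // ~16M doubles (128MB), a hot working set
    double *buf;
    if (hbw_check_available() == 0) {
        // Flat-mode MCDRAM is available: place the hot buffer there.
        buf = static_cast<double *>(hbw_malloc(n * sizeof(double)));
    } else {
        // No MCDRAM exposed (e.g. cache mode): fall back to DDR4.
        buf = static_cast<double *>(malloc(n * sizeof(double)));
    }
    if (!buf) return 1;
    for (size_t i = 0; i < n; ++i) buf[i] = 0.0;  // touch pages to commit them
    printf("allocated and touched %zu doubles\n", n);
    if (hbw_check_available() == 0) hbw_free(buf); else free(buf);
    return 0;
}
```

In cache mode, by contrast, the MCDRAM is managed transparently as a last-level cache in front of DDR4, so no code changes are needed at all; the trade-off is less control over which working set actually lands in the high-bandwidth memory.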

But what makes the Phi such a big deal? On the surface it may look like business as usual, the kind of incremental hardware upgrade attributed to Moore's Law, but the Phi actually delivers the beginnings of supercomputing performance in a single, integrated package.

This allows your business to scale up immediately without completely redesigning your hardware architecture. That means if you want to continuously run theoretical models, build large-scale predictive projects, or simply process data at the speed of light, the Phi will support it. And with AMAX's designed-to-order server and rack solutions, as well as a full menu of value-added services, your exact high-performance computing needs can be satisfied with as little compromise as possible. Click here to learn more.


Best Deep Learning Performance By An NVIDIA GPU Card: The Winner Is…



There's been much industry debate over which NVIDIA GPU card is best suited for deep learning and machine learning applications. At GTC 2015, NVIDIA CEO and co-founder Jen-Hsun Huang announced the GeForce Titan X, touting it as "the most powerful processor ever built for training deep neural networks." Within months, NVIDIA proclaimed the Tesla K80 the ideal choice for enterprise-level deep learning applications, thanks to enterprise-grade reliability through ECC protection and GPUDirect support for clustering, an edge over the technically consumer-grade Titan X. Then, in November of 2015, NVIDIA released the Tesla M40. At 5x the price point of the Titan X, the Tesla M40 was marketed as "The World's Fastest Deep Learning Training Accelerator."

With this many "world's fastest" and "most powerful" claims in such a short period of time, people were understandably confused. Therefore, as a leader in high-performance computing technologies and deep learning solutions, AMAX's engineering team set out to benchmark the various cards and determine which NVIDIA card truly performed best for deep learning.

In a whitepaper titled "Basic Performance Analysis of NVIDIA GPU Accelerator Cards for Deep Learning Applications," AMAX's team analyzed the NVIDIA K40, K80, and M40 enterprise GPU cards alongside the GeForce GTX Titan X and GTX 980 Ti (water-cooled) consumer-grade cards, running 256×256-pixel image recognition training in Caffe (a simplified timing sketch follows the list below). The systems used in the benchmark tests were AMAX's DL-E400 (4x GPU workstation), DL-E380 (3U 8x GPU server) and DL-E800 (4U 8x GPU server).

The study included:

- Card specific performance analysis

- Performance scaling from single GPU system to up to 8x GPU nodes

- Performance impact of the CPU

- Single and dual CPU solutions

- Platform-specific performance differences
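
For readers who want to reproduce a rough version of this kind of measurement, Caffe ships with a built-in `caffe time` command, and the same timing loop is easy to write against Caffe's C++ API. The sketch below assumes a GPU-enabled Caffe build and a hypothetical train_val.prototxt network definition; the iteration count and device ID are arbitrary.

```cpp
#include <caffe/caffe.hpp>
#include <chrono>
#include <iostream>

int main() {
  caffe::Caffe::set_mode(caffe::Caffe::GPU);
  caffe::Caffe::SetDevice(0);  // benchmark one card at a time

  // "train_val.prototxt" is a placeholder for the network under test.
  caffe::Net<float> net("train_val.prototxt", caffe::TRAIN);

  // Warm-up pass so one-time GPU initialization is not timed.
  net.Forward();
  net.Backward();

  const int kIters = 50;
  auto t0 = std::chrono::steady_clock::now();
  for (int i = 0; i < kIters; ++i) {
    net.Forward();   // forward pass
    net.Backward();  // backward (gradient) pass
  }
  auto t1 = std::chrono::steady_clock::now();
  double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
  std::cout << "avg training iteration: " << ms / kIters << " ms" << std::endl;
  return 0;
}
```

Per-card timings from a loop like this, converted into images per second, are what make cross-card comparisons such as the ones below meaningful.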

The Results


The study found that increasing the number of cards scaled the performance linearly, and cards based on the Maxwell architecture (Titan X, 980 Ti, M40) outperformed the Kepler cards (K40 and K80).

Most interesting was how poorly the K80 performed despite having the highest single-precision TFLOPS rating on its spec sheet.

So which card performed best in our deep learning benchmark testing? Surprisingly, the water-cooled GTX 980 Ti. The Titan X and M40 came in second, with nearly neck-and-neck performance. Since the GTX 980 Ti may not be suitable for server integration, our recommendation for deep learning applications would be the Titan X and M40 cards, with the Titan X providing the best performance-to-cost ratio.


It remains to be seen how the Pascal-based GTX 1080 (the replacement for the GTX 980, to be released on May 27th, 2016) will compare, but early feedback suggests that the "2x better performance than Titan X" claim relates to VR applications, not deep learning applications.

In the meantime, to learn more about the results of AMAX's deep learning benchmark testing, you can download the white paper here. AMAX's full line of built-to-order deep learning solutions can be found here.


100 Years: The Movie You'll Never See


This month at the Cannes Film Festival, a new movie starring John Malkovich and directed by Robert Rodriguez (Sin City, Machete) will be showcased… if by "showcased" you mean that the vault in which the physical film is sealed will be displayed in the invitation-only Louis XIII Suite at the Hôtel Le Majestic Barrière Cannes.

The movie is literally titled 100 Years: The Movie You Will Never See. It's a never-before-done concept that envisions Earth 100 years from now, and it is set to be released on November 18th, 2115, a release date beyond the average human lifespan.

"WTF," you may be saying to yourself, especially once you've been tantalized by the Blade Runner-esque teasers. However, the project, written by Malkovich, is designed as a cinematic time capsule, a way to see how closely the filmmakers' vision of the future compares to actual reality in the year 2115.

The film's storyline has been a closely guarded secret. All we know is that the film stars Malkovich, Marko Zaror, and Shuya Chang and is set in the year 2115. The custom-built safe that holds the physical film reel uses time-release technology and can only be opened once the 100-year countdown completes on November 18, 2115.

Even those involved with the film will have to wait until its release date.

"It's the first time I've done anything like this," said Rodriguez in an interview with IndieWire. "I was intrigued by the whole concept of working on a film that would be locked away for a hundred years. They even gave me silver tickets for my descendants to be at the premiere in Cognac in 2115. How cool is that? What John and I wanted it to be was a work of timeless art that can be enjoyed in 100 years. I'm very proud of it, even if only my great-grandkids, and hopefully my clone, will be around to watch."

Some have dismissed the project as an elaborate publicity stunt for Louis XIII Cognac, with the movie serving as a tribute to the century of careful craftsmanship required to create each decanter of the luxury liquor. Seeking a work of art that could speak to the brand's commitment to time-aged quality, the company brought on Malkovich and Rodriguez to create the vehicle.

"Louis XIII is a true testament to the mastery of time, and we sought to create a proactive piece of art that explores the dynamic relationship of the past, present and future," said Ludovic du Plessis, global executive director of Louis XIII.

"There were several options when the project was first presented of what [the future] would be," said Malkovich. "An incredibly high-tech, beyond-computerized version of the world; a post-Chernobyl, back-to-nature, semi-collapsed civilization; and then there was a retro future, which was how the future was imagined in the science fiction of the 1940s or '50s."

All three have been represented in a series of teasers:

While most of us will not be around to see how closely the film nails our future reality, there's no doubt that the world is poised for incredible change in the near future thanks to the technological advances all around us, particularly developments in artificial intelligence, machine learning and automation. Self-driving cars are a reality, and smarter devices and machines mean more integrated city and global infrastructures. Better data analytics mean more predictive technologies in every industry, including health care, security and the social sciences. And hopefully, bots will one day be much more helpful, once they learn to stay away from the bad kids. AMAX is highly invested in pushing our world toward a smart-technology tipping point by helping AI and machine learning developers build intelligent technologies through its Deep Learning Solutions. Beyond that, 100 years from now, it's anyone's guess. But as Rodriguez said, with any luck, our clones will be there to decide whether 100 Years was just a glorified publicity stunt, or well worth the wait.


NVIDIA DGX-1: The Game Changer That Took Deep Learning To Ludicrous Speed


Last week at the GPU Technology Conference in San Jose, NVIDIA made a groundbreaking announcement, launching its mega-powerful NVIDIA DGX-1 Deep Learning System. A purpose-built deep learning solution featuring 7TB of SSD storage and a whopping 170 TFLOPS of half-precision performance, the DGX-1 was rightfully marketed as the world's first and most powerful Deep Learning Supercomputer-in-a-Box, 12x faster than any previous GPU-accelerated solution.

The NVIDIA DGX-1 is not a configurable server or a component that must be integrated into a larger deep learning system. It is a turnkey, plug-and-play solution featuring eight Pascal-based Tesla P100 cards installed in a hybrid cube mesh configuration and interconnected with NVIDIA NVLink. The system comes fully integrated with hardware and software designed specifically for deep learning development. It even comes with NVIDIA-backed support, software upgrades and a cloud management portal, so that companies have all the tools they need at their fingertips to quickly train neural networks with the processing power necessary to create viable deep learning applications.
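
The NVLink fabric matters because it lets each GPU read and write its peers' memory directly instead of staging transfers through host memory. There is no DGX-1-specific code involved; the standard CUDA runtime API exposes the capability. The sketch below is a generic peer-access check, offered as an illustration, that should behave the same on any multi-GPU machine, NVLink-connected or not.

```cpp
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int n = 0;
    cudaGetDeviceCount(&n);
    // Report which GPU pairs can talk directly (over NVLink or PCIe)
    // without bouncing data through the host.
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            if (i == j) continue;
            int ok = 0;
            cudaDeviceCanAccessPeer(&ok, i, j);
            if (ok) {
                cudaSetDevice(i);
                cudaDeviceEnablePeerAccess(j, 0);  // flags must be 0
                printf("GPU %d -> GPU %d: peer access enabled\n", i, j);
            }
        }
    }
    return 0;
}
```

On an eight-GPU hybrid cube mesh, the pairs joined by NVLink simply deliver far higher peer-to-peer bandwidth than pairs that fall back to PCIe.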

With the DGX-1, NVIDIA has given developers a powerful engine with which to radically reduce training and inference time, fast-tracking new AI- and machine learning-based products and features to market at the speed of innovation. This is critical for a technology on the brink of changing our world and the way machines interact with and enrich it. As in the gold rush and the space race, companies are rushing to seize a wide-open opportunity to create ever-smarter applications.

That is why AMAX was chosen as an official NVIDIA Technology Partner authorized to take pre-orders for the DGX-1 Deep Learning System. "The DGX-1 was purpose-built to help researchers and data scientists achieve new milestones in creating AI applications," said Jim McHugh, vice president and general manager of GRID and DGX-1 at NVIDIA. "AMAX's extensive expertise in delivering deep learning solutions will be of considerable value to customers incorporating this one-of-a-kind supercomputing platform into their data centers to power their most demanding deep learning workloads."

In his keynote opening GTC 2016, Jen-Hsun Huang described deep learning not as a niche application but as a world-changing computing platform that one day every application will depend on. "The number of companies involved in deep learning has just exploded," Huang said. "Every internet service provider, every major computing company… the type of applications for deep learning to enhance the smartness of our applications, to enhance the greater insight that we can derive from large data, is really crazy. Intelligent video analysis, surveillance will never be the same. Intelligent video tagging, image tagging, recognizing images, image search, voice, translation, a universal translator… applications like Twitter, Uber and all these other amazing applications are all now powered by deep learning. The recommendations engines of movies and Amazon are going to go through a whole new phase of renaissance."

Every industry is touched by deep learning development. Exciting projects include self-driving cars, social media, AI and robotics, medical imaging and diagnosis, personalized online retail experiences, and many more that seem to break ground every day. The arrival of the DGX-1 could provide the tipping point for building applications which are, if not as smart as humans, smart enough to cater to real human needs and desires.

The DGX-1 software stack includes all major deep learning frameworks, the NVIDIA Deep Learning SDK, the DIGITS GPU training system, drivers, and CUDA. This allows deep learning developers to construct deep neural networks (DNNs) in their preferred machine learning framework, backed by the diagnostics and support offered by NVIDIA. No less important, the DGX-1 has been designed so that the Xeon compute, Tesla compute and networking options can each be upgraded independently. The result is a total solution that can be deployed in-house within a matter of minutes.
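
Underneath each of those frameworks sits cuDNN, the GPU-accelerated primitives library in the NVIDIA Deep Learning SDK. As a hedged illustration of that layer, the sketch below merely creates a cuDNN handle and describes an input tensor, the kind of bookkeeping a framework performs before invoking convolution or activation kernels; the batch and image dimensions here are arbitrary.

```cpp
#include <cudnn.h>  // NVIDIA cuDNN (link with -lcudnn)
#include <cstdio>

int main() {
    cudnnHandle_t handle;
    if (cudnnCreate(&handle) != CUDNN_STATUS_SUCCESS) {
        fprintf(stderr, "cuDNN unavailable\n");
        return 1;
    }
    printf("cuDNN version: %zu\n", cudnnGetVersion());

    // Describe a batch of 64 three-channel 224x224 images (NCHW layout),
    // the sort of input tensor a framework hands to cuDNN under the hood.
    cudnnTensorDescriptor_t input;
    cudnnCreateTensorDescriptor(&input);
    cudnnSetTensor4dDescriptor(input, CUDNN_TENSOR_NCHW, CUDNN_DATA_FLOAT,
                               64, 3, 224, 224);

    cudnnDestroyTensorDescriptor(input);
    cudnnDestroy(handle);
    return 0;
}
```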

Deep learning is not just the IT buzzword of 2016; it is quickly becoming the key to an entire paradigm shift in technology and in the world in which we live. Because it has real-world applications in almost every industry and facet of life, from enterprise to academia to the consumer, development to date has only scratched the surface of its game-changing potential. But regardless of how you choose to utilize deep learning or machine learning in your application or development, there is no doubt that the DGX-1 can get you there, like a bullet train. With the DGX-1, developers now have access to a deep learning system with 12x higher application performance than any previous GPU-accelerated solution, the equivalent of a year's worth of development completed in a single month. If the glory (and the money) goes to whoever reaches the breakthrough first, the DGX-1 may well have just leveled the playing field.


Cloud Computing Ramps Up With Intel® Xeon® E5-2600 v4 Processor


With the launch of the new Intel® Xeon® E5-2600 v4 series processor, cloud computing is about to take a big step forward. Starting today, Intel is releasing the latest processor in its Broadwell-EP-based Xeon product line. With 22 cores/44 threads per CPU, DDR4-2400 memory support, and built-in features geared toward software-defined computing solutions, this next-generation microprocessor is uniquely suited to handle the evolving needs of cloud and data center infrastructures.

According to the RightScale 2016 State of the Cloud Report, more than 95% of businesses are using a cloud of some sort for enterprise IT, while 71% are experimenting with hybrid cloud adoption. More importantly, companies are far more likely to shift enterprise workloads to the cloud in times of need, and for the first time ever, companies are citing cloud cost management as a bigger concern than security. This is where the new E5-2600 v4 processor, on Intel's Grantley platform, can be so valuable in the dual-socket market space.

With an increased core count, Hyper-Threading capabilities, and enhanced workload optimization, the v4 processor is far more efficient for cloud computing (IaaS) and for the data-intensive applications of the burgeoning Internet of Things (IoT). With the Grantley v4 platform, you are essentially getting 10% more performance than the Intel Xeon E5-2600 v3 for the same cost.
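
Software only benefits from those 22 cores/44 threads if it sizes its parallelism to the hardware rather than hard-coding a thread count. Below is a minimal, hedged C++ sketch of doing that portably; the counts in the comments assume a dual-socket E5-2600 v4 system with Hyper-Threading enabled.

```cpp
#include <iostream>
#include <thread>

int main() {
    // On a dual-socket E5-2600 v4 box with Hyper-Threading on, this
    // typically reports 88 logical CPUs (2 sockets x 22 cores x 2 threads).
    unsigned hw = std::thread::hardware_concurrency();

    // Size the worker pool from the hardware instead of hard-coding it;
    // hardware_concurrency() may return 0 if detection fails.
    unsigned workers = hw ? hw : 4;
    std::cout << "logical CPUs: " << hw << ", spawning "
              << workers << " workers\n";
    return 0;
}
```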

This comparative performance increase can be leveraged to even greater effect when the E5-2600 v4 is paired with data center servers built for open platforms like Apache Hadoop and OpenStack. For example, CB3 (Cloud Basic Building Blocks) is an enterprise-class server series that uses a modular building-block design to maximize efficiency in cloud infrastructures, so that companies can easily scale up even as they experiment with different workloads and applications.


The CB3 product line offers a range of uniquely cost-effective solutions for companies that have yet to settle on a cloud strategy, because the servers can be deployed as individual building blocks or as vertically integrated rack solutions that are delivered fully tested, pre-loaded with software, and ready to start generating value upon power-up. This modular approach is ideal for cloud service providers who need the flexibility to accommodate a wide range of applications while still minimizing operational overhead.

So while the state of the cloud is still in flux, the performance enhancements of the v4 series, combined with a fresh take on cloud infrastructure management, are encouraging signs for companies that want flexible, scalable cloud solutions that can ramp up at a moment's notice.
