Introducing Our Newest AMAX Family Member, Heart-Melting Specialist

On Monday, June 26th, a member of the AMAX family arrived at work and heard a weak, high-pitched cry coming from beneath a car. Upon investigation, he found a straggly gray ball of fur, which turned out to be a malnourished and terrified 4-week-old kitten. A severe eye infection had left him blind, with both eyes sealed shut.


They say it takes a village to raise a child, and it was no different with this little guy. The AMAX family immediately jumped into action, warming him with a towel, cleaning him up, and feeding him kitten formula from a tiny baby bottle.


After a visit to the vet deemed him healthy apart from the eye infection, the AMAX team spent the following weeks nursing him back to health.

We are proud to introduce the newest member of the AMAX family: Matrix GPU On-Premise Cloud (Powered by Bitfusion Flex), or Neo for short.


We hope we can keep him as the AMAX office cat and mascot, as his presence has melted the hearts of all who have nursed him and held him sleeping in their laps, and brought a new level of camaraderie, compassion, and joy to the office.


If you would like to donate to the care of little Neo, we hope you will consider procuring one of our MATRIX Deep Learning Solutions. Not only is MATRIX the best solution on the market for AI development and deployment, but proceeds go to keeping a roof over our little guy.

Posted in AMAX News, Deep Learning, GPU Computing, Product Development

Fresh off GTC 2017: The Revolutionary MATRIX GPU Cloud Solution

Ten years have gone by since GPU-accelerated computing was first introduced. This year at GPU Technology Conference 2017, advances in GPU computing and methods culminated in the most ambitious and far-reaching technical endeavor yet—Artificial Intelligence. As researchers, global enterprises, and startups alike converged at GTC, the hottest topic was clearly AI and Machine Learning, with NVIDIA doubling down on its position as an AI company, “Powering the AI Revolution.”

In his keynote, NVIDIA founder and CEO Jensen Huang discussed how the AI Boom is fueling a post-Moore’s Law (or Moore’s Law Squared) demand for GPU compute power. In response, NVIDIA has invested over $3 billion to develop the new Tesla (Volta) V100 accelerator. Built with 21 billion transistors, the Volta V100 delivers deep learning performance equal to that of 100 CPUs and will support new releases of deep learning frameworks such as Caffe 2, TensorFlow, Microsoft Cognitive Toolkit, and MXNet. The DGX-1 also gets an upgrade with V100 GPUs, selling at $149,000 (want one? Inquire here).


Huang also introduced a DGX workstation featuring 4x V100 GPUs for 480 Teraflops of Tensor computing power with a selling price of $69,000.

Other announcements included a collaboration with Toyota on autonomous driving and Project Holodeck, a shared VR working environment. But more than anything, the show signaled that NVIDIA has every intention of powering the AI boom, particularly when it comes to accelerating Machine Learning.


Along those same lines, AMAX showcased its dedication to providing advanced tools that fast-track Deep Learning development while reducing the barrier to entry. With the launch of “The MATRIX” product line, AMAX combined its award-winning Deep Learning platforms with end-to-end Deep Learning tools as well as GPU virtualization technology. The MATRIX increases GPU utilization, fast-tracks AI development and training, simplifies task management, and minimizes infrastructure costs, all to an unprecedented degree.


While the MATRIX is deployed as a turnkey appliance in workstation, server and rackscale clusters, it’s especially beneficial to AI startups and incubators who need a deep learning platform to scale with them. The ultra-quiet MATRIX workstations feature a mini 2-GPU form factor and a 4-GPU form factor, and through the MATRIX software, GPU resources can be aggregated and presented to users as an on-premise GPU cloud for dynamic sharing. What this means is that AI companies can build virtual GPU clusters on demand, using hardware that sits quietly under a desk.

Our Presenter Series featured topics around GPU Virtualization and Cloud Computing for Machine Learning, including how the MATRIX enables AI startups to accelerate Time-to-Market, how to upgrade non-GPU infrastructures to include GPU resources, how to break through current performance limitations for GPU computing, and many more.


We were also honored to be interviewed by insideHPC to talk about the use cases of the MATRIX.


insideHPC: What are you showcasing at the booth today?

Dr. Rene Meyer (VP of Technology, AMAX): What we are showcasing here is a very interesting solution—a hardware/software solution. We not only present the hardware, but we put a software layer on top, which allows you to virtualize GPUs in those machines.

insideHPC: Can you tell me about some use cases and what problems it solves?

Dr. Rene Meyer: One of the use cases is for enterprise customers who purchased a few racks of hardware and have since learned that the software they use supports GPUs and benefits from acceleration. What they usually do is rip the old hardware out and replace it with new hardware featuring GPUs, which can be expensive. Rather than do that, they can add a few blocks of our high-density MATRIX servers, then use the MATRIX software to virtualize the GPUs and attach them to the existing cluster. So with the MATRIX, you can turn your existing non-GPU cluster into a GPU cluster, with minimal additional hardware and without performing a complete refresh.

insideHPC: Ok Rene. This MATRIX box has been described as groundbreaking. Can you tell me more about it?

Dr. Rene Meyer: The MATRIX offering is an end-to-end solution. It’s not just a very powerful deep learning box; it also has an integrated software layer for a plug-and-play solution. The software layer allows you to spin up instances—containers—which are pre-configured with Deep Learning frameworks like TensorFlow, Caffe, Torch, and so on. So you don’t have to worry about having IT configure, set up, or load things to make sure you have the latest version and everything is working. You can literally, at the click of a button, spin up instances and be ready to go.

insideHPC: So for developers this would be pretty powerful to get stuff out of the way and focus on work. Is that the idea?

Dr. Rene Meyer: That’s exactly the idea. You can start development with one of these boxes. Once you see that you need to upgrade or scale out because you need more power, there are various ways we can do this. One way is to buy multiple MATRIX boxes. Through virtualization, the MATRIX software lets you attach GPUs from one box to another or combine compute resources dynamically. Therefore, you can build more powerful servers or workstations for your workloads on demand. As you continue to grow, you can purchase more servers or workstations, which can be seamlessly integrated into your growing virtual GPU pool. What’s good for startups is that you can grow your computation power significantly without the need to build a data center or rent from a colo, reducing the time and cost of a traditional infrastructure.

For more information about AMAX’s MATRIX solution or Deep Learning platforms, please contact us!

Posted in AMAX News, Deep Learning, Enterprise Computing, GPU Computing, HPC Computing, Tradeshow/Events, Virtualization

GANs: When Machines vs Machines Aspire to Greatness

In our last blog post, we touched a little upon the concept of GANs, short for Generative Adversarial Networks. GANs are a relatively new branch of unsupervised machine learning. First introduced by Ian Goodfellow in 2014, they have spurred major interest among scientists and researchers with their wide applications and remarkably good results.

To make it sound even more interesting, the concept of GANs was recently described by Yann LeCun, Director of AI Research at Facebook, as the most important development in deep learning, and “the most interesting idea in the last 10 years in ML (Machine Learning).”

A GAN consists of two independent networks—one generative and one discriminative—that work separately and act as adversaries. Quite literally, the generative network generates novel synthesized instances, while the discriminative network discriminates between synthesized instances and real ones.


One way to interpret this is through the analogy of an art forger and an art investigator. The generator, in this case the forger, wants to create, say, a fake Van Gogh painting. He starts by learning what Van Gogh paintings look like, then imitates them with the goal of fooling other people. The discriminator, in this case the investigator, also starts by learning the characteristics of Van Gogh paintings in order to recognize fakes. Whenever one side loses (either the forger gets caught or the investigator gets fooled), he works harder to improve. To win the battle, both the forger and the investigator keep training and escalating until both become experts.

Now imagine expanding on this concept: if machines can soon create masterpieces in art and design, we may be seeing artists on the “Endangered Jobs” list very soon.

Application of GANs

GANs can be applied to multiple scenarios, including image classification, speech recognition, video production, robot behavior generation, etc. One of the most common applications is image generation – more specifically, the generation of “natural” images.

Many of you may have played the “What will you look like when you are old?” game online, or something of that sort, usually just for a laugh. With GANs technology available, scientists have improved the simulation to become much more reliable, and something that one day could be used to help missing person investigations.


In the application, the generative network was trained on 5,000 faces labeled with ages. The machine learns the characteristic signatures of aging and then applies them to faces to make them look older. The second step of the application takes the discriminative network and has it compare the original “before” images with the synthesized “after” images to see whether they show the same person.

When pitted against other face-aging techniques, the team using GANs achieved 60% more successful identifications of the “before” and “after” images as the same person.

In addition to face recognition, GANs have proven useful in astronomy research, thanks to a group of Swiss scientists. Up until now, the human ability to observe outer space has been limited by the capabilities of telescopes. However advanced modern telescopes are, scientists are never satisfied with the amount of detail they can show.

In the study, scientists took a space image and deliberately degraded its resolution. Using the degraded image and the original image, scientists trained the GAN to recover the degraded image as faithfully and genuinely as possible. Then, using the trained GAN, scientists were able to produce a much sharper version of the original image, finding it better able to recover features than anything used to date.


Extensions of GANs

Ever since the concept of GANs was introduced, researchers have focused on how to improve the stability of GAN training. More suitable architectures have been developed to put constraints on the training and to tackle specific image generation tasks.


A CGAN is an extension of the basic GAN with a conditional setting. It works by taking into account external information, such as a label, text, or another image, to determine a specific representation of the generated images. The scary cat drawing we mentioned in the previous blog and the space image recovery technique are both results of a CGAN. Other experimental applications include:

Text to image:


Image to image:



A LAPGAN is a Laplacian Pyramid of GANs, used to generate high-quality samples of natural images. Training a LAPGAN starts by breaking the original training task into multiple manageable stages. At each stage, a generative model is trained using a GAN. In other words, a LAPGAN increases the models’ learning capability by allowing them to be trained sequentially. According to the research paper, LAPGAN-generated images were mistaken for real images around 40% of the time, compared to 10% for a basic GAN.



DCGAN is short for Deep Convolutional GAN, a more stable architecture proposed in a paper published in 2016. Its generator works like a Convolutional Neural Network (CNN) run in reverse, bridging the gap between CNNs for supervised learning and unsupervised learning. In the paper, researchers predicted promising extensions of the DCGAN framework into domains such as video frame prediction and speech synthesis.



InfoGAN is an information-theoretic extension of the Generative Adversarial Network. It has been proven able to learn by maximizing the mutual information between a small subset of the latent variables and the observation. In practice, it can disentangle concepts such as the brightness, rotation, and width of an object, and even hairstyles and expressions on human faces.

Challenges of GANs

GANs have attracted major attention within the academic field since their advent three years ago. Near the end of last year, Apple published its very first AI paper, announcing its efforts in algorithm training using GANs.

In addition to the aforementioned extensions, more variations of GANs are being studied to further improve the model, as well as to tackle its shortcomings, including the difficulty and instability of the training process, as described in detail by Ian Goodfellow in his answer on Quora.


As researchers continue developing advancements to GAN models and scaling up the training, we can expect to see fairly accurate and realistic machine-generated samples of videos, images, text, interactions, etc. in the very near future. Which begs the question: if we see machines being pitted against each other in a manner that gives them human-like abilities to mimic and validate, would this mean that at some point, they will not only have the ability to reflect the world to us, but also have a hand in creating it, too?

If you’re ready to build your GANs and need the most powerful machine learning engines in the world, please visit


Posted in AMAX Services, Deep Learning

A Breakdown of AI, Machine Learning & Emerging Trends


These days, buzz around Artificial Intelligence is everywhere. You hear about self-driving cars, futuristic personal assistants, and computers beating professional human players at board games. AI has the potential to radically improve our lives with early diagnostic tools and customized treatment applications in the medical industry. On the flip side, Elon Musk predicts that AI-driven technologies will displace 12-15% of the global workforce within 20 years. Stephen Hawking warns us that the rise of AI could be the best or worst thing to happen to humanity, depending on how it’s used.

Whatever the future holds, what we may not realize is that our lives already revolve around AI, much of which was developed using Machine Learning, a subfield of AI that trains machines to learn from data. From targeted ads on social media, to impulse buy recommendations from your favorite online retailers, even complex fraud detection systems that can distinguish unusual credit card purchases from fraudulent ones in near real time—all these are in some shape or form developed using Machine Learning to become life-integrated AI.

So What’s the Distinction between AI and Machine Learning?

AI is a branch of computer science geared towards building machines capable of human-like intelligence. Machine Learning is a subset of AI involving the development of computational methods that enable a machine to learn in order to achieve that intelligence. As Arthur Samuel, the father of Machine Learning, described it, it’s giving machines “the ability to learn without being explicitly programmed.”

So think of AI as the intelligence within the machine or software, and Machine Learning as the underlying discipline that helps the artificial organism attain intelligence and common sense.

What about Deep Learning?

Going one step further, when you hear the term Deep Learning, know that this is the most promising and cutting-edge subset of Machine Learning, involving the development of algorithms to build artificial neural networks that mimic the structure and function of the human brain. This discipline was godfathered by Andrew Ng while at Google, where he famously taught his algorithm to recognize cats.
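To make the idea concrete, here is a hypothetical minimal sketch of an artificial neural network in plain numpy: a tiny two-layer network learning XOR, a classic function that no single neuron can represent. All layer sizes and hyperparameters here are illustrative choices, not from any particular framework.

```python
import numpy as np

# XOR truth table: inputs and targets
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(0, 1, (2, 8)), np.zeros(8)   # hidden layer (8 units)
W2, b2 = rng.normal(0, 1, (8, 1)), np.zeros(1)   # output layer
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))
lr = 0.5

for _ in range(10000):
    # Forward pass: tanh hidden layer, sigmoid output
    h = np.tanh(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: gradient of mean squared error, chain rule by hand
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * (1 - h ** 2)
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(0)

print(out.round().ravel())  # aims to recover [0, 1, 1, 0]
```

The point of the sketch is only the mechanism: stacked layers of weighted sums and nonlinearities, trained by propagating errors backwards, which is the same principle that deep networks scale up.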


Deep Learning is particularly effective in the realm of image recognition. Promising applications include facial recognition for video surveillance, advanced screening technology for early cancer detection and prenatal care, weather forecasting, and financial modeling.

Deep Learning is so promising that NVIDIA, the maker of GPU cards that are integral to accelerated compute platforms for Deep Learning, has put its stake in the ground to be the world’s leading AI company. NVIDIA currently holds 90% of the market for the GPU technology powering Machine and Deep Learning, and Goldman Sachs believes NVIDIA’s total addressable market in AI and Deep Learning to be an estimated $5-$10 billion out of a $40 billion market.

3 Emerging Machine Learning Trends to Watch

So where are we today? Andrew Ng’s cat experiment was an example of Supervised Learning, which involves telling the machine what the answer is for a particular input (i.e., cat), then feeding it massive numbers of examples of that input as training. This is currently the most common technique for training neural networks.

Beyond Supervised Learning, there are three major emerging Machine Learning methods that have shown great promise for the development of AI.

Unsupervised Learning

While Supervised Learning requires a large pool of labeled training data, Unsupervised Learning involves large training pools of inputs with no labels. So rather than telling the system what the inputs are, it is left to the machine to figure out the structure and relationships between different inputs.

Two common approaches include cluster analysis, which looks for hidden patterns or groupings within data, and anomaly detection, which looks for outliers. Unsupervised Learning has proven particularly useful for data mining, with use cases that include fraud detection, medical image analysis, and marketing campaigns to identify trends and behaviors within demographics.
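As an illustration of cluster analysis, here is a hedged minimal sketch of the classic k-means algorithm in plain numpy, run on two synthetic, unlabeled groups of points. The data and every parameter below are made up for the example; the algorithm is never told which group a point belongs to.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two unlabeled blobs of 2-D points, centered near (0, 0) and (5, 5)
data = np.vstack([rng.normal(0, 0.5, (100, 2)),
                  rng.normal(5, 0.5, (100, 2))])

def kmeans(X, k, iters=50):
    # Initialize centers at k randomly chosen data points
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center...
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        # ...then move each center to the mean of its assigned points
        centers = np.array([X[labels == j].mean(0) for j in range(k)])
    return centers, labels

centers, labels = kmeans(data, k=2)
print(np.round(sorted(centers[:, 0])))  # centers land near x = 0 and x = 5
```

The algorithm recovers the hidden grouping purely from the geometry of the inputs, which is exactly the "figure out the structure yourself" behavior described above.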


Reinforcement Learning

Reinforcement Learning begins with little data, but the system is trained through reinforcement—rewards and penalties—similar to how children learn.

Reinforcement Learning is based on three fundamental elements: States, Actions, and Rewards. A machine learns by applying an Action to a State, with the goal being a transition to a new, desirable State. If the resulting new State is desirable, the system receives a Reward. If the new State is not desirable, the system is penalized.

With Reinforcement Learning, over time, the system learns to pick sequences of actions (i.e., policies) that work optimally to transition certain States to desirable States. Reinforcement Learning has shown success in teaching machines to play games, as well as in advertising to influence consumer behavior. It’s credited as the branch of Machine Learning that teaches intuitive judgment.
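The States/Actions/Rewards loop above can be sketched with tabular Q-learning, one of the simplest Reinforcement Learning algorithms. The toy "corridor" environment below is hypothetical, invented purely for illustration: five states in a line, with a reward only at the far end.

```python
import random

# A toy corridor: states 0..4, the Reward sits at state 4.
# Actions: 0 = step left, 1 = step right. Reaching the goal pays +1.
N_STATES, GOAL = 5, 4
Q = [[0.0, 0.0] for _ in range(N_STATES)]   # value of each (State, Action)
alpha, gamma, epsilon = 0.5, 0.9, 0.2
random.seed(0)

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    return s2, (1.0 if s2 == GOAL else 0.0)

for _ in range(500):                         # episodes
    s = 0
    while s != GOAL:
        # epsilon-greedy: mostly exploit the best known Action, sometimes explore
        a = random.randrange(2) if random.random() < epsilon else Q[s].index(max(Q[s]))
        s2, r = step(s, a)
        # Q-learning update: nudge Q toward reward + discounted future value
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

policy = [Q[s].index(max(Q[s])) for s in range(GOAL)]
print(policy)  # expected: [1, 1, 1, 1] (always move right toward the Reward)
```

The learned policy is exactly the "sequence of actions that transitions States to desirable States" described above, discovered from nothing but the Reward signal.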

Generative Adversarial Networks (GANs)

Models trained on labeled data (i.e., Supervised Learning) are called discriminative models. Generative models, once trained, hand the keys to the machines to see what they can create.

So assume that a generative network has been trained to correctly recognize a set of inputs. For this example, let’s say, cats.


This is a Cat.

A generative AI program can now be created to generate cats. Kind of.


A GAN takes it one step further. It pits a generative network against an image-recognition network, both of which have been trained on a specific set of inputs. The generative network, aka the generator, produces fake images. The image-recognition network, aka the discriminator, tries to correctly tell the fake images from the real ones. The discriminator then checks whether the images were real or fake so that it can get better at distinguishing between the two, while also telling the generator how to tweak its output to make its images more realistic. Think of it as algorithmic sparring that improves both partners: one gets better at spotting fakes, and the other gets better at producing them.
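The sparring loop can be sketched at toy scale. What follows is a deliberately tiny, hypothetical setup, not a production GAN: the generator is a single linear map trying to match a 1-D Gaussian of "real" data, the discriminator is a logistic regression, and both are updated with hand-derived gradients of the standard GAN losses.

```python
import numpy as np

rng = np.random.default_rng(1)
REAL_MEAN, REAL_STD = 4.0, 0.5            # the "real" data distribution

a, b = 1.0, 0.0                           # generator: fake = a*z + b, starts far off
w, c = 0.1, 0.0                           # discriminator: D(x) = sigmoid(w*x + c)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-np.clip(t, -30, 30)))
lr, batch = 0.02, 64

for _ in range(3000):
    real = rng.normal(REAL_MEAN, REAL_STD, batch)
    z = rng.normal(0, 1, batch)
    fake = a * z + b

    # Discriminator step: push D(real) -> 1 and D(fake) -> 0
    d_real, d_fake = sigmoid(w * real + c), sigmoid(w * fake + c)
    grad_w = np.mean((d_real - 1) * real) + np.mean(d_fake * fake)
    grad_c = np.mean(d_real - 1) + np.mean(d_fake)
    w -= lr * grad_w; c -= lr * grad_c

    # Generator step: push D(fake) -> 1, i.e. fool the discriminator
    d_fake = sigmoid(w * fake + c)
    dx = (d_fake - 1) * w                 # d/dfake of -log D(fake)
    a -= lr * np.mean(dx * z); b -= lr * np.mean(dx)

print(round(b, 1))  # b has drifted from 0 toward REAL_MEAN (around 4)
```

Even in this 1-D caricature, the generator's only learning signal is the discriminator's feedback, which is the essence of the two-network sparring described above.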

So the results are things like realistic AI-generated images of space and volcanoes.


Or perhaps, one day, machines will be our society’s most admired artists.


Which leads us to the big question:

What will you do with AI and the emerging trends within Machine Learning?

We would love to hear from you!

Posted in AMAX Services, Deep Learning, Enterprise Computing

Things You Should Know About NVIDIA Tesla® P100

NVIDIA’s new GPU accelerator, the NVIDIA® Tesla® P100, is a great option for both High Performance Computing (HPC) and Deep Learning workloads. It comes in two form factors: PCIe, with either 12GB or 16GB of HBM2 memory, and SXM2, with 16GB of HBM2 memory and the NVIDIA NVLink™ high-speed interconnect.

Here is the breakdown. At the beginning of this quarter, NVIDIA started shipping the Tesla P100 as the first member of NVIDIA’s new Pascal architecture family. The Pascal architecture replaces both the Kepler architecture, optimized for double precision performance (K40, K80), and the Maxwell architecture, optimized for single precision performance (M40, M4). Double precision performance is required for HPC and scientific applications, while single precision performance is needed for rendering and deep learning applications. Note that NVIDIA skipped a beat on the HPC side: the previous Maxwell architecture did not include hardware optimized for double precision scientific applications.

Besides substantial improvements in single precision (SP, FP32) and double precision (DP, FP64) performance, the P100 sports a new native data type: half precision (HP, FP16). Why half precision? The short answer is deep learning (DL). While the accuracy of artificial neural networks does not improve with increased precision, computation can be sped up by reducing it. Instead of performing one SP operation, the P100 can perform 2 HP operations simultaneously: 11 TFLOPS SP turns into 22 TFLOPS HP, a performance increase of 3X over NVIDIA’s Tesla M40, the previous generation’s highest-performing enterprise-grade DL card. NVIDIA is currently working on updating CUDA libraries and frameworks like Caffe to enable the new data type. In that sense, the P100 is not only an HPC card but, more so, a true DL card. The table below summarizes double, single, and half precision performance and other metrics for the top-of-the-line Kepler, Maxwell, and Pascal families:
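The FP16 trade-off is easy to demonstrate in numpy (a rough illustration only; the P100's throughput gain comes from packed FP16 hardware units, which numpy does not emulate):

```python
import numpy as np

# Half precision halves the storage per value, so twice as many
# operands fit in the same memory bandwidth and register space.
x32 = np.ones(1_000_000, dtype=np.float32)
x16 = x32.astype(np.float16)
print(x32.nbytes, x16.nbytes)   # 4000000 vs 2000000 bytes

# The cost is precision: FP16 keeps roughly 3 decimal digits,
# so small increments simply vanish in the rounding.
print(np.float16(1.0) + np.float16(0.0001))  # still 1.0
```

For neural network training and inference, that lost precision is usually harmless, which is exactly why the P100 trades it for doubled arithmetic throughput.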

As shown in the table, the Tesla P100 comes in two form factors. NVIDIA’s P100 PCIe form factor features the traditional PCIe 3.0 x16 interface for card-to-card and card-to-CPU communication. For some applications, the PCIe interface can be a bottleneck limiting overall system performance. To address the issue, NVIDIA added NVLink, a new interconnect co-developed by IBM and NVIDIA that is 5x faster than x16 PCIe 3.0, enabling significantly faster communication between GPUs and from GPU to CPU. NVLink is not available for cards in the PCIe form factor, but only on boards with the new mezzanine-card-like SXM2 form factor. Each SXM2 card supports 4 NVLink channels for both card-to-card and card-to-CPU communication.

Several x86 board manufacturers, including SMC and Quanta, will support the SXM2 form factor. For x86-based systems, communication between GPU and CPU remains PCIe-based. Currently, the maximum number of supported P100 cards is 4 per CPU and 8 per server. Probably the most prominent example of an x86-based system is the NVIDIA DGX-1 box.

If needed, IBM’s OpenPOWER platform provides further acceleration. Starting with the POWER8 CPU family, IBM supports NVLink for enhanced data transfer between GPU and CPU. The first such systems, from Wistron, are now available.

In summary, NVIDIA’s new Tesla P100 GPU Accelerator is a well-rounded high-end processor for both HPC and DL applications. A new native data type, half precision or FP16, is introduced to essentially double the TFLOPS performance for DL applications. The P100 is offered as a dual-width PCIe card without NVLink and in an SXM2 form factor with 4 NVLink channels. While x86 platforms only support GPU-to-GPU NVLink communication, IBM’s new OpenPOWER platform and POWER8 CPUs also enable CPU-to-GPU communication via NVLink.

About AMAX

AMAX is a global leader in application-tailored data center, HPC and OEM solutions. Recognized by several industry awards, including the Best of VMworld and Intel Server Innovation Award for the CloudMax Converged Cloud Infrastructure, AMAX aims to provide cutting-edge solutions to improve efficiency and cut costs for the modern data center. Founded in 1979 and headquartered in Silicon Valley (with additional locations in China and Ireland), AMAX is a full-service technology solutions provider specializing in innovative server-to-rack level solutions developed for data center, HPC, cloud and big data applications.

From white box server-to-rack integration, high-performance deep learning platforms or converged infrastructure solutions featuring OpenStack, Open Compute and SDN, to a comprehensive menu of professional services, AMAX is the full-service partner you need to help modernize your IT operations. To learn more or request a quote, contact AMAX.

Posted in Deep Learning