Using Deep Neural Network Acceleration for Image Analysis in Drug Discovery

What’s New: Intel is collaborating with Novartis* on the use of deep neural networks (DNN) to accelerate high content screening – a key element of early drug discovery. The collaboration team cut the time to train image analysis models from 11 hours to 31 minutes – an improvement of greater than 20 times [1].

Collaboration team members from Novartis and Intel used eight CPU-based servers, a high-speed fabric interconnect and optimized TensorFlow to achieve the improvement in time needed to process a dataset of 10K images.
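As a quick sanity check on the headline figure, the arithmetic can be sketched as follows. This is only an illustration using the rounded timings quoted above; the variable names are hypothetical, and the 21.7x figure comes from the fine print at the end of this release.

```python
# Rounded timings quoted in the article (assumed to be approximate).
baseline_minutes = 11 * 60   # single-node training time: about 11 hours
scaled_minutes = 31          # eight-server training time: 31 minutes

speedup = baseline_minutes / scaled_minutes
print(f"Speed-up from rounded timings: {speedup:.1f}x")

# Work backwards: which baseline would give the 21.7x stated in the fine print?
implied_baseline_hours = 21.7 * scaled_minutes / 60
print(f"Baseline implied by 21.7x: {implied_baseline_hours:.1f} hours")
```

The rounded timings give a ratio just above 21, which is consistent with "greater than 20 times". The small gap to 21.7x would be explained if the single-node run took slightly longer than 11 hours, though the release does not give the unrounded timings.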

Why It’s Important: High content screening of cellular phenotypes is a fundamental tool supporting early drug discovery. The term “high content” signifies the rich set of thousands of pre-defined features (such as size, shape, texture) that are extracted from images using classical image-processing techniques. High content screening allows analysis of microscopic images to study the effects of thousands of genetic or chemical treatments on different cell cultures.

The promise of deep learning is that relevant image features that can distinguish one treatment from another are “automatically” learned from the data. By applying deep neural network acceleration, biologists and data scientists at Intel and Novartis hope to speed up the analysis of high content imaging screens. In this joint work, the team is focusing on whole microscopy images as opposed to using a separate process to identify each cell in an image first. Whole microscopy images can be much larger than those typically found in deep learning datasets. For example, the images used in this evaluation are more than 26 times larger than images typically used from the well-known ImageNet* dataset of animals, objects and scenes.
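The size comparison can be reproduced with a short calculation. The dimensions below are assumptions, not figures stated in this release: BBBC-021 images are taken as 1280 x 1024 pixels with three fluorescence channels, and ImageNet inputs as the commonly used 224 x 224 RGB crop.

```python
# Assumed dimensions, not stated in the article:
# BBBC-021 images as 1280 x 1024 pixels with 3 fluorescence channels,
# ImageNet inputs as the commonly used 224 x 224 RGB crop.
bbbc_values = 1280 * 1024 * 3
imagenet_values = 224 * 224 * 3

ratio = bbbc_values / imagenet_values
print(f"BBBC-021: {bbbc_values / 1e6:.1f} million values per image")
print(f"ImageNet crop: {imagenet_values / 1e6:.2f} million values per image")
print(f"Ratio: {ratio:.1f}x")
```

Under these assumptions the ratio comes out just above 26, in line with the "more than 26 times" comparison and with the 3.9-megapixel image size quoted later in this release.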

Deep convolutional neural network models for analyzing microscopy images typically work with millions of pixels per image, millions of model parameters and possibly thousands of training images at a time. That constitutes a high computational load. Even with advanced computational capabilities on existing computing infrastructure, deeper exploration of DNN models can take prohibitively long.

To solve these challenges, the collaboration is applying deep neural network acceleration techniques to process multiple images in significantly less time while extracting greater insight from image features that the model ultimately learns.

What It Looks Like: The collaboration team, with representatives from Novartis and Intel, has shown a more than 20 times [1] improvement in the time to process a dataset of 10K images for training. Using the Broad Bioimage Benchmark Collection* 021 (BBBC-021) dataset, the team has achieved a total processing time of 31 minutes with over 99 percent accuracy.

For this result, the team used eight CPU-based servers, a high-speed fabric interconnect and optimized TensorFlow [1]. By exploiting data parallelism in deep learning training and taking full advantage of the large memory supported by the server platform, the team was able to scale to more than 120 3.9-megapixel images per second with 32 TensorFlow* workers.
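The idea behind data parallelism can be illustrated with a minimal, framework-free sketch. This is not the team's TensorFlow and Horovod code: it uses a toy linear model, and `gradient`, `allreduce_mean` and `data_parallel_step` are hypothetical helpers written for this illustration. Each simulated worker computes a gradient on its own shard of the batch, the gradients are averaged (the role an allreduce plays in a real cluster), and every worker applies the same update.

```python
import random

def gradient(w, b, batch):
    """Mean-squared-error gradient for the toy model y = w * x + b."""
    n = len(batch)
    gw = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    gb = sum(2 * (w * x + b - y) for x, y in batch) / n
    return gw, gb

def allreduce_mean(grads):
    """Average per-worker gradients; stands in for a real allreduce."""
    k = len(grads)
    return tuple(sum(g[i] for g in grads) / k for i in range(len(grads[0])))

def data_parallel_step(w, b, batch, num_workers, lr):
    """One synchronous step: shard the batch, average gradients, update."""
    shards = [batch[i::num_workers] for i in range(num_workers)]
    grads = [gradient(w, b, shard) for shard in shards]  # one per worker
    gw, gb = allreduce_mean(grads)
    return w - lr * gw, b - lr * gb

random.seed(0)
# Toy data from y = 3x + 1; 64 samples split evenly over 32 workers.
data = [(x, 3 * x + 1) for x in (random.uniform(-1, 1) for _ in range(64))]

# With equal-sized shards, the averaged gradient matches the full-batch one.
full = gradient(0.0, 0.0, data)
shards = [data[i::32] for i in range(32)]
averaged = allreduce_mean([gradient(0.0, 0.0, s) for s in shards])
assert all(abs(f - a) < 1e-9 for f, a in zip(full, averaged))

w, b = 0.0, 0.0
for _ in range(500):
    w, b = data_parallel_step(w, b, data, num_workers=32, lr=0.1)
print(f"w = {w:.3f}, b = {b:.3f}")
```

Because the averaged gradient equals the full-batch gradient when shards are the same size, adding workers changes how fast each step is computed rather than what is computed, which is why throughput can scale with the number of workers when communication is fast enough.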

What’s Next: While supervised deep learning methods are essential to accelerating image classification and speeding time to insight, they depend on large expert-labeled datasets to train the models. The time and manual effort necessary to create such datasets are often prohibitive. Unsupervised deep learning methods, which can be applied to unlabeled microscopy images, hold the promise of revealing novel insights for cellular biology and ultimately drug discovery. This will be the focus of continuing efforts.

More Context: Artificial Intelligence at Intel | Advancing Data-Driven Healthcare Solutions | 2018 Intel DevCon (Press Kit)

The Fine Print:

[1] 20x claim based on 21.7x speed up achieved by scaling from single node system to 8-socket cluster.

8-socket cluster node configuration:

CPU: Intel® Xeon® 6148 Processor @ 2.4GHz

Cores: 40

Sockets: 2

Hyper-threading: Enabled

Memory/node: 192GB, 2666MHz

NIC: Intel® Omni-Path Host Fabric Interface (Intel® OP HFI)

TensorFlow: v1.7.0

Horovod: 0.12.1

OpenMPI: 3.0.0

Cluster: ToR Switch: Intel® Omni-Path Switch

Single node configuration:

CPU: Intel® Xeon® Phi Processor 7290F


1x 1.6TB Intel® SSD DC S3610 Series SC2BX016T4

1x 480GB Intel® SSD DC S3520 Series SC2BB480G7

Intel® MKL 2017/DAAL/Intel Caffe


BBBC-021: Ljosa V, Sokolnicki KL, Carpenter AE, Annotated high-throughput microscopy image sets for validation, Nature Methods, 2012

ImageNet: Russakovsky O et al, ImageNet Large Scale Visual Recognition Challenge, IJCV, 2015

Tensorflow: Abadi M et al, Large-Scale Machine Learning on Heterogeneous Systems, Software available from

Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at

Intel, the Intel logo, and Xeon are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries.

About Intel
Intel (NASDAQ: INTC) expands the boundaries of technology to make the most amazing experiences possible. Information about Intel can be found at and

*Other names and brands may be claimed as the property of others.