Introduction

Digital nomads are people who use telecommunications technologies to earn a living and, more generally, conduct their lives in a nomadic manner. Such workers often work remotely from foreign countries, coffee shops, public libraries, co-working spaces, or recreational vehicles. This is often accomplished through the use of devices that have wireless Internet capabilities, such as smartphones or mobile hotspots [10].
As a Digital Nomad myself, I can personally attest to how wonderful Digital Nomadism can be. Unfortunately, the virtues of Digital Nomadism and operational objectives are not always aligned, so infrastructure and collaboration tools are exceedingly important considerations when managing resources this way. As a manager and decision-maker for a group that utilizes Digital Nomads, I must take into consideration the processes and technologies necessary to drive efficiency gains while simultaneously considering their implications for the people involved in them.

Throughout the remainder of this blog I am going to focus on some of the considerations necessary for supporting a group of Digital Nomads that performs resource-intensive computational work, such as Deep Learning. It is worth noting that many of the talking points that follow are equally relevant to supporting a group that performs software development or data analytics, with the likely exception of the GPU considerations. Whether you are a decision maker for one of these groups, or a procurement professional supporting such a group, I hope you find this to be a useful guide.

A Quick Detour

Deep Learning, Machine Learning, and All The Fun Stuff

Deep learning is a subfield of machine learning that focuses on developing and implementing algorithms inspired by the structure and function of the brain. According to Andrew Ng, the idea behind deep learning is that we may use brain simulations to:
  • Make learning algorithms much better and easier to use.
  • Make revolutionary advances in machine learning and AI. 
These algorithms are called artificial neural networks. Deep learning has become popular because, while most learning algorithms eventually reach a plateau in their performance, deep learning is believed to be the first class of algorithms that is scalable - that is, performance continues improving as you feed them more data (with some caveats that we won't get into here) [11].
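
To make the idea of an artificial neural network concrete, below is a minimal sketch (in plain NumPy) of a tiny two-layer network learning the XOR function via gradient descent. The architecture, learning rate, and iteration count are arbitrary illustrative choices, not recommendations.

```python
import numpy as np

# A minimal two-layer neural network learning XOR with plain gradient
# descent. Sizes and hyperparameters are arbitrary illustrative choices.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)   # input -> hidden
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)   # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for step in range(10000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: gradients of the mean squared error
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Gradient descent update
    W2 -= lr * h.T @ d_out / len(X)
    b2 -= lr * d_out.mean(axis=0)
    W1 -= lr * X.T @ d_h / len(X)
    b1 -= lr * d_h.mean(axis=0)

print(out.round(2))  # should approach [[0], [1], [1], [0]]
```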

CPU vs GPU

When most people discuss computer performance, they are usually referring to CPUs. This is appropriate since for most activities, especially multi-tasking, the CPU is a highly important consideration. However, for heavy matrix/vector operations such as graphics processing (e.g., playing computer games) and deep learning, the GPU is king. A CPU is optimized to fetch small amounts of memory quickly (low latency), while a GPU is optimized to fetch large amounts of memory at once (high throughput). One way to visualize the difference is a squirt gun versus a hose in repeated water fights: the squirt gun can get your neighbor wet very quickly at any moment, while the hose can get them much more wet but requires much more time to set up before each fight. This is not to say that CPUs do not matter at all when performing matrix computations; without a decent CPU, one might run into bottlenecks in processes like data loading and preprocessing.
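
To see the squirt gun versus hose trade-off in practice, here is a small sketch using PyTorch, assuming it is installed and the machine has a CUDA-capable NVIDIA GPU; the matrix size is an arbitrary illustrative choice.

```python
import time
import torch

# Time one large matrix multiplication on the CPU vs. the GPU.
# The 4096x4096 size is an arbitrary choice for illustration.
n = 4096
a = torch.randn(n, n)
b = torch.randn(n, n)

start = time.perf_counter()
c_cpu = a @ b
print(f"CPU: {time.perf_counter() - start:.3f}s")

if torch.cuda.is_available():
    a_gpu, b_gpu = a.cuda(), b.cuda()   # one-time "hose setup": copy to GPU memory
    torch.cuda.synchronize()            # wait for the copies to finish
    start = time.perf_counter()
    c_gpu = a_gpu @ b_gpu
    torch.cuda.synchronize()            # GPU calls are async; wait before timing
    print(f"GPU: {time.perf_counter() - start:.3f}s")
```

On a typical rig the GPU time will usually be one to two orders of magnitude smaller, but note the one-time cost of copying the matrices into GPU memory - that is the hose being set up.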

GPU

Getting a bit more into the weeds, GPUs are computer chips that were originally developed for rendering images, a task that requires a heavy amount of matrix computation, and they have been optimized accordingly. Coincidentally, deep learning also requires a heavy amount of matrix manipulation, and so GPUs have since been re-purposed for the needs of data scientists and machine learning engineers. For a more in-depth explanation see this excellent Quora post.

It is important to note that when I say GPU, as of writing this blog, I really just mean NVIDIA. The reason is historical: NVIDIA's standard libraries made it easy for the first wave of deep learning libraries to be built on CUDA. It is for this reason, along with continued strong community support from NVIDIA, that deep learning capabilities grew so rapidly. As of the publication of this blog, AMD and Intel are simply not viable options. See this post for a great analysis of the current state of GPUs.
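
As a quick sanity check, a CUDA-backed library such as PyTorch (assuming it is installed) can report whether it actually sees an NVIDIA GPU on a given machine:

```python
import torch

# Report whether PyTorch can see a CUDA-capable NVIDIA GPU.
if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        print(f"GPU {i}: {torch.cuda.get_device_name(i)}")
else:
    print("No CUDA device visible; training would fall back to the CPU.")
```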

Other Relevant Hardware

Other components that are necessary to build a deep learning rig include the following (a quick script for checking a machine you already own against these suggestions appears after this list):
  • RAM: Random Access Memory. It provides short-term data storage, holding the information you are actively using so that you can access it quickly. You will likely want at least 32GB, but your needs are largely defined by your use case - pushing research and state-of-the-art capabilities requires a very different capacity than Kaggle competitions or building a startup.
  • Hard Drive: It handles long-term data storage. You will likely want a solid-state drive (SSD) since it is much faster than a traditional hard disk; if money is a concern, a hybrid drive (SSHD) may be a viable alternative. You will likely want at least 1TB because deep learning datasets can get big.
  • Motherboard: In terms of your CPU, PCIe lanes are an important consideration. Even more important, however, is your ultimate use case and potential need for multiple GPUs. Know your minimally viable configuration and your long-term objectives, since you want to ensure that the combination of your CPU and motherboard supports the number of GPUs you plan to run.
  • Power Supply: You need a power supply that can produce as much power as your components will consume.
  • Case: You need something to house all of the above components that appropriately addresses protection and heat dissipation.
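
As promised, here is a rough sketch for comparing a machine you already own against the suggestions above. It assumes the third-party psutil package is installed (pip install psutil); the 32GB and 1TB thresholds simply echo the figures mentioned in the list.

```python
import shutil
import psutil  # third-party: pip install psutil

# Compare this machine against the rough minimums suggested above.
GIB = 1024 ** 3
ram_gib = psutil.virtual_memory().total / GIB
disk_tib = shutil.disk_usage("/").total / (1024 ** 4)

print(f"RAM:  {ram_gib:.1f} GiB "
      f"({'OK' if ram_gib >= 32 else 'below the ~32GB suggestion'})")
print(f"Disk: {disk_tib:.2f} TiB "
      f"({'OK' if disk_tib >= 1 else 'below the ~1TB suggestion'})")
print(f"CPU:  {psutil.cpu_count(logical=False)} physical cores")
```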

Decisions, Decisions

Now that we have established a barebones understanding of the technologies involved in a computationally intensive rig, we will simply list the options for deployment within a group that facilitates Digital Nomadism. For a more in-depth analysis that provides pros and cons for each, I highly recommend [1].

Tower

A full-blown desktop rig used to do your work locally. Not recommended, since it sacrifices the portability that Digital Nomadism is built on.

Notebook

A high-end laptop.

Notebook + Tower

A low-to-mid tier laptop used to run code remotely via the terminal (i.e., SSH-ing) on a tower that you build and host somewhere yourself.

Notebook + eGPU

A laptop paired with a pre-built external GPU (eGPU) enclosure that affords plug-and-play capability.

Notebook + Cloud

A low-to-mid tier laptop used to run code remotely via the terminal (i.e., SSH-ing) on someone else's hardware (AWS, Google Cloud Platform, Microsoft Azure, etc.).
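
For either of the SSH-based options (Notebook + Tower or Notebook + Cloud), the day-to-day workflow boils down to launching long-running jobs on the remote machine from your laptop. Below is a minimal sketch using the third-party paramiko SSH library; the hostname, username, key path, and script name are all placeholders.

```python
import os
import paramiko  # third-party: pip install paramiko

# Run a training script on a remote rig or cloud instance over SSH.
# Hostname, username, key path, and script path are placeholders.
client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(
    "my-rig.example.com",
    username="nomad",
    key_filename=os.path.expanduser("~/.ssh/id_rsa"),
)
# Launch the (placeholder) training script detached from the session,
# so it keeps running after the laptop disconnects.
stdin, stdout, stderr = client.exec_command(
    "cd ~/project && nohup python train.py > train.log 2>&1 &"
)
print(stdout.read().decode(), stderr.read().decode())
client.close()
```

The nohup ... & idiom detaches the job so it survives a dropped connection - a real concern when your uplink is a coffee shop's Wi-Fi.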

Costs

There are both direct and indirect costs associated with all of the options listed above. For instance, consider the opportunity costs and risks. Having a high-end laptop is great because you can run your code anywhere, but it will likely come with caveats: a high upfront cost, operational risks if you are accessing data locally, upgrade and maintenance limitations, and probably a hefty weight that makes it less than desirable from a portability perspective. Alternatively, building and deploying your own rig may not be desirable - at least at the beginning - because you now have infrastructure costs and must allocate time towards management and maintenance. This leads one to believe that cloud computing might be preferable, but then it's highly important to consider how much data you have, the costs associated with maintaining it, the security risks around managing it, and the execution times of your algorithms. In the long run there's a very real chance you'll spend less money not using web services once you've got yourself up and running.
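
To make that last point concrete, here is a back-of-the-envelope break-even calculation. All of the prices are illustrative assumptions (roughly in the spirit of the comparisons in [6] and [9]), not quotes.

```python
# Back-of-the-envelope break-even: owning a rig vs. renting cloud GPUs.
# All prices below are illustrative assumptions, not quotes.
rig_upfront = 3000.0      # assumed cost to build a GPU rig (USD)
rig_power_per_hr = 0.10   # assumed electricity cost per training hour
cloud_per_hr = 3.00       # assumed hourly rate for a comparable GPU instance

hours = rig_upfront / (cloud_per_hr - rig_power_per_hr)
print(f"Break-even after ~{hours:.0f} GPU-hours")
print(f"That is about {hours / 40:.0f} weeks at 40 training hours/week")
```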

Further Reading

For a better understanding of a logical approach towards strategic sourcing as it relates to this topic at the enterprise level, it may be useful to explore resources like those offered by Source One. Alternatively, feel free to reach out to Source One's IT & Telecommunications expert David Pastore directly. Finally, for much more in-depth reading on the specific topics covered above, see the works cited below.

Works Cited

[1] Monn, Dominic. “Which hardware should I use as a Remote Machine Learning Engineer.” Towards Data Science (blog), 30 May 2018, https://towardsdatascience.com/which-hardware-should-i-use-as-a-remote-machine-learning-engineer-35af52301d3c.

[2] Fortuner, Brendan. “Building your own deep learning box.” Towards Data Science (blog), 12 Feb 2017, https://towardsdatascience.com/building-your-own-deep-learning-box-47b918aea1eb.

[3] Condo, Nick. “Build a Deep Learning Rig for $800.” Towards Data Science (blog), 22 Feb 2017, https://towardsdatascience.com/build-a-deep-learning-rig-for-800-4434e21a424f.

[4] Ragalie, Alex. “Build your own top-spec remote-access Machine Learning rig: a very detailed assembly and installation guide for a dual boot Ubuntu 16.04/Win 10 with CUDA 8 run on i7 6850K with 2x GTX 1080Ti GPUs.” Medium (blog), 2 Nov 2017, https://medium.com/@aragalie/build-your-own-top-spec-remote-access-machine-learning-rig-a-very-detailed-assembly-and-dae0f4011a8f.

[5] Biewald, Lukas. “Build a super fast deep learning machine for under $1,000.” O'Reilly (blog), 1 Feb 2017, https://www.oreilly.com/learning/build-a-super-fast-deep-learning-machine-for-under-1000.

[6] Chen, Jeff. “Why building your own Deep Learning Computer is 10x cheaper than AWS.” Medium (blog), 15 July 2019, https://medium.com/the-mission/why-building-your-own-deep-learning-computer-is-10x-cheaper-than-aws-b1c91b55ce8c.

[7] “Deep learning workstation 2018-2019 buyer's guide. BIZON G3000 deep learning devbox review, benchmark. 5X times faster vs Amazon AWS.” Bizon-Tech (blog), 10 Oct 2019, https://bizon-tech.com/blog/buyers-guide2018-bizon-g3000-gpu-deeplearning-workstation-review-benchmarks-5xtimes-faster-aws.

[8] Chen, Jeff. “How to build the perfect Deep Learning Computer and save thousands of dollars.” Medium (blog), 15 July 2019, https://medium.com/the-mission/how-to-build-the-perfect-deep-learning-computer-and-save-thousands-of-dollars-9ec3b2eb4ce2.

[9] Chen, Jeff. “Why your personal Deep Learning Computer can be faster than AWS and GCP.” Medium (blog), 15 July 2019, https://medium.com/the-mission/why-your-personal-deep-learning-computer-can-be-faster-than-aws-2f85a1739cf4.

[10] Wikipedia, The Free Encyclopedia, s.v. “Digital Nomad,” accessed 8 August 2019, https://en.wikipedia.org/wiki/Digital_nomad.

[11] Huang, Xin. “Andrew Ng: Deep Learning, Self-Taught Learning and Unsupervised Feature Learning.” YouTube video, 45:46, 13 May 2013, https://www.youtube.com/watch?v=n1ViNeWhC24.

