Backend Development 101: Prime Numbers and Multi-threading | Hacker Noon

Author profile picture

@mashAdam Umar

Learning and growing.

Recently, I have been trying to expand my knowledge as a backend developer. I want to understand solving problems at scale and also breaking down big tasks into little chunks of tasks.

Before I bore you with gibberish, what we will be trying to achieve today is simple

> Find all the prime numbers from 1…..n

A very classic interview question. To be honest I didn’t really settle down to solve it, not because I didn’t want to but because I had a different end goal in mind.

Get the simplest code you can find and try to achieve at a lesser time using multithreading in NodeJS

Before we dive into the code, we need to understand what the problem is. When you run a typical NodeJS program what you get is a single process, single-thread, single event-loop, and single instance app running in the background. (Please correct me if I am wrong.)

Now imagine trying to find the prime numbers between 1 and 250 million and that’s just a function within your application… Yeah, you guessed right.

To achieve multithreading in NodeJS, you need to use the

worker_threads

module and you can only get this from Node v10(experimental) and above. It became fully available from Node v12.

Like I said earlier, our aim is to find the prime numbers between

1....n

. We could use

Sieve of Eratosthenes

to solve the problem which is very efficient. But I wanted something that would overwork/overclock my CPU so that I could really appreciate multithreading for what it is.

To start with

  • Create a folder, Name it anything
  • Create 2 files
    threaded.js

    and

    non-threaded.js

Now in the

non-threaded.js

file paste this code in it

// Function to check if a number is prime number or not
const checkPrime = num => {
  for (let i = 2, s = Math.sqrt(num); i <= s; i++) if (num % i === 0) return false;
  return num > 1;
};
const maxNumber = 100_000; // Where the loop should stop
let primeNumbers = [];
const startTime = Date.now();
for (let index = 0; index < maxNumber; index++) {
  if (checkPrime(index)) {
    primeNumbers.push(index);
  }
}
const endTime = Date.now();
console.log('JOB TOOK:: ', (endTime - startTime) / 1000, ' to complete');
console.log('PRIMES:: ', primeNumbers);

So this is a basic NodeJS program that loops from 0 to `

maxNumber

` and stores the prime numbers in the Array called `

primeNumbers

`. Also, I am calculating the time it takes for the code to run.

const maxNumber = 100_000;

If you are confused about this line of code, please don’t be. It’s totally allowed in NodeJS. For me I use it to represent very long integers in my code.

Before running the benchmark, here are the details of my current laptop

  • OS: Ubuntu 20.04
  • CPU: Intel® Core™ i7-6500U CPU @ 2.50GHz × 4
  • RAM: 12gb (I know it’s terrible, forgive me, distinguished colleagues)
  • Node Version: 14

BENCHMARK

0-100: ~0s

0-1_000: ~0.001s

0-10_000: ~0.003s

0-100_000: ~0.021s

0-1_000_000: ~0.351s

0-10_000_000: ~8.3s

Let’s try 100 million… Lol…

0-100_000_000: ~220s

Wheeewww… My laptop fan almost blew out running this.

Before going deep, what is

MULTI-THREADING

? according to Wikipedia:

In computer architecture, multithreading is the ability of a central processing unit (CPU) (or a single core in a multi-core processor) to provide multiple threads of execution concurrently, supported by the operating system.

The keyword here is

concurrently

, how do we breakdown the problem into sub-problems and have them execute concurrently?, NOT sequentially. Remember, we are trying to save time, with as little resources as possible.

So before we go any further, I will post the code and explain everything section by section. Grab this block of code and put it in the `threaded.js` file you created earlier

const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');
const os = require('os');
const checkPrime = num => {
  for (let i = 2, s = Math.sqrt(num); i <= s; i++) if (num % i === 0) return false;
  return num > 1;
};
if (isMainThread) {
  const startTime = Date.now();
  const coresCount = os.cpus().length;
  const numberOfElements = 100_000_000;
  let workers = [];
  const sharedBuffer = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT * numberOfElements);
  const arr = new Int32Array(sharedBuffer);
  console.log('MAIN THREAD');
  const numElementsPerThread = Math.ceil(numberOfElements / coresCount);
  let workerIndex = 0;
  let completed = 0;
  while (workers.length < coresCount) {
    workerIndex++;
    const start = workers.length * numElementsPerThread;
    const end = start + numElementsPerThread;
    const worker = new Worker(__filename, {
      workerData: {
        start,
        end,
        index: workerIndex,
        arr,
      },
    });
    worker.on('message', message => {
      if (message.completed) {
        completed++;
      }
      if (completed === coresCount) {
        console.log('Totally done!');
        const endTime = Date.now();
        console.log((endTime - startTime) / 1000, 'seconds to complete');
        console.log('FINAL ARR:: ', arr);
        console.log('FINAL ARRAY:: ', Array.from(arr).filter(Boolean));
      }
      console.log('final time:: ', message.index);
    });
    workers.push(worker);
  }
} else {
  console.log({
    start: workerData.start,
    end: workerData.end,
    index: workerData.index,
  });
  for (let i = workerData.start; i < workerData.end; i++) {
    let check = checkPrime(i);
    if (check) {
      workerData.arr[i] = i;
    }
  }
  parentPort.postMessage({ completed: true, index: workerData.index });
}

Like I said earlier, to perform multi-threaded operations in NodeJS, we need the

worker_threads

module. This enables us to write programs that take advantage of multi-threading.

const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');
const os = require('os');

This is a normal standard import of packages we will need to make this program work

const checkPrime = num => {
  for (let i = 2, s = Math.sqrt(num); i <= s; i++) if (num % i === 0) return false;
  return num > 1;
};

Remember this function from the

non-threaded

version of our app?. All it does is take a number and tells us if it’s a prime number or not.

Before we dive into the main code, I want to show you the basic structure of the code, so that you can further understand the main code.

const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');
const os = require('os');
if (isMainThread) {
  console.log('Main THREAD');
  let worker_count = os.cpus().length;
  console.log('THREADS::: ', worker_count);
  let workers = 0;
  while (workers < worker_count) {
    new Worker(__filename);
    workers++;
  }
} else {
  console.log('Worker THREAD');
}

Copy the above piece of code save it in a js file and run it. You should see a result like the one below

Main THREAD
THREADS:::  4
Worker THREAD
Worker THREAD
Worker THREAD
Worker THREAD

From the result above, you can see I use a crappy workstation.

The

isMainThread

from

worker_threads

tells us if we are running on the main thread or in one of the worker threads. In the

main thread

, I spun up more workers based on the number of cores my computer has. That’s as basic as a multithreaded program can look like in NodeJS. Now let’s get back to the main program.

const sharedBuffer = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT * numberOfElements);
const arr = new Int32Array(sharedBuffer);

To understand the above code, you need to understand that for multithreading to work, we will need a shared memory where every worker thread will output their result when they are done.

Hence, our need for

sharedArrayBuffer

[READ THIS]. Each index in the

sharedArrayBuffer

will be initialized with zero.

So we created a

sharedArrayBuffer

and each element of the array will be of type

Int32

and the length of the array will be the ceiling to where we want to search for prime numbers.

const numElementsPerThread = Math.ceil(numberOfElements / coresCount);

Remember, multi-threading means sub-processes working concurrently. So basically we are calculating what size of the chunk we will give to each worker to execute.

e.g if i want to search for prime numbers from 0-100 and my

coreCount

is 4 then each worker will get 100/4 (=25) elements to perform their operations on.

Now, into the while-loop

const start = workers.length * numElementsPerThread;
    const end = start + numElementsPerThread;
    const worker = new Worker(__filename, {
      workerData: {
        start,
        end,
        index: workerIndex,
        arr,
      },
    });

Even though, we know the size of the chunk each worker will get, how do we set the boundary for each worker.

start

and

end

variable are calculating and holding that value for each worker we are about to spin-up.

We initialize each worker with these parameters

start

,

end

,

index

(tracking the workers individually) and

arr

(this is the shared buffer array each worker will output their result to)

worker.on('message', message => {
      if (message.completed) {
        completed++;
      }
      if (completed === coresCount) {
        console.log('Totally done!');
        const endTime = Date.now();
        console.log((endTime - startTime) / 1000, 'seconds to complete');
        console.log('FINAL ARR:: ', arr);
        console.log('FINAL ARRAY:: ', Array.from(arr).filter(Boolean));
      }
      console.log('final time:: ', message.index);
    });

When each worker finishes its execution, it sends a signal back to the

mainThread

with whatever data it wants to send. So the block of code above is listening for when that event fires. When the last worker sends back its result, we calculate how long the whole process took, we also convert the

Int32Array

into the normal Javascript array we are used to. The problem here is, some indexes contain elements that are zero (not prime numbers). So we use the .

filter(Boolean)

trick to filter out all

falsy

values from the Array.

for (let i = workerData.start; i < workerData.end; i++) {
    let check = checkPrime(i);
    if (check) {
      workerData.arr[i] = i;
    }
  }

In each worker, we just loop between its boundaries and check if each index is a prime number or not. If it’s a prime number, then change the number at that index in the

sharedArrayBuffer

to

i

.

e.g If

i

is 7 and 7 is a prime number, then go to

sharedArrayBuffer

and put 7 at index 7

parentPort.postMessage({ completed: true, index: workerData.index });

Once the loop terminates, send back completion response to the

mainThread

.

And that my friends, was how I multi-threaded my way through prime numbers. Why don’t you run it and tell me the results you got in the comment section.

Please don’t break your computers while at it, I won’t be held liable… lol

Please share your results in the comment section. I hope you learned something new.

Previously published at https://umaradam.xyz/prime-numbers-and-multi-threading

Tags

The Noonification banner

Subscribe to get your daily round-up of top tech stories!

read original article here