I recently learned how to implement multiprocessing in Python, so I decided to share it with you! youtu.be/PcJZeCEEhws
πŸ‘︎ 585
πŸ‘€︎ u/nerdy_wits
πŸ“…︎ Jun 22 2021
I did an image noise correction filter using the median filter technique and multiprocessing. All in C! Link to repo in the comments.
πŸ‘︎ 105
πŸ‘€︎ u/theBadRoboT84
πŸ“…︎ Jun 19 2021
Introduction to threading and multiprocessing: Concurrency & Parallelism in Python pygs.me/007
πŸ‘︎ 73
πŸ‘€︎ u/pygsm
πŸ“…︎ Jun 17 2021
Simple Multiprocessing In Python: Comparing concurrent.futures to external libraries cosmiccoding.com.au/tutor…
πŸ‘︎ 19
πŸ‘€︎ u/samreay
πŸ“…︎ May 17 2021
Multiprocessing in Python Simplified - in 4 minutes! youtube.com/watch?v=nfRW7…
πŸ‘︎ 2
πŸ‘€︎ u/KindsonTheGenius
πŸ“…︎ Jul 02 2021
multiprocessing in python for two lines of code

Hello all!

I am new to Python programming and I have tried researching multiprocessing. However, I just can't seem to grasp the concept. For the code example below:

    def doSomething():
        value = sayHello()
        fetchData(value)  # function that could take an extended amount of time to run

I want to use multiprocessing because fetchData is a rather lengthy function, and I want sayHello to be triggered without delay whenever doSomething is called (it is called constantly, which is why time matters). Ideally, a queue of sorts is set up where sayHello keeps running and returning values while fetchData runs separately in the background with each corresponding value.

I'm unsure how to do this. Let me know if it is possible. Any help would be much appreciated :)
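
A minimal sketch of one way to get this behaviour, using multiprocessing.Process with a Queue (the sayHello/fetchData bodies below are stand-ins I made up, since the post doesn't show them):

    import time
    from multiprocessing import Process, Queue

    def sayHello():                 # stand-in for the post's fast function
        return "hello"

    def fetchData(value):           # stand-in for the post's slow function
        time.sleep(2)
        print("fetched:", value)

    def worker(q):
        # consume queued values in the background so the caller never waits
        while True:
            value = q.get()
            if value is None:       # sentinel: no more work
                break
            fetchData(value)

    if __name__ == '__main__':
        q = Queue()
        p = Process(target=worker, args=(q,))
        p.start()
        for _ in range(5):          # doSomething being "called constantly"
            q.put(sayHello())       # returns immediately; fetchData runs elsewhere
        q.put(None)
        p.join()

sayHello keeps running at full speed in the main process, while the worker drains the queue at whatever pace fetchData allows.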

πŸ‘︎ 2
πŸ‘€︎ u/rushimanche
πŸ“…︎ Jul 02 2021
How do I use multithreading / multiprocessing to speed up iteration?

I am not reading files, I am just making a small curses terminal simulation.

I've tried multiprocessing.Pool with imap/map, ThreadPoolExecutor, and another multithreading thing that involved a map method. I don't really remember.
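
In case it helps, here is the generic shape of parallelizing a loop with a Pool, as a sketch assuming each iteration is independent and CPU-bound (if the loop is driving a curses UI, the iterations probably aren't independent, and neither threads nor processes will speed up a single shared screen):

    from multiprocessing import Pool

    def step(i):
        # stand-in for one independent, CPU-bound iteration of the loop
        return sum(j * j for j in range(i, i + 10_000))

    if __name__ == '__main__':
        with Pool() as pool:                       # one worker per core by default
            results = pool.map(step, range(1000))  # parallel [step(i) for i in range(1000)]
        print(len(results))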

πŸ‘︎ 2
πŸ‘€︎ u/Jackiboi307
πŸ“…︎ May 19 2021
Things I Wish They Told Me About Multiprocessing in Python cloudcity.io/blog/2019/02…
πŸ‘︎ 252
πŸ‘€︎ u/pmz
πŸ“…︎ Apr 14 2021
multiprocessing infinite job every minute

Hi Gang,

I'd like to write a program with a background task that runs at the start of every minute and checks a file; when the current minute matches the one in the file, it sets a flag, which is then picked up by another task.

For example:

variable_in_file = 1503
task_flag = False

# background task 1 runs every minute;
# when the time is 1503 it sets task_flag to True

# background task 2 checks task_flag every 5 seconds;
# when it is True, it runs a function and then sets it back to False

I know this explanation is a bit loose, but I've been looking into multiprocessing and I feel like my head is going to implode.

If someone can help me get my head around this I would be very grateful.
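
A minimal sketch of the two tasks with multiprocessing.Event as the shared flag (the file name and the job body are made up, and the "every minute" timing is approximated with sleep(60)):

    import time
    from datetime import datetime
    from multiprocessing import Process, Event

    def minute_watcher(flag):
        # task 1: once a minute, compare the current HHMM to the value in the file
        while True:
            with open("target_time.txt") as f:      # hypothetical file holding e.g. 1503
                target = f.read().strip()
            if datetime.now().strftime("%H%M") == target:
                flag.set()
            time.sleep(60)

    def flag_worker(flag):
        # task 2: poll the flag every 5 seconds and react when it is set
        while True:
            if flag.is_set():
                print("running the job...")         # stand-in for the real function
                flag.clear()
            time.sleep(5)

    if __name__ == '__main__':
        flag = Event()
        Process(target=minute_watcher, args=(flag,), daemon=True).start()
        Process(target=flag_worker, args=(flag,), daemon=True).start()
        time.sleep(300)                             # keep the demo alive for 5 minutes

An Event is used instead of a plain boolean because an ordinary global variable is not shared between processes.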

πŸ‘︎ 3
πŸ‘€︎ u/martynrbell
πŸ“…︎ May 30 2021
Slurm with python multiprocessing

Hi,

So I am looking into running a Python script that uses multiprocessing.

Can I increase cpus-per-task to a value higher than the number of CPUs in a node? For example: I have several nodes with 16 CPUs each. I want to run a single task with 32 CPUs, i.e. use two nodes, and all of their CPUs, for one task.

Is this possible? Or am I always capped at the maximum number of CPUs in a node?

Thanks

πŸ‘︎ 2
πŸ‘€︎ u/Flicked_Up
πŸ“…︎ May 26 2021
Multiprocessing + Selenium objects... possible? (Chrome user here...)

quick rundown of what I want to do:

>open website via selenium

>gather many objects from it (in this case it's js checkboxes)

>click them all AS QUICKLY AS POSSIBLE, which would entail engaging all my processor's cores, so that each core is responsible for approximately a quarter of the objects (since I have 4 cores)

I've already written the code so that it works like a charm WITHOUT multiprocessing.

So, if I want to use MP, the plan right now is to roughly do this:

    import selenium_operations as sop  # separate file in which I've defined some selenium-related functions for mining stuff from a webpage
    from multiprocessing import Process

    if __name__ == '__main__':

        boxes = sop.getBoxes(selection)  # get the boxes; there are 200 of them

        box_set_1 = Process(target=sop.clickBoxes, args=(boxes[0:50],))
        box_set_2 = Process(target=sop.clickBoxes, args=(boxes[50:100],))
        box_set_3 = Process(target=sop.clickBoxes, args=(boxes[100:150],))
        box_set_4 = Process(target=sop.clickBoxes, args=(boxes[150:200],))

        box_set_1.start()  # click the boxes
        box_set_2.start()
        box_set_3.start()
        box_set_4.start()

        box_set_1.join()
        box_set_2.join()
        box_set_3.join()
        box_set_4.join()

Look good? Or is there something I should be aware of?

p.s. am using Chrome
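
One thing to be aware of: Selenium WebElement objects hold a reference to a live browser session and generally can't be pickled, so they can't be passed to Process the way the plan above slices boxes. A sketch of one workaround, where each worker opens its own driver and re-finds its slice by index (the URL and selector are made up; note that four separate sessions only make sense if the clicks have a server-side effect, since each browser sees its own copy of the page):

    from multiprocessing import Process
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    URL = "https://example.com/boxes"    # hypothetical page

    def click_range(start, stop):
        driver = webdriver.Chrome()      # one driver per process
        driver.get(URL)
        boxes = driver.find_elements(By.CSS_SELECTOR, "input[type=checkbox]")
        for box in boxes[start:stop]:
            box.click()
        driver.quit()

    if __name__ == '__main__':
        workers = [Process(target=click_range, args=(i, i + 50))
                   for i in range(0, 200, 50)]
        for w in workers:
            w.start()
        for w in workers:
            w.join()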

πŸ‘︎ 2
πŸ‘€︎ u/ablaaa_
πŸ“…︎ May 18 2021
Give me your best multithreading / multiprocessing gif!
πŸ‘︎ 21
πŸ‘€︎ u/rkinabhi
πŸ“…︎ May 09 2021
Asynchronous PHP β€” Multiprocessing, Multithreading & Coroutines - Diving Laravel divinglaravel.com/asynchr…
πŸ‘︎ 54
πŸ‘€︎ u/themsaid
πŸ“…︎ Apr 02 2021
With multiprocessing, how can I get user input() inside the function?

I have a script that does a task, and when the task runs into a certain case it stops and asks the user for input. I have seen some things that looked helpful on SO and tried to implement them, but they didn't do what I needed.

The relevant part is this one; here I am trying to get user input in the function:

        elif keepgoing == False:
            feedback = q.get()
            fn = sys.stdin.fileno()
            print(feedback)
            self.Main(feedback, weaponPos, gamblePos, wipePos, colors, q, fn)

Anyway, this is my current code, a bit botched because I tried a few things myself and because I haven't done much multiprocess programming yet:

import pyscreenshot as ImageGrab
import pydirectinput as pyautogui
import time
import os
import sys
from multiprocessing import Process, freeze_support, JoinableQueue

class AutoGambler:
    def Main(self, *args):
        #print(args)
        intervals = args[0]
        weaponPos = args[1]
        gamblePos = args[2]
        wipePos = args[3]
        colors = args[4]
        q = args[5]
        sys.stdin = os.fdopen(args[6])
        #print(colors)
        keepgoing = True
        x = 0
        print("intervals: " + str(intervals))
        print("keepgoing: " + str(keepgoing))
        while x < int(intervals) and keepgoing == True:
            self.gamble(weaponPos, gamblePos, wipePos)
            #print("gamble succeeded")

            keepgoing = self.checkPix()
            #print("checkPix suceeded")
            x+=1
            print(x)

            if x == int(intervals):
                break

        if x >= int(intervals):
            feedback = input(f"{intervals} amount of retries reached. Go again? (Enter amount of retries, default 10)") or 10
            self.Main(feedback, weaponPos, gamblePos, wipePos, colors, q)

        elif keepgoing == False:
            feedback = q.get()
            fn = sys.stdin.fileno()
            print(feedback)
            self.Main(feedback, weaponPos, gamblePos, wipePos, colors, q, fn)

    def doubleclickBox(self, horz,vert):
        #608 | 683 "0/0"
        #640 | 715 first box
        #930 | 875 last box
        #32 spacing
        xcoord = 608+32*horz
        ycoord = 683+32*vert
        time.sleep(0.01)
        py
... keep reading on reddit ➑
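
For the input() part specifically: multiprocessing points a child process's stdin at os.devnull, so input() inside the worker raises EOFError. Rather than re-opening stdin from a passed file descriptor, a simpler pattern is to keep all console I/O in the parent and hand the answers to the worker over a queue; a minimal sketch (the names here are illustrative, not from the script above):

    from multiprocessing import Process, Queue

    def worker(q):
        # the child never touches stdin; it just waits for what the parent read
        while True:
            answer = q.get()
            if answer is None:    # sentinel: shut down
                break
            print("worker received:", answer)

    if __name__ == '__main__':
        q = Queue()
        p = Process(target=worker, args=(q,))
        p.start()
        try:
            while True:
                text = input("Go again? (empty line quits) ")  # input() stays in the parent
                if not text:
                    break
                q.put(text)
        finally:
            q.put(None)
            p.join()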

πŸ‘︎ 4
πŸ‘€︎ u/FlyingThunder2992
πŸ“…︎ May 08 2021
Multiprocessing is increasing computation time, spending most of its time on .join()

Code in a Pastebin link at the bottom, explanation here.

I think it will help to explain what this project is. I'm a graduate student analyzing some of my simulations. The simulations produce a trajectory which has the positions and velocities of all the simulated atoms at a bunch of different frames, just like a movie. The code runs perfectly fine in serial and I have used that to get great results before. However, we've extended the simulations and they now have ~50k frames. The computations aren't fast so the estimated time for running in serial is ~55 hours. The cluster I run on has a 24 hour limit but also has 24 cores so this feels like the perfect place to use multiprocessing. I should also mention that I use a ton of Python in my work so I feel like I'm getting better, but I've had no formal training of any kind so I am not an expert by any means. Multiprocessing in particular is totally new to me (and a constant headache).

So the goal of the code is to simply chop up the 50k frames into n sections and have n processes compute each leg of the trajectory simultaneously. For reference, the output is 3 computed values: convexity, mcurvature and gcurvature (and I'm really bad about keeping that consistent, so also various abbreviations of those). Convexity is a single value; the curvatures are both variable-length arrays. I'm storing all the results as dictionaries because they handle the different data types and array lengths very easily and make combining and sorting different times very easy. Also note that due to the large number of frames and large data output, it's really easy to fill up the queue. I had a bunch of issues where the processes would get hung because the queue was full. The solution I worked out was to dedicate half the processes as writer processes that carry out the analysis and shove results into a Queue, and half as reader processes that pull from the Queue and write to a dictionary. Of course... "solution" may be an overstatement. It was a solution to a previous problem, so maybe now it needs updating again.

The issue is that multiprocessing is not doing what it's supposed to be doing: decreasing the computation time. The latest version, shown in the attached code, running a test of 100 frames, produces these timings:

  • 2 processes: 3.66 minutes
  • 4 processes: 3.67 minutes
  • 8 processes: 3.76 minutes
  • 24 processes: 3.92 minutes

That time increase obviously isn't much, but this is only 100 frames out of

... keep reading on reddit ➑
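
For what it's worth, when every frame can be analyzed independently, the usual restructuring is to let a Pool do the result plumbing instead of hand-rolled reader/writer processes; that sidesteps both the full-Queue hangs and the join() ordering issues (a process that has put data on a Queue won't exit until the buffered items are drained, so joining before reading everything can deadlock). A sketch of that shape, with a dummy computation standing in for the real analysis:

    from multiprocessing import Pool

    def analyze(frame_range):
        # stand-in for the real per-chunk convexity/curvature computation
        lo, hi = frame_range
        return {f: {"convexity": 0.0, "mcurvature": [], "gcurvature": []}
                for f in range(lo, hi)}

    if __name__ == '__main__':
        n_frames, n_workers = 50_000, 24
        step = n_frames // n_workers
        chunks = [(i, min(i + step, n_frames)) for i in range(0, n_frames, step)]
        results = {}
        with Pool(n_workers) as pool:
            for partial in pool.imap_unordered(analyze, chunks):
                results.update(partial)   # merge in the parent; no manual Queue
        print(len(results))

It's also worth checking whether the underlying math libraries (numpy, etc.) already use multiple threads per process, which can make extra processes slower rather than faster.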

πŸ‘︎ 2
πŸ‘€︎ u/Bohrealis
πŸ“…︎ Jun 17 2021
Multiprocessing and Multithreading Issues

Hi everyone,

I'm trying to look into running concurrent processes. I've tried getting multiprocessing to work but I keep getting the same error, no matter how complicated the code is:

>ModuleNotFoundError: No module named 'multiprocessing.spawn'; 'multiprocessing' is not a package

This was caused by the following code:

PASTEBIN LINK 1

However, when running the threading equivalent, there are no issues:

PASTEBIN LINK 2

Can someone please shed some light on how to get multiprocessing to work? I've tried upgrading Python, installing Visual Studio's C++ compiler/packages, and reading through Google. I'm still relatively new to Python and I get lost pretty quickly, so please use noob-friendly language :)

Thanks.

edit: Reddit butchered the formatting. Uploaded to pastebin instead.

SOLVED! K900_ and chewy1970 fixed it. In an effort to help other people with the same issue, ensure you have no other scripts called multiprocessing.py.
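
A quick way to confirm that kind of shadowing (a general diagnostic, not from this thread): ask Python where it actually found the module.

    import multiprocessing
    # If this prints a path inside your own project instead of the standard
    # library, a local multiprocessing.py is shadowing the real package.
    print(multiprocessing.__file__)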

πŸ‘︎ 8
πŸ‘€︎ u/PhannyTwoShoes
πŸ“…︎ May 02 2021
Asynchronous PHP β€” Multiprocessing, Multithreading & Coroutines - Diving Laravel divinglaravel.com/asynchr…
πŸ‘︎ 46
πŸ‘€︎ u/themsaid
πŸ“…︎ Mar 26 2021
Simple Multiprocessing In Python: Comparing concurrent.futures to external libraries

Hey all, was checking out some libraries for work on how we can slot in multiprocessing with minimal fuss, and decided to write up the initial investigation.

It's by no means comprehensive; I just hope it might be useful in comparing a few popular libraries and how to set them all up.

If there are good libraries I've missed, please let me know and I'll look into them!

Here's the write-up.

πŸ‘︎ 11
πŸ‘€︎ u/samreay
πŸ“…︎ May 17 2021
Can anything be done about synchronize.Lock in 3.8+ breaking multiprocessing.pool.Pool in AWS lambda

I'm using a library that uses multiprocessing.pool.Pool in our Lambda functions, but after upgrading 3.7 -> 3.8 multiprocessing.pool.ThreadPool is no longer working.

The error I'm getting is:

    File "/var/lang/lib/python3.8/multiprocessing/synchronize.py", line 57, in __init__
        sl = self._semlock = _multiprocessing.SemLock
    OSError: [Errno 38] Function not implemented

I'm pretty sure the reason is that:

  1. synchronize.Lock doesn't work in lambda for any version of Python (lambda has no /dev/shm, and no write access to /dev in lambda - see: https://aws.amazon.com/blogs/compute/parallel-processing-in-python-with-aws-lambda)
  2. ThreadPool is now using synchronize.Lock from version 3.8

I can't find exactly why ThreadPool now uses synchronize.Lock, but given the wide usage of AWS lambda and other environments that don't have /dev/shm (assuming there are a few because the unit tests run into this as well: https://bugs.python.org/issue38377) - is there anything to work around this?

  • perhaps have synchronize.Lock use /tmp if /dev is not writable and /dev/shm isn't available?
  • is this a local config change, bug, or a feature request?

Any ideas or suggestions would be much appreciated.
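
At the application level, the workaround the AWS blog post linked above describes is to use Process and Pipe directly: a Pipe is built on file descriptors rather than POSIX semaphores, so it doesn't touch /dev/shm. A minimal sketch of that pattern:

    from multiprocessing import Process, Pipe

    def work(conn, x):
        conn.send(x * x)          # Pipe needs no SemLock, unlike Queue/Pool
        conn.close()

    def parallel_square(values):
        pipes, procs = [], []
        for x in values:
            parent_end, child_end = Pipe()
            p = Process(target=work, args=(child_end, x))
            p.start()
            pipes.append(parent_end)
            procs.append(p)
        results = [conn.recv() for conn in pipes]
        for p in procs:
            p.join()
        return results

    if __name__ == '__main__':
        print(parallel_square([1, 2, 3, 4]))

That doesn't fix the library you depend on, but it shows the primitive that still works on Lambda.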

πŸ‘︎ 11
πŸ‘€︎ u/jacques_sec
πŸ“…︎ Apr 29 2021
Multiprocessing in Flask

Context:

I have two heavy machine learning inference models whose webcam streams are being served with Flask.

These two models are different and have different weights.

Currently I have them set up as /blueprint1 and /blueprint2 for ML model 1 and ML model 2 respectively.

But I think my implementation is not safe, since Flask is switching threads between the two models, and this is causing a few seconds of lag in the webcam streams. I am not sure if this thread-switching concept is even correct.

What I want to do:

I was thinking of an easy solution to run these as two different apps (multiprocessing).
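
A sketch of that shape: give each model its own Flask app and run each app in its own process, so the two inference loops never share an interpreter (app names, ports, and the dummy route are all made up):

    from multiprocessing import Process
    from flask import Flask

    def make_app(name):
        app = Flask(name)

        @app.route("/")
        def index():
            return f"stream from {name}"   # stand-in for the real video route

        return app

    def serve(name, port):
        make_app(name).run(port=port)      # dev server; use a real WSGI server in production

    if __name__ == '__main__':
        p1 = Process(target=serve, args=("model1", 5001))
        p2 = Process(target=serve, args=("model2", 5002))
        p1.start(); p2.start()
        p1.join(); p2.join()

A reverse proxy (e.g. nginx) can then map /blueprint1 and /blueprint2 to the two ports so the outside URLs stay the same.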

πŸ‘︎ 20
πŸ‘€︎ u/speedx10
πŸ“…︎ Apr 11 2021
Multiprocessing items from a List

Hello,

I have a small application for generating gif files from videos. I started it as a learning project and I'm still adding features as learning exercises.

The code is quite simple: there's a for loop scanning a folder recursively; using magic I check whether each file is a video file, and if it is, I append it to a list. Once this is done, I process the items in the list with my generate_gif() method. Fairly straightforward, nothing complicated.

However, this way I can process only one video at a time. I thought it would be good if I could process multiple videos simultaneously. After doing some research I decided to use a pool. Here's how I tried it.

video_files is a list of strings, the filepath to each video file.

    from concurrent.futures import ProcessPoolExecutor

    if __name__ == '__main__':
        with ProcessPoolExecutor(max_workers=4) as pool:
            for video_file in video_files:
                pool.submit(generate_gif, video_file)

(screenshot of the code: https://gyazo.com/d32d2ea01a7f76e7c1a7f41b30cc8c29 )

As you might expect, that didn't work. The program spits out all the file names to the console, then processes only one file and exits (gracefully). I would like to understand how I should approach this problem and what I'm doing wrong. I'd be really happy if someone could explain what I'm doing wrong and maybe show some sample code processing items from a list with multiple workers.

(I have the full code on GitHub but I'm not adding a link just in case it would look like I'm promoting.)

Thanks!
Rooti

(I tried to format the code properly but I'm struggling. sorry)
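
One general thing worth checking with submit() (a common gotcha, not a diagnosis of this exact script): exceptions raised inside a submitted task are silently stored on the Future until .result() is called, so a crashing generate_gif can look like the pool "processed one file and exited". A sketch that surfaces those errors, with a dummy generate_gif:

    from concurrent.futures import ProcessPoolExecutor, as_completed

    def generate_gif(path):                    # stand-in for the real method
        if path.endswith(".bad"):
            raise ValueError(f"cannot convert {path}")
        return path + ".gif"

    if __name__ == '__main__':
        video_files = ["a.mp4", "b.bad", "c.mp4"]
        with ProcessPoolExecutor(max_workers=4) as pool:
            futures = {pool.submit(generate_gif, v): v for v in video_files}
            for fut in as_completed(futures):
                try:
                    print("done:", fut.result())   # re-raises any worker exception
                except Exception as e:
                    print("failed:", futures[fut], e)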

πŸ‘︎ 3
πŸ‘€︎ u/rootifera
πŸ“…︎ Apr 27 2021
Does using multiprocessing Pool use 4x the RAM too?

Just started using parallel processing and loving it.

Currently my program is slow due to computations, and parallel processing (quad core) is a godsend. However, I'm considering implementing parallel processing in another program that takes significantly more memory, so much so that I believe it uses the hard drive as virtual memory.

In this case, would parallel processing be a benefit? Am I correct in saying that my program would use 4x more memory?

πŸ‘︎ 3
πŸ‘€︎ u/programmerProbs
πŸ“…︎ May 14 2021
Multiprocessing

Does anyone know a good resource for intermediate multiprocessing examples? If someone would be willing to walk me through an intermediate-level example, that would be even more helpful. All I can find online is either extremely basic or so advanced that I can't make any changes without breaking everything.

I've watched at least 15 hours' worth of tutorials, and I've been trying for the past 3 days to get anywhere, but every tutorial just shows extremely basic examples and I can't get past anything more than the basic stuff. If I try to implement anything even remotely complex, I run straight into a brick wall with no clear idea of what is going wrong or why.

I feel like I understand all the theoretical stuff but I just can't execute anything effectively, and it's driving me crazy

πŸ‘︎ 3
πŸ‘€︎ u/Mr_Branflakes
πŸ“…︎ Apr 15 2021
multiprocessing trouble, how can I speed this function up using multiprocessing?

Short rant: Going through the questions on my mock exam and I'm having trouble with multiprocessing. My exam is tomorrow and I missed so much 'cause of a kidney stone and tendonitis at the same time; that pain coupled with ADHD levels of concentration means I've learnt nothing and I feel useless right now. For the first time, I'm not enjoying Python and I just feel lost and stressed.

So here is my actual mock exam question:

"By the following convergent series pi can be estimated analytically exactly (note - this does not mean that this series converges to pi!):

 conv = lambda n: 1/n**2

def conv_series(n):
    sum = 0
    for i in range(1,n+1):
        sum += conv(i)
    return sum

Unfortunately, this series converges very slowly.

You would like to determine the convergence value of the series to 10 decimal places. This requires a very large number of terms in the series - n = 1000000000 or more sum elements may be necessary.

Consider how you can parallelize conv_series(n) to speed up the calculation of the series by multiprocessing. Calculate the sum with (at least) n = 1000000000 terms.

What are the 10 decimal places? (E.g. if you would have the result 2.7819512345, then enter 7819512345 as result)"

So far I have managed to get this:

import multiprocessing
from multiprocessing import Pool

n = 1000000000
x = 60 # maximum multiprocesses.. not sure if this matters?
conv = lambda n: 1 / n ** 2


def conv_series(n):
    sum = 0
    for i in range(1, n + 1):
        sum += conv(i)
    return sum


if __name__ == '__main__':
    with Pool(x) as pool:
        p = multiprocessing.Process()
        for item in pool.map(conv_series, (n,)):
            print(item)

although this is just doing the same process multiple times right?

How would I actually speed this process up as the question asks?
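
The usual decomposition is to split the range 1..n into one contiguous chunk per worker, sum each chunk in its own process, and add the partial sums at the end; a sketch (with a smaller n so the demo finishes quickly, scale it back up for the exam question):

    import multiprocessing

    def partial_sum(bounds):
        lo, hi = bounds
        return sum(1.0 / i**2 for i in range(lo, hi + 1))

    if __name__ == '__main__':
        n = 10_000_000                      # use 1_000_000_000 for the real run
        workers = multiprocessing.cpu_count()
        chunk = n // workers
        bounds = [(i * chunk + 1, (i + 1) * chunk) for i in range(workers)]
        bounds[-1] = (bounds[-1][0], n)     # last chunk absorbs any remainder
        with multiprocessing.Pool(workers) as pool:
            total = sum(pool.map(partial_sum, bounds))
        print(f"{total:.10f}")

One caveat for the 10-decimal-place requirement: naively adding a billion floats accumulates rounding error, so something like math.fsum over each chunk (or summing the smallest terms first) may be needed for the last digits to be stable.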

πŸ‘︎ 11
πŸ‘€︎ u/yerba-matee
πŸ“…︎ Apr 05 2021
Multiprocessing module

Hi all,

I have been tinkering with the multiprocessing module and have been following this tutorial (https://realpython.com/courses/functional-programming-python/), but I ran into a small problem.

When I run this code:

import collections
import time
import multiprocessing

# New namedtuple which describes a scientist
Scientist = collections.namedtuple("Scientist", ['name', 'field', 'born', 'nobel'])

# some sample scientists
scientists = (
    Scientist(name='Ada Lovelace', field='math', born=1815, nobel=False),
    Scientist(name='Emmy Noether', field='math', born=1882, nobel=False),
    Scientist(name='Marie Curie', field='physics', born=1867, nobel=True),
)

# some function
def transform(scientist):
    # scientist : a Scientist namedtuple
    time.sleep(1)
    return {'name': scientist.name, 'age': 2021 - scientist.born}

start = time.time()
pool = multiprocessing.Pool()
result = pool.map(transform, scientists)
pool.close()
pool.join()

my program seems to crash with this error:

RuntimeError:
    An attempt has been made to start a new process before the current process has finished its bootstrapping phase.

Moreover, the program doesn't exit but keeps re-running the script.

But when I replace the multiprocessing part with this, it works:

if __name__ == '__main__':
    pool = multiprocessing.Pool()
    result = pool.map(transform, scientists)
    pool.close()
    pool.join()

So my question is basically why? Why does the modified part (with the __name__ == '__main__' guard) work while the other one doesn't?

Thanks in advance, any and all help (and guesses) is appreciated!

Best

πŸ‘︎ 2
πŸ‘€︎ u/Painaple
πŸ“…︎ Apr 14 2021
Can anything be done about synchronize.Lock in 3.8+ breaking multiprocessing.pool.Pool in AWS lambda /r/Python/comments/n13zrm…
πŸ‘︎ 3
πŸ‘€︎ u/alex-sec
πŸ“…︎ Apr 30 2021
How to extend existing multiprocessing framework to mpi4py for cluster

How do I extend an existing multiprocessing framework to mpi4py in Python?

With multiprocessing, it starts by physically splitting a big npz file into small chunks; every chunk is processed individually, then all output chunks are collected and gathered into a single processed npz file.

Input files are in npz format containing multiple np.arrays.
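
A sketch of the same split/process/gather shape in mpi4py, run with e.g. mpiexec -n 4 python script.py (the process() body and file names are illustrative):

    import numpy as np
    from mpi4py import MPI

    def process(chunk):
        return chunk * 2.0                    # stand-in for the real per-chunk work

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    if rank == 0:
        data = np.arange(1000, dtype=np.float64)   # would come from the big npz file
        chunks = np.array_split(data, size)        # replaces the physical file split
    else:
        chunks = None

    chunk = comm.scatter(chunks, root=0)      # each rank receives one chunk
    result = process(chunk)
    results = comm.gather(result, root=0)     # collect all processed chunks on rank 0

    if rank == 0:
        np.savez("processed.npz", data=np.concatenate(results))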

πŸ‘︎ 2
πŸ‘€︎ u/NoEnv98
πŸ“…︎ Apr 30 2021
Multiprocessing
πŸ‘︎ 298
πŸ‘€︎ u/footageforfree
πŸ“…︎ Jan 13 2021
