Set up a cluster of Arch Linux machines.

Toodaloo_246
Senior Member
439
04-19-2023, 07:45 AM
#21
In short, you want a Python solution instead of using it directly in Python.

Chassabelle
Junior Member
19
04-19-2023, 09:41 AM
#22
Slurm simply launches jobs. It works like running a script across multiple machines. Whatever the tasks are and wherever they come from, Slurm does not parallelize them for you. The program or job being run must itself support parallel processing, and each task needs its own way of communicating with the other machines.
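To illustrate the point, here is a minimal sketch of a script that works out its own share of the work. It assumes the script is launched through `srun`, which sets `SLURM_PROCID` and `SLURM_NTASKS` for each task; the function names and the placeholder URL list are made up for this example.

```python
import os


def rank_and_size(env=None):
    """Read this task's index and the total task count from Slurm's environment."""
    env = os.environ if env is None else env
    # Fall back to a single-task run when the script is not launched by Slurm.
    rank = int(env.get("SLURM_PROCID", 0))
    size = int(env.get("SLURM_NTASKS", 1))
    return rank, size


def my_slice(items, rank, size):
    """Every size-th item starting at rank; the slices are disjoint and cover all items."""
    return items[rank::size]


if __name__ == "__main__":
    # Placeholder work list; a real crawler would load its own seed URLs.
    all_urls = [f"https://example.com/page/{i}" for i in range(10)]
    rank, size = rank_and_size()
    for url in my_slice(all_urls, rank, size):
        print(f"task {rank} of {size} would fetch {url}")
```

Started with something like `srun --ntasks=4 python crawl.py` (the file name is hypothetical), Slurm runs four copies of the same script and each copy picks a different slice. Slurm only starts the copies; the splitting logic lives entirely in the script.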

Skion_
Junior Member
10
04-20-2023, 06:00 AM
#23
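Yes, you can adapt your Python project to support parallel processing. Libraries such as multiprocessing, concurrent.futures, and joblib help distribute tasks across multiple cores. Out of the box their worker pools run on a single machine, so spreading the work across several machines still needs a launcher such as Slurm or a distributed backend on top.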
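As a rough sketch of what that looks like with `concurrent.futures` on one machine: the `word_count` task and the `count_all` helper are invented for illustration, and the executor class is a parameter so the same code can use threads for I/O-bound work such as fetching pages.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def word_count(text):
    """Stand-in for a per-item task, e.g. processing one downloaded page."""
    return len(text.split())


def count_all(texts, executor_cls=ProcessPoolExecutor, max_workers=2):
    """Run word_count over all texts in a worker pool and keep the input order."""
    with executor_cls(max_workers=max_workers) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(word_count, texts))


if __name__ == "__main__":
    # Processes sidestep the GIL for CPU-bound work.
    print(count_all(["a b c", "d e", ""]))
    # Threads are usually enough when the work mostly waits on the network.
    print(count_all(["a b c", "d e", ""], executor_cls=ThreadPoolExecutor))
```

The `if __name__ == "__main__":` guard matters with process pools, because worker processes may re-import the main module on startup.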

jacalix
Junior Member
12
05-10-2023, 05:16 AM
#24
You should consider several aspects when evaluating your approach:

1. Can the problem be broken into smaller parts? Not every problem suits distributed or parallel processing. As mentioned, you could adjust your crawler so each node handles a portion of the search space (for example, x out of the total URLs).
2. Is the split efficient enough? For distribution to pay off, each node should do mostly independent work with little need for coordination, because heavy communication eats into the gains. Also make sure the nodes aren't limited by a shared resource such as bandwidth. If the network is the bottleneck, adding nodes won't help much; if the work is CPU-bound, distribution works better.
3. Which method suits your needs best? You must decide how much interaction the nodes need and how best to manage it. In some scenarios a shared database or locks can prevent duplicates. If not, direct node-to-node communication or a central coordinator might be more effective.
4. Is language X appropriate? I have limited experience with Python, so I can't confidently recommend it here. The GIL can hinder threading in Python, so other languages may be more suitable depending on your goals.

These points don't solve your problem directly, but they should guide your next steps.
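One way to keep coordination low for a crawler, offered as a sketch rather than a recommendation from this thread: give every URL a fixed owner derived from a stable hash, so any node that discovers a URL can tell who should fetch it without asking a coordinator. The names `owner_of` and `urls_for_node` are hypothetical, and the node count is assumed to be fixed for the whole run.

```python
import hashlib


def owner_of(url, num_nodes):
    """Map a URL to a node index in [0, num_nodes) using a stable hash.

    hashlib is used instead of the built-in hash(), which is randomized
    per process for strings and would disagree between nodes.
    """
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def urls_for_node(urls, node_id, num_nodes):
    """Keep only the URLs that this node is responsible for."""
    return [u for u in urls if owner_of(u, num_nodes) == node_id]


if __name__ == "__main__":
    found = [f"https://example.com/page/{i}" for i in range(8)]  # placeholder data
    for node in range(3):
        print(node, urls_for_node(found, node, 3))
```

Every URL lands on exactly one node, so duplicates are avoided without locks. The trade-off is that URLs found by a non-owner still have to be handed to the owner somehow, for example through a shared queue or database as mentioned in point 3.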

xXcarlos117Xx2
54
05-10-2023, 07:12 AM
#25
@Franck mentioned similar ideas about using another programming language such as C#, C++, or Java for this project. He also suggested that JavaScript might be a good fit since it would integrate more smoothly with the web front end I'm developing.
